This section explains how to deploy Cloud Native Qumulo (CNQ) by creating the persistent storage and the cluster compute and cache resources with CloudFormation. It also provides information about post-deployment actions and optimization.

For an overview of CNQ on AWS, its prerequisites, and limits, see How Cloud Native Qumulo Works.

The aws-cloudformation-cnq.zip file contains comprehensive CloudFormation templates that let you deploy S3 buckets and then create a CNQ cluster with 4 to 24 instances that adhere to the AWS Well-Architected Framework and have fully elastic compute and capacity.

Prerequisites

This section explains the prerequisites to deploying CNQ on AWS.

  • To allow your Qumulo instance to report metrics to Qumulo, your AWS VPC must have outbound Internet connectivity through a NAT gateway or a firewall. Your instance shares no file data during this process.

  • The following features require specific versions of Qumulo Core:

    Feature                                                         Minimum Qumulo Core Version
    Adding S3 buckets to increase persistent storage capacity       7.2.1.1
    Increasing the soft capacity limit for an existing CNQ cluster  7.2.0.2
    Creating persistent storage                                     7.1.3 (with version 4.0 of this repository)
  • Before you configure your CloudFormation template, you must sign in to the AWS Management Console.

    A custom IAM role or user must include permissions for the following AWS service actions:

    • cloudformation:*
    • ec2:*
    • elasticloadbalancing:*
    • iam:*
    • kms:*
    • lambda:*
    • logs:*
    • resource-groups:*
    • route53:*
    • s3:*
    • secretsmanager:*
    • sns:*
    • ssm:*
    • sts:*
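
    The permissions above could be expressed as an IAM policy document similar to the following sketch. This is an illustrative, broadly scoped policy that mirrors the service actions listed above; for production use, scope the actions and resources down to what your deployment actually needs.

    ```json
    {
      "Version": "2012-10-17",
      "Statement": [
        {
          "Sid": "CNQDeployment",
          "Effect": "Allow",
          "Action": [
            "cloudformation:*",
            "ec2:*",
            "elasticloadbalancing:*",
            "iam:*",
            "kms:*",
            "lambda:*",
            "logs:*",
            "resource-groups:*",
            "route53:*",
            "s3:*",
            "secretsmanager:*",
            "sns:*",
            "ssm:*",
            "sts:*"
          ],
          "Resource": "*"
        }
      ]
    }
    ```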

How the CNQ Provisioner Works

The CNQ Provisioner is an m5.large EC2 instance that uses custom user data to configure your Qumulo cluster and any additional AWS environment requirements.

The Provisioner stores all necessary state information in the AWS Parameter Store and shuts down automatically after it completes any of the following major tasks:

Qumulo Cluster Configuration
  • Forms the first quorum with specific Hot or Cold parameters
  • Adds nodes to the quorum (when expanding the cluster)
  • Assigns floating IP addresses to nodes in the cluster
  • Manages cluster replacement (new compute and cache resources) for changing instance sizes
  • Manages the addition of S3 buckets and soft capacity limit increases
  • Changes the administrative password
AWS Configuration
  • Checks for connectivity to Amazon S3
  • Checks for the presence of an S3 Gateway in the VPC (this is required for provisioning)
  • Checks that all S3 buckets are empty before forming quorum
  • Checks for connectivity to the public Internet by running a curl command against api.missionq.qumulo.com/
  • Assigns a policy to the top-level CloudFormation stack to protect the cluster during subsequent stack updates
  • Configures the throughput and IOPS for the EBS gp3 volume
  • Tags EBS volumes with the stack name and volume type
  • Tracks software versions, cluster IP addresses, instance IDs, and UUID in the AWS Parameter Store
  • Tracks the last-run-status for the Provisioner in the Parameter Store
  • Configures Termination Protection for the stack and the EC2 Instances
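
Two of the Provisioner's pre-flight checks can be reproduced manually from an instance in your VPC. The following is a rough sketch; the VPC ID and region are hypothetical placeholders, and the AWS CLI and curl commands are left commented because they require valid credentials and network access.

```shell
# Hypothetical identifiers; substitute your own values.
VPC_ID="vpc-0123456789abcdef0"
REGION="us-west-2"

# 1. Verify that an S3 gateway endpoint exists in the VPC (required for
#    provisioning). Uncomment to run with valid AWS credentials:
# aws ec2 describe-vpc-endpoints --region "$REGION" \
#   --filters "Name=vpc-id,Values=${VPC_ID}" "Name=vpc-endpoint-type,Values=Gateway"

# 2. Verify outbound Internet connectivity, as the Provisioner does:
# curl -sSf https://api.missionq.qumulo.com/

echo "checks prepared for ${VPC_ID} in ${REGION}"
```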

Step 1: Deploying Cluster Persistent Storage

This section explains how to deploy the S3 buckets that act as persistent storage for your Qumulo cluster.

  1. Log in to Nexus, click Downloads > Deployment on AWS, and then download the CloudFormation template, Debian package, and host configuration file.

  2. In your S3 bucket, create the qumulo-core-install directory. Within this directory, create another directory with the Qumulo Core version as its name. The following is an example path:

    my-s3-bucket-name/my-s3-bucket-prefix/qumulo-core-install/7.2.3
    
  3. Copy qumulo-core.deb and host_configuration.tar.gz into the directory named after the Qumulo Core version (in this example, it is 7.2.3).

  4. Copy aws-cloudformation-cnq.zip to the my-s3-bucket-name/my-s3-bucket-prefix/aws-cloudformation-cnq directory and decompress it.

  5. In the decompressed directory in your S3 bucket, find the URL to templates/persistent-storage.template.yaml. For example:

    https://my-bucket.s3.us-west-2.amazonaws.com/aws-cloudformation-cnq/templates/persistent-storage.template.yaml
    
  6. Log in to the AWS CloudFormation console.

  7. On the Stacks page, in the upper right, click Create stack > With new resources (standard).

  8. On the Create stack page, in the Specify template section, click Amazon S3 URL, enter the URL to persistent-storage.template.yaml, and then click Next.

  9. On the Specify stack details page, enter the Stack name and review the information in the Parameters section:

    1. Enter the S3 bucket Region.

    2. Select the Soft Capacity Limit for the subsequent CNQ deployment.

    3. Click Next.

  10. On the Configure stack options page, click Next.

  11. On the Review and create page, click Submit.

    CloudFormation creates the S3 buckets and their stack.
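
The upload layout from steps 2-4 above can be sketched with the AWS CLI. The bucket name, prefix, and version follow the example path in step 2 and are placeholders; the copy commands are left commented because they require valid AWS credentials.

```shell
# Hypothetical names from the example path above; substitute your own.
BUCKET="my-s3-bucket-name"
PREFIX="my-s3-bucket-prefix"
QUMULO_VERSION="7.2.3"
INSTALL_DIR="s3://${BUCKET}/${PREFIX}/qumulo-core-install/${QUMULO_VERSION}"

# Uncomment to run with valid AWS credentials:
# aws s3 cp qumulo-core.deb            "${INSTALL_DIR}/qumulo-core.deb"
# aws s3 cp host_configuration.tar.gz  "${INSTALL_DIR}/host_configuration.tar.gz"
# aws s3 cp aws-cloudformation-cnq.zip "s3://${BUCKET}/${PREFIX}/aws-cloudformation-cnq/aws-cloudformation-cnq.zip"

echo "$INSTALL_DIR"
```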

Step 2: Deploying Cluster Compute and Cache Resources

This section explains how to deploy compute and cache resources for a Qumulo cluster by using an Ubuntu AMI and the Qumulo Core .deb installer.

  1. Configure your VPC to use the gateway VPC endpoint for S3.

  2. Log in to the AWS CloudFormation console.

  3. On the Stacks page, in the upper right, click Create stack > With new resources (standard).

  4. On the Create stack page, in the Specify template section, click Amazon S3 URL, enter the URL to cnq-standard-template.yaml, and then click Next.

  5. On the Specify stack details page, enter the Stack name and review the information in the Parameters section, and then click Next.

  6. On the Configure stack options page, click Next.

  7. On the Review and create page, click Submit.

    CloudFormation creates the compute and cache resources and their stack.

  8. To log in to your cluster’s Web UI, use the IP address from the top-level stack output as the endpoint and, as credentials, the username and password that you configured during deployment.

    You can use the Web UI to create and manage NFS exports, SMB shares, snapshots, and continuous replication relationships. You can also join your cluster to Active Directory, configure LDAP, and perform many other operations.

  9. Mount your Qumulo file system by using NFS or SMB and your cluster’s DNS name or IP address.
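
Step 9 can be performed from a Linux client with a standard NFS mount. The following is a minimal sketch; the cluster DNS name, export path, and mount point are hypothetical placeholders, and the privileged commands are left commented.

```shell
# Hypothetical values; substitute your cluster's DNS name or IP address
# and an NFS export that exists on your cluster.
CLUSTER="my-cluster.example.com"
EXPORT_PATH="/"
MOUNT_POINT="/mnt/qumulo"

# Uncomment to run on a client with the NFS utilities installed:
# sudo mkdir -p "$MOUNT_POINT"
# sudo mount -t nfs "${CLUSTER}:${EXPORT_PATH}" "$MOUNT_POINT"

echo "mount -t nfs ${CLUSTER}:${EXPORT_PATH} ${MOUNT_POINT}"
```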

Step 3: Performing Post-Deployment Actions

This section describes the common actions you can perform on a CNQ cluster after deploying it.

Adding a Node to an Existing Cluster

  1. Log in to the AWS CloudFormation console.
  2. On the Stacks page, select your compute and cache deployment stack and then, in the upper right, click Update.
  3. On the Update stack page, click Use existing template and then click Next.
  4. On the Specify stack details page, enter a new value for Node Count and then click Next.
  5. On the Configure stack options page, click Next.
  6. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.

  7. To ensure that the Provisioner shuts down automatically, review the /qumulo/my-stack-name/last-run-status parameter in the AWS Parameter Store.
  8. To check that the cluster is healthy, log in to the Web UI.
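
The Parameter Store check in step 7 can also be done from the AWS CLI. The stack name below is a hypothetical placeholder, and the query is left commented because it requires valid AWS credentials.

```shell
# Hypothetical stack name; substitute your own.
STACK_NAME="my-stack-name"
PARAM_NAME="/qumulo/${STACK_NAME}/last-run-status"

# Uncomment to run with valid AWS credentials:
# aws ssm get-parameter --name "$PARAM_NAME" \
#   --query 'Parameter.Value' --output text

echo "$PARAM_NAME"
```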

Removing a Node from an Existing Cluster

Removing a node from an existing cluster is a two-step process. First, you remove the node from the cluster’s quorum. Next, you tidy up your AWS resources.

Step 1: Remove the Node from the Cluster’s Quorum

You must perform this step while the cluster is running.

  1. Copy the remove-nodes.sh script from the utilities directory to an Amazon Linux 2 AMI running in your VPC.

  2. Run the remove-nodes.sh script and specify the AWS region, the unique deployment name, the current node count, and the final node count.

    In the following example, we reduce a cluster from 6 to 4 nodes.

    ./remove-nodes.sh \
      --region us-west-2 \
      --qstackname my-unique-deployment-name \
      --currentnodecount 6 \
      --finalnodecount 4
    
  3. When prompted, confirm the nodes’ removal.
  4. To check that the cluster is healthy, log in to the Web UI.

Step 2: Tidy Up Your AWS Resources

  1. On the Stacks page, select your compute and cache deployment stack and then, in the upper right, click Update.
  2. On the Update stack page, click Use existing template and then click Next.
  3. On the Specify stack details page, enter a lower value for Node Count (for example, 4) and then click Next.
  4. On the Configure stack options page, click Next.
  5. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.

    The node and the infrastructure associated with the node are removed.

  6. To check that the cluster is healthy, log in to the Web UI.

Changing the EC2 Instance Type for an Existing Cluster

Changing the EC2 instance type is a three-step process. First, you create a new deployment in a new CloudFormation stack (this process ensures that the required instances are available) and join the new instances to a quorum. Next, you remove the existing instances. Finally, you clean up your S3 bucket policies.

Step 1: Create a New Deployment in a New CloudFormation Stack

  1. Log in to the AWS CloudFormation console.
  2. On the Stacks page, in the upper right, click Create stack > With new resources (standard).
  3. On the Create stack page, in the Specify template section, click Amazon S3 URL, enter the URL to your CloudFormation template, and then click Next.
  4. On the Specify stack details page, to use the same S3 buckets as before, enter the same Stack name that you used for the persistent storage stack and then review the information in the Parameters section:

    1. For QReplacementCluster, click Yes.
    2. For QExistingDeploymentUniqueName, enter the current stack name.
    3. For QInstanceType, enter the EC2 instance type.
    4. (Optional) To change the number of nodes, enter a new value for QNodeCount.
    5. Click Next.
  5. On the Configure stack options page, click Next.

  6. On the Review and create page, click Submit.

  7. To ensure that the Provisioner shuts down automatically, review the /qumulo/my-stack-name/last-run-status parameter in the AWS Parameter Store.

  8. To check that the cluster is healthy, log in to the Web UI.

Step 2: Remove the Existing Instances

  1. To delete the previous CloudFormation stack, on the Stacks page, select the stack name for your previous deployment and then, in the upper right, click Delete.
  2. To ensure that the stack is deleted correctly, watch the deletion process.

    The previous instances are deleted.

Step 3: Clean Up S3 Bucket Policies

  1. On the Stacks page, select the newly created stack and then, in the upper right, click Update.
  2. On the Update stack page, click Use existing template and then click Next.
  3. On the Specify stack details page, for QReplacementCluster, click No.
  4. On the Configure stack options page, click Next.
  5. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.

Increasing the Soft Capacity Limit for an Existing Cluster

Increasing the soft capacity limit for an existing cluster is a two-step process. First, you set new persistent storage parameters. Next, you set new compute and cache deployment parameters.

Step 1: Set New Persistent Storage Parameters

  1. On the Stacks page, select your persistent storage stack and then, in the upper right, click Update.
  2. On the Update stack page, click Use existing template and then click Next.
  3. On the Specify stack details page, enter a higher value for QSoftCapacityLimit and then click Next.
  4. On the Configure stack options page, click Next.
  5. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.

    CloudFormation creates new S3 buckets as necessary.

Step 2: Update Existing Compute and Cache Resource Deployment

  1. On the Stacks page, select your compute and cache deployment stack and then, in the upper right, click Update.
  2. On the Update stack page, click Use existing template and then click Next.
  3. On the Specify stack details page, click Next.
  4. On the Configure stack options page, click Next.
  5. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.

    CloudFormation updates the necessary IAM roles and S3 bucket policies, adds S3 buckets to the persistent storage list for the cluster, and increases the soft capacity limit. When the Provisioner shuts down automatically, this process is complete.

Deleting an Existing Cluster

Deleting a cluster is a two-step process. First, you delete your Cloud Native Qumulo resources. Next, you delete your persistent storage.

  1. After you back up your data safely, disable termination protection for your CloudFormation stack.
  2. To update your stack, do the following:
    1. On the Stacks page, select the existing stack and then, in the upper right, click Update.
    2. On the Update stack page, click Use existing template and then click Next.
    3. On the Specify stack details page, click Next.
    4. On the Configure stack options page, click Next.
    5. On the Review <my-stack-name> page, click Rollback on failure and then click Submit.
  3. Delete your CloudFormation stack.
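
The deletion flow above can also be sketched with the AWS CLI. The stack names are hypothetical placeholders, and the destructive commands are left commented; run them only after your data is safely backed up.

```shell
# Hypothetical stack names; substitute your own.
COMPUTE_STACK="my-cnq-stack"
STORAGE_STACK="my-persistent-storage-stack"

# Uncomment to run with valid AWS credentials:
# 1. Disable termination protection on the compute and cache stack:
# aws cloudformation update-termination-protection \
#   --stack-name "$COMPUTE_STACK" --no-enable-termination-protection
# 2. Delete the compute and cache stack, then the persistent storage stack:
# aws cloudformation delete-stack --stack-name "$COMPUTE_STACK"
# aws cloudformation delete-stack --stack-name "$STORAGE_STACK"

echo "${COMPUTE_STACK} ${STORAGE_STACK}"
```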