This section explains how to deploy Cloud Native Qumulo (CNQ) on GCP by using Terraform to create the persistent storage and the cluster compute and cache resources. It also provides recommendations for Terraform deployments and information about post-deployment actions and optimization.

For an overview of CNQ on GCP, its prerequisites, and limits, see How Cloud Native Qumulo Works.

The gcp-terraform-cnq-<x.y>.zip file (the version in the file name corresponds to the provisioning scripts, not to the version of Qumulo Core) contains comprehensive Terraform configurations that let you deploy GCS buckets and then create a CNQ cluster with 1 or 3–24 instances that have fully elastic compute and capacity.

Prerequisites

This section explains the prerequisites to deploying CNQ on GCP.

  • Qumulo Core 7.6.0 (or higher)

  • To allow instances without external IP addresses to reach GCP APIs, you must enable Private Google Access.

  • To allow your Qumulo cluster to report metrics to Qumulo, your VPC must have outbound Internet connectivity through a Cloud NAT gateway. Your instance shares no file data during this process.

  • To enable the following services for your Google Cloud project, use the gcloud services enable command (see the example after this list):

    • cloudkms.googleapis.com
    • compute.googleapis.com
    • logging.googleapis.com
    • monitoring.googleapis.com
    • secretmanager.googleapis.com
    • storage-api.googleapis.com: Required only if you store your Terraform state in GCS buckets
    • storage.googleapis.com
  • To configure IAM, add the least-privilege role configurations appropriate for your organization to separate service accounts:

    • Terraform Deployment Service Account

      • roles/cloudkms.cryptoKeyEncrypterDecrypter: Required only if you use customer-managed encryption keys (CMEKs)
      • roles/compute.admin
      • roles/iam.serviceAccountUser
      • roles/logging.configWriter: Required only for creating log sinks
      • roles/monitoring.editor: Required only for creating dashboards and alerts
      • roles/resourcemanager.projectIamAdmin
      • roles/secretmanager.admin
      • roles/storage.admin
    • Virtual Machine (Node) Service Account

      • roles/compute.viewer: Required only for instance metadata introspection
      • roles/secretmanager.secretAccessor
      • roles/storage.objectViewer
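
The following is a sketch of one way to enable the required services and grant a single role to the Terraform deployment service account. The project ID and service account name are placeholders; repeat the add-iam-policy-binding command for each role that applies to your organization.

  # Enable the required APIs (add storage-api.googleapis.com if you store
  # your Terraform state in GCS buckets).
  gcloud services enable \
    cloudkms.googleapis.com \
    compute.googleapis.com \
    logging.googleapis.com \
    monitoring.googleapis.com \
    secretmanager.googleapis.com \
    storage.googleapis.com \
    --project my-project-id

  # Create the Terraform deployment service account and grant one of the
  # roles listed above.
  gcloud iam service-accounts create terraform-deployer \
    --project my-project-id \
    --display-name "CNQ Terraform deployment"

  gcloud projects add-iam-policy-binding my-project-id \
    --member "serviceAccount:terraform-deployer@my-project-id.iam.gserviceaccount.com" \
    --role "roles/compute.admin"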

How the CNQ Provisioner Works

The CNQ Provisioner is a Google Compute Engine (GCE) instance that configures your Qumulo cluster and any additional GCP environment requirements.

To Monitor the Provisioner’s Status

You can monitor the Provisioner from the GCP Console (in your project, click Compute Engine > VM Instances) or by using the gcloud CLI. The Provisioner stores all necessary state information in a Firestore database and shuts down automatically when it completes its tasks.

  1. To check whether the Provisioner is still running, use the gcloud compute instances list command and specify your deployment's name and the format. For example:

    gcloud compute instances \
      list --filter="mydeploymentname-ABCDE01EG2H" \
      --format="table(name,zone,status)"
  2. Do one of the following:

    • If the Provisioner's status is RUNNING, you can retrieve the latest console logs for troubleshooting by using the gcloud compute instances get-serial-port-output command and specifying your deployment's name and the zone. For example:

      gcloud compute instances get-serial-port-output \
        "mydeploymentname-ABCDE01EG2H" \
        --zone us-central1-a \
        --port 1 | tail -n 100
    • If the Provisioner's status is TERMINATED, you can check the Firestore database named after the unique deployment name of your persistent storage.

Step 1: Deploying Cluster Persistent Storage

This section explains how to deploy the GCS buckets that act as persistent storage for your Qumulo cluster.

Part 1: Prepare the Required Files

Before you can deploy the persistent storage for your cluster, you must download and prepare the required files.

  1. Log in to Qumulo Nexus and click Downloads > Cloud Native Qumulo Downloads.

  2. On the GCP tab, in the Download the required files section, select the Qumulo Core version that you want to deploy and then download the corresponding Terraform configuration and Debian or RPM package.

  3. In a new or existing GCS bucket, within your chosen prefix, create the qumulo-core-install directory.

  4. Within this directory, create another directory with the Qumulo Core version as its name. For example:

     gs://my-gcs-bucket-name/my-prefix/qumulo-core-install/7.6.0
    
  5. Copy qumulo-core.deb or qumulo-core.rpm into the directory named after the Qumulo Core version (in this example, 7.6.0); see the example commands after this list.

  6. Copy gcp-terraform-cnq-<x.y>.zip to your Terraform environment and then decompress the file.
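
For example, assuming the bucket, prefix, and Qumulo Core version shown above, you might stage the package with the following gcloud storage commands. The bucket name and prefix are placeholders; GCS creates the directory structure implicitly from the object-name prefix.

  # Copy the package into the directory named after the Qumulo Core version.
  gcloud storage cp ./qumulo-core.deb \
    gs://my-gcs-bucket-name/my-prefix/qumulo-core-install/7.6.0/qumulo-core.deb

  # Confirm that the object is in place.
  gcloud storage ls gs://my-gcs-bucket-name/my-prefix/qumulo-core-install/7.6.0/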

Part 2: Create the Necessary Resources

  1. Navigate to the persistent-storage directory.

  2. Edit the provider.tf file:

    • To store the Terraform state remotely, add the name of a GCS bucket to the section that begins with backend "s3" {.

    • To store the Terraform state locally, comment out the section that begins with backend "s3" {.

  3. Run the terraform init command.

    Terraform prepares the environment and displays the message Terraform has been successfully initialized!

  4. Edit the terraform.tfvars file (see the example values after this procedure).

    • Specify the deployment_name and the correct gcp_region for your cluster’s persistent storage.

    • Set the soft_capacity_limit to 500 (or higher).

  5. Use the gcloud CLI to authenticate to your GCP account.

  6. Run the terraform apply command.

  7. Review the Terraform execution plan and then enter yes.

      Terraform displays:

      • The Apply complete! message with a count of added resources

      • The names of the created GCS buckets

      • Your deployment’s unique name

      For example:

      Apply complete! Resources: 15 added, 0 changed, 0 destroyed.
            
      Outputs:
      
      persistent_storage_bucket_names = tolist([
        "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
        "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
        "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
        ...
        "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
      ])
      deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
      ...
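
The following is a minimal sketch of the persistent-storage terraform.tfvars values that the steps above describe. The deployment name and region are placeholders; see README.pdf in gcp-terraform-cnq-<x.y>.zip for the complete variable list and exact types.

  deployment_name     = "my-deployment"
  gcp_region          = "us-central1"
  soft_capacity_limit = 500              # set to 500 or higher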
      

Step 2: Deploying Cluster Compute and Cache Resources

This section explains how to deploy compute and cache resources for a Qumulo cluster by using an Ubuntu image and the Qumulo Core .deb installer.

Recommendations

  • Provisioning completes successfully when the Provisioner shuts down automatically. If the Provisioner doesn’t shut down, the provisioning cycle has failed and you must troubleshoot it. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console.

  • The first variable in the example configuration files in the gcp-terraform-cnq repository is deployment_name. To help avoid conflicts between load balancers, resource labels, and other deployment components, Terraform appends a random alphanumeric value to deployment_name to generate the deployment_unique_name variable and then tags all future resources with this value. The deployment_unique_name variable never changes during subsequent Terraform deployments, and Terraform ignores later changes to deployment_name.

  • If you plan to deploy multiple Qumulo clusters, give the q_cluster_name variable a unique name for each cluster.

  • We recommend forwarding DNS queries to Qumulo Authoritative DNS (QDNS). For multi-zone deployments, specify a value for q_cluster_fqdn. Qumulo Core uses this variable to forward DNS requests to your cluster, where Qumulo Core resolves DNS for your floating IP addresses.

To Deploy the Cluster Compute and Cache Resources

  1. Edit the provider.tf file:

    • To store the Terraform state remotely, add the name of a GCS bucket to the sections that begin with backend "s3" { and data "terraform_remote_state" "persistent_storage" {.

    • To store the Terraform state locally, comment out the sections that begin with backend "s3" { and data "terraform_remote_state" "persistent_storage" { and then uncomment the section that contains backend = "local".

  2. Navigate to the gcp-terraform-cnq-<x.y>/compute directory and then run the terraform init command.

    Terraform prepares the environment and displays the message Terraform has been successfully initialized!

  3. Edit the terraform.tfvars file, specifying the values for all variables (a sketch of a few common values follows this procedure).

    For more information, see README.pdf in gcp-terraform-cnq-<x.y>.zip.

  4. Run the terraform apply command.

  5. Review the Terraform execution plan and then enter yes.

    Terraform displays:

    • The Apply complete! message with a count of added resources

    • Your deployment’s unique name

    • The names of the created GCS buckets

    • The floating IP addresses for your Qumulo cluster

    • The primary (static) IP addresses for your Qumulo cluster

    • The Qumulo Core Web UI endpoint

    For example:

    Apply complete! Resources: 62 added, 0 changed, 0 destroyed.
      
    Outputs:
      
    cluster_provisioned = "Success"
    deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
    ...
    persistent_storage_bucket_names = tolist([
      "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
      "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
      "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
      ...
      "ab2cdefghij-my-deployment-klmnopqr6st-qps-16",
    ])
    qumulo_floating_ips = tolist([
      "203.0.113.42",
      "203.0.113.84",
      ...
    ])
    ...
    qumulo_primary_ips = tolist([
      "203.0.113.5",
      "203.0.113.6",
      "203.0.113.7"
    ])
    ...
    qumulo_private_url_node1 = "https://203.0.113.5"
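
The following sketch shows a few of the compute terraform.tfvars variables that this guide references. All values are placeholders and the instance type is only an example; see README.pdf in gcp-terraform-cnq-<x.y>.zip for the complete variable list and exact types.

  deployment_name = "my-deployment"
  q_cluster_name  = "my-qumulo-cluster"      # keep unique for each cluster
  q_node_count    = 4
  q_instance_type = "n2-standard-16"         # placeholder; choose a supported GCE instance type
  gcp_subnet_name = "my-subnet"
  q_cluster_fqdn  = "qumulo.example.com"     # recommended for multi-zone deployments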
    

To Mount the Qumulo File System

  1. To log in to your cluster’s Web UI, use the endpoint from the Terraform output and the username and password that you have configured.

    You can use the Qumulo Core Web UI to create and manage NFS exports, SMB shares, snapshots, and continuous replication relationships. You can also join your cluster to Active Directory, configure LDAP, and perform many other operations.

  2. Mount your Qumulo file system by using NFS or SMB and your cluster’s DNS name or IP address.
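
For example, on a Linux client you might mount the cluster over NFS as follows. The host name, export path, and mount point are placeholders; use the DNS name or IP address of your cluster and the NFS export (or SMB share) that you configured.

  sudo mkdir -p /mnt/qumulo
  # Assumes an NFS export at /; NFSv3 shown, adjust if you have enabled NFSv4.1.
  sudo mount -t nfs -o vers=3 qumulo.example.com:/ /mnt/qumulo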

Step 3: Performing Post-Deployment Actions

This section describes the common actions you can perform on a CNQ cluster after deploying it.

Adding Nodes to an Existing Cluster

  1. Edit terraform.tfvars and increase the value of q_node_count to the new total number of nodes (see the example after this procedure).
  2. Run the terraform apply command.
  3. Review the Terraform execution plan and then enter yes.

    Terraform displays an additional primary (static) IP address for each new node. For example:

    qumulo_primary_ips = tolist([
      "203.0.113.5",
      "203.0.113.6",
      "203.0.113.7",
      "203.0.113.8",
      "203.0.113.9"   
    ])
    
  4. To ensure that the Provisioner shuts down automatically, monitor the /qumulo/my-deployment-name/last-run-status parameter for the Provisioner. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console.
  5. To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
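
For example, to grow a four-node cluster to five nodes, the terraform.tfvars change in step 1 might look like this (node counts are illustrative):

  q_node_count = 5   # previously 4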

Removing Nodes from an Existing Cluster

Removing nodes from an existing cluster is a two-step process in which you remove the nodes from your cluster’s quorum and then tidy up the GCP resources for the removed nodes. The sketch after Step 1 summarizes the terraform.tfvars changes for both steps.

Step 1: Remove Nodes from the Cluster’s Quorum

  1. Edit the terraform.tfvars file, setting the value of q_target_node_count to a reduced number of nodes in the cluster.

  2. Run the terraform apply command.

  3. Review the nodes to be removed and then enter yes.

    Terraform removes the nodes and displays:

    • The Apply complete! message with a count of removed resources

    • Your deployment’s unique name

    • The remaining GCS buckets for your Qumulo cluster

    • The primary (static) IP addresses of the nodes removed from your Qumulo cluster

    • The Qumulo Core Web UI endpoint

    For example:

    Apply complete! Resources: 0 added, 0 changed, 1 destroyed.
    
    Outputs:
    
    cluster_provisioned = "Success"
    deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
    ...
    persistent_storage_bucket_names = tolist([
      "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
      "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
      "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
      ...
      "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
    ])
    qumulo_floating_ips = tolist([
      "203.0.113.42",
      "203.0.113.84",
      ...
    ])
    ...
    qumulo_primary_ips_removed_nodes = "203.0.113.24",
    ...
    qumulo_private_url_node1 = "https://203.0.113.10"
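
The following sketch summarizes the terraform.tfvars changes for both steps of this procedure, using a cluster that shrinks from six nodes to four as an illustration. Apply each state in its own terraform apply run.

  # Step 1: remove the nodes from the quorum.
  q_node_count        = 6      # unchanged current node count
  q_target_node_count = 4      # reduced node count

  # Step 2: tidy up the GCP resources for the removed nodes.
  q_node_count        = 4
  q_target_node_count = null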
    

Step 2: Tidy Up GCP Resources for Removed Nodes

  1. Edit the terraform.tfvars file:

    1. Set the value of the q_node_count variable to the reduced number of nodes in the cluster.

    2. Set the value of the q_target_node_count variable to null.

  2. Run the terraform apply command.

  3. Review the resources to be removed and then enter yes.

    Terraform tidies up the resources for removed nodes and displays:

    • The Apply complete! message with a count of removed resources

    • Your deployment’s unique name

    • The remaining GCS buckets for your Qumulo cluster

    • The remaining floating IP addresses for your Qumulo cluster

    • The remaining primary (static) IP addresses for your Qumulo cluster

    • The Qumulo Core Web UI endpoint

    For example:

    Apply complete! Resources: 0 added, 0 changed, 66 destroyed.
    
    Outputs:
    
    cluster_provisioned = "Success"
    deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
    ...
    persistent_storage_bucket_names = tolist([
      "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
      "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
      "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
      ...
      "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
    ])
    qumulo_floating_ips = tolist([
      "203.0.113.42",
      "203.0.113.84",
      ...
    ])
    ...
    qumulo_primary_ips = tolist([
      "203.0.113.4",
      "203.0.113.5",
      "203.0.113.6",
      "203.0.113.7"
    ])
    ...
    qumulo_private_url_node1 = "https://203.0.113.10"

  4. To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
    

Increasing the Soft Capacity Limit for an Existing Cluster

Increasing the soft capacity limit for an existing cluster is a two-step process in which you configure new persistent storage parameters and then configure new compute and cache deployment parameters.

Step 1: Set New Persistent Storage Parameters

  1. Edit the terraform.tfvars file in the persistent-storage directory and set the soft_capacity_limit variable to a higher value.
  2. Run the terraform apply command.

    Review the Terraform execution plan and then enter yes.

    Terraform creates new GCS buckets as necessary and displays:

    • The Apply complete! message with a count of changed resources

    • The names of the created GCS buckets

    • Your deployment’s unique name

    • The new soft capacity limit

    For example:

    Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
    
    Outputs:
    
    persistent_storage_bucket_names = tolist([
      "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
      "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
      "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
      ...
      "ab2cdefghij-my-deployment-klmnopqr6st-qps--16"
    ])
    deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
    ...
    soft_capacity_limit = "1000 TB"
    

Step 2: Update Existing Compute and Cache Resource Deployment

  1. Navigate to the root directory of the gcp-terraform-cnq-<x.y> repository.
  2. Run the terraform apply command.

    Review the Terraform execution plan and then enter yes.

    Terraform updates the necessary IAM roles and GCS bucket policies, adds GCS buckets to the persistent storage list for the cluster, increases the soft capacity limit, and displays the Apply complete! message.

    When the Provisioner shuts down automatically, this process is complete.

Changing the GCE Instance Type of Your CNQ on GCP Cluster

You can change the GCE instance type and node count of your cluster, and you can convert your cluster from single-zone to multi-zone or from multi-zone to single-zone.

Changing the GCE instance type of your cluster is a three-step process in which you create a new deployment in a new Terraform workspace and join the new GCE instances to a quorum, remove the existing GCE instances, and then clean up your GCS bucket policies. A consolidated command sketch appears at the end of Step 3.

Step 1: Create a New Deployment in a New Terraform Workspace

  1. To create a new Terraform workspace, run the terraform workspace new my-new-workspace-name command.
  2. Edit the terraform.tfvars file (see the sketch after this procedure):

    1. Specify the value for the gcp_subnet_name variable.

    2. Specify the value for the q_instance_type variable.
    3. Set the value of the q_replacement_cluster variable to true.
    4. Set the value of the q_existing_deployment_unique_name variable to the current deployment’s name.
    5. (Optional) To change the number of nodes, specify the value for the q_node_count variable.
  3. Run the terraform apply command.

    Review the Terraform execution plan and then enter yes.

    Terraform displays:

    • The Apply complete! message with a count of added resources

    • Your deployment’s unique name

    • The names of the created GCS buckets

    • The same floating IP addresses for your Qumulo cluster

    • New primary (static) IP addresses for your Qumulo cluster

    • The Qumulo Core Web UI endpoint

    For example:

    Apply complete! Resources: 66 added, 0 changed, 0 destroyed.
    
    Outputs:
    
    cluster_provisioned = "Success"
    deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
    ...
    persistent_storage_bucket_names = tolist([
      "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
      "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
      "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
      ...
      "ab2cdefghij-my-deployment-klmnopqr6st-qps--16"
    ])
    qumulo_floating_ips = tolist([
      "203.0.113.42",
      "203.0.113.84",
      ...
    ])
    ...
    qumulo_primary_ips = tolist([
      "203.0.113.4",
      "203.0.113.5",
      "203.0.113.6",
      "203.0.113.7"
    ])
    ...
    qumulo_private_url_node1 = "https://203.0.113.10"
    
  4. To ensure that the Provisioner shuts down automatically, monitor the /qumulo/my-deployment-name/last-run-status parameter for the Provisioner. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console.
  5. To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
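
The following sketch shows the terraform.tfvars changes from step 2. The subnet name, instance type, and node count are placeholders; the existing deployment name must match your current deployment’s unique name.

  gcp_subnet_name                   = "my-subnet"
  q_instance_type                   = "n2-standard-32"    # the new GCE instance type (placeholder)
  q_replacement_cluster             = true
  q_existing_deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
  q_node_count                      = 4                   # optional: new node count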

Step 2: Remove the Previous Deployment

  1. To select the previous Terraform workspace (for example, default), run the terraform workspace select default command.
  2. To ensure that the correct workspace is selected, run the terraform workspace show command.
  3. Run the terraform destroy command.

    Review the Terraform execution plan and then enter yes.

    Terraform displays the Destroy complete! message with a count of destroyed resources.

    The previous deployment is deleted.

Step 3: Clean Up GCS Bucket Policies

  1. To list your Terraform workspaces, run the terraform workspace list command.
  2. To select your new Terraform workspace, run the terraform workspace select <my-new-workspace-name> command.
  3. Edit the terraform.tfvars file and set the q_replacement_cluster variable to false.
  4. Run the terraform apply command. This ensures that the GCS bucket policies have least privilege.

    Review the Terraform execution plan and then enter yes.

    Terraform displays the Apply complete! message with a count of destroyed resources.
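
The following sketch consolidates the Terraform CLI commands from Steps 1–3 of this procedure; the workspace name is a placeholder.

  # Step 1: create the replacement deployment in a new workspace.
  terraform workspace new my-new-workspace-name
  terraform apply      # with q_replacement_cluster = true in terraform.tfvars

  # Step 2: remove the previous deployment from its original workspace.
  terraform workspace select default
  terraform workspace show
  terraform destroy

  # Step 3: clean up GCS bucket policies from the new workspace.
  terraform workspace select my-new-workspace-name
  terraform apply      # after setting q_replacement_cluster = false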

Deleting an Existing Cluster

Deleting a cluster is a two-step process in which you delete your cluster’s compute and cache resources and then delete your persistent storage. A consolidated command sketch follows Step 2.

Step 1: To Delete Your Cluster’s Compute and Cache Resources

  1. After you back up your data safely, edit your terraform.tfvars file and set the term_protection variable to false.
  2. Run the terraform apply command.

    Review the Terraform execution plan and then enter yes.

    Terraform displays the Apply complete! message with a count of changed resources.

  3. Run the terraform destroy command.

    Review the Terraform execution plan and then enter yes.

    Terraform deletes all of your cluster’s compute and cache resources and displays the Destroy complete! message and a count of destroyed resources.

Step 2: To Delete Your Cluster’s Persistent Storage

  1. Navigate to the persistent-storage directory.
  2. Edit your terraform.tfvars file and set the prevent_destroy parameter to false.
  3. Run the terraform apply command.

    Review the Terraform execution plan and then enter yes.

    Terraform displays the Apply complete! message with a count of changed resources.

  4. Run the terraform destroy command.

    Review the Terraform execution plan and then enter yes.

    Terraform deletes all of your cluster’s persistent storage and displays the Destroy complete! message and a count of destroyed resources.
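
The following sketch consolidates the deletion commands from both steps; run each pair of commands from the corresponding Terraform directory.

  # Step 1: from the compute directory, after setting term_protection = false.
  terraform apply
  terraform destroy

  # Step 2: from the persistent-storage directory, after setting prevent_destroy = false.
  terraform apply
  terraform destroy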