This section explains how to deploy Cloud Native Qumulo (CNQ) on GCP by using Terraform to create the persistent storage and the cluster compute and cache resources. It also provides recommendations for Terraform deployments and information about post-deployment actions and optimization.
For an overview of CNQ on GCP, its prerequisites, and limits, see How Cloud Native Qumulo Works.
The qumulo-terraform-gcp-<x.y>.zip file (the version in the file name corresponds to the provisioning scripts, not to the version of Qumulo Core) contains comprehensive Terraform configurations that let you deploy GCS buckets and then create a CNQ cluster with 1 or 3–24 instances that have fully elastic compute and capacity.
Prerequisites
This section explains the prerequisites to deploying CNQ on GCP.
-
To allow instances without external IP addresses to reach GCP APIs, you must enable Private Google Access.
-
To allow your Qumulo cluster to report metrics to Qumulo, your VPC must have outbound Internet connectivity through a Cloud NAT gateway. Your instance shares no file data during this process.
Important
Connectivity to the following endpoints is required for a successful deployment of a Qumulo instance and for quorum formation:
- api.missionq.qumulo.com
- api.nexus.qumulo.com
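If you still need to set up these networking prerequisites, the following is a minimal sketch that uses the gcloud CLI; the network, subnet, router, NAT, and region names are placeholders for your own environment:
# Enable Private Google Access on the subnet that hosts the cluster.
gcloud compute networks subnets update my-subnet \
  --region=us-central1 \
  --enable-private-ip-google-access
# Create a Cloud Router and a Cloud NAT gateway for outbound Internet connectivity.
gcloud compute routers create my-router \
  --network=my-vpc \
  --region=us-central1
gcloud compute routers nats create my-nat \
  --router=my-router \
  --region=us-central1 \
  --nat-all-subnet-ip-ranges \
  --auto-allocate-nat-external-ips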
-
To enable the following services for your Google Cloud project, use the gcloud services enable command:
- cloudkms.googleapis.com
- compute.googleapis.com
- logging.googleapis.com
- monitoring.googleapis.com
- secretmanager.googleapis.com
- storage-api.googleapis.com: Required only if you store your Terraform state in GCS buckets
- storage.googleapis.com
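For example, the following single command enables all of the services above; run it against the project that will host the cluster:
gcloud services enable \
  cloudkms.googleapis.com \
  compute.googleapis.com \
  logging.googleapis.com \
  monitoring.googleapis.com \
  secretmanager.googleapis.com \
  storage-api.googleapis.com \
  storage.googleapis.com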
-
To configure IAM, add the least-privilege role configurations appropriate for your organization to separate service accounts:
-
Terraform Deployment Service Account
- roles/cloudkms.cryptoKeyEncrypterDecrypter: Required only if you use customer-managed encryption keys (CMEKs)
- roles/compute.admin
- roles/iam.serviceAccountUser
- roles/logging.configWriter: Required only for creating log sinks
- roles/monitoring.editor: Required only for creating dashboards and alerts
- roles/resourcemanager.projectIamAdmin
- roles/secretmanager.admin
- roles/storage.admin
-
Virtual Machine (Node) Service Account
- roles/compute.viewer: Required only for instance metadata introspection
- roles/secretmanager.secretAccessor
- roles/storage.objectViewer
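A minimal sketch of granting one of these roles with the gcloud CLI; the project ID and service account emails are placeholders, and you would repeat the command for each role that applies to each account:
# Grant a role to the Terraform deployment service account.
gcloud projects add-iam-policy-binding my-gcp-project-id \
  --member="serviceAccount:terraform-deployer@my-gcp-project-id.iam.gserviceaccount.com" \
  --role="roles/compute.admin"
# Grant a role to the virtual machine (node) service account.
gcloud projects add-iam-policy-binding my-gcp-project-id \
  --member="serviceAccount:qumulo-node@my-gcp-project-id.iam.gserviceaccount.com" \
  --role="roles/secretmanager.secretAccessor"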
How the CNQ Provisioner Works
The CNQ Provisioner is a Google Compute Engine (GCE) instance that configures your Qumulo cluster and any additional GCP environment requirements.
You can monitor the Provisioner from the GCP Console or by using the gcloud CLI.
To Monitor the Provisioner’s Status by Using the GCP Console
-
Log in to the GCP Console.
-
In your project, click Compute Engine > VM Instances.
-
To ensure that the Provisioner shuts down automatically, monitor the deployment_unique_name/compute and deployment_unique_name/last-run-status document paths in the Firestore database for the Provisioner.
The Provisioner stores all necessary state information in a Firestore database and shuts down automatically when it completes its tasks.
To Monitor the Provisioner’s Status by Using the gcloud CLI
-
To check whether the Provisioner is still running, use the
gcloud compute instances list command and specify your deployment's unique name and an output format. For example:
gcloud compute instances list \
  --filter="mydeploymentname-ABCDE01EG2H" \
  --format="table(name,zone,status)"
-
Do one of the following:
-
If the Provisioner's status is
RUNNING, you can retrieve the most recent console logs for troubleshooting by using the gcloud compute instances get-serial-port-output command, specifying your deployment's name and zone. For example:
gcloud compute instances get-serial-port-output "mydeploymentname-ABCDE01EG2H" \
  --zone us-central1-a \
  --port 1 | tail -n 100
- If the Provisioner's status is
TERMINATED, you can check the Firestore database named after the unique deployment name of your persistent storage.
Step 1: Deploying Cluster Persistent Storage
This section explains how to deploy the GCS buckets that act as persistent storage for your Qumulo cluster.
Part 1: Prepare the Required Files
Before you can deploy the persistent storage for your cluster, you must download and prepare the required files.
-
Log in to Qumulo Nexus and click Downloads > Cloud Native Qumulo Downloads.
-
On the GCP tab, in the Download the required files section, select the Qumulo Core version that you want to deploy and then download the corresponding Terraform configuration and Debian or RPM package.
-
In a new or existing GCS bucket, within your chosen prefix, create the qumulo-core-install directory.
-
Within this directory, create another directory with the Qumulo Core version as its name. For example:
gs://my-gcs-bucket-name/my-prefix/qumulo-core-install/7.6.0
Tip: Make a new subdirectory for every new release of Qumulo Core.
-
Copy qumulo-core.deb or qumulo-core.rpm into the directory named after the Qumulo Core version (in this example, 7.6.0).
-
Copy qumulo-terraform-gcp-<x.y>.zip to your Terraform environment and then decompress the file.
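For example, a sketch of the upload steps above with the gcloud CLI, assuming the bucket, prefix, and version shown in the example path and a locally downloaded qumulo-core.deb:
# Copying the installer also creates the qumulo-core-install/7.6.0 path in the bucket.
gcloud storage cp qumulo-core.deb \
  gs://my-gcs-bucket-name/my-prefix/qumulo-core-install/7.6.0/qumulo-core.deb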
Part 2: Configure the Persistent Storage
-
Navigate to the
persistent-storage directory.
-
Edit the provider.tf file:
-
To store the Terraform state remotely, add the name of a GCS bucket to the section that begins with backend "gcs" {.
-
To store the Terraform state locally, comment out the section that begins with backend "gcs" { and uncomment the section that contains backend = "local".
Important: We don’t recommend storing the Terraform state locally for production deployments.
-
-
Run the
terraform init command.
Terraform prepares the environment and displays the message Terraform has been successfully initialized!
-
Edit the
terraform.tfvars file:
-
Specify the deployment_name and the correct gcp_region for your cluster’s persistent storage.
-
Set the soft_capacity_limit to 500 (or higher).
Note: This value specifies the initial capacity limit of your Qumulo cluster (in TB). You can increase this limit at any time.
-
Part 3: Create the Necessary Resources
-
To authenticate to your GCP account, use the
gcloud CLI (a minimal sketch appears after this list).
-
Run the terraform apply command.
-
Review the Terraform execution plan and then enter yes.
Terraform creates resources according to the execution plan and displays:
-
The names of the created GCS buckets
-
Your deployment’s unique name
For example:
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
])
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
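A minimal sketch of this part of the procedure, assuming that the gcloud CLI is installed and that you run it from the directory that contains the persistent-storage configuration (the project ID is a placeholder):
# Authenticate and set Application Default Credentials for Terraform.
gcloud auth login
gcloud auth application-default login
gcloud config set project my-gcp-project-id
# Create the persistent storage resources.
cd persistent-storage
terraform apply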
Step 2: Deploying Cluster Compute and Cache Resources
This section explains how to deploy compute and cache resources for a Qumulo cluster by using an Ubuntu image and the Qumulo Core .deb installer.
Recommendations
We strongly recommend reviewing the following recommendations before beginning this process.
-
Provisioning completes successfully when the Provisioner shuts down automatically. If the Provisioner doesn’t shut down, the provisioning cycle has failed and you must troubleshoot it. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console.
-
The first variable in the example configuration files in the
qumulo-terraform-gcp repository is deployment_name. To help avoid conflicts between resource labels and other deployment components, Terraform ignores the deployment_name value and generates a deployment_unique_name variable. Terraform appends a random alphanumeric value to this variable and then tags all future resources with it. The deployment_unique_name variable never changes during subsequent Terraform deployments.
-
If you plan to deploy multiple Qumulo clusters, give the
q_cluster_name variable a unique name for each cluster.
Part 1: To Deploy the Cluster Compute and Cache Resources
-
Edit the
provider.tf file:
-
To store the Terraform state remotely, add the name of a GCS bucket to the sections that begin with backend "gcs" { and data "terraform_remote_state" "persistent_storage" {.
-
To store the Terraform state locally, comment out the sections that begin with backend "gcs" { and data "terraform_remote_state" "persistent_storage" { and uncomment the section that contains backend = "local".
Important: We don’t recommend storing the Terraform state locally for production deployments.
-
-
Navigate to the
qumulo-terraform-gcp-<x.y>/compute directory and then run the terraform init command.
Terraform prepares the environment and displays the message Terraform has been successfully initialized!
-
Edit the
terraform.tfvars file and specify the values for all variables.
For more information, see README.pdf in qumulo-terraform-gcp-<x.y>.zip.
-
Run the
terraform apply command.
-
Review the Terraform execution plan and then enter yes.
Terraform creates resources according to the execution plan and displays:
-
Your deployment’s unique name
-
The names of the created GCS buckets
-
The floating IP addresses for your Qumulo cluster
Note
You must specify the floating IP addresses in your terraform.tfvars file explicitly.
-
The primary (static) IP addresses for your Qumulo cluster
-
The Qumulo Core Web UI endpoint
For example:
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
...
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16",
])
qumulo_floating_ips = tolist([
  "203.0.113.42",
  "203.0.113.84",
  ...
])
...
qumulo_primary_ips = tolist([
  "203.0.113.5",
  "203.0.113.6",
  "203.0.113.7"
])
...
qumulo_private_url_node1 = "https://203.0.113.5"
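Before you continue to Part 2, you can optionally confirm that the Qumulo Core Web UI endpoint from the output responds. A sketch that uses the example IP address above, run from a host with network access to the cluster:
curl -ks -o /dev/null -w "%{http_code}\n" https://203.0.113.5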
Part 2: To Mount the Qumulo File System
-
To log in to your cluster’s Web UI, use the endpoint from the Terraform output and the username and password that you have configured.
Important
If you change the administrative password for your cluster by using the Qumulo Core Web UI, qq CLI, or REST API after deployment, you must update your password in GCP Secret Manager.
You can use the Qumulo Core Web UI to create and manage NFS exports, SMB shares, snapshots, and continuous replication relationships. You can also join your cluster to Active Directory, configure LDAP, and perform many other operations.
-
Mount your Qumulo file system by using NFS or SMB and your cluster’s DNS name or IP address.
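For example, a hedged sketch of an NFS mount from a Linux client, using the primary IP address from the example output above and a hypothetical export named /files:
sudo mkdir -p /mnt/qumulo
sudo mount -t nfs 203.0.113.5:/files /mnt/qumulo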
Step 3: Performing Post-Deployment Actions
This section describes the common actions you can perform on a CNQ cluster after deploying it.
Adding Nodes to an Existing Cluster
To add nodes to an existing cluster, the total node count must be greater than that of the current deployment.
- Edit terraform.tfvars and change the value of q_node_count to a new value.
- Run the terraform apply command.
- Review the Terraform execution plan and then enter yes.
Terraform displays additional primary (static) IP addresses for the new nodes. For example:
qumulo_primary_ips = tolist([
  "203.0.113.5",
  "203.0.113.6",
  "203.0.113.7",
  "203.0.113.8",
  "203.0.113.9"
])
- To ensure that the Provisioner shuts down automatically, monitor the deployment_unique_name/compute and deployment_unique_name/last-run-status document paths in the Firestore database for the Provisioner. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console (see the sketch after this list).
- To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
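To watch the Provisioner from your terminal until its status changes to TERMINATED, you can reuse the gcloud filter shown earlier (a sketch; the deployment name is the example value):
watch -n 30 'gcloud compute instances list --filter="mydeploymentname-ABCDE01EG2H" --format="table(name,zone,status)"'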
Removing Nodes from an Existing Cluster
Removing nodes from an existing cluster is a two-step process in which you remove the nodes from your cluster’s quorum and then tidy up the GCP resources for the removed nodes.
Step 1: Remove Nodes from the Cluster’s Quorum
You must perform this step while the cluster is running.
-
Edit the
terraform.tfvars file and set the value of q_target_node_count to a lower number of nodes.
-
Run the
terraform apply command.
-
Review the nodes to be removed and then enter
yes.
Terraform removes the nodes and displays:
-
Your deployment’s unique name
-
The remaining GCS buckets for your Qumulo cluster
-
The primary (static) IP addresses of the nodes removed from your Qumulo cluster
-
The Qumulo Core Web UI endpoint
For example:
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
...
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
])
qumulo_floating_ips = tolist([
  "203.0.113.42",
  "203.0.113.84",
  ...
])
...
qumulo_primary_ips_removed_nodes = "203.0.113.24",
...
qumulo_private_url_node1 = "https://203.0.113.10"
Step 2: Tidy Up GCP Resources for Removed Nodes
-
Edit the
terraform.tfvars file:
-
Set the value of the q_node_count variable to a lower number of nodes.
-
Set the value of the q_target_node_count variable to null.
-
-
Run the
terraform apply command.
-
Review the resources to be removed and then enter
yes.
-
To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
Terraform tidies up the resources for removed nodes and displays:
-
Your deployment’s unique name
-
The remaining GCS buckets for your Qumulo cluster
-
The remaining floating IP addresses for your Qumulo cluster
-
The remaining primary (static) IP addresses for your Qumulo cluster
-
The Qumulo Core Web UI endpoint
For example:
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
...
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
])
qumulo_floating_ips = tolist([
  "203.0.113.42",
  "203.0.113.84",
  ...
])
...
qumulo_primary_ips = tolist([
  "203.0.113.4",
  "203.0.113.5",
  "203.0.113.6",
  "203.0.113.7"
])
...
qumulo_private_url_node1 = "https://203.0.113.10"
Increasing the Soft Capacity Limit for an Existing Cluster
Increasing the soft capacity limit for an existing cluster is a two-step process in which you configure new persistent storage parameters and then configure new compute and cache deployment parameters.
Step 1: Set New Persistent Storage Parameters
- Edit the
terraform.tfvars file in the persistent-storage directory and set the soft_capacity_limit variable to a higher value.
-
Run the
terraform apply command.
Review the Terraform execution plan and then enter yes.
Terraform creates new GCS buckets as necessary and displays:
-
The
Apply complete! message with a count of changed resources
-
The names of the created GCS buckets
-
Your deployment’s unique name
-
The new soft capacity limit
For example:
Apply complete! Resources: 0 added, 1 changed, 0 destroyed.
Outputs:
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
])
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
...
soft_capacity_limit = "1000 TB"
Step 2: Update Existing Compute and Cache Resource Deployment
- Navigate to the root directory of the
qumulo-terraform-gcp-<x.y> repository.
-
Run the
terraform apply command.
Review the Terraform execution plan and then enter yes.
Terraform updates the necessary IAM roles and GCS bucket policies, adds GCS buckets to the persistent storage list for the cluster, increases the soft capacity limit, and displays the Apply complete! message.
When the Provisioner shuts down automatically, this process is complete.
Changing the GCE Instance Type of Your CNQ on GCP Cluster
You can change the GCE instance type and node count, and you can convert your cluster from single-zone to multi-zone (or the other way around).
- To minimize potential availability interruptions, you must perform the cluster replacement procedure as a two-quorum event. For example, if you stop the existing GCE instances by using the GCP Console and change the GCE instance types, two quorum events occur for each node and the read and write cache isn't optimized for the GCE instance type.
- Performing the cluster replacement procedure ensures that the required GCE instance types are available in advance.
Changing the GCE instance type of your cluster is a three-step process in which you create a new deployment in a new Terraform workspace and join the new GCE instances to a quorum, remove the existing GCE instances, and then clean up your GCS bucket policies.
Step 1: Create a New Deployment in a New Terraform Workspace
- To create a new Terraform workspace, run the
terraform workspace new my-new-workspace-name command (see the sketch at the end of this step).
-
Edit the
terraform.tfvars file:
-
Specify the value for the gcp_subnet_name variable.
-
Specify the value for the gcp_zones variable.
Note: For multi-zone deployments, specify values as a comma-delimited list.
-
Specify the value for the q_instance_type variable.
-
Set the value of the q_replacement_cluster variable to true.
-
Set the value of the q_existing_deployment_unique_name variable to the current deployment’s name.
-
(Optional) To change the number of nodes, specify the value for the q_node_count variable.
Important: Leave the other variables unchanged.
-
-
Run the
terraform apply command.
Review the Terraform execution plan and then enter yes.
Terraform creates resources according to the execution plan and displays:
-
Your deployment’s unique name
-
The names of the created GCS buckets
-
The same floating IP addresses for your Qumulo cluster
-
New primary (static) IP addresses for your Qumulo cluster
-
The Qumulo Core Web UI endpoint
For example:
deployment_unique_name = "mydeploymentname-ABCDE01EG2H"
...
persistent_storage_bucket_names = tolist([
  "ab5cdefghij-my-deployment-klmnopqr9st-qps-1",
  "ab4cdefghij-my-deployment-klmnopqr8st-qps-2",
  "ab3cdefghij-my-deployment-klmnopqr7st-qps-3",
  ...
  "ab2cdefghij-my-deployment-klmnopqr6st-qps-16"
])
qumulo_floating_ips = tolist([
  "203.0.113.42",
  "203.0.113.84",
  ...
])
...
qumulo_primary_ips = tolist([
  "203.0.113.4",
  "203.0.113.5",
  "203.0.113.6",
  "203.0.113.7"
])
...
qumulo_private_url_node1 = "https://203.0.113.10"
- To ensure that the Provisioner shuts down automatically, monitor the deployment_unique_name/compute and deployment_unique_name/last-run-status document paths in the Firestore database for the Provisioner. To monitor the Provisioner’s status, you can watch the Terraform operations in your terminal or monitor the Provisioner in the GCP Console.
- To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
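A short sketch of verifying which Terraform workspace is selected before you run terraform apply during this procedure (the workspace name is the example from the first step):
terraform workspace list                       # the current workspace is marked with an asterisk
terraform workspace show                       # prints the name of the current workspace
terraform workspace select my-new-workspace-name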
Step 2: Remove the Previous Deployment
- To select the previous Terraform workspace (for example,
default), run the terraform workspace select default command.
- To ensure that the correct workspace is selected, run the terraform workspace show command.
-
Run the
terraform destroy command.
Review the Terraform execution plan and then enter yes.
Terraform displays the Destroy complete! message with a count of destroyed resources. The previous deployment is deleted.
The persistent storage deployment remains in its original Terraform workspace. You can perform the next cluster replacement procedure in the default workspace.
Step 3: Clean Up GCS Bucket Policies
- To list your Terraform workspaces, run the
terraform workspace list command.
- To select your new Terraform workspace, run the terraform workspace select <my-new-workspace-name> command.
- Edit the terraform.tfvars file and set the q_replacement_cluster variable to false.
-
Run the
terraform apply command. This ensures that the GCS bucket policies have least privilege.
Review the Terraform execution plan and then enter yes.
Terraform displays the Apply complete! message with a count of destroyed resources.
Deleting an Existing Cluster
Deleting a cluster is a two-step process in which you delete your cluster’s compute and cache resources and then delete your persistent storage.
- When you no longer need your cluster, you must back up all important data on the cluster safely before deleting the cluster.
- When you delete your cluster's compute and cache resources, you can no longer access your persistent storage.
Step 1: To Delete Your Cluster’s Compute and Cache Resources
- After you back up your data safely, edit your
terraform.tfvars file and set the term_protection variable to false.
-
Run the
terraform apply command.
Review the Terraform execution plan and then enter yes.
Terraform displays the Apply complete! message with a count of changed resources.
-
Run the
terraform destroy command.
Review the Terraform execution plan and then enter yes.
Terraform deletes all of your cluster’s compute and cache resources and displays the Destroy complete! message and a count of destroyed resources.
Step 2: To Delete Your Cluster’s Persistent Storage
- Navigate to the
persistent-storage directory.
- Edit your terraform.tfvars file and set the prevent_destroy parameter to false.
-
Run the
terraform apply command.
Review the Terraform execution plan and then enter yes.
Terraform displays the Apply complete! message with a count of changed resources.
-
Run the
terraform destroy command.
Review the Terraform execution plan and then enter yes.
Terraform deletes all of your cluster’s persistent storage and displays the Destroy complete! message and a count of destroyed resources.