This section explains how to deploy Cloud Native Qumulo (CNQ) on Azure by creating the persistent storage and the cluster compute and cache resources by using Terraform. It also provides recommendations for Terraform deployments and information about post-deployment actions and optimization.
For an overview of CNQ on Azure, its prerequisites, and limits, see How Cloud Native Qumulo Works.
The azure-terraform-cnq-<x.y>.zip file (the version in the file name corresponds to the provisioning scripts, not to the version of Qumulo Core) contains comprehensive Terraform configurations that let you deploy storage accounts and then create a CNQ cluster with 3 to 24 instances and fully elastic compute and capacity.
Prerequisites
This section explains the prerequisites for deploying CNQ on Azure.
- To allow your Qumulo instance to report metrics to Qumulo, your Azure Virtual Network must have outbound Internet connectivity through a NAT gateway or a firewall. Your instance shares no file data during this process.
  Important
  Connectivity to the following endpoints is required for a successful deployment of a Qumulo instance and for quorum formation: api.missionq.qumulo.com and api.nexus.qumulo.com
- Before you configure your Terraform environment, you must sign in to the az CLI.
- Ensure that your Azure subscription includes the Reader and Contributor role assignments.
- For scenarios in which your CNQ cluster must run in a secure environment, you must make the following changes in the terraform.tfvars file before deploying your cluster’s persistent storage:
  - Set the disable_public_network_access variable to true.
  - Specify the values for the privatelink_blob_dns_zone_resource_group_name and privatelink_blob_dns_zone_virtual_link_name variables.
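For example, a minimal terraform.tfvars fragment for a secure-environment deployment might look like the following sketch; the resource group and virtual link names are placeholders, not values from your environment:

  disable_public_network_access                 = true
  privatelink_blob_dns_zone_resource_group_name = "my-private-dns-rg"       # placeholder
  privatelink_blob_dns_zone_virtual_link_name   = "my-blob-dns-vnet-link"   # placeholder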
How the CNQ Provisioner Works
The CNQ Provisioner is an Azure Compute instance that configures your Qumulo cluster and any additional Azure environment requirements.
The Provisioner stores all necessary state information in Azure App Configuration (on the left navigation panel, click Operations > Configuration Explorer) and shuts down automatically when it completes its tasks.
Step 1: Deploying Cluster Persistent Storage
This section explains how to deploy the storage accounts that act as persistent storage for your Qumulo cluster.
Part 1: Prepare the Required Files
Before you can deploy the persistent storage for your cluster, you must download and prepare the required files.
- Log in to Nexus and click Downloads > Cloud Native Qumulo Downloads.
- On the Azure tab, in the Download the required files section, select the Qumulo Core version that you want to deploy and then download the corresponding Terraform configuration and Debian package.
- In a storage account named qumulo, create the images directory. Within this directory, create another directory with the Qumulo Core version as its name. For example: my-storage-account/qumulo/images/7.2.3.2
  Tip
  Make a new subdirectory for every new release of Qumulo Core.
- Copy qumulo-core.deb into the directory named after the Qumulo Core version (in this example, 7.2.3.2).
- Copy azure-terraform-cnq-<x.y>.zip to your Terraform environment and decompress the file.
Part 2: Configure the Persistent Storage
- Navigate to the persistent-storage directory.
- Edit the provider.tf file:
  - To store the Terraform state remotely, add the storage account details to the section that begins with backend "azurerm" {.
  - To store the Terraform state locally, comment out the section that begins with backend "azurerm" { and uncomment the section that contains backend = "local".
    Important
    We don’t recommend storing the Terraform state locally for production deployments.
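For example, the remote-state section of provider.tf might look like the following sketch; the storage account details are placeholders:

  terraform {
    backend "azurerm" {
      resource_group_name  = "my-terraform-state-rg"    # placeholder
      storage_account_name = "myterraformstate"         # placeholder
      container_name       = "tfstate"                  # placeholder
      key                  = "persistent-storage.tfstate"
    }
  }
  # To store the state locally instead (not recommended for production),
  # comment out the block above and uncomment the section of provider.tf
  # that contains backend = "local".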
- Run the terraform init command.
  Terraform prepares the environment and displays the message Terraform has been successfully initialized!
- Edit the terraform.tfvars file:
  - Specify the deployment_name, the az_subscription_id, and the correct az_location for your cluster’s persistent storage.
  - Specify the az_subnet_name, az_vnet_name, and the az_vnet_rg (resource group) for your Virtual Network.
  - Set the soft_capacity_limit to 500 (or higher).
    Note
    This value specifies the initial capacity limit of your Qumulo cluster (in TB). You can increase this limit at any time.
  - If you have an existing resource group that must contain your CNQ on Azure deployment due to your organization’s policies or specific naming conventions, specify a value for the advanced_az_resource_group_name variable.
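For example, a hypothetical terraform.tfvars sketch for the persistent storage; every value shown is a placeholder for your own environment:

  deployment_name     = "my-deployment"                          # placeholder
  az_subscription_id  = "00000000-0000-0000-0000-000000000000"   # placeholder
  az_location         = "westus2"                                # placeholder
  az_subnet_name      = "cnq-subnet"                             # placeholder
  az_vnet_name        = "my-vnet"                                # placeholder
  az_vnet_rg          = "my-vnet-rg"                             # placeholder resource group
  soft_capacity_limit = 500                                      # initial capacity limit, in TB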
Part 3: Create the Necessary Resources
- To authenticate to your Azure account, use the az CLI.
- Run the terraform apply command.
- Review the Terraform execution plan and then enter yes.
  Terraform creates resources according to the execution plan and displays:
  - The names of the created persistent storage accounts
  - Your persistent storage resource group’s unique name
  For example:
  persistent_storage_accounts = tolist([
    "ab5cdefghij1",
    "ab4cdefghij2",
    "ab3cdefghij3",
    "ab2cdefghij4",
  ])
  persistent_storage_resource_group = "mynamePStore-abcde"
Step 2: Deploying Cluster Compute and Cache Resources
This section explains how to deploy compute and cache resources for a Qumulo cluster by using an Ubuntu image and the Qumulo Core .deb installer.
Recommendations
We strongly recommend reviewing the following information before you begin this process.
- Provisioning completes successfully when the Provisioner shuts down automatically. If the Provisioner doesn’t shut down, the provisioning cycle has failed and you must troubleshoot it. To monitor the Provisioner’s status, you can watch the Terraform status posts in your terminal or in Azure App Configuration (on the left navigation panel, click Operations > Configuration Explorer).
- The first variable in the example configuration files in the azure-terraform-cnq repository is deployment_name. To help avoid conflicts between resource groups and other deployment components, Terraform ignores the deployment_name value and generates a deployment_unique_name variable. Terraform appends a random, alphanumeric value to the variable and then tags all future resources with this value. The deployment_unique_name variable never changes during subsequent Terraform deployments.
- If you plan to deploy multiple Qumulo clusters, give the q_cluster_name variable a unique name for each cluster.
- We recommend forwarding DNS queries to Qumulo Authoritative DNS (QDNS). For a single-AZ deployment, to allow Qumulo Core to create an outbound DNS resolver, specify values for the q_cluster_fqdn and second_private_subnet_id variables. The resolver uses the q_cluster_fqdn variable to forward DNS requests to your cluster, where Qumulo Core resolves DNS for your floating IP addresses.
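For example, a hypothetical terraform.tfvars fragment for QDNS forwarding; both values are placeholders:

  q_cluster_fqdn           = "cluster.corp.example.com"     # placeholder cluster FQDN
  second_private_subnet_id = "<second-private-subnet-id>"   # placeholder second private subnet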
Part 1: To Deploy the Cluster Compute and Cache Resources
- To ensure that your Virtual Network subnet has the required service endpoints, take the following steps:
  - In the Azure Portal, search for Virtual networks and then select your Virtual Network.
  - On the left panel, click Settings > Service endpoints.
  - On the Service endpoints page, ensure that the Microsoft.KeyVault and Microsoft.Storage service endpoints are added and enabled for the subnet where CNQ on Azure is to be deployed.
  Important
  It isn’t possible to deploy your cluster without these service endpoints.
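If you manage the subnet with Terraform instead of the Azure Portal, a minimal sketch of an equivalent subnet definition might look like this; the names and address prefix are placeholders:

  resource "azurerm_subnet" "cnq" {
    name                 = "cnq-subnet"      # placeholder
    resource_group_name  = "my-vnet-rg"      # placeholder
    virtual_network_name = "my-vnet"         # placeholder
    address_prefixes     = ["10.0.1.0/24"]   # placeholder
    service_endpoints    = ["Microsoft.KeyVault", "Microsoft.Storage"]
  }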
- Edit the provider.tf file:
  - To store the Terraform state remotely, add the storage account details to the sections that begin with backend "azurerm" { and data "terraform_remote_state" "persistent_storage" {.
  - To store the Terraform state locally, comment out the sections that begin with backend "azurerm" { and data "terraform_remote_state" "persistent_storage" { and uncomment the section that contains backend = "local".
    Important
    We don’t recommend storing the Terraform state locally for production deployments.
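For example, the remote-state sections of the compute deployment’s provider.tf might look like the following sketch; the storage account details are placeholders, and the remote state must point at the state that the persistent-storage deployment wrote:

  terraform {
    backend "azurerm" {
      resource_group_name  = "my-terraform-state-rg"   # placeholder
      storage_account_name = "myterraformstate"        # placeholder
      container_name       = "tfstate"                 # placeholder
      key                  = "cnq-cluster.tfstate"
    }
  }

  data "terraform_remote_state" "persistent_storage" {
    backend = "azurerm"
    config = {
      resource_group_name  = "my-terraform-state-rg"   # placeholder
      storage_account_name = "myterraformstate"        # placeholder
      container_name       = "tfstate"                 # placeholder
      key                  = "persistent-storage.tfstate"
    }
  }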
- Navigate to the azure-terraform-cnq-<x.y> directory and then run the terraform init command.
  Terraform prepares the environment and displays the message Terraform has been successfully initialized!
- Edit the terraform.tfvars file and specify the values for all variables.
  Note
  If you have an existing resource group that must contain your CNQ on Azure deployment due to your organization’s policies or specific naming conventions, specify a value for the advanced_az_resource_group_name variable.
  For more information, see README.pdf in azure-terraform-cnq-<x.y>.zip.
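A hypothetical terraform.tfvars fragment for the compute and cache deployment, limited to variables that this guide mentions; all values are placeholders, and the full variable list is documented in README.pdf:

  deployment_name = "my-deployment"        # placeholder
  az_subnet_name  = "cnq-subnet"           # placeholder
  q_cluster_name  = "cnq-cluster-1"        # placeholder; unique for each cluster
  q_node_count    = 4                      # placeholder; 3 to 24 instances
  q_vm_type       = "Standard_D16as_v5"    # placeholder VM size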
- Run the terraform apply command.
- Review the Terraform execution plan and then enter yes.
  Terraform creates resources according to the execution plan and displays:
  - Your deployment’s unique name
  - The IP address for your Provisioner
  - The floating IP addresses for your Qumulo cluster
    Note
    You must specify the floating IP addresses in your terraform.tfvars file explicitly.
  - The primary (static) IP addresses for your Qumulo cluster
  - The Qumulo Core Web UI endpoint
  For example:
  deployment_unique_name = "mynameCompute-ABCDEFG"
  provisioner = {
    "provisioner_ip_address" = "203.0.113.0"
    "qumulo_cluster_floating_ips" = tolist([
      "203.0.113.42",
      "203.0.113.84",
      ...
    ])
  }
  ...
  qumulo_primary_ips = tolist([
    "203.0.113.1",
    "203.0.113.2",
    "203.0.113.3",
    "203.0.113.4"
  ])
  ...
  qumulo_private_url_node1 = "https://203.0.113.10"
Part 2: To Mount the Qumulo File System
- To log in to your cluster’s Web UI, use the endpoint from the Terraform output and the username and password that you configured during deployment.
  Important
  If you change the administrative password for your cluster by using the Qumulo Core Web UI, qq CLI, or REST API after deployment, you must add your new password in Azure App Configuration (on the left navigation panel, click Operations > Configuration Explorer).
  You can use the Qumulo Core Web UI to create and manage NFS exports, SMB shares, snapshots, and continuous replication relationships. You can also join your cluster to Active Directory, configure LDAP, and perform many other operations.
- Mount your Qumulo file system by using NFS or SMB and your cluster’s DNS name or IP address.
Step 3: Performing Post-Deployment Actions
This section describes the common actions you can perform on a CNQ cluster after deploying it.
Adding Nodes to an Existing Cluster
To add nodes to an existing cluster, the total node count must be greater than that of the current deployment.
- Edit terraform.tfvars and change the value of q_node_count to a new value (see the sketch after this list).
- Run the terraform apply command.
- Review the Terraform execution plan and then enter yes.
  Terraform changes resources according to the execution plan and displays an additional primary (static) IP address for the new node. For example:
  qumulo_primary_ips = tolist([
    "203.0.113.1",
    "203.0.113.2",
    "203.0.113.3",
    "203.0.113.4",
    "203.0.113.5"
  ])
- To ensure that the Provisioner shuts down automatically, review the last-run-status parameter in Azure App Configuration (on the left navigation panel, click Operations > Configuration Explorer).
- To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
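A minimal sketch of the terraform.tfvars change, assuming a cluster growing from four nodes to five (the counts are illustrative):

  q_node_count = 5   # previously 4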
Removing Nodes from an Existing Cluster
Removing nodes from an existing cluster is a two-step process in which you remove the nodes from your cluster’s quorum and then tidy up the Azure resources for the removed nodes. A combined terraform.tfvars sketch follows the two steps below.
Step 1: Remove Nodes from the Cluster’s Quorum
You must perform this step while the cluster is running.
- Edit the terraform.tfvars file and set the value of q_target_node_count to a lower number of nodes.
- Run the terraform apply command.
- Review the nodes to be removed and then enter yes.
  Terraform removes the nodes and displays:
  - Your deployment’s unique name
  - The remaining primary (static) IP addresses for your Qumulo cluster
  - The Qumulo Core Web UI endpoint
  For example:
  deployment_unique_name = "mynameCompute-ABCDEFG"
  qumulo_cluster_uuid = "12345678-1234-1234-1234-123456789012"
  qumulo_primary_ips = tolist([
    "203.0.113.1",
    "203.0.113.2",
    "203.0.113.3"
  ])
  qumulo_private_url_node1 = "https://203.0.113.1"
Step 2: Tidy Up Azure Resources for Removed Nodes
- Edit the terraform.tfvars file:
  - Set the value of the q_node_count variable to a lower number of nodes in the cluster.
  - Set the value of q_target_node_count to null.
- Run the terraform apply command.
- Review the resources to be removed and then enter yes.
  Terraform tidies up the resources for removed nodes and displays:
  - Your deployment’s unique name
  - The remaining primary (static) IP addresses for your Qumulo cluster
  - The Qumulo Core Web UI endpoint
  For example:
  deployment_unique_name = "mynameCompute-ABCDEFG"
  qumulo_cluster_uuid = "12345678-1234-1234-1234-123456789012"
  qumulo_primary_ips = [
    "203.0.113.1",
    "203.0.113.2"
  ]
  qumulo_private_url_node1 = "https://203.0.113.1"
- To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
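Taken together, the two steps above amount to the following terraform.tfvars edits, assuming a four-node cluster shrinking to three nodes (the counts are illustrative):

  # Step 1: remove the node from the quorum while the cluster is running
  q_node_count        = 4      # unchanged for now
  q_target_node_count = 3      # lower node count

  # Step 2: tidy up the Azure resources for the removed node
  q_node_count        = 3      # now matches the reduced cluster
  q_target_node_count = null   # reset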
Increasing the Soft Capacity Limit for an Existing Cluster
Increasing the soft capacity limit for an existing cluster is a two-step process in which you configure new persistent storage parameters and then configure new compute and cache deployment parameters.
Step 1: Set New Persistent Storage Parameters
- Edit the terraform.tfvars file in the persistent-storage directory and set the q_cluster_soft_capacity_limit variable to a higher value.
- Run the terraform apply command. Review the Terraform execution plan and then enter yes.
  Terraform creates new storage accounts as necessary and displays:
  - The names of the created storage accounts
  - Your persistent storage resource group’s unique name
  - The new soft capacity limit
  For example:
  Outputs:
  persistent_storage_accounts = [
    "ab5cdefghij1",
    "ab4cdefghij2",
    "ab3cdefghij3",
    "ab2cdefghij4",
  ]
  persistent_storage_resource_group = "mynamePStore-abcde"
  ...
  soft_capacity_limit = "1000 TB"
Step 2: Update Existing Compute and Cache Resource Deployment
- Navigate to the root directory of the azure-terraform-cnq-<x.y> repository.
- Run the terraform apply -var-file config-standard.tfvars command. Review the Terraform execution plan and then enter yes.
  Terraform updates the necessary roles and storage account policies, adds storage accounts to the persistent storage list for the cluster, increases the soft capacity limit, and displays the Apply complete! message.
  When the Provisioner shuts down automatically, this process is complete.
Changing the VM Instance Type of Your CNQ on Azure Cluster
You can change the VM instance type or node count, and you can convert your cluster from single-AZ to multi-AZ or from multi-AZ to single-AZ.
- To minimize potential availability interruptions, you must perform the cluster replacement procedure as a two-quorum event. If, instead, you stop the existing VMs by using the Azure portal and change the VM instance types, two quorum events occur for each node and the read and write cache isn’t optimized for the new VM instance type.
- Performing the cluster replacement procedure also lets you ensure that the required VM instance types are available in advance.
Changing the VM instance type of your CNQ on Azure cluster is a three-step process in which you create a new deployment in a new Terraform workspace and join the new VMs to a quorum, remove the existing VMs, and then clean up your storage account policies.
Step 1: Create a New Deployment in a New Terraform Workspace
- To create a new Terraform workspace, run the terraform workspace new my-new-workspace-name command.
- Edit the terraform.tfvars file:
  - Specify the value for the az_subnet_name variable.
    Note
    For multi-AZ deployments, specify values as a comma-delimited list.
  - Specify the value for the q_vm_type variable.
  - Set the value of the q_replacement_cluster variable to true.
  - Set the value of the q_existing_deployment_unique_name variable to the current deployment’s name.
  - (Optional) To change the number of nodes, specify the value for the q_node_count variable.
  Important
  Leave the other variables unchanged.
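For example, a hypothetical terraform.tfvars fragment for the replacement deployment; the subnet, VM size, and deployment names are placeholders:

  az_subnet_name                    = "cnq-subnet-2"             # placeholder; comma-delimited list for multi-AZ
  q_vm_type                         = "Standard_D16as_v5"        # placeholder VM size
  q_replacement_cluster             = true
  q_existing_deployment_unique_name = "mynameCompute-ABCDEFG"    # your current deployment’s unique name
  q_node_count                      = 4                          # optional; change only to resize the cluster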
- Run the terraform apply command. Review the Terraform execution plan and then enter yes.
  Terraform creates resources according to the execution plan and displays:
  - Your deployment’s unique name
  - New primary (static) IP addresses for your Qumulo cluster
  - The Qumulo Core Web UI endpoint
  For example:
  deployment_unique_name = "mynameCompute-ABCDEFG"
  qumulo_cluster_uuid = "87654321-4321-4321-4321-210987654321"
  qumulo_primary_ips = tolist([
    "203.0.113.4",
    "203.0.113.5",
    "203.0.113.6",
    "203.0.113.7"
  ])
  qumulo_private_url_node1 = "https://203.0.113.4"
- To ensure that the Provisioner shuts down automatically, review the last-run-status parameter in Azure App Configuration (on the left navigation panel, click Operations > Configuration Explorer).
- To check that the cluster is healthy and has the needed number of nodes, log in to the Qumulo Core Web UI.
Step 2: Remove the Previous Deployment
- To select the previous Terraform workspace (for example, default), run the terraform workspace select default command.
- To ensure that the correct workspace is selected, run the terraform workspace show command.
- Run the terraform destroy command. Review the Terraform execution plan and then enter yes.
  Terraform displays the Destroy complete! message with a count of destroyed resources. The previous deployment is deleted.
  The persistent storage deployment remains in its original Terraform workspace. You can perform the next cluster replacement procedure in the default workspace.
Step 3: Clean Up Storage Account Policies
- To list your Terraform workspaces, run the terraform workspace list command.
- To select your new Terraform workspace, run the terraform workspace select <my-new-workspace-name> command.
- Edit the terraform.tfvars file and set the q_replacement_cluster variable to false.
- Run the terraform apply command. This ensures that the storage account policies have least privilege. Review the Terraform execution plan and then enter yes.
  Terraform displays the Apply complete! message with a count of destroyed resources.
Deleting an Existing Cluster
Deleting a cluster is a two-step process in which you delete your cluster’s compute and cache resources and then delete your persistent storage.
- When you no longer need your cluster, you must back up all important data on the cluster before deleting it.
- When you delete your cluster's cache and compute resources, you can no longer access your persistent storage.
Step 1: To Delete Your Cluster’s Compute and Cache Resources
- After you back up your data safely, edit your terraform.tfvars file and set the term_protection variable to false.
- Run the terraform apply command. Review the Terraform execution plan and then enter yes.
  Terraform displays the Apply complete! message with a count of changed resources.
- Run the terraform destroy command. Review the Terraform execution plan and then enter yes.
  Terraform deletes all of your cluster’s compute and cache resources and displays the Destroy complete! message and a count of destroyed resources.
Step 2: To Delete Your Cluster’s Persistent Storage
- Navigate to the persistent-storage directory.
- Edit your terraform.tfvars file and set the prevent_destroy parameter to false.
- Run the terraform apply command. Review the Terraform execution plan and then enter yes.
  Terraform displays the Apply complete! message with a count of changed resources.
- Run the terraform destroy command. Review the Terraform execution plan and then enter yes.
  Terraform deletes all of your cluster’s persistent storage and displays the Destroy complete! message and a count of destroyed resources.