This section explains how creating portals on Qumulo clusters, and establishing relationships between spoke and hub portals, enables Cloud Data Fabric functionality in Qumulo Core.
- Global Namespace is now a core component of Qumulo Cloud Data Fabric.
- For a general conceptual introduction, see What is Hub and Spoke Topology?
- For specific implementation of the Cloud Data Fabric functionality in Qumulo Core, see Example Cloud Data Fabric Scenarios.
Cloud Data Fabric functionality lets Qumulo clusters across disparate geographic or infrastructural locations (on-premises and in the cloud) access the same data. Each cluster maintains an independent namespace structure (for example, by setting only a portion of the cluster’s file system as the portal root directory).
To enable Cloud Data Fabric functionality, you must define a spoke portal on one cluster, a hub portal on another cluster, and then propose a portal relationship between the two.
- Before you begin to implement Cloud Data Fabric in your organization, we strongly recommend reviewing this page, especially the Known Limitations section.
- For any questions, contact the Qumulo Care team.
Key Terms
The following key terms help define the components of Cloud Data Fabric functionality in Qumulo Core.
Clusters and Root Directories
- Cluster: Any Qumulo cluster that shares a portion of its file system for a hub portal or a spoke portal. A directory on a cluster defines the root directory for a spoke portal or a hub portal.
Tip
Because a portion of a Qumulo cluster's file system can hold the hub portal root directory or spoke portal root directory, using the correct terminology can help avoid confusion:
- ❌ hub cluster
- ✅ hub portal host cluster
- ❌ spoke cluster
- ✅ spoke portal host cluster
- Spoke Portal Root Directory, Hub Portal Root Directory: The directory on a cluster that serves as the root of the portion of the file system that a spoke portal or hub portal uses.
Subject to any file system permissions that the hub portal imposes, you can access a spoke portal root directory by using NFSv3, SMB, or the Qumulo REST API.
Portals
- Spoke Portal: An interface point on a Qumulo cluster that accesses a portion of the file system on another cluster (which has a hub portal). A directory on a cluster defines the root directory for a spoke portal. The spoke portal initiates the creation of a hub portal.
  - Read-Write Portal: A spoke portal that can access, modify, and create any files or directories within the hub portal root directory according to the file system permissions.
  - Read-Only Portal: A spoke portal that can access any files or directories within the hub portal root directory according to the file system permissions, but can’t modify or create any files or directories regardless of file system permissions.
- Hub Portal: An interface point on a Qumulo cluster that shares a portion of its file system with another cluster (which has a spoke portal). A directory on a cluster defines the root directory for a hub portal. The spoke portal initiates the creation of a hub portal. You can configure multiple portal relationships that use the same hub portal root directory, nested directories, or independent directories.
Note
- It isn't possible to create a hub portal without a spoke portal. For example, a spoke portal on Cluster A can propose a portal relationship to Cluster B. This action initiates the creation of a hub portal in a Pending state on Cluster B.
- You must authorize the portal relationship before you can use it.
- While a spoke portal can be either read-only or read-write, a hub portal is always read-write.
- Portal Relationship: A proposal that a spoke portal on one Qumulo cluster issues to another Qumulo cluster (with a hub portal), which the Qumulo cluster with the hub portal authorizes.
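The difference between read-write and read-only portals can be sketched as a small decision function. This is an illustrative model only, not part of Qumulo Core or its API: the function name, the operation labels, and the `fs_permits` flag (which stands in for the result of the file system permission check) are all assumptions made for this example.

```python
# Illustrative model only; names and labels are hypothetical, not Qumulo APIs.
PORTAL_TYPES = ("read-write", "read-only")
OPERATIONS = ("read", "modify", "create")


def spoke_operation_allowed(portal_type, operation, fs_permits):
    """Return True if a client of the spoke portal may perform the operation.

    fs_permits is the (assumed) outcome of the file system permission check.
    """
    if portal_type not in PORTAL_TYPES:
        raise ValueError(f"unknown portal type: {portal_type!r}")
    if operation not in OPERATIONS:
        raise ValueError(f"unknown operation: {operation!r}")
    # A read-only portal never modifies or creates, regardless of permissions.
    if portal_type == "read-only" and operation != "read":
        return False
    # Otherwise, the file system permissions decide.
    return bool(fs_permits)
```

In this sketch, a read-only portal rejects a `create` request even when `fs_permits` is true, while a read-write portal defers entirely to the permission check.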
Portal States
Throughout the process of creating a spoke portal and proposing a portal relationship, either portal type can be in one of the following states.
State | Description |
---|---|
Unlinked | Qumulo Core has created the spoke portal, but hasn't established a relationship for it. Use the qq portal_propose_hub command. |
Pending | Qumulo Core has established a relationship between the spoke portal and a hub portal, but the hub portal hasn't given its authorization. Use the qq portal_authorize_hub command. |
Active | The portal relationship is operational for both clusters and the spoke portal root directory is accessible if full connectivity is established. |
Ended | The spoke portal root directory is inaccessible because the relationship between the hub portal and spoke portal was removed. |
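The progression through these states can be pictured as a small state machine. The following is a minimal sketch for illustration, not Qumulo code: the event labels (`propose`, `authorize`, `remove`) are hypothetical names chosen for this example, and only the removal of an Active relationship is modeled. The command names are the ones listed in the table, shown without arguments.

```python
from enum import Enum


class PortalState(Enum):
    UNLINKED = "Unlinked"
    PENDING = "Pending"
    ACTIVE = "Active"
    ENDED = "Ended"


# Transitions keyed by (current state, event). Event names are hypothetical
# labels for this sketch; removal is modeled only from the Active state.
TRANSITIONS = {
    (PortalState.UNLINKED, "propose"): PortalState.PENDING,
    (PortalState.PENDING, "authorize"): PortalState.ACTIVE,
    (PortalState.ACTIVE, "remove"): PortalState.ENDED,
}

# The qq command that the table names as the next step for each state.
NEXT_COMMAND = {
    PortalState.UNLINKED: "qq portal_propose_hub",
    PortalState.PENDING: "qq portal_authorize_hub",
}


def advance(state, event):
    """Return the state that follows the event, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"cannot apply {event!r} to a portal in the {state.value} state"
        ) from None
```

For example, `advance(PortalState.UNLINKED, "propose")` returns `PortalState.PENDING`, while trying to authorize an Unlinked portal raises `ValueError` because no relationship has been proposed yet.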
Working with the Cloud Data Fabric Functionality
When the hub portal authorizes the portal relationship, the contents of the hub portal root directory become available to the spoke portal immediately.
The first time a client accesses a spoke portal root directory, the spoke portal begins to read and cache data from the hub portal. Subsequent access to the same data accesses the cache of the spoke portal host cluster, with performance characteristics equivalent to access to non-portal data on the spoke portal host cluster. Caching takes place on demand, when a client with access to the spoke portal retrieves new portions of the namespace that the hub portal provides. For more information, see Configuring Cache Management for Spoke Portals in Qumulo Core.
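The on-demand caching behavior described above resembles a read-through cache. The following toy sketch illustrates the idea under simplifying assumptions: the class name, the `fetch_from_hub` callable, and the dictionary that stands in for the hub portal are all hypothetical, and the sketch doesn't model permissions, write synchronization, or consistency.

```python
class SpokeCache:
    """Toy read-through cache: fetch from the hub on first access, then serve locally."""

    def __init__(self, fetch_from_hub):
        # fetch_from_hub is a hypothetical callable: path -> bytes
        self._fetch_from_hub = fetch_from_hub
        self._cache = {}
        self.hub_reads = 0  # counts round trips to the hub

    def read(self, path):
        if path not in self._cache:
            # First access: pay the round trip to the hub and cache the result.
            self.hub_reads += 1
            self._cache[path] = self._fetch_from_hub(path)
        # Subsequent access: served from the local cache.
        return self._cache[path]

    def evict(self, path):
        # Cached data can disappear at any time; the hub keeps the real copy.
        self._cache.pop(path, None)


# A dictionary stands in for the hub portal root directory.
hub_data = {"/projects/report.txt": b"quarterly numbers"}
spoke = SpokeCache(hub_data.__getitem__)

spoke.read("/projects/report.txt")  # first access goes to the hub
spoke.read("/projects/report.txt")  # second access is served from the cache
```

After the two reads, `spoke.hub_reads` is 1. Evicting the path and reading it again triggers a second fetch from the hub, which is one way to see why the cache is no substitute for replication or backup.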
For read-write portals, data synchronization is bidirectional, asynchronous, and strictly consistent upon access. For example, when a client creates or modifies files or directories in the spoke portal root directory, the spoke portal synchronizes these changes to the hub portal in the background. Clients that access the hub portal can see these changes immediately.
To ensure that any changes on one portal become available immediately to any client that reads data from the portal’s peers, Qumulo Core uses a proprietary locking synchronization mechanism.
The cache of a spoke portal is inherently ephemeral. You must not use it in place of data replication or backup.
Qumulo Core enforces permissions in the same way for files and directories in the spoke portal root directory and the hub portal root directory.
- Deleting the portal relationship never affects the data on the hub portal.
- For a spoke portal to be accessible, there must be full connectivity between the two clusters in a portal relationship, without which files or directories with outstanding modifications on one portal are inaccessible on other portals.
You can remove the portal relationship from either the spoke or hub portal.
- If you remove the spoke portal, Qumulo Core also deletes its root directory and reclaims the capacity consumed by cached data, and the hub portal enters the Ended state.
- If you remove the hub portal, all data transfer to the spoke portal stops immediately and the spoke portal enters the Ended state.
- When you remove a portal relationship, any files or directories on the hub portal that were inaccessible, due to both connectivity loss and outstanding spoke portal modifications, become accessible.
Example Cloud Data Fabric Scenarios
The following are examples of some of the most common scenarios for workloads that use Cloud Data Fabric functionality.
Edge Clusters
In this scenario, you deploy a single, large central cluster at your organization’s data center and multiple, small edge clusters at your organization’s branch offices or in remote locations.

The Cloud Data Fabric functionality lets you make the data on the central cluster available to the remote clusters without the need to replicate data to each location. The data remains available to the edge clusters even if their capacity is lower than that of the central cluster. While a read-write portal lets the edge clusters create or modify data on the central cluster, a read-only portal lets the edge clusters only read data from the central cluster.
Active Workload with Archive
In this scenario, several clusters serve active workloads but require access to a large data archive after the initial workflow completes.

The Cloud Data Fabric functionality lets you:
- Move your cold (infrequently accessed) data to a central archive cluster and then provide access to this data by using a portal on the original cluster.
  The active workload clusters can reclaim most of the data set capacity that was tiered to the data archive cluster. This makes it possible to access all of the data as before, while using only the capacity on the active workload clusters for the data that your system reads through the portal.
- Serve specific archive capacity and performance needs by scaling the archive cluster independently of any active workflow clusters.
Known Limitations of the Cloud Data Fabric Functionality in Qumulo Core
General
- Currently, it is possible to configure and manage Cloud Data Fabric functionality only by using the qq CLI.
File System
- A Qumulo cluster can be a portal host for any number of hub portals or for a single spoke portal. It isn't possible for a Qumulo cluster to be a host for spoke and hub portals simultaneously.
- While Qumulo Core doesn’t support hard links between the files local to the spoke portal host cluster and files within the spoke portal root directory, it does support hard links entirely outside or inside the spoke portal root directory.
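The portal host rule in this list (any number of hub portals, or a single spoke portal, but never both) can be expressed as a short check. This is a sketch for illustration only: the function name and the `"hub"` and `"spoke"` labels are assumptions made for this example, not Qumulo identifiers.

```python
def can_host_portal(existing_portals, new_portal):
    """Return True if a cluster with existing_portals may also host new_portal.

    existing_portals is a list of "hub" or "spoke" labels (hypothetical
    labels for this sketch) for the portals the cluster already hosts.
    """
    if new_portal == "hub":
        # Any number of hub portals is fine, as long as there is no spoke portal.
        return "spoke" not in existing_portals
    if new_portal == "spoke":
        # A cluster hosts at most one spoke portal, and never alongside hubs.
        return len(existing_portals) == 0
    raise ValueError(f"unknown portal kind: {new_portal!r}")
```

For example, `can_host_portal(["hub", "hub"], "hub")` is true, while `can_host_portal(["hub"], "spoke")` and `can_host_portal(["spoke"], "spoke")` are both false.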
Data Caching
- Although first-time access to data in a portal root directory is subject to round-trip latency between the spoke portal host cluster and the hub portal host cluster, subsequent access to the data is faster. Making changes to data under a portal root directory is also subject to latency when the system recaches these changes upon access.
- The cache of a spoke portal is inherently ephemeral. You must not use it in place of data replication or backup.
Portal Connectivity
- For a spoke portal to be accessible, there must be full connectivity between the two clusters in a portal relationship, without which files or directories with outstanding modifications on one portal are inaccessible on other portals.
- A spoke portal is inaccessible if the hub portal host cluster and the spoke portal host cluster run different versions of Qumulo Core.
Protocols
- It isn't possible to create a spoke portal on a cluster with the NFSv4.1 or S3 protocols enabled or to enable these protocols on an existing spoke portal host cluster.
- Protocol locks don't synchronize between the hub portal host cluster and the spoke portal host cluster. Specifically, NFSv3 or NLM byte-range locks, SMB share-mode locks, SMB byte-range locks, and SMB leases function independently on the two clusters.
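As a planning aid, the version and protocol limitations above can be collected into a simple checklist function. This is a hypothetical sketch and doesn't query a real cluster: the function name, the protocol labels, and the placeholder version strings are assumptions made for this example, and you would supply the actual values for your own clusters.

```python
# Protocols that, per the limitations above, can't be enabled on a spoke
# portal host cluster. The string labels are chosen for this sketch.
BLOCKED_PROTOCOLS = {"NFSv4.1", "S3"}


def spoke_portal_issues(enabled_protocols, spoke_version, hub_version):
    """Return a list of human-readable issues; an empty list means none found."""
    issues = []
    blocked = sorted(BLOCKED_PROTOCOLS & set(enabled_protocols))
    if blocked:
        issues.append(
            "protocols not supported on a spoke portal host cluster: "
            + ", ".join(blocked)
        )
    if spoke_version != hub_version:
        issues.append(
            f"Qumulo Core versions differ (spoke {spoke_version}, hub {hub_version})"
        )
    return issues
```

For example, calling `spoke_portal_issues({"NFSv3", "SMB"}, "X.Y.Z", "X.Y.Z")` with matching placeholder versions returns an empty list, while enabling S3 or using mismatched versions adds an entry for each problem.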