This section lists the names, types, labels, and descriptions for the metrics that Qumulo Core 5.3.0 (and higher) emits in OpenMetrics API format.
The Qumulo OpenMetrics API has a single endpoint that provides a complete view of point-in-time telemetry from Qumulo Core to monitoring systems. These systems, such as Prometheus, can consume the OpenMetrics data format that the Qumulo REST API emits without custom code or a monitoring agent. For more information about data formats, see your monitoring system’s documentation.
Accessing Qumulo Metrics
Qumulo metrics are available at the following endpoint.
https://<my-cluster-hostname>:8000/v2/metrics/endpoints/default/data
You can configure a monitoring system that supports the OpenMetrics Specification to use bearer token authentication to access this endpoint.
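For ad hoc checks outside a monitoring system, the same endpoint can be queried directly. The following is a minimal sketch that uses only the Python standard library. The function names, the example hostname, and the token value are illustrative assumptions, and the sketch assumes that you already have a valid bearer token and that the cluster's TLS certificate is trusted by the client.

```python
import urllib.request


def build_metrics_request(hostname: str, bearer_token: str) -> urllib.request.Request:
    """Build a GET request for the metrics endpoint with bearer token authentication."""
    url = f"https://{hostname}:8000/v2/metrics/endpoints/default/data"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {bearer_token}"},
        method="GET",
    )


def fetch_metrics(hostname: str, bearer_token: str, timeout: float = 10.0) -> str:
    """Return the raw OpenMetrics text from the endpoint (requires network access)."""
    request = build_metrics_request(hostname, bearer_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical hostname and token, for illustration only.
    request = build_metrics_request("my-cluster-hostname", "example-token")
    print(request.full_url)
```

Separating request construction from the network call makes it possible to inspect the URL and headers without contacting a cluster.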
Metric Types
All Qumulo metrics belong to one of the following OpenMetrics types: counter, gauge, histogram, or info.
For more information, see Metric Types in the OpenMetrics Specification.
Metric Labels
The OpenMetrics format supports metric labels that communicate additional information. To provide context for metrics, Qumulo Core emits metric-specific labels, for example, the name of a protocol operation or the url of a remote server. For more information, see Available Labels.
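In the OpenMetrics text format, labels appear in braces between the metric name and the sample value. The following is a minimal sketch of reading one sample line into its name, labels, and value. The function name and the sample lines are illustrative assumptions rather than captured cluster output, and the sketch doesn't unescape label values or handle timestamps, comments, or exemplars.

```python
import re

# One sample line: a metric name, an optional {label="value",...} set, and a value.
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
# One label pair inside the braces; escaped characters are matched but not unescaped.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str):
    """Split a sample line into (metric name, label dictionary, float value)."""
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


if __name__ == "__main__":
    # Hypothetical sample lines, for illustration only.
    print(parse_sample('qumulo_time_is_not_synchronizing{node_id="1"} 0'))
    print(parse_sample("qumulo_fs_free_bytes 1024"))
```

For production use, an existing OpenMetrics or Prometheus client library is likely a better choice than a hand-written parser.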
Available Metrics
The following table lists metric names, types, labels, and descriptions.
For Qumulo as a Service, all metrics with a node_id label are unavailable because they refer to specific hardware.

Metric Name | Metric Type | Labels | Supported from Qumulo Core Version | Description |
---|---|---|---|---|
qumulo | info | | 5.3.0 | Qumulo Core information, including the cluster name, cluster UUID, and the current Qumulo Core version |
qumulo_node | info | | 6.0.2 | Information about the nodes in the cluster, including the node ID and the node model |
qumulo_ad_netlogon_request | counter | | 5.3.0 | The total number of Active Directory (AD) NETLOGON requests that resulted in an error |
qumulo_ad_netlogon_request | histogram | | 5.3.0 | The total latency for AD NETLOGON requests |
qumulo_ad_netlogon_requests | counter | | 5.3.0 | The total number of completed AD NETLOGON operations |
qumulo_cpu_crit_temperature_celsius | gauge | | 7.2.0.2 | The critical temperature threshold for each physical CPU |
qumulo_cpu_max_temperature | gauge | | 5.3.1 | The maximum temperature threshold for each physical CPU |
qumulo_cpu_temperature | gauge | | 5.3.0 | The temperature for each physical CPU, in degrees Celsius |
qumulo_disk_endurance | gauge | | 5.3.1 | The remaining disk endurance value for each disk in the cluster, ranging from 100 (no disk wear) to 0 (disk is fully worn) |
qumulo_disk_transport | counter | | 5.3.2 | The total number of communication errors between the specified drive and its host |
qumulo_disk_uncorrectable | counter | | 5.3.2 | The total number of uncorrectable errors on the specified drive's physical media |
qumulo_disk_is_unhealthy | gauge | | 5.3.0 | The health of each disk in the cluster, 0 (the disk is healthy) or 1 (the disk is unhealthy) |
qumulo_disk_operation | histogram | | 5.3.0 | The total latency for disk I/O operations |
qumulo_fan_speed_rpm | gauge | | 5.3.0 | The fan speed, in RPM |
qumulo_fs_capacity_bytes | gauge | — | 5.3.0 | The total cluster space, in bytes |
qumulo_fs_directory | gauge | | 5.3.0 | The number of file system objects on the cluster, sorted by object type |
qumulo_fs_directory | gauge | | 5.3.0 | The amount of space that object types use, in bytes |
qumulo_fs_free_bytes | gauge | — | 5.3.0 | The free space on the cluster, in bytes |
qumulo_fs_snapshots | gauge | — | 5.3.0 | The number of snapshots on the cluster |
qumulo_ldap_lookup | counter | | 5.3.0 | The total number of LDAP requests that resulted in an error |
qumulo_ldap_lookup | histogram | | 5.3.0 | The total latency of LDAP requests |
qumulo_ldap_lookup | counter | | 5.3.0 | The total number of completed LDAP requests |
qumulo_ldap_operation | counter | domain_url | 5.3.0 | The total number of LDAP operations that resulted in an error |
qumulo_ldap_operation | histogram | domain_url | 5.3.0 | The total latency for LDAP operations |
qumulo_ldap_operations | counter | domain_url | 5.3.0 | The total number of completed LDAP operations |
qumulo_memory_correctable | counter | node_id | 5.3.0 | The total number of memory errors that Qumulo Core corrected automatically |
qumulo_network_interface | gauge | | 5.3.0 | The interface status, 0 (interface is up) or 1 (interface is down) |
qumulo_network_interface | gauge | | 5.3.0 | The negotiated link speed for the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of receive errors on the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of bytes received on the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of packets received on the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of transmission errors on the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of bytes transmitted on the specified interface |
qumulo_network_interface | counter | | 5.3.0 | The total number of packets transmitted on the specified interface |
qumulo_power_supply | gauge | | 5.3.0 | The PSU health, 0 (healthy) or 1 (unplugged, removed, or unresponsive) |
qumulo_protocol_client | counter | protocol | 5.3.0 | The total number of clients that have connected to the specified protocol |
qumulo_protocol_client | counter | protocol | 5.3.0 | The total number of clients that have disconnected from the specified protocol |
qumulo_protocol_operation | counter | | 5.3.0 | The total number of bytes that protocol operations have transferred |
qumulo_protocol_operation | histogram | | 5.3.0 | The total latency for protocol operations |
qumulo_protocol_operations | counter | | 5.3.0 | The total number of completed protocol operations |
qumulo_quorum_node_is | gauge | node_id | 5.3.0 | The online status for each node in the cluster, 0 (node online) or 1 (node offline) |
qumulo_time_is_not_synchronizing | gauge | node_id | 5.3.0 | The time synchronization status for each node in the cluster, 0 (time is synchronized) or 1 (time isn't synchronized) |
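
As a rough illustration of how the values in this table might be interpreted after they are collected, the following sketch derives the used fraction of the cluster from qumulo_fs_capacity_bytes and qumulo_fs_free_bytes and lists the nodes for which qumulo_time_is_not_synchronizing reports 1. The function names and the input shapes are assumptions made for this example, and the sketch assumes that used space equals capacity minus free space.

```python
from typing import Dict, Iterable, List, Tuple


def used_capacity_fraction(capacity_bytes: float, free_bytes: float) -> float:
    """Return the used fraction of the cluster, assuming used = capacity - free."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return (capacity_bytes - free_bytes) / capacity_bytes


def unsynchronized_nodes(samples: Iterable[Tuple[Dict[str, str], float]]) -> List[str]:
    """Return the node_id of each sample whose value is 1 (time isn't synchronized).

    Each sample is a (labels, value) pair for qumulo_time_is_not_synchronizing.
    """
    return sorted(labels["node_id"] for labels, value in samples if value == 1)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(used_capacity_fraction(1000.0, 250.0))
    print(unsynchronized_nodes([({"node_id": "1"}, 0.0), ({"node_id": "2"}, 1.0)]))
```

In a monitoring system such as Prometheus, the same checks are usually expressed as queries or alert rules rather than as client-side code.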
Available Labels
The following table lists metric label names, possible values, and descriptions.