Guaranteeing throughput with QoS

You can use storage quality of service (QoS) to guarantee that performance of critical workloads is not degraded by competing workloads. You can set a throughput ceiling on a competing workload to limit its impact on system resources, or set a throughput floor for a critical workload, ensuring that it meets minimum throughput targets, regardless of demand by competing workloads. You can even set a ceiling and floor for the same workload.

Understanding throughput ceilings (QoS Max)

A throughput ceiling limits throughput for a workload to a maximum number of IOPS or MBps, or IOPS and MBps. In the figure below, the throughput ceiling for workload 2 ensures that it does not bully workloads 1 and 3.

A policy group defines the throughput ceiling for one or more workloads. A workload represents the I/O operations for a storage object: a volume, file, qtree or LUN, or all the volumes, files, qtrees, or LUNs in an SVM. You can specify the ceiling when you create the policy group, or you can wait until after you monitor workloads to specify it.

Note: Throughput to workloads might exceed the specified ceiling by up to 10%, especially if a workload experiences rapid changes in throughput. The ceiling might be exceeded by up to 50% to handle bursts. Bursts occur on single nodes when tokens accumulate up to 150%


Understanding throughput floors (QoS Min)

A throughput floor guarantees that throughput for a workload does not fall below a minimum number of IOPS or MBps, or IOPS and MBps.In the figure below, the throughput floors for workload 1 and workload 3 ensure that they meet minimum throughput targets, regardless of demand by workload 2.

Tip: As the examples suggest, a throughput ceiling throttles throughput directly. A throughput floor throttles throughput indirectly, by giving priority to the workloads for which the floor has been set.

A policy group that defines a throughput floor cannot be applied to an SVM. You can specify the floor when you create the policy group, or you can wait until after you monitor workloads to specify it.

Note: In releases before ONTAP 9.7, throughput floors are guaranteed when there is sufficient performance capacity available. In ONTAP 9.7 and later, throughput floors can be guaranteed even when there is insufficient performance capacity available. This new floor behavior is called floors v2. To meet the guarantees, floors v2 can result in higher latency on workloads without a throughput floor or on work that exceeds the floor settings. Floors v2 applies to both QoS and adaptive QoS. The option of enabling/disabling the new behavior of floors v2 is available in ONTAP 9.7P6 and later.

A workload might fall below the specified floor during critical operations like volume move trigger-cutover. Even when sufficient capacity is available and critical operations are not taking place, throughput to a workload might fall below the specified floor by up to 5%. If floors are overprovisioned and there is no performance capacity, some workloads might fall below the specified floor.



Understanding shared and non-shared QoS policy groups

Starting with ONTAP 9.4, you can use a non-shared QoS policy group to specify that the defined throughput ceiling or floor applies to each member workload individually. Behavior of shared policy groups depends on the policy type:

Understanding adaptive QoS

Ordinarily, the value of the policy group you assign to a storage object is fixed. You need to change the value manually when the size of the storage object changes. An increase in the amount of space used on a volume, for example, usually requires a corresponding increase in the throughput ceiling specified for the volume.

Adaptive QoS automatically scales the policy group value to workload size, maintaining the ratio of IOPS to TBs|GBs as the size of the workload changes. That is a significant advantage when you are managing hundreds or thousands of workloads in a large deployment.

You typically use adaptive QoS to adjust throughput ceilings, but you can also use it to manage throughput floors (when workload size increases). Workload size is expressed as either the allocated space for the storage object or the space used by the storage object.
Note: Used space is available for throughput floors in ONTAP 9.5 and later. It is not supported for throughput floors in ONTAP 9.4 and earlier.

Starting with ONTAP 9.5, you can specify an I/O block size for your application that enables a throughput limit to be expressed in both IOPS and MBps. The MBps limit is calculated from the block size multiplied by the IOPS limit. For example, an I/O block size of 32K for an IOPS limit of 6144IOPS/TB yields an MBps limit of 192MBps.

You can expect the following behavior for both throughput ceilings and floors:

Throughput must increase by at least 10 IOPS before updates take place.

Adaptive QoS policy groups are always non-shared: the defined throughput ceiling or floor applies to each member workload individually.

Starting with ONTAP 9.6, throughput floors is supported on ONTAP Select premium with SSD.

General support

The following table shows the differences in support for throughput ceilings, throughput floors, and adaptive QoS.

Resource or feature

Throughput ceiling Throughput floor Throughput floor v2 Adaptive QoS
ONTAP 9 version All 9.2 and later 9.7 and later 9.3 and later
Platforms All
  • AFF
  • C190 *
  • ONTAP Select premium with SSD *
  • AFF
  • C190
  • ONTAP Select premium with SSD
All
Protocols All All All All
FabricPool Yes Yes, if the tiering policy is set to none and no blocks are in the cloud. Yes, if the tiering policy is set to none and no blocks are in the cloud. Yes
SnapMirror Synchronous Yes No No Yes

*C190 and ONTAP Select support started with the ONTAP 9.6 release.

Supported workloads for throughput ceilings

The following table shows workload support for throughput ceilings by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - ceiling

9.0 9.1 9.2 9.3 9.4 and later 9.8 and later
Volume yes yes yes yes yes yes
File yes yes yes yes yes yes
LUN yes yes yes yes yes yes
SVM yes yes yes yes yes yes
FlexGroup volume no no no yes yes yes
qtrees* no no no no no yes
Multiple workloads per policy group yes yes yes yes yes yes
Non-shared policy groups no no no no yes yes

*Starting from ONTAP 9.8, NFS access is supported in qtrees in FlexVol and FlexGroup volumes with NFS enabled. Starting from ONTAP 9.9.1, SMB access is also supported in qtrees in FlexVol and FlexGroup volumes with SMB enabled.

Supported workloads for throughput floors

The following table shows workload support for throughput floors by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - floor

9.2 9.3 9.4 and later 9.8 and later
Volume yes yes yes yes
File no yes yes yes
LUN yes yes yes yes
SVM no no no yes
FlexGroup volume no no yes yes
qtrees * no no no yes
Multiple workloads per policy group no no yes yes
Non-shared policy groups no no yes yes

*Starting from ONTAP 9.8, NFS access is supported in qtrees in FlexVol and FlexGroup volumes with NFS enabled. Starting from ONTAP 9.9.1, SMB access is also supported in qtrees in FlexVol and FlexGroup volumes with SMB enabled.

Supported workloads for adaptive QoS

The following table shows workload support for adaptive QoS by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - adaptive QoS

9.3 9.4 and later
Volume yes yes
File no yes
LUN no yes
SVM no no
FlexGroup volume no yes
Multiple workloads per policy group yes yes
Non-shared policy groups yes yes

Maximum number of workloads and policy groups

The following table shows the maximum number of workloads and policy groups by ONTAP 9 version.

Workload support

9.3 and earlier 9.4 and later
Maximum workloads per cluster 12,000 40,000
Maximum workloads per node 12,000 40,000
Maximum policy groups 12,000 12,000