Guarantee throughput with QoS overview

Contributors

You can use storage quality of service (QoS) to guarantee that performance of critical workloads is not degraded by competing workloads. You can set a throughput ceiling on a competing workload to limit its impact on system resources, or set a throughput floor for a critical workload, ensuring that it meets minimum throughput targets, regardless of demand by competing workloads. You can even set a ceiling and floor for the same workload.

About throughput ceilings (QoS Max)

A throughput ceiling limits throughput for a workload to a maximum number of IOPS or MBps, or IOPS and MBps. In the figure below, the throughput ceiling for workload 2 ensures that it does not "bully" workloads 1 and 3.

A policy group defines the throughput ceiling for one or more workloads. A workload represents the I/O operations for a storage object: a volume, file, qtree or LUN, or all the volumes, files, qtrees, or LUNs in an SVM. You can specify the ceiling when you create the policy group, or you can wait until after you monitor workloads to specify it.

Note

Throughput to workloads might exceed the specified ceiling by up to 10%, especially if a workload experiences rapid changes in throughput. The ceiling might be exceeded by up to 50% to handle bursts. Bursts occur on single nodes when tokens accumulate up to 150%

qos ceiling

About throughput floors (QoS Min)

A throughput floor guarantees that throughput for a workload does not fall below a minimum number of IOPS or MBps, or IOPS and MBps.In the figure below, the throughput floors for workload 1 and workload 3 ensure that they meet minimum throughput targets, regardless of demand by workload 2.

Tip

As the examples suggest, a throughput ceiling throttles throughput directly. A throughput floor throttles throughput indirectly, by giving priority to the workloads for which the floor has been set.

A policy group that defines a throughput floor cannot be applied to an SVM. You can specify the floor when you create the policy group, or you can wait until after you monitor workloads to specify it.

Note

In releases before ONTAP 9.7, throughput floors are guaranteed when there is sufficient performance capacity available. In ONTAP 9.7 and later, throughput floors can be guaranteed even when there is insufficient performance capacity available. This new floor behavior is called floors v2. To meet the guarantees, floors v2 can result in higher latency on workloads without a throughput floor or on work that exceeds the floor settings. Floors v2 applies to both QoS and adaptive QoS. The option of enabling/disabling the new behavior of floors v2 is available in ONTAP 9.7P6 and later.A workload might fall below the specified floor during critical operations like volume move trigger-cutover. Even when sufficient capacity is available and critical operations are not taking place, throughput to a workload might fall below the specified floor by up to 5%. If floors are overprovisioned and there is no performance capacity, some workloads might fall below the specified floor.

qos floor

About shared and non-shared QoS policy groups

Starting with ONTAP 9.4, you can use a non-shared QoS policy group to specify that the defined throughput ceiling or floor applies to each member workload individually. Behavior of shared policy groups depends on the policy type:

  • For throughput ceilings, the total throughput for the workloads assigned to the shared policy group cannot exceed the specified ceiling.

  • For throughput floors, the shared policy group can be applied to a single workload only.

About adaptive QoS

Ordinarily, the value of the policy group you assign to a storage object is fixed. You need to change the value manually when the size of the storage object changes. An increase in the amount of space used on a volume, for example, usually requires a corresponding increase in the throughput ceiling specified for the volume.

Adaptive QoS automatically scales the policy group value to workload size, maintaining the ratio of IOPS to TBs|GBs as the size of the workload changes. That is a significant advantage when you are managing hundreds or thousands of workloads in a large deployment.

You typically use adaptive QoS to adjust throughput ceilings, but you can also use it to manage throughput floors (when workload size increases). Workload size is expressed as either the allocated space for the storage object or the space used by the storage object.

Note

Used space is available for throughput floors in ONTAP 9.5 and later. It is not supported for throughput floors in ONTAP 9.4 and earlier.

  • An allocated space policy maintains the IOPS/TB|GB ratio according to the nominal size of the storage object. If the ratio is 100 IOPS/GB, a 150 GB volume will have a throughput ceiling of 15,000 IOPS for as long as the volume remains that size. If the volume is resized to 300 GB, adaptive QoS adjusts the throughput ceiling to 30,000 IOPS.

  • A used space policy (the default) maintains the IOPS/TB|GB ratio according to the amount of actual data stored before storage efficiencies. If the ratio is 100 IOPS/GB, a 150 GB volume that has 100 GB of data stored would have a throughput ceiling of 10,000 IOPS. As the amount of used space changes, adaptive QoS adjusts the throughput ceiling according to the ratio.

Starting with ONTAP 9.5, you can specify an I/O block size for your application that enables a throughput limit to be expressed in both IOPS and MBps. The MBps limit is calculated from the block size multiplied by the IOPS limit. For example, an I/O block size of 32K for an IOPS limit of 6144IOPS/TB yields an MBps limit of 192MBps.

You can expect the following behavior for both throughput ceilings and floors:

  • When a workload is assigned to an adaptive QoS policy group, the ceiling or floor is updated immediately.

  • When a workload in an adaptive QoS policy group is resized, the ceiling or floor is updated in approximately five minutes.

Throughput must increase by at least 10 IOPS before updates take place.

Adaptive QoS policy groups are always non-shared: the defined throughput ceiling or floor applies to each member workload individually.

Starting with ONTAP 9.6, throughput floors is supported on ONTAP Select premium with SSD.

General support

The following table shows the differences in support for throughput ceilings, throughput floors, and adaptive QoS.

Resource or feature Throughput ceiling Throughput floor Throughput floor v2 Adaptive QoS

ONTAP 9 version

All

9.2 and later

9.7 and later

9.3 and later

Platforms

All

  • AFF

  • C190 *

  • ONTAP Select premium with SSD *

  • AFF

  • C190

  • ONTAP Select premium with SSD

All

Protocols

All

All

All

All

FabricPool

Yes

Yes, if the tiering policy is set to "none" and no blocks are in the cloud.

Yes, if the tiering policy is set to "none" and no blocks are in the cloud.

Yes

SnapMirror Synchronous

Yes

No

No

Yes

*C190 and ONTAP Select support started with the ONTAP 9.6 release.

Supported workloads for throughput ceilings

The following table shows workload support for throughput ceilings by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - ceiling 9.0 9.1 9.2 9.3 9.4 and later 9.8 and later

Volume

yes

yes

yes

yes

yes

yes

File

yes

yes

yes

yes

yes

yes

LUN

yes

yes

yes

yes

yes

yes

SVM

yes

yes

yes

yes

yes

yes

FlexGroup volume

no

no

no

yes

yes

yes

qtrees*

no

no

no

no

no

yes

Multiple workloads per policy group

yes

yes

yes

yes

yes

yes

Non-shared policy groups

no

no

no

no

yes

yes

*Starting from ONTAP 9.8, NFS access is supported in qtrees in FlexVol and FlexGroup volumes with NFS enabled. Starting from ONTAP 9.9.1, SMB access is also supported in qtrees in FlexVol and FlexGroup volumes with SMB enabled.

Supported workloads for throughput floors

The following table shows workload support for throughput floors by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - floor 9.2 9.3 9.4 and later 9.8 and later

Volume

yes

yes

yes

yes

File

no

yes

yes

yes

LUN

yes

yes

yes

yes

SVM

no

no

no

yes

FlexGroup volume

no

no

yes

yes

qtrees *

no

no

no

yes

Multiple workloads per policy group

no

no

yes

yes

Non-shared policy groups

no

no

yes

yes

*Starting from ONTAP 9.8, NFS access is supported in qtrees in FlexVol and FlexGroup volumes with NFS enabled. Starting from ONTAP 9.9.1, SMB access is also supported in qtrees in FlexVol and FlexGroup volumes with SMB enabled.

Supported workloads for adaptive QoS

The following table shows workload support for adaptive QoS by ONTAP 9 version. Root volumes, load-sharing mirrors, and data protection mirrors are not supported.

Workload support - adaptive QoS 9.3 9.4 and later

Volume

yes

yes

File

no

yes

LUN

no

yes

SVM

no

no

FlexGroup volume

no

yes

Multiple workloads per policy group

yes

yes

Non-shared policy groups

yes

yes

Maximum number of workloads and policy groups

The following table shows the maximum number of workloads and policy groups by ONTAP 9 version.

Workload support 9.3 and earlier 9.4 and later

Maximum workloads per cluster

12,000

40,000

Maximum workloads per node

12,000

40,000

Maximum policy groups

12,000

12,000