How to address aggregate fullness and overallocation alerts

ONTAP issues EMS messages when aggregates are running out of space so that you can take corrective action, such as providing more space for the full aggregate. Knowing the types of alerts and how you can address them helps you maintain data availability.

When an aggregate is described as full, it means that the percentage of the space in the aggregate available for use by volumes has fallen below a predefined threshold. When an aggregate becomes overallocated, the space used by ONTAP for metadata and to support basic data access has been exhausted. Sometimes space normally reserved for other purposes can be used to keep the aggregate functioning, but volume guarantees for volumes in the aggregate, or data availability, can be at risk.

Overallocation can be either logical or physical. Logical overallocation means that space reserved to honor future space commitments, such as volume guarantees, has been used for another purpose. Physical overallocation means that the aggregate is running out of physical blocks to use. Aggregates in this state are at risk of refusing writes, going offline, or potentially causing a controller disruption.

The following table describes the aggregate fullness and overallocation alerts, the actions you can take to address the issue, and the risks of not taking action.

Alert type: Nearly full
EMS level: Debug
Configurable?: N
Definition: The amount of space allocated for volumes, including their guarantees, has exceeded the threshold set for this alert (95%). The percentage is the Used total minus the size of the Snapshot reserve.
Ways to address:
  • Adding storage to the aggregate
  • Shrinking or deleting volumes
  • Moving volumes to another aggregate with more space
  • Removing volume guarantees (setting them to none)
Risk if no action taken: No risk to write operations or data availability yet.

Alert type: Full
EMS level: Debug
Configurable?: N
Definition: The file system has exceeded the threshold set for this alert (98%). The percentage is the Used total minus the size of the Snapshot reserve.
Ways to address:
  • Adding storage to the aggregate
  • Shrinking or deleting volumes
  • Moving volumes to another aggregate with more space
  • Removing volume guarantees (setting them to none)
Risk if no action taken: Volume guarantees for volumes in the aggregate might be at risk, as well as write operations to those volumes.

Alert type: Logically overallocated
EMS level: SVC Error
Configurable?: N
Definition: In addition to the space reserved for volumes being full, the space in the aggregate used for metadata has been exhausted.
Ways to address:
  • Adding storage to the aggregate
  • Shrinking or deleting volumes
  • Moving volumes to another aggregate with more space
  • Removing volume guarantees (setting them to none)
Risk if no action taken: Volume guarantees for volumes in the aggregate are at risk, as well as write operations to those volumes.

Alert type: Physically overallocated
EMS level: Node Error
Configurable?: N
Definition: The aggregate is running out of physical blocks it can write to.
Ways to address:
  • Adding storage to the aggregate
  • Shrinking or deleting volumes
  • Moving volumes to another aggregate with more space
Risk if no action taken: Write operations to volumes in the aggregate are at risk, as well as data availability; the aggregate could go offline. In extreme cases, the node could experience a disruption.
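
As an illustration of the two percentage-based alerts described above, the following Python sketch maps a fullness percentage to an alert type using the 95% and 98% thresholds. It is not ONTAP code: the function name classify_fullness and the returned labels are hypothetical, the caller is assumed to supply an already computed percentage, and treating "exceeded" as a strict greater-than comparison is an assumption.

```python
# Thresholds taken from the alert descriptions (not configurable).
NEARLY_FULL_THRESHOLD = 95.0
FULL_THRESHOLD = 98.0


def classify_fullness(percent_used: float) -> str:
    """Return a label for the fullness alert that applies, if any.

    Hypothetical helper: percent_used is assumed to be the aggregate
    fullness percentage, computed elsewhere. "Exceeded" is treated as
    strictly greater than the threshold, which is an assumption.
    """
    if percent_used > FULL_THRESHOLD:
        return "full"
    if percent_used > NEARLY_FULL_THRESHOLD:
        return "nearly full"
    return "ok"


if __name__ == "__main__":
    for value in (90.0, 96.5, 99.0):
        print(value, classify_fullness(value))
```

The higher threshold is checked first so that an aggregate above 98% is reported as full rather than nearly full. The logical and physical overallocation alerts are not modeled here because they depend on metadata and physical block state, not on a percentage.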

Every time a threshold is crossed for an aggregate, whether the fullness percentage is rising or falling, an EMS message is generated. When the fullness level of the aggregate falls below a threshold, an "aggregate ok" EMS message is generated.
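
The threshold-crossing behavior can be sketched in Python as a comparison between a previous and a current fullness reading. This is a minimal illustration under stated assumptions, not how ONTAP implements EMS: the function crossing_events, the threshold labels, and the "alert" and "aggregate ok" strings are hypothetical placeholders, not real EMS event names.

```python
# Percentage thresholds for the two fullness alerts (95% and 98%).
THRESHOLDS = [("nearly full", 95.0), ("full", 98.0)]


def crossing_events(previous: float, current: float) -> list:
    """List the thresholds crossed between two fullness readings.

    Hypothetical helper. Each entry is (threshold_label, message_kind):
    "alert" when fullness rises past the threshold, "aggregate ok" when
    it falls back to or below it. No entry is produced when the reading
    stays on the same side of a threshold.
    """
    events = []
    for label, limit in THRESHOLDS:
        was_over = previous > limit
        is_over = current > limit
        if is_over and not was_over:
            events.append((label, "alert"))
        elif was_over and not is_over:
            events.append((label, "aggregate ok"))
    return events


if __name__ == "__main__":
    print(crossing_events(94.0, 99.0))  # rising past both thresholds
    print(crossing_events(99.0, 96.0))  # falling below the 98% threshold
    print(crossing_events(96.0, 97.0))  # no threshold crossed
```

A single change in fullness can cross more than one threshold, so the sketch returns a list instead of a single value.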