Responding to node resources overutilized performance events
Unified Manager generates node resources overutilized warning events when a single node is operating above the bounds of its operational efficiency and is therefore potentially affecting workload latencies. These system-defined events provide the opportunity to correct potential performance issues before many workloads are affected by latency.
What you’ll need
You must have the Operator, Application Administrator, or Storage Administrator role.
There must be new or obsolete performance events.
Unified Manager generates warning events for node resources overutilized policy breaches by looking for nodes that are using more than 100% of their performance capacity for more than 30 minutes.
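The detection rule described above can be illustrated with a minimal Python sketch. It only models the stated condition (more than 100% of performance capacity for more than 30 minutes) and is not how Unified Manager implements the check internally. The sample layout, the 5-minute spacing, and the names `breaches_policy`, `CAPACITY_LIMIT_PERCENT`, and `MIN_BREACH_DURATION` are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Thresholds taken from the description above; everything else is hypothetical.
CAPACITY_LIMIT_PERCENT = 100.0
MIN_BREACH_DURATION = timedelta(minutes=30)


def breaches_policy(samples):
    """Return True if usage stays above the limit for more than 30 minutes.

    samples: list of (timestamp, percent_used) tuples sorted by timestamp.
    """
    breach_start = None
    for timestamp, percent_used in samples:
        if percent_used > CAPACITY_LIMIT_PERCENT:
            # Remember when the current run of over-limit samples began.
            if breach_start is None:
                breach_start = timestamp
            if timestamp - breach_start > MIN_BREACH_DURATION:
                return True
        else:
            # Usage dropped back under the limit, so the run is over.
            breach_start = None
    return False


# Hypothetical samples at an assumed 5-minute spacing.
start = datetime(2024, 1, 1, 12, 0)
steady_overload = [(start + timedelta(minutes=5 * i), 139.0) for i in range(8)]
short_spike = [
    (start + timedelta(minutes=5 * i), 139.0 if i < 3 else 80.0) for i in range(8)
]

print(breaches_policy(steady_overload))
print(breaches_policy(short_spike))
```

In this sketch a node that sits at 139% for 35 minutes is flagged, while a 10-minute spike that falls back to 80% is not.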
You can use System Manager or ONTAP commands to correct this type of performance issue, including the following tasks:
- Creating and applying a QoS policy to any volumes or LUNs that are overusing system resources
- Reducing the QoS maximum throughput limit of a policy group to which workloads have been applied
- Moving a workload to a different aggregate or node
- Increasing capacity by adding disks to the node, or by upgrading to a node with a faster CPU and more RAM
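As a rough illustration of the first three tasks, the following Python sketch assembles ONTAP CLI command strings. The command forms reflect common ONTAP CLI usage and should be checked against the command reference for your ONTAP release before use. The helper function names and the values `vs1`, `vol1`, `pg_limit`, and `aggr2` are hypothetical placeholders, and the sketch only builds strings; it does not connect to a cluster.

```python
def qos_create_command(policy_group, vserver, max_throughput):
    """Build a command that creates a QoS policy group with a throughput ceiling."""
    return (
        f"qos policy-group create -policy-group {policy_group} "
        f"-vserver {vserver} -max-throughput {max_throughput}"
    )


def volume_apply_qos_command(vserver, volume, policy_group):
    """Build a command that assigns a volume to an existing QoS policy group."""
    return (
        f"volume modify -vserver {vserver} -volume {volume} "
        f"-qos-policy-group {policy_group}"
    )


def qos_reduce_limit_command(policy_group, max_throughput):
    """Build a command that lowers the maximum throughput of a policy group."""
    return (
        f"qos policy-group modify -policy-group {policy_group} "
        f"-max-throughput {max_throughput}"
    )


def volume_move_command(vserver, volume, destination_aggregate):
    """Build a command that moves a volume to a different aggregate."""
    return (
        f"volume move start -vserver {vserver} -volume {volume} "
        f"-destination-aggregate {destination_aggregate}"
    )


# Hypothetical names and limits, for illustration only.
commands = [
    qos_create_command("pg_limit", "vs1", "5000iops"),
    volume_apply_qos_command("vs1", "vol1", "pg_limit"),
    qos_reduce_limit_command("pg_limit", "3000iops"),
    volume_move_command("vs1", "vol1", "aggr2"),
]
for command in commands:
    print(command)
```

Adding disks or upgrading the node is a hardware change, so it is not represented here.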
Display the Event details page to view information about the event.
Review the Description, which describes the threshold breach that caused the event.
For example, the message “Perf. Capacity Used value of 139% on simplicity-02 has triggered a WARNING event to identify potential performance problems in the data processing unit.” indicates that performance capacity on node simplicity-02 is overused and affecting node performance.
Under the System Diagnosis section, review the three charts: one for performance capacity used on the node, one for average storage IOPS being used by the top workloads, and one for latency on the top workloads. Arranged this way, the charts show which workloads are causing the latency on the node.
You can view which workloads have QoS policies applied, and which do not, by moving your cursor over the IOPS chart.
Under the Suggested Actions section, review the suggestions and determine which actions you should perform to avoid an increase in latency for the workload.
If required, click the Help button to view more details about the suggested actions you can perform to try to resolve the performance event.
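The review steps above can be loosely sketched in Python for cases where the same information is available as text and numbers, for example in an exported report. The regular expression is modelled only on the example Description message quoted earlier, so other message wordings may not match. The `Workload` record, its fields, the sample values, and the function names are hypothetical and do not correspond to a Unified Manager API.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Pattern modelled on the example Description text quoted in the steps above.
DESCRIPTION_PATTERN = re.compile(
    r"Perf\. Capacity Used value of (?P<percent>\d+(?:\.\d+)?)% "
    r"on (?P<node>\S+) has triggered a (?P<severity>\w+) event"
)


@dataclass
class Workload:
    """Hypothetical record of one workload shown in the System Diagnosis charts."""
    name: str
    avg_iops: float
    latency_ms: float
    qos_policy: Optional[str] = None  # None means no QoS policy is applied


def parse_description(description):
    """Extract node, percent used, and severity, or return None if no match."""
    match = DESCRIPTION_PATTERN.search(description)
    if match is None:
        return None
    return {
        "node": match.group("node"),
        "percent_used": float(match.group("percent")),
        "severity": match.group("severity"),
    }


def qos_candidates(workloads, top_n=3):
    """Among the top_n workloads by IOPS, return those with no QoS policy."""
    ranked = sorted(workloads, key=lambda w: w.avg_iops, reverse=True)
    return [w for w in ranked[:top_n] if w.qos_policy is None]


message = (
    "Perf. Capacity Used value of 139% on simplicity-02 has triggered a WARNING "
    "event to identify potential performance problems in the data processing unit."
)
event = parse_description(message)

# Hypothetical workload figures, for illustration only.
workloads = [
    Workload("vol_db", 12000.0, 9.5),
    Workload("vol_logs", 3000.0, 4.1, "pg_logs"),
    Workload("vol_home", 800.0, 2.0),
    Workload("vol_vdi", 9500.0, 8.2, "pg_vdi"),
]
candidates = qos_candidates(workloads)

print(event)
print([w.name for w in candidates])
```

Ranking by IOPS first and then filtering for missing QoS policies mirrors the order of the steps: identify the busiest workloads, then check which of them have no limit applied.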