Learn about the Overview dashboard in Workload Factory for EDA

06/18/2026

The Overview dashboard gives IT administrators one place to monitor EDA workloads across multiple FSx for ONTAP file systems. Use it to quickly check system health and usage, choose where to place new volumes or jobs, find volumes or SVMs that may need to be moved, and see when it’s time to increase capacity or throughput.

Overview

The Overview dashboard collects CloudWatch metrics for all FSx for ONTAP file systems associated with your configured AWS credentials.

It includes:

Cluster health status: A summary at the top highlights latency events, SSD usage, capacity recommendations, and ONTAP EMS events across your file systems.
Clusters table: A searchable table that shows usage and performance for each cluster. You can filter and sort the data, move between pages, and export it as a CSV file.

It helps you:

Place new volumes and rebalance workloads
Plan capacity or throughput scaling
Monitor cluster health at scale
Make informed decisions about volume placement
Identify clusters approaching capacity limits

Dashboard components

Cluster health status

The cluster health status gives a quick view of activity across the file systems you selected. It appears only if at least one FSx for ONTAP link is connected to your file systems.

The health status includes the following areas:

Latency: Shows the number of latency events found in the selected file systems. This information appears only if latency monitoring is enabled.
SSD capacity management: Displays the number of file systems with SSD usage above 80% and the number of file systems with active capacity recommendations. This helps you quickly identify file systems that might require capacity attention.
ONTAP events: Displays the number of EMS events detected, categorized by Capacity, Availability & protection, and Security & other.
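
As a rough illustration of how these summary counts relate to per-file-system data, the sketch below aggregates a few sample records. The `FileSystem` record shape, its field names, and the sample values are assumptions made for this example; they are not the dashboard's actual data model or API.

```python
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical record shape, used only for this illustration.
@dataclass
class FileSystem:
    fs_id: str
    ssd_usage_pct: float
    has_recommendation: bool = False
    latency_events: int = 0
    # One category name per EMS event, for example "Capacity".
    ems_categories: list = field(default_factory=list)

SSD_ALERT_PCT = 80.0  # the summary counts file systems with SSD usage above 80%

def health_summary(file_systems):
    """Aggregate per-file-system values into the counts shown in the summary."""
    ems = Counter()
    for fs in file_systems:
        ems.update(fs.ems_categories)
    return {
        "latency_events": sum(fs.latency_events for fs in file_systems),
        "ssd_above_80": sum(fs.ssd_usage_pct > SSD_ALERT_PCT for fs in file_systems),
        "with_recommendations": sum(fs.has_recommendation for fs in file_systems),
        "ems_events": dict(ems),
    }

fleet = [
    FileSystem("fs-a", 85.0, True, 2, ["Capacity", "Capacity"]),
    FileSystem("fs-b", 80.0, False, 0, ["Security & other"]),
    FileSystem("fs-c", 91.5, False, 1, []),
]
summary = health_summary(fleet)
```

In this sample, `fs-b` sits at exactly 80% and is not counted, because the summary counts usage above 80%.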

Clusters table

The Clusters table provides a detailed view of each FSx for ONTAP file system, filtered by your active region and AWS account selections. Data is collected from CloudWatch metrics.

Use the table to:

Identify file systems approaching capacity limits (SSD usage column)
Compare throughput demand to provisioned throughput SKU (Throughput usage P99 column)
Track performance metrics across multiple clusters
Check link configuration status (Associated link column); connection validity is verified daily
Select multiple clusters for bulk parameter updates
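
The same kind of triage can be sketched offline, for example against data exported from the table. The example below flags file systems that are near an SSD or throughput limit, sorts them, and writes them back out as CSV. The column names, the sample rows, and the 90% throughput ratio are assumptions chosen for illustration; they are not the dashboard's export schema or built-in limits.

```python
import csv
import io

# Hypothetical column names and sample rows, for illustration only.
FIELDS = ["cluster", "ssd_usage_pct", "throughput_p99_mbps", "provisioned_mbps"]
rows = [
    {"cluster": "fs-a", "ssd_usage_pct": 72.0, "throughput_p99_mbps": 480.0, "provisioned_mbps": 512},
    {"cluster": "fs-b", "ssd_usage_pct": 88.0, "throughput_p99_mbps": 120.0, "provisioned_mbps": 256},
    {"cluster": "fs-c", "ssd_usage_pct": 41.0, "throughput_p99_mbps": 100.0, "provisioned_mbps": 256},
]

def throughput_ratio(row):
    """P99 throughput demand as a fraction of provisioned throughput."""
    return row["throughput_p99_mbps"] / row["provisioned_mbps"]

def near_limits(rows, ssd_pct=80.0, ratio_limit=0.9):
    """Keep rows above either limit, highest SSD usage first."""
    hits = [
        r for r in rows
        if r["ssd_usage_pct"] > ssd_pct or throughput_ratio(r) > ratio_limit
    ]
    return sorted(hits, key=lambda r: r["ssd_usage_pct"], reverse=True)

def to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

flagged = near_limits(rows)
report = to_csv(flagged)
```

Here `fs-b` is flagged for SSD usage and `fs-a` for throughput demand close to its provisioned level, while `fs-c` is left out.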

SSD capacity management

Management modes

Automate: Workload Factory automatically adds SSD capacity when usage reaches the limits you set, so no manual work is needed. This mode suits environments that prefer hands-off capacity management.
Recommend: Workload Factory analyzes your SSD usage and suggests when to increase capacity. You review each suggestion and decide whether to apply it. This keeps you in full control of capacity changes while still benefiting from automated analysis.
None: The system does not make capacity recommendations or take automated actions. Use this option if you want to manage capacity yourself.

Capacity recommendations

When Workload Factory is in Automate or Recommend mode, it automatically checks each FSx for ONTAP file system and runs a capacity recommendation algorithm. The system scans every 24 hours to see if SSD capacity should be adjusted.

When a recommendation is identified:

You receive an immediate notification based on your Workload Factory notification settings.
You can identify file systems with recommendations by filtering the Clusters table using the Last SSD increase timestamp or Last SSD increase description columns.
The interface shows the total number of file systems that currently have active recommendations.

The recommendation explains the suggested change and the reasoning behind it, such as: “We recommend increasing the SSD size based on your file system SSD usage pattern.”
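
The difference between the modes can be pictured as a small dispatch that runs once per scheduled check. The sketch below is a simplification: it compares SSD usage with a single threshold, whereas the actual capacity recommendation algorithm is not described in this document. The `Mode` enum, the `scheduled_check` function, and its return values are hypothetical names introduced for the example.

```python
from enum import Enum

class Mode(Enum):
    AUTOMATE = "Automate"
    RECOMMEND = "Recommend"
    NONE = "None"

def scheduled_check(mode, ssd_usage_pct, threshold_pct):
    """Return what one check would do for a single file system.

    Assumption: a plain threshold comparison stands in for the real algorithm.
    """
    if mode is Mode.NONE or ssd_usage_pct < threshold_pct:
        return None            # nothing to report or do
    if mode is Mode.AUTOMATE:
        return "increase"      # capacity is increased without manual work
    return "recommend"         # a suggestion is raised for review

outcome = scheduled_check(Mode.RECOMMEND, ssd_usage_pct=84.0, threshold_pct=80)
```

In None mode the function returns nothing regardless of usage, which mirrors the behavior described for that mode.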

SSD management parameters

Parameters control how the capacity management system analyzes and acts on your SSD usage:

Threshold (10-90%): The SSD usage percentage that triggers capacity recommendations or automation actions. For example, a threshold of 80% means recommendations or actions occur when SSD usage reaches 80%. This setting is available in both Recommend and Automate modes.
Lookback (1-200 hours): The time period used to analyze historical SSD usage patterns. A longer lookback period provides more historical context for capacity decisions. Available in Automate mode only.
Ahead (1-200 hours): The time period used to project future capacity needs. A longer ahead period plans further into the future for capacity growth. Available in Automate mode only.

You can set these parameters for each file system individually, or use bulk editing to apply the same settings to multiple file systems.
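
To make the roles of the three parameters concrete, the sketch below validates them against the documented ranges and then uses them in a simple trend projection. The linear least-squares fit is an assumption chosen for illustration; the algorithm Workload Factory actually uses is not described here. `SsdParams`, `projected_usage`, and the sample data are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SsdParams:
    threshold_pct: int   # documented range: 10-90
    lookback_hours: int  # documented range: 1-200
    ahead_hours: int     # documented range: 1-200

    def __post_init__(self):
        if not 10 <= self.threshold_pct <= 90:
            raise ValueError("threshold_pct must be between 10 and 90")
        if not 1 <= self.lookback_hours <= 200:
            raise ValueError("lookback_hours must be between 1 and 200")
        if not 1 <= self.ahead_hours <= 200:
            raise ValueError("ahead_hours must be between 1 and 200")

def projected_usage(hourly_usage_pct, params):
    """Fit a line to the last lookback_hours samples and extend it ahead_hours.

    Assumption: a straight-line trend, used here only as an illustration.
    """
    window = hourly_usage_pct[-params.lookback_hours:]
    n = len(window)
    if n < 2:
        return window[-1] if window else 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(window) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(window)) / sum(
        (x - mean_x) ** 2 for x in range(n)
    )
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + params.ahead_hours)

# 48 hourly samples growing by 0.5 percentage points per hour, ending at 83.5%.
usage = [60.0 + 0.5 * hour for hour in range(48)]
params = SsdParams(threshold_pct=85, lookback_hours=24, ahead_hours=12)
projection = projected_usage(usage, params)
needs_attention = projection >= params.threshold_pct

# Bulk editing, sketched as one shared settings object for several file systems.
bulk_settings = {fs_id: params for fs_id in ["fs-a", "fs-b", "fs-c"]}
```

In this sample, current usage (83.5%) is still below the 85% threshold, but the trend over the lookback window crosses it within the ahead window. A longer lookback smooths short spikes, and a longer ahead period flags growth earlier.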

Understanding capacity decision points

The SSD usage graph marks when the system made capacity recommendations or took automated actions. These markers show how the capacity management algorithm behaved over time.

Recommendation decision points: Appear when the system detects that more SSD capacity is needed. If capacity has not been increased, these points may appear as often as every 30 minutes. The graph shows each decision point when possible. If the selected time range is too dense, the points are grouped together.
Automation decision points: Appear when the automation system tries to increase SSD capacity. They show whether the action succeeded or failed.
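
The grouping of dense decision points can be pictured as time bucketing. The sketch below collapses timestamps into fixed-width buckets and counts the points in each one. The bucket width and the `group_points` helper are assumptions for illustration; the document does not specify how the graph groups points.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)

def group_points(timestamps, bucket):
    """Collapse timestamps into (bucket_start, count) pairs, in time order."""
    counts = {}
    for ts in timestamps:
        index = (ts - EPOCH) // bucket      # whole buckets since the epoch
        start = EPOCH + index * bucket
        counts[start] = counts.get(start, 0) + 1
    return sorted(counts.items())

# Twelve recommendation points, one every 30 minutes.
first = datetime(2026, 1, 1)
points = [first + timedelta(minutes=30 * i) for i in range(12)]
grouped = group_points(points, bucket=timedelta(hours=2))
```

With a two-hour bucket, the twelve points collapse into three markers of four points each.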

Use the SSD usage history graph to:

See how often you need to change storage size
Decide whether automation or recommendation mode fits your usage patterns better
Spot repeated storage shortages
Plan future storage needs based on growth over time
Investigate why automation attempts failed