You must monitor the system for diagnostic purposes and to get information about performance trends and statuses of various system operations. You might need to replace nodes or SSDs for maintenance purposes.
More information
Viewing information about system events
You can view information about various events detected in the system. The system refreshes the event messages every 30 seconds. The event log displays key events for the cluster.
Viewing status of running tasks
In the web UI, you can view the progress and completion status of running tasks reported by the ListSyncJobs and ListBulkVolumeJobs API methods. You can access the Running Tasks page from the Reporting tab of the Element UI.
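The same job information can be requested directly from the Element API, which accepts JSON-RPC requests over HTTPS. The following is a minimal sketch of how the two request bodies might be built. It only constructs the JSON and does not contact a cluster. The helper names (build_request, running_task_requests) are hypothetical, and the empty params object is an assumption; check the Element API reference for the parameters each method accepts and for the endpoint URL and authentication your cluster requires.

```python
import json
from itertools import count

# JSON-RPC request IDs; any value unique per request works.
_request_ids = count(1)

# The two API methods whose jobs the Running Tasks page reports.
TASK_METHODS = ("ListSyncJobs", "ListBulkVolumeJobs")


def build_request(method, params=None):
    """Return a JSON-RPC request body (as a dict) for an Element API method.

    Hypothetical helper: sending no parameters is an assumption.
    """
    return {"method": method, "params": params or {}, "id": next(_request_ids)}


def running_task_requests():
    """Return one serialized request body per running-task method."""
    return [json.dumps(build_request(method)) for method in TASK_METHODS]


if __name__ == "__main__":
    for body in running_task_requests():
        print(body)
```

Each string would be sent as the body of an HTTPS POST to the cluster's JSON-RPC endpoint by an HTTP client of your choice.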
Viewing system alerts
You can view alerts for information about cluster faults or errors in the system. Alerts can be informational messages, warnings, or errors and are a good indicator of how well the cluster is running. Most errors resolve themselves automatically.
Viewing node performance activity
You can view performance activity for each node in a graphical format. This information provides real-time statistics for CPU and read/write I/O operations per second (IOPS) for each drive in the node. The utilization graph is updated every five seconds, and the drive statistics graph is updated every ten seconds.
Viewing volume performance
You can view detailed performance information for all volumes in the cluster. You can sort the information by volume ID or by any of the performance columns. You can also filter the information by certain criteria.
Viewing iSCSI sessions
You can view the iSCSI sessions that are connected to the cluster. You can filter the information to include only the desired sessions.
Viewing Fibre Channel sessions
You can view the Fibre Channel (FC) sessions that are connected to the cluster. You can filter information to include only those connections you want displayed in the window.
Troubleshooting drives
You can replace a failed solid-state drive (SSD) with a replacement drive. SSDs for SolidFire storage nodes are hot-swappable. If you suspect an SSD has failed, contact NetApp Support to verify the failure and walk you through the proper resolution procedure. NetApp Support also works with you to get a replacement drive according to your service-level agreement.
Troubleshooting nodes
You can remove nodes from a cluster for maintenance or replacement. You should use the NetApp Element UI or API to remove nodes before taking them offline.
Working with per-node utilities for network troubleshooting
You can use the per-node utilities for troubleshooting network problems if the standard monitoring tools in the NetApp Element software UI do not give you enough information for troubleshooting. Per-node utilities provide specific information that can help you troubleshoot network problems between nodes or with the management node.
Working with the management node
You can use the management node (mNode) to upgrade system services, manage cluster assets and settings, run system tests and utilities, configure Active IQ for system monitoring, and enable NetApp Support access for troubleshooting.
Understanding cluster fullness levels
The cluster running Element software generates cluster faults to warn the storage administrator when the cluster is running out of capacity. There are three levels of cluster fullness, all of which are displayed in the NetApp Element UI: warning, error, and critical.
Enabling FIPS 140-2 on your cluster
You can use the EnableFeature API method to enable the FIPS 140-2 operating mode for HTTPS communications.
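As a rough illustration, an EnableFeature request body could be built as shown below. This sketch only constructs the JSON and does not contact a cluster. The helper name enable_fips_request is hypothetical, and the "feature" parameter with the value "fips" is an assumption not stated on this page; confirm both against the EnableFeature entry in the Element API reference before using them.

```python
import json


def enable_fips_request(request_id=1):
    """Return a JSON-RPC request body for enabling FIPS 140-2 mode.

    Assumption: the parameter is named "feature" and takes the value
    "fips". Verify this in the Element API reference.
    """
    return {
        "method": "EnableFeature",
        "params": {"feature": "fips"},
        "id": request_id,
    }


if __name__ == "__main__":
    # Print the serialized body that would be POSTed to the cluster.
    print(json.dumps(enable_fips_request(), indent=2))
```

Review the API reference for any restrictions on this operating mode, such as whether it can be turned off again, before sending the request to a production cluster.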