Alert error codes

The system reports error codes with each alert on the Alerts page. Error codes help you determine what component of the system experienced the alert and why the alert was generated.

The following list outlines the different types of system alerts.

availableVirtualNetworkIPAddressesLow
The number of virtual network addresses in the block of IP addresses is low. To resolve this fault, add more IP addresses to the block of virtual network addresses.
blockClusterFull
There is not enough free block storage space to support a single node loss. To resolve this fault, add another storage node to the storage cluster.
blockServiceTooFull
A block service is using too much space. To resolve this fault, add more provisioned capacity.
blockServiceUnhealthy
A block service has been detected as unhealthy. The system is automatically moving affected data to other healthy drives.
clusterCannotSync
There is an out-of-space condition, and data on the offline block storage drives cannot be synced to drives that are still active. To resolve this fault, add more storage.
clusterFull
There is no more free storage space in the storage cluster. To resolve this fault, add more storage.
clusterIOPSAreOverProvisioned
Cluster IOPS are overprovisioned. The sum of all minimum QoS IOPS is greater than the expected IOPS of the cluster, so minimum QoS cannot be maintained for all volumes simultaneously.
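As a rough illustration of the condition described above, the following Python sketch sums per-volume minimum QoS IOPS and compares the total with an expected cluster IOPS figure. The function name, the input shapes, and the sample numbers are assumptions made for illustration; they are not Element API calls or real cluster output.

```python
def min_iops_overprovisioned(volume_min_iops, expected_cluster_iops):
    """Return True when the summed minimum QoS IOPS exceeds the expected cluster IOPS."""
    return sum(volume_min_iops) > expected_cluster_iops


# Hypothetical values: three volumes on a cluster expected to deliver 50,000 IOPS.
print(min_iops_overprovisioned([20000, 20000, 15000], 50000))  # True: 55,000 > 50,000
print(min_iops_overprovisioned([20000, 20000, 10000], 50000))  # False: 50,000 is not greater
```

The comparison is strict because the fault text says the sum is "greater than" the expected IOPS; a sum that exactly equals the expected figure would not qualify under this reading.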
disableDriveSecurityFailed
The cluster is not configured to enable drive security (Encryption at Rest), but at least one drive has drive security enabled, meaning that disabling drive security on those drives failed. This fault is logged with “Warning” severity.
To resolve this fault, check the fault details for the reason why drive security could not be disabled. Possible reasons are:
  • The encryption key could not be acquired. Investigate the problem with access to the key or the external key server.
  • The disable operation failed on the drive. Determine whether the wrong key could possibly have been acquired.
If neither of these is the reason for the fault, the drive might need to be replaced.

You can attempt to recover a drive that does not successfully disable security even when the correct authentication key is provided. To perform this operation, remove the drive or drives from the system by moving them to Available, perform a secure erase on each drive, and move it back to Active.

disconnectedClusterPair
A cluster pair is disconnected or configured incorrectly.
disconnectedRemoteNode
A remote node is either disconnected or configured incorrectly.
disconnectedSnapMirrorEndpoint
A remote SnapMirror endpoint is disconnected or configured incorrectly.
driveAvailable
One or more drives are available in the cluster. In general, all clusters should have all drives added and none in the available state. To resolve this fault, add any available drives to the storage cluster. If this fault appears unexpectedly, contact NetApp Support.
driveFailed
One or more drives have failed. If a drive failed because the authentication key is inaccessible, resolve any key server connectivity issues. For other issues, contact NetApp Support.
driveWearFault
A drive's remaining life has dropped below thresholds, but it is still functioning. To resolve this fault, replace the drive soon.
duplicateClusterMasterCandidates
More than one storage cluster master candidate has been detected. Contact NetApp Support for assistance.
enableDriveSecurityFailed
The cluster is configured to require drive security (Encryption at Rest), but drive security could not be enabled on at least one drive. This fault is logged with “Warning” severity.
To resolve this fault, check the fault details for the reason why drive security could not be enabled. Possible reasons are:
  • The encryption key could not be acquired. Investigate the problem with access to the key or the external key server.
  • The enable operation failed on the drive. Determine whether the wrong key could possibly have been acquired.
If neither of these is the reason for the fault, the drive might need to be replaced.

You can attempt to recover a drive that does not successfully enable security even when the correct authentication key is provided. To perform this operation, remove the drive or drives from the system by moving them to Available, perform a secure erase on each drive, and move it back to Active.

ensembleDegraded
Network connectivity or power has been lost to one or more of the ensemble nodes. To resolve this fault, restore network connectivity or power.
exception
A fault has been reported that is not a routine fault. These faults are not automatically cleared from the fault queue. Contact NetApp Support for assistance.
failedSpaceTooFull
A block service is not responding to data write requests. This causes the slice service to run out of space to store failed writes. To resolve this fault, restore block services functionality to allow writes to continue normally and failed space to be flushed from the slice service.
fanSensor
A fan sensor has failed or is missing. Contact NetApp Support for assistance.
fibreChannelAccessDegraded
A Fibre Channel node has not responded to other nodes in the storage cluster over its storage IP for a period of time. In this state, the node is considered unresponsive and generates a cluster fault.
fibreChannelAccessUnavailable
All Fibre Channel nodes are unresponsive. The node IDs are displayed.
fibreChannelConfig
This cluster fault indicates one of the following conditions:
  • There is an unexpected Fibre Channel port on a PCI slot.
  • There is an unexpected Fibre Channel HBA model.
  • There is a problem with the firmware of a Fibre Channel HBA.
  • A Fibre Channel port is not online.
  • There is a persistent issue configuring Fibre Channel passthrough.
Contact NetApp Support for assistance.
fileSystemCapacityLow
There is insufficient space on one of the filesystems.
To resolve this fault, add more capacity to the filesystem.
FIPS drives mismatched

A non-FIPS drive has been physically inserted into a FIPS-capable storage node, or a FIPS drive has been physically inserted into a non-FIPS storage node. A single fault is generated per node and lists all affected drives.

To resolve this fault, remove or replace the drive or drives in question.

FIPS drives out of compliance

The system has detected that Encryption at Rest was disabled after the FIPS Drives feature was enabled. This fault is also generated when the FIPS Drives feature is enabled and a non-FIPS drive or node is present in the storage cluster.

To resolve this fault, enable Encryption at Rest or remove the non-FIPS hardware from the storage cluster.

fipsSelfTestFailure
The FIPS subsystem has detected a failure during the self test.
Contact NetApp Support for assistance.
hardwareConfigMismatch
This cluster fault indicates one of the following conditions:
  • The configuration does not match the node definition.
  • There is an incorrect drive size for this type of node.
  • An unsupported drive has been detected.
  • There is a drive firmware mismatch.
  • The drive encryption capable state does not match the node.
Contact NetApp Support for assistance.
inconsistentBondModes
The bond modes on the VLAN device are missing. This fault will display the expected bond mode and the bond mode currently in use.
inconsistentInterfaceConfiguration
The interface configuration is inconsistent.
To resolve this fault, ensure the node interfaces in the storage cluster are consistently configured.
inconsistentMtus
This cluster fault indicates one of the following conditions:
  • Bond1G mismatch: Inconsistent MTUs have been detected on Bond1G interfaces.
  • Bond10G mismatch: Inconsistent MTUs have been detected on Bond10G interfaces.
This fault displays the node or nodes in question along with the associated MTU value.
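To illustrate the kind of mismatch this fault reports, the following Python sketch groups nodes by the MTU configured on each bond interface and returns only the interfaces where more than one MTU value is in use. The function name, the input dictionary shape, and the node names are hypothetical; the sketch does not query a cluster.

```python
from collections import defaultdict


def find_mtu_mismatches(node_mtus):
    """Return {interface: {mtu: [nodes]}} for interfaces with more than one MTU.

    node_mtus maps a node name to {"Bond1G": mtu, "Bond10G": mtu}.
    """
    by_interface = defaultdict(lambda: defaultdict(list))
    for node, interfaces in node_mtus.items():
        for interface, mtu in interfaces.items():
            by_interface[interface][mtu].append(node)
    return {
        interface: {mtu: sorted(nodes) for mtu, nodes in groups.items()}
        for interface, groups in by_interface.items()
        if len(groups) > 1  # more than one distinct MTU means a mismatch
    }


# Hypothetical inventory: node-3 has a different Bond10G MTU from its peers.
sample = {
    "node-1": {"Bond1G": 1500, "Bond10G": 9000},
    "node-2": {"Bond1G": 1500, "Bond10G": 9000},
    "node-3": {"Bond1G": 1500, "Bond10G": 1500},
}
print(find_mtu_mismatches(sample))
```

With this sample, only Bond10G is returned, because every node agrees on the Bond1G value.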
inconsistentRoutingRules
The routing rules for this interface are inconsistent.
inconsistentSubnetMasks
The network mask on the VLAN device does not match the internally recorded network mask for the VLAN. This fault displays the expected network mask and the network mask currently in use.
incorrectBondPortCount
The number of bond ports is incorrect.
invalidConfiguredFibreChannelNodeCount
One of the two expected Fibre Channel node connections is degraded. This fault appears when only one Fibre Channel node is connected.
irqBalanceFailed
An exception occurred while attempting to balance interrupts.
Contact NetApp Support for assistance.
kmipCertificateFault
  • Root Certification Authority (CA) certificate is nearing expiration.

    To resolve this fault, acquire a new certificate from the root CA with expiration date at least 30 days out and use ModifyKeyServerKmip to provide the updated root CA certificate.

  • Client certificate is nearing expiration.

    To resolve this fault, create a new CSR using GetClientCertificateSigningRequest, have it signed ensuring the new expiration date is at least 30 days out, and use ModifyKeyServerKmip to replace the expiring KMIP client certificate with the new certificate.

  • Root Certification Authority (CA) certificate has expired.

    To resolve this fault, acquire a new certificate from the root CA with expiration date at least 30 days out and use ModifyKeyServerKmip to provide the updated root CA certificate.

  • Client certificate has expired.

    To resolve this fault, create a new CSR using GetClientCertificateSigningRequest, have it signed ensuring the new expiration date is at least 30 days out, and use ModifyKeyServerKmip to replace the expired KMIP client certificate with the new certificate.

  • Root Certification Authority (CA) certificate error.

    To resolve this fault, check that the correct certificate was provided, and, if needed, reacquire the certificate from the root CA. Use ModifyKeyServerKmip to install the correct KMIP client certificate.

  • Client certificate error.

    To resolve this fault, check that the correct KMIP client certificate is installed. The root CA of the client certificate should be installed on the EKS. Use ModifyKeyServerKmip to install the correct KMIP client certificate.
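
The resolutions above ask for a replacement certificate whose expiration date is at least 30 days out. A minimal Python sketch of that date check follows; the function name is hypothetical, and the expiration timestamp is assumed to have been read from the certificate by some other means.

```python
from datetime import datetime, timedelta, timezone

# Minimum remaining validity, taken from the "at least 30 days out" guidance.
MIN_VALIDITY = timedelta(days=30)


def expires_far_enough_out(not_after, now=None):
    """Return True when the certificate expires at least 30 days after `now`."""
    if now is None:
        now = datetime.now(timezone.utc)
    return not_after - now >= MIN_VALIDITY


# Fixed reference time so the example is reproducible.
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expires_far_enough_out(datetime(2024, 3, 1, tzinfo=timezone.utc), now))   # True
print(expires_far_enough_out(datetime(2024, 1, 15, tzinfo=timezone.utc), now))  # False
```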

kmipServerFault
  • Connection failure

    To resolve this fault, check that the External Key Server is alive and reachable via the network. Use TestKeyServerKmip and TestKeyProviderKmip to test your connection.

  • Authentication failure

    To resolve this fault, check that the correct root CA and KMIP client certificates are being used, and that the private key and the KMIP client certificate match.

  • Server error

    To resolve this fault, check the details for the error. Troubleshooting on the External Key Server might be necessary based on the error returned.
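
For the connection tests mentioned under "Connection failure", the following Python sketch only builds JSON-RPC style request bodies for TestKeyServerKmip and TestKeyProviderKmip; it does not send them. The parameter names keyServerID and keyProviderID, the ID values, and the helper function are assumptions that should be checked against the Element API reference for your software version.

```python
import json


def build_request(method, params, request_id=1):
    """Build a JSON-RPC style request body as a JSON string."""
    return json.dumps({"method": method, "params": params, "id": request_id})


# Assumed parameter names and placeholder IDs; verify both before use.
server_test = build_request("TestKeyServerKmip", {"keyServerID": 1})
provider_test = build_request("TestKeyProviderKmip", {"keyProviderID": 1}, request_id=2)

print(server_test)
print(provider_test)
```

Sending the body to the cluster API endpoint with your usual HTTPS client and credentials is left out on purpose, because endpoint paths and authentication depend on the deployment.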

memoryUsageThreshold
Memory usage is above normal.
Contact NetApp Support for assistance.
metadataClusterFull
There is not enough free metadata space to support a single node loss.
To resolve this fault, add another storage node to the storage cluster.
mtuCheckFailure
A network device is not configured for the proper MTU size.
To resolve this fault, ensure that all network interfaces and switch ports are configured for jumbo frames (MTUs up to 9000 bytes in size).
networkConfig
This cluster fault indicates one of the following conditions:
  • An expected interface is not present.
  • A duplicate interface is present.
  • A configured interface is down.
  • A network restart is required.
Contact NetApp Support for assistance.
networkErrorsExceedThreshold
This cluster fault indicates one of the following conditions:
  • The number of frame errors is above normal.
  • The number of CRC errors is above normal.

To resolve this fault, replace the network cable connected to the interface reporting these errors.

Contact NetApp Support for assistance.
noAvailableVirtualNetworkIPAddresses
There are no available virtual network addresses in the block of IP addresses. No more storage nodes can be added to the cluster.
To resolve this fault, add more IP addresses to the block of virtual network addresses.
nodeOffline
Element software cannot communicate with the specified node.
notUsingLACPBondMode
LACP bonding mode is not configured.
To resolve this fault, use LACP bonding when deploying storage nodes; clients might experience performance issues if LACP is not enabled and properly configured.
ntpServerUnreachable
The storage cluster cannot communicate with the specified NTP server or servers.
To resolve this fault, check the configuration for the NTP server, network, and firewall.
ntpTimeNotInSync
The difference between storage cluster time and the specified NTP server time is too large. The storage cluster cannot correct the difference automatically. To resolve this fault, use NTP servers that are internal to your network, rather than the installation defaults. If you are using internal NTP servers and the issue persists, contact NetApp Support for assistance.
nvramDeviceStatus
An NVRAM device has an error, is failing, or has failed.
Contact NetApp Support for assistance.
powerSupplyError
This cluster fault indicates one of the following conditions:
  • A power supply is not present.
  • A power supply has failed.
  • A power supply input is missing or out of range.

To resolve this fault, verify that redundant power is supplied to all nodes. Contact NetApp Support for assistance.

provisionedSpaceTooFull
The overall provisioned capacity of the cluster is too full.
To resolve this fault, add more provisioned space, or delete and purge volumes.
remoteRepAsyncDelayExceeded
The configured asynchronous delay for replication has been exceeded.
remoteRepClusterFull
The volumes have paused remote replication because the target storage cluster is too full.
remoteRepSnapshotClusterFull
The volumes have paused remote replication of snapshots because the target storage cluster is too full.
To resolve this fault, free up some space on the target storage cluster.
remoteRepSnapshotsExceededLimit
The volumes have paused remote replication of snapshots because the target storage cluster volume has exceeded its snapshot limit.
scheduleActionError
One or more of the scheduled activities ran, but failed.
The fault clears if the scheduled activity runs again and succeeds, if the scheduled activity is deleted, or if the activity is paused and resumed.
sensorReadingFailed
The Baseboard Management Controller (BMC) self-test failed or a sensor could not communicate with the BMC.
Contact NetApp Support for assistance.
serviceNotRunning
A required service is not running.
Contact NetApp Support for assistance.
sliceServiceTooFull
A slice service has too little provisioned capacity assigned to it.
To resolve this fault, add more provisioned capacity.
sliceServiceUnhealthy
The system has detected that a slice service is unhealthy and is automatically decommissioning it.
sshEnabled
The SSH service is enabled on one or more nodes in the storage cluster.
To resolve this fault, disable the SSH service on the appropriate node or nodes or contact NetApp Support for assistance.
sslCertificateExpiration
The SSL certificate associated with this node has expired.
To resolve this fault, renew the SSL certificate. If needed, contact NetApp Support for assistance.
tempSensor
A temperature sensor is reporting higher than normal temperatures. This fault can be triggered in conjunction with powerSupplyError or fanSensor faults.
To resolve this fault, check for airflow obstructions near the storage cluster. If needed, contact NetApp Support for assistance.
upgrade
An upgrade has been in progress for more than 24 hours.
To resolve this fault, resume the upgrade or contact NetApp Support for assistance.
unbalancedMixedNodes
A single node accounts for more than one-third of the storage cluster's capacity.
Contact NetApp Support for assistance.
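The one-third condition above can be sketched in Python as follows. The function name, the capacity units, and the node names are hypothetical, and the sketch assumes the comparison is made on each node's share of total cluster capacity.

```python
def unbalanced_nodes(node_capacity):
    """Return names of nodes whose capacity exceeds one-third of the cluster total."""
    total = sum(node_capacity.values())
    # cap * 3 > total avoids floating-point division when testing cap > total / 3.
    return sorted(name for name, cap in node_capacity.items() if cap * 3 > total)


# Hypothetical capacities in arbitrary units: node-4 holds 40 of 70 units.
sample_capacity = {"node-1": 10, "node-2": 10, "node-3": 10, "node-4": 40}
print(unbalanced_nodes(sample_capacity))  # ['node-4']
```

Three equally sized nodes each hold exactly one-third of the capacity, which is not "more than" one-third, so the sketch reports none of them.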
unresponsiveService
A service has become unresponsive.
Contact NetApp Support for assistance.
virtualNetworkConfig
This cluster fault indicates one of the following conditions:
  • An interface is not present.
  • There is an incorrect namespace on an interface.
  • There is an incorrect netmask.
  • There is an incorrect IP address.
  • An interface is not up and running.
  • There is a superfluous interface on a node.
Contact NetApp Support for assistance.
volumeDegraded
Secondary volumes have not finished replicating and synchronizing. The message is cleared when synchronization is complete.
volumesOffline
One or more volumes in the storage cluster are offline.
Contact NetApp Support for assistance.