English

Maintenance mode

Contributors netapp-mwallis Download PDF of this page

If you need to take a storage node offline for maintenance such as software upgrades or host repairs, you can minimize the I/O impact to the rest of the storage cluster by enabling maintenance mode for that node. You can use maintenance mode with both appliance nodes as well as SolidFire Enterprise SDS nodes.

You can only transition a storage node to maintenance mode if the node is healthy (has no blocking cluster faults) and the storage cluster is tolerant to a single node failure. Once you enable maintenance mode for a healthy and tolerant node, the node is not immediately transitioned; it is monitored until the following conditions are true:

  • All volumes hosted on the node have failed over

  • The node is no longer hosting as the primary for any volume

  • A temporary standby node is assigned for every volume being failed over

After these criteria are met, the node is transitioned to maintenance mode. If these criteria are not met within a 5 minute period, the node will not enter maintenance mode.

When you disable maintenance mode for a storage node, the node is monitored until the following conditions are true:

  • All data is fully replicated to the node

  • All blocking cluster faults are resolved

  • All temporary standby node assignments for the volumes hosted on the node have been inactivated

After these criteria are met, the node is transitioned out of maintenance mode. If these criteria are not met within one hour, the node will fail to transition out of maintenance mode.

You can see the states of maintenance mode operations when working with maintenance mode using the Element API:

  • Disabled: No maintenance has been requested.

  • FailedToRecover: The node failed to recover from maintenance.

  • RecoveringFromMaintenance: The node is in the process of recovering from maintenance.

  • PreparingForMaintenance: Actions are being taken to allow a node to have maintenance performed.

  • ReadyForMaintenance: The node is ready for maintenance to be performed.

Find more information