Use maintenance mode on SolidFire eSDS clusters

If you need to take a storage node offline for maintenance, such as software upgrades or host repairs, you can minimize the I/O impact to the rest of the storage cluster by enabling maintenance mode for that node.

Note: Perform the maintenance as soon as maintenance mode is enabled. Do not leave the node in maintenance mode any longer than necessary.
You can transition a storage node to maintenance mode only if the node is healthy (has no blocking cluster faults) and the storage cluster is tolerant to a single node failure. After you enable maintenance mode for a healthy and tolerant node, the node is not immediately transitioned; it is monitored until it meets the conditions for entering maintenance mode. After these criteria are met, the node is transitioned to maintenance mode. If these criteria are not met within a five-minute period, the node does not enter maintenance mode.
When you disable maintenance mode for a storage node, the node is monitored until the following conditions are true:
  • All data is fully replicated to the node.
  • All blocking cluster faults are resolved.
  • All temporary standby node assignments for the volumes hosted on the node have been inactivated.
After these criteria are met, the node is transitioned out of maintenance mode. If these criteria are not met within one hour, the node will fail to transition out of maintenance mode.
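The two monitoring windows described above (five minutes to enter maintenance mode, one hour to leave it) are enforced by the cluster itself. A client script that waits on the transition could mirror them with a simple deadline-based poll. The following Python sketch is only an illustration: wait_for_mode and its get_mode callable are hypothetical, and how the node's current mode is actually read from the cluster is left to the caller.

```python
import time

# Windows taken from the text above; the cluster enforces them, this only mirrors them.
ENTER_WINDOW_S = 5 * 60     # time allowed for a node to enter maintenance mode
EXIT_WINDOW_S = 60 * 60     # time allowed for a node to leave maintenance mode


def wait_for_mode(get_mode, target, failed=(), window_s=ENTER_WINDOW_S,
                  interval_s=10.0, clock=time.monotonic, sleep=time.sleep):
    """Poll get_mode() until it returns target, a failed mode, or window_s elapses.

    get_mode is a caller-supplied callable (hypothetical) that returns the
    node's current maintenance mode as a string. Returns the last mode seen.
    """
    deadline = clock() + window_s
    while True:
        mode = get_mode()
        if mode == target or mode in failed:
            return mode
        if clock() >= deadline:
            return mode  # window elapsed without reaching the target
        sleep(interval_s)
```

For example, a script waiting for a node to leave maintenance mode might call wait_for_mode(get_mode, "Disabled", failed=("FailedToRecover",), window_s=EXIT_WINDOW_S) and treat any return value other than "Disabled" as a failed transition.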

Possible scenarios while using maintenance mode

Enable maintenance mode

You can enable maintenance mode using the EnableMaintenanceMode API method. This method has the following input parameters:
| Name | Description | Type | Default value | Required |
| --- | --- | --- | --- | --- |
| forceWithUnresolvedFaults | Force maintenance mode to be enabled for this node even with blocking cluster faults present. | boolean | False | No |
| nodes | The list of node IDs to put in maintenance mode. Only one node at a time is supported. | integer array | None | Yes |
| perMinutePrimarySwapLimit | The number of primary slices to swap per minute. If not specified, all primary slices will be swapped at once. | integer | None | No |
| timeout | Specifies how long maintenance mode should remain enabled before it is automatically disabled. Formatted as a time string (for example, HH:mm:ss). If not specified, maintenance mode will remain enabled until explicitly disabled. | string | None | No |
This method has the following return values:
| Name | Description | Type |
| --- | --- | --- |
| asyncHandle | Pass this asyncHandle to the GetAsyncResult method to determine when the maintenance mode transition is complete. | integer |
| currentMode | The current maintenance mode state of the node. | MaintenanceMode (string) |
| requestedMode | The requested maintenance mode state of the node. | MaintenanceMode (string) |

Possible values for currentMode and requestedMode:
  • Disabled: No maintenance has been requested.
  • FailedToRecover: The node failed to recover from maintenance mode.
  • RecoveringFromMaintenance: The node is in the process of recovering from maintenance mode.
  • PreparingForMaintenance: Actions are being taken to prepare a node to have maintenance performed.
  • ReadyForMaintenance: The node is ready for maintenance to be performed.
See the Element 12.2 API Reference Guide for more information.
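As a rough illustration of how the EnableMaintenanceMode input parameters fit together, the following Python sketch builds a request body for the method. It assumes the usual JSON-RPC envelope (method, params, id) for Element API calls; the helper name build_enable_request and the example node ID are hypothetical, and sending the request (endpoint, credentials) is deliberately left out.

```python
import json


def build_enable_request(node_id, timeout=None, per_minute_primary_swap_limit=None,
                         force_with_unresolved_faults=False, request_id=1):
    """Build a JSON-RPC body for EnableMaintenanceMode (hypothetical helper)."""
    # "nodes" is an integer array, but only one node at a time is supported.
    params = {"nodes": [node_id]}
    # Optional parameters are sent only when the caller sets them.
    if force_with_unresolved_faults:
        params["forceWithUnresolvedFaults"] = True
    if per_minute_primary_swap_limit is not None:
        params["perMinutePrimarySwapLimit"] = per_minute_primary_swap_limit
    if timeout is not None:
        params["timeout"] = timeout  # time string, for example "01:00:00"
    return {"method": "EnableMaintenanceMode", "params": params, "id": request_id}


if __name__ == "__main__":
    # Node ID 3 and the one-hour timeout are made-up example values.
    print(json.dumps(build_enable_request(3, timeout="01:00:00"), indent=2))
```

Leaving timeout unset matches the documented default, in which maintenance mode stays enabled until it is explicitly disabled.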

Disable maintenance mode

You can disable maintenance mode using the DisableMaintenanceMode API method. This method has the following input parameter:
| Name | Description | Type | Default value | Required |
| --- | --- | --- | --- | --- |
| nodes | List of storage node IDs to take out of maintenance mode. | integer array | None | Yes |
This method has the following return values:
| Name | Description | Type |
| --- | --- | --- |
| asyncHandle | Pass this asyncHandle to the GetAsyncResult method to determine when the maintenance mode transition is complete. | integer |
| currentMode | The current maintenance mode state of the node. | MaintenanceMode (string) |
| requestedMode | The requested maintenance mode state of the node. | MaintenanceMode (string) |

Possible values for currentMode and requestedMode:
  • Disabled: No maintenance has been requested.
  • FailedToRecover: The node failed to recover from maintenance mode.
  • Unexpected: The node was found to be offline, but was in the Disabled mode.
  • RecoveringFromMaintenance: The node is in the process of recovering from maintenance mode.
  • PreparingForMaintenance: Actions are being taken to prepare a node to have maintenance performed.
  • ReadyForMaintenance: The node is ready for maintenance to be performed.
See the Element 12.2 API Reference Guide for more information.
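The following Python sketch shows one way a DisableMaintenanceMode request body could be built and its documented return values read back. It assumes the usual JSON-RPC envelope, with the return values under a "result" key; the helper names build_disable_request and read_result are hypothetical, and the sample response is made up for illustration rather than captured from a cluster.

```python
import json


def build_disable_request(node_ids, request_id=1):
    """Build a JSON-RPC body for DisableMaintenanceMode (hypothetical helper)."""
    return {
        "method": "DisableMaintenanceMode",
        "params": {"nodes": list(node_ids)},  # integer array of storage node IDs
        "id": request_id,
    }


def read_result(response):
    """Pick the documented return values out of a decoded JSON-RPC response."""
    result = response["result"]
    return {
        "asyncHandle": result.get("asyncHandle"),
        "currentMode": result.get("currentMode"),
        "requestedMode": result.get("requestedMode"),
    }


if __name__ == "__main__":
    print(json.dumps(build_disable_request([3])))
    # Made-up response for illustration only.
    sample = {"id": 1, "result": {"asyncHandle": 1,
                                  "currentMode": "RecoveringFromMaintenance",
                                  "requestedMode": "Disabled"}}
    print(read_result(sample))
```

In this made-up sample, currentMode differs from requestedMode, which would suggest that the transition is still in progress and that the asyncHandle should be checked with GetAsyncResult.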