Halt or reboot a node without initiating takeover in a two-node cluster
You halt or reboot a node in a two-node cluster without initiating takeover when you perform certain hardware maintenance on a node or a shelf and want to limit downtime by keeping the partner node up, or when issues prevent a manual takeover and you want to keep the partner node's aggregates up and serving data. Technical support might also ask you to perform this procedure while helping you troubleshoot a problem.
-
Before you inhibit takeover (using the -inhibit-takeover true parameter), you disable cluster HA.
-
You migrate LIFs (logical interfaces) to the partner node that you want to remain online.
-
If the node you are halting or rebooting has aggregates that you want to keep online, you move them to the node that you want to remain online.
-
Verify both nodes are healthy:
cluster show
For both nodes, true appears in the Health column.
cluster::> cluster show
Node         Health  Eligibility
------------ ------- ------------
node1        true    true
node2        true    true
-
Migrate all LIFs from the node that you will halt or reboot to the partner node:
network interface migrate-all -node node_name
-
If the node you will halt or reboot has aggregates that you want to keep online while the node is down, relocate them to the partner node; otherwise, go to the next step.
-
Show the aggregates on the node you will halt or reboot:
storage aggregate show -node node_name
For example, node1 is the node that will be halted or rebooted:
cluster::> storage aggregate show -node node1
Aggregate       Size     Available  Used%  State   #Vols  Nodes  RAID Status
--------------  -------  ---------  -----  ------  -----  -----  ---------------
aggr0_node_1_0  744.9GB  32.68GB    96%    online  2      node1  raid_dp, normal
aggr1           2.91TB   2.62TB     10%    online  8      node1  raid_dp, normal
aggr2           4.36TB   3.74TB     14%    online  12     node1  raid_dp, normal
test2_aggr      2.18TB   2.18TB     0%     online  7      node1  raid_dp, normal
4 entries were displayed.
-
Move the aggregates to the partner node:
storage aggregate relocation start -node node_name -destination node_name -aggregate-list aggregate_name
For example, aggregates aggr1, aggr2 and test2_aggr are being moved from node1 to node2:
storage aggregate relocation start -node node1 -destination node2 -aggregate-list aggr1,aggr2,test2_aggr
-
-
Disable cluster HA:
cluster ha modify -configured false
The return output confirms HA is disabled:
Notice: HA is disabled
This operation does not disable storage failover.
-
Halt or reboot the target node and inhibit takeover by using the appropriate command:
-
system node halt -node node_name -inhibit-takeover true
-
system node reboot -node node_name -inhibit-takeover true
The command output displays a warning asking whether you want to proceed. Enter y.
-
-
Verify that the node that is still online is in a healthy state (while the partner is down):
cluster show
For the online node, true appears in the Health column.
In the command output, you will see a warning that cluster HA is not configured. You can ignore the warning at this time.
-
Perform the actions that required you to halt or reboot the node.
-
Boot the offlined node from the LOADER prompt:
boot_ontap
-
Verify both nodes are healthy:
cluster show
For both nodes, true appears in the Health column.
In the command output, you will see a warning that cluster HA is not configured. You can ignore the warning at this time.
-
Reenable cluster HA:
cluster ha modify -configured true
-
If you relocated aggregates to the partner node earlier in this procedure, move them back to their home node; otherwise, go to the next step:
storage aggregate relocation start -node node_name -destination node_name -aggregate-list aggregate_name
For example, aggregates aggr1, aggr2 and test2_aggr are being moved from node2 to node1:
storage aggregate relocation start -node node2 -destination node1 -aggregate-list aggr1,aggr2,test2_aggr
-
Revert LIFs to their home ports:
-
View LIFs that are not at home:
network interface show -is-home false
-
If there are non-home LIFs that were not migrated from the down node, verify it is safe to move them before reverting.
-
If it is safe to do so, revert all LIFs home:
network interface revert *
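The order of the commands matters in this procedure: cluster HA is disabled before the node is halted or rebooted, and reenabled only after both nodes are healthy again. As a rough aid for reviewing that order before a maintenance window, the following Python sketch assembles the commands shown in the steps above into one ordered list. It only builds strings and does not connect to a cluster or run anything. The function name maintenance_plan, its parameters, and the (prompt, command) tuple layout are assumptions made for illustration and are not part of ONTAP.

```python
from typing import List, Optional, Tuple

# (prompt where the command is entered, command text); this layout is an assumption
Step = Tuple[str, str]


def maintenance_plan(target: str, partner: str, action: str = "halt",
                     aggregates: Optional[List[str]] = None) -> List[Step]:
    """Build the ordered command list for this procedure (hypothetical helper).

    target     -- node that will be halted or rebooted
    partner    -- node that stays online
    action     -- "halt" or "reboot"
    aggregates -- aggregates to relocate to the partner, if any
    """
    if action not in ("halt", "reboot"):
        raise ValueError("action must be 'halt' or 'reboot'")
    if target == partner:
        raise ValueError("target and partner must be different nodes")

    aggr_list = ",".join(aggregates or [])

    def relocate(src: str, dst: str) -> Step:
        return ("cluster", "storage aggregate relocation start "
                f"-node {src} -destination {dst} -aggregate-list {aggr_list}")

    plan: List[Step] = [
        ("cluster", "cluster show"),                                  # both nodes healthy?
        ("cluster", f"network interface migrate-all -node {target}"),
    ]
    if aggr_list:
        plan.append(relocate(target, partner))                        # keep data online
    plan += [
        ("cluster", "cluster ha modify -configured false"),           # before inhibiting takeover
        ("cluster", f"system node {action} -node {target} -inhibit-takeover true"),
        ("cluster", "cluster show"),                                  # partner still healthy?
        ("LOADER", "boot_ontap"),                                     # after the maintenance work
        ("cluster", "cluster show"),                                  # both nodes healthy again?
        ("cluster", "cluster ha modify -configured true"),
    ]
    if aggr_list:
        plan.append(relocate(partner, target))                        # move aggregates back home
    plan += [
        ("cluster", "network interface show -is-home false"),
        ("cluster", "network interface revert *"),
    ]
    return plan


if __name__ == "__main__":
    # Node and aggregate names are taken from the examples in this procedure.
    for prompt, command in maintenance_plan("node1", "node2", "halt",
                                            ["aggr1", "aggr2", "test2_aggr"]):
        print(f"[{prompt}] {command}")
```

The sketch leaves out everything that needs human judgment: answering the confirmation prompt, checking the Health column, performing the maintenance itself, and deciding whether it is safe to revert non-home LIFs.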