Halt or reboot ONTAP nodes without initiating takeover in two-node clusters

Contributors: netapp-lisa, netapp-aaron-holt, netapp-ahibbard, netapp-aherbin, netapp-barbe

You can halt or reboot a node in a two-node cluster without initiating takeover when you perform certain hardware maintenance on a node or a shelf and want to limit downtime by keeping the partner node up, or when issues prevent a manual takeover and you want to keep the partner node’s aggregates up and serving data. Additionally, if technical support is assisting you with troubleshooting, they might ask you to perform this procedure as part of those efforts.

About this task
  • Before you inhibit takeover (using the -inhibit-takeover true parameter), you must disable cluster HA.

    Caution
    In a two-node cluster, cluster HA ensures that the failure of one node does not disable the cluster. However, if you do not disable cluster HA before using the -inhibit-takeover true parameter, both nodes stop serving data. If you attempt to halt or reboot a node before disabling cluster HA, ONTAP issues a warning and instructs you to disable cluster HA.

  • You must migrate LIFs (logical interfaces) to the partner node that you want to remain online.

  • If there are aggregates on the node you are halting or rebooting that you want to keep online, relocate them to the partner node before halting or rebooting.

Steps
  1. Verify both nodes are healthy:

    cluster show

    For both nodes, true appears in the Health column.

    cluster::> cluster show
    Node         Health  Eligibility
    ------------ ------- ------------
    node1        true     true
    node2        true     true

    Learn more about cluster show in the ONTAP command reference.

  2. Migrate all LIFs from the node that you will halt or reboot to the partner node:

    network interface migrate-all -node <node_name>

    Learn more about network interface migrate-all in the ONTAP command reference.
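
    For example, if node1 is the node that you will halt or reboot (the node name here is illustrative, matching the examples elsewhere in this procedure):

    cluster::> network interface migrate-all -node node1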

  3. If there are aggregates on the node that you will halt or reboot that you want to keep online when the node is down, relocate them to the partner node.

    1. Show the aggregates on the node you will halt or reboot:

      storage aggregate show -node <node_name>

      For example, node1 is the node that will be halted or rebooted:

      cluster::> storage aggregate show -node node1
      Aggregate  Size  Available  Used%  State  #Vols   Nodes   RAID  Status
      ---------  ----  ---------  -----  -----  -----   -----   ----  ------
      aggr0_node_1_0
                 744.9GB   32.68GB   96% online       2 node1    raid_dp,
                                                                      normal
      aggr1       2.91TB    2.62TB   10% online       8 node1    raid_dp,
                                                                      normal
      aggr2
                  4.36TB    3.74TB   14% online      12 node1    raid_dp,
                                                                      normal
      test2_aggr  2.18TB    2.18TB    0% online       7 node1    raid_dp,
                                                                      normal
      4 entries were displayed.
    2. Move the aggregates to the partner node:

      storage aggregate relocation start -node <source_node_name> -destination <destination_node_name> -aggregate-list <aggregate_name>

      For example, aggregates aggr1, aggr2, and test2_aggr are being moved from node1 to node2:

      storage aggregate relocation start -node node1 -destination node2 -aggregate-list aggr1,aggr2,test2_aggr

  4. Disable cluster HA:

    cluster ha modify -configured false

    When cluster HA is disabled, epsilon is automatically assigned to node 1. The command output confirms that HA is disabled: Notice: HA is disabled.

    Note This operation does not disable storage failover.
  5. If the node to be halted or rebooted is node 1, move epsilon to node 2:

    1. Set the privilege level to advanced:

      set -privilege advanced
    2. Verify that node 2 is healthy and eligible for epsilon:

      cluster show
    3. Remove epsilon from node 1:

      cluster modify -node <node1_name> -epsilon false
    4. Assign epsilon to node 2:

      cluster modify -node <node2_name> -epsilon true
    5. Verify that epsilon is on node 2:

      cluster show
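
    For example, if node1 is the node to be halted or rebooted, the substeps above might look like this (node names follow the earlier examples in this procedure):

      set -privilege advanced
      cluster show
      cluster modify -node node1 -epsilon false
      cluster modify -node node2 -epsilon true
      cluster show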
  6. Halt or reboot and inhibit takeover of the target node:

    Halt the node without initiating takeover:

    system node halt -node <node_name> -inhibit-takeover true

    Reboot the node without initiating takeover:

    system node reboot -node <node_name> -inhibit-takeover true
    Note The command output includes a warning asking whether you want to proceed; enter y to continue.
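
    For example, to halt node1 without initiating takeover (continuing the earlier examples; substitute your own node name):

    cluster::> system node halt -node node1 -inhibit-takeover true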
  7. Verify that the node that is still online is in a healthy state (while the partner is down):

    cluster show

    For the online node, true appears in the Health column.

    Note In the command output, you will see a warning that cluster HA is not configured. You can ignore the warning.
  8. Perform the actions that required you to halt or reboot the node.

  9. Boot the offlined node from the LOADER prompt:

    boot_ontap
  10. Verify both nodes are healthy:

    cluster show

    For both nodes, true appears in the Health column.

    Note In the command output, you will see a warning that cluster HA is not configured. You can ignore the warning at this time.
  11. Reenable cluster HA:

    cluster ha modify -configured true
  12. If earlier in this procedure you relocated aggregates to the partner node, move them back to their home node:

    storage aggregate relocation start -node <source_node_name> -destination <destination_node_name> -aggregate-list <aggregate_name>

    For example, aggregates aggr1, aggr2, and test2_aggr are being moved from node2 back to node1:

    storage aggregate relocation start -node node2 -destination node1 -aggregate-list aggr1,aggr2,test2_aggr

  13. Revert LIFs to their home ports:

    1. View LIFs that are not at home:

      network interface show -is-home false

      Learn more about network interface show in the ONTAP command reference.

    2. If there are non-home LIFs other than the ones migrated from the down node, verify that it is safe to move them before reverting.

    3. If it is safe to do so, revert all LIFs to their home ports:

      network interface revert *

      Learn more about network interface revert in the ONTAP command reference.
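
    For example, the check-and-revert sequence might look like this, assuming all of the displayed non-home LIFs are safe to move:

      network interface show -is-home false
      network interface revert *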