Power off and power on a single site in a MetroCluster IP configuration

05/24/2024 Contributors

If you need to perform site maintenance or relocate a single site in a MetroCluster IP configuration, you must know how to power off and power on the site.

If you need to relocate and reconfigure a site (for example, if you need to expand from a four-node to an eight-node cluster), you cannot complete these tasks at the same time. This procedure only covers the steps that are required to perform site maintenance or to relocate a site without changing its configuration.

The following diagram shows a MetroCluster configuration. Cluster_B is powered off for maintenance.

Power off a MetroCluster site

You must power off a site and all of the equipment before site maintenance or relocation can begin.

About this task

All the commands in the following steps are issued from the site that remains powered on.

Steps

Before you begin, check that any non-mirrored aggregates at the site are offline.
Verify the operation of the MetroCluster configuration in ONTAP:
1. Check whether the system is multipathed:
  
  node run -node node-name sysconfig -a
2. Check for any health alerts on both clusters:
  
  system health alert show
3. Confirm the MetroCluster configuration and that the operational mode is normal:
  
  metrocluster show
4. Perform a MetroCluster check:
  metrocluster check run
5. Display the results of the MetroCluster check:
  
  metrocluster check show
6. Check for any health alerts on the switches (if present):
  
  storage switch show
7. Run Config Advisor.
  
  NetApp Downloads: Config Advisor
8. After running Config Advisor, review the tool's output and follow the recommendations in the output to address any issues discovered.
From the site you want to remain up, implement the switchover:

metrocluster switchover
```
cluster_A::*> metrocluster switchover
```
The operation can take several minutes to complete.

Monitor and verify the completion of the switchover:

metrocluster operation show

cluster_A::*> metrocluster operation show
  Operation: Switchover
 Start time: 10/4/2012 19:04:13
State: in-progress
   End time: -
     Errors:

cluster_A::*> metrocluster operation show
  Operation: Switchover
 Start time: 10/4/2012 19:04:13
      State: successful
   End time: 10/4/2012 19:04:22
     Errors: -

If you have a MetroCluster IP configuration running ONTAP 9.6 or later, wait for the disaster site plexes to come online and the healing operations to automatically complete.

In MetroCluster IP configurations running ONTAP 9.5 or earlier, the disaster site nodes do not automatically boot to ONTAP and the plexes remain offline.
Move any volumes and LUNs that belong to unmirrored aggregates offline.
1. Move the volumes offline.
  cluster_A::* volume offline <volume name>
2. Move the LUNs offline.
  cluster_A::* lun offline lun_path <lun_path>

Move unmirrored aggregates offline: storage aggregate offline

cluster_A*::> storage aggregate offline -aggregate <aggregate-name>

Depending on your configuration and ONTAP version, identify and move offline affected plexes that are located at the disaster site (Cluster_B).

You should move the following plexes offline:

Non-mirrored plexes residing on disks located at the disaster site.

If you do not move the non-mirrored plexes at the disaster site offline, an outage might occur when the disaster site is later powered off.
Mirrored plexes residing on disks located at the disaster site for aggregate mirroring. After they are moved offline, the plexes are inaccessible.

Identify the affected plexes.

Plexes that are owned by nodes at the surviving site consist of Pool1 disks. Plexes that are owned by nodes at the disaster site consist of Pool0 disks.

Cluster_A::> storage aggregate plex show -fields aggregate,status,is-online,Plex,pool
aggregate    plex  status        is-online pool
------------ ----- ------------- --------- ----
Node_B_1_aggr0 plex0 normal,active true     0
Node_B_1_aggr0 plex1 normal,active true     1

Node_B_2_aggr0 plex0 normal,active true     0
Node_B_2_aggr0 plex5 normal,active true     1

Node_B_1_aggr1 plex0 normal,active true     0
Node_B_1_aggr1 plex3 normal,active true     1

Node_B_2_aggr1 plex0 normal,active true     0
Node_B_2_aggr1 plex1 normal,active true     1

Node_A_1_aggr0 plex0 normal,active true     0
Node_A_1_aggr0 plex4 normal,active true     1

Node_A_1_aggr1 plex0 normal,active true     0
Node_A_1_aggr1 plex1 normal,active true     1

Node_A_2_aggr0 plex0 normal,active true     0
Node_A_2_aggr0 plex4 normal,active true     1

Node_A_2_aggr1 plex0 normal,active true     0
Node_A_2_aggr1 plex1 normal,active true     1
14 entries were displayed.

Cluster_A::>

The affected plexes are those that are remote to cluster A. The following table shows whether the disks are local or remote relative to cluster A:

Node	Disks in pool	Should the disks be set offline?	Example of plexes to be moved offline
Node _A_1 and Node _A_2	Disks in pool 0	No. Disks are local to cluster A.	-
Disks in pool 1	Yes. Disks are remote to cluster A.	Node_A_1_aggr0/plex4 Node_A_1_aggr1/plex1 Node_A_2_aggr0/plex4 Node_A_2_aggr1/plex1
Node _B_1 and Node _B_2	Disks in pool 0	Yes. Disks are remote to cluster A.	Node_B_1_aggr1/plex0 Node_B_1_aggr0/plex0 Node_B_2_aggr0/plex0 Node_B_2_aggr1/plex0
Disks in pool 1	No. Disks are local to cluster A.	-

Node

Disks in pool

Should the disks be set offline?

Example of plexes to be moved offline

Node _A_1 and Node _A_2

Disks in pool 0

No. Disks are local to cluster A.

Disks in pool 1

Yes. Disks are remote to cluster A.

Node_A_1_aggr0/plex4

Node_A_1_aggr1/plex1

Node_A_2_aggr0/plex4

Node_A_2_aggr1/plex1

Node _B_1 and Node _B_2

Disks in pool 0

Yes. Disks are remote to cluster A.

Node_B_1_aggr1/plex0

Node_B_1_aggr0/plex0

Node_B_2_aggr0/plex0

Node_B_2_aggr1/plex0

Disks in pool 1

No. Disks are local to cluster A.

Move the affected plexes offline:

storage aggregate plex offline
```
storage aggregate plex offline -aggregate Node_B_1_aggr0 -plex plex0
```
Perform this step for all plexes that have disks that are remote to Cluster_A.

Persistently offline the ISL switch ports according to the switch type.
Halt the nodes by running the following command on each node:

node halt -inhibit-takeover true -skip-lif-migration true -node <node-name>
Power off the equipment at the disaster site.

You must power off the following equipment in the order shown:
- Storage controllers - the storage controllers should currently be at the LOADER prompt, you must power them off completely.
- MetroCluster IP switches
- Storage shelves

Relocating the powered-off site of the MetroCluster

After the site is powered off, you can begin maintenance work. The procedure is the same whether the MetroCluster components are relocated within the same data center or relocated to a different data center.

The hardware should be cabled in the same way as the previous site.
If the Inter-Switch Link (ISL) speed, length, or number has changed, they all need to be reconfigured.

Steps

Verify that the cabling for all components is carefully recorded so that it can be correctly reconnected at the new location.
Physically relocate all the hardware, storage controllers, IP switches, FibreBridges, and storage shelves.
Configure the ISL ports and verify the intersite connectivity.
1. Power on the IP switches.
  
  Do not power up any other equipment.
Use tools on the switches (as they are available) to verify the intersite connectivity.

You should only proceed if the links are correctly configured and stable.
Disable the links again if they are found to be stable.

Powering on the MetroCluster configuration and returning to normal operation

After maintenance has been completed or the site has been moved, you must power on the site and reestablish the MetroCluster configuration.

About this task

All the commands in the following steps are issued from the site that you power on.

Steps

Power on the switches.

You should power on the switches first. They might have been powered on during the previous step if the site was relocated.
1. Reconfigure the Inter-Switch Link (ISL) if required or if this was not completed as part of the relocation.
2. Enable the ISL if fencing was completed.
3. Verify the ISL.
Power on the storage controllers and wait until you see the LOADER prompt. The controllers must not be fully booted.

If auto boot is enabled, press Ctrl+C to stop the controllers from automatically booting.
Power on the shelves, allowing enough time for them to power on completely.
Verify that the storage is visible.
1. Verify that the storage is visible from the surviving site. Bring the offline plexes back online to restart the resync operation and reestablish the SyncMirror.
2. Verify that the local storage is visible from the node in Maintenance mode:
  
  disk show -v
Reestablish the MetroCluster configuration.

Follow the instructions in Verifying that your system is ready for a switchback to perform healing and switchback operations according to your MetroCluster configuration.

Power off and power on a single site in a MetroCluster IP configuration

Creating your file...

Power off a MetroCluster site

Relocating the powered-off site of the MetroCluster

Powering on the MetroCluster configuration and returning to normal operation