Preparing for switchback in a MetroCluster FC configuration

You must perform certain tasks in order to prepare the MetroCluster FC configuration for the switchback operation.

Verifying port configuration (MetroCluster FC configurations only)

You must set the environmental variables on the node and then power it off to prepare it for MetroCluster configuration.

This procedure is performed with the replacement controller modules in Maintenance mode.

Checking the port configuration is needed only on systems in which FC or CNA ports are used in initiator mode.

  1. In Maintenance mode, enter the following command to restore the FC port configuration: ucadmin modify -m fc -t initiator adapter_name

    If you want to use only one port of a port pair in the initiator configuration, enter a precise adapter_name.
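
    For example, a minimal sketch assuming a hypothetical adapter named 0e:

    *> ucadmin modify -m fc -t initiator 0e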

  2. Take one of the following actions, depending on your configuration:

    If the FC port configuration is…​

    Then…​

    The same for both ports

    Answer y when prompted by the system since modifying one port in a port pair modifies the other port as well.

    Different

    1. Answer n when prompted by the system.

    2. Enter the following command to restore the FC port configuration: ucadmin modify -m fc -t initiator|target adapter_name

  3. Exit Maintenance mode by entering the following command: halt

    After you issue the command, wait until the system stops at the LOADER prompt.

  4. Boot the node back into Maintenance mode for the configuration changes to take effect: boot_ontap maint

  5. Verify the values of the variables by entering the following command: ucadmin show
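
    The output is similar to the following (the adapter names and values are illustrative):

    *> ucadmin show
             Current  Current    Pending  Pending    Admin
    Adapter  Mode     Type       Mode     Type       Status
    -------  -------  ---------  -------  ---------  -------
    0e       fc       initiator  -        -          online
    0f       fc       initiator  -        -          online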

  6. Exit Maintenance mode and display the LOADER prompt: halt

Configuring the FC-to-SAS bridges (MetroCluster FC configurations only)

If you replaced the FC-to-SAS bridges, you must configure them when restoring the MetroCluster configuration. The procedure is identical to the initial configuration of an FC-to-SAS bridge.

  1. Power on the FC-to-SAS bridges.

  2. Set the IP address on the Ethernet ports by using the set IPAddress port ipaddress command.

    port can be either MP1 or MP2.

    ipaddress can be an IP address in the format xxx.xxx.xxx.xxx.

    In the following example, the IP address is 10.10.10.55 on Ethernet port 1:

    Ready.
    set IPAddress MP1 10.10.10.55
    
    Ready. *
  3. Set the IP subnet mask on the Ethernet ports by using the set IPSubnetMask port mask command.

    port can be MP1 or MP2.

    mask can be a subnet mask in the format xxx.xxx.xxx.xxx.

    In the following example, the IP subnet mask is 255.255.255.0 on Ethernet port 1:

    Ready.
    set IPSubnetMask MP1 255.255.255.0
    
    Ready. *
  4. Set the speed on the Ethernet ports by using the set EthernetSpeed port speed command.

    port can be MP1 or MP2.

    speed can be 100, 1000, or auto.

    In the following example, the Ethernet speed is set to 1000 on Ethernet port 1.

    Ready.
    set EthernetSpeed MP1 1000
    
    Ready. *
  5. Save the configuration by using the saveConfiguration command, and restart the bridge when prompted to do so.

    Saving the configuration after configuring the Ethernet ports enables you to proceed with the bridge configuration using Telnet and enables you to access the bridge using FTP to perform firmware updates.

    The following example shows the saveConfiguration command and the prompt to restart the bridge.

    Ready.
    SaveConfiguration
      Restart is necessary....
      Do you wish to restart (y/n) ?
    Confirm with 'y'. The bridge will save and restart with the new settings.
  6. After the FC-to-SAS bridge reboots, log in again.

  7. Set the speed on the FC ports by using the set FCDataRate port speed command.

    port can be 1 or 2.

    speed can be 2 Gb, 4 Gb, 8 Gb, or 16 Gb, depending on your bridge model.

    In the following example, the port FC1 speed is set to 8 Gb.

    Ready.
    set fcdatarate 1 8Gb
    
    Ready. *
  8. Set the topology on the FC ports by using the set FCConnMode port mode command.

    port can be 1 or 2.

    mode can be ptp, loop, ptp-loop, or auto.

    In the following example, the port FC1 topology is set to ptp.

    Ready.
    set FCConnMode 1 ptp
    
    Ready. *
  9. Save the configuration by using the saveConfiguration command, and restart the bridge when prompted to do so.

    The following example shows the saveConfiguration command and the prompt to restart the bridge.

     Ready.
     SaveConfiguration
        Restart is necessary....
        Do you wish to restart (y/n) ?
     Confirm with 'y'. The bridge will save and restart with the new settings.
  10. After the FC-to-SAS bridge reboots, log in again.

  11. If the FC-to-SAS bridge is running firmware 1.60 or later, enable SNMP.

    Ready.
    set snmp enabled
    
    Ready. *
    saveconfiguration
    
    Restart is necessary....
    Do you wish to restart (y/n) ?
    
    Verify with 'y' to restart the FibreBridge.
  12. Power off the FC-to-SAS bridges.

Configuring the FC switches (MetroCluster FC configurations only)

If you have replaced the FC switches in the disaster site, you must configure them using the vendor-specific procedures. You must configure one switch, verify that storage access on the surviving site is not impacted, and then configure the second switch.

Configuring a Brocade FC switch after site disaster

You must use this Brocade-specific procedure to configure the replacement switch and enable the ISL ports.

The examples in this procedure are based on the following assumptions:

  • Site A is the disaster site.

  • FC_switch_A_1 has been replaced.

  • FC_switch_A_2 has been replaced.

  • Site B is the surviving site.

  • FC_switch_B_1 is healthy.

  • FC_switch_B_2 is healthy.

You must verify that you are using the specified port assignments when you cable the FC switches.

The examples show two FC-to-SAS bridges. If you have more bridges, you must disable and subsequently enable the additional ports.

  1. Boot and pre-configure the new switch:

    1. Power up the new switch and let it boot up.

    2. Check the firmware version on the switch to confirm it matches the version of the other FC switches: firmwareShow
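
      The output is similar to the following (the firmware version shown is illustrative):

      FC_switch_A_1:admin> firmwareshow
      Appl     Primary/Secondary Versions
      ------------------------------------------
      FOS      v7.4.1c
               v7.4.1c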

    3. Configure the new switch as described in the MetroCluster Installation and Configuration Guide, skipping the steps for configuring zoning on the switch.

    4. Disable the switch persistently: switchcfgpersistentdisable

      The switch will remain disabled after a reboot or fastboot. If this command is not available, you should use the switchdisable command.

      The following example shows the command on BrocadeSwitchA:

      BrocadeSwitchA:admin> switchcfgpersistentdisable

      The following example shows the command on BrocadeSwitchB:

      BrocadeSwitchB:admin> switchcfgpersistentdisable
  2. Complete configuration of the new switch:

    1. Enable the ISLs on the surviving site: portcfgpersistentenable port-number

      FC_switch_B_1:admin> portcfgpersistentenable 10
      FC_switch_B_1:admin> portcfgpersistentenable 11
    2. Enable the ISLs on the replacement switches: portcfgpersistentenable port-number

      FC_switch_A_1:admin> portcfgpersistentenable 10
      FC_switch_A_1:admin> portcfgpersistentenable 11
    3. On the replacement switch (FC_switch_A_1 in this example), verify that the ISLs are online: switchshow

      FC_switch_A_1:admin> switchshow
      switchName: FC_switch_A_1
      switchType: 71.2
      switchState:Online
      switchMode: Native
      switchRole: Principal
      switchDomain:       4
      switchId:   fffc03
      switchWwn:  10:00:00:05:33:8c:2e:9a
      zoning:             OFF
      switchBeacon:       OFF
      
      Index Port Address Media Speed State  Proto
      ==============================================
      ...
      10   10    030A00 id   16G     Online  FC E-Port 10:00:00:05:33:86:89:cb "FC_switch_A_1"
      11   11    030B00 id   16G     Online  FC E-Port 10:00:00:05:33:86:89:cb "FC_switch_A_1" (downstream)
      ...
  3. Persistently enable the switch: switchcfgpersistentenable
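
    For example, on the replacement switch:

    FC_switch_A_1:admin> switchcfgpersistentenable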

  4. Verify that the ports are online: switchshow

Configuring a Cisco FC switch after site disaster

You must use the Cisco-specific procedure to configure the replacement switch and enable the ISL ports.

The examples in this procedure are based on the following assumptions:

  • Site A is the disaster site.

  • FC_switch_A_1 has been replaced.

  • FC_switch_A_2 has been replaced.

  • Site B is the surviving site.

  • FC_switch_B_1 is healthy.

  • FC_switch_B_2 is healthy.

    1. Configure the switch:

      1. Refer to the Fabric-attached MetroCluster Installation and Configuration Guide.

      2. Follow the steps for configuring the switch in the Configuring the Cisco FC switches section, except for the “Configuring zoning on a Cisco FC switch” section:

        Zoning is configured later in this procedure.

    2. On the healthy switch (in this example, FC_switch_B_1), enable the ISL ports.

      The following example shows the commands to enable the ports:

      FC_switch_B_1# conf t
      FC_switch_B_1(config)# int fc1/14-15
      FC_switch_B_1(config)# no shut
      FC_switch_B_1(config)# end
      FC_switch_B_1# copy running-config startup-config
      FC_switch_B_1#
    3. Verify that the ISL ports are up by using the show interface brief command.
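
      The output is similar to the following, with the ISL ports showing a trunking or up state (the values are illustrative):

      FC_switch_B_1# show interface brief
      ...
      fc1/14     1      E      on      trunking         swl    TE     16    --
      fc1/15     1      E      on      trunking         swl    TE     16    --
      ...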

    4. Retrieve the zoning information from the fabric.

      The following example shows the commands to distribute the zoning configuration:

      FC_switch_B_1(config-zone)# zoneset distribute full vsan 10
      FC_switch_B_1(config-zone)# zoneset distribute full vsan 20
      FC_switch_B_1(config-zone)# end

      The zoning configuration of FC_switch_B_1 is distributed to all other switches in the fabric for vsan 10 and vsan 20, and the zoning information is retrieved from FC_switch_A_1.

    5. On the healthy switch, verify that the zoning information is properly retrieved from the partner switch: show zone

      FC_switch_B_1# show zone
      zone name FC-VI_Zone_1_10 vsan 10
        interface fc1/1 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/2 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/1 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/2 swwn 20:00:54:7f:ee:b8:24:c0
      
      zone name STOR_Zone_1_20_25A vsan 20
        interface fc1/5 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/8 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/9 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/10 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/11 swwn 20:00:54:7f:ee:b8:24:c0
      
      zone name STOR_Zone_1_20_25B vsan 20
        interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50
        interface fc1/5 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/8 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/9 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/10 swwn 20:00:54:7f:ee:b8:24:c0
        interface fc1/11 swwn 20:00:54:7f:ee:b8:24:c0
      FC_switch_B_1#
    6. Determine the worldwide names (WWNs) of the switches in the switch fabric.

      In this example, the two switch WWNs are as follows:

      • FC_switch_A_1: 20:00:54:7f:ee:b8:24:c0

      • FC_switch_B_1: 20:00:54:7f:ee:c6:80:78

      FC_switch_B_1# show wwn switch
      Switch WWN is 20:00:54:7f:ee:c6:80:78
      FC_switch_B_1#
      
      FC_switch_A_1# show wwn switch
      Switch WWN is 20:00:54:7f:ee:b8:24:c0
      FC_switch_A_1#
    7. Enter configuration mode for the zone and remove zone members that do not belong to the switch WWNs of the two switches: no member interface interface-id swwn wwn

      In this example, the following members are not associated with the WWN of either of the switches in the fabric and must be removed:

      • Zone name FC-VI_Zone_1_10 vsan 10

        • Interface fc1/1 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/2 swwn 20:00:54:7f:ee:e3:86:50

          Note: AFF A700 and FAS9000 systems support four FC-VI ports. You must remove all four ports from the FC-VI zone.

      • Zone name STOR_Zone_1_20_25A vsan 20

        • Interface fc1/5 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50

      • Zone name STOR_Zone_1_20_25B vsan 20

        • Interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50

        • Interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50

      The following example shows the removal of these interfaces:

     FC_switch_B_1# conf t
     FC_switch_B_1(config)# zone name FC-VI_Zone_1_10 vsan 10
     FC_switch_B_1(config-zone)# no member interface fc1/1 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/2 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# zone name STOR_Zone_1_20_25A vsan 20
     FC_switch_B_1(config-zone)# no member interface fc1/5 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# zone name STOR_Zone_1_20_25B vsan 20
     FC_switch_B_1(config-zone)# no member interface fc1/8 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/9 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/10 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# no member interface fc1/11 swwn 20:00:54:7f:ee:e3:86:50
     FC_switch_B_1(config-zone)# save running-config startup-config
     FC_switch_B_1(config-zone)# zoneset distribute full vsan 10
     FC_switch_B_1(config-zone)# zoneset distribute full vsan 20
     FC_switch_B_1(config-zone)# end
     FC_switch_B_1# copy running-config startup-config
    1. Add the ports of the new switch to the zones.

      The following example assumes that the cabling on the replacement switch is the same as on the old switch:

       FC_switch_B_1# conf t
       FC_switch_B_1(config)# zone name FC-VI_Zone_1_10 vsan 10
       FC_switch_B_1(config-zone)# member interface fc1/1 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/2 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# zone name STOR_Zone_1_20_25A vsan 20
       FC_switch_B_1(config-zone)# member interface fc1/5 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/8 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/9 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/10 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/11 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# zone name STOR_Zone_1_20_25B vsan 20
       FC_switch_B_1(config-zone)# member interface fc1/8 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/9 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/10 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# member interface fc1/11 swwn 20:00:54:7f:ee:c6:80:78
       FC_switch_B_1(config-zone)# save running-config startup-config
       FC_switch_B_1(config-zone)# zoneset distribute full vsan 10
       FC_switch_B_1(config-zone)# zoneset distribute full vsan 20
       FC_switch_B_1(config-zone)# end
       FC_switch_B_1# copy running-config startup-config
    2. Verify that the zoning is properly configured: show zone

      The following example output shows the three zones:

       FC_switch_B_1# show zone
         zone name FC-VI_Zone_1_10 vsan 10
           interface fc1/1 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/2 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/1 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/2 swwn 20:00:54:7f:ee:b8:24:c0
      
         zone name STOR_Zone_1_20_25A vsan 20
           interface fc1/5 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/8 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/9 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/10 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/11 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/8 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/9 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/10 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/11 swwn 20:00:54:7f:ee:b8:24:c0
      
         zone name STOR_Zone_1_20_25B vsan 20
           interface fc1/8 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/9 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/10 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/11 swwn 20:00:54:7f:ee:c6:80:78
           interface fc1/5 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/8 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/9 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/10 swwn 20:00:54:7f:ee:b8:24:c0
           interface fc1/11 swwn 20:00:54:7f:ee:b8:24:c0
       FC_switch_B_1#

Verifying the storage configuration

You must confirm that all storage is visible from the surviving nodes.

  1. Confirm that all storage components at the disaster site are the same in quantity and type at the surviving site.

    The surviving site and disaster site should have the same number of disk shelf stacks, disk shelves, and disks. In a bridge-attached or fabric-attached MetroCluster configuration, the sites should have the same number of FC-to-SAS bridges.

  2. Confirm that all disks that have been replaced at the disaster site are unowned: run local disk show -n

    Disks should appear as being unowned.
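
    The output is similar to the following (illustrative disk names; serial numbers omitted):

    cluster_B::> run local disk show -n
      DISK             OWNER           POOL   SERIAL NUMBER   HOME
    ------------       -------------   -----  -------------   -------------
    sw_A_1:6.126L4     Not Owned       NONE   serial-number
    sw_A_1:6.126L5     Not Owned       NONE   serial-number
    ...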

  3. If no disks were replaced, confirm that all disks are present: disk show

Powering on the equipment at the disaster site

You must power on the MetroCluster components at the disaster site when you are ready to prepare for switchback. You must also recable the SAS storage connections in direct-attached MetroCluster configurations and enable the non-Inter-Switch Link (ISL) ports in fabric-attached MetroCluster configurations.

You must have already replaced the MetroCluster components and cabled them exactly as the old ones were cabled.

The examples in this procedure assume the following:

  • Site A is the disaster site.

  • FC_switch_A_1 has been replaced.

  • FC_switch_A_2 has been replaced.

  • Site B is the surviving site.

  • FC_switch_B_1 is healthy.

  • FC_switch_B_2 is healthy.

The FC switches are present only in fabric-attached MetroCluster configurations.

  1. In a stretch MetroCluster configuration using SAS cabling (and no FC switch fabric or FC-to-SAS bridges), connect all the storage including the remote storage across both sites.

    The controller at the disaster site must remain powered off or at the LOADER prompt.

  2. On the surviving site, disable disk autoassignment: storage disk option modify -autoassign off *

    cluster_B::> storage disk option modify -autoassign off *
    2 entries were modified.
  3. On the surviving site, confirm that disk autoassignment is off: storage disk option show

     cluster_B::> storage disk option show
     Node     BKg. FW. Upd.  Auto Copy   Auto Assign  Auto Assign Policy
    --------  -------------  -----------  -----------  ------------------
    node_B_1       on            on          off             default
    node_B_2       on            on          off             default
    2 entries were displayed.
    
     cluster_B::>
  4. Turn on the disk shelves at the disaster site and make sure that all disks are running.

  5. In a bridge-attached or fabric-attached MetroCluster configuration, turn on all FC-to-SAS bridges at the disaster site.

  6. If any disks were replaced, leave the controllers powered off or at the LOADER prompt.

  7. In a fabric-attached MetroCluster configuration, enable the non-ISL ports on the FC switches.

    If the switch vendor is…​

    Then use these steps to enable the ports…​

    Brocade

    1. Persistently enable the ports connected to the FC-to-SAS bridges: portcfgpersistentenable port-number

      In the following example, ports 6 and 7 are enabled:

      FC_switch_A_1:admin> portcfgpersistentenable 6
      FC_switch_A_1:admin> portcfgpersistentenable 7
      
      FC_switch_A_1:admin>
    2. Persistently enable the ports connected to the HBAs and FC-VI adapters: portcfgpersistentenable port-number

      In the following example, ports 1, 2, 4, and 5 are enabled:

      FC_switch_A_1:admin> portcfgpersistentenable 1
      FC_switch_A_1:admin> portcfgpersistentenable 2
      FC_switch_A_1:admin> portcfgpersistentenable 4
      FC_switch_A_1:admin> portcfgpersistentenable 5
      FC_switch_A_1:admin>
      For AFF A700 and FAS9000 systems, you must persistently enable all four FC-VI ports by using the portcfgpersistentenable command.
    3. Repeat substeps a and b for the second FC switch at the surviving site.

    Cisco

    1. Enter configuration mode for the interface, and then enable the ports with the no shut command.

      In the following example, port fc1/36 is enabled:

      FC_switch_A_1# conf t
      FC_switch_A_1(config)# interface fc1/36
      FC_switch_A_1(config-if)# no shut
      FC_switch_A_1(config-if)# end
      FC_switch_A_1# copy running-config startup-config
    2. Verify that the switch port is enabled: show interface brief

    3. Repeat substeps a and b on the other ports connected to the FC-to-SAS bridges, HBAs, and FC-VI adapters.

    4. Repeat substeps a, b, and c for the second FC switch at the surviving site.

Assigning ownership for replaced drives

If you replaced drives when restoring hardware at the disaster site or you had to zero drives or remove ownership, you must assign ownership to the affected drives.

The disaster site must have at least as many available drives as it did prior to the disaster.

The drive shelves and drive arrangement must meet the requirements in the “Required MetroCluster IP components and naming conventions” section of the MetroCluster IP Installation and Configuration Guide.

These steps are performed on the cluster at the disaster site.

This procedure shows the reassignment of all drives and the creation of new plexes at the disaster site. The new plexes are remote plexes of the surviving site and local plexes of the disaster site.

This section provides examples for two and four-node configurations. For two-node configurations, you can ignore references to the second node at each site. For eight-node configurations, you must account for the additional nodes on the second DR group. The examples make the following assumptions:

  • Site A is the disaster site.

  • node_A_1 has been replaced.

  • node_A_2 has been replaced.

    Present only in four-node MetroCluster configurations.

  • Site B is the surviving site.

  • node_B_1 is healthy.

  • node_B_2 is healthy.

    Present only in four-node MetroCluster configurations.

The controller modules have the following original system IDs:

Number of nodes in MetroCluster configuration   Node       Original system ID
Four                                            node_A_1   4068741258
                                                node_A_2   4068741260
                                                node_B_1   4068741254
                                                node_B_2   4068741256
Two                                             node_A_1   4068741258

You should keep in mind the following points when assigning the drives:

  • The old-count-of-disks value must be at least the number of disks for each node that was present before the disaster.

    If a lower number of disks is specified or present, the healing operations might not be completed due to insufficient space.

  • The new plexes to be created are remote plexes belonging to the surviving site (node_B_x pool1) and local plexes belonging to the disaster site (node_B_x pool0).

  • The total number of required drives should not include the root aggr disks.

    If n disks are assigned to pool1 of the surviving site, then (n-3) disks should be assigned to the disaster site with the assumption that the root aggregate uses three disks.

  • None of the disks can be assigned to a pool that is different from the one to which all other disks on the same stack are assigned.

  • Disks belonging to the surviving site are assigned to pool 1 and disks belonging to the disaster site are assigned to pool 0.

    1. Assign the new, unowned drives based on whether you have a four-node or two-node MetroCluster configuration:

      • For four-node MetroCluster configurations, assign the new, unowned disks to the appropriate disk pools by using the following series of commands on the replacement nodes:

        1. Systematically assign the replaced disks for each node to their respective disk pools: disk assign -s sysid -n old-count-of-disks -p pool

          From the surviving site, you issue a disk assign command for each node:

          cluster_B::> disk assign -s node_B_1-sysid -n old-count-of-disks -p 1 (remote pool of surviving site)
          cluster_B::> disk assign -s node_B_2-sysid -n old-count-of-disks -p 1 (remote pool of surviving site)
          cluster_B::> disk assign -s node_A_1-old-sysid -n old-count-of-disks -p 0 (local pool of disaster site)
          cluster_B::> disk assign -s node_A_2-old-sysid -n old-count-of-disks -p 0 (local pool of disaster site)

          The following example shows the commands with the system IDs:

          cluster_B::> disk assign -s 4068741254 -n 21 -p 1
          cluster_B::> disk assign -s 4068741256 -n 21 -p 1
          cluster_B::> disk assign -s 4068741258 -n 21 -p 0
          cluster_B::> disk assign -s 4068741260 -n 21 -p 0
        2. Confirm the ownership of the disks: storage disk show -fields owner, pool

          cluster_A::> storage disk show -fields owner, pool
          disk     owner          pool
          -------- ------------- -----
          0c.00.1  node_A_1      Pool0
          0c.00.2  node_A_1      Pool0
          .
          .
          .
          0c.00.8  node_A_1      Pool1
          0c.00.9  node_A_1      Pool1
          .
          .
          .
          0c.00.15 node_A_2      Pool0
          0c.00.16 node_A_2      Pool0
          .
          .
          .
          0c.00.22 node_A_2      Pool1
          0c.00.23 node_A_2      Pool1
          .
          .
          .
      • For two-node MetroCluster configurations, assign the new, unowned disks to the appropriate disk pools by using the following series of commands on the replacement node:

        1. Display the local shelf IDs: run local storage show shelf

        2. Assign the replaced disks for the healthy node to pool 1: run local disk assign -shelf shelf-id -n old-count-of-disks -p 1 -s node_B_1-sysid -f

        3. Assign the replaced disks for the replacement node to pool 0: run local disk assign -shelf shelf-id -n old-count-of-disks -p 0 -s node_A_1-sysid -f
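
          For example, a minimal sketch with illustrative values (shelf ID 3, 21 disks per node, and system IDs 4068741254 for node_B_1 and 4068741258 for node_A_1), assuming the commands are issued on the replacement node:

          cluster_A::> run local storage show shelf
          cluster_A::> run local disk assign -shelf 3 -n 21 -p 1 -s 4068741254 -f
          cluster_A::> run local disk assign -shelf 3 -n 21 -p 0 -s 4068741258 -f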

    2. On the surviving site, turn on automatic disk assignment again: storage disk option modify -autoassign on *

      cluster_B::> storage disk option modify -autoassign on *
      2 entries were modified.
    3. On the surviving site, confirm that automatic disk assignment is on: storage disk option show

       cluster_B::> storage disk option show
       Node     BKg. FW. Upd.  Auto Copy   Auto Assign  Auto Assign Policy
      --------  -------------  -----------  -----------  ------------------
      node_B_1       on            on          on             default
      node_B_2       on            on          on             default
      2 entries were displayed.
      
       cluster_B::>

Performing aggregate healing and restoring mirrors (MetroCluster FC configurations)

After replacing hardware and assigning disks, you can perform the MetroCluster healing operations. You must then confirm that aggregates are mirrored and, if necessary, restart mirroring.

  1. Perform the two phases of healing (aggregate healing and root healing) on the disaster site:

    cluster_B::> metrocluster heal -phase aggregates
    
    cluster_B::> metrocluster heal -phase root-aggregates
  2. Monitor the healing and verify that the aggregates are in either the resyncing or mirrored state: storage aggregate show -node local

    If the aggregate shows this state…​

    Then…​

    resyncing

    No action is required. Let the aggregate complete resyncing.

    mirror degraded

    Proceed to step 3.

    mirrored, normal

    No action is required.

    unknown, offline

    The root aggregate shows this state if all the disks on the disaster site were replaced.

    cluster_B::> storage aggregate show -node local
    
     Aggregate     Size Available Used% State   #Vols  Nodes      RAID Status
     --------- -------- --------- ----- ------- ------ ---------- ------------
     node_B_1_aggr1
                227.1GB   11.00GB   95% online       1 node_B_1   raid_dp,
                                                                  resyncing
     node_B_1_aggr2
                430.3GB   28.02GB   93% online       2 node_B_1   raid_dp,
                                                                  mirror
                                                                  degraded
     node_B_1_aggr3
                812.8GB   85.37GB   89% online       5 node_B_1   raid_dp,
                                                                  mirrored,
                                                                  normal
     3 entries were displayed.
    
    cluster_B::>

    In the preceding example, the three aggregates are each in a different state:

    Node                State
    node_B_1_aggr1      resyncing
    node_B_1_aggr2      mirror degraded
    node_B_1_aggr3      mirrored, normal

  3. If one or more plexes remain offline, additional steps are required to rebuild the mirror.

    In the preceding table, the mirror for node_B_1_aggr2 must be rebuilt.

    1. View details of the aggregate to identify any failed plexes: storage aggregate show -r -aggregate node_B_1_aggr2

      In the following example, plex /node_B_1_aggr2/plex0 is in a failed state:

      cluster_B::> storage aggregate show -r -aggregate node_B_1_aggr2
      
       Owner Node: node_B_1
        Aggregate: node_B_1_aggr2 (online, raid_dp, mirror degraded) (block checksums)
         Plex: /node_B_1_aggr2/plex0 (offline, failed, inactive, pool0)
          RAID Group /node_B_1_aggr2/plex0/rg0 (partial)
                                                                     Usable Physical
            Position Disk                     Pool Type     RPM     Size     Size Status
            -------- ------------------------ ---- ----- ------ -------- -------- ----------
      
         Plex: /node_B_1_aggr2/plex1 (online, normal, active, pool1)
          RAID Group /node_B_1_aggr2/plex1/rg0 (normal, block checksums)
                                                                     Usable Physical
            Position Disk                     Pool Type     RPM     Size     Size Status
            -------- ------------------------ ---- ----- ------ -------- -------- ----------
            dparity  1.44.8                    1   SAS    15000  265.6GB  273.5GB (normal)
            parity   1.41.11                   1   SAS    15000  265.6GB  273.5GB (normal)
            data     1.42.8                    1   SAS    15000  265.6GB  273.5GB (normal)
            data     1.43.11                   1   SAS    15000  265.6GB  273.5GB (normal)
            data     1.44.9                    1   SAS    15000  265.6GB  273.5GB (normal)
            data     1.43.18                   1   SAS    15000  265.6GB  273.5GB (normal)
       6 entries were displayed.
      
       cluster_B::>
    2. Delete the failed plex: storage aggregate plex delete -aggregate aggregate-name -plex plex

    3. Reestablish the mirror: storage aggregate mirror -aggregate aggregate-name
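
      For example, using the aggregate and failed plex shown in the preceding output:

      cluster_B::> storage aggregate plex delete -aggregate node_B_1_aggr2 -plex plex0
      cluster_B::> storage aggregate mirror -aggregate node_B_1_aggr2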

    4. Monitor the resynchronization and mirroring status of the plex until all mirrors are reestablished and all aggregates show mirrored, normal status: storage aggregate show

Reassigning disk ownership for root aggregates to replacement controller modules (MetroCluster FC configurations)

If one or both of the controller modules or NVRAM cards were replaced at the disaster site, the system ID has changed and you must reassign disks belonging to the root aggregates to the replacement controller modules.

Because the nodes are in switchover mode and healing has been done, only the disks containing the root aggregates of pool1 of the disaster site will be reassigned in this section. They are the only disks still owned by the old system ID at this point.

This section provides examples for two and four-node configurations. For two-node configurations, you can ignore references to the second node at each site. For eight-node configurations, you must account for the additional nodes on the second DR group. The examples make the following assumptions:

  • Site A is the disaster site.

  • node_A_1 has been replaced.

  • node_A_2 has been replaced.

    Present only in four-node MetroCluster configurations.

  • Site B is the surviving site.

  • node_B_1 is healthy.

  • node_B_2 is healthy.

    Present only in four-node MetroCluster configurations.

The old and new system IDs were identified in Acquiring the new System ID.

The examples in this procedure use controllers with the following system IDs:

Number of nodes   Node       Original system ID   New system ID
Four              node_A_1   4068741258           1574774970
                  node_A_2   4068741260           1574774991
                  node_B_1   4068741254           unchanged
                  node_B_2   4068741256           unchanged
Two               node_A_1   4068741258           1574774970
                  node_B_1   4068741254           unchanged

  1. With the replacement node in Maintenance mode, reassign the root aggregate disks: disk reassign -s old-system-ID -d new-system-ID

    *> disk reassign -s 4068741258 -d 1574774970
  2. View the disks to confirm the ownership change of the pool1 root aggr disks of the disaster site to the replacement node: disk show

    The output might show more or fewer disks, depending on how many disks are in the root aggregate and whether any of these disks failed and were replaced. If the disks were replaced, then Pool0 disks will not appear in the output.

    The pool1 root aggregate disks of the disaster site should now be assigned to the replacement node.

    *> disk show
    Local System ID: 1574774970
    
      DISK             OWNER             POOL   SERIAL NUMBER         HOME                DR HOME
    ------------       -------------     -----  -------------         -------------       -------------
    sw_A_1:6.126L19 node_A_1(1574774970) Pool0  serial-number  node_A_1(1574774970)
    sw_A_1:6.126L3  node_A_1(1574774970) Pool0  serial-number  node_A_1(1574774970)
    sw_A_1:6.126L7  node_A_1(1574774970) Pool0  serial-number  node_A_1(1574774970)
    sw_B_1:6.126L8  node_A_1(1574774970) Pool1  serial-number  node_A_1(1574774970)
    sw_B_1:6.126L24 node_A_1(1574774970) Pool1  serial-number  node_A_1(1574774970)
    sw_B_1:6.126L2  node_A_1(1574774970) Pool1  serial-number  node_A_1(1574774970)
    
    *>
  3. View the aggregate status: aggr status

    The output might show more or fewer disks, depending on how many disks are in the root aggregate and whether any of these disks failed and were replaced. If disks were replaced, then Pool0 disks will not appear in the output.

    *> aggr status
              Aggr State           Status
      node_A_1_root online          raid_dp, aggr
                                    mirror degraded
                                    64-bit
    *>
  4. Delete the contents of the mailbox disks: mailbox destroy local

  5. If the aggregate is not online, bring it online: aggr online aggr_name
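
    For example, using the root aggregate name from the preceding output:

    *> aggr online node_A_1_root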

  6. Halt the node to display the LOADER prompt: halt

Booting the new controller modules (MetroCluster FC configurations)

After aggregate healing has been completed for both the data and root aggregates, you must boot the node or nodes at the disaster site.

This task begins with the nodes showing the LOADER prompt.

  1. Display the boot menu: boot_ontap menu
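
    The boot menu is similar to the following (the exact options vary by ONTAP release):

    Please make a selection:

    (1) Normal Boot.
    (2) Boot without /etc/rc.
    (3) Change password.
    (4) Clean configuration and initialize all disks.
    (5) Maintenance mode boot.
    (6) Update flash from backup config.
    (7) Install new software first.
    (8) Reboot node.

    Selection (1-8)?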

  2. From the boot menu, select option 6, Update flash from backup config.

  3. Respond y to the following prompt: This will replace all flash-based configuration with the last backup to disks. Are you sure you want to continue?: y

    The system will boot twice, the second time to load the new configuration.

    If you did not clear the NVRAM contents of a used replacement controller, then you might see a panic with the following message: PANIC: NVRAM contents are invalid…​

    If this occurs, repeat step 2 to boot the system to the ONTAP prompt. You will then need to perform a root recovery. Contact technical support for assistance.

  4. Mirror the root aggregate on plex 0:

    1. Assign three pool0 disks to the new controller module.

    2. Mirror the root aggregate pool1 plex: aggr mirror root-aggr-name

    3. Assign unowned disks to pool0 on the local node.

  5. Refresh the MetroCluster configuration:

    1. Enter advanced privilege mode: set -privilege advanced

    2. Refresh the configuration: metrocluster configure -refresh true

    3. Return to admin privilege mode: set -privilege admin
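
    For example (the cluster_A prompt on the disaster-site cluster is assumed here):

    cluster_A::> set -privilege advanced
    cluster_A::*> metrocluster configure -refresh true
    cluster_A::*> set -privilege admin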

  6. If you have a four-node configuration, repeat the previous steps on the other node at the disaster site.

Proceed to verify the licenses on the replaced nodes.