ONTAP MetroCluster

Upgrading controllers in a MetroCluster IP configuration using switchover and switchback (ONTAP 9.8 and later)


Beginning with ONTAP 9.8, you can use the MetroCluster switchover operation to provide nondisruptive service to clients while the controller modules on the partner cluster are upgraded. Other components (such as storage shelves or switches) cannot be upgraded as part of this procedure.

Platforms supported by this procedure

  • The platforms must be running ONTAP 9.8 or later.

  • The target (new) platform must be a different model than the original platform.

  • You can only upgrade specific platform models using this procedure in a MetroCluster IP configuration.

About this task

  • This procedure applies to controller modules in a MetroCluster IP configuration.

  • All controllers in the configuration should be upgraded during the same maintenance period.

    Operating the MetroCluster configuration with different controller types is not supported outside of this maintenance activity.

  • The MetroCluster IP switches (switch type, vendor, and model) and firmware version must be supported on the existing and new controllers in your upgrade configuration.

    Refer to the NetApp Hardware Universe or the IMT for supported switches and firmware versions.

  • If it is enabled on your system, disable end-to-end encryption before performing the upgrade.

  • If the new platform has fewer slots than the original system, or if it has fewer or different types of ports, you might need to add an adapter to the new system.

  • You reuse the IP addresses, netmasks, and gateways of the original platforms on the new platforms.

The following example names are used in this procedure:

  • site_A

    • Before upgrade:

      • node_A_1-old

      • node_A_2-old

    • After upgrade:

      • node_A_1-new

      • node_A_2-new

  • site_B

    • Before upgrade:

      • node_B_1-old

      • node_B_2-old

    • After upgrade:

      • node_B_1-new

      • node_B_2-new

Enable console logging

NetApp strongly recommends that you enable console logging on the devices that you are using and capture the console output when performing this procedure.

Set the required bootarg on the existing system

If you are upgrading to an AFF A70, AFF A90, or AFF A1K system, follow the steps to set the hw.cxgbe.toe_keepalive_disable=1 bootarg.

Caution If you are upgrading to an AFF A70, AFF A90, or AFF A1K system, you must complete this task before performing the upgrade. This task applies only to upgrades to an AFF A70, AFF A90, or AFF A1K system from a supported system. For all other upgrades, you can skip this task and go directly to Prepare for the upgrade.
Steps
  1. Halt one node at each site and allow its HA partner to perform a storage takeover of the node:

    halt -node <node_name>

  2. At the LOADER prompt of the halted node, enter the following:

    setenv hw.cxgbe.toe_keepalive_disable 1

    saveenv

    printenv hw.cxgbe.toe_keepalive_disable

  3. Boot the node:

    boot_ontap

  4. When the node boots, perform a giveback for the node at the prompt:

    storage failover giveback -ofnode <node_name>

  5. Repeat the steps on each node in the DR group that is being upgraded.
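
For reference, the following is a consolidated sketch of these steps for one node, using node_A_1-old as the example node and omitting the confirmation prompts; the HA partner serves data while the node is halted:

    cluster_A::> halt -node node_A_1-old

    LOADER> setenv hw.cxgbe.toe_keepalive_disable 1
    LOADER> saveenv
    LOADER> printenv hw.cxgbe.toe_keepalive_disable
    LOADER> boot_ontap

    cluster_A::> storage failover giveback -ofnode node_A_1-old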

Prepare for the upgrade

Before making any changes to the existing MetroCluster configuration, you must check the health of the configuration, prepare the new platforms, and perform other miscellaneous tasks.

Workflow for upgrading controllers in a MetroCluster IP configuration

You can use the workflow diagram to help you plan the upgrade tasks.

[Workflow diagram: upgrading controllers in a MetroCluster IP configuration]

Update the MetroCluster switch RCF files before upgrading controllers

Depending on the old and new platform models, the installed RCF file version, and whether you want to change the VLAN IDs used by the back-end MetroCluster connections, you might need to update the switch RCF files before you begin the platform upgrade procedure.

About this task

You must update the RCF file in the following scenarios:

  • For certain platform models, the switches must be using a supported VLAN ID for the back-end MetroCluster IP connections. If the old or new platform models are in the following table, and not using a supported VLAN ID, you must update the switch RCF files.

    Note The local cluster connections can use any VLAN; they do not need to be in the given range.

    Platform model (old or new)    Supported VLAN IDs
    AFF A400                       10, 20, or any value in the range 101 to 4096 inclusive

  • The switch configuration is not at the minimum supported RCF file version:

    Switch model          Required RCF file version
    Cisco 3132Q-V         1.7 or later
    Cisco 3232C           1.7 or later
    Broadcom BES-53248    1.3 or later

  • You want to change the VLAN configuration.

    The VLAN ID range is 101 to 4096 inclusive.

The switches at site_A will be upgraded when the controllers on site_A are upgraded.

Steps
  1. Prepare the IP switches for the application of the new RCF files.

    Follow the steps in the section for your switch vendor:

  2. Download and install the RCF files.

    Follow the steps in the section for your switch vendor:

Map ports from the old nodes to the new nodes

You must verify that the physical ports on node_A_1-old map correctly to the physical ports on node_A_1-new, which will allow node_A_1-new to communicate with other nodes in the cluster and with the network after the upgrade.

About this task

When the new node is first booted during the upgrade process, it will replay the most recent configuration of the old node it is replacing. When you boot node_A_1-new, ONTAP attempts to host LIFs on the same ports that were used on node_A_1-old. Therefore, as part of the upgrade you must adjust the port and LIF configuration so it is compatible with that of the old node. During the upgrade procedure, you will perform steps on both the old and new nodes to ensure correct cluster, management, and data LIF configuration.

The following table shows examples of configuration changes related to the port requirements of the new nodes.

Cluster interconnect physical ports

Old controller ports: e0a, e0b
New controller ports: e3a, e3b
Required action: No matching port. After upgrade, you must recreate the cluster ports.

Old controller ports: e0c, e0d
New controller ports: e0a, e0b, e0c, e0d
Required action: e0c and e0d are matching ports. You do not have to change the configuration, but after the upgrade you can spread your cluster LIFs across the available cluster ports.

Steps
  1. Determine what physical ports are available on the new controllers and what LIFs can be hosted on the ports.

    The controller's port usage depends on the platform module and which switches you will use in the MetroCluster IP configuration. You can gather the port usage of the new platforms from the NetApp Hardware Universe.

  2. Plan your port usage and fill in the following tables for reference for each of the new nodes.

    You will refer to the table as you carry out the upgrade procedure.

    For each LIF, record the ports, IPspaces, and broadcast domains on node_A_1-old and the planned ports, IPspaces, and broadcast domains on node_A_1-new:

    LIF                   node_A_1-old (Ports / IPspaces / Broadcast domains)    node_A_1-new (Ports / IPspaces / Broadcast domains)
    Cluster 1
    Cluster 2
    Cluster 3
    Cluster 4
    Node management
    Cluster management
    Data 1
    Data 2
    Data 3
    Data 4
    SAN
    Intercluster port
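
To help fill in the table, you can record the current layout of the old node with commands that also appear later in Gather information before the upgrade, and compare it against the port usage of the new platform in the NetApp Hardware Universe. A brief sketch, using the example node name node_A_1-old:

    network port show -node node_A_1-old -type physical
    network interface show -role cluster,node-mgmt
    network port broadcast-domain show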

Netboot the new controllers

After you install the new nodes, you need to netboot to ensure the new nodes are running the same version of ONTAP as the original nodes. The term netboot means you are booting from an ONTAP image stored on a remote server. When preparing for netboot, you must put a copy of the ONTAP 9 boot image onto a web server that the system can access.

Steps
  1. Netboot the new controllers:

    1. Access the NetApp Support Site to download the files used for performing the netboot of the system.

    2. Download the appropriate ONTAP software from the software download section of the NetApp Support Site and store the ontap-version_image.tgz file in a web-accessible directory.

    3. Change to the web-accessible directory and verify that the files you need are available.

      Your directory listing should contain a netboot folder with a kernel file:

      ontap-version_image.tgz

      You do not need to extract the ontap-version_image.tgz file.

    4. At the LOADER prompt, configure the netboot connection for a management LIF:

      If IP addressing is DHCP, configure the automatic connection:

      ifconfig e0M -auto

      If IP addressing is static, configure the manual connection:

      ifconfig e0M -addr=ip_addr -mask=netmask -gw=gateway

    5. Perform the netboot.

      netboot http://web_server_ip/path_to_web-accessible_directory/ontap-version_image.tgz

    6. From the boot menu, select option (7) Install new software first to download and install the new software image to the boot device.

      Disregard the following message:

      "This procedure is not supported for Non-Disruptive Upgrade on an HA pair". It applies to nondisruptive upgrades of software, not to upgrades of controllers.

    7. If you are prompted to continue the procedure, enter y, and when prompted for the package, enter the URL of the image file:

      http://web_server_ip/path_to_web-accessible_directory/ontap-version_image.tgz

    8. Enter the user name and password if applicable, or press Enter to continue.

    9. Be sure to enter n to skip the backup recovery when you see a prompt similar to the following:

      Do you want to restore the backup configuration now? {y|n} n
    10. Reboot by entering y when you see a prompt similar to the following:

      The node must be rebooted to start using the newly installed software. Do you want to reboot now? {y|n}
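
The following sketch shows substeps 4 and 5 with static addressing; the IP addresses and web server path are placeholders for your environment:

    LOADER> ifconfig e0M -addr=192.168.10.21 -mask=255.255.255.0 -gw=192.168.10.1
    LOADER> netboot http://192.168.10.50/ontap/ontap-version_image.tgz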

Clear the configuration on a controller module

Before using a new controller module in the MetroCluster configuration, you must clear the existing configuration.

Steps
  1. If necessary, halt the node to display the LOADER prompt:

    halt

  2. At the LOADER prompt, set the environmental variables to default values:

    set-defaults

  3. Save the environment:

    saveenv

  4. At the LOADER prompt, launch the boot menu:

    boot_ontap menu

  5. At the boot menu prompt, clear the configuration:

    wipeconfig

    Respond yes to the confirmation prompt.

    The node reboots and the boot menu is displayed again.

  6. At the boot menu, select option 5 to boot the system into Maintenance mode.

    Respond yes to the confirmation prompt.

Verify MetroCluster health before site upgrade

You must verify the health and connectivity of the MetroCluster configuration prior to performing the upgrade.

Steps
  1. Verify the operation of the MetroCluster configuration in ONTAP:

    1. Check whether the nodes are multipathed:
      node run -node <node_name> sysconfig -a

      You should issue this command for each node in the MetroCluster configuration.

    2. Verify that there are no broken disks in the configuration:
      storage disk show -broken

      You should issue this command on each node in the MetroCluster configuration.

    3. Check for any health alerts:

      system health alert show

      You should issue this command on each cluster.

    4. Verify the licenses on the clusters:

      system license show

      You should issue this command on each cluster.

    5. Verify the devices connected to the nodes:

      network device-discovery show

      You should issue this command on each cluster.

    6. Verify that the time zone and time is set correctly on both sites:

      cluster date show

      You should issue this command on each cluster. You can use the cluster date commands to configure the time and time zone.

  2. Confirm the operational mode of the MetroCluster configuration and perform a MetroCluster check.

    1. Confirm the MetroCluster configuration and that the operational mode is normal:
      metrocluster show

    2. Confirm that all expected nodes are shown:
      metrocluster node show

    3. Issue the following command:

      metrocluster check run

    4. Display the results of the MetroCluster check:

      metrocluster check show

  3. Check the MetroCluster cabling with the Config Advisor tool.

    1. Download and run Config Advisor.

    2. After running Config Advisor, review the tool's output and follow the recommendations in the output to address any issues discovered.

Gather information before the upgrade

Before upgrading, you must gather information for each of the nodes, and, if necessary, adjust the network broadcast domains, remove any VLANs and interface groups, and gather encryption information.

Steps
  1. Record the physical cabling for each node, labelling cables as needed to allow correct cabling of the new nodes.

  2. Gather interconnect, port and LIF information for each node.

    You should gather the output of the following commands for each node:

    • metrocluster interconnect show

    • metrocluster configuration-settings connection show

    • network interface show -role cluster,node-mgmt

    • network port show -node <node_name> -type physical

    • network port vlan show -node <node_name>

    • network port ifgrp show -node <node_name> -instance

    • network port broadcast-domain show

    • network port reachability show -detail

    • network ipspace show

    • volume show

    • storage aggregate show

    • system node run -node <node_name> sysconfig -a

    • aggr show -r

    • disk show

    • system node run <node-name> disk show

    • vol show -fields type

    • vol show -fields type,space-guarantee

    • vserver fcp initiator show

    • storage disk show

    • metrocluster configuration-settings interface show

  3. Gather the UUIDs for site_B (the site whose platforms are currently being upgraded):

    metrocluster node show -fields node-cluster-uuid, node-uuid

    These values must be configured accurately on the new site_B controller modules to ensure a successful upgrade. Copy the values to a file so that you can copy them into the proper commands later in the upgrade process.

    The following example shows the command output with the UUIDs:

    cluster_B::> metrocluster node show -fields node-cluster-uuid, node-uuid
      (metrocluster node show)
    dr-group-id cluster     node   node-uuid                            node-cluster-uuid
    ----------- --------- -------- ------------------------------------ ------------------------------
    1           cluster_A node_A_1 f03cb63c-9a7e-11e7-b68b-00a098908039 ee7db9d5-9a82-11e7-b68b-00a098908039
    1           cluster_A node_A_2 aa9a7a7a-9a81-11e7-a4e9-00a098908c35 ee7db9d5-9a82-11e7-b68b-00a098908039
    1           cluster_B node_B_1 f37b240b-9ac1-11e7-9b42-00a098c9e55d 07958819-9ac6-11e7-9b42-00a098c9e55d
    1           cluster_B node_B_2 bf8e3f8f-9ac4-11e7-bd4e-00a098ca379f 07958819-9ac6-11e7-9b42-00a098c9e55d
    4 entries were displayed.
    cluster_B::*

    It is recommended that you record the UUIDs into a table similar to the following.

    Cluster or node    UUID
    cluster_B          07958819-9ac6-11e7-9b42-00a098c9e55d
    node_B_1           f37b240b-9ac1-11e7-9b42-00a098c9e55d
    node_B_2           bf8e3f8f-9ac4-11e7-bd4e-00a098ca379f
    cluster_A          ee7db9d5-9a82-11e7-b68b-00a098908039
    node_A_1           f03cb63c-9a7e-11e7-b68b-00a098908039
    node_A_2           aa9a7a7a-9a81-11e7-a4e9-00a098908c35

  4. If the MetroCluster nodes are in a SAN configuration, collect the relevant information.

    You should gather the output of the following commands:

    • fcp adapter show -instance

    • fcp interface show -instance

    • iscsi interface show

    • ucadmin show

  5. If the root volume is encrypted, collect and save the passphrase used for key-manager:

    security key-manager backup show

  6. If the MetroCluster nodes are using encryption for volumes or aggregates, copy information about the keys and passphrases.

    1. If Onboard Key Manager is configured:
      security key-manager onboard show-backup

      You will need the passphrase later in the upgrade procedure.

    2. If enterprise key management (KMIP) is configured, issue the following commands:

      security key-manager external show -instance

      security key-manager key query

  7. Gather the system IDs of the existing nodes:

    metrocluster node show -fields node-systemid,ha-partner-systemid,dr-partner-systemid,dr-auxiliary-systemid

    The following example shows the command output with the system IDs:

    ::> metrocluster node show -fields node-systemid,ha-partner-systemid,dr-partner-systemid,dr-auxiliary-systemid
    
    dr-group-id cluster     node     node-systemid ha-partner-systemid dr-partner-systemid dr-auxiliary-systemid
    ----------- ----------- -------- ------------- ------------------- ------------------- ---------------------
    1           cluster_A node_A_1   537403324     537403323           537403321           537403322
    1           cluster_A node_A_2   537403323     537403324           537403322           537403321
    1           cluster_B node_B_1   537403322     537403321           537403323           537403324
    1           cluster_B node_B_2   537403321     537403322           537403324           537403323
    4 entries were displayed.

Remove Mediator or Tiebreaker monitoring

Before upgrading the platforms, you must remove monitoring if the MetroCluster configuration is monitored by the Tiebreaker or Mediator utility.

Steps
  1. Collect the output for the following command:

    storage iscsi-initiator show

  2. Remove the existing MetroCluster configuration from Tiebreaker, Mediator, or other software that can initiate switchover.

    If you are using Tiebreaker, remove the MetroCluster configuration from monitoring by following the steps in the MetroCluster Tiebreaker documentation.

    If you are using Mediator, issue the following command from the ONTAP prompt:

    metrocluster configuration-settings mediator remove

    If you are using third-party applications, refer to the product documentation.

Send a custom AutoSupport message prior to maintenance

Before performing the maintenance, you should issue an AutoSupport message to notify NetApp technical support that maintenance is underway. Informing technical support that maintenance is underway prevents them from opening a case on the assumption that a disruption has occurred.

About this task

This task must be performed on each MetroCluster site.

Steps
  1. Log in to the cluster.

  2. Invoke an AutoSupport message indicating the start of the maintenance:

    system node autosupport invoke -node * -type all -message MAINT=maintenance-window-in-hours

    The maintenance-window-in-hours parameter specifies the length of the maintenance window, with a maximum of 72 hours. If the maintenance is completed before the time has elapsed, you can invoke an AutoSupport message indicating the end of the maintenance period:

    system node autosupport invoke -node * -type all -message MAINT=end

  3. Repeat these steps on the partner site.
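
For example, to open an eight-hour maintenance window (the window length here is only an illustration; use the duration you have planned, up to the 72-hour maximum):

    system node autosupport invoke -node * -type all -message MAINT=8h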

Switch over the MetroCluster configuration

You must switch over the configuration to site_A so that the platforms on site_B can be upgraded.

About this task

This task must be performed on site_A.

After completing this task, cluster_A is active and serving data for both sites. cluster_B is inactive, and ready to begin the upgrade process.

[Diagram: MetroCluster configuration with cluster_A in switchover, serving data for both sites]
Steps
  1. Switch over the MetroCluster configuration to site_A so that site_B's nodes can be upgraded:

    1. Issue the following command on cluster_A:

      metrocluster switchover -controller-replacement true

      The operation can take several minutes to complete.

    2. Monitor the switchover operation:

      metrocluster operation show

    3. After the operation is complete, confirm that the nodes are in switchover state:

      metrocluster show

    4. Check the status of the MetroCluster nodes:

      metrocluster node show

      Automatic healing of aggregates after negotiated switchover is disabled during controller upgrade.

Remove interface configurations and uninstall the old controllers

You must move data LIFs to a common port, remove VLANs and interface groups on the old controllers, and then physically uninstall the controllers.

Steps
  1. Boot the old nodes and log in to the nodes:

    boot_ontap

  2. Modify the intercluster LIFs on the old controllers to use a different home port than the ports used for HA interconnect or MetroCluster IP DR interconnect on the new controllers.

    Note This step is required for a successful upgrade.

    The intercluster LIFs on the old controllers must use a different home port than the ports used for HA interconnect or MetroCluster IP DR interconnect on the new controllers. For example, when you upgrade to AFF A90 controllers, the HA interconnect ports are e1a and e7a, and the MetroCluster IP DR interconnect ports are e2b and e3b. You must move the intercluster LIFs on the old controllers if they are hosted on ports e1a, e7a, e2b, or e3b.

    For port distribution and allocation on the new nodes, refer to the NetApp Hardware Universe.

    1. On the old controllers, view the intercluster LIFs:

      network interface show -role intercluster

      Take one of the following actions depending on whether the intercluster LIFs on the old controllers use the same ports as the ports used for HA interconnect or MetroCluster IP DR interconnect on the new controllers.

      If the intercluster LIFs use the same home port, go to substep b.

      If the intercluster LIFs use a different home port, go to step 3.

    2. Modify the intercluster LIFs to use a different home port:

      network interface modify -vserver <vserver> -lif <intercluster_lif> -home-port <port-not-used-for-ha-interconnect-or-mcc-ip-dr-interconnect-on-new-nodes>

    3. Verify that all intercluster LIFs are on their new home ports:

      network interface show -role intercluster -is-home false

      The command output should be empty, indicating that all intercluster LIFs are on their respective home ports.

    4. If there are any LIFs that are not on their home ports, revert them using the following command:

      network interface revert -lif <intercluster_lif>

      Repeat the command for each intercluster LIF that is not on the home port.

  3. Assign the home port of all data LIFs on the old controller to a common port that is the same on both the old and new controller modules.

    1. Display the LIFs:

      network interface show

      All data LIFs, including SAN and NAS LIFs, are administratively up and operationally down, because they are up at the switchover site (cluster_A).

    2. Review the output to find a physical network port that exists on both the old and new controllers and that is not used as a cluster port.

      For example, e0d is a physical port on the old controllers that is also present on the new controllers, and e0d is not used as a cluster port or for any other purpose on the new controllers.

      For port usage for platform models, see the NetApp Hardware Universe

    3. Modify all data LIFs to use the common port as the home port:
      network interface modify -vserver <svm-name> -lif <data-lif> -home-port <port-id>

      In the following example, the common port is e0d:

      network interface modify -vserver vs0 -lif datalif1 -home-port e0d
  4. Modify broadcast domains to remove VLAN and physical ports that need to be deleted:

    broadcast-domain remove-ports -broadcast-domain <broadcast-domain-name> -ports <node-name:port-id>

    Repeat this step for all VLAN and physical ports.

  5. Remove any VLAN ports that use cluster ports as member ports and any interface groups (ifgrps) that use cluster ports as member ports. A consolidated example follows these steps.

    1. Delete VLAN ports:
      network port vlan delete -node <node_name> -vlan-name <portid-vlanid>

      For example:

      network port vlan delete -node node1 -vlan-name e1c-80
    2. Remove physical ports from the interface groups:

      network port ifgrp remove-port -node <node_name> -ifgrp <interface-group-name> -port <portid>

      For example:

      network port ifgrp remove-port -node node1 -ifgrp a1a -port e0d
    3. Remove VLAN and interface group ports from broadcast domain:

      network port broadcast-domain remove-ports -ipspace <ipspace> -broadcast-domain <broadcast-domain-name> -ports <nodename:portname>,<nodename:portname>,...

    4. Modify interface group ports to use other physical ports as members, as needed:

      network port ifgrp add-port -node <node_name> -ifgrp <interface-group-name> -port <port-id>

  6. Halt the nodes to the LOADER prompt:

    halt -inhibit-takeover true

  7. Connect to the serial console of the old controllers (node_B_1-old and node_B_2-old) at site_B and verify it is displaying the LOADER prompt.

  8. Gather the bootarg values:

    printenv

  9. Disconnect the storage and network connections on node_B_1-old and node_B_2-old and label the cables so they can be reconnected to the new nodes.

  10. Disconnect the power cables from node_B_1-old and node_B_2-old.

  11. Remove the node_B_1-old and node_B_2-old controllers from the rack.
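
The following sketch consolidates the cleanup in steps 4 and 5 for node_B_1-old, using hypothetical VLAN, interface group, and broadcast domain names; substitute the objects that exist in your configuration:

    network port broadcast-domain remove-ports -ipspace Default -broadcast-domain bcast1 -ports node_B_1-old:e1c-80
    network port vlan delete -node node_B_1-old -vlan-name e1c-80
    network port ifgrp remove-port -node node_B_1-old -ifgrp a1a -port e0d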

Set up the new controllers

You must rack and cable the new controllers.

Steps
  1. Plan out the positioning of the new controller modules and storage shelves as needed.

    The rack space depends on the platform model of the controller modules, the switch types, and the number of storage shelves in your configuration.

  2. Properly ground yourself.

  3. If your upgrade requires replacement of the controller modules, for example, upgrading from an AFF A800 to an AFF A90 system, you must remove the controller module from the chassis when you replace the controller module. For all other upgrades, skip to Step 4.

    On the front of the chassis, use your thumbs to firmly push each drive in until you feel a positive stop. This confirms that the drives are firmly seated against the chassis midplane.

    [Image: removing the controller module from the chassis]
  4. Install the controller modules.

    Note The installation steps you follow depend on whether your upgrade requires replacement of the controller modules, such as an upgrade from an AFF 800 to an AFF A90 system.
    Replacing controller modules

    Installing the new controllers separately is not applicable for upgrades of integrated systems with disks and controllers in the same chassis, for example, from an AFF A800 system to an AFF A90 system. The new controller modules and I/O cards must be swapped after powering off the old controllers, as shown in the image below.

    The following example image is for representation only; the controller modules and I/O cards can vary between systems.

    [Image: swapping the controller modules and I/O cards]
    All other upgrades

    Install the controller modules in the rack or cabinet.

  5. Cable the controllers' power, serial console, and management connections as described in Cabling the MetroCluster IP switches.

    Do not connect any other cables that were disconnected from old controllers at this time.

  6. Power up the new nodes and boot them to Maintenance mode.

Restore the HBA configuration

Depending on the presence and configuration of HBA cards in the controller module, you need to configure them correctly for your site's usage.

Steps
  1. In Maintenance mode configure the settings for any HBAs in the system:

    1. Check the current settings of the ports:

      ucadmin show

    2. Update the port settings as needed.

    If you have this type of HBA and desired mode…    Use this command…
    CNA FC                                             ucadmin modify -m fc -t initiator <adapter-name>
    CNA Ethernet                                       ucadmin modify -mode cna <adapter-name>
    FC target                                          fcadmin config -t target <adapter-name>
    FC initiator                                       fcadmin config -t initiator <adapter-name>

  2. Exit Maintenance mode:

    halt

    After you run the command, wait until the node stops at the LOADER prompt.

  3. Boot the node back into Maintenance mode to enable the configuration changes to take effect:

    boot_ontap maint

  4. Verify the changes you made:

    If you have this type of HBA…    Use this command…
    CNA                              ucadmin show
    FC                               fcadmin show
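
As an illustration, the following sketch changes a hypothetical CNA adapter 0e to FC initiator mode and then verifies the change after booting back into Maintenance mode (the adapter name is an assumption; use ucadmin show to identify your adapters):

    *> ucadmin show
    *> ucadmin modify -m fc -t initiator 0e
    *> halt

    LOADER> boot_ontap maint

    *> ucadmin show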

Set the HA state on the new controllers and chassis

You must verify the HA state of the controllers and chassis, and, if necessary, update the state to match your system configuration.

Steps
  1. In Maintenance mode, display the HA state of the controller module and chassis:

    ha-config show

    The HA state for all components should be mccip.

  2. If the displayed system state of the controller or chassis is not correct, set the HA state:

    ha-config modify controller mccip

    ha-config modify chassis mccip

  3. Verify and modify the Ethernet ports connected to NS224 shelves or storage switches.

    1. Verify the Ethernet ports connected to NS224 shelves or storage switches:

      storage port show

    2. Set all Ethernet ports connected to Ethernet shelves or storage switches, including shared switches for storage and cluster, to storage mode:

      storage port modify -p <port> -m storage

      Example:

      *> storage port modify -p e5b -m storage
      Changing NVMe-oF port e5b to storage mode
      Note This must be set on all affected ports for a successful upgrade.

      Disks from the shelves attached to the Ethernet ports are reported in the sysconfig -v output.

      Refer to the NetApp Hardware Universe for information on the storage ports for the system you are upgrading to.

    3. Verify that storage mode is set and confirm that the ports are in the online state:

      storage port show

  4. Halt the node: halt

    The node should stop at the LOADER> prompt.

  5. On each node, check the system date, time, and time zone: show date

  6. If necessary, set the date in UTC or GMT: set date <mm/dd/yyyy>

  7. Check the time by using the following command at the boot environment prompt: show time

  8. If necessary, set the time in UTC or GMT: set time <hh:mm:ss>

  9. Save the settings: saveenv

  10. Gather environment variables: printenv
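
A brief sketch of the HA state and clock checks on one new node; the date and time values are placeholders for the current UTC time:

    *> ha-config show
    *> ha-config modify controller mccip
    *> ha-config modify chassis mccip
    *> halt

    LOADER> show date
    LOADER> set date 10/01/2025
    LOADER> show time
    LOADER> set time 14:30:00
    LOADER> saveenv
    LOADER> printenv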

Update the switch RCFs to accommodate the new platforms

You must update the switches to a configuration that supports the new platform models.

About this task

You perform this task at the site containing the controllers that are currently being upgraded. In the examples shown in this procedure, we are upgrading site_B first.

The switches at site_A will be upgraded when the controllers on site_A are upgraded.

Steps
  1. Prepare the IP switches for the application of the new RCF files.

    Follow the steps in the procedure for your switch vendor:

  2. Download and install the RCF files.

    Follow the steps in the section for your switch vendor:

Set the MetroCluster IP bootarg variables

Certain MetroCluster IP bootarg values must be configured on the new controller modules. The values must match those configured on the old controller modules.

About this task

In this task, you will use the UUIDs and system IDs identified earlier in the upgrade procedure in Gather information before the upgrade.

Steps
  1. If the nodes being upgraded are AFF A400, FAS8300, or FAS8700 models, set the following bootargs at the LOADER prompt:

    setenv bootarg.mcc.port_a_ip_config <local-IP-address/local-IP-mask,0,HA-partner-IP-address,DR-partner-IP-address,DR-aux-partner-IP-address,vlan-id>

    setenv bootarg.mcc.port_b_ip_config <local-IP-address/local-IP-mask,0,HA-partner-IP-address,DR-partner-IP-address,DR-aux-partner-IP-address,vlan-id>

    Note If the interfaces are using the default VLANs, the vlan-id is not necessary.

    The following commands set the values for node_B_1-new using VLAN 120 for the first network and VLAN 130 for the second network:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.10/23,0,172.17.26.11,172.17.26.13,172.17.26.12,120
    setenv bootarg.mcc.port_b_ip_config 172.17.27.10/23,0,172.17.27.11,172.17.27.13,172.17.27.12,130

    The following commands set the values for node_B_2-new using VLAN 120 for the first network and VLAN 130 for the second network:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.11/23,0,172.17.26.10,172.17.26.12,172.17.26.13,120
    setenv bootarg.mcc.port_b_ip_config 172.17.27.11/23,0,172.17.27.10,172.17.27.12,172.17.27.13,130

    The following example shows the commands for node_B_1-new when the default VLAN is used:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.10/23,0,172.17.26.11,172.17.26.13,172.17.26.12
    setenv bootarg.mcc.port_b_ip_config 172.17.27.10/23,0,172.17.27.11,172.17.27.13,172.17.27.12

    The following example shows the commands for node_B_2-new when the default VLAN is used:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.11/23,0,172.17.26.10,172.17.26.12,172.17.26.13
    setenv bootarg.mcc.port_b_ip_config 172.17.27.11/23,0,172.17.27.10,172.17.27.12,172.17.27.13
  2. If the nodes being upgraded are not systems listed in the previous step, at the LOADER prompt for each of the new nodes, set the following bootargs with local_IP/mask:

    setenv bootarg.mcc.port_a_ip_config <local-IP-address/local-IP-mask,0,HA-partner-IP-address,DR-partner-IP-address,DR-aux-partner-IP-address>

    setenv bootarg.mcc.port_b_ip_config <local-IP-address/local-IP-mask,0,HA-partner-IP-address,DR-partner-IP-address,DR-aux-partner-IP-address>

    The following commands set the values for node_B_1-new:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.10/23,0,172.17.26.11,172.17.26.13,172.17.26.12
    setenv bootarg.mcc.port_b_ip_config 172.17.27.10/23,0,172.17.27.11,172.17.27.13,172.17.27.12

    The following commands set the values for node_B_2-new:

    setenv bootarg.mcc.port_a_ip_config 172.17.26.11/23,0,172.17.26.10,172.17.26.12,172.17.26.13
    setenv bootarg.mcc.port_b_ip_config 172.17.27.11/23,0,172.17.27.10,172.17.27.12,172.17.27.13
  3. At the new nodes' LOADER prompt, set the UUIDs:

    setenv bootarg.mgwd.partner_cluster_uuid <partner-cluster-UUID>

    setenv bootarg.mgwd.cluster_uuid <local-cluster-UUID>

    setenv bootarg.mcc.pri_partner_uuid <DR-partner-node-UUID>

    setenv bootarg.mcc.aux_partner_uuid <DR-aux-partner-node-UUID>

    setenv bootarg.mcc_iscsi.node_uuid <local-node-UUID>

    1. Set the UUIDs on node_B_1-new.

      The following example shows the commands for setting the UUIDs on node_B_1-new:

      setenv bootarg.mgwd.cluster_uuid ee7db9d5-9a82-11e7-b68b-00a098908039
      setenv bootarg.mgwd.partner_cluster_uuid 07958819-9ac6-11e7-9b42-00a098c9e55d
      setenv bootarg.mcc.pri_partner_uuid f37b240b-9ac1-11e7-9b42-00a098c9e55d
      setenv bootarg.mcc.aux_partner_uuid bf8e3f8f-9ac4-11e7-bd4e-00a098ca379f
      setenv bootarg.mcc_iscsi.node_uuid f03cb63c-9a7e-11e7-b68b-00a098908039
    2. Set the UUIDs on node_B_2-new:

      The following example shows the commands for setting the UUIDs on node_B_2-new:

      setenv bootarg.mgwd.cluster_uuid ee7db9d5-9a82-11e7-b68b-00a098908039
      setenv bootarg.mgwd.partner_cluster_uuid 07958819-9ac6-11e7-9b42-00a098c9e55d
      setenv bootarg.mcc.pri_partner_uuid bf8e3f8f-9ac4-11e7-bd4e-00a098ca379f
      setenv bootarg.mcc.aux_partner_uuid f37b240b-9ac1-11e7-9b42-00a098c9e55d
      setenv bootarg.mcc_iscsi.node_uuid aa9a7a7a-9a81-11e7-a4e9-00a098908c35
  4. Determine whether the original systems were configured for Advanced Drive Partitioning (ADP) by running the following command from the site that is up:

    disk show

    The "container type" column displays "shared" in the disk show output if ADP is configured. If "container type" has any other value, ADP is not configured on the system. The following example output shows a system configured with ADP:

    ::> disk show
                        Usable               Disk    Container   Container
    Disk                Size       Shelf Bay Type    Type        Name      Owner
    
    Info: This cluster has partitioned disks. To get a complete list of spare disk
          capacity use "storage aggregate show-spare-disks".
    ----------------    ---------- ----- --- ------- ----------- --------- --------
    1.11.0              894.0GB    11    0   SSD      shared     testaggr  node_A_1
    1.11.1              894.0GB    11    1   SSD      shared     testaggr  node_A_1
    1.11.2              894.0GB    11    2   SSD      shared     testaggr  node_A_1
  5. If the original systems were configured with partitioned disks for ADP, enable it at the LOADER prompt for each replacement node:

    setenv bootarg.mcc.adp_enabled true

  6. Set the following variables:

    setenv bootarg.mcc.local_config_id <original-sys-id>

    setenv bootarg.mcc.dr_partner <dr-partner-sys-id>

    Note The setenv bootarg.mcc.local_config_id variable must be set to the sys-id of the original controller module, node_B_1-old.
    1. Set the variables on node_B_1-new.

      The following example shows the commands for setting the values on node_B_1-new:

      setenv bootarg.mcc.local_config_id 537403322
      setenv bootarg.mcc.dr_partner 537403324
    2. Set the variables on node_B_2-new.

      The following example shows the commands for setting the values on node_B_2-new:

      setenv bootarg.mcc.local_config_id 537403321
      setenv bootarg.mcc.dr_partner 537403323
  7. If you are using encryption with an external key manager, set the required bootargs:

    setenv bootarg.kmip.init.ipaddr

    setenv bootarg.kmip.init.netmask

    setenv bootarg.kmip.init.gateway

    setenv bootarg.kmip.init.interface
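
For example, the external key manager bootargs might be set as follows; the KMIP network values shown are placeholders:

    setenv bootarg.kmip.init.ipaddr 10.10.10.20
    setenv bootarg.kmip.init.netmask 255.255.255.0
    setenv bootarg.kmip.init.gateway 10.10.10.1
    setenv bootarg.kmip.init.interface e0M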

Reassign root aggregate disks

Reassign the root aggregate disks to the new controller module, using the sysids gathered earlier.

About this task

These steps are performed in Maintenance mode.

Note Root aggregate disks are the only disks that must be reassigned during the controller upgrade process. Disk ownership of data aggregates is handled as part of the switchover/switchback operation.
Steps
  1. Boot the system to Maintenance mode:

    boot_ontap maint

  2. Display the disks on node_B_1-new from the Maintenance mode prompt:

    disk show -a

    Caution Before you proceed with disk reassignment, you must verify that the pool0 and pool1 disks belonging to the node's root aggregate are displayed in the disk show output. In the following example, the output lists the pool0 and pool1 disks owned by node_B_1-old.

    The command output shows the system ID of the new controller module (1574774970). However, the root aggregate disks are still owned by the old system ID (537403322). This example does not show drives owned by other nodes in the MetroCluster configuration.

    *> disk show -a
    Local System ID: 1574774970
    DISK                  OWNER                 POOL   SERIAL NUMBER   HOME                  DR HOME
    ------------          ---------             -----  -------------   -------------         -------------
    prod3-rk18:9.126L44   node_B_1-old(537403322)  Pool1  PZHYN0MD     node_B_1-old(537403322)  node_B_1-old(537403322)
    prod4-rk18:9.126L49   node_B_1-old(537403322)  Pool1  PPG3J5HA     node_B_1-old(537403322)  node_B_1-old(537403322)
    prod4-rk18:8.126L21   node_B_1-old(537403322)  Pool1  PZHTDSZD     node_B_1-old(537403322)  node_B_1-old(537403322)
    prod2-rk18:8.126L2    node_B_1-old(537403322)  Pool0  S0M1J2CF     node_B_1-old(537403322)  node_B_1-old(537403322)
    prod2-rk18:8.126L3    node_B_1-old(537403322)  Pool0  S0M0CQM5     node_B_1-old(537403322)  node_B_1-old(537403322)
    prod1-rk18:9.126L27   node_B_1-old(537403322)  Pool0  S0M1PSDW     node_B_1-old(537403322)  node_B_1-old(537403322)
    .
    .
    .
  3. Reassign the root aggregate disks on the drive shelves to the new controllers.

    If you are using ADP…​

    Then use this command…​

    Yes

    disk reassign -s <old-sysid> -d <new-sysid> -r <dr-partner-sysid>

    No

    disk reassign -s <old-sysid> -d <new-sysid>

    The following example shows reassignment of drives in a non-ADP configuration:

    *> disk reassign -s 537403322 -d 1574774970
    Partner node must not be in Takeover mode during disk reassignment from maintenance mode.
    Serious problems could result!!
    Do not proceed with reassignment if the partner is in takeover mode. Abort reassignment (y/n)? n
    
    After the node becomes operational, you must perform a takeover and giveback of the HA partner node to ensure disk reassignment is successful.
    Do you want to continue (y/n)? y
    Disk ownership will be updated on all disks previously belonging to Filer with sysid 537403322.
    Do you want to continue (y/n)? y
  4. Verify that the disks of the root aggregate are properly reassigned:

    disk show

    storage aggr status

    *> disk show
    Local System ID: 537097247
    
      DISK                    OWNER                    POOL   SERIAL NUMBER   HOME                     DR HOME
    ------------              -------------            -----  -------------   -------------            -------------
    prod03-rk18:8.126L18 node_B_1-new(537097247)  Pool1  PZHYN0MD        node_B_1-new(537097247)   node_B_1-new(537097247)
    prod04-rk18:9.126L49 node_B_1-new(537097247)  Pool1  PPG3J5HA        node_B_1-new(537097247)   node_B_1-new(537097247)
    prod04-rk18:8.126L21 node_B_1-new(537097247)  Pool1  PZHTDSZD        node_B_1-new(537097247)   node_B_1-new(537097247)
    prod02-rk18:8.126L2  node_B_1-new(537097247)  Pool0  S0M1J2CF        node_B_1-new(537097247)   node_B_1-new(537097247)
    prod02-rk18:9.126L29 node_B_1-new(537097247)  Pool0  S0M0CQM5        node_B_1-new(537097247)   node_B_1-new(537097247)
    prod01-rk18:8.126L1  node_B_1-new(537097247)  Pool0  S0M1PSDW        node_B_1-new(537097247)   node_B_1-new(537097247)
    ::>
    ::> aggr status
               Aggr          State           Status                Options
    aggr0_node_B_1           online          raid_dp, aggr         root, nosnap=on,
                                             mirrored              mirror_resync_priority=high(fixed)
                                             fast zeroed
                                             64-bit

Boot up the new controllers

You must boot the new controllers, taking care to ensure that the bootarg variables are correct and, if needed, perform the encryption recovery steps.

Steps
  1. Halt the new nodes:

    halt

  2. If external key manager is configured, set the related bootargs:

    setenv bootarg.kmip.init.ipaddr <ip-address>

    setenv bootarg.kmip.init.netmask <netmask>

    setenv bootarg.kmip.init.gateway <gateway-address>

    setenv bootarg.kmip.init.interface <interface-id>

  3. Check whether the partner-sysid is correct:

    printenv partner-sysid

    If the partner-sysid is not correct, set it:

    setenv partner-sysid <partner-sysID>

  4. Display the ONTAP boot menu:

    boot_ontap menu

  5. If root encryption is used, select the boot menu option for your key management configuration.

    If you are using onboard key management, select boot menu option 10 and follow the prompts to provide the required inputs to recover and restore the key-manager configuration.

    If you are using external key management, select boot menu option 11 and follow the prompts to provide the required inputs to recover and restore the key-manager configuration.

  6. From the boot menu, select “(6) Update flash from backup config”.

    Note Option 6 will reboot the node twice before completing.

    Respond “y” to the system id change prompts. Wait for the second reboot messages:

    Successfully restored env file from boot media...
    
    Rebooting to load the restored env file...
  7. At the LOADER prompt, double-check the bootarg values and update them as needed.

  8. Double-check that the partner-sysid is correct:

    printenv partner-sysid

    If the partner-sysid is not correct, set it:

    setenv partner-sysid <partner-sysID>

  9. If root encryption is used, select the boot menu option again for your key management configuration.

    If you are using onboard key management, select boot menu option 10 and follow the prompts to provide the required inputs to recover and restore the key-manager configuration.

    If you are using external key management, select boot menu option 11 and follow the prompts to provide the required inputs to recover and restore the key-manager configuration.

    Depending on the key manager setting, perform the recovery procedure by selecting option 10 or option 11, followed by option 6 at the first boot menu prompt. To boot the nodes completely, you might need to repeat the recovery procedure followed by option 1 (normal boot).

  10. Wait for the replaced nodes to boot up.

    If either node is in takeover mode, perform a giveback using the storage failover giveback command.

  11. If encryption is used, restore the keys using the correct command for your key management configuration.

    If you are using onboard key management, use this command:

    security key-manager onboard sync

    If you are using external key management, use this command:

    security key-manager external restore -vserver <SVM> -node <node> -key-server <host_name|IP_address:port> -key-id <key_id> -key-tag <key_tag>

  12. Verify that all ports are in a broadcast domain:

    1. View the broadcast domains:

      network port broadcast-domain show

    2. If a new broadcast domain is created for the data ports on the newly upgraded controllers, delete the broadcast domain:

      Note Only delete the new broadcast domain. Do not delete any of the broadcast domains that existed before starting the upgrade.

      broadcast-domain delete -broadcast-domain <broadcast_domain_name>

    3. Add any ports to a broadcast domain as needed.

    4. Recreate VLANs and interface groups as needed.

      VLAN and interface group membership might be different than that of the old node.
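
For example, to recreate a hypothetical VLAN on the new node and add it back to an existing broadcast domain (the VLAN ID, broadcast domain, and IPspace names are placeholders):

    network port vlan create -node node_B_1-new -vlan-name e0d-80
    network port broadcast-domain add-ports -ipspace Default -broadcast-domain bcast1 -ports node_B_1-new:e0d-80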

Verify and restore LIF configuration

Verify that LIFs are hosted on appropriate nodes and ports as mapped out at the beginning of the upgrade procedure.

Steps
  1. Verify that LIFs are hosted on the appropriate node and ports prior to switchback.

    1. Change to the advanced privilege level:

      set -privilege advanced

    2. Override the port configuration to ensure proper LIF placement:

      vserver config override -command "network interface modify -vserver <svm-name> -home-port <active_port_after_upgrade> -lif <lif_name> -home-node <new_node_name>"

      When entering the network interface modify command within the vserver config override command, you cannot use the tab autocomplete feature. You can compose the network interface modify command using autocomplete and then enclose it within the vserver config override command.

    3. Return to the admin privilege level:

      set -privilege admin

  2. Revert the interfaces to their home node:

    network interface revert * -vserver <svm-name>

    Perform this step on all SVMs as required.
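
A short sketch of this task for a hypothetical SVM vs0 whose data LIF datalif1 should be homed on node_B_1-new port e0d (all names are placeholders):

    set -privilege advanced
    vserver config override -command "network interface modify -vserver vs0 -lif datalif1 -home-node node_B_1-new -home-port e0d"
    set -privilege admin
    network interface revert * -vserver vs0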

Switch back the MetroCluster configuration

In this task, you perform the switchback operation, returning the MetroCluster configuration to normal operation. The nodes on site_A are still awaiting upgrade.

[Diagram: MetroCluster configuration after switchback]
Steps
  1. Issue the metrocluster node show command on site_B and check the output.

    1. Verify that the new nodes are represented correctly.

    2. Verify that the new nodes are in the "Waiting for switchback" state.

  2. Perform the healing and switchback by running the required commands from any node in the active cluster (the cluster that is not undergoing upgrade).

    1. Heal the data aggregates:
      metrocluster heal aggregates

    2. Heal the root aggregates:

      metrocluster heal root-aggregates

    3. Switch back the cluster:

      metrocluster switchback

  3. Check the progress of the switchback operation:

    metrocluster show

    The switchback operation is still in progress when the output displays waiting-for-switchback:

    cluster_B::> metrocluster show
    Cluster                   Entry Name          State
    ------------------------- ------------------- -----------
     Local: cluster_B         Configuration state configured
                              Mode                switchover
                              AUSO Failure Domain -
    Remote: cluster_A         Configuration state configured
                              Mode                waiting-for-switchback
                              AUSO Failure Domain -

    The switchback operation is complete when the output displays normal:

    cluster_B::> metrocluster show
    Cluster                   Entry Name          State
    ------------------------- ------------------- -----------
     Local: cluster_B         Configuration state configured
                              Mode                normal
                              AUSO Failure Domain -
    Remote: cluster_A         Configuration state configured
                              Mode                normal
                              AUSO Failure Domain -

    If a switchback takes a long time to finish, you can check on the status of in-progress baselines by using the metrocluster config-replication resync-status show command. This command is at the advanced privilege level.
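
For example, a minimal sketch of checking the resync status at the advanced privilege level:

    set -privilege advanced
    metrocluster config-replication resync-status show
    set -privilege admin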

Check the health of the MetroCluster configuration

After upgrading the controller modules you must verify the health of the MetroCluster configuration.

About this task

This task can be performed on any node in the MetroCluster configuration.

Steps
  1. Verify the operation of the MetroCluster configuration:

    1. Confirm the MetroCluster configuration and that the operational mode is normal:
      metrocluster show

    2. Perform a MetroCluster check:
      metrocluster check run

    3. Display the results of the MetroCluster check:

      metrocluster check show

  2. Verify the MetroCluster connectivity and status.

    1. Check the MetroCluster IP connections:

      storage iscsi-initiator show

    2. Check that the nodes are operating:

      metrocluster node show

    3. Check that the MetroCluster IP interfaces are up:

      metrocluster configuration-settings interface show

    4. Check that local failover is enabled:

      storage failover show

Upgrade the nodes on cluster_A

You must repeat the upgrade tasks on cluster_A.

Steps
  1. Repeat the steps to upgrade the nodes on cluster_A, beginning with Preparing for the upgrade.

    As you perform the tasks, all example references to the clusters and nodes are reversed. For example, when the example is given to switchover from cluster_A, you will switchover from cluster_B.

Restore Tiebreaker or Mediator monitoring

After completing the upgrade of the MetroCluster configuration, you can resume monitoring with the Tiebreaker or Mediator utility.

Steps
  1. Restore monitoring if necessary, using the procedure for your configuration.

    If you are using Tiebreaker, restore monitoring by following the steps in the MetroCluster Tiebreaker documentation.

    If you are using Mediator, see Configuring the ONTAP Mediator service from a MetroCluster IP configuration.

    If you are using third-party applications, refer to the product documentation.

Send a custom AutoSupport message after maintenance

After completing the upgrade, you should send an AutoSupport message indicating the end of maintenance, so automatic case creation can resume.

Steps
  1. To resume automatic support case generation, send an AutoSupport message to indicate that the maintenance is complete.

    1. Issue the following command:
      system node autosupport invoke -node * -type all -message MAINT=end

    2. Repeat the command on the partner cluster.

Configure end-to-end encryption

If it is supported on your system, you can encrypt back-end traffic, such as NVlog and storage replication data, between the MetroCluster IP sites. Refer to Configure end-to-end encryption for more information.