Prepare the nodes for upgrade

The controller replacement process begins with a series of prechecks. You also gather information about the original nodes for use later in the procedure and, if required, determine the type of self-encrypting drives that are in use.

Steps
  1. Begin the controller replacement process by entering the following command in the ONTAP command line:

    system controller replace start -nodes node_names

    Note
    • Beginning with ONTAP 9.10.1, the automated negotiated switchover (NSO) based upgrade procedure is the default for a four-node MetroCluster FC configuration. If you are upgrading a four-node MetroCluster FC configuration, when you issue the system controller replace start command, you must prevent the NSO based procedure initiating by setting the -nso parameter to false:

      system controller replace start -nodes node_names -nso false

    • The system controller replace start command can only be executed at the advanced privilege level:

      set -privilege advanced
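
    For example, assuming the original nodes are named node1 and node2 (hypothetical names), the complete sequence is:

      set -privilege advanced
      system controller replace start -nodes node1,node2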

    You will see the following output:

    Warning:
    1. Current ONTAP version is 9.x
    Before starting controller replacement operation, ensure that the new controllers are running the version 9.x
    
    2. Verify that the NVMEM or NVRAM batteries of the new nodes are charged, and charge them if they are not. You need to physically check the new nodes to see if the NVMEM or NVRAM batteries are charged. You can check the battery status either by connecting to a serial console or by using SSH, logging in to the Service Processor (SP) or Baseboard Management Controller (BMC) for your system, and using the system sensors command to see if the battery has a sufficient charge.
    
    Attention: Do not try to clear the NVRAM contents. If there is a need to clear the contents of NVRAM, contact NetApp technical support.
    
    3. If a controller was previously part of a different cluster, run wipeconfig before using it as the replacement controller.
    
    Do you want to continue? {y|n}: y
  2. Press y. You will see the following output:

    Controller replacement operation: Prechecks in progress.
    Controller replacement operation has been paused for user intervention.

    The system runs the following prechecks; record the output of each precheck for use later in the procedure:

    Cluster Health Check

    Checks all the nodes in the cluster to confirm they are healthy.

    MCC Cluster Check

    Checks whether the system is in a MetroCluster configuration.
    The operation automatically detects whether it is a MetroCluster configuration and performs the appropriate prechecks and verification checks.
    Only the four-node MetroCluster FC configuration is supported. The check fails for two-node MetroCluster configurations and for four-node MetroCluster IP configurations.
    If the MetroCluster configuration is in a switched-over state, the check fails.

    Aggregate Relocation Status Check

    Checks whether an aggregate relocation is already in progress.
    If another aggregate relocation is in progress, the check fails.

    Model Name Check

    Checks whether the controller models are supported for this procedure.
    If the models are not supported, the task fails.

    Cluster Quorum Check

    Checks that the nodes being replaced are in quorum. If the nodes are not in quorum, the task fails.

    Image Version Check

    Checks that the nodes being replaced run the same version of ONTAP.
    If the ONTAP image versions are different, the task fails.
    The new nodes must have the same version of ONTAP 9.x installed on them that is installed on the original nodes. If the new nodes have a different version of ONTAP installed, you need to netboot the new controllers after you install them. For instructions on how to upgrade ONTAP, refer to References to link to Upgrade ONTAP.

    HA Status Check

    Checks if both the nodes being replaced are in a high-availability (HA) pair configuration.
    If storage failover is not enabled for the controllers, the task fails.

    Aggregate Status Check

    Checks whether the nodes being replaced own aggregates for which they are not the home owner.
    The nodes should not own any non-local aggregates; if they do, the task fails.

    Disk Status Check

    If any nodes being replaced have missing or failed disks, the task fails.
    If any disks are missing, refer to References to link to Disk and aggregate management with the CLI, Logical storage management with the CLI, and High Availability management to configure storage for the HA pair.

    Data LIF Status Check

    Checks if any of the nodes being replaced have non-local data LIFs.
    The nodes should not contain any data LIFs for which they are not the home owner. If one of the nodes contains non-local data LIFs, the task fails.

    Cluster LIF Status

    Checks whether the cluster LIFs are up for both nodes. If the cluster LIFs are down, the task fails.

    ASUP Status Check

    Checks whether AutoSupport (ASUP) notifications are configured. If ASUP notifications are not configured, the task fails.
    You must enable ASUP before beginning the controller replacement procedure.

    CPU Utilization Check

    Checks if the CPU utilization is more than 50% for any of the nodes being replaced.
    If the CPU usage is more than 50% for a considerable period of time, the task fails.

    Aggregate Reconstruction Check

    Checks if reconstruction is occurring on any data aggregates.
    If aggregate reconstruction is in progress, the task fails.

    Node Affinity Job Check

    Checks if any node affinity jobs are running.
    If node affinity jobs are running, the check fails.
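
    You can check the progress of the prechecks, and of the overall operation, at any time with the replacement operation's own show commands. A minimal status check (a sketch; the fields displayed vary by ONTAP release):

      system controller replace show
      system controller replace show-details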

  3. After the controller replacement operation is started and the prechecks are completed, the operation pauses, enabling you to collect output information that you might need later when configuring node3.

    Note If you have a system with more than two cluster ports per node, such as an FAS8080 or an AFF8080 system, before you start the upgrade, you must migrate and re-home the cluster LIFs to two cluster ports per node. If you perform the controller upgrade with more than two cluster ports per node, cluster LIFs might be missing on the new controller after the upgrade.
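
    A sketch of migrating and re-homing one cluster LIF before the upgrade, assuming hypothetical LIF and port names (a LIF named clus3 moving to port e0b on node1); repeat for each cluster LIF that must move:

      network interface migrate -vserver Cluster -lif clus3 -destination-node node1 -destination-port e0b
      network interface modify -vserver Cluster -lif clus3 -home-port e0b
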
  4. Run the following set of commands as directed by the controller replacement procedure on the system console.

    From the serial port connected to each node, run and save the output of the following commands individually:

    • vserver services name-service dns show

    • network interface show -curr-node local -role cluster,intercluster,node-mgmt,cluster-mgmt,data

    • network port show -node local -type physical

    • service-processor show -node local -instance

    • network fcp adapter show -node local

    • network port ifgrp show -node local

    • system node show -instance -node local

    • run -node local sysconfig

    • storage aggregate show -node local

    • volume show -node local

    • storage array config show -switch switch_name

    • system license show -owner local

    • storage encryption disk show

    • security key-manager onboard show-backup

    • security key-manager external show

    • security key-manager external show-status

    • network port reachability show -detail -node local

    Note If NetApp Volume Encryption (NVE) or NetApp Aggregate Encryption (NAE) using the Onboard Key Manager (OKM) is in use, keep the key manager passphrase ready to complete the key manager resync later in the procedure.
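
    One convenient way to capture and save these outputs, assuming you can reach the node management LIF over SSH from an admin host (the hostname and file names below are hypothetical), is to run each command remotely and log it with tee:

      ssh admin@node1-mgmt "network port show -node local -type physical" | tee node1_ports.txt
      ssh admin@node1-mgmt "system node show -instance -node local" | tee node1_node_info.txt
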
  5. If your system uses self-encrypting drives, see the Knowledge Base article How to tell if a drive is FIPS certified to determine the type of self-encrypting drives that are in use on the HA pair that you are upgrading. ONTAP software supports two types of self-encrypting drives:

    • FIPS-certified NetApp Storage Encryption (NSE) SAS or NVMe drives

    • Non-FIPS self-encrypting NVMe drives (SEDs)

    Note

    You cannot mix FIPS drives with other types of drives on the same node or HA pair.

    You can mix SEDs with non-encrypting drives on the same node or HA pair.
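
    As a quick first pass before you consult the Knowledge Base article, you can list the self-encrypting drives with the storage encryption disk show command you already ran in step 4; the columns displayed vary by ONTAP release:

      storage encryption disk show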

Correct aggregate ownership if an ARL precheck fails

If the Aggregate Status Check fails, you must return aggregates owned by the partner node to the home owner node and initiate the precheck process again.

Steps
  1. Return the aggregates currently owned by the partner node to the home owner node:

    storage aggregate relocation start -node source_node -destination destination_node -aggregate-list *
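
    For example, assuming node2 currently holds aggregates whose home owner is node1 (node names hypothetical), you would run the following and then watch the relocation progress:

      storage aggregate relocation start -node node2 -destination node1 -aggregate-list *
      storage aggregate relocation show -node node2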

  2. Verify that neither node1 nor node2 still owns aggregates for which it is the current owner (but not the home owner):

    storage aggregate show -nodes node_name -is-home false -fields owner-name,home-name,state

    The following example shows the output of the command (run with -is-home true) when a node is both the current owner and home owner of its aggregates:

    cluster::> storage aggregate show -nodes node1 -is-home true -fields owner-name,home-name,state
    aggregate   home-name  owner-name  state
    ---------   ---------  ----------  ------
    aggr1       node1      node1       online
    aggr2       node1      node1       online
    aggr3       node1      node1       online
    aggr4       node1      node1       online
    
    4 entries were displayed.

After you finish

You must restart the controller replacement process:

system controller replace start -nodes node_names

License

Some features require licenses, which are issued as packages that include one or more features. Each node in the cluster must have its own key for each feature to be used in the cluster.

If you do not have new license keys, currently licensed features in the cluster are available to the new controller. However, using unlicensed features on the controller might put you out of compliance with your license agreement, so you should install the new license key or keys for the new controller after the upgrade is complete.

Refer to References to link to the NetApp Support Site where you can obtain new 28-character license keys for ONTAP. The keys are available in the My Support section under Software licenses. If the site does not have the license keys you need, you can contact your NetApp sales representative.
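
After you obtain the keys, installing one is a single command per key; the following is a sketch with a placeholder key string (not a valid key):

system license add -license-code AAAAAAAAAAAAAAAAAAAAAAAAAAAA

You can then confirm the installed packages with system license show.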

For detailed information about licensing, refer to References to link to the System Administration Reference.