Shut down the impaired controller - FAS8200

Contributors dougthomp netapp-martyh

You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.

Option 1: Most systems

To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.

If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI.

Steps
  1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=_number_of_hours_down_h

    The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h

  2. If the impaired node is part of an HA pair, disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false

  3. Take the impaired node to the LOADER prompt:

    If the impaired node is displaying…​ Then…​

    The LOADER prompt

    Go to the next step.

    Waiting for giveback…​

    Press Ctrl-C, and then respond y.

    System prompt or password prompt (enter system password)

    Take over or halt the impaired node: storage failover takeover -ofnode impaired_node_name When the impaired node shows Waiting for giveback…​, press Ctrl-C, and then respond y.

Option 2: Controller is in a MetroCluster

Note Do not use this procedure if your system is in a two-node MetroCluster configuration.

To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.

  • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI.

  • If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show).

Steps
  1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh

    The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h

  2. Disable automatic giveback from the console of the healthy node: storage failover modify –node local -auto-giveback false

  3. Take the impaired node to the LOADER prompt:

    If the impaired node is displaying…​ Then…​

    The LOADER prompt

    Go to the next step.

    Waiting for giveback…​

    Press Ctrl-C, and then respond y when prompted.

    System prompt or password prompt (enter system password)

    Take over or halt the impaired node:

    • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name

      When the impaired node shows Waiting for giveback…​, press Ctrl-C, and then respond y.

Option 3: Controller is in a two-node MetroCluster

To shut down the impaired node, you must determine the status of the node and, if necessary, switch over the node so that the healthy node continues to serve data from the impaired node storage.

About this task
  • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the "Returning SEDs to unprotected mode" section of Administration overview with the CLI.

  • You must leave the power supplies turned on at the end of this procedure to provide power to the healthy node.

Steps
  1. Check the MetroCluster status to determine whether the impaired node has automatically switched over to the healthy node: metrocluster show

  2. Depending on whether an automatic switchover has occurred, proceed according to the following table:

    If the impaired node…​ Then…​

    Has automatically switched over

    Proceed to the next step.

    Has not automatically switched over

    Perform a planned switchover operation from the healthy node: metrocluster switchover

    Has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed

    Review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support.

  3. Resynchronize the data aggregates by running the metrocluster heal -phase aggregates command from the surviving cluster.

    controller_A_1::> metrocluster heal -phase aggregates
    [Job 130] Job succeeded: Heal Aggregates is successful.

    If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.

  4. Verify that the operation has been completed by using the metrocluster operation show command.

    controller_A_1::> metrocluster operation show
        Operation: heal-aggregates
          State: successful
    Start Time: 7/25/2016 18:45:55
       End Time: 7/25/2016 18:45:56
         Errors: -
  5. Check the state of the aggregates by using the storage aggregate show command.

    controller_A_1::> storage aggregate show
    Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
    --------- -------- --------- ----- ------- ------ ---------------- ------------
    ...
    aggr_b2    227.1GB   227.1GB    0% online       0 mcc1-a2          raid_dp, mirrored, normal...
  6. Heal the root aggregates by using the metrocluster heal -phase root-aggregates command.

    mcc1A::> metrocluster heal -phase root-aggregates
    [Job 137] Job succeeded: Heal Root Aggregates is successful

    If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.

  7. Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:

    mcc1A::> metrocluster operation show
      Operation: heal-root-aggregates
          State: successful
     Start Time: 7/29/2016 20:54:41
       End Time: 7/29/2016 20:54:42
         Errors: -
  8. On the impaired controller module, disconnect the power supplies.