Shut down the controllers - AFF A400
Shut down or take over the impaired controller using the appropriate procedure for your configuration.
Option 1: Shut down the controllers when replacing a chassis
Shut down the controllers so you can perform maintenance on the chassis.
This procedure is for systems with two-node configurations. If you have a system with more than two nodes, see How to perform a graceful shutdown and power up of one HA pair in a four node cluster.
Before you begin:
- Stop all clients/hosts from accessing data on the NetApp system.
- Suspend external backup jobs.
- Make sure you have the necessary permissions and credentials:
  - Local administrator credentials for ONTAP.
  - NetApp onboard key management (OKM) cluster-wide passphrase if using storage encryption or NVE/NAE.
  - BMC accessibility for each controller.
- Make sure you have the necessary tools and equipment for the replacement.
- As a best practice before shutdown, you should:
  - Perform additional system health checks (see the example commands after this list).
  - Upgrade ONTAP to a recommended release for the system.
  - Resolve any Active IQ Wellness Alerts and Risks. Make note of any faults currently present on the system, such as LEDs on the system components.
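The exact pre-shutdown health checks depend on your environment. As a minimal sketch, the following standard ONTAP commands can be used to confirm node health and eligibility, HA status, outstanding health alerts, and the running ONTAP release:
  cluster show
  storage failover show
  system health alert show
  version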
Steps:
- Log into the cluster through SSH, or log in from any node in the cluster using a local console cable and a laptop/console.
- Turn off AutoSupport and indicate how long you expect the system to be offline:
  system node autosupport invoke -node * -type all -message "MAINT=8h Power Maintenance"
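For reference only (this is not one of the shutdown steps): after maintenance is complete, AutoSupport is typically re-enabled by ending the maintenance window; confirm against the power-up procedure for your system:
  system node autosupport invoke -node * -type all -message MAINT=end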
- Identify the SP/BMC address of all nodes:
  system service-processor show -node * -fields address
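The output lists one SP/BMC management address per node. The following is an illustrative sketch only; the node names and addresses are hypothetical:
  node      address
  --------- ------------
  node1     10.10.10.11
  node2     10.10.10.12
  2 entries were displayed.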
- Exit the cluster shell:
  exit
- Log into the SP/BMC over SSH using the IP address of any of the nodes listed in the output from the previous step. If you are using a console/laptop, log into the controller using the same cluster administrator credentials. Open an SSH session to every SP/BMC connection so that you can monitor progress.
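For example, assuming a hypothetical SP/BMC address of 10.10.10.11 and the cluster administrator account admin, a monitoring session could be opened with:
  ssh admin@10.10.10.11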
- Halt the two nodes located in the impaired chassis:
  system node halt -node <node1>,<node2> -skip-lif-migration-before-shutdown true -ignore-quorum-warnings true -inhibit-takeover true
  For clusters using SnapMirror synchronous operating in StrictSync mode:
  system node halt -node <node1>,<node2> -skip-lif-migration-before-shutdown true -ignore-quorum-warnings true -inhibit-takeover true -ignore-strict-sync-warnings true
- Enter y for each controller in the cluster when you see:
  Warning: Are you sure you want to halt node "cluster <node-name> number"? {y|n}:
- Wait for each controller to halt and display the LOADER prompt.
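When a node has halted cleanly, its SP/BMC console session typically stops at a boot loader prompt similar to the following (the suffix letter varies by controller):
  LOADER-A>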
Option 2: Shut down a controller in a two-node MetroCluster configuration
To shut down the impaired controller, you must determine the status of the controller and, if necessary, switch over the controller so that the healthy controller continues to serve data from the impaired controller's storage.
You must leave the power supplies turned on at the end of this procedure to provide power to the healthy controller.
Steps:
- Check the MetroCluster status to determine whether the impaired controller has automatically switched over to the healthy controller:
  metrocluster show
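As an illustrative sketch only (the cluster names are placeholders and the exact output layout varies by ONTAP release), a configuration in which switchover has already occurred shows switchover in the Mode field of the surviving cluster:
  cluster_A::> metrocluster show
  Cluster                   Entry Name          State
  ------------------------- ------------------- -----------
   Local: cluster_A         Configuration state configured
                            Mode                switchover
  ...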
- Depending on whether an automatic switchover has occurred, proceed according to the following:
  - If the impaired controller has automatically switched over: proceed to the next step.
  - If the impaired controller has not automatically switched over: perform a planned switchover operation from the healthy controller:
    metrocluster switchover
  - If the impaired controller has not automatically switched over, you attempted switchover with the metrocluster switchover command, and the switchover was vetoed: review the veto messages and, if possible, resolve the issue and try again. If you are unable to resolve the issue, contact technical support.
- Resynchronize the data aggregates by running the metrocluster heal -phase aggregates command from the surviving cluster:
  controller_A_1::> metrocluster heal -phase aggregates
  [Job 130] Job succeeded: Heal Aggregates is successful.
  If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
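A minimal sketch of the reissued command with the override parameter, to be used only after reviewing and, where possible, resolving the veto messages:
  metrocluster heal -phase aggregates -override-vetoes true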
- Verify that the operation has been completed by using the metrocluster operation show command:
  controller_A_1::> metrocluster operation show
      Operation: heal-aggregates
          State: successful
     Start Time: 7/25/2016 18:45:55
       End Time: 7/25/2016 18:45:56
         Errors: -
- Check the state of the aggregates by using the storage aggregate show command:
  controller_A_1::> storage aggregate show
  Aggregate     Size Available Used% State   #Vols  Nodes            RAID Status
  --------- -------- --------- ----- ------- ------ ---------------- ------------
  ...
  aggr_b2    227.1GB   227.1GB    0% online       0 mcc1-a2          raid_dp, mirrored, normal
  ...
- Heal the root aggregates by using the metrocluster heal -phase root-aggregates command:
  mcc1A::> metrocluster heal -phase root-aggregates
  [Job 137] Job succeeded: Heal Root Aggregates is successful
  If the healing is vetoed, you have the option of reissuing the metrocluster heal command with the -override-vetoes parameter. If you use this optional parameter, the system overrides any soft vetoes that prevent the healing operation.
- Verify that the heal operation is complete by using the metrocluster operation show command on the destination cluster:
  mcc1A::> metrocluster operation show
      Operation: heal-root-aggregates
          State: successful
     Start Time: 7/29/2016 20:54:41
       End Time: 7/29/2016 20:54:42
         Errors: -
- On the impaired controller module, disconnect the power supplies.