Replace the NVDIMM battery - AFF A800

Contributors dougthomp netapp-martyh

To replace the NVDIMM battery, you must remove the controller module, remove the failed battery, install the replacement battery, and then reinstall the controller module.

All other components in the system must be functioning properly; if not, you must contact technical support.

Step 1: Shut down the impaired controller

You can shut down or take over the impaired controller using different procedures, depending on the storage system hardware configuration.

Option 1: Most configurations

To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.

About this task

If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI.
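The eligibility and health check described above can be sketched in Python. This is a minimal illustration, assuming the node status is available as text in a three-column layout (node name, health, eligibility) similar to what the ONTAP cluster show command prints; the sample text and the function name unhealthy_nodes are hypothetical and not taken from a real system.

```python
def unhealthy_nodes(cluster_show_text):
    """Return the names of nodes whose health or eligibility is not 'true'.

    Assumes each data row has exactly three whitespace-separated fields:
    node name, health, eligibility. Header and separator rows are skipped.
    """
    flagged = []
    for line in cluster_show_text.splitlines():
        fields = line.split()
        # Skip anything that is not a data row (header, dashes, blank lines).
        if len(fields) != 3 or fields[1] not in ("true", "false"):
            continue
        node, health, eligibility = fields
        if health != "true" or eligibility != "true":
            flagged.append(node)
    return flagged


# Illustrative sample only; not captured from a real cluster.
SAMPLE = """\
Node                  Health  Eligibility
--------------------- ------- ------------
node1                 true    true
node2                 false   true
"""

print(unhealthy_nodes(SAMPLE))
```

If the returned list is not empty, correct the issue before shutting down the impaired node.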

Steps
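  1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh

    Replace number_of_hours_down with the length of the maintenance window in hours; the trailing h is part of the message.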

    The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h

  2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false

  3. Take the impaired node to the LOADER prompt:

    • If the impaired node displays the LOADER prompt, go to the next step.

    • If the impaired node displays Waiting for giveback…, press Ctrl-C, and then respond y when prompted.

    • If the impaired node displays the system prompt or password prompt (enter system password), take over or halt the impaired node:

      • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name

        When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
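The AutoSupport command in step 1 above is a fixed string with the maintenance window appended as a number of hours followed by h. The following Python sketch builds that string so the hours value cannot be mistyped; the function name maintenance_autosupport_command is hypothetical, and the sketch only formats text; it does not connect to a storage system.

```python
def maintenance_autosupport_command(hours):
    """Build the AutoSupport command that suppresses case creation.

    hours is the expected length of the maintenance window. The trailing
    'h' in the MAINT value follows the example in step 1 (MAINT=2h).
    """
    # Reject booleans, non-integers, and zero or negative windows.
    if isinstance(hours, bool) or not isinstance(hours, int) or hours < 1:
        raise ValueError("hours must be a positive integer")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


print(maintenance_autosupport_command(2))
```

For a two-hour window, the result matches the example command shown in step 1.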

Option 2: Controller is in a MetroCluster

Note: Do not use this procedure if your system is in a two-node MetroCluster configuration.

To shut down the impaired node, you must determine the status of the node and, if necessary, take over the node so that the healthy node continues to serve data from the impaired node storage.

  • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy node shows false for eligibility and health, you must correct the issue before shutting down the impaired node; see the Administration overview with the CLI.

  • If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show).

Steps
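  1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh

    Replace number_of_hours_down with the length of the maintenance window in hours; the trailing h is part of the message.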

    The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h

  2. Disable automatic giveback from the console of the healthy node: storage failover modify -node local -auto-giveback false

  3. Take the impaired node to the LOADER prompt:

    • If the impaired node displays the LOADER prompt, go to the next step.

    • If the impaired node displays Waiting for giveback…, press Ctrl-C, and then respond y when prompted.

    • If the impaired node displays the system prompt or password prompt (enter system password), take over or halt the impaired node:

      • For an HA pair, take over the impaired node from the healthy node: storage failover takeover -ofnode impaired_node_name

        When the impaired node shows Waiting for giveback…, press Ctrl-C, and then respond y.
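The prompt-to-action mapping in step 3 above can be written as a small lookup, which may help when scripting a runbook or checklist. This Python sketch is an illustration only: the state labels (loader, waiting_for_giveback, system_prompt, password_prompt) and the function name next_action are hypothetical, and the operator still has to read the console and supply the matching label.

```python
def next_action(console_state, impaired_node_name):
    """Return the action from step 3 for what the impaired node displays.

    console_state uses hypothetical labels chosen for this sketch.
    """
    if console_state == "loader":
        return "Go to the next step."
    if console_state == "waiting_for_giveback":
        return "Press Ctrl-C, and then respond y when prompted."
    if console_state in ("system_prompt", "password_prompt"):
        # HA pair case: the takeover is issued from the healthy node.
        return (
            "From the healthy node, run: "
            f"storage failover takeover -ofnode {impaired_node_name}"
        )
    raise ValueError(f"unknown console state: {console_state!r}")


print(next_action("system_prompt", "impaired_node_name"))
```

After a takeover, the impaired node is expected to reach Waiting for giveback…, at which point the waiting_for_giveback action applies.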

Step 2: Remove the controller module

You must remove the controller module from the chassis when you replace the controller module or replace a component inside the controller module.

  1. If you are not already grounded, properly ground yourself.

  2. Unplug the controller module power supplies from the source.

  3. Release the power cable retainers, and then unplug the cables from the power supplies.

  4. Loosen the hook-and-loop strap binding the cables to the cable management device, and then unplug the system cables and SFP and QSFP modules (if needed) from the controller module, keeping track of where the cables were connected.

    Leave the cables in the cable management device so that when you reinstall the cable management device, the cables are organized.

  5. Remove the cable management device from the controller module and set it aside.

  6. Press down on both of the locking latches, and then rotate both latches downward at the same time.

    The controller module moves slightly out of the chassis.

    [Figure: removing the controller module. 1: Locking latch. 2: Locking pin.]

  7. Slide the controller module out of the chassis.

    Make sure that you support the bottom of the controller module as you slide it out of the chassis.

  8. Set the controller module aside in a safe place.

Step 3: Replace the NVDIMM battery

To replace the NVDIMM battery, you must remove the failed battery from the controller module and install the replacement battery into the controller module.

  1. Open the air duct cover and locate the NVDIMM battery in the riser.

    [Figure: replacing the NVDIMM battery. 1: Air duct riser. 2: NVDIMM battery plug. 3: NVDIMM battery pack.]

    Attention: The NVDIMM battery control board LED blinks while destaging contents to the flash memory when you halt the system. After the destage is complete, the LED turns off.

  2. Locate the battery plug and squeeze the clip on the face of the battery plug to release the plug from the socket, and then unplug the battery cable from the socket.

  3. Grasp the battery and lift the battery out of the air duct and controller module, and then set it aside.

  4. Remove the replacement battery from its package.

  5. Install the replacement battery pack in the NVDIMM air duct:

    1. Insert the battery pack into the slot and press firmly down on the battery pack to make sure that it is locked into place.

    2. Plug the battery plug into the riser socket and make sure that the plug locks into place.

  6. Close the NVDIMM air duct.

    Make sure that the plug locks into the socket.

Step 4: Reinstall the controller module and boot the system

After you replace a FRU in the controller module, you must reinstall the controller module and reboot it.

  1. Align the end of the controller module with the opening in the chassis, and then gently push the controller module halfway into the system.

    Note: Do not completely insert the controller module in the chassis until instructed to do so.

  2. Recable the system, as needed.

    If you removed the media converters (QSFPs or SFPs), remember to reinstall them if you are using fiber optic cables.

  3. Plug the power cord into the power supply, reinstall the power cable locking collar, and then connect the power supply to the power source.

  4. Complete the reinstallation of the controller module:

    1. Firmly push the controller module into the chassis until it meets the midplane and is fully seated.

      The locking latches rise when the controller module is fully seated.

      Note: To avoid damaging the connectors, do not use excessive force when sliding the controller module into the chassis.

      The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process.

    2. Rotate the locking latches upward, tilting them so that they clear the locking pins, and then lower them into the locked position.

    3. If you have not already done so, reinstall the cable management device.

    4. Interrupt the normal boot process by pressing Ctrl-C.

Step 5: Run diagnostics

After you have replaced a component in your system, you should run diagnostic tests on that component.

Your system must be at the LOADER prompt to start diagnostics.

All commands in the diagnostic procedures are issued from the node where the component is being replaced.

  1. If the node to be serviced is not at the LOADER prompt, halt the node: system node halt -node node_name

    After you issue the command, you should wait until the system stops at the LOADER prompt.

  2. At the LOADER prompt, load the drivers designed for system-level diagnostics: boot_diags

  3. Select Scan System from the displayed menu to enable running the diagnostics tests.

  4. Select Test Memory from the displayed menu.

  5. Proceed based on the result of the preceding step:

    • If the test failed, correct the failure, and then rerun the test.

    • If the test reported no failures, select Reboot from the menu to reboot the system.

Step 6: Return the failed part to NetApp

After you replace the part, you can return the failed part to NetApp, as described in the RMA instructions shipped with the kit. Contact technical support at NetApp Support, 888-463-8277 (North America), 00-800-44-638277 (Europe), or +800-800-80-800 (Asia/Pacific) if you need the RMA number or additional help with the replacement procedure.