Replace a DIMM - AFF A250

Contributors dougthomp thrisun netapp-martyh

You must replace a DIMM in the controller module when your system registers an increasing number of correctable error correction code (ECC) errors; failure to do so causes a system panic.

About this task

All other components in the system must be functioning properly; if not, you must contact technical support.

You must replace the failed component with a replacement FRU component you received from your provider.

Step 1: Shut down the impaired controller

To shut down the impaired controller, you must determine the status of the controller and, if necessary, take over the controller so that the healthy controller continues to serve data from the impaired controller storage.

About this task
  • If you are using NetApp Storage Encryption, you must have reset the MSID using the instructions in the “Returning SEDs to unprotected mode” section of the ONTAP 9 NetApp Encryption Power Guide.

  • If you have a SAN system, you must have checked the event messages (event log show) for the impaired controller's SCSI blade.

    Each SCSI-blade process should be in quorum with the other nodes in the cluster. Any issues must be resolved before you proceed with the replacement.

  • If you have a cluster with more than two nodes, it must be in quorum. If the cluster is not in quorum or a healthy controller shows false for eligibility and health, you must correct the issue before shutting down the impaired controller; see the Administration overview with the CLI.

  • If you have a MetroCluster configuration, you must have confirmed that the MetroCluster Configuration State is configured and that the nodes are in an enabled and normal state (metrocluster node show).
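The prerequisites above amount to a checklist that must be clean before the shutdown begins. The following Python sketch is one hypothetical way to record that checklist. The `PreCheck` fields and the `blocking_issues` function are illustrative names, not part of ONTAP, and the values are assumed to be entered by hand after reading the output of commands such as event log show and metrocluster node show.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PreCheck:
    """Hand-entered observations. Every field name here is hypothetical."""
    node_count: int
    cluster_in_quorum: bool
    healthy_node_eligible: bool
    healthy_node_healthy: bool
    # None means "does not apply to this system".
    nse_msid_reset: Optional[bool] = None
    san_scsi_blade_in_quorum: Optional[bool] = None
    metrocluster_configured: Optional[bool] = None
    metrocluster_nodes_normal: Optional[bool] = None


def blocking_issues(c: PreCheck) -> List[str]:
    """Return the prerequisites that are not met; an empty list means proceed."""
    issues: List[str] = []
    if c.nse_msid_reset is False:
        issues.append("MSID has not been reset (NetApp Storage Encryption)")
    if c.san_scsi_blade_in_quorum is False:
        issues.append("SCSI-blade process is not in quorum")
    # The quorum and eligibility/health checks are stated for clusters
    # with more than two nodes.
    if c.node_count > 2:
        if not c.cluster_in_quorum:
            issues.append("cluster is not in quorum")
        if not (c.healthy_node_eligible and c.healthy_node_healthy):
            issues.append("healthy controller shows false for eligibility or health")
    if c.metrocluster_configured is False:
        issues.append("MetroCluster Configuration State is not configured")
    if c.metrocluster_nodes_normal is False:
        issues.append("MetroCluster nodes are not enabled and normal")
    return issues


if __name__ == "__main__":
    check = PreCheck(node_count=4, cluster_in_quorum=False,
                     healthy_node_eligible=True, healthy_node_healthy=True)
    for issue in blocking_issues(check):
        print("Resolve before shutdown:", issue)
```

The sketch only organizes the decision; it does not query the cluster, and any open issue still has to be resolved as the bullets above describe.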

Steps
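  1. If AutoSupport is enabled, suppress automatic case creation by invoking an AutoSupport message: system node autosupport invoke -node * -type all -message MAINT=number_of_hours_downh

    Replace number_of_hours_down with the length of the maintenance window in hours; the trailing h is literal.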

    The following AutoSupport message suppresses automatic case creation for two hours: cluster1:*> system node autosupport invoke -node * -type all -message MAINT=2h

  2. Disable automatic giveback from the console of the healthy controller: storage failover modify -node local -auto-giveback false

    Note When you see Do you want to disable auto-giveback?, enter y.
  3. Take the impaired controller to the LOADER prompt. Proceed according to what the impaired controller is displaying:

    • The LOADER prompt: Go to Step 2: Remove the controller module.

    • Waiting for giveback…: Press Ctrl-C, and then respond y when prompted.

    • System prompt or password prompt (enter system password): Take over or halt the impaired controller from the healthy controller: storage failover takeover -ofnode impaired_node_name

      When the impaired controller shows Waiting for giveback…, press Ctrl-C, and then respond y.
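The steps above can be summarized as one parameterized command plus a small decision table. The Python sketch below is only an illustration of that logic. The function names and the console-state labels (`loader`, `waiting_for_giveback`, `system_or_password_prompt`) are hypothetical, and the sketch assumes the maintenance window is a whole number of hours. The command strings themselves are the ones given in the steps.

```python
def autosupport_maint_command(hours: int) -> str:
    """Build the step 1 AutoSupport command for a maintenance window.

    Assumption: the window is a positive whole number of hours.
    """
    if isinstance(hours, bool) or not isinstance(hours, int) or hours < 1:
        raise ValueError("hours must be a positive whole number")
    return (
        "system node autosupport invoke -node * -type all "
        f"-message MAINT={hours}h"
    )


# Hypothetical labels for what the impaired controller's console shows,
# mapped to the action given in step 3.
ACTIONS = {
    "loader": "Go to Step 2: Remove the controller module.",
    "waiting_for_giveback": "Press Ctrl-C, and then respond y when prompted.",
    "system_or_password_prompt": (
        "From the healthy controller, run: "
        "storage failover takeover -ofnode impaired_node_name. "
        "When the impaired controller shows Waiting for giveback, "
        "press Ctrl-C, and then respond y."
    ),
}


def next_action(console_state: str) -> str:
    """Return the step 3 action for a console-state label."""
    try:
        return ACTIONS[console_state]
    except KeyError:
        raise ValueError(f"unknown console state: {console_state!r}") from None


if __name__ == "__main__":
    print(autosupport_maint_command(2))
    print(next_action("waiting_for_giveback"))
```

Calling `autosupport_maint_command(2)` reproduces the two-hour example from step 1; the sketch only formats text and does not send anything to a controller.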

Step 2: Remove the controller module

You must remove the controller module from the chassis when you replace a component inside the controller module.

Make sure that you label the cables so that you know where they came from.

  1. If you are not already grounded, properly ground yourself.

  2. Unplug the controller module power supplies from the source.

  3. Release the power cable retainers, and then unplug the cables from the power supplies.

  4. Insert your forefinger into the latching mechanism on either side of the controller module, press the lever with your thumb, and gently pull the controller a few inches out of the chassis.

    Note If you have difficulty removing the controller module, place your index fingers through the finger holes from the inside (by crossing your arms).
    [Figure: removing and installing the controller module. 1: Lever. 2: Latching mechanism.]

  5. Using both hands, grasp the controller module sides and gently pull it out of the chassis and set it on a flat, stable surface.

  6. Turn the thumbscrew on the front of the controller module anti-clockwise and open the controller module cover.

    [Figure: opening the controller module cover. 1: Thumbscrew. 2: Controller module cover.]

  7. Lift out the air duct cover.

    [Figure: removing the air duct cover.]

Step 3: Replace a DIMM

To replace a DIMM, you must locate it in the controller module using the DIMM map label on top of the air duct and then replace it following the specific sequence of steps.

Use the following video or the tabulated steps to replace a DIMM:

Animation - Replace a DIMM
  1. Locate the impaired DIMM on your controller module.

    The DIMMs are in slots 1 and 3 on the motherboard. Slots 2 and 4 are left empty. Do not attempt to install DIMMs into these slots.

    [Figure: replacing a DIMM.]
  2. Note the orientation of the DIMM in the socket so that you can insert the replacement DIMM in the proper orientation.

  3. Slowly push apart the DIMM ejector tabs on either side of the DIMM, and slide the DIMM out of the slot.

  4. Leave DIMM ejector tabs on the connector in the open position.

  5. Remove the replacement DIMM from the antistatic shipping bag, hold the DIMM by the corners, and align it to the slot.

    Note Hold the DIMM by the edges to avoid pressure on the components on the DIMM circuit board.
  6. Insert the replacement DIMM squarely into the slot.

    The DIMM fits tightly in the socket. If it does not seat properly, remove it, realign it with the socket, and reinsert it.

  7. Visually inspect the DIMM to verify that it is evenly aligned and fully inserted into the socket.
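The slot rule from step 1 (DIMMs belong only in slots 1 and 3, and slots 2 and 4 stay empty) is easy to get wrong when working from a checklist or a script. The following Python sketch shows one way to guard against that. The constant and function names are hypothetical and exist only for illustration.

```python
# Slot layout as stated in step 1 of this section.
POPULATED_SLOTS = frozenset({1, 3})
RESERVED_EMPTY_SLOTS = frozenset({2, 4})


def check_dimm_slot(slot: int) -> int:
    """Return the slot number if a DIMM may be installed there.

    Raises ValueError for slots that must stay empty and for any
    slot number this procedure does not describe.
    """
    if slot in RESERVED_EMPTY_SLOTS:
        raise ValueError(f"slot {slot} must stay empty; do not install a DIMM there")
    if slot not in POPULATED_SLOTS:
        raise ValueError(f"slot {slot} is not described in this procedure")
    return slot


if __name__ == "__main__":
    for candidate in (1, 2, 3, 4):
        try:
            print("OK to replace DIMM in slot", check_dimm_slot(candidate))
        except ValueError as err:
            print("Stop:", err)
```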

Step 4: Install the controller module

After you have replaced the component in the controller module, you must reinstall the controller module into the chassis, and then boot it to Maintenance mode.

You can use the following illustrations or the written steps to reinstall the controller module in the chassis.

  1. If you have not already done so, install the air duct.

    [Figure: installing the air duct cover.]
  2. Close the controller module cover and tighten the thumbscrew.

    [Figure: closing the controller module cover. 1: Controller module cover. 2: Thumbscrew.]

  3. Insert the controller module into the chassis:

    1. Ensure the latching mechanism arms are locked in the fully extended position.

    2. Using both hands, align and gently slide the controller module into the latching mechanism arms until it stops.

    3. Place your index fingers through the finger holes from the inside of the latching mechanism.

    4. Press your thumbs down on the orange tabs on top of the latching mechanism and gently push the controller module over the stop.

    5. Release your thumbs from the top of the latching mechanisms and continue pushing until the latching mechanisms snap into place.

      The controller module begins to boot as soon as it is fully seated in the chassis. Be prepared to interrupt the boot process.

    The controller module should be fully inserted and flush with the edges of the chassis.

  4. Cable the management and console ports only, so that you can access the system to perform the tasks in the following sections.

    Note You will connect the rest of the cables to the controller module later in this procedure.

Step 5: Run diagnostics

After you have replaced a component in your system, you should run diagnostic tests on that component.

Your system must be at the LOADER prompt to start diagnostics.

All commands in the diagnostic procedures are issued from the controller where the component is being replaced.

  1. If the controller to be serviced is not at the LOADER prompt, halt the controller: system node halt -node node_name

    After you issue the command, you should wait until the system stops at the LOADER prompt.

  2. At the LOADER prompt, load the special drivers that system-level diagnostics require: boot_diags

  3. Select Scan System from the displayed menu to enable running the diagnostic tests.

  4. Select Test Memory from the displayed menu.

  5. Proceed based on the result of the preceding step:

    • If the test failed, correct the failure, and then rerun the test.

    • If the test reported no failures, select Reboot from the menu to reboot the system.
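The last step is a retry loop: run the memory test, correct any failure, rerun, and reboot only after a clean pass. The Python sketch below models that control flow and nothing more. The `run_test` and `correct_failure` callables stand in for the manual actions at the diagnostics menu, and the `max_attempts` limit is an assumption added here; the procedure itself does not set one.

```python
from typing import Callable


def run_memory_diagnostics(
    run_test: Callable[[], bool],
    correct_failure: Callable[[], None],
    max_attempts: int = 3,  # assumed limit, not part of the procedure
) -> int:
    """Repeat the memory test until it passes.

    Returns the number of attempts that were needed. After a passing
    result, the operator selects Reboot from the menu.
    """
    for attempt in range(1, max_attempts + 1):
        if run_test():
            return attempt
        # Correct the failure before rerunning, unless no attempts remain.
        if attempt < max_attempts:
            correct_failure()
    raise RuntimeError(f"memory test still failing after {max_attempts} attempts")


if __name__ == "__main__":
    # Simulated outcomes: the first run fails, the second passes.
    outcomes = iter([False, True])
    attempts = run_memory_diagnostics(
        run_test=lambda: next(outcomes),
        correct_failure=lambda: print("Correct the failure, then rerun the test"),
    )
    print(f"Test passed on attempt {attempts}; select Reboot from the menu")
```

Capping the attempts is a design choice for the sketch: a DIMM that keeps failing after reseating is a case for technical support rather than for further retries.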

Step 6: Return the failed part to NetApp

Return the failed part to NetApp, as described in the RMA instructions shipped with the kit. See the Part Return & Replacements page for further information.