Running system-level diagnostics

After installing a new DIMM, run system-level diagnostics to verify that the replacement did not introduce hardware problems.

Before you begin

Your system must be at the LOADER prompt before you can start system-level diagnostics.

About this task

All commands in the diagnostic procedures are issued from the node where the component is being replaced.

Steps

  1. If the node to be serviced is not at the LOADER prompt, perform the following steps:
    1. Select the Maintenance mode option from the displayed menu.
    2. After the node boots to Maintenance mode, halt the node: halt
      After you issue the command, wait until the system stops at the LOADER prompt.
      Important: During the boot process, you can safely respond y to prompts:
      • A prompt warning that when entering Maintenance mode in an HA configuration, you must ensure that the healthy node remains down.
  2. At the LOADER prompt, load the special drivers that system-level diagnostics requires to function properly: boot_diags
    During the boot process, you can safely respond y to the prompts until the Maintenance mode prompt (*>) appears.
  3. Run diagnostics on the system memory: sldiag device run -dev mem
  4. Verify that no hardware problems resulted from the replacement of the DIMMs: sldiag device status -dev mem -long -state failed
    System-level diagnostics returns you to the prompt if there are no test failures, or lists the full status of failures resulting from testing the component.
  5. Proceed based on the result of the preceding step:
    If the system-level diagnostics tests... Then...
    Were completed without any failures
    1. Clear the status logs: sldiag device clearstatus
    2. Verify that the log was cleared: sldiag device status

      The following default response is displayed:

      SLDIAG: No log messages are present.
    3. Exit Maintenance mode: halt

      The node displays the LOADER prompt.

    4. Boot the node from the LOADER prompt: boot_ontap
    5. Return the node to normal operation:
      If your node is in...
      • An HA pair, perform a give back: storage failover giveback -ofnode replacement_node_name
      • A two-node MetroCluster configuration, proceed to the next step. The MetroCluster healing and switchback procedures are done in the next task in the replacement process.
      • A stand-alone configuration, proceed to the next step. No action is required.

    You have completed system-level diagnostics.

    Resulted in some test failures
    Determine the cause of the problem:
    1. Exit Maintenance mode: halt

      After you issue the command, wait until the system stops at the LOADER prompt.

    2. Turn off or leave on the power supplies, depending on how many controller modules are in the chassis:
      • If you have two controller modules in the chassis, leave the power supplies turned on to provide power to the other controller module.
      • If you have one controller module in the chassis, turn off the power supplies and unplug them from the power sources.
    3. Verify that you have observed all the considerations identified for running system-level diagnostics, that cables are securely connected, and that hardware components are properly installed in the storage system.
    4. Boot the controller module you are servicing, interrupting the boot by pressing Ctrl-C when prompted to get to the Boot menu:
      • If you have two controller modules in the chassis, fully seat the controller module you are servicing in the chassis.

        The controller module boots up when fully seated.

      • If you have one controller module in the chassis, connect the power supplies, and then turn them on.
    5. Select Boot to maintenance mode from the menu.
    6. Exit Maintenance mode by entering the following command: halt

      After you issue the command, wait until the system stops at the LOADER prompt.

    7. Rerun the system-level diagnostic test.
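
The decisions in step 5 can be summarized in a short Python sketch. It is an illustration only, not a NetApp tool. The function names (log_is_cleared, return_to_operation_action, power_supply_action) and the configuration labels are hypothetical, and the sketch assumes you have already captured the console output as text by some means of your own. The command strings and the "SLDIAG: No log messages are present." response are taken from the steps above.

```python
# Illustrative sketch only. All function names and configuration labels
# are hypothetical; nothing here is part of a NetApp tool or API.

# Response the procedure says is displayed once the status log is cleared.
CLEARED_RESPONSE = "SLDIAG: No log messages are present."

# Console commands for the no-failure path, in the order the steps give them.
SUCCESS_PATH_COMMANDS = [
    "sldiag device clearstatus",  # clear the status logs
    "sldiag device status",       # verify that the log was cleared
    "halt",                       # exit Maintenance mode
    "boot_ontap",                 # boot the node from the LOADER prompt
]


def log_is_cleared(console_output: str) -> bool:
    """Return True if any captured line is exactly the cleared-log response."""
    return any(
        line.strip() == CLEARED_RESPONSE
        for line in console_output.splitlines()
    )


def return_to_operation_action(configuration: str, node_name: str) -> str:
    """Map a (hypothetical) configuration label to the action in step 5."""
    if configuration == "ha_pair":
        return f"storage failover giveback -ofnode {node_name}"
    if configuration in ("two_node_metrocluster", "stand_alone"):
        return "Proceed to the next step."
    raise ValueError(f"unknown configuration: {configuration!r}")


def power_supply_action(controller_modules_in_chassis: int) -> str:
    """Failure path: decide what to do with the power supplies."""
    if controller_modules_in_chassis == 2:
        # The other controller module still needs power.
        return "Leave the power supplies turned on."
    if controller_modules_in_chassis == 1:
        return "Turn off the power supplies and unplug them."
    raise ValueError("expected one or two controller modules")
```

For example, return_to_operation_action("ha_pair", "node1") yields the giveback command with node1 as the node name, while the other two configurations simply direct you to the next step. Raising an error for an unrecognized label is a design choice of the sketch, so that an unexpected configuration is not silently treated as requiring no action.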