Verify cluster and storage health after an ONTAP revert
After you revert an ONTAP cluster, you should verify that the nodes are healthy and eligible to participate in the cluster, and that the cluster is in quorum. You should also verify the status of your disks, aggregates, and volumes.
Verify cluster health
- Verify that the nodes in the cluster are online and are eligible to participate in the cluster:
In this example, the cluster is healthy and all nodes are eligible to participate in the cluster.
    cluster1::> cluster show
    Node                  Health  Eligibility
    --------------------- ------- ------------
    node0                 true    true
    node1                 true    true
If any node is unhealthy or ineligible, check EMS logs for errors and take corrective action.
- Set the privilege level to advanced:
    set -privilege advanced
Enter y to continue.
- Verify the configuration details for each RDB process.
  - The relational database epoch and database epochs should match for each node.
  - The per-ring quorum master should be the same for all nodes.
    Note that each ring might have a different quorum master.
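To display this RDB process…    Enter this command…
Management application          cluster ring show -unitname mgmt
Volume location database        cluster ring show -unitname vldb
Virtual-Interface manager       cluster ring show -unitname vifmgr
SAN management daemon           cluster ring show -unitname bcomd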
This example shows the volume location database process:
    cluster1::*> cluster ring show -unitname vldb
    Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
    --------- -------- -------- -------- -------- --------- ---------
    node0     vldb     154      154      14847    node0     master
    node1     vldb     154      154      14847    node0     secondary
    node2     vldb     154      154      14847    node0     secondary
    node3     vldb     154      154      14847    node0     secondary
    4 entries were displayed.
- Return to the admin privilege level:
    set -privilege admin
- If you are operating in a SAN environment, verify that each node is in a SAN quorum:
The most recent scsiblade event message for each node should indicate that the scsi-blade is in quorum.
    cluster1::*> event log show -severity informational -message-name scsiblade.*
    Time            Node       Severity       Event
    --------------- ---------- -------------- ---------------------------
    MM/DD/YYYY TIME node0      INFORMATIONAL  scsiblade.in.quorum: The scsi-blade ...
    MM/DD/YYYY TIME node1      INFORMATIONAL  scsiblade.in.quorum: The scsi-blade ...
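The two RDB ring checks described above (matching epochs on each node, and one quorum master per ring) can be sketched as a small script that reads captured `cluster ring show` output. This is only an illustration, not a NetApp tool: the function names `parse_ring_rows` and `find_ring_problems` are hypothetical, the parser assumes the seven-column layout shown in the vldb example, and it reads "epochs should match" as the Epoch and DB Epoch columns agreeing on each row.

```python
from collections import defaultdict

# Sample text copied from the vldb example; real output may differ in spacing.
SAMPLE = """\
cluster1::*> cluster ring show -unitname vldb
Node      UnitName Epoch    DB Epoch DB Trnxs Master    Online
--------- -------- -------- -------- -------- --------- ---------
node0     vldb     154      154      14847    node0     master
node1     vldb     154      154      14847    node0     secondary
node2     vldb     154      154      14847    node0     secondary
node3     vldb     154      154      14847    node0     secondary
4 entries were displayed.
"""


def parse_ring_rows(text):
    """Return one dict per data row, skipping prompt, header, and footer lines.

    Assumption: a data row has exactly seven whitespace-separated fields
    and numeric Epoch and DB Epoch columns, as in the example above.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 7 and fields[2].isdigit() and fields[3].isdigit():
            node, unit, epoch, db_epoch, _trnxs, master, _online = fields
            rows.append({
                "node": node,
                "unit": unit,
                "epoch": int(epoch),
                "db_epoch": int(db_epoch),
                "master": master,
            })
    return rows


def find_ring_problems(rows):
    """Return human-readable descriptions of any failed ring checks."""
    problems = []
    masters = defaultdict(set)
    for row in rows:
        # Check 1: Epoch and DB Epoch agree on this node.
        if row["epoch"] != row["db_epoch"]:
            problems.append(
                f"{row['node']}/{row['unit']}: epoch {row['epoch']} "
                f"!= DB epoch {row['db_epoch']}"
            )
        masters[row["unit"]].add(row["master"])
    # Check 2: within each ring, every node reports the same master.
    for unit, names in masters.items():
        if len(names) > 1:
            problems.append(f"{unit}: nodes disagree on master: {sorted(names)}")
    return problems


if __name__ == "__main__":
    for problem in find_ring_problems(parse_ring_rows(SAMPLE)) or ["no problems found"]:
        print(problem)
```

Masters are grouped per unit name, so output from several rings can be concatenated and checked together without one ring's master being compared against another's.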
Verify storage health
After you revert or downgrade a cluster, you should verify the status of your disks, aggregates, and volumes.
- Verify disk status:
  To check for broken disks, do this:
  - Display any broken disks:
        storage disk show -state broken
  - Remove or replace any broken disks.
  To check for disks undergoing maintenance or reconstruction, do this:
  - Display any disks in maintenance, pending, or reconstructing states:
        storage disk show -state maintenance|pending|reconstructing
  - Wait for the maintenance or reconstruction operation to finish before proceeding.
- Verify that all aggregates are online by displaying the state of physical and logical storage, including storage aggregates:
This command displays the aggregates that are not online. All aggregates must be online before and after performing a major upgrade or reversion.
    cluster1::> storage aggregate show -state !online
    There are no entries matching your query.
- Verify that all volumes are online by displaying any volumes that are not online:
All volumes must be online before and after performing a major upgrade or reversion.
    cluster1::> volume show -state !online
    There are no entries matching your query.
- Verify that there are no inconsistent volumes:
    volume show -is-inconsistent true
See the Knowledge Base article "Volume Showing WAFL Inconsistent" for how to address any inconsistent volumes.
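The storage checks above share one pattern: each query asks for objects in an unwanted state, and a healthy cluster answers "There are no entries matching your query." A minimal sketch of that pattern follows, assuming the command output has already been captured as text. The helper names `query_is_empty` and `failed_checks` are hypothetical, and the script does not connect to a cluster.

```python
# Message printed by the CLI examples above when a query matches nothing.
NO_MATCH = "There are no entries matching your query."


def query_is_empty(cli_output):
    """True when the captured output contains the 'no entries' message."""
    return NO_MATCH in cli_output


def failed_checks(outputs):
    """Return the names of the checks whose output did not report 'no entries'.

    Empty or unexpected output counts as a failed check, which is the
    conservative choice: it prompts a manual look instead of a false pass.
    """
    return [name for name, text in outputs.items() if not query_is_empty(text)]


if __name__ == "__main__":
    # Captured text taken from the aggregate and volume examples above.
    captured = {
        "aggregates not online": (
            "cluster1::> storage aggregate show -state !online\n"
            "There are no entries matching your query.\n"
        ),
        "volumes not online": (
            "cluster1::> volume show -state !online\n"
            "There are no entries matching your query.\n"
        ),
    }
    remaining = failed_checks(captured)
    print("all clear" if not remaining else f"investigate: {remaining}")
```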
Verify client access (SMB and NFS)
For the configured protocols, test access from SMB and NFS clients to verify that the cluster is accessible.