Monitoring and protecting the file system consistency using NVFAIL
The -nvfail parameter of the volume modify command enables ONTAP to detect nonvolatile RAM (NVRAM) inconsistencies when the system is booting or after a switchover operation. It also warns you and protects the system against data access and modification until the volume can be manually recovered.
If ONTAP detects any problems, database or file system instances stop responding or shut down. ONTAP then sends error messages to the console to alert you to check the state of the database or file system. You can enable NVFAIL to warn database administrators of NVRAM inconsistencies among clustered nodes that can compromise database validity.
After NVRAM data loss during failover or boot recovery, NFS clients cannot access data from any of the nodes until the NVFAIL state is cleared. CIFS clients are unaffected.
How NVFAIL impacts access to NFS volumes or LUNs
The NVFAIL state is set when ONTAP detects NVRAM errors at boot, when a MetroCluster switchover operation occurs, or during an HA takeover operation, if the NVFAIL option is set on the volume. If no errors are detected at startup, the file service starts normally. However, if NVRAM errors are detected or NVFAIL processing is enforced on a disaster switchover, ONTAP stops database instances from responding.
When you enable the NVFAIL option, one of the processes described in the following table takes place during bootup:
| If… | Then… |
|---|---|
| ONTAP detects no NVRAM errors | File service starts normally. |
| ONTAP detects NVRAM errors | |
| If one of the following parameters is used: | You can unset the |
| ONTAP detects NVRAM errors on a volume that contains LUNs | LUNs in that volume are brought offline. The |
Commands for monitoring data loss events
If you enable the NVFAIL option, you receive notification when a system crash caused by NVRAM inconsistencies or a MetroCluster switchover occurs.
By default, the NVFAIL parameter is not enabled.
| If you want to… | Use this command… |
|---|---|
| Create a new volume with NVFAIL enabled | |
| Enable NVFAIL on an existing volume | Note: You set the |
| Display whether NVFAIL is currently enabled for a specified volume | Note: You set the |
See the man page for each command for more information.
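As a sketch of the three tasks in the table above, the following Python helpers assemble the corresponding ONTAP CLI command strings. It assumes the usual CLI forms: volume create and volume modify with -nvfail on, and volume show with -fields nvfail. The helper names and the SVM, volume, aggregate, and size values are hypothetical, and nothing is executed against a cluster. Confirm the exact options in the man pages for your ONTAP release.

```python
# Sketch only: these helpers build ONTAP CLI strings; nothing is sent to a cluster.
# The SVM, volume, and aggregate names used below are hypothetical.


def create_volume_with_nvfail(vserver: str, volume: str, aggregate: str, size: str) -> str:
    """Command to create a new volume with NVFAIL enabled."""
    return (
        f"volume create -vserver {vserver} -volume {volume} "
        f"-aggregate {aggregate} -size {size} -nvfail on"
    )


def enable_nvfail(vserver: str, volume: str) -> str:
    """Command to enable NVFAIL on an existing volume."""
    return f"volume modify -vserver {vserver} -volume {volume} -nvfail on"


def show_nvfail(vserver: str, volume: str) -> str:
    """Command to display the NVFAIL setting of one volume."""
    return f"volume show -vserver {vserver} -volume {volume} -fields nvfail"


if __name__ == "__main__":
    print(create_volume_with_nvfail("vs1", "dbvol", "aggr1", "100g"))
    print(enable_nvfail("vs1", "dbvol"))
    print(show_nvfail("vs1", "dbvol"))
```

Because NVFAIL is not enabled by default, a check with the show command before and after the modify command is a simple way to confirm that the setting took effect.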
Accessing volumes in NVFAIL state after a switchover
After a switchover, you must clear the NVFAIL state by resetting the -in-nvfailed-state parameter of the volume modify command so that clients can access data again.
The database or file system must not be running or trying to access the affected volume.
Setting the -in-nvfailed-state parameter requires advanced-level privilege.
- Recover the volume by using the volume modify command with the -in-nvfailed-state parameter set to false.
For instructions about examining database file validity, see the documentation for your specific database software.
If your database uses LUNs, review the steps to make the LUNs accessible to the host after an NVRAM failure.
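The recovery step above can be sketched as a short command sequence. The helper below is a minimal illustration, not an official procedure: it assumes that advanced privilege is entered with set -privilege advanced and left with set -privilege admin, and the function name and the SVM and volume names are hypothetical. It only returns the command strings in order.

```python
from typing import List


def clear_nvfail_state(vserver: str, volume: str) -> List[str]:
    """Ordered ONTAP CLI commands that clear the NVFAIL state of one volume.

    Assumption: the privilege level is switched with the `set -privilege` command.
    Run these only after the database or file system has stopped using the volume.
    """
    return [
        "set -privilege advanced",
        f"volume modify -vserver {vserver} -volume {volume} -in-nvfailed-state false",
        "set -privilege admin",
    ]


if __name__ == "__main__":
    # Hypothetical SVM and volume names.
    for command in clear_nvfail_state("vs1", "dbvol"):
        print(command)
```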
Recovering LUNs in NVFAIL states after switchover
After a switchover, the host no longer has access to data on the LUNs that are in NVFAIL states. You must perform a number of actions before the database has access to the LUNs.
The database must not be running.
- Clear the NVFAIL state on the affected volume that hosts the LUNs by resetting the -in-nvfailed-state parameter of the volume modify command.
- Bring the affected LUNs online.
- Examine the LUNs for any data inconsistencies and resolve them.
  This might involve host-based recovery or recovery done on the storage controller using SnapRestore.
- Bring the database application online after recovering the LUNs.
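The ordering of the steps above matters: the volume state is cleared first, then each LUN is brought online, and only then are the data checked and the database restarted. The following sketch assembles that order as a plan. It assumes the lun online command with -vserver and -path for the second step and set -privilege for the privilege change; the function name, SVM, volume, and LUN paths are hypothetical. The two manual steps appear as comment lines because they depend on the host and the database software.

```python
from typing import List


def lun_recovery_plan(vserver: str, volume: str, lun_paths: List[str]) -> List[str]:
    """Ordered plan for recovering LUNs in a volume that is in the NVFAIL state.

    Lines starting with '#' are manual steps, not CLI commands.
    The database must not be running while this plan is carried out.
    """
    # Step 1: clear the NVFAIL state on the volume (advanced privilege required).
    plan = [
        "set -privilege advanced",
        f"volume modify -vserver {vserver} -volume {volume} -in-nvfailed-state false",
        "set -privilege admin",
    ]
    # Step 2: bring every affected LUN online.
    plan += [f"lun online -vserver {vserver} -path {path}" for path in lun_paths]
    # Steps 3 and 4: manual verification, then restart of the application.
    plan.append("# manual: examine the LUNs for data inconsistencies and resolve them")
    plan.append("# manual: bring the database application online")
    return plan


if __name__ == "__main__":
    # Hypothetical names and paths.
    for line in lun_recovery_plan("vs1", "dbvol", ["/vol/dbvol/lun0", "/vol/dbvol/lun1"]):
        print(line)
```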