How NVFAIL impacts access to NFS volumes or LUNs

The NVFAIL state is set when ONTAP detects NVRAM errors when booting, when a MetroCluster switchover operation occurs, or during an HA takeover operation if the NVFAIL option is set on the volume. If no errors are detected at startup, the file service is started normally. However, if NVRAM errors are detected or NVFAIL processing is enforced on a disaster switchover, ONTAP stops database instances from responding.
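The startup decision described above can be sketched as a small Python function. This is an illustrative model only, not ONTAP code: the function name `startup_outcome` and its boolean parameters are hypothetical, and the sketch assumes the NVFAIL option is already enabled on the volume.

```python
def startup_outcome(nvram_errors: bool,
                    nvfail_forced_on_switchover: bool = False) -> str:
    """Model the startup result for a volume with the NVFAIL option enabled.

    Hypothetical helper for illustration; it does not contact ONTAP.
    """
    if nvram_errors or nvfail_forced_on_switchover:
        # NVRAM errors were detected, or NVFAIL processing was enforced on a
        # disaster switchover: database instances are stopped from responding.
        return "nvfail"
    # No errors detected at startup: the file service starts normally.
    return "normal"
```

Under this model, only a clean startup with no forced NVFAIL processing yields `"normal"`; either of the other two conditions yields `"nvfail"`.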

When you enable the NVFAIL option, one of the processes described in the following table takes place during bootup:

If: ONTAP detects no NVRAM errors
Then: File service starts normally.

If: ONTAP detects NVRAM errors
Then:
  • ONTAP returns a stale file handle (ESTALE) error to NFS clients trying to access the database, causing the application to stop responding, crash, or shut down.

    ONTAP then sends an error message to the system console and log file.

  • When the application restarts, files are available to CIFS clients even if you have not verified that they are valid.

    For NFS clients, files remain inaccessible until you reset the in-nvfailed-state option on the affected volume.

If: One of the following parameters is used:
  • The dr-force-nvfail volume option is set.
  • The force-nvfail-all switchover command option is set.
Then: You can unset the dr-force-nvfail option after the switchover if you do not expect to force NVFAIL processing for possible future disaster switchover operations.

  For NFS clients, files remain inaccessible until you reset the in-nvfailed-state option on the affected volume.

  Note: Using the force-nvfail-all option causes the dr-force-nvfail option to be set on all of the DR volumes processed during the disaster switchover.

If: ONTAP detects NVRAM errors on a volume that contains LUNs
Then: LUNs in that volume are brought offline. The in-nvfailed-state option on the volume must be cleared, and the NVFAIL attribute on the LUNs must be cleared by bringing each LUN in the affected volume online.

  You can check the integrity of the LUNs and recover them from a Snapshot copy or backup as necessary. After all of the LUNs in the volume are recovered, the in-nvfailed-state option on the affected volume is cleared.
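The recovery actions named above (resetting the in-nvfailed-state option and bringing each LUN online) might be collected into a reviewable command list as in the following Python sketch. The helper `nvfail_recovery_commands` is hypothetical and only formats strings; it never contacts a cluster. The command lines assume the ONTAP CLI `volume modify` and `lun online` commands, so confirm the parameter names and the required privilege level against the command reference for your ONTAP release. The ordering (volume first, then LUNs) is also an assumption, and data integrity should be verified before any of these commands are run.

```python
from typing import Iterable, List


def nvfail_recovery_commands(vserver: str, volume: str,
                             lun_paths: Iterable[str] = ()) -> List[str]:
    """Build ONTAP CLI lines for an administrator to review and run by hand.

    Hypothetical helper: it only formats strings and does not execute them.
    """
    commands = [
        # Reset the in-nvfailed-state option so NFS clients regain access.
        f"volume modify -vserver {vserver} -volume {volume} "
        "-in-nvfailed-state false",
    ]
    # Bringing each LUN online clears the NVFAIL attribute on that LUN.
    for path in lun_paths:
        commands.append(f"lun online -vserver {vserver} -path {path}")
    return commands
```

For example, calling `nvfail_recovery_commands("vs1", "vol1", ["/vol/vol1/lun1"])` with the placeholder names `vs1` and `vol1` produces one `volume modify` line followed by one `lun online` line; a volume without LUNs yields only the `volume modify` line.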