Recover from automatic unplanned failover operations

08/20/2024 Contributors

An automatic unplanned failover (AUFO) operation occurs when the primary cluster is down or isolated. The ONTAP Mediator detects the failure and executes an automatic unplanned failover to the secondary cluster. The secondary cluster is converted to the primary and begins serving clients. This operation is performed only with assistance from the ONTAP Mediator.

After the automatic unplanned failover, it is important to rescan the host LUN I/O paths so that there is no loss of I/O paths.

Reestablish the protection relationship after an unplanned failover

You can reestablish the protection relationship using System Manager or the ONTAP CLI.

System Manager

Steps

From ONTAP 9.8 through 9.14.1, SnapMirror active sync is referred to as SnapMirror Business Continuity (SM-BC).

Navigate to Protection > Relationships and wait for the relationship state to show “InSync.”
To resume operations on the original source cluster, click and select Failover.

CLI

You can monitor the status of the automatic unplanned failover using the snapmirror failover show command.

For example:

ClusterB::> snapmirror failover show -instance
          Start Time: 9/23/2020 22:03:29
         Source Path: vs1:/cg/scg3
    Destination Path: vs3:/cg/dcg3
     Failover Status: completed
        Error Reason:
            End Time: 9/23/2020 22:03:30
Primary Data Cluster: cluster-2
Last Progress Update: -
       Failover Type: unplanned
  Error Reason codes: -
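
If you capture this output in a script, a check along the following lines can confirm that the failover finished. This is a minimal sketch, not part of ONTAP: the helper names parse_failover_instance and failover_completed are hypothetical, and it assumes the "Field: value" layout shown in the example above, with the text already collected into a string.

```python
# Sketch only: parses text shaped like the example output above.
# The function names are hypothetical helpers, not an ONTAP API.

SAMPLE_OUTPUT = """\
ClusterB::> snapmirror failover show -instance
          Start Time: 9/23/2020 22:03:29
         Source Path: vs1:/cg/scg3
    Destination Path: vs3:/cg/dcg3
     Failover Status: completed
        Error Reason:
            End Time: 9/23/2020 22:03:30
Primary Data Cluster: cluster-2
Last Progress Update: -
       Failover Type: unplanned
  Error Reason codes: -
"""


def parse_failover_instance(text: str) -> dict[str, str]:
    """Return the 'Field: value' lines of the output as a dictionary."""
    fields = {}
    for line in text.splitlines():
        # Skip the prompt line and anything without a field separator.
        if "::>" in line or ":" not in line:
            continue
        # Split on the first colon only, so values such as
        # "vs1:/cg/scg3" or timestamps keep their own colons.
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def failover_completed(fields: dict[str, str]) -> bool:
    """True when the Failover Status field reads 'completed'."""
    return fields.get("Failover Status") == "completed"


fields = parse_failover_instance(SAMPLE_OUTPUT)
print(fields["Failover Type"], failover_completed(fields))
```

The sketch reads only the fields that appear in the example. Any other status value is treated as "not completed" rather than interpreted.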

Refer to the EMS reference to learn about event messages and about corrective actions.

Resume protection in a fan-out configuration after failover

Beginning with ONTAP 9.15.1, SnapMirror active sync supports automatic reconfiguration in the fan-out leg after a failover event. For more information, see fan-out configurations.

If you're using ONTAP 9.14.1 or earlier and you experience a failover on the secondary cluster in the SnapMirror active sync relationship, the SnapMirror asynchronous destination becomes unhealthy. You must manually restore protection by deleting and recreating the relationship with the SnapMirror asynchronous endpoint.

Steps

Verify the failover has completed successfully:
snapmirror failover show
On the SnapMirror asynchronous endpoint, delete the fan-out endpoint:
snapmirror delete -destination-path destination_path
On the third site, create a SnapMirror asynchronous relationship between the new SnapMirror active sync primary volume and the async fan-out destination volume:
snapmirror create -source-path source_path -destination-path destination_path -policy MirrorAllSnapshots -schedule schedule
Resynchronize the relationship:
snapmirror resync -destination-path destination_path
Verify the relationship status and health:
snapmirror show
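
To keep the order of the steps above explicit in a runbook, the sequence can be generated from the values for your environment. The following is a minimal sketch under stated assumptions: fanout_recovery_commands is a hypothetical helper, the paths and schedule in the usage lines are placeholders rather than values from this procedure, and the code only prints the commands. It does not connect to a cluster or run anything.

```python
# Sketch only: assembles the command strings from the steps above, in order.
# fanout_recovery_commands is a hypothetical helper, not an ONTAP tool.

def fanout_recovery_commands(source_path, destination_path, schedule,
                             policy="MirrorAllSnapshots"):
    """Return (step description, command) pairs for the manual procedure."""
    return [
        ("Verify the failover has completed successfully",
         "snapmirror failover show"),
        ("On the SnapMirror asynchronous endpoint, delete the fan-out endpoint",
         f"snapmirror delete -destination-path {destination_path}"),
        ("On the third site, create the SnapMirror asynchronous relationship",
         f"snapmirror create -source-path {source_path} "
         f"-destination-path {destination_path} "
         f"-policy {policy} -schedule {schedule}"),
        ("Resynchronize the relationship",
         f"snapmirror resync -destination-path {destination_path}"),
        ("Verify the relationship status and health",
         "snapmirror show"),
    ]


# Placeholder values for illustration only; substitute your own paths and schedule.
steps = fanout_recovery_commands("source_path", "destination_path", "schedule")
for number, (description, command) in enumerate(steps, start=1):
    print(f"{number}. {description}:")
    print(f"   {command}")
```

Generating the list in one place makes it harder to skip the delete step before the create step, which is the ordering the procedure depends on.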
