Replicate apps between storage backends using SnapMirror technology

04/15/2024 Contributors

Using Astra Control, you can build business continuity for your applications with a low-RPO (Recovery Point Objective) and low-RTO (Recovery Time Objective) using asynchronous replication capabilities of NetApp SnapMirror technology. Once configured, this enables your applications to replicate data and application changes from one storage backend to another, on the same cluster or between different clusters.

For a comparison between backups/restores and replication, refer to Data protection concepts.

You can replicate apps in different scenarios, such as the following on-premises only, hybrid, and multi-cloud scenarios:

On-premises site A to on-premises site A
On-premises site A to on-premises site B
On-premises to cloud with Cloud Volumes ONTAP
Cloud with Cloud Volumes ONTAP to on-premises
Cloud with Cloud Volumes ONTAP to cloud (between different regions in the same cloud provider or to different cloud providers)

Astra Control can replicate apps across on-premises clusters, on-premises to cloud (using Cloud Volumes ONTAP) or between clouds (Cloud Volumes ONTAP to Cloud Volumes ONTAP).

You can simultaneously replicate a different app in the opposite direction. For example, Apps A, B, C can be replicated from Datacenter 1 to Datacenter 2; and Apps X, Y, Z can be replicated from Datacenter 2 to Datacenter 1.

Using Astra Control, you can do the following tasks related to replicating applications:

Set up a replication relationship
Bring a replicated app online on the destination cluster (failover)
Resync a failed over replication
Reverse application replication
Fail back applications to the original source cluster
Delete an application replication relationship

Replication prerequisites

Astra Control application replication requires that the following prerequisites be met before you begin:

ONTAP clusters

Astra Control Provisioner or Astra Trident: Astra Control Provisioner or Astra Trident must exist on both the source and destination Kubernetes clusters that utilize ONTAP as a backend. Astra Control supports replication with NetApp SnapMirror technology using storage classes backed by the following drivers:
- ontap-nas
- ontap-san
Licenses: ONTAP SnapMirror asynchronous licenses using the Data Protection bundle must be enabled on both the source and destination ONTAP clusters. Refer to SnapMirror licensing overview in ONTAP for more information.

Peering

Cluster and SVM: The ONTAP storage backends must be peered. Refer to Cluster and SVM peering overview for more information.

Ensure that the SVM names used in the replication relationship between two ONTAP clusters are unique.
Astra Control Provisioner or Astra Trident and SVM: The peered remote SVMs must be available to Astra Control Provisioner or Astra Trident on the destination cluster.

Astra Control Center

Deploy Astra Control Center in a third fault domain or secondary site for seamless disaster recovery.

Managed backends: You need to add and manage ONTAP storage backends in Astra Control Center to create a replication relationship.

Adding and managing ONTAP storage backends in Astra Control Center is optional if you have enabled Astra Control Provisioner.
Managed clusters: Add and manage the following clusters with Astra Control, ideally at different failure domains or sites:
- Source Kubernetes cluster
- Destination Kubernetes cluster
- Associated ONTAP clusters

User accounts: When you add an ONTAP storage backend to Astra Control Center, apply user credentials with the "admin" role. This role has access methods http and ontapi enabled on both ONTAP source and destination clusters. Refer to Manage User Accounts in ONTAP documentation for more information.

With Astra Control Provisioner functionality, you don't need to specifically define an "admin" role to manage clusters in Astra Control Center as these credentials aren't required by Astra Control Center.

Astra Control Center does not support NetApp SnapMirror replication for storage backends that are using the NVMe over TCP protocol.

Astra Trident / ONTAP configuration

Astra Control Center requires that you configure at least one storage backend that supports replication for both the source and destination clusters. If the source and destination clusters are the same, the destination application should use a different storage backend than the source application for the best resiliency.

Astra Control replication supports apps that use a single storage class. When you add an app to a namespace, be sure the app has the same storage class as other apps in the namespace. When you add a PVC to a replicated app, be sure the new PVC has the same storage class as other PVCs in the namespace.

Set up a replication relationship

Setting up a replication relationship involves the following:

Choosing how frequently you want Astra Control to take an app snapshot (which includes the app's Kubernetes resources as well as the volume snapshots for each of the app's volumes)
Choosing the replication schedule (included Kubernetes resources as well as persistent volume data)
Setting the time for the snapshot to be taken

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
Select Configure replication policy. Or, from the Application Protection box, select the Actions option and select Configure replication policy.

Enter or select the following information:

Destination cluster: Enter a destination cluster (this can be the same as the source cluster).
Destination storage class: Select or enter the storage class that uses the peered SVM on the destination ONTAP cluster. As a best practice, the destination storage class should point to a different storage backend than the source storage class.
Replication type: Asynchronous is currently the only replication type available.
Destination namespace: Enter new or existing destination namespaces for the destination cluster.
(Optional) Add additional namespaces by selecting Add namespace and choosing the namespace from the drop-down list.
Replication frequency: Set how often you want Astra Control to take a snapshot and replicate it to the destination.

Offset: Set the number of minutes from the top of the hour that you want Astra Control to take a snapshot. You might want to use an offset so that it doesn't coincide with other scheduled operations.

Offset backup and replication schedules to avoid schedule overlaps. For example, perform backups at the top of the hour every hour and schedule replication to start with a 5-minute offset and a 10-minute interval.

Select Next, review the summary, and select Save.

At first, the status displays "app-mirror" before the first schedule occurs.

Astra Control creates an application snapshot used for replication.
To see the application snapshot status, select the Applications > Snapshots tab.

The snapshot name uses the format of replication-schedule-<string>. Astra Control retains the last snapshot that was used for replication. Any older replication snapshots are deleted after successful completion of replication.

Result

This creates the replication relationship.

Astra Control completes the following actions as a result of establishing the relationship:

Creates a namespace on the destination (if it doesn't exist)
Creates a PVC on the destination namespace corresponding to the source app's PVCs.
Takes an initial app-consistent snapshot.
Establishes the SnapMirror relationship for persistent volumes using the initial snapshot.

The Data Protection page shows the replication relationship state and status:
<Health status> | <Relationship life cycle state>

For example:
Normal | Established

Learn more about replication states and status at the end of this topic.

Bring a replicated app online on the destination cluster (failover)

Using Astra Control, you can fail over replicated applications to a destination cluster. This procedure stops the replication relationship and brings the app online on the destination cluster. This procedure does not stop the app on the source cluster if it was operational.

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
From the Actions menu, select Fail over.
In the Fail over page, review the information and select Fail over.

Result

The following actions occur as a result of the failover procedure:

The destination app is started based on the latest replicated snapshot.
The source cluster and app (if operational) are not stopped and will continue to run.
The replication state changes to "Failing over" and then to "Failed over" when it has completed.
The source app's protection policy is copied to the destination app based on the schedules present on the source app at the time of the failover.
If the source app has one or more post-restore execution hooks enabled, those execution hooks are run for the destination app.
Astra Control shows the app both on the source and destination clusters and its respective health.

Resync a failed over replication

The resync operation re-establishes the replication relationship. You can choose the source of the relationship to retain the data on the source or destination cluster. This operation re-establishes the SnapMirror relationships to start the volume replication in the direction of choice.

The process stops the app on the new destination cluster before re-establishing replication.

During the resync process, the life cycle state shows as "Establishing."

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
From the Actions menu, select Resync.
In the Resync page, select either the source or destination app instance containing the data that you want to preserve.

Choose the resync source carefully, as the data on the destination will be overwritten.
Select Resync to continue.
Type "resync" to confirm.
Select Yes, resync to finish.

Result

The Replication page shows "Establishing" as the replication status.
Astra Control stops the application on the new destination cluster.
Astra Control re-establishes the persistent volume replication in the selected direction using SnapMirror resync.
The Replication page shows the updated relationship.

Reverse application replication

This is the planned operation to move the application to the destination storage backend while continuing to replicate back to the original source storage backend. Astra Control stops the source application and replicates the data to the destination before failing over to the destination app.

In this situation, you are swapping the source and destination.

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
From the Actions menu, select Reverse replication.
In the Reverse Replication page, review the information and select Reverse replication to continue.

Result

The following actions occur as a result of the reverse replication:

A snapshot is taken of the original source app's Kubernetes resources.
The original source app's pods are gracefully stopped by deleting the app's Kubernetes resources (leaving PVCs and PVs in place).
After the pods are shut down, snapshots of the app's volumes are taken and replicated.
The SnapMirror relationships are broken, making the destination volumes ready for read/write.
The app's Kubernetes resources are restored from the pre-shutdown snapshot, using the volume data replicated after the original source app was shut down.
Replication is re-established in the reverse direction.

Fail back applications to the original source cluster

Using Astra Control, you can achieve "fail back" after a failover operation by using the following sequence of operations. In this workflow to restore the original replication direction, Astra Control replicates (resyncs) any application changes back to the original source application before reversing the replication direction.

This process starts from a relationship that has completed a failover to a destination and involves the following steps:

Start with a failed over state.
Resync the relationship.
Reverse the replication.

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
From the Actions menu, select Resync.
For a fail back operation, choose the failed over app as the source of the resync operation (preserving any data written post failover).
Type "resync" to confirm.
Select Yes, resync to finish.
After the resync is complete, in the Data Protection > Replication tab, from the Actions menu, select Reverse replication.
In the Reverse Replication page, review the information and select Reverse replication.

Result

This combines the results from the "resync" and "reverse relationship" operations to bring the application online on the original source cluster with replication resumed to the original destination cluster.

Delete an application replication relationship

Deleting the relationship results in two separate apps with no relationship between them.

Steps

From the Astra Control left navigation, select Applications.
Select the Data Protection > Replication tab.
From the Application Protection box or in the relationship diagram, select Delete replication relationship.

Result

The following actions occur as a result of deleting a replication relationship:

If the relationship is established but the app has not yet been brought online on the destination cluster (failed over), Astra Control retains PVCs created during initialization, leaves an "empty" managed app on the destination cluster, and retains the destination app to keep any backups that might have been created.
If the app has been brought online on the destination cluster (failed over), Astra Control retains PVCs and destination apps. Source and destination apps are now treated as independent apps. The backup schedules remain on both apps but are not associated with each other.

Replication relationship health status and relationship life cycle states

Astra Control displays the health of the relationship and the states of the life cycle of the replication relationship.

Replication relationship health statuses

The following statuses indicate the health of the replication relationship:

Normal: The relationship is either establishing or has established, and the most recent snapshot transferred successfully.
Warning: The relationship is either failing over or has failed over (and therefore is no longer protecting the source app).
Critical
- The relationship is establishing or failed over, and the last reconcile attempt failed.
- The relationship is established, and the last attempt to reconcile the addition of a new PVC is failing.
- The relationship is established (so a successful snapshot has replicated, and failover is possible), but the most recent snapshot failed or failed to replicate.

Replication life cycle states

The following states states reflect the different stages of the replication life cycle:

Establishing: A new replication relationship is being created. Astra Control creates a namespace if needed, creates persistent volume claims (PVCs) on new volumes on the destination cluster, and creates SnapMirror relationships. This status can also indicate that the replication is resyncing or reversing replication.
Established: A replication relationship exists. Astra Control periodically checks that the PVCs are available, checks the replication relationship, periodically creates snapshots of the app, and identifies any new source PVCs in the app. If so, Astra Control creates the resources to include them in the replication.
Failing over: Astra Control breaks the SnapMirror relationships and restores the app's Kubernetes resources from the last successfully replicated app snapshot.
Failed over: Astra Control stops replicating from the source cluster, uses the most recent (successful) replicated app snapshot on the destination, and restores the Kubernetes resources.
Resyncing: Astra Control resyncs the new data on the resync source to the resync destination by using SnapMirror resync. This operation might overwrite some of the data on the destination based on the direction of the sync. Astra Control stops the app running on the destination namespace and removes the Kubernetes app. During the resyncing process, the status shows as "Establishing."
Reversing: The is the planned operation to move the application to the destination cluster while continuing to replicate back to the original source cluster. Astra Control stops the application on the source cluster, replicates the data to the destination before failing over the app to the destination cluster. During the reverse replication, the status shows as "Establishing."
Deleting:
- If the replication relationship was established but not failed over yet, Astra Control removes PVCs that were created during replication and deletes the destination managed app.
- If the replication failed over already, Astra Control retains the PVCs and destination app.

Replicate apps between storage backends using SnapMirror technology

Creating your file...

Replication prerequisites

Set up a replication relationship

Bring a replicated app online on the destination cluster (failover)

Resync a failed over replication

Reverse application replication

Fail back applications to the original source cluster

Delete an application replication relationship

Replication relationship health status and relationship life cycle states

Replication relationship health statuses

Replication life cycle states