Skip to main content
NetApp Solutions

Use VMware Site Recovery Manager for Disaster Recovery of NFS datastores

Contributors jpowellgit kevin-hoke

The utilization of ONTAP tools for VMware vSphere 10 and the Site Replication Adapter (SRA) in conjunction with VMware Site Recovery Manager (SRM) brings significant value to disaster recovery efforts. ONTAP tools 10 provide robust storage capabilities, including native high availability and scalability for the VASA Provider, supporting iSCSI and NFS vVols. This ensures data availability and simplifies the management of multiple VMware vCenter servers and ONTAP clusters. By using the SRA with VMware Site Recovery Manager, organizations can achieve seamless replication and failover of virtual machines and data between sites, enabling efficient disaster recovery processes. The combination of ONTAP tools and the SRA empowers businesses to protect critical workloads, minimize downtime, and maintain business continuity in the face of unforeseen events or disasters.

ONTAP tools 10 simplifies storage management and efficiency features, enhances availability, and reduces storage costs and operational overhead, whether you are using SAN or NAS. It uses best practices for provisioning datastores and optimizes ESXi host settings for NFS and block storage environments. For all these benefits, NetApp recommends this plug-in when using vSphere with systems running ONTAP software.

The SRA is used together with SRM to manage the replication of VM data between production and disaster recovery sites for traditional VMFS and NFS datastores and also for the nondisruptive testing of DR replicas. It helps automate the tasks of discovery, recovery, and reprotection.

In this scenario we will demonstrate how to deploy and use VMWare Site Recovery manager to protect datastores and run both a test and final failover to a secondary site. Reprotection and failback are also discussed.

Scenario Overview

This scenario covers the following high level steps:

  • Configure SRM with vCenter servers at primary and secondary sites.

  • Install the SRA adapter for ONTAP tools for VMware vSphere 10 and register with vCenters.

  • Create SnapMirror relationships between source and destination ONTAP storage systems

  • Configure Site Recovery for SRM.

  • Conduct test and final failover.

  • Discuss reprotection and failback.

Architecture

The following diagram shows a typical VMware Site Recovery architecture with ONTAP tools for VMware vSphere 10 configured in a 3-node high availability configuration.

Configure appliance
 

Prerequisites

This scenario requires the following components and configurations:

  • vSphere 8 clusters installed at both the primary and secondary locations with suitable networking for communications between environments.

  • ONTAP storage systems at both the primary and secondary locations, with physical data ports on ethernet switches dedicated to NFS storage traffic.

  • ONTAP tools for VMware vSphere 10 is installed and has both vCenter servers registered.

  • VMware Site Replication Manager appliances have been installed for the primary and secondary sites.

    • Inventory mappings (network, folder, resource, storage policy) have been configured for SRM.

NetApp recommends a redundant network designs for NFS, providing fault tolerance for storage systems, switches, networks adapters and host systems. It is common to deploy NFS with a single subnet or multiple subnets depending on the architectural requirements.

Refer to Best Practices For Running NFS with VMware vSphere for detailed information specific to VMware vSphere.

For network guidance on using ONTAP with VMware vSphere refer to the Network configuration - NFS section of the NetApp enterprise applications documentation.

For NetApp documentation on using ONTAP storage with VMware SRM refer to VMware Site Recovery Manager with ONTAP

Deployment Steps

The following sections outline the deployment steps to implement and test a VMware Site Recovery Manager configuration with ONTAP storage system.

Create SnapMirror relationship between ONTAP storage systems

A SnapMirror relationship must be established between the source and destination ONTAP storage systems, for the datastore volumes to be protected.

Refer to ONTAP documentation starting HERE for complete information on creating SnapMirror relationships for ONTAP volumes.

Step-by-step instructions are outline in the following document, located HERE. These steps outline how to create cluster peer and SVM peer relationships and then SnapMirror relationships for each volume. These steps can be performed in ONTAP System Manager or using the ONTAP CLI.

Configure the SRM appliance

Complete the following steps to configure the SRM appliance and SRA adapter.

Connect the SRM appliance for primary and secondary sites

The following steps must be completed for both the primary and secondary sites.

  1. In a web browser, navigate to https://<SRM_appliance_IP>:5480 and log in. Click on Configure Appliance to get started.

    Configure appliance

     

  2. On the Platform Services Controller page of the Configure Site Recovery Manager wizard, fill in the credentials of the vCenter server to which SRM will be registered. Click on Next to continue.

    platform services controller

     

  3. On the vCenter Server page, view the connected vServer and click on Next to continue.

  4. On the Name and extension page, fill in a name for the SRM site, an administrators email address, and the local host to be used by SRM. Click on Next to continue.

    Configure appliance

     

  5. On the Ready to complete page review the summary of changes

Configure SRA on the SRM appliance

Complete the following steps to configure the SRA on the SRM appliance:

  1. Download the SRA for ONTAP tools 10 at the NetApp support site and save the tar.gz file to a local folder.

  2. From the SRM management appliance click on Storage Replication Adapters in the left hand menu and then on New Adapter.

    Add new SRM adapter

     

  3. Follow the steps outlined on the ONTAP tools 10 documentation site at Configure SRA on the SRM appliance. Once complete, the SRA can communicate with SRA using the provided IP address and credentials of the vCenter server.

Configure Site Recovery for SRM

Complete the following steps to configure Site Pairing, create Protection Groups,

Configure Site Pairing for SRM

The following step is completed in the vCenter client of the primary site.

  1. In the vSphere client click on Site Recovery in the left hand menu. A new browser windows opens to the SRM management UI on the primary site.

    Site Recovery

     

  2. On the Site Recovery page, click on NEW SITE PAIR.

    Site Recovery

     

  3. On the Pair type page of the New Pair wizard, verify that the local vCenter server is selected and select the Pair type. Click on Next to continue.

    Pair type

     

  4. On the Peer vCenter page fill out the credentials of the vCenter at the secondary site and click on Find vCenter Instances. Verify the the vCenter instance has been discovered and click on Next to continue.

    Peer vCenter

     

  5. On the Services page, check the box next the proposed site pairing. Click on Next to continue.

    Services

     

  6. On the Ready to complete page, review the proposed configuration and then click on the Finish button to create the Site Pairing

  7. The new Site Pair and its summary can be viewed on the Summary page.

    Site pair summary

Add an Array Pair for SRM

The following step is completed in the Site Recovery interface of the primary site.

  1. In the Site Recovery interface navigate to Configure > Array Based Replication > Array Pairs in the left hand menu. Click on ADD to get started.

    Array pairs

     

  2. On the Storage replication adapter page of the Add Array Pair wizard, verify the SRA adapter is present for the primary site and click on Next to continue.

    Add array pair

     

  3. On the Local array manager page, enter a name for the array at the primary site, the FQDN of the storage system, the SVM IP addresses serving NFS, and optionally, the names of specific volumes to be discovered. Click on Next to continue.

    Local array manager

     

  4. On the Remote array manager fill out the same information as the last step for the ONTAP storage system at the secondary site.

    Remote array manager

     

  5. On the Array pairs page, select the array pairs to enable and click on Next to continue.

    Array pairs

     

  6. Review the information on the Ready to complete page and click on Finish to create the array pair.

Configure Protection Groups for SRM

The following step is completed in the Site Recovery interface of the primary site.

  1. In the Site Recovery interface click on the Protection Groups tab and then on New Protection Group to get started.

    Site Recovery

     

  2. On the Name and direction page of the New Protection Group wizard, provide a name for the group and choose the site direction for protection of the data.

    Name and direction

     

  3. On the Type page select the protection group type (datastore, VM, or vVol) and select the array pair. Click on Next to continue.

    Type

     

  4. On the Datastore groups page, select the datastores to include in the protection group. VMs currently residing on the datastore are displayed for each datastore selected. Click on Next to continue.

    Datastore groups

     

  5. On the Recovery plan page, optionally choose to add the protection group to a recovery plan. In this case, the recovery plan is not yet created so Do not add to recovery plan is selected. Click on Next to continue.

    Recovery plan

     

  6. On the Ready to complete page, review the new protection group parameters and click on Finish to create the group.

    Recovery plan

Configure Recovery Plan for SRM

The following step is completed in the Site Recovery interface of the primary site.

  1. In the Site Recovery interface click on the Recovery plan tab and then on New Recovery Plan to get started.

    New recovery plan

     

  2. On the Name and direction page of the Create Recovery Plan wizard, provide a name for the recovery plan and choose the direction between source and destination sites. Click on Next to continue.

    Name and direction

     

  3. On the Protection groups page, select the previously created protection groups to include in the recovery plan. Click on Next to continue.

    Protection groups

     

  4. On the Test Networks configure specific networks that will be used during the test of the plan. If no mapping exists or if no network is selected, an isolated test network will be created. Click on Next to continue.

    Test networks

     

  5. On the Ready to complete page, review the chosen parameters and then click on Finish to create the recovery plan.

Disaster recovery operations with SRM

In this section various functions of using disaster recovery with SRM will be covered including, testing failover, performing failover, performing reprotection and failback.

Refer to Operational best practices for more information on using ONTAP storage with SRM disaster recovery operations.

Testing failover with SRM

The following step is completed in the Site Recovery interface.

  1. In the Site Recovery interface click on the Recovery plan tab and then select a recovery plan. Click on the Test button to begin testing failover to the secondary site.

    Test failover

     

  2. You can view the progress of the test from the Site Recovery task pane as well the vCenter task pane.

    test failover in task pane

     

  3. SRM sends commands via the SRA to the secondary ONTAP storage system. A FlexClone of the most recent snapshot is created and mounted at the secondary vSphere cluster. The newly mounted datastore can be viewed in the storage inventory.

    Newly mounted datastore

     

  4. Once the test has completed, click on Cleanup to unmount the datastore and revert back to the original environment.

    Newly mounted datastore

Run Recovery Plan with SRM

Perform a full recovery and failover to the secondary site.

  1. In the Site Recovery interface click on the Recovery plan tab and then select a recovery plan. Click on the Run button to begin failover to the secondary site.

    Run failover

     

  2. Once the failover is complete you can see the datastore mounted and the VMs registered at the secondary site.

    Filover complete

Additional functions are possible in SRM once a failover has completed.

Reprotection: Once the recovery process is complete, the previously designated recovery site assumes the role of the new production site. However, it's important to note that the SnapMirror replication is disrupted during the recovery operation, leaving the new production site vulnerable to future disasters. To ensure continued protection, it is recommended to establish new protection for the new production site by replicating it to another site. In cases where the original production site remains functional, the VMware administrator can repurpose it as a new recovery site, effectively reversing the direction of protection. It's crucial to highlight that re-protection is only feasible in non-catastrophic failures, necessitating the eventual recoverability of the original vCenter Servers, ESXi servers, SRM servers, and their respective databases. If these components are unavailable, the creation of a new protection group and a new recovery plan becomes necessary.

Failback: A failback operation is a reverse failover, returning operations to the original site. It's crucial to ensure that the original site has regained functionality before initiating the failback process. To ensure a smooth failback, it's recommended to conduct a test failover after completing the reprotection process and before executing the final failback. This practice serves as a verification step, confirming that the systems at the original site are fully capable of handling the operation. By following this approach, you can minimize risks and ensure a more reliable transition back to the original production environment.

Additional information

For NetApp documentation on using ONTAP storage with VMware SRM refer to VMware Site Recovery Manager with ONTAP

For information on configuring ONTAP storage systems refer to the ONTAP 9 Documentation center.

For information on configuring VCF refer to VMware Cloud Foundation Documentation.