Reinstalling NAS Bridge if a node fails

If a NAS Bridge node fails, you can reinstall the NAS Bridge software, import saved configuration settings, and resume operations.

Before you begin

About this task

You can recover a NAS Bridge node from most failures without the risk of losing data. The only type of disaster that puts data at risk is the unrecoverable failure of a cache device. If a cache device fails, any data that has not yet been uploaded to the StorageGRID Webscale system is lost. In addition, the following inconsistencies can occur in the namespace:
  • The mtime of a file might not be updated correctly. For example, new data is written to the file but the mtime stays the same.
  • The nlink count of a file might be artificially incremented by 1. In this case, the file can still be deleted from the namespace, but its data remains in StorageGRID Webscale and is not deleted.
  • The mtime of a directory might not be updated correctly. For example, a file or directory is created, deleted, or renamed in a directory, but the mtime of that directory is not updated.
  • The nlink count of a directory might be artificially incremented by 1. For example, a subdirectory is deleted, but the nlink count is not decremented. Or, NAS Bridge increments the nlink count, but the creation of the actual subdirectory fails.
    Note: This inconsistency has no impact on your ability to delete the directory when it becomes empty.
  • The namespace might have a hard link to a directory. For example, a hard link might result if the cache device failed during a directory rename, and the removal of the old name was not processed before the disaster recovery event. The old name will be deleted the first time the directory is listed or accessed.
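To spot the mtime and nlink inconsistencies listed above after a cache device failure, you can inspect directories from a client that has the NAS Bridge export mounted. The following Python sketch is illustrative only and is not part of the NAS Bridge software. The names DirReport, inspect_directory, and nlink_excess are hypothetical. The sketch also assumes that the exported file system follows the traditional Unix convention in which a directory's link count is 2 plus its number of subdirectories. Not all file systems follow this convention, so treat a nonzero result as a hint rather than proof.

```python
import os
from dataclasses import dataclass


@dataclass
class DirReport:
    path: str
    mtime: float   # last-modification time, in seconds since the epoch
    nlink: int     # link count reported by the file system
    subdirs: int   # subdirectories actually present in the directory


def nlink_excess(nlink: int, subdirs: int) -> int:
    """Return how far nlink exceeds the conventional value of 2 + subdirs.

    Assumption: the file system counts "." and the parent entry (2 links)
    plus one link for each subdirectory. A result of 1 matches the
    "incremented by 1" inconsistency; 0 means the count looks consistent.
    """
    return nlink - (2 + subdirs)


def inspect_directory(path: str) -> DirReport:
    """Collect mtime, nlink, and the real subdirectory count for a directory."""
    info = os.stat(path)
    subdirs = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # Count only real directories, not symbolic links to directories.
            if entry.is_dir(follow_symlinks=False):
                subdirs += 1
    return DirReport(path, info.st_mtime, info.st_nlink, subdirs)
```

For example, calling inspect_directory on a mounted directory and passing the resulting nlink and subdirs values to nlink_excess shows whether the directory's link count is higher than its contents suggest. Comparing the reported mtime with the time of the last known change can help identify a stale mtime.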

Steps

  1. Reinstall the NAS Bridge software by downloading the virtual machine image and deploying the virtual machine.

    You must deploy NAS Bridge with the same networking setup (the same static IP address or DHCP) as the one saved in the recovery package. If you want to change the networking setup, perform the disaster recovery procedure first, and then use vApp Options to change the NAS Bridge networking setup. See the instructions for changing the IP configuration for the default logical interface in Administering NAS Bridge.

  2. Create and attach new cache devices.
    You must create the same configuration of cache device disks for the replacement virtual machine as you did for the original virtual machine. You must attach the cache devices to the virtual machine in the same order as you attached the original cache devices. If desired, you can create cache devices of larger capacities than the originals.
    Example
    If you originally had three cache devices with capacities of 1 TB, 4 TB, and 4 TB, create three new cache devices: at least 1 TB for the first disk that you attach, and at least 4 TB for each of the second and third disks.
    Note: If the cache was not destroyed during the failure, you do not need to create new cache devices. Attach the existing cache devices to the replacement virtual machine in the same order as they were attached to the original virtual machine. Use vSphere to verify that the new virtual machine has a disk configuration that is identical to the disk configuration of the original virtual machine.
  3. Import the saved configuration settings.
    1. Click Maintenance at the top of the web page.
    2. Click Recovery Package.
    3. Depending on your browser, click Browse or Choose File, and select the saved recovery package.
    4. Click Upload.
      The recovery package is displayed in the table on the web page.
    5. Select the recovery package in the table, and click Import to update the NAS Bridge with the saved configuration data.
    Note: If you do not have access to a saved recovery package, you must re-enter all configuration data. Repeat the configuration steps you performed during installation as well as the configuration steps you followed in Administering NAS Bridge.
  4. When prompted, click Yes to import the file and reboot the node.
    When you import a configuration file, the node reboots. You might receive error messages if you access the node or Management API while the system is rebooting. To avoid these errors, do not attempt to access the system for approximately 15 minutes after starting the import process.
  5. After the node has rebooted, log in again.
  6. If you had previously added an Active Directory server, that server is now in the FAILED ADD state. To reconnect the server:
    1. Select Configuration > Active Directory Server.
    2. Select the Active Directory server in the table, and click Edit.
    3. Enter the admin password, and click Save.
    4. Wait for the Active Directory server to move to the READY state.
    5. Reboot the node again so that the SMB file system can mount correctly.
      If you reboot the node before the Active Directory server reaches the READY state, you might need to re-enter the password and reboot the node again.
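The cache device requirements in step 2 (the same number of devices, attached in the same order, each at least as large as the original) can be checked before you import the recovery package. The following Python sketch is illustrative only. The function name check_cache_layout is hypothetical, and the capacities are assumed to be entered by hand, in attachment order, from what vSphere shows for the original and replacement virtual machines.

```python
from typing import List, Sequence


def check_cache_layout(original_tb: Sequence[float],
                       replacement_tb: Sequence[float]) -> List[str]:
    """Compare cache device capacities (in TB), listed in attachment order.

    Returns a list of problems. An empty list means the replacement layout
    has the same number of devices and every device is at least as large
    as the original device in the same position.
    """
    problems: List[str] = []
    if len(original_tb) != len(replacement_tb):
        problems.append(
            f"expected {len(original_tb)} cache devices, "
            f"found {len(replacement_tb)}"
        )
        return problems
    for position, (old, new) in enumerate(zip(original_tb, replacement_tb), start=1):
        if new < old:
            problems.append(
                f"device {position}: {new} TB is smaller than "
                f"the original {old} TB"
            )
    return problems
```

With the capacities from the example in step 2, check_cache_layout([1, 4, 4], [1, 4, 4]) returns an empty list, and so does a larger layout such as [2, 4, 8]. Attaching the same disks in a different order, such as [4, 1, 4], reports a problem for the second device.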