NetApp Data Classification

Scan Azure NetApp Files volumes with NetApp Data Classification

Contributors netapp-ahibbard

Complete a few steps to get started with NetApp Data Classification for Azure NetApp Files.

Discover the Azure NetApp Files system that you want to scan

If the Azure NetApp Files system you want to scan is not already listed in the NetApp Console as a system, add it on the Systems page.

Deploy the Data Classification instance

Deploy Data Classification if there isn't already an instance deployed.

Data Classification must be deployed in the cloud when scanning Azure NetApp Files volumes, and it must be deployed in the same region as the volumes you wish to scan.

Note: Deploying Data Classification in an on-premises location is not currently supported when scanning Azure NetApp Files volumes.

Enable Data Classification in your systems

You can enable Data Classification on your Azure NetApp Files volumes.

  1. From the Data Classification menu, select Configuration.

  2. Identify the system you want to enable scanning on.

  3. Decide how you want to scan the volumes. For details, see "What's the difference between Mapping and Classification scans?"

  4. To enable scanning on all volumes in the system, select Activate scan, then select Map only all volumes or Full scan all volumes.

    To manage scanning on only selected volumes, select Activate scan, then Manage. In the system overview, identify the volumes you want to scan, then set the scan type for each one: full or mapping-only.

Result

Data Classification starts scanning the volumes you selected in the system. Results are available in the Compliance dashboard as soon as Data Classification finishes the initial scans. The time that it takes depends on the amount of data: it could be a few minutes or hours. You can track the progress of the initial scan by navigating to the system overview page in the Configuration menu. Data Classification displays a progress bar for each scan. You can hover over the progress bar to see the number of files scanned relative to the total number of files in the volume.

  • By default, if Data Classification doesn't have write attributes permissions in CIFS, or write permissions in NFS, the system won't scan the files in your volumes because Data Classification can't revert the "last access time" to the original timestamp. If it doesn't matter to you whether the last access time is reset, select "Or select scanning type for each volume". The resulting page has a setting you can enable so that Data Classification scans the volumes regardless of permissions.

  • Data Classification scans only one file share under a volume. If you have multiple shares in your volumes, you'll need to scan those other shares separately as a shares group. Learn about this Data Classification limitation.
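To illustrate why write permission matters in the first note above: reading a file can update its last access time, and putting the original value back requires writing the file's timestamp attributes. The following is a minimal sketch on a local file using Python's standard library. It is not how Data Classification is implemented, and the function name `read_preserving_atime` is hypothetical.

```python
import os


def read_preserving_atime(path: str) -> bytes:
    """Read a file, then restore its original access and modification times."""
    before = os.stat(path)  # capture timestamps before touching the content
    with open(path, "rb") as handle:
        data = handle.read()  # this read may update the last access time
    # Restoring timestamps is an attribute write. Without permission to write
    # attributes, this call raises PermissionError and the access time stays changed.
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
    return data
```

If the `os.utime` call is not permitted, the only choices are to skip the file or to accept a changed access time, which mirrors the choice described in the note.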

Verify that Data Classification has access to volumes

Make sure that Data Classification can access volumes by checking your networking, security groups, and export policies. You need to provide Data Classification with CIFS credentials so it can access CIFS volumes.

Note: For Azure NetApp Files, Data Classification can only scan volumes in the same region as the Console.

Checklist
  • Make sure that there's a network connection between the Data Classification instance and each network that includes volumes for Azure NetApp Files.

  • Ensure the following ports are open to the Data Classification instance:

    • For NFS – ports 111 and 2049.

    • For CIFS – ports 139 and 445.

  • Ensure the NFS volume export policies include the IP address of the Data Classification instance so it can access the data on each volume.
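The checklist above can be spot-checked with a short script run from a host on the same network as the Data Classification instance. This is a sketch under stated assumptions: the port numbers come from the checklist, but the names `REQUIRED_PORTS`, `tcp_port_open`, `check_volume_endpoint`, and `export_policy_allows` are hypothetical, only TCP reachability is tested, and export policy entries are assumed to be plain IP addresses or CIDR ranges (other entry types, such as host names, are skipped).

```python
import ipaddress
import socket

# Ports taken from the checklist; the dictionary itself is illustrative.
REQUIRED_PORTS = {"NFS": [111, 2049], "CIFS": [139, 445]}


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_volume_endpoint(host: str, protocol: str) -> dict[int, bool]:
    """Probe every required TCP port for the given protocol ("NFS" or "CIFS")."""
    return {port: tcp_port_open(host, port) for port in REQUIRED_PORTS[protocol]}


def export_policy_allows(instance_ip: str, allowed_clients: list[str]) -> bool:
    """Return True if instance_ip matches any IP or CIDR entry in allowed_clients."""
    ip = ipaddress.ip_address(instance_ip)
    for entry in allowed_clients:
        try:
            if ip in ipaddress.ip_network(entry, strict=False):
                return True
        except ValueError:
            # Assumption: entries that are not IP/CIDR (e.g. host names) are skipped.
            continue
    return False
```

For example, `check_volume_endpoint("10.0.1.4", "NFS")` returns a mapping from each NFS port to whether it was reachable, where `10.0.1.4` stands in for a volume's mount target address. A `False` result points to a network, firewall, or security group issue rather than a credentials problem.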

Steps
  1. From the Data Classification menu, select Configuration.

    1. If you're using CIFS (SMB), ensure the Active Directory credentials are correct. For each system, select Edit CIFS Credentials, then enter the user name and password that Data Classification needs to access CIFS volumes on the system.

      The credentials can be read-only; providing admin credentials ensures that Data Classification can read any data that requires elevated permissions. The credentials are stored on the Data Classification instance.

      If you want to make sure your files' "last accessed times" are unchanged by Data Classification scans, it's recommended that the user has Write Attributes permissions in CIFS or write permissions in NFS. If possible, configure the Active Directory user as part of a parent group in the organization that has permissions to all files.

      After you enter the credentials, you should see a message that all CIFS volumes were authenticated successfully.

  2. On the Configuration page, select View Details to review the status for each CIFS and NFS volume. If necessary, correct any errors such as network connectivity issues.

Enable or disable scans on volumes

You can start or stop scans on any system at any time from the Configuration page. You can also switch scans from map-only scans to mapping and classification scans, and vice versa. It's recommended that you scan all volumes in a system.

The switch at the top of the page for Scan without write permissions is disabled by default. This means that if Data Classification doesn't have write attributes permissions in CIFS or write permissions in NFS, the system won't scan the files because Data Classification can't revert the "last access time" to the original timestamp. If it doesn't matter to you whether the last access time is reset, turn the switch ON and all files are scanned regardless of the permissions.
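The effect of this switch can be summarized as a small decision rule. The sketch below only restates the paragraph above in code. The names `ScanDecision` and `decide_scan` are hypothetical and do not correspond to any Data Classification API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanDecision:
    scans_files: bool                 # whether the files are scanned at all
    preserves_last_access_time: bool  # whether the original timestamp is kept


def decide_scan(has_write_permission: bool,
                scan_without_write_permissions: bool = False) -> ScanDecision:
    """Model the documented behaviour of the 'Scan without write permissions' switch."""
    if has_write_permission:
        # The scanner can revert the last access time after reading.
        return ScanDecision(scans_files=True, preserves_last_access_time=True)
    if scan_without_write_permissions:
        # Switch ON: files are scanned, but the last access time is reset.
        return ScanDecision(scans_files=True, preserves_last_access_time=False)
    # Default (switch OFF): files are skipped, so timestamps are untouched.
    return ScanDecision(scans_files=False, preserves_last_access_time=True)
```

Only the combination of missing write permission and the switch turned ON changes the last access time. Every other combination leaves timestamps as they were.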

Note: New volumes added to the system are automatically scanned only when you have enabled scanning for all volumes. If you've only enabled scanning on certain volumes, you must manually enable scanning on the newly added volume.

A screenshot of the Configuration page where you can enable or disable scanning of individual volumes.

Steps
  1. From the Data Classification menu, select Configuration.

  2. Identify the system you want to scan. Select Activate scan. In the dropdown, choose Scan all volumes: full scan, Scan all volumes: mapping only, or Manage scans to open the system menu and configure scanning on specific volumes.

    To enable or disable scans for individual volumes, find the volumes in the list. In the scan type column, select Map only or Full scan.

Result

When you enable scanning, Data Classification starts scanning the volumes you selected in the system. Results start to appear in the Compliance dashboard as soon as Data Classification starts the scan. Scan completion time depends on the amount of data, ranging from minutes to hours.