Changing the scan settings for your repositories

Contributors netapp-tonacki

You can manage the data that is being scanned in each of your working environments and data sources. You can make changes on a "repository" basis, meaning that you can change the settings for each volume, bucket, share, schema, user, and so on, depending on the type of data source you are scanning.

You can change whether a repository is scanned at all, and whether Data Sense performs a mapping scan or a mapping & classification scan. You can also pause and resume scanning, for example, if you need to stop scanning a volume for a certain period of time.

Viewing the scan status for your repositories

You can view the individual repositories that Data Sense is scanning (volumes, buckets, etc.) for each working environment and data source. You can also see how many have been "Mapped" and how many have been "Classified". Classification takes longer because the full AI identification is performed on all data.

Steps
  1. From the Configuration tab, click the Configuration button for the working environment.

    A screenshot showing how to click the Configuration button for a working environment.

  2. In the Scan Configuration page you can view the scan settings for all repositories.

    A screenshot showing whether your buckets are being scanned, and the current scan status.

You can hover your cursor over the chart in the Scan Status column to see the number of files that remain to be mapped or classified in each repository (bucket in this example).

Changing the type of scanning for a repository

You can start or stop mapping-only scans, or mapping and classification scans, in a working environment at any time from the Configuration page. You can also change from mapping-only scans to mapping and classification scans, and vice-versa.

Tip Databases can’t be set to mapping-only scans. Database scanning can be either Off or On, where On is equivalent to Map & Classify.

Steps
  1. From the Configuration tab, click the Configuration button for the working environment.

    A screenshot showing how to click the Configuration button for a working environment.

  2. In the Scan Configuration page you can change any of the repositories (buckets in this example) to perform Map or Map & Classify scans, or select Off to stop scanning for a particular bucket.

    A screenshot showing how to select the type of scanning for a bucket.

Certain types of working environments enable you to change the type of scanning globally for all repositories using a button bar at the top of the page. This is valid for Cloud Volumes ONTAP, on-premises ONTAP, Azure NetApp Files, and Amazon FSx for ONTAP systems.

The example below shows this button bar for an Azure NetApp Files system.

A screenshot showing how to configure the same scan setting for all volumes in a working environment.
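The scan-type rules above can be summarized as a small model. This is only an illustrative sketch: `ScanMode`, `allowed_modes`, and `set_scan_mode` are hypothetical names invented for this example, not part of any Data Sense API, and the repository type strings are assumptions.

```python
from enum import Enum
from typing import Set


class ScanMode(Enum):
    # The three settings described in the text (labels as shown in the UI).
    OFF = "Off"
    MAP = "Map"
    MAP_AND_CLASSIFY = "Map & Classify"


def allowed_modes(repository_type: str) -> Set[ScanMode]:
    """Return the scan modes the text describes for a repository type.

    Hypothetical helper: the type name "database" is an assumption.
    """
    if repository_type == "database":
        # Databases are either Off or On, and On is equivalent to Map & Classify.
        return {ScanMode.OFF, ScanMode.MAP_AND_CLASSIFY}
    return {ScanMode.OFF, ScanMode.MAP, ScanMode.MAP_AND_CLASSIFY}


def set_scan_mode(repository_type: str, mode: ScanMode) -> ScanMode:
    """Validate a requested mode and raise ValueError if it is not allowed."""
    if mode not in allowed_modes(repository_type):
        raise ValueError(
            f"{repository_type} repositories cannot use {mode.value} scans"
        )
    return mode
```

For example, `set_scan_mode("bucket", ScanMode.MAP)` is accepted, while `set_scan_mode("database", ScanMode.MAP)` raises `ValueError`, mirroring the rule that databases can't be set to mapping-only scans.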

Pausing and resuming scanning for a repository

You can "pause" scanning on a repository if you want to temporarily stop scanning certain content. This is not the same as turning scanning "off". When scanning is turned off, all the indexing and information about that repository are removed from the system. Pausing scanning means that Data Sense won’t perform any future scans for changes or additions to the repository, but all the current results are still displayed in the system.

You can "resume" scanning at any time.

Steps
  1. From the Configuration tab, click the Configuration button for the working environment.

    A screenshot showing how to click the Configuration button for a working environment.

  2. In the Scan Configuration page, click the Pause button to pause scanning for a volume, or click the Resume button to resume scanning for a volume that was previously paused.

    A screenshot showing how to pause and resume scanning on a volume.

    Note that some data sources provide the Pause and Resume functionality in a menu, as shown below for SharePoint sites.

    A screenshot showing how to pause and resume scanning on a SharePoint site.
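
The difference between pausing scanning and turning it off can be sketched as a tiny state model. This is a conceptual illustration only, assuming a simplified in-memory repository. The `Repository` class and its methods are hypothetical and do not represent how Data Sense is implemented or any API it exposes.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Repository:
    """Hypothetical stand-in for a scanned volume, bucket, share, etc."""
    name: str
    scanning: str = "on"  # one of "on", "paused", "off"
    results: Dict[str, str] = field(default_factory=dict)

    def record_scan(self, path: str, finding: str) -> bool:
        """Store a scan result; only happens while scanning is on."""
        if self.scanning != "on":
            return False
        self.results[path] = finding
        return True

    def pause(self) -> None:
        # Pausing stops future scans but keeps the current results.
        self.scanning = "paused"

    def resume(self) -> None:
        # Resuming lets new changes and additions be scanned again.
        if self.scanning == "paused":
            self.scanning = "on"

    def turn_off(self) -> None:
        # Turning scanning off removes the indexed information.
        self.scanning = "off"
        self.results.clear()
```

In this model, a paused repository keeps everything in `results` but ignores new `record_scan` calls until `resume()` is called, whereas `turn_off()` discards the results entirely.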

Rescanning data for an existing repository

Data Sense continuously scans your data to detect incremental changes in the repositories that you’ve added. However, it takes time for the system to scan all the environments, and you can't control the order in which repositories are scanned. If you need to rescan a particular repository immediately so that changes are reflected in the system, you can select the repository and rescan it. This allows you to prioritize scanning of certain data before other data. After the rescan action, the selected repository returns to being scanned under the normal Data Sense schedule.

Tip Currently we support rescanning a single directory (folder or share). Future support will include rescanning additional repository types (files, databases, etc.).
  • When rescanning a directory, all the files within the directory are rescanned, but sub-folders within the directory are not rescanned.

  • When rescanning a share, only the share’s metadata is rescanned.
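
To illustrate the directory scope described above (files directly in the directory are rescanned, sub-folders are not), here is a minimal sketch using a local file system as an analogy. The function `files_to_rescan` is a hypothetical name for this example. It only mimics the non-recursive scope and is not how Data Sense selects files.

```python
from pathlib import Path
from typing import List


def files_to_rescan(directory: Path) -> List[Path]:
    """List the files directly inside `directory`, skipping sub-folders.

    Mirrors the scope described for a directory rescan: only the
    top-level files are included, and nothing below a sub-folder.
    """
    return sorted(p for p in directory.iterdir() if p.is_file())
```

With a directory that contains `a.txt` and a sub-folder `sub/` holding `b.txt`, only `a.txt` is returned. To cover `sub/b.txt` in this analogy, you would rescan `sub` separately.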

Steps
  1. In the Data Investigation results pane, select the folder or share that you want to rescan, and click Rescan.

    A screenshot showing how to select and rescan a directory.

  2. In the Rescan Directory dialog, click Rescan.

Note that you can also rescan an individual directory when viewing the metadata details. Just click Rescan.

A screenshot showing how to rescan a single folder or share.