Scan data sources with NetApp Data Classification
NetApp Data Classification scans the data in the repositories that you select (volumes, database schemas, or other user data) to identify personal and sensitive data. Data Classification then maps your organizational data, categorizes each file, and identifies predefined patterns in the data. The result of the scan is an index of personal information, sensitive personal information, data categories, and file types.
After the initial scan, Data Classification continuously scans your data in a round-robin fashion to detect incremental changes. This is why it's important to keep the instance running.
You can enable and disable scans at the volume level or at the database schema level.
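The round-robin rescanning described above can be pictured with a small sketch. This is only an illustration of the general idea (visit each repository in turn and report what changed since the last visit), not how Data Classification is implemented. The repository names, the timestamp values, and the `rescan` helper are all hypothetical.

```python
from itertools import cycle


def rescan(repo_files, index):
    """Return paths that are new or modified since the indexed state, then update the index."""
    changed = [path for path, mtime in repo_files.items() if index.get(path) != mtime]
    index.update(repo_files)
    return changed


# Hypothetical repositories: item name -> last-modified timestamp.
repositories = {
    "volume_a": {"/a/report.docx": 100},
    "schema_b": {"customers": 200},
}
indexes = {name: {} for name in repositories}  # what each previous pass recorded

turns = cycle(repositories)  # endless round-robin order over repository names
log = []
for step in range(4):
    name = next(turns)
    if step == 2:
        # Simulate an edit that happens before the second visit to volume_a.
        repositories["volume_a"]["/a/report.docx"] = 150
    log.append((name, rescan(repositories[name], indexes[name])))

for entry in log:
    print(entry)
```

In this sketch the first pass over each repository reports everything (the initial scan), and later passes report only items whose timestamp changed. If the loop stops, changes go unnoticed, which mirrors why the instance needs to keep running.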
Understand scan types
You can conduct two types of scans in Data Classification:
- Map-only scans provide only a high-level overview of your data and are performed on selected data sources. Map-only scans take less time than full scans because they don't access files to see the data inside.
- Full scans provide deep-level scanning of your data. A full scan includes a mapping scan and a classification of data inside the files.
Performing a map-only scan can help you gain an overview of your data and pinpoint specific areas on which to perform full scans.
Review the tables to see what information is surfaced in map-only scans and full scans.
| Feature | Full scan | Map-only scan |
|---|---|---|
| Scan speed | Slow | Fast |
| Pricing | Free | Free |
| Capacity | Limited to 500 TiB* | Limited to 500 TiB* |
| List of file types and used capacity | Yes | Yes |
| Number of files and used capacity | Yes | Yes |
| Age and size of files | Yes | Yes |
| Ability to run a Data Mapping Report | Yes | Yes |
| Data Investigation page to view file details | Yes | No |
| Search for names within files | Yes | No |
| Create saved queries that provide custom search results | Yes | No |
| Ability to run other reports | Yes | No |
| Ability to see metadata from files** | Yes | Yes |
* Data Classification does not impose a limit on the amount of data it scans. Each Console agent supports scanning and displaying 500 TiB of data. To scan more than 500 TiB of data, install another Console agent, then deploy another Data Classification instance.
The Console UI displays data from a single Console agent. For tips on viewing data from multiple Console agents, see Work with multiple Console agents.
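As a rough planning aid, the 500 TiB per Console agent figure above translates into simple ceiling arithmetic. The sketch below assumes data can be divided freely across agents, which may not match how your volumes are actually laid out. The function name `agents_needed` is hypothetical.

```python
import math

# From the note above: each Console agent supports scanning and displaying 500 TiB.
TIB_PER_AGENT = 500


def agents_needed(total_tib):
    """Estimate how many Console agents (each with its own Data Classification
    instance) are needed to cover total_tib of data."""
    if total_tib <= 0:
        return 0
    return math.ceil(total_tib / TIB_PER_AGENT)


print(agents_needed(1200))  # 3
```

For example, 1,200 TiB divided by 500 TiB per agent is 2.4, which rounds up to three agents.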
** The following metadata is extracted from files during mapping scans:
- File discovered time
- File last accessed
- File last modified
- File type
- File size
- File creation
- Number of files
- Permissions extraction
- Storage repository
- System
- System type
- Used capacity
Governance dashboard differences:
| Feature | Full scan | Map-only scan |
|---|---|---|
| Stale data | Yes | Yes |
| Non-business data | Yes | Yes |
| Duplicated files | Yes | Yes |
| Predefined saved queries | Yes | No |
| Default saved queries | Yes | Yes |
| DDA report | Yes | Yes |
| Mapping report | Yes | Yes |
| Sensitivity level detection | Yes | No |
| Sensitive data with wide permissions | Yes | No |
| Open permissions | Yes | Yes |
| Age of data | Yes | Yes |
| Size of data | Yes | Yes |
| Categories | Yes | No |
| File types | Yes | Yes |
Compliance dashboard differences:
| Feature | Full scan | Map-only scan |
|---|---|---|
| Personal information | Yes | No |
| Sensitive personal information | Yes | No |
| Privacy risk assessment report | Yes | No |
| HIPAA report | Yes | No |
| PCI DSS report | Yes | No |
Investigation filters differences:
| Feature | Full scan | Map-only scan |
|---|---|---|
| Saved queries | Yes | Yes |
| System type | Yes | Yes |
| System | Yes | Yes |
| Storage repository | Yes | Yes |
| File type | Yes | Yes |
| File size | Yes | Yes |
| Created time | Yes | Yes |
| Discovered time | Yes | Yes |
| Last modified | Yes | Yes |
| Last access | Yes | Yes |
| Open permissions | Yes | Yes |
| File directory path | Yes | Yes |
| Category | Yes | No |
| Sensitivity level | Yes | No |
| Number of identifiers | Yes | No |
| Personal data | Yes | No |
| Sensitive personal data | Yes | No |
| Data subject | Yes | No |
| Duplicates | Yes | Yes |
| Classification status | Yes | Status is always "Limited insights" |
| Scan analysis event | Yes | Yes |
| File hash | Yes | Yes |
| Number of users with access | Yes | Yes |
| User/group permissions | Yes | Yes |
| File owner | Yes | Yes |
| Directory type | Yes | Yes |