BlueXP classification

Investigate the data stored in your organization with BlueXP classification

Contributors amgrissino netapp-ahibbard

You can investigate the data from your organization by viewing details in the Data Investigation page. Here is where you can continue your research after looking at the Governance dashboard. On the Investigation page, you can filter the data using one of the many filters to show only the results you want to see. You can also view file metadata, permissions for files and directories, and check for duplicate files in your storage systems.

You can navigate to this page from many areas of the BlueXP classification UI, including the Governance and Compliance dashboards, with the filters you selected on those pages already applied. You can export the data into a CSV or JSON file for further analysis or to share with others.

Note The capabilities described in this section are available only if you have chosen to perform a full classification scan on your data sources. Data sources that have had a mapping-only scan do not show file-level details.

Filter data in the Data Investigation page

You can filter the contents of the investigation page to display only the results you want to see.

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. On the Data Investigation page, do any of the following:

  3. To download the contents of the page as a report after you've refined it, select the download button.

    A screenshot of the filters available when refining the results in the investigation page.

  4. To view the data from files (unstructured data), directories (folders and file shares), or from databases (structured data), select one of the tabs at the top.

  5. To sort the results in numerical or alphabetical order, select the control at the top of each column.

  6. To refine the results even more, select one of the many filters in the left Filter pane.

Note You can only view the first 10,000 results—or 500 pages—for a scan on the Data Investigation page.

Filter data by sensitivity and content

Use the following filters to view how much sensitive information is contained in your data.

Filter Details

Category

Select the types of categories.

Sensitivity Level

Select the sensitivity level: Personal, Sensitive personal, or Non sensitive.

Number of identifiers

Select the range of detected sensitive identifiers per file. Includes personal data and sensitive personal data. When filtering in Directories, BlueXP classification totals the matches from all files in each folder (and sub-folders).

NOTE: The December 2023 (version 1.26.6) release removed the option to calculate the amount of personally identifiable information (PII) by Directories.

Personal Data

Select the types of personal data.

Sensitive Personal Data

Select the types of sensitive personal data.

Data Subject

Enter a data subject's full name or known identifier. Learn more about data subjects here.

Filter data by user owner and user permissions

Use the following filters to view file owners and permissions to access your data.

Filter Details

Open Permissions

Select the type of permissions within the data and within folders/shares.

User / Group Permissions

Select one or multiple user names and/or group names, or enter a partial name.

File Owner

Enter the file owner name.

Number of users with access

Select one or multiple category ranges to show which files and folders are open to a certain number of users.

Filter data by time

Use the following filters to view data based on time criteria.

Filter Details

Created Time

Select a time range when the file was created. You can also specify a custom time range to further refine the search results.

Discovered Time

Select a time range when BlueXP classification discovered the file. You can also specify a custom time range to further refine the search results.

Last Modified

Select a time range when the file was last modified. You can also specify a custom time range to further refine the search results.

Last Accessed

Select a time range when the file, or directory (CIFS or NFS only), was last accessed. You can also specify a custom time range to further refine the search results. For the types of files that BlueXP classification scans, this is the last time BlueXP classification scanned the file.

BlueXP classification does not extract the "last accessed time" from the following data sources: SharePoint Online, SharePoint On-premises (SharePoint Server), OneDrive, Google Drive, and Amazon S3.

Filter data by metadata

Use the following filters to view data based on location, size, and directory or file type.

Filter Details

File Path

Enter up to 20 partial or full paths that you want to include or exclude from the query. If you enter both include paths and exclude paths, BlueXP classification finds all files in the included paths first, then it removes files from excluded paths, and then it displays the results. Note that using "*" in this filter has no effect, and that you can't exclude specific folders from the scan - all the directories and files under a configured share will be scanned.

Directory Type

Select the directory type: either "Share" or "Folder".

File Type

Select the types of files.

File Size

Select the file size range.

File Hash

Enter the file's hash to find a specific file, even if the name is different.
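The include-then-exclude order described for the File Path filter can be illustrated with a minimal Python sketch. This is not BlueXP classification's implementation: the function name `filter_paths`, the sample paths, and the use of plain substring matching for "partial" paths are all assumptions made for illustration. The sketch performs no wildcard expansion.

```python
from typing import Iterable, List


def filter_paths(paths: Iterable[str], include: List[str], exclude: List[str]) -> List[str]:
    """Hypothetical model of the File Path filter: include first, then exclude."""
    # Step 1: keep files under any included path (keep everything if no include paths are given).
    included = [p for p in paths if not include or any(frag in p for frag in include)]
    # Step 2: remove files that fall under any excluded path.
    return [p for p in included if not any(frag in p for frag in exclude)]


# Made-up paths, for illustration only.
files = [
    "/vol1/finance/2023/report.xlsx",
    "/vol1/finance/archive/old.xlsx",
    "/vol1/hr/contracts/a.pdf",
]

result = filter_paths(files, include=["/vol1/finance"], exclude=["/archive"])
print(result)  # ['/vol1/finance/2023/report.xlsx']
```

Because exclusion runs after inclusion, a file that matches both an include path and an exclude path does not appear in the results.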

Filter data by storage type

Use the following filters to view data by storage type.

Filter Details

Working Environment Type

Select the type of working environment. OneDrive, SharePoint, and Google Drive are categorized under "Apps".

Working Environment name

Select specific working environments.

Storage Repository

Select the storage repository, for example, a volume or a schema.

Filter data by saved searches

Use the following filter to view data by saved searches.

Filter Details

Saved search

Select one or more saved searches. Go to the saved searches tab to view the list of existing saved searches and create new ones.

Filter data by analysis status

Use the following filter to view data by the BlueXP classification scan status.

Filter Details

Analysis Status

Select an option to show the list of files that are Pending First Scan, Completed being scanned, Pending Rescan, or that have Failed to be scanned.

Scan Analysis Event

Select whether you want to view files that were not classified because BlueXP classification couldn't revert last accessed time, or files that were classified even though BlueXP classification couldn't revert last accessed time.

See details about the "last accessed time" timestamp for more information about the items that appear in the Investigation page when filtering using the Scan Analysis Event.

Filter data by duplicates

Use the following filter to view files that are duplicated in your storage.

Filter Details

Duplicates

Select whether the file is duplicated in the repositories.

View file metadata

In addition to showing you the working environment and volume where the file resides, the metadata shows much more information, including the file permissions, file owner, and whether there are duplicates of this file. This information is useful if you're planning to create saved searches because you can see all the information that you can use to filter your data.

Not all information is available for all data sources - just what is appropriate for that data source. For example, volume name and permissions are not relevant for database files.

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. In the Data Investigation list on the right, select the down-caret on the right for any single file to view the file metadata.

    A screenshot showing the metadata details for a file in the Data Investigation page.

View users' permissions for files and directories

To view a list of all users or groups who have access to a file or to a directory and the types of permissions they have, select View all Permissions. This button is available only for data in CIFS shares.

Note that if you see SIDs (Security IDentifiers) instead of user and group names, you should integrate your Active Directory into BlueXP classification. See how to do this.

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. In the Data Investigation list on the right, select the down-caret on the right for any single file to view the file metadata.

  3. To view a list of all users or groups who have access to a file or to a directory and the types of permissions they have, in the Open Permissions field, select View all Permissions.

    Note BlueXP classification shows up to 100 users in the list.

    A screenshot showing detailed file permissions.

  4. Select the down-caret button for any group to see the list of users who are part of the group.

    Tip You can expand one level of the group to see the users who are part of the group.
  5. Select the name of a user or group to refresh the Investigation page so you can see all the files and directories that the user or group has access to.

Check for duplicate files in your storage systems

You can check whether duplicate files are stored in your storage systems. This is useful if you want to identify areas where you can save storage space. It can also help you make sure that files with specific permissions or sensitive information are not unnecessarily duplicated in your storage systems.

All of your files (not including databases) that are 1 MB or larger, or that contain personal or sensitive personal information, are compared to see if there are duplicates.

BlueXP classification uses hashing technology to determine duplicate files. If any file has the same hash code as another file, we can be 100% sure that the files are exact duplicates — even if the file names are different.
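As a rough illustration of hash-based duplicate detection, the following Python sketch groups files by a hash of their content. The source does not say which hash algorithm BlueXP classification uses, so SHA-256 is an assumption here, as are the function names and the made-up sample files held in memory.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def content_hash(data: bytes) -> str:
    """Return a hex digest of the file content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def find_duplicates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group file names by content hash and return only groups with more than one file."""
    by_hash = defaultdict(list)
    for name, data in files.items():
        by_hash[content_hash(data)].append(name)
    return [sorted(names) for names in by_hash.values() if len(names) > 1]


# Made-up files: two share the same content under different names.
sample = {
    "q1-report.docx": b"quarterly numbers",
    "copy of report.docx": b"quarterly numbers",
    "notes.txt": b"meeting notes",
}

duplicates = find_duplicates(sample)
print(duplicates)  # [['copy of report.docx', 'q1-report.docx']]
```

The file names play no part in the comparison, which is why differently named copies still land in the same group.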

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. In the Investigation page Filters pane on the left, select "File Size" along with "Duplicates" ("Has duplicates") to see which files of a certain size range are duplicated in your environment.

  3. Optionally, download the list of duplicate files and send it to your storage admin so they can decide which files, if any, can be deleted.

  4. Optionally, delete the file yourself if you are confident that a specific version of the file is not needed.

View if a specific file is duplicated

You can see if a single file has duplicates.

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. In the Data Investigation list, select the down-caret on the right for any single file to view the file metadata.

    If duplicates exist for a file, this information appears next to the Duplicates field.

  3. To view the list of duplicate files and where they are located, select View Details.

  4. On the next page, select View Duplicates to view the files in the Investigation page.

    A screenshot showing how to view where duplicated files are located.

    Tip You can use the "file hash" value provided in this page and enter it directly in the Investigation page to search for a specific duplicate file at any time - or you can use it in a saved search.

Create the Data Investigation Report

The Data Investigation Report is a download of the filtered contents of the Data Investigation page.

The report is available as a .CSV or .JSON file that you can save to the local machine.

There can be up to three report files downloaded if BlueXP classification is scanning files (unstructured data), directories (folders and file shares), and databases (structured data).

Each report is split into multiple files with a fixed number of rows or records per file:

  • JSON - 100,000 records

  • CSV - 200,000 records

    Note You can download a version of the CSV file to view in your browser. This version is limited to 10,000 records.
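To estimate how many files a large report produces, the per-file limits above can be applied as simple arithmetic. The Python sketch below assumes that the final file holds whatever records remain; the function name and the example record count are hypothetical.

```python
import math

# Per-file record limits stated above.
MAX_RECORDS_PER_FILE = {"json": 100_000, "csv": 200_000}


def report_file_count(total_records: int, report_format: str) -> int:
    """Estimate the number of files for one report, assuming the last file holds the remainder."""
    if total_records < 0:
        raise ValueError("total_records must not be negative")
    limit = MAX_RECORDS_PER_FILE[report_format.lower()]
    return math.ceil(total_records / limit)


# Example: a hypothetical result set of 450,000 records.
print(report_file_count(450_000, "json"))  # 5
print(report_file_count(450_000, "csv"))   # 3
```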

What's included in the Data Investigation Report

The Unstructured Files Data Report includes the following information about your files:

  • File name

  • Location type

  • Working environment name

  • Storage repository (for example, a volume, bucket, or share)

  • Repository type

  • File path

  • File type

  • File size (in MB)

  • Created time

  • Last modified

  • Last accessed

  • File owner

  • Category

  • Personal information

  • Sensitive personal information

  • Open permissions

  • Scan Analysis Error

  • Deletion detection date

    A deletion detection date identifies the date that the file was deleted or moved. This enables you to identify when sensitive files have been moved. Deleted files aren't part of the file number count that appears in the dashboard or on the Investigation page. The files only appear in the CSV reports.
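Because deleted or moved files appear only in the CSV reports, one way to find them is to filter a downloaded report on the deletion detection date. The Python sketch below is illustrative only: the exact column headers, the date format, and the sample rows are assumptions, so check them against a real report before reusing the code.

```python
import csv
import io

# Made-up sample with assumed column headers; a real report has more columns.
SAMPLE_REPORT = """File name,File path,Deletion detection date
a.docx,/vol1/hr/a.docx,
b.xlsx,/vol1/finance/b.xlsx,2024-01-15
"""


def deleted_or_moved(report_file):
    """Return the rows whose deletion detection date column is not empty."""
    reader = csv.DictReader(report_file)
    return [row for row in reader if row["Deletion detection date"].strip()]


rows = deleted_or_moved(io.StringIO(SAMPLE_REPORT))
for row in rows:
    print(row["File path"], row["Deletion detection date"])
```

For a report saved to disk, pass a file object opened with `open(path, newline="")` instead of the in-memory sample.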

The Unstructured Directories Data Report includes the following information about your folders and file shares:

  • Working environment type

  • Working environment name

  • Directory name

  • Storage repository (for example, a folder or file share)

  • Directory owner

  • Created time

  • Discovered time

  • Last modified

  • Last accessed

  • Open permissions

  • Directory type

The Structured Data Report includes the following information about your database tables:

  • DB Table name

  • Location type

  • Working environment name

  • Storage repository (for example, a schema)

  • Column count

  • Row count

  • Personal information

  • Sensitive personal information

Steps to generate the report
  1. From the Data Investigation page, select the download button at the top right of the page.

  2. Choose the report type: CSV or JSON.

  3. Enter a Report name.

  4. To download the complete report, select Working environment, then choose the Working Environment and Volume from the respective dropdown menus. Provide a Destination folder path.

    To download the report in the browser, select Local. Note that this option limits the report to the first 10,000 rows and is available only in the CSV format. You don't need to complete any other fields if you select Local.

  5. Select Download Report.

    A screenshot of the Download Investigation Report page with multiple options.

Result

A dialog displays a message that the reports are being downloaded.

Create a saved search based on selected filters

You can create a saved search for frequently used search filters in the Data Investigation page to easily replicate those search queries.

Steps
  1. From the BlueXP classification menu, select Investigation.

  2. On the Data Investigation page, select the filters you want to use to create a saved search.

  3. At the bottom of the Filter pane, select Create saved search from this search.

  4. Enter a name and a description for the saved search.

  5. Choose any of the following:

  6. Select Create Saved Search.

Tip It might take up to 15 minutes for the results to appear on the Saved Searches page.