Investigate the data stored in your organization with BlueXP classification
You can investigate your organization's data by viewing details on the Data Investigation page, where you can continue your research after reviewing the Governance dashboard. On the Investigation page, you can apply filters to show only the results you want to see. You can also view file metadata, view permissions for files and directories, and check for duplicate files in your storage systems.
You can navigate to this page from many areas of the BlueXP classification UI, including the Governance and Compliance dashboards, in which case the filters selected on those pages are already applied. You can export the data into a CSV or JSON file for further analysis or to share with others.
Note: The capabilities described in this section are available only if you have chosen to perform a full classification scan on your data sources. Data sources that have had a mapping-only scan do not show file-level details.
Filter data in the Data Investigation page
You can filter the contents of the investigation page to display only the results you want to see.
- From the BlueXP classification menu, select Investigation.
- On the Data Investigation page, do any of the following:
  - To download the contents of the page as a report after you've refined it, select the download button.
  - To view the data from files (unstructured data), directories (folders and file shares), or from databases (structured data), select one of the tabs at the top.
  - To sort the results in numerical or alphabetical order, select the control at the top of each column.
  - To refine the results even more, select one of the many filters in the left Filter pane.
Note: You can only view the first 10,000 results (500 pages) for a scan on the Data Investigation page.
Filter data by sensitivity and content
Use the following filters to view how much sensitive information is contained in your data.
Filter | Details |
---|---|
Category | Select the types of categories. |
Sensitivity Level | Select the sensitivity level: Personal, Sensitive personal, or Non sensitive. |
Number of identifiers | Select the range of detected sensitive identifiers per file. Includes personal data and sensitive personal data. When filtering in Directories, BlueXP classification totals the matches from all files in each folder (and sub-folders). |
Personal Data | Select the types of personal data. |
Sensitive Personal Data | Select the types of sensitive personal data. |
Data Subject | Enter a data subject's full name or known identifier. Learn more about data subjects here. |
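The directory-level totaling described for the Number of identifiers filter can be pictured with a short sketch. This is only an illustration of the roll-up idea, not a description of how BlueXP classification works internally; the function name, paths, and counts are hypothetical sample data.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def rollup_identifier_counts(file_counts):
    """Total per-file identifier counts into every ancestor folder.

    file_counts maps a file path to the number of identifiers found in
    that file. The paths and counts are hypothetical.
    """
    totals = defaultdict(int)
    for path, count in file_counts.items():
        # .parents yields the containing folder first, then each folder above it.
        for folder in PurePosixPath(path).parents:
            totals[str(folder)] += count
    return dict(totals)


sample = {
    "/share/hr/a.docx": 3,
    "/share/hr/2023/b.xlsx": 5,
    "/share/finance/c.pdf": 2,
}
totals = rollup_identifier_counts(sample)
print(totals["/share/hr"])  # includes the file in the 2023 sub-folder
```

In this sample, the total for `/share/hr` counts the identifiers in its own file plus those in the `2023` sub-folder, which is the behavior the filter description implies for directories.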
Filter data by user owner and user permissions
Use the following filters to view file owners and permissions to access your data.
Filter | Details |
---|---|
Open Permissions | Select the type of permissions within the data and within folders/shares. |
User / Group Permissions | Select one or multiple user names and/or group names, or enter a partial name. |
File Owner | Enter the file owner name. |
Number of users with access | Select one or multiple category ranges to show which files and folders are open to a certain number of users. |
Filter data by time
Use the following filters to view data based on time criteria.
Filter | Details |
---|---|
Created Time | Select a time range when the file was created. You can also specify a custom time range to further refine the search results. |
Discovered Time | Select a time range when BlueXP classification discovered the file. You can also specify a custom time range to further refine the search results. |
Last Modified | Select a time range when the file was last modified. You can also specify a custom time range to further refine the search results. |
Last Accessed | Select a time range when the file, or directory (CIFS or NFS only), was last accessed. You can also specify a custom time range to further refine the search results. For the types of files that BlueXP classification scans, this is the last time BlueXP classification scanned the file. BlueXP classification does not extract the "last accessed time" from the following data sources: SharePoint Online, SharePoint On-premises (SharePoint Server), OneDrive, Google Drive, and Amazon S3. |
Filter data by metadata
Use the following filters to view data based on location, size, and directory or file type.
Filter | Details |
---|---|
File Path | Enter up to 20 partial or full paths that you want to include or exclude from the query. If you enter both include paths and exclude paths, BlueXP classification finds all files in the included paths first, then it removes files from excluded paths, and then it displays the results. Note that using "*" in this filter has no effect, and that you can't exclude specific folders from the scan - all the directories and files under a configured share will be scanned. |
Directory Type | Select the directory type; either "Share" or "Folder". |
File Type | Select the types of files. |
File Size | Select the file size range. |
File Hash | Enter the file's hash to find a specific file, even if the name is different. |
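The include-then-exclude order described for the File Path filter can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and sample paths are hypothetical, and partial-path matching is modeled as a plain substring test, which might differ from the product's actual matching rules.

```python
def filter_paths(files, include=(), exclude=()):
    """Keep files matching any include path, then drop files matching any exclude path.

    Matching is a plain substring test (an assumption for illustration);
    "*" is treated as an ordinary character, not a wildcard.
    """
    # Step 1: with no include paths, every file is a candidate.
    included = [f for f in files if not include or any(p in f for p in include)]
    # Step 2: remove anything that falls under an excluded path.
    return [f for f in included if not any(p in f for p in exclude)]


files = [
    "/vol1/projects/alpha/report.docx",
    "/vol1/projects/alpha/archive/old.docx",
    "/vol1/projects/beta/notes.txt",
]
result = filter_paths(files, include=["/projects/alpha"], exclude=["/archive"])
print(result)
```

Because exclusion runs after inclusion, a file that sits under both an included and an excluded path is left out of the results.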
Filter data by storage type
Use the following filters to view data by storage type.
Filter | Details |
---|---|
Working Environment Type | Select the type of working environment. OneDrive, SharePoint, and Google Drive are categorized under "Apps". |
Working Environment name | Select specific working environments. |
Storage Repository | Select the storage repository, for example, a volume or a schema. |
Filter data by saved searches
Use the following filter to view data by saved searches.
Filter | Details |
---|---|
Saved search | Select one or more saved searches. Go to the saved searches tab to view the list of existing saved searches and create new ones. |
Filter data by analysis status
Use the following filter to view data by the BlueXP classification scan status.
Filter | Details |
---|---|
Analysis Status | Select an option to show the list of files that are Pending First Scan, Completed being scanned, Pending Rescan, or that have Failed to be scanned. |
Scan Analysis Event | Select whether you want to view files that were not classified because BlueXP classification couldn't revert last accessed time, or files that were classified even though BlueXP classification couldn't revert last accessed time. |
See details about the "last accessed time" timestamp for more information about the items that appear in the Investigation page when filtering using the Scan Analysis Event.
Filter data by duplicates
Use the following filter to view files that are duplicated in your storage.
Filter | Details |
---|---|
Duplicates | Select whether the file is duplicated in the repositories. |
View file metadata
In addition to showing you the working environment and volume where the file resides, the metadata shows much more information, including the file permissions, file owner, and whether there are duplicates of this file. This information is useful if you're planning to create saved searches because you can see all the information that you can use to filter your data.
Not all information is available for all data sources - just what is appropriate for that data source. For example, volume name and permissions are not relevant for database files.
- From the BlueXP classification menu, select Investigation.
- In the Data Investigation list on the right, select the down-caret on the right for any single file to view the file metadata.
View users' permissions for files and directories
To view a list of all users or groups who have access to a file or to a directory and the types of permissions they have, select View all Permissions. This button is available only for data in CIFS shares.
Note that if you see SIDs (Security IDentifiers) instead of user and group names, you should integrate your Active Directory into BlueXP classification. See how to do this.
- From the BlueXP classification menu, select Investigation.
- In the Data Investigation list on the right, select the down-caret on the right for any single file to view the file metadata.
- To view a list of all users or groups who have access to a file or to a directory and the types of permissions they have, in the Open Permissions field, select View all Permissions.
  BlueXP classification shows up to 100 users in the list.
- Select the down-caret button for any group to see the list of users who are part of the group.
  You can expand one level of the group to see the users who are part of the group.
- Select the name of a user or group to refresh the Investigation page so you can see all the files and directories that the user or group has access to.
Check for duplicate files in your storage systems
You can check whether duplicate files are stored in your storage systems. This is useful if you want to identify areas where you can save storage space. It can also help you make sure that files with specific permissions or sensitive information are not unnecessarily duplicated in your storage systems.
All of your files (not including databases) that are 1 MB or larger, or that contain personal or sensitive personal information, are compared to see if there are duplicates.
BlueXP classification uses hashing technology to determine duplicate files. If any file has the same hash code as another file, we can be 100% sure that the files are exact duplicates — even if the file names are different.
- From the BlueXP classification menu, select Investigation.
- In the Investigation page Filters pane on the left, select "File Size" along with "Duplicates" ("Has duplicates") to see which files of a certain size range are duplicated in your environment.
- Optionally, download the list of duplicate files and send it to your storage admin so they can decide which files, if any, can be deleted.
- Optionally, delete the file yourself if you are confident that a specific version of the file is not needed.
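The idea of finding duplicates by comparing content hashes, regardless of file name, can be sketched locally. This is an illustration only: the source does not say which hash algorithm BlueXP classification uses, so SHA-256 is an assumption here, and the function names are hypothetical.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks.

    SHA-256 is an assumption; the product's algorithm is not documented here.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Group files under root by content hash and keep groups with more than one file."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[file_digest(path)].append(path)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

Calling `find_duplicates("/some/folder")` on a local folder returns a dictionary keyed by hash, where each value lists the paths that share identical content even if their names differ.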
View if a specific file is duplicated
You can see if a single file has duplicates.
- From the BlueXP classification menu, select Investigation.
- In the Data Investigation list, select the down-caret on the right for any single file to view the file metadata.
  If duplicates exist for a file, this information appears next to the Duplicates field.
- To view the list of duplicate files and where they are located, select View Details.
- On the next page, select View Duplicates to view the files in the Investigation page.
  You can use the "file hash" value provided on this page and enter it directly in the Investigation page to search for a specific duplicate file at any time, or you can use it in a saved search.
Create the Data Investigation Report
The Data Investigation Report is a download of the filtered contents of the Data Investigation page.
The report is available as a .CSV or .JSON file that you can save to the local machine.
There can be up to three report files downloaded if BlueXP classification is scanning files (unstructured data), directories (folders and file shares), and databases (structured data).
Each report is split into files with a fixed number of rows or records:
- JSON: 100,000 records
- CSV: 200,000 records
You can download a version of the CSV file to view in the browser. This version is limited to 10,000 records.
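As a rough worked example of the split sizes stated above, the following sketch estimates how many files a report would produce. It assumes that each file is filled to the limit and that the last file holds the remainder; the function name is hypothetical.

```python
import math

# Records per report file, taken from the limits stated above.
RECORDS_PER_FILE = {"json": 100_000, "csv": 200_000}


def report_file_count(total_records, report_format):
    """Estimate how many files a report of total_records records is split into."""
    limit = RECORDS_PER_FILE[report_format.lower()]
    return math.ceil(total_records / limit)


print(report_file_count(450_000, "JSON"))
print(report_file_count(450_000, "CSV"))
```

Under these assumptions, a report of 450,000 records would span five JSON files or three CSV files.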
What's included in the Data Investigation Report
The Unstructured Files Data Report includes the following information about your files:
- File name
- Location type
- Working environment name
- Storage repository (for example, a volume, bucket, or share)
- Repository type
- File path
- File type
- File size (in MB)
- Created time
- Last modified
- Last accessed
- File owner
- Category
- Personal information
- Sensitive personal information
- Open permissions
- Scan Analysis Error
- Deletion detection date
A deletion detection date identifies the date that the file was deleted or moved. This enables you to identify when sensitive files have been moved. Deleted files aren't part of the file number count that appears in the dashboard or on the Investigation page. The files only appear in the CSV reports.
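Because deleted or moved files appear only in the CSV reports, one way to find them is to filter a downloaded report on the deletion detection date. The sketch below is illustrative: the column headers, date format, and sample rows are assumptions and might not match the actual report layout.

```python
import csv
import io

# Hypothetical excerpt of an Unstructured Files Data Report. The real
# column headers and date format may differ.
SAMPLE_REPORT = """File name,File path,Category,Deletion detection date
payroll.xlsx,/vol1/hr/payroll.xlsx,HR,2024-03-02
notes.txt,/vol1/misc/notes.txt,Miscellaneous,
"""


def deleted_or_moved(report_file):
    """Return the rows whose deletion detection date column is filled in."""
    reader = csv.DictReader(report_file)
    return [row for row in reader if row["Deletion detection date"].strip()]


rows = deleted_or_moved(io.StringIO(SAMPLE_REPORT))
for row in rows:
    print(row["File path"], row["Deletion detection date"])
```

To run the same check on a real download, open the file with `open("report.csv", newline="")` and pass the file object to `deleted_or_moved`, adjusting the column name to match the report.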
The Unstructured Directories Data Report includes the following information about your folders and file shares:
- Working environment type
- Working environment name
- Directory name
- Storage repository (for example, a folder or file share)
- Directory owner
- Created time
- Discovered time
- Last modified
- Last accessed
- Open permissions
- Directory type
The Structured Data Report includes the following information about your database tables:
- DB Table name
- Location type
- Working environment name
- Storage repository (for example, a schema)
- Column count
- Row count
- Personal information
- Sensitive personal information
To create the report:
- From the Data Investigation page, select the download button at the top right of the page.
- Choose the report type: CSV or JSON.
- Enter a Report name.
- To download the complete report, select Working environment then choose the Working Environment and Volume from the respective dropdown menus. Provide a Destination folder path.
  To download the report in the browser, select Local. Note that this option limits the report to the first 10,000 rows and is limited to the CSV format. You don't need to complete any other fields if you select Local.
- Select Download Report.
  A dialog displays a message that the reports are being downloaded.
Create a saved search based on selected filters
You can create a saved search for frequently used search filters in the Data Investigation page to easily replicate those search queries.
- From the BlueXP classification menu, select Investigation.
- On the Data Investigation page, select the filters you want to use to create a saved search.
- At the bottom of the Filter pane, select Create saved search from this search.
- Enter a name and a description for the saved search.
- Choose any of the following:
- Select Create Saved Search.
Note: It might take up to 15 minutes for the results to appear on the Saved Searches page.