Categories of private data
There are many types of private data that BlueXP classification can identify in your volumes and databases.
BlueXP classification identifies two types of personal data:
-
Personally identifiable information (Pii)
-
Sensitive personal information (SPii)
If you need BlueXP classification to identify other private data types, such as additional national ID numbers or healthcare identifiers, email ng-contact-data-sense@netapp.com with your request. |
Types of personal data
The personal data, or personally identifiable information (Pii), found in files can be general personal data or national identifiers. The third column in the table below identifies whether BlueXP classification uses proximity validation to validate its findings for the identifier.
The languages in which these items can be recognized are identified in the table.
Type | Identifier | Proximity validation? | English | German | Spanish | French | Japanese |
---|---|---|---|---|---|---|---|
General |
Credit card number |
No |
✓ |
✓ |
✓ |
✓ |
|
Data Subjects |
No |
✓ |
✓ |
✓ |
|||
Email Address |
No |
✓ |
✓ |
✓ |
✓ |
||
IBAN Number (International Bank Account Number) |
No |
✓ |
✓ |
✓ |
✓ |
||
IP Address |
No |
✓ |
✓ |
✓ |
✓ |
||
Password |
Yes |
✓ |
✓ |
✓ |
✓ |
||
National Identifiers |
Australian TFN (Tax File Number) |
Yes |
✓ |
✓ |
✓ |
||
Australian Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
Australian Medicare Number |
Yes |
✓ |
✓ |
✓ |
|||
Australian Passport Number |
Yes |
✓ |
✓ |
✓ |
|||
Austrian SSN |
Yes |
✓ |
✓ |
✓ |
|||
Belgian ID (Numero National) |
Yes |
✓ |
✓ |
✓ |
|||
Botswana Identity Card (Omang) Number |
Yes |
✓ |
✓ |
✓ |
|||
Botswana Passport Number |
Yes |
✓ |
✓ |
✓ |
|||
Brazilian ID (CPF) |
Yes |
✓ |
✓ |
✓ |
|||
British Passport |
Yes |
✓ |
✓ |
✓ |
|||
Bulgarian ID (UCN) |
Yes |
✓ |
✓ |
✓ |
|||
Croatian ID (OIB) |
Yes |
✓ |
✓ |
✓ |
|||
Cyprus Tax Identification Number (TIC) |
Yes |
✓ |
✓ |
✓ |
|||
Czech/Slovak ID |
Yes |
✓ |
✓ |
✓ |
|||
Danish ID (CPR) |
Yes |
✓ |
✓ |
✓ |
|||
Dutch ID (BSN) |
Yes |
✓ |
✓ |
✓ |
|||
Estonian ID |
Yes |
✓ |
✓ |
✓ |
|||
Finnish ID (HETU) |
Yes |
✓ |
✓ |
✓ |
|||
French Driver's License |
Yes |
✓ |
✓ |
✓ |
✓ |
||
French ID |
Yes |
✓ |
✓ |
✓ |
✓ |
||
French INSEE |
Yes |
✓ |
✓ |
✓ |
✓ |
||
French Social Security Number |
Yes |
✓ |
✓ |
✓ |
✓ |
||
French Tax Identification Number (SPI) |
Yes |
✓ |
✓ |
✓ |
✓ |
||
German ID (Personalausweisnummer) |
Yes |
✓ |
✓ |
✓ |
|||
German Internal ID for Bank Transfers |
Yes |
✓ |
✓ |
✓ |
|||
German Social Security Number (Sozialversicherungsnummer) |
Yes |
✓ |
✓ |
✓ |
|||
German Tax Identification Number (Steuerliche Identifikationsnummer) |
Yes |
✓ |
✓ |
✓ |
|||
Greek ID |
Yes |
✓ |
✓ |
✓ |
|||
Hungarian Tax Identification Number |
Yes |
✓ |
✓ |
✓ |
|||
Irish ID (PPS) |
Yes |
✓ |
✓ |
✓ |
|||
Israeli ID |
Yes |
✓ |
✓ |
✓ |
|||
Italian Tax Identification Number |
Yes |
✓ |
✓ |
✓ |
|||
Japanese Personal Identification Number (both Personal and Corporate) |
Yes |
✓ |
✓ |
✓ |
✓ |
||
Latvian ID |
Yes |
✓ |
✓ |
✓ |
|||
Lithuanian ID |
Yes |
✓ |
✓ |
✓ |
|||
Luxembourg ID |
Yes |
✓ |
✓ |
✓ |
|||
Maltese ID |
Yes |
✓ |
✓ |
✓ |
|||
National Health Service (NHS) Number |
Yes |
✓ |
✓ |
✓ |
|||
New Zealand Bank Account |
Yes |
✓ |
✓ |
✓ |
|||
New Zealand Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
New Zealand IRD Number (Tax ID) |
Yes |
✓ |
✓ |
✓ |
|||
New Zealand NHI (National Health Index) Number |
Yes |
✓ |
✓ |
✓ |
|||
New Zealand Passport Number |
Yes |
✓ |
✓ |
✓ |
|||
Polish ID (PESEL) |
Yes |
✓ |
✓ |
✓ |
|||
Portuguese Tax Identification Number (NIF) |
Yes |
✓ |
✓ |
✓ |
|||
Romanian ID (CNP) |
Yes |
✓ |
✓ |
✓ |
|||
Singapore National Registration Identity Card (NRIC) |
Yes |
✓ |
✓ |
✓ |
|||
Slovenian ID (EMSO) |
Yes |
✓ |
✓ |
✓ |
|||
South African ID |
Yes |
✓ |
✓ |
✓ |
|||
Spanish Tax Identification Number |
Yes |
✓ |
✓ |
✓ |
|||
Swedish ID |
Yes |
✓ |
✓ |
✓ |
|||
Texas Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
U.K. ID (NINO) |
Yes |
✓ |
✓ |
✓ |
|||
USA California Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
USA Indiana Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
USA New York Driver's License |
Yes |
✓ |
✓ |
✓ |
|||
USA Social Security Number (SSN) |
Yes |
✓ |
✓ |
✓ |
Types of sensitive personal data
BlueXP classification can find the following sensitive personal information (SPii) in files.
The items in this category can be recognized only in English at this time.
-
Criminal Procedures Reference: Data concerning a natural person's criminal convictions and offenses.
-
Ethnicity Reference: Data concerning a natural person's racial or ethnic origin.
-
Health Reference: Data concerning a natural person's health.
-
ICD-9-CM Medical Codes: Codes used in the medical and health industry.
-
ICD-10-CM Medical Codes: Codes used in the medical and health industry.
-
Philosophical Beliefs Reference: Data concerning a natural person's philosophical beliefs.
-
Political Opinions Reference: Data concerning a natural person's political opinions.
-
Religious Beliefs Reference: Data concerning a natural person's religious beliefs.
-
Sex Life or Orientation Reference: Data concerning a natural person's sex life or sexual orientation.
Types of categories
BlueXP classification categorizes your data as follows.
Most of these categories can be recognized in English, German, and Spanish.
Category | Type | English | German | Spanish |
---|---|---|---|---|
Finance |
Balance Sheets |
✓ |
✓ |
✓ |
Purchase Orders |
✓ |
✓ |
✓ |
|
Invoices |
✓ |
✓ |
✓ |
|
Quarterly Reports |
✓ |
✓ |
✓ |
|
HR |
Background Checks |
✓ |
✓ |
|
Compensation Plans |
✓ |
✓ |
✓ |
|
Employee Contracts |
✓ |
✓ |
||
Employee Reviews |
✓ |
✓ |
||
Health |
✓ |
✓ |
||
Resumes |
✓ |
✓ |
✓ |
|
Legal |
NDAs |
✓ |
✓ |
✓ |
Vendor-Customer contracts |
✓ |
✓ |
✓ |
|
Marketing |
Campaigns |
✓ |
✓ |
✓ |
Conferences |
✓ |
✓ |
✓ |
|
Operations |
Audit Reports |
✓ |
✓ |
✓ |
Sales |
Sales Orders |
✓ |
✓ |
|
Services |
RFI |
✓ |
✓ |
|
RFP |
✓ |
✓ |
||
SOW |
✓ |
✓ |
✓ |
|
Training |
✓ |
✓ |
✓ |
|
Support |
Complaints and Tickets |
✓ |
✓ |
✓ |
The following Metadata is also categorized, and are identified in the same supported languages:
-
Application Data
-
Archive Files
-
Audio
-
Breadcrumbs from BlueXP classification
Business Application Data -
CAD Files
-
Code
-
Corrupted
-
Database and index files
-
Design Files
-
Email Application Data
-
Encrypted (files with a high entropy score)
-
Executables
-
Financial Application Data
-
Health Application Data
-
Images
-
Logs
-
Miscellaneous Documents
-
Miscellaneous Presentations
-
Miscellaneous Spreadsheets
-
Miscellaneous "Unknown"
-
Password Protected files
-
Structured Data
-
Videos
-
Zero-Byte Files
Types of files
BlueXP classification scans all files for category and metadata insights and displays all file types in the file types section of the dashboard.
But when BlueXP classification detects Personal Identifiable Information (PII), or when it performs a DSAR search, only the following file formats are supported:
.CSV, .DCM, .DICOM, .DOC, .DOCX, .JSON, .PDF, .PPTX, .RTF, .TXT, .XLS, .XLSX, Docs, Sheets, and Slides
Accuracy of information found
NetApp can't guarantee 100% accuracy of the personal data and sensitive personal data that BlueXP classification identifies. You should always validate the information by reviewing the data.
Based on our testing, the table below shows the accuracy of the information that BlueXP classification finds. We break it down by precision and recall:
- Precision
-
The probability that what BlueXP classification finds has been identified correctly. For example, a precision rate of 90% for personal data means that 9 out of 10 files identified as containing personal information, actually contain personal information. 1 out of 10 files would be a false positive.
- Recall
-
The probability for BlueXP classification to find what it should. For example, a recall rate of 70% for personal data means that BlueXP classification can identify 7 out of 10 files that actually contain personal information in your organization. BlueXP classification would miss 30% of the data and it won't appear in the dashboard.
We are constantly improving the accuracy of our results. Those improvements will be automatically available in future BlueXP classification releases.
Type | Precision | Recall |
---|---|---|
Personal data - General |
90%-95% |
60%-80% |
Personal data - Country identifiers |
30%-60% |
40%-60% |
Sensitive personal data |
80%-95% |
20%-30% |
Categories |
90%-97% |
60%-80% |