Deploy NetApp Data Classification in the cloud using the NetApp Console
You can deploy NetApp Data Classification in the cloud with the NetApp Console. The Console deploys the Data Classification instance in the same cloud provider network as the Console agent.
Note that you can also install Data Classification on a Linux host that has internet access. This type of installation may be a good option if you prefer to scan on-premises ONTAP systems using a Data Classification instance that's also located on premises, but this is not a requirement. The software functions exactly the same way regardless of which installation method you choose.
Quick start
Get started quickly by following these steps, or scroll down to the remaining sections for full details.

If you don't already have a Console agent, create one. See creating a Console agent in AWS, creating a Console agent in Azure, or creating a Console agent in GCP.
You can also install the Console agent on-premises on a Linux host in your network or on a Linux host in the cloud.

Ensure that your environment can meet the prerequisites. This includes outbound internet access for the instance, connectivity between the Console agent and Data Classification over port 443, and more. See the complete list of prerequisites.

Launch the installation wizard to deploy the Data Classification instance in the cloud.
Create a Console agent
If you don't already have a Console agent, create a Console agent in your cloud provider. See creating a Console agent in AWS, creating a Console agent in Azure, or creating a Console agent in GCP. In most cases you will have a Console agent set up before you attempt to activate Data Classification because most Console features require a Console agent, but there are cases where you'll need to set one up now.
There are some scenarios where you have to use a Console agent that's deployed in a specific cloud provider:
- When scanning data in Cloud Volumes ONTAP in AWS or Amazon FSx for ONTAP buckets, you use a Console agent in AWS.
- When scanning data in Cloud Volumes ONTAP in Azure or in Azure NetApp Files, you use a Console agent in Azure.
  - For Azure NetApp Files, it must be deployed in the same region as the volumes you wish to scan.
- When scanning data in Cloud Volumes ONTAP in GCP, you use a Console agent in GCP.
On-prem ONTAP systems, NetApp file shares, and databases can be scanned when using any of these cloud Console agents.
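The provider rules above can be sketched as a small lookup. This is only an illustration: the data source labels (`cvo-aws`, `fsx-ontap`, and so on) are hypothetical names chosen for this example, not identifiers used by the Console.

```python
# Hypothetical labels for the data sources named above; not product identifiers.
REQUIRED_AGENT_PROVIDER = {
    "cvo-aws": "AWS",
    "fsx-ontap": "AWS",
    "cvo-azure": "Azure",
    "azure-netapp-files": "Azure",
    "cvo-gcp": "GCP",
}

# Sources that can be scanned from a Console agent in any of the clouds.
ANY_CLOUD_SOURCES = {"on-prem-ontap", "netapp-file-share", "database"}


def required_agent_providers(sources):
    """Return the set of cloud providers that need their own Console agent."""
    providers = set()
    for source in sources:
        if source in REQUIRED_AGENT_PROVIDER:
            providers.add(REQUIRED_AGENT_PROVIDER[source])
        elif source not in ANY_CLOUD_SOURCES:
            raise ValueError(f"unknown data source: {source}")
    return providers


if __name__ == "__main__":
    # Two providers are returned, so two Console agents would be needed.
    print(required_agent_providers(["cvo-aws", "azure-netapp-files", "database"]))
```

An empty result means that none of the listed sources ties you to a specific provider, so any of the cloud Console agents can be used.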
Note that you can also install the Console agent on-premises on a Linux host in your network or in the cloud. Some users planning to install Data Classification on-prem may also choose to install the Console agent on-premises.
As you can see, there may be some situations where you need to use multiple Console agents.
Data Classification does not impose a limit on the amount of data it can scan. Each Console agent supports scanning and displaying 500 TiB of data. To scan more than 500 TiB of data, install another Console agent, then deploy another Data Classification instance. The Console UI displays data from a single Console agent. For tips on viewing data from multiple Console agents, see Work with multiple Console agents.
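As a rough planning aid, the 500 TiB per-agent figure translates into a simple ceiling division. The function name and the example totals below are assumptions made for illustration only.

```python
import math

TIB_PER_AGENT = 500  # scanning and display capacity per Console agent, as stated above


def agents_needed(total_tib):
    """Estimate how many Console agents (each with its own Data Classification
    instance) are needed to scan total_tib of data."""
    if total_tib < 0:
        raise ValueError("total_tib must be non-negative")
    # At least one agent is always required, even for very small data sets.
    return max(1, math.ceil(total_tib / TIB_PER_AGENT))


if __name__ == "__main__":
    print(agents_needed(1200))  # 3
    print(agents_needed(500))   # 1
```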
Government region support
Data Classification is supported when the Console agent is deployed in a Government region (AWS GovCloud, Azure Gov, or Azure DoD). When deployed in this manner, Data Classification has the following restrictions:
Prerequisites
Review the following prerequisites to make sure that you have a supported configuration before you deploy Data Classification in the cloud. When you deploy Data Classification in the cloud, it's located in the same subnet as the Console agent.
- Enable outbound internet access from Data Classification
Data Classification requires outbound internet access. If your virtual or physical network uses a proxy server for internet access, ensure that the Data Classification instance has outbound internet access to contact the following endpoints. The proxy must be non-transparent. Transparent proxies are not currently supported.
Review the appropriate table below depending on whether you are deploying Data Classification in AWS, Azure, or GCP.
Endpoints | Purpose |
---|---|
https://api.console.netapp.com | Communication with the Console service, which includes NetApp accounts. |
https://netapp-cloud-account.auth0.com | Communication with the Console website for centralized user authentication. |
https://cloud-compliance-support-netapp.s3.us-west-2.amazonaws.com | Provides access to software images, manifests, and templates. |
https://kinesis.us-east-1.amazonaws.com | Enables NetApp to stream data from audit records. |
https://cognito-idp.us-east-1.amazonaws.com | Enables Data Classification to access and download manifests and templates, and to send logs and metrics. |

Endpoints | Purpose |
---|---|
https://api.console.netapp.com | Communication with the Console service, which includes NetApp accounts. |
https://netapp-cloud-account.auth0.com | Communication with the Console website for centralized user authentication. |
https://support.compliance.api.console.netapp.com/ | Provides access to software images, manifests, and templates, and sends logs and metrics. |
https://support.compliance.api.console.netapp.com/ | Enables NetApp to stream data from audit records. |

Endpoints | Purpose |
---|---|
https://api.console.netapp.com | Communication with the Console service, which includes NetApp accounts. |
https://netapp-cloud-account.auth0.com | Communication with the Console website for centralized user authentication. |
https://support.compliance.api.console.netapp.com/ | Provides access to software images, manifests, and templates, and sends logs and metrics. |
https://support.compliance.api.console.netapp.com/ | Enables NetApp to stream data from audit records. |
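
Before deploying, it can help to confirm that a host in the target subnet can open a TCP connection to the listed endpoints. The following is a minimal sketch, not an official tool: the function names are made up for this example, and only the two endpoints common to all three tables are listed, so add the provider-specific ones for your deployment. It opens direct connections, so it does not exercise a proxy; if your network uses a proxy, treat a failure here as inconclusive.

```python
import socket
from urllib.parse import urlparse

# Endpoints common to all three tables above; extend with provider-specific ones.
ENDPOINTS = [
    "https://api.console.netapp.com",
    "https://netapp-cloud-account.auth0.com",
]


def host_and_port(url):
    """Extract the host and port from a URL, defaulting to 443 for HTTPS."""
    parsed = urlparse(url)
    return parsed.hostname, parsed.port or 443


def tcp_connect(host, port, timeout=5.0):
    """Return True if a direct TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_endpoints(urls, connect=tcp_connect):
    """Map each URL to True or False depending on whether it is reachable."""
    results = {}
    for url in urls:
        host, port = host_and_port(url)
        results[url] = connect(host, port)
    return results


if __name__ == "__main__":
    for url, ok in check_endpoints(ENDPOINTS).items():
        print(("reachable   " if ok else "UNREACHABLE ") + url)
```

The `connect` parameter exists so the check can be swapped out, for example with a proxy-aware probe.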
- Ensure that Data Classification has the required permissions
Ensure that Data Classification has permissions to deploy resources and create security groups for the Data Classification instance.
- Ensure that the Console agent can access Data Classification
Ensure connectivity between the Console agent and the Data Classification instance. The security group for the Console agent must allow inbound and outbound traffic over port 443 to and from the Data Classification instance. This connection enables deployment of the Data Classification instance and enables you to view information in the Compliance and Governance tabs. Data Classification is supported in Government regions in AWS and Azure.
Additional inbound and outbound security group rules are required for AWS and AWS GovCloud deployments. See Rules for the Console agent in AWS for details.
Additional inbound and outbound security group rules are required for Azure and Azure Government deployments. See Rules for the Console agent in Azure for details.
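The port 443 requirement can be expressed as a check over a list of security group rules. The sketch below uses a simplified, hypothetical `Rule` structure invented for this example; it is not the AWS or Azure rule format, and it ignores source and destination address ranges, which a real review would also need to cover.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Simplified, hypothetical security group rule (not a cloud provider schema)."""
    direction: str  # "inbound" or "outbound"
    protocol: str   # for example "tcp"
    from_port: int
    to_port: int


def allows_443_both_ways(rules):
    """Return True if the rules allow TCP port 443 both inbound and outbound."""
    def covers(direction):
        return any(
            r.direction == direction
            and r.protocol == "tcp"
            and r.from_port <= 443 <= r.to_port
            for r in rules
        )
    return covers("inbound") and covers("outbound")


if __name__ == "__main__":
    rules = [
        Rule("inbound", "tcp", 443, 443),
        Rule("outbound", "tcp", 0, 65535),
    ]
    print(allows_443_both_ways(rules))  # True
```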
- Ensure you can keep Data Classification running
The Data Classification instance needs to stay on to continuously scan your data.
- Ensure web browser connectivity to Data Classification
After Data Classification is enabled, ensure that users access the Console interface from a host that has a connection to the Data Classification instance.
The Data Classification instance uses a private IP address to ensure that the indexed data isn't accessible to the internet. As a result, the web browser that you use to access the Console must have a connection to that private IP address. That connection can come from a direct connection to your cloud provider (for example, a VPN), or from a host that's inside the same network as the Data Classification instance.
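To illustrate the same-network case, the sketch below uses Python's standard `ipaddress` module to confirm that an address is private and that it falls inside the network of the host running the browser. The addresses and function names are made-up examples. A host that reaches the instance through a VPN or other routed connection would not need to pass the second check.

```python
import ipaddress


def is_private_address(ip):
    """Return True if ip is in a private (non-internet-routable) range."""
    return ipaddress.ip_address(ip).is_private


def host_shares_network(instance_ip, host_network_cidr):
    """Return True if instance_ip belongs to the given network (CIDR notation)."""
    network = ipaddress.ip_network(host_network_cidr, strict=False)
    return ipaddress.ip_address(instance_ip) in network


if __name__ == "__main__":
    instance_ip = "10.0.1.25"  # example value only
    print(is_private_address(instance_ip))                   # True
    print(host_shares_network(instance_ip, "10.0.0.0/16"))   # True
```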
- Check your vCPU limits
Ensure that your cloud provider's vCPU limit allows for the deployment of an instance with the necessary number of cores. You'll need to verify the vCPU limit for the relevant instance family in the region where the Console is running. See the required instance types.
See the following links for more details on vCPU limits:
Deploy Data Classification in the cloud
Follow these steps to deploy an instance of Data Classification in the cloud. The Console agent will deploy the instance in the cloud, and then install Data Classification software on that instance.
In regions where the default instance type isn't available, Data Classification runs on an alternate instance type.
- From the main page of Data Classification, select Deploy Classification On-Premises or Cloud.
- From the Installation page, select Deploy > Deploy to use the "Large" instance size and start the cloud deployment wizard.
- The wizard displays progress as it goes through the deployment steps. When inputs are required or if it encounters issues, you are prompted.
- When the instance is deployed and Data Classification is installed, select Continue to configuration to go to the Configuration page.

- From the main page of Data Classification, select Deploy Classification On-Premises or Cloud.
- Select Deploy to start the cloud deployment wizard.
- The wizard displays progress as it goes through the deployment steps. It will stop and prompt for input if it runs into any issues.
- When the instance is deployed and Data Classification is installed, select Continue to configuration to go to the Configuration page.

- From the main page of Data Classification, select Governance > Classification.
- Select Deploy Classification On-Premises or Cloud.
- Select Deploy to start the cloud deployment wizard.
- The wizard displays progress as it goes through the deployment steps. It will stop and prompt for input if it runs into any issues.
- When the instance is deployed and Data Classification is installed, select Continue to configuration to go to the Configuration page.
The Console deploys the Data Classification instance in your cloud provider.
Upgrades to the Console agent and Data Classification software are automated as long as the instances have internet connectivity.
From the Configuration page you can select the data sources that you want to scan.