Skip to main content
Data Infrastructure Insights

Configuring the AWS S3 as Storage data collector

Contributors netapp-alavoie

Data Infrastructure Insights uses the AWS S3 as Storage data collector to acquire inventory and performance data from AWS S3 instances.

Requirements

In order to collect data from AWS S3 as Storage devices, you must have the following information:

  • You must have one of the following:

    • The IAM Role for your AWS cloud account, if using IAM Role Authentication. IAM Role only applies if your acquisition unit is installed on an AWS instance.

    • The IAM Access Key ID and Secret Access Key for your AWS cloud account, if using IAM Access Key authentication.

  • You must have the "list organization" privilege

  • Port 443 HTTPS

  • AWS S3 Instances can be reported as a Virtual Machine, or (less naturally) a Host. EBS Volumes can be reported as both a VirtualDisk used by the VM, as well as a DataStore providing the Capacity for the VirtualDisk.

Access keys consist of an access key ID (for example, AKIAIOSFODNN7EXAMPLE) and a secret access key (for example, wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). You use access keys to sign programmatic requests that you make to AWS if you use the AWS SDKs, REST, or Query API operations. These keys are provided with your contract from Amazon.

Configuration

Enter data into the data collector fields according to the table below:

Field Description

AWS Region

Choose AWS region

IAM Role

For use only when acquired on an AU in AWS. See below for more information on IAM Role.

AWS IAM Access Key ID

Enter AWS IAM Access Key ID. Required if you do not use IAM Role.

AWS IAM Secret Access Key

Enter AWS IAM Secret Access Key. Required if you do not use IAM Role.

I understand AWS bills me for API requests

Check this to verify your understanding that AWS bills you for API requests made by Data Infrastructure Insights polling.

Advanced Configuration

Field Description

Cross Account Role

Role for accessing resources in different AWS accounts.

Inventory Poll Interval (min)

The default is 60

Choose 'Exclude' or 'Include' to Apply to Filter VMs by Tags

Specify whether to include or exclude VM's by Tags when collecting data. If ‘Include’ is selected, the Tag Key field can not be empty.

Performance Poll Interval (sec)

The default is 1800

IAM Access Key

Access keys are long-term credentials for an IAM user or the AWS account root user. Access keys are used to sign programmatic requests to the AWS CLI or AWS API (directly or using the AWS SDK).

Access keys consist of two parts: an access key ID and a secret access key. When you use IAM Access Key authentication (as opposed to IAM Role authentication), you must use both the access key ID and secret access key together for authentication of requests. For more information, see the Amazon documentation on Access Keys.

IAM Role

When using IAM Role authentication (as opposed to IAM Access Key authentication), you must ensure that the role you create or specify has the appropriate permissions needed to access your resources.

For example, if you create an IAM role named InstanceS3ReadOnly, you must set up the policy to grant S3 read-only list access permission to all S3 resources for this IAM role. Additionally, you must grant STS (Security Token Service) access so that this role is allowed to assume roles cross accounts.

After you create an IAM role, you can attach it when you create a new S3 instance or any existing S3 instance.

After you attach the IAM role InstanceS3ReadOnly to an S3 instance, you will be able to retrieve the temporary credential through instance metadata by IAM role name and use it to access AWS resources by any application running on this S3 instance.

For more information see the Amazon documentaiton on IAM Roles.

Note: IAM role can be used only when the Acquisition Unit is running in an AWS instance.

Mapping Amazon tags to Data Infrastructure Insights annotations

The AWS S3 as Storage data collector includes an option that allows you to populate Data Infrastructure Insights annotations with tags configured on S3. The annotations must be named exactly as the AWS tags. Data Infrastructure Insights will always populate same-named text-type annotations, and will make a "best attempt" to populate annotations of other types (number, boolean, etc). If your annotation is of a different type and the data collector fails to populate it, it may be necessary to remove the annotation and re-create it as a text type.

Note that AWS is case-sensitive, while Data Infrastructure Insights is case-insensitive. So if you create an annotation named "OWNER" in Data Infrastructure Insights, and tags named "OWNER", "Owner", and "owner" in S3, all of the S3 variations of "owner" will map to Cloud Insight's "OWNER" annotation.

Include Extra Regions

In the AWS Data Collector Advanced Configuration section, you can set the Include extra regions field to include additional regions, separated by comma or semi-colon. By default, this field is set to us-.*, which collects on all US AWS regions. To collect on all regions, set this field to .*.
If the Include extra regions field is empty, the data collector will collect on assets specified in the AWS Region field as specified in the Configuration section.

Collecting from AWS Child Accounts

Data Infrastructure Insights supports collection of child accounts for AWS within a single AWS data collector. Configuration for this collection is performed in the AWS environment:

  • You must configure each child account to have an AWS Role that allows the main account ID to access S3 details from the children account.

  • Each child account must have the role name configured as the same string.

  • Enter this role name string into the Data Infrastructure Insights AWS Data Collector Advanced Configuration section, in the Cross account role field.

  • The account where the collector is installed needs to have delegate access administrator privileges. See the AWS Documentation for more information.

Best Practice: It is highly recommended to assign the AWS predefined AmazonS3ReadOnlyAccess policy to the S3 main account. Also, the user configured in the data source should have at least the predefined AWSOrganizationsReadOnlyAccess policy assigned, in order to query AWS.

Please see the following for information on configuring your environment to allow Data Infrastructure Insights to collect from AWS child accounts:

Troubleshooting

Additional information on this Data Collector may be found from the Support page or in the Data Collector Support Matrix.