Manage S3 Select for tenant accounts
You can allow certain S3 tenants to use S3 Select to issue SelectObjectContent requests on individual objects.
S3 Select provides an efficient way to search through large amounts of data without having to deploy a database and associated resources to enable searches. It also reduces the cost and latency of retrieving data.
What is S3 Select?
S3 Select allows S3 clients to use SelectObjectContent requests to filter and retrieve only the data needed from an object. The StorageGRID implementation of S3 Select includes a subset of S3 Select commands and features.
Considerations and requirements for using S3 Select
Grid administration requirements
The grid administrator must grant tenants S3 Select ability. Select Allow S3 Select when creating a tenant or editing a tenant.
Object format requirements
The object you want to query must be in one of the following formats:
-
CSV. Can be used as is or compressed into GZIP or BZIP2 archives.
-
Parquet. Additional requirements for Parquet objects:
-
S3 Select supports only columnar compression using GZIP or Snappy. S3 Select doesn't support whole-object compression for Parquet objects.
-
S3 Select doesn't support Parquet output. You must specify the output format as CSV or JSON.
-
The maximum uncompressed row group size is 512 MB.
-
You must use the data types specified in the object's schema.
-
You can't use INTERVAL, JSON, LIST, TIME, or UUID logical types.
-
Endpoint requirements
The SelectObjectContent request must be sent to a StorageGRID load balancer endpoint.
The Admin and Gateway Nodes used by the endpoint must be one of the following:
-
An SG100 or SG1000 appliance node
-
A VMware-based software node
-
A bare metal node running a kernel with cgroup v2 enabled
General considerations
Queries can't be sent directly to Storage Nodes.
SelectObjectContent requests can decrease load-balancer performance for all S3 clients and all tenants. Enable this feature only when required and only for trusted tenants. |
See the instructions for using S3 Select.
To view Grafana charts for S3 Select operations over time, select SUPPORT > Tools > Metrics in the Grid Manager.