How to enable StorageGRID in your environment

Configure Dremio data source with StorageGRID

Contributors netapp-aronk

Dremio supports a variety of data sources, including cloud-based and on-premises object storage. You can configure Dremio to use StorageGRID as an object storage data source.

Configure Dremio data source

Prerequisites

  • A StorageGRID S3 endpoint URL, and a tenant S3 access key ID and secret access key.

  • StorageGRID configuration recommendation: leave compression disabled (it is disabled by default).
    During a query, Dremio uses byte-range GET requests to fetch different byte ranges of the same object concurrently. A typical byte-range request is 1MB in size. Compressed objects degrade byte-range GET performance.
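
To make the byte-range access pattern concrete, the following minimal sketch splits an object into 1MB ranges and produces the HTTP Range header value for each one. It only illustrates the request pattern described above and is not Dremio's implementation; the function name `byte_ranges` and the fixed chunk size are assumptions made for this example.

```python
def byte_ranges(object_size, chunk_size=1024 * 1024):
    """Return HTTP Range header values that cover an object in fixed-size chunks.

    Hypothetical helper for illustration only; the 1 MiB default mirrors the
    typical request size mentioned in the text.
    """
    if object_size < 0 or chunk_size <= 0:
        raise ValueError("object_size must be >= 0 and chunk_size must be > 0")

    ranges = []
    start = 0
    while start < object_size:
        # The end offset of an HTTP byte range is inclusive.
        end = min(start + chunk_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


if __name__ == "__main__":
    # A 2.5 MiB object, used here only as an example size.
    for header in byte_ranges(2621440):
        print(header)
```

For a 2.5 MiB object this yields three ranges, the last one shorter than the others. A client can issue these requests concurrently, which is the access pattern the compression recommendation is meant to protect.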

Instructions

  1. On the Dremio Datasets page, click the + sign to add a source, then select 'Amazon S3'.

  2. Enter a name for the new data source, then enter the StorageGRID S3 tenant access key ID and secret access key.

  3. Check the box 'Encrypt connection' if you connect to the StorageGRID S3 endpoint over HTTPS.
    If the S3 endpoint uses a certificate issued by a self-signed CA, follow the instructions in the Dremio guide to add the CA certificate to the Dremio server's <JAVA_HOME>/jre/lib/security
    Sample screenshot

    New Source - General

  4. Click 'Advanced Options', then check 'Enable compatibility mode'.

  5. Under Connection properties, click + Add Properties and add these s3a properties.

  6. The default value of fs.s3a.connection.maximum is 100. If your S3 datasets include large Parquet files with 100 or more columns, you must enter a value greater than 100. Refer to the Dremio guide for this setting.

    Name                         Value
    fs.s3a.endpoint              <StorageGRID S3 endpoint:port>
    fs.s3a.path.style.access     true
    fs.s3a.connection.maximum    <a value greater than 100>

    Sample screenshot

    New Source - Advanced Options

  7. Configure other Dremio options as per your organization or application requirements.

  8. Click the Save button to create this new data source.

  9. Once the StorageGRID data source has been added successfully, a list of its buckets is displayed in the left panel.
    Sample screenshot

    New data source added
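
If you keep these settings in a script or a deployment checklist, the connection properties from step 5 can be written down as a small helper. This is a minimal sketch, not part of Dremio or StorageGRID: the function name `storagegrid_s3a_properties`, the endpoint `sg.example.com:10443`, and the value 200 are placeholders chosen for this example. The check on the connection limit follows the table above, which asks for a value greater than 100.

```python
def storagegrid_s3a_properties(endpoint, max_connections):
    """Build the s3a connection properties entered under 'Advanced Options'.

    Hypothetical helper for illustration; it only assembles the three
    name/value pairs listed in the instructions and does not contact Dremio.
    """
    if not endpoint:
        raise ValueError("endpoint must be given as <StorageGRID S3 endpoint:port>")
    if max_connections <= 100:
        # The instructions ask for a value above the default of 100.
        raise ValueError("fs.s3a.connection.maximum should be greater than 100")

    return {
        "fs.s3a.endpoint": endpoint,
        "fs.s3a.path.style.access": "true",
        "fs.s3a.connection.maximum": str(max_connections),
    }


if __name__ == "__main__":
    # Placeholder endpoint and connection limit; replace with your own values.
    props = storagegrid_s3a_properties("sg.example.com:10443", 200)
    for name, value in props.items():
        print(f"{name} = {value}")
```

Each printed pair corresponds to one entry you add with + Add Properties in the Dremio source dialog.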

By Angela Cheng