Data-to-RAG quick start for AI Data Engine

04/28/2026

Use this workflow to go from a newly deployed AI Data Engine (AIDE) system to a working retrieval-augmented generation (RAG) endpoint. The workflow shows how storage administrators, data engineers, and data scientists collaborate using ONTAP System Manager and AIDE Console.

The following instructions assume a NetApp DCN-based AIDE deployment.

Before you begin

You've installed and added NetApp DCNs to the ONTAP cluster.
You've installed AIDE and activated the AIDE premium services license for vectorization and guardrails features.
You've configured OpenID Connect (OIDC) and mapped the admin, data engineer, and data scientist roles.
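
The role mapping in the last prerequisite can be pictured as a lookup from identity-provider groups to roles. The following is a minimal sketch for planning purposes only: the `groups` claim name, the group names, and the role identifiers are all assumptions made for illustration, not values defined by AIDE. The actual mapping is configured in the product.

```python
# Hypothetical mapping from identity-provider groups to the three roles
# named in the prerequisites. Group and role names are placeholders.
GROUP_TO_ROLE = {
    "storage-admins": "admin",
    "data-engineering": "data_engineer",
    "data-science": "data_scientist",
}


def roles_for_claims(claims: dict) -> set[str]:
    """Return the roles granted by the groups listed in decoded ID-token claims.

    Assumes the identity provider emits a "groups" claim holding a list of
    group names. Many providers can, but the claim name is configurable.
    """
    groups = claims.get("groups", [])
    return {GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE}


# A user in one mapped group and one unmapped group gets only the mapped role.
example_claims = {"sub": "user-123", "groups": ["data-science", "finance"]}
print(roles_for_claims(example_claims))
```

Writing the mapping down this way before configuring it makes it easy to spot users who would end up with no role, or with more roles than intended.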

Define data scope and governance

As a storage administrator or security administrator, you want to prepare the environment in AIDE Console and ONTAP System Manager:

Create one or more workspaces from local and remote data sources.
Configure classifiers and guardrail policies in AIDE Console.
Assign data engineer and data scientist access to the workspaces.

Explore workspace metadata

As a data engineer or data scientist, you want to explore the workspace metadata using AIDE Console:

Explore workspace metadata to understand available content.
Define one or more logical subsets of data that should feed RAG (for example, support articles, product manuals, or anonymized clinical notes).
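
A logical subset is, in effect, a filter over file metadata. The sketch below illustrates that idea in plain Python so the selection criteria can be reasoned about before they are entered in AIDE Console. The `FileRecord` fields, the tag names, and the `select_subset` helper are hypothetical and do not reflect the AIDE metadata schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    path: str
    extension: str
    tags: frozenset = field(default_factory=frozenset)


def select_subset(records, required_tags, allowed_extensions, excluded_tags=frozenset()):
    """Keep records that carry every required tag, have an allowed
    extension, and carry none of the excluded tags."""
    required = frozenset(required_tags)
    excluded = frozenset(excluded_tags)
    allowed = {e.lower() for e in allowed_extensions}
    return [
        r for r in records
        if required <= r.tags
        and r.extension.lower() in allowed
        and not (excluded & r.tags)
    ]


records = [
    FileRecord("/support/kb/reset-password.md", "md", frozenset({"support"})),
    FileRecord("/support/kb/customer-list.csv", "csv", frozenset({"support", "pii"})),
    FileRecord("/manuals/model-x.pdf", "pdf", frozenset({"manual"})),
]

# "Support articles" subset: support-tagged documents, excluding anything tagged pii.
support_articles = select_subset(records, {"support"}, {"md", "pdf"}, excluded_tags={"pii"})
print([r.path for r in support_articles])
```

Stating the exclusions explicitly (here, a placeholder `pii` tag) keeps the subset definition aligned with the classifiers and guardrail policies set up by the administrator.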

Create and publish a data collection

As a data engineer or data scientist, you want to turn the chosen subset into a RAG-ready collection:

Create a data collection from the workspace using selected filters.
Publish the data collection and monitor indexing until it reaches Ready state.
Copy the retrieval endpoint URI for the chosen collection and provide it to data scientists or application developers.
View data collection status and vector footprint as needed.
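
For teams that script around these steps, the following sketch shows the general pattern of waiting for the Ready state and then preparing a request for the copied endpoint URI. It is not the AIDE API: the `get_state` callable, the intermediate state name `Indexing`, the JSON body fields (`query`, `top_k`), the bearer-token header, and the example URI are all assumptions. Check the actual request contract of your retrieval endpoint before using anything like this.

```python
import json
import time
import urllib.request


def wait_until_ready(get_state, poll_seconds=30.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll get_state() until it returns "Ready" or the timeout elapses.

    get_state is any zero-argument callable that returns the collection's
    current state as a string; how it obtains that state is up to you.
    """
    waited = 0.0
    while True:
        state = get_state()
        if state == "Ready":
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"collection still in state {state!r} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds


def build_retrieval_request(endpoint_uri, question, top_k=5, token=None):
    """Build (but do not send) a POST request for the copied endpoint URI.

    The body shape and the Authorization header are illustrative assumptions.
    """
    body = json.dumps({"query": question, "top_k": top_k}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(endpoint_uri, data=body, headers=headers, method="POST")


# Simulated state sequence standing in for a real status lookup.
states = iter(["Indexing", "Indexing", "Ready"])
print(wait_until_ready(lambda: next(states), poll_seconds=0.0))

# Placeholder URI; substitute the URI copied from the data collection.
request = build_retrieval_request(
    "https://aide.example.com/retrieve/placeholder",
    "How do I reset a password?",
)
print(request.get_method(), request.full_url)
```

Passing the status lookup in as a callable keeps the polling logic independent of how status is actually obtained, and building the request separately from sending it makes the payload easy to inspect before any traffic reaches the endpoint.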
