Manage knowledge bases

Contributors netapp-mwallis netapp-tonacki netapp-bcammett netapp-rlithman

After you create a knowledge base, you can view the knowledge base details, modify the knowledge base, integrate additional data sources, or delete the knowledge base.

View information about a knowledge base

You can view information about the settings for a knowledge base and the data sources that are integrated with it.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the workload factory navigation menu, select AI.

  3. Select the knowledge base that you want to view.

    If defined, the conversation starters that are currently being used display in the right pane.

  4. To view knowledge base details, select the option button and select Manage knowledge base.

    This page displays the published status, embedding status of the data sources, embedding mode, the list of all embedded data sources, and more.

    The Actions menu enables you to manage the knowledge base if you want to make any changes.

Edit a knowledge base

You can update a knowledge base by changing some settings, or you can add or remove data sources.

Each time you add, modify, or remove data sources from the knowledge base, you must sync the data source so that it is re-indexed to the knowledge base. Syncing is incremental, so Amazon Bedrock only processes the objects in your FSx for ONTAP volume that have been added, modified, or deleted since the last sync.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to update.

  3. Select the option button and select Manage knowledge base.

    This page displays the published status, embedding status of the data sources, embedding mode, the list of all embedded data sources, and more.

  4. Select the Actions menu and select Edit knowledge base.

  5. In the Edit knowledge base page, you can change the knowledge base name, description, embedding model, chat model, data guardrails setting, whether conversation starters are created automatically or manually, and the snapshot policy used for the volume that contains the knowledge base.

    If you use Manual mode for conversation starters, you can change conversation starters here as well.

    Note Every knowledge base scan, including embedding, incurs costs. If you enable data guardrails after a knowledge base has been created, the knowledge base is scanned again, which incurs additional costs.
  6. Select Save after you have made your changes.

Protect a knowledge base with snapshots

You can protect your knowledge base data by taking and restoring snapshots of your knowledge base volumes. You can restore from a snapshot to revert to a previous version of the knowledge base at any time.

Snapshots can be faster and more storage-efficient than backups, and enable you to protect each knowledge base using a different protection policy. Some of the scenarios where snapshots can be useful are:

  • Accidental data loss or corruption

  • Recovering from incorrect data being ingested into the knowledge base

  • Testing different data sources or chunking strategies, and quickly reverting when the testing is complete

Take a snapshot of a knowledge base volume

You can save the state of a knowledge base by taking a manual snapshot of the knowledge base volume.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to protect.

  3. Select the option button and select Manage knowledge base.

    This page displays the published status, embedding status of the data sources, embedding mode, the list of all embedded data sources, and more.

  4. Select the Actions menu and select Snapshot > Create new snapshot.

    A snapshot of the knowledge base is created.

Restore a snapshot of a knowledge base volume

You can restore a manual or scheduled snapshot of a knowledge base volume at any time.

Note You cannot restore a snapshot using the Generative AI workloads UI if the database stored on the volume is corrupt or has been deleted. As a workaround, you can restore the snapshot using the ONTAP CLI on the ONTAP cluster where the volume is hosted.
Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to restore.

  3. Select the option button and select Manage knowledge base.

    This page displays the published status, embedding status of the data sources, embedding mode, the list of all embedded data sources, and more.

  4. Select the Actions menu and select Snapshot > Restore snapshot.

    The snapshot selection dialog appears, where you can see a list of the snapshots that have been created for this knowledge base.

  5. (Optional) Deselect the Pause running and scheduled scans after restoring the snapshot option if you want scheduled and currently running data source scans to continue after the snapshot is restored.

    This option is enabled by default to ensure that a scan doesn't happen while the knowledge base is in a partially restored state, or that a scan doesn't update a freshly restored knowledge base with older data.

  6. Select the snapshot you want to restore from the list.

  7. Select Restore.

Add additional data sources to a knowledge base

You can embed additional data sources in your knowledge base to populate it with additional organization data.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base where you want to add the data source.

  3. Select the option button and select Add data source.

  4. Select a file system: Select the FSx for ONTAP file system where your data source files reside and select Next.

  5. Select a volume: Select the volume on which your data source files reside and select Next.

    When selecting files stored using the SMB protocol, you'll need to enter the Active Directory information, which includes the domain, IP address, user name, and password.

  6. Select a data source: Select the data source location based on where you have saved the files, and then select Next. The location can be an entire volume or a specific folder or subfolder in the volume.

  7. Define AI parameters: In the Chunking strategy section, define how the GenAI engine splits data source content into chunks when the data source is integrated with a knowledge base. You can choose one of the following strategies:

    • Multi-sentence chunking: Organizes information from your data source into sentence-defined chunks. You can choose how many sentences make up each chunk (up to 100).

    • Overlap-based chunking: Organizes information from your data source into character-defined chunks that can overlap neighboring chunks. You can choose the size of each chunk in characters, and how much each chunk overlaps with adjacent chunks. You can configure a chunk size of between 50 and 3000 characters, and an overlap percentage of between 1% and 99%.

      Note Choosing a high overlap percentage can greatly increase storage requirements with only slight improvements in retrieval accuracy.
  8. In the Permission aware section, which is available only when the data source you selected is on a volume that uses the SMB protocol, you can enable or disable the selection:

    • Enabled: Users of the chatbot who access this knowledge base will only get responses to queries from data sources to which they have access.

    • Disabled: Users of the chatbot will receive responses using content from all integrated data sources.

  9. Select Add to add this data source to your knowledge base.
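
The two chunking strategies described in step 7 can be illustrated with a minimal Python sketch. This is not the GenAI engine's implementation: the function names, the naive punctuation-based sentence splitter, the rounding of the overlap, and the lower bound of one sentence per chunk are all assumptions made for illustration. Only the limits (up to 100 sentences, 50 to 3000 characters, 1% to 99% overlap) come from the text above.

```python
import re


def multi_sentence_chunks(text: str, sentences_per_chunk: int) -> list[str]:
    """Group consecutive sentences into chunks of a fixed sentence count."""
    # Upper limit of 100 is from the documentation; the lower limit of 1 is assumed.
    if not 1 <= sentences_per_chunk <= 100:
        raise ValueError("sentences_per_chunk must be between 1 and 100")
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [
        " ".join(sentences[i:i + sentences_per_chunk])
        for i in range(0, len(sentences), sentences_per_chunk)
    ]


def overlap_chunks(text: str, chunk_size: int, overlap_percent: int) -> list[str]:
    """Cut text into fixed-size character windows that overlap their neighbors."""
    if not 50 <= chunk_size <= 3000:
        raise ValueError("chunk_size must be between 50 and 3000 characters")
    if not 1 <= overlap_percent <= 99:
        raise ValueError("overlap_percent must be between 1 and 99")
    # Characters shared with the next chunk (rounded down; rounding is an assumption).
    overlap = chunk_size * overlap_percent // 100
    # Each new chunk starts this many characters after the previous one.
    step = max(1, chunk_size - overlap)
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reaches the end of the text
    return chunks
```

Under these assumptions, a 1000-character text split into 100-character chunks yields 11 chunks at 10% overlap and 91 chunks at 90% overlap, which shows why a high overlap percentage increases storage requirements so sharply.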

Result

The data source is integrated into your knowledge base.
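
The effect of the Permission aware setting can be pictured as a filter applied to retrieved content before the chatbot answers. The sketch below is a simplified model only: the `Chunk` type, the `allowed_users` allow-list, and the `visible_chunks` function are hypothetical names, and a plain set of user names stands in for whatever access checks the service actually performs against SMB permissions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """A retrieved piece of content plus who may read its source (hypothetical model)."""
    text: str
    source: str
    allowed_users: frozenset[str]


def visible_chunks(chunks: list[Chunk], user: str, permission_aware: bool) -> list[Chunk]:
    """Return the chunks a chatbot may use to answer this user's query."""
    if not permission_aware:
        # Disabled: content from all integrated data sources is eligible.
        return list(chunks)
    # Enabled: keep only content from sources the user can access.
    return [c for c in chunks if user in c.allowed_users]
```

For example, if a finance document allows only `alice` and a handbook allows both `alice` and `bob`, then with the setting enabled a query from `bob` draws only on the handbook, and with it disabled the query can draw on both.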

Synchronize your data sources with a knowledge base

Data sources are synchronized with the associated knowledge base automatically once a day so that any data source changes are reflected in the chatbot. If you make changes to any of your data sources and you'd like to synchronize the data immediately, you can perform an on-demand synchronization.

Syncing is incremental, so Amazon Bedrock only processes the objects in your data sources that have been added, modified, or deleted since the last sync.
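
As a rough illustration of what "incremental" means here, the following Python sketch compares the state recorded at the last sync with the current state and lists only the objects that need processing. It is not how Amazon Bedrock implements syncing: the function name and the idea of a per-file fingerprint (for example, a modification time or content hash) are assumptions made for this example.

```python
def plan_incremental_sync(
    previous: dict[str, str], current: dict[str, str]
) -> dict[str, list[str]]:
    """Compare two path -> fingerprint maps and list what a sync must process.

    `previous` is the state recorded at the last sync and `current` is the
    state now. The fingerprint values are hypothetical stand-ins for any
    marker that changes when a file's content changes.
    """
    added = sorted(p for p in current if p not in previous)
    deleted = sorted(p for p in previous if p not in current)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    # Unchanged objects appear in none of the lists, so they are skipped.
    return {"added": added, "modified": modified, "deleted": deleted}
```

When nothing has changed since the last sync, all three lists are empty and there is nothing to re-index.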

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to synchronize.

  3. Select the option button and select Manage knowledge base.

  4. Select the Actions menu and select Scan now.

    You'll see a message that your data sources are being scanned, and a final message when the scan is complete.

Result

The knowledge base is synchronized with the attached data sources and any active chatbot will start using the newest information from your data sources.

Evaluate chat models before creating a knowledge base

You can evaluate the available foundational chat models before creating a knowledge base so you can see which model works best for your implementation. Because model support varies by AWS region, refer to the AWS documentation to verify which models you can use in the regions where you plan to deploy your knowledge base.

Note This functionality is available only when no knowledge bases exist in the Knowledge bases inventory page.
Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, locate the chat model selector for the chatbot on the right side of the page.

  3. Select the chat model from the list and enter a set of questions in the prompt area to see how the chatbot responds.

  4. Try multiple models to see which model is best for your implementation.

Result

You can now use the chat model that performed best when you create your knowledge base.

Unpublish your knowledge base

After you've published your knowledge base so that it can be integrated with a chatbot application, you can unpublish it if you want to disable the chatbot application from accessing the knowledge base.

Unpublishing the knowledge base stops any chat applications from working. The unique API endpoint at which the knowledge base was accessible is disabled.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to unpublish.

  3. Select the option button and select Manage knowledge base.

    This page displays the published status, embedding status of the data sources, embedding mode, and the list of all embedded data sources.

  4. Select the Actions menu and select Unpublish.

Result

The knowledge base is disabled and is no longer accessible by a chatbot application.

Delete a knowledge base

If you no longer need a knowledge base, you can delete it. When you delete a knowledge base, it is removed from workload factory and the volume that contains the knowledge base is deleted. Any applications or chatbots that are using the knowledge base will stop working. Deleting a knowledge base is not reversible.

To fully delete all resources associated with a knowledge base, you should also disassociate the knowledge base from any agents that use it.

Steps
  1. Log in to workload factory using one of the console experiences.

  2. From the Knowledge bases inventory page, select the knowledge base that you want to delete.

  3. Select the option button and select Manage knowledge base.

  4. Select the Actions menu and select Delete knowledge base.

  5. In the Delete knowledge base dialog, confirm that you want to delete it and select Delete.

Result

The knowledge base is removed from workload factory and its associated volume is deleted.