Solution Technology

Contributors Download PDF of this page

The following figure illustrates the proposed conversational AI system architecture. You can interact with the system with either speech signal or text input. If spoken input is detected, Jarvis AI-as-service (AIaaS) performs ASR to produce text for Dialog Manager. Dialog Manager remembers states of conversation, routes text to corresponding services, and passes commands to Fulfillment Engine. Jarvis NLP Service takes in text, recognizes intents and entities, and outputs those intents and entity slots back to Dialog Manager, which then sends Action to Fulfillment Engine. Fulfillment Engine consists of third-party APIs or SQL databases that answer user queries. After receiving Result from Fulfillment Engine, Dialog Manager routes text to Jarvis TTS AIaaS to produce an audio response for the end-user. We can archive conversation history, annotate sentences with intents and slots for NeMo training such that NLP Service improves as more users interact with the system.

Error: Missing Graphic Image

Hardware Requirements

This solution was validated using one DGX Station and one AFF A220 storage system. Jarvis requires either a T4 or V100 GPU to perform deep neural network computations.

The following table lists the hardware components that are required to implement the solution as tested.

Hardware Quantity

T4 or V100 GPU

1

NVIDIA DGX Station

1

Software Requirements

The following table lists the software components that are required to implement the solution as tested.

Software Version or Other Information

NetApp ONTAP data management software

9.6

Cisco NX-OS switch firmware

7.0(3)I6(1)

NVIDIA DGX OS

4.0.4 - Ubuntu 18.04 LTS

NVIDIA Jarvis Framework

EA v0.2

NVIDIA NeMo

nvcr.io/nvidia/nemo:v0.10

Docker container platform

18.06.1-ce [e68fc7a]