NetApp Solutions

Solution Technology

Contributors kevin-hoke

The following figure illustrates the proposed conversational AI system architecture. You can interact with the system through either speech or text input. If spoken input is detected, Jarvis AI-as-service (AIaaS) performs automatic speech recognition (ASR) to produce text for Dialog Manager. Dialog Manager tracks the state of the conversation, routes text to the corresponding services, and passes commands to Fulfillment Engine. Jarvis NLP Service takes in text, recognizes intents and entities, and returns those intents and entity slots to Dialog Manager, which then sends an Action to Fulfillment Engine. Fulfillment Engine consists of third-party APIs or SQL databases that answer user queries. After receiving the Result from Fulfillment Engine, Dialog Manager routes text to Jarvis text-to-speech (TTS) AIaaS to produce an audio response for the end user. Conversation history can be archived and its sentences annotated with intents and slots for NeMo training, so that NLP Service improves as more users interact with the system.

Figure: Proposed conversational AI system architecture.
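The request flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Jarvis API: the `DialogManager` class, the `NLPResult` type, and all `stub_*` functions are hypothetical stand-ins for the ASR, NLP, Fulfillment Engine, and TTS services, and the weather intent is an invented example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class NLPResult:
    """Intent and entity slots returned by the (stubbed) NLP service."""
    intent: str
    slots: Dict[str, str]


@dataclass
class DialogManager:
    """Hypothetical dialog manager: keeps history and routes between services."""
    asr: Callable[[bytes], str]                     # speech -> text
    nlp: Callable[[str], NLPResult]                 # text -> intent + slots
    fulfill: Callable[[str, Dict[str, str]], str]   # action -> result text
    tts: Callable[[str], bytes]                     # text -> audio
    history: List[dict] = field(default_factory=list)

    def handle(self, text: Optional[str] = None,
               audio: Optional[bytes] = None) -> bytes:
        # Spoken input goes through ASR first; text input skips that step.
        if text is None:
            if audio is None:
                raise ValueError("provide either text or audio")
            text = self.asr(audio)
        parsed = self.nlp(text)
        result = self.fulfill(parsed.intent, parsed.slots)
        # Archive the annotated turn so it could later be used for training.
        self.history.append({"text": text, "intent": parsed.intent,
                             "slots": parsed.slots, "result": result})
        return self.tts(result)


# Stub services (assumptions for illustration only).
def stub_asr(audio: bytes) -> str:
    return audio.decode("utf-8")  # pretend the audio bytes are the transcript


def stub_nlp(text: str) -> NLPResult:
    if "weather" in text.lower() and " in " in text:
        city = text.rstrip("?").split(" in ")[-1]
        return NLPResult("get_weather", {"city": city})
    return NLPResult("unknown", {})


def stub_fulfill(intent: str, slots: Dict[str, str]) -> str:
    if intent == "get_weather":
        return f"It is sunny in {slots['city']}."
    return "Sorry, I did not understand."


def stub_tts(text: str) -> bytes:
    return text.encode("utf-8")  # pretend the encoded text is audio


manager = DialogManager(stub_asr, stub_nlp, stub_fulfill, stub_tts)
reply = manager.handle(text="What is the weather in Paris?")
```

Passing the services in as plain callables keeps the routing logic separate from any particular backend, so each stub could be swapped for a real service client without changing `handle`.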

Hardware Requirements

This solution was validated using one DGX Station and one AFF A220 storage system. Jarvis requires either a T4 or V100 GPU to perform deep neural network computations.

The following table lists the hardware components that are required to implement the solution as tested.

Hardware | Quantity
T4 or V100 GPU |
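As a rough way to confirm the GPU requirement on a host, the sketch below queries `nvidia-smi` for the installed GPU names and checks them against the two models named above. The function names and the strict token-matching rule are assumptions made for this example, so model variants with other suffixes would need their own entries.

```python
import re
import subprocess
from typing import List

# GPU models named in the hardware requirements.
SUPPORTED_GPUS = ("T4", "V100")


def is_supported_gpu(name: str) -> bool:
    """Return True if a GPU name contains a supported model as a whole token."""
    tokens = re.split(r"[\s\-]+", name.strip())
    return any(model in tokens for model in SUPPORTED_GPUS)


def list_gpu_names() -> List[str]:
    """Ask nvidia-smi for the name of each installed GPU, one per line."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]
```

On a machine with the NVIDIA driver installed, `any(is_supported_gpu(n) for n in list_gpu_names())` indicates whether at least one suitable GPU is present.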




Software Requirements

The following table lists the software components that are required to implement the solution as tested.

Software | Version or Other Information
NetApp ONTAP data management software |
Cisco NX-OS switch firmware |
 | 4.0.4 - Ubuntu 18.04 LTS
NVIDIA Jarvis Framework | EA v0.2
Docker container platform | 18.06.1-ce [e68fc7a]
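To compare a host against the Docker version listed above, one option is to parse the line printed by `docker --version`. The sketch below assumes that the bracketed value in the table is the build hash and that the output has the usual `Docker version X, build Y` shape. The function names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Version and build as listed in the software table (build hash is an assumption).
VALIDATED_DOCKER = ("18.06.1-ce", "e68fc7a")


def parse_docker_version(line: str) -> Optional[Tuple[str, str]]:
    """Extract (version, build) from a `docker --version` output line."""
    match = re.match(r"Docker version ([^,\s]+), build (\S+)", line.strip())
    return (match.group(1), match.group(2)) if match else None


def matches_validated(line: str) -> bool:
    """Return True if the line reports exactly the validated version and build."""
    return parse_docker_version(line) == VALIDATED_DOCKER
```

A mismatch does not necessarily mean the solution will not work. It only indicates that the host differs from the configuration that was tested.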