Building ASR + LLM + TTS Pipeline

Introduction

Imagine talking to your Jetson device and receiving spoken responses — just like talking to a friend. An ASR + LLM + TTS pipeline makes this possible by combining three AI capabilities into a complete voice assistant that runs entirely offline:

  • ASR (Automatic Speech Recognition): Listen to your voice and convert it to text
  • LLM (Large Language Model): Understand and generate intelligent responses
  • TTS (Text-to-Speech): Convert the text response back to natural speech
Code
Microphone --> [ASR] --> Text --> [LLM] --> Text --> [TTS] --> Speaker
                  (Whisper)           (Ollama)           (Coqui/Riva)

voice-pipeline

System Requirements

ComponentMinimumRecommended
DeviceJetson Orin Nano 8GBJetson Orin NX 16GB / AGX Orin 32GB
AudioUSB microphoneUSB microphone + speakers
OSJetPack 6.2+Latest JetPack 6.x
Storage50GB free100GB+

Hands-On: Two Ready-to-Run Projects

Below are two complete, tested projects that implement the ASR + LLM + TTS pipeline on Jetson. Pick the one that matches your setup and follow the guide.


Project 1: Voice LLM on reComputer + Reachy Mini

reachy

A fully local, low-latency voice-interactive robotic assistant built on reComputer Mini J501 paired with the Reachy Mini Lite open-source robot.

ComponentTechnology
ASRFunASR (ModelScope)
LLMOllama — Qwen2.5 7B
TTSCoqui TTS — Tacotron2-DDC-GST
HardwarereComputer Mini J501 + Reachy Mini Lite

Quick Start:

bash
# Step 1: Install Ollama and pull the LLM
curl -fsSL https://ollama.com/install.sh | sh
ollama pull qwen2.5:7b

# Step 2: Clone the project and install dependencies
git clone https://github.com/Seeed-Projects/reachy-mini-loacl-conversation.git
cd reachy-mini-loacl-conversation
pip install -r requirements.txt -i https://pypi.jetson-ai-lab.io/
pip install "reachy-mini"

# Step 3: Start the Reachy Mini daemon (Terminal 1)
reachy-mini-daemon

# Step 4: Launch the voice assistant (Terminal 2)
export OLLAMA_HOST="http://localhost:11434"
export OLLAMA_MODEL="qwen2.5:7b"
export COQUI_MODEL_NAME="tts_models/zh-CN/baker/tacotron2-DDC-GST"
export DEFAULT_VOLUME="1.5"
python main.py

Press R to start recording, S to stop. The system will process your speech and respond with a spoken answer.

For the full walkthrough, see: Deploy Local Voice LLM on reComputer Jetson for Reachy Mini


Project 2: Local Voice Chatbot with NVIDIA Riva + Llama2

riva

A complete voice chatbot using NVIDIA Riva for ASR/TTS and Llama2 7B for language generation, running entirely on a Jetson AGX Orin with no cloud dependency.

ComponentTechnology
ASR + TTSNVIDIA Riva — Conformer ASR + FastPitch/HiFiGAN TTS
LLMLlama2 7B Chat via text-generation-inference
AudioReSpeaker USB Mic Array
HardwareJetson AGX Orin 32GB

Quick Start:

bash
# Step 1: Install Riva Server (requires NGC CLI + API key)
# Download and configure riva_quickstart_arm64:2.13.1
# Edit config.sh: enable ASR + TTS, disable NLP/NMT
sudo bash riva_init.sh    # Download models (~5 min)
sudo bash riva_start.sh   # Start Riva server (keep running)

# Step 2: Start LLM inference server (Terminal 2)
# Using dusty-nv/jetson-containers + text-generation-inference
# Model served on port 8899

# Step 3: Run the chatbot demo (Terminal 3)
python3 local_chatbot.py --input-device <id> --output-device <id>
# Use --list-input-devices / --list-output-devices to find your audio device IDs

Three terminals run simultaneously: Riva server, LLM server, and the chatbot demo.

jetson-agx

For the full walkthrough, see: Local Voice Chatbot: Deploy Riva and Llama2 on reComputer


Comparison

Project 1 (Reachy Mini)Project 2 (Riva + Llama2)
ASRFunASRNVIDIA Riva
TTSCoqui TTSNVIDIA Riva
LLMQwen2.5 7B (Ollama)Llama2 7B (TGI)
Min RAM8GB16GB
Special HWReachy Mini Lite robotReSpeaker USB Mic Array
LanguageChinese (configurable)English
DifficultyEasyIntermediate

References

  • Jetson AI Lab — NVIDIA's official platform for AI on Jetson
  • FunASR — speech recognition toolkit
  • Coqui TTS — open-source text-to-speech
  • NVIDIA Riva — GPU-accelerated speech AI SDK
  • Ollama — local LLM inference engine

Congratulations! You've completed Module 5 and learned how to run LLMs offline on your Jetson device and build a complete voice assistant pipeline!

Next: Proceed to Chapter 6: Generative AI Applications or return to Chapter 5 Overview.