A phased-parallel voice assistant system with multiple microservices communicating via gRPC.
The system consists of 7 processes running on localhost:
- Main: Bootstrap process
- Loader (port 5002): Orchestrator for phased-parallel startup
- Logger (port 5001): Centralized logging service
- KWD (port 5003): Keyword detection (wake word)
- STT (port 5004): Speech-to-text
- LLM (port 5005): Language model (Ollama)
- TTS (port 5006): Text-to-speech
Logger Service (port 5001)
- Application and dialog logging
- Log rotation support
- gRPC RPCs: WriteApp, NewDialog, WriteDialog
- Health check implementation
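For illustration, a `WriteApp` call might look like the sketch below. The stub module names follow from `proto/services.proto`, but the servicer name (`LoggerStub`) and request fields are assumptions; only the RPC names are documented here.

```python
# Hypothetical Logger client. LoggerStub and the request message/fields
# are assumptions; only the RPC names (WriteApp, NewDialog, WriteDialog)
# are documented.
import grpc
import services_pb2
import services_pb2_grpc

with grpc.insecure_channel("127.0.0.1:5001") as channel:
    stub = services_pb2_grpc.LoggerStub(channel)
    stub.WriteApp(services_pb2.AppLogRequest(  # assumed message type
        level="INFO",
        message="client connected",
    ))
```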
KWD Service (port 5003)
- OpenWakeWord integration with "Alexa" wake word
- 0.6 confidence threshold, 1s cooldown
- Real-time audio processing at 16kHz
- gRPC RPCs: Events (stream), Enable, Disable
- Health check implementation
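Consuming the server-streamed `Events` RPC could look like this sketch; again, the stub and message names beyond the documented RPC names (`Events`, `Enable`, `Disable`) are assumptions.

```python
# Hypothetical KWD event consumer. KwdStub, Empty, and the event's
# confidence field are assumptions beyond the documented RPC names.
import grpc
import services_pb2
import services_pb2_grpc

with grpc.insecure_channel("127.0.0.1:5003") as channel:
    stub = services_pb2_grpc.KwdStub(channel)
    stub.Enable(services_pb2.Empty())
    # Events is a server stream: block and print each detection.
    for event in stub.Events(services_pb2.Empty()):
        print(f"wake word detected (confidence={event.confidence:.2f})")
```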
STT Service (port 5004)
- Whisper integration (small.en model)
- WebRTC VAD for automatic finalization (~2s silence; sketched after this list)
- CUDA acceleration for fast transcription
- gRPC RPCs: Start, Stop, Results (stream)
- Multi-session support for concurrent dialogs
- LLM Service (port 5005): Ollama bridge
- TTS Service (port 5006): Kokoro integration
- Loader Service (port 5002): Phased orchestration
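The ~2s silence finalization mentioned under STT can be pictured with the webrtcvad package: classify fixed-size audio frames and finalize once about 66 consecutive 30ms frames contain no speech. This is a sketch of the idea; the service's real thresholds and logic may differ.

```python
# Sketch of VAD-driven finalization, assuming 16kHz mono 16-bit PCM in
# 30 ms frames (480 samples / 960 bytes each); the STT service's actual
# logic may differ.
import webrtcvad

SAMPLE_RATE = 16000
FRAME_MS = 30
SILENCE_FRAMES = 2000 // FRAME_MS   # ~2 s of consecutive non-speech

def utterance_ended(frames, aggressiveness=2):
    """Return True once ~2 s of uninterrupted silence is observed."""
    vad = webrtcvad.Vad(aggressiveness)
    silent = 0
    for frame in frames:
        silent = 0 if vad.is_speech(frame, SAMPLE_RATE) else silent + 1
        if silent >= SILENCE_FRAMES:
            return True
    return False
```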
Requirements:
- Python 3.11
- CUDA-capable GPU with 8GB+ VRAM
- PortAudio (for audio capture)
- Ollama (for LLM)
```bash
# Create virtual environment with uv
uv venv
source .venv/bin/activate
# Install dependencies
uv pip install grpcio grpcio-tools grpcio-health-checking
uv pip install aiofiles pyyaml nvidia-ml-py3 psutil
uv pip install sounddevice numpy scipy
uv pip install onnxruntime openwakeword tqdm
```

Configuration is in `config/config.ini`:
- VRAM guardrail: 8000MB minimum
- Ports: 5001-5006 (localhost only)
- Wake word: "Alexa" (threshold 0.6)
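These values can be read with Python's standard configparser. The section and key names below are assumptions for illustration; the real schema is whatever `config/config.ini` and `common/config_loader.py` define.

```python
# Illustrative read of config/config.ini; the [kwd] / [gpu] sections
# and key names are assumptions, not the project's actual schema.
import configparser

config = configparser.ConfigParser()
config.read("config/config.ini")

kwd_port = config.getint("kwd", "port", fallback=5003)
wake_threshold = config.getfloat("kwd", "threshold", fallback=0.6)
min_vram_mb = config.getint("gpu", "min_vram_mb", fallback=8000)
print(kwd_port, wake_threshold, min_vram_mb)
```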
Services are managed with `manage_services.py`:

```bash
# Check status of all services
python manage_services.py status
# Start individual service
python manage_services.py start logger
python manage_services.py start kwd
# Stop service
python manage_services.py stop kwd
# Restart service
python manage_services.py restart logger
# Start all services
python manage_services.py start all
# Stop all services
python manage_services.py stop all
```

Test the Logger service:

```bash
# Start logger
python manage_services.py start logger
# View logs
cat logs/app.log
```

Test wake word detection:

```bash
# Start KWD service
python manage_services.py start kwd
# Run test client
python tests/test_kwd.py
# Say "Alexa" to trigger wake word detection# Start STT service (and logger)
python manage_services.py start logger
python manage_services.py start stt
# Run test client
python tests/test_stt.py
# Speak after the prompt - recognition finalizes after 2s of silence
# Test continuous recognition
python tests/test_stt.py --continuous
```

Each service writes to its own log file (e.g. `logger_service.log`, `kwd_service.log`):
- Application logs: `logs/app.log`
- Dialog logs: `logs/dialog_*.log`
```
Alexa_W/
├── services/              # Service implementations
│   ├── logger/
│   ├── kwd/
│   ├── stt/
│   ├── llm/
│   ├── tts/
│   └── loader/
├── common/                # Shared modules
│   ├── base_service.py
│   ├── config_loader.py
│   ├── health_client.py
│   └── gpu_monitor.py
├── proto/                 # gRPC definitions
│   ├── services.proto
│   └── generated files
├── config/                # Configuration
│   ├── config.ini
│   └── Modelfile
├── models/                # ML models
│   └── alexa_v0.1.onnx
├── logs/                  # Log files
└── tests/                 # Test scripts
```
Adding a new service:
- Create a service directory under `services/`
- Inherit from the `BaseService` class
- Implement service-specific RPCs
- Add the service to `manage_services.py`
- Test with health checks (a standalone probe is sketched after this list)
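The services advertise health checks and `grpcio-health-checking` is in the dependency list, so presumably they speak the standard gRPC health protocol. A standalone probe using the stock stubs might look like this; the port (5003 = KWD) is the only project-specific assumption.

```python
# Standalone gRPC health probe using the stock health-checking stubs;
# the target port (5003 = KWD) is the only project-specific assumption.
import grpc
from grpc_health.v1 import health_pb2, health_pb2_grpc

with grpc.insecure_channel("127.0.0.1:5003") as channel:
    stub = health_pb2_grpc.HealthStub(channel)
    response = stub.Check(health_pb2.HealthCheckRequest(service=""))
    # Prints SERVING, NOT_SERVING, or UNKNOWN.
    print(health_pb2.HealthCheckResponse.ServingStatus.Name(response.status))
```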
If a port is already in use:

```bash
# Find process using port
lsof -i :5001
# Kill process
kill -9 <PID>
```

Audio issues:
- Ensure microphone permissions are granted
- Check the audio device with `python -m sounddevice`
- Verify sample rate compatibility (16kHz required)

GPU issues:
- Check the GPU with `nvidia-smi`
- Ensure 8GB+ VRAM is available
- Monitor usage with `common/gpu_monitor.py` (a standalone check is sketched below)
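A minimal VRAM probe using `nvidia-ml-py3` (already in the dependency list), assuming the first GPU (index 0); this mirrors the 8000MB guardrail idea and is not the actual `common/gpu_monitor.py` code.

```python
# Minimal VRAM check with nvidia-ml-py3 (pynvml); mirrors the 8000 MB
# guardrail idea, not the project's actual gpu_monitor.py logic.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU assumed
    mem = pynvml.nvmlDeviceGetMemoryInfo(handle)   # sizes in bytes
    free_mb = mem.free // (1024 * 1024)
    if free_mb < 8000:
        raise RuntimeError(f"Only {free_mb} MB VRAM free; need 8000+")
    print(f"VRAM OK: {free_mb} MB free")
finally:
    pynvml.nvmlShutdown()
```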
Performance targets:
- Wake detection latency: <200ms
- First token latency (LLM): <800ms
- First audio latency (TTS): <150ms
- Dialog follow-up window: 4s
Security:
- All services bind to localhost (127.0.0.1) only
- No external network calls
- Config validation on startup
- VRAM guardrails enforced
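As a minimal sketch of the loopback-only binding (not the project's actual server bootstrap), a gRPC server restricted to 127.0.0.1 looks like this:

```python
# Sketch: bind a gRPC server to the loopback interface only, so it is
# unreachable from other hosts; servicer registration is omitted.
from concurrent import futures

import grpc

server = grpc.server(futures.ThreadPoolExecutor(max_workers=4))
server.add_insecure_port("127.0.0.1:5001")  # localhost only, not 0.0.0.0
server.start()
server.wait_for_termination()
```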