roomba_stack

Layered software stack for controlling iRobot Roomba (600 series OI).
Dev on Ubuntu laptop, deploy to Raspberry Pi 5.

SW architecture breakdown:

[External User]
      ↑  WiFi / Web / BLE / CLI
      │
[L5 UI Layer]     ← FastAPI, WebSocket, CLI, etc.
      │
[L4 App Layer]    ← event bus, command bus, scheduler
      │
[L3 Domain]       ← behaviors, safety, state machine
      │
[L2 OI]           ← opcodes, sensor parsing, OI service
      │
[L1 Serial]       ← (CURRENT MODULE) → abstract serial port
      │
[Roomba OI Port]  ← physical UART / USB-TTL only

                                ┌──────────────────────────────────────────────────────────────────────┐
                               │                           HOST (Ubuntu)                              │
                               │                                                                      │
    Roomba OI UART             │  USB Bridge (CP210x/FTDI/CH340)     Linux USB/TTY driver + buffers   │
TX──▶ 115200 8N1 ─────────────▶│────────────── USB packets ─────────▶ /dev/ttyUSBx (/dev/ttyACMx)      │
                               │                                                                      │
                               └──────────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────────────────────────
│ l1_drivers.PySerialPort (thread-safe facade; Story 2)
│
│ [Reader Thread]                           [Writer Thread - future]
│ ───────────────                           ─────────────────────────
│ - reads chunks via pyserial.read()        - drains TxFrameQueue (timed get)
│ - invokes set_reader(cb)(data)            - pyserial.write(frame)
│
└──────────────────────────────────────────────────────────────────────────────────────────
        │
        │  data: bytes (arbitrary chunking; may contain partial/whole/multiple frames)
        ▼

┌──────────────────────────────────────────────────────────────────────────────────────────
│ app.OIService (process boundary)
│
│ (A) ENQUEUE-ONLY RX CALLBACK
│ ────────────────────────────
│   _on_serial_bytes(data):
│     if not data: return
│     ok = RxByteQueue.put(data, timeout=Q_PUT_TIMEOUT)
│     if not ok: WARN "overflow: dropped {len(data)}"
│
│ (B) BOUNDED QUEUES (Monitor semantics)
│ ──────────────────────────────────────
│   RxByteQueue (max=RX_QUEUE_MAX)           TxFrameQueue (max=TX_QUEUE_MAX) …future
│   • timed put/get (no infinite waits)      • timed put/get (no infinite waits)
│   • overflow policy: drop-newest           • overflow policy: drop-newest
│
│ (C) DISPATCHER THREAD (Half-Sync side)
│ ──────────────────────────────────────
│   _dispatcher_loop():
│     while _running:
│       ok, chunk = RxByteQueue.get(timeout=Q_GET_TIMEOUT)   # timed; no busy spin
│       if not ok: continue
│       _rx_buf.extend(chunk)                                # reassembly buffer
│       _decode_available_frames()                           # decode 0..N frames; consume bytes exactly
│
│ (D) DECODER / DEMUX (single event dispatcher)
│ ─────────────────────────────────────────────
│   _decode_available_frames():
│     while True:
│       if not _rx_buf: break
│       lead = _rx_buf[0]
│
│       Case A: STREAM FRAME (opcode 148; header 0x13)
│         - need ≥ 3 bytes to read len N
│         - sanity-cap N (e.g., ≤128)
│         - total = 2 + N + 1 (hdr,len,chk)
│         - if buffer has < total: break (await more)
│         - verify checksum (sum==0 mod 256); if bad → WARN + drop 1 byte (resync)
│         - decode payload → {pid→parsed}
│         - for each (pid, parsed): _deliver(pid, parsed)
│         - del _rx_buf[:total] and continue
│
│       Case B: PENDING SINGLE REPLY (opcode 142; raw payload only)
│         - _pending_request_id = pid
│         - expected_len = packet_length(pid)
│         - if buffer has < expected_len: break
│         - raw = _rx_buf[:expected_len]; parse → parsed
│         - _deliver(pid, parsed); clear _pending_request_id
│         - del _rx_buf[:expected_len]; continue
│
│       Case C: UNKNOWN / GARBAGE
│         - WARN "RX resync: dropping 1 byte (buf=..)"
│         - del _rx_buf[0]; continue
│
│ (E) DELIVERY (single place, deterministic)
│ ──────────────────────────────────────────
│   _deliver(pid, parsed):
│     latest_packets[pid] = parsed
│     if _on_sensor:
│       try: _on_sensor(pid, parsed)
│       except: log.exception("on_sensor callback error")
│
│ (F) SHUTDOWN
│ ────────────
│   close():
│     _running=False; port.set_reader(None)
│     join dispatcher (timeout) if alive & not self
│     port.close()
│
└──────────────────────────────────────────────────────────────────────────────────────────
        │
        │  events: (pid, value) + cache: latest_packets[pid]
        ▼
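The stream-frame branch (Case A) and the one-byte resync (Case C) can be sketched in a few lines. This is a minimal illustration, not the code in OIService: the function name `extract_stream_frames` and the constants are assumptions, it returns raw payloads instead of calling `_deliver`, and it leaves out the pending single-reply path (Case B).

```python
STREAM_HEADER = 0x13   # header byte of an OI stream frame (decimal 19)
MAX_PAYLOAD = 128      # sanity cap on N, mirroring the "≤128" note in the diagram


def extract_stream_frames(buf: bytearray) -> list[bytes]:
    """Consume complete stream frames from buf and return their payloads.

    Bytes that cannot start a valid frame are dropped one at a time (resync).
    An incomplete frame is left in buf until more data arrives.
    """
    payloads = []
    while buf:
        if buf[0] != STREAM_HEADER:
            del buf[0]              # garbage: drop 1 byte and retry
            continue
        if len(buf) < 3:
            break                   # too short to hold even an empty frame
        n = buf[1]
        if n > MAX_PAYLOAD:
            del buf[0]              # implausible length: treat header as garbage
            continue
        total = 2 + n + 1           # header + length byte + payload + checksum
        if len(buf) < total:
            break                   # await more bytes
        if sum(buf[:total]) % 256 != 0:
            del buf[0]              # bad checksum: drop 1 byte and resync
            continue
        payloads.append(bytes(buf[2:2 + n]))
        del buf[:total]             # consume exactly one frame
    return payloads


# Build one valid frame: packet 7 with value 0, plus the balancing checksum.
payload = bytes([7, 0])
body = bytes([STREAM_HEADER, len(payload)]) + payload
frame = body + bytes([(-sum(body)) % 256])

# One garbage byte, one whole frame, then the first two bytes of the next frame.
rx = bytearray(b"\xff" + frame + frame[:2])
print(extract_stream_frames(rx))   # payloads of the complete frames
print(bytes(rx))                   # partial frame still waiting in the buffer
```

Because the function only deletes bytes it has fully decoded or rejected, it can be called after every chunk regardless of how the serial driver splits the data.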

┌──────────────────────────────────────────────────────────────────────────────────────────
│ Upper Consumers (current & near-future)
│
│ • CLI / UI (today)
│   - subscribes via _on_sensor callback; prints packet 7, 25, etc.
│
│ • App logic (near-future)
│   - Pub/Sub bus: topics sensor., mode.changed, rx.raw, tx.sent
│   - OI mode State Machine (OFF/PASSIVE/SAFE/FULL) gates allowed commands
│   - Safety/Watchdog: timeouts, RX stall monitor, overflow counters
│   - Structured logging/telemetry
│
└──────────────────────────────────────────────────────────────────────────────────────────
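As a rough picture of how an OI mode state machine could gate commands, the sketch below keeps a per-mode allowlist and a small transition table. Everything in it is hypothetical: the class `ModeGate`, the command names, and the gating table are illustrative assumptions, not the project's planned design, and the table should be checked against the OI specification before use.

```python
from enum import Enum


class OIMode(Enum):
    OFF = "off"
    PASSIVE = "passive"
    SAFE = "safe"
    FULL = "full"


# Hypothetical command names and gating table, for illustration only.
_COMMON = {"start", "safe", "full", "sensors", "stream", "dock"}
ALLOWED = {
    OIMode.OFF: {"start"},
    OIMode.PASSIVE: set(_COMMON),
    OIMode.SAFE: _COMMON | {"drive"},
    OIMode.FULL: _COMMON | {"drive"},
}

# Mode assumed to be entered once a mode-changing command is accepted.
TRANSITIONS = {
    "start": OIMode.PASSIVE,
    "safe": OIMode.SAFE,
    "full": OIMode.FULL,
    "dock": OIMode.PASSIVE,
}


class ModeGate:
    """Accept or reject commands based on the current (assumed) OI mode."""

    def __init__(self) -> None:
        self.mode = OIMode.OFF

    def submit(self, command: str) -> bool:
        if command not in ALLOWED[self.mode]:
            return False                      # rejected; mode unchanged
        self.mode = TRANSITIONS.get(command, self.mode)
        return True


gate = ModeGate()
for cmd in ("drive", "start", "drive", "safe", "drive"):
    print(cmd, gate.submit(cmd), gate.mode.name)
```

Keeping the table as data rather than branching logic makes it easy to review against the cheat sheet and to unit-test without a robot attached.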

…TX PATH (next iteration; shown for completeness)
──────────────────────────────────────────────────
Commands (Start/Safe/Drive/Sensors/StreamOn/Off)
  → validate against State Machine
  → encode via codec
  → enqueue TxFrameQueue (timed put; bounded)
  → writer thread drains and pyserial.write(frame)
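The queue half of this path (bounded, timed put/get, drop-newest on overflow, drained by a writer thread) might look like the sketch below. It is built on the standard library's `queue.Queue`; the names `BoundedFrameQueue` and `writer_loop` are assumptions, and a plain callable stands in for `pyserial.write` so the example runs without hardware.

```python
import queue
import threading
import time


class BoundedFrameQueue:
    """Bounded queue with timed put/get and a drop-newest overflow policy."""

    def __init__(self, maxsize: int) -> None:
        self._q = queue.Queue(maxsize=maxsize)
        self.dropped = 0                      # overflow counter for monitoring

    def put(self, frame: bytes, timeout: float) -> bool:
        try:
            self._q.put(frame, timeout=timeout)
            return True
        except queue.Full:
            self.dropped += 1                 # the newest frame is discarded
            return False

    def get(self, timeout: float):
        try:
            return True, self._q.get(timeout=timeout)
        except queue.Empty:
            return False, b""


def writer_loop(q: BoundedFrameQueue, write, stop: threading.Event,
                timeout: float = 0.05) -> None:
    """Drain q and pass each frame to write() until stop is set."""
    while not stop.is_set():
        ok, frame = q.get(timeout)            # timed get: no busy spin
        if ok:
            write(frame)


sent = []                                     # stands in for the serial port
txq = BoundedFrameQueue(maxsize=2)
# Three frames into a queue of two: the third put should time out and be dropped.
accepted = [txq.put(bytes([op]), timeout=0.01) for op in (128, 131, 135)]

stop = threading.Event()
writer = threading.Thread(target=writer_loop, args=(txq, sent.append, stop))
writer.start()
deadline = time.monotonic() + 2.0
while len(sent) < 2 and time.monotonic() < deadline:
    time.sleep(0.01)
stop.set()
writer.join(timeout=1.0)
print(accepted, sent, txq.dropped)
```

Returning a boolean from `put` instead of raising keeps the producer side simple: the caller decides whether a dropped frame deserves a warning or a retry.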

How to configure the speakerphone

Install the player and downloader:

sudo apt install -y mpv ffmpeg
pip3 install --upgrade yt-dlp

Run the script to make the speaker the default audio device:

./audio_jabra_default.sh

Examples for playing songs:

mpv --no-video --ao=pulse --ytdl-format=bestaudio --volume=70 "ytdl://ytsearch1:beatles let it be official audio"
mpv --no-video --ao=pulse --ytdl-format=bestaudio --volume=70 "ytdl://ytsearch1:Taylor Swift official audio"
mpv --no-video --ao=pulse --ytdl-format=bestaudio --volume=70 "ytdl://ytsearch1:Taylor Swift August official audio"

Developer Shell (Maintenance Console)

apps/dev_shell.py mirrors the production wiring (EventBus + CommandBus + OIService + voice stack) but exposes a REPL for manual testing.

Run

PYTHONPATH=src python apps/dev_shell.py --device /dev/ttyUSB0 --baud 115200

Useful commands

speaker gena 0.95 → publish SpeakerIdentity (authorizes the voice path).
transcript "stop" 0.90 → publish a transcript; IntentRouter maps “stop” to StopCmd.
drive --straight 200 / drive_direct 200 200 → enqueue motion commands via CommandBus.
dock, reset, start, safe, full → dispatch the corresponding mode/dock commands.
state → print the latest RobotSnapshot.
queues → dump EventBus/Rx/Tx queue depth.

VS Code launch (add to `.vscode/launch.json`)

{
  "name": "Python: Dev Shell",
  "type": "python",
  "request": "launch",
  "program": "${workspaceFolder}/apps/dev_shell.py",
  "args": ["--device", "/dev/ttyUSB0", "--baud", "115200"],
  "env": {"PYTHONPATH": "${workspaceFolder}/src"},
  "console": "integratedTerminal"
}

Attach the debugger to break inside IntentRouter or OIService while driving the robot from the shell.

Voice Gateway (MVP)

This runner bridges external voice services to the stack via HTTP and routes intents.

Run

python apps/gateway.py

# Speaker identity (from your GMM service)
curl -i -H "Content-Type: application/json" \
  -d '{"topic":"voice.speaker","ts":1730900000000,"speaker":"gena","confidence":0.94}' \
  http://127.0.0.1:8765/

# Transcript (from STT/KWS)
curl -i -H "Content-Type: application/json" \
  -d '{"topic":"voice.transcript","ts":1730900000500,"text":"stop","confidence":0.92}' \
  http://127.0.0.1:8765/
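The same events can be posted from Python with only the standard library. This is a sketch under assumptions: `build_voice_request` is a hypothetical helper (not part of the repository), the URL and field names are copied from the curl examples above, and `ts` is assumed to be epoch milliseconds because the sample values look that way.

```python
import json
import time
import urllib.request

GATEWAY_URL = "http://127.0.0.1:8765/"   # address used by the curl examples


def build_voice_request(topic: str, url: str = GATEWAY_URL,
                        **fields) -> urllib.request.Request:
    """Build a JSON POST carrying one voice event (nothing is sent yet)."""
    # ts is assumed to be epoch milliseconds, matching the sample payloads.
    event = {"topic": topic, "ts": int(time.time() * 1000), **fields}
    return urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_voice_request("voice.transcript", text="stop", confidence=0.92)
print(req.get_method(), req.full_url, req.data.decode())

# To actually send it (requires apps/gateway.py to be running):
# with urllib.request.urlopen(req, timeout=2.0) as resp:
#     print(resp.status)
```

Separating request construction from sending makes the payload easy to inspect or test without a running gateway.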

What happens

VoiceHttpBridge (HTTP → EventBus) publishes SpeakerIdentity / AudioTranscript.

VoiceAuthPolicy (allowlist + TTL + greeting cooldown) emits TtsRequest greetings.

IntentRouter (MVP) maps “stop” to StopCmd with a confirmation window.

PrintTtsAdapter subscribes to voice.tts and prints [SAY] ... lines.

Configuration notes

Allowed speakers and thresholds live in apps/gateway.py (see VoiceAuthConfig and IntentThresholds).

The EventBus queue is bounded; if overloaded, newest TTS publishes may drop (protects producer threads).

Replace PrintTtsAdapter with a real TTS adapter later without changing domain logic.

### KWS/STT integration

Any keyword spotter (KWS) or speech-to-text (STT) engine can send events to the gateway via HTTP.

**Use the helper CLI**
```bash
# Keyword spotter hit (normalized as transcript)
python apps/voice_post.py --transcript "stop" --confidence 0.97 --source kws

# STT transcript
python apps/voice_post.py --transcript "turn left" --confidence 0.88 --source stt

# Speaker identity (GMM)
python apps/voice_post.py --speaker "gena" --confidence 0.94
```
