Run the `recognize` command with a WAV file. The CLI prints the best match and the log-likelihood scores produced by the shared `VoiceRecognitionService`.

The `recognize_stream` command reuses the same service façade but feeds the audio file in chunks (default 0.5 s). This mimics real-time capture and prints interim matches as soon as the likelihoods are high enough.
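
The chunked feeding described above can be illustrated with a minimal sketch. It assumes the audio is already loaded as a flat sequence of mono samples; the helper name `iter_chunks` is hypothetical and is not taken from this repository, whose actual chunking logic may differ.

```python
from typing import Iterator, Sequence


def iter_chunks(
    samples: Sequence[float],
    sample_rate: int,
    chunk_seconds: float = 0.5,
) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of `samples`, each covering `chunk_seconds` of audio.

    The final slice may be shorter when the audio length is not a
    multiple of the chunk size.
    """
    chunk_size = int(sample_rate * chunk_seconds)
    if chunk_size <= 0:
        raise ValueError("chunk_seconds is too small for this sample rate")
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]


# 19200 samples at 16 kHz (1.2 s) split into two full chunks and a remainder.
audio = [0.0] * 19200
print([len(chunk) for chunk in iter_chunks(audio, 16000)])  # [8000, 8000, 3200]
```

A streaming recognizer would consume one such chunk at a time and re-evaluate its scores after each, which is what allows interim matches to appear before the file has been read in full.
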
Use `src/live_recognition.py` to capture audio from the default input device and route it directly through the streaming API. Ensure `sounddevice` sees your microphone, then run:

```bash
python src/live_recognition.py
```

Speak into the microphone—interim matches will appear as the engine accumulates enough audio. Press `Ctrl+C` to stop.
## Embedding the Service API
For tighter integration with other applications (e.g., the upcoming voice engine), import `VoiceRecognitionService` and the request/response models:

```python
from file_management.bst import BinarySearchTree
from service.api import VoiceRecognitionService, EnrollmentRequest, EnrollmentConfig
from service.audio_sources import BufferAudioSource

bst = BinarySearchTree()
service = VoiceRecognitionService(bst=bst, base_directory="test_environment")
```

The same façade exposes `recognize`, `start_session`, `list_speakers`, and `delete_speaker`, allowing other repositories to depend on this module without invoking the CLI.
## Recording a Test WAV on Raspberry Pi with Jabra Speak 410
Use this workflow to capture a 16 kHz mono WAV file on the Raspberry Pi 5 connected to the Jabra speaker/mic. All commands assume the repository lives under `/home/gena/PROJECTS`.
1. Set the Jabra device as the default PipeWire sink/source:
   ```bash
   ./roomba_stack/audio_jabra_default.sh
   ```
2. Confirm the capture device name (needed in the next step):
   ```bash
   pactl list short sources | grep -i jabra
   ```
   You should see something like `alsa_input.usb-0b0e_Jabra_SPEAK_410_USB_...-mono-fallback` running at 16 kHz.
3. Make sure there is a place to store recordings:
   ```bash
   mkdir -p voice-recognition-engine/audio_files
   ```
4. Record a short sample (5–10 seconds) using the PipeWire/ALSA device discovered in step 2:

The resulting `gmm_test.wav` resides in `voice-recognition-engine/audio_files/` and can be supplied to the CLI commands (e.g., `python src/cli.py recognize voice-recognition-engine/audio_files/gmm_test.wav --sample_rate 16000`).
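
Before passing the recording to the CLI, it can be worth confirming that the file really is 16 kHz mono. The following is a minimal sketch using only Python's standard `wave` module; the helper name `check_wav` is hypothetical and not part of this repository.

```python
import wave


def check_wav(path, expected_rate=16000, expected_channels=1):
    """Return (ok, info) for a PCM WAV file.

    `ok` is True when the sample rate and channel count match the
    expected values; `info` holds the values read from the header.
    """
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        frames = wav.getnframes()
    info = {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": frames / rate if rate else 0.0,
    }
    ok = rate == expected_rate and channels == expected_channels
    return ok, info
```

Calling `check_wav("voice-recognition-engine/audio_files/gmm_test.wav")` should return `True` as the first element for a recording made with the settings above; a different sample rate or a stereo file would show up in the returned `info` dictionary.
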