This repository was archived by the owner on Aug 30, 2020. It is now read-only.

Description
Hello
I am having trouble getting Rhasspy to recognize the word "no", especially when it is spoken by a female voice (sometimes a 0% recognition rate over 10 utterances).
My sentences.ini file:
[yesAnswer]
yes
yep
yeah
yes please
[noAnswer]
no
no thanks
no thank you
"no thanks" and "no thank you" work a lot better, even with female voices.
My setup is as follows: Rhasspy runs as a service on my Raspberry Pi 4. A separate Python 3 script controls Rhasspy via the HTTP API, receives the transcript, and forwards it to another machine. That script also runs as a service at startup.
I send messages to that script, asking it to query Rhasspy over HTTP.
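The control flow described above can be sketched roughly as follows. This is an illustration, not the actual script: the base URL (`localhost:12101`) and the helper names `build_request`, `start_recording`, and `stop_recording` are assumptions, and only the two endpoint paths are taken from the log. It also assumes that `/api/stop-recording` returns the recognized intent as JSON.

```python
import json
import urllib.request

# Assumption: Rhasspy's web server is reachable here; adjust host and port to your setup.
RHASSPY_URL = "http://localhost:12101"


def build_request(path, base_url=RHASSPY_URL):
    """Build an empty-body POST request for a Rhasspy API path."""
    return urllib.request.Request(base_url + path, data=b"", method="POST")


def start_recording(base_url=RHASSPY_URL):
    """Ask Rhasspy to start recording a voice command."""
    with urllib.request.urlopen(build_request("/api/start-recording", base_url)) as resp:
        return resp.read()


def stop_recording(base_url=RHASSPY_URL):
    """Ask Rhasspy to stop recording and return the parsed response.

    Assumption: the response body is the intent JSON.
    """
    with urllib.request.urlopen(build_request("/api/stop-recording", base_url)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A caller would invoke `start_recording()`, wait while the user speaks, then call `stop_recording()` and forward the returned `text` field.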
This is the log of one of my requests (newest entries first):
[INFO:304795] quart.serving: 127.0.0.1:39554 POST /api/stop-recording 1.1 200 292 215383
[DEBUG:304792] InboxActor: -> stopped
[DEBUG:304789] main: {"intent": {"name": "noAnswer", "confidence": 1.0}, "entities": [], "text": "no", "raw_text": "no", "recognize_seconds": 0.00034929599996758043, "tokens": ["no"], "raw_tokens": ["no"], "wav_seconds": 0.0, "transcribe_seconds": 0.0, "speech_confidence": 0.1044066508421162, "slots": {}, "wakeId": "", "siteId": "default"}
[DEBUG:304788] InboxActor: -> stopped
[DEBUG:304785] main: no
[DEBUG:304784] InboxActor: -> stopped
[DEBUG:304782] PocketsphinxDecoder: no
[DEBUG:304781] PocketsphinxDecoder: Transcription confidence: 0.1044066508421162
[DEBUG:304780] PocketsphinxDecoder: Decoded WAV in 0.18922734260559082 second(s)
[DEBUG:304589] PocketsphinxDecoder: rate=16000, width=2, channels=1.
[DEBUG:304585] main: Recorded 137324 byte(s) of audio data
[DEBUG:304584] InboxActor: -> stopped
[INFO:300307] quart.serving: 127.0.0.1:39550 POST /api/start-recording 1.1 200 2 7190
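In the log above, the intent matched but `speech_confidence` was only about 0.10. One way the controlling script could flag such results is sketched below. The threshold value and the helper name `is_reliable` are hypothetical, chosen only for illustration, and the JSON string is a shortened copy of the logged result.

```python
import json

# Hypothetical cutoff for illustration; tune it against your own recordings.
MIN_SPEECH_CONFIDENCE = 0.5


def is_reliable(result, threshold=MIN_SPEECH_CONFIDENCE):
    """Return True if the transcription confidence meets the threshold."""
    return result.get("speech_confidence", 0.0) >= threshold


# Shortened copy of the JSON result shown in the log.
raw = (
    '{"intent": {"name": "noAnswer", "confidence": 1.0}, '
    '"text": "no", "speech_confidence": 0.1044066508421162}'
)
result = json.loads(raw)
print(result["intent"]["name"], is_reliable(result))  # prints: noAnswer False
```

Results that fail the check could be retried or confirmed with the user instead of being forwarded directly.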
What can I do to solve this issue?
At the moment I am using Pocketsphinx. I noticed that Rhasspy 2.5 now also supports DeepSpeech. Would I get better results by switching to a different recogniser such as Kaldi or DeepSpeech?