Replies: 2 comments 2 replies
-
Those results sound pretty reasonable, tbh. I'm not sure you can get much better performance with a cloud-based LLM conversation agent. At some point, streaming will probably be supported over the Wyoming protocol, and then the perceived performance may improve noticeably, but I can't think of any other major developments until then.
-
First, if you have a good GPU in your household, you could move Whisper there; the results are incredible even with a GTX 1650 and up. If you do not, you can use Home Assistant Cloud for STT, which is actually pretty fast. ElevenLabs is of course slower than running Piper TTS locally, if speed is the priority.
If quality is the priority (TTS), use ElevenLabs Turbo 2.5 (as you are already doing).
-
I've assembled an Onju Voice for a child as an Alexa replacement. Overall it works quite well, but the child seems a bit disappointed that it answers so slowly compared to an Alexa.
My current pipeline is:
A random run with some command looks like:
Do you have any tips on speeding up these pipeline steps, in particular STT and natural language processing, which seem to cause the delay before an answer is played?
Maybe you're even using other upstream providers that are faster.