I have some custom REST API server code that currently replies with a string, and WIS TTS reads that string out on the ESP32 box. This is great for short replies, but not so great when the LLM needs to read out a long text. Would it be possible to send responses sentence by sentence, as they get streamed from the LLM I use? Something similar to how the voice version of ChatGPT gradually streams back its reply?
Maybe a related observation: by default my C# ASP.NET API uses `Transfer-Encoding: chunked` and does not return a `Content-Length` header. In that case Willow just reads aloud "Success" instead of the body I send, because it fails to determine the length. If I change my code to force a `Content-Length` header, it reads the body correctly.
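For reference, forcing the header is just a matter of setting `Response.ContentLength` before writing the body. A minimal sketch of one way to do it (route and body text are placeholders):

```csharp
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;

// Inside a controller:
[HttpGet("reply")] // placeholder route
public async Task Reply()
{
    var body = "The full reply as one string."; // placeholder body
    var bytes = Encoding.UTF8.GetBytes(body);

    Response.ContentType = "text/plain; charset=utf-8";
    // An explicit Content-Length keeps Kestrel from switching to
    // Transfer-Encoding: chunked, so clients that rely on the header
    // (like Willow here) can read the body.
    Response.ContentLength = bytes.Length;
    await Response.Body.WriteAsync(bytes);
}
```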
This got me thinking... could my request above be implemented using chunked transfer encoding?
Something like this proposal from GPT-4:
[HttpGet("stream")]publicasyncTaskStreamResponse(){Response.Headers.Add("Transfer-Encoding","chunked");foreach(varpartinGetDataParts()){awaitResponse.WriteAsync(part);awaitResponse.Body.FlushAsync();// Important to flush the stream// Simulate some real-time delay or processingawaitTask.Delay(1000);}}privateIEnumerable<string>GetDataParts(){yieldreturn"Part 1 ";yieldreturn"Part 2 ";yieldreturn"Part 3 ";}
The difficulty is that the ESP box would then need to keep contacting the inference server to get audio for each separate sentence as it comes in.
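To make the idea concrete, here is a rough consumer-side sketch in C#: read the chunked reply incrementally, cut it at naive sentence boundaries, and request TTS audio for each completed sentence. The `/stream` URL matches the proposal above; the WIS TTS endpoint and its `text` parameter are invented for illustration and will not match the real API.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class StreamingTtsClient
{
    static async Task Main()
    {
        using var http = new HttpClient();

        // Start reading as soon as headers arrive instead of buffering the body.
        using var response = await http.GetAsync(
            "http://localhost:5000/stream",
            HttpCompletionOption.ResponseHeadersRead);
        using var reader = new StreamReader(await response.Content.ReadAsStreamAsync());

        var sentence = new StringBuilder();
        var buffer = new char[256];
        int read;
        while ((read = await reader.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                sentence.Append(buffer[i]);
                // Naive sentence boundary: flush on '.', '!' or '?'.
                if (buffer[i] is '.' or '!' or '?')
                {
                    await SpeakAsync(http, sentence.ToString().Trim());
                    sentence.Clear();
                }
            }
        }
        if (sentence.Length > 0)
            await SpeakAsync(http, sentence.ToString().Trim());
    }

    // Invented TTS call; the real WIS endpoint and parameters will differ.
    static async Task SpeakAsync(HttpClient http, string text)
    {
        var audio = await http.GetByteArrayAsync(
            "http://wis.local/api/tts?text=" + Uri.EscapeDataString(text));
        Console.WriteLine($"Got {audio.Length} bytes of audio for: {text}");
        // ...hand the audio to the playback pipeline on the box...
    }
}
```

Fetching the sentences sequentially keeps playback in order at the cost of a pause between sentences; overlapping the TTS requests would need a small queue.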