Create chat application for LLM agents

This is the basis of an upcoming paper (to be published in RBVM 2025). Its main goal is to design a chat application that combines speech synthesis, speech recognition, and an LLM-based completion device. It is meant to leverage the new YARP interfaces (https://github.com/roboticslab-uc3m/speech/issues/31). Components:

- Speech synthesis (TTS): [`yarp::dev::ISpeechSynthesizer`](https://www.yarp.it/latest/classyarp_1_1dev_1_1ISpeechSynthesizer.html) implemented with [Piper](https://github.com/rhasspy/piper)
- Speech recognition (ASR): [`yarp::dev::ISpeechTranscription`](https://www.yarp.it/latest/classyarp_1_1dev_1_1ISpeechTranscription.html) implemented with [Vosk](https://github.com/alphacep/vosk-api)
    - includes wake word detection implemented with [openWakeWord](https://github.com/dscripka/openWakeWord)
- Large language model (LLM): [`yarp::dev::ILLM`](https://www.yarp.it/latest/classyarp_1_1dev_1_1ILLM.html) implemented with [llama.cpp](https://github.com/ggml-org/llama.cpp)
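
To make the intended data flow concrete, here is a minimal sketch of one chat turn: audio is transcribed, gated on a wake word, passed to the LLM, and the answer is synthesized. Everything in it is an assumption for illustration: `ChatLoop`, its three callables, and the `"robot"` wake word are hypothetical stand-ins and do not reproduce the YARP interfaces or the Piper, Vosk, openWakeWord, or llama.cpp APIs. The wake word is also checked on the transcript here, whereas a dedicated detector would normally work on the audio stream.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class ChatLoop:
    """Hypothetical glue between the three components (not a YARP API)."""
    transcribe: Callable[[bytes], str]   # ASR stand-in: audio -> text
    complete: Callable[[str], str]       # LLM stand-in: prompt -> answer
    synthesize: Callable[[str], bytes]   # TTS stand-in: text -> audio
    wake_word: str = "robot"             # assumed wake word, for illustration
    history: List[Tuple[str, str]] = field(default_factory=list)

    def handle(self, audio: bytes) -> Optional[bytes]:
        """Run one turn; return synthesized audio, or None if not addressed."""
        text = self.transcribe(audio).strip()
        # Simplification: gate on the transcript instead of the raw audio.
        if not text.lower().startswith(self.wake_word):
            return None
        question = text[len(self.wake_word):].lstrip(" ,")
        answer = self.complete(question)
        self.history.append((question, answer))
        return self.synthesize(answer)


# Stub components so the sketch runs without any models installed.
loop = ChatLoop(
    transcribe=lambda audio: audio.decode("utf-8"),
    complete=lambda prompt: f"You asked: {prompt}",
    synthesize=lambda text: text.encode("utf-8"),
)

reply = loop.handle(b"Robot, what time is it?")
ignored = loop.handle(b"what time is it?")
```

Keeping the three stages behind plain callables mirrors the idea of swapping implementations behind a common interface, so that an offline backend can replace an online one without touching the loop.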

A very similar idea has been presented in [robotology/yarp-devices-llm](https://github.com/robotology/yarp-devices-llm), which uses the OpenAI API through Azure servers (its TTS/ASR may also run online). In contrast, this app is meant to run fully offline, so its main motivation is to introduce the recent DeepSeek models in the LLM component.
