Audio cues: issue tracking #69

naorunaoru · 2024-07-19T12:28:07Z

As a user, I'd like my smart whoopie cushion to respond with audio feedback. Example cases:

detecting a wake word (micro or otherwise)
detecting end of speech
failing to perform a request
changing volume

There's a feature request in ESPHome, esphome/feature-requests#2490, which didn't get much attention for some reason.

There's also a Home Assistant discussion with some hacks: https://community.home-assistant.io/t/play-sound-when-wakeword-detected/653928/32

Apart from the ESPHome limitation, is there anything specific to Onju Voice that might prevent us from implementing this in the future? Off the top of my head, I can think of full-duplex audio with echo cancellation, as I'm not familiar with what the ESP-IDF audio pipeline can do. The custom component used in the microwakeword version of this config apparently supports simultaneous listening and playback, but it comes with its own tradeoffs. The other issue is that ESPHome apparently can't play multiple audio streams at the same time, which is needed for audio feedback when adjusting the volume while music is playing.
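
On the multiple-streams point, the mixing step itself is conceptually simple. A minimal sketch, in plain Python for illustration only (this is not ESPHome code, and `mix_cue` and `cue_gain` are hypothetical names): add the cue into the music sample by sample and clamp the result to the signed 16-bit range. Whether the ESPHome audio pipeline offers a place to hook this in is not established anywhere in this thread.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def mix_cue(music, cue, cue_gain=0.5):
    """Mix a short cue into a music buffer of signed 16-bit samples.

    Both inputs are lists of ints at the same sample rate (an assumption).
    A cue longer than the music buffer is truncated.
    """
    out = list(music)
    for i, sample in enumerate(cue[:len(out)]):
        mixed = out[i] + round(sample * cue_gain)
        # Saturate instead of wrapping around, which would sound like a loud click.
        out[i] = max(INT16_MIN, min(INT16_MAX, mixed))
    return out


mixed = mix_cue([0, 30000, -30000], [1000, 10000, -10000], cue_gain=1.0)
```

Lowering the music level before adding the cue would reduce how often the clamp kicks in, at the cost of a brief dip in the music.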

The purpose of opening this thread is to share any links related to this feature and discuss possible workarounds.

I also noticed that people generally suggest playing prerecorded audio, which isn't well suited to ESP32-based systems due to flash size limitations. I'd suggest using a tone generator instead. For example, a sine wave synthesizer with the ability to define an ADSR volume envelope wouldn't take much flash space or many CPU cycles, but would allow for instant audio cues without relying on external audio sources. What do you think?
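
To make the tone-generator idea concrete, here is a minimal sketch of a sine wave with a linear ADSR envelope. It is written in Python purely to show the arithmetic; an on-device version would need to be ported to C++ as an ESPHome component. The function names, the 16 kHz sample rate, and the default envelope times are all assumptions, not anything taken from ESPHome or the Onju Voice config.

```python
import math


def adsr_gain(t, duration, attack, decay, sustain, release):
    """Linear ADSR envelope: gain in 0.0..1.0 at time t (seconds)."""
    if t < 0.0 or t >= duration:
        return 0.0
    if t < attack:
        return t / attack                       # ramp 0 -> 1
    if t < attack + decay:
        return 1.0 - (1.0 - sustain) * (t - attack) / decay  # ramp 1 -> sustain
    if t < duration - release:
        return sustain                          # hold
    return sustain * (duration - t) / release   # ramp sustain -> 0


def render_cue(freq_hz, duration, sample_rate=16000, attack=0.01,
               decay=0.03, sustain=0.6, release=0.05, amplitude=0.8):
    """Render one cue as a list of signed 16-bit PCM samples."""
    if attack + decay + release > duration:
        raise ValueError("envelope stages do not fit in the cue duration")
    samples = []
    for i in range(round(duration * sample_rate)):
        t = i / sample_rate
        gain = adsr_gain(t, duration, attack, decay, sustain, release)
        value = math.sin(2.0 * math.pi * freq_hz * t) * gain * amplitude
        samples.append(round(value * 32767))
    return samples


wake_cue = render_cue(880.0, 0.2)  # a 200 ms beep at 880 Hz
```

Each cue is then described by a handful of numbers (frequency, duration, four envelope parameters) rather than by stored samples, which is where the flash saving would come from.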

tetele (Maintainer) · 2024-07-19T13:35:27Z

> I also noticed that people generally suggest playing prerecorded audio, which isn't well suited to ESP32-based systems due to flash size limitations.

The Onju has 16MB of NOR flash on board, which could be used to store such small audio files, but the limitation is that ESPHome does not play "files" that are not embedded in the firmware. The truth is, though, that I am nearly clueless about partitions and how to extend the onboard flash with the external one.
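
For a rough sense of scale, here is a back-of-the-envelope estimate of how much space short uncompressed cues would take. The 16 kHz, 16-bit, mono format is an assumption, and the count ignores everything else that has to live in flash (firmware, partition table, any filesystem overhead), so it is an upper bound at best.

```python
FLASH_BYTES = 16 * 1024 * 1024  # the 16MB figure mentioned above


def pcm_size_bytes(duration_s, sample_rate=16000, bits_per_sample=16, channels=1):
    """Size of an uncompressed PCM clip in bytes (format is an assumption)."""
    frames = round(duration_s * sample_rate)
    return frames * (bits_per_sample // 8) * channels


half_second_cue = pcm_size_bytes(0.5)          # 16000 bytes
cues_that_fit = FLASH_BYTES // half_second_cue  # 1048, ignoring all overhead
```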

12 replies

tetele Aug 2, 2024
Maintainer

Apart from not being able to play media on a speaker, the biggest loss is volume control. And it's quite loud at the default 100%.

You could maybe get around that by controlling the DAC directly (I haven't researched how to do that).

I do agree, however, that it's tedious to set up URLs pointing to HA to play sounds.

jhbruhn Aug 2, 2024

Maybe we'll need to implement a custom component for the Onju's DAC based on this, as soon as that implementation has matured a bit. But the MAX98357 does not have a digital interface (besides I2S), so all volume setting would be done on the ESP32 audio side anyway.
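
Setting the volume on the ESP32 side would amount to scaling the samples before they go out over I2S. A minimal sketch of that scaling, in Python for illustration (`apply_volume` is a hypothetical name, and the linear 0.0 to 1.0 mapping is an assumption, not how any existing component does it):

```python
def apply_volume(samples, volume):
    """Scale signed 16-bit PCM samples by a linear volume in 0.0..1.0."""
    if not 0.0 <= volume <= 1.0:
        raise ValueError("volume must be between 0.0 and 1.0")
    # The clamp is defensive; with volume <= 1.0 the result cannot overflow.
    return [max(-32768, min(32767, round(s * volume))) for s in samples]


quieter = apply_volume([1000, -1000, 32766], 0.5)  # [500, -500, 16383]
```

A linear factor is the simplest choice. Perceived loudness is not linear in amplitude, so a real implementation might prefer a curved mapping from the user-facing volume to the gain.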

dreimer1986 Aug 3, 2024

Well, I added timer support and kept media_player as it is. Being able to run Music Assistant over it was one of my main reasons for using it, so speaker is out of the question for me. And yes, I get my timer finished signal from an online source, just like the official ESPHome ones do, too.

Like: https://github.com/esphome/firmware/blob/main/wake-word-voice-assistant/esp32-s3-box-3.yaml
Check the file: setting.

My sorry experiments:
https://github.com/dreimer1986/onju-voice-satellite/blob/main/esphome/onju-voice-microwakeword.yaml

While we are at it... I'm trying to get microwakeword and Assistant itself to hear better. To make it wake up I have to sit even closer than right next to the Onju device. I tried a few settings, which you can find commented out under microphone and voice_assistant, but it seems my current tinkering did not really help here...

P.S. I am more than open to any suggestions regarding my below-basic-level tinkering. Of course I'm waiting for tetele to have some spare time to continue his great work, but until then... I thought I could hack a bit myself ^^

jhbruhn Aug 3, 2024

I think you solved it in the best way currently possible by keeping media_player. But the S3 Box firmware does it a bit differently: it embeds the file (yes, from an online source) into the firmware image during compilation. At runtime, the sound is loaded from the firmware image, so the S3 Box is not making an online request.

dreimer1986 Aug 3, 2024

I just realized that, too. Sorry for the false claim... Aaaanyways. It works, and I love the fact that there's one thing less I have to rely on my Echo device for ^^. The list shrinks more and more... If only this thing would listen to commands and the wake word a bit better...
