Allow singleUtterance: false #63
Comments
Thanks for the super detailed suggestion! This is a good idea :) I'm working on a refactor of Sonus that allows for more cloud recognizers and a bit more control of their configuration, which should address this. Another thought I've had is to add the ability to change configuration after a hotword is detected, but before speech begins streaming (essentially a synchronous config update via a callback function that gets passed to the
@rhclayto - That's exactly what I'm looking for!
@evancohen
Awesome :) I'll definitely integrate this into the next release. One change though: `noSingleUtterance` should be a parameter on the hotword; it shouldn't be hard-coded for a specific index. I'm traveling this week, but next week I'll release the update (unless you want to send a PR for it).
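To illustrate the suggestion of putting the flag on the hotword rather than tying it to an index, here is a minimal sketch. It assumes hotwords are plain objects with `file` and `hotword` keys and adds a per-hotword boolean named `noSingleUtterance`; that key, the model file paths, and the helper `singleUtteranceFor` are hypothetical and not part of any released Sonus API.

```javascript
// Hypothetical hotword definitions. The `noSingleUtterance` key is an assumed
// per-hotword option and the file paths are placeholders.
const hotwords = [
  { file: 'resources/command.pmdl', hotword: 'command' },
  { file: 'resources/dictate.pmdl', hotword: 'dictate', noSingleUtterance: true },
];

// Hypothetical helper: decide the singleUtterance value for the hotword that
// fired, looked up by name so nothing depends on its position in the array.
// Hotwords without the flag (or unknown names) keep the default of true.
function singleUtteranceFor(hotwordList, name) {
  const match = hotwordList.find((h) => h.hotword === name);
  return !(match && match.noSingleUtterance === true);
}
```

The returned value could then be passed as `singleUtterance` when the recognition stream is created for that hotword, so short commands and free-form dictation can coexist in one app.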
Hi Evan, yesterday I checked out your latest release and tried to implement the discussed option for Sonus to allow The approach I've made in v0.1.9-1 was to define an option
And here are the modifications (*** added) in the
Setting `singleUtterance: true` in `recognizer.createRecognizeStream` leaves it to Google to detect when an utterance has stopped & end the recognition stream. A problem I encountered is that Google is often too eager to stop the detection, i.e., it treats slight pauses in speech as the end of an utterance. This is good for detecting short command-type utterances, such as 'turn off light', but for longer free-form dictation it's problematic.

Adding an option to Sonus to allow a configurable `singleUtterance` would give developers the flexibility to implement their own end-of-utterance detection & stream teardown. So I guess this is a feature request.

But I also wanted to post this here because I made some changes to my fork of Sonus to allow this, & thought I'd share them, even though the fork has some stuff specific to my use hard-coded into it & is therefore not pull-request-worthy. But it might contain the seed of something that can be put into Sonus, if desired.
These changes make it possible to configure `singleUtterance` to true or false based on the hotword used to trigger Sonus. They also make it possible to stop the recognition stream by emitting a `recognitionStreamShutdown` event from somewhere in an app. So now deciding when an utterance is over & when to stop the recognition stream with Google is in the developer's hands. Developers will still need to handle the fact that Google is going to issue `isFinal` properties on its results when it thinks the utterance is over, so if you want your transcription to span the length of your configured timeout period, you'll need to concatenate the various `isFinal` results that Google issues.

As an example, I used the silence & sound events emitted by Sonus (as determined by the Snowboy detector), along with a `setTimeout`, to determine when an utterance had ended (i.e., after a certain amount of sustained silence, the utterance is determined to be over). Here's the code I used:
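A minimal sketch of that approach follows; it is not the fork's actual code. It assumes the emitter passed in fires `sound` and `silence` events and that `recognitionStreamShutdown` is emitted on that same emitter. The helper names `watchForEndOfUtterance` and `createTranscript`, and the `{ isFinal, transcript }` result shape, are hypothetical.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: emit 'recognitionStreamShutdown' on `emitter` once
// `timeoutMs` of sustained silence has passed with no 'sound' event.
function watchForEndOfUtterance(emitter, timeoutMs) {
  let timer = null;

  const cancel = () => {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
  };

  const onSilence = () => {
    // Start the countdown once per silent stretch, so repeated 'silence'
    // events do not keep pushing the deadline back.
    if (timer !== null) return;
    timer = setTimeout(() => {
      timer = null;
      emitter.emit('recognitionStreamShutdown');
    }, timeoutMs);
  };

  // Any sound means the speaker is still talking: abandon the countdown.
  emitter.on('sound', cancel);
  emitter.on('silence', onSilence);

  // Cleanup function: clears a pending timer and detaches both listeners.
  return () => {
    cancel();
    emitter.removeListener('sound', cancel);
    emitter.removeListener('silence', onSilence);
  };
}

// Hypothetical accumulator: joins the text of every result marked isFinal,
// so one transcript can span several recognizer-detected utterances.
// The { isFinal, transcript } shape is an assumption made for this sketch.
function createTranscript() {
  const parts = [];
  return {
    add(result) {
      if (result && result.isFinal && result.transcript) {
        parts.push(result.transcript.trim());
      }
    },
    text() {
      return parts.join(' ');
    },
  };
}
```

In an app, the first argument would be whatever object emits the `sound` and `silence` events, and each recognizer result would be fed to `add` until the shutdown event fires, at which point `text()` gives the full dictation.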
Perhaps this can help someone.