fix: strip leading/trailing silence from tts output #420
Kokoro generates ~300ms of silence at the start and end of every generation.
This strips that silence so that there aren't such long breaks between sentences.
Important
Adds `strip_silence()` to remove silence from TTS output in `scripts/tts_server.py`.
- `strip_silence()` function in `scripts/tts_server.py` to remove leading/trailing silence from audio data.
- `strip_silence()` is called in `text_to_speech()` to process generated audio.
- `strip_silence()` uses amplitude threshold and minimum silence duration to identify and remove silence.
This description was created for d79606a. It will automatically update as commits are pushed.
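For reviewers who want a concrete picture of the approach summarized above, here is a minimal sketch of amplitude-threshold trimming with a minimum silence duration. It is not the code from this PR: the parameter names, the default values (24 kHz sample rate, 0.01 threshold, 100 ms minimum), and the choice to trim a silent run only when it is at least the minimum duration are all assumptions made for illustration.

```python
from typing import List, Sequence


def strip_silence(
    samples: Sequence[float],
    sample_rate: int = 24_000,      # assumed sample rate, not taken from the PR
    threshold: float = 0.01,        # assumed amplitude threshold (samples in [-1, 1])
    min_silence_ms: float = 100.0,  # assumed minimum silence duration
) -> List[float]:
    """Trim leading/trailing quiet runs that last at least min_silence_ms."""
    min_len = int(sample_rate * min_silence_ms / 1000)

    # Index of the first sample louder than the threshold.
    first = next((i for i, s in enumerate(samples) if abs(s) > threshold), None)
    if first is None:
        # Nothing exceeds the threshold: treat the whole clip as silence.
        return []

    # Index of the last sample louder than the threshold.
    last = len(samples) - 1 - next(
        i for i, s in enumerate(reversed(samples)) if abs(s) > threshold
    )

    # Only trim a side when its silent run is long enough to count as silence.
    start = first if first >= min_len else 0
    trailing = len(samples) - 1 - last
    end = last + 1 if trailing >= min_len else len(samples)
    return list(samples[start:end])
```

In this sketch, short quiet stretches below the minimum duration are left alone, so brief natural onsets and decays are not clipped. A real implementation working on NumPy arrays would likely vectorize the threshold search instead of iterating in Python.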