[Feature request] upgrade TTS Python package #4169
Comments
We do maintain a fork at https://github.com/idiap/coqui-ai-TTS. Python 3.12 is already supported there. For 3.13 we have to wait for all dependencies to update, but you can subscribe to idiap#108 for updates.
Thanks @eginhard. I searched the docs but did not find the Python API documentation for text-to-speech.
Thanks @eginhard. I saw that page with the helpful examples, but I'm looking for API documentation.
Where can I find out what the individual parameters do? I am looking for this kind of detail for all relevant functions.
You can use Python's built-in help() in the REPL to display the docstring of any Python function:

>>> from TTS.api import TTS
>>> xtts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
>>> help(xtts.tts)
Help on method tts in module TTS.api:
tts(text: str, speaker: str | None = None, language: str | None = None, speaker_wav: str | None = None, emotion: str | None = None, split_sentences: bool = True, **kwargs) method of TTS.api.TTS instance
Convert text to speech.
Args:
text (str):
Input text to synthesize.
speaker (str, optional):
Speaker name for multi-speaker. You can check whether loaded model is multi-speaker by
`tts.is_multi_speaker` and list speakers by `tts.speakers`. Defaults to None.
language (str): Language of the text. If None, the default language of the speaker is used. Language is only
supported by `XTTS` model.
speaker_wav (str, optional):
Path to a reference wav file to use for voice cloning with supporting models like YourTTS.
Defaults to None.
emotion (str, optional):
Emotion to use for 🐸Coqui Studio models. If None, Studio models use "Neutral". Defaults to None.
split_sentences (bool, optional):
Split text into sentences, synthesize them separately and concatenate the file audio.
Setting it False uses more VRAM and possibly hit model specific text length or VRAM limits. Only
applicable to the 🐸TTS models. Defaults to True.
kwargs (dict, optional):
Additional arguments for the model. Note that Coqui Studio doesn't exist anymore, so the related options no longer apply.
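Beyond reading help() output interactively, the same information can be pulled programmatically with the standard-library inspect module. The sketch below is generic and uses json.dumps as a stand-in callable (TTS itself is not imported here, so nothing in it is specific to the TTS API):

```python
import inspect
import json

def describe_callable(func):
    """Return a mapping of parameter name -> (kind, default), plus the docstring."""
    sig = inspect.signature(func)
    params = {
        name: (str(p.kind), None if p.default is inspect.Parameter.empty else p.default)
        for name, p in sig.parameters.items()
    }
    return {"doc": inspect.getdoc(func), "params": params}

# Works for any pure-Python callable; swap in e.g. xtts.tts in a TTS session.
info = describe_callable(json.dumps)
print(sorted(info["params"]))
```

The same `describe_callable` call applied to `xtts.tts` would list the `text`, `speaker`, `language`, etc. parameters shown in the help() output above.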
How can I retrieve the sample rate used by the speech synthesizer? UPDATE: I found it...
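For readers with the same sample-rate question: once audio has been written to a WAV file (e.g. via tts_to_file), the rate can always be read back with the standard-library wave module, independently of the TTS API. (Whether the TTS object also exposes the rate directly through a synthesizer attribute is version-dependent and not assumed here.) A self-contained sketch using an in-memory WAV:

```python
import io
import wave

def wav_sample_rate(data: bytes) -> int:
    """Read the frame rate from WAV bytes (works the same on a file path)."""
    with wave.open(io.BytesIO(data), "rb") as wf:
        return wf.getframerate()

# Build a tiny in-memory WAV at 22050 Hz to demonstrate.
buf = io.BytesIO()
with wave.open(buf, "wb") as wf:
    wf.setnchannels(1)       # mono
    wf.setsampwidth(2)       # 16-bit samples
    wf.setframerate(22050)
    wf.writeframes(b"\x00\x00" * 22050)  # one second of silence

rate = wav_sample_rate(buf.getvalue())
print(rate)
```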
@eginhard Is there any default integrated script that can be used with a flag to generate emotion while cloning? And is there a list of emotions available inside the model? Please also guide me on whether I can increase the input text limit from 200 to 512 or so.
@Tortoise17 Your questions are completely unrelated to this issue. In the future, please create a new discussion/issue in that case.
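On the input-length question: if a model enforces a per-call character limit (the 200-character figure mentioned above), one workaround is to pre-chunk the text into sentence-aligned pieces and synthesize them one call at a time. The helper below is a hypothetical sketch, not part of the TTS API, and its sentence-splitting regex is an assumption:

```python
import re

def chunk_text(text: str, limit: int = 200) -> list[str]:
    """Greedily pack whole sentences into chunks no longer than `limit` chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        candidate = f"{current} {s}".strip()
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            chunks.append(current)
            current = s
    if current:
        chunks.append(current)
    return chunks

# 60 short sentences pack into 200-character chunks.
parts = chunk_text("One. " * 60, limit=200)
print(len(parts), max(len(p) for p in parts))
```

Each chunk could then be passed to a separate `tts` call and the resulting audio concatenated, mirroring what the `split_sentences` option in the docstring above does internally.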
🚀 Feature Description
The TTS package does not install when the Python version is greater than 3.11.
This is problematic considering that the current Python version is 3.13, and 3.14 is around the corner.
Solution
Support Python version 3.13 (at least)
Alternative Solutions
If upgrading is not hassle-free, fork the project to have a version that is compatible.
Additional context