TTS models are stored in ONNX format. ONNX is a platform-agnostic computational graph with weights. To run inference from an ONNX model you need any ONNX framework and the dictionary that converts text to token ids.
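A minimal sketch of loading the model with onnxruntime (one possible ONNX framework, not the only option); the file name model.onnx is a placeholder. Printing the graph inputs is a quick way to confirm they match the list below.

```python
# A minimal sketch, assuming the onnxruntime package and a placeholder
# file name model.onnx; any other ONNX framework would work as well.
import onnxruntime as ort

# Load the exported TTS graph.
session = ort.InferenceSession("model.onnx")

# Print the input names and shapes the graph expects; they should match
# the inputs described below (input, input_lengths, scales, sid).
for inp in session.get_inputs():
    print(inp.name, inp.shape)
```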
- input - text ids. The model takes integers as input; each integer corresponds to the id of a letter. The mapping between them is provided in text_to_seq.py (see the sketch after this list).
- input_lengths - length of the text. Should match the length of input.
- scales - array of [noise scale, length scale, noise scale of the duration predictor]. Modify length_scale to make speech faster and shorter. The noise scales for audio and for the duration predictor are vital for natural sound; keeping them unchanged is recommended.
- sid - speaker id (integer). Optional and unused in this setup, but the final system will have multiple speakers available.
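A hedged sketch of assembling these inputs and running inference with onnxruntime. The input names follow the list above; the dtypes, the example scale values, and the text_to_seq import are assumptions that should be checked against the actual export and text_to_seq.py.

```python
import numpy as np
import onnxruntime as ort
from text_to_seq import text_to_seq  # assumed helper that maps text to letter ids

session = ort.InferenceSession("model.onnx")  # placeholder path

# input: letter ids as a batch of one sequence (dtype assumed; check session.get_inputs())
ids = np.array([text_to_seq("Что-то совсем старушка распоясалася.")], dtype=np.int64)
# input_lengths: one length per sequence, matching the ids above
lengths = np.array([ids.shape[1]], dtype=np.int64)
# scales: [noise scale, length scale, duration-predictor noise scale];
# lower the length scale for faster, shorter speech, keep the noise scales as-is
# (the values here are illustrative, not taken from the repo)
scales = np.array([0.667, 1.0, 0.8], dtype=np.float32)
# sid: speaker id, unused in this setup
sid = np.array([0], dtype=np.int64)

audio = session.run(
    None,
    {"input": ids, "input_lengths": lengths, "scales": scales, "sid": sid},
)[0]
print(audio.shape)
```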
- infer_onnx.py - contains an example Python function that runs inference with the model.
Setup
pip install -r requirements.txt
Now you can use infer_onnx.py in your setup. Modify the text variable
to change the input to the model.
python infer_onnx.py --text "Что-то совсем старушка распоясалася." --model /path/to/model