This repository contains code for training and inference of multi-speaker and single-speaker speech synthesis models using Prosody2Vec. The repo is conceptually based on the paper at https://arxiv.org/pdf/2212.06972 and extends it with multi-speaker prosody conversion.
- Clone the repository:

      git clone https://github.com/yourusername/Prosody2Vec.git
      cd Prosody2Vec
- Install the required dependencies:

      pip install -r requirements.txt
- Place your dataset in the `Emotion Speech Dataset` directory.
- Ensure the dataset is organized in subdirectories for each emotion and speaker.
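The exact folder layout is not spelled out above, so here is a minimal Python sketch of one plausible arrangement (one folder per speaker, with one sub-folder per emotion); the `Emotion Speech Dataset` path and the nesting order are assumptions, so adjust them to match your copy of the data:

```python
# Hypothetical layout check. Assumes:
#   Emotion Speech Dataset/<speaker>/<emotion>/*.wav
# Swap the loop order if your copy nests emotion above speaker.
from pathlib import Path

root = Path("Emotion Speech Dataset")
for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    for emotion_dir in sorted(p for p in speaker_dir.iterdir() if p.is_dir()):
        n_wavs = len(list(emotion_dir.glob("*.wav")))
        print(f"speaker={speaker_dir.name}  emotion={emotion_dir.name}  files={n_wavs}")
```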
The models use a combination of pre-trained models from https://github.com/bshall/acoustic-model/releases/tag/v0.1 and custom layers for speech synthesis. The main components include:
- Encoder: Extracts features from the input speech.
- Decoder: Generates the output speech from the encoded features.
- Fusion Layers: Combine features from different sources (e.g., emotion vectors, speaker vectors).
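The repository's actual layer definitions are not reproduced here, but as an illustration of the fusion step, the following is a minimal PyTorch sketch that concatenates frame-level content features with utterance-level emotion and speaker vectors; the class name, dimensions, and projection are illustrative assumptions, not the repo's real code:

```python
import torch
import torch.nn as nn

class FusionLayer(nn.Module):
    """Illustrative fusion block: concatenates content, emotion, and speaker
    features along the channel axis and projects back to the decoder size."""

    def __init__(self, content_dim=256, emotion_dim=128, speaker_dim=256, out_dim=256):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(content_dim + emotion_dim + speaker_dim, out_dim),
            nn.ReLU(),
            nn.Linear(out_dim, out_dim),
        )

    def forward(self, content, emotion, speaker):
        # content: (batch, frames, content_dim)
        # emotion / speaker: (batch, dim) utterance-level vectors, broadcast over frames
        frames = content.size(1)
        emotion = emotion.unsqueeze(1).expand(-1, frames, -1)
        speaker = speaker.unsqueeze(1).expand(-1, frames, -1)
        return self.proj(torch.cat([content, emotion, speaker], dim=-1))

# Example: fuse 100 frames of content features with utterance-level vectors.
fused = FusionLayer()(torch.randn(2, 100, 256), torch.randn(2, 128), torch.randn(2, 256))
print(fused.shape)  # torch.Size([2, 100, 256])
```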
This project uses pre-trained models from the following repositories:

- https://github.com/bshall/acoustic-model
We thank the authors of these repositories for their contributions to the community.
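The v0.1 release linked above comes from the bshall/acoustic-model project. Assuming its upstream torch.hub interface carries over unchanged to this repo (it may instead load the release checkpoint directly), pulling in a pre-trained acoustic model looks roughly like this:

```python
import torch

# Assumption: the checkpoint follows the upstream bshall/acoustic-model
# torch.hub entry points ("hubert_soft" / "hubert_discrete"); treat this
# as a rough sketch rather than this repo's documented loading path.
acoustic = torch.hub.load("bshall/acoustic-model:main", "hubert_soft")
acoustic.eval()

# Dummy soft speech units: (batch, frames, unit_dim) as in the upstream README.
units = torch.randn(1, 100, 256)
with torch.inference_mode():
    mel = acoustic.generate(units)  # predicted mel-spectrogram
print(mel.shape)
```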