[CONTRIBUTION] Speech Dataset Generator

Hi everyone!

My name is David Martin Rius and I have just published this project on GitHub: https://github.com/davidmartinrius/speech-dataset-generator/

Now you can **create datasets automatically** from any audio file or list of audio files.

I hope you find it useful.

## Key features of the project

1. **Dataset Generation:** Creates **multilingual** datasets with Mean Opinion Score (MOS) values.

2. **Silence Removal:** Removes silences from audio files, improving the overall quality.

3. **Sound Quality Improvement:** Improves the audio quality when needed.

4. **Audio Segmentation:** Splits audio files into segments whose duration falls within a specified range of seconds.

5. **Transcription:** Transcribes the segmented audio to text.

6. **Gender Identification:** It identifies the gender of each speaker in the audio.

7. **Pyannote Embeddings:** Uses pyannote embeddings to detect speakers across multiple audio files.

8. **Automatic Speaker Naming:** Automatically assigns names to speakers detected across multiple audio files.

9. **Multiple Speaker Detection:** Detects multiple speakers within each audio file.

10. **Speaker Embedding Storage:** Detected speakers are stored in a Chroma database, so you do not need to assign speaker names manually.

11. **Speech Rate Metrics:** Computes syllabic and words-per-minute metrics.
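To give a rough idea of what the syllabic and words-per-minute metrics mean, here is a minimal sketch. It is not the project's actual code: the function names `count_syllables` and `speech_rate` are hypothetical, and the syllable counter is a simple English-only vowel-group heuristic that I am assuming purely for illustration.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllable count: number of consecutive-vowel groups.

    Assumption: a crude English-only heuristic, not a real syllabifier.
    """
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def speech_rate(transcript: str, duration_seconds: float) -> dict:
    """Compute word and syllable rates for one transcribed segment."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")

    # Treat runs of letters and apostrophes as words.
    words = re.findall(r"[A-Za-z']+", transcript)
    syllables = sum(count_syllables(w) for w in words)
    minutes = duration_seconds / 60.0

    return {
        "words": len(words),
        "syllables": syllables,
        "words_per_minute": len(words) / minutes,
        "syllables_per_minute": syllables / minutes,
    }


if __name__ == "__main__":
    # A 30-second segment containing six words.
    print(speech_rate("hello world this is a test", 30.0))
```

A multilingual dataset would need a language-aware syllable counter in place of the vowel-group heuristic, but the rate calculation itself (count divided by duration in minutes) stays the same.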

Feel free to explore the project at https://github.com/davidmartinrius/speech-dataset-generator

David Martin Rius
