Building a massive 1000 GB TTS torrent #2597

neurlang · 2023-05-07T16:12:07Z

Hello, I have a new disk.
I want to build a massive FLAC dataset covering as many languages as possible.
This will not be a repack of LJSpeech or similar; the focus is on rarer datasets that could disappear.
Does anyone have ideas for what to include? Please also mention the estimated size of the data in GB.
Ideas for a file name / metadata convention would also be great.

FrontierDK · 2023-05-07T16:17:39Z

Each dataset will most likely have its own usage rights, so instead of creating one massive file, how about creating a separate list file for each dataset, with the usage rights listed next to each file?
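
A rough sketch of what such a per-dataset list file could look like, assuming a simple tab-separated layout with one row per file. The function names, the example paths, and the license strings are all hypothetical placeholders, not something agreed on in this thread:

```python
import csv
import io

# Hypothetical example entries: the paths and license strings are
# placeholders for illustration, not real datasets from this thread.
EXAMPLE_FILES = [
    ("example_dataset/file_0001.flac", "CC0-1.0"),
    ("example_dataset/file_0002.flac", "CC0-1.0"),
]


def render_list_file(entries):
    """Return tab-separated text with one 'path<TAB>license' row per file."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["path", "license"])  # header row
    writer.writerows(entries)
    return buf.getvalue()


def licenses_used(entries):
    """Return the sorted, de-duplicated licenses that appear in a list file."""
    return sorted({license_id for _path, license_id in entries})
```

Keeping the license next to each path means a downloader could check the rights for exactly the files they picked, without having to read a separate document.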

neurlang (Author) · 2023-05-07T16:20:54Z

I wasn't thinking of one massive file; quite the opposite: one torrent with language subfolders, and speaker subfolders inside those.
This will ensure that people can download just the part they need.
As for the rights to the datasets, obviously only data under a license that reasonably permits public sharing will be distributed.
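
To make the language/speaker layout concrete, here is a minimal sketch of how paths inside the torrent might be built and filtered. The `utterance_path` and `files_for_language` helpers, the `.flac` naming, and the example IDs are assumptions for illustration only; no naming convention has been settled in this thread:

```python
from pathlib import PurePosixPath


def utterance_path(language, speaker, utterance_id):
    """Build '<language>/<speaker>/<utterance_id>.flac' (hypothetical layout)."""
    for part in (language, speaker, utterance_id):
        # Reject empty names and anything that would escape its folder.
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(language) / speaker / f"{utterance_id}.flac"


def files_for_language(paths, language):
    """Keep only the paths whose top-level folder is the given language."""
    return [p for p in paths if p.parts and p.parts[0] == language]


# Example IDs below are placeholders, not real speakers or recordings.
all_paths = [
    utterance_path("sk", "speaker_001", "utt_0001"),
    utterance_path("cs", "speaker_007", "utt_0001"),
]
slovak_only = files_for_language(all_paths, "sk")
```

With a layout like this, selecting a single language (or a single speaker) in a torrent client comes down to ticking one folder.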
