Replies: 2 comments
-
Each dataset will most likely have its own usage rights, so instead of creating one massive file, why not create several list files, one per dataset, with the usage rights listed next to each file?
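As a rough sketch of what such a per-dataset list file could look like: the snippet below writes one tab-separated line per audio file with its license next to it. The column names, the example paths, and the SPDX-style license strings are all assumptions made up for illustration, not an agreed convention.

```python
import csv
import io


def write_rights_list(rows, out):
    """Write a header plus one 'path<TAB>license' line per file."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["path", "license"])
    for path, license_name in rows:
        writer.writerow([path, license_name])


# Hypothetical entries for one dataset; real paths and licenses would
# come from each dataset's own documentation.
rows = [
    ("cy/spk0001/spk0001_000001.flac", "CC-BY-4.0"),
    ("cy/spk0001/spk0001_000002.flac", "CC-BY-4.0"),
]

buf = io.StringIO()  # stands in for an open file such as "dataset_a.tsv"
write_rights_list(rows, buf)
listing = buf.getvalue()
print(listing)
```

Keeping one small list per dataset means a dataset with unclear rights can be dropped later without touching the others.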
-
I wasn't thinking of one massive file, rather the opposite: one torrent with language subfolders, and speaker subfolders inside those.
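To make the layout concrete, here is a minimal sketch of how a path could be built under that scheme. The use of ISO 639-1 language codes, the `spk0001` speaker IDs, and the zero-padded utterance index are assumptions for illustration only, nothing here has been decided.

```python
from pathlib import PurePosixPath


def utterance_path(language, speaker, index):
    """Build '<language>/<speaker>/<speaker>_<index>.flac'.

    language: assumed to be an ISO 639-1 code such as "cy"
    speaker:  hypothetical speaker ID such as "spk0001"
    index:    utterance number, zero-padded to six digits
    """
    return PurePosixPath(language) / speaker / f"{speaker}_{index:06d}.flac"


example = utterance_path("cy", "spk0001", 42)
print(example)
```

Repeating the speaker ID in the file name keeps each file identifiable even if it gets copied out of its folder.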
-
Hello, I have a new disk.
I want to build a massive FLAC dataset covering as many languages as possible.
This will not be a repack of LJSpeech or the like; the focus is on rarer datasets that could disappear.
Does anyone have ideas for what to include? Please also mention the estimated size of the data in GB.
Ideas for a file name / metadata convention would also be great.