Here we will keep track of the latest AI models for audio generation, starting in 2023!
Date | Release | Paper | Code | Trained Model |
---|---|---|---|---|
30.01 | SingSong: Generating musical accompaniments from singing | arXiv | - | - |
30.01 | AudioLDM: Text-to-Audio Generation with Latent Diffusion Models | arXiv | GitHub | Hugging Face |
30.01 | Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion | arXiv | GitHub | - |
29.01 | Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models | - | - | - |
28.01 | Noise2Music | - | - | - |
27.01 | RAVE2 | arXiv | GitHub | - |
26.01 | MusicLM: Generating Music From Text | arXiv | - | - |
18.01 | Msanii: High Fidelity Music Synthesis on a Shoestring Budget | arXiv | GitHub | Hugging Face, Colab |
16.01 | ArchiSound: Audio Generation with Diffusion | - | GitHub | - |
05.01 | VALL-E: Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers | arXiv | - | - |