Code for the "Deep Learning (for Audio) with Python" series on The Sound of AI YouTube channel.
This repository is a comprehensive collection of resources and code for understanding and implementing deep learning models for audio tasks. It serves as a practical guide, starting from the absolute basics (building neurons and backpropagation from scratch), moving to TensorFlow implementation, and culminating in building a complete Music Genre Classification system using various architectures (MLP, CNN, RNN-LSTM).
While this v2 release is fully functional and optimized for current environments, it may differ from the original version shown in the course. The codebase has been updated to reflect modern best practices (e.g. TensorFlow 2.16+, Librosa 0.11+) and improved dependency management. Consequently, the original course version has been deprecated; however, it remains available in the legacy branch for those wishing to follow the video content exactly.
To run the music genre classification lessons (Parts 4 and 5), you will need the GTZAN dataset. We provide an automated downloader that handles the download, extraction, and folder organization for you.
- Quick Start: Run `python dataset_downloader.py` from the root directory.
- Prerequisites: Install requirements.txt.
Full Instructions: Please check the Instructions GTZAN file for detailed help using the downloader script or manual download steps.
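After the download finishes, a quick sanity check on the resulting folders can catch an incomplete extraction early. The sketch below is not part of this repository: it assumes the dataset ends up as one subfolder per genre containing `.wav` files, and both `count_tracks` and the `path/to/genres` location are hypothetical names used only for illustration. GTZAN is normally distributed as 10 genres with 100 clips each, so counts far from that suggest something went wrong.

```python
from pathlib import Path


def count_tracks(dataset_root, extension=".wav"):
    """Return {genre_folder_name: number_of_audio_files} for each subfolder.

    Assumes one subfolder per genre; files directly under the root are ignored.
    """
    root = Path(dataset_root)
    counts = {}
    for genre_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[genre_dir.name] = sum(
            1
            for f in genre_dir.iterdir()
            if f.is_file() and f.suffix.lower() == extension
        )
    return counts


if __name__ == "__main__":
    # Hypothetical location: point this at wherever the downloader placed the data.
    dataset_root = Path("path/to/genres")
    if dataset_root.is_dir():
        for genre, n_files in count_tracks(dataset_root).items():
            print(f"{genre}: {n_files} files")
    else:
        print(f"No dataset folder found at {dataset_root}")
```

If the actual layout differs, the Instructions GTZAN file remains the authoritative description.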
- Course Overview: Video | Slides
- AI, Machine Learning and Deep Learning: Video | Slides
- Implementing an Artificial Neuron from Scratch: Video | Slides | Code
- Vector and Matrix Operations: Video | Slides
- Computation in Neural Networks: Video | Slides
- Implementing a Neural Network from Scratch: Video | Code
- Training a Neural Network (Backprop & Gradient Descent): Video | Slides
- Implementing Backpropagation from Scratch: Video | Code
- Implementing a Neural Network with TensorFlow 2: Video | Code
- Understanding Audio Data for Deep Learning: Video | Slides
- Preprocessing Audio Data (MFCCs/Spectrograms): Video | Code
- Preparing the Dataset: Video | Code
- Implementing a Neural Network for Classification: Video | Slides | Code
- Solving Overfitting: Video | Slides | Code
- Convolutional Neural Networks (CNN) Explained: Video | Slides
- Implementing a CNN for Music Genre Classification: Video | Code
- Recurrent Neural Networks (RNN) Explained: Video | Slides
- Long Short Term Memory (LSTM) Explained: Video | Slides
- Implementing an RNN-LSTM for Music Genre Classification: Video | Code
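For a sense of where the from-scratch lessons listed above start, a single artificial neuron can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the code from the lesson: the function names and example values are assumptions, a sigmoid activation is assumed, and the bias term is left out for brevity.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def activate(inputs, weights):
    """Compute a neuron's output: sigmoid of the weighted sum of its inputs."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # Net input: sum of each input multiplied by its weight (no bias term here).
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(net_input)


if __name__ == "__main__":
    # Illustrative values only.
    print(activate([0.5, 0.3, 0.2], [0.4, 0.7, 0.2]))
```

The later lessons build on the same idea by stacking many such units into layers and learning the weights with backpropagation.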
To ensure the models and scripts execute correctly, please follow these steps from your terminal:
Before running any lesson script, ensure you have the necessary dependencies installed:
pip install -r requirements.txt
Each class is self-contained. Move into the specific directory for the lesson you are studying:
cd class/folder/name # Replace with the specific class directory
Run the main script using Python:
python mlp.py # Replace with the specific script name
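If a script fails right after installation, it can help to confirm that the installed packages meet the versions this release targets (TensorFlow 2.16+ and Librosa 0.11+, as noted above). The following is a minimal sketch using only the standard library; `check_versions` and `major_minor` are hypothetical helpers, not part of this repository, and requirements.txt remains the authoritative source for the exact pins.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimums taken from the versions mentioned in this README (an assumption;
# requirements.txt may pin something more specific).
MINIMUM_VERSIONS = {"tensorflow": (2, 16), "librosa": (0, 11)}


def major_minor(version_text):
    """Extract (major, minor) from a version string such as '2.16.1'."""
    match = re.match(r"(\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version_text!r}")
    return int(match.group(1)), int(match.group(2))


def check_versions(minimums=MINIMUM_VERSIONS, get_version=version):
    """Return {package: 'ok' | 'too old' | 'missing'} for each package."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if major_minor(installed) >= minimum else "too old"
    return report


if __name__ == "__main__":
    for package, status in check_versions().items():
        print(f"{package}: {status}")
```

Reading the installed metadata avoids importing TensorFlow itself, so the check runs quickly even on machines where the import would be slow or would fail.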