Deep Learning For Audio With Python

Code for the "Deep Learning (for Audio) with Python" series on The Sound of AI YouTube channel.

This repository is a comprehensive collection of resources and code for understanding and implementing deep learning models for audio tasks. It serves as a practical guide, starting from the absolute basics (building neurons and backpropagation from scratch), moving to TensorFlow implementation, and culminating in building a complete Music Genre Classification system using various architectures (MLP, CNN, RNN-LSTM).
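To give a feel for the "from scratch" starting point, here is a minimal sketch of a single artificial neuron in plain Python. It is not the repository's own code: the function names `sigmoid` and `neuron`, the choice of a sigmoid activation, and the example values are assumptions made for illustration.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    net_input = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(net_input)


if __name__ == "__main__":
    # Hypothetical inputs and weights, chosen only for illustration.
    print(neuron([0.5, 0.3, 0.2], [0.4, 0.7, 0.2]))
```

With all-zero inputs and no bias the net input is 0, so the neuron outputs exactly 0.5, which makes a handy sanity check.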

Note on Versioning

This v2 release may differ from the code shown in the course videos. The codebase has been updated for current library versions (e.g. TensorFlow 2.16+, Librosa 0.11+) and has improved dependency management. The original course version is deprecated, but it remains available in the legacy branch for those who want to follow the video content exactly.
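If you are unsure which release your environment matches, a quick check of the installed library versions can help. The sketch below is an assumption-laden illustration, not part of the repository: the helper names are hypothetical, and it assumes the packages are installed under the usual distribution names `tensorflow` and `librosa`.

```python
import re
from importlib import metadata

# Minimum versions quoted in the versioning note. The distribution names
# are assumed to be the usual PyPI names.
MINIMUMS = {"tensorflow": (2, 16), "librosa": (0, 11)}


def parse_major_minor(version):
    """Turn a version string such as '2.16.1' into the tuple (2, 16)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_versions(minimums=MINIMUMS):
    """Return a dict mapping each package name to a short status string."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
            continue
        status = "ok" if parse_major_minor(installed) >= minimum else "older than expected"
        report[name] = f"{installed} ({status})"
    return report


if __name__ == "__main__":
    for package, status in check_versions().items():
        print(f"{package}: {status}")
```

If a package is reported as older than expected, the legacy branch may be the better match for that environment.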

Convolutional Neural Networks (CNN) Explained: Video | Slides
Implementing a CNN for Music Genre Classification: Video | Code
Recurrent Neural Networks (RNN) Explained: Video | Slides
Long Short Term Memory (LSTM) Explained: Video | Slides
Implementing an RNN-LSTM for Music Genre Classification: Video | Code

How to Run the Scripts

To ensure the models and scripts execute correctly, please follow these steps from your terminal:

1. Prepare the Environment (Recommended)

Before running any script, make sure the necessary dependencies are installed:

pip install -r requirements.txt

2. Navigate to the Lesson Folder

Each class is self-contained. Move into the specific directory for the lesson you are studying:

cd class/folder/name  # Replace with the specific class directory

3. Execute the Script

Run the main script using Python:

python mlp.py  # Replace with the specific script name
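Steps 2 and 3 above can also be combined in a small Python helper, which is convenient when running several lessons in a row. This is only a sketch: `run_lesson` is a hypothetical helper that does not exist in the repository, and the folder and script names you pass in are placeholders you must supply.

```python
import subprocess
import sys
from pathlib import Path


def run_lesson(lesson_dir, script_name):
    """Run script_name with lesson_dir as the working directory.

    Returns the exit code of the script (0 normally means success).
    """
    lesson_path = Path(lesson_dir)
    script_path = lesson_path / script_name
    if not script_path.is_file():
        raise FileNotFoundError(f"{script_path} does not exist")
    # cwd mirrors the `cd` step, so any relative paths used inside the
    # script resolve against the lesson folder.
    result = subprocess.run(
        [sys.executable, script_name],
        cwd=lesson_path,
        check=False,
    )
    return result.returncode


if __name__ == "__main__":
    # Usage: python run_lesson.py <lesson folder> <script name>
    if len(sys.argv) != 3:
        sys.exit("usage: python run_lesson.py LESSON_DIR SCRIPT_NAME")
    sys.exit(run_lesson(sys.argv[1], sys.argv[2]))
```

Using `sys.executable` runs the script with the same interpreter (and therefore the same installed dependencies) as the helper itself.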

Folders and files

01 - Course overview/slides
02 - Ai, machine learning and deep learning/slides
03 - Implementing an artificial neuron from scratch
04 - Vector and matrix operations/slides
05 - Computation in neural networks/slides
06 - Implementing a neural network from scratch/code
07 - Bagkpropagation and gradient descent/slides
08 - Training a neural network - Implementing back propagation from scratch/code
09 - How to imlement a simple neural network with TensorFlow/code
10 - Understanding audio data for deep learning/slides
11 - Preprocessing audio data for deep learning/code
12 - Music genre classification - Preparing the dataset/code
13 - Implementing a neural network for music genre classification
14 - Solving overfitting in neural networks
15 - How does a convolutional neural network work/slides
16 - How to implement a CNN for music genre classification/code
17 - Recurrent Neural Networks explained easily/slides
18 - LSTM networks explained easily/slides
19 - How to implement an RNN-LSTM for music genre classification/code
.gitignore
CONTRIBUTING.md
CONTRIBUTORS.md
Instructions_GTZAN.md
LICENSE
README.md
dataset_downloader.py
environment.yml
requirements.txt

musikalkemist/DeepLearningForAudioWithPython
