
Code and slides for the "Deep Learning (For Audio) With Python" course on TheSoundOfAI Youtube channel.


musikalkemist/DeepLearningForAudioWithPython

Deep Learning For Audio With Python

Code for the "Deep Learning (for Audio) with Python" series on The Sound of AI YouTube channel.

This repository is a comprehensive collection of resources and code for understanding and implementing deep learning models for audio tasks. It serves as a practical guide, starting from the absolute basics (building neurons and backpropagation from scratch), moving to TensorFlow implementation, and culminating in building a complete Music Genre Classification system using various architectures (MLP, CNN, RNN-LSTM).

Stack: Python 3.11, librosa, TensorFlow, Keras, Scikit-Learn, NumPy, Matplotlib.

Note on Versioning

This v2 release is fully functional, but it may differ from the code shown in the course videos. The codebase has been updated for current environments and modern best practices (e.g. TensorFlow 2.16+, Librosa 0.11+), with improved dependency management. The original course version is deprecated but remains available in the legacy branch for those who want to follow the video content exactly.


Dataset Setup (GTZAN)

To run the music genre classification lessons (Parts 4 and 5), you need the GTZAN dataset. An automated downloader handles the download, extraction, and folder organization for you.

  • Prerequisites: Install the dependencies listed in requirements.txt (pip install -r requirements.txt).
  • Quick Start: Run python dataset_downloader.py from the repository root.

Full Instructions: See the Instructions GTZAN file for detailed help with the downloader script, or for manual download steps.
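After the download, a quick sanity check on the folder layout can catch an incomplete extraction. The sketch below is not part of the repository. It assumes one sub-folder per genre containing .wav files, and the path data/genres_original is a hypothetical placeholder; adjust both to match what the downloader actually produces.

```python
from pathlib import Path

# The ten GTZAN genre labels (assumed here to be one sub-folder per genre).
GENRES = ["blues", "classical", "country", "disco", "hiphop",
          "jazz", "metal", "pop", "reggae", "rock"]


def count_tracks(dataset_dir):
    """Return {genre: number of .wav files} for each expected genre folder."""
    root = Path(dataset_dir)
    counts = {}
    for genre in GENRES:
        genre_dir = root / genre
        # A missing folder is reported as 0 tracks instead of raising an error.
        if genre_dir.is_dir():
            counts[genre] = len(list(genre_dir.glob("*.wav")))
        else:
            counts[genre] = 0
    return counts


if __name__ == "__main__":
    # Hypothetical location: point this at the folder created by the downloader.
    for genre, n_tracks in count_tracks("data/genres_original").items():
        print(f"{genre:10s} {n_tracks}")
```

A genre reported with 0 tracks usually means the folder is missing or the files use a different extension.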


Course Structure

Part 1: Fundamentals & Math

  1. Course Overview: Video | Slides
  2. AI, Machine Learning and Deep Learning: Video | Slides
  3. Implementing an Artificial Neuron from Scratch: Video | Slides | Code
  4. Vector and Matrix Operations: Video | Slides
  5. Computation in Neural Networks: Video | Slides
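As a rough illustration of lesson 3, an artificial neuron computes a weighted sum of its inputs and passes the result through an activation function. The sketch below uses a sigmoid activation and only the standard library. The function names and the example values are illustrative assumptions, not necessarily those used in the course code.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def activate(inputs, weights):
    """Output of a single neuron: sigmoid of the weighted sum of the inputs."""
    # Net input: sum of each input multiplied by its weight.
    net_input = sum(x * w for x, w in zip(inputs, weights))
    return sigmoid(net_input)


if __name__ == "__main__":
    # Illustrative values only.
    inputs = [0.5, 0.3, 0.2]
    weights = [0.4, 0.7, 0.2]
    print(activate(inputs, weights))
```

No bias term is included here, to keep the sketch minimal; adding one means adding a constant to the net input before the activation.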

Part 2: Neural Networks from Scratch

  1. Implementing a Neural Network from Scratch: Video | Code
  2. Training a Neural Network (Backprop & Gradient Descent): Video | Slides
  3. Implementing Backpropagation from Scratch: Video | Code
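The core idea behind lessons 2 and 3 can be sketched on the smallest possible case: a single sigmoid neuron trained with gradient descent on a squared-error loss. This is a simplified assumption for illustration, not the multi-layer implementation from the course; the names and values are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(weights, inputs, target, learning_rate):
    """One gradient-descent update for a single sigmoid neuron.

    Loss:        E = 0.5 * (output - target) ** 2
    Chain rule:  dE/dw_i = (output - target) * output * (1 - output) * x_i
    """
    # Forward pass: weighted sum followed by the activation.
    net_input = sum(w * x for w, x in zip(weights, inputs))
    output = sigmoid(net_input)
    loss = 0.5 * (output - target) ** 2

    # Backward pass: error multiplied by the derivative of the sigmoid.
    delta = (output - target) * output * (1.0 - output)

    # Update: move each weight against its gradient.
    new_weights = [w - learning_rate * delta * x for w, x in zip(weights, inputs)]
    return new_weights, loss


if __name__ == "__main__":
    weights = [0.1, -0.2]
    inputs, target = [1.0, 0.5], 1.0
    for epoch in range(5):
        weights, loss = train_step(weights, inputs, target, learning_rate=0.5)
        print(f"epoch {epoch}: loss={loss:.4f}")
```

In a multi-layer network the same delta is propagated backwards through each layer's weights, which is where the name backpropagation comes from.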

Part 3: TensorFlow & Audio Preprocessing

  1. Implementing a Neural Network with TensorFlow 2: Video | Code
  2. Understanding Audio Data for Deep Learning: Video | Slides
  3. Preprocessing Audio Data (MFCCs/Spectrograms): Video | Code

Part 4: Music Genre Classification Project (MLP)

  1. Preparing the Dataset: Video | Code
  2. Implementing a Neural Network for Classification: Video | Slides | Code
  3. Solving Overfitting: Video | Slides | Code
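When preparing the dataset, a common approach is to split each 30-second GTZAN track into equal segments so that every track yields several training examples of identical shape. The arithmetic can be sketched as follows. The number of segments is an illustrative assumption, and the sample rate and hop length are librosa's defaults (22050 Hz and 512 samples); the values used in the course code may differ.

```python
import math

SAMPLE_RATE = 22050     # samples per second (librosa's default loading rate)
TRACK_DURATION = 30     # seconds; GTZAN clips are about 30 s long
HOP_LENGTH = 512        # samples between consecutive analysis frames


def segment_bounds(num_segments, sample_rate=SAMPLE_RATE, duration=TRACK_DURATION):
    """Return (start, end) sample indices for each equal-length segment."""
    samples_per_segment = (sample_rate * duration) // num_segments
    return [(i * samples_per_segment, (i + 1) * samples_per_segment)
            for i in range(num_segments)]


def expected_frames(samples_per_segment, hop_length=HOP_LENGTH):
    """Approximate number of feature frames (e.g. MFCC vectors) per segment."""
    return math.ceil(samples_per_segment / hop_length)


if __name__ == "__main__":
    bounds = segment_bounds(num_segments=10)   # assumed value
    start, end = bounds[0]
    print("samples per segment:", end - start)
    print("frames per segment:", expected_frames(end - start))
```

Knowing the expected frame count in advance makes it possible to discard segments that come out shorter, so that all examples fed to the network have the same shape.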

Part 5: Advanced Architectures (CNN & RNN-LSTM)

  1. Convolutional Neural Networks (CNN) Explained: Video | Slides
  2. Implementing a CNN for Music Genre Classification: Video | Code
  3. Recurrent Neural Networks (RNN) Explained: Video | Slides
  4. Long Short Term Memory (LSTM) Explained: Video | Slides
  5. Implementing an RNN-LSTM for Music Genre Classification: Video | Code

How to Run the Scripts

To ensure the models and scripts execute correctly, please follow these steps from your terminal:

1. Prepare the Environment (Recommended)

Before running any script, install the required dependencies:

pip install -r requirements.txt

2. Navigate to the Lesson Folder

Each lesson (class) is self-contained. Move into the directory for the lesson you are studying:

cd class/folder/name  # Replace with the specific class directory

3. Execute the Script

Run the main script using Python:

python mlp.py  # Replace with the specific script name
