🤖🎬 YouTube Audio-to-Text Transcription 🎧📝

A Python script that downloads the audio from a YouTube video, transcribes it into text, detects the language of the transcription, and saves the result to a text file, so you do not have to transcribe videos by hand.

Description

This script transcribes YouTube videos into text. It prompts for a YouTube video URL, downloads the video's audio, converts the audio into text, and saves the transcription as a text file, replacing manual transcription with a single automated run. It is intended for anyone who needs transcriptions for purposes such as research, content creation, or accessibility.

Key Features

User-friendly: The script prompts for a YouTube video URL; no other input is needed.
Efficient Audio Extraction: The pytubefix library selects and downloads the audio stream of the specified YouTube video.
High-Quality Transcription: The whisper speech-to-text library transcribes the downloaded audio into text.
Convenient Output: The transcription is saved as a text file in the same directory as the script, where it is easy to open and share.

Prerequisites

Python 3.8+ (whisper does not support older versions)
pip to install required libraries

Required Libraries

pytubefix: A lightweight Python library for downloading YouTube videos and extracting their audio streams.
whisper: OpenAI's speech-to-text library, used here to transcribe the downloaded audio files.
langdetect: A language detection library ported from Google's language-detection.

Installation

Clone this repository or download the script.

Install the required libraries:

pip install pytubefix

pip install git+https://github.com/openai/whisper.git

pip install langdetect

Download FFmpeg and add it to your PATH environment variable:

Windows: https://phoenixnap.com/kb/ffmpeg-windows
Mac: https://phoenixnap.com/kb/ffmpeg-mac
Ubuntu: https://phoenixnap.com/kb/install-ffmpeg-ubuntu
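
To confirm the setup before running the script, a small check along the following lines may help. This is a minimal sketch and not part of the script itself: the helper names missing_modules, ffmpeg_on_path, and report are hypothetical, and it assumes the three libraries are imported under the names pytubefix, whisper, and langdetect.

```python
import importlib.util
import shutil

# Import names assumed for the three required libraries.
REQUIRED_MODULES = ("pytubefix", "whisper", "langdetect")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def ffmpeg_on_path(executable="ffmpeg"):
    """Return True if the FFmpeg executable is reachable through PATH."""
    return shutil.which(executable) is not None


def report():
    """Print a short summary of what is still missing, if anything."""
    for name in missing_modules():
        print(f"Missing Python library: {name}")
    if not ffmpeg_on_path():
        print("FFmpeg was not found on PATH")
```

Calling report() prints nothing when all three libraries and FFmpeg are found.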

Usage

Run the script:
```
python youtube_audio_to_text.py
```

When prompted, enter the YouTube video URL you wish to transcribe:

Enter the YouTube video URL: https://www.youtube.com/watch?v=XXXXXXXXXXX

The script downloads the audio, transcribes it, detects the language, and saves the transcription to a text file named output_{language}.txt, where {language} is the detected language code (for example, output_en.txt).
Open output_{language}.txt, located in the same directory as the script, to read the transcription.
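
The output files can also be located programmatically. The sketch below assumes the output_{language}.txt naming described above and UTF-8 text; the helper names find_transcriptions and read_transcription are hypothetical and do not come from the script.

```python
from pathlib import Path


def find_transcriptions(directory="."):
    """Map language codes to the output_{language}.txt files in a directory."""
    found = {}
    for path in sorted(Path(directory).glob("output_*.txt")):
        # "output_en.txt" -> stem "output_en" -> language code "en"
        language = path.stem[len("output_"):]
        found[language] = path
    return found


def read_transcription(language, directory="."):
    """Return the text of the transcription for one language code."""
    path = Path(directory) / f"output_{language}.txt"
    # UTF-8 is an assumption; adjust if the script writes another encoding.
    return path.read_text(encoding="utf-8")
```

Run from the script's directory, find_transcriptions() lists every transcription produced so far, keyed by language code.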

Workflow

The user inputs a YouTube video URL when prompted.
The pytubefix library is used to create a YouTube object and filter the audio stream.
The audio stream is downloaded as an MP3 file and saved in the YoutubeAudios folder.
The whisper library loads a base model and transcribes the downloaded audio into text.
The langdetect library detects the language of the transcribed text.
The transcription is saved to a text file named output_{language}.txt, where {language} is the detected language code, and the file is opened for the user to view.
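
The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual contents of youtube_audio_to_text.py: the function names and the audio.mp3 file name are placeholders, the first audio-only stream is taken without further selection, and the fixed langdetect seed is an addition that makes detection repeatable. The third-party imports sit inside the functions so that the file can be loaded before the libraries are installed.

```python
from pathlib import Path


def download_audio(url, folder="YoutubeAudios", filename="audio.mp3"):
    """Download the first audio-only stream of a video and return its path."""
    from pytubefix import YouTube  # third-party, imported lazily

    stream = YouTube(url).streams.filter(only_audio=True).first()
    return stream.download(output_path=folder, filename=filename)


def transcribe(audio_path, model_name="base"):
    """Transcribe an audio file with a whisper model and return the text."""
    import whisper  # third-party; relies on FFmpeg being on PATH

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"]


def detect_language(text):
    """Return a language code such as "en" for the given text."""
    from langdetect import DetectorFactory, detect  # third-party

    DetectorFactory.seed = 0  # without a seed, results can vary between runs
    return detect(text)


def save_transcription(text, language, directory="."):
    """Write the text to output_{language}.txt and return the file path."""
    path = Path(directory) / f"output_{language}.txt"
    path.write_text(text, encoding="utf-8")
    return path


def run(url):
    """Chain the steps: download, transcribe, detect language, save."""
    audio_path = download_audio(url)
    text = transcribe(audio_path)
    return save_transcription(text, detect_language(text))
```

Calling run(input("Enter the YouTube video URL: ")) would mirror the prompt shown under Usage. Opening the saved file for viewing, the final step in the list, is left out of this sketch.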

Contributing 🤝🌱

Contributions from users are highly valued and appreciated. There are two main ways to contribute to this project: through pull requests and issues.

Pull Requests

Fork the repository and create a branch from the main branch.
Make changes or additions to the code.
Commit the changes, and push them to the branch.
Open a pull request to the main branch with a clear and concise description of the changes.

Issues

Navigate to the Issues section of the repository.
Check if there is an existing issue similar to the one you'd like to create.
If there isn't an existing issue, create a new issue by clicking the "New issue" button.
Provide a descriptive title and detailed information about the changes you would like to see in the script.

🎓🌟 Feel free to contribute, share, and spread the love 💖💬🌍
