Japanese-Lab/japanese-audio-transcriber

Japanese Audio Transcription App

A Python desktop application for Japanese audio transcription using Whisper, with an interactive UI for playback and sentence navigation. Perfect for language learners who want to study audio with timestamps.

Features

  • Load any Japanese audio file (.mp3, .wav).
  • Transcribe audio into sentences with timestamps using Whisper.
  • Play and stop audio.
  • Click on sentences to jump to the corresponding point in the audio.
  • Status updates for model loading and transcription progress.
  • Runs safely on macOS when PyTorch and multiprocessing are in use.
  • Sentence-level segmentation of the transcript for easier navigation.
  • Translate transcripts to Vietnamese (optional).


Installation

  1. Clone the repository:
git clone https://github.com/chuongmep/japanese-audio-transcriber.git
cd japanese-audio-transcriber
  2. Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate   # macOS/Linux
venv\Scripts\activate      # Windows
  3. Install dependencies:
pip install -r requirements.txt

requirements.txt should include at least:

PySide6
openai-whisper
torch
pydub
simpleaudio

On PyPI, OpenAI's Whisper is published as openai-whisper; the package named whisper is an unrelated project.
  4. Make sure ffmpeg is installed for audio processing.

macOS:

brew install ffmpeg

Windows:

winget install --id Gyan.FFmpeg -e
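
If you are unsure whether ffmpeg ended up on your PATH, a quick check from Python can confirm it before you launch the app. This is a minimal sketch using only the standard library; the helper name find_ffmpeg is hypothetical and is not part of this repository.

```python
import shutil


def find_ffmpeg():
    """Return the full path to the ffmpeg executable, or None if it is not on PATH."""
    return shutil.which("ffmpeg")


if __name__ == "__main__":
    path = find_ffmpeg()
    if path:
        print(f"ffmpeg found at {path}")
    else:
        print("ffmpeg not found on PATH; install it with the commands above")
```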

Usage

Run the application:

python main.py
  1. Click Load Audio to select a Japanese audio file.
  2. Click Transcribe to generate sentences with timestamps.
  3. Use Play / Stop to listen to the audio.
  4. Click any sentence in the right panel to jump to that point in the audio.
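
The click-to-jump behavior in the steps above can be sketched as a lookup over timestamped segments. The example below assumes segments shaped like the entries Whisper returns in result["segments"] (dicts with start, end, and text, times in seconds). The function names are hypothetical, and the app's actual implementation may differ.

```python
from bisect import bisect_right


def segment_index_at(segments, position_s):
    """Return the index of the segment that contains position_s, or None.

    Assumes segments are sorted by start time and do not overlap.
    """
    starts = [seg["start"] for seg in segments]
    i = bisect_right(starts, position_s) - 1
    if i < 0:
        return None
    return i if position_s < segments[i]["end"] else None


def seek_offset_ms(segments, index):
    """Return the playback offset in milliseconds for a clicked segment."""
    return int(round(segments[index]["start"] * 1000))


# Illustrative data only, not output from a real transcription.
segments = [
    {"start": 0.0, "end": 2.4, "text": "こんにちは。"},
    {"start": 2.4, "end": 5.1, "text": "今日はいい天気ですね。"},
]
```

Clicking the second sentence would seek to seek_offset_ms(segments, 1), while segment_index_at can be used in the other direction to highlight the sentence that matches the current playback position.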

Notes

  • The current version loads Whisper's small model. You can change it to medium or large for higher accuracy, at the cost of slower transcription and higher memory use.
  • The application uses a separate thread for transcription to prevent crashes on macOS with PyTorch.
  • For large audio files, transcription may take several minutes.
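
To illustrate the notes above, here is a minimal sketch of running a transcription off the main thread while reporting status. It uses the standard threading module and takes the transcription function as a parameter; the names transcribe_in_background and whisper_transcribe are hypothetical, and the app itself may use a different mechanism (for example Qt threads and signals). In a Qt UI, the callbacks would also need to be delivered to the GUI thread.

```python
import threading


def transcribe_in_background(transcribe, audio_path, on_status, on_done):
    """Run transcribe(audio_path) in a worker thread and report progress.

    on_status receives short status strings; on_done receives the segments.
    Both callbacks are invoked from the worker thread.
    """
    def worker():
        on_status("Transcribing...")
        try:
            segments = transcribe(audio_path)
        except Exception as exc:
            on_status(f"Transcription failed: {exc}")
            return
        on_status("Done")
        on_done(segments)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def whisper_transcribe(audio_path, model_name="small"):
    """Transcribe Japanese audio with Whisper; requires openai-whisper and ffmpeg."""
    import whisper  # imported lazily so the rest of the sketch runs without it

    model = whisper.load_model(model_name)  # e.g. "small", "medium", "large"
    result = model.transcribe(audio_path, language="ja")
    return result["segments"]
```

A call such as transcribe_in_background(whisper_transcribe, "lesson.mp3", print, print) would keep the caller responsive while the model runs; the file name here is only a placeholder.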

Contributing

Contributions are welcome! You can improve:

  • Word-level clickable transcription.
  • Support for other languages.
  • Export transcripts to CSV or SRT format.
  • UI enhancements.
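
As a possible starting point for the SRT export idea listed above, the sketch below converts timestamped segments into SRT text. It assumes segments shaped like Whisper's result["segments"] entries (start and end in seconds, plus text); the function names are hypothetical and nothing like this exists in the repository yet.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text: a numbered cue per segment, cues separated by blank lines."""
    blocks = []
    for number, seg in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    return "\n".join(blocks)
```

Writing the returned string to a file with UTF-8 encoding keeps the Japanese text intact for most subtitle players.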

License

MIT License – see LICENSE file.

Issues

  • If Whisper raises TypeError: argument of type 'NoneType' is not iterable, reinstall Whisper from its GitHub repository:
pip install git+https://github.com/openai/whisper.git
