Subtitle to Speech

Convert subtitle files into synchronized German-language audio using OpenAI's Text-to-Speech (TTS) API.

Features

Accessibility-first: Designed to make videos more accessible. The tool is particularly useful for adding spoken audio to sign language content, as in this video.
Subtitle support: Parses both .srt and .sbv subtitle formats.
High-quality speech: Uses OpenAI’s TTS API with support for multiple expressive voices that perform well in German.
Timing preservation: Retains original subtitle timings and inserts silent breaks where necessary.
Customizable silence padding: Adds silent breaks after subtitle segments for improved flow.
Volume normalization: Optionally normalizes loudness across segments for consistency.
Flexible output formats: Exports to various formats (e.g., WAV, MP3) using pydub.
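
To illustrate the timing preservation feature, here is a minimal sketch of how .srt cues could be parsed and how the silent gaps between them could be computed. It is not the code from this repository: the names `Cue`, `parse_srt`, and `silence_plan` are hypothetical, and the sketch assumes that each spoken segment fits exactly into its cue window, which real TTS output will not always do.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start_ms: int
    end_ms: int
    text: str


def to_ms(h, m, s, ms):
    """Convert hour/minute/second/millisecond strings to milliseconds."""
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(content):
    """Parse SRT text into a list of Cue objects (hypothetical helper)."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                cues.append(Cue(to_ms(*g[:4]), to_ms(*g[4:]), text))
                break
    return cues


def silence_plan(cues):
    """Return the silence in ms to insert before each cue.

    Assumes each spoken segment ends exactly at its cue's end time.
    """
    gaps = []
    cursor = 0  # audio position in ms, starting at timecode 00:00:00
    for cue in cues:
        gaps.append(max(0, cue.start_ms - cursor))
        cursor = cue.end_ms
    return gaps


SAMPLE = """1
00:00:01,000 --> 00:00:03,000
Guten Tag.

2
00:00:04,500 --> 00:00:06,000
Willkommen
bei uns.
"""

cues = parse_srt(SAMPLE)
gaps = silence_plan(cues)
print(gaps)  # [1000, 1500]
```

In this sample, one second of silence precedes the first cue and 1.5 seconds separate the first cue from the second.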

Prerequisites

OpenAI API Key
ffmpeg: Required by pydub to process audio files.

Install ffmpeg:
- macOS (Homebrew): brew install ffmpeg
- Ubuntu/Debian: sudo apt update && sudo apt install ffmpeg

Installation

Clone the repository and set up the virtual environment using uv:

git clone https://github.com/machinelearningZH/subtitle-to-speech.git
cd subtitle-to-speech

pip3 install uv
uv venv
source .venv/bin/activate
uv sync

Configuration

Set your OpenAI API key as an environment variable:

export OPENAI_API_KEY='your_api_key_here'

You can customize the voice style by modifying the prompt in utils.py.
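
If you want to fail early when the key is missing, a check along the following lines could run before the app makes any request. This is a sketch, not code from this repository: `get_api_key` is a hypothetical helper. The official OpenAI Python client reads `OPENAI_API_KEY` from the environment by default, so such a check is optional.

```python
import os


def get_api_key(env=os.environ):
    """Return the OpenAI API key or raise a clear error (hypothetical helper)."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it before starting the app."
        )
    return key
```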

Running the App

To start the Streamlit app:

streamlit run subtitle-to-speech.py

Access the app at http://127.0.0.1:8501

Important

The app generates audio starting from timecode 00:00:00 (not, for example, 01:00:00). Make sure that your subtitle file and video both start at this timecode. Otherwise, you will have to adjust the code.
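
If your subtitles start at a different timecode, one possible adjustment is to subtract that starting timecode from every cue before generating audio. The sketch below assumes cues are available as `(start_ms, end_ms, text)` tuples. That representation and the `rebase` function are hypothetical and not taken from this repository.

```python
HOUR_MS = 60 * 60 * 1000  # 01:00:00 expressed in milliseconds


def rebase(cues, origin_ms):
    """Shift cues so that the timecode origin_ms becomes 00:00:00."""
    rebased = []
    for start, end, text in cues:
        if start < origin_ms:
            raise ValueError(f"Cue starts before the origin: {text!r}")
        rebased.append((start - origin_ms, end - origin_ms, text))
    return rebased


# Subtitles exported with a starting timecode of 01:00:00.
cues = [
    (3_601_000, 3_603_000, "Guten Tag."),
    (3_604_500, 3_606_000, "Willkommen."),
]
rebased = rebase(cues, HOUR_MS)
print(rebased)  # [(1000, 3000, 'Guten Tag.'), (4500, 6000, 'Willkommen.')]
```

Subtracting the video's starting timecode, rather than the start of the first cue, keeps any silence before the first subtitle intact.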

Note

Set the maximum number of parallel threads according to your usage tier on OpenAI's developer platform. Higher tiers have higher rate limits, which allows faster processing. Too many parallel calls can still exceed your rate limits, so adjust the value accordingly.
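
The effect of this setting can be pictured with a bounded thread pool, as in the sketch below. It is an illustration under assumptions, not the app's actual implementation: `synthesize` is a stand-in for the real TTS request, and `MAX_PARALLEL_THREADS` with its value of 4 is a hypothetical name and default.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_THREADS = 4  # assumed value; tune it to your rate limits


def synthesize(text):
    """Placeholder for a TTS request; returns fake audio bytes."""
    return text.encode("utf-8")


def synthesize_all(texts, max_workers=MAX_PARALLEL_THREADS):
    """Process segments with at most max_workers concurrent calls."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() returns results in input order, so segments stay in sequence.
        return list(pool.map(synthesize, texts))


segments = synthesize_all(["Guten Tag.", "Willkommen."], max_workers=2)
print(len(segments))  # 2
```

A lower `max_workers` value sends fewer requests at once, which trades speed for a smaller chance of hitting rate limits.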

Project Team

Simone Luchetta — Staatskanzlei Zürich: Team Informationszugang & Dialog
Chantal Amrhein, Patrick Arnecke — Statistisches Amt Zürich: Team Data

Feedback and Contributing

We welcome feedback and contributions. Email us or open an issue or pull request.

We use Ruff for linting and code formatting.

Install pre-commit hooks for automatic checks before opening a pull request:

pre-commit install

License

This project is licensed under the MIT License. See LICENSE for details.

Disclaimer

This software (the Software) has been developed according to and with the intent to be used under Swiss law. Please be aware that the EU Artificial Intelligence Act (EU AI Act) may, under certain circumstances, be applicable to your use of the Software. You are solely responsible for ensuring that your use of the Software complies with all applicable local, national and international laws and regulations. By using this Software, you acknowledge and agree (a) that it is your responsibility to assess which laws and regulations, in particular regarding the use of AI technologies, are applicable to your intended use and to comply therewith, and (b) that you will hold us harmless from any action, claims, liability or loss in respect of your use of the Software.
