An open source video transcreation platform for Indic languages using ML models
Chitralekha is an open source platform for video transcreation across various Indic languages, backed by ML models: ASR for transcription, NMT for translation, and TTS for voice-over.
Chitralekha supports multiple input sources (e.g., YouTube, local files), transcription generation methods (e.g., models, source captions, custom subtitle files, manual creation), translation generation methods (models, manual creation), and voice-over generation methods (models, manual creation). Currently, Chitralekha supports voice-over only for single-speaker videos; support for multi-speaker videos is under development.
Numerous informative videos are available online today, but most of them are available in only a few languages. Creating subtitles and voice-overs for them across various Indic languages would make this content far more useful. With millions of hours of video content, creating multilingual subtitles manually becomes increasingly difficult. This is where Chitralekha comes to the rescue.
Existing state-of-the-art ASR, translation, and TTS models can power the Chitralekha tool, providing a platform for transcriptionists and translators to create multilingual subtitles at scale with high accuracy.
- Support all possible video sources and languages
- Build a reliable & scalable platform beneath Chitralekha
- Keep the UI simple and intuitive
Chitralekha supports importing videos and optional subtitles from YouTube. It also enables export of the subtitles in standard formats which can be used to update videos on YouTube.
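As an illustration of what exporting subtitles in a standard format involves, the following is a minimal sketch that serializes timestamped subtitle cards to SRT (SubRip), one common subtitle format. It is not taken from the Chitralekha codebase: the `Cue` class and the `format_timestamp` and `to_srt` functions are hypothetical names chosen for this example, and the formats Chitralekha actually exports may differ.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle card (hypothetical shape): start/end in seconds plus text."""
    start: float
    end: float
    text: str


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Serialize cues as SRT blocks: index, time range, text, blank line between."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{format_timestamp(cue.start)} --> {format_timestamp(cue.end)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}")
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    print(to_srt([Cue(0.0, 2.5, "Hello"), Cue(2.5, 4.0, "World")]))
```

Timestamps are converted to whole milliseconds first so that the hour, minute, and second fields are derived from a single integer, which avoids floating-point rounding drift between fields.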
Chitralekha supports translating the transcription into English and the 12 Indian languages supported by the IndicTrans model. Eventually, this is intended to become a plug-and-play feature.
Chitralekha supports transcribing the input video with IndicASR for English and 9 Indian languages. This automatically creates timestamped transcription cards, which can then be edited. Eventually, this is intended to become a plug-and-play feature.
Chitralekha supports editing transcriptions in both the source and target languages by typing in Roman characters, with transliteration support from IndicXlit.
Chitralekha supports voice-over generation for the translated subtitles of the input video with IndicTTS for Indian languages. This automatically creates timestamped voice-over audio files, which can be changed by editing the subtitle text at the corresponding timestamp.
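One way to picture the link between subtitle edits and voice-over audio is to compare the cards before and after an edit and pick out only those whose text changed, since only those timestamps would need new audio. The sketch below is an assumption about how this could be modeled, not a description of Chitralekha's implementation; `Card` and `cards_needing_resynthesis` are hypothetical names, and no TTS call is made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Card:
    """A timestamped subtitle card (hypothetical shape): times in seconds."""
    start: float
    end: float
    text: str


def cards_needing_resynthesis(before: list[Card], after: list[Card]) -> list[Card]:
    """Return cards in `after` that are new or whose text differs from the
    card with the same (start, end) timestamps in `before`."""
    previous_text = {(card.start, card.end): card.text for card in before}
    return [
        card
        for card in after
        if previous_text.get((card.start, card.end)) != card.text
    ]


if __name__ == "__main__":
    original = [Card(0.0, 2.0, "namaste"), Card(2.0, 4.0, "duniya")]
    edited = [Card(0.0, 2.0, "namaste"), Card(2.0, 4.0, "sansaar")]
    # Only the second card changed, so only its audio would be regenerated.
    print(cards_needing_resynthesis(original, edited))
```

Keying on the `(start, end)` pair keeps unchanged cards untouched; a card whose timestamps were moved is treated as new in this sketch.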
git clone --recurse-submodules https://github.com/AI4Bharat/Chitralekha
For information, help, or discussion, use the following link:
https://github.com/AI4Bharat/Chitralekha/discussions
This project adheres to the Contributor Covenant code of conduct. By participating, you are expected to uphold this code. Please report unacceptable behavior to [email protected].