NOTE: This repo is still a work in progress. Planned: exo support, a CPU-only mode, additional Indic languages, and more.
Description: Convert PDF files into podcasts. This project converts text from PDF files into audio podcasts using BART and PyMuPDF, with the following pipeline:
├── Summarization -> facebook/bart-large-cnn
├── Translation -> Krutrim / Helsinki
├── Transcript -> gemma2b / llama2-3b / llamafile
└── Audio -> Silero TTS
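
The first stages of the pipeline above (PDF text extraction, then summarization) could be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper names (`extract_pdf_text`, `clean_text`, `chunk_text`, `summarize`) and the 400-word chunk size are assumptions, and `facebook/bart-large-cnn` is assumed to be the Hugging Face id behind "Fb/bart-large-cnn". Chunking is included because BART accepts only a limited input length, so long documents have to be summarized piece by piece.

```python
import re


def extract_pdf_text(pdf_path: str) -> str:
    """Return the plain text of every page, separated by blank lines."""
    import fitz  # PyMuPDF, imported lazily so the pure helpers work without it

    with fitz.open(pdf_path) as doc:
        return "\n\n".join(page.get_text() for page in doc)


def clean_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks and collapse whitespace."""
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    return re.sub(r"\s+", " ", text).strip()


def chunk_text(text: str, max_words: int = 400) -> list[str]:
    """Group sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept whole in its own chunk.
    """
    chunks, current, count = [], [], 0
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not sentence:
            continue
        n = len(sentence.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def summarize(text: str, max_words: int = 400) -> str:
    """Summarize each chunk with BART and join the partial summaries."""
    from transformers import pipeline  # lazy import: downloads the model on first use

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    parts = [
        summarizer(chunk, truncation=True)[0]["summary_text"]
        for chunk in chunk_text(text, max_words)
    ]
    return " ".join(parts)
```

A typical call would be `summarize(clean_text(extract_pdf_text("paper.pdf")))`, where `paper.pdf` is a placeholder path. The translation and transcript stages are left out because their interfaces are not described here.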
- PDF Text Extraction: Extract text from PDFs using PyMuPDF.
- Text-to-Speech: Convert text to audio using Silero TTS.
- Script Conversion: Translate the script into the target language using Krutrim.
- Lightweight: Works on both CPU and GPU.
- Customizable: Supports multiple languages and voices.
- Natural-sounding voice: Silero TTS generates the audio; it is lightweight and runs relatively fast on CPU.
- SSML: Uses SSML markup to improve the quality of the output audio.
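
As an illustration of the Silero TTS and SSML features listed above, the sketch below wraps script paragraphs in SSML and synthesizes them on CPU. It is a hedged example rather than the project's implementation: `build_ssml` and `synthesize` are hypothetical helpers, and the English model (`v3_en`), the voice (`en_0`), the 500 ms pause, and the output filename are all assumptions chosen for the example. How the engine treats escaped characters such as `&amp;` has not been verified here.

```python
import wave
from xml.sax.saxutils import escape


def build_ssml(paragraphs: list[str], pause_ms: int = 500) -> str:
    """Wrap each non-empty paragraph in <p> and add a pause after it."""
    body = "".join(
        f'<p>{escape(p.strip())}</p><break time="{pause_ms}ms"/>'
        for p in paragraphs
        if p.strip()
    )
    return f"<speak>{body}</speak>"


def synthesize(ssml: str, out_path: str = "podcast.wav", sample_rate: int = 48000) -> str:
    """Render SSML to a 16-bit mono WAV file with Silero TTS on CPU."""
    import torch  # lazy import so build_ssml can be used without PyTorch

    # Assumed model and voice; swap in the language/speaker you need.
    model, _ = torch.hub.load(
        repo_or_dir="snakers4/silero-models",
        model="silero_tts",
        language="en",
        speaker="v3_en",
    )
    model.to(torch.device("cpu"))
    audio = model.apply_tts(ssml_text=ssml, speaker="en_0", sample_rate=sample_rate)

    # apply_tts returns a float tensor in [-1, 1]; convert it to 16-bit PCM.
    pcm = (audio.clamp(-1, 1) * 32767).to(torch.int16).numpy().tobytes()
    with wave.open(out_path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return out_path
```

For example, `synthesize(build_ssml(["Welcome to the show.", "Today we cover chapter one."]))` would write `podcast.wav` in the current directory. Escaping the text keeps the markup well-formed when the script contains characters such as `&` or `<`.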
- Clone the repository
- Fork the repository
- Create a new branch (git checkout -b feature/NewFeature)
- Commit your changes (git commit -m 'Add some NewFeature')
- Push to the branch (git push origin feature/NewFeature)
- Open a pull request
- Silero TTS: snakers4/silero-models for text-to-speech
- PyMuPDF: PyMuPDF for PDF text extraction
- Krutrim: Krutrim for translation