Netherquark/oratio

Oratio

NOTE: This repository is still a work in progress. Planned: exo support, CPU-only operation, and support for other Indic languages.

Description : Oratio converts text from PDF files into audio podcasts using BERT and PyMuPDF.

📁 Architecture :

├── PDF -> My Skills
├── Chunking -> My Skills
├── Summarization -> facebook/bart-large-cnn
├── Translation -> Krutrim / Helsinki
├── Transcript -> gemma2b / llama2-3b / llamafile
└── Audio -> Silero TTS
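The first two stages (PDF and Chunking) can be sketched as below. This is a minimal illustration and not the repository's actual code: the function names `extract_text` and `chunk_text`, the sentence-splitting regex, and the 400-word limit are all assumptions made for the example. Only `fitz.open` and `page.get_text()` are PyMuPDF calls.

```python
import re


def extract_text(pdf_path: str) -> str:
    """Concatenate the plain text of every page using PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so chunk_text works without it

    with fitz.open(pdf_path) as doc:
        return "\n".join(page.get_text() for page in doc)


def chunk_text(text: str, max_words: int = 400) -> list[str]:
    """Group whole sentences into chunks of at most max_words words.

    Hypothetical helper: a single sentence longer than max_words is
    kept intact as its own chunk rather than being split.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Splitting on sentence boundaries rather than at fixed character offsets keeps each chunk readable on its own, which matters when every chunk is summarized separately.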

⚡Features :

  • PDF Text Extraction: Extracts text from PDFs using PyMuPDF.
  • Text-to-Speech: Converts text to audio using Silero TTS.
  • Translation: Translates the script using Krutrim.
  • Lightweight: Works on both CPU and GPU.
  • Customizable: Supports multiple languages and voices.
  • Natural-sounding voice: Silero TTS is lightweight and runs relatively fast on CPU.
  • SSML: Uses SSML markup to improve the quality of the output audio.
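As an illustration of the SSML feature, the sketch below wraps transcript sentences in SSML with a pause after each one. It is an assumption about how such markup might be built, not the project's implementation: the helper name `to_ssml` and the 400 ms default pause are hypothetical.

```python
from xml.sax.saxutils import escape


def to_ssml(sentences: list[str], pause_ms: int = 400) -> str:
    """Wrap sentences in <s> tags and add a <break> after each one.

    Hypothetical helper: text is XML-escaped so characters such as '&'
    do not produce invalid markup.
    """
    parts = ["<speak>"]
    for sentence in sentences:
        parts.append(f"<s>{escape(sentence)}</s>")
        parts.append(f'<break time="{pause_ms}ms"/>')
    parts.append("</speak>")
    return "".join(parts)


if __name__ == "__main__":
    print(to_ssml(["Welcome to the show.", "Today: transformers & attention."]))
```

Silero's documentation describes passing a string like this to the model through an `ssml_text` argument in place of plain text. Check the exact call and the supported tags against the model version in use.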

🔨 Installation :

  • Clone the repository

⌨️ Contributing :

  1. Fork the repository
  2. Create a new branch (git checkout -b feature/NewFeature)
  3. Commit your changes (git commit -m 'Add some NewFeature')
  4. Push to the branch (git push origin feature/NewFeature)
  5. Open a pull request

🤝 Acknowledgments :

  • Silero TTS: snakers4/silero-models for text-to-speech
  • PyMuPDF: PyMuPDF for PDF text extraction
  • Krutrim: Krutrim for translation

About

Research paper summarisation & podcast generation (EN>HIN)
