YOLO-Object-Detection-Voice

📸 Voice-Enabled Smart Vision Backend

This backend is designed to enhance accessibility for visually impaired users. Built with FastAPI and integrated with a modular camera pipeline, it captures live frames, detects objects with a deep learning model, and narrates the results aloud using pyttsx3.

🔧 Features

  • Real-Time Object Detection: seamless camera-to-backend pipeline optimized for mobile and desktop environments.
  • Voice Narration Engine: converts object labels into spoken feedback using a threaded TTS system (a sketch follows this list).
  • Modular Architecture: clean separation of the capture, analysis, and narration layers for rapid iteration and scalability.
  • Accessibility-First Design: prioritizes low-latency feedback and compatibility with assistive technologies.
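
As a minimal sketch, here is one way the threaded TTS layer could be built around pyttsx3, using a queue-backed daemon worker; the `VoiceNarrator` class and its method names are illustrative assumptions, not code taken from this repository:

```python
# Hypothetical sketch of a threaded narration layer: a daemon worker drains a
# queue of detected labels and speaks them with pyttsx3, so the detection loop
# is never blocked while audio is playing.
import queue
import threading

import pyttsx3


class VoiceNarrator:
    def __init__(self):
        self._labels = queue.Queue()
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def narrate(self, label: str) -> None:
        """Queue a label for narration without blocking the caller."""
        self._labels.put(label)

    def _run(self) -> None:
        engine = pyttsx3.init()  # keep the TTS engine on the worker thread
        while True:
            label = self._labels.get()
            engine.say(f"I see a {label}")
            engine.runAndWait()  # blocks only this worker thread
```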

🧠 Tech Stack

| Layer            | Tools & Frameworks                        |
|------------------|-------------------------------------------|
| Backend          | FastAPI, Python                           |
| TTS Engine       | pyttsx3                                   |
| Object Detection | OpenCV, TensorFlow/PyTorch (customizable) |
| Deployment       | Docker, Uvicorn, Gunicorn                 |
| Monitoring       | PostHog / Mixpanel (optional)             |
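
For illustration, the detection layer could pair OpenCV frame capture with a YOLO model. The sketch below assumes the Ultralytics PyTorch implementation and uses placeholder values for the weights file, camera index, and confidence threshold; none of these are taken from this repository:

```python
# Illustrative capture-and-detect step using OpenCV plus an Ultralytics YOLO
# model (one possible PyTorch backend; the stack above is customizable).
import cv2
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # placeholder weights; any YOLO checkpoint works


def detect_labels(camera_index: int = 0, conf: float = 0.5) -> list:
    """Grab a single frame from the camera and return detected class names."""
    cap = cv2.VideoCapture(camera_index)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        return []  # camera unavailable or frame grab failed
    results = model(frame, conf=conf, verbose=False)
    return [model.names[int(box.cls)] for box in results[0].boxes]
```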

🚀 Getting Started

    git clone https://github.com/your-username/vss-backend.git
    cd vss-backend
    pip install -r requirements.txt
    uvicorn main:app --reload
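
The `uvicorn main:app --reload` command implies a `main.py` exposing a FastAPI `app`. Below is a minimal sketch of how the pieces above could be wired together; the `/detect` route and the hypothetical `detection` and `narration` modules are assumptions for illustration, not the repository's actual layout:

```python
# main.py: minimal FastAPI wiring of capture -> detection -> narration,
# matching the `uvicorn main:app --reload` command above.
from fastapi import FastAPI

from detection import detect_labels   # hypothetical module from the sketch above
from narration import VoiceNarrator   # hypothetical module from the sketch above

app = FastAPI(title="Voice-Enabled Smart Vision Backend")
narrator = VoiceNarrator()  # background TTS worker, started once at startup


@app.get("/detect")
def detect() -> dict:
    """Capture one frame, detect objects, narrate and return the labels."""
    labels = detect_labels()      # OpenCV capture + YOLO inference
    for label in labels:
        narrator.narrate(label)   # non-blocking: queued for the TTS thread
    return {"labels": labels}
```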

📌 Roadmap

  • Thread-safe voice narration
  • Mobile-compatible camera capture
  • Multi-language narration support
  • Integration with Bharat Explorer frontend
  • Offline mode for low-connectivity regions
