ReelTek Vision System

This repo showcases the ReelTek Vision System, a real-time camera analysis platform that runs local vision-language models through a llama.cpp server for scene understanding. ReelTek uses SmolVLM and text-to-speech (TTS) to provide real-time insights from your webcam feed.

Demo

  • Example: demo recording of the system in action (embedded media)

Features

  • Real-time video analysis and object detection using webcam feed
  • Integration with lightweight vision-language models for efficient processing
  • Modern, user-friendly web interface with video display and control panels
  • Customizable analysis instructions for tailored output
  • System metrics monitoring for performance optimization
  • Speech synthesis support: hear the AI's analysis spoken aloud
    • Browser-based TTS with multiple voice options
    • Optional Python TTS server for enhanced voice quality

Prerequisites

  • A webcam connected to your computer
  • A modern web browser (e.g., Chrome, Firefox)
  • An NVIDIA, AMD, or Intel GPU (optional, but recommended for better performance)

How to Set Up

  1. Install llama.cpp:

    • Download and install llama.cpp from the official GitHub repository (https://github.com/ggml-org/llama.cpp).
    • Follow the installation instructions specific to your operating system (Windows, macOS, Linux).
  2. Run the llama.cpp Server:

    • Open a terminal or command prompt.
    • Execute the following command to start the server with the SmolVLM model:
      llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
    • Or, on Windows with a CUDA build:
      .\llama-b5538-bin-win-cuda-12.4-x64\llama-server.exe -hf ggml-org/SmolVLM-500M-Instruct-GGUF -ngl 99
    • Note for GPU users: the -ngl 99 flag shown above is required for GPU acceleration; it offloads the model layers to the GPU.

    • Alternative Models: Explore other compatible multimodal models listed in the llama.cpp documentation.

  3. Launch the Demo:

    • Open the index.html file in your web browser. This file is located in the root directory of this repository.
  4. Customize Instructions (Optional):

    • Modify the instruction prompt in the web interface to change how the model interprets and describes the camera feed. For example, you can format the output as JSON for structured data.
  5. Start the Analysis:

    • Click the "Start" button in the web interface to begin real-time video analysis. Allow browser access to your webcam when prompted.
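
Under the hood, once analysis starts the page periodically sends webcam frames to the llama.cpp server over its OpenAI-compatible HTTP API. The snippet below is a minimal, hypothetical sketch of one such request, assuming the server's default address (http://localhost:8080) and its /v1/chat/completions endpoint; the function name, element handling, and prompt wiring are illustrative, not taken from index.html:

    // Minimal sketch: capture one webcam frame and ask the llama.cpp server to describe it.
    // Assumes the default server address and an OpenAI-compatible chat completions endpoint;
    // the element handling and prompt text are illustrative, not the exact index.html code.
    async function analyzeFrame(videoEl, instruction) {
      const canvas = document.createElement('canvas');
      canvas.width = videoEl.videoWidth;
      canvas.height = videoEl.videoHeight;
      canvas.getContext('2d').drawImage(videoEl, 0, 0);
      const frame = canvas.toDataURL('image/jpeg', 0.7); // base64 data URL of the current frame

      const response = await fetch('http://localhost:8080/v1/chat/completions', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({
          max_tokens: 100,
          messages: [{
            role: 'user',
            content: [
              { type: 'text', text: instruction },
              { type: 'image_url', image_url: { url: frame } },
            ],
          }],
        }),
      });
      const data = await response.json();
      return data.choices[0].message.content; // the model's description of the frame
    }

Calling a function like this on a short timer (for example, once per second) approximates the real-time loop the interface runs.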

Usage

  • Once the system is running, point your webcam at objects or scenes you want to analyze.
  • The interface will display real-time descriptions or detections based on the model's output, alongside system metrics and controls.
  • To enable speech output: Check the "Enable Speech Output" checkbox in the control panel. The system will speak the AI's analysis aloud.
  • To stop the analysis, click the "Stop" button or close the browser tab.

Speech Synthesis Options

  1. Browser TTS (Default): Works immediately, no setup required
  2. Python TTS (Optional): offers better voice quality; requires running the optional Python TTS server (see its setup instructions in this repository)
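
As an illustration of option 1, browser-based speech can be driven with the standard Web Speech API. This is a minimal sketch under that assumption, not necessarily the exact code the interface uses:

    // Sketch of browser TTS via the standard Web Speech API (SpeechSynthesis).
    // Voice selection here is illustrative; the interface may pick voices differently.
    function speak(text, voiceName) {
      const utterance = new SpeechSynthesisUtterance(text);
      const match = window.speechSynthesis.getVoices().find(v => v.name === voiceName);
      if (match) utterance.voice = match;     // otherwise fall back to the default voice
      window.speechSynthesis.cancel();        // interrupt any speech still playing
      window.speechSynthesis.speak(utterance);
    }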

Customization

  • Model Selection: You can switch to other vision-language models supported by llama.cpp by changing the model name in the server command.
  • Instruction Tuning: Experiment with different prompts in the interface to get varied outputs, such as detailed descriptions, specific object focus, or structured formats like JSON.
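
For instance, an instruction aimed at structured output might look like the following (an illustrative prompt, not a built-in default):

    Describe the scene as JSON with the keys "objects" (a list of visible objects), "people_count" (an integer), and "summary" (one sentence).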

Troubleshooting

  • Webcam Access Issues: Ensure your browser has permission to access your webcam. Check browser settings if prompted.
  • Server Not Running: Verify that the llama.cpp server is active before opening index.html. If the server fails to start, check for error messages in the terminal and ensure the model is correctly specified.
  • Performance Lag: If the system runs slowly, ensure GPU acceleration is enabled (if available) or reduce the webcam capture resolution (see the sketch after this list).
  • For additional support, refer to the llama.cpp GitHub issues page or community forums.
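
On the resolution point, the capture size is ultimately requested from the browser by the page itself. The snippet below is a hedged sketch of how a lower resolution can be requested with the standard getUserMedia API; the target values and element handling are assumptions, not the exact code in index.html:

    // Sketch: request a lower-resolution webcam stream via standard getUserMedia constraints.
    // 640x480 is an example target; the actual resolution used by index.html may differ.
    async function startLowResCamera(videoEl) {
      const stream = await navigator.mediaDevices.getUserMedia({
        video: { width: { ideal: 640 }, height: { ideal: 480 } },
        audio: false,
      });
      videoEl.srcObject = stream; // attach the stream to the page's <video> element
    }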

Contributing

Contributions to the ReelTek Vision System are welcome! If you have improvements, bug fixes, or new features to add:

  1. Fork this repository.
  2. Create a new branch for your changes.
  3. Submit a pull request with a detailed description of your updates.

License

This project is licensed under the terms specified in the LICENSE file. Please review it for details on usage and distribution rights.