This repository is a simple demo of how to use the llama.cpp server with SmolVLM 500M to get real-time object detection.
- Install llama.cpp
- Run the server:

  ```bash
  llama-server -hf ggml-org/SmolVLM-500M-Instruct-GGUF
  ```

  Note: you may need to add `-ngl 99` to enable GPU offload (if you are using an NVIDIA/AMD/Intel GPU)

  Note (2): You can also try other models here
- Open `index.html`
- Optionally, change the instruction (for example, make it return JSON; see the JSON sketch after this list)
- Click on "Start" and enjoy (a minimal version of the request the page sends is sketched below)
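For reference, here is a minimal sketch of the kind of request the page can send for each captured frame. It assumes llama-server is running on its default port (8080) and exposes its OpenAI-compatible `/v1/chat/completions` endpoint; `frameDataUri` stands in for a base64 JPEG data URI captured from the webcam (e.g. via a canvas). This is a sketch, not the exact code in `index.html`.

```js
// Send one webcam frame plus a text instruction to llama-server and
// return the model's reply. Assumes the server from the steps above is
// listening on localhost:8080 (its default).
async function describeFrame(instruction, frameDataUri) {
  const response = await fetch("http://localhost:8080/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      max_tokens: 100,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: instruction },
            // frameDataUri is a "data:image/jpeg;base64,..." string
            { type: "image_url", image_url: { url: frameDataUri } },
          ],
        },
      ],
    }),
  });
  const data = await response.json();
  return data.choices[0].message.content; // the model's description of the frame
}
```

Calling something like this in a loop, one request per captured frame, is essentially what the "Start" button does.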
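And here is a sketch of what the JSON variant of the instruction might look like. The prompt wording below is only an example, not the demo's built-in instruction, and small VLMs do not always emit valid JSON, so the client should parse defensively.

```js
// Hypothetical instruction asking for structured output instead of prose.
const instruction =
  'List the objects you see as JSON, e.g. {"objects": ["cup", "keyboard"]}. Reply with JSON only.';

// Parse the model's reply, falling back to an empty list on malformed output.
function parseObjects(reply) {
  try {
    return JSON.parse(reply).objects ?? [];
  } catch {
    return [];
  }
}
```

Combined with the request sketch above: `parseObjects(await describeFrame(instruction, frameDataUri))`.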