Description
Please describe the feature you want
When running Tabby on a CPU-only host, the official Docker image still attempts to launch the CUDA-based `llama-server` process for embeddings. This causes continuous errors such as:
`error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory`
Currently, CPU users must manually override the container's entrypoint to `/opt/tabby/bin/tabby-cpu`, which bypasses the default startup logic but breaks the intended Docker UX.
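For context, the workaround looks roughly like this in a compose file. The image tag and entrypoint path are taken from this report; the model name, port, and volume are illustrative defaults, not prescribed values:

```yaml
services:
  tabby:
    image: registry.tabbyml.com/tabbyml/tabby:latest
    # Workaround: bypass the default CUDA startup path entirely.
    entrypoint: ["/opt/tabby/bin/tabby-cpu"]
    command: ["serve", "--model", "StarCoder-1B"]
    ports:
      - "8080:8080"
    volumes:
      - "$HOME/.tabby:/data"
```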
Feature request:
Please add a way to run Tabby's Docker image in CPU-only mode without errors. For example:
- Support an environment variable like `CPU_ONLY=true` that skips launching `llama-server`, or
- Add a dedicated `registry.tabbyml.com/tabbyml/tabby:cpu` tag, or
- Automatically detect missing CUDA and gracefully start `tabby-cpu` instead of `tabby`.
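As a sketch of how the first and third options could be combined, a wrapper entrypoint along these lines could check an opt-out variable and probe for the CUDA driver library before choosing a binary. This script is hypothetical and not part of the current image; the `CPU_ONLY` variable is the name proposed above, and the binary paths are those from this report:

```sh
#!/bin/sh
# Hypothetical wrapper entrypoint: fall back to tabby-cpu when CUDA
# is unavailable, or when the user explicitly opts out via CPU_ONLY=true.
set -e

# Explicit opt-out takes precedence.
if [ "${CPU_ONLY:-false}" = "true" ]; then
    exec /opt/tabby/bin/tabby-cpu "$@"
fi

# Probe the dynamic linker cache for the CUDA driver library.
# If libcuda.so.1 is absent, llama-server cannot start anyway.
if ! ldconfig -p 2>/dev/null | grep -q 'libcuda\.so\.1'; then
    echo "libcuda.so.1 not found; starting in CPU-only mode" >&2
    exec /opt/tabby/bin/tabby-cpu "$@"
fi

exec /opt/tabby/bin/tabby "$@"
```

Shipping something like this as the image's default entrypoint would let `docker compose up` work unchanged on GPU-less hosts while preserving current behavior when CUDA is present.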
This would make CPU-only deployments first-class and eliminate the `libcuda.so.1` crash loop.
Additional context
- Affected image: `registry.tabbyml.com/tabbyml/tabby:latest`
- Host environment: Linux VPS, no NVIDIA GPU
- The issue reproduces with the default compose file and is resolved only by forcing the entrypoint: `entrypoint: ["/opt/tabby/bin/tabby-cpu"]`
- Example log snippet: `/opt/tabby/bin/llama-server: error while loading shared libraries: libcuda.so.1`
This change would improve accessibility for developers hosting Tabby on small or shared servers without GPUs.
Please reply with a 👍 if you want this feature.