Description
Hi All,
I built my custom GGUF model file, which is around 5 GB, into an Ollama Docker container image.
With Docker, I can run this container without any issues and it works as expected.
Following the docs, I tried running the same image on KubeAI as a Model CRD using:
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: model-test
spec:
  features: ["TextGeneration"]
  owner: custom
  image: drtinkerer/ollama-test:deepseek-test
  url: "ollama://deepseek-local"
  engine: OLlama
  resourceProfile: cpu:1
The pod that comes up has the following startup probe configured:
Startup: exec [bash -c /bin/ollama pull deepseek-local && /bin/ollama cp deepseek-local model-test && /bin/ollama run model-test hi] delay=1s timeout=10800s period=3s #success=1 #failure=10
The URL I put there is not available in the Ollama registry, and the startup probe tries to pull it.
So the startup probe fails because it tries to pull a model that does not exist in the registry, whereas I want to use the model that is baked into the Ollama container image itself.
Error: pull model manifest: file does not exist
Normal Started 54s (x5 over 2m55s) kubelet Started container server
Normal Created 54s (x5 over 2m55s) kubelet Created container: server
Normal Pulled 54s (x5 over 2m55s) kubelet Container image "drtinkerer/ollama-test:deepseek-test" already present on machine
Normal Killing 25s (x5 over 2m25s) kubelet Container server failed startup probe, will be restarted
In my particular use case, with the GGUF model inside the Ollama container, the startup probe should only run /bin/ollama run model-test, or perhaps I would like to configure the startup probe as httpGet instead of a script for my own pod.
The same behaviour occurs when I upload my GGUF file to a PVC and try to load it from there.
It is always the startup probe that fails.
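To illustrate the httpGet variant I have in mind, here is a rough sketch of a standard Kubernetes container probe. It is not a field the Model CRD exposes today, as far as I can tell. It assumes the Ollama server listens on its default port 11434 and answers on its root path once it is up; the timing values are only illustrative and mirror the generated probe.

```yaml
# Hypothetical container-level probe, shown only to illustrate the request.
startupProbe:
  httpGet:
    path: /        # assumption: Ollama responds on its root path once the server is running
    port: 11434    # assumption: Ollama's default listen port
  initialDelaySeconds: 1
  periodSeconds: 3
  failureThreshold: 10   # illustrative value, same as in the generated probe
```

A probe like this would only confirm that the server is reachable, not that model-test is loaded, so it may still need to be combined with a check such as /bin/ollama run model-test.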
I believe this is the function responsible for configuring the startup probe, and it needs to be made configurable.
- Is my method of putting the GGUF model inside the Docker image correct?
- Are there better ways to achieve the same result?
- Can the startup probe be made customisable?
Any help is appreciated. Thanks :)