
Example for running custom GGUF models #517

@drtinkerer

Description


Hi all,
I built a custom GGUF model file (around 5 GB) into an Ollama Docker container image.
With plain Docker I can run this container without any issues and it works as expected.

Following the docs, I tried running the same image on KubeAI as a Model CRD using:

apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: model-test
spec:
  features: ["TextGeneration"]
  owner: custom
  image: drtinkerer/ollama-test:deepseek-test
  url: "ollama://deepseek-local"
  engine: OLlama
  resourceProfile: cpu:1

The pod that comes up has the startup probe below configured:

    Startup:    exec [bash -c /bin/ollama pull deepseek-local && /bin/ollama cp deepseek-local model-test && /bin/ollama run model-test hi] delay=1s timeout=10800s period=3s #success=1 #failure=10

Now, the url I have put there is not available on the Ollama registry, yet the startup probe tries to pull it.
So the startup probe fails because it attempts to pull a non-existent model, whereas I want to use the model that is already baked into the Ollama container image itself.

Error: pull model manifest: file does not exist
  Normal  Started  54s (x5 over 2m55s)  kubelet  Started container server
  Normal  Created  54s (x5 over 2m55s)  kubelet  Created container: server
  Normal  Pulled   54s (x5 over 2m55s)  kubelet  Container image "drtinkerer/ollama-test:deepseek-test" already present on machine
  Normal  Killing  25s (x5 over 2m25s)  kubelet  Container server failed startup probe, will be restarted

In my particular use case, with the GGUF model inside the Ollama container, the startup probe should only run /bin/ollama run model-test. Alternatively, I would like to configure the startup probe as an httpGet instead of an exec script for my own pod.
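For illustration, if KubeAI exposed the probe on the Model spec or the generated pod, a standard Kubernetes httpGet startup probe against Ollama's HTTP API might look like the sketch below. This is purely hypothetical: KubeAI does not currently offer such a field, and the path and port assume Ollama's default `/api/tags` endpoint on port 11434.

```yaml
# Hypothetical sketch: a user-configurable startup probe, if the
# Model CRD (or the generated pod spec) allowed overriding it.
startupProbe:
  httpGet:
    path: /api/tags     # Ollama endpoint that lists locally available models
    port: 11434         # Ollama's default listening port
  initialDelaySeconds: 1
  periodSeconds: 3
  failureThreshold: 10
```

An httpGet probe like this would only check that the Ollama server is up and serving, without triggering a registry pull.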

The same behaviour occurs when I upload my GGUF to a PVC and try to load it from there:
it is always the startup probe that fails.
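For reference, the PVC variant I tried looked roughly like the sketch below. The `pvc://` URL scheme is my reading of the KubeAI model-source docs, and the claim name is a placeholder, so treat this as an assumption rather than a verified configuration:

```yaml
apiVersion: kubeai.org/v1
kind: Model
metadata:
  name: model-test
spec:
  features: ["TextGeneration"]
  owner: custom
  url: "pvc://my-models-pvc"   # placeholder PVC claim name holding the GGUF
  engine: OLlama
  resourceProfile: cpu:1
```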

I believe this is the function responsible for configuring the startup probe, and it is what needs to be made configurable:

https://github.com/substratusai/kubeai/blob/d6e393ca76f11da76b3a6db74b737b94d1a4f057/internal/modelcontroller/engine_ollama.go#L171

  • Is my method of baking the GGUF model into the Docker image correct?
  • Are there better ways to achieve the same?
  • Can the startup probe be made customisable?

Any help is appreciated. Thanks :)
