Team,
I recently had the opportunity to work with this project and was able to get everything up and running successfully using the $ make install command.
Overall, the setup process went smoothly, but I did have to make a few adjustments due to environment-specific and hardware-related constraints.
1. Container Image Pull Issues
Initially, I encountered issues pulling container images due to authentication restrictions. To resolve this, I followed this Red Hat documentation article on registry authentication: https://access.redhat.com/RegistryAuthentication
I ran the following commands to create and link a pull secret:
$ oc create secret generic <pull_secret_name> \
--from-file=.dockerconfigjson=<path/to/.docker/config.json> \
--type=kubernetes.io/dockerconfigjson
$ oc secrets link default <pull_secret_name> --for=pull
I executed these commands during the $ make install process, specifically at the point where the script prompts for the Hugging Face token. This allowed the rest of the setup to proceed using the pull secret by default.
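For anyone who wants to double-check this step, a quick sanity check (assuming the secret was created in the same namespace the installer deploys into, which may differ in your setup) is to confirm the secret exists and is listed under the default service account's image pull secrets:
$ oc get secret <pull_secret_name>                          # the secret should exist in the current namespace
$ oc describe sa default | grep -A 2 "Image pull secrets"   # the linked secret should appear here after the link step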
2. Precision Compatibility with GPU
Once the containers were successfully pulled, I ran into a precision-related error because the GPU in my machine (a Tesla T4) does not support BF16:
ValueError: Bfloat16 is only supported on GPUs with compute capability of at least 8.0.
Your Tesla T4 GPU has compute capability 7.5. You can use float16 instead by explicitly
setting the `dtype` flag in CLI, for example: --dtype=half.
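As a quick way to tell up front whether a GPU will hit this, you can query its compute capability; BF16 needs compute capability 8.0 or higher, and the compute_cap query field is available on reasonably recent NVIDIA drivers:
$ nvidia-smi --query-gpu=name,compute_cap --format=csv,noheader   # e.g. "Tesla T4, 7.5" -- anything below 8.0 needs --dtype=half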
To resolve this, I changed the model precision to half by modifying the following files:
deploy/helm/rag-ui/values.yaml
safety-model:
  extraArgs:
    - --dtype
    - "half"
    - --model
    - meta-llama/Llama-Guard-3-1B
deploy/helm/rag-ui/charts/llama-serve/values.yaml
args:
  - --enable-auto-tool-choice
  - --chat-template
  - /app/tool_chat_template_llama3.2_json.jinja
  - --tool-call-parser
  - llama3_json
  - --port
  - "8000"
  - --dtype
  - "half"
  - '--max-model-len'
  - '8192'
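If you end up editing these values after the initial install rather than before it, re-running the install target (or upgrading the release by hand) should roll the new arguments out. A minimal sketch, assuming the chart was installed as a Helm release named rag-ui in a namespace called llama-stack (both names are assumptions on my part, so adjust them to your environment):
$ helm upgrade rag-ui deploy/helm/rag-ui -n llama-stack   # re-render the chart with the updated values (release name and namespace are assumptions)
$ oc get pods -n llama-stack -w                           # watch the serving pods restart with the new --dtype half argument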
Summary
Everything worked well after these changes. I’m sharing this here in case others run into similar issues or if the team wants to consider incorporating these workarounds into the documentation for broader compatibility.
Thanks!
~ Limitless