Description
When using Hugging Face Chat UI with an OpenAI-compatible `/v1/chat/completions` endpoint (`type: "openai"` in the config), connected to an NVIDIA NIM container running `meta/llama-3.1-70b-instruct`, the backend responds with a 500 Internal Server Error:
```json
{
  "object": "error",
  "message": "list object has no element 0",
  "type": "InternalServerError",
  "param": null,
  "code": 500
}
```
I debugged the Chat UI container by printing the request it sends to the endpoint, and found that Chat UI produces a message format and content structure that the NIM container does not accept.
Failing Example (Sent Automatically by Hugging Face Chat UI):
```json
{
  "model": "meta/llama-3.1-70b-instruct",
  "messages": [
    { "role": "system", "content": "" },
    { "role": "user", "content": [{ "type": "text", "text": "Hello!" }] }
  ],
  "stream": true,
  "max_tokens": 8192,
  "stop": [],
  "temperature": 0.2,
  "top_p": 0.95
}
```
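To rule out Chat UI itself, the same payload can be replayed directly against the endpoint. The following is a minimal sketch (it assumes the `requests` package and that the container is reachable without authentication; the URL is the placeholder `baseURL` from my config below):

```python
# Hypothetical direct replay of the payload Chat UI sends, to rule out the UI itself.
import requests

payload = {
    "model": "meta/llama-3.1-70b-instruct",
    "messages": [
        {"role": "system", "content": ""},
        {"role": "user", "content": [{"type": "text", "text": "Hello!"}]},
    ],
    "stream": False,  # streaming off so the error body is easy to read
    "max_tokens": 64,
    "temperature": 0.2,
    "top_p": 0.95,
}

resp = requests.post(
    "https://nim-custom.net/v1/chat/completions",  # placeholder baseURL from my config
    json=payload,
    timeout=60,
)
print(resp.status_code, resp.text)  # expect the 500 "list object has no element 0" error
```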
The endpoint expects `content` to be a string with at least one character, but the system message's `content` is an empty string. The minimal change I could make to get a successful response from the NIM container was therefore to add a single whitespace to the system message's content:
{ "role": "system", "content": " " },
However, it would be even better if the messages were sent in the following format, with the empty system message omitted entirely and the user content sent as a plain string:
"messages": [
{ "role": "user", "content": "Hello!" }
],
I already tried setting `"systemRoleSupported": false`, but unfortunately this did not change anything.
This is the current Chat UI MODELS config:
```json
{
  "name": "meta/llama-3.1-70b-instruct",
  "displayName": "meta-llama/Meta-Llama-3.1-70B-Instruct",
  "parameters": {
    "temperature": 0.2,
    "top_p": 0.95,
    "repetition_penalty": 1.2,
    "top_k": 50,
    "truncate": 8192,
    "max_new_tokens": 8192,
    "stop": []
  },
  "endpoints": [
    {
      "type": "openai",
      "baseURL": "https://nim-custom.net/v1"
    }
  ]
}
```
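As a stopgap, I am considering putting a small normalizing proxy between Chat UI and the NIM container and pointing `baseURL` at it. The following is a hypothetical sketch (it assumes Flask and `requests`, reuses the placeholder URL from the config above, and only rewrites the two problematic fields):

```python
# Hypothetical normalizing shim between Chat UI and the NIM container.
# Assumes Flask and requests; the upstream URL is the placeholder from my config.
from flask import Flask, Response, request
import requests

NIM_BASE = "https://nim-custom.net/v1"

app = Flask(__name__)


def normalize_messages(messages):
    normalized = []
    for message in messages:
        content = message.get("content", "")
        # Flatten [{"type": "text", "text": ...}] parts into a plain string.
        if isinstance(content, list):
            content = "".join(
                part.get("text", "") for part in content if part.get("type") == "text"
            )
        # Drop system messages whose content is empty.
        if message.get("role") == "system" and not content.strip():
            continue
        normalized.append({**message, "content": content})
    return normalized


@app.post("/v1/chat/completions")
def chat_completions():
    body = request.get_json(force=True)
    body["messages"] = normalize_messages(body.get("messages", []))
    upstream = requests.post(f"{NIM_BASE}/chat/completions", json=body, stream=True)
    # Pass the (possibly streamed) upstream response through unchanged.
    return Response(
        upstream.iter_content(chunk_size=None),
        status=upstream.status_code,
        content_type=upstream.headers.get("Content-Type"),
    )
```

That said, I would much rather avoid the extra hop and fix this in Chat UI itself, so a few questions: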
- Has anyone encountered this same issue with OpenAI-compatible backends?
- Is there a known way to override or customize the prompt format (e.g., using `chatPromptTemplate`) for `"openai"`-type endpoints in Chat UI?
- Can the system message be omitted or defaulted when content is empty?
- Would a workaround like modifying the MODELS config help prevent the UI from sending empty content?
Any guidance, workarounds, or suggestions from those who’ve run into this would be greatly appreciated!