fix: OpenAiClient streaming error handling #436

declark1 · 2025-07-03T17:04:13Z

This PR fixes an issue where OpenAI/vLLM errors sent over a stream were not deserialized and passed through correctly.

Example: previously, the error event below was returned as a generic internal server error.

event: error
data: {"code":400,"details":"Requested sample logprobs of 21, which is greater than max allowed: 20"}

data: [DONE]
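
For readers unfamiliar with the failure mode, here is a minimal, dependency-free sketch of how an `event: error` frame like the one above could be separated from normal chunks and turned into a typed error. It is not the code from this PR: `SseEvent`, `StreamError`, `StreamItem`, `parse_sse`, and `classify` are hypothetical names, the fallback code of 500 is an assumption, and the field extraction is a naive string search standing in for a real JSON deserializer.

```rust
/// One server-sent event: optional `event:` name plus the joined `data:` payload.
#[derive(Debug, PartialEq)]
struct SseEvent {
    event: Option<String>,
    data: String,
}

/// Hypothetical error type mirroring the `code` and `details` fields in the example.
#[derive(Debug, PartialEq)]
struct StreamError {
    code: u16,
    details: String,
}

/// Result of classifying a single event.
#[derive(Debug, PartialEq)]
enum StreamItem {
    Chunk(String),
    Error(StreamError),
    Done,
}

/// Split a response body into events. Events are separated by a blank line.
fn parse_sse(body: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    for block in body.split("\n\n") {
        let mut event = None;
        let mut data: Vec<&str> = Vec::new();
        for line in block.lines() {
            if let Some(v) = line.strip_prefix("event:") {
                event = Some(v.trim().to_string());
            } else if let Some(v) = line.strip_prefix("data:") {
                // Drop the single optional space after the colon.
                data.push(v.strip_prefix(' ').unwrap_or(v));
            }
        }
        if event.is_some() || !data.is_empty() {
            events.push(SseEvent { event, data: data.join("\n") });
        }
    }
    events
}

/// Naive lookup of an unsigned number field. Not a real JSON parser.
fn number_field(json: &str, key: &str) -> Option<u16> {
    let needle = format!("\"{}\":", key);
    let start = json.find(&needle)? + needle.len();
    let digits: String = json[start..]
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Naive lookup of a string field. Does not handle escaped quotes.
fn string_field(json: &str, key: &str) -> Option<String> {
    let needle = format!("\"{}\":", key);
    let start = json.find(&needle)? + needle.len();
    let rest = json[start..].trim_start().strip_prefix('"')?;
    let end = rest.find('"')?;
    Some(rest[..end].to_string())
}

/// Map an event to a chunk, a typed error, or the end-of-stream marker.
fn classify(event: &SseEvent) -> StreamItem {
    if event.event.as_deref() == Some("error") {
        // Assumption: fall back to 500 and the raw payload if fields are missing.
        let code = number_field(&event.data, "code").unwrap_or(500);
        let details =
            string_field(&event.data, "details").unwrap_or_else(|| event.data.clone());
        return StreamItem::Error(StreamError { code, details });
    }
    if event.data == "[DONE]" {
        return StreamItem::Done;
    }
    StreamItem::Chunk(event.data.clone())
}
```

Running the example body through `parse_sse` and mapping each event with `classify` yields a `StreamItem::Error` carrying code 400 and the original details string, followed by `StreamItem::Done`. The point of the sketch is that the error frame is checked before the payload is treated as a regular chunk, so the upstream code and message survive instead of collapsing into a generic failure.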

Integration test error cases will be added as part of #417.

Signed-off-by: declark1 <[email protected]>

mdevino

Looks good to me. Thanks for figuring this one out!

declark1 added 2 commits July 3, 2025 09:54

Update handle_streaming error handling logic, add OpenAiErrorMessage

bf70bd4

Signed-off-by: declark1 <[email protected]>

Update From<orchestrator::Error> for Error implementation

1e17e7c

Signed-off-by: declark1 <[email protected]>

declark1 marked this pull request as ready for review July 3, 2025 17:05

declark1 requested review from gkumbhat, evaline-ju and mdevino as code owners July 3, 2025 17:05

mdevino approved these changes Jul 3, 2025

mdevino merged commit 6408740 into foundation-model-stack:main Jul 3, 2025
3 checks passed