Description
Neither the old guided decoding API (guided_json) nor the new structured outputs API (structured_outputs) works in this worker. With the old API, JSON schema constraints are ignored; with the new API, the request fails with a parameter error.
Issues:
- Old API (guided_json): Schema constraints are completely ignored. The model follows a contradictory system prompt instead of the enforced JSON schema.
- New API (structured_outputs): Throws "Unexpected keyword argument 'extra_body'" error.
- Both backends affected: outlines and lm-format-enforcer both fail to enforce constraints.
Steps to reproduce:
Test 1 - Old API failure:
{
  "input": {
    "messages": [
      {
        "role": "system",
        "content": "OUTPUT YAML ONLY: reasoning: |\n your thoughts\nquestion: your question"
      },
      {
        "role": "user",
        "content": "Analyze: The Eiffel Tower was completed in 1889 for the World's Fair."
      }
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1
    },
    "guided_json": {
      "type": "object",
      "properties": {
        "reasoning": {"type": "string"},
        "question": {"type": "string"}
      },
      "required": ["reasoning", "question"]
    },
    "guided_decoding_backend": "outlines"
  }
}

Result: Model outputs YAML, completely ignoring JSON schema constraints.
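The Test 1 result can be checked mechanically rather than by eye. The following is a minimal sketch, assuming the worker's generated text is available as a plain string; the helper name `conforms_to_schema` is hypothetical, and the check covers only the two required string fields from the schema above, not full JSON Schema validation.

```python
import json

# Field names copied from the "required" list in the Test 1 schema.
REQUIRED_STRING_FIELDS = ("reasoning", "question")


def conforms_to_schema(output_text: str) -> bool:
    """Return True if output_text is a JSON object with the required string fields.

    Hypothetical helper for reproducing this issue; it is not part of the worker.
    """
    try:
        parsed = json.loads(output_text)
    except json.JSONDecodeError:
        # YAML or free-form text ends up here.
        return False
    if not isinstance(parsed, dict):
        return False
    return all(isinstance(parsed.get(name), str) for name in REQUIRED_STRING_FIELDS)
```

If guided decoding were enforced, this check would be expected to pass for every response to the Test 1 request, regardless of the system prompt.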
Test 2 - New API failure:
{
  "input": {
    "messages": [
      {"role": "user", "content": "Analyze: Some text"}
    ],
    "sampling_params": {
      "max_tokens": 500,
      "temperature": 0.1,
      "extra_body": {
        "structured_outputs": {
          "json": {
            "type": "object",
            "properties": {
              "reasoning": {"type": "string"},
              "question": {"type": "string"}
            },
            "required": ["reasoning", "question"]
          }
        }
      }
    }
  }
}

Result: "Unexpected keyword argument 'extra_body'" error.
Environment:
- Worker: vllm v2.9.6 (vLLM 0.11.0)
- GUIDED_DECODING_BACKEND environment variable set to outlines
Possible cause:
The worker was updated to vLLM 0.11.0 but may not have been updated to handle the API changes. vLLM 0.11.0 introduced the new structured_outputs API while deprecating the old guided decoding fields, but the worker's custom wrapper doesn't support the new API and the old API may no longer function properly.
Both guided decoding approaches are currently unusable in this worker version.
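As an illustration of the kind of handling that appears to be missing, here is a minimal sketch of how a wrapper could separate the structured-output spec from the sampling parameters before they are passed on as keyword arguments. This is an assumption about the shape of a fix, not the worker's actual code: the function name `normalize_structured_output_request` is hypothetical, and it only rearranges the request dictionaries shown in Tests 1 and 2.

```python
from typing import Any, Optional


def normalize_structured_output_request(
    job_input: dict[str, Any],
) -> tuple[dict[str, Any], Optional[dict[str, Any]]]:
    """Split a job input into plain sampling params and a structured-output spec.

    Hypothetical helper: the worker's real request handling is not shown in
    this issue, so the key names are taken only from the Test 1 and Test 2
    payloads.
    """
    # Copy so the caller's dictionary is not mutated.
    sampling_params = dict(job_input.get("sampling_params", {}))

    # Remove extra_body so it is never forwarded as a sampling keyword argument
    # (the likely source of the "Unexpected keyword argument" error in Test 2).
    extra_body = sampling_params.pop("extra_body", None) or {}
    spec = extra_body.get("structured_outputs")

    # Fall back to the old-style top-level guided_json field (Test 1).
    if spec is None and "guided_json" in job_input:
        spec = {"json": job_input["guided_json"]}

    return sampling_params, spec
```

With the Test 2 payload this returns the sampling parameters without `extra_body`, plus the `{"json": ...}` spec; with the Test 1 payload it wraps `guided_json` in the same shape. How the spec is then handed to vLLM 0.11.0 is left open here, since that depends on the worker's engine integration.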