feat: vllm specific fields for embedding completion #1084
base: main
Conversation
Signed-off-by: yxia216 <[email protected]>
Force-pushed from ae164bf to e0b150f
Do you want to add documentation to https://github.com/envoyproxy/ai-gateway/blob/main/site/docs/capabilities/llm-integrations/vendor-specific-fields.md?
@mathetake This is not ready yet and is meant for discussion. Thanks a lot for your suggestions!
@@ -1320,3 +1325,19 @@ type AnthropicVendorFields struct {
 	// https://docs.anthropic.com/en/api/messages#body-thinking
 	Thinking *anthropic.ThinkingConfigParamUnion `json:"thinking,omitzero"`
 }
+
+type vLLMEmbeddingVendorFields struct {
+	AddSpecialTokens param.Opt[bool] `json:"add_special_tokens,omitzero"`
Can you not use the openai sdk just for the optional type?
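For reference, a minimal sketch of what that suggestion could look like, using a plain pointer instead of the openai-go SDK's param.Opt so that nil means "unset". This is an illustration only, not the code in this PR:

```go
// Sketch only: vLLM vendor fields for embedding requests without depending on
// the openai-go SDK's optional type. A nil pointer means "not set", and
// omitempty drops the field from the serialized JSON in that case.
type vLLMEmbeddingVendorFields struct {
	// AddSpecialTokens mirrors vLLM's add_special_tokens request option.
	AddSpecialTokens *bool `json:"add_special_tokens,omitempty"`
}
```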
Thanks! A related question: currently the chat completion and embedding request schemas are maintained manually. Do you have plans to import https://github.com/openai/openai-go directly and embed new fields for different providers? I thought you might, so I tried to keep this consistent with that.
> Do you have plans to import https://github.com/openai/openai-go directly and embed new fields for different providers?
There is no concrete plan, but some people have started thinking about it. I'm not sure we want to do that, though, due to the loss of control, the lack of compliance in their OpenAPI (not AI) specification, etc. See my old comment here:
ai-gateway/internal/apischema/openai/openai.go
Lines 6 to 8 in e33a5f3
// Package openai contains the OpenAI API schema definitions.
// Note that we intentionally do not use code generation tools like OpenAPI Generator, not only to keep the code simple
// but also because OpenAI's OpenAPI definition is not compliant with the spec and the existing tools do not work well.
For example, they sometimes use "undocumented" fields, etc. Given that the SDK is auto-generated by a vendor (Stainless), I am not 100% sure we want to do that. In other words, maintaining the schema for a select few endpoints is not that painful.
Edited: on the other hand, I am not completely opposed to it either. It's just that the current state gives me the impression that using it would be a cure worse than the disease. My 2c.
My understanding is that these added fields in the request body will be forwarded to the self-hosted vLLM intact, even without explicitly adding them to the API schema.
vLLM supports many such fields in both chat completions and embeddings. Should this just be a documentation update highlighting that vLLM fields are supported, or an example?
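As a rough illustration of that passthrough idea, here is a minimal sketch of an OpenAI-compatible embedding request that simply carries the vLLM-specific add_special_tokens field in its JSON body. The endpoint and model name below are placeholders, not anything defined by this PR:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

func main() {
	// Hypothetical request body: the standard OpenAI embedding fields plus the
	// vLLM-specific add_special_tokens flag. If the gateway forwards unknown
	// body fields untouched, a self-hosted vLLM backend receives them as-is.
	body := map[string]any{
		"model":              "my-embedding-model", // placeholder model name
		"input":              []string{"hello world"},
		"add_special_tokens": false, // vLLM-specific field
	}
	payload, err := json.Marshal(body)
	if err != nil {
		panic(err)
	}

	// Placeholder endpoint for a gateway or vLLM server listening locally.
	resp, err := http.Post("http://localhost:8000/v1/embeddings", "application/json", bytes.NewReader(payload))
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```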
Codecov Report
✅ All modified and coverable lines are covered by tests.
❌ Your project status has failed because the head coverage (78.03%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1084      +/-   ##
==========================================
- Coverage   78.32%   78.03%   -0.30%
==========================================
  Files          82       82
  Lines        9647     9647
==========================================
- Hits         7556     7528      -28
- Misses       1728     1764      +36
+ Partials      363      355       -8

☔ View full report in Codecov by Sentry.
Description
Adds vLLM-specific fields for embedding completions.