
server: add OpenAI compatible response format for legacy /completions with b… #10645

Closed
wants to merge 1 commit into from


Nero7991

@Nero7991 Nero7991 commented Dec 4, 2024

This is based on a previous PR

However, @ngxson seems to be working on refactoring server.cpp to avoid the use of JSON, as stated here, so I don't expect this to be merged easily. It might still be of use to someone else.

Support for the (almost) full OpenAI API response format for the legacy completion-related endpoints (including when logprobs is specified).
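For readers unfamiliar with the target format, here is a rough sketch of what an OpenAI-style legacy completion response looks like when logprobs are requested, plus a small reader for it. The field layout follows the public OpenAI legacy completions format; the sample values are made up for illustration and are not output from this PR, and `token_logprobs` is a hypothetical helper, not something in llama.cpp.

```python
# Sample payload in the shape of an OpenAI legacy completions response.
# All values are illustrative only; they were not produced by llama.cpp.
sample_response = {
    "id": "cmpl-example",
    "object": "text_completion",
    "created": 0,
    "model": "example-model",
    "choices": [
        {
            "text": " world!",
            "index": 0,
            "logprobs": {
                "tokens": [" world", "!"],
                "token_logprobs": [-0.25, -1.5],
                "top_logprobs": [{" world": -0.25}, {"!": -1.5}],
                "text_offset": [5, 11],
            },
            "finish_reason": "length",
        }
    ],
    "usage": {"prompt_tokens": 1, "completion_tokens": 2, "total_tokens": 3},
}


def token_logprobs(response, choice_index=0):
    """Return (token, logprob) pairs for one choice (hypothetical helper)."""
    logprobs = response["choices"][choice_index].get("logprobs")
    if not logprobs:
        # logprobs is absent or null when the request did not ask for it.
        return []
    return list(zip(logprobs["tokens"], logprobs["token_logprobs"]))
```

A client such as a benchmark harness would typically read the parallel `tokens` / `token_logprobs` arrays like this, which is why matching the response shape matters.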

When oai_compat is set to True in the request (as suggested by @ngxson), the old response format is used (check tests).

HELM benchmarks from CRFM have support for an OpenAI-compatible API server that uses this endpoint, so this enables testing differently quantized models for degradation against this benchmark. I tested it on a QwQ Preview 32B GGUF Q4_K_M to evaluate the model against other frontier models. I've described that here

@ngxson
Collaborator

ngxson commented Dec 12, 2024

FYI, logprobs will be fixed in #10783, and it turns out to be a bit complicated: it's not just a change to the output format, it also needs some extra computation.

Btw, on second thought, I think we can drop the idea of adding "oai_compat" to each request, which I suggested earlier. Instead, we can make /completions non-OAI and /v1/completions OAI.

I can take over this part if you're too busy.
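The path-based split proposed in this comment could look roughly like the sketch below: the response format is chosen from the route rather than from a per-request flag. This is only an illustration in Python; `response_format_for` and the `"oai"` / `"native"` labels are hypothetical names and do not reflect how server.cpp is actually structured.

```python
def response_format_for(path: str) -> str:
    """Pick a response format from the request path (hypothetical helper)."""
    normalized = path.rstrip("/")
    # Assumption: only the /v1-prefixed route returns the OpenAI format.
    if normalized == "/v1/completions":
        return "oai"
    # Assumption: the unprefixed route keeps the server's own format.
    if normalized == "/completions":
        return "native"
    raise ValueError(f"unhandled path: {path}")
```

With this approach, existing clients of /completions keep their current behavior, and OpenAI-style clients only need to point at /v1/completions.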

@JeroenAdam

JeroenAdam commented Dec 20, 2024

(deleted) I have created a new issue #10924

@ngxson
Collaborator

ngxson commented Dec 25, 2024

Superseded by #10645

@ngxson ngxson closed this Dec 25, 2024
Labels
examples python python script changes server
3 participants