Misc. bug: Server response "model" seems to be incorrect #11069
Comments
It's expected; see the old ref: #10691
It was nice having it per generation, because requests might be submitted to a pool where the client doesn't know which server its requests went to / came from, and proxies didn't have to monitor it.
It was removed because you can set the model name via the --alias argument. It doesn't make sense to use the model path as the model name (or, if you really want, we can add it back as a model_path field).
I see, I think I never noticed it was a path because of the way I load the server. Respectfully, I think the right path was better than a wrong value without setting a flag, but I see your point.
If --alias works for you, then I think we don't need to add model_path anymore. IMO --alias should be more suitable for your reverse-proxy use case, since you can set it to a hostname / container ID / etc.
Great, thanks for the help!
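For reference, a minimal sketch of the `--alias` approach suggested above (the model path, alias name, and port are placeholders, not taken from this report):

```sh
# Start llama-server with an explicit alias; the HTTP API reports this
# string as the model name instead of a generic default.
./llama-server -m ./models/some-model-Q4_K_M.gguf --alias my-model --port 8080

# The "model" field of a /completion response should then carry the alias.
curl -s http://localhost:8080/completion \
  -d '{"prompt": "Hello", "n_predict": 8}' | jq '.model'
```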
Name and Version
version: 4410 (4b0c638)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Problem description & steps to reproduce
The `/completion` endpoint used to emit `output.generation_settings.model` with the correct value. It's now `output.model`, which is fine, but it seems to always say `gpt-3.5-turbo`, which is not the model I'm using. I've also tried setting `response_fields` to see if that "tricks" it into giving the right value, but it doesn't seem to. When adding the model in the server tool, it appears as in the attached logs.
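A hypothetical request along these lines illustrates what was tried (the prompt and the selected fields are placeholders, not the original ones):

```sh
# Hypothetical reproduction: request only selected fields from /completion.
# Even with response_fields set, the "model" value observed in the issue was
# the generic "gpt-3.5-turbo" rather than the loaded model's name.
curl -s http://localhost:8080/completion \
  -d '{
        "prompt": "Hello",
        "n_predict": 8,
        "response_fields": ["model", "content"]
      }'
```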
First Bad Commit
No response
Relevant log output