cli : auto activate conversation mode if chat template is available #11214
Conversation
This change is related to these upstream PRs:

- ggerganov/llama.cpp#11195 allows using tag-based repo names like on ollama
- ggerganov/llama.cpp#11214 automatically turns on `--conversation` mode for models that have a chat template

Example:

```sh
# for an "instruct" model, conversation mode is enabled automatically
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF

# for a non-instruct model, it runs as completion
llama-cli -hf TheBloke/Llama-2-7B-GGUF -p "Once upon a time,"
```
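The "models having chat template" check above could be as simple as looking for the template in the model's metadata. A minimal C++ sketch of that idea, assuming a plain key/value view of the GGUF metadata (`has_builtin_chat_template` is a hypothetical helper, not the real llama.cpp API; `tokenizer.chat_template` is the standard GGUF metadata key):

```cpp
#include <map>
#include <string>

// Hypothetical helper (not the actual llama.cpp code): a model counts as
// chat-capable when its GGUF metadata carries a non-empty chat template
// under the standard "tokenizer.chat_template" key.
bool has_builtin_chat_template(const std::map<std::string, std::string>& gguf_meta) {
    auto it = gguf_meta.find("tokenizer.chat_template");
    return it != gguf_meta.end() && !it->second.empty();
}
```

An instruct model would carry a Jinja-style template under that key, while a base completion model typically would not, which is what makes the auto-detection possible.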
For what it's worth, this was a fairly problematic change for me: my CLI scripts suddenly broke because llama-cli now decides that my prompts (passed with `-p`) are system prompts under auto-conversation. I just have to pass `-no-cnv` to restore script usage, which isn't a big deal, but it was quite confusing out of the blue. As a suggestion, perhaps a message under
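The breakage described above comes down to how the `-p` text is routed once conversation mode is on. A hedged sketch of that routing (names are illustrative, not llama.cpp internals):

```cpp
#include <string>

// Sketch of the reported behavior, not the actual llama.cpp code: in
// conversation mode the -p text is taken as the system message, while in
// completion mode it is the prefix the model continues from.
struct PromptUse {
    std::string system_message;    // used when conversation mode is on
    std::string completion_prefix; // used when conversation mode is off
};

PromptUse route_prompt(const std::string& p, bool conversation_mode) {
    if (conversation_mode) {
        return { p, "" };  // -p becomes the system prompt
    }
    return { "", p };      // -p is completed as plain text
}
```

This is why a script that relied on `-p "Once upon a time,"` being continued verbatim now needs `-no-cnv` on instruct models.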
Also maybe worth mentioning the "Example usage" at the bottom of
The first example is no longer text generation but conversation; the README adds `-no-cnv` to it. The second example would work, but is redundant, and as per the README change it would need a template if one is being manually specified for an unsupported model.
The goal of this PR is to provide an automatic way to enable `-cnv` whenever a chat template is available (either built-in or provided via `--chat-template`).

Please note that this is not related to the recent discussion on chat UX (#11203), but rather focuses on the OOTB experience.

With this PR, the OOTB experience becomes: conversation mode is enabled by default for models that have a chat template. Users can still force-disable it via `-no-cnv`.

This also fixes #11157, since it's not obvious that some models need a system message.
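Putting the flags together, the resolution order described in this PR can be sketched as follows (a minimal illustration, assuming a three-state mode; the enum and function names are hypothetical, not the real llama.cpp internals):

```cpp
// Hedged sketch of the flag resolution: -cnv forces conversation on,
// -no-cnv forces it off, and with neither flag the mode follows
// chat-template availability (the new auto-detect behavior).
enum class ConvMode { AUTO, FORCED_ON, FORCED_OFF };

bool conversation_enabled(ConvMode mode, bool has_chat_template) {
    switch (mode) {
        case ConvMode::FORCED_ON:  return true;              // -cnv
        case ConvMode::FORCED_OFF: return false;             // -no-cnv
        default:                   return has_chat_template; // auto-detect
    }
}
```

The key point is that the explicit flags always win, so existing scripts can opt out of the new default with `-no-cnv`.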