
cli : auto activate conversation mode if chat template is available #11214

Merged
merged 6 commits into from
Jan 13, 2025

Conversation

@ngxson (Collaborator) commented Jan 13, 2025

The goal of this PR is to automatically enable `-cnv` whenever a chat template is available (either built-in or supplied via `--chat-template`).

Please note that this is not related to the recent discussion on chat UX (#11203), but rather focuses on the out-of-the-box (OOTB) experience.

With this PR, the OOTB experience will become:

# install
brew install llama.cpp

# use local gguf file
llama-cli -m Qwen2.5-7B-Instruct-IQ2_M.gguf

# use HF hosted model
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF

Users can still force-disable it via `-no-cnv`.

This also fixes #11157, since it's not obvious that some models need a system message.

@ngxson ngxson requested a review from ggerganov January 13, 2025 12:44
@ngxson ngxson merged commit 84a4481 into ggerganov:master Jan 13, 2025
48 checks passed
@ggerganov (Owner)

I think ggml-ci needs to be updated after this change: https://github.com/ggml-org/ci/tree/results/llama.cpp/84/a44815f704aaed8e8edec7a39e846a975c7ba9/ggml-2-x86-cpu

ngxson added a commit to huggingface/huggingface.js that referenced this pull request Jan 17, 2025
This change is related to these upstream PRs:
- ggerganov/llama.cpp#11195 allows using tag-based repo names like on ollama
- ggerganov/llama.cpp#11214 automatically turns on `--conversation` mode for models that have a chat template

Example:

```sh
# for "instruct" model, conversation mode is enabled automatically
llama-cli -hf bartowski/Llama-3.2-1B-Instruct-GGUF

# for non-instruct model, it runs as completion
llama-cli -hf TheBloke/Llama-2-7B-GGUF -p "Once upon a time,"
```
@strawberrymelonpanda (Contributor) commented Jan 19, 2025

For what it's worth, this was a fairly problematic change for me: my CLI scripts suddenly broke because llama-cli now treats prompts passed with `-p` as system prompts under the auto-enabled conversation mode.

I just have to pass `-no-cnv` to resume script usage, which isn't a big deal, but it was quite confusing out of the blue. As a suggestion, perhaps a message under `== Running in interactive mode. ==` along the lines of "Conversation is now the default mode; pass `--no-conversation` for scripted usage" would be prudent?
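For scripted pipelines, one way to guard against the new default is to pin `-no-cnv` explicitly in a small wrapper. A minimal sketch, assuming a POSIX shell; the `build_completion_args` helper, the model filename, and the `-n 128` default are illustrative and not part of llama.cpp — only the `-no-cnv` flag comes from this thread:

```sh
#!/bin/sh
# build_completion_args: assemble llama-cli flags for scripted,
# non-conversation (one-shot completion) use. Pinning -no-cnv keeps
# the behaviour stable even when the model ships a chat template,
# which would otherwise auto-enable conversation mode.
build_completion_args() {
  # $1 = model path, $2 = prompt (both placeholders here)
  printf '%s\n' -m "$1" -no-cnv -p "$2" -n 128
}

# A real script would pass these flags directly to llama-cli;
# here we just print the assembled argument list, one per line.
build_completion_args your_model.gguf "Once upon a time,"
```
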

@strawberrymelonpanda (Contributor) commented Jan 19, 2025

Also, it may be worth mentioning that the "Example usage" section at the bottom of `llama-cli --help` could use updating:

```
example usage:

  text generation:     ./build/bin/llama-cli -m your_model.gguf -p "I believe the meaning of life is" -n 128

  chat (conversation): ./build/bin/llama-cli -m your_model.gguf -p "You are a helpful assistant" -cnv
```

The first example is no longer text generation but conversation; the README adds `-no-cnv` to it. The second example would still work, but `-cnv` is now redundant, and per the README change a chat template would need to be specified manually for models without a built-in one.
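One possible shape for the updated help text, reflecting the new default — a sketch only, not the actual output of any llama.cpp release:

```
example usage:

  text generation:     ./build/bin/llama-cli -m your_model.gguf -p "I believe the meaning of life is" -n 128 -no-cnv

  chat (conversation): ./build/bin/llama-cli -m your_model.gguf
```

The completion example pins `-no-cnv` so it keeps producing raw text, and the chat example drops the now-redundant `-cnv` (conversation mode is auto-enabled when the model ships a chat template).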

aykutkardas pushed a commit to gokayfem/huggingface.js that referenced this pull request Jan 20, 2025
Successfully merging this pull request may close these issues.

Eval bug: phi 4 - input is empty