
simple-chat : fix context-exceeded condition #14494


Merged

ggerganov merged 2 commits into master from gg/simple-chat-error-handle on Jul 2, 2025

Conversation

ggerganov

fix #14487

Fix an off-by-one error in the context-exceeded check. New behavior after the fix:

```
make -j && ./bin/llama-simple-chat -m ../models/gemma-3-1b-it/ggml-model-q8_0.gguf -c 128
```

```
> Tell me a long story
Okay, here’s a long story, aiming for a bit of depth and emotional resonance. It’s a bit sprawling, so buckle up! It’s titled “The Cartographer’s Echo.”
---
The salt spray stung Elias’s face as he adjusted the compass, the needle spinning wildly in the grey, relentless wind. He was perched on the crumbling cliffs of Aethelgard, a tiny, forgotten village clinging to the edge of the Whispering Sea, a place time seemed to have deliberately abandoned. He’d inherited the cartography shop
context size exceeded
```

```diff
@@ -114,14 +114,15 @@ int main(int argc, char ** argv) {
         // check if we have enough space in the context to evaluate this batch
         int n_ctx = llama_n_ctx(ctx);
         int n_ctx_used = llama_memory_seq_pos_max(llama_get_memory(ctx), 0);
-        if (n_ctx_used + batch.n_tokens > n_ctx) {
+        if (n_ctx_used + batch.n_tokens >= n_ctx) {
```
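
For intuition, a hypothetical walk-through (assuming `llama_memory_seq_pos_max` returns the highest 0-based position stored for the sequence): with `-c 128`, once position 127 is occupied all 128 cells are in use, yet `n_ctx_used` is still 127. A 1-token batch then gives `127 + 1 = 128`, and the old strict `>` comparison against `n_ctx = 128` did not flag it, so `llama_decode` was attempted on a full context and failed (the "failed to decode" crash in #14487). With `>=`, the same situation now takes the "context size exceeded" exit instead.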

A repository Member commented on the changed line:

To be more precise, I think it would be better to add 1 to the value returned by llama_memory_seq_pos_max.

int n_ctx_used = llama_memory_seq_pos_max(llama_get_memory(ctx), 0) + 1; 
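
A minimal sketch of the check with that suggestion applied (in the spirit of the follow-up commit "cont : fix n_ctx_used computation"); the surrounding context is the same as in the hunk above, and the body of the `if` is assumed from the "context size exceeded" output rather than copied verbatim from simple-chat.cpp:

```cpp
// check if we have enough space in the context to evaluate this batch;
// llama_memory_seq_pos_max returns the highest 0-based position in use
// (assumption stated above), so "+ 1" turns it into a count of occupied
// cells and the strict ">" comparison stays correct
int n_ctx      = llama_n_ctx(ctx);
int n_ctx_used = llama_memory_seq_pos_max(llama_get_memory(ctx), 0) + 1;
if (n_ctx_used + batch.n_tokens > n_ctx) {
    // assumed handling, matching the sample output above
    fprintf(stderr, "context size exceeded\n");
    exit(0);
}
```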

ggerganov merged commit d7f5f4e into master on Jul 2, 2025
48 of 53 checks passed
ggerganov deleted the gg/simple-chat-error-handle branch on Jul 2, 2025 at 11:12
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 2, 2025
* origin/master:
llama : initial Mamba-2 support (ggml-org#9126)
sync : ggml
ggml : add version function to get lib version (ggml/1286)
Set RPATH to "@loader_path" / "$ORIGIN" to ensure executables and dynamic libraries search for dependencies in their origin directory. (ggml-org#14309)
CUDA: add softmax broadcast (ggml-org#14475)
CUDA: broadcasting for FlashAttention mask (ggml-org#14500)
vulkan: support softmax/FA batch and broadcast (ggml-org#14449)
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (ggml-org#14435)
opencl : fix possible buffer overflow in dump_tensor (ggml-org#14490)
simple-chat : fix context-exceeded condition (ggml-org#14494)
opencl : skip empty nodes on cgraph compute (ggml-org#14491)
opencl : update upscale to support align corners (ggml-org#14488)
ci : add OpenCL to labeler workflow (ggml-org#14496)
github : add OpenCL backend to issue templates (ggml-org#14492)
ggml : Callback before abort (ggml-org#14481)
ci : disable fast-math for Metal GHA CI (ggml-org#14478)
Minh141120 pushed a commit to menloresearch/llama.cpp that referenced this pull request Jul 5, 2025
* simple-chat : fix context-exceeded condition

ggml-ci

* cont : fix n_ctx_used computation

ggml-ci
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Jul 6, 2025
* simple-chat : fix context-exceeded condition

ggml-ci

* cont : fix n_ctx_used computation

ggml-ci
Development

Successfully merging this pull request may close these issues.

Eval bug: llama-simple-chat crashes with "failed to decode" after some requests