Name and Version
./llama-server --version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA GeForce GTX 750 Ti, compute capability 5.0, VMM: yes
version: 5797 (de56944)
built with cc (Debian 12.2.0-14+deb12u1) 12.2.0 for x86_64-linux-gnu
Operating systems
Linux
Which llama.cpp modules do you know to be affected?
llama-server
Command line
llama-server --embedding --pooling last (also tried mean and the other pooling modes; full script below)
Problem description & steps to reproduce
Hello,
For an input that tokenizes to ten tokens, I get 10 vectors back even though --pooling is enabled. Am I missing something obvious?
Below are the server script, the curl POST, and checks on the resulting embeddings file. The output is 10x1024 vectors, not a single 1x1024 vector.
Server script
#!/bin/bash
LLAMA_MODEL="Qwen3-Embedding-0.6B-Q8_0.gguf"
LLAMA_MODEL_PATH="/home/DATA/GGUF/embed"
LLAMA_OPTS="-c 1024 --temp 0.3 --top-k 40 --top-p 0.9 --n-predict 60 --no-warmup --port 8081 --embedding"
LLAMA_PERF_OPTS="-ngl 99 --mlock --pooling last"
llama-server ${LLAMA_PERF_OPTS} ${LLAMA_OPTS} -m ${LLAMA_MODEL_PATH}/${LLAMA_MODEL} ${@}
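For reference, the sampling flags (--temp, --top-k, --top-p, --n-predict) should be irrelevant in embedding mode, so I would expect a minimal invocation like this (paths taken from the script above) to reproduce the same behaviour:

llama-server -m /home/DATA/GGUF/embed/Qwen3-Embedding-0.6B-Q8_0.gguf \
  -c 1024 -ngl 99 --mlock --no-warmup --embedding --pooling last --port 8081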
curl -s -X POST http://localhost:8081/embedding \
  -H "Content-Type: application/json" \
  -d '{
    "model": "Qwen3-Embedding-0.6B-Q8_0.gguf",
    "input": "The quick brown fox jumps over the lazy dog."
  }' > q-test-embedding.txt
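For comparison, the same input through the OpenAI-compatible endpoint (assuming the standard llama-server /v1/embeddings route and an OAI-style response shape), printing the length of the first returned vector:

curl -s -X POST http://localhost:8081/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"input": "The quick brown fox jumps over the lazy dog."}' \
  | jq '.data[0].embedding | length'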
ls -lh q-test-embedding.txt
212K Jul 5 03:51 q-test-embedding.txt
jq '.[].embedding | length' q-test-embedding.txt
10
grep -o ',' q-test-embedding.txt | wc -l
10240
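To confirm the nesting (one result object holding ten 1024-dim token vectors, rather than ten separate results), a quick jq shape check, assuming the response is a JSON array of result objects as the filters above suggest:

jq 'length, (.[0].embedding | length), (.[0].embedding[0] | length)' q-test-embedding.txt
# expected output: 1, 10, 1024 if pooling was not applied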
First Bad Commit
No response