Pull requests: ggml-org/llama.cpp
server: fix files built redundantly
examples
server
#18474
opened Dec 30, 2025 by
jeffbolznv
CUDA: fix 0.0f/0.0f for FA fixup
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18472
opened Dec 29, 2025 by
JohannesGaessler
Add self‑speculative decoding (no draft model required)
examples
server
#18471
opened Dec 29, 2025 by
srogmann
Adding support for Nvidia Music Flamingo Model
examples
python
python script changes
#18470
opened Dec 29, 2025 by
Henry147147
vulkan: support buffer_from_host_ptr
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18467
opened Dec 29, 2025 by
jeffbolznv
Draft
model-conversion : use CONVERTED_MODEL for compare-embeddings
examples
#18461
opened Dec 29, 2025 by
danbev
kleidiai: add and integrate SVE 256-bit vector-length kernel
ggml
changes relating to the ggml tensor library for machine learning
#18458
opened Dec 29, 2025 by
chaxu01
metal : remove BF16 x F16 kernels
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
#18456
opened Dec 29, 2025 by
ggerganov
quantize: prevent input/output file collision
examples
#18451
opened Dec 29, 2025 by
Anri-Lombard
vulkan: Implement mmvq for iq1_s/iq1_m
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18450
opened Dec 29, 2025 by
jeffbolznv
ggml-cuda: filter architectures in CMAKE_CUDA_ARCHITECTURES_NATIVE
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18449
opened Dec 29, 2025 by
QDelta
Patch perf regression for mmq kernels in ROCm
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18442
opened Dec 28, 2025 by
jiachengjason
docker : add CUDA 13.1 image build
devops
improvements to build systems and github actions
#18441
opened Dec 28, 2025 by
CISC
ggml-cuda: fixed assertion in ggml_cuda_cpy (#18140, #18341)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18433
opened Dec 28, 2025 by
Meet91721
add in mcp server support to frontend webui [SERVER] [WEBUI]
examples
python
python script changes
server
#18422
opened Dec 28, 2025 by
brucepro
model: add Qwen3-Omni Thinker support (qwen3omnimoe)
model
Model specific
python
python script changes
#18420
opened Dec 28, 2025 by
TrevorS
vulkan: Optimize GGML_OP_CUMSUM
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18417
opened Dec 28, 2025 by
jeffbolznv
ggml: add ggml_rope_comp
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
examples
ggml
changes relating to the ggml tensor library for machine learning
server
testing
Everything test related