Pull requests: ggml-org/llama.cpp

Pull requests list

server: fix files built redundantly (labels: examples, server)
#18474 opened Dec 30, 2025 by jeffbolznv
CUDA: fix 0.0f/0.0f for FA fixup (labels: ggml, Nvidia GPU)
#18472 opened Dec 29, 2025 by JohannesGaessler
Adding support for Nvidia Music Flamingo Model (labels: examples, python)
#18470 opened Dec 29, 2025 by Henry147147
lora: count lora nodes in graph_max_nodes
#18469 opened Dec 29, 2025 by ngxson
vulkan: support buffer_from_host_ptr (labels: ggml, Vulkan)
#18467 opened Dec 29, 2025 by jeffbolznv Draft
implement new jinja template engine (labels: testing)
#18462 opened Dec 29, 2025 by ngxson Draft
kleidiai: add and integrate SVE 256-bit vector-length kernel (labels: ggml)
#18458 opened Dec 29, 2025 by chaxu01
metal : remove BF16 x F16 kernels (labels: Apple Metal, ggml)
#18456 opened Dec 29, 2025 by ggerganov
vulkan: Implement mmvq for iq1_s/iq1_m (labels: ggml, Vulkan)
#18450 opened Dec 29, 2025 by jeffbolznv
ggml-cuda: filter architectures in CMAKE_CUDA_ARCHITECTURES_NATIVE (labels: ggml, Nvidia GPU)
#18449 opened Dec 29, 2025 by QDelta
Patch perf regression for mmq kernels in ROCm (labels: ggml, Nvidia GPU)
#18442 opened Dec 28, 2025 by jiachengjason
docker : add CUDA 13.1 image build (labels: devops)
#18441 opened Dec 28, 2025 by CISC
ggml-cuda: fixed assertion in ggml_cuda_cpy (#18140, #18341) (labels: ggml, Nvidia GPU)
#18433 opened Dec 28, 2025 by Meet91721
android: fix infinite generation in shift_context() (labels: android, examples)
#18432 opened Dec 28, 2025 by ssam18
add in mcp server support to frontend webui [SERVER] [WEBUI] (labels: examples, python, server)
#18422 opened Dec 28, 2025 by brucepro
model: add Qwen3-Omni Thinker support (qwen3omnimoe) (labels: model, python)
#18420 opened Dec 28, 2025 by TrevorS
vulkan: Optimize GGML_OP_CUMSUM (labels: ggml, Vulkan)
#18417 opened Dec 28, 2025 by jeffbolznv
ggml: add ggml_rope_comp (labels: Apple Metal, examples, ggml, server, testing)
#18401 opened Dec 26, 2025 by ngxson Draft
vulkan: disable events for UMA systems to workaround directio failures (labels: ggml, Vulkan)
#18397 opened Dec 26, 2025 by jeffbolznv
ggml-hexagon: optimize activation function (labels: ggml)
#18393 opened Dec 26, 2025 by joeldushouyu