Pull requests: ggml-org/llama.cpp
vulkan: Pad N dimension of B matrix for coopmat2 perf, to avoid bounds checking
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12273
opened Mar 8, 2025 by
jeffbolznv
vulkan: fix coopmat shader generation when cross-compiling
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12272
opened Mar 8, 2025 by
Icenowy
metal: Cache compiled library at device level
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
doc: add software architecture and propose PAL in toplevel README.md
#12263
opened Mar 8, 2025 by
zhouwg
vulkan: optimization proposals for coopmat1 mul_mm
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12260
opened Mar 7, 2025 by
remyoudompheng
•
Draft
vulkan: Adjust coopmat2 tile sizes and selection heuristic
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12258
opened Mar 7, 2025 by
jeffbolznv
server : Add verbose output to OAI compatible chat endpoint.
android
Issues specific to Android
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
build
Compilation issues
devops
improvements to build systems and github actions
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
Kompute
https://github.com/KomputeProject/kompute/
nix
Issues specific to consuming flake.nix, or generally concerned with ❄ Nix-based llama.cpp deployment
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
script
Script related
server
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#12246
opened Mar 7, 2025 by
mglambda
Fix rocWMMA build documentation
documentation
Improvements or additions to documentation
#12243
opened Mar 7, 2025 by
Headcrabed
Issues while enabling MMA support on AIX machines
ggml
changes relating to the ggml tensor library for machine learning
#12241
opened Mar 7, 2025 by
mehendarkarprajwal
tests: use adaptive number of threads
testing
Everything test related
#12236
opened Mar 6, 2025 by
JohannesGaessler
Optimized DeepSeek V2/V3 implementation (MLA + flash attention)
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
python
python script changes
opencl: use OpenCL C standard supported by the device
ggml
changes relating to the ggml tensor library for machine learning
#12221
opened Mar 6, 2025 by
linehill
feat(CMakeLists): Add MSVC-specific compiler warning flags in CMake configuration
ggml
changes relating to the ggml tensor library for machine learning
#12206
opened Mar 5, 2025 by
25077667
SYCL: Rename oneMKL to oneMath
documentation
Improvements or additions to documentation
examples
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#12192
opened Mar 5, 2025 by
Rbiessy
libfuse3 supported mounting split gguf's to a single in-memory file
examples
#12189
opened Mar 5, 2025 by
matbee-eth
•
Draft
vulkan: double buffer scale caches
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#12188
opened Mar 4, 2025 by
netrunnereve
fix: AVX2 intrinsics, const correctness, and SIMD headers
build
Compilation issues
ggml
changes relating to the ggml tensor library for machine learning
#12186
opened Mar 4, 2025 by
sandboxyer
CUDA: Improve flash decoding kernel GPU occupancy for BS=1 case
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
testing
Everything test related
#12183
opened Mar 4, 2025 by
gaugarg-nv
1 of 3 tasks
CUDA/HIP: refactor mmqv to unify the calculation of nwarps and rows per block between host and device code.
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#12177
opened Mar 4, 2025 by
IMbackK