sync : llama.cpp #1281

ggerganov · 2025-06-20T18:05:04Z

No description provided.

* Add PowerPC feature detection and scoring
* ggml-cpu: Implement GGML_CPU_ALL_VARIANTS for PowerPC
* ggml-cpu: Delay some initializations until function is called

  When using GGML_BACKEND_DL=ON, these initializations might use instructions that are not supported by the current CPU.

---------

Co-authored-by: Diego Devesa <[email protected]>
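The "delay some initializations" item above can be illustrated with a function-local static. This is only a sketch of the general technique, not the code from the commit: the table name `exp_table`, its size, and its contents are hypothetical. The point is that a namespace-scope initializer runs as soon as the shared library is loaded, whereas a function-local static runs on first call, which is after a dynamically loaded backend has had the chance to be rejected for an unsupported CPU.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Hypothetical lookup table, used only to illustrate deferred initialization.
// A namespace-scope `static const std::array<...> table = build();` would be
// initialized at library load time. Wrapping it in a function defers the work
// (and any CPU-specific instructions the compiler emits for it) to first use.
static const std::array<float, 256> & exp_table() {
    // Initialized exactly once, on the first call (thread-safe since C++11).
    static const std::array<float, 256> table = [] {
        std::array<float, 256> t{};
        for (std::size_t i = 0; i < t.size(); ++i) {
            t[i] = std::exp(-static_cast<float>(i) / 32.0f);
        }
        return t;
    }();
    return table;
}
```

Callers use `exp_table()[i]` instead of a global array; every call returns a reference to the same object.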

* Add header and namespace to use enqueue_functions extension
* Convert submit and parallel_for to use new extension in convert.cpp
* Convert submit and parallel_for to use extension in ggml-sycl.cpp
* Convert submit and parallel_for to use extension in gla.cpp
* Convert submit and parallel_for in mmq.cpp
* Convert submit and parallel_for in mmvq.cpp
* Convert submit and parallel_for in remaining files
* Convert all simple parallel_for to nd_launch from enqueue_functions extension
* Wrapping extension in general function

  Create a general function that enables the enqueue_functions extension if it is enabled in the compiler, and otherwise calls the general SYCL function to launch kernels.

---------

Signed-off-by: nscipione <[email protected]>

chaxu01 and others added 17 commits June 20, 2025 21:03

ggml: Add Apple support for GGML_CPU_ALL_VARIANTS (llama/14258)

a69d76a

ggml-cpu: fix uncaught underscore terminators (llama/14023)

99db4b0

Signed-off-by: Aaron Teo <[email protected]>

ggml-cpu: reduce asm calls for hsum (llama/14037)

1bfedd4

Signed-off-by: Aaron Teo <[email protected]>

metal : add mean kernel (llama/14267)

c3ae37a

* metal : add mean kernel

  ggml-ci
* cont : dedup implementation

  ggml-ci

Vulkan: Set device max size for host memory to avoid OOM warning and fallback to CPU buffer (llama/14249)

849ff2c

llamafile : support s390x SIMD instruction set (llama/14273)

d764f39

sycl: Cleanup codepaths in Get Rows in sycl backend (llama/14215)

d7083fe

Addresses unused reorder path

build : suppress gcc15 compile warnings (llama/14261)

5547adc

* Change _contains_any() substrs to std::string_view and fix the find comparison logic.
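A minimal sketch of what a `std::string_view`-based "contains any" helper with a correct `find` comparison can look like. This is not the code from the commit: the name `contains_any` and its signature are assumptions made for illustration. The comparison to get right is that `find` returns `npos` on a miss, and index 0 is a valid hit.

```cpp
#include <initializer_list>
#include <string_view>

// Hypothetical helper: returns true if `haystack` contains at least one of `needles`.
// Taking std::string_view avoids constructing temporary std::string objects.
static bool contains_any(std::string_view haystack,
                         std::initializer_list<std::string_view> needles) {
    for (std::string_view needle : needles) {
        // find() returns npos when there is no match. Any other value,
        // including 0 (match at the very start), counts as a hit.
        if (haystack.find(needle) != std::string_view::npos) {
            return true;
        }
    }
    return false;
}
```

For example, `contains_any("cpu-first", {"cpu"})` is true even though the match starts at index 0, which is the case a truthiness-style check on the result of `find` would get wrong.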

ggml-cpu : remove unnecessary arm feature detection (llama/14281)

6567be5

Support for Arm runtime feature detection has now been added to GGML_CPU_ALL_VARIANTS. This removes the old and not very functional code.

CUDA: add conv_2d_dw (llama/14265)

4afb955

* CUDA: add conv_2d_dw
* better naming
* simplify using template
* Review: fix operation ordering in ggml-cuda, use __forceinline__, use more const
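For readers unfamiliar with the operation, here is a plain CPU reference for a depthwise 2D convolution, where each channel is filtered by its own kernel and channels never mix. It is a sketch of the operation's semantics only, not the CUDA kernel added by the commit. The function name `conv2d_dw_ref`, the channel-major (C x H x W) layout, stride 1, and the absence of padding and dilation are all simplifying assumptions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (assumes KH <= H and KW <= W).
// input : C x H x W, kernel : C x KH x KW, output : C x (H-KH+1) x (W-KW+1).
// As is customary in ML frameworks, the kernel is not flipped (cross-correlation).
std::vector<float> conv2d_dw_ref(const std::vector<float> & input,
                                 const std::vector<float> & kernel,
                                 std::size_t C, std::size_t H, std::size_t W,
                                 std::size_t KH, std::size_t KW) {
    const std::size_t OH = H - KH + 1;
    const std::size_t OW = W - KW + 1;
    std::vector<float> out(C * OH * OW, 0.0f);
    for (std::size_t c = 0; c < C; ++c) {
        for (std::size_t oy = 0; oy < OH; ++oy) {
            for (std::size_t ox = 0; ox < OW; ++ox) {
                float acc = 0.0f;
                // Channel c of the input is combined only with channel c of the kernel.
                for (std::size_t ky = 0; ky < KH; ++ky) {
                    for (std::size_t kx = 0; kx < KW; ++kx) {
                        acc += input[(c * H + oy + ky) * W + ox + kx] *
                               kernel[(c * KH + ky) * KW + kx];
                    }
                }
                out[(c * OH + oy) * OW + ox] = acc;
            }
        }
    }
    return out;
}
```

A reference like this is mainly useful for checking a GPU kernel's output on small inputs.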

ggml: Update KleidiAI to v1.9.0 (llama/14277)

3ddcbd9

ggml : fix repack work size for mul_mat_id (llama/14292)

16184a9

ggml-ci

cuda : synchronize graph capture and cublas handle destruction (llama/14288)

7063b1c

Works around an issue that may cause CUDA graph capture to fail when a cuBLAS handle is destroyed in a different thread
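One way to picture this kind of synchronization is a small gate that counts captures in flight and makes handle destruction wait until there are none. This is a sketch of the general idea under stated assumptions, not the code from the commit: `capture_gate` and its members are hypothetical names, and the real CUDA and cuBLAS calls appear only as comments.

```cpp
#include <condition_variable>
#include <mutex>

// Hypothetical gate shared by the capturing thread and the destroying thread.
struct capture_gate {
    std::mutex              mtx;
    std::condition_variable cv;
    int                     active_captures = 0;

    // Call right before starting a graph capture.
    void begin_capture() {
        std::lock_guard<std::mutex> lock(mtx);
        ++active_captures;
        // cudaStreamBeginCapture(...) would follow here.
    }

    // Call right after the capture has ended.
    void end_capture() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            --active_captures;
        }
        cv.notify_all();
    }

    // Waits until no capture is in flight, then runs `destroy` while holding
    // the lock, so a new capture cannot start in the middle of destruction.
    template <typename F>
    void destroy_when_idle(F && destroy) {
        std::unique_lock<std::mutex> lock(mtx);
        cv.wait(lock, [this] { return active_captures == 0; });
        destroy(); // e.g. cublasDestroy(handle)
    }
};
```

The destroying thread would call `gate.destroy_when_idle([&] { /* release the handle */ });`, while the capturing thread brackets its capture with `begin_capture()` and `end_capture()`.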

CUDA: add conv_2d_transpose (llama/14287)

9378011

* CUDA: add conv_2d_transpose
* remove direct include of cuda_fp16
* Review: add brackets for readability, remove ggml_set_param and add asserts

sync : llama.cpp

6f0d302

ggml-ci

ggerganov merged commit 4fb4faf into master Jun 20, 2025
10 checks passed

ggerganov deleted the sync-llama.cpp-25-06-20 branch June 20, 2025 18:14
