
sync : llama.cpp #1192


Merged
merged 8 commits into from
Apr 14, 2025

Conversation

ggerganov
Member

No description provided.

ggerganov and others added 8 commits April 14, 2025 09:26
* ggml: fixes #12846 compilation error

Signed-off-by: Aaron Teo <[email protected]>

Co-authored-by: Aleksei Nikiforov <[email protected]>

* ggml: add documentation for code change

Signed-off-by: Aaron Teo <[email protected]>

Co-authored-by: Aleksei Nikiforov <[email protected]>

* ggml: refactor to type-cast and update documentation

Signed-off-by: Aaron Teo <[email protected]>

Co-authored-by: Aleksei Nikiforov <[email protected]>

* ggml: update documentation to provide full issue link

Signed-off-by: Aaron Teo <[email protected]>

Co-authored-by: Aleksei Nikiforov <[email protected]>

---------

Co-authored-by: Aleksei Nikiforov <[email protected]>
* SYCL: Add fp16 support to some elementwise OP kernels

* remove comment

ggml-ci

* Use static_cast directly

* remove not needed cast from tanh

* Use static cast and remove unneeded castings

* Adjust device_support_op for unary OPs

* Use cast_data and typed_data struct to deduplicate casting code
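
The casting pattern these commits describe can be illustrated with a minimal sketch. It assumes a kernel body templated on the storage type that widens to float for the math and narrows back with `static_cast`. The name `unary_sigmoid` is hypothetical, plain `float`/`double` stand in for `sycl::half`, and the `cast_data`/`typed_data` helpers are not reproduced because their definitions are not shown here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical element-wise kernel body, templated on the storage type T.
// In a SYCL build T might be sycl::half or float; this sketch only relies on
// T being convertible to and from float, so float and double work as stand-ins.
template <typename T>
void unary_sigmoid(const T * src, T * dst, std::size_t n) {
    assert(src != nullptr && dst != nullptr);
    for (std::size_t i = 0; i < n; ++i) {
        // Widen once to float for the math ...
        const float x = static_cast<float>(src[i]);
        const float y = 1.0f / (1.0f + std::exp(-x));
        // ... and narrow once back to the storage type.
        dst[i] = static_cast<T>(y);
    }
}
```

Keeping the casts at the load and store boundaries means the same body can serve every supported element type, which is presumably what the deduplication in the last commit is after.
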
The current usage of the SYCL-Graph extension checks for
the `sycl_ext_oneapi_graph` device aspect. However, it is also
possible to support `sycl_ext_oneapi_limited_graph` devices that
don't support update.
Rewrite the stride logic for the mask tensor in the FA shader to force the
stride to be aligned, to allow using more efficient loads.
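
The general idea of forcing an aligned stride can be sketched as rounding the row stride up to a power-of-two multiple, so that every row starts at an offset suitable for wider loads. This is only an illustration on the host side: `align_up` and `mask_offset` are hypothetical names, and the alignment the shader actually uses is not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Round `stride` up to the next multiple of `align`.
// `align` must be a non-zero power of two (assumption of this sketch).
inline uint32_t align_up(uint32_t stride, uint32_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (stride + align - 1) & ~(align - 1);
}

// Element offset of (row, col) in a row-major tensor with a padded stride.
// With an aligned stride, row * stride is always a multiple of `align`.
inline uint32_t mask_offset(uint32_t row, uint32_t col, uint32_t aligned_stride) {
    return row * aligned_stride + col;
}
```

For example, a row of 13 elements padded to a stride of 16 places row 3 at offset 48, a multiple of 4, 8 and 16.
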
… the result register (llama/12773)

* ggml: use _mm[512/256]_dpbusd[_avx]_epi32 to directly accumulate into the result register

* simplifies the codebase by removing redundant functions
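
For readers unfamiliar with the `dpbusd` family, the following is a scalar model of what the intrinsic computes per 32-bit lane: four unsigned 8-bit values are multiplied by four signed 8-bit values, and the sum of the products is added to the accumulator passed in. The helper names `dpbusd_lane` and `dpbusd_ref` are hypothetical and exist only for illustration; this is not the ggml code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of one 32-bit lane: acc += sum_k (u8 a[k] * s8 b[k]), k = 0..3.
// The hardware instruction wraps on overflow; this sketch assumes inputs
// small enough that the int32_t accumulator does not overflow.
inline int32_t dpbusd_lane(int32_t acc, const uint8_t * a, const int8_t * b) {
    for (int k = 0; k < 4; ++k) {
        acc += static_cast<int32_t>(a[k]) * static_cast<int32_t>(b[k]);
    }
    return acc;
}

// Model of a whole vector: 8 lanes for a 256-bit register, 16 for 512-bit.
// `a` and `b` hold 4 * lanes bytes each; `acc` holds `lanes` accumulators.
inline void dpbusd_ref(int32_t * acc, const uint8_t * a, const int8_t * b,
                       std::size_t lanes) {
    assert(acc != nullptr && a != nullptr && b != nullptr);
    for (std::size_t l = 0; l < lanes; ++l) {
        acc[l] = dpbusd_lane(acc[l], a + 4 * l, b + 4 * l);
    }
}
```

Because the accumulator is an input to the operation, a running sum can be passed in and updated in place, which is the "directly accumulate into the result register" behaviour the commit title refers to; no separate add is needed afterwards.
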
@ggerganov ggerganov merged commit be935ac into master Apr 14, 2025
11 checks passed
@ggerganov ggerganov deleted the sync-llama.cpp-25-04-14 branch April 14, 2025 07:35
7 participants