
opencl : skip empty nodes on cgraph compute #14491


Merged: 2 commits into ggml-org:master, Jul 2, 2025

Conversation

EZForever (Contributor)

This behavior is cross-checked to be consistent with the CUDA, SYCL, and Kompute backends. Without this check, certain ops (e.g. GGML_OP_SOFT_MAX) will crash under certain circumstances (e.g. non-outputting ubatches).

Fixes #14453.
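For reference, this is the shape of the guard the other backends use at the top of their graph-compute loop. A minimal sketch, modeled on the CUDA/SYCL backends — the function name is made up for illustration, and the exact condition merged in this PR may differ:

```cpp
#include "ggml.h"

// Sketch of a backend graph-compute loop that skips nodes needing no work
// (graph_compute_sketch is a hypothetical name, not the PR's actual code).
static void graph_compute_sketch(struct ggml_cgraph * cgraph) {
    for (int i = 0; i < ggml_graph_n_nodes(cgraph); i++) {
        struct ggml_tensor * node = ggml_graph_node(cgraph, i);

        // An empty tensor (e.g. produced by a non-outputting ubatch) has a
        // zero dimension, and view-type ops need no kernel at all; enqueuing
        // a kernel such as soft_max for them can crash the backend.
        if (ggml_is_empty(node) || node->op == GGML_OP_NONE ||
            node->op == GGML_OP_RESHAPE || node->op == GGML_OP_VIEW ||
            node->op == GGML_OP_PERMUTE || node->op == GGML_OP_TRANSPOSE) {
            continue;
        }

        // ... enqueue the OpenCL kernel for node->op here ...
    }
}
```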

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) label on Jul 2, 2025
CISC (Collaborator) left a comment


No reason not to merge the check with the one below, but otherwise LGTM.

CISC added the OpenCL (Issues specific to the OpenCL backend) label on Jul 2, 2025
EZForever (Contributor, Author)

I separated it from the following check since it's done that way in the Kompute backend (it uses a switch statement, though); I could merge the checks if necessary.
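For illustration, the separate form in the Kompute backend looks roughly like this inside its per-node loop — a sketch, not verbatim source; `dst` stands for the node being processed:

```cpp
// Kompute-style: empty-node check kept separate from the no-op switch.
if (ggml_is_empty(dst)) {
    continue;
}

switch (dst->op) {
    case GGML_OP_NONE:
    case GGML_OP_RESHAPE:
    case GGML_OP_VIEW:
    case GGML_OP_TRANSPOSE:
    case GGML_OP_PERMUTE:
        continue; // no-op: move on to the next node
    default:
        break;    // real op: fall through to kernel dispatch
}
```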

CISC (Collaborator) commented Jul 2, 2025

> I separated it from the following check since it's done that way in the Kompute backend (it uses a switch statement, though); I could merge the checks if necessary.

That's different, I would prefer if you merged it.

EZForever (Contributor, Author)

Checks merged.

CISC (Collaborator) commented Jul 2, 2025

Thanks, will merge once CI is ready.

CISC merged commit c8a4e47 into ggml-org:master on Jul 2, 2025
48 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 2, 2025
* origin/master:
llama : initial Mamba-2 support (ggml-org#9126)
sync : ggml
ggml : add version function to get lib version (ggml/1286)
Set RPATH to "@loader_path" / "$ORIGIN" to ensure executables and dynamic libraries search for dependencies in their origin directory. (ggml-org#14309)
CUDA: add softmax broadcast (ggml-org#14475)
CUDA: broadcasting for FlashAttention mask (ggml-org#14500)
vulkan: support softmax/FA batch and broadcast (ggml-org#14449)
ggml : support bcast ggml_soft_max_ext, ggml_flash_attn_ext (ggml-org#14435)
opencl : fix possible buffer overflow in dump_tensor (ggml-org#14490)
simple-chat : fix context-exceeded condition (ggml-org#14494)
opencl : skip empty nodes on cgraph compute (ggml-org#14491)
opencl : update upscale to support align corners (ggml-org#14488)
ci : add OpenCL to labeler workflow (ggml-org#14496)
github : add OpenCL backend to issue templates (ggml-org#14492)
ggml : Callback before abort (ggml-org#14481)
ci : disable fast-math for Metal GHA CI (ggml-org#14478)
Minh141120 pushed a commit to menloresearch/llama.cpp that referenced this pull request Jul 5, 2025
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Jul 6, 2025
qnixsynapse pushed a commit to menloresearch/llama.cpp that referenced this pull request Jul 6, 2025
Labels: ggml (changes relating to the ggml tensor library for machine learning), OpenCL (Issues specific to the OpenCL backend)
Development

Successfully merging this pull request may close these issues.

Eval bug: "Floating point exception" on OpenCL backend when using MoE models and processing prompt longer than ubatch size