Sync master with upstream release b6062 #191
Merged
jan-service-account merged 6 commits into dev on Aug 2, 2025
* support hunyuan_v1_dense
* update hunyuan_moe to hunyuan_v1_moe
* fix rope alpha assert and bos token
* add blank line
* Revert "update hunyuan_moe to hunyuan_v1_moe" (this reverts commit aa973ca)
* use hunyuan_dense instead of hunyuan_v1_dense
* fix hunyuan_moe chat template
* remove leftover code
* update hunyuan dense chat template
* fix hunyuan dense vocab and chat template
Signed-off-by: stevenkuang <[email protected]>
* vendor : update vendored copy of google/minja
* Re-remove trailing whitespace
* Remove another trailing whitespace
Signed-off-by: Lennart Austenfeld <[email protected]>
* vulkan: optimizations for direct convolution
  - Empirically choose a better tile size. Reducing BS_K/BS_NPQ helps fill the GPU. The new size should be amenable to using coopmat, too.
  - Fix shmem bank conflicts. 16B padding should work with coopmat.
  - Some explicit loop unrolling.
  - Skip math/stores work for parts of the tile that are OOB.
  - Apply fastdiv opt.
  - Disable shuffles for NV.
* Three tiles sizes for CONV_2D, and a heuristic to choose
* reallow collectives for pre-Turing
* make SHMEM_PAD a spec constant
* fixes for intel perf - no shmem padding, placeholder shader core count
* shader variants with/without unrolling
* 0cc4m's fixes for AMD perf
Co-authored-by: 0cc4m <[email protected]>
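For readers unfamiliar with the "fastdiv" item in the commit above: the general idea is to replace repeated integer division by a fixed divisor with a multiply, an add, and a shift, using constants precomputed once per divisor. The following is a minimal host-side C++ sketch of that idea, not the shader code from this commit. The names `FastDiv`, `make_fastdiv`, and `fastdiv` are hypothetical, and the sketch assumes 32-bit unsigned operands with a divisor of at least 1.

```cpp
#include <cstdint>

// Hypothetical helper names for illustration only; they are not taken
// from the Vulkan shader changed in this pull request.
struct FastDiv {
    uint32_t mp; // precomputed "magic" multiplier
    uint32_t L;  // shift amount, ceil(log2(d))
};

// Precompute the multiplier and shift for a fixed divisor d (assumes d >= 1).
inline FastDiv make_fastdiv(uint32_t d) {
    uint32_t L = 0;
    while (L < 32 && (uint64_t{1} << L) < d) {
        ++L;
    }
    // mp = floor(2^32 * (2^L - d) / d) + 1, which always fits in 32 bits.
    const uint64_t mp = ((uint64_t{1} << 32) * ((uint64_t{1} << L) - d)) / d + 1;
    return FastDiv{static_cast<uint32_t>(mp), L};
}

// Compute n / d using the precomputed constants instead of a division.
inline uint32_t fastdiv(uint32_t n, FastDiv f) {
    // High 32 bits of the 64-bit product n * mp.
    const uint64_t hi = (static_cast<uint64_t>(n) * f.mp) >> 32;
    // The sum is kept in 64 bits so it cannot overflow for large n.
    return static_cast<uint32_t>((hi + n) >> f.L);
}
```

In a kernel, the constants would typically be computed once on the host for each divisor (for example a tensor dimension) and passed in, so that the per-element index math avoids a hardware divide. Keeping the intermediate sum in 64 bits is a choice made here for simplicity; a 32-bit-only variant needs an extra restriction on the range of `n`.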
Updates dev branch with latest release (b6062) from ggml-org/llama.cpp