@chaxu01 (Collaborator) commented Dec 29, 2025

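Benchmark results for Llama 3.2 1B Q4_0 on Graviton 3, comparing the non-SVE kernels against the SVE-enabled kernels. Throughput is reported in tokens per second (t/s); pp512 is prompt processing over 512 tokens and tg128 is generation of 128 tokens.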

| Threads | Test  | w/o SVE (t/s)  | w/ SVE (t/s)   | Uplift (%) |
|--------:|:------|---------------:|---------------:|-----------:|
| 1       | pp512 | 75.09 ± 0.02   | 84.51 ± 0.05   | +12.55%    |
| 1       | tg128 | 18.69 ± 0.01   | 20.49 ± 0.00   | +9.63%     |
| 2       | pp512 | 148.77 ± 0.02  | 166.83 ± 0.02  | +12.14%    |
| 2       | tg128 | 34.63 ± 0.01   | 37.21 ± 0.02   | +7.45%     |
| 4       | pp512 | 293.64 ± 0.07  | 326.82 ± 0.13  | +11.30%    |
| 4       | tg128 | 63.49 ± 0.07   | 67.95 ± 0.02   | +7.04%     |
| 8       | pp512 | 525.17 ± 0.11  | 568.10 ± 0.12  | +8.17%     |
| 8       | tg128 | 97.93 ± 0.03   | 105.00 ± 0.06  | +7.20%     |
| 16      | pp512 | 949.33 ± 11.10 | 1016.97 ± 1.04 | +7.13%     |
| 16      | tg128 | 131.35 ± 0.39  | 136.51 ± 0.37  | +3.93%     |
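
For readers who want to check the Uplift column, here is a minimal sketch of how it appears to be derived: the relative gain of the SVE throughput over the non-SVE throughput. The helper name `uplift_percent` and the `rows` dictionary are illustrative and not part of the PR; the numbers are the mean values copied from the table above. The assumption that the column was computed from unrounded means is mine, so recomputing from the rounded values shown can differ from the table by a few hundredths of a percent.

```python
def uplift_percent(baseline_tps: float, sve_tps: float) -> float:
    """Relative throughput gain of the SVE run over the baseline, in percent."""
    return (sve_tps / baseline_tps - 1.0) * 100.0


# (threads, test) -> (w/o SVE t/s, w/ SVE t/s); mean values copied from the table.
# Hypothetical container, used only for this illustration.
rows = {
    (4, "pp512"): (293.64, 326.82),
    (16, "tg128"): (131.35, 136.51),
}

for (threads, test), (base, sve) in rows.items():
    # Format with an explicit sign and two decimals, matching the table's style.
    print(f"{threads:>2} {test}: {uplift_percent(base, sve):+.2f}%")
```

The same function applies to any other row; the ± standard deviations are ignored here, so this reproduces only the point estimate and not an error bar on the uplift.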

@chaxu01 requested a review from ggerganov as a code owner December 29, 2025 11:31
@github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Dec 29, 2025