Replies: 1 comment
Found a thread on Reddit with answers to my questions. I also tried turning off the efficiency cores and Hyper-Threading: no change.
I have been running inference with gpt_bigcode-santacoder-ggml.bin (q4_1) on two systems. I tried different numbers of threads with

./bin/starcoder -m ../models/bigcode/gpt_bigcode-santacoder-ggml.bin -p "def fibonnaci(" --top_k 0 --top_p 0.95 --temp 0.2 -t N

and compared the results. I also ran mbw tests on both systems.

Questions: Why is my Mac so much faster? Is it due to the memory bandwidth difference (200 GB/s on the Mac vs. 76.8 GB/s on the Intel machine)? Is there any way to boost inference speed on Intel?
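Single-stream token generation is typically memory-bandwidth bound: every generated token streams the full set of quantized weights from RAM, so a back-of-envelope ceiling is tokens/s ≈ bandwidth ÷ model size. Below is a minimal Python sketch of a sequential-copy bandwidth test in the same spirit as mbw (the function name, buffer sizes, and the ~0.8 GB estimate for the q4_1 santacoder weights are my own assumptions, not from the thread):

```python
import time

def copy_bandwidth_gb_s(size_mb=256, repeats=5):
    """Estimate sequential-copy memory bandwidth, counting bytes read + written."""
    size = size_mb * 1024 * 1024
    src = bytearray(size)
    dst = bytearray(size)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst[:] = src  # bulk copy; CPython does this as a single memcpy-style move
        best = min(best, time.perf_counter() - t0)
    return (2 * size) / best / 1e9  # GB/s: size bytes read plus size bytes written

if __name__ == "__main__":
    bw = copy_bandwidth_gb_s()
    print(f"copy bandwidth: ~{bw:.1f} GB/s")
    # Rough single-stream ceiling, assuming the q4_1 weights are ~0.8 GB
    # and must be read once per generated token:
    print(f"~{bw / 0.8:.0f} tokens/s upper bound for a 0.8 GB model")
```

Plugging in the quoted numbers, 200 GB/s vs. 76.8 GB/s alone would predict roughly a 2.6x gap, which is why adding more threads beyond the point of saturating memory bandwidth stops helping.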