Replies: 1 comment
Found a thread on Reddit with answers to my questions. I also tried turning off the efficiency cores and Hyper-Threading: no change.
I have been running inference with gpt_bigcode-santacoder-ggml.bin (q4_1) on two systems. I tried different numbers of threads with

./bin/starcoder -m ../models/bigcode/gpt_bigcode-santacoder-ggml.bin -p "def fibonnaci(" --top_k 0 --top_p 0.95 --temp 0.2 -t N

and compared the results. I also ran mbw tests on both systems.

Questions: Why is my Mac so much faster? Is it due to the memory bandwidth difference (200 GB/s on the Mac vs. 76.8 GB/s on the Intel machine)? Is there any way to boost inference speed on Intel?
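Single-stream token generation is typically memory-bandwidth bound: every generated token streams the full set of quantized weights from RAM, so a back-of-envelope ceiling is tokens/s ≈ bandwidth ÷ model size. Below is a minimal Python sketch of a sequential-copy bandwidth test in the same spirit as mbw (the function name, buffer sizes, and the ~0.8 GB estimate for the q4_1 santacoder weights are my own assumptions, not from the thread):

```python
import time

def copy_bandwidth_gb_s(size_mb=256, repeats=5):
    """Estimate sequential-copy memory bandwidth, counting bytes read + written."""
    size = size_mb * 1024 * 1024
    src = bytearray(size)
    dst = bytearray(size)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        dst[:] = src  # bulk copy; CPython does this as a single memcpy-style move
        best = min(best, time.perf_counter() - t0)
    return (2 * size) / best / 1e9  # GB/s: size bytes read plus size bytes written

if __name__ == "__main__":
    bw = copy_bandwidth_gb_s()
    print(f"copy bandwidth: ~{bw:.1f} GB/s")
    # Rough single-stream ceiling, assuming the q4_1 weights are ~0.8 GB
    # and must be read once per generated token:
    print(f"~{bw / 0.8:.0f} tokens/s upper bound for a 0.8 GB model")
```

Plugging in the quoted numbers, 200 GB/s vs. 76.8 GB/s alone would predict roughly a 2.6x gap, which is why adding more threads beyond the point of saturating memory bandwidth stops helping.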