MPT 30B inference on Mac M1 #353
RonanKMcGovern started this conversation in Ideas
Is it realistic to get inference running on a Mac M1 with results of similar quality to a GPU?

I find that 7B and 13B models are not good enough to work reliably with functions. I also like that MPT has an extendable context, unlike Falcon and LLaMA.

If I were to try to get MPT 30B running, could I bootstrap from the existing LLaMA work? Thanks.
Replies: 1 comment

This is now supported in llama.cpp: ggml-org/llama.cpp#3417
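For anyone landing here later, here is a minimal sketch of what running a quantized MPT model on Apple Silicon could look like, assuming the llama-cpp-python bindings (the thread itself only confirms support in llama.cpp proper, via ggml-org/llama.cpp#3417) and a hypothetical GGUF file name. A ~4-bit quant of a 30B model is roughly 18-20 GB, so it fits more comfortably on a 32 GB M1 than on a 16 GB one:

```python
# Sketch only: assumes the llama-cpp-python package and a locally downloaded,
# pre-quantized GGUF build of MPT-30B (the file name below is illustrative).
from llama_cpp import Llama

llm = Llama(
    model_path="mpt-30b.Q4_K_M.gguf",  # hypothetical ~4-bit quantized model file
    n_ctx=4096,       # MPT's ALiBi positional scheme allows extended context
    n_gpu_layers=-1,  # offload all layers to Metal on Apple Silicon
)

# Simple completion call; the response follows the OpenAI-style dict layout
# that llama-cpp-python returns from create_completion.
out = llm("Q: Why does ALiBi allow longer contexts? A:", max_tokens=64)
print(out["choices"][0]["text"])
```

The same model file also works with the llama.cpp CLI directly; the Python wrapper is just one convenient way to drive it.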