I can load the instruct model with the transformers loader and 8-bit bitsandbytes, and it loads evenly across multiple GPUs.
However, I cannot load the model in 4-bit precision over multiple GPUs. It fills one 24 GB GPU, starts loading onto a second GPU of the same size, then OOMs on that second GPU without ever moving on to any of the remaining GPUs (7 in total), which sit empty.
I've loaded other transformers-based models in 4-bit and never experienced this heavily unbalanced loading before.
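A workaround that sometimes helps with this kind of unbalanced 4-bit placement is to pass an explicit `max_memory` map so accelerate's `device_map="auto"` caps each GPU below its physical size instead of overfilling the first devices. A minimal sketch, assuming 8 × 24 GB GPUs; the `num_gpus`, per-GPU headroom, and `model_id` values are placeholders, not taken from the issue:

```python
# Sketch of capping per-GPU memory so bitsandbytes 4-bit loading
# spreads weights across all GPUs. Values here are assumptions:
# caps of ~20 GiB out of 24 GiB leave headroom for activations.

def build_max_memory(num_gpus: int, per_gpu_gib: int, cpu_gib: int = 64) -> dict:
    """Build the max_memory map expected by from_pretrained():
    integer keys for CUDA device indices, plus a "cpu" spill entry."""
    mem = {i: f"{per_gpu_gib}GiB" for i in range(num_gpus)}
    mem["cpu"] = f"{cpu_gib}GiB"
    return mem


def load_4bit(model_id: str, num_gpus: int = 8):
    # Imports kept local so the helper above is usable without
    # torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.float16,
    )
    return AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant,
        device_map="auto",  # let accelerate place layers across devices
        max_memory=build_max_memory(num_gpus, per_gpu_gib=20),
    )
```

The explicit caps matter because the automatic placement otherwise estimates free memory per device at load time, which can leave the later GPUs unused when the estimate is off for quantized weights.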