These new announcements of APUs that can address much more VRAM got me wondering. I doubt I'd enjoy their tok/s on their own, but what if I keep using my GPU to run a draft model while a (very) big model runs on an APU across the LAN? The network overhead may not be pretty, but there should still be a significant speedup over using just the APU.
And if all of my VRAM is free to spend on draft models, what about chaining them? I'd have enough to split between a big draft model plus a tiny draft model that drafts for the big draft model.
Maybe it's crazy, but I figured I'd put it out there.
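To make the chaining idea concrete, here's a toy sketch, not based on any real API; the `ToyModel` class and its `generate`/`verify` methods are made-up stand-ins for the tiny draft, the mid-size draft on my GPU, and the big model on the APU. It only does greedy prefix acceptance, but it shows how each level pays one verification pass per batch of drafted tokens:

```python
"""Toy sketch of chained (two-level) speculative decoding.
Everything here is hypothetical: each ToyModel just plays back a fixed
token sequence, standing in for a real tiny/mid/big model."""


class ToyModel:
    def __init__(self, answer):
        # 'answer' is the sequence this model would produce greedily.
        self.answer = answer

    def generate(self, context, n_draft):
        # Cheap autoregressive drafting: propose the next n_draft tokens.
        start = len(context)
        return self.answer[start:start + n_draft]

    def verify(self, context, proposal):
        # One batched "forward pass": accept the longest prefix of the
        # proposal that matches what this model would have produced,
        # then append one token of its own, as in speculative decoding.
        start = len(context)
        accepted = []
        for i, tok in enumerate(proposal):
            if start + i < len(self.answer) and self.answer[start + i] == tok:
                accepted.append(tok)
            else:
                break
        nxt = start + len(accepted)
        if nxt < len(self.answer):
            accepted.append(self.answer[nxt])
        return accepted


def chained_step(tiny, mid, big, context, n_draft=8):
    # Level 1 (local GPU): tiny drafts, the mid-size draft model verifies.
    mid_tokens = mid.verify(context, tiny.generate(context, n_draft))
    # Level 2 (over LAN): mid's accepted tokens become the draft that the
    # big model on the APU verifies in a single round trip.
    return big.verify(context, mid_tokens)


if __name__ == "__main__":
    target = [1, 2, 3, 4, 5, 6, 7, 8]
    tiny = ToyModel([1, 2, 3, 9, 9, 9, 9, 9])   # agrees on the first 3 tokens
    mid = ToyModel([1, 2, 3, 4, 5, 9, 9, 9])    # agrees on the first 5
    big = ToyModel(target)                       # the model we actually want
    context = []
    while len(context) < len(target):
        context += chained_step(tiny, mid, big, context)
    print(context)  # -> [1, 2, 3, 4, 5, 6, 7, 8]
```

In this picture, the expensive hop over the LAN to the big model happens once per batch of accepted tokens rather than once per token, which is where the win over APU-only decoding would have to come from.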