We are tracking several llama configurations (8b, 70b, 405b) across different attention lowerings, tensor parallelism settings, and batch sizes. With so many combinations, it is important that each one is well tested and that there is an easy way to track its current status. On this front, Avi and I have been iterating on a few tests and getting initial reports up (nod-ai/SHARK-Platform#284, nod-ai/SHARK-Platform#321, nod-ai/SHARK-Platform#363, nod-ai/SHARK-Platform#322, nod-ai/SHARK-Platform#414).
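For illustration, here is a minimal sketch of how a configuration matrix like this could be parameterized with pytest so every combination gets its own tracked test case. This is not the actual SHARK-Platform harness; the `LlamaConfig` dataclass, the attention lowering names, and the specific tensor parallelism / batch size values are assumptions chosen for the example.

```python
# Hypothetical sketch of parameterizing the llama test matrix; names and
# values are illustrative, not the real SHARK-Platform configuration set.
from dataclasses import dataclass
import itertools
import pytest


@dataclass(frozen=True)
class LlamaConfig:
    model_size: str          # "8b", "70b", or "405b"
    attention: str           # attention lowering, e.g. "decomposed" vs. "sdpa"
    tensor_parallelism: int  # number of shards
    batch_size: int


# Cross product of the tracked dimensions (assumed values).
CONFIGS = [
    LlamaConfig(size, attn, tp, bs)
    for size, attn, tp, bs in itertools.product(
        ["8b", "70b", "405b"],
        ["decomposed", "sdpa"],
        [1, 8],
        [1, 4],
    )
]


@pytest.mark.parametrize(
    "config",
    CONFIGS,
    ids=lambda c: f"{c.model_size}-{c.attention}-tp{c.tensor_parallelism}-bs{c.batch_size}",
)
def test_llama_config(config: LlamaConfig):
    # Placeholder body: a real test would export, compile, and run the model
    # for this configuration, then check outputs against a reference.
    assert config.batch_size > 0
```

Each parameterized case then shows up individually in the test report, which is what makes per-configuration status easy to track.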
Remaining work: