We are tracking several llama configurations (8b, 70b, 405b) across different attention lowerings, tensor parallelism settings, and batch sizes. With so many combinations, it is important that each one is well tested and that there is an easy way to track its current status. On this front, Avi and I have been iterating on a few tests and getting initial reports up (nod-ai/SHARK-Platform#284, nod-ai/SHARK-Platform#321, nod-ai/SHARK-Platform#363, nod-ai/SHARK-Platform#322, nod-ai/SHARK-Platform#414).
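For illustration, here is a minimal sketch of how a configuration matrix like this could be parameterized with pytest so every combination gets its own tracked test case. This is not the actual SHARK-Platform harness; the `LlamaConfig` dataclass, the attention lowering names, and the specific tensor parallelism / batch size values are assumptions chosen for the example.

```python
# Hypothetical sketch of parameterizing the llama test matrix; names and
# values are illustrative, not the real SHARK-Platform configuration set.
from dataclasses import dataclass
import itertools
import pytest


@dataclass(frozen=True)
class LlamaConfig:
    model_size: str          # "8b", "70b", or "405b"
    attention: str           # attention lowering, e.g. "decomposed" vs. "sdpa"
    tensor_parallelism: int  # number of shards
    batch_size: int


# Cross product of the tracked dimensions (assumed values).
CONFIGS = [
    LlamaConfig(size, attn, tp, bs)
    for size, attn, tp, bs in itertools.product(
        ["8b", "70b", "405b"],
        ["decomposed", "sdpa"],
        [1, 8],
        [1, 4],
    )
]


@pytest.mark.parametrize(
    "config",
    CONFIGS,
    ids=lambda c: f"{c.model_size}-{c.attention}-tp{c.tensor_parallelism}-bs{c.batch_size}",
)
def test_llama_config(config: LlamaConfig):
    # Placeholder body: a real test would export, compile, and run the model
    # for this configuration, then check outputs against a reference.
    assert config.batch_size > 0
```

Each parameterized case then shows up individually in the test report, which is what makes per-configuration status easy to track.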
Remaining work: