tl;dr: how to cache part of a compute graph
Here is some example code that computes matmul twice.
Output:
Is there any way this can be automatically cached?
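For the simple case, one common approach is to intern graph nodes so that two identical matmul expressions become the same node, and to memoize values during evaluation. The sketch below is a toy, framework-independent illustration of that idea. The `Graph` class and all of its methods are hypothetical names invented for this example, not the API of any particular library, and it assumes the two matmuls take exactly the same inputs.

```python
class Graph:
    """Toy lazy compute graph with node interning (hypothetical, for illustration)."""

    def __init__(self):
        self._ids = {}      # (op, inputs) -> node id
        self._defs = []     # node id -> (op, inputs)
        self.op_count = 0   # number of non-constant ops actually executed

    def _intern(self, op, *inputs):
        # Identical (op, inputs) pairs map to the same node id.
        key = (op, inputs)
        if key not in self._ids:
            self._ids[key] = len(self._defs)
            self._defs.append(key)
        return self._ids[key]

    def const(self, rows):
        return self._intern("const", tuple(tuple(r) for r in rows))

    def matmul(self, a, b):
        return self._intern("matmul", a, b)

    def add(self, a, b):
        return self._intern("add", a, b)

    def evaluate(self, node, cache=None):
        # The cache ensures each node is computed at most once per evaluation.
        cache = {} if cache is None else cache
        if node in cache:
            return cache[node]
        op, inputs = self._defs[node]
        if op == "const":
            value = inputs[0]
        else:
            x, y = (self.evaluate(i, cache) for i in inputs)
            self.op_count += 1
            if op == "matmul":
                value = tuple(
                    tuple(sum(p * q for p, q in zip(row, col)) for col in zip(*y))
                    for row in x
                )
            else:  # elementwise add
                value = tuple(
                    tuple(p + q for p, q in zip(rx, ry)) for rx, ry in zip(x, y)
                )
        cache[node] = value
        return value


g = Graph()
a = g.const([[1, 2], [3, 4]])
b = g.const([[5, 6], [7, 8]])
# The same product is requested twice, but interning yields a single node.
y = g.add(g.matmul(a, b), g.matmul(a, b))
result = g.evaluate(y)
```

Here `g.op_count` ends up counting one matmul and one add, even though `matmul` was called twice while building the graph. This only catches subexpressions that are structurally identical; it does not help when the shared work is hidden inside an algorithm.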
This is only a simple example. In a more complex scenario, caching is not so easy.
Strassen's algorithm needs to preprocess the matrices. In a model that computes $A*B$, where $A$ is fixed (model weights) and $B$ depends on the input, the preprocessing steps that depend only on $A$ can be cached. The algorithm is recursive, so a single matrix multiplication may expand into a balanced, tree-shaped compute graph.
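To make the Strassen case concrete, here is a minimal pure-Python sketch in which the $A$-only operands of the seven Strassen products are computed once, recursively, into a tree that is then reused for every new $B$. It assumes square matrices whose size is a power of two and recurses all the way down to 1×1 blocks. The function names `preprocess_a` and `multiply_cached` are hypothetical, chosen for this illustration only.

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Split a square matrix into its four quadrants."""
    h = len(M) // 2
    return (
        [r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
        [r[:h] for r in M[h:]], [r[h:] for r in M[h:]],
    )


def preprocess_a(A):
    """Build the tree of operands that depend only on A (cacheable part)."""
    if len(A) == 1:
        return A[0][0]  # leaf: a scalar
    a11, a12, a21, a22 = split(A)
    # Left-hand operands of Strassen's products M1..M7.
    operands = [
        add(a11, a22), add(a21, a22), a11, a22,
        add(a11, a12), sub(a21, a11), sub(a12, a22),
    ]
    return [preprocess_a(S) for S in operands]


def multiply_cached(tree, B):
    """Compute A*B using the precomputed tree for A."""
    if len(B) == 1:
        return [[tree * B[0][0]]]
    b11, b12, b21, b22 = split(B)
    # Right-hand operands of M1..M7; these depend on the input B.
    rhs = [
        add(b11, b22), b11, sub(b12, b22), sub(b21, b11),
        b22, add(b11, b12), add(b21, b22),
    ]
    m1, m2, m3, m4, m5, m6, m7 = (
        multiply_cached(t, r) for t, r in zip(tree, rhs)
    )
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom


weights_tree = preprocess_a([[1, 2], [3, 4]])           # done once
product = multiply_cached(weights_tree, [[5, 6], [7, 8]])  # done per input
```

Each internal node of the tree has seven children, which is the balanced tree shape described above. The open question in this thread is whether a framework could discover this split on its own; in the sketch it is done by hand, by separating the $A$-only work into its own function.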