Clarification on Tensor Dimensionality in Equation (1) Split Operation #254
Trouvaille0823 started this conversation in General
Replies: 1 comment
-
Hi @Trouvaille0823 - these two should be equivalent. Some of the checked-in code in this repo splits along dim=1 because the input is already jagged, in which case dim=-1 and dim=1 refer to the same axis.
-
@xing-liu Thank you for the guidance. I'd like to document this dimensional analysis for future contributors.
Misinterpretation Source
Initially, I interpreted dim=1 in the split operation as the token dimension under a 3D tensor assumption (e.g., [batch_size, seq_len, embed_dim]). This conflicted with the paper's description of splitting along the embedding dimension (dim=-1).
Resolution via Shape Analysis
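@xing-liu's clarification reveals the critical detail: because the input is already jagged, uvqk is a 2D tensor, so dim=1 is its last dimension and is the same axis as dim=-1: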
uvqk = ... # 2D tensor
u, v, q, k = torch.split(uvqk, [...], dim=1)
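To make the equivalence concrete, here is a minimal pure-Python sketch of the index arithmetic. It is not the repo's code: normalize_dim and split_columns are hypothetical helpers written for illustration, and the toy values and split sizes are made up. It relies only on the standard convention that a negative dim resolves to dim + ndim.

```python
def normalize_dim(dim: int, ndim: int) -> int:
    """Resolve a possibly negative axis index to its non-negative form."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} out of range for a {ndim}-D tensor")
    return dim % ndim


def split_columns(rows, sizes):
    """Split each row of a 2D list-of-lists into consecutive chunks (axis 1)."""
    assert all(len(row) == sum(sizes) for row in rows)
    chunks, start = [], 0
    for size in sizes:
        chunks.append([row[start:start + size] for row in rows])
        start += size
    return chunks


# 2D jagged layout, assumed [total_tokens, embed_dim]: dim=-1 resolves to axis 1.
axis_2d = normalize_dim(-1, ndim=2)
# 3D padded layout [batch_size, seq_len, embed_dim]: dim=-1 resolves to axis 2.
axis_3d = normalize_dim(-1, ndim=3)

# Hypothetical input: 2 tokens, 4 features each, split into sizes [1, 1, 1, 1].
uvqk = [[1, 2, 3, 4], [5, 6, 7, 8]]
u, v, q, k = split_columns(uvqk, [1, 1, 1, 1])
```

So with a 2D input, splitting along dim=1 cuts the embedding axis, exactly as dim=-1 would. Only under the 3D assumption would dim=1 mean the token axis.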