Clarification on Tensor Dimensionality in Equation (1) Split Operation #254

Trouvaille0823 · 2025-04-16T09:39:39Z

@xing-liu Thank you for the guidance. I'd like to document this dimensional analysis for future contributors.

Misinterpretation Source
Initially, I interpreted dim=1 in the split operation as the token dimension under a 3D tensor assumption (e.g., [batch_size, seq_len, embed_dim]). This conflicted with the paper's description of splitting along the embedding dimension (dim=-1).

Resolution via Shape Analysis
@xing-liu's clarification reveals the critical detail:
uvqk = ... # 2D tensor
u, v, q, k = torch.split(uvqk, [...], dim=1)
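To make the shape argument concrete, here is a minimal sketch, not the repository's code. The sizes and the names `total_tokens`, `split_sizes`, `uvqk_2d`, and `uvqk_3d` are hypothetical, chosen only for illustration. It compares splitting along dim=1 and dim=-1 for a 2D jagged-style tensor and for a 3D padded tensor.

```python
import torch

# Hypothetical sizes, chosen only for illustration (not taken from the repo).
total_tokens = 5                # total number of tokens across the jagged batch
split_sizes = [4, 4, 2, 2]      # assumed widths of u, v, q, k
width = sum(split_sizes)        # 12

# 2D case: [total_tokens, width]. Here dim=1 is also the last dim.
uvqk_2d = torch.randn(total_tokens, width)
parts_dim1 = torch.split(uvqk_2d, split_sizes, dim=1)
parts_last = torch.split(uvqk_2d, split_sizes, dim=-1)
same_in_2d = all(torch.equal(a, b) for a, b in zip(parts_dim1, parts_last))
shapes_2d = [tuple(p.shape) for p in parts_dim1]

# 3D case: [batch_size, seq_len, width]. Here dim=1 is the token dim.
# seq_len is set equal to `width` only so that splitting along dim=1 does
# not raise; with any other seq_len, torch.split would reject the sizes.
batch_size, seq_len = 2, width
uvqk_3d = torch.randn(batch_size, seq_len, width)
shapes_3d_last = [tuple(p.shape) for p in torch.split(uvqk_3d, split_sizes, dim=-1)]
shapes_3d_dim1 = [tuple(p.shape) for p in torch.split(uvqk_3d, split_sizes, dim=1)]

print(same_in_2d)
print(shapes_2d)
print(shapes_3d_last)
print(shapes_3d_dim1)
```

In the 2D case both calls return identical chunks, while in the 3D case dim=1 cuts along the sequence axis and dim=-1 cuts along the embedding axis. That difference is the source of the original confusion.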

jiaqizhai · 2025-04-20T03:09:20Z

Hi @Trouvaille0823 - these two should be equivalent. Some of the checked-in code in this repo splits along dim=1 because the input is already jagged (a 2D tensor), in which case dim=-1 and dim=1 refer to the same axis.
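The equivalence follows from how a negative dimension index is resolved: it counts from the end, so it maps to `dim + ndim`. The sketch below illustrates that rule in plain Python. The helper `normalize_dim` is hypothetical, written only for this example, and is not a function from PyTorch or from this repo.

```python
def normalize_dim(dim: int, ndim: int) -> int:
    """Hypothetical helper: map a possibly negative dim to its non-negative index."""
    if not -ndim <= dim < ndim:
        raise IndexError(f"dim {dim} is out of range for a {ndim}-D tensor")
    # Negative indices count from the end, so -1 is the last dimension.
    return dim % ndim


# 2D jagged input, e.g. [total_tokens, embed_dim]: -1 and 1 are the same axis.
dim_2d = normalize_dim(-1, ndim=2)

# 3D padded input, e.g. [batch_size, seq_len, embed_dim]: -1 is axis 2, not 1.
dim_3d = normalize_dim(-1, ndim=3)

print(dim_2d, dim_3d)
```

So dim=1 matches the paper's dim=-1 only when the tensor being split is 2D. For a 3D padded tensor, the embedding axis would be dim=2.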

