[Proposal] layout deduction ambiguity of Nested Layout Access Problem #2000

Open
yiakwy-xpu-ml-framework-team opened this issue Dec 18, 2024 · 1 comment
Labels
? - Needs Triage bug Something isn't working

Comments

@yiakwy-xpu-ml-framework-team

yiakwy-xpu-ml-framework-team commented Dec 18, 2024

Describe the Issue
Name : either layout deduction or print2D problem

In the official example, we have

template <class Shape, class Stride>
void print2D(Layout<Shape,Stride> const& layout)
{
  for (int m = 0; m < size<0>(layout); ++m) {
    for (int n = 0; n < size<1>(layout); ++n) {
      printf("%3d  ", layout(m,n));
    }
    printf("\n");
  }
}

// This introduces a layout deduction problem.
// From the outermost dimension, we derive from the strides that it is row major (the contiguous dimension is the last dimension, or -1)
// But meanwhile, the inner nested dimension is still column major.
//
// This is unreasonable!
Layout s2xh4 = make_layout(make_shape (2,make_shape (2,2)),
                           make_stride(4,make_stride(2,1)));

/*
> print2D(s2xh4)
  0    2    1    3 (stride 4 for dimension 0)
  4    6    5    7
*/
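
For readers without CuTe at hand, the printed rows can be reproduced with plain integer arithmetic. This is only a sketch: `offset_2x2x2` and `row_of` are hypothetical helpers, not CuTe functions, and they assume that a 1-D index into the nested mode (2,2) is unpacked colexicographically (left sub-mode fastest), which is the convention CuTe documents for coordinate conversion.

```cpp
#include <array>
#include <cassert>

// Hypothetical helper (not part of CuTe): offset of logical coordinate (m, n)
// in a layout with shape (2,(2,2)) and strides (s0,(s1,s2)).
// Assumption: the 1-D index n into the nested mode (2,2) is unpacked
// colexicographically, i.e. its left sub-mode varies fastest.
int offset_2x2x2(int m, int n, int s0, int s1, int s2) {
  assert(0 <= m && m < 2 && 0 <= n && n < 4);
  int n0 = n % 2;  // left sub-mode of the nested (2,2), varies fastest
  int n1 = n / 2;  // right sub-mode of the nested (2,2)
  return m * s0 + n0 * s1 + n1 * s2;
}

// Offsets of row m of the 2x4 view, in the order print2D visits them.
std::array<int, 4> row_of(int m, int s0, int s1, int s2) {
  std::array<int, 4> row{};
  for (int n = 0; n < 4; ++n) {
    row[n] = offset_2x2x2(m, n, s0, s1, s2);
  }
  return row;
}
```

Calling `row_of(0, 4, 2, 1)` and `row_of(1, 4, 2, 1)` walks the same coordinates as the two rows printed above: the left sub-mode carries stride 2, so consecutive n alternate between the two sub-modes.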

In this example, we never apply an explicit 2D view operation to the layout s2xh4, which creates ambiguity. Let's dive into it.

Ideally, the 4-element sequence "0 2 1 3" should still have a row-major layout.

  1. Suppose the default is a column-major layout and CuTe performs no layout deduction. Then the innermost dimension of the row vector "0 2 1 3" is 0, and its 2D view is:
/*

column major layout
0=(0, 0)  2=(0, 1)
1=(1, 0)  3=(1, 1)

*/

That means the 2D view of the sequence "0 2 1 3" is column major but fetched row by row. The tricky part is that print2D contains no explicit 2D view operation, right?

  2. OK, let's assume we do have layout deduction. Then the innermost dimension of the row vector "0 2 1 3" is 1, and its 2D view is:
/*

row major layout
0=(0, 0)  1=(0, 1)
2=(1, 0)  3=(1, 1)

*/

That means the 2D view is row major but fetched column by column. I don't think this is the expected behavior.

In either case, I don't think it is "naturally good enough". For an array with shape (2, 2, 2) (we should have a routine that translates (2, (2, 2)) to (2, 2, 2); think of the Python NumPy indexer a[0, 5:6/nested array/, 0]), the 2D view with shape (2, 4) should always print

0 1 2 3
4 5 6 7

and its 3D view

// in row major its contiguous dimension contains 0 1 instead of 0 1 0 3
0 1
2 3

but it is still 0 1 2 3 in memory if we view it as a 1-D array

Because this kind of view does not change the contiguous dimension (0 or -1), right?

Proposal

  1. Add a layout deduction function explicitly.

In a nested IntTuple object, we could deduce its layout from the innermost nested call with a layout argument.

  2. Allow a shape association operation

shape deduction : (2, (2,2)/4 elements in 2D view/) -> (2, 2, 2) (3D view)

  3. Add shape view operations

3.1 add View : (shapeXD, stridesXD) -> (shapeYD, stridesYD) , the memory order never changes

3.2 add Reshape : (shapeXD, stridesXD) -> (shapeYD, stridesYD) , the memory order may change

Expected behavior

template <class Shape, class Stride>
void print2D(Layout<Shape,Stride> const& layout)
{
  for (int m = 0; m < size<0>(layout); ++m) {
    for (int n = 0; n < size<1>(layout); ++n) {
      printf("%3d  ", layout(m,n));
    }
    printf("\n");
  }
}

// This introduces a layout deduction problem.
// From the outermost dimension, we derive from the strides that it is row major (the contiguous dimension is the last dimension, or -1)
// But meanwhile, the inner nested dimension is still column major.
//
// This is unreasonable!
Layout s2xh4 = make_layout(make_shape (2,make_shape (2,2)),
                           make_stride(4,make_stride(2,1)));

/*
> print2D(s2xh4)
  0    1    2    3 (stride 4 for dimension 0)
  4    5    6    7
*/

Environment details (please complete the following information):

  • Environment location: [Bare-metal, Docker, Cloud(specify cloud provider)]

Additional context
N/A

@ccecka

ccecka commented Dec 18, 2024

From what I can gather, your definition of "row-major" is if the last stride is 1?

From the outermost dimension, we derive from the strides that it is row major (the contiguous dimension is the last dimension, or -1)

That is a poor definition of row-major. This layout is row-major:

Layout rowm_2x4 = make_layout(make_shape (2,make_shape (2,2)),
                              make_stride(4,make_stride(1,2)));

while the layout you point to is not. I'm not sure what you want to "deduce".
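
To make the distinction concrete, here is a small library-free check. It is a sketch under assumptions: `nested_offset` and `is_row_major_2x4` are hypothetical names, not CuTe APIs, and "row-major" is taken to mean that the 2x4 view satisfies offset(m, n) == 4*m + n, with the nested column index unpacked left sub-mode first.

```cpp
#include <cassert>

// Hypothetical stand-in for layout(m, n) on shape (2,(2,2)) with
// strides (s0,(s1,s2)); not a CuTe function.
// Assumption: n is split colexicographically into (n % 2, n / 2).
int nested_offset(int m, int n, int s0, int s1, int s2) {
  assert(0 <= m && m < 2 && 0 <= n && n < 4);
  return m * s0 + (n % 2) * s1 + (n / 2) * s2;
}

// True when the 2x4 view enumerates memory in row-major order,
// i.e. offset(m, n) == 4 * m + n for every logical coordinate.
bool is_row_major_2x4(int s0, int s1, int s2) {
  for (int m = 0; m < 2; ++m) {
    for (int n = 0; n < 4; ++n) {
      if (nested_offset(m, n, s0, s1, s2) != 4 * m + n) {
        return false;
      }
    }
  }
  return true;
}
```

Under this definition the strides (4,(1,2)) pass the check and the strides (4,(2,1)) from the original post do not, even though both end in a stride-1 sub-mode somewhere.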

Proposal 2:

// (2,(2,2)) => (2,2,2)
Layout layout_2x2x2 = flatten(s2xh4);
// or
Layout layout_2x2x2 = s2xh4.with_shape(2,2,2);
// or
Layout layout_2x2x2 = composition(s2xh4, make_layout(make_shape(2,2,2)));

You can interpret composition as an arbitrary layout reordering and reshaping utility.
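
As a sanity check on what flattening means here, the sketch below compares the nested layout with its rank-3 form using plain arithmetic. The function names are hypothetical and nothing here calls CuTe; it assumes the flattened layout is (2,2,2):(4,2,1) and that the nested column index is n = j + 2*k.

```cpp
#include <cassert>

// Hypothetical stand-in for s2xh4(m, n): shape (2,(2,2)), strides (4,(2,1)).
int s2xh4_offset(int m, int n) {
  assert(0 <= m && m < 2 && 0 <= n && n < 4);
  return 4 * m + 2 * (n % 2) + 1 * (n / 2);
}

// Hypothetical stand-in for the flattened layout (2,2,2):(4,2,1).
int flat_offset(int i, int j, int k) {
  assert(0 <= i && i < 2 && 0 <= j && j < 2 && 0 <= k && k < 2);
  return 4 * i + 2 * j + 1 * k;
}

// Flattening only removes the nesting: every rank-3 coordinate (i, j, k)
// must hit the same offset as the nested coordinate (i, j + 2 * k).
bool flatten_preserves_offsets() {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j)
      for (int k = 0; k < 2; ++k)
        if (flat_offset(i, j, k) != s2xh4_offset(i, j + 2 * k)) return false;
  return true;
}
```

The strides are untouched; only the way coordinates are grouped changes, which is why no data movement is involved.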

Proposal 3.1:

// (2,(2,2)) => ((2,2),2)
Layout layout_4x2 = composition(s2xh4, make_layout(make_shape(4,2)));
// or
Layout layout_4x2 = s2xh4.with_shape(4,2);
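
The 4x2 reinterpretation can also be worked out by hand. The following is a sketch with hypothetical helper names, assuming that `make_layout(make_shape(4,2))` is the compact column-major layout (4,2):(1,4) and that a 1-D index into s2xh4 is unpacked left mode first.

```cpp
#include <cassert>

// Hypothetical stand-in for s2xh4(idx) with a 1-D index in [0, 8).
// Assumption: idx is unpacked colexicographically over shape (2,(2,2)).
int s2xh4_1d(int idx) {
  assert(0 <= idx && idx < 8);
  int m = idx % 2;  // outer mode of size 2, stride 4
  int n = idx / 2;  // nested mode (2,2), strides (2,1)
  return 4 * m + 2 * (n % 2) + 1 * (n / 2);
}

// Composition with (4,2):(1,4): coordinate (r, c) of the new 4x2 view
// becomes the 1-D index r + 4 * c of the original layout.
int view_4x2(int r, int c) {
  assert(0 <= r && r < 4 && 0 <= c && c < 2);
  return s2xh4_1d(r + 4 * c);
}
```

Walking r = 0..3 at c = 0 yields offsets 0, 4, 2, 6, which matches a first mode of shape (2,2) with strides (4,2), and stepping c adds 1, matching the ((2,2),2) shape noted in the comment.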

Proposal 3.2:
Memory order never changes in layout/view manipulations; to change it you'll want an explicit copy.
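
A rough illustration of such an explicit copy, written without CuTe so it stands alone: `src_offset` and `copy_to_row_major` are hypothetical helpers, and the destination is assumed to be a plain row-major 2x4 buffer. With CuTe itself one would more likely wrap both buffers in tensors and use its `copy` algorithm.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for s2xh4(m, n): shape (2,(2,2)), strides (4,(2,1)).
int src_offset(int m, int n) {
  assert(0 <= m && m < 2 && 0 <= n && n < 4);
  return 4 * m + 2 * (n % 2) + 1 * (n / 2);
}

// Reads the 2x4 logical matrix through the nested layout and writes it to a
// new buffer addressed row-major, dst[4 * m + n]. The logical matrix is the
// same; only the physical order of the elements changes.
std::array<int, 8> copy_to_row_major(const std::array<int, 8>& src) {
  std::array<int, 8> dst{};
  for (int m = 0; m < 2; ++m) {
    for (int n = 0; n < 4; ++n) {
      dst[4 * m + n] = src[src_offset(m, n)];
    }
  }
  return dst;
}
```

If `src` holds its own offsets (0 through 7), the copied buffer holds the values in the order print2D originally printed them, so printing the new buffer through a row-major layout gives the same logical matrix as before.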

I recommend reading through more of the CuTe documentation here:
https://github.com/NVIDIA/cutlass/tree/main/media/docs/cute
