Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Required Arguments to Herd Function #669

Open
hunhoffe opened this issue Jul 16, 2024 · 12 comments
Open

Required Arguments to Herd Function #669

hunhoffe opened this issue Jul 16, 2024 · 12 comments
Labels
question Further information is requested

Comments

@hunhoffe
Copy link
Contributor

While I was writing some of the python examples, I realized that there is some inconsistency between the number of arguments and the sizes parameter between segment and herd. I haven't fully explored what is allowed (syntactically) in launch because multi-launches are a bit up in the air (pun intended) in terms of implementation/meaning.

Anyways, both herd and segments have a sizes argument. If sizes in not specified, it defaults to [1, 1].

I think the segment usage makes sense to me. The following are valid python snippets for segment:

@segment(name="seg")
def segment_body():
    ...
@segment(name="seg", sizes=[1])
def segment_body(tx, sx):
    ...
@segment(name="seg", sizes=[1, 1])
def segment_body(tx, ty, sx, sy):
    ...

However, herd MUST have the arguments below, even when sizes is not specified. E.g., these are valid ways to define your herd:

@herd(name="herd", sizes=[1, 1])
def herd_body(tx, ty, sx, sy):
    ...
@herd(name="copyherd")
def herd_body(tx, ty, sx, sy):

(but I couldn't figure out how to get sizes=[1] to be valid)

I was wondering if I could get some help understanding what should be valid in terms of specifying sizes for a herd, and also if it's a good idea or not to figure out how to hide the arguments when the herd is 1x1 the same way the segment works (because why would one need size or indices in that case??)

I'd appreciate some feedback on this issue.

@hunhoffe hunhoffe added the question Further information is requested label Jul 16, 2024
@hunhoffe
Copy link
Contributor Author

Some actions items from this list that I think make sense after further discussion:

  • Sizes arguments for segments and herds should probably be validated to be 2 dimensional (e.g., segment size [1] should probably not be valid)
  • To clean up air python code, I should figure out how to hide the four arguments for a herd of default [1,1] size, the same way it is for segments.

@hunhoffe hunhoffe self-assigned this Jul 17, 2024
@jgmelber
Copy link
Collaborator

Adding here for "documentation":

Multiple herds can exist in a segment and are placed to fit and the segment must have enough resources for that placement, the herds are guaranteed to run/exist at the same time within the spatial fabric the segment reserves. Same goes for segments within a launch, they are guaranteed to exist on the device at the same time or the launch cannot be executed. Therefore both herds and segments should have a shape. Though the shape of a segment is much more physical and must account for the structure of a target device's dataflow units (DUs), where a DU is the smallest unit of reservation/configuration. Launch has an "iteration space" for invocations of that launch and those can exist in space or time. It is up for debate whether the iteration space is a simple count or a shape.

@hunhoffe
Copy link
Contributor Author

Here are a few points from discussion today:

  • It's a bit unclear whether herd sizes correspond to physical blocks/shapes, or to iteration spaces
  • Segments correspond to iteration spaces, so how segments function makes sense with [1] being valid (to be specific, 1 or 2 or 3 dimensions should be valid)
  • Regardless, for a user, being able to specify the size of the herd in a multi-dimensional way and then get access to indices for each iteration may save the programmer from repetitive modular arithmetic to calculate indices. So we don't want to move to a single-dimension count argument as a matter of utility/convenience, regardless of what physical dimensions the device may have.

From this, the favored solution would be:

  • Fix bugs/current state so that herds function the same way that segments do as shown in the examples in the initial issue writeup above
  • Eventually, there should be an additional argument to herd declaration that specifies whether the size should correspond to the physical shape or not, and this flag should be percolated to the compiler passes which do the physical mapping to let the passes know if they are permitted to change the shape or not.

Outstanding implementation question: where should validation happen?

  • My thought: If the herd op should then be able to take in a shape of 1, 2, (or 3 even) dimensions - it would be up to the compiler to decide, if the shape is meant to be taken as a physical description, whether the dimension is valid for the target device (e.g., for npu, I think a two dimensional physical shape would be required).

@jgmelber
Copy link
Collaborator

Segments correspond to iteration spaces

I am not sure why a Segment corresponds to an iteration space? It is a reservation of physical resources needed for the herds and memory allocations it encapsulates.

It's a bit unclear whether herd sizes correspond to physical blocks/shapes, or to iteration spaces

This is a bit more nuanced, it could be either. But there is a physicality to herds that each worker is physically co-located in spatial resources.

@fifield
Copy link
Collaborator

fifield commented Jul 22, 2024

Here are a few points from discussion today:

  • It's a bit unclear whether herd sizes correspond to physical blocks/shapes, or to iteration spaces

The sizes correspond to physical cores which defines the spatial iteration space. We do not try to map a virtual iteration space onto the physical iteration space.

  • Segments correspond to iteration spaces, so how segments function makes sense with [1] being valid (to be specific, 1 or 2 or 3 dimensions should be valid)

IMHO we should only allow segment size = [1] until shape is well defined for segment. The most common thinking is that it's a spatial unroll factor, or equivalently, a resource reservation. It is generally not thought of as an iteration space like in a scf.parallel or air.launch.

  • Regardless, for a user, being able to specify the size of the herd in a multi-dimensional way and then get access to indices for each iteration may save the programmer from repetitive modular arithmetic to calculate indices. So we don't want to move to a single-dimension count argument as a matter of utility/convenience, regardless of what physical dimensions the device may have.

From this, the favored solution would be:

  • Fix bugs/current state so that herds function the same way that segments do as shown in the examples in the initial issue writeup above
  • Eventually, there should be an additional argument to herd declaration that specifies whether the size should correspond to the physical shape or not, and this flag should be percolated to the compiler passes which do the physical mapping to let the passes know if they are permitted to change the shape or not.

Maybe, but for now herd size == physical shape == iteration space. We do let the compiler to physically place the herd in the segment.

Outstanding implementation question: where should validation happen?

  • My thought: If the herd op should then be able to take in a shape of 1, 2, (or 3 even) dimensions - it would be up to the compiler to decide, if the shape is meant to be taken as a physical description, whether the dimension is valid for the target device (e.g., for npu, I think a two dimensional physical shape would be required).

Agreed for future work, but for now it is a physical description onto a 2d grid.

@jgmelber
Copy link
Collaborator

Maybe, but for now herd size == physical shape == iteration space. We do let the compiler to physically place the herd in the segment.

Sounds good. Let's go with that.

@hunhoffe
Copy link
Contributor Author

Thank you for chiming in @fifield and @jgmelber! I do think I need a bit more clarification for how to map this to actionable implementation changes.

From talking with @jgmelber, a segment may be meant to map to the data unit (which, to my understanding, is limited to 1 column). That would mean that only herds of specific dimensions could fit into a segment if we pin the segment size to [1] and map the herd size to a direct physical representation.

I don't have any examples currently working with segments size != [1], so that part doesn't much change how I am constructing examples. However, currently, there are quite a few (working) examples I've made that have herd size [2,2] in a singular segment ([1]) - and, if we map segment size to DUs - this wouldn't currently be correct/legal.

How should I change my examples to be correct mlir-air if this is true? I could change the herd size to be 1x4 in the examples. That would remove example code showing two dimensional indexing within the herd, which is functionally okay but from an instructional standpoint would perhaps be a bit of a loss. I could change the examples to use multiple segments but then they would cease to work (and cease to function as pseudo "unit tests" for the python bindings) until there is support for multiple segments.

I was also under the impression that sometimes a [2,2] herd is interpreted as [1,4] physically. If that's true, then there is also a departure between our intended API and our implementation which would need to be addressed.

Any guidance would be appreciated!

@jgmelber
Copy link
Collaborator

jgmelber commented Jul 24, 2024

data unit (which, to my understanding, is limited to 1 column)

A dataflow unit (DU) is 1 column. A segment is a reservation of 1 or more DUs/.

I don't have any examples currently working with segments size != [1], so that part doesn't much change how I am constructing examples. However, currently, there are quite a few (working) examples I've made that have herd size [2,2] in a singular segment ([1]) - and, if we map segment size to DUs - this wouldn't currently be correct/legal.

For this example the segment should be of size [2] in order to have enough resources for a [2,2] herd to be placed.

I was also under the impression that sometimes a [2,2] herd is interpreted as [1,4] physically. If that's true, then there is also a departure between our intended API and our implementation which would need to be addressed.

This is a possibility we are not supporting currently after @fifield 's suggestions above, the size/shape is physical not logical. If a pass transforms the [2,2] to [1,4] explicitly then it would be mapped as such physically. A [2,2] herd is mapped to a physical 2 x 2 square grid of compute cores.

@jgmelber
Copy link
Collaborator

IMHO we should only allow segment size = [1] until shape is well defined for segment. The most common thinking is that it's a spatial unroll factor, or equivalently, a resource reservation. It is generally not thought of as an iteration space like in a scf.parallel or air.launch.

Agreed, if @fifield means that segment size is 1 dimension for now until shape is well defined. The segment is a reservation of 1 or more "dataflow units" (DUs), which for most spatial devices we target now is 1 column of compute/memory/shim tiles.

@fifield
Copy link
Collaborator

fifield commented Jul 25, 2024

IMHO we should only allow segment size = [1] until shape is well defined for segment. The most common thinking is that it's a spatial unroll factor, or equivalently, a resource reservation. It is generally not thought of as an iteration space like in a scf.parallel or air.launch.

Agreed, if @fifield means that segment size is 1 dimension for now until shape is well defined. The segment is a reservation of 1 or more "dataflow units" (DUs), which for most spatial devices we target now is 1 column of compute/memory/shim tiles.

I mean [1], i.e. one dim of size one. DU is not defined or used anywhere in mlir-air. The segment op supports 2d anchor and size attributes like a herd, in terms of tiles, but that is the only place resources are mentioned (as currently implemented). The attributes get used in the herd placement pass. The size parameter (not attribute) was intended to be used slightly differently, as a physical unroll factor (iteration space) of the segment. That is, a [2,2] segment would be physically duplicated in a 2x2 grid and all resources must be available to start the [four copies of the] segment. We could introduce something like mlir-aie targetmodel which talks about DUs for a given device/platform/whatever but mlir-air doesn't have anything like that yet.

@fifield
Copy link
Collaborator

fifield commented Jul 25, 2024

I was also under the impression that sometimes a [2,2] herd is interpreted as [1,4] physically. If that's true, then there is also a departure between our intended API and our implementation which would need to be addressed.

There are transformations to achieve this, but it is not the default behavior.

@hunhoffe
Copy link
Contributor Author

hunhoffe commented Jul 30, 2024

I think I have some better understanding of sizes at this point, namely that for herds it is a physical representation. Thank you all for discussing this with me!

However, I am still feeling fuzzy on what implementation changes should be made to make this more clear through the API. To focus that more general question specifically on what a valid herd sizes should be, I'm still not sure.

  • Is a 1 dimensional size valid?
    I'm still unsure. This documentation here claims 1 or 2 dimensional is valid.
  • Should I be able to omit sizes when creating a herd (e.g., use default value)?
    I think the answer to this is 'yes', and it should hide the extra arguments to the function in this case (as with the first segment example). Implementation of this question depends on the answer to the first question, in terms of what the default value should be.

@hunhoffe hunhoffe removed their assignment Aug 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants