Add Workload object and execution methods #18
base: main
Conversation
rolfmorel left a comment:
Here's a partial pass through it. Will try to complete the first pass tomorrow!
        pass

    @abstractmethod
    def get_input_arrays(self, execution_engine) -> list:
-> abc.Sequence[ir.Value]?
If each Workload were explicitly associated with a Target (or more than one), i.e. had a Target member, and each Target kept track of its associated execution_engine, this signature would become simpler.
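A minimal sketch of the suggested structure, assuming hypothetical names (Target, CpuTarget, AddWorkload are illustrations, not anything from this PR): the Workload holds its Target, and the Target owns the execution engine, so get_input_arrays no longer needs an execution_engine parameter.

```python
from abc import ABC, abstractmethod
from collections.abc import Sequence


class Target(ABC):
    """A compilation/execution target (hypothetical sketch)."""

    @property
    @abstractmethod
    def execution_engine(self):
        """The execution engine this target owns."""


class Workload(ABC):
    """Each Workload carries its Target, so methods that previously
    took an execution_engine argument no longer need one."""

    def __init__(self, target: Target):
        self.target = target

    @abstractmethod
    def get_input_arrays(self) -> Sequence:
        """Input arrays, built via self.target.execution_engine."""


class CpuTarget(Target):
    @property
    def execution_engine(self):
        return "cpu-execution-engine"  # placeholder for a real engine


class AddWorkload(Workload):
    def get_input_arrays(self) -> Sequence:
        # A real implementation would allocate buffers through
        # self.target.execution_engine.
        return []


workload = AddWorkload(CpuTarget())
```

With this shape, the return annotation can also tighten from a bare list to something like Sequence[ir.Value], as suggested above.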
        pass

    @abstractmethod
    def get_complexity(self) -> list:
If the different "dimensions" of the returned thing are known, and have sensible names, I would go with a NamedTuple.
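A sketch of what that could look like; the field names are taken from the tuple this diff returns further down, return (flop_count, memory_reads, memory_writes).

```python
from typing import NamedTuple


class Complexity(NamedTuple):
    """Named dimensions for get_complexity(); field names follow the
    (flop_count, memory_reads, memory_writes) tuple later in this diff."""

    flop_count: int
    memory_reads: int
    memory_writes: int


# A NamedTuple stays tuple-compatible, so existing unpacking keeps working:
c = Complexity(flop_count=128, memory_reads=64, memory_writes=32)
flops, reads, writes = c
```

Callers then get self-documenting access (c.flop_count) without breaking any code that still treats the result as a plain 3-tuple.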
    print(schedule_module)


def lower_payload(
Shouldn't this just be a non-abstract method on the Workload class?
    dump_schedule: bool = False,
):
    if not dump_kernel or dump_kernel != "initial":
        with context, location:
In my mind there are very few cases where you have a reference to a mlir.ir.Context and haven't entered its context manager yet. The story for mlir.ir.Location is similar (there you might actually want to enter and exit nested Locations based on the scope of the IR you're building).
Have a look at #20 (comment) and #20 (comment) . My take is that the function which acts as the process's main() should set up and enter a mlir.ir.Context and set a mlir.ir.Location.unknown(). If we do that, the rest of the code should barely need to think about these any more.
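The pattern being suggested, as a runnable stand-in sketch: the real names are mlir.ir.Context and mlir.ir.Location.unknown() from the MLIR Python bindings, which the context managers below only mimic so the shape of the code is visible without the bindings installed.

```python
# Illustrates the "enter once in main()" pattern: the top-level entry
# point enters the context and location, and all deeper IR-building
# code relies on the ambient state instead of writing its own
# `with context, location:` blocks.
import contextlib


@contextlib.contextmanager
def make_context():  # stand-in for mlir.ir.Context()
    yield "ctx"


@contextlib.contextmanager
def unknown_location():  # stand-in for mlir.ir.Location.unknown()
    yield "loc"


def build_ir():
    # No context management here: main() has already set it all up.
    return "module"


def main():
    with make_context(), unknown_location():
        return build_ir()
```

The payoff is that helpers like lower_payload stop threading context/location parameters through every call.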
    if not dump_kernel or dump_kernel != "initial":
        with context, location:
            # invoke transform interpreter directly
            transform_interpreter.apply_named_sequence(
Now that we have llvm/llvm-project#168223, this can probably be simplified. (E.g. do we need apply_transform_schedule?)
        return (flop_count, memory_reads, memory_writes)


    def payload_module(self):
        with self.context, self.location:
            A = f.arguments[0]
            B = f.arguments[1]
            C = f.arguments[2]
            a_tensor = bufferization.ToTensorOp(tensor_t, A, restrict=True)
As a general principle we can set for ourselves right at the start of the project: when the upstream bindings already have a workable snake_case version of an op, let's use that over the CamelCaseOp version. The crux of the argument is that this makes the Python code look closer to a terse version of the MLIR textual format.
            c_tensor = bufferization.ToTensorOp(
                tensor_t, C, restrict=True, writable=True
            )
            add = linalg.add(a_tensor, b_tensor, outs=[c_tensor])
Is there a reason to prefer doing the linalg.add on tensors rather than on memrefs? If anything, the IR would be even simpler.
        schedule_module.operation.attributes[
            "transform.with_named_sequence"] = (ir.UnitAttr.get())
        with ir.InsertionPoint(schedule_module.body):
            named_sequence = transform.NamedSequenceOp(
Suggested change:
-            named_sequence = transform.NamedSequenceOp(
+            named_sequence = transform.named_sequence(
        if dump_kernel == "bufferized":
            transform.YieldOp()
        return schedule_module
Do we want this in the committed version?
An alternative to consider is inserting transform.print %mod if some verbosity flag has been set.
Implements #16.
Adds a Workload object and execution engine utility functions that can execute it.