e2e matmul test improvements #19016

bjacob · 2024-11-04T21:30:56Z

Working on #18980 let me spend quality time with e2e matmul tests and suggested some changes.

The main change is to simplify the printing of numerical values to always use high precision, meaning print all significant digits of floating point values.

Since our tests generate small integral values, and the intent is mostly to test the exact arithmetic that happens on such values, in most cases this doesn't make any difference.

But I found that RDNA3 float arithmetic produces non-exact results even on those values. As a result, I got values like 1+epsilon where 1 was expected, causing a test to fail (since we didn't know we needed to opt out from requiring exact results) and the test output cryptically printed both values as "1".
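To illustrate the printing problem described above, here is a minimal sketch. It is not the code from iree-e2e-matmul-test.cc: `format_float`, `one_plus_epsilon`, `kDefaultDigits` and `kFullDigits` are hypothetical names made up for this example. It assumes values are printed through a C++ stream, where the default precision is 6 significant digits.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not from the IREE sources): formats `value` with the
// given number of significant digits, using default (%g-like) float notation.
std::string format_float(float value, int significant_digits) {
  assert(significant_digits > 0);
  std::ostringstream os;
  os << std::setprecision(significant_digits) << value;
  return os.str();
}

// Smallest float strictly greater than 1, i.e. 1 + epsilon.
float one_plus_epsilon() { return std::nextafter(1.0f, 2.0f); }

// Default precision of a C++ output stream.
constexpr int kDefaultDigits = 6;
// max_digits10 is the number of decimal digits needed to distinguish any two
// distinct float values (9 for IEEE binary32).
constexpr int kFullDigits = std::numeric_limits<float>::max_digits10;
```

With these definitions, `format_float(one_plus_epsilon(), kDefaultDigits)` and `format_float(1.0f, kDefaultDigits)` both yield `1`, which is the cryptic output described above, while `format_float(one_plus_epsilon(), kFullDigits)` yields `1.00000012`. Exact values such as `1.0f` still print as `1` at full precision, so the output for well-behaved small integral values does not get noisier.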

The other change is to more consistently print the same number of rows and columns regardless of whether we are at the start or in the middle of a dimension, and to have that number be what we call "context" (before, it was "2 * context").
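One way to get a constant number of printed rows or columns is to clamp a fixed-size window to the dimension bounds. The sketch below is an assumption about how that could be done, not the logic actually used in the test tool; `Window` and `context_window` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>

// Half-open index range [start, end) along one dimension.
struct Window {
  int start;
  int end;
};

// Hypothetical helper: returns a window of `context` indices (or the whole
// dimension if it is smaller) that contains `center`. Near the start or end of
// the dimension, the window is shifted rather than shrunk, so the number of
// printed indices stays the same everywhere.
Window context_window(int center, int size, int context) {
  assert(size > 0 && context > 0 && center >= 0 && center < size);
  int count = std::min(context, size);
  int start = center - count / 2;
  start = std::max(0, std::min(start, size - count));
  return {start, start + count};
}
```

For a dimension of size 100 and a context of 8, an error at index 0 gives the window [0, 8), an error at index 50 gives [46, 54), and an error at index 99 gives [92, 100): always 8 indices.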

Also a seasonal emoji change.

Signed-off-by: Benoit Jacob <[email protected]>

ScottTodd · 2024-11-04T22:08:03Z

tools/testing/e2e/iree-e2e-matmul-test.cc

Drive by: I'd still really like to delete all the in-tree code for this test suite and migrate to https://github.com/iree-org/iree-test-suites/tree/main/linalg_ops. Different sets of improvements have been made to both places at this point.

The tests hit a complexity level that I'm not comfortable supporting in-tree, and we should continue transitioning to package-based testing (decoupled from the core project CMake build, runnable from dev/nightly/stable package builds on a wide range of systems) if possible.

bjacob · 2024-11-09 (edited)

@ScottTodd Let's chat next week. I really want to help with the test changes you're making, and I would like to offer to take care of this directly to take it off your plate. But there's something important here -- these tests aren't just "regression" tests that would matter for QA but maybe not so much for day to day development.

These tests are at the center of day to day development, so changes to these tests need to be made as frequently as we add codegen support for a new compilation path, such as when we implement GPU data tiling on another GPU target or support another data type.

I think you caught this particular PR because it touches the files under tools/testing/, whereas my day-to-day PRs more often touch only the code under tests/e2e/matmul/. The latter is really changing at a high frequency (see git history). The former changes less often, but still: just today I noticed I needed to make changes there for f64 (#19093). I need that to test F64 MFMA on CDNA3, for completeness: if the hw supports f64, it's easier to just support it than to refrain from it, even with no immediate need.

That doesn't preclude moving these tests out of tree but that does mean that:

The out-of-tree move needs to be atomic --- if it means I can't land test changes concurrently, that's literally blocking my work. If other test changes/improvements are planned together with the move, it's really important to keep them separate from the move itself, so the atomic op here can have low latency.

The out-of-tree move will incur a permanent penalty on my development velocity. I think I'm OK with that because I really care about test infrastructure health and trust your judgement here, but that's part of why I'd like to be involved (the other part being that I really want to help with that!).

ScottTodd · 2024-11-12

I'm still thinking through some of the design questions here; I don't have a final state in mind that I'm totally happy with. Here are some responses to your individual points.

> These tests are at the center of day to day development, so changes to these tests need to be made as frequently as we add codegen support for a new compilation path, such as when we implement GPU data tiling on another GPU target or support another data type.

It's my (perhaps naive) hope that we could somewhat exhaustively enumerate the data types we could support in the test suite, independent of what we currently do support on any given target. I'd like to get out of the habit of only testing what we already know works. For that I want to support XFAIL tests, but I had trouble building that into the in-tree tests. To be fair... the out of tree tests don't yet have XFAIL either, and I'm not sure if CMake/ctest is the right framework for that level of configuration. I've been much happier with pytest for handling test filtering and expected outcomes.

As for things like feature flag flips and choosing compilation pipelines, I like that putting an air gap between the compiler+runtime and the tests keeps us honest about default compiler behavior and user-friendly optimization settings. The air gap / github repository boundary does introduce friction for daily development though. I can think of a few ways we can mitigate that friction, but any of them would require some changes to developer workflows:

Include iree-test-suites as a subproject using CMake FetchContent (not a git submodule, so random users cloning the repo don't pay the cost for test suites)

Add scripts that go through common steps like checking out the test suite repo, building the tools, running tests, getting test results, collecting benchmarks, etc.

Having the tests in their own project gives us more freedom here to deviate from IREE's core CMake setup

RE: atomic changes, I started https://github.com/iree-org/iree-test-suites/tree/main/linalg_ops as a relatively simple fork at first and then started making deeper changes to take advantage of the new decoupling:

Checking in the test files so you can see what they include without needing to run the generator script: https://github.com/iree-org/iree-test-suites/blob/main/linalg_ops/matmul/generated/f16_into_f32/matmul_f16_into_f32_large.mlir

Restructuring the CMake test files, though they currently lack full coverage of the configurations we want to test (e.g. CPU features, GPU flags): https://github.com/iree-org/iree-test-suites/blob/main/linalg_ops/matmul/CMakeLists.txt

Adding a new top level CMake project, decoupled from things like the iree-test-deps target and the IREE_BUILD_COMPILER option (including cross-compilation concerns): https://github.com/iree-org/iree-test-suites/blob/main/linalg_ops/CMakeLists.txt

Leaving things half migrated is unfortunate, but my time was pulled elsewhere. @erman-gurses has also been helping get convolution and attention tests moved into the test suite repo recently. If we come to a decision we are all happy with then we can schedule some time on the roadmap to finish the migration work.

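bjacob · 2024-11-12 (edited)

Thanks for the details! Let's find time this week for a video call. For now, just recording a couple of data points. One is the total size in bytes of the generated e2e matmul .mlir files in my build directory: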

~/iree-build$ find . -name '*e2e_matmul*mlir' -printf "%s\n" | awk '{sum+=$1} END {print sum}'
26158552

Another data point: pace of introduction of new element types in CDNA architectures:

CDNA1: f32, f16, i8

CDNA2: f64, bf16

CDNA3: xf32, f8E5M2FNUZ, f8E4M3FNUZ

Near future: f8E5M2 (non-FNUZ), f8E4M3 (non-FNUZ), FP6, FP4, INT6, INT4, presumably everything that's being standardized in Microscaling 1.0.

Meanwhile, NVIDIA architectures support some of these types already in production, as well as altogether different types such as TF32.

pumpkins

62f7512

Signed-off-by: Benoit Jacob <[email protected]>

bjacob marked this pull request as ready for review November 4, 2024 21:37

bjacob requested a review from benvanik as a code owner November 4, 2024 21:37

ScottTodd reviewed Nov 4, 2024


ScottTodd self-requested a review November 11, 2024 18:20
