Releases: iree-org/iree

iree candidate iree-3.10.0rc20251128

28 Nov 10:15
4bb0c12

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.10.0rc20251127

27 Nov 10:40
46fbe05

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.10.0rc20251126

26 Nov 10:19
74ee8f2

Pre-release

Automatic candidate release of iree.

Release v3.9.0

25 Nov 18:51
1a66819

IREE Release v3.9.0

1. Compiler

1.1 Data Tiling & GEMM Improvements

  • iree-opt-data-tiling promoted to umbrella flag with suggested config. (#22295)
  • Default path switched to DispatchCreation phase; use --iree-global-opt-data-tiling for legacy behavior. See docs. (#21441)
  • Implemented subgroups_k in data-tiled MMA layouts. (#22519)
  • Added per-operand M/N/K interleaving control. (#22626)
  • Added layout transfer support in MaterializeEncoding. (#22582)
  • Strict inner_tiled verifier with distributed/opaque params. (#22369)
  • Unified encoding materialization passes. (#22472)
  • Encoding op fusion with multi-use producers at -O3. (#22444)
  • Intentional padding for non-K-major layouts (~2.7% GEMM improvement). (#22486)
  • Better heuristics for extremely large GEMMs. (#22636)
  • Refactored narrow matmul tile size selection. (#22177)
  • Split reduction for large-K GEMMs. (#22357)
  • Updated ukernel data layout. (#22350)
  • Fixed large f16 ukernel bounds. (#22481)
  • Added LLaMA 8B FP8 benchmark tests on gfx942. (#22387)
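
For readers new to data tiling, the sketch below illustrates the general idea behind the packed layouts these changes target: a row-major matrix is rearranged into fixed-size tiles, with padding where the shape does not divide evenly. It is a conceptual illustration only, not IREE's implementation; the function names `pack` and `unpack`, the tile sizes, and the zero padding value are all assumptions made for this example.

```python
def pack(matrix, tile_rows, tile_cols, pad_value=0):
    """Rearrange a 2-D list into tiles indexed [outer_row][outer_col][inner_row][inner_col].

    Hypothetical helper for illustration; edge tiles are filled with pad_value.
    """
    rows, cols = len(matrix), len(matrix[0])
    outer_rows = -(-rows // tile_rows)  # ceiling division
    outer_cols = -(-cols // tile_cols)
    packed = []
    for orow in range(outer_rows):
        tile_row = []
        for ocol in range(outer_cols):
            tile = []
            for ir in range(tile_rows):
                r = orow * tile_rows + ir
                line = []
                for ic in range(tile_cols):
                    c = ocol * tile_cols + ic
                    in_bounds = r < rows and c < cols
                    line.append(matrix[r][c] if in_bounds else pad_value)
                tile.append(line)
            tile_row.append(tile)
        packed.append(tile_row)
    return packed


def unpack(packed, rows, cols):
    """Inverse of pack: drop the padding and restore the row-major layout."""
    tile_rows = len(packed[0][0])
    tile_cols = len(packed[0][0][0])
    return [
        [packed[r // tile_rows][c // tile_cols][r % tile_rows][c % tile_cols]
         for c in range(cols)]
        for r in range(rows)
    ]


if __name__ == "__main__":
    original = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    tiles = pack(original, tile_rows=2, tile_cols=2)
    print(tiles[0][1])  # the top-right tile, padded on its right edge
    print(unpack(tiles, 3, 3) == original)
```

Each tile is contiguous in the packed form, which is what lets a tiled matmul kernel read its operands without strided accesses; the cost is the extra padded elements on the edges.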

1.2 Dispatch Creation

  • Added split-reduction support for arg_compare, preventing shared-memory overflow and fixing LLaMA 8B FP16 compilation failures. (#22466)
  • Added aggressive multi-use fusion for encoding ops (enabled at -O3), significantly improving fusion patterns seen in SDXL. (#22444)
  • Enabled consumer fusion for GPUApplyTilingLevel on scf.forall loops, enhancing padding-level fusion. (#22522)
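
As a rough illustration of what split reduction means for an argmax-style comparison, the sketch below reduces each chunk of the input independently and then combines the per-chunk winners. It shows the general technique only and is not how IREE lowers arg_compare; the names `argmax_chunk` and `split_argmax` and the chunking scheme are assumptions for this example.

```python
def argmax_chunk(values, start, stop):
    """Return (max value, index) over values[start:stop]; lowest index wins ties."""
    best_idx = start
    for i in range(start + 1, stop):
        if values[i] > values[best_idx]:
            best_idx = i
    return values[best_idx], best_idx


def split_argmax(values, num_chunks):
    """Argmax computed as independent partial reductions plus a final combine.

    Hypothetical helper for illustration. Each partial reduction touches only
    one chunk, so its working set is bounded by the chunk size.
    """
    n = len(values)
    if n == 0 or num_chunks < 1:
        raise ValueError("need a non-empty input and at least one chunk")
    chunk = -(-n // num_chunks)  # ceiling division
    partials = [
        argmax_chunk(values, start, min(start + chunk, n))
        for start in range(0, n, chunk)
    ]
    # Combine step: partials are in increasing index order, so a strict
    # comparison keeps the lowest index when values tie.
    best_val, best_idx = partials[0]
    for val, idx in partials[1:]:
        if val > best_val:
            best_val, best_idx = val, idx
    return best_val, best_idx


if __name__ == "__main__":
    print(split_argmax([3, 9, 2, 9, 1, 7], num_chunks=3))
```

Because every partial reduction works on a bounded slice, the per-step storage stays small regardless of the full reduction length, which is the property that matters when the full reduction would not fit in shared memory.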

1.3 GPU Codegen

  • Added barrier insertion before first shared-memory write for AMD GPUs, fixing non-deterministic strided conv results (13% -> 0% failure rate). (#22669)
  • Rewrote loop prefetcher with a stage-based backward slicing model for better maintainability (no functional change). (#22605)
  • Implemented vector size inference for UKernelGenericOp, enabling downstream ops (e.g., unpack) to correctly vectorize instead of falling back to scalar code. (#22440)
  • Improved f16 medium ukernel bounds on ROCm for better matmul throughput. (#22393)
  • Added mmt4d ukernel support for RISC-V zvfh/zvfhmin, enabling f16xf16->f16/f32 kernels with runtime hardware probing. (#22231)
  • Generalized GPU lowering for linalg.reduce ops, converting illegal i1 reductions to generic form to unblock split-reduction pipelines. (#22490)

1.4 Others

2. Runtime

  • Implemented initial end-to-end support for external transients, enabling early but functional handling of control flow and cross-dispatch transient values.
    • Current limitations: no function calls and no data-dependent values; simple control flow is supported and aligns with future dispatch specialization work. (#22625)
  • Added timeline-aware async execution across module boundaries, introducing foundational interfaces for precise cross-module scheduling. (#22381)
  • Improved support for iree_codegen.extract_strided_metadata, ensuring information-preserving lowering:
    • Now normalizes into iree_codegen earlier, avoiding loss of stride/offset/alignment information that occurred when prematurely converting to memref. (#22606)
  • Added new Stream canonicalizations and improved RefineUsage to reduce unnecessary copies and fix correctness bugs. (#22610)
  • Added --gen-dialect-json to iree-tblgen, generating JSON databases of dialect definitions using tablegen metadata. (#22603)

Change Log

Git History

What's Changed

iree candidate iree-3.9.0rc20251125

25 Nov 10:09
1a66819

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.9.0rc20251124

24 Nov 10:20
b98c1b9

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.9.0rc20251123

23 Nov 10:16
b98c1b9

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.9.0rc20251122

22 Nov 10:20
b98c1b9

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.9.0rc20251121

21 Nov 10:22
f530214

Pre-release

Automatic candidate release of iree.

iree candidate iree-3.9.0rc20251120

20 Nov 10:22
c65eeb7

Pre-release

Automatic candidate release of iree.