Multi-Layer Offload #1489

auphelia · 2025-11-25T12:14:14Z

Multi-Layer Offload (MLO) allows significantly larger neural networks to be deployed on FPGAs using FINN, provided they are built from repetitive blocks.

Instead of mapping every layer of a model onto hardware, MLO implements just one representative block of the repetitive structure in the network (for example, a single transformer encoder layer) and repeatedly reuses it. The weights for each iteration are streamed in from external memory such as DRAM or HBM, enabling models far larger than what could fit fully on-chip.
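
As a rough software analogy (not FINN code), the reuse scheme can be sketched in plain Python: a single block function is defined once and called once per layer, and each call receives that layer's weights from a list standing in for external memory. The names `block`, `run_mlo`, and `external_memory` are hypothetical and exist only for this illustration; the matrix-vector-plus-ReLU body is an arbitrary stand-in for a real block such as a transformer encoder layer.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def block(x: Vector, weights: Matrix) -> Vector:
    """One representative block: matrix-vector product followed by ReLU."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def run_mlo(x: Vector, external_memory: List[Matrix]) -> Vector:
    """Reuse the single block once per stored weight set."""
    for weights in external_memory:  # one weight set is fetched per iteration
        x = block(x, weights)        # the output becomes the next iteration's input
    return x


# Three "layers" share one block implementation and differ only in weights.
external_memory = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, -1.0]],
    [[1.0, 1.0], [1.0, -1.0]],
]
result = run_mlo([1.0, 2.0], external_memory)
```

Only one copy of `block` exists no matter how many weight sets are stored, which mirrors why the on-chip footprint is independent of network depth in this scheme.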

Importantly, the repeated block does not need to cover the entire network. Many architectures contain head and tail portions around a large repeated middle section. MLO can be applied selectively: fixed, non‑repetitive layers can be implemented as standalone hardware layers, while the repetitive core (e.g., the main stack of transformer layers) is handled by the MLO mechanism.
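
To make the head/core/tail split concrete, here is a minimal sketch that locates the longest run of a repeated block in a flat list of layer type names. It is an illustration only and does not reflect how FINN's loop extraction is implemented; `split_head_core_tail` and the layer names are hypothetical.

```python
from typing import List, Tuple


def split_head_core_tail(
    layer_types: List[str], block: List[str]
) -> Tuple[List[str], int, List[str]]:
    """Return (head, repeat_count, tail) for the longest run of `block`."""
    if not block:
        raise ValueError("block must contain at least one layer")
    n = len(block)
    best_start, best_count = 0, 0
    for start in range(len(layer_types) - n + 1):
        count = 0
        # Count back-to-back copies of the block beginning at `start`.
        while layer_types[start + count * n : start + (count + 1) * n] == block:
            count += 1
        if count > best_count:
            best_start, best_count = start, count
    head = layer_types[:best_start]
    tail = layer_types[best_start + best_count * n :]
    return head, best_count, tail


layers = ["embed", "attn", "mlp", "attn", "mlp", "attn", "mlp", "classifier"]
head, repeats, tail = split_head_core_tail(layers, ["attn", "mlp"])
```

In this toy example the `embed` and `classifier` layers would become standalone hardware layers, while the three `attn`/`mlp` repetitions would be candidates for the looped block.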

FINN continues to stream both the activations and the weight tensors as needed. The MLO control logic, which resides in the FPGA fabric next to the accelerator, operates fully automatically: it monitors the flowing data, manages iteration indices, selects the correct weight set, and loops outputs back into the computation when required. This removes the need for any manual scheduling or user‑driven orchestration of layer execution. The hardware autonomously steps through each repeated layer based purely on the data and iteration control embedded in the design.
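
The control behavior described above can be modeled in software as a small state machine. The following is a behavioral sketch under stated assumptions, not a description of the RTL in `loop_control.sv`: the class `LoopControl`, the function `run`, and the scalar `body` are all hypothetical, and real tensors and streaming handshakes are reduced to single integers.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopControl:
    """Tracks the iteration index and decides between loopback and output."""
    num_iterations: int
    iteration: int = 0

    def __post_init__(self) -> None:
        if self.num_iterations < 1:
            raise ValueError("need at least one iteration")

    def weight_index(self) -> int:
        # The current iteration index selects the weight set.
        return self.iteration

    def advance(self) -> bool:
        """Step the index; return True when the result leaves the loop."""
        done = self.iteration == self.num_iterations - 1
        self.iteration = 0 if done else self.iteration + 1
        return done


def run(inputs: List[int], weight_sets: List[int],
        body: Callable[[int, int], int]) -> List[int]:
    ctrl = LoopControl(num_iterations=len(weight_sets))
    outputs = []
    for data in inputs:
        while True:
            data = body(data, weight_sets[ctrl.weight_index()])
            if ctrl.advance():
                outputs.append(data)  # final iteration: forward downstream
                break
            # otherwise `data` is looped back into the body
    return outputs


outputs = run([1, 0], [2, 3, 4], lambda x, w: x * w + 1)
```

The caller supplies only inputs and weight sets; the sequencing is entirely internal to the controller, which corresponds to the claim that no user-driven scheduling is required.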

The result is a practical trade‑off: slightly reduced throughput in exchange for the ability to deploy much larger models.

preusser

This is looking good functionally.
Please, fix licenses and clean up module interfaces though.

finn-rtllib/mlo/cdma/cdma_top.sv

finn-rtllib/mlo/cdma/krnl_counter.sv

finn-rtllib/mlo/inst_ip.tcl

finn-rtllib/mlo/local_weight_buffer.sv

finn-rtllib/mlo/loop_control.sv

Joshua Monson and others added 30 commits June 23, 2025 23:44

add forgotten file

5850e63

Merge pull request #1367 from jsmonson/feature/loop_op

1cc3feb

Add loop_control and loop_control_wrapper

[FINNLoop] Update ipi stitching script and fix fetch weights clk assignment

9da9b56

[FINNLoop] Saving bd design before validation to allow for investigation with Vivado

6272ea4

Merge branch 'feature/finnxsi_integration' into feature/loop_op

51ef902

[FINNLoop] Adding exp cycles function from @jsmonson

b022ca5

move template replacement into mlo_max_iter if-statement

7653487

added the ability to create an adjacency_list of the original model based on a lambda filter, updated the loop test to reflect the changes

ad8e6ce

[FINNLoop] Exposing clk and rst signals

6abc5b4

Merge pull request #1376 from Xilinx/feature/adj_list_lambda

761adfa

Extended adjacency_list utility function to accept a lambda for filtering decisions.

Fix continuous combinatorial assignments.

ea1b238

Fixes in serving of memory image.

80ee883

Merge remote-tracking branch 'origin/feature/loop_op' into feature/loop_op

cdc6022

[FINNLoop] Linting

24c8413

[FINNLoop] Update stitching script to connect components

52cb964

[FINNLoop] Update stitching to expose interfaces

b3329fc

[FINNLoop] Add verilog and system verilog files from rtllib

f063a8a

relocated and reorganize files, fix typo, add missing sources

212d30a

add a branching node and a dyn_matmul to testbench

cb05624

relocated and reorganize files, fix typo, add missing sources

708b49b

Merge pull request #1378 from jsmonson/feature/updated_loop_tb

a6763fb

Update loop tb

[Tests] Add duplicate streams to new loop test

1bade7d

[Tests] Add name for dup strms layer and run cppsim steps

305bc62

[Tests] Add fifo node attribute

ff64113

Adding a test that makes a stitchedIP of the loop body

26646ca

Added an extra make_model function to create a simpler MLO model with no fork

3b6c258

Including .svh files for mlo sim object compliation

36222bf

Including the .svh file the proper way

4e6f0ba

[FINNLoop] Generate stream tap graph for branching structures

3d84223

Sketching out model for AXIMM-backed loopback storage.

28133df

Joshua Monson and others added 19 commits October 24, 2025 15:21

update loop_hierarhcy in builder and checking in LoopExtraction

469310f

[MLO INT4] Disable narrow weights when MLO is being used (fix for INT4 mismatch issue)

9ad2da6

add fix for rhs_style

29f477b

[MLO] Clean-up files after cherry-picking commits

1c3a0ff

[MLOSim VerifyStep] Fixing issue when user request a full context and an MLO sim is being performed.

589fb30

[Builder] Remove f in print statement

7a546cc

[MLO] Insert GiveUniqueNodeNames

be2310c

Merge pull request #1466 from Xilinx/feature/refactor_loop_ipgen

7cc5ee2

Refactor Loop Op flow

[MLO] Make sure that body is saved after changes

0b82ff3

[MLO] Refactor node naming for loop node

4944870

Merge branch 'dev' into feature/loop_op

583d40c

[Tests] Thresholding: move sim reset into prehook fct

462fddd

[FINNLoop] Update exp cycles estimate

18423d1

[MLO] Make sure subgraph nodes are named differently from upper level when extracting folding config json

bb4ae3d

Merge branch 'dev' into feature/loop_op

39e3eff

[FINNLoop] Propagate rtlsim trace node attribute to execute node

520ea98

[MLO] Cleanup and align with recent changes

769d6af

[Deps] Make import of finn xsi optional in case Vivado installation is not available

fbc2afe

[Tests] Add markers to finnloop test

5da9d6b

auphelia requested a review from preusser November 25, 2025 15:57

auphelia marked this pull request as ready for review November 25, 2025 15:58

preusser requested changes Dec 10, 2025

auphelia and others added 7 commits December 10, 2025 09:42

Merge branch 'dev' into feature/loop_op

a2f6327

Merge branch 'dev' into feature/loop_op

5cc6d48

PR review fixes

577b1b9

- Modified module parameters. - License header updates. - Refactoring RTL directories, getting rid of duplicates. - RTL internal changes.

Merge branch 'dev' into feature/loop_op

2a2e50c

[MLO] Instead of node metadata propagation, implement manual setting of loop body

080c164

Merge branch 'dev' into feature/loop_op

a7218dd

[MLO] Add auto gen doc and allow for analysis passes in builder steps to work on MLO

e35e28a

auphelia requested a review from preusser January 12, 2026 18:26

5 participants

Multi-Layer Offload #1489

Are you sure you want to change the base?

Multi-Layer Offload #1489

Conversation

auphelia commented Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

preusser left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

auphelia commented Nov 25, 2025 •

edited

Loading