Correctly materialize zero-width FIFOs. #1896

Open. schilkp wants to merge 1 commit into main from schilkp/fix_zerowidth_fifos.

schilkp
Contributor

@schilkp schilkp commented Jan 29, 2025

Gates the generation of the FIFO "data path" behind a non-zero data width and adjusts the signature accordingly.

Problem:

Materializing channels with a zero-width data path fails due to mismatching signatures.

Repro

buf.x:

proc Nop {
    input: chan<()> in;
    output: chan<()> out;

    init { () }

    config(input: chan<()> in, output: chan<()> out) {
        (input, output)
    }

    next(_: ()) {
        let (tok, v) = recv(join(), input);
        send(tok, output, v);
    }

}


proc Top {
    input: chan<()> in;
    output: chan<()> out;

    init { () }

    config(input: chan<()> in, output: chan<()> out) {
        let (s, r) = chan<(), u32:1>("between");
        spawn Nop(input, s);
        spawn Nop(r, output);

        (input, output)
    }

    next(_: ()) { }
}

Run:

./bazel-bin/xls/dslx/ir_convert/ir_converter_main ./buf.x > buf.ir
./bazel-bin/xls/tools/opt_main ./buf.ir --top="__buf__Top__Nop_1_next" > buf.opt.ir
./bazel-bin/xls/tools/codegen_main ./buf.opt.ir --delay_model=unit --pipeline_stages=1 --reset="rst" --multi_proc --materialize_internal_fifos

Error:

Error: INTERNAL: Instantiation `materialized_fifo_fifo_buf__between_` of block `fifo_for_depth_1_ty____with_bypass_register_push` is missing instantation input/output node for port `push_data`; [after 'Materialize FIFO instantiations into block-instantiation if no FIFO template is specified.' pass, dynamic pass #14]; Running pass #15: Top level codegen pass pipeline [short: codegen]

Notes

This is my first time digging into the codegen codebase - I think this is the best way of fixing this but I really don't have a good overview :)

Also: This probably could do with a test for the materialization pass that double-checks the functionality of zero-width FIFOs, and maybe some sort of end-to-end test to validate the whole pipeline with zero-width channel/FIFOs: They are the type of edge-case that often gets forgotten/leads to regressions.

I am not 100% sure what the best way of adding these tests is - so I thought I would open the PR and ask for feedback :)

Happy to rework this/add tests as desired. Let me know.

Cheers!

@ericastor ericastor requested a review from grebe January 29, 2025 23:10
@schilkp schilkp force-pushed the schilkp/fix_zerowidth_fifos branch from d43ca04 to 54334b2 Compare January 30, 2025 05:32
@schilkp
Contributor Author

schilkp commented Jan 30, 2025

On second thought: The current state is likely also not optimal for non-materialized zero-width FIFOs.

Repro

buf.x:

proc Nop {
    input: chan<()> in;
    output: chan<()> out;

    init { () }

    config(input: chan<()> in, output: chan<()> out) {
        (input, output)
    }

    next(_: ()) {
        let (tok, v) = recv(join(), input);
        send(tok, output, v);
    }

}


proc Top {
    input: chan<()> in;
    output: chan<()> out;

    init { () }

    config(input: chan<()> in, output: chan<()> out) {
        let (s, r) = chan<(), u32:1>("between");
        spawn Nop(input, s);
        spawn Nop(r, output);

        (input, output)
    }

    next(_: ()) { }
}

Run:

./bazel-bin/xls/dslx/ir_convert/ir_converter_main ./buf.x > buf.ir
./bazel-bin/xls/tools/opt_main ./buf.ir --top="__buf__Top__Nop_1_next" > buf.opt.ir
./bazel-bin/xls/tools/codegen_main ./buf.opt.ir --delay_model=unit --pipeline_stages=1 --reset="rst" --multi_proc

Output:

// -- snip --
module __buf__Top__Nop_1_next(
  input wire clk,
  input wire rst,
  input wire buf__input_vld,
  input wire buf__output_rdy,
  output wire buf__input_rdy,
  output wire buf__output_vld
);

  // -- snip --
  
  xls_fifo_wrapper #(
    .Width(32'd0),
    .Depth(32'd1),
    .EnableBypass(1'd1),
    .RegisterPushOutputs(1'd1),
    .RegisterPopOutputs(1'd0)
  ) fifo_buf__between (
    .clk(clk),
    .rst(rst),
    .push_valid(instantiation_output_134),
    .pop_ready(instantiation_output_141),
    .push_ready(instantiation_output_135),
    .pop_valid(instantiation_output_140)
  );

  // -- snip --

endmodule

I am not 100% sure how you consume this (I don't see an implementation of xls_fifo_wrapper in the repo), but here is my thinking:

The generated xls_fifo_wrapper instantiation uses the same Verilog module as a non-zero-width FIFO and omits the data inputs/outputs (because they get dropped by the port legalization pass, as far as I can tell).

For reference, a normal fifo:

  xls_fifo_wrapper #(
    .Width(32'd32),
    .Depth(32'd1),
    .EnableBypass(1'd0),
    .RegisterPushOutputs(1'd0),
    .RegisterPopOutputs(1'd0)
  ) fifo_internal (
    .clk(clk),
    .rst(rst),
    .push_data(instantiation_output_159),
    .push_valid(instantiation_output_160),
    .pop_ready(instantiation_output_167),
    .push_ready(instantiation_output_161),
    .pop_data(instantiation_output_165),
    .pop_valid(instantiation_output_166)
  );

This makes it a bit of a pain to write an implementation of xls_fifo_wrapper that works in both cases, since zero-width wires/ports are not legal.

In Verilog, it would be more natural to do the same split of FIFO control and data path:

module xls_fifo_wrapper #(
  parameter Width = 32,
  parameter Depth = 1,
  parameter EnableBypass = 0,
  parameter RegisterPushOutputs = 0,
  parameter RegisterPopOutputs = 0
) (
  input  wire                clk,
  input  wire                rst,
  input  wire [Width-1:0]    push_data,
  input  wire                push_valid,
  input  wire                pop_ready,
  output wire                push_ready,
  output wire [Width-1:0]    pop_data,
  output wire                pop_valid
);
  
  xls_fifo_control_wrapper #(
    .Depth(Depth),
    .EnableBypass(EnableBypass),
    .RegisterPushOutputs(RegisterPushOutputs),
    .RegisterPopOutputs(RegisterPopOutputs)
  ) fifo_buf__between (
    .clk(clk),
    .rst(rst),
    .push_valid(push_valid),
    .pop_ready(pop_ready),
    .push_ready(push_ready),
    .pop_valid(pop_valid)
    
    // Potentially more outputs to provide state to
    // data path here.
  );

  // -- DATA PATH HERE --

endmodule

In short - it may be sensible to give generated xls_fifo_wrappers a different name and omit the width parameter if it is of zero width. Something like:

  xls_fifo_control_wrapper #(
    .Depth(32'd1),
    .EnableBypass(1'd1),
    .RegisterPushOutputs(1'd1),
    .RegisterPopOutputs(1'd0)
  ) fifo_buf__between (
    .clk(clk),
    .rst(rst),
    .push_valid(instantiation_output_134),
    .pop_ready(instantiation_output_141),
    .push_ready(instantiation_output_135),
    .pop_valid(instantiation_output_140)
  );

WDYT?

@grebe
Collaborator

grebe commented Feb 3, 2025

In short - it may be sensible to give generated xls_fifo_wrappers a different name and omit the width parameter if it is of zero width

I agree with this. It's unfortunate that SystemVerilog parameterization doesn't really allow for conditionally including/excluding a port, but I think it's best to have a separate name rather than e.g. keeping 1-bit ports around and ignoring them. A 'nodata' FIFO is really a counter where the push side asserts incr on valid and the pop side asserts decr on ready. I think a counter is sufficiently simpler than a FIFO that I could imagine it being OK to implement in block IR directly without needing a wrapper?

@schilkp
Contributor Author

schilkp commented Feb 4, 2025

it's unfortunate that SystemVerilog ...

Yeah... SV is a can of worms, but it's the best (only realistic?) way of interfacing with the industrial ECAD tools...

Chip design is a good case study of what the world would look like if GCC/LLVM did not exist and the only acceptable compiler was a commercial COBOL one. Everything would have to be a COBOL transpiler.

I agree with this [...]. I think it's best to have a separate name rather than e.g. keeping 1-bit ports around and ignoring them.

If there is consensus on this, I am happy to PR this. Do you want it here or in a separate PR?

A 'nodata' FIFO is really a counter where the push side asserts incr on valid and the pop side asserts decr on ready.

To a first approximation yes, but a "proper" implementation needs to also account for bypass/transparency.
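
To make the counter-plus-bypass idea concrete, here is a minimal cycle-level sketch in C++. It is not the XLS block-IR implementation: the class name `NoDataFifo`, its methods, and the exact handshake semantics are assumptions chosen for illustration, and it only covers depth >= 1 with no output registering.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a zero-width ("nodata") FIFO: an occupancy counter
// plus an optional combinational bypass. Names and semantics are assumptions.
class NoDataFifo {
 public:
  struct Outputs {
    bool push_ready;
    bool pop_valid;
  };

  NoDataFifo(uint32_t depth, bool enable_bypass)
      : depth_(depth), enable_bypass_(enable_bypass) {
    assert(depth_ >= 1 && "this sketch only covers depth >= 1");
  }

  // Combinational outputs for the current cycle.
  Outputs Eval(bool push_valid) const {
    Outputs out;
    out.push_ready = count_ < depth_;
    // With bypass, a push arriving at an empty FIFO is visible to the pop
    // side in the same cycle.
    out.pop_valid = count_ > 0 || (enable_bypass_ && push_valid);
    return out;
  }

  // Advances the model by one clock cycle.
  void Tick(bool push_valid, bool pop_ready) {
    const Outputs out = Eval(push_valid);
    const bool push = push_valid && out.push_ready;  // "incr"
    const bool pop = pop_ready && out.pop_valid;     // "decr"
    if (push) ++count_;
    if (pop) --count_;
  }

  uint32_t count() const { return count_; }

 private:
  uint32_t depth_;
  bool enable_bypass_;
  uint32_t count_ = 0;
};
```

Without bypass, a simultaneous push and pop on an empty FIFO stores the token for one cycle; with bypass it passes straight through and the counter stays at zero, which is the behavior a plain incr/decr counter would miss.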

I think a counter is probably enough less complex than a FIFO that I could imagine it being OK to implement in block IR directly without needing a wrapper?

Well the code exists to implement both a full FIFO or nodata FIFO in IR - In my PR I split it so that everything before

  // Only generate data path and I/O if the data type is of non-zero width.
  if (have_data) {
    ...

is the control path/nodata FIFO and everything after is the data path/buffer.
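
As a rough illustration of how gating on `have_data` changes the block signature, the sketch below builds the port list for a given data width. The function `FifoPortNames` is hypothetical and is not part of XLS; only the port names are taken from the generated instantiations shown earlier in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns the ports a materialized FIFO block would
// expose, dropping the data ports when the data type is zero-width.
std::vector<std::string> FifoPortNames(int64_t data_width) {
  assert(data_width >= 0);
  const bool have_data = data_width > 0;

  // Control path: always present.
  std::vector<std::string> ports = {"clk", "rst"};
  if (have_data) ports.push_back("push_data");  // data path, input side
  ports.push_back("push_valid");
  ports.push_back("pop_ready");
  ports.push_back("push_ready");
  if (have_data) ports.push_back("pop_data");   // data path, output side
  ports.push_back("pop_valid");
  return ports;
}
```

For a width of 0 this yields the six control ports only, so an instantiation that never connects `push_data` matches the signature instead of failing the check from the error above.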

As a user I would be confused if nodata fifos are always materialized while data fifos are only sometimes materialized. I would either have them behave the same or provide a separate --materialize-no-data-fifos flag.

Again to my previous comment - I don't see many references to the fifo wrapper in XLS - is this something you actively use? Is there a reason that materializing FIFOs is not the default?

@grebe
Collaborator

grebe commented Feb 11, 2025

Sorry for the slow responses!

Do you want it here or a separate PR?

I think it should be a separate PR. To be clear, you're talking about the codegen of xls_fifo_wrapper, not the name used by the fifo materialization pass, right?

Reading the latest comment, I'm now realizing some of the context isn't very clear and probably this should be written up in a doc. I'll write a rough draft here:

Channels in proc IR are lowered into FIFO instantiations in block IR. These FIFO instantiations can end up as different things depending on what you're doing.

  1. Block interpreter: the block interpreter directly models a FIFO in pure software. The interpreter isn't supposed to mutate (i.e. run passes on) the IR, so this is a convenient way to model the FIFO.
  2. Block JIT: before converting XLS IR to LLVM IR, fifo instantiations are materialized (i.e. replaced with instantiations of blocks that model the FIFO) and inlined. This seemed better than trying to integrate the software model with the block JIT. The block IR of the FIFO gets converted to object code and more-or-less is hidden, so it's OK to mutate the IR.
  3. Codegen: by default, fifo instantiations turn into xls_fifo_wrapper instantiations. It is expected that users will provide their own good fifo implementation and wrap it in xls_fifo_wrapper.
  4. Codegen with materialized fifos: fifo instantiations are replaced by block IR and converted to RTL. This removes the need to provide your own fifo implementation, but beware: these fifos aren't really designed to have good PPA. The block IR was written to be functionally correct and maintainable, not to minimize state and/or timing paths. Probably there's a better fifo out there. For example, if you're on an FPGA it's better to use the FIFO IP the FPGA provides rather than hope the tools will infer a BRAM from our FIFO.

In summary, the purpose of the block-IR FIFOs is to model the behavior of a potentially-better-handwritten RTL FIFO. We use the materialize fifo pass when 1) modeling the FIFO w/ our JIT or 2) it's difficult to integrate a handwritten FIFO. Our FIFO isn't open source, so in the repo it might look like we use the materialized FIFO more than we actually do.

@schilkp
Contributor Author

schilkp commented Feb 11, 2025

Sorry for the slow responses!

No worries - and thanks both to you and @ericastor for the explanations here and in the MLIR fifo props issue.

I think it should be a separate PR. To be clear, you're talking about the codegen of xls_fifo_wrapper, not the name used by the fifo materialization pass, right?

Ok will do + yes exactly. I should have a PR of this up today or tomorrow.

@schilkp
Contributor Author

schilkp commented Feb 12, 2025

Just to be sure: Is there something you still need from my side or is this in the "review queue"?

No worries/rush if you are busy - just making sure it's not me blocking this :)

Collaborator

@grebe grebe left a comment

Once again, sorry to be so slow in responding.

I think the ideal place for coverage is materialize_fifos_pass_test.cc. Currently, that test hardcodes 32-bit inputs, but it looks like you could add bitwidth as a parameter fairly easily, as the places where it hardcodes 32 bits look like they immediately construct a type and pass it down.

To make this coverage work, I suspect you'll need to update the interpreter's FifoModel as the idea of the test is to compare the behavior from the interpreter's model to the block-IR implementation. I think you'll need to special case the data_.has_value() kind of checks to be data->GetType()->GetFlatBitCount() == 0 || data_.has_value().
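
A standalone sketch of that special case, using stand-in types rather than the real interpreter classes: `PushInput` and `DataInputSatisfied` are hypothetical names, and only the shape of the condition (zero flat bit count OR a present value) follows the suggestion above.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for what the interpreter-side model sees on the
// push side of a FIFO instantiation.
struct PushInput {
  int64_t flat_bit_count;        // flat bit count of the channel's data type
  std::optional<uint64_t> data;  // empty when no data port is driven
};

// A zero-width FIFO has no data port, so a missing value is acceptable;
// any other width still requires the value to be present.
bool DataInputSatisfied(const PushInput& in) {
  assert(in.flat_bit_count >= 0);
  return in.flat_bit_count == 0 || in.data.has_value();
}
```

The point of the check is that a zero-width push is accepted with no data value, while a missing value on a non-zero-width FIFO is still rejected.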

I think it should be OK for FifoInstantiation to stay the same, but ideally it would conditionally exclude the data port based on the type.

@allight, is there anything you can think of that I'm missing to test this?

xls/codegen/materialize_fifos_pass.cc (outdated review thread, resolved)
@allight
Collaborator

allight commented Feb 13, 2025

That all seems sufficient to me.

You'll probably need to update fifo_model_test_utils.h a bit too to make those support zero-width input/output but that shouldn't be that difficult.

@schilkp schilkp force-pushed the schilkp/fix_zerowidth_fifos branch from 0327604 to 42b4af6 Compare February 14, 2025 06:05
Gates the generation of the FIFO "data path" behind a non-zero data
width and adjusts the signature accordingly.
@schilkp schilkp force-pushed the schilkp/fix_zerowidth_fifos branch from 42b4af6 to 1451d9b Compare February 14, 2025 06:40