-
Notifications
You must be signed in to change notification settings - Fork 203
modules/zstd: Add support for decoding compressed blocks (DSLX part) #2230
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The PNR step of the action has been running for 7+ hours: https://github.com/google/xls/actions/runs/15171186417/job/42667677899?pr=2230 is that expected? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason we're removin this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We modified the interface of the part responsible for decoding fame header, and it was easier for us to extend the DSLX tests. This also made the tests consistent with the rest of the modules.
xls/modules/zstd/magic.x
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that obsolete now w/ the new implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The part responsible for checking the magic number has been moved to FrameHeaderDecoder
:
https://github.com/google/xls/pull/2230/files#diff-09700fe24633d73f1e4217266bae21f08269db322cc896aec4b946ecf9842cc3R428
static const uint32_t random_frames_count = 100; | ||
}; | ||
|
||
// Test `random_frames_count` instances of randomly generated valid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curious if that would have done using fuzzing instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We tried using fuzzing, but the problem is that the control sections need to follow a specific structure and match the properties of the generated data. It might be possible to use fuzzing with some constraints on the frame, but there are a lot of rules to follow, which makes it complicated. In the end, it was easier to use a tool made for creating random test data that meets these requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we make all the verilog .sv
files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we bump the timeout?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After bumping the PR we noticed that some targets become longer then expected.
We are working on fixing that
@QuantamHD @mikesinouye is a 17h+ runtime expected for PnR w/ a block of this size, or are there some tweak we should do to the config to speed things up? |
xls/modules/zstd/BUILD
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add:
# pytype binary, library"
near the other load
statements that will help resolve the internal version of the python rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
I notice that the We've done a lot of work to speed things up since the current rules_hdl version - I think this would be worth investigation after bumping again. We're still planning to do this soon. Everything but Qt is building using bazel in upstream OpenROAD, so we're almost there. |
Looks like there is an issue w/ executing some of DSLX tests:
|
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Maciej Torhan <[email protected]>
Remove references to buffer structs as those are not used anywhere Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Co-authored-by: Pawel Czarnecki <[email protected]> Co-authored-by: Robert Winkler <[email protected]> Signed-off-by: Maciej Torhan <[email protected]> Signed-off-by: Pawel Czarnecki <[email protected]> Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Krzysztof Oblonczek <[email protected]>
Signed-off-by: Maciej Torhan <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
* Use materialize_internal_fifos when possible * Disable the above option for procs with loopback channels * Add missing module names * Add xls_fifo_wrapper verilog dependency to the synthesis of the procs without materialized internal fifos Signed-off-by: Pawel Czarnecki <[email protected]> Co-authored-by: Wojciech Sipak <[email protected]>
No longer applicable - there are no CC tests in ZSTD module Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Wojciech Sipak <[email protected]>
Improve formatting, wording and fix lint issues Co-authored-by: Dominik Lau <[email protected]> Co-authored-by: Szymon Gizler <[email protected]> Signed-off-by: Wojciech Sipak <[email protected]>
Co-authored-by: Pawel Czarnecki <[email protected]> Co-authored-by: Krzysztof Obłonczek <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
a6afa36
to
a8497ec
Compare
Let's use predefined data first and only then add zstd as dependency. Signed-off-by: Wojciech Sipak <[email protected]>
fc26a47
to
aab0556
Compare
aab0556
to
8f93fcc
Compare
Signed-off-by: Robert Winkler <[email protected]>
aab0556
to
54b5d01
Compare
It seems like GitHub complains about me not being CLA approved as a PR opener (even though I have been covered by the Google CLA as a commit author since ages). This may be due to some (new?) email privacy settings in GH which I now disabled but the PR was issued prior to that. To check if that was indeed the problem, I'm closing this PR and a follow up PR should be opened shortly. Sorry for the inconvenience, but it seems like a GH limitation. |
Moved to #2296 |
This PR adds the DSLX parts of #1857 in order not to depend on cocotb-related python packages
supersedes #1857
This PR extends the ZstdDecoder with support for decoding compressed blocks.
The decoder is capable of decoding RAW and RLE literals as well as sequences with predefined FSE tables.
A suite of DSLX tests comprising unit tests of all underlying procs and an integration test was prepared.
The integration test, similarly as in #1654, first generates a random valid ZSTD frame with compressed blocks and expected decoded output. Test data is then converted to a DSLX file (example) that is imported by the integration tests file.
At the beginning of the test, the default FSE decoding tables are filled with default distributions taken from RFC 8878 section 3.1.1.3.2.2. Default Distributions . Next, the encoded frame is loaded to the system memory and the decoder is configured through a set of CSRs to start the decoding process. The decoder starts the operation and writes the decoded frame back into the output buffer in the system memory. Once it finishes, it sends a pulse on the notify channel signaling the end of the decoding. The output of the decoder is compared against the decoding result from the reference library.
The PR introduces among others:
SequenceDecoder
- responsible for decoding sequence sections of the compressed blocksFseDecoder
- introduced as the core part of theSequenceDecoder
RefillingShiftBuffer
- used for storing and outputting in forward and backward fashion an arbitrary amount of bits required by the FSE decoderLiteralsDecoder
- capable of decoding RAW, RLE and Huffman-coded literalsHuffmanDecoder
- used in decoding huffman-coded literals. Decoded Huffman trees are then used to decode one or four Huffman-coded streams.CommandConstructor
- this proc is responsible for sending packets with decoded sequences and literals to theSequenceExecutor
procRamMux
andRamDemux
- procs used for handling requests/responses to multiple memory models. The procs interface with 3 separate memory buffers for FSE decoding tables.