modules/zstd: Add support for decoding compressed blocks (DSLX part)#2230
modules/zstd: Add support for decoding compressed blocks (DSLX part)#2230wsipak wants to merge 83 commits intogoogle:mainfrom
Conversation
|
The PNR step of the action has been running for 7+ hours: https://github.com/google/xls/actions/runs/15171186417/job/42667677899?pr=2230 is that expected? |
There was a problem hiding this comment.
We modified the interface of the part responsible for decoding fame header, and it was easier for us to extend the DSLX tests. This also made the tests consistent with the rest of the modules.
There was a problem hiding this comment.
is that obsolete now w/ the new implementation?
There was a problem hiding this comment.
The part responsible for checking the magic number has been moved to FrameHeaderDecoder:
https://github.com/google/xls/pull/2230/files#diff-09700fe24633d73f1e4217266bae21f08269db322cc896aec4b946ecf9842cc3R428
| static const uint32_t random_frames_count = 100; | ||
| }; | ||
|
|
||
| // Test `random_frames_count` instances of randomly generated valid |
There was a problem hiding this comment.
curious if that would have done using fuzzing instead?
There was a problem hiding this comment.
We tried using fuzzing, but the problem is that the control sections need to follow a specific structure and match the properties of the generated data. It might be possible to use fuzzing with some constraints on the frame, but there are a lot of rules to follow, which makes it complicated. In the end, it was easier to use a tool made for creating random test data that meets these requirements.
There was a problem hiding this comment.
can we make all the verilog .sv files?
There was a problem hiding this comment.
After bumping the PR we noticed that some targets become longer then expected.
We are working on fixing that
@QuantamHD @mikesinouye is a 17h+ runtime expected for PnR w/ a block of this size, or are there some tweak we should do to the config to speed things up? |
There was a problem hiding this comment.
can you add:
# pytype binary, library"
near the other load statements that will help resolve the internal version of the python rules.
I notice that the We've done a lot of work to speed things up since the current rules_hdl version - I think this would be worth investigation after bumping again. We're still planning to do this soon. Everything but Qt is building using bazel in upstream OpenROAD, so we're almost there. |
|
Looks like there is an issue w/ executing some of DSLX tests: |
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Maciej Torhan <mtorhan@antmicro.com>
Remove references to buffer structs as those are not used anywhere Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Co-authored-by: Pawel Czarnecki <pczarnecki@antmicro.com> Co-authored-by: Robert Winkler <rwinkler@antmicro.com> Signed-off-by: Maciej Torhan <mtorhan@antmicro.com> Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com> Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Krzysztof Oblonczek <koblonczek@antmicro.com>
Signed-off-by: Maciej Torhan <mtorhan@antmicro.com>
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
* Use materialize_internal_fifos when possible * Disable the above option for procs with loopback channels * Add missing module names * Add xls_fifo_wrapper verilog dependency to the synthesis of the procs without materialized internal fifos Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com> Co-authored-by: Wojciech Sipak <wsipak@antmicro.com>
No longer applicable - there are no CC tests in ZSTD module Signed-off-by: Pawel Czarnecki <pczarnecki@antmicro.com>
Signed-off-by: Wojciech Sipak <wsipak@antmicro.com>
Improve formatting, wording and fix lint issues Co-authored-by: Dominik Lau <dlau@antmicro.com> Co-authored-by: Szymon Gizler <sgizler@antmicro.com> Signed-off-by: Wojciech Sipak <wsipak@antmicro.com>
Co-authored-by: Pawel Czarnecki <pczarnecki@antmicro.com> Co-authored-by: Krzysztof Obłonczek <koblonczek@antmicro.com>
a6afa36 to
a8497ec
Compare
Let's use predefined data first and only then add zstd as dependency. Signed-off-by: Wojciech Sipak <wsipak@antmicro.com>
fc26a47 to
aab0556
Compare
aab0556 to
8f93fcc
Compare
Signed-off-by: Robert Winkler <rwinkler@antmicro.com>
aab0556 to
54b5d01
Compare
|
It seems like GitHub complains about me not being CLA approved as a PR opener (even though I have been covered by the Google CLA as a commit author since ages). This may be due to some (new?) email privacy settings in GH which I now disabled but the PR was issued prior to that. To check if that was indeed the problem, I'm closing this PR and a follow up PR should be opened shortly. Sorry for the inconvenience, but it seems like a GH limitation. |
|
Moved to #2296 |
This PR adds the DSLX parts of #1857 in order not to depend on cocotb-related python packages
supersedes #1857
This PR extends the ZstdDecoder with support for decoding compressed blocks.
The decoder is capable of decoding RAW and RLE literals as well as sequences with predefined FSE tables.
A suite of DSLX tests comprising unit tests of all underlying procs and an integration test was prepared.
The integration test, similarly as in #1654, first generates a random valid ZSTD frame with compressed blocks and expected decoded output. Test data is then converted to a DSLX file (example) that is imported by the integration tests file.
At the beginning of the test, the default FSE decoding tables are filled with default distributions taken from RFC 8878 section 3.1.1.3.2.2. Default Distributions . Next, the encoded frame is loaded to the system memory and the decoder is configured through a set of CSRs to start the decoding process. The decoder starts the operation and writes the decoded frame back into the output buffer in the system memory. Once it finishes, it sends a pulse on the notify channel signaling the end of the decoding. The output of the decoder is compared against the decoding result from the reference library.
The PR introduces among others:
SequenceDecoder- responsible for decoding sequence sections of the compressed blocksFseDecoder- introduced as the core part of theSequenceDecoderRefillingShiftBuffer- used for storing and outputting in forward and backward fashion an arbitrary amount of bits required by the FSE decoderLiteralsDecoder- capable of decoding RAW, RLE and Huffman-coded literalsHuffmanDecoder- used in decoding huffman-coded literals. Decoded Huffman trees are then used to decode one or four Huffman-coded streams.CommandConstructor- this proc is responsible for sending packets with decoded sequences and literals to theSequenceExecutorprocRamMuxandRamDemux- procs used for handling requests/responses to multiple memory models. The procs interface with 3 separate memory buffers for FSE decoding tables.