-
Notifications
You must be signed in to change notification settings - Fork 203
modules/zstd: Add support for decoding compressed blocks (DSLX part) #2296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
modules/zstd: Add support for decoding compressed blocks (DSLX part) #2296
Conversation
Looks like the following target blow up over the internal memory limit for test (~10GB):
Is that expected? (possibly related to #1042?) |
@proppy yest the issue is caused by huge RAM consumption in DSLX interpreter in general, which was also reported in #1897 |
Is there a way to reduce the number of test cases so that it fits within 10G? |
import xls.modules.zstd.literals_buffer; | ||
import xls.modules.zstd.fse_table_creator; | ||
import xls.modules.zstd.ram_mux; | ||
// import xls.modules.zstd.zstd_frame_testcases as comp_frame; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is that commented on purpose? is the idea here to allow different type of frame to be tested depending on what you uncomment? (if yes maybe those should be split in different tests?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be a perfect solution, however currently the interpreted does not free memory between running separate test cases, which may be a problem for this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also tried using ir-jit
evaluator for dslx tests, but it works just for 29/55 of them. The other are failing due to the following reasons:
- Incorrect token usage in the for loop which cannot be converted to ir (these tests should use
unroll_for!
instead) IrConversionError: Could not find name for invocation: 'enumerate'
, indicating that theenumerate
has no representation in IR?- Just a SegmanetationFault on many tests
I noticed that OpenROAD failed with a segmentation fault in the CI. We've actually seen quite a bit of instability with this tool internally. In many cases, simply rerunning the flow has resolved the issue. |
@QuantamHD / @mikesinouye fyi. |
We might need to bump openroad. There were some tsan issues resolved recently. |
xls/modules/zstd/BUILD
Outdated
tags = ["manual"], | ||
target_die_utilization_percentage = "10", | ||
) | ||
# TODO: Uncomment after resolving problems with too long P&R |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't manual
be enough to avoid triggering it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The manual
tag is used to exclude a target from the //xls/modules/zstd:all
and //xls/modules/zstd/...
targets. We use it for all targets except xls_dslx_library
so that the main CI can typecheck the zstd part without running other expensive checks.
Hoverwer, there is a dedicated CI for zstd. To run all manual
targets, it explicitly lists them using bazel query
, which searches for groups of targets with common suffixes. I made these rules more strict to allow re-enabling some previously commented-out targets. To skip them safely, I used the _skip
suffix in the target names so the CI won't match them.
@QuantamHD's current work is finishing the Qt bazel build hookup for upstream OpenROAD. Once that's finalized, we can clean rules_hdl's deps and bump the pointer to head for OR. There have been many stability improvements in the last 15 months so hopefully that addresses whatever issues you're observing. I think this will happen soonish, once we do let's revisit this to confirm it does. |
b352452
to
19e2fba
Compare
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Maciej Torhan <[email protected]>
Remove references to buffer structs as those are not used anywhere Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Pawel Czarnecki <[email protected]>
* Use materialize_internal_fifos when possible * Disable the above option for procs with loopback channels * Add missing module names * Add xls_fifo_wrapper verilog dependency to the synthesis of the procs without materialized internal fifos Signed-off-by: Pawel Czarnecki <[email protected]> Co-authored-by: Wojciech Sipak <[email protected]>
No longer applicable - there are no CC tests in ZSTD module Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Wojciech Sipak <[email protected]>
Improve formatting, wording and fix lint issues Co-authored-by: Dominik Lau <[email protected]> Co-authored-by: Szymon Gizler <[email protected]> Signed-off-by: Wojciech Sipak <[email protected]>
Co-authored-by: Pawel Czarnecki <[email protected]> Co-authored-by: Krzysztof Obłonczek <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Let's use predefined data first and only then add zstd as dependency. Signed-off-by: Wojciech Sipak <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
Signed-off-by: Robert Winkler <[email protected]>
The CI setup for the zstd module uses a bazel query to find groups of targets that share a common suffix. By appending `_skip` to a target name, the query no longer matches it, effectively excluding that target from the CI run. Signed-off-by: Robert Winkler <[email protected]>
Make Bazel queries stricter so that target names must end exactly with the specified suffix, without allowing any additional characters after it. Signed-off-by: Robert Winkler <[email protected]>
cffde4e
to
484cc69
Compare
This PR adds the DSLX parts of #1857 in order not to depend on cocotb-related python packages supersedes #1857
This PR extends the ZstdDecoder with support for decoding compressed blocks.
The decoder is capable of decoding RAW and RLE literals as well as sequences with predefined FSE tables. A suite of DSLX tests comprising unit tests of all underlying procs and an integration test was prepared. The integration test, similarly as in #1654, first generates a random valid ZSTD frame with compressed blocks and expected decoded output. Test data is then converted to a DSLX file (example) that is imported by the integration tests file. At the beginning of the test, the default FSE decoding tables are filled with default distributions taken from RFC 8878 section 3.1.1.3.2.2. Default Distributions . Next, the encoded frame is loaded to the system memory and the decoder is configured through a set of CSRs to start the decoding process. The decoder starts the operation and writes the decoded frame back into the output buffer in the system memory. Once it finishes, it sends a pulse on the notify channel signaling the end of the decoding. The output of the decoder is compared against the decoding result from the reference library.
The PR introduces among others:
SequenceDecoder
- responsible for decoding sequence sections of the compressed blocksFseDecoder
- introduced as the core part of theSequenceDecoder
RefillingShiftBuffer
- used for storing and outputting in forward and backward fashion an arbitrary amount of bits required by the FSE decoderLiteralsDecoder
- capable of decoding RAW, RLE and Huffman-coded literalsHuffmanDecoder
- used in decoding huffman-coded literals. Decoded Huffman trees are then used to decode one or four Huffman-coded streams.CommandConstructor
- this proc is responsible for sending packets with decoded sequences and literals to theSequenceExecutor
procRamMux
andRamDemux
- procs used for handling requests/responses to multiple memory models. The procs interface with 3 separate memory buffers for FSE decoding tables.