IP Caching #99
base: dev
Conversation
Caching should now be enabled out of the box. If no arguments are given, the cache directory is placed in the root of the finn-plus repository. There are still some open points to consider:
Thanks! For operators like the MVAU, we might need to hash the content of the weight/threshold tensors in addition to the CustomOp attributes and the finn-plus commit hash.
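A minimal sketch of what such a key derivation could look like (the function name `ip_cache_key` and its signature are hypothetical, not the actual finn-plus API; it only illustrates hashing attributes plus tensor contents plus the commit hash):

```python
import hashlib
from typing import Optional

import numpy as np

def ip_cache_key(attrs: dict, weights: np.ndarray,
                 thresholds: Optional[np.ndarray], commit: str) -> str:
    """Hypothetical sketch: derive a cache key from CustomOp attributes,
    parameter tensor contents, and the finn-plus commit hash."""
    h = hashlib.sha256()
    h.update(commit.encode())
    for name in sorted(attrs):  # sort attributes for a stable ordering
        h.update(f"{name}={attrs[name]}\n".encode())
    h.update(np.ascontiguousarray(weights).tobytes())
    if thresholds is not None:
        h.update(np.ascontiguousarray(thresholds).tobytes())
    return h.hexdigest()
```

Any change to an attribute, a tensor value, or the commit hash then yields a different key.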
For the MVAU_hls I already implemented it somewhat, by hashing
I've now added this. Additionally, I covered (I think) all operators that use external parameters and added their parameters to the lookup key. Your ops should now be covered as well, @iksnagreb. I'll add some tests and try it out with different models, and then it should be ready to merge.
📋 Docstring Check Report

❌ Docstring check failed! Checked files with missing docstrings:

- 📄 src/finn/builder/build_dataflow_steps.py
- 📄 src/finn/custom_op/fpgadataflow/hls/__init__.py
- 📄 src/finn/custom_op/fpgadataflow/rtl/__init__.py
- 📄 src/finn/interface/run_finn.py
- 📄 src/finn/transformation/fpgadataflow/ip_cache.py
- 📄 src/finn/util/deps.py

Total missing docstrings: 23

How to fix: please add docstrings to the missing functions, classes, and modules listed above.
```python
def step_ip_generation(model: ModelWrapper, cfg: DataflowBuildConfig) -> ModelWrapper:
    """Unified step, that does what step_hw_codegen and step_hw_ipgen did before. (With cache!)."""
    if cfg.use_ip_caching:
```
step_hw_ipgen has more info logging here than this step. Should be consistent between both (and probably lean towards verbose logging for a critical feature like this).
Do you mean generally more logging during the IP Cache transformation or during the steps?
Wherever it fits best. I think we should always log summary information, like
- Caching enabled/disabled (along with global cache settings if enabled)
- How many IPs in total are present in the cache
- Restoring x out of y layers from cache
- Placing z out of y layers in the cache that were not previously cached
And in verbose/debug mode info like this should be printed per-layer.
I think most of this is already in place.
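The summary logging described above could look roughly like this (a sketch only; the function name, signature, and logger name are made up, not the finn-plus implementation):

```python
import logging

log = logging.getLogger("finn.ip_cache")

def log_cache_summary(enabled: bool, ips_in_cache: int,
                      restored: int, placed: int, total_layers: int) -> None:
    """Hypothetical sketch of the summary logging proposed in the review."""
    if not enabled:
        log.info("IP caching disabled")
        return
    log.info("IP caching enabled, %d IPs present in cache", ips_in_cache)
    log.info("Restoring %d out of %d layers from cache", restored, total_layers)
    log.info("Placing %d out of %d layers in the cache", placed, total_layers)
```

Per-layer details would then only be emitted at debug level.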
I'll have a look at what gets logged when, and revise it if necessary.
```python
if cfg.use_ip_caching:
    log.info("Using IP cache to fetch generated IPs...")
    clk = cfg._resolve_hls_clk_period()
    if clk is None and cfg.cache_hls_clk_period:
```
I don't think we need this level of error handling if somehow no (HLS) clock is specified. It adds bloat to this already long file and this condition shouldn't be possible, especially with the latest changes to DataflowBuildConfig, where synth_clk_period_ns is no longer Optional[] and has a default value. _resolve_hls_clk_period always falls back to this clock.
The same comment applies to step_ip_generation.
Yes, this was before the Small Fixes PR, so the clocks could still be None, and thus I checked them. Going to remove this.
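The fallback behaviour being referenced can be illustrated with a stand-in dataclass (not the real DataflowBuildConfig; field and method names are approximations for illustration):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class BuildConfigSketch:
    """Hypothetical stand-in for DataflowBuildConfig after the Small Fixes PR."""
    synth_clk_period_ns: float = 10.0          # no longer Optional, has a default
    hls_clk_period_ns: Optional[float] = None  # may be left unset

    def resolve_hls_clk_period(self) -> float:
        # Always falls back to the synthesis clock, so the result is never None.
        if self.hls_clk_period_ns is not None:
            return self.hls_clk_period_ns
        return self.synth_clk_period_ns
```

With a non-optional synthesis clock, the `clk is None` branch can indeed never be taken.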
```python
if sys.platform != "win32":
    self.max_hash_len = os.pathconf("/", "PC_NAME_MAX")
    self.max_path_len = os.pathconf("/", "PC_PATH_MAX")
else:
```
We don't need to support Windows at all.
These lines don't impact us much, but I guess I could remove them. Do we have any checks for whether we are on Windows? If not, we should maybe explicitly add one to the CLI of FINN+ so that users don't try to run it on Windows in the first place.
I wouldn't bother at all with this. We already assume a very specific Ubuntu setup with heavy toolchains and a long list of apt packages installed. We can assume every user has read the basic quickstart documentation.
Maybe I'll add it into the frontend rework still. Only a few lines and could potentially help users. But either way that's another topic. I'll remove it here.
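If such a guard is added to the CLI, it could be as small as the following (hypothetical: no such function exists in finn-plus yet, and the message text is made up):

```python
import sys

def require_linux(platform: str = sys.platform) -> None:
    """Hypothetical early CLI guard: the FINN+ toolchain assumes Linux."""
    if not platform.startswith("linux"):
        raise SystemExit(
            f"finn-plus is only supported on Linux (got {platform!r}); "
            "see the quickstart documentation."
        )
```

Calling it once at CLI entry would fail fast on Windows or macOS instead of during a build.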
```python
parambytes = _ndarray_to_bytes(model.get_initializer(op.onnx_node.input[1]))
array_hash = self.hasher(parambytes).hexdigest()
return f"param_hash:{array_hash}\n"
elif isinstance(op, (ElementwiseBinaryOperation,)):
```
Do ElementwiseBinaryOperations really always have 2 initializers @iksnagreb? I don't see any error handling here.
No, they can have up to two initializer inputs, though in practice that should probably not even happen.
Thanks for the heads-up, I'll add handling for different cases here.
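Handling a variable number of initializer inputs could look like this (a sketch; `hash_initializer_inputs` is a hypothetical helper, and `get_initializer` stands in for `ModelWrapper.get_initializer`, which returns None for non-initializer inputs):

```python
import hashlib
import numpy as np

def hash_initializer_inputs(get_initializer, input_names) -> str:
    """Hypothetical sketch: hash only the initializer inputs that actually
    exist, since ElementwiseBinaryOperation may have zero, one, or two."""
    h = hashlib.sha256()
    for name in input_names:
        arr = get_initializer(name)
        if arr is None:
            # The input is produced at runtime, not a stored parameter.
            h.update(f"{name}:dynamic\n".encode())
        else:
            h.update(np.ascontiguousarray(arr).tobytes())
    return h.hexdigest()
```

This keeps the key stable regardless of which of the two inputs happens to carry a parameter.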
```python
op.set_nodeattr("ipgen_path", str(ip_dir))
op.set_nodeattr("gen_top_module", new_module_name)

elif issubclass(type(op), HLSBackend):
```
Just out of curiosity: Do HLS nodes really not need any (ugly) renaming of module names like the RTL nodes do? What if op.onnx_node.name is different in the current model than in the cached version?
This could be an issue. I noticed the RTL side because a layer was re-used during testing, but I didn't have an example that does this with an HLS layer yet. It might still work though, since I am not sure if later transformations care about the name of the IP core. I'll check this today.
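For reference, the RTL-side renaming essentially amounts to rewriting the cached module name; a naive sketch (a whole-word textual replace, which the actual implementation may do more carefully):

```python
import re
from pathlib import Path

def rename_top_module(verilog_file: Path, old_name: str, new_name: str) -> None:
    """Hypothetical sketch: patch the module declaration and instantiations in
    a cached Verilog file so they match the current node name."""
    text = verilog_file.read_text()
    # \b prevents e.g. "MVAU_rtl_0_wrapper" from being clobbered by "MVAU_rtl_0"
    text = re.sub(rf"\b{re.escape(old_name)}\b", new_name, text)
    verilog_file.write_text(text)
```

Whether HLS IPs need the same treatment depends on whether later transformations ever look up the cached top-level name.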
```python
from finn.transformation.fpgadataflow.specialize_layers import SpecializeLayers
from finn.util.basic import alveo_part_map
from finn.util.deps import get_cache_path
from tests.fpgadataflow.test_fpgadataflow_mvau import make_single_fclayer_modelwrapper
```
This import doesn't work, at least in the CI where FINN+ is launched as a pip-installed package.
Currently test_split_large_fifos and test_fifosizing_linear fail, but I couldn't reproduce it locally yet, even when running multiple different test variants after each other. Is it possible we are running into race conditions with this caching system if multiple tests, builds, or NodeLocalTransformations within a graph run simultaneously?
I ran into this before as well. It should not be a race condition as far as I am aware. I think the error originates from the fact that there simply is no
Yes, this is an issue I also ran into in #113. Using
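If races do turn out to be the culprit, one common mitigation is to make cache writes atomic; a sketch (the function name is hypothetical, the pattern is the standard write-to-temp-then-rename idiom):

```python
import os
import tempfile
from pathlib import Path

def atomic_cache_store(cache_dir: Path, entry_name: str, payload: bytes) -> None:
    """Hypothetical sketch: write to a temp file in the same directory and
    rename it into place, so a concurrent reader never sees a partial entry."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
        os.replace(tmp_path, cache_dir / entry_name)  # atomic on POSIX
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The temp file must live on the same filesystem as the final path for `os.replace` to stay atomic.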
```python
if attributes is not None:
    CACHE_IP_DEFINITIONS[op_cls]["use"] = attributes
else:
    # List of fields that don't define the IP core itself,
```
I think it should be safe to add the following attributes to this list of default ignored attributes. Most of them are optional (set to "" or [] by default) attributes of HWCustomOp, but they currently still show up in the key file.
- `backend`
- `preferred_impl_style`
- `exec_mode`
- `rtlsim_trace`
- `slr`
- `mem_port`
- `partition_id`
- `device_id`
- `inFIFODepths`
- `outFIFODepths`
- `output_hook`
- `io_chrc_in`
- `io_chrc_out`
- `io_chrc_period`
- `io_chrc_pads_in`
- `io_chrc_pads_out`
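Filtering these out before building the key file could look like this (a sketch; the names `DEFAULT_IGNORED_ATTRS` and `key_attributes` are hypothetical, the attribute list is taken from the comment above):

```python
# Attributes that do not define the IP core itself and should not
# contribute to the cache key (list from the review comment).
DEFAULT_IGNORED_ATTRS = {
    "backend", "preferred_impl_style", "exec_mode", "rtlsim_trace", "slr",
    "mem_port", "partition_id", "device_id", "inFIFODepths", "outFIFODepths",
    "output_hook", "io_chrc_in", "io_chrc_out", "io_chrc_period",
    "io_chrc_pads_in", "io_chrc_pads_out",
}

def key_attributes(attrs: dict) -> dict:
    """Hypothetical helper: keep only the attributes that define the IP."""
    return {k: v for k, v in attrs.items() if k not in DEFAULT_IGNORED_ATTRS}
```

Two nodes differing only in, say, FIFO depths or an SLR assignment would then share a cache entry.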
```python
# Prepare some always needed values
# FINN Commit
self.finn_commit = subprocess.run(
    shlex.split("git rev-parse HEAD"),
```
This doesn't work in the CI, where FINN+ is installed as package and not from the repo. I.e., I get the following and the FINN+ commit does not contribute to the key, since there is no sanity check:
```
FINN Commit reads: (authored at: )
HLSLIB Commit reads: 5dde96382b84979c6caa6f34cdad2ac72fa28489 (authored at: 2025-07-14 11:35:11 +0200)
```
Can we use "poetry version" or some pip command to use the version string as key instead? It should be PEP 440 compliant, i.e., contain the latest short commit hash.
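One way this could be done (a sketch; the function name is hypothetical, and it assumes the installed-package and git-checkout cases are the only two to handle) is to prefer the package metadata and fall back to git:

```python
import subprocess
from importlib import metadata

def version_key(package: str = "finn-plus") -> str:
    """Hypothetical sketch: prefer the installed package version (PEP 440,
    which can embed a short commit hash) over `git rev-parse HEAD`, which
    only works inside a repository checkout."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"], capture_output=True, text=True
        )
        commit = result.stdout.strip()
        if not commit:
            # Sanity check: never let an empty value silently weaken the key.
            raise RuntimeError(f"cannot determine a version key for {package}")
        return commit
```

Either way, an explicit failure is better than an empty string contributing nothing to the key.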
Currently, the caching does not save any time, because
We likely need to split up
This issue has been previously discussed here:
When I ran the test suite a second time to test the effects of caching, I got 6 more failures than on the first run:
Might be worth looking into these.
It would be useful, especially for larger models, to have cached IP cores, so that we can mostly skip IP-generation steps which, after synthesis itself, take the longest time in the build flow. It is also helpful for certain tests that require repeated simulations with the same IP. The idea was first mentioned in Xilinx#174 and implemented in Xilinx#349 but never merged as far as I can see. We should maybe implement this as well.
Here is a proposal of some changes to the original / some implementation ideas:
- `@cache_ip(attrs="..., ..., ...")`. To cover the case above, we would simply use `@cache_ip()`.
- `@no_cache` and `@custom_cache(func=...)`
- `HLSSynthIP()`)
- `True`.
- `FINN_DEPS` and (definitely) `FINN_BUILD_DIR`, since we want it to be usable across runs and maybe independent of the FINN executable. It should be configurable in `settings.yaml` or per env var or per command-line argument, like any other configuration option.
- `finn-hlslib`, `finn-plus` itself and any other required repository should be part of the hash key
- (`in_width: 10` and `out_width: 10` could yield the same hash)
- `n` characters of the hex representation as a prefix, etc.

@iksnagreb @LinusJungemann @fpjentzsch do you have additions, suggestions, changes, etc.? Once we have a fixed idea I'd start implementing it.
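The hash-prefix directory layout mentioned above could be sketched like this (hypothetical helper name; it only illustrates sharding cache entries by the first n hex characters of the digest):

```python
import hashlib
from pathlib import Path

def cache_entry_path(cache_root: Path, key_material: str,
                     prefix_len: int = 2) -> Path:
    """Hypothetical sketch: shard cache entries by the first n hex characters
    of the hash so no single directory grows unboundedly and entry names stay
    well below PC_NAME_MAX."""
    digest = hashlib.sha256(key_material.encode()).hexdigest()
    return cache_root / digest[:prefix_len] / digest
```

A sha256 hex digest is 64 characters, so entry names stay far below typical 255-character filename limits.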
Checks until PR ready:
Optional: