Conversation

@mdanilow mdanilow commented Mar 4, 2025

Faster implementation of finn.util.data_packing.packed_bytearray_to_finnpy. It uses bit shifts instead of the previous approach and is needed for real-time processing on PYNQ.
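
For readers unfamiliar with the technique, here is a minimal sketch of what bit-shift-based unpacking can look like in NumPy. It is not the code from this PR. The function name `combine_bytes_bitshift` and the parameter `bytes_per_elem` are hypothetical. The sketch assumes big-endian byte order on the last axis, a last-axis length divisible by `bytes_per_elem`, and at most 8 bytes per element. It ignores any other options the real `packed_bytearray_to_finnpy` may handle.

```python
import numpy as np

def combine_bytes_bitshift(packed, bytes_per_elem):
    """Combine big-endian bytes on the last axis into unsigned integers.

    Hypothetical helper for illustration only, not the FINN implementation.
    """
    packed = np.asarray(packed, dtype=np.uint8)
    n_elems = packed.shape[-1] // bytes_per_elem
    # Split the last axis into (element index, byte index within element).
    grouped = packed.reshape(packed.shape[:-1] + (n_elems, bytes_per_elem))
    # Most significant byte first: shift amounts 8*(k-1), ..., 8, 0.
    shifts = np.arange(bytes_per_elem - 1, -1, -1, dtype=np.uint64) * np.uint64(8)
    # Shift every byte into position, then add the bytes of each element.
    return (grouped.astype(np.uint64) << shifts).sum(axis=-1, dtype=np.uint64)

packed = np.array([[0x01, 0x02, 0xFF, 0x00]], dtype=np.uint8)
values = combine_bytes_bitshift(packed, 2)  # two bytes per output element
```

Because the shift and the sum are each a single vectorized operation over the whole array, no per-element Python work is needed, which is where a speedup over string- or bit-array-based unpacking would be expected to come from.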

@fpjentzsch
Collaborator

Might be a good idea to review/merge this PR together with #1172 (@bwintermann).

@bwintermann

I indeed ran into the same issue and opened the PR mentioned by @fpjentzsch because of it. Did you measure how much faster this is than the previous approach? I would be quite happy if we didn't have to load some C code to get fast execution speed.

@fpjentzsch
Collaborator

I tried this on a MNv1 with FLOAT32 output and got this error because you expect a 5-dimensional output shape:

  File "/home/xilinx/jupyter_notebooks/ras_end2end_val_manual/measurement/driver/finn/util/data_packing.py", line 451, in packed_bytearray_to_finnpy
    result[:, :, :, :, i] += ret[:, :, :, :, packing*i + fold] << (packing - 1 - fold) * 8
IndexError: too many indices for array: array is 3-dimensional, but 5 were indexed

After adjusting the code from 5 to 3 dimensions, it worked for this case and sped up the output data unpacking by ~3000x.
The new bottleneck is now the ImageNet input data pre-processing, so my overall speedup was ~14x.
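
One way to avoid hard-coding the number of dimensions is to index with an ellipsis (`...`), which stands for however many leading axes the array has. The sketch below mirrors the line from the traceback under that change. It is an illustration, not the fix applied in this PR. The function name `accumulate_bytes_any_rank` is hypothetical, and the sketch assumes `packing` is at most 4 so that results fit in `uint32`.

```python
import numpy as np

def accumulate_bytes_any_rank(ret, packing):
    """Shift-and-accumulate over the last axis for arrays of any rank.

    Hypothetical helper for illustration only, not the FINN implementation.
    """
    ret = np.asarray(ret, dtype=np.uint8)
    n_elems = ret.shape[-1] // packing
    result = np.zeros(ret.shape[:-1] + (n_elems,), dtype=np.uint32)
    for i in range(n_elems):
        for fold in range(packing):
            # `...` replaces the fixed `:, :, :, :` prefix, so the same line
            # works for 3-dimensional and 5-dimensional inputs alike.
            shift = np.uint32((packing - 1 - fold) * 8)
            result[..., i] += ret[..., packing * i + fold].astype(np.uint32) << shift
    return result

ret3 = np.arange(2 * 3 * 8, dtype=np.uint8).reshape(2, 3, 8)
out3 = accumulate_bytes_any_rank(ret3, 4)                          # 3-D input
out5 = accumulate_bytes_any_rank(ret3.reshape(1, 1, 2, 3, 8), 4)   # 5-D input
```

The Python loops here run only over the elements of the last axis, while every leading axis is handled by vectorized NumPy operations.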
