@psavery
Collaborator

@psavery psavery commented Aug 22, 2025

Most of these speedups apply when generating many polar views in a row with the same instrument (i.e., cache_coordinate_map=True when instantiating the PolarView class), but a few of them also improve the performance of creating the PolarView in general.

Overall, we achieved at least a 10x speedup (in serial) for generating many images in a row, and an even larger speedup for some examples. The improvements are as follows:

  1. Use a "moving sum" to sum the detector images. This is particularly valuable when there are many detectors (e.g., a 32-subpanel Eiger). It eliminates the requirement of stacking all the images together before performing the sum; instead, each image can be discarded immediately after it is included in the sum.
  2. Precompute bilinear interpolation parameters, including the weights and indices of detector pixels used for each polar pixel. As long as the instrument doesn't change, these interpolation parameters don't change either, and thus we don't need to generate them every time. They are now included in the cache map.
  3. Run bilinear interpolation in numba. This is faster in the single-threaded case, and it also improves parallelism via threading since numba releases the GIL.
  4. Instantiate and provide a reusable output buffer for the interpolation. We reuse the output buffer for every detector, so that we don't have to repeatedly allocate a chunk of memory for the output for every detector, as we were doing before. (Superseded by number 6.)
  5. Generate nan mask ahead of time. When we reuse the same instrument for many images in a row, we can now reuse the same nan mask, since it should not change. Skipping the step where we regenerate the nan mask provided a considerable performance improvement.
  6. Avoid creating the full-size intermediate image immediately before the moving sum. Instead, we can just add valid values to the moving sum directly. This provided yet another big speedup.
  7. Completely eliminate all intermediary arrays, so that the weighted interpolation is always computed and set directly into the output array.
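As an illustration of item 5 in the list above, here is a minimal sketch of computing a nan mask once and reusing it for every image. It assumes the mask depends only on the instrument geometry, as the description states. The names `compute_nan_mask`, `MaskCache`, and `on_panel_masks` are hypothetical and are not taken from hexrd.

```python
import numpy as np


def compute_nan_mask(on_panel_masks):
    """Return True where no detector covers a polar pixel.

    `on_panel_masks` is a hypothetical list of boolean arrays, one per
    detector, that are True where the polar pixel lands on that panel.
    """
    covered = np.zeros(on_panel_masks[0].shape, dtype=bool)
    for mask in on_panel_masks:
        covered |= mask
    return ~covered


class MaskCache:
    """Hypothetical cache: the mask is computed once per instrument."""

    def __init__(self, on_panel_masks):
        self.nan_mask = compute_nan_mask(on_panel_masks)

    def finalize(self, summed):
        # Reuse the cached mask instead of regenerating it per image.
        out = summed.astype(float, copy=True)
        out[self.nan_mask] = np.nan
        return out
```

With a cache like this, only `finalize` runs per image; the mask itself is rebuilt only when the instrument changes.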

I tested these changes a fair amount, and the output looks completely identical to what it was before.

Rather than storing all detector images until the end and then summing
them, sum them as they are created. This saves memory and can save a
substantial amount of time, especially for 32-subpanel Eiger.

For a high-resolution polar view and a 32-subpanel Eiger, we found a
30% speedup in polar view generation time.

Signed-off-by: Patrick Avery <[email protected]>
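The difference between stacking and a moving sum can be sketched as follows. This is a simplified illustration using plain NumPy arrays, not the hexrd code; `stacked_sum` and `moving_sum` are hypothetical names.

```python
import numpy as np


def stacked_sum(images):
    """Old approach: keep every image alive, stack them, then sum."""
    return np.sum(np.stack(list(images)), axis=0)


def moving_sum(images):
    """New approach: add each image to a running total as it arrives.

    Only the running total and the current image are alive at any time,
    so each image can be discarded right after it is added.
    """
    total = None
    for img in images:
        if total is None:
            total = np.array(img, dtype=float)  # copy, so the input is untouched
        else:
            total += img
    return total
```

Both functions accept any iterable, so `moving_sum` can consume a generator that produces one detector image at a time.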
Applying the panel buffer can be a time-consuming operation to perform repeatedly. When we
reuse the same images over and over again, we should apply the panel
buffer beforehand and skip this step when generating the polar view.

This results in yet another 40% speedup for generating a bunch of
high-resolution polar images for a 32-subpanel Eiger.

Signed-off-by: Patrick Avery <[email protected]>
For running repeated caking using the same instrument parameters, it
can be substantially faster to precompute the bilinear interpolation
parameters and just re-use them every time.

This resulted in yet another ~50% speedup on top of the other speedups.

Signed-off-by: Patrick Avery <[email protected]>
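A minimal sketch of the precomputation idea, assuming finite fractional (row, column) coordinates for each polar pixel and a source image with at least two rows and two columns. The function names and the layout of the cached parameters are hypothetical and may differ from what hexrd stores in its cache.

```python
import numpy as np


def precompute_bilinear(rows, cols, shape):
    """Compute per-pixel neighbor indices and weights once.

    rows, cols: 1D float arrays of source-image coordinates.
    shape: (nrows, ncols) of the source image.
    Returns (valid, indices, weights); indices address the flattened image.
    """
    nrows, ncols = shape
    valid = (rows >= 0) & (rows <= nrows - 1) & (cols >= 0) & (cols <= ncols - 1)
    # Clip so that the 2x2 neighborhood always stays inside the image.
    r0 = np.clip(np.floor(rows).astype(np.intp), 0, nrows - 2)
    c0 = np.clip(np.floor(cols).astype(np.intp), 0, ncols - 2)
    fr = rows - r0
    fc = cols - c0
    base = r0 * ncols + c0
    indices = np.stack([base, base + 1, base + ncols, base + ncols + 1], axis=1)
    weights = np.stack(
        [(1 - fr) * (1 - fc), (1 - fr) * fc, fr * (1 - fc), fr * fc], axis=1
    )
    return valid, indices[valid], weights[valid]


def apply_bilinear(image, valid, indices, weights):
    """Interpolate one image using the cached parameters."""
    out = np.full(valid.shape, np.nan)
    out[valid] = (image.ravel()[indices] * weights).sum(axis=1)
    return out
```

The parameters are computed once per instrument configuration, and `apply_bilinear` is then called for each new image with the same `valid`, `indices`, and `weights`.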
The numba implementation of the bilinear interpolation runs faster and is more multi-thread friendly.

We are also now allowing an output buffer to be passed to the
bilinear interpolation. This means that we don't have to instantiate
a new output every time.

Signed-off-by: Patrick Avery <[email protected]>
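A sketch of what a numba kernel with a caller-supplied output buffer might look like. The kernel name and argument layout are hypothetical. Note that numba only releases the GIL when `nogil=True` is passed to the decorator, so this sketch assumes that option is used. The import falls back to a no-op decorator so the example still runs without numba installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback so the sketch runs (slowly) as plain Python without numba.
    def njit(*args, **kwargs):
        def decorator(func):
            return func
        return decorator


@njit(nogil=True)
def interp_into(flat_image, indices, weights, out):
    """Write the weighted sum of four neighbors for each pixel into `out`.

    flat_image: 1D float array (flattened detector image)
    indices:    (n, 4) integer array of neighbor positions in flat_image
    weights:    (n, 4) float array of bilinear weights
    out:        preallocated 1D float array of length n, overwritten in place
    """
    for i in range(indices.shape[0]):
        acc = 0.0
        for k in range(4):
            acc += flat_image[indices[i, k]] * weights[i, k]
        out[i] = acc
```

Because `out` is passed in, the caller can allocate it once and reuse it across detectors and images.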
This gives us a significant speedup (perhaps 2x) for warping many
images in a row that use the same instrument config.

Signed-off-by: Patrick Avery <[email protected]>
@codecov

codecov bot commented Aug 22, 2025

Codecov Report

❌ Patch coverage is 86.88525% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 49.29%. Comparing base (06a6678) to head (f88a57d).
⚠️ Report is 13 commits behind head on master.

Files with missing lines        Patch %   Lines
hexrd/instrument/detector.py    72.00%    7 Missing ⚠️
hexrd/projections/polar.py      97.22%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #833      +/-   ##
==========================================
+ Coverage   49.26%   49.29%   +0.03%     
==========================================
  Files         143      143              
  Lines       23109    23135      +26     
==========================================
+ Hits        11385    11405      +20     
- Misses      11724    11730       +6     


We can actually completely avoid the intermediate full-size image, and
just add valid values to the running sum as we go. This provides yet
another 30% speedup on top of all of the other speedups.

Signed-off-by: Patrick Avery <[email protected]>
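The idea of skipping the full-size intermediate image can be sketched as below. It assumes that each detector has a precomputed list of the polar pixels it covers (`polar_idx`, with no duplicates within a detector) along with the neighbor indices and weights for those pixels. All names are hypothetical.

```python
import numpy as np


def accumulate_detector(flat_image, polar_idx, indices, weights, running_sum):
    """Add one detector's contribution directly to the running sum.

    Only the polar pixels this detector covers are computed; no full-size
    per-detector image is created. Assumes polar_idx has no duplicates,
    since fancy-indexed `+=` would otherwise drop repeated contributions.
    """
    values = (flat_image[indices] * weights).sum(axis=1)
    running_sum[polar_idx] += values
```

The caller allocates `running_sum` once as a flattened polar image of zeros and calls `accumulate_detector` for each panel in turn.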
There are now no intermediary arrays when performing bilinear
interpolation on a series of images.

Signed-off-by: Patrick Avery <[email protected]>
@psavery psavery changed the title from "Implement several speeds for polar view generation" to "Implement several speedups for polar view generation" Aug 24, 2025
@psavery
Collaborator Author

psavery commented Sep 3, 2025

Chris Budrow tested and approved these changes.

@psavery psavery merged commit cf45859 into master Sep 3, 2025
7 checks passed
@psavery psavery deleted the polar-view-speedups branch September 3, 2025 18:39
2 participants