Skip to content

Conversation

@wdconinc
Copy link
Contributor

@wdconinc wdconinc commented Oct 30, 2025

Briefly, what does this PR introduce?

This PR introduces caching of the signal pulse function to speed up the reconstruction of TOF hits (primarily; others will benefit).

Test command:

eicrecon -Ppodio:output_file=rec_dis_10x100_minQ2\=100_craterlake.edm4hep.root -Ppodio:output_collections=TOFEndcapSmoothPulses -Ppodio:output_collections=TOFEndcapSmoothPulses sim_dis_10x100_minQ2\=100_craterlake.edm4hep.root

Before:

Factories:
-510.93 s (-100.0%) JEventProcessorPODIO
0.00 us (  0.0%) edm4hep::SimTrackerHit:TOFEndcapHits
0.00 us (  0.0%) podio::Frame
3.60 s (  0.7%) edm4hep::SimTrackerHit:TOFEndcapSharedHits
507.33 s ( 99.3%) edm4eic::SimPulse:TOFEndcapSmoothPulses

After:

Factories:
-77.57 s (-100.0%) JEventProcessorPODIO
0.00 us (  0.0%) edm4hep::SimTrackerHit:TOFEndcapHits
0.00 us (  0.0%) podio::Frame
5.71 s (  7.4%) edm4hep::SimTrackerHit:TOFEndcapSharedHits
71.86 s ( 92.6%) edm4eic::SimPulse:TOFEndcapSmoothPulses

In the signal pulse generation, the same (linear) Landau pulse shape is sampled 10k times per hit, evaluating the same complex function repeatedly. For many hits, this becomes a bottleneck.

We can optimize this in two ways:

  • use the linearity property of the pulse functions,
  • use function caching on a grid, with interpolation.

This PR modifies the interface to allow specified functions to specify if the function is linear, and it checks for this linearity when caching is enabled. The caching stores values on a grid spaced by the time step, and evaluates them only on the first access. On future accesses, the previous value is used, scaled by the new charge (hence the requirement for linearity). Since we access the values serially in time, we add a further optimization for this streaming through the pulse shape. When a second function is requested at a slightly different time, leading to points falling between earlier grid points, linear interpolation is used.

Futher optimization (with valarray) was considered, but due to the early break when the function does not go over threshold, there is too much branching.

Cache enable

sequenceDiagram
  autonumber
  participant App as PulseGeneration<HitT>::init()
  participant Pulse as SignalPulse
  participant Check as validateLinearity()
  participant Vec as m_cache_values (vector<double>)

  App->>Pulse: supportsCaching()
  alt linear in charge
    App->>Pulse: enableCache(timestep)
    Pulse->>Check: validateLinearity()
    alt isLinearInCharge()==false
      Check-->>Pulse: throw (unsupported for caching)
      Pulse-->>App: exception
    else linear and passes runtime check
      Check-->>Pulse: OK
      Pulse->>Vec: assign(size=2*MAX+3, NaN)
      Pulse->>Pulse: m_cache_resolution = timestep<br/>m_cache_offset = MAX+1<br/>m_cache_size = size
      Pulse-->>App: OK
    end
  else non-linear in charge
    App-->>App: skip enableCache()
  end
Loading

Process dispatch

sequenceDiagram
  autonumber
  participant Proc as PulseGeneration::process()
  participant Pulse as SignalPulse

  Proc->>Pulse: cachingEnabled()?
  alt no
    Note over Proc,Pulse: Fallback: per-bin exact evaluation (no cache)
  else yes
    Proc->>Pulse: supportsLinearInterpolation()?
    alt yes (smooth pulse e.g., Landau)
      Note over Proc,Pulse: Interpolated streaming (fast path)
    else no (e.g., Evaluator expression)
      Note over Proc,Pulse: Exact fractional streaming (safe path)
    end
  end
Loading

Interpolated streaming (fast path)

sequenceDiagram
  autonumber
  participant Proc as PulseGeneration::process()
  participant Pulse as SignalPulse
  participant Prep as prepareStreaming()
  participant Sample as sampleNormalizedAtIndex()
  participant Out as rawPulses

  Proc->>Pulse: prepareStreaming(signal_time, time, dt)
  Pulse->>Prep: nt0 = (signal_time - time)/dt
  Prep-->>Prep: s = -nt0
  Prep-->>Prep: base_index = floor(s)
  Prep-->>Prep: frac = s - base_index
  Prep->>Sample: sampleNormalizedAtIndex(base_index) -> v_floor_n
  Prep->>Sample: sampleNormalizedAtIndex(base_index+1) -> v_ceil_n
  Prep-->>Proc: StreamState{base_index, frac, v_floor_n, v_ceil_n}

  loop for i in [0, max_time_bins)
    Proc-->>Proc: v = v_floor_n + frac*(v_ceil_n - v_floor_n)
    Proc-->>Proc: signal = charge * v
    alt signal < ignore_thres AND not passed_threshold
      Proc-->>Proc: skip_bins = i
      Proc-->>Proc: base_index += 1
      Proc->>Sample: sampleNormalizedAtIndex(base_index+1) -> v_next
      Proc-->>Proc: v_floor_n = v_ceil_n
      Proc-->>Proc: v_ceil_n = v_next
      Note over Proc: continue
    else signal < ignore_thres after start
      Proc-->>Proc: t = (i - nt0) * dt
      alt t > min_sampling_time
        Proc-->>Proc: break loop
      else within window
        Proc-->>Proc: base_index += 1
        Proc->>Sample: sampleNormalizedAtIndex(base_index+1) -> v_next
        Proc-->>Proc: v_floor_n = v_ceil_n
        Proc-->>Proc: v_ceil_n = v_next
        Note over Proc: continue
      end
    else above threshold
      Proc-->>Proc: passed_threshold = true
      Proc-->>Proc: integral += signal
      Proc->>Out: addToAmplitude(signal)
      Proc-->>Proc: base_index += 1
      Proc->>Sample: sampleNormalizedAtIndex(base_index+1) -> v_next
      Proc-->>Proc: v_floor_n = v_ceil_n
      Proc-->>Proc: v_ceil_n = v_next
    end
  end

  Proc->>Out: setInterval(dt), setTime(signal_time + skip_bins*dt), setIntegral(integral)
Loading

Exact fractional streaming (for discontinuities)

sequenceDiagram
  autonumber
  participant Proc as PulseGeneration::process()
  participant Pulse as SignalPulse
  participant Prep as prepareStreaming()
  participant SampF as sampleNormalizedAt()  %% fractional
  participant Out as rawPulses

  Proc->>Pulse: prepareStreaming(signal_time, time, dt)
  Pulse->>Prep: nt0 = (signal_time - time)/dt
  Prep-->>Proc: 
  Proc-->>Proc: s = -nt0
  Proc-->>Proc: base_index = floor(s)
  Proc-->>Proc: frac = s - base_index

  loop for i in [0, max_time_bins)
    Proc->>SampF: sampleNormalizedAt(base_index + frac) -> v
    Proc-->>Proc: signal = charge * v
    alt signal < ignore_thres AND not passed_threshold
      Proc-->>Proc: skip_bins = i
      Proc-->>Proc: base_index += 1
      Note over Proc: continue
    else signal < ignore_thres after start
      Proc-->>Proc: t = (i - nt0) * dt
      alt t > min_sampling_time
        Proc-->>Proc: break loop
      else within window
        Proc-->>Proc: base_index += 1
        Note over Proc: continue
      end
    else above threshold
      Proc-->>Proc: passed_threshold = true
      Proc-->>Proc: integral += signal
      Proc->>Out: addToAmplitude(signal)
      Proc-->>Proc: base_index += 1
    end
  end

  Proc->>Out: setInterval(dt)
  Proc->>Out: setTime(signal_time + skip_bins*dt)
  Proc->>Out: setIntegral(integral)
Loading

Generic operator() path (single-sample access)

sequenceDiagram
  autonumber
  participant Call as SignalPulse::operator()(time, charge)
  participant Pulse as SignalPulse
  participant Get as getCachedValue(index, charge)
  participant Vec as m_cache_values
  participant Eval as evaluate(time, 1.0)

  Call-->>Call: normalized_time = time / m_cache_resolution
  Call-->>Call: index = floor(normalized_time), frac = normalized_time - index
  Call->>Get: value_floor = getCachedValue(index, charge)
  alt in-bounds slot is NaN
    Get->>Vec: read slot (NaN)
    Get->>Eval: evaluate(index * dt, 1.0)
    Eval-->>Get: v_norm
    Get->>Vec: slot = v_norm
    Get-->>Call: value_floor = v_norm * charge
  else in-bounds slot filled
    Get->>Vec: read slot
    Get-->>Call: value_floor = slot * charge
  else out-of-bounds
    Get->>Eval: evaluate(index * dt, charge)
    Eval-->>Get: value_floor
    Get-->>Call: value_floor
  end

  Call->>Get: value_ceil = getCachedValue(index + 1, charge)
  Note over Call,Pulse: Same logic as above to obtain value_ceil
  Call-->>Call: return value_floor + frac * (value_ceil - value_floor)
Loading

What kind of change does this PR introduce?

  • Bug fix (issue #__)
  • New feature (issue: speed up PulseGeneration)
  • Documentation update
  • Other: __

Please check if this PR fulfills the following:

  • Tests for the changes have been added
  • Documentation has been added / updated
  • Changes have been communicated to collaborators

Does this PR introduce breaking changes? What changes might users need to make to their code?

No.

Does this PR change default behavior?

Yes, it adds default caching. Small pulse generation changes may occur, consistent with the interpolation error on the 10 ps grid

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)


namespace eicrecon {

class SignalPulse {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ cppcoreguidelines-special-member-functions ⚠️
class SignalPulse defines a default destructor but does not define a copy constructor, a copy assignment operator, a move constructor or a move assignment operator

wdconinc and others added 3 commits October 30, 2025 15:24
…x: iwyu) (#2167)

This PR applies the include-what-you-use fixes as suggested by
https://github.com/eic/EICrecon/actions/runs/18953609849.
Please merge this PR into the branch `signal-pulse-caching`
to resolve failures in PR #2166.

Auto-generated by [create-pull-request][1]

[1]: https://github.com/peter-evans/create-pull-request

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Clang-Tidy found issue(s) with the introduced code (1/1)

@wdconinc wdconinc requested a review from veprbl October 30, 2025 22:49
@github-actions github-actions bot dismissed stale reviews from themself October 30, 2025 23:15

No Clang-Tidy warnings found so I assume my comments were addressed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants