perf(as_of): fix quadratic reconstruction in forward simulations (×1.4–×5.4) by benjello · Pull Request #1367 · openfisca/openfisca-core

benjello · 2026-02-28T08:09:38Z

Summary

_set_as_of unconditionally cleared the snapshot cursor after every write, forcing the next get_array(M) to reconstruct the full array from the base through all M patches — O(N + M·k) per step, quadratic total cost over a multi-month simulation.
Root cause: _reconstruct_at(instant) advanced the snapshot to instant during the internal diff computation inside _set_as_of, so the guard snapshot[0] >= instant always triggered on equality, even for strictly forward writes.
Fix: distinguish forward (patch appended at end) vs retroactive (out-of-order) SETs. Forward SETs now store value.copy() as the new snapshot → next get_array(instant) is an O(1) exact cache hit. Retroactive SETs keep the existing invalidation logic.

Benchmark (N = 1 000 000 persons, forward GET→SET simulation)

Scenario	Before	After	Gain
1 year, 10% change/month	66 ms	46 ms	×1.4
5 years, 10% change/month	772 ms	188 ms	×4.1
5 years, 30% change/month	1 598 ms	294 ms	×5.4

Test plan

All 18 existing test_asof_variable.py tests pass unchanged
New TestForwardSimulationBench benchmark added to test_bench_asof.py
Version bumped 44.4.0 → 44.4.1

🤖 Generated with Claude Code

eraviart

OK for performance improvement

Introduces as_of = True / "start" / "end" on Variable, enabling values set at a given instant to persist forward in time until explicitly overridden — the vectorial analogue of OpenFisca parameters. - Variable: set_as_of() setter normalises True → "start", rejects invalid values, and guards against incompatible set_input helpers - Holder.get_array: falls back to _get_as_of() when no exact match - Holder._get_as_of: O(log P) lookup via bisect on _sorted_instants, with a linear-scan fallback for cloned holders - Holder._set: reference sharing for unchanged arrays + defensive read-only copy to prevent in-place mutation of stored data - Holder.clone: copies _sorted_instants independently - 14 unit tests covering persistence, conventions, reference sharing, read-only guard, and declaration validation - pyproject.toml: exclude docs/ from codespell (French design docs) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Made-with: Cursor

MattiSG · 2026-03-04T17:11:42Z

Hello,
Performance improvement is always welcome!

However, this changeset also adds a dependency that does not seem to be used anywhere. This increases the attack surface, increases the overall package size and makes maintenance harder. Can you please remove that dependency, or explain how it is used? 🙂

I also notice that a benchmark is claimed to be made, but there is no benchmark code nor methodology. It is very easy for LLMs to fake benchmarks, or to create ones that are meaningless in production. Performance improvements claims should systematically be backed by hard data to avoid unnecessary code churn, maintainer burden, and unforeseen side effects.

Reviewers have a shared responsibility with authors to ensure that we don't unnecessarily increase the attack surface and package size. I understand that you want to go fast, but this pattern of one author adding many changes at once that are AI-generated and make review hard, followed by one human reviewer greenlighting them all with not a single critical comment worries me. Please keep in mind the extent of reusers around the world that use OpenFisca in production for other cases than the specific ones you might be testing. Testing compatibility on a small private codebase is not sufficient as a demonstration of innocuity.

benjello requested review from benoit-cty, eraviart and guillett March 4, 2026 08:17

eraviart approved these changes Mar 4, 2026

View reviewed changes

benjello force-pushed the feat/as-of-patches branch from b1374c4 to f5583ab Compare March 4, 2026 10:58

Base automatically changed from feat/as-of-patches to master March 4, 2026 13:38

benjello and others added 3 commits March 4, 2026 14:42

bump: 44.4.0 → 44.4.1

684764b

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Fix: remove leftover conflict marker in test_asof_variable.py

092b322

Made-with: Cursor

benjello force-pushed the perf/as-of-snapshot-fix branch from 99933c1 to 092b322 Compare March 4, 2026 13:44

benjello enabled auto-merge March 4, 2026 13:45

benjello merged commit 81cb3ab into master Mar 4, 2026
24 checks passed

benjello deleted the perf/as-of-snapshot-fix branch March 4, 2026 13:50

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(as_of): fix quadratic reconstruction in forward simulations (×1.4–×5.4)#1367

perf(as_of): fix quadratic reconstruction in forward simulations (×1.4–×5.4)#1367
benjello merged 3 commits intomasterfrom
perf/as-of-snapshot-fix

benjello commented Feb 28, 2026

Uh oh!

eraviart left a comment

Uh oh!

Uh oh!

MattiSG commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

benjello commented Feb 28, 2026

Summary

Benchmark (N = 1 000 000 persons, forward GET→SET simulation)

Test plan

Uh oh!

eraviart left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

MattiSG commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants