Skip to content

fix(links): use dynamic person entity key instead of hardcoded 'person'#1371

Open
benjello wants to merge 15 commits intomasterfrom
fix/entity-links-person-key
Open

fix(links): use dynamic person entity key instead of hardcoded 'person'#1371
benjello wants to merge 15 commits intomasterfrom
fix/entity-links-person-key

Conversation

@benjello
Copy link
Member

@benjello benjello commented Mar 5, 2026

Feature: Generic Entity Links (LIAM2-inspired)

Context & Motivation

OpenFisca's traditional entity model has historically been strictly hierarchical and bipartite: individuals belong to groups (households, families, tax units), and groups contain individuals. This rigid structure works well for static tax-benefit systems but struggles with complex, real-world socioeconomic models, such as:

  • Intra-entity relationships: Kinship graphs (person $\rightarrow$ mother, person $\rightarrow$ spouse).
  • Arbitrary inter-entity networks: Employment networks (person $\rightarrow$ employer), geographical mobility, or ad-hoc associations.
  • Deep chaining: Navigating multiple relationship hops (e.g., "the region of the household of the mother of the person").

To solve this, we drew inspiration from LIAM2's linking system and adapted it to OpenFisca's unique architecture (specifically integrating with our Role semantics and vectorized execution).

What we did

This PR introduces a generic, highly performant, and 100% backward-compatible Entity Linking system.

1. Core Link Classes (openfisca_core/links)

  • Many2OneLink: Resolves N source members to 1 target entity (e.g., person.mother, person.employer). Supports fetching values (.get()) and dynamic chaining (.mother.household.rent).
  • One2ManyLink: Aggregates from N target members back to 1 source entity. Supports a wide suite of vectorized aggregations (sum, count, any, all, min, max, avg) along with filtering by role or an arbitrary boolean condition mask.

2. Implicit Links & Backward Compatibility

A major design goal was to avoid breaking existing country packages (openfisca-france, openfisca-tunisia, etc.).

  • Links are strictly additive.
  • During Simulation initialization, OpenFisca now automatically reads the existing GroupEntity structure and injects Implicit Links:
    • ImplicitMany2OneLink: Automatically adds person.household, mapping directly to the high-performance GroupPopulation.members_entity_id array.
    • ImplicitOne2ManyLink: Automatically adds household.persons, replacing the need for verbose legacy aggregations.
  • Population.__getattr__ was carefully patched to first check self.links["..."] before natively falling back to the legacy get_projector_from_shortcut() route. Everything keeps working identically.

3. Syntax Sugar & Chaining

The new API allows natural, pythonic data fetching:

# Old projector way (still works!):
rents = sim.persons.household("rent", "2024")

# New explicit link definition (e.g., for arbitrary networks)
mother_link = Many2OneLink(name="mother", link_field="mother_id", target_entity_key="person")
person_entity.add_link(mother_link)

# New chaining syntax:
mother_household_rents = sim.persons.mother.household.get("rent", "2024")

# New declarative aggregations:
female_salaries = sim.households.persons.sum("salary", "2024", condition=is_female)

Performance

Performance is a critical constraint for OpenFisca simulations. We added pytest-benchmark tests validating the new mechanics.

  • .get() resolutions (Many-to-One) perform identically to legacy Projectors (~118μs on 15,000 entities).
  • Aggregations (One2Many.sum()) introduce a negligible setup overhead (< 1ms) but execute fully vectorized numpy.bincount and numpy.maximum.at operations under the hood.

Associated Documentation

We've added guides to help framework users model new relationships:

  • docs/implementation/links-api.md: Reference for creating and querying Many2OneLink and One2ManyLink.
  • docs/implementation/transition-guide.md: Migration guide demonstrating how to gradually adopt Links over Legacy Projectors.

Builder & test clarity

  • build_default_simulation(..., group_members=...): Optional group_members dict (e.g. {"household": [0,0,1,1]}) sets group structure at build time so tests no longer patch private attributes.
  • GroupPopulation.set_members_entity_id(array): Public API to set group structure and clear internal caches; tests use this instead of touching _members_position / _ordered_members_map.

Testing

  • 12 new, comprehensive tests covering unit mechanics, system integrations, filtering, chaining, and OpenFisca core lifecycle (_resolve_links).
  • All 158 core tests and existing Country Template tests continue to pass locally (make test-code).

ImplicitOne2ManyLink hardcoded target_entity_key='person', which crashes
when the person entity has a different key (e.g. 'individu' in
openfisca-france).

This fix adds a person_entity_key parameter to ImplicitOne2ManyLink
and passes the actual person entity key from Simulation._resolve_links().

Fixes the France API crash:
  KeyError: "Link 'individus': target entity 'person' not found in
  populations ['individu', 'famille', 'foyer_fiscal', 'menage']"
@benjello benjello requested a review from MattiSG March 5, 2026 08:30
@MattiSG MattiSG removed their request for review March 5, 2026 09:37
@MattiSG
Copy link
Member

MattiSG commented Mar 5, 2026

Can you clarify what this means:

All 76 link tests pass
All 57 entity/simulation builder tests pass

From what I see in the changeset, no tests were modified, and the previous code also passed tests.

This changeset changes the code signature of the ImplicitOne2ManyLink, is it thus a breaking change?

@benjello
Copy link
Member Author

benjello commented Mar 5, 2026

I see ImplicitOne2ManyLink more as an internal thing so for me it is just a bugfix. But I am not an expert.

@benjello
Copy link
Member Author

benjello commented Mar 5, 2026

And now we have a test that deals with more TBS that are more exotic than conuntry-template.

Add tests verifying ImplicitOne2ManyLink and _resolve_links work when
the person entity has a key other than 'person' (e.g. 'individu' in
openfisca-france):

- test_implicit_one2many_with_non_default_person_key (unit)
- test_resolve_links_with_individu_key (integration)
- test_implicit_link_target_uses_actual_person_key (integration)
@benjello benjello force-pushed the fix/entity-links-person-key branch from 3920732 to af716a8 Compare March 5, 2026 09:57
benjello added 13 commits March 5, 2026 12:28
… population/group_population updates

Made-with: Cursor
….assert_near; clarify DIVIDE complex-period docstring

Made-with: Cursor
…OneLink

When a wrapped callable (e.g. projector) returns a members-sized array
(one value per person), return it as-is instead of calling project(),
which expects entity-sized input. Fixes InvalidArraySizeError in France
YAML tests (e.g. maj_nbp.yaml) where activite (size 5) was passed to
famille.project() expecting size 1.

Made-with: Cursor
…sized result

- Add docstring to _project_implicit documenting entity vs members-sized behaviour
- test_implicit_m2o_project_entity_sized_same_as_old_logic: entity-sized result
  is projected identically to old logic (target.project(result))
- test_implicit_m2o_members_sized_returned_unchanged: members-sized result is
  returned as-is (regression for France maj_nbp / InvalidArraySizeError)
- test_implicit_m2o_project_implicit_rejects_wrong_size: wrong size raises
  ValueError

Made-with: Cursor
- Backward-compat _resolve_links and builder SituationParsingError
  (unchanged logic, formatting only for simulation.py)
- pyproject: add activite to codespell ignore (French variable name)

Made-with: Cursor
…ng in M2O link

Return _CallableProxy(callable, target_attr) instead of a bare function so
that person.famille.demandeur.has_role(FoyerFiscal.DECLARANT_PRINCIPAL)
and similar patterns in openfisca-france continue to work.

Made-with: Cursor
…et (old vs new behavior)

- test_implicit_m2o_sum_returns_person_sized: assert raw target sum is (2,) and
  link.sum() returns projected (5,); documents old bug (entity-sized caused broadcast error)
- test_implicit_m2o_role_projector_chained_get_returns_person_sized: assert
  person.household.demandeur('age', period) returns person-sized (regression for
  individu.famille.demandeur.foyer_fiscal(...) in openfisca-france)

Made-with: Cursor
…for clearer tests

- build_default_simulation(..., group_members=...) sets group structure at build time
- GroupPopulation.set_members_entity_id(arr) sets mapping and clears caches (public API)
- test_many2one: use group_members in fixture, drop private-attr patches

Made-with: Cursor
@benjello
Copy link
Member Author

benjello commented Mar 6, 2026

All tests added here were meant to make openfisca-france tests pass openfisca/openfisca-france#2711

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants