Skip to content

Add a __class__ key to the demography .asdict methods #2368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 23, 2025

Conversation

hyanwong
Copy link
Member

@hyanwong hyanwong commented May 17, 2025

And convert list input to numpy array.

The __class__ key is useful in general because e.g. for events, the dictionary keys aren't enough to distinguish between the event types. This also means that the classes are properly output when saving provenance, aiding reproducibility.

Also makes sure that events are properly associated with the Demography object, if they are provided.

No tests yet, as I wanted to check if this seemed reasonable. This is the first of 2 steps to allow us to create a fully functional demography object from provenance records, which enables us e.g. to plot demographies from most stdpopsim-created tree sequences, which seems very worth it.

Copy link

codecov bot commented May 17, 2025

Codecov Report

Attention: Patch coverage is 87.50000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 90.85%. Comparing base (92787f5) to head (d6ebc17).
Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
msprime/demography.py 87.50% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2368   +/-   ##
=======================================
  Coverage   90.85%   90.85%           
=======================================
  Files          20       20           
  Lines       11979    11993   +14     
  Branches     2325     2330    +5     
=======================================
+ Hits        10883    10896   +13     
  Misses        606      606           
- Partials      490      491    +1     
Flag Coverage Δ
C 90.85% <87.50%> (+<0.01%) ⬆️
python 98.68% <87.50%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hyanwong
Copy link
Member Author

I'm not suggesting we make the provenance fully reproducible, but I am finding it helpful to e.g. show the demography that was used to create an msprime tree sequence, and the additions in the PR seem sensible anyway.

@jeromekelleher
Copy link
Member

The important question here is, what could this break if we made the change?

@hyanwong
Copy link
Member Author

hyanwong commented May 19, 2025

Indeed. No tests break. I don't think people are relying on the .asdict method from Demographies to not contain a "class" key (and it is not a documented function for these classes anyway). As far as I know, we almost entirely use .asdict() when saving provenances. The only other place in the code I can see it is in ancestry.py, Class Simulator:

ll_population_configuration = [pop.asdict() for pop in demography.populations]

This class is extensively tested in ancestry.py, and the C code which then reads these populations looks for specific keys, so an extra __class__ key has no effect.

@hyanwong
Copy link
Member Author

The other option that would work in my use-case is to make the JSON encoder output __class__ for lists nested in the Demography dict, rather than change the .asdict method. Here was the original issue that prompted this, FWIW: #2367

@jeromekelleher
Copy link
Member

What about stdpopsim? I remember there's some close coupling in there with msprime, and that could potentially break? I guess running the stdpopsim tests suite with this change in place would be a good exercise.

If that's OK then I think it's probably harmless and we can add.

@hyanwong
Copy link
Member Author

I ran the stdpopsim tests with this version of msprime and there was no change in test response.

If @benjeffery also thinks this is fine, and better than trying to complicate the JSON provenance encoding function by making it recurse over the embedded bits of the Demography object, then I'll add the tests.

@benjeffery
Copy link
Member

I've had a look through GitHub code search for how/if people are using this and I can't see anything that will break. I also think it is needed for what @hyanwong is trying to do (make diagrams from a tree sequence provenance). It arguably is an oversight in my original provenance work and should be in there.

@hyanwong
Copy link
Member Author

Do you think that it's preferable to adding this to the .asdict() method rather than complicating the provenance encoder by trying to make it recursive, @benjeffery ?

@benjeffery
Copy link
Member

I think it is fine to add to asdict.

@hyanwong hyanwong force-pushed the asdict-class branch 3 times, most recently from 7ee9107 to d16b6ce Compare May 21, 2025 20:09
@hyanwong
Copy link
Member Author

OK, this has the tests now. I think the codecov report is wrong: I checked that the two supposedly partially covered lines were hit, by temporarily raising an error on those lines.

Copy link
Member

@benjeffery benjeffery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Think this needs a changelog though.

@hyanwong hyanwong force-pushed the asdict-class branch 2 times, most recently from 081e65f to a709d2b Compare May 22, 2025 20:42
@hyanwong
Copy link
Member Author

hyanwong commented May 22, 2025

Totally needs a changelog entry. Sorry that I forgot. Now added.

And convert list input to numpy array.

The __class__ key is useful because e.g. for events, the dictionary keys aren't enough to distinguish between the event types. This also means that the classes are properly output when saving provenance, aiding reproducibility.
@mergify mergify bot force-pushed the asdict-class branch from a709d2b to d6ebc17 Compare May 23, 2025 08:50
@mergify mergify bot merged commit 8d65577 into tskit-dev:main May 23, 2025
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants