DM-53622: enable provenance recording #363
base: main
Conversation
kfindeisen left a comment:
I have some concerns about separation of concerns, and having multiple methods with multiple, partially overlapping responsibilities. That's something we've tried to avoid in this codebase so far. In a few cases (like the error handling), it's easy to factor the code to keep the boundaries clean; with the predefined provenance types it's harder.
I think the root cause of a lot of the complexity is the apparent need to pass a single set of "correct" provenance dimensions to the pipeline execution. How does a single definition work if the pipeline tasks have different dimensions (as they do in practice)? How can the need to human curate the definitions work for general-purpose execution? Why can't the dimensions be inferred automatically from the pipeline definition?
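A minimal sketch of what automatic inference might look like, using the same PipelineGraph attributes as the snippet quoted below; the helper name and grouping strategy are assumptions, not anything proposed in this PR:

```python
from collections import defaultdict


def infer_provenance_dimensions(pipeline_graph):
    """Hypothetical helper: group task labels by their dimensions.

    If each distinct dimension set got its own provenance dataset
    type, no hand-curated definition would be needed even when the
    pipeline's tasks have different dimensions.
    """
    groups = defaultdict(list)
    for task_node in pipeline_graph.tasks.values():
        # Assumes task_node.dimensions is hashable, so it can serve
        # directly as a grouping key.
        groups[task_node.dimensions].append(task_node.label)
    return dict(groups)
```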
```python
for task_node in qg.pipeline_graph.tasks.values():
    if task_node.dimensions == dataset_type.dimensions:
        data_ids = qg.quanta_by_task[task_node.label].keys()
        if len(data_ids) == 1:
            return DatasetRef(dataset_type, next(iter(data_ids)), run=qg.header.output_run)
```
kfindeisen:
It took me a while to work out that this is getting a "typical" data ID for the pipeline execution. We already have code for generating that. The caveat is that MWI mostly works in terms of exposure, not visit; I think visit is only queried in the APDB error handler.
kfindeisen:
Now that I think about it -- I don't think this would work in the case where we have multiple snaps but also can only run the ISR-only pipeline (this almost happens in daytime testing, though those aren't proper snaps so we only get to run one of them). In that case, you need two exposure IDs and two provenance datasets, don't you?
TallJimbo:
We've received a guarantee from project leadership that there will not be snaps, ever, and we can start removing code to support them. The exposure vs. visit split in butler won't be going away anytime soon, if ever, but it is now safe to assume that they have a 1-0 or 1-1 relationship.
kfindeisen:
Much like with the Butler, I'm not sure it's worth the effort to rewrite all the code. 🙂
Though your comment on Jira about groups still having multiple exposures/visits is very relevant here -- snaps may have been what originally motivated PP's multi-exposure support, but it can still be useful in other cases. (Though maybe that means we should remove all uses of the word "snap" to avoid confusion.)
TallJimbo:
This has been switched to a small butler query that uses the where string you mentioned.
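As an illustration only, such a query might look roughly like this; the where string, bind values, and surrounding variable names are assumptions, since the actual query is not shown in this thread:

```python
# Hypothetical sketch of the "small butler query": resolve the single
# (exposure, detector) data ID for this run via the registry instead
# of scanning the quantum graph.
data_ids = list(
    butler.registry.queryDataIds(
        ["exposure", "detector"],
        where="exposure = exp AND detector = det",
        bind={"exp": exposure_id, "det": detector_id},
    )
)
if len(data_ids) == 1:
    provenance_ref = DatasetRef(dataset_type, data_ids[0], run=output_run)
```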
This PR can't be tested or merged until we have a […]
TallJimbo left a comment:
I'm deferring replies/action on points related to the "provenance dimensions" problem while we discuss the big picture in Jira.
(force-pushed 8705bec to 615ffc8)
kfindeisen left a comment:
Looks good, but please clean up the commit history before merging -- it looks like most of the later commits are fixes to the original ones?
```python
    return PredictedQuantumGraph.make_empty(universe=DimensionUniverse(), output_run="test")
elif n_quanta < 0:
    raise RuntimeError("Invalid input")
else:
```
kfindeisen:
Can you add a TODO comment for the remaining from_old_quantum_graph call?
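Something like the following would do; the wording is illustrative, and the reason given for keeping the call is an assumption, not stated in the thread:

```python
# TODO: remove this remaining from_old_quantum_graph call once the
# tests generate PredictedQuantumGraph objects directly.
```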
tests/test_middleware_interface.py (outdated)
| "activator.middleware_interface.SeparablePipelineExecutor.run_pipeline" | ||
| ) as mock_run, | ||
| # Mocked QGs do not have realistic dimensions, and provenance | ||
| # dataset types need to have the same dimensions. |
kfindeisen:
Should this comment be attached to the definition of test_provenance_dataset_type instead?
```python
    self.butler.dimensions.conform(["exposure", "detector"]),
    "ProvenanceQuantumGraph",
)
self.butler.registry.registerDatasetType(self._exposure_provenance_dataset_type)
```
kfindeisen:
I don't see how it's cleaner to make the dataset type part of the object state (and far from where it's actually needed), but I suppose it minimizes the changes that need to be made to the tests.
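For comparison, the alternative being described would construct and register the type at the point of use, e.g. as sketched here; the dataset type name is illustrative, while the constructor arguments mirror the snippet above:

```python
from lsst.daf.butler import DatasetType

# Hypothetical: build the dataset type where it is needed instead of
# storing it on the object in __init__.
exposure_provenance_type = DatasetType(
    "promptProvenance",  # illustrative name only
    butler.dimensions.conform(["exposure", "detector"]),
    "ProvenanceQuantumGraph",
)
butler.registry.registerDatasetType(exposure_provenance_type)
```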
(force-pushed c59f134 to 7cd6e38)
This wasn't an option when this code was first written, but we don't have to live with all those backslashes anymore.
(force-pushed 7cd6e38 to ffa6a9f)
I believe all comments have now been addressed. I'll post again when all of the needed upstream changes have made it into a daily or weekly release.
The upstream changes are now in a […]
We'll create a new base build as soon as we're able to test it, but that's proving tricky right now. Thank you for your patience.
Previously, we'd depended on the default value being False, and didn't update our code when the API switched the default. It didn't matter until provenance datasets could appear in each run with the same ID.
(force-pushed 617a096 to 610b8be)