Attempt to reuse previously materialized datasets #20718
The conditions are:
Providing hashes is currently somewhat niche, but we do use this in BRC analytics (in particular, all fastq files contain hashes).
This should help a lot with demoing things.
We should also be able to make use of this in planemo, where this would be a realistic path to enable "invocation resume" functionality.
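To make the reuse idea concrete, here is a minimal sketch of a hash-based lookup. It is not the code in this PR: the class names, fields, and the `find_reusable` helper are all hypothetical, and the matching rules (request carries hashes, dataset belongs to the requesting user, dataset is in a usable state) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class DatasetHash:
    """A (hash function, digest) pair, e.g. ("SHA-256", "ab12...")."""
    function: str
    value: str


@dataclass
class MaterializedDataset:
    """Hypothetical stand-in for a dataset that was already fetched."""
    id: int
    owner: str
    hashes: FrozenSet[DatasetHash]
    ok: bool = True  # assumption: only usable datasets are candidates


@dataclass
class DatasetRequest:
    """Hypothetical stand-in for one dataset entry in a workflow request."""
    url: str
    hashes: Tuple[DatasetHash, ...] = ()


def find_reusable(
    request: DatasetRequest,
    requesting_user: str,
    existing: Iterable[MaterializedDataset],
) -> Optional[MaterializedDataset]:
    """Return a previously materialized dataset that can stand in for the request.

    Assumed rules: no hashes means no reuse; only the requesting user's own
    datasets are considered; every requested hash must match.
    """
    if not request.hashes:
        return None
    wanted = set(request.hashes)
    for dataset in existing:
        if dataset.owner != requesting_user or not dataset.ok:
            continue
        if wanted <= dataset.hashes:
            return dataset
    return None
```

Under these assumptions, a request without hashes always triggers a fresh download, which is the conservative default; reuse only happens when the caller has supplied enough information to prove the content is identical.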
We might eventually allow this for public datasets as well, but perhaps this should be a little more explicit. We could for instance include cache hints in the dataset request syntax (maybe something like `cache_strategy: own`, `cache_strategy: public`, `cache_strategy: never`) and a top level setting for the workflow request.
How to test the changes?
(Select all options that apply)
License