Add "multi-scene" collecting and publishing #140
Comments
I didn't know there existed standardised message types with defined data structures. Is this defined/documented and/or enforced/tested anywhere?
For what it's worth, in one software package I know the seven dimensions are called Library, Vitrine, Shelf, Book, Page, Row, Column :-) On a more serious note, if we do use standardised names and a collection collects all granules or segments belonging to a single scene, then "multicollection" would, I think, be quite clear in its purpose.
I doubt it's documented anywhere. I was thinking the same earlier today. But the above is most of what we have in use in posttroll-based packages. The
I like that, the data are most likely passed to |
I'm thinking that the difference between |
Thanks, I'll think about the naming. I've started with |
To create multi-temporal datasets, data need to be collected and published for multiple time slots.
As an example, pytroll/satpy#2488 needs three distinct datasets:
The time-shift between the datasets can be anything, for example 15/30/60 minutes. It can even be irregular if used for polar satellite data or if emphasis is needed in one direction or the other.
There are other envisioned needs for this kind of collection/publishing, so the feature needs to be kept as flexible as possible.
Messages
Currently we have the following message types for publishing data:
file: plain JSON with no nested lists or dictionaries; everything is at the "top level" of the message
dataset: combined metadata (start/end times, platform, and such) at the top level, and a list named "dataset" of dictionaries having the URI and UID of individual files
collection: same as above, but there is a list named "collection" with dictionaries of individual start/end times and datasets
The collection message type could be used for the collection of multi-temporal data described here, but how would it be distinguished from the existing usage? Should there be a new message type like library (file -> dataset -> collection -> library 😜) or something that has a list named "library" with collections with datasets inside?
Configuration
This is the first crude idea of how to configure which data are published together. The publishing would be triggered after each data collection has terminated.
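As a rough illustration only, the idea could be sketched in Python as below. The entry keys `min_age` and `max_age`, the use of minutes as the unit, the example values, and the function name `select_slots` are all assumptions made for this sketch and are not taken from the actual configuration. Returning `None` when some entry cannot be matched is also just one possible choice.

```python
from datetime import datetime, timedelta

# Hypothetical configuration: one entry per dataset to be published together.
# Ages are assumed to be in minutes, counted backwards from the start time
# of the collection that has just completed.
PUBLISHED_SLOTS = [
    {"min_age": 0, "max_age": 0},
    {"min_age": 25, "max_age": 35},
    {"min_age": 55, "max_age": 65},
]


def select_slots(completed, current, config=PUBLISHED_SLOTS):
    """Pick one completed start time per config entry.

    Returns None if any entry has no matching completed slot.
    """
    selected = []
    for entry in config:
        oldest = current - timedelta(minutes=entry["max_age"])
        newest = current - timedelta(minutes=entry["min_age"])
        candidates = [t for t in completed if oldest <= t <= newest]
        if not candidates:
            return None
        # Prefer the most recent slot inside the allowed age window.
        selected.append(max(candidates))
    return selected
```

With a single `{"min_age": 0, "max_age": 0}` entry the function only ever selects the slot that has just completed.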
The min/max ages are relative to the start time of the currently completed collection. Just having the 0/0 combination would equal the current behaviour of publishing the latest completed set. If all the criteria are not met (just after restart, for example, we might not have the earlier slots collected).
Internals
Currently the completed Slots are deleted. We need to add a new check that looks at the published_slots config (and timeliness?) to determine which slots are not needed anymore. As the keys in the self.slots dictionary are the nominal or start time (possibly rounded, depending on config) of the slot as a string, comparison is quite easy.
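A minimal sketch of such a check might look like the following. It assumes that the slot keys are ISO-like strings such as "2023-05-30 12:00:00", that each published_slots entry carries a `max_age` in minutes, and that a slot can be dropped once it is older than the largest `max_age`. The function name `prune_slots` is hypothetical, and timeliness is not considered here.

```python
from datetime import datetime, timedelta


def prune_slots(slots, published_slots, latest):
    """Delete slots that are too old to be needed by any config entry.

    slots: dict keyed by start time as a string (assumed ISO-like format)
    published_slots: list of dicts with a "max_age" in minutes (assumed)
    latest: start time of the most recently completed slot
    """
    max_age = max(entry["max_age"] for entry in published_slots)
    cutoff = latest - timedelta(minutes=max_age)
    # Iterate over a copy of the keys so deleting while looping is safe.
    for key in list(slots):
        if datetime.fromisoformat(key) < cutoff:
            del slots[key]
    return slots
```

Parsing the keys keeps the comparison explicit. If the keys are always in the same zero-padded format, comparing the strings directly would give the same ordering.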