Data Models v1.0
After some thorough investigation of various requirements and concepts, I propose a list of changes to be implemented within EOxServer. The core reasons to do so are:
- Implement a Collection <-> Product <-> Coverage model (the previous "Collection" <-> "Coverage" one was simply not fine grained enough to express the complex structure of various EO products)
- Get rid of unnecessarily complex structure to both reduce the said complexity and improve the performance
- Get away from the pure "Coverage" data model (see 1.). This does not mean we abandon WCS; it gives us more freedom when dealing with overly complex datasets while still mapping them to the coverage model where possible and appropriate.
The core of the proposed changes revolves around the Collection, Product and Coverage models. In the following section, the models will be discussed in detail.
As the name implies, Collections are collections of data. They shall be generic enough to group datasets of different structure, but also specific enough to provide useful metadata about their contents and to catch errors early.
It shall be possible to put both Products and Coverages into a Collection.
A Collection can be associated with a CollectionType, which limits the types of Products/Coverages that can be inserted: only Products whose ProductType is within CollectionType.allowed_product_types are accepted, and similarly for Coverages and CollectionType.allowed_coverage_types.
A Collection can be associated with a Grid, which prohibits insertion of Coverages of different Grids.
Previously known as DatasetSeries and RectifiedStitchedMosaic
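The insertion rules for Collections described above can be sketched in plain Python. This is only an illustration of the intended checks, not the actual EOxServer (Django) models: the class layout, the method names `insert_product`/`insert_coverage` and the use of `ValueError` are assumptions. It also assumes that a Collection with a CollectionType rejects untyped items and that a Collection with a Grid rejects Coverages without a Grid, which the proposal does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Grid:
    name: str


@dataclass(frozen=True)
class ProductType:
    name: str


@dataclass(frozen=True)
class CoverageType:
    name: str


@dataclass
class CollectionType:
    name: str
    allowed_product_types: List[ProductType] = field(default_factory=list)
    allowed_coverage_types: List[CoverageType] = field(default_factory=list)


@dataclass
class Product:
    identifier: str
    product_type: Optional[ProductType] = None


@dataclass
class Coverage:
    identifier: str
    coverage_type: Optional[CoverageType] = None
    grid: Optional[Grid] = None


@dataclass
class Collection:
    identifier: str
    collection_type: Optional[CollectionType] = None
    grid: Optional[Grid] = None
    products: List[Product] = field(default_factory=list)
    coverages: List[Coverage] = field(default_factory=list)

    def insert_product(self, product: Product) -> None:
        ctype = self.collection_type
        # Without a CollectionType, any Product is accepted.
        if ctype is not None and product.product_type not in ctype.allowed_product_types:
            raise ValueError(f"{product.identifier}: ProductType not allowed")
        self.products.append(product)

    def insert_coverage(self, coverage: Coverage) -> None:
        ctype = self.collection_type
        if ctype is not None and coverage.coverage_type not in ctype.allowed_coverage_types:
            raise ValueError(f"{coverage.identifier}: CoverageType not allowed")
        # Assumption: a gridded Collection only accepts Coverages on the same Grid.
        if self.grid is not None and coverage.grid != self.grid:
            raise ValueError(f"{coverage.identifier}: Grid does not match")
        self.coverages.append(coverage)
```

Failing at insertion time is what "prohibit errors early" amounts to here: an ill-typed Product never ends up in the Collection in the first place.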
A Product is a grouping of Coverages of roughly the same spatio-temporal extent. The added benefit is to group said Coverages and to provide additional metadata that applies to all of them without the necessity to duplicate.
A Product can be inserted into any number of Collections
A Product can be associated with a ProductType, which prohibits insertion of Coverages whose CoverageType is not part of ProductType.allowed_coverage_types.
A whole Sentinel-2 SAFE container is a good example for a real-life Product.
Previously Products had no direct representation, but could have been portrayed as nested DatasetSeries
This model adds additional searchable metadata to allow refined searches (e.g: in OpenSearch)
A Coverage is an n-dimensional construct (2D for the time being) with at least one, but potentially more, sample properties. A Coverage is not necessarily stored as a single file: the data can be split into separate files when necessary.
A Coverage can be inserted into any number of Collections, but can only be part of at most one Product (I'm not adamant on this, I just don't think that there will be a situation where a Coverage is added to more than one Product).
A Coverage can be associated with a Grid; in this case it also has to have its origin properties set so that it can be placed correctly on the Grid.
A Coverage can be associated with a CoverageType. This allows providing additional band-related metadata and inserting the Coverage into a Collection or Product that limits insertion to specific CoverageTypes.
Previously known as RectifiedDataset and ReferenceableDataset
Coverages only have a limited amount of metadata associated (time interval and footprint, and, technically, Grid and CoverageType). This model allows for more detailed searches (used e.g. in OpenSearch).
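Two of the Coverage rules above lend themselves to a short sketch: a Coverage on a Grid needs its origin, and a Coverage belongs to at most one Product. The field names (`grid_name`, `origin`, `product`) and the `insert_coverage` method are hypothetical and chosen for this illustration only; the real models would be Django models with foreign keys.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Coverage:
    identifier: str
    grid_name: Optional[str] = None               # hypothetical stand-in for a Grid link
    origin: Optional[Tuple[float, float]] = None  # 2D only for the time being
    # Back-reference to the owning Product; excluded from repr/eq to avoid cycles.
    product: Optional["Product"] = field(default=None, repr=False, compare=False)

    def __post_init__(self) -> None:
        # A Coverage on a Grid needs an origin to be placed on that Grid.
        if self.grid_name is not None and self.origin is None:
            raise ValueError(f"{self.identifier}: Grid set but origin missing")


@dataclass
class Product:
    identifier: str
    coverages: List[Coverage] = field(default_factory=list)

    def insert_coverage(self, coverage: Coverage) -> None:
        # A Coverage may be part of at most one Product.
        if coverage.product is not None and coverage.product is not self:
            raise ValueError(
                f"{coverage.identifier} already belongs to {coverage.product.identifier}"
            )
        if coverage.product is None:
            coverage.product = self
            self.coverages.append(coverage)
```

Should the "at most one Product" restriction be dropped later, only `insert_coverage` would have to change.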
A browse references a pre-computed image for display purposes. Though technically Rasters themselves, there are slight differences:
- Browses do not have their own identifier
- There is always a specific style associated (with a sensible default), for which this `Browse` shall be used
These models give metadata about the entailed raster values of the associated Collection, Product, and Raster.
A Sample portrays a single physically observed value, e.g. a certain wavelength, color channel or similar.
The Samples model groups one or more Sample items and adds additional metadata.
The SamplesCollection is a set of Samples, to be used in Collections and Rasters.
As stated earlier, only Rasters must have linked Samples; for all other models this is optional (but beneficial for various reasons; I'll go into more detail below).
Previously known as Band and RangeType (no direct representation of SamplesCollection)
This model represents an n-dimensional (but limited to 2D for now) grid for georeferencing data. It consists of information about the used CRS and resolution.
Previously known as Extent/Projection
This model is linked with a ProductType to add information on how to translate the values to an RGB view. For starters, this will just be a selection of the red, green and blue bands with a scaling value, but color-expression-based syntax and color scales are conceivable as well.
This was previously known as WMSRenderOptions in the services app, but had some slight differences
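To make the "band selection plus scaling value" idea concrete, here is a minimal sketch. The `BrowseType` fields and the `to_rgb` helper are hypothetical, and the scaling is assumed to be a single multiplicative factor followed by clipping to 0..255; the proposal does not fix these details.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BrowseType:
    name: str
    red: str             # name of the band used for the red channel
    green: str           # name of the band used for the green channel
    blue: str            # name of the band used for the blue channel
    scale: float = 1.0   # assumption: one multiplicative factor for all bands


def to_rgb(browse_type: BrowseType,
           bands: Dict[str, List[float]]) -> List[Tuple[int, int, int]]:
    """Map the raw values of the selected bands to 8-bit RGB triples."""

    def scaled(band_name: str) -> List[int]:
        # Scale, round, then clip into the displayable 0..255 range.
        return [max(0, min(255, round(v * browse_type.scale)))
                for v in bands[band_name]]

    return list(zip(scaled(browse_type.red),
                    scaled(browse_type.green),
                    scaled(browse_type.blue)))


true_color = BrowseType("TRUE_COLOR", red="B04", green="B03", blue="B02", scale=0.5)
pixels = to_rgb(true_color, {
    "B04": [0, 200, 1000],
    "B03": [10, 20, 30],
    "B02": [2, 4, 6],
})
```

A color-expression-based variant would replace the three band names with expressions, but the scale-and-clip step would stay the same.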
This model is used to denote the presence of vector and bit masks and to provide some rendering hints for them. Besides the name of the mask type (like 'clouds', 'snow', 'validity'), a default color and mechanism (mask "in" or "out") can be specified.
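A possible shape for such a mask type is sketched below. Everything beyond the three attributes named above (name, default color, mechanism) is an assumption. In particular, "in" is interpreted here as "keep only the pixels inside the mask" and "out" as "drop the pixels inside the mask"; the proposal leaves the exact semantics open.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MaskType:
    name: str
    color: str = "black"      # default rendering color
    mechanism: str = "out"    # "in" or "out"

    def __post_init__(self) -> None:
        if self.mechanism not in ("in", "out"):
            raise ValueError(f"unknown mask mechanism: {self.mechanism!r}")


def apply_mask(mask_type: MaskType,
               pixels: List[int],
               mask: List[bool]) -> List[Optional[int]]:
    """Return the pixels with masked-away values replaced by None."""
    keep_inside = mask_type.mechanism == "in"
    # A pixel survives when its "inside the mask" flag matches the mechanism.
    return [value if inside == keep_inside else None
            for value, inside in zip(pixels, mask)]


clouds = MaskType("clouds", color="white", mechanism="out")
validity = MaskType("validity", mechanism="in")
```

With these two instances, a cloud mask removes the covered pixels while a validity mask keeps only the covered ones.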
For plain WCS there is little change with the exception that Coverages with a Grid will be advertised in GetCapabilities and rendered in GetCoverage.
Collections will be advertised in the GetCapabilities as DatasetSeries.
Products are not advertised (we could think of a visible flag; then they could be advertised as DatasetSeries)
Coverages are not advertised (we could also use a visible flag, then the Coverage is advertised either as a RectifiedDataset or ReferenceableDataset, depending on whether or not it is associated with a Grid)
Only works for Coverages, with no notable changes to the currently implemented solution.
As we have already established, Collections are either DatasetSeries or RectifiedStitchedMosaics, so an identifier referencing a Collection will include all Products as sub-DatasetSeries, and all Coverages (either directly associated or via its Products) are displayed as RectifiedDatasets/ReferenceableDatasets (depending on the presence of a Grid).
Similarly, using an id for a Product will result in a list of all Coverages as datasets.
When used with ids referencing a Coverage, there is no change.
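The expansion rules for the three kinds of identifiers can be summarized in a small sketch. The classes are simplified stand-ins, `describe_eo_coverage_set` is a hypothetical helper that only returns (kind, identifier) pairs, and it assumes that a Coverage with a Grid maps to a RectifiedDataset and one without to a ReferenceableDataset.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union


@dataclass
class Coverage:
    identifier: str
    has_grid: bool = True


@dataclass
class Product:
    identifier: str
    coverages: List[Coverage] = field(default_factory=list)


@dataclass
class Collection:
    identifier: str
    products: List[Product] = field(default_factory=list)
    coverages: List[Coverage] = field(default_factory=list)  # not in any Product


def dataset_kind(coverage: Coverage) -> str:
    # Assumption: Grid present -> rectified, otherwise referenceable.
    return "RectifiedDataset" if coverage.has_grid else "ReferenceableDataset"


def describe_eo_coverage_set(
    obj: Union[Collection, Product, Coverage]
) -> List[Tuple[str, str]]:
    if isinstance(obj, Coverage):
        return [(dataset_kind(obj), obj.identifier)]
    if isinstance(obj, Product):
        return [(dataset_kind(c), c.identifier) for c in obj.coverages]
    # Collection: Products become sub-DatasetSeries, followed by all Coverages.
    entries = [("DatasetSeries", p.identifier) for p in obj.products]
    for product in obj.products:
        entries.extend(describe_eo_coverage_set(product))
    entries.extend((dataset_kind(c), c.identifier) for c in obj.coverages)
    return entries
```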
In the capabilities document, all Collections are displayed as separate <Layer> tags, with an associated time interval.
When a CollectionType is associated with the Collection, a nested <Layer> with the "_BANDS" postfix is added. Inside it, a <Dimensions> tag lists the FieldType names (or wavelengths, if available) of all allowed_coverage_types. In addition, for every BrowseType and MaskType linked to a ProductType in allowed_product_types, a nested <Layer> is added with the respective name as postfix.
In order to keep the size of the capabilities document sane, Products and Coverages are not advertised as layers by default (we could introduce a visible flag here as well).
The behavior of GetMap depends on a variety of different parameters:
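As a rough illustration of the resulting layer names for one Collection, consider the sketch below. The function `layer_names` and its parameters are hypothetical; it only derives the names and does not build the capabilities XML.

```python
from typing import Iterable, List


def layer_names(collection_id: str,
                has_collection_type: bool,
                browse_types: Iterable[str] = (),
                mask_types: Iterable[str] = ()) -> List[str]:
    """List the layer names advertised for a single Collection."""
    names = [collection_id]  # the Collection itself is always a <Layer>
    if has_collection_type:
        # Nested layers only exist when a CollectionType is associated.
        names.append(f"{collection_id}_BANDS")
        for name in list(browse_types) + list(mask_types):
            names.append(f"{collection_id}_{name}")
    return names


names = layer_names("S2", True, browse_types=["TRUE_COLOR"], mask_types=["clouds"])
```

Since only Collections are listed, the document grows with the number of Collections and their types, not with the number of Products or Coverages.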
- The referenced object's type (`Collection`, `Product`, `Coverage`)
- The used postfix (e.g. "_BANDS")
- The used `dim_bands`/`dim_wavelength` parameter
- The used `style` parameter
| Type | Postfix | Behavior |
|---|---|---|
| `Collection` | none | When there is a default BrowseType, all Product Browses of that type will be rendered. For all Products that do not have such a Browse, the actual Coverages will be accessed and rendered accordingly. |
| `Collection` | `_<browse>` | The same as above, but instead of the default style, the named BrowseType is used. |
| `Collection` | `_<mask>` | The MaskType of the given name is used to determine what masks are used and how they are rendered. The style parameter may be passed to indicate the desired color. |
| `Collection` | `_BANDS` | Here, the Coverages will always be directly accessed (no Browses used). |
| `Collection` | `_OUTLINES` | In this case the footprints of all Products and of all Coverages not in Products are rendered (again, the style parameter indicates the color). |
| `Product` | none | Similar to a whole Collection, just using the single Product. |
| `Product` | `_<browse>` | Similar to a whole Collection, just using the single Product. |
| `Product` | `_<mask>` | Similar to a whole Collection, just using the single Product. |
| `Product` | `_BANDS` | Similar to a whole Collection, just using the single Product. |
| `Product` | `_OUTLINES` | Similar to a whole Collection, just using the single Product. |
| `Coverage` | none | An exception report is generated and returned. |
| `Coverage` | `_BANDS` | The selected bands of the Coverage will be used as either gray or red, green and blue colors. |
| `Coverage` | `_OUTLINES` | The footprint will be drawn (using the style parameter as color hint). |
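The table can be read as a two-step dispatch: first split the requested layer name into identifier and postfix, then pick a rendering mode from the object type and postfix. The sketch below illustrates this; `split_layer`, `render_mode`, the mode names and the `known` lookup table are all hypothetical. Preferring the longest matching identifier is an assumption made here because identifiers may themselves contain underscores.

```python
from typing import Dict, Iterable, Tuple


def split_layer(layer: str, known: Dict[str, str]) -> Tuple[str, str, str]:
    """Return (identifier, object type, postfix) for a requested layer name."""
    # Try longer identifiers first so "S2_PROD" wins over "S2".
    for identifier in sorted(known, key=len, reverse=True):
        if layer == identifier:
            return identifier, known[identifier], ""
        if layer.startswith(identifier + "_"):
            return identifier, known[identifier], layer[len(identifier) + 1:]
    raise LookupError(f"no object matches layer {layer!r}")


def render_mode(object_type: str, postfix: str,
                browse_types: Iterable[str] = (),
                mask_types: Iterable[str] = ()) -> str:
    """Pick the rendering mode according to the table above."""
    if postfix == "BANDS":
        return "bands"
    if postfix == "OUTLINES":
        return "outlines"
    if object_type == "Coverage":
        # Coverages only support _BANDS and _OUTLINES: exception report.
        raise ValueError("Coverages can only be rendered via _BANDS or _OUTLINES")
    if postfix == "":
        return "default_browse"
    if postfix in browse_types:
        return "browse"
    if postfix in mask_types:
        return "mask"
    raise ValueError(f"unknown postfix {postfix!r}")


known = {"S2": "Collection", "S2_PROD": "Product", "cov1": "Coverage"}
identifier, object_type, postfix = split_layer("S2_PROD_BANDS", known)
```

Collections and Products share the same branches, which mirrors the "similar to a whole Collection" rows in the table.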
Direct download view for Browses.
The initial OSDD provides URLs (for various formats, such as ATOM, RSS, JSON) that, once invoked, result in search results listing Collections. Among other metadata, each item provides a URL to a Collection specific OSDD which in turn provides URLs for the aforementioned formats.
These search URLs yield search result documents where each item refers to either a Product or Coverage (that is in turn not part of a Product). Coverages inside of a Product are not expanded to result items themselves.
The links included for each item are:
- A link to the same item in the same format (self reference)
- A link to the default WMS view of the `Product`/`Coverage`
- For each `BrowseType` and `MaskType`, a link to the associated WMS view
- For Products:
  - A link to a `DescribeEOCoverageSet` for that `Product`
  - For each entailed `Coverage` a `GetCoverage` URL
Previously, it was easy to sub-class EOxServer models to add additional fields and to use .cast() to retrieve the actual model. Sub-classing is still possible (though simply adding a model that references the Collection, Product or Raster is valid too), but the .cast() method will disappear. It is no longer required (the models are now more concise), and removing it eliminates a considerable performance penalty: .cast() would almost always hit the DB, and that for a single record.
- Remove `RectifiedStitchedMosaic`:
  - Was not really used, in the future a `Mosaic` model can be introduced.
- Remove `DatasetSeries`:
  - There never was any added value anyhow, and now, a `DatasetSeries` is becoming an abstract concept, but more on that below.
- Remove `RectifiedDataset` and `ReferenceableDataset`:
  - Both can be abstracted by a `Coverage` with a `Grid` set or not
- Add `CollectionType`, `ProductType` and `CoverageType` models
- Rename `RangeType` to `CoverageType`
- Rename `Band` to `FieldType`
  - Add nullable `wavelength` field
- Changes to `Collection`:
  - disallow nesting of `Collections`: the nesting feature was never really used, made the code complicated and querying slow (it was impossible to get all records for an arbitrarily nested collection)
  - Add nullable (thus optional) link to `SampleCollection` (see below, necessary to quickly see what `Samples` (before `RangeTypes`) are allowed in this collection).
  - Add nullable link to `Grid`: when the `Grid` is set, only rasters with that grid can be added
  - Inherit from `backends.Dataset` to allow to add metadata files to the collection itself
- Introduce model `Grid` as a combination of `Projection` and `Extent`