The rapid adoption of Artificial Intelligence (AI) has highlighted the need for high-quality training data to enable AI systems to offer effective decision-making support. Data quality is therefore a critical requirement for future geospatial technologies, just as it has always been for historic ones. Whereas some of those future geospatial technologies will likely rely on raster data, many others will rely on vector feature data. Therefore, validators capable of checking the validity of vector feature data files are likely to play a key role in an AI-driven future.
The focus of this Engineering Report (ER) is a code sprint that was held from July 10th to 12th, 2024 to advance the support and development of open standards within the developer community. The code sprint was organized by the Open Geospatial Consortium (OGC) and hosted by Geovation in London, England. The code sprint was sponsored by Google and supported by Natural Resources Canada (NRCan). The code sprint included activities involving several OGC API Standards and data encoding standards, as well as special tracks on Data Quality & AI, Map Markup Language (MapML) and Validators. Other activities at the code sprint were related to OGC CoverageJSON, OGC SensorThings API WebSub Extension, OGC API — Records, OGC API — Features, and OGC Styles and Symbology.
The MapML special track of the code sprint sought to prototype integration of a MapML viewer into implementations of OGC API Standards. The MapML specification extends the semantics of several HTML Standard elements, and specifies a small set of new, mapping-specific elements, in the HTML namespace. This engineering report concludes that the sprint participants proved that MapML has a role to play in the geospatial ecosystem and that MapML can be easily integrated into implementations of OGC API Standards.
The Data Quality and AI track of the code sprint sought to implement support for the Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) Standard and for a Data Quality Measures Register based on the ISO 19157 series of Standards. The TrainingDML-AI Standard offers a conceptual model and data encodings for geospatial machine learning training data. The engineering report concludes that the use cases for machine-readable and executable provenance chains should be widened to include specific Machine Learning (ML) training data use cases. Sampling, correcting, and training on large datasets should be supported by a reproducible method for training models.
The Validators track of the code sprint sought to extend and implement various tools for testing datasets and products for compliance with OGC Standards. In a previous code sprint, a JSON-FG Linter was developed. A linter checks the correctness of code or an encoding format inside an editor application while a user is writing or editing. In this code sprint, some of the participants sought to expand on the previous work by creating a linter for OGC API — Features. Another team of participants sought to implement an Executable Test Suite (ETS) for TrainingDML-AI. The engineering report concludes that there is significant potential for client-side validators such as the Linter to enhance the developer experience, alongside server-side validators such as TEAM Engine.
The code sprint was held as a generic code sprint, meaning that all OGC working groups were encouraged to participate in the event. As a result, several OGC Standards Working Groups (SWGs) set up teams of developers to collaborate during the three-day event. In addition to providing software developers with an environment for collaborative coding and experimentation, the code sprint also provided opportunities for thought leadership through presentations and tutorials in the Mentor Stream. This made the code sprint a rich environment for knowledge transfer across teams, as well as for nurturing cross-functional teams.
The sprint participants made the following recommendations regarding potential experiments in future Collaborative Solutions and Innovation (COSI) Program initiatives:
OGC API — 3D Geovolumes experimentation in the context of Digital Twins
CDB2 experimentation in the context of Digital Twins
Provide feedback to ISO/TC 211 on findings regarding Data Quality and AI.
The following are keywords to be used by search engines and document catalogues.
ogcdoc, OGC document, API, openapi, html, tdml-ai, mapml, json-fg
All questions regarding this document should be directed to the editors or the contributors:
Table — Submitters
Name | Organization | Role |
---|---|---|
Gobe Hobona | OGC | Editor |
Joana Simoes | OGC | Editor |
Tom Kralidis | OSGeo | Contributor |
Chris Little | Met Office | Contributor |
Frank Terpstra | Geonovum | Contributor |
Joost Farla | Geonovum | Contributor |
Maxime Collombin | University of Applied Sciences, Western Switzerland (HEIG-VD) | Contributor |
Sam Meek | Helyx Secure Information Systems | Contributor |
Peter Rushforth | Natural Resources Canada | Contributor |
Aliyan Haq | Natural Resources Canada | Contributor |
Rui Cavaco | Norte Portugal Regional Coordination and Development Commission | Contributor |
Ivana Ivanova | Curtin University | Contributor |
Jerome St-Louis | Ecere | Contributor |
Ricardo Garcia Silva | Geobeyond | Contributor |
Samantha Lavender | Pixalytics Ltd | Contributor |
Joan Maso | UAB-CREAF | Contributor |
Panagiotis (Peter) A. Vretanos | CubeWerx Inc. | Contributor |
The subject of this Engineering Report (ER) is a code sprint that was held from July 10th to 12th, 2024 to advance the support and development of open standards within the developer community. The code sprint was organized by the Open Geospatial Consortium (OGC) and hosted by Geovation in London, England. The code sprint was sponsored by Google and supported by Natural Resources Canada (NRCan). The code sprint included activities involving several OGC API Standards and data encoding standards, as well as special tracks on Data Quality & Artificial Intelligence, Map Markup Language (MapML) and Validators.
OGC Code Sprints experiment with emerging ideas in the context of geospatial Standards and help improve interoperability of existing Standards by experimenting with new extensions or profiles. They are also used for building proofs-of-concept to support standards development activities and the enhancement of software products. The nature of the activities is influenced by whether a code sprint is ‘generic’ or ‘focused’. All OGC working groups are invited and encouraged to set up a thread in generic code sprints, whereas focused code sprints are tailored to a specific set of standards (typically limited to three standards).
This ER presents the high-level architecture of the code sprint and describes each of the standards and software packages that were deployed in support of the code sprint. The ER also discusses the results and presents a set of conclusions and recommendations. The recommendations identify ideas for future work, some of which may be more appropriate for testbeds, pilots, or other types of OGC initiatives. Therefore, the reader is encouraged to consider the recommended future work within the context of all OGC Standards development, collaborative solutions, and innovation activities.
This document uses the terms defined in OGC Policy Directive 49, which is based on the ISO/IEC Directives, Part 2, Rules for the structure and drafting of International Standards. In particular, the word “shall” (not “must”) is the verb form used to indicate a requirement to be strictly followed to conform to this document and OGC documents do not use the equivalent phrases in the ISO/IEC Directives, Part 2.
This document also uses terms defined in the OGC Standard for Modular specifications (OGC 08-131r3), also known as the ‘ModSpec’. The definitions of terms such as standard, specification, requirement, and conformance test are provided in the ModSpec.
For the purposes of this document, the following additional terms and definitions apply.
An Application Programming Interface (API) is a standard set of documented and supported functions and procedures that expose the capabilities or data of an operating system, application, or service to other applications (adapted from ISO/IEC TR 13066-2:2016).
A Coordinate Reference System (CRS) is a coordinate system that is related to the real world by a datum (source: ISO 19111).
An OpenAPI document is a document (or set of documents) that defines or describes an API. An OpenAPI definition uses and conforms to the OpenAPI Specification (https://www.openapis.org).
A Web API is an API using an architectural style that is founded on the technologies of the Web [source: OGC API — Features — Part 1: Core].
API: Application Programming Interface
CITE: Compliance Interoperability & Testing Evaluation
REST: Representational State Transfer
TEAM: Test, Evaluation, And Measurement Engine
As illustrated in Figure 1, the sprint architecture was designed to enable client applications to connect to different servers that implement a variety of standards. The architecture also included several different software libraries that support open geospatial standards and enable the extraction, transformation, and loading of geospatial data. The participants deployed their software on their own infrastructure.
Figure 1 — High Level Overview of the Sprint Architecture
The rest of this section describes the software deployed, and standards implemented during the code sprint or in support of the code sprint.
The OGC SensorThings API Standard provides an open and harmonized way to interconnect devices, applications, and data over the web and on the Internet of Things (IoT) (OGC 18-088). At a high level the SensorThings API provides two main parts, namely Part I — Sensing, and Part II — Tasking. The Sensing part of the Standard provides a way to manage and retrieve observations and metadata from different sensor systems. The Tasking part of the Standard provides a way for tasking IoT devices, such as actuators and sensors. The SensorThings API follows REST principles and uses JSON for encoding messages as well as Message Queuing Telemetry Transport (MQTT) for publish/subscribe operations.
In this code sprint, activity related to the OGC SensorThings API focused on a WebSub extension to the OGC SensorThings API Standard. The OGC SensorThings API Extension WebSub prototype is based on the W3C WebSub Recommendation. The prototype supports subscribe/unsubscribe and opaque discovery capabilities that are offered by implementations of OGC SensorThings API.
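As a rough illustration of the subscribe/unsubscribe capability mentioned above, the following sketch builds the form-encoded body that the W3C WebSub Recommendation defines for a subscription request (`hub.mode`, `hub.topic`, `hub.callback`, and the optional `hub.lease_seconds`). The source does not say how the sprint prototype maps SensorThings resources to WebSub topics, so the topic URL, callback URL, and function name below are assumptions made for illustration only.

```python
from urllib.parse import urlencode, parse_qs


def build_websub_request(topic_url, callback_url, mode="subscribe", lease_seconds=None):
    """Return the form-encoded body of a WebSub subscribe/unsubscribe request."""
    if mode not in ("subscribe", "unsubscribe"):
        raise ValueError("mode must be 'subscribe' or 'unsubscribe'")
    fields = {
        "hub.mode": mode,
        "hub.topic": topic_url,        # resource the subscriber wants updates for
        "hub.callback": callback_url,  # subscriber endpoint that the hub will call
    }
    if lease_seconds is not None:
        fields["hub.lease_seconds"] = str(lease_seconds)
    return urlencode(fields)


# Hypothetical URLs: a SensorThings observation collection used as the topic.
body = build_websub_request(
    topic_url="https://example.org/sta/v1.1/Datastreams(1)/Observations",
    callback_url="https://client.example.org/websub/callback",
    lease_seconds=3600,
)
print(body)
```

A client would send this body in an HTTP POST to the hub URL that it discovered for the topic, using the `application/x-www-form-urlencoded` content type.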
The OGC API — Features Standard offers the capability to create, manage, and query spatial data on the Web. The Standard specifies requirements and recommendations for Web APIs that are designed to facilitate the sharing of feature data. The specification is a multi-part standard. Part 1, labelled the Core, describes the mandatory capabilities that every implementing service has to support and is restricted to read-access to spatial data that is referenced to the World Geodetic System 1984 (WGS 84) Coordinate Reference System (CRS) (OGC 17-069r4). Part 2 enables the use of different CRSs, in addition to the WGS 84 (OGC 18-058r1). Additional capabilities that address specific needs will be specified in additional parts. Envisaged future capabilities include, for example, support for creating and modifying data, more complex data models, and richer queries.
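The read-access pattern described above can be sketched as a small URL builder. The `/collections/{collectionId}/items` path and the `bbox` and `limit` parameters come from Part 1 of the Standard, and the `crs` parameter comes from Part 2. The server root, collection name, and function name are hypothetical.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def items_url(api_root, collection_id, bbox=None, limit=10, crs=None):
    """Build a feature items request URL for one collection."""
    params = {"limit": limit}
    if bbox is not None:
        # bbox is min lon, min lat, max lon, max lat (WGS 84 unless bbox-crs is given)
        params["bbox"] = ",".join(str(v) for v in bbox)
    if crs is not None:
        # Part 2 only: ask for response geometries in another CRS, identified by URI
        params["crs"] = crs
    return f"{api_root.rstrip('/')}/collections/{collection_id}/items?{urlencode(params)}"


# Hypothetical server and collection, for illustration only.
url = items_url(
    "https://example.org/api",
    "buildings",
    bbox=(-0.5, 51.3, 0.3, 51.7),
    limit=100,
    crs="http://www.opengis.net/def/crs/EPSG/0/27700",
)
print(url)
```

Only servers that declare conformance to Part 2 are expected to honour `crs`; a Part 1 server returns geometries in WGS 84.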
OGC API — Tiles specifies a Standard for Web APIs that provide tiles of geospatial information (OGC 20-057). The Standard supports different forms of geospatial data, such as tiles of vector features (colloquially called “vector tiles”), coverages, and maps (or imagery), and may support additional types of tiles of geospatial data in the future.
In this context, a map is essentially an image representing at least one type of geospatial information. Tiles of maps (i.e., map tiles) represent subsets of maps covering an area.
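To make the idea of a tile covering a subset of a map concrete, the sketch below computes which tile contains a given longitude and latitude, assuming the widely used WebMercatorQuad tile matrix set, in which the tile matrix identifier equals the zoom level. The path layout follows the `{tileMatrixSetId}/{tileMatrix}/{tileRow}/{tileCol}` template; the collection name is hypothetical, and a real client should take the URL template from the links advertised by the server rather than hard-coding it.

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return (row, col) of the WebMercatorQuad tile containing a WGS 84 point."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom level
    col = int(math.floor((lon + 180.0) / 360.0 * n))
    lat_rad = math.radians(lat)
    row = int(math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n))
    # Clamp so that points on the outer edge still map to a valid tile.
    col = min(max(col, 0), n - 1)
    row = min(max(row, 0), n - 1)
    return row, col


def tile_path(collection_id, tms_id, zoom, row, col):
    """Assumed path layout for tiles of one collection."""
    return f"/collections/{collection_id}/tiles/{tms_id}/{zoom}/{row}/{col}"


row, col = lonlat_to_tile(-0.1276, 51.5072, 10)  # central London
print(tile_path("buildings", "WebMercatorQuad", 10, row, col))
```

Rows are counted from the top of the tile matrix and columns from the left, which is why the row index grows as latitude decreases.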
The OGC API — Environmental Data Retrieval (EDR) Standard provides a family of lightweight interfaces to access Environmental Data resources. Each resource addressed by an EDR API maps to a defined query pattern. This Standard identifies resources, captures compliance classes, and specifies requirements which are applicable to OGC Environmental Data Retrieval APIs. This Standard addresses both discovery and query operations. Discovery operations enable the API to be interrogated to determine its capabilities and retrieve metadata about the published resource. Query operations allow Environmental Data resources to be retrieved from the underlying data store based upon simple selection criteria, defined by this Standard and selected by the client.
Version 1.1 of OGC API — EDR has been published (OGC 19-086r6). The EDR API Standards Working Group (SWG) has recently obtained approval to publish Part 2 of the Standard: “OGC API — Environmental Data Retrieval — Part 2: Publish-Subscribe workflow” (OGC 23-057). The focus of the EDR API-related work in this code sprint is therefore on the use of Part 2 of the Standard. Work continues on defining improvements to Part 1: Core Version 1.1 to be known as Version 1.2.
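As an example of a query operation with simple selection criteria, the following sketch builds a Part 1 position query, which uses the `coords`, `parameter-name`, `datetime`, and `f` query parameters. The server root, collection identifier, parameter names, and the `CoverageJSON` format label are hypothetical; the output formats that can actually be requested are advertised by each server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def edr_position_url(api_root, collection_id, lon, lat, parameter_names,
                     datetime_range=None, output_format=None):
    """Build a position query URL for one collection."""
    params = {
        "coords": f"POINT({lon} {lat})",  # Well-Known Text, longitude first
        "parameter-name": ",".join(parameter_names),
    }
    if datetime_range is not None:
        params["datetime"] = datetime_range  # an instant or a start/end interval
    if output_format is not None:
        params["f"] = output_format
    return f"{api_root.rstrip('/')}/collections/{collection_id}/position?{urlencode(params)}"


# Hypothetical server, collection, and parameter names.
url = edr_position_url(
    "https://example.org/edr",
    "surface-observations",
    lon=-0.1276,
    lat=51.5072,
    parameter_names=["air_temperature", "wind_speed"],
    datetime_range="2024-07-10T00:00:00Z/2024-07-12T23:59:59Z",
    output_format="CoverageJSON",
)
print(url)
```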
The OGC API — Processes Standard supports the wrapping of computational tasks into executable processes that can be offered by a server through a Web API and be invoked by a client application (OGC 18-062r2). The Standard enables the execution of computing processes and the retrieval of metadata describing the purpose and functionality of the processes. Typically, these processes execute well-defined algorithms that ingest vector and/or coverage data to produce new datasets.
The OGC API — Processes — Part 2: Deploy, Replace, Undeploy candidate Standard extends the core capabilities specified in the OGC API — Processes — Part 1: Core (OGC 18-062r2) with the ability to dynamically add, modify and/or delete individual processes using an implementation (endpoint) of the OGC API — Processes Standard.
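Invoking a process through a Part 1 implementation amounts to posting a JSON document with an `inputs` member to the `/processes/{processId}/execution` resource. The sketch below assembles such a request without sending it. The process identifier `buffer` and its input names are invented for illustration; a real client would read the input definitions from the process description first.

```python
import json


def build_execute_request(api_root, process_id, inputs):
    """Return (url, headers, body) for executing a process."""
    url = f"{api_root.rstrip('/')}/processes/{process_id}/execution"
    headers = {"Content-Type": "application/json"}
    body = json.dumps({"inputs": inputs})
    return url, headers, body


# Hypothetical process and inputs, for illustration only.
url, headers, body = build_execute_request(
    "https://example.org/api",
    "buffer",
    {
        "distance": 25.0,
        "geometry": {"type": "Point", "coordinates": [-0.1276, 51.5072]},
    },
)
print(url)
print(body)
```

Keeping request construction separate from transport makes the payload easy to inspect or test before it is sent with any HTTP client.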
The Training Data Markup Language for Artificial Intelligence (TrainingDML-AI) is a Standard for describing Machine Learning Training Datasets and their quality, as well as how they have been developed. Version 1.0 of the Part 1: Conceptual Model Standard has been published. Parts 2 & 3 with JSON and XML encodings have been submitted to OGC.
CoverageJSON is a format for encoding coverage data such as grids, time series, and vertical profiles, each distinguished by the geometry of their spatiotemporal domain. A CoverageJSON document is serialized in JavaScript Object Notation (JSON). A CoverageJSON object represents a domain, a range, a coverage, or a collection of coverages. A range in CoverageJSON represents coverage values. A coverage in CoverageJSON is the combination of a domain, parameters, ranges, and additional metadata. A coverage collection represents a list of coverages.
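The relationship between domain, parameters, and ranges can be illustrated with a small time series at a single point. The structure below follows the CoverageJSON object model as commonly documented, but it has not been validated against the official schema, and the parameter name, values, and the `check_ranges` helper are made up for this sketch.

```python
import json
from math import prod

# A point time series: one location, three time steps, one parameter.
coverage = {
    "type": "Coverage",
    "domain": {
        "type": "Domain",
        "domainType": "PointSeries",
        "axes": {
            "x": {"values": [-0.1276]},
            "y": {"values": [51.5072]},
            "t": {"values": ["2024-07-10T12:00:00Z",
                             "2024-07-11T12:00:00Z",
                             "2024-07-12T12:00:00Z"]},
        },
        "referencing": [
            {"coordinates": ["x", "y"],
             "system": {"type": "GeographicCRS",
                        "id": "http://www.opengis.net/def/crs/OGC/1.3/CRS84"}},
            {"coordinates": ["t"],
             "system": {"type": "TemporalRS", "calendar": "Gregorian"}},
        ],
    },
    "parameters": {
        "TEMP": {
            "type": "Parameter",
            "observedProperty": {"label": {"en": "Air temperature"}},
            "unit": {"symbol": "Cel"},
        }
    },
    "ranges": {
        "TEMP": {
            "type": "NdArray",
            "dataType": "float",
            "axisNames": ["t"],
            "shape": [3],
            "values": [21.4, 19.8, 23.1],
        }
    },
}


def check_ranges(cov):
    """Return a list of basic consistency problems; an empty list means none found."""
    problems = []
    for name, rng in cov["ranges"].items():
        if name not in cov["parameters"]:
            problems.append(f"range '{name}' has no matching parameter")
        expected = prod(rng["shape"])
        if len(rng["values"]) != expected:
            problems.append(f"range '{name}' has {len(rng['values'])} values, expected {expected}")
    return problems


print(check_ranges(coverage))
print(json.dumps(coverage)[:60])
```

The check covers only two consistency rules (every range has a parameter, and the value count matches the declared shape); it is not a substitute for schema validation.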
The OGC API — Connected Systems candidate Standard specifies the fundamental API building blocks for interacting with Connected Systems and associated resources (OGC 23-001). A Connected System represents any kind of system that can either directly transmit data via communication networks (being connected to them in a permanent or temporary fashion), or whose data is made available in one form or another via such networks. This definition encompasses systems of all kinds, including in-situ and remote sensors, actuators, fixed and mobile platforms, airborne and space-borne systems, robots and drones, and even humans who collect data or execute specific tasks.
The OGC API — Maps candidate Standard describes an API that can serve spatially referenced and dynamically rendered electronic maps (OGC 20-058). The specification describes the discovery and query operations of an API that provides access to electronic maps in a manner independent of the underlying data store. The query operations allow dynamically rendered maps to be retrieved from the underlying data store based upon simple selection criteria as defined by the client.
The OGC API — Records candidate Standard provides discovery and access to metadata records that describe resources such as features, coverages, tiles / maps, models, assets, datasets, services, or widgets (OGC 20-004). The candidate Standard enables the discovery of geospatial resources by standardizing the way collections of descriptive information about the resources (metadata) are exposed. The candidate Standard also enables the discovery and sharing of related resources that may be referenced from geospatial resources or their metadata by standardizing the way all kinds of records are exposed and managed.
The OGC API — Discrete Global Grid Systems candidate Standard defines building blocks which can be used as part of a Web API to retrieve geospatial data for a specific area, time and resolution of interest, based on a specific Discrete Global Grid System (DGGS) and indexing scheme (OGC 21-038). The candidate Standard also supports querying of the list of DGGS zones from which data is available.
The draft OGC Features and Geometries JSON (JSON-FG) Standard extends the GeoJSON format to support a limited set of additional capabilities that are out-of-scope for GeoJSON but that are important for a variety of use cases involving feature data (OGC 21-045r1). In particular, the JSON-FG Standard specifies the following extensions to the GeoJSON format:
The DGGS-JSON and DGGS-UBJSON (based on Universal Binary JSON) formats, defined as part of requirement classes for OGC API — DGGS, provide a compact and efficient way to retrieve data quantized to a particular Discrete Global Grid Reference System (DGGRS). By leveraging a shared knowledge of the DGGRS between the client and server (or the producer and consumer in the case of an offline file), the large majority of the payload is simply the data values associated with each sub-zone at a given relative depth of the parent zone for which data is being encoded. A fixed deterministic sub-zone order needs to be defined as part of the DGGRS in order to enable this.
A draft JSON schema for DGGS-JSON is available. See also the first implementation deployed in the GNOSIS Map Server during this code sprint.
The OGC Cartographic Symbology candidate Standard (CartoSym) defines a Conceptual Model, a Logical Model and Encodings for describing symbology rules for the portrayal of geographical data (OGC 18-067r4). The targets of this candidate Standard are symbology encodings and cartographic rendering engines. The candidate Standard is modularized into multiple requirements classes. A minimal core describes an extensible framework, with clear extension points, for defining styles that consist of styling rules selected through expressions and that apply symbolizers configured using properties. The candidate Standard defines two encodings for the logical model: one based on JSON, which can be readily parsed by JSON parsers, and a more expressive encoding better suited for hand-editing, inspired by Web Cascading Style Sheets (CSS) and related cartographic symbology encodings.