
1 Data stages

Saif Shabou edited this page Jun 15, 2021 · 6 revisions

Data stages

  • Raw data: In the raw folder, data is stored as it is collected from the data sources. In some cases, such as the Adgar datasets, modifications may be applied to the raw data in order to provide it in an easy-to-use format.

  • Staging: The staging folder stores data mapped into a specific data model by using the connector framework. Staging data needs to be as similar as possible to the raw data in terms of content and values. Only the schema is changed, in order to provide a shared schema between datasets. Description of data staging model

  • Storage database: The database that combines the different data sources collected and integrated into the staging folder. We chose MongoDB as the database type in order to store GHG emissions data as documents and collections. To better control data coherence and uniqueness, we decided for the first version to use a relational model that joins different tables.
    Description of data storage model

  • Exposure database: A MongoDB database that is queried by the ogs API in order to expose harmonized data to external applications. Description of data exposure model
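
The path from raw data to exposed data can be sketched roughly as follows. This is only an illustration under stated assumptions, not the connector framework or the actual data models: the field names (`Country`, `Year`, `CO2_kt`, `source_id`), the mapping table, and both helper functions are hypothetical, and plain Python dictionaries stand in for MongoDB documents.

```python
from typing import Any

# Hypothetical mapping: raw column name -> field name in the shared staging schema.
RAW_TO_STAGING = {
    "Country": "country",
    "Year": "year",
    "CO2_kt": "value",
}


def to_staging(raw_row: dict[str, Any], source_id: str) -> dict[str, Any]:
    """Staging step: rename fields to the shared schema, copy values unchanged."""
    staged = {new: raw_row[old] for old, new in RAW_TO_STAGING.items()}
    staged["source_id"] = source_id  # reference to a separate "sources" table
    return staged


def to_exposure(
    emissions: list[dict[str, Any]], sources: list[dict[str, Any]]
) -> list[dict[str, Any]]:
    """Join emissions with their source metadata, relational style."""
    sources_by_id = {s["_id"]: s for s in sources}
    return [
        {**e, "source_name": sources_by_id[e["source_id"]]["name"]}
        for e in emissions
    ]


# Usage with made-up values.
raw_rows = [{"Country": "FR", "Year": 2019, "CO2_kt": 1.5}]
sources = [{"_id": "demo", "name": "Demo dataset"}]

staged_rows = [to_staging(row, "demo") for row in raw_rows]
exposed_rows = to_exposure(staged_rows, sources)
```

The staging step only renames fields, so a staged value can always be traced back to the raw value it came from. The join is where source metadata is attached, which keeps that metadata in one place instead of copying it into every emissions document.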
