2 Data Model Staging
Saif Shabou edited this page May 17, 2021 · 2 revisions
Staging data corresponds to the first step in standardizing GHG emissions data. Every dataset is stored in one file composed of n lines, where each line is one emission object following a JSON structure. The goal is to store as much information as possible according to a normalized schema, while keeping enough flexibility to hold information that is specific to a data source.
Mapping information for each data source can be recorded in this spreadsheet: https://docs.google.com/spreadsheets/d/1CnTpHjZZZepgJ1o1VuQUN61ZaLLRtM1OhZzJU9HCPaY/edit#gid=2038606787
An emission object is composed of 4 main components:
- data_source: Information related to the data source, such as name, link, description... Data-source-specific attributes may be stored in the properties object inside `data_source`.
- geo_component: Information describing the geographical entity to which the emission refers, such as name, ISO code, geo-scale... Data-source-specific attributes related to the geographical entity may be stored in the properties object inside `geo_component`.
- date: The date of the reported emission. For yearly emissions, set the day and month to the first of January.
- emission: Information describing the emission characteristics: gas, sector, unit, value.
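The four components can be assembled programmatically. The following is a minimal Python sketch, not part of the project code: the helper names `yearly_date` and `make_emission_object` are hypothetical, and the key names are taken from the JSON example and the field table on this page. It also shows the yearly-date rule (day and month set to the first of January).

```python
from datetime import date


def yearly_date(year: int) -> str:
    """Return the ISO date used for a yearly emission: January 1st of that year."""
    return date(year, 1, 1).isoformat()


def make_emission_object(source_name, geo_name, geo_id, id_type, scale,
                         year, gas, value, unit, sector_origin_name):
    """Assemble the four main components into one emission object (a dict).

    Hypothetical helper for illustration; only obligatory fields plus the
    geo-component name are filled in.
    """
    return {
        "data_source": {"name": source_name},
        "geo_component": {
            "scale": scale,
            "name": geo_name,
            "identifier": {"id": geo_id, "type": id_type},
        },
        "date": yearly_date(year),
        "emission": {
            "gas": gas,
            "value": value,
            "unit": {"unit_used": unit},
            "sector": {"sector_origin_name": sector_origin_name},
        },
    }


obj = make_emission_object("gcp", "france", "FRA", "alpha3", "country",
                           2011, "co2", 624.0, "Mt co2eq", "Coal")
print(obj["date"])  # 2011-01-01
```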
Example of two emission objects (pretty-printed here for readability; in a staging file, each object occupies a single line):

{
  "data_source": {
    "name": "gcp",
    "link": "url",
    "properties": {
      "description": "This is a short description of data source",
      "provider": "gcp"
    }
  },
  "geo_component": {
    "scale": "Country",
    "name": "france",
    "identifier": {
      "id": "FRA",
      "type": "alpha3"
    },
    "properties": {
      "data_source_code": "FRA"
    }
  },
  "date": "2011-01-01",
  "emission": {
    "gas": "co2",
    "value": 624.0,
    "unit": {
      "unit_used": "Mt co2eq"
    },
    "sector": {
      "sector_origin_name": "Coal",
      "sector_mapped_name": "fossil_emissions_coal"
    }
  }
}

{
  "data_source": {
    "name": "gcp",
    "link": "url"
  },
  "geo_component": {
    "scale": "Country",
    "name": "france",
    "identifier": {
      "id": "FRA",
      "type": "alpha3"
    }
  },
  "date": "2012-01-01",
  "emission": {
    "gas": "co2",
    "value": 624.0,
    "unit": {
      "unit_used": "Mt co2eq"
    },
    "sector": {
      "sector_origin_name": "Coal",
      "sector_mapped_name": "fossil_emissions_coal"
    }
  }
}
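Since each line of a staging file holds one emission object, such a file can be written and read as line-delimited JSON. The sketch below uses only Python's standard `json` module; the function names and the file name `gcp_staging.jsonl` are assumptions made for illustration, not names defined by the project.

```python
import json
import os
import tempfile


def write_staging_file(path, emission_objects):
    """Write one compact JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for obj in emission_objects:
            f.write(json.dumps(obj, ensure_ascii=False) + "\n")


def read_staging_file(path):
    """Read the file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Trimmed-down objects, just enough to show the round trip.
records = [
    {"data_source": {"name": "gcp"}, "date": "2011-01-01"},
    {"data_source": {"name": "gcp"}, "date": "2012-01-01"},
]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "gcp_staging.jsonl")  # hypothetical file name
    write_staging_file(path, records)
    loaded = read_staging_file(path)

print(len(loaded))  # 2
```

Keeping one object per line means a file can be processed line by line without loading the whole dataset into memory.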
| Field (level 1) | Field (level 2) | Field (level 3) | Obligatory | Type | Description | Values |
|---|---|---|---|---|---|---|
| data_source | | | Yes | json object | Information related to the data source | |
| . | name | | Yes | string | The name of the data source as defined in the repository structure (e.g. "wri", "cdp"...) | wri, gcp, cdp |
| . | link | | No | string | HTTP link to the data source | |
| . | properties | | No | json object | Properties specific to the data source | |
| . | . | scenario | No | | The scenario used for filling empty emission values | |
| . | . | description | No | | Short description of the data source | |
| geo_component | | | Yes | json object | Information related to the geographical entity | |
| . | scale | | Yes | string | Spatial resolution of the geo-component, chosen from a defined list | country-group ; country ; city ; grid |
| . | name | | No | string | The name of the geo-component in lower case | france ; italy |
| . | identifier | | Yes | json object | Principal identifier of the geo-component | |
| . | . | id | Yes | string | The geo-component identifier used in the dataset (for example: FRA, FR, france) | |
| . | . | type | Yes | string | The type of identifier. Values should be selected from: alpha3, alpha2 or name | alpha3 ; alpha2 ; name |
| . | properties | | No | json object | Properties specific to the geo-component | |
| . | . | datasource_code | No | string | The identifier of the geo-component in the data source reference system | Account Number |
| date | | | Yes | date | The date of emissions reporting | |
| emission | | | Yes | json object | Information on emission values | |
| . | sector | | Yes (unless it is a city emission) | | | |
| . | . | sector_origin_name | Yes (unless it is a city emission) | string | The sector name as mentioned in the raw data source | |
| . | . | sector_mapped_name | No | enum | Sector name based on the sector modalities mapping table | |
| . | scope | | Yes (only for city emission) | | | |
| . | . | scope_origin_name | Yes (only for city emission) | string | The scope name as mentioned in the raw data source | |
| . | . | scope_mapped_name | No | enum | Scope name based on the sector modalities mapping table | |
| . | gas | | Yes | enum | Gas name based on the gas modalities table | |
| . | value | | Yes | numeric | Value of the gas emission quantity as provided by the data source | |
| . | unit | | Yes | | | |
| . | . | unit_used | Yes | string | Unit used for quantifying GHG emissions | |
| . | . | gwp_report_reference | No | string | scope name based on the sector modalities mapping table | |
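
The obligatory fields listed in the table can be checked automatically before a record is written to a staging file. The following Python sketch is one possible way to do this, under stated assumptions: `validate_emission` and its `is_city` flag are hypothetical names, the unit key is spelled `unit_used` as in the JSON example, and only field presence and the identifier type are checked (not value types or the enum tables).

```python
ALLOWED_ID_TYPES = {"alpha3", "alpha2", "name"}
_MISSING = object()  # sentinel distinguishing "absent" from a None value


def _lookup(obj, path):
    """Follow a tuple of keys through nested dicts; return _MISSING if absent."""
    node = obj
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def validate_emission(obj, is_city=False):
    """Return a list of problems found; an empty list means none were found."""
    required = [
        ("data_source", "name"),
        ("geo_component", "scale"),
        ("geo_component", "identifier", "id"),
        ("geo_component", "identifier", "type"),
        ("date",),
        ("emission", "gas"),
        ("emission", "value"),
        ("emission", "unit", "unit_used"),
    ]
    # City emissions require a scope; all other emissions require a sector.
    if is_city:
        required.append(("emission", "scope", "scope_origin_name"))
    else:
        required.append(("emission", "sector", "sector_origin_name"))

    problems = [".".join(p) for p in required if _lookup(obj, p) is _MISSING]

    id_type = _lookup(obj, ("geo_component", "identifier", "type"))
    if id_type is not _MISSING and id_type not in ALLOWED_ID_TYPES:
        problems.append("geo_component.identifier.type has an unexpected value")
    return problems


example = {
    "data_source": {"name": "gcp"},
    "geo_component": {"scale": "country",
                      "identifier": {"id": "FRA", "type": "alpha3"}},
    "date": "2011-01-01",
    "emission": {"gas": "co2", "value": 624.0,
                 "unit": {"unit_used": "Mt co2eq"},
                 "sector": {"sector_origin_name": "Coal"}},
}
print(validate_emission(example))  # []
```

Returning a list of field paths rather than raising on the first error makes it easier to report every problem in a record at once.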