- Change in data source: Datadista to esCovid19data.
- Adaptation of the Spain dataitems: they are now calculated from the Autonomous Communities.
- Added dataitem "Accumulated lethality".
- Added vaccines dataitems: "Dose of vaccine delivered", "Dose of vaccine supplied", "Percentage of doses of vaccine supplied" and "Percentage of population vaccinated".
- Implemented attributes of temporal granularity, regional granularity and update frequency. Now, each data source is only refreshed following its update frequency.
- Change from ES-regions to ES-communities.
- A new Region config file for countries is added.
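
The update-frequency behaviour listed above can be pictured with a minimal sketch. It is an illustration under assumptions, not the library's code: `needs_refresh`, `last_update` and `update_frequency_days` are hypothetical names, and the only idea taken from the notes is that a data source is refreshed once its update frequency (in days) has elapsed.

```python
from datetime import datetime, timedelta
from typing import Optional


def needs_refresh(last_update: Optional[datetime],
                  update_frequency_days: int,
                  now: Optional[datetime] = None) -> bool:
    """Return True when a data source is due for a refresh (hypothetical helper)."""
    now = now or datetime.now()
    if last_update is None:
        # Never downloaded before: always refresh.
        return True
    # Refresh only once the configured number of days has elapsed.
    return now - last_update >= timedelta(days=update_frequency_days)


# Fixed reference time so the example is reproducible.
now = datetime(2021, 2, 10)
daily_due = needs_refresh(datetime(2021, 2, 9), 1, now)   # updated daily
weekly_due = needs_refresh(datetime(2021, 2, 9), 7, now)  # updated weekly
```

With these inputs the daily source is due for a refresh while the weekly one is not, so cached data would be reused for the latter.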
lib/README.md
+60 -17 (60 additions & 17 deletions)
@@ -43,16 +43,28 @@ A Data Item is a low-grain resource which codifies a specific piece of informati
 
 
 ### Data Type
-The COnVIDa library considers two types of Data Items used to interpret and analyze them, namely:
+COnVIDa library considers two types of Data Items used to interpret and analyze them, namely:
 
-***Temporal**: The data items are indexed by days, so they will show the daily values. In particular, _COVID19, Mobility, MoMo_ and _AEMET_ data items are temporal. For instance, if we select the COVID19 cases in Murcia from 21/02/2020 until 14/05/2020, the X axis will show all the days between those two dates, while Y axis will show the daily COVID19 cases in Murcia.
+***Temporal**: The data items are indexed by time units (to date, only days are supported), so they are shown at that temporal frequency. In particular, _COVID19, Mobility, MoMo_ and _AEMET_ data items are temporal. For instance, if we select the COVID19 cases in Murcia from 21/02/2020 until 14/05/2020, the X axis will show all the periods between those two dates, while the Y axis will show the COVID19 cases in Murcia.
+
+***Geographical**: The data items are indexed by region units. In particular, current _INE_ data items are geographical. It is worth mentioning that the user of this library could transform temporal data items to a geographical perspective by applying any kind of aggregation scheme. For instance, in COnVIDa service, if we choose the analysis type by regions and select some temporal data items, then COnVIDa service will apply descriptive statistical functions to those data items within the specified data ranges.
+
+### Temporal Granularity
+The current release of COnVIDa library considers the following temporal units:
+
+***DAILY**: For temporal data sources, the data items should be presented by days. For creating new data sources to be directly integrated in the platform, developers should guarantee that granularity in the time series.
+
+_More granularities can be supported in the future_
 
 
-***Geographical**: The data items are indexed by regions and the data is aggregated with absolute values. In particular, current _INE_ data items are geographical. It is worth mentioning that the user of this library could transform temporal data items to a geographical perspective by applying any kind of aggregation scheme. For instance, in COnVIDa service, if we choose the analysis type by regions and select some temporal data items, then COnVIDa service will use the mean of those data items within the specified data ranges.
+### Regional Granularity
+The current release of COnVIDa library supports the following regional units:
 
+***COMMUNITY**: The data items can be presented per Spanish communities.
 
-### Regions
-Regions are divisions of the territory that allow a more exhaustive and deeper collection and analysis. Currently, they are implemented as the Autonomous Regions in Spain, although the granularity (provinces, minicipalities, etc.) can be easily adapted. In this sense, _COnVIDa_ lib allows filtering the aforementioned data items by regions.
+***PROVINCE**: The data items can be presented per Spanish provinces.
+
+_More granularities can be supported in the future_
 
 
 ## User guidelines
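
The temporal-to-geographical transformation described in the Data Type hunk above can be sketched with plain pandas. This is only an illustration under assumptions: the toy values are made up, the DataFrame layout (days as index, regions as columns) is hypothetical, and no COnVIDa function is called.

```python
import pandas as pd

# Toy temporal data: one row per day, one column per region (made-up values).
days = pd.date_range("2020-03-01", periods=4, freq="D")
temporal = pd.DataFrame(
    {"Murcia": [1, 3, 2, 6], "Madrid": [10, 20, 30, 40]},
    index=days,
)

# Keep only the requested date range (label slicing is inclusive on both ends).
window = temporal.loc["2020-03-01":"2020-03-03"]

# Collapse the time axis with descriptive statistics: the result is indexed
# by region, i.e. a geographical view of the same data items.
geographical = window.agg(["mean", "max"]).T
```

Any other aggregation (median, sum, quantiles) could be substituted in the `agg` call; the choice depends on what the data item measures.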
@@ -62,15 +74,36 @@ The [test lib notebook](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/
 Implements the required information for Regions management
 
-##### `get_country_codes()`
-Returns a list with the supported country codes. Right now, only 'ES' for Spanish regiones is available, although this is easily extensible to other countries.
+##### `get_regions(country_code='ES')`
+Returns a list with the names of the regions associated with a country code.
 Returns a dictionary with data sources as keys, and an array of associated data item names as values.
 
@@ -112,7 +149,7 @@ Provides an interface for the library user to avoid the use of low-level functio
 
 Parameters
 - data_items: list of data item names. By default, 'all' are collected.
-- regions: list of region names. By default, 'ES' refers to all Spanish Autonomous Regions.
+- regions: list of region names. By default, 'ES' refers to all Spanish regions.
 - start_date: first day in pandas datetime to be considered in TEMPORAL data items. By default, None is established.
 - end_date: last day in pandas datetime to be considered in TEMPORAL data items. By default, None is established.
 - language: language of the returned data.
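
As a hedged aside on the `start_date` and `end_date` parameters above: the snippet below only shows how values "in pandas datetime" might be prepared, reusing the example dates from the Data Type section. It deliberately does not call any COnVIDa function, since the function name is not visible in this hunk.

```python
import pandas as pd

# The README writes dates as day/month/year, so parse them day-first.
start_date = pd.to_datetime("21/02/2020", dayfirst=True)
end_date = pd.to_datetime("14/05/2020", dayfirst=True)

# With DAILY granularity, a temporal query over this range spans one row per day.
expected_days = pd.date_range(start_date, end_date, freq="D")
```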
@@ -134,9 +171,12 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
 
 1. First of all, some elements should be defined regarding your new Data Source:
     * Name of the Data Source
-    * Data Format of the resource (`JSON` or `CSV`)
     * Data Type of the Data Source (`TEMPORAL` or `GEOGRAPHICAL`)
+    * Temporal Granularity of the Data Source (`DAILY`)
+    * Regional Granularity of the Data Source (`COMMUNITIES` and/or `PROVINCES`)
     * Representation of the regions within the Data Source (_iso\_3166\_2_, _ine code_, ...)
+    * Data Format of the resource (`JSON` or `CSV`)
+    * Update Frequency of the data series (in days)
     * Information of each Data Item of the Data Source
         * Name (literally used by the Data Source)
         * Display Name (used to change the third-party nomenclature to a desired custom one)
@@ -145,9 +185,9 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
 
 2. Configure the aforementioned principal elements of your new Data Source:
 
-* The name, data format, data type and region representationshould be included in the [datasources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json). With this aim, append a new entry in the JSON object with the data source name as a key, and a dictionary with the corresponding information regarding `DATA FORMAT`, `DATA TYPE` and `REGION REPRESENTATION` as values. If needed, specific config elements of your Data Source can be also included here (_for example, [AEMET data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/AEMETDataSource.py) defines its `API KEY` necessary for it to work_).
+* The name, data type, temporal and regional granularities, region representation, data format, and update frequency should be included in the [data sources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json). With this aim, append a new entry in the JSON object with the data source name as a key, and a dictionary with the corresponding information regarding `DATA TYPE`, `TEMPORAL GRANULARITY`, `REGIONAL GRANULARITY`, `REGION REPRESENTATION`, `DATA FORMAT`, and `UPDATE FREQUENCY` as values. If needed, specific config elements of your Data Source can also be included here (_for example, [AEMET data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/AEMETDataSource.py) defines its `API KEY` necessary for it to work_).
 
-* For each Spanish region, the representation used by your Data Source should be appended accordingly in the [regions configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/config/ES-regions.json) (in case it does not exist yet). Note that the key of the new entries to be added for each region should match with the aforementioned `REGION REPRESENTATION` attribute (defined in [datasources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json)).
+* For each region, the representation used by your Data Source should be appended accordingly in the [regions configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/config/ES-regions.json) (in case it does not exist yet). Note that the key of the new entries to be added for each region should match the aforementioned `REGION REPRESENTATION` attribute (defined in the [data sources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json)).
 
 * The information of the Data Items offered by your Data Source should be included in a new configuration file `YourDataSourceName-config.json` in the [specific data source configuration folder](https://github.com/CyberDataLab/COnVIDa-lib/tree/master/lib/datasources/config/data_sources). As in the other configuration files residing in that folder (which may guide you in this procedure), each Data Item should constitute an entry. In particular, each entry is defined by the Data Item name (literally used by the Data Source) as the key and the properties `display_name`, `description` and `data_unit` as the values. The latter should include, in turn, translation in both Spanish and English (or any other language you may define). If needed, specific properties of your Data Items can also be included here (for example, the [Mobility data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data_sources/MobilityDataSource-config.json) includes the `data_source` attribute to distinguish the resource where each Data Item comes from).
 
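
The configuration step in the hunk above could look roughly like the following sketch. Everything concrete here is an assumption for illustration: `MyDataSource` is a hypothetical name, the values are invented, and the exact key spelling in the real `data-sources-config.json` may differ from the attribute names quoted in the README text.

```python
import json

# Attribute names as quoted in the README text; the real file may spell them differently.
REQUIRED_KEYS = {
    "DATA TYPE",
    "TEMPORAL GRANULARITY",
    "REGIONAL GRANULARITY",
    "REGION REPRESENTATION",
    "DATA FORMAT",
    "UPDATE FREQUENCY",
}

# Hypothetical entry for a new data source (all values are illustrative).
new_entry = {
    "MyDataSource": {
        "DATA TYPE": "TEMPORAL",
        "TEMPORAL GRANULARITY": "DAILY",
        "REGIONAL GRANULARITY": "COMMUNITY",
        "REGION REPRESENTATION": "iso_3166_2",
        "DATA FORMAT": "CSV",
        "UPDATE FREQUENCY": 1,  # in days
    }
}


def missing_keys(entry: dict) -> list:
    """List the required attributes that an entry does not define."""
    return sorted(REQUIRED_KEYS - set(entry))


# Serialised form, as it might be appended to the JSON object in the config file.
config_text = json.dumps(new_entry, indent=2)
```

A check such as `missing_keys` is not part of the library as far as this diff shows; it is included only to make the required attributes explicit.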
@@ -167,16 +207,19 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
 
 * Set the following class attributes to `None`:
 ```python
-DATA_FORMAT=None
 DATA_TYPE=None
+TEMPORAL_GRANULARITY=None
+REGIONAL_GRANULARITY=None
 REGION_REPRESENTATION=None
+DATA_FORMAT=None
+UPDATE_FREQUENCY=None
 DATA_ITEMS=None
 DATA_ITEMS_INFO=None
 ```
 In the first execution of the class, these class attributes will load the values from the config files.
 
 
-* Define and fulfill the following functions:
+* Define and fulfill the following functions. Specifically, the function which processes partial data should apply the necessary transformations to return data compliant with standard temporal and regional granularity:
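
The "declare as `None`, load on first execution" pattern from the hunk above can be sketched as follows. This is a simplified stand-in under assumptions, not the library's base class: `ExampleDataSource`, `_CONFIG` and `_load_config` are hypothetical, and the in-memory dictionary replaces the JSON config files.

```python
class ExampleDataSource:
    # Declared as None; filled from configuration on first instantiation.
    DATA_TYPE = None
    TEMPORAL_GRANULARITY = None
    REGIONAL_GRANULARITY = None
    REGION_REPRESENTATION = None
    DATA_FORMAT = None
    UPDATE_FREQUENCY = None

    # Hypothetical stand-in for values read from the JSON config files.
    _CONFIG = {
        "DATA TYPE": "TEMPORAL",
        "TEMPORAL GRANULARITY": "DAILY",
        "REGIONAL GRANULARITY": "COMMUNITY",
        "REGION REPRESENTATION": "iso_3166_2",
        "DATA FORMAT": "CSV",
        "UPDATE FREQUENCY": 1,
    }

    def __init__(self):
        cls = type(self)
        # Load the configuration only once per class.
        if cls.DATA_TYPE is None:
            cls._load_config()

    @classmethod
    def _load_config(cls):
        # Map "DATA TYPE" -> DATA_TYPE, and so on, onto class attributes.
        for key, value in cls._CONFIG.items():
            setattr(cls, key.replace(" ", "_"), value)


before = ExampleDataSource.UPDATE_FREQUENCY  # still None
ExampleDataSource()                          # first execution loads the config
after = ExampleDataSource.UPDATE_FREQUENCY
```

Storing the values on the class rather than the instance means later instances skip the loading step.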
@@ -185,7 +228,7 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
 
 Parameters
 - data_items: list of data item names. By default, 'all' are collected.
-- regions: list of region names. By default, 'ES' refers to all Spanish provinces.
+- regions: list of region names. By default, 'ES' refers to Spanish regions.
 - start_date: first day in pandas datetime to be considered in TEMPORAL data items. By default, None is established. If the Data Source is a GEOGRAPHICAL data type, then it can be omitted.
 - end_date: last day in pandas datetime to be considered in TEMPORAL data items. By default, None is established. If the Data Source is a GEOGRAPHICAL data type, then it can be omitted.