lib/README.md
Lines changed: 17 additions & 60 deletions
@@ -43,28 +43,16 @@ A Data Item is a low-grain resource which codifies a specific piece of informati
43
43
44
44
45
45
### Data Type
46
-
COnVIDa library considers two types of Data Items used to interpret and analyze them, namely:
46
+
The COnVIDa library considers two types of Data Items used to interpret and analyze them, namely:
47
47
48
-
* **Temporal**: The data items are indexed by time units (to date, only days are supported), so they are shown at that temporal frequency. In particular, _COVID19, Mobility, MoMo_ and _AEMET_ data items are temporal. For instance, if we select the COVID19 cases in Murcia from 21/02/2020 until 14/05/2020, the X axis will show all the periods between those two dates, while the Y axis will show the COVID19 cases in Murcia.
49
-
50
-
* **Geographical**: The data items are indexed by region units. In particular, current _INE_ data items are geographical. It is worth mentioning that the user of this library could transform temporal data items to a geographical perspective by applying any kind of aggregation scheme. For instance, in the COnVIDa service, if we choose the analysis type by regions and select some temporal data items, then the COnVIDa service will apply descriptive statistical functions to those data items within the specified date ranges.
51
-
52
-
### Temporal Granularity
53
-
The current release of COnVIDa library considers the following temporal units:
54
-
55
-
* **DAILY**: For temporal data sources, the data items should be presented by days. For creating new data sources to be directly integrated in the platform, developers should guarantee that granularity in the time series.
56
-
57
-
_More granularities can be supported in the future_
48
+
* **Temporal**: The data items are indexed by days, so they show daily values. In particular, _COVID19, Mobility, MoMo_ and _AEMET_ data items are temporal. For instance, if we select the COVID19 cases in Murcia from 21/02/2020 until 14/05/2020, the X axis will show all the days between those two dates, while the Y axis will show the daily COVID19 cases in Murcia.
58
49
59
50
60
-
### Regional Granularity
61
-
The current release of COnVIDa library supports the following regional units:
51
+
* **Geographical**: The data items are indexed by regions and the data is aggregated with absolute values. In particular, current _INE_ data items are geographical. It is worth mentioning that the user of this library could transform temporal data items to a geographical perspective by applying any kind of aggregation scheme. For instance, in the COnVIDa service, if we choose the analysis type by regions and select some temporal data items, then the COnVIDa service will use the mean of those data items within the specified date ranges.
62
52
63
-
* **COMMUNITY**: The data items can be presented per Spanish communities.
64
53
65
-
* **PROVINCE**: The data items can be presented per Spanish provinces.
66
-
67
-
_More granularities can be supported in the future_
54
+
### Regions
55
+
Regions are divisions of the territory that allow a more exhaustive and deeper data collection and analysis. Currently, they are implemented as the Autonomous Regions in Spain, although the granularity (provinces, municipalities, etc.) can be easily adapted. In this sense, _COnVIDa_ lib allows filtering the aforementioned data items by regions.
68
56
69
57
70
58
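The difference between the two Data Types can be sketched with plain pandas. This is a minimal illustration, assuming a temporal data item is held as a DataFrame indexed by day with one column per region; the layout, region names and figures are made up for the example and are not the library's actual output. Taking the mean over a date range turns the temporal view into a geographical one:

```python
import pandas as pd

# Hypothetical temporal data item: one row per day, one column per region.
# The values are invented for illustration only.
days = pd.date_range("2020-02-21", periods=4, freq="D")
temporal = pd.DataFrame(
    {"Murcia": [0, 2, 4, 6], "Madrid": [10, 20, 30, 40]},
    index=days,
)

# Restrict to the requested date range (both ends are inclusive).
start, end = pd.Timestamp("2020-02-22"), pd.Timestamp("2020-02-24")
window = temporal.loc[start:end]

# Aggregate over time: one value per region, i.e. a geographical view.
geographical = window.mean()
print(geographical)
```

Any other aggregation (sum, median, maximum) could be swapped in for `mean()` in the same place.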
## User guidelines
@@ -74,36 +62,15 @@ The [test lib notebook](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/
Returns the number of citizens per region in a specific country
97
-
98
-
Parameters
99
-
- country_code: str
100
-
Country code of the regions.
101
-
102
65
##### `get_country_codes()`
103
-
Returns a dictionary with the supported countries as keys, and their codes as values.
66
+
Returns a list with the supported country codes. Right now, only 'ES' for Spanish regions is available, although this is easily extensible to other countries.
104
67
105
68
69
+
##### `get_regions(country_code='ES')`
70
+
Returns a list with the names of the Spanish Autonomous Regions.
106
71
72
+
Parameters
73
+
- country_code: string indicating the country of the regions. Right now, only 'ES' for Spanish regions is available.
107
74
108
75
***
109
76
@@ -113,10 +80,6 @@ Provides an interface for the library user to avoid the use of low-level functio
113
80
##### `get_data_types()`
114
81
Returns the implemented DataTypes in string format.
115
82
116
-
##### `get_sources_info()`
117
-
Prints and returns a dictionary with the metadata about the supported data sources
Returns a dictionary with data sources as keys, and an array of associated data item names as values.
122
85
@@ -149,7 +112,7 @@ Provides an interface for the library user to avoid the use of low-level functio
149
112
150
113
Parameters
151
114
- data_items: list of data item names. By default, 'all' are collected.
152
-
- regions: list of region names. By default, 'ES' refers to all Spanish regions.
115
+
- regions: list of region names. By default, 'ES' refers to all Spanish Autonomous Regions.
153
116
- start_date: first day in pandas datetime to be considered in TEMPORAL data items. By default, None is established.
154
117
- end_date: last day in pandas datetime to be considered in TEMPORAL data items. By default, None is established.
155
118
- language: language of the returned data.
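Since `start_date` and `end_date` are expected as pandas datetimes, the arguments can be prepared as in the sketch below. It only builds the values; the `query` dictionary is an illustrative container mirroring the documented parameter names, and the region name is an assumption (the names actually accepted are the ones returned by `get_regions()`).

```python
import pandas as pd

# Dates in this README are written day/month/year, so parse with dayfirst=True.
start_date = pd.to_datetime("21/02/2020", dayfirst=True)
end_date = pd.to_datetime("14/05/2020", dayfirst=True)

# Keyword arguments mirroring the documented parameters.
# "Murcia" is illustrative; use a name returned by get_regions().
query = {
    "data_items": "all",
    "regions": ["Murcia"],
    "start_date": start_date,
    "end_date": end_date,
}

# Number of daily rows a TEMPORAL data item would span (both ends inclusive).
n_days = (end_date - start_date).days + 1
print(n_days)
```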
@@ -171,12 +134,9 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
171
134
172
135
1. First of all, some elements should be defined regarding your new Data Source:
173
136
* Name of the Data Source
137
+
* Data Format of the resource (`JSON` or `CSV`)
174
138
* Data Type of the Data Source (`TEMPORAL` or `GEOGRAPHICAL`)
175
-
* Temporal Granularity of the Data Source (`DAILY`)
176
-
* Regional Granularity of the Data Source (`COMMUNITIES` and/or `PROVINCES`)
177
139
* Representation of the regions within the Data Source (_iso\_3166\_2_, _ine code_, ...)
178
-
* Data Format of the resource (`JSON` or `CSV`)
179
-
* Update Frequency of the data series (in days)
180
140
* Information of each Data Item of the Data Source
181
141
* Name (literally used by the Data Source)
182
142
* Display Name (used to change the third-party nomenclature to a desired custom one)
@@ -185,9 +145,9 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
185
145
186
146
2. Configure the aforementioned principal elements of your new Data Source:
187
147
188
-
* The name, data type, temporal and regional granularities, region representation, data format, and update frequency should be included in the [data sources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json). With this aim, append a new entry in the JSON object with the data source name as a key, and a dictionary with the corresponding information regarding `DATA TYPE`, `TEMPORAL GRANULARITY`, `REGIONAL GRANULARITY`, `REGION REPRESENTATION`, `DATA FORMAT`, and `UPDATE FREQUENCY` as values. If needed, specific config elements of your Data Source can be also included here (_for example, [AEMET data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/AEMETDataSource.py) defines its `API KEY` necessary for it to work_).
148
+
* The name, data format, data type and region representation should be included in the [datasources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json). With this aim, append a new entry in the JSON object with the data source name as a key, and a dictionary with the corresponding information regarding `DATA FORMAT`, `DATA TYPE` and `REGION REPRESENTATION` as values. If needed, specific config elements of your Data Source can also be included here (_for example, [AEMET data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/AEMETDataSource.py) defines the `API KEY` it needs to work_).
189
149
190
-
* For each region, the representation used by your Data Source should be appended accordingly in the [regions configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/config/ES-regions.json) (in case it does not exist yet). Note that the key of the new entries to be added for each region should match with the aforementioned `REGION REPRESENTATION` attribute (defined in [data sources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json)).
150
+
* For each Spanish region, the representation used by your Data Source should be appended accordingly in the [regions configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/config/ES-regions.json) (if it does not exist yet). Note that the key of the new entries to be added for each region should match the aforementioned `REGION REPRESENTATION` attribute (defined in the [datasources configuration file](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data-sources-config.json)).
191
151
192
152
* The information of the Data Items offered by your Data Source should be included in a new configuration file `YourDataSourceName-config.json` in the [specific data source configuration folder](https://github.com/CyberDataLab/COnVIDa-lib/tree/master/lib/datasources/config/data_sources). As in the other configuration files residing in that folder (which may guide you in this procedure), each Data Item should constitute an entry. In particular, each entry is defined by the Data Item name (literally used by the Data Source) as the key and the properties `display_name`, `description` and `data_unit` as the values. The latter should include, in turn, translation in both Spanish and English (or any other language you may define). If needed, specific properties of your Data Items can be also included here (for example, the [Mobility data source](https://github.com/CyberDataLab/COnVIDa-lib/blob/master/lib/datasources/config/data_sources/MobilityDataSource-config.json) includes the `data_source` attribute to distinguish the resource where each Data Item comes from).
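The configuration entries described in this step could look roughly like the following sketch. The data source name `MyDataSource`, the data item `new_cases`, the exact key spellings and the `EN`/`ES` language keys are all assumptions made for illustration; the existing configuration files in the repository are the reference for the real layout.

```python
import json

# Hypothetical entry for the data sources configuration file.
# Key spellings and values are assumptions; copy them from an existing entry.
data_source_entry = {
    "MyDataSource": {
        "DATA FORMAT": "JSON",
        "DATA TYPE": "TEMPORAL",
        "REGION REPRESENTATION": "iso_3166_2",
    }
}

# Hypothetical content for MyDataSource-config.json: one entry per data item,
# keyed by the name the third-party source uses, with translated properties.
data_items_config = {
    "new_cases": {
        "display_name": {"EN": "New cases", "ES": "Casos nuevos"},
        "description": {"EN": "Daily new cases", "ES": "Casos nuevos diarios"},
        "data_unit": {"EN": "cases", "ES": "casos"},
    }
}

print(json.dumps(data_source_entry, indent=4))
print(json.dumps(data_items_config, indent=4, ensure_ascii=False))
```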
193
153
@@ -207,19 +167,16 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
207
167
208
168
* Set the following class attributes to `None`:
209
169
```python
170
+
DATA_FORMAT=None
210
171
DATA_TYPE=None
211
-
TEMPORAL_GRANULARITY=None
212
-
REGIONAL_GRANULARITY=None
213
172
REGION_REPRESENTATION=None
214
-
DATA_FORMAT=None
215
-
UPDATE_FREQUENCY=None
216
173
DATA_ITEMS=None
217
174
DATA_ITEMS_INFO=None
218
175
```
219
176
In the first execution of the class, these class attributes will load the values from the config files.
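One way to picture this lazy loading is the sketch below. `MyDataSource`, its constructor argument and the configuration keys are hypothetical and chosen only for illustration; the base classes of the library may load their configuration differently.

```python
import json
import os
import tempfile


class MyDataSource:
    """Hypothetical data source; not a class shipped with the library."""

    DATA_FORMAT = None
    DATA_TYPE = None
    REGION_REPRESENTATION = None

    def __init__(self, config_path):
        # Read the config file only once, on the first instantiation.
        if MyDataSource.DATA_TYPE is None:
            with open(config_path, encoding="utf-8") as f:
                config = json.load(f)[type(self).__name__]
            MyDataSource.DATA_FORMAT = config["DATA FORMAT"]
            MyDataSource.DATA_TYPE = config["DATA TYPE"]
            MyDataSource.REGION_REPRESENTATION = config["REGION REPRESENTATION"]


# Write a throwaway config file so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data-sources-config.json")
    entry = {
        "MyDataSource": {
            "DATA FORMAT": "CSV",
            "DATA TYPE": "GEOGRAPHICAL",
            "REGION REPRESENTATION": "ine code",
        }
    }
    with open(path, "w", encoding="utf-8") as f:
        json.dump(entry, f)
    MyDataSource(path)

print(MyDataSource.DATA_FORMAT, MyDataSource.DATA_TYPE)
```

Because the values live on the class rather than on each instance, later instances reuse them without touching the file again.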
220
177
221
178
222
-
* Define and implement the following functions. Specifically, the function which processes partial data should apply the necessary transformations to return data compliant with the standard temporal and regional granularity:
@@ -228,7 +185,7 @@ _COnVIDa-lib_ constitutes an object-oriented package ready to be extended. Consi
228
185
229
186
Parameters
230
187
- data_items: list of data item names. By default, 'all' are collected.
231
-
- regions: list of region names. By default, 'ES' refers to Spanish regions.
188
+
- regions: list of region names. By default, 'ES' refers to all Spanish provinces.
232
189
- start_date: first day in pandas datetime to be considered in TEMPORAL data items. By default, None is established. If the Data Source is of the GEOGRAPHICAL data type, it can be omitted.
233
190
- end_date: last day in pandas datetime to be considered in TEMPORAL data items. By default, None is established. If the Data Source is of the GEOGRAPHICAL data type, it can be omitted.