Commit 830692b

Merge pull request #201 from khaeru/enh/read-csv
Read SDMX-CSV 2.0.0
2 parents 70cf79c + f70790d commit 830692b

17 files changed

+492
-21
lines changed

doc/api/model-v21-list.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
:obj:`~.v21.ContentConstraint`
99
:obj:`~.v21.DataKey`
1010
:obj:`~.v21.DataKeySet`
11+
:obj:`~.v21.DataSet`
1112
:obj:`~.v21.DataSetTarget`
1213
:obj:`~.v21.DataStructureDefinition`
1314
:obj:`~.v21.DataflowDefinition`

doc/api/model-v30-list.rst

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,7 @@
88
:obj:`~.v30.DataConstraint`
99
:obj:`~.v30.DataKey`
1010
:obj:`~.v30.DataKeySet`
11+
:obj:`~.v30.DataSet`
1112
:obj:`~.v30.DataStructureDefinition`
1213
:obj:`~.v30.Dataflow`
1314
:obj:`~.v30.DataflowRelationship`

doc/api/reader.rst

Lines changed: 27 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -43,15 +43,38 @@ SDMX-JSON
4343
:members:
4444
:undoc-members:
4545

46+
47+
.. currentmodule:: sdmx.reader.csv
48+
4649
SDMX-CSV
4750
=========
4851

49-
.. currentmodule:: sdmx.reader.csv
52+
:mod:`sdmx.reader.csv` supports SDMX-CSV 2.0.0, corresponding to SDMX 3.0.0.
53+
See :ref:`sdmx-csv` for differences between versions of the SDMX-CSV file format.
5054

51-
.. autoclass:: sdmx.reader.csv.Reader
52-
:members:
53-
:undoc-members:
55+
Implementation details:
56+
57+
- :meth:`.Reader.inspect_header` inspects the header line in the CSV input and constructs a set of :class:`~.csv.Handler` instances, one for each field appearing in the particular file.
58+
Some of these handlers do not process the contents of the field, but silently discard it; for example, when ``labels="name"``, the name fields are not processed.
59+
- :meth:`.Reader.handle_row` is applied to every record in the CSV input.
60+
Each Handler is applied to its respective field.
61+
Every :meth:`.handle_row` call constructs a single :class:`~.v30.Observation`.
62+
- :meth:`.Reader.read_message` assembles the resulting observations into one or more :class:`DataSets <.common.BaseDataSet>`.
63+
SDMX-CSV 2.0.0 allows a mix of codes such as "I" (:attr:`.ActionType.information`) and "D" (:attr:`.ActionType.delete`) to appear in the "ACTION" field of different observations in the same file, whereas the SDMX IM specifies that :attr:`~.BaseDataSet.action` is an attribute of an entire DataSet.
64+
:class:`~.csv.Reader` groups all observations into 1 or more DataSet instances, according to their respective "ACTION" field values.
65+
66+
Currently :mod:`.reader.csv` has the following limitations:
5467

68+
- :meth:`.Reader.read_message` generates SDMX 3.0.0 (:mod:`.model.v30`) artefacts such as :class:`.v30.DataSet`, since these correspond to the supported SDMX-CSV 2.0.0 format.
69+
Generating SDMX 2.1 artefacts such as :class:`.v21.DataSet` is not currently supported.
70+
- Currently only a single :class:`.v30.Dataflow` or :class:`.v30.DataStructureDefinition` can be supplied to :meth:`.Reader.read_message`.
71+
The SDMX-CSV 2.0.0 format supports mixing data flows and data structures in the same message.
72+
Such messages can be read with :mod:`sdmx`, but the resulting data sets will only correspond to the given data flow.
73+
74+
.. automodule:: sdmx.reader.csv
75+
:members:
76+
:undoc-members:
77+
:show-inheritance:
5578

5679
Reader API
5780
==========
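
The grouping behaviour described in the reader.rst implementation notes above (one data set per distinct "ACTION" value) can be illustrated with a minimal standard-library sketch. This is not the code of :mod:`sdmx.reader.csv`: the function ``group_rows_by_action``, the ``ACTION_CODES`` mapping, and the sample CSV are hypothetical. Only the codes "I" and "D" are mapped, because those are the only ones the notes name.

```python
import csv
import io
from collections import defaultdict
from enum import Enum

# Same definition as the enum shown in sdmx/model/common.py in this commit.
ActionType = Enum("ActionType", "delete replace append information")

# Assumption: only the two codes named in the documentation are mapped here.
ACTION_CODES = {"I": ActionType.information, "D": ActionType.delete}

# Hypothetical sample input; the identifiers are made up for illustration.
SAMPLE = (
    "STRUCTURE,STRUCTURE_ID,ACTION,FREQ,TIME_PERIOD,OBS_VALUE\n"
    "dataflow,EXAMPLE:FLOW(1.0),I,A,2020,1.0\n"
    "dataflow,EXAMPLE:FLOW(1.0),D,A,2019,\n"
    "dataflow,EXAMPLE:FLOW(1.0),I,A,2021,2.0\n"
)


def group_rows_by_action(text):
    """Group CSV records by ACTION code, one group per would-be data set."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        # Each record carries its own action; records sharing an action
        # end up in the same group.
        groups[ACTION_CODES[row["ACTION"]]].append(row)
    return dict(groups)


groups = group_rows_by_action(SAMPLE)
for action, rows in groups.items():
    print(action.name, len(rows))
```

In the real reader each group would become a data set whose ``action`` attribute holds the shared value; here the groups are plain lists of row dictionaries.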

doc/implementation.rst

Lines changed: 17 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -300,21 +300,32 @@ The SDMX-JSON *format* is versioned differently from the overall SDMX *standard*
300300
SDMX-CSV
301301
--------
302302

303-
Reference: https://github.com/sdmx-twg/sdmx-csv
303+
Reference: https://github.com/sdmx-twg/sdmx-csv; see in particular the file `sdmx-csv-field-guide.md <https://github.com/sdmx-twg/sdmx-csv/blob/v2.0.0/data-message/docs/sdmx-csv-field-guide.md>`_.
304304

305305
Based on Comma-Separated Value (CSV).
306306
The SDMX-CSV *format* is versioned differently from the overall SDMX *standard*:
307307

308308
- `SDMX-CSV 1.0 <https://github.com/sdmx-twg/sdmx-csv/tree/v1.0>`__ corresponds to SDMX 2.1.
309309
It supports only data and metadata, not structures.
310-
- SDMX-CSV 2.0 corresponds to SDMX 3.0.
310+
SDMX-CSV 1.0 files are recognizable by the header ``DATAFLOW`` in the first column of the first row.
311311

312-
.. versionadded:: 2.9.0
312+
.. versionadded:: 2.9.0
313313

314-
Support for SDMX-CSV 1.0.
314+
Support for *writing* SDMX-CSV 1.0.
315+
See :mod:`.writer.csv`.
315316

316-
:mod:`sdmx` does not currently support *writing* SDMX-CSV.
317-
See :issue:`34`.
317+
:mod:`sdmx` does not currently support *reading* SDMX-CSV 1.0.
318+
319+
- `SDMX-CSV 2.0.0 <https://github.com/sdmx-twg/sdmx-csv/tree/v2.0.0>`_ corresponds to SDMX 3.0.0.
320+
The format differs from and is not backwards compatible with SDMX-CSV 1.0.
321+
SDMX-CSV 2.0.0 files are recognizable by the header ``STRUCTURE`` in the first column of the first row.
322+
323+
.. versionadded:: 2.19.0
324+
325+
Initial support for *reading* SDMX-CSV 2.0.0.
326+
See :mod:`.reader.csv`.
327+
328+
:mod:`sdmx` does not currently support *writing* SDMX-CSV 2.0.0.
318329

319330
.. _sdmx-rest:
320331
.. _web-service:
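
The implementation.rst notes above say that the two format versions can be told apart by the first column header of the first row. A minimal sketch of that check follows, assuming comma-delimited input; the helper name ``sniff_sdmx_csv_version`` is hypothetical and is not part of :mod:`sdmx`.

```python
import csv
import io


def sniff_sdmx_csv_version(text):
    """Guess the SDMX-CSV version from the first header cell, or return None."""
    header = next(csv.reader(io.StringIO(text)), [])
    first = header[0].strip() if header else ""
    if first == "DATAFLOW":
        return "1.0"
    if first == "STRUCTURE":
        return "2.0.0"
    # Unrecognized or empty header: make no guess.
    return None


print(sniff_sdmx_csv_version("DATAFLOW,FREQ,TIME_PERIOD,OBS_VALUE\n"))
print(sniff_sdmx_csv_version("STRUCTURE,STRUCTURE_ID,ACTION,OBS_VALUE\n"))
```

A check like this only inspects the header; it does not validate the rest of the file.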

doc/whatsnew.rst

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -6,6 +6,8 @@ What's new?
66
Next release
77
============
88

9+
- :mod:`.reader.csv` supports reading :ref:`SDMX-CSV 2.0.0 <sdmx-csv>` (corresponding to SDMX 3.0.0) (:pull:`201`, :issue:`34`).
10+
See the implementation notes for information about the differences between the SDMX-CSV 1.0 and 2.0.0 formats and their support in :mod:`sdmx`.
911
- Bug fix for writing :class:`.VersionableArtefact` to SDMX-ML 2.1: :class:`KeyError` was raised if :attr:`.VersionableArtefact.version` was an instance of :class:`.Version` (:pull:`198`).
1012
- Bug fix for reading data from structure-specific SDMX-ML: :class:`.XMLParseError` / :class:`NotImplementedError` was raised if reading 2 messages in sequence with different XML namespaces defined (:pull:`200`, thanks :gh-user:`mephinet` for :issue:`199`).
1113

pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -95,6 +95,7 @@ select = ["C9", "E", "F", "I", "W"]
9595
ignore = ["E501", "W191"]
9696
# Exceptions:
9797
# - .client._handle_get_kwargs: 12
98+
# - .reader.csv.Reader.inspect_header: 12
9899
# - .reader.xml.v21._component_end: 12
99100
# - .testing.generate_endpoint_tests: 11
100101
# - .writer.pandas._maybe_convert_datetime: 23

sdmx/model/common.py

Lines changed: 22 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -463,7 +463,27 @@ def __contains__(self, name): ...
463463

464464
# §3.4: Data Types
465465

466-
466+
#: Per the standard…
467+
#:
468+
#: ..
469+
#:
470+
#: …used to specify the action that a receiving system should take when processing
471+
#: the content that is the object of the action:
472+
#:
473+
#: Append
474+
#: Data or metadata is an incremental update for an existing data/metadata set or
475+
#: the provision of new data or documentation (attribute values) formerly absent.
476+
#: If any of the supplied data or metadata is already present, it will not replace
477+
#: that data or metadata.
478+
#: Replace
479+
#: Data/metadata is to be replaced and may also include additional data/metadata
480+
#: to be appended.
481+
#: Delete
482+
#: Data/Metadata is to be deleted.
483+
#: Information
484+
#: Data and metadata are for information purposes.
485+
#:
486+
#: — SDMX 3.0.0 Section 2 §3.4.2.1
467487
ActionType = Enum("ActionType", "delete replace append information")
468488

469489
ConstraintRoleType = Enum("ConstraintRoleType", "allowable actual")
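
The four actions quoted in the comment above can be made concrete with a small sketch of how a receiving system might apply them to a keyed store. This is an illustration of the quoted definitions only, not code from :mod:`sdmx`; ``apply_action`` and the dictionary-based store are hypothetical.

```python
from enum import Enum

# Same definition as in the hunk above.
ActionType = Enum("ActionType", "delete replace append information")


def apply_action(store, incoming, action):
    """Return a copy of `store` updated with `incoming` according to `action`."""
    result = dict(store)
    if action is ActionType.append:
        # Append: add new entries, but never overwrite existing ones.
        for key, value in incoming.items():
            result.setdefault(key, value)
    elif action is ActionType.replace:
        # Replace: overwrite existing entries; new entries are also added.
        result.update(incoming)
    elif action is ActionType.delete:
        # Delete: remove the entries identified by the incoming keys.
        for key in incoming:
            result.pop(key, None)
    # Information: content is for information only, so nothing changes.
    return result


store = {"2020": 1.0}
incoming = {"2020": 9.9, "2021": 2.0}
print(apply_action(store, incoming, ActionType.append))
print(apply_action(store, incoming, ActionType.replace))
```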
@@ -1990,7 +2010,7 @@ def compare(self, other, strict=True):
19902010
class BaseDataSet(AnnotableArtefact):
19912011
"""Common features of SDMX 2.1 and 3.0 DataSet."""
19922012

1993-
#:
2013+
#: Action to be performed
19942014
action: Optional[ActionType] = None
19952015
#:
19962016
valid_from: Optional[str] = None

sdmx/model/v21.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -54,6 +54,7 @@
5454
"DataStructureDefinition",
5555
"DataflowDefinition",
5656
"Observation",
57+
"DataSet",
5758
"StructureSpecificDataSet",
5859
"GenericDataSet",
5960
"GenericTimeSeriesDataSet",

sdmx/model/v30.py

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -43,6 +43,7 @@
4343
"DataStructureDefinition",
4444
"Dataflow",
4545
"Observation",
46+
"DataSet",
4647
"StructureSpecificDataSet",
4748
"MetadataAttributeDescriptor",
4849
"IdentifiableObjectSelection",

sdmx/reader/__init__.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
11
from pathlib import Path
22

3-
from . import json, xml
3+
from . import csv, json, xml
44

55
#: Reader classes
6-
READERS = [json.Reader, xml.Reader]
6+
READERS = [csv.Reader, json.Reader, xml.Reader]
77

88

99
def _readers():
