Commit 59a90ec

Merge pull request #234 from khaeru/enh/submitstructureresponse
Miscellaneous improvements for 2025-W27
2 parents f140e40 + 78dffe8 commit 59a90ec

23 files changed

+838
-404
lines changed

doc/api.rst

Lines changed: 9 additions & 0 deletions
@@ -48,6 +48,15 @@ Top-level methods and classes
    to_sdmx
    validate_xml

+``compare``: Compare SDMX artefacts
+===================================
+
+.. currentmodule:: sdmx.compare
+
+.. automodule:: sdmx.compare
+   :members:
+   :show-inheritance:
+
 ``format``: SDMX file formats
 =============================

doc/api/model-common-list.rst

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,7 @@
 :obj:`~.common.KeyValue`
 :obj:`~.common.Level`
 :obj:`~.common.MaintainableArtefact`
+:obj:`~.common.MessageText`
 :obj:`~.common.MetadataTargetRegion`
 :obj:`~.common.NamePersonalisation`
 :obj:`~.common.NamePersonalisationScheme`
@@ -70,8 +71,11 @@
 :obj:`~.common.SeriesKey`
 :obj:`~.common.SimpleDatasource`
 :obj:`~.common.StartPeriod`
+:obj:`~.common.StatusMessage`
 :obj:`~.common.Structure`
 :obj:`~.common.StructureUsage`
+:obj:`~.common.SubmissionResult`
+:obj:`~.common.SubmissionStatusType`
 :obj:`~.common.TimeDimension`
 :obj:`~.common.TimeKeyValue`
 :obj:`~.common.ToVTLSpaceKey`

doc/implementation.rst

Lines changed: 33 additions & 1 deletion
@@ -31,7 +31,7 @@ Multiple versions of the SDMX standards have been adopted:
 - 2.0 in November 2005.
 - 2.1 in August 2011; published at the International Standards Organization (ISO) in January 2013; and revised multiple times since.
 - 3.0.0 in October 2021.
-- 3.1 planned for some time in 2025.
+- 3.1 in May 2025.

 Some notes about the organization of the standards:

@@ -93,6 +93,7 @@ Reference:

 - `SDMX 2.1 Section 2 — Information Model <https://sdmx.org/wp-content/uploads/SDMX_2-1-1_SECTION_2_InformationModel_201108.pdf>`_ (PDF).
 - `SDMX 3.0.0 Section 2 — Information Model <https://sdmx.org/wp-content/uploads/SDMX_3-0-0_SECTION_2_FINAL-1_0.pdf>`_ (PDF).
+- `SDMX 3.1 Section 2 — Information Model <https://sdmx.org/wp-content/uploads/SDMX_3-1-0_SECTION_2_FINAL.pdf>`_ (PDF).

 In general:

@@ -268,6 +269,37 @@ Constraints
 :attr:`~.v21.Constraint.data_content_keys` are ignored.
 None of the data sources supported by :mod:`sdmx` appears to use this latter form.

+.. _impl-im-reg:
+
+Registration and maintenance
+----------------------------
+
+As of SDMX 3.1,
+Section 5 (“SDMX Registry Specification”, hereafter “Section 5”), in §7.4.3 “Registry Response” and Figure 18,
+shows information that is incomplete
+and that does not clearly align with the XSD schemas for SDMX-ML (specifically :file:`SDMXRegistryStructure.xsd`)
+or the `draft standards for structural metadata maintenance`_ (“the schemas”).
+For example, the UML diagrams and tables do not include all the types and classes that appear in the schemas.
+
+Generally, :mod:`sdmx` chooses an implementation that is consistent with the XSD schemas.
+In particular:
+
+- :class:`.SubmissionResult` appears in the schemas,
+  but does not appear in Section 5 or other standards documents.
+- Section 5 does not give a name for the relationship between,
+  for instance, :py:`RegistrationStatus` and :class:`.StatusMessage`.
+- Section 5 shows that :attr:`.StatusMessage.status` has type String.
+  The schemas require that the attribute be one of a few fixed values.
+
+  :mod:`sdmx` implements these values in the :class:`.SubmissionStatusType` enumeration.
+- Section 5 shows :class:`.MessageText` with attributes ``errorCode`` and ``errorText``.
+  The schemas and draft standards for structural metadata maintenance show that
+  these codes and texts can be used for *successes* (not only errors)
+  and do not include "error" in the XML attribute and tag names, respectively.
+
+  :mod:`sdmx` implements :attr:`.MessageText.text` and :attr:`.MessageText.code`.
+
+.. _`draft standards for structural metadata maintenance`: https://github.com/sdmx-twg/sdmx-rest/blob/409064fe335f12eaf6456b95a530dd379dff2728/doc/maintenance.md

 .. _formats:
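The relationships described in the doc/implementation.rst changes above can be pictured with a small sketch. This is a hypothetical illustration using stand-in dataclasses, not the :mod:`sdmx` implementation: the class and attribute names follow the documentation text in this diff, but the enumeration members (`SUCCESS`, `WARNING`, `FAILURE`), the field types, the sample code and text, and the `describe()` helper are all assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class SubmissionStatusType(Enum):
    """Stand-in enumeration; member names and values are assumed."""

    SUCCESS = "Success"
    WARNING = "Warning"
    FAILURE = "Failure"


@dataclass
class MessageText:
    """Stand-in: a code and a text that may report a success or an error."""

    code: Optional[str] = None
    text: Optional[str] = None


@dataclass
class StatusMessage:
    """Stand-in: one fixed status value plus zero or more message texts."""

    status: SubmissionStatusType
    text: list[MessageText] = field(default_factory=list)


def describe(message: StatusMessage) -> str:
    """Hypothetical helper: render a status message on one line."""
    parts = [f"{t.code}: {t.text}" for t in message.text]
    return f"{message.status.value} ({'; '.join(parts)})"


# Illustrative data only; the code and text are made up
ok = StatusMessage(
    SubmissionStatusType.SUCCESS, [MessageText(code="200", text="Stored")]
)
print(describe(ok))
```

Using an enumeration rather than a plain string means an unknown status value is rejected when the object is constructed, which matches the "few fixed values" constraint that the documentation attributes to the schemas.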

doc/whatsnew.rst

Lines changed: 18 additions & 0 deletions
@@ -6,12 +6,30 @@ What's new?
 Next release
 ============

+- Expand :mod:`.model`, :mod:`.reader.xml`, and :mod:`.writer.xml` support for :ref:`impl-im-reg` messages (:pull:`234`).
+  See the documentation for implementation details and errata in the standards documents.
+
+  - New classes
+    :class:`.model.common.MessageText`,
+    :class:`.StatusMessage`,
+    :class:`.SubmissionResult`, and
+    :class:`.SubmissionStatusType`.
+  - New classes :class:`.message.RegistryInterface` and :class:`.SubmitStructureResponse`.
+
+- New module :mod:`sdmx.compare` that collects logic for recursive comparison of SDMX artefacts (:pull:`234`).
+
+  - New mix-in :class:`.Comparable` that adds a :meth:`~.Comparable.compare` method to subclasses.
+  - New class :class:`.compare.Options` to control comparison behaviour and logging.
+  - :func:`sdmx.util.compare` is deprecated and will be removed in a future version.
+
+- :class:`.Key` is sortable (:pull:`234`).
 - :func:`.install_schemas` and :func:`.construct_schema` fetch, store, and use a local copy of :file:`xhtml1-strict.dsd` (:pull:`236`, :issue:`235`).
   This enables use of :func:`.validate_xml`
   with lxml version 6.0.0 (`released 2025-06-26 <https://lxml.de/6.0/changes-6.0.0.html>`__)
   for SDMX-ML messages containing XHTML values.
 - Correct a broken link to :ref:`im` in the README (:pull:`233`; thanks :gh-user:`econometricsfanboy` for :issue:`232`).
 - Update the base URL of the :ref:`ILO <ILO>` source to use HTTPS instead of plain HTTP (:pull:`237`).
+- New utilities :class:`.CompareTests` and :func:`.preserve_dunders` (:pull:`234`).

 .. _2.22.0:
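The mix-in arrangement named in the doc/whatsnew.rst entries above (a `compare()` method on subclasses, with a `strict` switch) can be sketched roughly as follows. This is a simplified illustration, not the code added in this commit: the `same()` function and the `Item` class are hypothetical, and only the strict versus non-strict handling of `None` is modelled.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional


def same(left: Any, right: Any, strict: bool = True) -> bool:
    """Simplified, hypothetical stand-in for a recursive comparison."""
    if left is None or right is None:
        # Non-strict mode tolerates a missing value on either side
        return left is right or not strict
    if is_dataclass(left):
        if type(left) is not type(right):
            return False
        # Compare field by field, recursing into nested values
        return all(
            same(getattr(left, f.name), getattr(right, f.name), strict)
            for f in fields(left)
        )
    return left == right


class Comparable:
    """Mix-in that adds a compare() method to subclasses."""

    def compare(self, other: Any, strict: bool = True) -> bool:
        return same(self, other, strict)


@dataclass
class Item(Comparable):  # Hypothetical example class
    id: str
    name: Optional[str] = None


a = Item("A", "Apple")
b = Item("A")
print(a.compare(b))  # Strict: "Apple" is not None
print(a.compare(b, strict=False))  # Non-strict: None is tolerated
```

Keeping the comparison logic in a module-level function and exposing it through a thin mix-in means the logic lives in one place, while any class can opt in by inheritance.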

sdmx/compare.py

Lines changed: 248 additions & 0 deletions
@@ -0,0 +1,248 @@
+"""Compare SDMX artefacts."""
+
+import datetime
+import enum
+import logging
+import textwrap
+from collections import defaultdict
+from collections.abc import Iterable
+from copy import copy
+from dataclasses import dataclass, fields, is_dataclass
+from functools import singledispatch
+from typing import Any, TypeVar, Union
+
+import lxml.etree
+
+from . import urn
+from .model import internationalstring
+
+log = logging.getLogger(__name__)
+
+
+IGNORE_CONTEXT = {"Categorisation.artefact"}
+VISITED: dict[tuple[int, int], set[int]] = defaultdict(set)
+
+
+class Comparable:
+    """Mix-in class for objects with a :meth:`.compare` method."""
+
+    def compare(self, other, strict: bool = True, **options) -> bool:
+        """Return :any:`True` if `self` is the same as `other`.
+
+        `strict` and other `options` are used to construct an instance of
+        :class:`Options`.
+        """
+        return compare(self, other, Options(self, strict=strict, **options))
+
+
+@dataclass
+class Options:
+    """Options for a comparison."""
+
+    #: Base object for a recursive comparison. Used internally for memoization/to
+    #: improve performance.
+    base: Any
+
+    #: Objects compare equal even if :attr:`.IdentifiableArtefact.urn` is :any:`None`
+    #: for either or both, so long as the URNs implied by their other attributes—that
+    #: is, returned by :func:`sdmx.urn.make`—are the same.
+    allow_implied_urn: bool = True
+
+    #: Strict comparison: if :any:`True` (the default), then attributes and associated
+    #: objects must compare exactly equal. If :any:`False`, then :any:`None` values on
+    #: either side are permitted.
+    strict: bool = True
+
+    #: Level for log messages.
+    log_level: int = logging.NOTSET
+
+    #: Verbose comparison: continue comparing even after reaching a definitive
+    #: :any:`False` result. If :attr:`log_level` is not set, :py:`verbose = True`
+    #: implies :py:`log_level = logging.DEBUG`.
+    verbose: bool = False
+
+    _memo_key: tuple[int, int] = (0, 0)
+
+    def __post_init__(self) -> None:
+        # Create a key for memoization
+        self._memo_key = (id(self.base), id(self))
+        VISITED[self._memo_key].clear()
+
+        # If no log level is given, set a default based on verbose
+        if self.log_level == logging.NOTSET:
+            self.log_level = {True: logging.DEBUG, False: logging.INFO}[self.verbose]
+
+    def log(self, message: str, level: int = logging.INFO) -> None:
+        """Log `message` on `level`.
+
+        `level` must be at least :attr:`log_level`.
+        """
+        if level >= self.log_level:
+            log.log(level, message)
+
+    def visited(self, obj) -> bool:
+        """Return :any:`True` if `obj` has already been compared."""
+        if type(obj).__module__ == "builtins":
+            return False
+
+        entry = id(obj)
+
+        if entry in VISITED[self._memo_key]:
+            return True
+        else:
+            VISITED[self._memo_key].add(entry)
+            return False
+
+
+T = TypeVar("T", bound=object)
+
+
+@singledispatch
+def compare(left: object, right, opts: Options, context: str = "") -> bool:
+    """Compare `left` to `right`."""
+    if is_dataclass(left):
+        return compare_dataclass(left, right, opts, context)
+
+    raise NotImplementedError(f"Compare {type(left)} {left!r} in {context}")
+
+
+def compare_dataclass(left, right, opts: Options, context: str) -> bool:
+    c = context or type(left).__name__
+
+    result = right is not None
+    for f in fields(left) if result else []:
+        l_val, r_val = getattr(left, f.name), getattr(right, f.name)
+
+        if opts.visited(l_val):
+            continue  # Already compared to its counterpart
+
+        c_sub = f"{c}.{f.name}"
+
+        # Handle Options.allow_implied_urn
+        if f.name == "urn" and not l_val is r_val is None and opts.allow_implied_urn:
+            try:
+                l_val = l_val or urn.make(left)
+            except (AttributeError, ValueError):
+                pass
+            try:
+                r_val = r_val or urn.make(right)
+            except (AttributeError, ValueError):
+                pass
+
+        result_f = (
+            l_val is r_val
+            or compare(l_val, r_val, opts, c_sub)
+            or c_sub in IGNORE_CONTEXT
+        )
+
+        result &= result_f
+
+        if result_f is False:
+            opts.log(f"Not identical: {c_sub}={shorten(l_val)} != {shorten(r_val)}")
+            if not opts.verbose:
+                break
+        else:
+            opts.log(f"{c_sub}={shorten(l_val)} == {shorten(r_val)}", logging.DEBUG)
+
+    return result
+
+
+# Built-in types
+
+
+# TODO When dropping support for Python <=3.10, change to '@compare.register'
+@compare.register(int)
+@compare.register(str)
+@compare.register(datetime.date)
+def _eq(left: Union[int, str, datetime.date], right, opts, context=""):
+    """Built-in types that must compare equal."""
+    return left == right or (not opts.strict and right is None)
+
+
+# TODO When dropping support for Python <=3.10, change to '@compare.register'
+@compare.register(type(None))
+@compare.register(bool)
+@compare.register(float)
+@compare.register(type)
+@compare.register(enum.Enum)
+def _is(left: Union[None, bool, float, type, enum.Enum], right, opts, context):
+    """Built-in types that must compare identical."""
+    return left is right or (not opts.strict and (right is None or left is None))
+
+
+@compare.register
+def _(left: dict, right, opts, context=""):
+    """Return :obj:`True` if `left` is the same as `right`.
+
+    Two DictLike instances are identical if they contain the same set of keys, and
+    corresponding values compare equal.
+    """
+    result = True
+
+    l_keys = set(left.keys())
+    r_keys = set(right.keys()) if hasattr(right, "keys") else set()
+    if l_keys != r_keys:
+        opts.log(
+            f"Mismatched {type(left).__name__} keys: {shorten(sorted(l_keys))} "
+            f"!= {shorten(sorted(r_keys))}"
+        )
+        result = False
+
+    # Compare items pairwise
+    for key in sorted(l_keys) if (result or opts.verbose and right is not None) else ():
+        result &= compare(left[key], right.get(key, None), opts)
+        if result is False and not opts.verbose:
+            break
+
+    return result
+
+
+# TODO When dropping support for Python <=3.10, change to '@compare.register'
+@compare.register(list)
+@compare.register(set)
+def _(left: Union[list, set], right, opts, context=""):
+    if len(left) != len(right):
+        opts.log(f"Mismatched length: {len(left)} != {len(right)}")
+        return False
+
+    try:
+        l_values: Iterable = sorted(left)
+        r_values: Iterable = sorted(right)
+    except TypeError:
+        l_values, r_values = left, right
+
+    return all(
+        compare(a, b, opts, f"{context}[{i}]")
+        for i, (a, b) in enumerate(zip(l_values, r_values))
+    )
+
+
+# Types from upstream packages
+
+
+@compare.register
+def _(left: lxml.etree._Element, right, opts, context=""):
+    try:
+        r_val = copy(right)
+        lxml.etree.cleanup_namespaces(r_val)
+    except TypeError:
+        return not opts.strict
+    else:
+        l_val = copy(left)
+        lxml.etree.cleanup_namespaces(l_val)
+        return lxml.etree.tostring(l_val) == lxml.etree.tostring(r_val)
+
+
+# SDMX types
+
+
+@compare.register
+def _(left: internationalstring.InternationalString, right, opts, context=""):
+    return compare(
+        left.localizations, right.localizations, opts, f"{context}.localizations"
+    )
+
+
+def shorten(value: Any) -> str:
+    """Return a shortened :func:`repr` of `value` for logging."""
+    return textwrap.shorten(repr(value), 30, placeholder="…")
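The two techniques that sdmx/compare.py combines, type-based dispatch with a dataclass fallback and an `id()`-based record of visited objects, can be tried in isolation with the standalone sketch below. It is a reduced illustration under stated assumptions rather than the module's code: the `Node` class and `tree()` helper are hypothetical, a plain `set` is passed around in place of `Options`, and there is no logging or strict/verbose handling.

```python
from dataclasses import dataclass, field, fields, is_dataclass
from functools import singledispatch
from typing import Optional


@singledispatch
def compare(left: object, right: object, seen: set) -> bool:
    """Fallback for unregistered types: compare dataclasses field by field."""
    if not is_dataclass(left):
        raise NotImplementedError(f"Cannot compare {type(left)}")
    if type(left) is not type(right):
        return False
    if id(left) in seen:
        return True  # Already visited; do not recurse again
    seen.add(id(left))
    return all(
        compare(getattr(left, f.name), getattr(right, f.name), seen)
        for f in fields(left)
    )


@compare.register(int)
@compare.register(str)
def _eq(left, right, seen) -> bool:
    return left == right


@compare.register(type(None))
def _none(left, right, seen) -> bool:
    return right is None


@compare.register(list)
def _list(left, right, seen) -> bool:
    return (
        isinstance(right, list)
        and len(left) == len(right)
        and all(compare(a, b, seen) for a, b in zip(left, right))
    )


@dataclass
class Node:  # Hypothetical example class with a parent/child cycle
    name: str
    parent: Optional["Node"] = None
    children: list = field(default_factory=list)


def tree(child_name: str) -> Node:
    """Build a root with one child that refers back to the root."""
    root = Node("root")
    root.children.append(Node(child_name, parent=root))
    return root


print(compare(tree("a"), tree("a"), set()))
print(compare(tree("a"), tree("b"), set()))
```

Without the `seen` set, the child's `parent` reference would lead back to the root and the recursion would not terminate. Recording `id(left)` before descending breaks the cycle at the cost of trusting the first visit to each object.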
