validate and xp: grouped options and documentation.
peteradrichem committed Jan 18, 2025
1 parent d63283c commit d4f033a
Showing 6 changed files with 156 additions and 120 deletions.
50 changes: 33 additions & 17 deletions README.rst
@@ -30,8 +30,6 @@ Xul -- XML Utilities
    :target: https://github.com/psf/black

 Xul is a set of XML scripts written in Python.
-Documentation can be found on `Read The Docs`_.
-

 Xul scripts
 ===========
@@ -60,32 +58,50 @@ Dependencies
 Xul uses the excellent lxml_ XML toolkit, a Pythonic binding for the C libraries
 libxml2_ and libxslt_.

+Documentation
+=============
+Xul documentation can be found on `Read The Docs`_.
+
 Options
 -------
 List the command-line options of a Xul script with ``--help``.
 For example:

 .. code::
-   $ ppx --help
+   $ xp --help
-   usage: ppx [-h] [-V] [-n] [-o] [xml_source [xml_source ...]]
+   usage: xp [-h] [-V] [-l | -L] [-d DEFAULT_NS_PREFIX] [-e] [-q] [-p] [-r] [-m] xpath_expr [xml_source ...]
-   Pretty Print XML source in human readable form.
+   Select nodes in an XML source with an XPath expression.
    positional arguments:
-     xml_source            XML source (file, <stdin>, http://...)
-   optional arguments:
-     -h, --help            show this help message and exit
-     -V, --version         show program's version number and exit
-     -n, --no-syntax       no syntax highlighting
-     -o, --omit-declaration
-                           omit the XML declaration
-Documentation
-=============
-Xul documentation can be found on `Read The Docs`_.
+     xpath_expr            XPath expression
+     xml_source            XML source (file, <stdin>, http://...)
+   options:
+     -h, --help            show this help message and exit
+     -V, --version         show program's version number and exit
+     -m, --method          use ElementTree.xpath method instead of XPath class
+   file hit options:
+     output filenames to standard output
+     -l, -f, --files-with-hits
+                           only the names of files with a non-false and non-NaN result are written to standard output
+     -L, -F, --files-without-hits
+                           only the names of files with a false or NaN result, or without any results are written to
+                           standard output
+   namespace options:
+     -d DEFAULT_NS_PREFIX, --default-prefix DEFAULT_NS_PREFIX
+                           set the prefix for the default namespace in XPath [default: 'd']
+     -e, --exslt           add EXSLT XML namespaces
+     -q, --quiet           don't print XML source namespaces
+   output options:
+     -p, --pretty-element  pretty print the result element
+     -r, --result-xpath    print the XPath expression of the result element (or its parent)
 W3C standards
 -------------
3 changes: 2 additions & 1 deletion docs/changelog.rst
@@ -3,7 +3,7 @@ Changelog

 This document records all notable changes to `Xul <https://xul.readthedocs.io/>`_.

-`Unreleased <https://github.com/peteradrichem/Xul/compare/2.5.1...py3k>`_ (2025-01-17)
+`Unreleased <https://github.com/peteradrichem/Xul/compare/2.5.1...py3k>`_ (2025-01-18)
 --------------------------------------------------------------------------------------
 * Drop support for Python < 3.9.
 * :doc:`xp <xp>`: fix boolean result (Python >= 3.12).
@@ -14,6 +14,7 @@ This document records all notable changes to `Xul <https://xul.readthedocs.io/>`
 * :doc:`transform <transform>`: removed ``--xsl-output`` option (always for ``--file``).
 * Fixed encoding issues.
 * Clearer error messages.
+* Improved documentation; grouped CLI options.
 * Code checks: ruff, black, isort, mypy (GitHub Action).
 * Test script for local testing with Docker Compose.
 * Typing.
38 changes: 21 additions & 17 deletions docs/validate.rst
@@ -67,30 +67,34 @@ Options
    $ validate --help
-   usage: validate [-h] [-V] (-x XSD_SOURCE | -d DTD_SOURCE | -r RELAXNG_SOURCE)
-                   [-f | -F]
-                   [xml_source [xml_source ...]]
+   usage: validate [-h] [-V] (-x XSD_SOURCE | -d DTD_SOURCE | -r RELAXNG_SOURCE) [-l | -L] [xml_source ...]
-   Validate XML source with XSD, DTD or RELAX NG.
+   Validate an XML source with XSD, DTD or RELAX NG.
    positional arguments:
      xml_source            XML source (file, <stdin>, http://...)
-   optional arguments:
+   options:
      -h, --help            show this help message and exit
      -V, --version         show program's version number and exit
+   XML validator:
+     choose an XML validator: XSD, DTD or RELAX NG
      -x XSD_SOURCE, --xsd XSD_SOURCE
                            XML Schema Definition (XSD) source
      -d DTD_SOURCE, --dtd DTD_SOURCE
                            Document Type Definition (DTD) source
      -r RELAXNG_SOURCE, --relaxng RELAXNG_SOURCE
                            RELAX NG source
-     -f, -l, --validated-files
-                           only the names of validated XML files are written to
-                           standard output
-     -F, -L, --invalidated-files
-                           only the names of invalidated XML files are written to
-                           standard output
+   file hit options:
+     output filenames to standard output
+     -l, -f, --validated-files
+                           only the names of validated XML files are written to standard output
+     -L, -F, --invalidated-files
+                           only the names of invalidated XML files are written to standard output
 XML Validation
@@ -155,10 +159,10 @@ Validate the XML Schema XSD with the
 Print file names
 ----------------
 .. program:: validate
-.. option:: -f, -l, --validated-files
+.. option:: -l, -f, --validated-files

-The ``-f, -l, --validated-files`` command-line option only prints the names
-of validated XML files.
+The ``--validated-files`` command-line option only prints the names of validated XML files
+(similar to ``grep --files-with-matches``).

 Find XML files that validate:

@@ -167,10 +171,10 @@ Find XML files that validate:
    validate -x schema.xsd *.xml -l
 .. program:: validate
-.. option:: -F, -L, --invalidated-files
+.. option:: -L, -F, --invalidated-files

-The ``-F, -L, --invalidated-files`` command-line option only prints the names
-of XML files that don't validate.
+The ``--invalidated-files`` command-line option only prints the names of XML files that don't validate
+(similar to ``grep --files-without-match``).

 Remove XML files that fail to validate:

80 changes: 45 additions & 35 deletions docs/xp.rst
@@ -40,7 +40,7 @@ Options
    $ xp --help
-   usage: xp [-h] [-V] [-e] [-d DEFAULT_NS_PREFIX] [-q] [-p] [-r] [-f | -F] [-m] xpath_expr [xml_source ...]
+   usage: xp [-h] [-V] [-l | -L] [-d DEFAULT_NS_PREFIX] [-e] [-q] [-p] [-r] [-m] xpath_expr [xml_source ...]
    Select nodes in an XML source with an XPath expression.
@@ -51,18 +51,26 @@ Options
    options:
      -h, --help            show this help message and exit
      -V, --version         show program's version number and exit
-     -e, --exslt           add EXSLT XML namespaces
-     -d DEFAULT_NS_PREFIX, --default-prefix DEFAULT_NS_PREFIX
-                           set the prefix for the default namespace in XPath [default: 'd']
-     -q, --quiet           don't print XML source namespaces
-     -p, --pretty-element  pretty print the result element
-     -r, --result-xpath    print the XPath expression of the result element (or its parent)
-     -f, -l, --files-with-hits
+     -m, --method          use ElementTree.xpath method instead of XPath class
+   file hit options:
+     output filenames to standard output
+     -l, -f, --files-with-hits
                            only the names of files with a non-false and non-NaN result are written to standard output
-     -F, -L, --files-without-hits
+     -L, -F, --files-without-hits
                            only the names of files with a false or NaN result, or without any results are written to
                            standard output
-     -m, --method          use ElementTree.xpath method instead of XPath class
+   namespace options:
+     -d DEFAULT_NS_PREFIX, --default-prefix DEFAULT_NS_PREFIX
+                           set the prefix for the default namespace in XPath [default: 'd']
+     -e, --exslt           add EXSLT XML namespaces
+     -q, --quiet           don't print XML source namespaces
+   output options:
+     -p, --pretty-element  pretty print the result element
+     -r, --result-xpath    print the XPath expression of the result element (or its parent)
 .. index::
@@ -90,6 +98,27 @@ List the XPath expressions of all elements with attributes:
    xp -r "//@*" file.xml
+.. index::
+   single: xp script; pretty print
+
+Pretty print result element
+---------------------------
+.. program:: xp
+.. option:: -p, --pretty-element
+
+A result element node can be pretty printed with the ``--pretty-element`` command-line option.
+
+.. warning:: The ``--pretty-element`` option removes all white space text nodes
+   *before* applying the XPath expression. Therefore there will be no white space
+   text nodes in the results.
+
+Pretty print the latest Python PEP:
+
+.. code-block:: bash
+   curl -s https://peps.python.org/peps.rss | xp "//item[1]" -p
 .. index::
    single: xp script; namespaces
    single: XML Namespaces
@@ -174,43 +203,23 @@ Find Python PEPs with four digits in the title (case-insensitive):
    xp -e '//item/title[re:match(text(), "pep [0-9]{4}:", "i")]' -q
-.. index::
-   single: xp script; pretty print
-
-Pretty print element result
----------------------------
-.. program:: xp
-.. option:: -p, --pretty-element
-
-A result element node can be pretty printed with the ``--pretty-element`` command-line option.
-
-.. warning:: The ``--pretty-element`` option removes all white space text nodes
-   *before* applying the XPath expression. Therefore there will be no white space
-   text nodes in the results.
-
-Pretty print the latest Python PEP:
-
-.. code-block:: bash
-   curl -s https://peps.python.org/peps.rss | xp "//item[1]" -p
 .. index::
    single: xp script; file names

 Print file names
 ----------------
 .. program:: xp
-.. option:: -f, -l, --files-with-hits
+.. option:: -l, -f, --files-with-hits

 The ``--files-with-hits`` command-line option only prints the names
 of files with an XPath result that is not false and not NaN (not a number).
+This is similar to ``grep --files-with-matches`` using XPath instead of regular expressions.

 Find XML files with HTTP URL's:

 .. code-block:: bash
-   xp "//mpeg7:MediaUri[starts-with(., 'http://')]" *.xml -f
+   xp "//mpeg7:MediaUri[starts-with(., 'http://')]" *.xml -l
 XML files where all the book prices are below € 25,-.

@@ -219,16 +228,17 @@ XML files where all the book prices are below € 25,-.
    xp -el "math:max(//book/price[@currency='€'])<25" *.xml
 .. program:: xp
-.. option:: -F, -L, --files-without-hits
+.. option:: -L, -F, --files-without-hits

 The ``--files-without-hits`` command-line option only prints the names
 of files without any XPath results, or with a false or NaN result.
+This is similar to ``grep --files-without-match`` using XPath instead of regular expressions.

 XML files without a person with the family name 'Bauwens':

 .. code-block:: bash
-   xp "//mpeg7:FamilyName[text()='Bauwens']" *.xml -L
+   xp "//mpeg7:FamilyName[text()='Bauwens']" *.xml -L
 xpath method
 ------------
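
The hit rule documented for ``--files-with-hits`` and ``--files-without-hits`` in docs/xp.rst can be written down as a small predicate. The following is only an illustrative sketch in plain Python, not code from Xul: the names ``is_hit`` and ``files_to_print`` are hypothetical, the result types (bool, float, list) are an assumption about what an XPath evaluation returns, and treating every other result type as a hit is likewise an assumption.

```python
import math
from typing import Any, Dict, List


def is_hit(result: Any) -> bool:
    """Return True when an XPath result counts as a hit (hypothetical helper)."""
    # bool is checked first: a boolean result is a hit only when it is true.
    if isinstance(result, bool):
        return result
    # A numeric result is a hit unless it is NaN (not a number).
    if isinstance(result, float):
        return not math.isnan(result)
    # A node-set style result is a hit when it contains at least one item.
    if isinstance(result, list):
        return len(result) > 0
    # Assumption: any other result type (for example a string) is a hit.
    return True


def files_to_print(results: Dict[str, Any], without_hits: bool = False) -> List[str]:
    """Select file names the way -l (default) or -L (without_hits=True) would."""
    return [name for name, result in results.items() if is_hit(result) != without_hits]
```

Under these assumptions every file ends up in exactly one of the two lists, which mirrors the mutually exclusive ``[-l | -L]`` pair in the usage line.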
33 changes: 18 additions & 15 deletions src/xul/cmd/validate.py
@@ -1,4 +1,4 @@
-"""Validate XML source with XSD, DTD or RELAX NG."""
+"""Validate an XML source with XSD, DTD or RELAX NG."""

 import argparse
 import sys
@@ -7,6 +7,7 @@
 from lxml import etree

 from .. import __version__
+from ..etree import get_source_name
 from ..log import setup_logger_console
 from ..validate import build_dtd, build_relaxng, build_xml_schema, validate_xml

@@ -15,33 +16,39 @@ def parse_cl() -> argparse.Namespace:
     """Parse the command line for options and XML sources."""
     parser = argparse.ArgumentParser(description=__doc__)
     parser.add_argument("-V", "--version", action="version", version="%(prog)s " + __version__)
-    lang_group = parser.add_mutually_exclusive_group(required=True)
-    lang_group.add_argument(
+    validator_args_group = parser.add_argument_group(
+        title="XML validator", description="choose an XML validator: XSD, DTD or RELAX NG"
+    )
+    validator_group = validator_args_group.add_mutually_exclusive_group(required=True)
+    validator_group.add_argument(
         "-x", "--xsd", action="store", dest="xsd_source", help="XML Schema Definition (XSD) source"
     )
-    lang_group.add_argument(
+    validator_group.add_argument(
         "-d",
         "--dtd",
         action="store",
         dest="dtd_source",
         help="Document Type Definition (DTD) source",
     )
-    lang_group.add_argument(
+    validator_group.add_argument(
         "-r", "--relaxng", action="store", dest="relaxng_source", help="RELAX NG source"
     )
-    file_group = parser.add_mutually_exclusive_group(required=False)
-    file_group.add_argument(
-        "-f",
+    file_group = parser.add_argument_group(
+        title="file hit options", description="output filenames to standard output"
+    )
+    file_hit_group = file_group.add_mutually_exclusive_group(required=False)
+    file_hit_group.add_argument(
         "-l",
+        "-f",
         "--validated-files",
         action="store_true",
         default=False,
         dest="validated_files",
         help="only the names of validated XML files are written to standard output",
     )
-    file_group.add_argument(
-        "-F",
+    file_hit_group.add_argument(
         "-L",
+        "-F",
         "--invalidated-files",
         action="store_true",
         default=False,
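
The pattern in this hunk, a titled argument group that wraps a mutually exclusive group, can be tried in isolation. The following is a minimal stand-alone sketch rather than the Xul source: ``build_parser`` and the ``demo`` program name are hypothetical, and only the option strings and group title are copied from the diff.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser with grouped, mutually exclusive options (illustrative)."""
    parser = argparse.ArgumentParser(prog="demo", description="Grouped options demo.")
    # The titled group only changes how --help is laid out.
    file_group = parser.add_argument_group(
        title="file hit options", description="output filenames to standard output"
    )
    # Exclusivity is enforced by the mutually exclusive group nested inside it.
    file_hit_group = file_group.add_mutually_exclusive_group(required=False)
    file_hit_group.add_argument(
        "-l", "-f", "--validated-files", action="store_true", dest="validated_files"
    )
    file_hit_group.add_argument(
        "-L", "-F", "--invalidated-files", action="store_true", dest="invalidated_files"
    )
    parser.add_argument("xml_source", nargs="*")
    return parser


if __name__ == "__main__":
    print(build_parser().format_help())
```

A mutually exclusive group cannot carry a title or description of its own, which is presumably why the commit nests it inside an argument group; both short spellings of each option (``-l``/``-f``, ``-L``/``-F``) keep working.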
@@ -71,11 +78,7 @@ def apply_validator(
     if args.validated_files or args.invalidated_files:
         valid = validate_xml(xml_source, validator, silent=True)
         if (valid and args.validated_files) or (not valid and args.invalidated_files):
-            if xml_source in ("-", sys.stdin):
-                # <stdin>.
-                print(sys.stdin.name)
-            else:
-                print(xml_source)
+            print(get_source_name(xml_source))
     else:
         validate_xml(xml_source, validator)
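
The body of ``get_source_name`` (imported from ``..etree``) is not part of the visible diff. Judging only from the five lines it replaces here, a helper along the following lines would preserve the old behaviour. This is a guess for illustration: the signature, the type hints and the ``"<stdin>"`` fallback are assumptions, not the actual Xul implementation.

```python
import sys
from typing import TextIO, Union


def get_source_name(xml_source: Union[str, TextIO]) -> str:
    """Return a printable name for an XML source (hypothetical reconstruction)."""
    if xml_source in ("-", sys.stdin):
        # Standard input: report the stream name, normally "<stdin>".
        # The getattr fallback is an assumption for replaced stdin objects.
        return getattr(sys.stdin, "name", "<stdin>")
    # Anything else (file path or URL) is printed as given.
    return str(xml_source)
```

Moving the check into one function lets ``validate`` and other scripts share a single place where standard input is named.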
