diff --git a/README.rst b/README.rst index 4fbfb63..23f30d8 100644 --- a/README.rst +++ b/README.rst @@ -26,6 +26,9 @@ Xul -- XML Utilities :target: https://github.com/peteradrichem/Xul/actions/workflows/code-checks.yml :alt: Code checks +.. image:: https://img.shields.io/badge/code%20style-black-000000.svg + :target: https://github.com/psf/black + Xul is a set of XML scripts written in Python. Documentation can be found on `Read The Docs`_. diff --git a/docs/changelog.rst b/docs/changelog.rst index 829d200..aa9913d 100644 --- a/docs/changelog.rst +++ b/docs/changelog.rst @@ -3,16 +3,19 @@ Changelog This document records all notable changes to `Xul `_. -`Unreleased `_ (2025-01-13) +`Unreleased `_ (2025-01-17) -------------------------------------------------------------------------------------- * Drop support for Python < 3.9. * :doc:`xp `: fix boolean result (Python >= 3.12). * :doc:`xp `: fix string result representation (Python 3). * :doc:`xp `: improved printing of namespaces. -* Better error messages. -* Code checks: ruff, black, isort, mypy. +* :doc:`transform `: syntax highlighting for terminal output +* :doc:`transform `: added ``--no-syntax`` option for terminal output +* :doc:`transform `: removed ``--xsl-output`` option (always for ``--file``). +* Fixed encoding issues. +* Clearer error messages. +* Code checks: ruff, black, isort, mypy (GitHub Action). * Test script for local testing with Docker Compose. -* GitHub Action: code checks. * Typing. * Updated Sphinx configuration. * Output formatting (f-strings). diff --git a/docs/index.rst b/docs/index.rst index 11ed2ac..8a38b02 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -29,6 +29,9 @@ Current version: |release| :target: https://github.com/peteradrichem/Xul/actions/workflows/code-checks.yml :alt: Code checks +.. image:: https://img.shields.io/badge/code%20style-black-000000.svg + :target: https://github.com/psf/black + .. 
index:: single: scripts diff --git a/docs/ppx.rst b/docs/ppx.rst index 490c828..877bdbf 100644 --- a/docs/ppx.rst +++ b/docs/ppx.rst @@ -17,6 +17,8 @@ Use ``ppx`` to pretty print an :ref:`xml_source` in human readable form. .. index:: single: white space +``ppx`` will try to use the character encoding of your terminal and defaults to UTF-8. + White Space ----------- For greater readability ``ppx`` removes and adds *white space*. @@ -33,14 +35,14 @@ Options $ ppx --help - usage: ppx [-h] [-V] [-n] [-o] [xml_source [xml_source ...]] + usage: ppx [-h] [-V] [-n] [-o] [xml_source ...] Pretty Print XML source in human readable form. positional arguments: xml_source XML source (file, , http://...) - optional arguments: + options: -h, --help show this help message and exit -V, --version show program's version number and exit -n, --no-syntax no syntax highlighting @@ -50,7 +52,7 @@ Options .. index:: single: ppx script; syntax highlighting - single: syntax highlighting + single: syntax highlighting; ppx Syntax Highlighting ------------------- @@ -70,7 +72,6 @@ You can disable syntax highlighting with the ``--no-syntax`` option. .. index:: single: ppx script; XML declaration - single: XML declaration single: XML declaration; ppx XML declaration diff --git a/docs/transform.rst b/docs/transform.rst index 24fae9b..126b340 100644 --- a/docs/transform.rst +++ b/docs/transform.rst @@ -11,17 +11,23 @@ an :ref:`xml_source`. If you need a command-line XSLT processor with more options have a look at `xsltproc `_ -Transform an XML file: +Output a transformed XML file with syntax highlighting like :doc:`ppx `: .. code-block:: bash transform stylesheet.xsl file.xml -Transform an XML file and :doc:`pretty print ` the result: +Transform an URL: .. code-block:: bash - transform --xsl-output stylesheet.xsl file.xml | ppx + curl -s https://example.com/path/file.xml | transform stylesheet.xsl + +Save transformed XML to a file: + +.. 
code-block:: bash + + transform stylesheet.xsl source.xml --file new.xml Options ------- @@ -31,37 +37,60 @@ Options $ transform --help - usage: transform [-h] [-V] [-x | -o] [-f FILE] xslt_source xml_source + usage: transform [-h] [-V] [-f FILE | -s] [-o] xslt_source [xml_source] - Transform XML source with XSLT. + Transform an XML source with XSLT. positional arguments: xslt_source XSLT source (file, http://...) xml_source XML source (file, , http://...) - optional arguments: + options: -h, --help show this help message and exit -V, --version show program's version number and exit - -x, --xsl-output honor xsl:output + -f FILE, --file FILE save result to file + + terminal output options: + -n, --no-syntax no syntax highlighting -o, --omit-declaration omit the XML declaration - -f FILE, --file FILE save result to file + + +.. index:: + single: transform script; syntax highlighting + single: syntax highlighting; transform + +Syntax highlighting +------------------- + +.. program:: transform +.. option:: -n, --no-syntax + +``transform`` will syntax highlight the XML result if you have Pygments_ installed. +Output the transformed XML without syntax highlighting: + +.. code-block:: bash + + transform --no-syntax stylesheet.xsl file.xml + .. index:: single: transform script; XML declaration single: XML declaration; transform -XSL output ----------- +XML declaration +--------------- +XML documents should begin with an XML declaration which specifies the version of XML being used [#]_. .. program:: transform -.. option:: -x, --xsl-output +.. option:: -o, --omit-declaration -You can honor the ``xsl:output`` element [#]_ with the ``--xsl-output`` option. +You can omit the XML declaration with the ``--omit-declaration`` option. .. 
code-block:: bash - transform --xsl-output stylesheet.xsl file.xml + transform --omit-declaration stylesheet.xsl file.xml + Save transformation result to file ---------------------------------- @@ -86,31 +115,21 @@ Example stylesheet that converts an XML document to UTF-16 encoding: -Save the transformation result to a little-endian UTF-16 Unicode text file. +Save the transformation result to a little-endian UTF-16 text file. .. code-block:: bash - transform --xsl-output to_utf16.xsl utf8.xml --file utf16.xml + transform to_utf16.xsl utf8.xml --file utf16.xml -When saving to file use the ``--xsl-output`` option to preserve the character encoding of the transformation. +Save to file will honor the ``xsl:output`` element [#]_. -XML declaration ---------------- -XML documents should begin with an XML declaration which specifies the version of XML being used [#]_. -.. program:: transform -.. option:: -o, --omit-declaration - -You can omit the XML declaration with the ``--omit-declaration`` option. - -.. code-block:: bash - - transform --omit-declaration stylesheet.xsl file.xml +.. _Pygments: https://pygments.org/ .. rubric:: Footnotes .. [#] `XSL Transformations (XSLT) 1.0 `_ -.. [#] `XSL Transformations: 16 Output `_ .. [#] Extensible Markup Language §2.8 `Prolog and Document Type Declaration `_ +.. [#] `XSL Transformations: 16 Output `_ diff --git a/docs/xp.rst b/docs/xp.rst index de56d4f..fe99727 100644 --- a/docs/xp.rst +++ b/docs/xp.rst @@ -45,24 +45,24 @@ Options Select nodes in an XML source with an XPath expression. positional arguments: - xpath_expr XPath expression - xml_source XML source (file, , http://...) + xpath_expr XPath expression + xml_source XML source (file, , http://...) 
options: - -h, --help show this help message and exit - -V, --version show program's version number and exit - -e, --exslt add EXSLT XML namespaces - -d DEFAULT_NS_PREFIX, --default-prefix DEFAULT_NS_PREFIX - set the prefix for the default namespace in XPath [default: 'd'] - -q, --quiet don't print XML source namespaces - -p, --pretty-element pretty print the result element - -r, --result-xpath print the XPath expression of the result element (or its parent) - -f, -l, --files-with-hits - only the names of files with a non-false and non-NaN result are written to standard output - -F, -L, --files-without-hits - only the names of files with a false or NaN result, or without any results are written to - standard output - -m, --method use ElementTree.xpath method instead of XPath class + -h, --help show this help message and exit + -V, --version show program's version number and exit + -e, --exslt add EXSLT XML namespaces + -d DEFAULT_NS_PREFIX, --default-prefix DEFAULT_NS_PREFIX + set the prefix for the default namespace in XPath [default: 'd'] + -q, --quiet don't print XML source namespaces + -p, --pretty-element pretty print the result element + -r, --result-xpath print the XPath expression of the result element (or its parent) + -f, -l, --files-with-hits + only the names of files with a non-false and non-NaN result are written to standard output + -F, -L, --files-without-hits + only the names of files with a false or NaN result, or without any results are written to + standard output + -m, --method use ElementTree.xpath method instead of XPath class .. 
index:: diff --git a/pyproject.toml b/pyproject.toml index 9feacc4..250b401 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -37,7 +37,7 @@ test = [ "isort~=5.13.2", "lxml-stubs~=0.5.1", "mypy~=1.14.1", - "ruff~=0.9.1", + "ruff~=0.9.2", "types-Pygments~=2.19", ] syntax = [ diff --git a/src/xul/cmd/transform.py b/src/xul/cmd/transform.py index 6ab98e5..2866978 100644 --- a/src/xul/cmd/transform.py +++ b/src/xul/cmd/transform.py @@ -1,4 +1,4 @@ -"""Transform XML source with XSLT.""" +"""Transform an XML source with XSLT.""" import argparse import sys @@ -8,12 +8,14 @@ from .. import __version__ from ..log import setup_logger_console +from ..ppxml import prettyprint from ..xsl import build_xsl_transform, xml_transformer def parse_cl() -> argparse.Namespace: """Parse the command line for options, XSLT source and XML sources.""" parser = argparse.ArgumentParser(description=__doc__) + parser.add_argument("-V", "--version", action="version", version="%(prog)s " + __version__) parser.add_argument("xslt_source", help="XSLT source (file, http://...)") parser.add_argument( @@ -23,14 +25,16 @@ def parse_cl() -> argparse.Namespace: type=argparse.FileType("r"), help="XML source (file, , http://...)", ) - output_group = parser.add_mutually_exclusive_group(required=False) + parser.add_argument("-f", "--file", dest="file", help="save result to file") + + output_group = parser.add_argument_group("terminal output options") output_group.add_argument( - "-x", - "--xsl-output", - action="store_true", - default=False, - dest="xsl_output", - help="honor xsl:output", + "-n", + "--no-syntax", + action="store_false", + default=True, + dest="syntax", + help="no syntax highlighting", ) output_group.add_argument( "-o", @@ -40,7 +44,7 @@ def parse_cl() -> argparse.Namespace: dest="declaration", help="omit the XML declaration", ) - parser.add_argument("-f", "--file", dest="file", help="save result to file") + return parser.parse_args() @@ -55,16 +59,6 @@ def print_result(result) -> None: 
sys.stderr.write(f"Cannot print XSLT result (LookupError): {e}\n") -def save_to_file(result, target_file: str) -> None: - """Save transformation result to file.""" - try: - with open(target_file, "bx") as file_object: - file_object.write(result) - except OSError as e: - sys.stderr.write(f"Saving result to {target_file} failed: {e.strerror}\n") - sys.exit(80) - - def output_xslt( xml_source: Union[TextIO, str], transformer: etree.XSLT, @@ -82,42 +76,18 @@ def output_xslt( if not result: return None + # Result is an lxml.etree._XSLTResultTree. # https://lxml.de/apidoc/lxml.etree.html#lxml.etree._XSLTResultTree - # _XSLTResultTree (./src/lxml/xslt.pxi): - if result.getroot() is None: - # Result (lxml.etree._XSLTResultTree) is not an ElementTree. - if args.file: - save_to_file(result, args.file) - else: - print_result(result) - return None + + if args.file: + return result.write_output(args.file) # type: ignore[attr-defined] # https://lxml.de/xpathxslt.html#xslt-result-objects - if args.xsl_output: - if args.file: - save_to_file(result, args.file) - else: - # Standard output: sys.stdout.encoding. - # Document labelled UTF-16 but has UTF-8 content: - # str(result, result.docinfo.encoding) == - # bytes(result).decode(result.docinfo.encoding) - print_result(result) - - # https://lxml.de/parsing.html#serialising-to-unicode-strings - # For normal byte encodings, the tostring() function automatically adds - # a declaration as needed that reflects the encoding of the returned string. - else: - # lxml.etree.tostring returns bytes. - etree_result = etree.tostring( - result, encoding=result.docinfo.encoding, xml_declaration=args.declaration - ) - if args.file: - save_to_file(etree_result, args.file) - else: - # Bytes => unicode string (Python 3). - print_result(etree_result.decode(result.docinfo.encoding)) # type: ignore[arg-type,union-attr] + if result.getroot() is None: + # Result is not an ElementTree. 
+ return print_result(result) - return None + prettyprint(result, syntax=args.syntax, xml_declaration=args.declaration) def main(): @@ -139,7 +109,5 @@ def main(): sys.stderr.write("No XSLT source specified\n") sys.exit(50) - # Initialise XML parser. - parser = etree.XMLParser() # Transform XML source with XSL Transformer. - output_xslt(args.xml_source, transformer, parser, args) + output_xslt(args.xml_source, transformer, etree.XMLParser(), args) diff --git a/src/xul/ppxml.py b/src/xul/ppxml.py index a5e201b..00870b1 100644 --- a/src/xul/ppxml.py +++ b/src/xul/ppxml.py @@ -19,8 +19,16 @@ def _private_pp( :param syntax: syntax highlighting (or not) :param xml_declaration: print an XML declaration (or not) - https://lxml.de/api.html#serialisation - https://lxml.de/apidoc/lxml.etree.html#lxml.etree.tostring + Serialising to Unicode strings + https://lxml.de/parsing.html#serialising-to-unicode-strings + For normal byte encodings, the tostring() function automatically adds + a declaration as needed that reflects the encoding of the returned string. + + Pretty printing + https://lxml.de/api.html#serialisation + + lxml.etree.tostring + https://lxml.de/apidoc/lxml.etree.html#lxml.etree.tostring """ try: encoding = "utf-8" if sys.stdout.encoding is None else sys.stdout.encoding