Fragment files with \r\n line endings get written out as \r\r\n #524

Open
@bennyrowland

I am using the sphinxcontrib-towncrier Sphinx extension (https://github.com/sphinx-contrib/sphinxcontrib-towncrier), which embeds the latest towncrier draft changelog into the built changelog. This works nicely, but there is an odd effect: when a changelog fragment has multiple lines separated by \r\n line endings, the draft changelog is written with each of those line endings replaced by \r\r\n (a doubled carriage return). Reading that back in Python converts it to \n\n, so each line of the fragment gets its own paragraph in the rendered changelog.

This effect seems to be caused by mixing binary reading of fragments with text output, as alluded to by @hynek in issue #420. I should note that this is all on Windows. Python's text mode normalises newlines to \n internally: reading a file as text converts any \r\n into \n and, correspondingly, writing a string as text on Windows converts each \n back into \r\n. But if the file is read in binary mode, the \r\n sequences are kept as they are, and when that content is later written out as text the \n is still converted to \r\n, leaving the written bytes as \r\r\n.

You can reproduce this with this very simple example code:

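# Text mode on Windows translates the "\n" to "\r\n" on write.
with open("example.txt", "w") as fout:
    fout.write("\r\n")
# Binary mode shows the raw bytes that ended up on disk.
with open("example.txt", "rb") as fin:
    print(fin.read())  # prints b'\r\r\n' on Windows
# Text mode turns both the lone "\r" and the "\r\n" into "\n".
with open("example.txt", "r") as fin:
    print(fin.read().encode("utf8"))  # prints b'\n\n' on Windows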
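For anyone not on Windows, the same round trip can be approximated on any platform by forcing the newline translation explicitly. This is only a sketch using the standard library's io module; the helper names write_as_text and read_as_text are made up for illustration and are not part of towncrier.

```python
import io


def write_as_text(content: str, newline: str = "\r\n") -> bytes:
    """Write content through a text layer that turns "\n" into `newline`.

    newline="\r\n" mimics the default text-mode behaviour on Windows.
    """
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    text.write(content)
    text.flush()
    return raw.getvalue()


def read_as_text(data: bytes) -> str:
    """Read bytes back in text mode with universal newlines (newline=None)."""
    text = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", newline=None)
    return text.read()


# A fragment that was read in binary mode, so it still contains "\r\n".
fragment = "line one\r\nline two"

written = write_as_text(fragment)
print(written)  # the "\n" gains a second "\r" in front of it

read_back = read_as_text(written)
print(read_back.encode("utf-8"))  # each "\r" and "\r\n" becomes its own "\n"
```

The blank line that appears between the two lines after reading back is what ends up as a paragraph break in the rendered changelog.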

Of course, when the content is written straight to the console you don't see this issue, so it is normally fine for --draft. When the content is read back into Python for further use, though, it is a problem (on Windows). The simplest solution would be to encode the draft output before echoing it here:

click.echo(content)

click can handle writing bytes just as easily as a str, and doing so preserves the newlines without adding the extra \r, which makes the problem go away. I can't see any obvious side effects of this encode step for any other use case. Happy to put together a PR if desired, although this is only a one-line change.
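To illustrate why encoding first should help, here is a rough stdlib-only sketch that compares writing a str through a Windows-style text stream with writing the encoded bytes to the underlying binary buffer. It does not call click itself. The emit helper is hypothetical, and the assumption is that the actual change would look something like click.echo(content.encode("utf-8")), with the encoding being my guess.

```python
import io


def emit(content, newline: str = "\r\n") -> bytes:
    """Send content to a simulated Windows-style stdout and return the raw bytes.

    str goes through the text layer (newline translation applies);
    bytes go straight to the binary buffer underneath (no translation).
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    if isinstance(content, bytes):
        stream.flush()
        stream.buffer.write(content)
    else:
        stream.write(content)
    stream.flush()
    return raw.getvalue()


content = "line one\r\nline two"

print(emit(content))                  # text path: an extra "\r" is added
print(emit(content.encode("utf-8")))  # bytes path: "\r\n" is left alone
```

One thing to keep in mind with the bytes path is that a fragment using bare \n would also be passed through untranslated, so the output would no longer be normalised to \r\n on Windows.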
