retain authored HTML for empty elements #95

@kartikprabhu

Because mf2py uses BeautifulSoup, it currently serializes empty (void) HTML elements as self-closing tags, e.g. <br> is converted to <br/> and <hr> is converted to <hr/>. This makes the e-content[html] value differ from the authored markup.

This does not seem to cause problems in actual use, but it will affect any tests that compare the output against the authored HTML, so I am documenting it here.
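Until the output matches the authored markup, a test could normalize void tags before comparing. The following is only a minimal sketch of that idea, not part of mf2py: `normalize_void_tags` and `VOID_ELEMENTS` are hypothetical names, and the regular expression assumes well-formed tags with no `<` or `>` inside attribute values.

```python
import re

# Void elements as listed in the HTML standard; they never take a closing tag.
VOID_ELEMENTS = (
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
)

# Matches a void element written in self-closing form,
# e.g. "<br/>", "<br />" or '<img src="a.png"/>'.
_SELF_CLOSED_VOID = re.compile(
    r"<(%s)(?=[\s/])([^<>]*?)\s*/>" % "|".join(VOID_ELEMENTS),
    re.IGNORECASE,
)


def normalize_void_tags(html):
    """Hypothetical test helper: rewrite '<br/>' style void tags as '<br>'."""
    return _SELF_CLOSED_VOID.sub(r"<\1\2>", html)


authored = "<p>line one<br>line two</p><hr>"
serialized = "<p>line one<br/>line two</p><hr/>"

# After normalization the serialized form compares equal to the authored one.
assert normalize_void_tags(serialized) == authored
```

This only hides the difference inside tests; the parsed e-content[html] itself would still contain the self-closing form.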

Details

By default, html5lib does not do this; see: https://github.com/html5lib/html5lib-python/blob/5e6b61b4630165dd4765fff41d0f855534d5e2fe/html5lib/serializer.py#L114

The relevant lines in BeautifulSoup that explicitly do this are https://github.com/waylan/beautifulsoup/blob/480367ce8c8a4d1ada3012a95f0b5c2cce4cf497/bs4/element.py#L1106-L1107 (note that this repository is not the canonical source for BS4).
