Break after footnote markers #17

@alxtsg

Given the following piece of text extracted from a Wikipedia page:

Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4] It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.

I get the following result when using sentence-splitter to split the text:

Sentence #0: Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4] It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.

Does it make sense to split the text after the footnote marker [4]? The result would become:

Sentence #0: Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4]
Sentence #1: It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.
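
To make the proposed rule concrete, here is a minimal sketch of the behaviour I have in mind. It is a standalone illustration and does not use or reflect sentence-splitter's internals; the function name `splitWithFootnotes` is hypothetical. It assumes footnote markers are numeric and bracketed (such as `[4]`) and treats any `.`, `!` or `?` followed by whitespace as a sentence end, so it would wrongly split on abbreviations like "e.g.".

```typescript
// Hypothetical helper for illustration only; not part of sentence-splitter.
// A sentence ends at ., ! or ? plus any directly attached footnote
// markers ([4] or [5][6][7]), provided whitespace follows.
function splitWithFootnotes(text: string): string[] {
  // Group 1 captures the terminator and its trailing footnote markers.
  const boundary = /([.!?](?:\[\d+\])*)\s+/g;
  const sentences: string[] = [];
  let start = 0;
  let match: RegExpExecArray | null;

  while ((match = boundary.exec(text)) !== null) {
    // Keep the terminator and markers with the sentence they close.
    const end = match.index + match[1].length;
    sentences.push(text.slice(start, end));
    // Skip the whitespace between sentences.
    start = boundary.lastIndex;
  }

  // Whatever remains after the last boundary is the final sentence.
  if (start < text.length) {
    sentences.push(text.slice(start));
  }
  return sentences;
}
```

With this rule, the marker `[4]` stays attached to the first sentence, while `[3]` and `[5][6][7]` do not cause a split because no sentence-final punctuation precedes them.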

If this makes sense, where should I start changing the code?
