Open
Labels
Status: Proposal, Request for comments
Description
Given the following piece of text extracted from a Wikipedia page:
Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4] It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.
I get the following result when using sentence-splitter to split the text:
Sentence #0: Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4] It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.
Would it make sense to split the text after the footnote marker [4]? The result would then be:
Sentence #0: Wikipedia is a multilingual online encyclopedia created and maintained as an open collaboration project[3] by a community of volunteer editors using a wiki-based editing system.[4]
Sentence #1: It is the largest and most popular general reference work on the World Wide Web,[5][6][7] and is one of the most popular websites ranked by Alexa as of October 2019.
If this makes sense, where should I start changing the code?
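To make the proposed behaviour concrete, here is a minimal sketch of the rule as a standalone pre-processing step. It does not touch or call sentence-splitter itself; `splitAtFootnoteMarkers` is a hypothetical helper name, and the rule it encodes (a sentence terminator, then one or more `[n]` markers, then whitespace before an uppercase letter) is only my assumption about what a reasonable boundary looks like.

```typescript
// Hypothetical helper, not part of sentence-splitter's API.
// Splits text wherever a sentence terminator is immediately followed by
// one or more footnote markers like [4], then whitespace and an uppercase letter.
function splitAtFootnoteMarkers(text: string): string[] {
  // Group 1 captures the terminator plus its trailing footnote markers,
  // so the markers stay attached to the sentence they annotate.
  const boundary = /([.!?](?:\[\d+\])+)\s+(?=[A-Z])/g;
  const sentences: string[] = [];
  let start = 0;
  let match: RegExpExecArray | null;
  while ((match = boundary.exec(text)) !== null) {
    const end = match.index + match[1].length;
    sentences.push(text.slice(start, end));
    // Skip the whitespace between the two sentences.
    start = match.index + match[0].length;
  }
  const rest = text.slice(start).trim();
  if (rest.length > 0) {
    sentences.push(rest);
  }
  return sentences;
}

const sample =
  "It uses a wiki-based editing system.[4] It is the largest reference work on the Web,[5][6][7] and is very popular.";
splitAtFootnoteMarkers(sample).forEach((sentence, i) => {
  console.log(`Sentence #${i}: ${sentence}`);
});
```

With this rule the markers [5][6][7] do not cause a split, because they follow a comma rather than a sentence terminator. Each resulting piece could then be passed to the splitter as usual, though handling this inside the library would presumably be cleaner.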