This repository was archived by the owner on Mar 13, 2023. It is now read-only.
Article titles can contain ":" #3
Open
Description
Currently, analyze_chunk() removes every title that contains ":" on the assumption that such titles are outside the main namespace. However, article titles can themselves contain colons, e.g. Batman v Superman: Dawn of Justice or UTC+03:00 (or, on Simple Wikipedia, UTC+08:00 or Avatar: The Last Airbender), so the filter also discards legitimate articles. Many of these titles are redirects to titles without a colon, but all redirects have already been removed by this point in the function, so that does not explain or excuse the filter.
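One possible direction, sketched below under stated assumptions, is to drop a title only when the text before its first colon is a known namespace prefix, rather than dropping every title that contains a colon. The names `NON_MAINSPACE_PREFIXES`, `is_mainspace_title`, and `filter_mainspace` are hypothetical and do not come from this repository, and the prefix set is an illustrative subset of English Wikipedia namespaces; other wikis (including Simple Wikipedia) may use different or additional prefixes, so the real list should probably come from the wiki's own configuration.

```python
# Illustrative subset of English Wikipedia namespace prefixes, lowercased.
# This set is an assumption for the sketch; it is not taken from the repository.
NON_MAINSPACE_PREFIXES = frozenset({
    "talk", "user", "user talk", "wikipedia", "wikipedia talk",
    "file", "file talk", "mediawiki", "mediawiki talk",
    "template", "template talk", "help", "help talk",
    "category", "category talk", "portal", "portal talk",
    "draft", "draft talk", "module", "module talk",
    "special", "media",
})


def is_mainspace_title(title: str) -> bool:
    """Return True unless the title starts with a known namespace prefix."""
    prefix, sep, _rest = title.partition(":")
    if not sep:
        # No colon at all, so the title cannot carry a namespace prefix.
        return True
    # Namespace names are case-insensitive and may use underscores for spaces.
    normalized = prefix.replace("_", " ").strip().lower()
    return normalized not in NON_MAINSPACE_PREFIXES


def filter_mainspace(titles):
    """Keep only titles that look like main-namespace articles."""
    return [t for t in titles if is_mainspace_title(t)]
```

With this check, titles such as Batman v Superman: Dawn of Justice and UTC+03:00 are kept, while a title like Template:Infobox is still dropped. If the input data already carries a namespace identifier for each page, filtering on that field would likely be more reliable than inspecting the title text.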
Metadata
Labels
No labels