Hi,
I have this code, which works perfectly fine on Ubuntu 18.04:
from mw import xml_dump

# For instance, with frwiki dump in 7z format
files = ["examples/dump.xml.7z"]

def page_info(dump, path):
    for page in dump:
        yield page.id, page.namespace, page.title

for page_id, page_namespace, page_title in xml_dump.map(files, page_info):
    print(" ".join([str(page_id), str(page_namespace), page_title]))
It no longer works when I run it on CentOS in my lab.
The problem goes away when I convert my dump to plain XML or even BZ2 format. However, this is not a solution for me: the resulting files are much too large. I really need the 7z format to be supported.
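A possible first check on the CentOS machine is sketched below. It assumes that 7z support depends on an external 7-Zip executable being on the PATH, which I have not confirmed in the mw source. The candidate names `7z`, `7za` and `7zr` and the helper `find_7z_binaries` are only illustrative and are not part of mw.

```python
import shutil

# Candidate executable names. Which one (if any) the library actually
# invokes is an assumption, not something confirmed from its source.
CANDIDATES = ("7z", "7za", "7zr")


def find_7z_binaries(names=CANDIDATES):
    """Map each candidate name to its resolved path, or None if absent."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_7z_binaries().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If only some of these names resolve on CentOS while `7z` resolves on Ubuntu, a missing or differently named executable could explain the difference between the two systems.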