Open
Description
I downloaded a website from the Internet Archive using wayback-machine-downloader, then created a WARC using warcit with the following command:
warcit --fixed-dt 20100212221453 http://domainname.com /dirpath
It did create a WARC file. I would like to index it into Solr using webarchive-discovery. When trying to do so, I get the following warnings:
2018-08-16 18:22:08 WARN WARCIndexer:414 - Invalid status line: null@28005
2018-08-16 18:22:08 WARN WARCIndexer:414 - Invalid status line: null@40193
2018-08-16 18:22:08 WARN WARCIndexer:414 - Invalid status line: null@79054
I could not load it into AUT either.
An example WARC is attached. Can warcit be used to convert snapshots downloaded from the Internet Archive into WARC format? (Unfortunately, the Internet Archive does not provide a way to download WARCs.)
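In case it helps with narrowing this down, here is a rough sketch of how the attached file could be inspected. It uses only the Python standard library and counts records by WARC-Type and by whether the record block starts with an HTTP status line. The function names `open_warc` and `summarize_records` are made up for this illustration, the parser assumes every record has a correct Content-Length, and this is only one possible check, not a confirmed diagnosis of the indexer warnings.

```python
import gzip
import re
from collections import Counter

# An HTTP status line looks like "HTTP/1.1 200 OK".
STATUS_LINE = re.compile(rb"HTTP/\d(\.\d)? \d{3}")


def open_warc(path):
    """Open a WARC file, handling both plain and gzip-compressed files."""
    with open(path, "rb") as f:
        is_gzip = f.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rb") if is_gzip else open(path, "rb")


def summarize_records(stream):
    """Count records by (WARC-Type, block starts with an HTTP status line)."""
    counts = Counter()
    while True:
        line = stream.readline()
        if not line:
            break  # end of file
        if not line.startswith(b"WARC/"):
            continue  # blank separator lines between records
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # end of the WARC header block
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        # Assumption: Content-Length is present and correct for every record.
        block = stream.read(int(headers.get(b"content-length", b"0")))
        warc_type = headers.get(b"warc-type", b"?").decode("ascii", "replace")
        counts[(warc_type, bool(STATUS_LINE.match(block)))] += 1
    return counts
```

To try it, open the file with `open_warc(path)` (where `path` is wherever the attached WARC was saved) and pass the resulting stream to `summarize_records`. A result with many `False` entries would mean those record blocks do not begin with an HTTP status line, which may or may not be what the indexer is complaining about.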