Description
Motivation and context:
Briefly describe the dataset. What is it, and why do we want to archive it regularly?
Include a link to the dataset webpage and any metadata documentation.
Links to all of the files appear on the two pages above, but the URLs where the data is actually stored all seem to follow this pattern:
https://www.epa.gov/system/files/documents/{year of publication}-{month}/egrid{data year}{file name}
Note that the publication date differs from the data year. The data year is what we want to reference when we archive this data.
Also note that there are several files per year. We want to grab all of them, so you'll need to use `add_to_archive`.
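As a rough illustration of the pattern above, here is a minimal sketch of how the file URLs could be parsed and grouped by data year rather than publication date. It is not the archiver implementation: the function name `group_urls_by_data_year`, the assumption that the month is two digits and the data year four digits, and the example URLs are all hypothetical and should be checked against the real links.

```python
import re
from collections import defaultdict

# Assumed shape of the pattern described above:
# .../documents/{year of publication}-{month}/egrid{data year}{file name}
# The digit counts (4-digit years, 2-digit month) are assumptions.
EGRID_URL_PATTERN = re.compile(
    r"^https://www\.epa\.gov/system/files/documents/"
    r"(?P<pub_year>\d{4})-(?P<pub_month>\d{2})/"
    r"egrid(?P<data_year>\d{4})(?P<file_name>.+)$",
    re.IGNORECASE,
)


def group_urls_by_data_year(urls):
    """Group eGRID file URLs by data year, ignoring the publication date.

    URLs that do not match the assumed pattern are skipped.
    """
    grouped = defaultdict(list)
    for url in urls:
        match = EGRID_URL_PATTERN.match(url)
        if match is None:
            continue
        grouped[int(match.group("data_year"))].append(url)
    return dict(grouped)


if __name__ == "__main__":
    # Illustrative URLs only; the real links come from the EPA pages.
    example_urls = [
        "https://www.epa.gov/system/files/documents/2024-01/egrid2022_data.xlsx",
        "https://www.epa.gov/system/files/documents/2024-01/egrid2022_summary_tables.pdf",
    ]
    for data_year, year_urls in group_urls_by_data_year(example_urls).items():
        print(data_year, len(year_urls))
```

Grouping on the data year keeps every file for a given year together, so they can all be added to the same per-year archive regardless of when EPA published them.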
Requirements for archiving
To be archived on Zenodo, a dataset must be:
- published under an open license that permits reuse and redistribution
- less than 50 GB in size (when zipped)
- relevant to energy modelling and research
Checklist for archive creation
Based on the README documentation on creating a new archive:
- [x] [Define the dataset's metadata](https://github.com/catalyst-cooperative/pudl-archiver#step-1-define-the-datasets-metadata)
- [ ] [Implement archiver interface](https://github.com/catalyst-cooperative/pudl-archiver#step-2-implement-archiver-interface)
- [ ] [Test archiver locally](https://github.com/catalyst-cooperative/pudl-archiver#step-3-test-archiver-locally)
- [ ] [Test uploading to Zenodo](https://github.com/catalyst-cooperative/pudl-archiver#step-4-test-uploading-to-zenodo)
- [ ] [Manually review archive before publication](https://github.com/catalyst-cooperative/pudl-archiver#step-5-manually-review-your-archive-before-publication)
- [ ] [Finalize archive](https://github.com/catalyst-cooperative/pudl-archiver#step-6-finalizing-the-archive) (only core Catalyst developers can complete this step)
- [ ] [Automate archiving](https://github.com/catalyst-cooperative/pudl-archiver#step-7-automate-archiving)
Links to published archives:
Include a link to the published sandbox archive for review.