
Fetch datasets as release assets (instead of Git LFS pull) #190

Open
@anthonyfok

Description


Large datasets, mostly CSV files, are currently fetched directly via Git LFS, which incurs significant Git LFS bandwidth costs.

Fetching these datasets as pre-compressed release assets will reduce download time and eliminate most GitHub Git LFS bandwidth costs. Thanks to @jvanulde for the idea and @DamonU2 for the pioneering work.
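As a rough sketch of the idea (the tag and asset names here are hypothetical examples, not the final naming scheme), a pre-compressed release asset can be fetched with plain curl against GitHub's standard release-download URL:

```shell
#!/bin/sh
# Sketch: fetch a pre-compressed dataset from a GitHub release
# instead of pulling it through Git LFS.
# NOTE: the tag and asset names below are hypothetical examples.

# Build the standard release-asset download URL:
#   https://github.com/<owner>/<repo>/releases/download/<tag>/<asset>
release_asset_url() {
  owner="$1" repo="$2" tag="$3" asset="$4"
  printf 'https://github.com/%s/%s/releases/download/%s/%s\n' \
    "$owner" "$repo" "$tag" "$asset"
}

fetch_release_asset() {
  url=$(release_asset_url "$@")
  asset="$4"
  # -L follows the S3 redirect GitHub issues for release assets
  curl -sSL -o "$asset" "$url"
  # Decompress in place if the asset is xz- or zstd-compressed
  case "$asset" in
    *.xz)  xz -d -f "$asset" ;;
    *.zst) zstd -d -f --rm "$asset" ;;
  esac
}

# Example (hypothetical tag and asset name):
release_asset_url OpenDRR openquake-inputs v1.0.0 exposure.csv.xz
# → https://github.com/OpenDRR/openquake-inputs/releases/download/v1.0.0/exposure.csv.xz
```

Unlike the Git LFS endpoint, release-asset downloads do not count against the repository's LFS bandwidth quota.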

This approach is, I think, easier to implement and maintain, and thus more robust and less error-prone, than my earlier unimplemented "XZ-compressed copies of repos" idea.

Data source repos:

  • OpenDRR/openquake-inputs
  • OpenDRR/model-inputs
  • OpenDRR/canada-srm2
  • OpenDRR/earthquake-scenarios

Scripts that fetch from these repos include (but may not be limited to):

  • python/add_data.sh (OpenDRR/opendrr-api)
  • scripts/DSRA_outputs2postgres_lfs.py (OpenDRR/model-factory)

Compare, for example, these commands found in add_data.sh:

fetch_csv openquake-inputs ...
fetch_csv model-inputs ...
curl -L https://api.github.com/repos/OpenDRR/canada-srm2/contents/cDamage/output?ref=tieg_natmodel2021
curl -L https://api.github.com/repos/OpenDRR/earthquake-scenarios/contents/FINISHED
python3 DSRA_outputs2postgres_lfs.py --dsraModelDir=$DSRA_REPOSITORY --columnsINI=DSRA_outputs2postgres.ini --eqScenario="$eqscenario"
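One possible migration path for add_data.sh is a drop-in wrapper that tries the release asset first and falls back to the existing Git LFS fetch. Everything below is a hypothetical sketch: the release tag, the asset-naming scheme, and fetch_csv_lfs (standing in for the script's current LFS-based helper) are all assumptions, not the actual implementation:

```shell
#!/bin/sh
# Sketch: try a pre-compressed release asset first, fall back to the
# existing Git LFS fetch. The release tag and asset naming scheme
# are hypothetical; fetch_csv_lfs stands in for the current helper.

RELEASE_TAG=${RELEASE_TAG:-v1.0.0}   # hypothetical default tag

fetch_csv_lfs() {
  # Placeholder for the existing LFS-based fetch in add_data.sh
  echo "falling back to Git LFS fetch for $1/$2" >&2
}

fetch_csv() {
  repo="$1" path="$2"
  asset="$(basename "$path").xz"
  url="https://github.com/OpenDRR/$repo/releases/download/$RELEASE_TAG/$asset"
  # -f makes curl fail (non-zero exit) on a 404, triggering the fallback
  if curl -fsSL -o "$asset" "$url"; then
    xz -d -f "$asset"    # leaves the plain CSV in the working directory
  else
    fetch_csv_lfs "$repo" "$path"
  fi
}

# Usage (hypothetical path): fetch_csv openquake-inputs exposure/file.csv
```

Keeping the fetch_csv name unchanged would let the rest of add_data.sh, and callers like DSRA_outputs2postgres_lfs.py's inputs, stay as they are during the transition.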

Open question: XZ or Zstd compression? (a trade-off between compressed file size and decompression speed)
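The trade-off could be measured directly on the real CSV files before deciding. A minimal sketch, using a small synthetic CSV as a stand-in (the usual rule of thumb, to be verified on the actual datasets, is that xz yields smaller files while zstd decompresses considerably faster):

```shell
#!/bin/sh
# Sketch: compare XZ vs Zstandard compression on a sample CSV.
# The synthetic data below is a stand-in; the comparison should be
# rerun on the real OpenDRR CSV files before choosing a format.

sample=sample.csv
# Generate a small synthetic CSV purely for the comparison
seq 1 10000 | awk '{print $1 ",name-" $1 "," $1 * 3}' > "$sample"

for tool in xz zstd; do
  if command -v "$tool" >/dev/null 2>&1; then
    case "$tool" in
      # -k keeps the original; output suffix defaults to .xz
      xz)   xz -9 -k -f "$sample" ;;
      # zstd keeps the original by default; -19 = near-maximum ratio
      zstd) zstd -19 -f -q "$sample" -o "$sample.zst" ;;
    esac
  else
    echo "$tool not installed; skipping" >&2
  fi
done

# Show the resulting sizes for whichever tools ran
ls -l "$sample"*
```

Timing decompression (e.g. with `time xz -dc` vs. `time zstd -dc` into /dev/null) on the largest real files would complete the picture.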
