Some KG files are provided with checksums, some not.
How should checksums be handled for KGs that do not provide them?
Options:
- compute for small files (<100 MB) only -> can be implemented in a GitHub Action
- compute for medium-size files (<5 GB) -> can be implemented in a GitHub Action, but may take a few minutes (1-2 min)
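Either option comes down to a size check followed by a chunked hash. A minimal sketch, assuming SHA-256 as the algorithm (the notes do not name one); `MAX_BYTES` and `sha256_if_small` are hypothetical names used only for illustration:

```python
import hashlib
from pathlib import Path

# Assumed threshold mirroring the "medium size" option (<5 GB); adjust as needed.
MAX_BYTES = 5 * 1024**3


def sha256_if_small(path, max_bytes=MAX_BYTES, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of `path`, or None if the file is too large."""
    path = Path(path)
    # Skip files at or above the size limit instead of hashing them.
    if path.stat().st_size >= max_bytes:
        return None
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in fixed-size chunks so memory use stays constant.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Switching between the two options would then only mean changing `max_bytes`.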
Limitations:
- GitHub runner storage - 14-16 GB -> max file size 10-15 GB; for larger files, use chunking
- job time limits - max 6 hrs
- number of files -> unlimited, but delete after processing (download, compute, remove)
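The download, compute, remove cycle could look roughly like the sketch below, which handles one file at a time so that disk usage never exceeds the largest single file. It assumes SHA-256 and plain URLs that `urllib` can open; `checksum_all` and its helpers are hypothetical names, not part of any existing tooling.

```python
import hashlib
import os
import shutil
import tempfile
import urllib.request

CHUNK = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def download(url, dest):
    """Stream `url` to `dest` without loading the whole file into memory."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out, CHUNK)


def sha256_file(path):
    """Hash a local file in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(CHUNK), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_all(urls):
    """Download, hash, and delete each file in turn; return {url: hex digest}."""
    results = {}
    for url in urls:
        fd, tmp_path = tempfile.mkstemp()
        os.close(fd)
        try:
            download(url, tmp_path)
            results[url] = sha256_file(tmp_path)
        finally:
            # Always free the runner's disk before moving to the next file.
            os.remove(tmp_path)
    return results
```

For files larger than the runner's disk, the same digest could in principle be updated directly from the response stream without writing to disk at all; that variant is not shown here.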