ESA Copernicus DEM 30 meter dataset support.#574

Merged
jpswinski merged 4 commits into main from cop30 on Feb 5, 2026

@elidwa
Contributor

@elidwa elidwa commented Feb 4, 2026

This is the first SlideRule-supported dataset that is not hosted on AWS, although its host implements the S3 protocol. The data are hosted at the San Diego Supercomputer Center (SDSC) on an S3-compatible object storage system. SlideRule accesses the dataset via GDAL using the /vsis3/ virtual filesystem, which allows transparent access to non-AWS S3 endpoints.
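As a rough illustration of the access path described above, here is a minimal Python sketch of the settings GDAL would need to reach a non-AWS S3 endpoint through /vsis3/. `AWS_S3_ENDPOINT`, `AWS_VIRTUAL_HOSTING`, and `AWS_NO_SIGN_REQUEST` are GDAL configuration options; the helper name `vsis3_settings`, the bucket, the object key, and the assumption that the bucket allows anonymous reads are all hypothetical and not taken from this PR.

```python
def vsis3_settings(endpoint, bucket, key, anonymous=True):
    """Return a (/vsis3/ path, GDAL config options) pair for an S3-compatible host.

    Hypothetical helper: SlideRule's actual implementation may differ.
    """
    options = {
        # Point GDAL at the non-AWS host instead of the AWS default.
        "AWS_S3_ENDPOINT": endpoint,
        # Many S3-compatible stores expect path-style URLs (host/bucket/key).
        "AWS_VIRTUAL_HOSTING": "FALSE",
    }
    if anonymous:
        # Assumption: the bucket is publicly readable, so requests are unsigned.
        options["AWS_NO_SIGN_REQUEST"] = "YES"
    return f"/vsis3/{bucket}/{key}", options


# Placeholder bucket and key, for illustration only.
path, options = vsis3_settings(
    "opentopography.s3.sdsc.edu", "example-bucket", "example/tile.tif"
)
print(path)
for name, value in sorted(options.items()):
    print(f"{name}={value}")
```

With the GDAL Python bindings installed, each option could be applied with `gdal.SetConfigOption(name, value)` before calling `gdal.Open(path)`. The sketch stops short of that so it runs without GDAL.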

Performance note
A benchmark was run using 26k pseudorandom sample points over the Grand Mesa region against two raster datasets hosted on different infrastructures and in different regions. esa-worldcover-10meter, hosted on AWS eu-central-1 (Frankfurt, Germany), completed in 2.64 seconds. In contrast, esa-copernicus-30meter, hosted at SDSC on an S3-compatible but non-AWS object store, completed in 12.27 seconds. This corresponds to a ~4.6× slowdown, or an increase of approximately 364% in execution time, even though the AWS-hosted dataset is served from Europe. The result suggests a significant performance cost for non-AWS S3 endpoints and cross-infrastructure access, even though both datasets are accessed via GDAL /vsis3/.

| Dataset | Resolution | Hosting / Region | Exec Time (s) | Relative to WorldCover |
| --- | --- | --- | --- | --- |
| esa-worldcover-10meter | 10 m | AWS eu-central-1 (Frankfurt, DE) | 2.641911 | 1.00× |
| esa-copernicus-30meter | 30 m | SDSC S3-compatible (non-AWS) | 12.268738 | 4.64× |
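
The relative figures can be rechecked from the two reported execution times. The short Python sketch below only redoes the arithmetic on the numbers in the table; the variable names are illustrative.

```python
# Execution times in seconds, copied from the benchmark table.
worldcover_s = 2.641911   # esa-worldcover-10meter (AWS eu-central-1)
copernicus_s = 12.268738  # esa-copernicus-30meter (SDSC, non-AWS)

# Slowdown factor: how many times longer the SDSC-hosted run took.
ratio = copernicus_s / worldcover_s

# Percent increase in execution time relative to the WorldCover run.
percent_increase = (ratio - 1.0) * 100.0

print(f"slowdown: {ratio:.2f}x")
print(f"increase: {percent_increase:.1f}%")
```

The two runs also differ in dataset and resolution, so the ratio describes this benchmark only and does not isolate the effect of the endpoint.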

@elidwa elidwa requested a review from jpswinski February 4, 2026 18:23
@jpswinski
Member

@elidwa What would be the downside to using the "endpoint" field in the asset directory instead of creating a new field called "aws_s3_endpoint"?

@elidwa
Contributor Author

elidwa commented Feb 4, 2026

Using endpoint for S3 hosts would work technically, but the downside is semantic collision.

  • In earth_data_query.lua, endpoint already means AMS/API endpoint (for example "3dep" or "atl13"), and is used in:
    • dataset["endpoint"]
    • core.ams("POST", dataset["endpoint"], ...)
  • If endpoint is also used for object-store hosts (for example opentopography.s3.sdsc.edu), one field represents two different concepts:
    • control-plane API routing
    • data-plane S3-compatible storage endpoint
  • That makes behavior less predictable and increases risk of subtle bugs when assets are added or refactored.
  • It also makes debugging harder, because endpoint no longer has a single meaning.
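
The separation argued for above can be sketched with a small Python model in which each field keeps a single meaning. This is an illustration, not SlideRule's asset directory code (which is Lua-based): the `AssetEntry` class, the helper functions, and the `3dep-example` asset name are hypothetical. The `s3.amazonaws.com` fallback mirrors GDAL's default S3 endpoint.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AssetEntry:
    """Hypothetical asset record with one field per concept."""
    name: str
    endpoint: Optional[str] = None         # control plane: AMS/API route
    aws_s3_endpoint: Optional[str] = None  # data plane: S3-compatible host


def api_route(asset: AssetEntry) -> str:
    """Return the AMS/API route, never a storage host."""
    if asset.endpoint is None:
        raise ValueError(f"{asset.name} has no API endpoint")
    return asset.endpoint


def storage_host(asset: AssetEntry, default: str = "s3.amazonaws.com") -> str:
    """Return the object-store host, falling back to the AWS default."""
    return asset.aws_s3_endpoint or default


cop30 = AssetEntry("esa-copernicus-30meter",
                   aws_s3_endpoint="opentopography.s3.sdsc.edu")
dem = AssetEntry("3dep-example", endpoint="3dep")

print(storage_host(cop30))  # SDSC host configured on the asset
print(storage_host(dem))    # AWS default, since no override is set
print(api_route(dem))       # API route, unaffected by storage settings
```

Because the two lookups read different fields, adding a storage host to an asset cannot change where API requests are routed, and an asset with only a storage host fails loudly if it is used for an API call.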

@jpswinski jpswinski merged commit 388325b into main Feb 5, 2026
1 check passed