feat: containers to run custom STAC API and STAC ingester services #206

Merged
merged 8 commits into from
Jul 3, 2025

Contributor

@ceholden ceholden commented Jul 2, 2025

What type of PR is this? (check all applicable)

  • 🍕 Feature

Related Issue

Both of these issues would be closed pending incorporation of these containers into our K8s infrastructure in https://github.com/hotosm/k8s-infra

Describe this PR

This PR adds two containers for STAC-related services that we'd like to run in our K8s cluster. Both containers are relatively small, as the majority of the code exists in other components:

  • backend/stac-api is a customized version of the STAC FastAPI application. I modified it from the stac-fastapi-pgstac==5.0.2 release to avoid enabling the "transaction" extension that allows create/update/delete of STAC records. If we want to enable this in the future, the main.py in the container could be modified to include authorization so that only select people can modify our STAC records.
    • Once published, we would update our eoapi.yaml at stac.image.name to point to this container
  • backend/stac-ingester is a container to house the CLI from stactools-hotosm
    • Once published, we would want to run this as a cron-scheduled service
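
To illustrate the read-only idea behind backend/stac-api, here is a minimal sketch in plain Python. It is not the actual main.py and does not use the stac-fastapi API: the registry, the placeholder class names, and the `select_extensions` helper are all hypothetical stand-ins for "build the list of extensions, leaving out the one that writes".

```python
# Hypothetical registry of extension name -> extension object.
# The string values are placeholders, not real stac-fastapi classes.
AVAILABLE_EXTENSIONS = {
    "query": "QueryExtension",
    "sort": "SortExtension",
    "fields": "FieldsExtension",
    "filter": "FilterExtension",
    "transaction": "TransactionExtension",
}

# Extensions that allow create/update/delete and stay off for a read-only API.
WRITE_EXTENSIONS = frozenset({"transaction"})


def select_extensions(available, disabled=WRITE_EXTENSIONS):
    """Return only the extensions whose name is not in `disabled`."""
    return {name: ext for name, ext in available.items() if name not in disabled}


enabled = select_extensions(AVAILABLE_EXTENSIONS)
print(sorted(enabled))
```

If authorization were added later, the same helper could be called with an empty `disabled` set and the write endpoints protected separately.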

I tried using the container publishing step from hotosm/gh-workflows but would love any guidance for what we'd like to do for this part!

Screenshots

N/A

Alternative Approaches Considered

I tended towards "duplicate" versus "create framework for future" with regard to things like Python .gitignores and GitHub Actions, since it seemed easier to combine as needed and I don't have a great idea about preferences from the team. Happy to move things around! This is all pretty portable...

I could also see us wanting to consolidate the docker-compose.yml setup, at least for the backend services. They both depend on PgSTAC so that might be nice to share and we could see the ingested records reflected in the STAC API service.

Review Guide

To poke at this locally I would:

  1. For the STAC API, you should be able to get the service up with a simple docker compose up and navigate to http://0.0.0.0:8082/api.html (see also the README for it).
  2. For the STAC ingester, I was able to ETL records after the following steps:
    i. Create the openaerialmap STAC Collection. This could be done via the pypgstac CLI, but I just opened psql and ran SELECT * FROM create_collection('{"id": "openaerialmap"}'::jsonb)
    ii. Run the ingester: hotosm sync-oam --uploaded-after 2025-07-01
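
The two ingester steps above could be scripted roughly as follows. This is only a sketch: the helper names `create_collection_sql` and `sync_oam_argv` are hypothetical, and the code just builds the SQL statement and the command line rather than connecting to PgSTAC or invoking the CLI.

```python
import json


def create_collection_sql(collection_id: str) -> str:
    """Build the SQL that creates a minimal STAC Collection in PgSTAC."""
    payload = json.dumps({"id": collection_id})
    # Double any single quotes so the JSON is safe inside a SQL string literal.
    escaped = payload.replace("'", "''")
    return f"SELECT * FROM create_collection('{escaped}'::jsonb)"


def sync_oam_argv(uploaded_after: str) -> list[str]:
    """Build the argument list for the ingester CLI call."""
    return ["hotosm", "sync-oam", "--uploaded-after", uploaded_after]


sql = create_collection_sql("openaerialmap")
argv = sync_oam_argv("2025-07-01")
print(sql)
print(" ".join(argv))
```

The SQL string can be pasted into psql, and the argument list can be passed to `subprocess.run(argv, check=True)` once the ingester container or environment is available.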

Checklist before requesting a review


Member

@spwoodcock spwoodcock Jul 2, 2025

Instead of bundling all the code for stac-fastapi-pgstac here (it's difficult to know what is modified and what isn't), we could do something like:

  • Use the official image based on the main branch: ghcr.io/stac-utils/stac-fastapi-pgstac:main, including the disabled transaction extension by default (not ideal, as it's not a pinned image).
  • Probably better: grab the current ghcr.io/stac-utils/stac-fastapi-pgstac:main image, tag as ghcr.io/hotosm/stac-fastapi-pgstac:d33b0f3d1d509e16fe0dfc765dc47aea6613aac8 (based on the exact commit used), and use that image here?

I think that would make it clearer what the upgrade pathway is in future (hopefully just swapping to the official image at ghcr.io/stac-utils/stac-fastapi-pgstac:5.0.3).
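
As a rough sketch of the "probably better" option, the re-tagging could look like the following. The `mirror_commands` helper is hypothetical; it only assembles the standard `docker pull`, `docker tag`, and `docker push` invocations and prints them, and it assumes you are already logged in to ghcr.io with push rights to the target repository.

```python
def mirror_commands(source: str, target_repo: str, commit_sha: str) -> list[list[str]]:
    """Return the docker commands that re-tag `source` under a commit-pinned tag."""
    target = f"{target_repo}:{commit_sha}"
    return [
        ["docker", "pull", source],
        ["docker", "tag", source, target],
        ["docker", "push", target],
    ]


commands = mirror_commands(
    source="ghcr.io/stac-utils/stac-fastapi-pgstac:main",
    target_repo="ghcr.io/hotosm/stac-fastapi-pgstac",
    commit_sha="d33b0f3d1d509e16fe0dfc765dc47aea6613aac8",
)
for argv in commands:
    print(" ".join(argv))
```

Pinning the tag to the commit keeps the deployed image reproducible even though the upstream `main` tag keeps moving.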

Contributor Author

I think that's a potential pathway, and we might be able to get an older patch release (looks like most of the new commits are preparing for a v6 major release).

I had some thoughts queued up while you were taking a look (🙇) about reasons to set this up on our own, #206 (comment)

It's definitely a little hard to track what the full "app" setup should do, but every project I've worked on for this has eventually had to add customizations that require setting up the FastAPI app

Contributor Author

🧠 Happy to lift this up to a parent scope ([repo]/.gitignore or [repo]/backend/.gitignore). It's GitHub's default Python.gitignore, but I wasn't sure if it would interfere with files we don't want to ignore from the frontend part of this repo.

Contributor Author

This is based on the Dockerfile from stac-fastapi-pgstac remixed to use uv

Contributor Author

This is based on the app.py from stac-fastapi-pgstac which exists as a sort of "batteries included" minimum viable API setup.

By default, eoapi-k8s and eoapi-cdk just use the container from stac-fastapi-pgstac, but in practice each project usually needs its own copy to fine-tune behavior. Here we're disabling the "transaction" endpoints that allow create/update/delete of STAC records, but in the future we might also want to configure things like authorization or observability. For example, this is the "FastAPI app" for the NASA VEDA project's STAC API

Contributor Author

If we want to unify everything from https://github.com/hotosm/stactools-hotosm into this monorepo this would be a pretty obvious place. That would remove 1 layer of nesting from k8s-infra(OpenAerialMap(stactools-hotosm)) to k8s-infra(OpenAerialMap)

Figuring out how to publish the STAC extension seems like the most fiddly part of moving this code, and once that's done we'd want to update the hard coded link to the extension URL in the STAC Item creation code.

Member

I think we should probably consider doing that if possible 🙏

Member

@spwoodcock spwoodcock left a comment

Looks good!

Just a minor question about the bundled stac-fastapi-pgstac code!

Contributor Author

ceholden commented Jul 3, 2025

Hey @spwoodcock, I pushed one update to remove an unnecessary envvar in the two GitHub Actions workflows that you pointed out. Otherwise I think this is good to merge whenever you're ready. I'll keep an eye on it to see how the container deploys go 🤞 so we can integrate them into k8s-infra

@spwoodcock spwoodcock merged commit bb19f85 into hotosm:main Jul 3, 2025
4 checks passed
@ceholden ceholden mentioned this pull request Jul 3, 2025
1 task
This was referenced Jul 12, 2025