Replicate GOV.UK databases directly instead of via backups #557

Open
nacnudus opened this issue Nov 16, 2023 · 0 comments

Trello

Currently we get the data via https://github.com/alphagov/govuk-s3-mirror, which clones nightly backups of GOV.UK databases from GOV.UK's integration environment to a bucket. We restore each backup file to a running instance of the database, extract what we need, and also copy the original tables into BigQuery. This is a delayed and fragile batch process that runs from a non-production environment. Direct replication would be:

  • From a production environment
  • More reliable, presumably
  • Less delayed
  • Streamed, not batch
  • More expensive, because we would have to pay for a constantly running Cloud SQL instance instead of Compute Engine instances that run for ~1h/day
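
To put a rough number on the cost point, here is a minimal sketch of the running-hours arithmetic. It assumes a 30-day month and takes the hourly rate as a parameter, because no actual Cloud SQL or Compute Engine prices are given here; `monthly_hours` and `monthly_cost` are hypothetical helper names used only for illustration.

```python
# Rough running-hours comparison between the nightly batch job and an
# always-on replica. The 30-day month and any hourly rate passed in are
# assumptions for illustration, not real GCP prices.

def monthly_hours(hours_per_day: float, days: int = 30) -> float:
    """Total hours an instance runs in a month."""
    return hours_per_day * days


def monthly_cost(hourly_rate: float, hours_per_day: float, days: int = 30) -> float:
    """Monthly cost for an instance billed at a flat hourly rate."""
    return hourly_rate * monthly_hours(hours_per_day, days)


batch_hours = monthly_hours(1)       # ~1h/day Compute Engine instance
streaming_hours = monthly_hours(24)  # constantly running Cloud SQL instance
ratio = streaming_hours / batch_hours

print(f"batch: {batch_hours:.0f} h, streaming: {streaming_hours:.0f} h, ratio: {ratio:.0f}x")
```

At equal hourly rates the always-on instance would run 24 times as many hours. The real difference depends on the instance types chosen and on how Cloud SQL is priced relative to Compute Engine, which this sketch does not model.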

https://cloud.google.com/database-migration/docs/postgres/configure-source-database
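
As a starting point for checking whether a source database is ready, here is a small sketch that inspects PostgreSQL settings. It assumes, based on my reading of the guide linked above, that the source needs `wal_level` set to `logical` and `pglogical` listed in `shared_preload_libraries`; the guide remains the authority and lists further requirements. The function name `missing_prerequisites` and the plain dict of settings (for example collected from `SHOW` queries) are hypothetical.

```python
# Sketch: flag PostgreSQL settings that would likely block logical replication.
# The two checks below are assumptions taken from the linked guide, and they
# are not a complete list of its requirements.

def missing_prerequisites(settings: dict[str, str]) -> list[str]:
    """Return human-readable problems found in the given settings."""
    problems = []

    # Logical decoding requires wal_level = logical.
    if settings.get("wal_level") != "logical":
        problems.append("wal_level must be 'logical'")

    # shared_preload_libraries is a comma-separated list of library names.
    preload = [
        name.strip()
        for name in settings.get("shared_preload_libraries", "").split(",")
    ]
    if "pglogical" not in preload:
        problems.append("shared_preload_libraries must include 'pglogical'")

    return problems


example = {"wal_level": "replica", "shared_preload_libraries": "pg_stat_statements"}
for problem in missing_prerequisites(example):
    print(problem)
```

Changing either setting requires a database restart, so this would need coordinating with whoever runs the production databases.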
