-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/move fixtures to migrations #73
Feature/move fixtures to migrations #73
Conversation
mwestphall
commented
Jul 25, 2024
- Create a database migration for importing the baseline schema from the most recent prod dump
- Move all "fixture" database migrations into the migrations cli subsystem system
- Give the ability to specify whether a migration depends on one or more other migrations having completed
- Pull up a lot of the logic from individual subclasses of Migration into the main Migration class to allow for more concise definition of new migrations
- Add a docker-compose for spinning up a local postgis instance with appropriate extensions installed
…able to apply or has already applied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! A few changes to take a look at, but it looks nearly ready to merge.
CAN_APPLY = "can_apply" | ||
|
||
# The postconditions for this migration are met, so it doesn't need to be applied | ||
APPLIED = "applied" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would there be any value in having a state UNNECESSARY
which would mean that the postconditions are met but the preconditions are not? This could be valuable for migrations that are to recover from invalid/legacy database states of some sort.
Put another way, is there value to having the difference between CANT_APPLY and not APPLIED
(needs running but can't be right now) and CANT_APPLY and APPLIED
(does not need running since database is already in a valid state)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the current implementation of checking for whether a migration is applied is sophisticated enough to handle that use case, since it just checks the current state of tables in the database and assigns one of the 3 labels based on that state. We might be able to get at this by looking further down the dependency graph, eg. checking whether any dependents of a CANT_APPLY
migration are in an APPLIED
state.
MacrostratCoreMigration, | ||
] | ||
# Find all subclasses of Migration among imported modules | ||
migrations = Migration.__subclasses__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am into this but maybe it's a bit too implicit? Should there be a register_migration
method instead?
- this could be wrapped into a
MigrationsManager
class - this might allow us to break up migrations by subsystem, if desired
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the DROP SCHEMA macrostrat
command attempted in the MacrostratCoreMigration, I think it'll be necessary to disable that migration for now at least on the main
branch. Don't want to have data-destroying migrations quite yet until we're satisfied that the MariaDB -> Postgres migration into the Macrostrat schema is functional. Then I think we can rewrite that baseline migration quite substantially.
for cls in migrations: | ||
# Initialize migration | ||
_migration = cls() | ||
for _migration in instances: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One pattern we could eventually add here, which I have successfully used in the past, is to have a "dry run" migration automatically run first, based on a schema-only clone of the database.
if db is not None: | ||
db.session.flush() | ||
db.session.close() | ||
db = Database(PG_DATABASE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should add this method to the macrostrat.database
module.
depends_on = ['map-source-slug'] | ||
|
||
# Confirm that the tables created by the API v3 migrations are present | ||
postconditions = [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Postcondition checks are a fantastic addition. Looks like you went for higher-order functions simply to enhance composability, yeah?
|
||
# Confirm that the tables created by the API v3 migrations are present | ||
postconditions = [ | ||
exists("storage","object_group","object"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
might want to use a pattern like tables_exist("object_group", "object", schema="storage")
for enhanced readability.
|
||
from psycopg2.sql import Identifier | ||
|
||
MATCHES_SLUG_SQL = """ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't look used here?
docker-compose.yaml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this could be merged with local-root/docker-compose.yaml
? This is what I use for a local copy of Macrostrat currently. Could perhaps rename this directory for clarity.
db-local.Dockerfile
Outdated
@@ -0,0 +1,5 @@ | |||
FROM postgis/postgis:15-3.4 | |||
RUN apt-get update && apt-get install -y --no-install-recommends \ | |||
postgresql-$PG_MAJOR-pgaudit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! I've been meaning to get pgaudit working in my local environment.
I've used imresamu/postgis:15-3.4
while postgis/docker-postgis#216 is open, to help those of use on ARM machines.
MacrostratCoreMigration, | ||
] | ||
# Find all subclasses of Migration among imported modules | ||
migrations = Migration.__subclasses__() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the DROP SCHEMA macrostrat
command attempted in the MacrostratCoreMigration, I think it'll be necessary to disable that migration for now at least on the main
branch. Don't want to have data-destroying migrations quite yet until we're satisfied that the MariaDB -> Postgres migration into the Macrostrat schema is functional. Then I think we can rewrite that baseline migration quite substantially.
8dcccc2
to
2af13e5
Compare