backfill prod GCP endpoint #2425

@jmartin-sul

blocked by #2424

same process as #2415; useful ActiveRecord statements copied from that issue:

# get a list of batch_size random druids that are not yet replicated to GCP:
batch_size = 100 # you probably want to 10x or 100x this eventually (and do it in a screen session if so)

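# note: the nested where key is the table name (zip_endpoints), matching the column reference in the sanity check below
druids = PreservedObject.where.not(id: PreservedObject.joins(zipped_moab_versions: [:zip_endpoint]).where(zip_endpoints: { endpoint_name: 'gcp_s3_south_1' })).limit(batch_size).pluck(:druid)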

# sanity check: list the endpoints that have your druids
PreservedObject.joins(zipped_moab_versions: [:zip_endpoint]).where(druid: druids).group(:druid).pluck('druid', 'ARRAY_AGG(zip_endpoints.endpoint_name)')

# ship em 🚢 
PreservedObject.where(druid: druids).find_each(&:create_zipped_moab_versions!)
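
If the batch size gets bumped up and the statements above are run repeatedly, the fetch / sanity check / ship cycle could be wrapped in a small loop. The following is only a sketch: `backfill_in_batches` and its keyword arguments are hypothetical names, not part of the codebase, and it assumes that `create_zipped_moab_versions!` records the new zipped moab versions before returning, so that the next fetch does not hand back the same druids. The three callables stand in for the three statements above.

```ruby
# Hypothetical helper (not part of the app): repeats the fetch / sanity check /
# ship cycle until no druids are left or max_batches has been reached.
#
# target_endpoint: endpoint name being backfilled, e.g. 'gcp_s3_south_1'
# fetch_batch:     callable returning an Array of druids not yet on the endpoint
# endpoints_for:   callable taking druids, returning [[druid, [endpoint_name, ...]], ...]
# ship:            callable taking the Array of druids to replicate
def backfill_in_batches(target_endpoint:, fetch_batch:, endpoints_for:, ship:, max_batches: 10)
  shipped = []
  max_batches.times do
    druids = fetch_batch.call
    break if druids.empty?

    # Sanity check: skip any druid that already lists the target endpoint.
    already_there = endpoints_for.call(druids)
                                 .select { |_druid, endpoints| endpoints.include?(target_endpoint) }
                                 .map(&:first)
    to_ship = druids - already_there
    # Stop rather than spin if the fetch keeps returning replicated druids.
    break if to_ship.empty?

    ship.call(to_ship)
    shipped.concat(to_ship)
  end
  shipped
end

# In a Rails console the callables would wrap the statements above, e.g.
#   ship: ->(druids) { PreservedObject.where(druid: druids).find_each(&:create_zipped_moab_versions!) }
```

Keeping `max_batches` as a hard cap means a query mistake cannot turn into an unbounded loop in a long-running screen session.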

resources for query tweaking:
