create test or alert if rolling re-indexer isn't keeping up with demand #4700

ndushay · 2024-02-28T01:32:04Z

This ticket might belong in the DSA or infrastructure integration GitHub issues.

It seems useful to check this weekly(?), and we want to check the live production argo index.

This could be a cron job that sends a weekly email reporting the oldest timestamp in argo prod's Solr index and the current parallel workers value. Or maybe it runs weekly (with an HB alert if it doesn't run) and only emails when the oldest Solr document is more than ?? 2 weeks ?? old (the threshold should be a configuration setting).

You can determine the oldest timestamp for a document in the Argo index. The oldest document should not be older than _____ (a value to put in settings.yml, such as "one month" or "two weeks"?)
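A rough sketch of what the oldest-document check could look like, in case it helps the discussion. It assumes the `timestamp` field is retrievable via `fl` and sortable, and that the response is requested as JSON. `MAX_AGE`, `oldest_doc_query_url`, `oldest_timestamp`, and `is_stale` are hypothetical names, and the two-week value is only a placeholder for whatever ends up in settings.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumed values for illustration; the real threshold would come from settings.
SOLR_SELECT_URL = "http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select"
MAX_AGE = timedelta(weeks=2)


def oldest_doc_query_url(base_url=SOLR_SELECT_URL):
    """Build a query that returns only the least recently indexed document."""
    params = {
        "q": "*:*",
        "sort": "timestamp asc",  # oldest first
        "rows": 1,
        "fl": "id,timestamp",
        "wt": "json",
    }
    return f"{base_url}?{urlencode(params)}"


def oldest_timestamp(solr_response):
    """Pull the oldest timestamp out of a parsed Solr JSON response.

    Returns None when the index has no documents.
    """
    docs = solr_response["response"]["docs"]
    if not docs:
        return None
    # Solr dates end in "Z"; swap it for an explicit UTC offset before parsing.
    raw = docs[0]["timestamp"]
    return datetime.fromisoformat(raw.replace("Z", "+00:00"))


def is_stale(oldest, now=None, max_age=MAX_AGE):
    """True when the oldest document is older than the allowed age."""
    now = now or datetime.now(timezone.utc)
    return now - oldest > max_age
```

A cron job or health check would fetch the URL, parse the JSON body, and email or alert only when `is_stale` returns true.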

Using a query such as the one below, you can calculate approximately how long it takes to reindex all of SDR:

http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select?q=*:*&facet.range=timestamp&f.timestamp.facet.range.start=NOW%2FDAY-90DAYS&f.timestamp.facet.range.end=NOW&f.timestamp.facet.range.gap=%2B1DAY&rows=0&facet.field=timestamp&wt=xml&f.timestamp.sort=index

Currently, with "parallel" workers set to 3 in dor-services-app, it takes about 5-6 days.

If there are 2 workers ...

If there is 1 worker ...
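One possible way to turn the facet output into a number, sketched under some assumptions: the query is run with `wt=json` instead of `wt=xml`, the range counts come back in Solr's default flat list form (bucket start, count, bucket start, count, ...), and the rolling re-indexer accounts for most of the daily counts. `daily_counts` and `estimate_reindex_days` are hypothetical names. The estimate is the total document count divided by the average count on days that saw any indexing, so partial days at either end of the window will skew it somewhat.

```python
def daily_counts(solr_response, field="timestamp"):
    """Map each one-day bucket start to the number of docs last indexed that day."""
    flat = solr_response["facet_counts"]["facet_ranges"][field]["counts"]
    # The flat list alternates bucket start and count.
    return dict(zip(flat[0::2], flat[1::2]))


def estimate_reindex_days(solr_response, field="timestamp"):
    """Rough number of days needed to touch every document once.

    Returns None when no documents were indexed inside the facet window.
    """
    total = solr_response["response"]["numFound"]
    active = [count for count in daily_counts(solr_response, field).values() if count > 0]
    if not active:
        return None
    docs_per_day = sum(active) / len(active)
    return total / docs_per_day
```

Running this once per parallel workers setting would fill in the per-worker estimates without working them out by hand.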

jmartin-sul · 2024-02-28T19:35:27Z

Seems like another option might be an okcomputer check, if the crux of the check is something like "was the least recently indexed Solr doc indexed more than threshold weeks ago?" We used to have a similar check in pres cat's okcomputer to make sure that everything was being audited in a timely fashion, but that's no longer in the pres cat okcomputer checks. It appears we removed it for DB performance reasons, even though there is/was an index on the date field that check queried. Not sure if Solr might have similar performance issues for that sort of query, in which case maybe the okcomputer approach is a non-starter for this ticket also.

If we do go the cron job route, and want to HB alert when the cron fails to run, here's an explicit reminder that we can use Honeybadger's checkin feature for that (probably intended/implied by the description, but saying just in case).
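For reference, a minimal sketch of the check-in pattern: run the job, then ping Honeybadger only if the job finished without raising, so a missed or failed run shows up as a missing check-in. It assumes the check-in URL takes the form `https://api.honeybadger.io/v1/check_in/<id>`. `run_with_check_in` and the `ping` parameter are hypothetical, and the real ID would come from the Honeybadger project's check-in settings.

```python
import urllib.request

# Assumed base URL for Honeybadger check-ins; the ID is project-specific.
CHECK_IN_BASE = "https://api.honeybadger.io/v1/check_in"


def check_in_url(check_in_id):
    """Build the URL to ping for a given check-in ID."""
    return f"{CHECK_IN_BASE}/{check_in_id}"


def run_with_check_in(job, check_in_id, ping=None):
    """Run job(), then report a successful run to Honeybadger.

    If job() raises, no ping is sent, so Honeybadger flags the missing check-in.
    The ping callable is injectable so the function can be exercised offline.
    """
    if ping is None:
        ping = lambda url: urllib.request.urlopen(url, timeout=10).close()
    result = job()
    ping(check_in_url(check_in_id))
    return result
```

The injected `ping` is only there to keep the sketch testable without network access; a real cron script could call the URL directly.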

ndushay changed the title ~~test that rolling re-indexer is keeping up with demand~~ create test or alert if rolling re-indexer isn't keeping up with demand Feb 29, 2024
