Seems like another option might be an okcomputer check, if the crux of the check is something like "was the least recently indexed Solr doc indexed more than threshold weeks ago?" We used to have a similar check in pres cat's okcomputer to make sure that everything was being audited in a timely fashion, but that's no longer in the pres cat okcomputer checks. It appears we removed it for DB performance reasons, even though there is/was an index on the date field that check queried. Not sure if Solr might have similar performance issues for that sort of query, in which case maybe the okcomputer approach is a non-starter for this ticket also.
If we do go the cron job route, and want to HB alert when the cron fails to run, here's an explicit reminder that we can use Honeybadger's checkin feature for that (probably intended/implied by the description, but saying just in case).
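A rough Python sketch of the check-in idea, in case it helps: ping Honeybadger only after the job finishes without raising, so a missed or failed run shows up as a missing check-in. The check-in ID is a hypothetical placeholder, the function names are made up for illustration, and the URL pattern is what I understand Honeybadger's check-in endpoint to be, so confirm it against the project's check-in settings before relying on it.

```python
from urllib.request import urlopen

# Assumption: Honeybadger check-ins are reported with a GET to this endpoint.
CHECK_IN_BASE = "https://api.honeybadger.io/v1/check_in"


def check_in_url(check_in_id):
    """Build the check-in URL for a given (hypothetical) check-in ID."""
    return f"{CHECK_IN_BASE}/{check_in_id}"


def run_with_check_in(job, check_in_id, ping=None):
    """Run job() and report a check-in only if it returns without raising."""
    # The default ping does a plain HTTP GET; tests can pass a stub instead.
    ping = ping or (lambda url: urlopen(url, timeout=10).close())
    result = job()  # an exception here propagates and skips the ping
    ping(check_in_url(check_in_id))
    return result
```

If the job raises, no check-in is sent, and Honeybadger would alert once the expected interval passes.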
ndushay changed the title from "test that rolling re-indexer is keeping up with demand" to "create test or alert if rolling re-indexer isn't keeping up with demand" on Feb 29, 2024
This ticket might belong in the DSA or infrastructure integration github issues.
It seems useful to check this weekly(?), and we want to check the live production argo index.
This could be a cron job that perhaps sends a weekly email indicating what the oldest timestamp in argo prod's Solr index is and what the parallel workers value is. Or maybe it runs weekly (with an HB alert if it doesn't) and only emails when the oldest Solr document is more than ?? 2 weeks ?? old (this should be a configuration setting).
You can determine the oldest timestamp for a document in the Argo index. The oldest document should not be older than _____ (something to be in settings.yml as "one month" or "two weeks" or something?)
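One possible shape for that check, as a minimal Python sketch. It assumes the `timestamp` field can be sorted on and that Solr is asked for JSON (`wt=json`). The Solr host and core are the ones used in the facet query in this ticket; `max_age_days` stands in for the yet-to-be-decided settings.yml value, and all function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Assumption: same Solr core as the facet query in this ticket.
SOLR_SELECT = "http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select"


def oldest_doc_url(base=SOLR_SELECT):
    """Build a query returning only the least recently indexed document."""
    params = {"q": "*:*", "sort": "timestamp asc", "rows": 1,
              "fl": "id,timestamp", "wt": "json"}
    return f"{base}?{urlencode(params)}"


def oldest_timestamp(solr_response):
    """Pull the oldest timestamp out of a parsed Solr JSON response."""
    docs = solr_response["response"]["docs"]
    if not docs:
        return None
    # Solr dates are UTC, e.g. 2024-02-01T08:15:30.123Z; drop any fraction.
    raw = docs[0]["timestamp"].rstrip("Z").split(".")[0]
    parsed = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_stale(oldest, max_age_days, now=None):
    """True if the oldest document is older than the configured threshold."""
    now = now or datetime.now(timezone.utc)
    return now - oldest > timedelta(days=max_age_days)
```

A cron job could fetch `oldest_doc_url()`, parse the JSON body, and email only when `is_stale(...)` is true.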
Using a query such as the one below, you can calculate approximately how long it will take to reindex all of the SDR:
http://sul-solr-prod-h.stanford.edu/solr/argo_prod/select?q=*:*&facet.range=timestamp&f.timestamp.facet.range.start=NOW%2FDAY-90DAYS&f.timestamp.facet.range.end=NOW&f.timestamp.facet.range.gap=%2B1DAY&rows=0&facet.field=timestamp&wt=xml&f.timestamp.sort=index
Currently, with "parallel" workers set to 3 in dor-services-app, it takes about 5-6 days.
If there are 2 workers ...
If there is 1 worker ...
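To turn the facet query's per-day counts into a duration estimate, something like the following Python sketch might work. It assumes the query is run with `wt=json` (the URL in this ticket uses `wt=xml`) and Solr's default flat list layout for range facet counts. The estimate is crude: it divides the total document count by the average number of documents indexed per active day, so partial days and non-rolling updates will skew it. Function names are hypothetical.

```python
def daily_counts(solr_response, field="timestamp"):
    """Map each day bucket to the number of docs last indexed that day."""
    flat = solr_response["facet_counts"]["facet_ranges"][field]["counts"]
    # Assumption: flat list alternating bucket start and count.
    return dict(zip(flat[0::2], flat[1::2]))


def estimate_reindex_days(counts, total_docs):
    """Estimate days for a full pass: total docs / mean docs per active day."""
    active = [n for n in counts.values() if n > 0]
    if not active:
        return None  # nothing indexed in the faceted window
    per_day = sum(active) / len(active)
    return total_docs / per_day
```

`total_docs` could come from `numFound` in the same response. Re-running this after changing the parallel workers setting would give comparable numbers for the 2-worker and 1-worker cases.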