Add Job Performance Metrics #3596
The average job runtime is trending upward, and it seems to reset with every production deployment:
SELECT DATE(created_at), EXTRACT(EPOCH FROM AVG(delivered_at - created_at)) AS avg_seconds
FROM core_job
WHERE created_at >= '2023-01-01'
GROUP BY DATE(created_at)
HAVING EXTRACT(EPOCH FROM AVG(delivered_at - created_at)) < 700 -- exclude anomalies
ORDER BY DATE(created_at);
Another version of the query above, looking at the p75 and p95 durations instead of the average:
WITH query_durations AS (
    SELECT DATE(created_at) AS date, EXTRACT(EPOCH FROM delivered_at - created_at) AS duration
    FROM core_job
    WHERE created_at >= '2023-01-01'
)
SELECT date,
    PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY duration) AS p75,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY duration) AS p95
FROM query_durations
WHERE date NOT IN (
    -- Exclude dates with exceptional values to better see the trend
    '2023-01-16',
    '2023-02-16',
    '2023-02-20',
    '2023-02-22')
GROUP BY date
ORDER BY date;
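For anyone comparing these numbers against values computed outside the database, here is a minimal Python sketch of the linear interpolation that PERCENTILE_CONT performs. The function name `percentile_cont` and the sample durations are made up for illustration; they are not taken from the application.

```python
import math


def percentile_cont(values, fraction):
    """Interpolated percentile, following the PERCENTILE_CONT definition.

    None entries are skipped, mirroring how the aggregate ignores NULLs
    (for example, jobs that have no delivered_at yet).
    """
    ordered = sorted(v for v in values if v is not None)
    if not ordered:
        return None
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    weight = position - lower
    # Interpolate linearly between the two neighbouring values.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical durations in seconds for one day; None stands for an undelivered job.
durations = [10.0, 20.0, 30.0, 40.0, None]
p75 = percentile_cont(durations, 0.75)
p95 = percentile_cont(durations, 0.95)
```

Because the result is interpolated, p75 and p95 can be values that no single job actually had, which is worth keeping in mind when setting alarm bounds.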
Add a CloudWatch (or other) dashboard to monitor app-specific metrics. We currently run these queries manually from time to time, but the metrics should be available on demand to monitor the ongoing health of the application.
Once such metrics are in place, we can also consider adding alarms for when they exceed some bounds.
Examples of current metrics:
Number of Jobs per Week:
Job Success Rate (completed jobs / all jobs) per Week:
Average Job Completion Time (delivered_at - created_at, in seconds):
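As a rough illustration of how the three metrics above could be computed in one pass before being pushed to a dashboard, here is a minimal Python sketch. It assumes each job is available as a `(created_at, delivered_at)` pair and treats a job as completed when `delivered_at` is set; that completion rule and the function name `weekly_job_metrics` are assumptions for illustration, and the real schema may track completion differently.

```python
from collections import defaultdict
from datetime import datetime


def weekly_job_metrics(jobs):
    """Group jobs by ISO week and compute count, success rate, and average duration.

    jobs: iterable of (created_at, delivered_at) tuples, where delivered_at
    is None for a job that was never delivered (assumed to mean "not completed").
    """
    buckets = defaultdict(list)
    for created_at, delivered_at in jobs:
        iso = created_at.isocalendar()
        buckets[(iso[0], iso[1])].append((created_at, delivered_at))

    metrics = {}
    for week, rows in sorted(buckets.items()):
        # Durations in seconds for delivered jobs only.
        durations = [(d - c).total_seconds() for c, d in rows if d is not None]
        metrics[week] = {
            "job_count": len(rows),
            "success_rate": len(durations) / len(rows),
            "avg_seconds": sum(durations) / len(durations) if durations else None,
        }
    return metrics


# Hypothetical sample data, not real production jobs.
jobs = [
    (datetime(2023, 1, 2, 9, 0), datetime(2023, 1, 2, 9, 1)),  # 60 s
    (datetime(2023, 1, 3, 9, 0), datetime(2023, 1, 3, 9, 3)),  # 180 s
    (datetime(2023, 1, 4, 9, 0), None),                        # never delivered
    (datetime(2023, 1, 9, 9, 0), datetime(2023, 1, 9, 9, 2)),  # 120 s
]
metrics = weekly_job_metrics(jobs)
```

Keys are `(ISO year, ISO week)` tuples, so the first three sample jobs land in `(2023, 1)` and the last one in `(2023, 2)`. Each per-week value could then be published as one data point of a custom metric and compared against alarm bounds.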