Readiness + Startup probes #804

Merged
merged 21 commits into from
Jan 18, 2025

@james-otten (Collaborator) commented Jan 7, 2025

The liveness probe was not working because the element was duplicated, with the second occurrence being empty. This PR re-enables it, makes it work, and adds readiness and startup probes. Together they allow for slow startups, pull unresponsive pods from the service, and eventually restart them if they stay unresponsive.

Let's let this soak in dev for a while.

@james-otten james-otten requested a review from WillNilges January 7, 2025 04:41

codecov bot commented Jan 7, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.88%. Comparing base (832eb2e) to head (013cca3).
Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #804   +/-   ##
=======================================
  Coverage   94.88%   94.88%           
=======================================
  Files          89       89           
  Lines        3816     3816           
=======================================
  Hits         3621     3621           
  Misses        195      195           


command:
  - bash
  - -c
  - 'curl http://127.0.0.1:{{ .Values.meshweb.port }}/api/v1/ -H "Host: db.nycmesh.net" -s | grep meshin'
Collaborator

The grep is fine, but it took my brain a sec to realize it's matching "we're meshin'" lol
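
For context on why the grep matters: an exec probe passes when its command exits 0, and a shell pipeline's exit status is that of its last command, so this probe passes only when the response body contains `meshin`. A minimal sketch of that behavior follows. The `check_body` helper and the sample body are hypothetical stand-ins, since the real `/api/v1/` response is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the tail of the probe's `curl ... | grep meshin` pipeline.
# check_body reads a response body on stdin and exits 0 only if it contains "meshin".
check_body() {
  grep -q meshin
}

# Assumed example body; the real API response is not shown in this PR.
if printf '%s\n' "We're meshin'." | check_body; then
  echo "probe would pass"
fi

# An empty body (for example, curl could not connect) makes grep exit non-zero.
if ! printf '' | check_body; then
  echo "probe would fail"
fi
```

Because `curl -s` prints nothing when it cannot connect, a down server and a wrong response body both end up as a failed probe here.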

Comment on lines +61 to +81
  exec:
    command:
      - bash
      - -c
      - 'curl http://127.0.0.1:{{ .Values.meshweb.port }}/api/v1/ -H "Host: db.nycmesh.net" -s | grep meshin'
  periodSeconds: 3
  initialDelaySeconds: 10
  timeoutSeconds: 3
{{ end }}
{{ if eq .Values.meshweb.startup_probe "true" }}
startupProbe:
  exec:
    command:
      - bash
      - -c
      - 'curl http://127.0.0.1:{{ .Values.meshweb.port }}/api/v1/ -H "Host: db.nycmesh.net" -s | grep meshin'
  periodSeconds: 3
  initialDelaySeconds: 20
  timeoutSeconds: 3
  failureThreshold: 20
{{ end }}
@WillNilges (Collaborator) commented Jan 10, 2025

I don't think it's worth having redundant probes. We should consider what kind of probes to use in place of these. Maybe we should just check if we get anything back from nginx at all as the liveness probe, then make the readiness probe check we're meshin'?

Not sure about startupProbe
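
For reference on the startupProbe: its settings in this diff bound how long a slow start is tolerated before the container is restarted. A rough sketch of the arithmetic, assuming the kubelet waits `initialDelaySeconds` and then allows `failureThreshold` consecutive failures spaced `periodSeconds` apart (probe timeouts are ignored, so treat the result as approximate):

```shell
#!/usr/bin/env bash
# Values copied from the startupProbe in this diff.
initial_delay=20
period=3
failure_threshold=20

# Approximate startup budget: the initial delay plus one period per allowed failure.
startup_budget=$(( initial_delay + failure_threshold * period ))
echo "startup budget: ~${startup_budget}s"
```

With these values the script prints `startup budget: ~80s`, so the app gets a bit over a minute to start answering.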

Collaborator

Actually, I think I might be wrong to suggest that.

https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/

A common pattern for liveness probes is to use the same low-cost HTTP endpoint as for readiness probes, but with a higher failureThreshold. This ensures that the pod is observed as not-ready for some period of time before it is hard killed.
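
To make the quoted pattern concrete, here is a rough sketch comparing how long the same check takes to mark a pod not-ready versus restart it. The `seconds_to_trip` helper is hypothetical, the liveness threshold of 10 is an assumed value that does not come from this PR, and 3 is the Kubernetes default `failureThreshold` when none is set.

```shell
#!/usr/bin/env bash
# seconds_to_trip PERIOD THRESHOLD
# Rough time for a probe to register THRESHOLD consecutive failures.
seconds_to_trip() {
  echo $(( $1 * $2 ))
}

period=3                 # periodSeconds used by the probes in this PR
readiness_threshold=3    # Kubernetes default failureThreshold when unset
liveness_threshold=10    # hypothetical higher value, not from this PR

echo "not ready after ~$(seconds_to_trip "$period" "$readiness_threshold")s"
echo "restarted after ~$(seconds_to_trip "$period" "$liveness_threshold")s"
```

With these assumed numbers the pod would drop out of the service after about 9s of failures and only be restarted after about 30s, which is the gap the docs describe.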

@james-otten james-otten merged commit af6f44f into main Jan 18, 2025
12 checks passed