Skip to content

Conversation

@plural-copilot
Copy link

Summary

This PR addresses the intermittent 500 Internal Server Errors observed on the /ping endpoint in the flow-test workload deployed in the boot-prod EKS Kubernetes cluster. The root cause was a deliberate exception raised every 3 seconds in the application code, causing unhealthy error rates and firing alerts.

Changes Made

  • Removed the conditional exception raising in the /ping endpoint in app/main.py to always return a successful response.

Rationale

This removal prevents artificial failures that were likely leftover debugging or test code that mistakenly remained active in production. The fix should stop the intermittent 500 errors and reduce the alert noise.

After merging, redeploy the service and monitor error rates to confirm the resolution of the alert.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants