@plural-copilot

Summary

This PR fixes the ongoing issue that causes the "Flow 500s" alert to fire for the flow-test service running in the demo-prod cluster. The root cause was an exception raised intentionally in the /ping handler, triggered by a check on the current time modulo 3, which produced intermittent 500 errors and alert noise.

Changes Made

  • Removed the code block inside the /ping handler in app/main.py that raised Exception("unknown internal error") based on the current time modulo 3.
  • /ping now always returns 200 OK with a simple pong response.
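
For reviewers, the change above can be sketched roughly as follows. This is an illustrative reconstruction, not the actual contents of app/main.py: the web framework is omitted, the function names `ping_before` and `ping_after` are hypothetical, and the exact modulo-3 condition (shown here as "timestamp divisible by 3") is an assumption, since the description only says the check used the current time modulo 3.

```python
import time


def ping_before(now=None):
    """Hypothetical reconstruction of the removed behaviour."""
    # Assumption: the failure fired when the whole-second timestamp was
    # divisible by 3. The PR text only says "current time modulo 3".
    now = time.time() if now is None else now
    if int(now) % 3 == 0:
        raise Exception("unknown internal error")
    return {"status": 200, "body": "pong"}


def ping_after():
    """Behaviour after this PR: no time-based branch, always succeeds."""
    return {"status": 200, "body": "pong"}
```

With the time-based branch gone, the health check result no longer depends on when the request arrives.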

Rationale

Removing this forced error simulation stops the service from returning 500 responses on a health check endpoint, so the alert no longer fires because of intentional failures. This keeps availability monitoring accurate and free of artificial errors. If error simulation is still required, it should be confined to non-production environments outside this deployment.
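
If error simulation is needed again later, one possible way to keep it out of production is an explicit opt-in flag. The sketch below is not part of this PR: the environment variable name `SIMULATE_PING_ERRORS` and the helper `simulation_enabled` are hypothetical, and the modulo-3 condition is the same assumed form as in the removed code.

```python
import os
import time


def simulation_enabled(env=None):
    """Return True only when the (hypothetical) opt-in flag is set."""
    env = os.environ if env is None else env
    return env.get("SIMULATE_PING_ERRORS", "").lower() in {"1", "true"}


def ping(now=None, env=None):
    """Health check that fails only when simulation is explicitly enabled."""
    if simulation_enabled(env):
        now = time.time() if now is None else now
        # Assumed form of the original time-based failure condition.
        if int(now) % 3 == 0:
            raise Exception("unknown internal error")
    return "pong"
```

Because the flag defaults to off, a deployment that never sets it behaves exactly like the always-200 handler in this PR.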

Once this PR is merged and deployed via the standard Plural GitOps flow, the alert should stop firing for this service.

Please review and approve to mitigate the alert noise issue.

@michaeljguarino michaeljguarino deleted the plrl/ai/fix/flow-500s-remove-intentional-errors-gyqe06 branch September 17, 2025 17:33
