Skip to content

Add validations related to minReplicas and maxReplicas before updating HPA #451

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

AVSBharadwaj
Copy link
Collaborator

@AVSBharadwaj AVSBharadwaj commented Jul 1, 2025

Summary

Adds comprehensive validation logic to prevent Kubernetes HPA validation errors by ensuring minReplicas <= maxReplicas and respecting all replica bounds across all Tortoise phases.

What Changed

New Validation Logic

  • Early validation: Check minReplicas/maxReplicas against user and cluster limits
  • Relationship validation: Ensure minReplicas <= maxReplicas
  • Phase-specific handling:
    • Emergency: Set minReplicas = maxReplicas
    • BackToNormal: Temporarily increase maxReplicas for safe gradual reduction
    • Working: Use recommended values directly
  • Final validation: Double-check all constraints after phase adjustments

Enhanced Logging

  • New WarningReplicaValidation event for better categorization
  • Clear error messages explaining adjustments made

Test Coverage

Added 8 test cases covering all validation scenarios including edge cases and phase transitions.

Why These Changes

Problem

  • Kubernetes rejects HPA configurations where minReplicas > maxReplicas
  • Missing validation for replica bounds and user/cluster limits
  • Poor error handling during phase transitions (especially BackToNormal)

Solution

  • Comprehensive validation prevents HPA errors while maintaining safety
  • Smart handling during BackToNormal phase prioritizes safety over strict validation
  • Enhanced logging improves debugging and observability

Benefits

  • Reliability: Prevents HPA validation errors
  • Safety: Maintains safety during critical phases
  • Observability: Clear logging for adjustments
  • Backward Compatible: No breaking changes

Testing

  • 8 comprehensive unit tests covering all validation scenarios
  • Tests edge cases, phase transitions, and boundary conditions

Type: Enhancement
Breaking Change: No
Testing: ✅ Unit tests added

}

// Ensure minReplicas doesn't exceed maxReplicas
if recommendMin > recommendMax {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there's a bug with the recommender logic if it ever returns recommendMin > recommendMax, should there also be a fix there?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The recommender will not result in such a recommendations ( where recommendMin > recommendMax) . This is added just for completion sake to cover all possible cases.

Copy link
Contributor

@djahandarie djahandarie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look safe to me and fix a bug, so I think we should more forward with them.

Though, as mentioned on Slack, I think this routine needs some refactoring to make it clearer what invariants we are trying to preserve (probably by having a pure subroutine which purely is responsible for enforcing those invariants, and nothing else), and that subroutine can be called as many times as needed. Should be fine to do in a separate PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants