Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: add exponential backoff to opamp onmessage #7387

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

nityanandagohain
Copy link
Member

@nityanandagohain nityanandagohain commented Mar 20, 2025

Add exponential back so that the server doesn't panic.


Important

Adds exponential backoff in OnMessage of opamp_server.go to handle agent creation errors with retries.

  • Behavior:
    • Adds exponential backoff in OnMessage in opamp_server.go to handle errors when finding or creating an agent.
    • Retries operation with increasing delay (doubles each time) until success.
  • Logging:
    • Logs error with backoff duration when retrying in OnMessage.

This description was created by Ellipsis for 0f7e6da. It will automatically update as commits are pushed.

@github-actions github-actions bot added the bug Something isn't working label Mar 20, 2025
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Reviewed everything up to eec45da in 1 minute and 58 seconds

More details
  • Looked at 35 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. pkg/query-service/app/opamp/opamp_server.go:85
  • Draft comment:
    Infinite loop risk: The backoff loop has no retry limit or maximum delay. Consider adding a cap to avoid indefinite blocking if agent creation repeatedly fails.
  • Reason this comment was not posted:
    Marked as duplicate.
2. pkg/query-service/app/opamp/opamp_server.go:94
  • Draft comment:
    Typographical: The error log message on line 94 uses 'retrying....' with four dots. Consider using the standard ellipsis 'retrying...' for consistency.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 10% vs. threshold = 50%
    While the comment is technically correct about inconsistent ellipsis usage, this is an extremely minor issue that doesn't affect functionality. Log messages are primarily for debugging, and the extra dot doesn't impact readability or cause any real problems. This feels like unnecessary nitpicking that adds noise to the PR review.
    The comment is factually correct and points out an inconsistency in code style. Consistent formatting can make code more professional and maintainable.
    While consistency is good, this level of nitpicking about log message punctuation is too trivial and doesn't meet the bar of "clearly a code change required" from our rules. It's the kind of minor detail that just adds noise to the review process.
    Delete this comment as it violates the rule about not making comments that are obvious or unimportant. The extra dot in an error message is too trivial to warrant a PR comment.
3. pkg/query-service/app/opamp/opamp_server.go:118
  • Draft comment:
    Typographical/Grammar: The comment on lines 118-119 ('global var methods to support singleton pattern. we want to discourage allow multiple servers in one installation') appears to have a grammatical issue. Consider rephrasing it for clarity, e.g., 'Global variable methods supporting the singleton pattern. We want to discourage the use of multiple servers in a single installation.'
  • Reason this comment was not posted:
    Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_YqBoZ594doOw7tCj


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 0f7e6da in 58 seconds

More details
  • Looked at 13 lines of code in 1 files
  • Skipped 0 files when reviewing.
  • Skipped posting 3 drafted comments based on config settings.
1. pkg/query-service/app/opamp/opamp_server.go:94
  • Draft comment:
    Consider capping the exponential backoff to a maximum duration to prevent excessively long sleep times in case of continuous failures.
  • Reason this comment was not posted:
    Comment was on unchanged code.
2. pkg/query-service/app/opamp/opamp_server.go:94
  • Draft comment:
    Good addition of backoff logging. Consider capping the exponential backoff duration and optionally adding jitter to avoid unbounded delays or synchronized retries.
  • Reason this comment was not posted:
    Comment was on unchanged code.
3. pkg/query-service/app/opamp/opamp_server.go:94
  • Draft comment:
    The logging message 'Failed to find or create agent retrying....' uses four dots. Consider using three dots ('retrying...') to follow standard punctuation conventions.
  • Reason this comment was not posted:
    Comment was on unchanged code.

Workflow ID: wflow_Yo1f0bctgJiPmDHZ


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@nityanandagohain nityanandagohain marked this pull request as draft March 24, 2025 12:33
@nityanandagohain
Copy link
Member Author

Check if this is required at all when we move to bun instead of diretly using sqlite and see if we can handle this error on the collector side.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant