Attempt to fix flaky Harbor E2E setup #20641
base: master
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅
```diff
@@ -49,8 +48,7 @@ def dd_environment(e2e_instance):
     expected_log = "http server Running on" if HARBOR_VERSION < [1, 10, 0] else "API server is serving at"
     conditions = [
         CheckDockerLogs(compose_file, expected_log, wait=3),
-        lambda: time.sleep(4),
-        WaitFor(create_simple_user),
+        WaitFor(create_simple_user, wait=5),
```
Is there a reason to believe that one more second is enough? Just curious.
```python
class WaitFor(LazyFunction):
    def __init__(
        self,
        func,  # type: Callable
        attempts=60,  # type: int
        wait=1,  # type: int
        args=(),  # type: Tuple
        kwargs=None,  # type: Dict
    ):
```
This actually increases the waiting time by 4 seconds for every attempt (60 attempts by default).
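The key point is that `WaitFor` sleeps `wait` seconds on every attempt, so the worst case is roughly `attempts × wait` seconds. A minimal sketch of that retry loop (assumed behavior, not the actual ddev implementation):

```python
import time


def wait_for(func, attempts=60, wait=1):
    """Sketch of a WaitFor-style condition: call `func` up to `attempts`
    times, sleeping `wait` seconds after each failed try, so the worst
    case is roughly attempts * wait seconds."""
    last_exc = None
    for _ in range(attempts):
        try:
            result = func()
            # Treat any return value other than False as success
            if result is not False:
                return result
        except Exception as exc:
            last_exc = exc
        time.sleep(wait)
    raise RuntimeError('condition was not met after {} attempts'.format(attempts)) from last_exc
```

Under that model, raising `wait` from 1 to 5 adds 4 seconds per attempt, i.e. up to 240 extra seconds across the default 60 attempts.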
Oooh true, I checked how WaitFor works but missed that the sleep in there happens on each loop iteration. It's quite an increase though: we go from 64 to 240 seconds max, haha. Hopefully 4 minutes is enough.
I figured it's pointless to increase it just a bit every time it fails; I'd rather wait as long as possible since we have no choice. If it still fails with this big timeout, then we have a different problem, and we might need to consider switching to exponential retry backoff.
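Exponential backoff, as mentioned above, would grow the delay between attempts instead of sleeping a fixed `wait` each time. A hypothetical sketch (function names and parameters are illustrative, not part of ddev):

```python
import time


def backoff_delays(base=1.0, factor=2.0, cap=30.0, attempts=8):
    """Yield the sleep schedule for capped exponential backoff:
    base, base*factor, base*factor**2, ... never exceeding `cap`."""
    delay = base
    for _ in range(attempts):
        yield min(delay, cap)
        delay *= factor


def wait_with_backoff(func, base=1.0, factor=2.0, cap=30.0, attempts=8):
    """Retry `func` with the schedule above; raise if it never succeeds."""
    last_exc = None
    for delay in backoff_delays(base, factor, cap, attempts):
        try:
            return func()
        except Exception as exc:
            last_exc = exc
            time.sleep(delay)
    raise RuntimeError('condition was not met after {} attempts'.format(attempts)) from last_exc
```

With the defaults above the schedule is 1, 2, 4, 8, 16, 30, 30, 30 seconds (about 121 seconds total), so a service that comes up quickly is detected fast while a slow one still gets a long overall window.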
What does this PR do?
The Harbor E2E tests failed recently with the following error:
This PR tries to address that issue by increasing the wait time, giving the Harbor users endpoint more time to become healthy.
Motivation
Failing job: https://github.com/DataDog/integrations-core/actions/runs/16020744131/job/45196949202
Review checklist (to be filled by reviewers)
- Add the `qa/skip-qa` label if the PR doesn't need to be tested during QA.
- Add a `backport/<branch-name>` label to the PR and it will automatically open a backport PR once this one is merged.