Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EVEREST-1182 fix install of everest operator #461

Merged
merged 2 commits into from
Jun 27, 2024

Conversation

recharte
Copy link
Collaborator

We introduced this phase check because OLM has a bug where it sometimes creates duplicate InstallPlans for the same CSV and we found a few cases where the duplicate InstallPlan wasn't reconciled correctly and abandoned by OLM. This abandoned InstallPlan was missing the status field meaning it was also missing the necessary plan to install the operator. Approving this InstallPlan would cause the operator to never be installed. By checking the phase we make sure we will be approving an InstallPlan that is actually ready to be approved. See operator-framework/kubectl-operator#13 for more details on a similar issue.

We introduced this phase check because OLM has a bug where it sometimes
creates duplicate InstallPlans for the same CSV and we found a few cases
where the duplicate InstallPlan wasn't reconciled correctly and
abandoned by OLM. This abandoned InstallPlan was missing the status
field meaning it was also missing the necessary plan to install the
operator. Approving this InstallPlan would cause the operator to never
be installed. By checking the phase we make sure we will be approving an
InstallPlan that is actually ready to be approved. See
operator-framework/kubectl-operator#13 for
more details on a similar issue.
@recharte recharte requested a review from a team as a code owner June 27, 2024 16:04
@recharte recharte enabled auto-merge (squash) June 27, 2024 16:07
@oksana-grishchenko oksana-grishchenko enabled auto-merge (squash) June 27, 2024 16:36
@oksana-grishchenko oksana-grishchenko merged commit e04e882 into main Jun 27, 2024
11 checks passed
@oksana-grishchenko oksana-grishchenko deleted the EVEREST-1182-fix-install-everest-operator branch June 27, 2024 16:42
recharte added a commit that referenced this pull request Jun 27, 2024
* EVEREST-1182 fix install of everest operator

We introduced this phase check because OLM has a bug where it sometimes
creates duplicate InstallPlans for the same CSV and we found a few cases
where the duplicate InstallPlan wasn't reconciled correctly and
abandoned by OLM. This abandoned InstallPlan was missing the status
field meaning it was also missing the necessary plan to install the
operator. Approving this InstallPlan would cause the operator to never
be installed. By checking the phase we make sure we will be approving an
InstallPlan that is actually ready to be approved. See
operator-framework/kubectl-operator#13 for
more details on a similar issue.

* EVEREST-1182 fix idempotency of the install command
Copy link
Member

@mayankshah1607 mayankshah1607 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you able to test if this works?

My understanding was that in most cases, this function returns at line 581, which returns the InstallPlanRef from the subscription - when this happens, the Subscription still points at the malformed InstallPlan, and takes a while to update (basically the for-loop is never executed).

I realised that the Install flow is a bit complicated, and its hard to figure out just by looking which conditions will be triggered and which execution path our code will take 😓

@recharte
Copy link
Collaborator Author

recharte commented Jun 27, 2024

Were you able to test if this works?

My understanding was that in most cases, this function returns at line 581, which returns the InstallPlanRef from the subscription - when this happens, the Subscription still points at the malformed InstallPlan, and takes a while to update..

I realised that the Install flow is a bit complicated, and its hard to figure out just by looking which conditions will be triggered and which execution path our code will take 😓

I did test that the install works correctly for the generic case. I'm still unable to replicate the case where the IP was missing the status but I guess this is as good as we'll get at this stage. I'll do some more tests to see if I can catch that case but it hasn't been easy...

Yes, the install flow is a bit complicated, we should probably refactor it in following releases.

@recharte
Copy link
Collaborator Author

My understanding was that in most cases, this function returns at line 581, which returns the InstallPlanRef from the subscription - when this happens, the Subscription still points at the malformed InstallPlan, and takes a while to update..

Actually, for the everest operator we are setting the startingCSV so it never returns at line 581 because in line 575 we set it.

@mayankshah1607
Copy link
Member

mayankshah1607 commented Jun 27, 2024

Actually, for the everest operator we are setting the startingCSV so it never returns at line 581 because in line 575 we set it.

If you trace back how the req is passed, we never set the StartingCSV value there (it is empty).. So it is the same as "" and 581 gets triggered

@recharte
Copy link
Collaborator Author

recharte commented Jun 27, 2024

Actually, for the everest operator we are setting the startingCSV so it never returns at line 581 because in line 575 we set it.

If you trace back how the req is passed, we never set the StartingCSV value there (it is empty).. So it is the same as "" and 581 gets triggered

Maybe I'm missing something but I don't think that is true. Let's take a closer look...
The getTargetInstallPlanName function is called here
In this function the req is never changed so it comes from the caller function. This call is here
The startingCSV is set here which comes from here which in turn takes the version param that comes from the caller function.
This is the call which has the v var set as long as it exists in the recommended versions from VS. These recommended versions come from here.

Is the recommended version empty in VS? 🙈

@mayankshah1607
Copy link
Member

mayankshah1607 commented Jun 27, 2024

Is the recommended version empty in VS? 🙈

If you trace back ever further and see the implementation of ReccomendedVersions, you'll see that we never set EverestOperator there. So, its nil.. Now if you follow along from there, it eventually leads to not setting req.StartingCSV 😅

@recharte
Copy link
Collaborator Author

Is the recommended version empty in VS? 🙈

If you trace back ever further and see the implementation of ReccomendedVersions, you'll see that we never set EverestOperator there. So, its nil.. Now if you follow along from there, it eventually leads to not setting the StartingCSV 😅

But then how is it set in the Subscription? I always see it set there...

@mayankshah1607
Copy link
Member

Is the recommended version empty in VS? 🙈

If you trace back ever further and see the implementation of ReccomendedVersions, you'll see that we never set EverestOperator there. So, its nil.. Now if you follow along from there, it eventually leads to not setting the StartingCSV 😅

But then how is it set in the Subscription? I always see it set there...

My bad, you're right. I missed this line where we're checking if it is nil, and then using the latest version:

recVer.EverestOperator = latest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants