
AUT-2125: Remove tiller usage #70

Open · wants to merge 3 commits into base: develop
Conversation

@bmcblanc bmcblanc commented Feb 2, 2024

  • Add documentation for migrating a running cluster

Contributor

@owenallenaz owenallenaz left a comment


@bmcblanc This looks like a pretty small set of changes. Help me understand the process you anticipate needing to go through to get this out in the wild to live servers. For example, let's imagine we did not perform any helm 2 to helm 3 migration of cloud clusters. What would happen if sv-deploy-gce used helm 3 but the app was previously installed as helm 2?

> It seems that we could simply redeploy (rollback) with helm2 container
> Rollback would mean, cleaning up Helm v3 configs/secrets/deploys?

### Local cluster upgrade
Contributor


For the local cluster, no one has anything that is so persistent that it can't be restarted. The newer version of sv-kube will have a newer version of minikube/helm installed which means that in practice they'll be starting from a fresh sv-kube VM. In that sense, do we need to use helm-2to3 at all? Is it necessary to run some sort of migration commands for the cloud clusters?

Author


It is indeed not mandatory for a local dev cluster, since it can simply be wiped and spun up again, but I included it in case some people want to keep their workloads running and simply upgrade.

For LIVE clusters, we must run the migration steps otherwise Helm 3 won't be able to see and manage the currently deployed Helm 2 charts.

I added documentation in sv-deploy-gce for LIVE cluster migration.

Contributor


Ok, if we need helm-2to3 to handle the upgrade on our live boxes, then for now let's include it in the base install of our boxes so that devops and I have access to it. I would add its installation to the install_helm script, so that anyone using sv-kube has access to helm-2to3. Once we're off it we can unwind this, but for the time being it feels like we should just have it.

Author


Fair enough, I'll add it to the PR


```diff
-if [ "$helm_version" != "$helm_version_expected" ]; then
+if ! [[ "$helm_version_current" =~ $helm_version ]]; then
```
Contributor


Is there a reason we're doing a regex comparison for 2 strings?

Author


The comparison was too restrictive and wasn't actually working on my VM (Apple Silicon based). Using a regex ensures only the version matters, independent of the output format.

Contributor

@owenallenaz owenallenaz Feb 21, 2024


When you run helm version --client --short on your silicon box, what do you get back? My suspicion is that the helm version is something like v3.14.0 but you're probably getting back v3.14.0+hash. If so, then perhaps we do something like helm version --template='{{.Version}}', and that might get us back what we expect. I don't think the hash is meaningful to our comparisons.
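A quick sketch of why a strict equality check can fail against the --short output; the version strings below are hypothetical examples, not captured from a real box:

```shell
# Hypothetical outputs: `--short` may append a build hash, while the
# `--template` form prints only the bare version, so only the latter
# survives a strict string equality check.
helm_version_expected="v3.14.0"
short_output="v3.14.0+g3fc9f4b"    # e.g. helm version --client --short
template_output="v3.14.0"          # e.g. helm version --template='{{.Version}}'

if [ "$short_output" = "$helm_version_expected" ]; then
  echo "short form matches"
else
  echo "short form differs"        # this branch is taken
fi

if [ "$template_output" = "$helm_version_expected" ]; then
  echo "template form matches"     # this branch is taken
fi
```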

Author

@bmcblanc bmcblanc Feb 21, 2024


Yeah, I believe the hash was giving me trouble, but let me rework that part then.

I used a regex because the version output differed between Helm 2 and Helm 3 and I was trying to keep it compatible, which no longer seems relevant. Good catch on the template option; it works like a charm.

# AUT-2125: Tiller has been removed from HELM v3
Contributor


This file looks dead. Lets just remove it.

Author


Sounds good. I left it there just to make sure it wasn't breaking anything else if I got rid of it. I'll update.

Author

bmcblanc commented Feb 21, 2024

> @bmcblanc This looks like a pretty small set of changes. Help me understand the process you anticipate needing to go through to get this out in the wild to live servers. For example, let's imagine we did not perform any helm 2 to helm 3 migration of cloud clusters. What would happen if sv-deploy-gce used helm 3 but the app was previously installed as helm 2?

While Helm 3 can manage Helm 2 charts, they still need to be configured in the Helm 3 way. That's what the migration does: changing some resource types, renaming, and moving configurations. The migration steps are mandatory for Helm 3 to see and manage releases that were installed with Helm 2.

The steps for live cluster migration are listed in the sv-deploy-gce documentation and must be carried out before running Helm 3 integration commands.

The process goes this way:

  • Merge changes in SV-DEPLOY-GCE
  • Run a sv-deploy-gce container
  • Install 2to3 Helm plugin
  • Connect to the cluster to migrate
  • Run the 2to3 migration commands
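The steps above might be sketched roughly as follows. This is a dry-run illustration: the release name is a placeholder, and the exact flags should be verified against the helm-2to3 plugin documentation before running against a real cluster.

```shell
# Dry-run sketch of the migration: `run` only echoes each command so the
# sequence can be reviewed; replace the echo with "$@" to execute for real.
run() { echo "+ $*"; }

# Hypothetical release list; in practice this would come from Helm 2's
# release listing on the target cluster.
releases="example-app"

run helm plugin install https://github.com/helm/helm-2to3
run helm 2to3 move config            # migrate Helm v2 config/plugins/repos
for r in $releases; do
  run helm 2to3 convert "$r"         # convert each v2 release to v3
done
run helm 2to3 cleanup                # remove leftover Helm v2 data
```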

@bmcblanc bmcblanc force-pushed the AUT-2125-update-sv-kubernetes-to-support-helm-3 branch from 47425b2 to 4e8c70c Compare February 21, 2024 16:16
@bmcblanc
Author

@owenallenaz Pushed a new commit with the required changes in install_helm.sh


```diff
-if [ "$helm_version" != "$helm_version_expected" ]; then
+if ! [[ "$helm_version_current" != $helm_version ]]; then
```
Contributor


Last nit here... this looks like a double negative. Since it reads as if NOT (current NOT expected), is that right? Wouldn't this re-install each time?

Author


I was too hasty, thanks for catching that!
Just pushed the update.
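For reference, the corrected check presumably reduces to a plain inequality (variable names taken from the snippet above; the version values here are hypothetical):

```shell
# Re-install only when the installed version differs from the expected one.
helm_version="v3.14.0"           # expected version (hypothetical value)
helm_version_current="v3.13.2"   # e.g. from helm version --template='{{.Version}}'

if [ "$helm_version_current" != "$helm_version" ]; then
  echo "installing helm $helm_version"
fi
```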

@bmcblanc bmcblanc force-pushed the AUT-2125-update-sv-kubernetes-to-support-helm-3 branch from d104ed0 to 7b08ee0 Compare February 22, 2024 01:44
@owenallenaz
Contributor

@bmcblanc Ok, I think we're almost done here. One more thing we need. I frequently have a problem, and devops people likely will too, where we forget to switch our cluster from a live cluster back to our local cluster. For example, I'll run a command like sudo sv stop dms and right now, if my cluster is a live cluster, it fails because it says it cannot find tiller. In this newer variant of sv-kube, though, now that tiller isn't necessary, if I ran that command by accident I would have just hosed a real product. Whoops.

Right now the two commands I think we need to protect are sv start and sv stop. Basically, if your cluster isn't currently local, I think we should prompt the user. We already have a method getCurrentContext. If it's not the local value (not sure if this is local or minikube), prompt the user with something like:

Current Context: ${cluster}
You are performing an action on a non-local cluster.
[Enter] to continue or ctrl+c to cancel.

We already have the npm read package available. You might be able to accomplish it, or something close, with it. This should only display if the cluster isn't local. Then, when we're calling it from sv-deploy-gce, we'll need a way to override this. I think the easiest way is to repurpose the logContext function. We don't need that if we have a prompt, and the prompt is better. The two ways I can see to disable the feature when calling from sv-deploy-gce are either A: target an ENV value that CircleCI always has present, or B: pass a flag from sv-deploy-gce which disables the checkContext(). Either approach is valid.
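A shell sketch of the guard being described (the real change would live in sv-kubernetes' JS commands using getCurrentContext and the read package; the context name "minikube", the CI env check, and the function shape here are all assumptions):

```shell
# Prompt before acting on anything that is not the local cluster.
confirm_context() {
  cluster="$1"
  # Skip the prompt for the local cluster, or when running in CI
  # (approach A above: target an env value CI environments provide).
  if [ "$cluster" = "minikube" ] || [ -n "${CI:-}" ]; then
    return 0
  fi
  echo "Current Context: ${cluster}"
  echo "You are performing an action on a non-local cluster."
  echo "[Enter] to continue or ctrl+c to cancel."
  read -r _
}
```

In practice the cluster name would come from the existing getCurrentContext, and sv start / sv stop would call the guard before doing anything.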

@bmcblanc
Author

> @bmcblanc Ok, I think we're almost done here. One more thing we need. I frequently have a problem, and devops people likely will too, where we forget to switch our cluster from a live cluster back to our local cluster. For example, I'll run a command like sudo sv stop dms and right now, if my cluster is a live cluster, it fails because it says it cannot find tiller. In this newer variant of sv-kube, though, now that tiller isn't necessary, if I ran that command by accident I would have just hosed a real product. Whoops.
>
> Right now the two commands I think we need to protect are sv start and sv stop. Basically, if your cluster isn't currently local, I think we should prompt the user. We already have a method getCurrentContext. If it's not the local value (not sure if this is local or minikube), prompt the user with something like:
>
> Current Context: ${cluster}
> You are performing an action on a non-local cluster.
> [Enter] to continue or ctrl+c to cancel.
>
> We already have the npm read package available. You might be able to accomplish it, or something close, with it. This should only display if the cluster isn't local. Then, when we're calling it from sv-deploy-gce, we'll need a way to override this. I think the easiest way is to repurpose the logContext function. We don't need that if we have a prompt, and the prompt is better. The two ways I can see to disable the feature when calling from sv-deploy-gce are either A: target an ENV value that CircleCI always has present, or B: pass a flag from sv-deploy-gce which disables the checkContext(). Either approach is valid.

I'll work on it

@bmcblanc bmcblanc force-pushed the AUT-2125-update-sv-kubernetes-to-support-helm-3 branch from 7b08ee0 to 746f178 Compare March 1, 2024 20:57