
baremetal: Add cluster admin kubeconfig into a secret #4456

Closed
wants to merge 3 commits into from

Conversation

kirankt (Contributor) commented Dec 4, 2020

The baremetal platform is pivoting away from using the pointer ignition config. It will use a combination of the full rendered ignition config fetched from the MCS (to deploy masters) [1][2] and the rendered MachineConfig objects for adding workers [3].

This PR saves the admin kubeconfig into a MachineConfig asset. This is required for the baremetal machine controller to save the contents of the admin kubeconfig on the workers.

[1] #4427
[2] #4413
[3] openshift/cluster-api-provider-baremetal#127
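
For illustration only, here is a minimal sketch (not the PR's actual asset code) of how an installer-style asset could package a kubeconfig into a Secret manifest for the openshift-machine-api namespace. The file path and Secret name below are assumptions:

```go
// Sketch: build a Secret manifest carrying a kubeconfig so an in-cluster
// controller can read it later. Names and paths are hypothetical.
package main

import (
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/yaml"
)

func main() {
	// Hypothetical location of the generated kubeconfig on the installer host.
	kubeconfig, err := os.ReadFile("auth/kubeconfig-kubelet")
	if err != nil {
		panic(err)
	}

	secret := corev1.Secret{
		TypeMeta: metav1.TypeMeta{APIVersion: "v1", Kind: "Secret"},
		ObjectMeta: metav1.ObjectMeta{
			Name:      "worker-kubeconfig", // hypothetical Secret name
			Namespace: "openshift-machine-api",
		},
		Type: corev1.SecretTypeOpaque,
		Data: map[string][]byte{"kubeconfig": kubeconfig},
	}

	// Emit the manifest the installer would lay down alongside its other assets.
	out, err := yaml.Marshal(secret)
	if err != nil {
		panic(err)
	}
	fmt.Print(string(out))
}
```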

kirankt (Contributor, Author) commented Dec 4, 2020

/label platform/baremetal

openshift-ci-robot added the platform/baremetal (IPI bare metal hosts platform) label Dec 4, 2020
hardys (Contributor) commented Dec 4, 2020

Some additional context: it seems that the kubeconfig file is appended to the MCS config; see https://github.com/openshift/machine-config-operator/blob/master/pkg/server/cluster_server.go#L116

This means it's missing when we consume the rendered config via the merged MachineConfig object in openshift/cluster-api-provider-baremetal#127, because that path doesn't go through the MCS (due to the network rules discussed previously; see openshift/machine-config-operator#1690).
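
As a rough illustration of the step described above, the sketch below mimics the MCS-style "append the kubeconfig as a storage file" operation. The structs are simplified stand-ins, not the real Ignition v3 types, and the file path is an assumption:

```go
// Simplified sketch of appending a kubeconfig as a root-only file to an
// Ignition config, mirroring what the MCS does before serving configs.
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

type ignFile struct {
	Path     string `json:"path"`
	Mode     int    `json:"mode"`
	Contents struct {
		Source string `json:"source"`
	} `json:"contents"`
}

type ignConfig struct {
	Storage struct {
		Files []ignFile `json:"files"`
	} `json:"storage"`
}

// appendKubeconfig adds the kubeconfig bytes as a data URL file entry.
func appendKubeconfig(cfg *ignConfig, kubeconfig []byte) {
	f := ignFile{Path: "/etc/kubernetes/kubeconfig", Mode: 0600}
	f.Contents.Source = "data:text/plain;charset=utf-8;base64," +
		base64.StdEncoding.EncodeToString(kubeconfig)
	cfg.Storage.Files = append(cfg.Storage.Files, f)
}

func main() {
	var cfg ignConfig
	appendKubeconfig(&cfg, []byte("apiVersion: v1\nkind: Config\n"))
	out, _ := json.MarshalIndent(cfg, "", "  ")
	fmt.Println(string(out))
}
```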

kirankt (Contributor, Author) commented Dec 4, 2020

/retest

1 similar comment

kirankt (Contributor, Author) commented Dec 4, 2020

/retest

staebler (Contributor) left a review comment

Are we OK that anyone who has RBAC permissions to get MachineConfigs can get the admin kubeconfig?

pkg/asset/machines/worker.go (review thread, outdated, resolved)
openshift-ci-robot commented

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign crawford after the PR has been reviewed.
You can assign the PR to them by writing /assign @crawford in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

kirankt changed the title from "baremetal: Add cluster admin kubeconfig into machineconfig" to "baremetal: Add cluster admin kubeconfig into a secret" Dec 4, 2020
hardys (Contributor) commented Dec 5, 2020

This seems reasonable, but I'd like feedback from the MCO team to understand why the MCO generates the kubeconfig on the fly and loads the cert/secret from disk; see https://github.com/openshift/machine-config-operator/blob/master/pkg/server/cluster_server.go#L116

I'm not clear whether that's a historical thing or a deliberate choice, e.g. to ensure the admin kubeconfig and related secret aren't accessible directly via the k8s API.

The other alternative would be to generate the kubeconfig in cluster-api-provider-baremetal, similar to what the MCS does, but I'm not clear whether the following cert/secret is sufficient for that:

$ oc get pod machine-api-controllers-c85c69c75-45nnx -n openshift-machine-api -o json | jq '.spec.containers[1]|.name,.volumeMounts' | more
"machine-controller"
[
  {
    "mountPath": "/etc/pki/ca-trust/extracted/pem",
    "name": "trusted-ca",
    "readOnly": true
  },
  {
    "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
    "name": "machine-api-controllers-token-gqzjr",
    "readOnly": true
  }
]

I don't see the /var/run/secrets/kubernetes.io/serviceaccount mount in the MAO code; does anyone know where that comes from, and whether it's equivalent to the secret used in the MCS?

All that said, I'm fine with solving this in the installer, provided we can get sufficient feedback on the security aspects.
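
For context on that mount: it is the service account token volume Kubernetes injects into every pod. A minimal sketch of building a kubeconfig from it follows; whether the resulting credentials would be sufficient here is exactly the open question, and everything below is illustrative only:

```go
// Sketch: assemble a kubeconfig from the token and CA that Kubernetes
// auto-mounts into every pod at /var/run/secrets/kubernetes.io/serviceaccount.
package main

import (
	"encoding/base64"
	"fmt"
	"os"
)

const saDir = "/var/run/secrets/kubernetes.io/serviceaccount"

func main() {
	token, err := os.ReadFile(saDir + "/token")
	if err != nil {
		panic(err)
	}
	ca, err := os.ReadFile(saDir + "/ca.crt")
	if err != nil {
		panic(err)
	}

	// In-cluster API endpoint; these env vars are set by the kubelet.
	server := fmt.Sprintf("https://%s:%s",
		os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT"))

	kubeconfig := fmt.Sprintf(`apiVersion: v1
kind: Config
clusters:
- name: cluster
  cluster:
    server: %s
    certificate-authority-data: %s
users:
- name: sa
  user:
    token: %s
contexts:
- name: sa@cluster
  context:
    cluster: cluster
    user: sa
current-context: sa@cluster
`, server, base64.StdEncoding.EncodeToString(ca), string(token))

	fmt.Print(kubeconfig)
}
```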

hardys (Contributor) commented Dec 5, 2020

/cc @crawford @cgwalters @runcom

kirankt (Contributor, Author) commented Dec 5, 2020

/retest

kirankt (Contributor, Author) commented Dec 5, 2020

> This seems reasonable, but I'd like feedback from the MCO team to understand why the MCO generates the kubeconfig on the fly and loads the cert/secret from disk; see https://github.com/openshift/machine-config-operator/blob/master/pkg/server/cluster_server.go#L116
>
> I'm not clear whether that's a historical thing or a deliberate choice, e.g. to ensure the admin kubeconfig and related secret aren't accessible directly via the k8s API.
>
> The other alternative would be to generate the kubeconfig in cluster-api-provider-baremetal, similar to what the MCS does, but I'm not clear whether the following cert/secret is sufficient for that:
>
> $ oc get pod machine-api-controllers-c85c69c75-45nnx -n openshift-machine-api -o json | jq '.spec.containers[1]|.name,.volumeMounts' | more
> "machine-controller"
> [
>   {
>     "mountPath": "/etc/pki/ca-trust/extracted/pem",
>     "name": "trusted-ca",
>     "readOnly": true
>   },
>   {
>     "mountPath": "/var/run/secrets/kubernetes.io/serviceaccount",
>     "name": "machine-api-controllers-token-gqzjr",
>     "readOnly": true
>   }
> ]
>
> I don't see the /var/run/secrets/kubernetes.io/serviceaccount mount in the MAO code; does anyone know where that comes from, and whether it's equivalent to the secret used in the MCS?
>
> All that said, I'm fine with solving this in the installer, provided we can get sufficient feedback on the security aspects.

I did a little bit of digging and noticed that the contents in the MCO come from a secret in the openshift-machine-config-operator namespace called node-bootstrapper-token, which has the contents of the kubelet's kubeconfig data.

I surmise that the MCO's decision (to defer adding the contents of the secret until the MCS serves the config) was made because this kubeconfig data is sensitive and only needs to exist on the master nodes, where the MCS daemonset runs. This also explains why there was an explicit firewall rule on the worker nodes to block access from pods to the MCS.

In any case, this PR now creates a similar secret asset in the openshift-machine-api namespace, which the baremetal controller [1] can read and append to the rendered MachineConfig, much like the MCS does.

[1] openshift/cluster-api-provider-baremetal#127
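
A minimal sketch (with an assumed Secret name and key, not the PR's actual values) of how a controller in the openshift-machine-api namespace could read such a secret via client-go before appending it to the rendered config:

```go
// Sketch: read the installer-created kubeconfig Secret from inside the cluster.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// "worker-kubeconfig" and the "kubeconfig" key are hypothetical names.
	secret, err := client.CoreV1().Secrets("openshift-machine-api").
		Get(context.TODO(), "worker-kubeconfig", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}

	kubeconfig := secret.Data["kubeconfig"]
	// From here the controller would append these bytes to the rendered
	// MachineConfig's Ignition storage.files, much as the MCS does when it
	// serves configs directly.
	fmt.Printf("read %d bytes of kubeconfig\n", len(kubeconfig))
}
```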

kirankt (Contributor, Author) commented Dec 5, 2020

/retest

kirankt (Contributor, Author) commented Dec 8, 2020

@staebler: Hello! Can you PTAL at this now? I've made the kubeconfig a secret in the openshift-machine-api namespace, which we can access from the baremetal machine-controller. Thanks.

kirankt (Contributor, Author) commented Dec 8, 2020

/retest

openshift-merge-robot commented

@kirankt: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-ovirt | 6d3609f | link | /test e2e-ovirt |
| ci/prow/e2e-libvirt | 6d3609f | link | /test e2e-libvirt |
| ci/prow/e2e-crc | 6d3609f | link | /test e2e-crc |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

staebler (Contributor) commented Dec 8, 2020

> I did a little bit of digging and noticed that the contents in the MCO come from a secret in the openshift-machine-config-operator namespace called node-bootstrapper-token, which has the contents of the kubelet's kubeconfig data.

This secret is managed by Kubernetes and is associated with the node-bootstrapper service account. Can the machine-api-controller generate a kubeconfig using a service account token in the same way? I am very leery of laying down the admin kubeconfig in a secret. It is also not clear to me yet why machines need the admin kubeconfig secret at all, particularly worker machines.
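
A minimal sketch of the alternative raised here: minting a scoped, expiring token for a dedicated service account via the TokenRequest API instead of distributing a pre-generated kubeconfig. The service account name and namespace below are hypothetical:

```go
// Sketch: request a bound service account token that could be embedded in a
// kubeconfig handed to a machine, instead of shipping a static credential.
package main

import (
	"context"
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	expiry := int64(3600) // one-hour token
	tr, err := client.CoreV1().ServiceAccounts("openshift-machine-api").CreateToken(
		context.TODO(),
		"worker-kubelet-bootstrap", // hypothetical service account with narrowly scoped RBAC
		&authenticationv1.TokenRequest{
			Spec: authenticationv1.TokenRequestSpec{ExpirationSeconds: &expiry},
		},
		metav1.CreateOptions{},
	)
	if err != nil {
		panic(err)
	}

	// tr.Status.Token would then be written into the kubeconfig for the machine.
	fmt.Println("token expires at:", tr.Status.ExpirationTimestamp)
}
```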

staebler (Contributor) commented Dec 8, 2020

> > I did a little bit of digging and noticed that the contents in the MCO come from a secret in the openshift-machine-config-operator namespace called node-bootstrapper-token, which has the contents of the kubelet's kubeconfig data.

> This secret is managed by Kubernetes and is associated with the node-bootstrapper service account. Can the machine-api-controller generate a kubeconfig using a service account token in the same way? I am very leery of laying down the admin kubeconfig in a secret. It is also not clear to me yet why machines need the admin kubeconfig secret at all, particularly worker machines.

Looking at openshift/cluster-api-provider-baremetal@5391f7e#diff-2635a4094a1b9403011c9044be0b7adff3ed8d6bcec5cb76f215e95f33e04699R605, it seems to me that the kubeconfig is going to be used by the kubelet on the machines. This should almost certainly not be the admin kubeconfig.

kirankt (Contributor, Author) commented Dec 9, 2020

> > I did a little bit of digging and noticed that the contents in the MCO come from a secret in the openshift-machine-config-operator namespace called node-bootstrapper-token, which has the contents of the kubelet's kubeconfig data.

> This secret is managed by Kubernetes and is associated with the node-bootstrapper service account. Can the machine-api-controller generate a kubeconfig using a service account token in the same way? I am very leery of laying down the admin kubeconfig in a secret. It is also not clear to me yet why machines need the admin kubeconfig secret at all, particularly worker machines.

Hi @staebler. Yes, this secret is the kubelet secret, NOT the admin secret as originally planned. Sorry if I wasn't clear. I believe the node-bootstrapper-token one lives in the openshift-machine-config-operator namespace and is the exact same thing as the kubelet kubeconfig which this PR creates in the openshift-machine-api namespace. Thanks.

Here is the proof: 😄

kubeconfigKubelet := &kubeconfig.Kubelet{}

staebler (Contributor) commented Dec 9, 2020

> > > I did a little bit of digging and noticed that the contents in the MCO come from a secret in the openshift-machine-config-operator namespace called node-bootstrapper-token, which has the contents of the kubelet's kubeconfig data.

> > This secret is managed by Kubernetes and is associated with the node-bootstrapper service account. Can the machine-api-controller generate a kubeconfig using a service account token in the same way? I am very leery of laying down the admin kubeconfig in a secret. It is also not clear to me yet why machines need the admin kubeconfig secret at all, particularly worker machines.

> Hi @staebler. Yes, this secret is the kubelet secret, NOT the admin secret as originally planned. Sorry if I wasn't clear. I believe the node-bootstrapper-token one lives in the openshift-machine-config-operator namespace and is the exact same thing as the kubelet kubeconfig which this PR creates in the openshift-machine-api namespace. Thanks.

> Here is the proof: 😄
>
> kubeconfigKubelet := &kubeconfig.Kubelet{}

Oh yeah, I missed that it was using the kubelet kubeconfig. It is still not clear to me why this needs to be laid down by the installer instead of being handled with a service account token by the machine-api-operator.

kirankt (Contributor, Author) commented Jan 18, 2021

Closing this PR as we need to explore ISO-based userdata installs.

kirankt closed this Jan 18, 2021
hardys pushed a commit to hardys/installer that referenced this pull request Mar 8, 2021
This doesn't work for IPI baremetal deployments driven via hive,
because there are firewall rules that prevent access to the
bootstrap MCS from the pod running the installer.

This was implemented in:
openshift#4427

But we ran into problems making the same approach work for
worker machines ref:
openshift#4456

We're now looking at other approaches to resolve the
network-config requirements driving that work, so
switching back to the pointer config for masters seems
reasonable, particularly given this issue discovered for
hive deployments.

Conflicts:
  pkg/tfvars/baremetal/baremetal.go

This reverts commit 98dc381.
hardys pushed a commit to hardys/installer that referenced this pull request Mar 8, 2021
smrowley pushed a commit to smrowley/installer that referenced this pull request Mar 31, 2021
AnnaZivkovic pushed a commit to AnnaZivkovic/installer that referenced this pull request Apr 1, 2022
Labels: platform/baremetal (IPI bare metal hosts platform)

5 participants