
Consistency issues due to the use of mount binds #74

raskyld opened this issue Dec 6, 2024 · 5 comments


raskyld commented Dec 6, 2024

Hello,

We ran into a really tricky edge case with our use of https://github.com/cert-manager/csi-driver-spiffe.
Unlike most other Kubernetes CSI drivers, it uses a single tmpfs that contains both the volume metadata and the SPIFFE SVID certificates (x509 certificates issued by cert-manager). It then answers NodePublishVolume CSI RPC requests by creating a read-only bind mount from this tmpfs to the target_path given by the Container Orchestrator (in our case, Kubernetes).
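For readers unfamiliar with that flow, here is a rough sketch of what such a publish step looks like (my own illustration, not the driver's actual code; the package, function, and variable names are made up), using k8s.io/mount-utils:

```go
package driver

import (
	"fmt"

	mount "k8s.io/mount-utils"
)

// publishVolume bind mounts a per-volume directory living inside the driver's
// tmpfs onto the target_path that kubelet passes in the NodePublishVolume
// request, read-only so the workload cannot modify the certificates.
func publishVolume(mounter mount.Interface, tmpfsVolumeDir, targetPath string) error {
	if err := mounter.Mount(tmpfsVolumeDir, targetPath, "", []string{"bind", "ro"}); err != nil {
		return fmt.Errorf("bind mounting %s to %s: %w", tmpfsVolumeDir, targetPath, err)
	}
	return nil
}
```

If the driver pod restarts and recreates its tmpfs, any existing bind mount in a pod's mount namespace keeps pointing at the old, now-orphaned tmpfs, which is exactly the staleness described below.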

The issue is that, after a restart or failure of the driver, the bind mount no longer points to a valid tmpfs directory, so from the point of view of the pods the certificates are never renewed. One could argue this is fine as long as the csi-driver-spiffe instance is never restarted, but the problem is a bit more subtle than that!

Since this driver uses RequiresRepublish, kubelet issues a lot of repeated calls to NodePublishVolume, and any failure leads to an unmount, but this unmount is not propagated to the mount namespace of the running pods! Users are then forced to restart their pods manually.

This is a real problem when used with SPIFFE SVIDs, since they are typically short-lived and rotated often. A failure in this critical rotation mechanism leads to major workload disruptions.

I am revamping parts of this library in a fork to:

  1. Split metadata and data: metadata ends up in a local tmpfs, and data gets its own tmpfs mounted directly at target_path, where we perform atomic updates. No more bind mounts, so it is simpler to manage.
  2. If the target_path is already correctly mounted, avoid unmounting or making any other destructive changes, since running pods would not see them.
  3. Improve the IsMounted detection to ensure the target is not just any mount point but one set up as this library expects (see the sketch after this list).
  4. Add some metrics / stats collection.
  5. Change error reporting in the CSI RPCs, since errors there have side effects; see kubernetes/kubernetes#121271 (comment): "Kubelet deletes CSI mount points if a 'requiresrepublish' call to NodePublishVolume returns an error".
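As a sketch of point 3 (under my fork's assumptions, not csi-lib's current API), the idea is to verify not only that target_path is a mount point, but that it is the kind of mount this library would have created:

```go
package driver

import (
	mount "k8s.io/mount-utils"
)

// isExpectedMount returns true only if targetPath is currently a mount point
// *and* the mount looks like one created by this driver (a tmpfs here, as a
// hypothetical policy; a real check could also compare the mount source).
func isExpectedMount(targetPath string) (bool, error) {
	infos, err := mount.ParseMountInfo("/proc/self/mountinfo")
	if err != nil {
		return false, err
	}
	for _, mi := range infos {
		if mi.MountPoint == targetPath {
			return mi.FsType == "tmpfs", nil
		}
	}
	// Not a mount point at all.
	return false, nil
}
```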

Please tell me if any of these changes seems too risky, or if there is a good reason to use a bind mount instead of two separate mounts, etc.

Also, tell me if you are interested in getting that merged upstream :)

Thanks 🙏

munnerz (Member) commented Dec 6, 2024

Have you looked at how the csi-lib example handles this? https://github.com/cert-manager/csi-lib/blob/main/deploy/cert-manager-csi-driver.yaml#L104

Namely, we use a hostPath volume to pass through a directory from the host, and on startup csi-lib mounts a tmpfs into a directory inside it, with bidirectional mount propagation, allowing the temporary filesystem to persist across restarts of the CSI driver pod 😊

Hope that makes sense - this is the recommended way to configure/run drivers built on csi-lib.
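For reference, that startup step looks roughly like the following (a simplified sketch, not csi-lib's exact code): the data directory sits on a hostPath volume, and the driver mounts a tmpfs over it so secrets stay off the node's disk, while the mount survives driver pod restarts thanks to Bidirectional propagation:

```go
package driver

import (
	mount "k8s.io/mount-utils"
)

// ensureTmpfsDataDir mounts a tmpfs at dataDir unless a previous instance of
// the driver already did so (and the mount was propagated back via the
// Bidirectional hostPath volume), in which case it is simply reused.
func ensureTmpfsDataDir(dataDir string) error {
	mounter := mount.New("")
	notMnt, err := mounter.IsLikelyNotMountPoint(dataDir)
	if err != nil {
		return err
	}
	if !notMnt {
		// Already mounted by a previous instance: reuse it.
		return nil
	}
	return mounter.Mount("tmpfs", dataDir, "tmpfs", []string{"nosuid", "noexec"})
}
```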

munnerz (Member) commented Dec 6, 2024

It seems like you are already using a hostPath as recommended, but have not enabled Bidirectional mount propagation (meaning the tmpfs is not available via the hostPath in the new container, since that mount point only exists within the old one). Making it bidirectional allows the host to see the mount point too, which makes it accessible to any other containers that later start up and reference the same hostPath.
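Expressed with the Kubernetes API types rather than the actual deployment manifest (the volume name and path below are placeholders), the volumeMount in question would carry Bidirectional propagation like this:

```go
package driver

import (
	corev1 "k8s.io/api/core/v1"
)

// dataDirVolumeMount mounts the hostPath-backed data directory with
// Bidirectional propagation, so mounts created inside the driver container
// (such as the tmpfs above) become visible on the host and to containers
// started later that reference the same hostPath.
func dataDirVolumeMount() corev1.VolumeMount {
	bidirectional := corev1.MountPropagationBidirectional
	return corev1.VolumeMount{
		Name:             "csi-data-dir",  // placeholder volume name
		MountPath:        "/csi-data-dir", // placeholder path inside the container
		MountPropagation: &bidirectional,
	}
}
```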

raskyld (Author) commented Dec 6, 2024

Hello!

Thanks a lot for your prompt answer! 🙏
First, a disclaimer: I am not affiliated with the maintainers of the linked CSI driver, so I don't know the motivations behind some implementation details 😬

So, just to be sure I understand: the tmpfs they mount at start time in the CSI data dir is not propagated to the host, which means that when the instance restarts, the tmpfs is lost from the host's point of view, right?

But then, is this bind mount from the tmpfs (mounted inside the hostPath volume with bidirectional propagation) to /var/lib/kubelet/pods/... (i.e., the target_path of the NodePublishVolume call) something that is used in other projects as well?

raskyld (Author) commented Dec 6, 2024

I think what I have a hard time understanding is how this bind mount (made inside the context of the driver pod) then propagates to the host. Will Bidirectional propagation also persist the bind mount from the host's point of view?

EDIT: Just realised it may be propagated via this line instead: https://github.com/cert-manager/csi-lib/blob/main/deploy/cert-manager-csi-driver.yaml#L101

Then I don't even understand how the kernel dealt with the bind mount, since the host was not aware of the "source" of the bind mount 🤯

raskyld (Author) commented Dec 6, 2024

I have a second question, more about RequiresRepublish: is it something battle-tested and/or widely used with csi-lib?
In my understanding, any failure occurring at this level https://github.com/cert-manager/csi-lib/blob/main/driver/nodeserver.go#L51 will lead to an unmount, which should typically be avoided when using RequiresRepublish because it leaves pods with a "stale" mount point.

CSI drivers should only atomically update the contents of the volume. Mount point change will not be seen by a running container.

(Source: https://kubernetes-csi.github.io/docs/csi-driver-object.html under requiresRepublish notes)
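To make the quoted requirement concrete, here is a minimal sketch of the "atomically update the contents" pattern (my own illustration, not csi-lib's actual implementation): write the new data to a temporary file on the same filesystem as the volume, then rename it over the old file, so readers only ever see the old or the new content, never a partial write.

```go
package driver

import (
	"os"
	"path/filepath"
)

// atomicWriteFile replaces volumeDir/name with data without ever exposing a
// partially written file to readers in the pod's mount namespace.
func atomicWriteFile(volumeDir, name string, data []byte) error {
	tmp, err := os.CreateTemp(volumeDir, name+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // best-effort cleanup if we fail before the rename

	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	// rename(2) is atomic within a filesystem: the path flips from the old
	// content to the new in one step, even for already-running containers.
	return os.Rename(tmp.Name(), filepath.Join(volumeDir, name))
}
```

For a set of files that must change together, the same idea extends to writing a whole new directory and swapping a symlink, the way kubelet's atomic writer handles Secret and ConfigMap volumes.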
