Retrying compartmentless containers #1608

Open
wants to merge 2 commits into
base: experimental-windows-ambient

@grnmeira grnmeira commented Aug 7, 2025

Problem

With containerd and Windows' HCS, a pod's namespace does not yet have a compartment ID when the CNI signals ztunnel about a workload that has just been created. Without that ID, ztunnel can't create its proxy sockets inside the pod's network compartment. The compartment ID only becomes available after ztunnel signals the CNI that the workload has been assimilated by ztunnel, which in turn can't happen without a compartment ID 🫤.

What this PR does

This PR mitigates the problem by having ztunnel reply with an ACK to the CNI during the ADD operation, even though the workload's proxies haven't been instantiated inside ztunnel yet. After a timeout, ztunnel tries to add the workload again. By then the HCS API returns a valid compartment ID, so the proxy can be created.

A more detailed flow looks like:

  1. CNI sends an ADD command to ztunnel
  2. ztunnel queries HCS and checks whether a compartment ID is available for that pod
    2.1 If the compartment ID is available, we create a proxy for the workload and ACK the CNI.
    2.2 If the compartment ID is not available, we mark the workload as pending inside ztunnel and ACK the CNI.
  3. The CNI receives an ACK and the pod creation proceeds.
  4. After a timeout, ztunnel tries to add the pending workload again.
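
The flow above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Manager`, `WorkloadState`, `Reply`, `handle_add`, and `retry_pending` are hypothetical names, and the `lookup` closure stands in for the HCS query that may or may not return a compartment ID yet.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a Windows network compartment ID.
type CompartmentId = u32;

/// What the manager records for each workload after an ADD (illustrative only).
#[derive(Debug, PartialEq)]
enum WorkloadState {
    /// A proxy was created inside this compartment.
    Proxied(CompartmentId),
    /// No compartment ID was available yet; the workload will be retried.
    Pending,
}

/// Reply sent back to the CNI. In this flow, ADD is ACKed in both cases.
#[derive(Debug, PartialEq)]
enum Reply {
    Ack,
}

#[derive(Default)]
struct Manager {
    workloads: HashMap<String, WorkloadState>,
}

impl Manager {
    /// Steps 1 to 3: handle an ADD. `lookup` is an assumed stand-in for the HCS query.
    fn handle_add(
        &mut self,
        uid: &str,
        lookup: impl Fn(&str) -> Option<CompartmentId>,
    ) -> Reply {
        let state = match lookup(uid) {
            Some(id) => WorkloadState::Proxied(id), // step 2.1
            None => WorkloadState::Pending,         // step 2.2
        };
        self.workloads.insert(uid.to_string(), state);
        Reply::Ack
    }

    /// Step 4: after a timeout, try every pending workload again.
    fn retry_pending(&mut self, lookup: impl Fn(&str) -> Option<CompartmentId>) {
        for (uid, state) in self.workloads.iter_mut() {
            if *state == WorkloadState::Pending {
                if let Some(id) = lookup(uid.as_str()) {
                    *state = WorkloadState::Proxied(id);
                }
            }
        }
    }
}
```

In this sketch a workload that is still without a compartment ID simply stays `Pending` and is picked up by the next call to `retry_pending`.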

How the PR does it

It introduces a queue of events that runs in the same thread where ZDS commands are processed. The events are handled in a simple way for now, but the queue can be extended later to cover other "internal events" beyond the current retries of pending workloads.
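
One way to picture such a queue is a min-heap of events keyed by their due time, polled from the same loop that reads commands. The sketch below uses only the standard library. `EventQueue`, `InternalEvent`, `schedule`, `next_timeout`, and `pop_due` are hypothetical names for illustration and are not taken from this PR.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;
use std::time::{Duration, Instant};

/// Hypothetical internal events. The description only names retries of
/// pending workloads; other variants could be added later.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum InternalEvent {
    RetryPending { uid: String },
}

/// Min-heap of events ordered by the instant at which they become due.
#[derive(Default)]
struct EventQueue {
    heap: BinaryHeap<Reverse<(Instant, InternalEvent)>>,
}

impl EventQueue {
    /// Queue `event` to fire at `due`.
    fn schedule(&mut self, due: Instant, event: InternalEvent) {
        self.heap.push(Reverse((due, event)));
    }

    /// Time left until the earliest event, or `None` if the queue is empty.
    /// A command loop could use this as the timeout for its next wait.
    fn next_timeout(&self, now: Instant) -> Option<Duration> {
        self.heap
            .peek()
            .map(|Reverse((due, _))| due.saturating_duration_since(now))
    }

    /// Remove and return every event whose due time is at or before `now`.
    fn pop_due(&mut self, now: Instant) -> Vec<InternalEvent> {
        let mut due_events = Vec::new();
        while self
            .heap
            .peek()
            .map_or(false, |Reverse((due, _))| *due <= now)
        {
            if let Some(Reverse((_, event))) = self.heap.pop() {
                due_events.push(event);
            }
        }
        due_events
    }
}
```

Because the queue is only touched from the command-processing thread, this sketch needs no locking. The loop would wait for a command for at most `next_timeout`, then handle whatever `pop_due` returns.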

Caveats

We keep retrying compartmentless workloads even if they fail in a non-retriable way (to be fixed in a different PR). There is currently no way to tell the CNI, after the ADD command has been ACKed, that ztunnel actually failed to assimilate the workload.

@grnmeira grnmeira requested a review from a team as a code owner August 7, 2025 14:15
@istio-testing istio-testing added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 7, 2025
@grnmeira grnmeira added the windows Experimental Windows support label Aug 8, 2025
@@ -339,6 +362,98 @@ impl WorkloadProxyManagerState {
}
}

pub async fn retry_comparmentless(&mut self, poddata: &WorkloadData) -> Result<(), Error> {

compartmentless*

@@ -34,8 +34,7 @@ mod workloadmanager;
#[cfg(any(test, feature = "testing"))]
pub mod test_helpers;


-#[derive(Debug)]
+#[derive(Debug, Clone)]

Are we sure we want to clone this?

Comment on lines +65 to +66
pub fn proxy_pending(&self, uid: &crate::inpod::WorkloadUid, workload_info: &WorkloadInfo) {
let mut state = self.state.write().unwrap();

Does it make sense to merge these?

@@ -36,14 +34,14 @@ impl InPodConfig {
..cfg.socket_config
};
Ok(InPodConfig {
-cur_namespace: InpodNamespace::current()?,
+cur_namespace: NetworkNamespace::current()?,

Why the rename?

+1, the current naming makes no distinction between specific namespaces. Does this rename add value?
