Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wait condition script for nginx ingress #2293

Open
aperrot42 opened this issue Jun 1, 2021 · 19 comments · May be fixed by #3773
Open

Fix wait condition script for nginx ingress #2293

aperrot42 opened this issue Jun 1, 2021 · 19 comments · May be fixed by #3773
Assignees
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@aperrot42
Copy link

aperrot42 commented Jun 1, 2021

What would you like to be documented:

Fix wait for ingress script in https://kind.sigs.k8s.io/docs/user/ingress/#ingress-nginx

Why is this needed:

This is needed because with a slow internet connection and a slow computer
ingress-nginx-admission-create and ingress-nginx-admission-patch jobs take some time to complete.

While they are not completed the wait command provided in the documentation returns after a few seconds with

error: no matching resources found

This is because we first have to wait for ingress-nginx-admission-patch to complete before waiting for the controller to be ready.

I would propose to add a first wait command to wait for the jobs to complete in this documentation under the nginx section :

{{< codeFromInline lang="bash" >}} kubectl wait --namespace ingress-nginx 
--for=condition=complete job/ingress-nginx-admission-patch 
--timeout=30s 
&& 
kubectl wait --namespace ingress-nginx 
--for=condition=ready pod 
--selector=app.kubernetes.io/component=controller 
--timeout=90s {{< /codeFromInline >}}
@aperrot42 aperrot42 added the kind/documentation Categorizes issue or PR as related to documentation. label Jun 1, 2021
@aojea
Copy link
Contributor

aojea commented Jun 2, 2021

it doesn't matter to wait for the first job to complete, that will be the same as using 120s instead of 90s, right?

kubectl wait --namespace ingress-nginx 
--for=condition=ready pod 
--selector=app.kubernetes.io/component=controller 
--timeout=120s

@amwat
Copy link
Contributor

amwat commented Jun 2, 2021

yeah +1 to what @aojea said, we don't want to necessarily keep on unravelling the dependency graph there, but rather provide a simple single command that users can run.

I'm also not sure how much we want to fine tune this. 90s was a random reasonably small value to wait for.
I think the main gist is to "wait for Ingress controller to be ready" which can take an unpredictable amount of time based on the system.

We can probably just bump the timeout to something very high (but also risk catching actual misconfigurations late).

@aperrot42
Copy link
Author

aperrot42 commented Jun 3, 2021 via email

@aojea
Copy link
Contributor

aojea commented Jun 3, 2021

🤔 Checking the manifest the pod controller is created by s deployment object, and the job runs in parallel https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/static/provider/kind/deploy.yaml

Let's ask the author @aledbf 😉, he will solve any doubts and he can advice us on which is the best solution

@aperrot42
Copy link
Author

Here is the output if I just wait for the pod to be ready without waiting for the patch job to be completed first :

## Add nginx ingress

namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
clusterrole.rbac.authorization.k8s.io/ingress-nginx created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx created
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission created
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created
error: no matching resources found

And error: no matching resources found is the output of
kubectl wait --namespace ingress-nginx --for=condition=Ready pod --selector=app.kubernetes.io/component=controller --timeout=600s. It happens much earlier than the timeout (after 5 seconds I would say).

@aledbf
Copy link
Member

aledbf commented Jun 3, 2021

error: no matching resources found

This error means the ingress-nginx deployment has not been scheduled. Is not related to the --timeout flag.
I can reproduce this issue only if my laptop is under heavy load during the execution of kubectl commands.

@aperrot42
Copy link
Author

aperrot42 commented Jun 3, 2021 via email

@aojea
Copy link
Contributor

aojea commented Jun 3, 2021

Can it be scheduled before the patch job completes ?

The deployment resource is independent of the job resource, each of them are managed by different controllers, so I guess that are independent ... another thing is that the logic on the pod depends on the job execution, but we should only care that the pod is scheduled

@aperrot42
Copy link
Author

aperrot42 commented Jun 7, 2021

Indeed, it seems like this job ingress-nginx/templates/admission-webhooks/job-patch/job-createSecret.yaml

must have created this secret : ingress-nginx-admission

For this volume to be mounted in the controller :

      volumes:
        - name: webhook-cert
          secret:
            secretName: ingress-nginx-admission

@stefanondisponibile
Copy link

is there any fix for this?

@BenTheElder
Copy link
Member

cc @rikatz WDYT? 🙏

@rikatz
Copy link
Contributor

rikatz commented Jul 29, 2021

/assign
/cc
Let me check this asap :)

@rikatz
Copy link
Contributor

rikatz commented Jul 29, 2021

So, for what I could read here, the example waits only for the controller, and the controller depends on the admission jobs (which are only responsible to generate the certificates) runs. And this reflects on the deployment that needs to wait for the volume (the certificates) to be mounted.

I guess there's no "right" solution, but I agree with Antonio that the idea of the documentation is actually to be a quick start and trying to cover every user case is hard.

For example, I keep forgetting to add the label ingress-ready=true in my nodes, but I don't think we should reproduce the same instruction again below! :)

I think increasing to 120s is reasonable (or even 180s, as Contour uses it in its example above, in the same page).

@aojea
Copy link
Contributor

aojea commented Jul 30, 2021

I think increasing to 120s is reasonable (or even 180s, as Contour uses it in its example above, in the same page).

That is a fair point, better if we have the same timeouts in all of them

@rikatz
Copy link
Contributor

rikatz commented Jul 30, 2021

so who wants to open the PR? @aperrot42 do you wanna make a contribution? :)

@BenTheElder BenTheElder added good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Jul 30, 2021
@BenTheElder BenTheElder added the sig/testing Categorizes an issue or PR as relevant to SIG Testing. label Jul 30, 2021
@aperrot42
Copy link
Author

aperrot42 commented Jul 30, 2021 via email

@aojea
Copy link
Contributor

aojea commented Jul 30, 2021

as other ingress controller ?

yep, I take back my comment, is not fair to have one waiting 90s and other 180s ... but please let's not bump timeouts anymore 😛

@rikatz
Copy link
Contributor

rikatz commented Jul 31, 2021

@aperrot42 let me know if you need some help with this PR. Just ping me in Slack :)

@graham2071
Copy link

graham2071 commented Nov 23, 2021

Hi there, maybe you can wait for the pod to be created before waiting for the controller to be ready:

### Wait ingress nginx pod creation
echo -n "Wait for pod app.kubernetes.io/component=controller to be created."
while : ; do
  [ ! -z "`kubectl -n ingress-nginx get pod --selector=app.kubernetes.io/component=controller 2> /dev/null`" ] && echo && break
  sleep 2
  echo -n "."
done
### Wait pod is ready
timeout="600s"
echo -n "Wait for pod app.kubernetes.io/component=controller to be ready (timeout=$timeout)..."
kubectl wait  --namespace ingress-nginx \
              --for=condition=ready pod \
              --selector=app.kubernetes.io/component=controller \
              --timeout=$timeout
echo

That does the trick for me when working with a slow connection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/documentation Categorizes issue or PR as related to documentation. sig/testing Categorizes an issue or PR as relevant to SIG Testing.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants