CI fluke: two pods created for a ReplicaSet with one replica #304

Open
@dtantsur

Observed in https://github.com/metal3-io/ironic-standalone-operator/actions/runs/16174585591/attempts/1?pr=303

Two Ironic pods are created, and both progress normally since they are on different nodes. The tests fail because httpd is not ready in one of them, but the same problem may well occur in other runs without causing a failure.

Relevant kube-scheduler logs:

I0709 16:44:28.771186       1 cache.go:504] "Pod was added to a different node than it was assumed" podKey="6503fbad-ccc5-4ea7-babc-9b68132dfc9b" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs" assumedNode="minikube" currentNode="minikube-m03"
I0709 16:44:28.788794       1 cache.go:504] "Pod was added to a different node than it was assumed" podKey="ac1a8982-0dfc-46da-a5cb-6dfde9ca5e99" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh" assumedNode="minikube-m03" currentNode="minikube"
I0709 16:44:28.812594       1 cache.go:504] "Pod was added to a different node than it was assumed" podKey="0fba088b-7c02-4a37-91f7-1c6eccfb6deb" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt" assumedNode="minikube-m02" currentNode="minikube-m03"
E0709 16:44:28.830483       1 framework.go:1317] "Plugin Failed" err="Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-f7bcs\": pod test-ironic-service-d9cb7cbb7-f7bcs is already assigned to node \"minikube\"" plugin="DefaultBinder" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs" node="minikube-m03"
E0709 16:44:28.830792       1 schedule_one.go:347] "scheduler cache ForgetPod failed" err="pod 6503fbad-ccc5-4ea7-babc-9b68132dfc9b(test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs) was assumed on minikube-m03 but assigned to minikube" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs"
E0709 16:44:28.831020       1 schedule_one.go:1046] "Error scheduling pod; retrying" err="running Bind plugin \"DefaultBinder\": Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-f7bcs\": pod test-ironic-service-d9cb7cbb7-f7bcs is already assigned to node \"minikube\"" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs"
E0709 16:44:28.830670       1 framework.go:1317] "Plugin Failed" err="Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-8lzrh\": pod test-ironic-service-d9cb7cbb7-8lzrh is already assigned to node \"minikube-m03\"" plugin="DefaultBinder" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh" node="minikube"
E0709 16:44:28.831060       1 schedule_one.go:347] "scheduler cache ForgetPod failed" err="pod ac1a8982-0dfc-46da-a5cb-6dfde9ca5e99(test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh) was assumed on minikube but assigned to minikube-m03" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh"
E0709 16:44:28.831074       1 schedule_one.go:1046] "Error scheduling pod; retrying" err="running Bind plugin \"DefaultBinder\": Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-8lzrh\": pod test-ironic-service-d9cb7cbb7-8lzrh is already assigned to node \"minikube-m03\"" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh"
I0709 16:44:28.831090       1 schedule_one.go:1059] "Pod has been assigned to node. Abort adding it back to queue." pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-8lzrh" node="minikube-m03"
I0709 16:44:28.831050       1 schedule_one.go:1059] "Pod has been assigned to node. Abort adding it back to queue." pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-f7bcs" node="minikube"
E0709 16:44:28.830613       1 framework.go:1317] "Plugin Failed" err="Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-ntzxt\": pod test-ironic-service-d9cb7cbb7-ntzxt is already assigned to node \"minikube-m02\"" plugin="DefaultBinder" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt" node="minikube-m03"
E0709 16:44:28.832189       1 schedule_one.go:347] "scheduler cache ForgetPod failed" err="pod 0fba088b-7c02-4a37-91f7-1c6eccfb6deb(test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt) was assumed on minikube-m03 but assigned to minikube-m02" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt"
E0709 16:44:28.832328       1 schedule_one.go:1046] "Error scheduling pod; retrying" err="running Bind plugin \"DefaultBinder\": Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-ntzxt\": pod test-ironic-service-d9cb7cbb7-ntzxt is already assigned to node \"minikube-m02\"" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt"
I0709 16:44:28.832665       1 schedule_one.go:1059] "Pod has been assigned to node. Abort adding it back to queue." pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-ntzxt" node="minikube-m02"
E0709 16:44:30.229970       1 framework.go:1317] "Plugin Failed" err="Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-j6sr2\": pod test-ironic-service-d9cb7cbb7-j6sr2 is already assigned to node \"minikube-m02\"" plugin="DefaultBinder" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-j6sr2" node="minikube-m02"
E0709 16:44:30.230018       1 schedule_one.go:347] "scheduler cache ForgetPod failed" err="pod aa4dcca3-ed79-442e-a2ab-a00cc7a4f9b4(test-disabled-downloader/test-ironic-service-d9cb7cbb7-j6sr2) wasn't assumed so cannot be forgotten" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-j6sr2"
E0709 16:44:30.230057       1 schedule_one.go:1046] "Error scheduling pod; retrying" err="running Bind plugin \"DefaultBinder\": Operation cannot be fulfilled on pods/binding \"test-ironic-service-d9cb7cbb7-j6sr2\": pod test-ironic-service-d9cb7cbb7-j6sr2 is already assigned to node \"minikube-m02\"" pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-j6sr2"
I0709 16:44:30.230079       1 schedule_one.go:1059] "Pod has been assigned to node. Abort adding it back to queue." pod="test-disabled-downloader/test-ironic-service-d9cb7cbb7-j6sr2" node="minikube-m02"
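
To make the pattern in these logs easier to spot in other runs, here is a minimal sketch that pulls the assumed and actual node for each pod out of the "Pod was added to a different node than it was assumed" lines. It only assumes the klog `key="value"` layout shown above; the function name `node_mismatches` and the script name used below are hypothetical, not part of any existing tooling.

```python
import re

# Matches key="value" pairs in klog structured output, allowing escaped quotes.
KV_PATTERN = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')
MISMATCH_MSG = "Pod was added to a different node than it was assumed"


def node_mismatches(lines):
    """Map each pod to (assumedNode, currentNode) from cache mismatch lines."""
    result = {}
    for line in lines:
        if MISMATCH_MSG not in line:
            continue  # ignore bind errors and other scheduler messages
        fields = dict(KV_PATTERN.findall(line))
        result[fields["pod"]] = (fields["assumedNode"], fields["currentNode"])
    return result


if __name__ == "__main__":
    import sys

    # Read scheduler logs from stdin and print one summary line per pod.
    for pod, (assumed, current) in node_mismatches(sys.stdin).items():
        print(f"{pod}: assumed={assumed} current={current}")
```

Saved as, say, `mismatches.py`, it can be fed the scheduler log on stdin, for example `kubectl logs -n kube-system <kube-scheduler pod> | python3 mismatches.py`.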

I think a similar problem may be affecting the upgrade jobs, but I have no proof of that.

Metadata

Labels: kind/ci, needs-triage
