[ZK] Triggering validation plan returns an error for zookeeper operator #308

Open
@rishabh96b

Description

The validation plan of the ZooKeeper operator does not run properly, yet it is marked as COMPLETE. The plan status and detailed logs are below.

└── zookeeper-instance (Operator-Version: "zookeeper-3.4.14-0.3.1" Active-Plan: "validation")
    ├── Plan deploy (serial strategy) [NOT ACTIVE]
    │   ├── Phase zookeeper (parallel strategy) [NOT ACTIVE]
    │   │   └── Step deploy [NOT ACTIVE]
    │   └── Phase validation (serial strategy) [NOT ACTIVE]
    │       ├── Step validation [NOT ACTIVE]
    │       └── Step cleanup [NOT ACTIVE]
    ├── Plan not-allowed (serial strategy) [NOT ACTIVE]
    │   └── Phase not-allowed (serial strategy) [NOT ACTIVE]
    │       └── Step not-allowed [NOT ACTIVE]
    └── Plan validation (serial strategy) [COMPLETE], last updated 2021-01-04 20:10:40
        └── Phase connection (serial strategy) [COMPLETE]
            ├── Step connection [COMPLETE]
            └── Step cleanup [COMPLETE]

Command

kubectl kudo plan trigger --name=validation --instance=zookeeper-instance

The kudo-controller logs are flooded with the following messages:

2021/01/04 14:20:10 HealthUtil: unknown type *v1beta1.PodDisruptionBudget is marked healthy by default
2021/01/04 14:20:10 HealthUtil: statefulset "zookeeper-instance-zookeeper" is not healthy: Waiting for 1 pods to be ready...
2021/01/04 14:20:10 TaskExecution: object default/zookeeper-instance-zookeeper is NOT healthy: statefulset "zookeeper-instance-zookeeper" is not healthy: Waiting for 1 pods to be ready...
2021/01/04 14:20:10 PlanExecution: 'deploy' step(s) (instance: default/zookeeper-instance) of the deploy.zookeeper are not ready
2021/01/04 14:20:10 InstanceController: Received Reconcile request for instance default/zookeeper-instance

The plan is supposed to trigger a job that prints the ZooKeeper URI, but no job appears to be created. The controller logs report:

 HealthUtil: job "zookeeper-instance-validation" still running or failed
2021/01/04 14:20:28 TaskExecution: object default/zookeeper-instance-validation is NOT healthy: job "zookeeper-instance-validation" still running or failed
2021/01/04 14:20:28 PlanExecution: 'validation' task(s) (instance: default/zookeeper-instance) of the deploy.validation.validation are not ready
2021/01/04 14:20:28 PlanExecution: 'validation,cleanup' step(s) (instance: default/zookeeper-instance) of the deploy.validation are not ready
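
To see at a glance which objects the controller considers unhealthy, the `TaskExecution` lines can be filtered out of the flood. This is only a minimal sketch: `unhealthy_objects` is a hypothetical helper name, and it assumes the log format stays exactly as in the lines quoted above.

```shell
# Hypothetical helper (not part of KUDO): reads controller log text on stdin
# and prints each distinct object that a TaskExecution line reports as NOT healthy.
unhealthy_objects() {
  # Keep only the namespace/name between "object " and " is NOT healthy".
  sed -n 's/.*TaskExecution: object \([^ ]*\) is NOT healthy.*/\1/p' | sort -u
}

# Usage against a saved copy of the controller log
# (the file name controller.log is an assumption):
#   unhealthy_objects < controller.log
```

For the log lines in this issue it would list the StatefulSet and the validation job, which matches what the messages above say individually.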

The zookeeper-instance StatefulSet appears to be healthy. Its pod logs show the ruok health-check command being processed:

2021-01-04 14:24:16,272 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@222] - Accepted socket connection from /127.0.0.1:39720
2021-01-04 14:24:16,272 [myid:3] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@908] - Processing ruok command from /127.0.0.1:39720
2021-01-04 14:24:16,273 [myid:3] - INFO  [Thread-290:NIOServerCnxn@1056] - Closed socket connection for client /127.0.0.1:39720 (no session established for client)

Lastly, I am also seeing TLS handshake errors in the controller logs, preceded by a conflict when updating the instance status:

2021/01/04 14:20:31 InstanceController: Error when updating instance status. Operation cannot be fulfilled on instances.kudo.dev "zookeeper-instance": the object has been modified; please apply your changes to the latest version and try again
2021/01/04 14:20:32 InstanceController: Received Reconcile request for instance default/zookeeper-instance
2021/01/04 14:20:32 Computing health out of 0 Deployments, 0 ReplicaSets, 1 StatefulSets, 0 DaemonSets, 3 Pods
2021/01/04 14:20:32 Updating instance default/zookeeper-instance readiness to: true
2021/01/04 14:20:32 InstanceController: Readiness did not change for default/zookeeper-instance. Not updating.
2021/01/04 14:20:32 http: TLS handshake error from 10.0.130.81:56732: EOF
2021/01/04 14:20:42 http: TLS handshake error from 10.0.130.81:56844: EOF
...

KUDO Version

KUDO Version: version.Info{GitVersion:"0.17.2", GitCommit:"d902714c", BuildDate:"2020-11-16T20:34:11Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64", KubernetesClientVersion:"v0.19.2"}

I also tried this with KUDO version 0.17.0 and got the same error.

Labels: bug
