Shard Removal Fails Due to Direct mongod Connection Check #2055

@Fluder-Paradyne

Report

When attempting to remove a shard from a sharded Percona Server for MongoDB cluster, the operator's reconcile loop fails. The failure is caused by a pre-delete safety check (checkIfUserDataExistInRS) that attempts to connect directly to the mongod instance of the shard being removed. In a sharded topology, this direct connection is improper and is rejected by MongoDB, causing the operator to error out and preventing the scale-down from completing.

More about the problem

Error log:

2025-09-21T15:13:11.721Z	INFO	Deleting STS component from replst	{"controller": "psmdb-controller", "controllerGroup": "psmdb.percona.com", "controllerKind": "PerconaServerMongoDB", "PerconaServerMongoDB": {"name":"people-psmdb-db","namespace":"mongodb"}, "namespace": "mongodb", "name": "people-psmdb-db", "reconcileID": "0e825148-6106-4fc9-bc25-38fc1ebe8cdd", "sts": "people-psmdb-db-rs2", "rs": "rs2", "port": 27017}
2025-09-21T15:13:11.748Z	ERROR	failed to list databases	{"controller": "psmdb-controller", "controllerGroup": "psmdb.percona.com", "controllerKind": "PerconaServerMongoDB", "PerconaServerMongoDB": {"name":"people-psmdb-db","namespace":"mongodb"}, "namespace": "mongodb", "name": "people-psmdb-db", "reconcileID": "0e825148-6106-4fc9-bc25-38fc1ebe8cdd", "rs": "rs2", "error": "listDatabases: (Unauthorized) You are connecting to a sharded cluster improperly by connecting directly to a shard. Please connect to the cluster via a router (mongos).", "errorVerbose": "(Unauthorized) You are connecting to a sharded cluster improperly by connecting directly to a shard. Please connect to the cluster via a router (mongos).\nlistDatabases\ngithub.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo.(*mongoClient).ListDBs\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/psmdb/mongo/mongo.go:490\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).checkIfUserDataExistInRS\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:859\ngithub.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile\n\t/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:361\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:334\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:294\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:255\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_arm64.s:1223"}
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).checkIfUserDataExistInRS
	/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:861
github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb.(*ReconcilePerconaServerMongoDB).Reconcile
	/go/src/github.com/percona/percona-server-mongodb-operator/pkg/controller/perconaservermongodb/psmdb_controller.go:361
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:119
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:334
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:294
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:255

Steps to reproduce

  1. Deploy a sharded PSMDB cluster with two or more replica sets (e.g., rs0, rs1, rs2).
  2. Follow the official MongoDB procedure to drain a shard by connecting to a mongos instance and successfully running db.adminCommand({ removeShard: "rs2" }).
  3. Modify the PerconaServerMongoDB custom resource to remove the rs2 replica set from the spec.replsets array.
  4. Apply the updated configuration.

Versions

  1. Kubernetes -- 1.32
  2. Operator -- 1.20.0
  3. Database -- 8.0.4-1-multi

Anything else?

[direct: mongos] config> db.adminCommand({ removeShard: "rs2" });
{
  msg: 'removeshard completed successfully',
  state: 'completed',
  shard: 'rs2',
  ok: 1,
  '$clusterTime': {
    clusterTime: Timestamp({ t: 1758467128, i: 17 }),
    signature: {
      hash: Binary.createFromBase64('dv0FyFVESLFebpbU2Is3YB6bJlI=', 0),
      keyId: Long('7541359738756268055')
    }
  },
  operationTime: Timestamp({ t: 1758467128, i: 17 })
}
