Physical backup restore stuck on version 1.20.1 #1995

@tomsozolins

Report

Restoring from a physical backup with point-in-time recovery (pitr type: latest) gets stuck: the restore object never leaves the requested state. The cluster is sharded, and sharding is enabled on the collection.

➜ k describe PerconaServerMongoDBRestore
Name:         restore1
Namespace:    demo-mongodb
Labels:       <none>
Annotations:  <none>
API Version:  psmdb.percona.com/v1
Kind:         PerconaServerMongoDBRestore
Metadata:
  Creation Timestamp:  2025-07-08T10:02:57Z
  Generation:          1
  Resource Version:    9536705
  UID:                 5dd0c8c8-f3b6-4481-8f68-54104afc552c
Spec:
  Backup Name:   backup1
  Cluster Name:  demo-psmdb-db
  Pitr:
    Type:  latest
Status:
  Pbm Name:     2025-07-08T10:10:12.768452917Z
  Pitr Target:  2025-07-08T08:54:01
  State:        requested
Events:         <none>
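
For repeated checks, the state shown in the describe output above can be read on its own. This is a minimal sketch, assuming kubectl is configured for the cluster and that the field is exposed as .status.state, as the Status block above suggests. The helper name restore_state is hypothetical.

```shell
# Print the current state of a PerconaServerMongoDBRestore object.
# Usage: restore_state <restore-name> <namespace>
restore_state() {
  kubectl get perconaservermongodbrestores.psmdb.percona.com "$1" \
    --namespace "$2" \
    --output jsonpath='{.status.state}'
}
```

For the object in this report, the call would be: restore_state restore1 demo-mongodb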

More about the problem

The operator starts the restore procedure and gets stuck after this log line:

2025-07-08T10:10:12.789Z	INFO	Restore state changed	{"controller": "psmdbrestore-controller", "controllerGroup": "psmdb.percona.com", "controllerKind": "PerconaServerMongoDBRestore", "PerconaServerMongoDBRestore": {"name":"restore1","namespace":"demo-mongodb"}, "namespace": "demo-mongodb", "name": "restore1", "reconcileID": "a38c71a6-7b43-4d90-aa94-6e31f2136a55", "previous": "waiting", "current": "requested"}

The database is never restored and the cluster stays in the initializing state. Restarting the operator deployment does not help: the operator does not try to continue the restore process.
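
To see every state transition the operator logged for a restore, the log can be filtered for entries like the one above. This is a rough sketch that assumes the log format shown in this report ("Restore state changed" with "name", "previous" and "current" fields). The helper name restore_transitions is hypothetical.

```shell
# Read operator log lines on stdin and print "<previous> -> <current>"
# for every "Restore state changed" entry of the given restore.
# Usage: <log source> | restore_transitions <restore-name>
restore_transitions() {
  grep 'Restore state changed' \
    | grep "\"name\": \"$1\"" \
    | sed -n 's/.*"previous": "\([^"]*\)", "current": "\([^"]*\)".*/\1 -> \2/p'
}
```

A possible invocation, where the deployment name is a placeholder for whatever the operator deployment is called in the cluster: kubectl logs deployment/<operator-deployment> -n demo-mongodb | restore_transitions restore1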

Steps to reproduce

  1. Create the database cluster
  replsets:
  rs0:
    size: 3
    serviceAccountName: psmdb-operator
    resources:
      limits:
        cpu: 300m
        memory: 1024Mi
      requests:
        cpu: 150m
        memory: 512Mi
    volumeSpec:
      pvc:
        storageClassName: gp3
        resources:
          requests:
            storage: 4Gi
    arbiter:
      enabled: false
      size: 1
  rs1:
    size: 3
    serviceAccountName: psmdb-operator
    resources:
      limits:
        cpu: 300m
        memory: 1024Mi
      requests:
        cpu: 150m
        memory: 512Mi
    volumeSpec:
      pvc:
        storageClassName: gp3
        resources:
          requests:
            storage: 4Gi
    arbiter:
      enabled: false
      size: 1
  sharding:
    configrs:
      size: 3
      serviceAccountName: psmdb-operator
      volumeSpec:
        pvc:
          storageClassName: gp3
          resources:
            requests:
              storage: 4Gi
    mongos:
      size: 3
      resources:
        limits:
          cpu: 1000m
          memory: 1024M
        requests:
          cpu: 300m
          memory: 500M
      serviceAccountName: psmdb-operator

  backup:
    enabled: true
    annotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::123456789:role/psmdb-operator
    storages:
      s3-eu-north-1:
        main: true
        type: s3
        s3:
          bucket: psmdb-operator
          retryer:
            numMaxRetries: 3
            minRetryDelay: 30ms
            maxRetryDelay: 5m
          region: eu-north-1
    pitr:
      enabled: true
      compressionType: gzip
      compressionLevel: 6
    tasks:
      - name: daily-s3-eu-north-1-physical
        enabled: true
        schedule: "0 0 * * *"
        keep: 30
        type: physical
        storageName: s3-eu-north-1
        compressionType: gzip
        compressionLevel: 6
  2. Log in as the databaseAdmin user with the mongosh CLI and create data
use demo
db.demo.insertOne({ msg: "This is the first document" })
  3. Log in as the clusterAdmin user with the mongosh CLI and shard the collection
use admin
sh.shardCollection("demo.demo", { _id: 1 })
  4. Create a backup
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBBackup
metadata:
  finalizers:
    - percona.com/delete-backup
  name: backup1
  namespace: demo-mongodb
spec:
  clusterName: demo-psmdb-db
  storageName: s3-eu-north-1
  type: physical
  5. Restore from the backup
apiVersion: psmdb.percona.com/v1
kind: PerconaServerMongoDBRestore
metadata:
  name: restore1
spec:
  clusterName: demo-psmdb-db
  backupName: backup1
  pitr:
    type: latest
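
To make the stuck restore observable when reproducing these steps, the restore state can be polled with a timeout. This is a sketch under assumptions: kubectl is configured for the cluster, the state is exposed as .status.state, and waiting and requested (the two states seen in the operator log) are the only ones treated as "no progress yet". The helper name wait_restore_progress and the default timings are hypothetical.

```shell
# Poll a restore until it leaves the "waiting"/"requested" states or the
# timeout expires.
# Usage: wait_restore_progress <name> <namespace> [timeout-sec] [interval-sec]
# Returns 0 if the state moved on, 1 if it did not within the timeout.
wait_restore_progress() {
  name=$1
  ns=$2
  timeout=${3:-600}
  interval=${4:-10}
  elapsed=0
  state=
  while [ "$elapsed" -lt "$timeout" ]; do
    state=$(kubectl get perconaservermongodbrestores.psmdb.percona.com "$name" \
      --namespace "$ns" --output jsonpath='{.status.state}')
    case "$state" in
      ''|waiting|requested) ;;  # no progress yet, keep polling
      *)
        echo "restore $name moved on to state: $state"
        return 0
        ;;
    esac
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "restore $name still in state: ${state:-unknown} after ${timeout}s" >&2
  return 1
}
```

With the objects from this report, wait_restore_progress restore1 demo-mongodb would be expected to time out and return 1, since the restore stays in the requested state.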

Versions

  1. Kubernetes EKS 1.31
  2. Operator 1.20.1
  3. Database 1.20.1

Anything else?

No response
