PBM-1442 PBM-1443: improve pbm diagnostic #1129
GitHub Actions / JUnit Test Report
failed
Nov 14, 2024 in 0s
41 tests run, 32 passed, 8 skipped, 1 failed.
Annotations
Check failure on line 54 in psmdb-testing/pbm-functional/pytest/test_rename_replicaset.py
github-actions / JUnit Test Report
test_rename_replicaset.test_logical_pitr_crud_PBM_T270[replaces]
AssertionError: Backup failed{"Error":"get backup metadata: get: context deadline exceeded"}
2024-11-14T15:11:41Z I [rs1/rs101:27017] pbm-agent:
Version: 2.7.0
Platform: linux/amd64
GitCommit: 192769fd681964e48871725f596761b8933bdad4
GitBranch: CURRENT_PR
BuildTime: 2024-11-14_14:08_UTC
GoVersion: go1.22.9
2024-11-14T15:11:41Z I [rs1/rs102:27017] pbm-agent:
Version: 2.7.0
Platform: linux/amd64
GitCommit: 192769fd681964e48871725f596761b8933bdad4
GitBranch: CURRENT_PR
BuildTime: 2024-11-14_14:08_UTC
GoVersion: go1.22.9
2024-11-14T15:11:41Z I [rs1/rs102:27017] starting PITR routine
2024-11-14T15:11:41Z I [rs1/rs101:27017] starting PITR routine
2024-11-14T15:11:41Z I [rs1/rs103:27017] pbm-agent:
Version: 2.7.0
Platform: linux/amd64
GitCommit: 192769fd681964e48871725f596761b8933bdad4
GitBranch: CURRENT_PR
BuildTime: 2024-11-14_14:08_UTC
GoVersion: go1.22.9
2024-11-14T15:11:41Z I [rs1/rs103:27017] starting PITR routine
2024-11-14T15:11:41Z I [rs1/rs101:27017] node: rs1/rs101:27017
2024-11-14T15:11:41Z I [rs1/rs102:27017] node: rs1/rs102:27017
2024-11-14T15:11:41Z I [rs1/rs103:27017] node: rs1/rs103:27017
2024-11-14T15:11:41Z E [rs1/rs101:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:41Z E [rs1/rs102:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:41Z I [rs1/rs101:27017] conn level ReadConcern: majority; WriteConcern: majority
2024-11-14T15:11:41Z I [rs1/rs102:27017] conn level ReadConcern: majority; WriteConcern: majority
2024-11-14T15:11:41Z I [rs1/rs103:27017] conn level ReadConcern: majority; WriteConcern: majority
2024-11-14T15:11:41Z E [rs1/rs103:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:41Z I [rs1/rs102:27017] listening for the commands
2024-11-14T15:11:41Z I [rs1/rs101:27017] listening for the commands
2024-11-14T15:11:41Z I [rs1/rs103:27017] listening for the commands
2024-11-14T15:11:46Z E [rs1/rs101:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:46Z E [rs1/rs102:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:46Z E [rs1/rs103:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
2024-11-14T15:11:48Z I [rs1/rs102:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
2024-11-14T15:11:48Z I [rs1/rs101:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
2024-11-14T15:11:48Z I [rs1/rs103:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
2024-11-14T15:11:48Z I [rs1/rs102:27017] got epoch {1731597106 6}
2024-11-14T15:11:48Z I [rs1/rs101:27017] got epoch {1731597106 6}
2024-11-14T15:11:48Z I [rs1/rs103:27017] got epoch {1731597106 6}
2024-11-14T15:11:48Z D [rs1/rs101:27017] [resync] lock not acquired
2024-11-14T15:11:48Z I [rs1/rs102:27017] [resync] started
2024-11-14T15:11:48Z D [rs1/rs103:27017] [resync] lock not acquired
2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] uploading ".pbm.init" [size hint: 5 (5.00B); part size: 10485760 (10.00MB)]
2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] got backups list: 0
2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] got physical restores list: 0
2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] epoch set to {1731597108 19}
2024-11-14T15:11:48Z I [rs1/rs102:27017] [resync] succeed
2024-11-14T15:11:54Z I [rs1/rs102:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
2024-11-14T15:11:54Z I [rs1/rs103:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
2024-11-14T15:11:54Z I [rs1/rs101:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
2024-11-14T15:11:54Z I [rs1/rs103:27017] got epoch {1731597108 19}
2024-11-14T15:11:54Z I [rs1/rs102:27017] got epoch {1731597108 19}
2024-11-14T15:11:54Z I [rs1/rs101:27017] got epoch {1731597108 19}
2024-11-14T15:11:54Z D [rs1/rs102:27017] [resync] lock not acquired
2024-11-14T15:11:54Z I [rs1/rs103:27017] [resync] started
2024-11-14T15:11:54Z I [rs1/rs102:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
2024-11-14T15:11:54Z D [rs1/rs101:27017] [resync] lock not acquired
2024-11-14T15:11:54Z I [rs1/rs101:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
2024-11-14T15:11:54Z I [rs1/rs102:27017] got epoch {1731597108 19}
2024-11-14T15:11:54Z I [rs1/rs101:27017] got epoch {1731597108 19}
2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] got backups list: 0
2024-11-14T15:11:54Z E [rs1/rs101:27017] [backup/2024-11-14T15:11:53Z] unable to proceed with the backup, active lock is present
2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] got physical restores list: 0
2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] epoch set to {1731597114 20}
2024-11-14T15:11:54Z I [rs1/rs103:27017] [resync] succeed
2024-11-14T15:11:54Z I [rs1/rs103:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
2024-11-14T15:11:54Z I [rs1/rs103:27017] got epoch {1731597114 20}
2024-11-14T15:12:09Z D [rs1/rs102:27017] [backup/2024-11-14T15:11:53Z] nomination timeout
2024-11-14T15:12:09Z D [rs1/rs102:27017] [backup/2024-11-14T15:11:53Z] skip after nomination, probably started by another node
2024-11-14T15:12:09Z D [rs1/rs103:27017] [backup/2024-11-14T15:11:53Z] nomination timeout
2024-11-14T15:12:09Z D [rs1/rs103:27017] [backup/2024-11-14T15:11:53Z] skip after nomination, probably started by another node
Raw output
start_cluster = True, cluster = <cluster.Cluster object at 0x7fa34b0e6a50>
collection = 'replaces'
    @pytest.mark.timeout(300,func_only=True)
    @pytest.mark.parametrize('collection',['inserts','replaces','updates','deletes','indexes'])
    def test_logical_pitr_crud_PBM_T270(start_cluster,cluster,collection):
        cluster.check_pbm_status()
>       cluster.make_backup("logical")
test_rename_replicaset.py:54:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = <cluster.Cluster object at 0x7fa34b0e6a50>, type = 'logical'
    def make_backup(self, type):
        n = testinfra.get_host("docker://" + self.pbm_cli)
        timeout = time.time() + 120
        while True:
            running = self.get_status()['running']
            Cluster.log("Current operation: " + str(running))
            if not running:
                if type:
                    start = n.run(
                        'pbm backup --out=json --type=' + type)
                else:
                    start = n.run('pbm backup --out=json')
                if start.rc == 0:
                    name = json.loads(start.stdout)['name']
                    Cluster.log("Backup started")
                    break
                elif "resync" in start.stdout:
                    Cluster.log("Resync in progress, retrying: " + start.stdout)
                else:
                    logs = n.check_output("pbm logs -sD -t0")
>                   assert False, "Backup failed" + start.stdout + start.stderr + '\n' + logs
E AssertionError: Backup failed{"Error":"get backup metadata: get: context deadline exceeded"}
E
E 2024-11-14T15:11:41Z I [rs1/rs101:27017] pbm-agent:
E Version: 2.7.0
E Platform: linux/amd64
E GitCommit: 192769fd681964e48871725f596761b8933bdad4
E GitBranch: CURRENT_PR
E BuildTime: 2024-11-14_14:08_UTC
E GoVersion: go1.22.9
E 2024-11-14T15:11:41Z I [rs1/rs102:27017] pbm-agent:
E Version: 2.7.0
E Platform: linux/amd64
E GitCommit: 192769fd681964e48871725f596761b8933bdad4
E GitBranch: CURRENT_PR
E BuildTime: 2024-11-14_14:08_UTC
E GoVersion: go1.22.9
E 2024-11-14T15:11:41Z I [rs1/rs102:27017] starting PITR routine
E 2024-11-14T15:11:41Z I [rs1/rs101:27017] starting PITR routine
E 2024-11-14T15:11:41Z I [rs1/rs103:27017] pbm-agent:
E Version: 2.7.0
E Platform: linux/amd64
E GitCommit: 192769fd681964e48871725f596761b8933bdad4
E GitBranch: CURRENT_PR
E BuildTime: 2024-11-14_14:08_UTC
E GoVersion: go1.22.9
E 2024-11-14T15:11:41Z I [rs1/rs103:27017] starting PITR routine
E 2024-11-14T15:11:41Z I [rs1/rs101:27017] node: rs1/rs101:27017
E 2024-11-14T15:11:41Z I [rs1/rs102:27017] node: rs1/rs102:27017
E 2024-11-14T15:11:41Z I [rs1/rs103:27017] node: rs1/rs103:27017
E 2024-11-14T15:11:41Z E [rs1/rs101:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:41Z E [rs1/rs102:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:41Z I [rs1/rs101:27017] conn level ReadConcern: majority; WriteConcern: majority
E 2024-11-14T15:11:41Z I [rs1/rs102:27017] conn level ReadConcern: majority; WriteConcern: majority
E 2024-11-14T15:11:41Z I [rs1/rs103:27017] conn level ReadConcern: majority; WriteConcern: majority
E 2024-11-14T15:11:41Z E [rs1/rs103:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:41Z I [rs1/rs102:27017] listening for the commands
E 2024-11-14T15:11:41Z I [rs1/rs101:27017] listening for the commands
E 2024-11-14T15:11:41Z I [rs1/rs103:27017] listening for the commands
E 2024-11-14T15:11:46Z E [rs1/rs101:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:46Z E [rs1/rs102:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:46Z E [rs1/rs103:27017] [agentCheckup] check storage connection: unable to get storage: get config: get: mongo: no documents in result
E 2024-11-14T15:11:48Z I [rs1/rs102:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
E 2024-11-14T15:11:48Z I [rs1/rs101:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
E 2024-11-14T15:11:48Z I [rs1/rs103:27017] got command resync <ts: 1731597108>, opid: 6736133461bbacba664157cd
E 2024-11-14T15:11:48Z I [rs1/rs102:27017] got epoch {1731597106 6}
E 2024-11-14T15:11:48Z I [rs1/rs101:27017] got epoch {1731597106 6}
E 2024-11-14T15:11:48Z I [rs1/rs103:27017] got epoch {1731597106 6}
E 2024-11-14T15:11:48Z D [rs1/rs101:27017] [resync] lock not acquired
E 2024-11-14T15:11:48Z I [rs1/rs102:27017] [resync] started
E 2024-11-14T15:11:48Z D [rs1/rs103:27017] [resync] lock not acquired
E 2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] uploading ".pbm.init" [size hint: 5 (5.00B); part size: 10485760 (10.00MB)]
E 2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] got backups list: 0
E 2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] got physical restores list: 0
E 2024-11-14T15:11:48Z D [rs1/rs102:27017] [resync] epoch set to {1731597108 19}
E 2024-11-14T15:11:48Z I [rs1/rs102:27017] [resync] succeed
E 2024-11-14T15:11:54Z I [rs1/rs102:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
E 2024-11-14T15:11:54Z I [rs1/rs101:27017] got command resync <ts: 1731597113>, opid: 67361339d7e79927bf9b1904
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] got epoch {1731597108 19}
E 2024-11-14T15:11:54Z I [rs1/rs102:27017] got epoch {1731597108 19}
E 2024-11-14T15:11:54Z I [rs1/rs101:27017] got epoch {1731597108 19}
E 2024-11-14T15:11:54Z D [rs1/rs102:27017] [resync] lock not acquired
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] [resync] started
E 2024-11-14T15:11:54Z I [rs1/rs102:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
E 2024-11-14T15:11:54Z D [rs1/rs101:27017] [resync] lock not acquired
E 2024-11-14T15:11:54Z I [rs1/rs101:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
E 2024-11-14T15:11:54Z I [rs1/rs102:27017] got epoch {1731597108 19}
E 2024-11-14T15:11:54Z I [rs1/rs101:27017] got epoch {1731597108 19}
E 2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] got backups list: 0
E 2024-11-14T15:11:54Z E [rs1/rs101:27017] [backup/2024-11-14T15:11:53Z] unable to proceed with the backup, active lock is present
E 2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] got physical restores list: 0
E 2024-11-14T15:11:54Z D [rs1/rs103:27017] [resync] epoch set to {1731597114 20}
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] [resync] succeed
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] got command backup [name: 2024-11-14T15:11:53Z, compression: none (level: default)] <ts: 1731597113>, opid: 6736133965f2960bc0250fbc
E 2024-11-14T15:11:54Z I [rs1/rs103:27017] got epoch {1731597114 20}
E 2024-11-14T15:12:09Z D [rs1/rs102:27017] [backup/2024-11-14T15:11:53Z] nomination timeout
E 2024-11-14T15:12:09Z D [rs1/rs102:27017] [backup/2024-11-14T15:11:53Z] skip after nomination, probably started by another node
E 2024-11-14T15:12:09Z D [rs1/rs103:27017] [backup/2024-11-14T15:11:53Z] nomination timeout
E 2024-11-14T15:12:09Z D [rs1/rs103:27017] [backup/2024-11-14T15:11:53Z] skip after nomination, probably started by another node
cluster.py:393: AssertionError