Check of corrupted file deadlocks #877

Open
serathius opened this issue Dec 20, 2024 · 1 comment

Checking the status of a snapshot file hangs because the Check command never finishes. Below is a stack trace captured after 30 minutes of execution. I suspect the file is corrupted; even so, the Check command should detect that and exit instead of looping forever.

goroutine 6459 gp=0xc000949a40 m=nil [runnable]:
go.etcd.io/bbolt/bolt.(*Cursor).searchPage.func1(0x13?)
	go.etcd.io/bbolt/cursor.go:293 +0xbb fp=0xc000951608 sp=0xc000951600 pc=0x1f918bb
sort.Search(0xc00007f808?, 0xc000951690)
	go/gc/src/sort/search.go:65 +0x42 fp=0xc000951638 sp=0xc000951608 pc=0x1b2d202
go.etcd.io/bbolt/bolt.(*Cursor).searchPage(0xc000951820, {0x7d478969209e, 0x11, 0x11}, 0x580be0?)
	go.etcd.io/bbolt/cursor.go:293 +0xf1 fp=0xc0009516e0 sp=0xc000951638 pc=0x1f91731
go.etcd.io/bbolt/bolt.(*Cursor).search(0xc000951820, {0x7d478969209e, 0x11, 0x11}, 0x2?)
	go.etcd.io/bbolt/cursor.go:265 +0x1b9 fp=0xc000951778 sp=0xc0009516e0 pc=0x1f91339
go.etcd.io/bbolt/bolt.(*Cursor).seek(0xc000951820, {0x7d478969209e?, 0x7d48b21fff18?, 0x580be0?})
	go.etcd.io/bbolt/cursor.go:159 +0x45 fp=0xc0009517d0 sp=0xc000951778 pc=0x1f90c45
go.etcd.io/bbolt/bolt.(*Bucket).Bucket(0xc00099b340, {0x7d478969209e, 0x11, 0x2c?})
	go.etcd.io/bbolt/bucket.go:105 +0xaf fp=0xc000951850 sp=0xc0009517d0 pc=0x1f8d9ef
go.etcd.io/bbolt/bolt.(*Tx).checkBucket.func2({0x7d478969209e?, 0x11?, 0x11?}, {0x7d47887fb041?, 0x81e?, 0x81e?})
	go.etcd.io/bbolt/tx.go:488 +0x4c fp=0xc0009518a8 sp=0xc000951850 pc=0x1f9ed4c
go.etcd.io/bbolt/bolt.(*Bucket).ForEach(0xc0005b20e0?, 0xc000951928)
	go.etcd.io/bbolt/bucket.go:392 +0x83 fp=0xc000951908 sp=0xc0009518a8 pc=0x1f8ecc3
go.etcd.io/bbolt/bolt.(*Tx).checkBucket(0xc0005b20e0, 0xc00099b340, 0xc000951c98, 0xc000951bd8, 0xc00129a150)
	go.etcd.io/bbolt/tx.go:487 +0xe5 fp=0xc000951998 sp=0xc000951908 pc=0x1f9eca5
go.etcd.io/bbolt/bolt.(*Tx).checkBucket.func2({0x7d47bffcc1b9?, 0x18b519d?, 0xc00082ea40?}, {0x1f9f851?, 0x7d47bffcc000?, 0xc0005b20e0?})
	go.etcd.io/bbolt/tx.go:489 +0x6d fp=0xc0009519f0 sp=0xc000951998 pc=0x1f9ed6d
go.etcd.io/bbolt/bolt.(*Bucket).ForEach(0xc0005b20e0?, 0xc000951a70)
	go.etcd.io/bbolt/bucket.go:392 +0x83 fp=0xc000951a50 sp=0xc0009519f0 pc=0x1f8ecc3
go.etcd.io/bbolt/bolt.(*Tx).checkBucket(0xc0005b20e0, 0xc0005b20f8, 0xc00082ec98, 0xc00082ebd8, 0xc00129a150)
	go.etcd.io/bbolt/tx.go:487 +0xe5 fp=0xc000951ae0 sp=0xc000951a50 pc=0x1f9eca5
go.etcd.io/bbolt/bolt.(*DB).freepages(0xc002284488)
	go.etcd.io/bbolt/db.go:1059 +0x1f6 fp=0xc000951cf8 sp=0xc000951ae0 pc=0x1f95716
go.etcd.io/bbolt/bolt.(*DB).loadFreelist.func1()
	go.etcd.io/bbolt/db.go:320 +0xbb fp=0xc000951d28 sp=0xc000951cf8 pc=0x1f9281b
sync.(*Once).doSlow(0xc00082edb0?, 0x1897350?)
	go/gc/src/sync/once.go:76 +0xb4 fp=0xc000951d88 sp=0xc000951d28 pc=0x18d0214
sync.(*Once).Do(...)
	go/gc/src/sync/once.go:67
go.etcd.io/bbolt/bolt.(*DB).loadFreelist(0x18981b2?)
	go.etcd.io/bbolt/db.go:316 +0x3b fp=0xc000951db8 sp=0xc000951d88 pc=0x1f9273b
go.etcd.io/bbolt/bolt.(*Tx).check(0xc0005b2000, 0xc00129a0e0)
	go.etcd.io/bbolt/tx.go:419 +0x36 fp=0xc000951fc0 sp=0xc000951db8 pc=0x1f9e576
go.etcd.io/bbolt/bolt.(*Tx).Check.gowrap1()
	go.etcd.io/bbolt/tx.go:413 +0x25 fp=0xc000951fe0 sp=0xc000951fc0 pc=0x1f9e505
runtime.goexit({})
	go/gc/src/runtime/asm_amd64.s:1700 +0x1 fp=0xc000951fe8 sp=0xc000951fe0 pc=0x18c3461
created by go.etcd.io/bbolt/bolt.(*Tx).Check in goroutine 6309
	go.etcd.io/bbolt/tx.go:413 +0x78


goroutine 6309 gp=0xc001300fc0 m=nil [chan receive, 30 minutes]:
runtime.gopark(0x18?, 0xc00129a0e0?, 0x0?, 0x0?, 0xc00007f808?)
	go/gc/src/runtime/proc.go:438 +0xce fp=0xc0009e1720 sp=0xc0009e1700 pc=0x18ba6ce
runtime.chanrecv(0xc00129a0e0, 0xc0009e1888, 0x1)
	go/gc/src/runtime/chan.go:639 +0x40b fp=0xc0009e1798 sp=0xc0009e1720 pc=0x1850bcb
runtime.chanrecv2(0xc0005b2000?, 0x0?)
	go/gc/src/runtime/chan.go:494 +0x12 fp=0xc0009e17c0 sp=0xc0009e1798 pc=0x18507b2
go.etcd.io/etcd/client/v3/snapshot/snapshot.(*v3Manager).Status.func1(0xc0005b2000)
	go.etcd.io/etcd/client/v3/snapshot/v3_snapshot.go:181 +0x9b fp=0xc0009e18e8 sp=0xc0009e17c0 pc=0x208d2bb
go.etcd.io/bbolt/bolt.(*DB).View(0xc0000c4400?, 0xc0009e1a28)
	go.etcd.io/bbolt/db.go:772 +0x6c fp=0xc0009e1950 sp=0xc0009e18e8 pc=0x1f9440c
go.etcd.io/etcd/client/v3/snapshot/snapshot.(*v3Manager).Status(0xc0009e1b48?, {0xc0024c80f0, 0x21})
	go.etcd.io/etcd/client/v3/snapshot/v3_snapshot.go:178 +0x187 fp=0xc0009e1a60 sp=0xc0009e1950 pc=0x208d0c7

@serathius changed the title from "Check command deadlocks" to "Check of corrupted file deadlocks" on Dec 20, 2024

serathius commented Dec 20, 2024

Confirmed that the db file is corrupted:

$ bbolt pages db
panic: freepages: failed to get all reachable pages (page 583622: multiple references (stack: [382879 309161 463064 1918 583622]))

goroutine 6 [running]:
go.etcd.io/bbolt.(*DB).freepages.func2()
	/root/go/pkg/mod/go.etcd.io/[email protected]/db.go:1204 +0x8d
created by go.etcd.io/bbolt.(*DB).freepages in goroutine 1
	/root/go/pkg/mod/go.etcd.io/[email protected]/db.go:1202 +0x1c5
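
The "multiple references" panic above suggests that a page was reached twice during traversal. As a rough illustration of why tracking visited pages lets a walk terminate on such input, here is a toy sketch. It is not bbolt's implementation: `pageID`, `walkPages`, and the `children` map are hypothetical stand-ins for the real page structures, and the error text only imitates the format of the message above.

```go
package main

import "fmt"

// pageID and the children map are a toy model, not bbolt's on-disk format.
type pageID uint64

// walkPages visits every page reachable from root and returns an error as
// soon as any page is reached a second time. Because each page is expanded
// at most once, the walk terminates even if corrupted pointers form a cycle.
func walkPages(root pageID, children map[pageID][]pageID) error {
	seen := make(map[pageID]bool)

	var visit func(id pageID, stack []pageID) error
	visit = func(id pageID, stack []pageID) error {
		// stack records the path from the root down to the current page.
		stack = append(stack, id)
		if seen[id] {
			return fmt.Errorf("page %d: multiple references (stack: %v)", id, stack)
		}
		seen[id] = true
		for _, child := range children[id] {
			if err := visit(child, stack); err != nil {
				return err
			}
		}
		return nil
	}

	return visit(root, nil)
}

func main() {
	// A well-formed tree: every page has exactly one parent.
	healthy := map[pageID][]pageID{1: {2, 3}, 2: {4}}
	fmt.Println(walkPages(1, healthy))

	// A corrupted layout: page 3 points back to page 2, forming a cycle.
	corrupted := map[pageID][]pageID{1: {2}, 2: {3}, 3: {2}}
	fmt.Println(walkPages(1, corrupted))
}
```

A walk without the `seen` check would recurse forever on the second input, which is one way a check over a corrupted file could fail to finish.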
