Deadlock on PublishMsgAsync on 1.41.2 #1865

Open
@RenaultAI

Description

Observed behavior

I noticed a deadlock on version 1.39.0. After seeing #1801, we upgraded to 1.41.2 and still see the same deadlock behavior.

Our code is effectively identical to @Zach-Johnson's reproducer in the above issue. I was able to capture a goroutine dump, and the deadlock appears to occur at the following select statement:

select {
case <-completion.Future.Ok():
	if completion.OnSuccess != nil {
		completion.OnSuccess()
	}
case err := <-completion.Future.Err():
	if completion.OnError != nil {
		completion.OnError(err)
	}
}

goroutine stack dump

goroutine 15 [IO wait]:
internal/poll.runtime_pollWait(0x7f1fe5fa7e58, 0x72)
	/Users/xxx/.gvm/gos/go1.23.1/src/runtime/netpoll.go:351 +0x85
internal/poll.(*pollDesc).wait(0xc000444200?, 0xc000708000?, 0x0)
	/Users/xxx/.gvm/gos/go1.23.1/src/internal/poll/fd_poll_runtime.go:84 +0x27
internal/poll.(*pollDesc).waitRead(...)
	/Users/xxx/.gvm/gos/go1.23.1/src/internal/poll/fd_poll_runtime.go:89
internal/poll.(*FD).Read(0xc000444200, {0xc000708000, 0x8000, 0x8000})
	/Users/xxx/.gvm/gos/go1.23.1/src/internal/poll/fd_unix.go:165 +0x27a
net.(*netFD).Read(0xc000444200, {0xc000708000?, 0x1c?, 0x7fce?})
	/Users/xxx/.gvm/gos/go1.23.1/src/net/fd_posix.go:55 +0x25
net.(*conn).Read(0xc0001bab58, {0xc000708000?, 0xfb?, 0x7ee9?})
	/Users/xxx/.gvm/gos/go1.23.1/src/net/net.go:189 +0x45
github.com/nats-io/nats%2ego.(*natsReader).Read(0xc0005d8340)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:2023 +0x89
github.com/nats-io/nats%2ego.(*Conn).readLoop(0xc000581508)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:3134 +0xf3
created by github.com/nats-io/nats%2ego.(*Conn).processConnectInit in goroutine 1
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:2443 +0x2d3

goroutine 16 [chan receive]:
github.com/nats-io/nats%2ego.(*Conn).flusher(0xc000581508)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:3557 +0xe9
created by github.com/nats-io/nats%2ego.(*Conn).processConnectInit in goroutine 1
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:2444 +0x319

goroutine 114 [sync.Cond.Wait, 8121 minutes]:
sync.runtime_notifyListWait(0xc0005d8310, 0x0)
	/Users/xxx/.gvm/gos/go1.23.1/src/runtime/sema.go:587 +0x159
sync.(*Cond).Wait(0x0?)
	/Users/xxx/.gvm/gos/go1.23.1/src/sync/cond.go:71 +0x85
github.com/nats-io/nats%2ego.(*asyncCallbacksHandler).asyncCBDispatcher(0xc000610c20)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:3059 +0xdc
created by github.com/nats-io/nats%2ego.Options.Connect in goroutine 1
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:1635 +0x387

goroutine 115 [sync.Cond.Wait, 8 minutes]:
sync.runtime_notifyListWait(0xc000632550, 0x81fcd)
	/Users/xxx/.gvm/gos/go1.23.1/src/runtime/sema.go:587 +0x159
sync.(*Cond).Wait(0xc0015470a0?)
	/Users/xxx/.gvm/gos/go1.23.1/src/sync/cond.go:71 +0x85
github.com/nats-io/nats%2ego.(*Conn).waitForMsgs(0xc000581508, 0xc00025f680)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:3178 +0xc5
created by github.com/nats-io/nats%2ego.(*Conn).subscribeLocked in goroutine 1
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:4468 +0x41b

goroutine 6873 [sync.Cond.Wait, 42 minutes]:
sync.runtime_notifyListWait(0xc000721810, 0x7d2483)
	/Users/xxx/.gvm/gos/go1.23.1/src/runtime/sema.go:587 +0x159
sync.(*Cond).Wait(0xc00025f1d0?)
	/Users/xxx/.gvm/gos/go1.23.1/src/sync/cond.go:71 +0x85
github.com/nats-io/nats%2ego.(*Conn).waitForMsgs(0xc000581508, 0xc00025f1d0)
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:3178 +0xc5
created by github.com/nats-io/nats%2ego.(*Conn).subscribeLocked in goroutine 6898
	/Users/xxx/.gvm/pkgsets/go1.23.1/global/pkg/mod/github.com/nats-io/[email protected]/nats.go:4468 +0x41b

Expected behavior

No deadlock after #1812 was deployed.

Server and client version

Client: 1.41.2
Server: 2.10.24

Host environment

No response

Steps to reproduce

No response

Labels

defect: Suspected defect such as a bug or regression
