Skip to content

Conversation

@tsardelli-dh
Copy link

Overview

This patch moves connector refresh calls outside of the storage transaction and introduces a per-refresh-ID mutex to ensure only one concurrent request per token hits the external IdP. Other concurrent requests wait for the mutex and reuse the updated identity.

What this PR does / why we need it

This is an initial attempt to fix #4209. The gist of the problem is that commit 4b5f1d52 moved the connection to the external IdP to update the refresh token, in the same DB transaction used to update the token in dex's storage. This causes a deadlock when the "external IdP" is still dex (e.g. when using PasswordDB) and the underlying storage sets a maximum number of open connections (e.g. SQLite or PgBouncer)

For a detailed overview of what triggers the issue, please see #4209 (comment)

Closes #4209

Special notes for your reviewer

I'm not sure how to properly test this change or if it's already covered by the existing test suite. If you have any suggestions please let me know.

@tsardelli-dh tsardelli-dh force-pushed the fix-refresh-token-db-connections branch 3 times, most recently from c9b79a5 to fc00334 Compare September 12, 2025 14:20
@tsardelli-dh tsardelli-dh force-pushed the fix-refresh-token-db-connections branch 4 times, most recently from 62941c5 to 9320c9c Compare September 24, 2025 07:49
…and adding per-token mutex

Previously, `updateRefreshToken` executed `refreshWithConnector` inside the
`UpdateRefreshToken` transaction. With SQL backends that enforce strict
connection limits (e.g. SQLite), this blocked the only available connection
while the connector call could indirectly trigger further storage access
(e.g. when using PasswordDB), causing the system to hang.

This patch moves connector refresh calls outside of the storage transaction
and introduces a per-refresh-ID mutex to ensure only one concurrent request
per token hits the external IdP. Other concurrent requests wait for the mutex
and reuse the updated identity.

Signed-off-by: Tommaso Sardelli <[email protected]>
@tsardelli-dh tsardelli-dh force-pushed the fix-refresh-token-db-connections branch from 9320c9c to cccbebc Compare September 25, 2025 12:30
@tsardelli-dh
Copy link
Author

Hey folks. Whenever you get a chance, I’d love to hear any thoughts or suggestions on this. I'm totally open to changes if this approach doesn’t feel right. Thanks!


refreshTokenPolicy *RefreshTokenPolicy
// mutex to refresh the same token only once for concurrent requests
refreshLocks sync.Map
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does it work when Dex is running in the HA mode, e.g., two instances of Dex are serving clients at the same time? For the Kubernetes storage, we did locking on the Kubernetes side, but in in-memory lock doesn't help.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Token refresh hangs when using local passwords on SQLite

2 participants