Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stack overflow from static constructors taking locks on Native AOT #113949

Closed
Tracked by #114179
Sergio0694 opened this issue Mar 26, 2025 · 13 comments · Fixed by #114227
Closed
Tracked by #114179

Stack overflow from static constructors taking locks on Native AOT #113949

Sergio0694 opened this issue Mar 26, 2025 · 13 comments · Fixed by #114227
Assignees
Labels
area-NativeAOT-coreclr partner-impact This issue impacts a partner who needs to be kept updated
Milestone

Comments

@Sergio0694
Copy link
Contributor

Sergio0694 commented Mar 26, 2025

Description

We're seeing quite a few hits across multiple WER buckets in the Microsoft Store, with similar stacktraces:

00 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnterSlow_0+0xb 
01 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_Outlined+0x72 
02 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_0+0xe 
03 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__DeadlockAwareAcquire+0x65 
04 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__EnsureClassConstructorRun+0x77 
05 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnGCStaticBase+0xd 
06 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitForMultipleObjectsIgnoringSyncContext+0x23c 
07 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitOneNoCheck+0x126 
08 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnterSlow_0+0x517 
09 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_Outlined+0x72 
0a <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_0+0xe 
0b <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__DeadlockAwareAcquire+0x65 
0c <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__EnsureClassConstructorRun+0x77 
0d <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnGCStaticBase+0xd 
0e <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitForMultipleObjectsIgnoringSyncContext+0x23c 
0f <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitOneNoCheck+0x126 
10 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnterSlow_0+0x517 
11 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_Outlined+0x72 
12 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnter_0+0xe 
13 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__DeadlockAwareAcquire+0x65 
14 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__EnsureClassConstructorRun+0x77 
15 <MICROSOFT_STORE>!S_P_CoreLib_System_Runtime_CompilerServices_ClassConstructorRunner__CheckStaticClassConstructionReturnGCStaticBase+0xd 
16 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitForMultipleObjectsIgnoringSyncContext+0x23c 
17 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_WaitHandle__WaitOneNoCheck+0x126 
18 <MICROSOFT_STORE>!S_P_CoreLib_System_Threading_Lock__TryEnterSlow_0+0x517 
[...]

Here's a bunch of buckets with the same issue:

  • a9dc9e76-1a0a-2469-8b52-3dd7697aceb2
  • f687741f-21c4-c75d-7e19-6db18f7622d0
  • 09ea45c6-7f69-9ef2-d3fa-b613a502a0d1
  • efc37b69-f121-2ab5-f95a-db9c03a28093
  • ede6abc4-1fa5-1b0a-b3fc-02f82e69cb65
  • 64a1f0fd-9cae-2f64-30c4-a86282ce4c37

Opening this for tracking. @MichalStrehovsky and other folks are already looking into this internally (let me know if you'd like me to add you).

Configuration

  • .NET SDK 9.0.3
  • Native AOT, x64
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Mar 26, 2025
Copy link
Contributor

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

@MichalStrehovsky MichalStrehovsky added this to the 10.0.0 milestone Mar 26, 2025
@MichalStrehovsky
Copy link
Member

Cc @kouvel for an apparent recursion in locking

@Sergio0694 Sergio0694 added the partner-impact This issue impacts a partner who needs to be kept updated label Mar 27, 2025
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Mar 27, 2025
@hoyosjs
Copy link
Member

hoyosjs commented Mar 27, 2025

This is when trying to access the gc static base of Thread in WaitForMultipleObjectsIgnoringSyncContext. There __GetGCStaticBase_S_P_CoreLib_System_Threading_Thread takes this branch, which recurses into EnsureClassConstructorRun and the into lock again, which goes into the WaitHandle again, and then gets to loading the thread cctor again.

  • It's not clear why it doesn't enter the lock when using State.TryLock. It seems to not be owned not is it locked:
    [+0x000] Object           [Type: Object]
    [+0x010] _owningThreadId  : 0x0 [Type: unsigned int]
    [+0x014] _state           : 0xdc60 [Type: unsigned int]
    [+0x018] _recursionCount  : 0x0 [Type: unsigned int]
    [+0x01c] _spinCount       : -87 [Type: short]
    [+0x01e] _waiterStartTimeMs : 0xc81a [Type: unsigned short]
    [+0x008] _waitEvent       : 0x800483c3b0 [Type: S_P_CoreLib_System_Threading_AutoResetEvent &]
    [=0x7ffe147a8680] __NONGCSTATICS   [Type: __type__NONGCSTATICSS_P_CoreLib_System_Threading_Lock]

And in the dumps I don't see thread usage that would make the Interlocked.CompareExchange fail in it.

  • I am not sure why getgcstatics is not returning in that branch - it might be an artifact of the dump, but the WinStore_App!S_P_CoreLib_System_Threading_Monitor::__NONGCSTATICS marker I see is 0, which seems to mean initialized.

@jkotas
Copy link
Member

jkotas commented Mar 27, 2025

It's not clear why it doesn't enter the lock when using State.TryLock. It seems to not be owned not is it locked

The most likely explanation is that it was locked when the code was executing. It was not locked by the time the dump got captured.

@hoyosjs
Copy link
Member

hoyosjs commented Mar 27, 2025

There are no other threads in this particular case that would explain convoy, but if could definitely be a "too late" situation. The dump has 9 threads, only 3 with managed code in them. The others seem not to be doing much, but this one is 3559 frames deep, mostly in lock recursion.

@jkotas
Copy link
Member

jkotas commented Mar 27, 2025

and surrounding code assumes that System.Threading.Thread does not have a static constructor.

System.Threading.Thread has static constructor due to this line

private static readonly bool s_isProcessorNumberReallyFast = ProcessorIdCache.ProcessorNumberSpeedCheck();
. This static constructor was introduced by https://github.com/dotnet/runtime/pull/80532/files#diff-d5367ffeac7b529ffce3c329ff6ccd507c8d6186840e031b877c1ae6baf8d5b8R673 (cc @VSadov).

We either need to get rid of this static constructor; or get rid of the assumption that System.Threading.Thread does not have a static constructor in the above code.

@VSadov
Copy link
Member

VSadov commented Mar 27, 2025

We either need to get rid of this static constructor; or get rid of the assumption that System.Threading.Thread does not have a static constructor in the above code.

We can also force the Thread's cctor to run early, before we have other threads.
That is what likely mitigates this scenario. It may be unusual to get far and not trigger Thread's cctor before having enough threads to cause contention.

ProcessorNumberSpeedCheck takes about 1 microsecond, so is fairly cheap.

@VSadov
Copy link
Member

VSadov commented Mar 27, 2025

We could also move the ProcessorNumberSpeedCheck on NativeAOT to some ModuleInitializer and make Thread cctor-free.

There is a benefit on CoreCLR from storing this in a readonly static and computing in a cctor, since it can become a JIT constant.

I do not think that is useful on NativeAOT, so any global location and any kind of initialization sequence would be ok. It does not even need to be computed before anything in particular (default false might be suboptimal, but not hugely so) - just running this early(ish) is ok.

@VSadov
Copy link
Member

VSadov commented Mar 27, 2025

My current preference would be to make s_isProcessorNumberReallyFast not readonly on NativeAOT and move initialization somewhere else, thus making Thread cctor-free.

@VSadov VSadov self-assigned this Mar 27, 2025
@jkotas
Copy link
Member

jkotas commented Mar 27, 2025

BTW: We do not see it in our test runs since the Thread gets initialized very early in regular .exe mode before there is any multithreading. The crash can really only happen in dynamic library mode. The watson buckets above point to backgroundTaskHost.exe where the native AOT component runs in dynamic library mode.

@jkotas
Copy link
Member

jkotas commented Mar 27, 2025

My current preference would be to make s_isProcessorNumberReallyFast not readonly on NativeAOT

I would move private static readonly bool s_isProcessorNumberReallyFast = ProcessorIdCache.ProcessorNumberSpeedCheck(); to ProcessIdCatche.cs, but otherwise keep things same between NAOT and regular CoreCLR. Do you see any downsides with that?

Separately, we can choose to mark ProcessorIdCache with [EagerStaticClassConstruction] if we want to run its cctor early during startup, but it should not be needed for a minimal fix.

@VSadov
Copy link
Member

VSadov commented Apr 3, 2025

I would move private static readonly bool s_isProcessorNumberReallyFast = ProcessorIdCache.ProcessorNumberSpeedCheck(); to ProcessIdCatche.cs, but otherwise keep things same between NAOT and regular CoreCLR. Do you see any downsides with that?

Nope. I think the choice was just because GetCurrentProcessorId API lives on Thread and we wanted to optimize for common case where getting core ID is supported by CPU.

I think even with everything being AggressiveInlining it will actually work, since the static will be formally on a different class.

Separately, we can choose to mark ProcessorIdCache with [EagerStaticClassConstruction] if we want to run its cctor early during startup, but it should not be needed for a minimal fix.

I think simply moving to ProcessorIdCache will be sufficient. As long as nothing in the Lock and related machinery does GetCurrentProcessorId, which it should not.

@VSadov
Copy link
Member

VSadov commented Apr 3, 2025

I think one reason why we wanted the static to be on Thread is to make it likely that by the time we need to JIT ProcessorIdCache.GetCurrentProcessorId, the readonly static was already initialized, thus JIT could treat it as a constant.

It was somewhat common pattern in the past where hot readonly statics would be initialized or "touched" on some less hot, but likely to happen early code path.
With tiered compilation, this does not seem to be as useful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-NativeAOT-coreclr partner-impact This issue impacts a partner who needs to be kept updated
Projects
Status: No status
Development

Successfully merging a pull request may close this issue.

5 participants