Skip to content

Commit 4281d40

Browse files
authored
Add ClrMD examples (#6)
* Add ClrMD example * Cleanups * Make it clear we are using WinDbg
1 parent cd5aaec commit 4281d40

File tree

1 file changed

+233
-5
lines changed

1 file changed

+233
-5
lines changed

notebooks/00_blocked_finalizers.ipynb

Lines changed: 233 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -121,9 +121,8 @@
121121
}
122122
],
123123
"source": [
124-
"// Boilerplate to get started: Load up the WinDbgKernel and ClrMD\n",
124+
"// Boilerplate to get started: Load up the WinDbgKernel\n",
125125
"\n",
126-
"#r \"nuget:Microsoft.Diagnostics.Runtime\"\n",
127126
"#r \"..\\WinDbgKernel\\bin\\Debug\\net8.0-windows10.0.17763.0\\publish\\WinDbgKernel.dll\"\n",
128127
"#!import \"..\\WinDbgKernel\\extension.dib\""
129128
]
@@ -189,7 +188,7 @@
189188
"\n",
190189
"Remember, again, that this is a mix of both 'live' and 'dead' objects that are currently on the heap. The 'dead' finalizable objects will only be cleaned up when a GC is performed and locates them, then their finalizer method is called by the finalizer thread. (There are other caveats here, like resurrection that are beyond the scope of this document.)\n",
191190
"\n",
192-
"Here is that full list of all finalizable objects living on the heap:"
191+
"Using WinDbg with SOS, here is that full list of all finalizable objects living on the heap:"
193192
]
194193
},
195194
{
@@ -519,14 +518,243 @@
519518
"~~[6c50]e !clrstack"
520519
]
521520
},
521+
{
522+
"cell_type": "markdown",
523+
"metadata": {},
524+
"source": [
525+
"## Investigating this programmatically\n",
526+
"\n",
527+
"You can write code using the [ClrMD library](https://github.com/Microsoft/clrmd) to automatically detect a small subset of this scenario. ClrMD can only inspect the managed state of the process (it cannot inspect native stack frames or variables), but specifically for the case where a `Monitor` (`lock` keyword) is causing the issue, we can translate those SOS commands into C# code.\n",
528+
"\n",
529+
"Note that this is an example of the power and flexibility of being able to use SOS-like commands from a C# program. However, we have already investigated and solved the blocked finalizer repro we set out to using WinDbg. We do not need to use ClrMD to investigate these issues.\n",
530+
"\n",
531+
"### Getting started with ClrMD\n",
532+
"\n",
533+
"We start by loading up ClrMD with the `Microsoft.Diagnostics.Runtime` package, and define a few variables. Outside of a notebook, we should use the `using` keyword to properly dispose of `DataTarget` and `ClrRuntime`."
534+
]
535+
},
536+
{
537+
"cell_type": "code",
538+
"execution_count": 2,
539+
"metadata": {
540+
"vscode": {
541+
"languageId": "polyglot-notebook"
542+
}
543+
},
544+
"outputs": [
545+
{
546+
"data": {
547+
"text/html": [
548+
"<div><div></div><div></div><div><strong>Installed Packages</strong><ul><li><span>Microsoft.Diagnostics.Runtime, 3.1.512801</span></li></ul></div></div>"
549+
]
550+
},
551+
"metadata": {},
552+
"output_type": "display_data"
553+
}
554+
],
555+
"source": [
556+
"#r \"nuget:Microsoft.Diagnostics.Runtime\"\n",
557+
"using Microsoft.Diagnostics.Runtime;\n",
558+
"\n",
559+
"/* using */ DataTarget dt = DataTarget.LoadDump(@\"..\\BlockedFinalizer\\bin\\release\\net8.0\\BlockedFinalizer.dmp\");\n",
560+
"/* using */ ClrRuntime runtime = dt.ClrVersions[0].CreateRuntime();\n",
561+
"ClrHeap heap = runtime.Heap;"
562+
]
563+
},
564+
{
565+
"cell_type": "markdown",
566+
"metadata": {},
567+
"source": [
568+
"### Find the finalizer thread's stack trace\n",
569+
"\n",
570+
"We now need to find the finalizer thread and take a look at its managed stacktrace. This is similar to using `!clrthreads` to find the finalizer, then `!clrstack` to see its managed stack trace."
571+
]
572+
},
573+
{
574+
"cell_type": "code",
575+
"execution_count": 4,
576+
"metadata": {
577+
"vscode": {
578+
"languageId": "polyglot-notebook"
579+
}
580+
},
581+
"outputs": [
582+
{
583+
"name": "stdout",
584+
"output_type": "stream",
585+
"text": [
586+
"Finalizer thread OSID 0xe6d8, stack:\n",
587+
" [HelperMethodFrame_1OBJ] (System.Threading.Monitor.ReliableEnter)\n",
588+
" BlockedFinalizer.BlockedFinalizer.Finalize()\n",
589+
" [DebuggerU2MCatchHandlerFrame]\n"
590+
]
591+
}
592+
],
593+
"source": [
594+
"// Print the stack of the finalizer thread.\n",
595+
"\n",
596+
"ClrThread finalizerThread = runtime.Threads.Where(t => t.IsFinalizer).Single();\n",
597+
"Console.WriteLine(\"Finalizer thread OSID 0x{0:x}, stack:\", finalizerThread.OSThreadId);\n",
598+
"\n",
599+
"foreach (ClrStackFrame frame in finalizerThread.EnumerateStackTrace())\n",
600+
" Console.WriteLine($\" {frame}\");"
601+
]
602+
},
603+
{
604+
"cell_type": "markdown",
605+
"metadata": {},
606+
"source": [
607+
"### Are there any Monitor objects on the finalizer thread?\n",
608+
"\n",
609+
"If there are any stack frames from the Monitor class `Monitor`, we should see if there is a stack root containing the object that we are attempting to `lock`. To do this we will inspect the stack roots and what frame the root comes from.\n",
610+
"\n",
611+
"You may notice that we do not simply look for arguments and local variables using ClrMD. ClrMD itself is not a debugging API (that would be `ICorDebug`). ClrMD is a \".Net runtime inspection\" API. The distinction matters here because the .Net runtime itself doesn't directly track local variable and argument addresses. At least, it doesn't track it in a direct way that we can get through ClrMD. Instead the runtime tracks roots which point to live objects. In this case, that's sometimes enough to find the object we are looking for."
612+
]
613+
},
614+
{
615+
"cell_type": "code",
616+
"execution_count": 10,
617+
"metadata": {
618+
"vscode": {
619+
"languageId": "polyglot-notebook"
620+
}
621+
},
622+
"outputs": [
623+
{
624+
"name": "stdout",
625+
"output_type": "stream",
626+
"text": [
627+
"Found 1 object of interest.\r\n"
628+
]
629+
}
630+
],
631+
"source": [
632+
"HashSet<ClrObject> objectsOfInterest = [];\n",
633+
"foreach (ClrStackRoot root in finalizerThread.EnumerateStackRoots())\n",
634+
"{\n",
635+
" if (root.StackFrame.Method?.Type?.Name != \"System.Threading.Monitor\")\n",
636+
" continue;\n",
637+
" \n",
638+
" objectsOfInterest.Add(root.Object);\n",
639+
"}\n",
640+
"\n",
641+
"Console.WriteLine($\"Found {objectsOfInterest.Count} object{(objectsOfInterest.Count == 1 ? \"\" : \"s\")} of interest.\");"
642+
]
643+
},
644+
{
645+
"cell_type": "markdown",
646+
"metadata": {},
647+
"source": [
648+
"### Find out which thread is holding these Monitors via the SyncBlock table\n",
649+
"\n",
650+
"Our heuristic might have found the wrong object, or the object in question might not have a SyncBlock created for it. Similar to `!syncblk -all`, we will next loop through all SyncBlocks in the process and see if our objects of interest are on the SyncBlock table. If so, we'll save those `SyncBlock` objects."
651+
]
652+
},
653+
{
654+
"cell_type": "code",
655+
"execution_count": 11,
656+
"metadata": {
657+
"vscode": {
658+
"languageId": "polyglot-notebook"
659+
}
660+
},
661+
"outputs": [
662+
{
663+
"name": "stdout",
664+
"output_type": "stream",
665+
"text": [
666+
"Found potential monitor objects:\n",
667+
" 018e74447c58 System.Object\n",
668+
"\n"
669+
]
670+
}
671+
],
672+
"source": [
673+
"\n",
674+
"List<SyncBlock> syncBlocksOfInterest = [];\n",
675+
"if (objectsOfInterest.Count > 0)\n",
676+
"{\n",
677+
" Console.WriteLine(\"Found potential monitor objects:\");\n",
678+
" foreach (ClrObject obj in objectsOfInterest)\n",
679+
" Console.WriteLine($\" {obj.Address:x12} {obj.Type?.Name}\");\n",
680+
"\n",
681+
" Console.WriteLine();\n",
682+
"\n",
683+
" foreach (SyncBlock syncBlock in heap.EnumerateSyncBlocks())\n",
684+
" {\n",
685+
" // SyncBlock holds only the object's address. We need to convert it into a ClrObject to use it here.\n",
686+
" ClrObject obj = heap.GetObject(syncBlock.Object);\n",
687+
" if (objectsOfInterest.Contains(obj))\n",
688+
" syncBlocksOfInterest.Add(syncBlock);\n",
689+
" }\n",
690+
"}"
691+
]
692+
},
693+
{
694+
"cell_type": "markdown",
695+
"metadata": {},
696+
"source": [
697+
"### Finding the thread holding the lock\n",
698+
"\n",
699+
"We now have a list of SyncBlocks matching objects rooted by `Monitor` frames. We'll now go through each SyncBlock and see if the monitor is held, and if so by which thread."
700+
]
701+
},
702+
{
703+
"cell_type": "code",
704+
"execution_count": 13,
705+
"metadata": {
706+
"vscode": {
707+
"languageId": "polyglot-notebook"
708+
}
709+
},
710+
"outputs": [
711+
{
712+
"name": "stdout",
713+
"output_type": "stream",
714+
"text": [
715+
"Found the following blocked Monitors on the Finalizer thread:\n",
716+
" 018e74447c58 System.Object blocked by thread 0x6c50\n"
717+
]
718+
}
719+
],
720+
"source": [
721+
"List<(ClrObject Object, ClrThread Thread)> blockedObjects = [];\n",
722+
"if (syncBlocksOfInterest.Count > 0)\n",
723+
"{\n",
724+
" foreach (SyncBlock syncBlock in syncBlocksOfInterest)\n",
725+
" {\n",
726+
" if (syncBlock.IsMonitorHeld && syncBlock.HoldingThreadAddress != 0)\n",
727+
" {\n",
728+
" ClrThread holdingThread = runtime.Threads.FirstOrDefault(t => t.Address == syncBlock.HoldingThreadAddress);\n",
729+
" if (holdingThread != null)\n",
730+
" blockedObjects.Add((heap.GetObject(syncBlock.Object), holdingThread));\n",
731+
" }\n",
732+
" }\n",
733+
"}\n",
734+
"\n",
735+
"if (blockedObjects.Count > 0)\n",
736+
"{\n",
737+
" Console.WriteLine(\"Found the following blocked Monitors on the Finalizer thread:\");\n",
738+
" foreach ((ClrObject Object, ClrThread Thread) in blockedObjects)\n",
739+
" Console.WriteLine($\" {Object.Address:x12} {Object.Type?.Name} blocked by thread 0x{Thread.OSThreadId:x}\");\n",
740+
"}\n",
741+
"else\n",
742+
"{\n",
743+
" Console.WriteLine(\"Did not find a blocked Monitor object on the Finalizer thread.\");\n",
744+
"}\n"
745+
]
746+
},
522747
{
523748
"cell_type": "markdown",
524749
"metadata": {},
525750
"source": [
526751
"## Wrap-up\n",
527752
"\n",
528-
"A lot of native memory allocated by managed code is eventually released by the finalizer queue if the `IDisposable` pattern wasn't used (or if `Dispose` couldn't be called for some reason). This means that one of the first steps in debugging these kinds of failures is to make sure the finalizer thread isn't blocked, and the number of objects \"ready\" for finalization isn't much higher than expected. This can mean comparing with a baseline dump file of the process earlier in its lifetime, or intuition if you have debugged the same process many times.\n",
529-
"\n"
753+
"One of the first steps in investigating higher than expected memory usage is checking if the Finalizer queue is blocked, and if there's more than expected finalizable objects. This is because finalizers are one of the ways that native resources are released and given back to the operating system.\n",
754+
"\n",
755+
"You can quickly check whether the finalizer queue is blocked by finding the finalizer thread with `!clrthreads` then using your debugger to see the native (or managed) stack trace of that thread. You can see if there's an abnormally large amount of finalizable objects using `!fq -stat`, and whether there's an abnormally large amount of finalizable objects that the GC has already decided are unreferenced and ready for finalization using `!fq -allReady -stat`.\n",
756+
"\n",
757+
"This works better when you have a baseline to compare against. If you are investigating a memory leak, usually that means taking one crash dump early in the process's lifetime (before memory grows abnormally large) and one after memory has grown. Otherwise, you will need to have a pretty good idea about the \"normal\" state of finalizable objects in your application, which is an intuition you can build by using the `!fq` command regularly."
530758
]
531759
}
532760
],

0 commit comments

Comments
 (0)