HwRender: fix ARM64 crash when GPU is suspended during Present (DXGI_ERROR_DEVICE_REMOVED) #11472
etvorun wants to merge 1 commit into dotnet:main
On ARM64, the D3D9-over-DXGI compatibility layer surfaces DXGI_ERROR_DEVICE_REMOVED (0x887A0005) directly when the GPU is suspended or removed, instead of mapping it to D3DERR_DEVICELOST. PresentWithD3D only checked for S_OK, S_PRESENT_MODE_CHANGED, and S_PRESENT_OCCLUDED. Any unrecognised HRESULT fell through to MIL_THR, which treats 0x887A0005 as fatal, triggering VS_FatalError and a crash. HandlePresentFailure was never reached.

Fix:
- wgx_error.h: define DXGI_ERROR_DEVICE_REMOVED locally to avoid pulling in dxgi.h; the value is a stable DirectX ABI constant.
- PresentWithD3D: add else-if branch that converts DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST before MIL_THR, so the existing device-lost recovery path handles it gracefully.
- HandlePresentFailure: add DXGI_ERROR_DEVICE_REMOVED to the device-lost condition block as defense-in-depth for other call sites (PresentWithGDI).

Recovery follows the existing TDR path: MarkUnusable() invalidates GPU resources, render thread skips the frame and fires RENDERING_STATUS_DEVICE_LOST, and when the GPU resumes UpdateDisplayState recreates resources and repaints the window.
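For readers without the diff open, the translation step can be sketched roughly as below. This is a minimal standalone illustration, not the code in d3ddevice.cpp: the `Hresult` alias, the `k...` constants, and the helpers `NormalizePresentResult` and `IsDeviceLost` are hypothetical names introduced here, and the real change lives inside the existing if/else chain of PresentWithD3D and the condition in HandlePresentFailure.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the Windows HRESULT type (a 32-bit signed value),
// so the sketch builds without any Windows headers.
using Hresult = std::int32_t;

// Numeric values of the two error codes discussed in this PR.
constexpr Hresult kD3dErrDeviceLost     = static_cast<Hresult>(0x88760868u); // D3DERR_DEVICELOST
constexpr Hresult kDxgiErrDeviceRemoved = static_cast<Hresult>(0x887A0005u); // DXGI_ERROR_DEVICE_REMOVED

// Hypothetical helper: map the DXGI code onto the D3D9 device-lost code
// before any generic error handling sees it. Every other value passes through.
constexpr Hresult NormalizePresentResult(Hresult hr) {
    return (hr == kDxgiErrDeviceRemoved) ? kD3dErrDeviceLost : hr;
}

// Hypothetical predicate mirroring the defense-in-depth idea: treat either
// code as "device lost", so callers that skip the translation still recover.
constexpr bool IsDeviceLost(Hresult hr) {
    return hr == kD3dErrDeviceLost || hr == kDxgiErrDeviceRemoved;
}

// Compile-time check that the translation does what the description says.
static_assert(NormalizePresentResult(kDxgiErrDeviceRemoved) == kD3dErrDeviceLost,
              "DXGI device-removed must map to D3D9 device-lost");
```

Translating at the Present call site keeps the downstream recovery code unchanged, while the predicate covers call sites that never go through the translation.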
Pull request overview
This PR fixes a critical crash on ARM64 devices when the GPU is suspended during active rendering. On ARM64, the D3D9-over-DXGI compatibility layer returns DXGI_ERROR_DEVICE_REMOVED (0x887A0005) instead of the expected D3D9 error code D3DERR_DEVICELOST. WPF's PresentWithD3D did not recognize this code, causing MIL_THR to treat it as fatal and crash the process. The fix converts the DXGI error to the D3D9 equivalent before error processing, allowing the existing device-lost recovery mechanism to handle GPU suspension gracefully.
Changes:
- Defined the DXGI_ERROR_DEVICE_REMOVED constant locally in wgx_error.h to avoid adding a dependency on dxgi.h
- Added error code translation in PresentWithD3D to convert DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST before MIL_THR processing
- Added DXGI_ERROR_DEVICE_REMOVED to the device-lost condition in HandlePresentFailure as defense-in-depth
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/Microsoft.DotNet.Wpf/src/WpfGfx/include/wgx_error.h | Adds local definition of DXGI_ERROR_DEVICE_REMOVED constant (0x887A0005L) with #ifndef guard |
| src/Microsoft.DotNet.Wpf/src/WpfGfx/core/hw/d3ddevice.cpp | Converts DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST in PresentWithD3D before MIL_THR processing; adds DXGI_ERROR_DEVICE_REMOVED to HandlePresentFailure's device-lost condition |
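The guarded definition described for wgx_error.h might look like the sketch below. The `HRESULT` alias here is an assumption added so the snippet compiles outside Windows; in the real header the type comes from the Windows SDK. The `#ifndef` guard means a translation unit that already includes dxgi.h keeps the SDK definition.

```cpp
#include <cassert>
#include <cstdint>

// Assumption for this sketch only: a local stand-in for the Windows HRESULT
// type. The real header relies on the SDK's definition instead.
using HRESULT = std::int32_t;

// Define the constant only if no other header (for example dxgi.h) has
// already provided it, so the two definitions can never conflict.
#ifndef DXGI_ERROR_DEVICE_REMOVED
#define DXGI_ERROR_DEVICE_REMOVED ((HRESULT)0x887A0005L)
#endif

// The bit pattern must match the documented value, and the severity bit
// (the sign bit of a 32-bit HRESULT) must be set, marking it as a failure.
static_assert(static_cast<std::uint32_t>(DXGI_ERROR_DEVICE_REMOVED) == 0x887A0005u,
              "unexpected value for DXGI_ERROR_DEVICE_REMOVED");
static_assert(DXGI_ERROR_DEVICE_REMOVED < 0,
              "failure HRESULTs are negative");
```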
Summary
On ARM64 devices, WPF crashes when the GPU is suspended during an active IDirect3DSwapChain9::Present call. The D3D9-over-DXGI compatibility layer on ARM64 surfaces DXGI_ERROR_DEVICE_REMOVED (0x887A0005) directly instead of translating it to D3DERR_DEVICELOST. WPF's PresentWithD3D did not recognize this code; it fell through to MIL_THR, which treats 0x887A0005 as fatal and crashes the process. HandlePresentFailure, and with it the device-lost recovery path, was never reached.

What changed
src/Microsoft.DotNet.Wpf/src/WpfGfx/include/wgx_error.h
- #define DXGI_ERROR_DEVICE_REMOVED ((HRESULT)0x887A0005L) guarded by #ifndef, avoiding a new dependency on dxgi.h. The value is a stable DirectX ABI constant.

src/Microsoft.DotNet.Wpf/src/WpfGfx/core/hw/d3ddevice.cpp
- PresentWithD3D: new else if (hr == DXGI_ERROR_DEVICE_REMOVED) branch placed before the MIL_THR call, converting the error to D3DERR_DEVICELOST so the existing device-lost recovery path handles it.
- HandlePresentFailure: added DXGI_ERROR_DEVICE_REMOVED to the device-lost if condition, as defense-in-depth for other callers such as PresentWithGDI.

Why
The fix matches the behavior of the fix previously validated in the internal source. The change is surgical: no new recovery logic is introduced; WPF reuses the existing TDR / device-lost path (MarkUnusable() → WGXERR_DISPLAYSTATEINVALID → RENDERING_STATUS_DEVICE_LOST → resource recreation on GPU resume).

Recovery sequence
- Present returns DXGI_ERROR_DEVICE_REMOVED, which is converted to D3DERR_DEVICELOST; HandlePresentFailure calls MarkUnusable(), invalidates GPU resources, and notifies the device manager.
- CRenderTargetManager::HandlePresentErrors swallows WGXERR_DISPLAYSTATEINVALID and fires RENDERING_STATUS_DEVICE_LOST to the UI thread. No crash, no zombie.
- While the GPU remains unavailable, UpdateDisplayState returns WGXERR_DISPLAYSTATEINVALID and render passes are skipped.
- When the GPU resumes, UpdateDisplayState succeeds; NotifyTierChange fires; render targets are recreated; the window is repainted.

Validation
Validated manually on ARM64 hardware by triggering GPU suspension events while a WPF application was actively rendering. With this fix applied, the application recovered without crashing. No automated test infrastructure changes are included.
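Since no automated test is included, the recovery sequence described in this PR can at least be modeled as a small state machine. The sketch below is a simplified illustration under stated assumptions, not WPF code: `RenderLoopModel`, `FrameOutcome`, and the `kModel...` constants are hypothetical names, a single `resourcesUsable` flag stands in for MarkUnusable() and resource recreation, and failures other than device loss are outside the model.

```cpp
#include <cassert>
#include <cstdint>

using HresultModel = std::int32_t;  // hypothetical stand-in for HRESULT

constexpr HresultModel kModelOk            = 0;                                      // S_OK
constexpr HresultModel kModelDeviceLost    = static_cast<HresultModel>(0x88760868u); // D3DERR_DEVICELOST
constexpr HresultModel kModelDeviceRemoved = static_cast<HresultModel>(0x887A0005u); // DXGI_ERROR_DEVICE_REMOVED

// What a single simulated frame did.
enum class FrameOutcome { Presented, DeviceLost, Skipped, Recreated };

// Hypothetical model of the render loop's device-lost handling.
struct RenderLoopModel {
    bool resourcesUsable = true;  // cleared by the MarkUnusable() analogue

    FrameOutcome RenderFrame(bool gpuAvailable, HresultModel presentResult) {
        if (!resourcesUsable) {
            // Display state is invalid: skip render passes until the GPU is back.
            if (!gpuAvailable) return FrameOutcome::Skipped;
            // GPU resumed: recreate resources and repaint.
            resourcesUsable = true;
            return FrameOutcome::Recreated;
        }
        // The translation this PR adds: treat DXGI device-removed as device-lost.
        if (presentResult == kModelDeviceRemoved) presentResult = kModelDeviceLost;
        if (presentResult == kModelDeviceLost) {
            resourcesUsable = false;  // invalidate GPU resources, no crash
            return FrameOutcome::DeviceLost;
        }
        return FrameOutcome::Presented;  // other failures are not modeled here
    }
};
```

Driving the model with a normal frame, a frame whose Present reports device-removed, a frame while the GPU is still unavailable, and a frame after it resumes walks through the same four stages as the sequence above.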
Fixes #11471