HwRender: fix ARM64 crash when GPU is suspended during Present (DXGI_ERROR_DEVICE_REMOVED) #11472

Open
etvorun wants to merge 1 commit into dotnet:main from etvorun:fix/arm64-dxgi-device-removed-crash

@etvorun commented Feb 20, 2026

Summary

On ARM64 devices, WPF crashes when the GPU is suspended during an active IDirect3DSwapChain9::Present call. The D3D9-over-DXGI compatibility layer on ARM64 surfaces DXGI_ERROR_DEVICE_REMOVED (0x887A0005) directly instead of translating it to D3DERR_DEVICELOST. WPF's PresentWithD3D did not recognize this code; it fell through to MIL_THR, which treats 0x887A0005 as fatal and crashes the process. As a result, HandlePresentFailure, and with it the device-lost recovery path, was never reached.
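To illustrate why the two codes are distinct values even though both describe a lost device, the sketch below decodes the HRESULT fields of each. It is a standalone model, not WPF code: `HRESULT` is a 32-bit stand-in for the Windows type, and `FacilityOf`, `CodeOf`, `kDxgiDeviceRemoved`, and `kD3dDeviceLost` are hypothetical names introduced here. The numeric value used for D3DERR_DEVICELOST (0x88760868) is taken from the D3D9 headers, not from this PR.

```cpp
#include <cstdint>

// Stand-in for the Windows HRESULT type (a 32-bit signed integer).
using HRESULT = std::int32_t;

// Facility: bits 16..28 of the HRESULT (same field HRESULT_FACILITY reads).
constexpr std::uint32_t FacilityOf(HRESULT hr) {
    return (static_cast<std::uint32_t>(hr) >> 16) & 0x1FFFu;
}

// Code: low 16 bits of the HRESULT (same field HRESULT_CODE reads).
constexpr std::uint32_t CodeOf(HRESULT hr) {
    return static_cast<std::uint32_t>(hr) & 0xFFFFu;
}

// Hypothetical names for the two constants discussed above.
constexpr HRESULT kDxgiDeviceRemoved = static_cast<HRESULT>(0x887A0005u);
constexpr HRESULT kD3dDeviceLost     = static_cast<HRESULT>(0x88760868u);

// Both are failure codes (severity bit set), but they come from
// different facilities, so an equality check against one misses the other.
static_assert(kDxgiDeviceRemoved < 0 && kD3dDeviceLost < 0, "both are failures");
static_assert(FacilityOf(kDxgiDeviceRemoved) == 0x87Au, "DXGI facility");
static_assert(FacilityOf(kD3dDeviceLost) == 0x876u, "D3D9 facility");
```

Because the facilities differ (0x87A versus 0x876), code that only compares against D3D9 values cannot match the DXGI code without an explicit check.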

What changed

src/Microsoft.DotNet.Wpf/src/WpfGfx/include/wgx_error.h

  • Added a local #define DXGI_ERROR_DEVICE_REMOVED ((HRESULT)0x887A0005L) guarded by #ifndef, avoiding a new dependency on dxgi.h. The value is a stable DirectX ABI constant.
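A minimal sketch of what such a guarded definition could look like is shown below. The `HRESULT` alias is a 32-bit stand-in so the fragment compiles outside the Windows SDK; in the real header the type comes from the platform headers, and the exact surrounding context in wgx_error.h is not reproduced here.

```cpp
#include <cstdint>

// Stand-in for the Windows HRESULT type; assumed here only so the
// fragment compiles without the Windows SDK.
using HRESULT = std::int32_t;

// Define the constant only if no earlier header (for example dxgi.h,
// if it is ever included) has already provided it.
#ifndef DXGI_ERROR_DEVICE_REMOVED
#define DXGI_ERROR_DEVICE_REMOVED ((HRESULT)0x887A0005L)
#endif

// The severity bit is set, so the value is negative as an HRESULT.
static_assert(DXGI_ERROR_DEVICE_REMOVED < 0, "must be a failure code");
```

The `#ifndef` guard keeps the local definition from clashing with the SDK definition if both end up in the same translation unit.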

src/Microsoft.DotNet.Wpf/src/WpfGfx/core/hw/d3ddevice.cpp

  • PresentWithD3D: new else if (hr == DXGI_ERROR_DEVICE_REMOVED) branch placed before the MIL_THR call, converting the error to D3DERR_DEVICELOST so the existing device-lost recovery path handles it.
  • HandlePresentFailure: added DXGI_ERROR_DEVICE_REMOVED to the device-lost if condition, as defense-in-depth for other callers such as PresentWithGDI.
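The two changes above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the code in d3ddevice.cpp: `TranslatePresentResult` and `IsDeviceLost` are hypothetical helper names, `HRESULT` is a 32-bit stand-in, and the real device-lost condition in HandlePresentFailure may test additional codes not shown here.

```cpp
#include <cstdint>

// Stand-in for the Windows HRESULT type.
using HRESULT = std::int32_t;

// Values as defined by the D3D9 / DXGI headers (names are local stand-ins).
constexpr HRESULT kOk                     = 0;  // S_OK
constexpr HRESULT kD3dErrDeviceLost       = static_cast<HRESULT>(0x88760868u);
constexpr HRESULT kDxgiErrorDeviceRemoved = static_cast<HRESULT>(0x887A0005u);

// Hypothetical helper modelling the new branch in PresentWithD3D:
// map the DXGI code to its D3D9 equivalent before any generic
// failure handling sees it; leave every other result untouched.
constexpr HRESULT TranslatePresentResult(HRESULT hr) {
    return (hr == kDxgiErrorDeviceRemoved) ? kD3dErrDeviceLost : hr;
}

// Hypothetical predicate modelling the widened condition in
// HandlePresentFailure: accept either spelling of "device lost",
// so callers that skip the translation are still covered.
constexpr bool IsDeviceLost(HRESULT hr) {
    return hr == kD3dErrDeviceLost || hr == kDxgiErrorDeviceRemoved;
}
```

Translating early keeps the rest of the pipeline dealing with a single code, while the widened predicate is the defense-in-depth half for paths that never pass through the translation.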

Why

The fix aligns with the behavior of the previously validated internal source fix. The change is surgical: no new recovery logic is introduced; WPF reuses the existing TDR / device-lost path (MarkUnusable() → WGXERR_DISPLAYSTATEINVALID → RENDERING_STATUS_DEVICE_LOST → resource recreation on GPU resume).

Recovery sequence

| State | What happens |
| --- | --- |
| Present returns DXGI_ERROR_DEVICE_REMOVED | Converted to D3DERR_DEVICELOST; HandlePresentFailure calls MarkUnusable(), invalidates GPU resources, notifies device manager |
| Same frame (render thread) | CRenderTargetManager::HandlePresentErrors swallows WGXERR_DISPLAYSTATEINVALID; fires RENDERING_STATUS_DEVICE_LOST to UI thread. No crash, no zombie. |
| Subsequent frames (GPU suspended) | UpdateDisplayState returns WGXERR_DISPLAYSTATEINVALID; render passes are skipped |
| GPU comes back | UpdateDisplayState succeeds; NotifyTierChange fires; render targets recreated; window repainted |
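The recovery sequence above can be modelled as a small state machine. The sketch below is a deliberately simplified illustration, not WPF's render loop: `Model`, `Outcome`, `RunFrame`, and `ReplayMatchesSequence` are hypothetical names, and the single `usable` flag stands in for the state that MarkUnusable() and UpdateDisplayState manage in the real code.

```cpp
// Per-frame outcomes corresponding to the rows of the recovery sequence.
enum class Outcome { Rendered, DeviceLost, Skipped, Recovered };

// Hypothetical model state: whether the device is currently usable.
struct Model {
    bool usable;
};

// Advance the model by one frame, given whether the GPU is available.
constexpr Outcome RunFrame(Model& m, bool gpuAvailable) {
    if (m.usable) {
        if (gpuAvailable) {
            return Outcome::Rendered;      // normal Present
        }
        m.usable = false;                  // models MarkUnusable()
        return Outcome::DeviceLost;        // device-lost notification
    }
    if (!gpuAvailable) {
        return Outcome::Skipped;           // display state still invalid
    }
    m.usable = true;                       // resources recreated
    return Outcome::Recovered;             // window repainted
}

// Replay the sequence: render, lose the device, skip a frame,
// recover when the GPU returns, then render normally again.
constexpr bool ReplayMatchesSequence() {
    Model m{true};
    return RunFrame(m, true)  == Outcome::Rendered
        && RunFrame(m, false) == Outcome::DeviceLost
        && RunFrame(m, false) == Outcome::Skipped
        && RunFrame(m, true)  == Outcome::Recovered
        && RunFrame(m, true)  == Outcome::Rendered;
}
```

The point of the model is that no transition terminates the process: a lost device only changes which branch later frames take until the GPU is available again.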

Validation

Validated manually on ARM64 hardware by triggering GPU suspension events while a WPF application is actively rendering. The application recovered without crashing after applying this fix. No automated test infrastructure changes are included.

Fixes #11471

On ARM64, the D3D9-over-DXGI compatibility layer surfaces
DXGI_ERROR_DEVICE_REMOVED (0x887A0005) directly when the GPU is
suspended or removed, instead of mapping it to D3DERR_DEVICELOST.

PresentWithD3D only checked for S_OK, S_PRESENT_MODE_CHANGED, and
S_PRESENT_OCCLUDED. Any unrecognised HRESULT fell through to MIL_THR,
which treats 0x887A0005 as fatal, triggering VS_FatalError and a crash.
HandlePresentFailure was never reached.

Fix:
- wgx_error.h: define DXGI_ERROR_DEVICE_REMOVED locally to avoid
  pulling in dxgi.h; the value is a stable DirectX ABI constant.
- PresentWithD3D: add else-if branch that converts
  DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST before MIL_THR, so
  the existing device-lost recovery path handles it gracefully.
- HandlePresentFailure: add DXGI_ERROR_DEVICE_REMOVED to the device-lost
  condition block as defense-in-depth for other call sites (PresentWithGDI).

Recovery follows the existing TDR path: MarkUnusable() invalidates GPU
resources, render thread skips the frame and fires
RENDERING_STATUS_DEVICE_LOST, and when the GPU resumes
UpdateDisplayState recreates resources and repaints the window.
@etvorun etvorun requested review from a team and Copilot February 20, 2026 23:03
@dotnet-policy-service dotnet-policy-service bot added the PR metadata: Label to tag PRs, to facilitate with triage label Feb 20, 2026
Copilot AI left a comment

Pull request overview

This PR fixes a critical crash on ARM64 devices when the GPU is suspended during active rendering. On ARM64, the D3D9-over-DXGI compatibility layer returns DXGI_ERROR_DEVICE_REMOVED (0x887A0005) instead of the expected D3D9 error code D3DERR_DEVICELOST. WPF's PresentWithD3D did not recognize this code, causing MIL_THR to treat it as fatal and crash the process. The fix converts the DXGI error to the D3D9 equivalent before error processing, allowing the existing device-lost recovery mechanism to handle GPU suspension gracefully.

Changes:

  • Defined DXGI_ERROR_DEVICE_REMOVED constant locally in wgx_error.h to avoid adding a dependency on dxgi.h
  • Added error code translation in PresentWithD3D to convert DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST before MIL_THR processing
  • Added DXGI_ERROR_DEVICE_REMOVED to the device-lost condition in HandlePresentFailure as defense-in-depth

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/Microsoft.DotNet.Wpf/src/WpfGfx/include/wgx_error.h | Adds local definition of DXGI_ERROR_DEVICE_REMOVED constant (0x887A0005L) with #ifndef guard |
| src/Microsoft.DotNet.Wpf/src/WpfGfx/core/hw/d3ddevice.cpp | Converts DXGI_ERROR_DEVICE_REMOVED to D3DERR_DEVICELOST in PresentWithD3D before MIL_THR processing; adds DXGI_ERROR_DEVICE_REMOVED to HandlePresentFailure's device-lost condition |

@lindexi lindexi left a comment

LGTM


Development

Successfully merging this pull request may close these issues.

HwRender: WPF crashes on ARM64 when GPU is suspended during frame presentation (DXGI_ERROR_DEVICE_REMOVED)
