fix(host/windows): stop SudoVDA MODE_CHANGE_IN_PROGRESS storm — don't force IDD primary by default

ROOT CAUSE (verified by multi-agent compare vs Apollo + adversarial review):
set_active_mode() applied the SudoVDA mode with CDS_UPDATEREGISTRY | CDS_GLOBAL
| CDS_SET_PRIMARY + DM_POSITION(0,0) — promoting the freshly-added IDD to
PRIMARY at the virtual-screen origin and persisting it globally. On this box
(baseline active display = a 1024x768 basic 'WinDisc') that primary-promotion
contests the existing display so the desktop topology never reaches a stable
fixed point → every DuplicateOutput/AcquireNextFrame during the unending
settle returns DXGI_ERROR_MODE_CHANGE_IN_PROGRESS (0x887A0025). Apollo, live
on this EXACT box with an empty config, never promotes primary and captures
the same SudoVDA at 5120x1440 with zero DXGI errors. (Ruled out earlier on the
live box: win32u hook, DPI, independent-flip/overlay, isolation, render pin.)
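
To make the two flag combinations above concrete, the sketch below writes them out as plain bit masks. The numeric values mirror the Win32 headers and are copied in so the snippet builds on any platform; ModeRequest, forced_primary_request, in_place_request and promotes_primary are illustrative names, not functions from this codebase.

```rust
// Win32 constant values, duplicated here so the sketch is platform-independent.
const CDS_UPDATEREGISTRY: u32 = 0x0000_0001;
const CDS_GLOBAL: u32 = 0x0000_0008;
const CDS_SET_PRIMARY: u32 = 0x0000_0010;
const DM_POSITION: u32 = 0x0000_0020;
const DM_BITSPERPEL: u32 = 0x0004_0000;
const DM_PELSWIDTH: u32 = 0x0008_0000;
const DM_PELSHEIGHT: u32 = 0x0010_0000;
const DM_DISPLAYFREQUENCY: u32 = 0x0040_0000;

/// Fields that describe the mode itself (size, refresh rate, depth), no position.
const MODE_ONLY_FIELDS: u32 =
    DM_PELSWIDTH | DM_PELSHEIGHT | DM_DISPLAYFREQUENCY | DM_BITSPERPEL;

/// Hypothetical pair of (DEVMODEW.dmFields, ChangeDisplaySettingsExW flags).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ModeRequest {
    pub dm_fields: u32,
    pub cds_flags: u32,
}

/// What the old code asked for: primary at (0,0), persisted globally.
pub fn forced_primary_request() -> ModeRequest {
    ModeRequest {
        dm_fields: MODE_ONLY_FIELDS | DM_POSITION,
        cds_flags: CDS_UPDATEREGISTRY | CDS_GLOBAL | CDS_SET_PRIMARY,
    }
}

/// The Apollo-parity request: change this output's mode in place, nothing else.
pub fn in_place_request() -> ModeRequest {
    ModeRequest {
        dm_fields: MODE_ONLY_FIELDS,
        cds_flags: CDS_UPDATEREGISTRY,
    }
}

/// True when a request would move or promote the display, not just resize it.
pub fn promotes_primary(req: ModeRequest) -> bool {
    req.cds_flags & CDS_SET_PRIMARY != 0 || req.dm_fields & DM_POSITION != 0
}
```

The only difference between the two requests is the three bits that touch topology (DM_POSITION, CDS_GLOBAL, CDS_SET_PRIMARY); the mode fields are identical.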

Fixes (subtractive, gated per adversarial review):
- sudovda.rs set_active_mode: default to CDS_UPDATEREGISTRY only (no primary
  promotion, no GLOBAL, no DM_POSITION) = Apollo-parity for the multi-display
  default. Promote to primary (CDS_GLOBAL|CDS_SET_PRIMARY+DM_POSITION) ONLY
  when PUNKTFUNK_ISOLATE_DISPLAYS is set (the check is env::var(..).is_ok(),
  so any value counts, not just =1). That is the sole-display case, where a
  blank extended IDD would otherwise yield no frames. Avoids regressing
  headless/isolated + mid-stream Reconfigure.
- dxgi.rs acquire: treat MODE_CHANGE_IN_PROGRESS (0x887A0025) as a TRANSIENT
  (Ok(None), repeat last frame, wait it out) instead of falling through to the
  fatal Err arm → cold-rebuild → create()→set_active_mode (which re-issued the
  mode change and amplified the storm).
- dxgi.rs acquire: remove the born-lost cold-rebuild escape — it re-created the
  SudoVDA (IOCTL REMOVE/ADD = the audible PnP chime the user heard) and never
  converged; now repeat last frame in-process (never tear the IDD down mid-
  session, like Apollo). Overlay + cheap-spin/HDR recovery left intact.
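
The acquire-side policy in the two dxgi.rs bullets can be sketched as a small classifier plus a log throttle. This is a simplified model under stated assumptions, not the code in dxgi.rs: Action, classify and Throttle are hypothetical names, the HRESULT values are copied from the DXGI headers, and codes the commit does not discuss are simply treated as fatal here.

```rust
// DXGI HRESULT values, duplicated so the sketch builds on any platform.
const DXGI_ERROR_INVALID_CALL: u32 = 0x887A_0001;
const DXGI_ERROR_MODE_CHANGE_IN_PROGRESS: u32 = 0x887A_0025;
const DXGI_ERROR_ACCESS_LOST: u32 = 0x887A_0026;
const DXGI_ERROR_WAIT_TIMEOUT: u32 = 0x887A_0027;

/// Hypothetical outcome of a failed AcquireNextFrame call.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Action {
    /// Transient: return no new frame and let the caller repeat the last one.
    RepeatLastFrame,
    /// Recoverable loss: rebuild the duplication, never the virtual display.
    RebuildDuplication,
    /// Anything else in this sketch surfaces as an error.
    Fatal,
}

pub fn classify(hresult: u32) -> Action {
    match hresult {
        // A mode change that is still settling is waited out like a timeout.
        DXGI_ERROR_WAIT_TIMEOUT | DXGI_ERROR_MODE_CHANGE_IN_PROGRESS => Action::RepeatLastFrame,
        DXGI_ERROR_ACCESS_LOST | DXGI_ERROR_INVALID_CALL => Action::RebuildDuplication,
        // Assumption: codes not named in the commit are treated as fatal.
        _ => Action::Fatal,
    }
}

/// Counter that reports true on events 1, every+1, 2*every+1, ...
pub struct Throttle {
    count: u64,
    every: u64,
}

impl Throttle {
    pub fn new(every: u64) -> Self {
        Throttle { count: 0, every }
    }

    /// Returns true when this event should be logged.
    pub fn tick(&mut self) -> bool {
        self.count += 1;
        // With every <= 1 there is nothing to throttle, so log each event.
        self.every <= 1 || self.count % self.every == 1
    }
}
```

With Throttle::new(120) the first transient is logged immediately and then one in every 120, so a change that never settles stays visible without flooding the log.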

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 14:59:42 +00:00
parent 900089c44c
commit 769fd96b87
2 changed files with 47 additions and 27 deletions
+24 -16
@@ -39,7 +39,7 @@ use windows::Win32::Graphics::Dxgi::Common::{
 use windows::Win32::Graphics::Dxgi::{
     CreateDXGIFactory1, IDXGIAdapter1, IDXGIFactory1, IDXGIOutput1, IDXGIOutput5,
     IDXGIOutputDuplication, IDXGIResource, DXGI_ERROR_ACCESS_LOST, DXGI_ERROR_DEVICE_REMOVED,
-    DXGI_ERROR_DEVICE_RESET,
+    DXGI_ERROR_DEVICE_RESET, DXGI_ERROR_MODE_CHANGE_IN_PROGRESS,
     DXGI_ERROR_INVALID_CALL, DXGI_ERROR_WAIT_TIMEOUT, DXGI_OUTDUPL_DESC, DXGI_OUTDUPL_FRAME_INFO,
     DXGI_OUTDUPL_POINTER_SHAPE_INFO, DXGI_OUTDUPL_POINTER_SHAPE_TYPE_COLOR,
     DXGI_OUTDUPL_POINTER_SHAPE_TYPE_MASKED_COLOR,
@@ -1698,6 +1698,20 @@ impl DuplCapturer {
                 }
                 return Ok(None);
             }
+            // MODE_CHANGE_IN_PROGRESS (0x887A0025) is TRANSIENT by design ("the call may succeed at a
+            // later attempt") — the display topology is mid-settle (e.g. just after the IDD's mode is
+            // applied). Do NOT recover/rebuild: a rebuild re-issues create()→set_active_mode, re-touching
+            // the topology and PERPETUATING the change (the storm we measured). Just repeat the last frame
+            // and wait it out, like a timeout. Throttled log so a genuinely stuck change stays visible.
+            Err(e) if e.code() == DXGI_ERROR_MODE_CHANGE_IN_PROGRESS => {
+                self.dbg_timeouts += 1;
+                if self.dbg_timeouts % 120 == 1 {
+                    tracing::warn!(
+                        "DXGI mode change in progress (0x887A0025) — waiting for topology to settle"
+                    );
+                }
+                return Ok(None);
+            }
             // Recoverable losses, ALL handled by rebuilding the duplication (device + re-DuplicateOutput):
             // ACCESS_LOST — desktop switch (normal <-> Winlogon secure: lock/login/UAC) or mode change
             // INVALID_CALL — the secure->user-desktop switch (post-login) leaves the duplication in a
@@ -1760,24 +1774,18 @@ impl DuplCapturer {
                 } else {
                     std::thread::sleep(Duration::from_millis(8));
                 }
-                // Escape the born-lost storm on the NORMAL desktop. If rebuilds keep coming back
-                // born-lost (created OK, instant ACCESS_LOST), the cheap+heavy re-duplicate will never
-                // converge — this is the hybrid reparent/independent-flip wedge that froze the stream on
-                // its last frame forever. Surface an error so the m3 loop cold-rebuilds the WHOLE
-                // pipeline (fresh VirtualDisplay + device + output), bounded by MAX_CAPTURE_REBUILDS.
-                // NEVER on the secure (Winlogon) desktop: a long static lock/login/UAC dwell is
-                // legitimate and must not end the session.
-                const BORN_LOST_ESCAPE: u32 = 20; // ~5 s at the 250 ms rebuild throttle
-                if self.ever_got_frame
-                    && self.consecutive_born_lost >= BORN_LOST_ESCAPE
-                    && !crate::capture::desktop_watch::is_secure_desktop()
-                {
+                // Born-lost rebuilds (created OK, instant ACCESS_LOST) used to escalate to a full pipeline
+                // cold-rebuild here — but that re-issued vd.create()→set_active_mode (an audible PnP
+                // add/remove chime + a fresh topology mode change), which never converged and amplified
+                // the storm. With the topology fix (set_active_mode no longer promotes the IDD to PRIMARY
+                // by default) the born-lost storm is gone at its source; if one ever recurs, just keep
+                // repeating the last frame in-process — never tear the IDD down mid-session (Apollo never
+                // does). Throttled visibility only.
+                if self.consecutive_born_lost > 0 && self.consecutive_born_lost % 40 == 1 {
                     tracing::warn!(
                         consecutive = self.consecutive_born_lost,
-                        "DDA born-lost storm on normal desktop — escalating to full pipeline cold-rebuild"
+                        "DDA born-lost rebuilds — repeating last frame in-process (no teardown)"
                     );
-                    self.consecutive_born_lost = 0;
-                    return Err(anyhow!("DDA born-lost storm — cold-rebuilding capture pipeline"));
                 }
                 return Ok(None);
             }
+23 -11
@@ -341,9 +341,22 @@ fn set_active_mode(gdi_name: &str, mode: Mode) {
         );
     }
+    // Default (multi-display, Apollo-parity): set ONLY this output's mode in place. Promoting the IDD
+    // to PRIMARY at the virtual-screen origin (DM_POSITION 0,0) + persisting it GLOBALly contests the
+    // box's baseline display (e.g. a 1024x768 basic "WinDisc") so the desktop topology never reaches a
+    // stable fixed point → a perpetual DXGI_ERROR_MODE_CHANGE_IN_PROGRESS storm (the freeze + audible
+    // PnP chime measured live on the RTX4090+iGPU box). Apollo with an EMPTY config never promotes
+    // primary and captures the same SudoVDA cleanly (verified live). So default to CDS_UPDATEREGISTRY
+    // only. ONLY when isolating to a SOLE display does the IDD genuinely need to be primary — a blank
+    // EXTENDED IDD may not be DWM-composited and would yield no duplication frames.
+    let isolating = std::env::var("PUNKTFUNK_ISOLATE_DISPLAYS").is_ok();
+    let mut dm_fields = DM_PELSWIDTH | DM_PELSHEIGHT | DM_DISPLAYFREQUENCY | DM_BITSPERPEL;
+    if isolating {
+        dm_fields |= DM_POSITION; // pin to origin, but only as the sole/primary display
+    }
     let dm = DEVMODEW {
         dmSize: size_of::<DEVMODEW>() as u16,
-        dmFields: DM_PELSWIDTH | DM_PELSHEIGHT | DM_DISPLAYFREQUENCY | DM_BITSPERPEL | DM_POSITION,
+        dmFields: dm_fields,
         dmBitsPerPel: 32,
         dmPelsWidth: mode.width,
         dmPelsHeight: mode.height,
@@ -363,17 +376,16 @@ fn set_active_mode(gdi_name: &str, mode: Mode) {
         );
         return;
     }
+    // Default: CDS_UPDATEREGISTRY only — set this output's mode WITHOUT promoting it to primary or
+    // rewriting the global topology (which storms MODE_CHANGE_IN_PROGRESS). Promote to primary only when
+    // isolating to a sole display.
+    let apply_flags = if isolating {
+        CDS_UPDATEREGISTRY | CDS_GLOBAL | CDS_SET_PRIMARY
+    } else {
+        CDS_UPDATEREGISTRY
+    };
     let apply = unsafe {
-        ChangeDisplaySettingsExW(
-            PCWSTR(wname.as_ptr()),
-            Some(&dm),
-            None,
-            // Make it the PRIMARY display: a blank *extended* IDD output isn't composited by the DWM,
-            // so it produces no duplication frames. As primary it carries the shell/cursor → frames
-            // flow (this is what Apollo does). Position is (0,0) via DM_POSITION (zeroed by default).
-            CDS_UPDATEREGISTRY | CDS_GLOBAL | CDS_SET_PRIMARY,
-            None,
-        )
+        ChangeDisplaySettingsExW(PCWSTR(wname.as_ptr()), Some(&dm), None, apply_flags, None)
     };
     if apply == DISP_CHANGE_SUCCESSFUL {
         tracing::info!(