punktfunk/design/windows-virtual-display-rust-port.md
enricobuehler 7b99b41ede docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00


Windows virtual display — a Rust port of SudoVDA (investigation & plan)

Status: SHIPPED (P1, 2026-06-22) + P2 CLOSED as a dead end. The all-Rust IddCx driver pf-vdisplay (packaging/windows/drivers/pf-vdisplay/) replaced the vendored SudoVDA C++ driver (SudoVDA backend deleted in 84a3b95) and is the sole Windows vdisplay backend; the host drives it via crates/punktfunk-host/src/vdisplay/windows/pf_vdisplay.rs. Live-validated streaming on the RTX box at 5120×1440@240. The current consolidated Windows-host architecture lives in windows-host-rewrite.md. This doc is trimmed to the two things git history can't replace: the on-glass driver-iteration gotchas, and the P2 decision record proving direct-frame-push (IDD-push) is architecturally impossible for bare-metal capture — do not re-attempt it.

All the P1 planning/feasibility/decision content (signing tier, Rust prior art, binding-stack choice, IOCTL contract, phased plan) executed as designed and now lives in the code + windows-host-rewrite.md; it has been cut. What remains below is the durable record.

Driver-iteration gotchas (hard-won, on-glass)

These cost real time during P1 bring-up and apply to any future IddCx/UMDF driver work on this box.

  • INF DriverVer gate. Updating an installed UMDF driver only takes if the INF DriverVer changes. deploy-dev.ps1 stamps a date.time -v on every run; without a bump the old binary keeps running (silently).
  • Devnode hygiene — nefconc, never devgen. Create the root devnode with nefconc --create-device-node (a clean ROOT\DISPLAY node), NOT devgen /add — devgen makes persistent SWD\DEVGEN software devices that survive reboot and registry deletion and resurrect on every pnputil /add-driver (they carry hwid root\pf_vdisplay, so the driver install re-materializes them). The production installer must use a single nefconc/INF-created node and never devgen.
  • Session-0 vs Session-1 observability. Every standalone probe (vdtest, the host's live_create_drop test) runs in Session 0 — the services session, whose desktop is a throwaway 1024×768 basic display. IddCx activation happens in the console Session 1, where the GPU drives the real desktop. So Screen.AllScreens/CCD queries from Session 0 can never see the virtual monitor activate — they report the wrong desktop. The only valid way to drive + observe it is the host service (SYSTEM, which targets Session 1) plus the driver's own OutputDebugString (system-wide, session-agnostic). (An early "monitor arrives but never gets a swap-chain / no DXGI output" symptom was this measurement artifact, not a driver bug.)
  • Accumulated device-state damage. Repeated reinstalls + Disable/Enable-PnpDevice cycles + a control handle the host cached across all of it wedge the device tree (stale handle → the host's PINGs fail → the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output"). A reboot clears it and it works on the first connect. Lesson: after device churn, restart the host service (fresh handle) — and when in doubt, reboot.
  • Hot-reload is unreliable; deploy = install + reboot. pnputil /restart-device does NOT restart WUDFHost (old image stays mapped), Disable/Enable-PnpDevice errors on the root-enumerated IDD, and killing WUDFHost invalidates the host's cached {e5bcc234} control handle (every ADD then fails 0x80070006, and the device can wedge to FAILED_POST_START). A reboot loads a freshly-installed build cleanly. Recovery from a broken build is clean and reboot-free: pnputil /delete-driver <oemNN>.inf /uninstall removes the bad package and the device rebinds the previous (validated) package in the DriverStore.
  • FAILED_POST_START is usually churn, not the binary. Comparing a working vs. a suspect DLL's import tables came out identical (same DLLs; the size/hash delta is just the Authenticode signature). A clean install + reboot (no restart-device/disable-enable/kill in between) loads to OK.
  • The swap-chain drain is required. The swap-chain processor is a faithful port of virtual-display-rs's — it drains correctly via ReleaseAndAcquireBuffer + FinishedProcessingFrame. The drain is required; a true no-op stalls DWM and freezes the captured image.
  • pf-vdisplay can't coexist with SudoVDA. They register the same control-interface GUID, so two IddCx adapters claiming {e5bcc234} → FAILED_POST_START. pf-vdisplay replaces SudoVDA (now moot, since SudoVDA is deleted, but the same rule binds any second IDD that claims the GUID).
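
The stale-handle failure chain above (PINGs fail → the 3 s watchdog tears the monitor down) is easier to reason about as a tiny state machine. The following is a minimal sketch of such a keep-alive watchdog, not the pf-vdisplay implementation: the `Watchdog` type and its method names are hypothetical, and only the 3 s timeout and the PING concept come from this doc. Time is passed in explicitly so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Hypothetical keep-alive watchdog: the virtual monitor stays up only
/// while PINGs from the host keep arriving.
pub struct Watchdog {
    timeout: Duration,
    last_ping: Instant,
}

impl Watchdog {
    /// Start the watchdog; creation counts as the first PING.
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_ping: now }
    }

    /// Record a PING that actually reached the driver.
    /// A PING sent on a stale control handle never gets here.
    pub fn ping(&mut self, now: Instant) {
        self.last_ping = now;
    }

    /// True once no PING has arrived for `timeout`; the caller would
    /// then tear the monitor down.
    pub fn expired(&self, now: Instant) -> bool {
        now.saturating_duration_since(self.last_ping) >= self.timeout
    }
}
```

With a 3 s timeout, a host whose cached handle went stale stops refreshing `last_ping`, so `expired` turns true mid-session even though the host believes it is still pinging. That is why restarting the host service (fresh handle) fixes the symptom.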

P2 — direct frame push (kill DDA): decision record — DEAD END, DO NOT PURSUE

P2 wanted the driver to publish each swap-chain frame to the host directly (Looking-Glass style), to retire DXGI Desktop Duplication and its multi-GPU survival code (capture/dxgi.rs's DXGI_ERROR_ACCESS_LOST/MODE_CHANGE_IN_PROGRESS re-duplication churn and the win32u.dll install_gpu_pref_hook() patch). It cannot work for bare-metal console-desktop capture. All the IDD-push code stays in-tree, compiles, and is gated off behind PUNKTFUNK_IDD_PUSH — dormant and harmless — as the documented record so it isn't re-tried.

What was proven sound (so the failure is not a transport bug)

  • Producer and consumer are both in Session 0. The pf-vdisplay host process is WUDFHost.exe (-DeviceGroupId:pfVDisplayGroup) and the punktfunk host service is LocalSystem, so both run in Session 0. A D3D11 shared keyed-mutex texture created in the driver can therefore be opened by name in the host (ID3D11Device1::OpenSharedResourceByName) with both devices on the same render-adapter LUID (the driver reports it out of the ADD IOCTL via OsAdapterLuid). Named kernel objects resolve through Session 0's shared \BaseNamedObjects, so no Global\ prefix / SeCreateGlobalPrivilege gymnastics are needed for same-session use. The Looking-Glass cross-VM shared-memory device is unnecessary: this is cross-process, same-session, one GPU.
  • Transport shape (built): a ring of N (default 3) shared keyed-mutex textures (newest-wins, so the swap-chain thread never blocks; a stalled IddCxSwapChainReleaseAndAcquireBuffer loop freezes DWM compositing system-wide) + a named metadata header ({magic, version, generation, width, height, dxgi_format, ring_len, latest}) + a frame-ready auto-reset event. A generation counter bumps on a mode change so the host re-opens the ring.
  • The inversion (required): host creates, driver opens. WUDFHost runs with a restricted token: it can neither write the filesystem nor create named kernel objects (CreateFileMappingW/CreateEventW/CreateSharedHandle all fail silently), which a file-logging driver build confirmed (it wrote no file at all even though init() runs in DriverEntry and the device is OK). This is exactly why the gamepad UMDF drivers invert it (inject/dualsense_windows.rs): the HOST creates the section (privileged → a permissive Global\ name + SDDL D:(A;;GA;;;WD)) and the DRIVER only OPENS it. The host-created-ring / restricted-open split was implemented and works every time (created shared ring … render_luid=…, no name collisions after the per-attempt generation fix). The gamepad drivers independently prove a UMDF driver can open + write a host-created Global\ section on this box, so the driver writing nothing is not an access problem.
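
To make the transport shape concrete, here is a minimal sketch of the metadata header and the newest-wins slot selection, under stated assumptions. The field names and their order come from the list above; the field widths, the `RING_MAGIC` value, the `NO_FRAME` sentinel, the interpretation of `latest` as a slot index, and the `Consumer`/`Poll` types are all hypothetical. The shared textures, keyed mutexes, and the frame-ready event are left out; `poll` is assumed to run after that event fires.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

pub const RING_MAGIC: u32 = 0x5046_5244; // hypothetical tag, not the real value
pub const NO_FRAME: u32 = u32::MAX; // hypothetical "nothing published yet" sentinel

/// Shared header; field order follows the doc, field widths are assumed.
#[repr(C)]
pub struct RingHeader {
    pub magic: u32,
    pub version: u32,
    pub generation: AtomicU32,
    pub width: u32,
    pub height: u32,
    pub dxgi_format: u32,
    pub ring_len: u32,
    pub latest: AtomicU32,
}

impl RingHeader {
    pub fn new(width: u32, height: u32, dxgi_format: u32, ring_len: u32) -> Self {
        // Sketch simplification: with >= 3 slots the producer can always
        // avoid both the newest frame and the slot the consumer holds.
        assert!(ring_len >= 3, "sketch assumes at least 3 slots");
        Self {
            magic: RING_MAGIC,
            version: 1,
            generation: AtomicU32::new(0),
            width,
            height,
            dxgi_format,
            ring_len,
            latest: AtomicU32::new(NO_FRAME),
        }
    }

    /// Producer: pick the slot to overwrite without ever waiting.
    /// Skips the slot the consumer currently holds, if any.
    pub fn next_slot(&self, held_by_consumer: Option<u32>) -> u32 {
        let latest = self.latest.load(Ordering::Relaxed);
        let mut slot = if latest == NO_FRAME { 0 } else { (latest + 1) % self.ring_len };
        if held_by_consumer == Some(slot) {
            slot = (slot + 1) % self.ring_len;
        }
        slot
    }

    /// Producer: make `slot` the newest frame (after its pixels are written).
    pub fn publish(&self, slot: u32) {
        self.latest.store(slot, Ordering::Release);
    }

    /// Mode change: new dimensions, ring emptied, generation bumped.
    pub fn mode_change(&mut self, width: u32, height: u32) {
        self.width = width;
        self.height = height;
        self.latest.store(NO_FRAME, Ordering::Release);
        self.generation.fetch_add(1, Ordering::Release);
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Poll {
    Reopen,     // generation changed: re-open the ring
    Empty,      // nothing published yet
    Frame(u32), // slot index of the newest frame
}

pub struct Consumer {
    opened_generation: u32,
}

impl Consumer {
    pub fn open(h: &RingHeader) -> Self {
        Self { opened_generation: h.generation.load(Ordering::Acquire) }
    }

    /// Consumer: check the generation first, then take the newest slot.
    pub fn poll(&self, h: &RingHeader) -> Poll {
        if h.generation.load(Ordering::Acquire) != self.opened_generation {
            return Poll::Reopen;
        }
        match h.latest.load(Ordering::Acquire) {
            NO_FRAME => Poll::Empty,
            slot => Poll::Frame(slot),
        }
    }
}
```

The design point is that `next_slot` never blocks: if the consumer is slow, older frames are simply overwritten, which keeps the swap-chain thread draining.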

Root cause — the swap-chain is never assigned (fundamental, not fixable)

Across every configuration tested, the driver's run_core swap-chain processor is never entered (run_core_entries=0):

  • in-process (Session 0) and WGC-triggered (Session 1 helper) sessions,
  • a user-created ring AND a host-created (LocalSystem) ring with the permissive D:(A;;GA;;;WD) SDDL,
  • with and without a Low-IL (S:(ML;;NW;;;LW)) mandatory label,
  • with WUDFHost confirmed not an AppContainer (IsAppContainer=0),

— even while WGC simultaneously captured the same virtual monitor's composition and streamed multi-MB of HEVC.

An IddCx virtual monitor only receives a swap-chain (EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN) when the OS presents/scans-out to it, which requires a real presentation consumer. WGC/DDA capture of the composed desktop does NOT count — it reads DWM's composition, bypassing the driver's swap-chain. With no physical scanout and no consumer that routes through the driver, the path stays inactive (IDDCX_PATH_FLAGS=0) and ASSIGN_SWAPCHAIN never fires. Session 0 additionally has no DWM/compositor at all.

Ecosystem + first-party confirmation:

  • Every bare-metal virtual-display capture project uses WGC/DDA, not the driver swap-chain: SudoVDA (its swap-chain loop acquires-and-discards), Apollo/Sunshine (DDA + WGC backends), virtual-display-rs (discards), parsec-vdd (no frame path). Only Looking Glass consumes the driver swap-chain — and only because a VM guest scans out the display (the consumer). Bare metal has no equivalent.
  • Microsoft's own unanswered Q&A (learn.microsoft.com/answers 4096179) reports the identical symptom for the IddSampleDriver: virtual display "always inactive," ASSIGN_SWAPCHAIN never runs.

Both remaining escape hatches tested and closed

  • Option 3 — a present source on the display — TESTED, failed. A present-trigger added to the Session-1 WGC helper successfully created a D3D11 swapchain on the virtual display and presented continuously (WGC even captured the flashing window). The driver stayed run_core_entries=0 / frames_acquired=0. So an active present source does NOT make the OS assign the driver's swap-chain — DWM composes the present onto the display (capturable) without routing it through the driver.
  • Option 2 — a driver flag — closed by analysis. The present-trigger succeeding proves the path is already active; the missing piece is scanout routed through the driver, which the OS does only for a real consumer (physical display / VM guest / RDP). The one IddCx flag for that — IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER — requires the RDP protocol stack as the consumer, which bare-metal console capture has no equivalent of.

Verdict (final)

IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal console desktop-capture fundamentally cannot provide. No host-side capture, no in-process path, no present source, and no available driver flag overcomes it. WGC (normal desktop) + DDA (secure desktop) is the only viable Windows capture path, as the entire ecosystem already does. Any future "lower overhead" must come from optimizing the WGC/DDA path (trimming the Session-0↔Session-1 relay, zero-copy encode), not from IDD-push. The remaining gaps a hypothetical IDD-push would also have had (cursor delivered separately via IddCxMonitorSetupHardwareCursor/QueryHardwareCursor; HDR needing the IddCx 1.11 D3D12 acquire path SetDevice2/ReleaseAndAcquireBuffer2 → ID3D12Resource, since the default swap-chain surface is 8-bit) are moot for the same reason.
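
The resulting capture policy is small enough to state as code. This is an illustrative sketch only: `DesktopKind`, `CaptureBackend`, and `select_backend` are hypothetical names, not the host's actual types. The mapping itself (WGC for the normal desktop, DDA for the secure desktop) is the one given in the verdict.

```rust
/// Which desktop is currently on screen (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DesktopKind {
    Normal,
    Secure,
}

/// The two viable Windows capture paths; IDD-push is deliberately absent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureBackend {
    Wgc,
    Dda,
}

/// Pick the capture backend for the desktop being shown.
pub fn select_backend(desktop: DesktopKind) -> CaptureBackend {
    match desktop {
        DesktopKind::Normal => CaptureBackend::Wgc,
        DesktopKind::Secure => CaptureBackend::Dda,
    }
}
```

Keeping IDD-push out of the enum, rather than as a disabled variant, matches the do-not-pursue verdict: there is no code path that could select it by accident.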

Open items

None. P1 shipped; P2 is a permanent do-not-pursue record (no pending work). WGC/DDA is the shipping capture path.