punktfunk/docs/windows-virtual-display-rust-port.md
enricobuehler d39da4bc06 feat(windows): pf-vdisplay — all-Rust IddCx virtual display (replaces SudoVDA)
P1 done: a pure-Rust UMDF2 IddCx driver, drop-in compatible with the host's
existing vdisplay/sudovda.rs control plane (the {e5bcc234} interface + the
SudoVDA IOCTL ABI), so the host drives it unchanged. Validated streaming on
glass at 5120x1440@240 — steady 240 fps, ~2.4 ms encode, clean teardown, full
parity with SudoVDA.

- Vendored wdf-umdf-sys / wdf-umdf bindgen crates (MIT, from virtual-display-rs)
  + the SDK-version build.rs fix that resolves the IddCxStub lib path by the WDK
  version actually containing um\x64\iddcx, not the max base SDK.
- pf-vdisplay crate: entry/callbacks/context/control/monitor/edid/
  swap_chain_processor. Our OWN 128-byte EDID (manufacturer PNK, product
  punktfunk — no SudoVDA bytes), a real swap-chain drain (faithful vdd port,
  required so DWM keeps compositing), the SudoVDA-compatible IOCTL control plane
  (ADD/REMOVE/PING/GET_WATCHDOG/GET_VERSION/SET_RENDER_ADAPTER) + a watchdog that
  tears down orphaned monitors when the host stops pinging.
- deploy-dev.ps1: stage + sign + stampinf (date.time DriverVer) + Inf2Cat +
  install, codifying the "bump DriverVer or pnputil keeps the old binary" gotcha.
- docs/windows-virtual-display-rust-port.md: investigation, the on-glass
  validation, and the two traps that cost time (Session-0 measurement +
  accumulated device-state needing a reboot).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-24 00:36:21 +02:00


Windows virtual display — a Rust port of SudoVDA (investigation & plan)

Status: P1 done — pf-vdisplay validated streaming on glass at 5120×1440@240 (2026-06-22). The all-Rust IddCx driver replaces the vendored SudoVDA C++ driver, matching the "all-Rust UMDF, zero external driver deps" direction we finished for gamepads (ViGEmBus gone; DualSense/DS4/XUSB shipped). The investigation/plan below is kept for context; see Validated on-box for the result.

TL;DR

A Rust port is feasible, low-on-blockers, and strategically aligned — and there's an unexpected architectural prize beyond "same thing, in Rust."

  • Signing is not a blocker. An IddCx driver is UMDF user-mode; it needs no WHQL, no attestation, no test-signing. A self-signed cert in LocalMachine Root + TrustedPublisher loads it — exactly the model our gamepad drivers already ship (and exactly what SudoVDA and the other forks do). (Do UMDF drivers require signing?)
  • We would not be first in Rust. MolotovCherry/virtual-display-rs is a complete, shipping IddCx driver written in Rust (MIT), with hand-rolled IddCx/WDF bindgen bindings (wdf-umdf-sys + wdf-umdf) and a reference swap-chain processor. This turns "greenfield FFI" into "adapt a proven reference."
  • The prize: we can stop using DXGI Desktop Duplication. An IddCx driver already receives the composited desktop frames in its swap-chain. Looking Glass ships exactly this in production — driver consumes the swap-chain, hands frames to a separate process, "operates entirely independently of DDA." Doing the same would delete an entire class of multi-GPU bugs the current capture/dxgi.rs is built to survive (ACCESS_LOST storms, MODE_CHANGE_IN_PROGRESS, the win32u.dll reparenting patch).

Recommendation: yes, build it in Rust, in phases — a drop-in DDA-compatible driver first (own the stack at low risk), then the direct-frame-push path (the real cleanup). Keep vendoring SudoVDA as the safe interim until the Rust driver is on-glass-validated on the RTX box.

Validated on-box (2026-06-22)

Before committing, the toolchain + load path were proven on the RTX box (Win11 26200, WDK 26100):

  • A Rust IddCx driver builds with our toolchain. Cloned virtual-display-rs and built its driver .dll against our WDK (UMDF 2.31 + IddCx 1.4 stubs, bindgen over IddCx.h via our LLVM, nightly-2024-07-26). One fix needed: its build.rs picked the max SDK Lib version (10.0.28000.0, a base SDK with no IddCx) for the IddCxStub search path; resolving it by the version that actually contains um\x64\iddcx\1.4 (10.0.26100.0, the WDK) fixed the link.
  • It installs self-signed and loads. Signed .dll/.cat with our existing driver cert (the gamepad punktfunk-ds-test), pnputil /add-driver, root devnode via devgen. The device came up Status OK / CM_PROB_NONE, Class Display, hosted by WUDFRd — a Rust IddCx adapter initialized cleanly. (SudoVDA, already live here, independently confirms IddCx + self-signed UMDF work on this box.) Test artifacts removed afterward; SudoVDA untouched.
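The build.rs fix described above can be sketched roughly as follows. This is a minimal illustration, not the vendored build.rs itself: the function names `parse_version` and `find_iddcx_lib_dir` are hypothetical, and the only behavior taken from the text is "pick the version directory that actually contains um\x64\iddcx, not the highest version present".

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Parse a dotted version directory name such as "10.0.26100.0".
/// Returns None for anything that is not purely numeric components.
fn parse_version(name: &str) -> Option<Vec<u32>> {
    name.split('.').map(|part| part.parse().ok()).collect()
}

/// Return the `um/x64/iddcx` directory of the highest-versioned entry under
/// `lib_root` that actually contains it. A newer base SDK without IddCx is
/// skipped instead of winning just because its version number is larger.
fn find_iddcx_lib_dir(lib_root: &Path) -> Option<PathBuf> {
    let mut candidates: Vec<(Vec<u32>, PathBuf)> = fs::read_dir(lib_root)
        .ok()?
        .filter_map(|entry| entry.ok())
        .filter_map(|entry| {
            let version = parse_version(entry.file_name().to_str()?)?;
            let iddcx = entry.path().join("um").join("x64").join("iddcx");
            iddcx.is_dir().then_some((version, iddcx))
        })
        .collect();
    // Sort by parsed version (numeric, component-wise) and take the newest.
    candidates.sort();
    candidates.pop().map(|(_, path)| path)
}
```

With a `Lib` directory holding 10.0.28000.0 (no iddcx) and 10.0.26100.0 (with iddcx), this returns the 10.0.26100.0 path, which matches the outcome described for the RTX box.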

Conclusion: the central risk ("can we build + load a Rust IddCx driver here?") is retired. The binding question (D2) resolves toward reusing virtual-display-rs's self-contained wdf-umdf-sys + wdf-umdf bindgen crates (now proven to build + load on our box) rather than extending windows-drivers-rs — IddCx functions are direct IddCxStub exports the WDF function-table macro can't reach anyway, so a unified bindgen is the cleaner base for pf-vdisplay. Reference clone kept at C:\Users\Public\virtual-display-rs.

Scaffold + driver logic landed + on-glass: packaging/windows/vdisplay-driver/ holds the vendored wdf-umdf-sys/wdf-umdf (MIT, + the SDK-version build.rs fix) + the pf-vdisplay driver crate. The full IddCx driver is ported (entry → IDD_CX_CLIENT_CONFIG with all 7 callbacks → device/monitor context → our own EDID → a real swap-chain drain), with the IPC/serde/tokio stack replaced by an in-tree monitor model and OutputDebugString logging. Validated on the RTX box: built, signed (our punktfunk-ds-test cert), installed, loaded with Status OK, and a real virtual monitor arrived ("VirtuDisplay+", DISPLAY\CHY0000), i.e. our own all-Rust IddCx virtual display creating a monitor.

IOCTL control plane done + on-glass (P1 functionally complete): the SudoVDA-compatible control plane is implemented (EVT_IDD_CX_DEVICE_IO_CONTROL + the {e5bcc234-…} interface registered via WdfDeviceCreateDeviceInterface; control.rs with byte-identical structs) — ADD a monitor at a requested mode → {LUID, target_id} (target id + adapter LUID captured from IDARG_OUT_MONITORARRIVAL), REMOVE by GUID, PING/GET_WATCHDOG watchdog, GET_VERSION, SET_RENDER_ADAPTER (IddCxAdapterSetRenderAdapter); per-ADD mode injection (requested mode preferred + fallbacks). Added the five missing FFI wrappers to the vendored wdf-umdf. Validated on the RTX box with a probe that mimics vdisplay/sudovda.rs exactly: GET_VERSION → 0.2.1, GET_WATCHDOG → timeout=3, ADD 1920×1080@60 → target_id=257 + adapter LUID, a real "VirtuDisplay+" monitor arrived at the requested mode, REMOVE ok. Constraint: pf-vdisplay can't coexist with SudoVDA — they register the same interface GUID, so two IddCx adapters claiming it → FAILED_POST_START; pf-vdisplay replaces SudoVDA (validated by disabling SudoVDA first).

Watchdog + real-host drive validated: added the watchdog thread (1 Hz countdown reset by any IOCTL; tears down all monitors at 0 so a gone host never leaves a phantom display; mirrors SudoVDA's RunWatchdog). Pointed the real host at it — removed SudoVDA's devnode so pf-vdisplay is the sole {e5bcc234} provider, then ran the host's vdisplay::sudovda::tests::live_create_drop (PUNKTFUNK_SUDOVDA_LIVE=1): test passed, and the pf-vdisplay log shows the host's IOCTLs landing — ADD 1920x1080@60 → target_id=258, luid=…02619823, then the watchdog correctly tore the monitor down when the test process exited without a final REMOVE. So vdisplay/sudovda.rs drives pf-vdisplay unchanged through the full control contract.
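The watchdog behavior described above (a 1 Hz countdown, reset by any IOCTL, teardown at zero) can be sketched as a small state machine. This is an illustrative sketch, not the pf-vdisplay source: the `Watchdog` type and its method names are assumptions, and the thread that calls `tick` once per second and the actual monitor teardown are left out.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Countdown watchdog: any IOCTL resets it, a 1 Hz tick decrements it.
pub struct Watchdog {
    timeout_secs: u32,
    remaining: AtomicU32,
}

impl Watchdog {
    pub fn new(timeout_secs: u32) -> Self {
        Self { timeout_secs, remaining: AtomicU32::new(timeout_secs) }
    }

    /// Called from the IOCTL handler for every request (PING included).
    pub fn reset(&self) {
        self.remaining.store(self.timeout_secs, Ordering::SeqCst);
    }

    /// Called once per second. Returns true only on the tick that moves the
    /// counter from 1 to 0, which is when the caller should tear down all
    /// monitors. Further ticks at 0 return false, so teardown fires once.
    pub fn tick(&self) -> bool {
        let previous = self.remaining.fetch_update(
            Ordering::SeqCst,
            Ordering::SeqCst,
            |remaining| remaining.checked_sub(1),
        );
        previous == Ok(1)
    }
}
```

Keeping the counter in an atomic lets the IOCTL path reset it without taking a lock shared with the ticking thread; whether the real driver does it this way is not stated in this document.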

Validated streaming end-to-end on glass (2026-06-22) — P1 complete. pf-vdisplay is a working SudoVDA replacement. Driven by the real host (serve, the LocalSystem service) with a stock client at 5120×1440@240: the monitor arrives, resolve_gdi_name → \\.\DISPLAY10, set_active_mode + CCD-isolate succeed, the DXGI output resolves under the RTX 4090, WGC capture + NVENC run at steady 240 fps, ~2.4 ms encode, 6512 AUs sent, clean teardown (isolate restored rc=0x0). Same vdisplay/sudovda.rs path, unchanged — full parity with SudoVDA.

The earlier "monitor arrives but never gets a swap-chain / no DXGI output" symptoms were a measurement + state artifact, not a driver bug. Two traps cost a lot of time:

  1. Session 0. Every standalone probe (vdtest, the host's live_create_drop test) ran in Session 0 — the services session, whose desktop is a throwaway 1024×768 basic display. IddCx activation happens in the console Session 1, where the 4090 drives the real desktop. So Screen.AllScreens/CCD queries from Session 0 can never see the virtual monitor activate — they report the wrong desktop. The only valid way to drive + observe it is the host service (SYSTEM, which targets Session 1) plus the driver's own OutputDebugString (system-wide, session-agnostic).
  2. Accumulated device-state damage. Repeated reinstalls + Disable/Enable-PnpDevice cycles + a control handle the host cached across all of it left the device tree wedged (stale handle → the host's PINGs fail → the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output"). A reboot cleared it and it worked on the first connect. Lesson: after device churn, restart the host service (fresh handle) — and when in doubt, reboot.

The swap-chain processor is a faithful port of virtual-display-rs's (it drains correctly via ReleaseAndAcquireBuffer + FinishedProcessingFrame — the drain is required; a true no-op would stall DWM and freeze the captured image). The EDID is our own clean 128-byte block (manufacturer PNK, product punktfunk) — no SudoVDA bytes.
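Two pieces of a 128-byte EDID base block are mechanical enough to show: the packed three-letter manufacturer ID and the trailing checksum. The sketch below follows the standard EDID encoding (5 bits per letter with 'A' = 1, stored big-endian in bytes 8 and 9; byte 127 chosen so the block sums to 0 mod 256). The function names are hypothetical and this is not the contents of the driver's edid.rs.

```rust
/// Pack a three-letter PNP manufacturer ID into EDID bytes 8..=9.
/// Each letter is stored as 5 bits ('A' = 1), big-endian, top bit zero.
fn encode_manufacturer(id: &str) -> Option<[u8; 2]> {
    let bytes = id.as_bytes();
    if bytes.len() != 3 || !bytes.iter().all(|b| b.is_ascii_uppercase()) {
        return None;
    }
    let packed = bytes
        .iter()
        .fold(0u16, |acc, b| (acc << 5) | u16::from(b - b'A' + 1));
    Some(packed.to_be_bytes())
}

/// Set byte 127 so that all 128 bytes of the base block sum to 0 mod 256.
fn finalize_checksum(block: &mut [u8; 128]) {
    let sum = block[..127].iter().fold(0u8, |acc, b| acc.wrapping_add(*b));
    block[127] = 0u8.wrapping_sub(sum);
}
```

For the manufacturer ID PNK named above, `encode_manufacturer("PNK")` yields the bytes 0x41 0xCB.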

Build gotcha (important for iterating): updating an installed UMDF driver only takes if the INF DriverVer changes. deploy-dev.ps1 stamps a date.time -v on every run; without a bump the old binary keeps running (silently). Devnode hygiene: create the root devnode with nefconc --create-device-node (a clean ROOT\DISPLAY node), NOT devgen /add. devgen makes persistent SWD\DEVGEN software devices that survive reboot and registry deletion and resurrect on every pnputil /add-driver (they have hwid root\pf_vdisplay, so the driver install re-materializes them). The production installer must use a single nefconc/INF-created node and never devgen.

Why we'd do this

The user's goals, mapped to outcomes:

| Goal | Outcome |
| --- | --- |
| Drop external deps | No more vendored prebuilt SudoVDA .dll/.cat (third-party, C++, single upstream). |
| Increase Rust coverage | The display driver joins the gamepad drivers as in-tree Rust UMDF. |
| Own the stack / easier display management | We control the IOCTL protocol, the EDID, the mode list, the watchdog, and can fold the topology/mode logic that's currently scattered in vdisplay/sudovda.rs into the driver. |
| Cleaner code | Phase 2 retires capture/dxgi.rs's DDA workarounds + the win32u.dll patch. |

What we'd be replacing (current architecture)

  • Driver: SudoVDA — UMDF2 IddCx, Class=Display, UmdfExtensions=IddCx0102, UpperFilters=IndirectKmd, root-enumerated Root\SudoMaker\SudoVDA. Vendored prebuilt under packaging/windows/sudovda/, installed by install-sudovda.ps1 (cert → nefconc devnode → pnputil). Source is public (SudoMaker/SudoVDA, README-only MIT/CC0 grant over the MS sample, ~1,900 LOC C++).
  • Host contract: crates/punktfunk-host/src/vdisplay/sudovda.rs opens the control device by interface GUID {e5bcc234-…} and drives a tiny METHOD_BUFFERED IOCTL protocol — byte-identical to SudoVDA's Common/Include/sudovda-ioctl.h:
    • ADD (0x800) {w,h,refresh,GUID,name[14],serial[14]} → {LUID, target_id}
    • REMOVE (0x801) {GUID} · SET_RENDER_ADAPTER (0x802) {LUID} · GET_WATCHDOG (0x803) · PING (0x888) (mandatory keepalive) · GET_VERSION (0x8FF)
  • Capture: capture/dxgi.rs finds the virtual monitor's GDI output across all adapters (it's enumerated under the rendering GPU, not SudoVDA's LUID) and runs DXGI Desktop Duplication (DuplicateOutput1, FP16 for HDR). This file is dominated by virtual-display-over-DDA survival code: DXGI_ERROR_ACCESS_LOST re-duplication with retries, MODE_CHANGE_IN_PROGRESS backoff, legacy-DuplicateOutput fallback, CCD display isolation to make the IDD the sole composited desktop, and an install_gpu_pref_hook() that patches win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue to stop DXGI reparenting the output across GPUs. Most of that exists because we capture a virtual display via DDA on a multi-GPU box.
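The host contract above can be sketched in Rust as follows. The function numbers (0x800 and so on), METHOD_BUFFERED, and the field lists come from the text; everything else is an assumption made for illustration: the device type (FILE_DEVICE_UNKNOWN), the access bits (FILE_ANY_ACCESS), the integer widths of each field, and all type and constant names. The authoritative layout is SudoVDA's sudovda-ioctl.h, not this sketch.

```rust
use std::mem::size_of;

/// Windows CTL_CODE packing: device type, access, function, method.
const fn ctl_code(device_type: u32, function: u32, method: u32, access: u32) -> u32 {
    (device_type << 16) | (access << 14) | (function << 2) | method
}

const FILE_DEVICE_UNKNOWN: u32 = 0x22; // assumption: device type is not stated in this doc
const METHOD_BUFFERED: u32 = 0;
const FILE_ANY_ACCESS: u32 = 0; // assumption: access bits are not stated in this doc

const fn vda_ioctl(function: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, function, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

pub const IOCTL_ADD: u32 = vda_ioctl(0x800);
pub const IOCTL_REMOVE: u32 = vda_ioctl(0x801);
pub const IOCTL_SET_RENDER_ADAPTER: u32 = vda_ioctl(0x802);
pub const IOCTL_GET_WATCHDOG: u32 = vda_ioctl(0x803);
pub const IOCTL_PING: u32 = vda_ioctl(0x888);
pub const IOCTL_GET_VERSION: u32 = vda_ioctl(0x8FF);

/// ADD input: {w, h, refresh, GUID, name[14], serial[14]}.
/// Field widths are assumed (u32 dimensions, 16-byte GUID, byte strings).
#[repr(C)]
pub struct AddParams {
    pub width: u32,
    pub height: u32,
    pub refresh: u32,
    pub monitor_guid: [u8; 16],
    pub name: [u8; 14],
    pub serial: [u8; 14],
}

/// ADD output: {LUID, target_id}. LUID is the usual {u32 low, i32 high} pair.
#[repr(C)]
pub struct AddOut {
    pub luid_low: u32,
    pub luid_high: i32,
    pub target_id: u32,
}

// Compile-time layout checks: a byte-identical ABI must not silently drift.
const _: () = assert!(size_of::<AddParams>() == 56);
const _: () = assert!(size_of::<AddOut>() == 12);
```

The compile-time size assertions are the useful part of the pattern: if a field is reordered or widened, the build fails instead of the driver and host disagreeing at runtime.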

Feasibility findings

Signing — green (the make-or-break)

UMDF user-mode ⇒ Code-Integrity signing rules don't apply to our binary (the only kernel piece is Microsoft's inbox IndirectKmd). Self-signed cert in Root + TrustedPublisher is sufficient on a normal Secure-Boot Win11 box — no bcdedit /set testsigning. SudoVDA and virtual-display-rs both ship this way. This is the same model as our DualSense/DS4/XUSB drivers. (The only thing that breaks install is a botched cert placement, not a signing tier.)

Rust prior art — exists, MIT, reusable

virtual-display-rs proves an all-Rust IddCx driver runs in production and gives us: wdf-umdf-sys (bindgen over WDF and iddcx.h, links IddCxStub), wdf-umdf (safe wrappers — iddcx.rs ~300 LOC, with an IddCxIsFunctionAvailable! version-gate macro), and a reference driver (swap_chain_processor.rs ~158 LOC, direct_3d_device.rs, edid.rs). Caveat: it uses its own bindgen stack, not microsoft/windows-drivers-rs — see Decision D2.

windows-drivers-rs IddCx support — absent, but a bounded extension

Our wdk-sys (m0) binds Base + WDF + feature-gated subsets (hid/gpio/spb/…). Zero IddCx symbols. Adding it is the same shape as the existing subsets: an ApiSubset::Iddcx variant + iddcx feature → iddcx_headers() returning iddcx.h for bindgen, and linking IddCxStub.lib. IddCx functions are not WDF-table functions, so the call_unsafe_wdf_function_binding! macro doesn't apply; they're direct IddCxStub exports we'd #[link(name="IddCxStub")] extern (or bindgen) and wrap ourselves. windows 0.58 (already in the tree) provides the Direct3D11/Dxgi APIs the swap-chain loop needs.

The IddCx driver itself — well-understood, ~1–2k LOC

Required callbacks (baselined on the MS IddSampleDriver, ~1,100 LOC, IddCx 1.4): EVT_IDD_CX_ADAPTER_INIT_FINISHED, ADAPTER_COMMIT_MODES, PARSE_MONITOR_DESCRIPTION, MONITOR_GET_DEFAULT_DESCRIPTION_MODES, MONITOR_QUERY_TARGET_MODES, MONITOR_ASSIGN_SWAPCHAIN (the only callback with real D3D work), MONITOR_UNASSIGN_SWAPCHAIN, and DEVICE_IO_CONTROL (where our ADD/REMOVE/PING IOCTLs live). Init flow: IddCxDeviceInitConfig (on the WDFDEVICE_INIT, before the device exists) → WdfDeviceCreate → IddCxDeviceInitialize → IddCxAdapterInitAsync → IddCxMonitorCreate → IddCxMonitorArrival.

Arbitrary resolutions don't need EDID timings: ship one generic ~128/256-byte EDID base block to make Windows treat the target as a real monitor, then advertise modes programmatically from the mode-list callbacks — a static table plus the runtime-requested client mode injected as preferred (exactly SudoVDA's s_DefaultModes[] + per-ADD preferred-mode approach). 5120×1440@240 just gets added at ADD time.
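The "static table plus the requested mode injected as preferred" approach can be sketched like this. It is a minimal illustration: the `Mode` type, `build_mode_list`, and the contents of `FALLBACK_MODES` are all hypothetical, and the table is not SudoVDA's actual s_DefaultModes[].

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
}

/// Hypothetical fallback table for illustration only.
pub const FALLBACK_MODES: &[Mode] = &[
    Mode { width: 1920, height: 1080, refresh_hz: 60 },
    Mode { width: 2560, height: 1440, refresh_hz: 60 },
    Mode { width: 3840, height: 2160, refresh_hz: 60 },
];

/// Build the list a mode callback would report for one monitor: the mode
/// requested at ADD time first (so it is the preferred mode), followed by
/// the fallbacks, without duplicating the requested mode.
pub fn build_mode_list(requested: Mode, fallbacks: &[Mode]) -> Vec<Mode> {
    let mut modes = vec![requested];
    for mode in fallbacks {
        if !modes.contains(mode) {
            modes.push(*mode);
        }
    }
    modes
}
```

A 5120×1440@240 request therefore ends up at index 0 with the fallbacks behind it, and a request that matches a fallback entry is not listed twice.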

HDR/10-bit: supported, but it's the one place IddCx is harder than today. The default swap-chain surface is 8-bit A8R8G8B8; FP16/HDR requires the IddCx 1.11 D3D12 acquire path (SetDevice2/ReleaseAndAcquireBuffer2 → ID3D12Resource, with a stricter sync model). Our box is Win11 26200 (IddCx ≥ 1.10), so this is reachable, but it's real work, and our current WGC/DDA path gives FP16 HDR "for free." Build against 1.10 and runtime-gate the newer DDIs (SudoVDA's pattern).

The architectural prize: skip DDA (Phase 2)

An IddCx driver gets each presented frame from IddCxSwapChainReleaseAndAcquireBuffer as an IDXGIResource on a device we bind via IddCxSwapChainSetDevice. We can copy it into a shared texture / shared section and hand it to the host's encoder process directly — no Desktop Duplication. Why this is the real win, not just a detour:

  • It's the intended IddCx use case. IddCx exists for remote/wireless/USB displays that ship swap-chain frames over a wire; consuming frames in the driver is the designed path, and Looking Glass already does exactly this (driver → shared memory → separate consumer, no DDA).
  • It kills the multi-GPU bug class. We call IddCxAdapterSetRenderAdapter to pin the swap-chain to the same GPU as our NVENC encoder before adding the monitor, and the OS honors it. No more DXGI reparenting the output onto the wrong GPU, no ACCESS_LOST storms, and we can retire install_gpu_pref_hook() (the win32u.dll patch) and most of capture/dxgi.rs. Swap-chain re-creation becomes a documented, in-band event (ABANDON_SWAPCHAIN) instead of an undocumented failure we fight with retries.

What it does not remove (be honest): display topology management — making the virtual display the sole/primary composited desktop so the game (and Winlogon) render to it — is independent of how we get frames and stays (though we can integrate it more cleanly). And the watchdog stays, now ours.

The cost: a Session-0 → service cross-process frame transport (the driver host is WUDFHost in Session 0 / LocalService; our host is a LocalSystem service). A Global\-named, explicitly-ACL'd shared section + keyed-mutex texture (Looking Glass's shape) is where the engineering actually goes — prototype this first, it's the only genuinely new risk. Plus the HDR D3D12 path above.

Decisions to make at kickoff

  • D1 — Own the driver? Recommend yes, in Rust. (Alternatives: fork SudoVDA's C++ — fastest to a known-good HDR driver but reintroduces a C++ toolchain and README-only license provenance; or keep vendoring — zero cost, but none of the goals.)
  • D2 — Binding stack? The main implementation fork.
    • (a) Extend our windows-drivers-rs (m0) with an iddcx subset: one toolchain across all our drivers, our build env, but we write the IddCx bindings ourselves (+~3–5 wk), using virtual-display-rs's iddcx.rs as the 1:1 guide. Preferred for consistency.
    • (b) Vendor virtual-display-rs's wdf-umdf* crates (MIT) — fastest to first light, but a second WDK-binding stack in-tree.
    • Suggested sequence: prototype on (b) to prove IddCx-on-our-box in days, then build production on (a) for consistency.
  • D3 — Frame transport? Phase it: DDA-compatible first (zero capture-side change), direct push second (the cleanup). Don't couple the driver rewrite to the transport rewrite.

Phases

  • P0 — now: keep vendoring SudoVDA. No change. (The gamepad-driver installer work just shipped; this is independent.)
  • P1 — drop-in Rust IddCx driver (pf-vdisplay). Replicate SudoVDA's IOCTL contract exactly (same struct layouts; reuse or re-issue the control interface GUID) so vdisplay/sudovda.rs needs ~zero change (at most a GUID constant). Class=Display + IddCx INF, our own EDID + programmatic mode list incl. the per-ADD client mode, the watchdog, a real swap-chain drain (the vdd port; the drain is required so DWM keeps compositing, and DDA/WGC still captures the desktop). Bundle + self-sign + pnputil-install via the installer, identical to the gamepad-driver path we just built. Outcome: all-Rust, SudoVDA dependency dropped, DDA capture unchanged. Effort ≈ 2–4 wk to first light, 5–7 wk to parity (HDR, multi-monitor, CI).
  • P2 — direct frame push (kill DDA). Add a swap-chain processor that copies each frame into a shared section/texture; new capture backend reads it directly; pin the render adapter to the encoder GPU. Gate behind a flag, validate against DDA, then retire the DDA path + the win32u.dll patch. HDR via the IddCx 1.11 D3D12 acquire path. Outcome: the real "owning the stack pays off" cleanup. Effort: additional; the Session-0 transport is the long pole.

Risks

  1. D3D-in-a-driver swap-chain loop: the one genuinely new piece; bugs here = black screens/TDR. Mitigated by virtual-display-rs's swap_chain_processor.rs + the MS sample as references.
  2. Session-0 cross-process transport (P2) — the actual hard part; prototype it first.
  3. HDR = the harder D3D12 1.11 path — our current WGC/DDA HDR is free; the IddCx HDR path is not.
  4. Two binding stacks if we go D2(b) — a maintenance cost cutting against "clean/consistent."
  5. No WHQL ⇒ no Windows Update / Dev-Center distribution — same constraint our gamepad drivers already accept (bundle + self-sign + import cert).

References