# Windows virtual display: a Rust port of SudoVDA (investigation & plan)

Status: **P1 done: `pf-vdisplay` validated streaming on glass at 5120×1440@240** (2026-06-22). The all-Rust IddCx driver replaces the vendored **SudoVDA** C++ driver, matching the "all-Rust UMDF, zero external driver deps" direction we finished for gamepads (ViGEmBus gone; DualSense/DS4/XUSB shipped). The investigation/plan below is kept for context; see **Validated on-box** for the result.

## TL;DR

A Rust port is **feasible, low-on-blockers, and strategically aligned**, and there's an unexpected architectural prize beyond "same thing, in Rust."

- **Signing is not a blocker.** An IddCx driver is UMDF *user-mode*; it needs **no WHQL, no attestation, no test-signing**. A self-signed cert in LocalMachine `Root` + `TrustedPublisher` loads it, **exactly the model our gamepad drivers already ship** (and exactly what SudoVDA and the other forks do). ([Do UMDF drivers require signing?](https://learn.microsoft.com/en-us/archive/blogs/peterwie/do-umdf-drivers-require-signing))
- **We would not be first in Rust.** [`MolotovCherry/virtual-display-rs`](https://github.com/MolotovCherry/virtual-display-rs) is a complete, shipping **IddCx driver written in Rust** (MIT), with hand-rolled IddCx/WDF bindgen bindings (`wdf-umdf-sys` + `wdf-umdf`) and a reference swap-chain processor. This turns "greenfield FFI" into "adapt a proven reference."
- **The prize: we can stop using DXGI Desktop Duplication.** An IddCx driver already *receives* the composited desktop frames in its swap-chain. [Looking Glass](https://deepwiki.com/gnif/LookingGlass/2.5-indirect-display-driver-(idd)) ships exactly this in production: the driver consumes the swap-chain, hands frames to a separate process, and "operates entirely independently of DDA."
  Doing the same would **delete an entire class of multi-GPU bugs** the current `capture/dxgi.rs` is built to survive (ACCESS_LOST storms, MODE_CHANGE_IN_PROGRESS, the `win32u.dll` reparenting patch).

Recommendation: **yes, build it in Rust**, in phases: a drop-in DDA-compatible driver first (own the stack at low risk), then the direct-frame-push path (the real cleanup). Keep vendoring SudoVDA as the safe interim until the Rust driver is on-glass-validated on the RTX box.

## Validated on-box (2026-06-22)

Before committing, the toolchain + load path were proven on the RTX box (Win11 26200, WDK 26100):

- **A Rust IddCx driver builds with our toolchain.** Cloned [`virtual-display-rs`](https://github.com/MolotovCherry/virtual-display-rs) and built its driver `.dll` against our WDK (UMDF 2.31 + IddCx 1.4 stubs, bindgen over `IddCx.h` via our LLVM, nightly-2024-07-26). One fix needed: its `build.rs` picked the **max** SDK Lib version (`10.0.28000.0`, a base SDK with no IddCx) for the `IddCxStub` search path; resolving it by the version that actually contains `um\x64\iddcx\1.4` (`10.0.26100.0`, the WDK) fixed the link.
- **It installs self-signed and loads.** Signed `.dll`/`.cat` with our existing driver cert (the gamepad `punktfunk-ds-test`), `pnputil /add-driver`, root devnode via `devgen`. The device came up **Status OK / CM_PROB_NONE**, Class Display, hosted by `WUDFRd`: a Rust IddCx adapter initialized cleanly. (SudoVDA, already live here, independently confirms IddCx + self-signed UMDF work on this box.) Test artifacts removed afterward; SudoVDA untouched.

**Conclusion:** the central risk ("can we build + load a Rust IddCx driver here?") is retired.
The binding question (D2) resolves toward **reusing `virtual-display-rs`'s self-contained `wdf-umdf-sys` + `wdf-umdf` bindgen crates** (now proven to build + load on our box) rather than extending `windows-drivers-rs`: IddCx functions are direct `IddCxStub` exports the WDF function-table macro can't reach anyway, so a unified bindgen is the cleaner base for `pf-vdisplay`. Reference clone kept at `C:\Users\Public\virtual-display-rs`.

**Scaffold + driver logic landed + on-glass:** `packaging/windows/vdisplay-driver/` holds the vendored `wdf-umdf-sys`/`wdf-umdf` (MIT, + the SDK-version build.rs fix) + the `pf-vdisplay` driver crate. The full IddCx driver is ported (entry → `IDD_CX_CLIENT_CONFIG` with all 7 callbacks → device/monitor context → our own EDID → a real swap-chain drain), with the IPC/serde/`tokio` stack replaced by an in-tree `monitor` model and `OutputDebugString` logging. **Validated on the RTX box:** built, signed (our `punktfunk-ds-test` cert), installed, loaded **Status OK**, and **a real virtual monitor arrived** ("VirtuDisplay+", `DISPLAY\CHY0000`), i.e. our own all-Rust IddCx virtual display creating a monitor.

**IOCTL control plane done + on-glass (P1 functionally complete):** the SudoVDA-compatible control plane is implemented (`EVT_IDD_CX_DEVICE_IO_CONTROL` + the `{e5bcc234-…}` interface registered via `WdfDeviceCreateDeviceInterface`; `control.rs` with byte-identical structs): `ADD` a monitor at a requested mode → `{LUID, target_id}` (target id + adapter LUID captured from `IDARG_OUT_MONITORARRIVAL`), `REMOVE` by GUID, `PING`/`GET_WATCHDOG` watchdog, `GET_VERSION`, `SET_RENDER_ADAPTER` (`IddCxAdapterSetRenderAdapter`); per-`ADD` mode injection (requested mode preferred + fallbacks). Added the five missing FFI wrappers to the vendored `wdf-umdf`.
**Validated on the RTX box** with a probe that mimics `vdisplay/sudovda.rs` exactly: `GET_VERSION → 0.2.1`, `GET_WATCHDOG → timeout=3`, `ADD 1920×1080@60 → target_id=257 + adapter LUID`, a real "VirtuDisplay+" monitor arrived at the requested mode, `REMOVE` ok.

**Constraint:** pf-vdisplay can't coexist with SudoVDA. They register the same interface GUID, so two IddCx adapters claiming it end in `FAILED_POST_START`; pf-vdisplay *replaces* SudoVDA (validated by disabling SudoVDA first).

**Watchdog + real-host drive validated:** added the watchdog thread (1 Hz countdown reset by any IOCTL; tears down all monitors at 0 so a gone host never leaves a phantom display; mirrors SudoVDA's `RunWatchdog`). Pointed the **real host** at it: removed SudoVDA's devnode so pf-vdisplay is the sole `{e5bcc234}` provider, then ran the host's `vdisplay::sudovda::tests::live_create_drop` (`PUNKTFUNK_SUDOVDA_LIVE=1`). The **test passed**, and the pf-vdisplay log shows the host's IOCTLs landing (`ADD 1920x1080@60 → target_id=258, luid=…02619823`), after which the watchdog correctly tore the monitor down when the test process exited without a final REMOVE. So `vdisplay/sudovda.rs` drives pf-vdisplay unchanged through the full control contract.

**Validated streaming end-to-end on glass (2026-06-22): P1 complete.** pf-vdisplay is a working SudoVDA replacement. Driven by the **real host** (`serve`, the LocalSystem service) with a stock client at **5120×1440@240**: the monitor arrives, `resolve_gdi_name → \\.\DISPLAY10`, `set_active_mode` + CCD-isolate succeed, the DXGI output resolves **under the RTX 4090**, WGC capture + NVENC run at **steady 240 fps, ~2.4 ms encode**, 6512 AUs sent, clean teardown (`isolate restored rc=0x0`). Same `vdisplay/sudovda.rs` path, unchanged: full parity with SudoVDA.

**The earlier "monitor arrives but never gets a swap-chain / no DXGI output" symptoms were a measurement + state artifact, not a driver bug.** Two traps cost a lot of time:

1. **Session 0.** Every standalone probe (`vdtest`, the host's `live_create_drop` test) ran in **Session 0**, the services session, whose desktop is a throwaway **1024×768** basic display. IddCx activation happens in the **console Session 1**, where the 4090 drives the real desktop. So `Screen.AllScreens`/CCD queries from Session 0 *can never* see the virtual monitor activate; they report the wrong desktop. The only valid way to drive + observe it is the **host service** (SYSTEM, which targets Session 1) plus the driver's own `OutputDebugString` (system-wide, session-agnostic).
2. **Accumulated device-state damage.** Repeated reinstalls + `Disable`/`Enable-PnpDevice` cycles + a control handle the host **cached across all of it** left the device tree wedged (stale handle → the host's PINGs fail → the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output"). **A reboot cleared it and it worked on the first connect.** Lesson: after device churn, restart the host service (fresh handle), and when in doubt, reboot.

The swap-chain processor is a **faithful port of virtual-display-rs's** (it drains correctly via `ReleaseAndAcquireBuffer` + `FinishedProcessingFrame`; the drain is *required*, because a true no-op would stall DWM and freeze the captured image). The EDID is our **own clean 128-byte block** (manufacturer `PNK`, product `punktfunk`), with no SudoVDA bytes.

**Build gotcha (important for iterating):** updating an installed UMDF driver only takes if the INF **DriverVer changes**. `deploy-dev.ps1` stamps a date.time `-v` on every run; without a bump the old binary keeps running (silently).
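Returning to the EDID block mentioned above: two details of a 128-byte EDID base block are easy to get wrong, namely how the three-letter manufacturer ID is packed and how the final checksum byte is derived. The following is a minimal sketch under stated assumptions, not the driver's actual `edid.rs`. The function names are hypothetical, and the skeleton fills in only the header, the `PNK` manufacturer ID, the version bytes and the checksum; a usable block also needs the product code, display parameters and descriptors.

```rust
/// Pack a three-letter PNP manufacturer ID into EDID bytes 8..=9.
/// Each letter is stored as 5 bits ('A' = 1), big-endian, top bit zero.
fn pack_pnp_id(id: &str) -> Option<[u8; 2]> {
    let b = id.as_bytes();
    if b.len() != 3 || !b.iter().all(|c| c.is_ascii_uppercase()) {
        return None;
    }
    let v = b
        .iter()
        .fold(0u16, |acc, c| (acc << 5) | u16::from(c - b'A' + 1));
    Some(v.to_be_bytes())
}

/// Set byte 127 so that all 128 bytes sum to 0 modulo 256.
fn seal_checksum(block: &mut [u8; 128]) {
    let sum = block[..127].iter().fold(0u8, |a, b| a.wrapping_add(*b));
    block[127] = 0u8.wrapping_sub(sum);
}

/// Hypothetical skeleton: header + manufacturer + version + checksum only.
fn minimal_edid_skeleton() -> [u8; 128] {
    let mut e = [0u8; 128];
    // Fixed 8-byte EDID header pattern.
    e[..8].copy_from_slice(&[0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00]);
    e[8..10].copy_from_slice(&pack_pnp_id("PNK").expect("valid PNP id"));
    e[18] = 1; // EDID version
    e[19] = 4; // EDID revision
    seal_checksum(&mut e);
    e
}
```

Any later change to the block (for example a different product code) has to be followed by another `seal_checksum` call, otherwise the checksum no longer matches.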
**Devnode hygiene:** create the root devnode with `nefconc --create-device-node` (a clean `ROOT\DISPLAY` node), NOT `devgen /add`. devgen makes **persistent `SWD\DEVGEN` software devices** that survive reboot *and* registry deletion and resurrect on every `pnputil /add-driver` (they have `hwid root\pf_vdisplay`, so the driver install re-materializes them). The production installer must use a single `nefconc`/INF-created node and never `devgen`.

## P2 - direct frame push (kill DDA): design & decision record

Status: **in progress.** P1 ships frames the old way (the driver drains its swap-chain and DDA/WGC re-captures the composited desktop). P2 makes the driver *publish* each swap-chain frame to the host directly, so we can retire Desktop Duplication and its multi-GPU survival code. Built behind `PUNKTFUNK_IDD_PUSH`, A/B'd against DDA, and only then made the default.

### The decisive finding: producer and consumer are both in Session 0

The whole transport design hinged on one unknown: same-session or cross-session? **Measured on the RTX box (2026-06-22):** the pf-vdisplay host process is `WUDFHost.exe` with `-DeviceGroupId:pfVDisplayGroup`, running in **Session 0**; the punktfunk host service is `LocalSystem`, also **Session 0**. So the swap-chain processor thread (spawned by our own `thread::spawn` inside the driver, i.e. in `WUDFHost`) and the encoder live in the **same session**. This is the easy case:

- A D3D11 **shared keyed-mutex texture** created in the driver can be opened by name in the host with `ID3D11Device1::OpenSharedResourceByName`, with both devices created on the **same render-adapter LUID** (which the driver already reports out of the `ADD` IOCTL via `OsAdapterLuid`, surfaced as `WinCaptureTarget::adapter_luid`).
- Named kernel objects resolve through Session 0's shared `\BaseNamedObjects`, so **no `Global\` prefix / `SeCreateGlobalPrivilege` gymnastics** are needed (kept the names unprefixed; documented that this relies on both processes being Session 0).

The Looking-Glass cross-*VM* shared-memory device is unnecessary: this is cross-*process*, same-session, on one GPU. This collapses the "Session-0 cross-process transport is the long pole" risk from the original plan.

### Transport: a ring of shared keyed-mutex textures + a metadata header + an event

A single ping-pong keyed mutex would couple the driver's present rate to the host's consume rate, and **the swap-chain thread must never block** (a stalled `IddCxSwapChainReleaseAndAcquireBuffer`/processing loop freezes DWM compositing system-wide). So, the Looking-Glass shape: multiple frame buffers, newest wins.

- **Ring** of `N` (default 3) shared textures, `RESOURCE_MISC_SHARED_NTHANDLE | SHARED_KEYEDMUTEX`, fixed size for the session. A **generation** counter bumps on a mode change (resize): the driver tears down + recreates the ring at the new size, the host notices the generation change and re-opens.
- **Named metadata header** (`CreateFileMapping`): `{magic, version, generation, width, height, dxgi_format, ring_len, latest}` where `latest` packs `{write_index, monotonic sequence}` published *after* the copy completes. Plain (unprefixed) names, in the Session-0 shared namespace.
- **Frame-ready auto-reset event** so the consumer waits instead of spinning.
- **Producer (driver, per acquired frame):** pick `(latest_index + 1) % N`; **try**-acquire that slot's keyed mutex with a 0 ms timeout (if the host still holds it, which is rare with 3 slots, reuse the current slot or skip; **never block**); `CopyResource` the acquired `MetaData.pSurface` into the slot; release the mutex; publish `{index, ++seq}`; `SetEvent`. Then `FinishedProcessingFrame` as today.
- **Consumer (host `IddPushCapturer`):** `WaitForSingleObject(event, timeout)`; read `latest`; if `seq` advanced, acquire that slot's mutex, `CopyResource` into an owned NVENC-input texture, release, yield `FramePayload::D3d11{texture, device}`, straight into the existing zero-copy NVENC path. No DDA, no CPU readback.

### What P2 removes vs. keeps

- **Removes:** `capture/dxgi.rs`'s `DXGI_ERROR_ACCESS_LOST`/`MODE_CHANGE_IN_PROGRESS` re-duplication churn, the legacy-`DuplicateOutput` fallback, and **`install_gpu_pref_hook()` (the `win32u.dll` patch)**, all by **pinning the render adapter to the encoder GPU** (`IddCxAdapterSetRenderAdapter`, the existing `SET_RENDER_ADAPTER` IOCTL, driven before `ADD`), so the OS never reparents the output and the shared texture + NVENC share one device by construction.
- **Keeps:** display **topology** (making the virtual display the composited desktop) and the **watchdog** (now ours). The **two-process WGC secure-desktop relay** stays until we confirm the IDD push also delivers the secure (Winlogon) desktop; if it does, that retires too.

### On-glass attempt 2026-06-22: code complete, blocked at driver load

The full transport (driver publisher + host `IddPushCapturer` + render-LUID robustness + in-process routing) is written and compiles clean. The first on-glass A/B exposed several real things and one hard blocker:

- **The service captures in a Session-1 WGC helper, not in-process.** `should_use_helper()` returns true for a SYSTEM service, so it spawns a user-session helper that does capture **and input injection**. IDD-push must capture **in-process in Session 0** (where the driver publishes), wired via `should_use_helper()` returning false for `PUNKTFUNK_IDD_PUSH`. **Caveat:** `SendInput` from Session 0 can't reach the user's Session-1 desktop, so in-process IDD-push has **no working input** yet.
  Production needs either a Session-1 input-only helper, or `Global\`-namespaced shared textures so a Session-1 helper consumes IDD-push for both video + input.
- **`SET_RENDER_ADAPTER` is ignored by the driver** (the IDD lands on a different adapter than pinned: observed IDD adapter `0xd60722` vs pinned 4090 `0x15de1`). The render-LUID-in-header path makes the host bind correctly regardless, but the driver should be made to actually honor the pin (or the host must copy across adapters) so NVENC gets a 4090 surface.
- **Cursor is included** in the IddCx composited frame (DDA strips it), so the host-side cursor compositor (P2.5) is likely unnecessary for this path.
- **`FAILED_POST_START` was a red herring (churn, not the binary).** Comparing the 2157 (works) and the `frame_transport` DLL import tables: **identical** (same 8 DLLs; the size/hash delta is just the Authenticode signature). A clean install **+ reboot** (no `restart-device`/`disable-enable`/kill in between) loads the `frame_transport` driver to **`OK`**. The earlier `FAILED_POST_START` was the device wedging from the hot-reload churn (the deploy gotchas above). **Lesson: deploy = install + reboot, full stop.**
- **THE REAL BLOCKER: the driver can't CREATE the shared objects.** With the driver loaded clean and the monitor active, the host's `IddPushCapturer` still times out: `pfvd-hdr- never appeared`. The driver's own `OutputDebugString` is invisible (UMDF redirects it to ETW, not DebugView; verified with a working DBWIN self-test), so a **file-logging** driver build was tried, and it wrote **no file at all**, even though `init()` runs in `DriverEntry`, the device is `OK`, WUDFHost runs as `LocalService`, and `C:\Users\Public` is world-writable. **WUDFHost runs with a restricted token: it can neither write the filesystem nor create named kernel objects** (`CreateFileMappingW`/`CreateEventW`/`CreateSharedHandle`), so `FramePublisher::new` fails silently.
  This is exactly why the **gamepad UMDF drivers invert it**: `inject/dualsense_windows.rs` (*"the host creates the section (privileged → a permissive SDDL so the WUDFHost can open it); the driver maps it"*), with `Global\pfds-shm-` + SDDL `D:(A;;GA;;;WD)`. **Fix: invert frame-push to match.** The HOST creates the header + event + ring textures (`Global\` names, `D:(A;;GA;;;WD)` SDDL); the DRIVER only OPENS them, writes its actual render LUID + a status code back into the host-created header (so we get driver visibility through the host log), and runs the copy loop. The host creates the textures on the render adapter the driver reports.
- **Also unresolved: `SET_RENDER_ADAPTER` appears ignored** (the host's pin to the 4090 vs the ADD-reply adapter differ every time). The inverted header carries the driver's *actual* render LUID so the host can create textures + run NVENC on the right adapter, but if that's the iGPU, NVENC (NVIDIA) can't encode it, so the driver must be made to honor the pin (or the host must cross-adapter copy). Needs its own investigation.

**Driver deploy gotchas learned (this box):** hot-reloading a UMDF display driver is unreliable: `pnputil /restart-device` does NOT restart WUDFHost (old image stays mapped), `Disable/Enable-PnpDevice` errors on the root-enumerated IDD, and **killing WUDFHost invalidates the host's cached `{e5bcc234}` control handle** (every ADD then fails `0x80070006`, and the device can wedge to `FAILED_POST_START`). A **reboot** loads a freshly-installed build cleanly. **Recovery** from a broken build is clean and reboot-free: `pnputil /delete-driver .inf /uninstall` removes the bad package and the device rebinds the previous (validated) package in the DriverStore (restored 2157 → `OK` immediately).
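For reference, the `latest` word of the metadata header described under Transport (which the inverted, host-created header keeps) can be sketched in plain Rust. This is a minimal sketch under assumptions, not the in-tree `FramePublisher` or `IddPushCapturer`: the 8-bit index / 56-bit sequence split and all function names are hypothetical, a process-local `AtomicU64` stands in for the word inside the mapped section, and `ring_len` is assumed to be non-zero.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Assumed layout: low 8 bits = slot index, upper 56 bits = monotonic sequence.
const INDEX_BITS: u32 = 8;

fn pack(index: u8, seq: u64) -> u64 {
    (seq << INDEX_BITS) | u64::from(index)
}

fn unpack(word: u64) -> (u8, u64) {
    ((word & 0xFF) as u8, word >> INDEX_BITS)
}

/// Producer: choose the slot after the one last published.
fn next_slot(latest: &AtomicU64, ring_len: u8) -> u8 {
    let (index, _) = unpack(latest.load(Ordering::Acquire));
    ((u16::from(index) + 1) % u16::from(ring_len)) as u8
}

/// Producer: publish only after the copy into `index` has completed.
/// A single producer is assumed, so load + store is sufficient.
fn publish(latest: &AtomicU64, index: u8) -> u64 {
    let (_, seq) = unpack(latest.load(Ordering::Acquire));
    let seq = seq + 1;
    latest.store(pack(index, seq), Ordering::Release);
    seq
}

/// Consumer: return the slot to read if the sequence advanced since last time.
fn poll(latest: &AtomicU64, last_seen: &mut u64) -> Option<u8> {
    let (index, seq) = unpack(latest.load(Ordering::Acquire));
    if seq > *last_seen {
        *last_seen = seq;
        Some(index)
    } else {
        None
    }
}
```

Packing both values into one word lets the consumer observe index and sequence with a single atomic load, so it cannot pair a new index with a stale sequence. Intermediate frames are skipped when the consumer is slower than the producer, which matches the "newest wins" shape.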
### On-glass attempt 2 (2026-06-23): inversion works; in-process Session-0 path is a dead end

Implemented the **inversion** (host creates the header + event + ring textures with the `D:(A;;GA;;;WD)` SDDL, driver only opens them) + a per-attempt **generation** (kills the `DXGI_ERROR_NAME_ALREADY_EXISTS` retry collisions) + a fixed-name **`Global\pfvd-dbg` debug channel** (structured counters the driver writes, since UMDF/ETW + the restricted token block its other logs). Results on the RTX box:

- ✅ The host **creates the shared ring every time** (`created shared ring … render_luid=…`); the privileged-create / restricted-open split is sound.
- ✅ No more name collisions (generation fix).
- ❌ **The driver writes NOTHING**: debug block all zeros, crucially `run_core_entries=0`. The swap-chain processor **never runs**, i.e. the OS **never assigns a swap-chain** to the virtual monitor in this path.

**Root cause: an IddCx monitor only gets a swap-chain when something PRESENTS to it, and the in-process path has no presenter.** The host + the CCD topology-isolate run in **Session 0, which has no DWM / compositor**. The WGC path works because its capture helper lives in **Session 1**, where DWM composes the desktop onto the display (that composition is the swap-chain trigger). So in-process Session-0 IDD-push gets no frames to push, full stop: a **fundamental** barrier, not a fixable bug. The original plan's "Session-0 transport is the long pole" was right, but the long pole turned out to be *triggering presentation*, not the shared-memory mechanics (those work).

**Consequence:** the only viable IDD-push shape is **option 3: a Session-1 helper drives presentation + consumes the `Global\` ring** (the inversion built here is exactly what it needs). But it carries an unretired risk: it's still unproven whether the swap-chain gets assigned even with a Session-1 consumer that isn't WGC. Until that's answered, **DDA/WGC stays the shipping Windows capture path**; it works.
All the IDD-push code (driver open-side + host create-side + debug channel) is written, compiles, and is gated behind `PUNKTFUNK_IDD_PUSH` (off), so it's dormant and harmless.

### CONCLUSION (2026-06-23): IDD-push is not viable for bare-metal capture; the swap-chain is never assigned

After the inversion + a fixed-name debug channel + a host-created-ring observer + an autonomous loopback test harness (`punktfunk-probe` → the SYSTEM service, paired via the mgmt API), the question "does the driver's swap-chain processor ever run?" was answered **definitively: no.** The driver's `run_core` is **never entered**: `run_core_entries=0` in *every* configuration tested:

- in-process (Session 0) and WGC-triggered (Session 1 helper) sessions,
- a user-created ring AND a host-created (LocalSystem) ring with a permissive `D:(A;;GA;;;WD)` SDDL,
- with and without a Low-IL (`S:(ML;;NW;;;LW)`) mandatory label,
- with WUDFHost confirmed **not** an AppContainer (`IsAppContainer=0`).

All of this held even while WGC simultaneously captured the same virtual monitor's composition and streamed multi-MB of HEVC. The gamepad UMDF drivers prove a UMDF driver *can* open + write a host-created `Global\` section on this box, so the driver writing nothing is **not** an access problem; `run_core` simply does not run.

**Root cause (researched + ecosystem-confirmed):** an IddCx virtual monitor only receives a swap-chain (`EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN`) when the OS **presents/scans-out** to it, which requires a real presentation consumer. **WGC/DDA capture of the composed desktop does NOT count**: it reads DWM's composition, bypassing the driver's swap-chain. With no physical scanout and no consumer that routes *through the driver*, the path stays inactive (`IDDCX_PATH_FLAGS=0`) and `ASSIGN_SWAPCHAIN` never fires.
Confirming evidence:

- **Every bare-metal virtual-display capture project uses WGC/DDA, not the driver swap-chain:** SudoVDA (its swap-chain loop acquires-and-discards), Apollo/Sunshine (DDA + WGC backends), virtual-display-rs (discards), parsec-vdd (no frame path). Only **Looking Glass** consumes the driver swap-chain, and only because a **VM guest scans out** the display (the consumer). We have no equivalent on bare metal.
- Microsoft's own unanswered Q&A (learn.microsoft.com/answers 4096179) reports the identical symptom for the IddSampleDriver: virtual display "always inactive," `ASSIGN_SWAPCHAIN` never runs.

**Verdict:** the "driver consumes its swap-chain and pushes frames" architecture (P2 / Looking-Glass style) **cannot get frames** for punktfunk's bare-metal, whole-desktop, capture-only use case. The shared-memory transport machinery (host-creates / driver-opens, the gamepad pattern) is all sound and proven to *create*, but there is nothing for the driver to publish. **DDA/WGC remains the only viable Windows capture path**, which is exactly what the entire ecosystem does.

The IDD-push code stays in-tree, compiles, and is gated `off` (`PUNKTFUNK_IDD_PUSH`), dormant and harmless, documenting the attempt so it isn't re-tried. "Better performance/lower overhead" must come from optimizing the WGC/DDA path (e.g. trimming the Session-0↔Session-1 relay, zero-copy encode), not from IDD-push. The only unexplored avenue is **driver-side** (a different adapter/monitor/path setup that might make the OS treat the virtual display as a presentation target), but it needs a reboot to test, the MS Q&A suggests it's unsolved, and the unanimous ecosystem choice of WGC/DDA argues it's a dead end.
**Final exhaustion (2026-06-23, follow-up): both remaining avenues closed.**

- **Option 3 (present source): TESTED, failed.** Added a present-trigger to the Session-1 WGC helper: it successfully created a D3D11 swapchain on the virtual display and presented continuously (WGC even captured the flashing window). The driver stayed `run_core_entries=0` / `frames_acquired=0`. So an active *present source* on the display does NOT make the OS assign the driver's swap-chain either: DWM composes the present onto the display (capturable) without routing it through the driver's swap-chain.
- **Option 2 (driver flag): closed by analysis.** The present-trigger succeeding proves the **path is already active** (a swapchain presents to the display fine); the missing piece is **scanout routed through the driver**, which the OS does only for a real consumer (physical display / VM guest / RDP). The one IddCx flag for that, `IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER`, requires the **RDP protocol stack** as the consumer, which bare-metal console capture has no equivalent of.

**Verdict is final:** IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal console desktop-capture fundamentally cannot provide. No host-side capture, no in-process path, no present source, and no available driver flag overcomes it. WGC (normal desktop) + DDA (secure desktop) is the only viable Windows capture path, as the entire ecosystem already does. The IDD-push + present-trigger code stays in-tree, gated off, as the documented record of the attempt.

### Known gaps the build-out must close (tracked as P2.* tasks)

- **Cursor.** DDA/WGC composite the HW cursor host-side from frame-info; the IDD path delivers the cursor separately (`IddCxMonitorSetupHardwareCursor` event → `QueryHardwareCursor`). The prototype may ship cursor-less; the build-out wires the IDD cursor into the existing `CursorCompositor`.
- **HDR.** The default IddCx swap-chain surface is 8-bit `B8G8R8A8`; FP16/HDR needs the **IddCx 1.11 D3D12 acquire path** (`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`). Build against 1.10, runtime-gate 1.11. SDR-only for the prototype.

## Why we'd do this

The user's goals, mapped to outcomes:

| Goal | Outcome |
| --- | --- |
| Drop external deps | No more vendored prebuilt SudoVDA `.dll`/`.cat` (third-party, C++, single upstream). |
| Increase Rust coverage | The display driver joins the gamepad drivers as in-tree Rust UMDF. |
| Own the stack / easier display management | We control the IOCTL protocol, the EDID, the mode list, the watchdog, and can fold the topology/mode logic that's currently scattered in `vdisplay/sudovda.rs` into the driver. |
| Cleaner code | Phase 2 retires `capture/dxgi.rs`'s DDA workarounds + the `win32u.dll` patch. |

## What we'd be replacing (current architecture)

- **Driver:** SudoVDA: UMDF2 IddCx, `Class=Display`, `UmdfExtensions=IddCx0102`, `UpperFilters=IndirectKmd`, root-enumerated `Root\SudoMaker\SudoVDA`. Vendored prebuilt under `packaging/windows/sudovda/`, installed by `install-sudovda.ps1` (cert → `nefconc` devnode → `pnputil`). Source is public ([SudoMaker/SudoVDA](https://github.com/SudoMaker/SudoVDA), README-only MIT/CC0 grant over the MS sample, ~1,900 LOC C++).
- **Host contract:** `crates/punktfunk-host/src/vdisplay/sudovda.rs` opens the control device by interface GUID `{e5bcc234-…}` and drives a tiny `METHOD_BUFFERED` IOCTL protocol, byte-identical to SudoVDA's `Common/Include/sudovda-ioctl.h`:
  - `ADD (0x800)` `{w,h,refresh,GUID,name[14],serial[14]}` → `{LUID, target_id}`
  - `REMOVE (0x801)` `{GUID}` · `SET_RENDER_ADAPTER (0x802)` `{LUID}` · `GET_WATCHDOG (0x803)` · `PING (0x888)` (mandatory keepalive) · `GET_VERSION (0x8FF)`
- **Capture:** `capture/dxgi.rs` finds the virtual monitor's GDI output **across all adapters** (it's enumerated under the *rendering* GPU, not SudoVDA's LUID) and runs **DXGI Desktop Duplication** (`DuplicateOutput1`, FP16 for HDR). This file is **dominated by virtual-display-over-DDA survival code**: `DXGI_ERROR_ACCESS_LOST` re-duplication with retries, `MODE_CHANGE_IN_PROGRESS` backoff, legacy-`DuplicateOutput` fallback, CCD display isolation to make the IDD the sole composited desktop, and an **`install_gpu_pref_hook()` that patches `win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue`** to stop DXGI reparenting the output across GPUs. Most of that exists *because* we capture a virtual display via DDA on a multi-GPU box.

## Feasibility findings

### Signing: green (the make-or-break)

UMDF user-mode ⇒ Code-Integrity signing rules don't apply to our binary (the only kernel piece is Microsoft's inbox `IndirectKmd`). Self-signed cert in `Root` + `TrustedPublisher` is sufficient on a normal Secure-Boot Win11 box, with no `bcdedit /set testsigning`. SudoVDA and `virtual-display-rs` both ship this way. This is the **same** model as our DualSense/DS4/XUSB drivers. (The only thing that breaks install is a botched cert placement, not a signing *tier*.)
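Returning to the host contract listed under "What we'd be replacing": the function numbers there (`0x800` … `0x8FF`) are only one field of a Windows I/O control code. The sketch below shows how the standard `CTL_CODE` composition turns them into full 32-bit codes. `METHOD_BUFFERED` comes from this document; the device type (`FILE_DEVICE_UNKNOWN`) and access (`FILE_ANY_ACCESS`) are assumptions that are not stated here and would need checking against `sudovda-ioctl.h`. The constant and function names are hypothetical.

```rust
/// Standard Windows `CTL_CODE` composition:
/// (DeviceType << 16) | (Access << 14) | (Function << 2) | Method.
const fn ctl_code(device_type: u32, function: u32, method: u32, access: u32) -> u32 {
    (device_type << 16) | (access << 14) | (function << 2) | method
}

// ASSUMPTION: device type and access are not given in this document.
const FILE_DEVICE_UNKNOWN: u32 = 0x22;
const FILE_ANY_ACCESS: u32 = 0;
// Stated in the host contract.
const METHOD_BUFFERED: u32 = 0;

/// Build one control code from a function number in the host contract.
const fn vda_ioctl(function: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, function, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

const IOCTL_ADD: u32 = vda_ioctl(0x800);
const IOCTL_REMOVE: u32 = vda_ioctl(0x801);
const IOCTL_SET_RENDER_ADAPTER: u32 = vda_ioctl(0x802);
const IOCTL_GET_WATCHDOG: u32 = vda_ioctl(0x803);
const IOCTL_PING: u32 = vda_ioctl(0x888);
const IOCTL_GET_VERSION: u32 = vda_ioctl(0x8FF);
```

Because driver and host must agree on the whole 32-bit value, "byte-identical" applies to these codes as much as to the request/reply structs.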
### Rust prior art: exists, MIT, reusable

`virtual-display-rs` proves an all-Rust IddCx driver runs in production and gives us: `wdf-umdf-sys` (bindgen over WDF **and** `iddcx.h`, links `IddCxStub`), `wdf-umdf` (safe wrappers: `iddcx.rs` ~300 LOC, with an `IddCxIsFunctionAvailable!` version-gate macro), and a reference driver (`swap_chain_processor.rs` ~158 LOC, `direct_3d_device.rs`, `edid.rs`). **Caveat:** it uses its *own* bindgen stack, **not** `microsoft/windows-drivers-rs`; see Decision D2.

### windows-drivers-rs IddCx support: absent, but a bounded extension

Our `wdk-sys` (m0) binds Base + WDF + feature-gated subsets (hid/gpio/spb/…). **Zero IddCx symbols.** Adding it is the same shape as the existing subsets: an `ApiSubset::Iddcx` variant + `iddcx` feature → `iddcx_headers()` returning `iddcx.h` for bindgen, and linking `IddCxStub.lib`. IddCx functions are **not** WDF-table functions, so the `call_unsafe_wdf_function_binding!` macro doesn't apply; they're direct `IddCxStub` exports we'd `#[link(name="IddCxStub")] extern` (or bindgen) and wrap ourselves. `windows` 0.58 (already in the tree) provides the Direct3D11/Dxgi APIs the swap-chain loop needs.

### The IddCx driver itself: well-understood, ~1–2k LOC

Required callbacks (baselined on the MS [IddSampleDriver](https://github.com/microsoft/Windows-driver-samples/blob/main/video/IndirectDisplay/IddSampleDriver/Driver.cpp), ~1,100 LOC, IddCx 1.4): `EVT_IDD_CX_ADAPTER_INIT_FINISHED`, `ADAPTER_COMMIT_MODES`, `PARSE_MONITOR_DESCRIPTION`, `MONITOR_GET_DEFAULT_DESCRIPTION_MODES`, `MONITOR_QUERY_TARGET_MODES`, `MONITOR_ASSIGN_SWAPCHAIN` (the only callback with real D3D work), `MONITOR_UNASSIGN_SWAPCHAIN`, and `DEVICE_IO_CONTROL` (where our ADD/REMOVE/PING IOCTLs live).

Init flow: `WdfDeviceCreate → IddCxDeviceInitConfig → IddCxDeviceInitialize → IddCxAdapterInitAsync → IddCxMonitorCreate → IddCxMonitorArrival`.
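The mode-list callbacks above (`MONITOR_GET_DEFAULT_DESCRIPTION_MODES`, `MONITOR_QUERY_TARGET_MODES`) ultimately have to return a list of modes. A minimal sketch of the "requested mode preferred + fallbacks" shape noted under Validated on-box follows. The `Mode` type, the `DEFAULT_MODES` table and its entries are hypothetical placeholders; neither the real driver's list nor SudoVDA's `s_DefaultModes[]` is reproduced here, and the translation into IddCx mode structs is left out.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Mode {
    width: u32,
    height: u32,
    refresh_hz: u32,
}

/// Hypothetical fallback table, for illustration only.
const DEFAULT_MODES: &[Mode] = &[
    Mode { width: 1920, height: 1080, refresh_hz: 60 },
    Mode { width: 2560, height: 1440, refresh_hz: 60 },
    Mode { width: 3840, height: 2160, refresh_hz: 60 },
];

/// Put the mode requested by the ADD IOCTL first (preferred), then append
/// the static fallbacks, skipping any entry equal to the requested mode.
fn mode_list(requested: Mode) -> Vec<Mode> {
    let mut modes = vec![requested];
    modes.extend(DEFAULT_MODES.iter().copied().filter(|m| *m != requested));
    modes
}
```

Calling `mode_list(Mode { width: 5120, height: 1440, refresh_hz: 240 })` yields the ultrawide mode in first position followed by the three fallbacks, so an arbitrary client resolution never has to exist in the static table.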
**Arbitrary resolutions don't need EDID timings:** ship one generic ~128/256-byte EDID base block to make Windows treat the target as a real monitor, then advertise modes programmatically from the mode-list callbacks: a static table **plus the runtime-requested client mode injected as preferred** (exactly SudoVDA's `s_DefaultModes[]` + per-ADD preferred-mode approach). 5120×1440@240 just gets added at ADD time.

**HDR/10-bit:** supported, but it's the one place IddCx is *harder* than today. The default swap-chain surface is **8-bit `B8G8R8A8`**; FP16/HDR requires the IddCx **1.11 D3D12 acquire path** (`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`, with a stricter sync model). Our box is Win11 26200 (IddCx ≥ 1.10), so this is reachable, but it's real work, and our current WGC/DDA path gives FP16 HDR "for free." Build against 1.10 and runtime-gate the newer DDIs (SudoVDA's pattern).

## The architectural prize: skip DDA (Phase 2)

An IddCx driver gets each presented frame from `IddCxSwapChainReleaseAndAcquireBuffer` as an `IDXGIResource` on a device **we** bind via `IddCxSwapChainSetDevice`. We can copy it into a shared texture / shared section and hand it to the host's encoder process directly, with **no Desktop Duplication**. Why this is the real win, not just a detour:

- **It's the *intended* IddCx use case.** IddCx exists for remote/wireless/USB displays that ship swap-chain frames over a wire; consuming frames in the driver is the designed path, and **Looking Glass already does exactly this** (driver → shared memory → separate consumer, no DDA).
- **It kills the multi-GPU bug class.** We call `IddCxAdapterSetRenderAdapter` to pin the swap-chain to the **same GPU as our NVENC encoder before adding the monitor**, and the OS honors it. No more DXGI reparenting the output onto the wrong GPU, no ACCESS_LOST storms, and we can **retire `install_gpu_pref_hook()` (the `win32u.dll` patch)** and most of `capture/dxgi.rs`.
Swap-chain re-creation becomes a documented, in-band event (`ABANDON_SWAPCHAIN`) instead of an undocumented failure we fight with retries.

What it does **not** remove (be honest): display **topology** management — making the virtual display the sole/primary composited desktop so the game (and Winlogon) render to it — is independent of how we *get* frames and stays (though we can integrate it more cleanly). And the watchdog stays, now ours.

The cost: a **Session-0 → service cross-process frame transport** (the driver host is `WUDFHost` in Session 0 / LocalService; our host is a LocalSystem service). A `Global\`-named, explicitly-ACL'd shared section + keyed-mutex texture (Looking Glass's shape) is where the engineering actually goes — prototype this first, it's the only genuinely new risk. Plus the HDR D3D12 path above.

## Decisions to make at kickoff

- **D1 — Own the driver?** Recommend **yes, in Rust.** (Alternatives: fork SudoVDA's C++ — fastest to a known-good HDR driver but reintroduces a C++ toolchain and README-only license provenance; or keep vendoring — zero cost, but none of the goals.)
- **D2 — Binding stack?** The main implementation fork.
  - **(a)** Extend our `windows-drivers-rs` (m0) with an `iddcx` subset — **one toolchain across all our drivers**, our build env, but we write the IddCx bindings ourselves (+~3–5 wk), using `virtual-display-rs`'s `iddcx.rs` as the 1:1 guide. *Preferred for consistency.*
  - **(b)** Vendor `virtual-display-rs`'s `wdf-umdf*` crates (MIT) — fastest to first light, but a *second* WDK-binding stack in-tree.
  - Suggested sequence: **prototype on (b) to prove IddCx-on-our-box in days**, then build production on **(a)** for consistency.
- **D3 — Frame transport?** Phase it: **DDA-compatible first** (zero capture-side change), **direct push second** (the cleanup). Don't couple the driver rewrite to the transport rewrite.

## Recommended plan

- **P0 — now:** keep vendoring SudoVDA. No change.
(The gamepad-driver installer work just shipped; this is independent.)
- **P1 — drop-in Rust IddCx driver (`pf-vdisplay`).** Replicate SudoVDA's IOCTL contract **exactly** (same struct layouts; reuse or re-issue the control interface GUID) so `vdisplay/sudovda.rs` needs **~zero change** (at most a GUID constant). Class=Display + IddCx INF, our own EDID + programmatic mode list incl. the per-ADD client mode, the watchdog, a real swap-chain drain (the vdd port — the drain is required so DWM keeps compositing; DDA/WGC still captures the desktop). Bundle + self-sign + `pnputil`-install via the installer, identical to the gamepad-driver path we just built. **Outcome:** all-Rust, SudoVDA dependency dropped, DDA capture unchanged. Effort ≈ **2–4 wk to first light**, **5–7 wk to parity** (HDR, multi-monitor, CI).
- **P2 — direct frame push (kill DDA).** Add a swap-chain processor that copies each frame into a shared section/texture; new `capture` backend reads it directly; pin the render adapter to the encoder GPU. Gate behind a flag, validate against DDA, then retire the DDA path + the `win32u.dll` patch. HDR via the IddCx 1.11 D3D12 acquire path. **Outcome:** the real "owning the stack pays off" cleanup. Effort: additional; the Session-0 transport is the long pole.

## Risks

1. **D3D-in-a-driver swap-chain loop** — the one genuinely new piece; bugs here = black screens/TDR. Mitigated by `virtual-display-rs`'s `swap_chain_processor.rs` + the MS sample as references.
2. **Session-0 cross-process transport** (P2) — the actual hard part; prototype it first.
3. **HDR = the harder D3D12 1.11 path** — our current WGC/DDA HDR is free; the IddCx HDR path is not.
4. **Two binding stacks** if we go D2(b) — a maintenance cost cutting against "clean/consistent."
5. **No WHQL ⇒ no Windows Update / Dev-Center distribution** — same constraint our gamepad drivers already accept (bundle + self-sign + import cert).

## References

- IddCx model + signing: [IDD model overview](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/indirect-display-driver-model-overview) · [IddCx versions](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/iddcx-versions) · [1.10+ updates](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/iddcx1.10-updates) · [UMDF signing](https://learn.microsoft.com/en-us/archive/blogs/peterwie/do-umdf-drivers-require-signing)
- Swap-chain / frames: [IDDCX_METADATA](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/ns-iddcx-iddcx_metadata) · [SetDevice](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nf-iddcx-iddcxswapchainsetdevice) · [SetRenderAdapter](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nf-iddcx-iddcxadaptersetrenderadapter) · [ASSIGN_SWAPCHAIN](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nc-iddcx-evt_idd_cx_monitor_assign_swapchain)
- Prior art: [microsoft IddSampleDriver](https://github.com/microsoft/Windows-driver-samples/tree/main/video/IndirectDisplay) · [SudoMaker/SudoVDA](https://github.com/SudoMaker/SudoVDA) ([ioctl.h](https://github.com/SudoMaker/SudoVDA/blob/master/Common/Include/sudovda-ioctl.h)) · **[MolotovCherry/virtual-display-rs (Rust, MIT)](https://github.com/MolotovCherry/virtual-display-rs)** · [Looking Glass IDD (swap-chain → shm, no DDA)](https://deepwiki.com/gnif/LookingGlass/2.5-indirect-display-driver-(idd)) · [itsmikethetech/Virtual-Display-Driver](https://github.com/itsmikethetech/Virtual-Display-Driver)