punktfunk/design/windows-virtual-display-rust-port.md
enricobuehler 7b99b41ede docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00

# Windows virtual display — a Rust port of SudoVDA (investigation & plan)
> **Status:** SHIPPED (P1, 2026-06-22) + P2 CLOSED as a dead end. The all-Rust IddCx driver
> `pf-vdisplay` (`packaging/windows/drivers/pf-vdisplay/`) replaced the vendored SudoVDA C++ driver
> (SudoVDA backend deleted in `84a3b95`) and is the **sole** Windows vdisplay backend; the host drives it
> via `crates/punktfunk-host/src/vdisplay/windows/pf_vdisplay.rs`. Live-validated streaming on the RTX box
> at 5120×1440@240. The current consolidated Windows-host architecture lives in
> [`windows-host-rewrite.md`](windows-host-rewrite.md). This doc is trimmed to the two things git history
> can't replace: the **on-glass driver-iteration gotchas**, and the **P2 decision record** proving
> direct-frame-push (IDD-push) is architecturally impossible for bare-metal capture. *Do not re-attempt it.*

All the P1 planning/feasibility/decision content (signing tier, Rust prior art, binding-stack choice,
IOCTL contract, phased plan) executed as designed and now lives in the code + `windows-host-rewrite.md`;
it has been cut. What remains below is the durable record.
## Driver-iteration gotchas (hard-won, on-glass)
These cost real time during P1 bring-up and apply to **any** future IddCx/UMDF driver work on this box.
- **INF DriverVer gate.** Updating an installed UMDF driver only takes effect if the INF **DriverVer changes**.
`deploy-dev.ps1` stamps a date.time `-v` on every run; without a bump the **old binary keeps running
(silently)**.
- **Devnode hygiene — `nefconc`, never `devgen`.** Create the root devnode with
`nefconc --create-device-node` (a clean `ROOT\DISPLAY` node), **NOT** `devgen /add` — devgen makes
**persistent `SWD\DEVGEN` software devices** that survive reboot *and* registry deletion and resurrect on
every `pnputil /add-driver` (they carry `hwid root\pf_vdisplay`, so the driver install re-materializes
them). The production installer must use a single `nefconc`/INF-created node and never `devgen`.
- **Session-0 vs Session-1 observability.** Every standalone probe (`vdtest`, the host's
`live_create_drop` test) runs in **Session 0** — the services session, whose desktop is a throwaway
**1024×768** basic display. IddCx activation happens in the **console Session 1**, where the GPU drives
the real desktop. So `Screen.AllScreens`/CCD queries from Session 0 *can never* see the virtual monitor
activate — they report the wrong desktop. The only valid way to drive + observe it is the **host service**
(SYSTEM, which targets Session 1) plus the driver's own `OutputDebugString` (system-wide,
session-agnostic). (An early "monitor arrives but never gets a swap-chain / no DXGI output" symptom was
this measurement artifact, not a driver bug.)
- **Accumulated device-state damage.** Repeated reinstalls + `Disable`/`Enable-PnpDevice` cycles + a control
handle the host **cached across all of it** wedge the device tree (stale handle → the host's PINGs fail →
the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output").
A **reboot** clears it and it works on the first connect. Lesson: after device churn, restart the host
service (fresh handle) — and when in doubt, reboot.
- **Hot-reload is unreliable; deploy = install + reboot.** `pnputil /restart-device` does **NOT** restart
WUDFHost (old image stays mapped), `Disable/Enable-PnpDevice` errors on the root-enumerated IDD, and
**killing WUDFHost invalidates the host's cached `{e5bcc234}` control handle** (every ADD then fails
`0x80070006`, and the device can wedge to `FAILED_POST_START`). A **reboot** loads a freshly-installed
build cleanly. **Recovery** from a broken build is clean and reboot-free:
`pnputil /delete-driver <oemNN>.inf /uninstall` removes the bad package and the device rebinds the
previous (validated) package in the DriverStore.
- **`FAILED_POST_START` is usually churn, not the binary.** Comparing a working vs. a suspect DLL's import
tables came out **identical** (same DLLs; the size/hash delta is just the Authenticode signature). A clean
install **+ reboot** (no `restart-device`/`disable-enable`/kill in between) loads to `OK`.
- **The swap-chain drain is required.** The swap-chain processor is a faithful port of
virtual-display-rs's — it drains correctly via `ReleaseAndAcquireBuffer` + `FinishedProcessingFrame`. The
drain is *required*; a true no-op stalls DWM and freezes the captured image.
- **`pf-vdisplay` can't coexist with SudoVDA.** They register the same control-interface GUID, so two IddCx
adapters claiming `{e5bcc234}` → `FAILED_POST_START`. pf-vdisplay *replaces* SudoVDA (now moot, since SudoVDA
is deleted, but the same rule binds any second IDD that claims the GUID).
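
The `0x80070006` that every ADD returns after WUDFHost is killed is the standard HRESULT wrapping of Win32 `ERROR_INVALID_HANDLE` (failure bit set, facility 7, code 6), which is consistent with the stale cached control handle described above. A minimal sketch of that decoding follows; `decode_hresult` and `is_stale_handle` are hypothetical helper names used for illustration, not functions from the punktfunk host.

```rust
/// Fields of a Windows HRESULT: bit 31 = failure, bits 16..=26 = facility,
/// bits 0..=15 = code.
#[derive(Debug, PartialEq, Eq)]
struct HresultParts {
    failure: bool,
    facility: u16,
    code: u16,
}

/// FACILITY_WIN32 marks an HRESULT that wraps a plain Win32 error code.
const FACILITY_WIN32: u16 = 7;
/// Win32 ERROR_INVALID_HANDLE.
const ERROR_INVALID_HANDLE: u16 = 6;

/// Hypothetical helper: split a raw HRESULT into its three fields.
fn decode_hresult(hr: u32) -> HresultParts {
    HresultParts {
        failure: (hr & 0x8000_0000) != 0,
        facility: ((hr >> 16) & 0x7FF) as u16,
        code: (hr & 0xFFFF) as u16,
    }
}

/// Hypothetical helper: true when the HRESULT means "the handle is no longer valid".
fn is_stale_handle(hr: u32) -> bool {
    let parts = decode_hresult(hr);
    parts.failure && parts.facility == FACILITY_WIN32 && parts.code == ERROR_INVALID_HANDLE
}

fn main() {
    let hr = 0x8007_0006;
    println!("{:?} stale_handle={}", decode_hresult(hr), is_stale_handle(hr));
}
```

A check like this would let a log line say "stale control handle, restart the host service" instead of a bare hex code; whether the host does so is not stated in this doc.
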
## P2 — direct frame push (kill DDA): decision record — DEAD END, DO NOT PURSUE
P2 wanted the driver to *publish* each swap-chain frame to the host directly (Looking-Glass style), to
retire DXGI Desktop Duplication and its multi-GPU survival code (`capture/dxgi.rs`'s
`DXGI_ERROR_ACCESS_LOST`/`MODE_CHANGE_IN_PROGRESS` re-duplication churn and the `win32u.dll`
`install_gpu_pref_hook()` patch). **It cannot work for bare-metal console-desktop capture.** All the
IDD-push code stays in-tree, compiles, and is gated **off** behind `PUNKTFUNK_IDD_PUSH` — dormant and
harmless — as the documented record so it isn't re-tried.
### What was proven sound (so the failure is *not* a transport bug)
- **Producer and consumer are both in Session 0.** The pf-vdisplay host process is `WUDFHost.exe`
(`-DeviceGroupId:pfVDisplayGroup`) and the punktfunk host service is `LocalSystem`: **both Session 0**.
So a D3D11 **shared keyed-mutex texture** created in the driver can be opened by name in the host
(`ID3D11Device1::OpenSharedResourceByName`) with both devices on the **same render-adapter LUID** (the
driver reports it out of the `ADD` IOCTL via `OsAdapterLuid`). Named kernel objects resolve through
Session 0's shared `\BaseNamedObjects`, so no `Global\` prefix / `SeCreateGlobalPrivilege` gymnastics are
needed for same-session use. The Looking-Glass cross-*VM* shared-memory device is unnecessary — this is
cross-*process*, same-session, one GPU.
- **Transport shape (built):** a **ring** of N (default 3) shared keyed-mutex textures (newest-wins, so the
swap-chain thread never blocks; a stalled `IddCxSwapChainReleaseAndAcquireBuffer` loop freezes DWM compositing
system-wide) + a named metadata header (`{magic, version, generation, width, height, dxgi_format,
ring_len, latest}`) + a frame-ready auto-reset event. A **generation** counter bumps on a mode change so
the host re-opens the ring.
- **The inversion (required) — host creates, driver opens.** **WUDFHost runs with a restricted token: it
can neither write the filesystem nor create named kernel objects** (`CreateFileMappingW`/`CreateEventW`/
`CreateSharedHandle` all fail silently), which a file-logging driver build confirmed (it wrote no file at
all even though `init()` runs in `DriverEntry` and the device is `OK`). This is exactly why the gamepad
UMDF drivers invert it (`inject/dualsense_windows.rs`): **the HOST creates the section** (privileged → a
permissive `Global\` name + SDDL `D:(A;;GA;;;WD)`) and **the DRIVER only OPENS it**. The host-created-ring
/ restricted-open split was implemented and **works every time** (`created shared ring … render_luid=…`,
no name collisions after the per-attempt generation fix). The gamepad drivers independently prove a UMDF
driver *can* open + write a host-created `Global\` section on this box — so the driver writing nothing is
**not** an access problem.
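
The newest-wins ring and the generation counter from the transport-shape bullet can be sketched with plain atomics. This is a minimal single-process model under stated assumptions, not the in-tree IDD-push code: slots hold a frame id instead of a shared keyed-mutex texture, only the `generation`, `ring_len` and `latest` header fields are modelled, and `Ring`, `Consumer`, `Poll` and `NO_FRAME` are hypothetical names.

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

/// Assumed sentinel for "nothing published yet in this generation".
const NO_FRAME: u32 = u32::MAX;

/// Stand-in for the shared ring: each slot stores a frame id, not a texture.
struct Ring {
    generation: AtomicU32,
    latest: AtomicU32,
    slots: Vec<AtomicU64>, // ring_len must be >= 1
}

impl Ring {
    fn new(ring_len: usize) -> Self {
        Ring {
            generation: AtomicU32::new(0),
            latest: AtomicU32::new(NO_FRAME),
            slots: (0..ring_len).map(|_| AtomicU64::new(0)).collect(),
        }
    }

    /// Producer: overwrite the slot after `latest`, then publish its index.
    /// It never waits for the consumer, so unread frames are simply dropped.
    fn publish(&self, frame_id: u64) {
        let prev = self.latest.load(Ordering::Relaxed);
        let next = if prev == NO_FRAME { 0 } else { (prev + 1) % self.slots.len() as u32 };
        self.slots[next as usize].store(frame_id, Ordering::Relaxed);
        self.latest.store(next, Ordering::Release);
    }

    /// Mode change: drop published frames and signal consumers to re-open.
    fn bump_generation(&self) {
        self.latest.store(NO_FRAME, Ordering::Release);
        self.generation.fetch_add(1, Ordering::AcqRel);
    }
}

#[derive(Debug, PartialEq, Eq)]
enum Poll {
    Reopen,
    Frame(u64),
    Nothing,
}

struct Consumer {
    seen_generation: u32,
    last_frame: Option<u64>,
}

impl Consumer {
    fn new(ring: &Ring) -> Self {
        Consumer { seen_generation: ring.generation.load(Ordering::Acquire), last_frame: None }
    }

    /// Consumer: report a generation change first, otherwise only the newest frame.
    fn poll(&mut self, ring: &Ring) -> Poll {
        let generation = ring.generation.load(Ordering::Acquire);
        if generation != self.seen_generation {
            self.seen_generation = generation;
            self.last_frame = None;
            return Poll::Reopen;
        }
        let latest = ring.latest.load(Ordering::Acquire);
        if latest == NO_FRAME {
            return Poll::Nothing;
        }
        let frame = ring.slots[latest as usize].load(Ordering::Relaxed);
        if self.last_frame == Some(frame) {
            return Poll::Nothing;
        }
        self.last_frame = Some(frame);
        Poll::Frame(frame)
    }
}

fn main() {
    let ring = Ring::new(3);
    let mut consumer = Consumer::new(&ring);
    for id in 1..=5 {
        ring.publish(id); // five frames into three slots, producer never waits
    }
    println!("{:?}", consumer.poll(&ring)); // only the newest frame is seen
    ring.bump_generation();
    println!("{:?}", consumer.poll(&ring)); // consumer is told to re-open
}
```

In the transport described above, the slot write would be guarded by each texture's keyed mutex and the consumer would sleep on the frame-ready event rather than poll; both are left out to keep the sketch runnable anywhere.
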
### Root cause — the swap-chain is never assigned (fundamental, not fixable)
Across **every** configuration tested, the driver's `run_core` swap-chain processor is **never entered**
(`run_core_entries=0`):
- in-process (Session 0) and WGC-triggered (Session 1 helper) sessions,
- a user-created ring AND a host-created (LocalSystem) ring with the permissive `D:(A;;GA;;;WD)` SDDL,
- with and without a Low-IL (`S:(ML;;NW;;;LW)`) mandatory label,
- with WUDFHost confirmed **not** an AppContainer (`IsAppContainer=0`),

All of this held even while WGC simultaneously captured the same virtual monitor's composition and streamed
multiple MB of HEVC.

**An IddCx virtual monitor only receives a swap-chain (`EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN`) when the OS
presents/scans-out to it, which requires a real presentation consumer. WGC/DDA capture of the composed
desktop does NOT count** — it reads DWM's composition, bypassing the driver's swap-chain. With no physical
scanout and no consumer that routes *through the driver*, the path stays inactive (`IDDCX_PATH_FLAGS=0`) and
`ASSIGN_SWAPCHAIN` never fires. Session 0 additionally has no DWM/compositor at all.

Ecosystem + first-party confirmation:
- **Every bare-metal virtual-display capture project uses WGC/DDA, not the driver swap-chain:** SudoVDA
(its swap-chain loop acquires-and-discards), Apollo/Sunshine (DDA + WGC backends), virtual-display-rs
(discards), parsec-vdd (no frame path). Only **Looking Glass** consumes the driver swap-chain — and only
because a **VM guest scans out** the display (the consumer). Bare metal has no equivalent.
- Microsoft's own unanswered Q&A (learn.microsoft.com/answers 4096179) reports the identical symptom for
the IddSampleDriver: virtual display "always inactive," `ASSIGN_SWAPCHAIN` never runs.
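
The `run_core_entries` / `frames_acquired` evidence cited in this section separates "swap-chain never assigned" from "assigned but no frames arriving". A minimal sketch of such counters follows, assuming they are plain atomics bumped at processor entry and after each acquired buffer. `SwapChainCounters`, `Diagnosis` and the method names are hypothetical, and how the driver actually surfaces the values is not shown.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical diagnostic counters, named after the values quoted in this doc.
#[derive(Default)]
struct SwapChainCounters {
    run_core_entries: AtomicU64,
    frames_acquired: AtomicU64,
}

#[derive(Debug, PartialEq, Eq)]
enum Diagnosis {
    /// The swap-chain processor was never entered: no swap-chain was assigned.
    NeverAssigned,
    /// The processor ran but acquired no frame.
    AssignedButIdle,
    /// Frames are being acquired.
    Flowing,
}

impl SwapChainCounters {
    /// Call once each time the swap-chain processor starts running.
    fn on_run_core_entered(&self) {
        self.run_core_entries.fetch_add(1, Ordering::Relaxed);
    }

    /// Call after each successfully acquired buffer.
    fn on_frame_acquired(&self) {
        self.frames_acquired.fetch_add(1, Ordering::Relaxed);
    }

    /// Classify the current state from the two counters.
    fn diagnose(&self) -> Diagnosis {
        let entries = self.run_core_entries.load(Ordering::Relaxed);
        let frames = self.frames_acquired.load(Ordering::Relaxed);
        if entries == 0 {
            Diagnosis::NeverAssigned
        } else if frames == 0 {
            Diagnosis::AssignedButIdle
        } else {
            Diagnosis::Flowing
        }
    }
}

fn main() {
    let counters = SwapChainCounters::default();
    // Matches the state reported in every configuration tested for P2.
    println!("{:?}", counters.diagnose());
}
```

Under this classification, every P2 configuration listed above lands in `NeverAssigned`, which is why the failure is attributed to swap-chain assignment rather than to the ring transport.
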
### Both remaining escape hatches tested and closed
- **Option 3 — a present *source* on the display — TESTED, failed.** A present-trigger added to the
Session-1 WGC helper successfully created a D3D11 swapchain on the virtual display and presented
continuously (WGC even captured the flashing window). The driver stayed `run_core_entries=0` /
`frames_acquired=0`. So an active present *source* does NOT make the OS assign the driver's swap-chain —
DWM composes the present onto the display (capturable) without routing it through the driver.
- **Option 2 — a driver flag — closed by analysis.** The present-trigger succeeding proves the **path is
already active**; the missing piece is **scanout routed through the driver**, which the OS does only for a
real consumer (physical display / VM guest / RDP). The one IddCx flag for that —
`IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER` — requires the **RDP protocol stack** as the consumer, which
bare-metal console capture has no equivalent of.
### Verdict (final)
IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal console desktop-capture
fundamentally cannot provide. No host-side capture, no in-process path, no present source, and no available
driver flag overcomes it. **WGC (normal desktop) + DDA (secure desktop) is the only viable Windows capture
path — as the entire ecosystem already does.** Any future "lower overhead" must come from optimizing the
WGC/DDA path (trimming the Session-0↔Session-1 relay, zero-copy encode), **not** from IDD-push. The
remaining gaps a hypothetical IDD-push would also have had (cursor delivered separately via
`IddCxMonitorSetupHardwareCursor`/`QueryHardwareCursor`; HDR needing the IddCx **1.11 D3D12 acquire path**
`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`, since the default swap-chain surface is 8-bit)
are moot for the same reason.
## Open items
**None.** P1 shipped; P2 is a permanent *do-not-pursue* record (no pending work). WGC/DDA is the shipping
capture path.