7b99b41ede
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
150 lines
11 KiB
Markdown
# Windows virtual display — a Rust port of SudoVDA (investigation & plan)

> **Status:** SHIPPED (P1, 2026-06-22) + P2 CLOSED as a dead end. The all-Rust IddCx driver
> `pf-vdisplay` (`packaging/windows/drivers/pf-vdisplay/`) replaced the vendored SudoVDA C++ driver
> (SudoVDA backend deleted in `84a3b95`) and is the **sole** Windows vdisplay backend; the host drives it
> via `crates/punktfunk-host/src/vdisplay/windows/pf_vdisplay.rs`. Live-validated streaming on the RTX box
> at 5120×1440@240. The current consolidated Windows-host architecture lives in
> [`windows-host-rewrite.md`](windows-host-rewrite.md). This doc is trimmed to the two things git history
> can't replace: the **on-glass driver-iteration gotchas**, and the **P2 decision record** proving
> direct-frame-push (IDD-push) is architecturally impossible for bare-metal capture — *do not re-attempt it.*

All the P1 planning/feasibility/decision content (signing tier, Rust prior art, binding-stack choice,
IOCTL contract, phased plan) executed as designed and now lives in the code + `windows-host-rewrite.md`;
it has been cut. What remains below is the durable record.

## Driver-iteration gotchas (hard-won, on-glass)

These cost real time during P1 bring-up and apply to **any** future IddCx/UMDF driver work on this box.

- **INF DriverVer gate.** Updating an installed UMDF driver only takes if the INF **DriverVer changes** —
  `deploy-dev.ps1` stamps a date.time `-v` on every run; without a bump the **old binary keeps running
  (silently)**.
- **Devnode hygiene — `nefconc`, never `devgen`.** Create the root devnode with
  `nefconc --create-device-node` (a clean `ROOT\DISPLAY` node), **NOT** `devgen /add` — devgen makes
  **persistent `SWD\DEVGEN` software devices** that survive reboot *and* registry deletion and resurrect on
  every `pnputil /add-driver` (they carry `hwid root\pf_vdisplay`, so the driver install re-materializes
  them). The production installer must use a single `nefconc`/INF-created node and never `devgen`.
- **Session-0 vs Session-1 observability.** Every standalone probe (`vdtest`, the host's
  `live_create_drop` test) runs in **Session 0** — the services session, whose desktop is a throwaway
  **1024×768** basic display. IddCx activation happens in the **console Session 1**, where the GPU drives
  the real desktop. So `Screen.AllScreens`/CCD queries from Session 0 *can never* see the virtual monitor
  activate — they report the wrong desktop. The only valid way to drive + observe it is the **host service**
  (SYSTEM, which targets Session 1) plus the driver's own `OutputDebugString` (system-wide,
  session-agnostic). (An early "monitor arrives but never gets a swap-chain / no DXGI output" symptom was
  this measurement artifact, not a driver bug.)
- **Accumulated device-state damage.** Repeated reinstalls + `Disable`/`Enable-PnpDevice` cycles + a control
  handle the host **cached across all of it** wedge the device tree (stale handle → the host's PINGs fail →
  the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output").
  A **reboot** clears it and it works on the first connect. Lesson: after device churn, restart the host
  service (fresh handle) — and when in doubt, reboot.
- **Hot-reload is unreliable; deploy = install + reboot.** `pnputil /restart-device` does **NOT** restart
  WUDFHost (old image stays mapped), `Disable/Enable-PnpDevice` errors on the root-enumerated IDD, and
  **killing WUDFHost invalidates the host's cached `{e5bcc234}` control handle** (every ADD then fails
  `0x80070006`, and the device can wedge to `FAILED_POST_START`). A **reboot** loads a freshly-installed
  build cleanly. **Recovery** from a broken build is clean and reboot-free:
  `pnputil /delete-driver <oemNN>.inf /uninstall` removes the bad package and the device rebinds the
  previous (validated) package in the DriverStore.
- **`FAILED_POST_START` is usually churn, not the binary.** Comparing a working vs. a suspect DLL's import
  tables came out **identical** (same DLLs; the size/hash delta is just the Authenticode signature). A clean
  install **+ reboot** (no `restart-device`/`disable-enable`/kill in between) loads to `OK`.
- **The swap-chain drain is required.** The swap-chain processor is a faithful port of
  virtual-display-rs's — it drains correctly via `ReleaseAndAcquireBuffer` + `FinishedProcessingFrame`. The
  drain is *required*; a true no-op stalls DWM and freezes the captured image.
- **`pf-vdisplay` can't coexist with SudoVDA.** They register the same control-interface GUID, so two IddCx
  adapters claiming `{e5bcc234}` → `FAILED_POST_START`. pf-vdisplay *replaces* SudoVDA (now moot — SudoVDA
  is deleted — but the same rule binds any second IDD that claims the GUID).

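The DriverVer gate in the first bullet can be checked mechanically before a deploy. The sketch below is a minimal, std-only illustration, assuming the standard INF directive shape `DriverVer = MM/DD/YYYY,w.x.y.z`. `parse_driver_ver` and `update_will_take` are hypothetical helper names (not part of `deploy-dev.ps1` or the host crate), and the comparison only encodes the rule stated above (any change counts), not Windows' full driver-ranking logic.

```rust
/// A parsed INF `DriverVer = MM/DD/YYYY,w.x.y.z` directive.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DriverVer {
    /// (year, month, day)
    pub date: (u16, u8, u8),
    /// Dotted version parts, each in 0..=65535.
    pub version: Vec<u16>,
}

/// Finds and parses the first `DriverVer` directive in INF text.
/// Returns `None` if it is missing or malformed. Assumption: this sketch
/// requires both the date and the version part to be present.
pub fn parse_driver_ver(inf_text: &str) -> Option<DriverVer> {
    for line in inf_text.lines() {
        // INF comments start with ';'.
        let line = line.split(';').next().unwrap_or("").trim();
        let Some((key, value)) = line.split_once('=') else {
            continue;
        };
        if !key.trim().eq_ignore_ascii_case("DriverVer") {
            continue;
        }
        let (date, version) = value.split_once(',')?;
        let mut parts = date.trim().split('/');
        let month: u8 = parts.next()?.parse().ok()?;
        let day: u8 = parts.next()?.parse().ok()?;
        let year: u16 = parts.next()?.parse().ok()?;
        let version = version
            .trim()
            .split('.')
            .map(|p| p.parse::<u16>().ok())
            .collect::<Option<Vec<u16>>>()?;
        return Some(DriverVer { date: (year, month, day), version });
    }
    None
}

/// Hypothetical pre-deploy check: true only when both INFs parse and the
/// candidate's DriverVer differs from the installed one.
pub fn update_will_take(installed_inf: &str, candidate_inf: &str) -> bool {
    match (parse_driver_ver(installed_inf), parse_driver_ver(candidate_inf)) {
        (Some(old), Some(new)) => old != new,
        _ => false,
    }
}
```

A deploy script could call `update_will_take` on the installed and freshly built INF text and refuse to proceed when it returns `false`, instead of silently leaving the old binary running.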
## P2 — direct frame push (kill DDA): decision record — DEAD END, DO NOT PURSUE

P2 wanted the driver to *publish* each swap-chain frame to the host directly (Looking-Glass style), to
retire DXGI Desktop Duplication and its multi-GPU survival code (`capture/dxgi.rs`'s
`DXGI_ERROR_ACCESS_LOST`/`MODE_CHANGE_IN_PROGRESS` re-duplication churn and the `win32u.dll`
`install_gpu_pref_hook()` patch). **It cannot work for bare-metal console-desktop capture.** All the
IDD-push code stays in-tree, compiles, and is gated **off** behind `PUNKTFUNK_IDD_PUSH` — dormant and
harmless — as the documented record so it isn't re-tried.

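For readers unfamiliar with this kind of gate, the following is a minimal sketch of an opt-in environment switch that defaults to off. Only the variable name `PUNKTFUNK_IDD_PUSH` comes from this doc; the accepted values (`1`, `true`, `on`) and both function names are assumptions for illustration, and the host's real parsing may differ.

```rust
/// Name of the gate, as given in this doc.
pub const IDD_PUSH_ENV: &str = "PUNKTFUNK_IDD_PUSH";

/// Pure decision function so it can be tested without touching the
/// process environment. Assumption: "1", "true" and "on" (any case)
/// enable the path; anything else, including unset, leaves it off.
pub fn idd_push_enabled(raw: Option<&str>) -> bool {
    match raw.map(str::trim) {
        Some(v) => ["1", "true", "on"].iter().any(|t| v.eq_ignore_ascii_case(t)),
        None => false,
    }
}

/// Reads the gate from the real environment.
pub fn idd_push_enabled_from_env() -> bool {
    idd_push_enabled(std::env::var(IDD_PUSH_ENV).ok().as_deref())
}
```

Keeping the default at "off" is what makes the dormant code harmless: nothing changes unless someone sets the variable deliberately.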
### What was proven sound (so the failure is *not* a transport bug)

- **Producer and consumer are both in Session 0.** The pf-vdisplay host process is `WUDFHost.exe`
  (`-DeviceGroupId:pfVDisplayGroup`) and the punktfunk host service is `LocalSystem` — **both Session 0**.
  So a D3D11 **shared keyed-mutex texture** created in the driver can be opened by name in the host
  (`ID3D11Device1::OpenSharedResourceByName`) with both devices on the **same render-adapter LUID** (the
  driver reports it out of the `ADD` IOCTL via `OsAdapterLuid`). Named kernel objects resolve through
  Session 0's shared `\BaseNamedObjects`, so no `Global\` prefix / `SeCreateGlobalPrivilege` gymnastics are
  needed for same-session use. The Looking-Glass cross-*VM* shared-memory device is unnecessary — this is
  cross-*process*, same-session, one GPU.
- **Transport shape (built):** a **ring** of N (default 3) shared keyed-mutex textures (newest-wins, so the
  swap-chain thread never blocks — a stalled `IddCxSwapChainReleaseAndAcquireBuffer` loop freezes DWM
  compositing system-wide) + a named metadata header (`{magic, version, generation, width, height,
  dxgi_format, ring_len, latest}`) + a frame-ready auto-reset event. A **generation** counter bumps on a
  mode change so the host re-opens the ring.
- **The inversion (required) — host creates, driver opens.** **WUDFHost runs with a restricted token: it
  can neither write the filesystem nor create named kernel objects** (`CreateFileMappingW`/`CreateEventW`/
  `CreateSharedHandle` all fail silently), which a file-logging driver build confirmed (it wrote no file at
  all even though `init()` runs in `DriverEntry` and the device is `OK`). This is exactly why the gamepad
  UMDF drivers invert it (`inject/dualsense_windows.rs`): **the HOST creates the section** (privileged → a
  permissive `Global\` name + SDDL `D:(A;;GA;;;WD)`) and **the DRIVER only OPENS it**. The host-created-ring
  / restricted-open split was implemented and **works every time** (`created shared ring … render_luid=…`,
  no name collisions after the per-attempt generation fix). The gamepad drivers independently prove a UMDF
  driver *can* open + write a host-created `Global\` section on this box — so the driver writing nothing is
  **not** an access problem.

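The transport shape described above can be sketched in plain Rust to make the newest-wins and generation rules concrete. This is a model under stated assumptions, not the in-tree code: the field list comes from this doc, but the magic value, the `repr(C)` layout, the use of atomics, and the reading of `latest` as a monotonically increasing frame counter (slot = `latest % ring_len`) are all assumptions, and `Consumer`, `Poll`, `next_write_slot`, `commit` and `bump_generation` are hypothetical names. The shared textures, keyed mutexes and the frame-ready event are left out.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical magic value; the real constant is not given in the text.
pub const MAGIC: u32 = u32::from_le_bytes(*b"PFVR");

/// Metadata header with the fields listed above. Layout is an assumption.
#[repr(C)]
pub struct RingHeader {
    pub magic: u32,
    pub version: u32,
    pub generation: AtomicU32,
    pub width: u32,
    pub height: u32,
    pub dxgi_format: u32,
    pub ring_len: u32,
    /// Assumption: a frame counter, not a raw slot index.
    pub latest: AtomicU32,
}

impl RingHeader {
    pub fn new(width: u32, height: u32, dxgi_format: u32, ring_len: u32) -> Self {
        assert!(ring_len > 0, "ring needs at least one slot");
        Self {
            magic: MAGIC,
            version: 1,
            generation: AtomicU32::new(0),
            width,
            height,
            dxgi_format,
            ring_len,
            latest: AtomicU32::new(0),
        }
    }

    /// Producer: slot the next frame would be written into. Never waits.
    pub fn next_write_slot(&self) -> u32 {
        self.latest.load(Ordering::Relaxed).wrapping_add(1) % self.ring_len
    }

    /// Producer: publish the frame just written (single producer assumed).
    pub fn commit(&self) {
        self.latest.fetch_add(1, Ordering::Release);
    }

    /// Producer: signal a mode change so the consumer re-opens the ring.
    pub fn bump_generation(&self) {
        self.generation.fetch_add(1, Ordering::Release);
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Poll {
    /// Generation changed: re-open the ring before reading frames.
    Reopen { generation: u32 },
    /// Newest frame, plus how many older frames were skipped.
    Frame { slot: u32, skipped: u32 },
    /// Nothing new since the last poll.
    Idle,
}

/// Consumer-side cursor (counter wraparound is ignored in this sketch).
pub struct Consumer {
    generation: u32,
    seen: u32,
}

impl Consumer {
    pub fn attach(h: &RingHeader) -> Self {
        Self {
            generation: h.generation.load(Ordering::Acquire),
            seen: h.latest.load(Ordering::Acquire),
        }
    }

    pub fn poll(&mut self, h: &RingHeader) -> Poll {
        let generation = h.generation.load(Ordering::Acquire);
        if generation != self.generation {
            self.generation = generation;
            self.seen = h.latest.load(Ordering::Acquire);
            return Poll::Reopen { generation };
        }
        let latest = h.latest.load(Ordering::Acquire);
        if latest == self.seen {
            return Poll::Idle;
        }
        let skipped = latest.wrapping_sub(self.seen) - 1;
        self.seen = latest;
        Poll::Frame { slot: latest % h.ring_len, skipped }
    }
}
```

The point of the shape is that the producer only ever advances `latest`; a slow consumer loses intermediate frames (reported here as `skipped`) rather than stalling the swap-chain thread.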
### Root cause — the swap-chain is never assigned (fundamental, not fixable)

Across **every** configuration tested, the driver's `run_core` swap-chain processor is **never entered**
(`run_core_entries=0`):

- in-process (Session 0) and WGC-triggered (Session 1 helper) sessions,
- a user-created ring AND a host-created (LocalSystem) ring with the permissive `D:(A;;GA;;;WD)` SDDL,
- with and without a Low-IL (`S:(ML;;NW;;;LW)`) mandatory label,
- with WUDFHost confirmed **not** an AppContainer (`IsAppContainer=0`),

— even while WGC simultaneously captured the same virtual monitor's composition and streamed multi-MB of
HEVC.

**An IddCx virtual monitor only receives a swap-chain (`EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN`) when the OS
presents/scans-out to it, which requires a real presentation consumer. WGC/DDA capture of the composed
desktop does NOT count** — it reads DWM's composition, bypassing the driver's swap-chain. With no physical
scanout and no consumer that routes *through the driver*, the path stays inactive (`IDDCX_PATH_FLAGS=0`) and
`ASSIGN_SWAPCHAIN` never fires. Session 0 additionally has no DWM/compositor at all.

Ecosystem + first-party confirmation:

- **Every bare-metal virtual-display capture project uses WGC/DDA, not the driver swap-chain:** SudoVDA
  (its swap-chain loop acquires-and-discards), Apollo/Sunshine (DDA + WGC backends), virtual-display-rs
  (discards), parsec-vdd (no frame path). Only **Looking Glass** consumes the driver swap-chain — and only
  because a **VM guest scans out** the display (the consumer). Bare metal has no equivalent.
- Microsoft's own unanswered Q&A (learn.microsoft.com/answers 4096179) reports the identical symptom for
  the IddSampleDriver: virtual display "always inactive," `ASSIGN_SWAPCHAIN` never runs.

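The reasoning in this subsection rests on how the two driver counters quoted in this doc are read. The sketch below spells that reading out; it assumes `run_core` is entered only after the OS assigns a swap-chain, which is how the text uses the counter. `DriverCounters`, `Diagnosis` and `diagnose` are hypothetical names, not types from the driver.

```rust
/// Snapshot of the two counters quoted in this doc.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DriverCounters {
    pub run_core_entries: u64,
    pub frames_acquired: u64,
}

/// Hypothetical classification of a counter snapshot.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Diagnosis {
    /// The OS never handed the driver a swap-chain: nothing downstream
    /// (ring, event, host reader) can be at fault.
    SwapChainNeverAssigned,
    /// A swap-chain was assigned but the drain loop acquired no frames.
    AssignedButNoFrames,
    /// Frames reach the driver; any remaining problem is in the transport.
    FramesFlowing,
}

pub fn diagnose(c: DriverCounters) -> Diagnosis {
    if c.run_core_entries == 0 {
        Diagnosis::SwapChainNeverAssigned
    } else if c.frames_acquired == 0 {
        Diagnosis::AssignedButNoFrames
    } else {
        Diagnosis::FramesFlowing
    }
}
```

Every configuration listed above lands in the first arm, which is why the ring and ACL work could be ruled out as the cause.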
### Both remaining escape hatches tested and closed

- **Option 3 — a present *source* on the display — TESTED, failed.** A present-trigger added to the
  Session-1 WGC helper successfully created a D3D11 swapchain on the virtual display and presented
  continuously (WGC even captured the flashing window). The driver stayed `run_core_entries=0` /
  `frames_acquired=0`. So an active present *source* does NOT make the OS assign the driver's swap-chain —
  DWM composes the present onto the display (capturable) without routing it through the driver.
- **Option 2 — a driver flag — closed by analysis.** The present-trigger succeeding proves the **path is
  already active**; the missing piece is **scanout routed through the driver**, which the OS does only for a
  real consumer (physical display / VM guest / RDP). The one IddCx flag for that —
  `IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER` — requires the **RDP protocol stack** as the consumer, which
  bare-metal console capture has no equivalent of.

### Verdict (final)

IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal console desktop-capture
fundamentally cannot provide. No host-side capture, no in-process path, no present source, and no available
driver flag overcomes it. **WGC (normal desktop) + DDA (secure desktop) is the only viable Windows capture
path — as the entire ecosystem already does.** Any future "lower overhead" must come from optimizing the
WGC/DDA path (trimming the Session-0↔Session-1 relay, zero-copy encode), **not** from IDD-push. The
remaining gaps a hypothetical IDD-push would also have had (cursor delivered separately via
`IddCxMonitorSetupHardwareCursor`/`QueryHardwareCursor`; HDR needing the IddCx **1.11 D3D12 acquire path**
`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`, since the default swap-chain surface is 8-bit)
are moot for the same reason.

## Open items

**None.** P1 shipped; P2 is a permanent *do-not-pursue* record (no pending work). WGC/DDA is the shipping
capture path.