docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@@ -1,497 +1,149 @@
|
||||
# Windows virtual display — a Rust port of SudoVDA (investigation & plan)
|
||||
|
||||
Status: **P1 done — `pf-vdisplay` validated streaming on glass at 5120×1440@240** (2026-06-22). The
|
||||
all-Rust IddCx driver replaces the vendored **SudoVDA** C++ driver, matching the "all-Rust UMDF, zero
|
||||
external driver deps" direction we finished for gamepads (ViGEmBus gone; DualSense/DS4/XUSB shipped).
|
||||
The investigation/plan below is kept for context; see **Validated on-box** for the result.
|
||||
> **Status:** SHIPPED (P1, 2026-06-22) + P2 CLOSED as a dead end. The all-Rust IddCx driver
> `pf-vdisplay` (`packaging/windows/drivers/pf-vdisplay/`) replaced the vendored SudoVDA C++ driver
> (SudoVDA backend deleted in `84a3b95`) and is the **sole** Windows vdisplay backend; the host drives it
> via `crates/punktfunk-host/src/vdisplay/windows/pf_vdisplay.rs`. Live-validated streaming on the RTX box
> at 5120×1440@240. The current consolidated Windows-host architecture lives in
> [`windows-host-rewrite.md`](windows-host-rewrite.md). This doc is trimmed to the two things git history
> can't replace: the **on-glass driver-iteration gotchas**, and the **P2 decision record** proving
> direct-frame-push (IDD-push) is architecturally impossible for bare-metal capture. *Do not re-attempt it.*
|
||||
|
||||
## TL;DR
|
||||
All the P1 planning/feasibility/decision content (signing tier, Rust prior art, binding-stack choice,
|
||||
IOCTL contract, phased plan) executed as designed and now lives in the code + `windows-host-rewrite.md`;
|
||||
it has been cut. What remains below is the durable record.
|
||||
|
||||
A Rust port is **feasible, low-on-blockers, and strategically aligned** — and there's an unexpected
|
||||
architectural prize beyond "same thing, in Rust."
|
||||
## Driver-iteration gotchas (hard-won, on-glass)
|
||||
|
||||
- **Signing is not a blocker.** An IddCx driver is UMDF *user-mode*; it needs **no WHQL, no
|
||||
attestation, no test-signing**. A self-signed cert in LocalMachine `Root` + `TrustedPublisher`
|
||||
loads it — **exactly the model our gamepad drivers already ship** (and exactly what SudoVDA and the
|
||||
other forks do). ([Do UMDF drivers require signing?](https://learn.microsoft.com/en-us/archive/blogs/peterwie/do-umdf-drivers-require-signing))
|
||||
- **We would not be first in Rust.** [`MolotovCherry/virtual-display-rs`](https://github.com/MolotovCherry/virtual-display-rs)
|
||||
is a complete, shipping **IddCx driver written in Rust** (MIT), with hand-rolled IddCx/WDF bindgen
|
||||
bindings (`wdf-umdf-sys` + `wdf-umdf`) and a reference swap-chain processor. This turns "greenfield
|
||||
FFI" into "adapt a proven reference."
|
||||
- **The prize: we can stop using DXGI Desktop Duplication.** An IddCx driver already *receives* the
|
||||
composited desktop frames in its swap-chain. [Looking Glass](https://deepwiki.com/gnif/LookingGlass/2.5-indirect-display-driver-(idd))
|
||||
ships exactly this in production — driver consumes the swap-chain, hands frames to a separate
|
||||
process, "operates entirely independently of DDA." Doing the same would **delete an entire class of
|
||||
multi-GPU bugs** the current `capture/dxgi.rs` is built to survive (ACCESS_LOST storms,
|
||||
MODE_CHANGE_IN_PROGRESS, the `win32u.dll` reparenting patch).
|
||||
These cost real time during P1 bring-up and apply to **any** future IddCx/UMDF driver work on this box.
|
||||
|
||||
Recommendation: **yes, build it in Rust**, in phases — a drop-in DDA-compatible driver first (own the
|
||||
stack at low risk), then the direct-frame-push path (the real cleanup). Keep vendoring SudoVDA as the
|
||||
safe interim until the Rust driver is on-glass-validated on the RTX box.
|
||||
- **INF DriverVer gate.** Updating an installed UMDF driver only takes effect if the INF **DriverVer changes**.
  `deploy-dev.ps1` stamps a date.time `-v` on every run; without a bump the **old binary keeps running
  (silently)**.
- **Devnode hygiene: `nefconc`, never `devgen`.** Create the root devnode with
  `nefconc --create-device-node` (a clean `ROOT\DISPLAY` node), **NOT** `devgen /add`. `devgen` makes
  **persistent `SWD\DEVGEN` software devices** that survive reboot *and* registry deletion and resurrect on
  every `pnputil /add-driver` (they carry `hwid root\pf_vdisplay`, so the driver install re-materializes
  them). The production installer must use a single `nefconc`/INF-created node and never `devgen`.
- **Session-0 vs Session-1 observability.** Every standalone probe (`vdtest`, the host's
  `live_create_drop` test) runs in **Session 0**, the services session, whose desktop is a throwaway
  **1024×768** basic display. IddCx activation happens in the **console Session 1**, where the GPU drives
  the real desktop. So `Screen.AllScreens`/CCD queries from Session 0 *can never* see the virtual monitor
  activate; they report the wrong desktop. The only valid way to drive + observe it is the **host service**
  (SYSTEM, which targets Session 1) plus the driver's own `OutputDebugString` (system-wide,
  session-agnostic). (An early "monitor arrives but never gets a swap-chain / no DXGI output" symptom was
  this measurement artifact, not a driver bug.)
- **Accumulated device-state damage.** Repeated reinstalls + `Disable`/`Enable-PnpDevice` cycles + a control
  handle the host **cached across all of it** wedge the device tree (stale handle → the host's PINGs fail →
  the 3 s watchdog tears the monitor down mid-session → capture opens a dying display → "no DXGI output").
  A **reboot** clears it and it works on the first connect. Lesson: after device churn, restart the host
  service (fresh handle), and when in doubt, reboot.
- **Hot-reload is unreliable; deploy = install + reboot.** `pnputil /restart-device` does **NOT** restart
  WUDFHost (old image stays mapped), `Disable/Enable-PnpDevice` errors on the root-enumerated IDD, and
  **killing WUDFHost invalidates the host's cached `{e5bcc234}` control handle** (every ADD then fails
  `0x80070006`, and the device can wedge to `FAILED_POST_START`). A **reboot** loads a freshly-installed
  build cleanly. **Recovery** from a broken build is clean and reboot-free:
  `pnputil /delete-driver <oemNN>.inf /uninstall` removes the bad package and the device rebinds the
  previous (validated) package in the DriverStore.
- **`FAILED_POST_START` is usually churn, not the binary.** The import tables of a working and a suspect
  DLL came out **identical** (same DLLs; the size/hash delta is just the Authenticode signature). A clean
  install **+ reboot** (no `restart-device`/`disable-enable`/kill in between) loads to `OK`.
- **The swap-chain drain is required.** The swap-chain processor is a faithful port of
  virtual-display-rs's; it drains correctly via `ReleaseAndAcquireBuffer` + `FinishedProcessingFrame`. The
  drain is *required*; a true no-op stalls DWM and freezes the captured image.
- **`pf-vdisplay` can't coexist with SudoVDA.** They register the same control-interface GUID, so two IddCx
  adapters claiming `{e5bcc234}` → `FAILED_POST_START`. pf-vdisplay *replaces* SudoVDA (now moot since SudoVDA
  is deleted, but the same rule binds any second IDD that claims the GUID).
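
The DriverVer gate in the list above can be illustrated with a small stamping helper. This is only a sketch: the INF `DriverVer` value has the form `mm/dd/yyyy,w.x.y.z`, but the `major.minor.MMDD.HHMM` scheme and the `driver_ver` function are assumptions for illustration, not what `deploy-dev.ps1` actually emits.

```rust
/// Build an INF `DriverVer` value (`mm/dd/yyyy,w.x.y.z`) from a build timestamp.
/// The version scheme `major.minor.MMDD.HHMM` is a hypothetical choice; each
/// part stays within the 0..=65535 range the INF format allows.
pub fn driver_ver(
    major: u16,
    minor: u16,
    year: u16,
    month: u8,
    day: u8,
    hour: u8,
    minute: u8,
) -> String {
    let mmdd = month as u16 * 100 + day as u16;
    let hhmm = hour as u16 * 100 + minute as u16;
    format!(
        "{:02}/{:02}/{:04},{}.{}.{}.{}",
        month, day, year, major, minor, mmdd, hhmm
    )
}
```

With this scheme, two builds stamped in the same minute would get the same DriverVer, so the second install would silently keep the old binary; a real stamp would need finer resolution or a build counter.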
|
||||
|
||||
## Validated on-box (2026-06-22)
|
||||
## P2 — direct frame push (kill DDA): decision record — DEAD END, DO NOT PURSUE
|
||||
|
||||
Before committing, the toolchain + load path were proven on the RTX box (Win11 26200, WDK 26100):
|
||||
P2 wanted the driver to *publish* each swap-chain frame to the host directly (Looking-Glass style), to
retire DXGI Desktop Duplication and its multi-GPU survival code (`capture/dxgi.rs`'s
`DXGI_ERROR_ACCESS_LOST`/`MODE_CHANGE_IN_PROGRESS` re-duplication churn and the `win32u.dll`
`install_gpu_pref_hook()` patch). **It cannot work for bare-metal console-desktop capture.** All the
IDD-push code stays in-tree, compiles, and is gated **off** behind `PUNKTFUNK_IDD_PUSH` (dormant and
harmless) as the documented record so it isn't re-tried.
|
||||
|
||||
- **A Rust IddCx driver builds with our toolchain.** Cloned [`virtual-display-rs`](https://github.com/MolotovCherry/virtual-display-rs)
|
||||
and built its driver `.dll` against our WDK (UMDF 2.31 + IddCx 1.4 stubs, bindgen over `IddCx.h` via
|
||||
our LLVM, nightly-2024-07-26). One fix needed: its `build.rs` picked the **max** SDK Lib version
|
||||
(`10.0.28000.0`, a base SDK with no IddCx) for the `IddCxStub` search path; resolving it by the
|
||||
version that actually contains `um\x64\iddcx\1.4` (`10.0.26100.0`, the WDK) fixed the link.
|
||||
- **It installs self-signed and loads.** Signed `.dll`/`.cat` with our existing driver cert (the
|
||||
gamepad `punktfunk-ds-test`), `pnputil /add-driver`, root devnode via `devgen`. The device came up
|
||||
**Status OK / CM_PROB_NONE**, Class Display, hosted by `WUDFRd` — a Rust IddCx adapter initialized
|
||||
cleanly. (SudoVDA, already live here, independently confirms IddCx + self-signed UMDF work on this
|
||||
box.) Test artifacts removed afterward; SudoVDA untouched.
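
The `build.rs` fix described in the first bullet above (pick the SDK Lib version that actually contains the IddCx stub directory, rather than the maximum version) might look roughly like the following. The function names and the directory layout passed in are assumptions for illustration; this is not the vendored crate's actual code.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Parse a directory name such as "10.0.26100.0" into comparable numeric parts.
pub fn parse_version(name: &str) -> Option<Vec<u32>> {
    name.split('.').map(|p| p.parse::<u32>().ok()).collect()
}

/// Return the newest `<lib_root>/<version>` directory that contains `iddcx_rel`
/// (for example `um\x64\iddcx\1.4`). Versions without that subdirectory, such as
/// a base SDK that ships no IddCx, are skipped even if their number is higher.
pub fn find_iddcx_lib_dir(lib_root: &Path, iddcx_rel: &Path) -> Option<PathBuf> {
    let mut candidates: Vec<(Vec<u32>, PathBuf)> = fs::read_dir(lib_root)
        .ok()?
        .filter_map(|entry| entry.ok())
        .filter_map(|entry| {
            let name = entry.file_name().into_string().ok()?;
            let version = parse_version(&name)?;
            let path = entry.path();
            path.join(iddcx_rel).is_dir().then(|| (version, path))
        })
        .collect();
    // Sort numerically by version parts and take the highest qualifying one.
    candidates.sort();
    candidates.pop().map(|(_, path)| path)
}
```

Comparing parsed numeric parts rather than raw strings avoids mis-ordering versions whose components differ in digit count.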
|
||||
### What was proven sound (so the failure is *not* a transport bug)
|
||||
|
||||
**Conclusion:** the central risk ("can we build + load a Rust IddCx driver here?") is retired. The
|
||||
binding question (D2) resolves toward **reusing `virtual-display-rs`'s self-contained `wdf-umdf-sys` +
|
||||
`wdf-umdf` bindgen crates** (now proven to build + load on our box) rather than extending
|
||||
`windows-drivers-rs` — IddCx functions are direct `IddCxStub` exports the WDF function-table macro
|
||||
can't reach anyway, so a unified bindgen is the cleaner base for `pf-vdisplay`. Reference clone kept at
|
||||
`C:\Users\Public\virtual-display-rs`.
|
||||
|
||||
**Scaffold + driver logic landed + on-glass:** `packaging/windows/vdisplay-driver/` — vendored
|
||||
`wdf-umdf-sys`/`wdf-umdf` (MIT, + the SDK-version build.rs fix) + the `pf-vdisplay` driver crate. The
|
||||
full IddCx driver is ported (entry → `IDD_CX_CLIENT_CONFIG` with all 7 callbacks → device/monitor
|
||||
context → our own EDID → a real swap-chain drain), with the IPC/serde/`tokio` stack replaced by an
|
||||
in-tree `monitor` model and `OutputDebugString` logging. **Validated on the RTX box:** built, signed
|
||||
(our `punktfunk-ds-test` cert), installed, loaded **Status OK**, and **arrived a real virtual monitor**
|
||||
("VirtuDisplay+", `DISPLAY\CHY0000`) — i.e. an OURS, all-Rust IddCx virtual display creating a monitor.
|
||||
|
||||
**IOCTL control plane done + on-glass (P1 functionally complete):** the SudoVDA-compatible control
|
||||
plane is implemented (`EVT_IDD_CX_DEVICE_IO_CONTROL` + the `{e5bcc234-…}` interface registered via
|
||||
`WdfDeviceCreateDeviceInterface`; `control.rs` with byte-identical structs) — `ADD` a monitor at a
|
||||
requested mode → `{LUID, target_id}` (target id + adapter LUID captured from `IDARG_OUT_MONITORARRIVAL`),
|
||||
`REMOVE` by GUID, `PING`/`GET_WATCHDOG` watchdog, `GET_VERSION`, `SET_RENDER_ADAPTER`
|
||||
(`IddCxAdapterSetRenderAdapter`); per-`ADD` mode injection (requested mode preferred + fallbacks). Added
|
||||
the five missing FFI wrappers to the vendored `wdf-umdf`. **Validated on the RTX box** with a probe
|
||||
that mimics `vdisplay/sudovda.rs` exactly: `GET_VERSION → 0.2.1`, `GET_WATCHDOG → timeout=3`,
|
||||
`ADD 1920×1080@60 → target_id=257 + adapter LUID`, a real "VirtuDisplay+" monitor arrived at the
|
||||
requested mode, `REMOVE` ok. **Constraint:** pf-vdisplay can't coexist with SudoVDA — they register the
|
||||
same interface GUID, so two IddCx adapters claiming it → `FAILED_POST_START`; pf-vdisplay *replaces*
|
||||
SudoVDA (validated by disabling SudoVDA first).
|
||||
|
||||
**Watchdog + real-host drive validated:** added the watchdog thread (1 Hz countdown reset by any IOCTL;
|
||||
tears down all monitors at 0 so a gone host never leaves a phantom display; mirrors SudoVDA's
|
||||
`RunWatchdog`). Pointed the **real host** at it — removed SudoVDA's devnode so pf-vdisplay is the sole
|
||||
`{e5bcc234}` provider, then ran the host's `vdisplay::sudovda::tests::live_create_drop`
|
||||
(`PUNKTFUNK_SUDOVDA_LIVE=1`): **test passed**, and the pf-vdisplay log shows the host's IOCTLs landing —
|
||||
`ADD 1920x1080@60 → target_id=258, luid=…02619823`, then the watchdog correctly tore the monitor down
|
||||
when the test process exited without a final REMOVE. So `vdisplay/sudovda.rs` drives pf-vdisplay
|
||||
unchanged through the full control contract.
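
The watchdog behaviour described above (a 1 Hz countdown, reset by any IOCTL, teardown at 0) can be sketched as a small atomic counter. The `Watchdog` type and its method names are hypothetical; the real driver's thread and teardown code are not shown.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Countdown watchdog: every control request resets it, a 1 Hz tick decrements it.
pub struct Watchdog {
    timeout_secs: u32,
    remaining: AtomicU32,
}

impl Watchdog {
    pub fn new(timeout_secs: u32) -> Self {
        Self {
            timeout_secs,
            remaining: AtomicU32::new(timeout_secs),
        }
    }

    /// Call from every IOCTL handler (PING, ADD, REMOVE, ...).
    pub fn kick(&self) {
        self.remaining.store(self.timeout_secs, Ordering::Release);
    }

    /// Call once per second. Returns true only on the tick that takes the
    /// countdown to zero, so the caller tears down all monitors exactly once.
    pub fn tick(&self) -> bool {
        let prev = self
            .remaining
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |r| r.checked_sub(1));
        matches!(prev, Ok(1))
    }
}
```

With the 3 s timeout reported by `GET_WATCHDOG`, a host that stops sending IOCTLs would trigger teardown on the third tick; any request in between restarts the countdown.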
|
||||
|
||||
**Validated streaming end-to-end on glass (2026-06-22) — P1 complete.** pf-vdisplay is a working
|
||||
SudoVDA replacement. Driven by the **real host** (`serve`, the LocalSystem service) with a stock client
|
||||
at **5120×1440@240**: the monitor arrives, `resolve_gdi_name → \\.\DISPLAY10`, `set_active_mode` +
|
||||
CCD-isolate succeed, the DXGI output resolves **under the RTX 4090**, WGC capture + NVENC run at
|
||||
**steady 240 fps, ~2.4 ms encode**, 6512 AUs sent, clean teardown (`isolate restored rc=0x0`). Same
|
||||
`vdisplay/sudovda.rs` path, unchanged — full parity with SudoVDA.
|
||||
|
||||
**The earlier "monitor arrives but never gets a swap-chain / no DXGI output" symptoms were a
|
||||
measurement + state artifact, not a driver bug.** Two traps cost a lot of time:
|
||||
1. **Session 0.** Every standalone probe (`vdtest`, the host's `live_create_drop` test) ran in
|
||||
**Session 0** — the services session, whose desktop is a throwaway **1024×768** basic display. IddCx
|
||||
activation happens in the **console Session 1**, where the 4090 drives the real desktop. So
|
||||
`Screen.AllScreens`/CCD queries from Session 0 *can never* see the virtual monitor activate — they
|
||||
report the wrong desktop. The only valid way to drive + observe it is the **host service** (SYSTEM,
|
||||
which targets Session 1) plus the driver's own `OutputDebugString` (system-wide, session-agnostic).
|
||||
2. **Accumulated device-state damage.** Repeated reinstalls + `Disable`/`Enable-PnpDevice` cycles +
|
||||
a control handle the host **cached across all of it** left the device tree wedged (stale handle →
|
||||
the host's PINGs fail → the 3 s watchdog tears the monitor down mid-session → capture opens a dying
|
||||
display → "no DXGI output"). **A reboot cleared it and it worked on the first connect.** Lesson:
|
||||
after device churn, restart the host service (fresh handle) — and when in doubt, reboot.
|
||||
|
||||
The swap-chain processor is a **faithful port of virtual-display-rs's** (it drains correctly via
|
||||
`ReleaseAndAcquireBuffer` + `FinishedProcessingFrame` — the drain is *required*; a true no-op would
|
||||
stall DWM and freeze the captured image). The EDID is our **own clean 128-byte block** (manufacturer
|
||||
`PNK`, product `punktfunk`) — no SudoVDA bytes.
|
||||
|
||||
**Build gotcha (important for iterating):** updating an installed UMDF driver only takes if the INF
|
||||
**DriverVer changes** — `deploy-dev.ps1` stamps a date.time `-v` on every run; without a bump the old
|
||||
binary keeps running (silently). **Devnode hygiene:** create the root devnode with
|
||||
`nefconc --create-device-node` (a clean `ROOT\DISPLAY` node), NOT `devgen /add` — devgen makes
|
||||
**persistent `SWD\DEVGEN` software devices** that survive reboot *and* registry deletion and resurrect
|
||||
on every `pnputil /add-driver` (they have `hwid root\pf_vdisplay`, so the driver install re-materializes
|
||||
them). The production installer must use a single `nefconc`/INF-created node and never `devgen`.
|
||||
|
||||
## P2 — direct frame push (kill DDA): design & decision record
|
||||
|
||||
Status: **in progress.** P1 ships frames the old way (the driver drains its swap-chain and DDA/WGC
|
||||
re-captures the composited desktop). P2 makes the driver *publish* each swap-chain frame to the host
|
||||
directly, so we can retire Desktop Duplication and its multi-GPU survival code. Built behind
|
||||
`PUNKTFUNK_IDD_PUSH`, A/B'd against DDA, and only then made the default.
|
||||
|
||||
### The decisive finding: producer and consumer are both in Session 0
|
||||
|
||||
The whole transport design hinged on one unknown — same-session or cross-session? **Measured on the
|
||||
RTX box (2026-06-22):** the pf-vdisplay host process is `WUDFHost.exe` with
|
||||
`-DeviceGroupId:pfVDisplayGroup`, running in **Session 0**; the punktfunk host service is `LocalSystem`,
|
||||
also **Session 0**. So the swap-chain processor thread (spawned by our own `thread::spawn` inside the
|
||||
driver, i.e. in `WUDFHost`) and the encoder live in the **same session**. This is the easy case:
|
||||
|
||||
- A D3D11 **shared keyed-mutex texture** created in the driver can be opened by name in the host with
|
||||
`ID3D11Device1::OpenSharedResourceByName` — both devices created on the **same render-adapter LUID**
|
||||
(which the driver already reports out of the `ADD` IOCTL via `OsAdapterLuid`, surfaced as
|
||||
`WinCaptureTarget::adapter_luid`).
|
||||
- Named kernel objects resolve through Session 0's shared `\BaseNamedObjects`, so **no `Global\`
|
||||
prefix / `SeCreateGlobalPrivilege` gymnastics** are needed (kept the names unprefixed; documented
|
||||
that this relies on both processes being Session 0). The Looking-Glass cross-*VM* shared-memory
|
||||
device is unnecessary — this is cross-*process*, same-session, on one GPU.
|
||||
|
||||
This collapses the "Session-0 cross-process transport is the long pole" risk from the original plan.
|
||||
|
||||
### Transport: a ring of shared keyed-mutex textures + a metadata header + an event
|
||||
|
||||
A single ping-pong keyed mutex would couple the driver's present rate to the host's consume rate — and
|
||||
**the swap-chain thread must never block** (a stalled `IddCxSwapChainReleaseAndAcquire`/processing loop
|
||||
freezes DWM compositing system-wide). So, the Looking-Glass shape — multiple frame buffers, newest
|
||||
wins:
|
||||
|
||||
- **Ring** of `N` (default 3) shared textures, `RESOURCE_MISC_SHARED_NTHANDLE |
|
||||
SHARED_KEYEDMUTEX`, fixed size for the session. A **generation** counter bumps on a mode change
|
||||
(resize): the driver tears down + recreates the ring at the new size, the host notices the
|
||||
generation change and re-opens.
|
||||
- **Named metadata header** (`CreateFileMapping`): `{magic, version, generation, width, height,
|
||||
dxgi_format, ring_len, latest}` where `latest` packs `{write_index, monotonic sequence}` published
|
||||
*after* the copy completes. Plain (unprefixed) names — Session-0 shared namespace.
|
||||
- **Frame-ready auto-reset event** so the consumer waits instead of spinning.
|
||||
- **Producer (driver, per acquired frame):** pick `(latest_index + 1) % N`; **try**-acquire that
|
||||
slot's keyed mutex with a 0 ms timeout (if the host still holds it — rare with 3 slots — reuse the
|
||||
current slot or skip, **never block**); `CopyResource` the acquired `MetaData.pSurface` into the
|
||||
slot; release the mutex; publish `{index, ++seq}`; `SetEvent`. Then `FinishedProcessingFrame` as
|
||||
today.
|
||||
- **Consumer (host `IddPushCapturer`):** `WaitForSingleObject(event, timeout)`; read `latest`; if `seq`
|
||||
advanced, acquire that slot's mutex, `CopyResource` into an owned NVENC-input texture, release, yield
|
||||
`FramePayload::D3d11{texture, device}` — straight into the existing zero-copy NVENC path. No DDA, no
|
||||
CPU readback.
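
The `latest` field and the newest-wins slot selection from the list above can be modelled without any D3D11 objects. In this sketch the bit layout (8-bit slot index, 56-bit sequence), the `RingHeader` type and its methods are assumptions; the keyed-mutex try-acquire, the texture copy and the event are deliberately left out, and a single producer is assumed.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Assumed layout: slot index in the top 8 bits, monotonic sequence in the low 56.
const SEQ_BITS: u32 = 56;
const SEQ_MASK: u64 = (1u64 << SEQ_BITS) - 1;

pub fn pack(index: u8, seq: u64) -> u64 {
    ((index as u64) << SEQ_BITS) | (seq & SEQ_MASK)
}

pub fn unpack(latest: u64) -> (u8, u64) {
    ((latest >> SEQ_BITS) as u8, latest & SEQ_MASK)
}

pub struct RingHeader {
    ring_len: u8,
    latest: AtomicU64,
}

impl RingHeader {
    pub fn new(ring_len: u8) -> Self {
        assert!(ring_len > 0);
        Self {
            ring_len,
            latest: AtomicU64::new(pack(0, 0)),
        }
    }

    /// Producer: the slot to try next, `(latest_index + 1) % N`.
    pub fn next_slot(&self) -> u8 {
        let (index, _) = unpack(self.latest.load(Ordering::Acquire));
        ((index as u32 + 1) % self.ring_len as u32) as u8
    }

    /// Producer: publish `{index, ++seq}` after the copy into the slot completes.
    /// Single producer assumed, so a plain load followed by a store is enough.
    pub fn publish(&self, index: u8) -> u64 {
        let (_, seq) = unpack(self.latest.load(Ordering::Relaxed));
        let next = seq + 1;
        self.latest.store(pack(index, next), Ordering::Release);
        next
    }

    /// Consumer: the newest slot, if the sequence advanced past `last_seen`.
    pub fn poll(&self, last_seen: u64) -> Option<(u8, u64)> {
        let (index, seq) = unpack(self.latest.load(Ordering::Acquire));
        (seq > last_seen).then_some((index, seq))
    }
}
```

Packing index and sequence into one word lets the consumer read both with a single atomic load, so it never observes a new sequence paired with a stale slot index.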
|
||||
|
||||
### What P2 removes vs. keeps
|
||||
|
||||
- **Removes:** `capture/dxgi.rs`'s `DXGI_ERROR_ACCESS_LOST`/`MODE_CHANGE_IN_PROGRESS` re-duplication
|
||||
churn, the legacy-`DuplicateOutput` fallback, and **`install_gpu_pref_hook()` (the `win32u.dll`
|
||||
patch)** — by **pinning the render adapter to the encoder GPU** (`IddCxAdapterSetRenderAdapter`, the
|
||||
existing `SET_RENDER_ADAPTER` IOCTL, driven before `ADD`), so the OS never reparents the output and
|
||||
the shared texture + NVENC share one device by construction.
|
||||
- **Keeps:** display **topology** (making the virtual display the composited desktop) and the
|
||||
**watchdog** (now ours). The **two-process WGC secure-desktop relay** stays until we confirm the IDD
|
||||
push also delivers the secure (Winlogon) desktop; if it does, that retires too.
|
||||
|
||||
### On-glass attempt 2026-06-22 — code complete, blocked at driver load
|
||||
|
||||
The full transport (driver publisher + host `IddPushCapturer` + render-LUID robustness + in-process
|
||||
routing) is written and compiles clean. The first on-glass A/B exposed several real things and one
|
||||
hard blocker:
|
||||
|
||||
- **The service captures in a Session-1 WGC helper, not in-process.** `should_use_helper()` returns
|
||||
true for a SYSTEM service, so it spawns a user-session helper that does capture **and input
|
||||
injection**. IDD-push must capture **in-process in Session 0** (where the driver publishes) — wired
|
||||
via `should_use_helper()` returning false for `PUNKTFUNK_IDD_PUSH`. **Caveat:** `SendInput` from
|
||||
Session 0 can't reach the user's Session-1 desktop, so in-process IDD-push has **no working input**
|
||||
yet. Production needs either a Session-1 input-only helper, or `Global\`-namespaced shared textures
|
||||
so a Session-1 helper consumes IDD-push for both video + input.
|
||||
- **`SET_RENDER_ADAPTER` is ignored by the driver** (the IDD lands on a different adapter than pinned:
|
||||
observed IDD adapter `0xd60722` vs pinned 4090 `0x15de1`). The render-LUID-in-header path makes the
|
||||
host bind correctly regardless, but the driver should be made to actually honor the pin (or the host
|
||||
must copy across adapters) so NVENC gets a 4090 surface.
|
||||
- **Cursor is included** in the IddCx composited frame (DDA strips it) — so the host-side cursor
|
||||
compositor (P2.5) is likely unnecessary for this path.
|
||||
- **`FAILED_POST_START` was a red herring (churn, not the binary).** Comparing the 2157 (works) and
|
||||
the `frame_transport` DLL import tables: **identical** (same 8 DLLs; the size/hash delta is just the
|
||||
Authenticode signature). A clean install **+ reboot** (no `restart-device`/`disable-enable`/kill in
|
||||
between) loads the `frame_transport` driver to **`OK`**. The earlier `FAILED_POST_START` was the
|
||||
device wedging from the hot-reload churn (the deploy gotchas above). **Lesson: deploy = install +
|
||||
reboot, full stop.**
|
||||
- **THE REAL BLOCKER — the driver can't CREATE the shared objects.** With the driver loaded clean and
|
||||
the monitor active, the host's `IddPushCapturer` still times out: `pfvd-hdr-<target> never appeared`.
|
||||
The driver's own `OutputDebugString` is invisible (UMDF redirects it to ETW, not DebugView — verified
|
||||
with a working DBWIN self-test), so a **file-logging** driver build was tried — and it wrote **no
|
||||
file at all**, even though `init()` runs in `DriverEntry`, the device is `OK`, WUDFHost runs as
|
||||
`LocalService`, and `C:\Users\Public` is world-writable. **WUDFHost runs with a restricted token: it
|
||||
- **Producer and consumer are both in Session 0.** The pf-vdisplay host process is `WUDFHost.exe`
|
||||
(`-DeviceGroupId:pfVDisplayGroup`) and the punktfunk host service is `LocalSystem` — **both Session 0**.
|
||||
So a D3D11 **shared keyed-mutex texture** created in the driver can be opened by name in the host
|
||||
(`ID3D11Device1::OpenSharedResourceByName`) with both devices on the **same render-adapter LUID** (the
|
||||
driver reports it out of the `ADD` IOCTL via `OsAdapterLuid`). Named kernel objects resolve through
|
||||
Session 0's shared `\BaseNamedObjects`, so no `Global\` prefix / `SeCreateGlobalPrivilege` gymnastics are
|
||||
needed for same-session use. The Looking-Glass cross-*VM* shared-memory device is unnecessary — this is
|
||||
cross-*process*, same-session, one GPU.
|
||||
- **Transport shape (built):** a **ring** of N (default 3) shared keyed-mutex textures (newest-wins, so the
|
||||
swap-chain thread never blocks — a stalled `IddCxSwapChainReleaseAndAcquire` loop freezes DWM compositing
|
||||
system-wide) + a named metadata header (`{magic, version, generation, width, height, dxgi_format,
|
||||
ring_len, latest}`) + a frame-ready auto-reset event. A **generation** counter bumps on a mode change so
|
||||
the host re-opens the ring.
|
||||
- **The inversion (required) — host creates, driver opens.** **WUDFHost runs with a restricted token: it
|
||||
can neither write the filesystem nor create named kernel objects** (`CreateFileMappingW`/`CreateEventW`/
|
||||
`CreateSharedHandle`), so `FramePublisher::new` fails silently. This is exactly why the **gamepad UMDF
|
||||
drivers invert it**: `inject/dualsense_windows.rs` — *"the host creates the section (privileged → a
|
||||
permissive SDDL so the WUDFHost can open it); the driver maps it"* — `Global\pfds-shm-<idx>` + SDDL
|
||||
`D:(A;;GA;;;WD)`. **Fix: invert frame-push to match.** The HOST creates the header + event + ring
|
||||
textures (`Global\` names, `D:(A;;GA;;;WD)` SDDL); the DRIVER only OPENS them, writes its actual
|
||||
render LUID + a status code back into the host-created header (so we get driver visibility through the
|
||||
host log), and runs the copy loop. The host creates the textures on the render adapter the driver
|
||||
reports.
|
||||
- **Also unresolved: `SET_RENDER_ADAPTER` appears ignored** (the host's pin to the 4090 vs the ADD-reply
|
||||
adapter differ every time). The inverted header carries the driver's *actual* render LUID so the host
|
||||
can create textures + run NVENC on the right adapter — but if that's the iGPU, NVENC (NVIDIA) can't
|
||||
encode it, so the driver must be made to honor the pin (or the host must cross-adapter copy). Needs its
|
||||
own investigation.
|
||||
`CreateSharedHandle` all fail silently), which a file-logging driver build confirmed (it wrote no file at
|
||||
all even though `init()` runs in `DriverEntry` and the device is `OK`). This is exactly why the gamepad
|
||||
UMDF drivers invert it (`inject/dualsense_windows.rs`): **the HOST creates the section** (privileged → a
|
||||
permissive `Global\` name + SDDL `D:(A;;GA;;;WD)`) and **the DRIVER only OPENS it**. The host-created-ring
|
||||
/ restricted-open split was implemented and **works every time** (`created shared ring … render_luid=…`,
|
||||
no name collisions after the per-attempt generation fix). The gamepad drivers independently prove a UMDF
|
||||
driver *can* open + write a host-created `Global\` section on this box — so the driver writing nothing is
|
||||
**not** an access problem.
|
||||
|
||||
**Driver deploy gotchas learned (this box):** hot-reloading a UMDF display driver is unreliable —
|
||||
`pnputil /restart-device` does NOT restart WUDFHost (old image stays mapped), `Disable/Enable-PnpDevice`
|
||||
errors on the root-enumerated IDD, and **killing WUDFHost invalidates the host's cached `{e5bcc234}`
|
||||
control handle** (every ADD then fails `0x80070006`, and the device can wedge to `FAILED_POST_START`).
|
||||
A **reboot** loads a freshly-installed build cleanly. **Recovery** from a broken build is clean and
|
||||
reboot-free: `pnputil /delete-driver <oemNN>.inf /uninstall` removes the bad package and the device
|
||||
rebinds the previous (validated) package in the DriverStore — restored 2157 → `OK` immediately.
|
||||
### Root cause — the swap-chain is never assigned (fundamental, not fixable)
|
||||
|
||||
### On-glass attempt 2 (2026-06-23) — inversion works; in-process Session-0 path is a dead end
|
||||
|
||||
Implemented the **inversion** (host creates the header + event + ring textures with the
|
||||
`D:(A;;GA;;;WD)` SDDL, driver only opens them) + a per-attempt **generation** (kills the
|
||||
`DXGI_ERROR_NAME_ALREADY_EXISTS` retry collisions) + a fixed-name **`Global\pfvd-dbg` debug channel**
|
||||
(structured counters the driver writes, since UMDF/ETW + the restricted token block its other logs).
|
||||
Results on the RTX box:
|
||||
|
||||
- ✅ The host **creates the shared ring every time** (`created shared ring … render_luid=…`) — the
|
||||
privileged-create / restricted-open split is sound.
|
||||
- ✅ No more name collisions (generation fix).
|
||||
- ❌ **The driver writes NOTHING** — debug block all zeros, crucially `run_core_entries=0`. The
|
||||
swap-chain processor **never runs**, i.e. the OS **never assigns a swap-chain** to the virtual
|
||||
monitor in this path.
|
||||
|
||||
**Root cause: an IddCx monitor only gets a swap-chain when something PRESENTS to it, and the in-process
|
||||
path has no presenter.** The host + the CCD topology-isolate run in **Session 0, which has no DWM /
|
||||
compositor**. The WGC path works because its capture helper lives in **Session 1**, where DWM composes
|
||||
the desktop onto the display (that composition is the swap-chain trigger). So in-process Session-0
|
||||
IDD-push gets no frames to push, full stop — a **fundamental** barrier, not a fixable bug. The original
|
||||
plan's "Session-0 transport is the long pole" was right, but the long pole turned out to be *triggering
|
||||
presentation*, not the shared-memory mechanics (those work).
|
||||
|
||||
**Consequence:** the only viable IDD-push shape is **option 3 — a Session-1 helper drives presentation +
|
||||
consumes the `Global\` ring** (the inversion built here is exactly what it needs). But it carries an
|
||||
unretired risk: it's still unproven whether the swap-chain gets assigned even with a Session-1 consumer
|
||||
that isn't WGC. Until that's answered, **DDA/WGC stays the shipping Windows capture path** — it works.
|
||||
All the IDD-push code (driver open-side + host create-side + debug channel) is written, compiles, and is
|
||||
gated behind `PUNKTFUNK_IDD_PUSH` (off), so it's dormant and harmless.
|
||||
|
||||
### CONCLUSION (2026-06-23): IDD-push is not viable for bare-metal capture — the swap-chain is never assigned
|
||||
|
||||
After the inversion + a fixed-name debug channel + a host-created-ring observer + an autonomous
|
||||
loopback test harness (`punktfunk-probe` → the SYSTEM service, paired via the mgmt API), the question
|
||||
"does the driver's swap-chain processor ever run?" was answered **definitively: no.** The driver's
|
||||
`run_core` is **never entered** — `run_core_entries=0` in *every* configuration tested:
|
||||
Across **every** configuration tested, the driver's `run_core` swap-chain processor is **never entered**
|
||||
(`run_core_entries=0`):
|
||||
|
||||
- in-process (Session 0) and WGC-triggered (Session 1 helper) sessions,
|
||||
- a user-created ring AND a host-created (LocalSystem) ring with a permissive `D:(A;;GA;;;WD)` SDDL,
|
||||
- a user-created ring AND a host-created (LocalSystem) ring with the permissive `D:(A;;GA;;;WD)` SDDL,
|
||||
- with and without a Low-IL (`S:(ML;;NW;;;LW)`) mandatory label,
|
||||
- with WUDFHost confirmed **not** an AppContainer (`IsAppContainer=0`),
|
||||
|
||||
— even while WGC simultaneously captured the same virtual monitor's composition and streamed multi-MB
|
||||
of HEVC. The gamepad UMDF drivers prove a UMDF driver *can* open + write a host-created `Global\`
|
||||
section on this box, so the driver writing nothing is **not** an access problem — `run_core` simply
|
||||
does not run.
|
||||
— even while WGC simultaneously captured the same virtual monitor's composition and streamed multi-MB of
|
||||
HEVC.
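For readers unfamiliar with the SDDL strings in the list above: each parenthesised ACE has six `;`-separated fields (type, flags, rights, object GUID, inherit-object GUID, account SID). In `D:(A;;GA;;;WD)` that reads as a DACL entry allowing (`A`) generic-all (`GA`) to Everyone (`WD`); in `S:(ML;;NW;;;LW)` it is a mandatory label (`ML`) with no-write-up (`NW`) at low integrity (`LW`). The sketch below only splits those fields. `Ace` and `parse_ace` are hypothetical names for illustration, not code from the tree, and conditional ACEs with extra fields are deliberately out of scope.

```rust
/// One SDDL ACE split into its six standard fields (illustrative type only).
#[derive(Debug, PartialEq, Eq)]
pub struct Ace<'a> {
    pub ace_type: &'a str,            // e.g. "A" (allow) or "ML" (mandatory label)
    pub flags: &'a str,               // inheritance flags, often empty
    pub rights: &'a str,              // e.g. "GA" (generic all) or "NW" (no-write-up)
    pub object_guid: &'a str,         // usually empty
    pub inherit_object_guid: &'a str, // usually empty
    pub sid: &'a str,                 // e.g. "WD" (Everyone) or "LW" (low integrity)
}

/// Splits a single parenthesised ACE such as "(A;;GA;;;WD)".
/// Returns None if the parentheses are missing or the field count is not six.
pub fn parse_ace(s: &str) -> Option<Ace<'_>> {
    let inner = s.strip_prefix('(')?.strip_suffix(')')?;
    let fields: Vec<&str> = inner.split(';').collect();
    if fields.len() != 6 {
        return None;
    }
    Some(Ace {
        ace_type: fields[0],
        flags: fields[1],
        rights: fields[2],
        object_guid: fields[3],
        inherit_object_guid: fields[4],
        sid: fields[5],
    })
}
```

Calling `parse_ace("(ML;;NW;;;LW)")` yields `ace_type == "ML"`, `rights == "NW"`, `sid == "LW"`, which makes it easy to log exactly which label a test configuration applied.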
|
||||
|
||||
**Root cause (researched + ecosystem-confirmed):** an IddCx virtual monitor only receives a swap-chain
|
||||
(`EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN`) when the OS **presents/scans-out** to it, which requires a real
|
||||
presentation consumer. **WGC/DDA capture of the composed desktop does NOT count** — it reads DWM's
|
||||
composition, bypassing the driver's swap-chain. With no physical scanout and no consumer that routes
|
||||
*through the driver*, the path stays inactive (`IDDCX_PATH_FLAGS=0`) and `ASSIGN_SWAPCHAIN` never fires.
|
||||
Confirming evidence:
|
||||
**An IddCx virtual monitor only receives a swap-chain (`EVT_IDD_CX_MONITOR_ASSIGN_SWAPCHAIN`) when the OS
|
||||
presents/scans-out to it, which requires a real presentation consumer. WGC/DDA capture of the composed
|
||||
desktop does NOT count** — it reads DWM's composition, bypassing the driver's swap-chain. With no physical
|
||||
scanout and no consumer that routes *through the driver*, the path stays inactive (`IDDCX_PATH_FLAGS=0`) and
|
||||
`ASSIGN_SWAPCHAIN` never fires. Session 0 additionally has no DWM/compositor at all.
|
||||
|
||||
Ecosystem + first-party confirmation:
|
||||
|
||||
- **Every bare-metal virtual-display capture project uses WGC/DDA, not the driver swap-chain:** SudoVDA
|
||||
(its swap-chain loop acquires-and-discards), Apollo/Sunshine (DDA + WGC backends), virtual-display-rs
|
||||
(discards), parsec-vdd (no frame path). Only **Looking Glass** consumes the driver swap-chain — and
|
||||
only because a **VM guest scans out** the display (the consumer). We have no equivalent on bare metal.
|
||||
(discards), parsec-vdd (no frame path). Only **Looking Glass** consumes the driver swap-chain — and only
|
||||
because a **VM guest scans out** the display (the consumer). Bare metal has no equivalent.
|
||||
- Microsoft's own unanswered Q&A (learn.microsoft.com/answers 4096179) reports the identical symptom for
|
||||
the IddSampleDriver: virtual display "always inactive," `ASSIGN_SWAPCHAIN` never runs.
|
||||
|
||||
**Verdict:** the "driver consumes its swap-chain and pushes frames" architecture (P2 / Looking-Glass
|
||||
style) **cannot get frames** for punktfunk's bare-metal, whole-desktop, capture-only use case. The
|
||||
shared-memory transport machinery (host-creates / driver-opens, the gamepad pattern) is all sound and
|
||||
proven to *create*, but there is nothing for the driver to publish. **DDA/WGC remains the only viable
|
||||
Windows capture path**, which is exactly what the entire ecosystem does. The IDD-push code stays
|
||||
in-tree, compiles, and is gated `off` (`PUNKTFUNK_IDD_PUSH`) — dormant and harmless — documenting the
|
||||
attempt so it isn't re-tried. "Better performance/lower overhead" must come from optimizing the WGC/DDA
|
||||
path (e.g. trimming the Session-0↔Session-1 relay, zero-copy encode), not from IDD-push.
|
||||
### Both remaining escape hatches tested and closed
|
||||
|
||||
The only unexplored avenue is **driver-side** (a different adapter/monitor/path setup that might make the
|
||||
OS treat the virtual display as a presentation target) — but it needs a reboot to test, the MS Q&A
|
||||
suggests it's unsolved, and the unanimous ecosystem choice of WGC/DDA argues it's a dead end.
|
||||
- **Option 3 — a present *source* on the display — TESTED, failed.** A present-trigger added to the
|
||||
Session-1 WGC helper successfully created a D3D11 swapchain on the virtual display and presented
|
||||
continuously (WGC even captured the flashing window). The driver stayed `run_core_entries=0` /
|
||||
`frames_acquired=0`. So an active present *source* does NOT make the OS assign the driver's swap-chain —
|
||||
DWM composes the present onto the display (capturable) without routing it through the driver.
|
||||
- **Option 2 — a driver flag — closed by analysis.** The present-trigger succeeding proves the **path is
|
||||
already active**; the missing piece is **scanout routed through the driver**, which the OS does only for a
|
||||
real consumer (physical display / VM guest / RDP). The one IddCx flag for that —
|
||||
`IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER` — requires the **RDP protocol stack** as the consumer, which
|
||||
bare-metal console capture has no equivalent of.
|
||||
|
||||
**Final exhaustion (2026-06-23, follow-up): both remaining avenues closed.**
|
||||
### Verdict (final)
|
||||
|
||||
- **Option 3 (present source) — TESTED, failed.** Added a present-trigger to the Session-1 WGC helper:
|
||||
it successfully created a D3D11 swapchain on the virtual display and presented continuously (WGC even
|
||||
captured the flashing window). The driver stayed `run_core_entries=0` / `frames_acquired=0`. So an
|
||||
active *present source* on the display does NOT make the OS assign the driver's swap-chain either —
|
||||
DWM composes the present onto the display (capturable) without routing it through the driver's
|
||||
swap-chain.
|
||||
- **Option 2 (driver flag) — closed by analysis.** The present-trigger succeeding proves the **path is
|
||||
already active** (a swapchain presents to the display fine); the missing piece is **scanout routed
|
||||
through the driver**, which the OS does only for a real consumer (physical display / VM guest / RDP).
|
||||
The one IddCx flag for that — `IDDCX_ADAPTER_FLAGS_REMOTE_SESSION_DRIVER` — requires the **RDP
|
||||
protocol stack** as the consumer, which bare-metal console capture has no equivalent of.
|
||||
IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal console desktop-capture
|
||||
fundamentally cannot provide. No host-side capture, no in-process path, no present source, and no available
|
||||
driver flag overcomes it. **WGC (normal desktop) + DDA (secure desktop) is the only viable Windows capture
|
||||
path — as the entire ecosystem already does.** Any future "lower overhead" must come from optimizing the
|
||||
WGC/DDA path (trimming the Session-0↔Session-1 relay, zero-copy encode), **not** from IDD-push. The
|
||||
remaining gaps a hypothetical IDD-push would also have had (cursor delivered separately via
|
||||
`IddCxMonitorSetupHardwareCursor`/`QueryHardwareCursor`; HDR needing the IddCx **1.11 D3D12 acquire path**
|
||||
`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`, since the default swap-chain surface is 8-bit)
|
||||
are moot for the same reason.
|
||||
|
||||
**Verdict is final:** IDD-push needs a presentation consumer (scanout / VM guest / RDP) that bare-metal
|
||||
console desktop-capture fundamentally cannot provide. No host-side capture, no in-process path, no
|
||||
present source, and no available driver flag overcomes it. WGC (normal desktop) + DDA (secure desktop)
|
||||
is the only viable Windows capture path — as the entire ecosystem already does. The IDD-push +
|
||||
present-trigger code stays in-tree, gated off, as the documented record of the attempt.
|
||||
## Open items
|
||||
|
||||
### Known gaps the build-out must close (tracked as P2.* tasks)
|
||||
|
||||
- **Cursor.** DDA/WGC composite the HW cursor host-side from frame-info; the IDD path delivers the
|
||||
cursor separately (`IddCxMonitorSetupHardwareCursor` event → `QueryHardwareCursor`). The prototype
|
||||
may ship cursor-less; the build-out wires the IDD cursor into the existing `CursorCompositor`.
|
||||
- **HDR.** The default IddCx swap-chain surface is 8-bit `B8G8R8A8`; FP16/HDR needs the **IddCx 1.11
|
||||
D3D12 acquire path** (`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`). Build against
|
||||
1.10, runtime-gate 1.11. SDR-only for the prototype.
|
||||
|
||||
## Why we'd do this
|
||||
|
||||
The user's goals, mapped to outcomes:
|
||||
|
||||
| Goal | Outcome |
| --- | --- |
| Drop external deps | No more vendored prebuilt SudoVDA `.dll`/`.cat` (third-party, C++, single upstream). |
| Increase Rust coverage | The display driver joins the gamepad drivers as in-tree Rust UMDF. |
| Own the stack / easier display management | We control the IOCTL protocol, the EDID, the mode list, the watchdog, and can fold the topology/mode logic that's currently scattered in `vdisplay/sudovda.rs` into the driver. |
| Cleaner code | Phase 2 retires `capture/dxgi.rs`'s DDA workarounds + the `win32u.dll` patch. |
|
||||
|
||||
## What we'd be replacing (current architecture)
|
||||
|
||||
- **Driver:** SudoVDA — UMDF2 IddCx, `Class=Display`, `UmdfExtensions=IddCx0102`,
|
||||
`UpperFilters=IndirectKmd`, root-enumerated `Root\SudoMaker\SudoVDA`. Vendored prebuilt under
|
||||
`packaging/windows/sudovda/`, installed by `install-sudovda.ps1` (cert → `nefconc` devnode →
|
||||
`pnputil`). Source is public ([SudoMaker/SudoVDA](https://github.com/SudoMaker/SudoVDA), README-only
|
||||
MIT/CC0 grant over the MS sample, ~1,900 LOC C++).
|
||||
- **Host contract:** `crates/punktfunk-host/src/vdisplay/sudovda.rs` opens the control device by
  interface GUID `{e5bcc234-…}` and drives a tiny `METHOD_BUFFERED` IOCTL protocol, byte-identical to
  SudoVDA's `Common/Include/sudovda-ioctl.h`:
  - `ADD (0x800)` `{w,h,refresh,GUID,name[14],serial[14]}` → `{LUID, target_id}`
  - `REMOVE (0x801)` `{GUID}` · `SET_RENDER_ADAPTER (0x802)` `{LUID}` · `GET_WATCHDOG (0x803)` ·
    `PING (0x888)` (mandatory keepalive) · `GET_VERSION (0x8FF)`
|
||||
- **Capture:** `capture/dxgi.rs` finds the virtual monitor's GDI output **across all adapters** (it's
|
||||
enumerated under the *rendering* GPU, not SudoVDA's LUID) and runs **DXGI Desktop Duplication**
|
||||
(`DuplicateOutput1`, FP16 for HDR). This file is **dominated by virtual-display-over-DDA survival
|
||||
code**: `DXGI_ERROR_ACCESS_LOST` re-duplication with retries, `MODE_CHANGE_IN_PROGRESS` backoff,
|
||||
legacy-`DuplicateOutput` fallback, CCD display isolation to make the IDD the sole composited
|
||||
desktop, and an **`install_gpu_pref_hook()` that patches `win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue`**
|
||||
to stop DXGI reparenting the output across GPUs. Most of that exists *because* we capture a virtual
|
||||
display via DDA on a multi-GPU box.
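The function codes in the host contract above (`0x800`, `0x801`, ...) are not the raw values passed to `DeviceIoControl`; they are the *function* field of the standard Windows `CTL_CODE` packing. A minimal sketch of that packing follows. The bit layout is the documented `CTL_CODE` macro; the device type (`FILE_DEVICE_UNKNOWN`) and access (`FILE_ANY_ACCESS`) are assumptions made here for illustration, since this document only states `METHOD_BUFFERED`, and the `IOCTL_*` constant names are hypothetical.

```rust
/// Windows `CTL_CODE` macro: packs device type, access, function and method into one u32.
pub const fn ctl_code(device_type: u32, function: u32, method: u32, access: u32) -> u32 {
    (device_type << 16) | (access << 14) | (function << 2) | method
}

// ASSUMPTION: device type and access are not given in this document; check sudovda-ioctl.h.
const FILE_DEVICE_UNKNOWN: u32 = 0x22;
const FILE_ANY_ACCESS: u32 = 0;
// Stated in the host contract: the protocol is METHOD_BUFFERED.
const METHOD_BUFFERED: u32 = 0;

/// Builds a full control code from one of the function codes listed above.
pub const fn vd_ioctl(function: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, function, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

// Hypothetical constant names; the function codes are the ones in the list above.
pub const IOCTL_ADD: u32 = vd_ioctl(0x800);
pub const IOCTL_REMOVE: u32 = vd_ioctl(0x801);
pub const IOCTL_SET_RENDER_ADAPTER: u32 = vd_ioctl(0x802);
pub const IOCTL_GET_WATCHDOG: u32 = vd_ioctl(0x803);
pub const IOCTL_PING: u32 = vd_ioctl(0x888);
pub const IOCTL_GET_VERSION: u32 = vd_ioctl(0x8FF);
```

Keeping the packing in one `const fn` means a drop-in driver and the host can share the function codes and still agree on the final values, provided both sides use the same device type and access bits.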
|
||||
|
||||
## Feasibility findings
|
||||
|
||||
### Signing — green (the make-or-break)
|
||||
UMDF user-mode ⇒ Code-Integrity signing rules don't apply to our binary (the only kernel piece is
|
||||
Microsoft's inbox `IndirectKmd`). Self-signed cert in `Root` + `TrustedPublisher` is sufficient on a
|
||||
normal Secure-Boot Win11 box — no `bcdedit /set testsigning`. SudoVDA and `virtual-display-rs` both
|
||||
ship this way. This is the **same** model as our DualSense/DS4/XUSB drivers. (The only thing that
|
||||
breaks install is a botched cert placement, not a signing *tier*.)
|
||||
|
||||
### Rust prior art — exists, MIT, reusable
|
||||
`virtual-display-rs` proves an all-Rust IddCx driver runs in production and gives us:
|
||||
`wdf-umdf-sys` (bindgen over WDF **and** `iddcx.h`, links `IddCxStub`), `wdf-umdf` (safe wrappers —
|
||||
`iddcx.rs` ~300 LOC, with an `IddCxIsFunctionAvailable!` version-gate macro), and a reference driver
|
||||
(`swap_chain_processor.rs` ~158 LOC, `direct_3d_device.rs`, `edid.rs`). **Caveat:** it uses its *own*
|
||||
bindgen stack, **not** `microsoft/windows-drivers-rs` — see Decision D2.
|
||||
|
||||
### windows-drivers-rs IddCx support — absent, but a bounded extension
|
||||
Our `wdk-sys` (m0) binds Base + WDF + feature-gated subsets (hid/gpio/spb/…). **Zero IddCx symbols.**
|
||||
Adding it is the same shape as the existing subsets: an `ApiSubset::Iddcx` variant + `iddcx` feature →
|
||||
`iddcx_headers()` returning `iddcx.h` for bindgen, and linking `IddCx.lib`. IddCx functions are **not**
|
||||
WDF-table functions, so the `call_unsafe_wdf_function_binding!` macro doesn't apply — they're direct
|
||||
`IddCx.lib` exports we'd `#[link(name="IddCx")] extern` (or bindgen) and wrap ourselves.
|
||||
`windows` 0.58 (already in the tree) provides the Direct3D11/Dxgi APIs the swap-chain loop needs.
|
||||
|
||||
### The IddCx driver itself — well-understood, ~1–2k LOC
|
||||
Required callbacks (baselined on the MS [IddSampleDriver](https://github.com/microsoft/Windows-driver-samples/blob/main/video/IndirectDisplay/IddSampleDriver/Driver.cpp), ~1,100 LOC, IddCx 1.4):
|
||||
`EVT_IDD_CX_ADAPTER_INIT_FINISHED`, `ADAPTER_COMMIT_MODES`, `PARSE_MONITOR_DESCRIPTION`,
|
||||
`MONITOR_GET_DEFAULT_DESCRIPTION_MODES`, `MONITOR_QUERY_TARGET_MODES`, `MONITOR_ASSIGN_SWAPCHAIN`
|
||||
(the only callback with real D3D work), `MONITOR_UNASSIGN_SWAPCHAIN`, and `DEVICE_IO_CONTROL` (where
|
||||
our ADD/REMOVE/PING IOCTLs live). Init flow: `WdfDeviceCreate → IddCxDeviceInitConfig →
|
||||
IddCxDeviceInitialize → IddCxAdapterInitAsync → IddCxMonitorCreate → IddCxMonitorArrival`.
|
||||
|
||||
**Arbitrary resolutions don't need EDID timings:** ship one generic ~128/256-byte EDID base block to
|
||||
make Windows treat the target as a real monitor, then advertise modes programmatically from the
|
||||
mode-list callbacks — a static table **plus the runtime-requested client mode injected as preferred**
|
||||
(exactly SudoVDA's `s_DefaultModes[]` + per-ADD preferred-mode approach). 5120×1440@240 just gets
|
||||
added at ADD time.
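The "static table plus injected preferred mode" idea can be sketched in a few lines. This is an illustration under stated assumptions, not the driver's code: `Mode`, `DEFAULT_MODES`, and `build_mode_list` are hypothetical names, the table entries are placeholders, and treating index 0 as the preferred mode is a convention of this sketch rather than something IddCx mandates.

```rust
/// A display mode as the client would request it (illustrative type only).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
}

/// Placeholder static table; the real driver's default list may differ.
pub const DEFAULT_MODES: &[Mode] = &[
    Mode { width: 1920, height: 1080, refresh_hz: 60 },
    Mode { width: 2560, height: 1440, refresh_hz: 60 },
    Mode { width: 3840, height: 2160, refresh_hz: 60 },
];

/// Returns the list to advertise: the requested mode first (treated as preferred),
/// followed by the static defaults with any exact duplicate of the request removed.
pub fn build_mode_list(requested: Mode) -> Vec<Mode> {
    let mut modes = Vec::with_capacity(DEFAULT_MODES.len() + 1);
    modes.push(requested);
    modes.extend(DEFAULT_MODES.iter().copied().filter(|m| *m != requested));
    modes
}
```

With a request of 5120×1440@240 the result has four entries with the requested mode at index 0; requesting a mode already in the table leaves the list at three entries, so the mode-list callbacks never advertise the same mode twice.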
|
||||
|
||||
**HDR/10-bit:** supported, but it's the one place IddCx is *harder* than today. The default swap-chain
|
||||
surface is **8-bit `A8R8G8B8`**; FP16/HDR requires the IddCx **1.11 D3D12 acquire path**
|
||||
(`SetDevice2`/`ReleaseAndAcquireBuffer2` → `ID3D12Resource`, with a stricter sync model). Our box is
|
||||
Win11 26200 (IddCx ≥ 1.10), so this is reachable, but it's real work — and our current WGC/DDA path
|
||||
gives FP16 HDR "for free." Build against 1.10 and runtime-gate the newer DDIs (SudoVDA's pattern).
|
||||
|
||||
## The architectural prize: skip DDA (Phase 2)
|
||||
|
||||
An IddCx driver gets each presented frame from `IddCxSwapChainReleaseAndAcquireBuffer` as an
|
||||
`IDXGIResource` on a device **we** bind via `IddCxSwapChainSetDevice`. We can copy it into a shared
|
||||
texture / shared section and hand it to the host's encoder process directly — **no Desktop
|
||||
Duplication**. Why this is the real win, not just a detour:
|
||||
|
||||
- **It's the *intended* IddCx use case.** IddCx exists for remote/wireless/USB displays that ship
|
||||
swap-chain frames over a wire; consuming frames in the driver is the designed path, and **Looking
|
||||
Glass already does exactly this** (driver → shared memory → separate consumer, no DDA).
|
||||
- **It kills the multi-GPU bug class.** We call `IddCxAdapterSetRenderAdapter` to pin the swap-chain to
|
||||
the **same GPU as our NVENC encoder before adding the monitor**, and the OS honors it. No more DXGI
|
||||
reparenting the output onto the wrong GPU, no ACCESS_LOST storms, and we can **retire
|
||||
`install_gpu_pref_hook()` (the `win32u.dll` patch)** and most of `capture/dxgi.rs`. Swap-chain
|
||||
re-creation becomes a documented, in-band event (`ABANDON_SWAPCHAIN`) instead of an undocumented
|
||||
failure we fight with retries.
|
||||
|
||||
What it does **not** remove (be honest): display **topology** management — making the virtual display
|
||||
the sole/primary composited desktop so the game (and Winlogon) render to it — is independent of how we
|
||||
*get* frames and stays (though we can integrate it more cleanly). And the watchdog stays, now ours.
|
||||
|
||||
The cost: a **Session-0 → service cross-process frame transport** (the driver host is `WUDFHost` in
|
||||
Session 0 / LocalService; our host is a LocalSystem service). A `Global\`-named, explicitly-ACL'd
|
||||
shared section + keyed-mutex texture (Looking Glass's shape) is where the engineering actually goes —
|
||||
prototype this first, it's the only genuinely new risk. Plus the HDR D3D12 path above.
|
||||
|
||||
## Decisions to make at kickoff
|
||||
|
||||
- **D1 — Own the driver?** Recommend **yes, in Rust.** (Alternatives: fork SudoVDA's C++ — fastest to a
|
||||
known-good HDR driver but reintroduces a C++ toolchain and README-only license provenance; or keep
|
||||
vendoring — zero cost, but none of the goals.)
|
||||
- **D2 — Binding stack?** The main implementation fork.
|
||||
- **(a)** Extend our `windows-drivers-rs` (m0) with an `iddcx` subset — **one toolchain across all
|
||||
our drivers**, our build env, but we write the IddCx bindings ourselves (+~3–5 wk), using
|
||||
`virtual-display-rs`'s `iddcx.rs` as the 1:1 guide. *Preferred for consistency.*
|
||||
- **(b)** Vendor `virtual-display-rs`'s `wdf-umdf*` crates (MIT) — fastest to first light, but a
|
||||
*second* WDK-binding stack in-tree.
|
||||
- Suggested sequence: **prototype on (b) to prove IddCx-on-our-box in days**, then build production on
|
||||
**(a)** for consistency.
|
||||
- **D3 — Frame transport?** Phase it: **DDA-compatible first** (zero capture-side change), **direct
|
||||
push second** (the cleanup). Don't couple the driver rewrite to the transport rewrite.
|
||||
|
||||
## Recommended plan
|
||||
|
||||
- **P0 — now:** keep vendoring SudoVDA. No change. (The gamepad-driver installer work just shipped;
|
||||
this is independent.)
|
||||
- **P1 — drop-in Rust IddCx driver (`pf-vdisplay`).** Replicate SudoVDA's IOCTL contract **exactly**
|
||||
(same struct layouts; reuse or re-issue the control interface GUID) so `vdisplay/sudovda.rs` needs
|
||||
**~zero change** (at most a GUID constant). Class=Display + IddCx INF, our own EDID + programmatic
|
||||
mode list incl. the per-ADD client mode, the watchdog, a real swap-chain drain (the vdd port — the
|
||||
drain is required so DWM keeps compositing; DDA/WGC still captures the desktop). Bundle + self-sign +
|
||||
`pnputil`-install via the installer, identical to the gamepad-driver path we just built. **Outcome:** all-Rust, SudoVDA dependency dropped, DDA capture
|
||||
unchanged. Effort ≈ **2–4 wk to first light**, **5–7 wk to parity** (HDR, multi-monitor, CI).
|
||||
- **P2 — direct frame push (kill DDA).** Add a swap-chain processor that copies each frame into a
|
||||
shared section/texture; new `capture` backend reads it directly; pin the render adapter to the
|
||||
encoder GPU. Gate behind a flag, validate against DDA, then retire the DDA path + the `win32u.dll`
|
||||
patch. HDR via the IddCx 1.11 D3D12 acquire path. **Outcome:** the real "owning the stack pays off"
|
||||
cleanup. Effort: additional; the Session-0 transport is the long pole.
|
||||
|
||||
## Risks
|
||||
|
||||
1. **D3D-in-a-driver swap-chain loop:** the one genuinely new piece; bugs here = black screens/TDR.
   Mitigated by `virtual-display-rs`'s `swap_chain_processor.rs` + the MS sample as references.
2. **Session-0 cross-process transport** (P2): the actual hard part; prototype it first.
3. **HDR = the harder D3D12 1.11 path:** our current WGC/DDA HDR is free; the IddCx HDR path is not.
4. **Two binding stacks** if we go D2(b): a maintenance cost cutting against "clean/consistent."
5. **No WHQL ⇒ no Windows Update / Dev-Center distribution:** same constraint our gamepad drivers
   already accept (bundle + self-sign + import cert).
|
||||
|
||||
## References
|
||||
|
||||
- IddCx model + signing: [IDD model overview](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/indirect-display-driver-model-overview) ·
|
||||
[IddCx versions](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/iddcx-versions) ·
|
||||
[1.10+ updates](https://learn.microsoft.com/en-us/windows-hardware/drivers/display/iddcx1.10-updates) ·
|
||||
[UMDF signing](https://learn.microsoft.com/en-us/archive/blogs/peterwie/do-umdf-drivers-require-signing)
|
||||
- Swap-chain / frames: [IDDCX_METADATA](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/ns-iddcx-iddcx_metadata) ·
|
||||
[SetDevice](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nf-iddcx-iddcxswapchainsetdevice) ·
|
||||
[SetRenderAdapter](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nf-iddcx-iddcxadaptersetrenderadapter) ·
|
||||
[ASSIGN_SWAPCHAIN](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/nc-iddcx-evt_idd_cx_monitor_assign_swapchain)
|
||||
- Prior art: [microsoft IddSampleDriver](https://github.com/microsoft/Windows-driver-samples/tree/main/video/IndirectDisplay) ·
|
||||
[SudoMaker/SudoVDA](https://github.com/SudoMaker/SudoVDA) ([ioctl.h](https://github.com/SudoMaker/SudoVDA/blob/master/Common/Include/sudovda-ioctl.h)) ·
|
||||
**[MolotovCherry/virtual-display-rs (Rust, MIT)](https://github.com/MolotovCherry/virtual-display-rs)** ·
|
||||
[Looking Glass IDD (swap-chain → shm, no DDA)](https://deepwiki.com/gnif/LookingGlass/2.5-indirect-display-driver-(idd)) ·
|
||||
[itsmikethetech/Virtual-Display-Driver](https://github.com/itsmikethetech/Virtual-Display-Driver)
|
||||
**None.** P1 shipped; P2 is a permanent *do-not-pursue* record (no pending work). WGC/DDA is the shipping
|
||||
capture path.
|
||||
|
||||