enricobuehler 48202a0f89 docs(windows-rewrite): mark game-capture bug FIXED + bring rewrite status current (§15)
The fullscreen-game-breaks-IDD-push bug is FIXED by the resolution-listening
recovery (c87bfe0: the 250ms poll now follows the display's actual resolution
and recreates the ring on any descriptor change, recover-or-drop), backed by
open-time first-frame DDA failover (f98ab07) and the driver publish() width/
height guard + flushed logging (789ad49). No protocol bump was needed — the host
reads the real resolution straight from Windows (CCD/GDI), so the bug doc's
Stage-1 composing capturer + Stage-2 protocol bump were unnecessary. Bug doc
marked FIXED with a Resolution section; the staged plan kept as superseded record.

windows-host-rewrite.md: the progress log was stale (ended at "M1 cont."). Added
§15 Current status — the driver STEP 0-8 port landed on main on-glass HDR-
validated; the host was refactored *in place* via windows-host-goal1 (not the §10
greenfield rebuild); §2.5 ownership model resolved the swap-chain-reuse / monitor-
leak open item; iddcx + /INTEGRITYCHECK CI-green. Remaining: the secure-desktop
on-glass gate (the single biggest unproven claim), M4 gamepad-driver migration,
M5/M6 cleanup, and the pf-vdisplay slot-reclaim driver fix. Top Status flipped
proposed → largely implemented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 21:35:55 +00:00

# Windows Host Rewrite — Design & Plan
Status: **largely implemented** (updated 2026-06-25 — see [§15 Current status](#15-current-status-2026-06-25)
for the milestone-by-milestone state; §0–§14 below are the original design and remain the reference). This
plan takes the current, hard-won Windows host (pf-vdisplay all-Rust IddCx driver + IDD-push zero-copy
capture, live-validated 5120×1440@240 HDR on the RTX box) as a *knowledge base* and re-derives a clean,
stable, well-layered architecture from it. It drops all SudoVDA back-compat (we own both ends now) and
drives `unsafe` to a contained minimum.
It supersedes the stale conclusion in `docs/windows-virtual-display-rust-port.md` ("IDD-push not
viable") — that verdict was written in the *same commit* (`e2c9bfd`) that shipped the working
922-line consumer + 424-line producer. **IDD-push works and is the architecture.** The breakthrough the
prose never recorded: once the CCD topology makes the virtual display the sole composited desktop in the
console session, DWM composites to it and the IddCx swap-chain *is* assigned
(`run_core: FIRST FRAME acquired — DWM IS compositing the virtual display!`). Per the owner, **IDD-push
also captures the secure desktop (Winlogon / UAC / lock)** — so it is the universal primary path, not
just the normal-desktop path.
### Decisions resolved (2026-06-24)
| Decision | Options | Chosen |
|----------|---------|--------|
| A. Execution | greenfield vs staged | **Greenfield rewrite** — rebuild the Windows host fresh against the clean architecture, salvaging the validated "jewels" (§1) verbatim. (Risk acknowledged: no CI for the Windows paths — mitigated by the §1 preservation checklist + on-glass gates, §10.) |
| B. Capture surface | IDD-only / IDD+secure-DDA / keep fallbacks | **IDD-push primary for everything (incl. the secure desktop); keep WGC + DDA as fallbacks.** |
| C. Driver binding stack | wdf-umdf vs windows-drivers-rs | **Extend `microsoft/windows-drivers-rs`** with an `iddcx` subset; unify all three drivers on it; **solve `/INTEGRITYCHECK` properly** (§6). |
| D. GameStream on Windows | keep / keep-secure-default / drop | **Keep Moonlight compat; flip the installer/service default to secure `serve`** (GameStream an explicit opt-in). |
---
## 0. Goals (from the brief)
1. **Clean, stable, well-layered architecture.** Decompose the god-files, give every subsystem one
owner, and replace the ~40-knob `PUNKTFUNK_*` env soup with a typed config resolved once per session.
2. **Drop every trace of SudoVDA back-compat.** We own the driver (`pf-vdisplay`) and the host. The
byte-identical IOCTL ABI, the reused `{e5bcc234}` GUID, the `sudovda` module name, the "SudoVDA
ignores this" conditionals — all pure liability now.
3. **Minimize `unsafe`.** ~480 `unsafe` occurrences across the Windows surface; the large majority are
FFI-mechanical (windows-rs/NVENC/WDK already return `Result`). Target: host ~144→~35, drivers
~227→~60, with the irreducible floor *contained* in 3–4 named modules under
`deny(unsafe_op_in_unsafe_fn)`.
### Non-goals / invariants (do not regress)
- **Linux host behavior is out of scope and must not change.** The host crate is shared; Linux is
validated across KWin/gamescope/Mutter/Sway. Touch only the seams.
- **`punktfunk-core` stays the one linked core.** Protocol/FEC/crypto/QUIC live there behind the C
ABI; the host is a leaf binary. No protocol changes here.
- **No async on the per-frame path.** Native threads only (the existing discipline).
---
## 1. What we KEEP (validated, load-bearing — port, don't rewrite)
These are expensive empirical wins. The rewrite relocates/wraps them but must preserve behavior
byte-for-byte:
- **The IDD-push frame transport shape**: host-creates / driver-opens shared keyed-mutex texture ring
with the permissive `D:(A;;GA;;;WD)` SDDL (forced by the restricted WUDFHost token, mirrors the
gamepad drivers); the generation-tagged `latest = gen<<40 | seq<<8 | slot` stale-ring reject (kills
the HDR-flip garbage frame); 0 ms try-acquire / drop-on-full publish (never block the swap-chain
thread); the host output ring `OUT_RING` + `pipeline_depth=2` overlap of convert/copy vs NVENC.
- **The IddCx driver internals that earned their keep**: `edid.rs` in full (128-byte EDID + CTA-861.3
HDR block, serial-as-index round-trip, dual checksums); the HDR enablement recipe (`CAN_PROCESS_FP16`
+ the `*2` mode DDIs + `set_gamma_ramp`/`set_default_hdr_metadata` accept-stubs + `HIGH_COLOR_SPACE` +
8|10 bpc); `DEVICE_POOL` one-device-per-render-LUID (the NVIDIA UMD-thread/VRAM leak fix); stamping
the OS target id onto the monitor context (the recreated-monitor `target_id=0` fix); the swap-chain
processor's two real leak fixes (borrow `IDXGIDevice` across `SetDevice` retries; check `terminate`
at the loop top during a frame burst).
- **The monitor-lifecycle concurrency correctness**: serialized ADD/REMOVE/teardown, the documented
lock order, the watchdog CAS + re-check-under-lock, the creation grace window, the
generation-stamped lease (a stale lease can't tear down a fresh monitor). *Structure* can change;
these properties must survive.
- **The CCD topology fixes**: `isolate_displays_ccd` (the iGPU-attached-monitor hybrid-box correctness;
the `SDC_FORCE_MODE_ENUMERATION` re-commit that drives `COMMIT_MODES → ASSIGN_SWAPCHAIN`); restore
topology *before* REMOVE.
- **The HDR color math**: `hdr.rs` verbatim (pure, unit-tested, ST.2086 G/B/R + big-endian SEI);
`HdrConverter`/`HdrP010Converter` + the f64 `p010_reference` + `hdr_p010_selftest`; `VideoConverter`
(RGB→NV12/P010 on the video engine — a measured latency win); the cursor decomposition
(`convert_pointer_shape` color/masked/monochrome edge cases).
- **NVENC tuning**: caps-probe-before-configure (disambiguate unsupported-config vs too-high-bitrate;
10-bit→8-bit graceful downgrade); the bitrate-clamp binary search (finds each GPU's real ceiling);
true RFI over the DPB; the low-latency configs (CBR, infinite GOP, P-only, ~1-frame VBV).
- **The gamepad driver wins**: the SwDeviceCreate identity recipe (enumerator with no `_`; mandatory
completion callback; synthesized `USB\VID_054C&PID_0CE6` compat-ids for native-DS5 detection; the
non-null per-pad `ContainerId` dodging the xinput1_4 slot-skip); one `pf_dualsense` serving
DualSense+DS4 via a `device_type` byte; XUSB declining `WAIT_*` to force synchronous `GET_STATE`;
the static HID descriptors/feature blobs; per-pad index via `pszDeviceLocation`.
- **The session-glue patterns**: the `Capturer`/`VirtualDisplay`/`Encoder` trait seam + RAII keepalive
teardown; host-lifetime shared services (`InjectorService`/`MicService`/`AudioCapSlot`) with
per-session gamepads; the encode|send thread split + microburst pacing; `build_pipeline_with_retry`
+ permanent-vs-transient classification; the control-task `select!` + adaptive-FEC; the GameStream
`VideoPacketizer` (GF8 Cauchy, Moonlight byte-exact); the pairing/trust handshake.
- **The SCM supervisor model**: Session-0 LocalSystem supervisor → token-retarget →
`CreateProcessAsUserW` `serve` into the console session, relaunch-on-session-change, kill-on-close
Job Object; the file-append log-mask; the two-tier logging init.
- **Build/CI wins**: the `wdf-umdf-sys` build.rs SDK-version resolution (picks the SDK version that
actually contains `iddcx`, not the max base SDK); the ARM64 cross-compile off the x64 runner; the
thin-.iss / fat-binary installer delegating to `service install`.
---
## 2. Target architecture
### 2.1 Crate & workspace strategy
**Keep ONE shared `crates/punktfunk-host` crate** (do *not* split `punktfunk-host-windows`). The host is
a leaf binary consumed by nobody; the "one core, linked everywhere" invariant is already satisfied by
`punktfunk-core`. A split would only fork the genuinely-shared session glue, traits, and `hdr.rs`. The
cfg-sprawl win comes instead from confining all Windows code under one `src/windows/` subtree behind a
single `#[cfg(windows)] mod windows;` seam, with backend impls next to their trait's dispatch point.
**Pull the three drivers into ONE in-tree driver workspace** (`packaging/windows/drivers/`) on a single
binding stack, one `rust-toolchain.toml`, one signing recipe, one CI build. Today they are 2–3 disjoint
cargo packages on two incompatible WDK stacks (see §6).
**Add ONE shared `no_std` ABI crate** (`crates/pf-vdisplay-proto`, name TBD) consumed by both the host
crate and the driver workspace. It owns *every* cross-process binary contract that is currently
hand-duplicated with "must match" comments. This is the single highest-value correctness change (§3.1).
### 2.2 Target file tree (host crate)
```
crates/punktfunk-host/src/
  main.rs                   clap-derive subcommand dispatch only (kills parse_serve/parse_spike/hand --help)
  config.rs                 HostConfig (typed; parsed ONCE from host.env/env/flags) + config_dir
  session/
    mod.rs                  SessionFactory, SessionPlan, SessionContext, Session (the ONLY teardown path)
    server.rs               QUIC accept loop, handshake, shared-service wiring
    serve_session.rs        resolve_* → Welcome/Start → spawn → RAII teardown
    control.rs              mid-stream renegotiation select! loop
    pipeline.rs             REAL shared encode|send split, send_loop, FrameMsg, pacing (used by native AND GameStream)
  capture.rs                Capturer trait + CapturedFrame/PixelFormat/FramePayload (platform-neutral)
  capture/linux.rs
  capture/windows/          mod.rs (dispatch), idd_push.rs, dda.rs, wgc.rs, secure_desktop.rs*
  vdisplay.rs               VirtualDisplay/VirtualOutput trait + open() dispatch (neutral)
  vdisplay/{kwin,gamescope,mutter,wlroots}.rs
  vdisplay/windows.rs       was sudovda.rs → PfVirtualDisplay + VirtualDisplayManager
  encode.rs                 Encoder trait, EncodedFrame, validate_dimensions, open_encoder dispatch
  encode/{linux,vaapi,sw}.rs
  encode/windows/           mod.rs (dispatch), nvenc.rs, nvenc_sys.rs, ffmpeg_win/{mod,system,zerocopy,d3d11va_ffi}.rs
  hdr.rs                    PRESERVE VERBATIM
  inject.rs / inject/linux/* / inject/windows/{mod,sendinput,pad_manager,xusb,dualsense,dualshock4,swdevice,section}.rs
  inject/proto/{dualsense,dualshock4}.rs  shared pure codecs (PRESERVE)
  audio.rs / audio/linux.rs / audio/windows/{mod,wasapi_cap,wasapi_mic}.rs
  windows/                  mod.rs, d3d/{mod,texture,ring,convert}.rs, color/{hdr,p010,video_proc}.rs,
                            cursor.rs, display_ccd.rs, adapter.rs, process.rs (Token/Event/Job/Child/spawn_as_user),
                            service.rs (SCM; uses process.rs), win32u_hook.rs*, gpu_priority.rs
  session_tuning.rs (PRESERVE) / pwinit.rs / discovery.rs / mgmt.rs / native_pairing.rs / library.rs
  gamestream/               unchanged module set; stream.rs slims by reusing session/pipeline.rs
```
`*` = survives only per the secure-desktop / WGC product decisions (§4, §11).
### 2.3 The seam traits (keep the shape; tighten 3 things)
```rust
trait VirtualDisplay: Send {
    fn name(&self) -> &str;
    fn create(&self, mode: Mode) -> Result<VirtualOutput>;
    fn set_launch_command(&self, cmd: Option<String>); // per-instance, not a global env var
}

struct VirtualOutput {
    node_id: u32,
    preferred_mode: Mode,
    #[cfg(windows)]
    win_capture: WinCaptureTarget, // target_id + adapter_luid + monitor_gen (carried, not ambient)
    keepalive: Box<dyn VirtualLease>,
}

// Drop = release; replaces the sudovda free-fns + CURRENT_MON_GEN reach-in
trait VirtualLease: Send {
    fn set_hdr(&self, on: bool) -> Result<()>;
    fn hdr_enabled(&self) -> bool;
    fn await_released(&self, timeout: Duration) -> bool;
}

trait Capturer: Send {
    fn next_frame(&mut self) -> Result<CapturedFrame>;
    fn try_latest(&mut self) -> Option<CapturedFrame>;
    fn set_active(&mut self, a: bool);
    fn hdr_meta(&self) -> Option<HdrMeta>;
    fn pipeline_depth(&self) -> usize;
}

// format+HDR passed IN
fn open_capturer(vout: VirtualOutput, want: OutputFormat) -> Result<Box<dyn Capturer>>;

trait Encoder: Send {
    fn submit(&mut self, f: &CapturedFrame) -> Result<()>;
    fn poll(&mut self) -> Option<EncodedFrame>;
    fn flush(&mut self);
    fn request_keyframe(&mut self);
    fn caps(&self) -> EncoderCaps; // query, don't rely on default no-ops
    fn set_hdr_meta(&mut self, m: Option<HdrMeta>);
    fn invalidate_ref_frames(&mut self, lo: u64, hi: u64) -> bool;
}

fn open_encoder(plan: &EncodePlan) -> Result<Box<dyn Encoder>>;

trait AudioCapturer: Send { fn next_chunk(&mut self) -> Result<Vec<f32>>; fn channels(&self) -> u16; fn drain(&mut self); }
trait VirtualMic: Send { fn push(&mut self, pcm: &[f32]); fn channels(&self) -> u16; }
trait InputInjector: Send { fn inject(&mut self, e: &InputEvent); }
trait PadManager: Send { /* handle/apply_rich/pump/heartbeat — Box<dyn PadManager> via select(GamepadPref), replaces the PadBackend enum */ }
```
The three tightenings: (1) `open_capturer` takes the desired `OutputFormat` IN, which kills the
`capture → encode::windows_resolved_backend()` back-reference that's recomputed in `dxgi.rs`; (2) HDR
control + monitor-release become `VirtualLease` methods so the session glue never names a concrete
backend and contains zero `unsafe`; (3) optional encoder capabilities are queried via `EncoderCaps`.
### 2.4 SessionFactory + typed plan (the single biggest clarity lever)
Today the Windows capture/topology/encoder decision is made by ~40 scattered env reads, recomputed in
THREE places (`capture_virtual_output`, `should_use_helper`, `virtual_stream`) with no single owner and
a latent mirrored-dispatch bug (capture and encode can disagree on the backend). Replace with:
```rust
struct SessionPlan {
    display: DisplayBackend,
    capture: CaptureBackend,   // IddPush | Dda | Wgc
    topology: SessionTopology, // SingleProcess | TwoProcessRelay
    encoder: EncoderBackend,   // Nvenc | Amf | Qsv | Software
    input_format: OutputFormat,
    bit_depth: u8,
    hdr: bool,
    pipeline_depth: usize,
}

struct SessionFactory {
    cfg: Arc<HostConfig>,
    vdm: Arc<VirtualDisplayManager>,
    // Host-lifetime shared services (§1); the exact handle types are TBD.
    injector: InjectorService,
    mic: MicService,
    audio: AudioCapSlot,
}

impl SessionFactory {
    // Resolves ONCE from HostConfig; no env reads downstream.
    fn plan(&self, welcome: &Welcome) -> SessionPlan;
    // Owns the RAII chain.
    fn build(&self, plan: &SessionPlan, ctx: SessionContext) -> Result<Session>;
}
```
`build()` owns the chain `vdm.lease(mode) → open_capturer(vout, fmt) → open_encoder(plan) → spawn
pipeline`, and `Session::drop` is the only teardown path. This kills the env soup, makes the deployed
path readable, and removes the capture/encode backend-disagreement bug class. It also lets us drop the
12–13-arg `#[allow(too_many_arguments)]` signatures (replaced by a `SessionContext` struct) and the dead
`Compositor` ceremony threaded through the Windows path.
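
A minimal sketch of the "resolve once" idea, under stated assumptions: the `PUNKTFUNK_CAPTURE` / `PUNKTFUNK_ENCODER` / `PUNKTFUNK_HDR` key names, the `ClientRequest` stand-in for `Welcome`, and the reduced field set are all hypothetical and chosen only to keep the example small. The point is the shape: strings are parsed into enums in one place, and `plan()` is a pure function of typed inputs.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CaptureBackend { IddPush, Dda, Wgc }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncoderBackend { Nvenc, Software }

/// Typed host configuration, parsed exactly once at startup.
#[derive(Debug, Clone, PartialEq, Eq)]
struct HostConfig {
    capture: CaptureBackend,
    encoder: EncoderBackend,
    allow_hdr: bool,
}

impl HostConfig {
    /// Build the config from a snapshot of key/value settings.
    /// The key names below are hypothetical, not the real `PUNKTFUNK_*` knobs.
    fn from_vars(vars: &HashMap<String, String>) -> Result<Self, String> {
        let capture = match vars.get("PUNKTFUNK_CAPTURE").map(String::as_str) {
            None | Some("idd") => CaptureBackend::IddPush, // IDD-push is the default
            Some("dda") => CaptureBackend::Dda,
            Some("wgc") => CaptureBackend::Wgc,
            Some(other) => return Err(format!("unknown capture backend: {other}")),
        };
        let encoder = match vars.get("PUNKTFUNK_ENCODER").map(String::as_str) {
            None | Some("nvenc") => EncoderBackend::Nvenc,
            Some("sw") => EncoderBackend::Software,
            Some(other) => return Err(format!("unknown encoder backend: {other}")),
        };
        let allow_hdr = vars.get("PUNKTFUNK_HDR").map(String::as_str) != Some("0");
        Ok(HostConfig { capture, encoder, allow_hdr })
    }
}

/// Stand-in for the client's `Welcome` message (the field is an assumption).
struct ClientRequest { wants_hdr: bool }

#[derive(Debug, PartialEq, Eq)]
struct SessionPlan {
    capture: CaptureBackend,
    encoder: EncoderBackend,
    bit_depth: u8,
    hdr: bool,
    pipeline_depth: usize,
}

/// Resolve every backend decision once; nothing downstream reads the environment.
fn plan(cfg: &HostConfig, req: &ClientRequest) -> SessionPlan {
    // Assumption for this sketch: the software encoder is 8-bit SDR only.
    let hdr = cfg.allow_hdr && req.wants_hdr && cfg.encoder != EncoderBackend::Software;
    SessionPlan {
        capture: cfg.capture,
        encoder: cfg.encoder,
        bit_depth: if hdr { 10 } else { 8 },
        hdr,
        // Depth 2 for IDD-push mirrors the `pipeline_depth=2` overlap in §1; 1 elsewhere is assumed.
        pipeline_depth: if cfg.capture == CaptureBackend::IddPush { 2 } else { 1 },
    }
}
```

Because capture and encoder are resolved into the same `SessionPlan` value, the two cannot disagree later; an unknown setting fails at startup instead of mid-session.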
### 2.5 Ownership model — delete the global statics
> **✅ IMPLEMENTED (2026-06-25, branch `windows-host-goal1`).** Landed as 3 steps + an on-glass
> reconnect-leak test — see [`windows-host-goal1-plan.md`](windows-host-goal1-plan.md) §2.5 for the
> commits + results. One deviation from the sketch below: the 5-agent map found **`CURRENT_MON_GEN` was
> write-only** (the per-frame monitor-gen bail was never wired), so the "generation carried through
> `WinCaptureTarget`" item was unnecessary and dropped; the gen lives on the manager + lease only.
Today the lifecycle is smeared across `IDD_PERSIST` + `open_or_reuse` (dead code), `CURRENT_MON_GEN`
(read per-frame), `IDD_SETUP_LOCK`/`IDD_SESSION_STOP` (the preempt dance), `MGR: Mutex<Mgr>`, and on
the driver side `ADAPTER`/`MONITOR_MODES`/`NEXT_ID`/`WATCHDOG_*`/`DEVICE_POOL`. Replace with:
- A host-lifetime **`VirtualDisplayManager`** owning a *typed* `OwnedHandle` device handle (not a raw
`isize` smuggled across threads) and the refcounted Idle/Active/Lingering state machine (preserve the
machine — it's earned).
- A per-session **`MonitorLease`** whose `Drop` releases the refcount; the monitor **generation carried
through `WinCaptureTarget`** instead of the ambient `CURRENT_MON_GEN`.
- On the driver: **wire `EvtCleanupCallback` for `MonitorContext`** (only `DeviceContext` has it today)
so the `SwapChainProcessor` + D3D resources drop via WDF RAII — deleting `free_swap_chain_processor`
and the manual-free-before-departure dance that is the **documented dominant reconnect leak**. Move
the process-global driver state into the `DeviceContext`; collapse the 3-way monitor identity
(`MONITOR_MODES` / EDID serial / context stamp) to one `Monitor` owned by the context.
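
The manager/lease split can be sketched as below. This is an illustration only, not the landed code: the `ManagerState` fields, `force_remove`, and `snapshot` are hypothetical, and the Idle/Active/Lingering machine is reduced to a `monitor_present` flag with no linger timer. What it does show is the property §1 asks to preserve: a lease stamped with an old generation cannot tear down a fresh monitor.

```rust
use std::sync::{Arc, Mutex, MutexGuard};

#[derive(Debug, Default)]
struct ManagerState {
    generation: u64,       // bumped each time a fresh monitor is created
    refcount: u32,         // live leases on the current generation
    monitor_present: bool, // simplified stand-in for Idle/Active/Lingering
}

/// Host-lifetime owner of the virtual monitor (hypothetical shape).
#[derive(Clone, Default)]
struct VirtualDisplayManager {
    state: Arc<Mutex<ManagerState>>,
}

/// Per-session lease; `Drop` releases the refcount.
struct MonitorLease {
    state: Arc<Mutex<ManagerState>>,
    generation: u64,
}

// Tolerate a poisoned lock so that `Drop` never panics while unwinding.
fn lock(state: &Mutex<ManagerState>) -> MutexGuard<'_, ManagerState> {
    state.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

impl VirtualDisplayManager {
    fn lease(&self) -> MonitorLease {
        let mut s = lock(&self.state);
        if !s.monitor_present {
            // A real manager would issue the driver ADD here.
            s.generation += 1;
            s.monitor_present = true;
            s.refcount = 0;
        }
        s.refcount += 1;
        MonitorLease { state: Arc::clone(&self.state), generation: s.generation }
    }

    /// Forced removal (for example a watchdog expiry): outstanding leases become stale.
    fn force_remove(&self) {
        let mut s = lock(&self.state);
        s.monitor_present = false;
        s.refcount = 0;
    }

    fn snapshot(&self) -> (u64, u32, bool) {
        let s = lock(&self.state);
        (s.generation, s.refcount, s.monitor_present)
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = lock(&self.state);
        // A stale lease must not touch a newer monitor.
        if s.generation != self.generation || !s.monitor_present {
            return;
        }
        s.refcount -= 1;
        if s.refcount == 0 {
            // A real manager would issue the driver REMOVE (or start the linger timer) here.
            s.monitor_present = false;
        }
    }
}
```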
---
## 3. The host↔driver contract (own it; define once)
### 3.1 `pf-vdisplay-proto` (no_std, bytemuck/zerocopy)
One crate, both build graphs (path dep). Owns:
- **Control plane**: a fresh interface GUID; a contiguous, versioned op enum; `#[repr(C)]` request/reply
structs carrying only used fields.
- **Frame plane**: `SharedHeader`, the `FrameToken { generation, seq, slot }` with `pack`/`unpack`
(replacing the hand-twiddled `gen<<40|seq<<8|slot` on both sides), the `Global\pfvd-*` name helpers.
- **Gamepad sections**: `XusbShm` (64 B) and `PadShm` (256 B, incl. `device_type`) layouts.
- Derive `FromBytes`/`IntoBytes`/`Pod`; `const` size+offset asserts; round-trip tests. **ABI drift
becomes a compile error, not a runtime corruption.** (bytemuck is already a dep in the driver +
wdf-umdf-sys.) This deletes every `OFF_*` constant + `read/write_unaligned` on both sides of every
boundary — the largest single block of shared-memory `unsafe`, and the top drift hazard.
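
As a rough sketch of what the frame-plane part of such a crate could look like: the shift amounts come from the `gen<<40 | seq<<8 | slot` layout quoted above, but the field widths (24-bit generation, 32-bit sequence, 8-bit slot) are inferred from those shifts, and the `SharedHeader` fields are purely hypothetical. The real crate would derive `bytemuck`/`zerocopy` traits; this example stays dependency-free.

```rust
/// Decoded form of the 64-bit `latest` word published by the driver.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameToken {
    generation: u32, // only the low 24 bits are transmitted
    seq: u32,
    slot: u8,
}

const GEN_MASK: u64 = (1 << 24) - 1;

impl FrameToken {
    /// Pack as `gen << 40 | seq << 8 | slot`.
    const fn pack(self) -> u64 {
        ((self.generation as u64 & GEN_MASK) << 40) | ((self.seq as u64) << 8) | self.slot as u64
    }

    /// Inverse of `pack`; the `as` casts truncate to each field's width.
    const fn unpack(raw: u64) -> Self {
        FrameToken {
            generation: (raw >> 40) as u32,
            seq: (raw >> 8) as u32,
            slot: raw as u8,
        }
    }

    /// Stale-ring reject: a token from an older ring generation is dropped.
    fn is_current(self, ring_generation: u32) -> bool {
        self.generation == ring_generation & GEN_MASK as u32
    }
}

/// Hypothetical shared header; the real field set is not specified here.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
struct SharedHeader {
    magic: u32,
    version: u32,
    latest: u64,
}

// Layout drift fails the build instead of corrupting frames at runtime.
const _: () = assert!(core::mem::size_of::<SharedHeader>() == 16);
const _: () = assert!(core::mem::align_of::<SharedHeader>() == 8);
```

With both sides calling the same `pack`/`unpack`, a change to the bit layout is a one-crate edit rather than two hand-synchronized ones.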
### 3.2 Control plane — keep DeviceIoControl, redesign the ABI
`DeviceIoControl` is the correct WDF idiom for a driver with no control device and is low-frequency
(ADD/REMOVE per session + a keepalive); the shared-memory pattern buys nothing here. Keep it; redesign
the surface:
- Ops actually needed: `Add(mode, identity) → {luid, target_id}`, `Remove`, `SetRenderAdapter`
(now **unconditional** — pf-vdisplay honors it for hybrid-GPU IDD-push; drop the SudoVDA-parity
default-off branch), `ClearAll` (first-class startup orphan reap, not an "ignored by SudoVDA" hack),
`GetInfo` (a real version handshake), and keepalive (see §3.4).
- Drop the SudoVDA-isms: `AddParams.device_name[14]`/`serial[14]` (ignored), the 16-byte GUID → a
monotonic `u64` session id (the refcount manager owns collision safety; retires `next_monitor_guid`'s
pid-mangling), the 4-byte `{major,minor,incr,test}` version tuple → one `u32`, the gappy
`0x800/0x888/0x8FF` func numbering → contiguous.
- One typed IOCTL dispatch helper retrieves+validates+aligns the buffers and hands the body a safe
`&Req` / `&mut MaybeUninit<Reply>` — collapses ~20 of `control.rs`'s 29 `unsafe` blocks.
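
One possible shape for that dispatch helper, shown without any WDF calls so it stays self-contained. Everything named here is an assumption for illustration: the `Wire` marker trait stands in for a `bytemuck`-style `Pod` bound, `AddRequest`/`AddReply` are invented layouts, and the sketch copies the request out with `read_unaligned` and returns the reply by value instead of handing out `&mut MaybeUninit<Reply>`.

```rust
use std::mem::size_of;

/// Marker for plain-old-data wire structs (hypothetical stand-in for a `Pod` derive).
///
/// # Safety
/// Implementors must be `#[repr(C)]`, have no padding, and be valid for any bit pattern.
unsafe trait Wire: Copy {}

#[derive(Debug, PartialEq, Eq)]
enum IoctlError {
    InputTooSmall,
    OutputTooSmall,
    Rejected,
}

/// Validate both buffers once, then run `body` on a typed request.
/// Returns the number of reply bytes written to `output`.
fn dispatch<Req: Wire, Reply: Wire>(
    input: &[u8],
    output: &mut [u8],
    body: impl FnOnce(&Req) -> Result<Reply, IoctlError>,
) -> Result<usize, IoctlError> {
    if input.len() < size_of::<Req>() {
        return Err(IoctlError::InputTooSmall);
    }
    if output.len() < size_of::<Reply>() {
        return Err(IoctlError::OutputTooSmall);
    }
    // SAFETY: the length was checked above, `Wire` guarantees any bit pattern is a
    // valid `Req`, and `read_unaligned` tolerates a misaligned source buffer.
    let req: Req = unsafe { std::ptr::read_unaligned(input.as_ptr().cast::<Req>()) };
    let reply = body(&req)?;
    // SAFETY: the length was checked above and `write_unaligned` tolerates misalignment.
    unsafe { std::ptr::write_unaligned(output.as_mut_ptr().cast::<Reply>(), reply) };
    Ok(size_of::<Reply>())
}

/// Invented request layout for the example.
#[repr(C)]
#[derive(Clone, Copy)]
struct AddRequest {
    width: u32,
    height: u32,
    refresh_hz: u32,
    reserved: u32,
}
unsafe impl Wire for AddRequest {}

/// Invented reply layout for the example.
#[repr(C)]
#[derive(Clone, Copy)]
struct AddReply {
    target_id: u32,
    luid_low: u32,
    luid_high: i32,
}
unsafe impl Wire for AddReply {}
```

The two `unsafe` blocks live in the helper; each op handler is then an ordinary safe closure over `&AddRequest`.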
### 3.3 Frame plane — keep the inversion, retire the scaffolding
Keep the host-creates / driver-opens ring exactly. **Remove the bring-up scaffolding** that diagnosed
the now-solved `run_core=0` mystery: the `DebugBlock` channel + `DBG_MAGIC`, `spawn_observer` /
`PUNKTFUNK_IDD_PUSH_OBSERVE`, the `error!`-as-`info!` logging, the intentional handle leak, and the
20 s blind no-frame deadline (replace with the `DRV_STATUS_OPENED` handshake as a bounded liveness
signal).
### 3.4 Driver swap-chain reuse — the one open root cause
Today a *reused* IddCx monitor's swap-chain dies after ~2 sessions (target id resolves to 0, `SetDevice`
fails `0x80070057`, then an access violation), forcing fresh-monitor-per-session + the host-side
preempt/`wait_for_monitor_released` dance + the `IDD_PERSIST` "create once, never recreate" workaround.
The fix is in the **driver**: with `EvtCleanupCallback` wired + state owned by `DeviceContext` + the
identity collapsed to one `Monitor` (the recreate-path bugs are exactly the 3-way identity desync), the
clean recreate should become stable. **If** that holds, delete `IDD_SETUP_LOCK`/`IDD_SESSION_STOP` +
the preempt dance and unblock `max_concurrent>1` on Windows. **If** it can't be fixed cheaply, isolate
the residual serialization inside `VirtualDisplayManager` (not smeared back into the session loop).
Separately, evaluate replacing the polling watchdog (PING/countdown/grace/linger constellation) with a
**WDF file-object `EvtFileClose`** (host holds the control handle open; close = host gone) — feasibility
TBD on UMDF/IddCx.
---
## 4. Capture strategy
**IDD-push is the universal primary path — normal AND secure desktop (Decision B).** It composes
in-process (cross-session via `Global\` shared textures: driver in WUDFHost/Session 0, `serve` in the
console session), needs no DXGI Desktop Duplication and no `win32u` reparenting hook, is live-validated
at 5K@240 HDR, and (per the owner) also captures the secure desktop (Winlogon/UAC/lock). So there is no
separate "secure capturer" in the primary path: the same `IddPushCapturer` spans the lock screen and
UAC. Capture selection moves into a typed `CaptureBackend` in the `SessionPlan` — replacing the 3-way
env branch with `IddPush` (default) → `Dda`/`Wgc` (explicit fallbacks).
**WGC + DDA are kept as fallbacks, not deleted (Decision B).** They cover non-IddCx / pre-pf-vdisplay
hardware and act as a safety net if IDD-push fails to attach. But they are **demoted**: they are no
longer the default, no longer entangled with the secure-desktop mux, and selected only via the explicit
`CaptureBackend` fallback in the plan. This lets the DDA module shed the parts that existed *only* to
make virtual-display-over-DDA survive on a hybrid box, while the genuinely-useful capture/recovery core
stays:
- **Scope the `win32u` self-modifying-code hook + the GPU-pref hook to the DDA fallback leg** (one
`win32u_hook::install()`), so the primary IDD-push path never touches them. Re-confirm whether DDA
even needs the `win32u` hook against pf-vdisplay (it may not — open verification item).
- **The two-process WGC relay's secure-desktop mux is retired** — IDD-push handles the secure desktop
directly, so `desktop_watch.rs` + `composed_flip.rs` + the `virtual_stream_relay` monolith are no
longer needed for their original purpose. Keep a **minimal** WGC fallback capturer if the WGC backend
is retained; do not port the 400-line relay state machine. (The cross-session input concern below is
handled by the `InputInjector`/topology abstraction, not the AU video relay.)
**Shared D3D primitives** move out of `dxgi.rs` (today the de-facto dumping ground that `wgc.rs` and
`idd_push.rs` import from) into `windows/d3d/` (typed `Texture2d`/`Ring`/`CopyResource`/`Map`-as-bytes),
`windows/color/` (the converters + `hdr_p010_selftest` verbatim), and `windows/cursor.rs`. All three
capturers consume them — deletes the duplicated `tex_desc`, cursor, HDR-poll, repeat-last logic.
**The texture-ownership contract becomes type-level.** NVENC encodes the capturer's texture *in place*
(no copy), sound today only because the IDD-push capturer rotates `OUT_RING` and the loop honors
`pipeline_depth()` — an undocumented cross-module coupling that is *already* a latent corruption risk.
Fix: either the encoder always `CopySubresourceRegion`s (as `ffmpeg_win` does), or the capturer hands an
explicitly-leased ring texture with a documented lifetime. No more relying on the synchronous-loop
assumption.
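
The "explicitly-leased ring texture" option could look roughly like this. It is a sketch under assumptions: `OutRing` here tracks only slot indices (no D3D textures), and the names `SlotLease` and `lease_next` are hypothetical. The idea is that the encoder holds a lease for as long as it reads the texture, and the capturer cannot hand the same slot out again until that lease drops.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Output ring that tracks which slots are still held downstream.
struct OutRing {
    busy: Vec<Arc<AtomicBool>>,
    next: usize,
}

/// Held by the encoder while it reads the slot's texture; `Drop` frees the slot.
struct SlotLease {
    index: usize,
    busy: Arc<AtomicBool>,
}

impl OutRing {
    fn new(depth: usize) -> Self {
        assert!(depth > 0, "ring needs at least one slot");
        OutRing {
            busy: (0..depth).map(|_| Arc::new(AtomicBool::new(false))).collect(),
            next: 0,
        }
    }

    /// Hand out the next free slot, or `None` if every slot is still leased
    /// (the caller then drops the frame instead of overwriting a live texture).
    fn lease_next(&mut self) -> Option<SlotLease> {
        let depth = self.busy.len();
        for probe in 0..depth {
            let index = (self.next + probe) % depth;
            let claimed = self.busy[index]
                .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
                .is_ok();
            if claimed {
                self.next = (index + 1) % depth;
                return Some(SlotLease { index, busy: Arc::clone(&self.busy[index]) });
            }
        }
        None
    }
}

impl Drop for SlotLease {
    fn drop(&mut self) {
        self.busy.store(false, Ordering::Release);
    }
}
```

Compared with always copying in the encoder, this keeps the in-place encode but turns the `pipeline_depth()` coupling into something the ring itself enforces.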
**The IDD-push input question** (must confirm on-glass): capture+encode run in `serve`; input must reach
the *streamed* (console-session) desktop. If `serve` runs in the console session, `SendInput` works
directly. A code comment flags "SendInput from Session 0 can't reach Session 1" — so the architecture
must make `InputInjector` satisfiable either by in-session `SendInput` *or* by a tiny **input-only
Session-1 agent** (re-scope the old WGC helper to input only). The `SessionPlan.topology` expresses
this.
---
## 5. Encode layer
- Resolve backend + input format + pipeline depth **once** into `EncodePlan` and hand it to both the
capturer and the encoder factory — kill the duplicated `windows_resolved_backend()` call in `dxgi.rs`
(the highest-severity coupling). Trim `open_video`'s 8-arg grab-bag (`cuda` is always false on
Windows; `bit_depth` is overridden by the capture format anyway).
- **`nvenc_sys.rs`**: a thin safe wrapper — RAII `NvSession`/`NvBitstream`/`NvRegistration`/
`NvMappedInput` (Drop = destroy/unregister/unmap) + an `NV_ENC_CONFIG` builder. The public encoder
then has near-zero `unsafe` and no hand-written teardown loops. (The SDK table already returns
`Result` via `result_without_string()`.) This is the single biggest encode-side `unsafe` reduction.
- **`ffmpeg_win`**: RAII `AvFrame`/`SwsCtx`/`HwDeviceCtx`/`HwFramesCtx` delete every manual `av_*_free`
and the error-path cleanup ladders (also the biggest leak-risk reduction); a checked `MappedSurface`
for the staging readback; a `const` size-assert on the hand-mirrored `AVD3D11VA*` structs in a
dedicated `d3d11va_ffi` submodule (silent FFmpeg ABI drift is currently undetectable). Keep
system-readback the default; zero-copy stays opt-in/experimental (no AMD/Intel lab box).
- **HDR symmetry**: make in-band ST.2086/CLL SEI a shared post-encode step so AMF/QSV get the same
mastering metadata as NVENC (today only NVENC attaches it; AMF/QSV rely solely on the 0xCE datagram).
Centralize "when does the client learn HDR metadata" in one owner.
- Keep `hdr.rs`, the `Encoder` trait, `EncodedFrame`, `validate_dimensions`, the caps-probe + RFI logic
verbatim. Delete the `pipeline.rs` `pump_once` doc stub (the real loop is `session/pipeline.rs`).
---
## 6. Drivers — one binding stack (`windows-drivers-rs`), one workspace, one signing recipe
Today: `pf-vdisplay` on the vendored **`wdf-umdf`** stack; `pf_dualsense` + `pf_xusb` on
**`microsoft/windows-drivers-rs`** (`wdk`/`wdk-sys`/`wdk-build`). Two bindgen passes, two SDK
resolutions, two `NTSTATUS`, two build systems, two signing recipes.
**Decision C: unify all three on `microsoft/windows-drivers-rs`** (the official Microsoft stack), in one
in-tree `packaging/windows/drivers/` workspace, edition 2024, one `rust-toolchain.toml`, one CI build.
The gamepad drivers already ship on it; the work is to **migrate `pf-vdisplay` onto it** and **add the
IddCx surface** it lacks today.
**Required pieces of this migration (each a Phase-0/early task):**
1. **Add an `iddcx` subset to `wdk-sys`.** IddCx DDIs are *not* WDF-table functions — they are direct
`IddCxStub` exports — so the extension is bounded: an `ApiSubset::Iddcx` + `iddcx` feature →
bindgen `IddCx.h` + link `IddCxStub`, then ~15 thin `extern`/wrapper fns. Use the current
`wdf-umdf/src/iddcx.rs` (~345 LOC, validated) as a **line-by-line oracle**, including the IddCx 1.10
`*2` HDR DDIs (`IddCxSwapChainReleaseAndAcquireBuffer2`, `IDARG_*2`, `_METADATA2`).
2. **Solve `/INTEGRITYCHECK` for self-signed loading — properly.** `wdk-build` links the driver with
`/INTEGRITYCHECK`, which a self-signed cert can't satisfy (CodeIntegrity 3004/3089). Today the
gamepad drivers hand-patch the FORCE_INTEGRITY PE bit post-link. Replace that hack with a robust
solution, in order of preference: (a) **override the linker flag** — drop `/INTEGRITYCHECK` via
`wdk-build` config / `RUSTFLAGS`/`link-args` if it can be suppressed cleanly; else (b) a
**deterministic, tested CI post-link tool** (a small Rust/PowerShell step that clears bit `0x80` at
`e_lfanew+0x5e` and re-signs, run in CI, not by hand) so it's reproducible and not a footgun; (c) for
a public build, real **attestation signing** (Partner Center) satisfies `/INTEGRITYCHECK`
legitimately. Pick (a) if feasible; (b) as the fleet-self-signed fallback. This is the headline cost
of choosing this stack and must be nailed in Phase 0.
3. **Backport the `wdf-umdf-sys` build.rs SDK-resolution fix** into `wdk-build` (or a local override):
resolve `IddCx.h`/`IddCxStub` by the SDK version that *actually contains* `um\x64\iddcx`, not the max
base SDK (the real failure where a newer base SDK shadows the WDK SDK). windows-drivers-rs's default
resolution doesn't exercise IddCx today, so this likely needs porting.
4. **Port `pf-vdisplay`'s typed safety wins** onto the new stack: re-create the
`WDF_DECLARE_CONTEXT_TYPE!` `Arc<RwLock<T>>` context abstraction (the gold-standard contained
`unsafe`); the version-gate protocol (`IddCxIsFunctionAvailable!` / `IDD_STRUCTURE_SIZE!`); and a
thin safe wrapper layer so the gamepad drivers stop emitting raw `call_unsafe_wdf_function_binding!`
everywhere (the biggest driver-`unsafe` lever).
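
For option (b) in item 2, the core of a post-link tool might be sketched as follows. The offsets follow the PE layout (`e_lfanew` at `0x3c`, `DllCharacteristics` at `e_lfanew + 0x5e`, `FORCE_INTEGRITY = 0x0080`); the function name, the error type, and the decision to operate on an in-memory byte slice are assumptions. The sketch does not recompute the PE checksum or re-sign, both of which a real CI step would still need to do afterwards.

```rust
/// IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY.
const FORCE_INTEGRITY: u16 = 0x0080;

#[derive(Debug, PartialEq, Eq)]
enum PatchError {
    Truncated,
    NotPe,
}

/// Clear the FORCE_INTEGRITY bit in a PE image held in memory.
/// Returns `Ok(true)` if the bit was set and has been cleared, `Ok(false)` if it was already clear.
fn clear_force_integrity(image: &mut [u8]) -> Result<bool, PatchError> {
    // The DOS header is 0x40 bytes; `e_lfanew` lives in its last four bytes.
    if image.len() < 0x40 {
        return Err(PatchError::Truncated);
    }
    if !image.starts_with(b"MZ") {
        return Err(PatchError::NotPe);
    }
    let e_lfanew =
        u32::from_le_bytes([image[0x3c], image[0x3d], image[0x3e], image[0x3f]]) as usize;

    // 4-byte signature + 20-byte COFF header + offset 70 in the optional header = 0x5e.
    let field = e_lfanew.checked_add(0x5e).ok_or(PatchError::Truncated)?;
    let end = field.checked_add(2).ok_or(PatchError::Truncated)?;
    if end > image.len() {
        return Err(PatchError::Truncated);
    }
    if !image[e_lfanew..].starts_with(b"PE\0\0") {
        return Err(PatchError::NotPe);
    }

    let flags = u16::from_le_bytes([image[field], image[field + 1]]);
    let was_set = flags & FORCE_INTEGRITY != 0;
    let cleared = (flags & !FORCE_INTEGRITY).to_le_bytes();
    image[field] = cleared[0];
    image[field + 1] = cleared[1];
    Ok(was_set)
}
```

Returning whether the bit was set lets a CI step assert the expected state (for example, fail if a build that should carry the bit does not), which is what makes the patch reproducible instead of a manual footgun.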
While unifying, also: adopt WDF device contexts for per-pad state (drop the
`UmdfHostProcessSharing=ProcessSharingDisabled`-dependent statics → true multi-pad-per-host); replace
`mem::zeroed()` configs with the `WDF_*_CONFIG_INIT` initializers (kills the recurring zeroed-default
bug class that already caused 3 driver bugs); cache the shm view (RAII `ShmView`) instead of
re-mapping ~125×/s; **delete the world-writable `C:\Users\Public\*.log` driver logging** and the "M0
spike" naming; collapse `is_nt_error()`/`dyn-Any`/`From<()>`-as-error into a typed `IntoDriverResult`;
collapse the per-call dispatch `unsafe` into one generic `dispatch()` helper.
**Provenance note:** confirm where `wdk`/`wdk-sys`/`wdk-build` come from (the gamepad drivers' Cargo.toml
path-deps `../../crates/wdk*` don't exist in this checkout — they resolve inside a windows-drivers-rs
checkout on the dev box). Pin them as crates.io deps or a vendored, version-pinned copy so the driver
workspace builds reproducibly in CI.
---
## 7. Input, audio, service, packaging
- **Input**: consolidate the host-side device plumbing (`create_swdevice`/`create_shm_section`/
`SwDeviceProfile`) into one `inject/windows/swdevice.rs` used by all three managers (XUSB included,
which currently re-implements its own). The shm layouts come from `pf-vdisplay-proto`. Re-scope the
cross-session helper (if any) to input-only.
- **Audio**: small, already fairly clean. Replace the lone `newdev.dll` `LoadLibrary`+`transmute`
(`wasapi_mic.rs`, the audio runtime's *only* `unsafe`) with the windows-rs `DiInstallDriverW` binding
(or move provisioning to the installer) → zero `unsafe` in the audio runtime.
- **Service / process**: one `windows/process.rs` owning RAII `Token`/`Event`/`Job`/`Child` + a single
`spawn_as_user()` used by BOTH the SCM supervisor and any helper — deletes the duplicated
token-dup/`merged_env_block`/`CreateProcessAsUserW` machinery and ~12 manual `CloseHandle` sites. Add
a **cooperative stop**: a named stop event the supervisor sets and `serve` waits on, so Stop runs RAII
teardown (today `TerminateProcess` skips Drop → the virtual monitor lingers, the documented
stale-monitor gotcha); `TerminateProcess` only as a bounded fallback.
- **Packaging/CI**: keep the thin-.iss / fat-binary model; add a `punktfunk-host web install/uninstall`
subcommand to absorb the web-setup PowerShell. **Build + sign the unified driver workspace in CI from
source** (or a CI guard that fails on stale-vendored-DLL / un-bumped DriverVer) so the driver can't
silently drift from its source. Mint the **fresh pf-vdisplay GUID** coordinated across host + driver +
INF. Single source of truth for version → build + ISCC AppVersion + INF DriverVer. Investigate
retiring `nefconc` by creating the ROOT devnode via SwDevice/CM in Rust. Keep the
devgen-never / nefconc-only and DriverVer-bump gotchas codified.
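
The cooperative-stop item in the list above can be sketched portably. This is a minimal model, not the service code: `std::sync::mpsc` channels stand in for the named stop event and for waiting on the child process, and `Session`, `StopOutcome` and `stop_with_grace` are hypothetical names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Stand-in for session state whose `Drop` must run (e.g. removing the monitor).
pub struct Session {
    torn_down: Arc<AtomicBool>,
}

impl Drop for Session {
    fn drop(&mut self) {
        self.torn_down.store(true, Ordering::SeqCst);
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum StopOutcome {
    /// The worker finished its RAII teardown within the grace period.
    Cooperative,
    /// The grace period expired; the caller would now use the hard fallback.
    ForcedAfterTimeout,
}

/// Signals stop, waits up to `grace`, and reports the outcome together with
/// whether teardown had run by then. `worker_delay` simulates shutdown work.
pub fn stop_with_grace(grace: Duration, worker_delay: Duration) -> (StopOutcome, bool) {
    let torn_down = Arc::new(AtomicBool::new(false));
    let (stop_tx, stop_rx) = mpsc::channel::<()>(); // stands in for the stop event
    let (done_tx, done_rx) = mpsc::channel::<()>(); // stands in for the process wait
    let flag = Arc::clone(&torn_down);

    thread::spawn(move || {
        let session = Session { torn_down: flag };
        let _ = stop_rx.recv(); // the worker blocks until stop is signalled
        thread::sleep(worker_delay);
        drop(session); // RAII teardown runs here
        let _ = done_tx.send(());
    });

    let _ = stop_tx.send(());
    let outcome = match done_rx.recv_timeout(grace) {
        Ok(()) => StopOutcome::Cooperative,
        // Timed out, or the worker vanished without reporting.
        Err(_) => StopOutcome::ForcedAfterTimeout,
    };
    (outcome, torn_down.load(Ordering::SeqCst))
}
```

The point of the bounded wait is that the hard kill becomes the exception path, so the monitor teardown normally runs.
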
---
## 8. Unsafe-reduction program (run at port time, not as a separate pass)
- **P0 lints first** (a few lines, before new code): `#![deny(unsafe_op_in_unsafe_fn)]` (host crate has
none today; the driver workspace already has it), `#![warn(clippy::undocumented_unsafe_blocks)]`,
`#![warn(clippy::multiple_unsafe_ops_per_block)]`. Generated bindings keep their opt-out.
- **P0 std handle ownership**: `std::os::windows::io::OwnedHandle` / `std::fs::File::from_raw_handle`
everywhere a raw `HANDLE`/`isize` is held (events/jobs/tokens/sections/pipes). Used in **zero** host
files today — the single biggest cheap win. Deletes the bespoke `unsafe impl Read/Write/Drop`
(`HandleReader`), the never-closed sudovda control handle, the `AtomicIsize` HANDLE globals, ~6 manual
`CloseHandle` sites — and fixes real leaks.
- **P0 the proto crate** (§3.1) — kills the shared-memory pointer-cast `unsafe`.
- **P1 typed wrappers**: `windows/d3d/` (most COM calls already return `Result`; per-frame loop bodies
become `unsafe`-free, the irreducible keyed-mutex/`from_raw_parts` lands in one `frame_xfer` fn);
`nvenc_sys` + RAII ffmpeg (§5); one `windows/process.rs` (§7); collapse the 21 `unsafe impl Send`
onto one audited `SendPtr<T>`/`ThreadBound<T>` (directly de-risks the NVENC in-place coupling).
- **P2 contain the irreducible**: `win32u_hook.rs` (one `install()`; scope to secure-DDA or drop),
`gpu_priority.rs` (the D3DKMT transmute), the WDF context-blob macro, the IddCx swap-chain DDI +
`from_raw_borrowed` (wrap in a typed `SwapChain` guard returning a borrowed `AcquiredSurface<'_>`).
Document a `// SAFETY:` per residual site.
- **P2 delete `unsafe` by deleting code**: the `present_trigger` dead diagnostic, the `DebugBlock`
channel, `spawn_observer`, `IDD_PERSIST`/`open_or_reuse`, `helpers.rs Sendable<T>`, the WGC-open
thread-watchdog hack (gone with WGC), the driver file-logging.
Estimated: host ~144→~35, drivers ~227→~60, residual concentrated and auditable. (`#![forbid(unsafe)]`
is impossible for the drivers and the per-frame D3D path — the realistic target is *containment*.)
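
One possible shape for the single audited `ThreadBound<T>` named in the P1 item is sketched below. It is an assumption about how such a wrapper could be built, not the project's code: the wrapper may move between threads, but the value is only reachable, and only dropped, on the thread that created it.

```rust
use std::mem::ManuallyDrop;
use std::thread::{self, ThreadId};

/// A value pinned to its creating thread. The wrapper itself may be moved
/// across threads, but access off the owner thread yields `None`.
pub struct ThreadBound<T> {
    owner: ThreadId,
    value: ManuallyDrop<T>,
}

// SAFETY: `value` is only read through `get`, which checks the current thread
// against `owner`, and only dropped on `owner` (see `Drop`). Other threads can
// hold or move the wrapper but never touch `T`.
unsafe impl<T> Send for ThreadBound<T> {}

impl<T> ThreadBound<T> {
    pub fn new(value: T) -> Self {
        Self {
            owner: thread::current().id(),
            value: ManuallyDrop::new(value),
        }
    }

    /// Returns the value only when called on the owning thread.
    pub fn get(&self) -> Option<&T> {
        if thread::current().id() == self.owner {
            Some(&*self.value)
        } else {
            None
        }
    }
}

impl<T> Drop for ThreadBound<T> {
    fn drop(&mut self) {
        if thread::current().id() == self.owner {
            // SAFETY: dropped exactly once, on the owning thread.
            unsafe { ManuallyDrop::drop(&mut self.value) };
        }
        // Off-thread: deliberately leak rather than run T's destructor on the
        // wrong thread. A real implementation might log or abort instead.
    }
}
```

The leak-on-wrong-thread choice trades a resource leak for soundness; whether that trade suits D3D/NVENC objects is a design decision this sketch does not settle.
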
---
## 9. SudoVDA decoupling (mechanical rename + scrub)
`vdisplay/sudovda.rs` → `vdisplay/windows.rs`; `SudoVdaDisplay` → `PfVirtualDisplay`; scrub "SudoVDA"
from all log/error/doc strings across `capture.rs`/`dxgi.rs`/`wgc*.rs`/`idd_push.rs`/`punktfunk1.rs`/
`main.rs`/`sendinput.rs` (141 refs / 15 files). **Split the reach-in helpers out** of the vdisplay
backend (they're display-utility, not virtual-display creation): `set_advanced_color`,
`advanced_color_enabled`, `resolve_gdi_name`, `isolate/restore_displays_ccd`, `set_active_mode` →
`windows/display_ccd.rs` (collapsing the 4× copy-pasted `QueryDisplayConfig` preamble into one safe
`query_active_config()`); `resolve_render_adapter_luid` → `windows/adapter.rs`. Both vdisplay and
capture then depend on these as peers, breaking the circular reach-in. `WinCaptureTarget` moves to a
neutral location (defined in `dxgi.rs`, constructed in `sudovda.rs` today). Drop the dual-driver
fallback conditionals. Expose HDR/monitor-release as `VirtualLease` methods (zero `unsafe` in the
session glue).
---
## 10. Build plan (greenfield — Decision A)
A from-scratch rebuild of the Windows host against the clean architecture, **salvaging the §1 jewels
verbatim** (the already-clean, already-tested modules: `hdr.rs`, `edid.rs`, the `inject/proto` codecs,
the HDR/cursor converters + their self-tests, the GF8 packetizer, the pairing handshake). The old
Windows code stays in-tree, untouched, as the *reference implementation* until the new path reaches
parity on glass, then is deleted.
**Greenfield-risk mitigation (the survey's strong caveat stands):** almost none of this is
CI-validatable — the Windows backends + drivers need the RTX box (192.168.1.173) + the build VM, and
**AMF/QSV have no lab hardware at all**. A greenfield rewrite therefore carries real risk of silently
dropping a layered bug-fix. Two guardrails are mandatory:
1. **The §1 preservation checklist is a test/assert contract**, not prose: each rebuilt module ports its
hard-won invariants as unit tests or runtime asserts — RAII teardown order (restore displays *before*
REMOVE), keyed-mutex held only across convert/copy, `terminate` checked at the swap-chain loop top,
magic stamped last, `OUT_RING` texture rotation under `pipeline_depth>1`, the NVENC caps-probe
downgrade, the SwDeviceCreate identity recipe. A rebuild that drops one fails its own test.
2. **On-glass A/B gates** at each milestone below, on the RTX box, against the current shipping build:
1080p60, 5K@240 HDR, reconnect-storm, secure desktop (lock/UAC), multi-pad. Nothing replaces the old
path until its A/B passes.
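
To make guardrail 1 concrete, here is a minimal sketch of one invariant (restore displays *before* REMOVE) expressed as a testable contract. The type names are hypothetical stand-ins; the only language fact relied on is that Rust drops struct fields in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for the guard that restores the user's display topology.
struct DisplayRestore {
    log: Log,
}

impl Drop for DisplayRestore {
    fn drop(&mut self) {
        self.log.borrow_mut().push("restore_displays");
    }
}

/// Stand-in for the guard that issues REMOVE for the virtual monitor.
struct MonitorRemoval {
    log: Log,
}

impl Drop for MonitorRemoval {
    fn drop(&mut self) {
        self.log.borrow_mut().push("remove_monitor");
    }
}

/// Fields drop in declaration order, so `restore` runs before `removal`.
/// Reordering these two fields would break the invariant and the test.
pub struct VirtualDisplaySession {
    restore: DisplayRestore,
    removal: MonitorRemoval,
}

/// Builds a session, drops it, and returns the observed teardown order.
pub fn teardown_order() -> Vec<&'static str> {
    let log: Log = Rc::default();
    let session = VirtualDisplaySession {
        restore: DisplayRestore { log: Rc::clone(&log) },
        removal: MonitorRemoval { log: Rc::clone(&log) },
    };
    drop(session);
    let order = log.borrow().clone();
    order
}
```

A unit test asserting the returned order is the kind of check that fails the build if a rebuilt module silently drops the ordering.
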
### Build order
- **M0 — Foundations + the `/INTEGRITYCHECK` answer.** Stand up `crates/pf-vdisplay-proto` (the clean,
owned ABI: fresh GUID, the redesigned IOCTL op enum + `#[repr(C)]` structs, `SharedHeader`,
`FrameToken`, the gamepad shm layouts, `const` size-asserts, round-trip tests). Stand up the in-tree
`packaging/windows/drivers/` workspace on `windows-drivers-rs` and **prove the two hard unknowns**:
(a) the `iddcx` `wdk-sys` subset bindgen+links and a trivial IddCx adapter loads; (b) `/INTEGRITYCHECK`
is solved (§6.2) so a self-signed driver loads under Secure Boot with no hand-patching. Add the P0
lints to the host crate. *No host behavior yet.*
- **M1 — pf-vdisplay on the new stack, first light.** Rebuild the IddCx driver against
`windows-drivers-rs`+`iddcx`, clean from the start: `DeviceContext`-owned state (no process-globals),
one `Monitor` identity, `EvtCleanupCallback` on `MonitorContext`, the ported `Arc<RwLock<T>>` context,
the EDID + HDR recipe verbatim, the redesigned control plane from the proto crate. *(On-glass: ADD →
monitor arrives → IDD-push ring attaches → frames flow at 1080p; REMOVE clean.)*
- **M2 — IDD-push capture + NVENC, glass-to-glass.** New `src/windows/` tree: `windows/d3d/` typed
wrappers, `windows/color/` (converters + self-tests), `windows/cursor.rs`, `capture/windows/idd_push.rs`
consuming the proto ring with a **type-level texture-ownership contract** (no in-place-encode
assumption), `encode/windows/{nvenc.rs,nvenc_sys.rs}`, `vdisplay/windows.rs` + `windows/display_ccd.rs`
+ `windows/adapter.rs`. Wire the `SessionFactory`/`SessionPlan` (M2 only needs the IDD-push+NVENC
plan). *(On-glass A/B: 1080p60 + 5K@240 HDR, latency parity with the current build.)*
- **M3 — Service, input, audio, secure desktop.** `windows/process.rs` (RAII Token/Event/Job/Child +
`spawn_as_user` + cooperative stop) + `windows/service.rs`; `inject/windows/*` on the proto shm +
consolidated `swdevice.rs`; `audio/windows/*` (zero-`unsafe` runtime). Confirm IDD-push captures the
secure desktop (lock/UAC) and input reaches the streamed session (in-session `SendInput`, or the
input-only agent if needed). *(On-glass: full session incl. lock screen + UAC + a real pad.)*
- **M4 — Gamepad drivers onto the unified stack.** Rebuild `pf_dualsense` + `pf_xusb` on
`windows-drivers-rs` in the same workspace, WDF device contexts (true multi-pad), proto shm,
`WDF_*_CONFIG_INIT`, no file logging, no "M0 spike" naming. *(On-glass: 2 XInput + 2 DualSense pads,
rumble/lightbar/adaptive-trigger round-trip.)*
- **M5 — Fallbacks + GameStream + AMF/QSV.** Port the demoted WGC + DDA fallback capturers (minimal,
`win32u` hook scoped to the DDA leg); `encode/windows/ffmpeg_win/*` with RAII FFmpeg + the
`d3d11va_ffi` size-assert (system-readback default; zero-copy experimental); GameStream planes reusing
`session/pipeline.rs`, installer default flipped to secure `serve`. *(On-glass: Moonlight client on
the DDA fallback; AMF/QSV stays CI-only.)*
- **M6 — Cut over + delete.** Flip the default to the new path, run the full A/B matrix, then delete the
old `dxgi.rs`/`wgc*`/`sudovda.rs`/`punktfunk1.rs` Windows monoliths + the bring-up scaffolding
(`DebugBlock`/`spawn_observer`/observe gate) + the old gamepad driver crates. Single source of truth
for version; CI builds+signs all drivers from source.
Milestones are roughly dependency-ordered; M0 is the long pole (the `/INTEGRITYCHECK` + `iddcx` proof
gates everything else). M5's AMF/QSV cannot be validated without hardware — keep it system-readback-only
and clearly experimental.
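
For the M0 proto-crate item (`#[repr(C)]` structs, `const` size-asserts, round-trip tests), a minimal sketch of the pattern follows. `AddMonitorRequest` and its fields are invented for illustration and are not the real `pf-vdisplay-proto` layout.

```rust
/// Hypothetical control-plane request; the field set is illustrative only.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AddMonitorRequest {
    pub session_id: u64,
    pub width: u32,
    pub height: u32,
    pub refresh_millihz: u32,
    pub flags: u32,
}

pub const ADD_MONITOR_REQUEST_SIZE: usize = 24;

// Compile-time layout guards: a field change that alters size or alignment
// fails the build on both sides of the ABI.
const _: () = assert!(core::mem::size_of::<AddMonitorRequest>() == ADD_MONITOR_REQUEST_SIZE);
const _: () = assert!(core::mem::align_of::<AddMonitorRequest>() == 8);

impl AddMonitorRequest {
    /// Encodes field by field as little-endian, with no pointer casts.
    pub fn to_bytes(&self) -> [u8; ADD_MONITOR_REQUEST_SIZE] {
        let mut out = [0u8; ADD_MONITOR_REQUEST_SIZE];
        out[..8].copy_from_slice(&self.session_id.to_le_bytes());
        out[8..12].copy_from_slice(&self.width.to_le_bytes());
        out[12..16].copy_from_slice(&self.height.to_le_bytes());
        out[16..20].copy_from_slice(&self.refresh_millihz.to_le_bytes());
        out[20..24].copy_from_slice(&self.flags.to_le_bytes());
        out
    }

    /// Decodes a buffer of exactly the expected size; anything else is rejected.
    pub fn from_bytes(bytes: &[u8]) -> Option<Self> {
        if bytes.len() != ADD_MONITOR_REQUEST_SIZE {
            return None;
        }
        let u32_at =
            |at: usize| u32::from_le_bytes([bytes[at], bytes[at + 1], bytes[at + 2], bytes[at + 3]]);
        let mut id = [0u8; 8];
        id.copy_from_slice(&bytes[..8]);
        Some(Self {
            session_id: u64::from_le_bytes(id),
            width: u32_at(8),
            height: u32_at(12),
            refresh_millihz: u32_at(16),
            flags: u32_at(20),
        })
    }
}
```
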
---
## 11. Decisions (resolved 2026-06-24) + open verification items
The five product forks are decided (see the table in §0): **A** greenfield; **B** IDD-push primary for
everything incl. secure desktop, WGC+DDA kept as demoted fallbacks; **C** extend `windows-drivers-rs` +
solve `/INTEGRITYCHECK`; **D** keep GameStream, default secure. On **E (concurrent sessions)**: fix the
driver swap-chain lifecycle regardless (it removes the leak + the preempt dance); treat true
`max_concurrent>1` on Windows as a follow-on once clean reuse is proven on glass.
What remains are **technical unknowns to confirm on the RTX box** (not user decisions):
- **`/INTEGRITYCHECK` resolution path (M0 long pole).** Can `wdk-build` suppress `/INTEGRITYCHECK` via
config/link-args (preferred), or must we keep a deterministic CI post-link bit-clear? Decides the
signing story for all three drivers.
- **`iddcx` subset on `wdk-sys`.** Does the bindgen+`IddCxStub` link cleanly, and does the SDK-resolution
fix need backporting? (windows-drivers-rs doesn't exercise IddCx today.)
- **Driver swap-chain reuse.** Does the clean ownership model (`EvtCleanupCallback` + DeviceContext state
+ single `Monitor` identity) actually fix the "reused swap-chain dies after ~2 sessions" root cause? If
not, the residual serialization stays inside `VirtualDisplayManager`.
- **IDD-push input + secure desktop.** Confirm `serve` runs in the console session so `SendInput` reaches
the streamed desktop (a code comment warns about Session 0→1); confirm IDD-push frames flow through the
lock screen / UAC (owner reports yes — verify and lock it in as the primary, demoting the DDA secure
leg to fallback).
- **Does the demoted DDA fallback still need the `win32u` hook** against pf-vdisplay, or was that purely
a SudoVDA/hybrid pathology? If unneeded, the self-modifying-code hook can be deleted entirely.
- **AMF/QSV** stays CI-only (no hardware) — system-readback default, zero-copy experimental.
---
## 12. Risks
- **Greenfield with no CI (the dominant risk).** The build VM is headless/WARP; the WinUI/hardware/driver
paths need the RTX box, and AMF/QSV have no hardware. A from-scratch rebuild can silently drop a
layered bug-fix. Mitigation: the §1 preservation checklist is a *test/assert contract* per rebuilt
module; on-glass A/B gates the new path before the old one is deleted (M6); keep the old code in-tree
as the reference until parity.
- **`/INTEGRITYCHECK` (M0 long pole).** Choosing `windows-drivers-rs` means self-signed loading depends
on solving it cleanly (§6.2). If neither linker-flag suppression nor a deterministic CI post-link step
works, drivers can't load self-signed — prove this first, it gates everything.
- **`iddcx` on `wdk-sys`** is new surface (windows-drivers-rs doesn't bind IddCx). Bounded
(`IddCxStub` exports + ~15 wrappers, with the validated `wdf-umdf/iddcx.rs` as oracle) but unproven on
this stack — M0 must light it.
- **`pf-vdisplay-proto` spans two cargo build graphs** (host workspace + the driver workspace). Validate
the path-dep resolves on the Windows build env in M0; pin `wdk*` provenance so the driver workspace
builds reproducibly in CI.
- **Driver swap-chain-reuse root cause still undiagnosed.** The clean ownership model *should* fix it;
if not, residual serialization stays inside `VirtualDisplayManager` and `max_concurrent>1` stays
blocked. Keep `await_released` on the trait until reuse is proven on glass.
- **NVENC in-place encode + `pipeline_depth>1`** is a latent corruption risk; the M2 texture-ownership
contract must be type-level (not the synchronous-loop assumption). Verify the ring on glass.
- **Host/driver version drift in the field.** New host + new driver are always built together (greenfield),
but the installer bundles both — enforce a startup version handshake (proto version in both binaries)
and a CI guarantee they're built from the same revision.
- **Big-bang cutover (M6).** Flipping the default and deleting the old monoliths is the riskiest moment;
it is gated on the full A/B matrix passing, and the old code is recoverable from git if a regression
surfaces post-cutover.
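
The startup version handshake from the version-drift risk could be as small as the sketch below. The constant's value, the `DriverInfo` reply shape and the error type are assumptions; only the idea (both binaries carry the proto version and the host refuses a mismatch) comes from the text above.

```rust
/// Protocol version compiled into the host. The value is a placeholder.
pub const PROTOCOL_VERSION: u32 = 1;

/// Hypothetical shape of the driver's info reply.
#[repr(C)]
#[derive(Debug, Clone, Copy)]
pub struct DriverInfo {
    pub protocol_version: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum HandshakeError {
    VersionMismatch { host: u32, driver: u32 },
}

/// Host-side check run once at startup, before any other control-plane call.
pub fn check_protocol(info: &DriverInfo) -> Result<(), HandshakeError> {
    if info.protocol_version == PROTOCOL_VERSION {
        Ok(())
    } else {
        Err(HandshakeError::VersionMismatch {
            host: PROTOCOL_VERSION,
            driver: info.protocol_version,
        })
    }
}
```

Failing closed here turns silent field drift into an explicit, loggable startup error.
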
---
## 13. Progress log + M1 IddCx-binding recipe (2026-06-24)
**M0 COMPLETE** (commits through `f896f70`, on `main`, CI-green + validated on the RTX box):
- `crates/pf-vdisplay-proto` — owned host↔driver ABI (fresh GUID, typed IOCTLs + frame transport, const
size-asserts). Green Linux + MSVC.
- Runner **and** RTX box provisioned: WDK 26100 (WDF 2.31, IddCx 1.10), LLVM **21.1.2** (the runner's
default was a ToT/22-dev build → wdk-sys bindgen `E0080` layout-test overflow; 21.1.2 builds clean —
windows-drivers-rs discussion #591). cargo-wdk on the runner.
- `packaging/windows/drivers/` — unified driver workspace on windows-drivers-rs; `wdk-probe` (minimal
UMDF) builds clean end-to-end (bindgen + WDF link + static-CRT `.cargo/config` + `pf-vdisplay-proto`
path-dep). Build layers solved: in-tree target dir (wdk-build walks OUT_DIR ancestors for `Cargo.lock`);
`[workspace.metadata.wdk.driver-model]` = UMDF 2.31; `target-feature=+crt-static` w/ explicit target;
`Version_Number=10.0.26100.0`; `LIBCLANG_PATH` → LLVM 21.1.2.
- **`/INTEGRITYCHECK` resolved**: wdk-build sets it unconditionally (no opt-out) → `packaging/windows/
clear-force-integrity.ps1` clears the PE `FORCE_INTEGRITY` bit (0x0080 @ e_lfanew+0x5e) post-link,
before signing. Proven `0x01E0→0x0160` on CI and in PS 5.1 on the box. Self-signed UMDF load itself is
already proven on the box (the gamepad drivers).
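
The byte-level operation behind the `FORCE_INTEGRITY` clear can be illustrated in Rust. This is not the PowerShell script, only a sketch of the same arithmetic on an in-memory image, using the offsets quoted above (`e_lfanew` at `0x3C`, `DllCharacteristics` at `e_lfanew+0x5e`); the function and error names are hypothetical.

```rust
pub const FORCE_INTEGRITY: u16 = 0x0080;
const E_LFANEW_OFFSET: usize = 0x3C;
const DLL_CHARACTERISTICS_OFFSET: usize = 0x5E;

#[derive(Debug, PartialEq, Eq)]
pub enum PeError {
    Truncated,
    BadSignature,
}

fn read_u16(image: &[u8], at: usize) -> Result<u16, PeError> {
    let end = at.checked_add(2).ok_or(PeError::Truncated)?;
    let b = image.get(at..end).ok_or(PeError::Truncated)?;
    Ok(u16::from_le_bytes([b[0], b[1]]))
}

/// Clears the FORCE_INTEGRITY bit in place and returns `(before, after)`
/// values of the DllCharacteristics field.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), PeError> {
    if image.get(..2) != Some(&b"MZ"[..]) {
        return Err(PeError::BadSignature);
    }
    let raw = image
        .get(E_LFANEW_OFFSET..E_LFANEW_OFFSET + 4)
        .ok_or(PeError::Truncated)?;
    let e_lfanew = u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]) as usize;

    let sig_end = e_lfanew.checked_add(4).ok_or(PeError::Truncated)?;
    match image.get(e_lfanew..sig_end) {
        None => return Err(PeError::Truncated),
        Some(sig) if sig != b"PE\0\0" => return Err(PeError::BadSignature),
        Some(_) => {}
    }

    let field = e_lfanew
        .checked_add(DLL_CHARACTERISTICS_OFFSET)
        .ok_or(PeError::Truncated)?;
    let before = read_u16(image, field)?;
    let after = before & !FORCE_INTEGRITY;
    // `read_u16` succeeded, so `field..field + 2` is in bounds.
    image[field..field + 2].copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```
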
**RTX box** (`ssh "Enrico Bühler"@192.168.1.173`, ENRICOS-DESKTOP, RTX 4090 driver 610.62, PS 5.1 shell):
**ephemeral** — boots to Proxmox on reboot, so unreachable after a reboot. Treat as opportunistic on-glass
(driver load + IDD-push streaming) only; **CI on the windows-amd64 runner is the persistent validator**.
A build clone is at `C:\Users\Public\pf-rewrite`; builds the driver in ~29 s with the box's LLVM 21.1.2.
### M1 — IddCx binding on windows-drivers-rs (the recipe)
IddCx DDIs are **function-table dispatched** (`IddFunctions[]` indexed by `IDDFUNCENUM::<Name>TableIndex`,
`IddDriverGlobals` implicit first arg) — *exactly* the model wdk-sys already implements for WDF (not direct
`IddCxStub` exports as first assumed).
**Approach (Option 1, recommended):** vendor windows-drivers-rs **0.5.1** in-tree (pinned; source staged at
`scratchpad/wdr`, commit `0e3499d`), patched via `[patch.crates-io]` for just `wdk-build` + `wdk-sys`, and
add a first-class **`ApiSubset::Iddcx`** that bindgens `iddcx/1.10/IddCx.h` in an extra pass **reusing the
identical `bindgen::Builder::wdk_default(config)` baseline** (so its WDF/DXGI types *resolve to*, not
redefine, wdk-sys's — type identity by construction). This mirrors wdk-sys's existing gpio/hid/spb/usb
versioned-subpath subsets exactly.
- wdk-build: add `ApiSubset::Iddcx`, a `headers` match arm, `iddcx_headers() -> ["iddcx/1.10/IddCx.h"]`
(UMDF-only).
- wdk-sys build.rs: `generate_iddcx` as a copy of `generate_gpio` — `bindgen_header_contents([Base, Wdf,
Iddcx])`, `(TYPES|VARS).complement()`, `.allowlist_file("(?i).*iddcx.*")`; behind an `iddcx` feature;
add to `ENABLED_API_SUBSETS`; `pub mod iddcx` in lib.rs.
- A `wdk-iddcx` wrapper crate (port of `wdf-umdf/src/iddcx.rs`): table dispatch via
`wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize` (ModuleConsts const, **not** the oracle's
NewType `.0`); NTSTATUS is plain `i32` in wdk-sys (use `wdk_sys::NT_SUCCESS`, drop the oracle's newtype).
- Driver build.rs: add `link-search` to `Lib/<sdk>/um/<arch>/iddcx/1.10` (the SDK version that *contains*
iddcx — glob, don't trust max) + `static=IddCxStub`; hand-declare `#[no_mangle] pub static
IddMinimumVersionRequired: ULONG = 4;`; keep the FORCE_INTEGRITY clear.
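
The function-table dispatch model described above can be modelled without the WDK. In the sketch below every name (`DemoGlobals`, `PfnDemoAdd`, `DEMO_ADD_TABLE_INDEX`, and so on) is a mock; it only shows the mechanics of "index a type-erased table, cast the slot back to its real signature, pass the globals as the first argument".

```rust
use std::mem::transmute;

/// Mock of the framework's driver-globals blob (the implicit first argument).
#[repr(C)]
pub struct DemoGlobals {
    pub bias: i32,
}

/// Type-erased table slot, as a C header would declare a generic function pointer.
pub type ErasedFn = unsafe extern "C" fn();
/// The real signature of the one mock DDI.
pub type PfnDemoAdd = unsafe extern "C" fn(*const DemoGlobals, i32, i32) -> i32;

/// Mock counterpart of a `<Name>TableIndex` constant.
pub const DEMO_ADD_TABLE_INDEX: usize = 0;
pub const DEMO_TABLE_LEN: usize = 1;

unsafe extern "C" fn demo_add_impl(globals: *const DemoGlobals, a: i32, b: i32) -> i32 {
    // SAFETY: the caller guarantees `globals` points at a live `DemoGlobals`.
    let bias = unsafe { (*globals).bias };
    a + b + bias
}

/// Builds the mock function table the "framework" would hand to the driver.
pub fn demo_table() -> [ErasedFn; DEMO_TABLE_LEN] {
    let typed: PfnDemoAdd = demo_add_impl;
    // SAFETY: only the pointer's static type is erased; it is cast back to
    // `PfnDemoAdd` before any call.
    [unsafe { transmute::<PfnDemoAdd, ErasedFn>(typed) }]
}

/// The wrapper: look up the slot by index and call it with the globals first.
///
/// # Safety
/// `table[DEMO_ADD_TABLE_INDEX]` must really be a `PfnDemoAdd`, and `globals`
/// must point at a live `DemoGlobals`.
pub unsafe fn demo_add(table: &[ErasedFn], globals: *const DemoGlobals, a: i32, b: i32) -> i32 {
    let erased = table[DEMO_ADD_TABLE_INDEX];
    // SAFETY: guaranteed by this function's contract.
    let typed = unsafe { transmute::<ErasedFn, PfnDemoAdd>(erased) };
    // SAFETY: guaranteed by this function's contract.
    unsafe { typed(globals, a, b) }
}
```

Concentrating the cast in one wrapper per DDI is what keeps the `unsafe` surface small and auditable.
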
**Make-or-break — RESOLVED ✅ (CI-green @ `6d8c7a5`, run 5548, no fallback).** `IddCx.h` bindgens AND the
generated module compiles inside wdk-sys with WDF **type-identity**; the #515/#516 header conflict NEVER
materialized. Vendored the **published windows-drivers-rs 0.5.1** crates (wdk-build + wdk-sys) under
`packaging/windows/drivers/vendor/`, `[patch.crates-io]`'d. The six knobs `generate_iddcx` actually
needed (each a real gotcha, all CI-proven; the recipe above was close but the codegen/scope details
differed):
1. **`--language=c++`** — `wdk_default` parses **C**; IddCx.h's `IDARG_*` typedefs need C++ or you get a
"must use 'struct' tag" cascade (verified by direct `clang` on the box: 0 errors as C++, fails as C).
2. **`-DIDD_STUB`** — table-dispatch mode; skips `IddCxFuncEnum.h`'s `#error IDDCX_VERSION_MAJOR is not
defined` (it lives inside `#ifndef IDD_STUB`). **Do NOT add `WDF_STUB`** — wdk-sys parses `wdf.h`
non-stubbed, and stubbing it only here would desync the shared WDF types (breaking type-identity).
3. **`allowlist_recursively(false)` + `allowlist_file("(?i).*iddcx.*")`, full codegen (no
`.complement()`)** — emit ONLY IddCx items; WDF/Win types resolve to wdk-sys's via
`use crate::types::*` in `src/iddcx.rs`. No giant blocklist (Option 2 avoided).
4. **`allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE")`** — emit the non-WDF types
wdk-sys doesn't bindgen, locally (absent from `crate::types`, so non-conflicting). The `_?` is
load-bearing: `typedef struct _OPM_X {} OPM_X` needs the tag AND the alias (recursively(false) won't
pull the tag from the typedef).
5. **`pub type UINT = ::core::ffi::c_uint;` in `src/iddcx.rs`** — `UINT` (unsigned int) is absent from
`crate::types`; covers the top-level struct-field uses.
6. **`translate_enum_integer_types(true)`** — C++ parsing kept `UINT` as the underlying repr of the
DXGI/OPM ModuleConsts enums (`pub mod _X { pub type Type = UINT; }`), and nested modules can't see the
parent `UINT`. This emits native `u32` reprs → self-contained enum modules.
The wrapper note still holds: table dispatch via `wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize`
(ModuleConsts const, **not** the oracle's NewType `.0`); NTSTATUS = plain `i32` (`wdk_sys::NT_SUCCESS`).
Driver build.rs will add the IddCxStub link-search + `IddMinimumVersionRequired` + keep the
FORCE_INTEGRITY clear. **Option 2 stays rejected; the `wdf-umdf-sys` fallback is unneeded.**
**NEXT (M1 cont.):** port the full ~30-DDI / ~40-struct surface (incl. the HDR `*2` DDIs) + the
swap-chain processor + frame transport, with the clean ownership model (DeviceContext-owned state,
`EvtCleanupCallback` on `MonitorContext`, single `Monitor` identity, the owned `pf-vdisplay-proto` plane).
First gate: a probe linking `IddCxStub` and calling `IddCxDeviceInitConfig`/`…Initialize`/
`…AdapterInitAsync` (CI = compile+link). On-glass load + IDD-push stream needs the RTX box (ephemeral —
currently down/Proxmox).
---
## 14. M1 step 2 — pf-vdisplay driver port plan (2026-06-24, workflow-mapped + critiqued)
**Status of the binding (DONE, CI-green):** the wdk-sys `iddcx` binding is proven *complete for the whole
driver*, not just init. `wdk-probe/src/iddcx_surface_assert.rs` (commit `ae803b2`) CI-asserts every `*2`/HDR
struct (`IDDCX_TARGET_MODE2`/`PATH2`/`METADATA2`, `IDARG_*RELEASEANDACQUIREBUFFER2` — which embed
`DISPLAYCONFIG_*`/`LUID`, both of which **resolve from `crate::types`** — no allowlist gap), all 14 inbound
`PFN_IDD_CX_*` callbacks, the `.Size` machinery (`IddStructures`/`IddStructureCount`/
`IddClientVersionHigherThanFramework`/`_IDDSTRUCTENUM::INDEX_*` — so `IDD_STRUCTURE_SIZE!` is portable), and
`IDDCX_ADAPTER_FLAGS::…CAN_PROCESS_FP16` + `IDDCX_TARGET_CAPS::…HIGH_COLOR_SPACE`. ModuleConsts module
naming: the func/struct enums are `_IDDFUNCENUM`/`_IDDSTRUCTENUM` (underscored tag), but the flag/cap enums
are `IDDCX_ADAPTER_FLAGS`/`IDDCX_TARGET_CAPS` (no underscore).
### DDIs to wrap (11 — graduate `wdk-probe/src/iddcx_rt.rs` → a `wdk-iddcx` crate)
DeviceInitConfig, DeviceInitialize, AdapterInitAsync (done), MonitorCreate, MonitorArrival,
MonitorDeparture, AdapterSetRenderAdapter, SwapChainSetDevice (`other_is_error`; 0x887A0026→retry),
**SwapChainReleaseAndAcquireBuffer2** (HDR variant only; `other_is_error`; E_PENDING 0x8000000A → wait on
the surface event), SwapChainFinishedProcessingFrame. **Drop** the v1 `ReleaseAndAcquireBuffer` (adapter
always sets FP16). **Defer** the hardware-cursor DDIs (cursor baked into video).
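
The two special-cased status codes above lend themselves to a small classifier, so the swap-chain loop matches on an action instead of raw hex. This is a sketch under assumptions: `NextStep` and both functions are hypothetical, and "`other_is_error`" is read here as "any other non-zero code is fatal".

```rust
/// DXGI_ERROR_ACCESS_LOST, the code the list above maps to "retry".
pub const DXGI_ERROR_ACCESS_LOST: u32 = 0x887A_0026;
/// E_PENDING, the code the list above maps to "wait on the surface event".
pub const E_PENDING: u32 = 0x8000_000A;

/// What the swap-chain loop should do next. Hypothetical name.
#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    Proceed,
    RetrySetDevice,
    WaitForSurfaceEvent,
    Fatal(u32),
}

/// Classifies the result of the set-device call.
pub fn classify_set_device(code: u32) -> NextStep {
    match code {
        0 => NextStep::Proceed,
        DXGI_ERROR_ACCESS_LOST => NextStep::RetrySetDevice,
        other => NextStep::Fatal(other),
    }
}

/// Classifies the result of the buffer-acquire call.
pub fn classify_acquire(code: u32) -> NextStep {
    match code {
        0 => NextStep::Proceed,
        E_PENDING => NextStep::WaitForSurfaceEvent,
        other => NextStep::Fatal(other),
    }
}
```
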
### Callbacks (15 in `IDD_CX_CLIENT_CONFIG`; `*2` mandatory because FP16)
parse_monitor_description (+`2`), monitor_query_target_modes (+`2`), adapter_commit_modes (+`2`),
adapter_init_finished (stash IDDCX_ADAPTER + start watchdog), monitor_get_default_modes (→NOT_IMPLEMENTED,
we always carry EDID), **query_target_info (→HIGH_COLOR_SPACE), set_gamma_ramp (accept-stub — WITHOUT it
the adapter fails to init), set_default_hdr_metadata (accept-stub)** — the last three are mandatory under
FP16, assign_swap_chain, unassign_swap_chain, device_io_control (the pf-vdisplay-proto control plane).
Plus `EvtDeviceD0Entry` (adapter created HERE, not in DeviceAdd) and two `EvtCleanupCallback`s.
### State model (the rewrite's core change)
`DeviceContext` OWNS all state — IDDCX_ADAPTER, session_id-keyed monitor map, watchdog, the per-render-LUID
`Direct3DDevice` pool — replacing the oracle's process globals. Reachable from BOTH the WDFDEVICE (strong)
and the IDDCX_ADAPTER object (the adapter-side callbacks need it). `MonitorContext` owns the
`SwapChainProcessor` + `target_id`; **wire `EvtCleanupCallback` on the IDDCX_MONITOR object** so RAII Drop
joins the worker thread + frees D3D (the oracle lacked this → the dominant reconnect leak). **Single Monitor
identity** keyed by `session_id` (collapses the oracle's 3-way EDID-serial/map/stamp desync that caused the
`target_id=0` recreate bug); `assign_swap_chain` reads `target_id` from the context, never a map lookup.
The HOST still owns the control-device handle, the linger/reuse state machine, and ALL `Global\` shared
objects (created `D:(A;;GA;;;WD)`); the driver only OPENS them.
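
A portable model of this ownership shape is sketched below: the device context owns a `session_id`-keyed map, and dropping a monitor context signals and joins its worker. The types reuse the names from the text but are simplified stand-ins (no WDF, no D3D); the `live_workers` counter exists only so the behavior is observable.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

pub struct MonitorContext {
    pub target_id: u32,
    terminate: Arc<AtomicBool>,
    worker: Option<JoinHandle<()>>,
}

impl MonitorContext {
    pub fn start(target_id: u32, live_workers: Arc<AtomicUsize>) -> Self {
        let terminate = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&terminate);
        live_workers.fetch_add(1, Ordering::SeqCst);
        let worker = thread::spawn(move || {
            // Terminate is checked at the top of every loop iteration.
            while !flag.load(Ordering::Acquire) {
                thread::sleep(Duration::from_millis(1)); // stand-in for frame work
            }
            live_workers.fetch_sub(1, Ordering::SeqCst);
        });
        Self { target_id, terminate, worker: Some(worker) }
    }
}

impl Drop for MonitorContext {
    fn drop(&mut self) {
        // RAII teardown: signal the worker, then join it so nothing leaks.
        self.terminate.store(true, Ordering::Release);
        if let Some(worker) = self.worker.take() {
            let _ = worker.join();
        }
    }
}

/// Owns every monitor, keyed by the single identity `session_id`.
#[derive(Default)]
pub struct DeviceContext {
    monitors: HashMap<u64, MonitorContext>,
}

impl DeviceContext {
    /// Returns `false` (and starts nothing) if the session already has a monitor.
    pub fn add(&mut self, session_id: u64, target_id: u32, live: Arc<AtomicUsize>) -> bool {
        if self.monitors.contains_key(&session_id) {
            return false;
        }
        self.monitors.insert(session_id, MonitorContext::start(target_id, live));
        true
    }

    /// Removing the entry drops the context, which joins the worker.
    pub fn remove(&mut self, session_id: u64) -> bool {
        self.monitors.remove(&session_id).is_some()
    }

    pub fn target_id(&self, session_id: u64) -> Option<u32> {
        self.monitors.get(&session_id).map(|m| m.target_id)
    }
}
```
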
### Frame transport (single-source on `pf_vdisplay_proto::frame::*`)
Acquire via `ReleaseAndAcquireBuffer2{AcquireSystemMemoryBuffer=0}` → GPU `ID3D11Texture2D`; borrow
`out.MetaData.pSurface` with `IDXGIResource::from_raw_borrowed` (do NOT steal IddCx's refcount), publish
BEFORE `FinishedProcessingFrame`. Ring = `RING_LEN`(6) keyed-mutex shared textures opened by name
(`frame::{header_name,event_name,texture_name}`); per-frame: GetDesc format-guard (drop on FP16↔BGRA
mismatch), `AcquireSync(0,0ms)`, `CopyResource`, `ReleaseSync(0)`, store `FrameToken{gen,seq,slot}.pack()`
(Release), `SetEvent`. All-slots-busy → drop, never block. `is_stale()` (header.generation Acquire) → reattach
on host ring recreate. Write `DRV_STATUS_OPENED` + render LUID into the header. Drop the old DebugBlock +
the locally-duplicated header/MAGIC/name consts.
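
The token publish and staleness check can be sketched with plain atomics. The bit layout below (16-bit generation, 8-bit slot, 40-bit sequence) and the `HeaderSketch` type are assumptions made for illustration; the real layout lives in `pf_vdisplay_proto::frame` and may differ.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Illustrative token; the field widths are assumed, not the proto crate's.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u16,
    pub slot: u8,
    pub seq: u64,
}

const SEQ_BITS: u32 = 40;
const SEQ_MASK: u64 = (1 << SEQ_BITS) - 1;

impl FrameToken {
    /// Packs into one u64 so the whole token is published with a single store.
    pub fn pack(self) -> u64 {
        ((self.generation as u64) << 48) | ((self.slot as u64) << SEQ_BITS) | (self.seq & SEQ_MASK)
    }

    pub fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 48) as u16,
            slot: (raw >> SEQ_BITS) as u8,
            seq: raw & SEQ_MASK,
        }
    }
}

/// Hypothetical stand-in for the shared header's atomic fields.
#[derive(Default)]
pub struct HeaderSketch {
    pub generation: AtomicU64,
    pub latest: AtomicU64,
}

/// Producer side: the Release store makes the finished texture copy visible
/// before the token that names it.
pub fn publish(header: &HeaderSketch, token: FrameToken) {
    header.latest.store(token.pack(), Ordering::Release);
}

/// Consumer side: the Acquire load pairs with the producer's Release store.
pub fn read_latest(header: &HeaderSketch) -> FrameToken {
    FrameToken::unpack(header.latest.load(Ordering::Acquire))
}

/// True when the ring was recreated since the producer attached.
pub fn is_stale(header: &HeaderSketch, attached_generation: u16) -> bool {
    header.generation.load(Ordering::Acquire) != attached_generation as u64
}
```
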
### Implementation checklist (each step CI- or box-gated)
0. workspace `pf-vdisplay`(cdylib)+`wdk-iddcx` members — **STEP-0 gate must pull in `std::thread`+`OwnedHandle`** (critique: prove std links under the UMDF toolchain *here*, not at STEP 5). CI.
1. graduate `iddcx_rt.rs` → `wdk-iddcx` (11 DDIs + `is_nt_error`/`other_is_error`) + **re-export the inbound PFN types**. CI link.
1.5 (critique add) the surface-assert (DONE @ `ae803b2`) lives on so the full PFN/`*2`/`DISPLAYCONFIG` surface stays a CI gate.
2. DriverEntry + driver_add: full `IDD_CX_CLIENT_CONFIG` (15 callbacks as stubs) + DeviceInitConfig + WdfDeviceCreate(+cleanup) + CreateDeviceInterface(`PF_VDISPLAY_INTERFACE_GUID`) + DeviceInitialize + D0Entry stub; salvage `edid.rs` verbatim. **Resolve `.Size` via `IDD_STRUCTURE_SIZE!` (machinery confirmed present).** CI link + FORCE_INTEGRITY clear.
3. DeviceContext + `WDF_DECLARE_CONTEXT_TYPE` Arc<RwLock> blob; init_adapter in D0Entry (caps+FP16) → AdapterInitAsync; the `*2` mode DDIs + query_target_info + gamma/hdr accept-stubs. **Box gate:** loads under Secure Boot, enumerates as IddCx adapter, Status OK (no "Failed to get adapter").
4. control plane (GET_INFO version handshake — **host MUST assert `protocol_version`**, ADD/REMOVE/SET_RENDER_ADAPTER/PING/CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + MONITOR_OP_LOCK; **switch host `sudovda.rs`/`idd_push.rs` to `pf_vdisplay_proto` (GUID e5bcc234→70667664, IOCTL 0x800→0x900, GUID-key→session_id) — lockstep**. CI (host build) + box (monitor appears at WxH@Hz).
5. Direct3DDevice + assign/unassign + SwapChainProcessor (worker thread, SetDevice 60×@50ms single-borrow retry, top-of-loop terminate, Buffer2 acquire, from_raw_borrowed) WITHOUT publisher; wire monitor `EvtCleanupCallback`. **Box:** swap-chain assigns, acquire loop runs, RAII teardown (no thread/VRAM leak). **Critique: instrument that MonitorContext::Drop actually RAN; if the monitor-object cleanup callback does not fire, keep the oracle's explicit free-before-departure path as the fallback.**
6. FramePublisher on `pf_vdisplay_proto::frame::*` + keyed-mutex RAII guard + OwnedHandle/ShmView; wire into run_core. **Box:** full IDD-push glass-to-glass, A/B vs the shipping driver. **Critique: add a BLOCKING secure-desktop gate here** — lock (Win+L)+UAC with serve in the console session / driver in Session 0, confirm frames keep flowing AND input reaches the desktop; until it passes, do NOT delete the WGC-relay/DDA secure path.
7. HDR ring-recreate + repeated session recreate (confirm the recreate-crash is gone). **Critique: define the failure branch** — if recreate isn't stable, keep IDD_PERSIST + state that mid-stream Reconfigure stays unsupported on Windows IDD-push (host rejects, as today) rather than crashing; keep `max_concurrent=1`. **Specify the concurrent-monitor D3D model before enabling >1** (two worker threads must not share one SINGLETHREADED immediate context — give each monitor its own device or a deferred/multithreaded context).
8. unsafe-reduction pass (one audited `SendPtr`/`ThreadBound`; per-site `// SAFETY`; `AcquiredSurface<'_>` + `KeyedMutexGuard` RAII so the hot loop has zero raw Finish/ReleaseSync) + **delete the old `packaging/windows/vdisplay-driver/` tree only after the secure-desktop gate (step 6) passes**. CI clippy -D warnings + final box A/B.
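
Step 8's `KeyedMutexGuard` idea, combined with the "all-slots-busy → drop, never block" rule, can be modelled as follows. The `KeyedMutex` trait and `MockSlot` are hypothetical stand-ins for the real keyed-mutex calls; the sketch only shows that release becomes automatic once acquisition returns a guard.

```rust
use std::cell::Cell;

/// Minimal stand-in for a keyed mutex with a non-blocking acquire.
pub trait KeyedMutex {
    /// Tries to take the lock for `key` without blocking; `false` means busy.
    fn try_acquire(&self, key: u64) -> bool;
    fn release(&self, key: u64);
}

/// Holds the lock for as long as the guard lives; releases on drop.
pub struct KeyedMutexGuard<'a, M: KeyedMutex> {
    mutex: &'a M,
    key: u64,
}

impl<'a, M: KeyedMutex> KeyedMutexGuard<'a, M> {
    pub fn try_lock(mutex: &'a M, key: u64) -> Option<Self> {
        if mutex.try_acquire(key) {
            Some(Self { mutex, key })
        } else {
            None
        }
    }
}

impl<M: KeyedMutex> Drop for KeyedMutexGuard<'_, M> {
    fn drop(&mut self) {
        self.mutex.release(self.key);
    }
}

/// Test double that records how often it was released.
pub struct MockSlot {
    held: Cell<bool>,
    releases: Cell<u32>,
}

impl MockSlot {
    pub fn new(held_by_consumer: bool) -> Self {
        Self { held: Cell::new(held_by_consumer), releases: Cell::new(0) }
    }

    pub fn releases(&self) -> u32 {
        self.releases.get()
    }
}

impl KeyedMutex for MockSlot {
    fn try_acquire(&self, _key: u64) -> bool {
        if self.held.get() {
            false
        } else {
            self.held.set(true);
            true
        }
    }

    fn release(&self, _key: u64) {
        self.held.set(false);
        self.releases.set(self.releases.get() + 1);
    }
}

/// Publishes into the slot if it is free; a busy slot drops the frame.
pub fn publish_or_drop<M: KeyedMutex>(slot: &M) -> bool {
    let Some(_guard) = KeyedMutexGuard::try_lock(slot, 0) else {
        return false; // busy: drop the frame, never block
    };
    // The texture copy would happen here, while the guard is alive.
    true
}
```
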
### Critique verdict + the big risk
Plan is implementation-ready once the 4 CI-checkable unknowns are gates (3 now resolved by the surface-assert
+ `.Size` machinery presence; std-under-UMDF is the STEP-0 gate). **SINGLE BIGGEST RISK: the secure-desktop
claim** — the plan retires the proven two-process WGC relay + DDA on the *unproven* assertion that one
IddPushCapturer captures the lock/UAC secure desktop directly (IDD-push is opt-in today behind
`PUNKTFUNK_IDD_PUSH`). Make it a blocking on-glass gate (step 6) and keep the WGC relay recoverable for one
release. Other defined-failure-branch items: monitor `EvtCleanupCallback` firing, IDD_PERSIST/Reconfigure,
concurrent-monitor device sharing, host↔driver `protocol_version` lockstep.
---
## 15. Current status (2026-06-25)
The rewrite is **largely implemented**. The new all-Rust `pf-vdisplay` driver (the M0 long pole — `iddcx`
on `windows-drivers-rs` + `/INTEGRITYCHECK` — and the §14 STEP 0–8 port) **landed on `main`, on-glass HDR
validated**, and the host was decomposed into the clean layered architecture. One important deviation from
the plan: **the host was refactored *in place* via a staged, behavior-preserving plan
([`windows-host-goal1-plan.md`](windows-host-goal1-plan.md)), not greenfield-rebuilt** — the §10 "rebuild
fresh, keep old as reference" framing was superseded because staging preserved the live-validated host at
every step (lower regression risk than a big-bang M2 rebuild). The §2.3/§2.4/§2.5 design (seam traits,
`SessionPlan`/`SessionFactory`/`SessionContext`, the `VirtualDisplayManager` ownership model) is realized in
that branch's commits, not the M2 greenfield tree the build order imagined.
### Milestone / step status
| Item | Status | Evidence |
|---|---|---|
| **M0** — proto crate, driver workspace, `iddcx` binding, `/INTEGRITYCHECK` | ✅ **DONE** | `pf-vdisplay-proto`; `packaging/windows/drivers/`; `clear-force-integrity.ps1`; CI-green (§13) |
| **§14 STEP 0–8** — pf-vdisplay driver port (device→adapter→control→swap-chain→frame transport→HDR→.inx→unsafe pass) | ✅ **DONE** | `d7a9fbf`–`cd59151`; on-glass HDR (`6399d28`: "Mac connects WITH HDR") |
| **M1/M2** — IDD-push capture + NVENC glass-to-glass | ✅ **DONE** | new driver tree + the existing host IDD-push path; 5K@240 HDR zero-copy on-glass |
| **§2.5** — ownership-model rewrite (`VirtualDisplayManager`/`MonitorLease`); swap-chain-reuse / monitor-leak | ✅ **DONE / RESOLVED** | `windows-host-goal1` §2.5 (`1520201`–`683c81b`); reconnect-leak A/B: 0 leaked monitors |
| **Goal-1 host refactor** (the in-place §2.2–2.5 realization, incl. `EncoderCaps`) | ✅ **DONE** | `windows-host-goal1` branch — all 6 stages + §2.5 + 3 seam tightenings |
| **Game-capture bug (GB1)** — fullscreen game breaks IDD-push | ✅ **FIXED** | `c87bfe0`/`f98ab07`/`789ad49`; see [game-capture-bug.md](windows-host-rewrite-game-capture-bug.md) |
| **M3** — service / input / audio cleanup | 🟡 code present (largely via the existing host + goal1) | — |
| **M4** — gamepad drivers (`pf_dualsense`/`pf_xusb`) onto the unified stack, WDF device contexts (true multi-pad) | ❌ **NOT STARTED** | old gamepad-driver crates still separate |
| **M5** — demoted WGC/DDA fallback port + GameStream-on-`session/pipeline` + AMF/QSV (no hw) | 🟡 **PARTIAL** | fallbacks exist; not re-shaped onto the new seams |
| **M6** — cut over + delete the old monoliths | 🟡 **PARTIAL** | old `vdisplay-driver/` tree deleted (`a2bd0cd`); host monoliths remain |
### What genuinely remains
1. **Secure-desktop on-glass gate (the single biggest open risk, §14 STEP 6 critique).** IDD-push capturing
the lock screen / UAC with `serve` in the console session is **asserted, not yet locked on glass**. Until
it passes, keep the WGC-relay / secure-DDA path recoverable. Hardware-gated (RTX box; ephemeral).
2. **M4 — gamepad-driver migration** onto `windows-drivers-rs` (WDF device contexts → true multi-pad). The
   proven recipe exists; ~2–3 days, hardware-gated.
3. **M5/M6 cleanup** — re-shape the WGC/DDA fallback + GameStream onto `session/pipeline`, then delete the
old Windows monoliths. Low priority; AMF/QSV stays CI-only (no lab hw).
4. **pf-vdisplay driver slot reclaim** — sustained ADD/REMOVE churn wedges the driver (`ADD →
0x80070490 ERROR_NOT_FOUND`): it doesn't reclaim IddCx monitor slots on REMOVE (ghost nodes accumulate).
Recovery today is `packaging/windows/reset-pf-vdisplay.ps1`; the real fix is in the driver
(`control.rs`/`adapter.rs`). Dev helpers `reset-pf-vdisplay.ps1` + `redeploy-pf-vdisplay.ps1` are committed.
### Resolved since the original §11 open items
- **Driver swap-chain reuse** — the clean ownership model (`EvtCleanupCallback` + DeviceContext-owned state +
single `Monitor` identity) is in; §2.5's reconnect-leak A/B shows **0 leaked active monitors**. The
per-frame `CURRENT_MON_GEN` "monitor-gen bail" turned out to have been **write-only** (never wired), so the
"carry the gen through `WinCaptureTarget`" item was dropped; the gen lives on the manager + lease only.
- **`/INTEGRITYCHECK` + `iddcx` on `wdk-sys`** — both proven CI-green (§13).
Box reminder: the RTX box (`ssh "Enrico Bühler"@…`) is **ephemeral** (boots to Proxmox on reboot; IP floats
on DHCP — has been `.173`/`.158`); the windows-amd64 CI runner is the persistent validator. On-glass gates
are opportunistic.