punktfunk/docs/windows-host-rewrite.md
enricobuehler 48202a0f89 docs(windows-rewrite): mark game-capture bug FIXED + bring rewrite status current (§15)
The fullscreen-game-breaks-IDD-push bug is FIXED by the resolution-listening
recovery (c87bfe0: the 250ms poll now follows the display's actual resolution
and recreates the ring on any descriptor change, recover-or-drop), backed by
open-time first-frame DDA failover (f98ab07) and the driver publish() width/
height guard + flushed logging (789ad49). No protocol bump was needed — the host
reads the real resolution straight from Windows (CCD/GDI), so the bug doc's
Stage-1 composing capturer + Stage-2 protocol bump were unnecessary. Bug doc
marked FIXED with a Resolution section; the staged plan kept as superseded record.

windows-host-rewrite.md: the progress log was stale (ended at "M1 cont."). Added
§15 Current status — the driver STEP 0-8 port landed on main on-glass HDR-
validated; the host was refactored *in place* via windows-host-goal1 (not the §10
greenfield rebuild); §2.5 ownership model resolved the swap-chain-reuse / monitor-
leak open item; iddcx + /INTEGRITYCHECK CI-green. Remaining: the secure-desktop
on-glass gate (the single biggest unproven claim), M4 gamepad-driver migration,
M5/M6 cleanup, and the pf-vdisplay slot-reclaim driver fix. Top Status flipped
proposed → largely implemented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 21:35:55 +00:00


Windows Host Rewrite — Design & Plan

Status: largely implemented (updated 2026-06-25 — see §15 Current status for the milestone-by-milestone state; §0–§14 below are the original design and remain the reference). This plan takes the current, hard-won Windows host (pf-vdisplay all-Rust IddCx driver + IDD-push zero-copy capture, live-validated 5120×1440@240 HDR on the RTX box) as a knowledge base and re-derives a clean, stable, well-layered architecture from it. It drops all SudoVDA back-compat (we own both ends now) and drives unsafe to a contained minimum.

It supersedes the stale conclusion in docs/windows-virtual-display-rust-port.md ("IDD-push not viable") — that verdict was written in the same commit (e2c9bfd) that shipped the working 922-line consumer + 424-line producer. IDD-push works and is the architecture. The breakthrough the prose never recorded: once the CCD topology makes the virtual display the sole composited desktop in the console session, DWM composites to it and the IddCx swap-chain is assigned (run_core: FIRST FRAME acquired — DWM IS compositing the virtual display!). Per the owner, IDD-push also captures the secure desktop (Winlogon / UAC / lock) — so it is the universal primary path, not just the normal-desktop path.

Decisions resolved (2026-06-24)

| # | Decision | Chosen |
|---|----------|--------|
| A | Execution: greenfield vs staged | Greenfield rewrite: rebuild the Windows host fresh against the clean architecture, salvaging the validated "jewels" (§1) verbatim. (Risk acknowledged: no CI for the Windows paths; mitigated by the §1 preservation checklist + on-glass gates, §10.) |
| B | Capture surface: IDD-only / IDD+secure-DDA / keep fallbacks | IDD-push primary for everything (incl. the secure desktop); keep WGC + DDA as fallbacks. |
| C | Driver binding stack: wdf-umdf vs windows-drivers-rs | Extend microsoft/windows-drivers-rs with an iddcx subset; unify all three drivers on it; solve /INTEGRITYCHECK properly (§6). |
| D | GameStream on Windows: keep / keep-secure-default / drop | Keep Moonlight compat; flip the installer/service default to secure serve (GameStream an explicit opt-in). |

0. Goals (from the brief)

  1. Clean, stable, well-layered architecture. Decompose the god-files, give every subsystem one owner, and replace the ~40-knob PUNKTFUNK_* env soup with a typed config resolved once per session.
  2. Drop every trace of SudoVDA back-compat. We own the driver (pf-vdisplay) and the host. The byte-identical IOCTL ABI, the reused {e5bcc234} GUID, the sudovda module name, the "SudoVDA ignores this" conditionals — all pure liability now.
  3. Minimize unsafe. ~480 unsafe occurrences across the Windows surface; the large majority are FFI-mechanical (windows-rs/NVENC/WDK already return Result). Target: host ~144→~35, drivers ~227→~60, with the irreducible floor contained in 3–4 named modules under deny(unsafe_op_in_unsafe_fn).

Non-goals / invariants (do not regress)

  • Linux host behavior is out of scope and must not change. The host crate is shared; Linux is validated across KWin/gamescope/Mutter/Sway. Touch only the seams.
  • punktfunk-core stays the one linked core. Protocol/FEC/crypto/QUIC live there behind the C ABI; the host is a leaf binary. No protocol changes here.
  • No async on the per-frame path. Native threads only (the existing discipline).

1. What we KEEP (validated, load-bearing — port, don't rewrite)

These are expensive empirical wins. The rewrite relocates/wraps them but must preserve behavior byte-for-byte:

  • The IDD-push frame transport shape: host-creates / driver-opens shared keyed-mutex texture ring with the permissive D:(A;;GA;;;WD) SDDL (forced by the restricted WUDFHost token, mirrors the gamepad drivers); the generation-tagged latest = gen<<40 | seq<<8 | slot stale-ring reject (kills the HDR-flip garbage frame); 0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the host output ring OUT_RING + pipeline_depth=2 overlap of convert/copy vs NVENC.
  • The IddCx driver internals that earned their keep: edid.rs in full (128-byte EDID + CTA-861.3 HDR block, serial-as-index round-trip, dual checksums); the HDR enablement recipe (CAN_PROCESS_FP16 + the *2 mode DDIs + set_gamma_ramp/set_default_hdr_metadata accept-stubs + HIGH_COLOR_SPACE + 8|10 bpc); DEVICE_POOL one-device-per-render-LUID (the NVIDIA UMD-thread/VRAM leak fix); stamping the OS target id onto the monitor context (the recreated-monitor target_id=0 fix); the swap-chain processor's two real leak fixes (borrow IDXGIDevice across SetDevice retries; check terminate at the loop top during a frame burst).
  • The monitor-lifecycle concurrency correctness: serialized ADD/REMOVE/teardown, the documented lock order, the watchdog CAS + re-check-under-lock, the creation grace window, the generation-stamped lease (a stale lease can't tear down a fresh monitor). Structure can change; these properties must survive.
  • The CCD topology fixes: isolate_displays_ccd (the iGPU-attached-monitor hybrid-box correctness; the SDC_FORCE_MODE_ENUMERATION re-commit that drives COMMIT_MODES → ASSIGN_SWAPCHAIN); restore topology before REMOVE.
  • The HDR color math: hdr.rs verbatim (pure, unit-tested, ST.2086 G/B/R + big-endian SEI); HdrConverter/HdrP010Converter + the f64 p010_reference + hdr_p010_selftest; VideoConverter (RGB→NV12/P010 on the video engine — a measured latency win); the cursor decomposition (convert_pointer_shape color/masked/monochrome edge cases).
  • NVENC tuning: caps-probe-before-configure (disambiguate unsupported-config vs too-high-bitrate; 10-bit→8-bit graceful downgrade); the bitrate-clamp binary search (finds each GPU's real ceiling); true RFI over the DPB; the low-latency configs (CBR, infinite GOP, P-only, ~1-frame VBV).
  • The gamepad driver wins: the SwDeviceCreate identity recipe (enumerator with no _; mandatory completion callback; synthesized USB\VID_054C&PID_0CE6 compat-ids for native-DS5 detection; the non-null per-pad ContainerId dodging the xinput1_4 slot-skip); one pf_dualsense serving DualSense+DS4 via a device_type byte; XUSB declining WAIT_* to force synchronous GET_STATE; the static HID descriptors/feature blobs; per-pad index via pszDeviceLocation.
  • The session-glue patterns: the Capturer/VirtualDisplay/Encoder trait seam + RAII keepalive teardown; host-lifetime shared services (InjectorService/MicService/AudioCapSlot) with per-session gamepads; the encode|send thread split + microburst pacing; build_pipeline_with_retry + permanent-vs-transient classification; the control-task select! + adaptive-FEC; the GameStream VideoPacketizer (GF8 Cauchy, Moonlight byte-exact); the pairing/trust handshake.
  • The SCM supervisor model: Session-0 LocalSystem supervisor → token-retarget → CreateProcessAsUserW serve into the console session, relaunch-on-session-change, kill-on-close Job Object; the file-append log-mask; the two-tier logging init.
  • Build/CI wins: the wdf-umdf-sys build.rs SDK-version resolution (picks the SDK version that actually contains iddcx, not the max base SDK); the ARM64 cross-compile off the x64 runner; the thin-.iss / fat-binary installer delegating to service install.
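The bitrate-clamp binary search listed under NVENC tuning can be illustrated with a small, encoder-agnostic sketch. This is not the host's code: `find_bitrate_ceiling` and the `accepts` probe are hypothetical names, and the sketch assumes acceptance is monotonic (if the encoder accepts a bitrate, it accepts every lower one).

```rust
/// Largest bitrate in `[lo, hi]` that the probe accepts, or `None` if even
/// `lo` is rejected. `accepts` stands in for "try to configure the encoder
/// at this bitrate" (hypothetical; the real probe lives in the NVENC backend).
/// Assumes acceptance is monotonic in the bitrate.
fn find_bitrate_ceiling(
    lo: u32,
    hi: u32,
    mut accepts: impl FnMut(u32) -> bool,
) -> Option<u32> {
    if lo > hi || !accepts(lo) {
        return None;
    }
    if accepts(hi) {
        return Some(hi); // the requested rate already fits, no search needed
    }
    // Invariant: `good` is accepted, `bad` is rejected, and good < bad.
    let (mut good, mut bad) = (lo, hi);
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        if accepts(mid) {
            good = mid;
        } else {
            bad = mid;
        }
    }
    Some(good)
}
```

Each probe halves the interval, so a range of a few million kbps resolves in roughly twenty probes rather than a linear walk down from the requested rate.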

2. Target architecture

2.1 Crate & workspace strategy

Keep ONE shared crates/punktfunk-host crate (do not split punktfunk-host-windows). The host is a leaf binary consumed by nobody; the "one core, linked everywhere" invariant is already satisfied by punktfunk-core. A split would only fork the genuinely-shared session glue, traits, and hdr.rs. The cfg-sprawl win comes instead from confining all Windows code under one src/windows/ subtree behind a single #[cfg(windows)] mod windows; seam, with backend impls next to their trait's dispatch point.

Pull the three drivers into ONE in-tree driver workspace (packaging/windows/drivers/) on a single binding stack, one rust-toolchain.toml, one signing recipe, one CI build. Today they are 2–3 disjoint cargo packages on two incompatible WDK stacks (see §6).

Add ONE shared no_std ABI crate (crates/pf-vdisplay-proto, name TBD) consumed by both the host crate and the driver workspace. It owns every cross-process binary contract that is currently hand-duplicated with "must match" comments. This is the single highest-value correctness change (§3.1).

2.2 Target file tree (host crate)

crates/punktfunk-host/src/
  main.rs                  clap-derive subcommand dispatch only (kills parse_serve/parse_spike/hand --help)
  config.rs                HostConfig (typed; parsed ONCE from host.env/env/flags) + config_dir
  session/
    mod.rs                 SessionFactory, SessionPlan, SessionContext, Session (the ONLY teardown path)
    server.rs              QUIC accept loop, handshake, shared-service wiring
    serve_session.rs       resolve_* → Welcome/Start → spawn → RAII teardown
    control.rs             mid-stream renegotiation select! loop
    pipeline.rs            REAL shared encode|send split, send_loop, FrameMsg, pacing (used by native AND GameStream)
  capture.rs               Capturer trait + CapturedFrame/PixelFormat/FramePayload (platform-neutral)
  capture/linux.rs
  capture/windows/         mod.rs (dispatch), idd_push.rs, dda.rs, wgc.rs, secure_desktop.rs*
  vdisplay.rs              VirtualDisplay/VirtualOutput trait + open() dispatch (neutral)
  vdisplay/{kwin,gamescope,mutter,wlroots}.rs
  vdisplay/windows.rs      was sudovda.rs → PfVirtualDisplay + VirtualDisplayManager
  encode.rs                Encoder trait, EncodedFrame, validate_dimensions, open_encoder dispatch
  encode/{linux,vaapi,sw}.rs
  encode/windows/          mod.rs (dispatch), nvenc.rs, nvenc_sys.rs, ffmpeg_win/{mod,system,zerocopy,d3d11va_ffi}.rs
  hdr.rs                   PRESERVE VERBATIM
  inject.rs / inject/linux/* / inject/windows/{mod,sendinput,pad_manager,xusb,dualsense,dualshock4,swdevice,section}.rs
  inject/proto/{dualsense,dualshock4}.rs   shared pure codecs (PRESERVE)
  audio.rs / audio/linux.rs / audio/windows/{mod,wasapi_cap,wasapi_mic}.rs
  windows/                 mod.rs, d3d/{mod,texture,ring,convert}.rs, color/{hdr,p010,video_proc}.rs,
                           cursor.rs, display_ccd.rs, adapter.rs, process.rs (Token/Event/Job/Child/spawn_as_user),
                           service.rs (SCM; uses process.rs), win32u_hook.rs*, gpu_priority.rs
  session_tuning.rs (PRESERVE) / pwinit.rs / discovery.rs / mgmt.rs / native_pairing.rs / library.rs
  gamestream/              unchanged module set; stream.rs slims by reusing session/pipeline.rs

* = survives only per the secure-desktop / WGC product decisions (§4, §11).

2.3 The seam traits (keep the shape; tighten 3 things)

trait VirtualDisplay: Send {
    fn name(&self) -> &str;
    fn create(&self, mode: Mode) -> Result<VirtualOutput>;
    fn set_launch_command(&self, cmd: Option<String>);   // per-instance, not a global env var
}
struct VirtualOutput {
    node_id: u32,
    preferred_mode: Mode,
    #[cfg(windows)] win_capture: WinCaptureTarget,        // target_id + adapter_luid + monitor_gen (carried, not ambient)
    keepalive: Box<dyn VirtualLease>,
}
trait VirtualLease: Send {                                // Drop = release; replaces the sudovda free-fns + CURRENT_MON_GEN reach-in
    fn set_hdr(&self, on: bool) -> Result<()>;
    fn hdr_enabled(&self) -> bool;
    fn await_released(&self, timeout: Duration) -> bool;
}

trait Capturer: Send {
    fn next_frame(&mut self) -> Result<CapturedFrame>;
    fn try_latest(&mut self) -> Option<CapturedFrame>;
    fn set_active(&mut self, a: bool);
    fn hdr_meta(&self) -> Option<HdrMeta>;
    fn pipeline_depth(&self) -> usize;
}
fn open_capturer(vout: VirtualOutput, want: OutputFormat) -> Result<Box<dyn Capturer>>;  // format+HDR passed IN

trait Encoder: Send {
    fn submit(&mut self, f: &CapturedFrame) -> Result<()>;
    fn poll(&mut self) -> Option<EncodedFrame>;
    fn flush(&mut self);
    fn request_keyframe(&mut self);
    fn caps(&self) -> EncoderCaps;                        // query, don't rely on default no-ops
    fn set_hdr_meta(&mut self, m: Option<HdrMeta>);
    fn invalidate_ref_frames(&mut self, lo: u64, hi: u64) -> bool;
}
fn open_encoder(plan: &EncodePlan) -> Result<Box<dyn Encoder>>;

trait AudioCapturer: Send { fn next_chunk(&mut self) -> Result<Vec<f32>>; fn channels(&self) -> u16; fn drain(&mut self); }
trait VirtualMic:    Send { fn push(&mut self, pcm: &[f32]); fn channels(&self) -> u16; }
trait InputInjector: Send { fn inject(&mut self, e: &InputEvent); }
trait PadManager:    Send { /* handle/apply_rich/pump/heartbeat — Box<dyn PadManager> via select(GamepadPref), replaces the PadBackend enum */ }

The three tightenings: (1) Capturer takes the desired OutputFormat IN — kills the capture → encode::windows_resolved_backend() back-reference that's recomputed in dxgi.rs; (2) HDR control + monitor-release become VirtualLease methods so the session glue never names a concrete backend and contains zero unsafe; (3) optional encoder capabilities are queried via EncoderCaps.

2.4 SessionFactory + typed plan (the single biggest clarity lever)

Today the Windows capture/topology/encoder decision is made by ~40 scattered env reads, recomputed in THREE places (capture_virtual_output, should_use_helper, virtual_stream) with no single owner and a latent mirrored-dispatch bug (capture and encode can disagree on the backend). Replace with:

struct SessionPlan {
    display:  DisplayBackend,
    capture:  CaptureBackend,        // IddPush | Dda | Wgc
    topology: SessionTopology,       // SingleProcess | TwoProcessRelay
    encoder:  EncoderBackend,        // Nvenc | Amf | Qsv | Software
    input_format: OutputFormat,
    bit_depth: u8, hdr: bool, pipeline_depth: usize,
}
struct SessionFactory { cfg: Arc<HostConfig>, vdm: Arc<VirtualDisplayManager>, injector, mic, audio }
impl SessionFactory {
    fn plan(&self, welcome: &Welcome) -> SessionPlan;          // resolves ONCE from HostConfig; no env reads downstream
    fn build(&self, plan: &SessionPlan, ctx: SessionContext) -> Result<Session>;  // owns the RAII chain
}

build() owns the chain vdm.lease(mode) → open_capturer(vout, fmt) → open_encoder(plan) → spawn pipeline, and Session::drop is the only teardown path. This kills the env soup, makes the deployed path readable, and removes the capture/encode backend-disagreement bug class. It also lets us drop the 12–13-arg #[allow(too_many_arguments)] signatures (a SessionContext struct) and the dead Compositor ceremony threaded through the Windows path.
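A minimal sketch of the "resolve once" idea follows. It is an assumption-laden illustration, not the planned implementation: the `HostConfig` fields, `ClientRequest`, `resolve_plan`, and the HDR/pipeline-depth policy are all hypothetical and trimmed to a few fields. The point is only that every downstream consumer reads the same `SessionPlan` value, so capture and encode cannot disagree.

```rust
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CaptureBackend { IddPush, Dda, Wgc }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncoderBackend { Nvenc, Software }

/// Hypothetical typed config, parsed once at startup (no env reads later).
struct HostConfig {
    capture_fallback: Option<CaptureBackend>, // explicit opt-out of IDD-push
    nvenc_available: bool,
    allow_hdr: bool,
}

/// Hypothetical slice of what the client asked for.
struct ClientRequest { wants_hdr: bool }

#[derive(Debug, Clone, PartialEq, Eq)]
struct SessionPlan {
    capture: CaptureBackend,
    encoder: EncoderBackend,
    hdr: bool,
    bit_depth: u8,
    pipeline_depth: usize,
}

/// Resolve every backend decision in one place.
fn resolve_plan(cfg: &HostConfig, req: &ClientRequest) -> SessionPlan {
    let capture = cfg.capture_fallback.unwrap_or(CaptureBackend::IddPush);
    let encoder = if cfg.nvenc_available {
        EncoderBackend::Nvenc
    } else {
        EncoderBackend::Software
    };
    // Illustrative policy only: HDR needs config, client, and encoder to agree.
    let hdr = cfg.allow_hdr && req.wants_hdr && encoder == EncoderBackend::Nvenc;
    SessionPlan {
        capture,
        encoder,
        hdr,
        bit_depth: if hdr { 10 } else { 8 },
        // Depth 2 for IDD-push mirrors the pipeline_depth=2 overlap in §1;
        // depth 1 for the fallbacks is an assumption of this sketch.
        pipeline_depth: if capture == CaptureBackend::IddPush { 2 } else { 1 },
    }
}
```

Because the plan is a plain value, it can be logged once per session and compared in tests, which is harder to do with decisions spread over scattered env reads.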

2.5 Ownership model — delete the global statics

IMPLEMENTED (2026-06-25, branch windows-host-goal1). Landed as 3 steps + an on-glass reconnect-leak test — see windows-host-goal1-plan.md §2.5 for the commits + results. One deviation from the sketch below: the 5-agent map found CURRENT_MON_GEN was write-only (the per-frame monitor-gen bail was never wired), so the "generation carried through WinCaptureTarget" item was unnecessary and dropped; the gen lives on the manager + lease only.

Today the lifecycle is smeared across IDD_PERSIST + open_or_reuse (dead code), CURRENT_MON_GEN (read per-frame), IDD_SETUP_LOCK/IDD_SESSION_STOP (the preempt dance), MGR: Mutex<Mgr>, and on the driver side ADAPTER/MONITOR_MODES/NEXT_ID/WATCHDOG_*/DEVICE_POOL. Replace with:

  • A host-lifetime VirtualDisplayManager owning a typed OwnedHandle device handle (not a raw isize smuggled across threads) and the refcounted Idle/Active/Lingering state machine (preserve the machine — it's earned).
  • A per-session MonitorLease whose Drop releases the refcount; the monitor generation carried through WinCaptureTarget instead of the ambient CURRENT_MON_GEN.
  • On the driver: wire EvtCleanupCallback for MonitorContext (only DeviceContext has it today) so the SwapChainProcessor + D3D resources drop via WDF RAII — deleting free_swap_chain_processor and the manual-free-before-departure dance that is the documented dominant reconnect leak. Move the process-global driver state into the DeviceContext; collapse the 3-way monitor identity (MONITOR_MODES / EDID serial / context stamp) to one Monitor owned by the context.
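The manager/lease relationship described above can be sketched in portable Rust. This is a simplified model under stated assumptions, not the landed code: the real manager also owns the device handle and linger timers, and `teardown` here is a hypothetical stand-in for "linger expired or the watchdog fired". It shows the two properties the text asks to preserve: Drop releases the refcount, and a stale lease cannot touch a newer monitor generation.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State { Idle, Active { refs: u32 }, Lingering }

struct Inner { state: State, generation: u64 }

#[derive(Clone)]
struct VirtualDisplayManager { inner: Arc<Mutex<Inner>> }

/// Per-session lease; dropping it releases one reference.
struct MonitorLease { mgr: VirtualDisplayManager, generation: u64 }

impl VirtualDisplayManager {
    fn new() -> Self {
        Self { inner: Arc::new(Mutex::new(Inner { state: State::Idle, generation: 0 })) }
    }

    fn lease(&self) -> MonitorLease {
        let mut g = self.inner.lock().unwrap();
        let cur = g.state;
        let refs = match cur {
            State::Active { refs } => refs + 1,
            State::Lingering => 1, // reuse the lingering monitor
            State::Idle => {
                g.generation += 1; // a fresh monitor gets a new generation
                1
            }
        };
        g.state = State::Active { refs };
        MonitorLease { mgr: self.clone(), generation: g.generation }
    }

    /// Hypothetical: the monitor was removed (linger expiry or watchdog).
    fn teardown(&self) {
        self.inner.lock().unwrap().state = State::Idle;
    }

    fn state(&self) -> State {
        self.inner.lock().unwrap().state
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut g = self.mgr.inner.lock().unwrap();
        if g.generation != self.generation {
            return; // stale lease: never touch a newer monitor
        }
        let cur = g.state;
        if let State::Active { refs } = cur {
            g.state = if refs > 1 {
                State::Active { refs: refs - 1 }
            } else {
                State::Lingering
            };
        }
    }
}
```

With this shape the session glue only ever holds a `MonitorLease`; there is no ambient global to consult and no free function to forget to call on an error path.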

3. The host↔driver contract (own it; define once)

3.1 pf-vdisplay-proto (no_std, bytemuck/zerocopy)

One crate, both build graphs (path dep). Owns:

  • Control plane: a fresh interface GUID; a contiguous, versioned op enum; #[repr(C)] request/reply structs carrying only used fields.
  • Frame plane: SharedHeader, the FrameToken { generation, seq, slot } with pack/unpack (replacing the hand-twiddled gen<<40|seq<<8|slot on both sides), the Global\pfvd-* name helpers.
  • Gamepad sections: XusbShm (64 B) and PadShm (256 B, incl. device_type) layouts.
  • Derive FromBytes/IntoBytes/Pod; const size+offset asserts; round-trip tests. ABI drift becomes a compile error, not a runtime corruption. (bytemuck is already a dep in the driver + wdf-umdf-sys.) This deletes every OFF_* constant + read/write_unaligned on both sides of every boundary — the largest single block of shared-memory unsafe, and the top drift hazard.
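As an illustration of the FrameToken item, here is a sketch of pack/unpack for the gen<<40 | seq<<8 | slot layout. The field widths are inferred from the shifts (8-bit slot, 32-bit seq, 24-bit generation) and are an assumption, as are the method names and `is_current`; the real crate may choose differently.

```rust
/// Packed as `generation << 40 | seq << 8 | slot` in one u64.
/// Widths are inferred from the shifts and are an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameToken {
    generation: u32, // only the low 24 bits are stored
    seq: u32,
    slot: u8,
}

impl FrameToken {
    const GEN_MASK: u64 = (1 << 24) - 1;

    fn pack(self) -> u64 {
        ((self.generation as u64 & Self::GEN_MASK) << 40)
            | ((self.seq as u64) << 8)
            | self.slot as u64
    }

    fn unpack(raw: u64) -> Self {
        FrameToken {
            generation: (raw >> 40) as u32, // top 24 bits
            seq: (raw >> 8) as u32,         // truncates to the 32-bit seq field
            slot: raw as u8,                // low 8 bits
        }
    }

    /// Stale-ring reject: a token from an older ring generation is ignored.
    fn is_current(self, ring_generation: u32) -> bool {
        self.generation == (ring_generation & Self::GEN_MASK as u32)
    }
}
```

Having one `pack`/`unpack` pair in the shared crate means the host and the driver cannot drift apart on shift amounts, which is the failure mode the "must match" comments try to guard against today.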

3.2 Control plane — keep DeviceIoControl, redesign the ABI

DeviceIoControl is the correct WDF idiom for a driver with no control device and is low-frequency (ADD/REMOVE per session + a keepalive); the shared-memory pattern buys nothing here. Keep it; redesign the surface:

  • Ops actually needed: Add(mode, identity) → {luid, target_id}, Remove, SetRenderAdapter (now unconditional — pf-vdisplay honors it for hybrid-GPU IDD-push; drop the SudoVDA-parity default-off branch), ClearAll (first-class startup orphan reap, not an "ignored by SudoVDA" hack), GetInfo (a real version handshake), and keepalive (see §3.4).
  • Drop the SudoVDA-isms: AddParams.device_name[14]/serial[14] (ignored), the 16-byte GUID → a monotonic u64 session id (the refcount manager owns collision safety; retires next_monitor_guid's pid-mangling), the 4-byte {major,minor,incr,test} version tuple → one u32, the gappy 0x800/0x888/0x8FF func numbering → contiguous.
  • One typed IOCTL dispatch helper retrieves+validates+aligns the buffers and hands the body a safe &Req / &mut MaybeUninit<Reply> — collapses ~20 of control.rs's 29 unsafe blocks.
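The request half of such a typed dispatch helper might look like the sketch below. Everything here is hypothetical (`PlainData`, `AddRequest` and its fields, `IoctlError`, `view_request`); the real helper would sit on top of the WDF request-buffer retrieval calls and would also hand out the reply buffer. The sketch only shows how size and alignment validation can be confined to one audited function.

```rust
use std::mem::{align_of, size_of};

/// Marker for request types that are valid for any bit pattern.
/// Safety contract: implement only for `#[repr(C)]` structs made of plain
/// integers with no padding. (Hypothetical; a real crate might derive this
/// via bytemuck or zerocopy instead.)
unsafe trait PlainData: Copy {}

/// Hypothetical request layout: 24 bytes, 8-byte aligned, no padding.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct AddRequest {
    session_id: u64,
    width: u32,
    height: u32,
    refresh_mhz: u32,
    flags: u32,
}
unsafe impl PlainData for AddRequest {}

#[derive(Debug, PartialEq, Eq)]
enum IoctlError {
    TooSmall { need: usize, got: usize },
    Misaligned,
}

/// Validate the raw input buffer once, then hand the handler a safe `&T`.
fn view_request<T: PlainData>(buf: &[u8]) -> Result<&T, IoctlError> {
    if buf.len() < size_of::<T>() {
        return Err(IoctlError::TooSmall { need: size_of::<T>(), got: buf.len() });
    }
    if buf.as_ptr() as usize % align_of::<T>() != 0 {
        return Err(IoctlError::Misaligned);
    }
    // SAFETY: length and alignment were checked above, `T: PlainData` means
    // every bit pattern is a valid `T`, and the returned borrow is tied to `buf`.
    Ok(unsafe { &*(buf.as_ptr() as *const T) })
}

/// 8-byte-aligned scratch buffer, used only to exercise the helper.
#[repr(C, align(8))]
struct AlignedBuf([u8; 40]);
```

Handlers written against `view_request::<AddRequest>(buf)?` contain no unsafe of their own; the single `unsafe` block above is the part that needs review.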

3.3 Frame plane — keep the inversion, retire the scaffolding

Keep the host-creates / driver-opens ring exactly. Remove the bring-up scaffolding that diagnosed the now-solved run_core=0 mystery: the DebugBlock channel + DBG_MAGIC, spawn_observer / PUNKTFUNK_IDD_PUSH_OBSERVE, the error!-as-info! logging, the intentional handle leak, and the 20 s blind no-frame deadline (replace with the DRV_STATUS_OPENED handshake as a bounded liveness signal).

3.4 Driver swap-chain reuse — the one open root cause

Today a reused IddCx monitor's swap-chain dies after ~2 sessions (target id resolves to 0, SetDevice fails 0x80070057, then an access violation), forcing fresh-monitor-per-session + the host-side preempt/wait_for_monitor_released dance + the IDD_PERSIST "create once, never recreate" workaround. The fix is in the driver: with EvtCleanupCallback wired + state owned by DeviceContext + the identity collapsed to one Monitor (the recreate-path bugs are exactly the 3-way identity desync), the clean recreate should become stable. If that holds, delete IDD_SETUP_LOCK/IDD_SESSION_STOP + the preempt dance and unblock max_concurrent>1 on Windows. If it can't be fixed cheaply, isolate the residual serialization inside VirtualDisplayManager (not smeared back into the session loop). Separately, evaluate replacing the polling watchdog (PING/countdown/grace/linger constellation) with a WDF file-object EvtFileClose (host holds the control handle open; close = host gone) — feasibility TBD on UMDF/IddCx.


4. Capture strategy

IDD-push is the universal primary path — normal AND secure desktop (Decision B). It composes in-process (cross-session via Global\ shared textures: driver in WUDFHost/Session 0, serve in the console session), needs no DXGI Desktop Duplication and no win32u reparenting hook, is live-validated at 5K@240 HDR, and (per the owner) also captures the secure desktop (Winlogon/UAC/lock). So there is no separate "secure capturer" in the primary path: the same IddPushCapturer spans the lock screen and UAC. Capture selection moves into a typed CaptureBackend in the SessionPlan — replacing the 3-way env branch with IddPush (default) → Dda/Wgc (explicit fallbacks).

WGC + DDA are kept as fallbacks, not deleted (Decision B). They cover non-IddCx / pre-pf-vdisplay hardware and act as a safety net if IDD-push fails to attach. But they are demoted: they are no longer the default, no longer entangled with the secure-desktop mux, and selected only via the explicit CaptureBackend fallback in the plan. This lets the DDA module shed the parts that existed only to make virtual-display-over-DDA survive on a hybrid box, while the genuinely-useful capture/recovery core stays:

  • Scope the win32u self-modifying-code hook + the GPU-pref hook to the DDA fallback leg (one win32u_hook::install()), so the primary IDD-push path never touches them. Re-confirm whether DDA even needs the win32u hook against pf-vdisplay (it may not — open verification item).
  • The two-process WGC relay's secure-desktop mux is retired — IDD-push handles the secure desktop directly, so desktop_watch.rs + composed_flip.rs + the virtual_stream_relay monolith are no longer needed for their original purpose. Keep a minimal WGC fallback capturer if the WGC backend is retained; do not port the 400-line relay state machine. (The cross-session input concern below is handled by the InputInjector/topology abstraction, not the AU video relay.)

Shared D3D primitives move out of dxgi.rs (today the de-facto dumping ground that wgc.rs and idd_push.rs import from) into windows/d3d/ (typed Texture2d/Ring/CopyResource/Map-as-bytes), windows/color/ (the converters + hdr_p010_selftest verbatim), and windows/cursor.rs. All three capturers consume them — deletes the duplicated tex_desc, cursor, HDR-poll, repeat-last logic.

The texture-ownership contract becomes type-level. NVENC encodes the capturer's texture in place (no copy), sound today only because the IDD-push capturer rotates OUT_RING and the loop honors pipeline_depth() — an undocumented cross-module coupling that is already a latent corruption risk. Fix: either the encoder always CopySubresourceRegions (as ffmpeg_win does), or the capturer hands an explicitly-leased ring texture with a documented lifetime. No more relying on the synchronous-loop assumption.

The IDD-push input question (must confirm on-glass): capture+encode run in serve; input must reach the streamed (console-session) desktop. If serve runs in the console session, SendInput works directly. A code comment flags "SendInput from Session 0 can't reach Session 1" — so the architecture must make InputInjector satisfiable either by in-session SendInput or by a tiny input-only Session-1 agent (re-scope the old WGC helper to input only). The SessionPlan.topology expresses this.


5. Encode layer

  • Resolve backend + input format + pipeline depth once into EncodePlan and hand it to both the capturer and the encoder factory — kill the duplicated windows_resolved_backend() call in dxgi.rs (the highest-severity coupling). Trim open_video's 8-arg grab-bag (cuda is always false on Windows; bit_depth is overridden by the capture format anyway).
  • nvenc_sys.rs becomes a thin safe wrapper: RAII NvSession/NvBitstream/NvRegistration/NvMappedInput (Drop = destroy/unregister/unmap) + an NV_ENC_CONFIG builder. The public encoder then has near-zero unsafe and no hand-written teardown loops. (The SDK table already returns Result via result_without_string().) This is the single biggest encode-side unsafe reduction.
  • ffmpeg_win: RAII AvFrame/SwsCtx/HwDeviceCtx/HwFramesCtx delete every manual av_*_free and the error-path cleanup ladders (also the biggest leak-risk reduction); a checked MappedSurface for the staging readback; a const size-assert on the hand-mirrored AVD3D11VA* structs in a dedicated d3d11va_ffi submodule (silent FFmpeg ABI drift is currently undetectable). Keep system-readback the default; zero-copy stays opt-in/experimental (no AMD/Intel lab box).
  • HDR symmetry: make in-band ST.2086/CLL SEI a shared post-encode step so AMF/QSV get the same mastering metadata as NVENC (today only NVENC attaches it; AMF/QSV rely solely on the 0xCE datagram). Centralize "when does the client learn HDR metadata" in one owner.
  • Keep hdr.rs, the Encoder trait, EncodedFrame, validate_dimensions, the caps-probe + RFI logic verbatim. Delete the pipeline.rs pump_once doc stub (the real loop is session/pipeline.rs).

6. Drivers — one binding stack (windows-drivers-rs), one workspace, one signing recipe

Today: pf-vdisplay on the vendored wdf-umdf stack; pf_dualsense + pf_xusb on microsoft/windows-drivers-rs (wdk/wdk-sys/wdk-build). Two bindgen passes, two SDK resolutions, two NTSTATUS, two build systems, two signing recipes.

Decision C: unify all three on microsoft/windows-drivers-rs (the official Microsoft stack), in one in-tree packaging/windows/drivers/ workspace, edition 2024, one rust-toolchain.toml, one CI build. The gamepad drivers already ship on it; the work is to migrate pf-vdisplay onto it and add the IddCx surface it lacks today.

Required pieces of this migration (each a Phase-0/early task):

  1. Add an iddcx subset to wdk-sys. IddCx DDIs are not WDF-table functions — they are direct IddCxStub exports — so the extension is bounded: an ApiSubset::Iddcx + iddcx feature → bindgen IddCx.h + link IddCxStub, then ~15 thin extern/wrapper fns. Use the current wdf-umdf/src/iddcx.rs (~345 LOC, validated) as a line-by-line oracle, including the IddCx 1.10 *2 HDR DDIs (IddCxSwapChainReleaseAndAcquireBuffer2, IDARG_*2, _METADATA2).
  2. Solve /INTEGRITYCHECK for self-signed loading — properly. wdk-build links the driver with /INTEGRITYCHECK, which a self-signed cert can't satisfy (CodeIntegrity 3004/3089). Today the gamepad drivers hand-patch the FORCE_INTEGRITY PE bit post-link. Replace that hack with a robust solution, in order of preference: (a) override the linker flag — drop /INTEGRITYCHECK via wdk-build config / RUSTFLAGS/link-args if it can be suppressed cleanly; else (b) a deterministic, tested CI post-link tool (a small Rust/PowerShell step that clears bit 0x80 at e_lfanew+0x5e and re-signs, run in CI, not by hand) so it's reproducible and not a footgun; (c) for a public build, real attestation signing (Partner Center) satisfies /INTEGRITYCHECK legitimately. Pick (a) if feasible; (b) as the fleet-self-signed fallback. This is the headline cost of choosing this stack and must be nailed in Phase 0.
  3. Backport the wdf-umdf-sys build.rs SDK-resolution fix into wdk-build (or a local override): resolve IddCx.h/IddCxStub by the SDK version that actually contains um\x64\iddcx, not the max base SDK (the real failure where a newer base SDK shadows the WDK SDK). windows-drivers-rs's default resolution doesn't exercise IddCx today, so this likely needs porting.
  4. Port pf-vdisplay's typed safety wins onto the new stack: re-create the WDF_DECLARE_CONTEXT_TYPE! Arc<RwLock<T>> context abstraction (the gold-standard contained unsafe); the version-gate protocol (IddCxIsFunctionAvailable! / IDD_STRUCTURE_SIZE!); and a thin safe wrapper layer so the gamepad drivers stop emitting raw call_unsafe_wdf_function_binding! everywhere (the biggest driver-unsafe lever).
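Option (b) in item 2 can be sketched as a small function over the image bytes. This is an illustration under assumptions, not the project's tool: `clear_force_integrity` is a hypothetical name, it operates on an in-memory buffer, and it does not recompute the PE checksum or re-sign the binary, both of which a real CI step would still need to handle. The offset e_lfanew+0x5e is the DllCharacteristics field, and 0x80 is IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY.

```rust
const FORCE_INTEGRITY: u16 = 0x0080; // IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY
const DLL_CHARACTERISTICS_OFFSET: usize = 0x5e; // relative to the "PE\0\0" signature

/// Clear the FORCE_INTEGRITY bit in a PE image held in memory.
/// Returns `Ok(true)` if the bit was set, `Ok(false)` if it was already clear.
fn clear_force_integrity(image: &mut [u8]) -> Result<bool, &'static str> {
    if image.len() < 0x40 || &image[..2] != b"MZ" {
        return Err("not an MZ image");
    }
    // e_lfanew: little-endian u32 at 0x3c, file offset of the PE signature.
    let e_lfanew =
        u32::from_le_bytes([image[0x3c], image[0x3d], image[0x3e], image[0x3f]]) as usize;
    let sig_end = e_lfanew.checked_add(4).ok_or("bad e_lfanew")?;
    if image.get(e_lfanew..sig_end) != Some(&b"PE\0\0"[..]) {
        return Err("missing PE signature");
    }
    let off = e_lfanew
        .checked_add(DLL_CHARACTERISTICS_OFFSET)
        .ok_or("bad e_lfanew")?;
    let field = image
        .get_mut(off..off + 2)
        .ok_or("truncated optional header")?;
    let flags = u16::from_le_bytes([field[0], field[1]]);
    let was_set = flags & FORCE_INTEGRITY != 0;
    field.copy_from_slice(&(flags & !FORCE_INTEGRITY).to_le_bytes());
    Ok(was_set)
}
```

Returning whether the bit was set lets a CI step assert that the patch actually did something, or fail loudly if the linker flag was already suppressed via option (a).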

While unifying, also: adopt WDF device contexts for per-pad state (drop the UmdfHostProcessSharing=ProcessSharingDisabled-dependent statics → true multi-pad-per-host); replace mem::zeroed() configs with the WDF_*_CONFIG_INIT initializers (kills the recurring zeroed-default bug class that already caused 3 driver bugs); cache the shm view (RAII ShmView) instead of re-mapping ~125×/s; delete the world-writable C:\Users\Public\*.log driver logging and the "M0 spike" naming; collapse is_nt_error()/dyn-Any/From<()>-as-error into a typed IntoDriverResult; collapse the per-call dispatch unsafe into one generic dispatch() helper.

Provenance note: confirm where wdk/wdk-sys/wdk-build come from (the gamepad drivers' Cargo.toml path-deps ../../crates/wdk* don't exist in this checkout — they resolve inside a windows-drivers-rs checkout on the dev box). Pin them as crates.io deps or a vendored, version-pinned copy so the driver workspace builds reproducibly in CI.


7. Input, audio, service, packaging

  • Input: consolidate the host-side device plumbing (create_swdevice/create_shm_section/SwDeviceProfile) into one inject/windows/swdevice.rs used by all three managers (XUSB included, which currently re-implements its own). The shm layouts come from pf-vdisplay-proto. Re-scope the cross-session helper (if any) to input-only.
  • Audio: small, already fairly clean. Replace the lone newdev.dll LoadLibrary+transmute (wasapi_mic.rs, the audio runtime's only unsafe) with the windows-rs DiInstallDriverW binding (or move provisioning to the installer) → zero unsafe in the audio runtime.
  • Service / process: one windows/process.rs owning RAII Token/Event/Job/Child + a single spawn_as_user() used by BOTH the SCM supervisor and any helper — deletes the duplicated token-dup/merged_env_block/CreateProcessAsUserW machinery and ~12 manual CloseHandle sites. Add a cooperative stop: a named stop event the supervisor sets and serve waits on, so Stop runs RAII teardown (today TerminateProcess skips Drop → the virtual monitor lingers, the documented stale-monitor gotcha); TerminateProcess only as a bounded fallback.
  • Packaging/CI: keep the thin-.iss / fat-binary model; add a punktfunk-host web install/uninstall subcommand to absorb the web-setup PowerShell. Build + sign the unified driver workspace in CI from source (or a CI guard that fails on stale-vendored-DLL / un-bumped DriverVer) so the driver can't silently drift from its source. Mint the fresh pf-vdisplay GUID coordinated across host + driver + INF. Single source of truth for version → build + ISCC AppVersion + INF DriverVer. Investigate retiring nefconc by creating the ROOT devnode via SwDevice/CM in Rust. Keep the devgen-never / nefconc-only and DriverVer-bump gotchas codified.

8. Unsafe-reduction program (run at port time, not as a separate pass)

  • P0 lints first (a few lines, before new code): #![deny(unsafe_op_in_unsafe_fn)] (host crate has none today; the driver workspace already has it), #![warn(clippy::undocumented_unsafe_blocks)], #![warn(clippy::multiple_unsafe_ops_per_block)]. Generated bindings keep their opt-out.
  • P0 std handle ownership: std::os::windows::io::OwnedHandle / std::fs::File::from_raw_handle everywhere a raw HANDLE/isize is held (events/jobs/tokens/sections/pipes). Used in zero host files today — the single biggest cheap win. Deletes the bespoke unsafe impl Read/Write/Drop (HandleReader), the never-closed sudovda control handle, the AtomicIsize HANDLE globals, ~6 manual CloseHandle sites — and fixes real leaks.
  • P0 the proto crate (§3.1) — kills the shared-memory pointer-cast unsafe.
  • P1 typed wrappers: windows/d3d/ (most COM calls already return Result; per-frame loop bodies become unsafe-free, the irreducible keyed-mutex/from_raw_parts lands in one frame_xfer fn); nvenc_sys + RAII ffmpeg (§5); one windows/process.rs (§7); collapse the 21 unsafe impl Send onto one audited SendPtr<T>/ThreadBound<T> (directly de-risks the NVENC in-place coupling).
  • P2 contain the irreducible: win32u_hook.rs (one install(); scope to secure-DDA or drop), gpu_priority.rs (the D3DKMT transmute), the WDF context-blob macro, the IddCx swap-chain DDI + from_raw_borrowed (wrap in a typed SwapChain guard returning a borrowed AcquiredSurface<'_>). Document a // SAFETY: per residual site.
  • P2 delete unsafe by deleting code: the present_trigger dead diagnostic, the DebugBlock channel, spawn_observer, IDD_PERSIST/open_or_reuse, helpers.rs Sendable<T>, the WGC-open thread-watchdog hack (gone with WGC), the driver file-logging.

Estimated: host ~144→~35, drivers ~227→~60, residual concentrated and auditable. (#![forbid(unsafe)] is impossible for the drivers and the per-frame D3D path — the realistic target is containment.)
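
To make the "one audited SendPtr<T>/ThreadBound<T>" item concrete, here is a minimal sketch of what a `ThreadBound<T>` could look like. The design is an assumption for illustration (the doc names the wrapper but does not specify it): the value may be moved between threads, but it can only be read, and is only destroyed, on the thread that created it.

```rust
use std::mem::ManuallyDrop;
use std::thread::{self, ThreadId};

/// Hypothetical wrapper: the value can travel across threads, but only the
/// creating thread may access or drop it.
pub struct ThreadBound<T> {
    owner: ThreadId,
    value: ManuallyDrop<T>,
}

// SAFETY: every access path (`get`, `Drop`) checks the calling thread against
// `owner`, so `T` is never touched off its owning thread. This is the single
// `unsafe impl Send` that would need auditing.
unsafe impl<T> Send for ThreadBound<T> {}

impl<T> ThreadBound<T> {
    pub fn new(value: T) -> Self {
        Self {
            owner: thread::current().id(),
            value: ManuallyDrop::new(value),
        }
    }

    /// Returns the value only when called on the owning thread.
    pub fn get(&self) -> Option<&T> {
        if thread::current().id() == self.owner {
            Some(&*self.value)
        } else {
            None
        }
    }
}

impl<T> Drop for ThreadBound<T> {
    fn drop(&mut self) {
        if thread::current().id() == self.owner {
            // SAFETY: `drop` runs at most once and we are on the owning thread.
            unsafe { ManuallyDrop::drop(&mut self.value) };
        }
        // Off-thread: leak rather than run T's destructor on the wrong thread.
    }
}
```

Leaking on an off-thread drop is one possible policy; panicking or handing the value back to the owner are alternatives the real wrapper might prefer.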


9. SudoVDA decoupling (mechanical rename + scrub)

vdisplay/sudovda.rs → vdisplay/windows.rs; SudoVdaDisplay → PfVirtualDisplay; scrub "SudoVDA" from all log/error/doc strings across capture.rs/dxgi.rs/wgc*.rs/idd_push.rs/punktfunk1.rs/main.rs/sendinput.rs (141 refs / 15 files). Split the reach-in helpers out of the vdisplay backend (they're display-utility, not virtual-display creation): set_advanced_color, advanced_color_enabled, resolve_gdi_name, isolate/restore_displays_ccd, set_active_mode → windows/display_ccd.rs (collapsing the 4× copy-pasted QueryDisplayConfig preamble into one safe query_active_config()); resolve_render_adapter_luid → windows/adapter.rs. Both vdisplay and capture then depend on these as peers, breaking the circular reach-in. WinCaptureTarget moves to a neutral location (defined in dxgi.rs, constructed in sudovda.rs today). Drop the dual-driver fallback conditionals. Expose HDR/monitor-release as VirtualLease methods (zero unsafe in the session glue).


10. Build plan (greenfield — Decision A)

A from-scratch rebuild of the Windows host against the clean architecture, salvaging the §1 jewels verbatim (the already-clean, already-tested modules: hdr.rs, edid.rs, the inject/proto codecs, the HDR/cursor converters + their self-tests, the GF8 packetizer, the pairing handshake). The old Windows code stays in-tree, untouched, as the reference implementation until the new path reaches parity on glass, then is deleted.

Greenfield-risk mitigation (the survey's strong caveat stands): almost none of this is CI-validatable — the Windows backends + drivers need the RTX box (192.168.1.173) + the build VM, and AMF/QSV have no lab hardware at all. A greenfield rewrite therefore carries real risk of silently dropping a layered bug-fix. Two guardrails are mandatory:

  1. The §1 preservation checklist is a test/assert contract, not prose: each rebuilt module ports its hard-won invariants as unit tests or runtime asserts — RAII teardown order (restore displays before REMOVE), keyed-mutex held only across convert/copy, terminate checked at the swap-chain loop top, magic stamped last, OUT_RING texture rotation under pipeline_depth>1, the NVENC caps-probe downgrade, the SwDeviceCreate identity recipe. A rebuild that drops one fails its own test.
  2. On-glass A/B gates at each milestone below, on the RTX box, against the current shipping build: 1080p60, 5K@240 HDR, reconnect-storm, secure desktop (lock/UAC), multi-pad. Nothing replaces the old path until its A/B passes.
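
As an illustration of guardrail 1, the "restore displays before REMOVE" invariant can be pinned by a test that observes drop order. The sketch below is an assumption about shape only: `DisplayIsolation` and `MonitorHandle` are hypothetical stand-ins, and the real `VirtualLease` will hold different fields. It relies on the language rule that struct fields drop in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical guard: restores the display topology when dropped.
struct DisplayIsolation {
    log: Log,
}

impl Drop for DisplayIsolation {
    fn drop(&mut self) {
        self.log.borrow_mut().push("restore_displays");
    }
}

/// Hypothetical guard: issues REMOVE for the virtual monitor when dropped.
struct MonitorHandle {
    log: Log,
}

impl Drop for MonitorHandle {
    fn drop(&mut self) {
        self.log.borrow_mut().push("remove_monitor");
    }
}

/// Fields drop in declaration order, so the isolation is undone before the
/// monitor is removed. Reordering these fields would break the invariant.
struct VirtualLease {
    _isolation: DisplayIsolation,
    _monitor: MonitorHandle,
}

/// Builds a lease, drops it, and returns the observed teardown sequence.
fn teardown_order() -> Vec<&'static str> {
    let log: Log = Rc::default();
    let lease = VirtualLease {
        _isolation: DisplayIsolation { log: Rc::clone(&log) },
        _monitor: MonitorHandle { log: Rc::clone(&log) },
    };
    drop(lease);
    let order = log.borrow().clone();
    order
}
```

A unit test asserting the returned sequence fails as soon as someone swaps the two fields, which is the "a rebuild that drops one fails its own test" property.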

Build order

  • M0 — Foundations + the /INTEGRITYCHECK answer. Stand up crates/pf-vdisplay-proto (the clean, owned ABI: fresh GUID, the redesigned IOCTL op enum + #[repr(C)] structs, SharedHeader, FrameToken, the gamepad shm layouts, const size-asserts, round-trip tests). Stand up the in-tree packaging/windows/drivers/ workspace on windows-drivers-rs and prove the two hard unknowns: (a) the iddcx wdk-sys subset bindgen+links and a trivial IddCx adapter loads; (b) /INTEGRITYCHECK is solved (§6.2) so a self-signed driver loads under Secure Boot with no hand-patching. Add the P0 lints to the host crate. No host behavior yet.
  • M1 — pf-vdisplay on the new stack, first light. Rebuild the IddCx driver against windows-drivers-rs+iddcx, clean from the start: DeviceContext-owned state (no process-globals), one Monitor identity, EvtCleanupCallback on MonitorContext, the ported Arc<RwLock<T>> context, the EDID + HDR recipe verbatim, the redesigned control plane from the proto crate. (On-glass: ADD → monitor arrives → IDD-push ring attaches → frames flow at 1080p; REMOVE clean.)
  • M2 — IDD-push capture + NVENC, glass-to-glass. New src/windows/ tree: windows/d3d/ typed wrappers, windows/color/ (converters + self-tests), windows/cursor.rs, capture/windows/idd_push.rs consuming the proto ring with a type-level texture-ownership contract (no in-place-encode assumption), encode/windows/{nvenc.rs,nvenc_sys.rs}, vdisplay/windows.rs + windows/display_ccd.rs + windows/adapter.rs. Wire the SessionFactory/SessionPlan (M2 only needs the IDD-push+NVENC plan). (On-glass A/B: 1080p60 + 5K@240 HDR, latency parity with the current build.)
  • M3 — Service, input, audio, secure desktop. windows/process.rs (RAII Token/Event/Job/Child + spawn_as_user + cooperative stop) + windows/service.rs; inject/windows/* on the proto shm + consolidated swdevice.rs; audio/windows/* (zero-unsafe runtime). Confirm IDD-push captures the secure desktop (lock/UAC) and input reaches the streamed session (in-session SendInput, or the input-only agent if needed). (On-glass: full session incl. lock screen + UAC + a real pad.)
  • M4 — Gamepad drivers onto the unified stack. Rebuild pf_dualsense + pf_xusb on windows-drivers-rs in the same workspace, WDF device contexts (true multi-pad), proto shm, WDF_*_CONFIG_INIT, no file logging, no "M0 spike" naming. (On-glass: 2 XInput + 2 DualSense pads, rumble/lightbar/adaptive-trigger round-trip.)
  • M5 — Fallbacks + GameStream + AMF/QSV. Port the demoted WGC + DDA fallback capturers (minimal, win32u hook scoped to the DDA leg); encode/windows/ffmpeg_win/* with RAII FFmpeg + the d3d11va_ffi size-assert (system-readback default; zero-copy experimental); GameStream planes reusing session/pipeline.rs, installer default flipped to secure serve. (On-glass: Moonlight client on the DDA fallback; AMF/QSV stays CI-only.)
  • M6 — Cut over + delete. Flip the default to the new path, run the full A/B matrix, then delete the old dxgi.rs/wgc*/sudovda.rs/punktfunk1.rs Windows monoliths + the bring-up scaffolding (DebugBlock/spawn_observer/observe gate) + the old gamepad driver crates. Single source of truth for version; CI builds+signs all drivers from source.

Milestones are roughly dependency-ordered; M0 is the long pole (the /INTEGRITYCHECK + iddcx proof gates everything else). M5's AMF/QSV cannot be validated without hardware — keep it system-readback-only and clearly experimental.
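
The M0 proto-crate bullet asks for #[repr(C)] structs, const size-asserts and round-trip tests. A minimal sketch of that pattern follows; the struct name, its fields and the 24-byte size are hypothetical and chosen only to show the technique, not taken from pf-vdisplay-proto. The alignment assert assumes a 64-bit target.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical ADD request; the real field set lives in pf-vdisplay-proto.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct AddMonitorRequest {
    pub session_id: u64,
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
    pub hdr: u32,
}

pub const ADD_MONITOR_REQUEST_SIZE: usize = 24;

// Compile-time layout guards: a field change that alters the ABI fails the build.
const _: () = assert!(size_of::<AddMonitorRequest>() == ADD_MONITOR_REQUEST_SIZE);
const _: () = assert!(align_of::<AddMonitorRequest>() == 8);

fn le_u32(bytes: &[u8]) -> u32 {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&bytes[..4]);
    u32::from_le_bytes(raw)
}

fn le_u64(bytes: &[u8]) -> u64 {
    let mut raw = [0u8; 8];
    raw.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(raw)
}

impl AddMonitorRequest {
    /// Serialize field by field (little-endian), with no pointer casts.
    pub fn to_bytes(&self) -> [u8; ADD_MONITOR_REQUEST_SIZE] {
        let mut out = [0u8; ADD_MONITOR_REQUEST_SIZE];
        out[0..8].copy_from_slice(&self.session_id.to_le_bytes());
        out[8..12].copy_from_slice(&self.width.to_le_bytes());
        out[12..16].copy_from_slice(&self.height.to_le_bytes());
        out[16..20].copy_from_slice(&self.refresh_hz.to_le_bytes());
        out[20..24].copy_from_slice(&self.hdr.to_le_bytes());
        out
    }

    /// Inverse of `to_bytes`.
    pub fn from_bytes(bytes: &[u8; ADD_MONITOR_REQUEST_SIZE]) -> Self {
        Self {
            session_id: le_u64(&bytes[0..8]),
            width: le_u32(&bytes[8..12]),
            height: le_u32(&bytes[12..16]),
            refresh_hz: le_u32(&bytes[16..20]),
            hdr: le_u32(&bytes[20..24]),
        }
    }
}
```

Encoding through explicit byte arrays keeps the codec free of `unsafe`, which lines up with the §8 goal of removing the shared-memory pointer casts.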


11. Decisions (resolved 2026-06-24) + open verification items

The five product forks are decided (see the table in §0): A greenfield; B IDD-push primary for everything incl. secure desktop, WGC+DDA kept as demoted fallbacks; C extend windows-drivers-rs + solve /INTEGRITYCHECK; D keep GameStream, default secure. On E (concurrent sessions): fix the driver swap-chain lifecycle regardless (it removes the leak + the preempt dance); treat true max_concurrent>1 on Windows as a follow-on once clean reuse is proven on glass.

What remains are technical unknowns to confirm on the RTX box (not user decisions):

  • /INTEGRITYCHECK resolution path (M0 long pole). Can wdk-build suppress /INTEGRITYCHECK via config/link-args (preferred), or must we keep a deterministic CI post-link bit-clear? Decides the signing story for all three drivers.
  • iddcx subset on wdk-sys. Does the bindgen+IddCxStub link cleanly, and does the SDK-resolution fix need backporting? (windows-drivers-rs doesn't exercise IddCx today.)
  • Driver swap-chain reuse. Does the clean ownership model (EvtCleanupCallback + DeviceContext state + single Monitor identity) actually fix the "reused swap-chain dies after ~2 sessions" root cause? If not, the residual serialization stays inside VirtualDisplayManager.
  • IDD-push input + secure desktop. Confirm serve runs in the console session so SendInput reaches the streamed desktop (a code comment warns about Session 0→1); confirm IDD-push frames flow through the lock screen / UAC (owner reports yes — verify and lock it in as the primary, demoting the DDA secure leg to fallback).
  • Does the demoted DDA fallback still need the win32u hook against pf-vdisplay, or was that purely a SudoVDA/hybrid pathology? If unneeded, the self-modifying-code hook can be deleted entirely.
  • AMF/QSV stays CI-only (no hardware) — system-readback default, zero-copy experimental.

12. Risks

  • Greenfield with no CI (the dominant risk). The build VM is headless/WARP; the WinUI/hardware/driver paths need the RTX box, and AMF/QSV have no hardware. A from-scratch rebuild can silently drop a layered bug-fix. Mitigation: the §1 preservation checklist is a test/assert contract per rebuilt module; on-glass A/B gates the new path before the old one is deleted (M6); keep the old code in-tree as the reference until parity.
  • /INTEGRITYCHECK (M0 long pole). Choosing windows-drivers-rs means self-signed loading depends on solving it cleanly (§6.2). If neither linker-flag suppression nor a deterministic CI post-link step works, drivers can't load self-signed — prove this first, it gates everything.
  • iddcx on wdk-sys is new surface (windows-drivers-rs doesn't bind IddCx). Bounded (IddCxStub exports + ~15 wrappers, with the validated wdf-umdf/iddcx.rs as oracle) but unproven on this stack — M0 must light it.
  • pf-vdisplay-proto spans two cargo build graphs (host workspace + the driver workspace). Validate the path-dep resolves on the Windows build env in M0; pin wdk* provenance so the driver workspace builds reproducibly in CI.
  • Driver swap-chain-reuse root cause still undiagnosed. The clean ownership model should fix it; if not, residual serialization stays inside VirtualDisplayManager and max_concurrent>1 stays blocked. Keep await_released on the trait until reuse is proven on glass.
  • NVENC in-place encode + pipeline_depth>1 is a latent corruption risk; the M2 texture-ownership contract must be type-level (not the synchronous-loop assumption). Verify the ring on glass.
  • Host/driver version drift in the field. New host + new driver are always built together (greenfield), but the installer bundles both — enforce a startup version handshake (proto version in both binaries) and a CI guarantee they're built from the same revision.
  • Big-bang cutover (M6). Flipping the default and deleting the old monoliths is the riskiest moment; it is gated on the full A/B matrix passing, and the old code is recoverable from git if a regression surfaces post-cutover.
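
The startup version handshake named in the version-drift risk could look roughly like the following. The constant value, the error type and the function name are all hypothetical; the only thing taken from the plan is that both binaries carry a proto version and the host refuses to proceed on a mismatch.

```rust
/// Hypothetical protocol version compiled into both host and driver.
pub const PROTOCOL_VERSION: u32 = 1;

#[derive(Debug, PartialEq, Eq)]
pub enum HandshakeError {
    VersionMismatch { host: u32, driver: u32 },
}

/// Host-side check on the version the driver reports at startup.
pub fn check_handshake(driver_version: u32) -> Result<(), HandshakeError> {
    if driver_version == PROTOCOL_VERSION {
        Ok(())
    } else {
        Err(HandshakeError::VersionMismatch {
            host: PROTOCOL_VERSION,
            driver: driver_version,
        })
    }
}
```

Requiring strict equality is the simplest policy and fits the "always built together" assumption; a compatibility range would be a separate design decision.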

13. Progress log + M1 IddCx-binding recipe (2026-06-24)

M0 COMPLETE (commits through f896f70, on main, CI-green + validated on the RTX box):

  • crates/pf-vdisplay-proto — owned host↔driver ABI (fresh GUID, typed IOCTLs + frame transport, const size-asserts). Green Linux + MSVC.
  • Runner and RTX box provisioned: WDK 26100 (WDF 2.31, IddCx 1.10), LLVM 21.1.2 (the runner's default was a ToT/22-dev build → wdk-sys bindgen E0080 layout-test overflow; 21.1.2 builds clean — windows-drivers-rs discussion #591). cargo-wdk on the runner.
  • packaging/windows/drivers/ — unified driver workspace on windows-drivers-rs; wdk-probe (minimal UMDF) builds clean end-to-end (bindgen + WDF link + static-CRT .cargo/config + pf-vdisplay-proto path-dep). Build layers solved: in-tree target dir (wdk-build walks OUT_DIR ancestors for Cargo.lock); [workspace.metadata.wdk.driver-model] = UMDF 2.31; target-feature=+crt-static w/ explicit target; Version_Number=10.0.26100.0; LIBCLANG_PATH → LLVM 21.1.2.
  • /INTEGRITYCHECK resolved: wdk-build sets it unconditionally (no opt-out) → packaging/windows/clear-force-integrity.ps1 clears the PE FORCE_INTEGRITY bit (0x0080 @ e_lfanew+0x5e) post-link, before signing. Proven 0x01E0→0x0160 on CI and in PS 5.1 on the box. Self-signed UMDF load itself is already proven on the box (the gamepad drivers).
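
For readers who want to see the bit-clear spelled out, here is a Rust sketch of the same operation on an in-memory image. It is not the shipped PowerShell script: the function and error names are hypothetical, and it deliberately does not touch the PE checksum. It uses only the facts stated above (DllCharacteristics at e_lfanew+0x5E, FORCE_INTEGRITY = 0x0080) plus the standard location of e_lfanew at offset 0x3C.

```rust
const E_LFANEW_OFFSET: usize = 0x3C;
const DLL_CHARACTERISTICS_OFFSET: usize = 0x5E; // relative to the PE signature
const FORCE_INTEGRITY: u16 = 0x0080;

#[derive(Debug, PartialEq, Eq)]
pub enum PeError {
    Truncated,
    BadSignature,
}

/// Clears FORCE_INTEGRITY in place and returns (old, new) DllCharacteristics.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), PeError> {
    if image.len() < E_LFANEW_OFFSET + 4 {
        return Err(PeError::Truncated);
    }
    if !image.starts_with(b"MZ") {
        return Err(PeError::BadSignature);
    }

    // e_lfanew: little-endian u32 file offset of the "PE\0\0" signature.
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&image[E_LFANEW_OFFSET..E_LFANEW_OFFSET + 4]);
    let pe = u32::from_le_bytes(raw) as usize;

    let end = pe
        .checked_add(DLL_CHARACTERISTICS_OFFSET + 2)
        .ok_or(PeError::Truncated)?;
    if end > image.len() {
        return Err(PeError::Truncated);
    }
    if !image[pe..].starts_with(b"PE\0\0") {
        return Err(PeError::BadSignature);
    }

    let field = end - 2;
    let old = u16::from_le_bytes([image[field], image[field + 1]]);
    let new = old & !FORCE_INTEGRITY;
    image[field..field + 2].copy_from_slice(&new.to_le_bytes());
    Ok((old, new))
}
```

On an image whose DllCharacteristics is 0x01E0, this yields 0x0160, matching the value transition recorded for CI and the box.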

RTX box (ssh "Enrico Bühler"@192.168.1.173, ENRICOS-DESKTOP, RTX 4090 driver 610.62, PS 5.1 shell): ephemeral — boots to Proxmox on reboot, so unreachable after a reboot. Treat as opportunistic on-glass (driver load + IDD-push streaming) only; CI on the windows-amd64 runner is the persistent validator. A build clone is at C:\Users\Public\pf-rewrite; builds the driver in ~29 s with the box's LLVM 21.1.2.

M1 — IddCx binding on windows-drivers-rs (the recipe)

IddCx DDIs are function-table dispatched (IddFunctions[] indexed by IDDFUNCENUM::<Name>TableIndex, IddDriverGlobals implicit first arg) — exactly the model wdk-sys already implements for WDF (not direct IddCxStub exports as first assumed).
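
The table-dispatch model can be illustrated with a self-contained mock. Nothing below is the real wdk-sys or IddCx surface: `DriverGlobals`, `DdiTable`, the index constant and the DDI signature are hypothetical, and the table has two entries only to show indexing. The point is the shape: an untyped function table, a per-DDI index, a cast back to the typed signature, and the globals pointer passed as the first argument.

```rust
use std::mem::transmute;

/// Stand-in for the implicit driver-globals first argument.
#[repr(C)]
pub struct DriverGlobals {
    pub calls: u32,
}

/// Untyped entry, as stored in the function table.
type GenericFn = unsafe extern "C" fn();
/// Typed signature of one hypothetical DDI: (globals, monitor) -> NTSTATUS.
type MonitorArrivalFn = unsafe extern "C" fn(*mut DriverGlobals, u64) -> i32;

pub const MONITOR_ARRIVAL_TABLE_INDEX: usize = 1;
const STATUS_INVALID_PARAMETER: i32 = 0xC000_000Du32 as i32;

unsafe extern "C" fn reserved_slot() {}

unsafe extern "C" fn monitor_arrival_impl(globals: *mut DriverGlobals, monitor: u64) -> i32 {
    // SAFETY: `DdiTable::monitor_arrival` always passes a valid, exclusive pointer.
    unsafe { (*globals).calls += 1 };
    if monitor == 0 { STATUS_INVALID_PARAMETER } else { 0 }
}

/// Owns the table so that entries are only ever cast back to their true type.
pub struct DdiTable([GenericFn; 2]);

impl DdiTable {
    pub fn new() -> Self {
        let reserved: GenericFn = reserved_slot;
        let typed: MonitorArrivalFn = monitor_arrival_impl;
        // SAFETY: fn pointers have the same size; the entry is never called untyped.
        let arrival = unsafe { transmute::<MonitorArrivalFn, GenericFn>(typed) };
        DdiTable([reserved, arrival])
    }

    /// Safe wrapper: looks up the entry by index and supplies the globals.
    pub fn monitor_arrival(&self, globals: &mut DriverGlobals, monitor: u64) -> i32 {
        let raw = self.0[MONITOR_ARRIVAL_TABLE_INDEX];
        // SAFETY: `new` stored a `MonitorArrivalFn` at this index.
        let typed = unsafe { transmute::<GenericFn, MonitorArrivalFn>(raw) };
        // SAFETY: `globals` is a live exclusive reference for the whole call.
        unsafe { typed(globals as *mut DriverGlobals, monitor) }
    }
}

/// NTSTATUS is a plain i32; success means non-negative.
pub fn nt_success(status: i32) -> bool {
    status >= 0
}
```

Keeping the table private to one type is what lets the wrapper be a safe function; that is the same containment idea §8 applies to the real DDI wrappers.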

Approach (Option 1, recommended): vendor windows-drivers-rs 0.5.1 in-tree (pinned; source staged at scratchpad/wdr, commit 0e3499d), patched via [patch.crates-io] for just wdk-build + wdk-sys, and add a first-class ApiSubset::Iddcx that bindgens iddcx/1.10/IddCx.h in an extra pass reusing the identical bindgen::Builder::wdk_default(config) baseline (so its WDF/DXGI types resolve to, not redefine, wdk-sys's — type identity by construction). This mirrors wdk-sys's existing gpio/hid/spb/usb versioned-subpath subsets exactly.

  • wdk-build: add ApiSubset::Iddcx, a headers match arm, iddcx_headers() -> ["iddcx/1.10/IddCx.h"] (UMDF-only).
  • wdk-sys build.rs: generate_iddcx as a copy of generate_gpio, with bindgen_header_contents([Base, Wdf, Iddcx]), (TYPES|VARS).complement(), .allowlist_file("(?i).*iddcx.*"); behind an iddcx feature; add to ENABLED_API_SUBSETS; pub mod iddcx in lib.rs.
  • A wdk-iddcx wrapper crate (port of wdf-umdf/src/iddcx.rs): table dispatch via wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize (ModuleConsts const, not the oracle's NewType .0); NTSTATUS is plain i32 in wdk-sys (use wdk_sys::NT_SUCCESS, drop the oracle's newtype).
  • Driver build.rs: add link-search to Lib/<sdk>/um/<arch>/iddcx/1.10 (the SDK version that contains iddcx — glob, don't trust max) + static=IddCxStub; hand-declare #[no_mangle] pub static IddMinimumVersionRequired: ULONG = 4;; keep the FORCE_INTEGRITY clear.

Make-or-break — RESOLVED (CI-green @ 6d8c7a5, run 5548, no fallback). IddCx.h bindgens AND the generated module compiles inside wdk-sys with WDF type-identity; the #515/#516 header conflict NEVER materialized. Vendored the published windows-drivers-rs 0.5.1 crates (wdk-build + wdk-sys) under packaging/windows/drivers/vendor/, [patch.crates-io]'d. The six knobs generate_iddcx actually needed (each a real gotcha, all CI-proven; the recipe above was close but the codegen/scope details differed):

  1. --language=c++: wdk_default parses C; IddCx.h's IDARG_* typedefs need C++ or you get a "must use 'struct' tag" cascade (verified by direct clang on the box: 0 errors as C++, fails as C).
  2. -DIDD_STUB — table-dispatch mode; skips IddCxFuncEnum.h's #error IDDCX_VERSION_MAJOR is not defined (it lives inside #ifndef IDD_STUB). Do NOT add WDF_STUB — wdk-sys parses wdf.h non-stubbed, and stubbing it only here would desync the shared WDF types (breaking type-identity).
  3. allowlist_recursively(false) + allowlist_file("(?i).*iddcx.*"), full codegen (no .complement()) — emit ONLY IddCx items; WDF/Win types resolve to wdk-sys's via use crate::types::* in src/iddcx.rs. No giant blocklist (Option 2 avoided).
  4. allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE") — emit the non-WDF types wdk-sys doesn't bindgen, locally (absent from crate::types, so non-conflicting). The _? is load-bearing: typedef struct _OPM_X {} OPM_X needs the tag AND the alias (recursively(false) won't pull the tag from the typedef).
  5. pub type UINT = ::core::ffi::c_uint; in src/iddcx.rs: UINT (unsigned int) is absent from crate::types; covers the top-level struct-field uses.
  6. translate_enum_integer_types(true) — C++ parsing kept UINT as the underlying repr of the DXGI/OPM ModuleConsts enums (pub mod _X { pub type Type = UINT; }), and nested modules can't see the parent UINT. This emits native u32 reprs → self-contained enum modules.

The wrapper note still holds: table dispatch via wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize (ModuleConsts const, not the oracle's NewType .0); NTSTATUS = plain i32 (wdk_sys::NT_SUCCESS). Driver build.rs will add the IddCxStub link-search + IddMinimumVersionRequired + keep the FORCE_INTEGRITY clear. Option 2 stays rejected; the wdf-umdf-sys fallback is unneeded.

NEXT (M1 cont.): port the full ~30-DDI / ~40-struct surface (incl. the HDR *2 DDIs) + the swap-chain processor + frame transport, with the clean ownership model (DeviceContext-owned state, EvtCleanupCallback on MonitorContext, single Monitor identity, the owned pf-vdisplay-proto plane). First gate: a probe linking IddCxStub and calling IddCxDeviceInitConfig/…Initialize/ …AdapterInitAsync (CI = compile+link). On-glass load + IDD-push stream needs the RTX box (ephemeral — currently down/Proxmox).


14. M1 step 2 — pf-vdisplay driver port plan (2026-06-24, workflow-mapped + critiqued)

Status of the binding (DONE, CI-green): the wdk-sys iddcx binding is proven complete for the whole driver, not just init. wdk-probe/src/iddcx_surface_assert.rs (commit ae803b2) CI-asserts every *2/HDR struct (IDDCX_TARGET_MODE2/PATH2/METADATA2, IDARG_*RELEASEANDACQUIREBUFFER2 — which embed DISPLAYCONFIG_*/LUID, both of which resolve from crate::types — no allowlist gap), all 14 inbound PFN_IDD_CX_* callbacks, the .Size machinery (IddStructures/IddStructureCount/ IddClientVersionHigherThanFramework/_IDDSTRUCTENUM::INDEX_* — so IDD_STRUCTURE_SIZE! is portable), and IDDCX_ADAPTER_FLAGS::…CAN_PROCESS_FP16 + IDDCX_TARGET_CAPS::…HIGH_COLOR_SPACE. ModuleConsts module naming: the func/struct enums are _IDDFUNCENUM/_IDDSTRUCTENUM (underscored tag), but the flag/cap enums are IDDCX_ADAPTER_FLAGS/IDDCX_TARGET_CAPS (no underscore).

DDIs to wrap (11 — graduate wdk-probe/src/iddcx_rt.rs → a wdk-iddcx crate)

DeviceInitConfig, DeviceInitialize, AdapterInitAsync (done), MonitorCreate, MonitorArrival, MonitorDeparture, AdapterSetRenderAdapter, SwapChainSetDevice (other_is_error; 0x887A0026→retry), SwapChainReleaseAndAcquireBuffer2 (HDR variant only; other_is_error; E_PENDING 0x8000000A → wait on the surface event), SwapChainFinishedProcessingFrame. Drop the v1 ReleaseAndAcquireBuffer (adapter always sets FP16). Defer the hardware-cursor DDIs (cursor baked into video).
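
The "0x887A0026→retry" rule for SwapChainSetDevice amounts to a bounded retry loop. The sketch below models it with a closure in place of the DDI call, so it runs anywhere; `set_device_with_retry` is a hypothetical name, and the 60 attempts at 50 ms quoted in the checklist would be passed in as parameters rather than hard-coded.

```rust
use std::thread;
use std::time::Duration;

/// HRESULT 0x887A0026, reinterpreted as the signed value the DDI returns.
pub const DXGI_ERROR_ACCESS_LOST: i32 = 0x887A_0026u32 as i32;

/// Calls `attempt` until it succeeds, retrying only on ACCESS_LOST.
/// Returns Ok(number of attempts used) or Err(last failing status).
pub fn set_device_with_retry<F>(
    mut attempt: F,
    max_attempts: u32,
    delay: Duration,
) -> Result<u32, i32>
where
    F: FnMut() -> i32,
{
    let mut last = DXGI_ERROR_ACCESS_LOST;
    for n in 1..=max_attempts {
        last = attempt();
        if last >= 0 {
            return Ok(n); // success status
        }
        if last != DXGI_ERROR_ACCESS_LOST {
            return Err(last); // any other failure is a hard error
        }
        if n < max_attempts {
            thread::sleep(delay);
        }
    }
    Err(last)
}
```

Treating every status other than ACCESS_LOST as fatal mirrors the other_is_error note in the DDI list.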

Callbacks (15 in IDD_CX_CLIENT_CONFIG; *2 mandatory because FP16)

parse_monitor_description (+2), monitor_query_target_modes (+2), adapter_commit_modes (+2), adapter_init_finished (stash IDDCX_ADAPTER + start watchdog), monitor_get_default_modes (→NOT_IMPLEMENTED, we always carry EDID), query_target_info (→HIGH_COLOR_SPACE), set_gamma_ramp (accept-stub — WITHOUT it the adapter fails to init), set_default_hdr_metadata (accept-stub); the last three are mandatory under FP16. Then assign_swap_chain, unassign_swap_chain, device_io_control (the pf-vdisplay-proto control plane). Plus EvtDeviceD0Entry (adapter created HERE, not in DeviceAdd) and two EvtCleanupCallbacks.

State model (the rewrite's core change)

DeviceContext OWNS all state — IDDCX_ADAPTER, session_id-keyed monitor map, watchdog, the per-render-LUID Direct3DDevice pool — replacing the oracle's process globals. Reachable from BOTH the WDFDEVICE (strong) and the IDDCX_ADAPTER object (the adapter-side callbacks need it). MonitorContext owns the SwapChainProcessor + target_id; wire EvtCleanupCallback on the IDDCX_MONITOR object so RAII Drop joins the worker thread + frees D3D (the oracle lacked this → the dominant reconnect leak). Single Monitor identity keyed by session_id (collapses the oracle's 3-way EDID-serial/map/stamp desync that caused the target_id=0 recreate bug); assign_swap_chain reads target_id from the context, never a map lookup. The HOST still owns the control-device handle, the linger/reuse state machine, and ALL Global\ shared objects (created D:(A;;GA;;;WD)); the driver only OPENS them.

Frame transport (single-source on pf_vdisplay_proto::frame::*)

Acquire via ReleaseAndAcquireBuffer2{AcquireSystemMemoryBuffer=0} → GPU ID3D11Texture2D; borrow out.MetaData.pSurface with IDXGIResource::from_raw_borrowed (do NOT steal IddCx's refcount), publish BEFORE FinishedProcessingFrame. Ring = RING_LEN(6) keyed-mutex shared textures opened by name (frame::{header_name,event_name,texture_name}); per-frame: GetDesc format-guard (drop on FP16↔BGRA mismatch), AcquireSync(0,0ms), CopyResource, ReleaseSync(0), store FrameToken{gen,seq,slot}.pack() (Release), SetEvent. All-slots-busy → drop, never block. is_stale() (header.generation Acquire) → reattach on host ring recreate. Write DRV_STATUS_OPENED + render LUID into the header. Drop the old DebugBlock + the locally-duplicated header/MAGIC/name consts.
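
The token hand-off at the end of the per-frame sequence can be sketched without any D3D. The bit layout below is an assumption made for the example (16-bit generation, 32-bit seq, 8-bit slot), as are the function names; the real FrameToken layout is defined in pf_vdisplay_proto::frame. What the sketch does take from the text is the Release store by the publisher, the Acquire load by the reader, and "all slots busy means drop, never block".

```rust
use std::sync::atomic::{AtomicU64, Ordering};

pub const RING_LEN: usize = 6;

/// Hypothetical layout: generation in bits 40..56, seq in bits 8..40, slot in bits 0..8.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u16,
    pub seq: u32,
    pub slot: u8,
}

impl FrameToken {
    pub fn pack(self) -> u64 {
        ((self.generation as u64) << 40) | ((self.seq as u64) << 8) | self.slot as u64
    }

    pub fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 40) as u16,
            seq: (raw >> 8) as u32,
            slot: raw as u8,
        }
    }
}

/// Driver side: publish after the copy so the reader sees a finished slot.
pub fn publish(latest: &AtomicU64, token: FrameToken) {
    latest.store(token.pack(), Ordering::Release);
}

/// Host side: pairs with the Release store above.
pub fn read_latest(latest: &AtomicU64) -> FrameToken {
    FrameToken::unpack(latest.load(Ordering::Acquire))
}

/// Returns the first free slot, or None when every slot is busy (drop the frame).
pub fn pick_free_slot(busy: &[bool; RING_LEN]) -> Option<u8> {
    busy.iter().position(|b| !*b).map(|i| i as u8)
}
```

Packing the three fields into one 64-bit word is what allows a single atomic store to publish a consistent (generation, seq, slot) triple.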

Implementation checklist (each step CI- or box-gated)

  0. workspace pf-vdisplay(cdylib)+wdk-iddcx members — STEP-0 gate must pull in std::thread+OwnedHandle (critique: prove std links under the UMDF toolchain here, not at STEP 5). CI.
  1. graduate iddcx_rt.rs → wdk-iddcx (11 DDIs + is_nt_error/other_is_error) + re-export the inbound PFN types. CI link. Step 1.5 (critique add): the surface-assert (DONE @ ae803b2) lives on so the full PFN/*2/DISPLAYCONFIG surface stays a CI gate.
  2. DriverEntry + driver_add: full IDD_CX_CLIENT_CONFIG (15 callbacks as stubs) + DeviceInitConfig + WdfDeviceCreate(+cleanup) + CreateDeviceInterface(PF_VDISPLAY_INTERFACE_GUID) + DeviceInitialize + D0Entry stub; salvage edid.rs verbatim. Resolve .Size via IDD_STRUCTURE_SIZE! (machinery confirmed present). CI link + FORCE_INTEGRITY clear.
  3. DeviceContext + WDF_DECLARE_CONTEXT_TYPE Arc blob; init_adapter in D0Entry (caps+FP16) → AdapterInitAsync; the *2 mode DDIs + query_target_info + gamma/hdr accept-stubs. Box gate: loads under Secure Boot, enumerates as IddCx adapter, Status OK (no "Failed to get adapter").
  4. control plane (GET_INFO version handshake — host MUST assert protocol_version, ADD/REMOVE/SET_RENDER_ADAPTER/PING/CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + MONITOR_OP_LOCK; switch host sudovda.rs/idd_push.rs to pf_vdisplay_proto (GUID e5bcc234→70667664, IOCTL 0x800→0x900, GUID-key→session_id) — lockstep. CI (host build) + box (monitor appears at WxH@Hz).
  5. Direct3DDevice + assign/unassign + SwapChainProcessor (worker thread, SetDevice 60×@50ms single-borrow retry, top-of-loop terminate, Buffer2 acquire, from_raw_borrowed) WITHOUT publisher; wire monitor EvtCleanupCallback. Box: swap-chain assigns, acquire loop runs, RAII teardown (no thread/VRAM leak). Critique: instrument that MonitorContext::Drop actually RAN; if the monitor-object cleanup callback does not fire, keep the oracle's explicit free-before-departure path as the fallback.
  6. FramePublisher on pf_vdisplay_proto::frame::* + keyed-mutex RAII guard + OwnedHandle/ShmView; wire into run_core. Box: full IDD-push glass-to-glass, A/B vs the shipping driver. Critique: add a BLOCKING secure-desktop gate here — lock (Win+L)+UAC with serve in the console session / driver in Session 0, confirm frames keep flowing AND input reaches the desktop; until it passes, do NOT delete the WGC-relay/DDA secure path.
  7. HDR ring-recreate + repeated session recreate (confirm the recreate-crash is gone). Critique: define the failure branch — if recreate isn't stable, keep IDD_PERSIST + state that mid-stream Reconfigure stays unsupported on Windows IDD-push (host rejects, as today) rather than crashing; keep max_concurrent=1. Specify the concurrent-monitor D3D model before enabling >1 (two worker threads must not share one SINGLETHREADED immediate context — give each monitor its own device or a deferred/multithreaded context).
  8. unsafe-reduction pass (one audited SendPtr/ThreadBound; per-site // SAFETY; AcquiredSurface<'_> + KeyedMutexGuard RAII so the hot loop has zero raw Finish/ReleaseSync) + delete the old packaging/windows/vdisplay-driver/ tree only after the secure-desktop gate (step 6) passes. CI clippy -D warnings + final box A/B.

Critique verdict + the big risk

Plan is implementation-ready once the 4 CI-checkable unknowns are gates (3 now resolved by the surface-assert + .Size machinery presence; std-under-UMDF is the STEP-0 gate). SINGLE BIGGEST RISK: the secure-desktop claim — the plan retires the proven two-process WGC relay + DDA on the unproven assertion that one IddPushCapturer captures the lock/UAC secure desktop directly (IDD-push is opt-in today behind PUNKTFUNK_IDD_PUSH). Make it a blocking on-glass gate (step 6) and keep the WGC relay recoverable for one release. Other defined-failure-branch items: monitor EvtCleanupCallback firing, IDD_PERSIST/Reconfigure, concurrent-monitor device sharing, host↔driver protocol_version lockstep.

15. Current status (2026-06-25)

The rewrite is largely implemented. The new all-Rust pf-vdisplay driver (the M0 long pole — iddcx on windows-drivers-rs + /INTEGRITYCHECK — and the §14 STEP 0–8 port) landed on main, on-glass HDR validated, and the host was decomposed into the clean layered architecture. One important deviation from the plan: the host was refactored in place via a staged, behavior-preserving plan (windows-host-goal1-plan.md), not greenfield-rebuilt — the §10 "rebuild fresh, keep old as reference" framing was superseded because staging preserved the live-validated host at every step (lower regression risk than a big-bang M2 rebuild). The §2.3/§2.4/§2.5 design (seam traits, SessionPlan/SessionFactory/SessionContext, the VirtualDisplayManager ownership model) is realized in that branch's commits, not the M2 greenfield tree the build order imagined.

Milestone / step status

| Item | Status | Evidence |
| --- | --- | --- |
| M0 — proto crate, driver workspace, iddcx binding, /INTEGRITYCHECK | DONE | pf-vdisplay-proto; packaging/windows/drivers/; clear-force-integrity.ps1; CI-green (§13) |
| §14 STEP 0–8 — pf-vdisplay driver port (device→adapter→control→swap-chain→frame transport→HDR→.inx→unsafe pass) | DONE | d7a9fbf–cd59151; on-glass HDR (6399d28: "Mac connects WITH HDR") |
| M1/M2 — IDD-push capture + NVENC glass-to-glass | DONE | new driver tree + the existing host IDD-push path; 5K@240 HDR zero-copy on-glass |
| §2.5 — ownership-model rewrite (VirtualDisplayManager/MonitorLease); swap-chain-reuse / monitor-leak | DONE / RESOLVED | windows-host-goal1 §2.5 (1520201–683c81b); reconnect-leak A/B: 0 leaked monitors |
| Goal-1 host refactor (the in-place §2.2–2.5 realization, incl. EncoderCaps) | DONE | windows-host-goal1 branch — all 6 stages + §2.5 + 3 seam tightenings |
| Game-capture bug (GB1) — fullscreen game breaks IDD-push | FIXED | c87bfe0/f98ab07/789ad49; see game-capture-bug.md |
| M3 — service / input / audio cleanup | 🟡 | code present (largely via the existing host + goal1) |
| M4 — gamepad drivers (pf_dualsense/pf_xusb) onto the unified stack, WDF device contexts (true multi-pad) | NOT STARTED | old gamepad-driver crates still separate |
| M5 — demoted WGC/DDA fallback port + GameStream-on-session/pipeline + AMF/QSV (no hw) | 🟡 PARTIAL | fallbacks exist; not re-shaped onto the new seams |
| M6 — cut over + delete the old monoliths | 🟡 PARTIAL | old vdisplay-driver/ tree deleted (a2bd0cd); host monoliths remain |

What genuinely remains

  1. Secure-desktop on-glass gate (the single biggest open risk, §14 STEP 6 critique). IDD-push capturing the lock screen / UAC with serve in the console session is asserted, not yet locked on glass. Until it passes, keep the WGC-relay / secure-DDA path recoverable. Hardware-gated (RTX box; ephemeral).
  2. M4 — gamepad-driver migration onto windows-drivers-rs (WDF device contexts → true multi-pad). The proven recipe exists; ~2–3 days, hardware-gated.
  3. M5/M6 cleanup — re-shape the WGC/DDA fallback + GameStream onto session/pipeline, then delete the old Windows monoliths. Low priority; AMF/QSV stays CI-only (no lab hw).
  4. pf-vdisplay driver slot reclaim — sustained ADD/REMOVE churn wedges the driver (ADD → 0x80070490 ERROR_NOT_FOUND): it doesn't reclaim IddCx monitor slots on REMOVE (ghost nodes accumulate). Recovery today is packaging/windows/reset-pf-vdisplay.ps1; the real fix is in the driver (control.rs/adapter.rs). Dev helpers reset-pf-vdisplay.ps1 + redeploy-pf-vdisplay.ps1 are committed.

Resolved since the original §11 open items

  • Driver swap-chain reuse — the clean ownership model (EvtCleanupCallback + DeviceContext-owned state + single Monitor identity) is in; §2.5's reconnect-leak A/B shows 0 leaked active monitors. The per-frame CURRENT_MON_GEN "monitor-gen bail" turned out to have been write-only (never wired), so the "carry the gen through WinCaptureTarget" item was dropped; the gen lives on the manager + lease only.
  • /INTEGRITYCHECK + iddcx on wdk-sys — both proven CI-green (§13).

Box reminder: the RTX box (ssh "Enrico Bühler"@…) is ephemeral (boots to Proxmox on reboot; IP floats on DHCP — has been .173/.158); the windows-amd64 CI runner is the persistent validator. On-glass gates are opportunistic.