punktfunk/design/windows-host-rewrite.md
enricobuehler 2c937855b3 fix(packaging/windows): Windows 11 22H2 floor + tray install task + stale console-port fixes
The OS floor is now enforced at install time (MinVersion=10.0.22621 with an
explanatory [Messages] override): pf-vdisplay is built against IddCx 1.10, and
on Windows 10 (incl. LTSC) / Win11 21H2 the device fails start with Code 10
STATUS_DEVICE_POWER_FAILURE (field-reported). Docs (site requirements/install/
windows-host pages + README) state the floor; new docs-site Security page.

Installer also gains the trayicon task (punktfunk-tray.exe file + HKLM Run key,
post-install launch as the signed-in user, upgrade taskkill + uninstall
--quit/taskkill choreography before file deletion), and the wizard/cleanup
text/port sweeps move off the stale :3000 web-console references to :47992
(cleanups sweep both for upgrades from old installs).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 12:09:52 +00:00


Windows Host — Architecture, Status & Roadmap

Single source of truth for the punktfunk Windows streaming host: the all-Rust pf-vdisplay IddCx virtual-display driver + IDD-push zero-copy capture + NVENC/AMF/QSV encode, shipped as a signed Inno Setup installer with a LocalSystem SCM service. Live-validated on the RTX box through 5120×1440@240 HDR, the secure desktop (lock/UAC), and a fullscreen game.

This file is the consolidated Windows-host doc — it absorbs the rewrite design plan, the Goal-1 staged-refactor plan, the audit + remediation tracker, the fullscreen-game capture-bug analysis, and the durable rationale from the original windows-host.md implementation plan (now a stub). Last updated 2026-06-26. All of this work is merged to main (the windows-host-goal1 branch landed at 3e7c9bd).


1. Status at a glance

Goals 1–3 and milestones M0–M4 are complete and merged to main. The host has a clean, typed, layered architecture (HostConfig → SessionPlan → SessionContext, windows/+linux/ confinement, a single VirtualDisplayManager ownership model, EncoderCaps); the all-Rust IddCx pf-vdisplay driver loads self-signed under Secure Boot and does IDD-push zero-copy capture at 5K@240 HDR including the secure desktop (Winlogon/UAC/lock); SudoVDA is gone (84a3b95) — pf-vdisplay is the sole virtual-display backend; and the three UMDF drivers (pf-vdisplay, pf-dualsense, pf-xusb) now build from source in one unified packaging/windows/drivers/ workspace (M4, 92e6802). The shipped path (IDD-push + NVENC) is live-validated on glass; the AMF/QSV encode path is CI-green but not yet validated on hardware (no AMD/Intel Windows box in the lab).

Ground the details against the code: crates/punktfunk-host/src/windows/, crates/punktfunk-host/src/{capture,encode,inject,audio,vdisplay}/windows/, and packaging/windows/drivers/.

What remains (all non-blocking): the pf-vdisplay slot-reclaim-on-REMOVE fix needs an on-glass reconnect-storm A/B (§4 P1.3); host-crate unsafe lint hygiene + old-monolith / bring-up-scaffolding cleanup (§4 P2); and the hardware-gated items — AMF/QSV on-glass, hybrid-GPU SET_RENDER_ADAPTER, the WGC/DDA fallback reshape, and true max_concurrent>1 (§4 P3). One framing note: the host was not greenfield-rebuilt — it was refactored in place via a staged, behavior-preserving sequence that kept the live host working at every step; only the driver was rebuilt fresh.


2. Architecture (what is on disk)

A ~1-page map; the empirical constraints these encode are in §3, the deep reference is in §6.

2.1 Layering & crates

  • crates/punktfunk-host — one shared host crate (Linux + Windows; not split). Platform code is confined under per-module windows/+linux/ folders behind #[cfg] seams (capture/{windows,linux}/, encode/…, inject/…, audio/…, vdisplay/…, plus top-level src/windows/+src/linux/). Module names stay flat (#[path]), so caller paths are platform-agnostic.
  • crates/punktfunk-core — the one linked protocol/FEC/crypto/QUIC core (unchanged here).
  • crates/pf-driver-proto — the owned, no_std host↔driver ABI (frame ring + control plane + gamepad SHM), consumed by both the host crate and the driver workspace.
  • packaging/windows/drivers/ — the unified driver workspace on microsoft/windows-drivers-rs (vendored 0.5.1 + an iddcx subset): pf-vdisplay (the IddCx display driver), pf-dualsense + pf-xusb (the gamepad drivers, folded in by M4), wdk-iddcx (typed IddCx DDI wrappers), wdk-probe (the CI link/surface gate), vendor/{wdk-build,wdk-sys}.

2.2 Session resolution, ownership, and seam traits (Goal-1)

The old ~40-knob PUNKTFUNK_* env soup (re-read and recomputed in three places) is replaced by a resolve-once pipeline: config.rs HostConfig (typed, parsed once) → session_plan.rs SessionPlan (a Copy plan resolved once per session — CaptureBackend::resolve() picks IddPush | Dda | Wgc, resolve_topology picks SingleProcess | TwoProcessRelay; this killed the latent capture/encode backend-disagreement bug) → SessionContext (bundles the ~13 session args + plane receivers, moved into the stream thread).
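A minimal sketch of the resolve-once idea, under stated assumptions: the type names come from the text above, but the two-field HostConfig, the from_vars helper, and the exact backend/topology rules are hypothetical stand-ins for whatever config.rs and session_plan.rs really do.

```rust
use std::collections::HashMap;

/// Capture backends named in the text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CaptureBackend { IddPush, Dda, Wgc }

/// Process topologies named in the text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Topology { SingleProcess, TwoProcessRelay }

/// Typed config, parsed exactly once. The field set is hypothetical.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct HostConfig { pub idd_push: bool, pub no_wgc: bool }

impl HostConfig {
    /// Parse from an already-collected variable map (hypothetical helper).
    pub fn from_vars(vars: &HashMap<String, String>) -> Self {
        let flag = |k: &str| vars.get(k).map(|v| v == "1").unwrap_or(false);
        HostConfig {
            idd_push: flag("PUNKTFUNK_IDD_PUSH"),
            no_wgc: flag("PUNKTFUNK_NO_WGC"),
        }
    }
}

/// Copy plan resolved once per session; every stage reads the same answer.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SessionPlan { pub capture: CaptureBackend, pub topology: Topology }

impl SessionPlan {
    /// The selection rules below are assumptions made for illustration only.
    pub fn resolve(cfg: &HostConfig) -> Self {
        let capture = if cfg.idd_push {
            CaptureBackend::IddPush
        } else if cfg.no_wgc {
            CaptureBackend::Dda
        } else {
            CaptureBackend::Wgc
        };
        let topology = match capture {
            CaptureBackend::Wgc => Topology::TwoProcessRelay,
            _ => Topology::SingleProcess,
        };
        SessionPlan { capture, topology }
    }
}
```

Because the plan is Copy and computed in one place, capture and encode cannot disagree about the backend: both receive the same value rather than re-reading the environment.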

Ownership is a single OnceLock VirtualDisplayManager (vdisplay/windows/manager.rs) owning a typed Arc<OwnedHandle> control-device handle (no raw-isize cross-thread smuggle), a refcounted Idle/Active/Lingering state machine, and the monitor generation; a per-session MonitorLease's Drop releases the refcount (a stale lease can't tear down a fresh monitor). This deleted a fistful of CURRENT_MON_GEN/MGR/IDD_* globals and validated on glass at 0 leaked monitors across a reconnect storm, A/B-equivalent to the shipping host.
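The generation-stamped lease can be illustrated with a small model. This is a sketch, not the manager.rs implementation: Manager, State, and recreate are hypothetical names, the Lingering state is left out, and a plain Mutex stands in for the real synchronization.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Default)]
struct State { generation: u64, refs: u32, active: bool }

/// Hypothetical stand-in for the manager; cloning shares the same state.
#[derive(Clone, Default)]
pub struct Manager { inner: Arc<Mutex<State>> }

/// Per-session lease stamped with the generation it was issued for.
pub struct MonitorLease { mgr: Manager, generation: u64 }

impl Manager {
    /// Take a reference on the current monitor, creating one if idle.
    pub fn acquire(&self) -> MonitorLease {
        let mut s = self.inner.lock().unwrap();
        if !s.active {
            s.active = true;
            s.generation += 1;
            s.refs = 0;
        }
        s.refs += 1;
        MonitorLease { mgr: self.clone(), generation: s.generation }
    }

    /// Replace the monitor; every lease issued earlier becomes stale.
    pub fn recreate(&self) {
        let mut s = self.inner.lock().unwrap();
        s.generation += 1;
        s.refs = 0;
        s.active = true;
    }

    pub fn is_active(&self) -> bool { self.inner.lock().unwrap().active }
    pub fn refs(&self) -> u32 { self.inner.lock().unwrap().refs }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = self.mgr.inner.lock().unwrap();
        if s.generation != self.generation {
            return; // stale lease: the monitor it referred to is already gone
        }
        s.refs = s.refs.saturating_sub(1);
        if s.refs == 0 {
            s.active = false; // last live lease released the monitor
        }
    }
}
```

The generation check in Drop is the whole point: a lease that outlives its monitor does nothing on release, so it cannot decrement the refcount of the monitor that replaced it.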

The seam traits (VirtualDisplay/VirtualOutput/VirtualLease, Capturer, Encoder, AudioCapturer/VirtualMic/InputInjector/PadManager) got two tightenings: the capturer takes the desired OutputFormat { gpu, hdr } in (killing the capture → encode::windows_resolved_backend() back-reference), and Encoder::caps() -> EncoderCaps (§2.4) lets the session glue route loss-recovery by query.

2.3 Capture — IDD-push primary (normal and secure desktop), WGC/DDA fallback, GB1 recovery

IDD-push is the universal primary path. Capture comes straight from the driver's shared keyed-mutex texture ring (capture/windows/idd_push.rs) — no Desktop Duplication, no win32u reparenting hook. The host creates the ring as a sealed channel (proto v2, design/idd-push-security.md): the header, frame-ready event, and ring textures are unnamed (nothing to enumerate, open by name, or squat), and the host DuplicateHandles them into the driver's WUDFHost and delivers the handle values over the SYSTEM+admins-only control device (IOCTL_SET_FRAME_CHANNEL), so only the two endpoint processes can ever reach a frame — DDA's isolation property in user mode. (The objects keep a D:(A;;GA;;;SY)(A;;GA;;;LS) DACL as defense-in-depth; it is no longer the isolation boundary. This supersedes the earlier named-ring scheme, which was world-openable Global\pfvd-* (D:(A;;GA;;;WD)) then SY+LS-scoped.) The generation-tagged latest = gen<<40 | seq<<8 | slot stale-ring reject kills the HDR-flip garbage frame; a host-owned 3-slot OUT_RING rotated per frame is the texture-ownership contract that enables pipeline_depth=2 (convert/copy on the 3D engine overlapping NVENC on the ASIC). It captures the secure desktop (Winlogon/UAC/lock) directly (validated 2026-06-25), so there is no separate secure capturer in the primary path.
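The latest = gen<<40 | seq<<8 | slot token can be sketched as follows. The shift amounts come from the text; the 64-bit container, the resulting field widths (8-bit slot, 32-bit sequence, 24-bit generation), and the names Latest, pack, unpack, and accept are assumptions for illustration.

```rust
pub const SLOT_BITS: u32 = 8;
pub const SEQ_BITS: u32 = 32;
pub const GEN_SHIFT: u32 = SLOT_BITS + SEQ_BITS; // 40, as in the text
pub const GEN_MASK: u64 = (1 << (64 - GEN_SHIFT)) - 1; // 24 bits (assumed)

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Latest { pub generation: u32, pub seq: u32, pub slot: u8 }

/// Pack the three fields into one word that can be published atomically.
pub fn pack(l: Latest) -> u64 {
    ((l.generation as u64 & GEN_MASK) << GEN_SHIFT)
        | ((l.seq as u64) << SLOT_BITS)
        | l.slot as u64
}

/// Split a published word back into its fields.
pub fn unpack(v: u64) -> Latest {
    Latest {
        generation: (v >> GEN_SHIFT) as u32,
        seq: (v >> SLOT_BITS) as u32, // truncation keeps the 32 sequence bits
        slot: v as u8,                // truncation keeps the low 8 bits
    }
}

/// Reject tokens written for a ring generation other than the current one.
pub fn accept(current_generation: u32, v: u64) -> Option<Latest> {
    let l = unpack(v);
    (l.generation == current_generation).then_some(l)
}
```

A consumer that recreated the ring bumps its expected generation, so a token left over from the old ring (for example around an HDR flip) fails the comparison and is dropped instead of being read as a frame.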

  • Open-time fallback: IddPushCapturer::open waits a bounded ~4 s for a first frame (not just DRV_STATUS_OPENED); on attach failure it returns the keepalive back so capture.rs opens DDA on the same WinCaptureTarget — never a 20 s black bail (ed58365/f98ab07).
  • Mid-session game mode-set recovery (GB1, fixed): the 250 ms poll follows the display's actual resolution (win_display::active_resolution, CCD/GDI) and recreates the ring on any descriptor change (size or HDR) → the driver re-attaches → frames resume at the game's mode, no reconnect. If a change is unrecoverable (e.g. an exclusive flip), a recovering_since clock drops the session after 3 s so the client reconnects cleanly. No protocol bump was needed — the host reads the resolution straight from Windows (c87bfe0; the driver's publish() width/height guard + flushed log is 789ad49).
  • WGC + DDA stay as demoted fallbacks for non-IddCx hardware (wgc.rs/dxgi.rs). The two-process WGC secure-desktop relay (wgc_relay.rs) is no longer load-bearing now that IDD-push handles the secure desktop; it is kept recoverable but slated for M5/M6 cleanup. (Its constraint analysis is archived in archive/windows-secure-desktop.md.)
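The mode-set recovery described in the list above can be modelled as a small poll-driven state machine. This is a sketch under assumptions: Descriptor, Action, and Recovery are hypothetical names, and the real poll reads the display through win_display::active_resolution rather than taking the values as arguments.

```rust
use std::time::{Duration, Instant};

/// Drop the session if recovery has not succeeded within this budget (3 s in the text).
pub const RECOVERY_BUDGET: Duration = Duration::from_secs(3);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Descriptor { pub width: u32, pub height: u32, pub hdr: bool }

#[derive(Debug, PartialEq, Eq)]
pub enum Action { Continue, RecreateRing(Descriptor), DropSession }

pub struct Recovery { ring: Descriptor, recovering_since: Option<Instant> }

impl Recovery {
    pub fn new(ring: Descriptor) -> Self {
        Recovery { ring, recovering_since: None }
    }

    /// One poll tick. `display` is what the OS reports now; `frames_flowing`
    /// says whether the ring delivered a frame since the previous tick.
    pub fn poll(&mut self, now: Instant, display: Descriptor, frames_flowing: bool) -> Action {
        if display != self.ring {
            // Size or HDR changed: rebuild the ring and start the clock once.
            self.ring = display;
            if self.recovering_since.is_none() {
                self.recovering_since = Some(now);
            }
            return Action::RecreateRing(display);
        }
        if frames_flowing {
            self.recovering_since = None; // recovered
            return Action::Continue;
        }
        match self.recovering_since {
            Some(t0) if now.duration_since(t0) >= RECOVERY_BUDGET => Action::DropSession,
            _ => Action::Continue,
        }
    }
}
```

The clock only starts on a descriptor change, so a static desktop that simply produces no new frames never trips the drop; only a change that fails to yield frames within the budget does.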

2.4 Encode — NVENC / AMF / QSV / software; EncoderCaps; HDR

encode/windows/ dispatches per DXGI adapter vendor (open_video): NVENC (NVIDIA, direct SDK, nvenc.rs — caps-probe-before-configure, bitrate-clamp binary search, true RFI over the DPB, in-band ST.2086/CLL SEI), AMF/QSV (AMD/Intel via libavcodec, ffmpeg_win.rs — system-readback default, opt-in zero-copy D3D11; CI-only, no lab hardware), or software H.264 (sw.rs). HDR (10-bit) forces HEVC Main10 + BT.2020 PQ; the client auto-detects PQ from the VUI. The encoder adapts to a mid-session size/format/HDR change per frame (tears down + re-inits), so the GB1 capturer's resolution changes are handled downstream with no API change.

Encoder::caps() -> EncoderCaps { supports_rfi, supports_hdr_metadata } lets the session glue route loss-recovery by query (only Windows direct-NVENC overrides it; the GameStream loop gates the RFI path on supports_rfi rather than hard-coding per-backend knowledge into the glue).
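A sketch of routing by capability query. EncoderCaps and its two fields are from the text; the trait is reduced to the one method, the Nvenc and Software types are placeholders, and the keyframe fallback is an assumption about what the glue does when RFI is unavailable.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct EncoderCaps { pub supports_rfi: bool, pub supports_hdr_metadata: bool }

/// Reduced to the capability query; the real trait has more methods.
pub trait Encoder {
    /// Conservative default: a backend advertises nothing unless it overrides.
    fn caps(&self) -> EncoderCaps { EncoderCaps::default() }
}

pub struct Nvenc;    // placeholder for the direct-SDK backend
pub struct Software; // placeholder for a backend that keeps the default

impl Encoder for Nvenc {
    fn caps(&self) -> EncoderCaps {
        EncoderCaps { supports_rfi: true, supports_hdr_metadata: true }
    }
}

impl Encoder for Software {}

#[derive(Debug, PartialEq, Eq)]
pub enum LossRecovery { InvalidateReferenceFrames, ForceKeyframe }

/// The glue asks the encoder instead of matching on the backend type.
pub fn on_loss(enc: &dyn Encoder) -> LossRecovery {
    if enc.caps().supports_rfi {
        LossRecovery::InvalidateReferenceFrames
    } else {
        LossRecovery::ForceKeyframe // assumed fallback
    }
}
```

Adding a backend then needs no change to the session glue: it either overrides caps() or inherits the conservative default.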

2.5 Host↔driver ABI & the pf-vdisplay driver

pf-driver-proto is one no_std crate in both build graphs. It owns the frame plane (FrameToken + SharedHeader; since proto v2 the frame objects are unnamed — no Global\pfvd-* names — and are delivered by handle duplication over IOCTL_SET_FRAME_CHANNEL, the sealed channel: design/idd-push-security.md), the control plane (a fresh interface GUID — not SudoVDA's e5bcc234; contiguous 0x900 IOCTL ops; a GET_INFO version handshake the host asserts + bails on mismatch), and the gamepad SHM (XusbShm/PadShm incl. device_type). bytemuck-Pod + size_of and offset_of! asserts make ABI drift a compile error.
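The "drift is a compile error" property can be sketched with const assertions. PadShmSketch and its layout are invented for illustration (the real PadShm layout is not shown in this document), and the bytemuck derive is omitted so the sketch needs only core.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical shared-memory layout; only `device_type` is named in the text.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct PadShmSketch {
    pub magic: u32,
    pub version: u32,
    pub device_type: u8,
    pub _pad: [u8; 3], // explicit padding so the struct has no hidden gaps
    pub buttons: u32,
}

// Evaluated at compile time: changing a field's type, order, or padding
// makes one of these fail and the crate stops building on both sides.
const _: () = assert!(size_of::<PadShmSketch>() == 16);
const _: () = assert!(offset_of!(PadShmSketch, device_type) == 8);
const _: () = assert!(offset_of!(PadShmSketch, buttons) == 12);
```

Since host and driver both depend on the same crate, a layout change that violates an assertion fails both builds rather than surfacing as corrupted shared memory at runtime.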

The driver (packaging/windows/drivers/pf-vdisplay/src/) is an all-Rust UMDF IddCx driver on windows-drivers-rs + the iddcx wdk-sys subset; the STEP 08 build is the checklist in §6.3, its internals are the invariants in §3, and it loads self-signed under Secure Boot (FORCE_INTEGRITY cleared post-link, §6.1). Known gaps: ownership state is still partly process-global with EvtCleanupCallback on the WDFDEVICE (a deliberate, sound choice — E1 in §4); and slot-reclaim-on-REMOVE (§4 P1.3).

2.6 Service, packaging, installer

A LocalSystem SCM supervisor (windows/service.rs) token-retargets and CreateProcessAsUserWs serve into the console session (so SendInput reaches both the streamed and the secure desktop), relaunches on session-change, and kills-on-close via a Job Object — the Sunshine/Apollo model (rationale: windows-service.md). Shipped as a signed Inno Setup setup.exe (packaging/windows/, windows-host.yml) that builds + signs all three drivers from source, bundles them + the FFmpeg DLLs, and delegates to service install. GameStream (Moonlight) is kept, but the installer/service default to secure serve (GameStream opt-in).


3. Validated invariants — preserve, do not regress

These are expensive empirical wins; keep them intact when touching the code:

  • Frame transport: host-creates/driver-opens keyed-mutex ring; generation-tagged stale-ring reject; 0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the OUT_RING rotation + pipeline_depth=2 overlap; repeat_last rotates into a fresh out-ring slot (depth-safe).
  • Driver internals: edid.rs (128-byte EDID + CTA-861.3 HDR block, dual checksums); the FP16 HDR recipe (CAN_PROCESS_FP16 + the *2 DDIs + gamma/HDR accept-stubs + HIGH_COLOR_SPACE); DEVICE_POOL per render-LUID (NVIDIA UMD/VRAM leak fix); target-id stamped on the monitor context; the two swap-chain leak fixes (borrow IDXGIDevice across SetDevice retries; check terminate at the loop top).
  • Monitor lifecycle: serialized ADD/REMOVE/teardown; restore CCD topology before REMOVE; the generation-stamped lease (a stale lease can't tear down a fresh monitor); 0-leak across reconnects.
  • HDR color math: hdr.rs (pure, unit-tested, ST.2086 + big-endian SEI); the FP16→P010/Rgb10a2 converters + hdr_p010_selftest; the cursor decomposition.
  • NVENC tuning: caps-probe-before-configure (10-bit→8-bit graceful downgrade); bitrate-clamp binary search (each GPU's real ceiling); true RFI over the DPB; CBR / infinite-GOP / P-only / ~1-frame VBV.
  • Gamepad recipe: the SwDeviceCreate identity (enumerator with no _; mandatory completion callback; synthesized DS5 compat-ids; non-null per-pad ContainerId); one pf_dualsense serving DualSense+DS4 via a device_type byte; XUSB declining WAIT_*; per-pad index via pszDeviceLocation.
  • Session glue: the trait seam + RAII keepalive teardown; host-lifetime shared services + per-session gamepads; the encode|send split + microburst pacing; build_pipeline_with_retry permanent-vs-transient classification; the GameStream VideoPacketizer (GF8 Cauchy, Moonlight byte-exact); the pairing/trust handshake.
  • Core discipline: no async on the per-frame path; pf-driver-proto is the single ABI source (drift = compile error); the version handshake the host asserts.
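The bitrate-clamp binary search from the NVENC tuning item can be sketched generically. This assumes acceptance is monotonic (if a bitrate is accepted, every lower one is too); clamp_bitrate and the accepts probe are hypothetical names, with the probe standing in for an encoder configure attempt.

```rust
/// Find the highest bitrate in [floor, requested] that `accepts` approves,
/// to within `step`. Returns None if even `floor` is rejected.
pub fn clamp_bitrate(
    requested: u32,
    floor: u32,
    step: u32,
    mut accepts: impl FnMut(u32) -> bool,
) -> Option<u32> {
    if accepts(requested) {
        return Some(requested); // common case: no clamping needed
    }
    if floor >= requested || !accepts(floor) {
        return None;
    }
    // Invariant: `lo` is accepted, `hi` is rejected.
    let (mut lo, mut hi) = (floor, requested);
    while hi - lo > step.max(1) {
        let mid = lo + (hi - lo) / 2;
        if accepts(mid) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    Some(lo)
}
```

Each probe halves the interval, so the number of configure attempts grows with the logarithm of the range rather than with the range itself.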

4. Open work / next tasks (prioritized)

P1 — ship-readiness / correctness

  1. Goal-1 → main merge — DONE. The windows-host-goal1 branch is merged (tip 3e7c9bd); the full Windows CI matrix (incl. the amf-qsv encode path that local checks skip) runs on push.
  2. IDD-push default — resolved via host.env. The shipped default host.env sets PUNKTFUNK_IDD_PUSH=1, so a fresh install runs the validated IDD-push path (with the WGC/DDA fallback in place). The bare in-code default (config.rs) is still false (the dev / non-pf-driver default); flipping it to follow the deployed default is an optional tidy.
  3. pf-vdisplay slot reclaim on REMOVE (driver robustness) — 🟡 fix landed, on-glass-validation pending. Sustained ADD/REMOVE churn wedged the driver (ADD → 0x80070490 ERROR_NOT_FOUND) because the monitor id (EDID serial / ConnectorIndex / container GUID) was a monotonic NEXT_ID, never reclaimed → IddCx accumulated a new OS target slot per cycle until exhaustion. monitor.rs now allocates the lowest free id (alloc_monitor_id), reused on REMOVE, so a fresh ADD reuses the departed monitor's target slot instead of orphaning it. CI-compile-gated; the wedge only reproduces under sustained churn on the RTX box, so this needs an on-glass reconnect-storm A/B to confirm (the box is ephemeral). Keep packaging/windows/reset-pf-vdisplay.ps1 as the recovery until validated.
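The lowest-free-id allocation in item 3 can be sketched as below. MonitorIds is a hypothetical container and the zero-based id range is an assumption; the real alloc_monitor_id in monitor.rs may differ in both.

```rust
use std::collections::BTreeSet;

/// Tracks which monitor ids are currently in use.
#[derive(Default)]
pub struct MonitorIds { in_use: BTreeSet<u32> }

impl MonitorIds {
    /// Return the lowest id not currently in use and mark it used.
    pub fn alloc(&mut self) -> u32 {
        let mut id = 0;
        // The set iterates in ascending order, so the first gap is the answer.
        for &used in &self.in_use {
            if used != id {
                break;
            }
            id += 1;
        }
        self.in_use.insert(id);
        id
    }

    /// Free an id on REMOVE; returns false if it was not allocated.
    pub fn release(&mut self, id: u32) -> bool {
        self.in_use.remove(&id)
    }
}
```

With a monotonic counter, N ADD/REMOVE cycles consume N distinct ids; here the same cycle keeps handing back the id that was just released, so the number of distinct ids stays bounded by the peak number of live monitors.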

P2 — hygiene / architecture completion

  4. D1-host — host-crate P0 lints — deferred (low value / high churn). A crate-wide #![deny(unsafe_op_in_unsafe_fn)] produced 100+ FFI-wrap sites across the Linux modules; it wraps unsafe (discipline) rather than reducing it and doesn't improve stability, so it was deprioritized vs the OwnedHandle/RAII reductions, which are complete: idd_push.rs, service.rs, the three gamepad backends via a shared gamepad_raii.rs, the SCM STOP/SESSION events as OnceLock<OwnedHandle>, the hot-loop KeyedMutexGuard, and the driver's pod_init! (all box-validated, clean sc stop in ~1 s). The driver already has the deny. Revisit D1-host as a final discipline pass (staged per-module) if desired.
  5. M6 scaffolding cleanup — the bring-up diagnostics (spawn_observer/DebugBlock in idd_push.rs) were deleted with the sealed-channel change (they were the last fixed-name Global\ objects on the frame path); what remains is removing the old host monoliths once full parity is proven on glass.

Explicitly NOT doing (stability decision): E1 — driver DeviceContext ownership + per-IDDCX_MONITOR EvtCleanupCallback. The current process-global design is sound: IddCx DDIs receive only an IDDCX_MONITOR handle (never the WDFDEVICE/context), and ProcessSharingDisabled makes one devnode = one host process that dies with the device. A "device-owned" variant would add a use-after-free window (the watchdog races device cleanup) for no gain, and the per-monitor cleanup callback isn't reliably reachable on this UMDF/IddCx stack. Cleanup is already deterministic (WDFDEVICE EvtCleanupCallback + cleanup_for_device_removal + the host-gone watchdog). Revisit only if max_concurrent>1 on Windows is actually needed. (monitor.rs documents this rationale at the MONITOR_MODES static.)

P3 — larger, mostly hardware-gated

  6. M4 — gamepad-driver unification — substantially DONE (92e6802). pf-dualsense (DualSense / DualShock 4) and pf-xusb (Xbox 360 / XInput) now live in the unified packaging/windows/drivers/ workspace and build from source per release against the vendored wdk-sys, exactly like pf-vdisplay; build-gamepad-drivers.ps1 signs them with the shared cert. Remaining: point the driver side at pf_driver_proto::gamepad::{PadShm,XusbShm} (the host side already does — the device_type-at-offset hand-duplication is the last ABI-drift hazard), add WDF device contexts for true multi-pad, and confirm the source build matches the prior shipped binaries.
  7. M5 — reshape WGC/DDA + GameStream onto session/pipeline, then delete the old relay/monoliths. AMF/QSV stays CI-only (no lab hardware).
  8. On-glass behavioral validation of the committed-but-unexercised fixes: the watchdog reaping on host-kill, SET_RENDER_ADAPTER on a hybrid box (the lab box is single-dGPU), the IDD-push→DDA fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining, and the AMF/QSV encode path on real AMD/Intel hardware.


5. Operations

5.1 RTX box on-glass recipe

The persistent on-glass validator is the RTX box (ssh "Enrico Bühler"@<ip>, ENRICOS-DESKTOP, RTX 4090, PS shell). The IP FLOATS (DHCP; boots to Proxmox on reboot → ephemeral, unreachable after a reboot; recently .173/.158 — confirm current first; never reboot it, never depend on it surviving). It has WDK 26100 + LLVM 21.1.2 + the Rust toolchain; build clone at C:\Users\Public\pf-rewrite (the user's active driver-dev tree — don't clobber uncommitted WIP; use a worktree). Username has a ü → quote it; it only breaks SDL3/client builds, not the host. To validate a host branch: worktree-checkout, build with CARGO_TARGET_DIR=C:\t-goal1, then stop the PunktfunkHost service, back up the binary + %ProgramData%\punktfunk\host.env, copy your build in, restart, drive punktfunk-probe.exe loopback, then restore + git worktree remove. Drive over ssh via powershell -EncodedCommand <base64 UTF-16LE> (plain quoting mangles; prefer Write-Output/file-redirect for clean output). Driver redeploy: packaging/windows/redeploy-pf-vdisplay.ps1; ghost-monitor recovery: reset-pf-vdisplay.ps1.

5.2 CI / validation

The persistent build validator is the windows-amd64 CI runner (no GPU — fine for builds / iddcx link / /INTEGRITYCHECK self-sign / the surface-asserts; live NVENC encode + on-glass defers to the RTX box). Workflows: windows-host.yml (the host installer), windows-drivers.yml (the driver workspace build + FORCE_INTEGRITY clear; self-provisions the WDK/LLVM toolchain via scripts/ci/ensure-windows-toolchain.ps1), windows-msix.yml (the client). A single Windows runner serializes the whole fleet; a Cargo.toml touch costs ~25 min of queue, so driver pushes that avoid Cargo.toml skip the fleet serialization.

Local pre-push checks (this Linux box can't compile the Windows paths):

cargo test  -p pf-driver-proto                 # the ABI crate (cross-platform)
cargo check -p punktfunk-host                     # Linux paths; win_* mods are #[cfg(windows)]
cargo clippy -p punktfunk-host --all-targets -- -D warnings
# Windows host clippy (on the box; NVENC needs no import lib — runtime-loaded):
#   cargo clippy -p punktfunk-host --features nvenc --target x86_64-pc-windows-msvc -- -D warnings
# Driver build (on the box): cd packaging/windows/drivers; Version_Number=10.0.26100.0;
#   LIBCLANG_PATH='C:\Program Files\LLVM\bin'; cargo build

Note: a pre-existing rustfmt-version drift exists in some Windows-only files (this box's rustfmt 1.9.0 wraps offset_of!/unsafe fn differently than the runner's) — don't reformat unrelated files to chase it.

5.3 Env knobs (Windows host)

PUNKTFUNK_IDD_PUSH=1 (capture from the driver ring; shipped host.env default on, in-code default off), PUNKTFUNK_ENCODER=auto|nvenc (auto → vendor-detect), PUNKTFUNK_10BIT=1 + PUNKTFUNK_HDR_SHADER_P010=1 (HDR), PUNKTFUNK_SECURE_DDA=1, PUNKTFUNK_NO_WGC=1 (pure DDA), PUNKTFUNK_ZEROCOPY=1, PUNKTFUNK_MONITOR_LINGER_MS, PFVD_DEBUG_LOG=1 (driver file log — release builds are silent without it). Config lives in %ProgramData%\punktfunk\host.env; logs in %ProgramData%\punktfunk\logs\host.log.

5.4 Build / deploy / packaging

x64-only by design (no ARM64 NVIDIA driver). The installer is the thin-.iss / fat-binary model delegating to service install; tag host-win-vX.Y.Z. The drivers are built + FORCE_INTEGRITY-cleared + signed + Inf2Cat'd in CI from source. DriverVer must bump on any driver change; create the ROOT devnode via nefcon (devgen is forbidden).


6. Reference (hard-won — keep)

6.1 The /INTEGRITYCHECK answer

wdk-build emits cargo::rustc-cdylib-link-arg=/INTEGRITYCHECK unconditionally (no cfg/env/Config opt-out), so a self-signed driver can't load (CodeIntegrity 3004/3089). The fix: a deterministic, idempotent post-link step packaging/windows/clear-force-integrity.ps1 clears the PE FORCE_INTEGRITY bit (0x0080 @ e_lfanew+0x5e) + verifies (CI-proven 0x01E0 → 0x0160), before signing. Packaging order: cargo build → clear-force-integrity → sign .dll → Inf2Cat → sign .cat. (A public build would use real attestation signing, which satisfies /INTEGRITYCHECK legitimately.)
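For reference, the bit manipulation the post-link step performs can be sketched in Rust on an in-memory image. This is an illustration, not the shipped script (which is PowerShell): clear_force_integrity and PeError are hypothetical names, and the sketch does not touch the PE CheckSum field.

```rust
/// IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY
pub const FORCE_INTEGRITY: u16 = 0x0080;
/// DllCharacteristics sits at e_lfanew + 0x18 (optional header) + 0x46.
pub const DLL_CHARACTERISTICS_OFFSET: usize = 0x5e;

#[derive(Debug, PartialEq, Eq)]
pub enum PeError { TooShort, NotPe }

/// Clear the bit in place and return (before, after). Running it twice is
/// harmless: the second call returns equal values.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), PeError> {
    if image.len() < 0x40 {
        return Err(PeError::TooShort);
    }
    if &image[..2] != b"MZ" {
        return Err(PeError::NotPe);
    }
    // e_lfanew: little-endian u32 at 0x3c, the file offset of the "PE\0\0" signature.
    let e_lfanew =
        u32::from_le_bytes([image[0x3c], image[0x3d], image[0x3e], image[0x3f]]) as usize;
    let off = e_lfanew
        .checked_add(DLL_CHARACTERISTICS_OFFSET)
        .ok_or(PeError::TooShort)?;
    if off.checked_add(2).map_or(true, |end| end > image.len()) {
        return Err(PeError::TooShort);
    }
    if &image[e_lfanew..e_lfanew + 4] != b"PE\0\0" {
        return Err(PeError::NotPe);
    }
    let before = u16::from_le_bytes([image[off], image[off + 1]]);
    let after = before & !FORCE_INTEGRITY;
    image[off..off + 2].copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```

On a value of 0x01E0 this yields 0x0160, matching the CI-proven transition quoted above.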

6.2 The iddcx binding on wdk-sys (the make-or-break — proven, the 6 bindgen knobs)

IddCx DDIs are function-table dispatched (IddFunctions[] indexed by _IDDFUNCENUM::<Name>TableIndex, IddDriverGlobals implicit arg 1) — the same model wdk-sys already implements for WDF. The vendored windows-drivers-rs 0.5.1 (packaging/windows/drivers/vendor/, [patch.crates-io]'d) gets a first-class ApiSubset::Iddcx that bindgens iddcx/1.10/IddCx.h reusing the identical wdk_default(config) baseline (so WDF/DXGI types resolve to, not redefine, wdk-sys's — type-identity by construction). The six knobs generate_iddcx needed (each a real gotcha, all CI-proven):

  1. --language=c++: wdk_default parses C; IddCx.h's IDARG_* typedefs need C++ (else a "must use 'struct' tag" cascade).
  2. -DIDD_STUB — table-dispatch mode; skips IddCxFuncEnum.h's #error IDDCX_VERSION_MAJOR not defined. Do NOT add WDF_STUB (would desync the shared WDF type-identity).
  3. allowlist_recursively(false) + allowlist_file("(?i).*iddcx.*"), full codegen (no .complement()) — emit ONLY IddCx items; WDF/Win types resolve via use crate::types::*.
  4. allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE") — emit the non-WDF types wdk-sys doesn't bindgen, locally. The _? is load-bearing (typedef struct _OPM_X {} OPM_X needs the tag AND the alias).
  5. pub type UINT = ::core::ffi::c_uint; in src/iddcx.rs: UINT is absent from crate::types.
  6. translate_enum_integer_types(true) — emit native u32 reprs for the DXGI/OPM ModuleConsts enums (nested modules can't see a parent UINT).

Wrapper note: table dispatch via _IDDFUNCENUM::<Name>TableIndex as usize (the ModuleConsts const, not a NewType .0); NTSTATUS is plain i32 (wdk_sys::NT_SUCCESS). The driver build.rs adds the IddCxStub link-search (the import lib is under iddcx\1.0\ even though headers are 1.10) + #[no_mangle] pub static IddMinimumVersionRequired: ULONG = 4. The versioned IDD_STRUCTURE_SIZE! path is dropped — the WDK links the iddcx 1.0 stub (lacks the version table); we target 1.10 vs a current framework, so size_of is exactly correct.

6.3 Driver port checklist (STEP 08, as landed)

  1. workspace pf-vdisplay(cdylib)+wdk-iddcx; prove std::thread+OwnedHandle link under UMDF (done).
  2. wdk-iddcx: 11 typed DDI wrappers via one dispatch macro + re-export the inbound PFN_* types.
  3. DriverEntry + IDD_CX_CLIENT_CONFIG (15 callbacks) + DeviceInitConfig + WdfDeviceCreate + CreateDeviceInterface (the owned pf GUID) + DeviceInitialize; edid.rs salvaged verbatim.
  4. DeviceContext + WDF_DECLARE_CONTEXT_TYPE blob; init_adapter in D0Entry (caps + FP16) → AdapterInitAsync; the *2 mode DDIs + query_target_info + gamma/HDR accept-stubs. (Box gate: loads under Secure Boot, enumerates as an IddCx adapter, Status OK.)
  5. control plane (GET_INFO version handshake the host asserts, ADD/REMOVE/SET_RENDER_ADAPTER/PING/CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + mode bounds; host switched to pf_driver_proto.
  6. Direct3DDevice + assign/unassign + SwapChainProcessor (worker, SetDevice 60×@50 ms single-borrow retry, top-of-loop terminate, ReleaseAndAcquireBuffer2, from_raw_borrowed).
  7. FramePublisher on pf_driver_proto::frame + keyed-mutex RAII guard; wire into run_core. (Box: full IDD-push glass-to-glass + the secure-desktop gate — validated 2026-06-25.)
  8. HDR / FP16 ring (validated: Mac connects WITH HDR).
  9. its own .inx + an unsafe-reduction pass (deny(unsafe_op_in_unsafe_fn), per-site // SAFETY:).

Remaining driver work beyond STEP 8: E1 (DeviceContext-owned state + per-IDDCX_MONITOR EvtCleanupCallback → unblock max_concurrent>1 — see §4 for why it's deliberately deferred), the slot-reclaim-on-REMOVE fix (§4 P1.3), and folding the gamepad-driver side onto pf_driver_proto (M4 tail, §4 P3).

6.4 Resolved product decisions (the five forks)

  • A: the host was refactored in place (staged, behavior-preserving), not greenfield-rebuilt; the driver was rebuilt fresh.
  • B: IDD-push primary for everything incl. the secure desktop (validated); WGC+DDA demoted to non-IddCx fallbacks.
  • C: all drivers on microsoft/windows-drivers-rs (+ the iddcx subset; /INTEGRITYCHECK solved), done for pf-vdisplay and now for the gamepad drivers (M4, 92e6802).
  • D: keep GameStream (Moonlight), default to secure serve.
  • E: concurrent sessions: the host-side preempt dance was removed by the ownership-model work, but true max_concurrent>1 on Windows stays blocked on the E1 driver swap-chain-reuse work (deliberately deferred, §4).

Rejected: DeviceContext-per-monitor ownership; see the E1 stability decision in §4 (it would add a use-after-free window for no gain under ProcessSharingDisabled).


Origins & design rationale (from the original plan)

This folds in the durable rationale from the original Windows host + client plan (windows-host.md, now a stub; full original text in git history). The Windows host began (2026-06-10 to 2026-06-14) as an "add backends behind the existing traits" job, not a parallel port — punktfunk-core and the whole control plane are platform-agnostic, and the host already compiled on non-Linux (macOS) thanks to existing cfg(target_os) gating. These framing decisions shaped what shipped and still explain why the code is the way it is:

  • Build order: host-first. A user preference (the research had recommended client-first, since the client is unblocked by the no-GPU problem and becomes the host's test endpoint). The trade-off held — the GPU-gated steps were the only ones that stalled GPU-less.
  • Trait-based abstraction → ~95% reuse. punktfunk-core (protocol/FEC/crypto/session/transport/QUIC/C ABI), the GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet), the management REST API + native_pairing/discovery, and the punktfunk1/spike/pipeline orchestration all carried over unchanged — only the OS-touching backends behind Capturer/Encoder/VirtualDisplay/InputInjector/AudioCapturer/VirtualMic are new #[cfg(windows)] code. Getting to MSVC needed only ~3 cfg-gates (gate the std::os::fd/OwnedFd unix-isms in main.rs/vdisplay.rs).
  • The no-GPU dev strategy. Most of the port was built + validated on a GPU-less Windows VM: the MSVC compile, the virtual-display control path (WARP), the openh264 software-encode pipeline (full capture→encode→FEC→UDP transport minus HW), SendInput injection + interactive-session/desktop-reattach, gamepad + rumble, and the entire client (software-decode loopback). Only NVENC-D3D11 zero-copy, the DDA-vs-WGC bake-off, split-encode/bitrate-ceiling, and all glass-to-glass numbers deferred to a real NVIDIA box (no perf claim transfers from Linux).
  • Windows-specific structural issues (no Linux precedent) — these are the gotchas that drove the service + capture design and remain true:
    • Interactive session, not a Session-0 service. SendInput can't reach the desktop from Session 0; Desktop Duplication / capture need the interactive session. Hence the SYSTEM-in-interactive-session supervisor (§2.6, windows-service.md) and the OpenInputDesktop/SetThreadDesktop re-attach to survive UAC/lock desktop switches.
    • Clock epoch. The skew handshake assumes both ends read the same realtime epoch in ns — the Windows host must emit timestamps from GetSystemTimePreciseAsFileTime→Unix-epoch-ns, or cross-machine latency + ClockProbe/ClockEcho break (std SystemTime on Windows is historically coarser).
    • No audio endpoint on a headless IDD. WASAPI loopback needs a real/virtual render device; the virtual mic (client→host) has no clean user-mode path — deferred.
    • Color/range. All clients assume BT.709 limited-range; the BGRA→I420/NV12 path must match or colors wash out — validated against the existing decoders.
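The clock-epoch conversion in the list above is plain arithmetic and can be sketched without any Windows API. A FILETIME counts 100 ns ticks since 1601-01-01 UTC, and the Unix epoch lies 116,444,736,000,000,000 ticks later; filetime_to_unix_ns is a hypothetical helper name, and the (low, high) pair mirrors the two 32-bit halves of the FILETIME structure.

```rust
/// Ticks (100 ns units) between 1601-01-01 and 1970-01-01 UTC.
pub const FILETIME_UNIX_EPOCH_TICKS: u64 = 116_444_736_000_000_000;

/// Convert a FILETIME, given as its low/high 32-bit halves, to nanoseconds
/// since the Unix epoch. Returns None for times before 1970 or on overflow.
pub fn filetime_to_unix_ns(low: u32, high: u32) -> Option<u64> {
    let ticks = ((high as u64) << 32) | low as u64;
    ticks
        .checked_sub(FILETIME_UNIX_EPOCH_TICKS)?
        .checked_mul(100) // one tick is 100 ns
}
```

On Windows the two halves would come from the value written by GetSystemTimePreciseAsFileTime; keeping the conversion pure makes it testable on the Linux development box as well.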

SudoVDA → pf-vdisplay evolution. The original plan was built around SudoVDA, an off-the-shelf indirect display driver (the same IDD Apollo ships) — chosen to avoid writing/WHQL-signing a driver and to get arbitrary WxH@Hz modes on the fly. It carried the host all the way to live-validated NVENC on a real RTX 4090. It was then replaced by the all-Rust pf-vdisplay IddCx driver (which solved /INTEGRITYCHECK self-signing, §6.1, and gave us the IDD-push zero-copy capture path that captures the secure desktop directly) and deleted in commit 84a3b95; pf-vdisplay is now the sole virtual-display backend. The full SudoVDA control protocol (IOCTL layout, watchdog keepalive, GDI-name resolution) lives in git history if ever needed as a reference.