punktfunk/design/windows-host-rewrite.md
enricobuehler 7b99b41ede docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00

# Windows Host — Architecture, Status & Roadmap
> **Single source of truth** for the punktfunk Windows streaming host: the all-Rust **`pf-vdisplay`
> IddCx virtual-display driver** + **IDD-push zero-copy capture** + **NVENC/AMF/QSV encode**, shipped as
> a signed Inno Setup installer with a LocalSystem SCM service. Live-validated on the RTX box through
> 5120×1440@240 HDR, the secure desktop (lock/UAC), and a fullscreen game.
>
> This file is the consolidated Windows-host doc — it absorbs the rewrite design plan, the Goal-1
> staged-refactor plan, the audit + remediation tracker, the fullscreen-game capture-bug analysis, and
> the durable rationale from the original `windows-host.md` implementation plan (now a stub).
> **Last updated 2026-06-26.** All of this work is **merged to `main`** (the `windows-host-goal1`
> branch landed at `3e7c9bd`).
---
## 1. Status at a glance
**Goals 1–3 and milestones M0–M4 are complete and merged to `main`.** The host has a clean, typed,
layered architecture (`HostConfig → SessionPlan → SessionContext`, `windows/`+`linux/` confinement, a
single `VirtualDisplayManager` ownership model, `EncoderCaps`); the all-Rust IddCx `pf-vdisplay` driver
loads self-signed under Secure Boot and does IDD-push zero-copy capture at 5K@240 HDR including the
**secure desktop** (Winlogon/UAC/lock); SudoVDA is gone (`84a3b95`) — `pf-vdisplay` is the sole
virtual-display backend; and the three UMDF drivers (`pf-vdisplay`, `pf-dualsense`, `pf-xusb`) now build
from source in one unified `packaging/windows/drivers/` workspace (M4, `92e6802`). The shipped path
(IDD-push + NVENC) is live-validated on glass; the AMF/QSV encode path is CI-green but not yet
on-hardware (no AMD/Intel Windows box in the lab).
Ground the details against the code: `crates/punktfunk-host/src/windows/`,
`crates/punktfunk-host/src/{capture,encode,inject,audio,vdisplay}/windows/`, and
`packaging/windows/drivers/`.
**What remains (all non-blocking):** the `pf-vdisplay` slot-reclaim-on-REMOVE fix needs an on-glass
reconnect-storm A/B (§4 P1.3); host-crate `unsafe` lint hygiene + old-monolith / bring-up-scaffolding
cleanup (§4 P2); and the hardware-gated items — AMF/QSV on-glass, hybrid-GPU `SET_RENDER_ADAPTER`,
the WGC/DDA fallback reshape, and true `max_concurrent>1` (§4 P3). One framing note: the host was
**not** greenfield-rebuilt — it was **refactored in place** via a staged, behavior-preserving sequence
that kept the live host working at every step; only the *driver* was rebuilt fresh.
---
## 2. Architecture (what is on disk)
A ~1-page map; the empirical constraints these encode are in §3, the deep reference is in §6.
### 2.1 Layering & crates
- **`crates/punktfunk-host`** — one shared host crate (Linux + Windows; not split). Platform code is
confined under per-module `windows/`+`linux/` folders behind `#[cfg]` seams (`capture/{windows,linux}/`,
`encode/…`, `inject/…`, `audio/…`, `vdisplay/…`, plus top-level `src/windows/`+`src/linux/`). Module
names stay flat (`#[path]`), so caller paths are platform-agnostic.
- **`crates/punktfunk-core`** — the one linked protocol/FEC/crypto/QUIC core (unchanged here).
- **`crates/pf-driver-proto`** — the owned, `no_std` host↔driver ABI (frame ring + control plane +
gamepad SHM), consumed by both the host crate and the driver workspace.
- **`packaging/windows/drivers/`** — the unified driver workspace on `microsoft/windows-drivers-rs`
(vendored 0.5.1 + an `iddcx` subset): `pf-vdisplay` (the IddCx display driver), `pf-dualsense` +
`pf-xusb` (the gamepad drivers, folded in by M4), `wdk-iddcx` (typed IddCx DDI wrappers), `wdk-probe`
(the CI link/surface gate), `vendor/{wdk-build,wdk-sys}`.
### 2.2 Session resolution, ownership, and seam traits (Goal-1)
The old ~40-knob `PUNKTFUNK_*` env soup (re-read and recomputed in three places) is replaced by a
**resolve-once** pipeline: `config.rs` `HostConfig` (typed, parsed once) → `session_plan.rs` `SessionPlan`
(a `Copy` plan resolved once per session — `CaptureBackend::resolve()` picks `IddPush | Dda | Wgc`,
`resolve_topology` picks `SingleProcess | TwoProcessRelay`; this killed the latent capture/encode
backend-disagreement bug) → `SessionContext` (bundles the ~13 session args + plane receivers, moved into
the stream thread).
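A minimal sketch of the resolve-once idea, assuming heavily trimmed shapes. The field names, the `resolve` signatures, the precedence order, and the backend-to-topology mapping are all hypothetical here; the real types live in `config.rs` and `session_plan.rs`. The point it illustrates is that capture and encode read one `Copy` plan instead of re-deriving the backend from the environment.

```rust
// Hypothetical, trimmed-down shapes (assumptions, not the shipped definitions).
#[derive(Debug, Clone)]
pub struct HostConfig {
    pub idd_push: bool, // assumed to be parsed once from PUNKTFUNK_IDD_PUSH
    pub no_wgc: bool,   // assumed to be parsed once from PUNKTFUNK_NO_WGC
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureBackend {
    IddPush,
    Dda,
    Wgc,
}

impl CaptureBackend {
    /// Decide the backend exactly once. The precedence is an assumption.
    pub fn resolve(cfg: &HostConfig) -> Self {
        if cfg.idd_push {
            Self::IddPush
        } else if cfg.no_wgc {
            Self::Dda
        } else {
            Self::Wgc
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Topology {
    SingleProcess,
    TwoProcessRelay,
}

/// Illustrative mapping only: here just the WGC path uses the relay.
pub fn resolve_topology(backend: CaptureBackend) -> Topology {
    match backend {
        CaptureBackend::Wgc => Topology::TwoProcessRelay,
        CaptureBackend::IddPush | CaptureBackend::Dda => Topology::SingleProcess,
    }
}

/// The per-session plan: `Copy`, resolved once, then only read.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SessionPlan {
    pub capture: CaptureBackend,
    pub topology: Topology,
}

impl SessionPlan {
    pub fn resolve(cfg: &HostConfig) -> Self {
        let capture = CaptureBackend::resolve(cfg);
        SessionPlan { capture, topology: resolve_topology(capture) }
    }
}
```

Because every consumer receives the same `SessionPlan` value, a capture/encode disagreement cannot arise from two independent reads of the environment.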
Ownership is a single **OnceLock `VirtualDisplayManager`** (`vdisplay/windows/manager.rs`) owning a
*typed* `Arc<OwnedHandle>` control-device handle (no raw-`isize` cross-thread smuggle), a refcounted
Idle/Active/Lingering state machine, and the monitor generation; a per-session `MonitorLease`'s `Drop`
releases the refcount (a stale lease can't tear down a fresh monitor). This deleted a fistful of
`CURRENT_MON_GEN`/`MGR`/`IDD_*` globals and validated on glass at **0 leaked monitors across a reconnect
storm**, A/B-equivalent to the shipping host.
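The generation-stamped lease can be sketched as follows. This is a simplification under stated assumptions: it uses an `Arc<Mutex<..>>` instead of the `OnceLock` manager, omits the Idle/Active/Lingering states and the control-device handle, and every name except `MonitorLease` is hypothetical.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Default)]
struct State {
    generation: u64, // bumped whenever the monitor is recreated
    refcount: u32,   // live leases on the current generation
}

/// Hypothetical stand-in for the manager's shared state.
#[derive(Clone, Default)]
pub struct Manager {
    state: Arc<Mutex<State>>,
}

/// A per-session lease stamped with the generation it was taken on.
pub struct MonitorLease {
    mgr: Manager,
    generation: u64,
}

impl Manager {
    /// Start a fresh monitor generation; older leases become stale.
    pub fn recreate_monitor(&self) {
        let mut s = self.state.lock().unwrap();
        s.generation += 1;
        s.refcount = 0;
    }

    pub fn acquire(&self) -> MonitorLease {
        let mut s = self.state.lock().unwrap();
        s.refcount += 1;
        MonitorLease { mgr: self.clone(), generation: s.generation }
    }

    pub fn refcount(&self) -> u32 {
        self.state.lock().unwrap().refcount
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = self.mgr.state.lock().unwrap();
        // A lease from an older generation must not touch the current monitor.
        if s.generation == self.generation {
            s.refcount = s.refcount.saturating_sub(1);
        }
    }
}
```

Dropping a lease taken before `recreate_monitor` leaves the new generation's refcount untouched, which is the property the "stale lease can't tear down a fresh monitor" claim relies on.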
The seam traits (`VirtualDisplay`/`VirtualOutput`/`VirtualLease`, `Capturer`, `Encoder`,
`AudioCapturer`/`VirtualMic`/`InputInjector`/`PadManager`) got two tightenings: the capturer takes the
desired `OutputFormat { gpu, hdr }` **in** (killing the `capture → encode::windows_resolved_backend()`
back-reference), and `Encoder::caps() -> EncoderCaps` (§2.4) lets the session glue route loss-recovery by
query.
### 2.3 Capture — IDD-push primary (normal **and** secure desktop), WGC/DDA fallback, GB1 recovery
**IDD-push is the universal primary path.** Capture comes straight from the driver's shared keyed-mutex
texture ring (`capture/windows/idd_push.rs`) — no Desktop Duplication, no `win32u` reparenting hook. The
host creates the ring; the driver opens it (permissive `D:(A;;GA;;;WD)` SDDL). The generation-tagged
`latest = gen<<40 | seq<<8 | slot` stale-ring reject kills the HDR-flip garbage frame; a host-owned
3-slot `OUT_RING` rotated per frame is the texture-ownership contract that enables `pipeline_depth=2`
(convert/copy on the 3D engine overlapping NVENC on the ASIC). It captures the **secure desktop**
(Winlogon/UAC/lock) directly (validated 2026-06-25), so there is no separate secure capturer in the
primary path.
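The `latest = gen<<40 | seq<<8 | slot` token and its stale-ring reject can be sketched like this. The shifts come from the formula above; the assumption that the token is a `u64` (so 24 bits of generation, 32 of sequence, 8 of slot) and all function names are illustrative, not the `pf-driver-proto` API.

```rust
/// Pack (generation, seq, slot) as `generation<<40 | seq<<8 | slot`.
/// Assumes a u64 token, so the generation is masked to its 24 available bits.
pub fn pack_latest(generation: u32, seq: u32, slot: u8) -> u64 {
    (u64::from(generation & 0x00FF_FFFF) << 40) | (u64::from(seq) << 8) | u64::from(slot)
}

/// Split a token back into (generation, seq, slot).
pub fn unpack_latest(latest: u64) -> (u32, u32, u8) {
    // The `as` casts deliberately truncate to each field's width.
    ((latest >> 40) as u32, (latest >> 8) as u32, latest as u8)
}

/// Stale-ring reject: only accept a token published for the ring generation
/// the reader currently holds. Returns (seq, slot) when the frame is usable.
pub fn accept(latest: u64, current_generation: u32) -> Option<(u32, u8)> {
    let (generation, seq, slot) = unpack_latest(latest);
    (generation == current_generation).then_some((seq, slot))
}
```

A token written against the pre-HDR-flip ring carries the old generation, so `accept` returns `None` and the garbage frame is never consumed.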
- **Open-time fallback:** `IddPushCapturer::open` waits a bounded ~4 s for a *first frame* (not just
`DRV_STATUS_OPENED`); on attach failure it returns the keepalive back so `capture.rs` opens **DDA** on
the same `WinCaptureTarget` — never a 20 s black bail (`ed58365`/`f98ab07`).
- **Mid-session game mode-set recovery (GB1, fixed):** the 250 ms poll follows the display's *actual*
resolution (`win_display::active_resolution`, CCD/GDI) and recreates the ring on any descriptor change
(size **or** HDR) → the driver re-attaches → frames resume at the game's mode, **no reconnect**. If a
change is unrecoverable (e.g. an exclusive flip), a `recovering_since` clock drops the session after 3 s
so the client reconnects cleanly. No protocol bump was needed — the host reads the resolution straight
from Windows (`c87bfe0`; the driver's `publish()` width/height guard + flushed log is `789ad49`).
- **WGC + DDA** stay as demoted fallbacks for non-IddCx hardware (`wgc.rs`/`dxgi.rs`). The two-process WGC
secure-desktop relay (`wgc_relay.rs`) is no longer load-bearing now that IDD-push handles the secure
desktop; it is kept recoverable but slated for M5/M6 cleanup. (Its constraint analysis is archived in
[`archive/windows-secure-desktop.md`](archive/windows-secure-desktop.md).)
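The GB1 `recovering_since` clock described above can be sketched as a small state machine. Only the 3 s limit and the field name come from the text; the `RecoveryClock` type, the `Verdict` enum, and passing `now` explicitly (to keep it testable) are assumptions.

```rust
use std::time::{Duration, Instant};

/// Drop the session if frames have not resumed within this window.
pub const RECOVERY_LIMIT: Duration = Duration::from_secs(3);

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Streaming,
    Recovering,
    DropSession,
}

/// Hypothetical wrapper around the `recovering_since` timestamp.
#[derive(Default)]
pub struct RecoveryClock {
    recovering_since: Option<Instant>,
}

impl RecoveryClock {
    /// Call once per poll; `frames_flowing` is true when the ring delivered a frame.
    pub fn tick(&mut self, frames_flowing: bool, now: Instant) -> Verdict {
        if frames_flowing {
            // Recovery succeeded (or was never needed): reset the clock.
            self.recovering_since = None;
            return Verdict::Streaming;
        }
        // Start the clock on the first stalled poll, keep it on later ones.
        let since = *self.recovering_since.get_or_insert(now);
        if now.duration_since(since) >= RECOVERY_LIMIT {
            Verdict::DropSession
        } else {
            Verdict::Recovering
        }
    }
}
```

A successful ring recreate resets the clock, so only a stall that stays unrecovered for the whole window ends the session.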
### 2.4 Encode — NVENC / AMF / QSV / software; `EncoderCaps`; HDR
`encode/windows/` dispatches per DXGI adapter vendor (`open_video`): **NVENC** (NVIDIA, direct SDK,
`nvenc.rs` — caps-probe-before-configure, bitrate-clamp binary search, true RFI over the DPB, in-band
ST.2086/CLL SEI), **AMF**/**QSV** (AMD/Intel via libavcodec, `ffmpeg_win.rs` — system-readback default,
opt-in zero-copy D3D11; CI-only, no lab hardware), or **software** H.264 (`sw.rs`). HDR (10-bit) forces
HEVC Main10 + BT.2020 PQ; the client auto-detects PQ from the VUI. The encoder adapts to a mid-session
size/format/HDR change per frame (tears down + re-inits), so the GB1 capturer's resolution changes are
handled downstream with no API change.
`Encoder::caps() -> EncoderCaps { supports_rfi, supports_hdr_metadata }` lets the session glue route
loss-recovery by query (only Windows direct-NVENC overrides it; the GameStream loop gates the RFI path on
`supports_rfi` rather than hard-coding per-backend knowledge into the glue).
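A sketch of routing loss-recovery by query. `EncoderCaps` and `Encoder::caps()` are named above; the default-method shape, the two placeholder backends, the capability values they report, and the forced-keyframe fallback are assumptions for illustration.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct EncoderCaps {
    pub supports_rfi: bool,
    pub supports_hdr_metadata: bool,
}

pub trait Encoder {
    /// Conservative default: a backend opts in by overriding this.
    fn caps(&self) -> EncoderCaps {
        EncoderCaps::default()
    }
}

/// Placeholder backends (hypothetical types, no real encoding here).
pub struct DirectNvenc;
pub struct Software;

impl Encoder for DirectNvenc {
    fn caps(&self) -> EncoderCaps {
        EncoderCaps { supports_rfi: true, supports_hdr_metadata: true }
    }
}

impl Encoder for Software {} // keeps the default caps

#[derive(Debug, PartialEq, Eq)]
pub enum LossRecovery {
    InvalidateReferenceFrames,
    ForceKeyframe, // assumed fallback when RFI is unavailable
}

/// The session glue asks the encoder instead of matching on the backend type.
pub fn on_frame_loss(enc: &dyn Encoder) -> LossRecovery {
    if enc.caps().supports_rfi {
        LossRecovery::InvalidateReferenceFrames
    } else {
        LossRecovery::ForceKeyframe
    }
}
```

Adding a new backend then needs no change to the glue: it either overrides `caps()` or inherits the conservative default.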
### 2.5 Host↔driver ABI & the `pf-vdisplay` driver
`pf-driver-proto` is one `no_std` crate in both build graphs. It owns the **frame plane** (`FrameToken`
+ `Global\pfvd-*` names), the **control plane** (a fresh interface GUID — *not* SudoVDA's `e5bcc234`;
contiguous `0x900` IOCTL ops; a `GET_INFO` version handshake the host **asserts** + bails on mismatch),
and the **gamepad SHM** (`XusbShm`/`PadShm` incl. `device_type`). `bytemuck`-`Pod` + `size_of` **and**
`offset_of!` asserts make ABI drift a **compile error**.
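How `size_of` plus `offset_of!` asserts turn ABI drift into a compile error, as a minimal sketch. `PadShmSketch` and its fields are hypothetical (the real `PadShm` layout is not reproduced here), and the `bytemuck` `Pod` derive is omitted to keep the example dependency-free.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical shared-memory record; NOT the real `PadShm` layout.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct PadShmSketch {
    pub seq: u32,
    pub device_type: u8,
    pub _pad: [u8; 3], // explicit padding so the layout has no hidden bytes
    pub buttons: u32,
}

// Evaluated at compile time: reordering, resizing, or inserting a field
// breaks the build on both sides of the ABI instead of corrupting memory.
const _: () = assert!(size_of::<PadShmSketch>() == 12);
const _: () = assert!(offset_of!(PadShmSketch, seq) == 0);
const _: () = assert!(offset_of!(PadShmSketch, device_type) == 4);
const _: () = assert!(offset_of!(PadShmSketch, buttons) == 8);
```

The size assert alone would miss two same-sized fields being swapped; the per-field offset asserts are what catch that case.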
The driver (`packaging/windows/drivers/pf-vdisplay/src/`) is an all-Rust UMDF IddCx driver on
`windows-drivers-rs` + the `iddcx` `wdk-sys` subset; the STEP 08 build is the checklist in §6.3, its
internals are the invariants in §3, and it loads self-signed under Secure Boot (FORCE_INTEGRITY cleared
post-link, §6.1). **Known gaps:** ownership state is still partly process-global with
`EvtCleanupCallback` on the **WDFDEVICE** (a deliberate, sound choice — E1 in §4); and
slot-reclaim-on-REMOVE (§4 P1.3).
### 2.6 Service, packaging, installer
A `LocalSystem` SCM supervisor (`windows/service.rs`) token-retargets and `CreateProcessAsUserW`s `serve`
into the console session (so `SendInput` reaches both the streamed and the secure desktop), relaunches on
session-change, and kills-on-close via a Job Object — the Sunshine/Apollo model (rationale:
[`windows-service.md`](windows-service.md)). Shipped as a **signed Inno Setup** `setup.exe`
(`packaging/windows/`, `windows-host.yml`) that builds + signs all three drivers from source, bundles
them + the FFmpeg DLLs, and delegates to `service install`. GameStream (Moonlight) is kept, but the
installer/service default to secure `serve` (GameStream opt-in).
---
## 3. Validated invariants — preserve, do not regress
These are expensive empirical wins; keep them intact when touching the code:
- **Frame transport:** host-creates/driver-opens keyed-mutex ring; generation-tagged stale-ring reject;
0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the `OUT_RING` rotation +
`pipeline_depth=2` overlap; `repeat_last` rotates into a fresh out-ring slot (depth-safe).
- **Driver internals:** `edid.rs` (128-byte EDID + CTA-861.3 HDR block, dual checksums); the FP16 HDR
recipe (`CAN_PROCESS_FP16` + the `*2` DDIs + gamma/HDR accept-stubs + `HIGH_COLOR_SPACE`); `DEVICE_POOL`
per render-LUID (NVIDIA UMD/VRAM leak fix); target-id stamped on the monitor context; the two swap-chain
leak fixes (borrow `IDXGIDevice` across `SetDevice` retries; check `terminate` at the loop top).
- **Monitor lifecycle:** serialized ADD/REMOVE/teardown; restore CCD topology **before** REMOVE; the
generation-stamped lease (a stale lease can't tear down a fresh monitor); 0-leak across reconnects.
- **HDR color math:** `hdr.rs` (pure, unit-tested, ST.2086 + big-endian SEI); the FP16→P010/Rgb10a2
converters + `hdr_p010_selftest`; the cursor decomposition.
- **NVENC tuning:** caps-probe-before-configure (10-bit→8-bit graceful downgrade); bitrate-clamp binary
search (each GPU's real ceiling); true RFI over the DPB; CBR / infinite-GOP / P-only / ~1-frame VBV.
- **Gamepad recipe:** the SwDeviceCreate identity (enumerator with no `_`; mandatory completion callback;
synthesized DS5 compat-ids; non-null per-pad `ContainerId`); one `pf_dualsense` serving DualSense+DS4
via a `device_type` byte; XUSB declining `WAIT_*`; per-pad index via `pszDeviceLocation`.
- **Session glue:** the trait seam + RAII keepalive teardown; host-lifetime shared services + per-session
gamepads; the encode|send split + microburst pacing; `build_pipeline_with_retry` permanent-vs-transient
classification; the GameStream `VideoPacketizer` (GF8 Cauchy, Moonlight byte-exact); the pairing/trust
handshake.
- **Core discipline:** no async on the per-frame path; `pf-driver-proto` is the single ABI source
(drift = compile error); the version handshake the host asserts.
---
## 4. Open work / next tasks (prioritized)
**P1 — ship-readiness / correctness**
1. **Goal-1 → `main` merge — ✅ DONE.** The `windows-host-goal1` branch is merged (tip `3e7c9bd`); the
full Windows CI matrix (incl. the `amf-qsv` encode path that local checks skip) runs on push.
2. **IDD-push default — ✅ resolved via `host.env`.** The shipped default `host.env` sets
`PUNKTFUNK_IDD_PUSH=1`, so a fresh install runs the validated IDD-push path (with the WGC/DDA fallback
in place). The bare *in-code* default (`config.rs`) is still `false` (the dev / non-pf-driver default);
flipping it to follow the deployed default is an optional tidy.
3. **pf-vdisplay slot reclaim on REMOVE** (driver robustness) — 🟡 **fix landed, on-glass-validation
pending.** Sustained ADD/REMOVE churn wedged the driver (`ADD → 0x80070490 ERROR_NOT_FOUND`) because the
monitor id (EDID serial / `ConnectorIndex` / container GUID) was a **monotonic** `NEXT_ID`, never
reclaimed → IddCx accumulated a new OS target slot per cycle until exhaustion. `monitor.rs` now allocates
the **lowest free id** (`alloc_monitor_id`), reused on REMOVE, so a fresh ADD reuses the departed
monitor's target slot instead of orphaning it. CI-compile-gated; the wedge only reproduces under
sustained churn on the RTX box, so this needs an **on-glass reconnect-storm A/B** to confirm (the box is
ephemeral). Keep `packaging/windows/reset-pf-vdisplay.ps1` as the recovery until validated.
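The lowest-free-id allocation from item 3 can be sketched as below. Only the `alloc_monitor_id` name and the "lowest free id, reused on REMOVE" behavior come from the text; the `MonitorIds` container, the `BTreeSet`, the `release` method, and starting ids at 0 are assumptions.

```rust
use std::collections::BTreeSet;

/// Hypothetical id table; the driver's real bookkeeping lives in `monitor.rs`.
#[derive(Default)]
pub struct MonitorIds {
    in_use: BTreeSet<u32>,
}

impl MonitorIds {
    /// Allocate the lowest id not currently in use (None only if all are taken).
    pub fn alloc_monitor_id(&mut self) -> Option<u32> {
        let id = (0..=u32::MAX).find(|id| !self.in_use.contains(id))?;
        self.in_use.insert(id);
        Some(id)
    }

    /// Called on REMOVE so the next ADD can reuse the departed monitor's id.
    /// Returns false if the id was not allocated.
    pub fn release(&mut self, id: u32) -> bool {
        self.in_use.remove(&id)
    }
}
```

With a monotonic counter, ADD/REMOVE/ADD yields ids 0, then 1, then 2 and so on forever; here the same cycle keeps returning 0, so the set of ids in circulation stays bounded by the number of live monitors.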
**P2 — hygiene / architecture completion**
4. **D1-host — host-crate P0 lints — deferred (low value / high churn).** A crate-wide
`#![deny(unsafe_op_in_unsafe_fn)]` produced 100+ FFI-wrap sites across the Linux modules; it *wraps*
unsafe (discipline) rather than reducing it and doesn't improve stability, so it was deprioritized vs
the `OwnedHandle`/RAII reductions (which are **complete**: `idd_push.rs`, `service.rs`, the three
gamepad backends via a shared `gamepad_raii.rs`, the SCM STOP/SESSION events as `OnceLock<OwnedHandle>`,
the hot-loop `KeyedMutexGuard`, and the driver's `pod_init!`; all box-validated, clean `sc stop` in
~1 s). The driver already has the deny. Revisit D1-host as a final discipline pass (staged per-module)
if desired.
5. **M6 scaffolding cleanup** — delete the bring-up diagnostics (`spawn_observer`/`DebugBlock` in
`idd_push.rs`) and, once full parity is proven on glass, the host monoliths.
**Explicitly NOT doing (stability decision): E1 — driver `DeviceContext` ownership + per-`IDDCX_MONITOR`
`EvtCleanupCallback`.** The current process-global design is *sound*: IddCx DDIs receive only an
`IDDCX_MONITOR` handle (never the WDFDEVICE/context), and `ProcessSharingDisabled` makes one devnode = one
host process that dies with the device. A "device-owned" variant would *add* a use-after-free window (the
watchdog races device cleanup) for no gain, and the per-monitor cleanup callback isn't reliably reachable
on this UMDF/IddCx stack. Cleanup is already deterministic (WDFDEVICE `EvtCleanupCallback` +
`cleanup_for_device_removal` + the host-gone watchdog). **Revisit only if `max_concurrent>1` on Windows is
actually needed.** (`monitor.rs` documents this rationale at the `MONITOR_MODES` static.)
**P3 — larger, mostly hardware-gated**
6. **M4 — gamepad-driver unification — ✅ substantially DONE** (`92e6802`). `pf-dualsense` (DualSense /
DualShock 4) and `pf-xusb` (Xbox 360 / XInput) now live in the unified `packaging/windows/drivers/`
workspace and build from source per release against the vendored `wdk-sys`, exactly like `pf-vdisplay`;
`build-gamepad-drivers.ps1` signs them with the shared cert. **Remaining:** point the **driver side** at
`pf_driver_proto::gamepad::{PadShm,XusbShm}` (the host side already does — the `device_type`-at-offset
hand-duplication is the last ABI-drift hazard), add WDF device contexts for true multi-pad, and confirm
the source build matches the prior shipped binaries.
7. **M5 — reshape WGC/DDA + GameStream onto `session/pipeline`**, then delete the old relay/monoliths.
AMF/QSV stays CI-only (no lab hardware).
8. **On-glass behavioral validation** of the committed-but-unexercised fixes: the watchdog reaping on
host-kill, `SET_RENDER_ADAPTER` on a **hybrid** box (the lab box is single-dGPU), the IDD-push→DDA
fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining, and the
AMF/QSV encode path on real AMD/Intel hardware.
---
## 5. Operations
### 5.1 RTX box on-glass recipe
The persistent on-glass validator is the **RTX box** (`ssh "Enrico Bühler"@<ip>`, ENRICOS-DESKTOP, RTX
4090, PS shell). **The IP FLOATS** (DHCP; boots to **Proxmox** on reboot → ephemeral, unreachable after a
reboot; recently `.173`/`.158` — confirm current first; **never reboot it, never depend on it surviving**).
It has WDK 26100 + LLVM 21.1.2 + the Rust toolchain; build clone at `C:\Users\Public\pf-rewrite` (the
user's active driver-dev tree — **don't clobber uncommitted WIP**; use a worktree). Username has a `ü`
(quote it); it only breaks SDL3/client builds, not the host. To validate a host branch: worktree-checkout,
build with `CARGO_TARGET_DIR=C:\t-goal1`, then stop the **PunktfunkHost** service, back up the binary +
`%ProgramData%\punktfunk\host.env`, copy your build in, restart, drive `punktfunk-probe.exe` loopback,
then restore + `git worktree remove`. Drive over ssh via `powershell -EncodedCommand <base64 UTF-16LE>`
(plain quoting mangles; prefer `Write-Output`/file-redirect for clean output). Driver redeploy:
`packaging/windows/redeploy-pf-vdisplay.ps1`; ghost-monitor recovery: `reset-pf-vdisplay.ps1`.
### 5.2 CI / validation
The persistent build validator is the **windows-amd64 CI runner** (no GPU — fine for builds / `iddcx`
link / `/INTEGRITYCHECK` self-sign / the surface-asserts; live NVENC encode + on-glass defers to the RTX
box). Workflows: `windows-host.yml` (the host installer), `windows-drivers.yml` (the driver workspace
build + FORCE_INTEGRITY clear), `windows-drivers-provision.yml` (WDK/LLVM toolchain), `windows-msix.yml`
(the client). A single Windows runner serializes the whole fleet; a `Cargo.toml` touch costs ~25 min of
queue, so driver pushes that avoid `Cargo.toml` skip the fleet serialization.
Local pre-push checks (this Linux box can't compile the Windows paths):
```sh
cargo test -p pf-driver-proto # the ABI crate (cross-platform)
cargo check -p punktfunk-host # Linux paths; win_* mods are #[cfg(windows)]
cargo clippy -p punktfunk-host --all-targets -- -D warnings
# Windows host clippy (on the box): PUNKTFUNK_NVENC_LIB_DIR=C:\t\nvenc;
# cargo clippy -p punktfunk-host --features nvenc --target x86_64-pc-windows-msvc -- -D warnings
# Driver build (on the box): cd packaging/windows/drivers; Version_Number=10.0.26100.0;
# LIBCLANG_PATH='C:\Program Files\LLVM\bin'; cargo build
```
Note: a pre-existing rustfmt-version drift exists in some Windows-only files (this box's rustfmt 1.9.0
wraps `offset_of!`/`unsafe fn` differently than the runner's) — don't reformat unrelated files to chase it.
### 5.3 Env knobs (Windows host)
`PUNKTFUNK_IDD_PUSH=1` (capture from the driver ring; shipped `host.env` default on, in-code default off),
`PUNKTFUNK_ENCODER=auto|nvenc` (auto → vendor-detect), `PUNKTFUNK_10BIT=1` + `PUNKTFUNK_HDR_SHADER_P010=1`
(HDR), `PUNKTFUNK_SECURE_DDA=1`, `PUNKTFUNK_NO_WGC=1` (pure DDA), `PUNKTFUNK_ZEROCOPY=1`,
`PUNKTFUNK_MONITOR_LINGER_MS`, `PFVD_DEBUG_LOG=1` (driver file log — release builds are silent without it).
Config lives in `%ProgramData%\punktfunk\host.env`; logs in `%ProgramData%\punktfunk\logs\host.log`.
### 5.4 Build / deploy / packaging
x64-only by design (no ARM64 NVIDIA driver). The installer is the thin-`.iss` / fat-binary model
delegating to `service install`; tag `host-win-vX.Y.Z`. The drivers are built + FORCE_INTEGRITY-cleared +
signed + `Inf2Cat`'d in CI from source. DriverVer must bump on any driver change; create the ROOT devnode
via nefcon (devgen is forbidden).
---
## 6. Reference (hard-won — keep)
### 6.1 The `/INTEGRITYCHECK` answer
`wdk-build` emits `cargo::rustc-cdylib-link-arg=/INTEGRITYCHECK` **unconditionally** (no cfg/env/Config
opt-out), so a self-signed driver can't load (CodeIntegrity 3004/3089). The fix: a deterministic,
idempotent post-link step `packaging/windows/clear-force-integrity.ps1` clears the PE FORCE_INTEGRITY bit
(`0x0080 @ e_lfanew+0x5e`) + verifies (CI-proven `0x01E0 → 0x0160`), **before** signing. Packaging order:
`cargo build` → clear-force-integrity → sign `.dll` → `Inf2Cat` → sign `.cat`. (A public build would use
real attestation signing, which satisfies `/INTEGRITYCHECK` legitimately.)
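The bit clear itself is small. The shipped step is the PowerShell script named above; the following is an illustrative Rust rendition over an in-memory image, not a port of that script. The function name, the error type, and the signature checks are assumptions; the `0x0080` mask and the `e_lfanew+0x5e` offset are the ones quoted above.

```rust
/// IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY
pub const FORCE_INTEGRITY: u16 = 0x0080;

#[derive(Debug, PartialEq, Eq)]
pub enum PeError {
    Truncated,
    BadSignature,
}

/// Clear FORCE_INTEGRITY in DllCharacteristics (at e_lfanew + 0x5E).
/// Returns (before, after); idempotent, so a second run is a no-op.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), PeError> {
    if image.get(..2) != Some(&b"MZ"[..]) {
        return Err(PeError::BadSignature);
    }
    // e_lfanew: little-endian u32 at 0x3C in the DOS header.
    let b = image.get(0x3C..0x40).ok_or(PeError::Truncated)?;
    let e_lfanew = u32::from_le_bytes([b[0], b[1], b[2], b[3]]) as usize;
    if image.get(e_lfanew..e_lfanew.saturating_add(4)) != Some(&b"PE\0\0"[..]) {
        return Err(PeError::BadSignature);
    }
    let off = e_lfanew + 0x5E;
    let field = image.get_mut(off..off + 2).ok_or(PeError::Truncated)?;
    let before = u16::from_le_bytes([field[0], field[1]]);
    let after = before & !FORCE_INTEGRITY;
    field.copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```

On an image whose `DllCharacteristics` is `0x01E0` this returns `(0x01E0, 0x0160)`, matching the CI-proven transition; running it again returns `(0x0160, 0x0160)`.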
### 6.2 The `iddcx` binding on `wdk-sys` (the make-or-break — proven, the 6 bindgen knobs)
IddCx DDIs are **function-table dispatched** (`IddFunctions[]` indexed by `_IDDFUNCENUM::<Name>TableIndex`,
`IddDriverGlobals` implicit arg 1) — the same model `wdk-sys` already implements for WDF. The vendored
`windows-drivers-rs` 0.5.1 (`packaging/windows/drivers/vendor/`, `[patch.crates-io]`'d) gets a first-class
`ApiSubset::Iddcx` that bindgens `iddcx/1.10/IddCx.h` reusing the identical `wdk_default(config)` baseline
(so WDF/DXGI types **resolve to**, not redefine, `wdk-sys`'s — type-identity by construction). The six
knobs `generate_iddcx` needed (each a real gotcha, all CI-proven):
1. **`--language=c++`** — `wdk_default` parses C; `IddCx.h`'s `IDARG_*` typedefs need C++ (else a "must use
'struct' tag" cascade).
2. **`-DIDD_STUB`** — table-dispatch mode; skips `IddCxFuncEnum.h`'s `#error IDDCX_VERSION_MAJOR not
defined`. **Do NOT add `WDF_STUB`** (would desync the shared WDF type-identity).
3. **`allowlist_recursively(false)` + `allowlist_file("(?i).*iddcx.*")`, full codegen (no `.complement()`)**
— emit ONLY IddCx items; WDF/Win types resolve via `use crate::types::*`.
4. **`allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE")`** — emit the non-WDF types
`wdk-sys` doesn't bindgen, locally. The `_?` is load-bearing (`typedef struct _OPM_X {} OPM_X` needs the
tag AND the alias).
5. **`pub type UINT = ::core::ffi::c_uint;` in `src/iddcx.rs`** — `UINT` is absent from `crate::types`.
6. **`translate_enum_integer_types(true)`** — emit native `u32` reprs for the DXGI/OPM ModuleConsts enums
(nested modules can't see a parent `UINT`).
Wrapper note: table dispatch via `_IDDFUNCENUM::<Name>TableIndex as usize` (the ModuleConsts const, **not**
a NewType `.0`); NTSTATUS is plain `i32` (`wdk_sys::NT_SUCCESS`). The driver `build.rs` adds the IddCxStub
link-search (the import lib is under `iddcx\1.0\` even though headers are `1.10`) + `#[no_mangle] pub static
IddMinimumVersionRequired: ULONG = 4`. The versioned `IDD_STRUCTURE_SIZE!` path is dropped — the WDK links
the iddcx **1.0** stub (lacks the version table); we target 1.10 vs a current framework, so `size_of` is
exactly correct.
### 6.3 Driver port checklist (STEP 08, as landed)
0. workspace `pf-vdisplay`(cdylib)+`wdk-iddcx`; prove `std::thread`+`OwnedHandle` link under UMDF (done).
1. `wdk-iddcx`: 11 typed DDI wrappers via one dispatch macro + re-export the inbound `PFN_*` types.
2. DriverEntry + `IDD_CX_CLIENT_CONFIG` (15 callbacks) + DeviceInitConfig + WdfDeviceCreate +
CreateDeviceInterface (the owned pf GUID) + DeviceInitialize; `edid.rs` salvaged verbatim.
3. DeviceContext + `WDF_DECLARE_CONTEXT_TYPE` blob; `init_adapter` in D0Entry (caps + FP16) →
AdapterInitAsync; the `*2` mode DDIs + `query_target_info` + gamma/HDR accept-stubs. (Box gate: loads
under Secure Boot, enumerates as an IddCx adapter, Status OK.)
4. control plane (`GET_INFO` version handshake the host asserts, ADD/REMOVE/SET_RENDER_ADAPTER/PING/
CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + mode bounds; host switched to
`pf_driver_proto`.
5. `Direct3DDevice` + assign/unassign + `SwapChainProcessor` (worker, `SetDevice` 60×@50 ms single-borrow
retry, top-of-loop `terminate`, `ReleaseAndAcquireBuffer2`, `from_raw_borrowed`).
6. `FramePublisher` on `pf_driver_proto::frame` + keyed-mutex RAII guard; wire into `run_core`. (Box:
full IDD-push glass-to-glass + the **secure-desktop** gate — validated 2026-06-25.)
7. HDR / FP16 ring (validated: Mac connects WITH HDR).
8. its own `.inx` + an `unsafe`-reduction pass (`deny(unsafe_op_in_unsafe_fn)`, per-site `// SAFETY:`).
**Remaining driver work** beyond STEP 8: E1 (DeviceContext-owned state + per-`IDDCX_MONITOR`
`EvtCleanupCallback` → unblock `max_concurrent>1` — see §4 for why it's deliberately deferred), the
slot-reclaim-on-REMOVE fix (§4 P1.3), and folding the gamepad-driver side onto `pf_driver_proto` (M4 tail,
§4 P3).
### 6.4 Resolved product decisions (the five forks)
**A** the host was refactored **in place** (staged, behavior-preserving), not greenfield-rebuilt — the
driver *was* rebuilt fresh. **B** IDD-push primary for everything incl. the **secure desktop** (validated);
WGC+DDA demoted to non-IddCx fallbacks. **C** all drivers on `microsoft/windows-drivers-rs` (+ the `iddcx`
subset; `/INTEGRITYCHECK` solved) — done for `pf-vdisplay` and now for the gamepad drivers (M4, `92e6802`).
**D** keep GameStream (Moonlight), default to secure `serve`. **E** concurrent sessions: the host-side
preempt dance was removed by the ownership-model work, but true `max_concurrent>1` on Windows stays blocked
on the E1 driver swap-chain-reuse work (deliberately deferred, §4). **Rejected: DeviceContext-per-monitor
ownership** — see the E1 stability decision in §4 (it would add a use-after-free window for no gain under
`ProcessSharingDisabled`).
---
## Origins & design rationale (from the original plan)
This folds in the durable rationale from the original Windows host + client plan
([`windows-host.md`](windows-host.md), now a stub; full original text in git history). The Windows host
began (2026-06-10 to 2026-06-14) as a *"add backends behind the existing traits"* job, not a parallel
port — `punktfunk-core` and the whole control plane are platform-agnostic, and the host already compiled
on non-Linux (macOS) thanks to existing `cfg(target_os)` gating. These framing decisions shaped what
shipped and still explain *why* the code is the way it is:
- **Build order: host-first.** A user preference (the research had recommended *client*-first, since the
client is unblocked by the no-GPU problem and becomes the host's test endpoint). The trade-off held —
the GPU-gated steps were the only ones that stalled GPU-less.
- **Trait-based abstraction → ~95% reuse.** `punktfunk-core` (protocol/FEC/crypto/session/transport/QUIC/
C ABI), the GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet), the management REST API +
`native_pairing`/`discovery`, and the `punktfunk1`/`spike`/`pipeline` orchestration all carried over
unchanged — only the OS-touching backends behind `Capturer`/`Encoder`/`VirtualDisplay`/`InputInjector`/
`AudioCapturer`/`VirtualMic` are new `#[cfg(windows)]` code. Getting to MSVC needed only ~3 `cfg`-gates
(gate the `std::os::fd`/`OwnedFd` unix-isms in `main.rs`/`vdisplay.rs`).
- **The no-GPU dev strategy.** Most of the port was built + validated on a **GPU-less Windows VM**: the
MSVC compile, the virtual-display control path (WARP), the openh264 software-encode pipeline (full
capture→encode→FEC→UDP transport minus HW), SendInput injection + interactive-session/desktop-reattach,
gamepad + rumble, and the entire client (software-decode loopback). Only NVENC-D3D11 zero-copy, the
DDA-vs-WGC bake-off, split-encode/bitrate-ceiling, and *all* glass-to-glass numbers deferred to a real
NVIDIA box (no perf claim transfers from Linux).
- **Windows-specific structural issues (no Linux precedent)** — these are the gotchas that drove the
service + capture design and remain true:
- **Interactive session, not a Session-0 service.** SendInput can't reach the desktop from Session 0;
Desktop Duplication / capture need the interactive session. Hence the SYSTEM-in-interactive-session
supervisor (§2.6, [`windows-service.md`](windows-service.md)) and the `OpenInputDesktop`/
`SetThreadDesktop` re-attach to survive UAC/lock desktop switches.
- **Clock epoch.** The skew handshake assumes both ends read the same realtime epoch in ns — the Windows
host must emit timestamps from `GetSystemTimePreciseAsFileTime`→Unix-epoch-ns, or cross-machine latency
+ `ClockProbe`/`ClockEcho` break (std `SystemTime` on Windows is historically coarser).
- **No audio endpoint on a headless IDD.** WASAPI loopback needs a real/virtual render device; the
virtual *mic* (client→host) has no clean user-mode path — deferred.
- **Color/range.** All clients assume BT.709 limited-range; the BGRA→I420/NV12 path must match or colors
wash out — validated against the existing decoders.
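For the clock-epoch point above, the FILETIME to Unix-epoch-ns conversion is plain arithmetic and can be sketched portably. A FILETIME counts 100 ns ticks since 1601-01-01 UTC; the function name and the `(high, low)` parameter split (mirroring the two `u32` halves of a FILETIME) are assumptions, and the actual `GetSystemTimePreciseAsFileTime` call is left out so the sketch builds on any platform.

```rust
/// 100 ns ticks between 1601-01-01 and 1970-01-01 (11_644_473_600 s).
pub const FILETIME_UNIX_EPOCH: u64 = 116_444_736_000_000_000;

/// Convert a FILETIME (high/low 32-bit halves) to nanoseconds since the Unix epoch.
/// Returns None for timestamps before 1970 or on overflow.
pub fn filetime_to_unix_ns(high: u32, low: u32) -> Option<u64> {
    let ticks = (u64::from(high) << 32) | u64::from(low);
    ticks.checked_sub(FILETIME_UNIX_EPOCH)?.checked_mul(100)
}
```

Both ends of the skew handshake then compare values on the same epoch and unit, independent of how coarse the platform's default wall clock is.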
**SudoVDA → pf-vdisplay evolution.** The original plan was built around **SudoVDA**, an off-the-shelf
indirect display driver (the same IDD Apollo ships) — chosen to avoid writing/WHQL-signing a driver and to
get arbitrary `WxH@Hz` modes on the fly. It carried the host all the way to live-validated NVENC on a real
RTX 4090. It was then replaced by the all-Rust `pf-vdisplay` IddCx driver (which solved
`/INTEGRITYCHECK` self-signing, §6.1, and gave us the IDD-**push** zero-copy capture path that captures the
secure desktop directly) and **deleted in commit `84a3b95`**; `pf-vdisplay` is now the sole
virtual-display backend. The full SudoVDA control protocol (IOCTL layout, watchdog keepalive, GDI-name
resolution) lives in git history if ever needed as a reference.