docs(design): trim shipped plans, consolidate cluster, add index

Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00
parent 9ea2c17419
commit 7b99b41ede
27 changed files with 1322 additions and 3229 deletions
+160 -182
@@ -5,107 +5,80 @@
> a signed Inno Setup installer with a LocalSystem SCM service. Live-validated on the RTX box through
> 5120×1440@240 HDR, the secure desktop (lock/UAC), and a fullscreen game.
>
> This file **consolidates and replaces** five earlier docs (now retired into it): the rewrite design
> plan, the Goal-1 staged-refactor plan, the audit, the audit-remediation tracker, and the
> fullscreen-game capture-bug analysis. See the [consolidation note](#appendix--consolidation-note) for
> what moved where. **Last updated 2026-06-26.** Work lives on branch **`windows-host-goal1`** (off
> `main`, not yet merged).
> This file is the consolidated Windows-host doc — it absorbs the rewrite design plan, the Goal-1
> staged-refactor plan, the audit + remediation tracker, the fullscreen-game capture-bug analysis, and
> the durable rationale from the original `windows-host.md` implementation plan (now a stub).
> **Last updated 2026-06-26.** All of this work is **merged to `main`** (the `windows-host-goal1`
> branch landed at `3e7c9bd`).
---
## 1. Status at a glance
The Windows host is **functionally complete and validated on glass.** The hard, high-risk proofs are done:
a clean all-Rust IddCx driver on the unified `windows-drivers-rs` stack (the `/INTEGRITYCHECK` answer +
the `iddcx` `wdk-sys` binding), IDD-push zero-copy capture at 5K@240 HDR, the secure desktop (Winlogon /
UAC / lock), and the host re-architected into a clean, typed, layered shape. What remains is
**non-blocking**: hygiene (host `unsafe` lints, a few `OwnedHandle` rollouts), the SudoVDA backend
deletion (decoupled, not yet removed), a driver robustness gap (slot reclaim), the gamepad-driver
unification (M4), and old-monolith cleanup (M6) — plus the merge to `main`.
**Goals 1–3 and milestones M0–M4 are complete and merged to `main`.** The host has a clean, typed,
layered architecture (`HostConfig → SessionPlan → SessionContext`, `windows/`+`linux/` confinement, a
single `VirtualDisplayManager` ownership model, `EncoderCaps`); the all-Rust IddCx `pf-vdisplay` driver
loads self-signed under Secure Boot and does IDD-push zero-copy capture at 5K@240 HDR including the
**secure desktop** (Winlogon/UAC/lock); SudoVDA is gone (`84a3b95`) — `pf-vdisplay` is the sole
virtual-display backend; and the three UMDF drivers (`pf-vdisplay`, `pf-dualsense`, `pf-xusb`) now build
from source in one unified `packaging/windows/drivers/` workspace (M4, `92e6802`). The shipped path
(IDD-push + NVENC) is live-validated on glass; the AMF/QSV encode path is CI-green but not yet
on-hardware (no AMD/Intel Windows box in the lab).
One framing correction baked into this doc: the host was **not** greenfield-rebuilt as the original plan
imagined. It was **refactored in place** via a staged, behavior-preserving sequence (the "Goal-1" plan),
which kept the live-validated host working at every step. The driver, by contrast, *was* rebuilt fresh
(the new `packaging/windows/drivers/pf-vdisplay/` tree).
Ground the details against the code: `crates/punktfunk-host/src/windows/`,
`crates/punktfunk-host/src/{capture,encode,inject,audio,vdisplay}/windows/`, and
`packaging/windows/drivers/`.
### Scorecard (verified against `windows-host-goal1` HEAD, 2026-06-25)
| Item | Status | Evidence |
|---|---|---|
| **Goal 1** — clean, layered host architecture | ✅ **DONE** | `config.rs` (`HostConfig`), `session_plan.rs` (`SessionPlan`), `SessionContext`, `windows/`+`linux/` confinement (`38c68c3`), `VirtualDisplayManager` (§2.5), `EncoderCaps` (`0ccd0fe`) |
| **Goal 2** — drop every trace of SudoVDA | ✅ **DONE** | reach-in decoupled (F1: `d638a93`/`e60cda3` → `win_adapter`/`win_display`), then the `sudovda.rs` backend + the dual-backend select **deleted** (this branch) — pf-vdisplay is the sole Windows virtual-display backend |
| **Goal 3** — minimize `unsafe` + P0 lints | 🟡 **PARTIAL** (**box-validated**) | driver `deny(unsafe_op_in_unsafe_fn)` (`a755d6e`); **`OwnedHandle`/RAII rollout** — `idd_push.rs` (`011607e`, view-leak fix) + `service.rs` child/job (`4c95ba7`) + the 3 gamepad backends via shared `gamepad_raii.rs` (`e5c2b4e`) + the IDD-push `KeyedMutexGuard` hot loop (`6585643`) + the **SCM STOP/SESSION events** → `OnceLock<OwnedHandle>` (`61c02e6`, runtime-validated: clean ~1 s `sc stop`); **driver `pod_init!`** (`bf57704`, 27→1). **On-glass clean: host clippy `-D warnings` + driver build** (RTX box; `bd05bc8` fixed 11 lints the gate surfaced). The host-side raw-handle smuggling is fully retired; only host-crate P0 lints remain (deferred — churn>value) |
| **M0** — proto ABI + driver toolchain + `/INTEGRITYCHECK` + `iddcx` | ✅ **DONE** | `pf-driver-proto`; vendored `windows-drivers-rs` 0.5.1; `clear-force-integrity.ps1`; CI-green |
| **M1** — new IddCx driver, first light + HDR | ✅ **DONE (on-glass)** | STEP 0–8 (`d7a9fbf`→`cd59151`); HDR live ("Mac connects WITH HDR", `6399d28`) |
| **M2** — IDD-push capture + NVENC, glass-to-glass | ✅ **DONE (on-glass)** | 5120×1440@240 HDR zero-copy; integrated into the host path |
| **M3** — service / input / audio / **secure desktop** | ✅ **DONE (on-glass)** | secure desktop (lock/UAC) **owner-confirmed 2026-06-25** — IDD-push captures it + input reaches it |
| **M4** — gamepad drivers onto the unified stack | ❌ **OPEN** | `pf_dualsense`/`pf_xusb` still standalone (`packaging/windows/{dualsense,xusb}-driver/`), not in `drivers/` workspace |
| **M5** — WGC/DDA fallback reshape + GameStream-on-pipeline + AMF/QSV | 🟡 **PARTIAL** | fallbacks exist (`wgc.rs`/`wgc_relay.rs`/`dxgi.rs`), not reshaped onto the new seams; AMF/QSV CI-only (no lab hw) |
| **M6** — cut over + delete the old monoliths | 🟡 **PARTIAL** | old `vdisplay-driver/` tree deleted (`a2bd0cd`); host monoliths + bring-up scaffolding (`spawn_observer`/`DebugBlock`) remain |
| **Game-capture bug (GB1)** — fullscreen game breaks IDD-push | ✅ **FIXED** | resolution-listening recovery (`c87bfe0`) + open-time DDA failover (`f98ab07`) + driver guard/log (`789ad49`) |
| **Audit P0/P1/P2** | ✅ mostly **RESOLVED** | watchdog, `SET_RENDER_ADAPTER`, log gate, mode bounds, IDD-push fallback, F1, out-ring/HDR-ring, proto asserts — all landed; **open:** host hygiene (§8), E1 completion, slot-reclaim |
**What remains (all non-blocking):** the `pf-vdisplay` slot-reclaim-on-REMOVE fix needs an on-glass
reconnect-storm A/B (§4 P1.3); host-crate `unsafe` lint hygiene + old-monolith / bring-up-scaffolding
cleanup (§4 P2); and the hardware-gated items — AMF/QSV on-glass, hybrid-GPU `SET_RENDER_ADAPTER`,
the WGC/DDA fallback reshape, and true `max_concurrent>1` (§4 P3). One framing note: the host was
**not** greenfield-rebuilt — it was **refactored in place** via a staged, behavior-preserving sequence
that kept the live host working at every step; only the *driver* was rebuilt fresh.
---
## 2. Architecture (what is on disk)
A ~1-page map; the empirical constraints these encode are in §3, the deep reference is in §6.
### 2.1 Layering & crates
- **`crates/punktfunk-host`** — one shared host crate (Linux + Windows; not split). Platform code is
confined under per-module `windows/`+`linux/` folders behind `#[cfg]` seams (`capture/{windows,linux}/`,
`encode/{windows,linux}/`, `inject/{windows,linux}/`, `audio/{windows,linux}/`, `vdisplay/{windows,linux}/`,
and top-level `src/windows/`+`src/linux/`). Module names stay flat (`#[path]`), so caller paths are
platform-agnostic.
`encode/…`, `inject/…`, `audio/…`, `vdisplay/…`, plus top-level `src/windows/`+`src/linux/`). Module
names stay flat (`#[path]`), so caller paths are platform-agnostic.
- **`crates/punktfunk-core`** — the one linked protocol/FEC/crypto/QUIC core (unchanged here).
- **`crates/pf-driver-proto`** — the owned, `no_std` host↔driver ABI (frame ring + control plane +
gamepad SHM), consumed by both the host crate and the driver workspace (§2.7).
gamepad SHM), consumed by both the host crate and the driver workspace.
- **`packaging/windows/drivers/`** — the unified driver workspace on `microsoft/windows-drivers-rs`
(vendored 0.5.1 + an `iddcx` subset): members `pf-vdisplay` (the IddCx display driver), `wdk-iddcx`
(the typed IddCx DDI wrappers), `wdk-probe` (the CI link/surface gate), `vendor/{wdk-build,wdk-sys}`.
(vendored 0.5.1 + an `iddcx` subset): `pf-vdisplay` (the IddCx display driver), `pf-dualsense` +
`pf-xusb` (the gamepad drivers, folded in by M4), `wdk-iddcx` (typed IddCx DDI wrappers), `wdk-probe`
(the CI link/surface gate), `vendor/{wdk-build,wdk-sys}`.
### 2.2 Session resolution — `HostConfig → SessionPlan → SessionContext` (Goal-1 realized)
### 2.2 Session resolution, ownership, and seam traits (Goal-1)
The old ~40-knob `PUNKTFUNK_*` env soup, re-read and recomputed in three places, is replaced by a
resolve-once pipeline:
The old ~40-knob `PUNKTFUNK_*` env soup (re-read and recomputed in three places) is replaced by a
**resolve-once** pipeline: `config.rs` `HostConfig` (typed, parsed once) → `session_plan.rs` `SessionPlan`
(a `Copy` plan resolved once per session — `CaptureBackend::resolve()` picks `IddPush | Dda | Wgc`,
`resolve_topology` picks `SingleProcess | TwoProcessRelay`; this killed the latent capture/encode
backend-disagreement bug) → `SessionContext` (bundles the ~13 session args + plane receivers, moved into
the stream thread).
- **`config.rs` `HostConfig`** — typed config parsed **once** from `host.env`/env/flags
(`idd_push`/`encoder_pref`/`no_wgc`/`capture_backend`/`render_adapter`/`secure_dda`/`ten_bit`/`zerocopy`/…).
Each field's parser is byte-identical to the read it replaced. (Runtime-mutated Linux session vars from
`vdisplay::apply_session_env`, and single-use local tuning knobs, are deliberately kept live — see the
`config.rs` header.)
- **`session_plan.rs` `SessionPlan { display, capture, topology, encoder, input_format, bit_depth, hdr,
pipeline_depth }`** — a `Copy` plan resolved **once** per session from `HostConfig` + the negotiated
bit-depth, logged, and threaded through `build_pipeline`. `CaptureBackend::resolve()` is the one
resolver (`IddPush | Dda | Wgc`); `resolve_topology` decides `SingleProcess | TwoProcessRelay`. This
killed the latent capture/encode backend-disagreement bug.
- **`SessionContext`** — bundles the session entry's ~13 args (was `#[allow(too_many_arguments)]`) and the
plane receivers into one owned struct moved into the stream thread.
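The resolve-once idea can be sketched in a few lines. This is a minimal illustration, not the code in `config.rs`/`session_plan.rs`: the two-field `HostConfig`, the two-field `SessionPlan`, and the precedence inside `resolve()` are assumptions made for the example; only the type names and the `IddPush | Dda | Wgc` variants come from the text above.

```rust
/// Hypothetical, trimmed-down stand-in for the typed config parsed once at startup.
#[derive(Debug, Clone, Copy)]
pub struct HostConfig {
    pub idd_push: bool,
    pub no_wgc: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureBackend {
    IddPush,
    Dda,
    Wgc,
}

impl CaptureBackend {
    /// The single resolver: consumers read the result instead of re-deriving it.
    /// The precedence shown here is an assumption for illustration.
    pub fn resolve(cfg: &HostConfig) -> Self {
        if cfg.idd_push {
            CaptureBackend::IddPush
        } else if cfg.no_wgc {
            CaptureBackend::Dda
        } else {
            CaptureBackend::Wgc
        }
    }
}

/// A `Copy` plan: resolved once per session, then passed by value.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SessionPlan {
    pub capture: CaptureBackend,
    pub bit_depth: u8,
}

impl SessionPlan {
    pub fn resolve(cfg: &HostConfig, negotiated_bit_depth: u8) -> Self {
        SessionPlan {
            capture: CaptureBackend::resolve(cfg),
            bit_depth: negotiated_bit_depth,
        }
    }
}
```

Because capture and encode both receive the same `SessionPlan` value, they cannot disagree about the backend, which is the class of bug the text says this design removed.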
Ownership is a single **OnceLock `VirtualDisplayManager`** (`vdisplay/windows/manager.rs`) owning a
*typed* `Arc<OwnedHandle>` control-device handle (no raw-`isize` cross-thread smuggle), a refcounted
Idle/Active/Lingering state machine, and the monitor generation; a per-session `MonitorLease`'s `Drop`
releases the refcount (a stale lease can't tear down a fresh monitor). This deleted a fistful of
`CURRENT_MON_GEN`/`MGR`/`IDD_*` globals and validated on glass at **0 leaked monitors across a reconnect
storm**, A/B-equivalent to the shipping host.
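A rough sketch of the lease pattern described above, under stated assumptions: the `Lingering` state, the `OnceLock`, and the real control-device handle are omitted, and `force_recreate` is a hypothetical helper that only exists here to simulate a monitor being replaced while an old lease is still alive. The real `manager.rs` may differ in every detail.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Default)]
struct ManagerState {
    generation: u64, // bumped each time a fresh monitor is created
    refcount: u32,   // live leases on the current monitor
    active: bool,    // simplified Idle/Active (no Lingering in this sketch)
}

#[derive(Clone, Default)]
pub struct VirtualDisplayManager {
    state: Arc<Mutex<ManagerState>>,
}

/// Per-session lease; dropping it releases one reference.
pub struct MonitorLease {
    mgr: VirtualDisplayManager,
    generation: u64,
}

impl VirtualDisplayManager {
    pub fn acquire(&self) -> MonitorLease {
        let mut s = self.state.lock().unwrap();
        if !s.active {
            s.active = true;
            s.generation += 1;
            s.refcount = 0;
        }
        s.refcount += 1;
        MonitorLease { mgr: self.clone(), generation: s.generation }
    }

    /// Hypothetical test helper: replace the monitor under any live leases.
    pub fn force_recreate(&self) {
        let mut s = self.state.lock().unwrap();
        s.generation += 1;
        s.refcount = 0;
        s.active = true;
    }

    pub fn is_active(&self) -> bool {
        self.state.lock().unwrap().active
    }

    pub fn refcount(&self) -> u32 {
        self.state.lock().unwrap().refcount
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = self.mgr.state.lock().unwrap();
        // A stale lease belongs to an older monitor: it must not touch the fresh one.
        if s.generation != self.generation {
            return;
        }
        s.refcount = s.refcount.saturating_sub(1);
        if s.refcount == 0 {
            s.active = false;
        }
    }
}
```

The generation check in `Drop` is what makes a late-dropped lease harmless: it compares against the manager's current generation before decrementing anything.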
### 2.3 Ownership model — `VirtualDisplayManager` + `MonitorLease` (§2.5 realized)
The seam traits (`VirtualDisplay`/`VirtualOutput`/`VirtualLease`, `Capturer`, `Encoder`,
`AudioCapturer`/`VirtualMic`/`InputInjector`/`PadManager`) got two tightenings: the capturer takes the
desired `OutputFormat { gpu, hdr }` **in** (killing the `capture → encode::windows_resolved_backend()`
back-reference), and `Encoder::caps() -> EncoderCaps` (§2.4) lets the session glue route loss-recovery by
query.
A single **OnceLock `VirtualDisplayManager`** (`vdisplay/windows/manager.rs`) owns a *typed*
`Arc<OwnedHandle>` control-device handle (no raw-`isize` cross-thread smuggle), the refcounted
Idle/Active/Lingering state machine, and the monitor generation (`AtomicU64`). Both Windows backends
(`pf_vdisplay`, `sudovda`) shrank to thin `VdisplayDriver` impls (`open`/`add_monitor`/`remove_monitor`/
`ping`) behind it; `MonitorKey = Guid | Session(u64)`. A per-session `MonitorLease`'s `Drop` releases the
refcount (a stale lease can't tear down a fresh monitor). This deleted the old `CURRENT_MON_GEN`/`MON_GEN`/
two-`MGR`/`IDD_PERSIST`/`IDD_SETUP_LOCK`/`IDD_SESSION_STOP` globals. Validated on glass: **0 leaked active
monitors across a reconnect storm**, A/B-equivalent to the shipping host. (The 5-agent map found
`CURRENT_MON_GEN` had been **write-only** — the per-frame "monitor-gen bail" was never wired — so the gen
lives on the manager + lease only.)
### 2.4 The seam traits
`VirtualDisplay`/`VirtualOutput`/`VirtualLease` (RAII keepalive = release), `Capturer`
(`next_frame`/`try_latest`/`set_active`/`hdr_meta`/`pipeline_depth`), `Encoder`
(`submit`/`caps`/`request_keyframe`/`set_hdr_meta`/`invalidate_ref_frames`/`poll`/`flush`),
`AudioCapturer`/`VirtualMic`/`InputInjector`/`PadManager`. Realized tightenings: the capturer takes the
desired `OutputFormat { gpu, hdr }` **in** (killed the `capture → encode::windows_resolved_backend()`
back-reference recomputed in `dxgi.rs`); and `Encoder::caps() -> EncoderCaps { supports_rfi,
supports_hdr_metadata }` lets the session glue route loss-recovery by query (only Windows direct-NVENC
overrides it; the GameStream loop gates the RFI path on `supports_rfi`).
### 2.5 Capture — IDD-push primary (normal **and** secure desktop), WGC/DDA fallback, GB1 recovery
### 2.3 Capture — IDD-push primary (normal **and** secure desktop), WGC/DDA fallback, GB1 recovery
**IDD-push is the universal primary path.** Capture comes straight from the driver's shared keyed-mutex
texture ring (`capture/windows/idd_push.rs`) — no Desktop Duplication, no `win32u` reparenting hook. The
@@ -118,7 +91,7 @@ primary path.
- **Open-time fallback:** `IddPushCapturer::open` waits a bounded ~4 s for a *first frame* (not just
`DRV_STATUS_OPENED`); on attach failure it returns the keepalive back so `capture.rs` opens **DDA** on
the same `WinCaptureTarget` — never a 20 s black bail (audit §5.1, `ed58365`/`f98ab07`).
the same `WinCaptureTarget` — never a 20 s black bail (`ed58365`/`f98ab07`).
- **Mid-session game mode-set recovery (GB1, fixed):** the 250 ms poll follows the display's *actual*
resolution (`win_display::active_resolution`, CCD/GDI) and recreates the ring on any descriptor change
(size **or** HDR) → the driver re-attaches → frames resume at the game's mode, **no reconnect**. If a
@@ -127,9 +100,10 @@ primary path.
from Windows (`c87bfe0`; the driver's `publish()` width/height guard + flushed log is `789ad49`).
- **WGC + DDA** stay as demoted fallbacks for non-IddCx hardware (`wgc.rs`/`dxgi.rs`). The two-process WGC
secure-desktop relay (`wgc_relay.rs`) is no longer load-bearing now that IDD-push handles the secure
desktop; it is kept recoverable but slated for M5/M6 cleanup.
desktop; it is kept recoverable but slated for M5/M6 cleanup. (Its constraint analysis is archived in
[`archive/windows-secure-desktop.md`](archive/windows-secure-desktop.md).)
### 2.6 Encode — NVENC / AMF / QSV / software; `EncoderCaps`; HDR
### 2.4 Encode — NVENC / AMF / QSV / software; `EncoderCaps`; HDR
`encode/windows/` dispatches per DXGI adapter vendor (`open_video`): **NVENC** (NVIDIA, direct SDK,
`nvenc.rs` — caps-probe-before-configure, bitrate-clamp binary search, true RFI over the DPB, in-band
@@ -139,38 +113,34 @@ HEVC Main10 + BT.2020 PQ; the client auto-detects PQ from the VUI. The encoder a
size/format/HDR change per frame (tears down + re-inits), so the GB1 capturer's resolution changes are
handled downstream with no API change.
### 2.7 Host↔driver ABI — `pf-driver-proto`
`Encoder::caps() -> EncoderCaps { supports_rfi, supports_hdr_metadata }` lets the session glue route
loss-recovery by query (only Windows direct-NVENC overrides it; the GameStream loop gates the RFI path on
`supports_rfi` rather than hard-coding per-backend knowledge into the glue).
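To illustrate routing by query, here is a small sketch. The `EncoderCaps` fields and `Encoder::caps()` follow the text; the `DirectNvenc`/`SoftwareEncoder` types, the `LossRecovery` enum, the keyframe fallback, and the specific capability values are hypothetical and chosen only to make the example compile.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct EncoderCaps {
    pub supports_rfi: bool,
    pub supports_hdr_metadata: bool,
}

pub trait Encoder {
    /// Default: no optional capabilities. Backends override only what they support.
    fn caps(&self) -> EncoderCaps {
        EncoderCaps::default()
    }
}

/// Hypothetical backend that overrides the default caps.
pub struct DirectNvenc;
/// Hypothetical backend that keeps the default caps.
pub struct SoftwareEncoder;

impl Encoder for DirectNvenc {
    fn caps(&self) -> EncoderCaps {
        // Values assumed for the example.
        EncoderCaps { supports_rfi: true, supports_hdr_metadata: true }
    }
}

impl Encoder for SoftwareEncoder {}

#[derive(Debug, PartialEq, Eq)]
pub enum LossRecovery {
    InvalidateRefFrames,
    RequestKeyframe,
}

/// Session glue: decides by asking the encoder, not by matching on the backend type.
pub fn recover_from_loss(enc: &dyn Encoder) -> LossRecovery {
    if enc.caps().supports_rfi {
        LossRecovery::InvalidateRefFrames
    } else {
        LossRecovery::RequestKeyframe
    }
}
```

Adding a new backend then needs no change in the glue: it either overrides `caps()` or inherits the conservative default.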
One `no_std` crate, both build graphs. Owns the **frame plane** (`SharedHeader`, `FrameToken { generation,
seq, slot }` with `pack`/`unpack`, `Global\pfvd-*` name helpers), the **control plane** (fresh interface
GUID — not SudoVDA's `e5bcc234`; contiguous `0x900` IOCTL ops; `u64` session id; a real `GET_INFO` version
handshake the host **asserts** + bails on mismatch), and the **gamepad SHM** (`XusbShm` 64 B, `PadShm`
256 B incl. `device_type`). `bytemuck`-`Pod` + `size_of` **and** `offset_of!` asserts make ABI drift a
**compile error** (`95dcef3`). The host-side gamepad consumers derive their layouts from here; the
**driver-side** gamepad drivers do not yet (M4).
### 2.5 Host↔driver ABI & the `pf-vdisplay` driver
### 2.8 The `pf-vdisplay` IddCx driver
`pf-driver-proto` is one `no_std` crate in both build graphs. It owns the **frame plane** (`FrameToken`
+ `Global\pfvd-*` names), the **control plane** (a fresh interface GUID — *not* SudoVDA's `e5bcc234`;
contiguous `0x900` IOCTL ops; a `GET_INFO` version handshake the host **asserts** + bails on mismatch),
and the **gamepad SHM** (`XusbShm`/`PadShm` incl. `device_type`). `bytemuck`-`Pod` + `size_of` **and**
`offset_of!` asserts make ABI drift a **compile error**.
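The "ABI drift is a compile error" technique can be shown with a std-only sketch. The field widths of `FrameToken`, the bit layout used by `pack`/`unpack`, and the `DemoShm` struct are all assumptions for illustration; the real `pf-driver-proto` layouts are not reproduced here, and the `bytemuck` derive it uses is left out to keep the example dependency-free.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical field widths; only the field names come from the doc.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u16,
    pub slot: u16,
    pub seq: u32,
}

impl FrameToken {
    /// Packs the token into one u64: generation | slot | seq (high to low bits).
    pub const fn pack(self) -> u64 {
        ((self.generation as u64) << 48) | ((self.slot as u64) << 32) | (self.seq as u64)
    }

    pub const fn unpack(raw: u64) -> Self {
        FrameToken {
            generation: (raw >> 48) as u16,
            slot: (raw >> 32) as u16,
            seq: raw as u32,
        }
    }
}

/// Hypothetical shared-memory block with a fixed 64-byte layout.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct DemoShm {
    pub magic: u32,
    pub device_type: u32,
    pub payload: [u8; 56],
}

// Evaluated at compile time: changing a field's type or order breaks the build
// instead of silently breaking the other side of the ABI.
const _: () = assert!(size_of::<DemoShm>() == 64);
const _: () = assert!(offset_of!(DemoShm, device_type) == 4);
const _: () = assert!(offset_of!(DemoShm, payload) == 8);
```

Asserting offsets as well as size matters because two fields can swap places without changing the total size.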
All-Rust UMDF IddCx driver on `windows-drivers-rs` + the `iddcx` `wdk-sys` subset. STEP 0–8 landed
(`packaging/windows/drivers/pf-vdisplay/src/`): `entry.rs` (DriverEntry + `IDD_CX_CLIENT_CONFIG`, 15
callbacks), `adapter.rs` (caps + FP16 + `SET_RENDER_ADAPTER`), `monitor.rs`/`callbacks.rs` (the `*2` HDR
mode DDIs, EDID verbatim), `swap_chain_processor.rs` (the worker, `SetDevice`-retry + top-of-loop
`terminate`), `frame_transport.rs` (the `FramePublisher` on `pf_driver_proto::frame`), `control.rs` (the
typed IOCTL dispatch + host-gone **watchdog** + mode bounds). Self-signed-loadable under Secure Boot
(FORCE_INTEGRITY cleared post-link). **Known gaps:** ownership state is still partly process-global
(`MONITOR_MODES`/`NEXT_ID`/`ADAPTER`/`DEVICE_POOL`) with `EvtCleanupCallback` on the **WDFDEVICE** (not
per-`IDDCX_MONITOR`) — see E1 in §4; and it does not reclaim IddCx monitor **slots** on REMOVE (the
ghost-monitor wedge, §4).
The driver (`packaging/windows/drivers/pf-vdisplay/src/`) is an all-Rust UMDF IddCx driver on
`windows-drivers-rs` + the `iddcx` `wdk-sys` subset; the STEP 0–8 build is the checklist in §6.3, its
internals are the invariants in §3, and it loads self-signed under Secure Boot (FORCE_INTEGRITY cleared
post-link, §6.1). **Known gaps:** ownership state is still partly process-global with
`EvtCleanupCallback` on the **WDFDEVICE** (a deliberate, sound choice — E1 in §4); and
slot-reclaim-on-REMOVE (§4 P1.3).
### 2.9 Service, packaging, installer
### 2.6 Service, packaging, installer
A `LocalSystem` SCM supervisor (`service.rs`) token-retargets and `CreateProcessAsUserW`s `serve` into the
console session (so `SendInput` reaches the streamed desktop + the secure desktop), relaunches on
session-change, and kills-on-close via a Job Object. Shipped as a **signed Inno Setup** `setup.exe`
(`packaging/windows/`, `windows-host.yml`) that bundles the **new** `pf-vdisplay` driver
(`pf_vdisplay.inx` in-tree, old `vdisplay-driver/` tree deleted) + FFmpeg DLLs and delegates to `service
install`. GameStream (Moonlight) is kept but the installer/service default to secure `serve` (GameStream
opt-in).
A `LocalSystem` SCM supervisor (`windows/service.rs`) token-retargets and `CreateProcessAsUserW`s `serve`
into the console session (so `SendInput` reaches both the streamed and the secure desktop), relaunches on
session-change, and kills-on-close via a Job Object — the Sunshine/Apollo model (rationale:
[`windows-service.md`](windows-service.md)). Shipped as a **signed Inno Setup** `setup.exe`
(`packaging/windows/`, `windows-host.yml`) that builds + signs all three drivers from source, bundles
them + the FFmpeg DLLs, and delegates to `service install`. GameStream (Moonlight) is kept, but the
installer/service default to secure `serve` (GameStream opt-in).
---
@@ -206,11 +176,12 @@ These are expensive empirical wins; keep them intact when touching the code:
## 4. Open work / next tasks (prioritized)
**P1 — ship-readiness / correctness**
1. **Merge `windows-host-goal1` → `main` + push** (outward-facing → confirm first). Pushing also runs the
full Windows CI matrix incl. the `amf-qsv` encode path, which local checks skip.
2. **Make IDD-push the default**: today it is gated behind `PUNKTFUNK_IDD_PUSH` (`config.rs` default
`false`); deployment sets it in `host.env`. Flip the code default (with the WGC/DDA fallback already in
place) so a fresh install runs the validated path, or document the `host.env` requirement explicitly.
1. **Goal-1 → `main` merge — ✅ DONE.** The `windows-host-goal1` branch is merged (tip `3e7c9bd`); the
full Windows CI matrix (incl. the `amf-qsv` encode path that local checks skip) runs on push.
2. **IDD-push default — ✅ resolved via `host.env`.** The shipped default `host.env` sets
`PUNKTFUNK_IDD_PUSH=1`, so a fresh install runs the validated IDD-push path (with the WGC/DDA fallback
in place). The bare *in-code* default (`config.rs`) is still `false` (the dev / non-pf-driver default);
flipping it to follow the deployed default is an optional tidy.
3. **pf-vdisplay slot reclaim on REMOVE** (driver robustness) — 🟡 **fix landed, on-glass-validation
pending.** Sustained ADD/REMOVE churn wedged the driver (`ADD → 0x80070490 ERROR_NOT_FOUND`) because the
monitor id (EDID serial / `ConnectorIndex` / container GUID) was a **monotonic** `NEXT_ID`, never
@@ -220,42 +191,16 @@ These are expensive empirical wins; keep them intact when touching the code:
sustained churn on the RTX box, so this needs an **on-glass reconnect-storm A/B** to confirm (the box is
ephemeral). Keep `packaging/windows/reset-pf-vdisplay.ps1` as the recovery until validated.
**P2 — hygiene / architecture completion** (the unsafe-reduction + stability priority)
4. **D1-host — host-crate P0 lints.** Add `#![deny(unsafe_op_in_unsafe_fn)]` +
`#![warn(clippy::undocumented_unsafe_blocks)]` to the host crate and fix the fallout (~30 of the 52
`unsafe fn`s need an inner `unsafe {}`). Stage it **per-module, Linux-first** (item-level `#[deny]` on
`linux/zerocopy/cuda.rs`/`egl.rs`, `encode/linux/vaapi.rs` — locally verifiable), then the Windows
modules (CI-gated), then promote to crate-level. The driver already has the deny.
5. **D2 — `OwnedHandle` / RAII rollout.** ✅ **DONE (complete).** `capture/windows/idd_push.rs` (`011607e`:
a `MappedSection` RAII for the mapping handle **+** the leaked `MapViewOfFile` view, + `OwnedHandle` for
the event / ring-slot shared handles); `windows/service.rs` (`4c95ba7`: the child process/thread + Job
handles, ~9 `CloseHandle` deleted); the **three gamepad backends** (`e5c2b4e`: a shared
`inject/windows/gamepad_raii.rs` — `Shm` for the section+view, `SwDevice` for the devnode — replacing the
duplicated `create_shm_section` + three hand-written `Drop`s); and the **SCM STOP/SESSION events**
(`61c02e6`: `AtomicIsize` raw-`isize` smuggle → `OnceLock<OwnedHandle>` the capture-free C handler reads,
owned for the process lifetime — also closes a latent close-then-signal window). **Runtime-validated on
the RTX box**: swapped in, `sc start` → RUNNING, `sc stop` → clean STOPPED in ~1 s (not a timeout-kill),
original restored. `manager.rs`/`pf_vdisplay.rs` already used the pattern.
6. **Hot-loop `KeyedMutexGuard` ✅ done** (`6585643`) — the IDD-push consume loop's hand-written
`AcquireSync`/`ReleaseSync` (with its "don't `?`-return between them or you leak the lock + stall the
driver" caveat) is now a RAII guard scoped to the convert/copy block: same release point (latency
unchanged), but leak-proof on any early return. **Driver `pod_init!` ✅** (`bf57704`, 27 `mem::zeroed` →
1). **Skipped `ThreadBound<T>`** (each `unsafe impl Send` wraps a distinct type — churn, no real gain) and
**scratched the IOCTL dispatcher** (`control.rs`'s `read_input<T>`/`write_output_complete<T>` are already
generic with minimal unsafe).
**On-glass build validation (RTX box, 2026-06-26).** Built this branch on the box in an isolated worktree:
**host `cargo clippy -p punktfunk-host --features nvenc -D warnings` = CLEAN**, **driver `cargo build` =
CLEAN** — validating the whole session's Windows + driver work on real hardware. The clippy gate (which the
goal1/§2.5 work never ran — it used `cargo check`) surfaced + fixed 11 lint issues (`bd05bc8`: 9 redundant
`as *mut c_void`, an `if_same_then_else`, an `unused_unsafe` in `pod_init!`). Remaining only a runtime
**latency A/B** for the `KeyedMutexGuard` (provably equivalent — same release point) if a deeper check is
wanted.
7. **D1-host P0 lints — deferred (low value / high churn).** A crate-wide `#![deny(unsafe_op_in_unsafe_fn)]`
produced 100+ FFI-wrap sites across the Linux modules; it *wraps* unsafe (discipline) rather than
reducing it and doesn't improve stability, so it was deprioritized vs the `OwnedHandle`/RAII reductions
above. Revisit as a final discipline pass (staged per-module) if desired.
8. **M6 scaffolding cleanup** — delete the bring-up diagnostics (`spawn_observer`/`DebugBlock` in
**P2 — hygiene / architecture completion**
4. **D1-host — host-crate P0 lints — deferred (low value / high churn).** A crate-wide
`#![deny(unsafe_op_in_unsafe_fn)]` produced 100+ FFI-wrap sites across the Linux modules; it *wraps*
unsafe (discipline) rather than reducing it and doesn't improve stability, so it was deprioritized vs
the `OwnedHandle`/RAII reductions (which are **complete**: `idd_push.rs`, `service.rs`, the three
gamepad backends via a shared `gamepad_raii.rs`, the SCM STOP/SESSION events as `OnceLock<OwnedHandle>`,
the hot-loop `KeyedMutexGuard`, and the driver's `pod_init!`; all box-validated, clean `sc stop` in
~1 s). The driver already has the deny. Revisit D1-host as a final discipline pass (staged per-module)
if desired.
5. **M6 scaffolding cleanup**: delete the bring-up diagnostics (`spawn_observer`/`DebugBlock` in
`idd_push.rs`) and, once full parity is proven on glass, the host monoliths.
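The hot-loop `KeyedMutexGuard` named in item 4 is an instance of a scope-bound RAII guard. The sketch below shows the shape of that pattern against a mock; `MockKeyedMutex`, `acquire_sync`, `release_sync`, and `consume_frame` are hypothetical names, and no real DXGI call is made.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a keyed mutex on a shared texture.
pub struct MockKeyedMutex {
    held: Cell<bool>,
    releases: Cell<u32>,
}

impl MockKeyedMutex {
    pub fn new() -> Self {
        MockKeyedMutex { held: Cell::new(false), releases: Cell::new(0) }
    }

    fn acquire_sync(&self) -> Result<(), &'static str> {
        if self.held.get() {
            return Err("already held");
        }
        self.held.set(true);
        Ok(())
    }

    fn release_sync(&self) {
        self.held.set(false);
        self.releases.set(self.releases.get() + 1);
    }

    pub fn is_held(&self) -> bool {
        self.held.get()
    }

    pub fn releases(&self) -> u32 {
        self.releases.get()
    }
}

/// Holds the lock for exactly as long as the guard is in scope.
pub struct KeyedMutexGuard<'a> {
    mutex: &'a MockKeyedMutex,
}

impl<'a> KeyedMutexGuard<'a> {
    pub fn acquire(mutex: &'a MockKeyedMutex) -> Result<Self, &'static str> {
        mutex.acquire_sync()?;
        Ok(KeyedMutexGuard { mutex })
    }
}

impl Drop for KeyedMutexGuard<'_> {
    fn drop(&mut self) {
        self.mutex.release_sync();
    }
}

/// The guard is released on every exit path, including the early error return.
pub fn consume_frame(mutex: &MockKeyedMutex, convert_ok: bool) -> Result<u32, &'static str> {
    let _guard = KeyedMutexGuard::acquire(mutex)?;
    if !convert_ok {
        return Err("convert failed");
    }
    Ok(1)
}
```

Scoping the guard to the convert/copy block keeps the release point where a hand-written release would have been, while removing the chance of leaking the lock on an early return.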
**Explicitly NOT doing (stability decision): E1 — driver `DeviceContext` ownership + per-`IDDCX_MONITOR`
@@ -266,20 +211,21 @@ watchdog races device cleanup) for no gain, and the per-monitor cleanup callback
on this UMDF/IddCx stack. Cleanup is already deterministic (WDFDEVICE `EvtCleanupCallback` +
`cleanup_for_device_removal` + the host-gone watchdog). **Revisit only if `max_concurrent>1` on Windows is
actually needed.** (`monitor.rs` documents this rationale at the `MONITOR_MODES` static.)
8. **M6 scaffolding cleanup** — delete the bring-up diagnostics (`spawn_observer`/`DebugBlock` in
`idd_push.rs`) and, once full parity is proven on glass, the host monoliths.
**P3 — larger, mostly hardware-gated**
9. **M4 — gamepad-driver unification.** Fold `pf_dualsense` + `pf_xusb` (standalone
`packaging/windows/{dualsense,xusb}-driver/` on the old WDF stack) into the unified `drivers/` workspace
on `windows-drivers-rs` with WDF device contexts (true multi-pad), and point the **driver side** at
`pf_driver_proto::gamepad::{PadShm,XusbShm}` (host side already does — the `device_type`-at-offset-140
hand-duplication is the last ABI-drift hazard). Largest item.
10. **M5 — reshape WGC/DDA + GameStream onto `session/pipeline`**, then delete the old relay/monoliths.
AMF/QSV stays CI-only (no lab hardware).
11. **On-glass behavioral validation** of the committed-but-unexercised fixes: the watchdog reaping on
host-kill, `SET_RENDER_ADAPTER` on a **hybrid** box (the lab box is single-dGPU), the IDD-push→DDA
fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining.
6. **M4 — gamepad-driver unification — ✅ substantially DONE** (`92e6802`). `pf-dualsense` (DualSense /
DualShock 4) and `pf-xusb` (Xbox 360 / XInput) now live in the unified `packaging/windows/drivers/`
workspace and build from source per release against the vendored `wdk-sys`, exactly like `pf-vdisplay`;
`build-gamepad-drivers.ps1` signs them with the shared cert. **Remaining:** point the **driver side** at
`pf_driver_proto::gamepad::{PadShm,XusbShm}` (the host side already does — the `device_type`-at-offset
hand-duplication is the last ABI-drift hazard), add WDF device contexts for true multi-pad, and confirm
the source build matches the prior shipped binaries.
7. **M5 — reshape WGC/DDA + GameStream onto `session/pipeline`**, then delete the old relay/monoliths.
AMF/QSV stays CI-only (no lab hardware).
8. **On-glass behavioral validation** of the committed-but-unexercised fixes: the watchdog reaping on
host-kill, `SET_RENDER_ADAPTER` on a **hybrid** box (the lab box is single-dGPU), the IDD-push→DDA
fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining, and the
AMF/QSV encode path on real AMD/Intel hardware.
---
@@ -323,7 +269,7 @@ wraps `offset_of!`/`unsafe fn` differently than the runner's) — don't reformat
### 5.3 Env knobs (Windows host)
`PUNKTFUNK_IDD_PUSH=1` (capture from the driver ring; default off), `PUNKTFUNK_VDISPLAY=pf|sudovda`,
`PUNKTFUNK_IDD_PUSH=1` (capture from the driver ring; shipped `host.env` default on, in-code default off),
`PUNKTFUNK_ENCODER=auto|nvenc` (auto → vendor-detect), `PUNKTFUNK_10BIT=1` + `PUNKTFUNK_HDR_SHADER_P010=1`
(HDR), `PUNKTFUNK_SECURE_DDA=1`, `PUNKTFUNK_NO_WGC=1` (pure DDA), `PUNKTFUNK_ZEROCOPY=1`,
`PUNKTFUNK_MONITOR_LINGER_MS`, `PFVD_DEBUG_LOG=1` (driver file log — release builds are silent without it).
@@ -331,8 +277,8 @@ Config lives in `%ProgramData%\punktfunk\host.env`; logs in `%ProgramData%\punkt
### 5.4 Build / deploy / packaging
x64-only by design (no ARM64 NVIDIA driver / SudoVDA). The installer is the thin-`.iss` / fat-binary model
delegating to `service install`; tag `host-win-vX.Y.Z`. The driver is built + FORCE_INTEGRITY-cleared +
x64-only by design (no ARM64 NVIDIA driver). The installer is the thin-`.iss` / fat-binary model
delegating to `service install`; tag `host-win-vX.Y.Z`. The drivers are built + FORCE_INTEGRITY-cleared +
signed + `Inf2Cat`'d in CI from source. DriverVer must bump on any driver change; create the ROOT devnode
via nefcon (devgen is forbidden).
@@ -398,35 +344,67 @@ exactly correct.
8. its own `.inx` + an `unsafe`-reduction pass (`deny(unsafe_op_in_unsafe_fn)`, per-site `// SAFETY:`).
**Remaining driver work** beyond STEP 8: E1 (DeviceContext-owned state + per-`IDDCX_MONITOR`
`EvtCleanupCallback` → unblock `max_concurrent>1`), the slot-reclaim-on-REMOVE fix, and M4 (fold the
gamepad drivers in). See §4.
`EvtCleanupCallback` → unblock `max_concurrent>1` — see §4 for why it's deliberately deferred), the
slot-reclaim-on-REMOVE fix (§4 P1.3), and folding the gamepad-driver side onto `pf_driver_proto` (M4 tail,
§4 P3).
### 6.4 Resolved product decisions (the five forks)
**A** the host was refactored **in place** (staged, behavior-preserving), not greenfield-rebuilt — the
driver *was* rebuilt fresh. **B** IDD-push primary for everything incl. the **secure desktop** (validated);
WGC+DDA demoted to non-IddCx fallbacks. **C** all drivers on `microsoft/windows-drivers-rs` (+ the `iddcx`
subset; `/INTEGRITYCHECK` solved) — done for `pf-vdisplay`, **pending for the gamepad drivers (M4)**.
subset; `/INTEGRITYCHECK` solved) — done for `pf-vdisplay` and now for the gamepad drivers (M4, `92e6802`).
**D** keep GameStream (Moonlight), default to secure `serve`. **E** concurrent sessions: the host-side
preempt dance was removed by §2.5, but true `max_concurrent>1` on Windows stays blocked on the E1 driver
swap-chain-reuse work.
preempt dance was removed by the ownership-model work, but true `max_concurrent>1` on Windows stays blocked
on the E1 driver swap-chain-reuse work (deliberately deferred, §4). **Rejected: DeviceContext-per-monitor
ownership** — see the E1 stability decision in §4 (it would add a use-after-free window for no gain under
`ProcessSharingDisabled`).
---
## Appendix — consolidation note
## Origins & design rationale (from the original plan)
This file replaces five docs (recoverable from git history):
This folds in the durable rationale from the original Windows host + client plan
([`windows-host.md`](windows-host.md), now a stub; full original text in git history). The Windows host
began (2026-06-10 to 2026-06-14) as an *"add backends behind the existing traits"* job, not a parallel
port — `punktfunk-core` and the whole control plane are platform-agnostic, and the host already compiled
on non-Linux (macOS) thanks to existing `cfg(target_os)` gating. These framing decisions shaped what
shipped and still explain *why* the code is the way it is:
- `windows-host-rewrite.md` (the original design + plan, §0–§15) — its current status, architecture, the
jewels, the seam traits, and the deep reference (§6) are folded in here.
- `windows-host-goal1-plan.md` (the 6-stage in-place host refactor) — **complete**; its outcome is §2.22.4
and the Goal-1 scorecard row.
- `windows-host-rewrite-audit.md` (the 2026-06-25 audit) — its findings are reconciled to current reality
in §1 (scorecard) and §4 (only the still-open items survive: host hygiene, E1, slot-reclaim).
- `windows-host-rewrite-remediation.md` (the audit-remediation tracker) — its landed items are in §1; its
remaining items (D1-host, D2, E1, G) are §4 P2/P3.
- `windows-host-rewrite-game-capture-bug.md` (the GB1 investigation + fix) — **fixed**; the resolution is
§2.5 (capture). The full investigation narrative is in git history.
- **Build order: host-first.** A user preference (the research had recommended *client*-first, since the
  client is not blocked by the no-GPU problem and becomes the host's test endpoint). The trade-off held:
  the GPU-gated steps were the only ones that stalled without a GPU.
- **Trait-based abstraction → ~95% reuse.** `punktfunk-core` (protocol/FEC/crypto/session/transport/QUIC/
C ABI), the GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet), the management REST API +
`native_pairing`/`discovery`, and the `punktfunk1`/`spike`/`pipeline` orchestration all carried over
unchanged — only the OS-touching backends behind `Capturer`/`Encoder`/`VirtualDisplay`/`InputInjector`/
`AudioCapturer`/`VirtualMic` are new `#[cfg(windows)]` code. Getting to MSVC needed only ~3 `cfg`-gates
(gate the `std::os::fd`/`OwnedFd` unix-isms in `main.rs`/`vdisplay.rs`).
- **The no-GPU dev strategy.** Most of the port was built + validated on a **GPU-less Windows VM**: the
MSVC compile, the virtual-display control path (WARP), the openh264 software-encode pipeline (full
capture→encode→FEC→UDP transport minus HW), SendInput injection + interactive-session/desktop-reattach,
gamepad + rumble, and the entire client (software-decode loopback). Only NVENC-D3D11 zero-copy, the
  DDA-vs-WGC bake-off, split-encode/bitrate-ceiling, and *all* glass-to-glass numbers were deferred to a real
  NVIDIA box (no perf claim transfers from Linux).
- **Windows-specific structural issues (no Linux precedent)** — these are the gotchas that drove the
service + capture design and remain true:
- **Interactive session, not a Session-0 service.** SendInput can't reach the desktop from Session 0;
Desktop Duplication / capture need the interactive session. Hence the SYSTEM-in-interactive-session
supervisor (§2.6, [`windows-service.md`](windows-service.md)) and the `OpenInputDesktop`/
`SetThreadDesktop` re-attach to survive UAC/lock desktop switches.
- **Clock epoch.** The skew handshake assumes both ends read the same realtime epoch in ns — the Windows
host must emit timestamps from `GetSystemTimePreciseAsFileTime`→Unix-epoch-ns, or cross-machine latency
+ `ClockProbe`/`ClockEcho` break (std `SystemTime` on Windows is historically coarser).
- **No audio endpoint on a headless IDD.** WASAPI loopback needs a real/virtual render device; the
virtual *mic* (client→host) has no clean user-mode path — deferred.
- **Color/range.** All clients assume BT.709 limited-range; the BGRA→I420/NV12 path must match or colors
wash out — validated against the existing decoders.
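
The clock-epoch point above comes down to a fixed offset and a unit change. The sketch below shows only that arithmetic, under stated assumptions: the function and constant names are hypothetical (not the host's actual code), and the two 32-bit halves are assumed to come from the `FILETIME` that `GetSystemTimePreciseAsFileTime` fills in. A `FILETIME` counts 100 ns ticks since 1601-01-01 UTC, so the Unix-epoch value is the tick count minus the 1601-to-1970 offset, multiplied by 100.

```rust
/// 100 ns ticks between the FILETIME epoch (1601-01-01 UTC) and the
/// Unix epoch (1970-01-01 UTC): 11_644_473_600 seconds * 10_000_000.
const FILETIME_TO_UNIX_TICKS: u64 = 116_444_736_000_000_000;

/// Join the low and high 32-bit halves of a FILETIME into one tick count.
fn filetime_ticks(low: u32, high: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}

/// Convert FILETIME ticks (100 ns units since 1601) to Unix-epoch nanoseconds.
/// Returns `None` for timestamps before 1970 or if the result would overflow `u64`.
fn filetime_to_unix_ns(ticks: u64) -> Option<u64> {
    ticks
        .checked_sub(FILETIME_TO_UNIX_TICKS)?
        .checked_mul(100)
}
```

For example, a tick count exactly one second past the Unix epoch (`FILETIME_TO_UNIX_TICKS + 10_000_000`) converts to `Some(1_000_000_000)`. Returning `Option` rather than wrapping keeps a bogus pre-1970 reading from silently turning into a huge timestamp that would skew the handshake.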
(The older `design/windows-host.md`, a pre-rewrite implementation plan from 2026-06-22, is a separate
lineage and is left as-is.)
**SudoVDA → pf-vdisplay evolution.** The original plan was built around **SudoVDA**, an off-the-shelf
indirect display driver (the same IDD Apollo ships) — chosen to avoid writing/WHQL-signing a driver and to
get arbitrary `WxH@Hz` modes on the fly. It carried the host all the way to live-validated NVENC on a real
RTX 4090. It was then replaced by the all-Rust `pf-vdisplay` IddCx driver (which solved
`/INTEGRITYCHECK` self-signing, §6.1, and gave us the IDD-**push** zero-copy capture path that captures the
secure desktop directly) and **deleted in commit `84a3b95`**; `pf-vdisplay` is now the sole
virtual-display backend. The full SudoVDA control protocol (IOCTL layout, watchdog keepalive, GDI-name
resolution) lives in git history if ever needed as a reference.