# Windows Host — Architecture, Status & Roadmap

> **Single source of truth** for the punktfunk Windows streaming host: the all-Rust **`pf-vdisplay`
> IddCx virtual-display driver** + **IDD-push zero-copy capture** + **NVENC/AMF/QSV encode**, shipped as
> a signed Inno Setup installer with a LocalSystem SCM service. Live-validated on the RTX box through
> 5120×1440@240 HDR, the secure desktop (lock/UAC), and a fullscreen game.
>
> This file **consolidates and replaces** five earlier docs (now retired into it): the rewrite design
> plan, the Goal-1 staged-refactor plan, the audit, the audit-remediation tracker, and the
> fullscreen-game capture-bug analysis. See the [consolidation note](#appendix--consolidation-note) for
> what moved where. **Last updated 2026-06-26.** Work lives on branch **`windows-host-goal1`** (off
> `main`, not yet merged).

---

## 1. Status at a glance

The Windows host is **functionally complete and validated on glass.** The hard, high-risk proofs are done:
a clean all-Rust IddCx driver on the unified `windows-drivers-rs` stack (the `/INTEGRITYCHECK` answer +
the `iddcx` `wdk-sys` binding), IDD-push zero-copy capture at 5K@240 HDR, the secure desktop (Winlogon /
UAC / lock), and the host re-architected into a clean, typed, layered shape. What remains is
**non-blocking**: hygiene (host `unsafe` lints, a few `OwnedHandle` rollouts), the SudoVDA backend
deletion (decoupled, not yet removed), a driver robustness gap (slot reclaim), the gamepad-driver
unification (M4), and old-monolith cleanup (M6) — plus the merge to `main`.

One framing correction baked into this doc: the host was **not** greenfield-rebuilt as the original plan
imagined. It was **refactored in place** via a staged, behavior-preserving sequence (the "Goal-1" plan),
which kept the live-validated host working at every step. The driver, by contrast, *was* rebuilt fresh
(the new `packaging/windows/drivers/pf-vdisplay/` tree).

### Scorecard (verified against `windows-host-goal1` HEAD, 2026-06-25)

| Item | Status | Evidence |
|---|---|---|
| **Goal 1** — clean, layered host architecture | ✅ **DONE** | `config.rs` (`HostConfig`), `session_plan.rs` (`SessionPlan`), `SessionContext`, `windows/`+`linux/` confinement (`38c68c3`), `VirtualDisplayManager` (§2.5), `EncoderCaps` (`0ccd0fe`) |
| **Goal 2** — drop every trace of SudoVDA | ✅ **DONE** | reach-in decoupled (F1: `d638a93`/`e60cda3` → `win_adapter`/`win_display`), then the `sudovda.rs` backend + the dual-backend select **deleted** (this branch) — pf-vdisplay is the sole Windows virtual-display backend |
| **Goal 3** — minimize `unsafe` + P0 lints | 🟡 **PARTIAL** | driver `deny(unsafe_op_in_unsafe_fn)` (`a755d6e`); **`OwnedHandle` RAII rollout** — `idd_push.rs` (`011607e`, also fixes a view leak) + `service.rs` child/job (`4c95ba7`), on top of `manager.rs`/`pf_vdisplay.rs`; **driver `pod_init!`** (`bf57704`, 27→1). Remaining: host-crate P0 lints (deferred — high churn, low value), the `service.rs` SCM-handler event smuggling, the driver IOCTL-dispatch / `KeyedMutexGuard` levers |
| **M0** — proto ABI + driver toolchain + `/INTEGRITYCHECK` + `iddcx` | ✅ **DONE** | `pf-driver-proto`; vendored `windows-drivers-rs` 0.5.1; `clear-force-integrity.ps1`; CI-green |
| **M1** — new IddCx driver, first light + HDR | ✅ **DONE (on-glass)** | STEP 0–8 (`d7a9fbf`…`cd59151`); HDR live ("Mac connects WITH HDR", `6399d28`) |
| **M2** — IDD-push capture + NVENC, glass-to-glass | ✅ **DONE (on-glass)** | 5120×1440@240 HDR zero-copy; integrated into the host path |
| **M3** — service / input / audio / **secure desktop** | ✅ **DONE (on-glass)** | secure desktop (lock/UAC) **owner-confirmed 2026-06-25** — IDD-push captures it + input reaches it |
| **M4** — gamepad drivers onto the unified stack | ❌ **OPEN** | `pf_dualsense`/`pf_xusb` still standalone (`packaging/windows/{dualsense,xusb}-driver/`), not in `drivers/` workspace |
| **M5** — WGC/DDA fallback reshape + GameStream-on-pipeline + AMF/QSV | 🟡 **PARTIAL** | fallbacks exist (`wgc.rs`/`wgc_relay.rs`/`dxgi.rs`), not reshaped onto the new seams; AMF/QSV CI-only (no lab hw) |
| **M6** — cut over + delete the old monoliths | 🟡 **PARTIAL** | old `vdisplay-driver/` tree deleted (`a2bd0cd`); host monoliths + bring-up scaffolding (`spawn_observer`/`DebugBlock`) remain |
| **Game-capture bug (GB1)** — fullscreen game breaks IDD-push | ✅ **FIXED** | resolution-listening recovery (`c87bfe0`) + open-time DDA failover (`f98ab07`) + driver guard/log (`789ad49`) |
| **Audit P0/P1/P2** | ✅ mostly **RESOLVED** | watchdog, `SET_RENDER_ADAPTER`, log gate, mode bounds, IDD-push fallback, F1, out-ring/HDR-ring, proto asserts — all landed; **open:** host hygiene (§8), E1 completion, slot-reclaim |

---

## 2. Architecture (what is on disk)

### 2.1 Layering & crates

- **`crates/punktfunk-host`** — one shared host crate (Linux + Windows; not split). Platform code is
  confined under per-module `windows/`+`linux/` folders behind `#[cfg]` seams (`capture/{windows,linux}/`,
  `encode/{windows,linux}/`, `inject/{windows,linux}/`, `audio/{windows,linux}/`, `vdisplay/{windows,linux}/`,
  and top-level `src/windows/`+`src/linux/`). Module names stay flat (`#[path]`), so caller paths are
  platform-agnostic.
- **`crates/punktfunk-core`** — the one linked protocol/FEC/crypto/QUIC core (unchanged here).
- **`crates/pf-driver-proto`** — the owned, `no_std` host↔driver ABI (frame ring + control plane +
  gamepad SHM), consumed by both the host crate and the driver workspace (§2.7).
- **`packaging/windows/drivers/`** — the unified driver workspace on `microsoft/windows-drivers-rs`
  (vendored 0.5.1 + an `iddcx` subset): members `pf-vdisplay` (the IddCx display driver), `wdk-iddcx`
  (the typed IddCx DDI wrappers), `wdk-probe` (the CI link/surface gate), `vendor/{wdk-build,wdk-sys}`.

### 2.2 Session resolution — `HostConfig → SessionPlan → SessionContext` (Goal-1 realized)

The old ~40-knob `PUNKTFUNK_*` env soup, re-read and recomputed in three places, is replaced by a
resolve-once pipeline:

- **`config.rs` `HostConfig`** — typed config parsed **once** from `host.env`/env/flags
  (`idd_push`/`encoder_pref`/`no_wgc`/`capture_backend`/`render_adapter`/`secure_dda`/`ten_bit`/`zerocopy`/…).
  Each field's parser is byte-identical to the read it replaced. (Runtime-mutated Linux session vars from
  `vdisplay::apply_session_env`, and single-use local tuning knobs, are deliberately kept live — see the
  `config.rs` header.)
- **`session_plan.rs` `SessionPlan { display, capture, topology, encoder, input_format, bit_depth, hdr,
  pipeline_depth }`** — a `Copy` plan resolved **once** per session from `HostConfig` + the negotiated
  bit-depth, logged, and threaded through `build_pipeline`. `CaptureBackend::resolve()` is the one
  resolver (`IddPush | Dda | Wgc`); `resolve_topology` decides `SingleProcess | TwoProcessRelay`. This
  killed the latent capture/encode backend-disagreement bug.
- **`SessionContext`** — bundles the session entry's ~13 args (was `#[allow(too_many_arguments)]`) and the
  plane receivers into one owned struct moved into the stream thread.

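The resolve-once idea can be sketched as follows. This is a minimal illustration, not the code in `config.rs`/`session_plan.rs`: the field subset, the precedence inside `resolve`, the relay rule in `resolve_topology`, and the `resolve_plan` helper are all assumptions made for the example.

```rust
// Simplified, hypothetical shapes: the real `HostConfig` and `SessionPlan` carry more fields.
#[derive(Clone, Debug, Default)]
pub struct HostConfig {
    pub idd_push: bool,
    pub no_wgc: bool,
    pub ten_bit: bool,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CaptureBackend {
    IddPush,
    Dda,
    Wgc,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Topology {
    SingleProcess,
    TwoProcessRelay,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SessionPlan {
    pub capture: CaptureBackend,
    pub topology: Topology,
    pub bit_depth: u8,
    pub hdr: bool,
}

impl CaptureBackend {
    /// The single resolver. The precedence used here is an assumption.
    pub fn resolve(cfg: &HostConfig) -> Self {
        if cfg.idd_push {
            Self::IddPush
        } else if cfg.no_wgc {
            Self::Dda
        } else {
            Self::Wgc
        }
    }
}

/// Assumption for the sketch: only the WGC path needs the two-process relay.
pub fn resolve_topology(capture: CaptureBackend) -> Topology {
    match capture {
        CaptureBackend::Wgc => Topology::TwoProcessRelay,
        CaptureBackend::IddPush | CaptureBackend::Dda => Topology::SingleProcess,
    }
}

/// Hypothetical helper: resolve once per session; downstream code only reads the `Copy` plan.
pub fn resolve_plan(cfg: &HostConfig, negotiated_bit_depth: u8) -> SessionPlan {
    let capture = CaptureBackend::resolve(cfg);
    let bit_depth = if cfg.ten_bit && negotiated_bit_depth >= 10 { 10 } else { 8 };
    SessionPlan {
        capture,
        topology: resolve_topology(capture),
        bit_depth,
        hdr: bit_depth == 10,
    }
}
```

Because capture and encode both read the same resolved `SessionPlan` value, they cannot disagree about the backend, which is the property the section describes.
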
### 2.3 Ownership model — `VirtualDisplayManager` + `MonitorLease` (§2.5 realized)

A single **OnceLock `VirtualDisplayManager`** (`vdisplay/windows/manager.rs`) owns a *typed*
`Arc<OwnedHandle>` control-device handle (no raw-`isize` cross-thread smuggle), the refcounted
Idle/Active/Lingering state machine, and the monitor generation (`AtomicU64`). Both Windows backends
(`pf_vdisplay`, `sudovda`) shrank to thin `VdisplayDriver` impls (`open`/`add_monitor`/`remove_monitor`/
`ping`) behind it; `MonitorKey = Guid | Session(u64)`. A per-session `MonitorLease`'s `Drop` releases the
refcount (a stale lease can't tear down a fresh monitor). This deleted the old `CURRENT_MON_GEN`/`MON_GEN`/
two-`MGR`/`IDD_PERSIST`/`IDD_SETUP_LOCK`/`IDD_SESSION_STOP` globals. Validated on glass: **0 leaked active
monitors across a reconnect storm**, A/B-equivalent to the shipping host. (The 5-agent map found
`CURRENT_MON_GEN` had been **write-only** — the per-frame "monitor-gen bail" was never wired — so the gen
lives on the manager + lease only.)

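A minimal sketch of the generation-stamped lease, assuming a much smaller state machine than the real Idle/Active/Lingering one. `Manager`, `teardown`, and `refcount` are hypothetical names for the example; only the idea (a lease's `Drop` is a no-op once the generation has moved on) comes from the text above.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

#[derive(Default)]
struct State {
    refcount: u32,
    active_generation: u64, // 0 means "no monitor"
}

/// Hypothetical stand-in for the manager; the real one also owns the control-device handle.
#[derive(Default)]
pub struct Manager {
    next_generation: AtomicU64,
    state: Mutex<State>,
}

pub struct MonitorLease {
    manager: Arc<Manager>,
    generation: u64,
}

impl Manager {
    /// Take a lease; the first lease of a fresh monitor starts a new generation.
    pub fn acquire(self: &Arc<Self>) -> MonitorLease {
        let mut st = self.state.lock().unwrap();
        if st.refcount == 0 {
            st.active_generation = self.next_generation.fetch_add(1, Ordering::Relaxed) + 1;
        }
        st.refcount += 1;
        MonitorLease { manager: Arc::clone(self), generation: st.active_generation }
    }

    /// Simulates the monitor being torn down while old leases are still alive.
    pub fn teardown(&self) {
        let mut st = self.state.lock().unwrap();
        st.refcount = 0;
        st.active_generation = 0;
    }

    pub fn refcount(&self) -> u32 {
        self.state.lock().unwrap().refcount
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut st = self.manager.state.lock().unwrap();
        // A stale lease (older generation) must not release the fresh monitor's refcount.
        if st.active_generation == self.generation && st.refcount > 0 {
            st.refcount -= 1;
        }
    }
}
```

In this sketch, dropping a lease taken before `teardown()` leaves the refcount of the monitor created afterwards untouched.
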
### 2.4 The seam traits

`VirtualDisplay`/`VirtualOutput`/`VirtualLease` (RAII keepalive = release), `Capturer`
(`next_frame`/`try_latest`/`set_active`/`hdr_meta`/`pipeline_depth`), `Encoder`
(`submit`/`caps`/`request_keyframe`/`set_hdr_meta`/`invalidate_ref_frames`/`poll`/`flush`),
`AudioCapturer`/`VirtualMic`/`InputInjector`/`PadManager`. Realized tightenings: the capturer takes the
desired `OutputFormat { gpu, hdr }` **in** (killed the `capture → encode::windows_resolved_backend()`
back-reference recomputed in `dxgi.rs`); and `Encoder::caps() -> EncoderCaps { supports_rfi,
supports_hdr_metadata }` lets the session glue route loss-recovery by query (only Windows direct-NVENC
overrides it; the GameStream loop gates the RFI path on `supports_rfi`).

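Routing loss recovery by capability query might look like the sketch below. The method signatures, the `Recovery` enum, `recover_from_loss`, and both example encoders are assumptions; only `EncoderCaps { supports_rfi, supports_hdr_metadata }` and the default-plus-override shape are taken from the description above.

```rust
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct EncoderCaps {
    pub supports_rfi: bool,
    pub supports_hdr_metadata: bool,
}

/// Trimmed-down seam; the signatures here are hypothetical.
pub trait Encoder {
    /// Default: no optional capabilities. Only capable backends override this.
    fn caps(&self) -> EncoderCaps {
        EncoderCaps::default()
    }
    fn request_keyframe(&mut self);
    fn invalidate_ref_frames(&mut self, first_frame: u64, last_frame: u64);
}

#[derive(Debug, PartialEq, Eq)]
pub enum Recovery {
    RefFrameInvalidation,
    Keyframe,
}

/// Session glue: ask the encoder what it can do instead of matching on its concrete type.
pub fn recover_from_loss(enc: &mut dyn Encoder, first: u64, last: u64) -> Recovery {
    if enc.caps().supports_rfi {
        enc.invalidate_ref_frames(first, last);
        Recovery::RefFrameInvalidation
    } else {
        enc.request_keyframe();
        Recovery::Keyframe
    }
}

/// Stand-in for a backend that overrides `caps()`.
#[derive(Default)]
pub struct DirectNvenc {
    pub invalidated: Vec<(u64, u64)>,
}

impl Encoder for DirectNvenc {
    fn caps(&self) -> EncoderCaps {
        EncoderCaps { supports_rfi: true, supports_hdr_metadata: true }
    }
    fn request_keyframe(&mut self) {}
    fn invalidate_ref_frames(&mut self, first_frame: u64, last_frame: u64) {
        self.invalidated.push((first_frame, last_frame));
    }
}

/// Stand-in for a backend that keeps the default caps.
#[derive(Default)]
pub struct Software {
    pub keyframes_requested: u32,
}

impl Encoder for Software {
    fn request_keyframe(&mut self) {
        self.keyframes_requested += 1;
    }
    fn invalidate_ref_frames(&mut self, _first_frame: u64, _last_frame: u64) {}
}
```

The design point is that the caller never names a backend: adding RFI support to another encoder would only require overriding `caps()`.
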
### 2.5 Capture — IDD-push primary (normal **and** secure desktop), WGC/DDA fallback, GB1 recovery

**IDD-push is the universal primary path.** Capture comes straight from the driver's shared keyed-mutex
texture ring (`capture/windows/idd_push.rs`) — no Desktop Duplication, no `win32u` reparenting hook. The
host creates the ring; the driver opens it (permissive `D:(A;;GA;;;WD)` SDDL). The generation-tagged
`latest = gen<<40 | seq<<8 | slot` stale-ring reject kills the HDR-flip garbage frame; a host-owned
3-slot `OUT_RING` rotated per frame is the texture-ownership contract that enables `pipeline_depth=2`
(convert/copy on the 3D engine overlapping NVENC on the ASIC). It captures the **secure desktop**
(Winlogon/UAC/lock) directly (validated 2026-06-25), so there is no separate secure capturer in the
primary path.

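The `latest = gen<<40 | seq<<8 | slot` packing and the stale-ring reject can be sketched like this. The field widths (24-bit generation, 32-bit sequence, 8-bit slot) are inferred from the shifts and are an assumption, as is the `accept` helper; the real `FrameToken` lives in `pf-driver-proto` and may differ.

```rust
/// Sketch of a packed frame token: generation in bits 40..64, seq in bits 8..40, slot in bits 0..8.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u32, // only the low 24 bits are stored
    pub seq: u32,
    pub slot: u8,
}

impl FrameToken {
    pub const fn pack(self) -> u64 {
        (((self.generation as u64) & 0x00FF_FFFF) << 40)
            | ((self.seq as u64) << 8)
            | (self.slot as u64)
    }

    pub const fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 40) as u32,
            seq: (raw >> 8) as u32, // the cast truncates to the 32-bit seq field
            slot: raw as u8,
        }
    }
}

/// Hypothetical reader-side check: a token published for an older ring is ignored.
pub fn accept(latest: u64, ring_generation: u32) -> Option<FrameToken> {
    let token = FrameToken::unpack(latest);
    (token.generation == ring_generation).then_some(token)
}
```

Publishing the whole token as one `u64` means a reader sees generation, sequence, and slot from the same store, so a frame from a ring that was recreated (for example on an HDR flip) is rejected rather than decoded as garbage.
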
- **Open-time fallback:** `IddPushCapturer::open` waits a bounded ~4 s for a *first frame* (not just
  `DRV_STATUS_OPENED`); on attach failure it returns the keepalive back so `capture.rs` opens **DDA** on
  the same `WinCaptureTarget` — never a 20 s black bail (audit §5.1, `ed58365`/`f98ab07`).
- **Mid-session game mode-set recovery (GB1, fixed):** the 250 ms poll follows the display's *actual*
  resolution (`win_display::active_resolution`, CCD/GDI) and recreates the ring on any descriptor change
  (size **or** HDR) → the driver re-attaches → frames resume at the game's mode, **no reconnect**. If a
  change is unrecoverable (e.g. an exclusive flip), a `recovering_since` clock drops the session after 3 s
  so the client reconnects cleanly. No protocol bump was needed — the host reads the resolution straight
  from Windows (`c87bfe0`; the driver's `publish()` width/height guard + flushed log is `789ad49`).
- **WGC + DDA** stay as demoted fallbacks for non-IddCx hardware (`wgc.rs`/`dxgi.rs`). The two-process WGC
  secure-desktop relay (`wgc_relay.rs`) is no longer load-bearing now that IDD-push handles the secure
  desktop; it is kept recoverable but slated for M5/M6 cleanup.

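The GB1 recovery rule (recreate the ring on a descriptor change, drop the session if frames do not resume within 3 s) could be modelled as a small state machine. Everything below is an illustrative assumption: the type names, the `poll` signature, and the choice to clear the clock on the first frame are not taken from `idd_push.rs`.

```rust
use std::time::{Duration, Instant};

/// The 3 s budget named in the text.
pub const RECOVERY_BUDGET: Duration = Duration::from_secs(3);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RingDescriptor {
    pub width: u32,
    pub height: u32,
    pub hdr: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PollOutcome {
    Steady,
    RecreateRing,
    Recovering,
    DropSession,
}

pub struct RecoveryState {
    current: RingDescriptor,
    recovering_since: Option<Instant>,
}

impl RecoveryState {
    pub fn new(current: RingDescriptor) -> Self {
        Self { current, recovering_since: None }
    }

    /// One tick of the (250 ms) poll. `observed` is the display's actual mode.
    pub fn poll(&mut self, observed: RingDescriptor, frame_arrived: bool, now: Instant) -> PollOutcome {
        if observed != self.current {
            // Size or HDR changed: recreate the ring and start the recovery clock once.
            self.current = observed;
            if self.recovering_since.is_none() {
                self.recovering_since = Some(now);
            }
            return PollOutcome::RecreateRing;
        }
        if frame_arrived {
            // Frames resumed at the new mode: recovery succeeded.
            self.recovering_since = None;
            return PollOutcome::Steady;
        }
        match self.recovering_since {
            Some(since) if now.duration_since(since) >= RECOVERY_BUDGET => PollOutcome::DropSession,
            Some(_) => PollOutcome::Recovering,
            None => PollOutcome::Steady, // idle desktop, nothing to recover from
        }
    }
}
```

Passing `now` in, rather than calling `Instant::now()` inside `poll`, keeps the rule testable without sleeping.
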
### 2.6 Encode — NVENC / AMF / QSV / software; `EncoderCaps`; HDR

`encode/windows/` dispatches per DXGI adapter vendor (`open_video`): **NVENC** (NVIDIA, direct SDK,
`nvenc.rs` — caps-probe-before-configure, bitrate-clamp binary search, true RFI over the DPB, in-band
ST.2086/CLL SEI), **AMF**/**QSV** (AMD/Intel via libavcodec, `ffmpeg_win.rs` — system-readback default,
opt-in zero-copy D3D11; CI-only, no lab hardware), or **software** H.264 (`sw.rs`). HDR (10-bit) forces
HEVC Main10 + BT.2020 PQ; the client auto-detects PQ from the VUI. The encoder adapts to a mid-session
size/format/HDR change per frame (tears down + re-inits), so the GB1 capturer's resolution changes are
handled downstream with no API change.

### 2.7 Host↔driver ABI — `pf-driver-proto`

One `no_std` crate, both build graphs. Owns the **frame plane** (`SharedHeader`, `FrameToken { generation,
seq, slot }` with `pack`/`unpack`, `Global\pfvd-*` name helpers), the **control plane** (fresh interface
GUID — not SudoVDA's `e5bcc234`; contiguous `0x900` IOCTL ops; `u64` session id; a real `GET_INFO` version
handshake the host **asserts** + bails on mismatch), and the **gamepad SHM** (`XusbShm` 64 B, `PadShm`
256 B incl. `device_type`). `bytemuck`-`Pod` + `size_of` **and** `offset_of!` asserts make ABI drift a
**compile error** (`95dcef3`). The host-side gamepad consumers derive their layouts from here; the
**driver-side** gamepad drivers do not yet (M4).

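The "ABI drift is a compile error" technique can be illustrated with `const` assertions on a made-up layout. `ExampleShm` and its fields are hypothetical and are not the real `XusbShm`/`PadShm`; the sketch also leaves out the `bytemuck` derive so that it stays dependency-free.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical 64-byte shared-memory block (NOT the real `XusbShm` layout).
#[repr(C)]
#[derive(Clone, Copy)]
pub struct ExampleShm {
    pub magic: u32,
    pub version: u32,
    pub seq: u64,
    pub device_type: u8,
    pub reserved: [u8; 47],
}

// Evaluated at compile time: if a field is added, removed, or reordered so that the
// size or an offset changes, the crate stops building on both sides of the ABI.
const _: () = assert!(size_of::<ExampleShm>() == 64);
const _: () = assert!(offset_of!(ExampleShm, seq) == 8);
const _: () = assert!(offset_of!(ExampleShm, device_type) == 16);
```

`offset_of!` asserts catch a reorder that keeps the total size unchanged, which a `size_of` assert alone would miss.
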
### 2.8 The `pf-vdisplay` IddCx driver

All-Rust UMDF IddCx driver on `windows-drivers-rs` + the `iddcx` `wdk-sys` subset. STEP 0–8 landed
(`packaging/windows/drivers/pf-vdisplay/src/`): `entry.rs` (DriverEntry + `IDD_CX_CLIENT_CONFIG`, 15
callbacks), `adapter.rs` (caps + FP16 + `SET_RENDER_ADAPTER`), `monitor.rs`/`callbacks.rs` (the `*2` HDR
mode DDIs, EDID verbatim), `swap_chain_processor.rs` (the worker, `SetDevice`-retry + top-of-loop
`terminate`), `frame_transport.rs` (the `FramePublisher` on `pf_driver_proto::frame`), `control.rs` (the
typed IOCTL dispatch + host-gone **watchdog** + mode bounds). Self-signed-loadable under Secure Boot
(FORCE_INTEGRITY cleared post-link). **Known gaps:** ownership state is still partly process-global
(`MONITOR_MODES`/`NEXT_ID`/`ADAPTER`/`DEVICE_POOL`) with `EvtCleanupCallback` on the **WDFDEVICE** (not
per-`IDDCX_MONITOR`) — see E1 in §4; and it does not reclaim IddCx monitor **slots** on REMOVE (the
ghost-monitor wedge, §4).

### 2.9 Service, packaging, installer

A `LocalSystem` SCM supervisor (`service.rs`) token-retargets and `CreateProcessAsUserW`s `serve` into the
console session (so `SendInput` reaches the streamed desktop + the secure desktop), relaunches on
session-change, and kills-on-close via a Job Object. Shipped as a **signed Inno Setup** `setup.exe`
(`packaging/windows/`, `windows-host.yml`) that bundles the **new** `pf-vdisplay` driver
(`pf_vdisplay.inx` in-tree, old `vdisplay-driver/` tree deleted) + FFmpeg DLLs and delegates to `service
install`. GameStream (Moonlight) is kept but the installer/service default to secure `serve` (GameStream
opt-in).

---

## 3. Validated invariants — preserve, do not regress

These are expensive empirical wins; keep them intact when touching the code:

- **Frame transport:** host-creates/driver-opens keyed-mutex ring; generation-tagged stale-ring reject;
  0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the `OUT_RING` rotation +
  `pipeline_depth=2` overlap; `repeat_last` rotates into a fresh out-ring slot (depth-safe).
- **Driver internals:** `edid.rs` (128-byte EDID + CTA-861.3 HDR block, dual checksums); the FP16 HDR
  recipe (`CAN_PROCESS_FP16` + the `*2` DDIs + gamma/HDR accept-stubs + `HIGH_COLOR_SPACE`); `DEVICE_POOL`
  per render-LUID (NVIDIA UMD/VRAM leak fix); target-id stamped on the monitor context; the two swap-chain
  leak fixes (borrow `IDXGIDevice` across `SetDevice` retries; check `terminate` at the loop top).
- **Monitor lifecycle:** serialized ADD/REMOVE/teardown; restore CCD topology **before** REMOVE; the
  generation-stamped lease (a stale lease can't tear down a fresh monitor); 0-leak across reconnects.
- **HDR color math:** `hdr.rs` (pure, unit-tested, ST.2086 + big-endian SEI); the FP16→P010/Rgb10a2
  converters + `hdr_p010_selftest`; the cursor decomposition.
- **NVENC tuning:** caps-probe-before-configure (10-bit→8-bit graceful downgrade); bitrate-clamp binary
  search (each GPU's real ceiling); true RFI over the DPB; CBR / infinite-GOP / P-only / ~1-frame VBV.
- **Gamepad recipe:** the SwDeviceCreate identity (enumerator with no `_`; mandatory completion callback;
  synthesized DS5 compat-ids; non-null per-pad `ContainerId`); one `pf_dualsense` serving DualSense+DS4
  via a `device_type` byte; XUSB declining `WAIT_*`; per-pad index via `pszDeviceLocation`.
- **Session glue:** the trait seam + RAII keepalive teardown; host-lifetime shared services + per-session
  gamepads; the encode|send split + microburst pacing; `build_pipeline_with_retry` permanent-vs-transient
  classification; the GameStream `VideoPacketizer` (GF8 Cauchy, Moonlight byte-exact); the pairing/trust
  handshake.
- **Core discipline:** no async on the per-frame path; `pf-driver-proto` is the single ABI source
  (drift = compile error); the version handshake the host asserts.

---

## 4. Open work / next tasks (prioritized)

**P1 — ship-readiness / correctness**

1. **Merge `windows-host-goal1` → `main` + push** (outward-facing → confirm first). Pushing also runs the
   full Windows CI matrix incl. the `amf-qsv` encode path, which local checks skip.
2. **Make IDD-push the default** — today it is gated behind `PUNKTFUNK_IDD_PUSH` (`config.rs` default
   `false`); deployment sets it in `host.env`. Flip the code default (with the WGC/DDA fallback already in
   place) so a fresh install runs the validated path, or document the `host.env` requirement explicitly.
3. **pf-vdisplay slot reclaim on REMOVE** (driver robustness) — 🟡 **fix landed, on-glass-validation
   pending.** Sustained ADD/REMOVE churn wedged the driver (`ADD → 0x80070490 ERROR_NOT_FOUND`) because the
   monitor id (EDID serial / `ConnectorIndex` / container GUID) was a **monotonic** `NEXT_ID`, never
   reclaimed → IddCx accumulated a new OS target slot per cycle until exhaustion. `monitor.rs` now allocates
   the **lowest free id** (`alloc_monitor_id`), reused on REMOVE, so a fresh ADD reuses the departed
   monitor's target slot instead of orphaning it. CI-compile-gated; the wedge only reproduces under
   sustained churn on the RTX box, so this needs an **on-glass reconnect-storm A/B** to confirm (the box is
   ephemeral). Keep `packaging/windows/reset-pf-vdisplay.ps1` as the recovery until validated.

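The lowest-free-id allocation from item 3 can be sketched with a sorted set. This is an illustration of the technique only: `MonitorIds`, the 0-based numbering, and the `BTreeSet` storage are assumptions, not the `alloc_monitor_id` implementation in `monitor.rs`.

```rust
use std::collections::BTreeSet;

/// Hypothetical id pool: hands out the lowest id that is not currently in use.
#[derive(Default)]
pub struct MonitorIds {
    in_use: BTreeSet<u32>,
}

impl MonitorIds {
    /// Walk the sorted in-use ids and stop at the first gap.
    pub fn alloc(&mut self) -> u32 {
        let mut candidate = 0u32;
        for &used in &self.in_use {
            if used != candidate {
                break; // found a hole below `used`
            }
            candidate += 1;
        }
        self.in_use.insert(candidate);
        candidate
    }

    /// Called on REMOVE so that the next ADD reuses the same id (and target slot).
    pub fn release(&mut self, id: u32) -> bool {
        self.in_use.remove(&id)
    }
}
```

With a monotonic counter the sequence ADD, REMOVE, ADD yields ids 0 then 1; with this pool it yields 0 then 0 again, so the number of distinct ids is bounded by the number of simultaneously live monitors.
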
**P2 — hygiene / architecture completion** (the unsafe-reduction + stability priority)

4. **D1-host — host-crate P0 lints.** Add `#![deny(unsafe_op_in_unsafe_fn)]` +
   `#![warn(clippy::undocumented_unsafe_blocks)]` to the host crate and fix the fallout (~30 of the 52
   `unsafe fn`s need an inner `unsafe {}`). Stage it **per-module, Linux-first** (item-level `#[deny]` on
   `linux/zerocopy/cuda.rs`/`egl.rs`, `encode/linux/vaapi.rs` — locally verifiable), then the Windows
   modules (CI-gated), then promote to crate-level. The driver already has the deny.
5. **D2 — `OwnedHandle` rollout.** ✅ **mostly done** — `capture/windows/idd_push.rs` (`011607e`: a
   `MappedSection` RAII for the mapping handle **+** the leaked `MapViewOfFile` view, + `OwnedHandle` for the
   event / ring-slot shared handles) and `windows/service.rs` (`4c95ba7`: the child process/thread + Job
   handles, ~9 `CloseHandle` deleted). **Remaining:** the `service.rs` `AtomicIsize` STOP/SESSION events
   (deliberately left — smuggled into the C SCM handler, a separate riskier redesign) and the gamepad shm
   handles. `manager.rs`/`pf_vdisplay.rs` already used the pattern.
6. **Driver unsafe levers** (the driver is already `deny`-clean with per-site SAFETY; these *reduce count*):
   ✅ **`pod_init!` macro done** (`bf57704`, 27 `mem::zeroed` → 1). **Skipped `ThreadBound<T>`** — not a
   clean win (each `unsafe impl Send` wraps a distinct type; consolidating churns every access for no real
   safety gain over the per-struct `// SAFETY:`). **Remaining:** a generic IOCTL dispatch helper in
   `control.rs`, and a `KeyedMutexGuard`/`AcquiredSurface` RAII for the frame-transport hot loop (needs an
   on-glass latency check).
7. **D1-host P0 lints — deferred (low value / high churn).** A crate-wide `#![deny(unsafe_op_in_unsafe_fn)]`
   produced 100+ FFI-wrap sites across the Linux modules; it *wraps* unsafe (discipline) rather than
   reducing it and doesn't improve stability, so it was deprioritized vs the `OwnedHandle`/RAII reductions
   above. Revisit as a final discipline pass (staged per-module) if desired.
8. **M6 scaffolding cleanup** — delete the bring-up diagnostics (`spawn_observer`/`DebugBlock` in
   `idd_push.rs`) and, once full parity is proven on glass, the host monoliths.

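One way a `pod_init!`-style macro can collapse many `mem::zeroed` call sites into a single `unsafe` block is sketched below. This is a guess at the general shape, not the macro from `bf57704`: the `ZeroPod` marker trait, `zeroed_pod`, the field-override arm, and `ExampleArgs` are all hypothetical.

```rust
/// Marker for plain-old-data FFI structs whose all-zero bit pattern is a valid value.
///
/// # Safety
/// Implement only for `#[repr(C)]` types made of integers, raw pointers, and other `ZeroPod` types.
pub unsafe trait ZeroPod: Sized {}

/// The single place that calls `mem::zeroed`.
pub fn zeroed_pod<T: ZeroPod>() -> T {
    // SAFETY: `ZeroPod` implementors promise that all-zero bytes are a valid `T`.
    unsafe { core::mem::zeroed() }
}

/// `pod_init!(Type)` gives a zeroed value; `pod_init!(Type { field: value, .. })`
/// zeroes first and then overrides the listed fields.
macro_rules! pod_init {
    ($t:ty { $($field:ident : $value:expr),* $(,)? }) => {{
        let mut v = zeroed_pod::<$t>();
        $( v.$field = $value; )*
        v
    }};
    ($t:ty) => {
        zeroed_pod::<$t>()
    };
}

/// Hypothetical DDI argument block, standing in for an `IDARG_*`-style struct.
#[repr(C)]
pub struct ExampleArgs {
    pub size: u32,
    pub flags: u32,
    pub context: *mut core::ffi::c_void,
}

// SAFETY: integers and a raw pointer; zero is valid for every field.
unsafe impl ZeroPod for ExampleArgs {}
```

A call site then reads `pod_init!(ExampleArgs { size: 16 })` with no `unsafe` of its own; the justification lives once on the trait impl instead of at every initializer.
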
**Explicitly NOT doing (stability decision): E1 — driver `DeviceContext` ownership + per-`IDDCX_MONITOR`
`EvtCleanupCallback`.** The current process-global design is *sound*: IddCx DDIs receive only an
`IDDCX_MONITOR` handle (never the WDFDEVICE/context), and `ProcessSharingDisabled` makes one devnode = one
host process that dies with the device. A "device-owned" variant would *add* a use-after-free window (the
watchdog races device cleanup) for no gain, and the per-monitor cleanup callback isn't reliably reachable
on this UMDF/IddCx stack. Cleanup is already deterministic (WDFDEVICE `EvtCleanupCallback` +
`cleanup_for_device_removal` + the host-gone watchdog). **Revisit only if `max_concurrent>1` on Windows is
actually needed.** (`monitor.rs` documents this rationale at the `MONITOR_MODES` static.)

**P3 — larger, mostly hardware-gated**

9. **M4 — gamepad-driver unification.** Fold `pf_dualsense` + `pf_xusb` (standalone
   `packaging/windows/{dualsense,xusb}-driver/` on the old WDF stack) into the unified `drivers/` workspace
   on `windows-drivers-rs` with WDF device contexts (true multi-pad), and point the **driver side** at
   `pf_driver_proto::gamepad::{PadShm,XusbShm}` (host side already does — the `device_type`-at-offset-140
   hand-duplication is the last ABI-drift hazard). Largest item.
10. **M5 — reshape WGC/DDA + GameStream onto `session/pipeline`**, then delete the old relay/monoliths.
    AMF/QSV stays CI-only (no lab hardware).
11. **On-glass behavioral validation** of the committed-but-unexercised fixes: the watchdog reaping on
    host-kill, `SET_RENDER_ADAPTER` on a **hybrid** box (the lab box is single-dGPU), the IDD-push→DDA
    fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining.

---

## 5. Operations

### 5.1 RTX box on-glass recipe

The persistent on-glass validator is the **RTX box** (`ssh "Enrico Bühler"@<ip>`, ENRICOS-DESKTOP, RTX
4090, PS shell). **The IP FLOATS** (DHCP; boots to **Proxmox** on reboot → ephemeral, unreachable after a
reboot; recently `.173`/`.158` — confirm current first; **never reboot it, never depend on it surviving**).
It has WDK 26100 + LLVM 21.1.2 + the Rust toolchain; build clone at `C:\Users\Public\pf-rewrite` (the
user's active driver-dev tree — **don't clobber uncommitted WIP**; use a worktree). Username has a `ü` →
quote it; it only breaks SDL3/client builds, not the host. To validate a host branch: worktree-checkout,
build with `CARGO_TARGET_DIR=C:\t-goal1`, then stop the **PunktfunkHost** service, back up the binary +
`%ProgramData%\punktfunk\host.env`, copy your build in, restart, drive `punktfunk-probe.exe` loopback,
then restore + `git worktree remove`. Drive over ssh via `powershell -EncodedCommand <base64 UTF-16LE>`
(plain quoting mangles; prefer `Write-Output`/file-redirect for clean output). Driver redeploy:
`packaging/windows/redeploy-pf-vdisplay.ps1`; ghost-monitor recovery: `reset-pf-vdisplay.ps1`.

### 5.2 CI / validation

The persistent build validator is the **windows-amd64 CI runner** (no GPU — fine for builds / `iddcx`
link / `/INTEGRITYCHECK` self-sign / the surface-asserts; live NVENC encode + on-glass defers to the RTX
box). Workflows: `windows-host.yml` (the host installer), `windows-drivers.yml` (the driver workspace
build + FORCE_INTEGRITY clear), `windows-drivers-provision.yml` (WDK/LLVM toolchain), `windows-msix.yml`
(the client). A single Windows runner serializes the whole fleet; a `Cargo.toml` touch costs ~25 min of
queue, so driver pushes that avoid `Cargo.toml` skip the fleet serialization.

Local pre-push checks (this Linux box can't compile the Windows paths):
```sh
cargo test -p pf-driver-proto # the ABI crate (cross-platform)
cargo check -p punktfunk-host # Linux paths; win_* mods are #[cfg(windows)]
cargo clippy -p punktfunk-host --all-targets -- -D warnings
# Windows host clippy (on the box): PUNKTFUNK_NVENC_LIB_DIR=C:\t\nvenc;
# cargo clippy -p punktfunk-host --features nvenc --target x86_64-pc-windows-msvc -- -D warnings
# Driver build (on the box): cd packaging/windows/drivers; Version_Number=10.0.26100.0;
# LIBCLANG_PATH='C:\Program Files\LLVM\bin'; cargo build
```
Note: a pre-existing rustfmt-version drift exists in some Windows-only files (this box's rustfmt 1.9.0
wraps `offset_of!`/`unsafe fn` differently than the runner's) — don't reformat unrelated files to chase it.

### 5.3 Env knobs (Windows host)

`PUNKTFUNK_IDD_PUSH=1` (capture from the driver ring; default off), `PUNKTFUNK_VDISPLAY=pf|sudovda`,
`PUNKTFUNK_ENCODER=auto|nvenc` (auto → vendor-detect), `PUNKTFUNK_10BIT=1` + `PUNKTFUNK_HDR_SHADER_P010=1`
(HDR), `PUNKTFUNK_SECURE_DDA=1`, `PUNKTFUNK_NO_WGC=1` (pure DDA), `PUNKTFUNK_ZEROCOPY=1`,
`PUNKTFUNK_MONITOR_LINGER_MS`, `PFVD_DEBUG_LOG=1` (driver file log — release builds are silent without it).
Config lives in `%ProgramData%\punktfunk\host.env`; logs in `%ProgramData%\punktfunk\logs\host.log`.

### 5.4 Build / deploy / packaging

x64-only by design (no ARM64 NVIDIA driver / SudoVDA). The installer is the thin-`.iss` / fat-binary model
delegating to `service install`; tag `host-win-vX.Y.Z`. The driver is built + FORCE_INTEGRITY-cleared +
signed + `Inf2Cat`'d in CI from source. DriverVer must bump on any driver change; create the ROOT devnode
via nefcon (devgen is forbidden).

---

## 6. Reference (hard-won — keep)

### 6.1 The `/INTEGRITYCHECK` answer

`wdk-build` emits `cargo::rustc-cdylib-link-arg=/INTEGRITYCHECK` **unconditionally** (no cfg/env/Config
opt-out), so a self-signed driver can't load (CodeIntegrity 3004/3089). The fix: a deterministic,
idempotent post-link step `packaging/windows/clear-force-integrity.ps1` clears the PE FORCE_INTEGRITY bit
(`0x0080 @ e_lfanew+0x5e`) + verifies (CI-proven `0x01E0 → 0x0160`), **before** signing. Packaging order:
`cargo build` → clear-force-integrity → sign `.dll` → `Inf2Cat` → sign `.cat`. (A public build would use
real attestation signing, which satisfies `/INTEGRITYCHECK` legitimately.)

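For illustration, the bit-clear that `clear-force-integrity.ps1` performs can be expressed over an in-memory image as below. This Rust function is a sketch of the same arithmetic, not a port of the script: the function name, the error strings, and the header checks are assumptions. It relies on `e_lfanew` being stored at offset `0x3C` of the DOS header and on `DllCharacteristics` sitting at `e_lfanew + 0x5E`.

```rust
/// IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY
pub const FORCE_INTEGRITY: u16 = 0x0080;

/// Clears the FORCE_INTEGRITY bit in a PE image held in memory.
/// Returns `(before, after)` values of `DllCharacteristics`; calling it twice is harmless.
pub fn clear_force_integrity(pe: &mut [u8]) -> Result<(u16, u16), &'static str> {
    if pe.len() < 0x40 || &pe[..2] != b"MZ" {
        return Err("not an MZ image");
    }
    // e_lfanew: little-endian u32 at 0x3C, the file offset of the "PE\0\0" signature.
    let e_lfanew = u32::from_le_bytes([pe[0x3C], pe[0x3D], pe[0x3E], pe[0x3F]]) as usize;
    // DllCharacteristics = 4 (signature) + 20 (COFF header) + 70 (optional-header offset) = 0x5E.
    let off = e_lfanew.checked_add(0x5E).ok_or("e_lfanew overflow")?;
    let end = off.checked_add(2).ok_or("e_lfanew overflow")?;
    if pe.len() < end {
        return Err("image truncated");
    }
    if pe.get(e_lfanew..e_lfanew + 4) != Some(&b"PE\0\0"[..]) {
        return Err("missing PE signature");
    }
    let before = u16::from_le_bytes([pe[off], pe[off + 1]]);
    let after = before & !FORCE_INTEGRITY;
    pe[off..end].copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```

On an image whose `DllCharacteristics` is `0x01E0`, the function returns `(0x01E0, 0x0160)`, matching the CI-proven transition quoted above; a second call returns `(0x0160, 0x0160)`. Note that the PE checksum is not recomputed here.
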
### 6.2 The `iddcx` binding on `wdk-sys` (the make-or-break — proven, the 6 bindgen knobs)

IddCx DDIs are **function-table dispatched** (`IddFunctions[]` indexed by `_IDDFUNCENUM::<Name>TableIndex`,
`IddDriverGlobals` implicit arg 1) — the same model `wdk-sys` already implements for WDF. The vendored
`windows-drivers-rs` 0.5.1 (`packaging/windows/drivers/vendor/`, `[patch.crates-io]`'d) gets a first-class
`ApiSubset::Iddcx` that bindgens `iddcx/1.10/IddCx.h` reusing the identical `wdk_default(config)` baseline
(so WDF/DXGI types **resolve to**, not redefine, `wdk-sys`'s — type-identity by construction). The six
knobs `generate_iddcx` needed (each a real gotcha, all CI-proven):

1. **`--language=c++`** — `wdk_default` parses C; `IddCx.h`'s `IDARG_*` typedefs need C++ (else a "must use
   'struct' tag" cascade).
2. **`-DIDD_STUB`** — table-dispatch mode; skips `IddCxFuncEnum.h`'s `#error IDDCX_VERSION_MAJOR not
   defined`. **Do NOT add `WDF_STUB`** (would desync the shared WDF type-identity).
3. **`allowlist_recursively(false)` + `allowlist_file("(?i).*iddcx.*")`, full codegen (no `.complement()`)**
   — emit ONLY IddCx items; WDF/Win types resolve via `use crate::types::*`.
4. **`allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE")`** — emit the non-WDF types
   `wdk-sys` doesn't bindgen, locally. The `_?` is load-bearing (`typedef struct _OPM_X {} OPM_X` needs the
   tag AND the alias).
5. **`pub type UINT = ::core::ffi::c_uint;` in `src/iddcx.rs`** — `UINT` is absent from `crate::types`.
6. **`translate_enum_integer_types(true)`** — emit native `u32` reprs for the DXGI/OPM ModuleConsts enums
   (nested modules can't see a parent `UINT`).

Wrapper note: table dispatch via `_IDDFUNCENUM::<Name>TableIndex as usize` (the ModuleConsts const, **not**
a NewType `.0`); NTSTATUS is plain `i32` (`wdk_sys::NT_SUCCESS`). The driver `build.rs` adds the IddCxStub
link-search (the import lib is under `iddcx\1.0\` even though headers are `1.10`) + `#[no_mangle] pub static
IddMinimumVersionRequired: ULONG = 4`. The versioned `IDD_STRUCTURE_SIZE!` path is dropped — the WDK links
the iddcx **1.0** stub (lacks the version table); we target 1.10 vs a current framework, so `size_of` is
exactly correct.

### 6.3 Driver port checklist (STEP 0–8, as landed)

0. workspace `pf-vdisplay`(cdylib)+`wdk-iddcx`; prove `std::thread`+`OwnedHandle` link under UMDF (done).
1. `wdk-iddcx`: 11 typed DDI wrappers via one dispatch macro + re-export the inbound `PFN_*` types.
2. DriverEntry + `IDD_CX_CLIENT_CONFIG` (15 callbacks) + DeviceInitConfig + WdfDeviceCreate +
   CreateDeviceInterface (the owned pf GUID) + DeviceInitialize; `edid.rs` salvaged verbatim.
3. DeviceContext + `WDF_DECLARE_CONTEXT_TYPE` blob; `init_adapter` in D0Entry (caps + FP16) →
   AdapterInitAsync; the `*2` mode DDIs + `query_target_info` + gamma/HDR accept-stubs. (Box gate: loads
   under Secure Boot, enumerates as an IddCx adapter, Status OK.)
4. control plane (`GET_INFO` version handshake the host asserts, ADD/REMOVE/SET_RENDER_ADAPTER/PING/
   CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + mode bounds; host switched to
   `pf_driver_proto`.
5. `Direct3DDevice` + assign/unassign + `SwapChainProcessor` (worker, `SetDevice` 60×@50 ms single-borrow
   retry, top-of-loop `terminate`, `ReleaseAndAcquireBuffer2`, `from_raw_borrowed`).
6. `FramePublisher` on `pf_driver_proto::frame` + keyed-mutex RAII guard; wire into `run_core`. (Box:
   full IDD-push glass-to-glass + the **secure-desktop** gate — validated 2026-06-25.)
7. HDR / FP16 ring (validated: Mac connects WITH HDR).
8. its own `.inx` + an `unsafe`-reduction pass (`deny(unsafe_op_in_unsafe_fn)`, per-site `// SAFETY:`).

**Remaining driver work** beyond STEP 8: E1 (DeviceContext-owned state + per-`IDDCX_MONITOR`
`EvtCleanupCallback` → unblock `max_concurrent>1`), the slot-reclaim-on-REMOVE fix, and M4 (fold the
gamepad drivers in). See §4.

### 6.4 Resolved product decisions (the five forks)

**A** the host was refactored **in place** (staged, behavior-preserving), not greenfield-rebuilt — the
driver *was* rebuilt fresh. **B** IDD-push primary for everything incl. the **secure desktop** (validated);
WGC+DDA demoted to non-IddCx fallbacks. **C** all drivers on `microsoft/windows-drivers-rs` (+ the `iddcx`
subset; `/INTEGRITYCHECK` solved) — done for `pf-vdisplay`, **pending for the gamepad drivers (M4)**.
**D** keep GameStream (Moonlight), default to secure `serve`. **E** concurrent sessions: the host-side
preempt dance was removed by §2.5, but true `max_concurrent>1` on Windows stays blocked on the E1 driver
swap-chain-reuse work.

---

## Appendix — consolidation note

This file replaces five docs (recoverable from git history):

- `windows-host-rewrite.md` (the original design + plan, §0–§15) — its current status, architecture, the
  jewels, the seam traits, and the deep reference (§6) are folded in here.
- `windows-host-goal1-plan.md` (the 6-stage in-place host refactor) — **complete**; its outcome is §2.2–2.4
  and the Goal-1 scorecard row.
- `windows-host-rewrite-audit.md` (the 2026-06-25 audit) — its findings are reconciled to current reality
  in §1 (scorecard) and §4 (only the still-open items survive: host hygiene, E1, slot-reclaim).
- `windows-host-rewrite-remediation.md` (the audit-remediation tracker) — its landed items are in §1; its
  remaining items (D1-host, D2, E1, G) are §4 P2/P3.
- `windows-host-rewrite-game-capture-bug.md` (the GB1 investigation + fix) — **fixed**; the resolution is
  §2.5 (capture). The full investigation narrative is in git history.

(The older `docs/windows-host.md`, a pre-rewrite implementation plan from 2026-06-22, is a separate
lineage and is left as-is.)