Files
punktfunk/docs/windows-host-rewrite-audit.md
T
enricobuehler 5d279f8886 docs(windows-rewrite): audit-remediation hand-off tracker
Living progress/hand-off doc (docs/windows-host-rewrite-remediation.md): the 9 committed remediation commits with audit refs + how each was verified, the remaining tasks (D2, D1-host, E1, G) with scope / on-glass-gating / verification notes, the box verification recipe, and the new modules introduced. Cross-linked from the audit doc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 14:30:43 +00:00

289 lines
18 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Windows Host Rewrite — Audit
Status: **audit** (2026-06-25). Reviews the state of the Windows host rewrite against its plan
([`docs/windows-host-rewrite.md`](windows-host-rewrite.md)). Read-only assessment — no code changed.
Scope: the new IddCx driver workspace (`packaging/windows/drivers/`), the owned ABI crate
(`crates/pf-vdisplay-proto`), the host-side IDD-push path (`capture/idd_push.rs`,
`vdisplay/pf_vdisplay.rs`), and the deployment/packaging seam. Evidence is cited as `file:line`.
> **Remediation in progress (2026-06-25).** The findings below were the state at audit time; several are
> already being worked through. Resolved since: the **cutover (§3)** — STEP 8 gave the new driver its own
> `.inx` and re-vendored the installer to the new wdk-sys build (`pf_vdisplay.dll` 613 KB → 251 KB), so the
> new driver is now the shipped one; and the **proto ABI hardening (§6.1/§6.2)** — offset asserts + the
> owned gamepad SHM layouts have landed. **Live progress + the hand-off task list are tracked in
> [`docs/windows-host-rewrite-remediation.md`](windows-host-rewrite-remediation.md).**
---
## 0. Bottom line
The framing "the Windows host has been rewritten with IDD-push as the main path" **overstates what is
on disk.** What actually landed is the **driver rewrite** (plan M0 + M1, STEPs 07): a clean, new,
all-Rust IddCx driver (`packaging/windows/drivers/pf-vdisplay`, ~2,000 LOC) on the unified
`windows-drivers-rs` stack, speaking an owned ABI crate (`pf-vdisplay-proto`), validated on-glass through
HDR. That is the hardest, highest-risk part of the plan (the `/INTEGRITYCHECK` answer, the `iddcx` binding
on `wdk-sys`, on-glass IDD-push + HDR) and it is genuinely well executed.
Three facts the framing hides:
1. **The new path is not the shipped path — it is not shipped at all.** The installer still vendors and
installs the **old** `vdisplay-driver/` (wdf-umdf) build
(`packaging/windows/pf-vdisplay/pf_vdisplay.dll`, dated 2026-06-24). The new driver has **no INF
in-tree**, is not vendored, and therefore cannot be packaged. IDD-push capture is gated behind
`PUNKTFUNK_IDD_PUSH`, which is **not set** in `scripts/windows/host.env.example`, so the default
capture path is **WGC→DDA** and the default display backend falls back to **SudoVDA** whenever the new
driver interface isn't enumerable. The new path runs only on a hand-built bench box with the env var
set.
2. **The host-side rewrite — Goal 1 — has not started.** No `src/windows/` tree, no `config.rs`/
`HostConfig`, no `SessionFactory`/`SessionPlan`, no `session/`. The old god-files are intact. SudoVDA
was not removed (135 refs; `sudovda.rs` is a *hard dependency* of the new path). Unsafe went **up**,
not down.
3. **The new driver itself diverges from its own spec in load-bearing ways** — the watchdog is dead code,
`SET_RENDER_ADAPTER` is a stub, the §2.5 ownership-model refactor wasn't done, and world-writable
logging was re-introduced.
So the riskiest **proof** is done (real progress). The **rewrite** (clean architecture, cutover,
hardening) is still ahead.
---
## 1. Goal / milestone scorecard
| Goal / milestone | Status | Evidence |
|---|---|---|
| **M0** proto ABI + driver toolchain + `/INTEGRITYCHECK` + iddcx binding | ✅ Done | `pf-vdisplay-proto`, vendored `windows-drivers-rs`, `clear-force-integrity.ps1` |
| **M1** new IddCx driver, first light + HDR | ✅ Done (on-glass) | STEPs 07; `swap_chain_processor.rs`, `frame_transport.rs`, `callbacks.rs` |
| **Goal 1** clean, layered host architecture | ❌ Not started | no `src/windows/`, `config.rs`, `session/`, `SessionFactory`/`SessionPlan` |
| **Goal 2** drop every trace of SudoVDA | ❌ Not done | 135 `sudovda` refs; `sudovda.rs` (1,193 LOC) is a hard dep of `pf_vdisplay.rs` + `idd_push.rs` |
| **Goal 3** minimize unsafe + P0 lints | ❌ Regressed | host unsafe ~476 (↑); driver ~160 vs ~60 target; **no** P0 lints anywhere; `OwnedHandle` in **0** host files |
| **§2.5** delete driver global statics / DeviceContext-owned state / `EvtCleanupCallback` | ❌ Not done | `MONITOR_MODES`/`NEXT_ID`/`ADAPTER`/`DEVICE_POOL` still process-globals; `DeviceContext{_device}` empty; no monitor cleanup callback |
| **M4** unify gamepad drivers onto new stack | ❌ Not started | workspace members = `wdk-probe/wdk-iddcx/pf-vdisplay` only; gamepad drivers still standalone wdf-umdf |
| **M6** cutover + delete old monoliths | ❌ Not reached | old driver trees + `dxgi/wgc/wgc_relay/sudovda/punktfunk1` all present (partly by-design as "reference until parity") |
---
## 2. What landed well (preserve, do not regress)
- **The §1 driver "jewels" survived the port.** The two real swap-chain leak fixes are verbatim with
their rationale: borrow `IDXGIDevice` once across `SetDevice` retries
(`swap_chain_processor.rs:174`), and check `terminate` at the loop top during a frame burst (`:238`).
`DEVICE_POOL` keyed by render LUID (the NVIDIA UMD-thread/VRAM leak fix) is intact
(`direct_3d_device.rs:115`). Monitor lock discipline (drop the worker **outside** `MONITOR_MODES`) is
correct (`monitor.rs:343-390`).
- **The frame transport is clean and correct** — the standout module. `FramePublisher` uses
`pf_vdisplay_proto::frame` for header/token/names (no hand-rolled offsets), straight-line
acquire→copy→release with no `?` between lock/unlock (`frame_transport.rs:266-275`), format guard
before `CopyResource`, stale-ring generation detection, correct drop order.
- **The proto control plane is properly owned**: fresh GUID (not SudoVDA's `e5bcc234`), centralized
`FrameToken::pack/unpack` used by both sides, and a **real version handshake the host actually
asserts** and bails on mismatch (`pf_vdisplay.rs:455-466`). Typed IOCTL dispatch collapsed the
per-call unsafe (`control.rs`).
- **Per-block `// SAFETY:` discipline** is already present throughout the new driver — most of the value
of `clippy::undocumented_unsafe_blocks` without the lint being on yet.
---
## 3. Deployment gap (the headline)
The new path is built and validated but not reachable by an installed product.
- **Installer ships the old driver.** `packaging/windows/stage-pf-vdisplay.ps1:7-8` vendors the signed
output of `packaging/windows/vdisplay-driver/` (the wdf-umdf tree); `punktfunk-host.iss` installs that
via `install-pf-vdisplay.ps1`. The vendored binary is `packaging/windows/pf-vdisplay/pf_vdisplay.dll`
(613,760 bytes — the old build).
- **New driver is not packageable.** `find packaging/windows/drivers -name '*.inf'` → none. The new
workspace is built + FORCE_INTEGRITY-cleared in CI (`windows-drivers.yml`) as a **compile/link gate
only**; nothing signs or vendors its output.
- **GUID split keeps them apart.** The old driver exposes the old SudoVDA interface GUID; the host's
`sudovda.rs` backend opens it. The new driver exposes the fresh `70667664-…` GUID; only
`pf_vdisplay.rs` opens it. With the old driver installed, `pf_vdisplay::is_available()` → false → the
host silently uses the SudoVDA backend.
- **IDD-push is off by default.** `scripts/windows/host.env.example` sets only
`PUNKTFUNK_ENCODER=auto`, `PUNKTFUNK_VIDEO_SOURCE=virtual`, `PUNKTFUNK_SECURE_DDA=1`, `RUST_LOG=info`.
`PUNKTFUNK_IDD_PUSH` is checked via `var_os(...).is_some()` (`capture.rs:348`, `punktfunk1.rs:2223+`,
`pf_vdisplay.rs:57`) but never set in deployment.
Net: a freshly installed Windows host runs **old driver + SudoVDA backend + WGC/DDA capture** — the
pre-rewrite path. The rewrite is a manually-validated parallel track, not a delivered feature.
---
## 4. Driver code audit — stability / correctness
### 4.1 P0 — the watchdog is dead code; host-crash leaks an orphan monitor
`WATCHDOG_PINGS` is incremented on `IOCTL_PING` (`control.rs:35`) but **nothing reads it** — the only
`thread::spawn` in the driver is the swap-chain worker (`swap_chain_processor.rs:104`). The comments are
misleading: "STEP 4's watchdog thread samples it" (`control.rs:17`) and "the watchdog reaps all monitors"
(`control.rs:14`) describe a thread that does not exist; `adapter_init_finished`
(`callbacks.rs:30-37`) does not start one despite its doc claiming so.
Consequence: if `serve` dies or the service is stopped with `TerminateProcess` (skipping `Drop` → no
`IOCTL_REMOVE`), the virtual monitor + its worker thread + pooled D3D device persist in WUDFHost until the
**next** host start issues `IOCTL_CLEAR_ALL`. If the host is not restarted, the orphan monitor stays
plugged into the desktop topology indefinitely.
The plan called for host-gone detection by **`EvtCleanupCallback` RAII**, a **polling watchdog**, or
**`EvtFileClose`** (§3.4) — none is implemented. Fix: implement the watchdog thread, or (preferred) wire
`EvtFileClose` so "host holds the control handle open" = liveness; and remove the false comments.
### 4.2 P1 — `SET_RENDER_ADAPTER` is a stub → hybrid-GPU is a hard failure
`control.rs:47` returns `STATUS_NOT_IMPLEMENTED`, contradicting plan §3.2 (which made it unconditional).
The driver renders the virtual monitor on whatever adapter the OS picks (`callbacks.rs:275`,
`pooled_device(luid)`) and reports that LUID to the host. On a hybrid **iGPU+dGPU** box, if the OS picks
the iGPU, the host's ring textures (created on the NVENC dGPU) fail `OpenSharedResourceByName`
`DRV_STATUS_TEX_FAIL` (`frame_transport.rs:195-208`) → the host's 20 s hard bail (§5.1). This is a silent
hard failure on common Optimus/hybrid configs. The single-dGPU RTX bench box never reproduced it.
### 4.3 P1 — the §2.5 ownership refactor wasn't done
State is still process-global: `MONITOR_MODES`/`NEXT_ID` (`monitor.rs:63,65`), `ADAPTER`
(`adapter.rs:41`), `DEVICE_POOL` (`direct_3d_device.rs:115`); `DeviceContext` is an empty `{ _device }`
(`entry.rs:20`). No `EvtCleanupCallback` on the monitor object (`monitor.rs:292-296` sets only Size +
scope). Monitor identity is still 3-keyed (`id`/`object`/`session_id`), not the collapsed single
`Monitor`.
This is why the plan's central payoff — *stable monitor reuse → drop the preempt dance → unblock
`max_concurrent>1` on Windows* — was not achieved. The host still does fresh-monitor-per-session with the
`IDD_SETUP_LOCK` preempt + `wait_for_monitor_released` dance (`punktfunk1.rs:2216-2237`), so Windows
IDD-push is effectively single-client even though `DEFAULT_MAX_CONCURRENT = 4`.
### 4.4 P2 — world-writable logging re-introduced
Plan §6 said delete the `C:\Users\Public\*.log` driver logging; the new driver re-added it
(`pf-vdisplay/src/log.rs:18``C:\Users\Public\pfvd-driver.log`). Info-leak / DoS surface; should move to
ETW or be gated off release builds.
### 4.5 P2 — no control-plane input validation
`create_monitor` receives `width/height/refresh` from the IOCTL with no bounds check (`control.rs:62-63`
`monitor.rs:243`). The host is a trusted LocalSystem process so the trust boundary holds, but a buggy
host could request an absurd mode. `read_input` uses `T: Copy`, not `bytemuck::Pod` (`control.rs:96`);
Pod would be a stronger guarantee.
---
## 5. Host code audit
### 5.1 P1 — when IDD-push is engaged there is no fallback
The plan kept WGC/DDA as a safety net; the code commits hard. `capture.rs:345` consumes the keepalive and
returns the IDD-push capturer with "no fall-through"; attach failure surfaces as a **20 s deadline
`bail!`** (`idd_push.rs:820-846`) that tears the session down black rather than degrading to DDA. Combined
with §4.2, hybrid-GPU = a guaranteed 20 s black-then-drop.
### 5.2 P1 — SudoVDA is a hard dependency of the "new" path
`pf_vdisplay.rs` and `idd_push.rs` import `isolate_displays_ccd`/`resolve_render_adapter_luid`/
`set_advanced_color`/`CURRENT_MON_GEN` directly from `super::sudovda` (`pf_vdisplay.rs:43-46`,
`idd_push.rs:351-356,809`). `punktfunk1.rs:2231` calls `crate::vdisplay::sudovda::wait_for_monitor_released`
even when pf-vdisplay is the live backend — benign **today** only because pf-vdisplay preempts inline and
the SudoVDA `MGR` is empty (`pf_vdisplay.rs:645-647`), but it is a fragile cross-static landmine. Plan §9
(move CCD/adapter helpers into neutral `windows/display_ccd.rs` + `adapter.rs`) is the right fix and is
unstarted.
### 5.3 P2 — texture-ownership contract is convention, not types
The §4 in-place-encode hazard is *mitigated* by a host-owned 3-slot `OUT_RING` +
`pipeline_depth().clamp(1, OUT_RING)` (`idd_push.rs:60,867-872`) — sound for the live synchronous loop —
but nothing type-enforces it. `nvenc.rs:7-10` still carries the "safe because the loop is synchronous"
comment, and `repeat_last()` (`idd_push.rs:755-766`) can re-hand an out-ring slot that may still be
encoding under depth>1. Narrow, but it is the residual corruption edge the plan wanted closed type-level.
### 5.4 P2 — HDR toggle recreates the whole ring mid-session
`recreate_ring` (`idd_push.rs:582-617`) drops + recreates all 6 keyed-mutex textures on an HDR mode flip,
polled on a 250 ms throttle (`idd_push.rs:622-626`) → up to a 250 ms format-mismatch freeze window where
the driver drops every frame (`frame_transport.rs:256-260`). Works, but heavy and visibly janky.
---
## 6. ABI / proto
### 6.1 P1 — gamepad SHM was not migrated into proto (the one real drift hazard)
Plan §3.1 wanted `XusbShm` (64 B) and `PadShm` (256 B incl. `device_type`) in `pf-vdisplay-proto`. They
are hand-duplicated across four sides on two build graphs, with `device_type` as a bare literal `140`:
host `inject/dualsense_windows.rs:45-52` (`OFF_DEVTYPE=140`) vs driver `dualsense-driver/src/lib.rs:753`
(`*view.add(140)`); XUSB host `inject/gamepad_windows.rs:36-47` vs driver `xusb-driver/src/lib.rs`. A
one-sided edit compiles clean on both and silently mis-routes. The `pf-vdisplay` frame/control contract
got compile-error-on-drift; the gamepad contract did not. (The gamepad drivers being standalone cargo
workspaces is the structural blocker — folding them into the unified workspace, M4, fixes both.)
### 6.2 P2 — proto advertises offset asserts but only has size asserts
`SharedHeader` (14 mixed-width fields + a `_pad`) is guarded by `size_of == 64` + bytemuck-Pod
(`pf-vdisplay-proto/src/lib.rs:232`), which catches most regressions but not a same-size field reorder.
Add `offset_of!` asserts for `magic/latest/generation/dxgi_format/driver_status` and the `AddReply` LUID
split.
---
## 7. Performance opportunities
- **Hybrid-GPU cross-adapter copy** (once §4.2 `SET_RENDER_ADAPTER` works): pinning the driver render to
the NVENC GPU removes a cross-adapter staging path entirely — correctness *and* latency.
- **HDR ring recreate** (§5.4) is the heaviest per-session-event op; if the display HDR state is known at
`open()` from the negotiated mode, size the ring right the first time and skip the recreate + 250 ms
window in the common case.
- **Keyed-mutex acquire timeout is 8 ms** on the host consume side (`idd_push.rs:725`) — at 240 Hz
(4.2 ms/frame) one stall already drops ≥2 frames. Reasonable as a safety bound; worth measuring under
load against a tighter value plus an explicit drop counter.
- The encode|send split, microburst pacing, and `pipeline_depth=2` convert/copy-vs-NVENC overlap are
preserved — no regression on the hot path.
---
## 8. Hygiene (Goal 3)
- **No P0 lints anywhere.** Neither the host crate nor the new driver crates carry
`deny(unsafe_op_in_unsafe_fn)` / `warn(clippy::undocumented_unsafe_blocks)` /
`warn(clippy::multiple_unsafe_ops_per_block)`. The plan claimed the driver workspace "already has it";
it does not (`pf-vdisplay/src/lib.rs:11` is only `allow(...)`). A few-line, high-leverage first step
before any further unsafe work.
- **`OwnedHandle`/`from_raw_handle` used in zero host files** — the plan's "single biggest cheap win."
`pf_vdisplay.rs` holds a raw `isize` device handle in the pinger thread; `idd_push.rs` holds raw
event/map handles. Obvious first conversions.
- **Unsafe counts moved the wrong way.** Host ~476 (target ~35); new driver ~160 (target for all three
drivers ~60), and the old gamepad drivers are untouched on top of that.
---
## 9. Recommended priority order
**P0 — correctness/stability, before relying on the path**
1. Make host-gone detection real: implement the watchdog thread **or** `EvtFileClose`, and delete the
false "watchdog" comments. Verify service stop is cooperative (named stop event → `Drop`
`IOCTL_REMOVE`), not `TerminateProcess`. (§4.1)
2. Implement `SET_RENDER_ADAPTER` (pin driver render to the NVENC adapter) **and** add a real capture
fallback (IDD-push attach failure → DDA) instead of the 20 s black bail. (§4.2, §5.1)
**P1 — ship-ability + the actual rewrite**
3. Cutover plan: give the new driver an in-tree INF, vendor *its* signed output, flip
`stage-pf-vdisplay.ps1`, and make IDD-push the code default (WGC/DDA fallback) or set
`PUNKTFUNK_IDD_PUSH=1` in `host.env`. Until then the rewrite does not reach users. (§3)
4. Migrate the gamepad SHM into `pf-vdisplay-proto` (kills the `140`-literal drift hazard). (§6.1)
5. Add the P0 lints; convert raw handles to `OwnedHandle`. (§8)
**P2 — the host-side architecture (Goal 1, the bulk of "rewrite the host")**
6. §2.5 driver ownership refactor (DeviceContext state + `EvtCleanupCallback` + single monitor identity)
— the prerequisite to `max_concurrent>1` on Windows. (§4.3)
7. §9 SudoVDA decoupling (split CCD/adapter helpers into neutral modules), then the §2.2/§2.4 host tree
(`config.rs`/`SessionFactory`) — the clean architecture that was Goal 1. (§5.2)
8. Offset asserts in proto; remove world-writable driver logging; M4 gamepad-driver unification; then M6
deletion of the old monoliths. (§6.2, §4.4)
---
## Appendix — methodology
Full read of the new driver (`packaging/windows/drivers/pf-vdisplay/src/*.rs`, `wdk-iddcx/src/lib.rs`)
and `pf-vdisplay-proto`; targeted read of the host IDD-push path (`capture/idd_push.rs`,
`vdisplay/pf_vdisplay.rs`, `capture.rs`, `vdisplay.rs`, `encode.rs`, `encode/nvenc.rs`); structural
grep/diff of plan §2.2/§6/§8/§9/§10 against the on-disk tree; packaging/CI inspection
(`punktfunk-host.iss`, `stage-pf-vdisplay.ps1`, `windows-drivers.yml`, `scripts/windows/host.env.example`).
Unsafe counts are raw `grep -c unsafe` over the relevant subtrees (occurrences, not blocks). Not validated
on hardware — this audit reads code and packaging only; on-glass behavior is per the commit log and
[`docs/windows-host-rewrite.md`](windows-host-rewrite.md) §1314.