Driver track (M0+M1, STEPs 0-7) landed and is on-glass-validated, but the host-side goals (clean architecture, SudoVDA removal, unsafe reduction) and several driver-spec items (host-gone watchdog, SET_RENDER_ADAPTER, ownership model) are not yet done. Full findings + a prioritized P0-P2 fix list in the doc. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Windows Host Rewrite — Audit
Status: audit (2026-06-25). Reviews the state of the Windows host rewrite against its plan
(docs/windows-host-rewrite.md). Read-only assessment — no code changed.
Scope: the new IddCx driver workspace (packaging/windows/drivers/), the owned ABI crate
(crates/pf-vdisplay-proto), the host-side IDD-push path (capture/idd_push.rs,
vdisplay/pf_vdisplay.rs), and the deployment/packaging seam. Evidence is cited as file:line.
Remediation in progress (2026-06-25). The findings below were the state at audit time; several are already being worked through. Resolved since: the cutover (§3), where STEP 8 gave the new driver its own .inx and re-vendored the installer to the new wdk-sys build (pf_vdisplay.dll 613 KB → 251 KB), so the new driver is now the shipped one; and the proto ABI hardening (§6.1/§6.2), where offset asserts and the owned gamepad SHM layouts have landed. Remaining items are tracked and in progress.
0. Bottom line
The framing "the Windows host has been rewritten with IDD-push as the main path" overstates what is
on disk. What actually landed is the driver rewrite (plan M0 + M1, STEPs 0–7): a clean, new,
all-Rust IddCx driver (packaging/windows/drivers/pf-vdisplay, ~2,000 LOC) on the unified
windows-drivers-rs stack, speaking an owned ABI crate (pf-vdisplay-proto), validated on-glass through
HDR. That is the hardest, highest-risk part of the plan (the /INTEGRITYCHECK answer, the iddcx binding
on wdk-sys, on-glass IDD-push + HDR) and it is genuinely well executed.
Three facts the framing hides:
- The new path is not the shipped path; it is not shipped at all. The installer still vendors and
  installs the old vdisplay-driver/ (wdf-umdf) build (packaging/windows/pf-vdisplay/pf_vdisplay.dll,
  dated 2026-06-24). The new driver has no INF in-tree, is not vendored, and therefore cannot be
  packaged. IDD-push capture is gated behind PUNKTFUNK_IDD_PUSH, which is not set in
  scripts/windows/host.env.example, so the default capture path is WGC→DDA and the default display
  backend falls back to SudoVDA whenever the new driver interface isn't enumerable. The new path runs
  only on a hand-built bench box with the env var set.
- The host-side rewrite (Goal 1) has not started. No src/windows/ tree, no config.rs/HostConfig, no
  SessionFactory/SessionPlan, no session/. The old god-files are intact. SudoVDA was not removed
  (135 refs; sudovda.rs is a hard dependency of the new path). Unsafe went up, not down.
- The new driver itself diverges from its own spec in load-bearing ways: the watchdog is dead code,
  SET_RENDER_ADAPTER is a stub, the §2.5 ownership-model refactor wasn't done, and world-writable
  logging was re-introduced.
So the riskiest proof is done (real progress). The rewrite (clean architecture, cutover, hardening) is still ahead.
1. Goal / milestone scorecard
| Goal / milestone | Status | Evidence |
|---|---|---|
| M0 proto ABI + driver toolchain + /INTEGRITYCHECK + iddcx binding | ✅ Done | pf-vdisplay-proto, vendored windows-drivers-rs, clear-force-integrity.ps1 |
| M1 new IddCx driver, first light + HDR | ✅ Done (on-glass) | STEPs 0–7; swap_chain_processor.rs, frame_transport.rs, callbacks.rs |
| Goal 1 clean, layered host architecture | ❌ Not started | no src/windows/, config.rs, session/, SessionFactory/SessionPlan |
| Goal 2 drop every trace of SudoVDA | ❌ Not done | 135 sudovda refs; sudovda.rs (1,193 LOC) is a hard dep of pf_vdisplay.rs + idd_push.rs |
| Goal 3 minimize unsafe + P0 lints | ❌ Regressed | host unsafe ~476 (↑); driver ~160 vs ~60 target; no P0 lints anywhere; OwnedHandle in 0 host files |
| §2.5 delete driver global statics / DeviceContext-owned state / EvtCleanupCallback | ❌ Not done | MONITOR_MODES/NEXT_ID/ADAPTER/DEVICE_POOL still process-globals; DeviceContext{_device} empty; no monitor cleanup callback |
| M4 unify gamepad drivers onto new stack | ❌ Not started | workspace members = wdk-probe/wdk-iddcx/pf-vdisplay only; gamepad drivers still standalone wdf-umdf |
| M6 cutover + delete old monoliths | ❌ Not reached | old driver trees + dxgi/wgc/wgc_relay/sudovda/punktfunk1 all present (partly by-design as "reference until parity") |
2. What landed well (preserve, do not regress)
- The §1 driver "jewels" survived the port. The two real swap-chain leak fixes are verbatim with
  their rationale: borrow IDXGIDevice once across SetDevice retries (swap_chain_processor.rs:174),
  and check terminate at the loop top during a frame burst (:238). DEVICE_POOL keyed by render LUID
  (the NVIDIA UMD-thread/VRAM leak fix) is intact (direct_3d_device.rs:115). Monitor lock discipline
  (drop the worker outside MONITOR_MODES) is correct (monitor.rs:343-390).
- The frame transport is clean and correct; it is the standout module. FramePublisher uses
  pf_vdisplay_proto::frame for header/token/names (no hand-rolled offsets), straight-line
  acquire→copy→release with no ? between lock/unlock (frame_transport.rs:266-275), format guard
  before CopyResource, stale-ring generation detection, correct drop order.
- The proto control plane is properly owned: fresh GUID (not SudoVDA's e5bcc234), centralized
  FrameToken::pack/unpack used by both sides, and a real version handshake the host actually asserts
  and bails on mismatch (pf_vdisplay.rs:455-466). Typed IOCTL dispatch collapsed the per-call unsafe
  (control.rs).
- Per-block // SAFETY: discipline is already present throughout the new driver, which delivers most
  of the value of clippy::undocumented_unsafe_blocks without the lint being on yet.
3. Deployment gap (the headline)
The new path is built and validated but not reachable by an installed product.
- Installer ships the old driver. packaging/windows/stage-pf-vdisplay.ps1:7-8 vendors the signed
  output of packaging/windows/vdisplay-driver/ (the wdf-umdf tree); punktfunk-host.iss installs that
  via install-pf-vdisplay.ps1. The vendored binary is packaging/windows/pf-vdisplay/pf_vdisplay.dll
  (613,760 bytes, the old build).
- New driver is not packageable. find packaging/windows/drivers -name '*.inf' → none. The new
  workspace is built + FORCE_INTEGRITY-cleared in CI (windows-drivers.yml) as a compile/link gate
  only; nothing signs or vendors its output.
- GUID split keeps them apart. The old driver exposes the old SudoVDA interface GUID; the host's
  sudovda.rs backend opens it. The new driver exposes the fresh 70667664-… GUID; only pf_vdisplay.rs
  opens it. With the old driver installed, pf_vdisplay::is_available() → false → the host silently
  uses the SudoVDA backend.
- IDD-push is off by default. scripts/windows/host.env.example sets only PUNKTFUNK_ENCODER=auto,
  PUNKTFUNK_VIDEO_SOURCE=virtual, PUNKTFUNK_SECURE_DDA=1, RUST_LOG=info. PUNKTFUNK_IDD_PUSH is
  checked via var_os(...).is_some() (capture.rs:348, punktfunk1.rs:2223+, pf_vdisplay.rs:57) but
  never set in deployment.
Net: a freshly installed Windows host runs old driver + SudoVDA backend + WGC/DDA capture — the pre-rewrite path. The rewrite is a manually-validated parallel track, not a delivered feature.
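A side effect of the presence-only check is worth keeping in mind when the flag is eventually added to host.env: `var_os(...).is_some()` is true for any value, so `PUNKTFUNK_IDD_PUSH=0` would still enable the path. The sketch below contrasts the two semantics. The helper names `flag_present` and `flag_truthy` are hypothetical and do not exist in the tree, and the set of accepted truthy strings is an assumption.

```rust
use std::ffi::OsStr;

/// Presence-only semantics, mirroring `var_os(..).is_some()`:
/// any value, including "0", counts as enabled.
pub fn flag_present(value: Option<&OsStr>) -> bool {
    value.is_some()
}

/// Hypothetical stricter parse: only an explicit truthy value enables the path.
/// The accepted strings ("1", "true", "on") are an assumption for illustration.
pub fn flag_truthy(value: Option<&OsStr>) -> bool {
    match value.and_then(|v| v.to_str()) {
        Some(v) => matches!(v.trim(), "1" | "true" | "on"),
        None => false,
    }
}
```

In real code the argument would come from `std::env::var_os("PUNKTFUNK_IDD_PUSH").as_deref()`; taking it as a parameter keeps the sketch free of process-global state.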
4. Driver code audit — stability / correctness
4.1 P0 — the watchdog is dead code; host-crash leaks an orphan monitor
WATCHDOG_PINGS is incremented on IOCTL_PING (control.rs:35) but nothing reads it — the only
thread::spawn in the driver is the swap-chain worker (swap_chain_processor.rs:104). The comments are
misleading: "STEP 4's watchdog thread samples it" (control.rs:17) and "the watchdog reaps all monitors"
(control.rs:14) describe a thread that does not exist; adapter_init_finished
(callbacks.rs:30-37) does not start one despite its doc claiming so.
Consequence: if serve dies or the service is stopped with TerminateProcess (skipping Drop → no
IOCTL_REMOVE), the virtual monitor + its worker thread + pooled D3D device persist in WUDFHost until the
next host start issues IOCTL_CLEAR_ALL. If the host is not restarted, the orphan monitor stays
plugged into the desktop topology indefinitely.
The plan called for host-gone detection by EvtCleanupCallback RAII, a polling watchdog, or
EvtFileClose (§3.4) — none is implemented. Fix: implement the watchdog thread, or (preferred) wire
EvtFileClose so "host holds the control handle open" = liveness; and remove the false comments.
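If the polling-watchdog option is chosen, the sampling logic itself is small. The following is a minimal sketch under stated assumptions: `PingWatch`, its `limit` parameter, and the idea of calling `sample` once per polling interval are all hypothetical, and the counter is modeled as a plain `AtomicU64` standing in for WATCHDOG_PINGS. It does not show the reaping step or any WDF calls.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Tracks whether the host keeps pinging. All names here are illustrative.
pub struct PingWatch {
    last_seen: u64,
    missed: u32,
    limit: u32,
}

impl PingWatch {
    /// `limit` is the number of consecutive silent samples tolerated.
    pub fn new(limit: u32) -> Self {
        Self { last_seen: 0, missed: 0, limit }
    }

    /// Call once per polling interval. Returns true once `limit` consecutive
    /// samples saw no new ping, i.e. the host is presumed gone.
    pub fn sample(&mut self, pings: &AtomicU64) -> bool {
        let now = pings.load(Ordering::Relaxed);
        if now != self.last_seen {
            // The counter moved: the host is alive, reset the miss count.
            self.last_seen = now;
            self.missed = 0;
        } else {
            self.missed = self.missed.saturating_add(1);
        }
        self.missed >= self.limit
    }
}
```

A watchdog thread would own one `PingWatch`, sleep for the polling interval, call `sample`, and reap all monitors when it returns true. The EvtFileClose route avoids the thread and the tuning of `limit` altogether, which is one reason the audit prefers it.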
4.2 P1 — SET_RENDER_ADAPTER is a stub → hybrid-GPU is a hard failure
control.rs:47 returns STATUS_NOT_IMPLEMENTED, contradicting plan §3.2 (which made it unconditional).
The driver renders the virtual monitor on whatever adapter the OS picks (callbacks.rs:275,
pooled_device(luid)) and reports that LUID to the host. On a hybrid iGPU+dGPU box, if the OS picks
the iGPU, the host's ring textures (created on the NVENC dGPU) fail OpenSharedResourceByName →
DRV_STATUS_TEX_FAIL (frame_transport.rs:195-208) → the host's 20 s hard bail (§5.1). This is a silent
hard failure on common Optimus/hybrid configs. The single-dGPU RTX bench box never reproduced it.
4.3 P1 — the §2.5 ownership refactor wasn't done
State is still process-global: MONITOR_MODES/NEXT_ID (monitor.rs:63,65), ADAPTER
(adapter.rs:41), DEVICE_POOL (direct_3d_device.rs:115); DeviceContext is an empty { _device }
(entry.rs:20). No EvtCleanupCallback on the monitor object (monitor.rs:292-296 sets only Size +
scope). Monitor identity is still 3-keyed (id/object/session_id), not the collapsed single
Monitor.
This is why the plan's central payoff — stable monitor reuse → drop the preempt dance → unblock
max_concurrent>1 on Windows — was not achieved. The host still does fresh-monitor-per-session with the
IDD_SETUP_LOCK preempt + wait_for_monitor_released dance (punktfunk1.rs:2216-2237), so Windows
IDD-push is effectively single-client even though DEFAULT_MAX_CONCURRENT = 4.
4.4 P2 — world-writable logging re-introduced
Plan §6 said delete the C:\Users\Public\*.log driver logging; the new driver re-added it
(pf-vdisplay/src/log.rs:18 → C:\Users\Public\pfvd-driver.log). Info-leak / DoS surface; should move to
ETW or be gated off release builds.
4.5 P2 — no control-plane input validation
create_monitor receives width/height/refresh from the IOCTL with no bounds check (control.rs:62-63
→ monitor.rs:243). The host is a trusted LocalSystem process so the trust boundary holds, but a buggy
host could request an absurd mode. read_input uses T: Copy, not bytemuck::Pod (control.rs:96);
Pod would be a stronger guarantee.
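A bounds check of the kind meant here could look like the sketch below. The limits `MAX_DIM` and `MAX_REFRESH_HZ`, the `ModeError` enum, and the assumption that refresh arrives in whole Hz are all hypothetical; the real IOCTL struct may use different units and the right limits depend on what the driver's mode list can express.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ModeError {
    ZeroDimension,
    TooLarge,
    BadRefresh,
}

// Hypothetical limits, chosen only for illustration.
const MAX_DIM: u32 = 16_384;
const MAX_REFRESH_HZ: u32 = 1_000;

/// Rejects obviously absurd mode requests before they reach monitor creation.
pub fn validate_mode(width: u32, height: u32, refresh_hz: u32) -> Result<(), ModeError> {
    if width == 0 || height == 0 {
        return Err(ModeError::ZeroDimension);
    }
    if width > MAX_DIM || height > MAX_DIM {
        return Err(ModeError::TooLarge);
    }
    if refresh_hz == 0 || refresh_hz > MAX_REFRESH_HZ {
        return Err(ModeError::BadRefresh);
    }
    Ok(())
}
```

Called at the top of the create-monitor handler, a failed check would map to an invalid-parameter status instead of letting the request proceed.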
5. Host code audit
5.1 P1 — when IDD-push is engaged there is no fallback
The plan kept WGC/DDA as a safety net; the code commits hard. capture.rs:345 consumes the keepalive and
returns the IDD-push capturer with "no fall-through"; attach failure surfaces as a 20 s deadline
bail! (idd_push.rs:820-846) that tears the session down black rather than degrading to DDA. Combined
with §4.2, hybrid-GPU = a guaranteed 20 s black-then-drop.
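The degrade-instead-of-bail behavior the plan intended can be sketched as a small selection function. This is an illustration only: `CapturePath`, `open_capture`, and the closure parameters are hypothetical names, and the closures stand in for the real attach routines, whose signatures are not shown in this audit.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum CapturePath {
    IddPush,
    Dda,
}

/// Tries the preferred attach first and degrades to DDA instead of bailing.
/// Only if both attaches fail does the caller see an error (the DDA one).
pub fn open_capture<P, F, E>(attach_idd_push: P, attach_dda: F) -> Result<CapturePath, E>
where
    P: FnOnce() -> Result<(), E>,
    F: FnOnce() -> Result<(), E>,
{
    match attach_idd_push() {
        Ok(()) => Ok(CapturePath::IddPush),
        // A real implementation would log the first error before falling back.
        Err(_first) => attach_dda().map(|()| CapturePath::Dda),
    }
}
```

The fallback only helps if the IDD-push attach fails fast; with the current 20 s deadline the session would still show black for that long before DDA takes over.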
5.2 P1 — SudoVDA is a hard dependency of the "new" path
pf_vdisplay.rs and idd_push.rs import isolate_displays_ccd/resolve_render_adapter_luid/
set_advanced_color/CURRENT_MON_GEN directly from super::sudovda (pf_vdisplay.rs:43-46,
idd_push.rs:351-356,809). punktfunk1.rs:2231 calls crate::vdisplay::sudovda::wait_for_monitor_released
even when pf-vdisplay is the live backend — benign today only because pf-vdisplay preempts inline and
the SudoVDA MGR is empty (pf_vdisplay.rs:645-647), but it is a fragile cross-static landmine. Plan §9
(move CCD/adapter helpers into neutral windows/display_ccd.rs + adapter.rs) is the right fix and is
unstarted.
5.3 P2 — texture-ownership contract is convention, not types
The §4 in-place-encode hazard is mitigated by a host-owned 3-slot OUT_RING +
pipeline_depth().clamp(1, OUT_RING) (idd_push.rs:60,867-872) — sound for the live synchronous loop —
but nothing type-enforces it. nvenc.rs:7-10 still carries the "safe because the loop is synchronous"
comment, and repeat_last() (idd_push.rs:755-766) can re-hand an out-ring slot that may still be
encoding under depth>1. Narrow, but it is the residual corruption edge the plan wanted closed type-level.
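One way to close this edge at the type level is a lease token: a slot can only be encoded from while a non-clonable lease for it exists, and the slot returns to the free list only when the lease is handed back. The sketch below is a hypothetical shape, not the existing OUT_RING code; `EncodeLease` and `OutRing` as written here are illustrative names.

```rust
/// Exclusive right to encode from one out-ring slot.
/// Deliberately neither Clone nor Copy, so a slot cannot be handed out twice.
#[derive(Debug)]
pub struct EncodeLease {
    slot: usize,
}

impl EncodeLease {
    pub fn slot(&self) -> usize {
        self.slot
    }
}

/// Free-list view of the out ring. Hypothetical; tracks indices only.
pub struct OutRing {
    free: Vec<usize>,
}

impl OutRing {
    pub fn new(slots: usize) -> Self {
        // Reversed so that `pop` hands out slot 0 first.
        Self { free: (0..slots).rev().collect() }
    }

    /// Returns a lease for a free slot, or None if every slot is in flight.
    pub fn lease(&mut self) -> Option<EncodeLease> {
        self.free.pop().map(|slot| EncodeLease { slot })
    }

    /// Consumes the lease, making the slot available again.
    pub fn release(&mut self, lease: EncodeLease) {
        self.free.push(lease.slot);
    }
}
```

Under this scheme a `repeat_last()` equivalent would have to obtain a lease like any other caller, so it could not re-hand a slot that is still encoding.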
5.4 P2 — HDR toggle recreates the whole ring mid-session
recreate_ring (idd_push.rs:582-617) drops + recreates all 6 keyed-mutex textures on an HDR mode flip,
polled on a 250 ms throttle (idd_push.rs:622-626) → up to a 250 ms format-mismatch freeze window where
the driver drops every frame (frame_transport.rs:256-260). Works, but heavy and visibly janky.
6. ABI / proto
6.1 P1 — gamepad SHM was not migrated into proto (the one real drift hazard)
Plan §3.1 wanted XusbShm (64 B) and PadShm (256 B incl. device_type) in pf-vdisplay-proto. They
are hand-duplicated across four sides on two build graphs, with device_type as a bare literal 140:
host inject/dualsense_windows.rs:45-52 (OFF_DEVTYPE=140) vs driver dualsense-driver/src/lib.rs:753
(*view.add(140)); XUSB host inject/gamepad_windows.rs:36-47 vs driver xusb-driver/src/lib.rs. A
one-sided edit compiles clean on both and silently mis-routes. The pf-vdisplay frame/control contract
got compile-error-on-drift; the gamepad contract did not. (The gamepad drivers being standalone cargo
workspaces is the structural blocker — folding them into the unified workspace, M4, fixes both.)
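The drift hazard goes away once both sides read the offset from one definition. A minimal sketch of such a shared module follows. Only the 256-byte size and the offset 140 come from the findings above; the module name `pad_shm`, the accessor functions, and the assumption that `device_type` is a single byte are hypothetical.

```rust
/// Hypothetical shared layout module for the PadShm region.
pub mod pad_shm {
    /// Total size of the shared region in bytes.
    pub const SIZE: usize = 256;
    /// Byte offset of `device_type` (assumed here to be one byte wide).
    pub const OFF_DEVICE_TYPE: usize = 140;

    // Compile-time guard: moving the field outside the region fails the build.
    const _: () = assert!(OFF_DEVICE_TYPE < SIZE);

    /// Host side: publish the device type.
    pub fn write_device_type(shm: &mut [u8; SIZE], device_type: u8) {
        shm[OFF_DEVICE_TYPE] = device_type;
    }

    /// Driver side: read the device type back.
    pub fn read_device_type(shm: &[u8; SIZE]) -> u8 {
        shm[OFF_DEVICE_TYPE]
    }
}
```

With host and driver both depending on the crate that holds this module, a one-sided change to the offset is no longer possible: either both sides move or neither does.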
6.2 P2 — proto advertises offset asserts but only has size asserts
SharedHeader (14 mixed-width fields + a _pad) is guarded by size_of == 64 + bytemuck-Pod
(pf-vdisplay-proto/src/lib.rs:232), which catches most regressions but not a same-size field reorder.
Add offset_of! asserts for magic/latest/generation/dxgi_format/driver_status and the AddReply LUID
split.
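For reference, offset asserts of this kind are compile-time constants built on `core::mem::offset_of!` (stable since Rust 1.77). The struct below is a made-up 32-byte `DemoHeader`, not the real 64-byte SharedHeader, whose exact field order is not reproduced in this audit; only the field names are borrowed.

```rust
use std::mem::{offset_of, size_of};

/// Illustrative header, NOT the real SharedHeader layout.
#[repr(C)]
pub struct DemoHeader {
    pub magic: u32,
    pub version: u32,
    pub latest: u64,
    pub generation: u32,
    pub dxgi_format: u32,
    pub driver_status: u32,
    pub _pad: u32,
}

// A size assert alone would not catch swapping two same-width fields.
const _: () = assert!(size_of::<DemoHeader>() == 32);

// Offset asserts pin each load-bearing field to its position.
const _: () = assert!(offset_of!(DemoHeader, magic) == 0);
const _: () = assert!(offset_of!(DemoHeader, latest) == 8);
const _: () = assert!(offset_of!(DemoHeader, generation) == 16);
const _: () = assert!(offset_of!(DemoHeader, dxgi_format) == 20);
const _: () = assert!(offset_of!(DemoHeader, driver_status) == 24);
```

Swapping `generation` and `dxgi_format` in this struct keeps the size at 32 bytes but fails the build on the offset asserts, which is exactly the regression the size check misses.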
7. Performance opportunities
- Hybrid-GPU cross-adapter copy (once §4.2 SET_RENDER_ADAPTER works): pinning the driver render to
  the NVENC GPU removes a cross-adapter staging path entirely, which helps both correctness and
  latency.
- HDR ring recreate (§5.4) is the heaviest per-session-event op; if the display HDR state is known
  at open() from the negotiated mode, size the ring right the first time and skip the recreate +
  250 ms window in the common case.
- Keyed-mutex acquire timeout is 8 ms on the host consume side (idd_push.rs:725). At 240 Hz
  (4.2 ms/frame) a single full-length stall spans almost two frame intervals, so it already costs
  about two frames. Reasonable as a safety bound; worth measuring under load against a tighter value
  plus an explicit drop counter.
- The encode|send split, microburst pacing, and pipeline_depth=2 convert/copy-vs-NVENC overlap are
  preserved, so there is no regression on the hot path.
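The arithmetic behind the keyed-mutex timeout point can be checked with two small helpers. The function names are hypothetical; the inputs (8 ms timeout, 240 Hz) are the figures quoted above.

```rust
/// Frame interval in milliseconds for a given refresh rate.
pub fn frame_interval_ms(refresh_hz: f64) -> f64 {
    1000.0 / refresh_hz
}

/// How many frame intervals a stall of `stall_ms` spans at `refresh_hz`.
pub fn intervals_spanned(stall_ms: f64, refresh_hz: f64) -> f64 {
    stall_ms / frame_interval_ms(refresh_hz)
}
```

At 240 Hz the interval is about 4.17 ms, so an 8 ms stall spans 1.92 intervals; at 60 Hz the same stall spans 0.48 of one interval, which is why the bound only starts to matter at high refresh rates.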
8. Hygiene (Goal 3)
- No P0 lints anywhere. Neither the host crate nor the new driver crates carry
  deny(unsafe_op_in_unsafe_fn) / warn(clippy::undocumented_unsafe_blocks) /
  warn(clippy::multiple_unsafe_ops_per_block). The plan claimed the driver workspace "already has
  it"; it does not (pf-vdisplay/src/lib.rs:11 is only allow(...)). A few-line, high-leverage first
  step before any further unsafe work.
- OwnedHandle/from_raw_handle used in zero host files, the plan's "single biggest cheap win."
  pf_vdisplay.rs holds a raw isize device handle in the pinger thread; idd_push.rs holds raw
  event/map handles. Obvious first conversions.
- Unsafe counts moved the wrong way. Host ~476 (target ~35); new driver ~160 (target for all three
  drivers ~60), and the old gamepad drivers are untouched on top of that.
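As an illustration of what the three lints ask for, the sketch below applies them to a small module. The module `raw` and the function `read_u32` are hypothetical. The lints are attached at module scope here only so the example is self-contained; in the crates themselves they would be inner attributes (`#![deny(...)]`, `#![warn(...)]`) at the top of lib.rs.

```rust
#[deny(unsafe_op_in_unsafe_fn)]
#[warn(clippy::undocumented_unsafe_blocks)]
#[warn(clippy::multiple_unsafe_ops_per_block)]
pub mod raw {
    /// Reads a native-endian u32 from a possibly unaligned pointer.
    ///
    /// # Safety
    /// `ptr` must be valid for reading 4 bytes.
    pub unsafe fn read_u32(ptr: *const u8) -> u32 {
        // With `unsafe_op_in_unsafe_fn` denied, the unsafe operation needs its
        // own block even inside an `unsafe fn`, and the Clippy lints ask for
        // the justification comment and a single operation per block.
        // SAFETY: the caller guarantees `ptr` is valid for a 4-byte read, and
        // `read_unaligned` has no alignment requirement.
        unsafe { ptr.cast::<u32>().read_unaligned() }
    }
}
```

The two Clippy lints only fire under `cargo clippy`; plain `rustc` accepts the attributes and enforces only `unsafe_op_in_unsafe_fn`.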
9. Recommended priority order
P0 — correctness/stability, before relying on the path
1. Make host-gone detection real: implement the watchdog thread or EvtFileClose, and delete the
false "watchdog" comments. Verify service stop is cooperative (named stop event → Drop →
IOCTL_REMOVE), not TerminateProcess. (§4.1)
2. Implement SET_RENDER_ADAPTER (pin driver render to the NVENC adapter) and add a real capture
fallback (IDD-push attach failure → DDA) instead of the 20 s black bail. (§4.2, §5.1)
P1 — ship-ability + the actual rewrite
3. Cutover plan: give the new driver an in-tree INF, vendor its signed output, flip
stage-pf-vdisplay.ps1, and make IDD-push the code default (WGC/DDA fallback) or set
PUNKTFUNK_IDD_PUSH=1 in host.env. Until then the rewrite does not reach users. (§3)
4. Migrate the gamepad SHM into pf-vdisplay-proto (kills the 140-literal drift hazard). (§6.1)
5. Add the P0 lints; convert raw handles to OwnedHandle. (§8)
P2 — the host-side architecture (Goal 1, the bulk of "rewrite the host")
6. §2.5 driver ownership refactor (DeviceContext state + EvtCleanupCallback + single monitor identity)
— the prerequisite to max_concurrent>1 on Windows. (§4.3)
7. §9 SudoVDA decoupling (split CCD/adapter helpers into neutral modules), then the §2.2/§2.4 host tree
(config.rs/SessionFactory) — the clean architecture that was Goal 1. (§5.2)
8. Offset asserts in proto; remove world-writable driver logging; M4 gamepad-driver unification; then M6
deletion of the old monoliths. (§6.2, §4.4)
Appendix — methodology
Full read of the new driver (packaging/windows/drivers/pf-vdisplay/src/*.rs, wdk-iddcx/src/lib.rs)
and pf-vdisplay-proto; targeted read of the host IDD-push path (capture/idd_push.rs,
vdisplay/pf_vdisplay.rs, capture.rs, vdisplay.rs, encode.rs, encode/nvenc.rs); structural
grep/diff of plan §2.2/§6/§8/§9/§10 against the on-disk tree; packaging/CI inspection
(punktfunk-host.iss, stage-pf-vdisplay.ps1, windows-drivers.yml, scripts/windows/host.env.example).
Unsafe counts are raw grep -c unsafe over the relevant subtrees (occurrences, not blocks). Not validated
on hardware — this audit reads code and packaging only; on-glass behavior is per the commit log and
docs/windows-host-rewrite.md §13–14.