Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Windows Host — Architecture, Status & Roadmap
Single source of truth for the punktfunk Windows streaming host: the all-Rust
pf-vdisplay IddCx virtual-display driver + IDD-push zero-copy capture + NVENC/AMF/QSV encode, shipped as
a signed Inno Setup installer with a LocalSystem SCM service. Live-validated on the RTX box through
5120×1440@240 HDR, the secure desktop (lock/UAC), and a fullscreen game.
This file is the consolidated Windows-host doc: it absorbs the rewrite design plan, the Goal-1
staged-refactor plan, the audit + remediation tracker, the fullscreen-game capture-bug analysis, and the
durable rationale from the original windows-host.md implementation plan (now a stub). Last updated
2026-06-26. All of this work is merged to main (the windows-host-goal1 branch landed at 3e7c9bd).
1. Status at a glance
Goals 1–3 and milestones M0–M4 are complete and merged to main. The host has a clean, typed,
layered architecture (HostConfig → SessionPlan → SessionContext, windows/+linux/ confinement, a
single VirtualDisplayManager ownership model, EncoderCaps); the all-Rust IddCx pf-vdisplay driver
loads self-signed under Secure Boot and does IDD-push zero-copy capture at 5K@240 HDR including the
secure desktop (Winlogon/UAC/lock); SudoVDA is gone (84a3b95) — pf-vdisplay is the sole
virtual-display backend; and the three UMDF drivers (pf-vdisplay, pf-dualsense, pf-xusb) now build
from source in one unified packaging/windows/drivers/ workspace (M4, 92e6802). The shipped path
(IDD-push + NVENC) is live-validated on glass; the AMF/QSV encode path is CI-green but not yet
on-hardware (no AMD/Intel Windows box in the lab).
Ground the details against the code: crates/punktfunk-host/src/windows/,
crates/punktfunk-host/src/{capture,encode,inject,audio,vdisplay}/windows/, and
packaging/windows/drivers/.
What remains (all non-blocking): the pf-vdisplay slot-reclaim-on-REMOVE fix needs an on-glass
reconnect-storm A/B (§4 P1.3); host-crate unsafe lint hygiene + old-monolith / bring-up-scaffolding
cleanup (§4 P2); and the hardware-gated items — AMF/QSV on-glass, hybrid-GPU SET_RENDER_ADAPTER,
the WGC/DDA fallback reshape, and true max_concurrent>1 (§4 P3). One framing note: the host was
not greenfield-rebuilt — it was refactored in place via a staged, behavior-preserving sequence
that kept the live host working at every step; only the driver was rebuilt fresh.
2. Architecture (what is on disk)
A ~1-page map; the empirical constraints these encode are in §3, the deep reference is in §6.
2.1 Layering & crates
- crates/punktfunk-host: one shared host crate (Linux + Windows; not split). Platform code is confined
  under per-module windows/ + linux/ folders behind #[cfg] seams (capture/{windows,linux}/, encode/…,
  inject/…, audio/…, vdisplay/…, plus top-level src/windows/ + src/linux/). Module names stay flat
  (#[path]), so caller paths are platform-agnostic.
- crates/punktfunk-core: the one linked protocol/FEC/crypto/QUIC core (unchanged here).
- crates/pf-driver-proto: the owned, no_std host↔driver ABI (frame ring + control plane + gamepad SHM),
  consumed by both the host crate and the driver workspace.
- packaging/windows/drivers/: the unified driver workspace on microsoft/windows-drivers-rs (vendored
  0.5.1 + an iddcx subset): pf-vdisplay (the IddCx display driver), pf-dualsense + pf-xusb (the gamepad
  drivers, folded in by M4), wdk-iddcx (typed IddCx DDI wrappers), wdk-probe (the CI link/surface gate),
  vendor/{wdk-build,wdk-sys}.
2.2 Session resolution, ownership, and seam traits (Goal-1)
The old ~40-knob PUNKTFUNK_* env soup (re-read and recomputed in three places) is replaced by a
resolve-once pipeline: config.rs HostConfig (typed, parsed once) → session_plan.rs SessionPlan
(a Copy plan resolved once per session — CaptureBackend::resolve() picks IddPush | Dda | Wgc,
resolve_topology picks SingleProcess | TwoProcessRelay; this killed the latent capture/encode
backend-disagreement bug) → SessionContext (bundles the ~13 session args + plane receivers, moved into
the stream thread).
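As a rough illustration of the resolve-once idea, the sketch below parses an env-like map once into a typed config and derives a Copy plan from it. The field set, the `from_env`/`resolve` signatures, and the backend precedence (IDD-push, then pure DDA, then WGC) are assumptions made for the example, not the real `config.rs`/`session_plan.rs` code.

```rust
use std::collections::HashMap;

/// Hypothetical, trimmed-down stand-in for the typed host config.
#[derive(Debug, Clone, PartialEq)]
pub struct HostConfig {
    pub idd_push: bool,
    pub no_wgc: bool,
}

impl HostConfig {
    /// Parse once from an env-like map (a map keeps the sketch testable).
    pub fn from_env(env: &HashMap<String, String>) -> Self {
        let flag = |key: &str| env.get(key).map(String::as_str) == Some("1");
        HostConfig {
            idd_push: flag("PUNKTFUNK_IDD_PUSH"),
            no_wgc: flag("PUNKTFUNK_NO_WGC"),
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureBackend {
    IddPush,
    Dda,
    Wgc,
}

impl CaptureBackend {
    /// Assumed precedence for illustration only.
    pub fn resolve(cfg: &HostConfig) -> Self {
        if cfg.idd_push {
            CaptureBackend::IddPush
        } else if cfg.no_wgc {
            CaptureBackend::Dda
        } else {
            CaptureBackend::Wgc
        }
    }
}

/// A Copy plan: resolved once per session, then handed to capture and encode
/// so both sides see the same answer instead of re-reading the environment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SessionPlan {
    pub capture: CaptureBackend,
}

impl SessionPlan {
    pub fn resolve(cfg: &HostConfig) -> Self {
        SessionPlan { capture: CaptureBackend::resolve(cfg) }
    }
}
```

Because the plan is a plain Copy value, capture and encode cannot disagree about the backend unless someone resolves it a second time.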
Ownership is a single OnceLock VirtualDisplayManager (vdisplay/windows/manager.rs) owning a
typed Arc<OwnedHandle> control-device handle (no raw-isize cross-thread smuggle), a refcounted
Idle/Active/Lingering state machine, and the monitor generation; a per-session MonitorLease's Drop
releases the refcount (a stale lease can't tear down a fresh monitor). This deleted a fistful of
CURRENT_MON_GEN/MGR/IDD_* globals and validated on glass at 0 leaked monitors across a reconnect
storm, A/B-equivalent to the shipping host.
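The generation-stamped lease can be sketched as follows. This is a simplified model under stated assumptions: it uses a `Mutex` instead of the real manager's internals, omits the Lingering state, and the `Manager`, `acquire`, and `force_teardown` names are hypothetical.

```rust
use std::sync::{Arc, Mutex};

#[derive(Debug, Default)]
struct State {
    generation: u64, // bumped each time a fresh monitor is created
    refs: u32,       // live leases on the current generation
    active: bool,    // whether a monitor currently exists
}

/// Hypothetical stand-in for the single display manager.
#[derive(Clone, Default)]
pub struct Manager {
    inner: Arc<Mutex<State>>,
}

/// Per-session lease, stamped with the generation it was issued for.
pub struct MonitorLease {
    mgr: Manager,
    generation: u64,
}

impl Manager {
    pub fn acquire(&self) -> MonitorLease {
        let mut s = self.inner.lock().unwrap();
        if !s.active {
            // Idle -> Active: a fresh monitor gets a fresh generation.
            s.active = true;
            s.generation += 1;
            s.refs = 0;
        }
        s.refs += 1;
        MonitorLease { mgr: self.clone(), generation: s.generation }
    }

    /// Simulates a teardown that happens behind a lease's back.
    pub fn force_teardown(&self) {
        let mut s = self.inner.lock().unwrap();
        s.active = false;
        s.refs = 0;
    }

    pub fn is_active(&self) -> bool {
        self.inner.lock().unwrap().active
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = self.mgr.inner.lock().unwrap();
        // A stale lease (older generation) must not touch the fresh monitor.
        if s.generation != self.generation || !s.active {
            return;
        }
        s.refs = s.refs.saturating_sub(1);
        if s.refs == 0 {
            s.active = false;
        }
    }
}
```

Dropping a lease from an earlier generation is a no-op, which is the property the reconnect-storm validation relies on.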
The seam traits (VirtualDisplay/VirtualOutput/VirtualLease, Capturer, Encoder,
AudioCapturer/VirtualMic/InputInjector/PadManager) got two tightenings: the capturer takes the
desired OutputFormat { gpu, hdr } in (killing the capture → encode::windows_resolved_backend()
back-reference), and Encoder::caps() -> EncoderCaps (§2.4) lets the session glue route loss-recovery by
query.
2.3 Capture — IDD-push primary (normal and secure desktop), WGC/DDA fallback, GB1 recovery
IDD-push is the universal primary path. Capture comes straight from the driver's shared keyed-mutex
texture ring (capture/windows/idd_push.rs) — no Desktop Duplication, no win32u reparenting hook. The
host creates the ring; the driver opens it (permissive D:(A;;GA;;;WD) SDDL). The generation-tagged
latest = gen<<40 | seq<<8 | slot stale-ring reject kills the HDR-flip garbage frame; a host-owned
3-slot OUT_RING rotated per frame is the texture-ownership contract that enables pipeline_depth=2
(convert/copy on the 3D engine overlapping NVENC on the ASIC). It captures the secure desktop
(Winlogon/UAC/lock) directly (validated 2026-06-25), so there is no separate secure capturer in the
primary path.
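The packed `latest` token can be illustrated with a small pack/unpack pair. The shifts come from the formula above; the field widths they imply (8-bit slot, 32-bit sequence, 24-bit generation in a `u64`) and the function names are assumptions for this sketch.

```rust
/// Pack generation, sequence and slot as generation<<40 | seq<<8 | slot.
/// The generation is masked to the 24 bits that fit above bit 40.
pub fn pack(generation: u32, seq: u32, slot: u8) -> u64 {
    (((generation & 0x00FF_FFFF) as u64) << 40) | ((seq as u64) << 8) | slot as u64
}

/// Split a packed token back into (generation, seq, slot).
pub fn unpack(latest: u64) -> (u32, u32, u8) {
    ((latest >> 40) as u32, (latest >> 8) as u32, latest as u8)
}

/// Accept a frame only if it was published into the ring generation the
/// reader currently holds; anything else is a stale-ring frame.
pub fn accept(latest: u64, current_generation: u32) -> Option<(u32, u8)> {
    let (generation, seq, slot) = unpack(latest);
    if generation == current_generation {
        Some((seq, slot))
    } else {
        None
    }
}
```

A reader holding generation 3 would drop a token published under generation 2, which is how a frame from the pre-HDR-flip ring gets rejected.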
- Open-time fallback: IddPushCapturer::open waits a bounded ~4 s for a first frame (not just
  DRV_STATUS_OPENED); on attach failure it returns the keepalive back so capture.rs opens DDA on the same
  WinCaptureTarget, never a 20 s black bail (ed58365/f98ab07).
- Mid-session game mode-set recovery (GB1, fixed): the 250 ms poll follows the display's actual
  resolution (win_display::active_resolution, CCD/GDI) and recreates the ring on any descriptor change
  (size or HDR) → the driver re-attaches → frames resume at the game's mode, no reconnect. If a change is
  unrecoverable (e.g. an exclusive flip), a recovering_since clock drops the session after 3 s so the
  client reconnects cleanly. No protocol bump was needed: the host reads the resolution straight from
  Windows (c87bfe0; the driver's publish() width/height guard + flushed log is 789ad49).
- WGC + DDA stay as demoted fallbacks for non-IddCx hardware (wgc.rs/dxgi.rs). The two-process WGC
  secure-desktop relay (wgc_relay.rs) is no longer load-bearing now that IDD-push handles the secure
  desktop; it is kept recoverable but slated for M5/M6 cleanup. (Its constraint analysis is archived in
  archive/windows-secure-desktop.md.)
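The 3 s recovery budget from the GB1 bullet can be modelled as a small state holder driven by the poll. This is a sketch: `Recovery`, `Verdict`, and `on_poll` are hypothetical names, and passing `now` in explicitly is a choice made here to keep the logic testable.

```rust
use std::time::{Duration, Instant};

/// Budget before an unrecoverable mode change drops the session.
pub const RECOVERY_BUDGET: Duration = Duration::from_secs(3);

#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Healthy,
    Recovering,
    DropSession,
}

#[derive(Default)]
pub struct Recovery {
    recovering_since: Option<Instant>,
}

impl Recovery {
    /// Call once per poll tick with whether frames are currently arriving.
    pub fn on_poll(&mut self, frames_flowing: bool, now: Instant) -> Verdict {
        if frames_flowing {
            // Frames resumed: clear the clock.
            self.recovering_since = None;
            return Verdict::Healthy;
        }
        // Start the clock on the first stalled poll, keep it afterwards.
        let since = *self.recovering_since.get_or_insert(now);
        if now.duration_since(since) >= RECOVERY_BUDGET {
            Verdict::DropSession
        } else {
            Verdict::Recovering
        }
    }
}
```

The clock only starts on the first stalled poll and resets as soon as frames flow again, so a brief mode-set glitch does not accumulate toward the drop.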
2.4 Encode — NVENC / AMF / QSV / software; EncoderCaps; HDR
encode/windows/ dispatches per DXGI adapter vendor (open_video): NVENC (NVIDIA, direct SDK,
nvenc.rs — caps-probe-before-configure, bitrate-clamp binary search, true RFI over the DPB, in-band
ST.2086/CLL SEI), AMF/QSV (AMD/Intel via libavcodec, ffmpeg_win.rs — system-readback default,
opt-in zero-copy D3D11; CI-only, no lab hardware), or software H.264 (sw.rs). HDR (10-bit) forces
HEVC Main10 + BT.2020 PQ; the client auto-detects PQ from the VUI. The encoder adapts to a mid-session
size/format/HDR change per frame (tears down + re-inits), so the GB1 capturer's resolution changes are
handled downstream with no API change.
Encoder::caps() -> EncoderCaps { supports_rfi, supports_hdr_metadata } lets the session glue route
loss-recovery by query (only Windows direct-NVENC overrides it; the GameStream loop gates the RFI path on
supports_rfi rather than hard-coding per-backend knowledge into the glue).
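A minimal sketch of capability-by-query routing follows. The `EncoderCaps` fields are from the text; the default-method shape, the two encoder types, and the keyframe fallback are assumptions for illustration rather than the host's actual trait.

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct EncoderCaps {
    pub supports_rfi: bool,
    pub supports_hdr_metadata: bool,
}

pub trait Encoder {
    /// Conservative default: backends claim nothing unless they override.
    fn caps(&self) -> EncoderCaps {
        EncoderCaps::default()
    }
}

/// Hypothetical backend that keeps the default caps.
pub struct SoftwareEncoder;
impl Encoder for SoftwareEncoder {}

/// Hypothetical stand-in for the direct-NVENC backend (values assumed).
pub struct DirectNvenc;
impl Encoder for DirectNvenc {
    fn caps(&self) -> EncoderCaps {
        EncoderCaps { supports_rfi: true, supports_hdr_metadata: true }
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum LossRecovery {
    InvalidateReferenceFrames,
    RequestKeyframe, // assumed fallback when RFI is unavailable
}

/// The glue asks the encoder what it can do instead of matching on its type.
pub fn recover(enc: &dyn Encoder) -> LossRecovery {
    if enc.caps().supports_rfi {
        LossRecovery::InvalidateReferenceFrames
    } else {
        LossRecovery::RequestKeyframe
    }
}
```

Adding a new backend then only means overriding `caps()`; the glue itself stays untouched.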
2.5 Host↔driver ABI & the pf-vdisplay driver
pf-driver-proto is one no_std crate in both build graphs. It owns the frame plane (FrameToken
Global\pfvd-* names), the control plane (a fresh interface GUID, not SudoVDA's e5bcc234; contiguous
0x900 IOCTL ops; a GET_INFO version handshake the host asserts + bails on mismatch), and the gamepad SHM
(XusbShm/PadShm incl. device_type). bytemuck-Pod + size_of and offset_of! asserts make ABI drift a
compile error.
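To show how layout asserts turn ABI drift into a build failure, here is a std-only sketch. The struct layout below is invented for the example (the real `PadShm` fields are not reproduced here), and the bytemuck `Pod` derive the crate uses is left out to keep the block dependency-free.

```rust
use std::mem::{offset_of, size_of};

/// Hypothetical shared-memory header; only the technique is the point.
#[repr(C)]
#[derive(Clone, Copy)]
pub struct PadShm {
    pub magic: u32,
    pub version: u32,
    pub device_type: u8,
    pub _pad: [u8; 3], // explicit padding so the layout has no hidden holes
    pub seq: u32,
}

// Evaluated at compile time: if either side of the ABI changes a field,
// these stop compiling instead of silently corrupting shared memory.
const _: () = assert!(size_of::<PadShm>() == 16);
const _: () = assert!(offset_of!(PadShm, device_type) == 8);
const _: () = assert!(offset_of!(PadShm, seq) == 12);
```

Since both the host crate and the driver workspace compile the same crate, a mismatched edit fails whichever build sees it first.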
The driver (packaging/windows/drivers/pf-vdisplay/src/) is an all-Rust UMDF IddCx driver on
windows-drivers-rs + the iddcx wdk-sys subset; the STEP 0–8 build is the checklist in §6.3, its
internals are the invariants in §3, and it loads self-signed under Secure Boot (FORCE_INTEGRITY cleared
post-link, §6.1). Known gaps: ownership state is still partly process-global with
EvtCleanupCallback on the WDFDEVICE (a deliberate, sound choice — E1 in §4); and
slot-reclaim-on-REMOVE (§4 P1.3).
2.6 Service, packaging, installer
A LocalSystem SCM supervisor (windows/service.rs) retargets its token, launches serve via CreateProcessAsUserW
into the console session (so SendInput reaches both the streamed and the secure desktop), relaunches on
session-change, and kills-on-close via a Job Object — the Sunshine/Apollo model (rationale:
windows-service.md). Shipped as a signed Inno Setup setup.exe
(packaging/windows/, windows-host.yml) that builds + signs all three drivers from source, bundles
them + the FFmpeg DLLs, and delegates to service install. GameStream (Moonlight) is kept, but the
installer/service default to secure serve (GameStream opt-in).
3. Validated invariants — preserve, do not regress
These are expensive empirical wins; keep them intact when touching the code:
- Frame transport: host-creates/driver-opens keyed-mutex ring; generation-tagged stale-ring reject;
  0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the OUT_RING rotation +
  pipeline_depth=2 overlap; repeat_last rotates into a fresh out-ring slot (depth-safe).
- Driver internals: edid.rs (128-byte EDID + CTA-861.3 HDR block, dual checksums); the FP16 HDR recipe
  (CAN_PROCESS_FP16 + the *2 DDIs + gamma/HDR accept-stubs + HIGH_COLOR_SPACE); DEVICE_POOL per
  render-LUID (NVIDIA UMD/VRAM leak fix); target-id stamped on the monitor context; the two swap-chain
  leak fixes (borrow IDXGIDevice across SetDevice retries; check terminate at the loop top).
- Monitor lifecycle: serialized ADD/REMOVE/teardown; restore CCD topology before REMOVE; the
  generation-stamped lease (a stale lease can't tear down a fresh monitor); 0-leak across reconnects.
- HDR color math: hdr.rs (pure, unit-tested, ST.2086 + big-endian SEI); the FP16→P010/Rgb10a2
  converters + hdr_p010_selftest; the cursor decomposition.
- NVENC tuning: caps-probe-before-configure (10-bit→8-bit graceful downgrade); bitrate-clamp binary
  search (each GPU's real ceiling); true RFI over the DPB; CBR / infinite-GOP / P-only / ~1-frame VBV.
- Gamepad recipe: the SwDeviceCreate identity (enumerator with no _; mandatory completion callback;
  synthesized DS5 compat-ids; non-null per-pad ContainerId); one pf_dualsense serving DualSense+DS4 via
  a device_type byte; XUSB declining WAIT_*; per-pad index via pszDeviceLocation.
- Session glue: the trait seam + RAII keepalive teardown; host-lifetime shared services + per-session
  gamepads; the encode|send split + microburst pacing; build_pipeline_with_retry permanent-vs-transient
  classification; the GameStream VideoPacketizer (GF8 Cauchy, Moonlight byte-exact); the pairing/trust
  handshake.
- Core discipline: no async on the per-frame path; pf-driver-proto is the single ABI source (drift =
  compile error); the version handshake the host asserts.
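The bitrate-clamp binary search named under NVENC tuning can be sketched generically. The `accepts` closure is a hypothetical stand-in for "try to configure the encoder at this bitrate", and the sketch assumes acceptance is monotonic (everything at or below the ceiling is accepted, everything above is rejected).

```rust
/// Find the highest bitrate in [lo, hi] that `accepts` still allows.
/// Returns None when the range is empty or even `lo` is rejected.
pub fn max_accepted_bitrate(
    lo: u32,
    hi: u32,
    mut accepts: impl FnMut(u32) -> bool,
) -> Option<u32> {
    if lo > hi || !accepts(lo) {
        return None;
    }
    if accepts(hi) {
        return Some(hi);
    }
    // Invariant: `good` is accepted, `bad` is rejected, good < bad.
    let (mut good, mut bad) = (lo, hi);
    while bad - good > 1 {
        let mid = good + (bad - good) / 2;
        if accepts(mid) {
            good = mid;
        } else {
            bad = mid;
        }
    }
    Some(good)
}
```

Each probe halves the interval, so even a wide range needs only a few dozen configure attempts to find a GPU's real ceiling.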
4. Open work / next tasks (prioritized)
P1 — ship-readiness / correctness
1. Goal-1 → main merge: ✅ DONE. The windows-host-goal1 branch is merged (tip 3e7c9bd); the full
   Windows CI matrix (incl. the amf-qsv encode path that local checks skip) runs on push.
2. IDD-push default: ✅ resolved via host.env. The shipped default host.env sets PUNKTFUNK_IDD_PUSH=1,
   so a fresh install runs the validated IDD-push path (with the WGC/DDA fallback in place). The bare
   in-code default (config.rs) is still false (the dev / non-pf-driver default); flipping it to follow
   the deployed default is an optional tidy.
3. pf-vdisplay slot reclaim on REMOVE (driver robustness): 🟡 fix landed, on-glass validation pending.
   Sustained ADD/REMOVE churn wedged the driver (ADD → 0x80070490 ERROR_NOT_FOUND) because the monitor
   id (EDID serial / ConnectorIndex / container GUID) was a monotonic NEXT_ID, never reclaimed → IddCx
   accumulated a new OS target slot per cycle until exhaustion. monitor.rs now allocates the lowest free
   id (alloc_monitor_id), reused on REMOVE, so a fresh ADD reuses the departed monitor's target slot
   instead of orphaning it. CI-compile-gated; the wedge only reproduces under sustained churn on the RTX
   box, so this needs an on-glass reconnect-storm A/B to confirm (the box is ephemeral). Keep
   packaging/windows/reset-pf-vdisplay.ps1 as the recovery until validated.
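The lowest-free-id policy from item 3 can be sketched with an ordered set. This is an illustration only: the `MonitorIds` type, its method names, and starting ids at 0 are assumptions, not the driver's `alloc_monitor_id`.

```rust
use std::collections::BTreeSet;

/// Tracks which monitor ids are currently in use.
#[derive(Default)]
pub struct MonitorIds {
    in_use: BTreeSet<u32>,
}

impl MonitorIds {
    /// Hand out the lowest id that is not in use.
    pub fn alloc(&mut self) -> u32 {
        let mut id = 0;
        // BTreeSet iterates in ascending order, so the first gap is the answer.
        for &used in &self.in_use {
            if used != id {
                break;
            }
            id += 1;
        }
        self.in_use.insert(id);
        id
    }

    /// Return an id on REMOVE so the next ADD can reuse it.
    pub fn release(&mut self, id: u32) -> bool {
        self.in_use.remove(&id)
    }
}
```

Under ADD/REMOVE churn the set of ids stays bounded by the number of concurrently live monitors, whereas a monotonic counter grows by one per cycle.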
P2 — hygiene / architecture completion
4. D1-host — host-crate P0 lints — deferred (low value / high churn). A crate-wide
#![deny(unsafe_op_in_unsafe_fn)] produced 100+ FFI-wrap sites across the Linux modules; it wraps
unsafe (discipline) rather than reducing it and doesn't improve stability, so it was deprioritized vs
the OwnedHandle/RAII reductions (which are complete — idd_push.rs, service.rs, the three
gamepad backends via a shared gamepad_raii.rs, the SCM STOP/SESSION events as OnceLock<OwnedHandle>,
the hot-loop KeyedMutexGuard, and the driver's pod_init!; all box-validated, clean sc stop in
~1 s). The driver already has the deny. Revisit D1-host as a final discipline pass (staged per-module)
if desired.
5. M6 scaffolding cleanup — delete the bring-up diagnostics (spawn_observer/DebugBlock in
idd_push.rs) and, once full parity is proven on glass, the host monoliths.
Explicitly NOT doing (stability decision): E1 — driver DeviceContext ownership + per-IDDCX_MONITOR
EvtCleanupCallback. The current process-global design is sound: IddCx DDIs receive only an
IDDCX_MONITOR handle (never the WDFDEVICE/context), and ProcessSharingDisabled makes one devnode = one
host process that dies with the device. A "device-owned" variant would add a use-after-free window (the
watchdog races device cleanup) for no gain, and the per-monitor cleanup callback isn't reliably reachable
on this UMDF/IddCx stack. Cleanup is already deterministic (WDFDEVICE EvtCleanupCallback +
cleanup_for_device_removal + the host-gone watchdog). Revisit only if max_concurrent>1 on Windows is
actually needed. (monitor.rs documents this rationale at the MONITOR_MODES static.)
P3 — larger, mostly hardware-gated
6. M4 — gamepad-driver unification — ✅ substantially DONE (92e6802). pf-dualsense (DualSense /
DualShock 4) and pf-xusb (Xbox 360 / XInput) now live in the unified packaging/windows/drivers/
workspace and build from source per release against the vendored wdk-sys, exactly like pf-vdisplay;
build-gamepad-drivers.ps1 signs them with the shared cert. Remaining: point the driver side at
pf_driver_proto::gamepad::{PadShm,XusbShm} (the host side already does — the device_type-at-offset
hand-duplication is the last ABI-drift hazard), add WDF device contexts for true multi-pad, and confirm
the source build matches the prior shipped binaries.
7. M5 — reshape WGC/DDA + GameStream onto session/pipeline, then delete the old relay/monoliths.
AMF/QSV stays CI-only (no lab hardware).
8. On-glass behavioral validation of the committed-but-unexercised fixes: the watchdog reaping on
host-kill, SET_RENDER_ADAPTER on a hybrid box (the lab box is single-dGPU), the IDD-push→DDA
fallback trigger, HDR-ring sizing + out-ring repeat under real HDR/static-desktop pipelining, and the
AMF/QSV encode path on real AMD/Intel hardware.
5. Operations
5.1 RTX box on-glass recipe
The persistent on-glass validator is the RTX box (ssh "Enrico Bühler"@<ip>, ENRICOS-DESKTOP, RTX
4090, PS shell). The IP FLOATS (DHCP; boots to Proxmox on reboot → ephemeral, unreachable after a
reboot; recently .173/.158 — confirm current first; never reboot it, never depend on it surviving).
It has WDK 26100 + LLVM 21.1.2 + the Rust toolchain; build clone at C:\Users\Public\pf-rewrite (the
user's active driver-dev tree — don't clobber uncommitted WIP; use a worktree). Username has a ü →
quote it; it only breaks SDL3/client builds, not the host. To validate a host branch: worktree-checkout,
build with CARGO_TARGET_DIR=C:\t-goal1, then stop the PunktfunkHost service, back up the binary +
%ProgramData%\punktfunk\host.env, copy your build in, restart, drive punktfunk-probe.exe loopback,
then restore + git worktree remove. Drive over ssh via powershell -EncodedCommand <base64 UTF-16LE>
(plain quoting mangles; prefer Write-Output/file-redirect for clean output). Driver redeploy:
packaging/windows/redeploy-pf-vdisplay.ps1; ghost-monitor recovery: reset-pf-vdisplay.ps1.
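For the `-EncodedCommand` transport mentioned above, the payload is the script text encoded as UTF-16LE and then Base64. The sketch below produces that payload with the standard library only; the function names are hypothetical, and in practice a Base64 crate or a shell one-liner would do the same job.

```rust
const B64: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Standard Base64 with '=' padding.
pub fn base64(data: &[u8]) -> String {
    let mut out = String::with_capacity((data.len() + 2) / 3 * 4);
    for chunk in data.chunks(3) {
        // Missing trailing bytes are treated as zero and masked by padding.
        let b1 = *chunk.get(1).unwrap_or(&0);
        let b2 = *chunk.get(2).unwrap_or(&0);
        let n = ((chunk[0] as u32) << 16) | ((b1 as u32) << 8) | (b2 as u32);
        out.push(B64[((n >> 18) & 63) as usize] as char);
        out.push(B64[((n >> 12) & 63) as usize] as char);
        out.push(if chunk.len() > 1 { B64[((n >> 6) & 63) as usize] as char } else { '=' });
        out.push(if chunk.len() > 2 { B64[(n & 63) as usize] as char } else { '=' });
    }
    out
}

/// Encode a PowerShell script as Base64 over its UTF-16LE bytes.
pub fn encoded_command(script: &str) -> String {
    let utf16le: Vec<u8> = script
        .encode_utf16()
        .flat_map(|unit| unit.to_le_bytes())
        .collect();
    base64(&utf16le)
}
```

The result is passed as the single argument after `powershell -EncodedCommand`, which sidesteps the nested-quoting problems of sending script text through ssh.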
5.2 CI / validation
The persistent build validator is the windows-amd64 CI runner (no GPU — fine for builds / iddcx
link / /INTEGRITYCHECK self-sign / the surface-asserts; live NVENC encode + on-glass defers to the RTX
box). Workflows: windows-host.yml (the host installer), windows-drivers.yml (the driver workspace
build + FORCE_INTEGRITY clear), windows-drivers-provision.yml (WDK/LLVM toolchain), windows-msix.yml
(the client). A single Windows runner serializes the whole fleet; a Cargo.toml touch costs ~25 min of
queue, so driver pushes that avoid Cargo.toml skip the fleet serialization.
Local pre-push checks (this Linux box can't compile the Windows paths):
cargo test -p pf-driver-proto # the ABI crate (cross-platform)
cargo check -p punktfunk-host # Linux paths; win_* mods are #[cfg(windows)]
cargo clippy -p punktfunk-host --all-targets -- -D warnings
# Windows host clippy (on the box): PUNKTFUNK_NVENC_LIB_DIR=C:\t\nvenc;
# cargo clippy -p punktfunk-host --features nvenc --target x86_64-pc-windows-msvc -- -D warnings
# Driver build (on the box): cd packaging/windows/drivers; Version_Number=10.0.26100.0;
# LIBCLANG_PATH='C:\Program Files\LLVM\bin'; cargo build
Note: a pre-existing rustfmt-version drift exists in some Windows-only files (this box's rustfmt 1.9.0
wraps offset_of!/unsafe fn differently than the runner's) — don't reformat unrelated files to chase it.
5.3 Env knobs (Windows host)
PUNKTFUNK_IDD_PUSH=1 (capture from the driver ring; shipped host.env default on, in-code default off),
PUNKTFUNK_ENCODER=auto|nvenc (auto → vendor-detect), PUNKTFUNK_10BIT=1 + PUNKTFUNK_HDR_SHADER_P010=1
(HDR), PUNKTFUNK_SECURE_DDA=1, PUNKTFUNK_NO_WGC=1 (pure DDA), PUNKTFUNK_ZEROCOPY=1,
PUNKTFUNK_MONITOR_LINGER_MS, PFVD_DEBUG_LOG=1 (driver file log — release builds are silent without it).
Config lives in %ProgramData%\punktfunk\host.env; logs in %ProgramData%\punktfunk\logs\host.log.
5.4 Build / deploy / packaging
x64-only by design (no ARM64 NVIDIA driver). The installer is the thin-.iss / fat-binary model
delegating to service install; tag host-win-vX.Y.Z. The drivers are built + FORCE_INTEGRITY-cleared +
signed + Inf2Cat'd in CI from source. DriverVer must bump on any driver change; create the ROOT devnode
via nefcon (devgen is forbidden).
6. Reference (hard-won — keep)
6.1 The /INTEGRITYCHECK answer
wdk-build emits cargo::rustc-cdylib-link-arg=/INTEGRITYCHECK unconditionally (no cfg/env/Config
opt-out), so a self-signed driver can't load (CodeIntegrity 3004/3089). The fix: a deterministic,
idempotent post-link step packaging/windows/clear-force-integrity.ps1 clears the PE FORCE_INTEGRITY bit
(0x0080 @ e_lfanew+0x5e) + verifies (CI-proven 0x01E0 → 0x0160), before signing. Packaging order:
cargo build → clear-force-integrity → sign .dll → Inf2Cat → sign .cat. (A public build would use
real attestation signing, which satisfies /INTEGRITYCHECK legitimately.)
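The bit clear itself is small enough to sketch on an in-memory image. This is not the contents of clear-force-integrity.ps1; it is a Rust illustration of the same operation, using the offset and flag value given above plus the standard position of `e_lfanew` at 0x3C in the DOS header. The function name and error strings are hypothetical.

```rust
/// IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY
pub const FORCE_INTEGRITY: u16 = 0x0080;

/// Clear FORCE_INTEGRITY in a PE image held in memory.
/// Returns (before, after) DllCharacteristics so the caller can verify.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), &'static str> {
    if image.len() < 0x40 || !image.starts_with(b"MZ") {
        return Err("not an MZ image");
    }
    // e_lfanew (offset of the PE header) lives at 0x3C in the DOS header.
    let e_lfanew =
        u32::from_le_bytes([image[0x3c], image[0x3d], image[0x3e], image[0x3f]]) as usize;
    // DllCharacteristics sits at e_lfanew + 0x5E.
    let off = e_lfanew.checked_add(0x5e).ok_or("offset overflow")?;
    let end = off.checked_add(2).ok_or("offset overflow")?;
    if end > image.len() {
        return Err("truncated image");
    }
    if !image[e_lfanew..].starts_with(b"PE\0\0") {
        return Err("missing PE signature");
    }
    let before = u16::from_le_bytes([image[off], image[off + 1]]);
    let after = before & !FORCE_INTEGRITY;
    image[off..end].copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```

Running it twice is harmless: the second pass reads a value with the bit already clear and writes the same value back, matching the idempotence the packaging step relies on.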
6.2 The iddcx binding on wdk-sys (the make-or-break — proven, the 6 bindgen knobs)
IddCx DDIs are function-table dispatched (IddFunctions[] indexed by _IDDFUNCENUM::<Name>TableIndex,
IddDriverGlobals implicit arg 1) — the same model wdk-sys already implements for WDF. The vendored
windows-drivers-rs 0.5.1 (packaging/windows/drivers/vendor/, [patch.crates-io]'d) gets a first-class
ApiSubset::Iddcx that bindgens iddcx/1.10/IddCx.h reusing the identical wdk_default(config) baseline
(so WDF/DXGI types resolve to, not redefine, wdk-sys's — type-identity by construction). The six
knobs generate_iddcx needed (each a real gotcha, all CI-proven):
1. --language=c++: wdk_default parses C; IddCx.h's IDARG_* typedefs need C++ (else a "must use
   'struct' tag" cascade).
2. -DIDD_STUB: table-dispatch mode; skips IddCxFuncEnum.h's #error IDDCX_VERSION_MAJOR not defined. Do
   NOT add WDF_STUB (would desync the shared WDF type-identity).
3. allowlist_recursively(false) + allowlist_file("(?i).*iddcx.*"), full codegen (no .complement()):
   emit ONLY IddCx items; WDF/Win types resolve via use crate::types::*.
4. allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE"): emit the non-WDF types
   wdk-sys doesn't bindgen, locally. The _? is load-bearing (typedef struct _OPM_X {} OPM_X needs the
   tag AND the alias).
5. pub type UINT = ::core::ffi::c_uint; in src/iddcx.rs: UINT is absent from crate::types.
6. translate_enum_integer_types(true): emit native u32 reprs for the DXGI/OPM ModuleConsts enums
   (nested modules can't see a parent UINT).
Wrapper note: table dispatch via _IDDFUNCENUM::<Name>TableIndex as usize (the ModuleConsts const, not
a NewType .0); NTSTATUS is plain i32 (wdk_sys::NT_SUCCESS). The driver build.rs adds the IddCxStub
link-search (the import lib is under iddcx\1.0\ even though headers are 1.10) + #[no_mangle] pub static IddMinimumVersionRequired: ULONG = 4. The versioned IDD_STRUCTURE_SIZE! path is dropped — the WDK links
the iddcx 1.0 stub (lacks the version table); we target 1.10 vs a current framework, so size_of is
exactly correct.
6.3 Driver port checklist (STEP 0–8, as landed)
0. workspace pf-vdisplay (cdylib) + wdk-iddcx; prove std::thread + OwnedHandle link under UMDF (done).
1. wdk-iddcx: 11 typed DDI wrappers via one dispatch macro + re-export the inbound PFN_* types.
2. DriverEntry + IDD_CX_CLIENT_CONFIG (15 callbacks) + DeviceInitConfig + WdfDeviceCreate +
   CreateDeviceInterface (the owned pf GUID) + DeviceInitialize; edid.rs salvaged verbatim.
3. DeviceContext + WDF_DECLARE_CONTEXT_TYPE blob; init_adapter in D0Entry (caps + FP16) →
   AdapterInitAsync; the *2 mode DDIs + query_target_info + gamma/HDR accept-stubs. (Box gate: loads
   under Secure Boot, enumerates as an IddCx adapter, Status OK.)
4. control plane (GET_INFO version handshake the host asserts, ADD/REMOVE/SET_RENDER_ADAPTER/PING/
   CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + mode bounds; host switched to
   pf_driver_proto.
5. Direct3DDevice + assign/unassign + SwapChainProcessor (worker, SetDevice 60×@50 ms single-borrow
   retry, top-of-loop terminate, ReleaseAndAcquireBuffer2, from_raw_borrowed).
6. FramePublisher on pf_driver_proto::frame + keyed-mutex RAII guard; wire into run_core. (Box: full
   IDD-push glass-to-glass + the secure-desktop gate, validated 2026-06-25.)
7. HDR / FP16 ring (validated: Mac connects WITH HDR).
8. its own .inx + an unsafe-reduction pass (deny(unsafe_op_in_unsafe_fn), per-site // SAFETY:).
Remaining driver work beyond STEP 8: E1 (DeviceContext-owned state + per-IDDCX_MONITOR
EvtCleanupCallback → unblock max_concurrent>1 — see §4 for why it's deliberately deferred), the
slot-reclaim-on-REMOVE fix (§4 P1.3), and folding the gamepad-driver side onto pf_driver_proto (M4 tail,
§4 P3).
6.4 Resolved product decisions (the five forks)
A the host was refactored in place (staged, behavior-preserving), not greenfield-rebuilt — the
driver was rebuilt fresh. B IDD-push primary for everything incl. the secure desktop (validated);
WGC+DDA demoted to non-IddCx fallbacks. C all drivers on microsoft/windows-drivers-rs (+ the iddcx
subset; /INTEGRITYCHECK solved) — done for pf-vdisplay and now for the gamepad drivers (M4, 92e6802).
D keep GameStream (Moonlight), default to secure serve. E concurrent sessions: the host-side
preempt dance was removed by the ownership-model work, but true max_concurrent>1 on Windows stays blocked
on the E1 driver swap-chain-reuse work (deliberately deferred, §4). Rejected: DeviceContext-per-monitor
ownership — see the E1 stability decision in §4 (it would add a use-after-free window for no gain under
ProcessSharingDisabled).
Origins & design rationale (from the original plan)
This folds in the durable rationale from the original Windows host + client plan
(windows-host.md, now a stub; full original text in git history). The Windows host
began (2026-06-10 to 2026-06-14) as an "add backends behind the existing traits" job, not a parallel
port — punktfunk-core and the whole control plane are platform-agnostic, and the host already compiled
on non-Linux (macOS) thanks to existing cfg(target_os) gating. These framing decisions shaped what
shipped and still explain why the code is the way it is:
- Build order: host-first. A user preference (the research had recommended client-first, since the
  client is unblocked by the no-GPU problem and becomes the host's test endpoint). The trade-off held:
  the GPU-gated steps were the only ones that stalled GPU-less.
- Trait-based abstraction → ~95% reuse. punktfunk-core (protocol/FEC/crypto/session/transport/QUIC/C
  ABI), the GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet), the management REST API +
  native_pairing/discovery, and the punktfunk1/spike/pipeline orchestration all carried over unchanged;
  only the OS-touching backends behind
  Capturer/Encoder/VirtualDisplay/InputInjector/AudioCapturer/VirtualMic are new #[cfg(windows)] code.
  Getting to MSVC needed only ~3 cfg-gates (gate the std::os::fd / OwnedFd unix-isms in main.rs /
  vdisplay.rs).
- The no-GPU dev strategy. Most of the port was built + validated on a GPU-less Windows VM: the MSVC
  compile, the virtual-display control path (WARP), the openh264 software-encode pipeline (full
  capture→encode→FEC→UDP transport minus HW), SendInput injection + interactive-session/desktop-reattach,
  gamepad + rumble, and the entire client (software-decode loopback). Only NVENC-D3D11 zero-copy, the
  DDA-vs-WGC bake-off, split-encode/bitrate-ceiling, and all glass-to-glass numbers deferred to a real
  NVIDIA box (no perf claim transfers from Linux).
- Windows-specific structural issues (no Linux precedent). These are the gotchas that drove the
  service + capture design and remain true:
  - Interactive session, not a Session-0 service. SendInput can't reach the desktop from Session 0;
    Desktop Duplication / capture need the interactive session. Hence the SYSTEM-in-interactive-session
    supervisor (§2.6, windows-service.md) and the OpenInputDesktop/SetThreadDesktop re-attach to survive
    UAC/lock desktop switches.
  - Clock epoch. The skew handshake assumes both ends read the same realtime epoch in ns. The Windows
    host must emit timestamps from GetSystemTimePreciseAsFileTime → Unix-epoch-ns, or cross-machine
    latency ClockProbe/ClockEcho break (std SystemTime on Windows is historically coarser).
  - No audio endpoint on a headless IDD. WASAPI loopback needs a real/virtual render device; the
    virtual mic (client→host) has no clean user-mode path (deferred).
  - Color/range. All clients assume BT.709 limited-range; the BGRA→I420/NV12 path must match or colors
    wash out (validated against the existing decoders).
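The clock-epoch requirement boils down to one conversion. A FILETIME counts 100 ns ticks since 1601-01-01 UTC, so shifting by the 1601→1970 offset and scaling by 100 gives Unix-epoch nanoseconds. The sketch below does only the arithmetic on the two FILETIME halves; the function name is hypothetical and the actual Win32 call is left out so the block stays platform-independent.

```rust
/// 100 ns ticks between 1601-01-01 and 1970-01-01 (UTC).
pub const FILETIME_UNIX_EPOCH: u64 = 116_444_736_000_000_000;

/// Convert a FILETIME (dwHighDateTime, dwLowDateTime) to Unix-epoch ns.
/// Returns None for times before 1970 or values that overflow u64 ns.
pub fn filetime_to_unix_ns(high: u32, low: u32) -> Option<u64> {
    let ticks = ((high as u64) << 32) | low as u64;
    ticks.checked_sub(FILETIME_UNIX_EPOCH)?.checked_mul(100)
}
```

Both ends of the skew handshake then compare values on the same epoch and in the same unit, regardless of which OS produced them.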
SudoVDA → pf-vdisplay evolution. The original plan was built around SudoVDA, an off-the-shelf
indirect display driver (the same IDD Apollo ships) — chosen to avoid writing/WHQL-signing a driver and to
get arbitrary WxH@Hz modes on the fly. It carried the host all the way to live-validated NVENC on a real
RTX 4090. It was then replaced by the all-Rust pf-vdisplay IddCx driver (which solved
/INTEGRITYCHECK self-signing, §6.1, and gave us the IDD-push zero-copy capture path that captures the
secure desktop directly) and deleted in commit 84a3b95 — pf-vdisplay is now the sole
virtual-display backend. The full SudoVDA control protocol (IOCTL layout, watchdog keepalive, GDI-name
resolution) lives in git history if ever needed as a reference.