Record the full driver port plan from the iddcx-driver-port-map workflow: the 11
DDIs to wrap, the 15 IDD_CX_CLIENT_CONFIG callbacks, the DeviceContext-owned state
model (single Monitor identity + monitor EvtCleanupCallback RAII), the
pf-vdisplay-proto frame transport, and the 8-step CI/box-gated checklist. Fold in
the adversarial critique: secure-desktop is a BLOCKING gate (do not retire the WGC
relay until proven), define the recreate/concurrency/Reconfigure failure branches,
host<->driver protocol_version lockstep. De-risk status: the full IddCx symbol
surface + .Size machinery is CI-proven present (ae803b2).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Windows Host Rewrite — Design & Plan
Status: proposed (2026-06-24). This plan takes the current, hard-won Windows host (pf-vdisplay
all-Rust IddCx driver + IDD-push zero-copy capture, live-validated 5120×1440@240 HDR on the RTX box)
as a knowledge base and re-derives a clean, stable, well-layered architecture from it. It drops all
SudoVDA back-compat (we own both ends now) and drives unsafe to a contained minimum.
It supersedes the stale conclusion in docs/windows-virtual-display-rust-port.md ("IDD-push not
viable") — that verdict was written in the same commit (e2c9bfd) that shipped the working
922-line consumer + 424-line producer. IDD-push works and is the architecture. The breakthrough the
prose never recorded: once the CCD topology makes the virtual display the sole composited desktop in the
console session, DWM composites to it and the IddCx swap-chain is assigned
(run_core: FIRST FRAME acquired — DWM IS compositing the virtual display!). Per the owner, IDD-push
also captures the secure desktop (Winlogon / UAC / lock) — so it is the universal primary path, not
just the normal-desktop path.
Decisions resolved (2026-06-24)
| Decision | Options | Chosen |
|---|---|---|
| A. Execution | greenfield vs staged | Greenfield rewrite — rebuild the Windows host fresh against the clean architecture, salvaging the validated "jewels" (§1) verbatim. (Risk acknowledged: no CI for the Windows paths — mitigated by the §1 preservation checklist + on-glass gates, §10.) |
| B. Capture surface | IDD-only / IDD+secure-DDA / keep fallbacks | IDD-push primary for everything (incl. the secure desktop); keep WGC + DDA as fallbacks. |
| C. Driver binding stack | wdf-umdf vs windows-drivers-rs | Extend microsoft/windows-drivers-rs with an iddcx subset; unify all three drivers on it; solve /INTEGRITYCHECK properly (§6). |
| D. GameStream on Windows | keep / keep-secure-default / drop | Keep Moonlight compat; flip the installer/service default to secure serve (GameStream an explicit opt-in). |
0. Goals (from the brief)
- Clean, stable, well-layered architecture. Decompose the god-files, give every subsystem one owner, and replace the ~40-knob PUNKTFUNK_* env soup with a typed config resolved once per session.
- Drop every trace of SudoVDA back-compat. We own the driver (pf-vdisplay) and the host. The byte-identical IOCTL ABI, the reused {e5bcc234} GUID, the sudovda module name, the "SudoVDA ignores this" conditionals: all pure liability now.
- Minimize unsafe. ~480 unsafe occurrences across the Windows surface; the large majority are FFI-mechanical (windows-rs/NVENC/WDK already return Result). Target: host ~144→~35, drivers ~227→~60, with the irreducible floor contained in 3–4 named modules under deny(unsafe_op_in_unsafe_fn).
Non-goals / invariants (do not regress)
- Linux host behavior is out of scope and must not change. The host crate is shared; Linux is validated across KWin/gamescope/Mutter/Sway. Touch only the seams.
- punktfunk-core stays the one linked core. Protocol/FEC/crypto/QUIC live there behind the C ABI; the host is a leaf binary. No protocol changes here.
- No async on the per-frame path. Native threads only (the existing discipline).
1. What we KEEP (validated, load-bearing — port, don't rewrite)
These are expensive empirical wins. The rewrite relocates/wraps them but must preserve behavior byte-for-byte:
- The IDD-push frame transport shape: host-creates / driver-opens shared keyed-mutex texture ring with the permissive D:(A;;GA;;;WD) SDDL (forced by the restricted WUDFHost token, mirrors the gamepad drivers); the generation-tagged latest = gen<<40 | seq<<8 | slot stale-ring reject (kills the HDR-flip garbage frame); 0 ms try-acquire / drop-on-full publish (never block the swap-chain thread); the host output ring OUT_RING + pipeline_depth=2 overlap of convert/copy vs NVENC.
- The IddCx driver internals that earned their keep: edid.rs in full (128-byte EDID + CTA-861.3 HDR block, serial-as-index round-trip, dual checksums); the HDR enablement recipe (CAN_PROCESS_FP16 + the *2 mode DDIs + set_gamma_ramp/set_default_hdr_metadata accept-stubs + HIGH_COLOR_SPACE + 8|10 bpc); DEVICE_POOL one-device-per-render-LUID (the NVIDIA UMD-thread/VRAM leak fix); stamping the OS target id onto the monitor context (the recreated-monitor target_id=0 fix); the swap-chain processor's two real leak fixes (borrow IDXGIDevice across SetDevice retries; check terminate at the loop top during a frame burst).
- The monitor-lifecycle concurrency correctness: serialized ADD/REMOVE/teardown, the documented lock order, the watchdog CAS + re-check-under-lock, the creation grace window, the generation-stamped lease (a stale lease can't tear down a fresh monitor). Structure can change; these properties must survive.
- The CCD topology fixes: isolate_displays_ccd (the iGPU-attached-monitor hybrid-box correctness; the SDC_FORCE_MODE_ENUMERATION re-commit that drives COMMIT_MODES → ASSIGN_SWAPCHAIN); restore topology before REMOVE.
- The HDR color math: hdr.rs verbatim (pure, unit-tested, ST.2086 G/B/R + big-endian SEI); HdrConverter/HdrP010Converter + the f64 p010_reference + hdr_p010_selftest; VideoConverter (RGB→NV12/P010 on the video engine, a measured latency win); the cursor decomposition (convert_pointer_shape color/masked/monochrome edge cases).
- NVENC tuning: caps-probe-before-configure (disambiguate unsupported-config vs too-high-bitrate; 10-bit→8-bit graceful downgrade); the bitrate-clamp binary search (finds each GPU's real ceiling); true RFI over the DPB; the low-latency configs (CBR, infinite GOP, P-only, ~1-frame VBV).
- The gamepad driver wins: the SwDeviceCreate identity recipe (enumerator with no _; mandatory completion callback; synthesized USB\VID_054C&PID_0CE6 compat-ids for native-DS5 detection; the non-null per-pad ContainerId dodging the xinput1_4 slot-skip); one pf_dualsense serving DualSense+DS4 via a device_type byte; XUSB declining WAIT_* to force synchronous GET_STATE; the static HID descriptors/feature blobs; per-pad index via pszDeviceLocation.
- The session-glue patterns: the Capturer/VirtualDisplay/Encoder trait seam + RAII keepalive teardown; host-lifetime shared services (InjectorService/MicService/AudioCapSlot) with per-session gamepads; the encode|send thread split + microburst pacing; build_pipeline_with_retry + permanent-vs-transient classification; the control-task select! + adaptive-FEC; the GameStream VideoPacketizer (GF8 Cauchy, Moonlight byte-exact); the pairing/trust handshake.
- The SCM supervisor model: Session-0 LocalSystem supervisor → token-retarget → CreateProcessAsUserW serve into the console session, relaunch-on-session-change, kill-on-close Job Object; the file-append log-mask; the two-tier logging init.
- Build/CI wins: the wdf-umdf-sys build.rs SDK-version resolution (picks the SDK version that actually contains iddcx, not the max base SDK); the ARM64 cross-compile off the x64 runner; the thin-.iss / fat-binary installer delegating to service install.
2. Target architecture
2.1 Crate & workspace strategy
Keep ONE shared crates/punktfunk-host crate (do not split punktfunk-host-windows). The host is
a leaf binary consumed by nobody; the "one core, linked everywhere" invariant is already satisfied by
punktfunk-core. A split would only fork the genuinely-shared session glue, traits, and hdr.rs. The
cfg-sprawl win comes instead from confining all Windows code under one src/windows/ subtree behind a
single #[cfg(windows)] mod windows; seam, with backend impls next to their trait's dispatch point.
Pull the three drivers into ONE in-tree driver workspace (packaging/windows/drivers/) on a single
binding stack, one rust-toolchain.toml, one signing recipe, one CI build. Today they are 2–3 disjoint
cargo packages on two incompatible WDK stacks (see §6).
Add ONE shared no_std ABI crate (crates/pf-vdisplay-proto, name TBD) consumed by both the host
crate and the driver workspace. It owns every cross-process binary contract that is currently
hand-duplicated with "must match" comments. This is the single highest-value correctness change (§4.1).
2.2 Target file tree (host crate)
crates/punktfunk-host/src/
main.rs clap-derive subcommand dispatch only (kills parse_serve/parse_spike/hand --help)
config.rs HostConfig (typed; parsed ONCE from host.env/env/flags) + config_dir
session/
mod.rs SessionFactory, SessionPlan, SessionContext, Session (the ONLY teardown path)
server.rs QUIC accept loop, handshake, shared-service wiring
serve_session.rs resolve_* → Welcome/Start → spawn → RAII teardown
control.rs mid-stream renegotiation select! loop
pipeline.rs REAL shared encode|send split, send_loop, FrameMsg, pacing (used by native AND GameStream)
capture.rs Capturer trait + CapturedFrame/PixelFormat/FramePayload (platform-neutral)
capture/linux.rs
capture/windows/ mod.rs (dispatch), idd_push.rs, dda.rs, wgc.rs, secure_desktop.rs*
vdisplay.rs VirtualDisplay/VirtualOutput trait + open() dispatch (neutral)
vdisplay/{kwin,gamescope,mutter,wlroots}.rs
vdisplay/windows.rs was sudovda.rs → PfVirtualDisplay + VirtualDisplayManager
encode.rs Encoder trait, EncodedFrame, validate_dimensions, open_encoder dispatch
encode/{linux,vaapi,sw}.rs
encode/windows/ mod.rs (dispatch), nvenc.rs, nvenc_sys.rs, ffmpeg_win/{mod,system,zerocopy,d3d11va_ffi}.rs
hdr.rs PRESERVE VERBATIM
inject.rs / inject/linux/* / inject/windows/{mod,sendinput,pad_manager,xusb,dualsense,dualshock4,swdevice,section}.rs
inject/proto/{dualsense,dualshock4}.rs shared pure codecs (PRESERVE)
audio.rs / audio/linux.rs / audio/windows/{mod,wasapi_cap,wasapi_mic}.rs
windows/ mod.rs, d3d/{mod,texture,ring,convert}.rs, color/{hdr,p010,video_proc}.rs,
cursor.rs, display_ccd.rs, adapter.rs, process.rs (Token/Event/Job/Child/spawn_as_user),
service.rs (SCM; uses process.rs), win32u_hook.rs*, gpu_priority.rs
session_tuning.rs (PRESERVE) / pwinit.rs / discovery.rs / mgmt.rs / native_pairing.rs / library.rs
gamestream/ unchanged module set; stream.rs slims by reusing session/pipeline.rs
* = survives only per the secure-desktop / WGC product decisions (§5, §11).
2.3 The seam traits (keep the shape; tighten 3 things)
trait VirtualDisplay: Send {
fn name(&self) -> &str;
fn create(&self, mode: Mode) -> Result<VirtualOutput>;
fn set_launch_command(&self, cmd: Option<String>); // per-instance, not a global env var
}
struct VirtualOutput {
node_id: u32,
preferred_mode: Mode,
#[cfg(windows)] win_capture: WinCaptureTarget, // target_id + adapter_luid + monitor_gen (carried, not ambient)
keepalive: Box<dyn VirtualLease>,
}
trait VirtualLease: Send { // Drop = release; replaces the sudovda free-fns + CURRENT_MON_GEN reach-in
fn set_hdr(&self, on: bool) -> Result<()>;
fn hdr_enabled(&self) -> bool;
fn await_released(&self, timeout: Duration) -> bool;
}
trait Capturer: Send {
fn next_frame(&mut self) -> Result<CapturedFrame>;
fn try_latest(&mut self) -> Option<CapturedFrame>;
fn set_active(&mut self, a: bool);
fn hdr_meta(&self) -> Option<HdrMeta>;
fn pipeline_depth(&self) -> usize;
}
fn open_capturer(vout: VirtualOutput, want: OutputFormat) -> Result<Box<dyn Capturer>>; // format+HDR passed IN
trait Encoder: Send {
fn submit(&mut self, f: &CapturedFrame) -> Result<()>;
fn poll(&mut self) -> Option<EncodedFrame>;
fn flush(&mut self);
fn request_keyframe(&mut self);
fn caps(&self) -> EncoderCaps; // query, don't rely on default no-ops
fn set_hdr_meta(&mut self, m: Option<HdrMeta>);
fn invalidate_ref_frames(&mut self, lo: u64, hi: u64) -> bool;
}
fn open_encoder(plan: &EncodePlan) -> Result<Box<dyn Encoder>>;
trait AudioCapturer: Send { fn next_chunk(&mut self) -> Result<Vec<f32>>; fn channels(&self) -> u16; fn drain(&mut self); }
trait VirtualMic: Send { fn push(&mut self, pcm: &[f32]); fn channels(&self) -> u16; }
trait InputInjector: Send { fn inject(&mut self, e: &InputEvent); }
trait PadManager: Send { /* handle/apply_rich/pump/heartbeat — Box<dyn PadManager> via select(GamepadPref), replaces the PadBackend enum */ }
The three tightenings: (1) Capturer takes the desired OutputFormat IN — kills the
capture → encode::windows_resolved_backend() back-reference that's recomputed in dxgi.rs; (2) HDR
control + monitor-release become VirtualLease methods so the session glue never names a concrete
backend and contains zero unsafe; (3) optional encoder capabilities are queried via EncoderCaps.
2.4 SessionFactory + typed plan (the single biggest clarity lever)
Today the Windows capture/topology/encoder decision is made by ~40 scattered env reads, recomputed in
THREE places (capture_virtual_output, should_use_helper, virtual_stream) with no single owner and
a latent mirrored-dispatch bug (capture and encode can disagree on the backend). Replace with:
struct SessionPlan {
display: DisplayBackend,
capture: CaptureBackend, // IddPush | Dda | Wgc
topology: SessionTopology, // SingleProcess | TwoProcessRelay
encoder: EncoderBackend, // Nvenc | Amf | Qsv | Software
input_format: OutputFormat,
bit_depth: u8, hdr: bool, pipeline_depth: usize,
}
struct SessionFactory { cfg: Arc<HostConfig>, vdm: Arc<VirtualDisplayManager>, injector, mic, audio }
impl SessionFactory {
fn plan(&self, welcome: &Welcome) -> SessionPlan; // resolves ONCE from HostConfig; no env reads downstream
fn build(&self, plan: &SessionPlan, ctx: SessionContext) -> Result<Session>; // owns the RAII chain
}
build() owns the chain vdm.lease(mode) → open_capturer(vout, fmt) → open_encoder(plan) → spawn pipeline, and Session::drop is the only teardown path. This kills the env soup, makes the deployed
path readable, and removes the capture/encode backend-disagreement bug class. It also lets us drop the
12–13-arg #[allow(too_many_arguments)] signatures (a SessionContext struct) and the dead
Compositor ceremony threaded through the Windows path.
2.5 Ownership model — delete the global statics
Today the lifecycle is smeared across IDD_PERSIST + open_or_reuse (dead code), CURRENT_MON_GEN
(read per-frame), IDD_SETUP_LOCK/IDD_SESSION_STOP (the preempt dance), MGR: Mutex<Mgr>, and on
the driver side ADAPTER/MONITOR_MODES/NEXT_ID/WATCHDOG_*/DEVICE_POOL. Replace with:
- A host-lifetime VirtualDisplayManager owning a typed OwnedHandle device handle (not a raw isize smuggled across threads) and the refcounted Idle/Active/Lingering state machine (preserve the machine; it's earned).
- A per-session MonitorLease whose Drop releases the refcount; the monitor generation carried through WinCaptureTarget instead of the ambient CURRENT_MON_GEN.
- On the driver: wire EvtCleanupCallback for MonitorContext (only DeviceContext has it today) so the SwapChainProcessor + D3D resources drop via WDF RAII, deleting free_swap_chain_processor and the manual-free-before-departure dance that is the documented dominant reconnect leak. Move the process-global driver state into the DeviceContext; collapse the 3-way monitor identity (MONITOR_MODES / EDID serial / context stamp) to one Monitor owned by the context.
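A minimal sketch of the lease idea above, assuming a plain refcount behind a mutex. The State struct, the recreate() method, and active_refs() are hypothetical names for illustration; the real manager also carries the Idle/Active/Lingering machine and the device handle, which are omitted here. The point it shows is that a lease stamped with an old generation cannot release a reference on a freshly recreated monitor.

```rust
use std::sync::{Arc, Mutex};

// Hypothetical shared state: the current monitor generation and its live lease count.
#[derive(Debug, Default)]
struct State {
    generation: u64,
    refs: u32,
}

#[derive(Clone, Default)]
pub struct VirtualDisplayManager {
    state: Arc<Mutex<State>>,
}

/// Per-session lease; dropping it releases one reference on its own generation only.
pub struct MonitorLease {
    state: Arc<Mutex<State>>,
    generation: u64,
}

impl VirtualDisplayManager {
    pub fn lease(&self) -> MonitorLease {
        let mut s = self.state.lock().unwrap();
        s.refs += 1;
        MonitorLease { state: Arc::clone(&self.state), generation: s.generation }
    }

    /// Stand-in for a monitor recreate: new generation, fresh refcount.
    pub fn recreate(&self) {
        let mut s = self.state.lock().unwrap();
        s.generation += 1;
        s.refs = 0;
    }

    pub fn active_refs(&self) -> u32 {
        self.state.lock().unwrap().refs
    }
}

impl MonitorLease {
    pub fn generation(&self) -> u64 {
        self.generation
    }
}

impl Drop for MonitorLease {
    fn drop(&mut self) {
        let mut s = self.state.lock().unwrap();
        // A stale lease (older generation) must not touch the fresh monitor's count.
        if s.generation == self.generation && s.refs > 0 {
            s.refs -= 1;
        }
    }
}
```

Dropping a lease taken before recreate() leaves the new generation's count untouched, which is the property the generation stamp exists to guarantee.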
3. The host↔driver contract (own it; define once)
3.1 pf-vdisplay-proto (no_std, bytemuck/zerocopy)
One crate, both build graphs (path dep). Owns:
- Control plane: a fresh interface GUID; a contiguous, versioned op enum; #[repr(C)] request/reply structs carrying only used fields.
- Frame plane: SharedHeader, the FrameToken { generation, seq, slot } with pack/unpack (replacing the hand-twiddled gen<<40|seq<<8|slot on both sides), the Global\pfvd-* name helpers.
- Gamepad sections: XusbShm (64 B) and PadShm (256 B, incl. device_type) layouts.
- Derive FromBytes/IntoBytes/Pod; const size+offset asserts; round-trip tests. ABI drift becomes a compile error, not a runtime corruption. (bytemuck is already a dep in the driver + wdf-umdf-sys.) This deletes every OFF_* constant + read/write_unaligned on both sides of every boundary: the largest single block of shared-memory unsafe, and the top drift hazard.
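A sketch of what the FrameToken pack/unpack pair could look like, assuming the bit layout implied by gen<<40 | seq<<8 | slot: slot in bits 0..8, seq in bits 8..40, generation in bits 40..64. The field types and the is_current helper are assumptions for illustration, not the crate's actual definitions.

```rust
/// Hypothetical packed frame token (layout inferred from gen<<40 | seq<<8 | slot).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u32, // only the low 24 bits survive packing (assumption)
    pub seq: u32,
    pub slot: u8,
}

impl FrameToken {
    const GEN_MASK: u64 = (1 << 24) - 1;

    /// Pack into the single u64 that is published through shared memory.
    pub const fn pack(self) -> u64 {
        ((self.generation as u64 & Self::GEN_MASK) << 40)
            | ((self.seq as u64) << 8)
            | self.slot as u64
    }

    /// Inverse of pack; the narrowing casts keep exactly the bits of each field.
    pub const fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 40) as u32,
            seq: (raw >> 8) as u32,
            slot: raw as u8,
        }
    }

    /// Stale-ring reject: a token from another generation is not consumed.
    pub fn is_current(self, ring_generation: u32) -> bool {
        self.generation == ring_generation & Self::GEN_MASK as u32
    }
}

// Compile-time round-trip check, in the spirit of the const asserts above.
const _: () = assert!(
    FrameToken::unpack(FrameToken { generation: 1, seq: 2, slot: 3 }.pack()).slot == 3
);
```

With both sides calling the same pack/unpack, the shift constants live in one place and a layout change cannot be applied to only one end.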
3.2 Control plane — keep DeviceIoControl, redesign the ABI
DeviceIoControl is the correct WDF idiom for a driver with no control device and is low-frequency
(ADD/REMOVE per session + a keepalive); the shared-memory pattern buys nothing here. Keep it; redesign
the surface:
- Ops actually needed: Add(mode, identity) → {luid, target_id}, Remove, SetRenderAdapter (now unconditional: pf-vdisplay honors it for hybrid-GPU IDD-push; drop the SudoVDA-parity default-off branch), ClearAll (first-class startup orphan reap, not an "ignored by SudoVDA" hack), GetInfo (a real version handshake), and keepalive (see §3.4).
- Drop the SudoVDA-isms: AddParams.device_name[14]/serial[14] (ignored), the 16-byte GUID → a monotonic u64 session id (the refcount manager owns collision safety; retires next_monitor_guid's pid-mangling), the 4-byte {major,minor,incr,test} version tuple → one u32, the gappy 0x800/0x888/0x8FF func numbering → contiguous.
- One typed IOCTL dispatch helper retrieves+validates+aligns the buffers and hands the body a safe &Req/&mut MaybeUninit<Reply>, which collapses ~20 of control.rs's 29 unsafe blocks.
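As an illustration of the contiguous op numbering and the validate-once dispatch idea, here is a small safe sketch. The numeric values, the per-op input sizes, and the DispatchError type are all hypothetical; the real helper would sit on top of the WDF request-buffer retrieval, which is not modeled here.

```rust
use std::convert::TryFrom;

/// Hypothetical contiguous op codes (values are illustrative, not the real ABI).
#[repr(u32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Op { Add = 0, Remove = 1, SetRenderAdapter = 2, ClearAll = 3, GetInfo = 4, Keepalive = 5 }

#[derive(Debug, PartialEq, Eq)]
pub enum DispatchError {
    UnknownOp(u32),
    BadLength { op: Op, expected: usize, got: usize },
}

impl TryFrom<u32> for Op {
    type Error = DispatchError;
    fn try_from(code: u32) -> Result<Self, Self::Error> {
        // Contiguous numbering makes decoding a bounds-checked table lookup.
        const ALL: [Op; 6] =
            [Op::Add, Op::Remove, Op::SetRenderAdapter, Op::ClearAll, Op::GetInfo, Op::Keepalive];
        ALL.get(code as usize).copied().ok_or(DispatchError::UnknownOp(code))
    }
}

impl Op {
    /// Exact input length each op expects (assumed sizes for the sketch).
    pub const fn input_len(self) -> usize {
        match self {
            Op::Add => 20,
            Op::Remove | Op::SetRenderAdapter => 8,
            Op::ClearAll | Op::GetInfo | Op::Keepalive => 0,
        }
    }
}

/// Validate op code and buffer length in one place, before any handler runs.
pub fn validate(code: u32, input: &[u8]) -> Result<Op, DispatchError> {
    let op = Op::try_from(code)?;
    if input.len() != op.input_len() {
        return Err(DispatchError::BadLength { op, expected: op.input_len(), got: input.len() });
    }
    Ok(op)
}

/// Decode the monotonic u64 session id carried by a Remove request.
pub fn decode_session_id(input: &[u8]) -> Option<u64> {
    let bytes = <[u8; 8]>::try_from(input).ok()?;
    Some(u64::from_le_bytes(bytes))
}
```

Handlers downstream of validate() can then assume a well-formed buffer, which is where the per-handler unsafe blocks would otherwise come from.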
3.3 Frame plane — keep the inversion, retire the scaffolding
Keep the host-creates / driver-opens ring exactly. Remove the bring-up scaffolding that diagnosed
the now-solved run_core=0 mystery: the DebugBlock channel + DBG_MAGIC, spawn_observer /
PUNKTFUNK_IDD_PUSH_OBSERVE, the error!-as-info! logging, the intentional handle leak, and the
20 s blind no-frame deadline (replace with the DRV_STATUS_OPENED handshake as a bounded liveness
signal).
3.4 Driver swap-chain reuse — the one open root cause
Today a reused IddCx monitor's swap-chain dies after ~2 sessions (target id resolves to 0, SetDevice
fails 0x80070057, then an access violation), forcing fresh-monitor-per-session + the host-side
preempt/wait_for_monitor_released dance + the IDD_PERSIST "create once, never recreate" workaround.
The fix is in the driver: with EvtCleanupCallback wired + state owned by DeviceContext + the
identity collapsed to one Monitor (the recreate-path bugs are exactly the 3-way identity desync), the
clean recreate should become stable. If that holds, delete IDD_SETUP_LOCK/IDD_SESSION_STOP +
the preempt dance and unblock max_concurrent>1 on Windows. If it can't be fixed cheaply, isolate
the residual serialization inside VirtualDisplayManager (not smeared back into the session loop).
Separately, evaluate replacing the polling watchdog (PING/countdown/grace/linger constellation) with a
WDF file-object EvtFileClose (host holds the control handle open; close = host gone) — feasibility
TBD on UMDF/IddCx.
4. Capture strategy
IDD-push is the universal primary path — normal AND secure desktop (Decision B). It composes
in-process (cross-session via Global\ shared textures: driver in WUDFHost/Session 0, serve in the
console session), needs no DXGI Desktop Duplication and no win32u reparenting hook, is live-validated
at 5K@240 HDR, and (per the owner) also captures the secure desktop (Winlogon/UAC/lock). So there is no
separate "secure capturer" in the primary path: the same IddPushCapturer spans the lock screen and
UAC. Capture selection moves into a typed CaptureBackend in the SessionPlan — replacing the 3-way
env branch with IddPush (default) → Dda/Wgc (explicit fallbacks).
WGC + DDA are kept as fallbacks, not deleted (Decision B). They cover non-IddCx / pre-pf-vdisplay
hardware and act as a safety net if IDD-push fails to attach. But they are demoted: they are no
longer the default, no longer entangled with the secure-desktop mux, and selected only via the explicit
CaptureBackend fallback in the plan. This lets the DDA module shed the parts that existed only to
make virtual-display-over-DDA survive on a hybrid box, while the genuinely-useful capture/recovery core
stays:
- Scope the win32u self-modifying-code hook + the GPU-pref hook to the DDA fallback leg (one win32u_hook::install()), so the primary IDD-push path never touches them. Re-confirm whether DDA even needs the win32u hook against pf-vdisplay (it may not; open verification item).
- The two-process WGC relay's secure-desktop mux is retired: IDD-push handles the secure desktop directly, so desktop_watch.rs + composed_flip.rs + the virtual_stream_relay monolith are no longer needed for their original purpose. Keep a minimal WGC fallback capturer if the WGC backend is retained; do not port the 400-line relay state machine. (The cross-session input concern below is handled by the InputInjector/topology abstraction, not the AU video relay.)
Shared D3D primitives move out of dxgi.rs (today the de-facto dumping ground that wgc.rs and
idd_push.rs import from) into windows/d3d/ (typed Texture2d/Ring/CopyResource/Map-as-bytes),
windows/color/ (the converters + hdr_p010_selftest verbatim), and windows/cursor.rs. All three
capturers consume them — deletes the duplicated tex_desc, cursor, HDR-poll, repeat-last logic.
The texture-ownership contract becomes type-level. NVENC encodes the capturer's texture in place
(no copy), sound today only because the IDD-push capturer rotates OUT_RING and the loop honors
pipeline_depth() — an undocumented cross-module coupling that is already a latent corruption risk.
Fix: either the encoder always CopySubresourceRegions (as ffmpeg_win does), or the capturer hands an
explicitly-leased ring texture with a documented lifetime. No more relying on the synchronous-loop
assumption.
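One possible shape for the "explicitly-leased ring texture" option, sketched under assumptions: OutRing, LeasedTexture, and the busy flag are hypothetical names, and the texture type is left generic so the example stays platform-neutral. The capturer can only take a slot the encoder has already released, so the lifetime rule is enforced by the type instead of by loop ordering.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

struct Slot<T> {
    busy: AtomicBool,
    texture: T,
}

/// Hypothetical output ring; T stands in for a D3D texture handle.
pub struct OutRing<T> {
    slots: Vec<Arc<Slot<T>>>,
    next: usize,
}

/// Held by the encoder while it reads the texture; Drop returns the slot.
pub struct LeasedTexture<T> {
    slot: Arc<Slot<T>>,
    index: usize,
}

impl<T> OutRing<T> {
    pub fn new(textures: Vec<T>) -> Self {
        assert!(!textures.is_empty(), "ring needs at least one slot");
        let slots = textures
            .into_iter()
            .map(|texture| Arc::new(Slot { busy: AtomicBool::new(false), texture }))
            .collect();
        Self { slots, next: 0 }
    }

    /// Lease the next slot in rotation, or None if it is still in use.
    pub fn lease(&mut self) -> Option<LeasedTexture<T>> {
        let index = self.next;
        let slot = &self.slots[index];
        if slot.busy.swap(true, Ordering::Acquire) {
            return None; // still leased: the capturer must not overwrite it
        }
        self.next = (index + 1) % self.slots.len();
        Some(LeasedTexture { slot: Arc::clone(slot), index })
    }
}

impl<T> LeasedTexture<T> {
    pub fn texture(&self) -> &T {
        &self.slot.texture
    }
    pub fn index(&self) -> usize {
        self.index
    }
}

impl<T> Drop for LeasedTexture<T> {
    fn drop(&mut self) {
        self.slot.busy.store(false, Ordering::Release);
    }
}
```

With two slots this permits the pipeline_depth=2 overlap, and a third lease simply fails until the encoder drops one, rather than silently reusing a texture that is still being encoded.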
The IDD-push input question (must confirm on-glass): capture+encode run in serve; input must reach
the streamed (console-session) desktop. If serve runs in the console session, SendInput works
directly. A code comment flags "SendInput from Session 0 can't reach Session 1" — so the architecture
must make InputInjector satisfiable either by in-session SendInput or by a tiny input-only
Session-1 agent (re-scope the old WGC helper to input only). The SessionPlan.topology expresses
this.
5. Encode layer
- Resolve backend + input format + pipeline depth once into EncodePlan and hand it to both the capturer and the encoder factory; kill the duplicated windows_resolved_backend() call in dxgi.rs (the highest-severity coupling). Trim open_video's 8-arg grab-bag (cuda is always false on Windows; bit_depth is overridden by the capture format anyway).
- nvenc_sys.rs: a thin safe wrapper with RAII NvSession/NvBitstream/NvRegistration/NvMappedInput (Drop = destroy/unregister/unmap) + an NV_ENC_CONFIG builder. The public encoder then has near-zero unsafe and no hand-written teardown loops. (The SDK table already returns Result via result_without_string().) This is the single biggest encode-side unsafe reduction.
- ffmpeg_win: RAII AvFrame/SwsCtx/HwDeviceCtx/HwFramesCtx delete every manual av_*_free and the error-path cleanup ladders (also the biggest leak-risk reduction); a checked MappedSurface for the staging readback; a const size-assert on the hand-mirrored AVD3D11VA* structs in a dedicated d3d11va_ffi submodule (silent FFmpeg ABI drift is currently undetectable). Keep system-readback the default; zero-copy stays opt-in/experimental (no AMD/Intel lab box).
- HDR symmetry: make in-band ST.2086/CLL SEI a shared post-encode step so AMF/QSV get the same mastering metadata as NVENC (today only NVENC attaches it; AMF/QSV rely solely on the 0xCE datagram). Centralize "when does the client learn HDR metadata" in one owner.
- Keep hdr.rs, the Encoder trait, EncodedFrame, validate_dimensions, the caps-probe + RFI logic verbatim. Delete the pipeline.rs pump_once doc stub (the real loop is session/pipeline.rs).
6. Drivers — one binding stack (windows-drivers-rs), one workspace, one signing recipe
Today: pf-vdisplay on the vendored wdf-umdf stack; pf_dualsense + pf_xusb on
microsoft/windows-drivers-rs (wdk/wdk-sys/wdk-build). Two bindgen passes, two SDK
resolutions, two NTSTATUS, two build systems, two signing recipes.
Decision C: unify all three on microsoft/windows-drivers-rs (the official Microsoft stack), in one
in-tree packaging/windows/drivers/ workspace, edition 2024, one rust-toolchain.toml, one CI build.
The gamepad drivers already ship on it; the work is to migrate pf-vdisplay onto it and add the
IddCx surface it lacks today.
Required pieces of this migration (each a Phase-0/early task):
- Add an iddcx subset to wdk-sys. IddCx DDIs are not WDF-table functions (they are direct IddCxStub exports), so the extension is bounded: an ApiSubset::Iddcx + iddcx feature → bindgen IddCx.h + link IddCxStub, then ~15 thin extern/wrapper fns. Use the current wdf-umdf/src/iddcx.rs (~345 LOC, validated) as a line-by-line oracle, including the IddCx 1.10 *2 HDR DDIs (IddCxSwapChainReleaseAndAcquireBuffer2, IDARG_*2, _METADATA2).
- Solve /INTEGRITYCHECK for self-signed loading, properly. wdk-build links the driver with /INTEGRITYCHECK, which a self-signed cert can't satisfy (CodeIntegrity 3004/3089). Today the gamepad drivers hand-patch the FORCE_INTEGRITY PE bit post-link. Replace that hack with a robust solution, in order of preference: (a) override the linker flag, dropping /INTEGRITYCHECK via wdk-build config / RUSTFLAGS / link-args if it can be suppressed cleanly; else (b) a deterministic, tested CI post-link tool (a small Rust/PowerShell step that clears bit 0x80 at e_lfanew+0x5e and re-signs, run in CI, not by hand) so it's reproducible and not a footgun; (c) for a public build, real attestation signing (Partner Center) satisfies /INTEGRITYCHECK legitimately. Pick (a) if feasible; (b) as the fleet-self-signed fallback. This is the headline cost of choosing this stack and must be nailed in Phase 0.
- Backport the wdf-umdf-sys build.rs SDK-resolution fix into wdk-build (or a local override): resolve IddCx.h/IddCxStub by the SDK version that actually contains um\x64\iddcx, not the max base SDK (the real failure where a newer base SDK shadows the WDK SDK). windows-drivers-rs's default resolution doesn't exercise IddCx today, so this likely needs porting.
- Port pf-vdisplay's typed safety wins onto the new stack: re-create the WDF_DECLARE_CONTEXT_TYPE! Arc<RwLock<T>> context abstraction (the gold-standard contained unsafe); the version-gate protocol (IddCxIsFunctionAvailable!/IDD_STRUCTURE_SIZE!); and a thin safe wrapper layer so the gamepad drivers stop emitting raw call_unsafe_wdf_function_binding! everywhere (the biggest driver-unsafe lever).
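For option (b) above, the post-link step could be as small as the following sketch. It operates on an in-memory image and only clears the bit named in the text (0x80 at e_lfanew+0x5e, the low byte of the optional header's DllCharacteristics); the function name and error type are hypothetical, and the sketch does not touch the PE checksum or the signature, so re-signing remains a separate step.

```rust
/// Low-byte mask of IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY (0x0080).
const FORCE_INTEGRITY: u8 = 0x80;
/// Offset of the e_lfanew field in the DOS header.
const E_LFANEW_FIELD: usize = 0x3c;
/// DllCharacteristics offset relative to e_lfanew (4-byte signature + 20-byte file header + 70).
const DLL_CHARACTERISTICS: usize = 0x5e;

#[derive(Debug, PartialEq, Eq)]
pub enum PatchError {
    TooShort,
    NotPe,
}

/// Clears FORCE_INTEGRITY in place.
/// Returns Ok(true) if the bit was set, Ok(false) if it was already clear.
pub fn clear_force_integrity(image: &mut [u8]) -> Result<bool, PatchError> {
    if image.len() < E_LFANEW_FIELD + 4 {
        return Err(PatchError::TooShort);
    }
    if !image.starts_with(b"MZ") {
        return Err(PatchError::NotPe);
    }
    let e_lfanew = u32::from_le_bytes([
        image[E_LFANEW_FIELD],
        image[E_LFANEW_FIELD + 1],
        image[E_LFANEW_FIELD + 2],
        image[E_LFANEW_FIELD + 3],
    ]) as usize;
    // The NT headers must start with the "PE\0\0" signature.
    let has_sig = image.get(e_lfanew..).map_or(false, |s| s.starts_with(b"PE\0\0"));
    if !has_sig {
        return Err(PatchError::NotPe);
    }
    let byte = image
        .get_mut(e_lfanew + DLL_CHARACTERISTICS)
        .ok_or(PatchError::TooShort)?;
    let was_set = *byte & FORCE_INTEGRITY != 0;
    *byte &= !FORCE_INTEGRITY;
    Ok(was_set)
}
```

Returning whether the bit was set lets a CI step fail loudly if the linker flag was already suppressed (or the wrong file was patched), which keeps the tool deterministic rather than silently idempotent.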
While unifying, also: adopt WDF device contexts for per-pad state (drop the
UmdfHostProcessSharing=ProcessSharingDisabled-dependent statics → true multi-pad-per-host); replace
mem::zeroed() configs with the WDF_*_CONFIG_INIT initializers (kills the recurring zeroed-default
bug class that already caused 3 driver bugs); cache the shm view (RAII ShmView) instead of
re-mapping ~125×/s; delete the world-writable C:\Users\Public\*.log driver logging and the "M0
spike" naming; collapse is_nt_error()/dyn-Any/From<()>-as-error into a typed IntoDriverResult;
collapse the per-call dispatch unsafe into one generic dispatch() helper.
Provenance note: confirm where wdk/wdk-sys/wdk-build come from (the gamepad drivers' Cargo.toml
path-deps ../../crates/wdk* don't exist in this checkout — they resolve inside a windows-drivers-rs
checkout on the dev box). Pin them as crates.io deps or a vendored, version-pinned copy so the driver
workspace builds reproducibly in CI.
7. Input, audio, service, packaging
- Input: consolidate the host-side device plumbing (create_swdevice/create_shm_section/SwDeviceProfile) into one inject/windows/swdevice.rs used by all three managers (XUSB included, which currently re-implements its own). The shm layouts come from pf-vdisplay-proto. Re-scope the cross-session helper (if any) to input-only.
- Audio: small, already fairly clean. Replace the lone newdev.dll LoadLibrary+transmute (wasapi_mic.rs, the audio runtime's only unsafe) with the windows-rs DiInstallDriverW binding (or move provisioning to the installer) → zero unsafe in the audio runtime.
- Service / process: one windows/process.rs owning RAII Token/Event/Job/Child + a single spawn_as_user() used by BOTH the SCM supervisor and any helper; deletes the duplicated token-dup/merged_env_block/CreateProcessAsUserW machinery and ~12 manual CloseHandle sites. Add a cooperative stop: a named stop event the supervisor sets and serve waits on, so Stop runs RAII teardown (today TerminateProcess skips Drop → the virtual monitor lingers, the documented stale-monitor gotcha); TerminateProcess only as a bounded fallback.
- Packaging/CI: keep the thin-.iss / fat-binary model; add a punktfunk-host web install/uninstall subcommand to absorb the web-setup PowerShell. Build + sign the unified driver workspace in CI from source (or a CI guard that fails on stale-vendored-DLL / un-bumped DriverVer) so the driver can't silently drift from its source. Mint the fresh pf-vdisplay GUID coordinated across host + driver + INF. Single source of truth for version → build + ISCC AppVersion + INF DriverVer. Investigate retiring nefconc by creating the ROOT devnode via SwDevice/CM in Rust. Keep the devgen-never / nefconc-only and DriverVer-bump gotchas codified.
8. Unsafe-reduction program (run at port time, not as a separate pass)
- P0 lints first (a few lines, before new code): #![deny(unsafe_op_in_unsafe_fn)], #![warn(clippy::undocumented_unsafe_blocks)], #![warn(clippy::multiple_unsafe_ops_per_block)]. Generated bindings keep their opt-out.
- P0 std handle ownership: std::os::windows::io::OwnedHandle / std::fs::File::from_raw_handle everywhere a raw HANDLE/isize is held (events/jobs/tokens/sections/pipes). Used in zero host files today; the single biggest cheap win. Deletes the bespoke unsafe impl Read/Write/Drop (HandleReader), the never-closed sudovda control handle, the AtomicIsize HANDLE globals, ~6 manual CloseHandle sites, and fixes real leaks.
- P0 the proto crate (§3.1): kills the shared-memory pointer-cast unsafe.
- P1 typed wrappers: windows/d3d/ (most COM calls already return Result; per-frame loop bodies become unsafe-free, the irreducible keyed-mutex/from_raw_parts lands in one frame_xfer fn); nvenc_sys + RAII ffmpeg (§5); one windows/process.rs (§7); collapse the 21 unsafe impl Send onto one audited SendPtr<T>/ThreadBound<T> (directly de-risks the NVENC in-place coupling).
- P2 contain the irreducible: win32u_hook.rs (one install(); scope to secure-DDA or drop), gpu_priority.rs (the D3DKMT transmute), the WDF context-blob macro, the IddCx swap-chain DDI + from_raw_borrowed (wrap in a typed SwapChain guard returning a borrowed AcquiredSurface<'_>). Document a // SAFETY: per residual site.
- P2 delete unsafe by deleting code: the present_trigger dead diagnostic, the DebugBlock channel, spawn_observer, IDD_PERSIST/open_or_reuse, helpers.rs Sendable<T>, the WGC-open thread-watchdog hack (gone with WGC), the driver file-logging.
Estimated: host ~144→~35, drivers ~227→~60, residual concentrated and auditable. (#![forbid(unsafe)]
is impossible for the drivers and the per-frame D3D path — the realistic target is containment.)
9. SudoVDA decoupling (mechanical rename + scrub)
vdisplay/sudovda.rs → vdisplay/windows.rs; SudoVdaDisplay → PfVirtualDisplay; scrub "SudoVDA"
from all log/error/doc strings across capture.rs/dxgi.rs/wgc*.rs/idd_push.rs/punktfunk1.rs/
main.rs/sendinput.rs (141 refs / 15 files). Split the reach-in helpers out of the vdisplay
backend (they're display-utility, not virtual-display creation): set_advanced_color,
advanced_color_enabled, resolve_gdi_name, isolate/restore_displays_ccd, set_active_mode →
windows/display_ccd.rs (collapsing the 4× copy-pasted QueryDisplayConfig preamble into one safe
query_active_config()); resolve_render_adapter_luid → windows/adapter.rs. Both vdisplay and
capture then depend on these as peers, breaking the circular reach-in. WinCaptureTarget moves to a
neutral location (defined in dxgi.rs, constructed in sudovda.rs today). Drop the dual-driver
fallback conditionals. Expose HDR/monitor-release as VirtualLease methods (zero unsafe in the
session glue).
10. Build plan (greenfield — Decision A)
A from-scratch rebuild of the Windows host against the clean architecture, salvaging the §1 jewels
verbatim (the already-clean, already-tested modules: hdr.rs, edid.rs, the inject/proto codecs,
the HDR/cursor converters + their self-tests, the GF8 packetizer, the pairing handshake). The old
Windows code stays in-tree, untouched, as the reference implementation until the new path reaches
parity on glass, then is deleted.
Greenfield-risk mitigation (the survey's strong caveat stands): almost none of this is CI-validatable — the Windows backends + drivers need the RTX box (192.168.1.173) + the build VM, and AMF/QSV have no lab hardware at all. A greenfield rewrite therefore carries real risk of silently dropping a layered bug-fix. Two guardrails are mandatory:
- The §1 preservation checklist is a test/assert contract, not prose. Each rebuilt module ports its hard-won invariants as unit tests or runtime asserts: RAII teardown order (restore displays before REMOVE), keyed-mutex held only across convert/copy, terminate checked at the swap-chain loop top, magic stamped last, OUT_RING texture rotation under pipeline_depth>1, the NVENC caps-probe downgrade, the SwDeviceCreate identity recipe. A rebuild that drops one fails its own test.
- On-glass A/B gates at each milestone below, on the RTX box, against the current shipping build: 1080p60, 5K@240 HDR, reconnect-storm, secure desktop (lock/UAC), multi-pad. Nothing replaces the old path until its A/B passes.
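As an illustration of an invariant ported as a test rather than prose, the sketch below pins the "restore displays before REMOVE" teardown order. The guard types and field names are hypothetical stand-ins (only VirtualLease is a name from this plan); the sketch relies on the language rule that struct fields are dropped in declaration order.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Stand-in for the guard that restores the pre-session display topology.
struct RestoreDisplays(Log);
/// Stand-in for the guard that issues REMOVE to the driver.
struct RemoveMonitor(Log);

impl Drop for RestoreDisplays {
    fn drop(&mut self) {
        self.0.borrow_mut().push("restore_displays");
    }
}

impl Drop for RemoveMonitor {
    fn drop(&mut self) {
        self.0.borrow_mut().push("REMOVE");
    }
}

/// Fields drop in declaration order, so `_restore` must stay above `_remove`.
/// Reordering the fields makes `teardown_order` report the wrong sequence.
struct VirtualLease {
    _restore: RestoreDisplays,
    _remove: RemoveMonitor,
}

/// Builds a lease, drops it, and returns the recorded teardown sequence.
fn teardown_order() -> Vec<&'static str> {
    let log: Log = Rc::default();
    drop(VirtualLease {
        _restore: RestoreDisplays(log.clone()),
        _remove: RemoveMonitor(log.clone()),
    });
    let recorded = log.borrow().clone();
    recorded
}
```

A unit test asserting that teardown_order() yields restore_displays followed by REMOVE would fail the moment someone swaps the two fields, which is the behavior the guardrail asks for.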
Build order
- M0: Foundations + the /INTEGRITYCHECK answer. Stand up crates/pf-vdisplay-proto (the clean, owned ABI: fresh GUID, the redesigned IOCTL op enum + #[repr(C)] structs, SharedHeader, FrameToken, the gamepad shm layouts, const size-asserts, round-trip tests). Stand up the in-tree packaging/windows/drivers/ workspace on windows-drivers-rs and prove the two hard unknowns: (a) the iddcx wdk-sys subset bindgen+links and a trivial IddCx adapter loads; (b) /INTEGRITYCHECK is solved (§6.2) so a self-signed driver loads under Secure Boot with no hand-patching. Add the P0 lints to the host crate. No host behavior yet.
- M1: pf-vdisplay on the new stack, first light. Rebuild the IddCx driver against windows-drivers-rs + iddcx, clean from the start: DeviceContext-owned state (no process-globals), one Monitor identity, EvtCleanupCallback on MonitorContext, the ported Arc<RwLock<T>> context, the EDID + HDR recipe verbatim, the redesigned control plane from the proto crate. (On-glass: ADD → monitor arrives → IDD-push ring attaches → frames flow at 1080p; REMOVE clean.)
- M2: IDD-push capture + NVENC, glass-to-glass. New src/windows/ tree: windows/d3d/ typed wrappers, windows/color/ (converters + self-tests), windows/cursor.rs, capture/windows/idd_push.rs consuming the proto ring with a type-level texture-ownership contract (no in-place-encode assumption), encode/windows/{nvenc.rs,nvenc_sys.rs}, vdisplay/windows.rs + windows/display_ccd.rs + windows/adapter.rs. Wire the SessionFactory/SessionPlan (M2 only needs the IDD-push+NVENC plan). (On-glass A/B: 1080p60 + 5K@240 HDR, latency parity with the current build.)
- M3: Service, input, audio, secure desktop. windows/process.rs (RAII Token/Event/Job/Child + spawn_as_user + cooperative stop) + windows/service.rs; inject/windows/* on the proto shm + consolidated swdevice.rs; audio/windows/* (zero-unsafe runtime). Confirm IDD-push captures the secure desktop (lock/UAC) and input reaches the streamed session (in-session SendInput, or the input-only agent if needed). (On-glass: full session incl. lock screen + UAC + a real pad.)
- M4: Gamepad drivers onto the unified stack. Rebuild pf_dualsense + pf_xusb on windows-drivers-rs in the same workspace, WDF device contexts (true multi-pad), proto shm, WDF_*_CONFIG_INIT, no file logging, no "M0 spike" naming. (On-glass: 2 XInput + 2 DualSense pads, rumble/lightbar/adaptive-trigger round-trip.)
- M5: Fallbacks + GameStream + AMF/QSV. Port the demoted WGC + DDA fallback capturers (minimal, win32u hook scoped to the DDA leg); encode/windows/ffmpeg_win/* with RAII FFmpeg + the d3d11va_ffi size-assert (system-readback default; zero-copy experimental); GameStream planes reusing session/pipeline.rs, installer default flipped to secure serve. (On-glass: Moonlight client on the DDA fallback; AMF/QSV stays CI-only.)
- M6: Cut over + delete. Flip the default to the new path, run the full A/B matrix, then delete the old dxgi.rs/wgc*/sudovda.rs/punktfunk1.rs Windows monoliths + the bring-up scaffolding (DebugBlock/spawn_observer/observe gate) + the old gamepad driver crates. Single source of truth for version; CI builds+signs all drivers from source.
Milestones are roughly dependency-ordered; M0 is the long pole (the /INTEGRITYCHECK + iddcx proof
gates everything else). M5's AMF/QSV cannot be validated without hardware — keep it system-readback-only
and clearly experimental.
11. Decisions (resolved 2026-06-24) + open verification items
The five product forks are decided (see the table in §0): A greenfield; B IDD-push primary for
everything incl. secure desktop, WGC+DDA kept as demoted fallbacks; C extend windows-drivers-rs +
solve /INTEGRITYCHECK; D keep GameStream, default secure. On E (concurrent sessions): fix the
driver swap-chain lifecycle regardless (it removes the leak + the preempt dance); treat true
max_concurrent>1 on Windows as a follow-on once clean reuse is proven on glass.
What remains are technical unknowns to confirm on the RTX box (not user decisions):
- /INTEGRITYCHECK resolution path (M0 long pole). Can wdk-build suppress /INTEGRITYCHECK via config/link-args (preferred), or must we keep a deterministic CI post-link bit-clear? Decides the signing story for all three drivers.
- iddcx subset on wdk-sys. Does the bindgen+IddCxStub link cleanly, and does the SDK-resolution fix need backporting? (windows-drivers-rs doesn't exercise IddCx today.)
- Driver swap-chain reuse. Does the clean ownership model (EvtCleanupCallback + DeviceContext state + single Monitor identity) actually fix the "reused swap-chain dies after ~2 sessions" root cause? If not, the residual serialization stays inside VirtualDisplayManager.
- IDD-push input + secure desktop. Confirm serve runs in the console session so SendInput reaches the streamed desktop (a code comment warns about Session 0→1); confirm IDD-push frames flow through the lock screen / UAC (owner reports yes; verify and lock it in as the primary, demoting the DDA secure leg to fallback).
- Does the demoted DDA fallback still need the win32u hook against pf-vdisplay, or was that purely a SudoVDA/hybrid pathology? If unneeded, the self-modifying-code hook can be deleted entirely.
- AMF/QSV stays CI-only (no hardware): system-readback default, zero-copy experimental.
12. Risks
- Greenfield with no CI (the dominant risk). The build VM is headless/WARP; the WinUI/hardware/driver paths need the RTX box, and AMF/QSV have no hardware. A from-scratch rebuild can silently drop a layered bug-fix. Mitigation: the §1 preservation checklist is a test/assert contract per rebuilt module; on-glass A/B gates the new path before the old one is deleted (M6); keep the old code in-tree as the reference until parity.
- /INTEGRITYCHECK (M0 long pole). Choosing windows-drivers-rs means self-signed loading depends on solving it cleanly (§6.2). If neither linker-flag suppression nor a deterministic CI post-link step works, drivers can't load self-signed. Prove this first; it gates everything.
- iddcx on wdk-sys is new surface (windows-drivers-rs doesn't bind IddCx). Bounded (IddCxStub exports + ~15 wrappers, with the validated wdf-umdf/iddcx.rs as oracle) but unproven on this stack; M0 must light it.
- pf-vdisplay-proto spans two cargo build graphs (host workspace + the driver workspace). Validate the path-dep resolves on the Windows build env in M0; pin wdk* provenance so the driver workspace builds reproducibly in CI.
- Driver swap-chain-reuse root cause still undiagnosed. The clean ownership model should fix it; if not, residual serialization stays inside VirtualDisplayManager and max_concurrent>1 stays blocked. Keep await_released on the trait until reuse is proven on glass.
- NVENC in-place encode + pipeline_depth>1 is a latent corruption risk; the M2 texture-ownership contract must be type-level (not the synchronous-loop assumption). Verify the ring on glass.
- Host/driver version drift in the field. New host + new driver are always built together (greenfield), but the installer bundles both, so enforce a startup version handshake (proto version in both binaries) and a CI guarantee they're built from the same revision.
- Big-bang cutover (M6). Flipping the default and deleting the old monoliths is the riskiest moment; it is gated on the full A/B matrix passing, and the old code is recoverable from git if a regression surfaces post-cutover.
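The version-drift mitigation above can be sketched as a compile-time size assert plus a startup check. Everything named here (PROTOCOL_VERSION's value, the GetInfoReply layout, HandshakeError) is an assumption made for illustration; the real pf-vdisplay-proto definitions may differ.

```rust
/// Hypothetical protocol version constant compiled into both host and driver.
pub const PROTOCOL_VERSION: u32 = 1;

/// Hypothetical GET_INFO reply layout; the real struct lives in the proto crate.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct GetInfoReply {
    pub protocol_version: u32,
    pub driver_build: u32,
}

// Compile-time size assert: an accidental layout change breaks the build on
// both sides instead of surfacing as a silent ABI mismatch at runtime.
const _: () = assert!(core::mem::size_of::<GetInfoReply>() == 8);

#[derive(Debug, PartialEq, Eq)]
pub enum HandshakeError {
    VersionMismatch { host: u32, driver: u32 },
}

/// Host-side startup check: refuse to proceed unless the versions match.
pub fn check_handshake(reply: &GetInfoReply) -> Result<(), HandshakeError> {
    if reply.protocol_version == PROTOCOL_VERSION {
        Ok(())
    } else {
        Err(HandshakeError::VersionMismatch {
            host: PROTOCOL_VERSION,
            driver: reply.protocol_version,
        })
    }
}
```

An exact-match policy is the simplest choice when both binaries always ship from the same revision; a compatibility range would only be needed if mixed installs had to be tolerated.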
13. Progress log + M1 IddCx-binding recipe (2026-06-24)
M0 COMPLETE (commits through f896f70, on main, CI-green + validated on the RTX box):
- crates/pf-vdisplay-proto: owned host↔driver ABI (fresh GUID, typed IOCTLs + frame transport, const size-asserts). Green Linux + MSVC.
- Runner and RTX box provisioned: WDK 26100 (WDF 2.31, IddCx 1.10), LLVM 21.1.2 (the runner's default was a ToT/22-dev build → wdk-sys bindgen E0080 layout-test overflow; 21.1.2 builds clean; see windows-drivers-rs discussion #591). cargo-wdk on the runner.
- packaging/windows/drivers/: unified driver workspace on windows-drivers-rs; wdk-probe (minimal UMDF) builds clean end-to-end (bindgen + WDF link + static-CRT .cargo/config + pf-vdisplay-proto path-dep). Build layers solved: in-tree target dir (wdk-build walks OUT_DIR ancestors for Cargo.lock); [workspace.metadata.wdk.driver-model] = UMDF 2.31; target-feature=+crt-static w/ explicit target; Version_Number=10.0.26100.0; LIBCLANG_PATH → LLVM 21.1.2.
- /INTEGRITYCHECK resolved: wdk-build sets it unconditionally (no opt-out) → packaging/windows/clear-force-integrity.ps1 clears the PE FORCE_INTEGRITY bit (0x0080 @ e_lfanew+0x5e) post-link, before signing. Proven 0x01E0→0x0160 on CI and in PS 5.1 on the box. Self-signed UMDF load itself is already proven on the box (the gamepad drivers).
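For reference, the bit-clear that the PowerShell script performs can be expressed in a few lines. This is a sketch in Rust over an in-memory image, not a port of clear-force-integrity.ps1 (whose contents are not reproduced here); the function name and error strings are illustrative. The offsets match the ones quoted above: e_lfanew is read from 0x3C, and DllCharacteristics sits at e_lfanew+0x5E.

```rust
const E_LFANEW_OFFSET: usize = 0x3C;
const DLL_CHARACTERISTICS_OFFSET: usize = 0x5E; // relative to e_lfanew
const FORCE_INTEGRITY: u16 = 0x0080;

/// Clears the FORCE_INTEGRITY bit in a PE image held in memory.
/// Returns the DllCharacteristics value (before, after).
pub fn clear_force_integrity(image: &mut [u8]) -> Result<(u16, u16), &'static str> {
    if image.len() < E_LFANEW_OFFSET + 4 || &image[..2] != b"MZ" {
        return Err("not an MZ image");
    }
    let e_lfanew = u32::from_le_bytes([
        image[E_LFANEW_OFFSET],
        image[E_LFANEW_OFFSET + 1],
        image[E_LFANEW_OFFSET + 2],
        image[E_LFANEW_OFFSET + 3],
    ]) as usize;
    let field = e_lfanew
        .checked_add(DLL_CHARACTERISTICS_OFFSET)
        .ok_or("e_lfanew overflow")?;
    if field + 2 > image.len() {
        return Err("image truncated before DllCharacteristics");
    }
    if &image[e_lfanew..e_lfanew + 4] != b"PE\0\0" {
        return Err("missing PE signature");
    }
    let before = u16::from_le_bytes([image[field], image[field + 1]]);
    let after = before & !FORCE_INTEGRITY;
    image[field..field + 2].copy_from_slice(&after.to_le_bytes());
    Ok((before, after))
}
```

Because the step must run post-link and before signing, any such tool should be idempotent; clearing an already-clear bit here simply returns identical before and after values.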
RTX box (ssh "Enrico Bühler"@192.168.1.173, ENRICOS-DESKTOP, RTX 4090 driver 610.62, PS 5.1 shell):
ephemeral — boots to Proxmox on reboot, so unreachable after a reboot. Treat as opportunistic on-glass
(driver load + IDD-push streaming) only; CI on the windows-amd64 runner is the persistent validator.
A build clone is at C:\Users\Public\pf-rewrite; builds the driver in ~29 s with the box's LLVM 21.1.2.
M1 — IddCx binding on windows-drivers-rs (the recipe)
IddCx DDIs are function-table dispatched (IddFunctions[] indexed by IDDFUNCENUM::<Name>TableIndex,
IddDriverGlobals implicit first arg) — exactly the model wdk-sys already implements for WDF (not direct
IddCxStub exports as first assumed).
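The dispatch model described above can be mocked in portable Rust to show the shape a wrapper takes. This is a toy model only: DriverGlobals, the table contents, the index value, and the mock callee are all invented for the sketch, and the real wdk-sys machinery and IddCx signatures differ.

```rust
use std::mem::transmute;

/// Stand-in for the implicit first argument (IddDriverGlobals in the real API).
#[repr(C)]
pub struct DriverGlobals {
    pub tag: u32,
}

/// Table slots are type-erased function pointers, as in a WDF-style table.
type ErasedFn = unsafe extern "C" fn();
/// Hypothetical typed signature for one slot.
type PfnMonitorDeparture = unsafe extern "C" fn(*const DriverGlobals, usize) -> i32;

#[allow(non_upper_case_globals)]
pub mod func_enum {
    /// Hypothetical index; the real value comes from the generated bindings.
    pub const MonitorDepartureTableIndex: usize = 1;
}

const STATUS_NOT_SUPPORTED: i32 = 0xC000_00BBu32 as i32;

/// Mock callee standing in for the framework's implementation.
unsafe extern "C" fn mock_monitor_departure(globals: *const DriverGlobals, monitor: usize) -> i32 {
    if globals.is_null() || monitor == 0 { -1 } else { 0 }
}

pub struct Dispatch<'a> {
    globals: &'a DriverGlobals,
    table: &'a [Option<ErasedFn>],
}

impl Dispatch<'_> {
    pub fn monitor_departure(&self, monitor: usize) -> i32 {
        let index = func_enum::MonitorDepartureTableIndex;
        let Some(erased) = self.table.get(index).copied().flatten() else {
            return STATUS_NOT_SUPPORTED;
        };
        // SAFETY: this slot is only ever populated with a PfnMonitorDeparture.
        let typed = unsafe { transmute::<ErasedFn, PfnMonitorDeparture>(erased) };
        // SAFETY: `globals` is a live reference for the duration of the call.
        unsafe { typed(self.globals, monitor) }
    }
}

/// Builds a two-slot mock table with only the departure slot populated.
pub fn mock_table() -> [Option<ErasedFn>; 2] {
    let typed: PfnMonitorDeparture = mock_monitor_departure;
    // SAFETY: fn pointers share one size; the slot is cast back before calling.
    [None, Some(unsafe { transmute::<PfnMonitorDeparture, ErasedFn>(typed) })]
}
```

The design point is that the unsafe cast lives in exactly one place per DDI, so callers of the wrapper see a safe, typed method.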
Approach (Option 1, recommended): vendor windows-drivers-rs 0.5.1 in-tree (pinned; source staged at
scratchpad/wdr, commit 0e3499d), patched via [patch.crates-io] for just wdk-build + wdk-sys, and
add a first-class ApiSubset::Iddcx that bindgens iddcx/1.10/IddCx.h in an extra pass reusing the
identical bindgen::Builder::wdk_default(config) baseline (so its WDF/DXGI types resolve to, not
redefine, wdk-sys's — type identity by construction). This mirrors wdk-sys's existing gpio/hid/spb/usb
versioned-subpath subsets exactly.
- wdk-build: add ApiSubset::Iddcx, a headers match arm, iddcx_headers() -> ["iddcx/1.10/IddCx.h"] (UMDF-only).
- wdk-sys build.rs: generate_iddcx as a copy of generate_gpio: bindgen_header_contents([Base, Wdf, Iddcx]), (TYPES|VARS).complement(), .allowlist_file("(?i).*iddcx.*"); behind an iddcx feature; add to ENABLED_API_SUBSETS; pub mod iddcx in lib.rs.
- A wdk-iddcx wrapper crate (port of wdf-umdf/src/iddcx.rs): table dispatch via wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize (ModuleConsts const, not the oracle's NewType .0); NTSTATUS is plain i32 in wdk-sys (use wdk_sys::NT_SUCCESS, drop the oracle's newtype).
- Driver build.rs: add link-search to Lib/<sdk>/um/<arch>/iddcx/1.10 (the SDK version that contains iddcx: glob, don't trust max) + static=IddCxStub; hand-declare #[no_mangle] pub static IddMinimumVersionRequired: ULONG = 4;; keep the FORCE_INTEGRITY clear.
Make-or-break — RESOLVED ✅ (CI-green @ 6d8c7a5, run 5548, no fallback). IddCx.h bindgens AND the
generated module compiles inside wdk-sys with WDF type-identity; the #515/#516 header conflict NEVER
materialized. Vendored the published windows-drivers-rs 0.5.1 crates (wdk-build + wdk-sys) under
packaging/windows/drivers/vendor/, [patch.crates-io]'d. The six knobs generate_iddcx actually
needed (each a real gotcha, all CI-proven; the recipe above was close but the codegen/scope details
differed):
- --language=c++: wdk_default parses C; IddCx.h's IDARG_* typedefs need C++ or you get a "must use 'struct' tag" cascade (verified by direct clang on the box: 0 errors as C++, fails as C).
- -DIDD_STUB: table-dispatch mode; skips IddCxFuncEnum.h's #error IDDCX_VERSION_MAJOR is not defined (it lives inside #ifndef IDD_STUB). Do NOT add WDF_STUB: wdk-sys parses wdf.h non-stubbed, and stubbing it only here would desync the shared WDF types (breaking type-identity).
- allowlist_recursively(false) + allowlist_file("(?i).*iddcx.*"), full codegen (no .complement()): emit ONLY IddCx items; WDF/Win types resolve to wdk-sys's via use crate::types::* in src/iddcx.rs. No giant blocklist (Option 2 avoided).
- allowlist_type("_?DXGI_.*" / "IDXGI.*" / "_?OPM_.*" / "_?D3DCOLORVALUE"): emit the non-WDF types wdk-sys doesn't bindgen, locally (absent from crate::types, so non-conflicting). The _? is load-bearing: typedef struct _OPM_X {} OPM_X needs the tag AND the alias (recursively(false) won't pull the tag from the typedef).
- pub type UINT = ::core::ffi::c_uint; in src/iddcx.rs: UINT (unsigned int) is absent from crate::types; covers the top-level struct-field uses.
- translate_enum_integer_types(true): C++ parsing kept UINT as the underlying repr of the DXGI/OPM ModuleConsts enums (pub mod _X { pub type Type = UINT; }), and nested modules can't see the parent UINT. This emits native u32 reprs → self-contained enum modules.
The wrapper note still holds: table dispatch via wdk_sys::iddcx::_IDDFUNCENUM::<Name>TableIndex as usize
(ModuleConsts const, not the oracle's NewType .0); NTSTATUS = plain i32 (wdk_sys::NT_SUCCESS).
Driver build.rs will add the IddCxStub link-search + IddMinimumVersionRequired + keep the
FORCE_INTEGRITY clear. Option 2 stays rejected; the wdf-umdf-sys fallback is unneeded.
NEXT (M1 cont.): port the full ~30-DDI / ~40-struct surface (incl. the HDR *2 DDIs) + the
swap-chain processor + frame transport, with the clean ownership model (DeviceContext-owned state,
EvtCleanupCallback on MonitorContext, single Monitor identity, the owned pf-vdisplay-proto plane).
First gate: a probe linking IddCxStub and calling IddCxDeviceInitConfig/…Initialize/
…AdapterInitAsync (CI = compile+link). On-glass load + IDD-push stream needs the RTX box (ephemeral —
currently down/Proxmox).
14. M1 step 2 — pf-vdisplay driver port plan (2026-06-24, workflow-mapped + critiqued)
Status of the binding (DONE, CI-green): the wdk-sys iddcx binding is proven complete for the whole
driver, not just init. wdk-probe/src/iddcx_surface_assert.rs (commit ae803b2) CI-asserts every *2/HDR
struct (IDDCX_TARGET_MODE2/PATH2/METADATA2, IDARG_*RELEASEANDACQUIREBUFFER2 — which embed
DISPLAYCONFIG_*/LUID, both of which resolve from crate::types — no allowlist gap), all 14 inbound
PFN_IDD_CX_* callbacks, the .Size machinery (IddStructures/IddStructureCount/
IddClientVersionHigherThanFramework/_IDDSTRUCTENUM::INDEX_* — so IDD_STRUCTURE_SIZE! is portable), and
IDDCX_ADAPTER_FLAGS::…CAN_PROCESS_FP16 + IDDCX_TARGET_CAPS::…HIGH_COLOR_SPACE. ModuleConsts module
naming: the func/struct enums are _IDDFUNCENUM/_IDDSTRUCTENUM (underscored tag), but the flag/cap enums
are IDDCX_ADAPTER_FLAGS/IDDCX_TARGET_CAPS (no underscore).
DDIs to wrap (11 — graduate wdk-probe/src/iddcx_rt.rs → a wdk-iddcx crate)
DeviceInitConfig, DeviceInitialize, AdapterInitAsync (done), MonitorCreate, MonitorArrival,
MonitorDeparture, AdapterSetRenderAdapter, SwapChainSetDevice (other_is_error; 0x887A0026→retry),
SwapChainReleaseAndAcquireBuffer2 (HDR variant only; other_is_error; E_PENDING 0x8000000A → wait on
the surface event), SwapChainFinishedProcessingFrame. Drop the v1 ReleaseAndAcquireBuffer (adapter
always sets FP16). Defer the hardware-cursor DDIs (cursor baked into video).
Callbacks (15 in IDD_CX_CLIENT_CONFIG; *2 mandatory because FP16)
parse_monitor_description (+2), monitor_query_target_modes (+2), adapter_commit_modes (+2),
adapter_init_finished (stash IDDCX_ADAPTER + start watchdog), monitor_get_default_modes (→NOT_IMPLEMENTED,
we always carry EDID), query_target_info (→HIGH_COLOR_SPACE), set_gamma_ramp (accept-stub — WITHOUT it
the adapter fails to init), set_default_hdr_metadata (accept-stub) — the last three are mandatory under
FP16, assign_swap_chain, unassign_swap_chain, device_io_control (the pf-vdisplay-proto control plane).
Plus EvtDeviceD0Entry (adapter created HERE, not in DeviceAdd) and two EvtCleanupCallbacks.
State model (the rewrite's core change)
DeviceContext OWNS all state — IDDCX_ADAPTER, session_id-keyed monitor map, watchdog, the per-render-LUID
Direct3DDevice pool — replacing the oracle's process globals. Reachable from BOTH the WDFDEVICE (strong)
and the IDDCX_ADAPTER object (the adapter-side callbacks need it). MonitorContext owns the
SwapChainProcessor + target_id; wire EvtCleanupCallback on the IDDCX_MONITOR object so RAII Drop
joins the worker thread + frees D3D (the oracle lacked this → the dominant reconnect leak). Single Monitor
identity keyed by session_id (collapses the oracle's 3-way EDID-serial/map/stamp desync that caused the
target_id=0 recreate bug); assign_swap_chain reads target_id from the context, never a map lookup.
The HOST still owns the control-device handle, the linger/reuse state machine, and ALL Global\ shared
objects (created D:(A;;GA;;;WD)); the driver only OPENS them.
Frame transport (single-source on pf_vdisplay_proto::frame::*)
Acquire via ReleaseAndAcquireBuffer2{AcquireSystemMemoryBuffer=0} → GPU ID3D11Texture2D; borrow
out.MetaData.pSurface with IDXGIResource::from_raw_borrowed (do NOT steal IddCx's refcount), publish
BEFORE FinishedProcessingFrame. Ring = RING_LEN(6) keyed-mutex shared textures opened by name
(frame::{header_name,event_name,texture_name}); per-frame: GetDesc format-guard (drop on FP16↔BGRA
mismatch), AcquireSync(0,0ms), CopyResource, ReleaseSync(0), store FrameToken{gen,seq,slot}.pack()
(Release), SetEvent. All-slots-busy → drop, never block. is_stale() (header.generation Acquire) → reattach
on host ring recreate. Write DRV_STATUS_OPENED + render LUID into the header. Drop the old DebugBlock +
the locally-duplicated header/MAGIC/name consts.
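The publish step above (pack a FrameToken, store with Release, drop when every slot is busy) can be sketched as follows. The bit layout (32-bit generation, 24-bit seq, 8-bit slot) and the helper names are assumptions for illustration; the authoritative layout is whatever pf_vdisplay_proto::frame defines. The field is spelled generation here because gen is a reserved word in newer Rust editions.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

pub const RING_LEN: usize = 6;

/// Hypothetical token layout: [generation:32][seq:24][slot:8].
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct FrameToken {
    pub generation: u32,
    pub seq: u32,
    pub slot: u8,
}

impl FrameToken {
    pub fn pack(self) -> u64 {
        (u64::from(self.generation) << 32)
            | (u64::from(self.seq & 0x00FF_FFFF) << 8)
            | u64::from(self.slot)
    }

    pub fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 32) as u32,
            seq: ((raw >> 8) & 0x00FF_FFFF) as u32,
            slot: raw as u8,
        }
    }
}

/// Producer side: the Release store makes the slot contents written before
/// it visible to a consumer that Acquire-loads the token.
pub fn publish(cell: &AtomicU64, token: FrameToken) {
    cell.store(token.pack(), Ordering::Release);
}

/// Consumer side: Acquire-load the most recently published token.
pub fn load_latest(cell: &AtomicU64) -> FrameToken {
    FrameToken::unpack(cell.load(Ordering::Acquire))
}

/// Picks the first non-busy slot starting at `start`; `None` means every
/// slot is busy and the frame should be dropped rather than blocking.
pub fn pick_slot(busy: &[bool; RING_LEN], start: usize) -> Option<usize> {
    (0..RING_LEN)
        .map(|i| (start + i) % RING_LEN)
        .find(|&slot| !busy[slot])
}
```

Packing the whole token into one u64 lets the consumer observe generation, sequence, and slot atomically with a single load, which is what makes the stale-ring check cheap.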
Implementation checklist (each step CI- or box-gated)
0. workspace pf-vdisplay (cdylib) + wdk-iddcx members. STEP-0 gate must pull in std::thread + OwnedHandle (critique: prove std links under the UMDF toolchain here, not at STEP 5). CI.
1. graduate iddcx_rt.rs → wdk-iddcx (11 DDIs + is_nt_error/other_is_error) + re-export the inbound PFN types. CI link.
1.5. (critique add) the surface-assert (DONE @ae803b2) lives on so the full PFN/*2/DISPLAYCONFIG surface stays a CI gate.
2. DriverEntry + driver_add: full IDD_CX_CLIENT_CONFIG (15 callbacks as stubs) + DeviceInitConfig + WdfDeviceCreate (+cleanup) + CreateDeviceInterface (PF_VDISPLAY_INTERFACE_GUID) + DeviceInitialize + D0Entry stub; salvage edid.rs verbatim. Resolve .Size via IDD_STRUCTURE_SIZE! (machinery confirmed present). CI link + FORCE_INTEGRITY clear.
3. DeviceContext + WDF_DECLARE_CONTEXT_TYPE Arc blob; init_adapter in D0Entry (caps+FP16) → AdapterInitAsync; the *2 mode DDIs + query_target_info + gamma/hdr accept-stubs. Box gate: loads under Secure Boot, enumerates as IddCx adapter, Status OK (no "Failed to get adapter").
4. control plane (GET_INFO version handshake, where the host MUST assert protocol_version; ADD/REMOVE/SET_RENDER_ADAPTER/PING/CLEAR_ALL) + create_monitor + real mode DDIs + watchdog + MONITOR_OP_LOCK; switch host sudovda.rs/idd_push.rs to pf_vdisplay_proto (GUID e5bcc234→70667664, IOCTL 0x800→0x900, GUID-key→session_id) in lockstep. CI (host build) + box (monitor appears at WxH@Hz).
5. Direct3DDevice + assign/unassign + SwapChainProcessor (worker thread, SetDevice 60×@50ms single-borrow retry, top-of-loop terminate, Buffer2 acquire, from_raw_borrowed) WITHOUT publisher; wire monitor EvtCleanupCallback. Box: swap-chain assigns, acquire loop runs, RAII teardown (no thread/VRAM leak). Critique: instrument that MonitorContext::Drop actually RAN; if the monitor-object cleanup callback does not fire, keep the oracle's explicit free-before-departure path as the fallback.
6. FramePublisher on pf_vdisplay_proto::frame::* + keyed-mutex RAII guard + OwnedHandle/ShmView; wire into run_core. Box: full IDD-push glass-to-glass, A/B vs the shipping driver. Critique: add a BLOCKING secure-desktop gate here: lock (Win+L) + UAC with serve in the console session / driver in Session 0, confirm frames keep flowing AND input reaches the desktop; until it passes, do NOT delete the WGC-relay/DDA secure path.
7. HDR ring-recreate + repeated session recreate (confirm the recreate-crash is gone). Critique: define the failure branch. If recreate isn't stable, keep IDD_PERSIST + state that mid-stream Reconfigure stays unsupported on Windows IDD-push (host rejects, as today) rather than crashing; keep max_concurrent=1. Specify the concurrent-monitor D3D model before enabling >1 (two worker threads must not share one SINGLETHREADED immediate context; give each monitor its own device or a deferred/multithreaded context).
8. unsafe-reduction pass (one audited SendPtr/ThreadBound; per-site // SAFETY; AcquiredSurface<'_> + KeyedMutexGuard RAII so the hot loop has zero raw Finish/ReleaseSync) + delete the old packaging/windows/vdisplay-driver/ tree only after the secure-desktop gate (step 6) passes. CI clippy -D warnings + final box A/B.
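The step-5 retry policy (bounded SetDevice attempts, retry only on 0x887A0026, terminate checked at the top of every iteration) can be sketched as a small helper. The function name, the outcome enum, and the choice to model status codes as u32 are assumptions for this sketch; in the driver the call would go through the wdk-iddcx wrapper.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Duration;

/// The status the plan marks as "retry" for SwapChainSetDevice.
const RETRY_STATUS: u32 = 0x887A_0026;

#[derive(Debug, PartialEq, Eq)]
pub enum SetDeviceOutcome {
    Assigned { attempts: u32 },
    Terminated,
    Failed(u32),
    Exhausted,
}

/// Calls `set_device` up to `max_attempts` times, sleeping `delay` between
/// retryable failures. `terminate` is checked at the top of every iteration
/// so teardown is never stuck behind the retry loop.
pub fn set_device_with_retry(
    mut set_device: impl FnMut() -> u32,
    terminate: &AtomicBool,
    max_attempts: u32,
    delay: Duration,
) -> SetDeviceOutcome {
    for attempt in 1..=max_attempts {
        if terminate.load(Ordering::Acquire) {
            return SetDeviceOutcome::Terminated;
        }
        match set_device() {
            0 => return SetDeviceOutcome::Assigned { attempts: attempt },
            RETRY_STATUS => std::thread::sleep(delay),
            other => return SetDeviceOutcome::Failed(other),
        }
    }
    SetDeviceOutcome::Exhausted
}
```

With the plan's numbers this would be invoked with max_attempts = 60 and a 50 ms delay, bounding the wait to roughly three seconds before the worker gives up.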
Critique verdict + the big risk
Plan is implementation-ready once the 4 CI-checkable unknowns are gates (3 now resolved by the surface-assert + .Size machinery presence; std-under-UMDF is the STEP-0 gate). SINGLE BIGGEST RISK: the secure-desktop claim. The plan retires the proven two-process WGC relay + DDA on the unproven assertion that one IddPushCapturer captures the lock/UAC secure desktop directly (IDD-push is opt-in today behind PUNKTFUNK_IDD_PUSH). Make it a blocking on-glass gate (step 6) and keep the WGC relay recoverable for one release. Other defined-failure-branch items: monitor EvtCleanupCallback firing, IDD_PERSIST/Reconfigure, concurrent-monitor device sharing, host↔driver protocol_version lockstep.