feat(windows-drivers): host-gone watchdog, SET_RENDER_ADAPTER, log gate, mode bounds

Audit §4.1: implement the host-gone watchdog, which was dead code (WATCHDOG_PINGS was bumped but never sampled, and no thread existed). Every IOCTL now bumps a liveness counter; a watchdog thread calls reap_orphaned() on the monitors (with a created_at grace period) if no IOCTL arrives within WATCHDOG_TIMEOUT_S, so a crashed or TerminateProcess'd host no longer leaves its virtual monitor, swap-chain worker, and pooled D3D device wedged until the next CLEAR_ALL. Also removes the false 'watchdog thread' comments.
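
The sampling rule described above can be restated without a thread, which makes it easy to unit-test. This is only a sketch: the `Liveness` type and its `sample` method are hypothetical names that do not exist in the driver, and the caller is assumed to supply the counter value, the current time, and whether any monitors exist.

```rust
use std::time::{Duration, Instant};

/// Hypothetical, thread-free restatement of the watchdog's sampling rule.
struct Liveness {
    last: u64,            // counter value seen on the previous sample
    last_change: Instant, // when the counter last moved (or when we last reaped)
}

impl Liveness {
    fn new(counter: u64, now: Instant) -> Self {
        Self { last: counter, last_change: now }
    }

    /// Returns true when the host should be treated as gone and monitors reaped.
    fn sample(&mut self, counter: u64, now: Instant, timeout: Duration, has_monitors: bool) -> bool {
        if counter != self.last {
            // An IOCTL arrived since the previous sample: the host is alive.
            self.last = counter;
            self.last_change = now;
            return false;
        }
        if now.duration_since(self.last_change) >= timeout && has_monitors {
            // Restart the window so the reap is not repeated on every tick.
            self.last_change = now;
            return true;
        }
        false
    }
}
```

With a 10 s timeout, a counter that last moved at t = 3 s would first trip the rule on a sample taken at or after t = 13 s, and the sample after that would not trip again.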

Audit §4.2: implement SET_RENDER_ADAPTER (was STATUS_NOT_IMPLEMENTED) via IddCxAdapterSetRenderAdapter, so the host can pin the IDD render to the NVENC GPU on a hybrid iGPU+dGPU box. Without the pin, the OS-picked iGPU makes the host ring textures un-openable, which surfaces as DRV_STATUS_TEX_FAIL.

Audit §4.4: gate the world-writable C:\Users\Public\pfvd-driver.log behind debug builds / PFVD_DEBUG_LOG (a release build never writes it).

Audit §4.5: bounds-check the requested mode in IOCTL_ADD; compute display_info clock_rate in u64 + saturate (the old u32 refresh*(h+4)^2 overflowed/aborted the mode DDI for large modes).
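
A minimal sketch of the widened clock-rate arithmetic, taking the `refresh*(h+4)^2` formula literally from the line above. The function name is hypothetical, and saturating to `u32::MAX` is an assumption about the width of the destination field; the real code lives in a file not shown in this diff.

```rust
/// Hypothetical helper: compute refresh * (height + 4)^2 without overflow.
/// Assumes the destination field is a u32, so the result saturates at u32::MAX.
fn clock_rate_saturating(refresh_hz: u32, height: u32) -> u32 {
    // Widen before adding so `height + 4` cannot wrap.
    let side = u64::from(height) + 4;
    // Both multiplications saturate, so extreme inputs clamp instead of panicking.
    let rate = side.saturating_mul(side).saturating_mul(u64::from(refresh_hz));
    // Narrow back to u32, clamping anything that does not fit.
    u32::try_from(rate).unwrap_or(u32::MAX)
}
```

For the largest mode that `valid_mode` accepts (height 16384 at 1000 Hz) the product exceeds the `u32` range, so the sketch clamps it rather than wrapping or aborting.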

Verified: driver workspace builds clean on the RTX box (WDK 26100 + LLVM 21.1.2, MSVC). On-glass functional validation of the watchdog/render-pin is a follow-up (needs a driver reinstall + session).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 12:49:49 +00:00
parent 95dcef3515
commit 0a7ae5ef09
5 changed files with 177 additions and 18 deletions
@@ -3,25 +3,76 @@
//! (watchdog keepalive), ADD/REMOVE/CLEAR_ALL (virtual monitors), and SET_RENDER_ADAPTER (next). Every
//! path completes the `WDFREQUEST` exactly once (the `EVT_IDD_CX_DEVICE_IO_CONTROL` shape returns `()`).
use core::sync::atomic::{AtomicU64, Ordering};
use core::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::time::{Duration, Instant};
use pf_vdisplay_proto::control;
use wdk_iddcx::nt_success;
use wdk_sys::{NTSTATUS, WDFREQUEST, call_unsafe_wdf_function_binding};
use crate::{STATUS_INVALID_PARAMETER, STATUS_NOT_FOUND, STATUS_NOT_IMPLEMENTED, STATUS_SUCCESS};
use crate::{STATUS_INVALID_PARAMETER, STATUS_NOT_FOUND, STATUS_SUCCESS};
/// The host must PING within this window or the watchdog reaps all monitors (STEP 4: the watchdog thread).
/// The host must send an IOCTL within this window (it PINGs on a `timeout/3` timer) or the watchdog
/// treats it as gone and reaps every monitor. Reported to the host via [`control::IOCTL_GET_INFO`].
const WATCHDOG_TIMEOUT_S: u32 = 10;
/// Keepalive counter — PING bumps it; STEP 4's watchdog thread samples it to detect a gone host.
/// Host-liveness counter — EVERY inbound IOCTL bumps it; [`start_watchdog`]'s thread samples it.
static WATCHDOG_PINGS: AtomicU64 = AtomicU64::new(0);
/// Spawns the watchdog thread exactly once (idempotent across re-entrant adapter inits).
static WATCHDOG_STARTED: AtomicBool = AtomicBool::new(false);
/// Start the host-liveness watchdog (once, from `adapter_init_finished`).
///
/// Previously [`WATCHDOG_PINGS`] was bumped but NEVER sampled (no thread existed) — so a host that died
/// without a cooperative REMOVE (crash / `TerminateProcess`) left its virtual monitor + swap-chain
/// worker + pooled D3D device wedged in WUDFHost until the next host start's CLEAR_ALL, and a
/// not-restarted host left the orphan monitor in the desktop topology indefinitely
/// (`docs/windows-host-rewrite-audit.md` §4.1). This thread closes that: if no IOCTL arrives for
/// `WATCHDOG_TIMEOUT_S` while monitors exist, it departs them all.
///
/// (A WDF `EvtFileClose` on the control handle would be more immediate — the plan's preferred §3.4
/// option — but the polling watchdog matches the proven oracle and needs no IddCx file-object plumbing.)
pub fn start_watchdog() {
if WATCHDOG_STARTED.swap(true, Ordering::SeqCst) {
return;
}
let tick = Duration::from_secs(u64::from((WATCHDOG_TIMEOUT_S / 3).max(1)));
let timeout = Duration::from_secs(u64::from(WATCHDOG_TIMEOUT_S));
std::thread::spawn(move || {
let mut last = WATCHDOG_PINGS.load(Ordering::Relaxed);
let mut last_change = Instant::now();
loop {
std::thread::sleep(tick);
let cur = WATCHDOG_PINGS.load(Ordering::Relaxed);
if cur != last {
last = cur;
last_change = Instant::now();
continue;
}
// No IOCTL since `last_change`. A live host PINGs every `timeout/3`, so this only trips once
// the host is truly gone; only reap when there's something to reap.
if last_change.elapsed() >= timeout && crate::monitor::has_monitors() {
let n = crate::monitor::reap_orphaned(Duration::from_secs(3));
if n > 0 {
dbglog!(
"[pf-vd] watchdog: no host IOCTL in {WATCHDOG_TIMEOUT_S}s — host gone, departed {n} monitor(s)"
);
}
last_change = Instant::now(); // don't re-reap every tick
}
}
});
}
/// Dispatch one control IOCTL and complete the request.
///
/// # Safety
/// `request` is the framework-provided `WDFREQUEST` for an `EvtIddCxDeviceIoControl` call.
pub unsafe fn dispatch(request: WDFREQUEST, ioctl_code: u32) {
// Every inbound IOCTL is host liveness (the host PINGs on a timer, plus ADD/REMOVE/GET_INFO/…) —
// bump the watchdog at the top so it only fires once the host has gone truly silent. See
// [`start_watchdog`].
WATCHDOG_PINGS.fetch_add(1, Ordering::Relaxed);
match ioctl_code {
control::IOCTL_GET_INFO => {
let reply = control::InfoReply {
@@ -31,10 +82,7 @@ pub unsafe fn dispatch(request: WDFREQUEST, ioctl_code: u32) {
// SAFETY: `request` is the framework WDFREQUEST.
unsafe { write_output_complete(request, &reply) };
}
control::IOCTL_PING => {
WATCHDOG_PINGS.fetch_add(1, Ordering::Relaxed);
complete(request, STATUS_SUCCESS);
}
control::IOCTL_PING => complete(request, STATUS_SUCCESS),
// SAFETY: `request` is the framework WDFREQUEST.
control::IOCTL_ADD => unsafe { add(request) },
// SAFETY: `request` is the framework WDFREQUEST.
@@ -43,12 +91,34 @@ pub unsafe fn dispatch(request: WDFREQUEST, ioctl_code: u32) {
crate::monitor::clear_all();
complete(request, STATUS_SUCCESS);
}
// SET_RENDER_ADAPTER (hybrid-GPU render pin): STEP 4 (next).
control::IOCTL_SET_RENDER_ADAPTER => complete(request, STATUS_NOT_IMPLEMENTED),
// SAFETY: `request` is the framework WDFREQUEST.
control::IOCTL_SET_RENDER_ADAPTER => unsafe { set_render_adapter(request) },
_ => complete(request, STATUS_NOT_FOUND),
}
}
/// Sanity bounds for a requested mode — generous (covers any real client) but rejects zero/absurd
/// values that would otherwise feed the EDID/mode math unchecked.
fn valid_mode(width: u32, height: u32, refresh_hz: u32) -> bool {
(1..=16384).contains(&width)
&& (1..=16384).contains(&height)
&& (1..=1000).contains(&refresh_hz)
}
/// `IOCTL_SET_RENDER_ADAPTER`: pin the IddCx render adapter (hybrid-GPU IDD-push).
///
/// # Safety
/// `request` is the framework `WDFREQUEST`.
unsafe fn set_render_adapter(request: WDFREQUEST) {
// SAFETY: `request` is the framework WDFREQUEST.
let Some(req) = (unsafe { read_input::<control::SetRenderAdapterRequest>(request) }) else {
complete(request, STATUS_INVALID_PARAMETER);
return;
};
let st = crate::adapter::set_render_adapter(req.luid_low, req.luid_high);
complete(request, st);
}
/// `IOCTL_ADD`: create a virtual monitor at the requested mode → reply with the OS target id + LUID.
///
/// # Safety
@@ -59,6 +129,10 @@ unsafe fn add(request: WDFREQUEST) {
complete(request, STATUS_INVALID_PARAMETER);
return;
};
if !valid_mode(req.width, req.height, req.refresh_hz) {
complete(request, STATUS_INVALID_PARAMETER);
return;
}
let Some((target_id, luid_low, luid_high)) =
crate::monitor::create_monitor(req.session_id, req.width, req.height, req.refresh_hz)
else {