fix(windows-host): claim the vdisplay single-instance guard eagerly at serve startup

Observed on-glass: with the lazy (first-session) claim, a second host started
while the freshly restarted service sat idle could win the mutex and ADD a
monitor on the real driver, which inverts the intended priority. The claim is
now a process-global, retryable slot (a failed claim is not memoized, so it
heals once the other instance exits), and `serve` claims it before any client
can connect; ensure_device keeps the lazy claim for standalone
punktfunk1-host runs.
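
A minimal, portable sketch of the retryable-slot idea described above. It is
not the code in this commit: `Guard`, `OTHER_INSTANCE_RUNNING`, `acquire`,
`claim` and `demo` are hypothetical names, and the cross-process mutex is
simulated with an in-process flag so the sketch runs on any platform.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Mutex;

/// Hypothetical stand-in for the OS mutex handle held for the process lifetime.
#[derive(Debug)]
struct Guard;

/// Assumption for this sketch: simulates "another instance holds the guard".
static OTHER_INSTANCE_RUNNING: AtomicBool = AtomicBool::new(false);

/// Process-global slot: `None` until a claim succeeds.
static SLOT: Mutex<Option<Guard>> = Mutex::new(None);

/// Hypothetical acquire step: fails while the other instance is running.
fn acquire() -> Result<Guard, String> {
    if OTHER_INSTANCE_RUNNING.load(Ordering::SeqCst) {
        Err("guard is held by another instance".to_string())
    } else {
        Ok(Guard)
    }
}

/// Idempotent claim. Only a success is stored, so a failed attempt leaves the
/// slot empty and the next call tries again.
fn claim() -> Result<(), String> {
    let mut slot = SLOT.lock().unwrap();
    if slot.is_none() {
        *slot = Some(acquire()?);
    }
    Ok(())
}

/// Walks through the three cases: lose, heal, and already held.
fn demo() {
    OTHER_INSTANCE_RUNNING.store(true, Ordering::SeqCst);
    assert!(claim().is_err()); // the other instance wins; nothing is memoized

    OTHER_INSTANCE_RUNNING.store(false, Ordering::SeqCst);
    assert!(claim().is_ok()); // the other instance exited, so the retry succeeds

    OTHER_INSTANCE_RUNNING.store(true, Ordering::SeqCst);
    assert!(claim().is_ok()); // already held: no second acquire is attempted
}
```

Storing only the success case is what makes the claim self-healing; caching
the error instead would pin the process in the failed state until restart.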

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 18:57:47 +00:00
parent 7e31020c1c
commit 3f33ed30ae
2 changed files with 31 additions and 6 deletions
@@ -149,9 +149,6 @@ struct DeviceSlot {
/// `CLEAR_ALL` (crashed-host orphan reap) runs only on the FIRST open of the process; a reopen
/// races sessions this process still considers live and must not raze them.
opened_once: bool,
/// The cross-process single-instance mutex (`Global\punktfunk-vdisplay-manager`), acquired on
/// the first open and held — never released — for the process lifetime.
instance_guard: Option<OwnedHandle>,
}
/// The host-lifetime virtual-display manager: the single owner of the monitor lifecycle.
@@ -216,6 +213,31 @@ pub(crate) fn control_device_handle() -> Option<HANDLE> {
/// next use reopens. The root `windows` error survives anyhow `.context` chains via `downcast_ref`.
/// NOTE: 0x80070490 (ERROR_NOT_FOUND, the ADD slot-exhaustion wedge) is deliberately NOT here — it
/// has its own reap-and-retry handling and the device is alive when it fires.
/// The held single-instance mutex (`None` until claimed). Process-global — not per-manager — so the
/// serve path can claim it EAGERLY at startup, before any session opens the backend: the claim is
/// first-comer-wins, and a lazily-claiming service could otherwise lose its own machine's driver to
/// a stray second host started while the service sat idle (observed on-glass). A failed claim is NOT
/// memoized: once the other instance exits, the next attempt succeeds.
static INSTANCE: Mutex<Option<OwnedHandle>> = Mutex::new(None);
/// Claim (or re-verify) the cross-process single-instance guard. Idempotent; retries after failure.
fn claim_instance() -> Result<()> {
    let mut g = INSTANCE.lock().unwrap();
    if g.is_none() {
        *g = Some(acquire_single_instance()?);
    }
    Ok(())
}
/// Eager startup claim for the serve/service path (Windows): reserves this process as THE
/// pf-vdisplay manager before any client connects. Failure is a loud warning, not fatal — sessions
/// then fail with the same clear in-use error until the other instance exits.
pub(crate) fn claim_instance_eagerly() {
    if let Err(e) = claim_instance() {
        tracing::warn!("pf-vdisplay single-instance claim failed at startup: {e:#}");
    }
}
/// The cross-process single-instance guard for pf-vdisplay management. A SECOND host process's
/// first device open used to fire `IOCTL_CLEAR_ALL` and raze the live host's monitors mid-stream —
/// an admin footgun (run `punktfunk-host serve` while the SCM service streams), masked afterwards
@@ -303,9 +325,7 @@ impl VirtualDisplayManager {
return Ok(HANDLE(d.as_raw_handle()));
}
let reap = !slot.opened_once;
if slot.instance_guard.is_none() {
slot.instance_guard = Some(acquire_single_instance()?);
}
claim_instance()?;
// SAFETY: `VdisplayDriver::open` is `unsafe` only because it issues SetupAPI + `DeviceIoControl`
// FFI in the caller's apartment; the `device` mutex (held here) serializes it, so there is no
// concurrent open. `open` has no handle precondition to uphold, and the `OwnedHandle` it