fix(windows-drivers): reclaim pf-vdisplay monitor ids on REMOVE (P1, slot-reclaim)

The driver assigned each virtual monitor a monotonically-increasing NEXT_ID used
as the EDID serial / IddCx ConnectorIndex / container GUID, and never reclaimed
it on REMOVE. Under sustained ADD/REMOVE churn the connector index kept climbing,
so IddCx/PnP allocated a NEW OS target slot every cycle and orphaned the old one
(ghost "Generic Monitor (punktfunk)" nodes) until the adapter's target capacity
was exhausted and ADD failed 0x80070490 ERROR_NOT_FOUND.

Fix: `create_monitor` now allocates the LOWEST free id (`alloc_monitor_id`,
computed under the MONITOR_MODES lock with the push) instead of a counter, so a
departed monitor's id is reclaimed and a fresh ADD reuses its target slot rather
than orphaning it. With <= N live monitors the id stays bounded to 1..=N+1.
Deleted the now-unused NEXT_ID + AtomicU32/Ordering import.

CI-compile-gated only — the wedge reproduces solely under sustained churn on the
RTX box, so this needs an on-glass reconnect-storm A/B to confirm (box is
ephemeral/down). Marked on-glass-pending in windows-host-rewrite.md §4; keep
reset-pf-vdisplay.ps1 as the recovery until validated. NOT to be relied on (or
merged to main) until that A/B passes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-25 22:11:36 +00:00
parent 0bf3984614
commit 8cde8621ce
2 changed files with 32 additions and 14 deletions
+8 -4
View File
@@ -211,10 +211,14 @@ These are expensive empirical wins; keep them intact when touching the code:
2. **Make IDD-push the default** — today it is gated behind `PUNKTFUNK_IDD_PUSH` (`config.rs` default
`false`); deployment sets it in `host.env`. Flip the code default (with the WGC/DDA fallback already in
place) so a fresh install runs the validated path, or document the `host.env` requirement explicitly.
3. **pf-vdisplay slot reclaim on REMOVE** (driver robustness) — sustained ADD/REMOVE churn wedges the
driver (`ADD → 0x80070490 ERROR_NOT_FOUND`) because IddCx monitor slots aren't reclaimed (ghost nodes
accumulate). Workaround today: `packaging/windows/reset-pf-vdisplay.ps1`. Real fix in `control.rs`/
`monitor.rs`. On-glass-gated.
3. **pf-vdisplay slot reclaim on REMOVE** (driver robustness) — 🟡 **fix landed, on-glass-validation
pending.** Sustained ADD/REMOVE churn wedged the driver (`ADD → 0x80070490 ERROR_NOT_FOUND`) because the
monitor id (EDID serial / `ConnectorIndex` / container GUID) was a **monotonic** `NEXT_ID`, never
reclaimed → IddCx accumulated a new OS target slot per cycle until exhaustion. `monitor.rs` now allocates
the **lowest free id** (`alloc_monitor_id`), reused on REMOVE, so a fresh ADD reuses the departed
monitor's target slot instead of orphaning it. CI-compile-gated; the wedge only reproduces under
sustained churn on the RTX box, so this needs an **on-glass reconnect-storm A/B** to confirm (the box is
ephemeral). Keep `packaging/windows/reset-pf-vdisplay.ps1` as the recovery until validated.
**P2 — hygiene / architecture completion**
4. **D1-host — host-crate P0 lints.** Add `#![deny(unsafe_op_in_unsafe_fn)]` +