docs(windows-rewrite): mark game-capture bug FIXED + bring rewrite status current (§15)
The fullscreen-game-breaks-IDD-push bug is FIXED by the resolution-listening recovery (c87bfe0: the 250ms poll now follows the display's actual resolution and recreates the ring on any descriptor change, recover-or-drop), backed by open-time first-frame DDA failover (f98ab07) and the driver publish() width/height guard + flushed logging (789ad49). No protocol bump was needed — the host reads the real resolution straight from Windows (CCD/GDI), so the bug doc's Stage-1 composing capturer + Stage-2 protocol bump were unnecessary. Bug doc marked FIXED with a Resolution section; the staged plan kept as superseded record.

windows-host-rewrite.md: the progress log was stale (ended at "M1 cont."). Added §15 Current status — the driver STEP 0-8 port landed on main on-glass HDR-validated; the host was refactored *in place* via windows-host-goal1 (not the §10 greenfield rebuild); §2.5 ownership model resolved the swap-chain-reuse / monitor-leak open item; iddcx + /INTEGRITYCHECK CI-green. Remaining: the secure-desktop on-glass gate (the single biggest unproven claim), M4 gamepad-driver migration, M5/M6 cleanup, and the pf-vdisplay slot-reclaim driver fix. Top Status flipped proposed → largely implemented.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,260 @@
# pf-vdisplay: fullscreen game breaks video (IDD-push capture) — issue analysis

> **Status: FIXED ✅ (2026-06-25).** Resolved by the **resolution-listening recovery** — see
> [Resolution](#resolution-fixed-2026-06-25) below. The investigation that follows is kept as the record
> of how it was diagnosed. Companion to [`windows-host-rewrite.md`](./windows-host-rewrite.md).

## Resolution (fixed 2026-06-25)

The fix landed as the **recover-or-drop** design (host-only, **no protocol bump**), *not* the
composing-capturer mid-session failover originally sketched in
[Recommended fix](#recommended-fix-staged):

- **`c87bfe0` — IDD-push *recovers* from a game mode-set (the "resolution-listening" work).** The ring now
  **tracks the display's actual mode**. At open it is sized to the display's real resolution (new
  `win_display::active_resolution`, CCD/GDI). Mid-session the 250 ms poll — previously HDR-toggle-only —
  now also **follows the active resolution**; on *any* descriptor change (size **or** HDR) it recreates the
  ring at the new mode (`recreate_ring` generalized to a new size), the driver re-attaches via the existing
  `is_stale()` path, and frames resume at the game's mode. **No freeze, no reconnect.** If a change is
  genuinely unrecoverable (e.g. an exclusive flip the host can't follow) a `recovering_since` clock fires
  after 3 s and `try_consume` drops the session cleanly so the client reconnects, instead of freezing
  forever. A pure idle desktop (no mode change) never triggers it.
- **`f98ab07` — open-time first-frame failover to DDA (GB1 pt 1).** `wait_for_attach` now requires the
  driver to publish a *first frame* (not just `DRV_STATUS_OPENED`); a display the driver attaches to but
  whose frames its `publish()` guard rejects now fails `open()` within ~4 s → `capture.rs` falls back to
  DDA → the game is captured + visible after a reconnect. A normal/idle open (frame within ~1 s) is never
  false-failed, and DDA is itself a working path, so even a false positive degrades gracefully.
- **`789ad49` — driver `publish()` width/height guard + a process-lifetime flushed log appender** (GB3
  groundwork): drops a surface whose descriptor no longer matches the host ring (`CopyResource` needs
  matching dims too, else garbage) and logs the actual descriptor once per mismatch episode, so the
  swap-chain WORKER-thread lines land (closing the bug-doc **S3** observability gap). Needs a driver
  rebuild + re-vendor to deploy (separate from the host-only GB1 fix).
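
The recover-or-drop behaviour of `c87bfe0` can be read as a small state machine. The sketch below is an illustration under stated assumptions, not the host's code: `RingDescriptor`, `PollAction` and `Recovery` are hypothetical names, the caller hands in the display's actual descriptor (the host reads it via `win_display::active_resolution`), and the drop decision is folded into the poll for brevity, whereas the text above places it in `try_consume`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical descriptor: what the ring is currently sized and formatted for.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct RingDescriptor {
    pub width: u32,
    pub height: u32,
    pub hdr: bool,
}

/// What one poll tick decides (illustrative names, not the host's).
#[derive(Debug, PartialEq, Eq)]
pub enum PollAction {
    /// Display still matches the ring and no recovery has timed out.
    Keep,
    /// Display changed (size or HDR): recreate the ring at this descriptor.
    Recreate(RingDescriptor),
    /// A change happened and no fresh frame arrived within the budget.
    DropSession,
}

/// The 3 s recovery budget quoted above.
pub const RECOVERY_TIMEOUT: Duration = Duration::from_secs(3);

pub struct Recovery {
    ring: RingDescriptor,
    recovering_since: Option<Instant>,
}

impl Recovery {
    pub fn new(ring: RingDescriptor) -> Self {
        Self { ring, recovering_since: None }
    }

    /// One poll tick. `actual` is the display's real descriptor, `fresh_frame`
    /// says whether the driver published a frame since the previous tick.
    pub fn poll(&mut self, actual: RingDescriptor, fresh_frame: bool, now: Instant) -> PollAction {
        if fresh_frame {
            // Frames are flowing again: recovery succeeded, stop the clock.
            self.recovering_since = None;
        }
        if actual != self.ring {
            self.ring = actual;
            if self.recovering_since.is_none() {
                self.recovering_since = Some(now);
            }
            return PollAction::Recreate(actual);
        }
        match self.recovering_since {
            Some(since) if now.duration_since(since) >= RECOVERY_TIMEOUT => PollAction::DropSession,
            // With no mode change the clock is never started, so an idle desktop stays here.
            _ => PollAction::Keep,
        }
    }
}
```

Passing `now` in, rather than calling `Instant::now()` inside, keeps the timeout logic testable without sleeping.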

**Why this instead of the composing capturer (original Stage 1):** the host reads the display's real
resolution straight from Windows (CCD/GDI), so it doesn't need the driver to report it over a new
`SharedHeader` field — the original **Stage 2's protocol bump is unnecessary**. In-place recovery keeps the
fast IDD-push (zero-copy) path live *through* a game mode-set instead of permanently demoting to DDA;
open-time DDA failover (`f98ab07`) covers the "display already in a broken mode at connect" case.
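
The open-time failover (`f98ab07`) comes down to "wait a bounded time for a first frame, otherwise pick DDA". A minimal sketch of that decision, with assumptions stated: `Backend` and `open_with_failover` are hypothetical names, and an `mpsc` channel stands in for the shared-memory ring and its frame-ready event.

```rust
use std::sync::mpsc::Receiver;
use std::time::Duration;

/// Which capture path the session ends up on (illustrative enum).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    IddPush,
    Dda,
}

/// Wait up to `budget` for the driver's first frame. A frame in time keeps the
/// IDD-push path; a timeout (or a vanished producer) falls back to DDA.
pub fn open_with_failover<T>(first_frame: &Receiver<T>, budget: Duration) -> Backend {
    match first_frame.recv_timeout(budget) {
        Ok(_frame) => Backend::IddPush,
        Err(_) => Backend::Dda,
    }
}
```

Because DDA is itself a working path, treating every error as "fall back" keeps a false positive harmless, which matches the reasoning given for `f98ab07`.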

**Deferred (non-blocking):** Stage 3 (trim `default_modes`) — deprioritized (recovery handles mode-sets and
trimming risks the live display-activation path); Stage S driver resilience (S1/S2) — gated on the
`789ad49` logging once a fresh repro is captured. Owner-confirmed the resolution-listening recovery fixes
the user-visible bug (2026-06-25).

## Context

The all-Rust `pf-vdisplay` IddCx virtual-display driver (STEP 0–8 of the Windows host rewrite,
now on `main`, on-glass-validated for plain desktop + HDR streaming) breaks when a **fullscreen
game** runs on the stream.

**Reproduction (RTX 4090 box `192.168.1.158`):** launch *DOOM: The Dark Ages* while streaming → the
desktop image **flashes** (a display mode-set fired), the game is **never visible**, and
**disconnect + reconnect yields a black screen with working audio**. (The box was rebooted afterward,
so live logs from the incident are gone.)

**Runtime config in play** (`C:\ProgramData\punktfunk\host.env`):
- `PUNKTFUNK_IDD_PUSH=1` → capture comes from the driver's **shared-memory frame ring**, not DDA/WGC.
- `PUNKTFUNK_10BIT=1` (+ `PUNKTFUNK_HDR_SHADER_P010=1`) → **HDR active**; the ring is FP16.
- `PUNKTFUNK_MONITOR_LINGER_MS=0` → every (re)connect builds a **fresh** monitor + ring.
- `PUNKTFUNK_VDISPLAY=pf`, `PUNKTFUNK_ENCODER=nvenc`, `PUNKTFUNK_SECURE_DDA=1`.

The driver log (`C:\Users\Public\pfvd-driver.log`) at inspection showed **8 fresh
`IddCxMonitorCreate`/`Arrival` pairs (ids 1–8), all `0x0`, and ZERO swap-chain-processor lines** —
so monitor creation is healthy and the break is entirely **downstream of monitor creation**
(swap-chain drain / frame publish / host consume), exactly where a game-induced mode change lands.

## Root cause (one sentence)

The IDD-push ring is created **once** at session start with a **fixed format and fixed size**
derived from session-start state, there is **no channel for the driver to report the actual
acquired-surface descriptor** back to the host, and there is **no mid-session fallback** — so when
a game forces a format and/or resolution change on the virtual display, the driver silently drops
every frame, the host never learns it needs to adapt, and the stream goes black and then hard-crashes.

## How the symptom maps to the code

1. Game launches → forces a **mode set** on the virtual display (the "desktop flash"). This changes
   the OS-composed surface's **DXGI format and/or width/height**, and triggers a swap-chain
   unassign→reassign in the driver.
2. The driver's `publish()` copies the acquired surface into the host ring **only if formats match
   exactly** (`desc.Format` u32 compare) — and `CopyResource` *also* silently requires identical
   dimensions, which is never checked. → **every frame dropped.**
3. The host's only ring-recreate trigger is polling Windows' **HDR-enabled toggle**. A game-driven
   format/size change it can't observe → **host never recreates the ring** → driver re-attaches to
   the same mismatched ring → keeps dropping.
4. Once `PUNKTFUNK_IDD_PUSH=1`, the ring is the **sole** capture source (no DDA/WGC fallback).
   `next_frame()` repeats the last good frame, then **`bail!`s after a 20 s deadline → the stream
   dies.**
5. **Reconnect stays black** because the game is still holding the display in the changed state; the
   fresh ring is rebuilt at the **session-negotiated** format/size again and re-mismatches. Audio is
   a fully independent plane, so it survives — matching "black + audio."

---

## Identified issues

### Primary

**P1 — IDD-push ring format is fixed at session start; host can't observe a game-driven format change.**
- Host picks the ring format once: FP16 (`DXGI_FORMAT_R16G16B16A16_FLOAT`) if
  `advanced_color_enabled(target_id)` else `DXGI_FORMAT_B8G8R8A8_UNORM`.
  `crates/punktfunk-host/src/capture/idd_push.rs:340-361`
- Driver drops any frame whose `desc.Format` ≠ the ring format, silently.
  `packaging/windows/drivers/pf-vdisplay/src/frame_transport.rs:281-286`
- Host recreates the ring **only** on a Windows HDR-toggle poll (250 ms), never on a format change
  it can't see. `idd_push.rs:619-640` (`poll_display_hdr` → `recreate_ring` at `:582-617`).
- Driver re-attaches on a host generation bump (`is_stale`), but nothing bumps it for this case.
  `frame_transport.rs:259-270`.
- **No `SharedHeader` field carries the driver's actual acquired-surface format** — the driver only
  writes `driver_status`, `driver_status_detail`, `driver_render_luid_low/high` back.

**P2 — IDD-push ring size is fixed at session start; a resolution change is never detected.**
- `header.width/height` written once at `idd_push.rs:396-397`; ring slots sized once and never
  resized; consumed frames always report the session size (`idd_push.rs:744-745`).
- `publish()` guards **format only, not width/height** (`frame_transport.rs:284`). `CopyResource`
  requires identical dimensions, so a resolution change → silent no-op/garbage, no error logged.
- Driver never reports the acquired surface's real width/height to the host.
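
The guard that P1/P2 describe as incomplete (and that `789ad49` later extended to width/height) could look roughly like this. It is a sketch, not the driver's `publish()`: `SurfaceDesc`, `PublishGuard` and `PublishDecision` are hypothetical names, and only the two DXGI format values are real constants.

```rust
/// Real DXGI_FORMAT enum values for the two ring formats named in P1.
pub const DXGI_FORMAT_R16G16B16A16_FLOAT: u32 = 10;
pub const DXGI_FORMAT_B8G8R8A8_UNORM: u32 = 87;

/// Hypothetical descriptor of a surface: format plus dimensions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SurfaceDesc {
    pub format: u32,
    pub width: u32,
    pub height: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum PublishDecision {
    /// Descriptor matches the ring: the copy is safe.
    Copy,
    /// Descriptor differs: skip the frame. `log` is true only for the first
    /// frame of a mismatch episode, so the log is not flooded per frame.
    Skip { log: bool },
}

pub struct PublishGuard {
    ring: SurfaceDesc,
    in_mismatch: bool,
}

impl PublishGuard {
    pub fn new(ring: SurfaceDesc) -> Self {
        Self { ring, in_mismatch: false }
    }

    /// Compare format AND width/height before any copy is attempted.
    pub fn check(&mut self, acquired: SurfaceDesc) -> PublishDecision {
        if acquired == self.ring {
            self.in_mismatch = false;
            return PublishDecision::Copy;
        }
        let log = !self.in_mismatch;
        self.in_mismatch = true;
        PublishDecision::Skip { log }
    }
}
```

Comparing the whole descriptor in one place means a size-only change is rejected the same way a format change is, instead of reaching `CopyResource` with mismatched dimensions.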

**P3 — No mid-session capture fallback; a 20 s hard crash instead of degrade.**
- `PUNKTFUNK_IDD_PUSH=1` returns the IDD-push capturer early with the keepalive moved into it — **no
  fall-through**. `crates/punktfunk-host/src/capture.rs:348-356`.
- `next_frame()` waits on the frame-ready event (16 ms), repeats the last frame, and **`bail!`s
  after a 20 s deadline** → the encode loop tears the session down.
  `idd_push.rs:819-847`.
- The WGC→DDA fallback that exists (`capture.rs:389-404`) is **open-time only** and on the
  **non**-IDD-push path; it does not help here.
- The `VirtualOutput` already carries a `WinCaptureTarget { adapter_luid, gdi_name, target_id }`
  (`vdisplay/pf_vdisplay.rs` `Monitor::target()`), so a DDA/WGC capturer **can** be opened on the
  same virtual output — the wiring just doesn't exist for IDD-push.

### Secondary (verify during the fix; not the proven primary cause)

**S1 — Driver `run_core` exits permanently on a swap-chain error, with no clear re-arm.**
- On a `ReleaseAndAcquireBuffer2` error (e.g. `DXGI_ERROR_ACCESS_LOST` when a game grabs the
  display), `run_core` `break`s and returns; the worker exits and deletes the swap-chain object.
  `packaging/windows/drivers/pf-vdisplay/src/swap_chain_processor.rs:359-362` (+ delete at `:141-143`).
- A mode change drives unassign→assign which **does** respawn a fresh processor
  (`callbacks.rs:309-318`, `:249-305`), so a clean mode change recovers. **Open question:** whether
  the OS reliably re-assigns after a bare `ACCESS_LOST` exit (no unassign), or whether the monitor
  stalls with a dead-but-installed processor. Confirm against the IddCx contract / upstream
  `virtual-display-rs`. The standard IddCx model expects the OS to re-assign, but this needs proof.

**S2 — `IddCxSwapChainSetDevice` give-up leaves a dead-but-installed processor.**
- `assign_swap_chain` returns `STATUS_SUCCESS` and installs the processor **before** the worker's
  `SetDevice` retries run; if all 60 retries (≈3 s) fail during a mode flap, the worker returns and
  the processor is dead, but the OS believes the swap chain is assigned → potential permanent stall.
  `swap_chain_processor.rs:191-226`, `callbacks.rs:279-293`.

**S3 — Driver worker-thread diagnostics are not landing (impairs root-causing).**
- `dbglog!` → `log.rs` opens/appends/closes the file per call with **no explicit flush**, and the
  observed log had only control-plane (IOCTL-thread) lines, no swap-chain-processor lines.
  `packaging/windows/drivers/pf-vdisplay/src/log.rs:9-22`.
- Whatever the exact reason (write race / token / interleave), the practical effect is the
  swap-chain processor's behavior during the break is **invisible**, which is why the cause can't be
  pinned from logs alone today. **Fix this first** so the next repro is conclusive.
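
One shape the S3 fix could take is a process-lifetime appender: open the file once, serialize writers behind a mutex, and write each line in a single call. This is a sketch under assumptions, not the driver's `log.rs`; `FlushedLog` and its methods are hypothetical names.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::Mutex;

/// Hypothetical process-lifetime log appender shared by all driver threads.
pub struct FlushedLog {
    file: Mutex<File>,
}

impl FlushedLog {
    /// Open (or create) the log once, in append mode.
    pub fn open(path: &Path) -> io::Result<Self> {
        let file = OpenOptions::new().create(true).append(true).open(path)?;
        Ok(Self { file: Mutex::new(file) })
    }

    /// Append one line. The mutex keeps lines from different threads whole.
    pub fn line(&self, msg: &str) -> io::Result<()> {
        // A panic on another thread must not silence logging: recover the guard.
        let mut file = self.file.lock().unwrap_or_else(|poisoned| poisoned.into_inner());
        let mut buf = String::with_capacity(msg.len() + 1);
        buf.push_str(msg);
        buf.push('\n');
        file.write_all(buf.as_bytes())?;
        // `File` is unbuffered, so this mainly documents intent; call
        // `sync_data()` instead if the line must survive a crash of the box.
        file.flush()
    }
}
```

Usage would be a single `FlushedLog` stored in a process-wide static and called from both the IOCTL thread and the swap-chain worker.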

---

## Verified facts that de-risk the fix

- **The encoder already adapts to a mid-session size/format change.** `encode/nvenc.rs:580-618`:
  `submit` detects `size_changed`/`hdr_changed`/device change per frame, tears down, and re-inits
  adopting the new frame's geometry + pixel format. So a capturer that changes resolution/format
  mid-session is handled downstream — **no encoder API change is needed** for either fix direction.
- **The stream loop relays per-frame geometry.** `CapturedFrame` carries `width`/`height`/`format`
  (`capture.rs:50-57`); the loop reads `pipeline_depth()` live and forwards whatever `try_latest()`
  returns.
- **WGC and DDA emit the same pixel formats the IDD-push path emits** (`Bgra` / `Rgb10a2`), so a
  failover capturer feeds the encoder compatible frames.
- **A failover capturer fits the existing `Capturer` trait** (`next_frame` + `try_latest`,
  `capture.rs:120-155`) — a composing capturer that owns the ring capturer + a lazily-opened
  WGC/DDA capturer and switches between them is a clean drop-in.
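
For the record, the composing capturer described in the last bullet might be shaped like the sketch below. It was not what shipped (see Resolution). Everything here is hypothetical: the one-method `Capturer` trait is a cut-down stand-in for the real `next_frame` + `try_latest` trait, and `Frame`, `FailoverCapturer` and `Scripted` exist only for the illustration.

```rust
use std::collections::VecDeque;
use std::time::{Duration, Instant};

/// Cut-down stand-in for a captured frame: geometry only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Frame {
    pub width: u32,
    pub height: u32,
}

/// Simplified capturer trait: a fresh frame if one is available right now.
pub trait Capturer {
    fn try_latest(&mut self) -> Option<Frame>;
}

/// Owns the primary (ring) capturer and lazily opens a fallback once the
/// primary has produced nothing for `stall_window`.
pub struct FailoverCapturer<P, F, O> {
    primary: P,
    open_fallback: O,
    fallback: Option<F>,
    last_fresh: Instant,
    stall_window: Duration,
}

impl<P, F, O> FailoverCapturer<P, F, O>
where
    P: Capturer,
    F: Capturer,
    O: FnMut() -> F,
{
    pub fn new(primary: P, open_fallback: O, stall_window: Duration, now: Instant) -> Self {
        Self { primary, open_fallback, fallback: None, last_fresh: now, stall_window }
    }

    pub fn on_fallback(&self) -> bool {
        self.fallback.is_some()
    }

    pub fn poll(&mut self, now: Instant) -> Option<Frame> {
        // Once demoted, serve from the fallback for the rest of the session.
        if let Some(fallback) = self.fallback.as_mut() {
            return fallback.try_latest();
        }
        if let Some(frame) = self.primary.try_latest() {
            self.last_fresh = now;
            return Some(frame);
        }
        if now.duration_since(self.last_fresh) < self.stall_window {
            return None; // short gap: keep waiting on the primary
        }
        let mut fallback = (self.open_fallback)();
        let frame = fallback.try_latest();
        self.fallback = Some(fallback);
        frame
    }
}

/// Demo capturer: replays a fixed script of poll results, then yields nothing.
pub struct Scripted(pub VecDeque<Option<Frame>>);

impl Capturer for Scripted {
    fn try_latest(&mut self) -> Option<Frame> {
        self.0.pop_front().flatten()
    }
}
```

The fallback is opened through a closure so that nothing is opened on the capture target unless the ring path actually stalls.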

---

## Recommended fix (staged)

> **Superseded — see [Resolution](#resolution-fixed-2026-06-25).** This was the original plan; the bug
> was fixed by the simpler **recover-or-drop** approach (host follows the OS resolution + open-time DDA
> failover), so Stage 1's composing capturer and Stage 2's protocol bump were not needed. Kept for context.

Defense-in-depth. Stages 0–1 are **host-only** (no driver rebuild, no protocol bump) and are the
fast, robust, user-visible fix. Stages 2–3 harden the fast path and need the driver re-vendor loop.

- **Stage 0 — Diagnostics first (land before anything else).**
  - `log.rs`: flush after each write (or keep a process-lifetime appender) and confirm worker-thread
    writes land. (S3)
  - Driver: in `publish()`, log/record the acquired surface's **actual format + width + height**
    even on the drop path, so a repro shows exactly what changed.
  - Host: replace the silent 20 s wait with a `tracing::warn!` at ~2 s of no fresh frame, including
    `driver_status`/`driver_status_detail` and the host's expected ring format/size.
  - Goal: the next Doom-launch repro definitively classifies the cause (format mismatch vs size
    mismatch vs `run_core` exit vs no-reassign).

- **Stage 1 — Mid-session fallback IDD-push → WGC/DDA (robust to ALL failure modes).** (P3)
  - Add a composing `Capturer` that owns the IDD-push capturer and, when it yields no fresh frame
    for a **short** window (~1.5 s, not 20 s), opens a DDA/WGC capturer on the same
    `WinCaptureTarget` and serves from it for the rest of the session (optionally probing the ring
    for recovery). Encoder follows the new format/size automatically (verified above).
  - This alone guarantees the session never goes permanently black again and makes Doom playable via
    WGC/DDA when the ring path is defeated — independent of the *why*.
  - Touch points: `capture.rs:334-356` (wire the composing capturer behind `PUNKTFUNK_IDD_PUSH`),
    `idd_push.rs` (expose a "stalled?" signal + shorten the deadline), reuse `dxgi.rs`/`wgc.rs`.

- **Stage 2 — Adaptive ring (makes the fast IDD-push path itself survive a game mode change).** (P1, P2)
  - Driver writes the **actual acquired-surface format + width + height** into new `SharedHeader`
    fields, in `publish()`, **even when about to drop the frame**.
  - Host watches those fields and, on any change vs the ring's current format/size, **recreates the
    ring at the new descriptor + bumps `generation`** (generalize `recreate_ring`/`poll_display_hdr`
    from "HDR toggled" to "descriptor changed"). Driver re-attaches via existing `is_stale()`.
  - Driver `publish()` gains a **width/height guard** alongside the format guard.
  - **Implications:** bump `pf_vdisplay_proto::PROTOCOL_VERSION` (host does a HARD version check in
    `pf_vdisplay.rs::mgr_ensure_device`), update the `const` size/offset asserts in
    `crates/pf-vdisplay-proto/src/frame.rs`, and deploy host + driver **in lockstep** (rebuild +
    re-sign + re-vendor `packaging/windows/pf-vdisplay/{dll,inf,cat}` on the RTX box, WUDFHost
    reload).

- **Stage 3 — Prevention (frequency reducer, not a standalone fix).** (reduces P1/P2 triggers)
  - Trim `monitor.rs::default_modes()` so the IDD advertises essentially only the negotiated mode, so
    a game can't pick a different fullscreen resolution. Verify it doesn't break mid-stream
    `Reconfigure`. Optionally re-assert the active mode after a detected mode change.

- **Stage S — Driver resilience (address S1/S2 once Stage 0 reveals if they fire).**
  - If logs show a permanent stall after `ACCESS_LOST`/SetDevice-give-up, add a re-arm path (e.g.
    delete the swap chain so the OS re-assigns, or signal `assign_swap_chain` to retry) and avoid
    installing a processor that has already failed `SetDevice`.
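
Stage 2's "new `SharedHeader` fields + version bump + `const` asserts" would have looked something like the following. This is purely illustrative: the struct layout, the field names `acquired_*`, the version number, and the helper functions are all assumptions and do not reflect the real `pf-vdisplay-proto` header, which also carries fields such as `driver_status`.

```rust
use std::mem::size_of;

/// Hypothetical bumped version; the real constant lives in `pf_vdisplay_proto`.
pub const PROTOCOL_VERSION: u32 = 2;

/// Hypothetical shared header with the Stage 2 report-back fields appended.
#[repr(C)]
pub struct SharedHeaderV2 {
    pub protocol_version: u32,
    pub generation: u32,
    // What the host created the ring as.
    pub width: u32,
    pub height: u32,
    pub format: u32,
    // What the driver actually acquired (written even when the frame is dropped).
    pub acquired_width: u32,
    pub acquired_height: u32,
    pub acquired_format: u32,
}

// Compile-time layout check: eight u32 fields under repr(C) are 32 bytes.
// Adding a field without updating this line fails the build on both ends.
const _: () = assert!(size_of::<SharedHeaderV2>() == 32);

/// Hard version check, in the spirit of the host's check at device open.
pub fn check_version(header: &SharedHeaderV2) -> Result<(), String> {
    if header.protocol_version == PROTOCOL_VERSION {
        Ok(())
    } else {
        Err(format!(
            "protocol mismatch: header v{}, host v{}",
            header.protocol_version, PROTOCOL_VERSION
        ))
    }
}

/// True when the driver's reported descriptor no longer matches the ring.
pub fn descriptor_changed(header: &SharedHeaderV2) -> bool {
    header.acquired_width != header.width
        || header.acquired_height != header.height
        || header.acquired_format != header.format
}
```

The const assert is what makes the lockstep deploy requirement visible at build time rather than at runtime on the box.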

## Validation plan (RTX box `ssh "Enrico Bühler@192.168.1.158"`)

1. Deploy the Stage-0 host (+ driver if rebuilt); `punktfunk-host service stop/start`.
2. Connect a client, confirm normal stream. `type C:\Users\Public\pfvd-driver.log` to baseline.
3. Launch *DOOM: The Dark Ages* (or any fullscreen/HDR game). Capture: driver log + host service log
   (find where the in-session `serve` logs land; `RUST_LOG=info`).
4. Read which mechanism fired (format/size/exit/no-reassign) from the Stage-0 diagnostics.
5. **Success:** game is visible, the stream survives the mode-set flash, no 20 s crash, reconnect
   restores video. With Stage 1: the failover to WGC/DDA is logged and frames keep flowing. With
   Stage 2: the ring recreates at the new descriptor and the fast path resumes.

## File map

| Area | Path |
|---|---|
| Host ring consumer | `crates/punktfunk-host/src/capture/idd_push.rs` |
| Capture selection / trait | `crates/punktfunk-host/src/capture.rs` |
| NVENC re-init (no change needed) | `crates/punktfunk-host/src/encode/nvenc.rs:564-618` |
| DDA / WGC capturers (failover targets) | `crates/punktfunk-host/src/capture/{dxgi,wgc}.rs` |
| Host monitor lifecycle / capture target | `crates/punktfunk-host/src/vdisplay/pf_vdisplay.rs` |
| Shared contract (Stage 2 fields + version) | `crates/pf-vdisplay-proto/src/{lib,frame}.rs` |
| Driver frame publisher (guards + reporting) | `packaging/windows/drivers/pf-vdisplay/src/frame_transport.rs` |
| Driver swap-chain lifecycle (S1/S2) | `packaging/windows/drivers/pf-vdisplay/src/swap_chain_processor.rs`, `callbacks.rs` |
| Driver logging (S3) | `packaging/windows/drivers/pf-vdisplay/src/log.rs` |
| Advertised modes (Stage 3) | `packaging/windows/drivers/pf-vdisplay/src/monitor.rs` (`default_modes`) |
| Vendored signed driver (Stage 2 re-vendor) | `packaging/windows/pf-vdisplay/{pf_vdisplay.dll,.inf,.cat}` |

## Notes / caveats

- Doc lag (unrelated to the fix, worth flagging): `stage-pf-vdisplay.ps1` / packaging comments still
  reference the OLD `packaging/windows/vdisplay-driver/` tree; the active driver source is the NEW
  `packaging/windows/drivers/pf-vdisplay/` tree (re-vendored in commit `a11b0dd`).
- The exact trigger (format vs resolution vs exclusive-flip vs processor-death) is **not yet proven
  from logs** — Stage 0 exists to pin it. Stage 1 fixes the user-visible symptom regardless.
@@ -1,9 +1,11 @@
# Windows Host Rewrite — Design & Plan

Status: **proposed** (2026-06-24). This plan takes the current, hard-won Windows host (pf-vdisplay
all-Rust IddCx driver + IDD-push zero-copy capture, live-validated 5120×1440@240 HDR on the RTX box)
as a *knowledge base* and re-derives a clean, stable, well-layered architecture from it. It drops all
SudoVDA back-compat (we own both ends now) and drives `unsafe` to a contained minimum.
Status: **largely implemented** (updated 2026-06-25 — see [§15 Current status](#15-current-status-2026-06-25)
for the milestone-by-milestone state; §0–§14 below are the original design and remain the reference). This
plan takes the current, hard-won Windows host (pf-vdisplay all-Rust IddCx driver + IDD-push zero-copy
capture, live-validated 5120×1440@240 HDR on the RTX box) as a *knowledge base* and re-derives a clean,
stable, well-layered architecture from it. It drops all SudoVDA back-compat (we own both ends now) and
drives `unsafe` to a contained minimum.

It supersedes the stale conclusion in `docs/windows-virtual-display-rust-port.md` ("IDD-push not
viable") — that verdict was written in the *same commit* (`e2c9bfd`) that shipped the working
@@ -778,3 +780,58 @@ IddPushCapturer captures the lock/UAC secure desktop directly (IDD-push is opt-i
`PUNKTFUNK_IDD_PUSH`). Make it a blocking on-glass gate (step 6) and keep the WGC relay recoverable for one
release. Other defined-failure-branch items: monitor `EvtCleanupCallback` firing, IDD_PERSIST/Reconfigure,
concurrent-monitor device sharing, host↔driver `protocol_version` lockstep.

---

## 15. Current status (2026-06-25)

The rewrite is **largely implemented**. The new all-Rust `pf-vdisplay` driver (the M0 long pole — `iddcx`
on `windows-drivers-rs` + `/INTEGRITYCHECK` — and the §14 STEP 0–8 port) **landed on `main`, on-glass HDR
validated**, and the host was decomposed into the clean layered architecture. One important deviation from
the plan: **the host was refactored *in place* via a staged, behavior-preserving plan
([`windows-host-goal1-plan.md`](windows-host-goal1-plan.md)), not greenfield-rebuilt** — the §10 "rebuild
fresh, keep old as reference" framing was superseded because staging preserved the live-validated host at
every step (lower regression risk than a big-bang M2 rebuild). The §2.3/§2.4/§2.5 design (seam traits,
`SessionPlan`/`SessionFactory`/`SessionContext`, the `VirtualDisplayManager` ownership model) is realized in
that branch's commits, not the M2 greenfield tree the build order imagined.

### Milestone / step status

| Item | Status | Evidence |
|---|---|---|
| **M0** — proto crate, driver workspace, `iddcx` binding, `/INTEGRITYCHECK` | ✅ **DONE** | `pf-vdisplay-proto`; `packaging/windows/drivers/`; `clear-force-integrity.ps1`; CI-green (§13) |
| **§14 STEP 0–8** — pf-vdisplay driver port (device→adapter→control→swap-chain→frame transport→HDR→.inx→unsafe pass) | ✅ **DONE** | `d7a9fbf`…`cd59151`; on-glass HDR (`6399d28`: "Mac connects WITH HDR") |
| **M1/M2** — IDD-push capture + NVENC glass-to-glass | ✅ **DONE** | new driver tree + the existing host IDD-push path; 5K@240 HDR zero-copy on-glass |
| **§2.5** — ownership-model rewrite (`VirtualDisplayManager`/`MonitorLease`); swap-chain-reuse / monitor-leak | ✅ **DONE / RESOLVED** | `windows-host-goal1` §2.5 (`1520201`…`683c81b`); reconnect-leak A/B: 0 leaked monitors |
| **Goal-1 host refactor** (the in-place §2.2–2.5 realization, incl. `EncoderCaps`) | ✅ **DONE** | `windows-host-goal1` branch — all 6 stages + §2.5 + 3 seam tightenings |
| **Game-capture bug (GB1)** — fullscreen game breaks IDD-push | ✅ **FIXED** | `c87bfe0`/`f98ab07`/`789ad49`; see [game-capture-bug.md](windows-host-rewrite-game-capture-bug.md) |
| **M3** — service / input / audio cleanup | 🟡 code present (largely via the existing host + goal1) | — |
| **M4** — gamepad drivers (`pf_dualsense`/`pf_xusb`) onto the unified stack, WDF device contexts (true multi-pad) | ❌ **NOT STARTED** | old gamepad-driver crates still separate |
| **M5** — demoted WGC/DDA fallback port + GameStream-on-`session/pipeline` + AMF/QSV (no hw) | 🟡 **PARTIAL** | fallbacks exist; not re-shaped onto the new seams |
| **M6** — cut over + delete the old monoliths | 🟡 **PARTIAL** | old `vdisplay-driver/` tree deleted (`a2bd0cd`); host monoliths remain |

### What genuinely remains

1. **Secure-desktop on-glass gate (the single biggest open risk, §14 STEP 6 critique).** IDD-push capturing
   the lock screen / UAC with `serve` in the console session is **asserted, not yet locked on glass**. Until
   it passes, keep the WGC-relay / secure-DDA path recoverable. Hardware-gated (RTX box; ephemeral).
2. **M4 — gamepad-driver migration** onto `windows-drivers-rs` (WDF device contexts → true multi-pad). The
   proven recipe exists; ~2–3 days, hardware-gated.
3. **M5/M6 cleanup** — re-shape the WGC/DDA fallback + GameStream onto `session/pipeline`, then delete the
   old Windows monoliths. Low priority; AMF/QSV stays CI-only (no lab hw).
4. **pf-vdisplay driver slot reclaim** — sustained ADD/REMOVE churn wedges the driver
   (`ADD → 0x80070490 ERROR_NOT_FOUND`): it doesn't reclaim IddCx monitor slots on REMOVE (ghost nodes
   accumulate). Recovery today is `packaging/windows/reset-pf-vdisplay.ps1`; the real fix is in the driver
   (`control.rs`/`adapter.rs`). Dev helpers `reset-pf-vdisplay.ps1` + `redeploy-pf-vdisplay.ps1` are committed.

### Resolved since the original §11 open items

- **Driver swap-chain reuse** — the clean ownership model (`EvtCleanupCallback` + DeviceContext-owned state +
  single `Monitor` identity) is in; §2.5's reconnect-leak A/B shows **0 leaked active monitors**. The
  per-frame `CURRENT_MON_GEN` "monitor-gen bail" turned out to have been **write-only** (never wired), so the
  "carry the gen through `WinCaptureTarget`" item was dropped; the gen lives on the manager + lease only.
- **`/INTEGRITYCHECK` + `iddcx` on `wdk-sys`** — both proven CI-green (§13).

Box reminder: the RTX box (`ssh "Enrico Bühler"@…`) is **ephemeral** (boots to Proxmox on reboot; IP floats
on DHCP — has been `.173`/`.158`); the windows-amd64 CI runner is the persistent validator. On-glass gates
are opportunistic.