docs(display-management): record Stage 5 §6A on-glass validation + keep-alive hardening

Honesty pass after the 2026-07-05 on-glass session — the plan now reflects what was actually
validated, no more/no less:
- Stage 5 (§6A): HOST-SIDE → DONE + on-glass validated (KWin .116 + Mutter .21): group model,
  positions, identity keying, group-aware exclusive/extend coexistence, 2 concurrent Mutter
  RecordVirtual monitors. Remaining is hardware-gated residuals ONLY (per-group physical-restore
  EFFECT needs a monitor-attached box — headless reports also_disabled=[]; wlroots exclusive;
  Mutter APPLY_TEMPORARY revert).
- Stage 3: the KDE set-scaling ROUND-TRIP is now proven live (150%/125% → disconnect → reconnect
  → reapplied, kwinoutputconfig.json) — moved from Deferred to Validated. Closes the Stage-3 gate.
- §5.1: the explicit-quit bypass (b53710d) is now IMPLEMENTED on punktfunk/1 (QUIT_CLOSE_CODE →
  release(force_immediate)), plus the new same-client reconnect preempt and the tunable idle
  timeout — documented as built + validated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-07-05 17:12:23 +00:00
parent b71dc94bb2
commit a0546b36b6
+71 -29
View File
@@ -1,16 +1,26 @@
# Virtual-display management & lifecycle policy — design
> **Status (2026-07-05):** **Stages 04 DONE + on-glass validated; Stage 5 HOST-SIDE DONE** (branch
> `display-mgmt-stage0`, not yet merged). Stage 5 §6A host-side complete: display **groups**
> **Status (2026-07-05):** **Stages 05 (§6A) DONE + on-glass validated; keep-alive reconnect
> hardened** (branch `display-mgmt-stage0`, not yet merged). Stage 5 §6A: display **groups**
> (`registry::group_key` — one per desktop backend, each gamescope spawn its own), group-aware
> `exclusive`/`primary` (KWin name-filter + first-slot-wins; Mutter `set_first_in_group`), **per-group
> topology restore** (KWin restore floats through the group, runs on the last member's teardown), the
> **layout engine** (`vdisplay/layout.rs`, auto-row + manual) + registry-driven `apply_position`, and the
> **layout engine** (`vdisplay/layout.rs`, auto-row + manual) + registry-driven `apply_position`, the
> `PUT /display/layout` endpoint with group/position/index in `/display/state`, and the **web console
> arrangement table** (x/y editor → `PUT /display/layout`). **Remaining Stage 5 = validation + residuals
> only** (no more build work): on-glass validation (2 clients on a GPU box, not the dev VM) + two
> documented residuals (wlroots `exclusive`, Mutter `APPLY_TEMPORARY` revert). See the **Status —
> handoff** block under §11 for the per-stage state and the key decisions (notably the Windows `reject`
> arrangement table** — **live-validated on KWin `.116` + Mutter `.21`** (group model, positions,
> identity keying, group-aware exclusive/extend, 2 concurrent Mutter `RecordVirtual` monitors). The
> Stage-3 **KDE scaling round-trip is now proven live** (set 150 %/125 % → disconnect → reconnect →
> reapplied, seen in `kwinoutputconfig.json`). **Keep-alive reconnect hardening (`b53710d`, on-glass
> validated with the probe):** a same-client reconnect **preempts its own zombie**
> (`admission::preempt_same_identity` — fixes "reconnect within the idle-detection window lands on a
> fresh SECOND display while the old one keeps streaming"), a **deliberate quit skips the linger**
> (client closes with `QUIT_CLOSE_CODE` 0x51 → `registry::release(force_immediate)`; §5.1), and the QUIC
> control-connection idle timeout (the disconnect-detection latency) is **host-tunable**
> (`PUNKTFUNK_IDLE_TIMEOUT_MS` / `--idle-timeout-ms`, default 8 s). **Remaining Stage 5 = hardware-gated
> residuals only**: the per-group physical-restore EFFECT (needs a monitor-attached Linux box — the
> headless validation boxes report `also_disabled=[]`, so nothing is disabled to restore), wlroots
> `exclusive` (needs a Sway box), Mutter `APPLY_TEMPORARY` disconnect-revert. See the **Status —
> handoff** block under §11 for the per-stage state and key decisions (notably the Windows `reject`
> default).
> This doc designs a **policy layer on top of the
> existing per-compositor `VirtualDisplay` backends** — user-configurable lifecycle (keep-alive
@@ -276,10 +286,21 @@ plumbing) does not. Concretely per backend, "the display survives" means:
- The **launch command runs once per display creation, never per attach** — a reconnect to a
kept gamescope must not double-launch the game. Today launch already happens once per
`build_pipeline`-successful session; the invariant moves with the create into the registry.
- An explicit client **quit** (GameStream `cancel`/quit-app; a future punktfunk/1
`EndSession{quit}` control message — protocol growth, trailing-byte back-compat as usual)
bypasses keep-alive: the user said "stop the game", so tear down now. Plain disconnects and
connection losses honor the policy.
- An explicit client **quit** (a user "stop", not a network drop) bypasses keep-alive: tear down
now. **Implemented on punktfunk/1** (`b53710d`, on-glass validated): the client closes the QUIC
connection with `QUIT_CLOSE_CODE` (0x51, shared in `core::quic`); the host reads the
`ApplicationClosed` reason and does `registry::release(force_immediate)``Linger::Immediate`
teardown, skipping the linger. `NativeClient::disconnect_quit()` + `punktfunk-probe --quit` drive
it; GameStream `cancel`/quit-app (`h_cancel`) + the five real clients sending the code are
follow-ups. A plain disconnect / connection loss honors the policy (lingers for reconnect).
- A **same-client reconnect resumes** (never a fresh second display). A reconnect while the client's
own prior session is still `Active` — its QUIC idle timer hasn't fired, and detection lags a drop by
`max_idle_timeout` (default 8 s, host-tunable via `PUNKTFUNK_IDLE_TIMEOUT_MS` / `--idle-timeout-ms`)
— is recognised by `admission::preempt_same_identity` (same cert fingerprint): the host signals the
zombie's stop + waits the release grace, so it lingers and the reconnect **reuses** the kept display.
Without this, a reconnect inside the detection window landed on a fresh second display while the old
session kept streaming. **Implemented + on-glass validated** (`b53710d`); implements the "preempts
downstream" the admission layer already promised (§5.3).
- Host shutdown tears everything down (RAII on exit, as today). A host crash leaves whatever
the OS reclaims — Wayland connections die with the process (compositor reclaims outputs),
spawned gamescopes die with the process group, the pf-vdisplay watchdog reaps monitors when
@@ -676,8 +697,10 @@ GNOME/Mutter, RTX 5070 Ti), **`.116`** (Bazzite KDE/KWin, AMD — build via a `f
via `effective_topology()`), 3 (platform-neutral `identity.rs` map + `per-client-mode` + KWin per-slot
output naming → **KWin persists per-output scale by name**, proven via `kwinoutputconfig.json` on `.116`),
4 (mode-conflict admission — `vdisplay/admission.rs`, loopback-validated for all four policies).
- **Stage 5: HOST-SIDE DONE (web table + on-glass pending).** All §6A group semantics landed + unit-tested
(no two-session on-glass possible on the GPU-less dev VM): **display groups** (`registry::group_key` — one
- **Stage 5 (§6A): DONE + on-glass validated (KWin `.116` + Mutter `.21`).** All §6A group semantics
landed + unit-tested, then live-validated (group model, positions, identity keying, group-aware
exclusive/extend, 2 concurrent Mutter `RecordVirtual` monitors; the dev VM itself is GPU-less):
**display groups** (`registry::group_key` — one
per desktop backend, each gamescope spawn its own group), **group-aware exclusive/primary** (KWin
`MANAGED_PREFIX` + first-slot-wins; Mutter `set_first_in_group` → a non-first session extends rather than
re-clobbering), **per-group topology restore** (KWin hands its restore to the registry via
@@ -688,9 +711,13 @@ GNOME/Mutter, RTX 5070 Ti), **`.116`** (Bazzite KDE/KWin, AMD — build via a `f
`PUT /api/v1/display/layout` endpoint (`EffectivePolicy::with_manual_layout`), and `/display/state` now
carrying `group`/`display_index`/`position`/`identity_slot`/`topology`. The registry keys the arrangement
on per-client identity via `VirtualDisplay::last_identity_slot` (KWin). The **web arrangement table**
(`DisplayCard.tsx` `DisplayArrangement`, en+de) is also done. **Remaining = validation + residuals only:**
on-glass validation + the documented residuals (wlroots `exclusive`, Mutter `APPLY_TEMPORARY` revert) —
see the Stage 5 entry below.
(`DisplayCard.tsx` `DisplayArrangement`, en+de) is also done, moved to its own **Virtual displays** nav
section with a full one-click-preset config surface. **Remaining = hardware-gated residuals only:** the
per-group physical-restore EFFECT (needs a monitor-attached box — the headless boxes report
`also_disabled=[]`), wlroots `exclusive` (needs a Sway box), Mutter `APPLY_TEMPORARY` disconnect-revert —
see the Stage 5 entry below. Plus the **keep-alive reconnect hardening** (`b53710d`, on-glass validated):
same-client zombie preempt + deliberate-quit skip-linger + tunable idle timeout (§5.1) — what made
"reconnect resumes" actually hold under a fast reconnect.
**Decisions / deltas from this plan as written — read before continuing:**
- **Windows admission default is `reject`, NOT `join`** (supersedes the Stage-4 line below). Two
@@ -708,11 +735,18 @@ GNOME/Mutter, RTX 5070 Ti), **`.116`** (Bazzite KDE/KWin, AMD — build via a `f
- **GameStream 503** is implemented (owner-fp on `LaunchSession`, `gamestream_admission()` unit-tested,
shares `effective_conflict()`) but NOT Moonlight-validated (can't drive `/launch` autonomously).
**Deferred (need a display-attached box / a specific compositor / a real client):** the `primary`
physical-keep EFFECT on Linux + a Windows primary-only CCD variant; **wlroots `exclusive`**; the KWin
set-150 %-scaling ROUND-TRIP (SSH can't drive `kscreen-doctor` into the live session — the persist
mechanism itself is already proven); GameStream 503 on-glass; two-concurrent-session validation of the
Stage-5 group-aware exclusive.
**Validated since (2026-07-05):** the KWin **set-scaling ROUND-TRIP** — a client set 150 % then 125 %
in the streamed KDE session, disconnected, reconnected, and the scale was reapplied to the freshly
re-created `Virtual-punktfunk-<id>` (proven in `kwinoutputconfig.json`); this closes the Stage-3 gate.
Also the §6A group model + group-aware exclusive/extend + 2 concurrent Mutter `RecordVirtual` monitors,
and the keep-alive reconnect hardening.
**Still deferred (need a display-attached box / a specific compositor / a real client):** the `primary`
physical-keep EFFECT on Linux + a Windows primary-only CCD variant; **wlroots `exclusive`**; GameStream
503 on-glass; and the **per-group physical-restore EFFECT** — a monitor-attached box is required to see
`exclusive` disable a physical output and the group restore re-enable it only after the last member drops
(the headless boxes report `also_disabled=[]`, so the group semantics are proven but the physical
toggle isn't).
- **Stage 0 — policy + plumbing-lite. [DONE ✓]** `policy.rs` (schema/presets/persist/env-compat, fully
unit-tested), mgmt GET/PUT `/display/settings`, console card (settings only), docs page
@@ -735,8 +769,11 @@ Stage-5 group-aware exclusive.
- **Stage 3 — identity. [DONE ✓]** Platform-neutral identity map + migration, per-slot KWin output
naming (+ the concurrent-session name-clash fix riding along), GameStream identity wiring,
optional `per-client-mode` keying, per-client `default_scale` on KWin.
*Validate on KDE:* connect client A → set 150 % scaling → disconnect → reconnect → scaling
reapplied; client B unaffected; `kwinoutputconfig.json` inspected for the named entries.
*Validated on KDE (`.116`, 2026-07-05):* a client set 150 % then 125 % in the streamed session,
disconnected, reconnected (keep-alive off → full teardown+recreate), and the scale was reapplied to
the fresh `Virtual-punktfunk-<id>` — confirmed in `kwinoutputconfig.json` (`scale=1.25` persisted by
connector name). This is the round-trip the persist mechanism was designed for. *(client-B-unaffected
under two concurrent sessions is folded into the Stage-5 two-session case.)*
- **Stage 4 — mode-conflict admission. [DONE ✓]** Decision function (`vdisplay/admission.rs`,
`decide`/`admit`/`effective_conflict`) wired into the punktfunk/1 handshake + GameStream `h_launch`,
the typed punktfunk/1 `busy` refusal (QUIC close `0x42` + reason), GameStream 503 path, `steal`
@@ -744,7 +781,7 @@ Stage-5 group-aware exclusive.
`join`/silent-reconfigure originally planned** — see the handoff Decisions above (single-capturer
IDD-push). Loopback-validated (all four policies) + `.173` reject-default validated; GameStream 503
unit-tested, Moonlight-pending.
- **Stage 5 — §6A multi-client monitors. [HOST-SIDE DONE ✓ — web table + on-glass pending]** Display
- **Stage 5 — §6A multi-client monitors. [DONE ✓ — on-glass validated (KWin `.116` + Mutter `.21`); hardware-gated residuals deferred]** Display
groups, group-aware exclusive/primary/restore (incl. the name-filter fix), layout auto-row + manual,
`/display/layout`, console arrangement table. Cheap: rides Stages 13 infrastructure, no protocol change.
**Done:**
@@ -802,11 +839,16 @@ Stage-5 group-aware exclusive.
writes `PUT /display/layout` (switches the host to a manual layout, applied next connect). en+de
i18n; the stale `display_pending_note` copy refreshed. tsc + vite build green. (Drag mini-map is a
later stretch.)
**Remaining Stage 5 — validation + deferred residuals only (no more host/web build work):**
- **On-glass validation** (needs a GPU box + 2 clients — NOT the GPU-less dev VM): two clients
(probe + GTK) on the headless KDE box forming a 2-output desktop; drag a window across; disconnect
one → its slot lingers per policy, the sibling is unaffected, and the physical is restored only after
BOTH drop (the per-group restore). Plus the concurrent-Mutter case on a GNOME box.
**Remaining Stage 5 — hardware-gated residuals only (no more host/web build work):**
- **On-glass validation — mostly DONE (2026-07-05, KWin `.116` + Mutter `.21`):** the group model,
per-member positions, identity keying, group-aware `exclusive`/`extend` coexistence (a 2nd session
does NOT clobber the 1st's output), and **2 concurrent Mutter `RecordVirtual` monitors** are all
confirmed live; the keep-alive reconnect path (reuse, quit-skip-linger, tunable idle) was validated
deterministically with the probe. **STILL PENDING: the per-group physical-restore EFFECT** — both
boxes are headless so `also_disabled=[]` (nothing to disable → nothing to restore); seeing
`exclusive` black out a real monitor and the group restore re-enable it only after the LAST member
drops needs a **monitor-attached Linux box**. The group-restore LOGIC (`hand_off_restore`) is
unit-tested; only the physical effect is unobserved.
- **wlroots group-aware exclusive** stays deferred: wlroots `exclusive` is not implemented at all (needs
a Sway box), so there is no topology to make group-aware yet. §6A multi-view on wlroots already works
(independent `HEADLESS-N` outputs).