enricobuehler bbd98241e4 feat(vdisplay): display-management policy surface (Stage 0)
A user-configurable policy layer above the per-compositor VirtualDisplay
backends: keep-alive, topology, conflict, identity, layout, max-displays —
persisted to display-settings.json, editable from the web console, applied
per connect. Design: design/display-management.md.

Stage 0 stands up the surface and wires the two behaviors the existing code
can already express — the Windows monitor linger duration and the
"make the streamed output the sole desktop" topology — through it; every
other option is stored + echoed but not yet enforced (later stages). An
unconfigured host (no display-settings.json) keeps today's exact behavior.

- vdisplay/policy.rs: pure DisplayPolicy + 5 presets + JSON store (gpu-settings
  pattern) + EffectivePolicy; 9 unit tests.
- vdisplay.rs: resolve_topology(Auto); apply_session_env drives *_VIRTUAL_PRIMARY
  from the policy only when a settings file exists.
- windows/manager.rs: linger_ms() + should_isolate() read the policy when configured.
- mgmt: GET/PUT /api/v1/display/settings (bearer-only); PUT rejects keep_alive
  forever until the lifecycle stage. OpenAPI regenerated.
- web console: Host → Virtual displays card (preset picker + custom fields); en+de.
- docs-site: virtual-displays.md + configuration.md cross-links.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-07-04 19:44:18 +00:00


Virtual-display management & lifecycle policy — design

Status: PLANNED (nothing implemented). This doc designs a policy layer on top of the existing per-compositor VirtualDisplay backends — user-configurable lifecycle (keep-alive after disconnect), topology (primary / exclusive), conflict handling (what happens when a second client wants a different mode), stable display identity (so desktop environments remember per-client settings like scaling), and multi-monitor (several virtual displays forming one desktop, fed by one client or by several). The VirtualDisplay trait and the per-backend create() mechanics stay as they are; this layer decides when to create, how many, how long to keep, what else to do to the topology, and under which identity.

Companion docs: design/implementation-plan.md §6 (virtual displays), design/vrr-plan.md (pacing — out of scope here), design/gamescope-multiuser.md (per-session isolation — adjacent, not required).

1. Goal

Today the virtual-display behavior is hardcoded per platform and per backend:

  • A session's virtual output is created at connect and torn down (RAII) at session end — a disconnect destroys the display, reshuffles the desktop, and (on gamescope bare-spawn) kills the running game.
  • "Make the streamed output the sole desktop" is an env knob on Linux (PUNKTFUNK_KWIN_VIRTUAL_PRIMARY / PUNKTFUNK_MUTTER_VIRTUAL_PRIMARY, default-on for the auto-detected desktop path) and default-on on Windows (PUNKTFUNK_NO_ISOLATE to opt out) — and on Linux "primary" and "disable the other outputs" are conflated into one switch.
  • What happens when a second client connects is an emergent property of the platform: Linux creates a second output (multi-view), Windows reconfigures the shared monitor under the live session (join-path reconfigure in vdisplay/windows/manager.rs::acquire), GameStream preempts.
  • Only Windows gives a client a stable monitor identity (vdisplay/windows/identity.rs), so only Windows reapplies per-client display config (DPI scaling) across reconnects. On KDE every session's output is Virtual-punktfunk at whatever mode — scaling has to be re-set per connect and is shared across every client.
  • One session = exactly one display. A client with two physical monitors can only stream one; a tablet can't join an existing streamed desktop as a second monitor on purpose (the Linux multi-view behavior half-does it by accident, with no layout control).

Goal: one shared, documented configuration surface — a small set of orthogonal options with safe defaults and selectable presets, stored host-side, editable from the web console, applied uniformly across the punktfunk/1 and GameStream paths and across all five backends (KWin, gamescope, Mutter, wlroots, Windows pf-vdisplay), each backend implementing what it can and honestly declining what it can't (the same honest-downgrade convention as 4:4:4/10-bit).

2. What exists today (inventory)

The asymmetry worth internalizing: Windows already has most of the machinery; Linux has none.

| Mechanism | Windows (pf-vdisplay) | Linux (kwin/mutter/wlroots) | gamescope |
|---|---|---|---|
| Lifecycle owner | VirtualDisplayManager singleton: Idle / Active{refs} / Lingering{until} state machine, gen-stamped MonitorLease | none: session owns VirtualOutput.keepalive, capturer drop = teardown | managed path: debounced TV-session restore (RESTORE_DEBOUNCE 5 s) + warm-session reuse; spawn path: child dies with the session |
| Keep-alive after disconnect | linger, default 10 s (PUNKTFUNK_MONITOR_LINGER_MS) | none | managed: 5 s debounce (hardcoded) |
| Reuse on reconnect | join Active (refcount++) / adopt Lingering (with a dead-swapchain preempt for IDD) | none (always create fresh) | managed: reuses the warm session |
| Primary / exclusive | isolate_displays_ccd (exclusive), default on, restore on teardown | apply_virtual_primary = primary and disable others, env-gated, restore on drop; Mutter make_virtual_primary = sole monitor (APPLY_TEMPORARY) | n/a (own nested session) |
| Mode conflict | join-path silently reconfigures the shared monitor (last-wins) | each session gets its own output (multi-view) | managed: one session; spawn: one gamescope per client |
| Stable identity | identity.rs: cert-fp → id 1..=15 (EDID serial + ConnectorIndex), LRU, persisted pf-vdisplay-identity.json | none: KWin output always named punktfunk, sway HEADLESS-N, Mutter auto-serial | n/a |
| Multi-monitor | manager is single-monitor (driver supports 16 connectors) | N outputs happen to coexist (multi-view), no layout/group semantics | single-output nested session |

Design consequence: the plan is not "build a manager" — it's (a) extract the state machine Windows already proved into a platform-neutral, unit-testable core, (b) give Linux the ownership split it's missing (manager owns the keepalive, session holds a lease), (c) put a typed policy in front of both, (d) extend identity to Linux where the compositor allows it, and (e) grow the slot model into display groups so multi-monitor is an arrangement of slots, not a new system.

3. Architecture

Three new pieces, layered strictly above the VirtualDisplay trait (no backend rewrite):

                       ┌────────────────────────────────────────────┐
   mgmt API / console  │  DisplayPolicy  (vdisplay/policy.rs)       │  pure config: schema,
   host.env compat ───▶│  presets · layout · validation · persist   │  presets, env-compat
                       └───────────────┬────────────────────────────┘
                                       │ read per acquire/release (live-reload)
                       ┌───────────────▼────────────────────────────┐
   punktfunk/1 session │  DisplayRegistry (vdisplay/registry.rs)    │  host-lifetime singleton:
   GameStream session ─▶  acquire(identity, mode) → DisplayLease    │  owns ManagedDisplay slots
   mgmt /display/state │  release(lease) · linger timer · groups    │  grouped per desktop,
                       └───────┬────────────────────────┬───────────┘  drives the pure Lifecycle
                               │ create()/drop keepalive │ reconfigure/topology/layout ops
                  ┌────────────▼──────────┐   ┌──────────▼───────────────┐
                  │ Linux backends        │   │ Windows                  │
                  │ kwin · gamescope ·    │   │ VirtualDisplayManager    │
                  │ mutter · wlroots      │   │ (existing; delegates its │
                  │ (unchanged trait)     │   │ state decisions upward)  │
                  └───────────────────────┘   └──────────────────────────┘
  • vdisplay/policy.rs — the typed config (DisplayPolicy), preset expansion, JSON persistence (<config>/display-settings.json, the gpu-settings.json pattern: sanitize on load, atomic tmp+rename write), and the deprecated-env-knob mapping. 100 % pure and unit-tested (the pick_gamescope_mode / wiring_plan.rs discipline).
  • vdisplay/lifecycle.rs — the pure state machine: per-slot Idle / Active{refs} / Lingering{until} / Pinned plus the admission decision function (given: policy, requesting identity, requested mode(s), current slots → Create | Reuse | Reconfigure | Join{at_mode} | Steal{victims} | Reject{reason}). No I/O, no OS types — fully proptest/unit-testable, shared verbatim by both platforms. Pinned is Lingering with no deadline (keep-alive forever), releasable only via mgmt/teardown.
  • vdisplay/registry.rs — the host-lifetime singleton that owns ManagedDisplay slots (the backend VirtualOutput including its keepalive, the identity slot, current mode, group membership, topology-restore state) and executes the lifecycle decisions: calls VirtualDisplay::create, holds keepalives past session end, runs the linger timer, applies layout, exposes the mgmt snapshot. On Windows it wraps the existing VirtualDisplayManager (which keeps its driver/CCD/preempt specifics — the IDD dead-swapchain preempt, the WUDFHost-death preempt, begin_idd_setup — but reads its linger duration and join/steal behavior from the policy instead of env/hardcode).
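As a rough illustration of the pure per-slot state machine described for vdisplay/lifecycle.rs, the sketch below models the four states and three transitions (acquire, release, linger tick). The function names, the `KeepAlive` enum shape, and passing `now` in as a parameter are assumptions made for this example, not the shipped API.

```rust
use std::time::{Duration, Instant};

/// Keep-alive setting (hypothetical shape mirroring "off" / duration / "forever").
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum KeepAlive {
    Off,
    Duration(Duration),
    Forever,
}

/// Per-slot state as named in the design text.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SlotState {
    Idle,
    Active { refs: u32 },
    Lingering { until: Instant },
    Pinned,
}

/// A session attaches: an Active slot gains a ref, any other state becomes Active{1}.
pub fn on_acquire(state: SlotState) -> SlotState {
    match state {
        SlotState::Active { refs } => SlotState::Active { refs: refs + 1 },
        _ => SlotState::Active { refs: 1 },
    }
}

/// A lease is released at `now`. Only the last release consults the policy.
pub fn on_release(state: SlotState, keep_alive: KeepAlive, now: Instant) -> SlotState {
    match state {
        SlotState::Active { refs } if refs > 1 => SlotState::Active { refs: refs - 1 },
        SlotState::Active { .. } => match keep_alive {
            KeepAlive::Off => SlotState::Idle,
            KeepAlive::Duration(d) => SlotState::Lingering { until: now + d },
            KeepAlive::Forever => SlotState::Pinned,
        },
        // Releasing a slot that is not Active is a no-op.
        other => other,
    }
}

/// Linger timer: a Lingering slot whose deadline has passed becomes Idle.
/// Pinned has no deadline, so a tick never changes it.
pub fn on_tick(state: SlotState, now: Instant) -> SlotState {
    match state {
        SlotState::Lingering { until } if now >= until => SlotState::Idle,
        other => other,
    }
}
```

Because the clock is an argument rather than read inside the functions, a unit test can drive the linger deadline deterministically.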

The ownership split (the one real refactor)

Today capture::capture_virtual_output(vout, …) consumes the whole VirtualOutput — the capturer owns the keepalive, so capturer drop tears the display down. That coupling is exactly what makes keep-alive impossible on Linux. Split it:

pub struct DisplayLease { /* registry handle + gen stamp; Drop = release(refcount--) */ }
pub struct CaptureSource {          // what capture actually needs — Copy-ish, no ownership
    pub node_id: u32,
    pub remote_fd: Option<OwnedFd>, // Mutter portal daemon (dup'd per capture attach)
    pub preferred_mode: Option<(u32, u32, u32)>,
    #[cfg(windows)] pub win_capture: Option<WinCaptureTarget>,
}
// registry.acquire(...) -> (DisplayLease, CaptureSource)

The keepalive: Box<dyn Send> moves into ManagedDisplay inside the registry. The session's pipeline holds the DisplayLease (mirrors the Windows MonitorLease, gen-stamped so a stale lease from a preempted display is a release-no-op — the proven pattern). build_pipeline's vd.create(mode) call sites (punktfunk1.rs, gamestream/stream.rs, spike.rs) become registry::acquire(...). Every failure/retry path keeps its shape — the retry-hold lease trick in build_pipeline_with_retry maps 1:1 onto a DisplayLease.
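The gen-stamped lease idea can be sketched in a few lines. This is a minimal model under stated assumptions: `SlotCore`, `acquire`, `preempt`, and `refs` are hypothetical names, and a real registry would hold the backend keepalive next to the counters.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical slot bookkeeping: a generation counter plus a refcount.
#[derive(Default)]
pub struct SlotCore {
    generation: u64,
    refs: u32,
}

/// Lease handed to a session. It remembers the generation it was issued under.
pub struct DisplayLease {
    slot: Arc<Mutex<SlotCore>>,
    generation: u64,
}

/// Take a lease on the slot (refcount++), stamped with the current generation.
pub fn acquire(slot: &Arc<Mutex<SlotCore>>) -> DisplayLease {
    let mut core = slot.lock().unwrap();
    core.refs += 1;
    DisplayLease { slot: Arc::clone(slot), generation: core.generation }
}

/// Preempt the display: bump the generation so every outstanding lease is stale.
pub fn preempt(slot: &Arc<Mutex<SlotCore>>) {
    let mut core = slot.lock().unwrap();
    core.generation += 1;
    core.refs = 0;
}

/// Current refcount, for inspection.
pub fn refs(slot: &Arc<Mutex<SlotCore>>) -> u32 {
    slot.lock().unwrap().refs
}

impl Drop for DisplayLease {
    fn drop(&mut self) {
        if let Ok(mut core) = self.slot.lock() {
            // A lease from an older generation must not touch the new refcount.
            if core.generation == self.generation && core.refs > 0 {
                core.refs -= 1;
            }
        }
    }
}
```

Dropping a lease issued before a preempt leaves the new generation's refcount untouched, which is the release-no-op property the text relies on.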

Re-capture on reuse is per-backend (see §7): wlroots re-runs portal capture of the still-existing output; KWin/Mutter reconnect a PipeWire consumer to the kept node (validation item); gamescope re-discovers the nested compositor's node; Windows already re-targets. If re-capture of a kept display fails, the registry falls back to teardown + fresh create (bounded, inside the existing build_pipeline_with_retry budget). Keep-alive is an optimization, never a new failure mode.

4. The configuration surface

4.1 Schema (<config>/display-settings.json)

{
  "version": 1,
  // Convenience: a named preset. "custom" (or absent) = the explicit fields below rule.
  // When a preset IS named, the fields below are ignored (the console writes one or the other).
  "preset": "custom",

  // How long a display (and, on gamescope, the nested session + game) survives after the last
  // session detaches. "off" = teardown at session end. "forever" = until host stop / explicit
  // release. Duration is seconds.
  "keep_alive": { "mode": "duration", "seconds": 300 },   // "off" | {"duration", seconds} | "forever"

  // What the host does to the box's display topology while virtual displays are up:
  //   "extend"     add the virtual display(s), touch nothing else
  //   "primary"    make the group's primary virtual display the OS primary; physical outputs
  //                 stay enabled
  //   "exclusive"  the managed virtual displays become the ONLY enabled outputs (physicals
  //                 disabled, restored when the group's last display is torn down)
  //   "auto"       today's behavior: exclusive on the auto-detected desktop path & Windows,
  //                 extend when the operator pinned a compositor/env said otherwise
  "topology": "auto",

  // Admission when a client connects while another client's display/session is live and the
  // requested mode differs (same-client reconnect ALWAYS reuses/reconfigures its own display):
  //   "separate"  give the new client its own virtual display ON THE SAME DESKTOP (bounded by
  //                max_displays) — this is also the "many clients as monitors" mode, see §6A
  //   "steal"     stop the existing session(s), tear down / reconfigure, serve the new client
  //   "join"      admit the new client AT THE EXISTING MODE (Welcome/serverinfo reflect the
  //                real mode — the honest-downgrade convention); never reconfigures under a
  //                live session
  //   "reject"    refuse the new client with a clear handshake error
  "mode_conflict": "separate",

  // Stable display identity → desktop environments persist per-display config (KDE scaling):
  //   "shared"           one identity for everything (today's Linux behavior)
  //   "per-client"       one identity per paired client cert fingerprint (today's Windows);
  //                       a multi-display client (§6B) gets one identity per (client, display #)
  //   "per-client-mode"  one identity per (client, WxH) — distinct scaling per resolution,
  //                       at the cost of identity slots (Windows has 15; LRU eviction)
  "identity": "per-client",

  // How the group's displays are arranged in the desktop coordinate space (§6.2):
  //   "auto-row"  left-to-right in acquire order, top-aligned (deterministic default);
  //                a §6B client's own monitor-arrangement hints override auto placement
  //   "manual"    per-identity-slot offsets below (console-arranged); wins over client hints
  "layout": { "mode": "auto-row", "positions": { /* "<slot>": {"x": 0, "y": 0} */ } },

  // Upper bound on simultaneously-live virtual displays (Active + Lingering + Pinned, across
  // the whole group). Admission returns Reject/Steal (per mode_conflict) when full; a §6B
  // AddDisplay beyond it is declined. Windows is additionally capped by the driver (see §7).
  "max_displays": 4
}

Deliberate non-options (rejected):

  • Per-client policy overrides — real, but v2. One host-global policy first; the schema keys are chosen so a later "clients": {"<fp>": {…}} overlay is additive.
  • Idle timeout for Pinned displays ("forever but tear down after 24 h") — keep_alive already expresses it as a long duration; don't add a second axis.
  • Choosing the linger for capture-loss separately from clean disconnect — the registry only sees "last lease released"; the session layer already distinguishes and (see §5.1) an explicit client quit bypasses keep-alive entirely.
  • Per-display FEC/bitrate policy knobs — bitrate stays session-negotiated per stream as today; a multi-display session's per-display bitrates are the client's ask, not host policy.

4.2 Precedence & live-reload

Precedence is display-settings.json (console-written) > deprecated env knobs > built-in defaults, the same convention the GPU preference established (console preference > PUNKTFUNK_RENDER_ADAPTER > auto). The policy is read at each acquire/release, not once at startup (it's file/registry state, not env, so there is no HostConfig constraint), so a console change applies to the next connect/disconnect without a host restart, the same contract as the GPU card ("applies to the next session"). Env-knob compatibility mapping (all logged as deprecated when they take effect):

| Legacy knob | Maps to |
|---|---|
| PUNKTFUNK_MONITOR_LINGER_MS | keep_alive = duration(ms/1000) (Windows) |
| PUNKTFUNK_NO_ISOLATE | topology = "extend" (Windows) |
| PUNKTFUNK_KWIN_VIRTUAL_PRIMARY / PUNKTFUNK_MUTTER_VIRTUAL_PRIMARY | topology = "exclusive" when truthy, "extend" when explicitly 0 |
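The precedence chain and the knob mapping could be expressed as one pure function. The sketch below is illustrative only: the `Policy` struct is cut down to two fields, the environment is injected as a closure so the function stays testable, a settings file is assumed to win wholesale rather than field by field, and the set of strings treated as "truthy" is a guess.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum KeepAlive { Off, Seconds(u64), Forever }

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Topology { Auto, Extend, Primary, Exclusive }

/// Cut-down policy for illustration (the real schema has more fields).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Policy { pub keep_alive: KeepAlive, pub topology: Topology }

/// Built-in defaults: 10 s linger, topology "auto".
pub const DEFAULTS: Policy = Policy { keep_alive: KeepAlive::Seconds(10), topology: Topology::Auto };

/// Assumption: which strings count as truthy is not specified in the design.
fn truthy(v: &str) -> bool {
    matches!(v.trim().to_ascii_lowercase().as_str(), "1" | "true" | "yes" | "on")
}

/// settings file > deprecated env knobs > built-in defaults.
pub fn resolve(file: Option<Policy>, env: &dyn Fn(&str) -> Option<String>, windows: bool) -> Policy {
    if let Some(policy) = file {
        return policy; // a console-written file wins outright
    }
    let mut p = DEFAULTS;
    if windows {
        if let Some(ms) = env("PUNKTFUNK_MONITOR_LINGER_MS").and_then(|v| v.trim().parse::<u64>().ok()) {
            p.keep_alive = KeepAlive::Seconds(ms / 1000);
        }
        if env("PUNKTFUNK_NO_ISOLATE").as_deref().map(truthy).unwrap_or(false) {
            p.topology = Topology::Extend;
        }
    } else {
        for knob in ["PUNKTFUNK_KWIN_VIRTUAL_PRIMARY", "PUNKTFUNK_MUTTER_VIRTUAL_PRIMARY"] {
            match env(knob).as_deref().map(str::trim) {
                Some("0") => p.topology = Topology::Extend,
                Some(v) if truthy(v) => p.topology = Topology::Exclusive,
                _ => {} // unset or unrecognized: keep the default
            }
        }
    }
    p
}
```

In production the closure would wrap `std::env::var(..).ok()`; a deprecation log line would be emitted wherever a knob changes the result.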

The apply_session_env default-on write of *_VIRTUAL_PRIMARY for the auto-desktop path is replaced by topology = "auto" resolving to exclusive on that path — one fewer process-env mutation on the connect path (a small win for the env-race surface ENV_LOCK guards).

4.3 Presets

Presets are the documented, supported entry point; raw fields are the escape hatch. Expansion lives in policy.rs and is unit-tested so docs and code can't drift.

| Preset | keep_alive | topology | mode_conflict | identity | layout | Story |
|---|---|---|---|---|---|---|
| default | 10 s | auto | separate | per-client | auto-row | Today's behavior, made explicit: short linger absorbs client hiccups/reconnects, streamed output is the sole desktop on the auto path, extra clients get their own view. |
| gaming-rig | forever | exclusive | steal | per-client | auto-row | Dedicated headless/couch box: the game and its display survive disconnects indefinitely; whoever connects takes the box over ("the TV model"). |
| shared-desktop | off | extend | separate | per-client | auto-row | Streaming a desktop someone may also use physically: never blank the real monitors, never keep ghost outputs, concurrent viewers each get a view. |
| hotdesk | 5 min | exclusive | reject | per-client-mode | auto-row | One user at a time with fast reattach (roaming between own devices); a second user is told the box is busy; each device+resolution keeps its own scaling. |
| workstation | 5 min | exclusive | separate | per-client | manual | The multi-monitor daily driver: your dual-monitor client gets both displays back exactly where you arranged them (§6B), or a tablet joins as a side monitor (§6A). |
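Preset expansion is a plain lookup, which is what makes it easy to unit-test against the table. The sketch below encodes the five rows; the enum and struct shapes are assumptions for illustration, and `"custom"` (or any unknown name) yields `None` so the explicit fields rule.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum KeepAlive { Off, Seconds(u32), Forever }
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Topology { Auto, Extend, Primary, Exclusive }
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ModeConflict { Separate, Steal, Join, Reject }
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Identity { Shared, PerClient, PerClientMode }
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum LayoutMode { AutoRow, Manual }

/// Hypothetical expanded form of a preset (field set assumed from the schema).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Expanded {
    pub keep_alive: KeepAlive,
    pub topology: Topology,
    pub mode_conflict: ModeConflict,
    pub identity: Identity,
    pub layout: LayoutMode,
}

/// Expand a named preset. Returns None for "custom" and unknown names.
pub fn expand_preset(name: &str) -> Option<Expanded> {
    let (keep_alive, topology, mode_conflict, identity, layout) = match name {
        "default" => (KeepAlive::Seconds(10), Topology::Auto, ModeConflict::Separate,
                      Identity::PerClient, LayoutMode::AutoRow),
        "gaming-rig" => (KeepAlive::Forever, Topology::Exclusive, ModeConflict::Steal,
                         Identity::PerClient, LayoutMode::AutoRow),
        "shared-desktop" => (KeepAlive::Off, Topology::Extend, ModeConflict::Separate,
                             Identity::PerClient, LayoutMode::AutoRow),
        "hotdesk" => (KeepAlive::Seconds(300), Topology::Exclusive, ModeConflict::Reject,
                      Identity::PerClientMode, LayoutMode::AutoRow),
        "workstation" => (KeepAlive::Seconds(300), Topology::Exclusive, ModeConflict::Separate,
                          Identity::PerClient, LayoutMode::Manual),
        _ => return None,
    };
    Some(Expanded { keep_alive, topology, mode_conflict, identity, layout })
}
```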

5. Option semantics in detail

5.1 keep_alive

What survives. The display (compositor output / IddCx monitor / spawned gamescope) and its topology state survive; the session (QUIC conn, capture stream, encoder, input devices, audio plumbing) does not. Concretely per backend, "the display survives" means:

  • kwin / mutter / wlroots: the output stays in the layout → windows don't reshuffle, a running game keeps rendering at the client's mode, reconnect is fast (no create/negotiate).
  • gamescope (bare spawn): the nested gamescope and the game launched inside it keep running — this is the headline user value (Sunshine/Apollo-style detach/reattach) and the reason keep_alive is worth building at all.
  • gamescope (managed): the policy duration replaces the hardcoded 5 s RESTORE_DEBOUNCE — the warm Steam session stays up for the window; forever means the TV session is never auto-restored (release via console/tray).
  • Windows: the existing linger, plus forever = the new Pinned state.

Rules.

  • Input devices (uinput pads, libei/EIS contexts) stay session-scoped — a disconnect reads to the game as "controller unplugged", which games handle. (Keeping pads alive for kept sessions is a possible later refinement; do not build it now.)
  • The launch command runs once per display creation, never per attach — a reconnect to a kept gamescope must not double-launch the game. Today launch already happens once per build_pipeline-successful session; the invariant moves with the create into the registry.
  • An explicit client quit (GameStream cancel/quit-app; a future punktfunk/1 EndSession{quit} control message — protocol growth, trailing-byte back-compat as usual) bypasses keep-alive: the user said "stop the game", so tear down now. Plain disconnects and connection losses honor the policy.
  • Host shutdown tears everything down (RAII on exit, as today). A host crash leaves whatever the OS reclaims — Wayland connections die with the process (compositor reclaims outputs), spawned gamescopes die with the process group, the pf-vdisplay watchdog reaps monitors when pings stop. No new orphan class.
  • keep_alive + topology=exclusive means physical monitors stay dark after disconnect until linger expiry / release. This is intended (gaming-rig) but must be loud in the docs, and the release-now escape hatch (§8) must exist in the same release that ships forever.

5.2 topology

Splits the currently-conflated "primary" knob into three honest levels, group-aware (§6.1): "exclusive" means the managed virtual displays are the only enabled outputs — never disable a sibling slot; restore fires when the group's last display drops. Per-backend mapping:

| Backend | extend | primary | exclusive |
|---|---|---|---|
| KWin | no-op | kscreen-doctor output.X.primary only | primary + disable non-managed others (today's apply_virtual_primary with a registry-driven filter, §6.1), restore-on-teardown |
| Mutter | no-op | ApplyMonitorsConfig incl. physicals, virtual primary | today's sole-monitor config (make_virtual_primary) extended to include all group members |
| wlroots | no-op | unsupported (no primary concept) → log + treat as extend | swaymsg output <phys> disable + re-enable on teardown (new, small) |
| gamescope | n/a: the nested session is the whole world; all three resolve to no-op | n/a | n/a |
| Windows | skip isolate (today's PUNKTFUNK_NO_ISOLATE) | CCD primary-only variant (new, small: set_active_mode already exists; primary without deactivation) | today's isolate_displays_ccd, extended to isolate to the SET of managed targets |

Restore stays bound to display teardown (keepalive drop / teardown()), not session end — already true everywhere; keep-alive inherits it for free. The KWin restore-before-reclaim ordering (re-enable others first so KWin never sees zero enabled outputs) is preserved.

auto resolves at acquire time: exclusive on Windows and on the Linux auto-detected-desktop path, extend under an explicit PUNKTFUNK_COMPOSITOR pin (the CI/test posture) — bit-for-bit today's defaults, so default preset = no behavior change.
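The `auto` resolution and the per-backend downgrades combine into a small decision function. The following is a sketch, not the shipped `resolve_topology`: the `Backend` enum, the `compositor_pinned` flag (standing in for "PUNKTFUNK_COMPOSITOR is set"), and the `NoOp` result for gamescope are assumptions made for the example.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Topology { Auto, Extend, Primary, Exclusive }

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Backend { KWin, Mutter, Wlroots, Gamescope, Windows }

/// What the backend is actually asked to do after resolution.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Resolved { NoOp, Extend, Primary, Exclusive }

pub fn resolve_topology(requested: Topology, backend: Backend, compositor_pinned: bool) -> Resolved {
    // gamescope: the nested session is the whole world, so every level is a no-op.
    if backend == Backend::Gamescope {
        return Resolved::NoOp;
    }
    let level = match requested {
        // auto: exclusive on Windows and on the auto-detected Linux desktop path,
        // extend under an explicit compositor pin.
        Topology::Auto => {
            if backend == Backend::Windows || !compositor_pinned {
                Resolved::Exclusive
            } else {
                Resolved::Extend
            }
        }
        Topology::Extend => Resolved::Extend,
        Topology::Primary => Resolved::Primary,
        Topology::Exclusive => Resolved::Exclusive,
    };
    // wlroots has no primary concept: downgrade to extend (the caller would log it).
    match (backend, level) {
        (Backend::Wlroots, Resolved::Primary) => Resolved::Extend,
        (_, other) => other,
    }
}
```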

5.3 mode_conflict

Enforced at admission, before the Welcome / RTSP launch, in the lifecycle decision function — so the client gets an honest answer, not a mid-build failure:

  • Applies only across different clients (identity ≠ identity). A same-client reconnect always preempts its own zombie session / adopts its own kept display and reconfigures it to the newly requested mode (today's behavior, now uniform on all platforms).
  • separate — allocate another slot in the desktop group (Linux multi-view today, upgraded with layout — §6A; Windows: requires the multi-monitor manager, §6.6 — until that stage lands, separate on Windows resolves to join with a startup + docs warning rather than silently doing something else).
  • join — the second client is admitted at the live display's mode. punktfunk/1: the Welcome's Config carries the real mode (the client already renders what the Welcome says — the 4:4:4/10-bit honest-downgrade pattern). GameStream: serverinfo/RTSP negotiate the live mode. This replaces the Windows join-path's silent last-wins reconfigure under a live session — that current behavior becomes opt-in as steal.
  • steal — signal the victim sessions' stop flags (the machinery begin_idd_setup already uses), wait the release grace, tear down or reconfigure, admit. Trust note: conflict policy runs after the pairing gate, so on a default host only paired clients can steal; on an --open/TOFU host any accepted client can — the docs call this out and recommend reject for open hosts.
  • reject: for punktfunk/1, a typed handshake refusal (extend the existing error path with a busy reason string carrying the live mode + client label so the client UI can say "host is streaming 2560×1440 to <client label>"); for GameStream, the 503/session-in-use answer Moonlight already understands.
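The admission rules above can be collapsed into a pure decision function. This sketch simplifies in several labeled ways: clients are identified by a hypothetical `u64` id, each live display has a single owner, the policy is applied whenever a different client's display is live (without special-casing an equal mode), and `separate` on a full host resolves to `Reject`.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Mode { pub w: u32, pub h: u32, pub hz: u32 }

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ModeConflict { Separate, Steal, Join, Reject }

/// A live display and the (hypothetical) id of the client that owns it.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct LiveDisplay { pub owner: u64, pub mode: Mode }

#[derive(Clone, Debug, PartialEq)]
pub enum Admission {
    Create,
    Reuse,
    Reconfigure,
    Join { at_mode: Mode },
    Steal { victims: Vec<u64> },
    Reject { reason: String },
}

pub fn admit(policy: ModeConflict, max_displays: usize, client: u64, want: Mode,
             live: &[LiveDisplay]) -> Admission {
    // Same-client reconnect: always adopt its own display, reconfiguring if needed.
    if let Some(own) = live.iter().find(|d| d.owner == client) {
        return if own.mode == want { Admission::Reuse } else { Admission::Reconfigure };
    }
    // Nothing live: plain create.
    let first = match live.first() {
        Some(d) => d,
        None => return Admission::Create,
    };
    match policy {
        ModeConflict::Separate if live.len() < max_displays => Admission::Create,
        ModeConflict::Separate => Admission::Reject { reason: "display limit reached".to_string() },
        ModeConflict::Steal => {
            let mut victims: Vec<u64> = live.iter().map(|d| d.owner).collect();
            victims.sort_unstable();
            victims.dedup();
            Admission::Steal { victims }
        }
        // Join admits at the live mode; it never reconfigures under a live session.
        ModeConflict::Join => Admission::Join { at_mode: first.mode },
        ModeConflict::Reject => Admission::Reject {
            reason: format!("host is streaming {}x{} to another client", first.mode.w, first.mode.h),
        },
    }
}
```

Running this before the Welcome / RTSP launch is what lets the client receive a typed answer instead of a mid-build failure.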

Interaction with --max-concurrent (session bound) is unchanged and orthogonal: sessions and displays are different resources; max_displays bounds displays, the accept-loop permit bounds in-flight sessions. join deliberately lets N sessions share one display (that's today's Windows concurrency model).

5.4 identity — stable displays, persistent scaling (the KDE ask)

Two halves: an identity map (who gets which slot) and a per-backend identity carrier (how a slot becomes something the DE keys its config on).

Map — generalize vdisplay/windows/identity.rs (it's already pure + unit-tested) into a platform-neutral vdisplay/identity.rs: key = client cert fp (plus display ordinal for a §6B multi-display client, plus WxH under per-client-mode), value = small stable slot id, LRU eviction at the platform cap, persisted <config>/display-identity.json (Windows migrates pf-vdisplay-identity.json on first load — read old path if new absent, write new). Anonymous/unpaired clients stay slot 0 = auto/shared. GameStream clients get identities too (improvement over today): the paired GameStream client cert fingerprint feeds the same map, so a Moonlight device also keeps its scaling — today set_client_identity is only wired on the punktfunk/1 path.
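A platform-neutral identity map with LRU eviction might look like the sketch below. It is not the existing identity.rs: the `IdentityKey` fields follow the key description above, a logical clock stands in for wall time, an evicted slot's id is handed to the newcomer, and persistence is omitted.

```rust
use std::collections::HashMap;

/// Key as described: cert fingerprint, display ordinal, optional WxH.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub struct IdentityKey {
    pub cert_fp: String,
    pub ordinal: u8,
    pub mode: Option<(u32, u32)>,
}

/// Maps keys to stable slot ids 1..=cap; slot 0 is the shared/anonymous slot.
pub struct IdentityMap {
    cap: u8,
    clock: u64,                             // logical clock for LRU ordering
    slots: HashMap<IdentityKey, (u8, u64)>, // key -> (slot id, last used)
}

impl IdentityMap {
    pub fn new(cap: u8) -> Self {
        assert!(cap >= 1, "need at least one identity slot");
        IdentityMap { cap, clock: 0, slots: HashMap::new() }
    }

    /// Returns the stable slot for `key`; `None` (unpaired client) maps to slot 0.
    pub fn slot_for(&mut self, key: Option<IdentityKey>) -> u8 {
        let key = match key {
            Some(k) => k,
            None => return 0,
        };
        self.clock += 1;
        if let Some(entry) = self.slots.get_mut(&key) {
            entry.1 = self.clock; // touch: this key is now most recently used
            return entry.0;
        }
        let id = if self.slots.len() < usize::from(self.cap) {
            // Below the cap: take the lowest id not yet in use.
            (1..=self.cap)
                .find(|id| !self.slots.values().any(|(used_id, _)| used_id == id))
                .expect("a free id exists while below the cap")
        } else {
            // At the cap: evict the least recently used key and reuse its id.
            let victim = self.slots.iter()
                .min_by_key(|entry| (entry.1).1)
                .map(|(k, _)| k.clone())
                .expect("map is non-empty at the cap");
            self.slots.remove(&victim).expect("victim present").0
        };
        self.slots.insert(key, (id, self.clock));
        id
    }
}
```

With a cap of 15 this reproduces the Windows slot range; under per-client-mode the same client at two resolutions simply occupies two keys.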

Carriers per backend:

  • Windows — shipped: slot → EDID serial + IddCx ConnectorIndex; Windows keys PerMonitorSettings (DPI scaling) on exactly that. Cap 15 (ConnectorIndex < MaxMonitorsSupported=16). per-client-mode and per-display ordinals work unchanged but burn slots faster — the LRU already handles pressure; document the trade-off.
  • KWin — the carrier is the output name: stream_virtual_output(name, …) becomes punktfunk-<slot> → output Virtual-punktfunk-<slot>. KWin persists per-output config (scale, transform, mode) in kwinoutputconfig.json, matching EDID-less outputs by name — so a stable per-client name is precisely what makes KDE reapply that client's scaling. Two validation items before relying on it (Stage 3 gate, §11):
    1. confirm KWin ≥ 6.5.6 actually persists + reapplies scale for Virtual-* outputs;
    2. confirm a remembered mode doesn't fight the freshly requested one (if KWin reapplies a stale stored mode on output-added, our existing set_custom_refresh/mode apply must run after and win; it already reads back the achieved mode, so a fight is at least visible).
    Side effect worth having: distinct names also unclash concurrent sessions (today two simultaneous KWin sessions both create Virtual-punktfunk, and set_custom_refresh / other_enabled_outputs match by that shared name, a latent multi-view bug this fixes).
  • wlroots — no rename and no settable description via IPC; headless outputs are HEADLESS-N by creation order. Identity is therefore not reliably carriable → declared unsupported (shared behavior regardless of setting; capability matrix + docs say so). The single-session case is de-facto stable (HEADLESS-1), which users can pin in sway config — document that recipe instead of pretending.
  • Mutter: RecordVirtual auto-generates the virtual monitor's serial; no public D-Bus surface to control it → unsupported for now. Note for later: re-evaluate Mutter's virtual-monitor D-Bus surface per GNOME release (tracked as an open item, not a promise).
  • gamescope — n/a: the client streams a whole nested session; scaling inside it is per-game.

Scale as a punktfunk-side option (small, high-value adjunct): KWin's stream_virtual_output takes a scale argument we currently hardcode to 1.0. Add an optional per-client default_scale (console-editable next to the device list) passed at create on KWin; on Windows scaling stays the OS's job (identity makes it persist). This gives HiDPI phones/tablets a correct-sized desktop on first connect, before any DE-side persistence exists. A client-requested scale hint in the Hello (trailing-byte back-compat, like the gamepad-pref byte) is future protocol growth; design it when a client actually wants to send it.

6. Multi-monitor

Two scenarios, deliberately separated because they differ ~10× in cost:

  • §6A — many clients, one desktop ("second screen"): each client device becomes one more monitor of the same host desktop (tablet as a side monitor next to the laptop's stream). Structurally this already half-exists on the Linux desktop compositors (separate gives every client its own output on the shared desktop); what's missing is intent: layout control, group-aware topology, and honest per-backend gating. No protocol change — it ships on the registry work.
  • §6B — one client, many displays: a client with two physical monitors gets two virtual displays, streamed as two video planes, presented one-per-monitor, arranged on the host to mirror the client's physical arrangement. Needs protocol growth, N encoder pipelines, client presenter work, and (on Windows) the multi-monitor manager. punktfunk/1-native only — GameStream/Moonlight has no multi-display vocabulary and stays single-stream.

6.1 Display groups (registry concept, serves both)

ManagedDisplay slots gain a group: the set of displays sharing one desktop/session.

  • kwin / mutter / wlroots: one group per compositor session — every acquired slot joins it (that is the shared desktop).
  • gamescope spawn: one group per spawned nested session. gamescope is single-output, so a §6B client asking for N displays there resolves to 1, honestly (the extra AddDisplays are declined).
  • Windows: one group (the desktop); slots = IddCx monitors (§6.6).

Group-aware semantics — these fix latent issues even before multi-monitor ships:

  • exclusive disables only non-managed (physical/bootstrap) outputs, never group members. Today's KWin apply_virtual_primary disables "everything not named Virtual-punktfunk" — under Stage-3 per-slot names, a second session's exclusive would disable the first session's live output. The filter must consult the registry (the set of managed output names), not one hardcoded name. Same shape on Windows (isolate_displays_ccd isolates to the managed target set) and Mutter (the sole-monitor config includes all group members).
  • primary designates one group member — for §6B the client marks which of its displays is primary (its OS already knows); for §6A the first slot wins unless the console re-designates.
  • Topology restore is per-group, not per-display — the saved pre-stream config is restored when the group's last member drops, never while siblings live. (Windows SavedConfig and the KWin restore vec move from Monitor/StopGuard into the group record.)
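The group-aware exclusive filter and the per-group restore can be sketched together. Output names are plain strings here, and the `Group` struct with its `add_member` / `remove_member` methods is a hypothetical stand-in for the group record in the registry.

```rust
use std::collections::HashSet;

/// Outputs to disable for "exclusive": every enabled output that is NOT managed.
pub fn outputs_to_disable(enabled: &[String], managed: &HashSet<String>) -> Vec<String> {
    enabled.iter().filter(|name| !managed.contains(*name)).cloned().collect()
}

/// Hypothetical group record: managed members plus the outputs disabled on their behalf.
#[derive(Default)]
pub struct Group {
    managed: HashSet<String>,
    disabled: Vec<String>,
}

impl Group {
    /// Register a new member and return the outputs the backend should disable now.
    /// Sibling members are never in the returned list.
    pub fn add_member(&mut self, name: &str, enabled_now: &[String]) -> Vec<String> {
        self.managed.insert(name.to_string());
        let to_disable = outputs_to_disable(enabled_now, &self.managed);
        self.disabled.extend(to_disable.iter().cloned());
        to_disable
    }

    /// Drop a member. Returns the outputs to re-enable, which is empty until the
    /// group's LAST member goes away (restore is per-group, not per-display).
    pub fn remove_member(&mut self, name: &str) -> Vec<String> {
        self.managed.remove(name);
        if self.managed.is_empty() {
            std::mem::take(&mut self.disabled)
        } else {
            Vec::new()
        }
    }
}
```

The filter consults the whole managed set, so a second session going exclusive cannot disable the first session's live output.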

6.2 Layout

The layout policy block (§4.1) controls where group members sit in the desktop space:

  • auto-row (default): left-to-right in acquire order, top-aligned — what compositors mostly do anyway, made deterministic.
  • manual: per-identity-slot offsets, console-edited (an OS-settings-style drag mini-map is the stretch UI; an x/y table ships first). Keyed by identity slot, so client B's tablet always reappears to the right of client A's monitor — layout + identity compose.
  • A §6B client sends its real monitor arrangement as per-display position hints; they override auto-row (mouse crossing between streamed monitors then matches the client's physical layout) but lose to manual pins.
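The placement precedence (manual pin, then client hint, then auto-row) fits in one loop. In this sketch the `Member` struct and `place` function are illustrative names, members arrive in acquire order, and overlap detection between hinted and auto-placed displays is deliberately left out.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq)]
pub enum LayoutMode { AutoRow, Manual }

/// One group member, listed in acquire order.
pub struct Member {
    pub slot: u8,                 // identity slot, the key for manual pins
    pub width: i32,               // mode width in desktop pixels
    pub hint: Option<(i32, i32)>, // position hint from a multi-display client
}

/// Returns (slot, (x, y)) per member.
/// Precedence: manual pin (only in Manual mode) > client hint > auto-row.
pub fn place(mode: LayoutMode, members: &[Member],
             pins: &HashMap<u8, (i32, i32)>) -> Vec<(u8, (i32, i32))> {
    let mut next_x = 0; // running left edge of the auto row, top-aligned at y = 0
    let mut out = Vec::with_capacity(members.len());
    for m in members {
        let pinned = if mode == LayoutMode::Manual {
            pins.get(&m.slot).copied()
        } else {
            None
        };
        let pos = pinned.or(m.hint).unwrap_or((next_x, 0));
        next_x += m.width;
        out.push((m.slot, pos));
    }
    out
}
```

Keying pins by identity slot is what lets a given client's display reappear in the same place across reconnects.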

Backend mapping — all existing tooling, no new protocols: KWin kscreen-doctor output.X.position.x,y (validate syntax the way set_custom_refresh did); wlroots swaymsg output <n> position X Y; Mutter logical-monitor positions in the same ApplyMonitorsConfig we already build; Windows CCD source origins in the same SetDisplayConfig path isolate_displays_ccd uses.

Host-side input routing. §6A needs nothing (N clients inject into one desktop — already true today). §6B needs the injectors to map (display, x, y) → desktop coordinates using the group layout: per-backend work items — libei absolute positioning is per-region, the wlr virtual-pointer protocol binds to an output, Windows SendInput absolute is desktop-normalized (pure math off the group layout). Wire change in §6.3.
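The "pure math off the group layout" step for desktop-normalized injection can be sketched as two functions: one translating a display-local point into desktop coordinates, one normalizing it to a 0..=65535 range over the layout's bounding box. The `Rect` type and function names are hypothetical, and the exact normalization formula (dividing by span minus one) is an assumption rather than a statement about what any injector uses.

```rust
/// A group member's rectangle in desktop coordinates.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Rect { pub x: i32, pub y: i32, pub w: i32, pub h: i32 }

/// Map a point local to display `display` into desktop coordinates.
/// Returns None for an unknown display or a point outside that display.
pub fn to_desktop(layout: &[Rect], display: usize, x: i32, y: i32) -> Option<(i32, i32)> {
    let r = layout.get(display)?;
    if x < 0 || y < 0 || x >= r.w || y >= r.h {
        return None;
    }
    Some((r.x + x, r.y + y))
}

/// Normalize a desktop point to 0..=65535 across the layout's bounding box.
pub fn normalize(layout: &[Rect], point: (i32, i32)) -> Option<(u16, u16)> {
    let left = layout.iter().map(|r| r.x).min()?;
    let top = layout.iter().map(|r| r.y).min()?;
    let right = layout.iter().map(|r| r.x + r.w).max()?;
    let bottom = layout.iter().map(|r| r.y + r.h).max()?;
    let scale = |v: i32, lo: i32, hi: i32| -> Option<u16> {
        let span = i64::from(hi - lo) - 1; // last addressable pixel maps to 65535
        if span <= 0 || v < lo || v >= hi {
            return None;
        }
        Some((i64::from(v - lo) * 65535 / span) as u16)
    };
    Some((scale(point.0, left, right)?, scale(point.1, top, bottom)?))
}
```

For a 1920×1080 display at the origin with a 1280×800 display to its right, the point (640, 400) on display 1 lands at desktop (2560, 400).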

Two realities to document, not engineer around: cursor rendering is already correct (every backend embeds the cursor per-output — KWin POINTER_EMBEDDED, the IDD's per-monitor composition — so it appears only on the stream it's on and "crosses" between monitors naturally), and a §6A desktop has one cursor shared by all member clients — exactly right for the one-user-two-devices case (touch the tablet, the cursor jumps there), chaotic for two people; genuinely independent users want gamescope multi-user (design/gamescope-multiuser.md), not groups.

6.3 Protocol growth for §6B (punktfunk/1 only)

Principle: a display is one data-plane instance. Don't touch the hardened core packet format — N displays = N × (encoder + send thread + core Session over its own UDP flow), one shared QUIC control connection, one set of session-scoped side planes (audio, mic, rumble, input). And don't grow the Hello: the handshake's back-compat idiom is single trailing bytes — a variable-length display list doesn't fit it, and it doesn't need to, because the control stream stays open after Start (Reconfigure/ClockProbe already ride it).

  • Capability: client advertises VIDEO_CAP_MULTI_DISPLAY (video_caps bit 0x10); the Welcome echoes the host's per-session display budget as one trailing byte (max_displays remaining, 0/absent = single-display host — old hosts are automatically honest).
  • Negotiation: the Hello/Welcome pair is untouched and establishes display 0 exactly as today (an old host serves a multi-monitor-capable client's primary display with zero special cases). Extra displays negotiate post-Start on the control stream: AddDisplay { mode, position_hint, primary: bool } → DisplayAdded { index, config /* the same honest per-display Config shape the Welcome carries: mode, bit depth, chroma, codec */ } or DisplayDeclined { reason }. RemoveDisplay { index } and a per-display Reconfigure (index as a trailing byte on the existing message) complete the set — client monitor hotplug maps 1:1 onto Add/Remove mid-session.
  • Data plane: DisplayAdded carries the flow binding (host UDP port / flow token) for that display's own core Session. Per-flow crypto derives the AES-GCM nonce salts per (direction, display index) — no salt reuse across flows; FEC domains are independent per flow (loss on one display can't stall another) — this is why "one Session per display" beats muxing display ids into the core packet format.
  • Side planes: pointer/touch events gain a display-index byte (same trailing-byte pattern as the gamepad pref; absent = display 0); 0xCF host-timing and 0xCE HDR-metadata datagrams gain the index the same way (a client mixing an HDR laptop panel + SDR external monitor gets per-display grades). Audio/mic/rumble/gamepad stay session-scoped, untouched.
  • Per-display honesty: each display negotiates bit depth/chroma/codec independently through the same resolve functions — a host that can afford HEVC Main10 on one head and only 4:2:0 on the second says so in each DisplayAdded.config.
  • Stats: the stats-unification vocabulary (four measurement points, p50/p95 windows) gains a display dimension — per-display series, HUD shows the focused display's equation (design/stats-unification.md gets a §6B addendum; don't invent client-local stats).
  • C ABI / connector: punktfunk_add_display / per-display next_au routing (an index out param on the existing call keeps the ABI additive), so PunktfunkKit/JNI stay on the shared connector.
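
The trailing-byte idiom used for the display index (absent byte = display 0) can be illustrated with a toy pointer event. The wire layout here is invented for the example (two little-endian u16 fields followed by an optional index byte) and is not the punktfunk/1 format; `PointerEvent`, `parse_pointer`, and `encode_pointer` are hypothetical names.

```rust
/// Length of the fixed part of the toy event: x and y as little-endian u16.
const POINTER_BASE_LEN: usize = 4;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PointerEvent {
    pub x: u16,
    pub y: u16,
    pub display: u8,
}

/// Parse a toy pointer event. A payload without the trailing byte
/// (what an old client would send) resolves to display 0.
pub fn parse_pointer(payload: &[u8]) -> Option<PointerEvent> {
    if payload.len() < POINTER_BASE_LEN {
        return None;
    }
    let x = u16::from_le_bytes([payload[0], payload[1]]);
    let y = u16::from_le_bytes([payload[2], payload[3]]);
    let display = payload.get(POINTER_BASE_LEN).copied().unwrap_or(0);
    Some(PointerEvent { x, y, display })
}

/// Encode a toy pointer event. The index byte is omitted for display 0,
/// so single-display traffic stays byte-identical to the old encoding.
pub fn encode_pointer(ev: &PointerEvent) -> Vec<u8> {
    let mut out = Vec::with_capacity(POINTER_BASE_LEN + 1);
    out.extend_from_slice(&ev.x.to_le_bytes());
    out.extend_from_slice(&ev.y.to_le_bytes());
    if ev.display != 0 {
        out.push(ev.display);
    }
    out
}
```

Omitting the byte for display 0 is a choice made for this sketch; always sending it would also parse correctly on a new host.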

6.4 Encoder & resource budget

N displays = N encode pipelines. NVENC consumer session caps — and the existing auto 2-way split-encode above ~1 Gpix/s consuming two NVENC sessions for one stream — mean admission must budget: DisplayAdded is granted only if the encoder backend confirms capacity (extend the existing NVENC session accounting + the AMF/QSV probes with a can_open_another() check), and split-encode is disabled for multi-display sessions (displays win over split; a 5K@240 single head is not the multi-monitor use case). max_displays bounds the group. Same idle-cost note as keep-alive: every added display composites + encodes at full rate. Bandwidth is per-display additive (two 4K heads ≈ 2× the bitrate): the per-host speed test's recommendation should be read per session and split across that session's displays — the client divides its ask, the host doesn't second-guess it (per-display bitrate is deliberately not host policy, §4.1).
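
A minimal sketch of the admission check described above, assuming the encoder backend exposes a boolean capacity probe. The design names `can_open_another()` but not its signature, so the trait shape, `SessionCap`, `Declined`, and `admit_display` are all hypothetical. The even bitrate split is likewise only one way a client might divide its ask.

```rust
/// Hypothetical capacity probe implemented by each encoder backend.
pub trait EncoderBudget {
    fn can_open_another(&self) -> bool;
}

/// Toy accounting for the example: a fixed cap on concurrent encode sessions.
pub struct SessionCap {
    pub in_use: u32,
    pub cap: u32,
}

impl EncoderBudget for SessionCap {
    fn can_open_another(&self) -> bool {
        self.in_use < self.cap
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum Declined {
    MaxDisplays,
    NoEncoderCapacity,
}

/// Decide whether an AddDisplay request may proceed.
/// `live` is the number of displays the group already has.
pub fn admit_display(live: u32, max_displays: u32, enc: &dyn EncoderBudget) -> Result<(), Declined> {
    // max_displays bounds the group first; it is the cheaper check.
    if live >= max_displays {
        return Err(Declined::MaxDisplays);
    }
    // Only then ask the backend whether another encode session fits.
    if !enc.can_open_another() {
        return Err(Declined::NoEncoderCapacity);
    }
    Ok(())
}

/// Client-side even split of a per-session bitrate across its displays
/// (an assumption; the client is free to weight displays differently).
pub fn per_display_kbps(session_kbps: u32, displays: u32) -> u32 {
    session_kbps / displays.max(1)
}
```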

6.5 Client staging for §6B

  • Linux GTK + Windows clients first — natural multi-window presenters: one window/fullscreen surface per display on the matching physical monitor, the existing capture state machine extended to span them (pointer crossing between our fullscreen windows must not release capture).
  • macOS second (multi-NSWindow across Screens; Spaces/fullscreen interplay is the risk).
  • Android/iOS/tvOS: never advertise the capability — single-display presenters. A phone or tablet still participates in multi-monitor via §6A (it is a second monitor), which needs nothing from those clients.

6.6 Windows multi-monitor manager

Previously an explicit non-goal; now a designed final stage — the single-monitor manager keeps working unchanged until it lands:

  • Manager: the singleton's MgrState becomes a map keyed by connector id; lifecycle.rs is already written per-slot, so the Windows manager's delegation doesn't change shape. The IDD reconnect preempts (dead-swapchain, WUDFHost-death) become per-slot.
  • Driver: pf-vdisplay already ADDs by connector id 1..=15 (the identity map's bound). The sealed frame channel (IOCTL_SET_FRAME_CHANNEL) must become per-monitor — channel messages carry the monitor id, reusing the multi-pad pad_index pattern (driver proto v3; design/idd-push-security.md addendum: same unnamed-object + handle-dup broker per ring). Driver work + CI + on-glass validation is exactly why this stage is last.
  • Capture/encode: one IDD-push capturer per monitor ring; budget per §6.4.
  • CCD: isolate/primary/layout already group-aware from §6.1/6.2.

7. Per-backend capability matrix

What each backend supports; unsupported cells resolve to the stated fallback and are surfaced in GET /api/v1/display/state per display ("capabilities": [...]) so the console can grey options out per-host instead of lying:

Capability KWin gamescope spawn gamescope managed gamescope attach Mutter wlroots Windows
keep-alive (linger/forever) hold the vout thread; re-attach PipeWire consumer to the kept node — validate nested session + game survive; re-discover node policy replaces the 5 s debounce — (never owned it) hold the D-Bus session; consumer re-attach — validate output persists; fresh portal capture per attach (cleanest) shipped; add Pinned
reconfigure kept display to a new mode set_custom_refresh + kscreen mode SIGKILL+respawn is the honest "reconfigure" (game restarts — docs say so) or decline → recreate existing managed-mode set ⚠ node is sized by negotiation; renegotiation unproven — fallback recreate output <n> mode --custom reconfigure() shipped
topology: primary n/a n/a n/a → extend (new, small)
topology: exclusive shipped (filter → group-aware) n/a n/a n/a shipped (→ group-aware) (new, small) shipped (→ group-aware)
mode_conflict: separate / §6A group multi-output one gamescope per client (independent sessions, no shared desktop) single session → steal/join/reject only assumed — validate ≥2 RecordVirtual monitors HEADLESS-N §6.6 (until then → join + warning)
§6B multi-display for one client N outputs + layout single-output (extra displays declined) ⚠ gated on the ≥2-monitor validation §6.6
layout (position control) kscreen position n/a n/a n/a ApplyMonitorsConfig output position CCD origins
stable identity output name per slot n/a n/a n/a (API gives no serial control) (no name control) shipped

The attach gamescope sub-mode never owns the display (it mirrors a foreign gamescope) — the registry records it as an unmanaged pass-through slot: no keep-alive, no topology, no identity, conflict = join-only. That's just codifying reality.

8. Management API, web console, tray

Endpoints (bearer-only, like /gpus; documented in mgmt.rs's OpenAPI → regenerate api/openapi.json):

  • GET /api/v1/display/settings → { settings, preset_expansions, capabilities } — the stored policy plus what this host's live backend can actually do (so the console renders accurate controls).
  • PUT /api/v1/display/settings — validate (unknown fields rejected, ranges clamped like the GPU PUT), persist atomically, log. Applies from the next acquire/release.
  • GET /api/v1/display/state → live slots:
    { "displays": [ { "slot": 3, "backend": "kwin", "output": "Virtual-punktfunk-3",
        "mode": "2560x1440@120", "state": "lingering", "expires_in_s": 240,
        "client": "a1b2c3…(label)", "display_index": 0, "sessions": 0,
        "group": 1, "position": {"x": 0, "y": 0}, "topology": "exclusive" } ] }
    
  • POST /api/v1/display/release { "slot": 3 } or {} (all) — immediately tear down Lingering/Pinned displays. Refuses Active (stopping a live session is session management, not display management — don't blur it).
  • PUT /api/v1/display/layout { "positions": { "<slot>": {"x":…, "y":…} } } — the manual arrangement (applies live to affected groups; persisted into the policy's layout block).
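
The release rule (tear down Lingering/Pinned, refuse Active) might be modeled as below. This is an illustrative sketch only: `Slot`, `SlotState`, `ReleaseError`, and `release` are hypothetical names, and it assumes that the release-all form skips Active slots rather than failing the whole request.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SlotState {
    Active,
    Lingering,
    Pinned,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Slot {
    pub id: u32,
    pub state: SlotState,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ReleaseError {
    /// The slot has a live session; stopping it is session management.
    Active(u32),
    UnknownSlot(u32),
}

/// Release one slot (`Some(id)`) or every releasable slot (`None`, the `{}` body).
/// Returns the ids that were torn down.
pub fn release(slots: &mut Vec<Slot>, target: Option<u32>) -> Result<Vec<u32>, ReleaseError> {
    match target {
        Some(id) => {
            let idx = slots
                .iter()
                .position(|s| s.id == id)
                .ok_or(ReleaseError::UnknownSlot(id))?;
            if slots[idx].state == SlotState::Active {
                return Err(ReleaseError::Active(id));
            }
            slots.remove(idx);
            Ok(vec![id])
        }
        None => {
            // Assumption: release-all leaves Active slots in place instead of erroring.
            let released: Vec<u32> = slots
                .iter()
                .filter(|s| s.state != SlotState::Active)
                .map(|s| s.id)
                .collect();
            slots.retain(|s| s.state == SlotState::Active);
            Ok(released)
        }
    }
}
```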

Web console (Host page, next to the GPU card): a Virtual displays card — preset selector (radio + one-line story each, custom unlocking the advanced fields), the live display list from /state with per-row "Release" buttons and a linger countdown, the arrangement editor (x/y table first, drag mini-map stretch), capability-aware disabled states. The loopback local/summary gains a displays_live count (counts only — the established no-secrets rule) so the tray tooltip can show "1 display kept alive" and offer a release-all action through the same elevation path as start/stop (Windows) / systemctl --user (Linux) — tray work is a stretch stage, not core.

9. Enforcement points (exact code paths)

  1. punktfunk/1 handshake (punktfunk1.rs, where the Hello is resolved into the Welcome): call registry::admit(identity, requested_mode) → on Reject answer the typed refusal; on Join the Welcome's Config carries the live mode; on Steal signal victims + wait release (bounded) before proceeding. This runs before SessionContext is built.
  2. virtual_stream / build_pipeline (punktfunk1.rs:3511, build_pipeline_with_retry): vd.create(mode) → registry::acquire(...) -> (DisplayLease, CaptureSource); the retry-hold lease keeps its exact semantics. The mid-stream Reconfigure, session-switch, and capture-loss rebuild paths re-acquire through the registry so a compositor switch correctly releases the old backend's slot and the new mode updates the slot's record.
  3. Control stream, post-Start (§6B): AddDisplay/RemoveDisplay handlers spawn/stop a per-display pipeline (its own registry::acquire, encoder, send thread, UDP flow) inside the same SessionContext lifetime; --max-concurrent counts sessions, not displays.
  4. GameStream (gamestream/stream.rs::open_gs_virtual_source): same acquire; identity from the paired client cert fp (new); quit-app → release(quit=true) which bypasses keep-alive.
  5. Session end: capturer drop (releases the PipeWire consumer / ring) then DisplayLease drop → lifecycle decides Linger/Pinned/teardown. On Linux the keepalive no longer rides the capturer (§3 ownership split).
  6. serve startup/shutdown: registry constructed once (like start_restore_worker), all slots torn down on graceful exit.
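
The decision taken when a DisplayLease drops (points 4 and 5 above) could be sketched as a pure function, in the spirit of the planned lifecycle.rs. Everything here is an assumption made for illustration: the `KeepAlive` variants, the `Next` outcomes, and the `on_release` signature are hypothetical and are not taken from the real policy schema.

```rust
use std::time::Duration;

/// Hypothetical keep-alive policy values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeepAlive {
    Off,
    Linger(Duration),
    Forever,
}

/// What the lifecycle does with a slot after a lease is released.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Next {
    /// Other leases still hold the slot; nothing changes.
    StayActive,
    Teardown,
    Linger(Duration),
    Pinned,
}

/// Decide the slot's next state after one lease drops.
/// `quit` models the GameStream quit-app path, which bypasses keep-alive.
pub fn on_release(policy: KeepAlive, remaining_leases: u32, quit: bool) -> Next {
    if remaining_leases > 0 {
        return Next::StayActive;
    }
    if quit {
        return Next::Teardown;
    }
    match policy {
        KeepAlive::Off => Next::Teardown,
        KeepAlive::Linger(d) => Next::Linger(d),
        KeepAlive::Forever => Next::Pinned,
    }
}
```

A function of this shape takes no clock and does no I/O, which is what would make property tests over arbitrary acquire/release interleavings practical.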

10. Documentation plan

A dedicated docs-site page docs-site/content/docs/virtual-displays.md (+ meta.json entry), cross-linked from configuration.md, host-cli.md, steamos-host.md, and troubleshooting.md. Structure — written for the operator, presets first:

  1. What punktfunk does with displays — 5 lines: per-client-sized virtual output, created on connect, what "keep alive"/"exclusive" mean physically.
  2. Pick a preset — the §4.3 table verbatim, each with a one-paragraph story and the JSON it expands to ("copy this into display-settings.json, or click it in the console").
  3. Options reference — one subsection per option: values, default, per-backend support badge row, and a concrete example scenario each ("You stream from your phone at 1080p and your TV at 4K120: with identity: per-client KDE remembers 150 % scaling for the phone and 100 % for the TV").
  4. Multi-monitor — the two scenarios in user language: "use your tablet as a second monitor" (§6A: connect a second device, arrange it in the console) and "stream your dual-monitor setup" (§6B: which clients support it, what the host does with the layout), plus the support matrix and the GameStream single-stream note.
  5. Persistent scaling (KDE/Windows) — the user-visible recipe: connect once, set scaling in System Settings / Windows Settings while streaming, done — punktfunk's stable identity makes the DE reapply it. Honest support table (KWin / Windows / GNOME why / Sway recipe).
  6. Troubleshooting — "my physical monitors stayed off" → release button/endpoint + the keep_alive×exclusive explanation; "second client gets the wrong resolution" → join semantics; "game restarted on reconnect" → gamescope reconfigure caveat; "second display declined" → encoder budget (§6.4); KWin/gamescope version floors.
  7. Legacy env knobs — the §4.2 mapping table, marked deprecated.

Also update: README.md status row, CLAUDE.md (status + invariant below), host.env.example (point at the JSON/console, list deprecated knobs), and the OpenAPI snapshot.

New design invariant for CLAUDE.md (once shipped): Display lifecycle is owned by the registry, policy-driven; sessions hold leases, never the keepalive. New backends implement VirtualDisplay + declare capabilities; they never grow their own lifecycle/env knobs. A display is one data-plane instance — multi-display never muxes into the core packet format.

11. Staged implementation

Each stage lands green (cargo test/clippy/fmt, OpenAPI drift check) and is independently shippable; on-glass validation notes inline. Heads-up for this box: the dev VM currently has no GPU passthrough (RTX 5070 Ti detached at the Proxmox level, 2026-07-01) — KWin-path live validation needs the GPU back or one of the LAN hosts (.248 GNOME / .48 Fedora KDE).

  • Stage 0 — policy + plumbing-lite. policy.rs (schema/presets/persist/env-compat, fully unit-tested), mgmt GET/PUT /display/settings, console card (settings only), docs page skeleton with the presets/options tables. Behavior deltas limited to what existing knobs can express: Windows linger reads the policy; Linux topology auto/extend/exclusive routes through the existing primary code. No lifecycle change yet — zero-risk adoption of the surface.
  • Stage 1 — lifecycle core + Linux keep-alive (easy backends). lifecycle.rs pure machine (+proptests: no lost teardowns, no double-frees across arbitrary acquire/release/expiry interleavings), registry.rs, the ownership split (DisplayLease/CaptureSource — the one cross-cutting refactor, touches capture_virtual_output signatures on both OSes), keep-alive live for wlroots and gamescope-spawn (the two backends where reuse is structurally trivial), /display/state + /display/release, console live-list. Windows manager delegates linger/pinned decisions to lifecycle.rs (its driver specifics untouched). Validate: sway on this box (headless), gamescope spawn: connect → disconnect → verify vkcube/game still runs → reconnect → same session, no relaunch.
  • Stage 2 — KWin/Mutter keep-alive + topology decoupling. Kept-node PipeWire re-attach on KWin and Mutter (each behind its validation; fallback recreate), primary (without disable) on KWin/Mutter/Windows, exclusive on wlroots, restore paths regression-tested. Validate: headless KDE session (the run-headless-kde.sh rig), GNOME box .248.
  • Stage 3 — identity. Platform-neutral identity map + migration, per-slot KWin output naming (+ the concurrent-session name-clash fix riding along), GameStream identity wiring, optional per-client-mode keying, per-client default_scale on KWin. Validate on KDE: connect client A → set 150 % scaling → disconnect → reconnect → scaling reapplied; client B unaffected; kwinoutputconfig.json inspected for the named entries.
  • Stage 4 — mode-conflict admission. Decision function wired into both handshakes, the typed punktfunk/1 busy refusal, GameStream 503 path, the Windows silent-reconfigure → join-default change (call it out in release notes — it's a behavior fix), steal victim signaling reusing the stop-flag plumbing. Validate: two probe clients loopback (--mode differing) under each policy value.
  • Stage 5 — §6A multi-client monitors. Display groups, group-aware exclusive/primary/restore (incl. the name-filter fix), layout auto-row + manual, /display/layout, console arrangement table. Cheap: rides Stages 1–3 infrastructure, no protocol change. Validate: two clients (probe + GTK) on the headless KDE box forming a 2-output desktop; drag a window across; disconnect one → its slot lingers per policy, sibling unaffected, restore only after both drop.
  • Stage 6 — §6B protocol + Linux host + GTK client. VIDEO_CAP_MULTI_DISPLAY, control-stream Add/Remove/DisplayAdded, per-flow nonce-salt derivation, per-display pipelines on KWin/wlroots, input display-index routing, C ABI additions, GTK client multi-window presenter, stats display dimension. Validate: loopback probe requesting 2 displays → two decodable .h265 outs + per-display 0xCF; then a real dual-monitor Linux client against the KDE box.
  • Stage 7 — Windows multi-monitor (§6.6: driver proto v3 per-monitor sealed rings, manager slot map, Windows client multi-window, separate un-gated on Windows) — gated on driver CI + on-glass, deliberately last.
  • Stage 8 — polish. Docs page finalized with real console screenshots, tray count/release (stretch), README/CLAUDE.md/host.env.example updates, local/summary count, macOS §6B presenter (its own mini-stage when scheduled).

12. Risks & open questions

  • PipeWire node reuse after consumer detach (KWin/Mutter) — the load-bearing unknown for Stage 2. If a kept node won't renegotiate for a fresh consumer, keep-alive on those backends degrades to "topology-stable but recreate-on-reconnect" (still valuable: no desktop reshuffle when paired with identity naming). The fallback is designed in, so the stage can't strand.
  • KWin persistence of Virtual-* output config — if KWin declines to persist virtual outputs, per-client scaling on KDE needs punktfunk-side scale storage instead (the default_scale adjunct already gives us the mechanism); identity naming stays worthwhile for the name-clash fix alone.
  • KWin stored-mode vs requested-mode fights under identity naming (§5.4) — mitigated by our post-create mode apply + read-back; watch for it in Stage 3 validation.
  • Compositor ceilings on simultaneous virtual outputs — load-bearing for §6A/§6B: probe KWin's virtual-output count and Mutter's RecordVirtual count (≥2 monitors) empirically in Stage 2/5; max_displays default 4 keeps us under any realistic ceiling.
  • Encoder session exhaustion (§6.4) — NVENC caps × split-encode × concurrent sessions must be budgeted in one place (the admission check), or a second display can silently break an unrelated session's encode. Split-encode is disabled for multi-display sessions by design.
  • Per-display input mapping — each Linux injector (libei, wlr, gamescope EIS) binds absolute coordinates differently; the §6B display-index routing is per-injector work with per-backend validation, not one generic patch.
  • Client-side multi-window fullscreen juggling (§6.5) — per-monitor DPI on Windows, Spaces on macOS, pointer capture across our own windows; the reason clients stage GTK/Windows first.
  • Idle kept displays burn resources — a kept gamescope keeps the game rendering (GPU) at full rate; a kept KWin output keeps compositing; every §6B display encodes at full rate. Document; a later refinement could drop a kept session's refresh, out of scope here.
  • Security posture — keep-alive keeps a user session composited/running unattended; nothing is unlocked that wasn't, and admission still rides pairing. steal on --open hosts is the one sharp edge → docs recommend reject there (§5.3). The mgmt endpoints are bearer-only; local/summary exposes counts only. §6B's extra UDP flows reuse the hardened core Session unchanged (per-flow salts derived, never reused) — no new crypto surface.
  • Mutter identity — blocked on GNOME API surface; re-check per GNOME release.