From 9775794ba533d867034904316c4c098505f45b8e Mon Sep 17 00:00:00 2001
From: enricobuehler
Date: Sun, 14 Jun 2026 23:53:45 +0000
Subject: [PATCH] docs: known limitations + follow-ups for the session-aware host
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Capture the deliberately-parked items after live-validating the
session-aware backend selector on the Bazzite F44 box (Desktop KDE +
Gaming both at the client's resolution, warm reuse, Feature B mid-stream
switch both directions).

Top follow-ups: (1) F44 gamescope teardown corrupts the GPU context (try
SIGKILL teardown, else keep the managed session warm); (2)
mid-stream-switch input is flaky until a reconnect (portal opens before
the systemd/D-Bus activation env settles — fix: import-environment on
switch); (3) the KWin virtual output isn't set primary.

Plus polish: input-loss window on switch, the recovered NVENC
invalid-param log, the 4090 HEVC ~800Mbps cap, restore-guard/keep-warm
interaction, and promoting Feature B from opt-in to default.

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 docs/session-aware-host-followups.md | 79 ++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100644 docs/session-aware-host-followups.md

diff --git a/docs/session-aware-host-followups.md b/docs/session-aware-host-followups.md
new file mode 100644
index 0000000..ae34be2
--- /dev/null
+++ b/docs/session-aware-host-followups.md
@@ -0,0 +1,79 @@
+# Session-aware host — known limitations & follow-ups
+
+Status: 2026-06-14. The host auto-detects the live session (Gaming / KDE / GNOME / wlroots) **per
+connect** and routes both video and input at it — managed gamescope at the client's resolution in
+Steam Gaming Mode, a KWin/Mutter virtual output at the client's resolution on a Desktop. A watcher
+(opt-in: `PUNKTFUNK_SESSION_WATCH=1`) follows a Gaming↔Desktop switch **mid-stream** and rebuilds the
+backend in place without a reconnect.
+
+Live-validated on the Bazzite F44 box (`bazzite-deck-nvidia:testing`, RTX 4090): Desktop KDE at
+5120×1440 + input; Gaming managed at 5120×1440; warm-session reuse on quick reconnect; Feature B
+video-switch both directions. The items below are **deliberately parked** — they have workarounds
+and/or are F44-specific.
+
+## High priority
+
+### 1. F44 gamescope teardown corrupts the GPU context
+Every gamescope teardown on this box (stop the autologin on connect; stop the managed session on
+restore) risks leaking the NVIDIA GPU context — surfaces as `CUDA_ERROR_ILLEGAL_STATE` (401) in
+`cuCtxCreate` / `vkCreateDevice` `VK_ERROR_INITIALIZATION_FAILED` (-3), then a black screen that
+**needs a reboot**. The 5 s debounced restore + the desktop restore-guard cut the teardown *count*
+but don't eliminate it. Options, in order of preference:
+- **SIGKILL the gamescope on teardown** instead of `systemctl stop` (SIGTERM). Hypothesis: skipping
+  gamescope's buggy SIGTERM teardown handler (the part that SIGSEGVs, exit 139) lets the process die
+  hard and the driver reclaim its GPU resources cleanly via normal process exit — no half-torn-down
+  context. Change `stop_autologin_sessions` + `stop_session` (`vdisplay/gamescope.rs`) to
+  `systemctl --user kill --signal=SIGKILL ` (+ a follow-up `stop`/`reset-failed` to clear unit
+  state). **Untested** — this is the first thing to try; it would preserve "managed client-res
+  gaming AND TV-shows-gaming-when-idle".
+- **Keep the managed session warm** (no per-disconnect restore): spawn once, reuse forever, never
+  tear down → ~1 teardown per host lifetime. Tradeoff: the TV is blank/idle when no client is
+  connected (the autologin is never restored; return to gaming manually).
+- Upstream gamescope/driver fix.
+
+### 2. Mid-stream-switch input (Gaming→Desktop) is flaky until a reconnect
+After a mid-stream switch to a desktop, pointer/keyboard often don't land until a disconnect+reconnect.
+Root cause: on the switch the host opens the KDE `RemoteDesktop` **portal** immediately, but the
+portal is a D-Bus-activated service that reads the **systemd `--user` activation environment**, which
+hasn't settled to the new session yet — so the portal session is created against a half-stale env. It
+*accepts* events (no error, so the injector's reopen-on-failure never fires) but they don't reach
+KDE. The host log even shows `libei: portal granted devices` + `device RESUMED`, yet input is dead.
+**Workaround: reconnect once after switching** (re-opens the portal against the settled env → works).
+Fix: on a session switch, push the new session env into the systemd/D-Bus activation environment
+before reopening input — `systemctl --user import-environment WAYLAND_DISPLAY XDG_CURRENT_DESKTOP
+DBUS_SESSION_BUS_ADDRESS` + `dbus-update-activation-environment` (exactly what a fresh KDE login does;
+see `scripts/headless/run-headless-kde.sh:118-119`) — and/or delay/retry the input reopen until the
+portal is settled.
+
+### 3. KWin virtual output is not set as primary
+The KWin virtual output isn't marked the primary display, so KDE panels / primary-screen content can
+stay on the (now-absent) original output instead of the streamed virtual screen. Follow-up: set the
+virtual output primary on create — `vdisplay/kwin.rs` already reads `PUNKTFUNK_KWIN_VIRTUAL_PRIMARY`;
+make it default-on (or always promote the new output to primary via the KWin/kscreen API) so the
+streamed screen is the primary desktop.
+
+## Lower priority / polish
+
+### 4. Mid-stream-switch input loss window (~6 s)
+During the libei portal setup on a switch, buffered input drops (`libei: DROP — no resumed device`,
+hundreds of events). Polish: pre-warm the portal, or hold events instead of dropping during the
+device-resume window.
+
+### 5. NVENC `InitializeEncoder failed: invalid param` (recovered)
+At 5120×1440@240 the first NVENC open fails with `invalid param (8)` and **recovers** via the 2-way
+split-encode path (the stream is live). Cosmetic but noisy — investigate the first-attempt failure /
+silence the log.
+
+### 6. NVENC HEVC bitrate cap (~800 Mbps on the RTX 4090)
+HEVC opens at the GPU's max (~800 Mbps) when a higher rate is requested (e.g. 1600). Not a bug;
+consider preferring AV1 when the client requests >~800 Mbps HEVC, and surface the cap in the
+speed-test / bitrate UI.
+
+### 7. Restore-guard / keep-warm model interaction
+`do_restore_tv_session`, when a desktop is active, still stops the idle managed gamescope (a teardown
+— leak risk per #1) and consumes `STOPPED_AUTOLOGIN` (so a later return-to-gaming won't auto-restore
+the TV session). Resolve together with the keep-warm decision in #1.
+
+### 8. Feature B is opt-in
+The mid-stream watcher is gated behind `PUNKTFUNK_SESSION_WATCH=1` pending broader validation. Promote
+to default-on once #2 (mid-stream input) lands and it's exercised on more boxes.
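
Review note on item 2: the proposed fix (push the new session's variables into the activation environment before reopening input) could look roughly like the sketch below. It is not the host's actual code. `SESSION_VARS`, `activation_env_commands` and `run_commands` are hypothetical names, and passing the variable names to `dbus-update-activation-environment` is an assumption, since the doc names the tool without arguments. Both tools copy values from the calling process's environment, so the host would need the new session's values in its own environment first.

```rust
use std::io;
use std::process::Command;

/// The three variables item 2 lists for `import-environment`.
const SESSION_VARS: [&str; 3] = [
    "WAYLAND_DISPLAY",
    "XDG_CURRENT_DESKTOP",
    "DBUS_SESSION_BUS_ADDRESS",
];

/// Hypothetical helper: builds the two argv lists from item 2 without running them.
/// Index 0 is the `systemctl --user import-environment ...` call, index 1 the
/// `dbus-update-activation-environment ...` call (its arguments are an assumption).
fn activation_env_commands(vars: &[&str]) -> Vec<Vec<String>> {
    let mut import: Vec<String> = ["systemctl", "--user", "import-environment"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    import.extend(vars.iter().map(|v| v.to_string()));

    let mut dbus = vec!["dbus-update-activation-environment".to_string()];
    dbus.extend(vars.iter().map(|v| v.to_string()));

    vec![import, dbus]
}

/// Hypothetical helper: runs each argv in order and stops at the first failure,
/// so the caller can skip or retry the portal reopen instead of opening it
/// against a stale activation environment.
fn run_commands(commands: &[Vec<String>]) -> io::Result<()> {
    for argv in commands {
        // Skip empty argv lists defensively; the builder above never emits one.
        let Some((program, args)) = argv.split_first() else {
            continue;
        };
        let status = Command::new(program).args(args).status()?;
        if !status.success() {
            return Err(io::Error::new(
                io::ErrorKind::Other,
                format!("{program} exited with {status}"),
            ));
        }
    }
    Ok(())
}
```

On a detected session switch the host would call `run_commands(&activation_env_commands(&SESSION_VARS))` and only then reopen the `RemoteDesktop` portal. Keeping the argv construction separate from execution lets the command lines be unit-tested without a live systemd user session.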