punktfunk/docs/session-aware-host-followups.md
enricobuehler 38cce754bd
docs: mark session-aware follow-ups #2 (switch input) + #3 (vout primary) resolved
Both landed in 3363576 and validated live on the Bazzite F44 box: a Gaming→Desktop
mid-stream switch shows `settled desktop portal env … compositor=kwin` →
`portal granted devices` → `device RESUMED` (input lands, no reconnect), and
`KWin: streamed output set as the sole desktop also_disabled=["HDMI-A-1"]` (panels
on the streamed screen). Remaining: #1 (F44 gamescope teardown GPU leak) + the
lower-priority polish.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 07:12:29 +00:00

# Session-aware host — known limitations & follow-ups
Status: 2026-06-14. The host auto-detects the live session (Gaming / KDE / GNOME / wlroots) **per
connect** and routes both video and input to it: managed gamescope at the client's resolution in
Steam Gaming Mode, or a KWin/Mutter virtual output at the client's resolution on a Desktop. A watcher
(opt-in: `PUNKTFUNK_SESSION_WATCH=1`) follows a Gaming↔Desktop switch **mid-stream** and rebuilds the
backend in place without a reconnect.
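As a rough illustration of the per-connect detection, the sketch below classifies a session into the four kinds named above. It assumes detection keys off an `XDG_CURRENT_DESKTOP`-style string; this document does not say how the host actually detects the session, and the `SessionKind` enum, the `classify_session` function, and the list of wlroots compositor names are all hypothetical.

```rust
/// The four session kinds named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionKind {
    Gaming,
    Kde,
    Gnome,
    Wlroots,
}

/// Classify a session from an `XDG_CURRENT_DESKTOP`-style value.
/// ASSUMPTION: the real host may use entirely different signals.
pub fn classify_session(current_desktop: &str) -> Option<SessionKind> {
    // The variable is a colon-separated list, e.g. "ubuntu:GNOME".
    for entry in current_desktop.split(':') {
        match entry.trim().to_ascii_lowercase().as_str() {
            "gamescope" => return Some(SessionKind::Gaming),
            "kde" => return Some(SessionKind::Kde),
            "gnome" => return Some(SessionKind::Gnome),
            // Illustrative subset of wlroots-based compositors.
            "sway" | "wayfire" | "labwc" => return Some(SessionKind::Wlroots),
            _ => {}
        }
    }
    None
}
```

An unrecognised value yields `None`, which leaves the caller free to fall back to an explicit configuration instead of guessing.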
Live-validated on the Bazzite F44 box (`bazzite-deck-nvidia:testing`, RTX 4090): Desktop KDE at
5120×1440 + input; Gaming managed at 5120×1440; warm-session reuse on quick reconnect; Feature B
(the mid-stream watcher) video switch in both directions.
## Resolved (2026-06-15, `3363576`)
- **#2 — mid-stream-switch input** ✅ `vdisplay::settle_desktop_portal()` pushes the live session env
into the systemd/D-Bus activation environment and restarts the KWin portal on a switch, so input
lands without a reconnect. Validated live: `settled desktop portal env … compositor=kwin` →
`libei: portal granted devices` → `device RESUMED` on a Gaming→Desktop mid-stream switch.
- **#3 — KWin/Mutter virtual output primary** ✅ `apply_session_env` defaults
`PUNKTFUNK_KWIN_VIRTUAL_PRIMARY` / `PUNKTFUNK_MUTTER_VIRTUAL_PRIMARY` on for the auto desktop path.
Validated live: `KWin: streamed output set as the sole desktop also_disabled=["HDMI-A-1"]` — panels
now render on the streamed screen.
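The #2 fix can be pictured as two shell-level steps: export the live session variables into the systemd/D-Bus activation environment, then restart the portal. The sketch below only builds the argument lists. It is not the body of `vdisplay::settle_desktop_portal()`; the function name `settle_portal_commands`, the choice of `dbus-update-activation-environment --systemd` for the export, and the portal unit name passed by the caller are assumptions.

```rust
/// Build the argv lists for (1) exporting session variables into the systemd
/// user manager and the D-Bus activation environment, and (2) restarting the
/// desktop portal unit. ASSUMPTION: the real code may use other mechanisms.
pub fn settle_portal_commands(vars: &[(&str, &str)], portal_unit: &str) -> Vec<Vec<String>> {
    let mut export = vec![
        "dbus-update-activation-environment".to_string(),
        "--systemd".to_string(),
    ];
    for (key, value) in vars {
        // Each variable is passed in VAR=VAL form.
        export.push(format!("{key}={value}"));
    }
    let restart: Vec<String> = ["systemctl", "--user", "restart", portal_unit]
        .iter()
        .map(|s| s.to_string())
        .collect();
    vec![export, restart]
}
```

Each returned list can be handed to `std::process::Command` (first element as the program, the rest as arguments). Which variables to export and which unit to restart depend on the compositor, so both are left to the caller.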
## Still parked
### 1. F44 gamescope teardown corrupts the GPU context
Every gamescope teardown on this box (stopping the autologin session on connect; stopping the managed
session on restore) risks leaking the NVIDIA GPU context. The leak surfaces as
`CUDA_ERROR_ILLEGAL_STATE` (401) from `cuCtxCreate`, or as `VK_ERROR_INITIALIZATION_FAILED` (-3) from
`vkCreateDevice`, followed by a black screen that **needs a reboot**. The 5 s debounced restore and the
desktop restore-guard reduce the teardown *count* but don't eliminate it. Options, in order of preference:
- **SIGKILL the gamescope on teardown** instead of `systemctl stop` (SIGTERM). Hypothesis: skipping
gamescope's buggy SIGTERM teardown handler (the part that SIGSEGVs, exit 139) lets the process die
hard and the driver reclaim its GPU resources cleanly via normal process exit — no half-torn-down
context. Change `stop_autologin_sessions` + `stop_session` (`vdisplay/gamescope.rs`) to
`systemctl --user kill --signal=SIGKILL <unit>` (+ a follow-up `stop`/`reset-failed` to clear unit
state). **Untested** — this is the first thing to try; it would preserve "managed client-res
gaming AND TV-shows-gaming-when-idle".
- **Keep the managed session warm** (no per-disconnect restore): spawn once, reuse forever, never
tear down → ~1 teardown per host lifetime. Tradeoff: the TV is blank/idle when no client is
connected (the autologin is never restored; return to gaming manually).
- Upstream gamescope/driver fix.

(#2 mid-stream-switch input and #3 virtual-output-primary are **resolved**; see the Resolved section above.)
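The first option (SIGKILL, then clear unit state) could be sketched as the command sequence below. This is a minimal, untested illustration, not the current contents of `vdisplay/gamescope.rs`; the function name `hard_teardown_commands` is hypothetical, and whether SIGKILL actually avoids the leak is the open hypothesis stated above.

```rust
/// Turn a slice of string slices into an owned argv.
fn argv(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Commands for a hard teardown of a systemd user unit:
/// 1. SIGKILL the unit's processes, skipping gamescope's SIGTERM handler;
/// 2. `stop` so systemd treats the unit as stopped;
/// 3. `reset-failed` to clear any failed state left by the kill.
/// ASSUMPTION: order and unit handling are illustrative only.
pub fn hard_teardown_commands(unit: &str) -> Vec<Vec<String>> {
    vec![
        argv(&["systemctl", "--user", "kill", "--signal=SIGKILL", unit]),
        argv(&["systemctl", "--user", "stop", unit]),
        argv(&["systemctl", "--user", "reset-failed", unit]),
    ]
}
```

Keeping the sequence as data makes it easy to log exactly what was run on the box when comparing teardown outcomes against the current `systemctl stop` path.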
## Lower priority / polish
### 4. Mid-stream-switch input loss window (~6 s)
During the libei portal setup on a switch, buffered input drops (`libei: DROP — no resumed device`,
hundreds of events). Polish: pre-warm the portal, or hold events instead of dropping during the
device-resume window.
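The "hold events instead of dropping" idea could look roughly like the bounded buffer below. It is a sketch under assumptions: the `HoldBuffer` type and its method names are hypothetical, the event type is left generic, and the drop-oldest policy on overflow is one choice among several.

```rust
use std::collections::VecDeque;

/// Holds input events while no device is resumed, then releases them in order.
pub struct HoldBuffer<E> {
    pending: VecDeque<E>,
    capacity: usize,
    resumed: bool,
}

impl<E> HoldBuffer<E> {
    pub fn new(capacity: usize) -> Self {
        Self { pending: VecDeque::new(), capacity, resumed: false }
    }

    /// Offer an event; returns the events that may be delivered right now.
    pub fn push(&mut self, event: E) -> Vec<E> {
        if self.resumed {
            return vec![event];
        }
        if self.capacity == 0 {
            return Vec::new(); // holding disabled: the event is dropped
        }
        if self.pending.len() == self.capacity {
            self.pending.pop_front(); // full: drop the oldest held event
        }
        self.pending.push_back(event);
        Vec::new()
    }

    /// Mark the device as resumed and return everything held so far.
    pub fn resume(&mut self) -> Vec<E> {
        self.resumed = true;
        self.pending.drain(..).collect()
    }

    /// Go back to holding, e.g. at the start of the next switch.
    pub fn suspend(&mut self) {
        self.resumed = false;
    }
}
```

Replaying several seconds of stale input can be worse than dropping it (a held key press arriving late, for example), so a real implementation would likely cap the hold by age as well as by count.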
### 5. NVENC `InitializeEncoder failed: invalid param` (recovered)
At 5120×1440@240 the first NVENC open fails with `invalid param (8)` and **recovers** via the 2-way
split-encode path (the stream is live). Cosmetic but noisy — investigate the first-attempt failure /
silence the log.
### 6. NVENC HEVC bitrate cap (~800 Mbps on the RTX 4090)
HEVC opens at the GPU's max (~800 Mbps) when a higher rate is requested (e.g. 1600 Mbps). Not a bug;
consider preferring AV1 when the client requests >~800 Mbps HEVC, and surface the cap in the
speed-test / bitrate UI.
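The suggested policy could be sketched as follows. The names `Codec`, `EncodePlan`, and `plan_encode` are hypothetical, the cap is passed in rather than hard-coded, and the sketch assumes AV1 can take the requested rate unchanged, which this document does not confirm.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Codec {
    Hevc,
    Av1,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EncodePlan {
    pub codec: Codec,
    pub bitrate_mbps: u32,
    /// True when the bitrate was clamped, so the UI can surface the cap.
    pub capped: bool,
}

/// Pick a codec and bitrate for a client request.
/// ASSUMPTION: AV1 is not subject to the HEVC cap.
pub fn plan_encode(
    requested: Codec,
    requested_mbps: u32,
    hevc_cap_mbps: u32,
    av1_available: bool,
) -> EncodePlan {
    let over_cap = requested == Codec::Hevc && requested_mbps > hevc_cap_mbps;
    if over_cap && av1_available {
        // Prefer AV1 rather than silently clamping HEVC.
        EncodePlan { codec: Codec::Av1, bitrate_mbps: requested_mbps, capped: false }
    } else if over_cap {
        // No AV1: clamp to the HEVC cap and flag it.
        EncodePlan { codec: Codec::Hevc, bitrate_mbps: hevc_cap_mbps, capped: true }
    } else {
        EncodePlan { codec: requested, bitrate_mbps: requested_mbps, capped: false }
    }
}
```

The `capped` flag is what the speed-test / bitrate UI would read to show that the requested rate was not honoured.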
### 7. Restore-guard / keep-warm model interaction
`do_restore_tv_session`, when a desktop is active, still stops the idle managed gamescope (a teardown
— leak risk per #1) and consumes `STOPPED_AUTOLOGIN` (so a later return-to-gaming won't auto-restore
the TV session). Resolve together with the keep-warm decision in #1.
### 8. Feature B is opt-in
The mid-stream watcher is gated behind `PUNKTFUNK_SESSION_WATCH=1` pending broader validation. Promote
to default-on once it's exercised on more boxes; #2 (mid-stream input) has already landed.
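For reference, an opt-in gate of this kind can be as small as the sketch below. The function names are hypothetical, and treating only the value `1` as "on" is an assumption; the host may accept other spellings.

```rust
/// Decide whether the watcher is enabled from the raw variable value.
/// ASSUMPTION: only "1" (ignoring surrounding whitespace) enables it.
pub fn session_watch_enabled(value: Option<&str>) -> bool {
    matches!(value.map(str::trim), Some("1"))
}

/// Read `PUNKTFUNK_SESSION_WATCH` from the process environment.
pub fn session_watch_from_env() -> bool {
    session_watch_enabled(std::env::var("PUNKTFUNK_SESSION_WATCH").ok().as_deref())
}
```

Promoting the feature to default-on would then mean changing the `None` case to `true` and keeping an explicit off value as the escape hatch.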