38cce754bd
Both landed in 3363576 and validated live on the Bazzite F44 box: a Gaming→Desktop
mid-stream switch shows `settled desktop portal env … compositor=kwin` →
`portal granted devices` → `device RESUMED` (input lands, no reconnect), and
`KWin: streamed output set as the sole desktop also_disabled=["HDMI-A-1"]` (panels
on the streamed screen). Remaining: #1 (F44 gamescope teardown GPU leak) + the
lower-priority polish.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
71 lines
4.3 KiB
Markdown
# Session-aware host — known limitations & follow-ups

Status: 2026-06-14. The host auto-detects the live session (Gaming / KDE / GNOME / wlroots) **per
connect** and routes both video and input to it: a managed gamescope at the client's resolution in
Steam Gaming Mode, or a KWin/Mutter virtual output at the client's resolution on a Desktop. A watcher
(opt-in: `PUNKTFUNK_SESSION_WATCH=1`) follows a Gaming↔Desktop switch **mid-stream** and rebuilds the
backend in place without a reconnect.

Live-validated on the Bazzite F44 box (`bazzite-deck-nvidia:testing`, RTX 4090): Desktop KDE at
5120×1440 + input; Gaming managed at 5120×1440; warm-session reuse on quick reconnect; Feature B
(the mid-stream watcher) video switch in both directions.

## Resolved (2026-06-15, `3363576`)

- **#2: mid-stream-switch input** ✅ `vdisplay::settle_desktop_portal()` pushes the live session env
  into the systemd/D-Bus activation environment and restarts the KWin portal on a switch, so input
  lands without a reconnect. Validated live: `settled desktop portal env … compositor=kwin` →
  `libei: portal granted devices` → `device RESUMED` on a Gaming→Desktop mid-stream switch.
- **#3: KWin/Mutter virtual output primary** ✅ `apply_session_env` turns
  `PUNKTFUNK_KWIN_VIRTUAL_PRIMARY` / `PUNKTFUNK_MUTTER_VIRTUAL_PRIMARY` on by default for the auto
  desktop path. Validated live:
  `KWin: streamed output set as the sole desktop also_disabled=["HDMI-A-1"]`; panels now render on
  the streamed screen.

## Still parked

### 1. F44 gamescope teardown corrupts the GPU context

Every gamescope teardown on this box (stopping the autologin session on connect; stopping the managed
session on restore) risks leaking the NVIDIA GPU context. The leak surfaces as
`CUDA_ERROR_ILLEGAL_STATE` (401) from `cuCtxCreate` or `VK_ERROR_INITIALIZATION_FAILED` (-3) from
`vkCreateDevice`, followed by a black screen that **needs a reboot**. The 5 s debounced restore and
the desktop restore-guard cut the teardown *count* but don't eliminate it. Options, in order of
preference:

- **SIGKILL the gamescope on teardown** instead of `systemctl stop` (SIGTERM). Hypothesis: skipping
  gamescope's buggy SIGTERM teardown handler (the part that SIGSEGVs, exit 139) lets the process die
  hard and the driver reclaim its GPU resources cleanly via normal process exit, with no
  half-torn-down context. Change `stop_autologin_sessions` + `stop_session` (`vdisplay/gamescope.rs`)
  to `systemctl --user kill --signal=SIGKILL <unit>` (+ a follow-up `stop`/`reset-failed` to clear
  unit state). **Untested**, but this is the first thing to try; it would preserve "managed
  client-res gaming AND TV-shows-gaming-when-idle".
- **Keep the managed session warm** (no per-disconnect restore): spawn once, reuse forever, never
  tear down → ~1 teardown per host lifetime. Tradeoff: the TV is blank/idle when no client is
  connected (the autologin is never restored; return to gaming manually).
- Upstream gamescope/driver fix.

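The SIGKILL option above could be sketched roughly as follows. This is a minimal illustration, not the code in `vdisplay/gamescope.rs`: the function names `hard_teardown_plan` and `hard_teardown` and the unit name in the usage note are hypothetical, and the approach is as untested as the hypothesis itself. Only the `systemctl` subcommands named in the bullet are used.

```rust
use std::process::Command;

// Convert a slice of string literals into owned arguments.
fn to_args(args: &[&str]) -> Vec<String> {
    args.iter().map(|s| s.to_string()).collect()
}

/// Build the `systemctl` invocations for a hard teardown of one user unit.
/// Returned as argument vectors so the plan can be logged or inspected
/// before anything is executed. (Hypothetical helper, not from the source.)
fn hard_teardown_plan(unit: &str) -> Vec<Vec<String>> {
    vec![
        // Bypass the SIGTERM handler: SIGKILL every process in the unit.
        to_args(&["--user", "kill", "--signal=SIGKILL", unit]),
        // Follow-up stop so systemd settles the unit as inactive.
        to_args(&["--user", "stop", unit]),
        // Clear any "failed" state so the unit can be started again later.
        to_args(&["--user", "reset-failed", unit]),
    ]
}

/// Run the plan in order. A non-zero exit is reported but does not abort,
/// since `kill` on an already-dead unit is expected to fail harmlessly.
fn hard_teardown(unit: &str) -> std::io::Result<()> {
    for args in hard_teardown_plan(unit) {
        let status = Command::new("systemctl").args(&args).status()?;
        if !status.success() {
            eprintln!("systemctl {} exited with {status}", args.join(" "));
        }
    }
    Ok(())
}
```

Calling `hard_teardown("some-gamescope.service")` (placeholder unit name) would issue the three commands in sequence. Keeping the plan separate from execution makes it easy to confirm in logs that no plain SIGTERM `stop` precedes the kill.
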
(#2 mid-stream-switch input and #3 virtual-output-primary are **resolved**; see the Resolved section above.)

## Lower priority / polish

### 4. Mid-stream-switch input loss window (~6 s)

While the libei portal is being set up on a switch, incoming input is dropped rather than delivered
(`libei: DROP — no resumed device`, hundreds of events). Polish: pre-warm the portal, or hold events
instead of dropping them during the device-resume window.

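One way to hold events instead of dropping them is a small bounded queue that is drained once the device resumes. The sketch below assumes a hypothetical `InputEvent` type and `ResumeGate` struct; neither name comes from the host's code, and the real libei integration will look different.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for the host's input event type.
#[derive(Debug, Clone, PartialEq)]
struct InputEvent(u32);

/// Holds events while no device is resumed, then replays them in order.
struct ResumeGate {
    resumed: bool,
    held: VecDeque<InputEvent>,
    capacity: usize,
}

impl ResumeGate {
    fn new(capacity: usize) -> Self {
        Self { resumed: false, held: VecDeque::new(), capacity }
    }

    /// Offer one event; returns the events that should be delivered now.
    fn push(&mut self, ev: InputEvent) -> Vec<InputEvent> {
        if self.resumed {
            return vec![ev];
        }
        if self.capacity == 0 {
            // Nothing may be held: behaves like the current drop path.
            return Vec::new();
        }
        if self.held.len() == self.capacity {
            // Bounded: discard the oldest event rather than grow without limit.
            self.held.pop_front();
        }
        self.held.push_back(ev);
        Vec::new()
    }

    /// Call when the device reports RESUMED; returns held events, oldest first.
    fn on_resumed(&mut self) -> Vec<InputEvent> {
        self.resumed = true;
        self.held.drain(..).collect()
    }

    /// Call when a switch begins and the device is no longer usable.
    fn on_paused(&mut self) {
        self.resumed = false;
    }
}
```

A design caveat: replaying input that is several seconds old may be worse than dropping it for some event kinds (pointer motion, for instance), so a real version might hold only key and button state changes. That filtering is left out here.
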
### 5. NVENC `InitializeEncoder failed: invalid param` (recovered)

At 5120×1440@240 the first NVENC open fails with `invalid param (8)` and **recovers** via the 2-way
split-encode path (the stream is live). Cosmetic but noisy: investigate the first-attempt failure or
silence the log.

### 6. NVENC HEVC bitrate cap (~800 Mbps on the RTX 4090)

HEVC opens at the GPU's max (~800 Mbps) when a higher rate is requested (e.g. 1600 Mbps). Not a bug;
consider preferring AV1 when the client requests >~800 Mbps HEVC, and surface the cap in the
speed-test / bitrate UI.

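The "prefer AV1 above the HEVC cap" idea could be expressed as a small pure function. Everything here is an assumption for illustration: the `Codec` and `EncodePlan` types, the `plan_encode` name, and the fixed 800 Mbps constant (taken from the observed figure above; a real implementation would query the encoder). Whether AV1 actually sustains the higher rate on this GPU is not established in this document.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Codec {
    Hevc,
    Av1,
}

/// Assumed HEVC ceiling in Mbps (the ~800 Mbps observed on the RTX 4090).
const HEVC_CAP_MBPS: u32 = 800;

/// What the host would actually open, plus a flag the UI can surface.
#[derive(Debug, PartialEq)]
struct EncodePlan {
    codec: Codec,
    bitrate_mbps: u32,
    capped: bool,
}

/// Choose codec and bitrate for a client request.
fn plan_encode(requested: Codec, requested_mbps: u32, av1_available: bool) -> EncodePlan {
    match requested {
        // Over the HEVC cap and AV1 is usable: switch codec, keep the rate.
        Codec::Hevc if requested_mbps > HEVC_CAP_MBPS && av1_available => EncodePlan {
            codec: Codec::Av1,
            bitrate_mbps: requested_mbps,
            capped: false,
        },
        // Over the cap with no AV1: clamp and report it so the UI can say so.
        Codec::Hevc if requested_mbps > HEVC_CAP_MBPS => EncodePlan {
            codec: Codec::Hevc,
            bitrate_mbps: HEVC_CAP_MBPS,
            capped: true,
        },
        // Everything else passes through unchanged.
        codec => EncodePlan { codec, bitrate_mbps: requested_mbps, capped: false },
    }
}
```

The `capped` flag is the piece that would feed the speed-test / bitrate UI, so the client sees the effective rate instead of the requested one.
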
### 7. Restore-guard / keep-warm model interaction

`do_restore_tv_session`, when a desktop is active, still stops the idle managed gamescope (a teardown,
so leak risk per #1) and consumes `STOPPED_AUTOLOGIN` (so a later return-to-gaming won't auto-restore
the TV session). Resolve together with the keep-warm decision in #1.

### 8. Feature B is opt-in

The mid-stream watcher is gated behind `PUNKTFUNK_SESSION_WATCH=1` pending broader validation. With
#2 (mid-stream input) now landed (see Resolved), promote it to default-on once it has been exercised
on more boxes.

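Promoting the watcher to default-on while keeping an escape hatch might look like the sketch below. Only the `PUNKTFUNK_SESSION_WATCH=1` opt-in comes from this document; the function names, the `default_on` parameter, and treating `0` as an explicit opt-out are assumptions.

```rust
/// Decide whether the watcher runs, given the raw value of
/// PUNKTFUNK_SESSION_WATCH (None when the variable is unset).
/// `default_on` is the single switch to flip when the feature is promoted.
fn session_watch_enabled(raw: Option<&str>, default_on: bool) -> bool {
    match raw.map(str::trim) {
        Some("1") => true,
        // Assumed opt-out value; not defined by the current host.
        Some("0") => false,
        // Unset or unrecognized: fall back to the compiled-in default.
        _ => default_on,
    }
}

/// Thin wrapper that reads the process environment.
fn session_watch_from_env(default_on: bool) -> bool {
    let raw = std::env::var("PUNKTFUNK_SESSION_WATCH").ok();
    session_watch_enabled(raw.as_deref(), default_on)
}
```

With `default_on = false` this matches today's opt-in behavior; flipping it to `true` enables the watcher everywhere while still letting a box disable it.
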