Commit Graph

2 Commits

Author SHA1 Message Date
enricobuehler 92c6da9546 fix(capture/mutter): restore zero-copy + sync via dmabuf implicit fence
ci / web (push) Failing after 42s
apple / swift (push) Failing after 1m5s
ci / rust (push) Failing after 1m10s
ci / docs-site (push) Failing after 44s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
deb / build-publish (push) Successful in 2m54s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (push) Successful in 5m13s
The previous attempt (8531135) dropped zero-copy on Mutter+NVIDIA for a sticky
CPU/SHM fallback that (a) still listed SPA_DATA_DmaBuf in its buffer types, so
Mutter kept handing dmabufs that got mmap-read UNsynced — making the flashing
worse, not better — and (b) hinged on producer explicit sync, which Mutter+NVIDIA
cannot do (`error alloc buffers` / no cogl sync_fd, confirmed in worker-3 logs).

Revert the capture restructure to the original zero-copy dmabuf path, and fix the
NVIDIA stale-frame race the RIGHT way for a producer that can't do explicit sync:
the consumer snapshots the dmabuf's implicit fence (DMA_BUF_IOCTL_EXPORT_SYNC_FILE)
and waits the producer's render before sampling (new dmabuf_fence module, ioctl
number unit-tested). Covers the GPU import and the CPU mmap read. Logs once whether
a render was actually in flight (waited=true → the driver fences and the race is
closed; false → no implicit fence, so we learn zero-copy still needs SHM here).

drm_sync (the explicit-sync primitive) is kept and verified but marked unused —
no targeted compositor produces a usable sync_fd today; ready to wire in when one
does. The Bug-2 input fix (held-key release on disconnect) from 8531135 is kept.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 09:28:17 +00:00
enricobuehler 8531135bb7 fix(capture/mutter): stale-frame flashes + stuck input after disconnect on GNOME
ci / web (push) Failing after 49s
apple / swift (push) Failing after 1m4s
ci / rust (push) Failing after 1m9s
ci / docs-site (push) Failing after 42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 6s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
deb / build-publish (push) Successful in 2m58s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (push) Successful in 4m17s
Deep dive into the two GNOME-only host bugs (KWin/gamescope clean):

1. Stale-frame flashes (windows at old positions, typed text reverting):
   Mutter renders its virtual monitors DIRECTLY into the PipeWire buffer
   pool, and NVIDIA has no implicit dmabuf fencing — our zero-copy
   import raced the render and encoded each pool buffer's PREVIOUS
   contents. Fix, in order of preference:
   - Consumer-side PipeWire explicit sync (SPA_META_SyncTimeline): new
     drm_sync module (DRM timeline-syncobj wait/signal via raw ioctls,
     unit-tested incl. a live signal->wait round trip); announced
     post-format via update_params (the OBS pattern — at connect time
     the meta makes producers fail allocation, observed on KWin), with
     a blocks=3 Buffers filter so the producer's sync pod wins; acquire
     point awaited before any read (GPU import or CPU mmap), release
     point signaled on every path.
   - Where the producer can't do explicit sync (Mutter on NVIDIA today:
     no cogl sync_fd, "error alloc buffers"), a sticky fallback flips
     the capture to the synchronous CPU/shm path — Mutter's glReadPixels
     download orders against its render, so frames are correct by
     construction. First session pays one ~10 s probe+retry; later
     sessions go straight there. Validated live on home-worker-3
     (GNOME 50 + RTX 4090): clean fallback, 30 MB HEVC streamed.
   - Sync is only announced on Mutter sessions (new VirtualOutput.mutter
     tag): KWin+NVIDIA fails allocation when merely asked, and doesn't
     need it (verified unchanged: zero-copy CUDA import + 1.1 MB/10 s).
   PUNKTFUNK_EXPLICIT_SYNC=0 disables the probe outright.

2. Clicks wedged in the focused app after disconnect+reconnect: a client
   vanishing mid-press left keys/buttons latched in the compositor —
   Mutter keeps the destroyed EIS device's implicit grab and the focused
   app stops taking clicks until restarted. EiState now tracks held
   keys/buttons/touches (wire codes) and synthesizes releases through
   the normal inject path before the EIS connection goes away.

GNOME hosts on NVIDIA temporarily lose zero-copy (correctness over
throughput); the moment Mutter+driver gain working explicit sync, the
sync path engages automatically and zero-copy returns.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-13 00:34:42 +00:00