feat(client-linux): VAAPI hardware decode — zero-copy dmabuf into GraphicsOffload
ci / docs-site (push) Failing after 45s
ci / web (push) Failing after 32s
apple / swift (push) Successful in 1m16s
ci / rust (push) Failing after 1m18s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 7s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 7s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Failing after 1m38s
rpm / build-publish (push) Successful in 4m10s

Stage 1.5: on Intel/AMD clients libavcodec's VAAPI hwaccel decodes on
the GPU; frames map to DRM-PRIME dmabufs (av_hwframe_map, zero copy)
and reach GTK as GdkDmabufTexture (BT.709 limited CICP color state —
GDK's dmabuf default is BT.601). Inside GtkGraphicsOffload that is the
decoder-to-subsurface path, direct-scanout eligible when fullscreen.

Fallback ladder, live-verified on the NVIDIA dev box: no VAAPI device
-> software decode at session start (logged reason); a mid-session
VAAPI error (e.g. broken nvidia-vaapi-driver) demotes to software and
the host's IDR/RFI recovery resynchronizes; a rejected dmabuf import
logs and the stream continues. PUNKTFUNK_DECODER=software|vaapi
overrides; the first-frame log now names the active path.

The hwaccel path is raw ffmpeg-sys FFI (ffmpeg-next wraps none of it):
hw device ctx + get_format pinned to AV_PIX_FMT_VAAPI (NONE on
mismatch so cpu-fallback never silently engages inside libavcodec),
thread_count=1, LOW_DELAY. Surface lifetime rides DrmFrameGuard into
the texture's release func — GDK runs it on both success and failure.

Needs an Intel/AMD client box (Steam Deck/Bazzite) to live-verify the
hardware path; the software path is unchanged and revalidated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
2026-06-12 23:26:59 +00:00
parent b5c30dff4f
commit 4b1bbfdf0e
4 changed files with 350 additions and 41 deletions
+13 -5
View File
@@ -95,11 +95,19 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
jitter ring inverted), SDL3 gamepad capture + rumble/lightbar feedback, keyboard via
exact inverse of the host VK table, absolute mouse + 120-unit scroll. Validated live
against `serve --native` on this box: 1080p60, steady 60 fps, capture→decoded p50
≈6.4 ms (debug build). `--connect host[:port]` for scripting. Next (per the 2026-06-12
research, memory `linux-client-option-a`): VAAPI dmabuf → `GdkDmabufTexture` (Tier-1
zero-copy on Intel/AMD), then the stage-2 raw-Wayland presenter (wp_presentation
feedback, tearing-control, Vulkan Video on NVIDIA) — **wgpu/winit rejected** (no dmabuf
import / presentation feedback / shortcuts-inhibit).
≈6.4 ms (debug build). `--connect host[:port]` for scripting. **Swift-parity batch +
stage 1.5 (2026-06-12 evening)**: capture state machine (click-to-capture,
Ctrl+Alt+Shift+Q / focus-loss release, held-state flush), app-lifetime SDL gamepad
service (pad pin UI, auto type from the physical pad, DualSense touchpad/motion 0xCC +
raw-DS5-effects trigger/player-LED replay — needs a physical pad to live-verify), mic
uplink (validated live), per-host speed test, compositor pref, native-display mode
default, saved-hosts list, .deb + RPM-subpackage CI (deb.yml/rpm.yml). **VAAPI decode
→ DRM-PRIME dmabuf → `GdkDmabufTexture`** (BT.709 color state; Tier-1 zero-copy on
Intel/AMD, `PUNKTFUNK_DECODER=software|vaapi` override) with a proven fallback ladder —
no VAAPI device (NVIDIA) or mid-session VAAPI error → software decode; needs an
Intel/AMD client box to live-verify the hw path. Next: the stage-2 raw-Wayland
presenter (wp_presentation feedback, tearing-control, Vulkan Video on NVIDIA) —
**wgpu/winit rejected** (no dmabuf import / presentation feedback / shortcuts-inhibit).
2. **Sub-frame pipelining**: overlap encode and transmit within a frame. Requires a direct
NVENC SDK wrapper (libavcodec only emits whole AUs) — the next big latency lever (~24 ms
at high res).