Commit Graph

4 Commits

Author SHA1 Message Date
enricobuehler 333f66b45b fix(host/serverinfo): don't advertise an empty codec mask when the VAAPI probe finds nothing
apple / swift (push) Successful in 54s
windows-host / package (push) Successful in 2m21s
android / android (push) Successful in 3m30s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / rust (push) Successful in 5m56s
deb / build-publish (push) Successful in 3m9s
ci / bench (push) Successful in 4m40s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
decky / build-publish (push) Successful in 11s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 17s
The Phase 3 GPU-aware codec mask (6922e1c) probes VAAPI on any non-NVIDIA host.
On a GPU-less box (CI container: no /dev/nvidia* -> `auto` picks VAAPI, but there's
no VA display) the probe returns all-false, so the mask was 0 -- the host
advertised NO codecs, and the serverinfo unit test failed.

Fall back to the static superset when the probe yields nothing (VAAPI wasn't
usable, not "the GPU encodes nothing"); quiet ffmpeg's expected "No VA display"
error during the probe; and assert the test against codec_mode_support() rather
than a hardcoded 65793 so it's deterministic regardless of the build host's GPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 11:52:17 +00:00
enricobuehler 6922e1c467 feat(host): VAAPI codec probe + AMD/Intel packaging + neutral logs (Phase 3)
apple / swift (push) Successful in 55s
ci / rust (push) Failing after 1m35s
ci / web (push) Successful in 28s
windows-host / package (push) Successful in 2m23s
ci / docs-site (push) Successful in 30s
android / android (push) Successful in 3m24s
deb / build-publish (push) Successful in 3m22s
decky / build-publish (push) Successful in 14s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m48s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m50s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 18s
Polish for AMD/Intel support:
- GameStream serverinfo advertises only codecs the GPU can ACTUALLY encode on
  the VAAPI backend (probed once by opening a tiny encoder per codec). AV1
  encode is narrow (Intel Arc/Xe2+, AMD RDNA3+/RDNA4) and an old iGPU may lack
  HEVC, so a Moonlight client never negotiates a codec the encoder can't open.
  NVENC/Windows keep the Moonlight-validated static mask. Validated on a Radeon
  780M: h264/h265/av1 all probe true -> mask unchanged (65793).
- Packaging: Recommends mesa-va-drivers + intel-media-va-driver (deb) /
  mesa-va-drivers + intel-media-driver (rpm) so the auto-selected VAAPI backend
  works out of the box on AMD/Intel; NVIDIA boxes can --no-install-recommends.
  (Fedora note: stock mesa-va-drivers disables HEVC/AV1 -- needs the freeworld
  variant from RPM Fusion.)
- De-NVIDIA-fy the user-facing encoder log/context strings ("open NVENC" ->
  "open video encoder") now that VAAPI is a first-class backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 10:41:37 +00:00
enricobuehler 708c62788d feat(host/encode): VAAPI zero-copy dmabuf import (AMD/Intel GPU CSC)
apple / swift (push) Successful in 57s
ci / rust (push) Successful in 1m39s
ci / web (push) Successful in 32s
ci / docs-site (push) Successful in 31s
android / android (push) Successful in 3m29s
windows-host / package (push) Successful in 3m39s
deb / build-publish (push) Successful in 3m7s
decky / build-publish (push) Successful in 22s
ci / bench (push) Successful in 4m43s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m24s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 22s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m22s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
Phase 2 of AMD/Intel support: the VAAPI encoder now takes the capture dmabuf
directly and does the RGB->NV12 colour conversion on the GPU's video engine,
eliminating the host-side de-pad + swscale CSC + upload the CPU path pays.

- capture: a vendor-neutral FramePayload::Dmabuf (dup'd fd + fourcc/modifier/
  layout). When zero-copy is on, the EGL->CUDA importer is unavailable (any
  non-NVIDIA host), and the backend is VAAPI, the capturer advertises LINEAR
  dmabuf and hands the raw buffer to the encoder instead of CPU-copying it.
- encode/vaapi: the encoder self-configures from the first frame's payload (no
  open_video signature change). The dmabuf arm wraps the buffer as an
  AV_PIX_FMT_DRM_PRIME frame and pushes it through a filter graph
  buffer(drm_prime) -> hwmap(vaapi) -> scale_vaapi=nv12 -> buffersink; the
  encoder takes NV12 surfaces straight from the sink. The Phase 1 CPU-upload
  path is kept as the other arm (used when capture produces CPU frames).

Live-validated on a Radeon 780M (real Sway/xdpw desktop capture): correct,
pixel-perfect HEVC, and ~10x less host CPU at 1440p (4.2s -> 0.4s of CPU for
300 frames) -- the de-pad/CSC/upload moves to the GPU. NVIDIA unchanged
(zero-copy still imports to CUDA; the passthrough path only engages on
non-NVIDIA hosts).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 09:57:00 +00:00
enricobuehler b390dd883b feat(host/encode): VAAPI encode backend for AMD/Intel GPUs (Linux)
apple / swift (push) Successful in 54s
windows-host / package (push) Successful in 2m52s
android / android (push) Successful in 3m4s
ci / rust (push) Successful in 1m18s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / bench (push) Successful in 4m32s
deb / build-publish (push) Successful in 2m56s
decky / build-publish (push) Successful in 22s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m59s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 15s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m41s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 7m27s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m10s
docker / deploy-docs (push) Successful in 39s
The Linux host was NVENC/CUDA-only. Add a VAAPI encoder — one libavcodec
backend (h264/hevc/av1_vaapi) covering both AMD (Mesa radeonsi) and Intel
(iHD) — behind the existing `Encoder` trait, and turn `open_video`'s Linux
arm into a vendor dispatcher: `PUNKTFUNK_ENCODER=auto|nvenc|vaapi` (default
auto: NVENC when a CUDA frame or /dev/nvidia* is present, else VAAPI). The
NVIDIA path is unchanged — auto resolves to NVENC on an NVIDIA box and the
bitrate-probe loop moved verbatim into `open_nvenc_probed`.

`VaapiEncoder` mirrors the NVENC hwframes pattern with AV_HWDEVICE_TYPE_VAAPI.
The CPU-input path swscales packed RGB -> NV12 (BT.709 limited, VUI signalled)
and uploads into a pooled VA surface (av_hwframe_transfer_data), preserving the
low-latency model (infinite GOP, on-demand forced IDR, async_depth=1, CBR when
the driver supports it). It works on a non-NVIDIA box with no capture changes:
the capturer already falls back to CPU frames when its EGL->CUDA importer can't
initialise (no libcuda).

Live-validated on a Radeon 780M (RDNA3): hevc/h264/av1_vaapi all encode,
HEVC/H264 decode cleanly with correct BT.709-limited colours, infinite GOP
preserved. Zero-copy dmabuf import (the high-res perf lever) is next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 15:35:49 +00:00