refactor(host/zerocopy): dlopen libcuda instead of a link-time #[link]

The host hard-linked libcuda.so.1 on Linux (`#[link(name="cuda")]` in
`zerocopy::cuda`), so the binary wouldn't even *start* on a non-NVIDIA box:
the dynamic loader couldn't resolve the NEEDED libcuda entry. That blocked
running the new VAAPI (AMD/Intel) path on a machine without the NVIDIA driver.

Resolve the 18 CUDA Driver API symbols at runtime via `libloading` instead.
Same-named wrapper fns forward to the dlopen'd table (call sites unchanged);
when libcuda is absent they return a non-zero CUresult so `context()` fails
cleanly and the capturer falls back to the CPU path. The library handle is
leaked (process-lifetime, like the shared context).
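
A minimal sketch of the wrapper pattern described above, for one symbol
(`cuInit`) rather than all 18. It is not the code in `zerocopy::cuda`. To stay
dependency-free it calls `dlopen`/`dlsym` directly instead of going through
`libloading`, and the names `Table`, `load_table`, `forward_cu_init` and the
choice of 999 (`CUDA_ERROR_UNKNOWN`) as the "libcuda absent" result are
assumptions made for illustration.

```rust
use std::ffi::{c_char, c_int, c_uint, c_void, CStr};
use std::sync::OnceLock;

// Declared by hand from <dlfcn.h> so the sketch needs no crates.
// The commit itself uses `libloading` for this step.
unsafe extern "C" {
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
    fn dlsym(handle: *mut c_void, symbol: *const c_char) -> *mut c_void;
}
const RTLD_NOW: c_int = 2; // value on Linux

type CUresult = c_int;
// Assumption: the commit only says "a non-zero CUresult"; 999 is CUDA_ERROR_UNKNOWN.
const CUDA_ERROR_UNKNOWN: CUresult = 999;

type CuInitFn = unsafe extern "C" fn(c_uint) -> CUresult;

/// Hypothetical one-entry symbol table; the real one would hold 18 pointers.
pub struct Table {
    cu_init: CuInitFn,
}

/// Open `lib` and resolve the symbols. Returns None if the library or a
/// symbol is missing. The handle is never dlclose'd (process lifetime).
pub fn load_table(lib: &CStr) -> Option<Table> {
    let sym_name = CStr::from_bytes_with_nul(b"cuInit\0").unwrap();
    unsafe {
        let handle = dlopen(lib.as_ptr(), RTLD_NOW);
        if handle.is_null() {
            return None;
        }
        let sym = dlsym(handle, sym_name.as_ptr());
        if sym.is_null() {
            return None;
        }
        Some(Table {
            cu_init: std::mem::transmute::<*mut c_void, CuInitFn>(sym),
        })
    }
}

/// Forward to the table if present, otherwise report failure.
pub fn forward_cu_init(table: Option<&Table>, flags: c_uint) -> CUresult {
    match table {
        Some(t) => unsafe { (t.cu_init)(flags) },
        None => CUDA_ERROR_UNKNOWN,
    }
}

static TABLE: OnceLock<Option<Table>> = OnceLock::new();

/// Same-named wrapper, so existing call sites compile unchanged.
#[allow(non_snake_case)]
pub unsafe fn cuInit(flags: c_uint) -> CUresult {
    let table = TABLE.get_or_init(|| {
        load_table(CStr::from_bytes_with_nul(b"libcuda.so.1\0").unwrap())
    });
    forward_cu_init(table.as_ref(), flags)
}
```

With this shape, a caller such as `context()` only has to treat any non-zero
result from `cuInit(0)` as "CUDA unavailable" and take the fallback path; it
does not need to know whether the driver is missing or merely failed to
initialise.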

One Linux binary now runs on NVIDIA (CUDA zero-copy -> NVENC) and on AMD/Intel
(VAAPI, no NVIDIA driver). Verified: the NVIDIA dev box still does dmabuf->CUDA
zero-copy; on a Radeon 780M box the host builds with no libcuda present, the
binary has no NEEDED libcuda entry, and VAAPI encode runs with no stub.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 15:44:57 +00:00
parent b390dd883b
commit f96e4ec9f8
3 changed files with 249 additions and 61 deletions
@@ -105,6 +105,9 @@ khronos-egl = { version = "6", features = ["dynamic"] }
# GPU-copy into an exportable allocation, export OPAQUE_FD → cuImportExternalMemory (the
# officially-supported CUDA pairing; raw dmabuf fds are rejected by the desktop driver).
ash = "0.38"
# `libcuda.so.1` is dlopen'd at runtime (NOT link-time) so one Linux binary runs on NVIDIA
# (zero-copy via CUDA) AND on AMD/Intel (VAAPI, no NVIDIA driver present) — see `zerocopy::cuda`.
libloading = "0.8"
[target.'cfg(target_os = "windows")'.dependencies]
# Windows host backends. `windows` covers the Win32/CCD APIs the SudoVDA virtual-display backend