Scaffolding for dmabuf zero-copy (plan §9), opt-in via LUMEN_ZEROCOPY:
- src/zerocopy/{cuda,egl}.rs: hand-rolled CUDA Driver-API FFI (no Rust crate
exposes the EGL-interop calls / CUeglFrame) with a shared process-wide
CUcontext + pitched device buffers; an EGL importer (GBM platform on the
NVIDIA render node) that turns a dmabuf into an EGLImage, registers it with
CUDA, and copies it device-to-device into an owned buffer. `zerocopy-probe`
subcommand validates the FFI/linking/GPU access — confirmed on the box
(driver 595, EGL_EXT_image_dma_buf_import + modifiers).
- CapturedFrame gains a FramePayload enum (Cpu(Vec<u8>) | Cuda(DeviceBuffer));
the encoder branches: CPU keeps the expand+upload path, CUDA wraps the device
buffer in an AV_PIX_FMT_CUDA frame fed straight to hevc_nvenc (sharing our
CUcontext via a hand-declared AVCUDADeviceContext, since ffmpeg-sys doesn't
bind hwcontext_cuda.h). open_video/the encoder take a `cuda` flag derived from
the first frame's payload.
The capture-side dmabuf negotiation (which produces the Cuda frames) is the
next step; the CPU path is unchanged and remains the default + fallback. Builds
clean, clippy clean, tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Graceful FEC behavior on a lossy link: at a realistic 2% packet loss the stream
is now steady 0% (was spiking 40-60%). Verified live.
- IDR/RFI handling: the control thread recognizes the client's recovery requests
(0x0301 invalidate-reference-frames, 0x0302 request-IDR, 0x0305) and sets a
shared force_idr flag; the video thread forces an NVENC keyframe on the next
frame (Encoder::request_keyframe → input frame pict_type = I). Without this, a
frame that exceeds the FEC budget broke the reference chain until the next GOP
IDR (~2s), cascading to most of the stream being undecodable.
- Min-parity floor: honor the client's x-nv-vqos[0].fec.minRequiredFecPackets
(it asks for 2). Small P-frames previously got m=ceil(k*20/100)=1 parity — a
single loss broke them; flooring m>=2 (capped so k+m<=255, wire pct recomputed)
protects them. This is what turned the 2% spikes into steady 0%.
- Send pacing: spread each frame's packets evenly across the frame interval
instead of blasting them at line rate (a real link drops microbursts), matching
Sunshine's rate-controlled sends; sub-500us sleeps skipped (unreliable).
Note: sustained ~8% uniform loss still degrades — that exceeds 20% FEC for
reference-frame video and real Sunshine degrades there too; real networks are
<1% or bursty, which this now handles cleanly.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>