Files
punktfunk/crates/punktfunk-host/src/session_plan.rs
T
enricobuehler ffc0b07b46 feat(protocol,host): negotiate video codec + add a GPU-less software (openh264) encode path
Phase 1 of codec negotiation, and the Linux software H.264 encode path it unblocks.

**Codec negotiation (core `quic`):**
- `Hello.video_codecs` (bitfield: CODEC_H264/HEVC/AV1) — the client advertises what it can
  decode; appended as a trailing byte (older client → 0 = HEVC-only, back-compat).
- `Welcome.codec` — the single codec the host resolved and will emit; trailing byte (older
  host → HEVC).
- `resolve_codec(client, host_capable)` picks the shared codec (precedence HEVC > AV1 > H.264)
  or `None` → the host refuses honestly rather than sending an undecodable stream.
- Roundtrip + back-compat tests; cbindgen exports the CODEC_* constants.

**Software encoder (host):**
- The openh264 `OpenH264Encoder` (was Windows-only) is now built on Linux too — it's
  platform-agnostic (consumes CPU RGB `CapturedFrame`s, statically-bundled openh264). `openh264`
  moved to the shared linux+windows Cargo target.
- `PUNKTFUNK_ENCODER=software` selects it: `open_video` gains a `software` branch (H.264 only),
  and `session_plan::resolve_encoder` / `capture::gpu_encode` resolve `EncoderBackend::Software`
  → `output_format().gpu = false`, so the portal capturer delivers CPU RGB. Explicit-only (auto
  never picks it — a box with a dead driver still has /dev/nvidiactl and would mis-resolve NVENC).

**Host codec resolution (`punktfunk1`):**
- The native path no longer hardcodes HEVC: it resolves the codec from the client's advertised
  set ∩ the host's capability (`Codec::host_wire_caps`: software→H.264, else HEVC), threads it
  through `SessionPlan.codec`, and opens the encoder + validates reconfigures at that codec. A
  software host + HEVC-only client is refused with a clear error.
- 4:4:4 is gated on HEVC (it's HEVC-only).

**Probe:** advertises H264|HEVC|AV1 and logs the resolved codec.

Validated on the GPU-less dev box: negotiation is live end-to-end (probe advertises 0x07 → host
resolves H.264 → Welcome reports it → plan = Software/H264), and the openh264 unit test (CPU RGB →
AnnexB IDR) now runs on Linux. Full capture→encode still needs a GPU on this box — every
compositor screencast path (KWin GL, gamescope VK_EXT_physical_device_drm, wlroots EGL) requires
one; software render (llvmpipe/pixman) can't be captured — so this box exercises negotiation +
encoder, not live capture. The software path unblocks GPU-less-*encode* boxes that still have a
display GPU. Phase 2 (clients advertising real codecs + decoding per Welcome.codec) is a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-01 23:13:39 +00:00

174 lines
8.2 KiB
Rust
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
//! `SessionPlan` — the per-session capture / topology / encoder decision, resolved **once** from
//! [`HostConfig`](crate::config) (+ the handshake-negotiated bit depth) into a typed, logged value.
//!
//! **Goal-1 stage 3** (`design/windows-host-rewrite.md` §2.2): before this, the Windows session decision was
//! re-derived at three call sites — the capture backend inside `capture::capture_virtual_output`, the
//! process topology in `punktfunk1::should_use_helper`, and the encode backend in
//! `encode::windows_resolved_backend` — each reading [`config`](crate::config) independently, with no
//! single owner (the latent "capture and encode disagree on the backend" hazard, plan §2.4). `SessionPlan`
//! resolves them together, once, so the deployed path reads one typed artifact.
//!
//! Stage 3 routes the **capture** and **topology** decisions through the plan (see
//! `capture::capture_virtual_output` taking [`CaptureBackend`] in, and `virtual_stream` reading
//! [`SessionTopology`]). The **encoder** is resolved by `encode::windows_resolved_backend` (config-backed
//! and GPU-vendor cached since stage 2, so already a single source) and *recorded* here as
//! [`EncoderBackend`]. Threading `encoder`/`input_format` into the encoder + capturer opens — which
//! removes the `capture → encode::windows_resolved_backend()` back-reference recomputed in `dxgi.rs` —
//! is **stage 5**.
//!
//! The type is platform-neutral so it threads through the shared `virtual_stream`/`build_pipeline`
//! signatures; on Linux it resolves to the single portal/single-process path (the 3-way dispatch is a
//! Windows-only concern).
/// Where a session's frames come from.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum CaptureBackend {
/// Linux: the xdg ScreenCast portal → PipeWire (the only Linux capture path).
Portal,
/// Windows: IDD direct-push — frames pulled straight from the pf-vdisplay driver's shared ring
/// (in-process, Session 0; captures the secure desktop too). The sole Windows capture path —
/// DXGI Desktop Duplication (DDA) and the WGC two-process relay were removed.
IddPush,
}
impl CaptureBackend {
/// Resolve the capture backend from [`config`](crate::config). This is the single resolver shared by
/// [`SessionPlan::resolve`] and the standalone callers (GameStream / spike), so they can't drift.
#[cfg(target_os = "linux")]
pub fn resolve() -> Self {
CaptureBackend::Portal
}
/// Windows: IDD direct-push is the sole capture path (DDA + the WGC two-process relay were removed).
#[cfg(target_os = "windows")]
pub fn resolve() -> Self {
CaptureBackend::IddPush
}
#[cfg(not(any(target_os = "linux", target_os = "windows")))]
pub fn resolve() -> Self {
CaptureBackend::Portal
}
}
/// How a session is structured across processes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SessionTopology {
/// One process captures + encodes. The only topology: Linux (portal) and Windows (in-process
/// IDD-push in Session 0). The SYSTEM-host + user-session WGC relay was removed with DDA/WGC.
SingleProcess,
}
/// The resolved encode backend (recorded for logging / stages 45; the per-session encoder open still
/// resolves via `encode::windows_resolved_backend`, which is config-backed + GPU-vendor cached).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EncoderBackend {
/// Linux: NVENC vs VAAPI is auto-detected inside `encode::open_video` (not modeled here).
PlatformAuto,
Nvenc,
Amf,
Qsv,
Software,
}
impl EncoderBackend {
/// True if this backend encodes on the GPU (so the capturer should produce GPU-resident frames). Only
/// the software encoder takes CPU staging; `PlatformAuto` (Linux NVENC/VAAPI) is always GPU.
pub fn is_gpu(self) -> bool {
!matches!(self, EncoderBackend::Software)
}
}
/// The per-session decision, resolved once. `Copy` so it threads through the capture/encode chain
/// without ceremony (stage 4 folds it, with the rest of the arg soup, into a `SessionContext`).
#[derive(Clone, Copy, Debug)]
pub struct SessionPlan {
pub capture: CaptureBackend,
pub topology: SessionTopology,
pub encoder: EncoderBackend,
/// Handshake-negotiated encode bit depth (8, or 10 = HEVC Main10).
pub bit_depth: u8,
/// The IDD-push HDR hint (`bit_depth >= 10`) — the want-HDR flag handed to the capturer so it
/// proactively enables advanced color on the virtual display. Linux is 8-bit (HDR blocked upstream).
pub hdr: bool,
/// Handshake-negotiated chroma subsampling (4:2:0, or full-chroma 4:4:4 when the client + host +
/// GPU all support it). Resolved before the Welcome; `Yuv420` on every backend that declined it.
pub chroma: crate::encode::ChromaFormat,
/// Handshake-negotiated video codec the encoder emits — HEVC by default, H.264 for a GPU-less
/// software host (`resolve_codec` over the client's advertised codecs ∩ the host's capability).
pub codec: crate::encode::Codec,
}
impl SessionPlan {
/// Resolve the whole plan once from [`config`](crate::config) + the negotiated `bit_depth`,
/// `chroma`, and `codec`.
pub fn resolve(
bit_depth: u8,
chroma: crate::encode::ChromaFormat,
codec: crate::encode::Codec,
) -> Self {
SessionPlan {
capture: CaptureBackend::resolve(),
topology: resolve_topology(),
encoder: resolve_encoder(),
bit_depth,
hdr: bit_depth >= 10,
chroma,
codec,
}
}
/// The capturer's target output format (Goal-1 stage 5): `gpu` from the already-resolved `encoder`
/// (no second backend probe), `hdr` from the plan. Handed into `capture::capture_virtual_output` so the
/// capturer never re-derives the encode backend.
pub fn output_format(&self) -> crate::capture::OutputFormat {
let gpu = self.encoder.is_gpu();
// Linux NVENC 4:4:4: libavcodec `hevc_nvenc` only emits 4:4:4 from a YUV444 *input* frame —
// RGB-in is always subsampled to 4:2:0 (verified on the RTX 5070 Ti). So the encoder does an
// RGB→YUV444P swscale and needs CPU-resident RGB frames; force the zero-copy GPU capture off
// for a 4:4:4 NVENC session. (VAAPI 4:4:4, where the hardware supports it, keeps its dmabuf
// path via `scale_vaapi`; Windows NVENC ingests ARGB directly and stays GPU.)
#[cfg(target_os = "linux")]
let gpu = {
let force_cpu_for_nvenc_444 =
self.chroma.is_444() && !crate::encode::linux_zero_copy_is_vaapi();
gpu && !force_cpu_for_nvenc_444
};
crate::capture::OutputFormat {
gpu,
hdr: self.hdr,
// 4:4:4 needs a full-chroma source: on Windows this keeps the capturer on RGB (not the
// default NV12/P010 video-engine output) so NVENC can CSC to 4:4:4.
chroma_444: self.chroma.is_444(),
}
}
}
/// Process topology. Single-process is the only topology now: Linux (portal) and Windows (in-process
/// IDD-push in Session 0). The Windows SYSTEM-host + user-session WGC relay was removed with DDA/WGC.
pub(crate) fn resolve_topology() -> SessionTopology {
SessionTopology::SingleProcess
}
#[cfg(target_os = "windows")]
fn resolve_encoder() -> EncoderBackend {
match crate::encode::windows_resolved_backend() {
crate::encode::WindowsBackend::Nvenc => EncoderBackend::Nvenc,
crate::encode::WindowsBackend::Amf => EncoderBackend::Amf,
crate::encode::WindowsBackend::Qsv => EncoderBackend::Qsv,
crate::encode::WindowsBackend::Software => EncoderBackend::Software,
}
}
#[cfg(not(target_os = "windows"))]
fn resolve_encoder() -> EncoderBackend {
// `PUNKTFUNK_ENCODER=software` forces the GPU-less openh264 path — which must take CPU-staged
// capture (`EncoderBackend::Software.is_gpu() == false` → `output_format().gpu = false`), so the
// portal capturer delivers CPU RGB. Everything else stays `PlatformAuto` (NVENC/VAAPI resolved
// inside `encode::open_video`).
match crate::config::config().encoder_pref.as_str() {
"software" | "sw" | "openh264" => EncoderBackend::Software,
_ => EncoderBackend::PlatformAuto,
}
}