feat(protocol,host): negotiate video codec + add a GPU-less software (openh264) encode path
Phase 1 of codec negotiation, and the Linux software H.264 encode path it unblocks.

**Codec negotiation (core `quic`):**
- `Hello.video_codecs` (bitfield: CODEC_H264/HEVC/AV1): the client advertises what it can decode; appended as a trailing byte (older client → 0 = HEVC-only, back-compat).
- `Welcome.codec`: the single codec the host resolved and will emit; trailing byte (older host → HEVC).
- `resolve_codec(client, host_capable)` picks the shared codec (precedence HEVC > AV1 > H.264) or `None` → the host refuses honestly rather than sending an undecodable stream.
- Roundtrip + back-compat tests; cbindgen exports the CODEC_* constants.

**Software encoder (host):**
- The openh264 `OpenH264Encoder` (was Windows-only) is now built on Linux too; it's platform-agnostic (consumes CPU RGB `CapturedFrame`s, statically-bundled openh264). `openh264` moved to the shared linux+windows Cargo target.
- `PUNKTFUNK_ENCODER=software` selects it: `open_video` gains a `software` branch (H.264 only), and `session_plan::resolve_encoder` / `capture::gpu_encode` resolve `EncoderBackend::Software` → `output_format().gpu = false`, so the portal capturer delivers CPU RGB. Explicit-only (auto never picks it: a box with a dead driver still has /dev/nvidiactl and would mis-resolve NVENC).

**Host codec resolution (`punktfunk1`):**
- The native path no longer hardcodes HEVC: it resolves the codec from the client's advertised set ∩ the host's capability (`Codec::host_wire_caps`: software→H.264, else HEVC), threads it through `SessionPlan.codec`, and opens the encoder + validates reconfigures at that codec. A software host + HEVC-only client is refused with a clear error.
- 4:4:4 is gated on HEVC (it's HEVC-only).

**Probe:** advertises H264|HEVC|AV1 and logs the resolved codec.

Validated on the GPU-less dev box: negotiation is live end-to-end (probe advertises 0x07 → host resolves H.264 → Welcome reports it → plan = Software/H264), and the openh264 unit test (CPU RGB → AnnexB IDR) now runs on Linux.

Full capture→encode still needs a GPU on this box: every compositor screencast path (KWin GL, gamescope VK_EXT_physical_device_drm, wlroots EGL) requires one, and software render (llvmpipe/pixman) can't be captured, so this box exercises negotiation + encoder, not live capture. The software path unblocks GPU-less-*encode* boxes that still have a display GPU.

Phase 2 (clients advertising real codecs + decoding per Welcome.codec) is a follow-up.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
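
As a reading aid for the negotiation rule described above, here is a minimal sketch of how a `resolve_codec`-style function could behave. It is not the code from this commit: the bit values (0x01/0x02/0x04) are an assumption inferred only from the probe advertising 0x07 for H264|HEVC|AV1, and treating a client mask of 0 as HEVC-only inside the resolver (rather than at decode time) is also an assumption.

```rust
// Assumed bit assignments; the commit only states that H264|HEVC|AV1 == 0x07.
pub const CODEC_H264: u8 = 0x01;
pub const CODEC_HEVC: u8 = 0x02;
pub const CODEC_AV1: u8 = 0x04;

/// Pick the single codec both sides support, preferring HEVC, then AV1, then H.264.
/// Returns `None` when the two masks share nothing, so the host can refuse the session.
pub fn resolve_codec(client: u8, host_capable: u8) -> Option<u8> {
    // Assumption: an older client that never sent the trailing byte decodes as 0,
    // which is read as "HEVC-only" for back-compat.
    let client = if client == 0 { CODEC_HEVC } else { client };
    let shared = client & host_capable;

    // Walk the precedence list and return the first bit present in both masks.
    [CODEC_HEVC, CODEC_AV1, CODEC_H264]
        .iter()
        .copied()
        .find(|&bit| (shared & bit) != 0)
}
```

Under these assumptions, a probe advertising 0x07 against a software host (`CODEC_H264` only) resolves to H.264, while a legacy client (mask 0) against the same host resolves to `None`, which matches the refusal case the message describes.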
@@ -71,6 +71,10 @@ tempfile = "3"
 # MSVC too (needs CMake + NASM, both on the box). Both platforms that have an audio-capture backend.
 [target.'cfg(any(target_os = "linux", target_os = "windows"))'.dependencies]
 opus = "0.3"
+# Software H.264 encoder — the GPU-less encode path on both Linux and Windows (and a fallback when no
+# hardware encoder is available). The default `source` feature statically compiles OpenH264 (BSD-2) —
+# no system lib, builds on MSVC; nasm on PATH adds the SIMD fast path.
+openh264 = "0.9"

 [target.'cfg(target_os = "linux")'.dependencies]
 # `screencast` gates the ScreenCast portal module; `remote_desktop` adds the RemoteDesktop
@@ -198,9 +202,6 @@ winreg = "0.56"
 # Parse each Xbox/Game-Pass game's MicrosoftGame.config (GDK manifest XML) for the Xbox store
 # provider — a small read-only DOM is all we need (Identity/Executable/ShellVisuals/StoreId).
 roxmltree = "0.21"
-# Software H.264 encoder (GPU-less path + NVENC fallback). The default `source` feature statically
-# compiles OpenH264 (BSD-2) — no system lib, builds on MSVC; nasm on PATH adds the SIMD fast path.
-openh264 = "0.9"
 # WASAPI loopback audio capture (default render endpoint -> 48 kHz stereo f32 for the Opus path).
 wasapi = "0.23"
 # Virtual Xbox 360 gamepad: the in-tree XUSB companion UMDF driver (packaging/windows/xusb-driver),

@@ -95,7 +95,13 @@ pub(crate) fn gpu_encode() -> bool {
 }
 #[cfg(not(target_os = "windows"))]
 pub(crate) fn gpu_encode() -> bool {
-    true
+    // The GPU-less software encoder (openh264) needs CPU-staged RGB frames; every other Linux
+    // backend (NVENC/CUDA, VAAPI) is GPU-resident. Mirrors `session_plan::resolve_encoder`, for the
+    // GameStream/spike entry points that use `OutputFormat::resolve` instead of a full `SessionPlan`.
+    !matches!(
+        crate::config::config().encoder_pref.as_str(),
+        "software" | "sw" | "openh264"
+    )
 }

 /// A captured frame. [`format`](Self::format)/dimensions describe the pixels regardless of

@@ -57,6 +57,37 @@ impl ChromaFormat {
 }

 impl Codec {
+    /// Map a negotiated `quic` codec bit ([`punktfunk_core::quic::CODEC_H264`] etc.) to the encoder
+    /// [`Codec`]. Unknown / `0` → HEVC (the pre-negotiation default). Inverse of [`Codec::to_wire`].
+    pub fn from_wire(bit: u8) -> Codec {
+        match bit {
+            punktfunk_core::quic::CODEC_H264 => Codec::H264,
+            punktfunk_core::quic::CODEC_AV1 => Codec::Av1,
+            _ => Codec::H265,
+        }
+    }
+
+    /// The single `quic` codec bit for this codec (echoed in [`punktfunk_core::quic::Welcome::codec`]).
+    pub fn to_wire(self) -> u8 {
+        match self {
+            Codec::H264 => punktfunk_core::quic::CODEC_H264,
+            Codec::H265 => punktfunk_core::quic::CODEC_HEVC,
+            Codec::Av1 => punktfunk_core::quic::CODEC_AV1,
+        }
+    }
+
+    /// The `quic` codec bitfield the host can currently **emit** on the punktfunk/1 native path,
+    /// given the resolved encode backend. The GPU-less software encoder (openh264) produces H.264
+    /// only; every GPU backend emits HEVC today (per-GPU H.264/AV1 negotiation on the native path is
+    /// future work — GameStream already negotiates codecs with Moonlight separately). Fed to
+    /// [`punktfunk_core::quic::resolve_codec`] against the client's advertised codecs.
+    pub fn host_wire_caps() -> u8 {
+        match crate::config::config().encoder_pref.as_str() {
+            "software" | "sw" | "openh264" => punktfunk_core::quic::CODEC_H264,
+            _ => punktfunk_core::quic::CODEC_HEVC,
+        }
+    }
+
     /// The FFmpeg NVENC encoder name (selected by name, not codec id — the latter would
     /// pick the software encoder).
     pub fn nvenc_name(self) -> &'static str {
@@ -283,6 +314,21 @@ pub fn open_video(
             chroma,
         ),
         "vaapi" | "amd" | "intel" => open_vaapi(),
+        // GPU-less software H.264 (openh264) — for a headless / GPU-lost box. Explicit-only:
+        // `auto` never picks it (a box with `/dev/nvidiactl` present but a dead driver would
+        // otherwise wrongly resolve to NVENC). Needs H.264 (openh264 emits only that) and a CPU
+        // RGB frame, which the capturer delivers because the software backend resolves `gpu=false`.
+        "software" | "sw" | "openh264" => {
+            if codec != Codec::H264 {
+                anyhow::bail!(
+                    "the software encoder emits H.264 only; the session negotiated {codec:?} \
+                     (a client must advertise CODEC_H264 to reach a software host)"
+                );
+            }
+            let _ = (cuda, bit_depth); // software path is CPU + 8-bit only
+            sw::OpenH264Encoder::open(format, width, height, fps, bitrate_bps)
+                .map(|e| Box::new(e) as Box<dyn Encoder>)
+        }
         "auto" | "" => {
             // A CUDA frame can ONLY be consumed by NVENC, and a box with the NVIDIA device
             // nodes always prefers it. Everything else (AMD/Intel) takes the VAAPI path.
@@ -303,7 +349,7 @@ pub fn open_video(
             }
         }
         other => anyhow::bail!(
-            "unknown PUNKTFUNK_ENCODER={other:?} — use auto (default), nvenc, or vaapi"
+            "unknown PUNKTFUNK_ENCODER={other:?} — use auto (default), nvenc, vaapi, or software"
         ),
     }
 }
@@ -708,8 +754,10 @@ mod linux;
 #[cfg(all(target_os = "windows", feature = "nvenc"))]
 #[path = "encode/windows/nvenc.rs"]
 mod nvenc;
-#[cfg(target_os = "windows")]
-#[path = "encode/windows/sw.rs"]
+// Software (openh264) H.264 encoder — the GPU-less path on BOTH Windows and Linux (a headless /
+// GPU-less test box, or a fallback when no hardware encoder is available). Platform-agnostic: it
+// consumes CPU RGB `CapturedFrame`s and the statically-bundled openh264 build.
+#[cfg(any(target_os = "windows", target_os = "linux"))]
 mod sw;
 #[cfg(target_os = "linux")]
 #[path = "encode/linux/vaapi.rs"]

@@ -658,12 +658,30 @@ async fn serve_session(
     );
     // The pairing gate (require_pairing → paired? else park for delegated approval) ran above,
     // before this future, so a client reaching here is paired (or the host is `--open`).
-    crate::encode::validate_dimensions(
-        crate::encode::Codec::H265,
-        hello.mode.width,
-        hello.mode.height,
-    )
-    .context("client-requested mode")?;
+
+    // Codec negotiation: pick the one codec this host will emit (its backend capability ∩ the
+    // client's advertised codecs). A GPU-less software host emits H.264, so an HEVC-only client
+    // shares nothing with it → refuse honestly rather than send a stream it can't decode.
+    let host_codecs = crate::encode::Codec::host_wire_caps();
+    let codec_bit = punktfunk_core::quic::resolve_codec(hello.video_codecs, host_codecs)
+        .ok_or_else(|| {
+            anyhow!(
+                "no shared video codec: client advertised 0x{:02x}, host can emit 0x{:02x} \
+                 (a software-encode host produces H.264 — the client must advertise CODEC_H264)",
+                hello.video_codecs,
+                host_codecs
+            )
+        })?;
+    let codec = crate::encode::Codec::from_wire(codec_bit);
+    tracing::info!(
+        ?codec,
+        client_codecs = format_args!("0x{:02x}", hello.video_codecs),
+        host_codecs = format_args!("0x{host_codecs:02x}"),
+        "video codec negotiated"
+    );
+
+    crate::encode::validate_dimensions(codec, hello.mode.width, hello.mode.height)
+        .context("client-requested mode")?;

     // Resolve the client's compositor preference to a concrete backend *now*, so the Welcome
     // can report what we'll actually drive. Only the Virtual source has a compositor; the
@@ -749,7 +767,11 @@ async fn serve_session(
     // the cheap gates already pass. The result is cached process-wide (a negative latches until
     // restart — acceptable: a GPU either supports HEVC 4:4:4 or it doesn't, and a transient open
     // failure here is rare since the session's own encoder isn't open yet).
-    let gpu_supports_444 = if host_wants_444 && client_supports_444 && capture_supports_444 {
+    let gpu_supports_444 = if codec == crate::encode::Codec::H265
+        && host_wants_444
+        && client_supports_444
+        && capture_supports_444
+    {
         tokio::task::spawn_blocking(|| {
             crate::encode::can_encode_444(crate::encode::Codec::H265)
         })
@@ -826,6 +848,9 @@ async fn serve_session(
         // The resolved audio channel count the audio thread will capture + Opus-(multi)stream
         // encode (2/6/8). The client builds its decoder from this echoed value.
         audio_channels,
+        // The negotiated codec the encoder will emit (H.264 for a software host, else HEVC). The
+        // client builds its decoder from this instead of assuming HEVC.
+        codec: codec_bit,
     };
     io::write_msg(&mut send, &welcome.encode()).await?;

@@ -838,6 +863,9 @@ async fn serve_session(
     .await
     .map_err(|_| anyhow!("handshake timed out after {HANDSHAKE_TIMEOUT:?}"))??;
     let (mut ctrl_send, mut ctrl_recv) = (send, recv);
+    // Negotiated codec (HEVC / H.264), derived from the Welcome. `Copy`, so the control task's
+    // `async move` captures a copy and it stays usable for the data-plane SessionContext below.
+    let codec = crate::encode::Codec::from_wire(welcome.codec);
     let client_udp = std::net::SocketAddr::new(peer.ip(), start.client_udp_port);
     tracing::info!(
         %client_udp,
@@ -874,7 +902,7 @@ async fn serve_session(
             if let Ok(req) = Reconfigure::decode(&msg) {
                 let ok = req.mode.refresh_hz > 0
                     && crate::encode::validate_dimensions(
-                        crate::encode::Codec::H265,
+                        codec,
                         req.mode.width,
                         req.mode.height,
                     )
@@ -1169,6 +1197,7 @@ async fn serve_session(
         bitrate_kbps,
         bit_depth,
         chroma,
+        codec,
         probe_rx,
         probe_result_tx,
         fec_target: fec_target_dp,
@@ -2727,6 +2756,9 @@ struct SessionContext {
     bit_depth: u8,
     /// Negotiated chroma subsampling (4:2:0, or 4:4:4 when the client + host + GPU all support it).
     chroma: crate::encode::ChromaFormat,
+    /// Negotiated video codec the encoder emits (HEVC by default; H.264 for a software host). Also
+    /// used to rebuild the encoder at the same codec across a mid-stream mode reconfigure.
+    codec: crate::encode::Codec,
     /// Speed-test burst requests (see [`service_probes`]).
     probe_rx: std::sync::mpsc::Receiver<ProbeRequest>,
     /// Speed-test results back to the control task.
@@ -2758,7 +2790,7 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
     // path now reads this typed `SessionPlan` instead of re-deriving from config at each dispatch site
     // (the latent "capture and encode disagree on the backend" hazard, plan §2.4). `bit_depth` is the
     // only per-session input — capture/topology/encoder are otherwise pure functions of `HostConfig`.
-    let plan = crate::session_plan::SessionPlan::resolve(ctx.bit_depth, ctx.chroma);
+    let plan = crate::session_plan::SessionPlan::resolve(ctx.bit_depth, ctx.chroma, ctx.codec);
     tracing::info!(?plan, "resolved session plan");
     // Single-process path: unpack the context into the locals the loop below uses (names unchanged, so the
     // body is byte-for-byte the same; the receivers are now owned but `try_recv()` is identical).
@@ -2774,6 +2806,8 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
         bit_depth,
         // The resolved chroma is already captured in `plan` (above); ignore the duplicate here.
         chroma: _,
+        // Likewise the codec — `plan.codec` (resolved from `ctx.codec`) is the source of truth below.
+        codec: _,
         probe_rx,
         probe_result_tx,
         fec_target,
@@ -3448,7 +3482,7 @@ fn build_pipeline(
     // `bit_depth` is the handshake-negotiated value (8, or 10 = HEVC Main10 when the client
     // advertised VIDEO_CAP_10BIT and the host opted in). Threaded down from the Welcome.
     let enc = crate::encode::open_video(
-        crate::encode::Codec::H265,
+        plan.codec,
         frame.format,
         frame.width,
         frame.height,

@@ -94,12 +94,19 @@ pub struct SessionPlan {
     /// Handshake-negotiated chroma subsampling (4:2:0, or full-chroma 4:4:4 when the client + host +
     /// GPU all support it). Resolved before the Welcome; `Yuv420` on every backend that declined it.
     pub chroma: crate::encode::ChromaFormat,
+    /// Handshake-negotiated video codec the encoder emits — HEVC by default, H.264 for a GPU-less
+    /// software host (`resolve_codec` over the client's advertised codecs ∩ the host's capability).
+    pub codec: crate::encode::Codec,
 }

 impl SessionPlan {
-    /// Resolve the whole plan once from [`config`](crate::config) + the negotiated `bit_depth` and
-    /// `chroma`.
-    pub fn resolve(bit_depth: u8, chroma: crate::encode::ChromaFormat) -> Self {
+    /// Resolve the whole plan once from [`config`](crate::config) + the negotiated `bit_depth`,
+    /// `chroma`, and `codec`.
+    pub fn resolve(
+        bit_depth: u8,
+        chroma: crate::encode::ChromaFormat,
+        codec: crate::encode::Codec,
+    ) -> Self {
         SessionPlan {
             capture: CaptureBackend::resolve(),
             topology: resolve_topology(),
@@ -107,6 +114,7 @@ impl SessionPlan {
             bit_depth,
             hdr: bit_depth >= 10,
             chroma,
+            codec,
         }
     }

@@ -154,5 +162,12 @@ fn resolve_encoder() -> EncoderBackend {

 #[cfg(not(target_os = "windows"))]
 fn resolve_encoder() -> EncoderBackend {
-    EncoderBackend::PlatformAuto
+    // `PUNKTFUNK_ENCODER=software` forces the GPU-less openh264 path — which must take CPU-staged
+    // capture (`EncoderBackend::Software.is_gpu() == false` → `output_format().gpu = false`), so the
+    // portal capturer delivers CPU RGB. Everything else stays `PlatformAuto` (NVENC/VAAPI resolved
+    // inside `encode::open_video`).
+    match crate::config::config().encoder_pref.as_str() {
+        "software" | "sw" | "openh264" => EncoderBackend::Software,
+        _ => EncoderBackend::PlatformAuto,
+    }
 }
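
The diff matches the same three aliases (`"software" | "sw" | "openh264"`) in `gpu_encode`, `Codec::host_wire_caps`, and `resolve_encoder`, and the three results have to agree for capture and encode to line up. The sketch below illustrates that invariant in isolation. It is a simplified stand-in, not the project's code: `is_software_pref` is a hypothetical helper, the functions take the preference as a `&str` parameter instead of reading `crate::config::config().encoder_pref`, and the codec bit values are assumed.

```rust
// Assumed bit values (not confirmed by the diff).
pub const CODEC_H264: u8 = 0x01;
pub const CODEC_HEVC: u8 = 0x02;

/// Simplified stand-in for the host's backend enum; only the two variants the diff names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EncoderBackend {
    Software,
    PlatformAuto,
}

/// Hypothetical helper: true for every alias that selects the openh264 path.
pub fn is_software_pref(pref: &str) -> bool {
    matches!(pref, "software" | "sw" | "openh264")
}

/// Software is explicit-only; every other value leaves backend choice to `open_video`.
pub fn resolve_encoder(pref: &str) -> EncoderBackend {
    if is_software_pref(pref) {
        EncoderBackend::Software
    } else {
        EncoderBackend::PlatformAuto
    }
}

/// The software encoder needs CPU-staged frames, so GPU-resident capture is off for it.
pub fn gpu_encode(pref: &str) -> bool {
    !is_software_pref(pref)
}

/// What the host can emit on the native path: H.264 for software, HEVC otherwise.
pub fn host_wire_caps(pref: &str) -> u8 {
    if is_software_pref(pref) {
        CODEC_H264
    } else {
        CODEC_HEVC
    }
}
```

With one predicate behind all three answers, a new alias cannot make the capturer deliver GPU frames to a CPU-only encoder. Whether the real code should be factored this way is a design question this sketch does not settle.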