feat(host): Apollo-backlog hardening — cert gate, NVENC RFI, media QoS, async injector

A pass over the apollo-comparison backlog (re-verified against current code).
Lands four items end-to-end plus a Windows-DualSense scoping doc.

- #5/#92/#26 — GameStream paired-cert allow-list. tls.rs surfaces the verified
  peer cert to handlers (serve_https + PeerCertFingerprint, now shared with the
  mgmt API instead of duplicated); nvhttp gates /launch /resume /applist /cancel
  on AppState.paired and reports a real PairStatus; save_paired writes atomically
  (temp+rename). Closes the "mTLS accepts any client cert" hole. + regression test.

- #6/#51/#19/#22 — NVENC caps query -> reference-frame invalidation. nvenc.rs
  query_caps probes nvEncGetEncodeCaps (max dims / 10-bit / custom-VBV / RFI),
  rejecting over-range modes and degrading 10-bit->8-bit instead of an opaque
  InvalidParam. New Encoder::invalidate_ref_frames (default false -> caller
  keyframes); the Windows NVENC path implements real RFI (multi-ref DPB +
  nvEncInvalidateRefFrames, dedup + IDR-on-overflow). control.rs decodes the
  0x0301 lost-frame range (Apollo's IDX_INVALIDATE_REF_FRAMES) -> AppState.rfi_range
  -> encode loop, falling back to a keyframe. NOTE: the Windows NVENC impl is
  RTX-box/CI-pending (can't compile on Linux); adversarially reviewed vs the SDK.

- #43/#72 — media socket QoS + buffer growth. New punktfunk_core::transport::qos:
  grow_socket_buffers (factored out the native plane's 32MB SO_SNDBUF growth so the
  GameStream sockets reuse it) + set_media_qos (opt-in PUNKTFUNK_DSCP=1: DSCP CS5
  video / CS6 audio + Linux SO_PRIORITY, Apollo's scheme). Wired into UdpTransport
  and the GameStream video/audio sockets. Windows IP_TOS needs qWAVE (follow-up).

- #8/#45 — GameStream input injection off the ENet service thread. on_receive no
  longer injects inline (a slow inject head-blocked ENet keepalive/retransmit); it
  forwards to a dedicated injector thread. The hardened InjectorService moved from
  punktfunk1 into crate::inject (shared by both planes) + a coalesce step that sums
  adjacent relative-mouse/scroll deltas while preserving button/key/abs ordering.

Docs: re-verified apollo-comparison.md status (22 items already done/obsolete since
the snapshot) + windows-dualsense-scoping.md (ViGEm can't emulate a DualSense; real
DS5 on Windows needs a VHF virtual-HID driver — web-research pass pending).

fmt + clippy -D warnings clean; full workspace test suite green; no C-ABI/OpenAPI drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-21 00:06:30 +00:00
parent a2a6b858f7
commit 450bcf1e7b
20 changed files with 1060 additions and 281 deletions
+6 -37
View File
@@ -413,26 +413,15 @@ pub struct UdpTransport {
}
impl UdpTransport {
/// Target kernel socket-buffer size. A high-resolution frame is a burst (a 5120×1440
/// keyframe is ~130 packets the send thread hands to `sendmmsg` at once); the default
/// UDP buffer (~208 KB on Linux) overflows on it, which EAGAINs the host send (dropping
/// packets) or drops on the client recv — and with infinite-GOP a single lost frame
/// freezes the decode until the next RFI refresh. Requested large; the OS clamps to
/// `net.core.{wmem,rmem}_max` (Linux) / `kern.ipc.maxsockbuf` (macOS).
///
/// Sized for 1 Gbps+: at ~1.2 Gbps on the wire an 8 MB buffer is only ~49 ms of steady state,
/// and a single multi-MB IDR keyframe (~4 MB ≈ 3300 packets) instantly fills most of it. 32 MB
/// gives ~200 ms of headroom and absorbs a keyframe burst without EAGAIN drops. (Paced sending
/// — `punktfunk1.rs::paced_submit` — now spreads a big frame's overflow, so this buffer mostly absorbs
/// the immediate microburst rather than a whole unpaced frame.)
const TARGET_SOCKBUF: usize = 32 * 1024 * 1024;
/// Bind `local` and `connect` to `peer`, so `send`/`recv` need no address and the
/// kernel filters to this peer. Non-blocking, matching the [`Transport`] contract.
pub fn connect(local: &str, peer: &str) -> std::io::Result<Self> {
let socket = UdpSocket::bind(local)?;
socket.connect(peer)?;
Self::grow_buffers(&socket);
super::qos::grow_socket_buffers(&socket);
// The native data plane is video-dominant — tag it as the video class (opt-in via
// PUNKTFUNK_DSCP). Each end marks its own egress.
super::qos::set_media_qos(&socket, super::qos::MediaClass::Video);
socket.set_nonblocking(true)?;
Ok(UdpTransport { socket })
}
@@ -481,7 +470,8 @@ impl UdpTransport {
let target = observed.map(|s| s.to_string());
socket.connect(target.as_deref().unwrap_or(fallback_peer))?;
socket.set_read_timeout(None)?;
Self::grow_buffers(&socket);
super::qos::grow_socket_buffers(&socket);
super::qos::set_media_qos(&socket, super::qos::MediaClass::Video);
socket.set_nonblocking(true)?;
Ok((UdpTransport { socket }, punched))
}
@@ -498,27 +488,6 @@ impl UdpTransport {
self.socket.local_addr()
}
/// Best-effort grow of SO_SNDBUF/SO_RCVBUF (see [`TARGET_SOCKBUF`]). A failure isn't fatal
/// (the stream just runs lossier); a grant far below the request means the OS cap is too
/// low for clean 4K/5K streaming, so warn once with the knob to raise.
fn grow_buffers(socket: &UdpSocket) {
let sock = socket2::SockRef::from(socket);
let _ = sock.set_send_buffer_size(Self::TARGET_SOCKBUF);
let _ = sock.set_recv_buffer_size(Self::TARGET_SOCKBUF);
// The kernel reports back the (possibly clamped, Linux-doubled) granted size.
let granted = sock
.send_buffer_size()
.unwrap_or(0)
.min(sock.recv_buffer_size().unwrap_or(0));
if granted < Self::TARGET_SOCKBUF / 4 {
tracing::warn!(
granted_kb = granted / 1024,
"UDP socket buffer capped well below target — high-resolution streaming may drop \
frames; raise net.core.wmem_max / net.core.rmem_max (Linux) for clean 4K/5K"
);
}
}
/// Apple batched receive via `recvmsg_x` — drains up to `out.len()` datagrams in one syscall into
/// the caller's reused buffers (the recv counterpart of Linux `recvmmsg`, which Darwin lacks).
/// SAFETY: each `MsghdrX` holds a raw pointer into `iovs`, which holds raw pointers into `out`'s