From a bug-hunt + unsafe-audit pass (4 reviewers + adversarial verify). It
confirmed ZERO real bugs in the recent batched/paced data-plane work — these are
the surfaced cleanups + one genuine soundness fix:
- SOUNDNESS (reduce unsafe): inject/gamepad.rs::pump_ff did `ptr::read` of an
InputEventRaw (align 8, holds a timeval) out of a 1-aligned [u8; N] buffer — UB
per the reference (x86_64 tolerates it, but it can miscompile under LTO). Use
ptr::read_unaligned + a SAFETY note. Zero behavior change.
- recv parity: recv_batch (recvmmsg) didn't drop an oversized/truncated datagram
the way scalar recv does — poll_frame now skips a message whose len fills the
buffer (> MAX_DATAGRAM_BYTES), matching recv's `n >= RECV_BUF` drop. (AEAD
already rejected these on encrypted sessions; this restores the documented
invariant on the batched path.)
- dedup unsafe FFI: factor the identical mmsghdr-from-iovec construction out of
send_batch + recv_batch into one `mmsghdrs()` helper — the raw-pointer
scaffolding + its lifetime SAFETY note now live in one place.
- docs: TARGET_SOCKBUF no longer calls paced sending future work (it landed,
m3.rs::paced_submit); gamescope.rs input is no longer "(TODO)" (wired +
live-validated); the PUNKTFUNK_PERF `wire_mbps` field is renamed `tx_mbps` and
noted as attempted/sealed bytes (send_dropped shows what didn't reach the wire).
Full suite (35 + loopback round-trip + 6) + clippy + fmt green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Final increment of the 1 Gbps data-plane rework — the recv counterpart of the
sendmmsg work. The client recv path did one recvfrom + one Vec allocation per
packet (and the pump's 300µs idle sleep could let packets pile up at line rate).
- Transport gains recv_batch(&mut [Vec<u8>], &mut [usize]) -> count; default is
a single scalar recv into out[0] (loopback + non-Linux).
- UdpTransport overrides it on Linux with recvmmsg (MSG_DONTWAIT) draining up to
N datagrams per syscall into the caller's reused buffers — no per-packet alloc.
- Session::poll_frame owns a lazily-allocated recv ring (RECV_BATCH=32) and
consumes it one packet at a time across calls, refilling with one recvmmsg when
drained. Encapsulated: the punktfunk-client-rs + NativeClient pumps are
unchanged, and draining a batch per syscall means the 300µs sleep no longer
underdrains. Added UdpTransport::local_addr (used by the test, generally handy).
~125k → ~4k recv syscalls/sec at line rate, zero per-packet recv allocation.
Verified: new recv_batch_drains_over_loopback test (50 datagrams drained intact
via recvmmsg) + the existing loopback round-trip now runs through the batched
poll_frame; full suite (35 + round-trip + 6) + clippy + fmt green.
Decode-in-place (kill the per-packet open_from_wire alloc) is a separate later
optimization. With A (sendmmsg) + B (paced send) + C (recvmmsg), the native data
plane is batched + paced end to end.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First increment of the 1 Gbps send-path rework (the measured bottleneck): the
native data plane did one send() syscall per packet — at ~125k pkt/s (1 Gbps
wire) that burns a core on syscalls. Port the proven GameStream sendmmsg path
into the core Transport seam.
- Transport gains `send_batch(&[&[u8]]) -> usize` (count handed to the kernel;
caller counts the rest as send-buffer drops). Default = the scalar send loop
(loopback transport + non-Linux).
- UdpTransport overrides it on Linux with `sendmmsg` (64 datagrams/syscall);
the connected socket needs no per-message address. Non-blocking-aware: a full
send buffer yields a short count / EAGAIN, and we stop + report what went out
rather than block or retry (same lossy, FEC-protected contract as send()).
- Session::submit_frame seals every shard then hands the whole frame to
send_batch in ONE call instead of looping send() — ~64x fewer syscalls per
frame on the native + GameStream-over-core paths; send_dropped accounting
preserved (total - sent).
~125k → ~2k syscalls/sec at 1 Gbps line rate. Verified: new loopback-UDP test
send_batch_delivers_over_loopback (100 batched packets arrive intact, datagram
boundaries preserved); full core suite + clippy + fmt green.
Next increments: a paced send thread (microburst shaping so a real NIC doesn't
drop line-rate bursts) and recvmmsg on the client.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First step of 1 Gbps+ readiness (the whole point of the GF(2^16) Leopard FEC):
make 1 Gbps configurable and its dominant failure mode observable, before the
real transport work (sendmmsg + paced encode|send split) lands.
Investigation (6-way) verdict: we're ~halfway, and it's mostly clamps plus one
real piece of work. The integer/type path, FEC (a 1 Gbps frame is only a few
hundred shards in one GF(2^16) block, far under the 65535 ceiling), AES-GCM
(AES-NI, ~10-25x headroom), and the M1 reassembler bounds (fully derived from
the negotiated FecConfig) are ALL already 1 Gbps-ready and untouched.
This commit (the configurable + observable foundation):
- m3.rs: MAX_BITRATE_KBPS 500_000 -> 2_000_000 (2 Gbps headroom over the 1 Gbps+
target); MAX_PROBE_KBPS 1_000_000 -> 3_000_000 (probe can demonstrate headroom
ABOVE the session cap so a client can confidently pick a 1 Gbps+ bitrate).
- transport/udp.rs: TARGET_SOCKBUF 8 MB -> 32 MB (a multi-MB IDR keyframe burst
no longer fills the buffer); scripts/99-punktfunk-net.conf bumped to match.
- Observability: Transport::send now returns Ok(true|false) (false = WouldBlock
send-buffer drop, previously a silent Ok(())). Session counts these as a new
`packets_send_dropped` stat (distinct from recv-side packets_dropped) — in
Stats, the C ABI PunktfunkStats (header regenerated), a PUNKTFUNK_PERF periodic
wire-Mbps + drop dump in virtual_stream, and the speed-test probe completion
log. This is the dominant 1 Gbps+ loss mode and was invisible.
Loopback-verified: a probe now runs at 1.2 Gbps target (no longer truncated to
1 Gbps) with the drop counter live. NOT yet a sustained-1-Gbps proof — the
single-send()-per-packet native path is the next, real piece of work (port the
proven GameStream sendmmsg + paced send thread into the core Transport).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The data-plane UDP sockets used the OS default buffer (~208 KB on Linux, similar
on macOS), which is smaller than a single high-resolution frame burst: a
5120×1440 keyframe is ~130 packets the encode|send thread hands to sendmmsg at
once. The burst overflows the buffer — EAGAIN on the host send (now dropped, was
fatal) or a silent drop on the client recv — and because the data plane runs
infinite-GOP, one lost frame breaks every subsequent reference and the decode
freezes on the last good frame until an RFI refresh that may never catch up.
Symptom: connect at 5120×1440, see ONE frame, then a frozen image (audio + input
keep working — those ride QUIC, not this socket).
Set SO_SNDBUF/SO_RCVBUF to 8 MB (clamped by the OS to net.core.{w,r}mem_max on
Linux / kern.ipc.maxsockbuf on macOS); warn if the grant lands far below target so
an undersized host is diagnosable. The client side matters most — the SAME
UdpTransport backs the Apple client's data plane via the C ABI, and macOS grants
multi-MB buffers without any sysctl, so a rebuilt client stops losing frames.
Validated live, bazzite→client at 5120×1440: was 1319/1500 frames (12% loss →
freeze), now 1500/1500 @60 and 5279/5279 @240 (split-encode active), zero
mismatches, p50 1.9–3.4 ms. Host send buffer was still capped at 416 KB and lost
nothing — the loss was purely the client recv buffer.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
UdpTransport sockets are non-blocking, so a momentarily-full kernel send buffer
makes socket.send return WouldBlock (EAGAIN). submit_frame propagated that as a
fatal error, tearing the whole punktfunk/1 session down — observed when attaching
to an already-running source (a headless Steam session) that emits frames at full
rate the instant capture connects: the first burst saturates the tx queue and the
session dies before a single frame reaches the client.
The data plane is lossy + Leopard-FEC-protected and runs infinite-GOP with RFI
keyframes, so the real-time-correct response to a full tx queue is to DROP the
packet (the next frame / FEC recovers) — exactly what the recv path already does
for WouldBlock. Blocking would queue stale frames and add latency. Loopback/M1
paths are unaffected (LoopbackTransport never blocks; M1 tests stay green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Full project rename, decided 2026-06-10:
- Crates/binaries: punktfunk-core / punktfunk-host / punktfunk-client-rs.
- C ABI: punktfunk_* symbols, Punktfunk* types, include/punktfunk_core.h,
PUNKTFUNK_FEATURE_QUIC guard (header regenerated; cbindgen renames updated, incl.
PUNKTFUNK_BTN_*/PUNKTFUNK_AXIS_* wire constants).
- Protocol: punktfunk/1 — control-plane magic LMN1 → PKF1, nonce salt lmn1 → pkf1.
WIRE BREAK: clients must be rebuilt from this revision.
- Env knobs: PUNKTFUNK_VIDEO_SOURCE / PUNKTFUNK_COMPOSITOR / PUNKTFUNK_ZEROCOPY / ….
- Host config dir: ~/.config/punktfunk (the box's dir was migrated in place — the
persistent identity is unchanged, pinned fingerprints stay valid).
- Swift package: PunktfunkKit + PunktfunkCore.xcframework + PunktfunkConnection
(Sources/PunktfunkClient app + tests renamed with it); build-xcframework.sh updated.
- scripts/: 60-punktfunk.rules, punktfunk-host.service; OpenAPI doc regenerated.
Also: scripts/headless/run-headless-kde.sh — full headless Plasma bringup. Root cause of
"desktop but no apps/settings" over the stream: plasmashell launched without
XDG_MENU_PREFIX=plasma-, so the launcher resolved a nonexistent applications.menu and
rendered an empty menu. The script sets the complete KDE session env (menu prefix,
KDE_FULL_SESSION, session version) and rebuilds ksycoca before starting plasmashell.
Gate: 97/97 tests, clippy -D warnings (both feature sets), fmt, C-ABI harness PASS,
zero lumen references left outside .git.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>