Files
punktfunk/crates/punktfunk-core
enricobuehler 448986f41c
apple / swift (push) Successful in 1m16s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
ci / rust (push) Successful in 1m31s
deb / build-publish (push) Successful in 2m36s
ci / web (push) Failing after 36s
ci / docs-site (push) Failing after 32s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
rpm / build-publish (push) Successful in 4m38s
docker / deploy-docs (push) Successful in 17s
perf(core): UDP GSO send path (the multi-Gbps lever)
sendmmsg already batches syscalls but still builds one sk_buff per datagram —
the kernel-side wall above ~1 Gbps. UDP Generic Segmentation Offload hands the
kernel one big buffer it splits into gso_size datagrams, building ~1 GSO skb per
≤64 segments. Research (LWN/Cloudflare/Tailscale) measures ~2.4x throughput at
equal CPU and 17-44x fewer syscalls, and that sendmmsg batching alone is
insufficient — you need true segmentation offload.

Adds Transport::send_gso (default = send_batch) + a UdpTransport Linux override:
coalesces a frame's equal-size wire packets (shards are zero-padded to a constant
size, so a whole frame is one gso_size) into ≤64-segment sendmsg(UDP_SEGMENT)
calls. seal/send routes through it. Opt-in via PUNKTFUNK_GSO (new unsafe hot-path
code) with automatic fallback to sendmmsg on any GSO error (unsupported kernel/
path), latched per process. Loopback unit test validates the cmsg segmentation;
full session over loopback streams clean (0% loss). Linux-only; loopback/non-Linux
keep sendmmsg/scalar.

Next levers: in-place AES-GCM seal (kill per-packet allocs), UDP GRO on recv,
drop the sleep-pacing in favor of the kernel qdisc, jumbo MTU.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:29:51 +00:00
..