fix(net/gso): fall back to sendmmsg on EMSGSIZE instead of tearing down

Enabling PUNKTFUNK_GSO on a host whose egress MTU is below our UDP segment size
made every GSO send return EMSGSIZE (errno 90, "Message too long"): the kernel
validates each GSO segment against the device MTU at send time, which plain
sendmmsg does not. EMSGSIZE was in neither gso_unsupported() nor is_transient_io,
so it propagated as a fatal "send failed — stopping stream" and killed every
session the moment GSO was on (observed live: connections fail instantly and the
speed test reports 0 Mbps).

Add EMSGSIZE to gso_unsupported() so that it latches GSO off for the process and
the send completes via sendmmsg, the standard "GSO not usable on this path"
fallback. Measured after the fix: the same host and path reach 1 Gbps at 0.0%
loss over the real LAN via sendmmsg, and the host send path sustains a 2 Gbps
probe with send_dropped=0. GSO is therefore an optimization for rates above
2 Gbps, not a requirement for 1 Gbps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 01:06:41 +00:00
parent 16ccc7c876
commit 4d26f61e40
+11 -3
@@ -72,13 +72,21 @@ mod gso {
 }
 }
-/// True if the send error means UDP GSO isn't supported here (vs a transient/real failure) — so we
-/// latch GSO off and fall back to `sendmmsg` rather than tear the stream down.
+/// True if the send error means UDP GSO isn't usable on this kernel/NIC/path (vs a transient/real
+/// failure) — so we latch GSO off and fall back to `sendmmsg` rather than tear the stream down.
+/// `EMSGSIZE` is the important one in practice: a NIC/egress path whose effective MTU is below our
+/// segment size rejects the whole GSO super-buffer at send time (the kernel validates each segment
+/// against the device MTU, which plain `sendmmsg` does not) — observed live as a code-90
+/// "Message too long" that instantly killed the stream. Treat it as "no GSO here" and fall back.
 #[cfg(target_os = "linux")]
 fn gso_unsupported(e: &std::io::Error) -> bool {
     matches!(
         e.raw_os_error(),
-        Some(libc::ENOPROTOOPT) | Some(libc::EOPNOTSUPP) | Some(libc::EINVAL) | Some(libc::EIO)
+        Some(libc::ENOPROTOOPT)
+            | Some(libc::EOPNOTSUPP)
+            | Some(libc::EINVAL)
+            | Some(libc::EIO)
+            | Some(libc::EMSGSIZE)
     )
 }
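
For illustration only, here is a minimal sketch of how a caller might combine this classification with a "latch off and fall back" send path. It is not the code from this repository: `send_with_fallback`, the `latch` parameter, and the two closures standing in for the GSO and `sendmmsg` paths are hypothetical names. The errno values are hard-coded Linux numbers so that the sketch needs only `std`; real code would presumably use the `libc` constants, as the diff does.

```rust
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};

// Linux errno values, hard-coded so this sketch depends only on std.
// (Assumption for illustration: the real code uses the `libc` constants.)
const EIO: i32 = 5;
const EINVAL: i32 = 22;
const EMSGSIZE: i32 = 90;
const ENOPROTOOPT: i32 = 92;
const EOPNOTSUPP: i32 = 95;

/// Same classification as in the diff: errors that mean "GSO is not usable here".
fn gso_unsupported(e: &io::Error) -> bool {
    matches!(
        e.raw_os_error(),
        Some(ENOPROTOOPT | EOPNOTSUPP | EINVAL | EIO | EMSGSIZE)
    )
}

/// Hypothetical wrapper: try the GSO path unless the latch is already set.
/// On a "GSO unsupported" error, set the latch and finish via the plain path.
/// Any other GSO result (success or unrelated error) is returned unchanged.
fn send_with_fallback<G, P>(
    latch: &AtomicBool,
    gso_send: G,
    plain_send: P,
) -> io::Result<usize>
where
    G: FnOnce() -> io::Result<usize>,
    P: FnOnce() -> io::Result<usize>,
{
    if !latch.load(Ordering::Relaxed) {
        match gso_send() {
            Err(e) if gso_unsupported(&e) => {
                // Latch GSO off so later sends skip straight to the fallback.
                latch.store(true, Ordering::Relaxed);
            }
            other => return other,
        }
    }
    plain_send()
}
```

In this sketch the latch is passed in by reference so the behaviour is easy to exercise in isolation; a process-wide `static AtomicBool` would give the "latched off for the process" behaviour described in the commit message.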