docs(roadmap): §11 1 Gbps+ data plane — foundation landed, batched send next
ci / rust (push) Has been cancelled
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -224,3 +224,42 @@ build the moment a producer lands.
zero-copy import (`GL_RGB10_A2`/float dest for RGB10, or P010 straight through the Vulkan→CUDA
path); `hevc_nvenc -profile main10` + color/SEI metadata; opt-in Hello/Welcome + C ABI; Apple
VideoToolbox Main10 decode + `wantsExtendedDynamicRangeContent` EDR present + SDR fallback.

## 11. 1 Gbps+ data plane *(foundation landed — the real work is batched/paced send)*
Support 1 Gbps+ video bitrate end to end. This is **the whole point of the GF(2¹⁶) Leopard FEC**,
which breaks the ~1 Gbps wall of the GF(2⁸)/Moonlight scheme. A 6-way subagent investigation
(2026-06-11) mapped every ceiling.

**Verdict: ~halfway, and what remains is mostly clamps + ONE real piece of work.** Already
1 Gbps-ready and untouched: the integer/type path (u32 kbps → u64 → int64_t, no truncation); FEC
(a 1 Gbps frame is only ~434–874 data shards, i.e. a single GF(2¹⁶) block, roughly two orders of
magnitude under the 65535 ceiling); AES-GCM (RustCrypto auto-detects AES-NI, ~10–25× headroom on
x86_64); the u64 sequence/nonce space; and the **M1 `ReassemblerLimits`**, which are fully
*derived* from the negotiated `FecConfig` and so already admit every legitimate high-bitrate frame
with nothing to relax. Security invariant to keep: every allocation size must trace to a
host-negotiated parameter clamped to a scheme ceiling. Scale via the negotiated params
(`max_data_per_block`, `shard_payload`), never by widening a bound by hand.

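The security invariant above can be illustrated with a minimal sketch. The names `NegotiatedFec`, `DerivedLimits` and `derive_limits`, and the payload ceiling of 9000 bytes, are hypothetical and chosen for illustration; only the 65535 shard ceiling and the parameter names `max_data_per_block` and `shard_payload` come from the text. The real `ReassemblerLimits` derivation may look different.

```rust
/// GF(2^16) shard ceiling quoted in the roadmap text.
const MAX_SHARDS_PER_BLOCK: u32 = 65_535;
/// Hypothetical scheme ceiling for a shard payload (illustrative value only).
const MAX_SHARD_PAYLOAD: u32 = 9_000;

/// Hypothetical stand-in for the negotiated FEC parameters.
#[derive(Debug, Clone, Copy)]
struct NegotiatedFec {
    max_data_per_block: u32,
    shard_payload: u32,
}

/// Hypothetical stand-in for limits derived from the negotiation.
#[derive(Debug, PartialEq)]
struct DerivedLimits {
    max_block_bytes: u64,
}

/// Every size is computed from a negotiated value that is first clamped
/// to a scheme ceiling; no bound is a free-standing hand-tuned constant.
fn derive_limits(n: NegotiatedFec) -> DerivedLimits {
    let shards = n.max_data_per_block.clamp(1, MAX_SHARDS_PER_BLOCK);
    let payload = n.shard_payload.clamp(1, MAX_SHARD_PAYLOAD);
    DerivedLimits {
        // Widen to u64 before multiplying so the product cannot overflow.
        max_block_bytes: u64::from(shards) * u64::from(payload),
    }
}
```

With the values mentioned in the roadmap (`max_data_per_block` 4096, `shard_payload` 1200) this yields a block bound of 4,915,200 bytes, and a peer that negotiates absurd values is still capped by the ceilings rather than by a widened constant.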
- **Done & live (`b8a33e2`): make 1 Gbps configurable and its failure mode observable.** Raised the
  clamps (`MAX_BITRATE_KBPS` 500 Mbps → 2 Gbps; `MAX_PROBE_KBPS` 1 → 3 Gbps so the probe can show
  headroom above the session cap); raised `TARGET_SOCKBUF` 8 → 32 MB (+ matching
  `99-punktfunk-net.conf`) so a multi-MB IDR burst doesn't fill the buffer; and surfaced the
  previously silent WouldBlock send-buffer drop: `Transport::send` now returns `Result<bool>`, plus
  a new `packets_send_dropped` stat (Stats + C ABI `PunktfunkStats`), a `PUNKTFUNK_PERF`
  wire-Mbps/drop dump in `virtual_stream`, and the probe completion log. Loopback-verified that the
  clamp no longer truncates a 1.2 Gbps probe.
- **The real bottleneck (next):** the native data plane is single-threaded and issues one `send()`
  syscall per packet. At ~125k pkt/s (1 Gbps wire) it burns a core on syscalls and mass-drops
  keyframe bursts. The fix is a **port, not an invention**: lift the GameStream path's proven
  `sendmmsg_all` (64/call) + paced `spawn_sender` into the core `Transport` seam
  (`send_batch(&[&[u8]])`, Linux `sendmmsg`, scalar default), move FEC+seal+send onto a dedicated
  paced send thread, and mirror with `recvmmsg` + a reused buffer ring on the client (kills the
  per-recv alloc and the 300 µs-sleep underdrain). Up to ~64× fewer syscalls.
- **Then refine as profiling shows:** add a FEC throughput bench to `loss-harness`; reuse the
  reed-solomon engine in `Gf16Coder`; lower `max_data_per_block` 4096 → 256–1024 (bounds the
  burst-drop blast radius + enables per-block FEC parallelism); seal in place via `AeadInPlace`;
  bump `shard_payload` 1200 → ~1452 (or jumbo after a path-MTU probe) for ~17% (or ~6×) fewer
  packets.
- **DoS hygiene (last):** derive the one hardcoded reassembler field (`max_frame_bytes` = 64 MiB,
  never set by `session_config`) from the negotiated mode/bitrate. This strictly *tightens* the
  surface.
- **Validate with the speed-test probe** (it reuses the real `submit_frame`→FEC+crypto+send path):
  run `punktfunk-client-rs --speed-test KBPS:MS` on a RELEASE build (debug is CPU-bound at
  ~30 Mbps) and watch `packets_send_dropped`. Open questions: NVENC CBR rate-tracking at
  0.5–1 Gbps (no explicit `rc_buffer_size`); LAN/QEMU-NIC jumbo/GSO support; any `web/` bitrate
  slider hardcoding 500 Mbps.
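
The `send_batch` seam with a scalar default, together with the counted WouldBlock drop, could be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the trait shape, the `MockTransport`, the `send_all` helper, and the reading of `Result<bool>` as "`true` = sent, `false` = dropped" are all hypothetical. A Linux implementation would override `send_batch` with a single `sendmmsg` call per chunk; the default here just loops.

```rust
use std::io;

/// Batch size used by the GameStream path according to the roadmap (64 per call).
const BATCH: usize = 64;

/// Hypothetical stand-in for the core transport seam.
trait Transport {
    /// Assumed meaning: Ok(true) = sent, Ok(false) = dropped (send buffer full).
    fn send(&mut self, packet: &[u8]) -> io::Result<bool>;

    /// Scalar default: one `send` per packet. Returns how many were dropped.
    /// A Linux backend would override this with one `sendmmsg` syscall.
    fn send_batch(&mut self, packets: &[&[u8]]) -> io::Result<usize> {
        let mut dropped = 0;
        for &p in packets {
            if !self.send(p)? {
                dropped += 1;
            }
        }
        Ok(dropped)
    }
}

/// Turn a raw socket result into "sent / dropped / hard error".
fn classify(res: io::Result<usize>) -> io::Result<bool> {
    match res {
        Ok(_) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(false),
        Err(e) => Err(e),
    }
}

/// In-memory transport for the demo: accepts `capacity` packets,
/// then behaves like a full non-blocking socket buffer.
struct MockTransport {
    capacity: usize,
    sent: usize,
    calls: usize,
}

impl Transport for MockTransport {
    fn send(&mut self, packet: &[u8]) -> io::Result<bool> {
        self.calls += 1;
        let raw = if self.sent < self.capacity {
            self.sent += 1;
            Ok(packet.len())
        } else {
            Err(io::Error::from(io::ErrorKind::WouldBlock))
        };
        classify(raw)
    }
}

/// Feed a frame's packets to the transport in chunks of `BATCH`,
/// accumulating the drop count for a stat like `packets_send_dropped`.
fn send_all<T: Transport>(t: &mut T, packets: &[&[u8]]) -> io::Result<usize> {
    let mut dropped = 0;
    for chunk in packets.chunks(BATCH) {
        dropped += t.send_batch(chunk)?;
    }
    Ok(dropped)
}
```

Keeping the scalar loop as the default method means non-Linux targets keep working unchanged, while the drop count still flows back to the caller on every platform.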
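
The packet-rate figures in the list can be sanity-checked with a little arithmetic. The helpers below are hypothetical and ignore header, FEC parity and AEAD overhead, so they give a lower bound rather than the project's exact numbers. Under that simplification, the roadmap's ~125k pkt/s corresponds to roughly 1000 bytes of payload per packet, while a full 1200-byte `shard_payload` gives about 104k pkt/s.

```rust
/// Ceiling division for unsigned integers.
fn div_ceil(a: u64, b: u64) -> u64 {
    (a + b - 1) / b
}

/// Packets per second needed to carry `bitrate_bps` of payload
/// when each packet holds `payload_bytes` (overheads ignored).
fn packets_per_sec(bitrate_bps: u64, payload_bytes: u64) -> u64 {
    div_ceil(bitrate_bps / 8, payload_bytes)
}

/// Send syscalls per second when `batch` packets go out per call.
fn syscalls_per_sec(pps: u64, batch: u64) -> u64 {
    div_ceil(pps, batch)
}

/// Percentage of packets saved by moving from `old` to `new` packets per second.
fn percent_saved(old: u64, new: u64) -> f64 {
    100.0 * (old - new) as f64 / old as f64
}
```

For 1 Gbps, `packets_per_sec` gives 104,167 at 1200 bytes and 86,089 at 1452 bytes, a saving of a little over 17%, and batching 64 packets per call brings 104,167 sends per second down to 1,628 syscalls per second.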