punktfunk

Author	SHA1	Message	Date
enricobuehler	5b0d84acd0	feat: M3 — lumen/1 native streaming: real video at client mode + input over QUIC datagrams The native protocol now does the real thing, end to end: - Hello carries the client's requested mode; the host creates a NATIVE virtual output at exactly that size/refresh (same vdisplay backends as the GameStream path) and streams NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated). - Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission spikes — decoded into lumen_core InputEvents and fed to the session's input injector. - Frames are stamped with the capture wall clock; the reference client computes per-frame capture→reassembled latency percentiles and writes a playable .h265. - m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS, --out, --input-test (scripted mouse/keyboard datagrams). VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC; pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164 X events observed inside the nested session. Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec emits whole AUs only) — the next big latency lever. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-10 06:56:47 +00:00
enricobuehler	de3123038f	feat: M3 seed — the lumen/1 native protocol: QUIC control plane + reference client (Phase 5) The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer. - lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream), shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers (self-signed server; accepts-any client — pinning lands with the trust model) and framed async IO. Round-trip unit-tested. - lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread (no async on the frame path — design invariant) streams deterministic 64KB test frames through the hardened M1 Session over UdpTransport. - lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame. VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on this exact client skeleton. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 23:33:40 +00:00
enricobuehler	72f8c05aa3	feat: M2 P1.5 (FEC) — nanors-exact Reed-Solomon recovery for the video stream Moonlight now reconstructs lost video shards from our parity (verified live: under induced packet loss the picture recovers cleanly instead of failing with "network connection too bad"; 0% added loss in normal operation). The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix (M[j][i] = inv[(m+i)^j], GF(2^8) poly 0x1d), while reed-solomon-erasure is Vandermonde — so its parity was NOT Moonlight-decodable, despite the old gf8.rs comment claiming equivalence. lumen-core: - Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs (vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure Rust, no FFI — keeps the "one core" hot path. This makes both lumen's own protocol and the GameStream parity nanors-compatible. - Lock it with a regression test against real nanors vectors (k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay green, so lumen's own protocol is unaffected. lumen-host video.rs: - Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp fecInfo with the recomputed wire pct (100m/k) so the client derives the same count; cap per-block data to 255*100/(100+pct) so k+m <= 255. - CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce (streamPacketIndex/frameIndex/flags/multiFec) are written into data shards BEFORE encode, and only the transport fields (RTP header/seq/timestamp + fecInfo) are stamped AFTER — leaving the flags byte RS-covered. Matches Sunshine stream.cpp. Unit-tested incl. flags recovery. - fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 11:34:27 +00:00
enricobuehler	a913042367	feat: M1 lumen-core (FEC/crypto/packet/session + C ABI) and workspace scaffold Ground-up low-latency streaming stack per docs/implementation-plan.md. M1 is complete and tested; Linux host backends are cfg-gated stubs to be filled in on real hardware (M0/M2). lumen-core (built + tested on macOS/aarch64 — 21 tests): - fec: ErasureCoder over GF(2^8) (reed-solomon-erasure, Moonlight-compatible) and GF(2^16) Leopard-RS (reed-solomon-simd, the >1 Gbps wall-breaker); proptested - packet: zero-copy #[repr(C)] framing, multi-block, FEC-aware reassembly - crypto: AES-128-GCM with per-direction nonce salts + sequence-as-AAD - session: host submit / client poll hot paths + input; loopback & UDP transports - abi: opaque handles, versioned LumenConfig, panic guards; cbindgen-generated header - acceptance: Rust loopback+proptest and a C harness that links the staticlib Scaffold (compiles green on all platforms): lumen-host (vdisplay/capture/encode/inject/web/pipeline seams under cfg(linux)), lumen-client-rs, tools/{loss-harness, latency-probe}, Apple/Android client stubs, Gitea CI, docs. Hardened against a multi-agent adversarial review (13 verified findings fixed, regression-tested): reassembler memory-DoS bounds + block-consistency validation, GCM nonce-reuse direction separation, ABI struct_size guard + range checks, FEC shard-length guards, shard_payload datagram bound, key zeroization + Debug redaction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 00:02:52 +02:00
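
The FEC block sizing described in commit 72f8c05aa3 (m = ceil(k*pct/100) parity shards, a recomputed wire percentage of 100m/k, and a per-block data cap of 255*100/(100+pct) so that k+m <= 255) can be sketched as below. This is a minimal illustration under stated assumptions, not the code in lumen-host video.rs: the function names are hypothetical, integer rounding is assumed to be floor except for the explicit ceiling, and k is assumed to be greater than zero.

```rust
/// Parity shards for a block of `k` data shards at `pct` percent:
/// m = ceil(k * pct / 100), computed with integer arithmetic.
/// (Hypothetical helper name; not taken from the repository.)
fn parity_shards(k: u32, pct: u32) -> u32 {
    (k * pct + 99) / 100
}

/// Percentage recomputed from the actual shard counts: floor(100 * m / k).
/// Assumes k > 0. A receiver that applies the same ceiling formula to this
/// value arrives at the same parity count m.
fn wire_pct(k: u32, m: u32) -> u32 {
    100 * m / k
}

/// Largest data-shard count per block such that k + m <= 255, the shard
/// limit of a GF(2^8) code: floor(255 * 100 / (100 + pct)).
fn max_data_shards(pct: u32) -> u32 {
    255 * 100 / (100 + pct)
}
```

With the 20% default mentioned in the commit, these formulas give a cap of 212 data shards per block and 43 parity shards for a full block, which is exactly 255 shards in total.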

4 Commits