Commit Graph

4 Commits

Author SHA1 Message Date
enricobuehler 5b0d84acd0 feat: M3 — lumen/1 native streaming: real video at client mode + input over QUIC datagrams
The native protocol now does the real thing, end to end:

- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
  exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
  NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
  spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
  capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
  --out, --input-test (scripted mouse/keyboard datagrams).

VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.

Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 06:56:47 +00:00
enricobuehler de3123038f feat: M3 seed — the lumen/1 native protocol: QUIC control plane + reference client (Phase 5)
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.

- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
  Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
  data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
  shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
  (self-signed server; accepts-any client — pinning lands with the trust model) and framed
  async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
  (no async on the frame path — design invariant) streams deterministic 64KB test frames
  through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
  up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.

VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:33:40 +00:00
enricobuehler 72f8c05aa3 feat: M2 P1.5 (FEC) — nanors-exact Reed-Solomon recovery for the video stream
Moonlight now reconstructs lost video shards from our parity (verified live:
under induced packet loss the picture recovers cleanly instead of failing with
"network connection too bad"; 0% added loss in normal operation).

The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix
(M[j][i] = inv[(m+i)^j], GF(2^8) poly 0x1d), while reed-solomon-erasure is
Vandermonde — so its parity was NOT Moonlight-decodable, despite the old
gf8.rs comment claiming equivalence.

lumen-core:
- Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs
  (vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure
  Rust, no FFI — keeps the "one core" hot path. This makes both lumen's own
  protocol and the GameStream parity nanors-compatible.
- Lock it with a regression test against real nanors vectors
  (k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived
  cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay
  green, so lumen's own protocol is unaffected.

lumen-host video.rs:
- Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp
  fecInfo with the recomputed wire pct (100*m/k) so the client derives the same
  count; cap per-block data to 255*100/(100+pct) so k+m <= 255.
- CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight
  decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs
  on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce
  (streamPacketIndex/frameIndex/flags/multiFec*) are written into data shards
  BEFORE encode, and only the transport fields (RTP header/seq/timestamp +
  fecInfo) are stamped AFTER — leaving the flags byte RS-covered. Matches
  Sunshine stream.cpp. Unit-tested incl. flags recovery.
- fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT
  override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 11:34:27 +00:00
enricobuehler a913042367 feat: M1 lumen-core (FEC/crypto/packet/session + C ABI) and workspace scaffold
Ground-up low-latency streaming stack per docs/implementation-plan.md. M1 is
complete and tested; Linux host backends are cfg-gated stubs to be filled in on
real hardware (M0/M2).

lumen-core (built + tested on macOS/aarch64 — 21 tests):
- fec: ErasureCoder over GF(2^8) (reed-solomon-erasure, Moonlight-compatible)
  and GF(2^16) Leopard-RS (reed-solomon-simd, the >1 Gbps wall-breaker); proptested
- packet: zero-copy #[repr(C)] framing, multi-block, FEC-aware reassembly
- crypto: AES-128-GCM with per-direction nonce salts + sequence-as-AAD
- session: host submit / client poll hot paths + input; loopback & UDP transports
- abi: opaque handles, versioned LumenConfig, panic guards; cbindgen-generated header
- acceptance: Rust loopback+proptest and a C harness that links the staticlib

Scaffold (compiles green on all platforms): lumen-host (vdisplay/capture/encode/
inject/web/pipeline seams under cfg(linux)), lumen-client-rs, tools/{loss-harness,
latency-probe}, Apple/Android client stubs, Gitea CI, docs.

Hardened against a multi-agent adversarial review (13 verified findings fixed,
regression-tested): reassembler memory-DoS bounds + block-consistency validation,
GCM nonce-reuse direction separation, ABI struct_size guard + range checks, FEC
shard-length guards, shard_payload datagram bound, key zeroization + Debug redaction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 00:02:52 +02:00