feat: M2 P1.5 (FEC) — nanors-exact Reed-Solomon recovery for the video stream

Moonlight now reconstructs lost video shards from our parity (verified live: under induced packet loss the picture recovers cleanly instead of failing with "network connection too bad"; 0% added loss in normal operation).

The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix (M[j][i] = inv[(m+i)^j], GF(2^8) poly 0x1d), while reed-solomon-erasure is Vandermonde, so its parity was NOT Moonlight-decodable, despite the old gf8.rs comment claiming equivalence.

lumen-core:
- Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs (vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure Rust, no FFI, which keeps the "one core" hot path. This makes both lumen's own protocol and the GameStream parity nanors-compatible.
- Lock it with a regression test against real nanors vectors (k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay green, so lumen's own protocol is unaffected.

lumen-host video.rs:
- Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp fecInfo with the recomputed wire pct (100*m/k) so the client derives the same count; cap per-block data to 255*100/(100+pct) so k+m <= 255.
- CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce (streamPacketIndex/frameIndex/flags/multiFec*) are written into data shards BEFORE encode, and only the transport fields (RTP header/seq/timestamp + fecInfo) are stamped AFTER, leaving the flags byte RS-covered. Matches Sunshine stream.cpp. Unit-tested incl. flags recovery.
- fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
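The arithmetic quoted in the message can be checked by hand with a small standalone sketch. It is not the vendored fec-rs code or the lumen regression test: every function name below is hypothetical, the inverse is found by brute force purely for readability, and `^` in the matrix formula is read as XOR. The sizing check also assumes the client applies the same ceil rule to the wire percentage, which the message implies but does not spell out.

```rust
/// Multiply in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11d, low byte 0x1d).
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1d;
        }
        b >>= 1;
    }
    acc
}

/// Brute-force inverse: fine for a check, far too slow for a hot path.
fn gf_inv(a: u8) -> u8 {
    assert!(a != 0, "zero has no inverse");
    (1..=255u8).find(|&x| gf_mul(a, x) == 1).unwrap()
}

/// Parity byte j is the XOR over i of inv[(m + i) ^ j] * data[i].
fn cauchy_parity(data: &[u8], m: usize) -> Vec<u8> {
    (0..m)
        .map(|j| {
            data.iter().enumerate().fold(0u8, |acc, (i, &d)| {
                acc ^ gf_mul(gf_inv(((m + i) ^ j) as u8), d)
            })
        })
        .collect()
}

/// m = ceil(k * pct / 100) parity shards for k data shards.
fn parity_shards(k: u32, pct: u32) -> u32 {
    (k * pct + 99) / 100
}

/// Percentage recomputed from the actual shard counts (integer division).
fn wire_pct(k: u32, m: u32) -> u32 {
    100 * m / k
}

/// Largest k for which k + m still fits in 255 shards.
fn max_data_shards(pct: u32) -> u32 {
    255 * 100 / (100 + pct)
}

fn main() {
    // The nanors vector quoted in the commit message.
    assert_eq!(cauchy_parity(&[10, 20, 30, 40], 2), vec![136, 0]);

    // Sizing at the default 20%: the cap holds and the wire pct round-trips.
    let pct = 20;
    let k_max = max_data_shards(pct);
    for k in 1..=k_max {
        let m = parity_shards(k, pct);
        assert!(k + m <= 255);
        assert_eq!(parity_shards(k, wire_pct(k, m)), m);
    }
    println!("k_max = {}, m = {}", k_max, parity_shards(k_max, pct));
}
```

At 20% the cap works out to k = 212 and m = 43, which exactly fills the 255-shard limit. For small blocks the wire percentage can differ from the configured one (k = 4 gives m = 1 and a wire value of 25) while still yielding the same parity count.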
2026-06-09 11:34:27 +00:00
parent 278a6330de
commit 72f8c05aa3
14 changed files with 2921 additions and 212 deletions
@@ -0,0 +1,73 @@
+# fec-rs
+
+[![CI](https://github.com/hgaiser/fec-rs/workflows/CI/badge.svg)](https://github.com/hgaiser/fec-rs/actions)
+[![Crates.io](https://img.shields.io/crates/v/fec-rs.svg)](https://crates.io/crates/fec-rs)
+[![Documentation](https://docs.rs/fec-rs/badge.svg)](https://docs.rs/fec-rs)
+
+A pure Rust Reed-Solomon erasure coding library with runtime SIMD acceleration.
+
+## Features
+
+- **Pure Rust** — No C/C++ dependencies or FFI. Everything is implemented in safe Rust
+  (with targeted `unsafe` for SIMD intrinsics).
+- **Runtime SIMD detection** — Automatically uses the fastest available instruction set
+  via `std::is_x86_feature_detected!`. A single binary works on all x86_64 systems.
+- **GF(2^8)** — Operates over the Galois field GF(2^8) with generating polynomial 29 (0x1D),
+  compatible with the Moonlight streaming protocol.
+- **Shard-by-shard encoding** — Incremental encoding via `ShardByShard` for streaming use cases.
+- **Reconstruction** — Reconstruct missing data and/or parity shards from any sufficient subset.
+
+## SIMD Acceleration
+
+On x86_64, the library automatically detects CPU features at runtime and uses
+the best available instruction set:
+
+- **GFNI + AVX2** — Single-instruction GF multiply on 32 bytes (Intel Alder Lake+, AMD Zen 4+)
+- **AVX2** — VPSHUFB split-table nibble lookup on 32 bytes
+- **GFNI + SSE** — Single-instruction GF multiply on 16 bytes
+- **SSSE3** — PSHUFB split-table nibble lookup on 16 bytes
+- **Scalar** — Lookup table fallback
+
+## Parallel Encoding
+
+Enable the `parallel` feature for optional rayon-based parallel encoding:
+
+```toml
+fec-rs = { version = "0.1", features = ["parallel"] }
+```
+
+When enabled, large encode workloads automatically distribute parity shard
+computation across threads. Small workloads use the sequential path to avoid
+overhead.
+
+## Usage
+
+```rust
+use fec_rs::ReedSolomon;
+
+let rs = ReedSolomon::new(4, 2).unwrap();
+
+let mut shards: Vec<Vec<u8>> = vec![
+    vec![0, 1, 2, 3],
+    vec![4, 5, 6, 7],
+    vec![8, 9, 10, 11],
+    vec![12, 13, 14, 15],
+    vec![0, 0, 0, 0], // parity shard 1
+    vec![0, 0, 0, 0], // parity shard 2
+];
+
+// Encode parity
+rs.encode(&mut shards).unwrap();
+
+// Verify
+assert!(rs.verify(&shards).unwrap());
+
+// Simulate loss of shard 0
+let mut recovery: Vec<Option<Vec<u8>>> = shards.into_iter().map(Some).collect();
+recovery[0] = None;
+
+// Reconstruct
+rs.reconstruct(&mut recovery).unwrap();
+```
+
+License: BSD-2-Clause
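The "split-table nibble lookup" named in the README's SIMD Acceleration list can be illustrated in scalar form. The sketch below is not taken from fec-rs; the function names are hypothetical and it only shows the idea the shuffle instructions vectorize. Because GF(2^8) multiplication is linear over XOR, a product `c * x` splits into `c * (x & 0x0f)` XOR `c * (x & 0xf0)`, so two 16-entry tables per coefficient are enough.

```rust
/// Multiply in GF(2^8), reducing by x^8 + x^4 + x^3 + x^2 + 1 (0x11d, low byte 0x1d).
fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1d;
        }
        b >>= 1;
    }
    acc
}

/// Build the low-nibble and high-nibble product tables for a fixed coefficient.
fn split_tables(c: u8) -> ([u8; 16], [u8; 16]) {
    let mut lo = [0u8; 16];
    let mut hi = [0u8; 16];
    for n in 0..16u8 {
        lo[n as usize] = gf_mul(c, n);
        hi[n as usize] = gf_mul(c, n << 4);
    }
    (lo, hi)
}

/// dst[i] ^= c * src[i], using one lookup per nibble instead of a full multiply.
/// A byte-shuffle instruction performs 16 or 32 of these lookups at once.
fn mul_add_slice(c: u8, src: &[u8], dst: &mut [u8]) {
    let (lo, hi) = split_tables(c);
    for (d, &s) in dst.iter_mut().zip(src) {
        *d ^= lo[(s & 0x0f) as usize] ^ hi[(s >> 4) as usize];
    }
}

fn main() {
    // Exhaustively compare the table path with the bitwise multiply.
    for c in 0..=255u8 {
        for x in 0..=255u8 {
            let mut out = [0u8];
            mul_add_slice(c, &[x], &mut out);
            assert_eq!(out[0], gf_mul(c, x));
        }
    }
    println!("split-table lookup matches bitwise multiply");
}
```

The tables are rebuilt on every call here for brevity; a real encoder would presumably precompute them once per matrix coefficient.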