The shared-core architecture pays off: platform clients now link ONE Rust library that
implements the entire lumen/1 protocol, and add only decode/present/input on top.
lumen-core:
- client.rs (quic feature): NativeClient — QUIC handshake + UDP data plane + input
datagrams on internal threads; embedder surface = connect / next_frame / send_input.
- abi.rs: lumen_connect / lumen_connection_next_au (borrow-until-next-call, matching
lumen_client_poll_frame semantics) / lumen_connection_send_input / lumen_connection_mode /
lumen_connection_close. Guarded in the generated header by LUMEN_FEATURE_QUIC (cbindgen
[defines] mapping), so the checked-in header is stable across feature sets.
- error.rs: append-only LumenStatus additions Timeout (-9) and Closed (-10).
- TESTED end-to-end through the C ABI: in-process lumen/1 host, lumen_connect pulls 25
byte-verified frames, sends input, closes (m3.rs::c_abi_connection_roundtrip).
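A minimal sketch of the borrow-until-next-call contract described above, written in safe Rust rather than against the real C ABI. `Status`, `Conn`, and `next_au` are hypothetical stand-ins for illustration; only the numeric values -9 and -10 come from this commit, and mapping an empty queue to Timeout is an assumption.

```rust
use std::collections::VecDeque;

/// Hypothetical mirror of the two appended status codes (values from the commit text).
#[repr(i32)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Status {
    Timeout = -9,
    Closed = -10,
}

/// Stand-in connection that owns the buffer of the most recent access unit.
pub struct Conn {
    pending: VecDeque<Vec<u8>>,
    current: Vec<u8>,
    closed: bool,
}

impl Conn {
    pub fn new(frames: Vec<Vec<u8>>) -> Self {
        Conn { pending: frames.into_iter().collect(), current: Vec::new(), closed: false }
    }

    pub fn close(&mut self) {
        self.closed = true;
    }

    /// Returns a slice borrowed from the connection. The `&mut self` receiver
    /// means the slice cannot outlive the next call, which is the same rule the
    /// C caller has to follow by convention.
    pub fn next_au(&mut self) -> Result<&[u8], Status> {
        if self.closed {
            return Err(Status::Closed);
        }
        match self.pending.pop_front() {
            Some(frame) => {
                self.current = frame; // previous buffer is dropped here
                Ok(self.current.as_slice())
            }
            // Assumption: an empty queue is reported as a timeout.
            None => Err(Status::Timeout),
        }
    }
}
```

On the C side the same rule is only a convention: the pointer handed out by one call must be treated as invalid once the next call is made.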
Apple client (clients/apple — SCAFFOLD, written on Linux, first Xcode build pending):
- scripts/build-xcframework.sh: cargo per Apple target → universal staticlib + header
(LUMEN_FEATURE_QUIC pre-defined) + modulemap → LumenCore.xcframework.
- Package.swift (LumenKit) + Swift sources: LumenConnection (ABI wrapper), AnnexB
(in-band VPS/SPS/PPS → CMVideoFormatDescription, Annex-B → AVCC CMSampleBuffers with
DisplayImmediately), StreamView (SwiftUI over AVSampleBufferDisplayLayer — stage-1
presenter that hardware-decodes compressed HEVC itself), InputCapture (GCMouse raw
deltas + GCKeyboard HID→VK).
- README.md is the full handoff for the next (Mac-side) agent: build steps, ABI contract,
first-light test recipe against the Linux host, stage-2 (VT+Metal pacing) plan, and the
known host-side gaps (single-session m3-host, no lumen/1 audio yet, gamepad kinds not
yet routed in m3's injector, seed-stage trust).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The native protocol now runs the full pipeline end to end, from mode negotiation to input injection:
- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
--out, --input-test (scripted mouse/keyboard datagrams).
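As an illustration of the --mode WxHxFPS argument, a minimal parsing sketch follows. `Mode` and `parse_mode` are hypothetical names, and rejecting zero values is an assumption; the real client's parser and the Hello field layout are not shown in this commit.

```rust
/// Hypothetical requested display mode, as it might be carried in Hello.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub fps: u32,
}

/// Parses "WxHxFPS" (for example "1280x720x120"). Returns None on any
/// malformed, missing, extra, or zero component.
pub fn parse_mode(s: &str) -> Option<Mode> {
    let mut parts = s.split('x');
    let width: u32 = parts.next()?.parse().ok()?;
    let height: u32 = parts.next()?.parse().ok()?;
    let fps: u32 = parts.next()?.parse().ok()?;
    if parts.next().is_some() || width == 0 || height == 0 || fps == 0 {
        return None;
    }
    Some(Mode { width: width, height: height, fps: fps })
}
```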
VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.
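The p50/p95/p99 figures above are percentiles over per-frame latency samples. A minimal sketch of one way to compute them, assuming the nearest-rank definition and samples in microseconds; `percentile` and `summarize` are hypothetical names, and the reference client may use a different method.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// `p` is a whole percentage in 0..=100.
pub fn percentile(sorted: &[u64], p: usize) -> Option<u64> {
    if sorted.is_empty() || p > 100 {
        return None;
    }
    // rank = ceil(p * n / 100), clamped to at least 1, then converted to an index.
    let rank = (p * sorted.len() + 99) / 100;
    let idx = if rank == 0 { 0 } else { rank - 1 };
    Some(sorted[idx])
}

/// Sorts the samples in place and returns (p50, p95, p99).
pub fn summarize(samples_us: &mut [u64]) -> Option<(u64, u64, u64)> {
    samples_us.sort_unstable();
    Some((
        percentile(samples_us, 50)?,
        percentile(samples_us, 95)?,
        percentile(samples_us, 99)?,
    ))
}
```

Each sample would be the reassembly time minus the capture stamp carried in the frame; that subtraction is only meaningful here because host and client share one clock.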
Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.
- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
(self-signed server; accepts-any client — pinning lands with the trust model) and framed
async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
(no async on the frame path — design invariant) streams deterministic 64KB test frames
through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.
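A minimal sketch of length-prefixed little-endian framing of the kind the handshake bullet describes. The 4-byte prefix width, the `MAX_FRAME` cap, and the function names are assumptions for illustration; quic.rs uses async IO over a quinn stream, while this sketch uses blocking std::io traits to stay self-contained.

```rust
use std::io::{self, Read, Write};

/// Hypothetical upper bound so a corrupt prefix cannot trigger a huge allocation.
const MAX_FRAME: u32 = 64 * 1024;

/// Writes one frame: u32 little-endian length, then the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    if payload.len() > MAX_FRAME as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    let len = payload.len() as u32;
    w.write_all(&len.to_le_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame`.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut hdr = [0u8; 4];
    r.read_exact(&mut hdr)?;
    let len = u32::from_le_bytes(hdr);
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut buf = vec![0u8; len as usize];
    r.read_exact(&mut buf)?;
    Ok(buf)
}
```

Hello, Welcome, and Start would each be serialized into one such payload; their field layouts are not part of this sketch.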
VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Moonlight now reconstructs lost video shards from our parity (verified live:
under induced packet loss the picture recovers cleanly instead of failing with
"network connection too bad"; 0% added loss in normal operation).
The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix
(M[j][i] = inv[(m+i) XOR j] over GF(2^8), poly 0x1d, i.e. 0x11d with the x^8
term), while reed-solomon-erasure uses a Vandermonde matrix. Parity produced
by the old backend was therefore NOT Moonlight-decodable, despite the old
gf8.rs comment claiming equivalence.
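The matrix formula can be checked by hand against the k=4, m=2 vector used in the regression test. The following is a minimal from-scratch sketch, not the vendored fec-rs code; `gf_mul`, `gf_inv`, and `cauchy_parity` are hypothetical names, the inverse is found by brute force for clarity, and each shard is a single byte.

```rust
/// Multiplies in GF(2^8) with reduction polynomial 0x11d (low byte 0x1d).
pub fn gf_mul(mut a: u8, mut b: u8) -> u8 {
    let mut acc = 0u8;
    while b != 0 {
        if b & 1 != 0 {
            acc ^= a;
        }
        let carry = a & 0x80 != 0;
        a <<= 1;
        if carry {
            a ^= 0x1d; // reduce after the x^8 bit shifts out
        }
        b >>= 1;
    }
    acc
}

/// Multiplicative inverse by exhaustive search; `a` must be nonzero.
pub fn gf_inv(a: u8) -> u8 {
    (1..=255u8)
        .find(|&x| gf_mul(a, x) == 1)
        .expect("zero has no inverse")
}

/// Parity byte j is the sum over i of inv[(m + i) XOR j] * data[i].
/// Requires data.len() + m <= 256 so the index fits in a byte; (m + i) XOR j
/// is never zero because j < m <= m + i.
pub fn cauchy_parity(data: &[u8], m: usize) -> Vec<u8> {
    (0..m)
        .map(|j| {
            data.iter().enumerate().fold(0u8, |acc, (i, &d)| {
                acc ^ gf_mul(gf_inv(((m + i) ^ j) as u8), d)
            })
        })
        .collect()
}
```

With these definitions, `cauchy_parity(&[10, 20, 30, 40], 2)` evaluates to `[136, 0]`, the same parity quoted for the nanors vector.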
lumen-core:
- Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs
(vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure
Rust, no FFI — keeps the "one core" hot path. This makes both lumen's own
protocol and the GameStream parity nanors-compatible.
- Lock it with a regression test against real nanors vectors
(k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived
cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay
green, so lumen's own protocol is unaffected.
lumen-host video.rs:
- Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp
fecInfo with the recomputed wire pct (100*m/k) so the client derives the same
count; cap per-block data to 255*100/(100+pct) so k+m <= 255.
- CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight
decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs
on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce
(streamPacketIndex/frameIndex/flags/multiFec*) are written into data shards
BEFORE encode, and only the transport fields (RTP header/seq/timestamp +
fecInfo) are stamped AFTER — leaving the flags byte RS-covered. Matches
Sunshine stream.cpp. Unit-tested incl. flags recovery.
- fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT
override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery.
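The shard-count arithmetic in the first video.rs bullet can be sketched in integer math as follows. The function names are hypothetical, and the check that the client derives its count as ceil(k*pct/100) from the wire percentage is an assumption about Moonlight's behaviour, not something stated in this commit.

```rust
/// Total shards per FEC block must fit a GF(2^8) code.
pub const MAX_SHARDS: usize = 255;

/// m = ceil(k * pct / 100), in integer arithmetic.
pub fn parity_count(k: usize, pct: usize) -> usize {
    (k * pct + 99) / 100
}

/// Percentage stamped into fecInfo: 100 * m / k, rounded down.
pub fn wire_pct(k: usize, m: usize) -> usize {
    if k == 0 { 0 } else { 100 * m / k }
}

/// Largest k such that k + parity_count(k, pct) <= MAX_SHARDS.
pub fn max_data_shards(pct: usize) -> usize {
    MAX_SHARDS * 100 / (100 + pct)
}
```

At the default 20 percent this gives at most 212 data shards with 43 parity shards, 255 in total. For a small block such as k=4 the recomputed wire percentage is 25 rather than 20, which is why the stamped value is derived from m instead of copied from the setting.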
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>