The clients/apple scaffold is now a working macOS client, validated live against this repo's host across the LAN: gamescope virtual output → NVENC HEVC → lumen/1 (GF(2¹⁶) FEC + AES-GCM over UDP, QUIC control) → VideoToolbox → AVSampleBufferDisplayLayer at 720p60, mouse/keyboard flowing back as QUIC datagrams into the host's gamescope EIS injector (~3.7k events injected in one session).

LumenKit:
- LumenConnection: the predicted cbindgen compile fixes (C17 header spells the typedefs as integers while the enum constants import as a distinct Swift type — bridge by rawValue); close() is now safe from any thread (a close flag + pumpLock held across the blocking poll enforce the C contract "never close with a next_au in flight"; flag prevents lock-starvation by back-to-back polls).
- StreamView: per-pump cancellation token (reconnects can't double-pump), flush + re-gate on the next in-band parameter sets when the layer fails, no stale enqueue after restart.
- InputCapture: fractional-delta accumulation (sub-pixel motion isn't truncated away), pressed-state tracking with release-all on focus loss and stop() (nothing sticks down host-side), global-singleton ownership guard (GC has one handler slot per process), X1/X2 buttons, horizontal scroll, full keypad/CapsLock/ISO-102nd/PrintScreen/Menu VKs.
- LumenClient app shell (swift run LumenClient): connect form, fps/Mb-s HUD, LUMEN_AUTOCONNECT/LUMEN_MODE for scripted first-light runs.
- Tests: Annex-B byte-level units; real-codec round trip (VTCompressionSession-encoded HEVC rebuilt as the host's wire shape → AnnexB → VTDecompressionSession → pixels); test-loopback.sh (Swift client vs a real local m3-host over loopback — the Swift twin of c_abi_connection_roundtrip); RemoteFirstLightTests (full pipeline over the LAN).

Host/build fixes that fell out:
- The workspace builds on non-Linux again: gamestream audio (opus) and sendmmsg batching are now platform-gated with stubs/fallback, per the crate's "compiles everywhere" rule.
- Horizontal scroll was inverted end-to-end: the injectors negated BOTH axes onto the ei/wl axes, but GameStream's horizontal convention is positive = right (moonlight-qt/Sunshine pass it through unnegated) — only vertical flips now. This also un-inverts real Moonlight clients.
- AnnexB drops all zeros preceding a start code (trailing_zero_8bits padding), ffmpeg's policy, instead of leaking them into the preceding NAL.
- build-xcframework.sh: deployment targets pinned to the package floor + an otool guard — cargo does not fingerprint MACOSX_DEPLOYMENT_TARGET, so warm caches can silently ship too-new minos objects.

Adversarially reviewed (5-dimension multi-agent pass, every finding refutation-verified): 14 confirmed findings, all fixed above; the send-while-polling core-contract gap flagged here is closed by the lumen/1 session-planes work (&self pulls + per-plane borrow slots).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

# CLAUDE.md — lumen

Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protocol core
(`lumen-core`) exposed over a C ABI and native clients per platform. Full design:
[`docs/implementation-plan.md`](docs/implementation-plan.md). Status table: `README.md`.

## Where the work stands

- **M1 (`lumen-core` + C ABI): complete and hardened.** FEC recovery, loopback-under-loss,
  proptests, C ABI harness all green; 13 adversarial-review findings fixed +
  regression-tested (`a913042`).
- **M2 (GameStream host): working end-to-end with a stock Moonlight client.** Validated live
  on this box: pairing (persists across restarts), serverinfo/applist (app catalog from
  `~/.config/lumen/apps.json` → each entry picks a compositor + nested command), RTSP, ENet
  control, audio, and video at the **client's native resolution and refresh** — the host
  creates a per-session virtual output via per-compositor `VirtualDisplay` backends:
  **KWin** (`zkde_screencast stream_virtual_output`, needs KWin ≥ 6.5.6 headless; >60 Hz via
  custom modes), **gamescope** (spawned headless at WxH@Hz, its PipeWire node captured, needs
  gamescope ≥ 3.16.22 — older versions deadlock on PipeWire ≥ 1.6), **Mutter** (D-Bus
  `RecordVirtual`; implemented, live validation pending a gnome-shell install).
  Performance work landed and measured: GPU **zero-copy** on all paths (tiled dmabuf →
  EGL/GL → CUDA; LINEAR dmabuf → **Vulkan bridge** → CUDA → NVENC), auto 2-way NVENC
  split-encode above ~1 Gpix/s (5K@240), infinite GOP + RFI keyframes (killed the periodic
  freeze), encode|send thread split with `sendmmsg` batching. Stable 240 fps at 5120×1440.
  Input: mouse/keyboard (libei via RemoteDesktop portal on KWin/GNOME, gamescope's own EIS
  socket, wlr protocols on Sway) and **gamepads** (uinput X-Box-360 pads + rumble
  back-channel; live validation pending the udev rule below). Management REST API +
  checked-in OpenAPI doc (`mgmt.rs`).
- **M3 (`lumen/1`, the native protocol): full session planes, validated live.** QUIC
  control plane (`lumen-core` `quic` feature: Hello{mode}/Welcome{full Config}/Start), data
  plane = the hardened M1 `Session` over raw UDP with **GF(2¹⁶) Leopard FEC + AES-GCM**
  (inexpressible in GameStream), host creates the native virtual output at the client's
  requested mode. `m3-host` is a **persistent listener** (sessions back to back;
  `--max-sessions`). QUIC datagrams carry the side planes, demuxed by first byte: input
  0xC8 (incl. **gamepads** — incremental events accumulated into the uinput xpad), **Opus
  audio** 0xC9 (48 kHz stereo, 5 ms, host→client), **rumble** 0xCA (host→client). **Trust:**
  host serves its persistent identity (`~/.config/lumen/cert.pem`, shared with GameStream
  pairing) and logs the SHA-256 fingerprint; clients pin it (TOFU on first connect —
  `endpoint::client_pinned`). Measured on-box at 720p120: 1680/1680 frames, **p50 0.83 ms**
  capture→…→reassembled; audio measured live (~200 pkts/s). `lumen-client-rs` is the
  working reference client (`--pin`, datagram counters, `--input-test` incl. gamepad).
  The embeddable connector (`NativeClient`) exposes it all over the C ABI: `lumen_connect`
  (pin/TOFU) + `next_au`/`next_audio`/`next_rumble`/`send_input`.

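
The first-byte demux of QUIC datagrams described for M3 can be sketched in a few lines. This is a minimal illustration, not the code in `lumen-core`: the tag values 0xC8/0xC9/0xCA come from this document, while the `Plane` enum, the `demux` function, and the choice to hand back the payload without its tag byte are assumptions made for the example.

```rust
/// Tag bytes for the lumen/1 side planes, as listed in this document.
pub const TAG_INPUT: u8 = 0xC8;
pub const TAG_AUDIO: u8 = 0xC9;
pub const TAG_RUMBLE: u8 = 0xCA;

/// Hypothetical view of one datagram after demux (payload excludes the tag).
#[derive(Debug, PartialEq, Eq)]
pub enum Plane<'a> {
    Input(&'a [u8]),
    Audio(&'a [u8]),
    Rumble(&'a [u8]),
}

/// Route a datagram by its first byte.
/// Returns `None` for an empty datagram or an unknown tag, so the caller
/// can count and drop it instead of panicking on untrusted input.
pub fn demux(datagram: &[u8]) -> Option<Plane<'_>> {
    let (&tag, payload) = datagram.split_first()?;
    match tag {
        TAG_INPUT => Some(Plane::Input(payload)),
        TAG_AUDIO => Some(Plane::Audio(payload)),
        TAG_RUMBLE => Some(Plane::Rumble(payload)),
        _ => None,
    }
}
```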
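
Trust-on-first-use pinning, as described for `lumen/1`, reduces to a small decision table. The sketch below is hypothetical and is not `endpoint::client_pinned`: `PinStore`, `PinDecision`, and the in-memory map are invented for illustration, and the 32-byte fingerprint is assumed to be the SHA-256 digest the host logs (computing it is out of scope here).

```rust
use std::collections::HashMap;

/// Assumed to be the SHA-256 digest of the host certificate.
pub type Fingerprint = [u8; 32];

#[derive(Debug, PartialEq, Eq)]
pub enum PinDecision {
    /// No pin existed for this host; the presented fingerprint is now pinned.
    TrustedFirstUse,
    /// The presented fingerprint equals the stored pin.
    Match,
    /// The presented fingerprint differs; the connection should be refused.
    Mismatch,
}

/// Hypothetical in-memory pin store keyed by host name.
#[derive(Default)]
pub struct PinStore {
    pins: HashMap<String, Fingerprint>,
}

impl PinStore {
    /// Apply the TOFU rule. A mismatch never overwrites the stored pin.
    pub fn check(&mut self, host: &str, presented: Fingerprint) -> PinDecision {
        match self.pins.get(host).copied() {
            None => {
                self.pins.insert(host.to_owned(), presented);
                PinDecision::TrustedFirstUse
            }
            Some(pinned) if pinned == presented => PinDecision::Match,
            Some(_) => PinDecision::Mismatch,
        }
    }
}
```

A real client would persist the map and compare in constant time; both are left out to keep the rule itself visible.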
## What's left

1. **M4 — client decode + present: macOS stage 1 done, first light achieved
   (2026-06-10).** LumenKit compiles and is tested on macOS (AnnexB → VideoToolbox →
   `AVSampleBufferDisplayLayer`, GCMouse/GCKeyboard capture, `LumenClient` app shell);
   validated live Mac ↔ this box at 720p60 — vkcube on glass, input injected via gamescope
   EIS. Tests: `swift test` in `clients/apple` (unit + real-codec round trip),
   `test-loopback.sh` (Swift client vs synthetic m3-host on loopback — runs on macOS),
   `RemoteFirstLightTests` (full pipeline over the LAN). See
   [`clients/apple/README.md`](clients/apple/README.md). Next: stage 2 presenter
   (`VTDecompressionSession` + `CAMetalLayer` frame pacing), glass-to-glass numbers via
   `tools/latency-probe` (scaffold), iOS variant. The Linux reference client
   (`lumen-client-rs`) gets VAAPI + wgpu on the same connector later.
2. **Sub-frame pipelining**: overlap encode and transmit within a frame. Requires a direct
   NVENC SDK wrapper (libavcodec only emits whole AUs) — the next big latency lever (~2–4 ms
   at high res).
3. **lumen/1 protocol growth**: a PIN-style pairing ceremony on top of fingerprint pinning,
   mid-stream mode renegotiation (the Welcome is one-shot today), concurrent sessions
   (today: one at a time, extras wait in the accept queue).
4. **M2 polish**: wlroots/Sway `VirtualDisplay` backend (deferred; swaymsg `create_output`),
   GNOME live validation, gamepad live validation (blocked on the udev rule below),
   HDR/10-bit/AV1 negotiation, surround audio, reconnect-at-new-mode robustness.
5. **Native clients** (`clients/{apple,android}` scaffolds) consuming `lumen_core.h`.
6. **This box, one-time setup still pending**: `sudo cp scripts/60-lumen.rules
   /etc/udev/rules.d/` + user into `input` group (gamepads); `apt install gnome-shell`
   (Mutter validation). Done since last update: gamescope 3.16.22 is installed at
   `/usr/local/bin` — the `PATH=/tmp/gamescope-src/...` override is no longer needed.

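
To make the sub-frame pipelining item concrete, here is a toy model of the idea using only native threads and a channel, in keeping with the "no async on the per-frame path" rule. Everything in it is a stand-in: `pipeline_frame`, the notion of a "slice", and the fake payloads are hypothetical, and no NVENC or network call is made. It only shows the shape of the overlap, where the sender starts on slice 0 while later slices are still being produced.

```rust
use std::sync::mpsc;
use std::thread;

/// Produce `slices` pieces of one frame on the calling thread while a
/// sender thread consumes each piece as soon as it is ready.
/// Returns the pieces in the order the sender handled them.
pub fn pipeline_frame(slices: u8) -> Vec<Vec<u8>> {
    let (tx, rx) = mpsc::channel::<Vec<u8>>();

    // Sender thread: in a real host this would packetize, add FEC, and send.
    let sender = thread::spawn(move || {
        let mut sent = Vec::new();
        for slice in rx {
            sent.push(slice);
        }
        sent
    });

    // "Encoder": emits slices one by one instead of a whole access unit.
    for i in 0..slices {
        let encoded = vec![i; 4]; // stand-in for one encoded slice
        tx.send(encoded).expect("sender thread is alive");
    }
    drop(tx); // closing the channel ends the sender's loop

    sender.join().expect("sender thread panicked")
}
```

With whole-AU output the sender cannot start until the last slice exists; with per-slice output the transmit time of early slices hides behind the encode time of later ones.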
## Build / test / run

```sh
cargo build --workspace                  # green on Linux and macOS
cargo test --workspace                   # unit + loopback + proptest + C ABI harness (~97 tests)
cargo clippy --workspace --all-targets -- -D warnings
cargo fmt --all --check

cargo run -p loss-harness                # FEC loss-resilience sweep (no network needed)
bash crates/lumen-core/tests/c/run.sh    # standalone C-ABI link + round-trip proof
```

Generated artifacts are **checked in** and CI fails on drift: `include/lumen_core.h`
(cbindgen from `lumen-core/src/abi.rs`) and `docs/api/openapi.json` (regenerate with
`cargo run -p lumen-host -- openapi > docs/api/openapi.json`; spec lives in `mgmt.rs`).

## Layout

```
crates/lumen-core/                      protocol · FEC · crypto · quic (lumen/1 control plane, feature-gated)
crates/lumen-host/
  gamestream/                           Moonlight compat: nvhttp · pairing · rtsp · control · stream · gamepad · apps
  vdisplay/{kwin,gamescope,mutter}.rs   per-compositor client-sized virtual outputs
  zerocopy/{egl,cuda,vulkan}.rs         dmabuf → CUDA → NVENC (tiled via EGL/GL, LINEAR via Vulkan)
  inject/{libei,wlr,gamepad}.rs         input backends (+ uinput virtual gamepads)
  capture.rs · encode.rs · audio.rs · m0.rs · m3.rs · mgmt.rs
crates/lumen-client-rs/                 lumen/1 reference client (M3 headless; M4 adds decode+present)
tools/{loss-harness,latency-probe}/     measurement (plan §10)
scripts/                                60-lumen.rules · lumen-host.service · host.env.example · headless/
include/lumen_core.h                    generated C header
```

## Design invariants — do not regress

- **One core, linked everywhere.** Protocol/FEC/crypto live only in `lumen-core`, behind a
  stable, versioned C ABI. `tokio`/`quinn` exist only behind the `quic` feature (control
  plane); **no async on the per-frame path** — native threads only.
- **Native client resolution, no scaling.** A session gets a virtual output at exactly the
  client's WxH@Hz via the `VirtualDisplay` trait (`create(mode) → VirtualOutput { node_id,
  remote_fd, preferred_mode, keepalive }`, RAII teardown). There is no cross-compositor
  protocol for this — each compositor keeps its own backend.
- **FEC is the wall-breaker.** GF(2⁸) (≤255 shards/block, Moonlight-compatible) and GF(2¹⁶)
  Leopard (≤65535 shards/block) — lumen/1 negotiates the latter, removing the ~1 Gbps
  ceiling.
- **M1 security hardening stays intact**: reassembler bounds attacker-controlled fields
  before allocating (`ReassemblerLimits`); AES-GCM per-direction nonce salts + seq-as-AAD;
  ABI `struct_size` checks. Regression tests exist — keep them green.
- **PipeWire consumer discipline**: our capture streams set `node.dont-reconnect` and tear
  down promptly on negotiation timeout — one wedged link head-blocks the daemon's shared
  work queue system-wide.

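
The `VirtualDisplay` invariant above (one `create(mode)` call, teardown by RAII) might look roughly like this in Rust. The trait name and the `VirtualOutput` field names are taken from this document; the field types, the `Mode` struct, the `String` error, and the `FakeBackend` used to demonstrate drop-time teardown are assumptions for the sake of a compilable sketch, not the host's real definitions.

```rust
use std::any::Any;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical mode type: exactly the client's WxH@Hz.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
}

/// Field names follow the document; the types are guesses.
pub struct VirtualOutput {
    pub node_id: u32,
    pub remote_fd: Option<i32>,
    pub preferred_mode: Mode,
    /// Dropping this value tears the output down (RAII).
    pub keepalive: Box<dyn Any>,
}

pub trait VirtualDisplay {
    fn create(&mut self, mode: Mode) -> Result<VirtualOutput, String>;
}

/// Guard whose destructor stands in for compositor-side teardown.
struct FakeGuard(Arc<AtomicBool>);

impl Drop for FakeGuard {
    fn drop(&mut self) {
        self.0.store(true, Ordering::SeqCst);
    }
}

/// Test double: records teardown in a shared flag instead of talking to a compositor.
pub struct FakeBackend {
    pub torn_down: Arc<AtomicBool>,
}

impl VirtualDisplay for FakeBackend {
    fn create(&mut self, mode: Mode) -> Result<VirtualOutput, String> {
        if mode.width == 0 || mode.height == 0 || mode.refresh_hz == 0 {
            return Err(format!("invalid mode {mode:?}"));
        }
        Ok(VirtualOutput {
            node_id: 42, // placeholder PipeWire node id
            remote_fd: None,
            preferred_mode: mode,
            keepalive: Box::new(FakeGuard(self.torn_down.clone())),
        })
    }
}
```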
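
The "bound attacker-controlled fields before allocating" rule can be illustrated with a small validator. This is a sketch under stated assumptions: the real `ReassemblerLimits` fields are not shown in this document, so the `Limits` struct, its three fields, the `Reject` enum, and `alloc_frame_buffer` are all hypothetical names chosen for the example.

```rust
/// Hypothetical limits; the real `ReassemblerLimits` may differ.
#[derive(Debug, Clone, Copy)]
pub struct Limits {
    pub max_shards: usize,
    pub max_shard_len: usize,
    pub max_frame_bytes: usize,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Reject {
    ZeroSized,
    TooManyShards,
    ShardTooLarge,
    FrameTooLarge,
}

/// Validate header fields that arrive from the network, then allocate.
/// Every check happens before `vec!` runs, so a hostile header cannot
/// force a huge allocation or an integer overflow.
pub fn alloc_frame_buffer(
    shards: u32,
    shard_len: u32,
    limits: &Limits,
) -> Result<Vec<u8>, Reject> {
    let shards = shards as usize;
    let shard_len = shard_len as usize;
    if shards == 0 || shard_len == 0 {
        return Err(Reject::ZeroSized);
    }
    if shards > limits.max_shards {
        return Err(Reject::TooManyShards);
    }
    if shard_len > limits.max_shard_len {
        return Err(Reject::ShardTooLarge);
    }
    // checked_mul guards the product even if the limits themselves are large.
    let total = shards.checked_mul(shard_len).ok_or(Reject::FrameTooLarge)?;
    if total > limits.max_frame_bytes {
        return Err(Reject::FrameTooLarge);
    }
    Ok(vec![0u8; total])
}
```

The per-field caps and the total cap are separate on purpose: two individually acceptable fields can still multiply into a frame larger than the session should ever hold.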
## Running on this box

Headless QEMU VM (Ubuntu 26.04, kernel 7.0), passthrough RTX 5070 Ti (driver 595 **open**
module — a kernel update silently drops it; reinstall `nvidia-driver-595-open`), no KMS
scanout → KWin `--drm` impossible; everything renders offscreen via `renderD128`.

```sh
# compositor session (shell 1, or the systemd unit in scripts/):
XDG_RUNTIME_DIR=/run/user/1000 DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus \
  XDG_CURRENT_DESKTOP=KDE KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 \
  kwin_wayland --virtual --width 1920 --height 1080 --no-lockscreen --socket wayland-kde \
  --exit-with-session wev

# host (shell 2; gamescope 3.16.22 is installed at /usr/local/bin, so no PATH override is needed):
WAYLAND_DISPLAY=wayland-kde XDG_CURRENT_DESKTOP=KDE LUMEN_VIDEO_SOURCE=virtual \
  LUMEN_ZEROCOPY=1 cargo run -rp lumen-host -- serve

# lumen/1 native loopback test (no Moonlight needed; same env as serve, listener persists
# across sessions — bound it with --max-sessions):
cargo run -rp lumen-host -- m3-host --source virtual --seconds 10 --max-sessions 1
cargo run -rp lumen-client-rs -- --mode 1280x720x120 --out /tmp/a.h265 --input-test  # + --pin HEX
```

Pinned crate facts: `ashpd` 0.13 + `pipewire` 0.9 (must match ashpd's) + `ffmpeg-next` 8.x
(system FFmpeg 8 / libavcodec 62). Env knobs: `LUMEN_VIDEO_SOURCE=virtual|portal`,
`LUMEN_COMPOSITOR=kwin|gamescope|mutter`, `LUMEN_ZEROCOPY=1`, `LUMEN_GAMESCOPE_APP=...`,
`LUMEN_INPUT_BACKEND=...`, `LUMEN_PERF=1` (per-stage timing), `LUMEN_VIDEO_DROP=N` (FEC
test), `LUMEN_FEC_PCT=N`.

## Conventions

- Rust 2021, `rustfmt` + `clippy -D warnings` clean before commit.
- Match the surrounding code's comment density and naming.
- Commit messages end with the Co-Authored-By trailer (see `git log`).
- `pkill` caution on this box: match exact comm names (`pkill -x gamescope-wl`,
  `pkill -x lumen-host`) — `pkill -f` self-matches the invoking shell.