m3-host is now a real host, not a one-shot demo. Everything validated live on this box
(two back-to-back sessions, pinned + TOFU, ~200 audio pkts/s, p50 0.84 ms at 720p60).
lumen-core:
- quic.rs: QUIC-datagram side planes demuxed by first byte — Opus audio 0xC9
([magic][u32 seq][u64 pts_ns][opus], host→client) and rumble 0xCA ([magic][pad][low][high]).
- Trust: endpoint::server_with_identity (persistent PEM identity) and
endpoint::client_pinned — SHA-256 cert-fingerprint pinning with TOFU (observed
fingerprint reported back for persisting). The verifier checks the TLS 1.3
CertificateVerify signature for real (an MITM replaying the host's public cert without
its key is rejected; cert pinning alone would not prove key possession).
- client.rs: NativeClient gains pin + host_fingerprint, audio/rumble receivers
(next_audio / next_rumble); pull methods take &self so the C ABI's per-plane threads
never alias a &mut (per-plane mutexed borrow slots in abi.rs).
- abi.rs: lumen_connect(pin_sha256, observed_sha256_out) + lumen_connection_next_audio /
next_rumble. input.rs: documented gamepad wire contract (GameStream buttonFlags bits,
XInput axis conventions, +y = up) — exported as LUMEN_BTN_*/LUMEN_AXIS_* (bare BTN_*
collides with <linux/input-event-codes.h> at different values).
lumen-host (m3):
- Persistent accept loop: sessions back to back on one endpoint (--max-sessions, 0 =
forever); per-session failures log and the loop keeps serving; 10 s handshake deadline
so a silent client can't wedge the sequential accept queue; teardown on every exit path
(stop flag → conn.close → join audio+input threads).
- Audio plane: desktop PipeWire capture → Opus 48 kHz stereo 5 ms CBR → datagrams; ONE
capturer reused across sessions via an AudioCapSlot (PipeWire streams have no cheap
teardown — per-session opens would leak a thread + core connection + live node each).
- Gamepad routing: incremental GamepadButton/GamepadAxis datagrams accumulate into
per-pad state feeding the uinput xpad manager; force feedback returns as rumble
datagrams, with current state re-sent every 500 ms (idempotent-state healing for the
lossy channel). QUIC endpoint serves the persistent ~/.config/lumen identity and logs
the pinnable fingerprint.
lumen-client-rs: --pin (malformed values abort — never silently downgrade to TOFU),
TOFU fingerprint logging, audio/rumble datagram counters, gamepad events in --input-test.
clients/apple: scaffold synced — pinSHA256/hostFingerprint (wrong-size pin throws,
fail-closed), nextAudio/nextRumble, gamepad event constructors; README handoff updated
(persistent listener, audio decode notes, trust UX).
Adversarially reviewed (5-dimension multi-agent pass over the diff, 2-skeptic
verification): fixed the MITM signature-check gap, a Y-axis contract inversion, header
macro collisions, ABI aliasing UB, the PipeWire per-session leak, the missing handshake
deadline, fail-open pin parsing, and teardown-on-error paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Resolutions: serve() keeps main's AppState::new() with our persisted-pairing load folded
into it; main.rs keeps both the m3 and mgmt modules; mgmt's test LaunchSessions gain the
new appid field; Cargo.lock re-resolved. Full gate green (92 tests, clippy, fmt).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The native protocol now does the real thing, end to end:
- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
--out, --input-test (scripted mouse/keyboard datagrams).
VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.
Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.
- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
(self-signed server; accepts-any client — pinning lands with the trust model) and framed
async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
(no async on the frame path — design invariant) streams deterministic 64KB test frames
through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.
VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Five confirmed findings from a 46-agent review panel:
- Empty --mgmt-token no longer satisfies the non-loopback token gate
(critical: 'Bearer ' with an empty token authenticated; parse_serve now
bails on blank tokens and mgmt::run treats blank as none)
- axum's built-in body rejections (400/415/422) now wear the documented
ApiError envelope via an ApiJson extractor, and the spec documents them
- GET /health carries security([{}]) in the spec, matching the server's
auth exemption
- unpairClient's description no longer claims revocation the TLS layer
doesn't enforce yet (gamestream/tls.rs accepts any cert — known gap)
- CLAUDE.md/README.md no longer reference the deleted web.rs
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A versioned control-plane REST API (/api/v1) on its own port (default
127.0.0.1:47990) serving host info, runtime status, paired-client
management, the pairing PIN flow, and session control (stop / force-IDR).
The OpenAPI 3.1 document is generated from the handlers by utoipa, served
live at /api/v1/openapi.json (+ Scalar docs at /api/docs), printable via
`lumen-host openapi`, and checked in at docs/api/openapi.json for client
codegen — a test fails if it drifts, mirroring the cbindgen header rule.
Auth: optional bearer token (--mgmt-token / LUMEN_MGMT_TOKEN), enforced on
everything but /health, and mandatory for non-loopback binds. PinGate
gains a waiter count so the API can report pin_pending; logs moved to
stderr so stdout stays machine-readable. Supersedes the web.rs stub.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Honor the client's requested resolution by rendering a compositor virtual output at
exactly that size — native, headless, no scaling. There is no cross-compositor Wayland
protocol for this, so it's a per-compositor backend behind the (previously stubbed)
VirtualDisplay trait.
- vdisplay.rs: VirtualDisplay::create(mode) now returns a live VirtualOutput
{ node_id, remote_fd: Option<OwnedFd>, keepalive } with RAII teardown (drop releases
the output) instead of an inert OutputHandle + explicit destroy. Add compositor
detect() (LUMEN_COMPOSITOR / XDG_CURRENT_DESKTOP).
- vdisplay/kwin.rs: the KWin backend — the zkde_screencast_unstable_v1 stream_virtual_output
client (vendored protocol XML + wayland-scanner codegen). Creates a WxH output, returns
its PipeWire node (default daemon, remote_fd=None); a keepalive thread holds the Wayland
connection until dropped. (Moved here from capture/kwin.rs — it's a vdisplay backend, not
capture.)
- capture: generalize the PipeWire consumer to Option<OwnedFd> (portal remote vs. default
daemon) and add capture_virtual_output(vout), compositor-agnostic, owning the keepalive.
- gamestream/stream.rs: LUMEN_VIDEO_SOURCE=virtual creates a virtual display sized to the
client's cfg and captures it (self-contained, not pooled — a reconnect at a new
resolution gets a fresh output).
- m0: --source kwin-virtual goes through the trait.
Verified end-to-end against the running headless KWin: the request reaches the compositor
and is handled cleanly. Native creation needs a backend implementing createVirtualOutput —
the DRM backend, or the VirtualBackend since KWin 6.5.6; on this box's --virtual 6.4.5 it
returns "Could not find output" (expected; validates after the KWin upgrade). wlroots/Mutter
backends are the next ones to land on the same seam.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Scaffolding for dmabuf zero-copy (plan §9), opt-in via LUMEN_ZEROCOPY:
- src/zerocopy/{cuda,egl}.rs: hand-rolled CUDA Driver-API FFI (no Rust crate
exposes the EGL-interop calls / CUeglFrame) with a shared process-wide
CUcontext + pitched device buffers; an EGL importer (GBM platform on the
NVIDIA render node) that turns a dmabuf into an EGLImage, registers it with
CUDA, and copies it device-to-device into an owned buffer. `zerocopy-probe`
subcommand validates the FFI/linking/GPU access — confirmed on the box
(driver 595, EGL_EXT_image_dma_buf_import + modifiers).
- CapturedFrame gains a FramePayload enum (Cpu(Vec<u8>) | Cuda(DeviceBuffer));
the encoder branches: CPU keeps the expand+upload path, CUDA wraps the device
buffer in an AV_PIX_FMT_CUDA frame fed straight to hevc_nvenc (sharing our
CUcontext via a hand-declared AVCUDADeviceContext, since ffmpeg-sys doesn't
bind hwcontext_cuda.h). open_video/the encoder take a `cuda` flag derived from
the first frame's payload.
The capture-side dmabuf negotiation (which produces the Cuda frames) is the
next step; the CPU path is unchanged and remains the default + fallback. Builds
clean, clippy clean, tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The M2 teardown work added an `active` gate to the PipeWire capture callback
(idle by default so reconnects stay cheap, with the stream path calling
set_active(true) on PLAY). The `m0` subcommand was never updated, so its portal
capturer stayed inactive and the callback dropped every frame — `m0 --source
portal` failed with "no PipeWire frame within 10s" on every compositor. Call
set_active(true) before the capture loop.
Validated on headless KWin (Plasma 6.4) via the RemoteDesktop-anchored ScreenCast
session: real desktop frames flow (shm BGRx 1920x1080) and encode to valid H.265.
(Also folds in a rustfmt reflow of the input-test log line.)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a second input-injection backend that works on compositors implementing
the org.freedesktop.portal.RemoteDesktop interface (KWin, GNOME/Mutter), where
the wlroots virtual-input protocols are absent. Uses ashpd 0.13 to open a
RemoteDesktop session + EIS fd and reis 0.6.1 to drive it as an EI sender:
bind pointer/keyboard/scroll/button capabilities and, per device,
start_emulating → emit → frame. Runs on a dedicated thread with its own tokio
runtime (the portal session + EIS connection must stay alive and the event
stream must be polled continuously); open() returns immediately so a slow or
denied portal can never freeze the ENet control thread, with events enqueued
over an unbounded channel until devices resume.
Backend now auto-selects per session (inject::default_backend): wlr on Sway,
libei on KDE/GNOME; LUMEN_INPUT_BACKEND overrides. Refactor inject.rs into the
inject/{wlr,libei}.rs layout matching the capture/encode convention. Keyboard
codes are evdev (the same space our VK→evdev table produces) and the compositor
supplies the keymap, so no keymap upload and no modifier serialization — pressing
the modifier keys Moonlight sends is enough.
Add a `lumen-host input-test` subcommand that injects a scripted mouse+keyboard
pattern through the session backend, so input injection can be validated without
a Moonlight client.
Live-validated on headless KWin (Plasma 6.4): mouse motion, left click, and the
'A' key inject correctly and are delivered to the focused client.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
A stock Moonlight client now gets video + full input + AUDIO from the
from-scratch GameStream host (verified live end-to-end on a macOS client).
Audio (audio.rs, audio/linux.rs, gamestream/audio.rs):
- Capture the default PipeWire sink's monitor (system output) as interleaved
f32 stereo @ 48kHz via stream.capture.sink, on its own thread.
- Opus-encode 5ms/240-sample stereo frames (RESTRICTED_LOWDELAY, CBR) and send
as GameStream RTP audio: 12-byte BE RTP_PACKET (packetType 97, seq+1/pkt,
timestamp += packetDuration, ssrc 0) on UDP 48000, after learning the client
endpoint from its port-learning ping.
- Encrypt the Opus payload with AES-128-CBC (PKCS7), key = launch rikey, IV =
BE32(rikeyid + seq) in [0..4]. Like the control stream, modern Moonlight
always decrypts audio regardless of the negotiated flags — plaintext makes it
log "Failed to decrypt audio packet" and play silence (diagnosed from the
client log). RTP header stays in the clear. Scheme cross-checked against
Sunshine stream.cpp/crypto.cpp + moonlight AudioStream.c.
- Pace each frame to its 5ms slot (PipeWire delivers ~1024-frame buffers) to
avoid bursts the client's jitter buffer hears as glitches. LUMEN_AUDIO_GAIN
applies optional linear gain for quiet sources.
- DESCRIBE SDP advertises the stereo Opus config (a=fmtp:97 surround-params).
Video (stream.rs): pace at a steady ≤60fps, re-encoding the last captured frame
when the compositor produces none. wlroots only emits on damage, so a static or
slow-updating desktop previously starved the client into a "network too slow"
abort; an unchanged frame costs a near-empty P-frame. Adds a non-blocking
Capturer::try_latest (portal drains to the freshest queued frame).
Misc: serialize pipewire init across the video + audio capture threads
(pwinit.rs, std::sync::Once) to avoid a concurrent pw_init race. Deps: opus,
cbc; libopus-dev in bootstrap-ubuntu.sh.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>