Commit Graph

12 Commits

Author SHA1 Message Date
enricobuehler 68f2b19cca Merge main (management REST API) into m1-lumen-core
ci / rust (push) Has been cancelled
Resolutions: serve() keeps main's AppState::new() with our persisted-pairing load folded
into it; main.rs keeps both the m3 and mgmt modules; mgmt's test LaunchSessions gain the
new appid field; Cargo.lock re-resolved. Full gate green (92 tests, clippy, fmt).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:03:41 +00:00
enricobuehler 5b0d84acd0 feat: M3 — lumen/1 native streaming: real video at client mode + input over QUIC datagrams
The native protocol now does the real thing, end to end:

- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
  exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
  NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
  spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
  capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
  --out, --input-test (scripted mouse/keyboard datagrams).

VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.

Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 06:56:47 +00:00
enricobuehler de3123038f feat: M3 seed — the lumen/1 native protocol: QUIC control plane + reference client (Phase 5)
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.

- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
  Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
  data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
  shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
  (self-signed server; accepts-any client — pinning lands with the trust model) and framed
  async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
  (no async on the frame path — design invariant) streams deterministic 64KB test frames
  through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
  up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.

VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:33:40 +00:00
enricobuehler bd25f5e02f fix: M2 — harden the management API after adversarial review
ci / rust (push) Has been cancelled
Five confirmed findings from a 46-agent review panel:
- Empty --mgmt-token no longer satisfies the non-loopback token gate
  (critical: 'Bearer ' with an empty token authenticated; parse_serve now
  bails on blank tokens and mgmt::run treats blank as none)
- axum's built-in body rejections (400/415/422) now wear the documented
  ApiError envelope via an ApiJson extractor, and the spec documents them
- GET /health carries security([{}]) in the spec, matching the server's
  auth exemption
- unpairClient's description no longer claims revocation the TLS layer
  doesn't enforce yet (gamestream/tls.rs accepts any cert — known gap)
- CLAUDE.md/README.md no longer reference the deleted web.rs

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:00:22 +00:00
enricobuehler a339a0466e feat: M2 — management REST API with OpenAPI doc (control-pane groundwork)
A versioned control-plane REST API (/api/v1) on its own port (default
127.0.0.1:47990) serving host info, runtime status, paired-client
management, the pairing PIN flow, and session control (stop / force-IDR).
The OpenAPI 3.1 document is generated from the handlers by utoipa, served
live at /api/v1/openapi.json (+ Scalar docs at /api/docs), printable via
`lumen-host openapi`, and checked in at docs/api/openapi.json for client
codegen — a test fails if it drifts, mirroring the cbindgen header rule.

Auth: optional bearer token (--mgmt-token / LUMEN_MGMT_TOKEN), enforced on
everything but /health, and mandatory for non-loopback binds. PinGate
gains a waiter count so the API can report pin_pending; logs moved to
stderr so stdout stays machine-readable. Supersedes the web.rs stub.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 21:35:43 +00:00
enricobuehler 7d08e43c16 feat: M2 — KWin virtual-output backend behind a VirtualDisplay trait (native client resolution)
Honor the client's requested resolution by rendering a compositor virtual output at
exactly that size — native, headless, no scaling. There is no cross-compositor Wayland
protocol for this, so it's a per-compositor backend behind the (previously stubbed)
VirtualDisplay trait.

- vdisplay.rs: VirtualDisplay::create(mode) now returns a live VirtualOutput
  { node_id, remote_fd: Option<OwnedFd>, keepalive } with RAII teardown (drop releases
  the output) instead of an inert OutputHandle + explicit destroy. Add compositor
  detect() (LUMEN_COMPOSITOR / XDG_CURRENT_DESKTOP).
- vdisplay/kwin.rs: the KWin backend — the zkde_screencast_unstable_v1 stream_virtual_output
  client (vendored protocol XML + wayland-scanner codegen). Creates a WxH output, returns
  its PipeWire node (default daemon, remote_fd=None); a keepalive thread holds the Wayland
  connection until dropped. (Moved here from capture/kwin.rs — it's a vdisplay backend, not
  capture.)
- capture: generalize the PipeWire consumer to Option<OwnedFd> (portal remote vs. default
  daemon) and add capture_virtual_output(vout), compositor-agnostic, owning the keepalive.
- gamestream/stream.rs: LUMEN_VIDEO_SOURCE=virtual creates a virtual display sized to the
  client's cfg and captures it (self-contained, not pooled — a reconnect at a new
  resolution gets a fresh output).
- m0: --source kwin-virtual goes through the trait.

Verified end-to-end against the running headless KWin: the request reaches the compositor
and is handled cleanly. Native creation needs a backend implementing createVirtualOutput —
the DRM backend, or the VirtualBackend since KWin 6.5.6; on this box's --virtual 6.4.5 it
returns "Could not find output" (expected; validates after the KWin upgrade). wlroots/Mutter
backends are the next ones to land on the same seam.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:30:02 +00:00
enricobuehler 16a00563a8 feat: M2 zero-copy foundation — EGL→CUDA import + NVENC CUDA-frame path
Scaffolding for dmabuf zero-copy (plan §9), opt-in via LUMEN_ZEROCOPY:

- src/zerocopy/{cuda,egl}.rs: hand-rolled CUDA Driver-API FFI (no Rust crate
  exposes the EGL-interop calls / CUeglFrame) with a shared process-wide
  CUcontext + pitched device buffers; an EGL importer (GBM platform on the
  NVIDIA render node) that turns a dmabuf into an EGLImage, registers it with
  CUDA, and copies it device-to-device into an owned buffer. `zerocopy-probe`
  subcommand validates the FFI/linking/GPU access — confirmed on the box
  (driver 595, EGL_EXT_image_dma_buf_import + modifiers).
- CapturedFrame gains a FramePayload enum (Cpu(Vec<u8>) | Cuda(DeviceBuffer));
  the encoder branches: CPU keeps the expand+upload path, CUDA wraps the device
  buffer in an AV_PIX_FMT_CUDA frame fed straight to hevc_nvenc (sharing our
  CUcontext via a hand-declared AVCUDADeviceContext, since ffmpeg-sys doesn't
  bind hwcontext_cuda.h). open_video/the encoder take a `cuda` flag derived from
  the first frame's payload.

The capture-side dmabuf negotiation (which produces the Cuda frames) is the
next step; the CPU path is unchanged and remains the default + fallback. Builds
clean, clippy clean, tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 15:13:05 +00:00
enricobuehler b64be1dc33 fix: m0 portal capture — activate the capturer so frames are delivered
The M2 teardown work added an `active` gate to the PipeWire capture callback
(idle by default so reconnects stay cheap, with the stream path calling
set_active(true) on PLAY). The `m0` subcommand was never updated, so its portal
capturer stayed inactive and the callback dropped every frame — `m0 --source
portal` failed with "no PipeWire frame within 10s" on every compositor. Call
set_active(true) before the capture loop.

Validated on headless KWin (Plasma 6.4) via the RemoteDesktop-anchored ScreenCast
session: real desktop frames flow (shm BGRx 1920x1080) and encode to valid H.265.

(Also folds in a rustfmt reflow of the input-test log line.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 14:24:09 +00:00
enricobuehler 03a6a67354 feat: M2 P1.7 — libei input backend (portable to KWin/GNOME)
Add a second input-injection backend that works on compositors implementing
the org.freedesktop.portal.RemoteDesktop interface (KWin, GNOME/Mutter), where
the wlroots virtual-input protocols are absent. Uses ashpd 0.13 to open a
RemoteDesktop session + EIS fd and reis 0.6.1 to drive it as an EI sender:
bind pointer/keyboard/scroll/button capabilities and, per device,
start_emulating → emit → frame. Runs on a dedicated thread with its own tokio
runtime (the portal session + EIS connection must stay alive and the event
stream must be polled continuously); open() returns immediately so a slow or
denied portal can never freeze the ENet control thread, with events enqueued
over an unbounded channel until devices resume.

Backend now auto-selects per session (inject::default_backend): wlr on Sway,
libei on KDE/GNOME; LUMEN_INPUT_BACKEND overrides. Refactor inject.rs into the
inject/{wlr,libei}.rs layout matching the capture/encode convention. Keyboard
codes are evdev (the same space our VK→evdev table produces) and the compositor
supplies the keymap, so no keymap upload and no modifier serialization — pressing
the modifier keys Moonlight sends is enough.

Add a `lumen-host input-test` subcommand that injects a scripted mouse+keyboard
pattern through the session backend, so input injection can be validated without
a Moonlight client.

Live-validated on headless KWin (Plasma 6.4): mouse motion, left click, and the
'A' key inject correctly and are delivered to the focused client.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 13:58:41 +00:00
enricobuehler 278a6330de feat: M2 P1.6 — audio (Opus + AES-CBC) and steady-rate video pacing
A stock Moonlight client now gets video + full input + AUDIO from the
from-scratch GameStream host (verified live end-to-end on a macOS client).

Audio (audio.rs, audio/linux.rs, gamestream/audio.rs):
- Capture the default PipeWire sink's monitor (system output) as interleaved
  f32 stereo @ 48kHz via stream.capture.sink, on its own thread.
- Opus-encode 5ms/240-sample stereo frames (RESTRICTED_LOWDELAY, CBR) and send
  as GameStream RTP audio: 12-byte BE RTP_PACKET (packetType 97, seq+1/pkt,
  timestamp += packetDuration, ssrc 0) on UDP 48000, after learning the client
  endpoint from its port-learning ping.
- Encrypt the Opus payload with AES-128-CBC (PKCS7), key = launch rikey, IV =
  BE32(rikeyid + seq) in [0..4]. Like the control stream, modern Moonlight
  always decrypts audio regardless of the negotiated flags — plaintext makes it
  log "Failed to decrypt audio packet" and play silence (diagnosed from the
  client log). RTP header stays in the clear. Scheme cross-checked against
  Sunshine stream.cpp/crypto.cpp + moonlight AudioStream.c.
- Pace each frame to its 5ms slot (PipeWire delivers ~1024-frame buffers) to
  avoid bursts the client's jitter buffer hears as glitches. LUMEN_AUDIO_GAIN
  applies optional linear gain for quiet sources.
- DESCRIBE SDP advertises the stereo Opus config (a=fmtp:97 surround-params).

Video (stream.rs): pace at a steady ≤60fps, re-encoding the last captured frame
when the compositor produces none. wlroots only emits on damage, so a static or
slow-updating desktop previously starved the client into a "network too slow"
abort; an unchanged frame costs a near-empty P-frame. Adds a non-blocking
Capturer::try_latest (portal drains to the freshest queued frame).

Misc: serialize pipewire init across the video + audio capture threads
(pwinit.rs, std::sync::Once) to avoid a concurrent pw_init race. Deps: opus,
cbc; libopus-dev in bootstrap-ubuntu.sh.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:39:22 +00:00
enricobuehler ab6dda2e5f feat: M0 capture→encode pipeline + M2 GameStream host (pairing, RTSP, video)
M0 (lumen-host) — verified on NVIDIA RTX 5070 Ti / Ubuntu 25.10:
headless wlroots → xdg ScreenCast portal → PipeWire → NVENC HEVC → playable file,
with each access unit round-tripped through a lumen_core host↔client Session
(FEC + packetize + reassemble), 0 mismatches.
- capture.rs: SyntheticCapturer + portal capture (ashpd 0.13 + pipewire 0.9), format-aware
- encode/linux.rs: NVENC via ffmpeg-next 7 (BGRx/RGB → rgb0, no host-side swscale)
- m0.rs: capture→encode→file + lumen-core loopback verification

M2 P1 (lumen-host gamestream/) — a stock Moonlight client pairs + launches, verified live:
- mDNS _nvstream._tcp + nvhttp /serverinfo (HTTP 47989, mutual-TLS HTTPS 47984)
- 4-phase pairing: PIN→AES-128-ECB / SHA-256 / RSA-PKCS1v15 / X.509, custom rustls
  ClientCertVerifier for the mutual-TLS pairchallenge
- /applist, /launch (rikey/rikeyid/mode), hand-rolled RTSP (OPTIONS/DESCRIBE/SETUP×3/
  ANNOUNCE/PLAY, one-request-per-TCP-connection per moonlight-common-c's read-to-EOF)
- video.rs: GameStream RTP + NV_VIDEO_PACKET wire packetizer, data-shards-only (0% FEC,
  clean-LAN), unit-tested (single/multi-block)

Docs: docs/m2-plan.md (phased plan) + docs/research/ (ground-truth protocol spec).
Bootstrap/setup updated for the verified path (libnvidia-gl, render/video groups, GPU
EGL, pipewire 0.9). Workspace clippy-clean, tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 07:14:59 +00:00
enricobuehler a913042367 feat: M1 lumen-core (FEC/crypto/packet/session + C ABI) and workspace scaffold
Ground-up low-latency streaming stack per docs/implementation-plan.md. M1 is
complete and tested; Linux host backends are cfg-gated stubs to be filled in on
real hardware (M0/M2).

lumen-core (built + tested on macOS/aarch64 — 21 tests):
- fec: ErasureCoder over GF(2^8) (reed-solomon-erasure, Moonlight-compatible)
  and GF(2^16) Leopard-RS (reed-solomon-simd, the >1 Gbps wall-breaker); proptested
- packet: zero-copy #[repr(C)] framing, multi-block, FEC-aware reassembly
- crypto: AES-128-GCM with per-direction nonce salts + sequence-as-AAD
- session: host submit / client poll hot paths + input; loopback & UDP transports
- abi: opaque handles, versioned LumenConfig, panic guards; cbindgen-generated header
- acceptance: Rust loopback+proptest and a C harness that links the staticlib

Scaffold (compiles green on all platforms): lumen-host (vdisplay/capture/encode/
inject/web/pipeline seams under cfg(linux)), lumen-client-rs, tools/{loss-harness,
latency-probe}, Apple/Android client stubs, Gitea CI, docs.

Hardened against a multi-agent adversarial review (13 verified findings fixed,
regression-tested): reassembler memory-DoS bounds + block-consistency validation,
GCM nonce-reuse direction separation, ABI struct_size guard + range checks, FEC
shard-length guards, shard_payload datagram bound, key zeroization + Debug redaction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 00:02:52 +02:00