c8f9032dec095032c38c9400d454fe5bb2164ba3
11 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
c8f9032dec |
feat: M2 — harden gamescope capture path (blocked on gamescope ≥3.16.22 upstream fix)
Deep investigation (gdb + daemon traces) proved the gamescope capture stall is a gamescope
3.16.20 bug, not ours: it calls pw_loop_iterate() without pw_loop_enter()/leave(), and under
PipeWire 1.6's loop locking its main thread permanently holds the loop mutex — the pw thread
deadlocks, gamescope never acks the daemon's port_set_param(Format), and the link parks in
"negotiating" silently. Stock gst pipewiresrc fails identically. Fixed upstream by gamescope
commit e3ed1ea7 ("pipewire: Fix pipewire loop locking", pipewire#5148); first release 3.16.22.
Ubuntu 26.04 ships 3.16.20 (built ten days before the fix) — patch/upgrade required.
Consumer-side improvements from the investigation (all verified correct vs gamescope's pods,
and needed once the producer is fixed):
- discover the node from gamescope's own "stream available on node ID: N" log line (its
node.name appears on two objects; the advertised id is authoritative); pw-dump fallback
- CPU path accepts mappable dmabufs: Buffers param now offers MemPtr|MemFd|DmaBuf (gamescope
counter-offers exactly DmaBuf when its modifier pod wins, never MemPtr), mmap the fd
ourselves when MAP_BUFFERS didn't (Vulkan-exported dmabufs aren't flagged mappable), and
treat chunk.size==0 as the computed span
- warn_once on every silent frame-drop path in the process callback
- node.dont-reconnect on our capture streams: an orphaned stream re-targeted by wireplumber
onto a fresh node wedges it — and a stuck link head-blocks the daemon's shared work queue,
stalling ALL new link negotiation system-wide (this poisoned whole test sessions)
- LUMEN_GAMESCOPE_NODE (attach to an existing gamescope) + LUMEN_PW_FIXED_POD (negotiation
bisection) debug knobs
KWin path regression-tested (zero-copy intact). gamescope end-to-end validation pending the
patched gamescope build.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
669d40ae21 |
build: migrate to ffmpeg-next 8 (FFmpeg 8.x / libavcodec 62)
Ubuntu 26.04 ships FFmpeg 8.0 (libavcodec 62); bump ffmpeg-next 7.1 -> 8.1 to bind it as the intended pairing. No source changes needed — the encode API surface we use (avcodec_send_frame, hwframe contexts, AV_PIX_FMT_CUDA, av_log) is stable across 7->8. Workspace builds + all tests green; clippy/fmt clean. Refresh the 7.x doc references. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
7d08e43c16 |
feat: M2 — KWin virtual-output backend behind a VirtualDisplay trait (native client resolution)
Honor the client's requested resolution by rendering a compositor virtual output at
exactly that size — native, headless, no scaling. There is no cross-compositor Wayland
protocol for this, so it's a per-compositor backend behind the (previously stubbed)
VirtualDisplay trait.
- vdisplay.rs: VirtualDisplay::create(mode) now returns a live VirtualOutput
{ node_id, remote_fd: Option<OwnedFd>, keepalive } with RAII teardown (drop releases
the output) instead of an inert OutputHandle + explicit destroy. Add compositor
detect() (LUMEN_COMPOSITOR / XDG_CURRENT_DESKTOP).
- vdisplay/kwin.rs: the KWin backend — the zkde_screencast_unstable_v1 stream_virtual_output
client (vendored protocol XML + wayland-scanner codegen). Creates a WxH output, returns
its PipeWire node (default daemon, remote_fd=None); a keepalive thread holds the Wayland
connection until dropped. (Moved here from capture/kwin.rs — it's a vdisplay backend, not
capture.)
- capture: generalize the PipeWire consumer to Option<OwnedFd> (portal remote vs. default
daemon) and add capture_virtual_output(vout), compositor-agnostic, owning the keepalive.
- gamestream/stream.rs: LUMEN_VIDEO_SOURCE=virtual creates a virtual display sized to the
client's cfg and captures it (self-contained, not pooled — a reconnect at a new
resolution gets a fresh output).
- m0: --source kwin-virtual goes through the trait.
Verified end-to-end against the running headless KWin: the request reaches the compositor
and is handled cleanly. Native creation needs a backend implementing createVirtualOutput —
the DRM backend, or the VirtualBackend since KWin 6.5.6; on this box's --virtual 6.4.5 it
returns "Could not find output" (expected; validates after the KWin upgrade). wlroots/Mutter
backends are the next ones to land on the same seam.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
16a00563a8 |
feat: M2 zero-copy foundation — EGL→CUDA import + NVENC CUDA-frame path
Scaffolding for dmabuf zero-copy (plan §9), opt-in via LUMEN_ZEROCOPY:
- src/zerocopy/{cuda,egl}.rs: hand-rolled CUDA Driver-API FFI (no Rust crate
exposes the EGL-interop calls / CUeglFrame) with a shared process-wide
CUcontext + pitched device buffers; an EGL importer (GBM platform on the
NVIDIA render node) that turns a dmabuf into an EGLImage, registers it with
CUDA, and copies it device-to-device into an owned buffer. `zerocopy-probe`
subcommand validates the FFI/linking/GPU access — confirmed on the box
(driver 595, EGL_EXT_image_dma_buf_import + modifiers).
- CapturedFrame gains a FramePayload enum (Cpu(Vec<u8>) | Cuda(DeviceBuffer));
the encoder branches: CPU keeps the expand+upload path, CUDA wraps the device
buffer in an AV_PIX_FMT_CUDA frame fed straight to hevc_nvenc (sharing our
CUcontext via a hand-declared AVCUDADeviceContext, since ffmpeg-sys doesn't
bind hwcontext_cuda.h). open_video/the encoder take a `cuda` flag derived from
the first frame's payload.
The capture-side dmabuf negotiation (which produces the Cuda frames) is the
next step; the CPU path is unchanged and remains the default + fallback. Builds
clean, clippy clean, tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
03a6a67354 |
feat: M2 P1.7 — libei input backend (portable to KWin/GNOME)
Add a second input-injection backend that works on compositors implementing
the org.freedesktop.portal.RemoteDesktop interface (KWin, GNOME/Mutter), where
the wlroots virtual-input protocols are absent. Uses ashpd 0.13 to open a
RemoteDesktop session + EIS fd and reis 0.6.1 to drive it as an EI sender:
bind pointer/keyboard/scroll/button capabilities and, per device,
start_emulating → emit → frame. Runs on a dedicated thread with its own tokio
runtime (the portal session + EIS connection must stay alive and the event
stream must be polled continuously); open() returns immediately so a slow or
denied portal can never freeze the ENet control thread, with events enqueued
over an unbounded channel until devices resume.
Backend now auto-selects per session (inject::default_backend): wlr on Sway,
libei on KDE/GNOME; LUMEN_INPUT_BACKEND overrides. Refactor inject.rs into the
inject/{wlr,libei}.rs layout matching the capture/encode convention. Keyboard
codes are evdev (the same space our VK→evdev table produces) and the compositor
supplies the keymap, so no keymap upload and no modifier serialization — pressing
the modifier keys Moonlight sends is enough.
Add a `lumen-host input-test` subcommand that injects a scripted mouse+keyboard
pattern through the session backend, so input injection can be validated without
a Moonlight client.
Live-validated on headless KWin (Plasma 6.4): mouse motion, left click, and the
'A' key inject correctly and are delivered to the focused client.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
72f8c05aa3 |
feat: M2 P1.5 (FEC) — nanors-exact Reed-Solomon recovery for the video stream
Moonlight now reconstructs lost video shards from our parity (verified live: under induced packet loss the picture recovers cleanly instead of failing with "network connection too bad"; 0% added loss in normal operation). The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix (M[j][i] = inv[(m+i)^j], GF(2^8) poly 0x1d), while reed-solomon-erasure is Vandermonde — so its parity was NOT Moonlight-decodable, despite the old gf8.rs comment claiming equivalence. lumen-core: - Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs (vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure Rust, no FFI — keeps the "one core" hot path. This makes both lumen's own protocol and the GameStream parity nanors-compatible. - Lock it with a regression test against real nanors vectors (k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay green, so lumen's own protocol is unaffected. lumen-host video.rs: - Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp fecInfo with the recomputed wire pct (100*m/k) so the client derives the same count; cap per-block data to 255*100/(100+pct) so k+m <= 255. - CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce (streamPacketIndex/frameIndex/flags/multiFec*) are written into data shards BEFORE encode, and only the transport fields (RTP header/seq/timestamp + fecInfo) are stamped AFTER — leaving the flags byte RS-covered. Matches Sunshine stream.cpp. Unit-tested incl. flags recovery. - fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
278a6330de |
feat: M2 P1.6 — audio (Opus + AES-CBC) and steady-rate video pacing
A stock Moonlight client now gets video + full input + AUDIO from the from-scratch GameStream host (verified live end-to-end on a macOS client). Audio (audio.rs, audio/linux.rs, gamestream/audio.rs): - Capture the default PipeWire sink's monitor (system output) as interleaved f32 stereo @ 48kHz via stream.capture.sink, on its own thread. - Opus-encode 5ms/240-sample stereo frames (RESTRICTED_LOWDELAY, CBR) and send as GameStream RTP audio: 12-byte BE RTP_PACKET (packetType 97, seq+1/pkt, timestamp += packetDuration, ssrc 0) on UDP 48000, after learning the client endpoint from its port-learning ping. - Encrypt the Opus payload with AES-128-CBC (PKCS7), key = launch rikey, IV = BE32(rikeyid + seq) in [0..4]. Like the control stream, modern Moonlight always decrypts audio regardless of the negotiated flags — plaintext makes it log "Failed to decrypt audio packet" and play silence (diagnosed from the client log). RTP header stays in the clear. Scheme cross-checked against Sunshine stream.cpp/crypto.cpp + moonlight AudioStream.c. - Pace each frame to its 5ms slot (PipeWire delivers ~1024-frame buffers) to avoid bursts the client's jitter buffer hears as glitches. LUMEN_AUDIO_GAIN applies optional linear gain for quiet sources. - DESCRIBE SDP advertises the stereo Opus config (a=fmtp:97 surround-params). Video (stream.rs): pace at a steady ≤60fps, re-encoding the last captured frame when the compositor produces none. wlroots only emits on damage, so a static or slow-updating desktop previously starved the client into a "network too slow" abort; an unchanged frame costs a near-empty P-frame. Adds a non-blocking Capturer::try_latest (portal drains to the freshest queued frame). Misc: serialize pipewire init across the video + audio capture threads (pwinit.rs, std::sync::Once) to avoid a concurrent pw_init race. Deps: opus, cbc; libopus-dev in bootstrap-ubuntu.sh. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
4c2c41acba |
feat: M2 P1.4 — control-stream decryption + input injection (mouse/keyboard live)
A stock Moonlight client can now drive the headless Sway desktop: mouse
movement, buttons, scroll, and keyboard all inject through the streamed
session (verified live end-to-end — typing, clicking, window management).
Control stream (gamestream/control.rs):
- Moonlight encrypts the ENet control stream with AES-128-GCM even though we
negotiate no media encryption (it detects our Sunshine `state` and turns it
on). Decrypt per-packet under the /launch `rikey`.
- The exact GCM scheme is auto-detected on the first authenticating packet
(nonce construction × key byte-order × tag position × AAD) since GCM gives no
partial credit. Our client uses the legacy 16-byte nonce (`iv[0]=seq&0xff`)
because we advertise no encryption; the 12-byte SS_ENC_CONTROL_V2 nonce is
also supported. Key/IV/tag layout cross-checked against Sunshine stream.cpp +
crypto.cpp and moonlight-common-c ControlStream.c.
Input decode (gamestream/input.rs):
- Decrypted control messages (`[u16 type][u16 len][NV_INPUT packet]`, type
0x0206) decode into lumen_core::input::InputEvent: relative/abs mouse, buttons,
vert/horiz scroll, keyboard down/up. Struct layout from moonlight Input.h
(size BE, magic LE, body BE; keyCode LE masked to the low-byte VK), dispatch
per Sunshine input.cpp (Gen5+). Unit-tested against real captured bytes.
Injection (inject.rs):
- WlrootsInjector: connects to Sway as a Wayland client and injects via the
wlroots virtual-pointer + virtual-keyboard protocols (uinput is invisible to a
compositor running WLR_LIBINPUT_NO_DEVICES=1). Uploads an evdev/US xkb keymap,
tracks modifier state, and maps Windows VK → Linux evdev (full table).
Deps: aes-gcm, wayland-client, wayland-protocols-{wlr,misc}, xkbcommon (+
libxkbcommon-dev in bootstrap-ubuntu.sh).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
de60650ed3 |
feat(m2): live video to stock Moonlight — ENet control + video data plane
A stock Moonlight client now decodes H.265 from the lumen host end-to-end (verified at 5120×1440@120 on RTX 5070 Ti): - control.rs: ENet control host on UDP 47999 (rusty_enet). Moonlight starts the control stream before video (STAGE_CONTROL_STREAM_START precedes _VIDEO_), so it must be up first — this was the blocker behind the earlier "error 35". - stream.rs: video data plane — on RTSP PLAY, learn the client endpoint from its ping, NVENC-encode at the negotiated mode, packetize (GameStream RTP/NV/FEC), send over UDP 47998; stops when the client disconnects. - rtsp.rs: ANNOUNCE → StreamConfig (resolution/fps/packetSize/bitrate/codec), PLAY starts the stream, TEARDOWN stops it; PairStatus=1 over the mutual-TLS port. P1.3 uses a synthetic test pattern + data-shards-only FEC (clean-LAN). Next: real portal desktop capture, input injection (decode control → uinput), nanors-exact FEC, encryption, audio. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
ab6dda2e5f |
feat: M0 capture→encode pipeline + M2 GameStream host (pairing, RTSP, video)
M0 (lumen-host) — verified on NVIDIA RTX 5070 Ti / Ubuntu 25.10: headless wlroots → xdg ScreenCast portal → PipeWire → NVENC HEVC → playable file, with each access unit round-tripped through a lumen_core host↔client Session (FEC + packetize + reassemble), 0 mismatches. - capture.rs: SyntheticCapturer + portal capture (ashpd 0.13 + pipewire 0.9), format-aware - encode/linux.rs: NVENC via ffmpeg-next 7 (BGRx/RGB → rgb0, no host-side swscale) - m0.rs: capture→encode→file + lumen-core loopback verification M2 P1 (lumen-host gamestream/) — a stock Moonlight client pairs + launches, verified live: - mDNS _nvstream._tcp + nvhttp /serverinfo (HTTP 47989, mutual-TLS HTTPS 47984) - 4-phase pairing: PIN→AES-128-ECB / SHA-256 / RSA-PKCS1v15 / X.509, custom rustls ClientCertVerifier for the mutual-TLS pairchallenge - /applist, /launch (rikey/rikeyid/mode), hand-rolled RTSP (OPTIONS/DESCRIBE/SETUP×3/ ANNOUNCE/PLAY, one-request-per-TCP-connection per moonlight-common-c's read-to-EOF) - video.rs: GameStream RTP + NV_VIDEO_PACKET wire packetizer, data-shards-only (0% FEC, clean-LAN), unit-tested (single/multi-block) Docs: docs/m2-plan.md (phased plan) + docs/research/ (ground-truth protocol spec). Bootstrap/setup updated for the verified path (libnvidia-gl, render/video groups, GPU EGL, pipewire 0.9). Workspace clippy-clean, tests green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
a913042367 |
feat: M1 lumen-core (FEC/crypto/packet/session + C ABI) and workspace scaffold
Ground-up low-latency streaming stack per docs/implementation-plan.md. M1 is
complete and tested; Linux host backends are cfg-gated stubs to be filled in on
real hardware (M0/M2).
lumen-core (built + tested on macOS/aarch64 — 21 tests):
- fec: ErasureCoder over GF(2^8) (reed-solomon-erasure, Moonlight-compatible)
and GF(2^16) Leopard-RS (reed-solomon-simd, the >1 Gbps wall-breaker); proptested
- packet: zero-copy #[repr(C)] framing, multi-block, FEC-aware reassembly
- crypto: AES-128-GCM with per-direction nonce salts + sequence-as-AAD
- session: host submit / client poll hot paths + input; loopback & UDP transports
- abi: opaque handles, versioned LumenConfig, panic guards; cbindgen-generated header
- acceptance: Rust loopback+proptest and a C harness that links the staticlib
Scaffold (compiles green on all platforms): lumen-host (vdisplay/capture/encode/
inject/web/pipeline seams under cfg(linux)), lumen-client-rs, tools/{loss-harness,
latency-probe}, Apple/Android client stubs, Gitea CI, docs.
Hardened against a multi-agent adversarial review (13 verified findings fixed,
regression-tested): reassembler memory-DoS bounds + block-consistency validation,
GCM nonce-reuse direction separation, ABI struct_size guard + range checks, FEC
shard-length guards, shard_payload datagram bound, key zeroization + Debug redaction.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|