Commit Graph

54 Commits

Author SHA1 Message Date
enricobuehler 2372b02620 feat(host): virtual DualSense via UHID (hid-playstation) — device + report mapping
ci / rust (push) Has been cancelled
Roadmap #5 (rich DualSense). A UHID device presents a real Sony DualSense to the kernel's
hid-playstation driver (matched by VID 054C/PID 0CE6), which exposes the full controller —
gamepad, motion sensors, touchpad, lightbar/player LEDs, adaptive triggers — unlike the
uinput X-Box-360 pad.

- inject/dualsense.rs: hand-rolled /dev/uhid codec (no bindgen) mirroring the uinput style;
  the canonical inputtino 232-byte USB HID report descriptor + the feature-report replies
  (calibration 0x05 / pairing 0x09 / firmware 0x20) — answering hid-playstation's GET_REPORTs
  during init is REQUIRED or it creates no input devices. DsState::from_gamepad maps a
  GameStream/XInput frame → the DualSense input report (buttons/sticks/triggers/dpad, +
  touchpad/motion fields); service() answers GET_REPORTs and parses HID OUTPUT (rumble /
  lightbar RGB / player LEDs / adaptive triggers) into quic::HidOutput.
- scripts/60-punktfunk.rules: grant /dev/uhid to the 'input' group (like /dev/uinput).
- `punktfunk-host dualsense-test`: standalone validation (no streaming session).

Validated live: `dualsense-test` → hid-playstation binds + loads ff_memless + led_class_
multicolor; the kernel creates "Punktfunk DualSense 0" (event/js gamepad + Motion Sensors +
Touchpad + Headset Jack) at VID 054c/PID 0ce6, plus the lightbar at /sys/class/leds/
input*:rgb:indicator; js shows the Cross button firing + the left-stick sweep. Clippy/fmt
clean, workspace tests green. Wiring into the session (pad-type select, touchpad/motion
routing, HID-output back-channel) is the next commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 07:27:28 +00:00
enricobuehler 6575dddac7 fix: keep the workspace green on macOS after the mic/touch/rich-input batch
The new features were Linux-built only and broke the documented macOS gate
(cargo build/test/clippy --workspace) four ways, all fixed following the existing
platform-gating conventions:

- m3.rs: mic_service_thread split into the Linux worker and a non-Linux stub that
  drains and drops (sessions still count the datagrams) — opus/PipeWire are
  Linux-gated deps, same pattern as audio_thread.
- punktfunk-client-rs: the new `opus` dependency moved into the Linux target table and
  --mic-test gated with a warn-and-skip stub (only the synthetic-tone test rig needs
  the encoder; the mic uplink itself is portable).
- gamestream/audio.rs: SAMPLE_RATE import gated to any(linux, test) (the frame_sizing
  test uses it everywhere, the data plane only on Linux).
- tests/c_abi.rs: the harness's macOS link flags gained Security + CoreFoundation —
  the quic feature now pulls rustls's platform verifier into the staticlib.

Also: two clippy match-ref-pats lints in the new rich-input/HID-output decoders
(clippy -D warnings is the repo gate), the regenerated punktfunk_core.h committed (the
checked-in copy predated the rich-input/HID-output constants — CI fails on drift), and
web's inlang cache dir gitignored.

cargo build/test/clippy/fmt --workspace: green on macOS, 122 tests passing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:07:48 +02:00
enricobuehler 5f6d2cb88b feat(proto): variable-length rich-input (0xCC) + HID-output (0xCD) datagrams
ci / rust (push) Has been cancelled
Foundation for rich DualSense support (roadmap #5). The fixed 18-byte InputEvent (0xC8) can't
hold the DualSense touchpad/motion or HID feedback, so two new variable-length, kind-tagged
datagram families join the side-plane (mouse/keyboard/gamepad/touch keep the fixed InputEvent):

- RICH_INPUT_MAGIC 0xCC, client→host: `[0xCC][kind][fields]`
    Touchpad{pad,finger,active,x,y}  (x/y normalized 0..65535; host scales to the pad)
    Motion{pad, gyro[3], accel[3]}   (raw i16, straight into the DualSense report)
- HIDOUT_MAGIC 0xCD, host→client: `[0xCD][kind][pad][fields]` — the rich analog of the 0xCA
  rumble datagram (rumble stays on 0xCA):
    Led{rgb}  PlayerLeds{bits}  Trigger{which, effect}  (adaptive-trigger params to replay)

`RichInput`/`HidOutput` enums with encode/decode; unknown kinds + truncation decode to None
(forward-compatible). +2 round-trip/disjointness tests; quic suite green, clippy/fmt clean.
Wiring (host UHID device, capture, C ABI, client) lands in following commits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 06:55:04 +00:00
enricobuehler dc375668ee feat: touch input — TouchDown/Move/Up + host libei ei_touchscreen injection
ci / rust (push) Has been cancelled
Roadmap #5 (touch, ahead of the XL UHID DualSense work). Touch fits the existing 18-byte
InputEvent: code = touch id, x/y = client pixels, flags = (w<<16)|h — the same absolute
mapping as MouseMoveAbs.

- core: InputKind::{TouchDown=9, TouchMove=10, TouchUp=11} + from_u8 + roundtrip test.
- host inject/libei.rs: request the RemoteDesktop Touchscreen device type, bind the Touch
  capability, and inject ei_touchscreen down/motion/up (one event = one frame, per the
  protocol rule), mapping coordinates into the device region like the abs pointer. wlroots
  has no virtual-touch protocol wired — no-ops there.
- client-rs --touch-test: drags a synthetic finger (touch id 0) in a circle.

Validated live on headless KWin: the portal GRANTS the Touchscreen device type
(Keyboard|Pointer|Touchscreen), proving the request path — but KWin's EIS server creates no
touchscreen *device*, so touch currently no-ops on this KWin (now logged once, not silent).
The injection code is correct and will land on a backend that exposes ei_touchscreen
(gamescope / a newer compositor / the real touch-client path). Workspace green, clippy/fmt
clean, +1 unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:38:44 +00:00
enricobuehler 0755c823a5 feat: mic passthrough — client microphone → host virtual PipeWire source
ci / rust (push) Has been cancelled
The inverse of the host→client audio path: the client's mic, Opus-encoded, rides a
new 0xCB QUIC datagram to the host, which decodes it into a virtual PipeWire
Audio/Source its apps can record from (voice chat, etc.).

Protocol (punktfunk-core):
- MIC_MAGIC 0xCB + encode/decode_mic_datagram (mirror of the 0xC9 audio datagram).
- NativeClient::send_mic(seq, pts_ns, opus) over a new outbound channel + worker task
  (mirror of send_input); C ABI punktfunk_connection_send_mic for native clients.

Host:
- audio::VirtualMic + PwMicSource: a PipeWire output stream tagged media.class=
  Audio/Source (Direction::Output) — a recordable microphone node, fed decoded PCM.
- MicService: host-lifetime owner of the source + Opus decoder (mirror of
  InjectorService / the audio capturer slot); lazily opened, persists across sessions,
  self-heals. The per-session datagram reader now demuxes 0xCB→mic / 0xC8→input over a
  single read_datagram loop (two loops would race).
- Adaptive jitter buffer in the producer: primes to ~3 consumer quanta before emitting,
  so the 5 ms push / N ms pull clock skew never underruns — without it ~58% of output
  was silence; with it, glitch-free across consumer quanta.

Client: punktfunk-client-rs --mic-test streams a synthetic 440 Hz Opus tone as the mic
uplink (opus dep added) for end-to-end validation without a real microphone.

Validated live on headless KWin: client tone → host source → pw-record shows the
punktfunk-mic Audio/Source node, 440 Hz dominant (Goertzel power 20.7 vs <0.001
elsewhere), RMS 0.179 ≈ the ideal 0.177, 0.3–0.4% silence at both 256 ms and 10 ms
consumer quanta. Tests +1 (mic datagram roundtrip); workspace green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:15:07 +00:00
enricobuehler a03aae891e fix(m3): persistent host-lifetime input injector — end the RemoteDesktop portal churn
ci / rust (push) Has been cancelled
Under rapid client reconnects, KWin's libei/EIS input setup intermittently wedged
with "EIS setup timed out", causing total input loss for affected sessions. Root
cause: each punktfunk/1 session opened (and tore down) its own RemoteDesktop-portal
CreateSession for pointer/keyboard injection, and back-to-back reconnects raced a
prior session's portal teardown before it settled.

LibeiInjector is only a Send channel handle to a worker thread that owns the portal
session, so the injector can live for the whole host run instead of per session.
Adds InjectorService: one host-lifetime thread owns the (!Send) injector, opened
ONCE (lazily, on the first event) and reused across every session — the portal grant
is established a single time and held. Sessions forward pointer/keyboard events to it
over a clonable Send channel; gamepads stay per-session (uinput, no portal). The
service self-heals — reopen after a 2s backoff if open fails or the backend worker
dies (covers a gamescope EIS socket that respawns with its nested session).

Mirrors the existing host-lifetime audio-capturer slot; the audio capturer is Send
(a slot works), the injector is !Send (needs the owning thread + channel).

Validated live on headless KWin: 8 rapid back-to-back input sessions →
"input injector ready (host-lifetime)" exactly once, ZERO "EIS setup timed out",
8/8 sessions injected input. Tests green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 21:31:11 +00:00
enricobuehler 6fdf7d1511 feat: client-selectable compositor (protocol → host → client → C ABI → mgmt → web)
A client can now request which compositor backend the host drives its virtual
output on (gamescope/KWin/Mutter/wlroots). The host honors the request if that
backend is available, else falls back to auto-detect and reports the resolved
choice back — wire-compatible both directions (no ABI bump).

Protocol (punktfunk-core):
- New CompositorPref (config.rs): Auto|Kwin|Wlroots|Mutter|Gamescope with
  u8/name mappings. Appended as one optional byte to Hello (client preference)
  and Welcome (host's resolved choice). Both decoders already tolerate trailing
  bytes, so old↔new interop is preserved — ABI_VERSION stays 2. Round-trip +
  back-compat (truncated-message) tests.
- C ABI: punktfunk_connect_ex(compositor) + PUNKTFUNK_COMPOSITOR_* constants;
  punktfunk_connect delegates with AUTO, so the existing symbol is unchanged.
  NativeClient::connect / worker_main thread the preference through.

Host:
- vdisplay::available() enumerates usable backends via cheap, side-effect-free
  probes (KWin zkde global, gamescope binary+version, GNOME/Sway env), plus
  Compositor id/label/as_pref/from_pref/all helpers.
- m3 handshake resolves the preference to a concrete backend during the
  handshake (pick_compositor pure + resolved logging), reports it in Welcome,
  and threads it into virtual_stream (replacing the unconditional detect()).
- mgmt GET /v1/compositors lists every backend with availability + the
  auto-detected default (OpenAPI regenerated).

Client:
- punktfunk-client-rs --compositor NAME; logs the host's resolved choice from
  the Welcome ("session offer … compositor=…").

Web console:
- Host page gains a Compositors card (availability + default badges) via the
  codegen'd useListCompositors hook; en/de strings added.

Also fixes a pre-existing, env-dependent test-isolation bug:
mgmt::tests::paired_clients_list_and_unpair seeded the real
~/.config/punktfunk/paired.json (AppState::new loads it), so a real
GameStream-paired client leaked into body[0] on a dev box — now cleared first.

Live-validated against headless KWin: --compositor kwin honored, --compositor
mutter falls back to kwin (available=[kwin, gamescope]), resolved choice
round-trips to the client. Tests: +6 (wire/back-compat, resolution precedence,
endpoint); workspace green, clippy/fmt clean, C ABI harness PASS at abi_version=2,
web typecheck + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:45:41 +02:00
enricobuehler 75eb8fa0d6 feat(host): KDE-reliability phase 2 — pipeline retry, graceful capture teardown, refresh reconcile
Hardens the virtual-display → capture → encode bring-up against the transient
failures that surfaced as black screens / wrong refresh on cold KDE sessions.

- m3: build_pipeline_with_retry wraps the initial vd.create() + first-frame with
  bounded exponential backoff (4 attempts, 500ms→2s). is_permanent_build_error
  classifies config/version/missing-tool failures so they fail fast instead of
  burning the retry budget. Encoder + frame clock now pace to the *achieved*
  refresh reported in VirtualOutput::preferred_mode, not the requested rate.
- capture/linux: PortalCapturer::Drop sends a pipewire channel quit and joins the
  thread, so a dropped/failed/retried capturer releases its PipeWire thread + EGL/
  CUDA context promptly instead of leaking it to process exit. First-frame timeout
  now reports the node id and distinguishes "format never negotiated" from
  "negotiated but no buffers arrived" via a negotiated flag set in param_changed.
- vdisplay/kwin: set_custom_refresh reads back the active mode from kscreen-doctor
  and returns the refresh KWin actually gave us (a rejected custom mode silently
  leaves the output at 60Hz); create() carries it into preferred_mode.
- vdisplay/gamescope: find_gamescope_node requires the Video/Source object (the
  node.name=gamescope tag is on two objects; the other wedges the link); a version
  check warns on <3.16.22 (the PipeWire-1.6 capture-deadlock signature).

Live-validated against headless KWin: 720p120 build with requested=120 achieved=120,
zero-copy CUDA frames, and no per-session thread accumulation across back-to-back
sessions. Tests: +3 unit (retry classifier, gamescope version parse); 49 host tests
green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:45:41 +02:00
enricobuehler 9fdc3c3246 feat(headless-kde): reliable bring-up — readiness probe, fix portal ordering/env (roadmap #1 phase 1)
ci / rust (push) Has been cancelled
Headless KDE startup was a chain of timing-sensitive handoffs gated by a blind `sleep 2`,
the dominant source of black screens. Phase-1 fixes:

- New `punktfunk-host probe-compositor` subcommand: exits 0 iff the detected compositor is
  up AND ready to create a virtual output now. KWin gets a real check (connect + registry
  roundtrip + the privileged zkde_screencast global must be advertised — what the backend
  needs); gamescope/Mutter/wlroots create on demand so the probe just confirms Linux.
  (vdisplay::probe dispatcher + kwin::probe; reuses kwin.rs's existing roundtrip path.)
- run-headless-kde.sh: replace `sleep 2` with an active readiness wait (poll probe-compositor
  until ready, 30s deadline, and bail with kwin's log if kwin_wayland exits during init).
  Move the portal restart to AFTER readiness, and precede it with `systemctl --user
  import-environment` + `dbus-update-activation-environment` (the missing env import — the
  Sway script does this; without it a restarted portal inherits a stale/empty WAYLAND_DISPLAY,
  which is the "streams but eats no input/audio" failure). kwin's stderr → a log file.

Validated: probe-compositor exits 0 "Kwin ready" against the live session, exit 1 with a
clear diagnostic when the compositor is absent. 114 tests green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 19:25:52 +00:00
enricobuehler ff4fe197be fix(punktfunk/1): adversarial-review fixes — SPAKE2 pairing, renegotiation hardening, +more
ci / rust (push) Has been cancelled
Triaged the multi-agent review of the renegotiation + pairing + Sway + AV1/surround batch
(1 critical, 11 major/minor confirmed). Fixes:

CRITICAL — PIN pairing was offline-brute-forceable. The HMAC-of-PIN proof let an active
MITM who terminates the TOFU ceremony recover the 4-digit PIN by offline dictionary search
(all other inputs observable) and forge a correctly-bound proof. Replaced with **SPAKE2**
(balanced PAKE, `spake2` crate) + key-confirmation MACs, binding both cert fingerprints as
the SPAKE2 identities: an attacker gets exactly ONE online guess, no offline search, and
mismatched cert views (a real MITM) never reach a shared key. Also reworked the UX to an
"arming PIN" — one PIN per arming window shown at host startup (the SPAKE2 client needs the
PIN to build its first message, so it can't be minted per-connection). Validated live:
wrong PIN rejected in 0.1s, right PIN pairs + persists + the paired identity streams.

Pairing hardening: `--allow-pairing`/`--require-pairing` must arm pairing (default rejects
unsolicited ceremonies); per-host cooldown bounds online guessing; the client flushes its
CONNECTION_CLOSE so a refused ceremony can't wedge the sequential host for the full timeout;
atomic (temp+rename) paired-store writes.

Protocol: control/pairing messages use a distinct CTL_MAGIC (PKFc) — fully disjoint from
the positional Hello namespace (a future abi_version can't be misparsed as a control
message); all typed decodes are length-exact. ABI_VERSION → 2 (punktfunk_connect signature
gained the identity params; header regenerated).

Renegotiation: drain the reconfig channel to the NEWEST mode (one rebuild, not one per
stale step); validate refresh_hz; build the new pipeline BEFORE dropping the old so a
rebuild failure keeps the session on its current mode instead of killing it.

GameStream: packetDuration snaps to {5,10} (an in-between value isn't a legal Opus frame
size and would kill audio). Sway: chooser file moved to $XDG_RUNTIME_DIR (was a fixed
world-writable /tmp path — DoS / capture-misdirection by another local user).

Swift: fixed two compile breakers in the new pairing/identity APIs (Int32 status .rawValue,
UInt cap cast). New SPAKE2 + namespace-disjointness + pairing-roundtrip unit tests; the
in-process pairing test now also exercises the arming PIN + cooldown. 114 tests green,
clippy -D warnings clean (both feature sets), fmt, C-ABI harness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 16:26:48 +00:00
enricobuehler 429bd1e6ac Merge branch 'worktree-agent-a6fe98c40d55fd284' into m1-lumen-core
# Conflicts:
#	CLAUDE.md
2026-06-10 15:42:48 +00:00
enricobuehler 4d26ac5c85 feat: punktfunk/1 — mid-stream mode renegotiation + PIN pairing ceremony
Renegotiation (no reconnect on resize): the handshake bi-stream stays open; the client
sends Reconfigure{mode} (typed post-handshake message), the host validates + acks
Reconfigured and rebuilds capture/encoder/virtual output at the new mode while the data
plane (keys, ports, FEC) runs untouched — the first new-mode AU is an IDR with in-band
parameter sets. NativeClient::request_mode / punktfunk_connection_request_mode; mode()
reflects the active mode. Validated live on KWin: one continuous stream, 225 frames
@1280x720 then 395 @1920x1080, ~90 ms pipeline rebuild (ffprobe shows both resolutions).

PIN pairing (mutual trust, kills TOFU MITM): clients get persistent self-signed
identities presented via QUIC client auth (generate_identity / client auth offered but
optional server-side — legacy clients still connect). Ceremony on the control stream:
PairRequest{name} → host shows a 4-digit PIN (log) + PairChallenge{salt} → client proves
with HMAC-SHA256(PIN‖salt, client_fp‖host_fp) — binding both certs means a MITM can't
forward a proof, single attempt per PIN, constant-time compare → PairResult; host
persists the fingerprint (~/.config/punktfunk/punktfunk1-paired.json), client pins the
host's. m3-host --require-pairing gates sessions on the paired set.
NativeClient::pair + punktfunk_pair/punktfunk_generate_identity in the ABI; reference
client: --pair PIN --name LABEL + auto-generated persistent identity, --remode for live
renegotiation testing. Swift wrapper: ClientIdentity/generateIdentity()/pair(),
requestMode()/currentMode(); README handoff updated.

Tested: reconfigure/pairing wire roundtrips, C-ABI mode switch ack, full in-process
ceremony (wrong PIN → Crypto, anonymous-vs-gate rejection, success → pinned session);
live wrong-PIN ceremony against the serving host (PIN logged, proof rejected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:42:29 +00:00
enricobuehler 3cc3c02b42 feat(gamestream): AV1 negotiation + 5.1/7.1 surround audio
Codec negotiation (M2 polish):
- ServerCodecModeSupport now advertises what we encode: H264|HEVC|AV1_MAIN8
  = 65793 (flags verified against moonlight-common-c Limelight.h). The old
  placeholder 3843 wrongly claimed HEVC Main10 + 4:4:4 and no AV1. Main10
  bits stay off on purpose: Moonlight ties 10-bit to HDR, and capture is
  8-bit SDR BGRx with no HDR metadata path (av1_nvenc -highbitdepth was
  validated working for later).
- RTSP ANNOUNCE: bitStreamFormat 0/1/2 -> H264/HEVC/AV1 (already plumbed to
  av1_nvenc; validated e2e via `m0 --codec av1` + ffprobe av01), and a
  dynamicRangeMode!=0 request now logs + falls back to 8-bit SDR.

Surround audio (M2 polish):
- ANNOUNCE x-nv-audio.surround.{numChannels,AudioQuality} +
  x-nv-aqos.packetDuration -> per-session AudioParams; DESCRIBE advertises
  all six Opus configs (normal before HQ per channel count). Normal-quality
  mappings are pre-rotated for the client's GFE-order LFE swap
  (RtspConnection.c, verified verbatim) so its derived decoder mapping
  equals our encoder mapping — including 7.1, where Sunshine's rotate only
  covers [3,6) and scrambles LFE/SL/SR.
- 5.1/7.1 encode via libopus multistream (audiopus_sys, the sys layer the
  opus crate already links) with Sunshine's layouts/bitrates, RAII wrapper;
  the live-validated stereo wire is byte-identical (plain Opus, no FEC).
- Surround sessions add Sunshine-style RS(4,2) audio FEC (packetType 127 +
  AUDIO_FEC_HEADER, the OpenFEC parity matrix both ends hardcode, nanors
  gemm semantics verified from nanors/rs.c).
- PipeWire capture generalized to the negotiated channel count with explicit
  FL FR FC LFE RL RR [SL SR] positions; missing sink channels are zero-
  filled by the channel-mixer. PwAudioCapturer now tears down cleanly on
  Drop (pipewire channel -> loop quit), so a channel-count change can
  reopen without leaking a capture stream.

Tests: serverinfo mask, RTSP codec/audio param parsing, DESCRIBE contents,
surround-params strings + client-swap round trip, FEC parity self-recovery
and packet layout, real-codec 5.1 channel-identity round trip, and an
ignored live test (ran green against a 6ch null sink monitor).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:41:15 +00:00
enricobuehler 7381ba8218 feat(vdisplay): wlroots/Sway backend — swaymsg headless output + xdpw chooser
The fourth VirtualDisplay backend: `swaymsg create_output` adds a HEADLESS-N
output (name found by diffing get_outputs), `output <NAME> mode --custom
WxH@HzHz` sets the client's exact mode (and the refresh clock a fresh headless
output needs to produce frames at all), and the PipeWire node comes from the
ScreenCast portal. Headless output selection is non-interactive via
xdg-desktop-portal-wlr's chooser hook: a managed config (chooser_type=simple,
chooser_cmd cats /tmp/punktfunk-xdpw-output; portal try-restarted when the
config changes) plus a per-session `Monitor: <NAME>` written to that file.
Teardown is RAII: drop ends the portal thread (zbus connection drop ends the
cast) then `swaymsg output <NAME> unplug`. swaymsg commands go after `--` so
tokens like `--custom` reach sway instead of swaymsg's getopt.

Validated live on headless sway 1.11 (gles2-on-NVIDIA, xdpw 0.8.1), zero-copy
dmabuf→CUDA on both runs: 720p60 257 frames p50 0.77 ms, 1080p60 480/480
frames p50 1.18 ms, output unplugged with the session both times. The
checked-in xdpw.config sample now matches the managed config (the old
chooser_type=none/HEADLESS-1 form would pin capture to the wrong output).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:23:28 +00:00
enricobuehler bfd64ce871 rename: lumen → punktfunk, everywhere
ci / rust (push) Has been cancelled
Full project rename, decided 2026-06-10:
- Crates/binaries: punktfunk-core / punktfunk-host / punktfunk-client-rs.
- C ABI: punktfunk_* symbols, Punktfunk* types, include/punktfunk_core.h,
  PUNKTFUNK_FEATURE_QUIC guard (header regenerated; cbindgen renames updated, incl.
  PUNKTFUNK_BTN_*/PUNKTFUNK_AXIS_* wire constants).
- Protocol: punktfunk/1 — control-plane magic LMN1 → PKF1, nonce salt lmn1 → pkf1.
  WIRE BREAK: clients must be rebuilt from this revision.
- Env knobs: PUNKTFUNK_VIDEO_SOURCE / PUNKTFUNK_COMPOSITOR / PUNKTFUNK_ZEROCOPY / ….
- Host config dir: ~/.config/punktfunk (the box's dir was migrated in place — the
  persistent identity is unchanged, pinned fingerprints stay valid).
- Swift package: PunktfunkKit + PunktfunkCore.xcframework + PunktfunkConnection
  (Sources/PunktfunkClient app + tests renamed with it); build-xcframework.sh updated.
- scripts/: 60-punktfunk.rules, punktfunk-host.service; OpenAPI doc regenerated.

Also: scripts/headless/run-headless-kde.sh — full headless Plasma bringup. Root cause of
"desktop but no apps/settings" over the stream: plasmashell launched without
XDG_MENU_PREFIX=plasma-, so the launcher resolved a nonexistent applications.menu and
rendered an empty menu. The script sets the complete KDE session env (menu prefix,
KDE_FULL_SESSION, session version) and rebuilds ksycoca before starting plasmashell.

Gate: 97/97 tests, clippy -D warnings (both feature sets), fmt, C-ABI harness PASS,
zero lumen references left outside .git.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:11:59 +00:00
enricobuehler bf8a974e8b feat: M4 stage 1 — the SwiftUI client is real: compiles, tested, first light on glass
ci / rust (push) Has been cancelled
The clients/apple scaffold is now a working macOS client, validated live against this
repo's host across the LAN: gamescope virtual output → NVENC HEVC → lumen/1 (GF(2¹⁶) FEC +
AES-GCM over UDP, QUIC control) → VideoToolbox → AVSampleBufferDisplayLayer at 720p60,
mouse/keyboard flowing back as QUIC datagrams into the host's gamescope EIS injector
(~3.7k events injected in one session).

LumenKit:
- LumenConnection: the predicted cbindgen compile fixes (C17 header spells the typedefs as
  integers while the enum constants import as a distinct Swift type — bridge by rawValue);
  close() is now safe from any thread (a close flag + pumpLock held across the blocking
  poll enforce the C contract "never close with a next_au in flight"; flag prevents
  lock-starvation by back-to-back polls).
- StreamView: per-pump cancellation token (reconnects can't double-pump), flush + re-gate
  on the next in-band parameter sets when the layer fails, no stale enqueue after restart.
- InputCapture: fractional-delta accumulation (sub-pixel motion isn't truncated away),
  pressed-state tracking with release-all on focus loss and stop() (nothing sticks down
  host-side), global-singleton ownership guard (GC has one handler slot per process),
  X1/X2 buttons, horizontal scroll, full keypad/CapsLock/ISO-102nd/PrintScreen/Menu VKs.
- LumenClient app shell (swift run LumenClient): connect form, fps/Mb-s HUD,
  LUMEN_AUTOCONNECT/LUMEN_MODE for scripted first-light runs.
- Tests: Annex-B byte-level units; real-codec round trip (VTCompressionSession-encoded
  HEVC rebuilt as the host's wire shape → AnnexB → VTDecompressionSession → pixels);
  test-loopback.sh (Swift client vs a real local m3-host over loopback — the Swift twin of
  c_abi_connection_roundtrip); RemoteFirstLightTests (full pipeline over the LAN).

Host/build fixes that fell out:
- The workspace builds on non-Linux again: gamestream audio (opus) and sendmmsg batching
  are now platform-gated with stubs/fallback, per the crate's "compiles everywhere" rule.
- Horizontal scroll was inverted end-to-end: the injectors negated BOTH axes onto the
  ei/wl axes, but GameStream's horizontal convention is positive = right
  (moonlight-qt/Sunshine pass it through unnegated) — only vertical flips now. This also
  un-inverts real Moonlight clients.
- AnnexB drops all zeros preceding a start code (trailing_zero_8bits padding), ffmpeg's
  policy, instead of leaking them into the preceding NAL.
- build-xcframework.sh: deployment targets pinned to the package floor + an otool guard —
  cargo does not fingerprint MACOSX_DEPLOYMENT_TARGET, so warm caches can silently ship
  too-new minos objects.

Adversarially reviewed (5-dimension multi-agent pass, every finding refutation-verified):
14 confirmed findings, all fixed above; the send-while-polling core-contract gap flagged
here is closed by the lumen/1 session-planes work (&self pulls + per-plane borrow slots).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:46:45 +02:00
enricobuehler 520d7342dd feat: M3 — full lumen/1 session planes: audio, gamepads+rumble, pinned trust, persistent listener
ci / rust (push) Has been cancelled
m3-host is now a real host, not a one-shot demo. Everything validated live on this box
(two back-to-back sessions, pinned + TOFU, ~200 audio pkts/s, p50 0.84 ms at 720p60).

lumen-core:
- quic.rs: QUIC-datagram side planes demuxed by first byte — Opus audio 0xC9
  ([magic][u32 seq][u64 pts_ns][opus], host→client) and rumble 0xCA ([magic][pad][low][high]).
- Trust: endpoint::server_with_identity (persistent PEM identity) and
  endpoint::client_pinned — SHA-256 cert-fingerprint pinning with TOFU (observed
  fingerprint reported back for persisting). The verifier checks the TLS 1.3
  CertificateVerify signature for real (an MITM replaying the host's public cert without
  its key is rejected; cert pinning alone would not prove key possession).
- client.rs: NativeClient gains pin + host_fingerprint, audio/rumble receivers
  (next_audio / next_rumble); pull methods take &self so the C ABI's per-plane threads
  never alias a &mut (per-plane mutexed borrow slots in abi.rs).
- abi.rs: lumen_connect(pin_sha256, observed_sha256_out) + lumen_connection_next_audio /
  next_rumble. input.rs: documented gamepad wire contract (GameStream buttonFlags bits,
  XInput axis conventions, +y = up) — exported as LUMEN_BTN_*/LUMEN_AXIS_* (bare BTN_*
  collides with <linux/input-event-codes.h> at different values).

lumen-host (m3):
- Persistent accept loop: sessions back to back on one endpoint (--max-sessions, 0 =
  forever); per-session failures log and the loop keeps serving; 10 s handshake deadline
  so a silent client can't wedge the sequential accept queue; teardown on every exit path
  (stop flag → conn.close → join audio+input threads).
- Audio plane: desktop PipeWire capture → Opus 48 kHz stereo 5 ms CBR → datagrams; ONE
  capturer reused across sessions via an AudioCapSlot (PipeWire streams have no cheap
  teardown — per-session opens would leak a thread + core connection + live node each).
- Gamepad routing: incremental GamepadButton/GamepadAxis datagrams accumulate into
  per-pad state feeding the uinput xpad manager; force feedback returns as rumble
  datagrams, with current state re-sent every 500 ms (idempotent-state healing for the
  lossy channel). QUIC endpoint serves the persistent ~/.config/lumen identity and logs
  the pinnable fingerprint.

lumen-client-rs: --pin (malformed values abort — never silently downgrade to TOFU),
TOFU fingerprint logging, audio/rumble datagram counters, gamepad events in --input-test.

clients/apple: scaffold synced — pinSHA256/hostFingerprint (wrong-size pin throws,
fail-closed), nextAudio/nextRumble, gamepad event constructors; README handoff updated
(persistent listener, audio decode notes, trust UX).

Adversarially reviewed (5-dimension multi-agent pass over the diff, 2-skeptic
verification): fixed the MITM signature-check gap, a Y-axis contract inversion, header
macro collisions, ABI aliasing UB, the PipeWire per-session leak, the missing handshake
deadline, fail-open pin parsing, and teardown-on-error paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 12:26:18 +00:00
enricobuehler 3ea096ace9 feat: M4 groundwork — lumen/1 client connector in the C ABI + SwiftUI client scaffold
ci / rust (push) Has been cancelled
The shared-core architecture pays off: platform clients now link ONE Rust library that
does the entire lumen/1 protocol, and only add decode/present/input on top.

lumen-core:
- client.rs (quic feature): NativeClient — QUIC handshake + UDP data plane + input
  datagrams on internal threads; embedder surface = connect / next_frame / send_input.
- abi.rs: lumen_connect / lumen_connection_next_au (borrow-until-next-call, matching
  lumen_client_poll_frame semantics) / lumen_connection_send_input / lumen_connection_mode /
  lumen_connection_close. Guarded in the generated header by LUMEN_FEATURE_QUIC (cbindgen
  [defines] mapping), so the checked-in header is stable across feature sets.
- error.rs: append-only LumenStatus additions Timeout (-9) and Closed (-10).
- TESTED end-to-end through the C ABI: in-process lumen/1 host, lumen_connect pulls 25
  byte-verified frames, sends input, closes (m3.rs::c_abi_connection_roundtrip).

Apple client (clients/apple — SCAFFOLD, written on Linux, first Xcode build pending):
- scripts/build-xcframework.sh: cargo per Apple target → universal staticlib + header
  (LUMEN_FEATURE_QUIC pre-defined) + modulemap → LumenCore.xcframework.
- Package.swift (LumenKit) + Swift sources: LumenConnection (ABI wrapper), AnnexB
  (in-band VPS/SPS/PPS → CMVideoFormatDescription, Annex-B → AVCC CMSampleBuffers with
  DisplayImmediately), StreamView (SwiftUI over AVSampleBufferDisplayLayer — stage-1
  presenter that hardware-decodes compressed HEVC itself), InputCapture (GCMouse raw
  deltas + GCKeyboard HID→VK).
- README.md is the full handoff for the next (Mac-side) agent: build steps, ABI contract,
  first-light test recipe against the Linux host, stage-2 (VT+Metal pacing) plan, and the
  known host-side gaps (single-session m3-host, no lumen/1 audio yet, gamepad kinds not
  yet routed in m3's injector, seed-stage trust).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:28:41 +00:00
enricobuehler 68f2b19cca Merge main (management REST API) into m1-lumen-core
ci / rust (push) Has been cancelled
Resolutions: serve() keeps main's AppState::new() with our persisted-pairing load folded
into it; main.rs keeps both the m3 and mgmt modules; mgmt's test LaunchSessions gain the
new appid field; Cargo.lock re-resolved. Full gate green (92 tests, clippy, fmt).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:03:41 +00:00
enricobuehler 5b0d84acd0 feat: M3 — lumen/1 native streaming: real video at client mode + input over QUIC datagrams
The native protocol now does the real thing, end to end:

- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
  exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
  NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
  spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
  capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
  --out, --input-test (scripted mouse/keyboard datagrams).

VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.

Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 06:56:47 +00:00
enricobuehler de3123038f feat: M3 seed — the lumen/1 native protocol: QUIC control plane + reference client (Phase 5)
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.

- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
  Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
  data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
  shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
  (self-signed server; accepts-any client — pinning lands with the trust model) and framed
  async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
  (no async on the frame path — design invariant) streams deterministic 64KB test frames
  through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
  up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.

VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:33:40 +00:00
enricobuehler 1eeb35a723 feat: M2 — host productionization: app catalog, persistent pairing, quit semantics, systemd (Phase 4)
- gamestream/apps.rs: an app catalog (loaded from ~/.config/lumen/apps.json, with defaults:
  Desktop + gamescope entries when gamescope/steam/vkcube are installed). /applist renders
  it; /launch?appid=N selects the entry; RTSP PLAY resolves it and the stream honors the
  app's compositor + nested command — so a Moonlight client picks "Steam" and gets a
  gamescope session at its native resolution, or "Desktop" for the KWin/GNOME desktop.
- Persistent pairing: the paired-client cert allow-list now survives restarts
  (~/.config/lumen/paired.json), saved on each successful pairing, loaded at boot.
- Quit semantics: /cancel now actually stops the media threads (streaming/audio flags),
  tearing down the per-session virtual output / gamescope process via the capturer's RAII.
- scripts/lumen-host.service (systemd user unit) + scripts/host.env.example (config file
  consumed by it) — the host runs as a managed service instead of an SSH shell.

Smoke-tested: serve boots, /applist serves the catalog (Desktop + vkcube gamescope entry
auto-detected on this box). GNOME backend validation still pending gnome-shell install;
wlroots vdisplay backend deliberately deferred (not in the priority compositor trio).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:23:53 +00:00
enricobuehler 826da9968e feat: M2 — Vulkan bridge: TRUE zero-copy for gamescope's LINEAR dmabufs (Phase 3)
The missing zero-copy path is closed. NVIDIA's EGL won't sample LINEAR and the CUDA driver
rejects raw dmabuf fds — but Vulkan imports dmabufs (VK_EXT_external_memory_dma_buf) and
exports OPAQUE_FD memory that CUDA officially imports. zerocopy/vulkan.rs (ash):

  dmabuf fd → VkBuffer (import cached per fd) → vkCmdCopyBuffer (GPU) →
  exportable VkBuffer → vkGetMemoryFdKHR(OPAQUE_FD) → cuImportExternalMemory → CUdeviceptr

The exportable buffer + CUDA mapping are per-resolution; per frame it's one GPU buffer copy
(fence-waited) + one pitched CUDA copy into the encoder's pool. No CPU touches pixels.
EglImporter::import_linear now routes through the bridge (lazy init; any failure still falls
back to the CPU mmap path). cuda::ExternalDmabuf gained import_owned_fd for the
Vulkan-exported fd.

Validated live: gamescope 720p120 → "Vulkan→CUDA exportable staging buffer ready
size=3686400" (exactly 1280*720*4), full-rate 122.7 fps, decoded frame pixel-correct
(vkcube). KWin's tiled EGL path regression-tested intact. NV12 negotiation dropped — moot
now that BGRx is fully zero-copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:18:38 +00:00
enricobuehler c39615d7d1 perf: M2 — split the data plane into encode | send threads with batched, paced sends (Phase 2)
stream_body no longer sends: each frame's packet batch goes over a depth-2 bounded queue to a
dedicated send thread, so a send spike can never stall capture/encode (a full queue drops the
NEWEST batch — FEC/RFI covers the client — rather than ever blocking). The sender ships packets
with sendmmsg (≤64/syscall: ~375 syscalls/s instead of ~24k at 5K@240) in 16-packet chunks
paced across ~3/4 of the frame interval — microburst shaping for real links without per-packet
sleep jitter. Client-gone detection moved to the sender (clears `running`); the LUMEN_VIDEO_DROP
FEC test knob moved with the send path. Loopback-tested: batches arrive complete and
byte-identical through the paced sendmmsg path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:13:34 +00:00
enricobuehler 6521146abc feat: M2 — gamepad input: uinput virtual X-Box pads + rumble back-channel (Phase 1)
Full controller path for the SteamOS-like session, mirroring Sunshine byte-for-byte
(wire formats verified against moonlight-common-c + Sunshine source; ioctl numbers and
struct layouts verified by compiling against this box's <linux/uinput.h>, locked in with
const asserts):

- gamestream/gamepad.rs: decode MULTI_CONTROLLER (magic 0x0C, mixed BE-size/LE-body) incl.
  the Sunshine buttonFlags2 extension (paddles/touchpad/Misc — our appversion already
  advertises Sunshine, so clients send it) and CONTROLLER_ARRIVAL (0x55000004); build the
  0x010B rumble plaintext (with the mandatory 4-byte filler). Unit-tested.
- inject/gamepad.rs: VirtualPad clones the kernel xpad identity ("Microsoft X-Box 360 pad",
  045e:028e, exact button/axis codes + absinfo) so SDL/Steam/Proton match their built-in
  mapping with zero config. GamepadManager creates/destroys pads from activeGamepadMask
  (hotplug), emits button transitions + axes (+Y-up → evdev +Y-down negation, D-pad as
  HAT0X/Y) per frame. Rumble: non-blocking FF pump answers UI_BEGIN/END_FF_UPLOAD/ERASE
  (games block in EVIOCSFF until answered), tracks effects with replay expiry + FF_GAIN,
  mixes to (low, high) motor levels, dedups.
- control.rs: channel_limit 8 → 0x30 — Moonlight sends gamepad input on ENet channel
  0x10+n, so the old limit silently discarded ALL controller input. Gamepad events route to
  the manager; rumble is sealed with the client's detected GCM scheme direction-flipped
  (V2 marker 'H?', own seq counter) and sent on the control peer every service tick.
- scripts/60-lumen.rules: udev rule (Sunshine-style) granting the input group /dev/uinput.

Live validation needs the udev rule installed (root-only /dev/uinput on this box) + a
Moonlight client with a controller; everything else is gated and unit/static-checked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:09:16 +00:00
enricobuehler be18a782b1 feat: M2 — GNOME/Mutter virtual-display backend (RecordVirtual) + preferred-mode negotiation
Third compositor on the VirtualDisplay seam, via Mutter's direct D-Bus APIs (the
gnome-remote-desktop headless model, no portal grant): RemoteDesktop.CreateSession →
ScreenCast.CreateSession({remote-desktop-session-id}) → Session.RecordVirtual (creates a
virtual monitor) → Start → PipeWireStreamAdded(node_id). A keepalive thread owns the zbus
connection (sessions die with it — RAII teardown); select with LUMEN_COMPOSITOR=mutter or
XDG_CURRENT_DESKTOP=GNOME.

Mutter sizes its virtual monitor FROM the PipeWire format negotiation, so VirtualOutput
gains preferred_mode (w, h, refresh_hz), threaded into the consumer's format pods as the
default size/framerate. KWin/gamescope set it too (their outputs are already exact-size;
the preference just confirms it) — both regression-tested intact.

Compile/clippy/test clean; live validation needs gnome-shell installed (then
`gnome-shell --headless` + m0 --source kwin-virtual with LUMEN_COMPOSITOR=mutter).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:48:53 +00:00
enricobuehler 751789f932 feat: M2 — LINEAR-dmabuf CUDA import attempt + graceful zero-copy fallback (gamescope)
gamescope only offers LINEAR dmabufs, which the EGL/GL interop path can't handle (NVIDIA's
EGL lists no LINEAR modifier for sampling). Attempt a direct CUDA external-memory import
(cuImportExternalMemory OPAQUE_FD, cached per buffer fd, one DtoD copy per frame into the
pooled buffer): the FFI + plumbing are in place, and LINEAR(0) is now advertised alongside
the tiled EGL modifiers (tiled first, so KWin still prefers it — regression-tested).

Empirically the 595 desktop driver rejects raw dmabuf fds as OPAQUE_FD (CUDA_ERROR_UNKNOWN),
matching the documented limitation — true LINEAR GPU import needs a Vulkan interop bridge
(import dmabuf via VK_EXT_external_memory_dma_buf, GPU-copy into an exportable allocation,
hand that to CUDA), noted as future work. So the importer now degrades instead of dying:
on GPU-import failure it logs once, disables itself, and falls through to the CPU mmap path.
Validated: gamescope + LUMEN_ZEROCOPY=1 runs full-rate (122.9 fps @720p120, valid HEVC) via
the fallback; KWin keeps real zero-copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:43:35 +00:00
enricobuehler 7f3897e0d3 feat: M2 — gamescope input via its EIS socket (SteamOS-like input path)
gamescope runs its own EIS server and exports the socket to its children as LIBEI_SOCKET —
no portal involved. The gamescope backend now launches the nested app through a tiny shell
wrapper that relays that value to /tmp/lumen-gamescope-ei; the libei injector gains an
EiSource enum (Portal | SocketPathFile) and connects a UnixStream directly to gamescope's
socket (polling until the app has started), then runs the identical reis sender flow.
Backend::GamescopeEi is auto-selected when LUMEN_COMPOSITOR=gamescope
(LUMEN_INPUT_BACKEND=gamescope overrides).

Validated end-to-end: input-test against a headless gamescope running xev — 129
MotionNotify/KeyPress/ButtonPress events delivered into the nested X app ("Gamescope
Virtual Input" device bound, sender handshake + emulation working).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:34:48 +00:00
enricobuehler c8f9032dec feat: M2 — harden gamescope capture path (blocked on gamescope ≥3.16.22 upstream fix)
Deep investigation (gdb + daemon traces) proved the gamescope capture stall is a gamescope
3.16.20 bug, not ours: it calls pw_loop_iterate() without pw_loop_enter()/leave(), and under
PipeWire 1.6's loop locking its main thread permanently holds the loop mutex — the pw thread
deadlocks, gamescope never acks the daemon's port_set_param(Format), and the link parks in
"negotiating" silently. Stock gst pipewiresrc fails identically. Fixed upstream by gamescope
commit e3ed1ea7 ("pipewire: Fix pipewire loop locking", pipewire#5148); first release 3.16.22.
Ubuntu 26.04 ships 3.16.20 (built ten days before the fix) — patch/upgrade required.

Consumer-side improvements from the investigation (all verified correct vs gamescope's pods,
and needed once the producer is fixed):
- discover the node from gamescope's own "stream available on node ID: N" log line (its
  node.name appears on two objects; the advertised id is authoritative); pw-dump fallback
- CPU path accepts mappable dmabufs: Buffers param now offers MemPtr|MemFd|DmaBuf (gamescope
  counter-offers exactly DmaBuf when its modifier pod wins, never MemPtr), mmap the fd
  ourselves when MAP_BUFFERS didn't (Vulkan-exported dmabufs aren't flagged mappable), and
  treat chunk.size==0 as the computed span
- warn_once on every silent frame-drop path in the process callback
- node.dont-reconnect on our capture streams: an orphaned stream re-targeted by wireplumber
  onto a fresh node wedges it — and a stuck link head-blocks the daemon's shared work queue,
  stalling ALL new link negotiation system-wide (this poisoned whole test sessions)
- LUMEN_GAMESCOPE_NODE (attach to an existing gamescope) + LUMEN_PW_FIXED_POD (negotiation
  bisection) debug knobs

KWin path regression-tested (zero-copy intact). gamescope end-to-end validation pending the
patched gamescope build.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:16:53 +00:00
enricobuehler bd25f5e02f fix: M2 — harden the management API after adversarial review
ci / rust (push) Has been cancelled
Five confirmed findings from a 46-agent review panel:
- Empty --mgmt-token no longer satisfies the non-loopback token gate
  (critical: 'Bearer ' with an empty token authenticated; parse_serve now
  bails on blank tokens and mgmt::run treats blank as none)
- axum's built-in body rejections (400/415/422) now wear the documented
  ApiError envelope via an ApiJson extractor, and the spec documents them
- GET /health carries security([{}]) in the spec, matching the server's
  auth exemption
- unpairClient's description no longer claims revocation the TLS layer
  doesn't enforce yet (gamestream/tls.rs accepts any cert — known gap)
- CLAUDE.md/README.md no longer reference the deleted web.rs

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:00:22 +00:00
enricobuehler a339a0466e feat: M2 — management REST API with OpenAPI doc (control-pane groundwork)
A versioned control-plane REST API (/api/v1) on its own port (default
127.0.0.1:47990) serving host info, runtime status, paired-client
management, the pairing PIN flow, and session control (stop / force-IDR).
The OpenAPI 3.1 document is generated from the handlers by utoipa, served
live at /api/v1/openapi.json (+ Scalar docs at /api/docs), printable via
`lumen-host openapi`, and checked in at docs/api/openapi.json for client
codegen — a test fails if it drifts, mirroring the cbindgen header rule.

Auth: optional bearer token (--mgmt-token / LUMEN_MGMT_TOKEN), enforced on
everything but /health, and mandatory for non-loopback binds. PinGate
gains a waiter count so the API can report pin_pending; logs moved to
stderr so stdout stays machine-readable. Supersedes the web.rs stub.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 21:35:43 +00:00
enricobuehler 20bd76ae50 feat: M2 — gamescope virtual-display backend (spawn headless, capture its PipeWire node)
Third compositor on the VirtualDisplay seam. gamescope's model differs from KWin/Mutter: it's
not a runtime protocol but a micro-compositor we spawn — `gamescope --backend headless -W -H -r
-- <app>` — which composites at the client's size AND refresh natively (so no separate
refresh step), runs the app nested, and exports a built-in PipeWire node named "gamescope".
The backend spawns it, discovers that node via pw-dump, and returns a VirtualOutput whose
keepalive owns the process (drop = kill = teardown). App via LUMEN_GAMESCOPE_APP. Select with
LUMEN_COMPOSITOR=gamescope; m0's virtual source now honors LUMEN_COMPOSITOR so any backend is
testable without a client. Input (gamescope's libei/EIS socket) is a follow-up.

Builds/clippy/fmt clean. Needs gamescope installed to validate; headless capture on the
proprietary NVIDIA driver is plausible-by-architecture but unproven — validate empirically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 21:23:52 +00:00
enricobuehler 22a982a1cb feat: M2 — drive KWin virtual output above 60Hz (custom mode at the client's refresh)
KWin creates virtual outputs at a hardcoded 60 Hz and zkde stream_virtual_output has no
refresh argument, so the *source* composited at 60 Hz even when the client asked for 120/240
(confirmed live: stream paced a stable 240 fps but only ~60 unique frames/s). KWin 6.6+ allows
custom modes on virtual outputs, so after creating the output we install + select a mode at the
client's refresh, before capture connects PipeWire. First cut shells out to kscreen-doctor
(output is "Virtual-<name>"); the in-process kde_output_management_v2 client is a follow-up.
Best-effort — failure leaves the source at 60 Hz (stream still works). Verified the mode is
applied (Virtual-lumen -> 1280x720@120). Empirically de-risked that this headless QEMU VM's
software vsync accepts >60 Hz.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 21:14:16 +00:00
enricobuehler 37ae26b4be perf: M2 — auto 2-way NVENC split-encode for high pixel rates (5K@240)
GB203 has two NVENC engines. A single HEVC p1 session tops out ~1 Gpix/s, so 5120x1440@240
(1.77 Gpix/s) is encoder-bound on one engine; split-frame encode runs it across both (~1.8x,
latency-neutral, output is standard HEVC the client decodes normally). NVENC's AUTO split
won't engage below ~2112px height, so force split_encode_mode=2 when the pixel rate exceeds
~1 Gpix/s (HEVC/AV1 only — not H.264). Below that (e.g. 5K@120) stay single-engine to avoid
the ~2% BD-rate cost. Override with LUMEN_SPLIT_ENCODE. Verified: engages at 240, not at 120.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:42:41 +00:00
enricobuehler a473f4a926 perf: M2 — amortize per-frame zero-copy overhead (pool buffers + register once)
The zero-copy import did real per-frame GPU churn that capped high-fps throughput: a fresh
~29MB cuMemAllocPitch + cuMemFree, a cuGraphicsGLRegisterImage/unregister, and a map of the
*same* persistent blit texture — every frame. Two fixes:

- BufferPool: a recycled free-list of pitched device buffers per resolution. DeviceBuffer
  returns its allocation to the pool on drop (after the encoder synchronized) instead of
  freeing — kills the per-frame 29MB alloc/free that took the device allocator lock and
  serialized against the GPU.
- RegisteredTexture: register the (reused) GL_RGBA8 blit destination with CUDA ONCE when the
  GlBlit is built; each frame only maps → copies the array → unmaps, instead of
  registering/unregistering every frame.

This is the "zero-copy should be overhead-free" cleanup. Verified the import still produces
correct frames; the remaining per-frame cuCtxSynchronize pair (shared-context coupling) is
the next step (CUDA stream + events). lumen-host builds, clippy/fmt/tests clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:38:26 +00:00
enricobuehler 0e1853e070 fix: M2 — eliminate the periodic high-res stream freeze (infinite GOP + single-deadline pacing)
At 5120x1440 the stream froze on a ~2s cadence. Two compounding causes (confirmed by a
profiling pass + adversarial review):

1. Periodic IDR every 2s (set_gop(fps*2)). A keyframe at 5K is ~20-40x a P-frame — a
   recurring multi-millisecond encode+packetize+send spike. Fix: infinite GOP (gop_size=-1),
   one IDR at stream start, P-frames only; forced-idr makes a client recovery request (RFI via
   request_keyframe) emit an IDR on demand — the Moonlight/Sunshine low-latency model.

2. Two pacing timers summing on the capture/encode thread: a per-packet thread::sleep pacer
   (spread a frame's packets across a whole frame interval) PLUS a backstop sleep on top, so
   every frame cost 1-2x the interval and the big IDR blew through it (the 2->120 oscillation).
   Fix: delete both; send at line rate and drive cadence from a single absolute deadline.
   (Proper microburst pacing belongs on a dedicated send thread — a follow-up.)

Also: honor the client's fps (pacing clamp 60->240) and add an env-gated (LUMEN_PERF)
per-stage timing log (enc/pkt/send µs + unique-vs-reencoded frames + max packet burst) for
diagnosing the remaining throughput ceiling. Verified live: freeze gone at 5120x1440.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 20:29:48 +00:00
enricobuehler 669d40ae21 build: migrate to ffmpeg-next 8 (FFmpeg 8.x / libavcodec 62)
Ubuntu 26.04 ships FFmpeg 8.0 (libavcodec 62); bump ffmpeg-next 7.1 -> 8.1 to bind it
as the intended pairing. No source changes needed — the encode API surface we use
(avcodec_send_frame, hwframe contexts, AV_PIX_FMT_CUDA, av_log) is stable across 7->8.
Workspace builds + all tests green; clippy/fmt clean. Refresh the 7.x doc references.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 18:13:40 +00:00
enricobuehler 7d08e43c16 feat: M2 — KWin virtual-output backend behind a VirtualDisplay trait (native client resolution)
Honor the client's requested resolution by rendering a compositor virtual output at
exactly that size — native, headless, no scaling. There is no cross-compositor Wayland
protocol for this, so it's a per-compositor backend behind the (previously stubbed)
VirtualDisplay trait.

- vdisplay.rs: VirtualDisplay::create(mode) now returns a live VirtualOutput
  { node_id, remote_fd: Option<OwnedFd>, keepalive } with RAII teardown (drop releases
  the output) instead of an inert OutputHandle + explicit destroy. Add compositor
  detect() (LUMEN_COMPOSITOR / XDG_CURRENT_DESKTOP).
- vdisplay/kwin.rs: the KWin backend — the zkde_screencast_unstable_v1 stream_virtual_output
  client (vendored protocol XML + wayland-scanner codegen). Creates a WxH output, returns
  its PipeWire node (default daemon, remote_fd=None); a keepalive thread holds the Wayland
  connection until dropped. (Moved here from capture/kwin.rs — it's a vdisplay backend, not
  capture.)
- capture: generalize the PipeWire consumer to Option<OwnedFd> (portal remote vs. default
  daemon) and add capture_virtual_output(vout), compositor-agnostic, owning the keepalive.
- gamestream/stream.rs: LUMEN_VIDEO_SOURCE=virtual creates a virtual display sized to the
  client's cfg and captures it (self-contained, not pooled — a reconnect at a new
  resolution gets a fresh output).
- m0: --source kwin-virtual goes through the trait.

Verified end-to-end against the running headless KWin: the request reaches the compositor
and is handled cleanly. Native creation needs a backend implementing createVirtualOutput —
the DRM backend, or the VirtualBackend since KWin 6.5.6; on this box's --virtual 6.4.5 it
returns "Could not find output" (expected; validates after the KWin upgrade). wlroots/Mutter
backends are the next ones to land on the same seam.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 17:30:02 +00:00
enricobuehler 6508980564 feat: M2 — validate client-requested video mode (codec dimension guards)
Clients pick the resolution via mode=WxHxFPS / RTSP clientViewportWd-Ht, so the
host must bound attacker/typo-controlled dimensions before allocating buffers or
opening NVENC. Add encode::validate_dimensions: reject zero, odd, and over-limit
modes (H.264 ≤ 4096px/side; HEVC/AV1 ≤ 8192) with a clear message instead of a
buffer-math overflow or an opaque NVENC open failure. Gate both the stream path
(before any allocation) and open_video (also covers m0). Unit-tested.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 16:40:56 +00:00
enricobuehler aa91485008 feat: M2 — complete zero-copy dmabuf→NVENC capture path (EGL/GL→CUDA)
The PipeWire dmabuf now reaches NVENC with no CPU touch. Verified live against
headless KWin: a tiled BGRx dmabuf is imported and encoded to a pixel-correct
H.265 stream (decoded frame matches the captured desktop — no tiling artifacts,
no colour swap). The CPU-copy path stays the default and the runtime fallback.

Capture side (zerocopy::egl): desktop NVIDIA can't register a dmabuf EGLImage
with CUDA directly (cuGraphicsEGLRegisterImage is Tegra-only; cuGraphicsGLRegisterImage
rejects EGLImage-backed textures), so we follow OBS/Sunshine — bind the EGLImage
to a GL texture, render it through a fullscreen-triangle shader into an immutable
GL_RGBA8 texture (de-tiling + .bgra swizzle to the BGRx the encoder wants), then
register that texture with CUDA and copy it device-to-device into an owned buffer
so the dmabuf returns to the compositor immediately.

Encode side (encode/linux::submit_cuda): take a *pooled* CUDA surface via
av_hwframe_get_buffer and device→device-copy our imported buffer into it, instead
of wrapping our own pointer in a bare AVFrame. A bare frame is rejected with
EINVAL (NVENC ignores frames with null buf[0]; the encode path's av_frame_ref
needs a refcounted buffer), and a fresh device pointer every frame would thrash
NVENC's bounded resource-registration cache — the pool recycles a small set.

Also: gate FFmpeg AV_LOG_DEBUG behind LUMEN_FFMPEG_DEBUG for diagnosing
hw-frame rejects, and refresh the now-accurate module docs.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 16:28:29 +00:00
enricobuehler e3876c0d8a feat: M2 zero-copy — PipeWire dmabuf negotiation + EGL device-platform import (WIP)
Wire the capture side of zero-copy (LUMEN_ZEROCOPY=1):

- EGL importer now opens the headless EGLDisplay on the NVIDIA EGL device
  (EGL_PLATFORM_DEVICE_EXT) and queries its importable DRM modifiers
  (eglQueryDmaBufModifiersEXT).
- The PipeWire stream advertises a BGRx dmabuf format with those modifiers as a
  mandatory enum Choice + a dmabuf-only Buffers param; the compositor fixates an
  importable tiled modifier. param_changed reads the negotiated modifier; the
  process callback imports the dmabuf (eglCreateImage with explicit LO/HI
  modifier) and would copy it into a CUDA buffer for the encoder.

Validated against headless KWin (Plasma 6.4): negotiation succeeds (13 NVIDIA
modifiers advertised, KWin fixates one, stream reaches Streaming with a real
tiled dmabuf) and `eglCreateImage` succeeds. The remaining blocker is
`cuGraphicsEGLRegisterImage` returning CUDA_ERROR_INVALID_VALUE on the
dmabuf-imported EGLImage — the likely fix is to bind the EGLImage to a GL
texture (glEGLImageTargetTexture2DOES) and register that via
cuGraphicsGLRegisterImage (OBS/Sunshine's path), which needs a GL context.

The CPU-copy path stays the default and is unaffected (regression-checked: real
KWin capture → HEVC). LUMEN_ZEROCOPY is opt-in/experimental until the CUDA
registration lands.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 15:41:31 +00:00
enricobuehler 16a00563a8 feat: M2 zero-copy foundation — EGL→CUDA import + NVENC CUDA-frame path
Scaffolding for dmabuf zero-copy (plan §9), opt-in via LUMEN_ZEROCOPY:

- src/zerocopy/{cuda,egl}.rs: hand-rolled CUDA Driver-API FFI (no Rust crate
  exposes the EGL-interop calls / CUeglFrame) with a shared process-wide
  CUcontext + pitched device buffers; an EGL importer (GBM platform on the
  NVIDIA render node) that turns a dmabuf into an EGLImage, registers it with
  CUDA, and copies it device-to-device into an owned buffer. `zerocopy-probe`
  subcommand validates the FFI/linking/GPU access — confirmed on the box
  (driver 595, EGL_EXT_image_dma_buf_import + modifiers).
- CapturedFrame gains a FramePayload enum (Cpu(Vec<u8>) | Cuda(DeviceBuffer));
  the encoder branches: CPU keeps the expand+upload path, CUDA wraps the device
  buffer in an AV_PIX_FMT_CUDA frame fed straight to hevc_nvenc (sharing our
  CUcontext via a hand-declared AVCUDADeviceContext, since ffmpeg-sys doesn't
  bind hwcontext_cuda.h). open_video/the encoder take a `cuda` flag derived from
  the first frame's payload.

The capture-side dmabuf negotiation (which produces the Cuda frames) is the
next step; the CPU path is unchanged and remains the default + fallback. Builds
clean, clippy clean, tests pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 15:13:05 +00:00
enricobuehler b64be1dc33 fix: m0 portal capture — activate the capturer so frames are delivered
The M2 teardown work added an `active` gate to the PipeWire capture callback
(idle by default so reconnects stay cheap, with the stream path calling
set_active(true) on PLAY). The `m0` subcommand was never updated, so its portal
capturer stayed inactive and the callback dropped every frame — `m0 --source
portal` failed with "no PipeWire frame within 10s" on every compositor. Call
set_active(true) before the capture loop.

Validated on headless KWin (Plasma 6.4) via the RemoteDesktop-anchored ScreenCast
session: real desktop frames flow (shm BGRx 1920x1080) and encode to valid H.265.

(Also folds in a rustfmt reflow of the input-test log line.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 14:24:09 +00:00
enricobuehler 0a79a8209b feat: M2 — RemoteDesktop-anchored ScreenCast capture for KWin/GNOME
On RemoteDesktop-capable desktops (KWin, GNOME), select the ScreenCast source on
a session created via the RemoteDesktop portal and start it through RemoteDesktop,
so a single grant — pre-authorized headlessly via the `kde-authorized` permission,
exactly like the libei input path — also covers screen capture. Standalone
ScreenCast has no such bypass and would raise an un-clickable dialog on a headless
box. wlroots/Sway has no RemoteDesktop portal, so it keeps the plain ScreenCast
session; the choice keys off inject::default_backend(). The PipeWire consumer is
unchanged — the anchored session yields the same fd + node id.

Validated on headless KWin (Plasma 6.4): the portal grants the session with no
dialog and PipeWire negotiates the format (1920x1080 BGRx, Streaming). Frame
delivery on KWin still pends dmabuf import — KWin hands GPU dmabuf buffers and the
M0 consumer is CPU-copy/shm only (plan §9, zero-copy) — so it's the next step;
the CPU-copy path remains correct on wlroots.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 14:09:19 +00:00
enricobuehler 03a6a67354 feat: M2 P1.7 — libei input backend (portable to KWin/GNOME)
Add a second input-injection backend that works on compositors implementing
the org.freedesktop.portal.RemoteDesktop interface (KWin, GNOME/Mutter), where
the wlroots virtual-input protocols are absent. Uses ashpd 0.13 to open a
RemoteDesktop session + EIS fd and reis 0.6.1 to drive it as an EI sender:
bind pointer/keyboard/scroll/button capabilities and, per device,
start_emulating → emit → frame. Runs on a dedicated thread with its own tokio
runtime (the portal session + EIS connection must stay alive and the event
stream must be polled continuously); open() returns immediately so a slow or
denied portal can never freeze the ENet control thread, with events enqueued
over an unbounded channel until devices resume.

Backend now auto-selects per session (inject::default_backend): wlr on Sway,
libei on KDE/GNOME; LUMEN_INPUT_BACKEND overrides. Refactor inject.rs into the
inject/{wlr,libei}.rs layout matching the capture/encode convention. Keyboard
codes are evdev (the same space our VK→evdev table produces) and the compositor
supplies the keymap, so no keymap upload and no modifier serialization — pressing
the modifier keys Moonlight sends is enough.

Add a `lumen-host input-test` subcommand that injects a scripted mouse+keyboard
pattern through the session backend, so input injection can be validated without
a Moonlight client.

Live-validated on headless KWin (Plasma 6.4): mouse motion, left click, and the
'A' key inject correctly and are delivered to the focused client.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 13:58:41 +00:00
enricobuehler 6de09fd822 feat: M2 teardown — persistent capturers for clean reconnects
Disconnect/reconnect now works reliably. Previously each stream spawned its own
portal+PipeWire (and PipeWire audio) capture threads and never stopped them, so a
reconnect opened a SECOND screencast session that conflicted with the leaked
first one ("no PipeWire frame within 10s" → black screen on reconnect).

- The screen capturer and audio capturer are now persistent, held in AppState and
  reused across streams (created on the first stream). One screencast session for
  the host's lifetime → no conflict, and instant reconnect (no re-handshake).
  Verified live: 3 stream cycles, 1 create + 2 "reusing capturer", clean every time.
- Capturer::set_active gates the (5K, ~1.3 GB/s) de-pad copy to active streams, so
  the persistent video capturer is nearly free while idle between streams.
- AudioCapturer::drain discards buffered chunks on reuse so the client never hears
  stale audio captured while idle.
- stream.rs / gamestream/audio.rs split into a borrow-the-capturer wrapper + the
  encode/send body, so the capturer is always returned to its slot on exit.

This holds whether the client reconnects via /resume (Moonlight's "running →
play/continue") or a fresh /launch — both re-run RTSP PLAY → a new stream cycle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 12:35:10 +00:00
enricobuehler af4360c930 feat: M2 P1.5 robustness — IDR-on-request, send pacing, min-parity floor
Graceful FEC behavior on a lossy link: at a realistic 2% packet loss the stream
is now steady 0% (was spiking 40-60%). Verified live.

- IDR/RFI handling: the control thread recognizes the client's recovery requests
  (0x0301 invalidate-reference-frames, 0x0302 request-IDR, 0x0305) and sets a
  shared force_idr flag; the video thread forces an NVENC keyframe on the next
  frame (Encoder::request_keyframe → input frame pict_type = I). Without this, a
  frame that exceeds the FEC budget broke the reference chain until the next GOP
  IDR (~2s), cascading to most of the stream being undecodable.
- Min-parity floor: honor the client's x-nv-vqos[0].fec.minRequiredFecPackets
  (it asks for 2). Small P-frames previously got m=ceil(k*20/100)=1 parity — a
  single loss broke them; flooring m>=2 (capped so k+m<=255, wire pct recomputed)
  protects them. This is what turned the 2% spikes into steady 0%.
- Send pacing: spread each frame's packets evenly across the frame interval
  instead of blasting them at line rate (a real link drops microbursts), matching
  Sunshine's rate-controlled sends; sub-500us sleeps skipped (unreliable).

Note: sustained ~8% uniform loss still degrades — that exceeds 20% FEC for
reference-frame video and real Sunshine degrades there too; real networks are
<1% or bursty, which this now handles cleanly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 12:14:59 +00:00
enricobuehler 72f8c05aa3 feat: M2 P1.5 (FEC) — nanors-exact Reed-Solomon recovery for the video stream
Moonlight now reconstructs lost video shards from our parity (verified live:
under induced packet loss the picture recovers cleanly instead of failing with
"network connection too bad"; 0% added loss in normal operation).

The decisive finding: Moonlight's nanors uses a CAUCHY generator matrix
(M[j][i] = inv[(m+i)^j], GF(2^8) poly 0x1d), while reed-solomon-erasure is
Vandermonde — so its parity was NOT Moonlight-decodable, despite the old
gf8.rs comment claiming equivalence.

lumen-core:
- Swap the GF(2^8) backend from reed-solomon-erasure to a vendored fec-rs
  (vendor/fec-rs, BSD-2), which builds the byte-identical Cauchy matrix. Pure
  Rust, no FFI — keeps the "one core" hot path. This makes both lumen's own
  protocol and the GameStream parity nanors-compatible.
- Lock it with a regression test against real nanors vectors
  (k=4,m=2 [10,20,30,40] -> parity [136,0]) + an independent matrix-derived
  cross-check + an erase/recover round-trip. Existing FEC/loopback tests stay
  green, so lumen's own protocol is unaffected.

lumen-host video.rs:
- Generate m = ceil(k*pct/100) parity shards per FEC block via Gf8Coder; stamp
  fecInfo with the recomputed wire pct (100*m/k) so the client derives the same
  count; cap per-block data to 255*100/(100+pct) so k+m <= 255.
- CRITICAL byte-exactness: RS runs over the whole `blocksize` shard (Moonlight
  decodes packetSize+16 bytes from the datagram start and PACKET_RECOVERY_FAILUREs
  on a bad reconstructed `flags` byte). So the NV header fields RS must reproduce
  (streamPacketIndex/frameIndex/flags/multiFec*) are written into data shards
  BEFORE encode, and only the transport fields (RTP header/seq/timestamp +
  fecInfo) are stamped AFTER — leaving the flags byte RS-covered. Matches
  Sunshine stream.cpp. Unit-tested incl. flags recovery.
- fec_percentage wired from stream.rs (Sunshine default 20, LUMEN_FEC_PCT
  override; 0 = data-only). LUMEN_VIDEO_DROP injects loss to test recovery.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 11:34:27 +00:00
enricobuehler 278a6330de feat: M2 P1.6 — audio (Opus + AES-CBC) and steady-rate video pacing
A stock Moonlight client now gets video + full input + AUDIO from the
from-scratch GameStream host (verified live end-to-end on a macOS client).

Audio (audio.rs, audio/linux.rs, gamestream/audio.rs):
- Capture the default PipeWire sink's monitor (system output) as interleaved
  f32 stereo @ 48kHz via stream.capture.sink, on its own thread.
- Opus-encode 5ms/240-sample stereo frames (RESTRICTED_LOWDELAY, CBR) and send
  as GameStream RTP audio: 12-byte BE RTP_PACKET (packetType 97, seq+1/pkt,
  timestamp += packetDuration, ssrc 0) on UDP 48000, after learning the client
  endpoint from its port-learning ping.
- Encrypt the Opus payload with AES-128-CBC (PKCS7), key = launch rikey, IV =
  BE32(rikeyid + seq) in [0..4]. Like the control stream, modern Moonlight
  always decrypts audio regardless of the negotiated flags — plaintext makes it
  log "Failed to decrypt audio packet" and play silence (diagnosed from the
  client log). RTP header stays in the clear. Scheme cross-checked against
  Sunshine stream.cpp/crypto.cpp + moonlight AudioStream.c.
- Pace each frame to its 5ms slot (PipeWire delivers ~1024-frame buffers) to
  avoid bursts the client's jitter buffer hears as glitches. LUMEN_AUDIO_GAIN
  applies optional linear gain for quiet sources.
- DESCRIBE SDP advertises the stereo Opus config (a=fmtp:97 surround-params).

Video (stream.rs): pace at a steady ≤60fps, re-encoding the last captured frame
when the compositor produces none. wlroots only emits on damage, so a static or
slow-updating desktop previously starved the client into a "network too slow"
abort; an unchanged frame costs a near-empty P-frame. Adds a non-blocking
Capturer::try_latest (portal drains to the freshest queued frame).

Misc: serialize pipewire init across the video + audio capture threads
(pwinit.rs, std::sync::Once) to avoid a concurrent pw_init race. Deps: opus,
cbc; libopus-dev in bootstrap-ubuntu.sh.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 10:39:22 +00:00
enricobuehler 4c2c41acba feat: M2 P1.4 — control-stream decryption + input injection (mouse/keyboard live)
A stock Moonlight client can now drive the headless Sway desktop: mouse
movement, buttons, scroll, and keyboard all inject through the streamed
session (verified live end-to-end — typing, clicking, window management).

Control stream (gamestream/control.rs):
- Moonlight encrypts the ENet control stream with AES-128-GCM even though we
  negotiate no media encryption (it detects our Sunshine `state` and turns it
  on). Decrypt per-packet under the /launch `rikey`.
- The exact GCM scheme is auto-detected on the first authenticating packet
  (nonce construction × key byte-order × tag position × AAD) since GCM gives no
  partial credit. Our client uses the legacy 16-byte nonce (`iv[0]=seq&0xff`)
  because we advertise no encryption; the 12-byte SS_ENC_CONTROL_V2 nonce is
  also supported. Key/IV/tag layout cross-checked against Sunshine stream.cpp +
  crypto.cpp and moonlight-common-c ControlStream.c.

Input decode (gamestream/input.rs):
- Decrypted control messages (`[u16 type][u16 len][NV_INPUT packet]`, type
  0x0206) decode into lumen_core::input::InputEvent: relative/abs mouse, buttons,
  vert/horiz scroll, keyboard down/up. Struct layout from moonlight Input.h
  (size BE, magic LE, body BE; keyCode LE masked to the low-byte VK), dispatch
  per Sunshine input.cpp (Gen5+). Unit-tested against real captured bytes.

Injection (inject.rs):
- WlrootsInjector: connects to Sway as a Wayland client and injects via the
  wlroots virtual-pointer + virtual-keyboard protocols (uinput is invisible to a
  compositor running WLR_LIBINPUT_NO_DEVICES=1). Uploads an evdev/US xkb keymap,
  tracks modifier state, and maps Windows VK → Linux evdev (full table).

Deps: aes-gcm, wayland-client, wayland-protocols-{wlr,misc}, xkbcommon (+
libxkbcommon-dev in bootstrap-ubuntu.sh).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 08:56:19 +00:00