Commit Graph

178 Commits

Author SHA1 Message Date
enricobuehler 7eb9a927cf feat(connector): expose host clock offset over the C ABI for glass-to-glass
ci / rust (push) Has been cancelled
Factor the client-side skew handshake into a shared core helper (quic::clock_sync
-> ClockSkew) so both the reference client and the embeddable connector use one
implementation. NativeClient now runs the handshake at connect (right after Start,
before the control task takes the stream) and stores the host-client offset; it's
read over the C ABI via punktfunk_connection_clock_offset_ns (i64 ns, host minus
client; 0 = no correction / old host).

This is the substrate the Apple client needs for the decode->present (glass-to-
glass) term: stamp present time, add the offset to express it in the host's
capture clock, subtract the AU pts_ns. client-rs drops its local clock_sync copy
and uses the shared helper (behavior unchanged; validated locally).

Regenerates include/punktfunk_core.h. Roadmap section 12 + status updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 11:44:54 +00:00
enricobuehler 05bc9ab22c feat(latency): wall-clock skew handshake for cross-machine latency measurement
ci / rust (push) Has been cancelled
ClockProbe/ClockEcho on the QUIC control stream — 8 NTP-style rounds right after
Start; the min-RTT sample gives the host-client clock offset (clock_offset_ns
estimator in punktfunk-core). The client adds the offset to its receive instant
before differencing against the AU pts_ns, so the capture->reassembled latency
percentiles are valid across machines (skew_corrected=true), not just same-host.
Back-compat: an old host that doesn't answer the probe times out and the client
falls back to a shared-clock assumption (skew_corrected=false).

Host adds one ClockProbe dispatch arm in the control task; the client runs
clock_sync after Start, before the --remode/--speed-test tasks take the stream.

Validated cross-LAN (GNOME box -> dev box): offset ~ -1.57 ms (reproducible),
rtt ~140 us, p50 1.30 ms skew-corrected capture->reassembled — the offset is
exactly the systematic error the handshake removes. Unit tests for the message
codecs and the min-RTT offset estimator.

Roadmap §12: skew handshake done; remaining for true glass-to-glass is the Apple
client present-stamp (decode->present) plus the host render->capture term.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 11:20:20 +00:00
enricobuehler 4fff4641bb feat(discovery): native-protocol LAN auto-discovery over mDNS
ci / rust (push) Has been cancelled
Both the unified host (serve --native) and standalone m3-host now advertise the
native punktfunk/1 service over mDNS (_punktfunk._udp) — the analogue of the
GameStream _nvstream._tcp advert. TXT records carry proto, the host cert
fingerprint (fp, the value clients pin), the pairing requirement
(pair=required|optional), and the host id. New crate::discovery module, wired
into m3::serve so both host entry points get it; best-effort, never blocks
streaming (--connect always works).

Client gains `punktfunk-client-rs --discover [SECS]`: browses the LAN and prints
each host (name, addr:port, pairing, fingerprint), then exits. Apple clients
browse the same service natively via NWBrowser (service type + TXT keys are the
contract).

Validated cross-LAN: the dev box discovered the GNOME-box appliance
(pair=required) and a standalone synthetic host (pair=optional); fingerprint and
pairing state correct in both.

Also refresh the now-stale sendmmsg caveat in the bitrate doc (batched/paced send
landed + validated to 1 Gbps) and mark the encode|send thread split done in §12.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 10:37:12 +00:00
enricobuehler b295a5b7a9 perf(latency): encode|send thread split on the native path
ci / rust (push) Has been cancelled
Bigger-bet #1 from the latency plan. virtual_stream ran capture+encode+seal+
paced-send on ONE thread, so frame N+1's capture/encode couldn't start until
frame N's entire paced tail had left the wire — the pacing budget (~0.9×interval)
was serialized in front of the next encode. Port GameStream's spawn_sender model
to the native path:

- A dedicated send thread (`send_loop`) owns the WHOLE Session (so no socket
  clone or shared/Arc stats needed — `seal_frame` mutates the nonce, `send_sealed`
  + the probe bursts all live there) and does FEC+seal + microburst-paced send.
- The encode thread captures+encodes + handles reconfig and hands each AU over a
  bounded sync_channel(3) as a FrameMsg (data, capture_ns, flags, deadline,
  encode_us). It BLOCKS on backpressure if the send falls behind — frames slow
  down rather than a dropped frame freezing the infinite-GOP stream (we don't
  drop). Clean shutdown: drop the channel → send thread drains/exits → join.
- Probes (run_probe_burst) move to the send thread since they need the Session; a
  burst naturally pauses video (the encode thread blocks on the full channel).
- Per-frame encode_us/pace_us histogram moved to the send thread (carries
  encode_us in the FrameMsg) and now reflects the overlap.

Removes the encode↔paced-tail serialization (~2-8 ms @60-120 fps), independent of
the pacing policy, no quality cost. Substrate for the future NVENC slice wrapper.

Verified live on this box (appliance restarted onto it): a client streamed the
KWin desktop (1.49 MB H.265, clean, no panic) and a 200 Mbps speed-test probe
completed through the send thread (0 drops). Build + clippy + fmt green.
Real-NIC sustained soak (reconfig under load, line-rate, mode switches) pending
the Ubuntu third host.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 09:42:16 +00:00
enricobuehler 86f463cf71 fix(housekeeping): unaligned read UB + recv-drop parity; dedup mmsghdr; doc fixes
ci / rust (push) Has been cancelled
From a bug-hunt + unsafe-audit pass (4 reviewers + adversarial verify). It
confirmed ZERO real bugs in the recent batched/paced data-plane work — these are
the surfaced cleanups + one genuine soundness fix:

- SOUNDNESS (reduce unsafe): inject/gamepad.rs::pump_ff did `ptr::read` of an
  InputEventRaw (align 8, holds a timeval) out of a 1-aligned [u8; N] buffer — UB
  per the reference (x86_64 tolerates it, but it can miscompile under LTO). Use
  ptr::read_unaligned + a SAFETY note. Zero behavior change.
- recv parity: recv_batch (recvmmsg) didn't drop an oversized/truncated datagram
  the way scalar recv does — poll_frame now skips a message whose len fills the
  buffer (> MAX_DATAGRAM_BYTES), matching recv's `n >= RECV_BUF` drop. (AEAD
  already rejected these on encrypted sessions; this restores the documented
  invariant on the batched path.)
- dedup unsafe FFI: factor the identical mmsghdr-from-iovec construction out of
  send_batch + recv_batch into one `mmsghdrs()` helper — the raw-pointer
  scaffolding + its lifetime SAFETY note now live in one place.
- docs: TARGET_SOCKBUF no longer calls paced sending future work (it landed,
  m3.rs::paced_submit); gamescope.rs input is no longer "(TODO)" (wired +
  live-validated); the PUNKTFUNK_PERF `wire_mbps` field is renamed `tx_mbps` and
  noted as attempted/sealed bytes (send_dropped shows what didn't reach the wire).

Full suite (35 + loopback round-trip + 6) + clippy + fmt green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 23:04:23 +00:00
enricobuehler 99f60b5b08 perf(latency): microburst-cap pacing + per-frame latency histogram
ci / rust (push) Has been cancelled
From the latency investigation: the freeze-fix pacing (paced_submit) was the
single biggest software-controllable latency term — it unconditionally spread
EVERY multi-chunk frame over ~90% of the frame interval, adding up to ~7.5 ms
@120 / ~15 ms @60 to a frame's last packet even when the frame was small or the
link idle. Recover that on the common case while keeping the freeze fix:

- Microburst-cap pacing: a frame whose sealed size is <= a cap (default 128 KB,
  PUNKTFUNK_PACE_BURST_KB) goes out in ONE immediate burst — no pacing latency.
  Only the OVERFLOW of a bigger frame (IDR / sustained high bitrate, the bursts
  that actually overran the tx buffer and froze) is spread. 128 KB is well under
  the ~150 Mbps@60 frame size where drops began, so the default is safe; raise it
  after confirming send_dropped stays 0 on a given link. Still never slower than
  unpaced (budget collapses to 0 with no slack). seal-once/in-order nonce
  preserved — chunks are split, never reordered or re-sealed.
- Per-frame instrumentation (PUNKTFUNK_PERF, zero-cost off): encode_us +
  pace_us (the pacing tail) p50/p99/max histograms + immediate-vs-paced frame
  counts in the periodic perf line, so the pacing tail is finally visible and the
  cap is tunable against real numbers.

Host builds + clippy + fmt green. NOT yet deployed to the running hosts (still on
the safe full-pacing A+B build) — needs the user's LAN soak to validate the cap
doesn't reintroduce send_dropped before raising it. Deferred bigger bets (need
real-NIC/GPU/Mac validation): encode|send thread split on the native path,
CUDA stream+event (one redundant sync), NVENC slice wrapper, stage-2 Apple
presenter, glass-to-glass probe — see docs/roadmap.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 22:53:52 +00:00
enricobuehler 2f4f92a804 feat(1gbps): batched client recv via recvmmsg (increment C)
ci / rust (push) Has been cancelled
Final increment of the 1 Gbps data-plane rework — the recv counterpart of the
sendmmsg work. The client recv path did one recvfrom + one Vec allocation per
packet (and the pump's 300µs idle sleep could let packets pile up at line rate).

- Transport gains recv_batch(&mut [Vec<u8>], &mut [usize]) -> count; default is
  a single scalar recv into out[0] (loopback + non-Linux).
- UdpTransport overrides it on Linux with recvmmsg (MSG_DONTWAIT) draining up to
  N datagrams per syscall into the caller's reused buffers — no per-packet alloc.
- Session::poll_frame owns a lazily-allocated recv ring (RECV_BATCH=32) and
  consumes it one packet at a time across calls, refilling with one recvmmsg when
  drained. Encapsulated: the punktfunk-client-rs + NativeClient pumps are
  unchanged, and draining a batch per syscall means the 300µs sleep no longer
  underdrains. Added UdpTransport::local_addr (used by the test, generally handy).

~125k → ~4k recv syscalls/sec at line rate, zero per-packet recv allocation.
Verified: new recv_batch_drains_over_loopback test (50 datagrams drained intact
via recvmmsg) + the existing loopback round-trip now runs through the batched
poll_frame; full suite (35 + round-trip + 6) + clippy + fmt green.

Decode-in-place (kill the per-packet open_from_wire alloc) is a separate later
optimization. With A (sendmmsg) + B (paced send) + C (recvmmsg), the native data
plane is batched + paced end to end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 22:39:51 +00:00
enricobuehler 10a932d013 feat(1gbps): pace per-frame sends so high-bitrate frames don't burst-drop
ci / rust (push) Has been cancelled
Increment B of the send-path rework — the actual fix for "freezes get more
common over ~150 Mbps, no image at all at 400 Mbps" on the native path. Cause:
the encoder emits a frame and submit_frame blasted ALL its packets at once into
the NIC; a real link drops the line-rate burst (host send buffer EAGAINs), and
under infinite GOP one dropped frame freezes the decode until the next keyframe.
(The speed-test probe showed 0 drops at 400 Mbps because the probe is self-paced;
real video wasn't.)

Adaptive pacing, no extra thread, no regression:
- Session splits into seal_frame (FEC + packetize + seal → wire packets, no
  send) and send_sealed (one batched sendmmsg of a chunk, counts drops);
  submit_frame is now their composition (synthetic + probe paths unchanged).
- virtual_stream's paced_submit seals a frame then sends it in 16-packet chunks
  spread over ~90% of the time until the next frame is due. At 60 fps desktop
  (fast encode → lots of slack) the frame spreads across the interval → no NIC
  burst → no freeze. At 240 fps@5K (encode ≈ interval → ~0 slack) the budget
  collapses and every chunk goes out immediately → never slower than before.

Core suite (34 + loopback round-trip + 6) + clippy + fmt green. The seal/send
split is covered by the existing loopback tests; the pacing is host timing,
verified by review (live-test needs a real NIC — your Mac at a raised bitrate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 22:15:52 +00:00
enricobuehler c24b571e37 feat(1gbps): batched send via sendmmsg (Transport::send_batch)
ci / rust (push) Has been cancelled
First increment of the 1 Gbps send-path rework (the measured bottleneck): the
native data plane did one send() syscall per packet — at ~125k pkt/s (1 Gbps
wire) that burns a core on syscalls. Port the proven GameStream sendmmsg path
into the core Transport seam.

- Transport gains `send_batch(&[&[u8]]) -> usize` (count handed to the kernel;
  caller counts the rest as send-buffer drops). Default = the scalar send loop
  (loopback transport + non-Linux).
- UdpTransport overrides it on Linux with `sendmmsg` (64 datagrams/syscall);
  the connected socket needs no per-message address. Non-blocking-aware: a full
  send buffer yields a short count / EAGAIN, and we stop + report what went out
  rather than block or retry (same lossy, FEC-protected contract as send()).
- Session::submit_frame seals every shard then hands the whole frame to
  send_batch in ONE call instead of looping send() — ~64x fewer syscalls per
  frame on the native + GameStream-over-core paths; send_dropped accounting
  preserved (total - sent).

~125k → ~2k syscalls/sec at 1 Gbps line rate. Verified: new loopback-UDP test
send_batch_delivers_over_loopback (100 batched packets arrive intact, datagram
boundaries preserved); full core suite + clippy + fmt green.

Next increments: a paced send thread (microburst shaping so a real NIC doesn't
drop line-rate bursts) and recvmmsg on the client.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 21:55:22 +00:00
enricobuehler b8a33e21a2 feat(1gbps): raise bitrate/probe clamps + socket buffers, count send-buffer drops
ci / rust (push) Has been cancelled
First step of 1 Gbps+ readiness (the whole point of the GF(2^16) Leopard FEC):
make 1 Gbps configurable and its dominant failure mode observable, before the
real transport work (sendmmsg + paced encode|send split) lands.

Investigation (6-way) verdict: we're ~halfway, and it's mostly clamps plus one
real piece of work. The integer/type path, FEC (a 1 Gbps frame is only a few
hundred shards in one GF(2^16) block, far under the 65535 ceiling), AES-GCM
(AES-NI, ~10-25x headroom), and the M1 reassembler bounds (fully derived from
the negotiated FecConfig) are ALL already 1 Gbps-ready and untouched.

This commit (the configurable + observable foundation):
- m3.rs: MAX_BITRATE_KBPS 500_000 -> 2_000_000 (2 Gbps headroom over the 1 Gbps+
  target); MAX_PROBE_KBPS 1_000_000 -> 3_000_000 (probe can demonstrate headroom
  ABOVE the session cap so a client can confidently pick a 1 Gbps+ bitrate).
- transport/udp.rs: TARGET_SOCKBUF 8 MB -> 32 MB (a multi-MB IDR keyframe burst
  no longer fills the buffer); scripts/99-punktfunk-net.conf bumped to match.
- Observability: Transport::send now returns Ok(true|false) (false = WouldBlock
  send-buffer drop, previously a silent Ok(())). Session counts these as a new
  `packets_send_dropped` stat (distinct from recv-side packets_dropped) — in
  Stats, the C ABI PunktfunkStats (header regenerated), a PUNKTFUNK_PERF periodic
  wire-Mbps + drop dump in virtual_stream, and the speed-test probe completion
  log. This is the dominant 1 Gbps+ loss mode and was invisible.

Loopback-verified: a probe now runs at 1.2 Gbps target (no longer truncated to
1 Gbps) with the drop counter live. NOT yet a sustained-1-Gbps proof — the
single-send()-per-packet native path is the next, real piece of work (port the
proven GameStream sendmmsg + paced send thread into the core Transport).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 20:45:49 +00:00
enricobuehler 7cac1eb663 feat(host): log virtual DualSense pad creation (match the X-Box path)
ci / rust (push) Has been cancelled
The uinput X-Box 360 backend logs "virtual gamepad created" on success, but
the UHID DualSense backend logged only on failure — so a working DualSense
session was silent and indistinguishable in the logs from one where no pad
was ever created. Add the matching success log.

This makes a DualSense-not-working report self-diagnosing: the host now logs
either "virtual DualSense created (UHID hid-playstation)" or the existing
"virtual DualSense creation failed — controller input disabled" (which fires
when /dev/uhid isn't writable — i.e. the 60-punktfunk.rules uhid rule isn't
installed or the user isn't in the 'input' group).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 18:51:23 +00:00
enricobuehler 74819b1be8 feat(punktfunk/1): negotiable encoder bitrate + bandwidth speed-test probe
ci / rust (push) Has been cancelled
Two related additions to the native protocol, host-side (the client side of
each is exposed over the C ABI so the platform clients can wire it up).

Bitrate negotiation
- Hello/Welcome carry `bitrate_kbps` (appended trailing-byte field, back-compat:
  old peers decode 0 = host default). The client requests a rate; the host
  clamps it to [500 kbps, 500 Mbps] (or its 20 Mbps default when 0) and echoes
  the resolved value in Welcome. Replaces the hardcoded 20 Mbps NVENC bitrate in
  m3.rs — threaded through virtual_stream → build_pipeline → open_video, applied
  on the initial mode and every reconfigure rebuild.
- C ABI: punktfunk_connect_ex3(..., bitrate_kbps, ...) (ex2 delegates with 0);
  punktfunk_connection_bitrate() reads the resolved value.

Speed test (bandwidth probe)
- New typed control messages ProbeRequest{target_kbps,duration_ms} (0x20) /
  ProbeResult{bytes_sent,packets_sent,duration_ms} (0x21), plus a FLAG_PROBE
  packet flag. The client asks the host to burst zero-filled, FLAG_PROBE-tagged
  access units over the data plane at a target goodput for a duration (clamped
  ≤ 1 Gbps / ≤ 5 s), pacing by a bytes-allowed budget; video pauses for the
  burst. The host reports what it actually sent; the client measures received
  bytes + window → goodput and loss. Probe filler is never fed to the decoder
  (diverted in the connector pump and the reference client's poll loop).
- The host control task now multiplexes Reconfigure + ProbeRequest (inbound)
  and ProbeResult (outbound) over select!; a probe channel reaches the
  data-plane thread (both virtual and synthetic sources).
- Connector: NativeClient::request_probe()/probe_result() with an internal
  accumulator; C ABI punktfunk_connection_speed_test() +
  punktfunk_connection_probe_result() → PunktfunkProbeResult.
- punktfunk-client-rs gains `--bitrate KBPS` and `--speed-test KBPS:MS` (its own
  loop measures + logs goodput/loss) for loopback verification.

Validated on loopback (synthetic source): a 20 Mbps / 2 s probe measured
20050 kbps at 0% loss, bitrate negotiated (0→20000 and 50000→50000), and the
interleaved probe AUs were correctly excluded from frame verification
(mismatched=0). Wire codecs + trailing-byte back-compat have unit tests. C
header regenerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 18:44:47 +00:00
enricobuehler c894c6f897 feat(host): host-managed gamescope session at the client's mode (dynamic res + refresh)
ci / rust (push) Has been cancelled
Nested games on the Bazzite host saw the wrong display: refresh capped at 60 Hz,
the box's connected TV's EDID modes leaking in (DOOM landed on 2560×1440@60), and
the resolution fixed at whatever the always-on session was launched at — the
client's requested mode never reached the game. Root causes: the session-plus
gamescope command has no --nested-refresh (Xwayland advertises 59.96 Hz for every
mode), --prefer-output HDMI-A-1 makes gamescope read the TV EDID, and the ATTACH
model launches one fixed-resolution session.

New vdisplay path: PUNKTFUNK_GAMESCOPE_SESSION=<client> — the host LAUNCHES
gamescope-session-plus headless AT THE CLIENT'S mode and relaunches it when the
mode changes. Injected via a host-written GAMESCOPE_BIN wrapper (--nested-refresh
$PF_HZ, the flag session-plus doesn't expose) + DRM_MODE=cvt (gamescope generates
clean CVT modes at that refresh instead of the TV's EDID). The session runs as a
transient `systemd-run --user` unit (clean cgroup teardown of the Steam tree);
state lives in a host-lifetime static (MANAGED_SESSION), NOT in GamescopeDisplay
(which is per-client-session) — so a same-mode reconnect REUSES the running
session instantly (no Steam restart) while a different mode RELAUNCHES it (games
can't change output mode live; a game/Steam restart on a mode change is
unavoidable and acceptable). Reuses the existing node + EIS auto-discovery
(find_gamescope_node / find_gamescope_eis_socket, factored into
point_injector_at_eis) and the existing mid-stream Reconfigure → vd.create(mode)
machinery — no protocol or m3 control-flow change.

Validated live on bazzite (RTX 4090): games' Xwayland now advertises 5120×1440 @
239.90 Hz as the preferred mode (was 59.96), the TV's 3840×2160/4096×2160@60 modes
are gone, frames stream; reconnect at 1920×1080@120 relaunches and games see that;
same-mode reconnect reuses with no restart and frames flow instantly.

scripts: host.env.example documents PUNKTFUNK_GAMESCOPE_SESSION (mutually exclusive
with the legacy NODE=auto attach); punktfunk-steam-session.service marked
deprecated (superseded — must not run alongside the host-managed path).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 16:14:10 +00:00
enricobuehler 1d605fb781 feat(gamepad): controller discovery + client-negotiated pad type + rich DualSense end to end
The Apple client grows full gamepad support and punktfunk/1 learns to negotiate
the virtual pad type:

- Protocol: Hello carries a GamepadPref byte (offset 21, the same trailing-byte
  back-compat pattern as the compositor; echoed resolved in Welcome at 54).
  Host precedence: explicit client choice > PUNKTFUNK_GAMEPAD env > Xbox 360,
  DualSense (UHID) only where available. ABI: punktfunk_connect_ex2 +
  punktfunk_connection_gamepad (connect_ex delegates; ABI_VERSION stays 2 — the
  trailing byte IS the compat mechanism). punktfunk-client-rs gets --gamepad.

- Swift client: GamepadManager (app-lifetime discovery + selection — Settings
  lists every controller with capabilities/battery/"In use"; exactly ONE pad
  forwards as pad 0, auto = most recently connected, or pinned), GamepadCapture
  (snapshot-diff button/axis events, DualSense touchpad + ~250 Hz motion on the
  rich-input plane, held state released on switch/deactivate/stop),
  GamepadFeedback (rumble → CoreHaptics per-handle engines; lightbar →
  GCDeviceLight; player LEDs → playerIndex; adaptive-trigger blocks → the
  table-driven DualSenseTriggerEffect parser → GCDualSenseAdaptiveTrigger,
  exact for the 10-zone positional modes). The pad type auto-resolves from the
  physical controller at connect time, user-overridable in Settings.

- Host DualSense fixes surfaced by adversarial review against hid-playstation /
  SDL / Nielk1 ground truth: input-report sensor/touch offsets were off by one
  (the kernel read garbage motion + phantom touches), the L2/R2 trigger blocks
  were swapped (the report is right-trigger-first), feedback now gates on the
  report's valid-flags (a plain rumble write no longer blanks lightbar/
  triggers), and the touchpad rescale clamps to the advertised ABS_MT extents.

- Tests: Hello/Welcome trailing-byte back-compat, pick_gamepad precedence,
  byte-exact input-report layout, valid-flag gating, per-mode trigger-parser
  table (incl. packed 3-bit zones), wire conversions, and a scripted loopback
  feedback burst (PUNKTFUNK_TEST_FEEDBACK=1) asserted through the xcframework
  on the rumble + HID-output planes.

Validated: cargo test/clippy/fmt green on macOS + Linux (61 host tests), swift
build/test green, test-loopback.sh green, tvOS/iOS targets compile. DualSense
motion sign/scale is derived from the calibration blob, not yet live-verified
(constants isolated in GamepadWire).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 16:28:33 +02:00
enricobuehler 0f333460ec fix(core): grow UDP socket buffers — fixes 4K/5K video freezing on one frame
The data-plane UDP sockets used the OS default buffer (~208 KB on Linux, similar
on macOS), which is smaller than a single high-resolution frame burst: a
5120×1440 keyframe is ~130 packets the encode|send thread hands to sendmmsg at
once. The burst overflows the buffer — EAGAIN on the host send (now dropped, was
fatal) or a silent drop on the client recv — and because the data plane runs
infinite-GOP, one lost frame breaks every subsequent reference and the decode
freezes on the last good frame until an RFI refresh that may never catch up.
Symptom: connect at 5120×1440, see ONE frame, then a frozen image (audio + input
keep working — those ride QUIC, not this socket).

Set SO_SNDBUF/SO_RCVBUF to 8 MB (clamped by the OS to net.core.{w,r}mem_max on
Linux / kern.ipc.maxsockbuf on macOS); warn if the grant lands far below target so
an undersized host is diagnosable. The client side matters most — the SAME
UdpTransport backs the Apple client's data plane via the C ABI, and macOS grants
multi-MB buffers without any sysctl, so a rebuilt client stops losing frames.

Validated live, bazzite→client at 5120×1440: was 1319/1500 frames (12% loss →
freeze), now 1500/1500 @60 and 5279/5279 @240 (split-encode active), zero
mismatches, p50 1.9–3.4 ms. Host send buffer was still capped at 416 KB and lost
nothing — the loss was purely the client recv buffer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:28:02 +00:00
enricobuehler 9128bc3836 feat(host): attach to a running gamescope session (Bazzite headless Steam)
Bazzite (and SteamOS-like hosts) run Steam Big Picture inside their OWN
gamescope-session-plus session. Nesting a second gamescope+Steam can't work — the
second Steam sees the first and exits, taking the nested gamescope down with it
(crash in its exit handlers), killing both video and input. The robust model is to
let punktfunk OWN that session: run gamescope-session-plus headless at the client's
resolution (full Steam Deck UI polish: MangoApp, VRR, controller config) and have
the host ATTACH to it rather than spawn its own.

The video half already existed (PUNKTFUNK_GAMESCOPE_NODE=<id> attaches to a
PipeWire node). This finishes it:
- PUNKTFUNK_GAMESCOPE_NODE=auto discovers the gamescope Video/Source node, so the
  (dynamic) node id needn't be hand-wired.
- The attach path now also points the libei injector at the running session's EIS
  socket: find_gamescope_eis_socket() scans XDG_RUNTIME_DIR for gamescope-<N>-ei,
  connect()-probes each (stale dead-session sockets refuse), and writes the newest
  live one to the relay file the injector reads. So input reaches the attached
  session with zero manual config.

scripts/punktfunk-steam-session.service: a systemd --user unit that runs
gamescope-session-plus headless at a configured resolution, with the one-time
headless-appliance setup (linger + multi-user.target) documented inline.

Validated live on bazzite (RTX 4090): the full Steam Big Picture session streams
(1499 frames, p50 ~1ms) with mouse/keyboard injected into it (device resumed, all
caps, emitted=true), node + EIS socket both auto-detected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:55:12 +00:00
enricobuehler b6f4164454 fix(core): drop video packets on a full UDP send buffer, don't fail the session
UdpTransport sockets are non-blocking, so a momentarily-full kernel send buffer
makes socket.send return WouldBlock (EAGAIN). submit_frame propagated that as a
fatal error, tearing the whole punktfunk/1 session down — observed when attaching
to an already-running source (a headless Steam session) that emits frames at full
rate the instant capture connects: the first burst saturates the tx queue and the
session dies before a single frame reaches the client.

The data plane is lossy + Leopard-FEC-protected and runs infinite-GOP with RFI
keyframes, so the real-time-correct response to a full tx queue is to DROP the
packet (the next frame / FEC recovers) — exactly what the recv path already does
for WouldBlock. Blocking would queue stale frames and add latency. Loopback/M1
paths are unaffected (LoopbackTransport never blocks; M1 tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:55:12 +00:00
enricobuehler 6e1097da4f fix(inject): self-heal a stale/hung EIS connection + per-kind injection diagnostics
The host-lifetime libei injector could connect to a gamescope EIS socket whose
listen socket exists but whose server never drives the EI handshake — a stale
socket left by a SIGKILLed prior session, or one created early in a new
gamescope's startup before its libei server is ready. `UnixStream::connect` to a
socket *file* succeeds the moment the path exists, so the worker sailed past the
connect and then hung forever in `handshake_tokio` (or sat connected with no
device ever resumed). Because `LibeiInjector::inject` only enqueues onto a
channel (the !Send worker owns the connection), the send never errors, so
InjectorService never noticed the dead worker and never reopened — every input
event for the whole session was silently swallowed. The 30s setup timeout didn't
help: a typical session ends first, so input just died with no error logged.
Reconnecting made it worse (more stale sockets to land on).

Two self-heal bounds, both paths (gamescope socket + KWin/GNOME portal):
- Bound the EI handshake at 8s — a non-responding EIS server now errors instead
  of hanging, so the worker exits and the next inject() reopens.
- Watchdog: if no input device resumes within 5s of connecting, treat the
  connection as dead-on-arrival and exit (same reopen path). Healthy servers
  add+resume a device within a beat of the handshake.

Verified on-box: clean gamescope + KWin paths connect/resume/emit unchanged; a
stale listener that accepts-but-never-handshakes now errors in 8s; two
back-to-back gamescope sessions both inject (session 2 reopens against the fresh
socket). Independently confirmed end-to-end delivery on KWin — a focused wev got
the injected motions/keys/buttons — i.e. injection itself was never broken, only
its recovery from a bad connection.

Also adds permanent low-volume diagnostics so the next "input dead" report is
instantly triageable: log each EIS device's capabilities on resume, the first of
each InputKind a client sends + whether it emitted, and no-resumed-device drops.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 11:57:09 +00:00
enricobuehler 609136cd2d fix(inject): make the gamescope EIS injector reconnect robustly across sessions
ci / rust (push) Has been cancelled
Root cause of "input doesn't work" on the unified host: a single fresh session
injects fine (EIS connects, "Gamescope Virtual Input" device added), but the
host-lifetime injector reused a STALE per-session EIS socket across sessions →
"connect EIS socket …: Connection refused". (Headless gamescope is EIS-only — it
ignores uinput — so libei/EIS is the one input path for both gamescope and KWin;
no second path needed.)

- connect_socket_file: re-READ the relay file and RETRY the connect on
  refused/missing (the live gamescope's EIS appears shortly), bounded at 15s,
  instead of connecting once and bubbling ECONNREFUSED.
- GamescopeProc::drop: clear the relayed EIS socket name on teardown so a dead
  session can't hand a stale path to the next reconnect.

Validated: two sessions back-to-back each reconnect (EIS connected + device added).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 11:12:13 +00:00
enricobuehler 9a6058cd20 feat(host): §8a — require native pairing by default (serve --open to disable)
ci / rust (push) Has been cancelled
An open punktfunk/1 host any LAN device can trust-on-first-use and stream from is
insecure. The unified host now gates native sessions on pairing by DEFAULT: a client
must complete the SPAKE2 PIN ceremony (armed from the web console) before it's
admitted; paired devices persist. `serve --open` keeps the old TOFU behavior for
trusted single-user setups.

native_serve_opts now takes a NativeServe { port, require_pairing }; parse_serve
builds it with require_pairing = !--open. GameStream pairing (separate) is unchanged.
The require_pairing gate + ceremony are already covered by m3::pairing_ceremony_and_gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 10:23:08 +00:00
enricobuehler 19666ba57e feat(host): unified host + native pairing over the management API
`serve --native` now runs the GameStream host AND the native punktfunk/1 (QUIC)
host in ONE process, sharing a single NativePairing handle with the management API
— so native pairing is operable from the web console instead of journalctl.

- gamestream::serve gains a native_port: spawns crate::m3::serve in the same
  runtime and passes the shared NativePairing to mgmt::run. Validated live: one
  process binds both RTSP 48010 and QUIC 9777.
- mgmt API: new `native` endpoints — GET /native/pair (status), POST
  /native/pair/arm (mint a fresh, time-limited PIN to DISPLAY), DELETE /native/pair
  (disarm), GET/DELETE /native/clients (list/unpair). GameStream-only hosts report
  enabled:false. OpenAPI regenerated (checked-in doc + drift test).
- main.rs: serve --native / --native-port flags.

The native host arms pairing on demand (the operator reads the PIN from the
console; the SPAKE2 ceremony is host-shows-PIN). New mgmt + native_pairing tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:55:30 +00:00
enricobuehler 5ca860533e refactor(native-pairing): extract shared on-demand arming state
Groundwork for web-UI-driven native (punktfunk/1) pairing. Replaces m3's fixed
startup PIN + local paired store with a shared `NativePairing` (new module):
arm-on-demand with a fresh, time-limited PIN (`arm(ttl)`), `current_pin()` read
per ceremony so a lapsed window stops pairing, plus the trust store (list/add/
remove/is_paired) and a `status()` snapshot. The management API (next commit) and
the QUIC accept loop share one handle. CLI `--allow-pairing`/`--require-pairing`
still arm at startup (no expiry, PIN logged) — back-compat. m3 pairing ceremony +
gate and the C-ABI roundtrip stay green; new unit tests for arm/expire/pair.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:55:30 +00:00
enricobuehler 136390514d build: support FFmpeg 7.x and 8.x; fix RPM spec GPU link deps
ci / rust (push) Has been cancelled
punktfunk-host builds unchanged against either FFmpeg 7.x (libavcodec 61) or 8.x
(libavcodec 62) — ffmpeg-sys-next auto-detects the system version, and the host's
ffmpeg FFI only touches long-stable APIs. Confirmed by building + running live on a
Bazzite F43 box (FFmpeg 7.1.3): full gamescope capture → zero-copy dmabuf→CUDA →
NVENC H.265 at 1280x720x60, p50 ~0.96 ms. Just doc/spec accuracy, no code change:

- encode/linux.rs + CLAUDE.md: drop the "FFmpeg 8 only" claim; note 7.x/8.x both work.
- rpm spec: add the missing zero-copy GPU build deps the link actually needs —
  pkgconfig(gl) + pkgconfig(gbm) (mesa) — and document that -lcuda needs libcuda.so at
  link time (NVIDIA host, or the CUDA toolkit stub on a headless COPR/koji builder).
  Tracked for a proper fix: make the cuda/gbm/GL FFI dlopen-based like khronos-egl so
  the RPM builds on a GPU-less host.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:12:59 +00:00
enricobuehler 59edeedf07 feat(dualsense): Phase C/D/E — virtual DualSense routing + 0xCC/0xCD planes + C ABI
ci / rust (push) Has been cancelled
PUNKTFUNK_GAMEPAD=dualsense now routes a session's gamepad through a real virtual
DualSense (UHID + hid-playstation) end to end:

- host: a `PadBackend` enum (m3.rs) selects `GamepadManager` (uinput xpad, default)
  or the new `DualSenseManager` (dualsense.rs) per session. The manager keeps each
  pad's full DsState so touchpad + motion (rich-input plane) persist across
  button/stick frames, and services the !Send /dev/uhid fd only on the input thread
  (which cycles <=4ms, so the GET_REPORT init handshake completes).
- feedback: `service()` now returns `DsFeedback { hidout, rumble }`. Motor rumble
  stays on the universal 0xCA plane (so non-DualSense clients still feel it; manager
  dedups change); lightbar / player LEDs / adaptive-trigger effects ride the new
  0xCD HID-output plane (host->client) as `HidOutput`.
- rich input: touchpad contacts + motion ride the 0xCC plane (client->host) as
  `RichInput`, applied via `DualSenseManager::apply_rich` (merged with button state;
  touch normalized 0..65535 -> the touchpad resolution).
- connector + C ABI: `NativeClient::next_hidout` / `send_rich_input`, exported as
  `punktfunk_connection_next_hidout` (-> PunktfunkHidOutput) and
  `punktfunk_connection_send_rich_input` (<- PunktfunkRichInput); header regenerated.
- reference client: `--rich-input-test` drives the DualSense touchpad + motion and
  logs the 0xCD feedback that comes back.

Validated live on-box: a synthetic-source m3-host + client-rs created the real
kernel DualSense, drove 0xCC, and decoded 12 live 0xCD events (the kernel's actual
lightbar/trigger init reports) with the data plane unaffected (600/600 frames).
Adversarial review fixes folded in: the input loop no longer skips the rich drain +
feedback pump on a dropped gamepad event, and the touch contact id is clamped to its
slot. Remaining: the Apple client renders triggers/rumble on a real DualSense.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 08:36:12 +00:00
enricobuehler 2372b02620 feat(host): virtual DualSense via UHID (hid-playstation) — device + report mapping
ci / rust (push) Has been cancelled
Roadmap #5 (rich DualSense). A UHID device presents a real Sony DualSense to the kernel's
hid-playstation driver (matched by VID 054C/PID 0CE6), which exposes the full controller —
gamepad, motion sensors, touchpad, lightbar/player LEDs, adaptive triggers — unlike the
uinput X-Box-360 pad.

- inject/dualsense.rs: hand-rolled /dev/uhid codec (no bindgen) mirroring the uinput style;
  the canonical inputtino 232-byte USB HID report descriptor + the feature-report replies
  (calibration 0x05 / pairing 0x09 / firmware 0x20) — answering hid-playstation's GET_REPORTs
  during init is REQUIRED or it creates no input devices. DsState::from_gamepad maps a
  GameStream/XInput frame → the DualSense input report (buttons/sticks/triggers/dpad, +
  touchpad/motion fields); service() answers GET_REPORTs and parses HID OUTPUT (rumble /
  lightbar RGB / player LEDs / adaptive triggers) into quic::HidOutput.
- scripts/60-punktfunk.rules: grant /dev/uhid to the 'input' group (like /dev/uinput).
- `punktfunk-host dualsense-test`: standalone validation (no streaming session).

Validated live: `dualsense-test` → hid-playstation binds + loads ff_memless + led_class_
multicolor; the kernel creates "Punktfunk DualSense 0" (event/js gamepad + Motion Sensors +
Touchpad + Headset Jack) at VID 054c/PID 0ce6, plus the lightbar at /sys/class/leds/
input*:rgb:indicator; js shows the Cross button firing + the left-stick sweep. Clippy/fmt
clean, workspace tests green. Wiring into the session (pad-type select, touchpad/motion
routing, HID-output back-channel) is the next commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 07:27:28 +00:00
enricobuehler 6575dddac7 fix: keep the workspace green on macOS after the mic/touch/rich-input batch
The new features were Linux-built only and broke the documented macOS gate
(cargo build/test/clippy --workspace) four ways, all fixed following the existing
platform-gating conventions:

- m3.rs: mic_service_thread split into the Linux worker and a non-Linux stub that
  drains and drops (sessions still count the datagrams) — opus/PipeWire are
  Linux-gated deps, same pattern as audio_thread.
- punktfunk-client-rs: the new `opus` dependency moved into the Linux target table and
  --mic-test gated with a warn-and-skip stub (only the synthetic-tone test rig needs
  the encoder; the mic uplink itself is portable).
- gamestream/audio.rs: SAMPLE_RATE import gated to any(linux, test) (the frame_sizing
  test uses it everywhere, the data plane only on Linux).
- tests/c_abi.rs: the harness's macOS link flags gained Security + CoreFoundation —
  the quic feature now pulls rustls's platform verifier into the staticlib.

Also: two clippy match-ref-pats lints in the new rich-input/HID-output decoders
(clippy -D warnings is the repo gate), the regenerated punktfunk_core.h committed (the
checked-in copy predated the rich-input/HID-output constants — CI fails on drift), and
web's inlang cache dir gitignored.

cargo build/test/clippy/fmt --workspace: green on macOS, 122 tests passing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:07:48 +02:00
enricobuehler 5f6d2cb88b feat(proto): variable-length rich-input (0xCC) + HID-output (0xCD) datagrams
ci / rust (push) Has been cancelled
Foundation for rich DualSense support (roadmap #5). The fixed 18-byte InputEvent (0xC8) can't
hold the DualSense touchpad/motion or HID feedback, so two new variable-length, kind-tagged
datagram families join the side-plane (mouse/keyboard/gamepad/touch keep the fixed InputEvent):

- RICH_INPUT_MAGIC 0xCC, client→host: `[0xCC][kind][fields]`
    Touchpad{pad,finger,active,x,y}  (x/y normalized 0..65535; host scales to the pad)
    Motion{pad, gyro[3], accel[3]}   (raw i16, straight into the DualSense report)
- HIDOUT_MAGIC 0xCD, host→client: `[0xCD][kind][pad][fields]` — the rich analog of the 0xCA
  rumble datagram (rumble stays on 0xCA):
    Led{rgb}  PlayerLeds{bits}  Trigger{which, effect}  (adaptive-trigger params to replay)

`RichInput`/`HidOutput` enums with encode/decode; unknown kinds + truncation decode to None
(forward-compatible). +2 round-trip/disjointness tests; quic suite green, clippy/fmt clean.
Wiring (host UHID device, capture, C ABI, client) lands in following commits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 06:55:04 +00:00
enricobuehler dc375668ee feat: touch input — TouchDown/Move/Up + host libei ei_touchscreen injection
ci / rust (push) Has been cancelled
Roadmap #5 (touch, ahead of the XL UHID DualSense work). Touch fits the existing 18-byte
InputEvent: code = touch id, x/y = client pixels, flags = (w<<16)|h — the same absolute
mapping as MouseMoveAbs.

- core: InputKind::{TouchDown=9, TouchMove=10, TouchUp=11} + from_u8 + roundtrip test.
- host inject/libei.rs: request the RemoteDesktop Touchscreen device type, bind the Touch
  capability, and inject ei_touchscreen down/motion/up (one event = one frame, per the
  protocol rule), mapping coordinates into the device region like the abs pointer. wlroots
  has no virtual-touch protocol wired — no-ops there.
- client-rs --touch-test: drags a synthetic finger (touch id 0) in a circle.

Validated live on headless KWin: the portal GRANTS the Touchscreen device type
(Keyboard|Pointer|Touchscreen), proving the request path — but KWin's EIS server creates no
touchscreen *device*, so touch currently no-ops on this KWin (now logged once, not silent).
The injection code is correct and will land on a backend that exposes ei_touchscreen
(gamescope / a newer compositor / the real touch-client path). Workspace green, clippy/fmt
clean, +1 unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:38:44 +00:00
enricobuehler 0755c823a5 feat: mic passthrough — client microphone → host virtual PipeWire source
ci / rust (push) Has been cancelled
The inverse of the host→client audio path: the client's mic, Opus-encoded, rides a
new 0xCB QUIC datagram to the host, which decodes it into a virtual PipeWire
Audio/Source its apps can record from (voice chat, etc.).

Protocol (punktfunk-core):
- MIC_MAGIC 0xCB + encode/decode_mic_datagram (mirror of the 0xC9 audio datagram).
- NativeClient::send_mic(seq, pts_ns, opus) over a new outbound channel + worker task
  (mirror of send_input); C ABI punktfunk_connection_send_mic for native clients.

Host:
- audio::VirtualMic + PwMicSource: a PipeWire output stream tagged media.class=
  Audio/Source (Direction::Output) — a recordable microphone node, fed decoded PCM.
- MicService: host-lifetime owner of the source + Opus decoder (mirror of
  InjectorService / the audio capturer slot); lazily opened, persists across sessions,
  self-heals. The per-session datagram reader now demuxes 0xCB→mic / 0xC8→input over a
  single read_datagram loop (two loops would race).
- Adaptive jitter buffer in the producer: primes to ~3 consumer quanta before emitting,
  so the 5 ms push / N ms pull clock skew never underruns — without it ~58% of output
  was silence; with it, glitch-free across consumer quanta.

Client: punktfunk-client-rs --mic-test streams a synthetic 440 Hz Opus tone as the mic
uplink (opus dep added) for end-to-end validation without a real microphone.

Validated live on headless KWin: client tone → host source → pw-record shows the
punktfunk-mic Audio/Source node, 440 Hz dominant (Goertzel power 20.7 vs <0.001
elsewhere), RMS 0.179 ≈ the ideal 0.177, 0.3–0.4% silence at both 256 ms and 10 ms
consumer quanta. Tests +1 (mic datagram roundtrip); workspace green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:15:07 +00:00
enricobuehler a03aae891e fix(m3): persistent host-lifetime input injector — end the RemoteDesktop portal churn
ci / rust (push) Has been cancelled
Under rapid client reconnects, KWin's libei/EIS input setup intermittently wedged
with "EIS setup timed out", causing total input loss for affected sessions. Root
cause: each punktfunk/1 session opened (and tore down) its own RemoteDesktop-portal
CreateSession for pointer/keyboard injection, and back-to-back reconnects raced a
prior session's portal teardown before it settled.

LibeiInjector is only a Send channel handle to a worker thread that owns the portal
session, so the injector can live for the whole host run instead of per session.
Adds InjectorService: one host-lifetime thread owns the (!Send) injector, opened
ONCE (lazily, on the first event) and reused across every session — the portal grant
is established a single time and held. Sessions forward pointer/keyboard events to it
over a clonable Send channel; gamepads stay per-session (uinput, no portal). The
service self-heals — reopen after a 2s backoff if open fails or the backend worker
dies (covers a gamescope EIS socket that respawns with its nested session).

Mirrors the existing host-lifetime audio-capturer slot; the audio capturer is Send
(a slot works), the injector is !Send (needs the owning thread + channel).

Validated live on headless KWin: 8 rapid back-to-back input sessions →
"input injector ready (host-lifetime)" exactly once, ZERO "EIS setup timed out",
8/8 sessions injected input. Tests green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 21:31:11 +00:00
enricobuehler 6fdf7d1511 feat: client-selectable compositor (protocol → host → client → C ABI → mgmt → web)
A client can now request which compositor backend the host drives its virtual
output on (gamescope/KWin/Mutter/wlroots). The host honors the request if that
backend is available, else falls back to auto-detect and reports the resolved
choice back — wire-compatible both directions (no ABI bump).

Protocol (punktfunk-core):
- New CompositorPref (config.rs): Auto|Kwin|Wlroots|Mutter|Gamescope with
  u8/name mappings. Appended as one optional byte to Hello (client preference)
  and Welcome (host's resolved choice). Both decoders already tolerate trailing
  bytes, so old↔new interop is preserved — ABI_VERSION stays 2. Round-trip +
  back-compat (truncated-message) tests.
- C ABI: punktfunk_connect_ex(compositor) + PUNKTFUNK_COMPOSITOR_* constants;
  punktfunk_connect delegates with AUTO, so the existing symbol is unchanged.
  NativeClient::connect / worker_main thread the preference through.

Host:
- vdisplay::available() enumerates usable backends via cheap, side-effect-free
  probes (KWin zkde global, gamescope binary+version, GNOME/Sway env), plus
  Compositor id/label/as_pref/from_pref/all helpers.
- m3 handshake resolves the preference to a concrete backend during the
  handshake (pick_compositor pure + resolved logging), reports it in Welcome,
  and threads it into virtual_stream (replacing the unconditional detect()).
- mgmt GET /v1/compositors lists every backend with availability + the
  auto-detected default (OpenAPI regenerated).

Client:
- punktfunk-client-rs --compositor NAME; logs the host's resolved choice from
  the Welcome ("session offer … compositor=…").

Web console:
- Host page gains a Compositors card (availability + default badges) via the
  codegen'd useListCompositors hook; en/de strings added.

Also fixes a pre-existing, env-dependent test-isolation bug:
mgmt::tests::paired_clients_list_and_unpair seeded the real
~/.config/punktfunk/paired.json (AppState::new loads it), so a real
GameStream-paired client leaked into body[0] on a dev box — now cleared first.

Live-validated against headless KWin: --compositor kwin honored, --compositor
mutter falls back to kwin (available=[kwin, gamescope]), resolved choice
round-trips to the client. Tests: +6 (wire/back-compat, resolution precedence,
endpoint); workspace green, clippy/fmt clean, C ABI harness PASS at abi_version=2,
web typecheck + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:45:41 +02:00
enricobuehler 75eb8fa0d6 feat(host): KDE-reliability phase 2 — pipeline retry, graceful capture teardown, refresh reconcile
Hardens the virtual-display → capture → encode bring-up against the transient
failures that surfaced as black screens / wrong refresh on cold KDE sessions.

- m3: build_pipeline_with_retry wraps the initial vd.create() + first-frame with
  bounded exponential backoff (4 attempts, 500ms→2s). is_permanent_build_error
  classifies config/version/missing-tool failures so they fail fast instead of
  burning the retry budget. Encoder + frame clock now pace to the *achieved*
  refresh reported in VirtualOutput::preferred_mode, not the requested rate.
- capture/linux: PortalCapturer::Drop sends a pipewire channel quit and joins the
  thread, so a dropped/failed/retried capturer releases its PipeWire thread + EGL/
  CUDA context promptly instead of leaking it to process exit. First-frame timeout
  now reports the node id and distinguishes "format never negotiated" from
  "negotiated but no buffers arrived" via a negotiated flag set in param_changed.
- vdisplay/kwin: set_custom_refresh reads back the active mode from kscreen-doctor
  and returns the refresh KWin actually gave us (a rejected custom mode silently
  leaves the output at 60Hz); create() carries it into preferred_mode.
- vdisplay/gamescope: find_gamescope_node requires the Video/Source object (the
  node.name=gamescope tag is on two objects; the other wedges the link); a version
  check warns on <3.16.22 (the PipeWire-1.6 capture-deadlock signature).

Live-validated against headless KWin: 720p120 build with requested=120 achieved=120,
zero-copy CUDA frames, and no per-session thread accumulation across back-to-back
sessions. Tests: +3 unit (retry classifier, gamescope version parse); 49 host tests
green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:45:41 +02:00
enricobuehler 9fdc3c3246 feat(headless-kde): reliable bring-up — readiness probe, fix portal ordering/env (roadmap #1 phase 1)
ci / rust (push) Has been cancelled
Headless KDE startup was a chain of timing-sensitive handoffs gated by a blind `sleep 2`,
the dominant source of black screens. Phase-1 fixes:

- New `punktfunk-host probe-compositor` subcommand: exits 0 iff the detected compositor is
  up AND ready to create a virtual output now. KWin gets a real check (connect + registry
  roundtrip + the privileged zkde_screencast global must be advertised — what the backend
  needs); gamescope/Mutter/wlroots create on demand so the probe just confirms Linux.
  (vdisplay::probe dispatcher + kwin::probe; reuses kwin.rs's existing roundtrip path.)
- run-headless-kde.sh: replace `sleep 2` with an active readiness wait (poll probe-compositor
  until ready, 30s deadline, and bail with kwin's log if kwin_wayland exits during init).
  Move the portal restart to AFTER readiness, and precede it with `systemctl --user
  import-environment` + `dbus-update-activation-environment` (the missing env import — the
  Sway script does this; without it a restarted portal inherits a stale/empty WAYLAND_DISPLAY,
  which is the "streams but eats no input/audio" failure). kwin's stderr → a log file.

Validated: probe-compositor exits 0 "Kwin ready" against the live session, exit 1 with a
clear diagnostic when the compositor is absent. 114 tests green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 19:25:52 +00:00
enricobuehler ff4fe197be fix(punktfunk/1): adversarial-review fixes — SPAKE2 pairing, renegotiation hardening, +more
ci / rust (push) Has been cancelled
Triaged the multi-agent review of the renegotiation + pairing + Sway + AV1/surround batch
(1 critical, 11 major/minor confirmed). Fixes:

CRITICAL — PIN pairing was offline-brute-forceable. The HMAC-of-PIN proof let an active
MITM who terminates the TOFU ceremony recover the 4-digit PIN by offline dictionary search
(all other inputs observable) and forge a correctly-bound proof. Replaced with **SPAKE2**
(balanced PAKE, `spake2` crate) + key-confirmation MACs, binding both cert fingerprints as
the SPAKE2 identities: an attacker gets exactly ONE online guess, no offline search, and
mismatched cert views (a real MITM) never reach a shared key. Also reworked the UX to an
"arming PIN" — one PIN per arming window shown at host startup (the SPAKE2 client needs the
PIN to build its first message, so it can't be minted per-connection). Validated live:
wrong PIN rejected in 0.1s, right PIN pairs + persists + the paired identity streams.

Pairing hardening: `--allow-pairing`/`--require-pairing` must arm pairing (default rejects
unsolicited ceremonies); per-host cooldown bounds online guessing; the client flushes its
CONNECTION_CLOSE so a refused ceremony can't wedge the sequential host for the full timeout;
atomic (temp+rename) paired-store writes.

Protocol: control/pairing messages use a distinct CTL_MAGIC (PKFc) — fully disjoint from
the positional Hello namespace (a future abi_version can't be misparsed as a control
message); all typed decodes are length-exact. ABI_VERSION → 2 (punktfunk_connect signature
gained the identity params; header regenerated).

Renegotiation: drain the reconfig channel to the NEWEST mode (one rebuild, not one per
stale step); validate refresh_hz; build the new pipeline BEFORE dropping the old so a
rebuild failure keeps the session on its current mode instead of killing it.

GameStream: packetDuration snaps to {5,10} (an in-between value isn't a legal Opus frame
size and would kill audio). Sway: chooser file moved to $XDG_RUNTIME_DIR (was a fixed
world-writable /tmp path — DoS / capture-misdirection by another local user).

Swift: fixed two compile breakers in the new pairing/identity APIs (Int32 status .rawValue,
UInt cap cast). New SPAKE2 + namespace-disjointness + pairing-roundtrip unit tests; the
in-process pairing test now also exercises the arming PIN + cooldown. 114 tests green,
clippy -D warnings clean (both feature sets), fmt, C-ABI harness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 16:26:48 +00:00
enricobuehler 429bd1e6ac Merge branch 'worktree-agent-a6fe98c40d55fd284' into m1-lumen-core
# Conflicts:
#	CLAUDE.md
2026-06-10 15:42:48 +00:00
enricobuehler 4d26ac5c85 feat: punktfunk/1 — mid-stream mode renegotiation + PIN pairing ceremony
Renegotiation (no reconnect on resize): the handshake bi-stream stays open; the client
sends Reconfigure{mode} (typed post-handshake message), the host validates + acks
Reconfigured and rebuilds capture/encoder/virtual output at the new mode while the data
plane (keys, ports, FEC) runs untouched — the first new-mode AU is an IDR with in-band
parameter sets. NativeClient::request_mode / punktfunk_connection_request_mode; mode()
reflects the active mode. Validated live on KWin: one continuous stream, 225 frames
@1280x720 then 395 @1920x1080, ~90 ms pipeline rebuild (ffprobe shows both resolutions).

PIN pairing (mutual trust, kills TOFU MITM): clients get persistent self-signed
identities presented via QUIC client auth (generate_identity / client auth offered but
optional server-side — legacy clients still connect). Ceremony on the control stream:
PairRequest{name} → host shows a 4-digit PIN (log) + PairChallenge{salt} → client proves
with HMAC-SHA256(PIN‖salt, client_fp‖host_fp) — binding both certs means a MITM can't
forward a proof, single attempt per PIN, constant-time compare → PairResult; host
persists the fingerprint (~/.config/punktfunk/punktfunk1-paired.json), client pins the
host's. m3-host --require-pairing gates sessions on the paired set.
NativeClient::pair + punktfunk_pair/punktfunk_generate_identity in the ABI; reference
client: --pair PIN --name LABEL + auto-generated persistent identity, --remode for live
renegotiation testing. Swift wrapper: ClientIdentity/generateIdentity()/pair(),
requestMode()/currentMode(); README handoff updated.

Tested: reconfigure/pairing wire roundtrips, C-ABI mode switch ack, full in-process
ceremony (wrong PIN → Crypto, anonymous-vs-gate rejection, success → pinned session);
live wrong-PIN ceremony against the serving host (PIN logged, proof rejected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:42:29 +00:00
enricobuehler 3cc3c02b42 feat(gamestream): AV1 negotiation + 5.1/7.1 surround audio
Codec negotiation (M2 polish):
- ServerCodecModeSupport now advertises what we encode: H264|HEVC|AV1_MAIN8
  = 65793 (flags verified against moonlight-common-c Limelight.h). The old
  placeholder 3843 wrongly claimed HEVC Main10 + 4:4:4 and no AV1. Main10
  bits stay off on purpose: Moonlight ties 10-bit to HDR, and capture is
  8-bit SDR BGRx with no HDR metadata path (av1_nvenc -highbitdepth was
  validated working for later).
- RTSP ANNOUNCE: bitStreamFormat 0/1/2 -> H264/HEVC/AV1 (already plumbed to
  av1_nvenc; validated e2e via `m0 --codec av1` + ffprobe av01), and a
  dynamicRangeMode!=0 request now logs + falls back to 8-bit SDR.

Surround audio (M2 polish):
- ANNOUNCE x-nv-audio.surround.{numChannels,AudioQuality} +
  x-nv-aqos.packetDuration -> per-session AudioParams; DESCRIBE advertises
  all six Opus configs (normal before HQ per channel count). Normal-quality
  mappings are pre-rotated for the client's GFE-order LFE swap
  (RtspConnection.c, verified verbatim) so its derived decoder mapping
  equals our encoder mapping — including 7.1, where Sunshine's rotate only
  covers [3,6) and scrambles LFE/SL/SR.
- 5.1/7.1 encode via libopus multistream (audiopus_sys, the sys layer the
  opus crate already links) with Sunshine's layouts/bitrates, RAII wrapper;
  the live-validated stereo wire is byte-identical (plain Opus, no FEC).
- Surround sessions add Sunshine-style RS(4,2) audio FEC (packetType 127 +
  AUDIO_FEC_HEADER, the OpenFEC parity matrix both ends hardcode, nanors
  gemm semantics verified from nanors/rs.c).
- PipeWire capture generalized to the negotiated channel count with explicit
  FL FR FC LFE RL RR [SL SR] positions; missing sink channels are zero-
  filled by the channel-mixer. PwAudioCapturer now tears down cleanly on
  Drop (pipewire channel -> loop quit), so a channel-count change can
  reopen without leaking a capture stream.

Tests: serverinfo mask, RTSP codec/audio param parsing, DESCRIBE contents,
surround-params strings + client-swap round trip, FEC parity self-recovery
and packet layout, real-codec 5.1 channel-identity round trip, and an
ignored live test (ran green against a 6ch null sink monitor).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:41:15 +00:00
enricobuehler 7381ba8218 feat(vdisplay): wlroots/Sway backend — swaymsg headless output + xdpw chooser
The fourth VirtualDisplay backend: `swaymsg create_output` adds a HEADLESS-N
output (name found by diffing get_outputs), `output <NAME> mode --custom
WxH@HzHz` sets the client's exact mode (and the refresh clock a fresh headless
output needs to produce frames at all), and the PipeWire node comes from the
ScreenCast portal. Headless output selection is non-interactive via
xdg-desktop-portal-wlr's chooser hook: a managed config (chooser_type=simple,
chooser_cmd cats /tmp/punktfunk-xdpw-output; portal try-restarted when the
config changes) plus a per-session `Monitor: <NAME>` written to that file.
Teardown is RAII: drop ends the portal thread (zbus connection drop ends the
cast) then `swaymsg output <NAME> unplug`. swaymsg commands go after `--` so
tokens like `--custom` reach sway instead of swaymsg's getopt.

Validated live on headless sway 1.11 (gles2-on-NVIDIA, xdpw 0.8.1), zero-copy
dmabuf→CUDA on both runs: 720p60 257 frames p50 0.77 ms, 1080p60 480/480
frames p50 1.18 ms, output unplugged with the session both times. The
checked-in xdpw.config sample now matches the managed config (the old
chooser_type=none/HEADLESS-1 form would pin capture to the wrong output).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:23:28 +00:00
enricobuehler bfd64ce871 rename: lumen → punktfunk, everywhere
ci / rust (push) Has been cancelled
Full project rename, decided 2026-06-10:
- Crates/binaries: punktfunk-core / punktfunk-host / punktfunk-client-rs.
- C ABI: punktfunk_* symbols, Punktfunk* types, include/punktfunk_core.h,
  PUNKTFUNK_FEATURE_QUIC guard (header regenerated; cbindgen renames updated, incl.
  PUNKTFUNK_BTN_*/PUNKTFUNK_AXIS_* wire constants).
- Protocol: punktfunk/1 — control-plane magic LMN1 → PKF1, nonce salt lmn1 → pkf1.
  WIRE BREAK: clients must be rebuilt from this revision.
- Env knobs: PUNKTFUNK_VIDEO_SOURCE / PUNKTFUNK_COMPOSITOR / PUNKTFUNK_ZEROCOPY / ….
- Host config dir: ~/.config/punktfunk (the box's dir was migrated in place — the
  persistent identity is unchanged, pinned fingerprints stay valid).
- Swift package: PunktfunkKit + PunktfunkCore.xcframework + PunktfunkConnection
  (Sources/PunktfunkClient app + tests renamed with it); build-xcframework.sh updated.
- scripts/: 60-punktfunk.rules, punktfunk-host.service; OpenAPI doc regenerated.

Also: scripts/headless/run-headless-kde.sh — full headless Plasma bringup. Root cause of
"desktop but no apps/settings" over the stream: plasmashell launched without
XDG_MENU_PREFIX=plasma-, so the launcher resolved a nonexistent applications.menu and
rendered an empty menu. The script sets the complete KDE session env (menu prefix,
KDE_FULL_SESSION, session version) and rebuilds ksycoca before starting plasmashell.

Gate: 97/97 tests, clippy -D warnings (both feature sets), fmt, C-ABI harness PASS,
zero lumen references left outside .git.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:11:59 +00:00
enricobuehler bf8a974e8b feat: M4 stage 1 — the SwiftUI client is real: compiles, tested, first light on glass
ci / rust (push) Has been cancelled
The clients/apple scaffold is now a working macOS client, validated live against this
repo's host across the LAN: gamescope virtual output → NVENC HEVC → lumen/1 (GF(2¹⁶) FEC +
AES-GCM over UDP, QUIC control) → VideoToolbox → AVSampleBufferDisplayLayer at 720p60,
mouse/keyboard flowing back as QUIC datagrams into the host's gamescope EIS injector
(~3.7k events injected in one session).

LumenKit:
- LumenConnection: the predicted cbindgen compile fixes (C17 header spells the typedefs as
  integers while the enum constants import as a distinct Swift type — bridge by rawValue);
  close() is now safe from any thread (a close flag + pumpLock held across the blocking
  poll enforce the C contract "never close with a next_au in flight"; flag prevents
  lock-starvation by back-to-back polls).
- StreamView: per-pump cancellation token (reconnects can't double-pump), flush + re-gate
  on the next in-band parameter sets when the layer fails, no stale enqueue after restart.
- InputCapture: fractional-delta accumulation (sub-pixel motion isn't truncated away),
  pressed-state tracking with release-all on focus loss and stop() (nothing sticks down
  host-side), global-singleton ownership guard (GC has one handler slot per process),
  X1/X2 buttons, horizontal scroll, full keypad/CapsLock/ISO-102nd/PrintScreen/Menu VKs.
- LumenClient app shell (swift run LumenClient): connect form, fps/Mb-s HUD,
  LUMEN_AUTOCONNECT/LUMEN_MODE for scripted first-light runs.
- Tests: Annex-B byte-level units; real-codec round trip (VTCompressionSession-encoded
  HEVC rebuilt as the host's wire shape → AnnexB → VTDecompressionSession → pixels);
  test-loopback.sh (Swift client vs a real local m3-host over loopback — the Swift twin of
  c_abi_connection_roundtrip); RemoteFirstLightTests (full pipeline over the LAN).

Host/build fixes that fell out:
- The workspace builds on non-Linux again: gamestream audio (opus) and sendmmsg batching
  are now platform-gated with stubs/fallback, per the crate's "compiles everywhere" rule.
- Horizontal scroll was inverted end-to-end: the injectors negated BOTH axes onto the
  ei/wl axes, but GameStream's horizontal convention is positive = right
  (moonlight-qt/Sunshine pass it through unnegated) — only vertical flips now. This also
  un-inverts real Moonlight clients.
- AnnexB drops all zeros preceding a start code (trailing_zero_8bits padding), ffmpeg's
  policy, instead of leaking them into the preceding NAL.
- build-xcframework.sh: deployment targets pinned to the package floor + an otool guard —
  cargo does not fingerprint MACOSX_DEPLOYMENT_TARGET, so warm caches can silently ship
  too-new minos objects.

Adversarially reviewed (5-dimension multi-agent pass, every finding refutation-verified):
14 confirmed findings, all fixed above; the send-while-polling core-contract gap flagged
here is closed by the lumen/1 session-planes work (&self pulls + per-plane borrow slots).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:46:45 +02:00
enricobuehler 520d7342dd feat: M3 — full lumen/1 session planes: audio, gamepads+rumble, pinned trust, persistent listener
ci / rust (push) Has been cancelled
m3-host is now a real host, not a one-shot demo. Everything validated live on this box
(two back-to-back sessions, pinned + TOFU, ~200 audio pkts/s, p50 0.84 ms at 720p60).

lumen-core:
- quic.rs: QUIC-datagram side planes demuxed by first byte — Opus audio 0xC9
  ([magic][u32 seq][u64 pts_ns][opus], host→client) and rumble 0xCA ([magic][pad][low][high]).
- Trust: endpoint::server_with_identity (persistent PEM identity) and
  endpoint::client_pinned — SHA-256 cert-fingerprint pinning with TOFU (observed
  fingerprint reported back for persisting). The verifier checks the TLS 1.3
  CertificateVerify signature for real (an MITM replaying the host's public cert without
  its key is rejected; cert pinning alone would not prove key possession).
- client.rs: NativeClient gains pin + host_fingerprint, audio/rumble receivers
  (next_audio / next_rumble); pull methods take &self so the C ABI's per-plane threads
  never alias a &mut (per-plane mutexed borrow slots in abi.rs).
- abi.rs: lumen_connect(pin_sha256, observed_sha256_out) + lumen_connection_next_audio /
  next_rumble. input.rs: documented gamepad wire contract (GameStream buttonFlags bits,
  XInput axis conventions, +y = up) — exported as LUMEN_BTN_*/LUMEN_AXIS_* (bare BTN_*
  collides with <linux/input-event-codes.h> at different values).

lumen-host (m3):
- Persistent accept loop: sessions back to back on one endpoint (--max-sessions, 0 =
  forever); per-session failures log and the loop keeps serving; 10 s handshake deadline
  so a silent client can't wedge the sequential accept queue; teardown on every exit path
  (stop flag → conn.close → join audio+input threads).
- Audio plane: desktop PipeWire capture → Opus 48 kHz stereo 5 ms CBR → datagrams; ONE
  capturer reused across sessions via an AudioCapSlot (PipeWire streams have no cheap
  teardown — per-session opens would leak a thread + core connection + live node each).
- Gamepad routing: incremental GamepadButton/GamepadAxis datagrams accumulate into
  per-pad state feeding the uinput xpad manager; force feedback returns as rumble
  datagrams, with current state re-sent every 500 ms (idempotent-state healing for the
  lossy channel). QUIC endpoint serves the persistent ~/.config/lumen identity and logs
  the pinnable fingerprint.

lumen-client-rs: --pin (malformed values abort — never silently downgrade to TOFU),
TOFU fingerprint logging, audio/rumble datagram counters, gamepad events in --input-test.

clients/apple: scaffold synced — pinSHA256/hostFingerprint (wrong-size pin throws,
fail-closed), nextAudio/nextRumble, gamepad event constructors; README handoff updated
(persistent listener, audio decode notes, trust UX).

Adversarially reviewed (5-dimension multi-agent pass over the diff, 2-skeptic
verification): fixed the MITM signature-check gap, a Y-axis contract inversion, header
macro collisions, ABI aliasing UB, the PipeWire per-session leak, the missing handshake
deadline, fail-open pin parsing, and teardown-on-error paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 12:26:18 +00:00
enricobuehler 3ea096ace9 feat: M4 groundwork — lumen/1 client connector in the C ABI + SwiftUI client scaffold
ci / rust (push) Has been cancelled
The shared-core architecture pays off: platform clients now link ONE Rust library that
does the entire lumen/1 protocol, and only add decode/present/input on top.

lumen-core:
- client.rs (quic feature): NativeClient — QUIC handshake + UDP data plane + input
  datagrams on internal threads; embedder surface = connect / next_frame / send_input.
- abi.rs: lumen_connect / lumen_connection_next_au (borrow-until-next-call, matching
  lumen_client_poll_frame semantics) / lumen_connection_send_input / lumen_connection_mode /
  lumen_connection_close. Guarded in the generated header by LUMEN_FEATURE_QUIC (cbindgen
  [defines] mapping), so the checked-in header is stable across feature sets.
- error.rs: append-only LumenStatus additions Timeout (-9) and Closed (-10).
- TESTED end-to-end through the C ABI: in-process lumen/1 host, lumen_connect pulls 25
  byte-verified frames, sends input, closes (m3.rs::c_abi_connection_roundtrip).

Apple client (clients/apple — SCAFFOLD, written on Linux, first Xcode build pending):
- scripts/build-xcframework.sh: cargo per Apple target → universal staticlib + header
  (LUMEN_FEATURE_QUIC pre-defined) + modulemap → LumenCore.xcframework.
- Package.swift (LumenKit) + Swift sources: LumenConnection (ABI wrapper), AnnexB
  (in-band VPS/SPS/PPS → CMVideoFormatDescription, Annex-B → AVCC CMSampleBuffers with
  DisplayImmediately), StreamView (SwiftUI over AVSampleBufferDisplayLayer — stage-1
  presenter that hardware-decodes compressed HEVC itself), InputCapture (GCMouse raw
  deltas + GCKeyboard HID→VK).
- README.md is the full handoff for the next (Mac-side) agent: build steps, ABI contract,
  first-light test recipe against the Linux host, stage-2 (VT+Metal pacing) plan, and the
  known host-side gaps (single-session m3-host, no lumen/1 audio yet, gamepad kinds not
  yet routed in m3's injector, seed-stage trust).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:28:41 +00:00
enricobuehler 68f2b19cca Merge main (management REST API) into m1-lumen-core
ci / rust (push) Has been cancelled
Resolutions: serve() keeps main's AppState::new() with our persisted-pairing load folded
into it; main.rs keeps both the m3 and mgmt modules; mgmt's test LaunchSessions gain the
new appid field; Cargo.lock re-resolved. Full gate green (92 tests, clippy, fmt).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 07:03:41 +00:00
enricobuehler 5b0d84acd0 feat: M3 — lumen/1 native streaming: real video at client mode + input over QUIC datagrams
The native protocol now does the real thing, end to end:

- Hello carries the client's requested mode; the host creates a NATIVE virtual output at
  exactly that size/refresh (same vdisplay backends as the GameStream path) and streams
  NVENC HEVC through the M1 Session (GF(2^16) Leopard FEC + AES-GCM, QUIC-negotiated).
- Input rides QUIC DATAGRAMS — encrypted, congestion-managed, no ENet retransmission
  spikes — decoded into lumen_core InputEvents and fed to the session's input injector.
- Frames are stamped with the capture wall clock; the reference client computes per-frame
  capture→reassembled latency percentiles and writes a playable .h265.
- m3-host gains --source synthetic|virtual + --seconds; the client gains --mode WxHxFPS,
  --out, --input-test (scripted mouse/keyboard datagrams).

VALIDATED live (gamescope session, xev nested): client requested 1280x720@120 → host
created gamescope at that mode → 1680/1680 frames over 14s, zero loss, valid HEVC;
pipeline latency p50 0.83ms / p95 1.2ms / p99 1.3ms (capture→encode→FEC→crypto→UDP→
reassembled, same-host clock); 176 input datagrams sent → injector (GamescopeEi) → 164
X events observed inside the nested session.

Known follow-on: slice-level sub-frame pipelining needs the NVENC SDK directly (libavcodec
emits whole AUs only) — the next big latency lever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 06:56:47 +00:00
enricobuehler de3123038f feat: M3 seed — the lumen/1 native protocol: QUIC control plane + reference client (Phase 5)
The first end-to-end run of lumen's own protocol, past the GameStream compatibility layer.

- lumen-core/src/quic.rs (behind the `quic` feature): the lumen/1 handshake — Hello/Welcome/
  Start as length-prefixed LE binary on one QUIC bi-stream. Welcome carries the COMPLETE
  data-plane Config: mode, FEC scheme incl. GF(2^16) Leopard (inexpressible in GameStream),
  shard sizing, AES-GCM key + per-direction salt, data UDP port. Plus quinn endpoint helpers
  (self-signed server; accepts-any client — pinning lands with the trust model) and framed
  async IO. Round-trip unit-tested.
- lumen-host m3-host: serves one lumen/1 session — QUIC handshake, then a NATIVE thread
  (no async on the frame path — design invariant) streams deterministic 64KB test frames
  through the hardened M1 Session over UdpTransport.
- lumen-client-rs: from scaffold to working reference client — connects, negotiates, brings
  up the client Session over UDP, reassembles + FEC-recovers + byte-verifies every frame.

VALIDATED END-TO-END on localhost: 300/300 frames verified, 0 mismatches, through
QUIC-negotiated GF(2^16) FEC + AES-GCM over real UDP sockets. M4 (decode+present) builds on
this exact client skeleton.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:33:40 +00:00
enricobuehler 1eeb35a723 feat: M2 — host productionization: app catalog, persistent pairing, quit semantics, systemd (Phase 4)
- gamestream/apps.rs: an app catalog (loaded from ~/.config/lumen/apps.json, with defaults:
  Desktop + gamescope entries when gamescope/steam/vkcube are installed). /applist renders
  it; /launch?appid=N selects the entry; RTSP PLAY resolves it and the stream honors the
  app's compositor + nested command — so a Moonlight client picks "Steam" and gets a
  gamescope session at its native resolution, or "Desktop" for the KWin/GNOME desktop.
- Persistent pairing: the paired-client cert allow-list now survives restarts
  (~/.config/lumen/paired.json), saved on each successful pairing, loaded at boot.
- Quit semantics: /cancel now actually stops the media threads (streaming/audio flags),
  tearing down the per-session virtual output / gamescope process via the capturer's RAII.
- scripts/lumen-host.service (systemd user unit) + scripts/host.env.example (config file
  consumed by it) — the host runs as a managed service instead of an SSH shell.

Smoke-tested: serve boots, /applist serves the catalog (Desktop + vkcube gamescope entry
auto-detected on this box). GNOME backend validation still pending gnome-shell install;
wlroots vdisplay backend deliberately deferred (not in the priority compositor trio).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:23:53 +00:00
enricobuehler 826da9968e feat: M2 — Vulkan bridge: TRUE zero-copy for gamescope's LINEAR dmabufs (Phase 3)
The missing zero-copy path is closed. NVIDIA's EGL won't sample LINEAR and the CUDA driver
rejects raw dmabuf fds — but Vulkan imports dmabufs (VK_EXT_external_memory_dma_buf) and
exports OPAQUE_FD memory that CUDA officially imports. zerocopy/vulkan.rs (ash):

  dmabuf fd → VkBuffer (import cached per fd) → vkCmdCopyBuffer (GPU) →
  exportable VkBuffer → vkGetMemoryFdKHR(OPAQUE_FD) → cuImportExternalMemory → CUdeviceptr

The exportable buffer + CUDA mapping are per-resolution; per frame it's one GPU buffer copy
(fence-waited) + one pitched CUDA copy into the encoder's pool. No CPU touches pixels.
EglImporter::import_linear now routes through the bridge (lazy init; any failure still falls
back to the CPU mmap path). cuda::ExternalDmabuf gained import_owned_fd for the
Vulkan-exported fd.

Validated live: gamescope 720p120 → "Vulkan→CUDA exportable staging buffer ready
size=3686400" (exactly 1280*720*4), full-rate 122.7 fps, decoded frame pixel-correct
(vkcube). KWin's tiled EGL path regression-tested intact. NV12 negotiation dropped — moot
now that BGRx is fully zero-copy.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:18:38 +00:00
enricobuehler c39615d7d1 perf: M2 — split the data plane into encode | send threads with batched, paced sends (Phase 2)
stream_body no longer sends: each frame's packet batch goes over a depth-2 bounded queue to a
dedicated send thread, so a send spike can never stall capture/encode (a full queue drops the
NEWEST batch — FEC/RFI covers the client — rather than ever blocking). The sender ships packets
with sendmmsg (≤64/syscall: ~375 syscalls/s instead of ~24k at 5K@240) in 16-packet chunks
paced across ~3/4 of the frame interval — microburst shaping for real links without per-packet
sleep jitter. Client-gone detection moved to the sender (clears `running`); the LUMEN_VIDEO_DROP
FEC test knob moved with the send path. Loopback-tested: batches arrive complete and
byte-identical through the paced sendmmsg path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:13:34 +00:00
enricobuehler 6521146abc feat: M2 — gamepad input: uinput virtual X-Box pads + rumble back-channel (Phase 1)
Full controller path for the SteamOS-like session, mirroring Sunshine byte-for-byte
(wire formats verified against moonlight-common-c + Sunshine source; ioctl numbers and
struct layouts verified by compiling against this box's <linux/uinput.h>, locked in with
const asserts):

- gamestream/gamepad.rs: decode MULTI_CONTROLLER (magic 0x0C, mixed BE-size/LE-body) incl.
  the Sunshine buttonFlags2 extension (paddles/touchpad/Misc — our appversion already
  advertises Sunshine, so clients send it) and CONTROLLER_ARRIVAL (0x55000004); build the
  0x010B rumble plaintext (with the mandatory 4-byte filler). Unit-tested.
- inject/gamepad.rs: VirtualPad clones the kernel xpad identity ("Microsoft X-Box 360 pad",
  045e:028e, exact button/axis codes + absinfo) so SDL/Steam/Proton match their built-in
  mapping with zero config. GamepadManager creates/destroys pads from activeGamepadMask
  (hotplug), emits button transitions + axes (+Y-up → evdev +Y-down negation, D-pad as
  HAT0X/Y) per frame. Rumble: non-blocking FF pump answers UI_BEGIN/END_FF_UPLOAD/ERASE
  (games block in EVIOCSFF until answered), tracks effects with replay expiry + FF_GAIN,
  mixes to (low, high) motor levels, dedups.
- control.rs: channel_limit 8 → 0x30 — Moonlight sends gamepad input on ENet channel
  0x10+n, so the old limit silently discarded ALL controller input. Gamepad events route to
  the manager; rumble is sealed with the client's detected GCM scheme direction-flipped
  (V2 marker 'H?', own seq counter) and sent on the control peer every service tick.
- scripts/60-lumen.rules: udev rule (Sunshine-style) granting the input group /dev/uinput.

Live validation needs the udev rule installed (root-only /dev/uinput on this box) + a
Moonlight client with a controller; everything else is gated and unit/static-checked.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 23:09:16 +00:00
enricobuehler be18a782b1 feat: M2 — GNOME/Mutter virtual-display backend (RecordVirtual) + preferred-mode negotiation
Third compositor on the VirtualDisplay seam, via Mutter's direct D-Bus APIs (the
gnome-remote-desktop headless model, no portal grant): RemoteDesktop.CreateSession →
ScreenCast.CreateSession({remote-desktop-session-id}) → Session.RecordVirtual (creates a
virtual monitor) → Start → PipeWireStreamAdded(node_id). A keepalive thread owns the zbus
connection (sessions die with it — RAII teardown); select with LUMEN_COMPOSITOR=mutter or
XDG_CURRENT_DESKTOP=GNOME.

Mutter sizes its virtual monitor FROM the PipeWire format negotiation, so VirtualOutput
gains preferred_mode (w, h, refresh_hz), threaded into the consumer's format pods as the
default size/framerate. KWin/gamescope set it too (their outputs are already exact-size;
the preference just confirms it) — both regression-tested intact.

Compile/clippy/test clean; live validation needs gnome-shell installed (then
`gnome-shell --headless` + m0 --source kwin-virtual with LUMEN_COMPOSITOR=mutter).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-09 22:48:53 +00:00