Commit Graph

42 Commits

Author SHA1 Message Date
enricobuehler 0324719b6e feat(host/windows): USO batched send for the GameStream video plane
The GameStream video sender did one send() syscall per packet on Windows
(the #[cfg(not(target_os="linux"))] sendmmsg_all fallback), capping
throughput at high packet rates. Wire it to UDP Send Offload (the Windows
analogue of Linux GSO) so each paced 16-packet burst goes out in one
WSASendMsg(UDP_SEND_MSG_SIZE) syscall instead of 16, preserving the
microburst pacing.

Expose a reusable punktfunk_core::transport::send_uso_all (Windows-only)
that reuses the proven native-plane USO primitive (send_one_uso + the uso
on/off latch + uso_unsupported), with the same uniform-size guard and
≤512-segment chunking as UdpTransport::send_gso. It returns how many leading
packets it sent via USO; the GameStream sendmmsg_all sends any remainder
(USO off via PUNKTFUNK_GSO=0, a size-mixed burst, or a frame's short final
packet) with per-packet send. On-wire packet boundaries are unchanged.

Resolves #4 in docs/apollo-comparison.md. Linux build unaffected;
punktfunk-core type-checks for x86_64-pc-windows-msvc. Host Windows compile
deferred to CI / dev box.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 10:21:33 +00:00
enricobuehler e99a1aea43 fix(apple): resolve QoS priority inversions + two Swift concurrency warnings
apple / swift (push) Successful in 55s
ci / rust (push) Successful in 1m31s
android / android (push) Successful in 1m48s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 33s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
ci / bench (push) Successful in 1m35s
decky / build-publish (push) Successful in 11s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m19s
flatpak / build-publish (push) Successful in 4m2s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m22s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m39s
Priority inversions (Thread Performance Checker): the Apple client drains every
plane on .userInteractive threads (video pump, audio, gamepad feedback) and
connects on a .userInitiated Task, but the connector's producer threads ran at
the default QoS — so a high-QoS consumer parked waiting on a lower-QoS producer.
Pin the connector's producers (outer worker thread, all tokio runtime threads via
on_thread_start, and the data-plane spawn_blocking pump) to .userInteractive on
Apple so they match the consumers. #[cfg(target_vendor = "apple")] helper using
the existing libc dep; no-op off Apple, no Swift-side change (no latency
regression).

GamepadFeedback.swift: the init's MainActor hop captured self implicitly-strong
while the inner $active sink captured it weakly — capture [weak self] in the hop
too (the sink stays weak to avoid the retain cycle).

StreamPump.swift: the @Sendable pump-thread closure captured the non-Sendable
AVSampleBufferDisplayLayer. enqueue/flush are documented thread-safe and only the
pump thread drives it after start(), so assert that with nonisolated(unsafe).

cargo build/test/clippy/fmt green (core + host); xcframework rebuilt; swift build
+ iOS/tvOS targets clean with both warnings gone. Runtime confirmation of the
inversion warnings needs a GUI run under Xcode's Thread Performance Checker.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 22:48:10 +02:00
enricobuehler bbabc04bca feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an
8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the
capture losing the stream across a secure-desktop transition.

Protocol (punktfunk-core/quic.rs):
- Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`,
  both as optional trailing bytes (back-compat). client-rs advertises 10-bit via
  PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives
  the native clients). Regenerated punktfunk_core.h.

Windows host:
- 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND
  PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10,
  pixelBitDepthMinus8).
- HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to
  an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit
  (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI
  (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the
  BGRA path. Reacts dynamically to the HDR toggle.
- Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s,
  repeating the last good frame between attempts (format-tagged last_present). During a
  secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the
  send loop for seconds so the client timed out and disconnected — now the session stays
  fed (frozen) until the desktop returns. Also seeds a black frame on recovery.

Apple client (PunktfunkKit):
- Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010,
  and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged.
  Switches automatically on a mid-session HDR toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 20:28:52 +00:00
enricobuehler c830246037 feat(host/windows): UDP send offload + NVENC 2-way split-encode (1 Gbps+ / 5K@240)
apple / swift (push) Successful in 53s
audit / cargo-audit (push) Failing after 1m7s
ci / rust (push) Failing after 40s
android / android (push) Successful in 2m11s
ci / web (push) Successful in 29s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
flatpak / build-publish (push) Successful in 3m42s
deb / build-publish (push) Successful in 6m58s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m30s
docker / deploy-docs (push) Successful in 30s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m10s
The Windows host couldn't sustain high-throughput / high-fps streams — two gaps vs the Linux host,
both found via live RTX 4090 measurement (PERF timing + nvidia-smi per-engine attribution):

- UDP Send Offload (USO). punktfunk-core's UdpTransport sent one packet per `send` syscall on
  Windows (send_batch/send_gso were Linux-only), capping throughput at high packet rates. Add a
  Windows `send_gso` override using `WSASendMsg` + `UDP_SEND_MSG_SIZE` (the Windows analogue of
  Linux UDP GSO) via windows-sys — one syscall segments a coalesced <=512-segment super-buffer to
  the connected peer. On by default with auto-fallback (PUNKTFUNK_GSO=0 disables, error latches
  off); plugs into the existing paced send path. SO_SNDBUF (32MB) was already cross-platform.

- NVENC 2-way split-frame encoding. A single Ada NVENC session tops out ~0.8 Gpix/s, so 5K@240
  (1.77 Gpix/s) took ~8 ms/frame -> a ~125 fps ceiling at high motion (the in-game stutter). Set
  NV_ENC_INITIALIZE_PARAMS.splitEncodeMode = TWO_FORCED above ~1 Gpix/s (matching the Linux
  libavcodec split_encode_mode path) to use both 4090 encoders — measured ~8 ms -> ~4 ms/frame at
  throughput. Env override PUNKTFUNK_SPLIT_ENCODE; init-failure fallback disables it (e.g. H264).

Windows-only paths; Linux/macOS unaffected. Builds clean on x86_64-pc-windows-msvc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 16:52:59 +00:00
enricobuehler 27e58658af feat(launch): punktfunk/1 launch integration — client picks a title, host runs it
ci / web (push) Successful in 28s
ci / docs-site (push) Successful in 30s
apple / swift (push) Successful in 1m17s
ci / rust (push) Successful in 2m6s
ci / bench (push) Successful in 1m34s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m23s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m53s
docker / deploy-docs (push) Successful in 40s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m55s
Plan step 4 (plumbing + host behavior). A client can ask the host to launch a
library title on connect; the host resolves it against ITS OWN library and runs it
in the session — the client sends only the store-qualified id, never a command, so a
remote peer can't inject one.

- Protocol (quic.rs): `Hello.launch: Option<String>` (the GameEntry id). Appended
  after `name`; when launch is present but name absent, a zero-length name placeholder
  keeps the offset deterministic — so a Hello with neither field stays byte-identical
  to the bitrate-era 26-byte form (test-asserted). Old peers ignore it; new hosts
  decode None from old clients. Round-trip + back-compat + truncation tests.
- Host: `library::launch_command(id)` resolves id → command via the host's own library —
  `steam_appid` → `steam steam://rungameid/<appid>` (appid validated as digits, the only
  client-influenced part), `command` → the host-stored command verbatim (trusted, never
  from the client). m3.rs sets PUNKTFUNK_GAMESCOPE_APP from it before bringup, exactly
  as the GameStream /launch path does (one session at a time). Unit-tested incl. an
  injection-attempt guard. Takes effect on the bare-spawn gamescope path; a no-op on a
  shared desktop / attach-to-existing session.
- C ABI: `punktfunk_connect_ex4` adds `launch_id` (NULL = none); `_ex3` now delegates to
  it. Threaded through NativeClient::connect → WorkerArgs → Hello.
- client-rs gains `--launch ID` (headless testing); client-linux passes None (no picker
  yet). Header regenerated.

Next: the Apple library grid passes the picked id via punktfunk_connect_ex4.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 14:56:18 +00:00
enricobuehler fc30307a87 feat(abi): expose the host-resolved compositor to clients
ci / docs-site (push) Successful in 30s
apple / swift (push) Successful in 1m13s
ci / bench (push) Successful in 1m39s
ci / web (push) Successful in 30s
ci / rust (push) Successful in 2m3s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m46s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m23s
Add punktfunk_connection_compositor() (mirrors punktfunk_connection_gamepad): a
client getter for the compositor the host actually resolved for the session, read
from Welcome.compositor and threaded through NativeClient.resolved_compositor. The
Apple/Linux clients use it to enable the client-side cursor by default on gamescope
sessions, whose PipeWire capture carries no cursor (verified upstream). Header
regenerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 11:58:37 +00:00
enricobuehler 4d26f61e40 fix(net/gso): fall back to sendmmsg on EMSGSIZE instead of tearing down
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 30s
apple / swift (push) Successful in 1m15s
ci / rust (push) Successful in 2m6s
ci / bench (push) Successful in 1m35s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m56s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m23s
Enabling PUNKTFUNK_GSO on a host whose egress MTU is below our UDP segment size
made every GSO send return EMSGSIZE (code 90, "Message too long") — the kernel
validates each GSO segment against the device MTU at send time, which plain
sendmmsg does not. EMSGSIZE wasn't in gso_unsupported() (nor is_transient_io), so
it propagated as a fatal "send failed — stopping stream" and instantly killed
every session the moment GSO was on (observed live: connection fails instantly /
speed-test 0 Mbps).

Add EMSGSIZE to gso_unsupported() so it latches GSO off for the process and
finishes via sendmmsg — the standard "GSO not usable on this path" fallback.
Measured after: the same host+path does 1 Gbps at 0.0% loss over the real LAN via
sendmmsg (and the host send path sustains a 2 Gbps probe with send_dropped=0), so
GSO is a >2 Gbps optimization, not required for 1 Gbps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 01:06:41 +00:00
enricobuehler 16ccc7c876 fix(net): don't tear the stream down on a connected-UDP ICMP blip (ECONNREFUSED)
ci / web (push) Successful in 25s
ci / docs-site (push) Successful in 30s
apple / swift (push) Successful in 1m15s
ci / rust (push) Successful in 2m7s
ci / bench (push) Successful in 1m36s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m17s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m50s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m22s
Root cause of the Mac "session ended" at higher bitrates. The video data plane is
a *connected* UDP socket; with data-plane hole-punching the path can blip and the
kernel surfaces an asynchronous ICMP port-unreachable/reset as ECONNREFUSED /
ECONNRESET on a later send or recv. Both the host send loop and the client
poll_frame treated that as fatal and tore the session down:

    ERROR punktfunk_host::m3: send failed — stopping stream
      error=send_sealed: Io(ConnectionRefused, code 111)   <-- observed live

That also cascades: a transient ICMP makes the client's poll_frame bail and close
its data socket, which makes the host's next send get a *real* ECONNREFUSED, which
tears the host side down too — exactly the "broke at 500 Mbps+" report.

Fix: classify ECONNREFUSED/ECONNRESET alongside WouldBlock as transient (a lossy
drop / "no data this poll"), never a teardown, at every data-path send/recv site
(send, send_batch, send_gso, recv, recv_batch x2, recv_batch_x). FEC + the next
frame/RFI recover; if the peer is genuinely gone the QUIC control plane's
conn.closed() ends the session cleanly (no infinite "stream into the void").
This is the standard connected-UDP rule that ICMP errors are advisory — doubly
true with hole-punching. Adds is_transient_io() + a unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 00:54:10 +00:00
enricobuehler c2ae40ef9e feat(net/mac): default-on recvmsg_x batched Mac recv + GSO host + longer probe
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 31s
ci / rust (push) Successful in 2m6s
ci / bench (push) Successful in 1m35s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
apple / swift (push) Successful in 1m17s
docker / deploy-docs (push) Successful in 17s
deb / build-publish (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m50s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m27s
The Mac/iOS client's wall around ~380 Mbps on a 2.5 G path is the receive
drain, not the transport: a loopback speed-test pushes 380/600/1000 Mbps at
0.0% loss, but Darwin has no recvmmsg(2), so the macOS client was doing one
recv() syscall per packet — ~40-90k syscalls/s on one core. When the recv loop
can't drain fast enough the kernel socket buffer backs up and drops, which the
client sees as a sustained stream stalling/freezing in the 300-400 Mbps range
(and an immediate "session ended" when a 500 Mbps+ first keyframe bursts in).

- core/transport: flip recvmsg_x (the batched Darwin recv, ~30x fewer syscalls)
  from opt-in to default ON, opt-out via PUNKTFUNK_RECVMSG_X=0. Keeps the
  auto-fallback to the scalar loop on any unexpected syscall error. The Apple CI
  swift-test loopback now exercises this path by default.
- packaging/kde host.env: enable PUNKTFUNK_GSO=1 — UDP segmentation offload on
  the host send path (one sendmsg per ~64 packets), the dominant lever above
  ~1 Gbps. Already wired (send_sealed -> send_gso) with sendmmsg auto-fallback.
- apple SpeedTestSheet: lengthen the bandwidth probe 2 s -> 5 s so the measured
  number stops swinging wildly (50 vs 900 Mbps on the same link) — long enough
  for steady-state send + recv drain to settle. Matches host MAX_PROBE_MS.
- host capture: PUNKTFUNK_SYNTH_NOISE synthetic high-entropy source for
  reproducible throughput testing of the encode->FEC->send->recv path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-14 00:35:26 +00:00
enricobuehler 9338a8797d style: rustfmt the connect_via_punch match guard
ci / web (push) Successful in 29s
apple / swift (push) Successful in 1m21s
ci / docs-site (push) Successful in 33s
ci / rust (push) Successful in 2m4s
ci / bench (push) Successful in 1m39s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m53s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m28s
cargo fmt --all --check failed CI on the long match-arm guard in UdpTransport::connect_via_punch;
apply the formatter's wrapping. No behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 19:56:25 +00:00
enricobuehler 7ec91aec2d feat(punktfunk/1): cross-VLAN/NAT video via data-plane hole-punching
ci / web (push) Successful in 29s
ci / rust (push) Failing after 38s
ci / docs-site (push) Successful in 30s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 6s
apple / swift (push) Successful in 1m17s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 6s
deb / build-publish (push) Successful in 3m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m58s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m17s
The video data plane is a raw UDP socket separate from the QUIC control connection. On a flat LAN
the host can send straight to the client, but across NAT or a stateful inter-VLAN firewall the
unsolicited host→client video is rejected (ICMP port-unreachable → the session dies immediately,
while control/audio/input keep working since they ride the client-initiated QUIC). Observed live:
a client on 192.168.6.2 streaming from a host on 192.168.1.48.

Fix: client-initiated hole-punching. The client sends PUNCH_MAGIC datagrams from its data socket
to the host's advertised data port (Welcome.udp_port); that opens the firewall/NAT return path and
lets the host learn the client's OBSERVED source (the NAT-translated address, not the client's
reported private one). The host (UdpTransport::connect_via_punch) waits ≤2.5s for the first punch
and streams there, falling back to the client-reported address for clients that don't punch
(flat-LAN behaviour unchanged). The client keeps a low-rate keepalive so a stateful firewall's idle
timeout can't close the path during a static, low-bitrate scene. Wired into client-rs and the
NativeClient connector (covers the Linux + Apple clients; the Apple app needs an xcframework rebuild
to pick up the new core).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 18:46:15 +00:00
enricobuehler 2ebffe3457 perf(core): recvmsg_x batched receive on Apple (macOS client)
apple / swift (push) Failing after 1m2s
ci / rust (push) Failing after 1m11s
ci / web (push) Failing after 39s
ci / docs-site (push) Failing after 41s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m5s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (push) Successful in 4m30s
macOS/iOS have no recvmmsg(2), so the Mac client did one recv() syscall per
packet (non-allocating after the earlier fix, but still a syscall each — a
single-core wall at line rate that Moonlight avoids). Add the Darwin recvmsg_x(2)
batched-receive path (the recv counterpart of Linux recvmmsg): one syscall drains
up to RECV_BATCH datagrams into the reused ring. struct msghdr_x + the extern
aren't in the libc crate, so declared here (cfg target_vendor=apple).

Opt-in via PUNKTFUNK_RECVMSG_X (it's FFI we can't exercise off-Apple) with
auto-fallback to the tested scalar recv-loop on any unexpected error. Linux
recvmmsg + the non-Apple scalar loop are unchanged; apple.yml compiles the path.

Re GRO: Linux recv already batches via recvmmsg (32/syscall), so UDP GRO is only a
marginal add there and needs a recv-path redesign to split coalesced buffers —
deferred as low-ROI vs the Mac, which had no batching at all.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:52:39 +00:00
enricobuehler 9c86f667ca perf(core): in-place AES-GCM seal + reused wire-buffer pool (host send)
ci / web (push) Failing after 39s
ci / docs-site (push) Failing after 33s
apple / swift (push) Successful in 1m16s
ci / rust (push) Successful in 1m20s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
deb / build-publish (push) Successful in 3m3s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (push) Successful in 4m35s
The host sealed every packet with ~3 heap allocations: aes-gcm's convenience
encrypt() allocates the ciphertext Vec, seal_for_wire allocates the seq||ct||tag
wire Vec, and seal_frame allocated a fresh Vec<Vec<u8>> per frame. At line rate
(~250k–500k pkt/s for 2.5–5 Gbps) that's the single-core allocator wall.

- SessionCrypto::seal_in_place uses AeadInPlace::encrypt_in_place_detached to
  encrypt into the caller's buffer and write the detached tag at the end —
  byte-identical to seal's ciphertext||tag, no allocation (unit-tested for byte
  equality + decrypt).
- Session keeps a wire_pool the caller returns via reclaim_wires; seal_frame
  seals each packet in place into the reused buffers (clear() keeps capacity), so
  after warmup there's no per-packet ciphertext/wire allocation. paced_submit and
  submit_frame reclaim the pool after sending.

End-to-end encrypted/lossless multi-frame tests stay green (validates the pool
reuse doesn't corrupt across frames). Next: write packetize directly into a
contiguous send buffer (kills the remaining shard allocs + GSO's coalescing copy).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:47:38 +00:00
enricobuehler 448986f41c perf(core): UDP GSO send path (the multi-Gbps lever)
apple / swift (push) Successful in 1m16s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
ci / rust (push) Successful in 1m31s
deb / build-publish (push) Successful in 2m36s
ci / web (push) Failing after 36s
ci / docs-site (push) Failing after 32s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
rpm / build-publish (push) Successful in 4m38s
docker / deploy-docs (push) Successful in 17s
sendmmsg already batches syscalls but still builds one sk_buff per datagram —
the kernel-side wall above ~1 Gbps. UDP Generic Segmentation Offload hands the
kernel one big buffer it splits into gso_size datagrams, building ~1 GSO skb per
≤64 segments. Research (LWN/Cloudflare/Tailscale) measures ~2.4x throughput at
equal CPU and 17-44x fewer syscalls, and that sendmmsg batching alone is
insufficient — you need true segmentation offload.

Adds Transport::send_gso (default = send_batch) + a UdpTransport Linux override:
coalesces a frame's equal-size wire packets (shards are zero-padded to a constant
size, so a whole frame is one gso_size) into ≤64-segment sendmsg(UDP_SEGMENT)
calls. seal/send routes through it. Opt-in via PUNKTFUNK_GSO (new unsafe hot-path
code) with automatic fallback to sendmmsg on any GSO error (unsupported kernel/
path), latched per process. Loopback unit test validates the cmsg segmentation;
full session over loopback streams clean (0% loss). Linux-only; loopback/non-Linux
keep sendmmsg/scalar.

Next levers: in-place AES-GCM seal (kill per-packet allocs), UDP GRO on recv,
drop the sleep-pacing in favor of the kernel qdisc, jumbo MTU.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:29:51 +00:00
enricobuehler 6b5ee9f47b perf(core): batched non-allocating recv on Apple targets (macOS client wall)
apple / swift (push) Failing after 28s
ci / rust (push) Failing after 1m18s
ci / web (push) Failing after 47s
ci / docs-site (push) Failing after 35s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
docker / deploy-docs (push) Has been skipped
rpm / build-publish (push) Failing after 16s
deb / build-publish (push) Failing after 43s
The batched `recvmmsg` recv path was Linux-only; macOS fell back to the trait
default, which calls the scalar `recv` — a fresh `vec![0u8; 2049]` allocation
(plus zeroing and a copy) PER PACKET on the single receive thread. At line rate
that alloc/free churn, not the syscall, was the single-core wall: measured the
real Mac client topping out ~315 Mbps and dropping the session at 800, while a
Linux client (recvmmsg) held a clean 1 Gbps against the same host, and Moonlight
(batched recv) does 900 on the same Mac.

Add a `cfg(all(unix, not(linux)))` `recv_batch` that drains up to RECV_BATCH
datagrams per call with `libc::recv(MSG_DONTWAIT)` straight into the caller's
reused ring buffers — no per-packet allocation or copy. Still one syscall per
datagram (a future `recvmsg_x` batch would cut that too), but it removes the
dominant cost. Linux recvmmsg path and the Windows/loopback default unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:05:54 +00:00
enricobuehler c56b1b455a feat(punktfunk/1): request-IDR recovery for a wedged client decode
apple / swift (push) Successful in 1m17s
ci / rust (push) Failing after 31s
ci / web (push) Failing after 42s
ci / docs-site (push) Failing after 40s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 10s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 6s
docker / deploy-docs (push) Has been skipped
rpm / build-publish (push) Failing after 15s
deb / build-publish (push) Failing after 43s
Fixes the intermittent first-connect freeze. The host streams infinite GOP — one
opening IDR, then P-frames only (recovery keyframes just on loss) — so when the
client's decoder wedges on the cold first session (a lost/corrupt opening IDR, a
bad early P-frame) the picture stays frozen until the far-off next keyframe. The
client had no way to ask for one; now it does.

Add a RequestKeyframe control message (client -> host, reliable control stream),
mirroring Reconfigure:
- core: quic.rs RequestKeyframe (type 0x03) + roundtrip test; client.rs
  CtrlRequest::Keyframe + NativeClient::request_keyframe; abi.rs
  punktfunk_connection_request_keyframe (header regenerated).
- host: m3.rs decodes it in the control loop and signals the encode loop, which
  coalesces a burst and calls enc.request_keyframe() — wiring the existing
  NvencEncoder hook (force_kf -> next frame pict_type=I), the same recovery the
  GameStream path already had via force_idr.
- apple: PunktfunkConnection.requestKeyframe(); StreamPump (stage-1) requests on
  layer.status==.failed; Stage2Pipeline (stage-2) on a sync submit failure and on
  the async decode-error callback via a thread-safe KeyframeRecovery. All
  throttled to <=1/250ms (the decode stays wedged for several frames until the IDR
  lands, so per-frame requests would flood the control stream).

Self-healing: a lost recovery IDR is re-requested after the throttle; the host
coalesces bursts into a single IDR.

Validated: cargo fmt + clippy clean; core + host test suites green (incl. new
request_keyframe_roundtrip); swift build + test (39 passed); xcframework rebuilt
(all 5 slices), header regenerated with no unrelated drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 00:48:24 +02:00
enricobuehler dea749186d fix(quic/apple): QUIC keep-alive + reconnect input re-engage
ci / rust (push) Failing after 36s
ci / web (push) Failing after 51s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
ci / docs-site (push) Failing after 40s
apple / swift (push) Successful in 1m16s
docker / deploy-docs (push) Successful in 17s
Three native-client bugs isolated against a stock Moonlight client (which
stays connected / keeps input working under the same actions):

- Connection drops mid-stream: the quinn endpoints (host + client) ran with
  default transport config, so keep_alive_interval was OFF. Any quiet stretch
  (no input, audio muted/stalled, a capture hiccup, a mode change) let the
  idle timer expire and quinn closed the session -> next_au=Closed -> "Session
  ended". Moonlight's ENet sends keepalive pings; we sent nothing. Add a shared
  TransportConfig (keep-alive 4s under an explicit 20s idle timeout) to both
  endpoint::server_from_der and endpoint::client_pinned_with_identity.

- Reconnect input dead (macOS): the session-start auto-capture one-shot was
  consumed even when engageCapture(fromClick:false) was refused (window not key
  yet at the instant of reconnect), with no retry -> capture stayed off and
  input never forwarded. Clear the one-shot only on a successful engage, and
  retry on NSWindow.didBecomeKey. Stays scoped to session start, so it does not
  resurrect the rejected auto-grab-on-activation behavior.

- Reconnect input dead (iOS): wasCapturedOnResign leaked stale state across
  sessions and the foreground-restore could fire before this session's
  InputCapture was wired (setForwarding no-ops on nil). Reset it per session in
  start() and guard the didBecomeActive restore on inputCapture != nil.

Validated: cargo build -p punktfunk-core --features quic; swift build;
swift test (39 passed, 0 failures); xcframework rebuilt (all 5 slices), no
ABI/header drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 23:07:54 +02:00
enricobuehler 5f088c6f56 fix(client-linux): absolute mouse was dropped — pack the surface size in flags
ci / web (push) Failing after 45s
ci / rust (push) Successful in 1m1s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
apple / swift (push) Successful in 1m18s
ci / docs-site (push) Failing after 42s
docker / deploy-docs (push) Successful in 17s
The MouseMoveAbs wire contract packs the client coordinate-space size
as (width << 16) | height in `flags` (same as touch); injectors
normalize against it and drop the event when it is zero. The GTK
client sent flags=0, so KWin's libei path refused every motion
(`emitted=false`) — found via the first real user test from
home-worker-3.

- ui_stream: send_abs() packs the negotiated mode into flags for
  motion + click-position events.
- core input.rs: document the contract on MouseMoveAbs itself (it was
  only implied by TouchDown's doc).
- client-rs --input-test: add a MouseMoveAbs sweep so the absolute
  path stays covered — Moonlight and the Mac client only send relative
  motion, which is why this gap survived every prior live test.

Validated live against serve --native: kind=MouseMoveAbs emitted=true.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 20:50:53 +00:00
enricobuehler 99b4de32ee feat(pairing): delegated approval (§8b-1) — approve an unpaired device from the console
ci / web (push) Failing after 40s
ci / rust (push) Successful in 1m6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 13s
apple / swift (push) Successful in 1m20s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
ci / docs-site (push) Failing after 46s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 18s
docker / deploy-docs (push) Successful in 16s
An identified-but-unpaired device that knocks on a pairing-required host is now
held as a pending request the operator approves from the web console — pairing it
with no PIN fetched out of band — instead of a flat reject.

- core: Hello gains an optional trailing device name (len u8 || UTF-8, ≤64,
  same trailing-back-compat pattern as compositor/gamepad/bitrate). client-rs
  --name sends it; the connector sends None (fingerprint-derived label).
- native_pairing: in-memory pending queue (note_pending dedups by fingerprint,
  evicts the least-recently-active past a 32 cap, 10-min TTL); approve_pending
  pins the fingerprint, deny drops it. Names are sanitized (strip control/ANSI/
  bidi — untrusted wire input); add()/remove() roll back in-memory on a persist
  failure; pairing clears any stale pending knock.
- m3: the require_pairing gate records the knock (sanitized label) before
  rejecting; anonymous (certless) clients record nothing.
- mgmt: GET /native/pending, POST /native/pending/{id}/approve (optional {name})
  and /deny; OpenAPI + tests; docs/api/openapi.json regenerated.
- web: a "Waiting for approval" section on the Pairing page (live-poll, Approve/
  Deny, error-surfaced via QueryState); en+de strings.
- Also completes an in-progress NativeClient Sync refactor (receivers behind
  per-plane mutexes) that was left half-applied in the tree.

Adversarially reviewed (4 lenses + 3-vote verify); the confirmed findings are
fixed here. Validated live on the GNOME box: knock (with a wire name, and a
malicious ANSI/bidi name that got neutralized) → pending → approve → the same
identity streams real video. Full workspace tests + clippy + fmt green; web tsc
clean. Roadmap §8b-1 marked done; §8b-2 (peer-push approval) is the client
follow-up. See docs-site pairing page.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 19:14:05 +00:00
enricobuehler 60ccbfdcf7 style: cargo fmt --all under rustfmt 1.9 (Rust 1.96)
Comment reflow only — the pinned "stable" channel moved and CI checks
formatting with the current toolchain.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 12:28:13 +00:00
enricobuehler 6b4de5d738 feat(client/speedtest): request the host's full 3 Gbps probe ceiling
The Apple speed test asked for only 400 Mbps, capping the measured throughput
there and hiding the link's real headroom. Request the host's full
MAX_PROBE_KBPS (3 Gbps) instead, and raise the recommended-bitrate clamp from
500 Mbps to the host's 2 Gbps session ceiling so a fast measurement yields a
usable recommendation.

Also fix the stale caps left when the host clamps were raised (b8a33e2): the
resolved-bitrate range and the probe doc comments (abi.rs, client.rs,
regenerated header), plus the section 9 roadmap copy, now read 3 Gbps probe /
2 Gbps session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 14:08:19 +02:00
enricobuehler 1c94f46be8 style(quic): format-stable clock test assert (message to comment)
ci / rust (push) Has been cancelled
The clock_offset test's assert_eq! carried an inline message that newer rustfmt
wants to wrap while the repo's committed style keeps such asserts on one line.
Move the message to a comment and use bare assert_eq! so it formats identically
under any rustfmt version — no new fmt-check ambiguity from this addition.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 12:01:56 +00:00
enricobuehler 7eb9a927cf feat(connector): expose host clock offset over the C ABI for glass-to-glass
ci / rust (push) Has been cancelled
Factor the client-side skew handshake into a shared core helper (quic::clock_sync
-> ClockSkew) so both the reference client and the embeddable connector use one
implementation. NativeClient now runs the handshake at connect (right after Start,
before the control task takes the stream) and stores the host-client offset; it's
read over the C ABI via punktfunk_connection_clock_offset_ns (i64 ns, host minus
client; 0 = no correction / old host).

This is the substrate the Apple client needs for the decode->present (glass-to-
glass) term: stamp present time, add the offset to express it in the host's
capture clock, subtract the AU pts_ns. client-rs drops its local clock_sync copy
and uses the shared helper (behavior unchanged; validated locally).

Regenerates include/punktfunk_core.h. Roadmap section 12 + status updated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 11:44:54 +00:00
enricobuehler 05bc9ab22c feat(latency): wall-clock skew handshake for cross-machine latency measurement
ci / rust (push) Has been cancelled
ClockProbe/ClockEcho on the QUIC control stream — 8 NTP-style rounds right after
Start; the min-RTT sample gives the host-client clock offset (clock_offset_ns
estimator in punktfunk-core). The client adds the offset to its receive instant
before differencing against the AU pts_ns, so the capture->reassembled latency
percentiles are valid across machines (skew_corrected=true), not just same-host.
Back-compat: an old host that doesn't answer the probe times out and the client
falls back to a shared-clock assumption (skew_corrected=false).

Host adds one ClockProbe dispatch arm in the control task; the client runs
clock_sync after Start, before the --remode/--speed-test tasks take the stream.

Validated cross-LAN (GNOME box -> dev box): offset ~ -1.57 ms (reproducible),
rtt ~140 us, p50 1.30 ms skew-corrected capture->reassembled — the offset is
exactly the systematic error the handshake removes. Unit tests for the message
codecs and the min-RTT offset estimator.

Roadmap §12: skew handshake done; remaining for true glass-to-glass is the Apple
client present-stamp (decode->present) plus the host render->capture term.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 11:20:20 +00:00
enricobuehler 86f463cf71 fix(housekeeping): unaligned read UB + recv-drop parity; dedup mmsghdr; doc fixes
ci / rust (push) Has been cancelled
From a bug-hunt + unsafe-audit pass (4 reviewers + adversarial verify). It
confirmed ZERO real bugs in the recent batched/paced data-plane work — these are
the surfaced cleanups + one genuine soundness fix:

- SOUNDNESS (reduce unsafe): inject/gamepad.rs::pump_ff did `ptr::read` of an
  InputEventRaw (align 8, holds a timeval) out of a 1-aligned [u8; N] buffer — UB
  per the reference (x86_64 tolerates it, but it can miscompile under LTO). Use
  ptr::read_unaligned + a SAFETY note. Zero behavior change.
- recv parity: recv_batch (recvmmsg) didn't drop an oversized/truncated datagram
  the way scalar recv does — poll_frame now skips a message whose len fills the
  buffer (> MAX_DATAGRAM_BYTES), matching recv's `n >= RECV_BUF` drop. (AEAD
  already rejected these on encrypted sessions; this restores the documented
  invariant on the batched path.)
- dedup unsafe FFI: factor the identical mmsghdr-from-iovec construction out of
  send_batch + recv_batch into one `mmsghdrs()` helper — the raw-pointer
  scaffolding + its lifetime SAFETY note now live in one place.
- docs: TARGET_SOCKBUF no longer calls paced sending future work (it landed,
  m3.rs::paced_submit); gamescope.rs input is no longer "(TODO)" (wired +
  live-validated); the PUNKTFUNK_PERF `wire_mbps` field is renamed `tx_mbps` and
  noted as attempted/sealed bytes (send_dropped shows what didn't reach the wire).

Full suite (35 + loopback round-trip + 6) + clippy + fmt green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 23:04:23 +00:00
enricobuehler 2f4f92a804 feat(1gbps): batched client recv via recvmmsg (increment C)
ci / rust (push) Has been cancelled
Final increment of the 1 Gbps data-plane rework — the recv counterpart of the
sendmmsg work. The client recv path did one recvfrom + one Vec allocation per
packet (and the pump's 300µs idle sleep could let packets pile up at line rate).

- Transport gains recv_batch(&mut [Vec<u8>], &mut [usize]) -> count; default is
  a single scalar recv into out[0] (loopback + non-Linux).
- UdpTransport overrides it on Linux with recvmmsg (MSG_DONTWAIT) draining up to
  N datagrams per syscall into the caller's reused buffers — no per-packet alloc.
- Session::poll_frame owns a lazily-allocated recv ring (RECV_BATCH=32) and
  consumes it one packet at a time across calls, refilling with one recvmmsg when
  drained. Encapsulated: the punktfunk-client-rs + NativeClient pumps are
  unchanged, and draining a batch per syscall means the 300µs sleep no longer
  underdrains. Added UdpTransport::local_addr (used by the test, generally handy).

~125k → ~4k recv syscalls/sec at line rate, zero per-packet recv allocation.
Verified: new recv_batch_drains_over_loopback test (50 datagrams drained intact
via recvmmsg) + the existing loopback round-trip now runs through the batched
poll_frame; full suite (35 + round-trip + 6) + clippy + fmt green.

Decode-in-place (kill the per-packet open_from_wire alloc) is a separate later
optimization. With A (sendmmsg) + B (paced send) + C (recvmmsg), the native data
plane is batched + paced end to end.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 22:39:51 +00:00
enricobuehler 10a932d013 feat(1gbps): pace per-frame sends so high-bitrate frames don't burst-drop
ci / rust (push) Has been cancelled
Increment B of the send-path rework — the actual fix for "freezes get more
common over ~150 Mbps, no image at all at 400 Mbps" on the native path. Cause:
the encoder emits a frame and submit_frame blasted ALL its packets at once into
the NIC; a real link drops the line-rate burst (host send buffer EAGAINs), and
under infinite GOP one dropped frame freezes the decode until the next keyframe.
(The speed-test probe showed 0 drops at 400 Mbps because the probe is self-paced;
real video wasn't.)

Adaptive pacing, no extra thread, no regression:
- Session splits into seal_frame (FEC + packetize + seal → wire packets, no
  send) and send_sealed (one batched sendmmsg of a chunk, counts drops);
  submit_frame is now their composition (synthetic + probe paths unchanged).
- virtual_stream's paced_submit seals a frame then sends it in 16-packet chunks
  spread over ~90% of the time until the next frame is due. At 60 fps desktop
  (fast encode → lots of slack) the frame spreads across the interval → no NIC
  burst → no freeze. At 240 fps@5K (encode ≈ interval → ~0 slack) the budget
  collapses and every chunk goes out immediately → never slower than before.

Core suite (34 + loopback round-trip + 6) + clippy + fmt green. The seal/send
split is covered by the existing loopback tests; the pacing is host timing,
verified by review (live-test needs a real NIC — your Mac at a raised bitrate).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 22:15:52 +00:00
enricobuehler c24b571e37 feat(1gbps): batched send via sendmmsg (Transport::send_batch)
ci / rust (push) Has been cancelled
First increment of the 1 Gbps send-path rework (the measured bottleneck): the
native data plane did one send() syscall per packet — at ~125k pkt/s (1 Gbps
wire) that burns a core on syscalls. Port the proven GameStream sendmmsg path
into the core Transport seam.

- Transport gains `send_batch(&[&[u8]]) -> usize` (count handed to the kernel;
  caller counts the rest as send-buffer drops). Default = the scalar send loop
  (loopback transport + non-Linux).
- UdpTransport overrides it on Linux with `sendmmsg` (64 datagrams/syscall);
  the connected socket needs no per-message address. Non-blocking-aware: a full
  send buffer yields a short count / EAGAIN, and we stop + report what went out
  rather than block or retry (same lossy, FEC-protected contract as send()).
- Session::submit_frame seals every shard then hands the whole frame to
  send_batch in ONE call instead of looping send() — ~64x fewer syscalls per
  frame on the native + GameStream-over-core paths; send_dropped accounting
  preserved (total - sent).

~125k → ~2k syscalls/sec at 1 Gbps line rate. Verified: new loopback-UDP test
send_batch_delivers_over_loopback (100 batched packets arrive intact, datagram
boundaries preserved); full core suite + clippy + fmt green.

Next increments: a paced send thread (microburst shaping so a real NIC doesn't
drop line-rate bursts) and recvmmsg on the client.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 21:55:22 +00:00
enricobuehler b8a33e21a2 feat(1gbps): raise bitrate/probe clamps + socket buffers, count send-buffer drops
ci / rust (push) Has been cancelled
First step of 1 Gbps+ readiness (the whole point of the GF(2^16) Leopard FEC):
make 1 Gbps configurable and its dominant failure mode observable, before the
real transport work (sendmmsg + paced encode|send split) lands.

Investigation (6-way) verdict: we're ~halfway, and it's mostly clamps plus one
real piece of work. The integer/type path, FEC (a 1 Gbps frame is only a few
hundred shards in one GF(2^16) block, far under the 65535 ceiling), AES-GCM
(AES-NI, ~10-25x headroom), and the M1 reassembler bounds (fully derived from
the negotiated FecConfig) are ALL already 1 Gbps-ready and untouched.

This commit (the configurable + observable foundation):
- m3.rs: MAX_BITRATE_KBPS 500_000 -> 2_000_000 (2 Gbps headroom over the 1 Gbps+
  target); MAX_PROBE_KBPS 1_000_000 -> 3_000_000 (probe can demonstrate headroom
  ABOVE the session cap so a client can confidently pick a 1 Gbps+ bitrate).
- transport/udp.rs: TARGET_SOCKBUF 8 MB -> 32 MB (a multi-MB IDR keyframe burst
  no longer fills the buffer); scripts/99-punktfunk-net.conf bumped to match.
- Observability: Transport::send now returns Ok(true|false) (false = WouldBlock
  send-buffer drop, previously a silent Ok(())). Session counts these as a new
  `packets_send_dropped` stat (distinct from recv-side packets_dropped) — in
  Stats, the C ABI PunktfunkStats (header regenerated), a PUNKTFUNK_PERF periodic
  wire-Mbps + drop dump in virtual_stream, and the speed-test probe completion
  log. This is the dominant 1 Gbps+ loss mode and was invisible.

Loopback-verified: a probe now runs at 1.2 Gbps target (no longer truncated to
1 Gbps) with the drop counter live. NOT yet a sustained-1-Gbps proof — the
single-send()-per-packet native path is the next, real piece of work (port the
proven GameStream sendmmsg + paced send thread into the core Transport).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 20:45:49 +00:00
enricobuehler 74819b1be8 feat(punktfunk/1): negotiable encoder bitrate + bandwidth speed-test probe
ci / rust (push) Has been cancelled
Two related additions to the native protocol, host-side (the client side of
each is exposed over the C ABI so the platform clients can wire it up).

Bitrate negotiation
- Hello/Welcome carry `bitrate_kbps` (appended trailing-byte field, back-compat:
  old peers decode 0 = host default). The client requests a rate; the host
  clamps it to [500 kbps, 500 Mbps] (or its 20 Mbps default when 0) and echoes
  the resolved value in Welcome. Replaces the hardcoded 20 Mbps NVENC bitrate in
  m3.rs — threaded through virtual_stream → build_pipeline → open_video, applied
  on the initial mode and every reconfigure rebuild.
- C ABI: punktfunk_connect_ex3(..., bitrate_kbps, ...) (ex2 delegates with 0);
  punktfunk_connection_bitrate() reads the resolved value.

Speed test (bandwidth probe)
- New typed control messages ProbeRequest{target_kbps,duration_ms} (0x20) /
  ProbeResult{bytes_sent,packets_sent,duration_ms} (0x21), plus a FLAG_PROBE
  packet flag. The client asks the host to burst zero-filled, FLAG_PROBE-tagged
  access units over the data plane at a target goodput for a duration (clamped
  ≤ 1 Gbps / ≤ 5 s), pacing by a bytes-allowed budget; video pauses for the
  burst. The host reports what it actually sent; the client measures received
  bytes + window → goodput and loss. Probe filler is never fed to the decoder
  (diverted in the connector pump and the reference client's poll loop).
- The host control task now multiplexes Reconfigure + ProbeRequest (inbound)
  and ProbeResult (outbound) over select!; a probe channel reaches the
  data-plane thread (both virtual and synthetic sources).
- Connector: NativeClient::request_probe()/probe_result() with an internal
  accumulator; C ABI punktfunk_connection_speed_test() +
  punktfunk_connection_probe_result() → PunktfunkProbeResult.
- punktfunk-client-rs gains `--bitrate KBPS` and `--speed-test KBPS:MS` (its own
  loop measures + logs goodput/loss) for loopback verification.

Validated on loopback (synthetic source): a 20 Mbps / 2 s probe measured
20050 kbps at 0% loss, bitrate negotiated (0→20000 and 50000→50000), and the
interleaved probe AUs were correctly excluded from frame verification
(mismatched=0). Wire codecs + trailing-byte back-compat have unit tests. C
header regenerated.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 18:44:47 +00:00
enricobuehler 1d605fb781 feat(gamepad): controller discovery + client-negotiated pad type + rich DualSense end to end
The Apple client grows full gamepad support and punktfunk/1 learns to negotiate
the virtual pad type:

- Protocol: Hello carries a GamepadPref byte (offset 21, the same trailing-byte
  back-compat pattern as the compositor; echoed resolved in Welcome at 54).
  Host precedence: explicit client choice > PUNKTFUNK_GAMEPAD env > Xbox 360,
  DualSense (UHID) only where available. ABI: punktfunk_connect_ex2 +
  punktfunk_connection_gamepad (connect_ex delegates; ABI_VERSION stays 2 — the
  trailing byte IS the compat mechanism). punktfunk-client-rs gets --gamepad.

- Swift client: GamepadManager (app-lifetime discovery + selection — Settings
  lists every controller with capabilities/battery/"In use"; exactly ONE pad
  forwards as pad 0, auto = most recently connected, or pinned), GamepadCapture
  (snapshot-diff button/axis events, DualSense touchpad + ~250 Hz motion on the
  rich-input plane, held state released on switch/deactivate/stop),
  GamepadFeedback (rumble → CoreHaptics per-handle engines; lightbar →
  GCDeviceLight; player LEDs → playerIndex; adaptive-trigger blocks → the
  table-driven DualSenseTriggerEffect parser → GCDualSenseAdaptiveTrigger,
  exact for the 10-zone positional modes). The pad type auto-resolves from the
  physical controller at connect time, user-overridable in Settings.

- Host DualSense fixes surfaced by adversarial review against hid-playstation /
  SDL / Nielk1 ground truth: input-report sensor/touch offsets were off by one
  (the kernel read garbage motion + phantom touches), the L2/R2 trigger blocks
  were swapped (the report is right-trigger-first), feedback now gates on the
  report's valid-flags (a plain rumble write no longer blanks lightbar/
  triggers), and the touchpad rescale clamps to the advertised ABS_MT extents.

- Tests: Hello/Welcome trailing-byte back-compat, pick_gamepad precedence,
  byte-exact input-report layout, valid-flag gating, per-mode trigger-parser
  table (incl. packed 3-bit zones), wire conversions, and a scripted loopback
  feedback burst (PUNKTFUNK_TEST_FEEDBACK=1) asserted through the xcframework
  on the rumble + HID-output planes.

Validated: cargo test/clippy/fmt green on macOS + Linux (61 host tests), swift
build/test green, test-loopback.sh green, tvOS/iOS targets compile. DualSense
motion sign/scale is derived from the calibration blob, not yet live-verified
(constants isolated in GamepadWire).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 16:28:33 +02:00
enricobuehler 0f333460ec fix(core): grow UDP socket buffers — fixes 4K/5K video freezing on one frame
The data-plane UDP sockets used the OS default buffer (~208 KB on Linux, similar
on macOS), which is smaller than a single high-resolution frame burst: a
5120×1440 keyframe is ~130 packets the encode|send thread hands to sendmmsg at
once. The burst overflows the buffer — EAGAIN on the host send (now dropped, was
fatal) or a silent drop on the client recv — and because the data plane runs
infinite-GOP, one lost frame breaks every subsequent reference and the decode
freezes on the last good frame until an RFI refresh that may never catch up.
Symptom: connect at 5120×1440, see ONE frame, then a frozen image (audio + input
keep working — those ride QUIC, not this socket).

Set SO_SNDBUF/SO_RCVBUF to 8 MB (clamped by the OS to net.core.{w,r}mem_max on
Linux / kern.ipc.maxsockbuf on macOS); warn if the grant lands far below target so
an undersized host is diagnosable. The client side matters most — the SAME
UdpTransport backs the Apple client's data plane via the C ABI, and macOS grants
multi-MB buffers without any sysctl, so a rebuilt client stops losing frames.

Validated live, bazzite→client at 5120×1440: was 1319/1500 frames (12% loss →
freeze), now 1500/1500 @60 and 5279/5279 @240 (split-encode active), zero
mismatches, p50 1.9–3.4 ms. Host send buffer was still capped at 416 KB and lost
nothing — the loss was purely the client recv buffer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 13:28:02 +00:00
enricobuehler b6f4164454 fix(core): drop video packets on a full UDP send buffer, don't fail the session
UdpTransport sockets are non-blocking, so a momentarily-full kernel send buffer
makes socket.send return WouldBlock (EAGAIN). submit_frame propagated that as a
fatal error, tearing the whole punktfunk/1 session down — observed when attaching
to an already-running source (a headless Steam session) that emits frames at full
rate the instant capture connects: the first burst saturates the tx queue and the
session dies before a single frame reaches the client.

The data plane is lossy + Leopard-FEC-protected and runs infinite-GOP with RFI
keyframes, so the real-time-correct response to a full tx queue is to DROP the
packet (the next frame / FEC recovers) — exactly what the recv path already does
for WouldBlock. Blocking would queue stale frames and add latency. Loopback/M1
paths are unaffected (LoopbackTransport never blocks; M1 tests stay green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 12:55:12 +00:00
enricobuehler 59edeedf07 feat(dualsense): Phase C/D/E — virtual DualSense routing + 0xCC/0xCD planes + C ABI
ci / rust (push) Has been cancelled
PUNKTFUNK_GAMEPAD=dualsense now routes a session's gamepad through a real virtual
DualSense (UHID + hid-playstation) end to end:

- host: a `PadBackend` enum (m3.rs) selects `GamepadManager` (uinput xpad, default)
  or the new `DualSenseManager` (dualsense.rs) per session. The manager keeps each
  pad's full DsState so touchpad + motion (rich-input plane) persist across
  button/stick frames, and services the !Send /dev/uhid fd only on the input thread
  (which cycles <=4ms, so the GET_REPORT init handshake completes).
- feedback: `service()` now returns `DsFeedback { hidout, rumble }`. Motor rumble
  stays on the universal 0xCA plane (so non-DualSense clients still feel it; manager
  dedups change); lightbar / player LEDs / adaptive-trigger effects ride the new
  0xCD HID-output plane (host->client) as `HidOutput`.
- rich input: touchpad contacts + motion ride the 0xCC plane (client->host) as
  `RichInput`, applied via `DualSenseManager::apply_rich` (merged with button state;
  touch normalized 0..65535 -> the touchpad resolution).
- connector + C ABI: `NativeClient::next_hidout` / `send_rich_input`, exported as
  `punktfunk_connection_next_hidout` (-> PunktfunkHidOutput) and
  `punktfunk_connection_send_rich_input` (<- PunktfunkRichInput); header regenerated.
- reference client: `--rich-input-test` drives the DualSense touchpad + motion and
  logs the 0xCD feedback that comes back.

Validated live on-box: a synthetic-source m3-host + client-rs created the real
kernel DualSense, drove 0xCC, and decoded 12 live 0xCD events (the kernel's actual
lightbar/trigger init reports) with the data plane unaffected (600/600 frames).
Adversarial review fixes folded in: the input loop no longer skips the rich drain +
feedback pump on a dropped gamepad event, and the touch contact id is clamped to its
slot. Remaining: the Apple client renders triggers/rumble on a real DualSense.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 08:36:12 +00:00
enricobuehler 6575dddac7 fix: keep the workspace green on macOS after the mic/touch/rich-input batch
The new features were Linux-built only and broke the documented macOS gate
(cargo build/test/clippy --workspace) four ways, all fixed following the existing
platform-gating conventions:

- m3.rs: mic_service_thread split into the Linux worker and a non-Linux stub that
  drains and drops (sessions still count the datagrams) — opus/PipeWire are
  Linux-gated deps, same pattern as audio_thread.
- punktfunk-client-rs: the new `opus` dependency moved into the Linux target table and
  --mic-test gated with a warn-and-skip stub (only the synthetic-tone test rig needs
  the encoder; the mic uplink itself is portable).
- gamestream/audio.rs: SAMPLE_RATE import gated to any(linux, test) (the frame_sizing
  test uses it everywhere, the data plane only on Linux).
- tests/c_abi.rs: the harness's macOS link flags gained Security + CoreFoundation —
  the quic feature now pulls rustls's platform verifier into the staticlib.

Also: two clippy match-ref-pats lints in the new rich-input/HID-output decoders
(clippy -D warnings is the repo gate), the regenerated punktfunk_core.h committed (the
checked-in copy predated the rich-input/HID-output constants — CI fails on drift), and
web's inlang cache dir gitignored.

cargo build/test/clippy/fmt --workspace: green on macOS, 122 tests passing.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-11 09:07:48 +02:00
enricobuehler 5f6d2cb88b feat(proto): variable-length rich-input (0xCC) + HID-output (0xCD) datagrams
ci / rust (push) Has been cancelled
Foundation for rich DualSense support (roadmap #5). The fixed 18-byte InputEvent (0xC8) can't
hold the DualSense touchpad/motion or HID feedback, so two new variable-length, kind-tagged
datagram families join the side-plane (mouse/keyboard/gamepad/touch keep the fixed InputEvent):

- RICH_INPUT_MAGIC 0xCC, client→host: `[0xCC][kind][fields]`
    Touchpad{pad,finger,active,x,y}  (x/y normalized 0..65535; host scales to the pad)
    Motion{pad, gyro[3], accel[3]}   (raw i16, straight into the DualSense report)
- HIDOUT_MAGIC 0xCD, host→client: `[0xCD][kind][pad][fields]` — the rich analog of the 0xCA
  rumble datagram (rumble stays on 0xCA):
    Led{rgb}  PlayerLeds{bits}  Trigger{which, effect}  (adaptive-trigger params to replay)

`RichInput`/`HidOutput` enums with encode/decode; unknown kinds + truncation decode to None
(forward-compatible). +2 round-trip/disjointness tests; quic suite green, clippy/fmt clean.
Wiring (host UHID device, capture, C ABI, client) lands in following commits.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 06:55:04 +00:00
enricobuehler dc375668ee feat: touch input — TouchDown/Move/Up + host libei ei_touchscreen injection
ci / rust (push) Has been cancelled
Roadmap #5 (touch, ahead of the XL UHID DualSense work). Touch fits the existing 18-byte
InputEvent: code = touch id, x/y = client pixels, flags = (w<<16)|h — the same absolute
mapping as MouseMoveAbs.

- core: InputKind::{TouchDown=9, TouchMove=10, TouchUp=11} + from_u8 + roundtrip test.
- host inject/libei.rs: request the RemoteDesktop Touchscreen device type, bind the Touch
  capability, and inject ei_touchscreen down/motion/up (one event = one frame, per the
  protocol rule), mapping coordinates into the device region like the abs pointer. wlroots
  has no virtual-touch protocol wired — no-ops there.
- client-rs --touch-test: drags a synthetic finger (touch id 0) in a circle.

Validated live on headless KWin: the portal GRANTS the Touchscreen device type
(Keyboard|Pointer|Touchscreen), proving the request path — but KWin's EIS server creates no
touchscreen *device*, so touch currently no-ops on this KWin (now logged once, not silent).
The injection code is correct and will land on a backend that exposes ei_touchscreen
(gamescope / a newer compositor / the real touch-client path). Workspace green, clippy/fmt
clean, +1 unit test.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:38:44 +00:00
enricobuehler 0755c823a5 feat: mic passthrough — client microphone → host virtual PipeWire source
ci / rust (push) Has been cancelled
The inverse of the host→client audio path: the client's mic, Opus-encoded, rides a
new 0xCB QUIC datagram to the host, which decodes it into a virtual PipeWire
Audio/Source its apps can record from (voice chat, etc.).

Protocol (punktfunk-core):
- MIC_MAGIC 0xCB + encode/decode_mic_datagram (mirror of the 0xC9 audio datagram).
- NativeClient::send_mic(seq, pts_ns, opus) over a new outbound channel + worker task
  (mirror of send_input); C ABI punktfunk_connection_send_mic for native clients.

Host:
- audio::VirtualMic + PwMicSource: a PipeWire output stream tagged media.class=
  Audio/Source (Direction::Output) — a recordable microphone node, fed decoded PCM.
- MicService: host-lifetime owner of the source + Opus decoder (mirror of
  InjectorService / the audio capturer slot); lazily opened, persists across sessions,
  self-heals. The per-session datagram reader now demuxes 0xCB→mic / 0xC8→input over a
  single read_datagram loop (two loops would race).
- Adaptive jitter buffer in the producer: primes to ~3 consumer quanta before emitting,
  so the 5 ms push / N ms pull clock skew never underruns — without it ~58% of output
  was silence; with it, glitch-free across consumer quanta.

Client: punktfunk-client-rs --mic-test streams a synthetic 440 Hz Opus tone as the mic
uplink (opus dep added) for end-to-end validation without a real microphone.

Validated live on headless KWin: client tone → host source → pw-record shows the
punktfunk-mic Audio/Source node, 440 Hz dominant (Goertzel power 20.7 vs <0.001
elsewhere), RMS 0.179 ≈ the ideal 0.177, 0.3–0.4% silence at both 256 ms and 10 ms
consumer quanta. Tests +1 (mic datagram roundtrip); workspace green, clippy/fmt clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:15:07 +00:00
enricobuehler 6fdf7d1511 feat: client-selectable compositor (protocol → host → client → C ABI → mgmt → web)
A client can now request which compositor backend the host drives its virtual
output on (gamescope/KWin/Mutter/wlroots). The host honors the request if that
backend is available, else falls back to auto-detect and reports the resolved
choice back — wire-compatible both directions (no ABI bump).

Protocol (punktfunk-core):
- New CompositorPref (config.rs): Auto|Kwin|Wlroots|Mutter|Gamescope with
  u8/name mappings. Appended as one optional byte to Hello (client preference)
  and Welcome (host's resolved choice). Both decoders already tolerate trailing
  bytes, so old↔new interop is preserved — ABI_VERSION stays 2. Round-trip +
  back-compat (truncated-message) tests.
- C ABI: punktfunk_connect_ex(compositor) + PUNKTFUNK_COMPOSITOR_* constants;
  punktfunk_connect delegates with AUTO, so the existing symbol is unchanged.
  NativeClient::connect / worker_main thread the preference through.

Host:
- vdisplay::available() enumerates usable backends via cheap, side-effect-free
  probes (KWin zkde global, gamescope binary+version, GNOME/Sway env), plus
  Compositor id/label/as_pref/from_pref/all helpers.
- m3 handshake resolves the preference to a concrete backend during the
  handshake (pick_compositor pure + resolved logging), reports it in Welcome,
  and threads it into virtual_stream (replacing the unconditional detect()).
- mgmt GET /v1/compositors lists every backend with availability + the
  auto-detected default (OpenAPI regenerated).

Client:
- punktfunk-client-rs --compositor NAME; logs the host's resolved choice from
  the Welcome ("session offer … compositor=…").

Web console:
- Host page gains a Compositors card (availability + default badges) via the
  codegen'd useListCompositors hook; en/de strings added.

Also fixes a pre-existing, env-dependent test-isolation bug:
mgmt::tests::paired_clients_list_and_unpair seeded the real
~/.config/punktfunk/paired.json (AppState::new loads it), so a real
GameStream-paired client leaked into body[0] on a dev box — now cleared first.

Live-validated against headless KWin: --compositor kwin honored, --compositor
mutter falls back to kwin (available=[kwin, gamescope]), resolved choice
round-trips to the client. Tests: +6 (wire/back-compat, resolution precedence,
endpoint); workspace green, clippy/fmt clean, C ABI harness PASS at abi_version=2,
web typecheck + build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:45:41 +02:00
enricobuehler ff4fe197be fix(punktfunk/1): adversarial-review fixes — SPAKE2 pairing, renegotiation hardening, +more
ci / rust (push) Has been cancelled
Triaged the multi-agent review of the renegotiation + pairing + Sway + AV1/surround batch
(1 critical, 11 major/minor confirmed). Fixes:

CRITICAL — PIN pairing was offline-brute-forceable. The HMAC-of-PIN proof let an active
MITM who terminates the TOFU ceremony recover the 4-digit PIN by offline dictionary search
(all other inputs observable) and forge a correctly-bound proof. Replaced with **SPAKE2**
(balanced PAKE, `spake2` crate) + key-confirmation MACs, binding both cert fingerprints as
the SPAKE2 identities: an attacker gets exactly ONE online guess, no offline search, and
mismatched cert views (a real MITM) never reach a shared key. Also reworked the UX to an
"arming PIN" — one PIN per arming window shown at host startup (the SPAKE2 client needs the
PIN to build its first message, so it can't be minted per-connection). Validated live:
wrong PIN rejected in 0.1s, right PIN pairs + persists + the paired identity streams.

Pairing hardening: `--allow-pairing`/`--require-pairing` must arm pairing (default rejects
unsolicited ceremonies); per-host cooldown bounds online guessing; the client flushes its
CONNECTION_CLOSE so a refused ceremony can't wedge the sequential host for the full timeout;
atomic (temp+rename) paired-store writes.

Protocol: control/pairing messages use a distinct CTL_MAGIC (PKFc) — fully disjoint from
the positional Hello namespace (a future abi_version can't be misparsed as a control
message); all typed decodes are length-exact. ABI_VERSION → 2 (punktfunk_connect signature
gained the identity params; header regenerated).

Renegotiation: drain the reconfig channel to the NEWEST mode (one rebuild, not one per
stale step); validate refresh_hz; build the new pipeline BEFORE dropping the old so a
rebuild failure keeps the session on its current mode instead of killing it.

GameStream: packetDuration snaps to {5,10} (an in-between value isn't a legal Opus frame
size and would kill audio). Sway: chooser file moved to $XDG_RUNTIME_DIR (was a fixed
world-writable /tmp path — DoS / capture-misdirection by another local user).

Swift: fixed two compile breakers in the new pairing/identity APIs (Int32 status .rawValue,
UInt cap cast). New SPAKE2 + namespace-disjointness + pairing-roundtrip unit tests; the
in-process pairing test now also exercises the arming PIN + cooldown. 114 tests green,
clippy -D warnings clean (both feature sets), fmt, C-ABI harness.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 16:26:48 +00:00
enricobuehler 4d26ac5c85 feat: punktfunk/1 — mid-stream mode renegotiation + PIN pairing ceremony
Renegotiation (no reconnect on resize): the handshake bi-stream stays open; the client
sends Reconfigure{mode} (typed post-handshake message), the host validates + acks
Reconfigured and rebuilds capture/encoder/virtual output at the new mode while the data
plane (keys, ports, FEC) runs untouched — the first new-mode AU is an IDR with in-band
parameter sets. NativeClient::request_mode / punktfunk_connection_request_mode; mode()
reflects the active mode. Validated live on KWin: one continuous stream, 225 frames
@1280x720 then 395 @1920x1080, ~90 ms pipeline rebuild (ffprobe shows both resolutions).

PIN pairing (mutual trust, kills TOFU MITM): clients get persistent self-signed
identities presented via QUIC client auth (generate_identity / client auth offered but
optional server-side — legacy clients still connect). Ceremony on the control stream:
PairRequest{name} → host shows a 4-digit PIN (log) + PairChallenge{salt} → client proves
with HMAC-SHA256(PIN‖salt, client_fp‖host_fp) — binding both certs means a MITM can't
forward a proof, single attempt per PIN, constant-time compare → PairResult; host
persists the fingerprint (~/.config/punktfunk/punktfunk1-paired.json), client pins the
host's. m3-host --require-pairing gates sessions on the paired set.
NativeClient::pair + punktfunk_pair/punktfunk_generate_identity in the ABI; reference
client: --pair PIN --name LABEL + auto-generated persistent identity, --remode for live
renegotiation testing. Swift wrapper: ClientIdentity/generateIdentity()/pair(),
requestMode()/currentMode(); README handoff updated.

Tested: reconfigure/pairing wire roundtrips, C-ABI mode switch ack, full in-process
ceremony (wrong PIN → Crypto, anonymous-vs-gate rejection, success → pinned session);
live wrong-PIN ceremony against the serving host (PIN logged, proof rejected).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 15:42:29 +00:00
enricobuehler bfd64ce871 rename: lumen → punktfunk, everywhere
ci / rust (push) Has been cancelled
Full project rename, decided 2026-06-10:
- Crates/binaries: punktfunk-core / punktfunk-host / punktfunk-client-rs.
- C ABI: punktfunk_* symbols, Punktfunk* types, include/punktfunk_core.h,
  PUNKTFUNK_FEATURE_QUIC guard (header regenerated; cbindgen renames updated, incl.
  PUNKTFUNK_BTN_*/PUNKTFUNK_AXIS_* wire constants).
- Protocol: punktfunk/1 — control-plane magic LMN1 → PKF1, nonce salt lmn1 → pkf1.
  WIRE BREAK: clients must be rebuilt from this revision.
- Env knobs: PUNKTFUNK_VIDEO_SOURCE / PUNKTFUNK_COMPOSITOR / PUNKTFUNK_ZEROCOPY / ….
- Host config dir: ~/.config/punktfunk (the box's dir was migrated in place — the
  persistent identity is unchanged, pinned fingerprints stay valid).
- Swift package: PunktfunkKit + PunktfunkCore.xcframework + PunktfunkConnection
  (Sources/PunktfunkClient app + tests renamed with it); build-xcframework.sh updated.
- scripts/: 60-punktfunk.rules, punktfunk-host.service; OpenAPI doc regenerated.

Also: scripts/headless/run-headless-kde.sh — full headless Plasma bringup. Root cause of
"desktop but no apps/settings" over the stream: plasmashell launched without
XDG_MENU_PREFIX=plasma-, so the launcher resolved a nonexistent applications.menu and
rendered an empty menu. The script sets the complete KDE session env (menu prefix,
KDE_FULL_SESSION, session version) and rebuilds ksycoca before starting plasmashell.

Gate: 97/97 tests, clippy -D warnings (both feature sets), fmt, C-ABI harness PASS,
zero lumen references left outside .git.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:11:59 +00:00