f37a304fba0c491670dd066a11e56cce88ade6dc
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
f37a304fba |
fix(core/speed-test): packet-level throughput + paced burst (kill the 0/100% cliff)
The punktfunk/1 speed test was unusable across every client/host: at the start of a burst a little data got through, then everything read as dropped (~10 MB total). Two compounding bugs:
1. Receive side measured throughput from fully-reassembled FLAG_PROBE *access units* only. The instant loss crossed the 20% FEC budget no AU completed, so the figure cliffed to 0 / 100% loss even though most bytes still arrived: a binary cliff, not a graded measurement.
2. Send side blasted each filler AU (up to 256 KB ≈ 200 packets) into the socket buffer in one unpaced batch, unlike the real video path which paces. On a small buffer (e.g. the Steam Deck's 416 KB) a single AU overflowed it, so the test measured self-inflicted buffer overflow instead of the link.
Fixes:
- Host `run_probe_burst` keeps each AU a small (~16 KB) burst and paces by the byte budget, mirroring `paced_submit`; reports the WIRE packets the kernel accepted and the ones the send buffer dropped (stat deltas), separating host-side drops from link loss.
- `ProbeResult` gains `wire_packets_sent` + `send_dropped` (back-compat decode: a 21-byte pre-wire-stats result still decodes, new fields 0).
- Clients (probe + connector) count delivered traffic at the packet level via `session.stats()` deltas over the burst window, so throughput/loss degrade gracefully. Connector freezes the delivered figure when the host report lands so resumed video can't inflate it.
New `ProbeOutcome`/`PunktfunkProbeResult` fields: `host_drop_pct`, `wire_packets_sent`, `send_dropped`.
Validated on loopback (graded 142→1391 Mbps, host_drop/link_loss split correctly, no cliff) and live against the Deck: clean to ~200 Mbps goodput / 273 Mbps wire at 0% link loss, host send buffer the wall above that (the lever-#1 target).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
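The packet-level accounting described in f37a304fba can be sketched as follows. This is a minimal illustration, not the project's code: `StatsSnapshot`, `BurstReport`, and `burst_report` are hypothetical names, and it assumes `wire_packets_sent` counts packets the kernel accepted while `send_dropped` counts packets the host send buffer rejected.

```rust
/// Hypothetical snapshot of cumulative receive counters (stand-in for
/// whatever `session.stats()` returns; the real type is not shown in the log).
#[derive(Clone, Copy, Debug, Default)]
pub struct StatsSnapshot {
    pub packets_received: u64,
    pub bytes_received: u64,
}

/// Hypothetical graded result of one probe burst.
#[derive(Debug, PartialEq)]
pub struct BurstReport {
    pub goodput_mbps: f64,
    pub host_drop_pct: f64,
    pub link_loss_pct: f64,
}

/// Percentage helper that returns 0 instead of dividing by zero.
fn pct(part: u64, whole: u64) -> f64 {
    if whole == 0 {
        0.0
    } else {
        part as f64 * 100.0 / whole as f64
    }
}

/// Combine client-side stat deltas with the host's wire counters.
/// Assumption: packets attempted = wire_packets_sent + send_dropped.
pub fn burst_report(
    before: StatsSnapshot,
    after: StatsSnapshot,
    wire_packets_sent: u64,
    send_dropped: u64,
    window_secs: f64,
) -> BurstReport {
    // Deltas over the burst window, counted per packet rather than per access unit.
    let delivered_packets = after.packets_received.saturating_sub(before.packets_received);
    let delivered_bytes = after.bytes_received.saturating_sub(before.bytes_received);

    // Host-side drops never reached the wire, so they are not link loss.
    let host_drop_pct = pct(send_dropped, wire_packets_sent + send_dropped);

    // Link loss: packets that reached the wire but were not delivered.
    let lost_on_link = wire_packets_sent.saturating_sub(delivered_packets);
    let link_loss_pct = pct(lost_on_link, wire_packets_sent);

    let goodput_mbps = if window_secs > 0.0 {
        delivered_bytes as f64 * 8.0 / window_secs / 1e6
    } else {
        0.0
    };

    BurstReport { goodput_mbps, host_drop_pct, link_loss_pct }
}
```

Because every figure is a ratio of packet counts, partial delivery produces a partial number instead of collapsing to 0 / 100% once a whole access unit fails to reassemble.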
|
|
480dee863d |
feat(host/gamescope): custom-resolution Game-Mode streaming on the Steam Deck
The Steam Deck (SteamOS) ships its OWN gaming session: `gamescope-session.target` driven by `/usr/lib/steamos/gamescope-session`, not Bazzite's `gamescope-session-plus`. That script `exec gamescope`s with HARDCODED physical-panel args (`-w 1280 -h 800 -O '*',eDP-1`) and launches Steam via a SEPARATE `steam-launcher.service`, so the existing managed-session path (which assumes session-plus) couldn't honor the client's mode; an attach captured the panel's native 1280x800 instead.
Add a SteamOS branch to the managed-session path: detect it, write a `gamescope` PATH-shim that rewrites the hardcoded args to `--backend headless -W <client> -H <client> -r <hz>`, drop a transient user `gamescope-session.service.d` override pointing PATH at the shim + the mode, then RESTART the whole target so `steam-launcher.service` brings Steam up IN the headless gamescope at the client's resolution. Attach to the one fresh node (the restart kills any prior gamescope, so no stale-node attach). Restore-on-disconnect removes the override + restarts the target back to the physical panel (debounced; skipped if the user switched to a desktop session). All user-level (`systemctl --user`), no root.
Also widen `build_pipeline_with_retry` to 8 attempts (~90s): a host-managed gamescope session cold-starting Steam Big Picture takes 30-60s to first frame, and a first-connect timeout would tear down the warm session (forcing another cold start on reconnect). Permanent failures still fail fast via `is_permanent_build_error`.
Validated live on a Steam Deck: Game Mode auto-detected, host takes over headless at the client's mode (720p / 1080p), Steam Big Picture streamed glass-to-glass to the Mac at the requested resolution. Single-tenant (concurrent clients at different modes still thrash; a follow-up).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
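The argument rewrite performed by the `gamescope` PATH-shim in 480dee863d could look roughly like the sketch below. The function name `rewrite_gamescope_args` is hypothetical, and the policy of dropping `-w`, `-h`, and `-O` together with their values while passing everything else through is an assumption; the commit message only states which args are hardcoded and what they are rewritten to.

```rust
/// Hypothetical sketch of the shim's rewrite step.
/// Assumption: `-w`, `-h` and `-O` each take exactly one value and are
/// dropped; all other arguments are forwarded unchanged.
pub fn rewrite_gamescope_args(
    original: &[String],
    width: u32,
    height: u32,
    hz: u32,
) -> Vec<String> {
    // Start with the headless backend and the client's requested mode.
    let mut out: Vec<String> = vec![
        "--backend".to_string(),
        "headless".to_string(),
        "-W".to_string(),
        width.to_string(),
        "-H".to_string(),
        height.to_string(),
        "-r".to_string(),
        hz.to_string(),
    ];

    let mut iter = original.iter();
    while let Some(arg) = iter.next() {
        match arg.as_str() {
            // Physical-panel args: skip the flag and the value that follows it.
            "-w" | "-h" | "-O" => {
                iter.next();
            }
            // Anything else is passed through untouched.
            _ => out.push(arg.clone()),
        }
    }
    out
}
```

Keeping the rewrite a pure function over the argument list makes it easy to check against the exact strings the SteamOS session script passes before wiring it into a shim.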
|
|
0f7f1be3c3 |
fix(core/transport): treat ENOBUFS as a transient drop, not a fatal error
WiFi drivers (e.g. ath11k on the Steam Deck) return ENOBUFS (not EAGAIN/EWOULDBLOCK) when the tx queue is momentarily full. Rust maps ENOBUFS to ErrorKind::Uncategorized, so `is_transient_io` (which only matched WouldBlock/ConnRefused/ConnReset) treated it as a real error and tore the whole stream down on a single transient burst.
This presented as a vicious Heisenbug on the Deck: the native host streamed flawlessly on loopback and under a debugger (anything slow enough not to fill the small ~416 KB wlan0 buffer), but died at full rate cross-machine over WiFi, with a flaky hang-or-SIGKILL because tx-queue-full is probabilistic. Diagnosed live via a forced core dump (gdb on the hung core): the data-plane thread had bailed on a fatal send error.
Treat ENOBUFS (and asynchronous network-path blips ENETUNREACH / EHOSTUNREACH / ENETDOWN / EHOSTDOWN) as a lossy drop like WouldBlock; FEC + the next frame recover. Validated: 6/6 back-to-back cross-machine streams over the Deck's WiFi, host stable, p50 ~4.4 ms (one run dropped 4/300 frames *gracefully*, 0 mismatched; the fix working as intended).
Also surface a data-plane bind/hole-punch failure directly in punktfunk1 (it was previously only reported after teardown, which a stall could swallow entirely).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
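The classification change in 0f7f1be3c3 can be illustrated with a small predicate. This is a sketch under stated assumptions rather than the project's `is_transient_io`: the name `is_transient_send_error` is hypothetical, and the errno constants are the Linux values, hardcoded here only to keep the example free of external crates.

```rust
use std::io;

// Linux errno values (assumption: Linux target; a real implementation
// would more likely take these from the `libc` crate).
const ENETDOWN: i32 = 100;
const ENETUNREACH: i32 = 101;
const ENOBUFS: i32 = 105;
const EHOSTDOWN: i32 = 112;
const EHOSTUNREACH: i32 = 113;

/// Returns true when a send error should be treated as a lossy drop
/// instead of tearing the stream down.
pub fn is_transient_send_error(err: &io::Error) -> bool {
    // Kinds the commit message says were already matched.
    if matches!(
        err.kind(),
        io::ErrorKind::WouldBlock
            | io::ErrorKind::ConnectionRefused
            | io::ErrorKind::ConnectionReset
    ) {
        return true;
    }
    // ENOBUFS has no stable ErrorKind, so match on the raw OS code.
    matches!(
        err.raw_os_error(),
        Some(ENOBUFS | ENETDOWN | ENETUNREACH | EHOSTDOWN | EHOSTUNREACH)
    )
}
```

Matching on `raw_os_error()` sidesteps the `ErrorKind` mapping entirely, which is the part that made ENOBUFS look like an unknown fatal error.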
|
|
6922e1c467 |
feat(host): VAAPI codec probe + AMD/Intel packaging + neutral logs (Phase 3)
apple / swift (push) Successful in 55s
ci / rust (push) Failing after 1m35s
ci / web (push) Successful in 28s
windows-host / package (push) Successful in 2m23s
ci / docs-site (push) Successful in 30s
android / android (push) Successful in 3m24s
deb / build-publish (push) Successful in 3m22s
decky / build-publish (push) Successful in 14s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m48s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m50s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 18s
Polish for AMD/Intel support:
- GameStream serverinfo advertises only codecs the GPU can ACTUALLY encode on
the VAAPI backend (probed once by opening a tiny encoder per codec). AV1
encode is narrow (Intel Arc/Xe2+, AMD RDNA3+/RDNA4) and an old iGPU may lack
HEVC, so a Moonlight client never negotiates a codec the encoder can't open.
NVENC/Windows keep the Moonlight-validated static mask. Validated on a Radeon
780M: h264/h265/av1 all probe true -> mask unchanged (65793).
- Packaging: Recommends mesa-va-drivers + intel-media-va-driver (deb) /
mesa-va-drivers + intel-media-driver (rpm) so the auto-selected VAAPI backend
works out of the box on AMD/Intel; NVIDIA boxes can --no-install-recommends.
(Fedora note: stock mesa-va-drivers disables HEVC/AV1 -- needs the freeworld
variant from RPM Fusion.)
- De-NVIDIA-fy the user-facing encoder log/context strings ("open NVENC" ->
"open video encoder") now that VAAPI is a first-class backend.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
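Building the advertised codec mask from probe results, as described in 6922e1c467, might be sketched like this. The names `CodecProbe` and `codec_mode_mask` are hypothetical, and the assignment of bits to codecs is an assumption inferred from the validated value 65793 = 0x10101 (three set bits for h264/h265/av1).

```rust
// Assumed bit layout: 0x1 + 0x100 + 0x10000 = 0x10101 = 65793.
// Which bit belongs to which codec is an assumption, not taken from the log.
pub const MODE_H264: u32 = 0x0_0001;
pub const MODE_HEVC: u32 = 0x0_0100;
pub const MODE_AV1: u32 = 0x1_0000;

/// Hypothetical result of opening a tiny encoder once per codec.
#[derive(Clone, Copy, Debug)]
pub struct CodecProbe {
    pub h264: bool,
    pub hevc: bool,
    pub av1: bool,
}

/// Advertise only the codecs whose encoder actually opened.
pub fn codec_mode_mask(probe: CodecProbe) -> u32 {
    let mut mask = 0;
    if probe.h264 {
        mask |= MODE_H264;
    }
    if probe.hevc {
        mask |= MODE_HEVC;
    }
    if probe.av1 {
        mask |= MODE_AV1;
    }
    mask
}
```

With all three probes true the mask equals the static value, which matches the "mask unchanged (65793)" observation on the Radeon 780M; a GPU without AV1 encode would simply clear one bit.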
|
|
112a054c35 |
perf(host): latency hardening for the game-vs-encode GPU contention collapse
Verified, prioritized analysis in docs/host-latency-plan.md (multi-agent investigation + adversarial verification). Lands the two low-risk tiers:
Tier 2B, Linux scheduling hygiene:
- boost_thread_priority now nices the capture/encode (-10) and send (-5) threads on Linux (setpriority, best-effort; no-op without CAP_SYS_NICE), and the wrong "gamescope caps the game" doc-comment is corrected.
- CUDA context created with CU_CTX_SCHED_BLOCKING_SYNC (frees a core on the shared box instead of busy-spinning on completion).
- Copies moved off the default stream onto a per-thread highest-priority CUDA stream (cuStreamCreateWithPriority, graceful NULL-stream fallback) with a per-stream sync that no longer blocks on the other worker thread's in-flight copies. Stream priority is measure-then-keep (NVIDIA Linux may ignore it); never regresses.
Tier 3A, Windows session tuning (new session_tuning.rs, raw C-ABI FFI, no-op off Windows): once-per-process 1ms timer + DwmEnableMMCSS + HIGH priority class; per-thread MMCSS "Games" + keep-display-awake. Wired into both the native (boost_thread_priority) and GameStream (stream.rs) paths. We had zero session tuning before (Apollo streaming_will_start parity).
Tier 2A (Linux NV12 convert) is specified but intentionally not landed: it is colour-correctness-critical and needs A/B validation on a GPU box with a display (green-screen risk).
Builds + clippy + fmt green on Linux.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
9c8fa9340c |
refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes
apple / swift (push) Failing after 40s
audit / cargo-audit (push) Failing after 1m12s
windows-msix / package (push) Successful in 1m37s
windows / build (push) Successful in 1m14s
android / android (push) Successful in 4m48s
ci / web (push) Successful in 27s
ci / rust (push) Successful in 4m21s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 4m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 19s
deb / build-publish (push) Successful in 6m3s
flatpak / build-publish (push) Successful in 4m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m16s
docker / deploy-docs (push) Successful in 18s
Two bodies of work in one commit (the rename moved files the fixes also touched).
Naming/structure cleanup (pre-launch):
- Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host, m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source->Punktfunk1Options/Punktfunk1Source.
- Clients consolidated out of crates/ into clients/: punktfunk-client-rs->clients/probe (crate punktfunk-probe), client-linux->clients/linux, client-windows->clients/windows, punktfunk-android->clients/android/native (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI contract is unchanged). crates/ now holds only core + host.
- Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site, kept only in docs/implementation-plan.md. docs/m2-plan.md->docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated.
Client loss-recovery (video froze and never recovered after a brief drop):
- Export punktfunk_connection_frames_dropped through the C ABI (the core already tracked it for the client keyframe-recovery loop; it was never reachable from the ABI clients). Regenerated punktfunk_core.h.
- Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll frames_dropped and request a keyframe when it climbs -- the same loss-driven recovery Linux/Windows already had. Under infinite GOP the decoder silently conceals reference-missing frames, so the decode-error trigger rarely fires.
Apple rumble robustness (worked then went spotty -- DualSense + Xbox):
- Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio interruption / server reset) and drop the permanent `broken` latch on a transient drive failure; latch only when the controller truly has no haptics.
- Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging.
Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift.
Not runnable on this box (verify in CI): Gitea workflows, gradle/Android, flatpak, Swift/decky.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
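The loss-driven recovery described in 9c8fa9340c (poll `frames_dropped`, request a keyframe when it climbs) reduces to tracking the last observed counter value. The sketch below is illustrative only: `DropWatcher` and its methods are hypothetical names, and the counter is assumed to be cumulative and monotonically non-decreasing.

```rust
/// Hypothetical helper: remembers the last seen cumulative drop count
/// and reports when it has increased since the previous poll.
pub struct DropWatcher {
    last_seen: u64,
}

impl DropWatcher {
    /// Start from the counter value observed when the stream began.
    pub fn new(initial: u64) -> Self {
        Self { last_seen: initial }
    }

    /// Call once per poll with the current cumulative `frames_dropped`.
    /// Returns true when the caller should request a keyframe.
    pub fn should_request_keyframe(&mut self, frames_dropped: u64) -> bool {
        let climbed = frames_dropped > self.last_seen;
        // Always resync, so one loss event triggers one request.
        self.last_seen = frames_dropped;
        climbed
    }
}
```

A client's pump loop would feed this the value returned through the C ABI on each iteration. Under infinite GOP this counter is the reliable signal, since the decoder may conceal reference-missing frames without ever reporting a decode error.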