Commit Graph

4 Commits

Author SHA1 Message Date
enricobuehler 69609945a3 feat(clients): host/network split in every stats HUD (stats phase 2, client side)
Consumes the 0xCF host-timing plane (449a67c) on all four GUI clients: each
keeps a bounded pending ring of receipt samples keyed by pts, matches the
host's per-AU capture→sent reports against it, and the HUD equation becomes

  = host 3.1 + network 6.7 + decode 2.1 + display 2.3

falling back to the combined `= host+network …` term whenever no timing
matched the window (old host / datagram loss) — same total, one split
fewer, never a misleading zero. Apple additionally gains the split as the
only equation line under the stage-1 fallback presenter (receipt is
presenter-independent), a `nextHostTiming` wrapper with its own plane lock,
and a unit-tested `HostNetworkSplitter`; Android extends the JNI stats
array 16→18 doubles (0–15 unchanged); Windows/Linux thread the split
through `Stats` into the HUD and the headless/debug logs.

Docs updated: design/stats-unification.md Phase 2 → implemented (wire
format, fallback semantics), and the docs-site matrix's Sunshine "Host
processing latency" row is now a direct match (ours includes the paced
send; avg vs p50).

Verified here: linux client clippy -D warnings green on the live tree,
windows stub check + hand-verified diff, android cargo-ndk arm64 check
green, apple loopback test extended (needs the rebuilt xcframework + swift
test on the mac). On-glass: pending on all platforms.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 21:31:49 +00:00
enricobuehler 09a5957c6d feat(clients): unified stats vocabulary across every client + Moonlight comparison docs
One stat model everywhere (design/stats-unification.md): four measurement
points (capture/received/decoded/displayed), three stages that tile the
interval exactly, and a HUD that shows the addition explicitly —

  end-to-end 14.2 ms p50 · 19.8 p95 · capture→on-glass
  = host+network 9.8 + decode 2.1 + display 2.3

replacing each client's ad-hoc mix of overlapping absolutes (the Apple HUD's
three arrow lines that looked sequential but weren't), mean-vs-median decode
times (Windows/Linux), missing same-host-clock flags (Windows/Linux), and
three different names for the same capture→received measurement (probe's
"reassembled", Apple/Android's "client", Windows/Linux's post-decode "lat").

Per client: Apple threads receivedNs through the VT decode via the frame
refcon bit pattern so the decode stage exists at all (stage-1 fallback
honestly degrades to a capture→received headline); Windows carries
FrameTimes through the existing frame channel to the render thread and adds
e2e p50/p95 post-Present; Linux stamps received at AU pop and rides
decoded_ns on DecodedFrame to the paintable-set site; Android pairs receipt
stamps with MediaCodec output buffers via the codec's pts round-trip (JNI
stats array 14→16 doubles, indexes 0-13 unchanged). fps now uniformly counts
received AUs; lost/(received+lost) per window, hidden at zero.

docs-site gains "Understanding the Stats Overlay": what each line means, why
the equation only approximately sums (percentiles), and a line-by-line
Moonlight/Sunshine matrix — including that Moonlight has no end-to-end
number and its "network latency" is an ENet control RTT, so punktfunk's
headline must not be compared against any single Moonlight line.

Verified here: linux client + probe + core check/clippy/fmt green, android
native cargo-ndk arm64 check green. Pending: Windows CI + on-glass, swift
test on the mac, on-device Android.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 21:01:29 +00:00
enricobuehler bd4e15b68d refactor(android): split session JNI into modules, HUD-gated stats, AAudio open retry
- native: the 756-line session.rs becomes session/{mod,connect,input,planes}.rs
  around a SessionHandle (connect lifecycle + trust, input plane shims, plane
  start/stop + stats drain).
- Decode-stats sampling is HUD-gated (nativeSetVideoStatsEnabled): with the
  overlay hidden the decode thread skips the per-AU clock read + lock; enabling
  resets the measurement window.
- audio: the AAudio open path is a per-sharing-mode try_open closure — the
  realtime callback state (ring, prime, free-list) is rebuilt per attempt, so a
  failed exclusive-mode try can't leak state into the shared-mode retry.
- Kotlin: ConnectScreen/StreamScreen slimmed by extracting ConnectDialogs,
  StatsOverlay and TouchInput.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 11:08:56 +02:00
enricobuehler 5262e28b79 feat(android): live stats HUD + low-latency decode tuning
apple / swift (push) Successful in 54s
windows-msix / package (push) Successful in 1m1s
decky / build-publish (push) Has been cancelled
deb / build-publish (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
windows / build (push) Successful in 55s
ci / docs-site (push) Successful in 31s
audit / cargo-audit (push) Failing after 1m8s
android / android (push) Failing after 2m12s
ci / web (push) Successful in 31s
ci / bench (push) Successful in 4m31s
ci / rust (push) Successful in 6m31s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 35s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m44s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m25s
flatpak / build-publish (push) Successful in 5m5s
docker / deploy-docs (push) Successful in 20s
Stats HUD (mirrors the Apple client): the decode thread accumulates FPS, receive
throughput, and capture->client latency (p50/p95, skew-corrected) in Rust
(clients/android/native/src/stats.rs); nativeVideoStats drains a snapshot ~1 Hz
over JNI as a DoubleArray. StreamScreen renders a Compose overlay
(W*H@Hz / fps / Mb/s / latency, + dropped-under-loss), toggled by a Settings
switch (persisted, default on) or a 3-finger tap.

Performance (decode.rs):
- ANativeWindow_setFrameRate(refresh_hz): align display vsync to the stream rate
  (no 60-in-120 judder); safe since minSdk 31 >= API 30.
- Raise the decode thread toward URGENT_DISPLAY (best-effort setpriority) so
  background work can't preempt it under load.
- Codec low-latency hints KEY_PRIORITY=0 (realtime) + KEY_OPERATING_RATE.

Verified host-side: cargo build/clippy/fmt --workspace (the ungated stats + JNI
accessor). The android-gated decode.rs (NDK) and the Kotlin build only in CI
(android.yml: gradle + cargo-ndk) -- APIs verified against crate sources.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 21:49:29 +00:00