feat(clients): unified stats vocabulary across every client + Moonlight comparison docs

One stat model everywhere (design/stats-unification.md): four measurement
points (capture/received/decoded/displayed), three stages that tile the
interval exactly, and a HUD that shows the addition explicitly —

  end-to-end 14.2 ms p50 · 19.8 p95 · capture→on-glass
  = host+network 9.8 + decode 2.1 + display 2.3

replacing each client's ad-hoc mix of overlapping absolutes (the Apple HUD's
three arrow lines that looked sequential but weren't), mean-vs-median decode
times (Windows/Linux), missing same-host-clock flags (Windows/Linux), and
three different names for the same capture→received measurement (probe's
"reassembled", Apple/Android's "client", Windows/Linux's post-decode "lat").

Per client: Apple threads receivedNs through the VT decode via the frame
refcon bit pattern so the decode stage exists at all (stage-1 fallback
honestly degrades to a capture→received headline); Windows carries
FrameTimes through the existing frame channel to the render thread and adds
e2e p50/p95 post-Present; Linux stamps received at AU pop and rides
decoded_ns on DecodedFrame to the paintable-set site; Android pairs receipt
stamps with MediaCodec output buffers via the codec's pts round-trip (JNI
stats array 14→16 doubles, indexes 0-13 unchanged). fps now uniformly counts
received AUs; lost/(received+lost) per window, hidden at zero.

docs-site gains "Understanding the Stats Overlay": what each line means, why
the equation only approximately sums (percentiles), and a line-by-line
Moonlight/Sunshine matrix — including that Moonlight has no end-to-end
number and its "network latency" is an ENet control RTT, so punktfunk's
headline must not be compared against any single Moonlight line.

Verified here: linux client + probe + core check/clippy/fmt green, android
native cargo-ndk arm64 check green. Pending: Windows CI + on-glass, swift
test on the mac, on-device Android.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
2026-07-03 21:01:29 +00:00
parent c7630ff5dc
commit 09a5957c6d
38 changed files with 1122 additions and 380 deletions
+1 -1
View File
@@ -14,7 +14,7 @@ example of driving the protocol end to end: QUIC control plane, UDP data plane,
- **Receives a real stream**, writes a playable elementary stream (`.h265`/`.h264`/`.av1` — the
extension tracks the **negotiated codec**; the probe advertises all three and the host picks), and
reports per-frame **capture→…→reassembled latency** percentiles (the host stamps each frame with
reports per-frame **capture→received latency** percentiles (the host stamps each frame with
its capture clock).
- **Verification mode** against a synthetic host — byte-checks deterministic test frames.
- **Exercises every plane** with scripted test traffic:
+4 -4
View File
@@ -4,7 +4,7 @@
//! * **verification** (`frames > 0`, synthetic host): byte-checks deterministic test frames;
//! * **stream** (`frames == 0`, virtual host): receives real encoded AUs, writes a playable
//! elementary stream (the dump extension follows the negotiated codec — `.h265`/`.h264`/`.av1`;
//! the probe advertises all three), and reports per-frame **capture→…→reassembled latency**
//! the probe advertises all three), and reports per-frame **capture→received latency**
//! percentiles (the host stamps each frame with its capture wall clock; same-host runs share
//! that clock).
//!
@@ -481,7 +481,7 @@ async fn session(args: Args) -> Result<()> {
.await?;
// Wall-clock skew handshake on the still-private control stream (before --remode/--speed-test
// take it): align our clock to the host's so the per-frame capture→reassembled latency is valid
// take it): align our clock to the host's so the per-frame capture→received latency is valid
// across machines. `None` ⇒ an old host that doesn't answer — fall back to a shared clock (0).
let clock_offset_ns = match punktfunk_core::quic::clock_sync(&mut send, &mut recv).await {
Some(skew) => {
@@ -1051,7 +1051,7 @@ async fn session(args: Args) -> Result<()> {
continue;
}
bytes += frame.data.len() as u64;
// capture→reassembled: our receive instant in the host clock (now + offset)
// capture→received: our receive instant in the host clock (now + offset)
// minus the host's capture pts. offset is 0 same-host / old host.
let lat = (now_ns() as i128 + clock_offset as i128 - frame.pts_ns as i128)
.max(0) as u64;
@@ -1100,7 +1100,7 @@ async fn session(args: Args) -> Result<()> {
lat_p99_us = pct(0.99),
lat_max_us = latencies_us.last().copied().unwrap_or(0),
skew_corrected,
"punktfunk/1 stream complete (capture→reassembled latency; skew_corrected=true ⇒ \
"punktfunk/1 stream complete (capture→received latency; skew_corrected=true ⇒ \
cross-machine valid, false ⇒ same-host clock)"
);
if expected > 0 {