feat(clients): host/network split in every stats HUD (stats phase 2, client side)

Consumes the 0xCF host-timing plane (449a67c) on all four GUI clients: each
keeps a bounded pending ring of receipt samples keyed by pts, matches the
host's per-AU capture→sent reports against it, and the HUD equation becomes

  = host 3.1 + network 6.7 + decode 2.1 + display 2.3

falling back to the combined `= host+network …` term whenever no timing
matched the window (old host / datagram loss) — same total, one split
fewer, never a misleading zero. Apple additionally gains the split as the
only equation line under the stage-1 fallback presenter (receipt is
presenter-independent), a `nextHostTiming` wrapper with its own plane lock,
and a unit-tested `HostNetworkSplitter`; Android extends the JNI stats
array 16→18 doubles (0–15 unchanged); Windows/Linux thread the split
through `Stats` into the HUD and the headless/debug logs.
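
For reviewers, a minimal Rust sketch of the matching and fallback described
above. It is an illustration under stated assumptions, not the code in this
commit: `PendingSplit`, `record_receipt`, `match_timing`, `equation_head`
and the `PENDING_CAP` value are hypothetical names chosen for the example.

```rust
use std::collections::VecDeque;

/// Hypothetical cap on receipts awaiting a host timing (drop-oldest beyond it).
const PENDING_CAP: usize = 256;

/// Bounded ring of (pts_ns, capture→received µs) receipts awaiting their host timing.
#[derive(Default)]
struct PendingSplit {
    pending: VecDeque<(u64, u64)>,
}

impl PendingSplit {
    /// Park one receipt; evict the oldest entry once the ring exceeds its cap.
    fn record_receipt(&mut self, pts_ns: u64, hostnet_us: u64) {
        self.pending.push_back((pts_ns, hostnet_us));
        if self.pending.len() > PENDING_CAP {
            self.pending.pop_front();
        }
    }

    /// Match one host report by exact pts. Returns (host_us, network_us), where
    /// network is the combined interval minus the host share (saturating at 0),
    /// or None when no receipt with that pts is pending.
    fn match_timing(&mut self, pts_ns: u64, host_us: u64) -> Option<(u64, u64)> {
        let i = self.pending.iter().position(|&(p, _)| p == pts_ns)?;
        let (_, hostnet_us) = self.pending.remove(i)?;
        Some((host_us, hostnet_us.saturating_sub(host_us)))
    }
}

/// Leading term(s) of the HUD equation: split when a timing matched this window,
/// the combined `host+network` term otherwise (assumed formatting, one decimal).
fn equation_head(hostnet_ms: f64, split: Option<(f64, f64)>) -> String {
    match split {
        Some((host_ms, net_ms)) => format!("= host {host_ms:.1} + network {net_ms:.1}"),
        None => format!("= host+network {hostnet_ms:.1}"),
    }
}
```

Passing `None` for the split models the old-host / datagram-loss case: the
combined term is shown rather than a zero host share.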

Docs updated: design/stats-unification.md Phase 2 → implemented (wire
format, fallback semantics), and the docs-site matrix's Sunshine "Host
processing latency" row is now a direct match (ours includes the paced
send; avg vs p50).

Verified here: linux client clippy -D warnings green on the live tree,
windows stub check + hand-verified diff, android cargo-ndk arm64 check
green, apple loopback test extended (needs the rebuilt xcframework + swift
test on the mac). On-glass: pending on all platforms.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 21:31:49 +00:00
parent 8470419433
commit 69609945a3
19 changed files with 610 additions and 59 deletions
@@ -16,12 +16,15 @@ import kotlin.math.roundToInt
/** /**
* The live stats overlay — the unified HUD (`design/stats-unification.md`, Android v1: headline is * The live stats overlay — the unified HUD (`design/stats-unification.md`, Android v1: headline is
* `capture→decoded`, tiled by `host+network` + `decode`). Reads the 16-double layout from * `capture→decoded`, tiled by `host+network` + `decode`). Reads the 18-double layout from
* [NativeBridge.nativeVideoStats]: * [NativeBridge.nativeVideoStats]:
* `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skew, w, h, hz, lost, bitDepth, colorPrimaries, * `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skew, w, h, hz, lost, bitDepth, colorPrimaries,
* colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms]`. Indexes 10–13 (present on a current * colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms, hostP50Ms, netP50Ms]`. Indexes 10–13
* native lib) describe the negotiated video feed and render as a codec/depth/colour/chroma line; * (present on a current native lib) describe the negotiated video feed and render as a
* 14/15 render as the stage equation; older layouts just omit those lines. * codec/depth/colour/chroma line; 14/15 render as the stage equation — split into
* `host + network + decode` when the Phase-2 terms at 16/17 are nonzero (a current host sends
* per-AU 0xCF timings; an old host leaves them 0 and the combined `host+network` term stands);
* older layouts just omit those lines.
*/ */
@Composable @Composable
internal fun StatsOverlay(s: DoubleArray, modifier: Modifier = Modifier) { internal fun StatsOverlay(s: DoubleArray, modifier: Modifier = Modifier) {
@@ -60,8 +63,16 @@ internal fun StatsOverlay(s: DoubleArray, modifier: Modifier = Modifier) {
fontSize = 12.sp, fontSize = 12.sp,
) )
if (s.size >= 16) { if (s.size >= 16) {
// Phase-2 split (s[16]/s[17]): render `host + network` separately when the host
// reported its share this window; otherwise the combined term (old host / no
// matched 0xCF timing).
val equation = if (s.size >= 18 && s[16] > 0) {
"= host ${"%.1f".format(s[16])} + network ${"%.1f".format(s[17])} + decode ${"%.1f".format(s[15])}"
} else {
"= host+network ${"%.1f".format(s[14])} + decode ${"%.1f".format(s[15])}"
}
Text( Text(
"= host+network ${"%.1f".format(s[14])} + decode ${"%.1f".format(s[15])}", equation,
color = Color.White, color = Color.White,
fontFamily = FontFamily.Monospace, fontFamily = FontFamily.Monospace,
fontSize = 12.sp, fontSize = 12.sp,
@@ -105,14 +105,17 @@ object NativeBridge {
/** /**
* Drain ~1 s of live decode stats for the on-stream HUD, or `null` when no decode thread runs. * Drain ~1 s of live decode stats for the on-stream HUD, or `null` when no decode thread runs.
* Returns 16 doubles (unified stats spec, `design/stats-unification.md`): * Returns 18 doubles (unified stats spec, `design/stats-unification.md`):
* `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skewCorrected, width, height, refreshHz, framesLost, * `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skewCorrected, width, height, refreshHz, framesLost,
* bitDepth, colorPrimaries, colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms]` * bitDepth, colorPrimaries, colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms, hostP50Ms,
* netP50Ms]`
* (the two flags are 1.0/0.0; indexes 2/3 are the end-to-end capture→decoded headline; 10–13 * (the two flags are 1.0/0.0; indexes 2/3 are the end-to-end capture→decoded headline; 10–13
* describe the negotiated video feed — bit depth 8/10, CICP primaries/transfer, and the HEVC * describe the negotiated video feed — bit depth 8/10, CICP primaries/transfer, and the HEVC
* chroma_format_idc 1=4:2:0 / 3=4:4:4; 14/15 are the stage p50s tiling the headline — * chroma_format_idc 1=4:2:0 / 3=4:4:4; 14/15 are the stage p50s tiling the headline —
* `host+network` = capture→received, `decode` = received→decoded). Poll ~1 Hz; each call * `host+network` = capture→received, `decode` = received→decoded; 16/17 split the
* resets the measurement window. * `host+network` term via the host's per-AU 0xCF timings — `host` = the host's capture→sent,
* `network` = the remainder — both 0.0 when no timing matched this window, i.e. an old host).
* Poll ~1 Hz; each call resets the measurement window.
*/ */
external fun nativeVideoStats(handle: Long): DoubleArray? external fun nativeVideoStats(handle: Long): DoubleArray?
+30
@@ -25,6 +25,11 @@ use std::time::{Duration, Instant};
/// flight, so anything beyond this is stale (codec flushed / HUD toggled) and gets evicted. /// flight, so anything beyond this is stale (codec flushed / HUD toggled) and gets evicted.
const IN_FLIGHT_CAP: usize = 64; const IN_FLIGHT_CAP: usize = 64;
/// Cap on received AUs awaiting their 0xCF host timing (Phase 2 host/network split): the timing
/// datagram trails its AU by at most the wire, so a match lands within a frame or two — anything
/// this deep is a lost datagram (or an old host that never sends any) and gets evicted.
const PENDING_SPLIT_CAP: usize = 256;
/// The decode loop. Runs on the `pf-decode` thread until `shutdown` is set or the session closes. /// The decode loop. Runs on the `pf-decode` thread until `shutdown` is set or the session closes.
pub fn run( pub fn run(
client: Arc<NativeClient>, client: Arc<NativeClient>,
@@ -155,6 +160,11 @@ pub fn run(
// point (output-buffer dequeue — MediaCodec round-trips presentationTimeUs) can be paired back // point (output-buffer dequeue — MediaCodec round-trips presentationTimeUs) can be paired back
// to its receipt for the `decode` stage. Only fed while the HUD is visible. // to its receipt for the `decode` stage. Only fed while the HUD is visible.
let mut in_flight: VecDeque<(u64, i128)> = VecDeque::new(); let mut in_flight: VecDeque<(u64, i128)> = VecDeque::new();
// Phase-2 host/network split (design/stats-unification.md): received AUs awaiting their 0xCF
// host timing, as (pts_ns, capture→received µs). The timings are drained non-blockingly right
// where receipts are recorded and matched by pts; `network = hostnet - host` (saturating).
// Only fed while the HUD is visible; an old host never sends a 0xCF, so entries just age out.
let mut pending_split: VecDeque<(u64, u64)> = VecDeque::new();
// The dataspace we've signalled on the Surface so far (None = default/SDR). Set reactively once // The dataspace we've signalled on the Surface so far (None = default/SDR). Set reactively once
// the decoder reports an HDR stream (see `drain`); avoids re-applying every format event. // the decoder reports an HDR stream (see `drain`); avoids re-applying every format event.
let mut applied_ds: Option<DataSpace> = None; let mut applied_ds: Option<DataSpace> = None;
@@ -190,6 +200,26 @@ pub fn run(
if in_flight.len() > IN_FLIGHT_CAP { if in_flight.len() > IN_FLIGHT_CAP {
in_flight.pop_front(); // stale — codec never echoed it back in_flight.pop_front(); // stale — codec never echoed it back
} }
// Phase-2 split: park this AU's capture→received sample, then match any
// 0xCF host timings that have arrived — host = the host's own
// capture→sent, network = our capture→received minus it (per-frame
// tiling; saturating in case of clock jitter).
if let Some(hostnet_us) = lat_us {
pending_split.push_back((frame.pts_ns, hostnet_us));
if pending_split.len() > PENDING_SPLIT_CAP {
pending_split.pop_front(); // 0xCF lost / old host — evict
}
}
while let Ok(t) = client.next_host_timing(Duration::ZERO) {
if let Some(i) = pending_split.iter().position(|&(p, _)| p == t.pts_ns)
{
let (_, hostnet_us) = pending_split.remove(i).unwrap();
stats.note_host_split(
t.host_us as u64,
hostnet_us.saturating_sub(t.host_us as u64),
);
}
}
} }
pending = Some(frame); pending = Some(frame);
} }
+14 -7
@@ -73,13 +73,16 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeStopVideo(
} }
/// `NativeBridge.nativeVideoStats(handle): DoubleArray?` — drain ~1 s of decode stats for the HUD /// `NativeBridge.nativeVideoStats(handle): DoubleArray?` — drain ~1 s of decode stats for the HUD
/// (unified stats spec, `design/stats-unification.md`). Returns 16 doubles /// (unified stats spec, `design/stats-unification.md`). Returns 18 doubles
/// `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skewCorrected, width, height, refreshHz, framesLost, /// `[fps, mbps, e2eP50Ms, e2eP95Ms, latValid, skewCorrected, width, height, refreshHz, framesLost,
/// bitDepth, colorPrimaries, colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms]` /// bitDepth, colorPrimaries, colorTransfer, chromaFormatIdc, hostNetP50Ms, decodeP50Ms, hostP50Ms,
/// (the two flags are 1.0/0.0; indexes 0–13 match the previous 14-double layout with the latency /// netP50Ms]`
/// pair re-based from capture→received to the end-to-end capture→decoded headline; the two stage /// (the two flags are 1.0/0.0; indexes 0–15 match the previous 16-double layout — 0–13 the original
/// p50s tiling it — `host+network` = capture→received, `decode` = received→decoded — are appended /// 14-double one with the latency pair re-based to the end-to-end capture→decoded headline, 14/15
/// at the end), or `null` when no decode thread is running. Poll ~1 Hz from the UI; each call /// the stage p50s tiling it: `host+network` = capture→received, `decode` = received→decoded; 16/17
/// are the Phase-2 split of the `host+network` term from the per-AU 0xCF host timings — `host` =
/// the host's capture→sent, `network` = the remainder — both 0.0 when no timing matched this
/// window, i.e. an old host), or `null` when no decode thread is running. Poll ~1 Hz from the UI; each call
/// resets the measurement window. Not android-gated — pure `jni` + connector reads, so it links on /// resets the measurement window. Not android-gated — pure `jni` + connector reads, so it links on
/// the host build too (Kotlin only ever calls it on device). /// the host build too (Kotlin only ever calls it on device).
#[no_mangle] #[no_mangle]
@@ -100,7 +103,7 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeVideoStats(
let snap = h.stats.drain(); let snap = h.stats.drain();
let mode = h.client.mode(); let mode = h.client.mode();
let color = h.client.color; let color = h.client.color;
let buf: [f64; 16] = [ let buf: [f64; 18] = [
snap.fps, snap.fps,
snap.mbps, snap.mbps,
snap.e2e_p50_ms, snap.e2e_p50_ms,
@@ -122,6 +125,10 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeVideoStats(
// Stage p50s tiling the end-to-end headline (appended to keep 0–13 index-compatible). // Stage p50s tiling the end-to-end headline (appended to keep 0–13 index-compatible).
snap.hostnet_p50_ms, snap.hostnet_p50_ms,
snap.decode_p50_ms, snap.decode_p50_ms,
// Phase-2 host/network split of the `host+network` stage (0xCF host timings): 0.0
// when no timing matched this window (old host) — the HUD keeps the combined term.
snap.host_p50_ms,
snap.net_p50_ms,
]; ];
let arr = match env.new_double_array(buf.len() as jsize) { let arr = match env.new_double_array(buf.len() as jsize) {
Ok(a) => a, Ok(a) => a,
+42 -1
@@ -1,7 +1,9 @@
//! Live decode stats for the on-stream HUD, following the unified stats spec //! Live decode stats for the on-stream HUD, following the unified stats spec
//! (`design/stats-unification.md`): FPS, receive throughput, and the Android v1 stage split — //! (`design/stats-unification.md`): FPS, receive throughput, and the Android v1 stage split —
//! headline `end-to-end` = capture→decoded (p50/p95) tiled by `host+network` = capture→received //! headline `end-to-end` = capture→decoded (p50/p95) tiled by `host+network` = capture→received
//! and `decode` = received→decoded (stage p50s). The decode thread is the sole writer //! and `decode` = received→decoded (stage p50s). When the host emits per-AU 0xCF host timings, the
//! `host+network` term further splits into `host` + `network` (Phase 2, `note_host_split`); an old
//! host emits none and the combined term stands. The decode thread is the sole writer
//! (`note_received` per access unit at receipt, `note_decoded` per decoder output buffer); the JNI //! (`note_received` per access unit at receipt, `note_decoded` per decoder output buffer); the JNI
//! accessor `nativeVideoStats` drains a snapshot ~1 Hz and resets the window. Sampling is gated on //! accessor `nativeVideoStats` drains a snapshot ~1 Hz and resets the window. Sampling is gated on
//! the HUD actually being visible (`set_enabled`, driven by `nativeSetVideoStatsEnabled`) so the //! the HUD actually being visible (`set_enabled`, driven by `nativeSetVideoStatsEnabled`) so the
@@ -32,6 +34,12 @@ struct Inner {
e2e_us: Vec<u64>, e2e_us: Vec<u64>,
/// `host+network` stage = capture→received samples, in microseconds (skew-corrected). /// `host+network` stage = capture→received samples, in microseconds (skew-corrected).
hostnet_us: Vec<u64>, hostnet_us: Vec<u64>,
/// Phase-2 split of `host+network` (design/stats-unification.md Phase 2), fed only when the
/// host emits per-AU 0xCF timings: `host` = the host's own capture→sent duration, µs.
host_us: Vec<u64>,
/// The matching `network` term, µs: capture→received minus the host's capture→sent
/// (wire + reassembly). Always pushed in lockstep with `host_us`.
net_us: Vec<u64>,
/// `decode` stage = received→decoded samples, in microseconds (client-local, single clock). /// `decode` stage = received→decoded samples, in microseconds (client-local, single clock).
decode_us: Vec<u64>, decode_us: Vec<u64>,
/// Whether the host answered the clock-skew handshake (latency is cross-machine valid). /// Whether the host answered the clock-skew handshake (latency is cross-machine valid).
@@ -50,6 +58,10 @@ pub struct Snapshot {
/// Stage p50s (ms): `host+network` (capture→received) and `decode` (received→decoded). /// Stage p50s (ms): `host+network` (capture→received) and `decode` (received→decoded).
pub hostnet_p50_ms: f64, pub hostnet_p50_ms: f64,
pub decode_p50_ms: f64, pub decode_p50_ms: f64,
/// Phase-2 `host` / `network` split p50s (ms) — 0.0 when no 0xCF timing matched this window
/// (old host / no samples yet), in which case the HUD keeps the combined `host+network` term.
pub host_p50_ms: f64,
pub net_p50_ms: f64,
pub lat_valid: bool, pub lat_valid: bool,
pub skew_corrected: bool, pub skew_corrected: bool,
} }
@@ -73,6 +85,8 @@ impl VideoStats {
bytes: 0, bytes: 0,
e2e_us: Vec::with_capacity(256), e2e_us: Vec::with_capacity(256),
hostnet_us: Vec::with_capacity(256), hostnet_us: Vec::with_capacity(256),
host_us: Vec::with_capacity(256),
net_us: Vec::with_capacity(256),
decode_us: Vec::with_capacity(256), decode_us: Vec::with_capacity(256),
skew_corrected: false, skew_corrected: false,
}), }),
@@ -101,6 +115,8 @@ impl VideoStats {
g.bytes = 0; g.bytes = 0;
g.e2e_us.clear(); g.e2e_us.clear();
g.hostnet_us.clear(); g.hostnet_us.clear();
g.host_us.clear();
g.net_us.clear();
g.decode_us.clear(); g.decode_us.clear();
} }
} }
@@ -128,6 +144,25 @@ impl VideoStats {
} }
} }
/// Record one matched host/network split sample (Phase 2): the host's reported capture→sent
/// duration and our capture→received minus it, both µs — one pair per AU whose 0xCF host
/// timing arrived and matched by pts. An old host emits none, leaving the vecs empty and the
/// snapshot p50s at 0 (HUD keeps the combined `host+network` term).
// Driven only by the android-only decode thread; unreferenced on the host build — expected.
#[cfg_attr(not(target_os = "android"), allow(dead_code))]
pub fn note_host_split(&self, host_us: u64, net_us: u64) {
if !self.enabled.load(Ordering::Relaxed) {
return; // HUD hidden — skip the lock
}
// Poison-proof for the same reason as `note_received`.
let mut g = self
.inner
.lock()
.unwrap_or_else(std::sync::PoisonError::into_inner);
g.host_us.push(host_us);
g.net_us.push(net_us);
}
/// Record one decoded output frame: its capture→decoded `end-to-end` sample and its /// Record one decoded output frame: its capture→decoded `end-to-end` sample and its
/// received→decoded `decode` stage sample (either may be absent — e.g. the receipt stamp for /// received→decoded `decode` stage sample (either may be absent — e.g. the receipt stamp for
/// this pts predates the HUD being shown). /// this pts predates the HUD being shown).
@@ -163,6 +198,8 @@ impl VideoStats {
let mbps = g.bytes as f64 * 8.0 / 1_000_000.0 / elapsed; let mbps = g.bytes as f64 * 8.0 / 1_000_000.0 / elapsed;
g.e2e_us.sort_unstable(); g.e2e_us.sort_unstable();
g.hostnet_us.sort_unstable(); g.hostnet_us.sort_unstable();
g.host_us.sort_unstable();
g.net_us.sort_unstable();
g.decode_us.sort_unstable(); g.decode_us.sort_unstable();
let snap = Snapshot { let snap = Snapshot {
fps, fps,
@@ -171,6 +208,8 @@ impl VideoStats {
e2e_p95_ms: pctl_ms(&g.e2e_us, 0.95), e2e_p95_ms: pctl_ms(&g.e2e_us, 0.95),
hostnet_p50_ms: pctl_ms(&g.hostnet_us, 0.50), hostnet_p50_ms: pctl_ms(&g.hostnet_us, 0.50),
decode_p50_ms: pctl_ms(&g.decode_us, 0.50), decode_p50_ms: pctl_ms(&g.decode_us, 0.50),
host_p50_ms: pctl_ms(&g.host_us, 0.50),
net_p50_ms: pctl_ms(&g.net_us, 0.50),
lat_valid: !g.e2e_us.is_empty(), lat_valid: !g.e2e_us.is_empty(),
skew_corrected: g.skew_corrected, skew_corrected: g.skew_corrected,
}; };
@@ -179,6 +218,8 @@ impl VideoStats {
g.bytes = 0; g.bytes = 0;
g.e2e_us.clear(); g.e2e_us.clear();
g.hostnet_us.clear(); g.hostnet_us.clear();
g.host_us.clear();
g.net_us.clear();
g.decode_us.clear(); g.decode_us.clear();
snap snap
} }
@@ -326,9 +326,14 @@ struct ContentView: View {
onCaptureChange: { [weak model] captured in onCaptureChange: { [weak model] captured in
model?.mouseCaptured = captured model?.mouseCaptured = captured
}, },
onFrame: { [meter = model.meter, latency = model.latency, offset = conn.clockOffsetNs] au in onFrame: { [meter = model.meter, latency = model.latency,
split = model.latencySplit, offset = conn.clockOffsetNs] au in
meter.note(byteCount: au.data.count) meter.note(byteCount: au.data.count)
latency.record(ptsNs: au.ptsNs, offsetNs: offset) latency.record(ptsNs: au.ptsNs, offsetNs: offset)
// The same receipt, keyed by pts, awaiting its 0xCF host timing (the
// host/network split drained by the 1 s stats tick).
split.recordReceipt(
ptsNs: au.ptsNs, receivedNs: au.receivedNs, offsetNs: offset)
}, },
onSessionEnd: { [weak model] in onSessionEnd: { [weak model] in
Task { @MainActor in model?.sessionEnded() } Task { @MainActor in model?.sessionEnded() }
@@ -69,6 +69,14 @@ final class SessionModel: ObservableObject {
@Published var hostNetworkP95Ms = 0.0 @Published var hostNetworkP95Ms = 0.0
@Published var hostNetworkValid = false @Published var hostNetworkValid = false
@Published var hostNetworkSkewCorrected = false @Published var hostNetworkSkewCorrected = false
/// Phase 2 of the same stage: `host+network` split into its two terms via the host's per-AU
/// 0xCF timing reports (host = capture→fully-sent as the host measured it, network = the
/// remainder), matched to receipts by pts in `latencySplit`. `splitValid` is false whenever
/// no timing matched in the window (an old host that never emits the plane, or heavy 0xCF
/// loss), and the HUD then falls back to the combined `host+network` term.
@Published var hostP50Ms = 0.0
@Published var networkP50Ms = 0.0
@Published var splitValid = false
/// End-to-end = capture→on-glass, measured directly per frame (never summed from the stages): /// End-to-end = capture→on-glass, measured directly per frame (never summed from the stages):
/// the HUD headline. Only the stage-2 presenter can stamp it (it owns decode + a /// the HUD headline. Only the stage-2 presenter can stamp it (it owns decode + a
/// CAMetalLayer/display-link present); stays invalid under stage-1, where the layer presents /// CAMetalLayer/display-link present); stays invalid under stage-1, where the layer presents
@@ -96,6 +104,10 @@ final class SessionModel: ObservableObject {
/// Capture→received (the host+network stage), fed per AU at receipt by the stream view's /// Capture→received (the host+network stage), fed per AU at receipt by the stream view's
/// onFrame under both presenters. /// onFrame under both presenters.
let latency = LatencyMeter() let latency = LatencyMeter()
/// The host/network split of that same stage: onFrame also records (pts, interval) receipts
/// here, and the 1 s stats tick drains the connection's 0xCF host timings into it under
/// both presenters (the receipt path is presenter-independent).
let latencySplit = HostNetworkSplitter()
/// The stage-2 meters, passed to StreamView: end-to-end (capture→on-glass, stamped at /// The stage-2 meters, passed to StreamView: end-to-end (capture→on-glass, stamped at
/// present), decode (received→decoded), display (decoded→on-glass). /// present), decode (received→decoded), display (decoded→on-glass).
let endToEnd = LatencyMeter() let endToEnd = LatencyMeter()
@@ -296,6 +308,7 @@ final class SessionModel: ObservableObject {
fps = 0 fps = 0
mbps = 0 mbps = 0
hostNetworkValid = false hostNetworkValid = false
splitValid = false
endToEndValid = false endToEndValid = false
decodeValid = false decodeValid = false
displayValid = false displayValid = false
@@ -341,6 +354,7 @@ final class SessionModel: ObservableObject {
private func startStatsTimer() { private func startStatsTimer() {
lastFramesDropped = 0 // a fresh connection's cumulative drop counter starts at 0 lastFramesDropped = 0 // a fresh connection's cumulative drop counter starts at 0
latencySplit.reset() // no stale receipts/samples from a previous session
let timer = Timer(timeInterval: 1.0, repeats: true) { [weak self] _ in let timer = Timer(timeInterval: 1.0, repeats: true) { [weak self] _ in
guard let self else { return } guard let self else { return }
Task { @MainActor in Task { @MainActor in
@@ -364,6 +378,25 @@ final class SessionModel: ObservableObject {
} else { } else {
self.hostNetworkValid = false self.hostNetworkValid = false
} }
// Phase 2: drain the window's per-AU host timings (0xCF) into the splitter:
// non-blocking, bounded (a 240 fps window is ~240 reports; the cap only guards
// a pathological burst). `try?` flattens (SE-0230); a throw (.closed during
// teardown) just ends the drain. An old host never emits any, so splitValid stays
// false and the HUD keeps the combined host+network term.
if let conn = self.connection {
var burst = 0
while burst < 1024, let t = try? conn.nextHostTiming(timeoutMs: 0) {
self.latencySplit.noteHostTiming(ptsNs: t.ptsNs, hostUs: t.hostUs)
burst += 1
}
}
if let s = self.latencySplit.drain() {
self.hostP50Ms = s.hostP50Ms
self.networkP50Ms = s.networkP50Ms
self.splitValid = true
} else {
self.splitValid = false
}
if let e = self.endToEnd.drain() { if let e = self.endToEnd.drain() {
self.endToEndP50Ms = e.p50Ms self.endToEndP50Ms = e.p50Ms
self.endToEndP95Ms = e.p95Ms self.endToEndP95Ms = e.p95Ms
@@ -26,20 +26,34 @@ struct StreamHUDView: View {
Text("end-to-end \(model.endToEndP50Ms, specifier: "%.1f") ms p50 · \(model.endToEndP95Ms, specifier: "%.1f") p95 · capture→on-glass\(model.endToEndSkewCorrected ? "" : " (same-host clock)")") Text("end-to-end \(model.endToEndP50Ms, specifier: "%.1f") ms p50 · \(model.endToEndP95Ms, specifier: "%.1f") p95 · capture→on-glass\(model.endToEndSkewCorrected ? "" : " (same-host clock)")")
.font(.system(.caption2, design: .monospaced)) .font(.system(.caption2, design: .monospaced))
.foregroundStyle(.secondary) .foregroundStyle(.secondary)
// The equation: the three stages tiling the headline interval (per-window p50s; // The equation: the stages tiling the headline interval (per-window p50s;
// they only approximately sum to the directly-measured total). // they only approximately sum to the directly-measured total). With a host
// that reports per-AU timings (0xCF) the first term splits into host + network
// (phase 2); an old host keeps the combined term.
if model.hostNetworkValid && model.decodeValid && model.displayValid { if model.hostNetworkValid && model.decodeValid && model.displayValid {
Text("= host+network \(model.hostNetworkP50Ms, specifier: "%.1f") + decode \(model.decodeP50Ms, specifier: "%.1f") + display \(model.displayP50Ms, specifier: "%.1f")") if model.splitValid {
.font(.system(.caption2, design: .monospaced)) Text("= host \(model.hostP50Ms, specifier: "%.1f") + network \(model.networkP50Ms, specifier: "%.1f") + decode \(model.decodeP50Ms, specifier: "%.1f") + display \(model.displayP50Ms, specifier: "%.1f")")
.foregroundStyle(.secondary) .font(.system(.caption2, design: .monospaced))
.foregroundStyle(.secondary)
} else {
Text("= host+network \(model.hostNetworkP50Ms, specifier: "%.1f") + decode \(model.decodeP50Ms, specifier: "%.1f") + display \(model.displayP50Ms, specifier: "%.1f")")
.font(.system(.caption2, design: .monospaced))
.foregroundStyle(.secondary)
}
} }
} else if model.hostNetworkValid { } else if model.hostNetworkValid {
// Stage-1 fallback presenter: the layer decodes + presents internally with no // Stage-1 fallback presenter: the layer decodes + presents internally with no
// per-frame stamp, so the honest headline ends at receipt and there is no // per-frame stamp, so the honest headline ends at receipt. The host/network
// equation line (host+network is the whole measured interval). // split still applies there (receipt is presenter-independent): it becomes the
// only equation line; without it, host+network IS the whole measured interval.
Text("capture→received \(model.hostNetworkP50Ms, specifier: "%.1f") ms p50 · \(model.hostNetworkP95Ms, specifier: "%.1f") p95\(model.hostNetworkSkewCorrected ? "" : " (same-host clock)")") Text("capture→received \(model.hostNetworkP50Ms, specifier: "%.1f") ms p50 · \(model.hostNetworkP95Ms, specifier: "%.1f") p95\(model.hostNetworkSkewCorrected ? "" : " (same-host clock)")")
.font(.system(.caption2, design: .monospaced)) .font(.system(.caption2, design: .monospaced))
.foregroundStyle(.secondary) .foregroundStyle(.secondary)
if model.splitValid {
Text("= host \(model.hostP50Ms, specifier: "%.1f") + network \(model.networkP50Ms, specifier: "%.1f")")
.font(.system(.caption2, design: .monospaced))
.foregroundStyle(.secondary)
}
} }
if model.lostFrames > 0 { if model.lostFrames > 0 {
// Unrecoverable network drops this window; hidden while the link is clean. // Unrecoverable network drops this window; hidden while the link is clean.
@@ -83,6 +83,9 @@ public final class PunktfunkConnection {
/// Same role for the feedback drain thread (rumble + HID-output: two core planes, /// Same role for the feedback drain thread (rumble + HID-output: two core planes,
/// drained sequentially by one thread). /// drained sequentially by one thread).
private let feedbackLock = NSLock() private let feedbackLock = NSLock()
/// Same role for the host-timing (0xCF) puller: its own plane in the core, drained
/// non-blockingly by the app's 1 s stats tick (never contends with the blocking pullers).
private let statsLock = NSLock()
/// Negotiated session mode (host-confirmed). /// Negotiated session mode (host-confirmed).
public private(set) var width: UInt32 = 0 public private(set) var width: UInt32 = 0
@@ -665,6 +668,40 @@ public final class PunktfunkConnection {
} }
} }
/// One per-AU host-timing report (0xCF): the host's capture→fully-sent duration for the
/// access unit whose `AccessUnit.ptsNs` equals `ptsNs` exactly. The stats consumer derives
/// `network = (receivedNs + clockOffsetNs - ptsNs) - hostUs`, the host/network split of the
/// HUD's `host+network` stage (design/stats-unification.md Phase 2).
public struct HostTiming: Sendable, Equatable {
/// The AU's capture stamp (host capture clock; matches the AU's `ptsNs`).
public let ptsNs: UInt64
/// Host capture→sent duration, µs.
public let hostUs: UInt32
}
/// Pull the next per-AU host timing; nil on timeout, throws `.closed` once the session
/// ended. Best-effort plane: an older host never emits any; keep showing the combined
/// `host+network` stage then. Drain non-blockingly (`timeoutMs: 0`) from ONE stats
/// consumer (its own core plane, safe alongside the other pullers).
public func nextHostTiming(timeoutMs: UInt32 = 0) throws -> HostTiming? {
statsLock.lock()
defer { statsLock.unlock() }
guard let h = liveHandle() else { throw PunktfunkClientError.closed }
var out = PunktfunkHostTiming()
let rc = punktfunk_connection_next_host_timing(h, &out, timeoutMs)
switch rc {
case statusOK:
return HostTiming(ptsNs: out.pts_ns, hostUs: out.host_us)
case statusNoFrame:
return nil
case statusClosed:
throw PunktfunkClientError.closed
default:
throw PunktfunkClientError.status(rc)
}
}
/// Send one input event (delivered to the host as a QUIC datagram). Thread-safe; /// Send one input event (delivered to the host as a QUIC datagram). Thread-safe;
/// silently dropped after close. /// silently dropped after close.
public func send(_ event: PunktfunkInputEvent) { public func send(_ event: PunktfunkInputEvent) {
@@ -684,10 +721,12 @@ public final class PunktfunkConnection {
pumpLock.lock() // pullers exit at their next poll boundary, releasing these pumpLock.lock() // pullers exit at their next poll boundary, releasing these
audioLock.lock() audioLock.lock()
feedbackLock.lock() feedbackLock.lock()
statsLock.lock()
abiLock.lock() abiLock.lock()
let h = handle let h = handle
handle = nil handle = nil
abiLock.unlock() abiLock.unlock()
statsLock.unlock()
feedbackLock.unlock() feedbackLock.unlock()
audioLock.unlock() audioLock.unlock()
pumpLock.unlock() pumpLock.unlock()
@@ -0,0 +1,88 @@
// Splits the unified stats model's `host+network` stage (capture→received) into its `host`
// (capture→fully-sent, reported per AU by the host on the 0xCF plane) and `network`
// (the remainder) terms; see design/stats-unification.md Phase 2.
//
// Receipt samples are recorded per frame from the pump path; host timings are matched to them
// by exact pts (the 0xCF datagram carries the AU's own `pts_ns`). Best-effort by construction:
// a lost 0xCF datagram, an FEC-dropped AU, or an old host that never emits the plane simply
// contributes no split sample; the HUD then keeps the combined `host+network` line. NSLock
// rather than an actor: the receipt writer is the non-async pump path (same pattern as
// LatencyMeter/FrameMeter).
import Foundation
/// Per-frame `host` / `network` sampler: `recordReceipt` at AU receipt (pts + the combined
/// capture→received interval), `noteHostTiming` per drained 0xCF report, `drain` the window's
/// p50s once a second. The pending ring is bounded (drop-oldest) so an old host (receipts
/// forever, timings never) costs a fixed ~4 KB, not growth.
public final class HostNetworkSplitter: @unchecked Sendable {
private let lock = NSLock()
/// Received AUs awaiting their 0xCF host timing: (pts, combined capture→received µs).
private var pending: [(ptsNs: UInt64, combinedUs: Int64)] = []
private var hostUsSamples: [Int64] = []
private var networkUsSamples: [Int64] = []
/// ~1 s of frames at 240 fps; beyond it the oldest receipt can no longer expect a match.
private static let pendingCap = 256
public init() {}
/// Record one frame at receipt. `ptsNs` is the host capture clock (the AU's pts),
/// `receivedNs` the client `CLOCK_REALTIME` receipt instant (`AccessUnit.receivedNs`),
/// `offsetNs` the connect-time host↔client clock offset (0 = uncorrected). Same
/// absurd-value clamp as LatencyMeter: a sample it would drop must not linger here.
public func recordReceipt(ptsNs: UInt64, receivedNs: Int64, offsetNs: Int64) {
let combinedNs = receivedNs &+ offsetNs &- Int64(bitPattern: ptsNs)
guard combinedNs > 0, combinedNs < 10_000_000_000 else { return }
lock.lock()
pending.append((ptsNs: ptsNs, combinedUs: combinedNs / 1000))
if pending.count > Self.pendingCap {
pending.removeFirst(pending.count - Self.pendingCap)
}
lock.unlock()
}
/// Match one host timing (0xCF) to its receipt: `host` = the reported capture→sent,
/// `network` = the combined interval minus it, floored at 0 (the terms tile per frame; a
/// slightly-off skew offset must not produce a negative wire time). Unmatched timings
/// (the AU was FEC-dropped, or its receipt raced this drain) are simply skipped.
public func noteHostTiming(ptsNs: UInt64, hostUs: UInt32) {
lock.lock()
defer { lock.unlock() }
guard let i = pending.firstIndex(where: { $0.ptsNs == ptsNs }) else { return }
let combinedUs = pending.remove(at: i).combinedUs
hostUsSamples.append(Int64(hostUs))
networkUsSamples.append(max(0, combinedUs - Int64(hostUs)))
}
public struct Split: Sendable {
public let hostP50Ms: Double
public let networkP50Ms: Double
public let count: Int
}
/// The window's p50s since the last drain, then reset (matched samples only; the pending
/// ring survives, since a receipt may still match a timing drained next tick). `nil` when no
/// timing matched in the interval; the caller falls back to the combined stage.
public func drain() -> Split? {
lock.lock()
let host = hostUsSamples.sorted()
let network = networkUsSamples.sorted()
hostUsSamples.removeAll(keepingCapacity: true)
networkUsSamples.removeAll(keepingCapacity: true)
lock.unlock()
guard !host.isEmpty else { return nil }
func p50(_ sorted: [Int64]) -> Double {
Double(sorted[min(sorted.count / 2, sorted.count - 1)]) / 1000.0 // µs → ms
}
return Split(hostP50Ms: p50(host), networkP50Ms: p50(network), count: host.count)
}
/// Forget everything (pending receipts + window); a fresh connection starts clean.
public func reset() {
lock.lock()
pending.removeAll()
hostUsSamples.removeAll()
networkUsSamples.removeAll()
lock.unlock()
}
}
@@ -0,0 +1,107 @@
// Unit tests for HostNetworkSplitter (the host/network split of the unified stats model's
// host+network stage; design/stats-unification.md Phase 2): pts matching, the per-frame
// tiling arithmetic (network = combined − host, floored at 0), drain/reset semantics, the
// bounded pending ring, and the absurd-receipt clamp. All samples use explicit instants, so
// the expectations are exact.
import Foundation
import XCTest
@testable import PunktfunkKit
final class HostNetworkSplitterTests: XCTestCase {
/// An arbitrary host-capture pts (ns) far from zero, like a real CLOCK_REALTIME stamp.
private let basePts: UInt64 = 1_000_000_000_000
private func receipt(_ s: HostNetworkSplitter, pts: UInt64, combinedMs: Int64,
offsetNs: Int64 = 0) {
s.recordReceipt(
ptsNs: pts, receivedNs: Int64(pts) + combinedMs * 1_000_000 - offsetNs,
offsetNs: offsetNs)
}
func testEmptyDrainIsNil() {
XCTAssertNil(HostNetworkSplitter().drain())
}
func testMatchSplitsCombinedIntoHostAndNetwork() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: 8) // capture→received 8 ms
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000) // host says 3 ms of it was its own
guard let split = s.drain() else { return XCTFail("expected a matched sample") }
XCTAssertEqual(split.count, 1)
XCTAssertEqual(split.hostP50Ms, 3.0)
XCTAssertEqual(split.networkP50Ms, 5.0, "the two terms tile the combined interval")
XCTAssertNil(s.drain(), "drain resets the window")
}
func testSkewOffsetAppliesToTheCombinedInterval() {
let s = HostNetworkSplitter()
// Client clock 2 ms behind the host: the raw difference alone would read 6 ms.
receipt(s, pts: basePts, combinedMs: 8, offsetNs: 2_000_000)
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000)
XCTAssertEqual(s.drain()?.networkP50Ms, 5.0)
}
func testUnmatchedTimingIsSkipped() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: 8)
// A timing for an AU we never received (FEC-dropped) must not fabricate a sample.
s.noteHostTiming(ptsNs: basePts + 1, hostUs: 3_000)
XCTAssertNil(s.drain())
}
func testReceiptSurvivesADrainUntilItsTimingArrives() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: 8)
XCTAssertNil(s.drain(), "no timing matched yet")
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000) // arrives one tick late, still matches
XCTAssertEqual(s.drain()?.hostP50Ms, 3.0)
}
func testEachReceiptMatchesOnce() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: 8)
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000)
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000) // duplicate 0xCF: no second sample
XCTAssertEqual(s.drain()?.count, 1)
}
func testNetworkFlooredAtZero() {
let s = HostNetworkSplitter()
// A slightly-off skew offset can make host_us exceed the combined interval.
receipt(s, pts: basePts, combinedMs: 2)
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000)
guard let split = s.drain() else { return XCTFail("expected a sample") }
XCTAssertEqual(split.hostP50Ms, 3.0)
XCTAssertEqual(split.networkP50Ms, 0.0)
}
func testPendingRingDropsOldest() {
let s = HostNetworkSplitter()
for i in 0..<300 { // cap is 256, so the first receipts fall out
receipt(s, pts: basePts + UInt64(i), combinedMs: 8)
}
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000) // evicted: no match
XCTAssertNil(s.drain())
s.noteHostTiming(ptsNs: basePts + 299, hostUs: 3_000) // newest still pending
XCTAssertEqual(s.drain()?.count, 1)
}
func testAbsurdReceiptsAreDropped() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: -1) // received before capture: clock step
receipt(s, pts: basePts + 1, combinedMs: 20_000) // > 10 s: garbage pts/offset
s.noteHostTiming(ptsNs: basePts, hostUs: 1_000)
s.noteHostTiming(ptsNs: basePts + 1, hostUs: 1_000)
XCTAssertNil(s.drain())
}
func testResetForgetsPendingReceipts() {
let s = HostNetworkSplitter()
receipt(s, pts: basePts, combinedMs: 8)
s.reset()
s.noteHostTiming(ptsNs: basePts, hostUs: 3_000)
XCTAssertNil(s.drain(), "a fresh session must not match a previous session's receipts")
}
}
@@ -25,12 +25,18 @@ final class LoopbackIntegrationTests: XCTestCase {
XCTAssertEqual(conn.resolvedBitrateKbps, 50_000)
// Pull 25 synthetic frames and byte-verify the documented pattern:
// u32 LE frame index, then data[i] = (idx as u8) &+ (i as u8).
// u32 LE frame index, then data[i] = (idx as u8) &+ (i as u8). Alongside, drain the
// per-AU host-timing plane (0xCF) the way the app's stats tick does: the connector
// ORs VIDEO_CAP_HOST_TIMING in unconditionally and the synthetic host stamps one
// report per AU, so the pts correlation must hold end to end through the xcframework.
var got = 0
var lastIndex: UInt32 = 0
var receivedPts = Set<UInt64>()
var timings: [PunktfunkConnection.HostTiming] = []
let deadline = Date().addingTimeInterval(30)
while got < 25 {
XCTAssertLessThan(Date(), deadline, "timed out after \(got) frames")
while let t = try conn.nextHostTiming(timeoutMs: 0) { timings.append(t) }
guard let au = try conn.nextAU(timeoutMs: 2000) else { continue }
let idx = au.data.prefix(4).reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
for (i, byte) in au.data.enumerated().dropFirst(4) {
@@ -41,10 +47,22 @@ final class LoopbackIntegrationTests: XCTestCase {
}
}
XCTAssertGreaterThan(au.ptsNs, 0)
receivedPts.insert(au.ptsNs)
lastIndex = idx
got += 1
}
XCTAssertGreaterThanOrEqual(lastIndex, 24)
// Belt-and-braces: the last frame's timing lands just after its AU, so give it a bounded
// grace drain (the stream keeps running, so this must not loop on fresh timings).
var grace = 0
while grace < 64, !timings.contains(where: { receivedPts.contains($0.ptsNs) }),
let t = try conn.nextHostTiming(timeoutMs: 100) {
timings.append(t)
grace += 1
}
XCTAssertTrue(
timings.contains { receivedPts.contains($0.ptsNs) },
"no 0xCF host timing matched a received AU's pts (got \(timings.count) timings)")
// Input goes the other way (enqueue-only; the host logs the count on close)
// including the touch kinds, gamepad events, the rich-input plane (DualSense
+51
@@ -56,6 +56,16 @@ pub struct Stats {
pub mbps: f32,
/// p50 `host+network` stage: capture → received, host-clock corrected (ms).
pub host_net_ms: f32,
/// p50 `host` stage: the host's own capture→fully-sent, from the per-AU 0xCF host
/// timings (design/stats-unification.md Phase 2). Valid only when `split`.
pub host_ms: f32,
/// p50 `network` stage: capture→received minus the host-reported share
/// (`hostnet − host`, per-frame, saturating). Valid only when `split`.
pub net_ms: f32,
/// The window had matched host timings — the OSD splits `host+network` into
/// `host + network`. An old host never emits 0xCF, so this stays false and the
/// combined stage renders unchanged.
pub split: bool,
/// p50 `decode` stage: received → decoded, single-clock client-local (ms).
pub decode_ms: f32,
/// Unrecoverable network frame drops this window, and their share of
@@ -67,6 +77,11 @@ pub struct Stats {
pub decoder: &'static str,
}
/// Frames the pump keeps waiting for their 0xCF host timing (pts → capture→received µs).
/// ~2 s at 120 Hz — a timing arrives within a frame or two of its AU, and against an old
/// host (no 0xCF at all) this just caps the dead-weight ring.
const PENDING_SPLIT_CAP: usize = 256;
/// Sort a window of µs samples in place and return `(p50, p95)` per the spec's index
/// rules (`sorted[len/2]`, `sorted[min(len*95/100, len-1)]`); an empty window reads 0.
pub fn window_percentiles(samples: &mut [u64]) -> (u64, u64) {
@@ -245,6 +260,12 @@ fn pump(
// corrected), `decode` = received→decoded (client-local). p50 per 1 s window.
let mut hostnet_us: Vec<u64> = Vec::with_capacity(256);
let mut decode_us: Vec<u64> = Vec::with_capacity(256);
// Host/network split (Phase 2): frames awaiting their per-AU 0xCF host timing,
// correlated by pts_ns. Bounded — an old host never sends any, so entries just age out.
let mut pending_split: std::collections::VecDeque<(u64, u64)> =
std::collections::VecDeque::with_capacity(PENDING_SPLIT_CAP);
let mut host_us_win: Vec<u64> = Vec::with_capacity(256);
let mut net_us_win: Vec<u64> = Vec::with_capacity(256);
// What actually decoded the last frame: a VAAPI failure demotes mid-session, so
// this is read off each frame's image variant rather than fixed at startup.
let mut dec_path: &'static str = "";
@@ -291,6 +312,12 @@ fn pump(
.max(0) as u64;
if hn > 0 && hn < 10_000_000_000 {
hostnet_us.push(hn / 1000);
// Remember the sample for the host/network split — matched
// against the AU's 0xCF host timing when it arrives.
if pending_split.len() >= PENDING_SPLIT_CAP {
pending_split.pop_front();
}
pending_split.push_back((frame.pts_ns, hn / 1000));
}
// `decode` stage: received→decoded, single clock, no skew.
decode_us.push(decoded_ns.saturating_sub(received_ns) / 1000);
@@ -310,6 +337,19 @@ fn pump(
Err(e) => break Some(format!("session: {e:?}")),
}
// Drain the per-AU host timings (0xCF) non-blockingly and match them to received
// frames by pts: host = the host's own capture→sent, network = our
// capture→received minus it (the two tile per frame by construction). An old
// host never emits any — the deque fills to its cap and the OSD keeps the
// combined `host+network` stage.
while let Ok(t) = connector.next_host_timing(Duration::ZERO) {
if let Some(i) = pending_split.iter().position(|(p, _)| *p == t.pts_ns) {
let (_, hn_us) = pending_split.remove(i).unwrap();
host_us_win.push(t.host_us as u64);
net_us_win.push(hn_us.saturating_sub(t.host_us as u64));
}
}
// Loss recovery: under infinite GOP the only recovery keyframe is one we request. The
// reassembler drops unrecoverable AUs (frames_dropped); the decoder then conceals the
// reference-missing delta frames that follow and returns Ok, so keying off a decode error
@@ -330,11 +370,17 @@ fn pump(
let secs = window_start.elapsed().as_secs_f32();
let (hn_p50, _) = window_percentiles(&mut hostnet_us);
let (dec_p50, _) = window_percentiles(&mut decode_us);
// Host/network split — present only when this window matched 0xCF timings.
let split = !host_us_win.is_empty();
let (host_p50, _) = window_percentiles(&mut host_us_win);
let (net_p50, _) = window_percentiles(&mut net_us_win);
let lost = dropped.saturating_sub(window_dropped) as u32;
window_dropped = dropped;
tracing::debug!(
fps = frames_n,
hostnet_p50_us = hn_p50,
host_p50_us = host_p50,
net_p50_us = net_p50,
decode_p50_us = dec_p50,
lost,
total_frames,
@@ -344,6 +390,9 @@ fn pump(
fps: frames_n as f32 / secs,
mbps: bytes_n as f32 * 8.0 / 1e6 / secs,
host_net_ms: hn_p50 as f32 / 1000.0,
host_ms: host_p50 as f32 / 1000.0,
net_ms: net_p50 as f32 / 1000.0,
split,
decode_ms: dec_p50 as f32 / 1000.0,
lost,
lost_pct: if lost > 0 {
@@ -358,6 +407,8 @@ fn pump(
bytes_n = 0;
hostnet_us.clear();
decode_us.clear();
host_us_win.clear();
net_us_win.clear();
}
};
+19 -4
@@ -68,10 +68,28 @@ impl StreamPage {
if self.hdr.get() {
line1.push_str(" · HDR");
}
// The equation line: split `host+network` into `host + network` when the host
// reported per-AU timings (0xCF, stats Phase 2); the combined stage otherwise.
let equation = if s.split {
format!(
"= host {:.1} + network {:.1} + decode {:.1} + display {:.1}",
s.host_ms,
s.net_ms,
s.decode_ms,
self.presented.display_ms.get(),
)
} else {
format!(
"= host+network {:.1} + decode {:.1} + display {:.1}",
s.host_net_ms,
s.decode_ms,
self.presented.display_ms.get(),
)
};
let mut text = format!(
"{line1}\n\
end-to-end {:.1} ms p50 · {:.1} p95 · capture→displayed{}\n\
= host+network {:.1} + decode {:.1} + display {:.1}",
{equation}",
self.presented.e2e_p50_ms.get(),
self.presented.e2e_p95_ms.get(),
if self.same_host {
@@ -79,9 +97,6 @@ impl StreamPage {
} else {
""
},
s.host_net_ms,
s.decode_ms,
self.presented.display_ms.get(),
);
// Counters: only rendered when nonzero this window.
if s.lost > 0 {
+17 -7
@@ -175,7 +175,8 @@ fn fmt_uptime(secs: u32) -> String {
/// The streaming HUD overlay (top-right), unified stats vocabulary (design/stats-unification.md):
/// a chip row (mode · codec · decode path · HDR), a stream line (received fps · goodput ·
/// presenter fps), the end-to-end headline (capture→on-glass p50/p95, host-clock corrected), the
/// stage equation (= host+network + decode + display, stage p50s), a session line
/// stage equation (= host + network + decode + display when the host reports 0xCF timings, else
/// the combined = host+network + decode + display; stage p50s), a session line
/// (host · time · loss/skips), and the shortcut hints. Layered over the `SwapChainPanel` in the
/// same grid cell.
fn hud_overlay(hud: &HudSample, mode: Option<Mode>, host: &str) -> Element {
@@ -212,12 +213,21 @@ fn hud_overlay(hud: &HudSample, mode: Option<Mode>, host: &str) -> Element {
if stats.same_host {
e2e_line.push_str(" (same-host clock)");
}
// The equation: the three stages tile the headline interval per frame; the window p50s only
// approximately sum (percentiles aren't additive).
let stage_line = format!(
"= host+network {:.1} + decode {:.1} + display {:.1}",
stats.hostnet_ms, stats.decode_ms, present.display_p50_ms
);
// The equation: the stages tile the headline interval per frame; the window p50s only
// approximately sum (percentiles aren't additive). With per-AU 0xCF host timings the opaque
// `host+network` term splits into `host` (host capture→sent) + `network` (the remainder);
// an old host emits none and the combined term stays.
let stage_line = if stats.split {
format!(
"= host {:.1} + network {:.1} + decode {:.1} + display {:.1}",
stats.host_ms, stats.net_ms, stats.decode_ms, present.display_p50_ms
)
} else {
format!(
"= host+network {:.1} + decode {:.1} + display {:.1}",
stats.hostnet_ms, stats.decode_ms, present.display_p50_ms
)
};
let mut session_bits: Vec<String> = Vec::new();
if !host.is_empty() {
session_bits.push(host.to_string());
+12
@@ -238,6 +238,18 @@ fn run_headless_cli(args: &[String], identity: (String, String)) {
session::SessionEvent::Connected {
mode, fingerprint, ..
} => tracing::info!(?mode, fp = %trust::hex(&fingerprint), "connected"),
// With per-AU 0xCF host timings the combined host+network stage splits into
// host (capture→sent on the host) + net; an old host emits none → combined only.
session::SessionEvent::Stats(s) if s.split => tracing::info!(
fps = format!("{:.0}", s.fps),
mbps = format!("{:.1}", s.mbps),
decode_p50_ms = format!("{:.2}", s.decode_ms),
hostnet_p50_ms = format!("{:.2}", s.hostnet_ms),
host_p50_ms = format!("{:.2}", s.host_ms),
net_p50_ms = format!("{:.2}", s.net_ms),
frames_seen,
"stats"
),
session::SessionEvent::Stats(s) => tracing::info!(
fps = format!("{:.0}", s.fps),
mbps = format!("{:.1}", s.mbps),
+44
@@ -55,6 +55,15 @@ pub struct Stats {
/// `host+network` stage p50 over the last 1 s window: capture (`pts_ns`) → received,
/// host-clock corrected via `clock_offset_ns`.
pub hostnet_ms: f32,
/// `host` stage p50 (host capture→sent, from the per-AU 0xCF host-timing plane). Valid only
/// when `split` — an old host emits no 0xCF and the HUD keeps the combined stage.
pub host_ms: f32,
/// `network` stage p50 (`hostnet − host`, tiled per frame before taking the percentile).
/// Valid only when `split`.
pub net_ms: f32,
/// True when any 0xCF host timings matched received AUs this window — the HUD then renders
/// `host + network` instead of the combined `host+network` term.
pub split: bool,
/// True when `clock_offset_ns == 0` (host didn't answer the skew handshake / same host):
/// the HUD appends `(same-host clock)` to the end-to-end line.
pub same_host: bool,
@@ -330,6 +339,12 @@ fn pump(
// 1 s tumbling stage windows (spec: design/stats-unification.md; percentiles, never means).
let mut hostnet_us: Vec<u64> = Vec::with_capacity(256);
let mut decode_us: Vec<u64> = Vec::with_capacity(256);
// Host/network split (Phase 2): received AUs awaiting their 0xCF host timing, `(pts_ns,
// hostnet_us)`, matched as the datagrams arrive. Bounded — an old host never sends any.
let mut pending_split: std::collections::VecDeque<(u64, u64)> =
std::collections::VecDeque::with_capacity(256);
let mut host_us_w: Vec<u64> = Vec::with_capacity(256);
let mut net_us_w: Vec<u64> = Vec::with_capacity(256);
let mut pcm = vec![0f32; 5760 * channels as usize]; // scratch: max Opus frame (120 ms) × channels
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut last_dropped = connector.frames_dropped();
@@ -352,6 +367,11 @@ fn pump(
.max(0) as u64;
if hostnet > 0 && hostnet < 10_000_000_000 {
hostnet_us.push(hostnet / 1000);
// Remember this AU for the 0xCF match below (host/network split).
pending_split.push_back((frame.pts_ns, hostnet / 1000));
if pending_split.len() > 256 {
pending_split.pop_front();
}
}
// A D3D11VA→software demotion (see `Decoder::decode`) starts a FRESH decoder that
// has none of the stream's parameter sets; under infinite GOP it would sit on
@@ -440,15 +460,34 @@ fn pump(
*crate::present::LATEST_HDR_META.lock().unwrap() = Some(meta);
}
// Drain the per-AU host-timing plane (0xCF) and match by pts: `host` = the host's own
// capture→sent, `network` = our capture→received minus it — the two tile per frame
// (design/stats-unification.md Phase 2). An old host never emits any; `split` stays false
// and the HUD keeps the combined `host+network` stage.
while let Ok(t) = connector.next_host_timing(Duration::ZERO) {
if let Some(i) = pending_split.iter().position(|(p, _)| *p == t.pts_ns) {
let (_, hn_us) = pending_split.remove(i).unwrap();
host_us_w.push(t.host_us as u64);
net_us_w.push(hn_us.saturating_sub(t.host_us as u64));
}
}
if window_start.elapsed() >= Duration::from_secs(1) {
let secs = window_start.elapsed().as_secs_f32();
hostnet_us.sort_unstable();
decode_us.sort_unstable();
host_us_w.sort_unstable();
net_us_w.sort_unstable();
let p50 = |v: &[u64]| v.get(v.len() / 2).copied().unwrap_or(0);
let (hostnet_p50, decode_p50) = (p50(&hostnet_us), p50(&decode_us));
let (host_p50, net_p50) = (p50(&host_us_w), p50(&net_us_w));
let split = !host_us_w.is_empty();
tracing::debug!(
fps = frames_n,
hostnet_p50_us = hostnet_p50,
host_p50_us = host_p50,
net_p50_us = net_p50,
split,
decode_p50_us = decode_p50,
total_frames,
"stream window"
@@ -458,6 +497,9 @@ fn pump(
mbps: bytes_n as f32 * 8.0 / 1e6 / secs,
decode_ms: decode_p50 as f32 / 1000.0,
hostnet_ms: hostnet_p50 as f32 / 1000.0,
host_ms: host_p50 as f32 / 1000.0,
net_ms: net_p50 as f32 / 1000.0,
split,
same_host: clock_offset == 0,
hardware,
hdr,
@@ -470,6 +512,8 @@ fn pump(
bytes_n = 0;
hostnet_us.clear();
decode_us.clear();
host_us_w.clear();
net_us_w.clear();
}
};
+34 -15
@@ -121,22 +121,41 @@ Sunshine's "host processing latency" (capture→send).
from frame-number gaps); our `fps` counts received only, equal at ~0 loss.
- Moonlight decode/queue/render times are **means**; ours are p50s.

## Phase 2 (specced, not in v1): split `host+network`
Carry the host's capture→send duration per AU (host stamps it at send, e.g. a
varint-µs field in the AU header or a 0.1 ms u16 à la Sunshine's frame header). Client
then displays `host {x} + network {y}` instead of `host+network`, where
`network = (received − capture) − host_reported`, and the Moonlight matrix gains a
direct "Host processing latency" counterpart. Requires a core wire/ABI bump
(`punktfunk_frame` gains `host_latency_us`), trailing-byte back-compat like the
compositor/gamepad preference bytes. Also consider surfacing the QUIC path RTT
(quinn exposes it) as a diagnostics line, clearly labelled control-plane RTT.

## Phase 2 (implemented): split `host+network` via the 0xCF host-timing plane
Not an AU-header change after all: the hardened data-plane format stays untouched.
The host reports its share on the established QUIC side-plane pattern:

- **Cap bit**: `Hello::video_caps` gains `VIDEO_CAP_HOST_TIMING` (0x08). NativeClient
ORs it in unconditionally; the probe sets it explicitly. Old hosts ignore it.
- **Datagram**: `HOST_TIMING_MAGIC` 0xCF, 13 bytes:
`[tag][pts_ns u64 LE][host_us u32 LE]` (`quic::HostTiming`). Emitted once per AU by the
send thread right after the AU's last packet left the socket, so `host_us` =
capture→fully-sent (capture read/convert, encode, FEC+seal, paced send) against the
same anchor as the wire pts. Speed-test filler (FLAG_PROBE) is skipped. The synthetic
host emits it too (loopback protocol tests cover the plane).
- **Client math**: correlate by `pts_ns` (a bounded pending ring of receipt samples),
`host = host_us`, `network = hostnet_sample − host_us` (saturating), so the two terms
tile the `host+network` stage per frame by construction. Equation line becomes
`= host {x} + network {y} + decode + display`; when no 0xCF arrived in the window
(old host / all datagrams lost) it falls back to the combined `host+network` term.
- **Surfaces**: `NativeClient::next_host_timing()` (Rust clients), C ABI
`PunktfunkHostTiming` + `punktfunk_connection_next_host_timing` (Apple), probe log
line `host/network latency split` (host_p50/p95_us · net_p50/p95_us).
- This is our direct analogue of Sunshine's "host processing latency"; ours
additionally includes the paced send (theirs stops just before the UDP send).

Still open for a later phase: surfacing the QUIC path RTT (quinn exposes it) as a
diagnostics line, clearly labelled control-plane RTT.
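
The datagram layout and the client math described in this section can be sketched in Rust roughly as follows. This is an illustration under stated assumptions, not the project's code: `parse_host_timing` and `SplitMatcher` are hypothetical names, and the `HostTiming` struct below is only a stand-in for the real `quic::HostTiming`, whose fields and parser are not shown here. Only the 13-byte layout, the pts correlation over a bounded ring, and the saturating subtraction are taken from the text above.

```rust
use std::collections::VecDeque;

/// Tag byte of the host-timing datagram (`HOST_TIMING_MAGIC` in the text above).
const HOST_TIMING_MAGIC: u8 = 0xCF;
/// 1 tag byte + 8 bytes `pts_ns` + 4 bytes `host_us`.
const HOST_TIMING_LEN: usize = 13;

/// Illustrative stand-in for `quic::HostTiming` (field names assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HostTiming {
    pub pts_ns: u64,
    pub host_us: u32,
}

/// Parse `[tag][pts_ns u64 LE][host_us u32 LE]`; any other shape yields `None`.
pub fn parse_host_timing(buf: &[u8]) -> Option<HostTiming> {
    if buf.len() != HOST_TIMING_LEN || buf[0] != HOST_TIMING_MAGIC {
        return None;
    }
    let mut pts = [0u8; 8];
    pts.copy_from_slice(&buf[1..9]);
    let mut host = [0u8; 4];
    host.copy_from_slice(&buf[9..13]);
    Some(HostTiming {
        pts_ns: u64::from_le_bytes(pts),
        host_us: u32::from_le_bytes(host),
    })
}

/// Hypothetical helper: a bounded, drop-oldest ring of `(pts_ns, hostnet_us)` receipts.
pub struct SplitMatcher {
    pending: VecDeque<(u64, u64)>,
    cap: usize,
}

impl SplitMatcher {
    pub fn new(cap: usize) -> Self {
        Self { pending: VecDeque::with_capacity(cap), cap }
    }

    /// Remember one received AU's capture→received sample until its timing arrives.
    pub fn record_receipt(&mut self, pts_ns: u64, hostnet_us: u64) {
        if self.pending.len() >= self.cap {
            self.pending.pop_front(); // an old host never sends timings, so entries age out
        }
        self.pending.push_back((pts_ns, hostnet_us));
    }

    /// Match a timing by exact pts and return `(host_us, network_us)`.
    /// `None` when no receipt with that pts is pending (lost AU, duplicate, evicted).
    pub fn match_timing(&mut self, t: HostTiming) -> Option<(u64, u64)> {
        let i = self.pending.iter().position(|(p, _)| *p == t.pts_ns)?;
        let (_, hostnet_us) = self.pending.remove(i)?;
        let host_us = u64::from(t.host_us);
        Some((host_us, hostnet_us.saturating_sub(host_us)))
    }
}
```

Under these assumptions, a receipt of 8000 µs followed by a report with `host_us = 3000` for the same pts yields `(3000, 5000)`, and a second report for that pts returns `None` because the receipt was already consumed.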
## Implementation status
- [ ] Apple (`StreamHUDView`/`SessionModel`/`Stage2Pipeline` + `LatencyMeter` reuse)
- [ ] Windows (`app/stream.rs` HUD rows, `session.rs`/`render.rs` meters → p50/p95)
- [ ] Linux (`ui_stream.rs` OSD, `session.rs` window meters)
- [ ] Android (`stats.rs`/`decode.rs` stage split, `StatsOverlay.kt`)
- [ ] probe (rename `capture→reassembled` → `capture→received` in the log line)
- [ ] docs-site stats page + matrix; link from `moonlight.md`

- [x] Apple (`StreamHUDView`/`SessionModel`/`Stage2Pipeline` + `LatencyMeter` reuse) (09a5957)
- [x] Windows (`app/stream.rs` HUD rows, `session.rs`/`render.rs` meters → p50/p95) (09a5957)
- [x] Linux (`ui_stream.rs` OSD, `session.rs` window meters) (09a5957)
- [x] Android (`stats.rs`/`decode.rs` stage split, `StatsOverlay.kt`) (09a5957)
- [x] probe (rename `capture→reassembled` → `capture→received` in the log line) (09a5957)
- [x] docs-site stats page + matrix; link from `moonlight.md` (09a5957)
- [x] Phase 2 wire layer (0xCF + cap bit + NativeClient/ABI + host emission + probe split) (449a67c)
- [ ] Phase 2 client HUDs (host/network equation terms on Apple/Windows/Linux/Android)
- [ ] On-glass validation everywhere (Mac swift test + glass, Windows CI + glass, Android device, Linux glass)
+11 -7
@@ -25,7 +25,7 @@ life:
```
1920×1080@120 · 119 fps · 38.2 Mb/s · HEVC 10-bit HDR · GPU decode
end-to-end 14.2 ms p50 · 19.8 p95 · capture→on-glass
= host+network 9.8 + decode 2.1 + display 2.3
= host 3.1 + network 6.7 + decode 2.1 + display 2.3
lost 3 (0.1%) · skipped 1 · FEC 12
```
@@ -35,14 +35,18 @@ lost 3 (0.1%) · skipped 1 · FEC 12
 capture to the endpoint named at the end of the line (`capture→on-glass` here).
 `p50` = the typical frame (median), `p95` = the slow outliers. This is the one
 number that summarizes your stream.
-- **Line 3 — where the time goes.** The three stages **tile the end-to-end interval**
+- **Line 3 — where the time goes.** The stages **tile the end-to-end interval**
 each starts where the previous one ends, so they add up to the headline:
-- `host+network` — capture → received: the host's capture/encode/send pipeline
-*plus* the network flight and reassembly, in one number.
+- `host` — capture → sent: the host's own share (capture read, encode, error
+coding, the paced send), reported by the host itself once per frame.
+- `network` — sent → received: the network flight plus reassembly on your device.
 - `decode` — received → decoded, on your device.
 - `display` — decoded → displayed: waiting for the right screen refresh, rendering,
 and vsync.
+Against an **older host** that doesn't report its share yet, the first two terms
+merge into a single `host+network` number — same total, one split fewer.
 (Stage values are per-stage medians, so they sum only *approximately* to the
 headline median — percentiles aren't perfectly additive. The headline is measured
 directly, never computed as a sum.)
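Reviewer sketch (not part of the commit): the fallback rule and the "medians only approximately add up" caveat in this hunk can be illustrated in a few lines of Python. The helper name `equation_line`, its parameters, and the sample numbers are assumptions made up for illustration; the real clients implement this in their own languages and may structure it differently.

```python
from statistics import median
from typing import Optional, Tuple


def equation_line(
    split: Optional[Tuple[float, float]],
    host_plus_network: float,
    decode: float,
    display: float,
) -> str:
    """Format HUD line 3 (hypothetical helper, not the project's API)."""
    tail = f"decode {decode:.1f} + display {display:.1f}"
    if split is None:
        # No host timing matched this window (older host or lost datagrams):
        # show the combined term rather than a misleading zero.
        return f"= host+network {host_plus_network:.1f} + {tail}"
    host, network = split
    return f"= host {host:.1f} + network {network:.1f} + {tail}"


# Made-up per-frame stage times in ms, chosen so the effect is obvious.
host_ms = [1.0, 2.0, 9.0]
network_ms = [9.0, 2.0, 1.0]
per_frame_total = [h + n for h, n in zip(host_ms, network_ms)]

# Per-stage medians are computed independently of the headline median.
sum_of_medians = median(host_ms) + median(network_ms)
median_of_totals = median(per_frame_total)

print(equation_line((3.1, 6.7), 9.8, 2.1, 2.3))
print(equation_line(None, 9.8, 2.1, 2.3))
print(sum_of_medians, median_of_totals)
```

With these deliberately skewed samples the per-stage medians sum to 4.0 while the median of the per-frame totals is 10.0, which is why the docs state that the headline is measured directly rather than computed as a sum.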
@@ -109,10 +113,10 @@ stands in for a one-way frame flight that Moonlight doesn't measure.)
 | `Incoming frame rate from network` | Frames reassembled from the network per second | `fps` (line 1) | **Yes — direct** |
 | `Decoding frame rate` (desktop only) | Frames leaving the decoder per second | not shown separately (equals `fps` unless the decoder is falling behind) | — |
 | `Rendering frame rate` (desktop only) | Frames actually presented per second | `fps` minus `skipped` | Approximately |
-| `Host processing latency min/max/avg` (Sunshine hosts) | Host capture → just-before-send, reported by Sunshine per frame | contained inside `host+network`; the host-side breakdown lives in the punktfunk web console (capture/encode/send stages) | Indirect — punktfunk's `host+network` additionally includes the network flight |
+| `Host processing latency min/max/avg` (Sunshine hosts) | Host capture → just-before-send, reported by Sunshine per frame | `host` (line 3) — the host reports capture→fully-sent per frame the same way | **Yes — direct** (punktfunk's includes the paced send itself, Sunshine's stops just before it; avg vs p50) |
 | `Frames dropped by your network connection` | Frame-sequence gaps ÷ total frames | `lost` (line 4) | **Yes — direct** |
 | `Frames dropped due to network jitter` | Decoded frames the *client's pacer* chose to drop ÷ decoded frames | `skipped` (line 4) | Approximately (both are client-side pacing decisions, despite Moonlight's name) |
-| `Average network latency` | The **control connection's round-trip time** (ENet RTT + variance) — not video frame latency | none, on purpose | **No.** An RTT is not a frame latency; punktfunk measures the actual per-frame path instead |
+| `Average network latency` | The **control connection's round-trip time** (ENet RTT + variance) — not video frame latency | `network` (line 3) is the closest concept, but it's the *actual one-way frame path* (flight + reassembly), not an RTT | **No direct comparison.** Roughly, punktfunk's `network` ≈ ½ × an idle RTT plus serialization time of the frame |
 | `Average decoding time` | Mean time from decoder enqueue to picture out | `decode` (p50) | Yes (mean vs median; both include decoder queueing) |
 | `Average frame queue delay` | Mean time a decoded frame waits for its vsync slot | inside `display` | Sum the two Moonlight lines → |
 | `Average rendering time (incl. V-sync latency)` | Mean duration of the present call | inside `display` | …and compare against punktfunk's `display` |