feat(host): web-console performance capture — record stream stats, graph them
apple / swift (push) Successful in 1m1s
android / android (push) Successful in 4m13s
ci / rust (push) Successful in 4m42s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 53s
windows-host / package (push) Successful in 5m51s
apple / screenshots (push) Successful in 5m1s
deb / build-publish (push) Successful in 2m29s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 33s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
ci / bench (push) Successful in 4m35s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m9s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m10s
Arm streaming-perf-stats capture from the web console, play, stop, and review the run as graphs; finished captures are saved to disk as browsable/exportable recordings. Covers both the native punktfunk/1 path and GameStream.

- stats_recorder.rs: one shared Arc<StatsRecorder> ring (created in gamestream::serve, shared with the mgmt API + both streaming loops, mirroring NativePairing). The hot-path gate is a runtime AtomicBool that replaces the startup-only PUNKTFUNK_PERF for *recording* (PERF stdout logging unchanged); bounded ring (~3 h); atomic temp+rename writes to ~/.config/punktfunk/captures/*.json; path-traversal-safe ids; poison-resilient locks.
- native (punktfunk1.rs) + GameStream (stream.rs) emit a StatsSample at their existing ~2 s / ~1 s aggregation boundary — per-stage latency p50/p99, fps new/repeat, goodput, loss/FEC deltas — with no new per-frame work beyond the cheap atomic check. FrameMsg.was_measured keeps pre-arm in-flight frames out of the first window's percentiles (without zeroing the Windows-relay path's fps/encode).
- mgmt.rs: 7 bearer-only /api/v1/stats/* endpoints (capture start/stop/status/live; recordings list/get/delete); api/openapi.json regenerated, in sync.
- web: new "Performance" page (recharts, rendered SSR-safe) — capture control, live graphs while armed, recordings table (view / download-JSON / delete), and a detail view with the latency stacked-area bottleneck breakdown (p50/p99 toggle) + throughput + health. Charts adapt to either path's stage set.

Design: design/stats-capture-plan.md. Built and adversarially reviewed via a multi-agent workflow; workspace build/clippy(-D warnings)/fmt/tests green, OpenAPI no-drift. Not yet on-glass validated against a live session.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,246 @@
# Stats capture & graphing — design

Goal: let an operator **enable performance-stats capture from the web console**, play a
session, **stop**, and **review the captured time-series as graphs** in the web console.
Captures are **saved to disk** (browse/compare past sessions; survive host restart) and
cover **both** streaming paths: native punktfunk/1 (`virtual_stream`) and GameStream/Moonlight
(`gamestream/stream.rs`).

This builds on the existing per-stage instrumentation (today gated by `PUNKTFUNK_PERF=1`,
stdout-only, read once at startup). We make recording **runtime-toggleable**, route the same
aggregates into a **shared ring → on-disk recording**, and expose it over the mgmt REST API +
web console.

---

## 1. Host: shared `StatsRecorder`

New module `crates/punktfunk-host/src/stats_recorder.rs`. One `Arc<StatsRecorder>` is created
once in the unified host entry (`gamestream::serve`, the `serve` subcommand) alongside
`Arc<NativePairing>`, and shared with **both** the mgmt API (`MgmtState`) and the streaming
loops (threaded through `punktfunk1::serve` → `SessionContext` → `virtual_stream`/`send_loop`,
and into the GameStream encode loop). Mirror the existing `NativePairing` Arc-sharing pattern
exactly.

### Data model (serde + utoipa `ToSchema`; this is the wire + on-disk shape)

```rust
// Derives follow the heading: serde for the wire/on-disk shape, utoipa for the OpenAPI schema.
use serde::{Deserialize, Serialize};
use utoipa::ToSchema;

/// One pipeline stage's latency in a window (microseconds).
#[derive(Clone, Serialize, Deserialize, ToSchema)]
pub struct StageTiming {
    pub name: String,          // "capture" | "submit" | "encode" | "packetize" | "send"
    pub p50_us: f32,
    pub p99_us: f32,
}

/// One aggregated sample (~ every 2 s native, ~ every 1 s GameStream).
#[derive(Clone, Serialize, Deserialize, ToSchema)]
pub struct StatsSample {
    pub t_ms: u64,                // ms since capture start (monotonic, from a stored Instant)
    pub session_id: u32,          // disambiguates concurrent sessions (usually constant)
    pub stages: Vec<StageTiming>, // ordered pipeline stages for this path
    pub fps: f32,                 // genuine NEW frames/s from the source
    pub repeat_fps: f32,          // re-encoded holds/s (source-starvation indicator)
    pub mbps: f32,                // tx goodput (Mb/s)
    pub bitrate_kbps: u32,        // configured target bitrate
    pub frames_dropped: u32,      // delta in this window
    pub packets_dropped: u32,     // delta (receiver-side / reassembler), where known
    pub send_dropped: u32,        // delta (host send-buffer overflow / EAGAIN)
    pub fec_recovered: u32,       // delta (shards recovered)
}

#[derive(Clone, Serialize, Deserialize, ToSchema)]
pub struct CaptureMeta {
    pub id: String,            // "2026-06-26T20-14-03Z_5120x1440" — also the filename stem
    pub started_unix_ms: u64,
    pub duration_ms: u64,
    pub kind: String,          // "native" | "gamestream"
    pub width: u32,
    pub height: u32,
    pub fps: u32,
    pub codec: String,         // "h264" | "hevc" | "av1"
    pub client: String,        // short label / fingerprint prefix, or "" if unknown
    pub sample_count: u32,
}

#[derive(Clone, Serialize, Deserialize, ToSchema)]
pub struct Capture {
    pub meta: CaptureMeta,
    pub samples: Vec<StatsSample>,
}

#[derive(Clone, Serialize, Deserialize, ToSchema)]
pub struct StatsStatus {
    pub armed: bool,           // capture currently running
    pub sample_count: u32,     // samples in the in-progress capture
    pub started_unix_ms: u64,  // 0 if idle
    pub kind: String,          // path of the in-progress capture, "" if idle
}
```

Stage sets per path (ordered, roughly the per-frame critical path so stacking is meaningful):
- **native**: `capture` (try_latest ring read + color convert), `submit` (NVENC enqueue),
  `encode` (lock_bitstream = NVENC schedule + ASIC — the dominant stage under GPU load),
  `send` (paced_submit: seal + FEC + pace + sendmmsg).
- **gamestream**: `capture`, `encode`, `packetize` (poll+FEC+packetize), `send`.

> Native naming: today's vectors are `st_cap`→`capture`, `st_submit`→`submit`,
> `st_wait`→`encode`, `pace_us`→`send`. (`encode_us` total ≈ capture+submit+encode; we do not
> emit it as a stage to avoid double-counting — it's implied by the stack.)
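As an illustration of how one window of per-frame timings could collapse into a `StageTiming`, here is a minimal sketch. It assumes a nearest-rank percentile, which the plan does not specify; `percentile_us` and `summarize_stage` are hypothetical helper names, and the struct is a serde-free copy of the one in the data model.

```rust
/// One pipeline stage's latency in a window (microseconds), mirroring the data model.
#[derive(Debug, Clone, PartialEq)]
pub struct StageTiming {
    pub name: String,
    pub p50_us: f32,
    pub p99_us: f32,
}

/// Nearest-rank percentile over an already sorted slice (assumed method).
/// Returns 0.0 for an empty window instead of fabricating a value.
fn percentile_us(sorted: &[u32], pct: f64) -> f32 {
    if sorted.is_empty() {
        return 0.0;
    }
    // Nearest rank: ceil(pct/100 * n), clamped to 1..=n, then converted to a 0-based index.
    let rank = ((pct / 100.0) * sorted.len() as f64).ceil() as usize;
    let idx = rank.clamp(1, sorted.len()) - 1;
    sorted[idx] as f32
}

/// Collapse one window of per-frame timings into a StageTiming, then clear the
/// window so the next aggregation period starts empty.
pub fn summarize_stage(name: &str, window_us: &mut Vec<u32>) -> StageTiming {
    window_us.sort_unstable();
    let timing = StageTiming {
        name: name.to_string(),
        p50_us: percentile_us(&window_us[..], 50.0),
        p99_us: percentile_us(&window_us[..], 99.0),
    };
    window_us.clear();
    timing
}
```

Sorting happens only at the aggregation boundary, so the per-frame cost stays a single `Vec::push` into the window.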

### Recorder API

```rust
pub struct StatsRecorder { /* dir, armed: AtomicBool, live: Mutex<Option<Live>>, next_sid: AtomicU32 */ }

impl StatsRecorder {
    pub fn new(dir: PathBuf) -> Arc<Self>; // creates dir (0700) if missing

    pub fn is_armed(&self) -> bool; // cheap Relaxed atomic load — called on the hot path

    /// Arm a new capture. No-op if already armed (returns current status).
    pub fn start(&self) -> StatsStatus;

    /// A streaming loop announces itself when it first records while armed.
    /// Seeds CaptureMeta (kind/w/h/fps/codec/client) on the FIRST registration. Returns session_id.
    pub fn register_session(&self, kind: &'static str, w: u32, h: u32, fps: u32, codec: &str, client: &str) -> u32;

    /// Append one aggregated sample (called from the loops' existing ~2 s/~1 s boundary).
    /// Bounded: cap at MAX_SAMPLES (e.g. 5400 ≈ 3 h @ 2 s). On overflow, stop appending and
    /// set a `truncated` flag (DO NOT drop oldest — a saved recording must keep its start).
    pub fn push_sample(&self, session_id: u32, sample: StatsSample);

    /// Disarm + finalize: write <dir>/<id>.json atomically, clear live, return saved meta.
    pub fn stop(&self) -> std::io::Result<Option<CaptureMeta>>;

    pub fn status(&self) -> StatsStatus;
    pub fn live_snapshot(&self) -> Option<Capture>; // clone of the in-progress capture for live graphing

    pub fn list(&self) -> Vec<CaptureMeta>; // scan dir, parse meta only, newest first
    pub fn load(&self, id: &str) -> std::io::Result<Capture>;
    pub fn delete(&self, id: &str) -> std::io::Result<()>;
}
```
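The bounded-append rule of `push_sample` can be sketched as below. This is only an illustration under stated assumptions: `BoundedRing` and the layout of `Live` are hypothetical, samples are reduced to a bare `t_ms`, and the cap is a constructor argument so the behavior is easy to exercise. The lock helper shows one way to get the poison-resilient locking mentioned in the commit message.

```rust
use std::sync::{Mutex, MutexGuard};

/// Hypothetical in-progress capture state (the real `Live` would hold full samples and meta).
pub struct Live {
    pub samples: Vec<u64>, // stand-in for Vec<StatsSample>; here just t_ms
    pub truncated: bool,
}

pub struct BoundedRing {
    max_samples: usize,
    live: Mutex<Option<Live>>,
}

impl BoundedRing {
    pub fn new(max_samples: usize) -> Self {
        Self { max_samples, live: Mutex::new(None) }
    }

    /// Poison-resilient lock: recover the guard if another thread panicked while holding it.
    fn lock(&self) -> MutexGuard<'_, Option<Live>> {
        self.live.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
    }

    /// Arm a capture; a no-op if one is already in progress.
    pub fn start(&self) {
        let mut live = self.lock();
        if live.is_none() {
            *live = Some(Live { samples: Vec::new(), truncated: false });
        }
    }

    /// Append until the cap is reached; after that keep the oldest samples
    /// and only record that the capture was truncated.
    pub fn push_sample(&self, t_ms: u64) {
        if let Some(live) = self.lock().as_mut() {
            if live.samples.len() < self.max_samples {
                live.samples.push(t_ms);
            } else {
                live.truncated = true;
            }
        }
    }

    /// Clone of the in-progress samples plus the truncation flag, or None when idle.
    pub fn snapshot(&self) -> Option<(Vec<u64>, bool)> {
        self.lock().as_ref().map(|l| (l.samples.clone(), l.truncated))
    }
}
```

Keeping the oldest samples means a truncated recording still starts at `t_ms = 0`, which is what the comment on `push_sample` asks for.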

Invariants / safety:
- **No async on the per-frame path.** `is_armed()` is a `Relaxed` atomic load; sample
  construction happens only at the existing 2 s / 1 s aggregation boundary, never per frame.
- **`id` is path-traversal-safe.** `load`/`delete` MUST reject any id not matching
  `^[A-Za-z0-9._-]+$` (no `/`, no `:`, so it stays a valid Windows filename) and any id
  containing `..` (which the character class alone would admit), and only ever join
  `dir/<id>.json`. Return NotFound on reject. (Endpoints are bearer-authed, but defend in
  depth.)
- **Bounded memory.** `MAX_SAMPLES` cap; on overflow stop appending and keep the oldest
  samples, never grow unbounded.
- **Atomic disk write.** Write to `<id>.json.tmp` then rename, so a crash mid-write can't leave
  a half file. Pretty-print not required; compact JSON is fine.
- Captures dir: `~/.config/punktfunk/captures/` (next to `cert.pem` etc.). Resolve via the same
  config-dir helper the rest of the host uses.
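The id check and the temp+rename write from the list above might look roughly like this, using only the standard library. The function names (`is_safe_id`, `capture_path`, `write_atomic`) are hypothetical, and the `sync_all` call is an extra durability step that the plan does not require.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// True only for non-empty ids made of [A-Za-z0-9._-] that do not contain "..".
pub fn is_safe_id(id: &str) -> bool {
    !id.is_empty()
        && !id.contains("..")
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}

/// Map an id to `<dir>/<id>.json`, answering NotFound for anything unsafe.
pub fn capture_path(dir: &Path, id: &str) -> io::Result<PathBuf> {
    if !is_safe_id(id) {
        return Err(io::Error::new(io::ErrorKind::NotFound, "no such capture"));
    }
    Ok(dir.join(format!("{id}.json")))
}

/// Write to `<id>.json.tmp`, flush it, then rename it over the final name so a
/// crash mid-write never leaves a half-written `<id>.json`.
pub fn write_atomic(dir: &Path, id: &str, json: &[u8]) -> io::Result<PathBuf> {
    let final_path = capture_path(dir, id)?;
    let tmp_path = dir.join(format!("{id}.json.tmp"));
    let mut file = fs::File::create(&tmp_path)?;
    file.write_all(json)?;
    file.sync_all()?; // assumption: flush before rename for durability
    drop(file);
    fs::rename(&tmp_path, &final_path)?;
    Ok(final_path)
}
```

Routing `load`, `delete`, and the writer through one path-building function keeps the validation in a single place.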

### Runtime gating change (the key behavioral change)

Today the loops measure per-stage timing only `if perf` (a startup bool). Change the per-frame
**measurement** predicate to `let measure = perf || recorder.is_armed();`, re-evaluated each
frame (cheap atomic). Then at the aggregation boundary:
- if `perf` → keep the existing `tracing::info!` log line (unchanged behavior);
- if `recorder.is_armed()` → also build a `StatsSample` and `push_sample`.

So `PUNKTFUNK_PERF=1` still works exactly as before, AND the web toggle now works at runtime
with zero startup flags.
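A minimal sketch of the gating described above, assuming a stand-in `Gate` type in place of the real recorder; `run_stage` and `boundary_actions` are hypothetical names used only to make the two decision points explicit.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::Instant;

/// Stand-in for the recorder's hot-path gate.
pub struct Gate {
    armed: AtomicBool,
}

impl Gate {
    pub fn new() -> Self {
        Self { armed: AtomicBool::new(false) }
    }
    pub fn set_armed(&self, on: bool) {
        self.armed.store(on, Ordering::Relaxed);
    }
    pub fn is_armed(&self) -> bool {
        self.armed.load(Ordering::Relaxed)
    }
}

/// Per-frame step: time the stage only when `perf` or the gate asks for it.
/// Returns the measured microseconds, or None when measurement was skipped.
pub fn run_stage<F: FnOnce()>(perf: bool, gate: &Gate, stage: F) -> Option<u32> {
    let measure = perf || gate.is_armed(); // re-evaluated on every frame
    let started = measure.then(Instant::now);
    stage();
    started.map(|t| t.elapsed().as_micros().min(u32::MAX as u128) as u32)
}

/// Aggregation boundary: (emit the perf log line, push a StatsSample).
pub fn boundary_actions(perf: bool, gate: &Gate) -> (bool, bool) {
    (perf, gate.is_armed())
}
```

The two outputs of `boundary_actions` are independent, which is why `PUNKTFUNK_PERF=1` logging and web-armed recording can be on in any combination.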

### Where each loop emits the sample

- **native** (`punktfunk1.rs`): the cap/submit/encode(`st_wait`) splits live in the capture
  thread; `mbps`/`send_dropped`/`bytes` and `session.stats()` live in the send thread. Emit the
  complete sample from **one** place. Cleanest: carry the per-frame `cap_us/submit_us/wait_us`
  (and a `repeat: bool`) on `FrameMsg` to the send thread (it already carries `encode_us`), so
  `send_loop` builds the whole sample at its existing 2 s boundary where `session.stats()` is
  already read. Compute `frames_dropped/packets_dropped/send_dropped/fec_recovered` as deltas vs
  the previous window's `Session::stats()` snapshot (the loop already tracks `last_bytes` /
  `last_send_dropped` — extend that bookkeeping). `register_session` is called once with the
  negotiated mode/codec and the client label.
- **gamestream** (`gamestream/stream.rs`): the encode loop already tracks per-stage max each
  1 s. Add p50/p99 accumulation (small per-stage `Vec<u32>` like the native path) and, when
  `perf || recorder.is_armed()`, emit a `StatsSample` with stages
  `[capture, encode, packetize, send]` + fps (unique new frames) + mbps + whatever loss/byte
  counters that path exposes (use 0 where a counter doesn't exist; do NOT fabricate). Call
  `register_session("gamestream", ...)` with the GameStream-negotiated mode/codec/client.
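The per-window delta bookkeeping for the loss counters could be sketched as follows. The `Counters` struct is a hypothetical stand-in for whatever `Session::stats()` returns (its real fields and widths are not shown in this plan), and `DeltaTracker` is an illustrative name.

```rust
/// Hypothetical cumulative counters, as a `Session::stats()`-style snapshot might expose them.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Counters {
    pub frames_dropped: u64,
    pub packets_dropped: u64,
    pub send_dropped: u64,
    pub fec_recovered: u64,
}

/// Per-window deltas in the u32 shape that StatsSample uses.
#[derive(Debug, PartialEq)]
pub struct WindowDeltas {
    pub frames_dropped: u32,
    pub packets_dropped: u32,
    pub send_dropped: u32,
    pub fec_recovered: u32,
}

/// Saturating difference, so a counter reset yields 0 instead of wrapping.
fn delta(now: u64, prev: u64) -> u32 {
    now.saturating_sub(prev).min(u32::MAX as u64) as u32
}

/// Remembers the previous window's snapshot between aggregation boundaries.
pub struct DeltaTracker {
    last: Counters,
}

impl DeltaTracker {
    /// `baseline` is the snapshot taken when recording begins.
    pub fn new(baseline: Counters) -> Self {
        Self { last: baseline }
    }

    /// Compute this window's deltas and store `now` as the next baseline.
    pub fn window(&mut self, now: Counters) -> WindowDeltas {
        let d = WindowDeltas {
            frames_dropped: delta(now.frames_dropped, self.last.frames_dropped),
            packets_dropped: delta(now.packets_dropped, self.last.packets_dropped),
            send_dropped: delta(now.send_dropped, self.last.send_dropped),
            fec_recovered: delta(now.fec_recovered, self.last.fec_recovered),
        };
        self.last = now;
        d
    }
}
```

Seeding the tracker with the counters at arm time keeps losses that happened before the capture out of its first window.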

Threading: add `stats: Arc<StatsRecorder>` to `SessionContext` and the GameStream stream
setup; the standalone `punktfunk1-host` subcommand (no mgmt) passes a fresh recorder (harmless,
just unused).

---

## 2. Host: mgmt REST API (`mgmt.rs`)

Add `stats: Arc<StatsRecorder>` to `MgmtState`. Register handlers in `api_router_parts()` via
`routes!()` with `#[utoipa::path]`. All under `/api/v1`, **bearer-token only** (operator
actions — do NOT add them to the mTLS `cert_may_access` read-only allowlist). All bodies/returns
derive `ToSchema`; errors use the `ApiJson`/`ApiError` envelope. Tag every operation `stats`.

| Method & path                          | fn (operationId)         | body → returns |
|----------------------------------------|--------------------------|----------------|
| POST `/api/v1/stats/capture/start`     | `stats_capture_start`    | — → `StatsStatus` |
| POST `/api/v1/stats/capture/stop`      | `stats_capture_stop`     | — → `CaptureMeta` (200) / 204 if nothing was recording |
| GET `/api/v1/stats/capture/status`     | `stats_capture_status`   | → `StatsStatus` |
| GET `/api/v1/stats/capture/live`       | `stats_capture_live`     | → `Capture` (in-progress; 404/empty if idle) |
| GET `/api/v1/stats/recordings`         | `stats_recordings_list`  | → `Vec<CaptureMeta>` |
| GET `/api/v1/stats/recordings/{id}`    | `stats_recording_get`    | → `Capture` |
| DELETE `/api/v1/stats/recordings/{id}` | `stats_recording_delete` | → `StatsStatus`/204 |

Register the new `ToSchema` types with the OpenApi derive's `components(schemas(...))` list.
Then regenerate the checked-in spec:

```
cargo run -p punktfunk-host -- openapi > api/openapi.json
```

CI fails on drift — the regenerated `api/openapi.json` MUST be committed.

---

## 3. Web console (`web/`)

New page **"Performance"** following the established route → section/index (fetch) →
section/view (presentational) pattern, registered in the `NAV` array (`app-shell.tsx`) with a
lucide icon (`Activity` or `LineChart`).

- Route: `web/src/routes/stats.tsx` → `createFileRoute('/stats')` → `SectionStats`.
- Section: `web/src/sections/Stats/index.tsx` (orval hooks) + `view.tsx` (presentational,
  i18n via Paraglide `m.*`). Use `Section`, `QueryState`, `Card`/`CardHeader`/`CardTitle`/
  `CardContent`, `Button`, `Badge` from `web/src/components/ui`.
- Charts: **add `recharts`** to `web/package.json` (no chart lib exists today). Render charts
  **client-only** (a mounted guard) so SSR doesn't choke on `ResponsiveContainer`'s 0-width
  measure. Theme via existing CSS variables / brand violet, dark-mode aware.

Data hooks come from regenerated orval (`bun run api:gen` after the host's openapi.json is
updated): `useStatsCaptureStatus`, `useStatsCaptureStart`, `useStatsCaptureStop`,
`useStatsCaptureLive`, `useStatsRecordingsList`, `useStatsRecordingGet`,
`useStatsRecordingDelete` (exact names per orval's tag/operationId convention — verify against
generated output and adjust the view imports to match).

UI layout:
1. **Capture control card** — Start/Stop button (mutations; invalidate status query on
   success), a "Recording…"/"Idle" `Badge`, elapsed time + live sample count
   (`useStatsCaptureStatus`, `refetchInterval: 2000`). On Start, the live chart appears.
2. **Live chart** (visible while armed; `useStatsCaptureLive`, `refetchInterval: 2000`) — the
   latency stage breakdown as a **stacked area** (capture/submit/encode/send in µs, the
   "where does the time go" view), with fps and mbps as secondary line charts.
3. **Recordings card** — table from `useStatsRecordingsList`: time, kind badge, resolution,
   codec, duration, sample count; row actions **View** (select → detail), **Download** (export
   the `Capture` JSON via the recording GET), **Delete** (mutation, confirm).
4. **Recording detail** — when a recording (or the live capture) is selected, render the full
   graph set from its `samples`:
   - Latency stage breakdown (stacked area, µs) — primary bottleneck view; p99 overlay toggle.
   - Throughput: fps (new vs repeat) + mbps.
   - Health: frames_dropped / packets_dropped / send_dropped / fec_recovered over time.

i18n: add keys to `web/messages/en.json` + `de.json` (nav label, titles, button/labels) and
regenerate Paraglide. Keep both locales in sync.

---

## 4. Verification / done-criteria

- `cargo build -p punktfunk-host` (and `--workspace`),
  `cargo clippy --workspace --all-targets -- -D warnings`, `cargo fmt --all --check` — green.
- `cargo run -p punktfunk-host -- openapi > api/openapi.json` — committed, no drift.
- `PUNKTFUNK_PERF=1` stdout behavior unchanged (no regression to the existing perf log).
- Web: orval regen clean, typecheck/build green, charts render client-side.
- CLAUDE.md status note + this plan updated.
- Adversarial review: hot-path stays sync + bounded; `id` path-traversal-safe; OpenAPI/orval no
  drift; SSR-safe charts; both paths actually emit samples.