Commit Graph

2 Commits

Author SHA1 Message Date
enricobuehler bfbe5ab888 docs(host-latency): mark Tier 2A landed + validated; Tier 3A FFI validated on MSVC
apple / swift (push) Successful in 54s
android / android (push) Failing after 51s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / rust (push) Failing after 4m4s
ci / bench (push) Failing after 3m23s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 49s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 7s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 40s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Successful in 8m44s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 6m51s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 6m11s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 23:40:24 +00:00
enricobuehler 112a054c35 perf(host): latency hardening for the game-vs-encode GPU contention collapse
Verified, prioritized analysis in docs/host-latency-plan.md (multi-agent
investigation + adversarial verification). Lands the two low-risk tiers:

Tier 2B — Linux scheduling hygiene:
- boost_thread_priority now nices the capture/encode (-10) and send (-5)
  threads on Linux (setpriority, best-effort; no-op without CAP_SYS_NICE),
  and the wrong "gamescope caps the game" doc-comment is corrected.
- CUDA context created with CU_CTX_SCHED_BLOCKING_SYNC (frees a core on the
  shared box instead of busy-spinning on completion).
- Copies moved off the default stream onto a per-thread highest-priority
  CUDA stream (cuStreamCreateWithPriority, graceful NULL-stream fallback)
  with a per-stream sync that no longer blocks on the other worker thread's
  in-flight copies. Stream priority is measure-then-keep (NVIDIA Linux may
  ignore it); never regresses.

Tier 3A — Windows session tuning (new session_tuning.rs, raw C-ABI FFI,
no-op off Windows): once-per-process 1ms timer + DwmEnableMMCSS + HIGH
priority class; per-thread MMCSS "Games" + keep-display-awake. Wired into
both the native (boost_thread_priority) and GameStream (stream.rs) paths.
We had zero session tuning before (Apollo streaming_will_start parity).

Tier 2A (Linux NV12 convert) is specified but intentionally not landed:
it is colour-correctness-critical and needs A/B validation on a GPU box
with a display (green-screen risk). Builds + clippy + fmt green on Linux.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 23:05:57 +00:00