bench(ci): report-only regression harness — Tier-1/2 in CI + Tier-3 GPU runner
ci / rust (push) Failing after 47s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / bench (push) Successful in 1m34s
apple / swift (push) Successful in 1m19s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m49s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m36s
docker / deploy-docs (push) Failing after 17s
ci / rust (push) Failing after 47s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / bench (push) Successful in 1m34s
apple / swift (push) Successful in 1m19s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 4m49s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m36s
docker / deploy-docs (push) Failing after 17s
- scripts/bench/compare.py: diff criterion medians (target/criterion/**/estimates.json) vs a committed baseline, print a markdown table to the job summary, flag >threshold regressions, always exit 0 (shared CI hardware is too noisy to gate on). --update rewrites the baseline. - ci.yml `bench` job: runs Tier-1 (criterion) + Tier-2 (loss-harness FEC recovery) GPU-free in the rust-ci container, then compare.py — report-only visibility per push/PR. - scripts/bench/gpu-stream.sh + bench-gpu.yml: Tier-3 real pipeline (virtual output → zero-copy → NVENC → punktfunk/1 → reassemble) on a self-hosted GPU runner; captures encode_us/tx_mbps/ send_dropped + client capture→reassembled latency, compares to gpu-baseline.json (20% threshold). Needs the dev box registered as a `[self-hosted, gpu]` act_runner (one-time, see the workflow header) — the dedicated hardware makes its absolute baseline meaningful, unlike shared CI. - baseline.json: dev-box Tier-1 numbers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -115,3 +115,25 @@ jobs:
|
||||
run: bun run build
|
||||
- name: Typecheck
|
||||
run: bun run lint
|
||||
|
||||
bench:
|
||||
# Tier-1 (criterion microbenchmarks) + Tier-2 (FEC loss recovery) — GPU-free, so they run here.
|
||||
# Report-only: prints the numbers + a diff vs the committed baseline to the job summary and never
|
||||
# fails the build (shared CI hardware is too noisy to gate on). The tight regression gate + the
|
||||
# real encode/stream path live on the self-hosted GPU runner (Tier 3, bench-gpu.yml).
|
||||
runs-on: ubuntu-24.04
|
||||
container:
|
||||
image: git.unom.io/unom/punktfunk-rust-ci:latest
|
||||
timeout-minutes: 30
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Prep
|
||||
run: |
|
||||
git config --global --add safe.directory "$PWD"
|
||||
command -v python3 >/dev/null || { apt-get update && apt-get install -y --no-install-recommends python3; }
|
||||
- name: Tier-1 microbenchmarks (criterion)
|
||||
run: cargo bench -p punktfunk-core --bench pipeline -- --warm-up-time 1 --measurement-time 3
|
||||
- name: Tier-2 FEC loss recovery (loss-harness)
|
||||
run: cargo run -q -p loss-harness
|
||||
- name: Compare vs baseline (report-only)
|
||||
run: python3 scripts/bench/compare.py --threshold 0.5
|
||||
|
||||
Reference in New Issue
Block a user