480dee863d0f16c9fad1f950edaf466706f6a472
3 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
3074b30988 |
ci(runner): cap the act_runner cache + 30-min prune (fix recurring disk-full)
apple / swift (push) Successful in 53s
android / android (push) Successful in 10m42s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 28s
ci / rust (push) Successful in 11m39s
ci / bench (push) Successful in 4m43s
deb / build-publish (push) Successful in 3m7s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 7m23s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 7m24s
docker / deploy-docs (push) Failing after 8s
The hourly docker-prune could never reclaim the real disk filler: the act_runner cache server's blob store (cache.dir:"" -> /root/.cache/actcache/cache) lives in the long-running runner container's WRITABLE LAYER, which docker prune can't see. It grew to ~66 GB and filled the 125 GB disk on its own. - New docker-prune.sh holds the logic (inline ExecStart= broke under systemd's own $-expansion, which emptied $SZ/$(...) before sh ran them — silently no-oping the burst guard). The unit now just calls the script. - Caps the actcache: clears the blobs once they exceed ~20 GB (act_runner repopulates; keys are content-hashed, so only stale entries drop). - Burst guard lowered 85%->80% and now also clears the actcache. - Timer hourly -> every 30 min; image/cache `until` 12h -> 6h. Live: cleared 66 GB on home-runner-1 (93% -> 20%), deployed + verified. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
8265742e74 |
ci: bust the re-poisoned cargo cache (v3) + burst-guard the runner prune
apple / swift (push) Successful in 53s
android / android (push) Has been cancelled
deb / build-publish (push) Has been cancelled
ci / rust (push) Has been cancelled
ci / web (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
decky / build-publish (push) Has been cancelled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Has been cancelled
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Has been cancelled
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Has been cancelled
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Has been cancelled
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Has been cancelled
docker / deploy-docs (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
This session's push storm refilled the runner to 100% WITHIN the prune timer's 24h window (it only trims >24h), so a build hit ENOSPC and actions/cache saved a truncated target/ -> `error[E0463]: can't find crate for shlex` in ci.yml's clippy. Two fixes: - Bump cargo-target-v2- -> v3- in ci.yml + deb.yml so the poisoned tarball is bypassed (a suffix bump can't — restore-keys falls back to the old prefix; same as the v1->v2 fix). - Harden scripts/ci/docker-prune: run HOURLY (was 6h) with a burst guard — if the disk is still >85% after the normal until=12h trim, prune ALL idle images + build cache (in-use protected). A fast push-burst can fill 99 GB inside any time window, so the disk-pressure trigger, not the age filter, is the real backstop. Applied live on home-runner-1 (reclaimed 95%->66%) and checked in. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
bf65d264fd |
ci: bound runner disk + bust the disk-full-corrupted cargo target cache
apple / swift (push) Successful in 54s
ci / bench (push) Successful in 1m35s
ci / rust (push) Successful in 6m49s
android / android (push) Failing after 4m5s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 26s
decky / build-publish (push) Successful in 29s
deb / build-publish (push) Failing after 2m33s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m40s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m32s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m17s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 21s
flatpak / build-publish (push) Failing after 4s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m27s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m28s
docker / deploy-docs (push) Successful in 20s
The self-hosted runner filled its disk (95%, builds failing on ENOSPC): every CI
push builds a sha-<commit>-tagged Docker image per pipeline, and since those tags
are never dangling a plain `docker image prune` skips them — they piled up to 589
images / ~85 GB plus 18 GB of build cache. Two parts:
- scripts/ci/docker-prune.{service,timer}: a host-level systemd timer (every 6h,
Persistent) that prunes images/build-cache/containers older than 24h — in-use
images stay protected. Checked in (the runner is hand-provisioned and shared
across orgs) and already installed live; reclaimed 89 GB -> 39 GB (95% -> 42%).
- ci.yml / deb.yml: bump the `cargo-target-<rustc>-*` cache key to `-v2-`. The
disk-full build let actions/cache save a truncated target/ (a dep's .rmeta went
missing -> "error[E0463]: can't find crate for pem_rfc7468" while compiling der).
A suffix bump is useless here — restore-keys would fall back to the poisoned
prefix — so the prefix is versioned to force one clean rebuild. cargo-home is
untouched (sources were intact; the failure was a missing build artifact).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|