From 47a69a0063672e777c823224ebaf54ef55f51114 Mon Sep 17 00:00:00 2001
From: enricobuehler
Date: Fri, 12 Jun 2026 12:40:36 +0000
Subject: [PATCH] fix(ci): match real runner labels + survivable Mac runner daemon
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

runs-on: ubuntu-24.04 (the label the existing Linux runner actually
advertises — ubuntu-latest queued forever). Mac runner: strip the
docker:// default labels generate-config seeds (they override the
host-mode registration labels and make the daemon demand a Docker
engine), and ship the service as a root LaunchDaemon — macOS Local
Network privacy silently blocks LAN dials from unbundled CLI binaries
in gui/user launchd domains ("no route to host"), system daemons are
exempt. Without sudo the script leaves an interim nohup daemon. CI
surface documented in CLAUDE.md + docs-site ci.md.

Co-Authored-By: Claude Fable 5
---
 .gitea/workflows/ci.yml          |  6 ++--
 .gitea/workflows/docker.yml      |  2 +-
 CLAUDE.md                        |  7 ++++
 docs-site/content/docs/ci.md     | 23 ++++++------
 scripts/ci/setup-macos-runner.sh | 60 +++++++++++++++++++++-----------
 5 files changed, 64 insertions(+), 34 deletions(-)

diff --git a/.gitea/workflows/ci.yml b/.gitea/workflows/ci.yml
index 627abd5..e9136bd 100644
--- a/.gitea/workflows/ci.yml
+++ b/.gitea/workflows/ci.yml
@@ -11,7 +11,7 @@ on:
 
 jobs:
   rust:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-24.04
     container:
       image: git.unom.io/unom/punktfunk-rust-ci:latest
     timeout-minutes: 90
@@ -60,7 +60,7 @@ jobs:
             || (echo "include/punktfunk_core.h is stale — commit the regenerated header" && exit 1)
 
   web:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-24.04
     container:
       image: oven/bun:1
     timeout-minutes: 30
@@ -84,7 +84,7 @@ jobs:
         run: bun run lint
 
   docs-site:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-24.04
     container:
       image: oven/bun:1
     timeout-minutes: 30
diff --git a/.gitea/workflows/docker.yml b/.gitea/workflows/docker.yml
index 7bed8c4..19bb5f2 100644
--- a/.gitea/workflows/docker.yml
+++ b/.gitea/workflows/docker.yml
@@ -24,7 +24,7 @@ env:
 
 jobs:
   build-push:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-24.04
     timeout-minutes: 45
     strategy:
       matrix:
diff --git a/CLAUDE.md b/CLAUDE.md
index 0dd57c8..7ef94f0 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -124,6 +124,13 @@ Generated artifacts are **checked in** and CI fails on drift: `include/punktfunk
 (cbindgen from `punktfunk-core/src/abi.rs`) and `docs/api/openapi.json` (regenerate with
 `cargo run -p punktfunk-host -- openapi > docs/api/openapi.json`; spec lives in `mgmt.rs`).
 
+CI is Gitea Actions (`.gitea/workflows/`, guide: docs-site `ci.md`): `ci.yml` runs the
+workspace checks inside the `git.unom.io/unom/punktfunk-rust-ci` image plus web/docs-site
+build+typecheck; `docker.yml` builds+pushes the web/docs/rust-ci images (host and native
+clients are deliberately NOT containerized); `apple.yml` builds the xcframework and runs
+`swift build`/`swift test` on the `macos-arm64` host-mode runner (home-mac-mini-1,
+provisioned by `scripts/ci/setup-macos-runner.sh`).
+
 ## Layout
 
 ```
diff --git a/docs-site/content/docs/ci.md b/docs-site/content/docs/ci.md
index 4a2dce0..0eb744e 100644
--- a/docs-site/content/docs/ci.md
+++ b/docs-site/content/docs/ci.md
@@ -10,8 +10,8 @@ CI runs on **Gitea Actions** (`git.unom.io`, org `unom`). Three workflows in
 
 | Workflow | Trigger | Runner | What it does |
 |---|---|---|---|
-| `ci.yml` | push to `main`, PRs | `ubuntu-latest` | Rust workspace (fmt · clippy `-D warnings` · build · test · C-ABI harness · generated-header drift) inside the `punktfunk-rust-ci` image; `web/` and `docs-site/` build + typecheck in `oven/bun:1` |
-| `docker.yml` | push to `main`, `v*` tags, manual | `ubuntu-latest` | Builds + pushes the three images below (`latest` + `sha-` tags) |
+| `ci.yml` | push to `main`, PRs | `ubuntu-24.04` | Rust workspace (fmt · clippy `-D warnings` · build · test · C-ABI harness · generated-header drift) inside the `punktfunk-rust-ci` image; `web/` and `docs-site/` build + typecheck in `oven/bun:1` |
+| `docker.yml` | push to `main`, `v*` tags, manual | `ubuntu-24.04` | Builds + pushes the three images below (`latest` + `sha-` tags) |
 | `apple.yml` | push to `main`, PRs, manual | `macos-arm64` | Rust core → `PunktfunkCore.xcframework` → `swift build` + `swift test` in `clients/apple` |
 
 ## Dockerized pieces
@@ -31,7 +31,7 @@ push actor).
 
 ## Runners
 
-- **`ubuntu-latest`** — the pre-existing Linux runner; runs the Rust/web/docs jobs (as
+- **`ubuntu-24.04`** — the pre-existing Linux runner; runs the Rust/web/docs jobs (as
   docker containers) and the image build+push jobs.
 - **`macos-arm64`** — `home-mac-mini-1` (M-series, macOS 26), a **host-mode** `act_runner`
   (upstream now ships it as `gitea-runner`) provisioned by
@@ -39,10 +39,13 @@ push actor).
   rustup (+ both darwin targets for the universal xcframework), Node.js (host-mode
   runners execute JS actions via `node` from PATH — nothing auto-provisions it), the runner
   binary in `~/.local/bin`, state in `~/ci/act-runner/` (config, `.runner` registration,
-  `runner.log`), kept alive by the `io.gitea.act_runner` LaunchAgent. Needs full **Xcode**
-  for `xcodebuild -create-xcframework` (CLT alone only covers `swift build/test`); if
-  `xcode-select` still points at CLT, the script auto-detects `/Applications/Xcode*.app`
-  and bakes a `DEVELOPER_DIR` override into the LaunchAgent — no sudo required.
+  `runner.log`), kept alive by the `io.gitea.act_runner` **root LaunchDaemon** — it cannot
+  be a user LaunchAgent: macOS Local Network privacy silently blocks LAN dials
+  ("no route to host") from unbundled CLI binaries in gui/user launchd domains, while
+  system daemons are exempt. Needs full **Xcode** for `xcodebuild -create-xcframework`
+  (CLT alone only covers `swift build/test`); if `xcode-select` still points at CLT, the
+  script auto-detects `/Applications/Xcode*.app` and bakes a `DEVELOPER_DIR` override into
+  the daemon environment — no `xcode-select -s` required.
 
 Re-provisioning (idempotent) or first-time registration from a dev box:
 
@@ -55,9 +58,9 @@ ssh enricobuehler@192.168.1.135 GITEA_RUNNER_TOKEN= bash -s \
 ## Troubleshooting
 
 - **Mac runner offline** — `ssh tail -50 '~/ci/act-runner/runner.log'`; restart with
-  `launchctl kickstart -k gui/$(id -u)/io.gitea.act_runner`. After a reboot with nobody
-  logged in, the LaunchAgent only starts once auto-login is enabled (or promote the plist
-  to a LaunchDaemon).
+  `sudo launchctl kickstart -k system/io.gitea.act_runner`. "no route to host" in the log
+  means the daemon is running in a gui/user domain again — see the Local Network note
+  above.
 - **`apple.yml` fails at the xcframework step** — Xcode missing or unselected:
   `sudo xcode-select -s /Applications/Xcode.app/Contents/Developer` and accept the license
   (`sudo xcodebuild -license accept`), then re-run.
diff --git a/scripts/ci/setup-macos-runner.sh b/scripts/ci/setup-macos-runner.sh
index e6f2a1c..056bf0d 100644
--- a/scripts/ci/setup-macos-runner.sh
+++ b/scripts/ci/setup-macos-runner.sh
@@ -8,8 +8,9 @@
 # Installs: rustup (+ both darwin targets for the universal xcframework), Node.js (the
 # runner executes JS actions like actions/checkout via `node` from PATH — host mode does
 # not auto-provision it), the act_runner binary (host mode — jobs run directly on macOS,
-# no containers), and a LaunchAgent that keeps the runner daemon alive. Registration only
-# happens once (.runner file); the token is NOT persisted by this script.
+# no containers), and a root LaunchDaemon that keeps the runner daemon alive (see the
+# launchd section for why it can't be a user LaunchAgent). Registration only happens once
+# (.runner file); the token is NOT persisted by this script.
 #
 # Env knobs: GITEA_INSTANCE (default https://git.unom.io), GITEA_RUNNER_TOKEN (required
 # for first-time registration only), RUNNER_NAME (default: LocalHostName), RUNNER_LABELS
@@ -26,7 +27,6 @@ RUNNER_NAME="${RUNNER_NAME:-$(scutil --get LocalHostName)}"
 LABELS="${RUNNER_LABELS:-macos-arm64:host}"
 RUNNER_HOME="$HOME/ci/act-runner"
 BIN_DIR="$HOME/.local/bin"
-PLIST="$HOME/Library/LaunchAgents/io.gitea.act_runner.plist"
 
 # --- Rust toolchain (the xcframework is built from the Rust core) -----------------------
 if [ ! -x "$HOME/.cargo/bin/rustup" ]; then
@@ -65,6 +65,11 @@ fi
 # --- config + one-time registration ------------------------------------------------------
 cd "$RUNNER_HOME"
 [ -f config.yaml ] || "$BIN_DIR/act_runner" generate-config > config.yaml
+# generate-config seeds runner.labels with docker:// defaults, which (a) override the
+# host-mode labels registered in .runner and (b) make the daemon demand a Docker engine
+# ("Docker Engine socket not found"). Empty them so .runner's labels rule.
+sed -i '' -e '/docker.gitea.com\/runner-images/d' \
+    -e 's|^\([[:space:]]*\)labels:$|\1labels: []|' config.yaml
 if [ ! -f .runner ]; then
   if [ -z "${GITEA_RUNNER_TOKEN:-}" ]; then
     echo "ERROR: not registered yet — re-run with GITEA_RUNNER_TOKEN=" >&2
@@ -78,28 +83,38 @@ if [ ! -f .runner ]; then
     --labels "$LABELS"
 fi
 
-# --- LaunchAgent: keep the daemon alive across crashes and (GUI) logins ------------------
-# PATH must carry the CLT tools, cargo and act_runner itself; jobs inherit it.
+# --- launchd service ---------------------------------------------------------------------
+# macOS Local Network privacy (15+) silently denies LAN connections ("no route to host")
+# to unbundled CLI binaries in gui/user launchd domains — a user LaunchAgent can NOT reach
+# a Gitea instance on the LAN (curl over ssh works, the same dial from the agent fails).
+# System-domain daemons are exempt and survive reboots with nobody logged in, so the
+# runner ships as a root LaunchDaemon; installing it needs sudo once. Without sudo this
+# script still leaves a working (but reboot-volatile) nohup daemon behind.
+# PATH must carry the CLT tools, cargo, node and act_runner itself; jobs inherit it.
 # If the system developer dir is CLT-only but a full Xcode is installed, hand jobs a
 # DEVELOPER_DIR override — the per-process equivalent of `xcode-select -s`, no sudo needed.
 DEVELOPER_DIR_XML=""
+DEV_DIR=""
 if ! /usr/bin/xcodebuild -version >/dev/null 2>&1; then
   for app in /Applications/Xcode.app /Applications/Xcode*.app; do
     if DEVELOPER_DIR="$app/Contents/Developer" /usr/bin/xcodebuild -version >/dev/null 2>&1; then
-      DEVELOPER_DIR_XML="DEVELOPER_DIR$app/Contents/Developer"
+      DEV_DIR="$app/Contents/Developer"
+      DEVELOPER_DIR_XML="DEVELOPER_DIR$DEV_DIR"
       echo "==> using full Xcode at $app via DEVELOPER_DIR"
       break
     fi
   done
 fi
-mkdir -p "$(dirname "$PLIST")"
-cat > "$PLIST" < "$PLIST_STAGE" < Labelio.gitea.act_runner
+ UserName$USER
 ProgramArguments $BIN_DIR/act_runner
@@ -123,19 +138,24 @@ cat > "$PLIST" <
 EOF
-UID_NUM="$(id -u)"
-launchctl bootout "gui/$UID_NUM/io.gitea.act_runner" 2>/dev/null || true
-if launchctl bootstrap "gui/$UID_NUM" "$PLIST" 2>/dev/null; then
-  echo "==> runner LaunchAgent bootstrapped (gui/$UID_NUM)"
+launchctl bootout "gui/$(id -u)/io.gitea.act_runner" 2>/dev/null || true
+if sudo -n true 2>/dev/null; then
+  sudo install -m 644 -o root -g wheel "$PLIST_STAGE" "$PLIST_SYSTEM"
+  pkill -x act_runner 2>/dev/null || true
+  sudo launchctl bootout system/io.gitea.act_runner 2>/dev/null || true
+  sudo launchctl bootstrap system "$PLIST_SYSTEM"
+  echo "==> runner LaunchDaemon bootstrapped (system domain)"
 else
-  # No GUI session (pure-SSH box, nobody logged in): land it in the user domain for now.
-  # For boot persistence without a GUI login, either enable auto-login for this user or
-  # promote the plist to a root-owned LaunchDaemon in /Library/LaunchDaemons (sudo).
-  launchctl bootout "user/$UID_NUM/io.gitea.act_runner" 2>/dev/null || true
-  launchctl bootstrap "user/$UID_NUM" "$PLIST"
-  echo "==> runner LaunchAgent bootstrapped (user/$UID_NUM — no GUI session)"
-  echo " NOTE: won't auto-start after reboot until auto-login is enabled or the"
-  echo " plist is promoted to a LaunchDaemon."
+  if ! pgrep -x act_runner >/dev/null; then
+    echo "==> no sudo: starting an interim daemon (dies on reboot)"
+    (cd "$RUNNER_HOME" && \
+      PATH="$HOME/.cargo/bin:$BIN_DIR:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin" \
+      ${DEV_DIR:+DEVELOPER_DIR="$DEV_DIR"} \
+      nohup "$BIN_DIR/act_runner" daemon --config config.yaml >> runner.log 2>&1 &)
+  fi
+  echo "==> for the permanent (reboot-safe) runner, run once on the Mac:"
+  echo " sudo install -m 644 -o root -g wheel $PLIST_STAGE $PLIST_SYSTEM"
+  echo " sudo launchctl bootstrap system $PLIST_SYSTEM"
 fi
 
 sleep 2