# punktfunk Apple client (SwiftUI)

The native macOS/iOS client for **`punktfunk/1`** (the post-GameStream protocol). All
networking/protocol work (QUIC control plane, UDP data plane, GF(2¹⁶) FEC, AES-GCM,
input datagrams, Opus audio, cert pinning) lives in the shared Rust core (statically
linked as `PunktfunkCore.xcframework`); this package is the Swift shell: decode
(VideoToolbox), present (SwiftUI), input capture.

## Status: first light achieved (2026-06-10)

Validated live, Mac ↔ Linux box over the LAN: gamescope virtual output → NVENC HEVC →
`punktfunk/1` (GF(2¹⁶) FEC + AES-GCM over UDP, QUIC control) → VideoToolbox →
`AVSampleBufferDisplayLayer` on glass at 1280×720@60, with mouse/keyboard flowing back as
QUIC datagrams into the host's gamescope EIS injector (thousands of events injected during
the session). Headless variant of the same proof: `RemoteFirstLightTests` decoded 60/60
received AUs spanning 983 ms of host capture clock.

The connector underneath (`punktfunk_core::client::NativeClient` over the C ABI) carries the
full session: video AUs, **Opus audio** (`nextAudio()`), **rumble** (`nextRumble()`),
input incl. gamepads, and **cert pinning + TOFU** (`pinSHA256:`/`hostFingerprint`); see
`m3.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`punktfunk-host m3-host`) is a persistent listener:
reconnect at will during development.

What's here, all compiled and tested on macOS (Xcode 26.5 / Swift 6.3):

- **`PunktfunkKit`** (library)
  - `PunktfunkConnection.swift`: wrapper over the C ABI. AUs/audio are copied into `Data`
    (the C pointer is only valid until the next call of the same kind). `close()` is safe
    from any thread: per-plane locks enforce the C contract ("never close with a
    `next_au`/`next_audio` in flight") instead of leaving it to callers. Pinning + TOFU
    via `pinSHA256:`/`hostFingerprint`.
  - `AnnexB.swift`: in-band VPS/SPS/PPS → `CMVideoFormatDescription`; Annex-B → AVCC
    `CMSampleBuffer` with `DisplayImmediately` set.
  - `StreamView.swift`: SwiftUI `NSViewRepresentable` over `AVSampleBufferDisplayLayer`
    (stage-1 presenter: the layer hardware-decodes compressed HEVC itself). One pump
    thread per view, token-cancelled so reconnects can't double-pump.
  - `InputCapture.swift`: `GCMouse` raw deltas + `GCKeyboard` HID→VK mapping (the host's
    `vk_to_evdev` consumes Windows VKs), with fractional-delta accumulation so sub-pixel
    motion isn't truncated away. Buttons use GameStream ids (1=left … 5=X2). Scroll
    arrives via the stream view's `scrollWheel` override instead of GC (trackpad/Magic
    Mouse gestures never reach GCMouse's scroll dpad), WHEEL_DELTA(120)-scaled.
- **`PunktfunkClient`** (the app): hosts grid (saved in UserDefaults), "+" toolbar
  sheet to add hosts, stream mode in Settings (⌘,), and two trust flows: the
  trust-on-first-use fingerprint prompt over the live-but-blurred stream, and SPAKE2 PIN
  pairing (`PairSheet`, from a host card's context menu or the trust prompt;
  `ClientIdentityStore` keeps the client identity in the Keychain and presents it on
  every connect). After that come pinned reconnects and an fps and Mb/s HUD. Settings
  also picks the HOST compositor (KWin/wlroots/Mutter/gamescope, default automatic; the
  host honors it only if that backend is available there). Audio playback and gamepad
  capture are not wired into the app yet; the connector surface is there (see notes 5
  and 6).
- **Tests** (`swift test`): byte-level Annex-B units; a real-codec round trip
  (VTCompressionSession-encoded HEVC rebuilt as the host's wire shape → `AnnexB` →
  VTDecompressionSession → pixels); loopback integration against real local hosts
  (`test-loopback.sh`: stream round trip, plus the PIN pairing ceremony and the
  `--require-pairing` gate against a second, armed host); the remote first-light test
  above.

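
The Annex-B → AVCC step listed for `AnnexB.swift` can be illustrated without any Apple frameworks. The following is a minimal sketch, not the package's actual code: `splitAnnexB`, `avccFromAnnexB` and `hevcNALType` are hypothetical names, and the 4-byte length prefix is an assumption (it is the common choice for AVCC-style HEVC sample data). It splits on start codes, re-frames each NAL unit with a big-endian length, and reads the HEVC NAL type so a caller could spot parameter sets (32/33/34) and IDR slices (19/20) when deciding to refresh the format description.

```swift
import Foundation

/// Splits an Annex-B byte stream into NAL unit payloads (start codes removed).
/// Accepts both 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes.
func splitAnnexB(_ bytes: [UInt8]) -> [[UInt8]] {
    var nals: [[UInt8]] = []
    var start: Int? = nil            // index of the first payload byte of the current NAL
    var i = 0
    while i + 2 < bytes.count {
        if bytes[i] == 0 && bytes[i + 1] == 0 && bytes[i + 2] == 1 {
            if let s = start {
                // Zeros directly before a start code belong to that start code, not the NAL.
                var end = i
                while end > s && bytes[end - 1] == 0 { end -= 1 }
                if end > s { nals.append(Array(bytes[s..<end])) }
            }
            start = i + 3
            i += 3
        } else {
            i += 1
        }
    }
    if let s = start, s < bytes.count { nals.append(Array(bytes[s...])) }
    return nals
}

/// Re-frames Annex-B data as length-prefixed NAL units (assumed 4-byte big-endian lengths).
func avccFromAnnexB(_ bytes: [UInt8]) -> [UInt8] {
    var out: [UInt8] = []
    for nal in splitAnnexB(bytes) {
        let n = UInt32(nal.count)
        out += [UInt8(n >> 24), UInt8((n >> 16) & 0xFF), UInt8((n >> 8) & 0xFF), UInt8(n & 0xFF)]
        out += nal
    }
    return out
}

/// HEVC NAL unit type: bits 1...6 of the first header byte.
func hevcNALType(_ nal: [UInt8]) -> UInt8? {
    guard let first = nal.first else { return nil }
    return (first >> 1) & 0x3F
}

/// True for VPS (32), SPS (33) and PPS (34).
func isParameterSet(_ nal: [UInt8]) -> Bool {
    guard let t = hevcNALType(nal) else { return false }
    return (32...34).contains(t)
}

/// True for IDR_W_RADL (19) and IDR_N_LP (20).
func isIDR(_ nal: [UInt8]) -> Bool {
    guard let t = hevcNALType(nal) else { return false }
    return t == 19 || t == 20
}
```

In the real client the parameter sets feed a `CMVideoFormatDescription` and the slice data goes into a `CMSampleBuffer`; this sketch only covers the byte-level re-framing.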
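
The HID→VK mapping mentioned for `InputCapture.swift` boils down to a table from USB HID keyboard usage IDs to Windows virtual-key codes. A partial sketch follows, assuming the key codes delivered by `GCKeyboard` are HID usage IDs; `windowsVK(forHIDUsage:)` is a hypothetical name and only letters, digits and a few control keys are covered.

```swift
import Foundation

/// Maps a USB HID keyboard usage ID (usage page 0x07) to a Windows virtual-key code.
/// Partial table for illustration; returns nil for keys it does not cover.
func windowsVK(forHIDUsage usage: Int) -> UInt16? {
    switch usage {
    case 0x04...0x1D: return UInt16(0x41 + (usage - 0x04))   // a...z → 'A'...'Z'
    case 0x1E...0x26: return UInt16(0x31 + (usage - 0x1E))   // 1...9 → '1'...'9'
    case 0x27: return 0x30                                   // 0
    case 0x28: return 0x0D                                   // Return    → VK_RETURN
    case 0x29: return 0x1B                                   // Escape    → VK_ESCAPE
    case 0x2A: return 0x08                                   // Backspace → VK_BACK
    case 0x2B: return 0x09                                   // Tab       → VK_TAB
    case 0x2C: return 0x20                                   // Space     → VK_SPACE
    default: return nil
    }
}
```

A complete table would also need modifiers, function keys, arrows and the keypad, which have no simple arithmetic relationship between the two code spaces.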
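
The WHEEL_DELTA(120) scroll scaling with fractional accumulation can be sketched as below. This is an illustration under assumptions, not the code in `InputCapture`: `ScrollAccumulator` is a hypothetical type, and the factors (pixel deltas ×12, line deltas ×120) are taken as the intended scaling for precise gesture events and classic wheel events respectively. Signs pass through unchanged, so the local scrolling direction preference is preserved.

```swift
import Foundation

/// Converts local scroll deltas into WHEEL_DELTA(120)-scaled integer units for one axis,
/// carrying the fractional remainder so slow gestures are not truncated away.
struct ScrollAccumulator {
    private var remainder: Double = 0

    /// - Parameters:
    ///   - delta: local scroll delta, sign passed through as-is
    ///   - precise: true for pixel deltas (trackpad, Magic Mouse), false for line deltas
    /// - Returns: whole units to send now (may be 0 while the remainder builds up)
    mutating func feed(delta: Double, precise: Bool) -> Int {
        let scale: Double = precise ? 12 : 120   // assumed factors, see lead-in
        remainder += delta * scale
        let whole = remainder.rounded(.towardZero)
        remainder -= whole
        return Int(whole)
    }
}

// Usage: two small trackpad deltas that only add up to a whole unit together.
var trackpad = ScrollAccumulator()
let first = trackpad.feed(delta: 0.0625, precise: true)    // 0.75 pending, nothing sent yet
let second = trackpad.feed(delta: 0.0625, precise: true)   // 1.5 pending, one unit sent
print(first, second)
```

One accumulator per axis keeps horizontal and vertical remainders independent.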
## Build / run / test (on a Mac)

```sh
rustup target add aarch64-apple-darwin x86_64-apple-darwin
bash scripts/build-xcframework.sh    # → clients/apple/PunktfunkCore.xcframework
cd clients/apple
swift build && swift test            # loopback/remote tests self-skip without a host
swift run PunktfunkClient            # the unbundled dev shell (CLI)
open Punktfunk.xcodeproj             # the real app: ⌘R builds + runs Punktfunk.app

bash test-loopback.sh                # full loopback proof: builds punktfunk-host
                                     # (synthetic source; runs on macOS), streams
                                     # byte-verified frames into the Swift client

# against the real host (Linux box, see CLAUDE.md "Running on this box"); m3-host is a
# persistent listener, reconnect at will:
#   PUNKTFUNK_COMPOSITOR=gamescope PUNKTFUNK_GAMESCOPE_APP=vkcube PUNKTFUNK_ZEROCOPY=1 \
#     cargo run -rp punktfunk-host -- m3-host --source virtual --seconds 60
PUNKTFUNK_REMOTE_HOST=<box-ip> swift test --filter RemoteFirstLightTests   # headless
#   (+ PUNKTFUNK_REMOTE_PORT / PUNKTFUNK_REMOTE_COMPOSITOR=gamescope|kwin|… /
#      PUNKTFUNK_REMOTE_PIN=<arming-pin> for the remote pairing test)
PUNKTFUNK_AUTOCONNECT=<box-ip> PUNKTFUNK_MODE=1280x720x60 swift run PunktfunkClient   # on glass
```

## Xcode project (`Punktfunk.xcodeproj`)

The app target **Punktfunk** wraps the same sources as the `swift run` shell
(`Sources/PunktfunkClient`, a synchronized folder, so no duplication) plus `App/` (asset
catalog) and links `PunktfunkKit` from the local package. Generated Info.plist, ad-hoc
signing, bundle id `io.unom.punktfunk`. Notes:

- **App icon**: `App/Assets.xcassets` ships an empty `AppIcon` slot. For an Icon Composer
  `.icon`: add the file to the project (target Punktfunk), set it as the App Icon in the
  target's General tab, and delete the placeholder `AppIcon.appiconset`. Heads-up: CLI
  `actool` (Xcode 26.5) crashed compiling `punktfunk_Logo.icon`; if Xcode does the same,
  suspect the icon bundle (it has a duplicate-named layer, "…Layer-3 2.svg"), not the
  project.
- **Tests from Xcode**: the package tests run with `swift test`; to get them on ⌘U, add
  `PunktfunkKitTests` once via Edit Scheme → Test → + (Xcode persists it into the shared
  scheme; a hand-written package-test reference doesn't resolve headlessly).
- `xcodebuild -project Punktfunk.xcodeproj -scheme Punktfunk build` works headlessly.

## Notes for whoever picks this up next

1. **cbindgen import quirk** (the predicted "small compile fixes", now fixed): the
   C17-compatible header spells `PunktfunkStatus`/`PunktfunkInputKind` as integer typedefs while
   the enum *constants* import into Swift as a distinct same-named type; bridge with
   `.rawValue` (see the top of `PunktfunkConnection.swift`). Don't fight the generated header.
2. **ABI contract**: one video pump thread per connection, plus optionally one *separate*
   audio drain thread for `nextAudio()`/`nextRumble()` (the core keeps per-plane borrow
   slots, so the planes never alias); `send()` is enqueue-only and safe alongside all of
   them. The wrapper's per-plane locks make `close()` safe from anywhere (it waits out
   in-flight polls, ≤ their timeouts).
3. **Decode flow**: the host opens every stream with an IDR carrying VPS/SPS/PPS in-band
   and recovery keyframes re-send them, so "refresh the format description on every IDR"
   (what `StreamView` does) is sufficient; there is no out-of-band extradata, ever.
4. **Stage 2 (next)**: explicit `VTDecompressionSession` + `CAMetalLayer` for frame-pacing
   control (ProMotion/120 Hz), glass-to-glass measurement via `tools/latency-probe` (the
   host stamps `pts_ns` with its capture wall clock; across machines you need a clock
   offset estimate from the QUIC RTT).
5. **Audio**: `nextAudio()` yields raw Opus packets (48 kHz stereo, one 5 ms frame each,
   sequence-numbered). Decode with libopus or `AVAudioConverter`/`kAudioFormatOpus` into an
   `AVAudioEngine` source node; conceal gaps (drop/dup) rather than blocking: the Rust
   side buffers 320 ms and drops the newest packet when the puller lags. Wall-clock
   `ptsNs` shares the host clock with video AUs for A/V sync. Wiring this into
   `PunktfunkClient` is the next app-side task.
6. **Gamepads**: `GCController` → `.gamepadButton(...)`/`.gamepadAxis(...)` events (wire
   contract documented on the constructors; the host accumulates them into a virtual
   Xbox 360 pad). Poll `nextRumble()` and feed `GCDeviceHaptics` for force feedback.
   Client-side capture isn't in `InputCapture` yet.
7. **Trust: the full ceremony exists now (SPAKE2).** Call `generateIdentity()` once (persist
   both PEMs in the Keychain), then `pair(host:identity:pin:name:)` with the 4-digit PIN
   the host prints when it ARMS pairing (`--allow-pairing`/`--require-pairing`; one PIN
   per arming window, shown at startup, so the user reads it before pairing). It returns the
   host's VERIFIED fingerprint; persist it and pass `pinSHA256:` + `identity:` to every
   connect. Pairing is a real PAKE: a wrong PIN gets ONE online guess (no offline
   dictionary attack), throwing `.wrongPIN`; a wrong-size pin throws `.invalidPin`.
   `PunktfunkClient` implements both flows: the TOFU fingerprint sheet keeps working
   against hosts not running `--require-pairing`, and the PIN ceremony is wired in, with
   `ClientIdentityStore` (Keychain) on every connect and `PairSheet` from a host card's
   context menu or the trust prompt's "Pair with PIN instead…" (the host's accept loop is
   sequential, so that path drops the live session before pairing). With
   `--require-pairing` the host now authorizes clients too (the "other direction" is no
   longer open, opt-in per host); the whole gate is regression-tested in
   `testPairingCeremonyAndRequirePairingGate`.
7b. **Resize without reconnect**: call `requestMode(width:height:refreshHz:)` mid-stream.
   The host rebuilds at the new mode in ~90 ms; the first new-mode AU is an IDR with
   fresh parameter sets (the refresh-on-IDR decode flow handles it untouched) and
   `currentMode()` reflects the switch. Wire it to window-resize events.
8. **Input capture** (stage 1): capture is a deliberate, reversible STATE owned by
   `StreamLayerView`, Moonlight-style. Engaged when the stream starts / trust is
   confirmed and when the user clicks into the video (that click is suppressed toward
   the host); released by ⌘⎋ (toggles) or focus loss; NEVER engaged by mere app
   activation, since activating clicks may be title-bar drags or resizes, which used to get
   their cursor warped away mid-drag. While captured: the local cursor is hidden +
   frozen mid-view (the host renders its own), all input is forwarded, and the view
   consumes key events as first responder so unhandled keyDowns don't beep; ⌘-combos
   still work locally (⌘D disconnect, ⌘Q) *and* reach the host via GC. While released:
   nothing is forwarded (`InputCapture.forwarding` gates the GC handlers; held
   keys/buttons are flushed host-side on release so nothing sticks down), the cursor is
   free, and the HUD shows "Click the stream to capture input". GC handlers only fire
   while the app has focus, and focus loss also auto-releases everything held. One live
   capture per process (the GC mouse/keyboard singletons have a single handler slot;
   ownership is tracked so a stale capture's stop() can't clobber a newer one).
9. **iOS**: same package (`BUILD_IOS=1` for the xcframework slice); `StreamView` needs the
   `UIViewRepresentable` twin and touch→input mapping.

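
The per-plane locking from note 2 can be sketched as follows. This is a simplified illustration, not the wrapper in `PunktfunkConnection.swift`: `GuardedConnection` is a hypothetical type, and the `poll` and `release` closures stand in for the blocking C calls and the C-side close. Each plane holds its own lock for the duration of a poll, and `close` takes both locks before releasing the handle, so it waits out anything in flight.

```swift
import Foundation

/// Sketch: close() is safe from any thread because it cannot run while a poll holds a plane lock.
final class GuardedConnection {
    private let videoLock = NSLock()   // held for the duration of each video poll
    private let audioLock = NSLock()   // held for the duration of each audio/rumble poll
    private let stateLock = NSLock()   // protects `closed`
    private var closed = false

    private var isClosed: Bool {
        stateLock.lock(); defer { stateLock.unlock() }
        return closed
    }

    /// Runs one video poll unless the connection is already closed.
    func nextVideo(_ poll: () -> Int?) -> Int? {
        videoLock.lock(); defer { videoLock.unlock() }
        guard !isClosed else { return nil }
        return poll()
    }

    /// Runs one audio poll unless the connection is already closed.
    func nextAudio(_ poll: () -> Int?) -> Int? {
        audioLock.lock(); defer { audioLock.unlock() }
        guard !isClosed else { return nil }
        return poll()
    }

    /// Marks the connection closed, waits for in-flight polls, then releases exactly once.
    func close(_ release: () -> Void) {
        stateLock.lock()
        let alreadyClosed = closed
        closed = true
        stateLock.unlock()
        guard !alreadyClosed else { return }

        // Acquiring both plane locks blocks until any in-flight poll has returned.
        videoLock.lock(); audioLock.lock()
        release()
        audioLock.unlock(); videoLock.unlock()
    }
}
```

Because polls are bounded by their timeouts, the wait inside `close` is bounded too, which matches the "≤ their timeouts" remark in note 2.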
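
For the cross-machine latency measurement in note 4, one common way to estimate the clock offset is a timestamped round trip with the assumption that the path is symmetric. The sketch below is an assumption about how that could look, not a description of `tools/latency-probe`: the function names are hypothetical, and it presumes the client can obtain one host timestamp together with the send and receive times of the exchange that carried it.

```swift
import Foundation

/// Estimates (host clock - client clock) in nanoseconds from one round trip,
/// assuming the host stamped its time halfway through the RTT.
func estimateOffsetNs(sendNs: Int64, hostNs: Int64, recvNs: Int64) -> Int64 {
    let rtt = recvNs - sendNs
    return hostNs - (sendNs + rtt / 2)
}

/// Glass-to-glass latency: display time minus capture time, both on the client clock.
func glassToGlassNs(ptsNs: Int64, displayedNs: Int64, offsetNs: Int64) -> Int64 {
    let captureOnClientClock = ptsNs - offsetNs
    return displayedNs - captureOnClientClock
}

// Usage with made-up numbers: a 2 µs round trip, host clock 499 µs ahead of the client.
let offset = estimateOffsetNs(sendNs: 1_000, hostNs: 501_000, recvNs: 3_000)
let latency = glassToGlassNs(ptsNs: 600_000, displayedNs: 110_000, offsetNs: offset)
print(offset, latency)
```

The error of such an estimate is bounded by half the RTT, so on a LAN it is small relative to a frame interval; taking the sample with the lowest RTT out of several tightens it further.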
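
Note 5 asks the audio path to conceal gaps rather than block. With 5 ms frames, the 320 ms buffer on the Rust side corresponds to 64 packets, so a lagging puller sees missing sequence numbers rather than delay. A minimal sketch of the bookkeeping follows; `GapConcealer` and `AudioAction` are hypothetical, and the 32-bit sequence width is an assumption since the wire format is not stated here.

```swift
import Foundation

/// What the playback side should do for one arriving packet.
enum AudioAction: Equatable {
    case play(UInt32)            // decode and play this packet
    case conceal(count: Int)     // synthesize this many 5 ms frames (repeat last or silence)
    case dropLate(UInt32)        // older than what was already played: discard
}

/// Tracks the next expected sequence number and turns gaps into concealment requests.
struct GapConcealer {
    private var expected: UInt32? = nil

    mutating func accept(seq: UInt32) -> [AudioAction] {
        guard let exp = expected else {
            expected = seq &+ 1
            return [.play(seq)]
        }
        // Wrap-safe signed distance between the arriving and the expected sequence number.
        let ahead = Int32(bitPattern: seq &- exp)
        if ahead < 0 { return [.dropLate(seq)] }
        expected = seq &+ 1
        if ahead == 0 { return [.play(seq)] }
        return [.conceal(count: Int(ahead)), .play(seq)]
    }
}
```

With libopus, the concealment frames could come from the decoder's packet-loss concealment; with a plain converter, repeating the previous frame or inserting silence is the simpler option.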
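
The capture rules in note 8 form a small state machine. The sketch below models only the transitions described there; `CaptureEvent` and `CaptureStateMachine` are hypothetical names, and the real `StreamLayerView` additionally handles cursor hiding, key flushing and handler ownership.

```swift
import Foundation

enum CaptureEvent {
    case streamStarted, trustConfirmed   // engage capture
    case clickInVideo                    // engages if released, otherwise an ordinary click
    case toggleShortcut                  // ⌘⎋
    case focusLost                       // always releases
    case appActivated                    // never engages on its own
}

struct CaptureStateMachine {
    private(set) var captured = false

    /// Applies an event and returns true if the event itself should be forwarded to the host.
    mutating func handle(_ event: CaptureEvent) -> Bool {
        switch event {
        case .streamStarted, .trustConfirmed:
            captured = true
            return false
        case .clickInVideo:
            if captured { return true }   // already captured: forward the click
            captured = true               // engaging click is suppressed toward the host
            return false
        case .toggleShortcut:
            captured.toggle()
            return false
        case .focusLost:
            captured = false
            return false
        case .appActivated:
            return false                  // activation may be a title-bar drag or resize
        }
    }
}
```

Keeping the decision in one pure function makes the "never capture on activation" rule easy to unit-test without a window.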
## Known limitations of the current host (relevant to client UX)

- One session **at a time** (the listener is persistent, but a second concurrent client
  waits in the accept queue until the current session ends; the virtual output and
  encoder are single-tenant).
- Mid-stream renegotiation (resolution change without reconnect) is designed-for but not
  implemented (the Welcome is one-shot today).
- Host-side gamepad injection needs `/dev/uinput` access on the box (udev rule from
  `docs/linux-setup.md`).