Files
punktfunk/clients/apple/README.md
T
enricobuehler bf8a974e8b
ci / rust (push) Has been cancelled
feat: M4 stage 1 — the SwiftUI client is real: compiles, tested, first light on glass
The clients/apple scaffold is now a working macOS client, validated live against this
repo's host across the LAN: gamescope virtual output → NVENC HEVC → lumen/1 (GF(2¹⁶) FEC +
AES-GCM over UDP, QUIC control) → VideoToolbox → AVSampleBufferDisplayLayer at 720p60,
mouse/keyboard flowing back as QUIC datagrams into the host's gamescope EIS injector
(~3.7k events injected in one session).

LumenKit:
- LumenConnection: the predicted cbindgen compile fixes (C17 header spells the typedefs as
  integers while the enum constants import as a distinct Swift type — bridge by rawValue);
  close() is now safe from any thread (a close flag + pumpLock held across the blocking
  poll enforce the C contract "never close with a next_au in flight"; flag prevents
  lock-starvation by back-to-back polls).
- StreamView: per-pump cancellation token (reconnects can't double-pump), flush + re-gate
  on the next in-band parameter sets when the layer fails, no stale enqueue after restart.
- InputCapture: fractional-delta accumulation (sub-pixel motion isn't truncated away),
  pressed-state tracking with release-all on focus loss and stop() (nothing sticks down
  host-side), global-singleton ownership guard (GC has one handler slot per process),
  X1/X2 buttons, horizontal scroll, full keypad/CapsLock/ISO-102nd/PrintScreen/Menu VKs.
- LumenClient app shell (swift run LumenClient): connect form, fps/Mb-s HUD,
  LUMEN_AUTOCONNECT/LUMEN_MODE for scripted first-light runs.
- Tests: Annex-B byte-level units; real-codec round trip (VTCompressionSession-encoded
  HEVC rebuilt as the host's wire shape → AnnexB → VTDecompressionSession → pixels);
  test-loopback.sh (Swift client vs a real local m3-host over loopback — the Swift twin of
  c_abi_connection_roundtrip); RemoteFirstLightTests (full pipeline over the LAN).

Host/build fixes that fell out:
- The workspace builds on non-Linux again: gamestream audio (opus) and sendmmsg batching
  are now platform-gated with stubs/fallback, per the crate's "compiles everywhere" rule.
- Horizontal scroll was inverted end-to-end: the injectors negated BOTH axes onto the
  ei/wl axes, but GameStream's horizontal convention is positive = right
  (moonlight-qt/Sunshine pass it through unnegated) — only vertical flips now. This also
  un-inverts real Moonlight clients.
- AnnexB drops all zeros preceding a start code (trailing_zero_8bits padding), ffmpeg's
  policy, instead of leaking them into the preceding NAL.
- build-xcframework.sh: deployment targets pinned to the package floor + an otool guard —
  cargo does not fingerprint MACOSX_DEPLOYMENT_TARGET, so warm caches can silently ship
  too-new minos objects.

Adversarially reviewed (5-dimension multi-agent pass, every finding refutation-verified):
14 confirmed findings, all fixed above; the send-while-polling core-contract gap flagged
here is closed by the lumen/1 session-planes work (&self pulls + per-plane borrow slots).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 14:46:45 +02:00

122 lines
7.8 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# lumen Apple client (SwiftUI)
The native macOS/iOS client for **`lumen/1`** (the post-GameStream protocol). All
networking/protocol work — QUIC control plane, UDP data plane, GF(2¹⁶) FEC, AES-GCM,
input datagrams, Opus audio, cert pinning — lives in the shared Rust core (statically
linked as `LumenCore.xcframework`); this package is the Swift shell: decode
(VideoToolbox), present (SwiftUI), input capture.
## Status — first light achieved (2026-06-10)
Validated live, Mac ↔ Linux box over the LAN: gamescope virtual output → NVENC HEVC →
`lumen/1` (GF(2¹⁶) FEC + AES-GCM over UDP, QUIC control) → VideoToolbox →
`AVSampleBufferDisplayLayer` on glass at 1280×720@60, with mouse/keyboard flowing back as
QUIC datagrams into the host's gamescope EIS injector (thousands of events injected during
the session). Headless variant of the same proof: `RemoteFirstLightTests` decoded 60/60
received AUs spanning 983 ms of host capture clock.
The connector underneath (`lumen_core::client::NativeClient` over the C ABI) carries the
full session: video AUs, **Opus audio** (`nextAudio()`), **rumble** (`nextRumble()`),
input incl. gamepads, and **cert pinning + TOFU** (`pinSHA256:`/`hostFingerprint`) — see
`m3.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`lumen-host m3-host`) is a persistent listener:
reconnect at will during development.
What's here, all compiled and tested on macOS (Xcode 26.5 / Swift 6.3):
- **`LumenKit`** (library)
- `LumenConnection.swift` — wrapper over the C ABI. AUs/audio are copied into `Data`
(the C pointer is only valid until the next call of the same kind). `close()` is safe
from any thread: per-plane locks enforce the C contract ("never close with a
`next_au`/`next_audio` in flight") instead of leaving it to callers. Pinning + TOFU
via `pinSHA256:`/`hostFingerprint`.
- `AnnexB.swift` — in-band VPS/SPS/PPS → `CMVideoFormatDescription`; Annex-B → AVCC
`CMSampleBuffer` with `DisplayImmediately` set.
- `StreamView.swift` — SwiftUI `NSViewRepresentable` over `AVSampleBufferDisplayLayer`
(stage-1 presenter: the layer hardware-decodes compressed HEVC itself). One pump
thread per view, token-cancelled so reconnects can't double-pump.
- `InputCapture.swift``GCMouse` raw deltas + `GCKeyboard` HID→VK mapping (the host's
`vk_to_evdev` consumes Windows VKs), with fractional-delta accumulation so sub-pixel
motion isn't truncated away. Buttons use GameStream ids (1=left … 5=X2); scroll is
WHEEL_DELTA(120)-scaled.
- **`LumenClient`** (development app shell): connect form → stream + input, fps/Mb-s HUD.
(Audio playback and gamepad capture are not wired into the app yet — the connector
surface is there; see notes 56.)
- **Tests** (`swift test`): byte-level Annex-B units; a real-codec round trip
(VTCompressionSession-encoded HEVC rebuilt as the host's wire shape → `AnnexB`
VTDecompressionSession → pixels); loopback integration against a real local host
(`test-loopback.sh`); the remote first-light test above.
## Build / run / test (on a Mac)
```sh
rustup target add aarch64-apple-darwin x86_64-apple-darwin
bash scripts/build-xcframework.sh # → clients/apple/LumenCore.xcframework
cd clients/apple
swift build && swift test # loopback/remote tests self-skip without a host
swift run LumenClient # the app; or open Package.swift in Xcode
bash test-loopback.sh # full loopback proof: builds lumen-host
# (synthetic source — runs on macOS), streams
# byte-verified frames into the Swift client
# against the real host (Linux box, see CLAUDE.md "Running on this box") — m3-host is a
# persistent listener, reconnect at will:
# LUMEN_COMPOSITOR=gamescope LUMEN_GAMESCOPE_APP=vkcube LUMEN_ZEROCOPY=1 \
# cargo run -rp lumen-host -- m3-host --source virtual --seconds 60
LUMEN_REMOTE_HOST=<box-ip> swift test --filter RemoteFirstLightTests # headless
LUMEN_AUTOCONNECT=<box-ip> LUMEN_MODE=1280x720x60 swift run LumenClient # on glass
```
## Notes for whoever picks this up next
1. **cbindgen import quirk** (the predicted "small compile fixes", now fixed): the
C17-compatible header spells `LumenStatus`/`LumenInputKind` as integer typedefs while
the enum *constants* import into Swift as a distinct same-named type — bridge with
`.rawValue` (see the top of `LumenConnection.swift`). Don't fight the generated header.
2. **ABI contract**: one video pump thread per connection, plus optionally one *separate*
audio drain thread for `nextAudio()`/`nextRumble()` (the core keeps per-plane borrow
slots, so the planes never alias); `send()` is enqueue-only and safe alongside all of
them. The wrapper's per-plane locks make `close()` safe from anywhere (it waits out
in-flight polls, ≤ their timeouts).
3. **Decode flow**: the host opens every stream with an IDR carrying VPS/SPS/PPS in-band
and recovery keyframes re-send them — "refresh the format description on every IDR"
(what `StreamView` does) is sufficient; there is no out-of-band extradata, ever.
4. **Stage 2 (next)**: explicit `VTDecompressionSession` + `CAMetalLayer` for frame-pacing
control (ProMotion/120 Hz), glass-to-glass measurement via `tools/latency-probe` (the
host stamps `pts_ns` with its capture wall clock; across machines you need a clock
offset estimate from the QUIC RTT).
5. **Audio**: `nextAudio()` yields raw Opus packets (48 kHz stereo, one 5 ms frame each,
sequence-numbered). Decode with libopus or `AVAudioConverter`/`kAudioFormatOpus` into an
`AVAudioEngine` source node; conceal gaps (drop/dup) rather than blocking — the Rust
side buffers 320 ms and drops the newest packet when the puller lags. Wall-clock
`ptsNs` shares the host clock with video AUs for A/V sync. Wiring this into
`LumenClient` is the next app-side task.
6. **Gamepads**: `GCController``.gamepadButton(...)`/`.gamepadAxis(...)` events (wire
contract documented on the constructors; the host accumulates them into a virtual
Xbox 360 pad). Poll `nextRumble()` and feed `GCDeviceHaptics` for force feedback.
Client-side capture isn't in `InputCapture` yet.
7. **Trust**: connect once with `pinSHA256: nil` (TOFU), persist `hostFingerprint` keyed
by host, pass it on every later connect — a mismatch throws `.connectFailed`. The host
logs its fingerprint at startup ("clients pin this fingerprint") for out-of-band
verification UX; a PIN-style pairing ceremony is a later lumen-core task. `LumenClient`
doesn't persist fingerprints yet — add it alongside the "add host" UX.
8. **Input capture caveats** (stage 1): GC handlers only fire while the app has focus —
on focus loss `InputCapture` auto-releases everything still held (keys + buttons) so
nothing sticks down host-side. Local shortcuts (⌘-anything) still also reach the host;
a capture toggle is a small follow-up. One live capture per process (the GC mouse/
keyboard singletons have a single handler slot — ownership is tracked so a stale
capture's stop() can't clobber a newer one).
9. **iOS**: same package (`BUILD_IOS=1` for the xcframework slice); `StreamView` needs the
`UIViewRepresentable` twin and touch→input mapping.
## Known limitations of the current host (relevant to client UX)
- One session **at a time** (the listener is persistent, but a second concurrent client
waits in the accept queue until the current session ends — the virtual output and
encoder are single-tenant).
- Mid-stream renegotiation (resolution change without reconnect) is designed-for but not
implemented (the Welcome is one-shot today).
- Host-side gamepad injection needs `/dev/uinput` access on the box (udev rule from
`docs/linux-setup.md`).