feat(apple): session audio — host playback + mic uplink, device pickers in Settings
ci / rust (push) Has been cancelled

Both directions of the audio plane, on CoreAudio's built-in Opus codec
(kAudioFormatOpus — no bundled libopus; OpusCodec.swift, round trip unit-tested):

- Playback: a drain thread pulls nextAudio() packets, decodes, and writes a priming
  jitter ring feeding an AVAudioSourceNode (~20 ms prefill, adaptive to the device's
  render quantum so large-buffer devices don't oscillate prime/dropout; a high-water
  clamp sheds stall backlog so one network hiccup can't permanently lag audio behind
  video; underrun re-primes — one dip, not sustained crackle).
- Mic: a second engine taps the input device, resamples to 48 kHz stereo, Opus-encodes
  20 ms chunks and sendMic()s them into the host's virtual PipeWire source. Permission
  via AVCaptureDevice (NSMicrophoneUsageDescription added to the Xcode target).
- Settings: Speaker + Microphone pickers (CoreAudio HAL enumeration, persisted by
  device UID — "System default" leaves the engine unpinned so it follows macOS device
  changes) and a "Send microphone" toggle (default on). Applies from the next session.
- Audio starts with streaming, never during the trust prompt (no host sound — and no
  mic uplink — before the user trusted the host); teardown stops audio before close().

Adversarial-review fixes baked in: stop() and the dangling mic-permission callback
share one lock+flag protocol (no hot mic with no owner), the connect-success handler
bails when the attempt was abandoned mid-handshake (no session/mic for a dead window),
SessionAudio gets a deinit backstop (a dropped instance can't pin the connection via
its drain thread), and the render scratch buffer is block-owned (was leaked per
session).

Verified live against the box: remote test decodes 100 host Opus packets to PCM and
the host opens its virtual mic on the first uplinked frame ("punktfunk/1 virtual mic
ready"); on-glass session runs with both engines up.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This commit is contained in:
2026-06-11 09:39:08 +02:00
parent 2372b02620
commit b26f138699
9 changed files with 840 additions and 8 deletions
+12 -4
View File
@@ -117,10 +117,18 @@ signing, bundle id `io.unom.punktfunk`. Notes:
control (ProMotion/120 Hz), glass-to-glass measurement via `tools/latency-probe` (the
host stamps `pts_ns` with its capture wall clock; across machines you need a clock
offset estimate from the QUIC RTT).
5. **Audio**: `nextAudio()` yields raw Opus packets (48 kHz stereo, one 5 ms frame each,
sequence-numbered). The inverse direction exists too: `sendMic(_:seq:ptsNs:)` uplinks
the client's mic as Opus frames into a virtual PipeWire source on the host (wire it
to AVAudioEngine input + an Opus encoder alongside playback). Decode with libopus or `AVAudioConverter`/`kAudioFormatOpus` into an
5. **Audio — wired, both directions.** Playback: `SessionAudio` drains `nextAudio()`
on its own thread, decodes through CoreAudio's built-in Opus codec (`OpusCodec.swift`
— kAudioFormatOpus, no bundled libopus; round-trip unit-tested) into a priming
jitter ring feeding an `AVAudioSourceNode`. Mic: a second engine taps the input
device, resamples to 48 kHz stereo, Opus-encodes 20 ms chunks and `sendMic()`s them
(the host's virtual PipeWire source accepts any frame size ≤ 120 ms). Speaker/mic
are chosen in Settings (`AudioDevices.swift` — persisted by UID; "System default"
leaves the engines unpinned so they follow macOS device changes), mic on/off toggle
included; the app asks for mic permission on first use
(NSMicrophoneUsageDescription is in the Xcode target). A/V sync and packet-loss
concealment beyond silence-fill are still open (AudioPacket.seq/ptsNs carry what's
needed). Decode with libopus or `AVAudioConverter`/`kAudioFormatOpus` into an
`AVAudioEngine` source node; conceal gaps (drop/dup) rather than blocking — the Rust
side buffers 320 ms and drops the newest packet when the puller lags. Wall-clock
`ptsNs` shares the host clock with video AUs for A/V sync. Wiring this into