Opt-in (Settings -> Presenter; `punktfunk.presenter`, default stage-1). Stage-1's
AVSampleBufferDisplayLayer decodes AND presents internally with no per-frame
callback, so neither decode nor present can be stamped or hand-paced. Stage-2
takes explicit control:
- VideoDecoder: VTDecompressionSession, async output callback stamps
decode-completion, session rebuilt on every IDR / format change. Unit-tested
(testVideoDecoderAsyncCallbackDeliversPixels).
- MetalVideoPresenter: CAMetalLayer + CVMetalTextureCache + a runtime-compiled
BT.709 limited-range NV12->RGB shader, present at the next vsync. The
CVMetalTextures + pixel buffer are held until the GPU completes.
- Stage2Pipeline: pump thread -> decoder -> newest-ready 1-slot ring; the hosting
view's display link drains it once per vsync and stamps capture->present
(the display-link target time projected into CLOCK_REALTIME).
- LatencyMeter gains record(ptsNs:atNs:offsetNs:); the HUD shows a capture->present
(glass-to-glass, modulo host render->capture) line, skew-corrected via
clockOffsetNs. Measured live ~11 ms p50 vs ~2.2 ms capture->client.
- StreamView / StreamViewIOS host the CAMetalLayer as a sublayer + a CADisplayLink
(NSView.displayLink on macOS) when stage-2; input capture + HUD unchanged. The
session-active gates switch from `pump != nil` to `connection != nil` so capture
engages without a StreamPump.
Validated: builds macOS/iOS/tvOS; the decode half is unit-tested; the Metal
present is live-validated on glass (correct image + the capture->present number).
Colorspace is BT.709 SDR for now; 10-bit/HDR + a pacing policy are later.
Plan: docs-site/content/docs/apple-stage2-presenter.md.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Apple client now consumes the connector's clock offset. PunktfunkConnection
reads punktfunk_connection_clock_offset_ns into clockOffsetNs at connect; a new
LatencyMeter (PunktfunkKit, NSLock + percentiles, mirrors FrameMeter) records each
AU's capture->client-receipt latency = now(CLOCK_REALTIME) + offset - pts_ns, and
SessionModel drains p50/p95 into the macOS HUD ("capture->client N/N ms p50/p95",
"(same-host)" when the host didn't answer the skew handshake). Wired at the
existing onFrame hook in ContentView — additive, no change to the decode/present
path. Unit test for the meter (percentiles, skew flag, absurd-value guard).
This is the first cross-machine latency the real Apple client reports. SCOPE:
stage-1 AVSampleBufferDisplayLayer decodes+presents compressed samples internally
with no per-frame callback, so this excludes decode+present; true decode->present
needs the stage-2 presenter (VTDecompressionSession + CAMetalLayer). Rebuild
PunktfunkCore.xcframework (for the new C getter) before swift build/test on a Mac.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>