feat(apple): stage-2 presenter — explicit decode + Metal present + glass-to-glass
ci / web (push) Failing after 38s
ci / rust (push) Successful in 53s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 16s
ci / docs-site (push) Failing after 39s
docker / deploy-docs (push) Successful in 16s
apple / swift (push) Successful in 1m17s

Opt-in (Settings -> Presenter; `punktfunk.presenter`, default stage-1). Stage-1's
AVSampleBufferDisplayLayer decodes AND presents internally with no per-frame
callback, so neither decode nor present can be stamped or hand-paced. Stage-2
takes explicit control:

- VideoDecoder: VTDecompressionSession, async output callback stamps
  decode-completion, session rebuilt on every IDR / format change. Unit-tested
  (testVideoDecoderAsyncCallbackDeliversPixels).
- MetalVideoPresenter: CAMetalLayer + CVMetalTextureCache + a runtime-compiled
  BT.709 limited-range NV12->RGB shader, present at the next vsync. The
  CVMetalTextures + pixel buffer are held until the GPU completes.
- Stage2Pipeline: pump thread -> decoder -> newest-ready 1-slot ring; the hosting
  view's display link drains it once per vsync and stamps capture->present
  (the display-link target time projected into CLOCK_REALTIME).
- LatencyMeter gains record(ptsNs:atNs:offsetNs:); the HUD shows a capture->present
  (glass-to-glass, modulo host render->capture) line, skew-corrected via
  clockOffsetNs. Measured live ~11 ms p50 vs ~2.2 ms capture->client.
- StreamView / StreamViewIOS host the CAMetalLayer as a sublayer + a CADisplayLink
  (NSView.displayLink on macOS) when stage-2; input capture + HUD unchanged. The
  session-active gates switch from `pump != nil` to `connection != nil` so capture
  engages without a StreamPump.

Validated: builds macOS/iOS/tvOS; the decode half is unit-tested; the Metal
present is live-validated on glass (correct image + the capture->present number).
Colorspace is BT.709 SDR for now; 10-bit/HDR + a pacing policy are later.
Plan: docs-site/content/docs/apple-stage2-presenter.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-12 15:28:23 +02:00
parent 848738ed00
commit 7b10714b62
12 changed files with 737 additions and 30 deletions
@@ -9,6 +9,13 @@ import VideoToolbox
import XCTest
@testable import PunktfunkKit
/// Sendable holder for the values the (background-thread) decode callback writes.
private final class FrameBox: @unchecked Sendable {
let lock = NSLock()
var frame: ReadyFrame?
var error: OSStatus?
}
final class VideoToolboxRoundTripTests: XCTestCase {
private let width = 320
private let height = 240
@@ -59,6 +66,43 @@ final class VideoToolboxRoundTripTests: XCTestCase {
XCTAssertEqual(CVPixelBufferGetHeight(pixels), height)
}
/// Stage-2 decode half: the same known IDR through `VideoDecoder` assert its async output
/// callback fires with a CVPixelBuffer of the right dimensions, the pts round-trips, and
/// decode-completion is stamped.
func testVideoDecoderAsyncCallbackDeliversPixels() throws {
let (formatDesc, avccSample) = try encodeOneHEVCKeyframe()
let annexB = try annexBAU(formatDesc: formatDesc, avccSample: avccSample)
let format = try XCTUnwrap(AnnexB.formatDescription(fromIDR: annexB))
let au = AccessUnit(data: annexB, ptsNs: 42_000_000, frameIndex: 0, flags: 0)
let box = FrameBox()
let done = DispatchSemaphore(value: 0)
let decoder = VideoDecoder(
onDecoded: { frame in
box.lock.lock(); box.frame = frame; box.lock.unlock()
done.signal()
},
onDecodeError: { status in
box.lock.lock(); box.error = status; box.lock.unlock()
done.signal()
})
XCTAssertTrue(decoder.decode(au: au, format: format), "frame submit should succeed")
XCTAssertEqual(done.wait(timeout: .now() + 10), .success, "the decode callback must fire")
decoder.reset()
box.lock.lock()
let frame = box.frame
let error = box.error
box.lock.unlock()
XCTAssertNil(error.map { "decode error \($0)" })
let ready = try XCTUnwrap(frame, "the async output callback must deliver a ReadyFrame")
XCTAssertEqual(CVPixelBufferGetWidth(ready.pixelBuffer), width)
XCTAssertEqual(CVPixelBufferGetHeight(ready.pixelBuffer), height)
XCTAssertEqual(ready.ptsNs, 42_000_000, "pts round-trips through the decoder")
XCTAssertGreaterThan(ready.decodedNs, 0, "decode-completion is stamped")
}
// MARK: - encode helpers
/// One forced-IDR HEVC frame; returns its format description and raw AVCC sample bytes.