feat(apple): Improve presenter
apple / screenshots (push) Has been cancelled
apple / swift (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
ci / web (push) Has been cancelled
ci / rust (push) Has been cancelled
android-screenshots / screenshots (push) Successful in 2m16s
deb / build-publish (push) Successful in 3m26s
decky / build-publish (push) Successful in 13s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
windows-host / package (push) Successful in 6m48s
release / apple (push) Successful in 7m45s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m22s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
android / android (push) Successful in 9m35s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m32s
linux-client-screenshots / screenshots (push) Successful in 2m31s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m53s
web-screenshots / screenshots (push) Successful in 2m32s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m37s
flatpak / build-publish (push) Failing after 3m47s
docker / deploy-docs (push) Failing after 1m9s
apple / screenshots (push) Has been cancelled
apple / swift (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
ci / web (push) Has been cancelled
ci / rust (push) Has been cancelled
android-screenshots / screenshots (push) Successful in 2m16s
deb / build-publish (push) Successful in 3m26s
decky / build-publish (push) Successful in 13s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
windows-host / package (push) Successful in 6m48s
release / apple (push) Successful in 7m45s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m22s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
android / android (push) Successful in 9m35s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m32s
linux-client-screenshots / screenshots (push) Successful in 2m31s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m53s
web-screenshots / screenshots (push) Successful in 2m32s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m37s
flatpak / build-publish (push) Failing after 3m47s
docker / deploy-docs (push) Failing after 1m9s
feat(apple): add cursor capture on iPad
This commit is contained in:
@@ -3,12 +3,61 @@ title: "Apple Stage-2 Presenter (handoff)"
|
||||
description: "Design rationale + open items for the explicit VTDecompressionSession → CAMetalLayer presenter. Implementation shipped; this page is trimmed to the why + what's left."
|
||||
---
|
||||
|
||||
> **Status:** SHIPPED behind the opt-in `punktfunk.presenter` flag (`AVSampleBufferDisplayLayer`
|
||||
> stage-1 remains the default known-good path). Live-validated ~11 ms p50 capture→present (commit
|
||||
> `7b10714`). Code: `clients/apple/Sources/PunktfunkKit/{Stage2Pipeline,MetalVideoPresenter,VideoDecoder,LatencyMeter}.swift`;
|
||||
> Settings has a presenter picker (`DefaultsKey.presenter`, `SettingsView.swift`). This doc is trimmed
|
||||
> to design rationale + open items — the shipped `.swift` code is the source of truth for the
|
||||
> decode/present/measurement walkthrough.
|
||||
> **Status:** SHIPPED as the **default** presenter (stage-1 `AVSampleBufferDisplayLayer` is the
|
||||
> Metal-unavailable / DEBUG fallback). HDR corrected and **4:4:4** added on top of the proven
|
||||
> main-thread present path (the hosting view's `CADisplayLink` drives `render` per vsync). Code:
|
||||
> `clients/apple/Sources/PunktfunkKit/{Stage2Pipeline,MetalVideoPresenter,VideoDecoder,Stage444Probe,LatencyMeter}.swift`.
|
||||
> This doc is trimmed to design rationale + open items — the shipped `.swift` code is the source of
|
||||
> truth for the decode/present/measurement walkthrough.
|
||||
>
|
||||
> **HDR (the "too bright" fix).** The presenter renders to a *separate* CAMetalLayer drawable, so the
|
||||
> mastering metadata that was attached to the source `CVPixelBuffer` was never composited — and with no
|
||||
> reference-white anchor the system rendered the PQ signal far too bright. The fix is to keep the
|
||||
> PQ-passthrough shader (BT.2020 limited→full → PQ R′G′B′ as-is) and put the anchor **on the layer**:
|
||||
> `colorspace = itur_2100_PQ`, `wantsExtendedDynamicRangeContent = true` (on **all** platforms — the old
|
||||
> `#if os(macOS)` guard left iOS/tvOS EDR half-engaged), and
|
||||
> `edrMetadata = CAEDRMetadata.hdr10(displayInfo:contentInfo:opticalOutputScale: 203)`. 203 nits =
|
||||
> BT.2408 HDR reference white anchors diffuse white at EDR 1.0; a larger value renders dimmer. The
|
||||
> mastering/CLL blobs (host `0xCE` datagram) now refine `edrMetadata` (drained by the pump,
|
||||
> `setHdrMeta` hops the layer write to main) rather than being attached to a never-composited source
|
||||
> buffer. **Needs on-glass validation on a real EDR panel.**
|
||||
>
|
||||
> **Mid-session SDR↔HDR.** The control-plane colour (`connection.isHDR`, from the Welcome) is fixed per
|
||||
> session, but the host can re-init its encoder mid-session (a game entering HDR), so the HEVC VUI — and
|
||||
> the decoder's `frame.isHDR` — flips. The presenter follows the **decoded frame**, not the latched
|
||||
> session flag: `render` calls the idempotent `configure(hdr:)` every frame, so on a flip it
|
||||
> reconfigures the layer (per-mode pixel format `bgra8Unorm` SDR / `rgba16Float` HDR, colorspace, EDR)
|
||||
> and selects the matching shader — all synchronously on the main thread (the present path is
|
||||
> main-thread, so no cross-thread hop is needed). The last `0xCE` grade is cached so an SDR→HDR
|
||||
> reconfigure re-applies the real mastering metadata instead of the bare anchor. The pump drains `0xCE`
|
||||
> **unconditionally** (not gated on the Welcome flag) so a session that starts SDR still gets mastering
|
||||
> metadata when it goes HDR. A ≤2-frame transition flash on the rare flip is accepted.
|
||||
>
|
||||
> **Pacing.** The hosting view owns a **main-runloop `CADisplayLink`** (a weak `DisplayLinkProxy`
|
||||
> breaks the retain cycle) that calls `renderTick` once per vsync. `renderTick` pops the **newest**
|
||||
> ready frame from the 1-slot ring (older undisplayed frames dropped — lowest latency, no smoothing
|
||||
> buffer) and, if there is one, draws it via **manual `layer.nextDrawable()`** and presents at the next
|
||||
> vsync; on an idle vsync (no new frame) it does nothing and the compositor holds the last presented
|
||||
> drawable (no idle re-render — matters at 5K). `drawableSize` is set **before** `nextDrawable` (it
|
||||
> doesn't track bounds, defaults to 0), so allocation always uses the decoded size. `maximumDrawableCount
|
||||
> = 3`. macOS `displaySyncEnabled = **false**`: the display link is the single pacing source, so leaving
|
||||
> the layer's own vsync wait on would *also* block `present`/`nextDrawable` on the main thread and
|
||||
> serialize it to the display — the cause of the fullscreen judder; disabling it lets present return
|
||||
> promptly. Present is stamped at the display link's `targetTimestamp` projected to `CLOCK_REALTIME`
|
||||
> (the actual on-glass instant, <1 vsync after the draw — accurate for the HUD).
|
||||
>
|
||||
> *(History: an off-main `CAMetalDisplayLink` variant and an off-main blocking-render present thread
|
||||
> were both tried and **reverted** — both measured slower on macOS *and* iPad than this main-thread
|
||||
> display-link path, whose real judder fix was simply `displaySyncEnabled = false`, not moving present
|
||||
> off-thread. Measured ~11 ms p50 on the main-thread path.)*
|
||||
>
|
||||
> **4:4:4.** Chroma, bit-depth, and colorimetry are orthogonal: the decode pixel format is a 2×2 of
|
||||
> `(chroma, HDR)` → `420v/x420/444v/x444` (all biplanar, so the existing shaders sample a full-size
|
||||
> chroma plane unchanged); the shader is keyed only on HDR. The client advertises `VIDEO_CAP_444` only
|
||||
> when `Stage444Probe` confirms **hardware** 4:4:4 decode (a hardware-required `VTDecompressionSession`
|
||||
> over an embedded 256×256 4:4:4 keyframe — software 4:4:4 is too slow for real-time; validated on M3:
|
||||
> `444v`/`x444` produced). A bounded pump backstop ends a 4:4:4 session that persistently fails to
|
||||
> decode (gated to 4:4:4 sessions, so 4:2:0 loss-recovery is untouched).
|
||||
|
||||
## Why stage 2 (design rationale)
|
||||
|
||||
@@ -47,10 +96,28 @@ Async `VTDecompressionSession` callback → **1-slot newest-ready ring** → dis
|
||||
|
||||
## Open items
|
||||
|
||||
- **Make stage 2 the default** — after resolution / HDR edge-case checks (HDR = BT.2020/PQ, 10-bit
|
||||
`…10BiPlanar` + EDR `CAMetalLayer.wantsExtendedDynamicRangeContent`; ties in with the HDR roadmap).
|
||||
- **On-glass HDR validation** — eyeball `edrMetadata` + `opticalOutputScale: 203` on a real EDR panel
|
||||
(XDR display) against stage-1 side-by-side: diffuse white should sit at SDR-white level with only
|
||||
highlights climbing. The reference white is a single named constant (`hdrReferenceWhiteNits`) for easy
|
||||
tuning. (Needs a Windows HDR host; the Linux host is 8-bit SDR only.)
|
||||
- **On-glass 4:4:4 validation** — confirm a `PUNKTFUNK_444` host (RTX box) streams a 4:4:4 session the
|
||||
client decodes in hardware (HUD shows the resolved chroma); verify the resolution-ceiling backstop by
|
||||
forcing a too-large 4:4:4 mode.
|
||||
- **Glass-to-glass numbers via `tools/latency-probe`** — close the still-unmeasured host render→capture
|
||||
term.
|
||||
- **Smoothing / pacing policy** — present newest-ready for lowest latency today; a pacing policy can come
|
||||
later if frames look uneven.
|
||||
- **iOS / iPadOS / tvOS stage-2 variants.**
|
||||
term and confirm the main-thread display-link present p50 holds at ~11 ms (and isn't regressed by the
|
||||
per-frame `configure` / HDR-anchor work).
|
||||
- **Smoothing / pacing policy** — present newest-ready for lowest latency today; an optional even-pacing
|
||||
policy (`present(_:afterMinimumDuration:)`) can come later if frames look uneven.
|
||||
- **4:4:4 runtime downgrade-reconnect** — today a persistently-undecodable 4:4:4 session ends cleanly
|
||||
(the live 4:4:4 decode requires hardware, so a resolution-ceiling miss fails the session create
|
||||
*synchronously* and the pump backstop ends it — no black-screen loop); auto-reconnecting at 4:2:0
|
||||
(dropping `VIDEO_CAP_444`) is a future refinement.
|
||||
- **HLG** — `isHDR`/`isHDRFormat` fold HLG (transfer 18) in with PQ, but the presenter is PQ-only
|
||||
(`itur_2100_PQ` + `hdr10` EDR), so an HLG stream would be mis-toned. Latent — no host emits HLG
|
||||
(the stack is BT.2020 **PQ** only). A real HLG path (`itur_2100_HLG`, no PQ reference-white anchor)
|
||||
is future work; until then HLG should be treated as out of scope.
|
||||
- **Full-range** — the shaders hardcode limited→full expansion and the decoder requests the
|
||||
`*VideoRange` formats regardless of `connection.colorFullRange`; VideoToolbox range-converts a
|
||||
full-range source to video range on decode, so it stays self-consistent (mild level compression on
|
||||
genuinely full-range content, which no host emits). Pre-existing; wire `colorFullRange` into the
|
||||
range constants eventually.
|
||||
|
||||
Reference in New Issue
Block a user