# HDR pipeline — investigation & implementation plan

Goal: **true, correct HDR glass-to-glass** for punktfunk, across the host (Windows today; Linux blocked upstream) and every client (Windows / Apple / Android / Linux). This is an audit of the current state, the gap list, and a phased plan. It was produced from a full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference implementation.

> Status legend: **blocker** (HDR can't work) · **correctness** (HDR works but looks wrong) ·
> **quality** (correct-ish, missing refinement) · **ok**.

---

## TL;DR

Our HDR is **correct in isolated islands but broken end-to-end.** The pixel math and the HEVC VUI we *do* emit are right (self-test validated, matches Apollo). What's missing is the **metadata chain**: nothing measures, signals, transports, or applies the *static HDR metadata* (mastering display colour volume + content light level) that tells a display how to tone-map — so every client hardcodes generic values or infers from the bitstream, and one line (`abi.rs:896`, `video_caps = 0`) makes the entire (correct) Apple HDR pipeline dead code.

---

## What's already correct (the islands)

| Stage | Where |
|---|---|
| Windows host HEVC **VUI** — primaries=9 (BT.2020) / transfer=16 (PQ) / matrix=9 (BT.2020-NCL) / limited range | `encode/nvenc.rs:307-316` |
| Windows host **scRGB→BT.2020 PQ** shader (×80 nits → BT.709→2020 → ST.2084 OETF, 10000-nit peak) | `capture/dxgi.rs` — self-test `<1` code error, matches Apollo |
| Windows client **P010 decode + YUV→RGB** (BT.2020-NCL, limited→full) + **R10G10B10A2 / G2084-P2020 swapchain** | `present.rs:66-77, 320-370` |
| Android client **Main10 decode + reactive DataSpace** (BT2020-PQ/HLG) | `decode.rs:210-227` |
| Apple client decode/present **code** (P010 VideoToolbox, BT.2020 PQ Metal, `itur_2100_PQ` + EDR) | correct — but never runs (blocker #2) |

## Gap list

### Blockers

1. **No color-metadata transport in the protocol** *(the keystone).* The wire carries only `Hello.video_caps` (10BIT/HDR bits) and `Welcome.bit_depth` (8/10) — `quic.rs:127-128` explicitly defers color. No primaries/transfer/matrix/range, **no ST.2086 mastering, no MaxCLL/MaxFALL**. ST.2086/CLL host→client is impossible by construction today.
2. **C ABI hardcodes `video_caps = 0`** (`abi.rs:896`) → Apple's complete HDR pipeline is dead code; no ABI embedder can request HDR. One-line root cause.
3. **H.264 and AV1 emit zero color signaling on Windows** — the `if self.hdr` VUI block in `nvenc.rs` only writes `hevcConfig`. Any H.264+10-bit or AV1+HDR stream decodes as BT.709 SDR. *(AV1 is **not** a "copy the HEVC VUI" fix — AV1 has no VUI/SEI; it carries primaries/transfer/matrix in the sequence-header `color_config` and mastering/CLL in **METADATA OBUs** `HDR_MDCV`/`HDR_CLL`. Verify whether NVENC's AV1 path accepts them.)*
4. **Linux host is 8-bit only end to end** — capture offers only 8-bit PipeWire formats (`capture/linux.rs:443-453, 594-654`; gamescope #2126, portals don't wire PipeWire 1.6 BT.2020/PQ); encode downgrades 10-bit (`encode/linux.rs:153-162` TODO, `vaapi.rs:719`) with BT.709 hardcoded. The Windows-style 8-bit→Main10 upconvert shim is not implemented here.
5. **Linux client HDR is a complete non-feature** — `video_caps=0`, P010 decode path dead (`video.rs:379`), CICP hardcoded BT.709 (`ui_stream.rs:239-243`), no Wayland color-management (GTK4 0.11 too old).

### Correctness

6. **No host ever emits the ST.2086 mastering or CEA-861.3 CLL SEI.** Windows never reads `IDXGIOutput6::GetDesc1`; `nvenc.rs` never builds an `NV_ENC_SEI_PAYLOAD`; Linux attaches no libavcodec `side_data`. Apollo reads `GetDesc1` and attaches it.
7. **Clients hardcode mastering metadata.** `present.rs:584-595` ships fixed `1000-nit / MaxCLL 1000 / MaxFALL 400` (with the literal "the protocol doesn't carry the stream's real mastering metadata yet" comment). Apple/Android set none.
8. **HDR→SDR tone-mapping is unaddressed — and it's the common case.** Most client displays are SDR. No client queries display peak; silent `SetColorSpace1`/`SetHDRMetaData` failures present PQ as SDR gamma (crushed/dark). We lean entirely on OS auto-fallback.
9. **Windows secure desktop drops HDR to SDR** on lock/UAC (`dxgi.rs:325-368`, `sudovda.rs:234-277`).
10. **GameStream silently streams SDR** on a Moonlight HDR request (`mod.rs:48-56`, `rtsp.rs:288-293`) — logged, but no negotiated error. Real Apollo parity needs the Moonlight `SS_HDR_METADATA` blob on the **ENet control channel**, not just in-band.
11. **Linux client software path is color-wrong even for SDR** — BT.601 applied to BT.709 (`video.rs:162-167`, no `color_state` on the texture). Standalone bug.

### Quality

12. No per-content MaxCLL/MaxFALL (`GetDesc1` doesn't expose it). No encoder-CSC-range vs signaled-range reconciliation (black-crush risk).
No automated 10-bit test — `probe` never even reads `Welcome.bit_depth` (`main.rs:396-406`).

### Out of scope (call out, don't build)

- Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle *static* ST.2086 only, with mid-stream changes carried by re-sending the static block (below).
- HLG: the colorimetry transfer enum carries `18` from day one (free), but the `0xCE` mastering datagram is **omitted for HLG** (scene-referred, no mastering metadata).

---

## Protocol design (the keystone — pure-additive, hardware-free, CI-testable)

Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.

### (A) Per-session colorimetry — 4 trailing bytes on `Welcome`

After the existing `bit_depth` (offset 59), append a fixed 4-byte CICP block at offsets 60..64. (A future mirror on `Reconfigured` will announce a mid-stream SDR↔HDR / BT.709↔BT.2020 flip on the control stream we already use for renegotiation — deferred to Step 1 with the mid-stream-flip work; today a mode switch never changes the colour, and the `0xCE` re-send covers mastering changes.)

```
[60] colour_primaries          (CICP: 1=BT.709, 9=BT.2020)
[61] transfer_characteristics  (1=BT.709, 16=PQ/SMPTE2084, 18=HLG)
[62] matrix_coeffs             (1=BT.709, 9=BT.2020-NCL)   ← never emit 10 (CL): no client decodes it
[63] video_full_range_flag     (0=limited, 1=full)
```

Decode with `b.get(60).unwrap_or(1)` etc. — an older host omits them → BT.709 limited SDR (today's behavior). `Welcome` stays `Copy`. Modeled as a `ColorInfo` struct on the wire types and exposed on `NativeClient` (with `bit_depth`) so clients *know* the colorimetry instead of inferring it.

### (B) Per-change mastering + CLL — a new host→client datagram, tag `0xCE`

ST.2086 is variable and changes mid-stream, so it rides a datagram (next tag after `0xCD` HIDOUT), demuxed in `client.rs` like AUDIO/RUMBLE/HIDOUT.
28 payload bytes after the tag byte, standard SEI fixed-point:

```
[0] = 0xCE
G.x G.y B.x B.y R.x R.y             6 × u16 LE   display primaries, 1/50000 units
wp.x wp.y                           2 × u16 LE   white point, 1/50000 units
max_display_mastering_luminance     u32 LE       0.0001 cd/m²
min_display_mastering_luminance     u32 LE       0.0001 cd/m²
max_cll                             u16 LE       nits
max_fall                            u16 LE       nits
```

- Sent on session start and whenever `GetDesc1`/source mastering changes; **re-sent on every IDR/RFI keyframe** so a client that dropped the (best-effort) datagram converges within a GOP. Until first receipt the client uses the Welcome transfer + a documented generic default.
- **Bounds-check length before reading** (reassembler-bounds security invariant) — truncation test required.
- **Omitted entirely for HLG.**
- Units note: these map straight to DXGI `DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`, and Apple `CAEDRMetadata.hdr10`. On the **libavcodec/Linux** side they need conversion — `AVMasteringDisplayMetadata` stores `AVRational`, not raw fixed-point.

### (C) C ABI

- `punktfunk_connect_ex5(... video_caps: u8)` (ex4 delegates with 0); **fix `abi.rs:896`.**
- `punktfunk_connection_next_hdr_meta(c, *mut PunktfunkHdrMeta, timeout_ms)` — new plane, one-puller contract like `next_audio`.
- `punktfunk_connection_color_info(c, *mut prim, *mut trc, *mut matrix, *mut range, *mut bit_depth)`.
- Regenerate `include/punktfunk_core.h` (cbindgen); `struct_size`/repr(C) guards on new structs.

---

## Phases

### Step 0 — Protocol + ABI carry color metadata end to end *(this change)*

The dominant cross-cutting blocker; everything else is downstream. No rendering changes, no hardware, CI-testable.

- **core:** `ColorInfo` + 4 Welcome bytes; `HdrMeta` + `0xCE` codec (bounds-checked); `NativeClient` `color`/`bit_depth` fields + HdrMeta receiver + demux + `next_hdr_meta`.
- **C ABI:** `connect_ex5`, `next_hdr_meta`, `color_info`, fix caps=0; regen header.
- **host:** populate `Welcome.color` from the negotiated bit-depth/HDR decision; send a `0xCE` (generic default for now) when HDR is negotiated.
- **clients:** Windows/Android inherit the demux via shared core; Apple flips to `ex5`.
- **validation:** `quic.rs` round-trip + truncation + **SDR back-compat** tests; `probe` logs `bit_depth` + colorimetry; loopback asserts a 10-bit Welcome carries trc=16 and a `0xCE` lands.

### Step 1 — Host emits correct in-band SEI + complete VUI on all codecs *(landed; RTX-validation pending)*

In-band SEI is read directly by decoders, so it fixes correctness even before clients consume the protocol, and gives an Apollo/Moonlight on-glass parity gate.

- **Single source of truth:** the capturer learns the source display's mastering metadata and exposes it via `Capturer::hdr_meta() -> Option<HdrMeta>`. The stream loop forwards it to the encoder (`Encoder::set_hdr_meta` → in-band SEI) **and** the client (real `0xCE`, re-sent on each keyframe). Pure byte-level logic (float→fixed conversion + the HEVC/H.264 SEI payloads) lives in the unit-tested, cross-platform `src/hdr.rs` (`hdr_meta_from_display`, `hevc_mastering_display_sei` type **137**, `hevc_content_light_level_sei` type **144** — note: NOT "type 4", that was a drafting error).
- **Windows (done, CI-compiled / RTX on-glass pending):** `dxgi.rs` + `wgc.rs` read `IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0 — GetDesc1 has none, like Apollo). `nvenc.rs` attaches the mastering + CLL SEI on every IDR for HEVC/H.264. (AV1 mastering rides METADATA OBUs, not SEI — follow-up; AV1 `color_config` already lands in Step 0's quick win.)
- **Linux encode-ready — DEFERRED into Step 4:** Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be *incorrect*. The libavcodec `side_data` path (with the `AVRational` conversion) lands together with the 8-bit→Main10 shim / true 10-bit capture in Step 4.
- **Windows secure-desktop relay** (`virtual_stream_relay`) still sends only the generic baseline `0xCE`; the helper's in-band SEI carries the real grade. Wiring the relay's `0xCE` is a follow-up.
- **validation (RTX box):** `ffprobe -show_frames` shows mastering + CLL side-data with the display's real luminance and VUI 9/16/9; stock Moonlight shows correct (not washed-out) HDR. Add **encoder-CSC-range == signaled-range** check.

### Step 2 — Clients apply the metadata *(landed; CI/on-glass validation pending)*

All three clients now drain the protocol's `HdrMeta` (`next_hdr_meta` / `nextHdrMeta`) and apply it, each remapping from the wire form (ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m²) to the platform's expected layout:

- **Windows (Rust, CI-compiled):** session pump drains `next_hdr_meta` into a `LATEST_HDR_META` slot; `present_newest` applies it via `Presenter::set_hdr_metadata` → real `SetHDRMetaData` (`hdr_meta_to_dxgi`: G,B,R→R,G,B reorder, 0.0001-nit→nit for `MaxMasteringLuminance`), dropping the 1000/1000/400 hardcode. `SetColorSpace1`/`SetHDRMetaData` failures + an SDR-display colour-space rejection are now **logged**, not swallowed.
- **Apple (Swift, mac-runner CI):** connect now advertises caps via `punktfunk_connect_ex5` (`SessionModel` computes `videoCap10Bit|videoCapHDR` from `hdrEnabled`) — *this is the fix that resurrects Apple's previously-dead HDR pipeline*. `nextHdrMeta`/`colorInfo` wrappers added; the pump drains `nextHdrMeta` → `VideoDecoder.setHdrMeta` → `CVBufferSetAttachment` of `kCVImageBufferMasteringDisplayColorVolumeKey` (24-byte BE SEI) + `kCVImageBufferContentLightLevelInfoKey` (4-byte BE) on each HDR pixel buffer (the correct path for the itur_2100_PQ layer; `CAEDRMetadata` on a PQ layer is ambiguous and was avoided).
- **Android (Rust `decode.rs`, cargo-ndk verified):** when `client.color.is_hdr()`, drain the first `next_hdr_meta` and set `MediaFormat` `hdr-static-info` (`KEY_HDR_STATIC_INFO`) before `configure()` — `android_hdr_static_info` builds the 25-byte CTA-861.3 Type-1 blob (LE, **R,G,B** order, max-lum in **nits-u16**). `Display.getHdrCapabilities` gate deferred (the Surface DataSpace already drives SurfaceFlinger tone-mapping on non-HDR displays).

### Step 3 — Display-capability gate *(landed; CI/on-glass validation pending)*

The common-case correctness step — most client displays are SDR. **Chosen approach: capability-gate** (not an in-shader BT.2390 tone-map). Rationale: with Steps 1–2 the host sends *correct* mastering metadata, so an HDR display self-tone-maps from it; the real remaining gap is SDR displays, best fixed by **not advertising HDR you can't present** — the host then sends a proper BT.709 SDR stream instead of PQ the panel would mis-tone-map (washed-out/dark). No guessed tone-map curve, deterministic.

- **Windows** (`present::display_supports_hdr` via DXGI: any `IDXGIOutput6` colour space == `G2084`): `session.rs` ANDs it with the user's HDR setting before advertising caps; logs when it drops to SDR.
- **Apple** (`SessionModel`, main-actor): `NSScreen.maximumExtendedDynamicRangeColorComponentValue > 1` (macOS) / `UIScreen.main.potentialEDRHeadroom > 1` (iOS) ANDed with `hdrEnabled`.
- **Android** (`Settings.displaySupportsHdr` via `Display.getHdrCapabilities` HDR10/HDR10+): Kotlin passes it to `nativeConnect`; `session.rs` gates the caps on the new `hdr_enabled` jboolean (cargo-ndk-verified).
- **Deferred** (need on-glass / the RTX box): the **mid-session `Reconfigure` "downgrade to SDR"** for a monitor move HDR↔SDR; and confirming the **host produces SDR for an SDR client even off an HDR desktop** — on the native path the per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so it should hold end-to-end; verify the stale-HDR-SudoVDA edge case on the RTX box.

### Step 4 — Linux (last; capture blocked upstream)

- **8-bit→Main10 NVENC upconvert shim** (`encode/linux.rs`) — Main10 transport with correct VUI/SEI without HDR capture (gate so we don't claim HDR transfer on SDR content).
- **Linux encode color + side-data (the deferred Step 1c):** set `color_primaries/trc/colorspace/range` from the negotiated `ColorInfo` and attach `AV_FRAME_DATA_MASTERING_DISPLAY_METADATA` / `CONTENT_LIGHT_LEVEL` side-data (with the `AVRational` conversion) in `encode/linux.rs` + `vaapi.rs` — only once the encoder actually produces 10-bit, so the signalling matches the bits.
- True 10-bit capture: offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot on Sway/wlroots; track gamescope #2126. **Don't block the rest of the plan on it.**
- Linux client: `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome, `wp_color_management` when GTK ≥ 4.14.

## Quick wins (independent, land in parallel)

1. `connect_ex5` + fix `abi.rs:896` — resurrects Apple's pipeline *(Step 0)*.
2. H.264 VUI + AV1 `color_config` on `nvenc.rs` — closes two latent blockers *(Windows-only, validated in CI / on the RTX box)*.
3. `probe` logs `bit_depth` + colorimetry — observability for every later round-trip assertion.
4. Linux client BT.601→BT.709 sws + texture `color_state` — standalone SDR correctness bug.
5. GameStream silent-downgrade already warns (`rtsp.rs:289`) — keep observable.

## Open questions

- **MaxCLL source:** `GetDesc1` doesn't expose it (Apollo zeroes).
Static default, or measure per-frame peak in the PQ shader (only truly-correct, adds a readback)?
- **GameStream:** implement `SS_HDR_METADATA` for Moonlight parity, or keep it deliberately SDR and steer HDR users to punktfunk/1?
- **HLG:** carry the enum from day one (free) — but do any sources actually produce HLG?
- **Linux:** is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk advertising HDR we can't truly deliver?

## Ordering rationale

Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is read directly by decoders, so it fixes correctness even before our clients consume the protocol, and gives an Apollo-parity on-glass gate. Steps 2–3 are mechanical per-client wiring once metadata flows. Linux is last because capture is gated on upstream we don't control; the shim delivers Main10 transport without that dependency.

Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 2–3 = a real HDR display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.
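
Since Step 0 is the hardware-free keystone, the two wire formats from the protocol design section (the 4 trailing CICP bytes on `Welcome` and the `0xCE` mastering datagram) could be exercised with a codec along these lines. This is a minimal sketch, not the code in `quic.rs` or `client.rs`: the field names, `from_welcome`, `encode`/`decode`, and the constants are hypothetical, `is_hdr` is assumed to mean "transfer is PQ or HLG", and the 29-byte total assumes the tag byte precedes the 28 bytes of fields listed in the layout.

```rust
/// Hypothetical mirror of the 4 trailing CICP bytes on `Welcome` (offsets 60..64).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColorInfo {
    pub primaries: u8,    // 1 = BT.709, 9 = BT.2020
    pub transfer: u8,     // 1 = BT.709, 16 = PQ, 18 = HLG
    pub matrix: u8,       // 1 = BT.709, 9 = BT.2020-NCL
    pub full_range: bool, // false = limited
}

impl ColorInfo {
    /// An older host omits the bytes, so every missing field falls back to
    /// BT.709 limited SDR.
    pub fn from_welcome(b: &[u8]) -> Self {
        Self {
            primaries: b.get(60).copied().unwrap_or(1),
            transfer: b.get(61).copied().unwrap_or(1),
            matrix: b.get(62).copied().unwrap_or(1),
            full_range: b.get(63).copied().unwrap_or(0) != 0,
        }
    }

    /// Assumption: "HDR" means a PQ (16) or HLG (18) transfer.
    pub fn is_hdr(&self) -> bool {
        matches!(self.transfer, 16 | 18)
    }
}

pub const TAG_HDR_META: u8 = 0xCE;
/// Assumed total size: 1 tag byte + 28 bytes of fields.
pub const HDR_META_LEN: usize = 29;

/// Hypothetical in-memory form of the `0xCE` datagram, in wire units.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HdrMeta {
    pub primaries: [[u16; 2]; 3], // G, B, R as (x, y), 1/50000 units
    pub white_point: [u16; 2],    // (x, y), 1/50000 units
    pub max_luminance: u32,       // 0.0001 cd/m²
    pub min_luminance: u32,       // 0.0001 cd/m²
    pub max_cll: u16,             // nits
    pub max_fall: u16,            // nits
}

impl HdrMeta {
    pub fn encode(&self) -> [u8; HDR_META_LEN] {
        let mut out = [0u8; HDR_META_LEN];
        out[0] = TAG_HDR_META;
        let mut i = 1;
        // Three primaries followed by the white point, each as two u16 LE.
        for xy in self.primaries.iter().chain(std::iter::once(&self.white_point)) {
            for v in xy {
                out[i..i + 2].copy_from_slice(&v.to_le_bytes());
                i += 2;
            }
        }
        out[i..i + 4].copy_from_slice(&self.max_luminance.to_le_bytes());
        out[i + 4..i + 8].copy_from_slice(&self.min_luminance.to_le_bytes());
        out[i + 8..i + 10].copy_from_slice(&self.max_cll.to_le_bytes());
        out[i + 10..i + 12].copy_from_slice(&self.max_fall.to_le_bytes());
        out
    }

    /// Checks length and tag before touching any field, so a truncated
    /// datagram yields `None` instead of an out-of-bounds read.
    pub fn decode(b: &[u8]) -> Option<Self> {
        if b.len() < HDR_META_LEN || b[0] != TAG_HDR_META {
            return None;
        }
        let u16_at = |o: usize| u16::from_le_bytes([b[o], b[o + 1]]);
        let u32_at = |o: usize| u32::from_le_bytes([b[o], b[o + 1], b[o + 2], b[o + 3]]);
        Some(Self {
            primaries: [
                [u16_at(1), u16_at(3)],
                [u16_at(5), u16_at(7)],
                [u16_at(9), u16_at(11)],
            ],
            white_point: [u16_at(13), u16_at(15)],
            max_luminance: u32_at(17),
            min_luminance: u32_at(21),
            max_cll: u16_at(25),
            max_fall: u16_at(27),
        })
    }
}
```

A round-trip through `encode` and `decode`, a truncated input such as `&wire[..28]` returning `None`, and a 60-byte `Welcome` decoding to BT.709 limited range correspond to the round-trip, truncation, and SDR back-compat tests listed under Step 0. Accepting inputs longer than 29 bytes is a choice made here to match the trailing-bytes pattern; a stricter decoder could require an exact length.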