punktfunk/docs/hdr-pipeline-plan.md
enricobuehler 551012bb43 feat(clients): HDR Steps 2-3 — apply mastering metadata + display capability-gate
Continues docs/hdr-pipeline-plan.md. Steps 0/1 + Step 2 (Windows/Android) already
landed in 3526517; this is Step 2 (Apple) + Step 3 (all clients). Client-only — no
core/host/ABI change (the 0xCE/next_hdr_meta/color_info surfaces shipped in Step 0).

Step 2 — clients APPLY the host's HDR metadata (each remaps from the wire form: ST.2086
G,B,R order, mastering luminance in 0.0001 cd/m2):
- Apple: connect via punktfunk_connect_ex5 (resurrects the previously-dead HDR pipeline);
  nextHdrMeta/colorInfo wrappers + HdrMeta SEI-blob builders; the pump drains nextHdrMeta
  -> VideoDecoder.setHdrMeta -> CVBufferSetAttachment of MasteringDisplayColorVolume (24B
  BE) + ContentLightLevelInfo (4B BE) on each HDR pixel buffer (correct for the
  itur_2100_PQ layer; CAEDRMetadata avoided as ambiguous there).

Step 3 — capability-gate: advertise HDR caps ONLY when the display can present it, so an
SDR display gets a proper BT.709 stream instead of PQ it would mis-tone-map; an HDR
display self-tone-maps from the Step-1/2 mastering metadata.
- Windows: present::display_supports_hdr() (DXGI any IDXGIOutput6 colour space == G2084),
  ANDed with the user HDR setting in session.rs; logs the SDR drop.
- Apple: NSScreen.maximumExtendedDynamicRangeColorComponentValue>1 (macOS) /
  UIScreen.main.potentialEDRHeadroom>1 (iOS) in SessionModel.
- Android: Settings.displaySupportsHdr (Display.getHdrCapabilities HDR10/HDR10+) passed
  through a new hdr_enabled jboolean on nativeConnect; session.rs gates the caps.

Validation: Android native (incl. the jboolean gate) builds + clippy clean via cargo-ndk;
fmt clean. Windows (MSVC), Apple (Swift) and the Kotlin side are CI/on-glass validated —
not compilable on the Linux dev box. Deferred to the RTX box: mid-session Reconfigure
SDR-downgrade on monitor move, and confirming the host emits SDR for an SDR client off an
HDR desktop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:46:58 +00:00


HDR pipeline — investigation & implementation plan

Goal: true, correct HDR glass-to-glass for punktfunk, across the host (Windows today; Linux blocked upstream) and every client (Windows / Apple / Android / Linux).

This is an audit of the current state, the gap list, and a phased plan. It was produced from a full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference implementation.

Status legend: blocker (HDR can't work) · correctness (HDR works but looks wrong) · quality (correct-ish, missing refinement) · ok.


TL;DR

Our HDR is correct in isolated islands but broken end-to-end. The pixel math and the HEVC VUI we do emit are right (self-test validated, matches Apollo). What's missing is the metadata chain: nothing measures, signals, transports, or applies the static HDR metadata (mastering display colour volume + content light level) that tells a display how to tone-map — so every client hardcodes generic values or infers from the bitstream, and one line (abi.rs:896, video_caps = 0) makes the entire (correct) Apple HDR pipeline dead code.


What's already correct (the islands)

Stage | Where
--- | ---
Windows host HEVC VUI: primaries=9 (BT.2020) / transfer=16 (PQ) / matrix=9 (BT.2020-NCL) / limited range | encode/nvenc.rs:307-316
Windows host scRGB→BT.2020 PQ shader (×80 nits → BT.709→2020 → ST.2084 OETF, 10000-nit peak) | capture/dxgi.rs (self-test <1 code error, matches Apollo)
Windows client P010 decode + YUV→RGB (BT.2020-NCL, limited→full) + R10G10B10A2 / G2084-P2020 swapchain | present.rs:66-77, 320-370
Android client Main10 decode + reactive DataSpace (BT2020-PQ/HLG) | decode.rs:210-227
Apple client decode/present code (P010 VideoToolbox, BT.2020 PQ Metal, itur_2100_PQ + EDR) | correct, but never runs (blocker #2)
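
The shader chain in the second row (×80 nits → BT.709→2020 → ST.2084) can be mirrored on the CPU as a reference for a self-test. A minimal sketch, assuming linear scRGB input where 1.0 = 80 nits; the function names are hypothetical (not the ones in capture/dxgi.rs), and the matrix is the commonly published BT.709→BT.2020 conversion rounded to four decimals:

```rust
// SMPTE ST.2084 (PQ) constants.
const M1: f64 = 2610.0 / 16384.0;
const M2: f64 = 2523.0 / 4096.0 * 128.0;
const C1: f64 = 3424.0 / 4096.0;
const C2: f64 = 2413.0 / 4096.0 * 32.0;
const C3: f64 = 2392.0 / 4096.0 * 32.0;

/// ST.2084 inverse EOTF: absolute luminance in nits -> PQ signal in [0, 1].
/// Input is clamped to the 0..10000-nit PQ range.
pub fn pq_encode(nits: f64) -> f64 {
    let y = (nits / 10_000.0).clamp(0.0, 1.0);
    let p = y.powf(M1);
    ((C1 + C2 * p) / (1.0 + C3 * p)).powf(M2)
}

/// Linear BT.709 -> linear BT.2020 primaries (rounded coefficients).
const BT709_TO_BT2020: [[f64; 3]; 3] = [
    [0.6274, 0.3293, 0.0433],
    [0.0691, 0.9195, 0.0114],
    [0.0164, 0.0880, 0.8956],
];

/// Hypothetical CPU mirror of the shader: scRGB (1.0 = 80 nits) -> BT.2020 PQ.
pub fn scrgb_to_bt2020_pq(rgb: [f64; 3]) -> [f64; 3] {
    let nits = rgb.map(|c| c * 80.0);
    let mut out = [0.0; 3];
    for (row, o) in BT709_TO_BT2020.iter().zip(out.iter_mut()) {
        let linear = row[0] * nits[0] + row[1] * nits[1] + row[2] * nits[2];
        *o = pq_encode(linear);
    }
    out
}
```

Each matrix row sums to 1, so neutral greys stay neutral; that gives a cheap sanity check alongside the per-code-value comparison.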

Gap list

Blockers

  1. No color-metadata transport in the protocol (the keystone). The wire carries only Hello.video_caps (10BIT/HDR bits) and Welcome.bit_depth (8/10) — quic.rs:127-128 explicitly defers color. No primaries/transfer/matrix/range, no ST.2086 mastering, no MaxCLL/MaxFALL. ST.2086/CLL host→client is impossible by construction today.
  2. C ABI hardcodes video_caps = 0 (abi.rs:896) → Apple's complete HDR pipeline is dead code; no ABI embedder can request HDR. One-line root cause.
  3. H.264 and AV1 emit zero color signaling on Windows — the if self.hdr VUI block in nvenc.rs only writes hevcConfig. Any H.264+10-bit or AV1+HDR stream decodes as BT.709 SDR. (AV1 is not a "copy the HEVC VUI" fix — AV1 has no VUI/SEI; it carries primaries/transfer/matrix in the sequence-header color_config and mastering/CLL in METADATA OBUs HDR_MDCV/HDR_CLL. Verify whether NVENC's AV1 path accepts them.)
  4. Linux host is 8-bit only end to end — capture offers only 8-bit PipeWire formats (capture/linux.rs:443-453, 594-654; gamescope #2126, portals don't wire PipeWire 1.6 BT.2020/PQ); encode downgrades 10-bit (encode/linux.rs:153-162 TODO, vaapi.rs:719) with BT.709 hardcoded. The Windows-style 8-bit→Main10 upconvert shim is not implemented here.
  5. Linux client HDR is a complete non-feature: video_caps=0, P010 decode path dead (video.rs:379), CICP hardcoded BT.709 (ui_stream.rs:239-243), no Wayland color-management (GTK4 0.11 too old).

Correctness

  1. No host ever emits the ST.2086 mastering or CEA-861.3 CLL SEI. Windows never reads IDXGIOutput6::GetDesc1; nvenc.rs never builds an NV_ENC_SEI_PAYLOAD; Linux attaches no libavcodec side_data. Apollo reads GetDesc1 and attaches it.
  2. Clients hardcode mastering metadata. present.rs:584-595 ships fixed 1000-nit / MaxCLL 1000 / MaxFALL 400 (with the literal "the protocol doesn't carry the stream's real mastering metadata yet" comment). Apple/Android set none.
  3. HDR→SDR tone-mapping is unaddressed — and it's the common case. Most client displays are SDR. No client queries display peak; silent SetColorSpace1/SetHDRMetaData failures present PQ as SDR gamma (crushed/dark). We lean entirely on OS auto-fallback.
  4. Windows secure desktop drops HDR to SDR on lock/UAC (dxgi.rs:325-368, sudovda.rs:234-277).
  5. GameStream silently streams SDR on a Moonlight HDR request (mod.rs:48-56, rtsp.rs:288-293) — logged, but no negotiated error. Real Apollo parity needs the Moonlight SS_HDR_METADATA blob on the ENet control channel, not just in-band.
  6. Linux client software path is color-wrong even for SDR — BT.601 applied to BT.709 (video.rs:162-167, no color_state on the texture). Standalone bug.

Quality

  1. No per-content MaxCLL/MaxFALL (GetDesc1 doesn't expose it). No encoder-CSC-range vs signaled-range reconciliation (black-crush risk). No automated 10-bit test — probe never even reads Welcome.bit_depth (main.rs:396-406).

Out of scope (call out, don't build)

  • Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle static ST.2086 only, with mid-stream changes carried by re-sending the static block (below).
  • HLG: the colorimetry transfer enum carries 18 from day one (free), but the 0xCE mastering datagram is omitted for HLG (scene-referred, no mastering metadata).

Protocol design (the keystone — pure-additive, hardware-free, CI-testable)

Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.

(A) Per-session colorimetry — 4 trailing bytes on Welcome

After the existing bit_depth (offset 59), append a fixed 4-byte CICP block at offsets 60..64. (A future mirror on Reconfigured will announce a mid-stream SDR↔HDR / BT.709↔BT.2020 flip on the control stream we already use for renegotiation — deferred to Step 1 with the mid-stream-flip work; today a mode switch never changes the colour, and the 0xCE re-send covers mastering changes.)

[60] colour_primaries          (CICP: 1=BT.709, 9=BT.2020)
[61] transfer_characteristics  (1=BT.709, 16=PQ/SMPTE2084, 18=HLG)
[62] matrix_coeffs             (1=BT.709, 9=BT.2020-NCL)   ← never emit 10 (CL): no client decodes it
[63] video_full_range_flag     (0=limited, 1=full)

Decode with b.get(60).unwrap_or(1) etc. — an older host omits them → BT.709 limited SDR (today's behavior). Welcome stays Copy. Modeled as a ColorInfo struct on the wire types and exposed on NativeClient (with bit_depth) so clients know the colorimetry instead of inferring it.
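
The trailing-bytes decode can be sketched as follows. This is a minimal illustration, assuming the Welcome payload is available as a byte slice b; the field names and the is_hdr definition (PQ or HLG transfer) are assumptions, not the actual wire types in core:

```rust
/// Hypothetical shape of the per-session colorimetry (CICP code points).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColorInfo {
    pub primaries: u8,
    pub transfer: u8,
    pub matrix: u8,
    pub full_range: bool,
}

impl ColorInfo {
    /// What an older host implies by omitting the block: BT.709 limited SDR.
    pub const SDR: ColorInfo = ColorInfo {
        primaries: 1,
        transfer: 1,
        matrix: 1,
        full_range: false,
    };

    /// Reads offsets 60..64; each missing byte falls back to its SDR default,
    /// so a 60-byte Welcome from an older host still decodes.
    pub fn from_welcome(b: &[u8]) -> ColorInfo {
        ColorInfo {
            primaries: b.get(60).copied().unwrap_or(1),
            transfer: b.get(61).copied().unwrap_or(1),
            matrix: b.get(62).copied().unwrap_or(1),
            full_range: b.get(63).copied().unwrap_or(0) != 0,
        }
    }

    /// Assumed definition: PQ (16) or HLG (18) transfer means HDR.
    pub fn is_hdr(&self) -> bool {
        matches!(self.transfer, 16 | 18)
    }
}
```

Note that slice::get returns Option<&u8>, hence the copied() before unwrap_or.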

(B) Per-change mastering + CLL — a new host→client datagram, tag 0xCE

ST.2086 is variable and changes mid-stream, so it rides a datagram (next tag after 0xCD HIDOUT), demuxed in client.rs like AUDIO/RUMBLE/HIDOUT. The payload after the tag byte is 28 bytes (the fields below sum to 28), in standard SEI fixed-point:

[0]   = 0xCE
G.x G.y B.x B.y R.x R.y   6 × u16 LE   display primaries, 1/50000 units
wp.x wp.y                 2 × u16 LE   white point,        1/50000 units
max_display_mastering_luminance  u32 LE   0.0001 cd/m²
min_display_mastering_luminance  u32 LE   0.0001 cd/m²
max_cll                   u16 LE       nits
max_fall                  u16 LE       nits
  • Sent on session start and whenever GetDesc1/source mastering changes; re-sent on every IDR/RFI keyframe so a client that dropped the (best-effort) datagram converges within a GOP. Until first receipt the client uses the Welcome transfer + a documented generic default.
  • Bounds-check length before reading (reassembler-bounds security invariant) — truncation test required.
  • Omitted entirely for HLG.
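  • Units note: these use the same fixed-point scales as DXGI DXGI_HDR_METADATA_HDR10, Android KEY_HDR_STATIC_INFO, and Apple CAEDRMetadata.hdr10, so the platform mapping is a primaries reorder plus a luminance rescale where the platform wants whole nits (see Step 2). On the libavcodec/Linux side they need a real conversion: AVMasteringDisplayMetadata stores AVRational, not raw fixed-point.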
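
A minimal sketch of the 0xCE codec with the length check up front. It assumes the layout above is read as one tag byte followed by the 28 payload bytes (29 on the wire); the type and field names are illustrative, not the actual core definitions:

```rust
pub const TAG_HDR_META: u8 = 0xCE;
/// Assumption: tag byte + 28 payload bytes.
pub const HDR_META_LEN: usize = 1 + 28;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct HdrMeta {
    pub primaries_gbr: [[u16; 2]; 3], // (x, y) for G, B, R in 1/50000 units
    pub white_point: [u16; 2],        // 1/50000 units
    pub max_luminance: u32,           // 0.0001 cd/m2
    pub min_luminance: u32,           // 0.0001 cd/m2
    pub max_cll: u16,                 // nits
    pub max_fall: u16,                // nits
}

impl HdrMeta {
    pub fn encode(&self) -> [u8; HDR_META_LEN] {
        let mut out = [0u8; HDR_META_LEN];
        out[0] = TAG_HDR_META;
        let mut at = 1;
        for v in self.primaries_gbr.iter().flatten().chain(self.white_point.iter()) {
            out[at..at + 2].copy_from_slice(&v.to_le_bytes());
            at += 2;
        }
        out[17..21].copy_from_slice(&self.max_luminance.to_le_bytes());
        out[21..25].copy_from_slice(&self.min_luminance.to_le_bytes());
        out[25..27].copy_from_slice(&self.max_cll.to_le_bytes());
        out[27..29].copy_from_slice(&self.max_fall.to_le_bytes());
        out
    }

    /// Returns None on a short datagram or a wrong tag; the length is checked
    /// before any field is indexed. Extra trailing bytes are ignored.
    pub fn decode(d: &[u8]) -> Option<HdrMeta> {
        if d.len() < HDR_META_LEN || d[0] != TAG_HDR_META {
            return None;
        }
        let u16_at = |i: usize| u16::from_le_bytes([d[i], d[i + 1]]);
        let u32_at = |i: usize| u32::from_le_bytes([d[i], d[i + 1], d[i + 2], d[i + 3]]);
        Some(HdrMeta {
            primaries_gbr: [
                [u16_at(1), u16_at(3)],
                [u16_at(5), u16_at(7)],
                [u16_at(9), u16_at(11)],
            ],
            white_point: [u16_at(13), u16_at(15)],
            max_luminance: u32_at(17),
            min_luminance: u32_at(21),
            max_cll: u16_at(25),
            max_fall: u16_at(27),
        })
    }
}
```

The truncation test called for above then reduces to decoding every prefix of a valid datagram and asserting None.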

(C) C ABI

  • punktfunk_connect_ex5(... video_caps: u8) (ex4 delegates with 0); fix abi.rs:896.
  • punktfunk_connection_next_hdr_meta(c, *mut PunktfunkHdrMeta, timeout_ms) — new plane, one-puller contract like next_audio.
  • punktfunk_connection_color_info(c, *mut prim, *mut trc, *mut matrix, *mut range, *mut bit_depth).
  • Regenerate include/punktfunk_core.h (cbindgen); struct_size/repr(C) guards on new structs.
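
The struct_size guard on an out-parameter can be sketched like this. It is an illustration only: the field layout of PunktfunkHdrMeta, the error codes, and the helper name are assumptions, and the real next_hdr_meta additionally takes the connection handle and a timeout:

```rust
/// Hypothetical C-visible layout; the caller sets struct_size before the call.
#[repr(C)]
#[derive(Debug, Clone, Copy, Default)]
pub struct PunktfunkHdrMeta {
    pub struct_size: u32,
    pub primaries_gbr: [[u16; 2]; 3],
    pub white_point: [u16; 2],
    pub max_luminance: u32,
    pub min_luminance: u32,
    pub max_cll: u16,
    pub max_fall: u16,
}

pub const PF_OK: i32 = 0; // assumed status codes
pub const PF_ERR_INVALID_ARG: i32 = -1;

/// Copies `latest` into the caller's struct if it is large enough.
/// Option<&mut T> has the same ABI as a nullable pointer, so an exported
/// extern "C" function can take this parameter type directly.
pub fn fill_hdr_meta(out: Option<&mut PunktfunkHdrMeta>, latest: &PunktfunkHdrMeta) -> i32 {
    let Some(out) = out else {
        return PF_ERR_INVALID_ARG; // NULL out-pointer
    };
    let expected = std::mem::size_of::<PunktfunkHdrMeta>() as u32;
    if out.struct_size < expected {
        return PF_ERR_INVALID_ARG; // caller was built against an older, smaller struct
    }
    *out = *latest;
    out.struct_size = expected;
    PF_OK
}
```

Rejecting a too-small struct_size keeps an older embedder from having fields written past the end of its allocation when the struct grows.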

Phases

Step 0 — Protocol + ABI carry color metadata end to end (this change)

The dominant cross-cutting blocker; everything else is downstream. No rendering changes, no hardware, CI-testable.

  • core: ColorInfo + 4 Welcome bytes; HdrMeta + 0xCE codec (bounds-checked); NativeClient color/bit_depth fields + HdrMeta receiver + demux + next_hdr_meta.
  • C ABI: connect_ex5, next_hdr_meta, color_info, fix caps=0; regen header.
  • host: populate Welcome.color from the negotiated bit-depth/HDR decision; send a 0xCE (generic default for now) when HDR is negotiated.
  • clients: Windows/Android inherit the demux via shared core; Apple flips to ex5.
  • validation: quic.rs round-trip + truncation + SDR back-compat tests; probe logs bit_depth + colorimetry; loopback asserts a 10-bit Welcome carries trc=16 and a 0xCE lands.

Step 1 — Host emits correct in-band SEI + complete VUI on all codecs (landed; RTX-validation pending)

In-band SEI is read directly by decoders, so it fixes correctness even before clients consume the protocol, and gives an Apollo/Moonlight on-glass parity gate.

  • Single source of truth: the capturer learns the source display's mastering metadata and exposes it via Capturer::hdr_meta() -> Option<HdrMeta>. The stream loop forwards it to the encoder (Encoder::set_hdr_meta → in-band SEI) and the client (real 0xCE, re-sent on each keyframe). Pure byte-level logic (float→fixed conversion + the HEVC/H.264 SEI payloads) lives in the unit-tested, cross-platform src/hdr.rs (hdr_meta_from_display, hevc_mastering_display_sei type 137, hevc_content_light_level_sei type 144 — note: NOT "type 4", that was a drafting error).
  • Windows (done, CI-compiled / RTX on-glass pending): dxgi.rs + wgc.rs read IDXGIOutput6::GetDesc1 at capture init / output change → HdrMeta (MaxCLL/MaxFALL left 0 — GetDesc1 has none, like Apollo). nvenc.rs attaches the mastering + CLL SEI on every IDR for HEVC/H.264. (AV1 mastering rides METADATA OBUs, not SEI — follow-up; AV1 color_config already lands in Step 0's quick win.)
  • Linux encode-ready — DEFERRED into Step 4: Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be incorrect. The libavcodec side_data path (with the AVRational conversion) lands together with the 8-bit→Main10 shim / true 10-bit capture in Step 4.
  • Windows secure-desktop relay (virtual_stream_relay) still sends only the generic baseline 0xCE; the helper's in-band SEI carries the real grade. Wiring the relay's 0xCE is a follow-up.
  • validation (RTX box): ffprobe -show_frames shows mastering + CLL side-data with the display's real luminance and VUI 9/16/9; stock Moonlight shows correct (not washed-out) HDR. Add encoder-CSC-range == signaled-range check.
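
The byte-level logic described for src/hdr.rs (float→fixed conversion plus the two SEI payload bodies) might look roughly like the sketch below. The function names and signatures are hypothetical stand-ins, not the real hdr_meta_from_display / hevc_*_sei; only the payload bodies are built (24 and 4 bytes, big-endian), without the NAL/SEI header wrapping:

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct HdrMeta {
    pub primaries_gbr: [[u16; 2]; 3], // (x, y) for G, B, R in 1/50000 units
    pub white_point: [u16; 2],
    pub max_luminance: u32, // 0.0001 cd/m2
    pub min_luminance: u32, // 0.0001 cd/m2
    pub max_cll: u16,
    pub max_fall: u16,
}

fn chroma_to_fixed(v: f64) -> u16 {
    (v * 50_000.0).round().clamp(0.0, 65_535.0) as u16
}

fn nits_to_fixed(nits: f64) -> u32 {
    (nits * 10_000.0).round().clamp(0.0, u32::MAX as f64) as u32
}

/// Display description (float chromaticities, nits) -> fixed-point metadata.
/// MaxCLL/MaxFALL stay 0 because the display description has none.
pub fn meta_from_display_floats(
    gbr: [[f64; 2]; 3],
    white: [f64; 2],
    max_nits: f64,
    min_nits: f64,
) -> HdrMeta {
    HdrMeta {
        primaries_gbr: gbr.map(|p| p.map(chroma_to_fixed)),
        white_point: white.map(chroma_to_fixed),
        max_luminance: nits_to_fixed(max_nits),
        min_luminance: nits_to_fixed(min_nits),
        max_cll: 0,
        max_fall: 0,
    }
}

/// Body of the mastering display colour volume SEI (payload type 137).
pub fn mastering_display_payload(m: &HdrMeta) -> [u8; 24] {
    let mut out = [0u8; 24];
    let mut at = 0;
    for v in m.primaries_gbr.iter().flatten().chain(m.white_point.iter()) {
        out[at..at + 2].copy_from_slice(&v.to_be_bytes());
        at += 2;
    }
    out[16..20].copy_from_slice(&m.max_luminance.to_be_bytes());
    out[20..24].copy_from_slice(&m.min_luminance.to_be_bytes());
    out
}

/// Body of the content light level SEI (payload type 144).
pub fn content_light_level_payload(m: &HdrMeta) -> [u8; 4] {
    let mut out = [0u8; 4];
    out[0..2].copy_from_slice(&m.max_cll.to_be_bytes());
    out[2..4].copy_from_slice(&m.max_fall.to_be_bytes());
    out
}
```

Note the endianness split: the in-band SEI is big-endian, while the 0xCE datagram carries the same fixed-point values little-endian.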

Step 2 — Clients apply the metadata (landed; CI/on-glass validation pending)

All three clients now drain the protocol's HdrMeta (next_hdr_meta / nextHdrMeta) and apply it, each remapping from the wire form (ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m²) to the platform's expected layout:

  • Windows (Rust, CI-compiled): session pump drains next_hdr_meta into a LATEST_HDR_META slot; present_newest applies it via Presenter::set_hdr_metadata → real SetHDRMetaData (hdr_meta_to_dxgi: G,B,R→R,G,B reorder, 0.0001-nit→nit for MaxMasteringLuminance), dropping the 1000/1000/400 hardcode. SetColorSpace1/SetHDRMetaData failures + an SDR-display colour-space rejection are now logged, not swallowed.
  • Apple (Swift, mac-runner CI): connect now advertises caps via punktfunk_connect_ex5 (SessionModel computes videoCap10Bit|videoCapHDR from hdrEnabled); this is the fix that resurrects Apple's previously-dead HDR pipeline. nextHdrMeta/colorInfo wrappers added; the pump drains nextHdrMeta → VideoDecoder.setHdrMeta → CVBufferSetAttachment of kCVImageBufferMasteringDisplayColorVolumeKey (24-byte BE SEI) + kCVImageBufferContentLightLevelInfoKey (4-byte BE) on each HDR pixel buffer (the correct path for the itur_2100_PQ layer; CAEDRMetadata on a PQ layer is ambiguous and was avoided).
  • Android (Rust decode.rs, cargo-ndk verified): when client.color.is_hdr(), drain the first next_hdr_meta and set MediaFormat hdr-static-info (KEY_HDR_STATIC_INFO) before configure(); android_hdr_static_info builds the 25-byte CTA-861.3 Type-1 blob (LE, R,G,B order, max-lum in nits-u16). Display.getHdrCapabilities gate deferred (the Surface DataSpace already drives SurfaceFlinger tone-mapping on non-HDR displays).
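
The Windows and Android remaps above can be sketched as pure functions. This is an illustration under assumptions: DxgiHdr10 is a plain stand-in that mirrors the fields of DXGI_HDR_METADATA_HDR10 so the sketch builds without the windows crate, the function names are hypothetical, and the Android field order follows the description above (descriptor byte 0, then R,G,B, white point, max/min luminance, MaxCLL, MaxFALL):

```rust
#[derive(Debug, Clone, Copy)]
pub struct HdrMeta {
    pub primaries_gbr: [[u16; 2]; 3], // wire order: G, B, R
    pub white_point: [u16; 2],
    pub max_luminance: u32, // 0.0001 cd/m2
    pub min_luminance: u32, // 0.0001 cd/m2
    pub max_cll: u16,
    pub max_fall: u16,
}

/// Stand-in for DXGI_HDR_METADATA_HDR10 (same fields, Rust naming).
#[derive(Debug, PartialEq, Eq)]
pub struct DxgiHdr10 {
    pub red_primary: [u16; 2],
    pub green_primary: [u16; 2],
    pub blue_primary: [u16; 2],
    pub white_point: [u16; 2],
    pub max_mastering_luminance: u32, // whole nits
    pub min_mastering_luminance: u32, // 0.0001 cd/m2
    pub max_content_light_level: u16,
    pub max_frame_average_light_level: u16,
}

/// G,B,R -> R,G,B reorder; max luminance rescaled from 0.0001-nit units to nits.
pub fn to_dxgi(m: &HdrMeta) -> DxgiHdr10 {
    let [g, b, r] = m.primaries_gbr;
    DxgiHdr10 {
        red_primary: r,
        green_primary: g,
        blue_primary: b,
        white_point: m.white_point,
        max_mastering_luminance: m.max_luminance / 10_000,
        min_mastering_luminance: m.min_luminance,
        max_content_light_level: m.max_cll,
        max_frame_average_light_level: m.max_fall,
    }
}

/// 25-byte blob: one descriptor byte (0 = Type 1) + twelve little-endian u16.
pub fn to_android_static_info(m: &HdrMeta) -> [u8; 25] {
    let [g, b, r] = m.primaries_gbr;
    // Saturate instead of wrapping when a value does not fit in u16.
    let max_nits = (m.max_luminance / 10_000).min(u16::MAX as u32) as u16;
    let min_lum = m.min_luminance.min(u16::MAX as u32) as u16;
    let fields: [u16; 12] = [
        r[0], r[1], g[0], g[1], b[0], b[1],
        m.white_point[0], m.white_point[1],
        max_nits, min_lum, m.max_cll, m.max_fall,
    ];
    let mut out = [0u8; 25]; // out[0] stays 0: static metadata descriptor Type 1
    for (i, v) in fields.iter().enumerate() {
        out[1 + 2 * i..3 + 2 * i].copy_from_slice(&v.to_le_bytes());
    }
    out
}
```

Keeping these as pure functions over the wire struct means the remaps can be unit-tested on the Linux dev box even though the platform calls themselves cannot be compiled there.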

Step 3 — Display-capability gate (landed; CI/on-glass validation pending)

The common-case correctness step, since most client displays are SDR. Chosen approach: capability-gate (not an in-shader BT.2390 tone-map). Rationale: with Steps 1-2 the host sends correct mastering metadata, so an HDR display self-tone-maps from it; the real remaining gap is SDR displays, best fixed by not advertising HDR you can't present. The host then sends a proper BT.709 SDR stream instead of PQ the panel would mis-tone-map (washed-out/dark). No guessed tone-map curve, deterministic.

  • Windows (present::display_supports_hdr via DXGI: any IDXGIOutput6 colour space == G2084): session.rs ANDs it with the user's HDR setting before advertising caps; logs when it drops to SDR.
  • Apple (SessionModel, main-actor): NSScreen.maximumExtendedDynamicRangeColorComponentValue > 1 (macOS) / UIScreen.main.potentialEDRHeadroom > 1 (iOS), ANDed with hdrEnabled.

  • Android (Settings.displaySupportsHdr via Display.getHdrCapabilities HDR10/HDR10+): Kotlin passes it to nativeConnect; session.rs gates the caps on the new hdr_enabled jboolean (cargo-ndk-verified).
  • Deferred (need on-glass / the RTX box): the mid-session Reconfigure "downgrade to SDR" for a monitor move HDR↔SDR; and confirming the host produces SDR for an SDR client even off an HDR desktop — on the native path the per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so it should hold end-to-end; verify the stale-HDR-SudoVDA edge case on the RTX box.
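
The gate itself is the same on every platform: the user setting ANDed with the display probe, evaluated before the caps go into Hello. A minimal sketch; the bit values and function name are assumptions for illustration, not the real video_caps encoding:

```rust
// Hypothetical bit assignments for Hello.video_caps.
pub const VIDEO_CAP_10BIT: u8 = 1 << 0;
pub const VIDEO_CAP_HDR: u8 = 1 << 1;

/// Advertise HDR only when the user wants it AND the display can present it.
/// Anything else advertises 0, so the host sends a BT.709 SDR stream.
pub fn advertised_video_caps(user_hdr_enabled: bool, display_supports_hdr: bool) -> u8 {
    if user_hdr_enabled && display_supports_hdr {
        VIDEO_CAP_10BIT | VIDEO_CAP_HDR
    } else {
        0
    }
}
```

The platform-specific part is only how display_supports_hdr is obtained (DXGI colour space, EDR headroom, Display.getHdrCapabilities); a caller can log the case where the user setting is on but the probe is false, which is the "drops to SDR" log mentioned for Windows.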

Step 4 — Linux (last; capture blocked upstream)

  • 8-bit→Main10 NVENC upconvert shim (encode/linux.rs) — Main10 transport with correct VUI/SEI without HDR capture (gate so we don't claim HDR transfer on SDR content).
  • Linux encode color + side-data (the deferred Step 1c): set color_primaries/trc/colorspace/range from the negotiated ColorInfo and attach AV_FRAME_DATA_MASTERING_DISPLAY_METADATA / CONTENT_LIGHT_LEVEL side-data (with the AVRational conversion) in encode/linux.rs + vaapi.rs — only once the encoder actually produces 10-bit, so the signalling matches the bits.
  • True 10-bit capture: offer ABGR2101010/P010 PipeWire formats + read colorimetry; pilot on Sway/wlroots; track gamescope #2126. Don't block the rest of the plan on it.
  • Linux client: ex5 caps, P010 decode, GdkDmabufTexture CICP from Welcome, wp_color_management when GTK ≥ 4.14.

Quick wins (independent, land in parallel)

  1. connect_ex5 + fix abi.rs:896 — resurrects Apple's pipeline (Step 0).
  2. H.264 VUI + AV1 color_config on nvenc.rs — closes two latent blockers (Windows-only, validated in CI / on the RTX box).
  3. probe logs bit_depth + colorimetry — observability for every later round-trip assertion.
  4. Linux client BT.601→BT.709 sws + texture color_state — standalone SDR correctness bug.
  5. GameStream silent-downgrade already warns (rtsp.rs:289) — keep observable.

Open questions

  • MaxCLL source: GetDesc1 doesn't expose it (Apollo zeroes). Static default, or measure per-frame peak in the PQ shader (only truly-correct, adds a readback)?
  • GameStream: implement SS_HDR_METADATA for Moonlight parity, or keep it deliberately SDR and steer HDR users to punktfunk/1?
  • HLG: carry the enum from day one (free) — but do any sources actually produce HLG?
  • Linux: is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk advertising HDR we can't truly deliver?

Ordering rationale

Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is read directly by decoders, so it fixes correctness even before our clients consume the protocol, and gives an Apollo-parity on-glass gate. Steps 2-3 are mechanical per-client wiring once metadata flows. Linux is last because capture is gated on upstream we don't control; the shim delivers Main10 transport without that dependency.

Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 2-3 = a real HDR display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.