Continues docs/hdr-pipeline-plan.md. Steps 0/1 + Step 2 (Windows/Android) already landed in 3526517; this is Step 2 (Apple) + Step 3 (all clients). Client-only: no core/host/ABI change (the 0xCE/next_hdr_meta/color_info surfaces shipped in Step 0).

Step 2: clients APPLY the host's HDR metadata (each remaps from the wire form: ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m2):
- Apple: connect via punktfunk_connect_ex5 (resurrects the previously-dead HDR pipeline); nextHdrMeta/colorInfo wrappers + HdrMeta SEI-blob builders; the pump drains nextHdrMeta -> VideoDecoder.setHdrMeta -> CVBufferSetAttachment of MasteringDisplayColorVolume (24B BE) + ContentLightLevelInfo (4B BE) on each HDR pixel buffer (correct for the itur_2100_PQ layer; CAEDRMetadata avoided as ambiguous there).

Step 3: capability-gate: advertise HDR caps ONLY when the display can present it, so an SDR display gets a proper BT.709 stream instead of PQ it would mis-tone-map; an HDR display self-tone-maps from the Step-1/2 mastering metadata.
- Windows: present::display_supports_hdr() (DXGI any IDXGIOutput6 colour space == G2084), ANDed with the user HDR setting in session.rs; logs the SDR drop.
- Apple: NSScreen.maximumExtendedDynamicRangeColorComponentValue>1 (macOS) / UIScreen.main.potentialEDRHeadroom>1 (iOS) in SessionModel.
- Android: Settings.displaySupportsHdr (Display.getHdrCapabilities HDR10/HDR10+) passed through a new hdr_enabled jboolean on nativeConnect; session.rs gates the caps.

Validation: Android native (incl. the jboolean gate) builds + clippy clean via cargo-ndk; fmt clean. Windows (MSVC), Apple (Swift) and the Kotlin side are CI/on-glass validated (not compilable on the Linux dev box).

Deferred to the RTX box: mid-session Reconfigure SDR-downgrade on monitor move, and confirming the host emits SDR for an SDR client off an HDR desktop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
HDR pipeline — investigation & implementation plan
Goal: true, correct HDR glass-to-glass for punktfunk, across the host (Windows today; Linux blocked upstream) and every client (Windows / Apple / Android / Linux).
This is an audit of the current state, the gap list, and a phased plan. It was produced from a full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference implementation.
Status legend: blocker (HDR can't work) · correctness (HDR works but looks wrong) · quality (correct-ish, missing refinement) · ok.
TL;DR
Our HDR is correct in isolated islands but broken end-to-end. The pixel math and the HEVC
VUI we do emit are right (self-test validated, matches Apollo). What's missing is the
metadata chain: nothing measures, signals, transports, or applies the static HDR metadata
(mastering display colour volume + content light level) that tells a display how to tone-map —
so every client hardcodes generic values or infers from the bitstream, and one line
(abi.rs:896, video_caps = 0) makes the entire (correct) Apple HDR pipeline dead code.
What's already correct (the islands)
| Stage | Where |
|---|---|
| Windows host HEVC VUI — primaries=9 (BT.2020) / transfer=16 (PQ) / matrix=9 (BT.2020-NCL) / limited range | encode/nvenc.rs:307-316 |
| Windows host scRGB→BT.2020 PQ shader (×80 nits → BT.709→2020 → ST.2084 OETF, 10000-nit peak) | capture/dxgi.rs — self-test <1 code error, matches Apollo |
| Windows client P010 decode + YUV→RGB (BT.2020-NCL, limited→full) + R10G10B10A2 / G2084-P2020 swapchain | present.rs:66-77, 320-370 |
| Android client Main10 decode + reactive DataSpace (BT2020-PQ/HLG) | decode.rs:210-227 |
| Apple client decode/present code (P010 VideoToolbox, BT.2020 PQ Metal, itur_2100_PQ + EDR) | correct, but never runs (blocker #2) |
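
The scRGB→BT.2020 PQ row above names a three-stage pixel transform. A minimal CPU-side sketch of that math follows, assuming the usual rounded BT.709→BT.2020 linear-light matrix and the ST.2084 constants; the function names are illustrative and this is not the shader in `capture/dxgi.rs`.

```rust
/// BT.709 -> BT.2020 primaries conversion for linear light (rounded coefficients).
const BT709_TO_BT2020: [[f32; 3]; 3] = [
    [0.6274, 0.3293, 0.0433],
    [0.0691, 0.9195, 0.0114],
    [0.0164, 0.0880, 0.8956],
];

/// ST.2084 (PQ) encoding: absolute luminance in nits -> non-linear signal in [0, 1],
/// with 10000 nits mapping to 1.0.
pub fn pq_oetf(nits: f32) -> f32 {
    const M1: f32 = 2610.0 / 16384.0;
    const M2: f32 = 2523.0 / 4096.0 * 128.0;
    const C1: f32 = 3424.0 / 4096.0;
    const C2: f32 = 2413.0 / 4096.0 * 32.0;
    const C3: f32 = 2392.0 / 4096.0 * 32.0;
    let y = (nits / 10_000.0).clamp(0.0, 1.0);
    let p = y.powf(M1);
    ((C1 + C2 * p) / (1.0 + C3 * p)).powf(M2)
}

/// scRGB (linear, BT.709 primaries, 1.0 = 80 nits) -> BT.2020 PQ signal per channel.
pub fn scrgb_to_bt2020_pq(rgb: [f32; 3]) -> [f32; 3] {
    // Step 1: scale to absolute nits.
    let nits = rgb.map(|c| c * 80.0);
    let mut out = [0.0f32; 3];
    for (o, row) in out.iter_mut().zip(BT709_TO_BT2020.iter()) {
        // Step 2: rotate primaries in linear light; clamp negatives after the matrix.
        let lin = row[0] * nits[0] + row[1] * nits[1] + row[2] * nits[2];
        // Step 3: PQ-encode.
        *o = pq_oetf(lin.max(0.0));
    }
    out
}
```

scRGB white `[1.0, 1.0, 1.0]` is 80 nits, which lands a little below the midpoint of the PQ signal range; quantisation to 10-bit limited range is a separate step not shown here.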
Gap list
Blockers
- No color-metadata transport in the protocol (the keystone). The wire carries only `Hello.video_caps` (10BIT/HDR bits) and `Welcome.bit_depth` (8/10); `quic.rs:127-128` explicitly defers color. No primaries/transfer/matrix/range, no ST.2086 mastering, no MaxCLL/MaxFALL. ST.2086/CLL host→client is impossible by construction today.
- C ABI hardcodes `video_caps = 0` (`abi.rs:896`) → Apple's complete HDR pipeline is dead code; no ABI embedder can request HDR. One-line root cause.
- H.264 and AV1 emit zero color signaling on Windows: the `if self.hdr` VUI block in `nvenc.rs` only writes `hevcConfig`. Any H.264+10-bit or AV1+HDR stream decodes as BT.709 SDR. (AV1 is not a "copy the HEVC VUI" fix: AV1 has no VUI/SEI; it carries primaries/transfer/matrix in the sequence-header `color_config` and mastering/CLL in METADATA OBUs `HDR_MDCV`/`HDR_CLL`. Verify whether NVENC's AV1 path accepts them.)
- Linux host is 8-bit only end to end: capture offers only 8-bit PipeWire formats (`capture/linux.rs:443-453, 594-654`; gamescope #2126, portals don't wire PipeWire 1.6 BT.2020/PQ); encode downgrades 10-bit (`encode/linux.rs:153-162` TODO, `vaapi.rs:719`) with BT.709 hardcoded. The Windows-style 8-bit→Main10 upconvert shim is not implemented here.
- Linux client HDR is a complete non-feature: `video_caps=0`, P010 decode path dead (`video.rs:379`), CICP hardcoded BT.709 (`ui_stream.rs:239-243`), no Wayland color-management (GTK4 0.11 too old).
Correctness
- No host ever emits the ST.2086 mastering or CEA-861.3 CLL SEI. Windows never reads `IDXGIOutput6::GetDesc1`; `nvenc.rs` never builds an `NV_ENC_SEI_PAYLOAD`; Linux attaches no libavcodec `side_data`. Apollo reads `GetDesc1` and attaches it.
- Clients hardcode mastering metadata. `present.rs:584-595` ships fixed `1000-nit / MaxCLL 1000 / MaxFALL 400` (with the literal "the protocol doesn't carry the stream's real mastering metadata yet" comment). Apple/Android set none.
- HDR→SDR tone-mapping is unaddressed, and it's the common case. Most client displays are SDR. No client queries display peak; silent `SetColorSpace1`/`SetHDRMetaData` failures present PQ as SDR gamma (crushed/dark). We lean entirely on OS auto-fallback.
- Windows secure desktop drops HDR to SDR on lock/UAC (`dxgi.rs:325-368`, `sudovda.rs:234-277`).
- GameStream silently streams SDR on a Moonlight HDR request (`mod.rs:48-56`, `rtsp.rs:288-293`): logged, but no negotiated error. Real Apollo parity needs the Moonlight `SS_HDR_METADATA` blob on the ENet control channel, not just in-band.
- Linux client software path is color-wrong even for SDR: BT.601 applied to BT.709 (`video.rs:162-167`, no `color_state` on the texture). Standalone bug.
Quality
- No per-content MaxCLL/MaxFALL (`GetDesc1` doesn't expose it). No encoder-CSC-range vs signaled-range reconciliation (black-crush risk). No automated 10-bit test: `probe` never even reads `Welcome.bit_depth` (`main.rs:396-406`).
Out of scope (call out, don't build)
- Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle static ST.2086 only, with mid-stream changes carried by re-sending the static block (below).
- HLG: the colorimetry transfer enum carries `18` from day one (free), but the `0xCE` mastering datagram is omitted for HLG (scene-referred, no mastering metadata).
Protocol design (the keystone — pure-additive, hardware-free, CI-testable)
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
(A) Per-session colorimetry — 4 trailing bytes on Welcome
After the existing bit_depth (offset 59), append a fixed 4-byte CICP block at offsets 60..64.
(A future mirror on Reconfigured will announce a mid-stream SDR↔HDR / BT.709↔BT.2020 flip on the
control stream we already use for renegotiation — deferred to Step 1 with the mid-stream-flip work;
today a mode switch never changes the colour, and the 0xCE re-send covers mastering changes.)
[60] colour_primaries (CICP: 1=BT.709, 9=BT.2020)
[61] transfer_characteristics (1=BT.709, 16=PQ/SMPTE2084, 18=HLG)
[62] matrix_coeffs (1=BT.709, 9=BT.2020-NCL) ← never emit 10 (CL): no client decodes it
[63] video_full_range_flag (0=limited, 1=full)
Decode each byte with `b.get(60).copied().unwrap_or(1)` etc. (defaults 1/1/1/0): an older host omits
the bytes, so the client falls back to BT.709 limited SDR (today's behavior). `Welcome` stays `Copy`.
Modeled as a `ColorInfo` struct on the wire types and exposed on `NativeClient` (with `bit_depth`) so
clients know the colorimetry instead of inferring it.
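A minimal sketch of that back-compat decode, assuming `b` is the full encoded Welcome as a byte slice. The field names, the `SDR_BT709` constant and this particular definition of `is_hdr` (PQ or HLG transfer) are illustrative assumptions, not the shipped types.

```rust
/// CICP colorimetry carried in the four trailing Welcome bytes (offsets 60..64).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ColorInfo {
    pub primaries: u8,
    pub transfer: u8,
    pub matrix: u8,
    pub full_range: bool,
}

impl ColorInfo {
    /// What an older host implies by omitting the block: BT.709, limited range.
    pub const SDR_BT709: ColorInfo = ColorInfo {
        primaries: 1,
        transfer: 1,
        matrix: 1,
        full_range: false,
    };

    /// Reads offsets 60..64; each missing byte falls back to its SDR default.
    pub fn from_welcome(b: &[u8]) -> ColorInfo {
        ColorInfo {
            primaries: b.get(60).copied().unwrap_or(1),
            transfer: b.get(61).copied().unwrap_or(1),
            matrix: b.get(62).copied().unwrap_or(1),
            full_range: b.get(63).copied().unwrap_or(0) != 0,
        }
    }

    /// Assumption for this sketch: PQ (16) or HLG (18) transfer counts as HDR.
    pub fn is_hdr(&self) -> bool {
        matches!(self.transfer, 16 | 18)
    }
}
```

Because `get` never panics, a 60-byte Welcome from an older host and a 64-byte one from a newer host go through the same code path.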
(B) Per-change mastering + CLL — a new host→client datagram, tag 0xCE
ST.2086 is variable and changes mid-stream, so it rides a datagram (next tag after 0xCD
HIDOUT), demuxed in client.rs like AUDIO/RUMBLE/HIDOUT. A 28-byte payload follows the 1-byte tag (29 bytes on the wire), in standard SEI fixed-point units:
[0] = 0xCE
G.x G.y B.x B.y R.x R.y 6 × u16 LE display primaries, 1/50000 units
wp.x wp.y 2 × u16 LE white point, 1/50000 units
max_display_mastering_luminance u32 LE 0.0001 cd/m²
min_display_mastering_luminance u32 LE 0.0001 cd/m²
max_cll u16 LE nits
max_fall u16 LE nits
- Sent on session start and whenever `GetDesc1`/source mastering changes; re-sent on every IDR/RFI keyframe so a client that dropped the (best-effort) datagram converges within a GOP. Until first receipt the client uses the Welcome transfer + a documented generic default.
- Bounds-check length before reading (reassembler-bounds security invariant); truncation test required.
- Omitted entirely for HLG.
- Units note: these map straight to DXGI `DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`, and Apple `CAEDRMetadata.hdr10`. On the libavcodec/Linux side they need conversion: `AVMasteringDisplayMetadata` stores `AVRational`, not raw fixed-point.
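The layout and the bounds-check rule can be sketched as a small codec. This assumes the field order listed above with the tag at byte 0 followed by 28 payload bytes; the struct and method names are illustrative, and accepting datagrams longer than 29 bytes is an assumption borrowed from the trailing-bytes pattern.

```rust
pub const TAG_HDR_META: u8 = 0xCE;
/// One tag byte plus the 28-byte fixed-point payload.
pub const HDR_META_LEN: usize = 1 + 28;

#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct HdrMeta {
    /// Display primaries in ST.2086 order G, B, R; each [x, y] in 1/50000 units.
    pub primaries: [[u16; 2]; 3],
    /// White point [x, y] in 1/50000 units.
    pub white_point: [u16; 2],
    /// Mastering luminance in 0.0001 cd/m² units.
    pub max_luminance: u32,
    pub min_luminance: u32,
    /// Content light levels in nits.
    pub max_cll: u16,
    pub max_fall: u16,
}

impl HdrMeta {
    pub fn encode(&self) -> [u8; HDR_META_LEN] {
        let mut out = [0u8; HDR_META_LEN];
        out[0] = TAG_HDR_META;
        let mut i = 1;
        // Eight little-endian u16 values: G, B, R primaries, then the white point.
        for v in self.primaries.iter().flatten().chain(self.white_point.iter()) {
            out[i..i + 2].copy_from_slice(&v.to_le_bytes());
            i += 2;
        }
        out[i..i + 4].copy_from_slice(&self.max_luminance.to_le_bytes());
        out[i + 4..i + 8].copy_from_slice(&self.min_luminance.to_le_bytes());
        out[i + 8..i + 10].copy_from_slice(&self.max_cll.to_le_bytes());
        out[i + 10..i + 12].copy_from_slice(&self.max_fall.to_le_bytes());
        out
    }

    /// Returns None for a short datagram or a wrong tag; never indexes past the check.
    pub fn decode(d: &[u8]) -> Option<HdrMeta> {
        if d.len() < HDR_META_LEN || d[0] != TAG_HDR_META {
            return None;
        }
        let u16_at = |o: usize| u16::from_le_bytes([d[o], d[o + 1]]);
        let u32_at = |o: usize| u32::from_le_bytes([d[o], d[o + 1], d[o + 2], d[o + 3]]);
        Some(HdrMeta {
            primaries: [
                [u16_at(1), u16_at(3)],
                [u16_at(5), u16_at(7)],
                [u16_at(9), u16_at(11)],
            ],
            white_point: [u16_at(13), u16_at(15)],
            max_luminance: u32_at(17),
            min_luminance: u32_at(21),
            max_cll: u16_at(25),
            max_fall: u16_at(27),
        })
    }
}
```

The length check sits before any indexing, so a truncated datagram yields `None` rather than a panic, which is the property the required truncation test would assert.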
(C) C ABI
- `punktfunk_connect_ex5(... video_caps: u8)` (ex4 delegates with 0); fix `abi.rs:896`.
- `punktfunk_connection_next_hdr_meta(c, *mut PunktfunkHdrMeta, timeout_ms)`: new plane, one-puller contract like `next_audio`.
- `punktfunk_connection_color_info(c, *mut prim, *mut trc, *mut matrix, *mut range, *mut bit_depth)`.
- Regenerate `include/punktfunk_core.h` (cbindgen); `struct_size`/repr(C) guards on new structs.
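One common reading of a `struct_size` guard is a leading size field that the caller fills in and the library checks before writing. The sketch below illustrates that pattern only; the field layout of `PunktfunkHdrMeta`, the helper name `write_hdr_meta_checked` and the 0 / -1 return convention are all assumptions, not the actual ABI.

```rust
/// Hypothetical C-visible layout; the real struct may differ.
#[repr(C)]
#[derive(Debug, Clone, Copy, Default)]
pub struct PunktfunkHdrMeta {
    /// Set by the caller to sizeof(PunktfunkHdrMeta) as it was compiled.
    pub struct_size: u32,
    pub primaries: [[u16; 2]; 3],
    pub white_point: [u16; 2],
    pub max_luminance: u32,
    pub min_luminance: u32,
    pub max_cll: u16,
    pub max_fall: u16,
}

/// Copies `src` into caller memory only if the caller's declared size covers
/// the whole struct. Returns 0 on success, -1 on a null or too-small target.
///
/// # Safety
/// `out` must be null or point to writable memory of at least the size the
/// caller stored in its `struct_size` field.
pub unsafe fn write_hdr_meta_checked(out: *mut PunktfunkHdrMeta, src: &PunktfunkHdrMeta) -> i32 {
    if out.is_null() {
        return -1;
    }
    let want = std::mem::size_of::<PunktfunkHdrMeta>() as u32;
    // Only the leading field is read before the size is validated.
    let have = unsafe { (*out).struct_size };
    if have < want {
        return -1;
    }
    let mut v = *src;
    v.struct_size = want;
    unsafe { *out = v };
    0
}
```

An embedder built against an older, smaller header then gets a clean error instead of an out-of-bounds write when the struct grows.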
Phases
Step 0 — Protocol + ABI carry color metadata end to end (this change)
The dominant cross-cutting blocker; everything else is downstream. No rendering changes, no hardware, CI-testable.
- core: `ColorInfo` + 4 Welcome bytes; `HdrMeta` + `0xCE` codec (bounds-checked); `NativeClient` `color`/`bit_depth` fields + HdrMeta receiver + demux + `next_hdr_meta`.
- C ABI: `connect_ex5`, `next_hdr_meta`, `color_info`, fix caps=0; regen header.
- host: populate `Welcome.color` from the negotiated bit-depth/HDR decision; send a `0xCE` (generic default for now) when HDR is negotiated.
- clients: Windows/Android inherit the demux via shared core; Apple flips to `ex5`.
- validation: `quic.rs` round-trip + truncation + SDR back-compat tests; `probe` logs `bit_depth` + colorimetry; loopback asserts a 10-bit Welcome carries trc=16 and a `0xCE` lands.
Step 1 — Host emits correct in-band SEI + complete VUI on all codecs (landed; RTX-validation pending)
In-band SEI is read directly by decoders, so it fixes correctness even before clients consume the protocol, and gives an Apollo/Moonlight on-glass parity gate.
- Single source of truth: the capturer learns the source display's mastering metadata and exposes it via `Capturer::hdr_meta() -> Option<HdrMeta>`. The stream loop forwards it to the encoder (`Encoder::set_hdr_meta` → in-band SEI) and the client (real `0xCE`, re-sent on each keyframe). Pure byte-level logic (float→fixed conversion + the HEVC/H.264 SEI payloads) lives in the unit-tested, cross-platform `src/hdr.rs` (`hdr_meta_from_display`, `hevc_mastering_display_sei` type 137, `hevc_content_light_level_sei` type 144; note: NOT "type 4", that was a drafting error).
- Windows (done, CI-compiled / RTX on-glass pending): `dxgi.rs` + `wgc.rs` read `IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0: GetDesc1 has none, like Apollo). `nvenc.rs` attaches the mastering + CLL SEI on every IDR for HEVC/H.264. (AV1 mastering rides METADATA OBUs, not SEI: follow-up; AV1 `color_config` already lands in Step 0's quick win.)
- Linux encode-ready (DEFERRED into Step 4): Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be incorrect. The libavcodec `side_data` path (with the `AVRational` conversion) lands together with the 8-bit→Main10 shim / true 10-bit capture in Step 4.
- Windows secure-desktop relay (`virtual_stream_relay`) still sends only the generic baseline `0xCE`; the helper's in-band SEI carries the real grade. Wiring the relay's `0xCE` is a follow-up.
- validation (RTX box): `ffprobe -show_frames` shows mastering + CLL side-data with the display's real luminance and VUI 9/16/9; stock Moonlight shows correct (not washed-out) HDR. Add encoder-CSC-range == signaled-range check.
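The byte-level logic described for `src/hdr.rs` can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function names are made up, and only the raw SEI payload bodies are built (payload types 137 and 144), leaving NAL wrapping and emulation prevention to the encoder.

```rust
/// Float CIE xy chromaticity -> ST.2086 fixed-point (1/50000 units).
pub fn chroma_to_fixed(v: f32) -> u16 {
    (v.clamp(0.0, 1.0) * 50_000.0).round() as u16
}

/// Luminance in nits -> ST.2086 fixed-point (0.0001 cd/m² units).
pub fn nits_to_fixed(nits: f32) -> u32 {
    (f64::from(nits.max(0.0)) * 10_000.0).round() as u32
}

/// Body of the mastering display colour volume SEI (payload type 137):
/// 24 bytes, big-endian, primaries in G, B, R order.
pub fn mastering_display_sei_payload(
    primaries_gbr: [[u16; 2]; 3],
    white_point: [u16; 2],
    max_lum: u32,
    min_lum: u32,
) -> [u8; 24] {
    let mut out = [0u8; 24];
    let mut i = 0;
    for v in primaries_gbr.iter().flatten().chain(white_point.iter()) {
        out[i..i + 2].copy_from_slice(&v.to_be_bytes());
        i += 2;
    }
    out[16..20].copy_from_slice(&max_lum.to_be_bytes());
    out[20..24].copy_from_slice(&min_lum.to_be_bytes());
    out
}

/// Body of the content light level SEI (payload type 144): 4 bytes, big-endian.
pub fn content_light_level_sei_payload(max_cll: u16, max_fall: u16) -> [u8; 4] {
    let mut out = [0u8; 4];
    out[..2].copy_from_slice(&max_cll.to_be_bytes());
    out[2..].copy_from_slice(&max_fall.to_be_bytes());
    out
}
```

The same 24-byte and 4-byte big-endian bodies are what the Apple client later attaches to pixel buffers in Step 2, so one builder can serve both the host encoder and that client.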
Step 2 — Clients apply the metadata (landed; CI/on-glass validation pending)
All three clients now drain the protocol's HdrMeta (next_hdr_meta / nextHdrMeta) and apply it,
each remapping from the wire form (ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m²) to the
platform's expected layout:
- Windows (Rust, CI-compiled): session pump drains `next_hdr_meta` into a `LATEST_HDR_META` slot; `present_newest` applies it via `Presenter::set_hdr_metadata` → real `SetHDRMetaData` (`hdr_meta_to_dxgi`: G,B,R→R,G,B reorder, 0.0001-nit→nit for `MaxMasteringLuminance`), dropping the 1000/1000/400 hardcode. `SetColorSpace1`/`SetHDRMetaData` failures + an SDR-display colour-space rejection are now logged, not swallowed.
- Apple (Swift, mac-runner CI): connect now advertises caps via `punktfunk_connect_ex5` (`SessionModel` computes `videoCap10Bit|videoCapHDR` from `hdrEnabled`); this is the fix that resurrects Apple's previously-dead HDR pipeline. `nextHdrMeta`/`colorInfo` wrappers added; the pump drains `nextHdrMeta` → `VideoDecoder.setHdrMeta` → `CVBufferSetAttachment` of `kCVImageBufferMasteringDisplayColorVolumeKey` (24-byte BE SEI) + `kCVImageBufferContentLightLevelInfoKey` (4-byte BE) on each HDR pixel buffer (the correct path for the itur_2100_PQ layer; `CAEDRMetadata` on a PQ layer is ambiguous and was avoided).
- Android (Rust `decode.rs`, cargo-ndk verified): when `client.color.is_hdr()`, drain the first `next_hdr_meta` and set `MediaFormat` `hdr-static-info` (`KEY_HDR_STATIC_INFO`) before `configure()`; `android_hdr_static_info` builds the 25-byte CTA-861.3 Type-1 blob (LE, R,G,B order, max-lum in nits-u16). `Display.getHdrCapabilities` gate deferred (the Surface DataSpace already drives SurfaceFlinger tone-mapping on non-HDR displays).
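As an illustration of the per-platform remap, here is a sketch of the Android case: wire form (G,B,R order, luminance in 0.0001 cd/m²) to the 25-byte static-info blob described above (type byte, then little-endian u16 fields in R,G,B order, max luminance in whole nits). `WireHdrMeta` and `android_static_info_sketch` are illustrative names, and saturating oversized values to `u16::MAX` is an assumption; the real `android_hdr_static_info` may differ.

```rust
/// Wire-form metadata as received in the 0xCE datagram (hypothetical mirror type).
#[derive(Debug, Clone, Copy)]
pub struct WireHdrMeta {
    pub primaries_gbr: [[u16; 2]; 3],
    pub white_point: [u16; 2],
    pub max_luminance: u32, // 0.0001 cd/m² units
    pub min_luminance: u32, // 0.0001 cd/m² units
    pub max_cll: u16,
    pub max_fall: u16,
}

/// Builds a 25-byte blob: byte 0 is the descriptor type (0 = Type 1),
/// followed by twelve little-endian u16 fields.
pub fn android_static_info_sketch(m: &WireHdrMeta) -> [u8; 25] {
    // Reorder primaries from the wire's G, B, R to R, G, B.
    let [g, b, r] = m.primaries_gbr;
    // Max luminance becomes whole nits; min stays in 0.0001 units. Both saturate at u16::MAX.
    let max_nits = (m.max_luminance / 10_000).min(u32::from(u16::MAX)) as u16;
    let min_fixed = m.min_luminance.min(u32::from(u16::MAX)) as u16;
    let fields: [u16; 12] = [
        r[0], r[1], g[0], g[1], b[0], b[1],
        m.white_point[0], m.white_point[1],
        max_nits, min_fixed, m.max_cll, m.max_fall,
    ];
    let mut out = [0u8; 25];
    for (i, v) in fields.iter().enumerate() {
        out[1 + 2 * i..3 + 2 * i].copy_from_slice(&v.to_le_bytes());
    }
    out
}
```

The Windows remap has the same shape (reorder to R,G,B, divide the max luminance down to nits), only the target is a DXGI struct rather than a byte blob.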
Step 3 — Display-capability gate (landed; CI/on-glass validation pending)
The common-case correctness step — most client displays are SDR. Chosen approach: capability-gate (not an in-shader BT.2390 tone-map). Rationale: with Steps 1–2 the host sends correct mastering metadata, so an HDR display self-tone-maps from it; the real remaining gap is SDR displays, best fixed by not advertising HDR you can't present — the host then sends a proper BT.709 SDR stream instead of PQ the panel would mis-tone-map (washed-out/dark). No guessed tone-map curve, deterministic.
- Windows (`present::display_supports_hdr` via DXGI: any `IDXGIOutput6` colour space == `G2084`): `session.rs` ANDs it with the user's HDR setting before advertising caps; logs when it drops to SDR.
- Apple (`SessionModel`, main-actor): `NSScreen.maximumExtendedDynamicRangeColorComponentValue > 1` (macOS) / `UIScreen.main.potentialEDRHeadroom > 1` (iOS) ANDed with `hdrEnabled`.
- Android (`Settings.displaySupportsHdr` via `Display.getHdrCapabilities` HDR10/HDR10+): Kotlin passes it to `nativeConnect`; `session.rs` gates the caps on the new `hdr_enabled` jboolean (cargo-ndk-verified).
- Deferred (need on-glass / the RTX box): the mid-session `Reconfigure` "downgrade to SDR" for a monitor move HDR↔SDR; and confirming the host produces SDR for an SDR client even off an HDR desktop. On the native path the per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so it should hold end-to-end; verify the stale-HDR-SudoVDA edge case on the RTX box.
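All three gates reduce to the same decision: advertise HDR caps only when the user setting and the display capability are both true. A minimal sketch, where the bit values of the two cap flags and the function name are assumptions for illustration:

```rust
// Hypothetical bit assignments; the real values live in the shared core.
pub const VIDEO_CAP_10BIT: u8 = 1 << 0;
pub const VIDEO_CAP_HDR: u8 = 1 << 1;

/// Returns the caps to advertise in Hello. HDR is offered only if the user
/// enabled it AND the display can present it; otherwise the host is asked
/// for a plain SDR stream.
pub fn gated_video_caps(user_hdr_enabled: bool, display_supports_hdr: bool) -> u8 {
    if user_hdr_enabled && display_supports_hdr {
        VIDEO_CAP_10BIT | VIDEO_CAP_HDR
    } else {
        if user_hdr_enabled {
            // Keep the downgrade observable, mirroring the "logs the SDR drop" behaviour.
            eprintln!("HDR enabled by user but display is SDR; advertising SDR caps");
        }
        0
    }
}
```

Keeping the platform probe (DXGI, NSScreen/UIScreen, `Display.getHdrCapabilities`) outside this function leaves the gate itself trivially unit-testable without a display.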
Step 4 — Linux (last; capture blocked upstream)
- 8-bit→Main10 NVENC upconvert shim (`encode/linux.rs`): Main10 transport with correct VUI/SEI without HDR capture (gate so we don't claim HDR transfer on SDR content).
- Linux encode color + side-data (the deferred Step 1c): set `color_primaries`/`trc`/`colorspace`/`range` from the negotiated `ColorInfo` and attach `AV_FRAME_DATA_MASTERING_DISPLAY_METADATA`/`CONTENT_LIGHT_LEVEL` side-data (with the `AVRational` conversion) in `encode/linux.rs` + `vaapi.rs`, only once the encoder actually produces 10-bit, so the signalling matches the bits.
- True 10-bit capture: offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot on Sway/wlroots; track gamescope #2126. Don't block the rest of the plan on it.
- Linux client: `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome, `wp_color_management` when GTK ≥ 4.14.
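The `AVRational` conversion mentioned for the side-data path is small enough to sketch without FFmpeg bindings. `Rational` below stands in for `AVRational` (a numerator/denominator pair of C ints) and `MasteringRationals` for the relevant fields of `AVMasteringDisplayMetadata`, whose primaries are stored in R,G,B order; both type names and the function are illustrative.

```rust
/// Stand-in for libavutil's AVRational.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Rational {
    pub num: i32,
    pub den: i32,
}

/// Stand-in for the fields of AVMasteringDisplayMetadata used here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MasteringRationals {
    pub display_primaries_rgb: [[Rational; 2]; 3],
    pub white_point: [Rational; 2],
    pub min_luminance: Rational,
    pub max_luminance: Rational,
}

// Chromaticity fixed-point is value / 50000.
fn chroma(fixed: u16) -> Rational {
    Rational { num: i32::from(fixed), den: 50_000 }
}

// Luminance fixed-point is value / 10000 cd/m²; saturate to fit a C int.
fn luminance(fixed: u32) -> Rational {
    Rational { num: fixed.min(i32::MAX as u32) as i32, den: 10_000 }
}

/// Converts wire-form fixed-point (G,B,R order) to rationals in R,G,B order.
pub fn to_mastering_rationals(
    primaries_gbr: [[u16; 2]; 3],
    white_point: [u16; 2],
    max_lum: u32,
    min_lum: u32,
) -> MasteringRationals {
    let [g, b, r] = primaries_gbr;
    MasteringRationals {
        display_primaries_rgb: [r, g, b].map(|p| p.map(chroma)),
        white_point: white_point.map(chroma),
        min_luminance: luminance(min_lum),
        max_luminance: luminance(max_lum),
    }
}
```

Keeping the fixed-point value as the numerator and the unit as the denominator avoids any float round trip, so the encoder can reproduce the original SEI values exactly.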
Quick wins (independent, land in parallel)
- `connect_ex5` + fix `abi.rs:896`: resurrects Apple's pipeline (Step 0).
- H.264 VUI + AV1 `color_config` on `nvenc.rs`: closes two latent blockers (Windows-only, validated in CI / on the RTX box).
- `probe` logs `bit_depth` + colorimetry: observability for every later round-trip assertion.
- Linux client BT.601→BT.709 sws + texture `color_state`: standalone SDR correctness bug.
- GameStream silent-downgrade already warns (`rtsp.rs:289`); keep observable.
Open questions
- MaxCLL source: `GetDesc1` doesn't expose it (Apollo zeroes). Static default, or measure per-frame peak in the PQ shader (only truly-correct, adds a readback)?
- GameStream: implement `SS_HDR_METADATA` for Moonlight parity, or keep it deliberately SDR and steer HDR users to punktfunk/1?
- HLG: carry the enum from day one (free), but do any sources actually produce HLG?
- Linux: is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk advertising HDR we can't truly deliver?
Ordering rationale
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is read directly by decoders, so it fixes correctness even before our clients consume the protocol, and gives an Apollo-parity on-glass gate. Steps 2–3 are mechanical per-client wiring once metadata flows. Linux is last because capture is gated on upstream we don't control; the shim delivers Main10 transport without that dependency.
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 2–3 = a real HDR display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.