Files
punktfunk/docs/hdr-pipeline-plan.md
T
enricobuehler 551012bb43 feat(clients): HDR Steps 2-3 — apply mastering metadata + display capability-gate
Continues docs/hdr-pipeline-plan.md. Steps 0/1 + Step 2 (Windows/Android) already
landed in 3526517; this is Step 2 (Apple) + Step 3 (all clients). Client-only — no
core/host/ABI change (the 0xCE/next_hdr_meta/color_info surfaces shipped in Step 0).

Step 2 — clients APPLY the host's HDR metadata (each remaps from the wire form: ST.2086
G,B,R order, mastering luminance in 0.0001 cd/m2):
- Apple: connect via punktfunk_connect_ex5 (resurrects the previously-dead HDR pipeline);
  nextHdrMeta/colorInfo wrappers + HdrMeta SEI-blob builders; the pump drains nextHdrMeta
  -> VideoDecoder.setHdrMeta -> CVBufferSetAttachment of MasteringDisplayColorVolume (24B
  BE) + ContentLightLevelInfo (4B BE) on each HDR pixel buffer (correct for the
  itur_2100_PQ layer; CAEDRMetadata avoided as ambiguous there).

Step 3 — capability-gate: advertise HDR caps ONLY when the display can present it, so an
SDR display gets a proper BT.709 stream instead of PQ it would mis-tone-map; an HDR
display self-tone-maps from the Step-1/2 mastering metadata.
- Windows: present::display_supports_hdr() (DXGI any IDXGIOutput6 colour space == G2084),
  ANDed with the user HDR setting in session.rs; logs the SDR drop.
- Apple: NSScreen.maximumExtendedDynamicRangeColorComponentValue>1 (macOS) /
  UIScreen.main.potentialEDRHeadroom>1 (iOS) in SessionModel.
- Android: Settings.displaySupportsHdr (Display.getHdrCapabilities HDR10/HDR10+) passed
  through a new hdr_enabled jboolean on nativeConnect; session.rs gates the caps.

Validation: Android native (incl. the jboolean gate) builds + clippy clean via cargo-ndk;
fmt clean. Windows (MSVC), Apple (Swift) and the Kotlin side are CI/on-glass validated —
not compilable on the Linux dev box. Deferred to the RTX box: mid-session Reconfigure
SDR-downgrade on monitor move, and confirming the host emits SDR for an SDR client off an
HDR desktop.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:46:58 +00:00

271 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# HDR pipeline — investigation & implementation plan
Goal: **true, correct HDR glass-to-glass** for punktfunk, across the host (Windows today;
Linux blocked upstream) and every client (Windows / Apple / Android / Linux).
This is an audit of the current state, the gap list, and a phased plan. It was produced from
a full read of every HDR-touching subsystem cross-checked against the HDR10 standards
(CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the
Sunshine/Apollo/Moonlight reference implementation.
> Status legend: **blocker** (HDR can't work) · **correctness** (HDR works but looks wrong) ·
> **quality** (correct-ish, missing refinement) · **ok**.
---
## TL;DR
Our HDR is **correct in isolated islands but broken end-to-end.** The pixel math and the HEVC
VUI we *do* emit are right (self-test validated, matches Apollo). What's missing is the
**metadata chain**: nothing measures, signals, transports, or applies the *static HDR metadata*
(mastering display colour volume + content light level) that tells a display how to tone-map —
so every client hardcodes generic values or infers from the bitstream, and one line
(`abi.rs:896`, `video_caps = 0`) makes the entire (correct) Apple HDR pipeline dead code.
---
## What's already correct (the islands)
| Stage | Where |
|---|---|
| Windows host HEVC **VUI** — primaries=9 (BT.2020) / transfer=16 (PQ) / matrix=9 (BT.2020-NCL) / limited range | `encode/nvenc.rs:307-316` |
| Windows host **scRGB→BT.2020 PQ** shader (×80 nits → BT.709→2020 → ST.2084 OETF, 10000-nit peak) | `capture/dxgi.rs` — self-test `<1` code error, matches Apollo |
| Windows client **P010 decode + YUV→RGB** (BT.2020-NCL, limited→full) + **R10G10B10A2 / G2084-P2020 swapchain** | `present.rs:66-77, 320-370` |
| Android client **Main10 decode + reactive DataSpace** (BT2020-PQ/HLG) | `decode.rs:210-227` |
| Apple client decode/present **code** (P010 VideoToolbox, BT.2020 PQ Metal, `itur_2100_PQ` + EDR) | correct — but never runs (blocker #2) |
## Gap list
### Blockers
1. **No color-metadata transport in the protocol** *(the keystone).* The wire carries only
`Hello.video_caps` (10BIT/HDR bits) and `Welcome.bit_depth` (8/10) — `quic.rs:127-128`
explicitly defers color. No primaries/transfer/matrix/range, **no ST.2086 mastering, no
MaxCLL/MaxFALL**. ST.2086/CLL host→client is impossible by construction today.
2. **C ABI hardcodes `video_caps = 0`** (`abi.rs:896`) → Apple's complete HDR pipeline is dead
code; no ABI embedder can request HDR. One-line root cause.
3. **H.264 and AV1 emit zero color signaling on Windows** — the `if self.hdr` VUI block in
`nvenc.rs` only writes `hevcConfig`. Any H.264+10-bit or AV1+HDR stream decodes as BT.709 SDR.
*(AV1 is **not** a "copy the HEVC VUI" fix — AV1 has no VUI/SEI; it carries
primaries/transfer/matrix in the sequence-header `color_config` and mastering/CLL in
**METADATA OBUs** `HDR_MDCV`/`HDR_CLL`. Verify whether NVENC's AV1 path accepts them.)*
4. **Linux host is 8-bit only end to end** — capture offers only 8-bit PipeWire formats
(`capture/linux.rs:443-453, 594-654`; gamescope #2126, portals don't wire PipeWire 1.6
BT.2020/PQ); encode downgrades 10-bit (`encode/linux.rs:153-162` TODO, `vaapi.rs:719`) with
BT.709 hardcoded. The Windows-style 8-bit→Main10 upconvert shim is not implemented here.
5. **Linux client HDR is a complete non-feature**`video_caps=0`, P010 decode path dead
(`video.rs:379`), CICP hardcoded BT.709 (`ui_stream.rs:239-243`), no Wayland
color-management (GTK4 0.11 too old).
### Correctness
6. **No host ever emits the ST.2086 mastering or CEA-861.3 CLL SEI.** Windows never reads
`IDXGIOutput6::GetDesc1`; `nvenc.rs` never builds an `NV_ENC_SEI_PAYLOAD`; Linux attaches no
libavcodec `side_data`. Apollo reads `GetDesc1` and attaches it.
7. **Clients hardcode mastering metadata.** `present.rs:584-595` ships fixed
`1000-nit / MaxCLL 1000 / MaxFALL 400` (with the literal "the protocol doesn't carry the
stream's real mastering metadata yet" comment). Apple/Android set none.
8. **HDR→SDR tone-mapping is unaddressed — and it's the common case.** Most client displays are
SDR. No client queries display peak; silent `SetColorSpace1`/`SetHDRMetaData` failures present
PQ as SDR gamma (crushed/dark). We lean entirely on OS auto-fallback.
9. **Windows secure desktop drops HDR to SDR** on lock/UAC (`dxgi.rs:325-368`,
`sudovda.rs:234-277`).
10. **GameStream silently streams SDR** on a Moonlight HDR request (`mod.rs:48-56`,
`rtsp.rs:288-293`) — logged, but no negotiated error. Real Apollo parity needs the Moonlight
`SS_HDR_METADATA` blob on the **ENet control channel**, not just in-band.
11. **Linux client software path is color-wrong even for SDR** — BT.601 applied to BT.709
(`video.rs:162-167`, no `color_state` on the texture). Standalone bug.
### Quality
12. No per-content MaxCLL/MaxFALL (`GetDesc1` doesn't expose it). No encoder-CSC-range vs
signaled-range reconciliation (black-crush risk). No automated 10-bit test — `probe` never
even reads `Welcome.bit_depth` (`main.rs:396-406`).
### Out of scope (call out, don't build)
- Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle *static* ST.2086 only,
with mid-stream changes carried by re-sending the static block (below).
- HLG: the colorimetry transfer enum carries `18` from day one (free), but the `0xCE` mastering
datagram is **omitted for HLG** (scene-referred, no mastering metadata).
---
## Protocol design (the keystone — pure-additive, hardware-free, CI-testable)
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
### (A) Per-session colorimetry — 4 trailing bytes on `Welcome`
After the existing `bit_depth` (offset 59), append a fixed 4-byte CICP block at offsets 60..64.
(A future mirror on `Reconfigured` will announce a mid-stream SDR↔HDR / BT.709↔BT.2020 flip on the
control stream we already use for renegotiation — deferred to Step 1 with the mid-stream-flip work;
today a mode switch never changes the colour, and the `0xCE` re-send covers mastering changes.)
```
[60] colour_primaries (CICP: 1=BT.709, 9=BT.2020)
[61] transfer_characteristics (1=BT.709, 16=PQ/SMPTE2084, 18=HLG)
[62] matrix_coeffs (1=BT.709, 9=BT.2020-NCL) ← never emit 10 (CL): no client decodes it
[63] video_full_range_flag (0=limited, 1=full)
```
Decode with `b.get(60).unwrap_or(1)` etc. — an older host omits them → BT.709 limited SDR
(today's behavior). `Welcome` stays `Copy`. Modeled as a `ColorInfo` struct on the wire types
and exposed on `NativeClient` (with `bit_depth`) so clients *know* the colorimetry instead of
inferring it.
### (B) Per-change mastering + CLL — a new host→client datagram, tag `0xCE`
ST.2086 is variable and changes mid-stream, so it rides a datagram (next tag after `0xCD`
HIDOUT), demuxed in `client.rs` like AUDIO/RUMBLE/HIDOUT. 28 bytes, standard SEI fixed-point:
```
[0] = 0xCE
G.x G.y B.x B.y R.x R.y 6 × u16 LE display primaries, 1/50000 units
wp.x wp.y 2 × u16 LE white point, 1/50000 units
max_display_mastering_luminance u32 LE 0.0001 cd/m²
min_display_mastering_luminance u32 LE 0.0001 cd/m²
max_cll u16 LE nits
max_fall u16 LE nits
```
- Sent on session start and whenever `GetDesc1`/source mastering changes; **re-sent on every
IDR/RFI keyframe** so a client that dropped the (best-effort) datagram converges within a GOP.
Until first receipt the client uses the Welcome transfer + a documented generic default.
- **Bounds-check length before reading** (reassembler-bounds security invariant) — truncation
test required.
- **Omitted entirely for HLG.**
- Units note: these map straight to DXGI `DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`,
and Apple `CAEDRMetadata.hdr10`. On the **libavcodec/Linux** side they need conversion —
`AVMasteringDisplayMetadata` stores `AVRational`, not raw fixed-point.
### (C) C ABI
- `punktfunk_connect_ex5(... video_caps: u8)` (ex4 delegates with 0); **fix `abi.rs:896`.**
- `punktfunk_connection_next_hdr_meta(c, *mut PunktfunkHdrMeta, timeout_ms)` — new plane,
one-puller contract like `next_audio`.
- `punktfunk_connection_color_info(c, *mut prim, *mut trc, *mut matrix, *mut range, *mut bit_depth)`.
- Regenerate `include/punktfunk_core.h` (cbindgen); `struct_size`/repr(C) guards on new structs.
---
## Phases
### Step 0 — Protocol + ABI carry color metadata end to end *(this change)*
The dominant cross-cutting blocker; everything else is downstream. No rendering changes, no
hardware, CI-testable.
- **core:** `ColorInfo` + 4 Welcome bytes; `HdrMeta` + `0xCE` codec (bounds-checked);
`NativeClient` `color`/`bit_depth` fields + HdrMeta receiver + demux + `next_hdr_meta`.
- **C ABI:** `connect_ex5`, `next_hdr_meta`, `color_info`, fix caps=0; regen header.
- **host:** populate `Welcome.color` from the negotiated bit-depth/HDR decision; send a `0xCE`
(generic default for now) when HDR is negotiated.
- **clients:** Windows/Android inherit the demux via shared core; Apple flips to `ex5`.
- **validation:** `quic.rs` round-trip + truncation + **SDR back-compat** tests; `probe` logs
`bit_depth` + colorimetry; loopback asserts a 10-bit Welcome carries trc=16 and a `0xCE` lands.
### Step 1 — Host emits correct in-band SEI + complete VUI on all codecs *(landed; RTX-validation pending)*
In-band SEI is read directly by decoders, so it fixes correctness even before clients consume
the protocol, and gives an Apollo/Moonlight on-glass parity gate.
- **Single source of truth:** the capturer learns the source display's mastering metadata and
exposes it via `Capturer::hdr_meta() -> Option<HdrMeta>`. The stream loop forwards it to the
encoder (`Encoder::set_hdr_meta` → in-band SEI) **and** the client (real `0xCE`, re-sent on each
keyframe). Pure byte-level logic (float→fixed conversion + the HEVC/H.264 SEI payloads) lives in
the unit-tested, cross-platform `src/hdr.rs` (`hdr_meta_from_display`, `hevc_mastering_display_sei`
type **137**, `hevc_content_light_level_sei` type **144** — note: NOT "type 4", that was a
drafting error).
- **Windows (done, CI-compiled / RTX on-glass pending):** `dxgi.rs` + `wgc.rs` read
`IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0 —
GetDesc1 has none, like Apollo). `nvenc.rs` attaches the mastering + CLL SEI on every IDR for
HEVC/H.264. (AV1 mastering rides METADATA OBUs, not SEI — follow-up; AV1 `color_config` already
lands in Step 0's quick win.)
- **Linux encode-ready — DEFERRED into Step 4:** Linux capture is 8-bit only, so signalling
BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be *incorrect*.
The libavcodec `side_data` path (with the `AVRational` conversion) lands together with the
8-bit→Main10 shim / true 10-bit capture in Step 4.
- **Windows secure-desktop relay** (`virtual_stream_relay`) still sends only the generic baseline
`0xCE`; the helper's in-band SEI carries the real grade. Wiring the relay's `0xCE` is a follow-up.
- **validation (RTX box):** `ffprobe -show_frames` shows mastering + CLL side-data with the
display's real luminance and VUI 9/16/9; stock Moonlight shows correct (not washed-out) HDR.
Add **encoder-CSC-range == signaled-range** check.
### Step 2 — Clients apply the metadata *(landed; CI/on-glass validation pending)*
All three clients now drain the protocol's `HdrMeta` (`next_hdr_meta` / `nextHdrMeta`) and apply it,
each remapping from the wire form (ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m²) to the
platform's expected layout:
- **Windows (Rust, CI-compiled):** session pump drains `next_hdr_meta` into a `LATEST_HDR_META`
slot; `present_newest` applies it via `Presenter::set_hdr_metadata` → real `SetHDRMetaData`
(`hdr_meta_to_dxgi`: G,B,R→R,G,B reorder, 0.0001-nit→nit for `MaxMasteringLuminance`), dropping
the 1000/1000/400 hardcode. `SetColorSpace1`/`SetHDRMetaData` failures + an SDR-display
colour-space rejection are now **logged**, not swallowed.
- **Apple (Swift, mac-runner CI):** connect now advertises caps via `punktfunk_connect_ex5`
(`SessionModel` computes `videoCap10Bit|videoCapHDR` from `hdrEnabled`) — *this is the fix that
resurrects Apple's previously-dead HDR pipeline*. `nextHdrMeta`/`colorInfo` wrappers added; the
pump drains `nextHdrMeta``VideoDecoder.setHdrMeta``CVBufferSetAttachment` of
`kCVImageBufferMasteringDisplayColorVolumeKey` (24-byte BE SEI) +
`kCVImageBufferContentLightLevelInfoKey` (4-byte BE) on each HDR pixel buffer (the correct path
for the itur_2100_PQ layer; `CAEDRMetadata` on a PQ layer is ambiguous and was avoided).
- **Android (Rust `decode.rs`, cargo-ndk verified):** when `client.color.is_hdr()`, drain the first
`next_hdr_meta` and set `MediaFormat` `hdr-static-info` (`KEY_HDR_STATIC_INFO`) before
`configure()``android_hdr_static_info` builds the 25-byte CTA-861.3 Type-1 blob (LE, **R,G,B**
order, max-lum in **nits-u16**). `Display.getHdrCapabilities` gate deferred (the Surface DataSpace
already drives SurfaceFlinger tone-mapping on non-HDR displays).
### Step 3 — Display-capability gate *(landed; CI/on-glass validation pending)*
The common-case correctness step — most client displays are SDR. **Chosen approach: capability-gate**
(not an in-shader BT.2390 tone-map). Rationale: with Steps 12 the host sends *correct* mastering
metadata, so an HDR display self-tone-maps from it; the real remaining gap is SDR displays, best
fixed by **not advertising HDR you can't present** — the host then sends a proper BT.709 SDR stream
instead of PQ the panel would mis-tone-map (washed-out/dark). No guessed tone-map curve, deterministic.
- **Windows** (`present::display_supports_hdr` via DXGI: any `IDXGIOutput6` colour space ==
`G2084`): `session.rs` ANDs it with the user's HDR setting before advertising caps; logs when it
drops to SDR.
- **Apple** (`SessionModel`, main-actor): `NSScreen.maximumExtendedDynamicRangeColorComponentValue
> 1` (macOS) / `UIScreen.main.potentialEDRHeadroom > 1` (iOS) ANDed with `hdrEnabled`.
- **Android** (`Settings.displaySupportsHdr` via `Display.getHdrCapabilities` HDR10/HDR10+): Kotlin
passes it to `nativeConnect`; `session.rs` gates the caps on the new `hdr_enabled` jboolean
(cargo-ndk-verified).
- **Deferred** (need on-glass / the RTX box): the **mid-session `Reconfigure` "downgrade to SDR"**
for a monitor move HDR↔SDR; and confirming the **host produces SDR for an SDR client even off an
HDR desktop** — on the native path the per-session SudoVDA follows the negotiated depth (SDR
client → SDR virtual display → SDR stream), so it should hold end-to-end; verify the
stale-HDR-SudoVDA edge case on the RTX box.
### Step 4 — Linux (last; capture blocked upstream)
- **8-bit→Main10 NVENC upconvert shim** (`encode/linux.rs`) — Main10 transport with correct
VUI/SEI without HDR capture (gate so we don't claim HDR transfer on SDR content).
- **Linux encode color + side-data (the deferred Step 1c):** set
`color_primaries/trc/colorspace/range` from the negotiated `ColorInfo` and attach
`AV_FRAME_DATA_MASTERING_DISPLAY_METADATA` / `CONTENT_LIGHT_LEVEL` side-data (with the
`AVRational` conversion) in `encode/linux.rs` + `vaapi.rs` — only once the encoder actually
produces 10-bit, so the signalling matches the bits.
- True 10-bit capture: offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot on
Sway/wlroots; track gamescope #2126. **Don't block the rest of the plan on it.**
- Linux client: `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome,
`wp_color_management` when GTK ≥ 4.14.
## Quick wins (independent, land in parallel)
1. `connect_ex5` + fix `abi.rs:896` — resurrects Apple's pipeline *(Step 0)*.
2. H.264 VUI + AV1 `color_config` on `nvenc.rs` — closes two latent blockers *(Windows-only,
validated in CI / on the RTX box)*.
3. `probe` logs `bit_depth` + colorimetry — observability for every later round-trip assertion.
4. Linux client BT.601→BT.709 sws + texture `color_state` — standalone SDR correctness bug.
5. GameStream silent-downgrade already warns (`rtsp.rs:289`) — keep observable.
## Open questions
- **MaxCLL source:** `GetDesc1` doesn't expose it (Apollo zeroes). Static default, or measure
per-frame peak in the PQ shader (only truly-correct, adds a readback)?
- **GameStream:** implement `SS_HDR_METADATA` for Moonlight parity, or keep it deliberately SDR
and steer HDR users to punktfunk/1?
- **HLG:** carry the enum from day one (free) — but do any sources actually produce HLG?
- **Linux:** is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it
risk advertising HDR we can't truly deliver?
## Ordering rationale
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI line
is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is read directly by
decoders, so it fixes correctness even before our clients consume the protocol, and gives an
Apollo-parity on-glass gate. Steps 23 are mechanical per-client wiring once metadata flows.
Linux is last because capture is gated on upstream we don't control; the shim delivers Main10
transport without that dependency.
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 23 = a real HDR
display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.