Files
punktfunk/docs/hdr-pipeline-plan.md
T
enricobuehler 3526517eb1
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 57s
ci / web (push) Successful in 39s
ci / docs-site (push) Successful in 38s
windows-host / package (push) Successful in 3m26s
android / android (push) Successful in 3m40s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m24s
deb / build-publish (push) Successful in 2m10s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m22s
decky / build-publish (push) Successful in 25s
ci / bench (push) Successful in 4m44s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 1m4s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m7s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m45s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 30s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
flatpak / build-publish (push) Successful in 4m17s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m30s
docker / deploy-docs (push) Successful in 23s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
feat: HDR Step-0 colour-metadata transport + security-audit hardening
Two strands, entangled in punktfunk1.rs, committed together (one builds-green tree).

HDR pipeline Step 0 — glass-to-glass colour-metadata transport (docs/hdr-pipeline-plan.md):
- Protocol/ABI: ColorInfo on the Welcome + a 0xCE HdrMeta datagram carry the source colour
  space + HDR10 static mastering metadata (quic.rs, abi.rs connect_ex5 fixing caps=0).
- New platform-independent, unit-tested HDR static-metadata helpers (hdr.rs): chromaticities
  (1/50000), mastering luminance (0.0001 cd/m2), MaxCLL/MaxFALL in HDR10/ST.2086 units.
- Capture/encode hooks (capture.rs, encode.rs set_hdr_meta) + Linux client / probe plumbing.

Security-audit hardening — top 3 from docs/security-review.md, each adversarially verified:
- #1 [HIGH] Secret file permissions. The host key.pem/cert.pem and both trust stores are now
  written owner-only: 0600 + dir 0700 on Unix (mirrors mgmt_token), best-effort
  SYSTEM/Administrators/OWNER-only icacls DACL on Windows (%ProgramData% is Users-readable).
  Closes a local key-disclosure -> host-impersonation gap. New gamestream::{create_private_dir,
  write_secret_file} + a 0600 regression test.
- #2 [HIGH] Native SPAKE2 PIN is single-use. The PIN is consumed the moment the host sends its
  key-confirmation (which lets the client test its one guess), before reading the proof, so any
  completed attempt -- right OR wrong -- disarms the window. A wrong PIN isn't observable
  host-side (the client aborts before sending its proof), so consuming on first attempt is what
  delivers the documented "one online guess" instead of an unbounded brute-force of the static
  4-digit PIN. Test verifies single-use.
- #3 [MEDIUM] RTSP packetSize is bounded ([64,2048] in stream_config) and VideoPacketizer::new
  uses saturating .max(1), killing a PRE-AUTH div-by-zero/underflow panic of the video thread.
  Tests for {0,15,16,17} + out-of-range rejection.

fmt + clippy -D warnings clean; full workspace test suite green (93 host tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:07:59 +00:00

244 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# HDR pipeline — investigation & implementation plan
Goal: **true, correct HDR glass-to-glass** for punktfunk, across the host (Windows today;
Linux blocked upstream) and every client (Windows / Apple / Android / Linux).
This is an audit of the current state, the gap list, and a phased plan. It was produced from
a full read of every HDR-touching subsystem cross-checked against the HDR10 standards
(CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the
Sunshine/Apollo/Moonlight reference implementation.
> Status legend: **blocker** (HDR can't work) · **correctness** (HDR works but looks wrong) ·
> **quality** (correct-ish, missing refinement) · **ok**.
---
## TL;DR
Our HDR is **correct in isolated islands but broken end-to-end.** The pixel math and the HEVC
VUI we *do* emit are right (self-test validated, matches Apollo). What's missing is the
**metadata chain**: nothing measures, signals, transports, or applies the *static HDR metadata*
(mastering display colour volume + content light level) that tells a display how to tone-map —
so every client hardcodes generic values or infers from the bitstream, and one line
(`abi.rs:896`, `video_caps = 0`) makes the entire (correct) Apple HDR pipeline dead code.
---
## What's already correct (the islands)
| Stage | Where |
|---|---|
| Windows host HEVC **VUI** — primaries=9 (BT.2020) / transfer=16 (PQ) / matrix=9 (BT.2020-NCL) / limited range | `encode/nvenc.rs:307-316` |
| Windows host **scRGB→BT.2020 PQ** shader (×80 nits → BT.709→2020 → ST.2084 OETF, 10000-nit peak) | `capture/dxgi.rs` — self-test `<1` code error, matches Apollo |
| Windows client **P010 decode + YUV→RGB** (BT.2020-NCL, limited→full) + **R10G10B10A2 / G2084-P2020 swapchain** | `present.rs:66-77, 320-370` |
| Android client **Main10 decode + reactive DataSpace** (BT2020-PQ/HLG) | `decode.rs:210-227` |
| Apple client decode/present **code** (P010 VideoToolbox, BT.2020 PQ Metal, `itur_2100_PQ` + EDR) | correct — but never runs (blocker #2) |
## Gap list
### Blockers
1. **No color-metadata transport in the protocol** *(the keystone).* The wire carries only
`Hello.video_caps` (10BIT/HDR bits) and `Welcome.bit_depth` (8/10) — `quic.rs:127-128`
explicitly defers color. No primaries/transfer/matrix/range, **no ST.2086 mastering, no
MaxCLL/MaxFALL**. ST.2086/CLL host→client is impossible by construction today.
2. **C ABI hardcodes `video_caps = 0`** (`abi.rs:896`) → Apple's complete HDR pipeline is dead
code; no ABI embedder can request HDR. One-line root cause.
3. **H.264 and AV1 emit zero color signaling on Windows** — the `if self.hdr` VUI block in
`nvenc.rs` only writes `hevcConfig`. Any H.264+10-bit or AV1+HDR stream decodes as BT.709 SDR.
*(AV1 is **not** a "copy the HEVC VUI" fix — AV1 has no VUI/SEI; it carries
primaries/transfer/matrix in the sequence-header `color_config` and mastering/CLL in
**METADATA OBUs** `HDR_MDCV`/`HDR_CLL`. Verify whether NVENC's AV1 path accepts them.)*
4. **Linux host is 8-bit only end to end** — capture offers only 8-bit PipeWire formats
(`capture/linux.rs:443-453, 594-654`; gamescope #2126, portals don't wire PipeWire 1.6
BT.2020/PQ); encode downgrades 10-bit (`encode/linux.rs:153-162` TODO, `vaapi.rs:719`) with
BT.709 hardcoded. The Windows-style 8-bit→Main10 upconvert shim is not implemented here.
5. **Linux client HDR is a complete non-feature**`video_caps=0`, P010 decode path dead
(`video.rs:379`), CICP hardcoded BT.709 (`ui_stream.rs:239-243`), no Wayland
color-management (GTK4 0.11 too old).
### Correctness
6. **No host ever emits the ST.2086 mastering or CEA-861.3 CLL SEI.** Windows never reads
`IDXGIOutput6::GetDesc1`; `nvenc.rs` never builds an `NV_ENC_SEI_PAYLOAD`; Linux attaches no
libavcodec `side_data`. Apollo reads `GetDesc1` and attaches it.
7. **Clients hardcode mastering metadata.** `present.rs:584-595` ships fixed
`1000-nit / MaxCLL 1000 / MaxFALL 400` (with the literal "the protocol doesn't carry the
stream's real mastering metadata yet" comment). Apple/Android set none.
8. **HDR→SDR tone-mapping is unaddressed — and it's the common case.** Most client displays are
SDR. No client queries display peak; silent `SetColorSpace1`/`SetHDRMetaData` failures present
PQ as SDR gamma (crushed/dark). We lean entirely on OS auto-fallback.
9. **Windows secure desktop drops HDR to SDR** on lock/UAC (`dxgi.rs:325-368`,
`sudovda.rs:234-277`).
10. **GameStream silently streams SDR** on a Moonlight HDR request (`mod.rs:48-56`,
`rtsp.rs:288-293`) — logged, but no negotiated error. Real Apollo parity needs the Moonlight
`SS_HDR_METADATA` blob on the **ENet control channel**, not just in-band.
11. **Linux client software path is color-wrong even for SDR** — BT.601 applied to BT.709
(`video.rs:162-167`, no `color_state` on the texture). Standalone bug.
### Quality
12. No per-content MaxCLL/MaxFALL (`GetDesc1` doesn't expose it). No encoder-CSC-range vs
signaled-range reconciliation (black-crush risk). No automated 10-bit test — `probe` never
even reads `Welcome.bit_depth` (`main.rs:396-406`).
### Out of scope (call out, don't build)
- Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle *static* ST.2086 only,
with mid-stream changes carried by re-sending the static block (below).
- HLG: the colorimetry transfer enum carries `18` from day one (free), but the `0xCE` mastering
datagram is **omitted for HLG** (scene-referred, no mastering metadata).
---
## Protocol design (the keystone — pure-additive, hardware-free, CI-testable)
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
### (A) Per-session colorimetry — 4 trailing bytes on `Welcome`
After the existing `bit_depth` (offset 59), append a fixed 4-byte CICP block at offsets 60..64.
(A future mirror on `Reconfigured` will announce a mid-stream SDR↔HDR / BT.709↔BT.2020 flip on the
control stream we already use for renegotiation — deferred to Step 1 with the mid-stream-flip work;
today a mode switch never changes the colour, and the `0xCE` re-send covers mastering changes.)
```
[60] colour_primaries (CICP: 1=BT.709, 9=BT.2020)
[61] transfer_characteristics (1=BT.709, 16=PQ/SMPTE2084, 18=HLG)
[62] matrix_coeffs (1=BT.709, 9=BT.2020-NCL) ← never emit 10 (CL): no client decodes it
[63] video_full_range_flag (0=limited, 1=full)
```
Decode with `b.get(60).unwrap_or(1)` etc. — an older host omits them → BT.709 limited SDR
(today's behavior). `Welcome` stays `Copy`. Modeled as a `ColorInfo` struct on the wire types
and exposed on `NativeClient` (with `bit_depth`) so clients *know* the colorimetry instead of
inferring it.
### (B) Per-change mastering + CLL — a new host→client datagram, tag `0xCE`
ST.2086 is variable and changes mid-stream, so it rides a datagram (next tag after `0xCD`
HIDOUT), demuxed in `client.rs` like AUDIO/RUMBLE/HIDOUT. 28 bytes, standard SEI fixed-point:
```
[0] = 0xCE
G.x G.y B.x B.y R.x R.y 6 × u16 LE display primaries, 1/50000 units
wp.x wp.y 2 × u16 LE white point, 1/50000 units
max_display_mastering_luminance u32 LE 0.0001 cd/m²
min_display_mastering_luminance u32 LE 0.0001 cd/m²
max_cll u16 LE nits
max_fall u16 LE nits
```
- Sent on session start and whenever `GetDesc1`/source mastering changes; **re-sent on every
IDR/RFI keyframe** so a client that dropped the (best-effort) datagram converges within a GOP.
Until first receipt the client uses the Welcome transfer + a documented generic default.
- **Bounds-check length before reading** (reassembler-bounds security invariant) — truncation
test required.
- **Omitted entirely for HLG.**
- Units note: these map straight to DXGI `DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`,
and Apple `CAEDRMetadata.hdr10`. On the **libavcodec/Linux** side they need conversion —
`AVMasteringDisplayMetadata` stores `AVRational`, not raw fixed-point.
### (C) C ABI
- `punktfunk_connect_ex5(... video_caps: u8)` (ex4 delegates with 0); **fix `abi.rs:896`.**
- `punktfunk_connection_next_hdr_meta(c, *mut PunktfunkHdrMeta, timeout_ms)` — new plane,
one-puller contract like `next_audio`.
- `punktfunk_connection_color_info(c, *mut prim, *mut trc, *mut matrix, *mut range, *mut bit_depth)`.
- Regenerate `include/punktfunk_core.h` (cbindgen); `struct_size`/repr(C) guards on new structs.
---
## Phases
### Step 0 — Protocol + ABI carry color metadata end to end *(this change)*
The dominant cross-cutting blocker; everything else is downstream. No rendering changes, no
hardware, CI-testable.
- **core:** `ColorInfo` + 4 Welcome bytes; `HdrMeta` + `0xCE` codec (bounds-checked);
`NativeClient` `color`/`bit_depth` fields + HdrMeta receiver + demux + `next_hdr_meta`.
- **C ABI:** `connect_ex5`, `next_hdr_meta`, `color_info`, fix caps=0; regen header.
- **host:** populate `Welcome.color` from the negotiated bit-depth/HDR decision; send a `0xCE`
(generic default for now) when HDR is negotiated.
- **clients:** Windows/Android inherit the demux via shared core; Apple flips to `ex5`.
- **validation:** `quic.rs` round-trip + truncation + **SDR back-compat** tests; `probe` logs
`bit_depth` + colorimetry; loopback asserts a 10-bit Welcome carries trc=16 and a `0xCE` lands.
### Step 1 — Host emits correct in-band SEI + complete VUI on all codecs *(landed; RTX-validation pending)*
In-band SEI is read directly by decoders, so it fixes correctness even before clients consume
the protocol, and gives an Apollo/Moonlight on-glass parity gate.
- **Single source of truth:** the capturer learns the source display's mastering metadata and
exposes it via `Capturer::hdr_meta() -> Option<HdrMeta>`. The stream loop forwards it to the
encoder (`Encoder::set_hdr_meta` → in-band SEI) **and** the client (real `0xCE`, re-sent on each
keyframe). Pure byte-level logic (float→fixed conversion + the HEVC/H.264 SEI payloads) lives in
the unit-tested, cross-platform `src/hdr.rs` (`hdr_meta_from_display`, `hevc_mastering_display_sei`
type **137**, `hevc_content_light_level_sei` type **144** — note: NOT "type 4", that was a
drafting error).
- **Windows (done, CI-compiled / RTX on-glass pending):** `dxgi.rs` + `wgc.rs` read
`IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0 —
GetDesc1 has none, like Apollo). `nvenc.rs` attaches the mastering + CLL SEI on every IDR for
HEVC/H.264. (AV1 mastering rides METADATA OBUs, not SEI — follow-up; AV1 `color_config` already
lands in Step 0's quick win.)
- **Linux encode-ready — DEFERRED into Step 4:** Linux capture is 8-bit only, so signalling
BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be *incorrect*.
The libavcodec `side_data` path (with the `AVRational` conversion) lands together with the
8-bit→Main10 shim / true 10-bit capture in Step 4.
- **Windows secure-desktop relay** (`virtual_stream_relay`) still sends only the generic baseline
`0xCE`; the helper's in-band SEI carries the real grade. Wiring the relay's `0xCE` is a follow-up.
- **validation (RTX box):** `ffprobe -show_frames` shows mastering + CLL side-data with the
display's real luminance and VUI 9/16/9; stock Moonlight shows correct (not washed-out) HDR.
Add **encoder-CSC-range == signaled-range** check.
### Step 2 — Clients apply the metadata (Windows + Apple + Android, parallelizable)
- **Windows:** feed `hdr10_metadata()` from the received `HdrMeta` (drop the hardcode); **log**
`SetColorSpace1`/`SetHDRMetaData` failures.
- **Apple:** attach `kCVImageBufferMasteringDisplayColorVolumeKey` + `ContentLightLevelInfoKey`
/ `CAEDRMetadata` from `HdrMeta`; CV color attachments from Welcome.
- **Android:** set `MediaFormat KEY_HDR_STATIC_INFO` from `HdrMeta`.
### Step 3 — Display-capability query + client tone-mapping + robust fallback
The common-case correctness step — most displays are SDR.
- **HDR→SDR on every client** (defined BT.2390 EETF / Hable), not silent OS fallback.
- Content-peak > display-peak roll-off (`GetDesc1` / `NSScreen.maximumEDR…` /
`Display.getHdrCapabilities`); explicit SDR fallback when HDR present fails.
- Optional client→host "send me SDR" downgrade as a trailing field on `Reconfigure`.
### Step 4 — Linux (last; capture blocked upstream)
- **8-bit→Main10 NVENC upconvert shim** (`encode/linux.rs`) — Main10 transport with correct
VUI/SEI without HDR capture (gate so we don't claim HDR transfer on SDR content).
- **Linux encode color + side-data (the deferred Step 1c):** set
`color_primaries/trc/colorspace/range` from the negotiated `ColorInfo` and attach
`AV_FRAME_DATA_MASTERING_DISPLAY_METADATA` / `CONTENT_LIGHT_LEVEL` side-data (with the
`AVRational` conversion) in `encode/linux.rs` + `vaapi.rs` — only once the encoder actually
produces 10-bit, so the signalling matches the bits.
- True 10-bit capture: offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot on
Sway/wlroots; track gamescope #2126. **Don't block the rest of the plan on it.**
- Linux client: `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome,
`wp_color_management` when GTK ≥ 4.14.
## Quick wins (independent, land in parallel)
1. `connect_ex5` + fix `abi.rs:896` — resurrects Apple's pipeline *(Step 0)*.
2. H.264 VUI + AV1 `color_config` on `nvenc.rs` — closes two latent blockers *(Windows-only,
validated in CI / on the RTX box)*.
3. `probe` logs `bit_depth` + colorimetry — observability for every later round-trip assertion.
4. Linux client BT.601→BT.709 sws + texture `color_state` — standalone SDR correctness bug.
5. GameStream silent-downgrade already warns (`rtsp.rs:289`) — keep observable.
## Open questions
- **MaxCLL source:** `GetDesc1` doesn't expose it (Apollo zeroes). Static default, or measure
per-frame peak in the PQ shader (only truly-correct, adds a readback)?
- **GameStream:** implement `SS_HDR_METADATA` for Moonlight parity, or keep it deliberately SDR
and steer HDR users to punktfunk/1?
- **HLG:** carry the enum from day one (free) — but do any sources actually produce HLG?
- **Linux:** is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it
risk advertising HDR we can't truly deliver?
## Ordering rationale
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI line
is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is read directly by
decoders, so it fixes correctness even before our clients consume the protocol, and gives an
Apollo-parity on-glass gate. Steps 23 are mechanical per-client wiring once metadata flows.
Linux is last because capture is gated on upstream we don't control; the shim delivers Main10
transport without that dependency.
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 23 = a real HDR
display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.