Files
punktfunk/design/hdr-pipeline-plan.md
T
enricobuehler 7b99b41ede docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00

133 lines
9.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# HDR pipeline — investigation & implementation plan
> **Status:** Steps 03 SHIPPED — protocol/ABI/host in `3526517`, client apply + display
> capability-gate in `551012b`. Windows HDR live-validated; Apple/Android CI-compiled (on-glass
> pending). Step 4 (Linux) is OPEN, blocked upstream on capture. This doc is trimmed to design
> rationale + open items; the shipped code is the source of truth. The original audit (full gap
> list, per-file line refs, blocker walkthroughs) is in git history before this trim.
Goal: **true, correct HDR glass-to-glass** for punktfunk, across the host (Windows today; Linux
blocked upstream) and every client (Windows / Apple / Android / Linux). The plan was produced from a
full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI,
SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference.
The original diagnosis: pixel math + the HEVC VUI we emitted were already correct (self-test
validated, matches Apollo), but nothing **measured, signalled, transported, or applied the static HDR
metadata** (mastering display colour volume + content light level). The fix was a metadata chain,
protocol-first.
## What shipped (Steps 03)
- **Step 0 — protocol + ABI carry colour end to end (`3526517`).** `ColorInfo` (4 CICP bytes on
`Welcome`) + `HdrMeta` (`0xCE` datagram, bounds-checked); `NativeClient` `color`/`bit_depth` fields
+ an `HdrMeta` receiver/demux + `next_hdr_meta`. C ABI: `punktfunk_connect_ex5(... video_caps)`,
`next_hdr_meta`, `color_info`, and **fixed `abi.rs:896` `video_caps = 0`** — the one-line root cause
that had made Apple's complete (and correct) HDR pipeline dead code. Header regenerated. No
rendering changes, CI-testable (round-trip + truncation + SDR back-compat).
- **Step 1 — host in-band SEI + complete VUI (`3526517`, live-validated on the RTX box).**
Cross-platform byte logic in unit-tested `src/hdr.rs`: `hdr_meta_from_display`,
`hevc_mastering_display_sei` SEI **type 137**, `hevc_content_light_level_sei` SEI **type 144**
(note: NOT "type 4" — that was a drafting error). Windows `dxgi.rs`/`wgc.rs` read
`IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0, like
Apollo); `nvenc.rs` attaches mastering + CLL SEI on every IDR for HEVC/H.264 and sends the real
`0xCE` re-sent each keyframe. In-band SEI is read directly by decoders, so this fixed correctness
before clients consumed the protocol and gave an Apollo on-glass parity gate. *Follow-ups:* AV1
mastering rides METADATA OBUs (`HDR_MDCV`/`HDR_CLL`), not SEI; the Windows secure-desktop relay
still sends only the generic baseline `0xCE` (the helper's in-band SEI carries the real grade).
- **Step 2 — clients apply the metadata (`551012b`; Apple/Android CI-compiled).** Each client drains
`next_hdr_meta`/`nextHdrMeta` and remaps from the wire form (ST.2086 **G,B,R** order, mastering
luminance in 0.0001 cd/m²) to the platform layout: **Windows** `SetHDRMetaData`
(`hdr_meta_to_dxgi`: G,B,R→R,G,B reorder, 0.0001-nit→nit), dropping the 1000/1000/400 hardcode;
**Apple** `CVBufferSetAttachment` of `kCVImageBufferMasteringDisplayColorVolumeKey` (24-byte BE) +
`kCVImageBufferContentLightLevelInfoKey` (4-byte BE) per HDR pixel buffer — the correct path for the
`itur_2100_PQ` layer (`CAEDRMetadata` on a PQ layer is ambiguous, deliberately avoided); **Android**
`MediaFormat` `KEY_HDR_STATIC_INFO`, a 25-byte CTA-861.3 Type-1 blob (LE, **R,G,B** order, max-lum
in **nits-u16**). Apple's connect also flips to `connect_ex5` advertising `videoCap10Bit|videoCapHDR`
— the fix that resurrects Apple's previously-dead pipeline.
- **Step 3 — display-capability gate (`551012b`).** **Chosen approach: capability-gate, not an
in-shader BT.2390 tone-map.** Rationale: with Steps 12 the host sends *correct* mastering metadata
so an HDR display self-tone-maps; the remaining gap is SDR displays, best fixed by **not advertising
HDR you can't present** — the host then sends a proper BT.709 SDR stream instead of PQ the panel
mis-tone-maps (washed-out/dark). No guessed curve, deterministic. **Windows**
`display_supports_hdr` (any `IDXGIOutput6` colour space == `G2084`); **Apple**
`NSScreen.maximumExtendedDynamicRangeColorComponentValue > 1` (macOS) /
`UIScreen.main.potentialEDRHeadroom > 1` (iOS); **Android** `Display.getHdrCapabilities`
HDR10/HDR10+. Each ANDs with the user's HDR setting before advertising caps and logs when it drops
to SDR.
### Wire format — design decisions worth keeping
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
- **(A) Per-session colorimetry** — 4 trailing bytes on `Welcome` (offsets 60..64):
`colour_primaries` (1=BT.709, 9=BT.2020) · `transfer_characteristics` (1=BT.709, 16=PQ/SMPTE2084,
18=HLG) · `matrix_coeffs` (1=BT.709, 9=BT.2020-NCL — **never emit 10 (CL): no client decodes it**) ·
`video_full_range_flag`. Decoded with `b.get(60).unwrap_or(1)` so an older host that omits them →
BT.709 limited SDR (today's behaviour). A future mirror on `Reconfigured` announces a mid-stream
SDR↔HDR / BT.709↔BT.2020 flip (deferred; today a mode switch never changes colour, and `0xCE`
re-send covers mastering changes).
- **(B) Per-change mastering + CLL** — host→client datagram tag **`0xCE`**, 28 bytes, standard SEI
fixed-point (display primaries G,B,R + white point in 1/50000 units; max/min mastering luminance in
0.0001 cd/m²; MaxCLL/MaxFALL in nits). ST.2086 is variable, so it rides a datagram rather than the
Welcome. **Re-sent on every IDR/RFI keyframe** so a client that dropped the best-effort datagram
converges within a GOP; until first receipt the client uses the Welcome transfer + a documented
generic default. **Bounds-check length before reading** (reassembler-bounds security invariant —
truncation test required). **Omitted entirely for HLG.** Units map straight to DXGI
`DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`, Apple `CAEDRMetadata.hdr10`; the
libavcodec/Linux side needs conversion — `AVMasteringDisplayMetadata` stores `AVRational`, not raw
fixed-point.
## Out of scope (accepted — call out, don't build)
- **Dynamic metadata:** HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle *static* ST.2086 only,
with mid-stream changes carried by re-sending the static block.
- **HLG:** the transfer enum carries `18` from day one (free), but the `0xCE` mastering datagram is
omitted for HLG (scene-referred, no mastering metadata).
## Step 4 — Linux (last; capture blocked upstream) — OPEN
- **(4a) 8-bit→Main10 NVENC upconvert shim** (`encode/linux.rs`) — Main10 transport with correct
VUI/SEI without HDR capture (gated so we don't claim HDR transfer on SDR content).
- **Linux encode colour + side-data (the deferred Step 1c):** set
`color_primaries/trc/colorspace/range` from the negotiated `ColorInfo` and attach
`AV_FRAME_DATA_MASTERING_DISPLAY_METADATA` / `CONTENT_LIGHT_LEVEL` side-data (with the `AVRational`
conversion) in `encode/linux.rs` + `vaapi.rs` — only once the encoder actually produces 10-bit, so
the signalling matches the bits. (Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching
mastering side-data on a downconverted 8-bit stream would be *incorrect* — hence deferred out of
Step 1.)
- **(4b) True 10-bit capture:** offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot
on Sway/wlroots; **blocked on gamescope #2126** (portals don't wire PipeWire 1.6 BT.2020/PQ).
**Don't block the rest of the plan on it.**
- **(4c) Linux client:** `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome,
`wp_color_management` when GTK ≥ 4.14. (Also a standalone SDR bug: software path applies BT.601 to
BT.709 — needs a BT.601→BT.709 sws + texture `color_state`.)
## Deferred validation (need on-glass / the RTX box)
- The mid-session `Reconfigure` "downgrade to SDR" for a monitor move HDR↔SDR.
- Confirm the **host produces SDR for an SDR client even off an HDR desktop** — on the native path the
per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so
it should hold end to end; verify the stale-HDR-SudoVDA edge case.
## Open questions
- **MaxCLL source:** `GetDesc1` doesn't expose it (Apollo zeroes). Static default, or measure
per-frame peak in the PQ shader (only truly-correct, adds a readback)?
- **GameStream:** implement `SS_HDR_METADATA` (Moonlight `SS_HDR_METADATA` blob on the ENet control
channel) for parity, or keep it deliberately SDR and steer HDR users to punktfunk/1?
- **HLG:** carry the enum from day one (free) — but do any sources actually produce HLG?
- **Linux:** is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk
advertising HDR we can't truly deliver?
## Ordering rationale
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI
`video_caps = 0` line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is
read directly by decoders, so it fixes correctness even before our clients consume the protocol, and
gives an Apollo-parity on-glass gate. Steps 23 are mechanical per-client wiring once metadata flows.
Linux is last because capture is gated on upstream we don't control; the shim delivers Main10
transport without that dependency.
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 23 = a real HDR display
per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.