7b99b41ede
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
133 lines
9.5 KiB
Markdown
133 lines
9.5 KiB
Markdown
# HDR pipeline — investigation & implementation plan
|
||
|
||
> **Status:** Steps 0–3 SHIPPED — protocol/ABI/host in `3526517`, client apply + display
|
||
> capability-gate in `551012b`. Windows HDR live-validated; Apple/Android CI-compiled (on-glass
|
||
> pending). Step 4 (Linux) is OPEN, blocked upstream on capture. This doc is trimmed to design
|
||
> rationale + open items; the shipped code is the source of truth. The original audit (full gap
|
||
> list, per-file line refs, blocker walkthroughs) is in git history before this trim.
|
||
|
||
Goal: **true, correct HDR glass-to-glass** for punktfunk, across the host (Windows today; Linux
|
||
blocked upstream) and every client (Windows / Apple / Android / Linux). The plan was produced from a
|
||
full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI,
|
||
SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference.
|
||
|
||
The original diagnosis: pixel math + the HEVC VUI we emitted were already correct (self-test
|
||
validated, matches Apollo), but nothing **measured, signalled, transported, or applied the static HDR
|
||
metadata** (mastering display colour volume + content light level). The fix was a metadata chain,
|
||
protocol-first.
|
||
|
||
## What shipped (Steps 0–3)
|
||
|
||
- **Step 0 — protocol + ABI carry colour end to end (`3526517`).** `ColorInfo` (4 CICP bytes on
|
||
`Welcome`) + `HdrMeta` (`0xCE` datagram, bounds-checked); `NativeClient` `color`/`bit_depth` fields
|
||
+ an `HdrMeta` receiver/demux + `next_hdr_meta`. C ABI: `punktfunk_connect_ex5(... video_caps)`,
|
||
`next_hdr_meta`, `color_info`, and **fixed `abi.rs:896` `video_caps = 0`** — the one-line root cause
|
||
that had made Apple's complete (and correct) HDR pipeline dead code. Header regenerated. No
|
||
rendering changes, CI-testable (round-trip + truncation + SDR back-compat).
|
||
- **Step 1 — host in-band SEI + complete VUI (`3526517`, live-validated on the RTX box).**
|
||
Cross-platform byte logic in unit-tested `src/hdr.rs`: `hdr_meta_from_display`,
|
||
`hevc_mastering_display_sei` SEI **type 137**, `hevc_content_light_level_sei` SEI **type 144**
|
||
(note: NOT "type 4" — that was a drafting error). Windows `dxgi.rs`/`wgc.rs` read
|
||
`IDXGIOutput6::GetDesc1` at capture init / output change → `HdrMeta` (MaxCLL/MaxFALL left 0, like
|
||
Apollo); `nvenc.rs` attaches mastering + CLL SEI on every IDR for HEVC/H.264 and sends the real
|
||
`0xCE` re-sent each keyframe. In-band SEI is read directly by decoders, so this fixed correctness
|
||
before clients consumed the protocol and gave an Apollo on-glass parity gate. *Follow-ups:* AV1
|
||
mastering rides METADATA OBUs (`HDR_MDCV`/`HDR_CLL`), not SEI; the Windows secure-desktop relay
|
||
still sends only the generic baseline `0xCE` (the helper's in-band SEI carries the real grade).
|
||
- **Step 2 — clients apply the metadata (`551012b`; Apple/Android CI-compiled).** Each client drains
|
||
`next_hdr_meta`/`nextHdrMeta` and remaps from the wire form (ST.2086 **G,B,R** order, mastering
|
||
luminance in 0.0001 cd/m²) to the platform layout: **Windows** `SetHDRMetaData`
|
||
(`hdr_meta_to_dxgi`: G,B,R→R,G,B reorder, 0.0001-nit→nit), dropping the 1000/1000/400 hardcode;
|
||
**Apple** `CVBufferSetAttachment` of `kCVImageBufferMasteringDisplayColorVolumeKey` (24-byte BE) +
|
||
`kCVImageBufferContentLightLevelInfoKey` (4-byte BE) per HDR pixel buffer — the correct path for the
|
||
`itur_2100_PQ` layer (`CAEDRMetadata` on a PQ layer is ambiguous, deliberately avoided); **Android**
|
||
`MediaFormat` `KEY_HDR_STATIC_INFO`, a 25-byte CTA-861.3 Type-1 blob (LE, **R,G,B** order, max-lum
|
||
in **nits-u16**). Apple's connect also flips to `connect_ex5` advertising `videoCap10Bit|videoCapHDR`
|
||
— the fix that resurrects Apple's previously-dead pipeline.
|
||
- **Step 3 — display-capability gate (`551012b`).** **Chosen approach: capability-gate, not an
|
||
in-shader BT.2390 tone-map.** Rationale: with Steps 1–2 the host sends *correct* mastering metadata
|
||
so an HDR display self-tone-maps; the remaining gap is SDR displays, best fixed by **not advertising
|
||
HDR you can't present** — the host then sends a proper BT.709 SDR stream instead of PQ the panel
|
||
mis-tone-maps (washed-out/dark). No guessed curve, deterministic. **Windows**
|
||
`display_supports_hdr` (any `IDXGIOutput6` colour space == `G2084`); **Apple**
|
||
`NSScreen.maximumExtendedDynamicRangeColorComponentValue > 1` (macOS) /
|
||
`UIScreen.main.potentialEDRHeadroom > 1` (iOS); **Android** `Display.getHdrCapabilities`
|
||
HDR10/HDR10+. Each ANDs with the user's HDR setting before advertising caps and logs when it drops
|
||
to SDR.
|
||
|
||
### Wire format — design decisions worth keeping
|
||
|
||
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
|
||
|
||
- **(A) Per-session colorimetry** — 4 trailing bytes on `Welcome` (offsets 60..64):
|
||
`colour_primaries` (1=BT.709, 9=BT.2020) · `transfer_characteristics` (1=BT.709, 16=PQ/SMPTE2084,
|
||
18=HLG) · `matrix_coeffs` (1=BT.709, 9=BT.2020-NCL — **never emit 10 (CL): no client decodes it**) ·
|
||
`video_full_range_flag`. Decoded with `b.get(60).unwrap_or(1)` so an older host that omits them →
|
||
BT.709 limited SDR (today's behaviour). A future mirror on `Reconfigured` announces a mid-stream
|
||
SDR↔HDR / BT.709↔BT.2020 flip (deferred; today a mode switch never changes colour, and `0xCE`
|
||
re-send covers mastering changes).
|
||
- **(B) Per-change mastering + CLL** — host→client datagram tag **`0xCE`**, 28 bytes, standard SEI
|
||
fixed-point (display primaries G,B,R + white point in 1/50000 units; max/min mastering luminance in
|
||
0.0001 cd/m²; MaxCLL/MaxFALL in nits). ST.2086 is variable, so it rides a datagram rather than the
|
||
Welcome. **Re-sent on every IDR/RFI keyframe** so a client that dropped the best-effort datagram
|
||
converges within a GOP; until first receipt the client uses the Welcome transfer + a documented
|
||
generic default. **Bounds-check length before reading** (reassembler-bounds security invariant —
|
||
truncation test required). **Omitted entirely for HLG.** Units map straight to DXGI
|
||
`DXGI_HDR_METADATA_HDR10`, Android `KEY_HDR_STATIC_INFO`, Apple `CAEDRMetadata.hdr10`; the
|
||
libavcodec/Linux side needs conversion — `AVMasteringDisplayMetadata` stores `AVRational`, not raw
|
||
fixed-point.
|
||
|
||
## Out of scope (accepted — call out, don't build)
|
||
|
||
- **Dynamic metadata:** HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle *static* ST.2086 only,
|
||
with mid-stream changes carried by re-sending the static block.
|
||
- **HLG:** the transfer enum carries `18` from day one (free), but the `0xCE` mastering datagram is
|
||
omitted for HLG (scene-referred, no mastering metadata).
|
||
|
||
## Step 4 — Linux (last; capture blocked upstream) — OPEN
|
||
|
||
- **(4a) 8-bit→Main10 NVENC upconvert shim** (`encode/linux.rs`) — Main10 transport with correct
|
||
VUI/SEI without HDR capture (gated so we don't claim HDR transfer on SDR content).
|
||
- **Linux encode colour + side-data (the deferred Step 1c):** set
|
||
`color_primaries/trc/colorspace/range` from the negotiated `ColorInfo` and attach
|
||
`AV_FRAME_DATA_MASTERING_DISPLAY_METADATA` / `CONTENT_LIGHT_LEVEL` side-data (with the `AVRational`
|
||
conversion) in `encode/linux.rs` + `vaapi.rs` — only once the encoder actually produces 10-bit, so
|
||
the signalling matches the bits. (Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching
|
||
mastering side-data on a downconverted 8-bit stream would be *incorrect* — hence deferred out of
|
||
Step 1.)
|
||
- **(4b) True 10-bit capture:** offer `ABGR2101010`/`P010` PipeWire formats + read colorimetry; pilot
|
||
on Sway/wlroots; **blocked on gamescope #2126** (portals don't wire PipeWire 1.6 BT.2020/PQ).
|
||
**Don't block the rest of the plan on it.**
|
||
- **(4c) Linux client:** `ex5` caps, P010 decode, GdkDmabufTexture CICP from Welcome,
|
||
`wp_color_management` when GTK ≥ 4.14. (Also a standalone SDR bug: software path applies BT.601 to
|
||
BT.709 — needs a BT.601→BT.709 sws + texture `color_state`.)
|
||
|
||
## Deferred validation (need on-glass / the RTX box)
|
||
|
||
- The mid-session `Reconfigure` "downgrade to SDR" for a monitor move HDR↔SDR.
|
||
- Confirm the **host produces SDR for an SDR client even off an HDR desktop** — on the native path the
|
||
per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so
|
||
it should hold end to end; verify the stale-HDR-SudoVDA edge case.
|
||
|
||
## Open questions
|
||
|
||
- **MaxCLL source:** `GetDesc1` doesn't expose it (Apollo zeroes). Static default, or measure
|
||
per-frame peak in the PQ shader (only truly-correct, adds a readback)?
|
||
- **GameStream:** implement `SS_HDR_METADATA` (Moonlight `SS_HDR_METADATA` blob on the ENet control
|
||
channel) for parity, or keep it deliberately SDR and steer HDR users to punktfunk/1?
|
||
- **HLG:** carry the enum from day one (free) — but do any sources actually produce HLG?
|
||
- **Linux:** is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk
|
||
advertising HDR we can't truly deliver?
|
||
|
||
## Ordering rationale
|
||
|
||
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI
|
||
`video_caps = 0` line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is
|
||
read directly by decoders, so it fixes correctness even before our clients consume the protocol, and
|
||
gives an Apollo-parity on-glass gate. Steps 2–3 are mechanical per-client wiring once metadata flows.
|
||
Linux is last because capture is gated on upstream we don't control; the shim delivers Main10
|
||
transport without that dependency.
|
||
|
||
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 2–3 = a real HDR display
|
||
per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.
|