Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
9.5 KiB
HDR pipeline — investigation & implementation plan
Status: Steps 0–3 SHIPPED — protocol/ABI/host in
3526517, client apply + display capability-gate in551012b. Windows HDR live-validated; Apple/Android CI-compiled (on-glass pending). Step 4 (Linux) is OPEN, blocked upstream on capture. This doc is trimmed to design rationale + open items; the shipped code is the source of truth. The original audit (full gap list, per-file line refs, blocker walkthroughs) is in git history before this trim.
Goal: true, correct HDR glass-to-glass for punktfunk, across the host (Windows today; Linux blocked upstream) and every client (Windows / Apple / Android / Linux). The plan was produced from a full read of every HDR-touching subsystem cross-checked against the HDR10 standards (CICP/H.273 VUI, SMPTE ST.2086 mastering, CEA-861.3 MaxCLL/MaxFALL) and the Sunshine/Apollo/Moonlight reference.
The original diagnosis: pixel math + the HEVC VUI we emitted were already correct (self-test validated, matches Apollo), but nothing measured, signalled, transported, or applied the static HDR metadata (mastering display colour volume + content light level). The fix was a metadata chain, protocol-first.
What shipped (Steps 0–3)
- Step 0 — protocol + ABI carry colour end to end (
3526517).ColorInfo(4 CICP bytes onWelcome) +HdrMeta(0xCEdatagram, bounds-checked);NativeClientcolor/bit_depthfields- an
HdrMetareceiver/demux +next_hdr_meta. C ABI:punktfunk_connect_ex5(... video_caps),next_hdr_meta,color_info, and fixedabi.rs:896video_caps = 0— the one-line root cause that had made Apple's complete (and correct) HDR pipeline dead code. Header regenerated. No rendering changes, CI-testable (round-trip + truncation + SDR back-compat).
- an
- Step 1 — host in-band SEI + complete VUI (
3526517, live-validated on the RTX box). Cross-platform byte logic in unit-testedsrc/hdr.rs:hdr_meta_from_display,hevc_mastering_display_seiSEI type 137,hevc_content_light_level_seiSEI type 144 (note: NOT "type 4" — that was a drafting error). Windowsdxgi.rs/wgc.rsreadIDXGIOutput6::GetDesc1at capture init / output change →HdrMeta(MaxCLL/MaxFALL left 0, like Apollo);nvenc.rsattaches mastering + CLL SEI on every IDR for HEVC/H.264 and sends the real0xCEre-sent each keyframe. In-band SEI is read directly by decoders, so this fixed correctness before clients consumed the protocol and gave an Apollo on-glass parity gate. Follow-ups: AV1 mastering rides METADATA OBUs (HDR_MDCV/HDR_CLL), not SEI; the Windows secure-desktop relay still sends only the generic baseline0xCE(the helper's in-band SEI carries the real grade). - Step 2 — clients apply the metadata (
551012b; Apple/Android CI-compiled). Each client drainsnext_hdr_meta/nextHdrMetaand remaps from the wire form (ST.2086 G,B,R order, mastering luminance in 0.0001 cd/m²) to the platform layout: WindowsSetHDRMetaData(hdr_meta_to_dxgi: G,B,R→R,G,B reorder, 0.0001-nit→nit), dropping the 1000/1000/400 hardcode; AppleCVBufferSetAttachmentofkCVImageBufferMasteringDisplayColorVolumeKey(24-byte BE) +kCVImageBufferContentLightLevelInfoKey(4-byte BE) per HDR pixel buffer — the correct path for theitur_2100_PQlayer (CAEDRMetadataon a PQ layer is ambiguous, deliberately avoided); AndroidMediaFormatKEY_HDR_STATIC_INFO, a 25-byte CTA-861.3 Type-1 blob (LE, R,G,B order, max-lum in nits-u16). Apple's connect also flips toconnect_ex5advertisingvideoCap10Bit|videoCapHDR— the fix that resurrects Apple's previously-dead pipeline. - Step 3 — display-capability gate (
551012b). Chosen approach: capability-gate, not an in-shader BT.2390 tone-map. Rationale: with Steps 1–2 the host sends correct mastering metadata so an HDR display self-tone-maps; the remaining gap is SDR displays, best fixed by not advertising HDR you can't present — the host then sends a proper BT.709 SDR stream instead of PQ the panel mis-tone-maps (washed-out/dark). No guessed curve, deterministic. Windowsdisplay_supports_hdr(anyIDXGIOutput6colour space ==G2084); AppleNSScreen.maximumExtendedDynamicRangeColorComponentValue > 1(macOS) /UIScreen.main.potentialEDRHeadroom > 1(iOS); AndroidDisplay.getHdrCapabilitiesHDR10/HDR10+. Each ANDs with the user's HDR setting before advertising caps and logs when it drops to SDR.
Wire format — design decisions worth keeping
Two layers, both back-compat-safe via the established trailing-bytes / new-datagram-tag patterns.
- (A) Per-session colorimetry — 4 trailing bytes on
Welcome(offsets 60..64):colour_primaries(1=BT.709, 9=BT.2020) ·transfer_characteristics(1=BT.709, 16=PQ/SMPTE2084, 18=HLG) ·matrix_coeffs(1=BT.709, 9=BT.2020-NCL — never emit 10 (CL): no client decodes it) ·video_full_range_flag. Decoded withb.get(60).unwrap_or(1)so an older host that omits them → BT.709 limited SDR (today's behaviour). A future mirror onReconfiguredannounces a mid-stream SDR↔HDR / BT.709↔BT.2020 flip (deferred; today a mode switch never changes colour, and0xCEre-send covers mastering changes). - (B) Per-change mastering + CLL — host→client datagram tag
0xCE, 28 bytes, standard SEI fixed-point (display primaries G,B,R + white point in 1/50000 units; max/min mastering luminance in 0.0001 cd/m²; MaxCLL/MaxFALL in nits). ST.2086 is variable, so it rides a datagram rather than the Welcome. Re-sent on every IDR/RFI keyframe so a client that dropped the best-effort datagram converges within a GOP; until first receipt the client uses the Welcome transfer + a documented generic default. Bounds-check length before reading (reassembler-bounds security invariant — truncation test required). Omitted entirely for HLG. Units map straight to DXGIDXGI_HDR_METADATA_HDR10, AndroidKEY_HDR_STATIC_INFO, AppleCAEDRMetadata.hdr10; the libavcodec/Linux side needs conversion —AVMasteringDisplayMetadatastoresAVRational, not raw fixed-point.
Out of scope (accepted — call out, don't build)
- Dynamic metadata: HDR10+ (ST.2094-40) and Dolby Vision RPU. We handle static ST.2086 only, with mid-stream changes carried by re-sending the static block.
- HLG: the transfer enum carries
18from day one (free), but the0xCEmastering datagram is omitted for HLG (scene-referred, no mastering metadata).
Step 4 — Linux (last; capture blocked upstream) — OPEN
- (4a) 8-bit→Main10 NVENC upconvert shim (
encode/linux.rs) — Main10 transport with correct VUI/SEI without HDR capture (gated so we don't claim HDR transfer on SDR content). - Linux encode colour + side-data (the deferred Step 1c): set
color_primaries/trc/colorspace/rangefrom the negotiatedColorInfoand attachAV_FRAME_DATA_MASTERING_DISPLAY_METADATA/CONTENT_LIGHT_LEVELside-data (with theAVRationalconversion) inencode/linux.rs+vaapi.rs— only once the encoder actually produces 10-bit, so the signalling matches the bits. (Linux capture is 8-bit only, so signalling BT.2020 PQ + attaching mastering side-data on a downconverted 8-bit stream would be incorrect — hence deferred out of Step 1.) - (4b) True 10-bit capture: offer
ABGR2101010/P010PipeWire formats + read colorimetry; pilot on Sway/wlroots; blocked on gamescope #2126 (portals don't wire PipeWire 1.6 BT.2020/PQ). Don't block the rest of the plan on it. - (4c) Linux client:
ex5caps, P010 decode, GdkDmabufTexture CICP from Welcome,wp_color_managementwhen GTK ≥ 4.14. (Also a standalone SDR bug: software path applies BT.601 to BT.709 — needs a BT.601→BT.709 sws + texturecolor_state.)
Deferred validation (need on-glass / the RTX box)
- The mid-session
Reconfigure"downgrade to SDR" for a monitor move HDR↔SDR. - Confirm the host produces SDR for an SDR client even off an HDR desktop — on the native path the per-session SudoVDA follows the negotiated depth (SDR client → SDR virtual display → SDR stream), so it should hold end to end; verify the stale-HDR-SudoVDA edge case.
Open questions
- MaxCLL source:
GetDesc1doesn't expose it (Apollo zeroes). Static default, or measure per-frame peak in the PQ shader (only truly-correct, adds a readback)? - GameStream: implement
SS_HDR_METADATA(MoonlightSS_HDR_METADATAblob on the ENet control channel) for parity, or keep it deliberately SDR and steer HDR users to punktfunk/1? - HLG: carry the enum from day one (free) — but do any sources actually produce HLG?
- Linux: is shipping the 8-bit→Main10 shim as "HDR-capable transport" acceptable, or does it risk advertising HDR we can't truly deliver?
Ordering rationale
Step 0 first: it's the keystone (metadata transport is the dominant cross-cutter; the ABI
video_caps = 0 line is a one-line root cause) and needs no hardware. Step 1 next: in-band SEI is
read directly by decoders, so it fixes correctness even before our clients consume the protocol, and
gives an Apollo-parity on-glass gate. Steps 2–3 are mechanical per-client wiring once metadata flows.
Linux is last because capture is gated on upstream we don't control; the shim delivers Main10
transport without that dependency.
Hardware dependencies: Step 0 = none (CI); Step 1 = RTX Windows host; Steps 2–3 = a real HDR display per platform; Step 4 = a Linux GPU box + HDR-capable Wayland compositor.