feat(client/windows): HDR10 (BT.2020 PQ) decode + present
apple / swift (push) Successful in 54s
windows-msix / package (push) Successful in 1m8s
windows / build (push) Successful in 1m14s
android / android (push) Failing after 1m43s
ci / rust (push) Failing after 48s
ci / web (push) Successful in 28s
ci / docs-site (push) Successful in 29s
deb / build-publish (push) Successful in 3m5s
decky / build-publish (push) Successful in 14s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m35s
flatpak / build-publish (push) Failing after 4m27s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 3m54s
docker / deploy-docs (push) Successful in 6s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m12s
Light up the dormant 10-bit/HDR path end to end on the Windows client.

- core: NativeClient::connect gains a video_caps param threaded into the Hello. The Windows client advertises VIDEO_CAP_10BIT | VIDEO_CAP_HDR; every other caller (the C ABI shim, Linux, Android, host test connects) passes 0, so the 8-bit BT.709 path is unchanged. The host already gates a Main10/PQ encode on these bits + PUNKTFUNK_10BIT.
- video.rs: a PQ frame (color_trc == SMPTE2084) converts 10-bit YUV → X2BGR10 (== DXGI R10G10B10A2) with the BT.2020 matrix via sws_setColorspaceDetails; swscale applies only the matrix + range, so the PQ-encoded samples pass through untouched.
- present.rs: on an HDR frame the swapchain flips in place (ResizeBuffers) to R10G10B10A2 + DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020 + HDR10 metadata; the passthrough shader is unchanged and the compositor maps PQ→display. Switched to ALPHA_MODE_IGNORE so the 10-bit padding bits don't render transparent. SDR stays 8-bit B8G8R8A8.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
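The capability gate described in the core bullet can be sketched as below. This is only an illustration: the bit values, the `Hello` struct and the `host_uses_main10_pq` function are hypothetical stand-ins, and only the names `VIDEO_CAP_10BIT`, `VIDEO_CAP_HDR` and the host-side 10-bit switch come from the commit message.

```rust
// Hypothetical bit assignments; the real values live in punktfunk_core::quic.
pub const VIDEO_CAP_10BIT: u32 = 1 << 0;
pub const VIDEO_CAP_HDR: u32 = 1 << 1;

/// Illustrative stand-in for the Hello message; only the caps field matters here.
pub struct Hello {
    pub video_caps: u32,
}

/// Assumed shape of the host-side gate: pick a Main10/PQ encode only when the
/// client advertised both bits and the host's own 10-bit switch is enabled.
pub fn host_uses_main10_pq(hello: &Hello, host_10bit_enabled: bool) -> bool {
    let wanted = VIDEO_CAP_10BIT | VIDEO_CAP_HDR;
    host_10bit_enabled && (hello.video_caps & wanted) == wanted
}
```

Under these assumptions a caller that passes 0 can never be handed a 10-bit stream, which matches the claim that the 8-bit BT.709 path is unchanged for every other client.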
@@ -5,8 +5,12 @@
 //! The device prefers a hardware adapter and falls back to **WARP** (the GPU-less dev box runs
 //! the whole present path in software). The draw is a single full-screen triangle sampling the
 //! video texture; a letterbox is produced by clearing the back buffer black and setting the
-//! viewport to the Contain-fit rect (no per-frame vertex buffer). SDR 8-bit path; the
-//! 10-bit/HDR present (`R10G10B10A2` + `SetColorSpace1`) is a follow-up alongside P010 decode.
+//! viewport to the Contain-fit rect (no per-frame vertex buffer).
+//!
+//! **HDR10**: when a frame is BT.2020 PQ (`CpuFrame::hdr`), the swapchain flips to
+//! `R10G10B10A2` + `DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020` (+ HDR10 metadata) via
+//! `ResizeBuffers`/`SetColorSpace1`; the decoded samples are already PQ-encoded so the shader is a
+//! plain passthrough and the compositor maps PQ→display. SDR stays 8-bit B8G8R8A8.
 //!
 //! All `windows` types here come from the same windows-rs commit as `windows-reactor`, so the
 //! `IDXGISwapChain1` handed to `set_swap_chain` satisfies reactor's `windows_core::Interface`.
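For readers unfamiliar with PQ, the "compositor maps PQ→display" step rests on the SMPTE ST 2084 transfer function. The sketch below shows that curve in isolation; it is not code from this commit, and the function names `pq_to_nits` and `nits_to_pq` are made up for the illustration. The constants are the ones published in ST 2084.

```rust
// SMPTE ST 2084 (PQ) constants.
const M1: f64 = 2610.0 / 16384.0;
const M2: f64 = 2523.0 / 4096.0 * 128.0;
const C1: f64 = 3424.0 / 4096.0;
const C2: f64 = 2413.0 / 4096.0 * 32.0;
const C3: f64 = 2392.0 / 4096.0 * 32.0;

/// PQ EOTF: non-linear signal in [0, 1] to absolute luminance in nits (0..=10000).
pub fn pq_to_nits(signal: f64) -> f64 {
    let e = signal.clamp(0.0, 1.0).powf(1.0 / M2);
    let num = (e - C1).max(0.0);
    let den = C2 - C3 * e;
    10_000.0 * (num / den).powf(1.0 / M1)
}

/// Inverse EOTF: absolute luminance in nits to a PQ signal in [0, 1].
pub fn nits_to_pq(nits: f64) -> f64 {
    let y = (nits / 10_000.0).clamp(0.0, 1.0).powf(M1);
    ((C1 + C2 * y) / (1.0 + C3 * y)).powf(M2)
}
```

Because the decoded samples already carry this encoding, the client only has to label the swapchain as ST.2084 / BT.2020 and never evaluates the curve itself.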
@@ -50,6 +54,9 @@ pub struct Presenter {
     /// Panel (swapchain) size in pixels, updated on resize.
     panel_w: u32,
     panel_h: u32,
+    /// Whether the swapchain is currently in 10-bit HDR10 (R10G10B10A2 + ST.2084) mode; flipped
+    /// to match each frame's `hdr` flag.
+    hdr: bool,
 }
 
 impl Presenter {
@@ -69,6 +76,7 @@ impl Presenter {
             tex: None,
             panel_w: width.max(1),
             panel_h: height.max(1),
+            hdr: false,
         })
     }
 
@@ -100,6 +108,9 @@ impl Presenter {
     /// last texture (or black). Called from the reactor `on_rendering` per-frame callback.
     pub fn present(&mut self, frame: Option<&CpuFrame>) {
         if let Some(f) = frame {
+            if f.hdr != self.hdr {
+                self.set_hdr(f.hdr);
+            }
             if let Err(e) = self.upload(f) {
                 tracing::warn!(error = %e, "frame upload failed");
             }
@@ -144,16 +155,74 @@ impl Presenter {
         }
     }
 
+    /// Switch the swapchain between 8-bit SDR (B8G8R8A8, sRGB/BT.709) and 10-bit HDR10
+    /// (R10G10B10A2, ST.2084 PQ BT.2020). `ResizeBuffers` can change the back-buffer format in
+    /// place, so the panel binding (`set_swap_chain`) stays valid — no rebind needed. The decoded
+    /// samples are already PQ-encoded BT.2020 (see `video::convert`), so the colour space is all the
+    /// compositor needs to map them to the display.
+    fn set_hdr(&mut self, on: bool) {
+        self.rtv = None; // release back-buffer refs before ResizeBuffers
+        self.tex = None; // texture format changes (R10G10B10A2 vs R8G8B8A8)
+        let format = if on {
+            DXGI_FORMAT_R10G10B10A2_UNORM
+        } else {
+            DXGI_FORMAT_B8G8R8A8_UNORM
+        };
+        unsafe {
+            if let Err(e) = self.swap.ResizeBuffers(
+                0,
+                self.panel_w,
+                self.panel_h,
+                format,
+                DXGI_SWAP_CHAIN_FLAG(0),
+            ) {
+                tracing::warn!(error = %e, "ResizeBuffers for HDR switch failed");
+                return;
+            }
+            let colorspace = if on {
+                DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020
+            } else {
+                DXGI_COLOR_SPACE_RGB_FULL_G22_NONE_P709
+            };
+            if let Ok(sc3) = self.swap.cast::<IDXGISwapChain3>() {
+                // Only set a colour space the swapchain accepts for present (on an SDR desktop the
+                // DWM still tone-maps HDR10 → SDR, so leaving the default there is fine).
+                if let Ok(support) = sc3.CheckColorSpaceSupport(colorspace) {
+                    if support & DXGI_SWAP_CHAIN_COLOR_SPACE_SUPPORT_FLAG_PRESENT.0 as u32 != 0 {
+                        let _ = sc3.SetColorSpace1(colorspace);
+                    }
+                }
+            }
+            if on {
+                if let Ok(sc4) = self.swap.cast::<IDXGISwapChain4>() {
+                    let md = hdr10_metadata();
+                    let bytes = std::slice::from_raw_parts(
+                        &md as *const DXGI_HDR_METADATA_HDR10 as *const u8,
+                        std::mem::size_of::<DXGI_HDR_METADATA_HDR10>(),
+                    );
+                    let _ = sc4.SetHDRMetaData(DXGI_HDR_METADATA_TYPE_HDR10, Some(bytes));
+                }
+            }
+        }
+        self.hdr = on;
+        tracing::info!(hdr = on, "swapchain colour mode switched");
+    }
+
     fn upload(&mut self, frame: &CpuFrame) -> Result<()> {
         let (w, h) = (frame.width, frame.height);
         let need_new = !matches!(&self.tex, Some((_, _, tw, th)) if *tw == w && *th == h);
         if need_new {
+            let format = if self.hdr {
+                DXGI_FORMAT_R10G10B10A2_UNORM
+            } else {
+                DXGI_FORMAT_R8G8B8A8_UNORM
+            };
             let desc = D3D11_TEXTURE2D_DESC {
                 Width: w,
                 Height: h,
                 MipLevels: 1,
                 ArraySize: 1,
-                Format: DXGI_FORMAT_R8G8B8A8_UNORM,
+                Format: format,
                 SampleDesc: DXGI_SAMPLE_DESC {
                     Count: 1,
                     Quality: 0,
@@ -191,7 +260,7 @@ impl Presenter {
             let row_bytes = (w as usize) * 4;
             for y in 0..h as usize {
                 std::ptr::copy_nonoverlapping(
-                    frame.rgba.as_ptr().add(y * src_pitch),
+                    frame.pixels.as_ptr().add(y * src_pitch),
                     dst.add(y * dst_pitch),
                     row_bytes.min(src_pitch),
                 );
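The row loop above copies between two buffers whose row pitches differ (swscale pads rows, and the mapped D3D11 texture has its own pitch). A safe-slice version of the same idea might look like the sketch below. `copy_rows` is a hypothetical helper, not part of the commit, and it additionally clamps to the destination pitch, which the original loop does not do.

```rust
/// Copy `height` rows of up to `width * 4` bytes from a buffer with `src_pitch`
/// bytes per row into a buffer with `dst_pitch` bytes per row. Padding bytes at
/// the end of each destination row are left untouched.
pub fn copy_rows(
    src: &[u8],
    src_pitch: usize,
    dst: &mut [u8],
    dst_pitch: usize,
    width: usize,
    height: usize,
) {
    // Never read past a source row or write past a destination row.
    let row_bytes = (width * 4).min(src_pitch).min(dst_pitch);
    for y in 0..height {
        let s = &src[y * src_pitch..y * src_pitch + row_bytes];
        dst[y * dst_pitch..y * dst_pitch + row_bytes].copy_from_slice(s);
    }
}
```

Both SDR (`R8G8B8A8`) and HDR (`X2BGR10`) frames are 4 bytes per pixel, which is presumably why the same loop serves both formats after the `rgba` to `pixels` rename.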
@@ -273,7 +342,10 @@ fn create_composition_swapchain(
         BufferCount: 2,
         Scaling: DXGI_SCALING_STRETCH,
         SwapEffect: DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL,
-        AlphaMode: DXGI_ALPHA_MODE_PREMULTIPLIED,
+        // IGNORE (opaque), not PREMULTIPLIED: the video fills the panel and the HDR `X2BGR10`
+        // upload leaves the 2 padding/alpha bits 0 — premultiplied alpha would then make HDR frames
+        // transparent. Opaque is correct for a full-frame video surface either way.
+        AlphaMode: DXGI_ALPHA_MODE_IGNORE,
         Flags: 0,
     };
     unsafe {
@@ -354,3 +426,19 @@ fn blob_bytes(blob: &ID3DBlob) -> &[u8] {
         std::slice::from_raw_parts(p, n)
     }
 }
+
+/// Generic HDR10 mastering metadata: BT.2020 primaries + D65 white (0.00002 units), a 1000-nit
+/// mastering display, MaxCLL 1000 / MaxFALL 400. The protocol doesn't carry the stream's real
+/// mastering metadata yet (host follow-up), so these are sane defaults the display tone-maps from.
+fn hdr10_metadata() -> DXGI_HDR_METADATA_HDR10 {
+    DXGI_HDR_METADATA_HDR10 {
+        RedPrimary: [35400, 14600],
+        GreenPrimary: [8500, 39850],
+        BluePrimary: [6550, 2300],
+        WhitePoint: [15635, 16450],
+        MaxMasteringLuminance: 1000,
+        MinMasteringLuminance: 1, // 0.0001-nit units → 0.0001 nits
+        MaxContentLightLevel: 1000,
+        MaxFrameAverageLightLevel: 400,
+    }
+}
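The integers in `hdr10_metadata` follow from the unit conventions stated in its doc comment: chromaticities in steps of 0.00002 and minimum luminance in steps of 0.0001 nit. The sketch below reproduces them from the BT.2020 primaries and the D65 white point; `chroma_units` and `min_luminance_units` are hypothetical helper names used only for this illustration.

```rust
/// Convert a CIE 1931 chromaticity coordinate to HDR10 metadata units of 0.00002.
pub fn chroma_units(coord: f64) -> u16 {
    (coord * 50_000.0).round() as u16
}

/// Convert a minimum mastering luminance in nits to units of 0.0001 nit.
pub fn min_luminance_units(nits: f64) -> u32 {
    (nits * 10_000.0).round() as u32
}

/// BT.2020 primaries and D65 white as (x, y) pairs, in the order R, G, B, W.
pub const BT2020_XY: [(f64, f64); 4] = [
    (0.708, 0.292),
    (0.170, 0.797),
    (0.131, 0.046),
    (0.3127, 0.3290),
];
```

Feeding `BT2020_XY` through `chroma_units` yields the same four pairs as the struct literal, which is a quick way to sanity-check the constants if the defaults are ever replaced by real stream metadata.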
@@ -30,7 +30,7 @@ pub struct SessionParams {
     pub identity: (String, String),
 }
 
-#[derive(Clone, Copy, Default)]
+#[derive(Clone, Copy, Default, PartialEq)]
 pub struct Stats {
     pub fps: f32,
     pub mbps: f32,
@@ -99,6 +99,10 @@ fn pump(
         params.compositor,
         params.gamepad,
         params.bitrate_kbps,
+        // Advertise 10-bit + HDR10: the presenter handles BT.2020 PQ (R10G10B10A2) frames, so the
+        // host may upgrade HDR content to a Main10/PQ stream (it still only does so for actual HDR
+        // content with its own 10-bit gate). 8-bit SDR is unaffected.
+        punktfunk_core::quic::VIDEO_CAP_10BIT | punktfunk_core::quic::VIDEO_CAP_HDR,
         None, // launch: the Windows client has no library picker yet
         params.pin,
         Some(params.identity),
@@ -20,13 +20,17 @@ pub enum DecodedFrame {
     Cpu(CpuFrame),
 }
 
-/// RGBA pixels for a D3D11 `R8G8B8A8_UNORM` texture upload (which takes a row pitch).
+/// Packed 4-byte-per-pixel frame for a D3D11 texture upload (which takes a row pitch). The bytes
+/// are `R8G8B8A8` for SDR and `X2BGR10` (== DXGI `R10G10B10A2`, R in the low 10 bits) for HDR.
 pub struct CpuFrame {
     pub width: u32,
     pub height: u32,
-    /// RGBA row stride in bytes (≥ width*4 — swscale pads rows for SIMD).
+    /// Row stride in bytes (≥ width*4 — swscale pads rows for SIMD).
     pub stride: usize,
-    pub rgba: Vec<u8>,
+    pub pixels: Vec<u8>,
+    /// BT.2020 PQ HDR10 frame: `pixels` is `X2BGR10` and the presenter switches to a 10-bit
+    /// R10G10B10A2 + ST.2084 swapchain. `false` = ordinary 8-bit BT.709 SDR.
+    pub hdr: bool,
 }
 
 pub struct Decoder {
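The equivalence between `X2BGR10` and DXGI `R10G10B10A2` claimed in the `CpuFrame` doc comment comes down to one 32-bit little-endian word per pixel with R in bits 0..=9, G in bits 10..=19, B in bits 20..=29 and two unused top bits. A minimal sketch of that layout follows; `pack_x2bgr10` and `unpack_r10g10b10a2` are illustrative names, not functions from the codebase.

```rust
/// Pack 10-bit R, G, B into one 32-bit word with R in the low 10 bits. The top
/// 2 bits (padding in X2BGR10, alpha in R10G10B10A2) are left at zero.
pub fn pack_x2bgr10(r: u16, g: u16, b: u16) -> u32 {
    let (r, g, b) = (r as u32 & 0x3FF, g as u32 & 0x3FF, b as u32 & 0x3FF);
    r | (g << 10) | (b << 20)
}

/// Read the same word back the way an R10G10B10A2 texture interprets it.
pub fn unpack_r10g10b10a2(word: u32) -> (u16, u16, u16, u8) {
    (
        (word & 0x3FF) as u16,
        ((word >> 10) & 0x3FF) as u16,
        ((word >> 20) & 0x3FF) as u16,
        (word >> 30) as u8,
    )
}
```

The unpacked alpha is always 0 for such a word, which is the reason the swapchain has to treat the surface as opaque rather than premultiplied.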
@@ -51,8 +55,9 @@ impl Decoder {
 
 struct SoftwareDecoder {
     decoder: ffmpeg::decoder::Video,
-    /// Rebuilt whenever the decoded format/size changes (mid-stream `Reconfigure`).
-    sws: Option<(scaling::Context, Pixel, u32, u32)>,
+    /// Rebuilt whenever the decoded format/size **or output format** changes (mid-stream
+    /// `Reconfigure`, or an SDR↔HDR flip): `(ctx, src_fmt, w, h, dst_fmt)`.
+    sws: Option<(scaling::Context, Pixel, u32, u32, Pixel)>,
 }
 
 impl SoftwareDecoder {
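The widened `sws` tuple is effectively a one-slot cache keyed on source format, size and destination format. The sketch below shows that pattern with plain integers in place of the FFmpeg types; `ConvKey` and `ConvCache` are hypothetical names, and the `rebuilds` counter exists only to make the behaviour observable.

```rust
/// Everything that forces a converter rebuild when it changes.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct ConvKey {
    pub src_fmt: u32,
    pub width: u32,
    pub height: u32,
    pub dst_fmt: u32,
}

/// One-slot cache: keeps the last converter and the key it was built for.
pub struct ConvCache<C> {
    slot: Option<(ConvKey, C)>,
    pub rebuilds: usize,
}

impl<C> ConvCache<C> {
    pub fn new() -> Self {
        Self { slot: None, rebuilds: 0 }
    }

    /// Return the cached converter, rebuilding it first if any key field differs.
    pub fn get_or_rebuild(&mut self, key: ConvKey, build: impl FnOnce() -> C) -> &mut C {
        let stale = !matches!(&self.slot, Some((k, _)) if *k == key);
        if stale {
            self.slot = Some((key, build()));
            self.rebuilds += 1;
        }
        &mut self.slot.as_mut().unwrap().1
    }
}
```

Including the destination format in the key is what makes an SDR to HDR flip at unchanged resolution trigger a rebuild.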
@@ -79,28 +84,53 @@ impl SoftwareDecoder {
         let mut frame = AvFrame::empty();
         let mut out = None;
         while self.decoder.receive_frame(&mut frame).is_ok() {
-            out = Some(self.convert_rgba(&frame)?);
+            out = Some(self.convert(&frame)?);
         }
         Ok(out)
     }
 
-    fn convert_rgba(&mut self, frame: &AvFrame) -> Result<CpuFrame> {
+    /// Convert the decoded YUV frame to a packed 4-byte format the presenter uploads directly:
+    /// SDR → `RGBA` (BT.709), HDR (SMPTE ST.2084 / PQ transfer) → `X2BGR10` (10-bit, == DXGI
+    /// R10G10B10A2) using the BT.2020 matrix. For HDR the PQ-encoded values pass through unchanged
+    /// (swscale only applies the YUV→RGB matrix + range, never the transfer) — exactly what an
+    /// HDR10/ST.2084 swapchain wants.
+    fn convert(&mut self, frame: &AvFrame) -> Result<CpuFrame> {
+        use ffmpeg::color::TransferCharacteristic;
         let (fmt, w, h) = (frame.format(), frame.width(), frame.height());
-        let rebuild =
-            !matches!(&self.sws, Some((_, f, sw, sh)) if *f == fmt && *sw == w && *sh == h);
+        let hdr = frame.color_transfer_characteristic() == TransferCharacteristic::SMPTE2084;
+        let dst = if hdr { Pixel::X2BGR10LE } else { Pixel::RGBA };
+        let rebuild = !matches!(&self.sws, Some((_, f, sw, sh, d)) if *f == fmt && *sw == w && *sh == h && *d == dst);
         if rebuild {
-            let ctx = scaling::Context::get(fmt, w, h, Pixel::RGBA, w, h, scaling::Flags::POINT)
+            let mut ctx = scaling::Context::get(fmt, w, h, dst, w, h, scaling::Flags::POINT)
                 .context("swscale context")?;
-            self.sws = Some((ctx, fmt, w, h));
+            if hdr {
+                // BT.2020 non-constant-luminance YUV (limited range) → full-range RGB. swscale
+                // applies only the matrix + range here, so the samples stay PQ-encoded.
+                unsafe {
+                    let coef = ffmpeg::ffi::sws_getCoefficients(ffmpeg::ffi::SWS_CS_BT2020);
+                    ffmpeg::ffi::sws_setColorspaceDetails(
+                        ctx.as_mut_ptr(),
+                        coef,
+                        0, // src range: limited (video)
+                        coef,
+                        1, // dst range: full
+                        0,
+                        1 << 16,
+                        1 << 16, // brightness / contrast / saturation defaults (16.16)
+                    );
+                }
+            }
+            self.sws = Some((ctx, fmt, w, h, dst));
         }
         let (sws, ..) = self.sws.as_mut().unwrap();
-        let mut rgba = AvFrame::empty();
-        sws.run(frame, &mut rgba).map_err(|e| anyhow!("sws: {e}"))?;
+        let mut conv = AvFrame::empty();
+        sws.run(frame, &mut conv).map_err(|e| anyhow!("sws: {e}"))?;
         Ok(CpuFrame {
             width: w,
             height: h,
-            stride: rgba.stride(0),
-            rgba: rgba.data(0).to_vec(),
+            stride: conv.stride(0),
+            pixels: conv.data(0).to_vec(),
+            hdr,
         })
     }
 }
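To make the "matrix + range only" point concrete, the sketch below applies the BT.2020 non-constant-luminance matrix to one limited-range 10-bit YCbCr sample and returns full-range 10-bit RGB. It is a floating-point illustration, not swscale's fixed-point implementation, and the function name is made up. No transfer function appears anywhere in it, so PQ-encoded inputs produce PQ-encoded outputs.

```rust
// BT.2020 luma coefficients.
const KR: f64 = 0.2627;
const KB: f64 = 0.0593;
const KG: f64 = 1.0 - KR - KB;

/// Limited-range 10-bit Y'CbCr (Y 64..=940, C 64..=960) to full-range 10-bit R'G'B'.
pub fn bt2020_limited_yuv10_to_full_rgb10(y: u16, cb: u16, cr: u16) -> (u16, u16, u16) {
    // Range expansion: normalise luma to 0..1 and chroma to -0.5..0.5.
    let yn = (y as f64 - 64.0) / 876.0;
    let pb = (cb as f64 - 512.0) / 896.0;
    let pr = (cr as f64 - 512.0) / 896.0;
    // Matrix: invert Y' = KR*R' + KG*G' + KB*B'.
    let r = yn + 2.0 * (1.0 - KR) * pr;
    let b = yn + 2.0 * (1.0 - KB) * pb;
    let g = (yn - KR * r - KB * b) / KG;
    // Quantise to full-range 10-bit code values.
    let q = |v: f64| (v.clamp(0.0, 1.0) * 1023.0).round() as u16;
    (q(r), q(g), q(b))
}
```

Limited-range white and black map to the extremes of the 10-bit full range, which is the range expansion selected by the `0` (source) and `1` (destination) arguments in the hunk above.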