feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s

HDR (display-driven, matching the WGC path):
- CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows
  offers "Use HDR" on the virtual display. The host FOLLOWS the display's live
  advanced-color state, recreating the shared ring at the matching format
  (FP16 in HDR / BGRA in SDR) on a toggle — no freeze.
- Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client
  auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT).
  Generic HDR10 mastering SEI on every IDR.
- Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach
  kill the toggle-time garbage frame and any stale-ring read.

Perf:
- Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1
  before polling N so the convert/copy on the 3D engine overlaps the NVENC encode
  of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous).
- Rotating host output ring (OUT_RING) so the in-flight encode and the next
  convert never touch the same texture.
- HDR converts directly from the keyed-mutex slot's SRV into the output ring
  (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in.
  The slot mutex is held only across the convert/copy, not the encode.
  RING_LEN 3->6 for publish headroom.
- Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low
  new_fps at a high send rate means the source isn't compositing, not an encode
  stall).

Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes
~180 new fps, encode 240 fps @ ~4.3 ms p50.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-06-24 00:35:52 +02:00
parent c5dab484df
commit e2c9bfd3d9
26 changed files with 2962 additions and 313 deletions
@@ -11,10 +11,17 @@ use wdf_umdf_sys::{
DISPLAYCONFIG_TARGET_MODE, DISPLAYCONFIG_VIDEO_SIGNAL_INFO, IDARG_IN_ADAPTER_INIT_FINISHED,
IDARG_IN_COMMITMODES, IDARG_IN_GETDEFAULTDESCRIPTIONMODES, IDARG_IN_PARSEMONITORDESCRIPTION,
IDARG_IN_QUERYTARGETMODES, IDARG_IN_SETSWAPCHAIN, IDARG_OUT_GETDEFAULTDESCRIPTIONMODES,
IDARG_OUT_PARSEMONITORDESCRIPTION, IDARG_OUT_QUERYTARGETMODES, IDDCX_ADAPTER__,
IDARG_OUT_PARSEMONITORDESCRIPTION, IDARG_OUT_QUERYTARGETMODES, IDDCX_ADAPTER__, IDDCX_PATH,
IDDCX_MONITOR_MODE, IDDCX_MONITOR_MODE_ORIGIN, IDDCX_MONITOR__, IDDCX_TARGET_MODE, NTSTATUS,
WDFDEVICE, WDF_POWER_DEVICE_STATE,
};
// IddCx 1.10 *2 DDIs (HDR-capable). For B1 we advertise SDR (8 bpc) so behaviour is unchanged; B2
// flips the bit depth + adapter flag to enable HDR.
use wdf_umdf_sys::{
IDARG_IN_COMMITMODES2, IDARG_IN_PARSEMONITORDESCRIPTION2, IDARG_IN_QUERYTARGETMODES2,
IDARG_IN_QUERYTARGET_INFO, IDARG_OUT_QUERYTARGET_INFO, IDDCX_BITS_PER_COMPONENT, IDDCX_MONITOR_MODE2,
IDDCX_PATH2, IDDCX_TARGET_CAPS, IDDCX_TARGET_MODE2, IDDCX_WIRE_BITS_PER_COMPONENT,
};
use crate::{
context::{DeviceContext, MonitorContext},
@@ -179,6 +186,7 @@ pub extern "C-unwind" fn monitor_get_default_modes(
_p_in_args: *const IDARG_IN_GETDEFAULTDESCRIPTIONMODES,
_p_out_args: *mut IDARG_OUT_GETDEFAULTDESCRIPTIONMODES,
) -> NTSTATUS {
info!("GET_DEFAULT_MODES called (we return NOT_IMPLEMENTED — only valid for a monitor with NO EDID)");
NTSTATUS::STATUS_NOT_IMPLEMENTED
}
@@ -287,9 +295,20 @@ pub extern "C-unwind" fn monitor_query_modes(
pub extern "C-unwind" fn adapter_commit_modes(
_adapter_object: *mut IDDCX_ADAPTER__,
_p_in_args: *const IDARG_IN_COMMITMODES,
p_in_args: *const IDARG_IN_COMMITMODES,
) -> NTSTATUS {
// The swap-chain is managed by IddCx; there is nothing device-specific to reconfigure on a commit.
// DIAGNOSTIC: does the OS commit an ACTIVE path for our monitor? IDDCX_PATH_FLAGS_ACTIVE = 2. If
// no active path is ever committed, the OS never calls ASSIGN_SWAPCHAIN (the bug we're chasing).
let in_args = unsafe { &*p_in_args };
info!("COMMIT_MODES: path_count={}", in_args.PathCount);
for i in 0..in_args.PathCount {
let path: &IDDCX_PATH = unsafe { &*in_args.pPaths.add(i as usize) };
let active = (path.Flags.0 & 2) != 0;
info!(
" path[{i}] monitor={:p} flags=0x{:x} active={active}",
path.MonitorObject, path.Flags.0
);
}
NTSTATUS::STATUS_SUCCESS
}
@@ -320,3 +339,194 @@ pub extern "C-unwind" fn unassign_swap_chain(monitor_object: *mut IDDCX_MONITOR_
.into()
}
}
// ===== IddCx 1.10 *2 DDIs (HDR-capable path) ============================================
// These mirror the 1.x callbacks above but advertise per-mode wire bit-depth. B1 reports SDR (8 bpc);
// B2 bumps `wire_bits()` to add 10 bpc + sets CAN_PROCESS_FP16 to actually enable HDR.
/// Wire bit-depth advertised per mode. B2: advertise BOTH 8 and 10 bpc RGB so the OS offers HDR10
/// modes (the bitfield: 8 = 0x2, 10 = 0x4).
fn wire_bits() -> IDDCX_WIRE_BITS_PER_COMPONENT {
let rgb = IDDCX_BITS_PER_COMPONENT(
IDDCX_BITS_PER_COMPONENT::IDDCX_BITS_PER_COMPONENT_8.0
| IDDCX_BITS_PER_COMPONENT::IDDCX_BITS_PER_COMPONENT_10.0,
);
IDDCX_WIRE_BITS_PER_COMPONENT {
Rgb: rgb,
YCbCr444: IDDCX_BITS_PER_COMPONENT::IDDCX_BITS_PER_COMPONENT_NONE,
YCbCr422: IDDCX_BITS_PER_COMPONENT::IDDCX_BITS_PER_COMPONENT_NONE,
YCbCr420: IDDCX_BITS_PER_COMPONENT::IDDCX_BITS_PER_COMPONENT_NONE,
}
}
/// 1.10 variant of [`parse_monitor_description`] — writes `IDDCX_MONITOR_MODE2` (adds bit-depth).
pub extern "C-unwind" fn parse_monitor_description2(
p_in_args: *const IDARG_IN_PARSEMONITORDESCRIPTION2,
p_out_args: *mut IDARG_OUT_PARSEMONITORDESCRIPTION,
) -> NTSTATUS {
let in_args = unsafe { &*p_in_args };
let out_args = unsafe { &mut *p_out_args };
let Ok(monitors) = MONITOR_MODES.lock() else {
error!("MONITOR_MODES mutex poisoned");
return NTSTATUS::STATUS_DRIVER_INTERNAL_ERROR;
};
let edid = unsafe {
std::slice::from_raw_parts(
in_args.MonitorDescription.pData as *const u8,
in_args.MonitorDescription.DataSize as usize,
)
};
let Ok(monitor_index) = Edid::get_serial(edid) else {
error!("bad edid ({} bytes)", edid.len());
return NTSTATUS::STATUS_INVALID_VIEW_SIZE;
};
let Some(monitor) = monitors.iter().find(|&m| m.data.id == monitor_index) else {
error!("Failed to find monitor id {monitor_index}");
return NTSTATUS::STATUS_DRIVER_INTERNAL_ERROR;
};
let number_of_modes: u32 = monitor
.data
.modes
.iter()
.map(|m| u32::try_from(m.refresh_rates.len()).expect("Cannot use > u32::MAX refresh rates"))
.sum();
out_args.MonitorModeBufferOutputCount = number_of_modes;
if in_args.MonitorModeBufferInputCount < number_of_modes {
return if in_args.MonitorModeBufferInputCount > 0 {
NTSTATUS::STATUS_BUFFER_TOO_SMALL
} else {
NTSTATUS::STATUS_SUCCESS
};
}
let monitor_modes = unsafe {
std::slice::from_raw_parts_mut(
in_args.pMonitorModes.cast::<MaybeUninit<IDDCX_MONITOR_MODE2>>(),
number_of_modes as usize,
)
};
for (mode, out_mode) in monitor.data.modes.flatten().zip(monitor_modes.iter_mut()) {
out_mode.write(IDDCX_MONITOR_MODE2 {
#[allow(clippy::cast_possible_truncation)]
Size: mem::size_of::<IDDCX_MONITOR_MODE2>() as u32,
Origin: IDDCX_MONITOR_MODE_ORIGIN::IDDCX_MONITOR_MODE_ORIGIN_MONITORDESCRIPTOR,
MonitorVideoSignalInfo: display_info(mode.width, mode.height, mode.refresh_rate),
BitsPerComponent: wire_bits(),
});
}
out_args.PreferredMonitorModeIdx = 0;
NTSTATUS::STATUS_SUCCESS
}
fn target_mode2(width: u32, height: u32, refresh_rate: u32) -> IDDCX_TARGET_MODE2 {
let m1 = target_mode(width, height, refresh_rate);
IDDCX_TARGET_MODE2 {
#[allow(clippy::cast_possible_truncation)]
Size: mem::size_of::<IDDCX_TARGET_MODE2>() as u32,
TargetVideoSignalInfo: m1.TargetVideoSignalInfo,
BitsPerComponent: wire_bits(),
..Default::default()
}
}
/// 1.10 variant of [`monitor_query_modes`] — writes `IDDCX_TARGET_MODE2`.
pub extern "C-unwind" fn monitor_query_modes2(
monitor_object: *mut IDDCX_MONITOR__,
p_in_args: *const IDARG_IN_QUERYTARGETMODES2,
p_out_args: *mut IDARG_OUT_QUERYTARGETMODES,
) -> NTSTATUS {
let Ok(monitors) = MONITOR_MODES.lock() else {
error!("MONITOR_MODES mutex poisoned");
return NTSTATUS::STATUS_DRIVER_INTERNAL_ERROR;
};
let Some(monitor) = monitors
.iter()
.find(|&m| m.object.is_some_and(|p| p.as_ptr() == monitor_object))
else {
error!("Failed to find monitor object in cache for {monitor_object:?}");
return NTSTATUS::STATUS_DRIVER_INTERNAL_ERROR;
};
let number_of_modes = monitor
.data
.modes
.iter()
.map(|m| u32::try_from(m.refresh_rates.len()).expect("Cannot use > u32::MAX modes"))
.sum();
let out_args = unsafe { &mut *p_out_args };
out_args.TargetModeBufferOutputCount = number_of_modes;
let in_args = unsafe { &*p_in_args };
if in_args.TargetModeBufferInputCount >= number_of_modes {
let out_target_modes = unsafe {
std::slice::from_raw_parts_mut(
in_args.pTargetModes.cast::<MaybeUninit<IDDCX_TARGET_MODE2>>(),
number_of_modes as usize,
)
};
for (mode, out_target) in monitor.data.modes.flatten().zip(out_target_modes.iter_mut()) {
out_target.write(target_mode2(mode.width, mode.height, mode.refresh_rate));
}
}
NTSTATUS::STATUS_SUCCESS
}
/// 1.10 variant of [`adapter_commit_modes`] — `IDDCX_PATH2` carries the committed wire format.
pub extern "C-unwind" fn adapter_commit_modes2(
_adapter_object: *mut IDDCX_ADAPTER__,
p_in_args: *const IDARG_IN_COMMITMODES2,
) -> NTSTATUS {
let in_args = unsafe { &*p_in_args };
info!("COMMIT_MODES2: path_count={}", in_args.PathCount);
for i in 0..in_args.PathCount {
let path: &IDDCX_PATH2 = unsafe { &*in_args.pPaths.add(i as usize) };
let active = (path.Flags.0 & 2) != 0;
info!(
" path2[{i}] monitor={:p} flags=0x{:x} active={active} colorspace={} rgb_bpc=0x{:x}",
path.MonitorObject,
path.Flags.0,
path.WireFormatInfo.ColorSpace.0,
path.WireFormatInfo.BitsPerComponent.Rgb.0
);
}
NTSTATUS::STATUS_SUCCESS
}
/// 1.10 NEW: per-target capabilities. B2 reports `HIGH_COLOR_SPACE` so the OS enables HDR10 (transfer
/// curve + wide gamut) on this target.
pub extern "C-unwind" fn query_target_info(
_adapter_object: *mut IDDCX_ADAPTER__,
_p_in_args: *mut IDARG_IN_QUERYTARGET_INFO,
p_out_args: *mut IDARG_OUT_QUERYTARGET_INFO,
) -> NTSTATUS {
let out_args = unsafe { &mut *p_out_args };
out_args.TargetCaps = IDDCX_TARGET_CAPS::IDDCX_TARGET_CAPS_HIGH_COLOR_SPACE;
out_args.DitheringSupport = IDDCX_WIRE_BITS_PER_COMPONENT::default();
NTSTATUS::STATUS_SUCCESS
}
/// 1.10 NEW (HDR): the OS hands us the default HDR10 static metadata for the monitor. B2 accepts it
/// (the host/client own the final HDR metadata for the stream); B3 will forward it to the host for the
/// HEVC mastering-display SEI. Stub keeps the OS's HDR setup happy.
pub extern "C-unwind" fn set_default_hdr_metadata(
_monitor_object: *mut IDDCX_MONITOR__,
_p_in_args: *const wdf_umdf_sys::IDARG_IN_MONITOR_SET_DEFAULT_HDR_METADATA,
) -> NTSTATUS {
NTSTATUS::STATUS_SUCCESS
}
/// 1.10 HDR: the OS hands us the gamma ramp (a 3x4 colour-space matrix in HDR mode). We do NOT apply it
/// server-side — the host streams the scRGB FP16 and the CLIENT's display applies its own transform —
/// so we accept it. Wiring this is OBLIGATED once CAN_PROCESS_FP16 is set; without it the OS rejects
/// the adapter at init (`IddCxAdapterInitAsync` → "Failed to get adapter").
pub extern "C-unwind" fn set_gamma_ramp(
_monitor_object: *mut IDDCX_MONITOR__,
_p_in_args: *const wdf_umdf_sys::IDARG_IN_SET_GAMMARAMP,
) -> NTSTATUS {
NTSTATUS::STATUS_SUCCESS
}