feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s

HDR (display-driven, matching the WGC path):
- CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows
  offers "Use HDR" on the virtual display. The host FOLLOWS the display's live
  advanced-color state, recreating the shared ring at the matching format
  (FP16 in HDR / BGRA in SDR) on a toggle — no freeze.
- Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client
  auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT).
  Generic HDR10 mastering SEI on every IDR.
- Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach
  kill the toggle-time garbage frame and any stale-ring read.

Perf:
- Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1
  before polling N so the convert/copy on the 3D engine overlaps the NVENC encode
  of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous).
- Rotating host output ring (OUT_RING) so the in-flight encode and the next
  convert never touch the same texture.
- HDR converts directly from the keyed-mutex slot's SRV into the output ring
  (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in.
  The slot mutex is held only across the convert/copy, not the encode.
  RING_LEN 3->6 for publish headroom.
- Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low
  new_fps at a high send rate means the source isn't compositing, not an encode
  stall).

Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes
~180 new fps, encode 240 fps @ ~4.3 ms p50.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
2026-06-24 00:35:52 +02:00
parent c5dab484df
commit e2c9bfd3d9
26 changed files with 2962 additions and 313 deletions
@@ -6,6 +6,7 @@
use std::ptr::NonNull;
use std::sync::atomic::{AtomicU32, AtomicU64};
use std::sync::{Mutex, OnceLock};
use std::time::Instant;
use wdf_umdf_sys::{IDDCX_ADAPTER__, IDDCX_MONITOR__};
@@ -37,6 +38,10 @@ pub struct MonitorObject {
pub target_id: u32,
pub adapter_luid_low: u32,
pub adapter_luid_high: i32,
/// When the entry was pushed (`do_add`). The watchdog skips monitors younger than the host's
/// setup window (CCD commit + GDI-name resolve + settle) so a still-initializing monitor is never
/// torn down mid-birth during reconnect churn.
pub created_at: Instant,
}
// SAFETY: the raw IddCx object ptr is framework-managed; access is serialized by MONITOR_MODES.
unsafe impl Send for MonitorObject {}
@@ -53,9 +58,12 @@ pub static MONITOR_MODES: Mutex<Vec<MonitorObject>> = Mutex::new(Vec::new());
/// Monitor id / EDID-serial counter (unique per created monitor).
pub static NEXT_ID: AtomicU32 = AtomicU32::new(1);
/// Watchdog (seconds). The host reads the timeout via GET_WATCHDOG and PINGs to keep alive.
pub static WATCHDOG_TIMEOUT: AtomicU32 = AtomicU32::new(3);
pub static WATCHDOG_COUNTDOWN: AtomicU32 = AtomicU32::new(3);
/// Watchdog (seconds). The host reads the timeout via GET_WATCHDOG and PINGs to keep alive. 8 s (was
/// 3) gives the host's between-session teardown gap — stop old pinger → CCD display re-attach (a slow
/// `SetDisplayConfig`) → REMOVE — headroom, so the watchdog doesn't spuriously fire during reconnect
/// churn. The host derives its PING interval from this (timeout/3), so it auto-adjusts.
pub static WATCHDOG_TIMEOUT: AtomicU32 = AtomicU32::new(8);
pub static WATCHDOG_COUNTDOWN: AtomicU32 = AtomicU32::new(8);
/// The preferred render adapter LUID set via SET_RENDER_ADAPTER, packed `(high<<32)|low`. 0 = none.
pub static PREFERRED_RENDER_ADAPTER: AtomicU64 = AtomicU64::new(0);