feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s
HDR (display-driven, matching the WGC path): - CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows offers "Use HDR" on the virtual display. The host FOLLOWS the display's live advanced-color state, recreating the shared ring at the matching format (FP16 in HDR / BGRA in SDR) on a toggle — no freeze. - Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT). Generic HDR10 mastering SEI on every IDR. - Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach kill the toggle-time garbage frame and any stale-ring read. Perf: - Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1 before polling N so the convert/copy on the 3D engine overlaps the NVENC encode of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous). - Rotating host output ring (OUT_RING) so the in-flight encode and the next convert never touch the same texture. - HDR converts directly from the keyed-mutex slot's SRV into the output ring (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in. The slot mutex is held only across the convert/copy, not the encode. RING_LEN 3->6 for publish headroom. - Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low new_fps at a high send rate means the source isn't compositing, not an encode stall). Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes ~180 new fps, encode 240 fps @ ~4.3 ms p50. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This commit is contained in:
@@ -142,6 +142,16 @@ pub trait Capturer: Send {
|
||||
fn hdr_meta(&self) -> Option<punktfunk_core::quic::HdrMeta> {
|
||||
None
|
||||
}
|
||||
|
||||
/// How many frames the encode loop may keep in flight (submitted but not yet polled) before it
|
||||
/// blocks. `1` (the default) is the synchronous loop: capture → submit → poll-blocks, so the
|
||||
/// per-frame wall time is `capture+convert + encode`. A capturer that hands a fresh output texture
|
||||
/// per frame (so the encode of N reads a different texture than the convert of N+1 writes) can return
|
||||
/// `>1` to PIPELINE: the loop submits N+1 before polling N, overlapping the convert/copy on the 3D
|
||||
/// engine with the NVENC-ASIC encode of the prior frame, dropping per-frame wall toward `max(...)`.
|
||||
fn pipeline_depth(&self) -> usize {
|
||||
1
|
||||
}
|
||||
}
|
||||
|
||||
/// A deterministic moving test pattern (BGRx). Lets the spike exercise the encode → file →
|
||||
@@ -302,7 +312,11 @@ pub fn open_portal_monitor() -> Result<Box<dyn Capturer>> {
|
||||
/// [`crate::vdisplay::VirtualDisplay`] backend. The captured size is the size the output was
|
||||
/// created at — native, no scaling.
|
||||
#[cfg(target_os = "linux")]
|
||||
pub fn capture_virtual_output(vout: crate::vdisplay::VirtualOutput) -> Result<Box<dyn Capturer>> {
|
||||
pub fn capture_virtual_output(
|
||||
vout: crate::vdisplay::VirtualOutput,
|
||||
_want_hdr: bool,
|
||||
) -> Result<Box<dyn Capturer>> {
|
||||
// The Linux host stays 8-bit (HDR is blocked upstream), so `want_hdr` is unused here.
|
||||
linux::PortalCapturer::from_virtual_output(vout).map(|c| Box::new(c) as Box<dyn Capturer>)
|
||||
}
|
||||
|
||||
@@ -317,7 +331,10 @@ pub(crate) fn wgc_disabled() -> bool {
|
||||
}
|
||||
|
||||
#[cfg(target_os = "windows")]
|
||||
pub fn capture_virtual_output(vout: crate::vdisplay::VirtualOutput) -> Result<Box<dyn Capturer>> {
|
||||
pub fn capture_virtual_output(
|
||||
vout: crate::vdisplay::VirtualOutput,
|
||||
want_hdr: bool,
|
||||
) -> Result<Box<dyn Capturer>> {
|
||||
let target = vout.win_capture.clone().ok_or_else(|| {
|
||||
anyhow::anyhow!(
|
||||
"SudoVDA target not yet an active display (needs a WDDM GPU to activate it)"
|
||||
@@ -325,6 +342,18 @@ pub fn capture_virtual_output(vout: crate::vdisplay::VirtualOutput) -> Result<Bo
|
||||
})?;
|
||||
let pref = vout.preferred_mode;
|
||||
let keep = vout.keepalive;
|
||||
// P2 direct frame push (kill DDA): consume frames straight from the pf-vdisplay driver's shared
|
||||
// ring — no Desktop Duplication, no win32u reparenting hook. Opt-in while it's A/B'd against DDA;
|
||||
// `idd_push` takes the keepalive (owns the virtual display) so there's no fall-through.
|
||||
if std::env::var_os("PUNKTFUNK_IDD_PUSH").is_some() {
|
||||
// Recreate the monitor + ring per session (fix-teardown): a FRESH monitor reliably gets a
|
||||
// working IddCx swap-chain, whereas a REUSED monitor's swap-chain dies after ~2 sessions and
|
||||
// the host can't revive it. The driver's recreate crash (target id resolved to 0) is fixed by
|
||||
// stamping target_id onto the monitor context. The ring is always FP16 (the driver composes
|
||||
// the IDD in FP16); `want_hdr` selects the per-frame conversion (FP16 → Rgb10a2 vs Bgra).
|
||||
return idd_push::IddPushCapturer::open(target, pref, want_hdr, keep)
|
||||
.map(|c| Box::new(c) as Box<dyn Capturer>);
|
||||
}
|
||||
// WGC (Windows.Graphics.Capture) is the default: it captures the COMPOSED desktop including the
|
||||
// overlay/independent-flip planes DXGI Desktop Duplication misses (the frozen-HDR-animation bug),
|
||||
// and has no ACCESS_LOST-on-overlay churn. DDA stays available via PUNKTFUNK_CAPTURE=dda and is
|
||||
@@ -376,7 +405,10 @@ pub fn capture_virtual_output(vout: crate::vdisplay::VirtualOutput) -> Result<Bo
|
||||
}
|
||||
|
||||
#[cfg(not(any(target_os = "linux", target_os = "windows")))]
|
||||
pub fn capture_virtual_output(_vout: crate::vdisplay::VirtualOutput) -> Result<Box<dyn Capturer>> {
|
||||
pub fn capture_virtual_output(
|
||||
_vout: crate::vdisplay::VirtualOutput,
|
||||
_want_hdr: bool,
|
||||
) -> Result<Box<dyn Capturer>> {
|
||||
anyhow::bail!("virtual-output capture requires Linux or Windows")
|
||||
}
|
||||
|
||||
@@ -386,6 +418,8 @@ pub mod composed_flip;
|
||||
pub mod desktop_watch;
|
||||
#[cfg(target_os = "windows")]
|
||||
pub mod dxgi;
|
||||
#[cfg(target_os = "windows")]
|
||||
pub mod idd_push;
|
||||
#[cfg(target_os = "linux")]
|
||||
mod linux;
|
||||
#[cfg(target_os = "windows")]
|
||||
|
||||
Reference in New Issue
Block a user