feat(windows-drivers): STEP 6 — IDD-push FramePublisher (driver) + host migration to proto::frame
apple / swift (push) Failing after 1s
apple / screenshots (push) Has been skipped
windows-drivers / probe-and-proto (push) Successful in 19s
windows-drivers / driver-build (push) Successful in 1m9s
ci / rust (push) Successful in 1m31s
ci / web (push) Successful in 42s
ci / docs-site (push) Successful in 1m2s
android / android (push) Successful in 3m50s
deb / build-publish (push) Successful in 2m37s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
windows-host / package (push) Successful in 5m20s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m37s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m32s
docker / deploy-docs (push) Successful in 16s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m19s
The driver now publishes each acquired swap-chain surface into the host-created shared ring (the IDD-push path); the full glass-to-glass transport is code-complete. Both sides use the canonical pf_vdisplay_proto::frame layout (lockstep by compile-error, not "must match" comments).

Driver compiles + LOADS on-glass (adapter inits, Status=OK; no regression: the publisher is dormant until a frame is acquired); host cargo check green; adversarially reviewed (no blockers: token layout, keyed-mutex key 0, names by target_id, and the format guard all match the host consumer).

- new driver frame_transport.rs: FramePublisher OPENS the host ring by target_id (OpenFileMapping header + magic Acquire readiness gate + OpenEvent + OpenSharedResourceByName RING_LEN keyed-mutex textures), writes its render LUID + DRV_STATUS back into the header; publish() is NON-BLOCKING (round-robin 0ms try-acquire -> CopyResource -> ReleaseSync -> FrameToken::pack store Release -> SetEvent; drops the frame if every slot is busy or the surface format != the ring format). Manual handle/view cleanup on every try_open early return; RAII Drop (slots -> unmap -> CloseHandle). Layout/consts/names/token all from pf_vdisplay_proto::frame.
- swap_chain_processor.rs run_core: lazy rate-limited attach (every ~30 frames) + is_stale re-attach (mid-session HDR ring recreate); publishes buffer.MetaData.pSurface via IDXGIResource::from_raw_borrowed (preserves IddCx's refcount) BEFORE IddCxSwapChainFinishedProcessingFrame. run/run_core gain the render LUID; callbacks.rs assign_swap_chain passes it.
- host idd_push.rs migrated onto pf_vdisplay_proto::frame (deleted the hand-rolled SharedHeader / MAGIC / VERSION / RING_LEN / DRV_STATUS_* / name fns / token packing); pure refactor, byte-identical, no behavior or gating change. DebugBlock + DXGI_SHARED_RESOURCE_RW kept local (not in the proto).
- driver windows crate gains Win32_System_Memory (MapViewOfFile/OpenFileMappingW/...); rustfmt'd the whole driver workspace (incl. wdk-probe, fmt-only).

Built via the ultracode flow: STEP-6 map workflow -> agent-implement -> box build (driver + host both green; caught nothing this time) -> adversarial-verify-agent (no blockers) -> FrameToken::pack hardening -> deploy (loads).

Glass-to-glass frame validation awaits a composited session (per the parity finding: this headless box yields 0 frames for the proven SudoVDA path too).

FOLLOW-UPs: port the optional Global\pfvd-dbg DebugBlock triage channel to the new driver; STEP 7 HDR; STEP 8 drop SudoVDA.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
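Review note: the non-blocking publish described in the first bullet can be modelled without any D3D11 at all. The sketch below is an illustration under stated assumptions, not the driver's code: `PublisherModel` is a hypothetical name, each slot is a plain `Mutex<Vec<u8>>` standing in for a keyed-mutex texture, `try_lock` stands in for the 0 ms acquire, a byte copy stands in for `CopyResource`, and the event signal is omitted. The token is packed inline using the `(generation << 40) | (seq << 8) | slot` layout quoted in the diff.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical model of the driver-side publisher (not the real FramePublisher).
pub struct PublisherModel {
    /// Each slot stands in for one keyed-mutex ring texture.
    pub slots: Vec<Arc<Mutex<Vec<u8>>>>,
    /// Stands in for the `latest` field of the shared header.
    pub latest: AtomicU64,
    format: u32,
    generation: u32,
    next: usize,
    seq: u32,
}

impl PublisherModel {
    pub fn new(ring_len: usize, format: u32, generation: u32) -> Self {
        Self {
            slots: (0..ring_len).map(|_| Arc::new(Mutex::new(Vec::new()))).collect(),
            latest: AtomicU64::new(0),
            format,
            generation,
            next: 0,
            seq: 0,
        }
    }

    /// Non-blocking publish. Returns the slot written, or `None` when the frame
    /// is dropped (format mismatch, or every slot is currently held).
    pub fn publish(&mut self, surface: &[u8], surface_format: u32) -> Option<usize> {
        if surface_format != self.format {
            return None; // format guard: never write a mismatched surface into the ring
        }
        let n = self.slots.len();
        for i in 0..n {
            let idx = (self.next + i) % n; // round-robin starting at the cursor
            // `try_lock` never waits, like an acquire with a 0 ms timeout.
            let Ok(mut pixels) = self.slots[idx].try_lock() else {
                continue;
            };
            pixels.clear();
            pixels.extend_from_slice(surface); // stand-in for the GPU copy
            drop(pixels); // release the slot before announcing it
            self.seq = self.seq.wrapping_add(1);
            let token = (u64::from(self.generation) << 40)
                | (u64::from(self.seq) << 8)
                | idx as u64;
            // Release store: a consumer that Acquire-loads the token sees the slot contents.
            self.latest.store(token, Ordering::Release);
            self.next = (idx + 1) % n;
            return Some(idx);
        }
        None // every slot busy: drop the frame rather than stall the caller
    }
}
```

Holding a lock on one slot from the consumer side and calling `publish` shows the intended behavior: the publisher skips the busy slot, and only returns `None` when all slots are held or the format differs.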
@@ -4,13 +4,16 @@
 //! event + ring of keyed-mutex textures (`Global\` names, permissive `D:(A;;GA;;;WD)` SDDL) on the
 //! discrete render GPU, and the driver only OPENS them and copies frames in. We then consume the ring
 //! straight into the zero-copy NVENC path — no DXGI Desktop Duplication, no `win32u` hook. Gated by
-//! `PUNKTFUNK_IDD_PUSH`. Driver counterpart: `packaging/windows/vdisplay-driver/pf-vdisplay/src/
-//! frame_transport.rs` — [`SharedHeader`], [`MAGIC`], [`RING_LEN`], the status codes and the `Global\`
-//! name scheme are DUPLICATED byte-identically there.
+//! `PUNKTFUNK_IDD_PUSH`. Driver counterpart: `packaging/windows/drivers/pf-vdisplay/src/
+//! frame_transport.rs`. The shared `SharedHeader` layout, `MAGIC`/`VERSION`/`RING_LEN`, the
+//! `DRV_STATUS_*` codes, the `Global\` name scheme and the publish token all come from
+//! [`pf_vdisplay_proto::frame`] (which OWNS the contract, with `const` size asserts) — both sides
+//! `use` it, so drift is a compile error rather than a "must match" comment.
 
 use super::dxgi::{make_device, D3d11Frame, HdrConverter, WinCaptureTarget};
 use super::{CapturedFrame, Capturer, FramePayload, PixelFormat};
 use anyhow::{bail, Context, Result};
+use pf_vdisplay_proto::frame;
 use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
 use std::sync::Mutex;
 use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
@@ -39,45 +42,26 @@ use windows::Win32::System::Memory::{
 };
 use windows::Win32::System::Threading::{CreateEventW, WaitForSingleObject};
 
-// --- kept byte-identical with the driver (frame_transport.rs) ---
-pub const MAGIC: u32 = 0x4456_4650;
-pub const VERSION: u32 = 1;
-/// Ring slots — MUST equal the driver's `RING_LEN` (frame_transport.rs). 6 (was 3) gives ample headroom
-/// so the driver's 0 ms-timeout publish always finds a free slot while the host briefly holds one across
-/// the convert/copy into its output ring and the depth-2 pipelined encode runs on the rest.
-pub const RING_LEN: u32 = 6;
-const DXGI_SHARED_RESOURCE_RW: u32 = 0x8000_0000 | 0x1;
+// The frame-transport contract — `SharedHeader` layout, `MAGIC`/`VERSION`/`RING_LEN`, the
+// `DRV_STATUS_*` codes and the `Global\` name helpers — lives in `pf_vdisplay_proto::frame`; both sides
+// `use frame::*`, so a layout/name/code drift is a compile error (the proto has `const` size asserts).
+use frame::{
+    event_name, header_name, texture_name, SharedHeader, DRV_STATUS_NO_DEVICE1, DRV_STATUS_OPENED,
+    DRV_STATUS_TEX_FAIL, MAGIC, RING_LEN, VERSION,
+};
 
-// driver_status codes (the driver writes these; we read+log them).
-const DRV_STATUS_OPENED: u32 = 1;
-const DRV_STATUS_TEX_FAIL: u32 = 2;
-const DRV_STATUS_NO_DEVICE1: u32 = 3;
+/// `DXGI_SHARED_RESOURCE_READ | _WRITE` for `CreateSharedHandle`/`OpenSharedResourceByName`. Local (not
+/// part of the proto contract — it is a DXGI sharing-API arg, mirrored on the driver side).
+const DXGI_SHARED_RESOURCE_RW: u32 = 0x8000_0000 | 0x1;
 
 /// Host-owned output-ring depth: distinct NVENC-input textures rotated per frame so the in-flight
 /// encode of frame N and the convert/copy of frame N+1 never touch the same texture. 3 covers a
 /// pipeline depth of 2 with one slot of margin.
 const OUT_RING: usize = 3;
 
-#[repr(C)]
-struct SharedHeader {
-    magic: u32,
-    version: u32,
-    generation: u32,
-    ring_len: u32,
-    width: u32,
-    height: u32,
-    dxgi_format: u32,
-    _pad: u32,
-    latest: u64,
-    qpc_pts: u64,
-    driver_render_luid_low: u32,
-    driver_render_luid_high: i32,
-    driver_status: u32,
-    driver_status_detail: u32,
-}
-
 /// Bring-up debug block (fixed name) — the host creates it; the driver writes diagnostics into it
-/// independent of the per-target header. Byte-identical with the driver's `DebugBlock`.
+/// independent of the per-target header. NOT part of `pf_vdisplay_proto` (a host-side bring-up channel,
+/// not the data path); the matching `DebugBlock` lives in the OLD oracle driver's `frame_transport.rs`.
 #[repr(C)]
 struct DebugBlock {
     magic: u32,
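Review note: the new comments say the proto pins the header layout with `const` size asserts. The following is a minimal sketch of that technique, assuming the field list of the `SharedHeader` deleted in this hunk; `HeaderSketch` is a hypothetical name, and the expected size of 64 bytes is derived here from those fields, not taken from `pf_vdisplay_proto`.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical copy of the header fields removed from idd_push.rs in this hunk.
#[repr(C)]
pub struct HeaderSketch {
    pub magic: u32,
    pub version: u32,
    pub generation: u32,
    pub ring_len: u32,
    pub width: u32,
    pub height: u32,
    pub dxgi_format: u32,
    pub _pad: u32, // keeps the following u64 fields 8-byte aligned without implicit padding
    pub latest: u64,
    pub qpc_pts: u64,
    pub driver_render_luid_low: u32,
    pub driver_render_luid_high: i32,
    pub driver_status: u32,
    pub driver_status_detail: u32,
}

// Evaluated at compile time: if a field is added, removed or resized, the crate
// stops building instead of silently disagreeing with the other process.
const _: () = assert!(size_of::<HeaderSketch>() == 64);
const _: () = assert!(align_of::<HeaderSketch>() == 8);

/// Size a host would pass when creating the mapping for this header.
pub fn header_bytes() -> usize {
    size_of::<HeaderSketch>()
}
```

Because both processes would compile against the same definition, a size assert like this only needs to live in one crate; the duplicated definitions it replaces had no such guard.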
@@ -94,17 +78,6 @@ struct DebugBlock {
 const DBG_NAME: &str = "Global\\pfvd-dbg";
 const DBG_MAGIC: u32 = 0x4742_4450;
 
-fn hdr_name(target_id: u32) -> String {
-    format!("Global\\pfvd-hdr-{target_id}")
-}
-fn evt_name(target_id: u32) -> String {
-    format!("Global\\pfvd-evt-{target_id}")
-}
-fn tex_name(target_id: u32, generation: u32, slot: u32) -> String {
-    format!("Global\\pfvd-tex-{target_id}-{generation}-{slot}")
-}
-// ----------------------------------------------------------------
-
 /// Monotonic per-process generation: each capturer instance stamps its ring-texture names with a
 /// fresh value so a retried/overlapping `open()` never collides with a previous attempt's not-yet-
 /// released shared-handle names (`DXGI_ERROR_NAME_ALREADY_EXISTS`). The driver reads it from the header.
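Review note: the generation comment above explains why texture names carry a per-open stamp. A small sketch of the idea, assuming the name format of the `tex_name` helper deleted in this hunk; `next_generation` and `texture_name_sketch` are hypothetical names, and starting the counter at 1 is an assumption made here so that 0 stays free as a "nothing published" sentinel.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Process-wide counter; each call hands out a value no earlier call returned
/// (until the u32 wraps, which this sketch ignores).
static NEXT_GENERATION: AtomicU32 = AtomicU32::new(1);

pub fn next_generation() -> u32 {
    // Relaxed is enough: only uniqueness of the returned value matters here.
    NEXT_GENERATION.fetch_add(1, Ordering::Relaxed)
}

/// Same format string as the removed `tex_name` helper.
pub fn texture_name_sketch(target_id: u32, generation: u32, slot: u32) -> String {
    format!("Global\\pfvd-tex-{target_id}-{generation}-{slot}")
}
```

Two overlapping opens of the same target then produce disjoint name sets, so the second open cannot collide with shared handles the first has not released yet.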
@@ -339,7 +312,7 @@ impl IddPushCapturer {
     .CreateSharedHandle(
         Some(&sa as *const SECURITY_ATTRIBUTES),
         DXGI_SHARED_RESOURCE_RW,
-        &HSTRING::from(tex_name(target_id, generation, k)),
+        &HSTRING::from(texture_name(target_id, generation, k)),
     )
     .context("CreateSharedHandle(IDD-push ring slot)")?;
 let mutex: IDXGIKeyedMutex = tex.cast()?;
@@ -406,7 +379,7 @@ impl IddPushCapturer {
     PAGE_READWRITE,
     0,
     bytes as u32,
-    &HSTRING::from(hdr_name(target.target_id)),
+    &HSTRING::from(header_name(target.target_id)),
 )
 .context("CreateFileMapping(IDD-push header)")?;
 let view = MapViewOfFile(map, FILE_MAP_ALL_ACCESS, 0, 0, bytes);
@@ -431,7 +404,7 @@ impl IddPushCapturer {
     Some(&sa),
     false,
     false,
-    &HSTRING::from(evt_name(target.target_id)),
+    &HSTRING::from(event_name(target.target_id)),
 )
 .context("CreateEvent(IDD-push)")?;
 
@@ -719,14 +692,16 @@ impl IddPushCapturer {
 // Follow the display: a "Use HDR" flip recreates the ring at the matching format.
 self.poll_display_hdr();
 let latest = self.latest();
-// `latest` = (generation << 40) | (seq << 8) | slot. Reject any publish whose generation isn't
-// our CURRENT ring (a stale old-ring publish racing a recreate, or the 0 sentinel we reset to) so
-// we never consume an unwritten new-ring slot — eliminating the toggle-time garbage frame.
-if (latest >> 40) as u32 != self.generation {
+// `latest` is the proto publish token `(generation << 40) | (seq << 8) | slot`. Reject any publish
+// whose generation isn't our CURRENT ring (a stale old-ring publish racing a recreate, or the 0
+// sentinel we reset to) so we never consume an unwritten new-ring slot — eliminating the
+// toggle-time garbage frame.
+let tok = frame::FrameToken::unpack(latest);
+if tok.generation != self.generation {
     return Ok(None);
 }
-let seq = (latest >> 8) & 0xFFFF_FFFF;
-let slot = (latest & 0xff) as usize;
+let seq = u64::from(tok.seq);
+let slot = tok.slot as usize;
 if seq == self.last_seq || slot >= self.slots.len() {
     return Ok(None);
 }
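Review note: the hunk above swaps hand-written shifts for `frame::FrameToken::unpack`. The sketch below shows one way such a token type could look, using the bit layout stated in the comment; `TokenSketch` and `accept` are hypothetical names, the field widths (24-bit generation, 32-bit seq, 8-bit slot) are inferred from the removed shift-and-mask code, and masking the generation in `pack` is an assumption, not a confirmed detail of the proto.

```rust
/// Hypothetical mirror of the publish token `(generation << 40) | (seq << 8) | slot`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TokenSketch {
    pub generation: u32, // only the low 24 bits fit above bit 40
    pub seq: u32,
    pub slot: u8,
}

impl TokenSketch {
    pub fn pack(self) -> u64 {
        // Mask the generation so an oversized value cannot shift bits out of the u64.
        (u64::from(self.generation & 0x00FF_FFFF) << 40)
            | (u64::from(self.seq) << 8)
            | u64::from(self.slot)
    }

    pub fn unpack(raw: u64) -> Self {
        Self {
            generation: (raw >> 40) as u32,
            seq: ((raw >> 8) & 0xFFFF_FFFF) as u32,
            slot: (raw & 0xFF) as u8,
        }
    }
}

/// Consumer-side gate modelled on the checks in the hunk: returns `(seq, slot)`
/// only for a fresh publish that belongs to the current ring.
pub fn accept(
    raw: u64,
    current_generation: u32,
    last_seq: u64,
    ring_len: usize,
) -> Option<(u64, usize)> {
    let tok = TokenSketch::unpack(raw);
    if tok.generation != current_generation {
        return None; // stale ring, or the 0 sentinel
    }
    let seq = u64::from(tok.seq);
    let slot = tok.slot as usize;
    if seq == last_seq || slot >= ring_len {
        return None; // already consumed, or a slot index outside the ring
    }
    Some((seq, slot))
}
```

With these definitions, `unpack` reproduces the three removed expressions exactly, which is what makes the host-side change a pure refactor.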