feat(windows-drivers): STEP 6 — IDD-push FramePublisher (driver) + host migration to proto::frame

The driver now publishes each acquired swap-chain surface into the host-created shared ring (the
IDD-push path) — the full glass-to-glass transport is code-complete. Both sides use the canonical
pf_vdisplay_proto::frame layout (lockstep by compile-error, not "must match" comments). Driver compiles
+ LOADS on-glass (adapter inits, Status=OK; no regression — the publisher is dormant until a frame is
acquired); host cargo check green; adversarially reviewed (no blockers — token layout, keyed-mutex key 0,
names by target_id, and the format guard all match the host consumer).
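
The "lockstep by compile-error" idea can be sketched roughly as below. This is an illustration under
assumptions, not the actual pf_vdisplay_proto::frame source: `DemoHeader`, its field order and the sizes
asserted here are hypothetical stand-ins for the real `SharedHeader`.

```rust
use core::mem::{offset_of, size_of};

/// Hypothetical stand-in for a shared-memory header. The real `SharedHeader`
/// fields and sizes live in `pf_vdisplay_proto::frame` and are not reproduced here.
#[repr(C)]
pub struct DemoHeader {
    pub magic: u32,
    pub generation: u32,
    pub ring_len: u32,
    pub dxgi_format: u32,
    pub latest: u64,
}

// Compile-time layout checks. If a field is added, removed or reordered, the
// build fails for every crate that depends on this single shared definition,
// so driver and host cannot silently drift apart.
const _: () = assert!(size_of::<DemoHeader>() == 24);
const _: () = assert!(offset_of!(DemoHeader, latest) == 16);
```

Because both sides import the one definition, a layout change surfaces as a build failure in whichever
crate made a conflicting assumption, rather than as a runtime mismatch across the shared mapping.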

- new driver frame_transport.rs: FramePublisher OPENS the host ring by target_id (OpenFileMapping header
  + magic Acquire readiness gate + OpenEvent + OpenSharedResourceByName RING_LEN keyed-mutex textures),
  writes its render LUID + DRV_STATUS back into the header; publish() is NON-BLOCKING (round-robin 0ms
  try-acquire -> CopyResource -> ReleaseSync -> FrameToken::pack store Release -> SetEvent; drops the
  frame if every slot is busy or the surface format != the ring format). Manual handle/view cleanup on
  every try_open early return; RAII Drop (slots -> unmap -> CloseHandle). Layout/consts/names/token all
  from pf_vdisplay_proto::frame.
- swap_chain_processor.rs run_core: lazy rate-limited attach (every ~30 loop iterations) + is_stale re-attach
  (mid-session HDR ring recreate); publishes buffer.MetaData.pSurface via IDXGIResource::from_raw_borrowed
  (preserves IddCx's refcount) BEFORE IddCxSwapChainFinishedProcessingFrame. run/run_core gain the render
  LUID; callbacks.rs assign_swap_chain passes it.
- host idd_push.rs migrated onto pf_vdisplay_proto::frame (deleted the hand-rolled SharedHeader / MAGIC /
  VERSION / RING_LEN / DRV_STATUS_* / name fns / token packing) — pure refactor, byte-identical, no
  behavior or gating change. DebugBlock + DXGI_SHARED_RESOURCE_RW kept local (not in the proto).
- driver windows crate gains Win32_System_Memory (MapViewOfFile/OpenFileMappingW/...); rustfmt'd the whole
  driver workspace (incl. wdk-probe — fmt-only).
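
The non-blocking publish in the first bullet can be modelled without any D3D types. The sketch below is
a simplified stand-in, not the driver code: `DemoRing`, `pack_token` and `unpack_token` are hypothetical
names, each `AtomicBool` plays the role of a slot's keyed mutex, and masking the generation to 24 bits is
an assumption of this sketch (the real packing is owned by `FrameToken::pack`).

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};

/// Pack a token as `(generation << 40) | (seq << 8) | slot`.
/// Masking `generation` to 24 bits is an assumption of this sketch.
pub fn pack_token(generation: u32, seq: u32, slot: u8) -> u64 {
    ((u64::from(generation) & 0x00FF_FFFF) << 40) | (u64::from(seq) << 8) | u64::from(slot)
}

/// Inverse of `pack_token`: returns `(generation, seq, slot)`.
pub fn unpack_token(token: u64) -> (u32, u32, u8) {
    ((token >> 40) as u32, (token >> 8) as u32, token as u8)
}

/// Stand-in for the shared ring: a `busy` flag set to `true` means the
/// consumer currently holds that slot.
pub struct DemoRing {
    busy: Vec<AtomicBool>,
    latest: AtomicU64,
    next: usize,
    seq: u32,
    generation: u32,
}

impl DemoRing {
    pub fn new(len: usize, generation: u32) -> Self {
        Self {
            busy: (0..len).map(|_| AtomicBool::new(false)).collect(),
            latest: AtomicU64::new(0),
            next: 0,
            seq: 0,
            generation,
        }
    }

    /// Simulate the consumer taking (`true`) or releasing (`false`) a slot.
    pub fn set_busy(&self, slot: usize, held: bool) {
        self.busy[slot].store(held, Ordering::Release);
    }

    /// Read the most recently published token.
    pub fn latest(&self) -> u64 {
        self.latest.load(Ordering::Acquire)
    }

    /// Try each slot once, starting at `next`, and never wait.
    /// Returns the slot written, or `None` if the frame was dropped.
    pub fn publish(&mut self) -> Option<usize> {
        let len = self.busy.len();
        for attempt in 0..len {
            let slot = (self.next + attempt) % len;
            // Zero-timeout try-acquire: fails immediately if the slot is held.
            if self.busy[slot]
                .compare_exchange(false, true, Ordering::Acquire, Ordering::Relaxed)
                .is_err()
            {
                continue;
            }
            // (The real driver copies the surface into the slot texture here.)
            self.busy[slot].store(false, Ordering::Release); // release the slot
            self.seq = self.seq.wrapping_add(1);
            let token = pack_token(self.generation, self.seq, slot as u8);
            // Publish only after the copy is queued and the slot is released.
            self.latest.store(token, Ordering::Release);
            self.next = (slot + 1) % len;
            return Some(slot);
        }
        None // every slot busy: drop the frame rather than block
    }
}
```

In this model a consumer that holds one slot only makes the publisher skip ahead, and a consumer that
holds every slot costs exactly one dropped frame; the publishing thread never waits.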

Built via the ultracode flow: STEP-6 map workflow -> agent-implement -> box build (driver + host both
green; caught nothing this time) -> adversarial-verify-agent (no blockers) -> FrameToken::pack hardening
-> deploy (loads). Glass-to-glass frame validation awaits a composited session (per the parity finding:
this headless box yields 0 frames for the proven SudoVDA path too). FOLLOW-UPs: port the optional
Global\pfvd-dbg DebugBlock triage channel to the new driver; STEP 7 HDR; STEP 8 drop SudoVDA.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 10:28:47 +00:00
parent 590ceaa850
commit e2f004589c
10 changed files with 449 additions and 80 deletions
@@ -214,7 +214,17 @@ pub unsafe extern "C" fn assign_swap_chain(
if let Some(device) = crate::direct_3d_device::pooled_device(luid) {
let mut processor = crate::swap_chain_processor::SwapChainProcessor::new();
processor.run(swap_chain, device, new_frame_event, target_id);
// STEP 6: the publisher reports this render LUID into the host header so the host detects a
// render-adapter mismatch (it created the ring textures on its own GPU). `luid` is the OS-picked
// render adapter built above.
processor.run(
swap_chain,
device,
new_frame_event,
target_id,
luid.LowPart,
luid.HighPart,
);
// Install on the monitor; drop any processor it replaced (a race lost above) OUTSIDE the lock.
drop(crate::monitor::set_swap_chain_processor(monitor, processor));
} else {
@@ -5,8 +5,9 @@
//! D3D/DXGI types are the `windows` crate (refcounted COM, no manual Drop); the swap-chain/LUID hand-off
//! to the wdk-sys IddCx world happens via raw pointers in `swap_chain_processor.rs`.
//!
//! STEP 5 only DRAINS the swap-chain to keep the monitor a live display — there is no frame publisher,
//! so the device's immediate context is unused here (it returns to use in STEP 6's `CopyResource`).
//! STEP 5 binds this device to the swap-chain to keep the monitor a live display; STEP 6 reuses the
//! device's immediate context in the frame publisher's `CopyResource` (both on the swap-chain processor
//! thread, the one thread this device is touched from).
use std::sync::atomic::{AtomicI32, Ordering};
use std::sync::{Arc, Mutex};
@@ -54,8 +55,6 @@ pub struct Direct3DDevice {
pub device: ID3D11Device,
/// The single (SINGLETHREADED) immediate context — used by STEP 6's frame-push publisher's
/// `CopyResource` on the swap-chain processor thread (the one thread this device is touched from).
/// Unused in STEP 5 (drain-only); kept so the device matches the oracle exactly.
#[allow(dead_code)]
pub device_context: ID3D11DeviceContext,
}
@@ -0,0 +1,317 @@
//! STEP 6 — IDD-push frame publisher (DRIVER side).
//!
//! The restricted WUDFHost token canNOT create named kernel objects (proven on the RTX box: it can't
//! even write a world-writable file), so — exactly like the gamepad UMDF drivers
//! (`crates/punktfunk-host/src/inject/dualsense_windows.rs`: *"the host creates the section, privileged,
//! with a permissive SDDL so the WUDFHost can open it; the driver maps it"*) — the **host** creates the
//! shared header + frame-ready event + ring of keyed-mutex textures, and the driver only **OPENS** them.
//! The driver writes its actual render-adapter LUID + a status code back into the host-created header (our
//! only driver-visibility channel: UMDF hides OutputDebugString in ETW and the token can't write files),
//! then copies each acquired swap-chain surface into the next ring slot and signals the host.
//!
//! Host counterpart: `crates/punktfunk-host/src/capture/idd_push.rs`. The shared `SharedHeader` layout,
//! the [`FrameToken`] packing, the `Global\` object-name scheme, the `MAGIC`/`RING_LEN` and the
//! `DRV_STATUS_*` codes are NOT hand-duplicated here: both sides `use pf_vdisplay_proto::frame::*`, which
//! OWNS the contract (with `const` size asserts so any drift is a compile error).
//!
//! Ported from the proven oracle (`packaging/windows/vdisplay-driver/pf-vdisplay/src/frame_transport.rs`).
//! Differences from the oracle:
//! * the layout/consts/names/token come from `pf_vdisplay_proto::frame` instead of being re-declared;
//! * `dbglog!` replaces `log::info!`;
//! * the optional fixed-name `Global\pfvd-dbg` `DebugBlock` bring-up channel is SKIPPED (not on the data
//! path). FOLLOW-UP: if the host bring-up diagnostics are needed again, port the oracle's `DebugBlock`
//! here too (it is owned by `idd_push.rs`, not the proto).
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};
use pf_vdisplay_proto::frame::{
DRV_STATUS_NO_DEVICE1, DRV_STATUS_OPENED, DRV_STATUS_TEX_FAIL, FrameToken, MAGIC, RING_LEN,
SharedHeader, event_name, header_name, texture_name,
};
use windows::Win32::Foundation::{CloseHandle, HANDLE};
use windows::Win32::Graphics::Direct3D11::{
D3D11_TEXTURE2D_DESC, ID3D11Device, ID3D11Device1, ID3D11DeviceContext, ID3D11Texture2D,
};
use windows::Win32::Graphics::Dxgi::IDXGIKeyedMutex;
use windows::Win32::System::Memory::{
FILE_MAP_ALL_ACCESS, MEMORY_MAPPED_VIEW_ADDRESS, MapViewOfFile, OpenFileMappingW,
UnmapViewOfFile,
};
use windows::Win32::System::Threading::{OpenEventW, SYNCHRONIZATION_ACCESS_RIGHTS, SetEvent};
use windows::core::{HSTRING, Interface};
/// `DXGI_SHARED_RESOURCE_READ | _WRITE` — passed to `OpenSharedResourceByName` (matches the host's
/// `CreateSharedHandle` access). Kept local: it is an `OpenSharedResourceByName` arg, not part of the
/// proto contract. (Same value the host uses in `idd_push.rs`.)
const DXGI_SHARED_RESOURCE_RW: u32 = 0x8000_0000 | 0x1;
/// SYNCHRONIZE | EVENT_MODIFY_STATE — the driver does not wait on the event, only SIGNALS it.
const EVENT_ACCESS: u32 = 0x0010_0000 | 0x0002;
/// `WAIT_TIMEOUT` as an HRESULT — `AcquireSync` returns this when the slot is held by the consumer.
const WAIT_TIMEOUT_HRESULT: i32 = 0x0000_0102;
struct Slot {
tex: ID3D11Texture2D,
mutex: IDXGIKeyedMutex,
}
/// Publishes acquired swap-chain surfaces into the HOST-created ring. Owned by the swap-chain processor
/// thread; attached lazily once the host has created the shared objects.
pub struct FramePublisher {
context: ID3D11DeviceContext,
map: HANDLE,
header: *mut SharedHeader,
event: HANDLE,
slots: Vec<Slot>,
next: u32,
seq: u64,
/// The host-created ring textures' DXGI format (from the shared header). A swap-chain surface whose
/// format differs (e.g. an FP16 HDR frame vs a BGRA ring) is dropped in `publish` — `CopyResource`
/// needs matching formats.
ring_format: u32,
/// The ring generation this publisher attached to. The host BUMPS the header generation when it
/// recreates the ring at a new format mid-session (the display's HDR mode flipped) — [`Self::is_stale`]
/// detects that so `run_core` re-attaches to the new-format textures instead of dropping every frame.
generation: u32,
}
// SAFETY: created and used only on the swap-chain processor thread.
unsafe impl Send for FramePublisher {}
impl FramePublisher {
/// Try ONCE to attach to the host-created shared objects. Returns `Err` cheaply if the host hasn't
/// created/published them yet — the drain loop retries periodically, so a non-IDD-push session just
/// keeps draining with no stall. All early-return paths clean up the handles/mapping they opened
/// explicitly (raw-handle style, no RAII — matches the rest of this driver).
pub fn try_open(
target_id: u32,
render_luid_low: u32,
render_luid_high: i32,
device: &ID3D11Device,
context: &ID3D11DeviceContext,
) -> windows::core::Result<Self> {
// 1. Open the host-created header (RW). Err if the host hasn't created it yet.
let map = unsafe {
OpenFileMappingW(
FILE_MAP_ALL_ACCESS.0,
false,
&HSTRING::from(header_name(target_id)),
)?
};
let view = unsafe {
MapViewOfFile(
map,
FILE_MAP_ALL_ACCESS,
0,
0,
core::mem::size_of::<SharedHeader>(),
)
};
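if view.Value.is_null() {
// Capture the MapViewOfFile error BEFORE CloseHandle, which may overwrite the thread's last error.
let err = windows::core::Error::from_win32();
unsafe {
let _ = CloseHandle(map);
}
return Err(err);
}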
let header = view.Value.cast::<SharedHeader>();
// 2. Report our render adapter to the host immediately (lets it detect a mismatch).
unsafe {
(*header).driver_render_luid_low = render_luid_low;
(*header).driver_render_luid_high = render_luid_high;
}
// 3. The host sets magic==MAGIC only once the ring textures exist. Not ready → retry later.
let magic = unsafe {
(*(core::ptr::addr_of!((*header).magic) as *const AtomicU32)).load(Ordering::Acquire)
};
if magic != MAGIC {
unsafe {
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: header.cast(),
});
let _ = CloseHandle(map);
}
return Err(windows::core::Error::from_win32());
}
let (generation, ring_len) =
unsafe { ((*header).generation, (*header).ring_len.min(RING_LEN)) };
// 4. Open the event (SYNCHRONIZE | EVENT_MODIFY_STATE so we can SetEvent).
let event = match unsafe {
OpenEventW(
SYNCHRONIZATION_ACCESS_RIGHTS(EVENT_ACCESS),
false,
&HSTRING::from(event_name(target_id)),
)
} {
Ok(e) => e,
Err(e) => {
unsafe {
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: header.cast(),
});
let _ = CloseHandle(map);
}
return Err(e);
}
};
// 5. Open device1 + the ring textures the host created (same render adapter required).
let device1: ID3D11Device1 = match device.cast() {
Ok(d) => d,
Err(e) => {
unsafe {
(*header).driver_status = DRV_STATUS_NO_DEVICE1;
let _ = CloseHandle(event);
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: header.cast(),
});
let _ = CloseHandle(map);
}
return Err(e);
}
};
let mut slots = Vec::new();
for k in 0..ring_len {
let name = HSTRING::from(texture_name(target_id, generation, k));
let opened: windows::core::Result<ID3D11Texture2D> =
unsafe { device1.OpenSharedResourceByName(&name, DXGI_SHARED_RESOURCE_RW) };
match opened {
Ok(tex) => match tex.cast::<IDXGIKeyedMutex>() {
Ok(mutex) => slots.push(Slot { tex, mutex }),
Err(e) => {
unsafe {
(*header).driver_status = DRV_STATUS_TEX_FAIL;
(*header).driver_status_detail = e.code().0 as u32;
let _ = CloseHandle(event);
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: header.cast(),
});
let _ = CloseHandle(map);
}
return Err(e);
}
},
Err(e) => {
// Most likely a render-adapter mismatch (the host made the textures on a different
// GPU than the swap-chain renders on). Tell the host so it can report it.
unsafe {
(*header).driver_status = DRV_STATUS_TEX_FAIL;
(*header).driver_status_detail = e.code().0 as u32;
let _ = CloseHandle(event);
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: header.cast(),
});
let _ = CloseHandle(map);
}
return Err(e);
}
}
}
unsafe {
(*header).driver_status = DRV_STATUS_OPENED;
}
dbglog!(
"[pf-vd] frame-push(driver): attached to host ring gen {generation} ({ring_len} slots)"
);
Ok(Self {
context: context.clone(),
map,
header,
event,
slots,
next: 0,
seq: 0,
ring_format: unsafe { (*header).dxgi_format },
generation,
})
}
#[inline]
fn latest_cell(&self) -> &AtomicU64 {
unsafe { &*(core::ptr::addr_of!((*self.header).latest) as *const AtomicU64) }
}
/// True once the host has recreated the ring (bumped the header generation) — e.g. the display's HDR
/// mode flipped, so the ring format changed (FP16 ⇄ BGRA) and the texture names now carry a new
/// generation. `run_core` drops the publisher on this so it re-attaches to the new ring.
pub fn is_stale(&self) -> bool {
let cur = unsafe {
(*(core::ptr::addr_of!((*self.header).generation) as *const AtomicU32))
.load(Ordering::Acquire)
};
cur != self.generation
}
/// Copy `surface` into the next free ring slot and signal the host. Never blocks (0 ms try-acquire).
pub fn publish(&mut self, surface: &ID3D11Texture2D) {
let ring_len = self.slots.len() as u32;
if ring_len == 0 {
return;
}
// Format guard: `CopyResource` needs the surface + ring textures to share a DXGI format. Drop a
// frame that doesn't match (e.g. an FP16 HDR surface arriving while the ring is still BGRA, before
// the host recreates the ring as FP16) instead of corrupting / failing the copy.
let mut desc = D3D11_TEXTURE2D_DESC::default();
unsafe { surface.GetDesc(&mut desc) };
if desc.Format.0 as u32 != self.ring_format {
return;
}
let start = self.next;
for attempt in 0..ring_len {
let slot = (start + attempt) % ring_len;
let s = &self.slots[slot as usize];
match unsafe { s.mutex.AcquireSync(0, 0) } {
Ok(()) => {
// STRAIGHT-LINE, NO `?` between acquire + release — a `?`-return here would leak the
// keyed-mutex lock and wedge the host on this slot. The ordering below is load-bearing:
// the CopyResource is GPU-ordered before the consumer via the slot keyed mutex, and the
// `latest` store (Release) publishes the slot only AFTER the copy is queued + the mutex
// released.
unsafe {
self.context.CopyResource(&s.tex, surface);
let _ = s.mutex.ReleaseSync(0);
}
self.seq = self.seq.wrapping_add(1);
// `latest` = (generation << 40) | (seq << 8) | slot, packed by the proto's `FrameToken`
// (single source of truth — the host unpacks with the same type). Stamping the generation
// lets the host REJECT a publish from a stale ring (an old-generation publisher racing the
// host's mid-session ring recreate) so it never consumes an unwritten new-ring slot.
let latest = FrameToken {
generation: self.generation,
seq: self.seq as u32,
slot: slot as u8,
}
.pack();
self.latest_cell().store(latest, Ordering::Release);
unsafe {
let _ = SetEvent(self.event);
}
self.next = (slot + 1) % ring_len;
return;
}
Err(e) if e.code().0 == WAIT_TIMEOUT_HRESULT => continue,
Err(_) => return,
}
}
// All slots busy — drop this frame (never block the swap-chain thread).
}
}
impl Drop for FramePublisher {
fn drop(&mut self) {
// Slots FIRST (release the shared textures + keyed mutexes), THEN unmap the header, THEN the
// handles.
self.slots.clear();
unsafe {
if !self.header.is_null() {
let _ = UnmapViewOfFile(MEMORY_MAPPED_VIEW_ADDRESS {
Value: self.header.cast(),
});
}
let _ = CloseHandle(self.event);
let _ = CloseHandle(self.map);
}
}
}
@@ -18,6 +18,7 @@ mod control;
mod direct_3d_device;
mod edid;
mod entry;
mod frame_transport;
mod monitor;
mod swap_chain_processor;
@@ -1,18 +1,20 @@
//! The swap-chain processor (STEP 5): a worker thread that DRAINS the IddCx swap-chain so the virtual
//! monitor stays a usable display.
//! The swap-chain processor (STEP 5 + STEP 6): a worker thread that DRAINS the IddCx swap-chain (so the
//! virtual monitor stays a usable display) and PUBLISHES each acquired surface into the host-created
//! shared ring (the IDD-push path).
//!
//! The OS presents the composited desktop to the driver through a swap-chain; the driver MUST consume
//! it (acquire → finished-processing) or the monitor stalls. STEP 5 binds our render device to the
//! swap-chain (`IddCxSwapChainSetDevice`) and loops acquire/finish, discarding each frame. It does NOT
//! publish frames to the host — that is STEP 6 (the `CopyResource` of `out.MetaData.pSurface` into a
//! shared ring), deliberately omitted here.
//! The OS presents the composited desktop to the driver through a swap-chain; the driver MUST consume it
//! (acquire → finished-processing) or the monitor stalls. STEP 5 binds our render device to the swap-chain
//! (`IddCxSwapChainSetDevice`) and loops acquire/finish. STEP 6 lazily attaches a [`FramePublisher`] to
//! the host's shared ring and, on each acquired frame, `CopyResource`s `out.MetaData.pSurface` into the
//! next ring slot before finishing the frame (a non-IDD-push session simply never attaches and keeps
//! draining).
//!
//! Ported from the proven oracle (`packaging/windows/vdisplay-driver/pf-vdisplay/src/
//! swap_chain_processor.rs`) onto wdk-sys + wdk-iddcx. The oracle's `wdf_umdf`/`wdf_umdf_sys` are
//! replaced by `wdk_sys::iddcx::*` + the `wdk_iddcx` DDI wrappers. Those wrappers return a RAW
//! `NTSTATUS` (`i32`) that is HRESULT-shaped for the swap-chain DDIs, so we classify it by hand
//! (`hr >= 0` = success; `0x8000_000A` = E_PENDING; `hr < 0 && != E_PENDING` = error) rather than with
//! `nt_success`. The publisher + `render_luid_low/high` params are dropped (STEP 6).
//! `nt_success`.
use std::{
mem::size_of,
@@ -35,7 +37,10 @@ use wdk_sys::{HANDLE, NTSTATUS, WDFOBJECT, call_unsafe_wdf_function_binding};
use windows::{
Win32::{
Foundation::HANDLE as WHANDLE,
Graphics::Dxgi::IDXGIDevice,
Graphics::{
Direct3D11::ID3D11Texture2D,
Dxgi::{IDXGIDevice, IDXGIResource},
},
System::Threading::{
AvRevertMmThreadCharacteristics, AvSetMmThreadCharacteristicsW, WaitForSingleObject,
},
@@ -43,7 +48,7 @@ use windows::{
core::{Interface, w},
};
use crate::direct_3d_device::Direct3DDevice;
use crate::{direct_3d_device::Direct3DDevice, frame_transport::FramePublisher};
/// E_PENDING — `ReleaseAndAcquireBuffer2` returns this (HRESULT-shaped) when the swap-chain is valid but
/// DWM has composed no new frame yet; wait on the surface-available event and retry.
@@ -89,6 +94,8 @@ impl SwapChainProcessor {
device: Arc<Direct3DDevice>,
available_buffer_event: HANDLE,
target_id: u32,
render_luid_low: u32,
render_luid_high: i32,
) {
let available_buffer_event = Sendable(available_buffer_event);
let swap_chain = Sendable(swap_chain);
@@ -117,6 +124,8 @@ impl SwapChainProcessor {
available_buffer_event.0,
&terminate,
target_id,
render_luid_low,
render_luid_high,
);
dbglog!(
@@ -147,6 +156,8 @@ impl SwapChainProcessor {
available_buffer_event: HANDLE,
terminate: &AtomicBool,
target_id: u32,
render_luid_low: u32,
render_luid_high: i32,
) {
// SetDevice fails (0x887A0026, FACILITY_DXGI) when the monitor briefly flaps INACTIVE during
// topology activation — the OS unassigns + re-assigns the swap-chain, and a fresh run_core thread
@@ -208,6 +219,13 @@ impl SwapChainProcessor {
return;
}
// STEP 6 IDD-push: lazily ATTACH to the HOST-created shared ring. The restricted UMDF token can't
// create named objects, so the host creates the header + event + textures and we only OPEN them
// once they appear (`try_open`). Until then we just drain — exactly the STEP-5 behaviour — so a
// non-IDD-push session never stalls. Retried every ~30 loop iterations.
let mut publisher: Option<FramePublisher> = None;
let mut frames_since_try: u32 = u32::MAX; // attach attempt on the first loop iteration
let mut logged_pending = false;
let mut logged_frame = false;
loop {
@@ -221,9 +239,40 @@ impl SwapChainProcessor {
break;
}
// The host recreates the shared ring (new format) mid-session when the display's HDR mode
// flips — it bumps the header generation. Detect that and drop the publisher so we re-attach to
// the new-format textures below; otherwise we'd keep CopyResource'ing into the stale ring, whose
// format now mismatches the surface → the publish() format-guard drops every frame and the
// stream freezes until the next swap-chain recreate.
if publisher.as_ref().is_some_and(FramePublisher::is_stale) {
publisher = None;
frames_since_try = u32::MAX; // re-attach immediately
}
// Lazy-attach (rate-limited) at the loop TOP so we keep trying even while the display is idle
// (E_PENDING / no frames presented yet), not only when a frame is acquired. `try_open` is a
// cheap OpenFileMapping that fails fast until the host has created the ring.
if publisher.is_none() {
if frames_since_try >= 30 {
frames_since_try = 0;
// `if let Ok` (not a `match` with an empty `Err` arm) keeps clippy's `single_match`
// happy under `-D warnings`; semantics are identical — attach on success, retry on Err.
if let Ok(p) = FramePublisher::try_open(
target_id,
render_luid_low,
render_luid_high,
&device.device,
&device.device_context,
) {
publisher = Some(p);
}
} else {
frames_since_try += 1;
}
}
// ...Buffer2 is required once CAN_PROCESS_FP16 is set. AcquireSystemMemoryBuffer=FALSE keeps
// the GPU surface (out.MetaData.pSurface). STEP 5 only drains — it does NOT publish the
// surface (STEP 6 will). Built zeroed + field-assigned (driver style) so a bindgen field-set
// the GPU surface (out.MetaData.pSurface); STEP 6 publishes it into the shared ring in the
// success branch below. Built zeroed + field-assigned (driver style) so a bindgen field-set
// difference can't break a positional struct literal.
let mut in_args: IDARG_IN_RELEASEANDACQUIREBUFFER2 = unsafe { core::mem::zeroed() };
#[allow(clippy::cast_possible_truncation)]
@@ -275,9 +324,23 @@ impl SwapChainProcessor {
);
logged_frame = true;
}
// STEP 6 publishes `buffer.MetaData.pSurface` into the shared ring HERE (the surface is
// valid until the next ReleaseAndAcquire). STEP 5 only drains, so we immediately finish
// the frame.
// STEP 6: copy the acquired surface into the shared ring BEFORE FinishedProcessingFrame
// (the surface is valid until the next ReleaseAndAcquire). The pointer is BORROWED —
// `from_raw_borrowed` does NOT take IddCx's refcount — and the GPU-side copy is ordered
// before the consumer via the slot keyed mutex. (Attach happens at the loop top.)
if let Some(p) = publisher.as_mut() {
let raw = buffer.MetaData.pSurface as *mut core::ffi::c_void;
if !raw.is_null() {
// SAFETY: `raw` is IddCx's live surface pointer (valid until the next
// ReleaseAndAcquire); `from_raw_borrowed` does not consume the refcount.
if let Some(res) = unsafe { IDXGIResource::from_raw_borrowed(&raw) } {
if let Ok(tex) = res.cast::<ID3D11Texture2D>() {
p.publish(&tex);
}
}
}
}
// SAFETY: driver is loaded; `swap_chain` is valid.
let hr = unsafe { wdk_iddcx::IddCxSwapChainFinishedProcessingFrame(swap_chain) };
if !hr_success(hr) {