feat(host/windows): force-composed-flip overlay to capture the secure desktop
The secure (Winlogon: UAC/lock/login) desktop presents via fullscreen independent-flip/MPO — it scans out bypassing DWM composition, so DXGI Desktop Duplication returns born-lost DXGI_ERROR_ACCESS_LOST (the client sees black; the UAC only "flashes" during the brief composed transition). Confirmed live: stable 4090 LUID across the storm (NOT reparenting) on an FP16 HDR output, recovering only when the screen changes. Fix (non-input, no system-wide registry change): capture/composed_flip.rs keeps a tiny click-through near-invisible TOPMOST LAYERED window alive on the current input desktop. Any visible window on the output disqualifies independent-flip → DWM composites → DDA can capture. A dedicated thread follows the input desktop (Default↔Winlogon) and recreates the window there on each switch (a window is bound to its desktop), re-asserting topmost + pumping messages every 200ms. Started for the two-process stream's lifetime; gated by PUNKTFUNK_FORCE_COMPOSED (default on, =0 to disable). Needs GENERIC_ALL on OpenInputDesktop for DESKTOP_CREATEWINDOW (0x80070005 otherwise). Validated: overlay creates on the Default desktop; live lock test pending. Also includes SET_RENDER_ADAPTER (sudovda.rs, Apollo item #16): pins the IDD render GPU to the NVENC GPU before ADD — issued + accepted live, though the secure-desktop storm was proven to be independent-flip (stable LUID), not reparenting, so it's correctness/hygiene here rather than this bug's fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -141,6 +141,9 @@ windows = { version = "0.62", features = [
|
||||
"Win32_System_Threading",
|
||||
"Win32_System_Pipes",
|
||||
"Win32_System_Environment",
|
||||
# Force-composed-flip overlay: a topmost layered window on the Winlogon desktop disqualifies the
|
||||
# secure desktop's fullscreen independent-flip so Desktop Duplication can capture it.
|
||||
"Win32_System_LibraryLoader",
|
||||
] }
|
||||
# Software H.264 encoder (GPU-less path + NVENC fallback). The default `source` feature statically
|
||||
# compiles OpenH264 (BSD-2) — no system lib, builds on MSVC; nasm on PATH adds the SIMD fast path.
|
||||
|
||||
@@ -320,6 +320,8 @@ pub fn capture_virtual_output(_vout: crate::vdisplay::VirtualOutput) -> Result<B
|
||||
anyhow::bail!("virtual-output capture requires Linux or Windows")
|
||||
}
|
||||
|
||||
#[cfg(target_os = "windows")]
|
||||
pub mod composed_flip;
|
||||
#[cfg(target_os = "windows")]
|
||||
pub mod desktop_watch;
|
||||
#[cfg(target_os = "windows")]
|
||||
|
||||
@@ -0,0 +1,207 @@
|
||||
//! Force-composed-flip overlay (Windows) — make the secure (Winlogon: UAC / lock / login) desktop
|
||||
//! capturable by Desktop Duplication.
|
||||
//!
|
||||
//! The secure desktop's dialog/wallpaper presents via **fullscreen independent-flip / MPO**: it scans
|
||||
//! out directly, bypassing DWM composition, so `IDXGIOutputDuplication::AcquireNextFrame` returns
|
||||
//! `DXGI_ERROR_ACCESS_LOST` (born-lost) — there is no composed frame to hand out (the client sees
|
||||
//! black). Independent-flip requires the presenting app to own the ENTIRE output: putting ANY other
|
||||
//! visible window on that output disqualifies it, forcing DWM to **composite**, which DDA can then
|
||||
//! capture. So we keep a tiny, click-through, near-invisible **topmost layered window** alive on the
|
||||
//! *current input desktop* (which is Winlogon while the secure desktop is up). On a desktop switch the
|
||||
//! window is orphaned, so a dedicated thread tracks the input desktop and recreates it there.
|
||||
//!
|
||||
//! This is the non-input alternative to "tap a key to wake the lock screen": it needs no SendInput and
|
||||
//! no system-wide registry change (it does NOT disable MPO globally — it only nudges OUR output to
|
||||
//! composed while a session is live). Effectiveness can be build/driver-dependent; gated by
|
||||
//! `PUNKTFUNK_FORCE_COMPOSED` (default ON; set =0 to disable).
|
||||
|
||||
use std::sync::atomic::{AtomicBool, Ordering};
|
||||
use std::sync::Arc;
|
||||
use windows::core::{w, PCWSTR};
|
||||
use windows::Win32::Foundation::{HWND, LPARAM, LRESULT, WPARAM};
|
||||
use windows::Win32::System::LibraryLoader::GetModuleHandleW;
|
||||
use windows::Win32::System::StationsAndDesktops::{
|
||||
CloseDesktop, GetUserObjectInformationW, OpenInputDesktop, SetThreadDesktop,
|
||||
DESKTOP_ACCESS_FLAGS, DESKTOP_CONTROL_FLAGS, UOI_NAME,
|
||||
};
|
||||
use windows::Win32::UI::WindowsAndMessaging::{
|
||||
CreateWindowExW, DefWindowProcW, DestroyWindow, DispatchMessageW, PeekMessageW, RegisterClassW,
|
||||
SetLayeredWindowAttributes, SetWindowPos, ShowWindow, TranslateMessage, HWND_TOPMOST,
|
||||
LWA_ALPHA, MSG, PM_REMOVE, SWP_NOACTIVATE, SWP_NOMOVE, SWP_NOSIZE, SW_SHOWNOACTIVATE,
|
||||
WNDCLASSW, WS_EX_LAYERED, WS_EX_NOACTIVATE, WS_EX_TOOLWINDOW, WS_EX_TOPMOST, WS_EX_TRANSPARENT,
|
||||
WS_POPUP,
|
||||
};
|
||||
|
||||
/// A running force-composed-flip overlay. Drop signals the thread to tear down its window + exit.
|
||||
pub struct ForceComposedFlip {
|
||||
stop: Arc<AtomicBool>,
|
||||
}
|
||||
|
||||
impl ForceComposedFlip {
|
||||
/// Start the overlay (no-op + `None` if disabled via `PUNKTFUNK_FORCE_COMPOSED=0`).
|
||||
pub fn start() -> Option<Self> {
|
||||
if std::env::var("PUNKTFUNK_FORCE_COMPOSED").as_deref() == Ok("0") {
|
||||
tracing::info!("force-composed-flip overlay disabled (PUNKTFUNK_FORCE_COMPOSED=0)");
|
||||
return None;
|
||||
}
|
||||
let stop = Arc::new(AtomicBool::new(false));
|
||||
let st = stop.clone();
|
||||
std::thread::Builder::new()
|
||||
.name("composed-flip".into())
|
||||
.spawn(move || unsafe { run(st) })
|
||||
.ok()?;
|
||||
tracing::info!("force-composed-flip overlay started (Winlogon-aware)");
|
||||
Some(ForceComposedFlip { stop })
|
||||
}
|
||||
}
|
||||
|
||||
impl Drop for ForceComposedFlip {
|
||||
fn drop(&mut self) {
|
||||
self.stop.store(true, Ordering::Relaxed);
|
||||
}
|
||||
}
|
||||
|
||||
extern "system" fn wndproc(hwnd: HWND, msg: u32, wp: WPARAM, lp: LPARAM) -> LRESULT {
|
||||
unsafe { DefWindowProcW(hwnd, msg, wp, lp) }
|
||||
}
|
||||
|
||||
/// Read the current input-desktop name (e.g. "Default" / "Winlogon"); `None` if it can't be read.
|
||||
unsafe fn input_desktop_name() -> Option<String> {
|
||||
let desk = OpenInputDesktop(
|
||||
DESKTOP_CONTROL_FLAGS(0),
|
||||
false,
|
||||
DESKTOP_ACCESS_FLAGS(0x0001),
|
||||
)
|
||||
.ok()?;
|
||||
let mut buf = [0u16; 64];
|
||||
let mut needed = 0u32;
|
||||
let ok = GetUserObjectInformationW(
|
||||
windows::Win32::Foundation::HANDLE(desk.0),
|
||||
UOI_NAME,
|
||||
Some(buf.as_mut_ptr() as *mut _),
|
||||
(buf.len() * 2) as u32,
|
||||
Some(&mut needed),
|
||||
)
|
||||
.is_ok();
|
||||
let _ = CloseDesktop(desk);
|
||||
if !ok {
|
||||
return None;
|
||||
}
|
||||
Some(
|
||||
String::from_utf16_lossy(&buf)
|
||||
.trim_end_matches('\u{0}')
|
||||
.to_string(),
|
||||
)
|
||||
}
|
||||
|
||||
/// Create the tiny topmost layered click-through window on the CURRENT thread's desktop. Caller must
|
||||
/// have `SetThreadDesktop`'d to the target input desktop first.
|
||||
unsafe fn make_overlay() -> Option<HWND> {
|
||||
let hinst = GetModuleHandleW(None).ok()?;
|
||||
let class = w!("PunktfunkComposedFlip");
|
||||
// RegisterClassW is idempotent-ish: a second register for the same name fails harmlessly; we
|
||||
// ignore the result and rely on the class existing. (One process, so it registers once.)
|
||||
let wc = WNDCLASSW {
|
||||
lpfnWndProc: Some(wndproc),
|
||||
hInstance: hinst.into(),
|
||||
lpszClassName: class,
|
||||
..Default::default()
|
||||
};
|
||||
let atom = RegisterClassW(&wc);
|
||||
if atom == 0 {
|
||||
let e = windows::Win32::Foundation::GetLastError();
|
||||
// 1410 = ERROR_CLASS_ALREADY_EXISTS is fine (re-register after a desktop switch).
|
||||
if e.0 != 1410 {
|
||||
tracing::warn!(err = e.0, "force-composed-flip: RegisterClassW failed");
|
||||
}
|
||||
}
|
||||
let hwnd = match CreateWindowExW(
|
||||
WS_EX_LAYERED | WS_EX_TRANSPARENT | WS_EX_TOPMOST | WS_EX_NOACTIVATE | WS_EX_TOOLWINDOW,
|
||||
class,
|
||||
w!(""),
|
||||
WS_POPUP,
|
||||
0,
|
||||
0,
|
||||
1,
|
||||
1,
|
||||
None,
|
||||
None,
|
||||
Some(hinst.into()),
|
||||
None,
|
||||
) {
|
||||
Ok(h) => h,
|
||||
Err(e) => {
|
||||
let le = windows::Win32::Foundation::GetLastError();
|
||||
tracing::warn!(err = %format!("{e:?}"), last = le.0,
|
||||
"force-composed-flip: CreateWindowExW failed");
|
||||
return None;
|
||||
}
|
||||
};
|
||||
// alpha=1: technically visible (so it disqualifies independent-flip) but imperceptible.
|
||||
let _ = SetLayeredWindowAttributes(hwnd, windows::Win32::Foundation::COLORREF(0), 1, LWA_ALPHA);
|
||||
let _ = ShowWindow(hwnd, SW_SHOWNOACTIVATE);
|
||||
let _ = SetWindowPos(
|
||||
hwnd,
|
||||
Some(HWND_TOPMOST),
|
||||
0,
|
||||
0,
|
||||
0,
|
||||
0,
|
||||
SWP_NOMOVE | SWP_NOSIZE | SWP_NOACTIVATE,
|
||||
);
|
||||
Some(hwnd)
|
||||
}
|
||||
|
||||
unsafe fn run(stop: Arc<AtomicBool>) {
|
||||
let mut cur_desktop: Option<String> = None;
|
||||
let mut hwnd: Option<HWND> = None;
|
||||
let mut ticks: u32 = 0;
|
||||
while !stop.load(Ordering::Relaxed) {
|
||||
// Follow the input desktop: if it changed (Default↔Winlogon), re-attach this thread and
|
||||
// recreate the window there (a window is bound to the desktop it was created on).
|
||||
let name = input_desktop_name();
|
||||
if name != cur_desktop {
|
||||
if let Some(h) = hwnd.take() {
|
||||
let _ = DestroyWindow(h);
|
||||
}
|
||||
if let Ok(desk) = OpenInputDesktop(
|
||||
DESKTOP_CONTROL_FLAGS(0),
|
||||
false,
|
||||
DESKTOP_ACCESS_FLAGS(0x1000_0000), // GENERIC_ALL (incl. DESKTOP_CREATEWINDOW=0x0002)
|
||||
) {
|
||||
if SetThreadDesktop(desk).is_ok() {
|
||||
hwnd = make_overlay();
|
||||
tracing::info!(desktop = ?name, created = hwnd.is_some(),
|
||||
"force-composed-flip: overlay (re)created on input desktop");
|
||||
}
|
||||
// Leak `desk` while it's the thread desktop (closing the current thread desktop is UB).
|
||||
}
|
||||
cur_desktop = name;
|
||||
}
|
||||
// Re-assert topmost periodically (other windows on the secure desktop can push us down) and
|
||||
// pump our message queue so the window stays responsive/composited.
|
||||
if let Some(h) = hwnd {
|
||||
let _ = SetWindowPos(
|
||||
h,
|
||||
Some(HWND_TOPMOST),
|
||||
0,
|
||||
0,
|
||||
0,
|
||||
0,
|
||||
SWP_NOMOVE | SWP_NOSIZE | SWP_NOACTIVATE,
|
||||
);
|
||||
let mut msg = MSG::default();
|
||||
while PeekMessageW(&mut msg, Some(h), 0, 0, PM_REMOVE).as_bool() {
|
||||
let _ = TranslateMessage(&msg);
|
||||
DispatchMessageW(&msg);
|
||||
}
|
||||
}
|
||||
ticks = ticks.wrapping_add(1);
|
||||
let _ = ticks;
|
||||
std::thread::sleep(std::time::Duration::from_millis(200));
|
||||
}
|
||||
if let Some(h) = hwnd.take() {
|
||||
let _ = DestroyWindow(h);
|
||||
}
|
||||
tracing::info!("force-composed-flip overlay stopped");
|
||||
}
|
||||
@@ -2345,6 +2345,10 @@ fn virtual_stream_relay(
|
||||
|
||||
// The authoritative Default↔Winlogon signal (requires SYSTEM to read the Winlogon desktop name).
|
||||
let watcher = crate::capture::desktop_watch::DesktopWatcher::start();
|
||||
// Keep a force-composed-flip overlay alive on the input desktop so the SECURE desktop (which
|
||||
// otherwise presents via fullscreen independent-flip → DDA gets born-lost ACCESS_LOST / black) is
|
||||
// forced into DWM composition and becomes capturable. Held for the stream's lifetime.
|
||||
let _composed_flip = crate::capture::composed_flip::ForceComposedFlip::start();
|
||||
// Test hook: PUNKTFUNK_SECURE_TEST_PERIOD_MS=N drives a square-wave secure/normal toggle every N ms
|
||||
// instead of the real watcher — exercises the mid-session helper↔DDA mux without a live UAC/lock
|
||||
// (the real Winlogon DDA capture is already proven by the single-process secure path).
|
||||
|
||||
@@ -49,6 +49,7 @@ const fn ctl(func: u32) -> u32 {
|
||||
}
|
||||
const IOCTL_ADD: u32 = ctl(0x800);
|
||||
const IOCTL_REMOVE: u32 = ctl(0x801);
|
||||
const IOCTL_SET_RENDER_ADAPTER: u32 = ctl(0x802); // == 0x0022_2008
|
||||
const IOCTL_GET_WATCHDOG: u32 = ctl(0x803);
|
||||
const IOCTL_DRIVER_PING: u32 = ctl(0x888);
|
||||
const IOCTL_GET_VERSION: u32 = ctl(0x8FF);
|
||||
@@ -76,6 +77,82 @@ struct AddOut {
|
||||
target_id: u32,
|
||||
}
|
||||
|
||||
// SET_RENDER_ADAPTER input — byte-identical to SudoVDA's `{ LUID AdapterLuid; }` (8 bytes). The
|
||||
// windows `LUID` is `{ LowPart: u32, HighPart: i32 }` == the C `LUID`, so `#[repr(C)]` is exact.
|
||||
#[repr(C)]
|
||||
#[derive(Clone, Copy)]
|
||||
struct SetRenderAdapterParams {
|
||||
luid: LUID,
|
||||
}
|
||||
|
||||
/// Pin the SudoVDA IDD's RENDER GPU to `luid` (Apollo's `SetRenderAdapter`). No output buffer. MUST be
|
||||
/// issued on the driver handle BEFORE `IOCTL_ADD` to steer which GPU the new target renders on — on a
|
||||
/// multi-adapter box (SudoVDA IDD + a discrete GPU) this stops DXGI from reparenting the virtual
|
||||
/// output onto a different adapter than the one we duplicate/encode on (the ACCESS_LOST storm).
|
||||
unsafe fn set_render_adapter(h: HANDLE, luid: LUID) -> Result<()> {
|
||||
let p = SetRenderAdapterParams { luid };
|
||||
let bytes = std::slice::from_raw_parts(
|
||||
&p as *const _ as *const u8,
|
||||
size_of::<SetRenderAdapterParams>(),
|
||||
);
|
||||
let mut none: [u8; 0] = [];
|
||||
ioctl(h, IOCTL_SET_RENDER_ADAPTER, bytes, &mut none)
|
||||
.map(|_| ())
|
||||
.context("SudoVDA SET_RENDER_ADAPTER")
|
||||
}
|
||||
|
||||
/// Resolve the LUID of the GPU that should RENDER the virtual display = the GPU that drives NVENC +
|
||||
/// Desktop Duplication (e.g. the RTX 4090). Default: the discrete adapter with the most
|
||||
/// `DedicatedVideoMemory`, skipping WARP / Basic-Render and the SudoVDA software adapter (≈0 VRAM).
|
||||
/// `PUNKTFUNK_RENDER_ADAPTER=<substring>` forces a match by Description (Apollo's `adapter_name`).
|
||||
unsafe fn resolve_render_adapter_luid() -> Option<LUID> {
|
||||
use windows::Win32::Graphics::Dxgi::{CreateDXGIFactory1, IDXGIFactory1};
|
||||
let want = std::env::var("PUNKTFUNK_RENDER_ADAPTER")
|
||||
.ok()
|
||||
.filter(|s| !s.is_empty());
|
||||
let factory: IDXGIFactory1 = CreateDXGIFactory1().ok()?;
|
||||
let mut best: Option<(LUID, u64, String)> = None;
|
||||
let mut i = 0u32;
|
||||
while let Ok(a) = factory.EnumAdapters1(i) {
|
||||
i += 1;
|
||||
let Ok(d) = a.GetDesc1() else { continue };
|
||||
let name = String::from_utf16_lossy(&d.Description);
|
||||
let name = name.trim_end_matches('\u{0}').to_string();
|
||||
let lname = name.to_ascii_lowercase();
|
||||
if lname.contains("basic render") || lname.contains("warp") {
|
||||
continue; // never pin to the software rasterizer
|
||||
}
|
||||
if let Some(w) = &want {
|
||||
if lname.contains(&w.to_ascii_lowercase()) {
|
||||
tracing::info!(
|
||||
adapter = name,
|
||||
"render adapter chosen by PUNKTFUNK_RENDER_ADAPTER"
|
||||
);
|
||||
return Some(d.AdapterLuid);
|
||||
}
|
||||
continue;
|
||||
}
|
||||
let vram = d.DedicatedVideoMemory as u64; // SudoVDA software adapter ≈ 0 → loses to the dGPU
|
||||
if best.as_ref().map_or(true, |(_, v, _)| vram > *v) {
|
||||
best = Some((d.AdapterLuid, vram, name));
|
||||
}
|
||||
}
|
||||
match best {
|
||||
Some((luid, vram, name)) => {
|
||||
tracing::info!(
|
||||
adapter = name,
|
||||
vram_mb = vram / (1024 * 1024),
|
||||
"render adapter chosen (max VRAM)"
|
||||
);
|
||||
Some(luid)
|
||||
}
|
||||
None => {
|
||||
tracing::warn!("no suitable render adapter found for SET_RENDER_ADAPTER");
|
||||
None
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[repr(C)]
|
||||
struct RemoveParams {
|
||||
guid: GUID,
|
||||
@@ -457,6 +534,22 @@ impl VirtualDisplay for SudoVdaDisplay {
|
||||
device_name,
|
||||
serial: [0u8; 14],
|
||||
};
|
||||
// Pin the IDD's RENDER GPU to the NVENC/capture GPU (e.g. the 4090) BEFORE adding the target.
|
||||
// On a multi-adapter box (SudoVDA IDD + discrete GPU) DXGI otherwise reparents the virtual
|
||||
// output onto whichever GPU its hybrid-preference path resolves, which storms ACCESS_LOST
|
||||
// (0x887A0026) on the secure/HDR desktop. Apollo's SET_RENDER_ADAPTER fixes this and MUST be
|
||||
// issued before ADD. Best-effort: a driver that rejects it just keeps the default render GPU.
|
||||
let pinned = unsafe { resolve_render_adapter_luid() };
|
||||
if let Some(luid) = pinned {
|
||||
match unsafe { set_render_adapter(self.device, luid) } {
|
||||
Ok(()) => tracing::info!(
|
||||
luid = format!("{:08x}:{:08x}", luid.HighPart, luid.LowPart),
|
||||
"SudoVDA SET_RENDER_ADAPTER: pinned IDD render GPU"
|
||||
),
|
||||
Err(e) => tracing::warn!("SudoVDA SET_RENDER_ADAPTER failed (continuing): {e:#}"),
|
||||
}
|
||||
}
|
||||
|
||||
let add_bytes = unsafe {
|
||||
std::slice::from_raw_parts(&add as *const _ as *const u8, size_of::<AddParams>())
|
||||
};
|
||||
@@ -476,6 +569,17 @@ impl VirtualDisplay for SudoVdaDisplay {
|
||||
ao.target_id,
|
||||
ao.luid.LowPart
|
||||
);
|
||||
if let Some(luid) = pinned {
|
||||
if ao.luid.LowPart == luid.LowPart && ao.luid.HighPart == luid.HighPart {
|
||||
tracing::info!("SudoVDA ADD render adapter matches the pinned GPU (pin took)");
|
||||
} else {
|
||||
tracing::warn!(
|
||||
add = format!("{:08x}:{:08x}", ao.luid.HighPart, ao.luid.LowPart),
|
||||
pinned = format!("{:08x}:{:08x}", luid.HighPart, luid.LowPart),
|
||||
"SudoVDA ADD render adapter DIFFERS from pinned — driver ignored SET_RENDER_ADAPTER?"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
// Mandatory keepalive: ping inside the watchdog window or the driver tears all displays down.
|
||||
let stop = Arc::new(AtomicBool::new(false));
|
||||
|
||||
Reference in New Issue
Block a user