NVIDIA/AMD Vulkan ICDs refuse to *advertise* an HDR color space for a surface on an
IddCx indirect/virtual display, so Vulkan games (Doom: The Dark Ages, id Tech, Indiana
Jones, …) report "device does not support HDR" — even though Windows HDR, DWM compose,
and the client PQ stream all work, and the ICD happily *accepts + presents* a forced HDR
swapchain there. The whole gap is enumeration; the community (Apollo/Sunshine/VDD) wrote
this off as kernel-side / unfixable.
Add VK_LAYER_PUNKTFUNK_hdr_inject (packaging/windows/pf-vkhdr-layer/): a standalone
cdylib Vulkan implicit layer that appends {A2B10G10R10, HDR10_ST2084} + {RGBA16F, scRGB}
to vkGetPhysicalDeviceSurfaceFormats[2]KHR (no need to hook vkCreateSwapchainKHR — the
ICD doesn't validate the color space there). Self-gated on the surface monitor's actual
advanced-color state (DisplayConfig GET_ADVANCED_COLOR_INFO), so it is a complete no-op
on SDR sessions and real monitors (dedup). Always-on (registry-discovered) so it works
regardless of how a game is launched — env-scoping silently fails for already-running
Steam. Escape hatches: DISABLE_PF_VKHDR, PF_VKHDR_EXCLUDE, and a built-in kernel-anti-
cheat denylist.
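The append-and-dedup step described above can be sketched roughly as follows. This is a minimal illustration, not the layer's actual code: `SurfaceFormat`, `HDR_FORMATS`, and `inject_hdr_formats` are hypothetical names, the numeric constants are the raw enum values from the Vulkan headers, and mapping "scRGB" to `VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT` is an assumption.

```rust
/// Minimal mirror of VkSurfaceFormatKHR, holding raw enum values (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SurfaceFormat {
    pub format: u32,
    pub color_space: u32,
}

// Raw enum values as defined in the Vulkan headers.
pub const VK_FORMAT_A2B10G10R10_UNORM_PACK32: u32 = 64;
pub const VK_FORMAT_R16G16B16A16_SFLOAT: u32 = 97;
pub const VK_COLOR_SPACE_HDR10_ST2084_EXT: u32 = 1_000_104_008;
// Assumption: "scRGB" in the commit text corresponds to extended sRGB linear.
pub const VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT: u32 = 1_000_104_002;

/// The two pairs the layer is described as appending.
pub const HDR_FORMATS: [SurfaceFormat; 2] = [
    SurfaceFormat {
        format: VK_FORMAT_A2B10G10R10_UNORM_PACK32,
        color_space: VK_COLOR_SPACE_HDR10_ST2084_EXT,
    },
    SurfaceFormat {
        format: VK_FORMAT_R16G16B16A16_SFLOAT,
        color_space: VK_COLOR_SPACE_EXTENDED_SRGB_LINEAR_EXT,
    },
];

/// Returns the ICD's list with the HDR pairs appended.
/// No-op when advanced color is off; pairs the ICD already reports are not duplicated.
pub fn inject_hdr_formats(icd: &[SurfaceFormat], advanced_color_active: bool) -> Vec<SurfaceFormat> {
    let mut out = icd.to_vec();
    if !advanced_color_active {
        return out; // SDR session: leave the enumeration untouched
    }
    for hdr in HDR_FORMATS.iter().copied() {
        if !out.contains(&hdr) {
            out.push(hdr);
        }
    }
    out
}
```

A real layer would additionally have to honor the two-call count/fill idiom of vkGetPhysicalDeviceSurfaceFormatsKHR; the sketch only covers the list manipulation.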
The installer builds/signs/stages it and registers it under
HKLM64\SOFTWARE\Khronos\Vulkan\ImplicitLayers (opt-out "Install the HDR Vulkan layer"
task); windows-host CI fmt+clippy-gates it (msvc-only FFI).
Live-validated on the RTX box: Doom: The Dark Ages enables HDR over the pf-vdisplay
virtual display.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Windows secure-desktop capture — two-process design
Status: all steps (1–6) implemented and live-validated on the RTX 4090 (2026-06-16). The two-process path works end to end (host as SYSTEM): the user-session WGC helper relays video, the mux switches to the host's DDA on the secure desktop, a dead helper is rebuilt automatically, and the SendInput injector follows desktop switches lazily. Only a real UAC/lock smoke test remains (can't be triggered headless over SSH). The earlier user-mode WGC animation fix still ships; this is the SYSTEM-mode design that adds secure-desktop (UAC/lock/login) coverage, since WGC and the secure desktop need conflicting process tokens.
Implemented so far:
- Step 1, DesktopWatcher (capture/desktop_watch.rs): polls the input-desktop name → atomic Default/Winlogon. Committed 80e222d.
- Step 3, WGC helper subcommand (wgc_helper.rs, punktfunk1-host wgc-helper): WGC→NVENC→framed AUs on stdout, stdin keyframe control. Committed a0f6cdd.
- Step 4, spawn + relay (capture/wgc_relay.rs, m3::virtual_stream_relay): SYSTEM host spawns the helper via CreateProcessAsUserW into winsta0\default, relays its stdout AUs to the QUIC send thread, forwards keyframe requests, surfaces helper stderr in host tracing. Committed 9f50b39.
- Step 5, source mux (m3::virtual_stream_relay): the DesktopWatcher switches the AU source: helper relay on Default, the host's own DDA capturer+encoder on Winlogon; every switch latches "wait for IDR" + forces the now-active source to emit a keyframe.
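The mux behavior described for Step 5 can be pictured with a small state machine. This is a sketch under assumptions, not the code in m3::virtual_stream_relay: `SourceMux`, `Source`, `Desktop`, `on_desktop`, and `should_forward` are hypothetical names chosen for illustration.

```rust
/// Which producer is currently feeding the QUIC data plane (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    HelperRelay, // user-session WGC helper
    HostDda,     // SYSTEM host's own DDA capturer + encoder
}

/// Input desktop as reported by the DesktopWatcher (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Desktop {
    Default,
    Winlogon,
}

pub struct SourceMux {
    active: Source,
    wait_for_idr: bool,
}

impl SourceMux {
    /// Starts on the helper relay and waits for its first keyframe.
    pub fn new() -> Self {
        Self { active: Source::HelperRelay, wait_for_idr: true }
    }

    /// Applies a desktop change. On a switch, latches "wait for IDR" and returns
    /// the source that should now be asked to emit a keyframe.
    pub fn on_desktop(&mut self, desktop: Desktop) -> Option<Source> {
        let wanted = match desktop {
            Desktop::Default => Source::HelperRelay,
            Desktop::Winlogon => Source::HostDda,
        };
        if wanted == self.active {
            return None;
        }
        self.active = wanted;
        self.wait_for_idr = true;
        Some(wanted)
    }

    /// Decides whether an AU from `from` goes to the client.
    /// AUs from the inactive source, and non-keyframes while latched, are dropped.
    pub fn should_forward(&mut self, from: Source, keyframe: bool) -> bool {
        if from != self.active {
            return false;
        }
        if self.wait_for_idr {
            if !keyframe {
                return false;
            }
            self.wait_for_idr = false; // first IDR after the switch releases the latch
        }
        true
    }
}
```

Dropping delta frames until the first IDR is what keeps the client decoder from seeing references into the other encoder's stream.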
Live-validated on the RTX 4090 (2026-06-16, host as SYSTEM):
- Step 4: the helper spawns via CreateProcessAsUserW, runs WGC with no hang (HDR FP16 BT.2020 PQ), opens NVENC (D3D11 Main10), and relays AUs; client-rs over the LAN decoded 411 HEVC Main-10 frames. (Bug found+fixed: CreateProcessAsUserW gave the helper the user's env, dropping PUNKTFUNK_ENCODER=nvenc → software-encoder fallback; fixed by merged_env_block.)
- Step 5: with PUNKTFUNK_SECURE_TEST_PERIOD_MS=4000 driving a square-wave toggle, the source mux switched secure(DDA) ↔ normal(WGC relay) cleanly 5× in one session; the client decoded 308 frames continuously across every switch (the wait-for-IDR latch held, no decode break). The real Winlogon DDA capture itself is pre-proven by the single-process secure path (commit f4b4a6c); step 5's new surface is the mux, which the toggle exercises directly.
- Step 6: the helper relaunch watchdog. Force-killing the helper PID mid-stream triggered exactly one "WGC helper exited — rebuilt output + helper fails=1" and the stream recovered; client-rs decoded 645 frames continuously across the kill. A ~30s mux soak (2s toggle) ran 16 switches with 0 rebuilds / 0 early-ends / 465 frames decoded. (Recovery rebuilds the whole output, not a same-target respawn, which storm-failed with "no DXGI output for target N yet" after an abrupt kill.)
- Step 2: SendInput now uses the retry-on-failure model (inject/sendinput.rs): the thread stays bound to its desktop and only reattaches (OpenInputDesktop/SetThreadDesktop) on a SendInput short write (desktop switched), instead of two syscalls per event. Validated: client-rs --input-test injected for ~6s with no "blocked desktop" errors (steady-state path); the reattach-on-switch path is the same OpenInputDesktop call the old per-event code used, now lazy.
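The environment bug noted under Step 4 suggests a fix of roughly the following shape. This is a guess at what merged_env_block might do, not its actual code: `merge_env`, `to_env_block`, and the idea of carrying over host variables by prefix are all assumptions made for illustration. The block layout (NUL-terminated `NAME=value` strings followed by one extra NUL, UTF-16 when CREATE_UNICODE_ENVIRONMENT is set) is the documented format for the lpEnvironment parameter.

```rust
use std::collections::BTreeMap;

/// Starts from the user's environment and overlays host variables whose name
/// begins with `prefix` (hypothetical policy; e.g. "PUNKTFUNK_").
/// Note: names are compared case-sensitively here, unlike on Windows.
pub fn merge_env(
    user_env: &[(String, String)],
    host_env: &[(String, String)],
    prefix: &str,
) -> BTreeMap<String, String> {
    let mut merged: BTreeMap<String, String> = user_env.iter().cloned().collect();
    for (name, value) in host_env {
        if name.starts_with(prefix) {
            merged.insert(name.clone(), value.clone());
        }
    }
    merged
}

/// Serializes variables into a UTF-16 environment block:
/// "NAME=value\0" per entry, then a final "\0".
pub fn to_env_block(vars: &BTreeMap<String, String>) -> Vec<u16> {
    let mut block: Vec<u16> = Vec::new();
    for (name, value) in vars {
        block.extend(format!("{}={}", name, value).encode_utf16());
        block.push(0);
    }
    if block.is_empty() {
        block.push(0); // an empty block is still terminated by two NULs
    }
    block.push(0);
    block
}
```

The resulting `Vec<u16>` would be passed as lpEnvironment together with CREATE_UNICODE_ENVIRONMENT, so that settings such as PUNKTFUNK_ENCODER=nvenc survive the token switch.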
Remaining: a final user-driven smoke test — trigger a real UAC/lock on the box during a session and confirm the dialog appears on the client AND that clicking/typing on it lands (the box's UAC auto-elevates admins, so a real prompt can't be triggered headless over SSH; the mux switch itself is proven by the timed toggle, and DDA-on-Winlogon capture + input by the single-process secure path).
Note: the two-process path requires the host to run as SYSTEM (run.cmd.sysbak → -s -i 1). As SYSTEM, WASAPI loopback audio (session 0) does not capture the user session's audio; this is a known limitation of SYSTEM-mode capture, separate from this work.
The constraint (verified live on the RTX 4090)
- WGC (the composed-desktop capture that fixes frozen HDR animations) will not activate under the SYSTEM account: CreateForMonitor → 0x80070424. Thread-level ImpersonateLoggedOnUser is insufficient (tested: impersonated=true, still 0x80070424). WGC needs the process to run as the interactive user.
- DDA + SendInput on the secure desktop (Winlogon: UAC/lock/login) require LOCAL_SYSTEM (attach to the Winlogon desktop). This is already shipped (task #17) when the host runs as SYSTEM.
- Therefore one process can't do both. Single-process (the simpler design) is out.
Architecture: SYSTEM host + USER-session WGC helper, AU-relay (no shared GPU texture)
- SYSTEM host (the existing punktfunk1-host, launched as SYSTEM in interactive Session 1 via the scheduled task → PsExec -s -i 1): owns the punktfunk/1 QUIC session, the single SudoVDA virtual output (+ isolate/restore RAII; the only topology owner), the DDA capture + NVENC encoder for the secure desktop, the single SendInput injector (serves both desktops), and the AU source mux that feeds the QUIC data plane.
- USER-session WGC helper (a new punktfunk1-host subcommand, spawned by the SYSTEM host via WTSQueryUserToken(activeConsoleSessionId) → DuplicateTokenEx(TokenPrimary) → CreateProcessAsUserW(lpDesktop="winsta0\\default", CREATE_NO_WINDOW)): runs the existing WGC → scRGB/PQ → NVENC pipeline and ships Annex-B AUs ({data, pts_ns, keyframe}) to the SYSTEM host over a named pipe. It captures the SAME SudoVDA output by GDI name only; it must NOT create its own virtual output / touch display topology (a second topology owner re-triggers the ACCESS_LOST born-lost storm).
- Mux: the SYSTEM host relays the helper's AUs onto QUIC while the input desktop is Default (normal: WGC, HDR/animation-correct), and switches to its own DDA encoder while it's Winlogon (secure: UAC/lock/login). The client sees one continuous stream; the encoder/FEC/AES-GCM/QUIC send path is untouched (same EncodedFrame flow). NVENC re-inits only on a size/format change across the swap (already handled); same-mode is a pointer re-register.
- Input: stays entirely in the SYSTEM host (only it can attach to Winlogon). One windowless SendInput thread, Sunshine's retry-on-failure-only model (cache HDESK thread-local; SendInput first; only on 0-injected re-OpenInputDesktop + SetThreadDesktop and retry once); it serves both desktops with no per-event reattach. (Ctrl+Alt+Del/SAS needs SendSAS, out of scope; clicking UAC Yes/No + typing the login password are plain SendInput on Winlogon.)
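The retry-on-failure-only input model from the last bullet can be sketched independently of Win32. The trait below is a hypothetical stand-in: `send` plays the role of SendInput (returning the number of events injected) and `reattach` the role of OpenInputDesktop + SetThreadDesktop. The names `DesktopInput`, `InjectError`, and `inject_with_retry` are assumptions, and the sketch treats zero injected events as the failure signal.

```rust
/// Abstraction over the desktop-bound injector (hypothetical trait).
pub trait DesktopInput {
    type Event;
    /// Injects events on the desktop the thread is bound to; returns how many landed.
    fn send(&mut self, events: &[Self::Event]) -> usize;
    /// Rebinds the thread to the current input desktop; false if that failed.
    fn reattach(&mut self) -> bool;
}

#[derive(Debug, PartialEq, Eq)]
pub enum InjectError {
    ReattachFailed,
    StillBlocked,
}

/// Send first; only when nothing was injected, reattach and retry exactly once.
pub fn inject_with_retry<D: DesktopInput>(
    dev: &mut D,
    events: &[D::Event],
) -> Result<usize, InjectError> {
    if events.is_empty() {
        return Ok(0);
    }
    let sent = dev.send(events);
    if sent > 0 {
        return Ok(sent); // steady state: no desktop syscalls at all
    }
    // Zero injected: assume the input desktop switched underneath us.
    if !dev.reattach() {
        return Err(InjectError::ReattachFailed);
    }
    match dev.send(events) {
        0 => Err(InjectError::StillBlocked),
        n => Ok(n),
    }
}
```

The steady-state cost is a single injection call per batch; the two desktop syscalls are paid only at a Default ↔ Winlogon transition.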
Rejected: a shared NT-handle GPU texture (MIC/SDDL pain SYSTEM→user, keyed-mutex ring at 240 Hz, nvenc pointer-cache churn — all for a static lock dialog). AU bytes over a pipe are far simpler.
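To illustrate why AU bytes over a pipe stay simple, here is one possible framing for the {data, pts_ns, keyframe} records. The wire layout (u32 little-endian length, u64 little-endian pts_ns, one flag byte, then the Annex-B payload) is an assumption made for this sketch, not the helper's actual format; `Au`, `write_au`, and `read_au` are hypothetical names. It works over any byte stream, whether a named pipe or stdout.

```rust
use std::io::{self, Read, Write};

/// One encoded access unit as relayed from the helper (fields from the design text).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Au {
    pub data: Vec<u8>,
    pub pts_ns: u64,
    pub keyframe: bool,
}

/// Writes one frame: [len: u32 LE][pts_ns: u64 LE][keyframe: u8][data].
pub fn write_au<W: Write>(w: &mut W, au: &Au) -> io::Result<()> {
    if au.data.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "AU too large"));
    }
    w.write_all(&(au.data.len() as u32).to_le_bytes())?;
    w.write_all(&au.pts_ns.to_le_bytes())?;
    w.write_all(&[au.keyframe as u8])?;
    w.write_all(&au.data)
}

/// Reads one frame; returns an error (UnexpectedEof) if the stream ends mid-frame.
pub fn read_au<R: Read>(r: &mut R) -> io::Result<Au> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut pts = [0u8; 8];
    r.read_exact(&mut pts)?;
    let mut flag = [0u8; 1];
    r.read_exact(&mut flag)?;
    let mut data = vec![0u8; u32::from_le_bytes(len) as usize];
    r.read_exact(&mut data)?;
    Ok(Au {
        data,
        pts_ns: u64::from_le_bytes(pts),
        keyframe: flag[0] != 0,
    })
}
```

A production reader would likely also cap the declared length before allocating, so a corrupted header cannot request gigabytes.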
Detection
DesktopWatcher: a dedicated thread polling the input-desktop NAME at 30–60 Hz —
OpenInputDesktop(0,FALSE,0) + GetUserObjectInformationW(UOI_NAME) == "Winlogon" (secure) vs
"Default" (normal) → Arc<AtomicU8>. This is the authoritative signal; WTS session notifications
miss UAC entirely. (May also register WTSRegisterSessionNotification to short-circuit lock/unlock.)
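The polling loop can be sketched with the Win32 lookup factored out behind a closure, which keeps the example portable. In the real watcher, `read_name` would wrap OpenInputDesktop + GetUserObjectInformationW(UOI_NAME). The names `spawn_watcher`, `classify`, and the two constants are hypothetical, and treating every non-"Winlogon" name as normal is an assumption.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

pub const DESKTOP_DEFAULT: u8 = 0;
pub const DESKTOP_WINLOGON: u8 = 1;

/// Maps an input-desktop name to the shared state value.
/// Assumption: anything other than "Winlogon" counts as the normal desktop.
pub fn classify(name: &str) -> u8 {
    if name == "Winlogon" {
        DESKTOP_WINLOGON
    } else {
        DESKTOP_DEFAULT
    }
}

/// Polls `read_name` every `interval` and publishes the result into `state`
/// until `stop` is set. A failed read (None) keeps the previous value.
pub fn spawn_watcher<F>(
    mut read_name: F,
    state: Arc<AtomicU8>,
    stop: Arc<AtomicBool>,
    interval: Duration,
) -> JoinHandle<()>
where
    F: FnMut() -> Option<String> + Send + 'static,
{
    thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            if let Some(name) = read_name() {
                state.store(classify(&name), Ordering::Release);
            }
            thread::sleep(interval);
        }
    })
}
```

For the 30–60 Hz rate mentioned above, `interval` would be somewhere around 16–33 ms; consumers read the atomic with `Ordering::Acquire`.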
Implementation steps (each independently buildable/testable on the 4090)
- DesktopWatcher (capture/desktop_watch.rs, ~40 lines): the poll + atomic. Test: lock / trigger UAC over the existing stream, confirm the atomic flips Default ↔ Winlogon within a poll interval.
- SendInput retry-on-failure model (inject/sendinput.rs): replace per-event reattach with the cached-HDESK + retry-once model. Test: normal input unchanged; click UAC + type the lock password land (works today via per-event reattach; this is a refactor).
- WGC helper subcommand (punktfunk1-host wgc-helper or similar): the existing WGC pipeline → NVENC → Annex-B AUs over a named-pipe server. Test standalone: as the user it writes a valid .h265 to the pipe (capturing the SudoVDA output by GDI name, no topology changes).
- Spawn + relay: SYSTEM host spawns the helper (CreateProcessAsUserW), connects the pipe, relays its AUs onto the live QUIC session. Test: normal-desktop stream sourced via the helper relay.
- Source mux: relay helper AUs while Default, switch to the host's own DDA encoder while Winlogon (reusing the DesktopWatcher). Test: normal (WGC, HDR) → trigger UAC → stream shows the UAC dialog (DDA) → dismiss → back to WGC; QUIC session stays up throughout. Full-coverage milestone.
- Relaunch watchdog + soak: SERVICE_CONTROL_SESSIONCHANGE-style relaunch of the helper on console connect/disconnect; soak a few hundred lock/unlock+UAC switches (cf. task #17's 1012-switch run) with no leak / black / disconnect.
Cargo features for the fallback: Win32_System_Threading, Win32_System_Pipes, Win32_System_RemoteDesktop.
Risks / notes
- Validate on the real 4090 only (ssh "Enrico Bühler"@192.168.1.174, Session 1 via the Interactive scheduled task); the headless build VM can't reproduce Winlogon-on-virtual-display or WGC.
- The helper MUST capture the SudoVDA by GDI name and never create a second virtual output (avoids the ACCESS_LOST born-lost storm; one isolate owner = the SYSTEM host).
- Confirm reisolate fires on a FRESH mid-session DDA open at the desktop boundary (task #17 only validated DDA recovery within an already-DDA session).
- Brief one-frame repeat/flicker at the WGC↔DDA boundary is acceptable (the local lock/UAC transition flickers too); never starve the encoder (repeat last frame across the swap gap).
- Pragmatic alternative if full coverage isn't worth the build: PromptOnSecureDesktop=0 (UAC renders on the normal desktop → WGC captures it) covers UAC (not lock/login) with one reversible registry change.