Files
punktfunk/docs/windows-secure-desktop.md
T
enricobuehler ec2907fc32
apple / swift (push) Successful in 54s
android / android (push) Failing after 0s
ci / rust (push) Failing after 0s
ci / docs-site (push) Failing after 0s
ci / bench (push) Failing after 0s
deb / build-publish (push) Failing after 0s
ci / web (push) Failing after 1s
decky / build-publish (push) Failing after 0s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Failing after 1s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Failing after 0s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 1s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 0s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 0s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
perf(host/windows): SendInput retry-on-failure model (two-process step 2)
The injector reattached the input desktop (OpenInputDesktop + SetThreadDesktop,
two syscalls) before EVERY event. Now it stays bound to its desktop and only
reattaches on a SendInput short write (the input desktop switched into UAC/lock)
+ retries once — Sunshine's model. No steady-state per-event overhead; still
follows the desktop across the secure boundary, serving both desktops.

Validated on the RTX 4090 (host as SYSTEM): client-rs --input-test injected for
~6s with no "blocked desktop" errors. Completes all 6 steps of the two-process
secure-desktop build; only a real-UAC user smoke test remains.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 08:30:49 +00:00

9.9 KiB
Raw Blame History

Windows secure-desktop capture — two-process design

Status: all steps (16) implemented and live-validated on the RTX 4090 (2026-06-16). The two-process path works end to end (host as SYSTEM): the user-session WGC helper relays video, the mux switches to the host's DDA on the secure desktop, a dead helper is rebuilt automatically, and the SendInput injector follows desktop switches lazily. Only a real UAC/lock smoke test remains (can't be triggered headless over SSH). The earlier user-mode WGC animation fix still ships; this is the SYSTEM-mode design that adds secure-desktop (UAC/lock/login) coverage, since WGC and the secure desktop need conflicting process tokens.

Implemented so far:

  • Step 1 — DesktopWatcher (capture/desktop_watch.rs): polls the input-desktop name → atomic Default/Winlogon. Committed 80e222d.
  • Step 3 — WGC helper subcommand (wgc_helper.rs, m3-host wgc-helper): WGC→NVENC→framed AUs on stdout, stdin keyframe control. Committed a0f6cdd.
  • Step 4 — spawn + relay (capture/wgc_relay.rs, m3::virtual_stream_relay): SYSTEM host spawns the helper via CreateProcessAsUserW into winsta0\default, relays its stdout AUs to the QUIC send thread, forwards keyframe requests, surfaces helper stderr in host tracing. Committed 9f50b39.
  • Step 5 — source mux (m3::virtual_stream_relay): the DesktopWatcher switches the AU source — helper relay on Default, the host's own DDA capturer+encoder on Winlogon; every switch latches "wait for IDR" + forces the now-active source to emit a keyframe.

Live-validated on the RTX 4090 (2026-06-16, host as SYSTEM):

  • Step 4: the helper spawns via CreateProcessAsUserW, runs WGC with no hang (HDR FP16 BT.2020 PQ), opens NVENC (D3D11 Main10), and relays AUs — client-rs over the LAN decoded 411 HEVC Main-10 frames. (Bug found+fixed: CreateProcessAsUserW gave the helper the user's env, dropping PUNKTFUNK_ENCODER=nvenc → software-encoder fallback; fixed by merged_env_block.)

  • Step 5: with PUNKTFUNK_SECURE_TEST_PERIOD_MS=4000 driving a square-wave toggle, the source mux switched secure(DDA)normal(WGC relay) cleanly 5× in one session; the client decoded 308 frames continuously across every switch (the wait-for-IDR latch held — no decode break). The real Winlogon DDA capture itself is pre-proven by the single-process secure path (commit f4b4a6c); step 5's new surface is the mux, which the toggle exercises directly.

  • Step 6: the helper relaunch watchdog. Force-killing the helper PID mid-stream triggered exactly one WGC helper exited — rebuilt output + helper fails=1 and the stream recovered — client-rs decoded 645 frames continuously across the kill. A ~30s mux soak (2s toggle) ran 16 switches with 0 rebuilds / 0 early-ends / 465 frames decoded. (Recovery rebuilds the whole output, not a same-target respawn, which storm-failed with "no DXGI output for target N yet" after an abrupt kill.)

  • Step 2: SendInput now uses the retry-on-failure model (inject/sendinput.rs) — the thread stays bound to its desktop and only reattaches (OpenInputDesktop/SetThreadDesktop) on a SendInput short write (desktop switched), instead of two syscalls per event. Validated: client-rs --input-test injected for ~6s with no blocked desktop errors (steady-state path); the reattach-on-switch path is the same OpenInputDesktop call the old per-event code used, now lazy.

Remaining: a final user-driven smoke test — trigger a real UAC/lock on the box during a session and confirm the dialog appears on the client AND that clicking/typing on it lands (the box's UAC auto-elevates admins, so a real prompt can't be triggered headless over SSH; the mux switch itself is proven by the timed toggle, and DDA-on-Winlogon capture + input by the single-process secure path).

Note: the two-process path requires the host to run as SYSTEM (run.cmd.sysbak-s -i 1). As SYSTEM, WASAPI loopback audio (session 0) does not capture the user session's audio — a known limitation of SYSTEM-mode capture, separate from this work.

The constraint (verified live on the RTX 4090)

  • WGC (the composed-desktop capture that fixes frozen HDR animations) will not activate under the SYSTEM accountCreateForMonitor0x80070424. Thread-level ImpersonateLoggedOnUser is insufficient (tested: impersonated=true, still 0x80070424). WGC needs the process to run as the interactive user.
  • DDA + SendInput on the secure desktop (Winlogon: UAC/lock/login) require LOCAL_SYSTEM (attach to the Winlogon desktop). This is already shipped (task #17) when the host runs as SYSTEM.
  • Therefore one process can't do both. Single-process (the simpler design) is out.

Architecture: SYSTEM host + USER-session WGC helper, AU-relay (no shared GPU texture)

  • SYSTEM host (the existing m3-host, launched as SYSTEM in interactive Session 1 via the scheduled task → PsExec -s -i 1): owns the punktfunk/1 QUIC session, the single SudoVDA virtual output (+ isolate/restore RAII — the only topology owner), the DDA capture + NVENC encoder for the secure desktop, the single SendInput injector (serves both desktops), and the AU source mux that feeds the QUIC data plane.
  • USER-session WGC helper (a new m3-host subcommand, spawned by the SYSTEM host via WTSQueryUserToken(activeConsoleSessionId)DuplicateTokenEx(TokenPrimary)CreateProcessAsUserW(lpDesktop="winsta0\\default", CREATE_NO_WINDOW)): runs the existing WGC → scRGB/PQ → NVENC pipeline and ships Annex-B AUs ({data, pts_ns, keyframe}) to the SYSTEM host over a named pipe. It captures the SAME SudoVDA output by GDI name only — it must NOT create its own virtual output / touch display topology (a second topology owner re-triggers the ACCESS_LOST born-lost storm).
  • Mux: the SYSTEM host relays the helper's AUs onto QUIC while the input desktop is Default (normal — WGC, HDR/animation-correct), and switches to its own DDA encoder while it's Winlogon (secure — UAC/lock/login). The client sees one continuous stream; the encoder/FEC/AES-GCM/QUIC send path is untouched (same EncodedFrame flow). NVENC re-inits only on a size/format change across the swap (already handled); same-mode is a pointer re-register.
  • Input: stays entirely in the SYSTEM host (only it can attach to Winlogon). One windowless SendInput thread, Sunshine's retry-on-failure-only model (cache HDESK thread-local; SendInput first; only on 0-injected re-OpenInputDesktop+SetThreadDesktop and retry once) — serves both desktops with no per-event reattach. (Ctrl+Alt+Del/SAS needs SendSAS, out of scope; clicking UAC Yes/No + typing the login password are plain SendInput on Winlogon.)

Rejected: a shared NT-handle GPU texture (MIC/SDDL pain SYSTEM→user, keyed-mutex ring at 240 Hz, nvenc pointer-cache churn — all for a static lock dialog). AU bytes over a pipe are far simpler.

Detection

DesktopWatcher: a dedicated thread polling the input-desktop NAME at 3060 Hz — OpenInputDesktop(0,FALSE,0) + GetUserObjectInformationW(UOI_NAME) == "Winlogon" (secure) vs "Default" (normal) → Arc<AtomicU8>. This is the authoritative signal; WTS session notifications miss UAC entirely. (May also register WTSRegisterSessionNotification to short-circuit lock/unlock.)

Implementation steps (each independently buildable/testable on the 4090)

  1. DesktopWatcher (capture/desktop_watch.rs, ~40 lines): the poll + atomic. Test: lock / trigger UAC over the existing stream, confirm the atomic flips Default↔Winlogon within a poll interval.
  2. SendInput retry-on-failure model (inject/sendinput.rs): replace per-event reattach with the cached-HDESK + retry-once model. Test: normal input unchanged; click UAC + type the lock password land (works today via per-event reattach — this is a refactor).
  3. WGC helper subcommand (m3-host wgc-helper or similar): the existing WGC pipeline → NVENC → Annex-B AUs over a named-pipe server. Test standalone: as the user it writes a valid .h265 to the pipe (capturing the SudoVDA output by GDI name, no topology changes).
  4. Spawn + relay: SYSTEM host spawns the helper (CreateProcessAsUserW), connects the pipe, relays its AUs onto the live QUIC session. Test: normal-desktop stream sourced via the helper relay.
  5. Source mux: relay helper AUs while Default, switch to the host's own DDA encoder while Winlogon (reusing the DesktopWatcher). Test: normal (WGC, HDR) → trigger UAC → stream shows the UAC dialog (DDA) → dismiss → back to WGC; QUIC session stays up throughout. Full-coverage milestone.
  6. Relaunch watchdog + soak: SERVICE_CONTROL_SESSIONCHANGE-style relaunch of the helper on console connect/disconnect; soak a few hundred lock/unlock+UAC switches (cf. task #17's 1012-switch run) — no leak / black / disconnect. Cargo features for the fallback: Win32_System_Threading, Win32_System_Pipes, Win32_System_RemoteDesktop.

Risks / notes

  • Validate on the real 4090 only (ssh "Enrico Bühler"@192.168.1.174, Session 1 via the Interactive scheduled task) — the headless build VM can't reproduce Winlogon-on-virtual-display or WGC.
  • The helper MUST capture the SudoVDA by GDI name and never create a second virtual output (avoids the ACCESS_LOST born-lost storm — one isolate owner = the SYSTEM host).
  • Confirm reisolate fires on a FRESH mid-session DDA open at the desktop boundary (task #17 only validated DDA recovery within an already-DDA session).
  • Brief one-frame repeat/flicker at the WGC↔DDA boundary is acceptable (the local lock/UAC transition flickers too); never starve the encoder (repeat last frame across the swap gap).
  • Pragmatic alternative if full coverage isn't worth the build: PromptOnSecureDesktop=0 (UAC renders on the normal desktop → WGC captures it) covers UAC (not lock/login) with one reversible registry change.