NVIDIA/AMD Vulkan ICDs refuse to *advertise* an HDR color space for a surface on an
IddCx indirect/virtual display, so Vulkan games (Doom: The Dark Ages, id Tech, Indiana
Jones, …) report "device does not support HDR" — even though Windows HDR, DWM compose,
and the client PQ stream all work, and the ICD happily *accepts + presents* a forced HDR
swapchain there. The whole gap is enumeration; the community (Apollo/Sunshine/VDD) wrote
this off as kernel-side / unfixable.
Add VK_LAYER_PUNKTFUNK_hdr_inject (packaging/windows/pf-vkhdr-layer/): a standalone
cdylib Vulkan implicit layer that appends {A2B10G10R10, HDR10_ST2084} + {RGBA16F, scRGB}
to vkGetPhysicalDeviceSurfaceFormats[2]KHR (no need to hook vkCreateSwapchainKHR — the
ICD doesn't validate the color space there). Self-gated on the surface monitor's actual
advanced-color state (DisplayConfig GET_ADVANCED_COLOR_INFO), so it is a complete no-op
on SDR sessions and real monitors (dedup). Always-on (registry-discovered) so it works
regardless of how a game is launched — env-scoping silently fails for already-running
Steam. Escape hatches: DISABLE_PF_VKHDR, PF_VKHDR_EXCLUDE, and a built-in kernel-anti-
cheat denylist.
The installer builds/signs/stages it and registers it under
HKLM64\SOFTWARE\Khronos\Vulkan\ImplicitLayers (opt-out "Install the HDR Vulkan layer"
task); windows-host CI fmt+clippy-gates it (msvc-only FFI).
Live-validated on the RTX box: Doom: The Dark Ages enables HDR over the pf-vdisplay
virtual display.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
# Windows secure-desktop capture: two-process design

Status: **all steps (1–6) implemented and live-validated on the RTX 4090 (2026-06-16).** The
two-process path works end to end (host as SYSTEM): the user-session WGC helper relays video, the mux
switches to the host's DDA on the secure desktop, a dead helper is rebuilt automatically, and the
SendInput injector follows desktop switches lazily. Only a *real* UAC/lock smoke test remains (it can't
be triggered headless over SSH). The earlier user-mode WGC animation fix still ships; this is the
SYSTEM-mode design that adds secure-desktop (UAC/lock/login) coverage, since WGC and the secure desktop
need conflicting process tokens.

Implemented so far:
- **Step 1, DesktopWatcher** (`capture/desktop_watch.rs`): polls the input-desktop name → atomic
  `Default`/`Winlogon`. Committed `80e222d`.
- **Step 3, WGC helper subcommand** (`wgc_helper.rs`, `punktfunk1-host wgc-helper`): WGC→NVENC→framed AUs on
  stdout, with keyframe control on stdin. Committed `a0f6cdd`.
- **Step 4, spawn + relay** (`capture/wgc_relay.rs`, `m3::virtual_stream_relay`): the SYSTEM host spawns
  the helper via `CreateProcessAsUserW` into `winsta0\default`, relays its stdout AUs to the QUIC send
  thread, forwards keyframe requests, and surfaces helper stderr in host tracing. Committed `9f50b39`.
- **Step 5, source mux** (`m3::virtual_stream_relay`): the DesktopWatcher switches the AU source:
  helper relay on `Default`, the host's own DDA capturer+encoder on `Winlogon`. Every switch latches
  "wait for IDR" and forces the now-active source to emit a keyframe.

**Live-validated on the RTX 4090 (2026-06-16, host as SYSTEM):**
- Step 4: the helper spawns via `CreateProcessAsUserW`, runs WGC with no hang (HDR FP16 BT.2020 PQ),
  opens NVENC (D3D11 Main10), and relays AUs; `client-rs` over the LAN decoded 411 HEVC Main-10
  frames. (Bug found and fixed: `CreateProcessAsUserW` gave the helper the *user's* env, dropping
  `PUNKTFUNK_ENCODER=nvenc` → software-encoder fallback; fixed by `merged_env_block`.)
- Step 5: with `PUNKTFUNK_SECURE_TEST_PERIOD_MS=4000` driving a square-wave toggle, the source mux
  switched `secure(DDA)`↔`normal(WGC relay)` cleanly 5× in one session; the client decoded 308 frames
  continuously across every switch (the wait-for-IDR latch held, with no decode break). The real Winlogon
  DDA capture itself is pre-proven by the single-process secure path (commit `f4b4a6c`); step 5's new
  surface is the mux, which the toggle exercises directly.

- Step 6: the helper relaunch watchdog. Force-killing the helper PID mid-stream triggered exactly one
  `WGC helper exited — rebuilt output + helper fails=1` and the stream recovered: `client-rs` decoded
  645 frames continuously across the kill. A ~30s mux soak (2s toggle) ran 16 switches with 0 rebuilds
  / 0 early-ends / 465 frames decoded. (Recovery rebuilds the whole output rather than respawning the
  helper against the same target, which storm-failed with "no DXGI output for target N yet" after an
  abrupt kill.)

- Step 2: SendInput now uses the retry-on-failure model (`inject/sendinput.rs`): the thread stays
  bound to its desktop and only reattaches (`OpenInputDesktop`/`SetThreadDesktop`) on a `SendInput`
  short write (desktop switched), instead of making two syscalls per event. Validated: `client-rs --input-test`
  injected for ~6s with no `blocked desktop` errors (steady-state path); the reattach-on-switch path
  is the same `OpenInputDesktop` call the old per-event code used, now lazy.

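The retry-on-failure model described in Step 2 can be sketched without any Win32 calls. This is a minimal illustration under stated assumptions, not the code in `inject/sendinput.rs`: the `InputBackend` trait, `inject`, and `FakeDesktop` are hypothetical names, with `send` standing in for `SendInput` (which returns the number of events it injected) and `reattach` standing in for `OpenInputDesktop` + `SetThreadDesktop`.

```rust
/// Hypothetical abstraction over the two Win32 operations the model needs.
pub trait InputBackend {
    /// Stand-in for `SendInput`: returns how many events were injected.
    fn send(&mut self, events: &[u32]) -> usize;
    /// Stand-in for `OpenInputDesktop` + `SetThreadDesktop`; true on success.
    fn reattach(&mut self) -> bool;
}

#[derive(Debug, PartialEq, Eq)]
pub enum InjectResult {
    /// Steady-state path: one send call, no desktop syscalls.
    Sent,
    /// The first send came up short; reattaching and retrying once worked.
    SentAfterReattach,
    /// Still short after one reattach + retry; give up on this batch.
    Blocked,
}

/// Send first; reattach and retry exactly once, and only after a short write.
pub fn inject<B: InputBackend>(backend: &mut B, events: &[u32]) -> InjectResult {
    let sent = backend.send(events).min(events.len());
    if sent == events.len() {
        return InjectResult::Sent;
    }
    if !backend.reattach() {
        return InjectResult::Blocked;
    }
    // Retry only the tail that was not injected, so nothing is duplicated.
    let remaining = &events[sent..];
    if backend.send(remaining) == remaining.len() {
        InjectResult::SentAfterReattach
    } else {
        InjectResult::Blocked
    }
}

/// Test double: injection fails while the thread is bound to a stale desktop.
pub struct FakeDesktop {
    pub attached: bool,
    pub reattach_calls: u32,
    pub injected: Vec<u32>,
}

impl InputBackend for FakeDesktop {
    fn send(&mut self, events: &[u32]) -> usize {
        if self.attached {
            self.injected.extend_from_slice(events);
            events.len()
        } else {
            0
        }
    }

    fn reattach(&mut self) -> bool {
        self.reattach_calls += 1;
        self.attached = true;
        true
    }
}
```

In this sketch the steady-state path costs a single `send` call, and `reattach` runs only after a desktop switch has made a send come up short.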
Remaining: a **final user-driven smoke test**. Trigger a *real* UAC/lock on the box during a session
and confirm that the dialog appears on the client AND that clicking/typing on it lands. (The box's UAC
auto-elevates admins, so a real prompt can't be triggered headless over SSH; the mux switch itself is
proven by the timed toggle, and DDA-on-Winlogon capture + input by the single-process secure path.)

> **Note:** the two-process path requires the host to run as SYSTEM (`run.cmd.sysbak` → `-s -i 1`).
> As SYSTEM, WASAPI loopback audio (session 0) does not capture the user session's audio. This is a known
> limitation of SYSTEM-mode capture, separate from this work.

## The constraint (verified live on the RTX 4090)

- **WGC** (the composed-desktop capture that fixes frozen HDR animations) **will not activate under
  the SYSTEM account**: `CreateForMonitor` → `0x80070424`. Thread-level `ImpersonateLoggedOnUser` is
  **insufficient** (tested: `impersonated=true`, still `0x80070424`). WGC needs the *process* to run
  as the interactive user.
- **DDA + SendInput on the secure desktop (Winlogon: UAC/lock/login) require LOCAL_SYSTEM** (to attach to
  the Winlogon desktop). This is already shipped (task #17) when the host runs as SYSTEM.
- Therefore one process can't do both. Single-process (the simpler design) is **out**.

## Architecture: SYSTEM host + USER-session WGC helper, AU-relay (no shared GPU texture)

- **SYSTEM host** (the existing `punktfunk1-host`, launched as SYSTEM in interactive Session 1 via the
  scheduled task → PsExec `-s -i 1`): owns the punktfunk/1 QUIC session, the single SudoVDA virtual
  output (+ isolate/restore RAII; the *only* topology owner), the **DDA capture + NVENC encoder for
  the secure desktop**, the **single SendInput injector** (serves *both* desktops), and the **AU
  source mux** that feeds the QUIC data plane.
- **USER-session WGC helper** (a new `punktfunk1-host` subcommand, spawned by the SYSTEM host via
  `WTSQueryUserToken(activeConsoleSessionId)` → `DuplicateTokenEx(TokenPrimary)` →
  `CreateProcessAsUserW(lpDesktop="winsta0\\default", CREATE_NO_WINDOW)`): runs the existing
  **WGC → scRGB/PQ → NVENC** pipeline and ships **Annex-B AUs** (`{data, pts_ns, keyframe}`) to the
  SYSTEM host over a **named pipe**. It captures the SAME SudoVDA output **by GDI name only**: it
  must NOT create its own virtual output or touch display topology (a second topology owner re-triggers
  the ACCESS_LOST born-lost storm).
- **Mux**: the SYSTEM host relays the helper's AUs onto QUIC while the input desktop is `Default`
  (normal: WGC, HDR/animation-correct), and switches to its own DDA encoder while it's `Winlogon`
  (secure: UAC/lock/login). The client sees one continuous stream; the encoder/FEC/AES-GCM/QUIC send
  path is untouched (same `EncodedFrame` flow). NVENC re-inits only on a size/format change across the
  swap (already handled); same-mode is a pointer re-register.
- **Input**: stays entirely in the SYSTEM host (only it can attach to Winlogon). One windowless
  SendInput thread using Sunshine's **retry-on-failure-only** model (cache HDESK thread-local; SendInput
  first; only on 0-injected re-`OpenInputDesktop`+`SetThreadDesktop` and retry once) serves both
  desktops with no per-event reattach. (Ctrl+Alt+Del/SAS needs `SendSAS`, out of scope; clicking UAC
  Yes/No + typing the login password are plain SendInput on Winlogon.)

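The mux rule above, together with the "wait for IDR" latch mentioned in the status notes, amounts to a small state machine. The sketch below is illustrative only and is not the code in `m3::virtual_stream_relay`: `SourceMux`, `Source`, `Desktop`, and `Au` are hypothetical names, and starting with the latch armed is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    WgcRelay, // AUs relayed from the user-session helper
    Dda,      // AUs from the host's own DDA capturer + encoder
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Desktop {
    Default,
    Winlogon,
}

/// Minimal stand-in for an encoded access unit.
pub struct Au {
    pub source: Source,
    pub keyframe: bool,
    pub pts_ns: u64,
}

pub struct SourceMux {
    active: Source,
    wait_for_idr: bool,
}

impl SourceMux {
    /// Assumption: start on the WGC relay with the latch armed, so the
    /// first forwarded AU is always a keyframe.
    pub fn new() -> Self {
        Self { active: Source::WgcRelay, wait_for_idr: true }
    }

    /// Feed the watcher's current desktop. On a switch, returns the source
    /// that just became active so the caller can ask it for a keyframe.
    pub fn on_desktop(&mut self, desktop: Desktop) -> Option<Source> {
        let wanted = match desktop {
            Desktop::Default => Source::WgcRelay,
            Desktop::Winlogon => Source::Dda,
        };
        if wanted == self.active {
            return None;
        }
        self.active = wanted;
        self.wait_for_idr = true; // drop AUs until the new source sends an IDR
        Some(wanted)
    }

    /// True if this AU should go onto the QUIC data plane.
    pub fn accept(&mut self, au: &Au) -> bool {
        if au.source != self.active {
            return false; // stale AU from the inactive source
        }
        if self.wait_for_idr {
            if !au.keyframe {
                return false;
            }
            self.wait_for_idr = false;
        }
        true
    }
}
```

Dropping non-keyframe AUs after a switch means the decoder never sees a delta frame that references the other encoder's pictures.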
Rejected: a shared NT-handle GPU texture (MIC/SDDL pain SYSTEM→user, keyed-mutex ring at 240 Hz,
NVENC pointer-cache churn, all for a static lock dialog). AU bytes over a pipe are far simpler.

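To show how little an AU byte relay needs, here is a minimal length-prefixed framing for the `{data, pts_ns, keyframe}` record over any byte stream. The wire layout (4-byte little-endian length, 8-byte little-endian `pts_ns`, 1 flag byte, then the payload), the `MAX_AU_BYTES` cap, and the names `AccessUnit`, `write_au`, and `read_au` are assumptions made for illustration; the document does not specify the helper's actual framing.

```rust
use std::io::{self, Read, Write};

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AccessUnit {
    pub data: Vec<u8>, // Annex-B bytes
    pub pts_ns: u64,
    pub keyframe: bool,
}

/// Hypothetical sanity cap so a corrupt length cannot trigger a huge allocation.
const MAX_AU_BYTES: usize = 64 * 1024 * 1024;
const HEADER_LEN: usize = 13; // 4 (len) + 8 (pts_ns) + 1 (flags)

/// Write one framed AU: header, then payload.
pub fn write_au<W: Write>(w: &mut W, au: &AccessUnit) -> io::Result<()> {
    if au.data.len() > MAX_AU_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "AU too large"));
    }
    w.write_all(&(au.data.len() as u32).to_le_bytes())?;
    w.write_all(&au.pts_ns.to_le_bytes())?;
    w.write_all(&[au.keyframe as u8])?;
    w.write_all(&au.data)
}

/// Read one framed AU; fails with `UnexpectedEof` when the stream ends.
pub fn read_au<R: Read>(r: &mut R) -> io::Result<AccessUnit> {
    let mut header = [0u8; HEADER_LEN];
    r.read_exact(&mut header)?;

    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let mut pts = [0u8; 8];
    pts.copy_from_slice(&header[4..12]);
    let keyframe = header[12] != 0;

    if len > MAX_AU_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "AU length exceeds cap"));
    }
    let mut data = vec![0u8; len];
    r.read_exact(&mut data)?;
    Ok(AccessUnit { data, pts_ns: u64::from_le_bytes(pts), keyframe })
}
```

Because both functions are generic over `Read`/`Write`, the same framing would work over a named pipe or over the helper's stdout.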
## Detection

`DesktopWatcher` is a dedicated thread that polls the input-desktop NAME at 30–60 Hz:
`OpenInputDesktop(0,FALSE,0)` + `GetUserObjectInformationW(UOI_NAME)` == `"Winlogon"` (secure) vs
`"Default"` (normal) → `Arc<AtomicU8>`. This is the authoritative signal; WTS session notifications
miss UAC entirely. (May also register `WTSRegisterSessionNotification` to short-circuit lock/unlock.)

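The poll-and-publish loop can be sketched with the Win32 query factored out. This is a portable illustration, not the contents of `capture/desktop_watch.rs`: `spawn_watcher`, `classify`, and the two constants are hypothetical names, and `query_name` stands in for the `OpenInputDesktop` + `GetUserObjectInformationW(UOI_NAME)` pair. Treating every non-`Winlogon` name as normal and ignoring ASCII case are assumptions.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

pub const DESKTOP_DEFAULT: u8 = 0;
pub const DESKTOP_WINLOGON: u8 = 1;

/// Map an input-desktop name to the published state.
/// Assumption: anything that is not "Winlogon" counts as the normal desktop.
pub fn classify(name: &str) -> u8 {
    if name.eq_ignore_ascii_case("Winlogon") {
        DESKTOP_WINLOGON
    } else {
        DESKTOP_DEFAULT
    }
}

/// Spawn the polling thread. `query_name` returns the current input-desktop
/// name, or `None` if the query failed (the last known state is then kept).
pub fn spawn_watcher<F>(
    mut query_name: F,
    period: Duration,
    stop: Arc<AtomicBool>,
) -> (Arc<AtomicU8>, thread::JoinHandle<()>)
where
    F: FnMut() -> Option<String> + Send + 'static,
{
    let state = Arc::new(AtomicU8::new(DESKTOP_DEFAULT));
    let shared = Arc::clone(&state);
    let handle = thread::spawn(move || {
        while !stop.load(Ordering::Relaxed) {
            if let Some(name) = query_name() {
                shared.store(classify(&name), Ordering::Release);
            }
            thread::sleep(period);
        }
    });
    (state, handle)
}
```

A period of roughly 16 to 33 ms corresponds to the 30–60 Hz poll rate; readers load the returned atomic without ever touching the desktop APIs themselves.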
## Implementation steps (each independently buildable/testable on the 4090)

1. **DesktopWatcher** (`capture/desktop_watch.rs`, ~40 lines): the poll + atomic. Test: lock / trigger
   UAC over the existing stream, confirm the atomic flips `Default↔Winlogon` within a poll interval.
2. **SendInput retry-on-failure model** (`inject/sendinput.rs`): replace per-event reattach with the
   cached-HDESK + retry-once model. Test: normal input unchanged; clicking UAC + typing the lock password
   both land (works today via per-event reattach; this is a refactor).
3. **WGC helper subcommand** (`punktfunk1-host wgc-helper` or similar): the existing WGC pipeline → NVENC →
   Annex-B AUs over a named-pipe server. Test standalone: as the user it writes a valid `.h265` to the
   pipe (capturing the SudoVDA output by GDI name, no topology changes).
4. **Spawn + relay**: the SYSTEM host spawns the helper (`CreateProcessAsUserW`), connects the pipe, and
   relays its AUs onto the live QUIC session. Test: normal-desktop stream sourced via the helper relay.
5. **Source mux**: relay helper AUs while `Default`, switch to the host's own DDA encoder while
   `Winlogon` (reusing the DesktopWatcher). Test: normal (WGC, HDR) → trigger UAC → stream shows the
   UAC dialog (DDA) → dismiss → back to WGC; QUIC session stays up throughout. **Full-coverage milestone.**
6. **Relaunch watchdog + soak**: `SERVICE_CONTROL_SESSIONCHANGE`-style relaunch of the helper on
   console connect/disconnect; soak a few hundred lock/unlock+UAC switches (cf. task #17's 1012-switch
   run) with no leak / black / disconnect. Cargo features for the fallback: `Win32_System_Threading`,
   `Win32_System_Pipes`, `Win32_System_RemoteDesktop`.

## Risks / notes

- Validate on the real 4090 only (`ssh "Enrico Bühler"@192.168.1.174`, Session 1 via the Interactive
  scheduled task); the headless build VM can't reproduce Winlogon-on-virtual-display or WGC.
- The helper MUST capture the SudoVDA by GDI name and never create a second virtual output (avoids the
  ACCESS_LOST born-lost storm; one isolate owner = the SYSTEM host).
- Confirm `reisolate` fires on a FRESH mid-session DDA open at the desktop boundary (task #17 only
  validated DDA recovery within an already-DDA session).
- Brief one-frame repeat/flicker at the WGC↔DDA boundary is acceptable (the local lock/UAC transition
  flickers too); never starve the encoder (repeat last frame across the swap gap).
- Pragmatic alternative if full coverage isn't worth the build: `PromptOnSecureDesktop=0` (UAC renders
  on the normal desktop → WGC captures it) covers UAC (not lock/login) with one reversible registry
  change.