# Windows secure-desktop capture — two-process design
Status: **all steps (1–6) implemented and live-validated on the RTX 4090 (2026-06-16).** The
two-process path works end to end (host as SYSTEM): the user-session WGC helper relays video, the mux
switches to the host's DDA on the secure desktop, a dead helper is rebuilt automatically, and the
SendInput injector follows desktop switches lazily. Only a *real* UAC/lock smoke test remains (can't
be triggered headless over SSH). The earlier user-mode WGC animation fix still ships; this is the
SYSTEM-mode design that adds secure-desktop (UAC/lock/login) coverage, since WGC and the secure desktop
need conflicting process tokens.
Implemented so far:
- **Step 1 — DesktopWatcher** (`capture/desktop_watch.rs`): polls the input-desktop name → atomic
`Default`/`Winlogon`. Committed `80e222d`.
- **Step 3 — WGC helper subcommand** (`wgc_helper.rs`, `punktfunk1-host wgc-helper`): WGC→NVENC→framed AUs on
stdout, stdin keyframe control. Committed `a0f6cdd`.
- **Step 4 — spawn + relay** (`capture/wgc_relay.rs`, `m3::virtual_stream_relay`): SYSTEM host spawns
the helper via `CreateProcessAsUserW` into `winsta0\default`, relays its stdout AUs to the QUIC send
thread, forwards keyframe requests, surfaces helper stderr in host tracing. Committed `9f50b39`.
- **Step 5 — source mux** (`m3::virtual_stream_relay`): the DesktopWatcher switches the AU source —
helper relay on `Default`, the host's own DDA capturer+encoder on `Winlogon`; every switch latches
"wait for IDR" + forces the now-active source to emit a keyframe.
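
The switch-and-latch behaviour of Step 5 can be sketched in isolation. This is a minimal sketch, not the code in `virtual_stream_relay`: the `Source`, `Au`, and `SourceMux` names are hypothetical, and the real mux also has to request the keyframe from whichever encoder becomes active.

```rust
/// Which encoder currently feeds the QUIC data plane.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Source {
    /// User-session WGC helper relay (input desktop `Default`).
    HelperRelay,
    /// The host's own DDA capturer + encoder (input desktop `Winlogon`).
    HostDda,
}

/// One encoded access unit, tagged with the source that produced it.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Au {
    pub source: Source,
    pub keyframe: bool,
    pub data: Vec<u8>,
}

/// Hypothetical mux state: the active source plus the wait-for-IDR latch.
pub struct SourceMux {
    active: Source,
    wait_for_idr: bool,
}

impl SourceMux {
    pub fn new(active: Source) -> Self {
        // A fresh stream must also start on a keyframe.
        Self { active, wait_for_idr: true }
    }

    /// Called when the desktop watcher reports a change. Returns the source
    /// that should be asked for a keyframe, or `None` if nothing changed.
    pub fn switch_to(&mut self, next: Source) -> Option<Source> {
        if next == self.active {
            return None;
        }
        self.active = next;
        self.wait_for_idr = true;
        Some(next)
    }

    /// Returns `true` if the AU should be forwarded to the send path.
    pub fn accept(&mut self, au: &Au) -> bool {
        if au.source != self.active {
            return false; // stale AU from the inactive source
        }
        if self.wait_for_idr {
            if !au.keyframe {
                return false; // a delta frame would reference the other stream
            }
            self.wait_for_idr = false;
        }
        true
    }
}
```

Dropping delta frames until the first keyframe from the new source is what keeps the client decoder from ever seeing a frame that references the other encoder's stream.
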
**Live-validated on the RTX 4090 (2026-06-16, host as SYSTEM):**
- Step 4: the helper spawns via `CreateProcessAsUserW`, runs WGC with no hang (HDR FP16 BT.2020 PQ),
opens NVENC (D3D11 Main10), and relays AUs — `client-rs` over the LAN decoded 411 HEVC Main-10
frames. (Bug found+fixed: `CreateProcessAsUserW` gave the helper the *user's* env, dropping
`PUNKTFUNK_ENCODER=nvenc` → software-encoder fallback; fixed by `merged_env_block`.)
- Step 5: with `PUNKTFUNK_SECURE_TEST_PERIOD_MS=4000` driving a square-wave toggle, the source mux
switched `secure(DDA)` ↔ `normal(WGC relay)` cleanly 5× in one session; the client decoded 308 frames
continuously across every switch (the wait-for-IDR latch held — no decode break). The real Winlogon
DDA capture itself is pre-proven by the single-process secure path (commit `f4b4a6c`); step 5's new
surface is the mux, which the toggle exercises directly.
- Step 6: the helper relaunch watchdog. Force-killing the helper PID mid-stream triggered exactly one
`WGC helper exited — rebuilt output + helper fails=1` and the stream recovered — client-rs decoded
645 frames continuously across the kill. A ~30s mux soak (2s toggle) ran 16 switches with 0 rebuilds
/ 0 early-ends / 465 frames decoded. (Recovery rebuilds the whole output, not a same-target respawn,
which storm-failed with "no DXGI output for target N yet" after an abrupt kill.)
- Step 2: SendInput now uses the retry-on-failure model (`inject/sendinput.rs`) — the thread stays
bound to its desktop and only reattaches (`OpenInputDesktop`/`SetThreadDesktop`) on a `SendInput`
short write (desktop switched), instead of two syscalls per event. Validated: `client-rs --input-test`
injected for ~6s with no `blocked desktop` errors (steady-state path); the reattach-on-switch path
is the same `OpenInputDesktop` call the old per-event code used, now lazy.
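
The control flow of the Step 2 retry-on-failure model can be sketched without any Win32 calls. The `DesktopInput` trait, `InjectError`, and `FakeDesktop` below are hypothetical stand-ins: in the real injector `send` would wrap `SendInput` and `reattach` would wrap `OpenInputDesktop` + `SetThreadDesktop`. The sketch treats any short write as "desktop switched" and resends the whole batch, which is only safe if batches are small (for example one event per call).

```rust
/// Hypothetical abstraction over the two operations the model needs.
pub trait DesktopInput {
    /// Try to inject `count` events; returns how many were accepted.
    fn send(&mut self, count: usize) -> usize;
    /// Rebind the calling thread to the current input desktop.
    fn reattach(&mut self) -> bool;
}

#[derive(Debug, PartialEq, Eq)]
pub enum InjectError {
    /// The thread could not be rebound to the input desktop.
    ReattachFailed,
    /// Injection still failed after one reattach.
    BlockedDesktop,
}

/// Send first; only on a short write reattach and retry exactly once.
pub fn inject<D: DesktopInput>(dev: &mut D, count: usize) -> Result<(), InjectError> {
    if dev.send(count) == count {
        return Ok(()); // steady state: one call, no desktop syscalls
    }
    if !dev.reattach() {
        return Err(InjectError::ReattachFailed);
    }
    if dev.send(count) == count {
        Ok(())
    } else {
        Err(InjectError::BlockedDesktop)
    }
}

/// Test double: injection succeeds only while the thread is "attached".
pub struct FakeDesktop {
    pub attached: bool,
    pub reattach_calls: u32,
}

impl DesktopInput for FakeDesktop {
    fn send(&mut self, count: usize) -> usize {
        if self.attached { count } else { 0 }
    }

    fn reattach(&mut self) -> bool {
        self.reattach_calls += 1;
        self.attached = true;
        true
    }
}
```

With the fake, a call made while attached never touches `reattach`, and a call made right after a simulated desktop switch reattaches once and then succeeds.
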
Remaining: a **final user-driven smoke test** — trigger a *real* UAC/lock on the box during a session
and confirm the dialog appears on the client AND that clicking/typing on it lands (the box's UAC
auto-elevates admins, so a real prompt can't be triggered headless over SSH; the mux switch itself is
proven by the timed toggle, and DDA-on-Winlogon capture + input by the single-process secure path).
> **Note:** the two-process path requires the host to run as SYSTEM (`run.cmd.sysbak` → `-s -i 1`).
> As SYSTEM, WASAPI loopback audio (session 0) does not capture the user session's audio — a known
> limitation of SYSTEM-mode capture, separate from this work.
## The constraint (verified live on the RTX 4090)
- **WGC** (the composed-desktop capture that fixes frozen HDR animations) **will not activate under
the SYSTEM account**: `CreateForMonitor` → `0x80070424`. Thread-level `ImpersonateLoggedOnUser` is
**insufficient** (tested: `impersonated=true`, still `0x80070424`). WGC needs the *process* to run
as the interactive user.
- **DDA + SendInput on the secure desktop (Winlogon: UAC/lock/login) require LOCAL_SYSTEM** (attach to
the Winlogon desktop). This is already shipped (task #17) when the host runs as SYSTEM.
- Therefore one process can't do both. Single-process (the simpler design) is **out**.
## Architecture: SYSTEM host + USER-session WGC helper, AU-relay (no shared GPU texture)
- **SYSTEM host** (the existing `punktfunk1-host`, launched as SYSTEM in interactive Session 1 via the
scheduled task → PsExec `-s -i 1`): owns the punktfunk/1 QUIC session, the single SudoVDA virtual
output (+ isolate/restore RAII — the *only* topology owner), the **DDA capture + NVENC encoder for
the secure desktop**, the **single SendInput injector** (serves *both* desktops), and the **AU
source mux** that feeds the QUIC data plane.
- **USER-session WGC helper** (a new `punktfunk1-host` subcommand, spawned by the SYSTEM host via
`WTSQueryUserToken(activeConsoleSessionId)` → `DuplicateTokenEx(TokenPrimary)` →
`CreateProcessAsUserW(lpDesktop="winsta0\\default", CREATE_NO_WINDOW)`): runs the existing
**WGC → scRGB/PQ → NVENC** pipeline and ships **Annex-B AUs** (`{data, pts_ns, keyframe}`) to the
SYSTEM host over a **named pipe**. It captures the SAME SudoVDA output **by GDI name only** — it
must NOT create its own virtual output / touch display topology (a second topology owner re-triggers
the ACCESS_LOST born-lost storm).
- **Mux**: the SYSTEM host relays the helper's AUs onto QUIC while the input desktop is `Default`
(normal — WGC, HDR/animation-correct), and switches to its own DDA encoder while it's `Winlogon`
(secure — UAC/lock/login). The client sees one continuous stream; the encoder/FEC/AES-GCM/QUIC send
path is untouched (same `EncodedFrame` flow). NVENC re-inits only on a size/format change across the
swap (already handled); same-mode is a pointer re-register.
- **Input**: stays entirely in the SYSTEM host (only it can attach to Winlogon). One windowless
SendInput thread, Sunshine's **retry-on-failure-only** model (cache HDESK thread-local; SendInput
first; only on 0-injected re-`OpenInputDesktop`+`SetThreadDesktop` and retry once) — serves both
desktops with no per-event reattach. (Ctrl+Alt+Del/SAS needs `SendSAS`, out of scope; clicking UAC
Yes/No + typing the login password are plain SendInput on Winlogon.)
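
The helper spawn above hands the child an environment block, and the live validation notes that the user's block alone dropped `PUNKTFUNK_ENCODER`. The following is a minimal sketch of one way to merge host-side variables into a user environment and serialize it in the Unicode layout `CreateProcessAsUserW` accepts (NUL-terminated `NAME=value` entries, sorted by name, with one extra NUL at the end). The function name and signature are hypothetical and are not the project's `merged_env_block`.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: user variables first, host overrides win on conflict.
pub fn merge_env_block_sketch(
    user_env: &[(&str, &str)],
    host_overrides: &[(&str, &str)],
) -> Vec<u16> {
    // Key by upper-cased name: Windows variable names are case-insensitive,
    // and the BTreeMap keeps the entries sorted by that key.
    let mut merged: BTreeMap<String, (String, String)> = BTreeMap::new();
    for (name, value) in user_env.iter().chain(host_overrides.iter()) {
        merged.insert(name.to_uppercase(), (name.to_string(), value.to_string()));
    }

    let mut block: Vec<u16> = Vec::new();
    for (name, value) in merged.values() {
        block.extend(format!("{}={}", name, value).encode_utf16());
        block.push(0); // terminator for this entry
    }
    if block.is_empty() {
        block.push(0); // an empty block is still two NULs
    }
    block.push(0); // extra terminator ends the whole block
    block
}
```

A block built this way would be passed together with `CREATE_UNICODE_ENVIRONMENT`; where the user's variables come from (for example `CreateEnvironmentBlock`) is outside this sketch.
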
Rejected: a shared NT-handle GPU texture (MIC/SDDL pain SYSTEM→user, keyed-mutex ring at 240 Hz,
nvenc pointer-cache churn — all for a static lock dialog). AU bytes over a pipe are far simpler.
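
To make "AU bytes over a pipe" concrete, here is a minimal framing sketch for the `{data, pts_ns, keyframe}` record. The wire layout (u32 length, u64 `pts_ns`, one flags byte, all little-endian) is an assumption for illustration; the document does not specify the actual frame format. It works over any `Read`/`Write`, so the same code would apply to a named pipe or to stdout.

```rust
use std::convert::TryFrom;
use std::io::{self, Read, Write};

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Au {
    pub data: Vec<u8>,
    pub pts_ns: u64,
    pub keyframe: bool,
}

/// Assumed layout: payload length (u32 LE), pts_ns (u64 LE),
/// flags (u8, bit 0 = keyframe), then the Annex-B payload.
pub fn write_au<W: Write>(out: &mut W, au: &Au) -> io::Result<()> {
    let len = u32::try_from(au.data.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "AU too large"))?;
    out.write_all(&len.to_le_bytes())?;
    out.write_all(&au.pts_ns.to_le_bytes())?;
    out.write_all(&[au.keyframe as u8])?;
    out.write_all(&au.data)?;
    out.flush()
}

/// Reads one frame; `Ok(None)` means the writer closed the stream cleanly.
pub fn read_au<R: Read>(input: &mut R) -> io::Result<Option<Au>> {
    let mut header = [0u8; 13];
    let mut filled = 0;
    // Read the header by hand so a clean EOF (zero bytes) can be told apart
    // from a header that was cut off midway.
    while filled < header.len() {
        let n = match input.read(&mut header[filled..]) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            return if filled == 0 {
                Ok(None)
            } else {
                Err(io::Error::new(io::ErrorKind::UnexpectedEof, "truncated AU header"))
            };
        }
        filled += n;
    }
    let len = u32::from_le_bytes([header[0], header[1], header[2], header[3]]) as usize;
    let mut pts = [0u8; 8];
    pts.copy_from_slice(&header[4..12]);
    let mut data = vec![0u8; len];
    input.read_exact(&mut data)?;
    Ok(Some(Au {
        data,
        pts_ns: u64::from_le_bytes(pts),
        keyframe: header[12] & 1 == 1,
    }))
}
```

Carrying the keyframe flag in the header lets the host apply the wait-for-IDR latch without parsing NAL units.
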
## Detection
`DesktopWatcher`: a dedicated thread polls the input-desktop NAME at 30–60 Hz using
`OpenInputDesktop(0,FALSE,0)` + `GetUserObjectInformationW(UOI_NAME)`, compares it against `"Winlogon"`
(secure) vs `"Default"` (normal), and publishes the result in an `Arc<AtomicU8>`. This is the
authoritative signal; WTS session notifications miss UAC entirely. (May also register
`WTSRegisterSessionNotification` to short-circuit lock/unlock.)
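
The publish side of the watcher can be sketched portably. This is a minimal sketch under stated assumptions: `DesktopKind`, `publish`, `run_watcher`, and the 20 ms interval are hypothetical, and `read_name` stands in for the `OpenInputDesktop` + `GetUserObjectInformationW(UOI_NAME)` query, which is not shown. Treating every name other than `"Winlogon"` as the normal desktop is also an assumption of the sketch.

```rust
use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// 20 ms is 50 Hz, inside the 30-60 Hz range described above (assumed value).
pub const POLL_INTERVAL: Duration = Duration::from_millis(20);

#[repr(u8)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DesktopKind {
    Default = 0,
    Winlogon = 1,
}

impl DesktopKind {
    /// Anything that is not exactly "Winlogon" counts as the normal desktop.
    pub fn from_name(name: &str) -> Self {
        if name == "Winlogon" { Self::Winlogon } else { Self::Default }
    }

    pub fn from_u8(raw: u8) -> Self {
        if raw == Self::Winlogon as u8 { Self::Winlogon } else { Self::Default }
    }
}

/// One poll step: store the new state and report it only if it changed.
pub fn publish(state: &AtomicU8, name: &str) -> Option<DesktopKind> {
    let kind = DesktopKind::from_name(name);
    let prev = state.swap(kind as u8, Ordering::AcqRel);
    if prev != kind as u8 { Some(kind) } else { None }
}

/// Poll loop body for the watcher thread. `read_name` returns `None` when the
/// query fails, in which case the previous state is kept.
pub fn run_watcher<F>(state: Arc<AtomicU8>, stop: Arc<AtomicBool>, mut read_name: F)
where
    F: FnMut() -> Option<String>,
{
    while !stop.load(Ordering::Acquire) {
        if let Some(name) = read_name() {
            publish(&state, &name);
        }
        thread::sleep(POLL_INTERVAL);
    }
}
```

Readers such as the mux would load the atomic and convert it with `DesktopKind::from_u8`; the `Option` returned by `publish` is a convenient hook for logging a switch.
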
## Implementation steps (each independently buildable/testable on the 4090)
1. **DesktopWatcher** (`capture/desktop_watch.rs`, ~40 lines): the poll + atomic. Test: lock / trigger
UAC over the existing stream, confirm the atomic flips `Default↔Winlogon` within a poll interval.
2. **SendInput retry-on-failure model** (`inject/sendinput.rs`): replace per-event reattach with the
cached-HDESK + retry-once model. Test: normal input unchanged; click UAC + type the lock password
land (works today via per-event reattach — this is a refactor).
3. **WGC helper subcommand** (`punktfunk1-host wgc-helper` or similar): the existing WGC pipeline → NVENC →
Annex-B AUs over a named-pipe server. Test standalone: as the user it writes a valid `.h265` to the
pipe (capturing the SudoVDA output by GDI name, no topology changes).
4. **Spawn + relay**: SYSTEM host spawns the helper (`CreateProcessAsUserW`), connects the pipe,
relays its AUs onto the live QUIC session. Test: normal-desktop stream sourced via the helper relay.
5. **Source mux**: relay helper AUs while `Default`, switch to the host's own DDA encoder while
`Winlogon` (reusing the DesktopWatcher). Test: normal (WGC, HDR) → trigger UAC → stream shows the
UAC dialog (DDA) → dismiss → back to WGC; QUIC session stays up throughout. **Full-coverage milestone.**
6. **Relaunch watchdog + soak**: `SERVICE_CONTROL_SESSIONCHANGE`-style relaunch of the helper on
console connect/disconnect; soak a few hundred lock/unlock+UAC switches (cf. task #17's 1012-switch
run) — no leak / black / disconnect. Cargo features for the fallback: `Win32_System_Threading`,
`Win32_System_Pipes`, `Win32_System_RemoteDesktop`.
## Risks / notes
- Validate on the real 4090 only (`ssh "Enrico Bühler"@192.168.1.174`, Session 1 via the Interactive
scheduled task) — the headless build VM can't reproduce Winlogon-on-virtual-display or WGC.
- The helper MUST capture the SudoVDA by GDI name and never create a second virtual output (avoids the
ACCESS_LOST born-lost storm — one isolate owner = the SYSTEM host).
- Confirm `reisolate` fires on a FRESH mid-session DDA open at the desktop boundary (task #17 only
validated DDA recovery within an already-DDA session).
- Brief one-frame repeat/flicker at the WGC↔DDA boundary is acceptable (the local lock/UAC transition
flickers too); never starve the encoder (repeat last frame across the swap gap).
- Pragmatic alternative if full coverage isn't worth the build: `PromptOnSecureDesktop=0` (UAC renders
on the normal desktop → WGC captures it) covers UAC (not lock/login) with one reversible registry
change.