Windows host + client — implementation plan

Status: in progress — dev box provisioned, host-first. A Windows host is an "add backends behind the existing traits" job, not a parallel port: punktfunk-core and the whole control plane are platform-agnostic and the host already compiles on non-Linux (macOS) thanks to existing cfg(target_os) gating. The one piece that used to make it XL — a per-client virtual output, which has no user-mode Windows API — is solved by reusing SudoVDA (the SudoMaker Virtual Display Adapter, the same IDD the Apollo Sunshine-fork ships): a pre-built IDD that creates virtual displays at arbitrary WxH@Hz on the fly. We install it and drive its IOCTL control interface — no driver to write or WHQL-sign.

History: scoped 2026-06-10 (4-agent read of the host crate); SudoVDA path 2026-06-11; this concrete plan + dev box + SudoVDA protocol + no-GPU strategy added 2026-06-14 (12-agent research pass).

Status (2026-06-15) — full pipeline live-validated on an RTX 4090

Every OS-touching backend is implemented behind the existing traits and builds clean on x86_64-pc-windows-msvc (and Linux unaffected). serve / punktfunk1-host run on Windows (identity in %APPDATA%, QUIC bound, mDNS advertising, accepting sessions). The full native pipeline is validated live on a real RTX 4090 (Windows 11): SudoVDA virtual display → DXGI Desktop Duplication (D3D11 zero-copy) → NVENC HEVC → punktfunk/1 → Rust reference client, at 720p60 and 1080p60 (0 mismatched frames, p50 1.6 / 3.45 ms cross-machine, ffmpeg-decodes clean), coexisting with a running Apollo (two concurrent NVENC sessions).

| Backend | State | GPU-less validation on the VM |
| --- | --- | --- |
| Virtual display (SudoVDA) | done | live: open/version/watchdog/ADD/REMOVE via the trait |
| Input (SendInput) | live on RTX 4090 | mouse injection verified: cursor tracked the client's absolute diagonal sweep across the desktop in Session 1 (keyboard shares the same SendInput primitive) |
| Software encode (openh264) | done | live: m0 synthetic→openh264→core FEC loopback, 120/120, 0 mismatches |
| Audio (WASAPI loopback) | done | live: init chain opens (silent VM → no samples) |
| Capture (DXGI Desktop Duplication) | live on RTX 4090 | SudoVDA monitor → D3D11 zero-copy duplication; output is enumerated under the rendering GPU, not the SudoVDA LUID (search all adapters) |
| NVENC (D3D11, --features nvenc) | live on RTX 4090 | 720p60 + 1080p60 HEVC end-to-end to the Rust client; ffmpeg-decodes clean; ran alongside Apollo (2 NVENC sessions) |
| Run host (serve/punktfunk1-host) | live | punktfunk1-host starts + listens; c_abi_connection_roundtrip passes |
| Gamepad (ViGEm) | done | compiles incl. rumble back-channel; live needs ViGEmBus + a physical pad |
| Host→client audio wiring | done | builds on MSVC; m3 audio_thread active on Windows (silent VM → no samples to send) |
| GameStream (Moonlight) audio | done | stereo path active on Windows (WASAPI→Opus→RTP/FEC); surround stays Linux-only (libopus multistream / audiopus_sys) |
| Rumble back-channel (ViGEm) | done | request_notification → background thread → 0xCA; live needs a physical pad |
| Game library (Steam discovery) | done | Windows Steam roots (Program Files) + VDF other-drive libraries; custom store already cross-platform. Non-default Steam install dir (registry) not yet covered |

Remaining for full parity:

  • Keyboard injection — exercised via the same SendInput path (mouse verified live); not yet asserted into a focused text field.
  • ViGEm rumble + gamepad input — the pad is created live (ViGEmBus connected); the rumble back-channel + input still need a physical pad to verify.
  • GameStream (Moonlight) path on a GPU box — not yet run live (its fixed ports collide with Apollo, so stop Apollo first).
  • Frame pacing on static content — DXGI duplication is change-driven, so a blank/idle virtual display delivers only ~12 fps (181/177 frames over ~15 s); a rendering app drives the full rate.

Live UX hardening (2026-06-15, validated Mac ↔ RTX 4090)

Driven by live testing with the native macOS client at the display's native 5120×1440@240:

  • Native resolution, not 1080p. sudovda::set_active_mode enumerates the modes the IDD actually advertises (EnumDisplaySettingsW) and sets the requested resolution at the best supported refresh — keeping 5120×1440@240, never silently collapsing to the 1280×720/1920×1080 OS default when an exact mode is briefly unavailable.
  • Bitrate auto-cap. NVENC init_session probes and steps the average bitrate down (×3/4 to a floor) when the requested rate exceeds the GPU's codec-level max, so a high client bitrate connects instead of failing (matches the Linux host; we do NOT split NVENC sessions).
  • Mouse cursor. DXGI duplication excludes the hardware cursor; we read the pointer position/shape from the frame info (GetFramePointerShape) and GPU-composite it onto the captured texture before NVENC (a CPU read-back would stall the pipeline). Color cursors alpha-blend; masked-color cursors (the text I-beam) use an INV_DEST_COLOR blend for true screen inversion, so the caret is visible on any background (no black box). Monochrome handled too.
  • Secure desktop (lock / login / UAC). The host runs as SYSTEM in the interactive session; the capturer SetThreadDesktops onto the current input desktop and, on the WinSta switch, recreates the D3D11 device and re-resolves the virtual output's GDI name from the stable SudoVDA target id (the name changes across the topology rebuild — the old failure was hunting the stale \\.\DISPLAYn and dropping). ACCESS_LOST / INVALID_CALL / device-removed are all treated as recoverable, and a mid-stream resolution change is followed (capturer + NVENC re-init at the new size). Validated: logging in / locking through the stream stays connected (one real session recovered 1012 desktop switches and completed cleanly). Display isolation (isolate_displays detaches other monitors so Winlogon renders to the virtual output) covers the case where a physical monitor is also attached.
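The bitrate auto-cap above can be sketched as a small step-down loop. This is a minimal illustration, not the host's init_session code: the function name fit_bitrate and the accepts closure (standing in for an NVENC session probe) are hypothetical, and only the ×3/4 step and the floor come from the text.

```rust
/// Step a requested bitrate down by x3/4 until the probe accepts it.
///
/// `accepts` is a hypothetical stand-in for "does an NVENC session
/// initialise at this average bitrate?". Returns `None` when even the
/// floor is rejected.
fn fit_bitrate(requested: u64, floor: u64, accepts: impl Fn(u64) -> bool) -> Option<u64> {
    // Never start below the floor.
    let mut rate = requested.max(floor);
    loop {
        if accepts(rate) {
            return Some(rate);
        }
        if rate <= floor {
            // The floor itself was rejected: give up instead of looping.
            return None;
        }
        // x3/4 step, clamped so the last attempt lands exactly on the floor.
        rate = (rate / 4 * 3).max(floor);
    }
}
```

With a probe that accepts anything up to 500 kbit/s, a 1 Mbit/s request settles after three steps (750 000, 562 500, 421 875), so the session connects at a reduced rate rather than failing.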

Running as SYSTEM (deployment) — the PunktfunkHost service

To capture the secure desktop the host must run as SYSTEM in the interactive Session 1 (a Session 0 service can't duplicate Session 1). The end-user deployment is the built-in Windows service (src/service.rs) — see windows-service.md. One elevated command:

punktfunk-host service install   # auto-start LocalSystem service + firewall rules + default host.env
punktfunk-host service start

The service runs in Session 0 but never captures: it duplicates its own LocalSystem token, retargets it to the active console session, and CreateProcessAsUserWs the host there — supervising it across exits and console-session switches (the Sunshine/Apollo model). Config lives in %ProgramData%\punktfunk\host.env; logs in %ProgramData%\punktfunk\logs\.

Old bring-up chain (debug only, superseded by the service): a scheduled task (Interactive, Highest) → PsExec64 -s -i 1 -d wscript.exe launch.vbs → host-run.cmd (hidden window), with APPDATA=C:\Users\Public as the shared-identity hack. The service replaces all of this; the host now resolves its config dir to %ProgramData%\punktfunk directly (PUNKTFUNK_CONFIG_DIR overrides).

Real-GPU test box (RTX 4090, ssh "Enrico Bühler"@192.168.1.174)

Windows 11, RTX 4090 (driver 596.36) + AMD iGPU, SudoVDA + Apollo (sunshine) installed. SSH lands in Session 0 (non-interactive) — DXGI duplication + SendInput need the interactive Session 1, so launch the host there via an Interactive scheduled task (admin SSH session is the same user): Register-ScheduledTask -Principal (New-ScheduledTaskPrincipal -UserId (whoami) -LogonType Interactive -RunLevel Highest), then Start-ScheduledTask. The host runs with desktop access; read its redirected log over SSH. nvEncodeAPI64.dll ships with the driver, so a VM-built --features nvenc exe runs here as-is (no SDK install). The 4090's Ada NVENC has no consumer session cap, so the host encodes alongside Apollo. Gotcha: the SudoVDA monitor is rendered by — and DXGI-enumerated under — the 4090, not the SudoVDA adapter LUID (the capturer searches all adapters; see the fix).

Native build on the 4090 (fast iteration loop)

Build on the box itself (edit locally → sftp to the repo → cargo build there → run via the task) instead of build-on-VM-then-copy. Prereqs that bit us, in order:

  1. Full MSVC C++ build tools, incl. the CRT libs. A VS install can land cl.exe + the Windows SDK + sanitizer libs but miss the desktop CRT import libs (VC\Tools\MSVC\<ver>\lib\x64\msvcrt.lib, libcmt.lib, …) → LNK1104: msvcrt.lib. Root cause here: the Microsoft.VisualCpp.Redist.14 package failed to install (1603), cascading to skip the NativeDesktop workload. Fix = (re)install the C++ workload via the VS Installer GUI (the headless setup.exe modify over SSH fails — a non-elevated SSH token gives 1603/87, and --quiet as SYSTEM hangs). A reboot may be needed first (a pending reboot also yields 1603). Stop-gap: the desktop CRT libs are version-pinned, so they can be copied from another box with the identical MSVC version (14.51.36231 here).
  2. Build from an ASCII path. A username with a non-ASCII char (C:\Users\Enrico Bühler\…) breaks the MSVC PDB writer → LNK1201: error writing to the program database. Clone/copy the repo to e.g. C:\Users\Public\punktfunk-native and build there (the VM worked only because it built in C:\Users\Public\punktfunk).
  3. winget install NASM.NASM Kitware.CMake; generate the NVENC import lib (lib /def → set PUNKTFUNK_NVENC_LIB_DIR); set CMAKE_POLICY_VERSION_MINIMUM=3.5 (libopus).

Build env (each cargo invocation): $env:PATH += ";C:\Program Files\NASM;C:\Program Files\CMake\bin", $env:CMAKE_POLICY_VERSION_MINIMUM="3.5", $env:PUNKTFUNK_NVENC_LIB_DIR="C:\Users\Public\nvenc", then cargo build --release -p punktfunk-host --features nvenc. Validated: native build (1m37s) → 720p60 NVENC, 174/174 frames, p50 2.5 ms, ffmpeg-decodes clean.

All Windows backends are clippy -D warnings and rustfmt clean on x86_64-pc-windows-msvc (the Windows-only modules are cfg-excluded from Linux CI, so run clippy on the VM after touching them — its rustc 1.96 clippy is stricter than the Linux CI image on shared code, e.g. needless_return).

Building & testing on a real-GPU Windows box (NVENC)

  1. Install SudoVDA (virtual display) and ViGEmBus (gamepad) drivers; install the NVIDIA driver.
  2. NVENC link lib: either install the NVIDIA Video Codec SDK, or generate an import lib from the driver DLL — lib /def:nvenc.def /machine:x64 /out:nvencodeapi.lib where nvenc.def lists NvEncodeAPICreateInstance and NvEncodeAPIGetMaxSupportedVersion — and set PUNKTFUNK_NVENC_LIB_DIR to its directory.
  3. cargo build -p punktfunk-host --features nvenc (needs NASM + CMake for aws-lc-rs; libclang for any ffmpeg-using client). Default build (no feature) uses the openh264 software encoder.
  4. Run in the interactive session (not a Session-0 service / not over SSH — SendInput + DXGI Desktop Duplication need a desktop): serve or punktfunk1-host --source virtual. Set PUNKTFUNK_ENCODER=nvenc to select NVENC (the DXGI capturer switches to zero-copy D3D11 output to match). The SudoVDA monitor activates once a real GPU drives WDDM, so capture + NVENC then work.

Dev loop (this repo → the Windows VM)

ssh "Enrico Bühler"@192.168.1.57 (PowerShell shell). Repo cloned at C:\Users\Public\punktfunk (Gitea). Sync uncommitted files with sftp (sftp -b - host, /C:/... paths — scp and base64-over-ssh are unreliable here). Commit on Linux → git reset --hard origin/main on the VM. Build env: PATH += cargo bin + NASM + CMake + LLVM (vcvars not needed — rustc/cc self-locate MSVC). Set CMAKE_POLICY_VERSION_MINIMUM=3.5 — CMake 4 rejects libopus's old cmake_minimum_required when audiopus_sys (vendored by the opus crate) builds libopus from source for the host→client audio path.

Decisions (locked 2026-06-14)

| Decision | Choice | Rationale |
| --- | --- | --- |
| Build order | Host first | User preference. (Note: the research recommended client first, since the client is unblocked by the no-GPU problem and becomes the host's test endpoint; see "No-GPU dev strategy". Revisit if host progress stalls on GPU-gated steps.) |
| Virtual display | SudoVDA | Arbitrary modes on the fly (no baked EDID / registry mode list, unlike parsec-vdd), MIT/CC0 (bundleable), already installed on the dev box, proven by Apollo. |
| Client UI | Pure Rust: windows-rs + Windows Reactor (WinUI 3) | No C++/C#. Links punktfunk-core directly as a crate (like the GTK Linux client: no C ABI, no GC/FFI-lifetime hazard). Built-in SwapChainPanel widget for the video surface; Custom escape hatch + raw Microsoft.UI.Xaml as fallback. |
| Client decode | FFmpeg + D3D11VA | Exactly what Moonlight ships; feeds AnnexB H.264/HEVC/AV1 directly, decodes AV1 via the GPU DXVA profile with no Store Video Extension. Cost: ffmpeg dep + libclang. |
| Host SW encode (no-GPU dev) | openh264 | BSD, no system ffmpeg, low-latency single-ref/zero-lookahead with intra-refresh. Lets the full capture→encode→FEC→send pipeline run GPU-less. |
| Host HW encode | nvidia-video-codec-sdk (D3D11) | NV_ENC_DEVICE_TYPE_DIRECTX + NvEncRegisterResource on the captured ID3D11Texture2D = true zero-copy, no CUDA bridge. Young crate: vendor + wrap behind the Encoder trait. Defers to a real-GPU box. |

Dev box (ssh "Enrico Bühler"@192.168.1.57)

Windows 11 Pro 25H2 (build 26200), QEMU Q35, 8 vCPU, 12 GB. No working GPU (an RTX 5070 Ti node is present but Status: Unknown; nvidia-smi fails → NVENC cannot initialize). Installed: Rust 1.96 (MSVC), Visual Studio Community 2026 + VC tools + Windows SDK 10.0.26100/28000, Windows App Runtime 2.2 (Reactor needs ≥2.0.1), SudoVDA (ROOT\DISPLAY\0000, hwid root\sudomaker\sudovda, INF oem6.inf, Status OK) and Parsec VDD, git, winget. Toolchain gaps to fill (see Step 0): NASM, CMake, libclang.

Reused as-is (~95% of the codebase — no changes)

| Reusable | Why |
| --- | --- |
| punktfunk-core (protocol, FEC, crypto, session, transport, QUIC control plane, C ABI) | Zero platform deps; already compiles on Windows MSVC |
| GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet) except the capture/encode/audio backends | pure protocol |
| Management REST API (mgmt.rs) + OpenAPI, native_pairing, discovery | axum/tokio/quinn, portable |
| punktfunk1.rs / spike.rs / pipeline.rs orchestration | trait-generic: call capturer.next_frame(), encoder.submit/poll(), vd.create(mode); no changes |
| The trait boundaries: Capturer, Encoder, VirtualDisplay, InputInjector, AudioCapturer, VirtualMic | platform-neutral; Linux deps already isolated under [target.'cfg(target_os="linux")'.dependencies] |
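Why the orchestration needs no changes is easiest to see with a toy version of the trait boundary. The sketch below is illustrative only: the trait names and the next_frame / submit / poll calls come from the table, but every signature, the Frame type, and the SyntheticCapturer and EchoEncoder backends are assumptions made up for this example, not the host's real definitions.

```rust
use std::collections::VecDeque;

/// Hypothetical frame type; the real payload carries pixel data.
struct Frame {
    index: u64,
}

/// Assumed shape of the capture boundary (signature is illustrative).
trait Capturer {
    fn next_frame(&mut self) -> Option<Frame>;
}

/// Assumed shape of the encode boundary (signatures are illustrative).
trait Encoder {
    fn submit(&mut self, frame: Frame);
    fn poll(&mut self) -> Option<Vec<u8>>;
}

/// Orchestration is written once against the traits, so swapping a
/// PipeWire capturer for a DXGI one never touches this function.
fn pump<C: Capturer, E: Encoder>(capturer: &mut C, encoder: &mut E) -> Vec<Vec<u8>> {
    let mut packets = Vec::new();
    while let Some(frame) = capturer.next_frame() {
        encoder.submit(frame);
        while let Some(packet) = encoder.poll() {
            packets.push(packet);
        }
    }
    packets
}

/// Stand-in backend: emits a fixed number of numbered frames.
struct SyntheticCapturer {
    next: u64,
    remaining: u64,
}

impl Capturer for SyntheticCapturer {
    fn next_frame(&mut self) -> Option<Frame> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        let frame = Frame { index: self.next };
        self.next += 1;
        Some(frame)
    }
}

/// Stand-in backend: "encodes" a frame as its little-endian index.
struct EchoEncoder {
    queue: VecDeque<Vec<u8>>,
}

impl Encoder for EchoEncoder {
    fn submit(&mut self, frame: Frame) {
        self.queue.push_back(frame.index.to_le_bytes().to_vec());
    }
    fn poll(&mut self) -> Option<Vec<u8>> {
        self.queue.pop_front()
    }
}
```

A Windows backend then only has to provide new implementors of the traits; pump (standing in for the pipeline orchestration) compiles unchanged against either platform's types.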

Step 0 — make punktfunk-host compile on x86_64-pc-windows-msvc DONE (2026-06-14)

Result: the full dependency tree builds clean on MSVC (aws-lc-rs with NASM+CMake, quinn, rusty_enet, axum/hyper/utoipa), and punktfunk-host compiles and runs (the openapi subcommand emits the spec). Only 3 cfg-gates were needed (the host was already ~95% portable): in main.rs, mod dmabuf_fence and mod drm_sync → #[cfg(target_os = "linux")]; in vdisplay.rs, the use std::os::fd::OwnedFd import + the VirtualOutput.remote_fd field → #[cfg(target_os = "linux")]. Verified green on Linux too. Build env on the VM: rustc+cc/cmake self-locate MSVC (vcvars not needed); PATH must include cargo bin + NASM + CMake + LLVM.

The host already compiles on macOS (Linux backends are cfg-gated; heavy Linux deps are target-gated). Getting to Windows MSVC is the unix-but-not-linux delta, not a from-scratch port:

  1. Toolchain: winget install NASM.NASM Kitware.CMake LLVM.LLVM, set LIBCLANG_PATH (or tick VS "C++ Clang tools"). NASM+CMake are for aws-lc-rs (pulled by rustls/rcgen on the quic path); libclang is for ffmpeg-sys/bindgen (client decode + any host bindgen crate).
  2. std::os::fd / libc: vdisplay.rs:18 has an unconditional use std::os::fd::OwnedFd; and VirtualOutput.remote_fd: Option<OwnedFd>. std::os::fd is cfg(unix), so it builds on macOS but breaks on Windows. Gate the import + field (#[cfg(unix)], with a Windows arm or omission). Sweep for other cfg(target_os="linux")-missing unix-isms (libc, fds).
  3. Build natively on the VM (cargo build -p punktfunk-host), not cross-compile; xwin chokes on aws-lc-rs/ffmpeg-sys/WDK. Triage the remaining errors. Suspect deps to verify link on MSVC: aws-lc-rs (needs NASM+CMake), rusty_enet, the hyper/axum/utoipa stack (expected fine).
  4. CI: add a cargo build -p punktfunk-host --target x86_64-pc-windows-msvc job so the Windows path stops bit-rotting (the dev box can be a Gitea runner later).

This is the highest-value first move and is fully doable GPU-less.

Windows backends (new #[cfg(target_os = "windows")] code behind existing traits)

| Subsystem | Linux today | Windows backend | VM-testable? |
| --- | --- | --- | --- |
| VirtualDisplay | KWin/gamescope/Mutter/Sway | SudoVDA IOCTLs (below) + SetDisplayConfig mode-set | likely (WARP); spike |
| Capture | PipeWire/dmabuf | DXGI Desktop Duplication primary, WGC fallback → ID3D11Texture2D; add FramePayload::D3d11 | ⚠️ DDA-on-WARP unreliable; WGC-on-WARP unverified; spike |
| Zero-copy | dmabuf→EGL/Vulkan→CUDA | register ID3D11Texture2D with NVENC (NV_ENC_DEVICE_TYPE_DIRECTX), no CUDA bridge | needs real GPU |
| Encode | ffmpeg *_nvenc | openh264 SW (default on VM) + nvidia-video-codec-sdk HW (real GPU); behind PUNKTFUNK_ENCODER | SW / HW |
| Input kbd/mouse | libei / wlr | SendInput with MOUSEEVENTF_VIRTUALDESK absolute mapping onto the virtual desktop rect (skip the VK→evdev table: client sends Win VKs; use KEYEVENTF_SCANCODE+EXTENDEDKEY) | |
| Gamepad | uinput xpad + FF | ViGEmBus via vigem-client (Xbox360Wired); rumble via request_notification() → XNotification{large,small} | (install driver) |
| Audio capture | PipeWire sink monitor | WASAPI loopback via the wasapi crate (48 kHz stereo f32 → existing Opus) | ⚠️ needs an audio endpoint |
| Virtual mic | PipeWire Audio/Source | virtual audio driver (Virtual-Audio-Driver) or defer | second driver; defer |
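For the input row, absolute SendInput coordinates are normalized to 0..=65535 rather than given in pixels, and with MOUSEEVENTF_VIRTUALDESK that range spans the whole virtual desktop. A minimal sketch of the per-axis mapping follows; the helper name normalize_axis is hypothetical, and the origin/extent arguments are assumed to come from the virtual-screen metrics (SM_XVIRTUALSCREEN / SM_CXVIRTUALSCREEN and their Y counterparts).

```rust
/// Map a pixel coordinate on the virtual desktop to SendInput's
/// normalized absolute range (0..=65535).
///
/// `origin` is the left (or top) edge of the virtual desktop, which can be
/// negative when a monitor sits left of / above the primary; `extent` is
/// the virtual desktop width (or height) in pixels.
fn normalize_axis(pos: i32, origin: i32, extent: i32) -> i32 {
    // Last addressable pixel maps to 65535; guard degenerate extents.
    let span = (i64::from(extent) - 1).max(1);
    // Clamp so out-of-range client coordinates cannot wrap around.
    let offset = (i64::from(pos) - i64::from(origin)).clamp(0, span);
    (offset * 65_535 / span) as i32
}
```

The intermediate math is done in i64 so that wide desktops (for example 5120 pixels × 65535) cannot overflow a 32-bit value.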

punktfunk1.rs/spike.rs/pipeline.rs are unchanged. Note: the Windows capture needs its own capture_virtual_output entry point (the SudoVDA identity is a DXGI adapter LUID + DisplayConfig TargetId → GDI \\.\DisplayN, which doesn't fit the PipeWire node_id: u32 field — carry it inside the keepalive / a Windows-specific seam rather than overloading node_id).

SudoVDA control protocol (the VirtualDisplay backend spec)

Pure Rust via the windows crate (no C lib; Apollo vendors a header-only client under third-party/sudovda/). Reference port pattern: parsec-vdd-rust (SetupAPI/CM_* → CreateFileW → DeviceIoControl). Verify the IOCTL hex with a const fn ctl_code(): CTL_CODE(dev,func,method,access) = (dev<<16)|(access<<14)|(func<<2)|method, with FILE_DEVICE_UNKNOWN=0x22, METHOD_BUFFERED=0, FILE_ANY_ACCESS=0.
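The check suggested above fits in a few lines. The sketch below encodes CTL_CODE as a const fn and asserts the six SudoVDA IOCTL values at compile time; the constant names (IOCTL_ADD and so on) and the sudovda_ioctl helper are illustrative choices, while the function numbers and expected values are the ones listed in this section.

```rust
/// The Windows CTL_CODE macro as a const fn.
const fn ctl_code(dev: u32, func: u32, method: u32, access: u32) -> u32 {
    (dev << 16) | (access << 14) | (func << 2) | method
}

const FILE_DEVICE_UNKNOWN: u32 = 0x22;
const METHOD_BUFFERED: u32 = 0;
const FILE_ANY_ACCESS: u32 = 0;

/// All SudoVDA IOCTLs share device type, method and access; only the
/// function number varies.
const fn sudovda_ioctl(func: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, func, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

// Constant names are hypothetical; the values are the ones documented here.
const IOCTL_ADD: u32 = sudovda_ioctl(0x800);
const IOCTL_REMOVE: u32 = sudovda_ioctl(0x801);
const IOCTL_SET_RENDER_ADAPTER: u32 = sudovda_ioctl(0x802);
const IOCTL_GET_WATCHDOG: u32 = sudovda_ioctl(0x803);
const IOCTL_DRIVER_PING: u32 = sudovda_ioctl(0x888);
const IOCTL_GET_PROTOCOL_VERSION: u32 = sudovda_ioctl(0x8FF);

// Compile-time checks: a typo in a function number fails the build.
const _: () = assert!(IOCTL_ADD == 0x0022_2000);
const _: () = assert!(IOCTL_REMOVE == 0x0022_2004);
const _: () = assert!(IOCTL_SET_RENDER_ADAPTER == 0x0022_2008);
const _: () = assert!(IOCTL_GET_WATCHDOG == 0x0022_200C);
const _: () = assert!(IOCTL_DRIVER_PING == 0x0022_2220);
const _: () = assert!(IOCTL_GET_PROTOCOL_VERSION == 0x0022_23FC);
```

Because the method and access fields are both zero, each value reduces to 0x0022_0000 | (func << 2), which is why consecutive function numbers differ by 4.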

  • Device interface GUID: {E5BCC234-1E0C-418A-A0D4-EF8B7501414D} · HWID: root\sudomaker\sudovda
  • IOCTLs (func → value): ADD 0x800 → 0x00222000, REMOVE 0x801 → 0x00222004, SET_RENDER_ADAPTER 0x802 → 0x00222008, GET_WATCHDOG 0x803 → 0x0022200C, DRIVER_PING 0x888 → 0x00222220, GET_PROTOCOL_VERSION 0x8FF → 0x002223FC.
  • Add (#[repr(C)] exact layout): in { u32 Width; u32 Height; u32 RefreshRate; GUID MonitorGuid; CHAR DeviceName[14]; CHAR SerialNumber[14] } → out { LUID AdapterLuid; u32 TargetId }. The mode is set at create (the driver computes timing arithmetically, with no EDID seeding). Pick a stable per-client MonitorGuid (Windows persists that monitor's layout; remove is by GUID).
  • Resolve the capture target: the monitor appears asynchronously, so poll QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS), match targetInfo.id == TargetId, then DisplayConfigGetDeviceInfo → viewGdiDeviceName (\\.\DisplayN). Apollo polls 20 ms → ×2 → cap 320 ms. Then point DXGI Desktop Duplication at that output.
  • Keepalive (mandatory): GET_WATCHDOG → { u32 Timeout_s; u32 Countdown } (default 3 s, driver-wide). Run one thread firing DRIVER_PING every Timeout*1000/3 ms (~1 s). Miss it and the driver tears down all virtual displays.
  • Teardown (RAII): Drop → DeviceIoControl(REMOVE, { GUID MonitorGuid }) = the VirtualOutput keepalive drop.
  • Mid-stream Reconfigure: SudoVDA has no in-place mode IOCTL (Apollo only relaunches). Implement punktfunk's Reconfigure as remove+re-add at the new mode (or add-second + migrate capture), and watch the Win11 24H2/25H2 IDD mode-apply regression (post-create ChangeDisplaySettingsEx may not move the desktop to the new mode without a Settings-UI poke — VirtualDrivers #471). The ~90 ms Reconfigure budget needs an isolated spike to confirm on 24H2/25H2.
  • Install / signing: self-signed — ship sudovda.cer, import to Root + TrustedPublisher, create the device node via nefconc.exe (--create-device-node/--install-driver). Installs without test-signing (trusted-publisher). MIT/CC0 → bundleable (Apollo precedent). Already installed on the dev box. Document it as a host prerequisite (like the Linux udev rule).
  • GPU caveat: SudoVDA's Driver.cpp does D3D11CreateDevice(UNKNOWN) on a render adapter with no explicit WARP fallback; on the GPU-less VM Windows binds the Basic Render Driver (WARP), so display compositing should work but NVENC won't. Confirm ADD actually brings a monitor up on the VM in the first spike.
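The ADD buffers, the poll backoff, and the keepalive interval from the list above can be pinned down in plain Rust before any Windows API is involved. This is a sketch under assumptions: the struct and function names are hypothetical, the field order is the one listed here with default C packing (no #pragma pack in the driver header, which should be checked), and the backoff is assumed to keep polling at the 320 ms cap until the caller gives up.

```rust
use std::mem::size_of;
use std::time::Duration;

/// Same layout as the Win32 GUID.
#[repr(C)]
#[derive(Clone, Copy, Default)]
struct Guid {
    data1: u32,
    data2: u16,
    data3: u16,
    data4: [u8; 8],
}

/// Same layout as the Win32 LUID.
#[repr(C)]
#[derive(Clone, Copy, Default)]
struct Luid {
    low_part: u32,
    high_part: i32,
}

/// ADD input buffer, fields in the documented order.
#[repr(C)]
struct AddIn {
    width: u32,
    height: u32,
    refresh_rate: u32,
    monitor_guid: Guid,
    device_name: [u8; 14],
    serial_number: [u8; 14],
}

/// ADD output buffer.
#[repr(C)]
struct AddOut {
    adapter_luid: Luid,
    target_id: u32,
}

// 3*4 + 16 + 14 + 14 = 56 bytes with no padding; 8 + 4 = 12 bytes.
const _: () = assert!(size_of::<AddIn>() == 56);
const _: () = assert!(size_of::<AddOut>() == 12);

/// Poll delays for target resolution: 20 ms, doubling, capped at 320 ms.
/// The iterator is unbounded; the caller limits attempts with `take`.
fn poll_delays() -> impl Iterator<Item = Duration> {
    std::iter::successors(Some(20u64), |&ms| Some((ms * 2).min(320)))
        .map(Duration::from_millis)
}

/// DRIVER_PING period: a third of the watchdog timeout.
fn ping_interval(timeout_s: u32) -> Duration {
    Duration::from_millis(u64::from(timeout_s) * 1000 / 3)
}
```

With the default 3 s watchdog, ping_interval(3) yields 1 s, so two consecutive pings can be missed before the driver tears the displays down.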

No-GPU dev strategy

Buildable + validatable on the VM now: Step 0 (MSVC compile); the SudoVDA backend (add/mode-set/keepalive/remove via WARP — spike to confirm); the openh264 SW encode path fed a CPU BGRA staging copy → real AnnexB → FEC → UDP (the full transport minus HW); SendInput injection + interactive-session/desktop-reattach; ViGEm gamepad + rumble; WASAPI loopback (if an endpoint exists); and the entire client (software decode loopback).

Defers to a real NVIDIA-GPU Windows box: NVENC-D3D11 zero-copy encode; whether the captured ID3D11Texture2D registers with NVENC zero-copy vs needing a CopyResource; the DDA-vs-WGC latency bake-off (DDA-on-WARP is E_NOTIMPL-class); split-encode + bitrate-ceiling probe; and all glass-to-glass / throughput numbers (no perf claim transfers from Linux).

Windows-specific structural issues (no Linux precedent)

  • Interactive session, not a Session-0 service. SendInput can't reach the desktop from Session 0. Run the host in the user's interactive session and replicate Apollo/Sunshine's OpenInputDesktop/SetThreadDesktop re-attach to survive UAC/lock-screen desktop switches. (Driving the UAC secure desktop needs a UIAccess manifest + signing — out of scope; document it.)
  • Clock epoch on the host side. The skew handshake assumes both ends read the same realtime epoch in ns. The Windows host must emit timestamps from GetSystemTimePreciseAsFileTime→Unix-epoch-ns or cross-machine latency numbers + ClockProbe/ClockEcho break.
  • IDD has no audio endpoint. There's nothing to loop back on a headless box unless a real/virtual render device exists → WASAPI loopback needs an endpoint, and the virtual mic (client→host) has no clean user-mode path. Audio is potentially a second driver-install problem; defer the mic.
  • Color/range. All clients assume BT.709 limited-range. A new openh264/NVENC-D3D11 path doing BGRA→I420 must match, or colors wash out — validate against the existing decoders.
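The clock-epoch point comes down to one conversion: FILETIME counts 100 ns ticks since 1601-01-01 UTC, while the skew handshake expects nanoseconds since the Unix epoch. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the two u32 halves are assumed to come from the FILETIME that GetSystemTimePreciseAsFileTime fills in.

```rust
/// 100 ns ticks between 1601-01-01 and 1970-01-01 (11_644_473_600 s).
const UNIX_EPOCH_AS_FILETIME_TICKS: u64 = 116_444_736_000_000_000;

/// Join the two 32-bit halves of a FILETIME into one tick count.
fn filetime_ticks(low: u32, high: u32) -> u64 {
    (u64::from(high) << 32) | u64::from(low)
}

/// Convert FILETIME ticks to nanoseconds since the Unix epoch.
/// Returns `None` for times before 1970 or on overflow.
fn ticks_to_unix_ns(ticks: u64) -> Option<u64> {
    ticks
        .checked_sub(UNIX_EPOCH_AS_FILETIME_TICKS)?
        .checked_mul(100)
}
```

The result has 100 ns granularity at best, which is still far finer than the millisecond-scale latencies being measured.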

Phased plan (host-first)

  1. Compile on MSVC (Step 0 above). GPU-less. ← start here
  2. SudoVDA VirtualDisplay backend: control path landed (vdisplay/sudovda.rs: add/keepalive/remove + GDI-name resolution + RAII teardown, behind the existing trait; open() returns it on Windows). Compiles + live-tested on the VM. Remaining: monitor activation + \\.\DisplayN resolution (needs a GPU), then SetDisplayConfig mid-stream Reconfigure.
  3. Capture + SW encode — DXGI Desktop Duplication (or WGC) → ID3D11Texture2D → CPU staging → openh264 → existing FEC/transport. First end-to-end Windows session, GPU-less, against the Linux punktfunk-probe or the new Windows client.
  4. Input — SendInput (kbd/mouse, VIRTUALDESK mapping) + interactive-session/desktop-reattach.
  5. Gamepad + audio — ViGEm + rumble; WASAPI loopback.
  6. HW encode (real-GPU box): nvidia-video-codec-sdk D3D11 zero-copy; DDA-vs-WGC bake-off; glass-to-glass numbers. Resolve to Xbox-360 pad on Windows (drop DualSense fidelity/virtual-mic to follow-ups, as the host already does for non-Linux).

The Windows client (separate track, pure Rust)

Structurally a sibling of clients/linux (GTK4) — same shape, different toolkit:

  • UI: windows-rs + Windows Reactor (WinUI 3) for native chrome. Link punktfunk-core directly (no C ABI). De-risk early: a Reactor window with a SwapChainPanel presenting a test pattern through a flip-model waitable swapchain, before building on it. Fallback if Reactor's 3-week-old maturity bites: the Custom element + raw windows-rs Microsoft.UI.Xaml.
  • Decode: FFmpeg avcodec_send_packet/receive_frame with the D3D11VA hwaccel → NV12/P010 ID3D11Texture2D. Feeds AnnexB directly (matches host output), decodes AV1 with no Store extension.
  • Present: DXGI flip-model waitable swapchain (FLIP_DISCARD + FRAME_LATENCY_WAITABLE_OBJECT, max latency 1) bound to the SwapChainPanel via ISwapChainPanelNative::SetSwapChain. Not MediaPlayerElement.
  • Input capture: RAWINPUT/WM_INPUT for relative/pointer-lock mouse; Windows.Gaming.Input for gamepads + rumble. Forward via the linked NativeClient (send_input/send_rich_input).
  • Trust: SPAKE2 PIN + TOFU pinning via core; persist the client identity in Windows Credential Manager / DPAPI (the Keychain analog).

Open risks / spikes (do these in isolation, early)

  1. cargo build -p punktfunk-host on the VM — count + triage the real MSVC errors before estimating Step 0. (GPU-less.)
  2. SudoVDA ADD on the VM: done 2026-06-15. The control path is fully validated on the GPU-less VM, both standalone and through the real VirtualDisplay trait (vdisplay/sudovda.rs): device open by GUID, GET_PROTOCOL_VERSION (0.2.1), GET_WATCHDOG (3 s), ADD 1920×1080@60 → returns adapter LUID + target_id, watchdog ping holds it, RAII Drop → REMOVE. Gap: with no GPU the target does NOT activate into a WDDM display path (QueryDisplayConfig active paths stay 0 → no \\.\DisplayN to resolve/capture). So activation + name-resolution + capture defer to a real GPU (passthrough on the Proxmox VM, or a GPU box), consistent with capture/NVENC deferring anyway.
  3. IDD arbitrary-mode + Reconfigure on 24H2/25H2 — does 5120×1440@240 apply, and does a remove+re-add (or re-modeset) hit the ~90 ms budget without a Settings-UI toggle? Make-or-break for "native client resolution, no scaling".
  4. NVENC-D3D11 zero-copy (real-GPU box) — does the captured texture register as-is, or need a copy? Does nvidia-video-codec-sdk's NV_ENC_DEVICE_TYPE_DIRECTX path work end-to-end? (Expect to vendor/patch.)
  5. DDA vs WGC against the SudoVDA monitor — measure latency/jitter on a real GPU; resolve the primary-capture choice.
  6. Driver redistribution — confirm bundling SudoVDA (.cer + nefcon) + ViGEmBus installers in the punktfunk Windows package; document them as prerequisites.

References