# Windows host + client — implementation plan

**Status: in progress — dev box provisioned, host-first.** A Windows host is an *"add backends
behind the existing traits"* job, not a parallel port: `punktfunk-core` and the whole control plane
are platform-agnostic and the host already compiles on non-Linux (macOS) thanks to existing
`cfg(target_os)` gating. The one piece that used to make it XL — a per-client *virtual* output, which
has no user-mode Windows API — is solved by reusing **[SudoVDA](https://github.com/SudoMaker/SudoVDA)**
(the SudoMaker Virtual Display Adapter, the same IDD the Apollo Sunshine-fork ships): a pre-built IDD
that creates virtual displays at **arbitrary `WxH@Hz` on the fly**. We install it and drive its IOCTL
control interface — **no driver to write or WHQL-sign.**

History: scoped 2026-06-10 (4-agent read of the host crate); SudoVDA path 2026-06-11; this concrete
plan + dev box + SudoVDA protocol + no-GPU strategy added 2026-06-14 (12-agent research pass).

## Status (2026-06-15) — full pipeline live-validated on an RTX 4090

Every OS-touching backend is implemented behind the existing traits and **builds clean on
`x86_64-pc-windows-msvc`** (and Linux unaffected). `serve --native` / `m3-host` **run on Windows**
(identity in `%APPDATA%`, QUIC bound, mDNS advertising, accepting sessions). The **full native
pipeline is validated live on a real RTX 4090** (Windows 11): SudoVDA virtual display → DXGI
Desktop Duplication (D3D11 zero-copy) → **NVENC HEVC** → punktfunk/1 → Rust reference client, at
720p60 and 1080p60 (0 mismatched frames, p50 1.6 / 3.45 ms cross-machine, ffmpeg-decodes clean),
coexisting with a running Apollo (two concurrent NVENC sessions).

| Backend | State | Validation (GPU-less VM or RTX 4090) |
|---|---|---|
| Virtual display (SudoVDA) | ✅ done | live: open/version/watchdog/ADD/REMOVE via the trait |
| Input (SendInput) | ✅ **live on RTX 4090** | mouse injection verified — cursor tracked the client's absolute diagonal sweep across the desktop in Session 1 (keyboard shares the same SendInput primitive) |
| Software encode (openh264) | ✅ done | **live: m0 synthetic→openh264→core FEC loopback, 120/120, 0 mismatches** |
| Audio (WASAPI loopback) | ✅ done | live: init chain opens (silent VM → no samples) |
| Capture (DXGI Desktop Duplication) | ✅ **live on RTX 4090** | SudoVDA monitor → D3D11 zero-copy duplication; output is enumerated under the *rendering* GPU, not the SudoVDA LUID (search all adapters) |
| NVENC (D3D11, `--features nvenc`) | ✅ **live on RTX 4090** | 720p60 + 1080p60 HEVC end-to-end to the Rust client; ffmpeg-decodes clean; ran alongside Apollo (2 NVENC sessions) |
| Run host (serve/m3-host) | ✅ live | m3-host starts + listens; `c_abi_connection_roundtrip` passes |
| Gamepad (ViGEm) | ✅ done | compiles incl. rumble back-channel; live needs ViGEmBus + a physical pad |
| Host→client audio wiring | ✅ done | builds on MSVC; `m3` `audio_thread` active on Windows (silent VM → no samples to send) |
| GameStream (Moonlight) audio | ✅ done | stereo path active on Windows (WASAPI→Opus→RTP/FEC); surround stays Linux-only (libopus multistream / `audiopus_sys`) |
| Rumble back-channel (ViGEm) | ✅ done | `request_notification` → background thread → 0xCA; live needs a physical pad |
| Game library (Steam discovery) | ✅ done | Windows Steam roots (Program Files) + VDF other-drive libraries; custom store already cross-platform. Non-default Steam install dir (registry) not yet covered |

**Remaining for full parity:**

- **Keyboard injection** — exercised via the same SendInput path (mouse verified live); not yet
  asserted into a focused text field.
- **ViGEm rumble + gamepad input** — the pad is created live (ViGEmBus connected); the rumble
  back-channel + input still need a physical pad to verify.
- **GameStream (Moonlight) path on a GPU box** — not yet run live (its fixed ports collide with
  Apollo, so stop Apollo first).
- **Frame pacing on static content** — DXGI duplication is change-driven, so a blank/idle virtual
  display delivers only ~12 fps (181/177 frames over ~15 s); a rendering app drives the full rate.

### Live UX hardening (2026-06-15, validated Mac ↔ RTX 4090)

Driven by live testing with the native macOS client at the display's native **5120×1440@240**:

- **Native resolution, not 1080p.** `sudovda::set_active_mode` enumerates the modes the IDD actually
  advertises (`EnumDisplaySettingsW`) and sets the requested **resolution** at the best supported
  refresh — keeping 5120×1440@240, never silently collapsing to the 1280×720/1920×1080 OS default
  when an exact mode is briefly unavailable.
- **Bitrate auto-cap.** NVENC `init_session` probes and steps the average bitrate down (×3/4 to a
  floor) when the requested rate exceeds the GPU's codec-level max, so a high client bitrate connects
  instead of failing (matches the Linux host; we do NOT split NVENC sessions).
- **Mouse cursor.** DXGI duplication excludes the hardware cursor; we read the pointer
  position/shape from the frame info (`GetFramePointerShape`) and GPU-composite it onto the captured
  texture before NVENC (a CPU read-back would stall the pipeline). Color cursors alpha-blend;
  **masked-color** cursors (the text I-beam) use an `INV_DEST_COLOR` blend for true screen inversion,
  so the caret is visible on any background (no black box). Monochrome handled too.
- **Secure desktop (lock / login / UAC).** The host runs as **SYSTEM in the interactive session**;
  the capturer `SetThreadDesktop`s onto the current input desktop and, on the WinSta switch,
  **recreates the D3D11 device** and **re-resolves the virtual output's GDI name from the stable
  SudoVDA target id** (the name changes across the topology rebuild — the old failure was hunting the
  stale `\\.\DISPLAYn` and dropping). `ACCESS_LOST` / `INVALID_CALL` / device-removed are all treated
  as recoverable, and a mid-stream resolution change is followed (capturer + NVENC re-init at the new
  size). Validated: logging in / locking through the stream stays connected (one real session
  recovered from 1012 desktop switches and completed cleanly). *Display isolation* (`isolate_displays`
  detaches other monitors so Winlogon renders to the virtual output) covers the case where a physical
  monitor is also attached.

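
The mode-selection and bitrate auto-cap rules above can be sketched in plain Rust. This is an illustration under assumptions, not the real `sudovda::set_active_mode` or NVENC `init_session` code: `Mode`, `pick_mode`, `cap_bitrate`, the `accepts` probe closure and the floor value are hypothetical names, and "best supported refresh" is read here as "highest advertised rate not above the request, else the highest available at that resolution".

```rust
/// A display mode as advertised by the driver (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub hz: u32,
}

/// Keep the requested resolution and pick a refresh rate the driver supports.
/// Returns `None` only when the resolution itself is not advertised, so the
/// caller never falls back to a different resolution silently.
pub fn pick_mode(advertised: &[Mode], want: Mode) -> Option<Mode> {
    let same_res = || {
        advertised
            .iter()
            .copied()
            .filter(|m| m.width == want.width && m.height == want.height)
    };
    same_res()
        .filter(|m| m.hz <= want.hz)
        .max_by_key(|m| m.hz)
        .or_else(|| same_res().max_by_key(|m| m.hz))
}

/// Step the average bitrate down by 3/4 until `accepts` succeeds or the floor
/// has been tried. `accepts` stands in for an encoder-session probe.
pub fn cap_bitrate(
    requested_bps: u64,
    floor_bps: u64,
    accepts: impl Fn(u64) -> bool,
) -> Option<u64> {
    let mut bps = requested_bps;
    loop {
        if accepts(bps) {
            return Some(bps);
        }
        if bps <= floor_bps {
            return None; // even the floor was rejected
        }
        bps = (bps * 3 / 4).max(floor_bps);
    }
}
```

With a hypothetical encoder ceiling of 200 Mbit/s, a 400 Mbit/s request would step 400 → 300 → 225 → 168.75 Mbit/s and connect at the last value.
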
### Running as SYSTEM (deployment) — the `PunktfunkHost` service

To capture the secure desktop the host must run as **SYSTEM in the interactive Session 1** (a Session
0 service can't duplicate Session 1). The end-user deployment is the built-in Windows **service**
(`src/service.rs`) — see [`windows-service.md`](windows-service.md). One elevated command:

```powershell
punktfunk-host service install   # auto-start LocalSystem service + firewall rules + default host.env
punktfunk-host service start
```

The service runs in Session 0 but never captures: it duplicates its own LocalSystem token, retargets
it to the active console session, and `CreateProcessAsUserW`s the host there — supervising it across
exits and console-session switches (the Sunshine/Apollo model). Config lives in
`%ProgramData%\punktfunk\host.env`; logs in `%ProgramData%\punktfunk\logs\`.

> **Old bring-up chain (debug only, superseded by the service):** a scheduled task (Interactive,
> Highest) → `PsExec64 -s -i 1 -d wscript.exe launch.vbs` → `host-run.cmd` (hidden window), with
> `APPDATA=C:\Users\Public` as the shared-identity hack. The service replaces all of this; the host
> now resolves its config dir to `%ProgramData%\punktfunk` directly (`PUNKTFUNK_CONFIG_DIR` overrides).

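
A minimal sketch of the config-dir resolution described above, assuming the override simply wins when it is set and non-empty. The function names and the `C:\ProgramData` fallback for an unset `%ProgramData%` are assumptions for illustration, not the host's actual `config_dir()`.

```rust
use std::path::{Path, PathBuf};

/// Resolve the host config directory.
/// `override_dir` models PUNKTFUNK_CONFIG_DIR, `program_data` models %ProgramData%.
pub fn resolve_config_dir(override_dir: Option<&str>, program_data: Option<&str>) -> PathBuf {
    match override_dir.filter(|s| !s.is_empty()) {
        Some(dir) => PathBuf::from(dir),
        // Assumption: fall back to the conventional location if the variable is unset.
        None => PathBuf::from(program_data.unwrap_or(r"C:\ProgramData")).join("punktfunk"),
    }
}

/// Convenience wrapper that reads the real process environment.
pub fn config_dir_from_env() -> PathBuf {
    let override_dir = std::env::var("PUNKTFUNK_CONFIG_DIR").ok();
    let program_data = std::env::var("ProgramData").ok();
    resolve_config_dir(override_dir.as_deref(), program_data.as_deref())
}

/// Locations derived from the config dir, matching the layout described above.
pub fn host_env_path(config_dir: &Path) -> PathBuf {
    config_dir.join("host.env")
}

pub fn logs_dir(config_dir: &Path) -> PathBuf {
    config_dir.join("logs")
}
```

Keeping the resolution a pure function of its two inputs makes it testable without touching the process environment.
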
### Real-GPU test box (RTX 4090, `ssh "Enrico Bühler"@192.168.1.174`)

Windows 11, RTX 4090 (driver 596.36) + AMD iGPU, SudoVDA + Apollo (sunshine) installed. SSH lands in
**Session 0 (non-interactive)** — DXGI duplication + SendInput need the **interactive Session 1**, so
launch the host there via an Interactive scheduled task (admin SSH session is the same user):
`Register-ScheduledTask -Principal (New-ScheduledTaskPrincipal -UserId (whoami) -LogonType
Interactive -RunLevel Highest)`, then `Start-ScheduledTask`. The host runs with desktop access; read
its redirected log over SSH. `nvEncodeAPI64.dll` ships with the driver, so a VM-built `--features
nvenc` exe runs here as-is (no SDK install). The 4090's Ada NVENC has no consumer session cap, so the
host encodes alongside Apollo. **Gotcha:** the SudoVDA monitor is rendered by — and DXGI-enumerated
under — the 4090, not the SudoVDA adapter LUID (the capturer searches all adapters; see the fix).

#### Native build on the 4090 (fast iteration loop)

Build on the box itself (edit locally → `sftp` to the repo → `cargo build` there → run via the task)
instead of build-on-VM-then-copy. Prereqs that bit us, in order:

1. **Full MSVC C++ build tools, incl. the CRT libs.** A VS install can land `cl.exe` + the Windows
   SDK + sanitizer libs but *miss* the desktop CRT import libs (`VC\Tools\MSVC\<ver>\lib\x64\msvcrt.lib`,
   `libcmt.lib`, …) → `LNK1104: msvcrt.lib`. Root cause here: the `Microsoft.VisualCpp.Redist.14`
   package failed to install (1603), cascading to skip the NativeDesktop workload. Fix = (re)install
   the C++ workload via the VS Installer **GUI** (the headless `setup.exe modify` over SSH fails — a
   non-elevated SSH token gives 1603/87, and `--quiet` as SYSTEM hangs). A reboot may be needed first
   (a pending reboot also yields 1603). Stop-gap: the desktop CRT libs are version-pinned, so they can
   be copied from another box with the **identical** MSVC version (`14.51.36231` here).
2. **Build from an ASCII path.** A username with a non-ASCII char (`C:\Users\Enrico Bühler\…`) breaks
   the MSVC PDB writer → `LNK1201: error writing to the program database`. Clone/copy the repo to
   e.g. `C:\Users\Public\punktfunk-native` and build there (the VM worked only because it built in
   `C:\Users\Public\punktfunk`).
3. `winget install NASM.NASM Kitware.CMake`; generate the NVENC import lib (`lib /def` → set
   `PUNKTFUNK_NVENC_LIB_DIR`); set `CMAKE_POLICY_VERSION_MINIMUM=3.5` (libopus).

Build env (each `cargo` invocation): `$env:PATH += ";C:\Program Files\NASM;C:\Program Files\CMake\bin"`,
`$env:CMAKE_POLICY_VERSION_MINIMUM="3.5"`, `$env:PUNKTFUNK_NVENC_LIB_DIR="C:\Users\Public\nvenc"`, then
`cargo build --release -p punktfunk-host --features nvenc`. Validated: native build (1m37s) →
720p60 NVENC, 174/174 frames, p50 2.5 ms, ffmpeg-decodes clean.

All Windows backends are `clippy -D warnings` and `rustfmt` clean on `x86_64-pc-windows-msvc` (the
Windows-only modules are cfg-excluded from Linux CI, so run clippy on the VM after touching them — its
rustc 1.96 clippy is stricter than the Linux CI image on shared code, e.g. `needless_return`).

### Building & testing on a real-GPU Windows box (NVENC)

1. Install **SudoVDA** (virtual display) and **ViGEmBus** (gamepad) drivers; install the NVIDIA driver.
2. NVENC link lib: either install the NVIDIA Video Codec SDK, or generate an import lib from the
   driver DLL — `lib /def:nvenc.def /machine:x64 /out:nvencodeapi.lib` where `nvenc.def` lists
   `NvEncodeAPICreateInstance` and `NvEncodeAPIGetMaxSupportedVersion` — and set
   `PUNKTFUNK_NVENC_LIB_DIR` to its directory.
3. `cargo build -p punktfunk-host --features nvenc` (needs NASM + CMake for aws-lc-rs; libclang for
   any ffmpeg-using client). Default build (no feature) uses the openh264 software encoder.
4. Run in the **interactive session** (not a Session-0 service / not over SSH — SendInput + DXGI
   Desktop Duplication need a desktop): `serve --native` or `m3-host --source virtual`. Set
   `PUNKTFUNK_ENCODER=nvenc` to select NVENC (the DXGI capturer switches to zero-copy D3D11 output to
   match). The SudoVDA monitor activates once a real GPU drives WDDM, so capture + NVENC then work.

### Dev loop (this repo → the Windows VM)

`ssh "Enrico Bühler"@192.168.1.57` (PowerShell shell). Repo cloned at `C:\Users\Public\punktfunk`
(Gitea). Sync uncommitted files with **sftp** (`sftp -b - host`, `/C:/...` paths — scp and
base64-over-ssh are unreliable here). Commit on Linux → `git reset --hard origin/main` on the VM.
Build env: `PATH` += cargo bin + NASM + CMake + LLVM (vcvars not needed — rustc/cc self-locate MSVC).
Set `CMAKE_POLICY_VERSION_MINIMUM=3.5` — CMake 4 rejects libopus's old `cmake_minimum_required` when
`audiopus_sys` (vendored by the `opus` crate) builds libopus from source for the host→client audio path.

## Decisions (locked 2026-06-14)

| Decision | Choice | Rationale |
|---|---|---|
| **Build order** | **Host first** | User preference. (Note: the research recommended *client* first, since the client is unblocked by the no-GPU problem and becomes the host's test endpoint — see "No-GPU dev strategy". Revisit if host progress stalls on GPU-gated steps.) |
| **Virtual display** | **SudoVDA** | Arbitrary modes on the fly (no baked EDID / registry mode list, unlike parsec-vdd), MIT/CC0 (bundleable), already installed on the dev box, proven by Apollo. |
| **Client UI** | **Pure Rust: `windows-rs` + Windows Reactor (WinUI 3)** | No C++/C#. Links `punktfunk-core` directly as a crate (like the GTK Linux client — no C ABI, no GC/FFI-lifetime hazard). Built-in `SwapChainPanel` widget for the video surface; `Custom` escape hatch + raw `Microsoft.UI.Xaml` as fallback. |
| **Client decode** | **FFmpeg + D3D11VA** | Exactly what Moonlight ships; feeds AnnexB H.264/HEVC/AV1 directly, decodes AV1 via the GPU DXVA profile with **no** Store Video Extension. Cost: ffmpeg dep + libclang. |
| **Host SW encode (no-GPU dev)** | **openh264** | BSD, no system ffmpeg, low-latency single-ref/zero-lookahead with intra-refresh. Lets the full capture→encode→FEC→send pipeline run GPU-less. |
| **Host HW encode** | **nvidia-video-codec-sdk (D3D11)** | `NV_ENC_DEVICE_TYPE_DIRECTX` + `NvEncRegisterResource` on the captured `ID3D11Texture2D` = true zero-copy, no CUDA bridge. Young crate — vendor + wrap behind the `Encoder` trait. Defers to a real-GPU box. |

## Dev box (`ssh "Enrico Bühler"@192.168.1.57`)

Windows 11 Pro 25H2 (build 26200), QEMU Q35, 8 vCPU, 12 GB. **No working GPU** (an `RTX 5070 Ti` node
is present but `Status: Unknown`; `nvidia-smi` fails → NVENC cannot initialize). Installed: Rust 1.96
(MSVC), Visual Studio Community 2026 + VC tools + Windows SDK 10.0.26100/28000, Windows App Runtime
2.2 (Reactor needs ≥2.0.1 ✅), **SudoVDA** (`ROOT\DISPLAY\0000`, hwid `root\sudomaker\sudovda`, INF
`oem6.inf`, Status OK) and Parsec VDD, git, winget. **Toolchain gaps to fill** (see Step 0): NASM,
CMake, libclang.

## Reused as-is (~95% of the codebase — no changes)

| Reusable | Why |
|---|---|
| `punktfunk-core` (protocol, FEC, crypto, session, transport, QUIC control plane, C ABI) | Zero platform deps; already compiles on Windows MSVC |
| GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet) *except* the capture/encode/audio backends | pure protocol |
| Management REST API (`mgmt.rs`) + OpenAPI, `native_pairing`, `discovery` | axum/tokio/quinn — portable |
| `m3.rs` / `m0.rs` / `pipeline.rs` orchestration | trait-generic: call `capturer.next_frame()`, `encoder.submit/poll()`, `vd.create(mode)` — no changes |
| The trait boundaries: `Capturer`, `Encoder`, `VirtualDisplay`, `InputInjector`, `AudioCapturer`, `VirtualMic` | platform-neutral; Linux deps already isolated under `[target.'cfg(target_os="linux")'.dependencies]` |

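
To illustrate why the orchestration needs no changes, here is a toy version of a trait-generic pump loop. Only the trait names and the `next_frame()` / `submit` / `poll` method names come from the table above; the signatures, the `Frame` / `Packet` types and the synthetic backends are assumptions made up for this sketch.

```rust
use std::collections::VecDeque;

/// Minimal stand-ins for the host's data types (assumed shapes).
pub struct Frame {
    pub index: u64,
    pub pixels: Vec<u8>,
}

pub struct Packet {
    pub frame_index: u64,
    pub bytes: Vec<u8>,
}

pub trait Capturer {
    fn next_frame(&mut self) -> Option<Frame>;
}

pub trait Encoder {
    fn submit(&mut self, frame: Frame);
    fn poll(&mut self) -> Option<Packet>;
}

/// Platform-neutral loop: it only talks to the traits, so a new backend
/// (for example DXGI capture plus openh264) plugs in without touching it.
pub fn pump(cap: &mut dyn Capturer, enc: &mut dyn Encoder, max_frames: usize) -> Vec<Packet> {
    let mut out = Vec::new();
    for _ in 0..max_frames {
        let Some(frame) = cap.next_frame() else { break };
        enc.submit(frame);
        while let Some(pkt) = enc.poll() {
            out.push(pkt);
        }
    }
    out
}

/// Synthetic capturer standing in for a real platform backend.
pub struct SyntheticCapturer {
    pub next: u64,
    pub limit: u64,
}

impl Capturer for SyntheticCapturer {
    fn next_frame(&mut self) -> Option<Frame> {
        if self.next >= self.limit {
            return None;
        }
        let frame = Frame { index: self.next, pixels: vec![0u8; 16] };
        self.next += 1;
        Some(frame)
    }
}

/// "Encoder" that just forwards the pixel bytes, one packet per frame.
pub struct PassthroughEncoder {
    pub queue: VecDeque<Packet>,
}

impl Encoder for PassthroughEncoder {
    fn submit(&mut self, frame: Frame) {
        self.queue.push_back(Packet { frame_index: frame.index, bytes: frame.pixels });
    }

    fn poll(&mut self) -> Option<Packet> {
        self.queue.pop_front()
    }
}
```

Swapping `SyntheticCapturer` for a platform capturer changes nothing in `pump`, which is the property the Windows port relies on.
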
## Step 0 — make `punktfunk-host` compile on `x86_64-pc-windows-msvc` — ✅ DONE (2026-06-14)

**Result:** the full dependency tree builds clean on MSVC (aws-lc-rs with NASM+CMake, quinn,
rusty_enet, axum/hyper/utoipa), and `punktfunk-host` compiles **and runs** (the `openapi` subcommand
emits the spec). Only **3 cfg-gates** were needed — the host was already ~95% portable:
`main.rs` `mod dmabuf_fence`/`mod drm_sync` → `#[cfg(target_os = "linux")]`; `vdisplay.rs` the
`use std::os::fd::OwnedFd` import + `VirtualOutput.remote_fd` field → `#[cfg(target_os = "linux")]`.
Verified green on Linux too. Build env on the VM: rustc+`cc`/`cmake` self-locate MSVC (vcvars not
needed); `PATH` must include cargo bin + NASM + CMake + LLVM.

The host already compiles on macOS (Linux backends are `cfg`-gated; heavy Linux deps are
target-gated). Getting to Windows MSVC is the **unix-but-not-linux** delta, not a from-scratch port:

1. **Toolchain**: `winget install NASM.NASM Kitware.CMake LLVM.LLVM`, set `LIBCLANG_PATH`
   (or tick VS "C++ Clang tools"). NASM+CMake are for **aws-lc-rs** (pulled by `rustls`/`rcgen` on
   the `quic` path); libclang is for `ffmpeg-sys`/bindgen (client decode + any host bindgen crate).
2. **`std::os::fd` / `libc`**: `vdisplay.rs:18` has an unconditional `use std::os::fd::OwnedFd;` and
   `VirtualOutput.remote_fd: Option<OwnedFd>` — `std::os::fd` is `cfg(unix)`, so it builds on macOS
   but breaks on Windows. Gate the import + field (`#[cfg(unix)]`, with a Windows arm or omission).
   Sweep for other `cfg(target_os="linux")`-missing unix-isms (`libc`, fds).
3. **Build natively on the VM** (`cargo build -p punktfunk-host` — *not* cross-compile; xwin chokes on
   aws-lc-rs/ffmpeg-sys/WDK). Triage the remaining errors. Suspect deps to verify link on MSVC:
   `aws-lc-rs` (needs NASM+CMake), `rusty_enet`, the hyper/axum/utoipa stack (expected fine).
4. **CI**: add a `cargo build -p punktfunk-host --target x86_64-pc-windows-msvc` job so the Windows
   path stops bit-rotting (the dev box can be a Gitea runner later).

This is the highest-value first move and is **fully doable GPU-less**.

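
The gate on the `OwnedFd` import and the `remote_fd` field can be sketched as follows. This is a trimmed-down, hypothetical `VirtualOutput`: only `remote_fd` and its `#[cfg(target_os = "linux")]` gate come from the text; the other fields, `new` and `has_remote_fd` are illustrative.

```rust
// Linux-only import: `std::os::fd` is not available on Windows.
#[cfg(target_os = "linux")]
use std::os::fd::OwnedFd;

/// Hypothetical, reduced `VirtualOutput` showing the gated field.
pub struct VirtualOutput {
    pub node_id: u32,
    pub width: u32,
    pub height: u32,
    /// Only exists on Linux, so Windows builds never name `OwnedFd`.
    #[cfg(target_os = "linux")]
    pub remote_fd: Option<OwnedFd>,
}

impl VirtualOutput {
    pub fn new(node_id: u32, width: u32, height: u32) -> Self {
        Self {
            node_id,
            width,
            height,
            // Field initialisers can carry the same gate as the field.
            #[cfg(target_os = "linux")]
            remote_fd: None,
        }
    }

    /// Callers stay platform-neutral by asking instead of touching the field.
    #[cfg(target_os = "linux")]
    pub fn has_remote_fd(&self) -> bool {
        self.remote_fd.is_some()
    }

    #[cfg(not(target_os = "linux"))]
    pub fn has_remote_fd(&self) -> bool {
        false
    }
}
```

Gating both the field and its initialiser keeps every construction site compiling on all targets without `cfg` noise at the call sites.
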
## Windows backends (new `#[cfg(target_os = "windows")]` code behind existing traits)

| Subsystem | Linux today | Windows backend | VM-testable? |
|---|---|---|---|
| **VirtualDisplay** | KWin/gamescope/Mutter/Sway | **SudoVDA** IOCTLs (below) + `SetDisplayConfig` mode-set | ✅ likely (WARP) — *spike* |
| **Capture** | PipeWire/dmabuf | **DXGI Desktop Duplication** primary, **WGC** fallback → `ID3D11Texture2D`; add `FramePayload::D3d11` | ⚠️ DDA-on-WARP unreliable; WGC-on-WARP unverified — *spike* |
| **Zero-copy** | dmabuf→EGL/Vulkan→CUDA | register `ID3D11Texture2D` with NVENC (`NV_ENC_DEVICE_TYPE_DIRECTX`) — no CUDA bridge | ❌ needs real GPU |
| **Encode** | ffmpeg `*_nvenc` | `openh264` SW (default on VM) + `nvidia-video-codec-sdk` HW (real GPU); behind `PUNKTFUNK_ENCODER` | SW ✅ / HW ❌ |
| **Input kbd/mouse** | libei / wlr | **SendInput** with `MOUSEEVENTF_VIRTUALDESK` absolute mapping onto the virtual desktop rect (skip the VK→evdev table — client sends Win VKs; use `KEYEVENTF_SCANCODE`+`EXTENDEDKEY`) | ✅ |
| **Gamepad** | uinput xpad + FF | **ViGEmBus** via `vigem-client` (`Xbox360Wired`); rumble via `request_notification()`→`XNotification{large,small}` | ✅ (install driver) |
| **Audio capture** | PipeWire sink monitor | **WASAPI loopback** via the `wasapi` crate (48 kHz stereo f32 → existing Opus) | ⚠️ needs an audio endpoint |
| **Virtual mic** | PipeWire `Audio/Source` | virtual audio driver (`Virtual-Audio-Driver`) or defer | ❌ second driver — defer |

`m3.rs`/`m0.rs`/`pipeline.rs` are unchanged. Note: the Windows capture needs its own
`capture_virtual_output` entry point (the SudoVDA identity is a DXGI adapter LUID + DisplayConfig
TargetId → GDI `\\.\DisplayN`, which doesn't fit the PipeWire `node_id: u32` field — carry it inside
the `keepalive` / a Windows-specific seam rather than overloading `node_id`).

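
For the Input row of the table above, the absolute mapping onto the virtual desktop rectangle is plain arithmetic: absolute mouse input uses normalized coordinates from 0 to 65535 across the target area. The sketch below assumes the common `offset * 65535 / (extent - 1)` mapping with rounding; `DesktopRect` and `to_absolute` are hypothetical names, and the actual `SendInput` call is left out.

```rust
/// Virtual-desktop rectangle in pixels. `left`/`top` can be negative when a
/// monitor sits left of or above the primary one.
#[derive(Clone, Copy, Debug)]
pub struct DesktopRect {
    pub left: i32,
    pub top: i32,
    pub width: i32,
    pub height: i32,
}

/// Map a pixel position inside `rect` to the 0..=65535 range used by
/// absolute mouse input. Positions outside the rectangle are clamped.
pub fn to_absolute(rect: DesktopRect, x: i32, y: i32) -> (i32, i32) {
    let norm = |v: i32, origin: i32, extent: i32| -> i32 {
        let last = (extent.max(1) - 1) as i64; // index of the last pixel
        let span = last.max(1); // avoid dividing by zero for 1-pixel extents
        let off = (v as i64 - origin as i64).clamp(0, last);
        ((off * 65535 + span / 2) / span) as i32
    };
    (norm(x, rect.left, rect.width), norm(y, rect.top, rect.height))
}
```

For a 7040×1440 virtual desktop starting at x = -1920, the top-left pixel maps to (0, 0) and the bottom-right pixel to (65535, 65535).
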
## SudoVDA control protocol (the `VirtualDisplay` backend spec)

Pure Rust via the `windows` crate (no C lib; Apollo vendors a header-only client under
`third-party/sudovda/`). Reference port pattern: `parsec-vdd-rust` (SetupAPI/CM_* → `CreateFileW` →
`DeviceIoControl`). **Verify the IOCTL hex with a `const fn ctl_code()`** —
`CTL_CODE(dev,func,method,access) = (dev<<16)|(access<<14)|(func<<2)|method`, with
`FILE_DEVICE_UNKNOWN=0x22`, `METHOD_BUFFERED=0`, `FILE_ANY_ACCESS=0`.

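
A small sketch of that `const fn ctl_code()` check, using the function numbers and hex values this spec lists for the SudoVDA IOCTLs. The `IOCTL_*` constant names and the `sudovda` helper are naming assumptions for this example; the compile-time assertions fail the build if a value and its function number ever disagree.

```rust
pub const FILE_DEVICE_UNKNOWN: u32 = 0x22;
pub const METHOD_BUFFERED: u32 = 0;
pub const FILE_ANY_ACCESS: u32 = 0;

/// Rust equivalent of the Windows `CTL_CODE` macro.
pub const fn ctl_code(dev: u32, func: u32, method: u32, access: u32) -> u32 {
    (dev << 16) | (access << 14) | (func << 2) | method
}

/// All SudoVDA codes in this spec share device type, method and access.
const fn sudovda(func: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, func, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

pub const IOCTL_ADD: u32 = sudovda(0x800);
pub const IOCTL_REMOVE: u32 = sudovda(0x801);
pub const IOCTL_SET_RENDER_ADAPTER: u32 = sudovda(0x802);
pub const IOCTL_GET_WATCHDOG: u32 = sudovda(0x803);
pub const IOCTL_DRIVER_PING: u32 = sudovda(0x888);
pub const IOCTL_GET_PROTOCOL_VERSION: u32 = sudovda(0x8FF);

// Compile-time cross-check against the hex values written in the spec.
const _: () = assert!(IOCTL_ADD == 0x0022_2000);
const _: () = assert!(IOCTL_REMOVE == 0x0022_2004);
const _: () = assert!(IOCTL_SET_RENDER_ADAPTER == 0x0022_2008);
const _: () = assert!(IOCTL_GET_WATCHDOG == 0x0022_200C);
const _: () = assert!(IOCTL_DRIVER_PING == 0x0022_2220);
const _: () = assert!(IOCTL_GET_PROTOCOL_VERSION == 0x0022_23FC);
```
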
- **Device interface GUID**: `{E5BCC234-1E0C-418A-A0D4-EF8B7501414D}` · **HWID**: `root\sudomaker\sudovda`
- **IOCTLs** (func → value): ADD `0x800`→`0x00222000`, REMOVE `0x801`→`0x00222004`,
  SET_RENDER_ADAPTER `0x802`→`0x00222008`, GET_WATCHDOG `0x803`→`0x0022200C`,
  DRIVER_PING `0x888`→`0x00222220`, GET_PROTOCOL_VERSION `0x8FF`→`0x002223FC`.
- **Add** (`#[repr(C)]` exact layout): in `{ u32 Width; u32 Height; u32 RefreshRate; GUID MonitorGuid;
  CHAR DeviceName[14]; CHAR SerialNumber[14] }` → out `{ LUID AdapterLuid; u32 TargetId }`. **The mode
  is set at create** (driver computes timing arithmetically — no EDID seeding). Pick a *stable
  per-client* `MonitorGuid` (Windows persists that monitor's layout; remove is by GUID).
- **Resolve the capture target**: the monitor appears **asynchronously** — poll
  `QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS)`, match `targetInfo.id == TargetId`,
  `DisplayConfigGetDeviceInfo` → `viewGdiDeviceName` (`\\.\DisplayN`). Apollo polls 20 ms → ×2 → cap
  320 ms. Then point DXGI Desktop Duplication at that output.
- **Keepalive (mandatory)**: `GET_WATCHDOG` → `{ u32 Timeout_s; u32 Countdown }` (default **3 s**,
  driver-wide). Run one thread firing `DRIVER_PING` every `Timeout*1000/3` ms (~1 s). Miss it and the
  driver tears down **all** virtual displays.
- **Teardown (RAII)**: `Drop` → `DeviceIoControl(REMOVE, { GUID MonitorGuid })` = the `VirtualOutput`
  keepalive drop.
- **Mid-stream `Reconfigure`**: SudoVDA has no in-place mode IOCTL (Apollo only relaunches). Implement
  punktfunk's `Reconfigure` as remove+re-add at the new mode (or add-second + migrate capture), and
  **watch the Win11 24H2/25H2 IDD mode-apply regression** (post-create `ChangeDisplaySettingsEx` may
  not move the *desktop* to the new mode without a Settings-UI poke — VirtualDrivers #471). The
  ~90 ms `Reconfigure` budget needs an isolated spike to confirm on 24H2/25H2.
- **Install / signing**: self-signed — ship `sudovda.cer`, import to Root + TrustedPublisher, create
  the device node via `nefconc.exe` (`--create-device-node`/`--install-driver`). Installs **without**
  test-signing (trusted-publisher). MIT/CC0 → bundleable (Apollo precedent). **Already installed on
  the dev box.** Document it as a host prerequisite (like the Linux udev rule).
- **GPU caveat**: SudoVDA's `Driver.cpp` does `D3D11CreateDevice(UNKNOWN)` on a render adapter with
  **no explicit WARP fallback**; on the GPU-less VM Windows binds the Basic Render Driver (WARP), so
  display compositing *should* work but NVENC won't. Confirm `ADD` actually brings a monitor up on the
  VM in the first spike.

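
The ADD buffers, the target-resolution poll schedule and the keepalive interval from the list above can be sketched without any Windows dependency. The Rust type and field names are assumptions, the field order is copied from the spec text, and the size checks assume default C packing; `poll_delays` and `ping_interval` are hypothetical helpers that only encode the stated arithmetic (20 ms doubling to a 320 ms cap, and `Timeout*1000/3` ms).

```rust
use std::mem::size_of;
use std::time::Duration;

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Guid {
    pub data1: u32,
    pub data2: u16,
    pub data3: u16,
    pub data4: [u8; 8],
}

#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Luid {
    pub low_part: u32,
    pub high_part: i32,
}

/// Input buffer of ADD, fields in the order given in the spec.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct AddParams {
    pub width: u32,
    pub height: u32,
    pub refresh_rate: u32,
    pub monitor_guid: Guid,
    pub device_name: [u8; 14],
    pub serial_number: [u8; 14],
}

/// Output buffer of ADD.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct AddOut {
    pub adapter_luid: Luid,
    pub target_id: u32,
}

// Layout guards: 12 + 16 + 14 + 14 bytes in, 8 + 4 bytes out.
const _: () = assert!(size_of::<AddParams>() == 56);
const _: () = assert!(size_of::<AddOut>() == 12);

/// Delays between target-resolution polls: start at 20 ms, double, cap at 320 ms.
pub fn poll_delays(attempts: usize) -> Vec<Duration> {
    let mut next_ms = 20u64;
    (0..attempts)
        .map(|_| {
            let current = next_ms;
            next_ms = (next_ms * 2).min(320);
            Duration::from_millis(current)
        })
        .collect()
}

/// Keepalive period: one third of the watchdog timeout reported by the driver.
pub fn ping_interval(timeout_s: u32) -> Duration {
    Duration::from_millis(timeout_s as u64 * 1000 / 3)
}
```

With the default 3 s watchdog this gives a 1 s ping period, so two consecutive pings can be late before the driver would tear the displays down.
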
## No-GPU dev strategy

**Buildable + validatable on the VM now:** Step 0 (MSVC compile); the SudoVDA backend
(add/mode-set/keepalive/remove via WARP — *spike to confirm*); the openh264 SW encode path fed a CPU
BGRA staging copy → real AnnexB → FEC → UDP (the full transport minus HW); SendInput injection +
interactive-session/desktop-reattach; ViGEm gamepad + rumble; WASAPI loopback (if an endpoint
exists); and the entire client (software decode loopback).

**Defers to a real NVIDIA-GPU Windows box:** NVENC-D3D11 zero-copy encode; whether the captured
`ID3D11Texture2D` registers with NVENC zero-copy vs needing a `CopyResource`; the DDA-vs-WGC latency
bake-off (DDA-on-WARP is `E_NOTIMPL`-class); split-encode + bitrate-ceiling probe; and **all**
glass-to-glass / throughput numbers (no perf claim transfers from Linux).

## Windows-specific structural issues (no Linux precedent)

- **Interactive session, not a Session-0 service.** SendInput can't reach the desktop from Session 0.
  Run the host in the user's interactive session and replicate Apollo/Sunshine's
  `OpenInputDesktop`/`SetThreadDesktop` re-attach to survive UAC/lock-screen desktop switches. (Driving
  the UAC *secure* desktop needs a UIAccess manifest + signing — out of scope; document it.)
- **Clock epoch on the host side.** The skew handshake assumes both ends read the same realtime epoch
  in ns. The Windows host must emit timestamps from `GetSystemTimePreciseAsFileTime`→Unix-epoch-ns or
  cross-machine latency numbers + `ClockProbe`/`ClockEcho` break.
- **IDD has no audio endpoint.** There's nothing to loop back on a headless box unless a real/virtual
  render device exists → WASAPI loopback needs an endpoint, and the virtual *mic* (client→host) has no
  clean user-mode path. Audio is potentially a second driver-install problem; defer the mic.
- **Color/range.** All clients assume BT.709 limited-range. A new openh264/NVENC-D3D11 path doing
  BGRA→I420 must match, or colors wash out — validate against the existing decoders.

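
Two of the points above reduce to fixed arithmetic, sketched here with hypothetical function names. A `FILETIME` counts 100 ns ticks since 1601-01-01, so converting to Unix-epoch nanoseconds means subtracting the 11 644 473 600 s offset and multiplying by 100. The color function uses the standard BT.709 luma coefficients with 8-bit limited-range scaling (Y' in 16..235, chroma in 16..240) as a per-pixel reference only; a real BGRA→I420 path would also subsample chroma and would not use floating point per pixel.

```rust
/// 100 ns ticks between the FILETIME epoch (1601-01-01) and the Unix epoch.
pub const FILETIME_UNIX_OFFSET_TICKS: u64 = 116_444_736_000_000_000;

/// Convert a FILETIME tick count to nanoseconds since the Unix epoch.
/// Returns `None` for timestamps before 1970 or on overflow.
pub fn filetime_to_unix_ns(filetime_ticks: u64) -> Option<u64> {
    filetime_ticks
        .checked_sub(FILETIME_UNIX_OFFSET_TICKS)?
        .checked_mul(100)
}

/// Convert one full-range 8-bit RGB pixel to BT.709 limited-range Y'CbCr.
pub fn rgb_to_bt709_limited(r: u8, g: u8, b: u8) -> (u8, u8, u8) {
    let (r, g, b) = (r as f64 / 255.0, g as f64 / 255.0, b as f64 / 255.0);
    // BT.709 luma coefficients: Kr = 0.2126, Kb = 0.0722.
    let y = 0.2126 * r + 0.7152 * g + 0.0722 * b;
    let cb = (b - y) / 1.8556; // 2 * (1 - Kb)
    let cr = (r - y) / 1.5748; // 2 * (1 - Kr)
    let quantize = |v: f64| v.round().clamp(0.0, 255.0) as u8;
    (
        quantize(16.0 + 219.0 * y),
        quantize(128.0 + 224.0 * cb),
        quantize(128.0 + 224.0 * cr),
    )
}
```

Checking a few known pixels (black, white, pure red) against an existing decoder's output is a quick way to catch a full-range or BT.601 mix-up before it shows up as washed-out colors.
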
## Phased plan (host-first)

0. **Compile on MSVC** (Step 0 above). GPU-less. ← *start here*
1. **SudoVDA `VirtualDisplay` backend** — ✅ *control path landed* (`vdisplay/sudovda.rs`:
   add/keepalive/remove + GDI-name resolution + RAII teardown, behind the existing trait; `open()`
   returns it on Windows). Compiles + live-tested on the VM. **Remaining:** monitor activation +
   `\\.\DisplayN` resolution (needs a GPU), then `SetDisplayConfig` mid-stream `Reconfigure`.
2. **Capture + SW encode** — DXGI Desktop Duplication (or WGC) → `ID3D11Texture2D` → CPU staging →
   openh264 → existing FEC/transport. First end-to-end Windows session, GPU-less, against the Linux
   `punktfunk-client-rs` or the new Windows client.
3. **Input** — SendInput (kbd/mouse, VIRTUALDESK mapping) + interactive-session/desktop-reattach.
4. **Gamepad + audio** — ViGEm + rumble; WASAPI loopback.
5. **HW encode (real-GPU box)** — `nvidia-video-codec-sdk` D3D11 zero-copy; DDA-vs-WGC bake-off;
   glass-to-glass numbers. Resolve to Xbox-360 pad on Windows (drop DualSense fidelity/virtual-mic to
   follow-ups, as the host already does for non-Linux).

## The Windows client (separate track, pure Rust)

Structurally a sibling of `crates/punktfunk-client-linux` (GTK4) — same shape, different toolkit:

- **UI**: `windows-rs` + **Windows Reactor** (WinUI 3) for native chrome. Link `punktfunk-core`
  directly (no C ABI). **De-risk early**: a Reactor window with a `SwapChainPanel` presenting a
  test pattern through a flip-model waitable swapchain, before building on it. Fallback if Reactor's
  3-week-old maturity bites: the `Custom` element + raw `windows-rs` `Microsoft.UI.Xaml`.
- **Decode**: FFmpeg `avcodec_send_packet`/`receive_frame` with the **D3D11VA** hwaccel → `NV12/P010`
  `ID3D11Texture2D`. Feeds AnnexB directly (matches host output), decodes AV1 with no Store extension.
- **Present**: DXGI flip-model **waitable** swapchain (`FLIP_DISCARD` + `FRAME_LATENCY_WAITABLE_OBJECT`,
  max latency 1) bound to the `SwapChainPanel` via `ISwapChainPanelNative::SetSwapChain`. **Not**
  MediaPlayerElement.
- **Input capture**: RAWINPUT/`WM_INPUT` for relative/pointer-lock mouse; `Windows.Gaming.Input` for
  gamepads + rumble. Forward via the linked `NativeClient` (`send_input`/`send_rich_input`).
- **Trust**: SPAKE2 PIN + TOFU pinning via core; persist the client identity in Windows Credential
  Manager / DPAPI (the Keychain analog).

## Open risks / spikes (do these in isolation, early)

1. **`cargo build -p punktfunk-host` on the VM** — count + triage the real MSVC errors before
   estimating Step 0. (GPU-less.)
2. **SudoVDA `ADD` on the VM** — ✅ *done 2026-06-15.* The control path is fully validated on the
   GPU-less VM, both standalone and through the real `VirtualDisplay` trait (`vdisplay/sudovda.rs`):
   device open by GUID, `GET_PROTOCOL_VERSION` (0.2.1), `GET_WATCHDOG` (3 s), `ADD 1920×1080@60` → returns
   adapter LUID + `target_id`, watchdog ping holds it, RAII `Drop` → `REMOVE`. **Gap:** with no GPU the
   target does NOT activate into a WDDM display path (`QueryDisplayConfig` active paths stay 0 → no
   `\\.\DisplayN` to resolve/capture). So **activation + name-resolution + capture defer to a real
   GPU** (passthrough on the Proxmox VM, or a GPU box) — consistent with capture/NVENC deferring anyway.
3. **IDD arbitrary-mode + `Reconfigure` on 24H2/25H2** — does 5120×1440@240 apply, and does a
   remove+re-add (or re-modeset) hit the ~90 ms budget without a Settings-UI toggle? Make-or-break for
   "native client resolution, no scaling".
4. **NVENC-D3D11 zero-copy** (real-GPU box) — does the captured texture register as-is, or need a
   copy? Does `nvidia-video-codec-sdk`'s `NV_ENC_DEVICE_TYPE_DIRECTX` path work end-to-end? (Expect to
   vendor/patch.)
5. **DDA vs WGC** against the SudoVDA monitor — measure latency/jitter on a real GPU; resolve the
   primary-capture choice.
6. **Driver redistribution** — confirm bundling SudoVDA (`.cer` + nefcon) + ViGEmBus installers in the
   punktfunk Windows package; document them as prerequisites.

## References

- SudoVDA: <https://github.com/SudoMaker/SudoVDA> · Apollo integration:
  <https://github.com/ClassicOldSong/Apollo/tree/master/src/platform/windows> (`virtual_display.cpp`) + `third-party/sudovda/`
- parsec-vdd-rust (port pattern): <https://github.com/rohitsangwan01/parsec-vdd-rust>
- Win11 24H2 IDD mode-apply regression: VirtualDrivers/Virtual-Display-Driver #471
- Windows Reactor (WinUI 3 in Rust): windows-rs PR #4479
- Crates: `windows`, `windows-capture`, `vigem-client`, `wasapi`, `openh264`,
  `nvidia-video-codec-sdk`, `ffmpeg-next`