docs(windows-host): host-first plan + SudoVDA protocol + no-GPU strategy
ci / docs-site (push) Successful in 31s
apple / swift (push) Successful in 1m17s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 7s
ci / web (push) Successful in 27s
ci / rust (push) Successful in 2m11s
ci / bench (push) Successful in 1m36s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 7s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m26s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m56s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m5s
Rewrite the scoping doc into a concrete implementation plan: locked decisions (host-first, SudoVDA virtual display, pure-Rust windows-rs+Reactor client linking core directly, FFmpeg/D3D11VA decode), the SudoVDA IOCTL control protocol, the no-GPU dev strategy, the Windows-specific structural issues (interactive session, clock epoch, no IDD audio), and the phased plan. Step 0 (compile on MSVC) marked done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
+202
-57
@@ -1,73 +1,218 @@
|
||||
# Windows as a host — feasibility & scoping
|
||||
# Windows host + client — implementation plan
|
||||
|
||||
**Status: scoped, deferred — but de-risked.** A Windows host is architecturally an *"add a backend"*
|
||||
job, not a parallel port. The one thing that used to make it **large** — the per-client *virtual*
|
||||
output, which has no user-mode Windows API and seemingly needed a self-signed kernel Indirect
|
||||
Display Driver (IDD) — is **solved by reusing [SudoVDA](https://github.com/VirtualDrivers), the
|
||||
Sunshine Virtual Display Adapter**: a pre-built, signed IDD that creates virtual displays at
|
||||
arbitrary `WxH@Hz` on demand. We install it and drive its control interface; **no driver to write or
|
||||
WHQL-sign.** That turns the headline feature from XL into a medium backend. This doc records what's
|
||||
left so the work can be picked up deliberately.
|
||||
**Status: in progress — dev box provisioned, host-first.** A Windows host is an *"add backends
|
||||
behind the existing traits"* job, not a parallel port: `punktfunk-core` and the whole control plane
|
||||
are platform-agnostic and the host already compiles on non-Linux (macOS) thanks to existing
|
||||
`cfg(target_os)` gating. The one piece that used to make it XL — a per-client *virtual* output, which
|
||||
has no user-mode Windows API — is solved by reusing **[SudoVDA](https://github.com/SudoMaker/SudoVDA)**
|
||||
(the SudoMaker Virtual Display Adapter, the same IDD the Apollo Sunshine-fork ships): a pre-built IDD
|
||||
that creates virtual displays at **arbitrary `WxH@Hz` on the fly**. We install it and drive its IOCTL
|
||||
control interface — **no driver to write or WHQL-sign.**
|
||||
|
||||
(Grounded in a 4-agent read of the host crate, 2026-06-10; SudoVDA path added 2026-06-11.)
|
||||
History: scoped 2026-06-10 (4-agent read of the host crate); SudoVDA path 2026-06-11; this concrete
|
||||
plan + dev box + SudoVDA protocol + no-GPU strategy added 2026-06-14 (12-agent research pass).
|
||||
|
||||
## What's already done for us
|
||||
## Decisions (locked 2026-06-14)
|
||||
|
||||
punktfunk is cleanly layered. **~95% of the codebase is platform-agnostic and reuses verbatim:**
|
||||
| Decision | Choice | Rationale |
|
||||
|---|---|---|
|
||||
| **Build order** | **Host first** | User preference. (Note: the research recommended *client* first, since the client is not blocked by the no-GPU problem and doubles as the host's test endpoint; see "No-GPU dev strategy". Revisit if host progress stalls on GPU-gated steps.) |
|
||||
| **Virtual display** | **SudoVDA** | Arbitrary modes on the fly (no baked EDID / registry mode list, unlike parsec-vdd), MIT/CC0 (bundleable), already installed on the dev box, proven by Apollo. |
|
||||
| **Client UI** | **Pure Rust: `windows-rs` + Windows Reactor (WinUI 3)** | No C++/C#. Links `punktfunk-core` directly as a crate (like the GTK Linux client — no C ABI, no GC/FFI-lifetime hazard). Built-in `SwapChainPanel` widget for the video surface; `Custom` escape hatch + raw `Microsoft.UI.Xaml` as fallback. |
|
||||
| **Client decode** | **FFmpeg + D3D11VA** | Exactly what Moonlight ships; feeds AnnexB H.264/HEVC/AV1 directly, decodes AV1 via the GPU DXVA profile with **no** Store Video Extension. Cost: ffmpeg dep + libclang. |
|
||||
| **Host SW encode (no-GPU dev)** | **openh264** | BSD, no system ffmpeg, low-latency single-ref/zero-lookahead with intra-refresh. Lets the full capture→encode→FEC→send pipeline run GPU-less. |
|
||||
| **Host HW encode** | **nvidia-video-codec-sdk (D3D11)** | `NV_ENC_DEVICE_TYPE_DIRECTX` + `NvEncRegisterResource` on the captured `ID3D11Texture2D` = true zero-copy, no CUDA bridge. Young crate: vendor it and wrap it behind the `Encoder` trait. Deferred to a real-GPU box. |
|
||||
|
||||
| Reusable as-is | Why |
|
||||
## Dev box (`ssh "Enrico Bühler"@192.168.1.57`)
|
||||
|
||||
Windows 11 Pro 25H2 (build 26200), QEMU Q35, 8 vCPU, 12 GB. **No working GPU** (an `RTX 5070 Ti` node
|
||||
is present but `Status: Unknown`; `nvidia-smi` fails → NVENC cannot initialize). Installed: Rust 1.96
|
||||
(MSVC), Visual Studio Community 2026 + VC tools + Windows SDK 10.0.26100/28000, Windows App Runtime
|
||||
2.2 (Reactor needs ≥2.0.1 ✅), **SudoVDA** (`ROOT\DISPLAY\0000`, hwid `root\sudomaker\sudovda`, INF
|
||||
`oem6.inf`, Status OK) and Parsec VDD, git, winget. **Toolchain gaps to fill** (see Step 0): NASM,
|
||||
CMake, libclang.
|
||||
|
||||
## Reused as-is (~95% of the codebase — no changes)
|
||||
|
||||
| Reusable | Why |
|
||||
|---|---|
|
||||
| `punktfunk-core` (protocol, FEC, crypto, session, transport, **C ABI**) | Zero platform deps — no `cfg(linux)` anywhere; the C ABI is already cross-platform |
|
||||
| QUIC control plane (`quic.rs`, pairing, mode negotiation) | quinn + tokio are portable |
|
||||
| GameStream P1.1 (mDNS, serverinfo, pairing, RTSP, ENet) — *except* `stream.rs`/`audio.rs` | pure wire logic |
|
||||
| Management REST API (`mgmt.rs`) + OpenAPI | axum/tokio, portable |
|
||||
| Pipeline + `m3.rs` orchestration | trait-generic — calls `capturer.next_frame()`, `encoder.submit/poll()`; **needs zero changes** |
|
||||
| The **trait boundaries** themselves: `Capturer`, `Encoder`, `VirtualDisplay`, `InputInjector`, `AudioCapturer`, `VirtualMic` | platform-neutral signatures; Linux deps are already isolated under `[target.'cfg(target_os="linux")'.dependencies]` |
|
||||
| `punktfunk-core` (protocol, FEC, crypto, session, transport, QUIC control plane, C ABI) | Zero platform deps; already compiles on Windows MSVC |
|
||||
| GameStream wire logic (mDNS, serverinfo, pairing, RTSP, ENet) *except* the capture/encode/audio backends | pure protocol |
|
||||
| Management REST API (`mgmt.rs`) + OpenAPI, `native_pairing`, `discovery` | axum/tokio/quinn — portable |
|
||||
| `m3.rs` / `m0.rs` / `pipeline.rs` orchestration | trait-generic: call `capturer.next_frame()`, `encoder.submit/poll()`, `vd.create(mode)` — no changes |
|
||||
| The trait boundaries: `Capturer`, `Encoder`, `VirtualDisplay`, `InputInjector`, `AudioCapturer`, `VirtualMic` | platform-neutral; Linux deps already isolated under `[target.'cfg(target_os="linux")'.dependencies]` |
|
||||
|
||||
So a Windows host is **new `#[cfg(target_os = "windows")]` backend modules behind the existing
|
||||
traits** — the per-frame path, protocol, and control plane don't move. No architectural refactor is
|
||||
required; the boundaries are already in the right places.
|
||||
## Step 0 — make `punktfunk-host` compile on `x86_64-pc-windows-msvc` — ✅ DONE (2026-06-14)
|
||||
|
||||
## What a Windows host needs (new code)
|
||||
**Result:** the full dependency tree builds clean on MSVC (aws-lc-rs with NASM+CMake, quinn,
|
||||
rusty_enet, axum/hyper/utoipa), and `punktfunk-host` compiles **and runs** (the `openapi` subcommand
|
||||
emits the spec). Only **3 cfg-gates** were needed — the host was already ~95% portable:
|
||||
`main.rs` `mod dmabuf_fence`/`mod drm_sync` → `#[cfg(target_os = "linux")]`; `vdisplay.rs` the
|
||||
`use std::os::fd::OwnedFd` import + `VirtualOutput.remote_fd` field → `#[cfg(target_os = "linux")]`.
|
||||
Verified green on Linux too. Build env on the VM: rustc+`cc`/`cmake` self-locate MSVC (vcvars not
|
||||
needed); `PATH` must include cargo bin + NASM + CMake + LLVM.
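
As a rough illustration of the `vdisplay.rs` gate described above, the sketch below shows the shape of a `cfg`-gated import plus field. Only `OwnedFd`, `VirtualOutput` and `remote_fd` come from the text; the `id` field and the `without_fd` constructor are hypothetical stand-ins, and the real struct almost certainly carries more state.

```rust
// Hypothetical, trimmed-down stand-in for the host's `VirtualOutput`.
// Only `remote_fd` and the `OwnedFd` import are named in the plan.
#[cfg(target_os = "linux")]
use std::os::fd::OwnedFd;

#[derive(Debug)]
pub struct VirtualOutput {
    /// Illustrative field (assumed): a backend-neutral identifier.
    pub id: u32,
    /// Exists only on Linux, where the compositor hands back a remote fd.
    #[cfg(target_os = "linux")]
    pub remote_fd: Option<OwnedFd>,
}

impl VirtualOutput {
    /// Builds an output with no remote fd; the same source compiles on
    /// Linux, macOS and Windows because the field initializer is gated too.
    pub fn without_fd(id: u32) -> Self {
        Self {
            id,
            #[cfg(target_os = "linux")]
            remote_fd: None,
        }
    }
}
```

Gating on `target_os = "linux"` rather than `unix` matches what the result paragraph reports; it also keeps macOS builds from carrying a field that no macOS backend fills in.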
|
||||
|
||||
Each row is a Linux backend that needs a Windows sibling. Effort is the *implementation* effort;
|
||||
all reuse the existing trait.
|
||||
The host already compiles on macOS (Linux backends are `cfg`-gated; heavy Linux deps are
|
||||
target-gated). Getting to Windows MSVC is the **unix-but-not-linux** delta, not a from-scratch port:
|
||||
|
||||
| Subsystem | Linux today | Windows equivalent | Effort | Notes |
|
||||
|---|---|---|---|---|
|
||||
| **Capture** | xdg ScreenCast portal → PipeWire (dmabuf) | **DXGI Desktop Duplication** (or Windows.Graphics.Capture) → D3D11 texture | M | DXGI gives a GPU `B8G8R8A8` texture directly |
|
||||
| **Virtual display** | KWin/Mutter/Sway/gamescope protocols | **SudoVDA** (pre-built signed IDD) — install + drive its control API to add/remove a `WxH@Hz` virtual monitor per session | **M** | ✅ **no longer the blocker**: SudoVDA is the same IDD Sunshine ships, so no driver to author or sign. The `VirtualDisplay` backend = enable the adapter, create a monitor at the client's mode, capture it (DXGI), tear it down on session end. Fallback if SudoVDA is absent: capture an existing monitor (loses native-resolution) |
|
||||
| **Encode** | `ffmpeg-next` NVENC, CUDA hwframes | Media Foundation H.264/HEVC/AV1, **or** NVENC SDK direct with a D3D11 device context (`AVD3D11VADeviceContext`) | M–L | `encode.rs` AU/codec logic + NVENC option strings are portable; only the hwdevice + frame-pool glue swaps |
|
||||
| **Zero-copy bridge** | dmabuf → EGL/Vulkan → CUDA | D3D11 texture → NVENC (shared texture / `cudaImportExternalMemory` + D3D12 fence) | M | **optional** — a portable CPU-copy path already exists, so v1 can skip this |
|
||||
| **Input (ptr/kbd)** | libei (RemoteDesktop portal) / wlr protocols | **SendInput** (`keybd_event`/`mouse_event`) | S | the VK→evdev table just becomes VK→`VIRTUAL_KEY` (already Win32-native) |
|
||||
| **Input (gamepads)** | uinput X-Box-360 pad + FF rumble | **ViGEm** (Virtual Gamepad Emulation) + HID reports | M | rumble back-channel maps to ViGEm notifications |
|
||||
| **Audio capture** | PipeWire sink-monitor | **WASAPI loopback** (`IAudioCaptureClient`) | S–M | also produces interleaved f32 — same `AudioCapturer` contract |
|
||||
| **Virtual mic** | PipeWire `Audio/Source` | virtual audio device (VB-Cable-style WDM driver) or WASAPI render-to-fake-device | M | needs a driver or a bundled 3rd-party cable |
|
||||
| **`sendmmsg` batching** | `gamestream/stream.rs` | already has a `cfg(not(linux))` per-packet fallback | — | nothing to do |
|
||||
1. **Toolchain**: `winget install NASM.NASM Kitware.CMake LLVM.LLVM`, set `LIBCLANG_PATH`
|
||||
(or tick VS "C++ Clang tools"). NASM+CMake are for **aws-lc-rs** (pulled by `rustls`/`rcgen` on
|
||||
the `quic` path); libclang is for `ffmpeg-sys`/bindgen (client decode + any host bindgen crate).
|
||||
2. **`std::os::fd` / `libc`**: `vdisplay.rs:18` has an unconditional `use std::os::fd::OwnedFd;` and
|
||||
`VirtualOutput.remote_fd: Option<OwnedFd>` — `std::os::fd` is `cfg(unix)`, so it builds on macOS
|
||||
but breaks on Windows. Gate the import + field (`#[cfg(unix)]`, with a Windows arm or omission).
|
||||
Sweep for other `cfg(target_os="linux")`-missing unix-isms (`libc`, fds).
|
||||
3. **Build natively on the VM** (`cargo build -p punktfunk-host` — *not* cross-compile; xwin chokes on
|
||||
aws-lc-rs/ffmpeg-sys/WDK). Triage the remaining errors. Suspect deps to verify link on MSVC:
|
||||
`aws-lc-rs` (needs NASM+CMake), `rusty_enet`, the hyper/axum/utoipa stack (expected fine).
|
||||
4. **CI**: add a `cargo build -p punktfunk-host --target x86_64-pc-windows-msvc` job so the Windows
|
||||
path stops bit-rotting (the dev box can be a Gitea runner later).
|
||||
|
||||
**Rough total: ~2,000–4,000 LOC of new Rust** (no C++ driver — SudoVDA is reused as-is), spread over
|
||||
capture/encode/vdisplay/input/audio. With the driver problem solved, the overall effort is now
|
||||
**medium**; the input+audio layer alone is *small–medium*.
|
||||
This is the highest-value first move and is **fully doable GPU-less**.
|
||||
|
||||
## Recommended phasing (when picked up)
|
||||
## Windows backends (new `#[cfg(target_os = "windows")]` code behind existing traits)
|
||||
|
||||
1. **Phase 0 — "basic Windows host" (no virtual display).** Capture an *existing* monitor (DXGI
|
||||
Desktop Duplication) → Media Foundation/NVENC encode → SendInput + WASAPI loopback. This proves
|
||||
the whole stack on Windows with the smallest surface, reusing all of core/QUIC/GameStream/mgmt.
|
||||
It loses the per-client native-resolution output but is a working Windows host quickly.
|
||||
2. **Phase 1 — the virtual display via SudoVDA.** A `VirtualDisplay` backend that enables SudoVDA,
|
||||
creates a monitor at the client's exact `WxH@Hz`, captures it (DXGI), and tears it down on session
|
||||
end — restoring punktfunk's headline feature with **no driver authoring or signing**. (Ship/guide
|
||||
the SudoVDA install as a host prerequisite, like the udev rule on Linux.)
|
||||
3. **Phase 2 — input + audio parity.** ViGEm gamepads + rumble; WASAPI virtual mic; D3D11→NVENC
|
||||
zero-copy.
|
||||
| Subsystem | Linux today | Windows backend | VM-testable? |
|
||||
|---|---|---|---|
|
||||
| **VirtualDisplay** | KWin/gamescope/Mutter/Sway | **SudoVDA** IOCTLs (below) + `SetDisplayConfig` mode-set | ✅ likely (WARP) — *spike* |
|
||||
| **Capture** | PipeWire/dmabuf | **DXGI Desktop Duplication** primary, **WGC** fallback → `ID3D11Texture2D`; add `FramePayload::D3d11` | ⚠️ DDA-on-WARP unreliable; WGC-on-WARP unverified — *spike* |
|
||||
| **Zero-copy** | dmabuf→EGL/Vulkan→CUDA | register `ID3D11Texture2D` with NVENC (`NV_ENC_DEVICE_TYPE_DIRECTX`) — no CUDA bridge | ❌ needs real GPU |
|
||||
| **Encode** | ffmpeg `*_nvenc` | `openh264` SW (default on VM) + `nvidia-video-codec-sdk` HW (real GPU); behind `PUNKTFUNK_ENCODER` | SW ✅ / HW ❌ |
|
||||
| **Input kbd/mouse** | libei / wlr | **SendInput** with `MOUSEEVENTF_ABSOLUTE` + `MOUSEEVENTF_VIRTUALDESK` mapping onto the virtual desktop rect (skip the VK→evdev table, since the client sends Win VKs; use `KEYEVENTF_SCANCODE` + `KEYEVENTF_EXTENDEDKEY`) | ✅ |
|
||||
| **Gamepad** | uinput xpad + FF | **ViGEmBus** via `vigem-client` (`Xbox360Wired`); rumble via `request_notification()`→`XNotification{large,small}` | ✅ (install driver) |
|
||||
| **Audio capture** | PipeWire sink monitor | **WASAPI loopback** via the `wasapi` crate (48 kHz stereo f32 → existing Opus) | ⚠️ needs an audio endpoint |
|
||||
| **Virtual mic** | PipeWire `Audio/Source` | virtual audio driver (`Virtual-Audio-Driver`) or defer | ❌ second driver — defer |
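
The absolute pointer mapping in the input row can be sketched as plain arithmetic. With `MOUSEEVENTF_ABSOLUTE | MOUSEEVENTF_VIRTUALDESK`, `SendInput` expects coordinates normalized to 0..=65535 across the whole virtual desktop. The `VirtualDesk` struct and `to_absolute` function below are hypothetical names; the rectangle would presumably come from `GetSystemMetrics` (`SM_XVIRTUALSCREEN`, `SM_YVIRTUALSCREEN`, `SM_CXVIRTUALSCREEN`, `SM_CYVIRTUALSCREEN`), and the rounding choice is an assumption, not something the plan specifies.

```rust
/// Virtual-desktop rectangle in pixels. The origin can be negative when a
/// monitor sits left of or above the primary one.
#[derive(Clone, Copy, Debug)]
pub struct VirtualDesk {
    pub left: i32,
    pub top: i32,
    pub width: i32,
    pub height: i32,
}

/// Normalizes one axis to 0..=65535, clamping positions outside the rect.
fn normalize(pos: i32, origin: i32, extent: i32) -> i32 {
    // Last addressable pixel index on this axis; at least 1 to avoid /0.
    let span = (extent as i64 - 1).max(1);
    let offset = (pos as i64 - origin as i64).clamp(0, span);
    // Round to nearest so the first and last pixels hit 0 and 65535 exactly.
    ((offset * 65535 + span / 2) / span) as i32
}

/// Maps a pixel position on the virtual desktop to the (dx, dy) pair that
/// an absolute, virtual-desk `MOUSEINPUT` would carry.
pub fn to_absolute(desk: VirtualDesk, x: i32, y: i32) -> (i32, i32) {
    (
        normalize(x, desk.left, desk.width),
        normalize(y, desk.top, desk.height),
    )
}
```

For a SudoVDA monitor the caller would first translate client-relative coordinates into virtual-desktop pixels using that monitor's position, then pass the result through `to_absolute`.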
|
||||
|
||||
## Why it's deferred (not started now)
|
||||
`m3.rs`/`m0.rs`/`pipeline.rs` are unchanged. Note: the Windows capture needs its own
|
||||
`capture_virtual_output` entry point (the SudoVDA identity is a DXGI adapter LUID + DisplayConfig
|
||||
TargetId → GDI `\\.\DisplayN`, which doesn't fit the PipeWire `node_id: u32` field — carry it inside
|
||||
the `keepalive` / a Windows-specific seam rather than overloading `node_id`).
|
||||
|
||||
- The remaining work is **medium** and mechanical, but **none of it is buildable or testable on the
|
||||
Linux dev box** — it would be unvalidated code until there's a Windows box in the loop.
|
||||
- SudoVDA removed the hard blocker (the signed kernel driver); what's left is a backend port, picked
|
||||
up whenever a Windows target is in scope.
|
||||
## SudoVDA control protocol (the `VirtualDisplay` backend spec)
|
||||
|
||||
The architecture is ready whenever the work is scheduled; this doc + the clean trait boundaries are
|
||||
the down payment. Start at **Phase 0** for the fastest path to a working Windows host.
|
||||
Pure Rust via the `windows` crate (no C lib; Apollo vendors a header-only client under
|
||||
`third-party/sudovda/`). Reference port pattern: `parsec-vdd-rust` (SetupAPI/CM_* → `CreateFileW` →
|
||||
`DeviceIoControl`). **Verify the IOCTL hex with a `const fn ctl_code()`** —
|
||||
`CTL_CODE(dev,func,method,access) = (dev<<16)|(access<<14)|(func<<2)|method`, with
|
||||
`FILE_DEVICE_UNKNOWN=0x22`, `METHOD_BUFFERED=0`, `FILE_ANY_ACCESS=0`.
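
One way to do that verification is to compute the codes at compile time and let the build fail on a mismatch. The sketch below uses the `CTL_CODE` formula and constants quoted above, together with the function numbers and expected hex values from the IOCTL list in this section. The `IOCTL_*` constant names and the `sudovda` helper are illustrative, not taken from the SudoVDA header.

```rust
pub const FILE_DEVICE_UNKNOWN: u32 = 0x22;
pub const METHOD_BUFFERED: u32 = 0;
pub const FILE_ANY_ACCESS: u32 = 0;

/// Rust equivalent of the Win32 `CTL_CODE` macro.
pub const fn ctl_code(dev: u32, func: u32, method: u32, access: u32) -> u32 {
    (dev << 16) | (access << 14) | (func << 2) | method
}

/// Hypothetical helper: every SudoVDA IOCTL in this doc shares the same
/// device type, method and access, so only the function number varies.
pub const fn sudovda(func: u32) -> u32 {
    ctl_code(FILE_DEVICE_UNKNOWN, func, METHOD_BUFFERED, FILE_ANY_ACCESS)
}

// Constant names are illustrative; function numbers are from this section.
pub const IOCTL_ADD: u32 = sudovda(0x800);
pub const IOCTL_REMOVE: u32 = sudovda(0x801);
pub const IOCTL_SET_RENDER_ADAPTER: u32 = sudovda(0x802);
pub const IOCTL_GET_WATCHDOG: u32 = sudovda(0x803);
pub const IOCTL_DRIVER_PING: u32 = sudovda(0x888);
pub const IOCTL_GET_PROTOCOL_VERSION: u32 = sudovda(0x8FF);

// Compile-time checks against the hex values recorded in this doc.
const _: () = assert!(IOCTL_ADD == 0x0022_2000);
const _: () = assert!(IOCTL_REMOVE == 0x0022_2004);
const _: () = assert!(IOCTL_SET_RENDER_ADAPTER == 0x0022_2008);
const _: () = assert!(IOCTL_GET_WATCHDOG == 0x0022_200C);
const _: () = assert!(IOCTL_DRIVER_PING == 0x0022_2220);
const _: () = assert!(IOCTL_GET_PROTOCOL_VERSION == 0x0022_23FC);
```

Because the assertions run in a `const` context, a typo in a function number or in an expected value becomes a compile error instead of a failed `DeviceIoControl` call at runtime.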
|
||||
|
||||
- **Device interface GUID**: `{E5BCC234-1E0C-418A-A0D4-EF8B7501414D}` · **HWID**: `root\sudomaker\sudovda`
|
||||
- **IOCTLs** (func → value): ADD `0x800`→`0x00222000`, REMOVE `0x801`→`0x00222004`,
|
||||
SET_RENDER_ADAPTER `0x802`→`0x00222008`, GET_WATCHDOG `0x803`→`0x0022200C`,
|
||||
DRIVER_PING `0x888`→`0x00222220`, GET_PROTOCOL_VERSION `0x8FF`→`0x002223FC`.
|
||||
- **Add** (`#[repr(C)]` exact layout): in `{ u32 Width; u32 Height; u32 RefreshRate; GUID MonitorGuid;
|
||||
CHAR DeviceName[14]; CHAR SerialNumber[14] }` → out `{ LUID AdapterLuid; u32 TargetId }`. **The mode
|
||||
is set at create** (driver computes timing arithmetically — no EDID seeding). Pick a *stable
|
||||
per-client* `MonitorGuid` (Windows persists that monitor's layout; remove is by GUID).
|
||||
- **Resolve the capture target**: the monitor appears **asynchronously** — poll
|
||||
`QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS)`, match `targetInfo.id == TargetId`,
|
||||
`DisplayConfigGetDeviceInfo` → `viewGdiDeviceName` (`\\.\DisplayN`). Apollo polls 20 ms → ×2 → cap
|
||||
320 ms. Then point DXGI Desktop Duplication at that output.
|
||||
- **Keepalive (mandatory)**: `GET_WATCHDOG` → `{ u32 Timeout_s; u32 Countdown }` (default **3 s**,
|
||||
driver-wide). Run one thread firing `DRIVER_PING` every `Timeout*1000/3` ms (~1 s). Miss it and the
|
||||
driver tears down **all** virtual displays.
|
||||
- **Teardown (RAII)**: `Drop` → `DeviceIoControl(REMOVE, { GUID MonitorGuid })` = the `VirtualOutput`
|
||||
keepalive drop.
|
||||
- **Mid-stream `Reconfigure`**: SudoVDA has no in-place mode IOCTL (Apollo only relaunches). Implement
|
||||
punktfunk's `Reconfigure` as remove+re-add at the new mode (or add-second + migrate capture), and
|
||||
**watch the Win11 24H2/25H2 IDD mode-apply regression** (post-create `ChangeDisplaySettingsEx` may
|
||||
not move the *desktop* to the new mode without a Settings-UI poke — VirtualDrivers #471). The
|
||||
~90 ms `Reconfigure` budget needs an isolated spike to confirm on 24H2/25H2.
|
||||
- **Install / signing**: self-signed — ship `sudovda.cer`, import to Root + TrustedPublisher, create
|
||||
the device node via `nefconc.exe` (`--create-device-node`/`--install-driver`). Installs **without**
|
||||
test-signing (trusted-publisher). MIT/CC0 → bundleable (Apollo precedent). **Already installed on
|
||||
the dev box.** Document it as a host prerequisite (like the Linux udev rule).
|
||||
- **GPU caveat**: SudoVDA's `Driver.cpp` does `D3D11CreateDevice(UNKNOWN)` on a render adapter with
|
||||
**no explicit WARP fallback**; on the GPU-less VM Windows binds the Basic Render Driver (WARP), so
|
||||
display compositing *should* work but NVENC won't. Confirm `ADD` actually brings a monitor up on the
|
||||
VM in the first spike.
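
The Add layout, the watchdog ping cadence and the Apollo-style poll backoff from the list above can be pinned down in a small, platform-independent sketch. All type, field and function names here are hypothetical. The size checks assume the driver header uses default C packing (no `#pragma pack`), which should be confirmed against the vendored header before relying on them.

```rust
use std::mem::size_of;
use std::time::Duration;

/// Same layout as the Win32 `GUID` (16 bytes, 4-byte aligned).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Guid {
    pub data1: u32,
    pub data2: u16,
    pub data3: u16,
    pub data4: [u8; 8],
}

/// Input buffer for ADD, in the field order given in this doc.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct AddParams {
    pub width: u32,
    pub height: u32,
    pub refresh_rate: u32,
    pub monitor_guid: Guid,
    pub device_name: [u8; 14],   // CHAR[14]; u8 has the same layout
    pub serial_number: [u8; 14], // CHAR[14]
}

/// Same layout as the Win32 `LUID`.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Luid {
    pub low_part: u32,
    pub high_part: i32,
}

/// Output buffer for ADD.
#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct AddOut {
    pub adapter_luid: Luid,
    pub target_id: u32,
}

// Sizes under default C packing (assumption): 12 + 16 + 14 + 14 and 8 + 4.
const _: () = assert!(size_of::<AddParams>() == 56);
const _: () = assert!(size_of::<AddOut>() == 12);

/// Ping period derived from the watchdog timeout: `Timeout*1000/3` ms.
pub fn ping_interval(timeout_s: u32) -> Duration {
    Duration::from_millis(timeout_s as u64 * 1000 / 3)
}

/// Poll delays while waiting for the monitor to appear:
/// 20 ms, doubling each retry, capped at 320 ms. The iterator is endless,
/// so the caller is expected to bound it with its own deadline.
pub fn poll_delays() -> impl Iterator<Item = Duration> {
    std::iter::successors(Some(20u64), |ms| Some((*ms * 2).min(320)))
        .map(Duration::from_millis)
}
```

With the default 3 s timeout, `ping_interval(3)` is one second, which leaves room for two missed pings before the driver would tear everything down.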
|
||||
|
||||
## No-GPU dev strategy
|
||||
|
||||
**Buildable + validatable on the VM now:** Step 0 (MSVC compile); the SudoVDA backend
|
||||
(add/mode-set/keepalive/remove via WARP — *spike to confirm*); the openh264 SW encode path fed a CPU
|
||||
BGRA staging copy → real AnnexB → FEC → UDP (the full transport minus HW); SendInput injection +
|
||||
interactive-session/desktop-reattach; ViGEm gamepad + rumble; WASAPI loopback (if an endpoint
|
||||
exists); and the entire client (software decode loopback).
|
||||
|
||||
**Deferred to a real NVIDIA-GPU Windows box:** NVENC-D3D11 zero-copy encode; whether the captured
|
||||
`ID3D11Texture2D` registers with NVENC zero-copy vs needing a `CopyResource`; the DDA-vs-WGC latency
|
||||
bake-off (DDA-on-WARP is `E_NOTIMPL`-class); split-encode + bitrate-ceiling probe; and **all**
|
||||
glass-to-glass / throughput numbers (no perf claim transfers from Linux).
|
||||
|
||||
## Windows-specific structural issues (no Linux precedent)
|
||||
|
||||
- **Interactive session, not a Session-0 service.** SendInput can't reach the desktop from Session 0.
|
||||
Run the host in the user's interactive session and replicate Apollo/Sunshine's
|
||||
`OpenInputDesktop`/`SetThreadDesktop` re-attach to survive UAC/lock-screen desktop switches. (Driving
|
||||
the UAC *secure* desktop needs a UIAccess manifest + signing — out of scope; document it.)
|
||||
- **Clock epoch on the host side.** The skew handshake assumes both ends read the same realtime epoch
|
||||
in ns. The Windows host must emit timestamps from `GetSystemTimePreciseAsFileTime`→Unix-epoch-ns or
|
||||
cross-machine latency numbers + `ClockProbe`/`ClockEcho` break.
|
||||
- **IDD has no audio endpoint.** There's nothing to loop back on a headless box unless a real/virtual
|
||||
render device exists → WASAPI loopback needs an endpoint, and the virtual *mic* (client→host) has no
|
||||
clean user-mode path. Audio is potentially a second driver-install problem; defer the mic.
|
||||
- **Color/range.** All clients assume BT.709 limited-range. A new openh264/NVENC-D3D11 path doing
|
||||
BGRA→I420 must match, or colors wash out — validate against the existing decoders.
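
For the clock-epoch item above, the conversion itself is simple arithmetic. A `FILETIME` counts 100 ns ticks since 1601-01-01 UTC, so the Unix-epoch value is the tick count minus a fixed offset, times 100. The function names below are hypothetical; on Windows the two halves would come from the `dwLowDateTime` and `dwHighDateTime` fields filled in by `GetSystemTimePreciseAsFileTime`.

```rust
/// 100 ns ticks between 1601-01-01 and 1970-01-01 (11_644_473_600 s).
pub const FILETIME_UNIX_EPOCH_TICKS: u64 = 116_444_736_000_000_000;

/// Joins the two 32-bit halves of a `FILETIME` into one tick count.
pub fn filetime_ticks(low: u32, high: u32) -> u64 {
    ((high as u64) << 32) | low as u64
}

/// Converts FILETIME ticks to nanoseconds since the Unix epoch.
/// Returns `None` for times before 1970 or if the result overflows `u64`.
pub fn filetime_ticks_to_unix_ns(ticks: u64) -> Option<u64> {
    ticks
        .checked_sub(FILETIME_UNIX_EPOCH_TICKS)?
        .checked_mul(100)
}
```

The resolution is 100 ns rather than 1 ns, so two timestamps taken within the same tick compare equal; whether that matters for `ClockProbe`/`ClockEcho` is something the skew handshake would have to tolerate.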
|
||||
|
||||
## Phased plan (host-first)
|
||||
|
||||
0. **Compile on MSVC** (Step 0 above). GPU-less. ✅ done 2026-06-14
|
||||
1. **SudoVDA `VirtualDisplay` backend** — add/resolve-GDI-name/keepalive/remove + `SetDisplayConfig`
|
||||
mode-set; RAII teardown. *Spike first*: does `ADD` bring up a monitor + mode-set on the VM (WARP)?
|
||||
2. **Capture + SW encode** — DXGI Desktop Duplication (or WGC) → `ID3D11Texture2D` → CPU staging →
|
||||
openh264 → existing FEC/transport. First end-to-end Windows session, GPU-less, against the Linux
|
||||
`punktfunk-client-rs` or the new Windows client.
|
||||
3. **Input** — SendInput (kbd/mouse, VIRTUALDESK mapping) + interactive-session/desktop-reattach.
|
||||
4. **Gamepad + audio** — ViGEm + rumble; WASAPI loopback.
|
||||
5. **HW encode (real-GPU box)** — `nvidia-video-codec-sdk` D3D11 zero-copy; DDA-vs-WGC bake-off;
|
||||
glass-to-glass numbers. On Windows, gamepads resolve to an Xbox-360 pad (DualSense fidelity and the virtual mic move to
|
||||
follow-ups, as the host already does for non-Linux).
|
||||
|
||||
## The Windows client (separate track, pure Rust)
|
||||
|
||||
Structurally a sibling of `crates/punktfunk-client-linux` (GTK4) — same shape, different toolkit:
|
||||
|
||||
- **UI**: `windows-rs` + **Windows Reactor** (WinUI 3) for native chrome. Link `punktfunk-core`
|
||||
directly (no C ABI). **De-risk early**: a Reactor window with a `SwapChainPanel` presenting a
|
||||
test pattern through a flip-model waitable swapchain, before building on it. Fallback if Reactor's
|
||||
immaturity bites (it is about three weeks old): the `Custom` element + raw `windows-rs` `Microsoft.UI.Xaml`.
|
||||
- **Decode**: FFmpeg `avcodec_send_packet`/`receive_frame` with the **D3D11VA** hwaccel → `NV12/P010`
|
||||
`ID3D11Texture2D`. Feeds AnnexB directly (matches host output), decodes AV1 with no Store extension.
|
||||
- **Present**: DXGI flip-model **waitable** swapchain (`FLIP_DISCARD` + `FRAME_LATENCY_WAITABLE_OBJECT`,
|
||||
max latency 1) bound to the `SwapChainPanel` via `ISwapChainPanelNative::SetSwapChain`. **Not**
|
||||
MediaPlayerElement.
|
||||
- **Input capture**: RAWINPUT/`WM_INPUT` for relative/pointer-lock mouse; `Windows.Gaming.Input` for
|
||||
gamepads + rumble. Forward via the linked `NativeClient` (`send_input`/`send_rich_input`).
|
||||
- **Trust**: SPAKE2 PIN + TOFU pinning via core; persist the client identity in Windows Credential
|
||||
Manager / DPAPI (the Keychain analog).
|
||||
|
||||
## Open risks / spikes (do these in isolation, early)
|
||||
|
||||
1. **`cargo build -p punktfunk-host` on the VM** — count + triage the real MSVC errors before
|
||||
estimating Step 0. (GPU-less.)
|
||||
2. **SudoVDA `ADD` on the VM** — does a virtual monitor come up + mode-set via WARP with no GPU?
|
||||
Confirms the whole Phase 1 backend is VM-developable. (GPU-less.)
|
||||
3. **IDD arbitrary-mode + `Reconfigure` on 24H2/25H2** — does 5120×1440@240 apply, and does a
|
||||
remove+re-add (or re-modeset) hit the ~90 ms budget without a Settings-UI toggle? Make-or-break for
|
||||
"native client resolution, no scaling".
|
||||
4. **NVENC-D3D11 zero-copy** (real-GPU box) — does the captured texture register as-is, or need a
|
||||
copy? Does `nvidia-video-codec-sdk`'s `NV_ENC_DEVICE_TYPE_DIRECTX` path work end-to-end? (Expect to
|
||||
vendor/patch.)
|
||||
5. **DDA vs WGC** against the SudoVDA monitor — measure latency/jitter on a real GPU; resolve the
|
||||
primary-capture choice.
|
||||
6. **Driver redistribution** — confirm bundling SudoVDA (`.cer` + nefcon) + ViGEmBus installers in the
|
||||
punktfunk Windows package; document them as prerequisites.
|
||||
|
||||
## References
|
||||
|
||||
- SudoVDA: <https://github.com/SudoMaker/SudoVDA> · Apollo integration:
|
||||
<https://github.com/ClassicOldSong/Apollo/tree/master/src/platform/windows> (`virtual_display.cpp`)
|
||||
+ `third-party/sudovda/`
|
||||
- parsec-vdd-rust (port pattern): <https://github.com/rohitsangwan01/parsec-vdd-rust>
|
||||
- Win11 24H2 IDD mode-apply regression: VirtualDrivers/Virtual-Display-Driver #471
|
||||
- Windows Reactor (WinUI 3 in Rust): windows-rs PR #4479
|
||||
- Crates: `windows`, `windows-capture`, `vigem-client`, `wasapi`, `openh264`,
|
||||
`nvidia-video-codec-sdk`, `ffmpeg-next`