punktfunk/docs/windows-host.md
enricobuehler e07e359b6d
docs: scope Windows-as-host (deferred) + update roadmap status
A 4-agent read of the host crate: a Windows host is an "add a backend" job, not a parallel
port — ~95% reuse (core/protocol/FEC/crypto/C-ABI, QUIC, GameStream, mgmt, m3/pipeline are all
platform-agnostic and already cfg-isolated). New cfg(windows) backends behind the existing
traits: DXGI Desktop Duplication (capture), Media Foundation / NVENC-SDK (encode), SendInput +
ViGEm (input), WASAPI loopback + virtual mic (audio). The blocker is the virtual-display
feature — no user-mode Windows API; it needs a signed kernel-mode IDD driver (XL).

docs/windows-host.md records the per-subsystem effort + a phased plan (Phase 0 = a "basic
Windows host" capturing an existing monitor, smallest surface). Deferred: large and unbuildable
on the Linux dev box, per the request to only take it on if manageable. roadmap.md marks
#1/#2/#4 done, #3 packaged, and adds #7 Windows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 22:29:01 +00:00

# Windows as a host — feasibility & scoping
**Status: scoped, deferred.** A Windows host is architecturally an *"add a backend"* job, not a
parallel port — but it is a **large** implementation effort across five GPU/driver subsystems, and
the project's headline feature (a per-client *virtual* output at the client's exact mode) has **no
application-level Windows API**: it needs a signed Indirect Display Driver (IDD). An IDD is built on
UMDF, so it runs in user mode, but it is still a driver that has to be written, signed, and
installed. This doc records what it takes so the work can be picked up deliberately later.
(Grounded in a 4-agent read of the host crate, 2026-06-10.)
## What's already done for us
punktfunk is cleanly layered. **~95% of the codebase is platform-agnostic and can be reused verbatim:**
| Reusable as-is | Why |
|---|---|
| `punktfunk-core` (protocol, FEC, crypto, session, transport, **C ABI**) | Zero platform deps — no `cfg(linux)` anywhere; the C ABI is already cross-platform |
| QUIC control plane (`quic.rs`, pairing, mode negotiation) | quinn + tokio are portable |
| GameStream P1.1 (mDNS, serverinfo, pairing, RTSP, ENet) — *except* `stream.rs`/`audio.rs` | pure wire logic |
| Management REST API (`mgmt.rs`) + OpenAPI | axum/tokio, portable |
| Pipeline + `m3.rs` orchestration | trait-generic — calls `capturer.next_frame()`, `encoder.submit/poll()`; **needs zero changes** |
| The **trait boundaries** themselves: `Capturer`, `Encoder`, `VirtualDisplay`, `InputInjector`, `AudioCapturer`, `VirtualMic` | platform-neutral signatures; Linux deps are already isolated under `[target.'cfg(target_os="linux")'.dependencies]` |

So a Windows host is **new `#[cfg(target_os = "windows")]` backend modules behind the existing
traits**: the per-frame path, protocol, and control plane don't move. No architectural refactor is
required; the boundaries are already in the right places.
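
The "backend behind an existing trait" shape can be sketched in a few lines. This is a minimal illustration, not the host crate's code: `Capturer` and `next_frame()` are named in this doc, but the signature, the `Frame` type, `StubCapturer`, `BACKEND`, `open_capturer`, and `drain` are all assumptions made up for the example. Real backends would live in separate `cfg`-gated modules; plain `cfg`-gated items are used here to keep the sketch short.

```rust
/// Hypothetical frame type; the real one lives in the host crate.
#[derive(Debug, Clone, PartialEq)]
pub struct Frame {
    pub width: u32,
    pub height: u32,
    /// Which backend produced the frame (illustration only).
    pub source: &'static str,
}

/// Stand-in for the existing `Capturer` trait (signature assumed).
pub trait Capturer {
    fn next_frame(&mut self) -> Option<Frame>;
}

// Exactly one of these constants is compiled in, depending on the target OS.
#[cfg(target_os = "linux")]
pub const BACKEND: &str = "pipewire";
#[cfg(target_os = "windows")]
pub const BACKEND: &str = "dxgi-desktop-duplication";
#[cfg(not(any(target_os = "linux", target_os = "windows")))]
pub const BACKEND: &str = "unsupported";

/// Stub capturer: yields `remaining` fixed-size frames, then ends.
/// A real backend would own a PipeWire stream or a DXGI duplication handle.
pub struct StubCapturer {
    remaining: u32,
}

impl Capturer for StubCapturer {
    fn next_frame(&mut self) -> Option<Frame> {
        if self.remaining == 0 {
            return None;
        }
        self.remaining -= 1;
        Some(Frame { width: 1920, height: 1080, source: BACKEND })
    }
}

/// The only place that knows which platform backend is compiled in.
pub fn open_capturer(frames: u32) -> StubCapturer {
    StubCapturer { remaining: frames }
}

/// Platform-agnostic pipeline step: drains any capturer and counts frames.
pub fn drain<C: Capturer>(capturer: &mut C) -> usize {
    let mut count = 0;
    while let Some(_frame) = capturer.next_frame() {
        count += 1;
    }
    count
}
```

`drain` is generic over `Capturer` and never mentions an operating system, which is the property the pipeline relies on: adding a Windows backend changes what `open_capturer` constructs, not the code that consumes frames.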
## What a Windows host needs (new code)
Each row is a Linux backend that needs a Windows sibling. Effort (S/M/L/XL) is the *implementation*
effort; every new backend plugs into the existing trait.
| Subsystem | Linux today | Windows equivalent | Effort | Notes |
|---|---|---|---|---|
| **Capture** | xdg ScreenCast portal → PipeWire (dmabuf) | **DXGI Desktop Duplication** (or Windows.Graphics.Capture) → D3D11 texture | M | DXGI gives a GPU `B8G8R8A8` texture directly |
| **Virtual display** | KWin/Mutter/Sway/gamescope protocols | **Indirect Display Driver (IDD)**, a UMDF driver built on IddCx | **XL** | ⚠️ **the blocker**: no application-level API; a C++ driver + **code signing** (test-sign or WHQL). Fallback: capture an existing monitor (loses the native-resolution feature) or a borderless window |
| **Encode** | `ffmpeg-next` NVENC, CUDA hwframes | Media Foundation H.264/HEVC/AV1, **or** NVENC on a D3D11 device: through ffmpeg with a D3D11VA hwdevice (`AVD3D11VADeviceContext`), or the NVENC SDK directly | M–L | `encode.rs` AU/codec logic + NVENC option strings are portable; only the hwdevice + frame-pool glue swaps |
| **Zero-copy bridge** | dmabuf → EGL/Vulkan → CUDA | D3D11 texture → NVENC (shared texture / `cudaImportExternalMemory` + D3D12 fence) | M | **optional** — a portable CPU-copy path already exists, so v1 can skip this |
| **Input (ptr/kbd)** | libei (RemoteDesktop portal) / wlr protocols | **SendInput** (which supersedes the legacy `keybd_event`/`mouse_event`) | S | the VK→evdev table just becomes VK→`VIRTUAL_KEY` (already Win32-native) |
| **Input (gamepads)** | uinput X-Box-360 pad + FF rumble | **ViGEm** (Virtual Gamepad Emulation) + HID reports | M | rumble back-channel maps to ViGEm notifications |
| **Audio capture** | PipeWire sink-monitor | **WASAPI loopback** (`IAudioCaptureClient`) | S–M | also produces interleaved f32, so the same `AudioCapturer` contract holds |
| **Virtual mic** | PipeWire `Audio/Source` | virtual audio device (VB-Cable-style WDM driver) or WASAPI render-to-fake-device | M | needs a driver or a bundled 3rd-party cable |
| **`sendmmsg` batching** | `gamestream/stream.rs` | already has a `cfg(not(linux))` per-packet fallback | n/a | nothing to do |

**Rough total: ~2,000–4,000 LOC of new Rust** (+ a C++ IDD driver if the virtual-display feature is
kept), spread over capture/encode/vdisplay/input/audio. Every reader rated the overall effort
**large**; the input+audio layer alone is *medium*.
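
To make the "small" rating of the pointer/keyboard row concrete, here is a hedged sketch of the key-code step. The function names `vk_to_evdev` and `vk_to_native` are hypothetical, and only four keys are shown; the numeric values are the standard Win32 virtual-key codes and Linux evdev key codes. On Linux the wire value must be translated; on Windows it is already what `SendInput` expects, so the step shrinks to a range check.

```rust
// Win32 virtual-key codes as they arrive from the client.
pub const VK_RETURN: u16 = 0x0D;
pub const VK_ESCAPE: u16 = 0x1B;
pub const VK_SPACE: u16 = 0x20;
pub const VK_A: u16 = 0x41;

/// Linux path: translate a virtual-key code to an evdev key code.
/// Only a few entries are shown; a real table covers the full keyboard.
pub fn vk_to_evdev(vk: u16) -> Option<u16> {
    match vk {
        VK_RETURN => Some(28), // KEY_ENTER
        VK_ESCAPE => Some(1),  // KEY_ESC
        VK_SPACE => Some(57),  // KEY_SPACE
        VK_A => Some(30),      // KEY_A
        _ => None,
    }
}

/// Windows path: the code is already native, so no table is needed.
/// Win32 documents valid virtual-key codes as the range 1..=254.
pub fn vk_to_native(vk: u16) -> Option<u16> {
    (1..=0xFE).contains(&vk).then_some(vk)
}
```

For example, `vk_to_evdev(VK_A)` yields `Some(30)` while `vk_to_native(VK_A)` passes `0x41` through unchanged. Building the actual `INPUT` structure and calling `SendInput` is left out because it needs Windows bindings.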
## Recommended phasing (when picked up)
1. **Phase 0 — "basic Windows host" (no virtual display).** Capture an *existing* monitor (DXGI
Desktop Duplication) → Media Foundation/NVENC encode → SendInput + WASAPI loopback. This proves
the whole stack on Windows with the smallest surface, reusing all of core/QUIC/GameStream/mgmt.
It loses the per-client native-resolution output but is a working Windows host quickly.
2. **Phase 1 — input + audio parity.** ViGEm gamepads + rumble; WASAPI virtual mic; D3D11→NVENC
zero-copy.
3. **Phase 2 — the virtual display (IDD).** The XL piece: a signed Indirect Display Driver that
surfaces a client-sized monitor, captured via DXGI. This restores punktfunk's differentiator on
Windows. Gated on solving driver signing/distribution.
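
The difference between Phase 0 and Phase 2 comes down to what the host captures. The sketch below is one way to express that choice; `Mode`, `CaptureTarget`, `choose_target`, and `is_native` are hypothetical names for illustration, not types from the host crate.

```rust
/// A display mode requested by a client (field names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
}

/// What the host ends up capturing.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CaptureTarget {
    /// Phase 2: a virtual output created at exactly the client's mode.
    Virtual(Mode),
    /// Phase 0: an existing monitor, at whatever mode it already runs.
    ExistingMonitor(Mode),
}

/// Assumed selection rule: prefer a virtual display when a backend for it
/// exists, otherwise fall back to capturing a physical monitor.
pub fn choose_target(client: Mode, monitor: Mode, has_virtual_display: bool) -> CaptureTarget {
    if has_virtual_display {
        CaptureTarget::Virtual(client)
    } else {
        CaptureTarget::ExistingMonitor(monitor)
    }
}

/// True when the captured output already matches the client's requested mode.
pub fn is_native(target: CaptureTarget, client: Mode) -> bool {
    match target {
        CaptureTarget::Virtual(m) | CaptureTarget::ExistingMonitor(m) => m == client,
    }
}
```

With a 2560x1440 client and a 1920x1080 monitor, the Phase 0 fallback returns `ExistingMonitor` and `is_native` is false, which is the "loses the per-client native-resolution output" trade-off; once an IDD backend exists, the same call returns `Virtual` at the client's mode.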
## Why it's deferred (not started now)
- It's **large**, and the virtual-display blocker (IDD) is a driver + signing problem
outside Rust, not "somewhat manageable" as a side effort.
- None of it is **buildable or testable on the Linux dev box**, so it would be unvalidated code.

The architecture is ready whenever the work is scheduled; this doc + the clean trait boundaries are
the down payment. Start at **Phase 0** for the fastest path to a working Windows host.