punktfunk/docs/roadmap.md
enricobuehler 12cf2e4e16
docs: refresh README/CLAUDE status; roadmap pairing-hardening + SudoVDA Windows
- README: replace the stale M0/M2-in-flight status with reality — M1 hardened, M2
  GameStream host live to stock Moonlight, M3 punktfunk/1 validated, M4 Apple first
  light, web console + unified host; FFmpeg 7/8; Bazzite-deployed. Layout adds
  web/, packaging/, native_pairing, dualsense.
- CLAUDE: protocol-growth item now reflects the unified host + web-console native
  pairing (done) and flags the next steps; layout updated.
- roadmap §7 Windows: de-risked via SudoVDA (the Sunshine Virtual Display Adapter) —
  no self-signed kernel IDD needed; the virtual-display backend drops XL→M.
- roadmap §8 (new) Pairing & trust hardening: mandatory PIN pairing by default
  (TOFU-open is insecure on a LAN) + delegated pairing approval (an already-paired
  device approves a new one, no out-of-band PIN).
- windows-host.md: SudoVDA path throughout (status, table, phasing, effort M not L).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:55:30 +00:00


punktfunk roadmap — next goals

Decided 2026-06-10 (research-grounded; see commit history), extended since.

Done & live (on main): #1 KDE reliability (Phase 1+2), #2 client compositor options (full stack incl. the macOS client), #4 mic passthrough, #5 touch (host path) + rich UHID DualSense — input + adaptive-trigger/LED feedback over the new 0xCC/0xCD planes + C ABI, Phase C/D/E live-validated. #3 Bazzite packaging (packaging/) deployed live on a Bazzite F43 box (builds against FFmpeg 7 or 8; gamescope capture → zero-copy NVENC, sub-ms latency; Sunshine replaced). Unified host: serve --native runs the GameStream host + the punktfunk/1 QUIC host in one process, with native pairing driven from the web console (arm → show PIN), not the service log. Advanced DualSense (audio-driven voice-coil) haptics scoped NO-GO (docs/dualsense-haptics.md).

Next: §8 pairing & trust hardening (mandatory PIN by default + delegated approval), the M4 client presenter + iOS (§6), and a Windows host (§7 — now de-risked via SudoVDA, no custom signed driver needed).

1. Reliable headless KDE/compositor spawning (done — Phase 1 + 2)

Startup is a chain of timing-sensitive handoffs with no readiness checks — each is a blind sleep, one-shot timeout, or silent fire-and-forget that fails into a black screen.

  • Phase 1 (S): replace run-headless-kde.sh's blind sleep 2 with an active readiness wait (kwin socket + wl_display roundtrip + zkde_screencast global advertised + KWIN_PID alive); add a punktfunk-host probe-compositor subcommand (reuses kwin.rs's registry roundtrip); move the portal restart to after readiness and precede it with systemctl --user import-environment + dbus-update-activation-environment (the missing env import — the Sway script does this, the KDE one doesn't).
  • Phase 2 (M): bounded retry-with-backoff around vd.create() + first-frame (permanent vs transient); a PipeWire negotiation watchdog with zero-copy→CPU auto-fallback ("no PipeWire frame within 10s" → recovery or precise diagnosis); fix set_custom_refresh to wait for the output, read back the active mode, reconcile encoder fps; harden gamescope node discovery + detect the known-bad-gamescope signature; graceful PipeWire-thread stop.
  • Phase 3 (L): supervised systemd user session (kwin + portal + host) with the readiness probe as an ExecStartPost gate, Restart=on-failure.
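
The Phase 2 "bounded retry-with-backoff (permanent vs transient)" idea can be sketched roughly as follows. This is a minimal illustration, not the host's actual code: `StartError`, `retry_with_backoff`, and the delay parameters are hypothetical names chosen for the example.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error classification; the real host's error types may differ.
#[derive(Debug, PartialEq)]
pub enum StartError {
    /// Worth retrying (e.g. the compositor is not ready yet).
    Transient(String),
    /// Retrying cannot help (e.g. a missing binary).
    Permanent(String),
}

/// Run `op` up to `max_attempts` times. After each transient failure the
/// delay doubles, capped at `max_delay`. A permanent error aborts at once.
pub fn retry_with_backoff<T>(
    max_attempts: u32,
    initial_delay: Duration,
    max_delay: Duration,
    mut op: impl FnMut(u32) -> Result<T, StartError>,
) -> Result<T, StartError> {
    let mut delay = initial_delay;
    let mut last = StartError::Permanent("max_attempts was 0".into());
    for attempt in 1..=max_attempts {
        match op(attempt) {
            Ok(value) => return Ok(value),
            Err(e @ StartError::Permanent(_)) => return Err(e),
            Err(e) => last = e, // transient: remember it and try again
        }
        if attempt < max_attempts {
            sleep(delay);
            delay = (delay * 2).min(max_delay);
        }
    }
    Err(last)
}
```

In this shape, a call such as `vd.create()` would be wrapped in the closure, and the caller decides which failures count as transient. The last transient error is returned once the attempt budget is spent, so the diagnosis stays precise.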

2. Offer available compositors in the client (done)

Host enumerates which backends are actually available (binary present + version OK: gamescope ≥3.16.22, KWin ≥6.5.6, gnome-shell, sway), advertises the list in the punktfunk/1 Welcome + a mgmt-API field; client sends its pick in the Hello; host honors it per session. Picker in the Apple client + web console.
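
A rough sketch of the "binary present + version OK" check, assuming the version string has already been probed from each binary. The function names, the table layout, and the backend keys are illustrative; only the minimum versions come from the paragraph above.

```rust
/// Minimum versions quoted in the roadmap; keys and layout are illustrative.
const MIN_VERSIONS: &[(&str, [u32; 3])] = &[
    ("gamescope", [3, 16, 22]),
    ("kwin", [6, 5, 6]),
];

/// Parse up to three dot-separated numeric components ("6.5.6" -> [6, 5, 6]).
/// Missing components count as 0; a non-numeric component yields None.
pub fn parse_version(s: &str) -> Option<[u32; 3]> {
    let mut out = [0u32; 3];
    for (i, part) in s.trim().split('.').take(3).enumerate() {
        out[i] = part.parse().ok()?;
    }
    Some(out)
}

/// `detected` is None when the binary is absent, else its version string.
pub fn backend_ok(name: &str, detected: Option<&str>) -> bool {
    let Some(raw) = detected else { return false };
    match MIN_VERSIONS.iter().find(|(n, _)| *n == name) {
        // Arrays compare lexicographically, so 3.16.4 < 3.16.22 as intended.
        Some((_, min)) => parse_version(raw).map_or(false, |v| v >= *min),
        // No minimum listed for this backend: presence is enough.
        None => true,
    }
}

/// Filter a list of probed (name, version) pairs down to usable backends.
pub fn available_backends<'a>(probed: &[(&'a str, Option<&'a str>)]) -> Vec<&'a str> {
    probed
        .iter()
        .filter(|(name, version)| backend_ok(name, *version))
        .map(|(name, _)| *name)
        .collect()
}
```

Comparing parsed components rather than raw strings matters here: a string comparison would rank "3.16.4" above "3.16.22". Version strings with suffixes (for example a distro release tag) are rejected by this sketch and would need extra handling.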

3. Bazzite / install on other devices (packaging written — packaging/)

Bazzite already ships gamescope + PipeWire + the NVIDIA driver (incl. libnvidia-encode); it's Fedora-atomic, and the community installs Sunshine there by layering a COPR RPM with rpm-ostree, which is the direct analog for punktfunk. Written: packaging/rpm/punktfunk.spec (builds the host from source), packaging/bootc/Containerfile (FROM bazzite-nvidia), packaging/bazzite/host.env (gamescope default), packaging/copr/ + packaging/README.md. The build itself is operator-run (COPR / a Fedora toolbox; not buildable on the Ubuntu dev box). LICENSE-{MIT,APACHE} added to match the declared dual license.

  • M-Bazzite-1: a COPR RPM (primary) — binary + 60-punktfunk.rules (→ /usr/lib/udev/rules.d) + systemd --user unit + host.env.example; Requires the NVENC ffmpeg-libs Bazzite already pulls; links host libcuda/libnvidia-encode directly. Install = rpm-ostree install + reboot + add to input/render. Default backend = Bazzite's already-present gamescope (minimal session plumbing).
  • M-Bazzite-2: wrap the RPM in a bootc/OCI image layer (FROM ghcr.io/ublue-os/bazzite-nvidia:stable) for the appliance/"just rebase" experience.
  • Flatpak only later as an explicitly-degraded convenience build (sandbox fights zero-copy NVENC/dmabuf/uinput).

4. Mic passthrough — client mic → host input device (done — host side)

The exact mirror of the host→client desktop-audio path. A PipeWire virtual source apps can select = a pw_stream with Direction::Output + media.class=Audio/Source.

  • New 0xCB MIC_AUDIO datagram (mirror of 0xC9) + NativeClient::send_audio + ABI punktfunk_send_audio.
  • audio/source_linux.rs — near-copy of the capture file, Direction::Output, fed from a jitter buffer (silence-fill underrun, Opus PLC).
  • Host mic_thread (Opus decode → ring → source); teardown RAII, set node.dont-reconnect.
  • Apple capture (AVAudioEngine → Opus). Opt-in + paired-only (a remote mic is a privacy surface). punktfunk/1-only.
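
The jitter buffer with silence-fill on underrun, mentioned in the list above, might look roughly like this. It is a simplified sketch: the frame size, the sequence-number keying, and all names are assumptions, sequence wraparound is ignored, and plain silence stands in for where the real path would ask the Opus decoder for packet-loss concealment.

```rust
use std::collections::BTreeMap;

/// Samples per frame; 480 = 10 ms of 48 kHz mono. An assumed value.
pub const FRAME_SAMPLES: usize = 480;

pub struct JitterBuffer {
    frames: BTreeMap<u32, Vec<i16>>, // decoded PCM keyed by sequence number
    next_seq: u32,                   // next sequence the playout side expects
    max_depth: usize,                // cap on buffered frames (bounds latency)
    pub underruns: u64,
}

impl JitterBuffer {
    pub fn new(first_seq: u32, max_depth: usize) -> Self {
        Self { frames: BTreeMap::new(), next_seq: first_seq, max_depth, underruns: 0 }
    }

    /// Network side: store a decoded frame. Frames older than the playout
    /// position arrived too late and are dropped.
    pub fn push(&mut self, seq: u32, pcm: Vec<i16>) {
        if seq < self.next_seq {
            return;
        }
        self.frames.insert(seq, pcm);
        // Over capacity: discard the oldest frames and skip playout ahead.
        while self.frames.len() > self.max_depth {
            let Some(oldest) = self.frames.keys().next().copied() else { break };
            self.frames.remove(&oldest);
            self.next_seq = self.next_seq.max(oldest + 1);
        }
    }

    /// Playout side: called once per frame period. Returns the expected
    /// frame, or silence if it has not arrived (an underrun).
    pub fn pop(&mut self) -> Vec<i16> {
        let frame = self.frames.remove(&self.next_seq);
        self.next_seq += 1;
        match frame {
            Some(pcm) => pcm,
            None => {
                self.underruns += 1;
                vec![0; FRAME_SAMPLES]
            }
        }
    }
}
```

The playout side always gets exactly one frame per period, so the PipeWire source never starves; the `underruns` counter gives a cheap signal for diagnostics.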

5. Touch + rich DualSense (decision: commit to full UHID DualSense)

  • Touch — implemented (host path), pending a backend that lands it. TouchDown/Move/Up InputKinds (reuse the abs-pointer flags=(w<<16)|h mapping, code=touch id); host inject/libei.rs requests the Touchscreen device type + binds the Touch capability and injects ei_touchscreen down/motion/up; punktfunk-client-rs --touch-test drags a finger. Validated: KWin's RemoteDesktop portal grants the Touchscreen device type, but its EIS server creates no touchscreen device (headless KWin) — so touch currently no-ops on KWin (now logged once). The code is correct; it needs a backend that exposes ei_touchscreen (gamescope / newer KWin / the real iPad client path) to land. wlroots: no virtual-touch wired.
  • Rich DualSense — HID backend built & validated live. inject/dualsense.rs: a hand-rolled /dev/uhid codec (no bindgen) presenting a genuine USB DualSense (vendor 054C/0CE6, the 232-byte inputtino report descriptor) bound by the kernel hid-playstation driver. The mandatory GET_REPORT feature handshake (calibration 0x05 / pairing 0x09 / firmware 0x20) is answered, so the kernel creates the full device (gamepad/motion/touchpad/lightbar). Input report 0x01 is built from gamepad frames; output report 0x02 is parsed for LED RGB, player LEDs, and adaptive trigger effects (L2/R2). Protocol carries new side-planes: rich-input 0xCC (touchpad/motion) + HID-output 0xCD (LED/triggers). /dev/uhid udev rule shipped.
  • Rich DualSense — Phase C/D/E end-to-end, validated live. PUNKTFUNK_GAMEPAD=dualsense selects a per-session DualSenseManager (the PadBackend enum in m3.rs): client gamepad frames build the DualSense report; the kernel's feedback comes back as HidOutput on the 0xCD plane (lightbar / player LEDs / adaptive triggers) while rumble stays on the universal 0xCA plane (so non-DualSense clients still feel it); touchpad + motion ride the 0xCC rich-input plane (DualSenseManager::apply_rich, merged with button state). The connector + C ABI gained punktfunk_connection_next_hidout (→ PunktfunkHidOutput) and punktfunk_connection_send_rich_input (← PunktfunkRichInput); header regenerated. Validated on-box: a synthetic-source m3-host + punktfunk-client-rs --rich-input-test created the real kernel DualSense, drove 0xCC, and decoded 12 live 0xCD events (the kernel's actual lightbar/trigger init reports) — data plane unaffected (600/600 frames). Remaining: the Apple client renders adaptive triggers + rumble on a real DualSense (GCDualSenseAdaptiveTrigger) — handed off to the client agent for the real playtest.
  • Advanced (audio-driven voice-coil) haptics: scoped, NO-GO for now (docs/dualsense-haptics.md). Driven by the DualSense's USB audio interface (4-ch, back 2 channels = haptic PCM), not HID, so the UHID backend structurally can't carry it. Three independent walls: host capture needs a kernel rebuild (CONFIG_USB_DUMMY_HCD is off → no UDC for an f_uac2 gadget); near-zero Linux supply (only ~510 Proton titles via custom Wine patches emit it; hid-playstation/Steam Input/RPCS3 don't); and the Apple client can't faithfully replay PCM haptics (CoreHaptics is discrete/pattern-based, no public channel-3/4 routing). Deferred; revisit only if a real DS for capture + a UDC/host path + a PCM-capable client all land. Adaptive triggers (HID, above) deliver the reachable 80%.
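
For the Touch item, the `flags=(w<<16)|h` mapping can be illustrated with a small pack/unpack pair. Only the bit layout comes from the text above; the function names and the `scale_to_output` helper (including where the x/y coordinates travel) are hypothetical.

```rust
/// Pack the client surface size into one u32: flags = (w << 16) | h.
/// Each dimension is limited to 16 bits by the layout.
pub fn pack_surface(w: u16, h: u16) -> u32 {
    ((w as u32) << 16) | (h as u32)
}

/// Inverse of `pack_surface`: returns (width, height).
pub fn unpack_surface(flags: u32) -> (u16, u16) {
    ((flags >> 16) as u16, (flags & 0xFFFF) as u16)
}

/// Hypothetical helper: map a point on the client surface onto a host
/// output of `out_w` x `out_h` pixels using integer scaling. Returns None
/// if the packed surface size is degenerate.
pub fn scale_to_output(x: u16, y: u16, flags: u32, out_w: u32, out_h: u32) -> Option<(u32, u32)> {
    let (w, h) = unpack_surface(flags);
    if w == 0 || h == 0 {
        return None;
    }
    // Widen to u64 so the multiplication cannot overflow.
    let sx = (x as u64 * out_w as u64 / w as u64) as u32;
    let sy = (y as u64 * out_h as u64 / h as u64) as u32;
    // Clamp to the last valid pixel on each axis.
    Some((sx.min(out_w.saturating_sub(1)), sy.min(out_h.saturating_sub(1))))
}
```

Carrying the surface size with each event lets the host scale coordinates without tracking per-client geometry separately, which is presumably why the abs-pointer mapping is reused for touch.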

6. iOS/iPadOS → tvOS (deferred)

PunktfunkKit is already platform-shared; iOS needs the UIViewRepresentable presenter twin + touch capture (#5) + UI. tvOS later.

7. Windows as a host (scoped — docs/windows-host.md; de-risked via SudoVDA)

Architecturally an "add a backend" job, not a parallel port: punktfunk-core (protocol/FEC/crypto/C-ABI) + QUIC + GameStream + mgmt + the m3/pipeline orchestration are all platform-agnostic and already cfg-isolated (~95% reuse). New #[cfg(windows)] backends behind the existing traits: capture (DXGI Desktop Duplication / Windows.Graphics.Capture), encode (Media Foundation / NVENC-SDK with a D3D11 context), input (SendInput + ViGEm), audio (WASAPI loopback + a virtual mic).
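
The "new cfg-gated backend behind an existing trait" pattern looks roughly like this. The trait and type names are placeholders, not punktfunk's real ones, and the backends are stubs that only report a name.

```rust
/// Hypothetical capture trait; the real trait names in punktfunk may differ.
pub trait Capture {
    fn name(&self) -> &'static str;
}

#[cfg(windows)]
mod platform {
    /// Stub standing in for a DXGI Desktop Duplication backend.
    pub struct DxgiCapture;
    impl super::Capture for DxgiCapture {
        fn name(&self) -> &'static str {
            "dxgi-desktop-duplication"
        }
    }
    pub fn default_capture() -> Box<dyn super::Capture> {
        Box::new(DxgiCapture)
    }
}

#[cfg(not(windows))]
mod platform {
    /// Stub standing in for the existing PipeWire capture backend.
    pub struct PipeWireCapture;
    impl super::Capture for PipeWireCapture {
        fn name(&self) -> &'static str {
            "pipewire"
        }
    }
    pub fn default_capture() -> Box<dyn super::Capture> {
        Box::new(PipeWireCapture)
    }
}

/// Platform-agnostic orchestration only ever sees the trait object.
pub fn start_session() -> String {
    let capture = platform::default_capture();
    format!("capturing via {}", capture.name())
}
```

Because both `platform` modules export the same `default_capture` function, the orchestration code compiles unchanged on either target; only the module selected by `cfg` differs.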

The old blocker is gone. Rather than author + sign our own kernel IDD for the per-client virtual display, use SudoVDA (the Sunshine Virtual Display Adapter) — a pre-built, signed Indirect Display Driver that creates virtual displays at arbitrary WxH@Hz on demand. The VirtualDisplay backend becomes "install + drive SudoVDA's control API" (M effort), not "write + WHQL-sign a kernel driver" (XL). That removes the only hard blocker — the Windows host is now a medium, mostly-mechanical port. Recommended start: Phase 0 — capture an existing monitor to prove the stack end to end; Phase 1 wires SudoVDA for the native-resolution output. Deferred only because it's unbuildable on the Linux dev box; the trait boundaries are already in the right places.

8. Pairing & trust hardening (next)

The unified host + web-console pairing (arm a window → display the host PIN → user enters it on the client) is built and live. Two changes harden it from "works" to "secure by default":

  • Mandatory PIN pairing by default. Today the punktfunk/1 host can run open (trust-on-first-use) — not acceptable on a shared LAN, where any reachable device could connect. The unified host should require_pairing out of the box: a client must complete the SPAKE2 PIN ceremony (one online guess, no offline attack) before any session. The operator arms a window and reads the PIN from the web console (already built); an explicit --open escape hatch covers trusted single-user setups. The wire is already in place (M3Options.require_pairing + the serve_session gate); this flips the default and threads it through serve --native and the mgmt arm endpoint.

  • Delegated pairing approval — the ergonomic enabler for "mandatory" (pair a new device without fetching the host PIN out of band):

    1. Device A is already paired (authenticated) to Host X.
    2. The user tries to connect Device B to Host X.
    3. Host X pushes a request to the authenticated Device A: "Allow Device B to pair with Host X?"
    4. The user approves/denies on Device A; on approve, Host X admits Device B — binding B's certificate fingerprint — with no PIN typed.

    Needs: a host→client pairing-approval-request (B's fingerprint + a human label) delivered to A's live connection (a QUIC side-plane message) or polled via the mgmt API; an approve/deny round-trip carrying an approval token; the host gating B's admission on it. The web console and the Apple client render the approval prompt. PIN pairing stays the bootstrap (the first device, or when no paired device is online to approve).
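
The delegated-approval flow in steps 1 to 4 can be sketched as a small host-side state machine. Everything here is illustrative: the type and method names are invented, `pair_via_pin` merely stands in for a completed PIN ceremony, and the counter-based token is NOT secure (a real token would need to be random and unguessable).

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq)]
pub enum Decision { Approved, Denied }

#[derive(Debug, PartialEq)]
pub enum PairError { NoApproverOnline, UnknownToken, ApproverNotPaired }

pub struct PendingRequest { pub fingerprint: String, pub label: String }

#[derive(Default)]
pub struct PairingHost {
    paired: HashSet<String>,   // certificate fingerprints admitted so far
    online: HashSet<String>,   // fingerprints with a live connection
    pending: HashMap<u64, PendingRequest>,
    next_token: u64,           // sketch only: predictable, not a real token
}

impl PairingHost {
    /// Bootstrap path: stands in for a completed PIN ceremony.
    pub fn pair_via_pin(&mut self, fp: &str) { self.paired.insert(fp.to_string()); }

    pub fn set_online(&mut self, fp: &str, online: bool) {
        if online { self.online.insert(fp.to_string()); } else { self.online.remove(fp); }
    }

    pub fn is_paired(&self, fp: &str) -> bool { self.paired.contains(fp) }

    /// Steps 2-3: device B asks to pair. Returns the token that would be
    /// pushed to an online, already-paired device along with B's label.
    pub fn request_pairing(&mut self, fp: &str, label: &str) -> Result<u64, PairError> {
        if !self.online.iter().any(|d| self.paired.contains(d)) {
            return Err(PairError::NoApproverOnline); // fall back to PIN pairing
        }
        self.next_token += 1;
        let req = PendingRequest { fingerprint: fp.to_string(), label: label.to_string() };
        self.pending.insert(self.next_token, req);
        Ok(self.next_token)
    }

    /// Step 4: a paired device answers. Ok(true) means B was admitted.
    pub fn resolve(&mut self, token: u64, approver: &str, decision: Decision) -> Result<bool, PairError> {
        // Check the approver first so an unpaired party cannot consume the request.
        if !self.paired.contains(approver) {
            return Err(PairError::ApproverNotPaired);
        }
        let req = self.pending.remove(&token).ok_or(PairError::UnknownToken)?;
        if decision == Decision::Approved {
            self.paired.insert(req.fingerprint);
            return Ok(true);
        }
        Ok(false)
    }
}
```

Each token is single-use: resolving it removes the pending request, so a replayed approval fails with `UnknownToken`. When no paired device is online, `request_pairing` returns `NoApproverOnline`, which matches PIN pairing staying the bootstrap path.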