Files
punktfunk/docs/roadmap.md
T
enricobuehler 6425f9995f
ci / rust (push) Has been cancelled
docs(roadmap): §10 HDR parked (upstream compositor blocker) + Bazzite dynamic-resolution landed
§10: HDR/10-bit is blocked at the capture source, not our stack — gamescope's PipeWire
node is hardcoded 8-bit (BGRx/NV12; confirmed in src/pipewire.cpp on the box's c31743d
build), issue #2126 open+unstarted; PipeWire >=1.6 needs Fedora 44 (Bazzite F44 is
testing-only with a confirmed NVIDIA Game-Mode crash, so a rebase clears only the
PipeWire wall). The realistic route is KWin MR !8293 (HDR PipeWire capture, draft),
i.e. the desktop path. Records the settled constraints (NVENC 10-bit max, HDR⟹HEVC
Main10, AV1 can't HDR on VideoToolbox) + the ready-to-build downstream design.

Also notes the landed Bazzite dynamic-resolution work (host-managed gamescope-session
at the client's exact res+refresh, c894c6f) + the macOS/iPad input and 4K/5K UDP-buffer
freeze fixes in the Done summary.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 17:52:57 +00:00

16 KiB
Raw Blame History

punktfunk roadmap — next goals

Decided 2026-06-10 (research-grounded; see commit history), extended since.

Done & live (on main): #1 KDE reliability (Phase 1+2), #2 client compositor options (full stack incl. the macOS client), #4 mic passthrough, #5 touch (host path) + rich UHID DualSense — input + adaptive-trigger/LED feedback over the new 0xCC/0xCD planes + C ABI, Phase C/D/E live-validated. #3 Bazzite packaging (packaging/) deployed live on a Bazzite F43 box (builds against FFmpeg 7 or 8; gamescope capture → zero-copy NVENC, sub-ms latency; Sunshine replaced). Unified host: serve --native runs the GameStream host + the punktfunk/1 QUIC host in one process, with native pairing driven from the web console (arm → show PIN), not the service log. Advanced DualSense (audio-driven voice-coil) haptics scoped NO-GO (docs/dualsense-haptics.md). Bazzite dynamic resolution (c894c6f): the host now manages a headless gamescope-session-plus Steam session at the client's exact resolution + refresh — games see it (via injected --nested-refresh + generated CVT modes, not the box's TV EDID), relaunched per-connection on a mode change, reused (no Steam restart) on the same mode. Plus macOS/iPad input fixes (NSEvent motion + iPad pointer-lock) and a 4K/5K one-frame-freeze fix (grow the UDP socket buffers).

Next: §8 pairing & trust hardening (mandatory PIN by default + delegated approval), the M4 client presenter + iOS (§6), and a Windows host (§7 — now de-risked via SudoVDA, no custom signed driver needed). §10 HDR/10-bit is parked — blocked upstream at the compositor (no gamescope/KWin PipeWire 10-bit producer yet).

1. Reliable headless KDE/compositor spawning (done — Phase 1 + 2)

Startup is a chain of timing-sensitive handoffs with no readiness checks — each is a blind sleep, one-shot timeout, or silent fire-and-forget that fails into a black screen.

  • Phase 1 (S): replace run-headless-kde.sh's blind sleep 2 with an active readiness wait (kwin socket + wl_display roundtrip + zkde_screencast global advertised + KWIN_PID alive); add a punktfunk-host probe-compositor subcommand (reuses kwin.rs's registry roundtrip); move the portal restart to after readiness and precede it with systemctl --user import-environment + dbus-update-activation-environment (the missing env import — the Sway script does this, the KDE one doesn't).
  • Phase 2 (M): bounded retry-with-backoff around vd.create() + first-frame (permanent vs transient); a PipeWire negotiation watchdog with zero-copy→CPU auto-fallback ("no PipeWire frame within 10s" → recovery or precise diagnosis); fix set_custom_refresh to wait for the output, read back the active mode, reconcile encoder fps; harden gamescope node discovery + detect the known-bad-gamescope signature; graceful PipeWire-thread stop.
  • Phase 3 (L): supervised systemd user session (kwin + portal + host) with the readiness probe as an ExecStartPost gate, Restart=on-failure.

2. Offer available compositors in the client (done)

Host enumerates which backends are actually available (binary present + version OK: gamescope ≥3.16.22, KWin ≥6.5.6, gnome-shell, sway), advertises the list in the punktfunk/1 Welcome + a mgmt-API field; client sends its pick in the Hello; host honors it per session. Picker in the Apple client + web console.

3. Bazzite / install on other devices (packaging written — packaging/)

Bazzite already ships gamescope + PipeWire + the NVIDIA driver (incl. libnvidia-encode); it's Fedora-atomic and the community installs Sunshine via COPR rpm-ostree — the analog. Written: packaging/rpm/punktfunk.spec (builds the host from source), packaging/bootc/Containerfile (FROM bazzite-nvidia), packaging/bazzite/host.env (gamescope default), packaging/copr/ + packaging/README.md. The build itself is operator-run (COPR / a Fedora toolbox; not buildable on the Ubuntu dev box). LICENSE-{MIT,APACHE} added to match the declared dual license.

  • M-Bazzite-1: a COPR RPM (primary) — binary + 60-punktfunk.rules (→ /usr/lib/udev/rules.d) + systemd --user unit + host.env.example; Requires the NVENC ffmpeg-libs Bazzite already pulls; links host libcuda/libnvidia-encode directly. Install = rpm-ostree install + reboot + add to input/render. Default backend = Bazzite's already-present gamescope (minimal session plumbing).
  • M-Bazzite-2: wrap the RPM in a bootc/OCI image layer (FROM ghcr.io/ublue-os/bazzite-nvidia:stable) for the appliance/"just rebase" experience.
  • Flatpak only later as an explicitly-degraded convenience build (sandbox fights zero-copy NVENC/dmabuf/uinput).

4. Mic passthrough — client mic → host input device (done — host side)

The exact mirror of the host→client desktop-audio path. A PipeWire virtual source apps can select = a pw_stream with Direction::Output + media.class=Audio/Source.

  • New 0xCB MIC_AUDIO datagram (mirror of 0xC9) + NativeClient::send_audio + ABI punktfunk_send_audio.
  • audio/source_linux.rs — near-copy of the capture file, Direction::Output, fed from a jitter buffer (silence-fill underrun, Opus PLC).
  • Host mic_thread (Opus decode → ring → source); teardown RAII, set node.dont-reconnect.
  • Apple capture (AVAudioEngine → Opus). Opt-in + paired-only (a remote mic is a privacy surface). punktfunk/1-only.

5. Touch + rich DualSense (decision: commit to full UHID DualSense)

  • Touch — implemented (host path), pending a backend that lands it. TouchDown/Move/Up InputKinds (reuse the abs-pointer flags=(w<<16)|h mapping, code=touch id); host inject/libei.rs requests the Touchscreen device type + binds the Touch capability and injects ei_touchscreen down/motion/up; punktfunk-client-rs --touch-test drags a finger. Validated: KWin's RemoteDesktop portal grants the Touchscreen device type, but its EIS server creates no touchscreen device (headless KWin) — so touch currently no-ops on KWin (now logged once). The code is correct; it needs a backend that exposes ei_touchscreen (gamescope / newer KWin / the real iPad client path) to land. wlroots: no virtual-touch wired.
  • Rich DualSense — HID backend built & validated live. inject/dualsense.rs: a hand-rolled /dev/uhid codec (no bindgen) presenting a genuine USB DualSense (vendor 054C/0CE6, the 232-byte inputtino report descriptor) bound by the kernel hid-playstation driver. The mandatory GET_REPORT feature handshake (calibration 0x05 / pairing 0x09 / firmware 0x20) is answered, so the kernel creates the full device (gamepad/motion/touchpad/lightbar). Input report 0x01 is built from gamepad frames; output report 0x02 is parsed for LED RGB, player LEDs, and adaptive trigger effects (L2/R2). Protocol carries new side-planes: rich-input 0xCC (touchpad/motion) + HID-output 0xCD (LED/triggers). /dev/uhid udev rule shipped.
  • Rich DualSense — Phase C/D/E end-to-end, validated live. PUNKTFUNK_GAMEPAD=dualsense selects a per-session DualSenseManager (the PadBackend enum in m3.rs): client gamepad frames build the DualSense report; the kernel's feedback comes back as HidOutput on the 0xCD plane (lightbar / player LEDs / adaptive triggers) while rumble stays on the universal 0xCA plane (so non-DualSense clients still feel it); touchpad + motion ride the 0xCC rich-input plane (DualSenseManager::apply_rich, merged with button state). The connector + C ABI gained punktfunk_connection_next_hidout (→ PunktfunkHidOutput) and punktfunk_connection_send_rich_input (← PunktfunkRichInput); header regenerated. Validated on-box: a synthetic-source m3-host + punktfunk-client-rs --rich-input-test created the real kernel DualSense, drove 0xCC, and decoded 12 live 0xCD events (the kernel's actual lightbar/trigger init reports) — data plane unaffected (600/600 frames). Remaining: the Apple client renders adaptive triggers + rumble on a real DualSense (GCDualSenseAdaptiveTrigger) — handed off to the client agent for the real playtest.
  • Advanced (audio-driven voice-coil) haptics — scoped, NO-GO for now (docs/dualsense-haptics.md). Driven by the DualSense's USB audio interface (4-ch, back 2 channels = haptic PCM), not HID — so the UHID backend structurally can't carry it. Three independent walls: host capture needs a kernel rebuild (CONFIG_USB_DUMMY_HCD is off → no UDC for an f_uac2 gadget); near-zero Linux supply (only ~510 Proton titles via custom Wine patches emit it; hid-playstation/Steam Input/RPCS3 don't); and the Apple client can't faithfully replay PCM haptics (CoreHaptics is discrete/pattern- based, no public channel-3/4 routing). Deferred; revisit only if a real DS for capture + a UDC/host path + a PCM-capable client all land. Adaptive triggers (HID, above) deliver the reachable 80%.

6. iOS/iPadOS → tvOS (deferred)

PunktfunkKit is already platform-shared; iOS needs the UIViewRepresentable presenter twin

  • touch capture (#5) + UI. tvOS later.

7. Windows as a host (scoped — docs/windows-host.md; de-risked via SudoVDA)

Architecturally an "add a backend" job, not a parallel port: punktfunk-core (protocol/FEC/ crypto/C-ABI) + QUIC + GameStream + mgmt + the m3/pipeline orchestration are all platform-agnostic and already cfg-isolated (~95% reuse). New #[cfg(windows)] backends behind the existing traits: capture (DXGI Desktop Duplication / Windows.Graphics.Capture), encode (Media Foundation / NVENC-SDK with a D3D11 context), input (SendInput + ViGEm), audio (WASAPI loopback + a virtual mic).

The old blocker is gone. Rather than author + sign our own kernel IDD for the per-client virtual display, use SudoVDA (the Sunshine Virtual Display Adapter) — a pre-built, signed Indirect Display Driver that creates virtual displays at arbitrary WxH@Hz on demand. The VirtualDisplay backend becomes "install + drive SudoVDA's control API" (M effort), not "write + WHQL-sign a kernel driver" (XL). That removes the only hard blocker — the Windows host is now a medium, mostly-mechanical port. Recommended start: Phase 0 — capture an existing monitor to prove the stack end to end; Phase 1 wires SudoVDA for the native-resolution output. Deferred only because it's unbuildable on the Linux dev box; the trait boundaries are already in the right places.

8. Pairing & trust hardening (next)

The unified host + web-console pairing (arm a window → display the host PIN → user enters it on the client) is built and live. Two changes harden it from "works" to "secure by default":

  • Mandatory PIN pairing by default — done & live (§8a, serve --native now requires pairing; serve --open disables it). An unpaired client is rejected at the session gate; pairing is via the SPAKE2 PIN ceremony (one online guess, no offline attack) armed from the web console. Validated live: unpaired → "this host requires pairing", then web-armed PIN → "client trusted". Deployed to the dev box + Bazzite.

  • Delegated pairing approval (next — the ergonomic enabler for "mandatory": pair a device without fetching the host PIN out of band). Target flow:

    1. Device A is already paired (authenticated) to Host X.
    2. The user tries to connect Device B to Host X.
    3. Host X surfaces a request: "Allow Device B to pair with Host X?"
    4. The user approves/denies; on approve, Host X admits Device B — binding B's certificate fingerprint — with no PIN typed.

    Two buildable layers:

    • §8b-1 (host + web — achievable now): an unpaired B that connects to an approval-enabled host is held as a pending request {id, name, fingerprint, requested_at} in NativePairing instead of a flat reject; mgmt gains GET /native/pending + POST /native/pending/{id}/{approve, deny}; the web console lists pending requests with Approve/Deny. The operator approves from the console — delegated approval via the management surface.
    • §8b-2 (peer push — needs the client): the host also pushes the pending request over a paired Device A's live QUIC connection (a new control-plane message); A's app renders the prompt and replies approve/deny — the user's exact "Device A gets a notification" flow. The native/Apple UI is a client-agent task.

    PIN pairing (§8a) stays the bootstrap — the first device, or when no approver is online.

9. Client→host network speed test (next — prerequisite for a user-settable bitrate)

Before exposing a user-settable bitrate, the client needs a way to measure what the network actually sustains between it and the host, so the bitrate picker is informed (suggest/cap a safe value) instead of guesswork that ends in a stuttering stream.

  • A short bandwidth probe over the punktfunk/1 data path: the host bursts a few seconds of FEC-encoded payload (sized like a real high-bitrate stream) while the client measures sustained throughput + packet loss + jitter, and reports the achievable Mbps (plus the FEC-recovered headroom — the GF(2¹⁶) Leopard wall-breaker is exactly what makes a high-loss link still usable).
  • Surface it in the client UI ("Test network" → "~XXX Mbps · recommended bitrate YYY") and the web console; feed the result into the bitrate control (clamp/suggest) once that lands.
  • Reuse the existing Session/FEC plumbing — a probe is just a non-video AU stream the client byte-counts + times; no new transport. Pairs with bitrate negotiation (bitrate in Hello/Welcome, alongside the mode renegotiation already in place).

10. HDR + 10-bit color (parked — blocked upstream at the compositor producer)

Opt-in HDR10 (BT.2020 + PQ, 10-bit) streaming. Designed end to end; blocked at capture, not in our stack — the compositor doesn't emit a 10-bit/HDR PipeWire frame on any shipping build. Spiked + researched 2026-06-11 (memory: hdr-blocked-gamescope-pipewire); the downstream design is ready to build the moment a producer lands.

  • The wall — gamescope capture is 8-bit. gamescope composites HDR for a display (--hdr-enabled, --hdr-debug-force-output), but its PipeWire capture node offers only BGRx/ NV12 (8-bit) — confirmed by reading src/pipewire.cpp build_format_params() on upstream master AND the box's exact build (c31743d); color is capped BT.601/709. Issue #2126 ("pipewire: add HDR streams") is OPEN + unstarted (no PR). Forcing HDR output does not change the capture format.
  • PipeWire ≥1.6 is the other prerequisite (HDR colortype transport). Fedora 43 ships 1.4.x; Fedora 44 ships 1.6.6 — but Bazzite F44 deck-nvidia is testing-only (:stable is still F43; :testing has a confirmed NVIDIA Game-Mode crash). Rebasing now clears only the PipeWire wall while gamescope stays 8-bit → no-go for HDR; revisit a rebase when F44 promotes to stable (for its own sake), not for HDR.
  • The realistic route is KWin, not gamescope: KWin MR !8293 is a live draft adding HDR PipeWire capture. That pulls HDR onto the desktop (KWin) path — trading away gamescope's Steam-Deck-UI polish + the dynamic-resolution work (§ above). Track #2126 and !8293.
  • Constraints (settled): NVENC tops out at 10-bit → no Main12, no 12-bit AV1. HDR ⟹ HEVC Main10 (Apple VideoToolbox decodes 10-bit HEVC but not 10-bit AV1). Static HDR10 SEI (BT.2020-PQ default) since the compositor won't surface per-frame metadata. Opt-in negotiation via the Hello/Welcome trailing-byte pattern (SDR default; client declares HDR want; host master toggle).
  • Downstream design (ready when capture unblocks): add P010 + ColorInfo to capture; 10-bit zero-copy import (GL_RGB10_A2/float dest for RGB10, or P010 straight through the Vulkan→CUDA path); hevc_nvenc -profile main10 + color/SEI metadata; opt-in Hello/Welcome + C ABI; Apple VideoToolbox Main10 decode + wantsExtendedDynamicRangeContent EDR present + SDR fallback.