Deep dive into the two GNOME-only host bugs (KWin/gamescope clean):
1. Stale-frame flashes (windows at old positions, typed text reverting):
Mutter renders its virtual monitors DIRECTLY into the PipeWire buffer
pool, and NVIDIA has no implicit dmabuf fencing — our zero-copy
import raced the render and encoded each pool buffer's PREVIOUS
contents. Fix, in order of preference:
- Consumer-side PipeWire explicit sync (SPA_META_SyncTimeline): new
drm_sync module (DRM timeline-syncobj wait/signal via raw ioctls,
unit-tested incl. a live signal->wait round trip); announced
post-format via update_params (the OBS pattern — at connect time
the meta makes producers fail allocation, observed on KWin), with
a blocks=3 Buffers filter so the producer's sync pod wins; acquire
point awaited before any read (GPU import or CPU mmap), release
point signaled on every path.
- Where the producer can't do explicit sync (Mutter on NVIDIA today:
no cogl sync_fd, "error alloc buffers"), a sticky fallback flips
the capture to the synchronous CPU/shm path — Mutter's glReadPixels
download orders against its render, so frames are correct by
construction. First session pays one ~10 s probe+retry; later
sessions go straight there. Validated live on home-worker-3
(GNOME 50 + RTX 4090): clean fallback, 30 MB HEVC streamed.
- Sync is only announced on Mutter sessions (new VirtualOutput.mutter
tag): KWin+NVIDIA fails allocation when merely asked, and doesn't
need it (verified unchanged: zero-copy CUDA import + 1.1 MB/10 s).
PUNKTFUNK_EXPLICIT_SYNC=0 disables the probe outright.
2. Clicks wedged in the focused app after disconnect+reconnect: a client
vanishing mid-press left keys/buttons latched in the compositor —
Mutter keeps the destroyed EIS device's implicit grab and the focused
app stops taking clicks until restarted. EiState now tracks held
keys/buttons/touches (wire codes) and synthesizes releases through
the normal inject path before the EIS connection goes away.
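The held-input tracking in item 2 can be sketched roughly as follows. This is a minimal illustration, not the actual EiState code: `InputEvent`, `HeldInputs`, `observe` and `release_all` are hypothetical names, and the real implementation routes the synthesized releases through the host's inject path rather than returning them.

```rust
use std::collections::BTreeSet;

/// One injected input event, identified by its wire code.
/// Hypothetical type for this sketch; not the real EiState API.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputEvent {
    Key { code: u32, pressed: bool },
    Button { code: u32, pressed: bool },
    Touch { id: u32, down: bool },
}

/// Tracks what the remote client currently holds down.
#[derive(Debug, Default)]
pub struct HeldInputs {
    keys: BTreeSet<u32>,
    buttons: BTreeSet<u32>,
    touches: BTreeSet<u32>,
}

impl HeldInputs {
    /// Record an event on its way to the compositor.
    pub fn observe(&mut self, ev: InputEvent) {
        let (set, id, held) = match ev {
            InputEvent::Key { code, pressed } => (&mut self.keys, code, pressed),
            InputEvent::Button { code, pressed } => (&mut self.buttons, code, pressed),
            InputEvent::Touch { id, down } => (&mut self.touches, id, down),
        };
        if held {
            set.insert(id);
        } else {
            set.remove(&id);
        }
    }

    /// Drain everything still held and return the matching release events
    /// (touches first, then buttons, then keys), leaving the tracker empty.
    pub fn release_all(&mut self) -> Vec<InputEvent> {
        let mut out = Vec::new();
        for id in std::mem::take(&mut self.touches) {
            out.push(InputEvent::Touch { id, down: false });
        }
        for code in std::mem::take(&mut self.buttons) {
            out.push(InputEvent::Button { code, pressed: false });
        }
        for code in std::mem::take(&mut self.keys) {
            out.push(InputEvent::Key { code, pressed: false });
        }
        out
    }
}
```

A caller would feed every injected event through `observe` and, just before tearing down the connection, inject whatever `release_all` returns. Calling `release_all` a second time yields nothing, so a double teardown is harmless in this sketch.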
GNOME hosts on NVIDIA temporarily lose zero-copy (correctness over
throughput); the moment Mutter+driver gain working explicit sync, the
sync path engages automatically and zero-copy returns.
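For the explicit-sync path in item 1, "acquire point awaited before any read, release point signaled on every path" maps naturally onto a drop guard. The sketch below is an assumption about shape only: `Timeline`, `BufferGuard` and `RecordingTimeline` are hypothetical names, and the trait stands in for the drm_sync module's real ioctl-backed wait/signal.

```rust
use std::cell::RefCell;

/// Stand-in for a DRM timeline syncobj (hypothetical trait for this sketch).
pub trait Timeline {
    fn wait(&self, point: u64) -> Result<(), String>;
    fn signal(&self, point: u64);
}

/// Read access to one pool buffer. Dropping the guard signals the release
/// point, so early returns and errors cannot skip it.
pub struct BufferGuard<'a, T: Timeline> {
    timeline: &'a T,
    release_point: u64,
}

impl<'a, T: Timeline> BufferGuard<'a, T> {
    /// Wait for the producer's acquire point before handing out the guard.
    pub fn acquire(
        timeline: &'a T,
        acquire_point: u64,
        release_point: u64,
    ) -> Result<Self, String> {
        // Build the guard first: if the wait fails, the guard is dropped
        // on the error return and the release point is still signaled.
        let guard = BufferGuard { timeline, release_point };
        timeline.wait(acquire_point)?;
        Ok(guard)
    }
}

impl<'a, T: Timeline> Drop for BufferGuard<'a, T> {
    fn drop(&mut self) {
        self.timeline.signal(self.release_point);
    }
}

/// Test double that records the order of operations.
#[derive(Default)]
pub struct RecordingTimeline {
    pub log: RefCell<Vec<String>>,
    pub fail_wait: bool,
}

impl Timeline for RecordingTimeline {
    fn wait(&self, point: u64) -> Result<(), String> {
        self.log.borrow_mut().push(format!("wait {}", point));
        if self.fail_wait {
            Err("wait failed".to_string())
        } else {
            Ok(())
        }
    }

    fn signal(&self, point: u64) {
        self.log.borrow_mut().push(format!("signal {}", point));
    }
}
```

The guard would be held for the duration of the GPU import or CPU mmap read. Tying the release to `Drop` is one way to make the "every path" guarantee structural rather than something each call site has to remember.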
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
punktfunk
A ground-up low-latency desktop streaming stack, built Linux-first, with a shared Rust protocol core and native clients per platform.
punktfunk is a placeholder codename. The bet: ship a Linux virtual-display streaming
host that speaks the existing Moonlight protocol (every Moonlight/Artemis client works
day one), then break the ~1 Gbps FEC wall with a GF(2¹⁶) Leopard-RS transport as a
negotiated extension. See docs/implementation-plan.md.
Status
| Milestone | State |
|---|---|
| M1 — punktfunk-core + C ABI | ✅ done & hardened (FEC, packetization, AES-GCM, session, adversarial-review fixes, punktfunk_core.h) |
| M2 — GameStream host → stock Moonlight | ✅ live end-to-end: pairing, RTSP, audio, per-client virtual output at native res, GPU zero-copy NVENC, gamepads |
| M3 — punktfunk/1 native protocol | ✅ validated live: QUIC control + GF(2¹⁶) FEC/AES data plane, SPAKE2 PIN pairing, mid-stream mode renegotiation |
| M4 — client decode + present (Apple) | 🟡 macOS first light: AnnexB→VideoToolbox HEVC on glass + input/pairing over punktfunk/1 (clients/apple); iOS + presenter next |
| Web console + management API | ✅ TanStack web console (web/) over the OpenAPI mgmt API: host status, paired devices, on-demand native pairing (arm → show PIN) |
The GameStream host works with a stock Moonlight client — validated live on NVIDIA
(RTX 5070 Ti & RTX 4090, driver 595): trust-on-first-use pairing that persists, an app
catalog, RTSP/ENet/audio, and video at the client's exact resolution and refresh via a
per-session virtual output (KWin, gamescope, Mutter, Sway backends), encoded with GPU
zero-copy (dmabuf → CUDA/Vulkan → NVENC) at up to 5120×1440@240. The native
punktfunk/1 protocol adds a QUIC control plane and a GF(2¹⁶) Leopard-FEC + AES-GCM data
plane (p50 ~0.8 ms capture→reassembled at 720p120), with a SPAKE2 PIN pairing ceremony. Both
run from one process (serve --native), managed through a REST API + web console. Builds
against FFmpeg 7 or 8; deployed live on Bazzite. Full status lives in CLAUDE.md.
The docs site (docs-site/, built with Fumadocs; start it with bun run dev) holds the
canonical roadmap and status, plus setup guides and progress. Design notes stay in docs/.
Layout
crates/
punktfunk-core/ protocol · FEC · pacing · crypto · quic — the C ABI (lib + cdylib + staticlib)
punktfunk-host/ Linux host: vdisplay · capture · encode · inject · gamestream · m3 · mgmt · native_pairing
punktfunk-client-rs/ punktfunk/1 reference client (M3 headless; M4 adds decode+present)
clients/{apple,android}/ native client scaffolds (import punktfunk_core.h); apple = macOS first light
web/ TanStack web console (host status · paired devices · pairing) over the mgmt API
packaging/ Fedora/Bazzite RPM · bootc image · COPR (see packaging/bazzite/README.md)
include/punktfunk_core.h cbindgen-generated C header (checked in)
tools/{latency-probe,loss-harness}/ measurement (plan §10)
docs/{implementation-plan,roadmap,windows-host,dualsense-haptics}.md
Build & test
cargo build --workspace # green on Linux and macOS
cargo test --workspace # unit + loopback + proptest + C ABI harness
cargo clippy --workspace --all-targets
cargo run -p loss-harness # FEC loss-resilience sweep (no network needed)
bash crates/punktfunk-core/tests/c/run.sh # standalone C-ABI link+round-trip proof
The C header regenerates from crates/punktfunk-core/src/abi.rs on every build (cbindgen via
build.rs) into include/punktfunk_core.h.
Design invariants
- One core, linked everywhere. Protocol/FEC/crypto/pacing live in punktfunk-core exactly once, exposed over a stable, versioned C ABI (punktfunk_abi_version(), PunktfunkConfig carries its own struct_size).
- No async on the hot path. The per-frame pipeline uses native threads only; tokio/quinn are gated behind the off-by-default quic feature (control plane only).
- FEC is the wall-breaker. GF(2⁸) (≤255 shards/block) for Moonlight compat; GF(2¹⁶) (≤65535 shards/block, SIMD, O(n log n)) to push past ~1 Gbps.
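To make the shard limits concrete, here is a rough sketch of how many FEC blocks one frame needs under each field size. It is not taken from punktfunk-core: `blocks_needed` and its parameters are hypothetical, the sketch assumes the per-block limit counts data and parity shards together, and the frame size, shard payload and parity ratio in the usage note are illustrative rather than measured.

```rust
/// Per-block shard limits as quoted in the invariants above.
pub const GF8_MAX_SHARDS: usize = 255;
pub const GF16_MAX_SHARDS: usize = 65_535;

/// Ceiling division for unsigned integers (d must be non-zero).
fn ceil_div(n: usize, d: usize) -> usize {
    (n + d - 1) / d
}

/// Number of FEC blocks needed to carry one frame.
///
/// Assumption for this sketch: `max_shards` bounds data + parity shards
/// per block, and parity is `parity_percent` of the data shards, rounded up.
/// Returns `None` if the inputs cannot produce a valid block.
pub fn blocks_needed(
    frame_bytes: usize,
    shard_payload: usize,
    parity_percent: usize,
    max_shards: usize,
) -> Option<usize> {
    if shard_payload == 0 {
        return None;
    }
    let data_shards = ceil_div(frame_bytes, shard_payload);
    // Largest data-shard count whose parity still fits in the same block.
    let max_data = (1..=max_shards)
        .rev()
        .find(|&d| d + ceil_div(d * parity_percent, 100) <= max_shards)?;
    Some(ceil_div(data_shards, max_data))
}
```

With a hypothetical 1,000,000-byte frame, 1,000-byte shard payloads and 20% parity, `blocks_needed` gives 5 blocks under the GF(2⁸) limit and 1 block under the GF(2¹⁶) limit. Fewer, larger blocks mean a burst of loss is spread over more shards, which is the intuition behind the larger field for high bitrates.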
License
MIT OR Apache-2.0.