Adds negotiated 5.1/7.1 surround to the punktfunk/1 protocol and every client (previously stereo-only): - core: new shared `audio` layout table (LAYOUT_51/71 + identity multistream mapping, canonical wire order FL FR FC LFE RL RR SL SR); Hello/Welcome `audio_channels` negotiation via the trailing-byte back-compat pattern (old peers fall back to stereo); C-ABI `punktfunk_connect_ex6`, `punktfunk_connection_audio_channels`, and in-core multistream decode `punktfunk_connection_next_audio_pcm` for embedders without a multistream Opus decoder. Real-libopus channel-identity round-trip test. - host: native audio thread captures + Opus-(multi)stream-encodes at the negotiated count (with a cross-session cached-capturer channel-mismatch fix); GameStream surround unified onto the safe `opus::MSEncoder`, dropping `audiopus_sys` (~4 unsafe blocks) and un-gating Windows GameStream surround; WASAPI loopback capture relaxed to 2/6/8 with the correct dwChannelMask. - clients: Linux (PipeWire), Windows (WASAPI), Android (AAudio) decode via `opus::MSDecoder` + render multichannel; Apple decodes in-core to PCM → AVAudioEngine with an explicit wire-order channel layout; each gains a Stereo/5.1/7.1 setting. `punktfunk-probe --audio-channels N` is the headless validator. Verified on Linux: core/host/linux/probe test suites + the Android Rust (cargo-ndk) build, clippy -D warnings, and rustfmt all green. Windows/Apple builds, all on-glass checks, and the live native loopback are pending (CI / a free box). Also lands the concurrent in-tree HEVC 4:4:4 host work (PUNKTFUNK_444): it shares the same touched files (quic.rs, punktfunk1.rs, encode/*, ...) and so cannot be committed separately from the surround changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
30 KiB
CLAUDE.md — punktfunk
Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protocol core
(punktfunk-core) exposed over a C ABI and native clients per platform. Full design:
design/implementation-plan.md. Status table: README.md.
Where the work stands
- Core (
punktfunk-core+ C ABI): complete and hardened. FEC recovery, loopback-under-loss, proptests, C ABI harness all green; 13 adversarial-review findings fixed + regression-tested (a913042). - GameStream host: working end-to-end with a stock Moonlight client. Validated live
on this box: pairing (persists across restarts), serverinfo/applist (app catalog from
~/.config/punktfunk/apps.json→ each entry picks a compositor + nested command), RTSP, ENet control, audio, and video at the client's native resolution and refresh — the host creates a per-session virtual output via per-compositorVirtualDisplaybackends: KWin (zkde_screencast stream_virtual_output, needs KWin ≥ 6.5.6 headless; >60 Hz via custom modes), gamescope (spawned headless at WxH@Hz, its PipeWire node captured, needs gamescope ≥ 3.16.22 — older deadlocks on PipeWire ≥ 1.6), Mutter (D-BusRecordVirtualvirtual monitor; validated live on headless GNOME Shell 50, zero-copy), Sway/wlroots (swaymsg create_output+ custom mode, xdpw portal capture with a managed chooser config; validated live on sway 1.11, zero-copy). Performance work landed and measured: GPU zero-copy on all paths (tiled dmabuf → EGL/GL → CUDA; LINEAR dmabuf → Vulkan bridge → CUDA → NVENC), auto 2-way NVENC split-encode above ~1 Gpix/s (5K@240), infinite GOP + RFI keyframes (killed the periodic freeze), encode|send thread split withsendmmsgbatching. Stable 240 fps at 5120×1440. Input: mouse/keyboard (libei via RemoteDesktop portal on KWin/GNOME, gamescope's own EIS socket, wlr protocols on Sway) and gamepads (uinput X-Box-360 pads + rumble back-channel; validated live — pad created/destroyed with the session). Management REST API + checked-in OpenAPI doc (mgmt.rs). Web-console performance capture (stats_recorder.rs, design:design/stats-capture-plan.md): the operator arms stats recording from the web console, plays, stops, and reviews the run as graphs (per-stage latency breakdown · fps new/repeat · goodput · loss/FEC). A sharedArc<StatsRecorder>ring (the hot-path gate is a runtimeAtomicBool, replacing the startup-onlyPUNKTFUNK_PERF) is fed by both the nativevirtual_streamand the GameStream encode loop at their existing ~2 s/~1 s aggregation boundary, and finished captures are saved as on-disk recordings (~/.config/punktfunk/captures/*.json) browsable/exportable from the console's Performance page (recharts). Endpoints/api/v1/stats/*(bearer-only). Implemented; not yet on-glass validated. - Native protocol (
punktfunk/1): full session planes, validated live. QUIC control plane (punktfunk-corequicfeature: Hello{mode}/Welcome{full Config}/Start), data plane = the hardened coreSessionover raw UDP with GF(2¹⁶) Leopard FEC + AES-GCM (inexpressible in GameStream), host creates the native virtual output at the client's requested mode.punktfunk1-hostis a persistent listener (sessions back to back;--max-sessions). QUIC datagrams carry the side planes, demuxed by first byte: input 0xC8 (incl. gamepads — incremental events accumulated into the uinput xpad), Opus audio 0xC9 (48 kHz stereo, 5 ms, host→client), rumble 0xCA (host→client). Trust: host serves its persistent identity (~/.config/punktfunk/cert.pem, shared with GameStream pairing) and logs the SHA-256 fingerprint; clients pin it, established by a SPAKE2 PIN pairing ceremony (host arms pairing and displays a 4-digit PIN; a PAKE binds both cert fingerprints so an attacker gets one online guess, no offline dictionary attack) — PIN pairing is the default for new hosts. TOFU on first connect (endpoint::client_pinned) stays as an explicit host opt-in (punktfunk1-host --allow-tofu/serve --open, advertised aspair=optional) for fully trusted LANs; clients only offer the TOFU "Trust" path for a host that advertisedpair=optional, route every other new host straight to the PIN ceremony, and on a pinned-fingerprint change force re-pairing (no re-TOFU shortcut). Clients present persistent identities via QUIC client auth, the host stores paired fingerprints (punktfunk1-paired.json) and gates sessions with--require-pairing(the default;--allow-tofu/--openaccept unpaired clients). LAN auto-discovery: bothserveandpunktfunk1-hostadvertise the native service over mDNS (_punktfunk._udp,crate::discovery) with TXTproto/fp(cert fingerprint to pin)/pair(required|optional)/id;punktfunk-probe --discoverlists hosts, Apple clients browse the same service via NWBrowser (validated cross-LAN 2026-06-12). Mid-stream mode renegotiation:Reconfigureon the still-open control stream — the host rebuilds output+encoder at the new mode in ~90 ms while the data plane runs on (validated live: one .h265 with 720p and 1080p segments). Measured on-box at 720p120: 1680/1680 frames, p50 0.83 ms capture→…→reassembled; audio measured live (~200 pkts/s). A wall-clock skew handshake (ClockProbe/ClockEcho, 8 NTP rounds afterStart,clock_offset_ns) aligns the client to the host clock, so that latency is now valid cross-machine (skew_corrected=true) — measured GNOME box → dev box over the LAN: p50 1.30 ms (the −1.57 ms inter-box clock offset removed).punktfunk-probeis the working reference client (--pin, datagram counters,--input-testincl. gamepad). The embeddable connector (NativeClient) exposes it all over the C ABI:punktfunk_connect(pin/TOFU) +next_au/next_audio/next_rumble/next_hidout/send_input/send_rich_input. Client-negotiated virtual pad type: the Hello carries a gamepad preference byte (same trailing-byte back-compat pattern as the compositor), the Welcome echoes the resolved backend — precedence: explicit client choice >PUNKTFUNK_GAMEPADenv > uinput Xbox 360. Backends: Xbox 360 (uinput / ViGEm), Xbox One/Series (the same XInput backend with the One/Series USB identity for matching glyphs — no extra game-visible capability; impulse-trigger rumble is unreachable through a virtual pad), and the UHIDhid-playstationpads — DualSense (adaptive triggers, lightbar, touchpad, motion) and DualShock 4 (lightbar, touchpad, motion, rumble; DualSense minus adaptive triggers / player LEDs / mute). DualSense and DualShock 4 each have a Linux (UHIDhid-playstation) and a Windows (UMDF minidriver) backend —inject/dualsense_windows.rs+inject/dualshock4_windows.rs, one driver serving either identity per adevice_typebyte the host stamps into shared memory (the DS4 reuses the same SwDeviceCreate game-detection identity fix as the DualSense). One/Series stays Linux-only and folds into Xbox 360 off it. Clients auto-resolve the type from the physical controller (DS5→DualSense, DS4→DualShock 4, Xbox One→Xbox One). Windows uses ZERO external gamepad dependencies — ViGEmBus is gone. Xbox 360 (XInput) runs on a UMDF2 XUSB companion driver (packaging/windows/xusb-driver/,inject/gamepad_windows.rs) that registersGUID_DEVINTERFACE_XUSBand answers the buffered XInput IOCTLs from a shared section, so classicXInputGetState/SetStatework with no kernel bus driver (validated live: slot connected, state + rumble round-trip; Xbox One folds to this 360 path). All three UMDF drivers (DualSense/DS4 + XUSB) are bundled + pnputil-installed by the Inno Setup installer (packaging/windows/gamepad-drivers/+install-gamepad-drivers.ps1). Multi-pad ready: the host stamps each pad's index into the device Location (pszDeviceLocation), the driver reads it (WdfDeviceAllocAndQueryProperty) to map its own*-shm-<index>, andUmdfHostProcessSharing=ProcessSharingDisabledgives each pad its own host (per-pad statics) — validated live with 2 distinct XInput slots + 2 DualSense pads. (Client-side multi-pad forwarding is the remaining piece.) - Windows host: implemented and shipping (all-vendor, x64-only).
#[cfg(windows)]backends behind the same traits as Linux — DXGI Desktop Duplication capture (capture/dxgi.rs), SudoVDA virtual display per session (vdisplay/sudovda.rs), GPU encode (NVENC--features nvenc; AMD/Intel--features amf-qsv), SendInput + ViGEm gamepads (inject/gamepad_windows.rs), WASAPI loopback- virtual mic (
audio/wasapi_*). Ships as a signed Inno Setup installer that registers aLocalSystemSCM service launching into the interactive session for secure-desktop (UAC/lock-screen) capture (service.rs), bundles the SudoVDA driver + the FFmpeg DLLs, and is published bywindows-host.yml. Encoder is GPU-aware (encode.rsopen_video+windows_resolved_backend):PUNKTFUNK_ENCODER=auto(the host.env default) detects the DXGI adapter vendor → NVENC (NVIDIA, direct SDK,encode/nvenc.rs), AMF (AMD) / QSV (Intel) via libavcodec (encode/ffmpeg_win.rs, the Windows analogue of the Linux VAAPI backend —WinVendor{Amf,Qsv}, system-memory NV12/P010 readback default + opt-in zero-copy D3D11 behindPUNKTFUNK_ZEROCOPYwith a system fallback), or software H.264 (encode/sw.rs, GPU-less). GameStream codec advertisement is probed per-GPU on AMF/QSV (windows_codec_support→serverinfo, AV1 gated). HDR (10-bit): WGC captures the HDR desktop as FP16/Rgb10a2 (DDA FP16 for the secure desktop), the encoder forces HEVC Main10 + BT.2020 PQ (NVENC ABGR10/P010; AMF/QSV P010 + a swscale Rgb10a2→P010 fallback), the client auto-detects PQ from the HEVC VUI — gated byPUNKTFUNK_10BIT+ clientVIDEO_CAP_10BIT; Windows host only (the Linux host stays 8-bit, blocked upstream). Vulkan-game HDR over the virtual display: NVIDIA/AMD Vulkan ICDs refuse to advertise an HDR color space for a surface on an IddCx indirect display (so Vulkan games — Doom: The Dark Ages, id Tech, etc. — say "device does not support HDR"), even though the ICD happily accepts + presents a forced HDR swapchain there. A tiny always-on Vulkan implicit layer (packaging/windows/pf-vkhdr-layer/,VK_LAYER_PUNKTFUNK_hdr_inject) injects theHDR10_ST2084/scRGB surface formats intovkGetPhysicalDeviceSurfaceFormats[2]KHR, self-gated on the display's actual advanced-color state (no-op on SDR / real monitors); bundled + HKLM-registered by the installer. Live-validated: Doom: The Dark Ages enables HDR over the virtual display. AMF/QSV is CI-green but not yet on-glass validated (no AMD/Intel Windows box in the lab); NVENC is live-validated. Newer/less battle-tested than the Linux host. Packaging:packaging/windows/.
- virtual mic (
What's left
- Native clients — decode + present: macOS stage 1 done, first light achieved
(2026-06-10). PunktfunkKit compiles and is tested on macOS (AnnexB → VideoToolbox →
AVSampleBufferDisplayLayer, GCMouse/GCKeyboard capture,PunktfunkClientapp shell); validated live Mac ↔ this box at 720p60 — vkcube on glass, input injected via gamescope EIS. The app speaks the full ABI v2 trust surface: Keychain-persisted client identity presented on every connect, SPAKE2 PIN pairing UI (host-card context menu + the trust prompt's "Pair with PIN instead…"), TOFU fingerprint prompt. Gamepads (2026-06-11): controller discovery + selection in Settings (GamepadManager— exactly one pad forwarded as pad 0, auto or pinned; pad TYPE auto-resolves from the physical controller, user-overridable), capture incl. DualSense touchpad/motion (GamepadCapture/GamepadWire), feedback rendering (rumble → CoreHaptics; lightbar / player LEDs / adaptive triggers →GCDeviceLight/playerIndex/GCDualSenseAdaptiveTriggervia the table-drivenDualSenseTriggerEffectparser). Loopback-tested end to end (PUNKTFUNK_TEST_FEEDBACK=1scripted burst); DualSense motion sign/scale derived, not yet live-verified. Tests:swift testinclients/apple(unit + real-codec round trip),test-loopback.sh(Swift client vs synthetic punktfunk1-hosts on loopback — runs on macOS; includes the pairing ceremony +--require-pairinggate),RemoteFirstLightTests(full pipeline over the LAN). Seeclients/apple/README.md. Stage 2 presenter (VTDecompressionSession+CAMetalLayer) is built and live-validated on glass behind the opt-inpunktfunk.presenterflag (~11 ms p50 capture→present), to become the default after a few resolution/HDR checks. Next: make stage 2 the default, glass-to-glass numbers viatools/latency-probe, iOS/iPadOS/tvOS variants. Linux stage 1 done, first light 2026-06-12 (clients/linux, binarypunktfunk-client): GTK4/libadwaita shell linkingpunktfunk-coredirectly (no C ABI;NativeClientis nowSync— mutexed plane receivers), mDNS host list, TOFU + SPAKE2 PIN dialogs (identity shared with client-rs), FFmpeg software HEVC decode (LOW_DELAY, slice threads) →GtkGraphicsOffload-wrapped picture, PipeWire playback (mic-player jitter ring inverted), SDL3 gamepad capture + rumble/lightbar feedback, keyboard via exact inverse of the host VK table, absolute mouse + 120-unit scroll. Validated live againstserveon this box: 1080p60, steady 60 fps, capture→decoded p50 ≈6.4 ms (debug build).--connect host[:port]for scripting. Swift-parity batch + stage 1.5 (2026-06-12 evening): capture state machine (click-to-capture, Ctrl+Alt+Shift+Q / focus-loss release, held-state flush), app-lifetime SDL gamepad service (pad pin UI, auto type from the physical pad, DualSense touchpad/motion 0xCC + raw-DS5-effects trigger/player-LED replay — needs a physical pad to live-verify), mic uplink (validated live), per-host speed test, compositor pref, native-display mode default, saved-hosts list, .deb + RPM-subpackage CI (deb.yml/rpm.yml). VAAPI decode → DRM-PRIME dmabuf →GdkDmabufTexture(BT.709 color state; Tier-1 zero-copy on Intel/AMD,PUNKTFUNK_DECODER=software|vaapioverride) with a proven fallback ladder — no VAAPI device (NVIDIA) or mid-session VAAPI error → software decode. First AMD test (Steam Deck) hit a green-screen bug, fixed: FFmpeg's VAAPI export usesSEPARATE_LAYERS, so NV12 arrives as two single-plane layers (R8 luma + GR88 chroma, one shared fd); the mapper tooklayers[0]only → GTK got a luma-only R8 texture, chroma read as 0 → green field / red whites. Fix derives the combined fourcc from the decodersw_format(→DRM_FORMAT_NV12) and flattens all planes across all layers (mpv's pattern); a first-frame descriptor dump logs the real layout. Awaiting Steam Deck reconfirm. Next: the stage-2 raw-Wayland presenter (wp_presentation feedback, tearing-control, Vulkan Video on NVIDIA) — wgpu/winit rejected (no dmabuf import / presentation feedback / shortcuts-inhibit). Windows stage 1 done 2026-06-15 (clients/windows, binarypunktfunk-client): pure-Rust WinUI 3 UI via windows-reactor (a declarative React-like framework backed by WinUI; PR #4499 added theSwapChainPanelwidget +set_swap_chain). The video is aSwapChainPanelbound to a D3D11 composition swapchain (WARP fallback for the GPU-less dev box; runtime-compiled fullscreen-triangle shaders, Contain-fit letterbox), driven by reactor's per-frameon_rendering. FFmpeg HEVC decode with a D3D11VA zero-copy hardware path (gpu.rsshares one D3D11 device — hardware+VIDEO_SUPPORT, WARP fallback, multithread-protected — between the decoder and presenter; the decoder outputs NV12/P010ID3D11Texture2Darray slices withBIND_SHADER_RESOURCEand the presenter samples them via per-plane SRVs + YUV→RGB shaders — NV12/BT.709, P010/BT.2020-PQ; software CPU decode stays as the robust fallback, auto-selected with aDecoderPrefoverride). HDR10: the client advertises 10-bit/HDR (Settings toggle), detects PQ in-band (transfer == SMPTE2084), and flips the swapchain toR10G10B10A2+ ST.2084 with HDR10 metadata. WASAPI render + mic capture, SDL3 gamepads (rumble/lightbar/DualSense),mdns-sddiscovery, and the full trust surface — all in-app: a polished WinUI shell (host cards w/ monogram + status pills,InfoBarerrors/hints,ToggleSwitchsettings, status-chip stream HUD showing GPU/CPU decode + HDR), host list (live mDNS + saved + manual), settings (resolution/refresh/decoder/bitrate/HDR/ mic), SPAKE2 PIN pairing screen, TOFU, pinned-fp-mismatch re-pair. (D3D11VA + HDR present + the GUI polish are written against the windows-rs/reactor APIs but not yet on-glass validated — the dev VM is headless/WARP; needs the RTX box.) Stream input is Win32 low-level hooks (WH_KEYBOARD_LL/WH_MOUSE_LL) — reactor exposes no raw key/pointer events; native Windows VK + absolute mouse (client-rect Contain-fit) + wheel, Ctrl+Alt+Shift+Q capture toggle.--headless/--discoverkeep CLI paths. Builds + clippy- fmt green on
x86_64-pc-windows-msvcandaarch64-pc-windows-msvc— the latter cross-compiled off the one x64 runner (no ARM64 runner; the x64 MSVC toolset's ARM64 cross compiler + a per-archFFMPEG_DIRARM64 tree, SDL3/libopus build-from-source cross-compile cleanly), and both ship as signed MSIX (windows-msix.ymlmatrix →..._x64.msix/..._arm64.msix, verified: ARM64 binaries + manifest arch). windows-reactor is unpublished (git dep pinned to commitb4129fcc;windowspinned to the SAME commit soIDXGISwapChain1unifies withset_swap_chain); itsbuild.rsdownloads the Win App SDK NuGets + needsCARGO_WORKSPACE_DIRset (in the VM build env;/temp+/winmdgitignored). Gotcha:CARGO_HOMEmust be an ASCII path — theüin the dev box's username breaks SDL3's MSVC precompiled-header build. Next: on-glass validation of the D3D11VA decode + HDR present + GUI on the RTX box (the dev VM is headless/Session-0/WARP → the WinUI window + hardware decode need a real display+GPU: RDP or the RTX box), then RAWINPUT relative-mouse pointer-lock and a per-host speed test in the UI. Android stage 1 done (clients/android, Kotlin app +native/Rust JNI core linkingpunktfunk-core; phone + Android TV): NDKAMediaCodechardware HEVC decode →SurfaceViewincl. HDR10 (Main10/BT.2020 PQ) with low-latency tuning + a live stats HUD (decode.rs/stats.rs), Opus/Oboe audio + mic uplink (audio.rs/mic.rs), gamepad input with rumble/HID feedback (feedback.rs), nativemdns-sdmDNS discovery (discovery.rs, polled over JNI — the same browse the Linux/Windows clients use, replacing the flaky per-OEMNsdManager; Kotlin keeps only theMulticastLock+ permission UX), SPAKE2 PIN pairing + TOFU (Keystore identity + known-host store), Compose UI (Connect/Settings/Stream) with D-pad/controller focus nav. Built forarm64-v8a+x86_64; published to Google Play (Internal Testing) viaandroid.yml(ci/play-upload.py). Next: real-device gamepad/HDR live-verify, presenter/latency polish.
- fmt green on
- Sub-frame pipelining: overlap encode and transmit within a frame. Requires a direct NVENC SDK wrapper (libavcodec only emits whole AUs) — the next big latency lever (~2–4 ms at high res).
- punktfunk/1 protocol growth. Done: unified host (
serve --gamestreamruns GameStream + the punktfunk/1 QUIC host in one process; bareserveis the secure native-only default — GameStream is opt-in, trusted-LAN only, security-review #5/#9) with native pairing driven over the mgmt API / web console (mod native_pairing: arm-on-demand → display PIN, paired-device list). Done: PIN pairing is the default, host-gated — the host requires pairing and advertisespair=requiredunless opted out with--allow-tofu/--open(thenpair=optional, accepts unpaired clients); clients render TOFU only for apair=optionalhost and force re-pairing on a fingerprint change. Done: concurrent sessions — the accept loop spawns each session (--max-concurrent, default 4, an NVENC bound), each with its own virtual output + encoder, sharing the host-lifetime input/audio/mic services (shared-desktop multi-view on kwin/mutter/wlroots). Done: delegated pairing approval (§8b-1) — an unpaired device shows up as a pending request in the web console, one click approves + pins it. Next (see roadmap): gamescope multi-user isolation (per-session input/audio = independent desktops); §8b-2 peer-push approval from a paired device's own app. - GameStream host polish: HDR/10-bit (needs HDR capture + metadata plumbing;
av1_nvenc -highbitdepth 1already encodes Main10 from 8-bit input on this box), reconnect-at-new-mode robustness. AV1 negotiation and surround audio are implemented and unit/live-capture tested — both still need a live Moonlight confirmation (select AV1 in a stock client; a real 5.1/7.1 listen incl. FEC under loss).
Box one-time setup is complete: udev rule + input group (gamepads validated live),
gamescope 3.16.22 installed system-wide (no PATH override), gnome-shell installed (Mutter
backend validated live). All three compositor backends are live-validated.
Build / test / run
cargo build --workspace # green on Linux and macOS
cargo test --workspace # unit + loopback + proptest + C ABI harness (~100 tests)
cargo clippy --workspace --all-targets -- -D warnings
cargo fmt --all --check
cargo run -p loss-harness # FEC loss-resilience sweep (no network needed)
bash crates/punktfunk-core/tests/c/run.sh # standalone C-ABI link + round-trip proof
Generated artifacts are checked in and CI fails on drift: include/punktfunk_core.h
(cbindgen from punktfunk-core/src/abi.rs) and api/openapi.json (regenerate with
cargo run -p punktfunk-host -- openapi > api/openapi.json; spec lives in mgmt.rs).
CI is Gitea Actions (.gitea/workflows/, guide: docs-site ci.md): ci.yml runs the
workspace checks inside the git.unom.io/unom/punktfunk-rust-ci image plus web/docs-site
build+typecheck; docker.yml builds+pushes the web/docs/rust-ci images (host and native
clients are deliberately NOT containerized); apple.yml builds the xcframework and runs
swift build/swift test on the macos-arm64 host-mode runner (home-mac-mini-1,
provisioned by scripts/ci/setup-macos-runner.sh). Per-client/host release workflows:
deb.yml/rpm.yml/flatpak.yml (Linux client), android.yml (Google Play), windows-msix.yml
(Windows client), windows-host.yml (Windows host installer), release.yml (Apple notarized DMG +
TestFlight), decky.yml (Steam Deck plugin); Windows builds run on a self-hosted Windows runner.
Layout
crates/punktfunk-core/ protocol · FEC · crypto · quic (punktfunk/1 control plane, feature-gated)
crates/punktfunk-host/
gamestream/ Moonlight compat: nvhttp · pairing · rtsp · control · stream · gamepad · apps
vdisplay/{kwin,gamescope,mutter,wlroots}.rs per-compositor client-sized virtual outputs
zerocopy/{egl,cuda,vulkan}.rs dmabuf → CUDA → NVENC (tiled via EGL/GL, LINEAR via Vulkan)
inject/{libei,wlr,gamepad,dualsense}.rs input backends (uinput xpad + UHID DualSense)
encode/{nvenc,linux,vaapi,ffmpeg_win,sw}.rs per-GPU encoders (NVENC · Linux NVENC/CUDA · VAAPI · AMF/QSV · openh264)
capture.rs · encode.rs · audio.rs · spike.rs · punktfunk1.rs · mgmt.rs · native_pairing.rs · stats_recorder.rs
clients/probe/ punktfunk/1 reference/probe client (headless test/measurement tool)
clients/linux/ native Linux client (GTK4/libadwaita · FFmpeg · PipeWire · SDL3)
clients/windows/ native Windows client (WinUI 3 via windows-reactor · D3D11 · WASAPI · SDL3)
clients/apple/ native macOS/iOS/tvOS client (Swift · VideoToolbox · GameController)
clients/android/ native Android client (Kotlin app + native/ Rust JNI core over punktfunk-core)
clients/decky/ Steam Deck Decky plugin
crates/punktfunk-host/src/{capture/dxgi,vdisplay/sudovda,encode/ffmpeg_win,inject/gamepad_windows,audio/wasapi_*,service}.rs Windows host backends
web/ TanStack web console over the mgmt API (status · devices · pairing · performance graphs)
packaging/ apt(deb) · RPM/COPR · Arch/sysext · Flatpak · Bazzite bootc · Windows host installer (per-dir READMEs)
tools/{loss-harness,latency-probe}/ measurement (plan §10)
scripts/ 60-punktfunk.rules · punktfunk-host.service · host.env.example · headless/
include/punktfunk_core.h generated C header
Design invariants — do not regress
- One core, linked everywhere. Protocol/FEC/crypto live only in
punktfunk-core, behind a stable, versioned C ABI.tokio/quinnexist only behind thequicfeature (control plane); no async on the per-frame path — native threads only. - Native client resolution, no scaling. A session gets a virtual output at exactly the
client's WxH@Hz via the
VirtualDisplaytrait (create(mode) → VirtualOutput { node_id, remote_fd, preferred_mode, keepalive }, RAII teardown). There is no cross-compositor protocol for this — each compositor keeps its own backend. - FEC is the wall-breaker. GF(2⁸) (≤255 shards/block, Moonlight-compatible) and GF(2¹⁶) Leopard (≤65535 shards/block) — punktfunk/1 negotiates the latter, removing the ~1 Gbps ceiling.
- Core security hardening stays intact: reassembler bounds attacker-controlled fields
before allocating (
ReassemblerLimits); AES-GCM per-direction nonce salts + seq-as-AAD; ABIstruct_sizechecks. Regression tests exist — keep them green. - PipeWire consumer discipline: our capture streams set
node.dont-reconnectand tear down promptly on negotiation timeout — one wedged link head-blocks the daemon's shared work queue system-wide.
Running on this box
Headless QEMU VM (Ubuntu 26.04, kernel 7.0), passthrough RTX 5070 Ti (driver 595 open
module — a kernel update silently drops it; reinstall nvidia-driver-595-open), no KMS
scanout → KWin --drm impossible; everything renders offscreen via renderD128.
# compositor session (shell 1, or the systemd unit in scripts/): full headless Plasma.
# The script sets XDG_MENU_PREFIX=plasma- & co. — without it plasmashell runs but the
# launcher menu is EMPTY (no apps, no System Settings).
bash scripts/headless/run-headless-kde.sh 1920x1080
# host (shell 2): bare `serve` is native-only (secure default); add --gamestream for Moonlight compat.
WAYLAND_DISPLAY=wayland-kde XDG_CURRENT_DESKTOP=KDE PUNKTFUNK_VIDEO_SOURCE=virtual \
PUNKTFUNK_ZEROCOPY=1 cargo run -rp punktfunk-host -- serve --gamestream
# punktfunk/1 native loopback test (no Moonlight needed; same env as serve, listener persists
# across sessions — bound it with --max-sessions):
cargo run -rp punktfunk-host -- punktfunk1-host --source virtual --seconds 10 --max-sessions 1
cargo run -rp punktfunk-probe -- --mode 1280x720x120 --out /tmp/a.h265 --input-test # + --pin HEX
Pinned crate facts: ashpd 0.13 + pipewire 0.9 (must match ashpd's) + ffmpeg-next 8.x
(ffmpeg-sys-next auto-detects the system FFmpeg, so it builds against FFmpeg 7.x/libavcodec 61
or 8.x/libavcodec 62 — validated live on Ubuntu 26.04 (8) and Bazzite F43 (7.1); the zero-copy
FFI also link-needs libGL/libgbm/libcuda at build time). Env knobs: PUNKTFUNK_VIDEO_SOURCE=virtual|portal,
PUNKTFUNK_COMPOSITOR=kwin|gamescope|mutter, PUNKTFUNK_ZEROCOPY=1, PUNKTFUNK_GAMESCOPE_APP=...,
PUNKTFUNK_INPUT_BACKEND=..., PUNKTFUNK_PERF=1 (per-stage timing), PUNKTFUNK_VIDEO_DROP=N (FEC
test), PUNKTFUNK_FEC_PCT=N, PUNKTFUNK_DSCP=1 (opt-in DSCP/SO_PRIORITY media QoS on the data +
GameStream video/audio sockets; no-op on the wire on Windows without a qWAVE policy),
PUNKTFUNK_444=1 (full-chroma HEVC 4:4:4, see below).
HEVC 4:4:4 (full chroma, Range Extensions): opt-in via PUNKTFUNK_444, negotiated like 10-bit —
the host emits 4:4:4 only when the client advertised VIDEO_CAP_444 (wire bit 0x04 + ABI
PUNKTFUNK_VIDEO_CAP_444), the codec is HEVC, the session is single-process, and a GPU probe
(encode::can_encode_444, run before the Welcome) confirms support — else it resolves to 4:2:0 and
Welcome::chroma_format reflects the real value (honest downgrade; the client reads it via
punktfunk_connection_chroma_format). punktfunk/1-native only — GameStream/Moonlight stays 4:2:0
(stock clients can't decode 4:4:4). NVENC is the implemented path: Linux hevc_nvenc feeds a
swscale'd yuv444p (RGB-in is always 4:2:0 — verified on the RTX 5070 Ti — so the session forces CPU
RGB capture for 4:4:4); Windows NVENC keeps ARGB input + FREXT profile + chromaFormatIDC=3 and the
DDA capturer delivers RGB. VAAPI / AMF / QSV decline (probe returns false — no validated 4:4:4
hardware in the lab; they'd produce 4:2:0). Software (openh264) is 4:2:0-only. Test with
PUNKTFUNK_CLIENT_444=1 punktfunk-probe --out x.h265 then ffprobe x.h265 (expect pix_fmt yuv444p).
Linux NVENC mechanism validated on the RTX 5070 Ti (ffmpeg CLI); Windows NVENC + 10-bit-4:4:4 not yet
on-glass validated.
Conventions
- Rust 2021,
rustfmt+clippy -D warningsclean before commit. - Match the surrounding code's comment density and naming.
- Commit messages end with the Co-Authored-By trailer (see
git log). pkillcaution on this box: match exact comm names (pkill -x gamescope-wl,pkill -x punktfunk-host) —pkill -fself-matches the invoking shell.