refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes

Two bodies of work in one commit (the rename moved files the fixes also touched).

Naming/structure cleanup (pre-launch):
- Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host,
  m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source->
  Punktfunk1Options/Punktfunk1Source.
- Clients consolidated out of crates/ into clients/: punktfunk-client-rs->
  clients/probe (crate punktfunk-probe), client-linux->clients/linux,
  client-windows->clients/windows, punktfunk-android->clients/android/native
  (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI
  contract is unchanged). crates/ now holds only core + host.
- Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site,
  kept only in docs/implementation-plan.md. docs/m2-plan.md->
  docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated.

Client loss-recovery (video froze and never recovered after a brief drop):
- Export punktfunk_connection_frames_dropped through the C ABI (the core already
  tracked it for the client keyframe-recovery loop; it was never reachable from
  the ABI clients). Regenerated punktfunk_core.h.
- Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll
  frames_dropped and request a keyframe when it climbs -- the same loss-driven
  recovery Linux/Windows already had. Under infinite GOP the decoder silently
  conceals reference-missing frames, so the decode-error trigger rarely fires.
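
A minimal, self-contained sketch of the throttled, loss-driven trigger described above. The
`KeyframeThrottle` type and its method names are hypothetical and exist only for illustration; in
the real clients the counter would come from `frames_dropped` and a `true` result would lead to a
keyframe request. The 100 ms interval matches the one used in the Android decode loop.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not part of punktfunk-core): decides when a rising
/// drop counter should trigger a keyframe request.
struct KeyframeThrottle {
    last_dropped: u64,
    last_request: Option<Instant>,
    min_interval: Duration,
}

impl KeyframeThrottle {
    fn new(initial_dropped: u64, min_interval: Duration) -> Self {
        Self { last_dropped: initial_dropped, last_request: None, min_interval }
    }

    /// Returns true when the caller should request a keyframe now.
    fn observe(&mut self, dropped: u64, now: Instant) -> bool {
        if dropped <= self.last_dropped {
            return false; // no new unrecoverable loss since the last poll
        }
        self.last_dropped = dropped;
        // Throttle: skip the request if one was issued less than min_interval ago.
        let due = match self.last_request {
            None => true,
            Some(t) => now.duration_since(t) >= self.min_interval,
        };
        if due {
            self.last_request = Some(now);
        }
        due
    }
}

fn main() {
    let start = Instant::now();
    let mut throttle = KeyframeThrottle::new(0, Duration::from_millis(100));
    // (cumulative dropped count, elapsed milliseconds) samples from a made-up session.
    let samples = [(0u64, 0u64), (3, 10), (5, 50), (6, 150)];
    for (dropped, ms) in samples {
        let now = start + Duration::from_millis(ms);
        if throttle.observe(dropped, now) {
            println!("t={ms}ms dropped={dropped}: request keyframe");
        }
    }
}
```

The throttle matters because the decode stays wedged for several frames until the IDR lands, so
an unthrottled trigger would flood the control stream.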

Apple rumble robustness (worked then went spotty -- DualSense + Xbox):
- Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio
  interruption / server reset) and drop the permanent `broken` latch on a
  transient drive failure; latch only when the controller truly has no haptics.
- Surface previously swallowed SDL set_rumble errors on Linux/Windows and add diagnostic logging.
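
The latch policy above can be modelled as a small state machine. This is a sketch in Rust under
stated assumptions, not the Swift implementation: `Rumble`, `RumbleState`, `Failure` and the
`try_build` closure are hypothetical names that stand in for the haptic engine setup and its
stopped/reset handlers.

```rust
/// Why driving rumble failed (hypothetical classification for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Failure {
    /// The controller exposes no haptics at all; retrying cannot help.
    NoHaptics,
    /// The engine was stopped/reset or a drive call failed; a rebuild may succeed.
    Transient,
}

#[derive(Debug, PartialEq)]
enum RumbleState {
    /// No engine built; build lazily on the next nonzero amplitude.
    Idle,
    /// Engine built and usable.
    Ready,
    /// Latched off until the controller changes.
    Unavailable,
}

struct Rumble {
    state: RumbleState,
}

impl Rumble {
    fn new() -> Self {
        Self { state: RumbleState::Idle }
    }

    /// A controller change clears the latch.
    fn retarget(&mut self) {
        self.state = RumbleState::Idle;
    }

    /// Only a missing haptics engine latches; anything else tears down for a rebuild.
    fn on_failure(&mut self, failure: Failure) {
        self.state = match failure {
            Failure::NoHaptics => RumbleState::Unavailable,
            Failure::Transient => RumbleState::Idle,
        };
    }

    /// Returns true if the amplitudes reached a live engine.
    /// `try_build` simulates engine setup and is only called when a build is needed.
    fn apply(
        &mut self,
        low: u16,
        high: u16,
        try_build: impl FnOnce() -> Result<(), Failure>,
    ) -> bool {
        if self.state == RumbleState::Unavailable {
            return false;
        }
        let active = low != 0 || high != 0;
        if active && self.state == RumbleState::Idle {
            match try_build() {
                Ok(()) => self.state = RumbleState::Ready,
                Err(failure) => {
                    self.on_failure(failure);
                    return false;
                }
            }
        }
        self.state == RumbleState::Ready
    }
}

fn main() {
    let mut rumble = Rumble::new();
    // A transient setup failure leaves the state Idle, so the next event retries.
    assert!(!rumble.apply(0x4000, 0, || Err(Failure::Transient)));
    assert!(rumble.apply(0x4000, 0, || Ok(())));
    // An engine stop/reset tears down; the next nonzero amplitude rebuilds.
    rumble.on_failure(Failure::Transient);
    assert!(rumble.apply(0, 0x2000, || Ok(())));
    // No haptics at all: latched until the controller changes.
    rumble.retarget();
    assert!(!rumble.apply(1, 1, || Err(Failure::NoHaptics)));
    assert!(!rumble.apply(1, 1, || Ok(())));
    rumble.retarget();
    assert!(rumble.apply(1, 1, || Ok(())));
    println!("final state: {:?}", rumble.state);
}
```

The point of the split is that a single hiccup costs one missed rumble event rather than the
rest of the session.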

Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift.
Not runnable on this box (verify in CI): Gitea workflows, gradle/Android,
flatpak, Swift/decky.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 21:03:55 +00:00
parent 1faa6c6ad4
commit 9c8fa9340c
110 changed files with 534 additions and 341 deletions
+1 -1
@@ -1,4 +1,4 @@
# Android client CI (Gitea Actions). Builds the Rust JNI core (crates/punktfunk-android) via
# Android client CI (Gitea Actions). Builds the Rust JNI core (clients/android/native) via
# cargo-ndk for both shipping ABIs and assembles the debug APK (clients/android). Mirrors apple.yml
# but on a Linux runner — the NDK is cross-platform, so no self-hosted host is needed.
#
+3 -1
@@ -26,7 +26,7 @@ on:
# The flatpak is the CLIENT — only rebuild when the client/core/manifest change, not on every
# docs/host push (this is a heavy flatpak-builder run). Tags (v*, the client release) build too.
paths:
- 'crates/punktfunk-client-linux/**'
- 'clients/linux/**'
- 'crates/punktfunk-core/**'
- 'packaging/flatpak/**'
- 'Cargo.lock'
@@ -40,6 +40,8 @@ env:
APP_ID: io.unom.Punktfunk
MANIFEST: packaging/flatpak/io.unom.Punktfunk.yml
PACKAGE: punktfunk-client-flatpak # generic-registry package name
REPO_URL: https://flatpak.unom.io # shared unom OSTree repo (reusable across unom apps)
DEPLOY_DIR: unom-flatpak # ~/<dir> on unom-1 (compose + ./site tree)
jobs:
build-publish:
+3 -3
@@ -5,7 +5,7 @@
# makeappx/signtool are baked into the runner's daemon env, same as windows.yml.
#
# Registry (public, unom org): https://git.unom.io/unom/-/packages (generic group)
# Packaging internals: crates/punktfunk-client-windows/packaging/README.md. BOM/MAX_PATH runner
# Packaging internals: clients/windows/packaging/README.md. BOM/MAX_PATH runner
# gotchas baked into the daemon env + windows.yml: see that workflow.
#
# Versioning — MSIX requires a strictly 4-part numeric version (no ~/- suffixes), so:
@@ -25,7 +25,7 @@ on:
push:
branches: [main]
paths:
- 'crates/punktfunk-client-windows/**'
- 'clients/windows/**'
- 'crates/punktfunk-core/**'
- 'Cargo.lock'
- 'Cargo.toml'
@@ -72,7 +72,7 @@ jobs:
MSIX_CERT_PFX_B64: ${{ secrets.MSIX_CERT_PFX_B64 }}
MSIX_CERT_PASSWORD: ${{ secrets.MSIX_CERT_PASSWORD }}
run: |
& crates/punktfunk-client-windows/packaging/pack-msix.ps1 `
& clients/windows/packaging/pack-msix.ps1 `
-Version $env:MSIX_VERSION -TargetDir C:\t\release -OutDir C:\t\msix
- name: Publish to Gitea generic registry
+2 -2
@@ -24,14 +24,14 @@ on:
push:
branches: [main]
paths:
- 'crates/punktfunk-client-windows/**'
- 'clients/windows/**'
- 'crates/punktfunk-core/**'
- 'Cargo.lock'
- 'Cargo.toml'
- '.gitea/workflows/windows.yml'
pull_request:
paths:
- 'crates/punktfunk-client-windows/**'
- 'clients/windows/**'
- 'crates/punktfunk-core/**'
- 'Cargo.lock'
- 'Cargo.toml'
+8
@@ -20,3 +20,11 @@ xcuserdata/
# Windows App SDK staging by windows-reactor build.rs
/temp/
/winmd/
# Client crate build artifacts (clients moved out of crates/ -> clients/ 2026-06-18)
/clients/*/target
/clients/*/*/target
# Python bytecode (e.g. clients/android/ci tooling)
__pycache__/
*.pyc
+24 -20
@@ -6,10 +6,10 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
## Where the work stands
- **M1 (`punktfunk-core` + C ABI): complete and hardened.** FEC recovery, loopback-under-loss,
- **Core (`punktfunk-core` + C ABI): complete and hardened.** FEC recovery, loopback-under-loss,
proptests, C ABI harness all green; 13 adversarial-review findings fixed +
regression-tested (`a913042`).
- **M2 (GameStream host): working end-to-end with a stock Moonlight client.** Validated live
- **GameStream host: working end-to-end with a stock Moonlight client.** Validated live
on this box: pairing (persists across restarts), serverinfo/applist (app catalog from
`~/.config/punktfunk/apps.json` → each entry picks a compositor + nested command), RTSP, ENet
control, audio, and video at the **client's native resolution and refresh** — the host
@@ -28,11 +28,11 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
socket, wlr protocols on Sway) and **gamepads** (uinput X-Box-360 pads + rumble
back-channel; validated live — pad created/destroyed with the session). Management REST API +
checked-in OpenAPI doc (`mgmt.rs`).
- **M3 (`punktfunk/1`, the native protocol): full session planes, validated live.** QUIC
- **Native protocol (`punktfunk/1`): full session planes, validated live.** QUIC
control plane (`punktfunk-core` `quic` feature: Hello{mode}/Welcome{full Config}/Start), data
plane = the hardened M1 `Session` over raw UDP with **GF(2¹⁶) Leopard FEC + AES-GCM**
plane = the hardened core `Session` over raw UDP with **GF(2¹⁶) Leopard FEC + AES-GCM**
(inexpressible in GameStream), host creates the native virtual output at the client's
requested mode. `m3-host` is a **persistent listener** (sessions back to back;
requested mode. `punktfunk1-host` is a **persistent listener** (sessions back to back;
`--max-sessions`). QUIC datagrams carry the side planes, demuxed by first byte: input
0xC8 (incl. **gamepads** — incremental events accumulated into the uinput xpad), **Opus
audio** 0xC9 (48 kHz stereo, 5 ms, host→client), **rumble** 0xCA (host→client). **Trust:**
@@ -41,15 +41,15 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
ceremony** (host arms pairing and displays a 4-digit PIN; a PAKE binds both cert fingerprints so an
attacker gets one online guess, no offline dictionary attack) — PIN pairing is the default for new
hosts. **TOFU on first connect** (`endpoint::client_pinned`) stays as an explicit host opt-in
(`m3-host --allow-tofu` / `serve --open`, advertised as `pair=optional`) for fully trusted LANs;
(`punktfunk1-host --allow-tofu` / `serve --open`, advertised as `pair=optional`) for fully trusted LANs;
clients only offer the TOFU "Trust" path for a host that advertised `pair=optional`, route every
other new host straight to the PIN ceremony, and on a pinned-fingerprint change force re-pairing
(no re-TOFU shortcut). Clients present persistent identities via QUIC client auth, the host stores
paired fingerprints (`punktfunk1-paired.json`) and gates sessions with `--require-pairing` (the
default; `--allow-tofu`/`--open` accept unpaired clients).
**LAN auto-discovery**: both `serve --native` and `m3-host` advertise the native service over
**LAN auto-discovery**: both `serve --native` and `punktfunk1-host` advertise the native service over
mDNS (`_punktfunk._udp`, `crate::discovery`) with TXT `proto`/`fp`(cert fingerprint to
pin)/`pair`(required|optional)/`id`; `punktfunk-client-rs --discover` lists hosts, Apple clients
pin)/`pair`(required|optional)/`id`; `punktfunk-probe --discover` lists hosts, Apple clients
browse the same service via NWBrowser (validated cross-LAN 2026-06-12).
**Mid-stream mode renegotiation**: `Reconfigure` on the still-open control stream — the
host rebuilds output+encoder at the new mode in ~90 ms while the data plane runs on
@@ -58,7 +58,7 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
(`ClockProbe`/`ClockEcho`, 8 NTP rounds after `Start`, `clock_offset_ns`) aligns the client to the
host clock, so that latency is now valid **cross-machine** (`skew_corrected=true`) — measured GNOME
box → dev box over the LAN: **p50 1.30 ms** (the 1.57 ms inter-box clock offset removed).
`punktfunk-client-rs` is the
`punktfunk-probe` is the
working reference client (`--pin`, datagram counters, `--input-test` incl. gamepad).
The embeddable connector (`NativeClient`) exposes it all over the C ABI: `punktfunk_connect`
(pin/TOFU) + `next_au`/`next_audio`/`next_rumble`/`next_hidout`/`send_input`/
@@ -69,7 +69,7 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
## What's left
1. **M4 — client decode + present: macOS stage 1 done, first light achieved
1. **Native clients — decode + present: macOS stage 1 done, first light achieved
(2026-06-10).** PunktfunkKit compiles and is tested on macOS (AnnexB → VideoToolbox →
`AVSampleBufferDisplayLayer`, GCMouse/GCKeyboard capture, `PunktfunkClient` app shell);
validated live Mac ↔ this box at 720p60 — vkcube on glass, input injected via gamescope
@@ -85,13 +85,13 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
Loopback-tested end to end (`PUNKTFUNK_TEST_FEEDBACK=1` scripted burst); DualSense
motion sign/scale derived, not yet live-verified. Tests: `swift test` in
`clients/apple` (unit + real-codec round trip),
`test-loopback.sh` (Swift client vs synthetic m3-hosts on loopback — runs on macOS;
`test-loopback.sh` (Swift client vs synthetic punktfunk1-hosts on loopback — runs on macOS;
includes the pairing ceremony + `--require-pairing` gate),
`RemoteFirstLightTests` (full pipeline over the LAN). See
[`clients/apple/README.md`](clients/apple/README.md). Next: stage 2 presenter
(`VTDecompressionSession` + `CAMetalLayer` frame pacing), glass-to-glass numbers via
`tools/latency-probe` (scaffold), iOS variant.
**Linux stage 1 done, first light 2026-06-12** (`crates/punktfunk-client-linux`, binary
**Linux stage 1 done, first light 2026-06-12** (`clients/linux`, binary
`punktfunk-client`): GTK4/libadwaita shell linking `punktfunk-core` directly (no C ABI;
`NativeClient` is now `Sync` — mutexed plane receivers), mDNS host list, TOFU + SPAKE2
PIN dialogs (identity shared with client-rs), FFmpeg software HEVC decode (LOW_DELAY,
@@ -118,7 +118,7 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
reconfirm. Next: the stage-2 raw-Wayland
presenter (wp_presentation feedback, tearing-control, Vulkan Video on NVIDIA) —
**wgpu/winit rejected** (no dmabuf import / presentation feedback / shortcuts-inhibit).
**Windows stage 1 done 2026-06-15** (`crates/punktfunk-client-windows`, binary
**Windows stage 1 done 2026-06-15** (`clients/windows`, binary
`punktfunk-client`): pure-Rust **WinUI 3** UI via **windows-reactor** (a declarative React-like
framework backed by WinUI; PR #4499 added the `SwapChainPanel` widget + `set_swap_chain`). The
video is a **`SwapChainPanel`** bound to a **D3D11 composition swapchain** (WARP fallback for
@@ -150,7 +150,7 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
unpaired clients); clients render TOFU only for a `pair=optional` host and force re-pairing on a
fingerprint change. Next (see roadmap): **delegated pairing approval** (an already-paired device
approves a new one).
4. **M2 polish**: HDR/10-bit (needs HDR capture + metadata plumbing; `av1_nvenc
4. **GameStream host polish**: HDR/10-bit (needs HDR capture + metadata plumbing; `av1_nvenc
-highbitdepth 1` already encodes Main10 from 8-bit input on this box),
reconnect-at-new-mode robustness. AV1 negotiation and surround audio are implemented
and unit/live-capture tested — both still need a live Moonlight confirmation (select
@@ -193,9 +193,13 @@ crates/punktfunk-host/
vdisplay/{kwin,gamescope,mutter,wlroots}.rs per-compositor client-sized virtual outputs
zerocopy/{egl,cuda,vulkan}.rs dmabuf → CUDA → NVENC (tiled via EGL/GL, LINEAR via Vulkan)
inject/{libei,wlr,gamepad,dualsense}.rs input backends (uinput xpad + UHID DualSense)
capture.rs · encode.rs · audio.rs · m0.rs · m3.rs · mgmt.rs · native_pairing.rs
crates/punktfunk-client-rs/ punktfunk/1 reference client (M3 headless test/measurement tool)
crates/punktfunk-client-linux/ native Linux client (GTK4/libadwaita · FFmpeg · PipeWire · SDL3)
capture.rs · encode.rs · audio.rs · spike.rs · punktfunk1.rs · mgmt.rs · native_pairing.rs
clients/probe/ punktfunk/1 reference/probe client (headless test/measurement tool)
clients/linux/ native Linux client (GTK4/libadwaita · FFmpeg · PipeWire · SDL3)
clients/windows/ native Windows client (WinUI 3 via windows-reactor · D3D11 · WASAPI · SDL3)
clients/apple/ native macOS/iOS client (Swift · VideoToolbox · GameController)
clients/android/ native Android client (Kotlin app + native/ Rust JNI core over punktfunk-core)
clients/decky/ Steam Deck Decky plugin
web/ TanStack web console over the mgmt API (status · devices · pairing)
packaging/ Fedora/Bazzite RPM · bootc · COPR (packaging/bazzite/README.md)
tools/{loss-harness,latency-probe}/ measurement (plan §10)
@@ -215,7 +219,7 @@ include/punktfunk_core.h generated C header
- **FEC is the wall-breaker.** GF(2⁸) (≤255 shards/block, Moonlight-compatible) and GF(2¹⁶)
Leopard (≤65535 shards/block) — punktfunk/1 negotiates the latter, removing the ~1 Gbps
ceiling.
- **M1 security hardening stays intact**: reassembler bounds attacker-controlled fields
- **Core security hardening stays intact**: reassembler bounds attacker-controlled fields
before allocating (`ReassemblerLimits`); AES-GCM per-direction nonce salts + seq-as-AAD;
ABI `struct_size` checks. Regression tests exist — keep them green.
- **PipeWire consumer discipline**: our capture streams set `node.dont-reconnect` and tear
@@ -240,8 +244,8 @@ PUNKTFUNK_ZEROCOPY=1 cargo run -rp punktfunk-host -- serve
# punktfunk/1 native loopback test (no Moonlight needed; same env as serve, listener persists
# across sessions — bound it with --max-sessions):
cargo run -rp punktfunk-host -- m3-host --source virtual --seconds 10 --max-sessions 1
cargo run -rp punktfunk-client-rs -- --mode 1280x720x120 --out /tmp/a.h265 --input-test # + --pin HEX
cargo run -rp punktfunk-host -- punktfunk1-host --source virtual --seconds 10 --max-sessions 1
cargo run -rp punktfunk-probe -- --mode 1280x720x120 --out /tmp/a.h265 --input-test # + --pin HEX
```
Pinned crate facts: `ashpd` 0.13 + `pipewire` 0.9 (must match ashpd's) + `ffmpeg-next` 8.x
+15 -15
@@ -2540,7 +2540,7 @@ dependencies = [
]
[[package]]
name = "punktfunk-android"
name = "punktfunk-client-android"
version = "0.0.1"
dependencies = [
"android_logger",
@@ -2571,20 +2571,6 @@ dependencies = [
"tracing-subscriber",
]
[[package]]
name = "punktfunk-client-rs"
version = "0.0.1"
dependencies = [
"anyhow",
"mdns-sd",
"opus",
"punktfunk-core",
"quinn",
"tokio",
"tracing",
"tracing-subscriber",
]
[[package]]
name = "punktfunk-client-windows"
version = "0.0.1"
@@ -2693,6 +2679,20 @@ dependencies = [
"xkbcommon",
]
[[package]]
name = "punktfunk-probe"
version = "0.0.1"
dependencies = [
"anyhow",
"mdns-sd",
"opus",
"punktfunk-core",
"quinn",
"tokio",
"tracing",
"tracing-subscriber",
]
[[package]]
name = "quick-error"
version = "1.2.3"
+4 -4
@@ -3,10 +3,10 @@ resolver = "2"
members = [
"crates/punktfunk-core",
"crates/punktfunk-host",
"crates/punktfunk-client-rs",
"crates/punktfunk-client-linux",
"crates/punktfunk-client-windows",
"crates/punktfunk-android",
"clients/probe",
"clients/linux",
"clients/windows",
"clients/android/native",
"tools/latency-probe",
"tools/loss-harness",
]
+11 -8
@@ -12,10 +12,10 @@ negotiated extension. See [`docs/implementation-plan.md`](docs/implementation-pl
| Milestone | State |
|-----------|-------|
| **M1 — `punktfunk-core` + C ABI** | ✅ done & hardened (FEC, packetization, AES-GCM, session, adversarial-review fixes, `punktfunk_core.h`) |
| **M2 — GameStream host → stock Moonlight** | ✅ live end-to-end: pairing, RTSP, audio, per-client virtual output at native res, GPU zero-copy NVENC, gamepads |
| **M3 — `punktfunk/1` native protocol** | ✅ validated live: QUIC control + GF(2¹⁶) FEC/AES data plane, SPAKE2 PIN pairing, mid-stream mode renegotiation |
| **M4 — client decode + present (Apple)** | 🟡 macOS first light: AnnexB→VideoToolbox HEVC on glass + input/pairing over `punktfunk/1` (`clients/apple`); iOS + presenter next |
| **Core — `punktfunk-core` + C ABI** | ✅ done & hardened (FEC, packetization, AES-GCM, session, adversarial-review fixes, `punktfunk_core.h`) |
| **GameStream host → stock Moonlight** | ✅ live end-to-end: pairing, RTSP, audio, per-client virtual output at native res, GPU zero-copy NVENC, gamepads |
| **Native protocol — `punktfunk/1`** | ✅ validated live: QUIC control + GF(2¹⁶) FEC/AES data plane, SPAKE2 PIN pairing, mid-stream mode renegotiation |
| **Native clients — decode + present** | 🟡 macOS first light: AnnexB→VideoToolbox HEVC on glass + input/pairing over `punktfunk/1` (`clients/apple`); iOS + presenter next |
| **Web console + management API** | ✅ TanStack web console (`web/`) over the OpenAPI mgmt API: host status, paired devices, on-demand native pairing (arm → show PIN) |
The **GameStream host works with a stock Moonlight client** — validated live on NVIDIA
@@ -26,7 +26,7 @@ per-session virtual output (KWin, gamescope, Mutter, Sway backends), encoded wit
**`punktfunk/1`** protocol adds a QUIC control plane and a GF(2¹⁶) Leopard-FEC + AES-GCM data
plane (p50 ~0.8 ms capture→reassembled at 720p120). Its trust model is **SPAKE2 PIN pairing by
default** — a new host requires the PIN ceremony; trust-on-first-use is an explicit host opt-in
(`m3-host --allow-tofu` / `serve --open`, advertised as `pair=optional`) for fully trusted LANs. Both
(`punktfunk1-host --allow-tofu` / `serve --open`, advertised as `pair=optional`) for fully trusted LANs. Both
run from **one process** (`serve --native`), managed through a REST API + web console. Builds
against FFmpeg 7 or 8; deployed live on Bazzite. Full status: [`CLAUDE.md`](CLAUDE.md);
roadmap, setup guides & progress: the docs site ([`docs-site/`](docs-site) — Fumadocs;
@@ -55,9 +55,12 @@ Building from source (below) is a fallback.
```
crates/
punktfunk-core/ protocol · FEC · pacing · crypto · quic — the C ABI (lib + cdylib + staticlib)
punktfunk-host/ Linux host: vdisplay · capture · encode · inject · gamestream · m3 · mgmt · native_pairing
punktfunk-client-rs/ punktfunk/1 reference client (M3 headless; M4 adds decode+present)
clients/{apple,android}/ native client scaffolds (import punktfunk_core.h); apple = macOS first light
punktfunk-host/ Linux host: vdisplay · capture · encode · inject · gamestream · punktfunk1 · mgmt · native_pairing
clients/
probe/ punktfunk/1 reference/probe client (headless test + latency measurement)
linux/ windows/ native desktop clients (Rust: GTK4 / WinUI 3, link punktfunk-core directly)
apple/ android/ Swift (macOS+iOS) · Kotlin app + native/ Rust JNI core
decky/ Steam Deck Decky plugin
web/ TanStack web console (host status · paired devices · pairing) over the mgmt API
packaging/ Fedora/Bazzite RPM · bootc image · COPR (see packaging/bazzite/README.md)
include/punktfunk_core.h cbindgen-generated C header (checked in)
+4 -4
@@ -11,7 +11,7 @@ machine, trust logic) instead of re-porting it into Kotlin.
| Side | Owns |
|------|------|
| **Rust** (`crates/punktfunk-android` → `libpunktfunk_android.so`) | the JNI seam, `NativeClient` (QUIC control + UDP data plane), AnnexB→`AMediaCodec` decode, Opus+Oboe audio, VK keymap, latency math, trust/pairing |
| **Rust** (`clients/android/native` → `libpunktfunk_android.so`) | the JNI seam, `NativeClient` (QUIC control + UDP data plane), AnnexB→`AMediaCodec` decode, Opus+Oboe audio, VK keymap, latency math, trust/pairing |
| **Kotlin** (`clients/android`) | Compose UI (host grid / settings / stream), `SurfaceView` lifecycle, input capture, `NsdManager` discovery, Keystore identity, permissions |
The single seam is `io.unom.punktfunk.kit.NativeBridge` ⇄ `Java_io_unom_punktfunk_kit_NativeBridge_*`.
@@ -19,7 +19,7 @@ The single seam is `io.unom.punktfunk.kit.NativeBridge` ⇄ `Java_io_unom_punktf
## Layout
```
crates/punktfunk-android/ Rust cdylib (workspace member)
clients/android/native/ Rust cdylib (workspace member)
src/lib.rs JNI_OnLoad + abiVersion/coreVersion (native-link proof)
src/session.rs session handle lifecycle (connect/close); plane pumps = TODO
@@ -63,7 +63,7 @@ The debug APK lands in `app/build/outputs/apk/debug/`. The scaffold screen calls
- **Scaffold (done):** Gradle modules, cargo-ndk wiring, JNI native-link proof, phone+TV-installable
manifest. `crates/punktfunk-core` `rcgen` switched to the `ring` backend so the client `.so` is
aws-lc-free.
- **Next (M4 Android stage 1):** video decode (`AMediaCodec` async → `SurfaceView`), audio
- **Next (Android stage 1):** video decode (`AMediaCodec` async → `SurfaceView`), audio
(Opus + Oboe + jitter ring), input capture → `send_input`, pairing/identity (Keystore-wrapped),
mDNS discovery, the phone/TV Compose UI. The Rust-side homes are stubbed in
`crates/punktfunk-android/src/session.rs` with port pointers to `crates/punktfunk-client-linux`.
`clients/android/native/src/session.rs` with port pointers to `clients/linux`.
+3 -3
@@ -32,7 +32,7 @@ dependencies {
}
// ------------------------------------------------------------------------------------------------
// cargo-ndk: cross-compile crates/punktfunk-android into this module's jniLibs/<abi>/ so the
// cargo-ndk: cross-compile clients/android/native (punktfunk-client-android) into this module's jniLibs/<abi>/ so the
// resulting libpunktfunk_android.so is packaged into the app (and any AAR this module produces).
// NDK r28+ aligns to 16 KB pages by default — no extra linker flags. Prereqs (see clients/android
// /README.md): `cargo install cargo-ndk` + `rustup target add aarch64-linux-android x86_64-linux-android`.
@@ -57,7 +57,7 @@ fun androidSdkDir(): String {
fun registerCargoNdk(taskName: String, release: Boolean) =
tasks.register<Exec>(taskName) {
group = "rust"
description = "cargo-ndk build of punktfunk-android (${if (release) "release" else "debug"})"
description = "cargo-ndk build of punktfunk-client-android (${if (release) "release" else "debug"})"
workingDir = repoRoot
val sdk = androidSdkDir()
// A GUI Android Studio launch does not source the login shell, so make cargo, the NDK, and
@@ -84,7 +84,7 @@ fun registerCargoNdk(taskName: String, release: Boolean) =
// Link against the minSdk-31 sysroot so libaaudio (API 26+) is found.
"--platform", "31",
"-o", file("src/main/jniLibs").absolutePath,
"build", "-p", "punktfunk-android",
"build", "-p", "punktfunk-client-android",
)
if (release) cmd += "--release"
commandLine(cmd)
@@ -3,7 +3,7 @@ package io.unom.punktfunk.kit
/**
* The single JNI seam to `libpunktfunk_android.so` (the Rust-heavy client core).
*
* Symbols are implemented in `crates/punktfunk-android`. This object is intentionally thin —
* Symbols are implemented in `clients/android/native`. This object is intentionally thin —
* all protocol logic lives in Rust (`punktfunk-core` + the connector); Kotlin only marshals.
*/
object NativeBridge {
@@ -1,5 +1,5 @@
[package]
name = "punktfunk-android"
name = "punktfunk-client-android"
description = "punktfunk Android client — JNI bridge ('nativecore') over punktfunk-core (Rust-heavy client model)"
version.workspace = true
edition.workspace = true
@@ -16,7 +16,7 @@ crate-type = ["cdylib"]
[dependencies]
# The whole protocol/transport/FEC/crypto + the embeddable NativeClient connector. `quic` pulls
# the punktfunk/1 control plane (now ring-only — no aws-lc, see punktfunk-core/Cargo.toml).
punktfunk-core = { path = "../punktfunk-core", features = ["quic"] }
punktfunk-core = { path = "../../../crates/punktfunk-core", features = ["quic"] }
jni = "0.21"
log = "0.4"
@@ -15,7 +15,7 @@ use punktfunk_core::client::NativeClient;
use punktfunk_core::error::PunktfunkError;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Duration;
use std::time::{Duration, Instant};
/// The decode loop. Runs on the `pf-decode` thread until `shutdown` is set or the session closes.
pub fn run(client: Arc<NativeClient>, window: NativeWindow, shutdown: Arc<AtomicBool>) {
@@ -56,6 +56,10 @@ pub fn run(client: Arc<NativeClient>, window: NativeWindow, shutdown: Arc<Atomic
let mut fed: u64 = 0;
let mut rendered: u64 = 0;
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it
// climbs.
let mut last_dropped = client.frames_dropped();
let mut last_kf_req: Option<Instant> = None;
while !shutdown.load(Ordering::Relaxed) {
match client.next_frame(Duration::from_millis(5)) {
Ok(frame) => {
@@ -74,6 +78,24 @@ pub fn run(client: Arc<NativeClient>, window: NativeWindow, shutdown: Arc<Atomic
Err(_) => break, // session closed
}
rendered += drain(&codec);
// Loss recovery: under infinite GOP the only recovery keyframe is one we request. The
// reassembler drops unrecoverable AUs (frames_dropped); the decoder then conceals the
// reference-missing delta frames that follow and renders them without error, so keying off
// a decode error rarely fires. Request an IDR when the drop count climbs, throttled — the
// decode stays wedged for several frames until the IDR lands, so requesting every frame
// would flood the control stream.
let dropped = client.frames_dropped();
if dropped > last_dropped {
last_dropped = dropped;
let now = Instant::now();
if last_kf_req.is_none_or(|t| now.duration_since(t) >= Duration::from_millis(100)) {
last_kf_req = Some(now);
let _ = client.request_keyframe();
log::debug!("decode: requested keyframe (loss recovery, dropped={dropped})");
}
}
if fed > 0 && fed % 300 == 0 {
log::info!("decode: fed={fed} rendered={rendered}");
}
@@ -11,7 +11,7 @@
//! Kotlin side), `nativeConnect` with identity + pin (TOFU / pinned), and `nativePair` (SPAKE2 PIN).
//!
//! TODO(M4 Android stage 1): client→host DualSense rich input (`send_rich_input`), mode
//! renegotiation. Port the remaining orchestration from `crates/punktfunk-client-linux`.
//! renegotiation. Port the remaining orchestration from `clients/linux`.
use jni::objects::{JObject, JString};
use jni::sys::{jboolean, jint, jlong};
+4 -4
@@ -20,8 +20,8 @@ full session: video AUs, **Opus audio** (`nextAudio()`), **rumble** (`nextRumble
**DualSense feedback** (`nextHidOutput()` — lightbar, player LEDs, adaptive-trigger
effects), input incl. gamepads + DualSense touchpad/motion (`sendTouchpad`/`sendMotion`),
and **cert pinning + TOFU** (`pinSHA256:`/`hostFingerprint`) — see
`m3.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`punktfunk-host m3-host`) is a persistent listener:
`punktfunk1.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`punktfunk-host punktfunk1-host`) is a persistent listener:
reconnect at will during development.
What's here, all compiled and tested on macOS (Xcode 26.5 / Swift 6.3):
@@ -127,10 +127,10 @@ bash test-loopback.sh # full loopback proof: builds punktfunk
# (synthetic source — runs on macOS), streams
# byte-verified frames into the Swift client
# against the real host (Linux box, see CLAUDE.md "Running on this box") — m3-host is a
# against the real host (Linux box, see CLAUDE.md "Running on this box") — punktfunk1-host is a
# persistent listener, reconnect at will:
# PUNKTFUNK_COMPOSITOR=gamescope PUNKTFUNK_GAMESCOPE_APP=vkcube PUNKTFUNK_ZEROCOPY=1 \
# cargo run -rp punktfunk-host -- m3-host --source virtual --seconds 60
# cargo run -rp punktfunk-host -- punktfunk1-host --source virtual --seconds 60
PUNKTFUNK_REMOTE_HOST=<box-ip> swift test --filter RemoteFirstLightTests # headless
# (+ PUNKTFUNK_REMOTE_PORT / PUNKTFUNK_REMOTE_COMPOSITOR=gamescope|kwin|… /
# PUNKTFUNK_REMOTE_PIN=<arming-pin> for the remote pairing test)
@@ -58,7 +58,13 @@ private final class RumbleRenderer: @unchecked Sendable {
private var controller: GCController?
private var low: Motor?
private var high: Motor?
// `broken` latches OFF only for a controller that genuinely has no haptics engine (an Xbox pad
// on an OS that doesn't expose rumble through GameController, a Siri Remote) -- nothing to retry
// until the controller changes. A transient engine failure does NOT latch it; it tears down for
// a lazy rebuild instead, so a single hiccup can't kill rumble for the whole session.
private var broken = false
/// Last logged active/silent state for a one-line transition log, not per-event spam.
private var wasActive = false
func retarget(_ c: GCController?) {
queue.async {
@@ -70,8 +76,14 @@ private final class RumbleRenderer: @unchecked Sendable {
func apply(low lowAmp: UInt16, high highAmp: UInt16) {
queue.async {
let active = lowAmp != 0 || highAmp != 0
if active != self.wasActive {
self.wasActive = active
log.debug(
"rumble: \(active ? "active" : "stop", privacy: .public) low=\(lowAmp, privacy: .public) high=\(highAmp, privacy: .public)")
}
guard !self.broken else { return }
if (lowAmp != 0 || highAmp != 0), self.low == nil, self.high == nil {
if active, self.low == nil, self.high == nil {
self.setup()
}
if self.high != nil {
@@ -92,7 +104,15 @@ private final class RumbleRenderer: @unchecked Sendable {
/// high = right/light -- the Xbox/XInput convention the wire carries); one combined
/// engine otherwise, driven by whichever amplitude is stronger.
private func setup() {
guard let haptics = controller?.haptics else { return }
guard let haptics = controller?.haptics else {
// No haptics engine at all an Xbox controller on an OS/firmware that doesn't expose
// rumble through GameController (works on Android via the standard Vibrator path, but
// Apple's support is controller/OS-dependent), or a Siri Remote. Nothing to retry until
// the controller changes; latch off (retarget clears it) and say so once.
log.info("rumble: active controller exposes no haptics engine — rumble unavailable")
broken = true
return
}
let localities = haptics.supportedLocalities
if localities.contains(.leftHandle), localities.contains(.rightHandle) {
low = makeMotor(haptics, .leftHandle)
@@ -100,13 +120,28 @@ private final class RumbleRenderer: @unchecked Sendable {
} else {
low = makeMotor(haptics, .default)
}
-if low == nil && high == nil {
-broken = true // no usable engine (e.g. Siri Remote): stay silent
+if low == nil, high == nil {
+// Haptics present but no engine could be built right now (server busy / a transient
+// error). Do NOT latch broken; the next nonzero amplitude retries setup().
+log.warning("rumble: haptics present but engine setup failed — will retry on next rumble")
}
}
private func makeMotor(_ haptics: GCDeviceHaptics, _ locality: GCHapticsLocality) -> Motor? {
guard let engine = haptics.createEngine(withLocality: locality) else { return nil }
+// The haptic server can stop or reset the engine out from under us: app backgrounding, an
+// audio-session interruption (a call, Siri, another audio app), or a server crash. Left
+// unhandled the players go dead and every later rumble throws, latching rumble off for the
+// rest of the session (the "rumble worked, then went spotty" failure). Tear down on the
+// serial queue so the next nonzero amplitude lazily rebuilds the engine, instead.
+engine.stoppedHandler = { [weak self] reason in
+log.info("rumble: haptic engine stopped (reason \(reason.rawValue, privacy: .public)) — will rebuild")
+self?.queue.async { self?.teardown() }
+}
+engine.resetHandler = { [weak self] in
+log.info("rumble: haptic engine reset — will rebuild")
+self?.queue.async { self?.teardown() }
+}
do {
try engine.start()
let event = CHHapticEvent(
@@ -141,14 +176,19 @@ private final class RumbleRenderer: @unchecked Sendable {
}
motor = m
} catch {
-log.warning("haptic update failed — rumble disabled: \(error, privacy: .public)")
+// A transient failure (the engine stopped/reset between its handler firing and now).
+// Tear down so the next nonzero amplitude rebuilds; do NOT latch rumble off for the
+// session (that was the old "spotty" behaviour).
+log.warning("rumble: haptic update failed — rebuilding: \(error, privacy: .public)")
teardown()
-broken = true
}
}
private func teardown() {
for m in [low, high].compactMap({ $0 }) {
+// Drop the handlers before stopping so stop() can't re-enter teardown via stoppedHandler.
+m.engine.stoppedHandler = nil
+m.engine.resetHandler = nil
try? m.player.stop(atTime: CHHapticTimeImmediate)
m.engine.stop()
}
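The latch policy in the rumble hunks above (latch `broken` only when the controller has no haptics at all; tear down and lazily rebuild on anything transient) can be read as a small state machine. The following is a minimal Rust sketch of that policy, not the Swift implementation: `RumbleState`, `Probe`, and every method name are hypothetical, and the `probe` closure stands in for whatever `setup()` discovers about the controller.

```rust
/// Outcome of one attempt to build a haptics engine (hypothetical stand-in
/// for what the Swift `setup()` observes).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Probe {
    /// The controller exposes no haptics at all: permanent until it changes.
    NoHaptics,
    /// Haptics exist but no engine could be built right now: worth retrying.
    TransientFailure,
    /// An engine was built.
    Ready,
}

#[derive(Debug, Default)]
pub struct RumbleState {
    /// Latched: nothing to retry until `retarget()`.
    broken: bool,
    /// An engine currently exists.
    engine_up: bool,
}

impl RumbleState {
    /// A new controller clears the latch and drops any engine.
    pub fn retarget(&mut self) {
        self.broken = false;
        self.engine_up = false;
    }

    /// Engine stop/reset or a failed update: tear down, but do not latch.
    pub fn teardown(&mut self) {
        self.engine_up = false;
    }

    /// Handle one rumble command. `probe` runs only when a rebuild is needed.
    /// Returns true when an engine exists to render the command.
    pub fn apply(&mut self, low: u16, high: u16, probe: impl FnOnce() -> Probe) -> bool {
        if self.broken {
            return false;
        }
        let active = low != 0 || high != 0;
        if active && !self.engine_up {
            match probe() {
                Probe::NoHaptics => {
                    self.broken = true; // the only path that latches
                    return false;
                }
                Probe::TransientFailure => return false, // next command retries
                Probe::Ready => self.engine_up = true,
            }
        }
        self.engine_up
    }
}
```

In this sketch a transient failure costs one dropped command, while a missing haptics engine silences rumble until `retarget()` is called.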
@@ -362,6 +362,21 @@ public final class PunktfunkConnection {
_ = punktfunk_connection_request_keyframe(h)
}
+/// Cumulative access units the host→client reassembler dropped as unrecoverable (FEC couldn't
+/// rebuild them). The video pump polls this and calls `requestKeyframe()` when it climbs: the
+/// correct loss trigger under the host's infinite GOP, where unrecoverable loss yields
+/// reference-missing delta frames the decoder *silently conceals* (a frozen / garbage picture,
+/// no decode error and no `.failed` layer), so a decode-error trigger rarely fires. Monotonic
+/// for the session; 0 after close. Cheap (an atomic load), so safe to poll every pump iteration.
+public func framesDropped() -> UInt64 {
+abiLock.lock()
+defer { abiLock.unlock() }
+guard let h = handle, !closeRequested else { return 0 }
+var out: UInt64 = 0
+_ = punktfunk_connection_frames_dropped(h, &out)
+return out
+}
/// The currently active session mode (updated by accepted `requestMode` switches).
public func currentMode() -> (width: UInt32, height: UInt32, refreshHz: UInt32) {
abiLock.lock()
@@ -113,8 +113,21 @@ public final class Stage2Pipeline {
let recovery = recovery
let thread = Thread {
var format: CMVideoFormatDescription?
+var lastFramesDropped = connection.framesDropped()
while token.isLive {
do {
+// Loss recovery (the primary recovery path). The reassembler drops unrecoverable
+// AUs (framesDropped) and the decoder then conceals the reference-missing delta
+// frames that follow, often rendering them WITHOUT an error callback, so the
+// onDecodeError trigger rarely fires after a real network blip. Ask the host for
+// a fresh IDR whenever the drop count climbs (throttled in KeyframeRecovery).
+// Polled every iteration so a total-loss drought recovers the moment packets
+// resume and the reassembler counts the gap.
+let dropped = connection.framesDropped()
+if dropped > lastFramesDropped {
+lastFramesDropped = dropped
+recovery.request()
+}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
@@ -46,27 +46,44 @@ final class StreamPump {
let thread = Thread {
var format: CMVideoFormatDescription?
var lastKeyframeRequest = Date.distantPast
+var lastFramesDropped = connection.framesDropped()
+// Coalesced host keyframe request: the decode stays wedged for several frames until
+// the IDR lands, so requesting on every frame would flood the control stream.
+func requestKeyframeThrottled() {
+let now = Date()
+if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
+connection.requestKeyframe()
+lastKeyframeRequest = now
+}
+}
while token.isLive {
do {
+// Loss recovery (the primary recovery path). Under the host's infinite GOP the
+// only recovery keyframe is one we request. The reassembler drops unrecoverable
+// AUs (framesDropped); the decoder then *conceals* the reference-missing delta
+// frames that follow (a frozen / garbage picture) WITHOUT flipping the layer to
+// .failed, so the .failed check below rarely fires after a real network blip.
+// Ask the host for a fresh IDR whenever the drop count climbs. Polled every
+// iteration (not just per AU) so a total-loss drought still recovers the moment
+// packets resume and the reassembler counts the gap.
+let dropped = connection.framesDropped()
+if dropped > lastFramesDropped {
+lastFramesDropped = dropped
+requestKeyframeThrottled()
+}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
format = f // refreshed on every IDR (mode changes included)
}
if layer.status == .failed {
-// Decode wedged: flush and re-gate on the next in-band parameter sets
-// (resuming with a delta frame can't recover), AND ask the host for a
-// fresh IDR. With the host's infinite GOP the next keyframe could be
-// far off, so without the request the picture stays frozen: the
-// intermittent first-connect freeze. Throttled: the layer stays .failed
-// across several polls until the IDR lands, and one request suffices.
+// Decode wedged hard (the cold-first-connect case: a lost/corrupt opening
+// IDR): flush and re-gate on the next in-band parameter sets (resuming with
+// a delta frame can't recover), AND ask the host for a fresh IDR. Throttled:
+// the layer stays .failed across several polls until the IDR lands.
layer.flush()
format = AnnexB.formatDescription(fromIDR: au.data)
-let now = Date()
-if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
-connection.requestKeyframe()
-lastKeyframeRequest = now
-}
+requestKeyframeThrottled()
}
guard let f = format,
let sample = AnnexB.sampleBuffer(au: au, format: f),
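The pump above combines two small mechanisms: it notices when a monotonic drop counter has climbed, and it coalesces the resulting keyframe requests so that at most one goes out per 250 ms. The following Rust sketch isolates both under stated assumptions; `DropWatcher` and `KeyframeThrottle` are hypothetical names, and the caller supplies the clock so the logic can be exercised without sleeping.

```rust
use std::time::{Duration, Instant};

/// Coalesces keyframe requests: at most one per `min_gap` (hypothetical helper).
pub struct KeyframeThrottle {
    min_gap: Duration,
    last: Option<Instant>,
}

impl KeyframeThrottle {
    pub fn new(min_gap: Duration) -> Self {
        Self { min_gap, last: None }
    }

    /// Returns true if a request should be sent at `now`, and records it.
    pub fn should_request(&mut self, now: Instant) -> bool {
        match self.last {
            // Still inside the quiet window after the previous request.
            Some(prev) if now.duration_since(prev) <= self.min_gap => false,
            _ => {
                self.last = Some(now);
                true
            }
        }
    }
}

/// Tracks a monotonic counter and reports when it has climbed since the last poll.
pub struct DropWatcher {
    last_seen: u64,
}

impl DropWatcher {
    pub fn new(initial: u64) -> Self {
        Self { last_seen: initial }
    }

    pub fn climbed(&mut self, current: u64) -> bool {
        if current > self.last_seen {
            self.last_seen = current;
            true
        } else {
            false
        }
    }
}
```

A pump loop would call `climbed(frames_dropped)` on every iteration and send a request only when `should_request(Instant::now())` also returns true.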
@@ -1,7 +1,7 @@
// Integration: the Swift wrapper against a real punktfunk/1 host over QUIC + UDP on loopback:
// the Swift twin of punktfunk-host's m3.rs::c_abi_connection_roundtrip, this time through the
// statically linked xcframework. Driven by clients/apple/test-loopback.sh, which builds and
-// starts `punktfunk-host m3-host --source synthetic` and sets PUNKTFUNK_LOOPBACK_PORT.
+// starts `punktfunk-host punktfunk1-host --source synthetic` and sets PUNKTFUNK_LOOPBACK_PORT.
import XCTest
@testable import PunktfunkKit
@@ -11,7 +11,7 @@ final class LoopbackIntegrationTests: XCTestCase {
guard let portStr = ProcessInfo.processInfo.environment["PUNKTFUNK_LOOPBACK_PORT"],
let port = UInt16(portStr)
else {
-throw XCTSkip("needs a running m3-host — use clients/apple/test-loopback.sh")
+throw XCTSkip("needs a running punktfunk1-host — use clients/apple/test-loopback.sh")
}
let conn = try PunktfunkConnection(
@@ -139,7 +139,7 @@ final class LoopbackIntegrationTests: XCTestCase {
guard let portStr = env["PUNKTFUNK_PAIRING_PORT"], let port = UInt16(portStr),
let pin = env["PUNKTFUNK_PAIRING_PIN"]
else {
-throw XCTSkip("needs an armed m3-host — use clients/apple/test-loopback.sh")
+throw XCTSkip("needs an armed punktfunk1-host — use clients/apple/test-loopback.sh")
}
let identity = try generateIdentity()
@@ -5,7 +5,7 @@
//
// Run (host side, on the Linux box):
// PUNKTFUNK_COMPOSITOR=gamescope PUNKTFUNK_GAMESCOPE_APP=vkcube PUNKTFUNK_ZEROCOPY=1 \
-// punktfunk-host m3-host --source virtual --seconds 120
+// punktfunk-host punktfunk1-host --source virtual --seconds 120
// Then here:
// PUNKTFUNK_REMOTE_HOST=192.168.1.70 swift test --filter RemoteFirstLightTests
@@ -54,7 +54,7 @@ final class RemoteFirstLightTests: XCTestCase {
func testRemoteAudioBothDirections() throws {
let env = ProcessInfo.processInfo.environment
guard let host = env["PUNKTFUNK_REMOTE_HOST"] else {
-throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start m3-host --source virtual there)")
+throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start punktfunk1-host --source virtual there)")
}
let port = env["PUNKTFUNK_REMOTE_PORT"].flatMap(UInt16.init) ?? 9777
@@ -106,7 +106,7 @@ final class RemoteFirstLightTests: XCTestCase {
func testRemoteStreamDecodesToPixels() throws {
let env = ProcessInfo.processInfo.environment
guard let host = env["PUNKTFUNK_REMOTE_HOST"] else {
-throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start m3-host --source virtual there)")
+throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start punktfunk1-host --source virtual there)")
}
let port = env["PUNKTFUNK_REMOTE_PORT"].flatMap(UInt16.init) ?? 9777
// PUNKTFUNK_REMOTE_COMPOSITOR=kwin|gamescope|… asks the host for a specific
@@ -22,10 +22,10 @@ trap 'kill "${HOST_PID:-}" "${PAIR_PID:-}" 2>/dev/null || true' EXIT
# The open host also scripts a feedback burst (rumble + DualSense hidout) right after the
# handshake, so the Swift test can assert the host→client feedback planes end to end.
HOME="$CFG/open" XDG_CONFIG_HOME="$CFG/open/.config" PUNKTFUNK_TEST_FEEDBACK=1 \
-target/release/punktfunk-host m3-host --port "$PORT" --source synthetic --frames 300 &
+target/release/punktfunk-host punktfunk1-host --port "$PORT" --source synthetic --frames 300 &
HOST_PID=$!
HOME="$CFG/paired" XDG_CONFIG_HOME="$CFG/paired/.config" \
-target/release/punktfunk-host m3-host --port "$PAIR_PORT" --source synthetic --frames 300 \
+target/release/punktfunk-host punktfunk1-host --port "$PAIR_PORT" --source synthetic --frames 300 \
--require-pairing >"$PAIR_LOG" 2>&1 &
PAIR_PID=$!
sleep 1
@@ -81,7 +81,7 @@ argv and a clear `client-not-found` error surface to the UI. The child PID is tr
installed and runnable on the Deck — via `.deb`/RPM/flatpak, or symlinked into
`~/.local/bin`.
- **avahi** (`avahi-daemon` + `avahi-browse`) for discovery — present on SteamOS/Bazzite.
-- A punktfunk/1 host on the LAN (`punktfunk-host serve --native` or `m3-host`).
+- A punktfunk/1 host on the LAN (`punktfunk-host serve --native` or `punktfunk1-host`).
## Build
@@ -15,7 +15,7 @@ path = "src/main.rs"
# Everything is Linux-gated so `cargo build --workspace` stays green on macOS (the Mac
# client lives in clients/apple); on other platforms this builds as a stub binary.
[target.'cfg(target_os = "linux")'.dependencies]
-punktfunk-core = { path = "../punktfunk-core", features = ["quic"] }
+punktfunk-core = { path = "../../crates/punktfunk-core", features = ["quic"] }
# UI shell. GraphicsOffload needs GTK ≥ 4.14; black-background ≥ 4.16. AlertDialog/
# PreferencesDialog need libadwaita ≥ 1.5.
@@ -536,7 +536,17 @@ fn run(
while let Ok((pad, low, high)) = connector.next_rumble(Duration::ZERO) {
if pad == 0 {
if let Some(p) = w.active_id().and_then(|id| w.opened.get_mut(&id)) {
-let _ = p.set_rumble(low, high, 5_000);
+// Surface a failed SDL rumble write: a swallowed error here (DualSense not in
+// the right HIDAPI mode, etc.) reads exactly like "rumble doesn't work". The
+// host logs the send side on 0xCA, so the two together pinpoint host-game vs
+// client-render.
+if let Err(e) = p.set_rumble(low, high, 5_000) {
+tracing::warn!(low, high, error = %e, "rumble: SDL set_rumble failed");
+} else {
+tracing::debug!(low, high, "rumble: rendered");
+}
+} else {
+tracing::debug!(low, high, "rumble: received but no active pad to render");
}
}
}
@@ -1,6 +1,6 @@
//! Client identity, the known-hosts (pinned fingerprint) store, and app settings.
//!
-//! The identity shares `~/.config/punktfunk/client-{cert,key}.pem` with `punktfunk-client-rs`
+//! The identity shares `~/.config/punktfunk/client-{cert,key}.pem` with `punktfunk-probe`
//! so a box pairs once whichever client it uses.
use anyhow::{anyhow, Context, Result};
@@ -1,6 +1,6 @@
[package]
-name = "punktfunk-client-rs"
-description = "punktfunk reference client (M4): VAAPI decode + wgpu/Vulkan present"
+name = "punktfunk-probe"
+description = "punktfunk reference/probe client: headless punktfunk/1 client for testing + latency measurement"
version.workspace = true
edition.workspace = true
rust-version.workspace = true
@@ -9,7 +9,7 @@ authors.workspace = true
repository.workspace = true
[dependencies]
-punktfunk-core = { path = "../punktfunk-core", features = ["quic"] }
+punktfunk-core = { path = "../../crates/punktfunk-core", features = ["quic"] }
quinn = "0.11"
tokio = { version = "1", features = ["rt-multi-thread", "net", "time", "macros"] }
anyhow = "1"
@@ -1,4 +1,4 @@
-//! `punktfunk-client-rs` — the reference client for `punktfunk/1` (M3): QUIC control plane, UDP data
+//! `punktfunk-probe` — the reference client for `punktfunk/1` (M3): QUIC control plane, UDP data
//! plane, input over QUIC datagrams. Two modes, decided by the host's Welcome:
//!
//! * **verification** (`frames > 0`, synthetic host): byte-checks deterministic test frames;
@@ -35,7 +35,7 @@
//! over mDNS, prints each (name, addr:port, pairing requirement, cert fingerprint to pin), and
//! exits without connecting.
//!
-//! Usage: `punktfunk-client-rs [--connect HOST:PORT] [--mode WxHxFPS] [--out FILE] [--input-test]
+//! Usage: `punktfunk-probe [--connect HOST:PORT] [--mode WxHxFPS] [--out FILE] [--input-test]
//! [--pin HEX] [--compositor NAME] [--gamepad NAME] | --discover [SECS]`
//! (M4 adds VAAPI decode + wgpu present on this skeleton.)
@@ -193,7 +193,7 @@ fn parse_args() -> Args {
pin,
remode,
pair: get("--pair").map(String::from),
-name: get("--name").unwrap_or("punktfunk-client-rs").to_string(),
+name: get("--name").unwrap_or("punktfunk-probe").to_string(),
compositor,
gamepad,
bitrate_kbps: get("--bitrate").and_then(|s| s.parse().ok()).unwrap_or(0),
@@ -337,7 +337,7 @@ fn discover(secs: u64) -> Result<()> {
println!("{row}");
}
println!(
-"\nconnect with: punktfunk-client-rs --connect <addr:port> [--pin <fp> | --pair <PIN>]"
+"\nconnect with: punktfunk-probe --connect <addr:port> [--pin <fp> | --pair <PIN>]"
);
}
Ok(())
@@ -13,13 +13,13 @@ name = "punktfunk-client"
path = "src/main.rs"
# Everything is Windows-gated so `cargo build --workspace` stays green on Linux/macOS (the
-# other native clients live in crates/punktfunk-client-linux and clients/apple); on other
+# other native clients live in clients/linux and clients/apple); on other
# platforms this builds as a stub binary. Mirrors the Linux client's cfg(target_os="linux")
# gating exactly.
[target.'cfg(windows)'.dependencies]
# The protocol core, linked directly (no C ABI) — same as the GTK Linux client. NativeClient
# is Sync (mutexed plane receivers), so it drops into a UI app cleanly.
-punktfunk-core = { path = "../punktfunk-core", features = ["quic"] }
+punktfunk-core = { path = "../../crates/punktfunk-core", features = ["quic"] }
# WinUI 3 UI via windows-reactor (a declarative React-like framework backed by WinUI). Its
# `build.rs` downloads the Windows App SDK NuGets and stages the bootstrap DLL + resources.pri
@@ -71,7 +71,7 @@ On the Windows runner / dev VM (MSVC + Windows SDK present), after a release bui
```powershell
cargo build --release -p punktfunk-client-windows
-pwsh -File crates/punktfunk-client-windows/packaging/pack-msix.ps1 `
+pwsh -File clients/windows/packaging/pack-msix.ps1 `
-Version 0.2.0.0 -TargetDir C:\t\release -OutDir C:\t\msix
```

[Image diffs omitted: four binary image files, each the same size before and after (3.1 KiB, 1.0 KiB, 1.6 KiB, 1.1 KiB)]

@@ -499,7 +499,17 @@ fn run(
while let Ok((pad, low, high)) = connector.next_rumble(Duration::ZERO) {
if pad == 0 {
if let Some(p) = w.active_id().and_then(|id| w.opened.get_mut(&id)) {
-let _ = p.set_rumble(low, high, 5_000);
+// Surface a failed SDL rumble write: a swallowed error here (DualSense not in
+// the right HIDAPI mode, etc.) reads exactly like "rumble doesn't work". The
+// host logs the send side on 0xCA, so the two together pinpoint host-game vs
+// client-render.
+if let Err(e) = p.set_rumble(low, high, 5_000) {
+tracing::warn!(low, high, error = %e, "rumble: SDL set_rumble failed");
+} else {
+tracing::debug!(low, high, "rumble: rendered");
+}
+} else {
+tracing::debug!(low, high, "rumble: received but no active pad to render");
}
}
}
@@ -83,7 +83,7 @@ fn main() {
}
/// `--headless --connect host[:port] …`: connect from the CLI, count frames, print stats — the
-/// Windows analogue of `punktfunk-client-rs`.
+/// Windows analogue of `punktfunk-probe`.
#[cfg(windows)]
fn run_headless_cli(args: &[String], identity: (String, String)) {
use punktfunk_core::config::{CompositorPref, GamepadPref, Mode};
@@ -241,18 +241,18 @@ fn discover_and_print() {
std::thread::sleep(Duration::from_millis(100));
}
if seen.is_empty() {
-println!(" (none found — is a host running with --native / m3-host?)");
+println!(" (none found — is a host running with --native / punktfunk1-host?)");
}
}
/// WinUI 3 / Direct3D11 / WASAPI / SDL3 are Windows turf; this stub keeps `cargo build
/// --workspace` green on Linux/macOS (the other native clients live in
-/// crates/punktfunk-client-linux and clients/apple).
+/// clients/linux and clients/apple).
#[cfg(not(windows))]
fn main() {
eprintln!(
"punktfunk-client-windows is Windows-only — the Linux client lives in \
-crates/punktfunk-client-linux, the macOS client in clients/apple"
+clients/linux, the macOS client in clients/apple"
);
std::process::exit(2);
}
@@ -10,7 +10,7 @@ repository.workspace = true
[lib]
name = "punktfunk_core"
-# `lib` — so punktfunk-host / punktfunk-client-rs / tools link it as a normal Rust crate.
+# `lib` — so punktfunk-host / punktfunk-probe / tools link it as a normal Rust crate.
# `staticlib` — `libpunktfunk_core.a` for the C test harness and static embedding.
# `cdylib` — `libpunktfunk_core.{so,dylib}` for Swift/Kotlin clients via the C ABI.
crate-type = ["lib", "cdylib", "staticlib"]
@@ -1494,6 +1494,35 @@ pub unsafe extern "C" fn punktfunk_connection_request_keyframe(
})
}
+/// Cumulative access units the host→client reassembler dropped as unrecoverable (FEC couldn't
+/// rebuild them). A video loop polls this and calls [`punktfunk_connection_request_keyframe`]
+/// when it climbs — the correct loss trigger under the host's infinite GOP, where unrecoverable
+/// loss yields reference-missing delta frames the decoder *silently conceals* (frozen / garbage
+/// picture, no decode error), so a decode-error trigger rarely fires. Monotonic for the session;
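+/// compare against the last observed value. On a NULL connection this returns `NullPointer` and
+/// leaves `out` untouched, so callers that want 0 in that case should pre-initialize `out` to 0.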
+///
+/// # Safety
+/// `c` is a valid connection handle; `out` is writable (NULL is skipped).
+#[cfg(feature = "quic")]
+#[no_mangle]
+pub unsafe extern "C" fn punktfunk_connection_frames_dropped(
+c: *const PunktfunkConnection,
+out: *mut u64,
+) -> PunktfunkStatus {
+guard(|| {
+let c = match unsafe { c.as_ref() } {
+Some(c) => c,
+None => return PunktfunkStatus::NullPointer,
+};
+unsafe {
+if !out.is_null() {
+*out = c.inner.frames_dropped();
+}
+}
+PunktfunkStatus::Ok
+})
+}
/// A speed-test measurement, filled by [`punktfunk_connection_probe_result`]. `done` is 0 until
/// the host's end-of-burst report lands, then 1 (the numbers are final). `throughput_kbps` is the
/// measured goodput to drive a bitrate choice from; `loss_pct` is the delivery loss at that rate.
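The drop-counter accessor in this hunk uses a common C ABI shape: a status code as the return value and the payload through an out-pointer, with NULL tolerated on both arguments. As the body shows, a NULL connection returns early before `out` is written, so a caller that wants 0 in that case has to pre-initialize its variable, as the Swift wrapper does with `var out: UInt64 = 0`. The sketch below mirrors that shape in self-contained Rust; `Conn`, `Status`, its discriminant values, and both function names are hypothetical stand-ins, not the crate's real types.

```rust
/// Hypothetical status enum; the real `PunktfunkStatus` values are not shown in the diff.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Status {
    Ok = 0,
    NullPointer = 1,
}

/// Hypothetical connection holding only the counter.
pub struct Conn {
    pub dropped: u64,
}

/// Same shape as the exported accessor: status return, value via out-pointer.
///
/// # Safety
/// `c` must be NULL or point to a live `Conn`; `out` must be NULL or writable.
pub unsafe fn frames_dropped(c: *const Conn, out: *mut u64) -> Status {
    let c = match unsafe { c.as_ref() } {
        Some(c) => c,
        None => return Status::NullPointer, // `out` is left untouched here
    };
    if !out.is_null() {
        unsafe { *out = c.dropped };
    }
    Status::Ok
}

/// Safe caller: pre-initialize `out` so a NULL connection still reads as 0.
pub fn frames_dropped_or_zero(c: Option<&Conn>) -> u64 {
    let ptr = c.map_or(std::ptr::null::<Conn>(), |r| r as *const Conn);
    let mut out = 0u64;
    // SAFETY: `ptr` is NULL or derived from a live reference; `out` is a local.
    let _ = unsafe { frames_dropped(ptr, &mut out) };
    out
}
```

Taking `Option<&Conn>` in the safe wrapper keeps the raw pointer inside one function, so callers cannot pass a dangling handle.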
@@ -1,10 +1,10 @@
-//! The embeddable `punktfunk/1` client connector (M4 groundwork), behind the `quic` feature.
+//! The embeddable `punktfunk/1` client connector, behind the `quic` feature.
//!
//! [`NativeClient::connect`] runs the full client side of the protocol — QUIC handshake
//! ([`crate::quic`]), UDP data plane ([`crate::session::Session`] on a native thread), input
//! datagrams — and hands the embedder a dead-simple surface: *pull reassembled access units,
//! push input events*. This is what the platform clients (SwiftUI/VideoToolbox, Android, …)
-//! link via the C ABI (`punktfunk_connect` & co. in [`crate::abi`]); `punktfunk-client-rs` is the
+//! link via the C ABI (`punktfunk_connect` & co. in [`crate::abi`]); `punktfunk-probe` is the
//! Rust-native consumer of the same flow.
//!
//! Threading: one worker thread owns a tokio runtime (QUIC control plane only — design
@@ -166,7 +166,7 @@ pub struct NativeClient {
/// kernel sees a high-QoS thread parked waiting on a lower-QoS one and the Thread Performance
/// Checker flags a priority inversion. Matching the producers to the consumers' QoS removes
/// the inversion without slowing the Swift side. No-op off Apple (the Linux client/host don't
-/// run a QoS scheduler, and `punktfunk-client-rs` doesn't care).
+/// run a QoS scheduler, and `punktfunk-probe` doesn't care).
#[cfg(target_vendor = "apple")]
fn pin_thread_user_interactive() {
// SAFETY: sets only the current thread's QoS class — always valid to call.
@@ -12,7 +12,7 @@
//! `frame_index`↔`frameIndex`, `stream_seq`↔`streamPacketIndex`,
//! (`block_index`,`block_count`)↔the `multiFecBlocks` nibbles, and
//! (`data_shards`,`recovery_shards`,`shard_index`)↔the `fecInfo` bitfield. We carry them
-//! as explicit fields rather than bit-packing; full GameStream wire-exactness is an M2
+//! as explicit fields rather than bit-packing; full GameStream wire-exactness is a GameStream-host
//! concern (it also needs RTP framing + RTSP), this is the coherent internal format.
use crate::config::Config;
@@ -1,4 +1,4 @@
-//! `punktfunk/1` — the native control plane (M3), gated behind the `quic` feature.
+//! `punktfunk/1` — the native control plane, gated behind the `quic` feature.
//!
//! GameStream is punktfunk's compatibility layer; this is the start of its own protocol. A QUIC
//! connection (quinn, tokio — control plane only, never the per-frame path) carries a
@@ -12,9 +12,9 @@
//!
//! after which both sides bring up a [`crate::session::Session`] over a plain
//! [`UdpTransport`](crate::transport::udp) (native threads, no async) and the host streams.
-//! The Welcome carries everything the M1 core negotiates — FEC scheme (including GF(2¹⁶)
+//! The Welcome carries everything the core negotiates — FEC scheme (including GF(2¹⁶)
//! Leopard, which GameStream can't express), shard sizing, crypto key/salt — so the data
-//! plane is exactly the hardened M1 `Session`.
+//! plane is exactly the hardened core `Session`.
//!
//! Transport security: the host presents a long-lived self-signed certificate
//! ([`endpoint::server_with_identity`]) and the client pins its SHA-256 fingerprint
@@ -31,7 +31,7 @@ pub struct Frame {
/// Note: the AEAD layer authenticates each datagram but does **not** provide anti-replay.
/// Video replays are largely absorbed by the reassembler's per-frame dedup, but replayed
/// input events are not yet filtered. A sliding-window replay filter keyed on the
-/// authenticated sequence belongs with the pairing/handshake layer (M2); until then,
+/// authenticated sequence belongs with the pairing/handshake layer (the GameStream host); until then,
/// rely on the LAN/VPN transport assumption (plan §1).
pub struct Session {
config: Config,
@@ -3,7 +3,7 @@
//! Send is batched via `sendmmsg` ([`Transport::send_batch`], ≤64/syscall) and recv via `recvmmsg`
//! ([`Transport::recv_batch`], ≤32/syscall into a reused ring) — the 1 Gbps+ syscall lever
//! (~125k → a few-k syscalls/sec at line rate). The host additionally paces each frame's send
-//! across the frame interval (see `m3.rs::paced_submit`) so a real NIC doesn't drop a line-rate
+//! across the frame interval (see `punktfunk1.rs::paced_submit`) so a real NIC doesn't drop a line-rate
//! burst. All three layer on this same [`Transport`] seam (scalar fallbacks for loopback/non-Linux).
use super::Transport;
@@ -397,7 +397,7 @@ impl UdpTransport {
/// Sized for 1 Gbps+: at ~1.2 Gbps on the wire an 8 MB buffer is only ~49 ms of steady state,
/// and a single multi-MB IDR keyframe (~4 MB ≈ 3300 packets) instantly fills most of it. 32 MB
/// gives ~200 ms of headroom and absorbs a keyframe burst without EAGAIN drops. (Paced sending
-/// — `m3.rs::paced_submit` — now spreads a big frame's overflow, so this buffer mostly absorbs
+/// — `punktfunk1.rs::paced_submit` — now spreads a big frame's overflow, so this buffer mostly absorbs
/// the immediate microburst rather than a whole unpaced frame.)
const TARGET_SOCKBUF: usize = 32 * 1024 * 1024;
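The comments above describe pacing each frame's send across the frame interval so that a large keyframe does not hit the NIC as one line-rate burst. The body of `paced_submit` is not part of this diff, so the following is only a hypothetical Rust sketch of one way to derive such a schedule: split the frame's packets into batches (64 matches the `sendmmsg` limit quoted above) and spread the batch start times evenly over a pacing window. The name `pacing_offsets` and its parameters are assumptions for illustration.

```rust
use std::time::Duration;

/// Start offsets (from the beginning of the frame) for each batch when
/// `packets` datagrams are sent `batch` at a time, spread evenly over `window`.
/// Hypothetical helper; not the host's `paced_submit`.
pub fn pacing_offsets(packets: usize, batch: usize, window: Duration) -> Vec<Duration> {
    assert!(batch > 0, "batch size must be nonzero");
    let batches = packets.div_ceil(batch);
    (0..batches)
        // Integer Duration arithmetic keeps the offsets exact to the nanosecond.
        // The `as u32` casts assume far fewer than 2^32 batches per frame.
        .map(|i| window * i as u32 / batches as u32)
        .collect()
}
```

For the roughly 3300-packet keyframe mentioned above, `pacing_offsets(3300, 64, Duration::from_millis(16))` yields 52 offsets starting at zero, each strictly later than the previous one and all inside the 16 ms window; a sender would sleep or spin until each offset before issuing that batch.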
@@ -1,4 +1,4 @@
-//! M1 acceptance: round-trip access units through the full host→client path
+//! Core acceptance: round-trip access units through the full host→client path
//! (packetize → FEC → loopback with simulated loss → recover → reassemble) and assert
//! byte-exact recovery, for both FEC schemes, with and without encryption. Plus
//! property tests over the FEC layer's loss patterns.
@@ -8,7 +8,7 @@ use anyhow::Result;
/// Opus/GameStream audio is 48 kHz.
pub const SAMPLE_RATE: u32 = 48_000;
-/// Stereo channel count — the default and the punktfunk/1 (M3) audio plane's fixed layout.
+/// Stereo channel count — the default and the punktfunk/1 audio plane's fixed layout.
pub const CHANNELS: usize = 2;
/// Produces interleaved `f32` PCM at [`SAMPLE_RATE`] in the channel count it was opened
@@ -1,4 +1,4 @@
-//! Frame capture (plan §7). On Linux: a PipeWire ScreenCast portal stream. M0 uses the
+//! Frame capture (plan §7). On Linux: a PipeWire ScreenCast portal stream. The spike uses the
//! CPU-copy fallback (the portal delivers a CPU buffer; the encoder uploads it to the GPU
//! internally). Zero-copy dmabuf→NVENC import is deferred (plan §9 risk).
@@ -45,7 +45,7 @@ impl PixelFormat {
}
/// A captured frame. [`format`](Self::format)/dimensions describe the pixels regardless of
-/// where they live — [`payload`](Self::payload) is either a CPU buffer (the M0/fallback path)
+/// where they live — [`payload`](Self::payload) is either a CPU buffer (the spike/fallback path)
/// or a GPU buffer already on the device (the zero-copy path, plan §9).
pub struct CapturedFrame {
pub width: u32,
@@ -103,7 +103,7 @@ pub trait Capturer: Send {
fn set_active(&self, _active: bool) {}
}
-/// A deterministic moving test pattern (BGRx). Lets M0 exercise the encode → file →
+/// A deterministic moving test pattern (BGRx). Lets the spike exercise the encode → file →
/// `punktfunk_core` path with no live capture session, and produces obviously non-static
/// content (a sweeping bar + animated gradient) so the encoded output is verifiable.
pub struct SyntheticCapturer {
@@ -1319,7 +1319,7 @@ pub struct DuplCapturer {
ever_got_frame: bool,
/// Consecutive rebuilds that produced a BORN-LOST duplication (created OK, but its first
/// AcquireNextFrame instantly returned ACCESS_LOST). On the NORMAL desktop this is the hybrid
-/// reparent/flip storm — once it persists, `acquire` returns Err so the m3 loop cold-rebuilds the
+/// reparent/flip storm — once it persists, `acquire` returns Err so the punktfunk1 loop cold-rebuilds the
/// whole pipeline (new device/output) instead of spinning on a dead dup forever (the bug where the
/// stream froze on the last frame). Reset to 0 by any real frame. NOT armed on the secure
/// (Winlogon) desktop, where a long static dwell is legitimate and must never end the session.
@@ -2,7 +2,7 @@
//! docs/windows-secure-desktop.md — step 4).
//!
//! WGC won't activate under the SYSTEM account, so the SYSTEM host can't capture the normal desktop
-//! itself. Instead it spawns `m3-host wgc-helper` in the **interactive user session** (so WGC works)
+//! itself. Instead it spawns `punktfunk-host wgc-helper` in the **interactive user session** (so WGC works)
//! via `CreateProcessAsUserW`, with the helper's **stdout** redirected to an anonymous pipe the host
//! reads. The helper ships framed Annex-B access units; this module parses them back into AUs the
//! host relays onto the live QUIC session (same `EncodedFrame` flow, just sourced over a pipe instead
@@ -1,5 +1,5 @@
//! Hardware video encode (plan §7). Binds FFmpeg (NVENC); never rewrites codecs.
-//! Low-latency preset, B-frames off. M0 feeds BGRx CPU frames directly — `*_nvenc`
+//! Low-latency preset, B-frames off. The spike feeds BGRx CPU frames directly — `*_nvenc`
//! accepts `bgr0` input and converts to YUV on the GPU, so no host-side swscale is
//! needed (dmabuf zero-copy import is deferred; plan §9).
@@ -1,10 +1,10 @@
//! GameStream (P1) control plane — what a stock Moonlight/Artemis client talks to around
//! the media streams: mDNS discovery, the nvhttp serverinfo + pairing HTTP(S) API, RTSP,
//! and the ENet control stream. `tokio`/`axum` live here (control plane, I/O-bound — never
-//! the per-frame hot path; that is `punktfunk_core`'s P1 wire codec). See `docs/m2-plan.md`.
+//! the per-frame hot path; that is `punktfunk_core`'s P1 wire codec). See `docs/gamestream-host-plan.md`.
//!
//! Status: P1.1 — mDNS `_nvstream._tcp` advertisement + `/serverinfo`. Pairing, RTSP, and
-//! the media streams follow (see the M2 task list / plan).
+//! the media streams follow (see the GameStream host task list / plan).
pub mod apps;
// Platform-neutral wire/negotiation logic + the Linux capture/encode pipeline (non-Linux
@@ -149,7 +149,10 @@ impl AppState {
/// QUIC server on `cfg.port` in the same process, sharing one [`crate::native_pairing`] handle with
/// the management API so the web console can arm pairing and show the PIN. `None` = GameStream only
/// (the mgmt API's native endpoints report `enabled: false`).
-pub fn serve(mgmt: crate::mgmt::Options, native: Option<crate::m3::NativeServe>) -> Result<()> {
+pub fn serve(
+mgmt: crate::mgmt::Options,
+native: Option<crate::punktfunk1::NativeServe>,
+) -> Result<()> {
let host = Host::detect()?;
let identity = cert::ServerIdentity::load_or_create().context("host certificate")?;
let state = Arc::new(AppState::new(host, identity));
@@ -187,7 +190,7 @@ pub fn serve(mgmt: crate::mgmt::Options, native: Option<crate::m3::NativeServe>)
tokio::try_join!(
nvhttp::run(state.clone()),
crate::mgmt::run(state.clone(), mgmt, Some(np.clone())),
-crate::m3::serve(crate::m3::native_serve_opts(&cfg), np),
+crate::punktfunk1::serve(crate::punktfunk1::native_serve_opts(&cfg), np),
)?;
}
_ => {
@@ -1,6 +1,6 @@
//! The video data plane: on RTSP PLAY, learn the client's UDP endpoint (it pings the video
//! port), then run capture → NVENC encode → [`VideoPacketizer`] → UDP send. The source is
//! either real portal desktop capture (`PUNKTFUNK_VIDEO_SOURCE=portal`, the M0 PipeWire path) or
//! either real portal desktop capture (`PUNKTFUNK_VIDEO_SOURCE=portal`, the portal PipeWire path) or
//! a synthetic test pattern (default). Runs on its own native thread.
use super::video::{FrameType, VideoPacketizer};
+33 -31
@@ -6,9 +6,10 @@
//! `#[cfg(target_os = "linux")]`; the crate compiles everywhere so the workspace builds
//! on non-Linux dev machines — it just can't run the pipeline there.
//!
//! Status: M0. The `m0` subcommand runs the capture→encode→file pipeline spike and feeds
//! the encoded AUs through a `punktfunk_core` loopback. M2 wires the full P1 host that a stock
//! Moonlight client connects to.
//! Subcommands: `serve` runs the GameStream-compatible host + management REST API (and, with
//! `--native`, the native punktfunk/1 host in-process); `punktfunk1-host` runs the native
//! punktfunk/1 host standalone; `spike` is a capture→encode→file pipeline dev tool that also
//! round-trips the encoded AUs through a `punktfunk_core` loopback.
// Scaffold: trait methods and config paths are defined ahead of their backends.
#![allow(dead_code)]
@@ -24,15 +25,15 @@ mod encode;
mod gamestream;
mod inject;
mod library;
mod m0;
mod m3;
mod mgmt;
mod mgmt_token;
mod native_pairing;
mod pipeline;
mod punktfunk1;
mod pwinit;
#[cfg(target_os = "windows")]
mod service;
mod spike;
mod vdisplay;
#[cfg(target_os = "windows")]
mod wgc_helper;
@@ -41,7 +42,7 @@ mod zerocopy;
use anyhow::{bail, Context, Result};
use encode::Codec;
use m0::{Options, Source};
use spike::{Options, Source};
use std::path::PathBuf;
fn main() {
@@ -185,10 +186,10 @@ fn real_main() -> Result<()> {
println!("dualsense-test: done");
Ok(())
}
// M0 pipeline spike.
Some("m0") => m0::run(parse_m0(&args[1..])?),
// M3: native punktfunk/1 host (QUIC control plane + UDP data plane).
Some("m3-host") => {
// Capture→encode→file pipeline spike (dev tool).
Some("spike") => spike::run(parse_spike(&args[1..])?),
// Native punktfunk/1 host (QUIC control plane + UDP data plane).
Some("punktfunk1-host") => {
let get = |flag: &str| {
args.iter()
.skip_while(|a| *a != flag)
@@ -196,10 +197,10 @@ fn real_main() -> Result<()> {
.map(String::as_str)
};
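Reviewer sketch (not part of the commit): the `get` closure above looks up a flag's value by scanning the argument list. The hunk elides the step between `skip_while` and `map`, so the version below assumes that step takes the element following the flag (`nth(1)`). `flag_value` is a hypothetical free-function name used only for illustration.

```rust
/// Return the value that follows `flag` in `args`, if any.
/// Assumption: the step elided in the hunk is `.nth(1)`.
fn flag_value<'a>(args: &'a [String], flag: &str) -> Option<&'a str> {
    args.iter()
        .skip_while(|a| *a != flag) // drop everything before the flag
        .nth(1) // skip the flag itself, take the next argument
        .map(String::as_str)
}

fn main() {
    let args: Vec<String> = ["--port", "9777", "--source", "virtual"]
        .iter()
        .map(|s| s.to_string())
        .collect();
    // Same fallback shape as the diff: parse if present, else use the default.
    let port: u16 = flag_value(&args, "--port")
        .and_then(|s| s.parse().ok())
        .unwrap_or(9777);
    println!("port={port} source={:?}", flag_value(&args, "--source"));
}
```

Under this assumption a flag given as the last argument, with no value after it, yields `None`, so the caller's `unwrap_or` default applies.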
let source = match get("--source") {
Some("virtual") => m3::M3Source::Virtual,
_ => m3::M3Source::Synthetic,
Some("virtual") => punktfunk1::Punktfunk1Source::Virtual,
_ => punktfunk1::Punktfunk1Source::Synthetic,
};
m3::run(m3::M3Options {
punktfunk1::run(punktfunk1::Punktfunk1Options {
port: get("--port").and_then(|s| s.parse().ok()).unwrap_or(9777),
source,
seconds: get("--seconds").and_then(|s| s.parse().ok()).unwrap_or(30),
@@ -209,7 +210,7 @@ fn real_main() -> Result<()> {
.unwrap_or(0),
max_concurrent: get("--max-concurrent")
.and_then(|s| s.parse().ok())
.unwrap_or(m3::DEFAULT_MAX_CONCURRENT),
.unwrap_or(punktfunk1::DEFAULT_MAX_CONCURRENT),
// Secure by default: REQUIRE PIN pairing (reject unpaired clients) unless
// --allow-tofu opts into trust-on-first-use — the host then accepts unpaired
// clients and advertises pair=optional. Pairing is always armed so a PIN is
@@ -259,8 +260,9 @@ fn real_main() -> Result<()> {
print_usage();
Ok(())
}
// Bare flags (no subcommand) default to the m0 spike for back-compat.
Some(_) => m0::run(parse_m0(&args)?),
// Unknown subcommand → error pointing at --help. (No implicit default; a bare
// `punktfunk-host` with no args hits the None arm above and prints help.)
Some(other) => bail!("unknown command '{other}' (try --help)"),
}
}
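Reviewer sketch (not part of the commit): the behavioural change in this hunk is that an unrecognised first argument no longer falls through to the spike. A minimal stand-alone model of that dispatch might look like the following. The `dispatch` function, its return type, and the exact patterns on the help arm are assumptions, since the hunk does not show that arm's pattern.

```rust
/// Hypothetical stand-in for the host's subcommand dispatch.
/// Returns the name of the action taken, or an error message.
fn dispatch(args: &[String]) -> Result<&'static str, String> {
    match args.first().map(String::as_str) {
        Some("spike") => Ok("spike"),
        Some("punktfunk1-host") => Ok("punktfunk1-host"),
        // Assumed help arm: no arguments or an explicit help flag.
        Some("-h") | Some("--help") | None => Ok("help"),
        // Anything else is an error instead of a silent default.
        Some(other) => Err(format!("unknown command '{other}' (try --help)")),
    }
}

fn main() {
    let to_args = |v: &[&str]| v.iter().map(|s| s.to_string()).collect::<Vec<String>>();
    for case in [&["spike"][..], &[][..], &["m0"][..]] {
        println!("{:?} -> {:?}", case, dispatch(&to_args(case)));
    }
}
```

In this model the retired `m0` name produces an error, which matches the commit's stated intent of dropping the back-compat default.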
@@ -320,7 +322,7 @@ fn input_test() -> Result<()> {
/// the native punktfunk/1 host in-process (`--native`, the unified host). Returns the mgmt options
/// and the native host config (`None` = GameStream only). Native pairing is **required by default**
/// (an open host any LAN device can stream from is insecure); `--open` turns it off.
fn parse_serve(args: &[String]) -> Result<(mgmt::Options, Option<m3::NativeServe>)> {
fn parse_serve(args: &[String]) -> Result<(mgmt::Options, Option<punktfunk1::NativeServe>)> {
let mut opts = mgmt::Options::default();
let mut native_port: Option<u16> = None;
let mut open = false;
@@ -377,14 +379,14 @@ fn parse_serve(args: &[String]) -> Result<(mgmt::Options, Option<m3::NativeServe
if opts.token.is_none() {
opts.token = Some(crate::mgmt_token::load_or_generate()?);
}
let native = native_port.map(|port| m3::NativeServe {
let native = native_port.map(|port| punktfunk1::NativeServe {
port,
require_pairing: !open,
});
Ok((opts, native))
}
fn parse_m0(args: &[String]) -> Result<Options> {
fn parse_spike(args: &[String]) -> Result<Options> {
let mut source = Source::Portal;
let mut width = 1920u32;
let mut height = 1080u32;
@@ -465,7 +467,7 @@ fn parse_m0(args: &[String]) -> Result<Options> {
Codec::H265 => "h265",
Codec::Av1 => "obu",
};
PathBuf::from(format!("/tmp/punktfunk-m0.{ext}"))
PathBuf::from(format!("/tmp/punktfunk-spike.{ext}"))
});
Ok(Options {
@@ -486,12 +488,12 @@ fn print_usage() {
"punktfunk-host — Linux streaming host
USAGE:
punktfunk-host serve [OPTIONS] GameStream host control plane (M2: mDNS + serverinfo …)
+ the management REST API
punktfunk-host openapi print the management API's OpenAPI document (codegen)
punktfunk-host m3-host [OPTIONS] native punktfunk/1 host (QUIC control plane + UDP data plane)
punktfunk-host probe-compositor exit 0 iff the compositor is up + ready (session-bringup gate)
punktfunk-host m0 [OPTIONS] M0 capture→encode→file pipeline spike
punktfunk-host serve [OPTIONS] GameStream host control plane (mDNS + serverinfo …)
+ the management REST API
punktfunk-host openapi print the management API's OpenAPI document (codegen)
punktfunk-host punktfunk1-host [OPTIONS] native punktfunk/1 host (QUIC control + UDP data plane)
punktfunk-host probe-compositor exit 0 iff the compositor is up + ready (bringup gate)
punktfunk-host spike [OPTIONS] capture→encode→file pipeline spike (dev tool)
SERVE OPTIONS:
--mgmt-bind <IP:PORT> management API address (default: 127.0.0.1:47990)
@@ -503,7 +505,7 @@ SERVE OPTIONS:
--open disable mandatory native pairing (default: pairing REQUIRED —
an open host any LAN device can stream from is insecure)
M3-HOST OPTIONS:
PUNKTFUNK1-HOST OPTIONS:
--port <N> QUIC listen port (default: 9777)
--source <synthetic|virtual> test frames, or virtual display + NVENC (default: synthetic)
--seconds <N> per-session stream duration, virtual source (default: 30)
@@ -516,7 +518,7 @@ M3-HOST OPTIONS:
unpaired clients and logs a 4-digit pairing PIN at startup;
TOFU without pairing is insecure on a LAN
M0 OPTIONS:
SPIKE OPTIONS:
--source <synthetic|portal|kwin-virtual>
frame source (default: portal). 'kwin-virtual' creates a
KWin virtual output at --width x --height and captures it
@@ -525,7 +527,7 @@ M0 OPTIONS:
--codec <h264|h265|av1> NVENC codec (default: h265)
--bitrate <MBPS> target bitrate in Mbps (default: 20)
--width <W> --height <H> synthetic source size (default: 1920x1080)
--out <PATH> raw Annex-B output (default: /tmp/punktfunk-m0.<ext>)
--out <PATH> raw Annex-B output (default: /tmp/punktfunk-spike.<ext>)
--no-loopback skip the punktfunk_core round-trip verification
-h, --help this help
@@ -534,8 +536,8 @@ NOTES:
(see docs/linux-setup.md). 'synthetic' needs no capture session and always runs.
Encoded AUs are written to a playable file AND (unless --no-loopback) fed through a
punktfunk_core host→client loopback that reassembles and byte-verifies each one.
Both 'serve --native' and 'm3-host' advertise the native service over mDNS
(_punktfunk._udp) for client auto-discovery — 'punktfunk-client-rs --discover' lists them."
Both 'serve --native' and 'punktfunk1-host' advertise the native service over mDNS
(_punktfunk._udp) for client auto-discovery — 'punktfunk-probe --discover' lists them."
);
#[cfg(target_os = "windows")]
eprintln!(
+1 -1
@@ -1,6 +1,6 @@
//! Shared native (`punktfunk/1`) pairing state — the on-demand arming PIN (with expiry) plus the
//! persistent paired-clients store. One [`NativePairing`] handle is shared by the punktfunk/1 QUIC
//! accept loop ([`crate::m3`]) and the management API ([`crate::mgmt`]), so an operator can **arm
//! accept loop ([`crate::punktfunk1`]) and the management API ([`crate::mgmt`]), so an operator can **arm
//! pairing and read the PIN from the web console** instead of the service log.
//!
//! The PIN direction is inherent to the SPAKE2 ceremony: the *host* mints the PIN and the *client*
@@ -1,4 +1,4 @@
//! M3 — the `punktfunk/1` native host: QUIC control plane + the hardened M1 data plane over UDP.
//! The `punktfunk/1` native host: QUIC control plane + the hardened core data plane over UDP.
//! This is punktfunk's own protocol, past the GameStream compatibility layer:
//!
//! * the Welcome negotiates **GF(2¹⁶) Leopard FEC** (inexpressible in GameStream) + AES-GCM;
@@ -9,9 +9,9 @@
//! * video frames carry a wall-clock `pts_ns`, so a same-host client measures the full
//! capture→encode→FEC→UDP→reassemble latency per frame.
//!
//! `punktfunk-host m3-host [--port 9777] [--source synthetic|virtual] [--seconds 30]
//! `punktfunk-host punktfunk1-host [--port 9777] [--source synthetic|virtual] [--seconds 30]
//! [--frames 300]` serves sessions back to back (one at a time — the virtual output and
//! encoder are single-tenant); `punktfunk-client-rs --connect host:9777` is the counterpart.
//! encoder are single-tenant); `punktfunk-probe --connect host:9777` is the counterpart.
//! The data plane runs on native threads (no async on the frame path).
//!
//! Alongside video + input, a session carries **audio** (desktop Opus, 5 ms frames, host →
@@ -37,16 +37,16 @@ use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum M3Source {
pub enum Punktfunk1Source {
/// Deterministic test frames (protocol verification; the client byte-checks them).
Synthetic,
/// Real capture: virtual display at the client's requested mode → NVENC.
Virtual,
}
pub struct M3Options {
pub struct Punktfunk1Options {
pub port: u16,
pub source: M3Source,
pub source: Punktfunk1Source,
/// Virtual-source stream duration.
pub seconds: u32,
/// Synthetic-source frame count.
@@ -97,7 +97,7 @@ fn now_ns() -> u64 {
.unwrap_or(0)
}
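Reviewer sketch (not part of the commit): the module docs say each video frame carries a wall-clock `pts_ns` so a same-host client can measure per-frame latency. Only the signature and the trailing `.unwrap_or(0)` of `now_ns` are visible in this hunk, so the body below is an assumed reconstruction from the standard library, and `latency_ns` is a hypothetical helper name.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Wall-clock nanoseconds since the Unix epoch.
/// Falls back to 0 if the system clock reads earlier than the epoch.
fn now_ns() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos() as u64)
        .unwrap_or(0)
}

/// Hypothetical helper: elapsed time between the capture stamp and receipt.
/// Saturates at 0 so a small clock disagreement never underflows.
fn latency_ns(pts_ns: u64, received_ns: u64) -> u64 {
    received_ns.saturating_sub(pts_ns)
}

fn main() {
    let pts_ns = now_ns(); // stamped by the host at capture time
    let received_ns = now_ns(); // read by the client after reassembly
    println!("latency: {} ns", latency_ns(pts_ns, received_ns));
}
```

The subtraction is only meaningful when both stamps come from the same clock, which is why the docs restrict the measurement to a same-host client; across machines a skew correction would be needed.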
pub fn run(opts: M3Options) -> Result<()> {
pub fn run(opts: Punktfunk1Options) -> Result<()> {
let rt = tokio::runtime::Builder::new_multi_thread()
.worker_threads(2)
.enable_all()
@@ -138,10 +138,10 @@ pub(crate) struct NativeServe {
/// overflow clients wait in the accept queue. Override with `--max-concurrent`.
pub(crate) const DEFAULT_MAX_CONCURRENT: usize = 4;
pub(crate) fn native_serve_opts(cfg: &NativeServe) -> M3Options {
M3Options {
pub(crate) fn native_serve_opts(cfg: &NativeServe) -> Punktfunk1Options {
Punktfunk1Options {
port: cfg.port,
source: M3Source::Virtual,
source: Punktfunk1Source::Virtual,
seconds: 7 * 24 * 3600, // per-session cap; large enough not to cut a live stream
frames: 0,
max_sessions: 0,
@@ -153,7 +153,7 @@ pub(crate) fn native_serve_opts(cfg: &NativeServe) -> M3Options {
}
}
pub(crate) async fn serve(opts: M3Options, np: Arc<NativePairing>) -> Result<()> {
pub(crate) async fn serve(opts: Punktfunk1Options, np: Arc<NativePairing>) -> Result<()> {
let identity = crate::gamestream::cert::ServerIdentity::load_or_create()
.context("load host identity (~/.config/punktfunk)")?;
let fingerprint = endpoint::fingerprint_of_pem(&identity.cert_pem)
@@ -427,7 +427,7 @@ async fn pair_ceremony(
#[allow(clippy::too_many_arguments)]
async fn serve_session(
conn: quinn::Connection,
opts: &M3Options,
opts: &Punktfunk1Options,
audio_cap: &AudioCapSlot,
inj_tx: std::sync::mpsc::Sender<InputEvent>,
mic_tx: std::sync::mpsc::Sender<Vec<u8>>,
@@ -514,7 +514,7 @@ async fn serve_session(
// can report what we'll actually drive. Only the Virtual source has a compositor; the
// synthetic source has no virtual output. Blocking probes → spawn_blocking.
let compositor = match source {
M3Source::Virtual => {
Punktfunk1Source::Virtual => {
let pref = hello.compositor;
Some(
tokio::task::spawn_blocking(move || resolve_compositor(pref))
@@ -522,7 +522,7 @@ async fn serve_session(
.context("resolve compositor task")??,
)
}
M3Source::Synthetic => None,
Punktfunk1Source::Synthetic => None,
};
// Resolve a requested library launch (the client sends only the store-qualified id;
@@ -600,8 +600,8 @@ async fn serve_session(
key,
salt: *b"pkf1",
frames: match source {
M3Source::Synthetic => frames,
M3Source::Virtual => 0, // unbounded — client streams until we close
Punktfunk1Source::Synthetic => frames,
Punktfunk1Source::Virtual => 0, // unbounded — client streams until we close
},
// Report the resolved backends back to the client (compositor: Auto for the
// synthetic source).
@@ -726,7 +726,7 @@ async fn serve_session(
let conn = conn.clone();
let gamepad = welcome.gamepad;
std::thread::Builder::new()
.name("punktfunk-m3-input".into())
.name("punktfunk1-input".into())
.spawn(move || input_thread(input_rx, rich_rx, conn, inj_tx, gamepad))
.context("spawn input thread")?
};
@@ -778,12 +778,12 @@ async fn serve_session(
// → host→client QUIC datagrams, on its own native thread. Best-effort on every failure
// (no PipeWire audio, spawn error): the session continues without audio — and a spawn
// error must NOT early-return here, the threads above are already running.
let audio_handle = if opts.source == M3Source::Virtual {
let audio_handle = if opts.source == Punktfunk1Source::Virtual {
let conn = conn.clone();
let stop = stop.clone();
let cap = audio_cap.clone();
std::thread::Builder::new()
.name("punktfunk-m3-audio".into())
.name("punktfunk1-audio".into())
.spawn(move || audio_thread(conn, stop, cap))
.map_err(|e| tracing::error!(error = %e, "audio thread spawn failed — session continues without audio"))
.ok()
@@ -794,7 +794,7 @@ async fn serve_session(
// Test hook (synthetic source only): a scripted feedback burst on the host→client
// planes — rumble (0xCA) + DualSense HID-output (0xCD) — so loopback tests can assert
// the client's feedback path without a real game writing output reports to a real pad.
if opts.source == M3Source::Synthetic
if opts.source == Punktfunk1Source::Synthetic
&& std::env::var("PUNKTFUNK_TEST_FEEDBACK").as_deref() == Ok("1")
{
use punktfunk_core::quic::HidOutput;
@@ -852,14 +852,14 @@ async fn serve_session(
let mut session = Session::new(cfg, Box::new(transport))
.map_err(|e| anyhow!("host session: {e:?}"))?;
match source {
M3Source::Synthetic => synthetic_stream(
Punktfunk1Source::Synthetic => synthetic_stream(
&mut session,
frames,
&stop_stream,
&probe_rx,
&probe_result_tx,
),
M3Source::Virtual => {
Punktfunk1Source::Virtual => {
let compositor = compositor
.expect("the Virtual source resolves a compositor during the handshake");
virtual_stream(
@@ -986,7 +986,7 @@ impl InjectorService {
fn start() -> InjectorService {
let (tx, rx) = std::sync::mpsc::channel::<InputEvent>();
if let Err(e) = std::thread::Builder::new()
.name("punktfunk-m3-injector".into())
.name("punktfunk1-injector".into())
.spawn(move || injector_service_thread(rx))
{
tracing::error!(error = %e, "injector service thread spawn failed — pointer/keyboard input disabled");
@@ -1080,7 +1080,7 @@ impl MicService {
fn start() -> MicService {
let (tx, rx) = std::sync::mpsc::channel::<Vec<u8>>();
if let Err(e) = std::thread::Builder::new()
.name("punktfunk-m3-mic".into())
.name("punktfunk1-mic".into())
.spawn(move || mic_service_thread(rx))
{
tracing::error!(error = %e, "mic service thread spawn failed — mic passthrough disabled");
@@ -2117,7 +2117,7 @@ fn virtual_stream(
let _watcher = if watch {
let stop = stop.clone();
std::thread::Builder::new()
.name("punktfunk-m3-watcher".into())
.name("punktfunk1-watcher".into())
.spawn(move || session_watcher_loop(session_tx, stop))
.ok()
} else {
@@ -3014,9 +3014,9 @@ mod tests {
use punktfunk_core::error::PunktfunkStatus;
let host = std::thread::spawn(|| {
run(M3Options {
run(Punktfunk1Options {
port: 19777,
source: M3Source::Synthetic,
source: Punktfunk1Source::Synthetic,
seconds: 0,
frames: 25,
max_sessions: 3,
@@ -3182,9 +3182,9 @@ mod tests {
.build()
.unwrap();
rt.block_on(serve(
M3Options {
Punktfunk1Options {
port: 19779,
source: M3Source::Synthetic,
source: Punktfunk1Source::Synthetic,
seconds: 0,
frames: 25,
max_sessions: 2, // the knock + the post-approval session
@@ -3268,9 +3268,9 @@ mod tests {
use punktfunk_core::quic::endpoint;
let host = std::thread::spawn(|| {
run(M3Options {
run(Punktfunk1Options {
port: 19778,
source: M3Source::Synthetic,
source: Punktfunk1Source::Synthetic,
seconds: 0,
frames: 25,
max_sessions: 4,
@@ -1,9 +1,9 @@
//! M0 — the pipeline spike (plan §8): capture → NVENC encode → playable file, with the
//! The pipeline spike (plan §8): capture → NVENC encode → playable file, with the
//! encoded access units also fed through a `punktfunk_core` host→client `Session` over an
//! in-process loopback to prove the core's FEC + packetize + reassemble path on real
//! encoder output.
//!
//! This is the spike runner, not the M2 hot path: it drives the stages on one thread (the
//! This is the spike runner, not the production host path: it drives the stages on one thread (the
//! per-stage-thread pipeline with bounded channels is [`crate::pipeline`]). Source is
//! either a synthetic BGRx test pattern (no capture session needed) or the live xdg
//! ScreenCast portal monitor.
@@ -52,12 +52,12 @@ pub fn run(opts: Options) -> Result<()> {
width = opts.width,
height = opts.height,
fps = opts.fps,
"M0 source: synthetic BGRx test pattern"
"spike source: synthetic BGRx test pattern"
);
Box::new(SyntheticCapturer::new(opts.width, opts.height, opts.fps))
}
Source::Portal => {
tracing::info!("M0 source: xdg ScreenCast portal (live monitor)");
tracing::info!("spike source: xdg ScreenCast portal (live monitor)");
capture::open_portal_monitor().context("open portal capturer")?
}
Source::KwinVirtual => {
@@ -66,7 +66,7 @@ pub fn run(opts: Options) -> Result<()> {
width = opts.width,
height = opts.height,
?compositor,
"M0 source: virtual output (PUNKTFUNK_COMPOSITOR)"
"spike source: virtual output (PUNKTFUNK_COMPOSITOR)"
);
let mut vd = crate::vdisplay::open(compositor).context("open virtual display")?;
let vout = vd
@@ -104,7 +104,7 @@ pub fn run(opts: Options) -> Result<()> {
opts.fps,
opts.bitrate_bps,
first.is_cuda(),
8, // m0 synthetic harness: 8-bit
8, // spike synthetic harness: 8-bit
)
.context("open encoder")?;
@@ -147,7 +147,7 @@ pub fn run(opts: Options) -> Result<()> {
out = %opts.out.display(),
elapsed_s = format!("{elapsed:.2}"),
encode_fps = format!("{:.1}", stats.encoded as f64 / elapsed.max(1e-9)),
"M0 capture→encode→file complete"
"spike capture→encode→file complete"
);
if let Some(lb) = lb {
@@ -194,7 +194,7 @@ fn drain_encoder(
/// A host↔client `punktfunk_core` pair over a lossless in-process loopback. Each encoded AU is
/// FEC-protected, packetized, sent, then reassembled on the client and byte-compared to the
/// original — exercising the core on real encoder output (the M0 "feed into a Session" goal).
/// original — exercising the core on real encoder output (the spike "feed into a Session" goal).
struct Loopback {
host: Session,
client: Session,
+1 -1
@@ -125,7 +125,7 @@ impl Compositor {
/// installed (it spawns a nested session — independent of the running desktop), plus the live
/// session's own compositor (KWin / Mutter / wlroots) when the host runs inside it. Cheap,
/// side-effect-free probes — safe to call per management request. A concrete client preference
/// is validated against this set before it's honored (see the m3 handshake's resolution).
/// is validated against this set before it's honored (see the punktfunk/1 handshake's resolution).
pub fn available() -> Vec<Compositor> {
#[cfg(target_os = "linux")]
{
@@ -40,7 +40,7 @@ fn chooser_file() -> String {
}
/// The managed xdpw config: per-session output selection with no GUI. The `|| echo` fallback
/// keeps plain portal capture (`--source portal`, M0 flow) working when no session has written
/// keeps plain portal capture (`--source portal` flow) working when no session has written
/// the chooser file. xdpw runs `chooser_cmd` via `/bin/sh -c`, reads stdout.
fn xdpw_config() -> String {
format!(
+1 -1
@@ -50,7 +50,7 @@ pub fn run(opts: HelperOptions) -> Result<()> {
// path. Elevate its OS priority so a CPU-heavy game can't deschedule it and delay submission (which
// would leave our HIGH GPU priority with nothing queued to prioritise). Apollo's capture thread is
// likewise CRITICAL.
crate::m3::boost_thread_priority(true);
crate::punktfunk1::boost_thread_priority(true);
// Capture the EXISTING SudoVDA output by GDI name / target id — do NOT create one (the host owns
// the virtual output + its isolate/restore; a second topology owner breaks DDA recovery).
@@ -102,10 +102,10 @@ same-host-only, as today.
- `swift test`: add a decode-output test (decode a known IDR built like
`VideoToolboxRoundTripTests` → assert a `CVPixelBuffer` of the right dimensions + the
decode callback fires). Present is display-bound — validate it **live** via the HUD number.
- Live: connect to a Linux host (`m3-host --source virtual` on the GNOME box; see
- Live: connect to a Linux host (`punktfunk1-host --source virtual` on the GNOME box; see
[Ubuntu — GNOME](/docs/ubuntu-gnome)), confirm `capture→present` is a few ms over `capture→client`
and that `decode→present` shrank vs. an `AVSampleBufferDisplayLayer` baseline.
- Compare against the headless reference number: `punktfunk-client-rs` reports skew-corrected
- Compare against the headless reference number: `punktfunk-probe` reports skew-corrected
capture→reassembled (~1.3 ms p50 GNOME box → dev box); capture→present should be that **+ decode +
present**.
+5 -5
@@ -56,7 +56,7 @@ punktfunk-client --connect <host>:9777 # skip the picker, start a session imme
## Windows desktop client (in development)
`punktfunk-client` for Windows (`crates/punktfunk-client-windows`) is the native graphical client
`punktfunk-client` for Windows (`clients/windows`) is the native graphical client
for Windows — pure Rust, the same `punktfunk/1` core as the Apple and Linux apps, with a **WinUI 3**
UI (host list, settings, PIN pairing) and the video on a `SwapChainPanel`, plus WASAPI audio, FFmpeg
decode, SDL3 controllers, network discovery, and PIN pairing. Launch it and pick a host from the
@@ -74,13 +74,13 @@ Until it ships, **Moonlight** remains the recommended way to stream to Windows (
## Linux reference client (headless)
`punktfunk-client-rs` (in the repo) is a command-line client for the native protocol, used for
`punktfunk-probe` (in the repo) is a command-line client for the native protocol, used for
testing, development, and latency measurement — not an everyday client. It connects, streams to a
file, runs the speed test, and can discover hosts:
```sh
punktfunk-client-rs --discover # list hosts on the network
punktfunk-client-rs --connect <host>:9777 --pin <fp> # connect to one
punktfunk-probe --discover # list hosts on the network
punktfunk-probe --connect <host>:9777 --pin <fp> # connect to one
```
## Which should I use?
@@ -90,6 +90,6 @@ punktfunk-client-rs --connect <host>:9777 --pin <fp> # connect to one
| A Mac, iPhone, iPad, or Apple TV | The **Apple app** |
| A Linux desktop or laptop, or a Steam Deck | **`punktfunk-client`** (GTK4) |
| Windows, Android, a browser, a TV | **Moonlight** |
| Automated tests / latency measurement | **`punktfunk-client-rs`** (headless) |
| Automated tests / latency measurement | **`punktfunk-probe`** (headless) |
Whichever you choose, the first connection needs a one-time [pairing](/docs/pairing).
+1 -1
@@ -47,7 +47,7 @@ Today the native `punktfunk/1` host (`serve --native`) streams **one session at
clients wait in the accept queue until the active session ends. Each session gets its own virtual
display at the client's exact resolution; concurrent native sessions are on the roadmap.
(`m3-host`, the standalone test host, has a `--max-concurrent N` knob, default 4, bounded by your
(`punktfunk1-host`, the standalone test host, has a `--max-concurrent N` knob, default 4, bounded by your
GPU's encoder — see the [Host CLI](/docs/host-cli) reference — but `serve --native` does **not** take
that flag.)
@@ -29,7 +29,7 @@ Each gamescope **process is per-session** (`vdisplay/gamescope.rs::create()` spa
- **EIS input socket — single global file.** gamescope exports `LIBEI_SOCKET` for its children; a
shell wrapper relays it to the fixed path `/tmp/punktfunk-gamescope-ei` (`EI_SOCKET_FILE`).
**Two concurrent instances overwrite each other's socket name** in that one file.
- **Injector — one host-lifetime `!Send` service.** `m3.rs::InjectorService` opens **one**
- **Injector — one host-lifetime `!Send` service.** `punktfunk1.rs::InjectorService` opens **one**
`inject::open(backend)` for the whole run and forwards events over an mpsc channel. It was made
shared deliberately (the portal `CreateSession` churn wedged KWin's EIS — "EIS setup timed out").
For gamescope it reads the one global socket file, so all sessions' input lands in whichever
@@ -1,5 +1,5 @@
---
title: "M2 — Moonlight Host"
title: "GameStream Host"
description: "Stream to a stock Moonlight client on a client-sized virtual display."
---
@@ -72,13 +72,13 @@ Ground-truth protocol reference: [`research/gamestream-protocol-research.json`](
handshake, negotiate `Config`, create a wlroots virtual output sized to the client.
*Acceptance: Moonlight completes RTSP and the host stands up the UDP streams.*
- **P1.3 — Video (punktfunk-core P1 codec), plaintext, clean-LAN.** RTP+NV framing + FEC shard
layout in punktfunk-core; wire M0's NVENC AUs → UDP 47998. *Acceptance: Moonlight DISPLAYS video.*
layout in punktfunk-core; wire the spike's NVENC AUs → UDP 47998. *Acceptance: Moonlight DISPLAYS video.*
- **P1.4 — Control + input.** ENet (`rusty_enet`) control stream; decode input → `inject.rs`
(uinput/reis); request-IDR → force NVENC keyframe. *Acceptance: mouse/keyboard work.*
- **P1.5 — Robustness: FEC recovery + encryption.** nanors-exact FEC; per-shard AES-GCM.
*Acceptance: stable under `tc netem` loss; encrypted streams.*
- **P1.6 — Audio + polish.** Opus + audio RTP/FEC/CBC (UDP 47999); disconnect teardown; KWin
backend for the user's KDE box. *Acceptance: full game stream with sound — the M2 goal.*
backend for the user's KDE box. *Acceptance: full game stream with sound — the GameStream-host goal.*
## Crates (verified available)
+6 -6
@@ -32,15 +32,15 @@ token is **required** when you bind the API off loopback with `--mgmt-bind`.
By default the host **requires pairing** — see [Pairing & Trust](/docs/pairing). On `serve --native` you
**arm pairing from the web console** (or mgmt API); the host then displays a 4-digit PIN. Pass `--open` to
turn off the mandatory-pairing default and serve any device on the network (trusted single-user setups
only). The pairing flags below are `m3-host`-only and do **not** apply to `serve`.
only). The pairing flags below are `punktfunk1-host`-only and do **not** apply to `serve`.
## `m3-host`
## `punktfunk1-host`
A standalone native-only host, mainly for testing the `punktfunk/1` path without the GameStream server
or web console.
```sh
punktfunk-host m3-host --source virtual
punktfunk-host punktfunk1-host --source virtual
```
| Flag | Meaning |
@@ -53,12 +53,12 @@ punktfunk-host m3-host --source virtual
| `--allow-pairing` | Accept PIN pairing; the host prints a PIN when a client pairs. |
| `--require-pairing` | Only serve paired devices (implies `--allow-pairing`). |
`--max-concurrent`, `--allow-pairing`, and `--require-pairing` are **`m3-host`-only** — `serve` does not
`--max-concurrent`, `--allow-pairing`, and `--require-pairing` are **`punktfunk1-host`-only** — `serve` does not
accept them. On `serve --native` you arm pairing from the web console instead, and concurrency is not
yet capped from the command line.
Both `serve --native` and `m3-host` advertise the host on the network so clients can discover it. List
hosts from another machine with `punktfunk-client-rs --discover`.
Both `serve --native` and `punktfunk1-host` advertise the host on the network so clients can discover it. List
hosts from another machine with `punktfunk-probe --discover`.
## Environment
@@ -276,7 +276,7 @@ punktfunk/
│ │ ├── src/vdisplay/ # trait + kwin/wlroots/mutter impls
│ │ ├── src/input/ # reis + uinput
│ │ └── src/web/ # axum config/pairing API
│ └── punktfunk-client-rs/ # reference Rust client (M4)
│ └── punktfunk-probe/ # reference Rust client (M4)
├── clients/
│ ├── apple/ # Swift package, imports punktfunk_core.h (M5)
│ └── android/ # Kotlin + JNI (later)
+3 -3
@@ -45,7 +45,7 @@ host's management console, click to arm pairing, and the host displays a 4-digit
list of paired devices. This works on a headless host over the network — there is no command-line flag
to arm pairing on `serve`.
(The standalone headless test host, `m3-host`, takes `--allow-pairing`/`--require-pairing` on its
(The standalone headless test host, `punktfunk1-host`, takes `--allow-pairing`/`--require-pairing` on its
command line instead; the production `serve --native` host arms pairing from the console.)
Then, on the client:
@@ -60,13 +60,13 @@ the right setting on a shared network: a device has to complete the PIN ceremony
connect.
If you're on a fully trusted single-user network and want to skip pairing, run the host open with
`serve --native --open` (or `m3-host --allow-tofu` for the standalone host) — it then advertises
`serve --native --open` (or `punktfunk1-host --allow-tofu` for the standalone host) — it then advertises
`pair=optional` and accepts unpaired clients. Requiring pairing is strongly recommended.
## Trust-on-first-use (host opt-in)
Trust-on-first-use (TOFU) is **off by default** and is an explicit *host* opt-in for fully trusted
networks. A host enables it by running open — `m3-host --allow-tofu` or `serve --open` — which makes
networks. A host enables it by running open — `punktfunk1-host --allow-tofu` or `serve --open` — which makes
it advertise `pair=optional` over mDNS and accept unpaired clients. Only then does a client offer the
TOFU path: connecting to such a host for the first time shows the host's fingerprint and asks you to
confirm it (compare it with the one the host logged at startup), then pins it. The client presents
+10 -10
@@ -20,7 +20,7 @@ Steam session at the **client's exact resolution + refresh** — games see it (v
change, reused (no Steam restart) on the same mode. Plus macOS/iPad input fixes (NSEvent motion +
iPad pointer-lock) and a 4K/5K one-frame-freeze fix (grow the UDP socket buffers).
**Next:** **§8 pairing & trust hardening** (mandatory PIN by default + delegated approval), the M4
**Next:** **§8 pairing & trust hardening** (mandatory PIN by default + delegated approval), the native
client presenter + iOS (§6), and a Windows host (§7 — now **de-risked via SudoVDA**, no custom
signed driver needed). **§10 HDR/10-bit is parked — blocked upstream at the compositor** (no
gamescope/KWin PipeWire 10-bit producer yet).
@@ -88,7 +88,7 @@ select = a `pw_stream` with `Direction::Output` + `media.class=Audio/Source`.
- **Touch — implemented (host path), pending a backend that lands it.** `TouchDown/Move/Up`
InputKinds (reuse the abs-pointer `flags=(w<<16)|h` mapping, `code`=touch id); host
`inject/libei.rs` requests the `Touchscreen` device type + binds the `Touch` capability and
injects `ei_touchscreen` down/motion/up; `punktfunk-client-rs --touch-test` drags a finger.
injects `ei_touchscreen` down/motion/up; `punktfunk-probe --touch-test` drags a finger.
**Validated:** KWin's RemoteDesktop portal *grants* the Touchscreen device type, but its EIS
server creates **no touchscreen device** (headless KWin) — so touch currently no-ops on KWin
(now logged once). The code is correct; it needs a backend that exposes `ei_touchscreen`
@@ -102,14 +102,14 @@ select = a `pw_stream` with `Direction::Output` + `media.class=Audio/Source`.
trigger effects (L2/R2)**. Protocol carries new side-planes: rich-input `0xCC`
(touchpad/motion) + HID-output `0xCD` (LED/triggers). `/dev/uhid` udev rule shipped.
- **Rich DualSense — Phase C/D/E end-to-end, validated live.** `PUNKTFUNK_GAMEPAD=dualsense`
selects a per-session `DualSenseManager` (the `PadBackend` enum in `m3.rs`): client gamepad frames
selects a per-session `DualSenseManager` (the `PadBackend` enum in `punktfunk1.rs`): client gamepad frames
build the DualSense report; the kernel's feedback comes back as `HidOutput` on the **0xCD** plane
(lightbar / player LEDs / adaptive triggers) while **rumble stays on the universal 0xCA plane**
(so non-DualSense clients still feel it); touchpad + motion ride the **0xCC** rich-input plane
(`DualSenseManager::apply_rich`, merged with button state). The connector + C ABI gained
`punktfunk_connection_next_hidout` (→ `PunktfunkHidOutput`) and `punktfunk_connection_send_rich_input`
(← `PunktfunkRichInput`); header regenerated. Validated on-box: a synthetic-source `m3-host` +
`punktfunk-client-rs --rich-input-test` created the real kernel DualSense, drove 0xCC, and decoded
(← `PunktfunkRichInput`); header regenerated. Validated on-box: a synthetic-source `punktfunk1-host` +
`punktfunk-probe --rich-input-test` created the real kernel DualSense, drove 0xCC, and decoded
12 live 0xCD events (the kernel's actual lightbar/trigger init reports) — data plane unaffected
(600/600 frames). *Remaining:* the Apple client renders adaptive triggers + rumble on a real
DualSense (`GCDualSenseAdaptiveTrigger`) — handed off to the client agent for the real playtest.
@@ -192,7 +192,7 @@ value) instead of guesswork that ends in a stuttering stream.
and exposes it (`punktfunk_connection_speed_test()` + `punktfunk_connection_probe_result()` →
`PunktfunkProbeResult{throughput_kbps, loss_pct, …}`). Probe filler is diverted from the decoder.
Validated on loopback (synthetic source): a 20 Mbps/2 s probe measured 20050 kbps at 0% loss,
interleaved probe AUs excluded from frame verification. `punktfunk-client-rs` gains `--bitrate` +
interleaved probe AUs excluded from frame verification. `punktfunk-probe` gains `--bitrate` +
`--speed-test KBPS:MS` as the reference/loopback driver.
**Done (Apple client UI):** Settings grows a Bitrate control (Automatic = host default; manual is
@@ -244,7 +244,7 @@ the GF(2⁸)/Moonlight ~1 Gbps wall). A 6-way subagent investigation (2026-06-11
**Verdict: ~halfway, and it's mostly clamps + ONE real piece of work.** Already 1 Gbps-ready and
untouched: the integer/type path (u32 kbps → u64 → int64_t, no truncation); FEC (a 1 Gbps frame is
only ~434-874 data shards = a single GF(2¹⁶) block, two orders under the 65535 ceiling); AES-GCM
(RustCrypto auto AES-NI, ~10-25× headroom on x86_64); the u64 sequence/nonce space; and the **M1
(RustCrypto auto AES-NI, ~10-25× headroom on x86_64); the u64 sequence/nonce space; and the **core
`ReassemblerLimits`** — fully *derived* from the negotiated `FecConfig`, so they already admit every
legit high-bitrate frame with nothing to relax. Security invariant to keep: every allocation size
must trace to a host-negotiated parameter clamped to a scheme ceiling — scale via the negotiated
@@ -271,7 +271,7 @@ params (`max_data_per_block`, `shard_payload`), never by widening a bound by han
- **DoS hygiene (last):** derive the one hardcoded reassembler field (`max_frame_bytes` = 64 MiB,
never set by `session_config`) from the negotiated mode/bitrate — strictly *tightens* the surface.
- **Validate with the speed-test probe** (it reuses the real `submit_frame`→FEC+crypto+send path):
`punktfunk-client-rs --speed-test KBPS:MS`, RELEASE build (debug is CPU-bound ~30 Mbps), watching
`punktfunk-probe --speed-test KBPS:MS`, RELEASE build (debug is CPU-bound ~30 Mbps), watching
`packets_send_dropped`. Open Qs: NVENC CBR rate-tracking at 0.5-1 Gbps (no explicit
`rc_buffer_size`); LAN/QEMU-NIC jumbo/GSO support; any `web/` bitrate slider hardcoding 500 Mbps.
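
The shard-count claim earlier in this section (a 1 Gbps frame fits a single GF(2¹⁶) block) can be checked with a few lines of arithmetic. This is a rough sketch only: `data_shards_per_frame` is a hypothetical helper, and the 1200-byte shard payload and the frame rates used below are assumptions, not the negotiated `shard_payload` or mode.

```rust
/// Ceiling on shards per GF(2^16) block, as stated in the text.
pub const GF16_SHARD_CEILING: u64 = 65_535;

/// Data shards needed for one frame at `kbps` and `fps`, with `shard_payload`
/// bytes per shard. Hypothetical helper: the real payload size comes from the
/// negotiated FEC parameters, not from a constant. `fps` and `shard_payload`
/// must be non-zero.
pub fn data_shards_per_frame(kbps: u64, fps: u64, shard_payload: u64) -> u64 {
    let bytes_per_sec = kbps * 1000 / 8;
    // Round up twice: a partial byte or partial shard still costs a whole one.
    let frame_bytes = (bytes_per_sec + fps - 1) / fps;
    (frame_bytes + shard_payload - 1) / shard_payload
}
```

With these assumed inputs, 1 Gbps at 120 fps needs 869 shards and at 240 fps needs 435, both far below the 65535 ceiling.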
@@ -344,7 +344,7 @@ buffer; `sendmmsg`/`recvmmsg` batching; the capture-timestamp anchor placement.
The native protocol had no discovery — clients connected by `--connect HOST:PORT` only, while
GameStream already auto-discovered via mDNS (`_nvstream._tcp`). Now both the unified host
(`serve --native`) and standalone `m3-host` advertise the native service over mDNS:
(`serve --native`) and standalone `punktfunk1-host` advertise the native service over mDNS:
- **Service**: `_punktfunk._udp.local.` (UDP — punktfunk/1 is QUIC; the advertised port is the QUIC
control/data port). Host side: `crate::discovery::advertise_native`, wired into `m3::serve` so
@@ -353,7 +353,7 @@ GameStream already auto-discovered via mDNS (`_nvstream._tcp`). Now both the uni
- **TXT records**: `proto=punktfunk/1`, `fp=<host cert SHA-256>` (the value a client pins — advisory
over unauthenticated mDNS, TOFU/pinning still verifies on connect), `pair=required|optional`
(so a picker knows up front whether the PIN ceremony is needed), `id=<host uniqueid>` (dedup).
- **Client**: `punktfunk-client-rs --discover [SECS]` browses and prints each host (name, addr:port,
- **Client**: `punktfunk-probe --discover [SECS]` browses and prints each host (name, addr:port,
pairing, fingerprint), then exits. Apple clients browse the same service natively via NWBrowser
(Bonjour) — no Rust-connector dependency; this section's service type + TXT keys are the contract.
- **Validated**: cross-LAN — dev box discovered the GNOME-box appliance
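
A client-side picker could turn the TXT keys listed above into a typed record along these lines. This is a minimal sketch: `HostAdvert`, `Pairing` and `parse_txt` are hypothetical names, not the project's types, and the entries are assumed to arrive as plain `key=value` strings. As the TXT description notes, the fingerprint is advisory over unauthenticated mDNS and still has to be verified on connect.

```rust
use std::collections::HashMap;

/// Whether the host demands the PIN ceremony (the `pair=` TXT key).
#[derive(Debug, PartialEq, Eq)]
pub enum Pairing {
    Required,
    Optional,
}

/// Hypothetical parsed view of one advertised host.
#[derive(Debug, PartialEq, Eq)]
pub struct HostAdvert {
    pub proto: String,
    pub fingerprint: String, // advisory only; pinning verifies it on connect
    pub pairing: Pairing,
    pub id: String,
}

/// Parse `key=value` TXT entries. Returns `None` if a required key is missing
/// or `pair` carries a value other than `required`/`optional`.
pub fn parse_txt(entries: &[&str]) -> Option<HostAdvert> {
    let map: HashMap<&str, &str> = entries
        .iter()
        .filter_map(|e| e.split_once('='))
        .collect();
    let pairing = match *map.get("pair")? {
        "required" => Pairing::Required,
        "optional" => Pairing::Optional,
        _ => return None,
    };
    Some(HostAdvert {
        proto: map.get("proto")?.to_string(),
        fingerprint: map.get("fp")?.to_string(),
        pairing,
        id: map.get("id")?.to_string(),
    })
}
```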
@@ -82,7 +82,7 @@ session unit — see [Bazzite](/docs/bazzite).
After a reboot, from another machine on the network:
```sh
punktfunk-client-rs --discover # or just look for the host in the Apple app / Moonlight
punktfunk-probe --discover # or just look for the host in the Apple app / Moonlight
```
If the host is listed, it's up. If not, check `journalctl --user -u punktfunk-host` on the host.
+6 -6
View File
@@ -11,10 +11,10 @@ and the design in the [Implementation Plan](/docs/implementation-plan); this pag
| Milestone | State |
|---|---|
| **M1** `punktfunk-core` + C ABI (protocol · FEC · crypto) | ✅ complete & hardened |
| **M2** GameStream host (Moonlight-compatible) | ✅ working end-to-end; HDR/surround-audio polish open |
| **M3** `punktfunk/1` native protocol (QUIC control + UDP data) | ✅ full session planes, validated live |
| **M4** — native client decode + present (Apple first) | 🟡 macOS stage 1 live; stage-2 presenter built + decode-tested (opt-in, present needs live validation). **Linux GTK client stage 1 live** (2026-06-12) |
| **Core** `punktfunk-core` + C ABI (protocol · FEC · crypto) | ✅ complete & hardened |
| **GameStream host** (Moonlight-compatible) | ✅ working end-to-end; HDR/surround-audio polish open |
| **Native protocol** `punktfunk/1` (QUIC control + UDP data) | ✅ full session planes, validated live |
| **Native clients** decode + present (Apple first) | 🟡 macOS stage 1 live; stage-2 presenter built + decode-tested (opt-in, present needs live validation). **Linux GTK client stage 1 live** (2026-06-12) |
## Live on the boxes
@@ -29,7 +29,7 @@ All three appliances advertise over mDNS (`_punktfunk._udp`) and require PIN pai
## Progress log
### 2026-06-12
- **Native Linux client — stage 1, first light** (`crates/punktfunk-client-linux`, binary
- **Native Linux client — stage 1, first light** (`clients/linux`, binary
`punktfunk-client`). GTK4/libadwaita app on the **Option A** architecture picked after a
six-angle research pass (toolkits / hw decode / Wayland presentation / input capture /
prior art / codebase): links `punktfunk-core` directly as a crate (no C ABI;
@@ -81,7 +81,7 @@ All three appliances advertise over mDNS (`_punktfunk._udp`) and require PIN pai
client's capture→reassembled latency valid **cross-machine**. Validated GNOME box → dev box:
offset 1.57 ms removed, **p50 1.30 ms** skew-corrected. (`05bc9ab`)
- **Native LAN auto-discovery** — host advertises `_punktfunk._udp` (TXT: fingerprint, pairing,
proto); `punktfunk-client-rs --discover` lists hosts. Validated cross-LAN. (`4fff464`)
proto); `punktfunk-probe --discover` lists hosts. Validated cross-LAN. (`4fff464`)
- **Third test box stood up** — home-worker-3 (Ubuntu 26.04, RTX 4090, GNOME 50): first GNOME/Mutter
zero-copy streaming on a real desktop; **1 Gbps probe clean** (625 MB/5 s, `send_dropped=0`).
Two physical-NVIDIA gotchas documented in [Ubuntu — GNOME](/docs/ubuntu-gnome).
+1 -1
@@ -25,7 +25,7 @@ punktfunk is cleanly layered. **~95% of the codebase is platform-agnostic and re
| QUIC control plane (`quic.rs`, pairing, mode negotiation) | quinn + tokio are portable |
| GameStream P1.1 (mDNS, serverinfo, pairing, RTSP, ENet) — *except* `stream.rs`/`audio.rs` | pure wire logic |
| Management REST API (`mgmt.rs`) + OpenAPI | axum/tokio, portable |
| Pipeline + `m3.rs` orchestration | trait-generic — calls `capturer.next_frame()`, `encoder.submit/poll()`; **needs zero changes** |
| Pipeline + `punktfunk1.rs` orchestration | trait-generic — calls `capturer.next_frame()`, `encoder.submit/poll()`; **needs zero changes** |
| The **trait boundaries** themselves: `Capturer`, `Encoder`, `VirtualDisplay`, `InputInjector`, `AudioCapturer`, `VirtualMic` | platform-neutral signatures; Linux deps are already isolated under `[target.'cfg(target_os="linux")'.dependencies]` |
So a Windows host is **new `#[cfg(target_os = "windows")]` backend modules behind the existing
+40 -40
@@ -37,12 +37,12 @@ Apollo is host-only. A stream flows: **nvhttp** (HTTPS pairing + serverinfo/appl
| Apollo subsystem | Apollo key files | punktfunk counterpart |
|---|---|---|
| Apollo — Protocol & streaming (RTP/FEC/ENet/RTSP/crypto) | `stream.cpp`; `rtsp.cpp`; `crypto.cpp`; `network.cpp`; `rtsp.h`; `stream.h` | `gamestream/stream.rs`; `gamestream/video.rs`; `gamestream/rtsp.rs`; `gamestream/control.rs`; `gamestream/audio.rs`; `m3.rs` |
| Apollo — Protocol & streaming (RTP/FEC/ENet/RTSP/crypto) | `stream.cpp`; `rtsp.cpp`; `crypto.cpp`; `network.cpp`; `rtsp.h`; `stream.h` | `gamestream/stream.rs`; `gamestream/video.rs`; `gamestream/rtsp.rs`; `gamestream/control.rs`; `gamestream/audio.rs`; `punktfunk1.rs` |
| Apollo (Sunshine fork) — GameStream HTTP server, pairing ceremony, and discovery/UPnP | `nvhttp.cpp`; `nvhttp.h`; `httpcommon.cpp`; `upnp.cpp`; `crypto.h`; `crypto.cpp` | `gamestream/nvhttp.rs`; `gamestream/pairing.rs`; `gamestream/tls.rs`; `gamestream/cert.rs`; `gamestream/serverinfo.rs`; `gamestream/mod.rs` |
| Apollo — Video encode pipeline & codec config | `video.cpp`; `nvenc_base.cpp`; `video_colorspace.cpp`; `cbs.cpp`; `nvenc_config.h`; `video.h` | `encode.rs`; `encode/nvenc.rs`; `encode/sw.rs`; `encode/linux.rs`; `zerocopy/mod.rs`; `gamestream/control.rs` |
| Apollo — Audio capture, encode, transport (Windows host) | `audio.cpp`; `audio.h`; `audio.cpp`; `common.h`; `stream.cpp` | `audio.rs`; `audio/wasapi_cap.rs`; `audio/linux.rs`; `gamestream/audio.rs`; `m3.rs` |
| Apollo — Audio capture, encode, transport (Windows host) | `audio.cpp`; `audio.h`; `audio.cpp`; `common.h`; `stream.cpp` | `audio.rs`; `audio/wasapi_cap.rs`; `audio/linux.rs`; `gamestream/audio.rs`; `punktfunk1.rs` |
| Apollo (Sunshine fork) — Input handling & injection | `input.cpp`; `input.cpp`; `keylayout.h`; `misc.cpp` | — |
| Apollo: App/process launch & display configuration (Windows host) | `process.cpp`; `display_device.cpp`; `process.h`; `virtual_display.h`; `misc.cpp`; `utils.cpp` | `vdisplay/sudovda.rs`; `vdisplay.rs`; `gamestream/apps.rs`; `library.rs`; `m3.rs`; `capture/wgc_relay.rs` |
| Apollo: App/process launch & display configuration (Windows host) | `process.cpp`; `display_device.cpp`; `process.h`; `virtual_display.h`; `misc.cpp`; `utils.cpp` | `vdisplay/sudovda.rs`; `vdisplay.rs`; `gamestream/apps.rs`; `library.rs`; `punktfunk1.rs`; `capture/wgc_relay.rs` |
| Apollo: Config, management/web UI, system tray | `config.h`; `config.cpp`; `confighttp.cpp`; `confighttp.h`; `system_tray.cpp`; `system_tray.h` | `mgmt.rs`; `mgmt_token.rs`; `main.rs`; `native_pairing.rs`; `library.rs`; `docs/windows-host.md` |
### Apollo — Protocol & streaming (RTP/FEC/ENet/RTSP/crypto)
@@ -523,7 +523,7 @@ For the **shared GameStream wire protocol**, punktfunk is at par with Apollo and
**How punktfunk does it.**
punktfunk runs **two protocol planes** that share one crypto/FEC core (`punktfunk-core`), with the streaming logic split across the GameStream-compat host (`crates/punktfunk-host/src/gamestream/`) and the native punktfunk/1 host (`m3.rs`).
punktfunk runs **two protocol planes** that share one crypto/FEC core (`punktfunk-core`), with the streaming logic split across the GameStream-compat host (`crates/punktfunk-host/src/gamestream/`) and the native punktfunk/1 host (`punktfunk1.rs`).
#### GameStream-compat plane (mirrors Apollo's wire protocol)
- **RTSP** (`gamestream/rtsp.rs`): a hand-rolled TCP/48010 handler, one request per connection (`handle_conn` rtsp.rs:58, closes the socket so moonlight-common-c sees end-of-response). Maps OPTIONS/DESCRIBE/SETUP/ANNOUNCE/PLAY/TEARDOWN (rtsp.rs:124). DESCRIBE (`describe_sdp` rtsp.rs:224) emits HEVC/AV1 indicators and the six Opus `surround-params` lines in Sunshine's normal-before-HQ order (rtsp.rs:239-251). ANNOUNCE parses `x-nv-*` keys into a `StreamConfig` (rtsp.rs:270) + `AudioParams` (rtsp.rs:313), snapping Opus packet duration to 5/10 ms (rtsp.rs:327). PLAY launches video (`stream::start`) and audio (`audio::start`). **Notable: it advertises `encryptionSupported:0` (rtsp.rs:228) — streams are plaintext; only the ENet control plane is encrypted.**
@@ -533,16 +533,16 @@ punktfunk runs **two protocol planes** that share one crypto/FEC core (`punktfun
- **Pairing crypto** (`gamestream/crypto.rs`): AES-128-ECB-no-pad + SHA-256 + RSA, the GameStream pairing primitives.
#### Native punktfunk/1 plane (a strict superset, no Apollo equivalent)
- `m3.rs` drives QUIC control (Hello/Welcome/Start/Reconfigure/ClockProbe) + the hardened M1 `Session` data plane over raw UDP.
- `punktfunk1.rs` drives QUIC control (Hello/Welcome/Start/Reconfigure/ClockProbe) + the hardened core `Session` data plane over raw UDP.
- The data plane (`punktfunk-core/src/session.rs`, `crypto.rs`, `fec/`, `transport/udp.rs`) uses **GF(2¹⁶) Leopard FEC** (65535-shard ceiling, fec/mod.rs:5) + per-direction-salted AES-GCM with seq-as-AAD (crypto.rs:1-20).
- Transport (`transport/udp.rs`) has batched `sendmmsg`/`recvmmsg` AND **UDP GSO on both Linux (`UDP_SEGMENT`, udp.rs:97) and Windows (`WSASendMsg`+`UDP_SEND_MSG_SIZE`/USO, on by default, udp.rs:135-241)** with auto-fallback latching. The host send path is a dedicated `send_loop` (m3.rs:1804) doing FEC+seal+microburst-paced send (`paced_submit` m3.rs:1720) off the encode thread.
- Adds mid-stream `Reconfigure`, an 8-round NTP clock-skew handshake, and `ProbeRequest`/`run_probe_burst` bandwidth probing (m3.rs:1629) — none of which exist in GameStream/Apollo.
- Transport (`transport/udp.rs`) has batched `sendmmsg`/`recvmmsg` AND **UDP GSO on both Linux (`UDP_SEGMENT`, udp.rs:97) and Windows (`WSASendMsg`+`UDP_SEND_MSG_SIZE`/USO, on by default, udp.rs:135-241)** with auto-fallback latching. The host send path is a dedicated `send_loop` (punktfunk1.rs:1804) doing FEC+seal+microburst-paced send (`paced_submit` punktfunk1.rs:1720) off the encode thread.
- Adds mid-stream `Reconfigure`, an 8-round NTP clock-skew handshake, and `ProbeRequest`/`run_probe_burst` bandwidth probing (punktfunk1.rs:1629) — none of which exist in GameStream/Apollo.
**Intentional divergences (by design, not gaps):**
- Two protocol planes from one core: punktfunk keeps the GameStream wire protocol AND its own punktfunk/1 (QUIC control + raw-UDP GF(2^16) data plane), where Apollo is GameStream-only. The native plane intentionally expresses things GameStream cannot: GF(2^16) Leopard FEC (65535-shard ceiling vs Apollo's 255-shard ~1 Gbps wall, fec/mod.rs:5), mid-stream Reconfigure, an NTP clock-skew handshake, and active bandwidth probing (m3.rs:1629).
- Zero-copy by design: punktfunk's video packetizer and the native seal path are built around reusable buffers and a wire-buffer pool (m3.rs `reclaim_wires`), and the capture→encoder path is GPU zero-copy. Apollo's stream.cpp:676 also does zero-copy shard pointers, but punktfunk's choice is a core invariant, not just an optimization.
- Two protocol planes from one core: punktfunk keeps the GameStream wire protocol AND its own punktfunk/1 (QUIC control + raw-UDP GF(2^16) data plane), where Apollo is GameStream-only. The native plane intentionally expresses things GameStream cannot: GF(2^16) Leopard FEC (65535-shard ceiling vs Apollo's 255-shard ~1 Gbps wall, fec/mod.rs:5), mid-stream Reconfigure, an NTP clock-skew handshake, and active bandwidth probing (punktfunk1.rs:1629).
- Zero-copy by design: punktfunk's video packetizer and the native seal path are built around reusable buffers and a wire-buffer pool (punktfunk1.rs `reclaim_wires`), and the capture→encoder path is GPU zero-copy. Apollo's stream.cpp:676 also does zero-copy shard pointers, but punktfunk's choice is a core invariant, not just an optimization.
- Native-thread hot path with QUIC only on the control plane: punktfunk forbids async on the per-frame path (tokio/quinn live only behind the quic feature, used for control/m3 orchestration). The per-frame video send loops are plain std::thread + blocking sockets, matching Apollo's threaded model rather than introducing async I/O on the data plane.
- Encrypt-by-default native plane vs plaintext GameStream: punktfunk advertises encryptionSupported:0 for GameStream (plaintext video/audio, rtsp.rs:228) to match stock Moonlight expectations, but the native plane mandates per-direction-salted AES-GCM with seq-as-AAD (crypto.rs) — a deliberate security upgrade GameStream's optional/legacy crypto cannot match.
- Control-plane GCM scheme auto-detection: rather than hardcoding one of Moonlight's nonce/tag/key permutations like Apollo does per negotiated flag, punktfunk brute-forces the authenticating combination once per connection (control.rs:289) — more robust across moonlight-common-c variants, a Rust-idiomatic divergence.
@@ -625,7 +625,7 @@ punktfunk has **three encoder back-ends behind one trait**, `Encoder` (`encode.r
**10-bit HEVC Main10** on Windows: upconverts 8-bit ARGB via `pixelBitDepthMinus8=2` + Main10 profile GUID (`nvenc.rs:233-237`), and sets a BT.2020/PQ VUI when the capturer hands a 10-bit `Rgb10a2` frame (`nvenc.rs:243-254`). `validate_dimensions` (`encode.rs:83-99`) is the up-front gate on attacker/typo dimensions (zero/odd/over-max).
**Bit-depth/HDR negotiation** happens at the protocol layer (`m3.rs:563-569`), not in a colorspace module; there is no separate colorspace negotiation file.
**Bit-depth/HDR negotiation** happens at the protocol layer (`punktfunk1.rs:563-569`), not in a colorspace module; there is no separate colorspace negotiation file.
**Intentional divergences (by design, not gaps):**
@@ -660,20 +660,20 @@ Richer than the Windows path: opens a sink-monitor capture (`STREAM_CAPTURE_SINK
##### Transport — GameStream path (`crates/punktfunk-host/src/gamestream/audio.rs`)
Learns the client audio endpoint from a port-learning ping (audio.rs:305-315), reuses or reopens the persistent capturer when the channel count changes (audio.rs:318-335), and builds a per-session `SessionEncoder`: plain `opus` crate for stereo (RESTRICTED_LOWDELAY + hard CBR, audio.rs:357-365), or a hand-wrapped `audiopus_sys` multistream encoder for 5.1/7.1 (audio.rs:404-465, **Linux-only — Windows surround `bail!`s** at audio.rs:371-378). Each frame is AES-128-CBC encrypted under the `/launch` rikey with a per-packet IV (audio.rs:540-544), wrapped in a 12-byte RTP header (payload type 97, audio.rs:213-222), and paced to its packet-duration slot (audio.rs:592-598). Surround sessions add Sunshine-compatible RS(4,2) Reed-Solomon audio FEC (audio.rs:194-248, 553-583) with a verbatim OpenFEC matrix.
##### Transport — native punktfunk/1 path (`crates/punktfunk-host/src/m3.rs:1379-1453`)
The same capturer feeds an Opus stereo encoder (128 kbps, LOWDELAY, CBR — m3.rs:1401-1414) whose output goes straight into `encode_audio_datagram` over QUIC (0xC9, m3.rs:1437-1441). QUIC already encrypts, so no AES-CBC/RTP/FEC layer. Stereo-only.
##### Transport — native punktfunk/1 path (`crates/punktfunk-host/src/punktfunk1.rs:1379-1453`)
The same capturer feeds an Opus stereo encoder (128 kbps, LOWDELAY, CBR — punktfunk1.rs:1401-1414) whose output goes straight into `encode_audio_datagram` over QUIC (0xC9, punktfunk1.rs:1437-1441). QUIC already encrypts, so no AES-CBC/RTP/FEC layer. Stereo-only.
##### Shared lifecycle
Both transports use the persistent `AudioCapSlot` (gamestream/audio.rs:251-257) and the "audio is best-effort" convention: a capture-open failure logs a warning and the session continues without sound (m3.rs:1395-1399, gamestream/audio.rs:332-334).
Both transports use the persistent `AudioCapSlot` (gamestream/audio.rs:251-257) and the "audio is best-effort" convention: a capture-open failure logs a warning and the session continues without sound (punktfunk1.rs:1395-1399, gamestream/audio.rs:332-334).
**Intentional divergences (by design, not gaps):**
- Native protocol transport is fundamentally different and BETTER, not a gap: punktfunk/1 ships Opus as QUIC datagrams (0xC9, m3.rs:1437-1441) — QUIC provides encryption + the data plane carries GF(2^16) Leopard FEC, so there is no AES-CBC/RTP/Reed-Solomon-RS(4,2) layer like GameStream/Apollo. This is inexpressible in Apollo's Moonlight-only world.
- Native protocol transport is fundamentally different and BETTER, not a gap: punktfunk/1 ships Opus as QUIC datagrams (0xC9, punktfunk1.rs:1437-1441) — QUIC provides encryption + the data plane carries GF(2^16) Leopard FEC, so there is no AES-CBC/RTP/Reed-Solomon-RS(4,2) layer like GameStream/Apollo. This is inexpressible in Apollo's Moonlight-only world.
- One C-ABI core, cfg-dispatched backends: a single `AudioCapturer` trait (audio.rs:17-30) with PipeWire (Linux) and WASAPI (Windows) impls behind one `open_audio_capture` factory — vs Apollo's separate src/audio.cpp + per-platform src/platform/*/audio.cpp. punktfunk's capture, encode, and transport all stay synchronous on native threads (no async on the per-frame path).
- GameStream surround is more correct than Apollo: punktfunk rotates the surround-params mapping over [3, channels) (gamestream/audio.rs:159-171) so 7.1 LFE/SL/SR round-trip after the client's GFE-order swap, where Sunshine/Apollo only rotate [3,6) and scramble 7.1 — verified by a real-codec round-trip test (gamestream/audio.rs:668-688, 750-818).
- WASAPI silent-flag + buffer-straddle handling that Apollo does by hand is provided by the `wasapi` 0.23 crate's `read_from_device_to_deque` (wasapi_cap.rs:146) + a byte VecDeque carrying the remainder (wasapi_cap.rs:135,152-156). Not a missing behavior — it is delegated to the dependency, so Apollo's manual silent-flag lesson is largely already covered on the capture-correctness axis.
- Audio is explicitly best-effort and decoupled: a failed open just logs and the session streams video-only (m3.rs:1395-1399, gamestream/audio.rs:332-334) — the same hard-won lesson Apollo encodes with its fail_guard, already adopted.
- Audio is explicitly best-effort and decoupled: a failed open just logs and the session streams video-only (punktfunk1.rs:1395-1399, gamestream/audio.rs:332-334) — the same hard-won lesson Apollo encodes with its fail_guard, already adopted.
**Transfer candidates from Apollo (6):** _Detect default-render-device changes and reinit WASAPI capture_, _Raise the WASAPI capture thread to MMCSS Pro Audio priority_, _Support 5.1/7.1 WASAPI loopback + multistream encode on Windows_, _Implement client-mic passthrough on Windows_, _Surface WASAPI data-discontinuity as a glitch diagnostic_, _Recover when there is no render endpoint at session start_ — see Part 4.
@@ -707,16 +707,16 @@ One virtual Xbox 360 pad per client index, lazily plugged on first `State` (`gam
#### Threading — two different models
- **GameStream**: injection happens **inline on the ENet service thread**: `on_receive` calls `inj.inject(&ev)` directly inside the `host.service()` loop (`gamestream/control.rs:84-91, 207-211`). Rumble is pumped each 2 ms tick (`control.rs:103-124`).
- **punktfunk/1 (m3)**: a per-session `input_thread` (`m3.rs:1245-1344`) receives decoded events over an mpsc channel, routes pointer/keyboard to a **host-lifetime `injector_service_thread`** (`m3.rs:1011-1064`, `inj_tx.send(ev)` at `m3.rs:1300`) and gamepad events to the session `PadBackend`. It tracks `held_buttons`/`held_keys` and **synthesizes release events at session end** to avoid latched-button drag (`m3.rs:1267-1301, 1345-1368`). The injector service auto-reopens when the resolved backend changes mid-session (`m3.rs:1020-1029`).
- **punktfunk/1 (m3)**: a per-session `input_thread` (`punktfunk1.rs:1245-1344`) receives decoded events over an mpsc channel, routes pointer/keyboard to a **host-lifetime `injector_service_thread`** (`punktfunk1.rs:1011-1064`, `inj_tx.send(ev)` at `punktfunk1.rs:1300`) and gamepad events to the session `PadBackend`. It tracks `held_buttons`/`held_keys` and **synthesizes release events at session end** to avoid latched-button drag (`punktfunk1.rs:1267-1301, 1345-1368`). The injector service auto-reopens when the resolved backend changes mid-session (`punktfunk1.rs:1020-1029`).
**Intentional divergences (by design, not gaps):**
- One core, one event struct: both protocols decode to the same fixed-size punktfunk_core::input::InputEvent (input.rs:108-150) that crosses the C ABI unchanged, and all backends implement one InputInjector trait (inject.rs:18-20). Apollo bridges through the platf:: abstraction but has no cross-process ABI — punktfunk's event is the same shape host-side and client-side (the embeddable NativeClient sends it back via punktfunk_connection_send_input, m3.rs:2798).
- One core, one event struct: both protocols decode to the same fixed-size punktfunk_core::input::InputEvent (input.rs:108-150) that crosses the C ABI unchanged, and all backends implement one InputInjector trait (inject.rs:18-20). Apollo bridges through the platf:: abstraction but has no cross-process ABI — punktfunk's event is the same shape host-side and client-side (the embeddable NativeClient sends it back via punktfunk_connection_send_input, punktfunk1.rs:2798).
- Runtime VK→scancode via MapVirtualKeyExW (sendinput.rs:209) instead of Apollo's compile-time static US-English table (keylayout.h). This intentionally respects the host's live keyboard layout rather than forcing 0x409; the trade-off (no fixed canonical mapping for games that demand exact set-1 scancodes) is the candidate improvement below, not a bug.
- punktfunk/1 native path runs gamepads as a per-session PadBackend that can be a virtual DualSense (UHID hid-playstation) on Linux — a game sees a REAL DualSense with adaptive triggers/lightbar/touchpad/motion, with rich feedback flowing back over a dedicated 0xCD HID-output plane (m3.rs:1169-1235). Apollo emulates DS4 via ViGEm on Windows; punktfunk's UHID approach is a deliberately different, higher-fidelity Linux design (not applicable to the Windows host).
- Native threading model: the m3 path decouples ingest from injection via an mpsc channel to a host-lifetime injector service thread (m3.rs:1011-1064, 1300) — same anti-head-of-line-blocking intent as Apollo's task-pool queue but realized with Rust channels + native threads (no async, honoring the no-async-on-the-frame-path invariant). It also auto-reopens the injector when the active session/backend changes mid-stream (m3.rs:1020-1029), which Apollo has no analogue for.
- Session-end safety: the m3 input thread tracks held buttons/keys and synthesizes release events when the client vanishes mid-press (m3.rs:1267-1301, 1350-1368), preventing latched-button drag across reconnects — a robustness feature Apollo's per-stream input_t does not implement this way.
- punktfunk/1 native path runs gamepads as a per-session PadBackend that can be a virtual DualSense (UHID hid-playstation) on Linux — a game sees a REAL DualSense with adaptive triggers/lightbar/touchpad/motion, with rich feedback flowing back over a dedicated 0xCD HID-output plane (punktfunk1.rs:1169-1235). Apollo emulates DS4 via ViGEm on Windows; punktfunk's UHID approach is a deliberately different, higher-fidelity Linux design (not applicable to the Windows host).
- Native threading model: the m3 path decouples ingest from injection via an mpsc channel to a host-lifetime injector service thread (punktfunk1.rs:1011-1064, 1300) — same anti-head-of-line-blocking intent as Apollo's task-pool queue but realized with Rust channels + native threads (no async, honoring the no-async-on-the-frame-path invariant). It also auto-reopens the injector when the active session/backend changes mid-stream (punktfunk1.rs:1020-1029), which Apollo has no analogue for.
- Session-end safety: the m3 input thread tracks held buttons/keys and synthesizes release events when the client vanishes mid-press (punktfunk1.rs:1267-1301, 1350-1368), preventing latched-button drag across reconnects — a robustness feature Apollo's per-stream input_t does not implement this way.
**Transfer candidates from Apollo (6):** _Switch SendInput to retry-on-failure desktop reattach (drop per-event OpenInputDesktop)_, _Move GameStream input injection off the ENet service thread_, _Coalesce relative-mouse/scroll/controller spam before injection_, _Add virtual touch + pen on the Windows host via synthetic pointer devices_, _Map absolute mouse to the target output rect, not the whole virtual desktop_, _Add a static canonical VK→scancode table as a layout-independent fallback_ — see Part 4.
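
The session-end safety behavior described above (track what is held, synthesize releases when the client vanishes) reduces to a small amount of state. The following is an illustrative sketch, not the code in `punktfunk1.rs`: `KeyEvent` and `HeldTracker` are hypothetical stand-ins for the real `InputEvent` handling, and only key codes are modelled.

```rust
use std::collections::BTreeSet;

/// Simplified stand-in for the real input event (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum KeyEvent {
    Down(u32),
    Up(u32),
}

/// Remembers which codes are currently pressed for one session.
#[derive(Default)]
pub struct HeldTracker {
    // BTreeSet keeps the synthesized releases in a deterministic order.
    held: BTreeSet<u32>,
}

impl HeldTracker {
    /// Record an event on its way to the injector.
    pub fn observe(&mut self, ev: KeyEvent) {
        match ev {
            KeyEvent::Down(code) => {
                self.held.insert(code);
            }
            KeyEvent::Up(code) => {
                self.held.remove(&code);
            }
        }
    }

    /// At session end: emit an `Up` for everything still held and clear the
    /// set, so nothing stays latched for the next session.
    pub fn release_all(&mut self) -> Vec<KeyEvent> {
        std::mem::take(&mut self.held)
            .into_iter()
            .map(KeyEvent::Up)
            .collect()
    }
}
```

Calling `release_all` a second time yields nothing, so invoking it on every teardown path is harmless.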
@@ -737,7 +737,7 @@ For the Windows host specifically, Apollo is clearly ahead on this subsystem. Ap
**Layer 2 — virtual display (`vdisplay.rs` + `vdisplay/sudovda.rs`).** The `VirtualDisplay` trait (`vdisplay.rs:47-53`) is RAII: `create(mode) -> VirtualOutput { node_id, win_capture, keepalive }`, teardown by dropping `keepalive`. On Windows `open()` always returns the single `SudoVdaDisplay` (the compositor arg is moot, `vdisplay.rs:525-530`).
#### Windows launch + display-config flow
`m3.rs:532-543` (and the GameStream twin `stream.rs:93-97`) resolve the launch id to a command and do `std::env::set_var("PUNKTFUNK_GAMESCOPE_APP", &cmd)`, **but that env var is read only by the Linux gamescope backend** (`vdisplay/gamescope.rs:441`). On Windows there is **no process launch at all**: SudoVDA's `create()` (`sudovda/sudovda.rs:448-543`) never spawns the app, and the only `CreateProcessAsUserW` in the crate is the WGC capture helper (`capture/wgc_relay.rs:6`), not app launch. So on Windows a session shows the bare desktop; apps.json `cmd`/library titles are effectively dead.
`punktfunk1.rs:532-543` (and the GameStream twin `stream.rs:93-97`) resolve the launch id to a command and do `std::env::set_var("PUNKTFUNK_GAMESCOPE_APP", &cmd)`, **but that env var is read only by the Linux gamescope backend** (`vdisplay/gamescope.rs:441`). On Windows there is **no process launch at all**: SudoVDA's `create()` (`sudovda/sudovda.rs:448-543`) never spawns the app, and the only `CreateProcessAsUserW` in the crate is the WGC capture helper (`capture/wgc_relay.rs:6`), not app launch. So on Windows a session shows the bare desktop; apps.json `cmd`/library titles are effectively dead.
`SudoVdaDisplay::create` does the display-config work Apollo splits into a separate library, inline: ADD-IOCTL at the client's exact WxH@Hz (`sudovda.rs:452-469`), watchdog ping thread keyed to the driver's reported timeout (`sudovda.rs:481-494`), resolve `\\.\DisplayN` via CCD with a 15×200ms retry (`sudovda.rs:499-505`), `set_active_mode()` which enumerates advertised modes and falls back to the best refresh AT THE SAME RESOLUTION + `CDS_TEST` before `CDS_SET_PRIMARY` (`sudovda.rs:146-265`), then `isolate_displays()` detaches every physical display so the secure desktop renders on the VD (`sudovda.rs:276-329`). Teardown (`sudovda.rs:559-580`) stops the pinger, `restore_displays()`, then REMOVE by a **fixed** `MONITOR_GUID` (`sudovda.rs:59`, "one session at a time today").
@@ -899,7 +899,7 @@ QPC values from `LastPresentTime`/`LastMouseUpdateTime` are translated to `stead
- **Treat S_OK-with-no-change frames as timeouts via DXGI update flags** (sev high, medium) — In dxgi.rs acquire(), after a successful AcquireNextFrame, compute frame_update_flag = info.LastPresentTime != 0 (and/or info.AccumulatedFrames != 0) and mouse_update_flag from LastMouseUpdateTime/PointerShapeBufferSize. Always call update_cursor (mouse). If !frame_update_flag, ReleaseFrame and return Ok(None) (so next_frame repeats last_present) UNLESS the cursor moved and we need a recomposite — in which case recomposite onto the existing last_present texture instead of CopyResource'ing the source. This cuts idle/cursor-only GPU load and avoids re-encoding unchanged content.
- **Detect resolution/format change on the acquire hot path, not only during rebuild** (sev high, small) — In acquire(), after res.cast::<ID3D11Texture2D>(), call GetDesc and compare Width/Height/Format against self.width/height and the expected format (BGRA8 vs R16G16B16A16_FLOAT). On mismatch, ReleaseFrame and run the existing recreate_dupl path (or drop gpu_copy/staging/fp16/hdr10 textures and update width/height/hdr_fp16) so the encoder re-inits cleanly. This makes live resolution + HDR-toggle changes robust even when DDA doesn't fault.
- **Release the duplication device lock during idle to avoid encoder starvation** (sev medium, small) — Cap the per-acquire DDA timeout to a small value (e.g. 8-16ms) and, when it returns WAIT_TIMEOUT, std::thread::sleep a few ms with no outstanding AcquireNextFrame before retrying — so the encode thread can grab the device for NVENC setup/reinit. Keep the generous timeout only for first_frame. Low risk, directly mirrors Apollo's documented fix.
- **Add client-framerate frame pacing with a high-precision timer** (sev medium, large) — Add an optional pacing layer (in dxgi.rs or the encode-loop caller in m3.rs/encode.rs) keyed on the negotiated client framerate: track a group start from the frame pts, sleep to the computed target with a Windows high-resolution timer (timeBeginPeriod or CREATE_WAITABLE_TIMER_HIGH_RESOLUTION), and snap near-integral refresh to integer divisors. This is the lever for steady pacing on odd refresh rates without changing the zero-copy design.
- **Add client-framerate frame pacing with a high-precision timer** (sev medium, large) — Add an optional pacing layer (in dxgi.rs or the encode-loop caller in punktfunk1.rs/encode.rs) keyed on the negotiated client framerate: track a group start from the frame pts, sleep to the computed target with a Windows high-resolution timer (timeBeginPeriod or CREATE_WAITABLE_TIMER_HIGH_RESOLUTION), and snap near-integral refresh to integer divisors. This is the lever for steady pacing on odd refresh rates without changing the zero-copy design.
- **Harden GPU scheduling priority + SetMaximumFrameLatency + NVIDIA-HAGS NVENC-realtime avoidance** (sev medium, medium) — After D3D11CreateDevice in dxgi.rs (and the NVENC encoder device wherever it's built), query IDXGIDevice1::SetMaximumFrameLatency(1) and SetGPUThreadPriority; load gdi32 D3DKMTSetProcessSchedulingPriorityClass and request HIGH (not REALTIME) when the adapter is NVIDIA (VendorId 0x10DE) with HAGS on, REALTIME otherwise. Mirror the privilege-enable. Guard behind admin/SYSTEM (host already relaunches as SYSTEM).
- **Retry DuplicateOutput at startup and request encoder-supported formats via Output5** (sev medium, small) — In open() wrap DuplicateOutput in a short retry (2-3 tries, ~200ms apart, re-attach_input_desktop between) before bailing. Optionally cast the output to IDXGIOutput5 and call DuplicateOutput1 with an explicit format list (BGRA8 for SDR, R16G16B16A16_FLOAT for HDR) so the capture format is intentional rather than incidental, falling back to DuplicateOutput when Output5 is absent.
@@ -1173,7 +1173,7 @@ punktfunk actually has a **surprisingly complete** HDR capture+encode path on Wi
2. **NVENC ingests RGB (ABGR10), not YUV.** punktfunk feeds R10G10B10A2 RGB and relies on **NVENC's internal RGB→YUV** with a hardcoded VUI. Apollo converts to P010 in its own shader so it controls the exact ITU-T H.273 matrix/range. punktfunk has `video_colorspace`-equivalent math nowhere; the rich `new_color_vectors_from_colorspace` machinery has no analogue.
3. **SDR VUI is never signaled.** For the SDR 8-bit path the NVENC VUI block (`nvenc.rs:243`) is only set when `hdr`; SDR streams carry NVENC's default VUI (effectively unsignaled / BT.709 by luck), with no full/limited-range or rec601/709 control. Apollo always sets all four fields for SDR too.
4. **HDR detection is a single hardcoded colorspace constant** (`G2084_NONE_P2020`); no logging of the full `DXGI_OUTPUT_DESC1` like Apollo's `display_base.cpp:722-732`, making field diagnosis hard.
5. **No client-request AND display-HDR gate.** The protocol has `VIDEO_CAP_HDR` (`quic.rs:86`) but HDR is driven purely by the capture surface format; there's no explicit "client wanted HDR but desktop is SDR → signal SDR" decision like Apollo's `:29`. `m3.rs:558-563` only gates *bit depth*, and the doc comment (`quic.rs:128`) admits "BT.2020 PQ HDR signaling is added alongside HDR support" — i.e. the Welcome doesn't carry colorspace.
5. **No client-request AND display-HDR gate.** The protocol has `VIDEO_CAP_HDR` (`quic.rs:86`) but HDR is driven purely by the capture surface format; there's no explicit "client wanted HDR but desktop is SDR → signal SDR" decision like Apollo's `:29`. `punktfunk1.rs:558-563` only gates *bit depth*, and the doc comment (`quic.rs:128`) admits "BT.2020 PQ HDR signaling is added alongside HDR support" — i.e. the Welcome doesn't carry colorspace.
6. **Split-encode disabled for 10-bit** (`nvenc.rs:166`) is a sensible measured choice, but it means HDR throughput is single-engine-bound — a known limitation, not a bug.
@@ -1182,7 +1182,7 @@ punktfunk actually has a **surprisingly complete** HDR capture+encode path on Wi
- **Plumb HDR10 static metadata (mastering display + MaxCLL/MaxFALL) into NVENC** (sev high, medium) — In capture/dxgi.rs read IDXGIOutput6::GetDesc1 (MaxLuminance/MinLuminance/MaxFullFrameLuminance + primaries) when entering the HDR path, carry it on D3d11Frame/CapturedFrame, and in encode/nvenc.rs populate NV_ENC_MASTERING_DISPLAY_INFO + NV_ENC_CONTENT_LIGHT_LEVEL (set via NV_ENC_HEVC_PROFILE_MAIN10 SEI fields / the per-pic HDR metadata API) on the keyframe. Override primaries to Rec.2020/D65 like Apollo.
- **Always signal the SDR VUI (primaries/transfer/matrix/range), not just HDR** (sev medium, small) — In encode/nvenc.rs always set videoSignalTypePresentFlag + colourDescriptionPresentFlag, defaulting SDR to BT.709 primaries/transfer/matrix and limited range, and thread a colorspace/range choice down from the Welcome so the client and decoder agree.
- **Convert to P010 in a D3D11 shader and feed NVENC YUV instead of ABGR10 RGB** (sev medium, large) — Add an optional GPU RGB->P010 conversion pass in capture/dxgi.rs (mirroring HdrConverter) producing DXGI_FORMAT_P010, and switch the NVENC buffer format to NV_ENC_BUFFER_FORMAT_YUV420_10BIT. Port video_colorspace's matrix generation to Rust as the conversion's constant buffer. Keep the current RGB path as a fallback behind an env knob.
- **Gate HDR on (client requested HDR) AND (desktop is actually HDR), and signal the result in Welcome** (sev medium, medium) — Add a colorspace/dynamic-range field to the Welcome (m3.rs around :562-612), resolve HDR = host_wants AND client VIDEO_CAP_HDR AND capture-surface-is-FP16, and send the resolved colorspace so the client decoder/presenter sets the right transfer/primaries.
- **Gate HDR on (client requested HDR) AND (desktop is actually HDR), and signal the result in Welcome** (sev medium, medium) — Add a colorspace/dynamic-range field to the Welcome (punktfunk1.rs around :562-612), resolve HDR = host_wants AND client VIDEO_CAP_HDR AND capture-surface-is-FP16, and send the resolved colorspace so the client decoder/presenter sets the right transfer/primaries.
- **Log the full DXGI_OUTPUT_DESC1 + capture format on HDR setup** (sev low, small) — When hdr_fp16 is detected in capture/dxgi.rs, QueryInterface IDXGIOutput6, GetDesc1, and tracing::info! the colorspace name, BitsPerColor, primaries, white point, and Max/Min/MaxFullFrame luminance — porting Apollo's colorspace_to_string table.
- **Derive HDR cursor/graphics white from the display, not a fixed 203 nits** (sev low, small) — Once GetDesc1 luminance is read (per hdr10-static-metadata), use MaxFullFrameLuminance (or the SDR-white registry value if exposed) as the cursor white target instead of a constant, falling back to 203; document the limitation like Apollo.
@@ -1268,7 +1268,7 @@ Strict reverse order with logged-but-non-fatal failures: destroy bitstream buffe
punktfunk drives the **raw NVENC API** via `nvidia_video_codec_sdk::{sys, ENCODE_API}` (the safe wrapper is CUDA-only) in a single struct `NvencD3d11Encoder` (`nvenc.rs:38`). It implements the same `Encoder` trait (`submit`/`request_keyframe`/`poll`/`flush`, `encode.rs:41-51`) as the Linux/SW encoders. What it does well:
- **True zero-copy, register-in-place.** Unlike Apollo (which owns a dedicated input texture and color-converts into it), punktfunk registers the **capturer's own** texture by raw pointer, caches the registration in `regs: HashMap<isize,...>`, and `encode_picture`s it directly — no per-frame `CopyResource` (`nvenc.rs:404-423`, comment `:1-12`). This is a genuinely tighter zero-copy path than Apollo's, valid because the encode loop is synchronous (`gamestream/stream.rs:338-344`, `m3.rs`).
- **True zero-copy, register-in-place.** Unlike Apollo (which owns a dedicated input texture and color-converts into it), punktfunk registers the **capturer's own** texture by raw pointer, caches the registration in `regs: HashMap<isize,...>`, and `encode_picture`s it directly — no per-frame `CopyResource` (`nvenc.rs:404-423`, comment `:1-12`). This is a genuinely tighter zero-copy path than Apollo's, valid because the encode loop is synchronous (`gamestream/stream.rs:338-344`, `punktfunk1.rs`).
- **Shared device.** Session opened on the capturer's `ID3D11Device` carried on `FramePayload::D3d11` (`nvenc.rs:182-192, 394`); re-inits on device/size/HDR change (`:366-397`) — handles the secure-desktop device recreate.
- **Low-latency RC mirrors Apollo's intent:** P1 + `ULTRA_LOW_LATENCY` (`:206-207, 260-261`), infinite GOP (`gopLength = NVENC_INFINITE_GOPLENGTH`, `:218`), `frameIntervalP=1` (no B-frames, `:219`), CBR (`:220`), ~1-frame VBV (`vbvBufferSize=vbvInitialDelay=bitrate/fps`, `:225-227`).
- **Bitrate probe-and-step-down** on `initialize_encoder` InvalidParam, floor 10 Mbps (`:140-314`) — equivalent to Apollo's level-cap handling, done by retry.
@@ -1278,7 +1278,7 @@ punktfunk drives the **raw NVENC API** via `nvidia_video_codec_sdk::{sys, ENCODE
##### Where punktfunk is weaker / missing / fragile
1. **No reference-frame invalidation at all.** `request_keyframe()` only sets `force_kf` → a full **IDR** (`nvenc.rs:437-442, 465-467`). There is no `nvEncInvalidateRefFrames`, no DPB depth, no RFI-range handling, no "rfi_needs_confirmation" tagging. Apollo's entire `invalidate_ref_frames` (`nvenc_base.cpp:574-610`) and deep-DPB-with-L0=1 design (`:268-281`) is absent. punktfunk relies wholly on FEC + IDR for loss recovery — every recovery event is a costly keyframe spike (the exact thing the infinite-GOP design was meant to avoid). Note `m3.rs:2153`/`gamestream/stream.rs:336` call `request_keyframe()` for "RFI", but it's always a full IDR.
1. **No reference-frame invalidation at all.** `request_keyframe()` only sets `force_kf` → a full **IDR** (`nvenc.rs:437-442, 465-467`). There is no `nvEncInvalidateRefFrames`, no DPB depth, no RFI-range handling, no "rfi_needs_confirmation" tagging. Apollo's entire `invalidate_ref_frames` (`nvenc_base.cpp:574-610`) and deep-DPB-with-L0=1 design (`:268-281`) is absent. punktfunk relies wholly on FEC + IDR for loss recovery — every recovery event is a costly keyframe spike (the exact thing the infinite-GOP design was meant to avoid). Note `punktfunk1.rs:2153`/`gamestream/stream.rs:336` call `request_keyframe()` for "RFI", but it's always a full IDR.
2. **No async encode.** punktfunk is synchronous map→encode→lock with a `pending` VecDeque and `POOL=4` bitstreams (`:28, 459-460, 469-503`), but never uses a completion event / `enableEncodeAsync` / `doNotWait`. Apollo overlaps GPU encode with CPU via a Win32 event + 100 ms timeout (`nvenc_d3d11.cpp:15,64-66`, `nvenc_base.cpp:525,534-539`). punktfunk's `lock_bitstream` blocks with no timeout — a hung encode wedges the encode thread forever.
@@ -1295,7 +1295,7 @@ punktfunk drives the **raw NVENC API** via `nvidia_video_codec_sdk::{sys, ENCODE
#### Transfer opportunities
- **Add real reference-frame invalidation (RFI) instead of always forcing IDR** (sev high, large) — In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire m3.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
- **Add real reference-frame invalidation (RFI) instead of always forcing IDR** (sev high, large) — In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire punktfunk1.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
- **Query nvEncGetEncodeCaps and gate config on real GPU capabilities** (sev medium, medium) — Add a `get_cap(cap: NV_ENC_CAPS) -> i32` helper in nvenc.rs after open_encode_session_ex (using API.get_encode_caps), verify codec_guid is in get_encode_guids, reject out-of-range WxH up front, and use SUPPORT_10BIT_ENCODE / SUPPORT_REF_PIC_INVALIDATION / SUPPORT_CUSTOM_VBV_BUF_SIZE to gate the corresponding config rather than assuming support. Surfaces clear errors instead of opaque InvalidParam.
- **Use async encode with a Win32 completion event + timeout** (sev medium, medium) — In nvenc.rs, gate on NV_ENC_CAPS_ASYNC_ENCODE_SUPPORT, create a per-bitstream Win32 Event (windows::Win32::System::Threading::CreateEventW), set init.enableEncodeAsync=1, store the event in `pending`, set pic.completionEvent + lock.doNotWait=1, and in poll() WaitForSingleObject(ev, 100ms) before lock_bitstream — returning a clear timeout error instead of blocking forever.
- **Minimize NvEnc API/struct versions per codec for older-driver compatibility** (sev medium, medium) — Add a `min_api_version(codec)` (v11 for H264/HEVC, v12 for AV1) and a helper that rewrites the version word (and optionally the struct-revision byte) before each NvEnc struct is passed, mirroring nvenc_base.cpp:666-680. Set apiVersion in open_encode_session_ex (nvenc.rs:186) from it. Maximizes driver compatibility for the field.
@@ -1489,7 +1489,7 @@ punktfunk's SudoVDA backend lives in `crates/punktfunk-host/src/vdisplay/sudovda
#### Transfer opportunities
- **Detect watchdog ping failures and escalate (re-open the device)** (sev high, medium) — In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in m3.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
- **Detect watchdog ping failures and escalate (re-open the device)** (sev high, medium) — In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in punktfunk1.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
- **Gate on SudoVDA protocol-version compatibility instead of only logging it** (sev medium, small) — In SudoVdaDisplay::new (sudovda.rs:412-432) parse {Major,Minor,Incremental} and compare against a compiled-in EXPECTED_PROTOCOL {Major:0,Minor:2}. If Major differs or our Minor > driver Minor, return Err with a 'driver too old / incompatible — update SudoVDA' message (and a distinct error variant the mgmt API can surface, like Apollo's VirtualDisplayDriverReady in nvhttp.cpp:936).
- **Retry device open with exponential backoff** (sev medium, small) — Wrap open_device in SudoVdaDisplay::new (sudovda.rs:412-413) in a 20→320ms backoff loop matching Apollo; on a session-time re-open after watchdog failure, allow a few retries with ~1s spacing.
- **Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU** (sev high, medium) — Add `const IOCTL_SET_RENDER_ADAPTER: u32 = ctl(0x802);` and a `#[repr(C)] struct SetRenderAdapterParams { luid: LUID }` in sudovda.rs. Before ADD in create() (sudovda.rs:448), enumerate DXGI adapters (reuse capture/dxgi.rs adapter-by-LUID/name helpers) to match the configured/encoder GPU and issue the IOCTL so the IDD's AddOut LUID matches the capture device's adapter.
@@ -1570,7 +1570,7 @@ punktfunk's **secure-desktop / desktop-switch capture recovery is genuinely matu
- **Replace the PsExec scheduled-task launch with a real Windows service that relaunches the host on session change** (sev high, large) — Add a small Rust service binary (new crate or punktfunk-host `service` subcommand) using windows::Win32::System::Services (RegisterServiceCtrlHandlerEx, StartServiceCtrlDispatcher) that mirrors sunshinesvc.cpp: WTSGetActiveConsoleSessionId -> DuplicateTokenEx+SetTokenInformation(TokenSessionId) -> CreateProcessAsUserW(lpDesktop=winsta0\\default) into a kill-on-close job, accept SERVICE_ACCEPT_SESSIONCHANGE, and relaunch the host on a genuine console-session change. Ship an installer and drop the PsExec dependency.
- **Add an NvAPI driver-settings manager (PREFERRED_PSTATE_MAX + OGL_CPL_PREFER_DXPRESENT) with a crash-safe undo file** (sev medium, large) — Add a windows-only nvprefs module wrapping NvAPI DRS (load nvapi64 dynamically, treat NvAPI_Initialize failure as 'no NVIDIA, skip'). Create a 'punktfunk' app profile with PREFERRED_PSTATE_PREFER_MAX, set OGL_CPL_PREFER_DXPRESENT_ENABLED on the base profile behind a config flag, write an undo file under %ProgramData%\\punktfunk before global changes, and call it on session start (the new stream_will_start hook below).
- **Hook win32u!NtGdiDdDDIGetCachedHybridQueryValue to stop DXGI output-reparenting on hybrid/Optimus GPUs** (sev medium, medium) — Add a once-init in the Windows capture path (capture/dxgi.rs open) that installs the same hook via a minhook-rs/detour crate (or a manual IAT/inline hook) on NtGdiDdDDIGetCachedHybridQueryValue forcing STATE_UNSPECIFIED, plus SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2). Gate it to NVIDIA/hybrid boxes; it's process-lifetime so no teardown needed.
- **Add a Windows stream_will_start/stop hook: timer resolution, MMCSS, HIGH_PRIORITY_CLASS, display-required, headless Mouse Keys** (sev medium, medium) — Add a windows-only RAII guard invoked when a session starts (m3.rs/pipeline session setup) that raises timer resolution (NtSetTimerResolution or timeBeginPeriod(1)), DwmEnableMMCSS(true), SetPriorityClass(HIGH_PRIORITY_CLASS), and wraps the DXGI capture loop in SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED) (capture/dxgi.rs next_frame loop), reverting on drop. Optionally the headless Mouse-Keys trick for cursor visibility.
- **Add a Windows stream_will_start/stop hook: timer resolution, MMCSS, HIGH_PRIORITY_CLASS, display-required, headless Mouse Keys** (sev medium, medium) — Add a windows-only RAII guard invoked when a session starts (punktfunk1.rs/pipeline session setup) that raises timer resolution (NtSetTimerResolution or timeBeginPeriod(1)), DwmEnableMMCSS(true), SetPriorityClass(HIGH_PRIORITY_CLASS), and wraps the DXGI capture loop in SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED) (capture/dxgi.rs next_frame loop), reverting on drop. Optionally the headless Mouse-Keys trick for cursor visibility.
- **Use Windows-native DnsServiceRegister (or fix the TXT record) so Apple's mDNS resolver accepts the host** (sev low, medium) — Either (a) verify mdns-sd always emits an RFC-1035-valid TXT (never zero strings) and add a regression test, or (b) add a windows-only discovery backend using DnsServiceRegister via the windows crate's DNS APIs mirroring publish.cpp, including the single-empty-TXT workaround, so Apple NWBrowser/Moonlight discover the host reliably.
- **Add per-frame IDXGIFactory::IsCurrent reinit detection and switch the host clock to GetSystemTimePreciseAsFileTime** (sev medium, small) — In capture/dxgi.rs next_frame, query the cached IDXGIFactory's IsCurrent() once per loop and trigger the existing recreate path when it goes false (catches HDR/topology changes cleanly). Replace now_ns() on Windows with GetSystemTimePreciseAsFileTime converted to Unix-epoch ns so ClockProbe/ClockEcho skew correction stays accurate cross-machine.
@@ -1769,14 +1769,14 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
*Area:* `cmp:input` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** The control thread only enqueues bytes + schedules a task; a pool thread pops one packet, batches later same-type packets while holding the queue lock, then RELEASES the lock before the (slow) SendInput/ViGEm call — src/input.cpp:1481-1520, 1639-1643. A slow OS input call never stalls the network thread.
- **punktfunk gap:** on_receive() calls inj.inject(&ev) synchronously inside the host.service() ENet loop — crates/punktfunk-host/src/gamestream/control.rs:84-91,207-211. A SendInput that blocks crossing a desktop switch (or a slow ViGEm update) head-blocks ENet handshake/keepalive/retransmit servicing. The m3 path already does this right (m3.rs:1300 → injector_service_thread).
- **punktfunk gap:** on_receive() calls inj.inject(&ev) synchronously inside the host.service() ENet loop — crates/punktfunk-host/src/gamestream/control.rs:84-91,207-211. A SendInput that blocks crossing a desktop switch (or a slow ViGEm update) head-blocks ENet handshake/keepalive/retransmit servicing. The m3 path already does this right (punktfunk1.rs:1300 → injector_service_thread).
- **Proposal:** Mirror the m3 design in the GameStream control thread: push decoded InputEvents onto an mpsc channel drained by a dedicated injector thread (reuse injector_service_thread or a sibling), so the ENet thread never blocks on SendInput/ViGEm. No async needed — native thread + std::sync::mpsc, consistent with the invariant.
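
The hand-off described in the proposal can be sketched with nothing but `std::sync::mpsc` and a native thread. This is a minimal illustration, not the host's actual code: `InputEvent`, the `Injector` trait, `spawn_injector_thread`, and the thread name `punktfunk-inject` are hypothetical stand-ins for whatever the real control path and `injector_service_thread` use.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical stand-in for the host's decoded input event type.
#[derive(Debug, Clone, PartialEq)]
pub enum InputEvent {
    MouseMove { dx: i32, dy: i32 },
    Key { code: u16, down: bool },
}

/// Hypothetical injector interface; a real one would wrap SendInput/ViGEm.
pub trait Injector: Send + 'static {
    fn inject(&mut self, ev: &InputEvent);
}

/// Spawns a dedicated thread that owns the injector and drains the channel.
/// The network thread keeps only the `Sender`, so a slow `inject` call can
/// never stall ENet servicing.
pub fn spawn_injector_thread<I: Injector>(mut inj: I) -> (Sender<InputEvent>, JoinHandle<I>) {
    let (tx, rx) = mpsc::channel::<InputEvent>();
    let handle = thread::Builder::new()
        .name("punktfunk-inject".into()) // hypothetical thread name
        .spawn(move || {
            // recv() returns Err once every Sender has been dropped,
            // which ends the loop and lets the thread exit cleanly.
            while let Ok(ev) = rx.recv() {
                inj.inject(&ev);
            }
            inj
        })
        .expect("failed to spawn injector thread");
    (tx, handle)
}
```

Under these assumptions, `on_receive()` would call `tx.send(ev)` instead of `inj.inject(&ev)`. The channel is unbounded, so `send` does not block; whether a bounded channel with a drop policy is preferable under input floods is a separate design decision.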
#### 9. Actually launch the app/game on Windows (CreateProcessAsUserW into the user session)
*Area:* `cmp:process-launch` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** Apollo runs prep/detached/main commands via platf::run_command = retrieve_users_token + impersonate_current_user + CreateProcessAsUserW, launching apps from a SYSTEM service into the interactive user session (process.cpp execute() l430-494; platform/windows/misc.cpp run_command).
- **punktfunk gap:** On Windows the resolved command is only written to PUNKTFUNK_GAMESCOPE_APP (m3.rs:535-536, stream.rs:96), which is read solely by the Linux gamescope backend (vdisplay/gamescope.rs:441). SudoVdaDisplay::create (sudovda.rs:448-543) never spawns anything, so a Windows session always shows the bare desktop — apps.json cmd and library titles are dead on Windows.
- **punktfunk gap:** On Windows the resolved command is only written to PUNKTFUNK_GAMESCOPE_APP (punktfunk1.rs:535-536, stream.rs:96), which is read solely by the Linux gamescope backend (vdisplay/gamescope.rs:441). SudoVdaDisplay::create (sudovda.rs:448-543) never spawns anything, so a Windows session always shows the bare desktop — apps.json cmd and library titles are dead on Windows.
- **Proposal:** Add a Windows app-launch path: resolve the AppEntry.cmd / library launch_command, then CreateProcessAsUserW into the interactive session (the token+pipe plumbing already exists in capture/wgc_relay.rs — reuse retrieve-token + CreateProcessAsUserW). Track the launched process for lifetime/teardown. Make library.rs Windows launch resolve to a real command (steam steam://rungameid works on Windows) instead of the gamescope env var.
#### 10. Native system tray with state-driven icon + notifications
@@ -1826,7 +1826,7 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
- **Apollo does:** Apollo's ping thread counts consecutive PingDriver failures and after >3 calls failCb, which sets WATCHDOG_FAILED + closeVDisplayDevice; sessions then re-init the driver (virtual_display.cpp:603-616, process.cpp:65-78, process.cpp:243-246).
- **punktfunk gap:** punktfunk's pinger discards the ioctl Result entirely (`let _ = ioctl(...)`, sudovda.rs:485-494) — a dead driver is never noticed; there is no WATCHDOG_FAILED state and no re-init path.
- **Proposal:** In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in m3.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
- **Proposal:** In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in punktfunk1.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
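
The consecutive-failure counter from the proposal is small enough to sketch in isolation. The names `PingWatch`, `record`, and `MAX_CONSECUTIVE_FAILURES` are hypothetical, and the threshold of 3 follows the proposal's "N (3)" rather than Apollo's exact comparison. The real pinger would feed it the result of its watchdog ioctl.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Threshold taken from the proposal text ("after N (3) failures").
const MAX_CONSECUTIVE_FAILURES: u32 = 3;

/// Hypothetical helper owned by the pinger thread. `dead` is the shared
/// flag the session loop would poll to notice a dead virtual display.
pub struct PingWatch {
    failures: u32,
    dead: Arc<AtomicBool>,
}

impl PingWatch {
    pub fn new(dead: Arc<AtomicBool>) -> Self {
        Self { failures: 0, dead }
    }

    /// Records one ping outcome. Returns `true` while pinging should
    /// continue and `false` once the driver has been declared dead.
    pub fn record(&mut self, ping_ok: bool) -> bool {
        if ping_ok {
            // Any success resets the streak; only consecutive failures count.
            self.failures = 0;
            return true;
        }
        self.failures += 1;
        if self.failures >= MAX_CONSECUTIVE_FAILURES {
            self.dead.store(true, Ordering::Release);
            false
        } else {
            true
        }
    }
}
```

In the pinger loop this would replace the discarded result, roughly `if !watch.record(result.is_ok()) { break; }`, with the session loop checking the flag via `dead.load(Ordering::Acquire)`.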
#### 16. Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU
*Area:* `win:virtual-display-sudovda` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
@@ -1885,8 +1885,8 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
*Area:* `win:nvenc-d3d11` · *Windows-host:* yes · *Severity:* high · *Effort:* large
- **Apollo does:** Apollo keeps a deep DPB (maxNumRefFrames 5/HEVC, 8/AV1) but pins L0 ref to 1 (nvenc_base.cpp:268-281), then on a loss event calls nvEncInvalidateRefFrames per-frame over the requested range, dedups against the last range, expands to the last-encoded index, escalates to IDR only if the range exceeds DPB depth, and tags the next frame rfi_needs_confirmation (nvenc_base.cpp:574-610). This lets the encoder re-reference an older still-valid frame rather than emit a multi-millisecond keyframe.
- **punktfunk gap:** punktfunk has NO invalidate path — request_keyframe() always forces a full IDR (nvenc.rs:437-442,465-467); m3.rs:2153 / gamestream/stream.rs:336 wire 'RFI' straight to a keyframe. Every recovery is a costly IDR spike, defeating the infinite-GOP design.
- **Proposal:** In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire m3.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
- **punktfunk gap:** punktfunk has NO invalidate path — request_keyframe() always forces a full IDR (nvenc.rs:437-442,465-467); punktfunk1.rs:2153 / gamestream/stream.rs:336 wire 'RFI' straight to a keyframe. Every recovery is a costly IDR spike, defeating the infinite-GOP design.
- **Proposal:** In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire punktfunk1.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
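
The dedup / expand / escalate decision described above can be separated from the NVENC calls and tested on its own. The sketch below is an assumption-laden illustration: `RfiState`, `RfiPlan`, and `plan` are hypothetical names, the boundary at which a range counts as exceeding the DPB depth is a guess rather than Apollo's exact comparison, and the actual per-index invalidation call is left to the caller.

```rust
/// Outcome of one RFI request (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum RfiPlan {
    /// The range is covered by the previous invalidation; nothing to do.
    AlreadyInvalidated,
    /// Invalidate these frame indices (inclusive), one call per index.
    Invalidate { first: u64, last: u64 },
    /// Malformed range or too deep for the DPB; fall back to an IDR.
    ForceIdr,
}

/// Hypothetical bookkeeping the encoder would keep between frames.
pub struct RfiState {
    pub dpb_depth: u64,
    pub last_encoded_frame_index: u64,
    pub last_rfi_range: Option<(u64, u64)>,
}

impl RfiState {
    pub fn plan(&mut self, first: u64, last: u64) -> RfiPlan {
        if last < first {
            return RfiPlan::ForceIdr;
        }
        // Dedup: a repeat request inside the last invalidated range is a no-op.
        if let Some((prev_first, prev_last)) = self.last_rfi_range {
            if first >= prev_first && last <= prev_last {
                return RfiPlan::AlreadyInvalidated;
            }
        }
        // Expand: frames encoded after the loss may reference the lost ones.
        let last = last.max(self.last_encoded_frame_index);
        // Escalate: more frames than the DPB holds leaves no valid reference.
        if last - first >= self.dpb_depth {
            return RfiPlan::ForceIdr;
        }
        self.last_rfi_range = Some((first, last));
        RfiPlan::Invalidate { first, last }
    }
}
```

A caller would map `ForceIdr` onto the existing `request_keyframe()` path, so the IDR fallback stays available whenever invalidation is not possible.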
#### 23. Add a DS4 (DualShock4) ViGEm target on Windows with type auto-selection, motion, touchpad, battery and timestamp pump
*Area:* `win:input-sendinput-vigem` · *Windows-host:* yes · *Severity:* high · *Effort:* large
@@ -1906,10 +1906,10 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
*Area:* `cmp:protocol-streaming` · *Windows-host:* yes · *Severity:* medium · *Effort:* small · **✓ verified**
- **Apollo does:** Apollo raises the transmit/capture thread priority: platf::adjust_thread_priority(thread_priority_e::critical) in the video broadcast thread (stream.cpp:1122) and ::high in the audio/control paths (stream.cpp:1333, 1672); the Windows impl is SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST/ABOVE_NORMAL) (platform/windows/misc.cpp:1081-1102).
- **punktfunk gap:** punktfunk names its hot-path threads (stream.rs:44 video, stream.rs:204 send, m3.rs:1804 send_loop, m3.rs:2017/2328 send threads) but never sets a scheduling priority — every host capture/encode/send thread runs at default priority. Only the macOS client elevates (client.rs:169). On a loaded Windows desktop the encode/send thread can be preempted, adding jitter the frame-pacing logic can't recover.
- **punktfunk gap:** punktfunk names its hot-path threads (stream.rs:44 video, stream.rs:204 send, punktfunk1.rs:1804 send_loop, punktfunk1.rs:2017/2328 send threads) but never sets a scheduling priority — every host capture/encode/send thread runs at default priority. Only the macOS client elevates (client.rs:169). On a loaded Windows desktop the encode/send thread can be preempted, adding jitter the frame-pacing logic can't recover.
- **Proposal:** Add a cross-platform raise_current_thread_priority() helper (SetThreadPriority on Windows, optionally AvSetMmThreadCharacteristics for MMCSS; sched/nice on Linux) and call it at the top of the GameStream send thread, the native send_loop, and the encode thread. Cheap, high-value jitter reduction, no design impact.
- **Verify verdict:** `confirmed_gap` — punktfunk: NO thread-priority call exists anywhere in the workspace (grep for SetThreadPriority/sched_setscheduler/setpriority/AvSetMm/THREAD_PRIORITY across crates/ returned zero hits). Hot-path threads are named-only at default priority: GameStream video thread crates/punktfunk-host/src/gamestream/stream.rs:44-53 (thread::Builder name "punktfunk-video") and GameStream send thread stream.rs:204-206 ("punktfunk-send"); native send threads crates/punktfunk-host/src/m3.rs:2017-2033 and m3.rs:2328-2333 ("punktfunk-send"), and the native send_loop at m3.rs:1804 — all spawned with no priority set. The encode work shares the capture thread (m3.rs:2011-2013 "this thread captures+encodes ... and hands each AU to a dedicated send thread"), also default priority. The windows crate is ALREADY a dependency with the needed feature: crates/punktfunk-host/Cargo.toml:141 enables "Win32_System_Threading" (SetThreadPriority/GetCurrentThread available, zero new deps). Apollo: confirmed it raises priority on every hot-path thread — capture src/video.cpp:1295 (critical), encode src/video.cpp:2359 and 2396 (high), video send src/stream.cpp:1333 (high), control src/stream.cpp:1122 (critical), audio src/stream.cpp:1672 + src/audio.cpp:94/208. Windows impl is SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST/ABOVE_NORMAL) at src/platform/windows/misc.cpp:1081-1102, plus DwmEnableMMCSS(true) (misc.cpp:1139) and AvSetMmThreadCharacteristics("Pro Audio") for the audio-capture thread (src/platform/windows/audio.cpp:540). CRITICAL NUANCE: Apollo's adjust_thread_priority is effectively Windows-only — src/platform/linux/misc.cpp:362-364 is "// Unimplemented" and src/platform/macos/misc.mm:218-220 is "// Unimplemented".
- **Refined:** Add a small cross-platform helper raise_current_thread_priority(level) and call it at the TOP of each hot-path thread body (so the calling thread itself is elevated): the GameStream send thread (stream.rs:206), the GameStream video/capture+encode thread (stream.rs:46), the native send threads (m3.rs:2021 and m3.rs:2331 closures, before/at the start of send_loop), and the native capture+encode thread (the m3.rs run body that owns capture+encode, m3.rs ~2011+). Windows: SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST) for the send/network thread (latency-critical, matches Apollo's video-send=high but the punktfunk send thread also does FEC+seal so HIGHEST is defensible) and THREAD_PRIORITY_ABOVE_NORMAL for capture+encode — using the windows crate already on Cargo.toml:141, no new deps. Optionally associate the network/encode thread with MMCSS via AvSetMmThreadCharacteristics (needs the Win32_System_Threading "Games"/"Pro Audio" task + AVRT feature) for higher-fidelity scheduling under DWM load; treat as a follow-up, not the first cut. Linux (net-new beyond Apollo, since Apollo leaves it unimplemented and punktfunk is Linux-first): best-effort nice(-10)/setpriority on the send+encode threads — note SCHED_FIFO/RR requires CAP_SYS_NICE/rtprio limits the host won't have by default, so do NOT default to realtime; a plain niceness bump is the safe portable choice and silently no-ops without privilege. Make every priority call best-effort (log-and-continue on failure, exactly as Apollo does at misc.cpp:1104). No async, no per-frame allocation, no ABI surface change — purely thread-setup, so no design invariant is touched.
- **Verify verdict:** `confirmed_gap` — punktfunk: NO thread-priority call exists anywhere in the workspace (grep for SetThreadPriority/sched_setscheduler/setpriority/AvSetMm/THREAD_PRIORITY across crates/ returned zero hits). Hot-path threads are named-only at default priority: GameStream video thread crates/punktfunk-host/src/gamestream/stream.rs:44-53 (thread::Builder name "punktfunk-video") and GameStream send thread stream.rs:204-206 ("punktfunk-send"); native send threads crates/punktfunk-host/src/punktfunk1.rs:2017-2033 and punktfunk1.rs:2328-2333 ("punktfunk-send"), and the native send_loop at punktfunk1.rs:1804 — all spawned with no priority set. The encode work shares the capture thread (punktfunk1.rs:2011-2013 "this thread captures+encodes ... and hands each AU to a dedicated send thread"), also default priority. The windows crate is ALREADY a dependency with the needed feature: crates/punktfunk-host/Cargo.toml:141 enables "Win32_System_Threading" (SetThreadPriority/GetCurrentThread available, zero new deps). Apollo: confirmed it raises priority on every hot-path thread — capture src/video.cpp:1295 (critical), encode src/video.cpp:2359 and 2396 (high), video send src/stream.cpp:1333 (high), control src/stream.cpp:1122 (critical), audio src/stream.cpp:1672 + src/audio.cpp:94/208. Windows impl is SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST/ABOVE_NORMAL) at src/platform/windows/misc.cpp:1081-1102, plus DwmEnableMMCSS(true) (misc.cpp:1139) and AvSetMmThreadCharacteristics("Pro Audio") for the audio-capture thread (src/platform/windows/audio.cpp:540). CRITICAL NUANCE: Apollo's adjust_thread_priority is effectively Windows-only — src/platform/linux/misc.cpp:362-364 is "// Unimplemented" and src/platform/macos/misc.mm:218-220 is "// Unimplemented".
- **Refined:** Add a small cross-platform helper raise_current_thread_priority(level) and call it at the TOP of each hot-path thread body (so the calling thread itself is elevated): the GameStream send thread (stream.rs:206), the GameStream video/capture+encode thread (stream.rs:46), the native send threads (punktfunk1.rs:2021 and punktfunk1.rs:2331 closures, before/at the start of send_loop), and the native capture+encode thread (the punktfunk1.rs run body that owns capture+encode, punktfunk1.rs ~2011+). Windows: SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST) for the send/network thread (latency-critical, matches Apollo's video-send=high but the punktfunk send thread also does FEC+seal so HIGHEST is defensible) and THREAD_PRIORITY_ABOVE_NORMAL for capture+encode — using the windows crate already on Cargo.toml:141, no new deps. Optionally associate the network/encode thread with MMCSS via AvSetMmThreadCharacteristics (needs the Win32_System_Threading "Games"/"Pro Audio" task + AVRT feature) for higher-fidelity scheduling under DWM load; treat as a follow-up, not the first cut. Linux (net-new beyond Apollo, since Apollo leaves it unimplemented and punktfunk is Linux-first): best-effort nice(-10)/setpriority on the send+encode threads — note SCHED_FIFO/RR requires CAP_SYS_NICE/rtprio limits the host won't have by default, so do NOT default to realtime; a plain niceness bump is the safe portable choice and silently no-ops without privilege. Make every priority call best-effort (log-and-continue on failure, exactly as Apollo does at misc.cpp:1104). No async, no per-frame allocation, no ABI surface change — purely thread-setup, so no design invariant is touched.
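
A minimal sketch of the helper described above, assuming a two-level `PriorityLevel` enum (hypothetical) and a `bool` return for the log-and-continue behaviour. To stay dependency-free it declares `SetThreadPriority`/`GetCurrentThread` and `setpriority` through raw FFI; the host itself would presumably use the `windows` crate it already depends on. The nice values (-5/-10) are illustrative, not taken from the codebase.

```rust
/// Hypothetical levels for this sketch: `Highest` for the send/network
/// thread, `AboveNormal` for the capture+encode thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum PriorityLevel {
    AboveNormal,
    Highest,
}

#[cfg(windows)]
mod imp {
    use super::PriorityLevel;
    use std::{ffi::c_void, io};

    // Raw kernel32 declarations, only to keep the sketch self-contained.
    #[link(name = "kernel32")]
    unsafe extern "system" {
        fn GetCurrentThread() -> *mut c_void;
        fn SetThreadPriority(thread: *mut c_void, priority: i32) -> i32;
    }

    pub fn raise(level: PriorityLevel) -> io::Result<()> {
        let prio = match level {
            PriorityLevel::AboveNormal => 1, // THREAD_PRIORITY_ABOVE_NORMAL
            PriorityLevel::Highest => 2,     // THREAD_PRIORITY_HIGHEST
        };
        // SAFETY: GetCurrentThread returns a pseudo-handle for the caller.
        if unsafe { SetThreadPriority(GetCurrentThread(), prio) } != 0 {
            Ok(())
        } else {
            Err(io::Error::last_os_error())
        }
    }
}

#[cfg(target_os = "linux")]
mod imp {
    use super::PriorityLevel;
    use std::ffi::{c_int, c_uint};
    use std::io;

    unsafe extern "C" {
        fn setpriority(which: c_int, who: c_uint, prio: c_int) -> c_int;
    }
    const PRIO_PROCESS: c_int = 0;

    pub fn raise(level: PriorityLevel) -> io::Result<()> {
        // Illustrative nice values; plain niceness only, never SCHED_FIFO/RR.
        let nice = match level {
            PriorityLevel::AboveNormal => -5,
            PriorityLevel::Highest => -10,
        };
        // On Linux the nice value is per-thread, and who = 0 means the caller.
        if unsafe { setpriority(PRIO_PROCESS, 0, nice) } == 0 {
            Ok(())
        } else {
            Err(io::Error::last_os_error())
        }
    }
}

#[cfg(not(any(windows, target_os = "linux")))]
mod imp {
    use super::PriorityLevel;
    use std::io;

    // Other platforms: nothing to do in this sketch.
    pub fn raise(_level: PriorityLevel) -> io::Result<()> {
        Ok(())
    }
}

/// Best-effort: logs and returns `false` on failure, never panics.
/// Call it as the first statement of the thread body so the calling
/// thread itself is the one elevated.
pub fn raise_current_thread_priority(level: PriorityLevel) -> bool {
    match imp::raise(level) {
        Ok(()) => true,
        Err(e) => {
            eprintln!("thread priority {level:?} not applied: {e}");
            false
        }
    }
}
```

On Linux, a negative nice value requested without `CAP_SYS_NICE` or a permissive `RLIMIT_NICE` is rejected with an error rather than applied, so the failure branch is the expected path on an unprivileged host. The helper therefore reports the error and lets the thread continue at default priority.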
#### 43. Socket QoS / DSCP marking on the media sockets
*Area:* `cmp:protocol-streaming` · *Windows-host:* yes · *Severity:* medium · *Effort:* medium · **✓ verified**
@@ -1924,10 +1924,10 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
*Area:* `cmp:protocol-streaming` · *Windows-host:* no · *Severity:* medium · *Effort:* medium · **✓ verified**
- **Apollo does:** Apollo paces each frame's packets at the *negotiated bitrate*: ratecontrol_packets_in_1ms = giga*80/100/1000/blocksize/8 (stream.cpp:1464) and sleeps the send loop to that per-millisecond budget across the frame (stream.cpp:1578-1627), so the sender shapes to the link's allotted rate, not just the frame deadline.
- **punktfunk gap:** Both punktfunk send pacers spread purely over the FRAME INTERVAL: the GameStream sender uses budget = frame_interval * 0.75 (stream.rs:209) and the native paced_submit uses budget to next frame's deadline * 0.9 (m3.rs:1752) — neither derives a packets-per-ms budget from cfg.bitrate_kbps (the bitrate is only used to open NVENC, stream.rs:275). A spiky IDR or VBR overshoot can still microburst above the negotiated rate within its frame window.
- **punktfunk gap:** Both punktfunk send pacers spread purely over the FRAME INTERVAL: the GameStream sender uses budget = frame_interval * 0.75 (stream.rs:209) and the native paced_submit uses budget to next frame's deadline * 0.9 (punktfunk1.rs:1752) — neither derives a packets-per-ms budget from cfg.bitrate_kbps (the bitrate is only used to open NVENC, stream.rs:275). A spiky IDR or VBR overshoot can still microburst above the negotiated rate within its frame window.
- **Proposal:** Compute a bitrate-derived per-millisecond send budget (like Apollo's ratecontrol_packets_in_1ms) from the negotiated bitrate and pace overflow to THAT rate inside paced_submit / spawn_sender, taking the min of the frame-interval budget and the bitrate budget. Smooths VBR bursts on rate-limited links without breaking the existing microburst fast-path.
- **Verify verdict:** `partial` — PUNKTFUNK gap is real: both pacers spread over the FRAME INTERVAL only, never the bitrate. GameStream sender: `let budget = frame_interval.mul_f32(0.75)` (crates/punktfunk-host/src/gamestream/stream.rs:209). Native paced_submit: `let budget = deadline.checked_duration_since(pace_start)...mul_f32(0.9)` (crates/punktfunk-host/src/m3.rs:1752-1755) where deadline = `next += interval` (m3.rs:2162) and `interval = Duration::from_secs_f64(1.0 / effective_hz...)` (m3.rs:2357). bitrate_kbps only configures NVENC (stream.rs:275; m3.rs:2306, 2694) and is never fed to the pacer. So far the gap claim holds. BUT the Apollo characterization in the proposal is FACTUALLY WRONG: Apollo's `size_t ratecontrol_packets_in_1ms = std::giga::num * 80 / 100 / 1000 / blocksize / 8;` (/home/enricobuehler/Apollo/src/stream.cpp:1464) is a HARDCODED 80% of 1 Gigabit/sec — a fixed constant. grep across stream.cpp shows the negotiated/session bitrate never enters this formula (only std::giga::num, blocksize, and the 80/100 constant appear at lines 1464/1578-1582/1625-1627). Apollo paces to a FIXED ~800 Mbps link ceiling regardless of negotiated bitrate; it is NOT "negotiated-bitrate pacing." punktfunk's own design notes deliberately reject clamping to negotiated bitrate: "The encoder is pixel-rate bound, not bitrate bound" (m3.rs:321) and the whole 1Gbps+ effort raised the ceiling (m3.rs:1617-1619, MAX_BITRATE_KBPS ~2 Gbps).
- **Refined:** Reject the proposal AS WRITTEN — its premise ("Apollo paces to the negotiated bitrate") is false; Apollo paces to a hardcoded 80%-of-1Gbps fixed link ceiling (stream.cpp:1464), and pacing to negotiated bitrate would actively regress punktfunk (VBR/IDR spikes legitimately exceed average bitrate, and punktfunk explicitly treats the encoder as pixel-rate-bound, not bitrate-bound — m3.rs:321). If anything is worth porting, it is the FIXED per-millisecond link-rate ceiling concept, not bitrate-derived pacing: optionally compute a fixed packets-per-ms budget from a configurable link-rate ceiling (default high, e.g. matching MAX_BITRATE_KBPS, env-overridable like PUNKTFUNK_PACE_BURST_KB) and take min(frame-interval budget, link-ceiling budget) inside paced_submit/spawn_sender — purely as a microburst smoother for rate-limited links, NOT tied to cfg.bitrate_kbps. Note punktfunk already has the microburst fast-path (burst_cap, m3.rs:2005-2009 / paced_submit:1734-1743) and frame-interval spreading, which together already address the "spiky IDR microburst" symptom the proposal cites. Recommend deferring unless a measured rate-limited-link regression appears; the current frame-interval + burst-cap pacing covers the cited risk.
- **Verify verdict:** `partial` — PUNKTFUNK gap is real: both pacers spread over the FRAME INTERVAL only, never the bitrate. GameStream sender: `let budget = frame_interval.mul_f32(0.75)` (crates/punktfunk-host/src/gamestream/stream.rs:209). Native paced_submit: `let budget = deadline.checked_duration_since(pace_start)...mul_f32(0.9)` (crates/punktfunk-host/src/punktfunk1.rs:1752-1755) where deadline = `next += interval` (punktfunk1.rs:2162) and `interval = Duration::from_secs_f64(1.0 / effective_hz...)` (punktfunk1.rs:2357). bitrate_kbps only configures NVENC (stream.rs:275; punktfunk1.rs:2306, 2694) and is never fed to the pacer. So far the gap claim holds. BUT the Apollo characterization in the proposal is FACTUALLY WRONG: Apollo's `size_t ratecontrol_packets_in_1ms = std::giga::num * 80 / 100 / 1000 / blocksize / 8;` (/home/enricobuehler/Apollo/src/stream.cpp:1464) is a HARDCODED 80% of 1 Gigabit/sec — a fixed constant. grep across stream.cpp shows the negotiated/session bitrate never enters this formula (only std::giga::num, blocksize, and the 80/100 constant appear at lines 1464/1578-1582/1625-1627). Apollo paces to a FIXED ~800 Mbps link ceiling regardless of negotiated bitrate; it is NOT "negotiated-bitrate pacing." punktfunk's own design notes deliberately reject clamping to negotiated bitrate: "The encoder is pixel-rate bound, not bitrate bound" (punktfunk1.rs:321) and the whole 1Gbps+ effort raised the ceiling (punktfunk1.rs:1617-1619, MAX_BITRATE_KBPS ~2 Gbps).
- **Refined:** Reject the proposal AS WRITTEN — its premise ("Apollo paces to the negotiated bitrate") is false; Apollo paces to a hardcoded 80%-of-1Gbps fixed link ceiling (stream.cpp:1464), and pacing to negotiated bitrate would actively regress punktfunk (VBR/IDR spikes legitimately exceed average bitrate, and punktfunk explicitly treats the encoder as pixel-rate-bound, not bitrate-bound — punktfunk1.rs:321). If anything is worth porting, it is the FIXED per-millisecond link-rate ceiling concept, not bitrate-derived pacing: optionally compute a fixed packets-per-ms budget from a configurable link-rate ceiling (default high, e.g. matching MAX_BITRATE_KBPS, env-overridable like PUNKTFUNK_PACE_BURST_KB) and take min(frame-interval budget, link-ceiling budget) inside paced_submit/spawn_sender — purely as a microburst smoother for rate-limited links, NOT tied to cfg.bitrate_kbps. Note punktfunk already has the microburst fast-path (burst_cap, punktfunk1.rs:2005-2009 / paced_submit:1734-1743) and frame-interval spreading, which together already address the "spiky IDR microburst" symptom the proposal cites. Recommend deferring unless a measured rate-limited-link regression appears; the current frame-interval + burst-cap pacing covers the cited risk.
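
The arithmetic behind the refined suggestion can be sketched as follows. The first function reproduces the integer formula quoted from Apollo. The other two are hypothetical helpers, not punktfunk code, and they read "min(frame-interval budget, link-ceiling budget)" as a minimum over packets-per-millisecond rates, which is an interpretation. The 1392-byte block size used in the note below is likewise only an illustrative value.

```rust
use std::time::Duration;

/// Fixed link ceiling in packets per millisecond, using the integer
/// arithmetic quoted from Apollo: giga * 80 / 100 / 1000 / blocksize / 8.
/// The session bitrate does not appear anywhere in the formula.
fn ceiling_packets_per_ms(link_bits_per_sec: u64, block_size: u64) -> u64 {
    link_bits_per_sec * 80 / 100 / 1000 / block_size / 8
}

/// Packets per millisecond needed to spread one frame's packets evenly
/// over its pacing window (rounded up, window clamped to at least 1 ms).
fn frame_packets_per_ms(packets: u64, window: Duration) -> u64 {
    let ms = (window.as_millis() as u64).max(1);
    (packets + ms - 1) / ms
}

/// Hypothetical combined budget: the slower of the frame-interval rate and
/// the fixed link ceiling, never below one packet per millisecond.
fn paced_packets_per_ms(packets: u64, window: Duration, ceiling: u64) -> u64 {
    frame_packets_per_ms(packets, window).min(ceiling).max(1)
}
```

With a 1 Gbit/s link and 1392-byte blocks the ceiling works out to 71 packets per millisecond. Over a 12 ms window, a 120-packet frame is paced at 10 packets per millisecond and stays under the ceiling, while a 2000-packet IDR would need 167 and is clamped to 71, so it overruns the frame window instead of bursting.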
#### 94. Consume the GameStream client loss-stats report
*Area:* `cmp:protocol-streaming` · *Windows-host:* no · *Severity:* low · *Effort:* small · **✓ verified**
@@ -1,4 +1,4 @@
# M2 — P1 host: stream to a stock Moonlight client
# GameStream host: stream to a stock Moonlight client
The shippable milestone (plan §8). A stock Moonlight/Artemis client discovers this host,
pairs, launches, and gets video (then input, then audio) on a client-sized virtual display.
@@ -68,13 +68,13 @@ Ground-truth protocol reference: [`research/gamestream-protocol-research.json`](
handshake, negotiate `Config`, create a wlroots virtual output sized to the client.
*Acceptance: Moonlight completes RTSP and the host stands up the UDP streams.*
- **P1.3 — Video (punktfunk-core P1 codec), plaintext, clean-LAN.** RTP+NV framing + FEC shard
layout in punktfunk-core; wire M0's NVENC AUs → UDP 47998. *Acceptance: Moonlight DISPLAYS video.*
layout in punktfunk-core; wire the spike's NVENC AUs → UDP 47998. *Acceptance: Moonlight DISPLAYS video.*
- **P1.4 — Control + input.** ENet (`rusty_enet`) control stream; decode input → `inject.rs`
(uinput/reis); request-IDR → force NVENC keyframe. *Acceptance: mouse/keyboard work.*
- **P1.5 — Robustness: FEC recovery + encryption.** nanors-exact FEC; per-shard AES-GCM.
*Acceptance: stable under `tc netem` loss; encrypted streams.*
- **P1.6 — Audio + polish.** Opus + audio RTP/FEC/CBC (UDP 47999); disconnect teardown; KWin
backend for the user's KDE box. *Acceptance: full game stream with sound — the M2 goal.*
backend for the user's KDE box. *Acceptance: full game stream with sound — the GameStream-host goal.*
## Crates (verified available)
+7 -7
@@ -1,7 +1,7 @@
# Linux host setup — NVIDIA GPU VM (M0/M2)
# Linux host setup — NVIDIA GPU VM (pipeline spike + GameStream host)
How to bring up the build environment for the punktfunk Linux host on an NVIDIA-GPU Ubuntu VM
and run the **M0** capture→encode spike. `punktfunk-core` already builds and is tested
and run the **pipeline spike** (capture→encode). `punktfunk-core` already builds and is tested
cross-platform; this is about the platform backends in `crates/punktfunk-host`.
> Target **Ubuntu 24.04 (noble)**: Sway 1.9, FFmpeg 6.1.1, xdg-desktop-portal 1.18.
@@ -77,7 +77,7 @@ ffprobe /tmp/punktfunk-headless-test.mkv # confirm a real H.265 stream
`wf-recorder` uses `wlr-screencopy` directly (no portal/D-Bus) — the fastest way to
de-risk the GPU encode path. **Note:** screencopy encodes straight to a file and *cannot*
feed PipeWire; the real integration uses the ScreenCast portal (see M0). If shell 1 logged
feed PipeWire; the real integration uses the ScreenCast portal (see the pipeline spike). If shell 1 logged
a Mesa/EGL fallback (or Sway dropped to pixman) instead of `EGL vendor: NVIDIA`, install the
NVIDIA GL userspace (§2) — the portal cannot capture a pixman output.
@@ -89,13 +89,13 @@ The wlroots-on-NVIDIA env workarounds (`WLR_RENDERER=gles2`, `WLR_NO_HARDWARE_CU
`GBM_BACKEND=nvidia-drm`, `sway --unsupported-gpu`, …) live in
`scripts/headless/env.sh`; `source` it before launching anything Wayland.
## 4. M0 proper — wire it into `punktfunk-core`
## 4. The spike proper — wire it into `punktfunk-core`
Goal (plan §8): headless output → PipeWire ScreenCast → NVENC → a playable file, then feed
the encoded access units into a `punktfunk_core::Session` (host role). The module seams exist
in `crates/punktfunk-host/src/{vdisplay,capture,encode,inject,pipeline}.rs`.
**Status: implemented and verified end-to-end** in `crates/punktfunk-host` (`m0.rs`,
**Status: implemented and verified end-to-end** in `crates/punktfunk-host` (`spike.rs`,
`capture/linux.rs`, `encode/linux.rs`). After the §3 bring-up:
```sh
@@ -139,10 +139,10 @@ Crate choices, verified current:
**Start with the CPU-copy fallback** (download frame → `hwupload_cuda` → `hevc_nvenc`)
to get an end-to-end stream, then chase true dmabuf zero-copy. The plan flags this
(§9) and the `capture` module already has a `cpu_bytes` fallback field.
- **Input (M2):** [`reis`](https://crates.io/crates/reis) (pure-Rust libei — no native
- **Input (GameStream host):** [`reis`](https://crates.io/crates/reis) (pure-Rust libei — no native
`libei` needed) with `input-linux`/uinput as the universal fallback.
Then continue toward **M2**: `serverinfo`/RTSP/pairing enough for a stock Moonlight client
Then continue toward the **GameStream host**: `serverinfo`/RTSP/pairing enough for a stock Moonlight client
to connect, a KWin virtual output created on connect, input via reis/uinput — the
shippable milestone.
+8 -8
@@ -7,7 +7,7 @@ the gotchas. Read it top to bottom, then start at **Phase 1** (de-risk Reactor f
## Status — WinUI 3 client landed (2026-06-15)
The client is implemented in `crates/punktfunk-client-windows` (binary `punktfunk-client`) and is
The client is implemented in `clients/windows` (binary `punktfunk-client`) and is
**build + clippy + fmt green on `x86_64-pc-windows-msvc`** (built on the dev VM). It is the **WinUI 3**
client this doc planned: native chrome (host list, settings, in-app SPAKE2 PIN pairing) + the video on
a **`SwapChainPanel`**, all in pure Rust.
@@ -44,9 +44,9 @@ a **`SwapChainPanel`**, all in pure Rust.
## What we're building
A native Windows client that connects to a punktfunk/1 host (`serve --native` / `m3-host`), decodes
A native Windows client that connects to a punktfunk/1 host (`serve --native` / `punktfunk1-host`), decodes
HEVC, presents it low-latency, plays Opus audio, and captures local mouse/keyboard/gamepad to send
back — i.e. the Windows analogue of the **GTK4 Linux client** (`crates/punktfunk-client-linux`),
back — i.e. the Windows analogue of the **GTK4 Linux client** (`clients/linux`),
which is the architectural template. The Windows client is close to a 1:1 port of the Linux client
with the platform layers swapped.
@@ -73,7 +73,7 @@ with the platform layers swapped.
- **Trust = shared client identity + SPAKE2 PIN pairing + TOFU** (port `trust.rs`; same identity
files/logic as the other native clients).
## The reference: `crates/punktfunk-client-linux/src/`
## The reference: `clients/linux/src/`
Port these files (near 1:1; only the platform layers change):
@@ -144,11 +144,11 @@ Windows client should mirror it:
`SwapChainPanel` and presents a cleared D3D11 swapchain into it. Confirm the windows-rs Reactor
version/API (PR #4479) and `ISwapChainPanelNative::SetSwapChain` interop. If Reactor proves too
raw, the fallback is `winit` + a child HWND swapchain, but try Reactor first per the decision.
2. **Crate scaffold.** `crates/punktfunk-client-windows`, `[target.'cfg(windows)'.dependencies]`:
2. **Crate scaffold.** `clients/windows`, `[target.'cfg(windows)'.dependencies]`:
`punktfunk-core { path, features=["quic"] }`, `windows`, the Reactor crate, `ffmpeg-next`, `opus`,
`sdl3`, `mdns-sd`, `anyhow`, `tracing`. Mirror `crates/punktfunk-client-linux/Cargo.toml`.
`sdl3`, `mdns-sd`, `anyhow`, `tracing`. Mirror `clients/linux/Cargo.toml`.
3. **Connect + control plane.** Port `session.rs` + `trust.rs`; validate headless against the 4090
box (`m3-host`/`serve --native`) — handshake, PIN/TOFU, plane counters — before any UI/decode.
box (`punktfunk1-host`/`serve --native`) — handshake, PIN/TOFU, plane counters — before any UI/decode.
4. **Decode + present.** FFmpeg D3D11VA → `SwapChainPanel`. SDR (8-bit BGRA) first, then **P010 +
HDR colorspace** (see the HDR section).
5. **Audio.** WASAPI render + Opus decode (port `audio.rs`).
@@ -158,7 +158,7 @@ Windows client should mirror it:
## Key references
- **Template:** `crates/punktfunk-client-linux/src/*` (the client to port).
- **Template:** `clients/linux/src/*` (the client to port).
- **Apple HDR present** (the pattern to mirror): `clients/apple/Sources/PunktfunkKit/{VideoDecoder,
MetalVideoPresenter,Stage2Pipeline}.swift` — in-band PQ detection, P010 decode, EDR present.
- **Core client API:** `crates/punktfunk-core/src/client.rs` (`NativeClient`).

Some files were not shown because too many files have changed in this diff.