38 Commits

Author SHA1 Message Date
enricobuehler 36259b264f docs(security): record remediation status for the 2026-06-28 host audit
apple / swift (push) Successful in 1m6s
ci / rust (push) Failing after 56s
ci / web (push) Successful in 52s
android / android (push) Successful in 3m24s
ci / docs-site (push) Successful in 1m4s
apple / screenshots (push) Successful in 5m23s
windows-host / package (push) Successful in 7m36s
deb / build-publish (push) Successful in 2m52s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m43s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m59s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m43s
14/18 fixed (3532e35 Linux-verified + 6f903f7 Windows DACL paths pending
CI/box); #5 deferred (needs on-box validation), #9/#13 accepted, S7
acknowledged (no upstream rsa fix).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 22:16:25 +00:00
enricobuehler 6f903f79bc fix(host/security): Windows DACL hardening — close audit #2, #3, #8, #11
Windows local-privilege findings from design/security-review-2026-06-28.md.
These are #[cfg(windows)] paths (verify in CI / on the box; this Linux dev
VM can't compile MSVC). They follow the existing write_secret_file/icacls
patterns; the cross-platform parts are cargo check/clippy/test green.

- #2 [HIGH]: route the mgmt bearer token write through the shared
  write_secret_file so it gets the SAME Windows DACL (SYSTEM/Administrators)
  as the host key — it was cfg(unix)-only and left Users-readable, leaking
  full mgmt admin authority to any local user.
- #3 [HIGH]: create_private_dir now applies a restrictive DACL to the
  %ProgramData%\punktfunk config directory (re-owns to Administrators to
  defeat a pre-creation, strips inheritance, SYSTEM/Admins/OWNER full +
  Users read-only) so a local user can't plant host.env/apps.json that the
  SYSTEM service trusts (env/arg-injection LPE). host.env is now written
  DACL-locked via write_secret_file; the config + logs dirs go through
  create_private_dir.
- #8 [LOW]: write the web-console password file empty, icacls-lock it, THEN
  write the secret — closes the brief write-then-icacls TOCTOU window.
- #11 [LOW]: the SYSTEM logs dir is DACL-locked (Users read-only, no
  create), so a local user can't pre-plant host.log as a reparse/hardlink to
  redirect SYSTEM's writes (subsumed by the #3 dir lockdown).
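
The ordering in #8 (create the file empty, lock it, then write the secret) can be sketched as below. This is a Unix analogue for illustration only: the commit's Windows path uses icacls and write_secret_file, neither of which is shown here, and `write_locked_then_fill` is a hypothetical name.

```rust
use std::fs::{self, OpenOptions};
use std::io::Write;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;

/// Hypothetical helper (not the project's write_secret_file): the secret is
/// never on disk while the file still has its default permissions.
fn write_locked_then_fill(path: &Path, secret: &[u8]) -> std::io::Result<()> {
    // 1. Create (or truncate) the file; it holds no secret content yet.
    let mut file = OpenOptions::new()
        .write(true)
        .create(true)
        .truncate(true)
        .open(path)?;
    // 2. Restrict access while the file is still empty (0600 = owner only).
    fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    // 3. Only now write the secret through the already-open handle.
    file.write_all(secret)?;
    file.sync_all()
}
```

On Unix, passing the mode at creation time is the more usual approach; the three explicit steps are kept here only to mirror the write-empty, lock, write order described in the bullet.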

Deferred: #5 (host<->UMDF gamepad/IDD shared-section Everyone:GENERIC_ALL).
The section SDDL is intentionally permissive because the UMDF driver opens
it under a restricted token of unknown SID/integrity; scoping it blind would
likely break the live-validated gamepad/IDD pipeline, so it needs on-box
validation first. Tracked in the report.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 22:14:19 +00:00
enricobuehler 3532e35b75 fix(host/security): close audit findings S1,#1,#4,#10,#12,#7,#6,S2-S6 (Linux/cross-platform)
Remediations from design/security-review-2026-06-28.md verified on Linux
(cargo check/clippy/test green; Windows-gated paths verify in CI):

- S1 [HIGH]: bump quinn-proto 0.11.14 -> 0.11.15 (RUSTSEC-2026-0185,
  pre-auth out-of-order STREAM reassembly memory exhaustion on the
  always-on default QUIC listener).
- #1 [HIGH]: remove the unauthenticated nvhttp `GET /pin` endpoint; the
  GameStream PIN is delivered ONLY via the bearer-gated mgmt API, so a
  network client can no longer submit its own displayed PIN and self-pair.
- #4 [HIGH->MED]: gate the unauthenticated RTSP/UDP media plane on a paired
  `/launch` and bind it to the launching client's source IP (threaded
  through the HTTPS handler), so an unpaired peer can neither start capture
  on an idle host nor ride a paired client's active launch.
- #12: bound concurrent parked pairing waiters (MAX_PARKED_WAITERS) so a
  pre-auth peer can't pin unbounded 300s handshakes. +regression test.
- #10: throttle the per-packet ENet control GCM-decrypt-failed warn
  (exponential backoff) so a junk flood can't spam the log.
- #7 [MED->LOW]: serialize all process-global env mutation on the
  session-setup path under a new vdisplay::ENV_LOCK (apply_session_env /
  apply_input_env / the launch-cmd set_var / the gamescope env read), so
  concurrent native sessions can't race set_var/getenv (data-race UB ->
  host-wide DoS). Full per-session SessionContext threading remains a
  follow-up for cross-session value confusion.
- #6 [MED]: move the gamescope EIS socket relay from world-writable /tmp to
  $XDG_RUNTIME_DIR (per-user 0700) and reject a symlinked relay file, so a
  local user can't intercept (keylog) or deny the remote session's input.
- S2: a malformed client Opus mic frame now drops that frame instead of
  tearing down the shared host-lifetime virtual mic (cross-session DoS).
- S3: track held buttons/keys in capped HashSets (was unbounded Vec with
  O(n) scans) so a paired client can't grow per-session input state.
- S5: reject fps==0/absurd at the open_video chokepoint (covers Hello,
  ANNOUNCE, Reconfigure) so the encoder time_base/pts math can't div-by-0.
- S6: bound the shared mic mpsc (drop-newest when full).
- S4: cap Epic launcher-cache reads (catcache.bin/.item) so a planted
  oversized file can't OOM the host during library enumeration.
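
Two of the bounds above (#10 and S6) are small enough to sketch. This is a minimal illustration, not the host's code: `WarnThrottle` and `send_drop_newest` are hypothetical names, and logging on power-of-two failure counts is just one assumed way to get exponential backoff.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

/// Hypothetical warn throttle: report the 1st, 2nd, 4th, 8th, ... failure,
/// so a flood of junk packets produces logarithmically many log lines.
#[derive(Default)]
struct WarnThrottle {
    failures: u64,
}

impl WarnThrottle {
    /// Records one failure and returns true when this one should be logged.
    fn should_log(&mut self) -> bool {
        self.failures = self.failures.saturating_add(1);
        self.failures.is_power_of_two()
    }
}

/// Drop-newest send on a bounded channel: never blocks the producer.
/// Returns false when the frame was discarded (queue full or receiver gone).
fn send_drop_newest<T>(tx: &SyncSender<T>, frame: T) -> bool {
    match tx.try_send(frame) {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    }
}
```

With a channel created by `std::sync::mpsc::sync_channel(n)`, at most `n` frames are ever queued; anything arriving while the queue is full is dropped rather than buffered.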

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 22:06:24 +00:00
enricobuehler 6b846913f5 docs(security): 2026-06-28 host security audit (follow-up) report
Multi-agent follow-up audit of the privileged streaming host: 18 attack
surfaces, every finding adversarially double-verified, plus a coverage
critic. Records 15 confirmed + 9 partial findings and a prior-fix
re-verification of the 2026-06-21 review.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 22:05:58 +00:00
enricobuehler 26c6c939a2 fix(ci/apple): set CMAKE_POLICY_VERSION_MINIMUM=3.5 for the vendored libopus
apple / swift (push) Successful in 1m6s
release / apple (push) Successful in 8m50s
ci / rust (push) Successful in 1m17s
ci / web (push) Successful in 52s
apple / screenshots (push) Successful in 5m40s
ci / docs-site (push) Successful in 1m27s
android / android (push) Successful in 3m46s
deb / build-publish (push) Successful in 2m53s
decky / build-publish (push) Successful in 10s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m43s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m0s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m45s
With cmake now found, Homebrew's CMake 4 refuses the vendored libopus's
`cmake_minimum_required(VERSION <3.5)` ("Compatibility with CMake < 3.5 has been
removed"). Export CMAKE_POLICY_VERSION_MINIMUM=3.5 (the same knob the Windows
build uses) so the cmake crate's child cmake configures the audiopus_sys libopus.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:49:27 +00:00
enricobuehler b6e6f2bff5 fix(ci/apple): locate Homebrew explicitly for the cmake install
apple / swift (push) Failing after 31s
release / apple (push) Failing after 8s
apple / screenshots (push) Has been skipped
android / android (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
ci / web (push) Has been cancelled
ci / rust (push) Has been cancelled
deb / build-publish (push) Has been cancelled
decky / build-publish (push) Has been cancelled
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Has been cancelled
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Has been cancelled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Has been cancelled
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Has been cancelled
docker / deploy-docs (push) Has been cancelled
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m30s
The self-hosted macOS runner runs steps with `bash --noprofile --norc`, so
Homebrew's bin dir is not on PATH — the previous `brew install cmake` died with
`brew: command not found` (exit 127). Find brew at its known prefix, install cmake
only if missing, and export the brew bin dir to $GITHUB_PATH so the subsequent
xcframework build (audiopus_sys → vendored libopus) actually finds `cmake`.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:47:05 +00:00
enricobuehler e3034958ee fix(ci): unbreak the Apple + Windows-client builds after the surround-audio merge
apple / swift (push) Failing after 2s
release / apple (push) Failing after 3s
apple / screenshots (push) Has been skipped
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m17s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m15s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 1m2s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m6s
android / android (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
ci / web (push) Has been cancelled
ci / rust (push) Has been cancelled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
decky / build-publish (push) Has been cancelled
deb / build-publish (push) Has been cancelled
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Has been cancelled
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Has been cancelled
docker / deploy-docs (push) Has been cancelled
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Has been cancelled
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
The 5.1/7.1 surround commit (75627c8) added in-core Opus, which broke two CI jobs
that the merge didn't touch:

  * Windows MSIX client: clients/windows/src/main.rs's headless `SessionParams`
    initializer was missing the new `audio_channels` field (the GUI path sets it
    from settings). Default the CLI/test path to stereo (2), matching trust.rs.
  * Apple xcframework (apple.yml + release.yml): in-core Opus decode pulls
    `audiopus_sys`, which builds a vendored *static* libopus via CMake when
    pkg-config finds no system Opus — keeping the xcframework self-contained (no
    runtime libopus.dylib on end-user Macs/devices). The self-hosted macOS runner
    lacked `cmake`; install it self-healing before every xcframework build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:44:44 +00:00
enricobuehler 8672026e97 fix(host): clear clippy doc_lazy_continuation in the 4:4:4 docs
apple / swift (push) Failing after 7s
apple / screenshots (push) Has been skipped
android / android (push) Successful in 3m17s
ci / rust (push) Successful in 1m17s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 58s
windows-host / package (push) Successful in 7m27s
deb / build-publish (push) Successful in 2m54s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m39s
docker / deploy-docs (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
A line-wrap put `+`/`*`-style markers at the start of two doc lines, which
clippy (Windows host job, rust 1.96) reads as markdown list items whose
unindented follow-on lines trip `doc_lazy_continuation` under `-D warnings`:

  - encode/windows/nvenc.rs `chroma_444` field doc (the failing Windows-host
    clippy job): "+ chromaFormatIDC = 3" → "and chromaFormatIDC = 3".
  - encode/linux/vaapi.rs `probe_can_encode_444` doc: "+ validate" → "and
    validate" (last line, didn't fire yet, but fragile — fixed pre-emptively).

Pure doc rewording, no behaviour change.
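
The shape of the problem, as a hedged sketch: `EncoderOptions` and the doc wording are made up for illustration (only the `chroma_444` field name and the "+" to "and" change come from this commit).

```rust
/// Hypothetical options struct, used only to show the doc-comment shape.
pub struct EncoderOptions {
    // Before (the shape that trips the lint): the wrapped line starts with
    // `+`, which Markdown reads as a list item, and the unindented line
    // after it is then a lazy continuation of that item:
    //
    //     /// Request 4:4:4 chroma via the matching profile
    //     /// + chromaFormatIDC = 3 when the GPU
    //     /// supports it.
    //
    // After: rewording removes the accidental list marker.
    /// Request 4:4:4 chroma via the matching profile
    /// and chromaFormatIDC = 3 when the GPU
    /// supports it.
    pub chroma_444: bool,
}
```

Rewording keeps the rendered docs a single paragraph; indenting the continuation line would also satisfy the lint but would render the text as a list.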

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:38:07 +00:00
enricobuehler 75627c8afe feat(audio): end-to-end 5.1/7.1 surround across the native path + all clients
apple / swift (push) Failing after 10s
release / apple (push) Failing after 7s
apple / screenshots (push) Has been skipped
audit / cargo-audit (push) Failing after 1m19s
windows-host / package (push) Failing after 2m44s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Failing after 39s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Failing after 39s
windows / build (aarch64-pc-windows-msvc) (push) Failing after 45s
android / android (push) Successful in 5m17s
windows / build (x86_64-pc-windows-msvc) (push) Failing after 45s
ci / web (push) Successful in 57s
ci / docs-site (push) Successful in 56s
ci / rust (push) Successful in 9m19s
ci / bench (push) Successful in 4m40s
decky / build-publish (push) Successful in 26s
deb / build-publish (push) Successful in 2m57s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 33s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m56s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m35s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m20s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 53s
flatpak / build-publish (push) Successful in 4m22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m50s
Adds negotiated 5.1/7.1 surround to the punktfunk/1 protocol and every client
(previously stereo-only):

- core: new shared `audio` layout table (LAYOUT_51/71 + identity multistream
  mapping, canonical wire order FL FR FC LFE RL RR SL SR); Hello/Welcome
  `audio_channels` negotiation via the trailing-byte back-compat pattern (old
  peers fall back to stereo); C-ABI `punktfunk_connect_ex6`,
  `punktfunk_connection_audio_channels`, and in-core multistream decode
  `punktfunk_connection_next_audio_pcm` for embedders without a multistream
  Opus decoder. Real-libopus channel-identity round-trip test.
- host: native audio thread captures + Opus-(multi)stream-encodes at the
  negotiated count (with a cross-session cached-capturer channel-mismatch fix);
  GameStream surround unified onto the safe `opus::MSEncoder`, dropping
  `audiopus_sys` (~4 unsafe blocks) and un-gating Windows GameStream surround;
  WASAPI loopback capture relaxed to 2/6/8 with the correct dwChannelMask.
- clients: Linux (PipeWire), Windows (WASAPI), Android (AAudio) decode via
  `opus::MSDecoder` + render multichannel; Apple decodes in-core to PCM →
  AVAudioEngine with an explicit wire-order channel layout; each gains a
  Stereo/5.1/7.1 setting. `punktfunk-probe --audio-channels N` is the headless
  validator.
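
The trailing-byte back-compat pattern and the wire order can be sketched as follows. This is an assumption-laden illustration, not the punktfunk/1 wire format: `negotiated_channels`, `layout`, and `fixed_len` are hypothetical, and treating 5.1 as the first six entries of the wire order is assumed rather than stated above.

```rust
/// Canonical wire order quoted in the commit message.
const WIRE_ORDER: [&str; 8] = ["FL", "FR", "FC", "LFE", "RL", "RR", "SL", "SR"];

/// Hypothetical parse. `fixed_len` is the length of the pre-surround message.
/// An old peer sends no trailing byte, which is read as stereo.
fn negotiated_channels(msg: &[u8], fixed_len: usize) -> Option<u8> {
    if msg.len() < fixed_len {
        return None; // truncated message
    }
    match msg.get(fixed_len).copied() {
        None => Some(2),                  // old peer: no trailing byte
        Some(n @ (2 | 6 | 8)) => Some(n), // stereo, 5.1, 7.1
        Some(_) => Some(2),               // unknown count: assumed stereo fallback
    }
}

/// Channel names for a supported count (assumed prefix of the wire order).
fn layout(channels: u8) -> Option<&'static [&'static str]> {
    match channels {
        2 | 6 | 8 => WIRE_ORDER.get(..channels as usize),
        _ => None,
    }
}
```

Appending the new field at the very end means an old parser that stops at `fixed_len` still reads the message unchanged.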

Verified on Linux: core/host/linux/probe test suites + the Android Rust
(cargo-ndk) build, clippy -D warnings, and rustfmt all green. Windows/Apple
builds, all on-glass checks, and the live native loopback are pending (CI / a
free box).

Also lands the concurrent in-tree HEVC 4:4:4 host work (PUNKTFUNK_444): it
shares the same touched files (quic.rs, punktfunk1.rs, encode/*, ...) and so
cannot be committed separately from the surround changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:11:05 +00:00
enricobuehler 6383e5f4fd feat(client/android): CI screenshot capture via Roborazzi
Play-listing/marketing screenshots of the Compose client rendered on the host JVM
by Roborazzi (Robolectric Native Graphics) — no emulator, GPU, KVM, host, or JNI
core. Five scenes render the REAL composables with embedded mock state under a
forced brand palette (Material You has no wallpaper to seed from on the JVM):
hosts grid, settings, TOFU + PIN dialogs, and the live stats HUD. Validated 5/5
locally.

- New JVM unit-test source set (app/src/test) + Roborazzi/Robolectric test deps;
  @Config(sdk=36) is mandatory (no android-all jar for compileSdk 37) and the
  animation clock is paused so a text-bearing scene reaches idle.
- kit: `-PskipRustBuild` skips the cargo-ndk native build so the JVM-only test job
  needs no Rust/NDK; normal APK/AAR builds are unchanged.
- Widen BrandDark / StatsOverlay to internal so the tests can use them.
- Standalone best-effort tag-gated workflow; PNGs upload as a 30-day artifact.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 15:05:54 +00:00
enricobuehler 6a93d164a0 feat(client/linux): CI screenshot capture
Host-free UI screenshots of the GTK4/libadwaita client under a virtual X display
(clients/linux/tools/screenshots.sh) — Xvfb + software GL (llvmpipe) + a root-window
grab, one app launch per scene. PUNKTFUNK_SHOT_SCENE routes build_ui to render one
mock-populated REAL view (hosts grid / settings dialog / TOFU + PIN dialogs) and
print PF_SHOT_READY once it has settled; the saved-hosts grid is driven by a seeded
client-known-hosts.json. NON_UNIQUE in shot mode so back-to-back launches don't
collide. The stream scene is deferred — its page needs a live NativeClient.

Gated to stable release tags in a standalone best-effort workflow that builds the
client in the rust-ci image and captures under Xvfb; PNGs upload as a 30-day
artifact, not committed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 15:05:38 +00:00
enricobuehler 9e98618e5f feat(web): CI screenshot capture for the mgmt console
Marketing/store screenshots of the console, captured from the built Storybook
with headless Chromium (web/tools/screenshots.mjs) — every Pages/* + Shell/*
story rendered at 1440x900@2x. The page stories render from fixtures, so no live
mgmt API, login, or GPU is needed (the web analogue of apple.yml's screenshots
job). Gated to stable release tags in a standalone best-effort workflow; PNGs
upload as a 30-day artifact, not committed.

- Add Stats + Pairing stories (the two pages that lacked them) with stats/pairing
  fixtures typed against the generated models.
- Extract a pure PairingView (index.tsx -> view.tsx), matching the
  Dashboard/Clients/Stats split, so the page renders host-free from mock state
  instead of racing its polling queries. Container wiring is behaviour-identical.
- Playwright driver + a chromium-capable tag-gated job.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 15:05:27 +00:00
enricobuehler 1bd60ffb34 refactor(docs): use shared @unom/app-ui/footer component
apple / swift (push) Successful in 1m2s
android / android (push) Successful in 4m23s
deb / build-publish (push) Successful in 2m30s
decky / build-publish (push) Successful in 13s
ci / rust (push) Successful in 4m47s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 58s
apple / screenshots (push) Successful in 5m16s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 53s
ci / bench (push) Successful in 4m39s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m29s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m23s
docker / deploy-docs (push) Successful in 18s
The docs footer was a hand-maintained mirror of the marketing site's. Both now
render the same @unom/app-ui/footer component, so they stay in sync. The shared
view themes itself through @unom/style tokens (which the docs already map onto
their Fumadocs surfaces), and a resolveHref hook rebases root-relative links
onto the marketing-site origin. Footer types now come from the library too.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 14:34:45 +00:00
enricobuehler 30d0d36efe feat(decky): self-update without the store + Gaming-Mode launch polish, and ship the Steam Deck docs
apple / swift (push) Successful in 1m4s
apple / screenshots (push) Successful in 5m26s
android / android (push) Successful in 3m27s
ci / web (push) Successful in 1m7s
ci / docs-site (push) Successful in 1m16s
ci / rust (push) Successful in 4m21s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 20s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 8s
ci / bench (push) Successful in 4m46s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 11s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 10s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 1m0s
flatpak / build-publish (push) Successful in 4m55s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m38s
docker / deploy-docs (push) Successful in 6s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m25s
Plugin self-update (no Decky store): CI publishes a per-channel manifest.json
({version, immutable per-version artifact, sha256}) beside the zip and bakes
update.json {channel, manifest} into the plugin. main.py `check_update` reads the
installed version from package.json (the value Decky reports — not plugin.json),
fetches the channel manifest, and the frontend shows an "Update to vX" button that
drives Decky Loader's own install RPC (root downloads + SHA-256-verifies + hot-reloads).
CI now stamps a plain-numeric semver (0.3.<run> canary / X.Y.Z stable) into
package.json — a -ciN suffix would mis-order under compare-versions.
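
Why a plain-numeric version matters can be seen in a small sketch. The plugin itself compares versions with compare-versions in the frontend; the Rust below is only an illustration, and `parse_version` / `update_available` are hypothetical names.

```rust
/// Parses a plain-numeric `X.Y.Z`; anything else (for example a `-ciN`
/// suffix) is rejected instead of being ordered by guesswork.
fn parse_version(s: &str) -> Option<(u64, u64, u64)> {
    let mut parts = s.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None; // more than three components
    }
    Some((major, minor, patch))
}

/// True when `offered` is strictly newer than `installed`.
/// Returns None if either string is not plain numeric semver.
fn update_available(installed: &str, offered: &str) -> Option<bool> {
    Some(parse_version(offered)? > parse_version(installed)?)
}
```

Tuples compare component by component as numbers, so 0.3.10 sorts after 0.3.9, which a plain string comparison would get wrong.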

Linux client: `--fullscreen` (plus SteamDeck/gamescope env fallback) enters GTK
fullscreen on stream start so Gaming-Mode chrome is hidden; native-mode resolution
falls back to the display's first monitor when the window isn't mapped yet (was
dropping to the 1080p floor — wrong on the Deck's 1280×800); add a confirmed
"Remove saved host" action (KnownHosts::remove_by_fp).

Docs: new docs/steam-deck.md (Decky install/pair/stream/self-update/troubleshooting),
wired into meta.json nav, and cross-linked from clients/install-client/channels. This
is the page docs.punktfunk.unom.io/docs/steam-deck — the website's download link
pointed at it before it existed; committing it makes that link resolve.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 13:03:44 +00:00
enricobuehler 3947d5b07a fix(host/audio): drive the Linux virtual mic with RT_PROCESS (was silent)
apple / swift (push) Successful in 1m1s
ci / rust (push) Successful in 4m36s
ci / web (push) Successful in 48s
ci / docs-site (push) Successful in 55s
apple / screenshots (push) Successful in 5m9s
ci / bench (push) Successful in 4m36s
windows-host / package (push) Successful in 7m8s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m19s
release / apple (push) Successful in 9m52s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m16s
android / android (push) Successful in 3m21s
decky / build-publish (push) Successful in 11s
deb / build-publish (push) Successful in 2m45s
flatpak / build-publish (push) Successful in 4m11s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m48s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m35s
docker / deploy-docs (push) Successful in 18s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 8s
The punktfunk-mic PipeWire source connected without RT_PROCESS, so it ran as an
async/main-loop node. In the host's busy multi-stream graph (desktop audio + video
capture + the session) it never acquired a driver, stayed suspended, and its
process() callback never fired — every recorder reading the remote mic heard pure
silence (the long-standing "Linux host mic broken"). Connect the mic stream with
RT_PROCESS so it is a synchronous node that joins its consumer's driver group and
is actually driven.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 12:46:06 +00:00
enricobuehler 238501597e feat(host/gamestream): follow Desktop<->Game session switches
android / android (push) Successful in 4m49s
ci / web (push) Successful in 55s
apple / swift (push) Successful in 59s
ci / rust (push) Successful in 4m52s
ci / docs-site (push) Successful in 56s
apple / screenshots (push) Successful in 5m16s
windows-host / package (push) Successful in 7m1s
deb / build-publish (push) Successful in 2m30s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 42s
ci / bench (push) Successful in 4m41s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m7s
docker / deploy-docs (push) Successful in 19s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m59s
The GameStream/Moonlight video plane is a separate encode loop that lacked the
session-following the native punktfunk/1 plane has, so a mid-stream Desktop<->Game
switch killed the stream ("video stream failed") instead of following it.

* Normalize the session env like the native plane: extract open_gs_virtual_source,
  which detects the LIVE compositor + apply_session_env/apply_input_env (gamescope
  ATTACH default -> resize-on-attach to the box's own game-mode session at the
  client mode; KWin/Mutter retargeting). GameStream previously ran a bare detect()
  against raw process env, so in game mode it bare-spawned a COMPETING gamescope
  instead of attaching to the box's session.

* In-place capture-loss rebuild: replace the `?` that ended the stream with a
  bounded rebuild (re-detect the live compositor via the same factory, build the
  new source BEFORE dropping the old, reopen the encoder, force an IDR) — keeping
  the send thread + packetizer + socket + RTP clock. A same-resolution
  Desktop<->Game toggle is now FOLLOWED with no Moonlight reconnect.

Protocol limit (unchanged): a mid-stream RESOLUTION change is impossible on
GameStream (WxH locked at ANNOUNCE; no Reconfigure) — a session toggle keeps the
negotiated mode, so this isn't hit. The portal/synthetic source passes no rebuild
closure (propagates as before).
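
The in-place capture-loss rebuild described above follows a bounded-retry shape that can be sketched generically. `Source`, `rebuild_bounded`, and the `String` error type are hypothetical stand-ins, and the encoder reopen and forced IDR are left out.

```rust
/// Hypothetical stand-in for a capture source.
struct Source {
    generation: u32,
}

/// Tries `build` up to `max_attempts` times. The replacement is fully built
/// before the old source is dropped, so a failed attempt leaves `current`
/// untouched. Returns the attempt number that succeeded, or the last error.
fn rebuild_bounded<F>(current: &mut Source, max_attempts: u32, mut build: F) -> Result<u32, String>
where
    F: FnMut() -> Result<Source, String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 1..=max_attempts {
        match build() {
            Ok(new_source) => {
                // The old source is dropped only here, by the assignment.
                *current = new_source;
                return Ok(attempt);
            }
            Err(e) => last_err = e,
        }
    }
    Err(last_err)
}
```

Bounding the loop keeps a permanently lost compositor from stalling the stream forever: once the attempts run out, the error propagates as it did before.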

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 12:22:12 +00:00
enricobuehler 04dd3e3a19 docs: refresh Windows host page for new users; drop stale Status/NVIDIA-only/SudoVDA
Rewrite the Windows host docs page for first-time setup, on par with the
other host guides: remove the standout "Status:" banner, restructure into
Requirements / Install (web console + pairing + configure) / How it works /
Notes & limits.

Bring the content up to date with the shipping host:
- encode is all-vendor (NVENC/AMF/QSV + software fallback), not NVIDIA-only
- virtual display is punktfunk's own pf-vdisplay IDD (SudoVDA removed)
- gamepads need no prerequisite — UMDF drivers bundled; ViGEmBus is gone
- add HDR10 + Vulkan-game HDR layer coverage

Fix the same stale claims where other pages cross-reference the Windows host
(requirements, running-as-a-service, install, roadmap, status).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 11:22:50 +00:00
enricobuehler 61aa1053e7 feat(host/gamescope): headless game mode that follows the box + matches the client
apple / swift (push) Successful in 1m2s
android / android (push) Successful in 4m43s
ci / rust (push) Successful in 4m53s
ci / web (push) Successful in 54s
ci / docs-site (push) Successful in 57s
apple / screenshots (push) Successful in 5m6s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
windows-host / package (push) Successful in 9m2s
ci / bench (push) Successful in 4m41s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m6s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m43s
Make Steam game mode work on a display-less streaming host and stream it at the
client's resolution:

* Ship /etc/gamescope-session-plus/sessions.d/steam (packaging/bazzite/
  gamescope-headless-session, installed by the RPM + Arch PKGBUILD): fall back to
  gamescope's headless backend when no display is connected, so "Switch to Game
  Mode" boots offscreen instead of crashing on the missing panel (and 5-striking
  back to desktop). No-op on display-attached boxes; only sets unset values so
  the host's per-client mode still wins.

* Default Bazzite/SteamOS to ATTACH (PUNKTFUNK_GAMESCOPE_ATTACH=1 in host.env):
  the box owns its session (Desktop<->Game, persistent), the host follows +
  captures it and never tears it down — so switching is rock-solid and a
  disconnect leaves the box in its mode (reconnect returns there).

* Resize-on-attach (gamescope.rs): on connect, ensure the box's own game-mode
  session runs at the CLIENT's resolution — reuse it when already matching (fast
  path, no restart), else reconfigure + restart the box's own autologin
  gamescope-session-plus@<client> at the client mode (cooperative: no competing
  unit, so no autologin-respawn fight). Detect the live gamescope's -W/-H via
  argv[0] in /proc (its /proc/<pid>/exe is unreadable for that process).
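
The reuse-or-restart decision above can be sketched as follows. This is a minimal illustration, not the code in gamescope.rs: `gamescope_mode` and `needs_restart` are hypothetical names, and the input is assumed to be the raw bytes of /proc/<pid>/cmdline (arguments separated by NUL bytes) for a process already identified as gamescope.

```rust
/// Parse the output size from a gamescope command line as read from
/// /proc/<pid>/cmdline, where arguments are separated by NUL bytes.
/// Returns None when -W or -H is missing or its value is not a number.
/// (Hypothetical helper for illustration only.)
fn gamescope_mode(cmdline: &[u8]) -> Option<(u32, u32)> {
    let args: Vec<&str> = cmdline
        .split(|b| *b == 0)
        .filter_map(|a| std::str::from_utf8(a).ok())
        .collect();
    // Look up the argument that directly follows `flag`.
    let value_after = |flag: &str| -> Option<u32> {
        let i = args.iter().position(|a| *a == flag)?;
        args.get(i + 1)?.parse().ok()
    };
    Some((value_after("-W")?, value_after("-H")?))
}

/// Fast path: keep the running session when it already matches the client.
/// An unknown live mode is treated as a mismatch.
fn needs_restart(live: Option<(u32, u32)>, client: (u32, u32)) -> bool {
    live != Some(client)
}
```

With this shape, a 5120x1440 client connecting to a session started with `-W 5120 -H 1440` takes the fast path, and any other live mode (or an unparsable command line) falls through to the reconfigure + restart path.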

Validated live on a headless bazzite-deck-nvidia box: game mode boots headless +
stable (0 strikes); the host attaches + streams video/audio/EIS input; a
5120x1440 client reuses the matching session and streams at 5120x1440.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 11:09:45 +00:00
enricobuehler 50e17b3508 fix(host/capture): hold the session through a slow compositor switch
apple / swift (push) Successful in 1m1s
ci / docs-site (push) Successful in 54s
apple / screenshots (push) Successful in 5m14s
deb / build-publish (push) Successful in 2m30s
decky / build-publish (push) Successful in 11s
android / android (push) Successful in 4m41s
ci / rust (push) Successful in 4m52s
ci / web (push) Successful in 49s
windows-host / package (push) Successful in 7m54s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m34s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m10s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m7s
A Bazzite/SteamOS Gaming↔Desktop switch tears the old compositor down and can
take 15s+ to bring the new one up — longer than the capture-loss rebuild's
~10s window, so the session failed mid-switch ("disconnect — session failed")
and forced the client to cold-reconnect. Retry the rebuild within a 40s budget
instead of giving up after one round, and re-detect the live compositor on
each attempt so the stream follows the box to whatever session comes up (a new
instance of the same compositor, or a different one — the kind-change case).
The QUIC keepalive runs on its own thread, so the client stays connected
(frozen on the last frame) and the stream resumes when the new output appears,
with no reconnect.
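
A minimal sketch of the budgeted retry, assuming the rebuild is exposed as a fallible closure that re-detects the live compositor itself on every call. `rebuild_with_budget`, the `pause` parameter and the round counter are illustrative names, not the host's actual API; only the 40s budget comes from this commit.

```rust
use std::time::{Duration, Instant};

/// Retry `attempt` until it succeeds or `budget` has elapsed, sleeping
/// `pause` between rounds. The closure receives the 1-based round number
/// and is expected to re-detect the compositor each time it runs.
/// At least one attempt is always made, even with a zero budget.
fn rebuild_with_budget<T, E>(
    budget: Duration,
    pause: Duration,
    mut attempt: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let deadline = Instant::now() + budget;
    let mut round = 0;
    loop {
        round += 1;
        match attempt(round) {
            Ok(v) => return Ok(v),
            // Budget exhausted: report the most recent error.
            Err(e) if Instant::now() >= deadline => return Err(e),
            // Still inside the budget: wait and try again.
            Err(_) => std::thread::sleep(pause),
        }
    }
}
```

A caller would pass `Duration::from_secs(40)` as the budget. Because the keepalive runs on its own thread, blocking in this loop does not drop the client.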

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 09:31:47 +00:00
enricobuehler 94c556f0e3 fix(host/capture): recover from compositor loss instead of freezing
apple / screenshots (push) Successful in 5m7s
apple / swift (push) Successful in 1m1s
windows-host / package (push) Successful in 7m26s
android / android (push) Successful in 4m50s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 54s
decky / build-publish (push) Successful in 11s
ci / rust (push) Successful in 4m51s
deb / build-publish (push) Successful in 2m29s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m37s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m1s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
When the compositor is torn down mid-stream (a Gaming↔Desktop switch removes
the virtual output), its PipeWire stream leaves Streaming for Paused rather
than disconnecting. try_latest treated that as Ok(None) ("static desktop —
repeat the last frame"), so the stream froze on the last frame forever and
neither recovery path fired: the capture-loss rebuild keys on Err, and the
session watcher keys on a session-KIND change (a desktop→desktop new KWin
instance is the same kind).

Track the PipeWire stream state via state_changed (a `streaming` flag) and,
in try_latest, surface a sustained non-Streaming state (1.5s grace for a
transient renegotiation blip) as a capture-loss Err — which the encode loop
already handles by rebuilding the pipeline in place. A static desktop stays
Streaming, so no false trigger. Complements the now-default session watcher.
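
The grace-period logic can be sketched like this. It is a single-threaded illustration with hypothetical names (`StreamHealth`, `on_state`, `check`); the real flag is presumably shared between the PipeWire callback and the encode loop, which this sketch leaves out. Time is passed in explicitly so the behavior is easy to test.

```rust
use std::time::{Duration, Instant};

/// Tracks how long the capture stream has been out of the Streaming state.
struct StreamHealth {
    grace: Duration,
    not_streaming_since: Option<Instant>,
}

impl StreamHealth {
    fn new(grace: Duration) -> Self {
        Self { grace, not_streaming_since: None }
    }

    /// Call from the state-changed callback.
    fn on_state(&mut self, streaming: bool, now: Instant) {
        if streaming {
            self.not_streaming_since = None;
        } else if self.not_streaming_since.is_none() {
            // Remember only the first moment the stream left Streaming.
            self.not_streaming_since = Some(now);
        }
    }

    /// Call from the frame poll. Returns Err once the stream has been out
    /// of Streaming for at least the grace period; a short renegotiation
    /// blip stays Ok.
    fn check(&self, now: Instant) -> Result<(), &'static str> {
        match self.not_streaming_since {
            Some(t) if now.duration_since(t) >= self.grace => Err("capture lost"),
            _ => Ok(()),
        }
    }
}
```

A static desktop never calls `on_state(false, ..)`, so `check` keeps returning Ok and the "repeat the last frame" path is unaffected.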

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 09:00:35 +00:00
enricobuehler 32c1929948 feat(host/session-watch): default Gaming↔Desktop follow on for Bazzite/SteamOS
apple / swift (push) Successful in 1m2s
android / android (push) Successful in 4m52s
ci / rust (push) Successful in 5m3s
ci / web (push) Successful in 55s
ci / docs-site (push) Successful in 54s
decky / build-publish (push) Successful in 22s
windows-host / package (push) Successful in 9m7s
ci / bench (push) Successful in 4m40s
apple / screenshots (push) Successful in 5m20s
deb / build-publish (push) Successful in 2m31s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 32s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m40s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m39s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m24s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m19s
docker / deploy-docs (push) Successful in 22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m29s
The mid-stream session watcher (rebuild the backend in place when the box
flips Gaming↔Desktop) was opt-in via PUNKTFUNK_SESSION_WATCH, so it never
ran on a stock Bazzite/SteamOS box — switching modes froze the stream on the
now-dead compositor. Default it ON when os-release ID/ID_LIKE is
bazzite/steamos (the platforms that flip sessions); still off on plain
desktops. Also parse the env value properly so PUNKTFUNK_SESSION_WATCH=0 actually
disables it (previously any value, including "0", enabled it).
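
A sketch of the resulting decision, under stated assumptions: the function names are hypothetical, and the accepted spellings beyond "0" and "1" (true/false, yes/no, on/off) are a guess at what "parse properly" means, not confirmed by this commit. An unrecognized value falls back to the platform default.

```rust
/// Interpret an on/off environment value. Returns None for anything
/// unrecognized. (Accepted spellings other than "0"/"1" are an assumption.)
fn parse_switch(value: &str) -> Option<bool> {
    match value.trim().to_ascii_lowercase().as_str() {
        "1" | "true" | "yes" | "on" => Some(true),
        "0" | "false" | "no" | "off" => Some(false),
        _ => None,
    }
}

/// True when os-release names bazzite or steamos in ID or ID_LIKE.
/// Values may be quoted, and ID_LIKE is a space-separated list.
fn flips_sessions(os_release: &str) -> bool {
    os_release
        .lines()
        .filter_map(|line| line.split_once('='))
        .filter(|(key, _)| *key == "ID" || *key == "ID_LIKE")
        .flat_map(|(_, value)| {
            value
                .trim()
                .trim_matches(|c: char| c == '"' || c == '\'')
                .split_whitespace()
        })
        .any(|id| id == "bazzite" || id == "steamos")
}

/// An explicit env value wins; otherwise default by platform.
fn session_watch_enabled(env: Option<&str>, os_release: &str) -> bool {
    env.and_then(parse_switch)
        .unwrap_or_else(|| flips_sessions(os_release))
}
```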

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-28 08:43:27 +00:00
enricobuehler 3915a82780 fix(host/input): route KWin auto-detect to the fake_input backend
apple / swift (push) Successful in 1m1s
apple / screenshots (push) Successful in 5m2s
windows-host / package (push) Successful in 6m56s
android / android (push) Successful in 4m42s
ci / rust (push) Successful in 4m52s
ci / web (push) Successful in 52s
ci / docs-site (push) Successful in 56s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 13s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m29s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m4s
docker / deploy-docs (push) Successful in 6s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m0s
apply_input_env() hard-pinned PUNKTFUNK_INPUT_BACKEND=libei for KWin, and
default_backend() reads that env first — so the auto-detecting host (the
normal `serve` service) ignored the new KwinFakeInput backend and fell back
to the RemoteDesktop portal path that needs a user to approve. Route KWin to
"kwin" (org_kde_kwin_fake_input); GNOME/Mutter stay on libei (no fake_input
there).
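
The routing after this fix, reduced to a sketch. It assumes only two backends and hypothetical function and parameter names; the real default_backend() likely knows more backends and reads the environment itself. The precedence (forced env value first, then compositor detection) follows the description above.

```rust
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum InputBackend {
    KwinFakeInput,
    Libei,
}

/// `forced` is the value of PUNKTFUNK_INPUT_BACKEND, if set;
/// `compositor` is the detected compositor name;
/// `current_desktop` is XDG_CURRENT_DESKTOP (a colon-separated list).
fn pick_backend(forced: Option<&str>, compositor: &str, current_desktop: &str) -> InputBackend {
    // An explicit override always wins.
    match forced {
        Some("kwin") => return InputBackend::KwinFakeInput,
        Some("libei") => return InputBackend::Libei,
        _ => {}
    }
    let is_kde = current_desktop
        .split(':')
        .any(|d| d.eq_ignore_ascii_case("KDE"));
    if compositor == "kwin" || is_kde {
        InputBackend::KwinFakeInput
    } else {
        // GNOME/Mutter has no fake_input, so it stays on libei.
        InputBackend::Libei
    }
}
```

The bug fixed here corresponds to the auto-detect step having set the forced value to "libei" for KWin, so the first branch returned before detection could choose fake_input.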

Validated live on a Bazzite KDE box via the auto-detect path:
backend=KwinFakeInput, "KWin fake_input ready (no portal)", input events
forwarded with no errors.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:52:02 +00:00
enricobuehler a4833e4780 feat(android/touch): trackpad-relative cursor (default), with a direct-touch toggle
apple / swift (push) Successful in 1m10s
android / android (push) Successful in 4m53s
ci / rust (push) Successful in 5m1s
ci / web (push) Successful in 58s
ci / docs-site (push) Successful in 55s
apple / screenshots (push) Successful in 5m28s
deb / build-publish (push) Successful in 2m30s
windows-host / package (push) Successful in 8m41s
decky / build-publish (push) Successful in 29s
ci / bench (push) Successful in 4m27s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 34s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m43s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m35s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m25s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 48s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m46s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 10m1s
docker / deploy-docs (push) Successful in 24s
One-finger touch was absolute "direct pointing" — the host cursor jumped to the
finger and was recomputed from each touch-start, so you couldn't precisely reach a
target. Now a relative trackpad: the cursor stays put on touch-down and moves by the
finger delta (host MouseMove via nativeSendPointerMove, already supported — no
protocol change), with mild pointer acceleration and sub-pixel remainder
accumulation so slow precise moves aren't lost to Int truncation. Swipe, lift, and
re-swipe to walk it across; tap = left-click at the cursor's current position.
Two-finger scroll / right-click, three-finger HUD toggle, and tap-then-hold-drag are
preserved unchanged; finger-id re-anchoring keeps multi-touch transitions jump-free.
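
The sub-pixel remainder accumulation can be illustrated as below. The Android client itself is Kotlin; this sketch uses Rust for consistency with the host code, the `Trackpad` type is hypothetical, and a constant `gain` stands in for the acceleration curve, whose actual shape is not described here.

```rust
/// Relative-pointer state: carries the fractional part of each move
/// forward so slow drags are not truncated to zero.
#[derive(Default)]
struct Trackpad {
    rem_x: f32,
    rem_y: f32,
}

impl Trackpad {
    /// Convert a finger delta into whole-pixel mouse motion.
    /// `gain` is a placeholder for the pointer-acceleration factor.
    fn step(&mut self, dx: f32, dy: f32, gain: f32) -> (i32, i32) {
        let x = dx * gain + self.rem_x;
        let y = dy * gain + self.rem_y;
        // Send the whole pixels now, keep the fraction for the next event.
        let (ix, iy) = (x.trunc(), y.trunc());
        self.rem_x = x - ix;
        self.rem_y = y - iy;
        (ix as i32, iy as i32)
    }
}
```

Two consecutive half-pixel moves produce (0, 0) and then (1, 0): the first move is remembered rather than dropped, which is what keeps slow, precise motion from being lost.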

Added Settings → Pointer → "Trackpad mode" (default on); turning it off restores the
old direct-pointing path verbatim.

:app:compileDebugKotlin green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:34:03 +00:00
enricobuehler 4e79e6cdad fix(android/audio): kill the AAudio crackle (RT-safe ring + deeper buffer + XRun sizing)
The jitter ring was a port of the Linux client's, but Linux runs on PipeWire
(adaptive resampling masks host↔DAC drift + a shallow buffer); AAudio hands us a
raw realtime callback and we own the buffer, so the same code crackled only on
Android. Three converging causes, all fixed:

- Heap free on the realtime audio thread every quantum (Android's Scudo free() has
  unbounded tail latency → XRun → click). Decoded buffers are now recycled back to
  the producer via a free-list instead of freed on the audio thread; the ring is
  pre-reserved so extend() never reallocates there.
- The ring collapsed to ~15 ms on the tiny LowLatency burst and re-primed (a fresh
  silence) on every single empty callback. Now ~40 ms prime / ~150 ms hard cap,
  decoupled from the burst size, with de-prime hysteresis (re-prime only after a
  sustained drain).
- AAudio's anti-glitch knobs were unused: prime the HW buffer above its 2-burst
  default and grow it on getXRunCount(). The post-open log now reports
  perf/sharing/buffer so a fall to a resampled legacy path is visible.
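
The de-prime hysteresis from the second item can be sketched as a small gate. `PrimeGate` and its thresholds are hypothetical; the real ring measures in its own units and also enforces the hard cap, which is omitted here. The point is only that one empty callback no longer forces a re-prime.

```rust
/// Playback gate for a jitter ring: stay silent until `prime` frames are
/// queued, ride out brief underruns, and re-prime only after `max_empty`
/// consecutive empty callbacks.
struct PrimeGate {
    prime: usize,
    max_empty: u32,
    primed: bool,
    empty_run: u32,
}

impl PrimeGate {
    fn new(prime: usize, max_empty: u32) -> Self {
        Self { prime, max_empty, primed: false, empty_run: 0 }
    }

    /// Called once per audio callback with the number of frames queued.
    /// Returns true when the callback should read from the ring, false
    /// when it should output silence.
    fn should_play(&mut self, queued: usize) -> bool {
        if !self.primed {
            if queued >= self.prime {
                self.primed = true;
                self.empty_run = 0;
            }
            return self.primed;
        }
        if queued == 0 {
            // Nothing to play this time; drop back to priming only if the
            // drain is sustained.
            self.empty_run += 1;
            if self.empty_run >= self.max_empty {
                self.primed = false;
            }
            return false;
        }
        self.empty_run = 0;
        true
    }
}
```

Assuming 48 kHz audio, a ~40 ms prime would be `PrimeGate::new(1920, n)` for some small `n`. The gate does no allocation or freeing, in keeping with the first item's rule for the realtime thread.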

Steady-state audio latency ~15 → ~40 ms (within lip-sync tolerance; matches the
Moonlight/Sunshine operating point). cargo-ndk build both ABIs + fmt + clippy green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:33:51 +00:00
enricobuehler f74bc4a3f1 feat(host/input): headless KDE input via org_kde_kwin_fake_input
Desktop-mode (KWin) streaming had no input: the path was libei via the
RemoteDesktop portal, which (a) isn't reachable from the host service env
and (b) requires a human to approve "Allow remote control?" — a
non-starter on a headless box. KWin's own headless RDP server (krdpserver)
solves this with org_kde_kwin_fake_input, authorized by the exact same
.desktop X-KDE-Wayland-Interfaces grant we already ship
(org_kde_kwin_fake_input is listed alongside zkde_screencast_unstable_v1).

Add a fake_input injector: vendor the protocol XML, bind the global as an
ordinary Wayland client, authenticate (auto-accepted for an
interface-authorized client — no dialog), and translate pointer (rel/abs),
button, scroll, keyboard (raw evdev keycodes resolved by KWin's own keymap)
and touch. Select it for KWin (compositor=="kwin" or XDG_CURRENT_DESKTOP
KDE); GNOME stays on libei (it has neither fake_input nor the wlr
protocols). PUNKTFUNK_INPUT_BACKEND=kwin forces it.

cargo check + clippy + fmt green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:26:04 +00:00
enricobuehler 8e18d01af5 fix(host/kwin): authorize Desktop-mode streaming via a shipped .desktop
Streaming the KDE *Desktop* (KWin) session failed on a real interactive
Plasma session with "KWin does not expose zkde_screencast_unstable_v1":
KWin treats the screencast/virtual-output and fake_input globals as
restricted and advertises them only to a client whose installed .desktop
lists them under X-KDE-Wayland-Interfaces (matched by /proc/<pid>/exe ->
Exec, and cached per-executable on first connect). The host shipped no
.desktop, so it was permanently denied; it only ever worked on the
headless dev box via KWIN_WAYLAND_NO_PERMISSION_CHECKS=1.

Ship packaging/linux/io.unom.Punktfunk.Host.desktop (least-privilege:
only the host, only zkde_screencast_unstable_v1 + org_kde_kwin_fake_input)
and install it from the RPM/.deb/Arch host packaging so it is present
before the host first connects. Drop the blunt session-wide
NO_PERMISSION_CHECKS hack from kde-desktop-setup.sh (it now only seeds the
RemoteDesktop input grant) and fix the now-misleading kwin.rs docs/errors.

Validated live on a Bazzite Kinoite box (KWin 6.6.4): probe-compositor +
spike --source kwin-virtual succeed against a KWin running WITHOUT the
permission bypass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-27 11:15:39 +00:00
enricobuehler 3477cbe7ce fix(audio/windows): stop the client mic echoing back through the loopback
The Windows virtual mic fakes a capture endpoint by writing the client's
uplinked PCM into a virtual device's *render* endpoint, while the
desktop-audio plane loopback-captures the *default render* endpoint — with
no mutual exclusion between the two. WASAPI loopback captures the mixed
output of an endpoint (everything any app renders to it, including our mic
writes), so when both resolve to the same device — VB-CABLE used for both,
or the auto-installed Steam Streaming Microphone being the default render on
a headless box — the injected mic is captured straight back into the
host->client audio stream: an infinite echo.

find_device() now resolves the loopback's endpoint id (default render) and
skips any candidate matching it, scanning on to the next non-loopback match,
so the mic can never land on the device the loopback reads. The auto-install
path now provisions the full Steam pair (Streaming Microphone + Streaming
Speakers) so a bare host gets two distinct devices instead of one shared
one. Errors distinguish "no device" from "only candidate is the loopback
device". Linux was already immune (its mic is a dedicated Audio/Source node,
structurally separate from the monitored sink).
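
The selection rule can be sketched without any WASAPI calls. The `Endpoint` type, the function name and the endpoint ids are hypothetical, and matching candidates by name substring is an assumption; the two error cases mirror the distinction described above.

```rust
#[derive(Debug, PartialEq, Eq)]
enum MicDeviceError {
    /// No endpoint matched any wanted name.
    NoDevice,
    /// Something matched, but only the endpoint the loopback captures.
    OnlyLoopbackDevice,
}

#[derive(Debug, PartialEq, Eq)]
struct Endpoint<'a> {
    id: &'a str,
    name: &'a str,
}

/// Pick the first endpoint whose name contains one of `wanted`, skipping
/// the endpoint the desktop-audio loopback reads from.
fn find_mic_device<'a>(
    endpoints: &'a [Endpoint<'a>],
    wanted: &[&str],
    loopback_id: &str,
) -> Result<&'a Endpoint<'a>, MicDeviceError> {
    let mut saw_loopback = false;
    for ep in endpoints {
        if !wanted.iter().any(|w| ep.name.contains(*w)) {
            continue;
        }
        if ep.id == loopback_id {
            // Writing the mic here would be captured straight back.
            saw_loopback = true;
            continue;
        }
        return Ok(ep);
    }
    Err(if saw_loopback {
        MicDeviceError::OnlyLoopbackDevice
    } else {
        MicDeviceError::NoDevice
    })
}
```

With the full Steam pair installed there are two distinct endpoints, so the scan can always move past the default render device to a second candidate.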

Windows-only (#[cfg(windows)]); rustfmt-clean, compile-checked in
windows-host CI, needs on-glass validation on the RTX box. Does not force
the system default playback onto Steam Streaming Speakers (IPolicyConfig) —
not required to break the echo.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:51:46 +00:00
enricobuehler 5a2e07e865 style(windows): rustfmt install.rs to unbreak cargo fmt --all --check
apple / swift (push) Successful in 1m3s
ci / rust (push) Successful in 4m52s
ci / web (push) Successful in 56s
ci / docs-site (push) Successful in 59s
apple / screenshots (push) Successful in 5m12s
ci / bench (push) Successful in 4m40s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m17s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m13s
release / apple (push) Successful in 10m9s
deb / build-publish (push) Successful in 2m44s
decky / build-publish (push) Successful in 11s
android / android (push) Successful in 3m33s
flatpak / build-publish (push) Successful in 4m9s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m5s
docker / deploy-docs (push) Successful in 7s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m16s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m41s
The pnputil /add-driver call in windows/install.rs was committed unwrapped;
`cargo fmt --all --check` (which checks cfg(windows) files too) flagged it and
failed the `rust` CI job at the Format step, skipping clippy/build/test. Apply
rustfmt — no behavior change. Clears the way to cut the v0.2.0 release from
green main.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 23:19:12 +00:00
enricobuehler 6e949b6748 fix(readme): make the logo readable on light + dark themes
apple / swift (push) Successful in 1m3s
apple / screenshots (push) Successful in 5m25s
ci / rust (push) Failing after 1m5s
ci / web (push) Successful in 52s
ci / docs-site (push) Successful in 1m0s
android / android (push) Successful in 3m59s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
ci / bench (push) Successful in 4m35s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m55s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m46s
docker / deploy-docs (push) Successful in 7s
The wordmark was light violet only — low-contrast on a light README
background. Swap to a single theme-adaptive SVG: an internal
`prefers-color-scheme` media query paints it deep violet (the brand-mark
palette) on light backgrounds and the original light violet on dark, so it
reads on both GitHub/Gitea themes with no markup change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:54:03 +00:00
enricobuehler 8ae161fe61 docs(windows): README - install via punktfunk-host.exe driver install / web setup (not .ps1)
apple / swift (push) Successful in 1m0s
windows-host / package (push) Successful in 6m20s
apple / screenshots (push) Successful in 5m26s
ci / rust (push) Failing after 26s
ci / web (push) Successful in 54s
deb / build-publish (push) Successful in 2m30s
ci / docs-site (push) Successful in 1m3s
android / android (push) Successful in 3m19s
decky / build-publish (push) Successful in 13s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m35s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m2s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m48s
docker / deploy-docs (push) Successful in 6s
Option A removed install-pf-vdisplay.ps1 / install-gamepad-drivers.ps1 / web-setup.ps1;
the installer now calls the exe subcommands. Drop the stale table rows + reword the
install-flow + 'thin installer' notes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:46:05 +00:00
enricobuehler 3a89ee8cd7 docs(readme): add logo banner + refresh Windows-host status
- Add the centered punktfunk wordmark banner at the top (assets/punktfunk-logo.svg,
  the same logo + layout the marketing site's README uses).
- Refresh the now-stale Windows-host facts: all-vendor (NVENC + AMF/QSV), its own
  all-Rust pf-vdisplay IddCx virtual display (was SudoVDA), bundled UMDF virtual-gamepad
  drivers (ViGEmBus gone), HDR incl. Vulkan-game HDR; x64-only, no longer NVIDIA-only.
- Note punktfunk-host covers Linux + Windows; point design/ at its new README index.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:45:29 +00:00
enricobuehler dac0fee4e3 docs(windows): reflect the install-via-exe (Option A) landing in the build/packaging doc
apple / swift (push) Successful in 1m3s
apple / screenshots (push) Successful in 5m31s
ci / web (push) Successful in 49s
decky / build-publish (push) Successful in 14s
ci / rust (push) Failing after 32s
ci / docs-site (push) Successful in 1m1s
android / android (push) Successful in 3m21s
deb / build-publish (push) Successful in 2m30s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 13s
ci / bench (push) Successful in 4m49s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m32s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
docker / deploy-docs (push) Successful in 6s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:44:47 +00:00
enricobuehler 125a51d81d feat(windows-installer): move driver + web install into the host exe (ASCII root fix)
apple / swift (push) Successful in 1m0s
apple / screenshots (push) Successful in 5m16s
windows-host / package (push) Successful in 6m25s
ci / rust (push) Failing after 28s
ci / web (push) Successful in 53s
ci / docs-site (push) Successful in 1m1s
android / android (push) Successful in 3m21s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m39s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m2s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m50s
docker / deploy-docs (push) Successful in 17s
Port the three install-time PowerShell *files* (install-pf-vdisplay.ps1,
install-gamepad-drivers.ps1, web-setup.ps1) into punktfunk-host.exe subcommands:
`driver install [--gamepad] --dir <stage>` and `web setup --app-dir <app>
[--password-file <f>]` (windows/install.rs).

Why: PowerShell 5.1 reads a BOM-less .ps1 FILE in the machine ANSI codepage, so a
stray non-ASCII byte mis-decodes and aborts on a non-English box - exactly how the
pf-vdisplay driver install silently failed. A compiled subcommand drives the same
external tools (certutil/pnputil/nefconc/schtasks/netsh/icacls) as fixed string
literals, with no file-codepage surface. (The .iss's INLINE -Command PowerShell is a
command-line string, not a file read, so it's unaffected and stays.)
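
For any .ps1 files that remain, the failure mode is easy to detect mechanically. The check below is a sketch and not part of this commit: `first_risky_byte` is a hypothetical helper, and it handles only the UTF-8 BOM case (a UTF-16 BOM is not considered).

```rust
/// Offset and value of the first byte that an ANSI-codepage read would
/// decode differently from UTF-8: any non-ASCII byte in a file that does
/// not start with a UTF-8 BOM. Returns None when the file is safe.
fn first_risky_byte(script: &[u8]) -> Option<(usize, u8)> {
    const UTF8_BOM: &[u8] = &[0xEF, 0xBB, 0xBF];
    if script.starts_with(UTF8_BOM) {
        // With a BOM the file is read as UTF-8 regardless of codepage.
        return None;
    }
    script
        .iter()
        .copied()
        .enumerate()
        .find(|(_, b)| !b.is_ascii())
}
```

A pure-ASCII script and a BOM-prefixed script both return None; a BOM-less script containing "é" reports the offset of its first UTF-8 byte (0xC3).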

- windows/install.rs: faithful port - cert trust, gated nefconc node create + pnputil
  for pf-vdisplay; pnputil per-inf for gamepads; web-password ACL, the PunktfunkWeb task
  (generated UTF-16 XML), firewall rule, start. Best-effort (a hiccup warns, never aborts).
- punktfunk-host.iss [Run]: call the exe instead of `powershell -File`; drop the
  web-setup.ps1 staging + WebSetup define; WebSetupParams emits --app-dir/--password-file.
- pack-host-installer.ps1: stop copying the three install scripts into the stages.
- delete the three .ps1 files.

The `mod install;` + dispatch arms in main.rs landed in the preceding docs commit
(swept up by a concurrent commit); this commit adds the module + installer wiring.
Compile-validated in CI via windows-host; the install path will be validated on-glass
on the next canary install (the test box is offline).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:43:18 +00:00
enricobuehler 7b99b41ede docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).

- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
  apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
  host-latency, gpu-contention (fixed stale status table), game-library,
  linux-setup (fixed m0->spike + stale zero-copy claim),
  session-aware-host-followups, windows-client-bootstrap,
  windows-dualsense-{scoping,game-detection}, windows-virtual-display,
  security-review (per-finding status table; #12 still open),
  apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
  windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
  merged, M4 done); windows-secure-desktop.md archived (now a fallback
  behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
  roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-26 16:39:06 +00:00
enricobuehler 9ea2c17419 docs(windows): add design/windows-build-and-packaging.md + refresh packaging README
apple / swift (push) Successful in 1m0s
apple / screenshots (push) Successful in 5m19s
windows-host / package (push) Successful in 6m20s
android / android (push) Successful in 4m42s
ci / rust (push) Successful in 4m47s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 58s
deb / build-publish (push) Successful in 2m30s
decky / build-publish (push) Successful in 23s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
ci / bench (push) Successful in 4m40s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m16s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m3s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m0s
docker / deploy-docs (push) Successful in 22s
A single repo-internal source of truth for the Windows build/packaging: what ships, the
all-Rust driver workspace built FROM SOURCE in CI (+ the anti-stale rationale), the
toolchain (clang 22 + bindgen 0.72, no LLVM pin), the Inno installer, the web console
bundle, the CI workflows, signing, and the dev loop. (design/, not the docs-site.)

packaging/windows/README.md: drop the deleted vendored-driver dir + its "Vendored driver"
callout, add the build-* / install-gamepad / clear-force-integrity rows, point at the new
design doc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:22:40 +00:00
enricobuehler a9cca82fb8 chore(windows): clean up build/packaging - drop vendored driver binaries + the LLVM-21 pin
windows-drivers-provision / provision (push) Successful in 13s
windows-drivers / probe-and-proto (push) Successful in 17s
android / android (push) Failing after 40s
apple / swift (push) Successful in 1m0s
ci / web (push) Successful in 58s
windows-drivers / driver-build (push) Successful in 1m9s
ci / docs-site (push) Successful in 1m18s
ci / rust (push) Successful in 4m25s
apple / screenshots (push) Successful in 5m24s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
deb / build-publish (push) Successful in 2m29s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 29s
ci / bench (push) Successful in 4m48s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
windows-host / package (push) Successful in 6m38s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m24s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m31s
docker / deploy-docs (push) Successful in 18s
Now that the drivers build from source in CI, remove the dead checked-in binaries and
the toolchain cruft they left behind:

- Delete packaging/windows/{pf-vdisplay,gamepad-drivers}/ (the prebuilt .dll/.inf/.cat/.cer).
  pack-host-installer.ps1 builds + signs all three drivers from the drivers/ workspace and
  nothing reads the vendored dirs anymore; stage-pf-vdisplay.ps1's -VendorDir is now a
  mandatory build-output path, not a vendored default.
- Drop the LLVM-21 pin. The vendored bindgen 0.71->0.72 bump (the shipping pack already
  builds green on the runner-default clang 22) retired the bindgen-0.71 layout-test overflow
  that needed LLVM 21.1.2, so windows-drivers.yml + provision-windows-wdk.ps1 no longer
  install/point at C:\llvm-21 (~898 MB off a fresh provision) - both driver builds now use one
  toolchain (clang 22 + bindgen 0.72).
- pack -SkipBuild on the gamepad build (build-pf-vdisplay.ps1 already builds the whole
  workspace), build-web.ps1 reaps a stale node too, deploy-dev.ps1 nefconc path + comments.
- Reword the vendored-driver references (build scripts, .iss, READMEs, the vite web-bundle
  comment) to the build-from-source reality.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 16:16:46 +00:00
enricobuehler 7ab0661ddc fix(windows-installer): escape the brace in the [UninstallRun] PowerShell so ISCC compiles
windows-drivers / probe-and-proto (push) Successful in 21s
apple / swift (push) Successful in 1m4s
windows-drivers / driver-build (push) Successful in 1m9s
android / android (push) Successful in 4m25s
ci / web (push) Successful in 53s
apple / screenshots (push) Successful in 5m32s
ci / rust (push) Successful in 4m45s
ci / docs-site (push) Successful in 52s
windows-host / package (push) Successful in 6m47s
deb / build-publish (push) Successful in 2m28s
decky / build-publish (push) Successful in 14s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m45s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m55s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m46s
The Bug C [UninstallRun] one-liner had `ForEach-Object { Stop-Process ... }`; Inno
Setup parses `{...}` as a constant in [Run]/[UninstallRun] sections, so ISCC aborted
with "Unknown constant" and the windows-host pack failed at the ISCC step (the host
build, clippy, driver build + web smoke-boot all passed). Escape `{` as `{{`. The
same one-liner in the [Code] StopWebConsole proc is inside a Pascal string literal,
so its brace is literal and must NOT be escaped. Validated: ISCC now parses past
[UninstallRun] + [Code] (fails only later on the absent dummy payload).
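
The escaping rule described above can be sketched as a small shell helper. This is a hypothetical illustration, not the script the installer uses: `escape_inno_braces` and the sample PowerShell one-liner (including the `example` process name) are placeholders. It doubles only the opening brace, since Inno Setup does not require `}` to be doubled.

```shell
#!/bin/sh
# Hypothetical helper: double every opening brace so Inno Setup treats it as a
# literal character instead of the start of a {constant}. Closing braces are
# left untouched.
escape_inno_braces() {
  printf '%s\n' "$1" | sed 's/{/{{/g'
}

# Placeholder one-liner shaped like the one in the commit message.
escape_inno_braces 'Get-Process example | ForEach-Object { Stop-Process -Id $_.Id }'
```

The output is only suitable for [Run]/[UninstallRun] entries; the copy inside the [Code] Pascal string literal would be left unescaped, as the message notes.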

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 15:15:07 +00:00
enricobuehler 92e68024f1 fix(windows-installer): build the gamepad drivers from source in CI too
Fold the pf-dualsense (DualSense / DualShock 4) and pf-xusb (Xbox 360 / XInput)
UMDF drivers into the in-tree drivers workspace (their source had stale
../../crates/wdk-* path-deps from before the wdk vendoring reorg and could no
longer build at all) and build them from source per release, exactly like
pf-vdisplay - same anti-stale reasoning. One `cargo build --release` now builds
all three drivers against the vendored wdk-sys (incl. the bindgen 0.72 pin), and
build-gamepad-drivers.ps1 signs pf_dualsense + pf_xusb (clear FORCE_INTEGRITY ->
sign dll -> stampinf -> Inf2Cat -> sign cat) with one shared cert + .cer,
matching the layout install-gamepad-drivers.ps1 expects. pack-host-installer.ps1
builds + stages them instead of the retired checked-in binaries.

Validated on the runner: the whole workspace (pf-vdisplay + pf-dualsense +
pf-xusb) builds with CARGO_TARGET_DIR=C:\t set, and build-gamepad-drivers.ps1
produces signed pf_dualsense.{dll,inf,cat} + pf_xusb.{dll,inf,cat} + the .cer.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 15:08:40 +00:00
195 changed files with 9034 additions and 5194 deletions
+57
@@ -0,0 +1,57 @@
# Android client screenshots for the Play listing / marketing. Roborazzi renders the real Compose
# UI with mock state on the host JVM via Robolectric — NO emulator, GPU, KVM, host, or JNI core
# (`-PskipRustBuild` skips the cargo-ndk native build). The Android analogue of apple.yml's
# `screenshots` job, gated to STABLE RELEASE tags only. Standalone + best-effort: a failure here
# reds nothing else. PNGs land as a 30-day artifact; not committed or published.
name: android-screenshots
on:
push:
tags: ["v*"]
workflow_dispatch:
jobs:
screenshots:
if: startsWith(github.ref, 'refs/tags/v') || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-24.04
timeout-minutes: 45
steps:
- uses: actions/checkout@v4
- name: JDK 21 (AGP 9.2 + Robolectric's SDK-36 android-all jar both want 1721)
uses: actions/setup-java@v4
with:
distribution: temurin
java-version: "21"
- name: Android SDK
uses: android-actions/setup-android@v3
# No NDK/CMake — the screenshot unit tests are pure JVM. compileSdk 37 auto-downloads via AGP
# if the platform channel lacks it (same note as android.yml).
- name: platform-tools + platform 36 + build-tools
run: sdkmanager "platform-tools" "platforms;android-36" "build-tools;37.0.0"
- name: Cache (gradle)
uses: actions/cache@v4
with:
path: |
~/.gradle/caches
~/.gradle/wrapper
key: android-screenshots-${{ hashFiles('clients/android/**/*.gradle.kts') }}
restore-keys: android-screenshots-
# Roborazzi renders Compose on the JVM (Robolectric Native Graphics). `-PskipRustBuild` keeps
# the cargo-ndk native build out of the graph — the tests never load libpunktfunk_android.so.
- name: Capture screenshots (Roborazzi)
working-directory: clients/android
run: ./gradlew :app:testDebugUnitTest -PskipRustBuild --stacktrace
- name: Upload screenshots
if: always()
# v3: Gitea's API rejects upload-artifact@v4 (see apple.yml). Download is a zip.
uses: actions/upload-artifact@v3
with:
name: punktfunk-android-screenshots
path: clients/android/app/build/outputs/roborazzi
retention-days: 30
+35
@@ -32,6 +32,25 @@ jobs:
dirname "$RUSTUP" >> "$GITHUB_PATH"
"$RUSTUP" target add aarch64-apple-darwin x86_64-apple-darwin
# `punktfunk-core` now decodes Opus in-core for the Apple client (surround), pulling
# `audiopus_sys`, which builds a vendored static libopus via CMake when pkg-config can't find a
# system Opus — so the xcframework is self-contained (no runtime libopus.dylib on end-user Macs).
# CMake must be on PATH; install it self-healing on a fresh runner.
- name: CMake (for the vendored libopus audiopus_sys builds)
run: |
# Runner steps run with `bash --noprofile --norc`, so Homebrew's bin dir isn't on PATH —
# locate brew explicitly, install cmake if missing, and export its bin dir to GITHUB_PATH so
# the xcframework build step (audiopus_sys → vendored libopus) finds `cmake`.
for B in /opt/homebrew/bin/brew /usr/local/bin/brew; do [ -x "$B" ] && BREW="$B" && break; done
if [ -z "$BREW" ]; then echo "::error::Homebrew not found on the runner"; exit 1; fi
BREW_BIN="$(dirname "$BREW")"; export PATH="$BREW_BIN:$PATH"
command -v cmake >/dev/null || "$BREW" install cmake
echo "$BREW_BIN" >> "$GITHUB_PATH"
# Homebrew's CMake 4 dropped compatibility with the vendored libopus's pre-3.5
# `cmake_minimum_required`; treat 3.5 as the policy minimum (the cmake crate's child cmake
# inherits this from the env during the xcframework build).
echo "CMAKE_POLICY_VERSION_MINIMUM=3.5" >> "$GITHUB_ENV"
- name: Build PunktfunkCore.xcframework
run: bash scripts/build-xcframework.sh
@@ -71,6 +90,22 @@ jobs:
"$RUSTUP" target add aarch64-apple-darwin x86_64-apple-darwin \
aarch64-apple-ios aarch64-apple-ios-sim x86_64-apple-ios
# See the swift job: audiopus_sys (via the in-core Opus decode) builds vendored libopus with CMake.
- name: CMake (for the vendored libopus audiopus_sys builds)
run: |
# Runner steps run with `bash --noprofile --norc`, so Homebrew's bin dir isn't on PATH —
# locate brew explicitly, install cmake if missing, and export its bin dir to GITHUB_PATH so
# the xcframework build step (audiopus_sys → vendored libopus) finds `cmake`.
for B in /opt/homebrew/bin/brew /usr/local/bin/brew; do [ -x "$B" ] && BREW="$B" && break; done
if [ -z "$BREW" ]; then echo "::error::Homebrew not found on the runner"; exit 1; fi
BREW_BIN="$(dirname "$BREW")"; export PATH="$BREW_BIN:$PATH"
command -v cmake >/dev/null || "$BREW" install cmake
echo "$BREW_BIN" >> "$GITHUB_PATH"
# Homebrew's CMake 4 dropped compatibility with the vendored libopus's pre-3.5
# `cmake_minimum_required`; treat 3.5 as the policy minimum (the cmake crate's child cmake
# inherits this from the env during the xcframework build).
echo "CMAKE_POLICY_VERSION_MINIMUM=3.5" >> "$GITHUB_ENV"
- name: Build PunktfunkCore.xcframework (mac + iOS slices)
run: BUILD_IOS=1 bash scripts/build-xcframework.sh
+44 -13
@@ -11,12 +11,18 @@
# punktfunk.zip
# punktfunk/ <- single top-level dir == plugin.json "name"
# plugin.json [required]
# package.json [required]
# package.json [required; CI stamps "version" — Decky reads the installed version here]
# main.py [required: python backend]
# dist/index.js [required: rollup output]
# update.json [CI-baked {channel, manifest}: where the plugin's self-update check polls]
# README.md (recommended)
# LICENSE [required by the plugin store]
#
# SELF-UPDATE (no Decky store): alongside the zip we also publish a tiny per-channel
# `manifest.json` ({version, artifact=<immutable per-version zip URL>, sha256}). The installed
# plugin polls it (main.py check_update), and the frontend drives Decky's own install RPC to
# apply a newer build. See clients/decky/README.md "Updating".
#
# REGISTRY_TOKEN: repo Actions secret, a PAT with write:package scope (shared with deb/rpm/docker).
name: decky
@@ -56,20 +62,26 @@ jobs:
pnpm install --frozen-lockfile
pnpm run build # rollup -> clients/decky/dist/index.js
- name: Version + channel
# Tag vX.Y.Z -> X.Y.Z (stable `latest/` alias + Gitea Release); main push -> 0.3.0-ciN.g<sha>
# (`canary/` alias). Used for the registry version path + the zip name (the plugin.json
# version is the source of truth Decky reads after install — bump it in the release commit).
- name: Version + channel + stamp
# Tag vX.Y.Z -> X.Y.Z (stable `latest/` alias + Gitea Release); main push -> 0.3.<run>
# (`canary/` alias). Decky reads a plugin's INSTALLED version from package.json (NOT
# plugin.json), and the plugin's own update check (clients/decky/main.py check_update)
# compares against it — so the build version is STAMPED into package.json here (mirrored
# into plugin.json for store parity). Canary is a PLAIN numeric semver, never a
# `-ci<N>` prerelease: compare-versions orders prerelease identifiers lexically
# (ci10 < ci9), which would break update detection; the run number is monotonic.
working-directory: ${{ gitea.workspace }}
run: |
SHORT=$(echo "$GITHUB_SHA" | cut -c1-8)
case "$GITHUB_REF" in
refs/tags/v*) V="${GITHUB_REF_NAME#v}"; ALIAS=latest ;;
*) V="0.3.0-ci${GITHUB_RUN_NUMBER}.g${SHORT}"; ALIAS=canary ;;
*) V="0.3.${GITHUB_RUN_NUMBER}"; ALIAS=canary ;;
esac
BASE="https://$REGISTRY/api/packages/$OWNER/generic/$PACKAGE"
echo "VERSION=$V" >> "$GITHUB_ENV"
echo "ALIAS=$ALIAS" >> "$GITHUB_ENV"
echo "BASE=$BASE" >> "$GITHUB_ENV"
echo "decky version $V -> alias '$ALIAS'"
VERSION="$V" node -e 'const fs=require("fs");for(const f of ["clients/decky/package.json","clients/decky/plugin.json"]){const j=JSON.parse(fs.readFileSync(f,"utf8"));j.version=process.env.VERSION;fs.writeFileSync(f,JSON.stringify(j,null,2)+"\n");}'
- name: Assemble store-layout zip
working-directory: ${{ gitea.workspace }}
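
The ordering problem the step comment describes (`ci10 < ci9`) can be reproduced with plain `sort`. This is a rough sketch that uses `sort` as a stand-in for the string comparison; it is not the compare-versions library itself, and `lexical_newest` / `numeric_newest` are hypothetical helper names. An identifier such as `ci10` mixes letters and digits, so it is compared as a string, whereas a plain numeric patch field is compared as a number.

```shell
#!/bin/sh
# String comparison: '1' sorts before '9', so ci10 lands before ci9 and the
# "newest" pick is wrong.
lexical_newest() { printf '%s\n' "$@" | LC_ALL=C sort | tail -n 1; }

# Numeric comparison on the third dot-separated field (the run number).
numeric_newest() { printf '%s\n' "$@" | LC_ALL=C sort -t. -k3,3n | tail -n 1; }

lexical_newest 0.3.0-ci9 0.3.0-ci10   # picks 0.3.0-ci9 (run 10 looks older)
numeric_newest 0.3.9 0.3.10           # picks 0.3.10
```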
@@ -89,9 +101,20 @@ jobs:
chmod 0755 "$DEST/bin/punktfunkrun.sh"
# Store requires a LICENSE in the plugin root; the project is MIT OR Apache-2.0.
cp LICENSE-MIT "$DEST/LICENSE"
# Self-update channel pointer the backend reads (main.py check_update). It points at
# THIS channel's manifest.json (published below); that manifest in turn points at the
# immutable per-version zip, so its sha256 stays valid across future alias re-uploads.
printf '{"channel":"%s","manifest":"%s/%s/manifest.json"}\n' "$ALIAS" "$BASE" "$ALIAS" > "$DEST/update.json"
( cd "$STAGE" && zip -r "$RUNNER_TEMP/punktfunk.zip" "$PLUGIN" )
ls -lh "$RUNNER_TEMP/punktfunk.zip"
unzip -l "$RUNNER_TEMP/punktfunk.zip"
# The update manifest the plugin polls: the immutable per-version artifact + its
# sha256 (Decky's installer verifies the download against this hash, aborting on
# mismatch — so it MUST be the per-version URL, never the mutable alias).
SHA=$(sha256sum "$RUNNER_TEMP/punktfunk.zip" | cut -d' ' -f1)
printf '{"version":"%s","artifact":"%s/%s/punktfunk.zip","sha256":"%s"}\n' \
"$VERSION" "$BASE" "$VERSION" "$SHA" > "$RUNNER_TEMP/manifest.json"
cat "$RUNNER_TEMP/manifest.json"
- name: Publish to the Gitea generic registry
working-directory: ${{ gitea.workspace }}
@@ -99,18 +122,26 @@ jobs:
TOKEN: ${{ secrets.REGISTRY_TOKEN }}
run: |
BASE="https://$REGISTRY/api/packages/$OWNER/generic/$PACKAGE"
# 1) Immutable, versioned URL.
# 1) Immutable, versioned URL + its update manifest (the manifest's `artifact` points
# here, so the published sha256 keeps matching what Decky later downloads).
curl -fsS --user "enricobuehler:$TOKEN" --upload-file "$RUNNER_TEMP/punktfunk.zip" \
"$BASE/$VERSION/punktfunk.zip"
curl -fsS --user "enricobuehler:$TOKEN" --upload-file "$RUNNER_TEMP/manifest.json" \
"$BASE/$VERSION/manifest.json"
echo "published $BASE/$VERSION/punktfunk.zip"
# 2) Channel alias (stable release -> latest/, canary main build -> canary/) — the link
# to paste into Decky's "install from URL". The generic registry rejects re-uploading
# an existing version/file (409), so delete the prior alias first (ignore 404 on run #1).
curl -fsS -o /dev/null --user "enricobuehler:$TOKEN" -X DELETE \
"$BASE/$ALIAS/punktfunk.zip" || true
# 2) Channel alias (stable release -> latest/, canary main build -> canary/) — the
# zip is the "install from URL" link; manifest.json is what the installed plugin
# polls for updates. The generic registry rejects re-uploading an existing
# version/file (409), so delete the prior alias copies first (ignore 404 on run #1).
for f in punktfunk.zip manifest.json; do
curl -fsS -o /dev/null --user "enricobuehler:$TOKEN" -X DELETE "$BASE/$ALIAS/$f" || true
done
curl -fsS --user "enricobuehler:$TOKEN" --upload-file "$RUNNER_TEMP/punktfunk.zip" \
"$BASE/$ALIAS/punktfunk.zip"
curl -fsS --user "enricobuehler:$TOKEN" --upload-file "$RUNNER_TEMP/manifest.json" \
"$BASE/$ALIAS/manifest.json"
echo "install-from-URL link: $BASE/$ALIAS/punktfunk.zip"
echo "update manifest: $BASE/$ALIAS/manifest.json"
- name: Attach zip to the Gitea release (stable tags only)
if: startsWith(gitea.ref, 'refs/tags/v')
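
A minimal sketch of the hash check the manifest enables, assuming the one-line JSON shape written by the `printf` in the assemble step. Decky's real installer is not shown in this diff; `manifest_sha` and `verify_artifact` are hypothetical names, and the URL and version below are placeholders.

```shell
#!/bin/sh
# Pull the "sha256" value out of a one-line manifest.json (no JSON parser needed
# for this fixed shape).
manifest_sha() { sed -n 's/.*"sha256":"\([0-9a-f]*\)".*/\1/p' "$1"; }

# Succeeds only when the file's hash equals the one recorded in the manifest.
verify_artifact() {
  want=$(manifest_sha "$1")
  got=$(sha256sum "$2" | cut -d' ' -f1)
  [ -n "$want" ] && [ "$want" = "$got" ]
}

tmp=$(mktemp -d)
printf 'demo payload' > "$tmp/punktfunk.zip"
sha=$(sha256sum "$tmp/punktfunk.zip" | cut -d' ' -f1)
printf '{"version":"%s","artifact":"%s","sha256":"%s"}\n' \
  "0.3.1" "https://example.invalid/0.3.1/punktfunk.zip" "$sha" > "$tmp/manifest.json"

verify_artifact "$tmp/manifest.json" "$tmp/punktfunk.zip" && echo "hash ok"
# Re-uploading different bytes under the same URL is what breaks the check.
printf 'changed' >> "$tmp/punktfunk.zip"
verify_artifact "$tmp/manifest.json" "$tmp/punktfunk.zip" || echo "hash mismatch"
rm -rf "$tmp"
```

The second call shows why the manifest's `artifact` field has to name the immutable per-version URL: once the bytes behind a URL change, a previously published hash no longer matches.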
@@ -0,0 +1,67 @@
# Native Linux client screenshots for the app/marketing listings. The client renders
# host-free mock scenes (PUNKTFUNK_SHOT_SCENE) under a virtual X display; the driver
# (clients/linux/tools/screenshots.sh) grabs each one — no host, GPU, or Wayland. The
# Linux analogue of apple.yml's `screenshots` job, gated to STABLE RELEASE tags only.
# Standalone + best-effort: a failure here reds nothing else. PNGs land as a 30-day
# artifact; they are not committed or published.
name: linux-client-screenshots
on:
push:
tags: ["v*"]
workflow_dispatch:
jobs:
screenshots:
if: startsWith(github.ref, 'refs/tags/v') || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-24.04
# Same image as ci.yml/deb.yml — already carries the Rust toolchain + GTK/SDL build deps.
container:
image: git.unom.io/unom/punktfunk-rust-ci:latest
timeout-minutes: 90
steps:
- uses: actions/checkout@v4
# Client link deps (baked into the image; kept here so the job is green across image
# rebuilds — a no-op once present) PLUS the headless-render extras: a virtual X server,
# software GL+Vulkan (llvmpipe/lavapipe), the icon theme + fonts the UI draws with, and a
# root-window grab tool.
- name: Client link + headless-render deps
run: |
apt-get update
apt-get install -y --no-install-recommends \
libgtk-4-dev libadwaita-1-dev libsdl3-dev \
xvfb x11-utils imagemagick scrot \
libgl1-mesa-dri mesa-vulkan-drivers \
adwaita-icon-theme fonts-cantarell fonts-dejavu-core
# Reuse the workspace cargo caches (same keys as ci.yml/deb.yml).
- name: Cache keys
run: echo "rustc=$(rustc --version | cut -d' ' -f2)" >> "$GITHUB_ENV"
- uses: actions/cache@v4
with:
path: |
/usr/local/cargo/registry
/usr/local/cargo/git
key: cargo-home-${{ hashFiles('Cargo.lock') }}
restore-keys: cargo-home-
- uses: actions/cache@v4
with:
path: target
key: cargo-target-v3-${{ env.rustc }}-${{ hashFiles('Cargo.lock') }}
restore-keys: cargo-target-v3-${{ env.rustc }}-
- name: Build client
run: cargo build --release -p punktfunk-client-linux --locked
- name: Capture screenshots
run: bash clients/linux/tools/screenshots.sh
- name: Upload screenshots
if: always()
# v3: Gitea's API rejects upload-artifact@v4 (see apple.yml). Download is a zip.
uses: actions/upload-artifact@v3
with:
name: punktfunk-linux-client-screenshots
path: clients/linux/screenshots
retention-days: 30
+17
@@ -118,6 +118,23 @@ jobs:
"$RUSTUP" toolchain install nightly --profile minimal
"$RUSTUP" component add rust-src --toolchain nightly
# The in-core Opus decode (surround) pulls audiopus_sys, which builds a vendored static libopus
# via CMake — keep the xcframework self-contained (no runtime libopus.dylib on end-user devices).
- name: CMake (for the vendored libopus audiopus_sys builds)
run: |
# Runner steps run with `bash --noprofile --norc`, so Homebrew's bin dir isn't on PATH —
# locate brew explicitly, install cmake if missing, and export its bin dir to GITHUB_PATH so
# the xcframework build step (audiopus_sys → vendored libopus) finds `cmake`.
for B in /opt/homebrew/bin/brew /usr/local/bin/brew; do [ -x "$B" ] && BREW="$B" && break; done
if [ -z "$BREW" ]; then echo "::error::Homebrew not found on the runner"; exit 1; fi
BREW_BIN="$(dirname "$BREW")"; export PATH="$BREW_BIN:$PATH"
command -v cmake >/dev/null || "$BREW" install cmake
echo "$BREW_BIN" >> "$GITHUB_PATH"
# Homebrew's CMake 4 dropped compatibility with the vendored libopus's pre-3.5
# `cmake_minimum_required`; treat 3.5 as the policy minimum (the cmake crate's child cmake
# inherits this from the env during the xcframework build).
echo "CMAKE_POLICY_VERSION_MINIMUM=3.5" >> "$GITHUB_ENV"
- name: Build PunktfunkCore.xcframework (mac + iOS + tvOS)
# tvOS is a tier-3 target (nightly -Zbuild-std): slow on the first build, then cached on
# the self-hosted runner. Built on canary too so the tvOS archive/upload below runs on the
+53
@@ -0,0 +1,53 @@
# Management-console screenshots for the app/marketing listings. Captured from the
# built Storybook with headless Chromium (web/tools/screenshots.mjs) — the page
# stories render from fixtures, so no live mgmt API, login, or GPU is needed. This
# is the web analogue of apple.yml's `screenshots` job, but gated to STABLE RELEASE
# tags only (the console has no release workflow of its own — it ships inside the
# host packaging). Best-effort: a standalone workflow, so a failure here reds
# nothing else. PNGs land as a 30-day artifact; they are not committed or published.
name: web-screenshots
on:
push:
tags: ["v*"]
workflow_dispatch:
jobs:
screenshots:
if: startsWith(github.ref, 'refs/tags/v') || github.event_name == 'workflow_dispatch'
runs-on: ubuntu-24.04
container:
image: oven/bun:1
timeout-minutes: 30
defaults:
run:
working-directory: web
steps:
# oven/bun ships neither git nor a real node (the driver runs under node), and
# the slim Debian base lacks a CA bundle — without it actions/checkout's HTTPS
# fetch dies with "Problem with the SSL CA cert" (same as ci.yml's web job).
- name: Install git + node + CA certs
working-directory: /
run: apt-get update && apt-get install -y --no-install-recommends ca-certificates git nodejs
- uses: actions/checkout@v4
# --ignore-scripts skips the prepare→codegen hook (mirrors ci.yml); run codegen
# explicitly since build-storybook has no prebuild hook of its own.
- name: Install dependencies
run: bun install --frozen-lockfile --ignore-scripts
- name: Generate API client + i18n messages
run: bun run codegen
# Pulls the matching Chromium build + the apt libs it needs (root in-container).
- name: Install Chromium
run: bunx playwright install --with-deps chromium
- name: Build Storybook
run: bun run build-storybook
- name: Capture screenshots
run: bun run screenshots
- name: Upload screenshots
if: always()
# v3: Gitea's API rejects upload-artifact@v4 (see apple.yml). Download is a zip.
uses: actions/upload-artifact@v3
with:
name: punktfunk-web-console-screenshots
path: web/screenshots
retention-days: 30
+5 -5
@@ -76,7 +76,7 @@ jobs:
head "EWDK"
Write-Host ("EWDKROOT = " + ($env:EWDKROOT ?? '<unset>'))
head "LLVM / clang (README pins 21.1.2 for wdk-sys bindgen)"
head "LLVM / clang (bindgen 0.72 builds on the runner default clang)"
Write-Host ("LIBCLANG_PATH = " + ($env:LIBCLANG_PATH ?? '<unset>'))
$clang = Get-Command clang -ErrorAction SilentlyContinue
if ($clang) { & clang --version } else { Write-Host "clang: NOT on PATH" }
@@ -119,12 +119,12 @@ jobs:
env:
# wdk-build otherwise picks 10.0.28000.0 (no km/crt) and bindgen fails — pin the WDK SDK version.
Version_Number: '10.0.26100.0'
# wdk-sys bindgen layout tests overflow (E0080) on the runner's default LLVM (ToT/22-dev); point at
# the pinned LLVM 21.1.2 that windows-drivers-rs builds clean against (provisioned to C:\llvm-21).
LIBCLANG_PATH: 'C:\llvm-21\bin'
# No LIBCLANG_PATH pin: the vendored bindgen 0.72 builds clean on the runner's default clang 22
# (the shipping pack proves it). A 0.71-era layout-test overflow once needed LLVM 21; the 0.72 bump
# retired that — see design/windows-build-and-packaging.md.
steps:
- uses: actions/checkout@v4
- name: Ensure WDK + cargo-wdk + LLVM 21.1.2 (idempotent self-provision)
- name: Ensure WDK + cargo-wdk (idempotent self-provision)
# Run the provisioning script here too so driver-build is self-sufficient and never races a
# separate provision run on the single runner. Path is relative to the job working-directory
# (packaging/windows/drivers). Near-noop once the toolchain is present.
+1
@@ -13,6 +13,7 @@ clients/apple/PunktfunkCore.xcframework/
clients/apple/.swiftpm/
# Generated App Store screenshots (tools/screenshots.sh output; uploaded as a CI artifact)
clients/apple/screenshots/
clients/linux/screenshots/
# Xcode per-user state
xcuserdata/
+17 -1
@@ -346,7 +346,23 @@ FFI also link-needs `libGL`/`libgbm`/`libcuda` at build time). Env knobs: `PUNKT
`PUNKTFUNK_COMPOSITOR=kwin|gamescope|mutter`, `PUNKTFUNK_ZEROCOPY=1`, `PUNKTFUNK_GAMESCOPE_APP=...`,
`PUNKTFUNK_INPUT_BACKEND=...`, `PUNKTFUNK_PERF=1` (per-stage timing), `PUNKTFUNK_VIDEO_DROP=N` (FEC
test), `PUNKTFUNK_FEC_PCT=N`, `PUNKTFUNK_DSCP=1` (opt-in DSCP/SO_PRIORITY media QoS on the data +
GameStream video/audio sockets; no-op on the wire on Windows without a qWAVE policy).
GameStream video/audio sockets; no-op on the wire on Windows without a qWAVE policy),
`PUNKTFUNK_444=1` (full-chroma HEVC 4:4:4, see below).
**HEVC 4:4:4 (full chroma, Range Extensions)**: opt-in via `PUNKTFUNK_444`, negotiated like 10-bit —
the host emits 4:4:4 only when the client advertised `VIDEO_CAP_444` (wire bit `0x04` + ABI
`PUNKTFUNK_VIDEO_CAP_444`), the codec is HEVC, the session is single-process, **and** a GPU probe
(`encode::can_encode_444`, run before the Welcome) confirms support — else it resolves to 4:2:0 and
`Welcome::chroma_format` reflects the real value (honest downgrade; the client reads it via
`punktfunk_connection_chroma_format`). **punktfunk/1-native only** — GameStream/Moonlight stays 4:2:0
(stock clients can't decode 4:4:4). **NVENC is the implemented path**: Linux `hevc_nvenc` feeds a
swscale'd `yuv444p` (RGB-in is always 4:2:0 — verified on the RTX 5070 Ti — so the session forces CPU
RGB capture for 4:4:4); Windows NVENC keeps ARGB input + FREXT profile + `chromaFormatIDC=3` and the
DDA capturer delivers RGB. VAAPI / AMF / QSV **decline** (probe returns false — no validated 4:4:4
hardware in the lab; they'd produce 4:2:0). Software (openh264) is 4:2:0-only. Test with
`PUNKTFUNK_CLIENT_444=1 punktfunk-probe --out x.h265` then `ffprobe x.h265` (expect `pix_fmt yuv444p`).
*Linux NVENC mechanism validated on the RTX 5070 Ti (ffmpeg CLI); Windows NVENC + 10-bit-4:4:4 not yet
on-glass validated.*
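
The negotiation rule in the paragraph above boils down to an all-conditions-must-hold check. The sketch below is an illustration only, not the host's Rust code: `resolve_chroma` and its argument order are hypothetical, and only the `0x04` capability bit comes from the text.

```shell
#!/bin/sh
VIDEO_CAP_444=4   # wire bit 0x04, as stated in the text

# Args: host opt-in (1/0), client capability bits (decimal), codec name,
#       single-process session (1/0), GPU probe result (1/0).
# Prints 444 only when every condition holds, otherwise the 420 downgrade.
resolve_chroma() {
  optin=$1; caps=$2; codec=$3; single=$4; probe=$5
  if [ "$optin" -eq 1 ] \
    && [ $(( $caps & $VIDEO_CAP_444 )) -ne 0 ] \
    && [ "$codec" = "hevc" ] \
    && [ "$single" -eq 1 ] \
    && [ "$probe" -eq 1 ]; then
    echo 444
  else
    echo 420
  fi
}

resolve_chroma 1 4 hevc 1 1   # prints 444
resolve_chroma 1 4 hevc 1 0   # prints 420 (GPU probe declined)
```

Whatever value falls out is what the Welcome would report, so a client never has to guess whether its request was honoured.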
## Conventions
Generated
+1 -1
@@ -2828,6 +2828,7 @@ dependencies = [
"fec-rs",
"hmac",
"libc",
"opus",
"proptest",
"quinn",
"rand 0.9.4",
@@ -2855,7 +2856,6 @@ dependencies = [
"anyhow",
"ash",
"ashpd",
"audiopus_sys",
"axum",
"axum-server",
"base64",
+13 -9
@@ -1,8 +1,12 @@
# punktfunk
<p align="center">
<img src="assets/punktfunk-logo.svg" alt="punktfunk" width="320" />
</p>
**Low-latency desktop and game streaming with first-class Linux and Windows hosts.** Run the host on
a Linux machine or a Windows PC, connect from a Mac, PC, phone, tablet, or TV, and stream your desktop
or games — each device at its **own native resolution and refresh rate**, over your local network.
<p align="center"><b>Low-latency desktop and game streaming with first-class Linux and Windows hosts.</b></p>
Run the host on a Linux machine or a Windows PC, connect from a Mac, PC, phone, tablet, or TV, and
stream your desktop or games — each device at its **own native resolution and refresh rate**, over
your local network.
📖 **Documentation: [docs.punktfunk.unom.io](https://docs.punktfunk.unom.io)** — start with
[How It Works](https://docs.punktfunk.unom.io/docs/how-it-works) or the
@@ -43,7 +47,7 @@ protocol, FEC, and crypto, linked into the host and every client over a stable C
| **Core** `punktfunk-core` + C ABI (protocol · FEC · crypto · QUIC) | ✅ Complete & hardened |
| **GameStream host** → stock Moonlight | ✅ Live end-to-end: pairing, RTSP, audio, per-client virtual output at native resolution, GPU zero-copy NVENC, gamepads |
| **Native protocol** `punktfunk/1` | ✅ Validated live: QUIC control + GF(2¹⁶) FEC/AES-GCM data plane, PIN pairing, mDNS discovery, mid-stream mode renegotiation |
| **Windows host** (NVIDIA, x64) | 🟡 Implemented & shipping as a signed installer (DXGI capture · SudoVDA virtual display · NVENC · WASAPI · ViGEm); NVIDIA-only, newer than the Linux host |
| **Windows host** (x64) | 🟡 Implemented & shipping as a signed installer: DXGI/WGC capture · its own all-Rust IddCx **virtual display** (secure-desktop capable) · GPU encode (NVENC on NVIDIA, AMF/QSV on AMD/Intel) · WASAPI audio · bundled virtual-gamepad drivers (no ViGEmBus) · HDR incl. Vulkan-game HDR. NVIDIA live-validated; AMD/Intel CI-green |
| **macOS / iOS / tvOS client** (`clients/apple`) | ✅ Streaming live: VideoToolbox decode, controllers incl. DualSense, discovery, pairing, speed test |
| **Linux client** (`clients/linux`, GTK4) | ✅ Streaming live: FFmpeg + VAAPI zero-copy decode, PipeWire audio, SDL3 controllers; ships as Flatpak/apt/rpm/Arch |
| **Android client** (`clients/android`, phone + TV) | ✅ Streaming live: AMediaCodec decode + HDR10, Oboe audio, controllers, discovery, pairing |
@@ -69,14 +73,14 @@ roadmap: **[/docs/roadmap](https://docs.punktfunk.unom.io/docs/roadmap)**.
Pick your platform and install from its package registry — the per-platform guide covers adding the
repo, first run, and the web console. The Linux host is the primary, most battle-tested path; a
Windows host (NVIDIA-only) also ships as a signed installer.
Windows host also ships as a signed installer (all-vendor: NVIDIA, AMD, Intel).
| Platform | Install | Guide |
|--------|---------|-------|
| **Ubuntu / Debian** (apt) | `sudo apt install punktfunk-host` *(after adding the repo)* | [Ubuntu — GNOME](https://docs.punktfunk.unom.io/docs/ubuntu-gnome) · [KDE](https://docs.punktfunk.unom.io/docs/ubuntu-kde) |
| **Fedora / Bazzite** (rpm-ostree) | `rpm-ostree install punktfunk punktfunk-web` *(or the bootc image)* | [Fedora — KDE](https://docs.punktfunk.unom.io/docs/fedora-kde) · [Bazzite](https://docs.punktfunk.unom.io/docs/bazzite) |
| **Arch / Steam Deck** (PKGBUILD / sysext) | `makepkg -si` *(Arch)* · sysext `.raw` *(SteamOS)* | [packaging/arch](packaging/arch/README.md) |
| **Windows** (NVIDIA, x64) | signed `setup.exe` from the package registry | [Windows Host](https://docs.punktfunk.unom.io/docs/windows-host) |
| **Windows** (x64) | signed `setup.exe` from the package registry | [Windows Host](https://docs.punktfunk.unom.io/docs/windows-host) |
`punktfunk-host` is the streaming host; `punktfunk-web` is the browser console (pairing + status).
After install, run `punktfunk-host serve` inside your desktop session (the secure native default;
@@ -121,7 +125,7 @@ and the [docs site](https://docs.punktfunk.unom.io).
```
crates/
punktfunk-core/ protocol · FEC · pacing · crypto · QUIC control plane — the C ABI (lib + cdylib + staticlib)
punktfunk-host/ Linux host: virtual displays · capture · encode · input · GameStream · punktfunk/1 · mgmt
punktfunk-host/ the host (Linux + Windows): virtual displays · capture · encode · input · GameStream · punktfunk/1 · mgmt
clients/
apple/ macOS / iOS / tvOS app (Swift · VideoToolbox · Metal · GameController)
linux/ Linux desktop app (Rust · GTK4/libadwaita · FFmpeg/VAAPI · PipeWire · SDL3)
@@ -132,7 +136,7 @@ clients/
web/ web console (TanStack) over the management API — status · devices · pairing
packaging/ apt · rpm / COPR · Arch · Flatpak · Bazzite bootc image
docs-site/ public documentation site (Fumadocs) — https://docs.punktfunk.unom.io
design/ design notes & deep-dive plans
design/ design notes & deep-dive plans (index: design/README.md)
include/punktfunk_core.h cbindgen-generated C header (checked in)
tools/ latency-probe · loss-harness (measurement)
```
+33
@@ -0,0 +1,33 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<svg width="100%" height="100%" viewBox="0 0 579 298" version="1.1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" xml:space="preserve" style="fill-rule:evenodd;clip-rule:evenodd;stroke-linejoin:round;stroke-miterlimit:2;">
<style>
/* Theme-adaptive so the logo stays readable on both light and dark README
backgrounds: deep violet (the brand-mark palette) on light, the original
light violet on dark. Evaluated by the viewer's color scheme. */
.pf-wm { fill: #6c5bf3; }
.pf-back { fill: #a79ff8; }
.pf-deep { fill: #6c5bf3; }
@media (prefers-color-scheme: dark) {
.pf-wm { fill: #cec9fb; }
.pf-back { fill: #f2f1fe; }
.pf-deep { fill: #8c7ef5; }
}
</style>
<g>
<g>
<path class="pf-wm" style="fill-rule:nonzero;" d="M21.144,176.635l0,102.687l31.253,0l0,-35.563l73.436,0l0,-23.555l-73.436,0l0,-19.398l77.285,0l0,-24.171l-108.537,0Z"/>
<path class="pf-wm" style="fill-rule:nonzero;" d="M136.148,176.635l0,47.264c0.154,16.627 0.154,16.627 0.308,20.014c0.77,15.087 2.463,21.4 7.544,26.634c7.698,8.16 20.014,10.315 59.272,10.315c23.863,0 34.178,-0.616 43.415,-2.463c11.7,-2.463 19.552,-10.623 21.246,-22.323c0.924,-7.236 1.078,-8.929 1.54,-32.176l0,-47.264l-31.253,0l0,47.264c0,2.155 -0.154,7.082 -0.308,10.623c-0.462,9.699 -1.232,12.47 -3.695,15.087c-3.387,3.695 -9.853,4.619 -31.407,4.619c-26.634,0 -32.638,-1.693 -34.332,-9.853c-0.77,-4.157 -0.77,-4.311 -1.078,-20.476l0,-47.264l-31.253,0Z"/>
<path class="pf-wm" style="fill-rule:nonzero;" d="M275.938,176.527l0,102.687l31.868,0l-0.77,-76.669l3.387,0l54.038,76.669l54.346,0l0,-102.687l-31.868,0l0.77,76.515l-3.233,0l-53.73,-76.515l-54.808,0Z"/>
<path class="pf-wm" style="fill-rule:nonzero;" d="M425.273,176.527l0,102.687l31.253,0l0,-39.258l17.089,0l46.032,39.258l47.418,0l-64.353,-52.344l59.426,-50.959l-47.88,0l-40.644,37.873l-17.089,0l0,-37.257l-31.253,0Z"/>
</g>
<path class="pf-back" style="fill-rule:nonzero;" d="M65.442,150.143c24.514,0 44.298,-19.784 44.298,-44.298c0,-24.514 -19.784,-44.298 -44.298,-44.298c-24.514,0 -44.298,19.784 -44.298,44.298c0,24.514 19.784,44.298 44.298,44.298Z"/>
<path class="pf-deep" style="fill-rule:nonzero;" d="M141.063,92.871c17.334,-17.334 17.334,-45.312 0,-62.647c-17.334,-17.334 -45.312,-17.334 -62.647,-0c-17.334,17.334 -17.334,45.312 0,62.647c17.334,17.334 45.312,17.334 62.647,-0Z"/>
<path style="fill:url(#_Linear1);" d="M121.228,104.359c-14.777,3.965 -31.187,0.136 -42.811,-11.488c-11.624,-11.624 -15.453,-28.034 -11.488,-42.811c14.777,-3.965 31.187,-0.136 42.811,11.488c11.624,11.624 15.453,28.034 11.488,42.811Z"/>
</g>
<defs>
<linearGradient id="_Linear1" x1="0" y1="0" x2="1" y2="0" gradientUnits="userSpaceOnUse" gradientTransform="matrix(31.323323,-31.323323,31.323323,31.323323,78.416832,92.870811)">
<stop offset="0" style="stop-color:#cec9fb;stop-opacity:0"/>
<stop offset="1" style="stop-color:#fcfcff;stop-opacity:1"/>
</linearGradient>
</defs>
</svg>

+21
@@ -62,6 +62,10 @@ android {
buildFeatures { compose = true }
// Roborazzi/Robolectric render Compose on the host JVM (the CI screenshot harness) and need the
// merged Android resources + the app's manifest/theme available to the unit tests.
testOptions { unitTests { isIncludeAndroidResources = true } }
compileOptions {
sourceCompatibility = JavaVersion.VERSION_21
targetCompatibility = JavaVersion.VERSION_21
@@ -99,4 +103,21 @@ dependencies {
// Android TV components (we target phone + TV) land in the TV-UI milestone:
// implementation("androidx.tv:tv-material:1.1.0")
// The manifest already declares leanback so the scaffold installs on TV.
// --- CI screenshot harness (Roborazzi on the JVM via Robolectric — no emulator/GPU). The
// screenshot tests render the real Compose UI with mock state; never load the JNI core, so the
// job runs `:app:testDebugUnitTest -PskipRustBuild` (see kit/build.gradle.kts). ---
testImplementation(composeBom)
testImplementation("androidx.compose.ui:ui-test-junit4")
debugImplementation("androidx.compose.ui:ui-test-manifest") // the ComponentActivity test host
testImplementation("junit:junit:4.13.2")
testImplementation("org.robolectric:robolectric:4.16.1")
testImplementation("io.github.takahirom.roborazzi:roborazzi:1.64.0")
testImplementation("io.github.takahirom.roborazzi:roborazzi-compose:1.64.0")
}
// Record (write) the screenshots when the unit tests run. These tests exist to GENERATE marketing
// images, not to diff goldens, so always capture rather than verify.
tasks.withType<Test>().configureEach {
systemProperty("roborazzi.test.record", "true")
}
@@ -163,7 +163,7 @@ fun ConnectScreen(settings: Settings, onConnected: (Long) -> Unit) {
targetHost, targetPort, w, h, hz,
id.certPem, id.privateKeyPem, pinHex ?: "",
settings.bitrateKbps, settings.compositor, gamepadPref,
hdrEnabled,
hdrEnabled, settings.audioChannels,
)
}
connecting = false
@@ -16,9 +16,18 @@ data class Settings(
val bitrateKbps: Int = 0,
val compositor: Int = 0,
val gamepad: Int = 0,
/** Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
* can capture; the resolved count drives the decoder + AAudio layout. */
val audioChannels: Int = 2,
val micEnabled: Boolean = false,
/** Show the live stats overlay (FPS / throughput / latency) during a stream. */
val statsHudEnabled: Boolean = true,
/**
* Touch input model. `true` (default) = trackpad: the cursor stays put on touch-down and moves
* by the finger's relative delta (swipe to nudge, lift and re-swipe to walk it across), tap to
* click where it is. `false` = direct pointing: the cursor jumps to the finger (the old behaviour).
*/
val trackpadMode: Boolean = true,
)
/** Loads/saves [Settings] in the app-private `punktfunk_settings` prefs. */
@@ -33,8 +42,10 @@ class SettingsStore(context: Context) {
bitrateKbps = prefs.getInt(K_BITRATE, 0),
compositor = prefs.getInt(K_COMPOSITOR, 0),
gamepad = prefs.getInt(K_GAMEPAD, 0),
audioChannels = prefs.getInt(K_AUDIO_CH, 2),
micEnabled = prefs.getBoolean(K_MIC, false),
statsHudEnabled = prefs.getBoolean(K_HUD, true),
trackpadMode = prefs.getBoolean(K_TRACKPAD, true),
)
fun save(s: Settings) {
@@ -45,8 +56,10 @@ class SettingsStore(context: Context) {
.putInt(K_BITRATE, s.bitrateKbps)
.putInt(K_COMPOSITOR, s.compositor)
.putInt(K_GAMEPAD, s.gamepad)
.putInt(K_AUDIO_CH, s.audioChannels)
.putBoolean(K_MIC, s.micEnabled)
.putBoolean(K_HUD, s.statsHudEnabled)
.putBoolean(K_TRACKPAD, s.trackpadMode)
.apply()
}
@@ -57,8 +70,10 @@ class SettingsStore(context: Context) {
const val K_BITRATE = "bitrate_kbps"
const val K_COMPOSITOR = "compositor"
const val K_GAMEPAD = "gamepad"
const val K_AUDIO_CH = "audio_channels"
const val K_MIC = "mic_enabled"
const val K_HUD = "stats_hud_enabled"
const val K_TRACKPAD = "trackpad_mode"
}
}
@@ -124,6 +139,13 @@ val REFRESH_OPTIONS = listOf(
240 to "240 Hz",
)
/** (channel count, label). 2 = stereo (default), 6 = 5.1, 8 = 7.1. */
val AUDIO_CHANNEL_OPTIONS = listOf(
2 to "Stereo",
6 to "5.1 Surround",
8 to "7.1 Surround",
)
/** (kbps, label). `0` = host default. */
val BITRATE_OPTIONS = listOf(
0 to "Automatic",
@@ -104,6 +104,12 @@ fun SettingsScreen(initial: Settings, onChange: (Settings) -> Unit, onBack: () -
}
SettingsGroup("Audio") {
SettingDropdown(
label = "Audio channels",
options = AUDIO_CHANNEL_OPTIONS,
selected = s.audioChannels,
) { ch -> update(s.copy(audioChannels = ch)) }
ToggleRow(
title = "Microphone",
subtitle = "Send your mic to the host's virtual microphone",
@@ -119,6 +125,16 @@ fun SettingsScreen(initial: Settings, onChange: (Settings) -> Unit, onBack: () -
)
}
SettingsGroup("Pointer") {
ToggleRow(
title = "Trackpad mode",
subtitle = "Relative cursor like a laptop touchpad — swipe to nudge, tap to click. " +
"Off = the cursor jumps to your finger.",
checked = s.trackpadMode,
onCheckedChange = { on -> update(s.copy(trackpadMode = on)) },
)
}
SettingsGroup("Overlay") {
ToggleRow(
title = "Stats overlay",
@@ -41,6 +41,7 @@ import io.unom.punktfunk.kit.NativeBridge
import java.util.concurrent.atomic.AtomicBoolean
import kotlinx.coroutines.delay
import kotlin.math.abs
import kotlin.math.hypot
import kotlin.math.roundToInt
// Touch-gesture tuning (px / ms). TAP_SLOP: movement under this still counts as a tap, not a drag.
@@ -50,6 +51,15 @@ private const val TAP_SLOP = 12f
private const val TAP_DRAG_MS = 250L
private const val SCROLL_DIV = 4f
// Trackpad-mode pointer ballistics (relative one-finger motion). POINTER_SENS: base finger-px →
// host-px gain (~1:1, never twitchy). The rest is mild acceleration so a flick crosses the screen
// while a slow drag stays precise: above ACCEL_SPEED_FLOOR px/ms the gain ramps by ACCEL_GAIN per
// px/ms, capped at ACCEL_MAX (so a fast swipe can't fling the cursor uncontrollably).
private const val POINTER_SENS = 1.3f
private const val ACCEL_GAIN = 0.6f
private const val ACCEL_SPEED_FLOOR = 0.3f
private const val ACCEL_MAX = 3.0f
@Composable
fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
val context = LocalContext.current
@@ -68,8 +78,11 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
// Live decode stats for the HUD. Poll once a second for the whole stream (cheap, and each call
// drains+resets the native window so it never grows unbounded even while the overlay is hidden);
// `showStats` only gates rendering. A 3-finger tap toggles it live; the default comes from Settings.
val initialSettings = remember { SettingsStore(context).load() }
var stats by remember { mutableStateOf<DoubleArray?>(null) }
var showStats by remember { mutableStateOf(SettingsStore(context).load().statsHudEnabled) }
var showStats by remember { mutableStateOf(initialSettings.statsHudEnabled) }
// Touch model is fixed per session (re-keys the gesture handler below if it ever changes).
val trackpad = initialSettings.trackpadMode
LaunchedEffect(handle) {
while (true) {
delay(1000)
@@ -145,13 +158,18 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
if (showStats) {
stats?.let { StatsOverlay(it, Modifier.align(Alignment.TopStart).padding(12.dp)) }
}
// Touch → mouse, absolute "direct pointing" like the Apple client: the host cursor follows
// your finger (MouseMoveAbs, host-normalized against the overlay size — which fills the video,
// so finger position maps straight onto the remote screen). Gestures: tap = left click;
// two-finger tap = right click; two-finger drag = scroll; tap-then-press-and-drag = left-drag
// (text selection / moving windows); three-finger tap = toggle the stats HUD.
// Touch → mouse. Two models, chosen by the Trackpad-mode setting:
// • trackpad (default): the cursor STAYS where it is on touch-down and moves by the finger's
// relative delta (MouseMove) with mild pointer acceleration — swipe to nudge, lift and
// re-swipe to walk it across, tap to click where it is. This is what makes the cursor
// reachable on a small screen.
// • direct (opt-out): the cursor jumps to the finger and follows it (MouseMoveAbs,
// host-normalized against the overlay size), the old "direct pointing" behaviour.
// Both share the same gesture vocabulary: tap = left click; two-finger tap = right click;
// two-finger drag = scroll; tap-then-press-and-drag = left-drag (text selection / moving
// windows); three-finger tap = toggle the stats HUD.
Box(
Modifier.fillMaxSize().pointerInput(handle) {
Modifier.fillMaxSize().pointerInput(handle, trackpad) {
var lastTapUp = 0L
var lastTapX = 0f
var lastTapY = 0f
@@ -176,7 +194,9 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
val isDrag = down.uptimeMillis - lastTapUp < TAP_DRAG_MS &&
abs(startX - lastTapX) < TAP_SLOP && abs(startY - lastTapY) < TAP_SLOP
lastTapUp = 0L // consume the arming either way
moveAbs(startX, startY) // cursor jumps to the finger immediately
// Direct mode jumps the cursor to the finger; trackpad mode leaves it put (the
// whole point — you nudge it with swipes instead).
if (!trackpad) moveAbs(startX, startY)
if (isDrag) NativeBridge.nativeSendPointerButton(handle, 1, true)
var moved = false
@@ -185,6 +205,14 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
var prevCx = startX
var prevCy = startY
var upTime = down.uptimeMillis
// Trackpad relative-motion state: the tracked finger, its last position/time, and
// the sub-pixel remainder so a slow drag isn't lost to Int truncation.
var trackId = down.id
var prevX = startX
var prevY = startY
var prevT = down.uptimeMillis
var accX = 0f
var accY = 0f
while (true) {
val ev = awaitPointerEvent()
@@ -217,15 +245,46 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
moved = true
}
} else if (!scrolling) {
// One finger → the cursor follows it (skipped once a gesture turned into
// a scroll, so dropping back to one finger doesn't jerk the cursor).
// One finger (skipped once a gesture turned into a scroll, so dropping
// back to one finger doesn't jerk the cursor).
val p = pressed.firstOrNull { it.id == down.id } ?: pressed.first()
if (abs(p.position.x - startX) > TAP_SLOP ||
abs(p.position.y - startY) > TAP_SLOP
) {
moved = true
}
moveAbs(p.position.x, p.position.y)
if (trackpad) {
// Relative: move by the finger delta × (sensitivity × acceleration),
// carrying the sub-pixel remainder. Re-anchor (zero delta this frame)
// if the tracked finger changed, so lifting one of several fingers
// never jumps the cursor.
if (p.id != trackId) {
trackId = p.id
prevX = p.position.x
prevY = p.position.y
prevT = p.uptimeMillis
}
val dx = p.position.x - prevX
val dy = p.position.y - prevY
val dt = (p.uptimeMillis - prevT).coerceAtLeast(1L)
prevX = p.position.x
prevY = p.position.y
prevT = p.uptimeMillis
val speed = hypot(dx, dy) / dt // finger px per ms
val accel = (1f + ACCEL_GAIN * (speed - ACCEL_SPEED_FLOOR).coerceAtLeast(0f))
.coerceAtMost(ACCEL_MAX)
accX += dx * POINTER_SENS * accel
accY += dy * POINTER_SENS * accel
val outX = accX.toInt() // truncates toward zero → remainder kept w/ sign
val outY = accY.toInt()
if (outX != 0 || outY != 0) {
NativeBridge.nativeSendPointerMove(handle, outX, outY)
accX -= outX
accY -= outY
}
} else {
moveAbs(p.position.x, p.position.y) // direct: cursor follows the finger
}
}
ev.changes.forEach { it.consume() }
}
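The relative-motion arithmetic in the hunk above (gain, capped acceleration, sub-pixel remainder) can be checked in isolation. The following is a minimal sketch transliterated to Rust so it runs without Compose or the JNI bridge; it is not part of the patch. The constants are copied from the hunk, while `Ballistics` and `step` are hypothetical names introduced only for illustration.

```rust
// Constants copied from the Kotlin hunk above.
const POINTER_SENS: f32 = 1.3;
const ACCEL_GAIN: f32 = 0.6;
const ACCEL_SPEED_FLOOR: f32 = 0.3;
const ACCEL_MAX: f32 = 3.0;

/// Hypothetical holder for the sub-pixel remainder (`accX` / `accY` in the patch).
#[derive(Default)]
struct Ballistics {
    acc_x: f32,
    acc_y: f32,
}

impl Ballistics {
    /// Feed one finger delta (px) measured over `dt_ms`; returns the whole host
    /// pixels to send, or `None` while the accumulated motion is still sub-pixel.
    fn step(&mut self, dx: f32, dy: f32, dt_ms: i64) -> Option<(i32, i32)> {
        let dt = dt_ms.max(1) as f32; // same guard as coerceAtLeast(1L)
        let speed = dx.hypot(dy) / dt; // finger px per ms
        let accel =
            (1.0 + ACCEL_GAIN * (speed - ACCEL_SPEED_FLOOR).max(0.0)).min(ACCEL_MAX);
        self.acc_x += dx * POINTER_SENS * accel;
        self.acc_y += dy * POINTER_SENS * accel;
        // `as i32` truncates toward zero, so the remainder keeps its sign.
        let out_x = self.acc_x as i32;
        let out_y = self.acc_y as i32;
        if out_x == 0 && out_y == 0 {
            return None;
        }
        self.acc_x -= out_x as f32;
        self.acc_y -= out_y as f32;
        Some((out_x, out_y))
    }
}
```

Two slow 0.5 px steps produce nothing on the first call and a single 1 px move on the second, which is the point of carrying the remainder: without it, a slow drag would truncate to zero every frame and the cursor would never move.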
@@ -239,7 +298,7 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
NativeBridge.nativeSendPointerButton(handle, 3, true)
NativeBridge.nativeSendPointerButton(handle, 3, false)
}
else -> { // tap → left click, and arm tap-and-drag
else -> { // tap → left click (at the cursor's current spot), arm tap-drag
NativeBridge.nativeSendPointerButton(handle, 1, true)
NativeBridge.nativeSendPointerButton(handle, 1, false)
lastTapUp = upTime
@@ -260,7 +319,7 @@ fun StreamScreen(handle: Long, micEnabled: Boolean, onDisconnect: () -> Unit) {
* `[fps, mbps, latP50Ms, latP95Ms, latValid, skew, w, h, hz, dropped]`.
*/
@Composable
private fun StatsOverlay(s: DoubleArray, modifier: Modifier = Modifier) {
internal fun StatsOverlay(s: DoubleArray, modifier: Modifier = Modifier) {
if (s.size < 10) return
val w = s[6].toInt()
val h = s[7].toInt()
@@ -10,7 +10,9 @@ import androidx.compose.ui.platform.LocalContext
// punktfunk brand violets (from the app icon: #6C5BF3 / #A79FF8 / #D2C9FB on a #16132A indigo).
// Used as the fallback dark scheme on pre-Android-12 devices; on 12+ we defer to Material You.
private val BrandDark = darkColorScheme(
// `internal` (not private) so the CI screenshot tests can force the deterministic brand palette —
// Material You dynamic colour has no wallpaper to seed from under the Robolectric JVM renderer.
internal val BrandDark = darkColorScheme(
primary = Color(0xFFA79FF8),
onPrimary = Color(0xFF1B1442),
primaryContainer = Color(0xFF4C3FB3),
@@ -0,0 +1,74 @@
package io.unom.punktfunk.screenshots
import androidx.activity.ComponentActivity
import androidx.compose.ui.test.junit4.createAndroidComposeRule
import androidx.compose.ui.test.onRoot
import com.github.takahirom.roborazzi.captureRoboImage
import com.github.takahirom.roborazzi.captureScreenRoboImage
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith
import org.robolectric.RobolectricTestRunner
import org.robolectric.annotation.Config
import org.robolectric.annotation.GraphicsMode
/**
* App-store / marketing screenshots of the native Android client, rendered on the JVM by Roborazzi
* (Robolectric Native Graphics) — no emulator, GPU, host, or JNI core. The scenes (ShotScenes.kt)
* render the REAL Compose UI with mock state.
*
* `sdk = [36]` is mandatory: Robolectric ships android-all jars only up to API 36 (Android 16), and
* the app's compileSdk is 37. PNGs land in build/outputs/roborazzi/.
*/
@RunWith(RobolectricTestRunner::class)
@GraphicsMode(GraphicsMode.Mode.NATIVE)
@Config(sdk = [36], qualifiers = "w360dp-h800dp-xxhdpi")
class ScreenshotTest {
@get:Rule
val compose = createAndroidComposeRule<ComponentActivity>()
private val out = "build/outputs/roborazzi"
// Pausing the animation clock before composing (then advancing once past the entrance animation
// and freezing) is what makes a text-field-bearing scene capturable: a focused field blinks its
// cursor via an infinite animation that otherwise keeps Compose perpetually "busy", so
// setContent's wait-for-idle never returns. Frozen, the capture is also deterministic.
/** Full-screen content scenes: the compose root fills the device, so a root capture is the shot. */
private fun shootRoot(name: String, content: @androidx.compose.runtime.Composable () -> Unit) {
compose.mainClock.autoAdvance = false
compose.setContent { ShotTheme(content) }
compose.mainClock.advanceTimeBy(800)
compose.onRoot().captureRoboImage("$out/phone-$name.png")
}
/** Dialog scenes: the AlertDialog is a separate window, so capture the whole screen (all windows). */
private fun shootScreen(name: String, content: @androidx.compose.runtime.Composable () -> Unit) {
compose.mainClock.autoAdvance = false
compose.setContent { ShotTheme(content) }
compose.mainClock.advanceTimeBy(800)
captureScreenRoboImage("$out/phone-$name.png")
}
@Test
fun hosts() = shootRoot("hosts") { HostsScene() }
@Test
fun settings() = shootRoot("settings") { SettingsScene() }
@Test
@Config(sdk = [36], qualifiers = "w800dp-h360dp-xxhdpi") // landscape — the stream is immersive
fun stream() = shootRoot("stream") { StreamScene() }
@Test
fun trust() = shootScreen("trust") {
HostsScene()
TrustDialog()
}
@Test
fun pair() = shootScreen("pair") {
HostsScene()
PairDialog()
}
}
@@ -0,0 +1,195 @@
package io.unom.punktfunk.screenshots
import androidx.compose.foundation.background
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Box
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Spacer
import androidx.compose.foundation.layout.fillMaxSize
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.height
import androidx.compose.foundation.layout.padding
import androidx.compose.foundation.lazy.grid.GridCells
import androidx.compose.foundation.lazy.grid.GridItemSpan
import androidx.compose.foundation.lazy.grid.LazyVerticalGrid
import androidx.compose.foundation.lazy.grid.items
import androidx.compose.material3.AlertDialog
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Surface
import androidx.compose.material3.Text
import androidx.compose.material3.TextButton
import androidx.compose.runtime.Composable
import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.graphics.Brush
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import io.unom.punktfunk.BrandDark
import io.unom.punktfunk.Settings
import io.unom.punktfunk.SettingsScreen
import io.unom.punktfunk.StatsOverlay
import io.unom.punktfunk.components.HostCard
import io.unom.punktfunk.components.SectionLabel
import io.unom.punktfunk.models.HostStatus
// The CI screenshot scenes: the REAL app composables, fed embedded mock state, under the forced
// brand palette (Material You has no wallpaper to seed from on the JVM). The stream-video surface
// and ConnectScreen/App are intentionally absent — they require the live JNI core / a session.
/** Forces the deterministic punktfunk brand scheme (see Theme.kt) instead of dynamic colour. */
@Composable
internal fun ShotTheme(content: @Composable () -> Unit) {
MaterialTheme(colorScheme = BrandDark, content = content)
}
private data class MockHost(val name: String, val address: String, val status: HostStatus)
private val SAVED = listOf(
MockHost("Living Room PC", "192.168.1.42:9777", HostStatus.PAIRED),
MockHost("Office", "192.168.1.50:9777", HostStatus.TOFU),
)
private val DISCOVERED = listOf(
MockHost("studio-deck", "192.168.1.61:9777", HostStatus.PAIRING),
MockHost("HTPC", "192.168.1.70:9777", HostStatus.TOFU),
)
/** The connect screen's host grid, reconstructed from the real HostCard/SectionLabel components. */
@Composable
internal fun HostsScene() {
Surface(Modifier.fillMaxSize(), color = MaterialTheme.colorScheme.background) {
LazyVerticalGrid(
columns = GridCells.Adaptive(minSize = 160.dp),
modifier = Modifier.fillMaxSize(),
contentPadding = androidx.compose.foundation.layout.PaddingValues(16.dp),
horizontalArrangement = Arrangement.spacedBy(8.dp),
verticalArrangement = Arrangement.spacedBy(8.dp),
) {
item(span = { GridItemSpan(maxLineSpan) }) {
Column(
horizontalAlignment = Alignment.CenterHorizontally,
modifier = Modifier.fillMaxWidth(),
) {
Spacer(Modifier.height(8.dp))
Text("Punktfunk", style = MaterialTheme.typography.headlineLarge)
Text(
"stream a remote desktop",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant,
)
Spacer(Modifier.height(24.dp))
}
}
item(span = { GridItemSpan(maxLineSpan) }) { SectionLabel("Saved hosts") }
items(SAVED) { h ->
HostCard(h.name, h.address, h.status, enabled = true, onConnect = {}, onForget = {}, onRename = {})
}
item(span = { GridItemSpan(maxLineSpan) }) {
Spacer(Modifier.height(12.dp))
SectionLabel("Discovered on the network")
}
items(DISCOVERED) { h ->
HostCard(h.name, h.address, h.status, enabled = true, onConnect = {}, onForget = null)
}
}
}
}
/** The real SettingsScreen, fed a representative non-default Settings. */
@Composable
internal fun SettingsScene() {
Surface(Modifier.fillMaxSize(), color = MaterialTheme.colorScheme.background) {
SettingsScreen(
initial = Settings(
width = 1920,
height = 1080,
hz = 120,
bitrateKbps = 50_000,
compositor = 1,
gamepad = 2,
micEnabled = true,
statsHudEnabled = true,
trackpadMode = true,
),
onChange = {},
onBack = {},
)
}
}
/** The real TOFU AlertDialog (mirrors ConnectScreen's PendingTrust.Kind.TRUST_NEW), shown over the host grid. */
@Composable
internal fun TrustDialog() {
AlertDialog(
onDismissRequest = {},
title = { Text("Trust this host?") },
text = {
Column {
Text("First connection to 192.168.1.61:9777.")
Text("Fingerprint 9f8e7d6c5b4a3928…")
Text(
"This host allows trust-on-first-use, but that can't tell an impostor " +
"from the real host. Pairing with a PIN is stronger — it proves both sides.",
)
}
},
confirmButton = { TextButton({}) { Text("Trust (TOFU)") } },
dismissButton = { TextButton({}) { Text("Pair with PIN…") } },
)
}
/** The PIN-pairing AlertDialog (mirrors ConnectScreen's PendingTrust.Kind.PAIR). The live screen
* uses OutlinedTextFields, but a TextField inside a Dialog window never reaches idle under
* Robolectric (its focus/cursor machinery animates forever) — so the PIN is shown as a static
* display here, which also reads better in a marketing shot. */
@Composable
internal fun PairDialog() {
AlertDialog(
onDismissRequest = {},
title = { Text("Pair with PIN") },
text = {
Column {
Text("Enter the 4-digit PIN shown on the host.")
Spacer(Modifier.height(16.dp))
Surface(
color = MaterialTheme.colorScheme.surfaceVariant,
shape = MaterialTheme.shapes.medium,
modifier = Modifier.fillMaxWidth(),
) {
Text(
"4 8 2 7",
style = MaterialTheme.typography.headlineMedium,
textAlign = TextAlign.Center,
modifier = Modifier.fillMaxWidth().padding(vertical = 16.dp),
)
}
Spacer(Modifier.height(12.dp))
Text(
"This device: Pixel 9 Pro",
style = MaterialTheme.typography.bodyMedium,
color = MaterialTheme.colorScheme.onSurfaceVariant,
)
}
},
confirmButton = { TextButton({}) { Text("Pair") } },
dismissButton = { TextButton({}) { Text("Cancel") } },
)
}
/** The live stats HUD (the real StatsOverlay) over a synthetic "streamed frame" gradient. */
@Composable
internal fun StreamScene() {
Box(
Modifier
.fillMaxSize()
.background(
Brush.linearGradient(listOf(Color(0xFF2A1E5C), Color(0xFF0E1B3D), Color(0xFF06122B))),
),
) {
// [fps, mbps, latP50, latP95, latValid, skew, w, h, hz, dropped]
StatsOverlay(
doubleArrayOf(238.0, 921.4, 1.3, 2.1, 1.0, 1.0, 5120.0, 1440.0, 240.0, 0.0),
Modifier.align(Alignment.TopStart).padding(12.dp),
)
}
}
+8 -2
@@ -99,6 +99,12 @@ val cargoNdkDebug = registerCargoNdk("cargoNdkDebug", release = false)
val cargoNdkRelease = registerCargoNdk("cargoNdkRelease", release = true)
afterEvaluate {
tasks.named("preDebugBuild").configure { dependsOn(cargoNdkDebug) }
tasks.named("preReleaseBuild").configure { dependsOn(cargoNdkRelease) }
// `-PskipRustBuild` skips the cargo-ndk native build — for JVM-only tasks (the Roborazzi
// screenshot unit tests render Compose on the JVM and never load libpunktfunk_android.so), so
// CI/local screenshot runs don't need the Rust toolchain or NDK. The native build stays wired
// for every normal APK/AAR build.
if (!project.hasProperty("skipRustBuild")) {
tasks.named("preDebugBuild").configure { dependsOn(cargoNdkDebug) }
tasks.named("preReleaseBuild").configure { dependsOn(cargoNdkRelease) }
}
}
@@ -45,6 +45,7 @@ object NativeBridge {
compositorPref: Int,
gamepadPref: Int,
hdrEnabled: Boolean,
audioChannels: Int,
): Long
/** 64-hex SHA-256 of the cert the host presented on [handle]; valid after a successful connect. */
+189 -27
@@ -1,8 +1,21 @@
//! Android audio playback (android-only): pull Opus packets from the connector, decode to
//! interleaved f32 stereo, and feed AAudio (LowLatency) via its realtime data callback through a
//! jitter ring. Mirrors [`crate::decode`]: one thread we own (the Opus decode producer) plus a
//! shutdown flag; the realtime callback thread is owned by AAudio. Ring logic ported from
//! `punktfunk-client-linux/src/audio.rs` (prime ~3 quanta, drop-oldest cap, re-prime on drain).
//! interleaved f32 (stereo or 5.1/7.1 surround), and feed AAudio (LowLatency) via its realtime data
//! callback through a jitter ring. Mirrors [`crate::decode`]: one thread we own (the Opus decode
//! producer) plus a shutdown flag; the realtime callback thread is owned by AAudio.
//!
//! The layout is the host-RESOLVED channel count (`NativeClient::audio_channels`, negotiated at
//! connect), so an older/clamping host that can only capture stereo is decoded + played as stereo.
//! 2 = stereo / 6 = 5.1 / 8 = 7.1, in the canonical wire order FL FR FC LFE RL RR SL SR.
//!
//! The ring started as a port of `punktfunk-client-linux/src/audio.rs`, but AAudio — unlike
//! PipeWire, which adaptively rate-matches the stream and absorbs a shallow buffer — hands us a raw
//! realtime callback and makes us own the buffer. So this client diverges deliberately to stop the
//! Android-only crackle: (1) the callback is allocation/free-free — decoded buffers are recycled to
//! the producer via a free-list instead of being freed on the audio thread (Android's Scudo `free`
//! has unbounded tail latency); (2) the jitter ring is deeper (~40 ms prime / ~150 ms hard cap) and
//! decoupled from the tiny LowLatency burst size, with de-prime hysteresis so a transient drain
//! doesn't manufacture a silence; (3) the AAudio HW buffer is primed above its 2-burst default and
//! grown on XRuns (Google's anti-glitch technique).
use ndk::audio::{
AudioCallbackResult, AudioDirection, AudioFormat, AudioPerformanceMode, AudioSharingMode,
@@ -13,16 +26,75 @@ use punktfunk_core::error::PunktfunkError;
use std::collections::VecDeque;
use std::ffi::c_void;
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;
use std::time::Duration;
const CHANNELS: usize = 2;
const SAMPLE_RATE: i32 = 48_000;
/// Decoded-chunk hand-off depth: 64 × 5 ms = 320 ms slack (matches the core's AUDIO_QUEUE).
const RING_CHUNKS: usize = 64;
/// Opus decode scratch: worst-case 120 ms stereo frame (5760 samples/ch × 2 ch).
const PCM_SCRATCH: usize = 5760 * CHANNELS;
// --- Jitter-ring depths, in MILLISECONDS (scaled to interleaved-f32 samples at runtime). --------
// The channel count is negotiated, not a compile-time const, so these are kept in ms and multiplied
// by `ms` (interleaved-f32 samples per millisecond at the resolved layout) inside `start`.
// Unlike the Linux client (PipeWire adaptively rate-matches the stream to the graph clock, masking
// host↔DAC drift + a shallow ring), AAudio hands us a raw callback and we own the buffer: drift and
// WiFi power-save bunching land as underruns/overflows = crackle. So Android runs a deliberately
// deeper, smoothly-managed ring than Linux — keep the two clients' depths intentionally divergent.
/// Prime/target floor: fill to ~40 ms before playing (and after a sustained drain). Deep enough to
/// ride out WiFi arrival jitter + clock drift; the dominant Android-only anti-crackle lever.
const PRIME_FLOOR_MS: usize = 40;
/// Ceiling for the burst-scaled target (so a large quantum can't push the prime depth too high).
const PRIME_CEIL_MS: usize = 80;
/// Drop-oldest headroom above the target before trimming — a ~80 ms band swallows an arrival burst
/// without overflowing.
const JITTER_HEADROOM_MS: usize = 80;
/// Hard latency bound: never let the ring exceed ~150 ms (the only thing that caps added latency).
const HARD_CAP_MS: usize = 150;
/// Re-prime (go silent to refill) only after this many CONSECUTIVE empty callbacks, so one transient
/// drain doesn't manufacture a fresh 40 ms silence (the old `if ring.is_empty()` re-primed instantly).
const DEPRIME_AFTER_CALLBACKS: u32 = 5;
/// Throttle the AAudio XRun-driven HW-buffer grow check (cheap, but no need to poll every quantum).
const XRUN_CHECK_EVERY: u32 = 128;
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table. Mirrors the Linux client's `AudioDec`.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
SAMPLE_RATE as u32,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
SAMPLE_RATE as u32,
l.streams,
l.coupled,
l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
/// Diagnostics — written by the decode thread + the realtime callback, logged periodically. The
/// audio analogue of the video `fed`/`rendered` counters (we can't "screenshot" sound).
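The depth constants in this hunk are kept in milliseconds and only become sample counts once the channel count is known. A minimal sketch of that scaling follows, assuming the constant values shown above and the `target` / `hard_cap` formulas used by the callback in `start`. `RingDepths`, `ring_depths` and `burst_frames` are hypothetical names for illustration, not identifiers from the patch.

```rust
const SAMPLE_RATE: usize = 48_000;
const PRIME_FLOOR_MS: usize = 40;
const PRIME_CEIL_MS: usize = 80;
const JITTER_HEADROOM_MS: usize = 80;
const HARD_CAP_MS: usize = 150;

/// Hypothetical result type: jitter-ring depths in interleaved f32 samples.
#[derive(Debug, PartialEq)]
struct RingDepths {
    target: usize,
    hard_cap: usize,
}

/// `channels` is the host-resolved channel count (2, 6 or 8);
/// `burst_frames` is the frame count AAudio asks for in one callback.
fn ring_depths(channels: usize, burst_frames: usize) -> RingDepths {
    // Interleaved f32 samples per millisecond at this layout.
    let ms = (SAMPLE_RATE / 1000) * channels;
    let want = burst_frames * channels;
    // Burst-scaled target, clamped to the 40..80 ms band.
    let target = (3 * want).clamp(PRIME_FLOOR_MS * ms, PRIME_CEIL_MS * ms);
    // Drop-oldest threshold: target plus headroom, never above 150 ms.
    let hard_cap = (target + JITTER_HEADROOM_MS * ms).min(HARD_CAP_MS * ms);
    RingDepths { target, hard_cap }
}
```

With stereo and a 96-frame burst, `ring_depths(2, 96)` gives a 3840-sample target (40 ms) and an 11520-sample cap (120 ms). With 7.1 and a very large 4096-frame burst, the target saturates at 80 ms and the cap at 150 ms, so a large quantum cannot push the added latency past the hard bound.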
@@ -42,27 +114,57 @@ pub struct AudioPlayback {
}
impl AudioPlayback {
/// Open AAudio (LowLatency, 48 kHz/stereo/f32) with a realtime callback draining a jitter ring,
/// then spawn the Opus decode thread. `None` on failure (the caller leaves video streaming).
/// Open AAudio (LowLatency, 48 kHz/f32, the host-resolved channel layout) with a realtime
/// callback draining a jitter ring, then spawn the Opus decode thread. `None` on failure (the
/// caller leaves video streaming).
pub fn start(client: Arc<NativeClient>) -> Option<AudioPlayback> {
// Build playback from the host-RESOLVED channel count (never the request): 2 = stereo /
// 6 = 5.1 / 8 = 7.1, canonical wire order FL FR FC LFE RL RR SL SR.
let channels = punktfunk_core::audio::normalize_channels(client.audio_channels) as usize;
// Interleaved f32 samples per millisecond at this layout (48 kHz × channels); the ms-
// denominated jitter-ring depths scale by it.
let ms = (SAMPLE_RATE as usize / 1000) * channels;
let prime_floor = PRIME_FLOOR_MS * ms;
let prime_ceil = PRIME_CEIL_MS * ms;
let jitter_headroom = JITTER_HEADROOM_MS * ms;
let hard_cap_max = HARD_CAP_MS * ms;
let counters = Arc::new(Counters::default());
let (tx, rx) = sync_channel::<Vec<f32>>(RING_CHUNKS);
// Recycle free-list: drained PCM buffers go BACK to the decode thread to be refilled, so the
// realtime callback never frees heap (Android's Scudo allocator has unbounded free() tail
// latency — a free on the audio thread is an XRun = a click) and the decode thread rarely
// allocates. Same depth as the data channel.
let (free_tx, free_rx) = sync_channel::<Vec<f32>>(RING_CHUNKS);
// Realtime consumer state, owned by the callback (FnMut) — no lock: AAudio calls it from a
// single high-priority thread, and the decode thread only touches `tx`.
// single high-priority thread, and the decode thread only touches `tx`/`free_rx`.
let cb_counters = counters.clone();
let mut ring: VecDeque<f32> = VecDeque::with_capacity(PCM_SCRATCH);
// Pre-reserve the ring so `extend` never reallocates on the realtime thread. Worst transient
// before the trim below = the hard cap plus one full hand-off channel (RING_CHUNKS) of 5 ms
// frames (240 samples per channel, i.e. 480 f32 at stereo). The punktfunk protocol always sends
// 5 ms Opus frames (host `audio_thread`); a larger frame would force a one-time realloc,
// asserted (not silently corrupted) in `decode_loop`.
let mut ring: VecDeque<f32> = VecDeque::with_capacity(hard_cap_max + RING_CHUNKS * 5 * ms);
let mut primed = false;
let callback = move |_s: &AudioStream, data: *mut c_void, num_frames: i32| {
let want = num_frames as usize * CHANNELS;
let mut empties: u32 = 0; // consecutive empty callbacks (de-prime hysteresis)
let mut cb_count: u32 = 0; // callbacks since open (throttles the XRun grow check)
let mut last_xrun: i32 = 0; // last AAudio XRun count we grew the buffer for
let callback = move |s: &AudioStream, data: *mut c_void, num_frames: i32| {
let want = num_frames as usize * channels;
// SAFETY: AAudio provides `num_frames * channel_count` F32 slots at `data`.
let out = unsafe { std::slice::from_raw_parts_mut(data as *mut f32, want) };
while let Ok(chunk) = rx.try_recv() {
ring.extend(chunk);
// Drain decoded chunks into the ring WITHOUT freeing on the RT thread: `drain(..)` empties
// each Vec but keeps its capacity, then the empty buffer is handed back for reuse. The
// only RT-thread free is the rare case where the recycle channel is momentarily full.
while let Ok(mut chunk) = rx.try_recv() {
ring.extend(chunk.drain(..));
let _ = free_tx.try_send(chunk);
}
// Prime to ~3 quanta (15 ms; floor 15 ms / ceiling 200 ms); drop OLDEST above the cap.
let target = (3 * want).clamp(720 * CHANNELS, 9600 * CHANNELS);
while ring.len() > target.max(want) + want {
// Jitter buffer: prime to ~40 ms (prime_floor) before playing and after a sustained drain;
// drop-oldest only above a wide ~120 ms band. Decoupled from the AAudio burst `want` (tiny
// on the LowLatency MMAP path) so the depth doesn't collapse to a single quantum.
let target = (3 * want).clamp(prime_floor, prime_ceil);
let hard_cap = (target + jitter_headroom).min(hard_cap_max);
while ring.len() > hard_cap {
ring.pop_front();
}
if !primed && ring.len() >= target {
@@ -79,12 +181,34 @@ impl AudioPlayback {
out.fill(0.0);
cb_counters.underruns.fetch_add(1, Ordering::Relaxed);
}
// Re-prime only after a RUN of empty callbacks, not a single transient one — otherwise
// every momentary drain costs a fresh 40 ms silence (the old behaviour, self-inflicted
// crackle on any jitter spike).
if ring.is_empty() {
primed = false; // re-prime after a genuine drain (avoids sustained crackle on loss)
empties += 1;
if empties >= DEPRIME_AFTER_CALLBACKS {
primed = false;
}
} else {
empties = 0;
}
cb_counters
.ring_depth
.store(ring.len() as u64, Ordering::Relaxed);
// Google's AAudio anti-glitch technique: when the device reports new XRuns, grow the HW
// buffer by one burst (up to capacity). getXRunCount + setBufferSizeInFrames are both
// callback-safe / non-blocking, and set clamps to capacity so it self-limits. Throttled.
cb_count = cb_count.wrapping_add(1);
if cb_count % XRUN_CHECK_EVERY == 0 {
let xr = s.x_run_count();
if xr > last_xrun {
last_xrun = xr;
let burst = s.frames_per_burst().max(1);
let grown =
(s.buffer_size_in_frames() + burst).min(s.buffer_capacity_in_frames());
let _ = s.set_buffer_size_in_frames(grown);
}
}
AudioCallbackResult::Continue
};
@@ -93,7 +217,11 @@ impl AudioPlayback {
.ok()?
.direction(AudioDirection::Output)
.sample_rate(SAMPLE_RATE)
.channel_count(CHANNELS as i32)
// The wire order (FL FR FC LFE RL RR SL SR) is the standard AAudio/Android channel
// order, so this is an IDENTITY mapping — no permute. AAudio infers the 5.1/7.1 mask
// from `channel_count` (the ndk crate's builder exposes no setChannelMask); the host
// captures + Opus-encodes in exactly this order.
.channel_count(channels as i32)
.format(AudioFormat::PCM_Float)
.performance_mode(AudioPerformanceMode::LowLatency)
.sharing_mode(AudioSharingMode::Shared)
@@ -109,19 +237,31 @@ impl AudioPlayback {
log::error!("audio: request_start: {e}");
return None;
}
// Lift the AAudio HW buffer off its brittle ~2-burst LowLatency default so a single late
// callback doesn't immediately underrun; the in-callback XRun loop grows it further if the
// device still glitches. set_buffer_size_in_frames clamps to capacity.
let burst = stream.frames_per_burst().max(1);
let _ =
stream.set_buffer_size_in_frames((burst * 3).min(stream.buffer_capacity_in_frames()));
// perf != LowLatency or rate != 48000 means AAudio silently fell to a resampled legacy path
// (different burst behaviour) — surface it so the field can tell that apart from plain jitter.
log::info!(
"audio: AAudio started rate={} ch={} fmt={:?} burst={}",
"audio: AAudio started rate={} ch={} fmt={:?} perf={:?} share={:?} burst={} buf={}/{}",
stream.sample_rate(),
stream.channel_count(),
stream.format(),
stream.performance_mode(),
stream.sharing_mode(),
stream.frames_per_burst(),
stream.buffer_size_in_frames(),
stream.buffer_capacity_in_frames(),
);
let shutdown = Arc::new(AtomicBool::new(false));
let sd = shutdown.clone();
let join = std::thread::Builder::new()
.name("pf-audio".into())
.spawn(move || decode_loop(client, tx, sd, counters))
.spawn(move || decode_loop(client, tx, free_rx, sd, counters, channels))
.ok();
Some(AudioPlayback {
@@ -143,31 +283,53 @@ impl Drop for AudioPlayback {
}
/// Producer: `next_audio` → Opus `decode_float` → push interleaved f32 into the ring channel.
/// Buffers come from (and return to) the realtime callback's recycle free-list so the steady state
/// is allocation-free on both threads.
fn decode_loop(
client: Arc<NativeClient>,
tx: SyncSender<Vec<f32>>,
free_rx: Receiver<Vec<f32>>,
shutdown: Arc<AtomicBool>,
counters: Arc<Counters>,
channels: usize,
) {
let mut dec = match opus::Decoder::new(SAMPLE_RATE as u32, opus::Channels::Stereo) {
// Interleaved f32 samples per millisecond at this layout — the ring's 5 ms reserve check below.
let ms = (SAMPLE_RATE as usize / 1000) * channels;
// Opus decode scratch: worst-case 120 ms frame (5760 samples/ch) × channels.
let pcm_scratch = 5760 * channels;
let mut dec = match AudioDec::new(channels as u8) {
Ok(d) => d,
Err(e) => {
log::error!("audio: opus decoder init: {e} — audio disabled");
return;
}
};
let mut pcm = vec![0f32; PCM_SCRATCH];
let mut pcm = vec![0f32; pcm_scratch];
let mut window_peak = 0f32; // loudest |sample| since the last log — tells a tone from silence
while !shutdown.load(Ordering::Relaxed) {
match client.next_audio(Duration::from_millis(5)) {
Ok(pkt) => match dec.decode_float(&pkt.data, &mut pcm, false) {
Ok(samples) => {
let n = samples * CHANNELS;
let n = samples * channels;
for &s in &pcm[..n] {
window_peak = window_peak.max(s.abs());
}
// The ring's pre-reservation in `start` assumes the protocol's 5 ms frames (240 samples per
// channel, so ≤ 240 × channels f32); a larger frame would force a one-time realloc on the RT
// thread. Catch a future host frame-size change here in debug, not as a silent audio glitch.
debug_assert!(
n <= 5 * ms,
"audio frame {n} f32 exceeds the 5 ms ring reserve"
);
let count = counters.opus_decoded.fetch_add(1, Ordering::Relaxed) + 1;
match tx.try_send(pcm[..n].to_vec()) {
// Reuse a recycled buffer if the callback handed one back; only allocate when the
// free-list is momentarily empty (startup / after a backpressure drop).
let mut buf = free_rx
.try_recv()
.unwrap_or_else(|_| Vec::with_capacity(pcm_scratch));
buf.clear();
buf.extend_from_slice(&pcm[..n]);
match tx.try_send(buf) {
Ok(()) | Err(TrySendError::Full(_)) => {} // drop-newest under backpressure
Err(TrySendError::Disconnected(_)) => break,
}
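
For orientation, the ms-scaled depth arithmetic in `start` can be sketched in Python. This is a sketch under stated assumptions, not the crate's code: the 40 ms floor comes from the "~40 ms" comment in the callback, while the ceiling, headroom, and hard-cap defaults are placeholders (the real `PRIME_CEIL_MS`, `JITTER_HEADROOM_MS`, and `HARD_CAP_MS` values are not shown in this diff). `samples_per_ms` and `ring_targets` are hypothetical names.

```python
SAMPLE_RATE = 48_000  # Hz, as in the diff


def samples_per_ms(channels: int) -> int:
    """Interleaved f32 samples per millisecond for a given channel count."""
    return (SAMPLE_RATE // 1000) * channels


def ring_targets(
    want: int,
    channels: int,
    prime_floor_ms: int = 40,      # from the "~40 ms" comment in the callback
    prime_ceil_ms: int = 200,      # ASSUMED placeholder, not from the diff
    jitter_headroom_ms: int = 80,  # ASSUMED placeholder, not from the diff
    hard_cap_ms: int = 250,        # ASSUMED placeholder, not from the diff
) -> tuple[int, int]:
    """Return (prime target, drop-oldest cap) in interleaved samples."""
    ms = samples_per_ms(channels)
    # Same shape as the callback: clamp 3 bursts between floor and ceiling.
    target = min(max(3 * want, prime_floor_ms * ms), prime_ceil_ms * ms)
    # Headroom above the target, bounded by the absolute cap.
    hard_cap = min(target + jitter_headroom_ms * ms, hard_cap_ms * ms)
    return target, hard_cap


if __name__ == "__main__":
    for ch in (2, 6, 8):
        target, cap = ring_targets(want=96 * ch, channels=ch)
        ms = samples_per_ms(ch)
        print(f"{ch} ch: prime at {target // ms} ms, drop-oldest above {cap // ms} ms")
```

Because every depth is expressed in milliseconds and multiplied by `ms`, the buffered time stays the same for stereo, 5.1, and 7.1; only the sample counts grow with the channel count.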
+12 -4
@@ -140,10 +140,12 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeGenerateIde
}
/// `NativeBridge.nativeConnect(host, port, w, h, hz, certPem, keyPem, pinHex, bitrateKbps,
/// compositorPref, gamepadPref): Long`. `certPem`/`keyPem` empty = anonymous, else presented as the
/// persistent identity. `pinHex` empty = TOFU (read `nativeHostFingerprint` after), else 64-hex
/// SHA-256 to pin the host (mismatch → 0). `bitrateKbps` 0 = host default. `compositorPref`/
/// `gamepadPref` are `CompositorPref`/`GamepadPref` wire bytes (0 = Auto; unknown → Auto).
/// compositorPref, gamepadPref, hdrEnabled, audioChannels): Long`. `certPem`/`keyPem` empty =
/// anonymous, else presented as the persistent identity. `pinHex` empty = TOFU (read
/// `nativeHostFingerprint` after), else 64-hex SHA-256 to pin the host (mismatch → 0). `bitrateKbps`
/// 0 = host default. `compositorPref`/`gamepadPref` are `CompositorPref`/`GamepadPref` wire bytes
/// (0 = Auto; unknown → Auto). `audioChannels` is the requested surround layout (2/6/8; normalized,
/// anything else → stereo) — the host clamps it and the resolved count drives playback.
/// Returns an opaque handle, or 0 on failure (logged).
#[no_mangle]
#[allow(clippy::too_many_arguments)]
@@ -162,6 +164,7 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeConnect<'lo
compositor_pref: jint,
gamepad_pref: jint,
hdr_enabled: jboolean,
audio_channels: jint,
) -> jlong {
let host: String = match env.get_string(&host) {
Ok(s) => s.into(),
@@ -213,6 +216,11 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeConnect<'lo
} else {
0
},
// Requested surround layout (2 = stereo / 6 = 5.1 / 8 = 7.1). The host clamps to what it can
// capture and echoes the resolved count in `connector.audio_channels`, which drives the
// decoder + AAudio layout (read in `crate::audio::AudioPlayback::start`). Anything else
// normalizes to stereo here.
punktfunk_core::audio::normalize_channels(audio_channels.clamp(0, u8::MAX as jint) as u8),
None, // launch: default app
pin, // Some → Crypto on host-fp mismatch
identity, // owned (cert, key) PEM, or None (anonymous)
@@ -25,6 +25,7 @@ struct ContentView: View {
@AppStorage(DefaultsKey.compositor) private var compositor = 0
@AppStorage(DefaultsKey.gamepadType) private var gamepadType = 0
@AppStorage(DefaultsKey.bitrateKbps) private var bitrateKbps = 0
@AppStorage(DefaultsKey.audioChannels) private var audioChannels = 2
@AppStorage(DefaultsKey.fullscreenWhileStreaming) private var fullscreenWhileStreaming = true
@AppStorage(DefaultsKey.hudEnabled) private var hudEnabled = true
@AppStorage(DefaultsKey.hudPlacement) private var hudPlacement = HUDPlacement.topTrailing.rawValue
@@ -252,6 +253,7 @@ struct ContentView: View {
setting: PunktfunkConnection.GamepadType(
rawValue: UInt32(clamping: gamepadType)) ?? .auto),
bitrateKbps: UInt32(clamping: bitrateKbps),
audioChannels: UInt8(clamping: audioChannels),
launchID: launchID,
allowTofu: host.pinnedSHA256 == nil)
}
@@ -351,6 +353,7 @@ struct ContentView: View {
compositor: pref,
gamepad: pad,
bitrateKbps: bitrate,
audioChannels: UInt8(clamping: audioChannels),
autoTrust: true)
}
}
@@ -99,6 +99,7 @@ final class SessionModel: ObservableObject {
compositor: PunktfunkConnection.Compositor = .auto,
gamepad: PunktfunkConnection.GamepadType = .auto,
bitrateKbps: UInt32 = 0,
audioChannels: UInt8 = 2,
hdrEnabled: Bool = true,
launchID: String? = nil,
allowTofu: Bool = false,
@@ -137,7 +138,7 @@ final class SessionModel: ObservableObject {
width: width, height: height, refreshHz: hz,
pinSHA256: pin, identity: identity, compositor: compositor,
gamepad: gamepad, bitrateKbps: bitrateKbps, videoCaps: videoCaps,
launchID: launchID) }
audioChannels: audioChannels, launchID: launchID) }
await MainActor.run { [weak self] in
guard let self else { return }
// The user may have abandoned this attempt (window closed, another host
@@ -25,6 +25,7 @@ struct SettingsView: View {
@AppStorage(DefaultsKey.libraryEnabled) private var libraryEnabled = false
@AppStorage(DefaultsKey.fullscreenWhileStreaming) private var fullscreenWhileStreaming = true
@AppStorage(DefaultsKey.micEnabled) private var micEnabled = true
@AppStorage(DefaultsKey.audioChannels) private var audioChannels = 2
@AppStorage(DefaultsKey.hudEnabled) private var hudEnabled = true
@AppStorage(DefaultsKey.hudPlacement) private var hudPlacement = HUDPlacement.topTrailing.rawValue
@ObservedObject private var gamepads = GamepadManager.shared
@@ -173,6 +174,10 @@ struct SettingsView: View {
TVSelectionRow(title: "Stream mode", options: options, selection: modeTag)
TVSelectionRow(
title: "Bitrate", options: bitrateOptions, selection: $bitrateKbps)
TVSelectionRow(
title: "Audio channels",
options: [("Stereo", 2), ("5.1 Surround", 6), ("7.1 Surround", 8)],
selection: $audioChannels)
if bitrateKbps > 1_000_000 {
Label(Self.gigabitWarning, systemImage: "exclamationmark.triangle.fill")
.font(.caption)
@@ -271,6 +276,11 @@ struct SettingsView: View {
@ViewBuilder private var audioSection: some View {
Section {
Picker("Audio channels", selection: $audioChannels) {
Text("Stereo").tag(2)
Text("5.1 Surround").tag(6)
Text("7.1 Surround").tag(8)
}
#if os(macOS)
Picker("Speaker", selection: $speakerUID) {
Text("System default").tag("")
@@ -15,6 +15,9 @@ public enum DefaultsKey {
public static let gamepadType = "punktfunk.gamepadType"
public static let gamepadID = "punktfunk.gamepadID"
public static let bitrateKbps = "punktfunk.bitrateKbps"
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the in-core decode + AVAudioEngine layout.
public static let audioChannels = "punktfunk.audioChannels"
public static let micEnabled = "punktfunk.micEnabled"
public static let speakerUID = "punktfunk.speakerUID"
public static let micUID = "punktfunk.micUID"
@@ -235,6 +235,12 @@ public final class PunktfunkConnection {
/// drain `nextHdrMeta`.
public var isHDR: Bool { colorTransfer == 16 || colorTransfer == 18 }
/// The audio channel count the host resolved for this session (the Welcome's echo of the
/// requested `audioChannels`, clamped to what the host can capture): `2` (stereo), `6` (5.1)
/// or `8` (7.1). Build the playback layout from THIS, never the request. `2` for an older host.
/// PCM from `nextAudioPcm` is interleaved in the canonical wire order FL FR FC LFE RL RR SL SR.
public private(set) var resolvedAudioChannels: UInt8 = 2
/// Connect and start a session at the requested mode (the host creates a native virtual
/// output at exactly this size/refresh). Blocks up to `timeoutMs`.
///
@@ -264,6 +270,7 @@ public final class PunktfunkConnection {
gamepad: GamepadType = .auto,
bitrateKbps: UInt32 = 0,
videoCaps: UInt8 = 0,
audioChannels: UInt8 = 2,
launchID: String? = nil,
timeoutMs: UInt32 = 10_000
) throws {
@@ -279,16 +286,16 @@ public final class PunktfunkConnection {
withOptionalCString(launchID) { launch in
if let pin = pinSHA256 {
return pin.withUnsafeBytes { p in
punktfunk_connect_ex5(
punktfunk_connect_ex6(
cs, port, width, height, refreshHz, compositor.rawValue,
gamepad.rawValue, bitrateKbps, videoCaps, launch,
gamepad.rawValue, bitrateKbps, videoCaps, audioChannels, launch,
p.bindMemory(to: UInt8.self).baseAddress, &observed,
cert, key, timeoutMs)
}
}
return punktfunk_connect_ex5(
return punktfunk_connect_ex6(
cs, port, width, height, refreshHz, compositor.rawValue,
gamepad.rawValue, bitrateKbps, videoCaps, launch,
gamepad.rawValue, bitrateKbps, videoCaps, audioChannels, launch,
nil, &observed, cert, key, timeoutMs)
}
}
@@ -320,6 +327,9 @@ public final class PunktfunkConnection {
colorMatrix = mtx
colorFullRange = fullRange != 0
bitDepth = depth
var ac: UInt8 = 2
_ = punktfunk_connection_audio_channels(handle, &ac)
resolvedAudioChannels = ac
}
/// A bandwidth speed-test measurement (see `startSpeedTest`). Partial until `done`.
@@ -468,6 +478,50 @@ public final class PunktfunkConnection {
}
}
/// One decoded audio frame from `nextAudioPcm`: interleaved 32-bit float at 48 kHz, in the
/// canonical wire channel order FL FR FC LFE RL RR SL SR (the first `channels`).
public struct AudioPCM: Sendable {
/// Interleaved f32 samples (`frameCount * channels` long), wire channel order.
public let samples: [Float]
/// Samples per channel.
public let frameCount: Int
/// Channel count (2/6/8), equal to `resolvedAudioChannels`.
public let channels: Int
public let ptsNs: UInt64
public let seq: UInt32
}
/// Pull the next audio frame, **decoded in-core** to interleaved f32 PCM. Apple's AudioToolbox
/// Opus path is stereo-only, so surround (and, for uniformity, stereo too) is decoded by the
/// Rust core (libopus multistream) and handed back as PCM. nil on timeout, throws `.closed` once
/// the session ended. Drain from a dedicated audio thread (do NOT also call `nextAudio`; they
/// share the underlying queue). The returned `samples` are copied out, so the buffer is owned.
public func nextAudioPcm(timeoutMs: UInt32 = 100) throws -> AudioPCM? {
audioLock.lock()
defer { audioLock.unlock() }
guard let h = liveHandle() else { throw PunktfunkClientError.closed }
var out = PunktfunkAudioPcm()
let rc = punktfunk_connection_next_audio_pcm(h, &out, timeoutMs)
switch rc {
case statusOK:
let channels = Int(out.channels)
let total = Int(out.frame_count) * channels
guard let base = out.samples, total > 0 else { return nil }
// Copy: the pointer borrows connection memory only until the next PCM call.
let samples = Array(UnsafeBufferPointer(start: base, count: total))
return AudioPCM(
samples: samples, frameCount: Int(out.frame_count),
channels: channels, ptsNs: out.pts_ns, seq: out.seq)
case statusNoFrame:
return nil
case statusClosed:
throw PunktfunkClientError.closed
default:
throw PunktfunkClientError.status(rc)
}
}
/// Pull the next force-feedback update for the GCController haptics engine:
/// `(pad, lowFrequency, highFrequency)` with 0...0xFFFF amplitudes, (0, 0) = stop.
/// Drain from the (single) feedback thread, alongside `nextHidOutput`.
@@ -19,13 +19,13 @@ import os
private let log = Logger(subsystem: "io.unom.punktfunk", category: "audio")
/// SPSC-ish jitter ring (interleaved stereo float), drain thread → render callback.
/// The unfair lock is held for microseconds; fine at render-callback rates. Priming:
/// SPSC-ish jitter ring (interleaved float, `channels` per frame), drain thread → render
/// callback. The unfair lock is held for microseconds; fine at render-callback rates. Priming:
/// reads return silence until enough is buffered (at least `prefill`, and at least one
/// packet more than the device's render quantum; large-buffer devices would otherwise
/// chronically out-demand the prefill and oscillate prime → dropout → re-prime), and an
/// underrun re-primes, concealing jitter as one short dip instead of sustained crackle.
/// All counts stay even (whole stereo frames), so L/R interleave can never flip.
/// All counts stay whole frames (multiples of `channels`), so the interleave can never slip.
final class AudioRing: @unchecked Sendable {
private var buf: [Float]
private var readIdx = 0
@@ -34,12 +34,14 @@ final class AudioRing: @unchecked Sendable {
private var renderQuantum = 0
private let prefill: Int
private let highWater: Int
private let channels: Int
private let lock = OSAllocatedUnfairLock()
/// `capacity`/`prefill` in samples (interleaved 2 per frame, both must be even).
init(capacity: Int, prefill: Int) {
/// `capacity`/`prefill` in samples (interleaved `channels` per frame, both whole frames).
init(capacity: Int, prefill: Int, channels: Int) {
buf = [Float](repeating: 0, count: capacity)
self.prefill = prefill
self.channels = channels
highWater = prefill * 4
}
@@ -74,8 +76,8 @@ final class AudioRing: @unchecked Sendable {
renderQuantum = max(renderQuantum, count)
let available = writeIdx - readIdx
if !primed {
// 480 samples = one 5 ms host packet of slack beyond the device's demand.
if available >= max(prefill, renderQuantum + 480) {
// One 5 ms host packet (240 frames × channels) of slack beyond the device's demand.
if available >= max(prefill, renderQuantum + 240 * channels) {
primed = true
} else {
for i in 0..<count { out[i] = 0 }
@@ -113,10 +115,55 @@ private final class StopFlag: @unchecked Sendable {
/// Render-block-owned scratch storage: freed exactly when the closure (and thus the
/// last possible render call) is released, never racing CoreAudio.
private final class ScratchBuffer {
let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 8192 * 2)
// 8192 frames × up to 8 channels (7.1); the render block caps `frames` at 8192.
let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 8192 * 8)
deinit { ptr.deallocate() }
}
/// CoreAudio channel layout for the canonical wire order FL FR FC LFE RL RR [SL SR]. nil for
/// stereo (the standard layout is correct). For 5.1/7.1 we list explicit channel labels via
/// `kAudioChannelLayoutTag_UseChannelDescriptions`; preset tags (DTS_5_1 etc.) don't reliably
/// match Moonlight's order. NB: the 7.1 mapping (verified against the WASAPI 0x63F + SPA orderings):
/// wire idx 4-5 = RL/RR = the WAVE *back* pair LeftSurround/RightSurround; idx 6-7 = SL/SR = the
/// WAVE *side* pair LeftSurroundDirect/RightSurroundDirect. (Using RearSurround* for 6-7 would
/// swap side/back vs the Windows/Linux clients.)
private func wireChannelLayout(channels: Int) -> AVAudioChannelLayout? {
let labels: [AudioChannelLabel]
switch channels {
case 6:
labels = [
kAudioChannelLabel_Left, kAudioChannelLabel_Right, kAudioChannelLabel_Center,
kAudioChannelLabel_LFEScreen, kAudioChannelLabel_LeftSurround,
kAudioChannelLabel_RightSurround,
]
case 8:
labels = [
kAudioChannelLabel_Left, kAudioChannelLabel_Right, kAudioChannelLabel_Center,
kAudioChannelLabel_LFEScreen,
kAudioChannelLabel_LeftSurround, kAudioChannelLabel_RightSurround, // wire RL/RR (back)
kAudioChannelLabel_LeftSurroundDirect, kAudioChannelLabel_RightSurroundDirect, // wire SL/SR (side)
]
default:
return nil
}
let size = MemoryLayout<AudioChannelLayout>.size
+ (labels.count - 1) * MemoryLayout<AudioChannelDescription>.stride
let raw = UnsafeMutableRawPointer.allocate(byteCount: size, alignment: 16)
defer { raw.deallocate() }
let layout = raw.bindMemory(to: AudioChannelLayout.self, capacity: 1)
layout.pointee.mChannelLayoutTag = kAudioChannelLayoutTag_UseChannelDescriptions
layout.pointee.mChannelBitmap = AudioChannelBitmap(rawValue: 0)
layout.pointee.mNumberChannelDescriptions = UInt32(labels.count)
let descs = UnsafeMutableBufferPointer(
start: &layout.pointee.mChannelDescriptions, count: labels.count)
for (i, lbl) in labels.enumerated() {
descs[i] = AudioChannelDescription(
mChannelLabel: lbl, mChannelFlags: AudioChannelFlags(rawValue: 0),
mCoordinates: (0, 0, 0))
}
return AVAudioChannelLayout(layout: layout)
}
public final class SessionAudio {
private let connection: PunktfunkConnection
private let flag = StopFlag()
@@ -229,9 +276,13 @@ public final class SessionAudio {
// MARK: - Playback (host speaker)
private func startPlayback(speakerUID: String) {
// 1 s of interleaved stereo capacity, ~20 ms prefill: four 5 ms host packets of
// jitter absorption before the first sample plays.
let ring = AudioRing(capacity: 96_000, prefill: 1920)
// Build the playback layout from the host-RESOLVED channel count (never the request):
// 2 = stereo / 6 = 5.1 / 8 = 7.1, canonical wire order FL FR FC LFE RL RR SL SR.
let channels = Int(connection.resolvedAudioChannels)
// 1 s interleaved capacity, ~20 ms prefill (four 5 ms host packets of jitter absorption
// before the first sample plays), both scaled by the channel count.
let ring = AudioRing(
capacity: 48_000 * channels, prefill: 960 * channels, channels: channels)
let engine = AVAudioEngine()
#if os(macOS)
@@ -247,21 +298,32 @@ public final class SessionAudio {
}
#endif
// Engine-native deinterleaved float; the render block deinterleaves from the ring.
guard let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)
else { return }
// Engine-native deinterleaved float; the render block deinterleaves from the ring. Surround
// uses an explicit wire-order channel layout; the mixer downmixes to the output device when
// it has fewer speakers (e.g. an iPhone's stereo built-ins). (Explicit if/else rather than
// map/flatMap so it's correct whether the channelLayout initializer is failable or not.)
var format: AVAudioFormat?
if channels == 2 {
format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)
} else if let layout = wireChannelLayout(channels: channels) {
format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channelLayout: layout)
}
guard let format else {
log.error("could not build \(channels)-channel audio format — audio disabled")
return
}
let scratch = ScratchBuffer() // block-owned; freed with the closure
let source = AVAudioSourceNode(format: format) { _, _, frameCount, abl -> OSStatus in
let frames = Int(frameCount)
guard frames <= 8192 else { return kAudioUnitErr_TooManyFramesToProcess }
ring.read(into: scratch.ptr, count: frames * 2)
ring.read(into: scratch.ptr, count: frames * channels)
let buffers = UnsafeMutableAudioBufferListPointer(abl)
if buffers.count >= 2,
let left = buffers[0].mData?.assumingMemoryBound(to: Float.self),
let right = buffers[1].mData?.assumingMemoryBound(to: Float.self) {
for f in 0..<frames {
left[f] = scratch.ptr[f * 2]
right[f] = scratch.ptr[f * 2 + 1]
// Deinterleave the wire-order interleaved ring into the engine's per-channel buses.
if buffers.count >= channels {
for ch in 0..<channels {
if let dst = buffers[ch].mData?.assumingMemoryBound(to: Float.self) {
for f in 0..<frames { dst[f] = scratch.ptr[f * channels + ch] }
}
}
}
return noErr
@@ -292,29 +354,20 @@ public final class SessionAudio {
stateLock.unlock()
let thread = Thread { [connection, flag, drainDone] in
defer { drainDone.signal() }
guard let decoder = try? OpusDecoder(framesPerPacket: 240),
let pcm = AVAudioPCMBuffer(
pcmFormat: decoder.pcmFormat, frameCapacity: 5760)
else {
log.error("Opus decoder unavailable — audio playback disabled")
return
}
// Decode happens IN-CORE (libopus multistream), since AudioToolbox's Opus path is
// stereo-only, and is handed back as interleaved f32 PCM in wire channel order.
while !flag.isStopped {
let packet: AudioPacket?
let pcm: PunktfunkConnection.AudioPCM?
do {
packet = try connection.nextAudio(timeoutMs: 100)
pcm = try connection.nextAudioPcm(timeoutMs: 100)
} catch {
break // session closed
}
guard let packet else { continue }
do {
let frames = try decoder.decode(packet.data, into: pcm)
if frames > 0, let p = pcm.floatChannelData?[0] {
ring.write(p, count: Int(frames) * 2)
guard let pcm, pcm.frameCount > 0 else { continue }
pcm.samples.withUnsafeBufferPointer { p in
if let base = p.baseAddress {
ring.write(base, count: pcm.frameCount * pcm.channels)
}
} catch {
// One corrupt packet is not a dead stream; skip it.
log.warning("audio decode failed: \(error.localizedDescription)")
}
}
}
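
The render block's deinterleave step reduces to one index expression, `f * channels + ch`. A minimal Python sketch of the same indexing follows; it is illustrative only, and `deinterleave` is a hypothetical helper, not part of the Swift client.

```python
# Canonical wire order from the diff; a stream uses the first `channels` names.
WIRE_ORDER = ["FL", "FR", "FC", "LFE", "RL", "RR", "SL", "SR"]


def deinterleave(interleaved: list[float], channels: int) -> list[list[float]]:
    """Split interleaved samples into one list per channel.

    Sample `f` of channel `ch` lives at index `f * channels + ch`,
    the same expression the render block uses on its scratch buffer.
    """
    frames = len(interleaved) // channels  # whole frames only
    return [
        [interleaved[f * channels + ch] for f in range(frames)]
        for ch in range(channels)
    ]


if __name__ == "__main__":
    # Two frames of 5.1: each sample encodes (frame, channel) as frame * 10 + channel.
    pcm = [float(f * 10 + ch) for f in range(2) for ch in range(6)]
    for name, samples in zip(WIRE_ORDER, deinterleave(pcm, 6)):
        print(name, samples)
```

Keeping every count a multiple of `channels`, as the ring does, is what guarantees `frames` divides evenly and the channels never slip against each other.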
+36 -1
@@ -45,8 +45,9 @@ Gaming Mode automatically.
| `src/steam.ts` | Steam-shortcut launch (`AddShortcut` / `SetAppLaunchOptions` / `RunGame`) — the focus-correct stream start. |
| `src/backend.ts` | Typed `callable` bridges to `main.py`. |
| `bin/punktfunkrun.sh` | The launch wrapper the Steam shortcut targets (so the window is focusable). |
| `main.py` | Backend: `discover` / `pair` / `runner_info` / `get_settings` / `set_settings` / `kill_stream`. |
| `main.py` | Backend: `discover` / `pair` / `runner_info` / `get_settings` / `set_settings` / `kill_stream` / `check_update`. |
| `plugin.json` | Decky plugin manifest. |
| `update.json` | CI-baked `{channel, manifest}` — where `check_update()` polls (absent on dev builds). |
| `decky.pyi` | Type stub for the injected `decky` module (vendored from the template). |
### Discovery (`discover()`)
@@ -140,6 +141,40 @@ shows up in the Quick Access Menu.
> [`../../packaging/flatpak/README.md`](../../packaging/flatpak/README.md)) — install that on
> the Deck too, or the panel's Connect surfaces a `client-not-found` error.
## Updating (self-update, no store)
The plugin updates itself without the official Decky store. CI (`decky.yml`) publishes a tiny
per-channel `manifest.json` next to the zip in the Gitea registry:
```json
{"version":"0.3.123","artifact":".../punktfunk-decky/0.3.123/punktfunk.zip","sha256":"…"}
```
and bakes an `update.json` (`{channel, manifest}`) into the plugin so it knows which channel it was
installed from. The backend `check_update()` reads the **installed** version from `package.json`
(the value Decky itself reports; it does **not** read `plugin.json`), fetches the channel manifest,
and compares. When a newer build exists the frontend shows an **Update to vX** button that drives
Decky Loader's own install RPC:
```ts
window.DeckyBackend.callable("utilities/install_plugin")(artifact, "punktfunk", version, hash, /*UPDATE=*/2)
```
The loader (root) downloads the immutable per-version zip, **SHA-256-verifies** it against `hash`,
replaces `~/homebrew/plugins/punktfunk`, and hot-reloads — the unprivileged backend never writes the
root-owned plugins dir itself. `window.DeckyBackend` / `utilities/install_plugin` are loader
internals (not `@decky/api`), so every access is guarded; missing them, the button falls back to a
toast pointing at **Install Plugin from URL**.
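
As a rough illustration of that integrity check, here is a minimal Python sketch. The loader's actual implementation is not shown in this repo, so this only demonstrates the general technique; it assumes the zip bytes are already in memory and that `expected_hex` is the manifest's `sha256` field. `sha256_hex` and `matches_manifest` are hypothetical helper names.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Lowercase hex SHA-256 digest of a downloaded artifact."""
    return hashlib.sha256(data).hexdigest()


def matches_manifest(data: bytes, expected_hex: str) -> bool:
    """True when the artifact's digest equals the manifest's `sha256` value.

    The comparison is case-insensitive and ignores surrounding whitespace,
    since hex digests are sometimes published in upper case.
    """
    return sha256_hex(data) == expected_hex.strip().lower()


if __name__ == "__main__":
    artifact = b"pretend this is punktfunk.zip"
    good = hashlib.sha256(artifact).hexdigest()
    print(matches_manifest(artifact, good))         # same bytes: accepted
    print(matches_manifest(artifact + b"!", good))  # tampered bytes: rejected
```

Because the per-version zip is immutable, a digest published once in the manifest stays valid for as long as that version exists.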
> CI stamps a **plain numeric** semver per channel (`0.3.<run>` canary, `X.Y.Z` stable) into
> `package.json`. Decky's `compare-versions` orders pre-release identifiers lexically (so `ci10 < ci9`)
> — a `-ciN` suffix would mis-detect updates.
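
The ordering pitfall can be shown with a small Python sketch. It is not the plugin's actual comparison code: `parse_numeric` and `update_available` are hypothetical names, and it assumes both versions are plain dotted integers, as CI stamps them.

```python
def parse_numeric(version: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '0.3.123' into integers."""
    return tuple(int(part) for part in version.split("."))


def update_available(installed: str, manifest_version: str) -> bool:
    """True when the manifest advertises a strictly newer numeric version."""
    return parse_numeric(manifest_version) > parse_numeric(installed)


if __name__ == "__main__":
    # Lexical comparison gets multi-digit counters wrong:
    print("ci10" < "ci9")        # True: "1" sorts before "9"
    print("0.3.10" < "0.3.9")    # True as strings, although 10 > 9
    # Numeric tuples compare component by component:
    print(update_available("0.3.9", "0.3.10"))     # True
    print(update_available("0.3.123", "0.3.123"))  # False
```

Comparing integer tuples sidesteps the lexical trap entirely, which is the reason for stamping plain numeric versions.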
**Optional — native Updates tab:** Decky's store is single-source (a custom store URL *replaces* the
official catalog), so punktfunk doesn't ship one by default. A user who wants the native update badge
can point Decky → Settings → **Custom store** at a punktfunk-only store JSON — not recommended if you
use other plugins, since it hides the official catalog.
## Limitations / next steps
- **Needs on-Deck validation in Gaming Mode**: the Steam-shortcut launch (`AddShortcut` /
+3 -1
@@ -31,4 +31,6 @@ fi
echo "punktfunkrun: streaming $APPID --connect $PF_HOST" >&2
# exec so the flatpak client IS the game process — when it exits, Steam ends the "game" and
# Gaming Mode reclaims focus automatically (no manual refocus needed).
exec "$FLATPAK" run --arch=x86_64 "$APPID" --connect "$PF_HOST"
# --fullscreen: present the stream chrome-less and fullscreen (the client also auto-detects the
# Deck/gamescope env, and ignores the flag harmlessly on older builds that predate it).
exec "$FLATPAK" run --arch=x86_64 "$APPID" --connect "$PF_HOST" --fullscreen
+141 -4
@@ -17,6 +17,8 @@ The backend's jobs are the things Steam can't do:
* **get_settings() / set_settings()** — read/write the flatpak client's stream settings JSON
(resolution / bitrate / gamepad), so the Deck UI configures the stream the client reads.
* **kill_stream()** — force-stop a wedged stream (``flatpak kill``).
* **check_update()** — poll the registry's per-channel ``manifest.json`` and report whether a
newer build is available (the frontend then drives Decky's own install RPC to apply it).
The TXT-record keys parsed (``proto`` / ``fp`` / ``pair`` / ``id``) are defined by the host
advert in ``crates/punktfunk-host/src/discovery.rs``.
@@ -26,7 +28,10 @@ import asyncio
import json
import os
import shutil
import ssl
import stat
import time
import urllib.request
from pathlib import Path
import decky
@@ -37,22 +42,99 @@ APP_ID = "io.unom.Punktfunk"
# Service type advertised by punktfunk/1 hosts (matches NATIVE_SERVICE in the Rust host).
SERVICE_TYPE = "_punktfunk._udp"
# The flatpak client persists identity / known-hosts / settings under HOME/.config/punktfunk;
# inside the flatpak sandbox HOME is ~/.var/app/<APP_ID>, so the real on-disk location is this.
# The backend writes settings here so the (sandboxed) client reads them.
# The flatpak client persists identity / known-hosts / settings under HOME/.config/punktfunk.
# The sandbox HOME resolves to the REAL user home (== DECKY_USER_HOME), NOT the per-app
# ~/.var/app/<APP_ID> dir — verified on-device (`flatpak run … sh -c 'echo $HOME'` prints
# /home/deck, and the manifest's `--filesystem=~/.config/punktfunk` grants exactly that path;
# we also pass HOME=DECKY_USER_HOME into `flatpak run`, see _flatpak_env). Pointing here is what
# lets plugin settings actually reach the client AND lets us read the client's known-hosts to
# tell whether THIS device is already paired with a given host.
def _client_config_dir() -> Path:
return Path(decky.DECKY_USER_HOME) / ".var" / "app" / APP_ID / ".config" / "punktfunk"
return Path(decky.DECKY_USER_HOME) / ".config" / "punktfunk"
def _settings_path() -> Path:
return _client_config_dir() / "client-gtk-settings.json"
def _paired_fingerprints() -> set[str]:
"""Host cert fingerprints (lowercase hex) this client has PIN-paired, from the client's
known-hosts store. Keyed by fingerprint so it survives a host changing IP address."""
try:
data = json.loads((_client_config_dir() / "client-known-hosts.json").read_text())
except (OSError, json.JSONDecodeError):
return set()
hosts = data.get("hosts", []) if isinstance(data, dict) else []
return {
h["fp_hex"].lower()
for h in hosts
if isinstance(h, dict) and h.get("paired") and isinstance(h.get("fp_hex"), str)
}
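The filter above can be tried against a hand-written sample. This is a minimal sketch, not the plugin's code: `paired_fingerprints` is a hypothetical pure-function variant that takes the JSON text instead of reading the known-hosts file, and the sample document only contains the fields the function reads (`hosts`, `fp_hex`, `paired`); the real file may carry more.

```python
import json


def paired_fingerprints(raw: str) -> set[str]:
    """Hypothetical variant of _paired_fingerprints that parses a JSON string."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return set()
    hosts = data.get("hosts", []) if isinstance(data, dict) else []
    return {
        h["fp_hex"].lower()
        for h in hosts
        if isinstance(h, dict) and h.get("paired") and isinstance(h.get("fp_hex"), str)
    }


# Assumed sample: one paired host, one unpaired host, one entry without a fingerprint.
sample = json.dumps({"hosts": [
    {"fp_hex": "AB12", "paired": True},
    {"fp_hex": "cd34", "paired": False},
    {"paired": True},
]})

paired = paired_fingerprints(sample)
print(paired)            # only the paired entry survives, lowercased
print("ab12" in paired)  # the lookup discover() performs per advertised host
```

Lowercasing on both sides is what lets the mDNS `fp` value match regardless of how the client wrote the hex digits.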
def _runner_path() -> str:
"""Absolute path to the launch wrapper shipped with the plugin (bin/punktfunkrun.sh)."""
return str(Path(decky.DECKY_PLUGIN_DIR) / "bin" / "punktfunkrun.sh")
# ----------------------------------------------------------------------------------------
# Self-update check (no Decky store). The plugin is distributed via "Install Plugin from
# URL" pointing at our Gitea generic registry, so the official store never sees it and
# can't offer updates. Instead the backend polls a tiny per-channel ``manifest.json`` the
# CI publishes next to the zip, compares it to the installed version, and the frontend
# offers a one-tap update that drives Decky's own (root, privileged) install RPC. The
# channel + manifest URL are baked into ``update.json`` by CI (.gitea/workflows/decky.yml);
# a dev/sideload build has no ``update.json`` and update checks are simply disabled.
_UPDATE_TTL_S = 1800.0 # cache a successful check for 30 min (the QAM remounts often)
_update_cache: dict = {"at": 0.0, "data": None}
def _update_config() -> dict:
"""The CI-baked ``{channel, manifest}`` next to the plugin (absent on dev builds)."""
try:
return json.loads((Path(decky.DECKY_PLUGIN_DIR) / "update.json").read_text())
except (OSError, json.JSONDecodeError):
return {}
def _installed_version() -> str:
"""The version Decky itself reports for this plugin — it reads ``package.json`` (NOT
plugin.json), so the CI stamps the build version there."""
try:
pkg = json.loads((Path(decky.DECKY_PLUGIN_DIR) / "package.json").read_text())
return str(pkg.get("version", "0.0.0"))
except (OSError, json.JSONDecodeError):
return "0.0.0"
def _semver_tuple(v: str) -> tuple[int, int, int]:
"""A tolerant (major, minor, patch) tuple for ``>`` comparison. We control the version
format (plain numeric ``X.Y.Z`` on both channels), so leading-int-per-component is
enough; any pre-release suffix is dropped before comparing."""
parts: list[int] = []
for comp in str(v).split("-", 1)[0].split(".")[:3]:
digits = ""
for ch in comp:
if ch.isdigit():
digits += ch
else:
break
parts.append(int(digits) if digits else 0)
while len(parts) < 3:
parts.append(0)
return (parts[0], parts[1], parts[2])
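To see what the tolerant comparison accepts, here is a small standalone sketch. `version_key` is a hypothetical restatement of `_semver_tuple` (it uses a regex rather than the character loop) and is meant only to illustrate the rule described in the docstring.

```python
import re


def version_key(v: str) -> tuple[int, int, int]:
    """Hypothetical stand-in for _semver_tuple: leading ASCII digits of each of the
    first three dot-separated components; '-suffix' dropped; missing parts are 0."""
    core = str(v).split("-", 1)[0]
    parts = []
    for comp in core.split(".")[:3]:
        m = re.match(r"[0-9]+", comp)
        parts.append(int(m.group()) if m else 0)
    parts += [0] * (3 - len(parts))
    return (parts[0], parts[1], parts[2])


# Tuples compare numerically per component, so 1.10.0 is newer than 1.9.3
# (a plain string comparison would get this wrong).
print(version_key("1.10.0") > version_key("1.9.3"))
# The pre-release suffix is ignored, so these two compare as equal.
print(version_key("2.0.0-canary.5") == version_key("2.0.0"))
# Short versions are padded with zeros.
print(version_key("1.2"))
```

One consequence of dropping the suffix: a `-canary` build of a version never counts as older than the plain release of the same version, which is acceptable here because both channels publish plain `X.Y.Z`.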
def _fetch_json(url: str, timeout: float = 8.0) -> dict:
"""Blocking HTTPS GET of a small JSON document (run in an executor)."""
req = urllib.request.Request(
url, headers={"Accept": "application/json", "User-Agent": "punktfunk-decky"}
)
ctx = ssl.create_default_context()
with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
return json.loads(resp.read().decode("utf-8", errors="replace"))
def _flatpak() -> str | None:
return shutil.which("flatpak") or (
"/usr/bin/flatpak" if Path("/usr/bin/flatpak").exists() else None
@@ -179,6 +261,13 @@ class Plugin:
if stderr:
decky.logger.debug("avahi-browse stderr: %s", stderr.decode(errors="replace"))
hosts = _parse_avahi_browse(stdout.decode(errors="replace"))
# Mark which hosts THIS device has already paired (by cert fingerprint), so the UI can
# show "Stream" instead of "Pair" — the mDNS `pair` field is the host's policy, not our
# per-device pairing state.
paired = _paired_fingerprints()
for h in hosts:
fp = h.get("fp") or ""
h["paired"] = bool(fp) and fp.lower() in paired
decky.logger.info("discovered %d punktfunk host(s)", len(hosts))
return hosts
@@ -279,6 +368,54 @@ class Plugin:
return {"ok": False}
return {"ok": True}
async def check_update(self, force: bool = False) -> dict:
"""Is a newer build available in our registry? Compares the installed version
(``package.json``) against the per-channel ``manifest.json`` the CI publishes, and
returns everything the frontend needs to drive Decky's install RPC. Non-fatal: any
failure (no channel baked in, network down) returns ``update_available: False``.
"""
current = _installed_version()
cfg = _update_config()
result = {
"current": current,
"latest": current,
"artifact": "",
"hash": "",
"channel": str(cfg.get("channel", "")),
"update_available": False,
}
manifest_url = cfg.get("manifest")
if not manifest_url:
result["error"] = "update-channel-unknown" # dev / sideloaded build
return result
now = time.monotonic()
cached = _update_cache["data"]
if not force and cached and (now - _update_cache["at"]) < _UPDATE_TTL_S:
return cached
try:
loop = asyncio.get_running_loop()
manifest = await loop.run_in_executor(None, _fetch_json, manifest_url)
except Exception as exc: # noqa: BLE001
decky.logger.warning("update check failed: %s", exc)
result["error"] = "fetch-failed"
return result # transient — don't cache, retry next open
latest = str(manifest.get("version", current))
result["latest"] = latest
result["artifact"] = str(manifest.get("artifact", ""))
result["hash"] = str(manifest.get("sha256", ""))
result["update_available"] = bool(result["artifact"]) and (
_semver_tuple(latest) > _semver_tuple(current)
)
if result["update_available"]:
decky.logger.info("update available: %s -> %s (%s)", current, latest, result["channel"])
_update_cache["at"] = now
_update_cache["data"] = result
return result
# ---- Decky lifecycle ----
async def _main(self):
+13 -1
@@ -5,8 +5,9 @@ export interface Host {
name: string;
host: string;
port: number;
pair: string; // "required" | "optional"
pair: string; // "required" | "optional" — the HOST's policy
fp: string;
paired: boolean; // whether THIS device has already PIN-paired this host (by fingerprint)
}
export interface PairResult {
@@ -32,6 +33,16 @@ export interface StreamSettings {
mic_enabled: boolean;
}
export interface UpdateInfo {
current: string; // installed version (package.json)
latest: string; // newest version in our registry for this channel
artifact: string; // immutable zip URL Decky should install
hash: string; // sha256 of that zip (Decky verifies it)
channel: string; // "latest" (stable) | "canary"
update_available: boolean;
error?: string; // "update-channel-unknown" (dev build) | "fetch-failed"
}
export const discover = callable<[], Host[]>("discover");
export const pair = callable<
[host: string, port: number, pin: string, name: string],
@@ -43,3 +54,4 @@ export const setSettings = callable<[settings: StreamSettings], { ok: boolean }>
"set_settings",
);
export const killStream = callable<[], { ok: boolean }>("kill_stream");
export const checkUpdate = callable<[force: boolean], UpdateInfo>("check_update");
+269 -38
@@ -10,12 +10,22 @@ import {
PanelSectionRow,
SliderField,
Spinner,
Tabs,
ToggleField,
showModal,
staticClasses,
} from "@decky/ui";
import { definePlugin, routerHook, toaster } from "@decky/api";
import { FC, useCallback, useEffect, useState } from "react";
import {
Component,
CSSProperties,
ErrorInfo,
FC,
ReactNode,
useCallback,
useEffect,
useState,
} from "react";
import {
FaTv,
FaSyncAlt,
@@ -23,19 +33,130 @@ import {
FaLockOpen,
FaPlay,
FaArrowLeft,
FaDownload,
} from "react-icons/fa";
import {
discover,
getSettings,
pair,
setSettings,
checkUpdate,
Host,
StreamSettings,
UpdateInfo,
} from "./backend";
import { launchStream } from "./steam";
const ROUTE = "/punktfunk";
// Decky Loader exposes its already-authenticated WSRouter as a global. This is NOT part of
// @decky/api (it's a loader internal), so we treat it as optional and guard every use — on a
// loader without it we fall back to manual "Install Plugin from URL". We use it to drive
// Decky's own privileged install path (the root loader does the download + SHA-256 verify +
// extract + hot-reload), which is the only way a plugin can update itself: ~/homebrew/plugins
// is root-owned, so our unprivileged backend can't swap its own files.
declare global {
interface Window {
DeckyBackend?: {
callable: (route: string) => (...args: unknown[]) => Promise<unknown>;
};
}
}
// PluginInstallType.UPDATE in decky-loader's browser.py (INSTALL=0/REINSTALL=1/UPDATE=2/…).
const INSTALL_TYPE_UPDATE = 2;
// ----------------------------------------------------------------------------------------
// Error boundary — contains ANY render failure in our UI so a single bad render can never take
// down the whole Quick Access "Decky" section (Decky's tab-level boundary shows the generic
// "Something went wrong while displaying this content" for the entire tab when one plugin
// throws). The realistic trigger is a future Steam client update that makes a @decky/ui
// component resolve to `undefined` (React then throws "Element type is invalid"). The fallback
// is built from ONLY plain DOM elements + inline styles, so it cannot itself depend on a
// (possibly broken) Steam-internal component — it is guaranteed to render.
// ----------------------------------------------------------------------------------------
class PluginErrorBoundary extends Component<
{ children: ReactNode },
{ error: Error | null }
> {
state: { error: Error | null } = { error: null };
static getDerivedStateFromError(error: Error) {
return { error };
}
componentDidCatch(error: Error, info: ErrorInfo) {
// Surface it for diagnosis, but never rethrow — containment is the whole point.
// eslint-disable-next-line no-console
console.error("[punktfunk] contained UI render error:", error, info?.componentStack);
}
render() {
const { error } = this.state;
if (!error) return this.props.children;
return (
<div style={{ padding: "1em", lineHeight: 1.45 }}>
<div style={{ fontWeight: "bold", marginBottom: "0.4em" }}>
punktfunk couldn&apos;t draw this view
</div>
<div style={{ opacity: 0.8, marginBottom: "0.6em" }}>
The plugin hit a display error; your Steam Deck is fine. Reload punktfunk from
Decky&apos;s plugin list, or update the plugin.
</div>
<div
style={{
opacity: 0.55,
fontFamily: "monospace",
fontSize: "0.8em",
wordBreak: "break-word",
}}
>
{String(error?.message ?? error)}
</div>
</div>
);
}
}
// Checks our registry for a newer build on mount (the backend caches + is non-fatal offline).
function useUpdate() {
const [info, setInfo] = useState<UpdateInfo | null>(null);
useEffect(() => {
void checkUpdate(false)
.then(setInfo)
.catch(() => {});
}, []);
return info;
}
async function applyUpdate(info: UpdateInfo) {
try {
const backend = window.DeckyBackend;
if (backend?.callable) {
// Fire-and-forget: the loader reinstalls + reloads THIS plugin, tearing the panel down
// before any result could arrive — so never await it. Decky shows its own confirm prompt.
void backend.callable("utilities/install_plugin")(
info.artifact,
"punktfunk",
info.latest,
info.hash,
INSTALL_TYPE_UPDATE,
);
toaster.toast({
title: "punktfunk",
body: `Updating to v${info.latest}… confirm the Decky prompt.`,
});
return;
}
} catch {
// fall through to the manual path
}
toaster.toast({
title: "punktfunk",
body: "Update from Decky → Developer → Install Plugin from URL.",
});
}
// ----------------------------------------------------------------------------------------
// Discovery hook — shared by the QAM panel and the full page.
// ----------------------------------------------------------------------------------------
@@ -255,20 +376,24 @@ const SettingsSection: FC = () => {
// One host row on the full page.
// ----------------------------------------------------------------------------------------
const HostRow: FC<{ host: Host }> = ({ host }) => {
const pairRequired = host.pair === "required";
// The host's policy is `pair=required`, but if THIS device is already paired we don't need to
// pair again — show it as trusted and go straight to Stream.
const needsPair = host.pair === "required" && !host.paired;
return (
<Field
label={
<span style={{ display: "inline-flex", alignItems: "center", gap: "0.4em" }}>
{pairRequired ? <FaLock /> : <FaLockOpen />}
{needsPair ? <FaLock /> : <FaLockOpen />}
{host.name}
</span>
}
description={`${host.host}:${host.port}${pairRequired ? " · pairing required" : ""}`}
description={`${host.host}:${host.port}${
needsPair ? " · pairing required" : host.paired ? " · paired" : ""
}`}
childrenContainerWidth="max"
>
<Focusable style={{ display: "flex", gap: "0.5em" }}>
{pairRequired && (
{needsPair && (
<DialogButton
style={{ minWidth: "5em" }}
onClick={() =>
@@ -288,52 +413,129 @@ const HostRow: FC<{ host: Host }> = ({ host }) => {
};
// ----------------------------------------------------------------------------------------
// The fullscreen page (registered as the /punktfunk route).
// The fullscreen page (registered as the /punktfunk route) — a tabbed Hosts / Settings view.
// ----------------------------------------------------------------------------------------
// Bottom inset so the last control clears Gaming Mode's footer hint bar. Routed pages render
// *under* that bar otherwise — that's why the last Stream-settings row was getting hidden. The
// value is generous on purpose (and harmless where the tab area already insets); tune to taste.
const SAFE_BOTTOM = "80px";
// Each tab is its own scroll area so long content is always reachable above the footer.
const tabScroll: CSSProperties = {
height: "100%",
overflowY: "auto",
padding: "0.5em 2.5em",
paddingBottom: SAFE_BOTTOM,
boxSizing: "border-box",
};
const HostsTab: FC<{
hosts: Host[];
scanning: boolean;
refresh: () => void;
}> = ({ hosts, scanning, refresh }) => (
<div style={tabScroll}>
<Field
label="Discover"
description={
scanning
? "Scanning the LAN…"
: `${hosts.length} host${hosts.length === 1 ? "" : "s"} on your network`
}
childrenContainerWidth="max"
bottomSeparator={hosts.length ? "standard" : "none"}
>
<DialogButton style={{ minWidth: "8em" }} disabled={scanning} onClick={refresh}>
{scanning ? (
<Spinner style={{ height: "1em", marginRight: "0.5em" }} />
) : (
<FaSyncAlt style={{ marginRight: "0.5em" }} />
)}
{scanning ? "Scanning…" : "Refresh"}
</DialogButton>
</Field>
{hosts.length === 0 && !scanning && (
<Field
focusable={false}
description="No punktfunk hosts found. Make sure a host is running on the same network."
>
No hosts found
</Field>
)}
{hosts.map((h) => (
<HostRow key={h.fp || `${h.host}:${h.port}`} host={h} />
))}
</div>
);
const SettingsTab: FC = () => (
<div style={tabScroll}>
<SettingsSection />
</div>
);
const PunktfunkPage: FC = () => {
const { hosts, scanning, refresh } = useHosts();
const update = useUpdate();
const [tab, setTab] = useState("hosts");
return (
<div
style={{
marginTop: "40px",
height: "calc(100% - 40px)",
overflowY: "auto",
padding: "0 2.5em 2.5em",
display: "flex",
flexDirection: "column",
}}
>
<Focusable style={{ display: "flex", alignItems: "center", gap: "1em", marginBottom: "1em" }}>
<Focusable
style={{
display: "flex",
alignItems: "center",
gap: "1em",
padding: "0 2.5em",
marginBottom: "0.4em",
flexShrink: 0,
}}
>
<DialogButton
style={{ width: "3em", minWidth: "3em" }}
style={{ width: "3em", minWidth: "3em", padding: 0 }}
onClick={() => Navigation.NavigateBack()}
>
<FaArrowLeft />
</DialogButton>
<div className={staticClasses.Title} style={{ flex: 1 }}>
<div className={staticClasses?.Title} style={{ flex: 1, margin: 0 }}>
punktfunk
</div>
<DialogButton style={{ width: "10em" }} disabled={scanning} onClick={refresh}>
{scanning ? (
<Spinner style={{ height: "1em", marginRight: "0.5em" }} />
) : (
<FaSyncAlt style={{ marginRight: "0.5em" }} />
)}
{scanning ? "Scanning…" : "Refresh"}
</DialogButton>
{update?.update_available && (
<DialogButton style={{ minWidth: "9em" }} onClick={() => applyUpdate(update)}>
<FaDownload style={{ marginRight: "0.4em" }} />
Update v{update.latest}
</DialogButton>
)}
</Focusable>
<div style={{ fontSize: "1.1em", fontWeight: "bold", margin: "0.5em 0" }}>Hosts</div>
{hosts.length === 0 && !scanning && (
<Field focusable={false}>No hosts discovered on the LAN.</Field>
)}
{hosts.map((h) => (
<HostRow key={h.fp || `${h.host}:${h.port}`} host={h} />
))}
<div style={{ fontSize: "1.1em", fontWeight: "bold", margin: "1.5em 0 0.5em" }}>
Stream settings
<div style={{ flex: 1, minHeight: 0 }}>
<Tabs
activeTab={tab}
onShowTab={(id: string) => setTab(id)}
autoFocusContents
tabs={[
{
id: "hosts",
title: "Hosts",
content: <HostsTab hosts={hosts} scanning={scanning} refresh={refresh} />,
},
{
id: "settings",
title: "Settings",
content: <SettingsTab />,
},
]}
/>
</div>
<SettingsSection />
</div>
);
};
@@ -343,9 +545,25 @@ const PunktfunkPage: FC = () => {
// ----------------------------------------------------------------------------------------
const QamPanel: FC = () => {
const { hosts, scanning, refresh } = useHosts();
const update = useUpdate();
return (
<>
{update?.update_available && (
<PanelSection title="Update">
<PanelSectionRow>
<ButtonItem
layout="below"
onClick={() => applyUpdate(update)}
label={`v${update.current} → v${update.latest}`}
>
<FaDownload style={{ marginRight: "0.5em" }} />
Update punktfunk
</ButtonItem>
</PanelSectionRow>
</PanelSection>
)}
<PanelSection title="punktfunk">
<PanelSectionRow>
<ButtonItem
@@ -378,25 +596,25 @@ const QamPanel: FC = () => {
</PanelSectionRow>
)}
{hosts.map((h) => {
const pairRequired = h.pair === "required";
const needsPair = h.pair === "required" && !h.paired;
return (
<PanelSectionRow key={h.fp || `${h.host}:${h.port}`}>
<ButtonItem
layout="below"
onClick={() =>
pairRequired
needsPair
? showModal(<PairModal host={h} onPaired={() => startStream(h)} />)
: startStream(h)
}
label={
<span style={{ display: "inline-flex", alignItems: "center", gap: "0.4em" }}>
{pairRequired ? <FaLock /> : <FaLockOpen />}
{needsPair ? <FaLock /> : <FaLockOpen />}
{h.name}
</span>
}
description={`${h.host}:${h.port}`}
description={`${h.host}:${h.port}${h.paired ? " · paired" : ""}`}
>
{pairRequired ? "Pair & Stream" : "Stream"}
{needsPair ? "Pair & Stream" : "Stream"}
</ButtonItem>
</PanelSectionRow>
);
@@ -406,12 +624,25 @@ const QamPanel: FC = () => {
);
};
// Full page behind the boundary — registered as the /punktfunk route.
const PunktfunkRoute: FC = () => (
<PluginErrorBoundary>
<PunktfunkPage />
</PluginErrorBoundary>
);
export default definePlugin(() => {
routerHook.addRoute(ROUTE, PunktfunkPage, { exact: true });
routerHook.addRoute(ROUTE, PunktfunkRoute, { exact: true });
return {
name: "punktfunk",
titleView: <div className={staticClasses.Title}>punktfunk</div>,
content: <QamPanel />,
// `staticClasses?.Title` is guarded so a future client that drops the export can't throw
// at plugin-load time (an error boundary only catches render-time, not load-time, errors).
titleView: <div className={staticClasses?.Title}>punktfunk</div>,
content: (
<PluginErrorBoundary>
<QamPanel />
</PluginErrorBoundary>
),
icon: <FaTv />,
onDismount() {
routerHook.removeRoute(ROUTE);
+22 -2
@@ -24,12 +24,31 @@ declare const SteamClient: {
SetShortcutExe(appId: number, exe: string): void;
SetShortcutStartDir(appId: number, dir: string): void;
SetAppLaunchOptions(appId: number, options: string): void;
SetAppHidden(appId: number, hidden: boolean): void;
RunGame(gameId: string, _unused: string, _i: number, _j: number): void;
TerminateApp(gameId: string, _b: boolean): void;
};
};
// Steam removed `SteamClient.Apps.SetAppHidden`. Hiding a non-Steam shortcut now goes through
// `collectionStore.SetAppsAsHidden([appId], true)` — but that looks the app up in appStore, which
// only registers a freshly-created shortcut a moment later (calling it immediately throws on a
// null overview). So hiding is BEST-EFFORT + DEFERRED and must NEVER block the launch.
declare const collectionStore:
| { SetAppsAsHidden?: (appIds: number[], hidden: boolean) => void }
| undefined;
function hideShortcut(appId: number): void {
const attempt = () => {
try {
collectionStore?.SetAppsAsHidden?.([appId], true);
} catch {
/* overview not registered yet, or the API changed — cosmetic, ignore */
}
};
attempt(); // succeeds immediately for an already-registered (reused) shortcut
setTimeout(attempt, 2500); // fresh shortcut: retry once its app overview lands
}
const SHORTCUT_NAME = "punktfunk";
// The 64-bit "gameid" RunGame wants, derived from a 32-bit non-Steam shortcut appId: the
@@ -88,7 +107,8 @@ async function ensureShortcut(): Promise<number> {
);
SteamClient.Apps.SetShortcutName(appId, SHORTCUT_NAME);
// Hide it from the library — it's an implementation detail, launched programmatically.
SteamClient.Apps.SetAppHidden(appId, true);
// Best-effort + deferred (see hideShortcut); never let it block the launch.
hideShortcut(appId);
rememberAppId(appId);
return appId;
}
+95 -2
@@ -22,6 +22,8 @@ struct App {
gamepad: crate::gamepad::GamepadService,
/// One session at a time — ignore connects while one is starting/running.
busy: std::cell::Cell<bool>,
/// Steam Deck / Gaming-Mode launch: fullscreen the window (chrome-less) when a stream starts.
fullscreen: bool,
}
impl App {
@@ -41,7 +43,13 @@ pub fn run() -> glib::ExitCode {
if let Some(pin) = arg_value("--pair") {
return headless_pair(&pin);
}
let app = adw::Application::builder().application_id(APP_ID).build();
let mut builder = adw::Application::builder().application_id(APP_ID);
// Screenshot mode launches the app once per scene back-to-back; NON_UNIQUE keeps each
// launch its own primary instance instead of forwarding to a still-registered name.
if shot_scene().is_some() {
builder = builder.flags(gtk::gio::ApplicationFlags::NON_UNIQUE);
}
let app = builder.build();
app.connect_activate(build_ui);
// GTK doesn't see our argv (`--connect` is handled in `build_ui`); an empty argv also
// keeps GApplication from rejecting unknown options.
@@ -56,6 +64,20 @@ fn arg_value(flag: &str) -> Option<String> {
.filter(|v| !v.starts_with("--"))
}
/// True if argv contains `flag` (a valueless switch).
fn arg_flag(flag: &str) -> bool {
std::env::args().any(|a| a == flag)
}
/// Run the stream fullscreen with no window chrome — the Steam Deck / Gaming-Mode launch path.
/// The Decky wrapper passes `--fullscreen`; we also honor the Deck/gamescope env as a fallback
/// so a manual launch under Gaming Mode does the right thing too.
fn fullscreen_mode() -> bool {
arg_flag("--fullscreen")
|| std::env::var_os("SteamDeck").is_some()
|| std::env::var_os("GAMESCOPE_WAYLAND_DISPLAY").is_some()
}
/// Run the SPAKE2 PIN ceremony without a GTK window and persist the verified host to the
/// known-hosts store as paired, so a later `--connect` connects silently. Same identity
/// store the streaming path uses (same binary), so pairing here makes the stream work.
@@ -161,6 +183,7 @@ fn build_ui(gtk_app: &adw::Application) {
identity,
gamepad: crate::gamepad::GamepadService::start(),
busy: std::cell::Cell::new(false),
fullscreen: fullscreen_mode(),
});
let hosts_page = crate::ui_hosts::new(
@@ -182,11 +205,65 @@ fn build_ui(gtk_app: &adw::Application) {
nav.add(&hosts_page);
window.present();
// CI screenshot mode: render one scripted, host-free scene and signal readiness
// (clients/linux/tools/screenshots.sh). Mutually exclusive with a real connect.
if let Some(scene) = shot_scene() {
run_shot(app, &scene);
return;
}
if let Some(req) = cli_connect_request() {
initiate_connect(app, req);
}
}
/// `PUNKTFUNK_SHOT_SCENE`, when set, selects a scripted host-free scene for CI screenshots.
fn shot_scene() -> Option<String> {
std::env::var("PUNKTFUNK_SHOT_SCENE")
.ok()
.filter(|s| !s.is_empty())
}
/// Render one mock-populated, host-free scene over the already-presented window, then print
/// `PF_SHOT_READY` once it has had a moment to map + settle so the driver knows when to capture.
/// No `NativeClient` or session is created. The stream scene is deliberately absent — its page
/// requires a live connector (`ui_stream::new` takes an `Arc<NativeClient>`).
fn run_shot(app: Rc<App>, scene: &str) {
// A plausible host for the trust/pair dialogs (fp_hex is 64 hex chars, like a real SHA-256).
let mock_req = || ConnectRequest {
name: "Living Room PC".to_string(),
addr: "192.168.1.42".to_string(),
port: 9777,
fp_hex: Some(
"9f8e7d6c5b4a39281706f5e4d3c2b1a0998877665544332211ffeeddccbbaa00".to_string(),
),
pair_optional: true,
};
match scene {
// The saved-hosts grid reads ~/.config/punktfunk/client-known-hosts.json, which the
// driver seeds — so the already-shown hosts page is the scene; nothing to do here.
"hosts" | "02-hosts" => {}
"settings" | "03-settings" => {
crate::ui_settings::show(&app.window, app.settings.clone(), &app.gamepad);
}
"trust" | "04-trust" => tofu_dialog(app.clone(), mock_req()),
"pair" | "05-pair" => pin_dialog(app.clone(), mock_req()),
other => tracing::warn!("unknown PUNKTFUNK_SHOT_SCENE={other:?}; showing hosts only"),
}
let settle_ms = std::env::var("PUNKTFUNK_SHOT_SETTLE_MS")
.ok()
.and_then(|v| v.parse().ok())
.unwrap_or(900);
let scene = scene.to_string();
glib::timeout_add_local_once(std::time::Duration::from_millis(settle_ms), move || {
use std::io::Write as _;
println!("PF_SHOT_READY scene={scene}");
let _ = std::io::stdout().flush();
});
}
/// The trust gate in front of every connect. The host is the policy authority (it
/// advertises `pair=optional` only when it accepts unpaired clients); the client renders
/// its trust UI from that:
@@ -375,6 +452,7 @@ fn speed_test(app: Rc<App>, req: ConnectRequest) {
GamepadPref::Auto,
0, // bitrate_kbps (host default)
0, // video_caps: the Linux client has no 10-bit/HDR present path yet
2, // audio_channels: speed-test probe, stereo
None, // launch: speed-test probe connect, no game
pin,
Some(identity),
@@ -443,11 +521,19 @@ fn resolve_mode(app: &App) -> punktfunk_core::config::Mode {
refresh_hz: s.refresh_hz,
};
if mode.width == 0 || mode.refresh_hz == 0 {
// Prefer the monitor the window is on; fall back to the display's first monitor. On a
// `--connect` launch the window may not be mapped yet when this runs, and without the
// fallback we'd drop to the 1920×1080 floor below — wrong on the Deck (1280×800).
let monitor = app
.window
.surface()
.zip(gdk::Display::default())
.and_then(|(surf, d)| d.monitor_at_surface(&surf));
.and_then(|(surf, d)| d.monitor_at_surface(&surf))
.or_else(|| {
gdk::Display::default()
.and_then(|d| d.monitors().item(0))
.and_then(|o| o.downcast::<gdk::Monitor>().ok())
});
if let Some(m) = monitor {
let geo = m.geometry();
let scale = m.scale_factor().max(1);
@@ -488,6 +574,7 @@ fn start_session(app: Rc<App>, req: ConnectRequest, pin: Option<[u8; 32]>) {
},
bitrate_kbps: s.bitrate_kbps,
mic_enabled: s.mic_enabled,
audio_channels: s.audio_channels,
pin,
identity: app.identity.clone(),
};
@@ -540,6 +627,12 @@ fn start_session(app: Rc<App>, req: ConnectRequest, pin: Option<[u8; 32]>) {
&title,
);
app.nav.push(&p.page);
// Steam Deck / Gaming Mode: gamescope fullscreens the window but GTK doesn't
// know it, so its header bar stays drawn. Enter GTK fullscreen explicitly —
// the stream page's `connect_fullscreened_notify` then hides all chrome.
if app.fullscreen {
app.window.fullscreen();
}
page = Some(p);
}
SessionEvent::Stats(s) => {
+21 -10
@@ -27,16 +27,17 @@ pub struct AudioPlayer {
}
impl AudioPlayer {
/// Spawn the PipeWire playback thread. Failure (no PipeWire in the session) is
/// survivable — the caller streams video-only.
pub fn spawn() -> Result<AudioPlayer> {
/// Spawn the PipeWire playback thread for `channels` (2/6/8, canonical wire order
/// FL FR FC LFE RL RR SL SR). Failure (no PipeWire in the session) is survivable — the
/// caller streams video-only.
pub fn spawn(channels: u32) -> Result<AudioPlayer> {
// 64 × 5 ms = 320 ms of slack between the pump and the PipeWire loop.
let (pcm_tx, pcm_rx) = std::sync::mpsc::sync_channel::<Vec<f32>>(64);
let (quit_tx, quit_rx) = pipewire::channel::channel::<Terminate>();
let thread = std::thread::Builder::new()
.name("punktfunk-audio".into())
.spawn(move || {
if let Err(e) = pw_thread(pcm_rx, quit_rx) {
if let Err(e) = pw_thread(pcm_rx, quit_rx, channels as usize) {
tracing::warn!(error = %e, "audio playback thread ended");
}
})
@@ -48,8 +49,8 @@ impl AudioPlayer {
})
}
/// Queue one interleaved-stereo f32 chunk. Drops the chunk if the PipeWire side is
/// wedged (the renderer conceals the gap; never block the session pump).
/// Queue one interleaved f32 chunk (in the session's channel layout). Drops the chunk if the
/// PipeWire side is wedged (the renderer conceals the gap; never block the session pump).
pub fn push(&self, pcm: Vec<f32>) {
if let Err(TrySendError::Disconnected(_)) = self.pcm_tx.try_send(pcm) {
// Thread already dead — Drop will reap it; nothing to do per-chunk.
@@ -71,11 +72,14 @@ struct PlayerData {
rx: Receiver<Vec<f32>>,
ring: VecDeque<f32>,
primed: bool,
/// Interleaved channel count this stream was opened with (2/6/8).
channels: usize,
}
fn pw_thread(
pcm_rx: Receiver<Vec<f32>>,
quit_rx: pipewire::channel::Receiver<Terminate>,
channels: usize,
) -> Result<()> {
use pipewire as pw;
use pw::{properties::properties, spa};
@@ -115,6 +119,7 @@ fn pw_thread(
rx: pcm_rx,
ring: VecDeque::new(),
primed: false,
channels,
};
let _listener = stream
@@ -130,19 +135,19 @@ fn pw_thread(
while let Ok(chunk) = ud.rx.try_recv() {
ud.ring.extend(chunk);
}
let stride = 4 * CHANNELS; // F32LE interleaved
let stride = 4 * ud.channels; // F32LE interleaved
let datas = buffer.datas_mut();
if datas.is_empty() {
return;
}
let data = &mut datas[0];
let want_frames = data.data().map(|s| s.len() / stride).unwrap_or(0);
let want = want_frames * CHANNELS;
let want = want_frames * ud.channels;
// Adaptive jitter buffer (same shape as the host's virtual mic): prime to
// ~3 quanta, cap at ~1 quantum of slack beyond that, re-prime after a
// genuine drain.
let target = (3 * want).clamp(720 * CHANNELS, 9600 * CHANNELS);
let target = (3 * want).clamp(720 * ud.channels, 9600 * ud.channels);
while ud.ring.len() > target.max(want) + want {
ud.ring.pop_front();
}
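The jitter-buffer sizing above is easier to follow with numbers. The sketch below redoes the arithmetic in Python; `jitter_target` and `trim_threshold` are hypothetical helper names, and the millisecond conversion assumes a 48 kHz stream (the rate the Opus decoder is opened with; the value of `SAMPLE_RATE` is not visible in this hunk).

```python
RATE_HZ = 48_000  # assumption: matches the 48_000 used for the Opus decoder


def jitter_target(want_frames: int, channels: int) -> int:
    """Samples to buffer before playback: about 3 quanta, clamped to 720..9600 frames."""
    want = want_frames * channels
    return min(max(3 * want, 720 * channels), 9600 * channels)


def trim_threshold(want_frames: int, channels: int) -> int:
    """Ring length above which the oldest samples are dropped (one extra quantum of slack)."""
    want = want_frames * channels
    return max(jitter_target(want_frames, channels), want) + want


def frames_to_ms(frames: int) -> float:
    return frames * 1000 / RATE_HZ


print(frames_to_ms(720), frames_to_ms(9600))   # clamp bounds expressed in milliseconds
print(jitter_target(256, 2))                   # 256-frame quantum, stereo
print(jitter_target(128, 2))                   # small quantum: the lower clamp applies
print(jitter_target(1024, 8))                  # 7.1 scales with the channel count
print(trim_threshold(256, 2))
```

Scaling both clamp bounds by the channel count keeps the buffered duration the same for stereo, 5.1 and 7.1, since the ring stores interleaved samples rather than frames.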
@@ -182,7 +187,13 @@ fn pw_thread(
let mut info = AudioInfoRaw::new();
info.set_format(AudioFormat::F32LE);
info.set_rate(SAMPLE_RATE);
info.set_channels(CHANNELS as u32);
info.set_channels(channels as u32);
// Channel positions in canonical wire order (FL FR FC LFE RL RR SL SR) so PipeWire routes each
// slot to the matching speaker (and downmixes when the sink has fewer). Identity, no permute.
let order = punktfunk_core::audio::spa_positions(channels as u8);
let mut positions = [0u32; 64];
positions[..order.len()].copy_from_slice(order);
info.set_position(positions);
let obj = pw::spa::pod::Object {
type_: pw::spa::utils::SpaTypes::ObjectParamFormat.as_raw(),
id: pw::spa::param::ParamType::EnumFormat.as_raw(),
+50 -7
@@ -20,6 +20,8 @@ pub struct SessionParams {
pub compositor: CompositorPref,
pub gamepad: GamepadPref,
pub bitrate_kbps: u32,
/// Requested audio channel count (2/6/8); the host echoes the resolved value.
pub audio_channels: u8,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Pinned host fingerprint; `None` = trust on first use (caller persists the observed one).
@@ -83,6 +85,42 @@ fn now_ns() -> u64 {
.unwrap_or(0)
}
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
48_000,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
48_000, l.streams, l.coupled, l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
fn pump(
params: SessionParams,
ev_tx: async_channel::Sender<SessionEvent>,
@@ -96,7 +134,8 @@ fn pump(
params.compositor,
params.gamepad,
params.bitrate_kbps,
-0, // video_caps: the Linux client has no 10-bit/HDR present path yet
+0, // video_caps: the Linux client has no 10-bit/HDR present path yet
+params.audio_channels,
None, // launch: the Linux client has no library picker yet
params.pin,
Some(params.identity),
@@ -134,11 +173,14 @@ fn pump(
}
};
// Audio is best-effort: a session without it still streams. Gamepads are the
-// app-lifetime service's job (the UI attaches it on Connected).
-let player = audio::AudioPlayer::spawn()
+// app-lifetime service's job (the UI attaches it on Connected). Build the decoder + playback
+// from the host-RESOLVED channel count (never the request), so an older/clamping host that
+// resolves stereo is decoded as stereo.
+let channels = connector.audio_channels;
+let player = audio::AudioPlayer::spawn(channels as u32)
.map_err(|e| tracing::warn!(error = %e, "audio disabled"))
.ok();
-let mut opus_dec = opus::Decoder::new(48_000, opus::Channels::Stereo)
+let mut opus_dec = AudioDec::new(channels)
.map_err(|e| tracing::warn!(error = %e, "opus decoder failed — audio disabled"))
.ok();
let _mic = params
@@ -157,8 +199,8 @@ fn pump(
let mut bytes_n = 0u64;
let mut decode_us_sum = 0u64;
let mut lat_us: Vec<u64> = Vec::with_capacity(256);
-let mut pcm = vec![0f32; 5760 * 2]; // decode scratch: max Opus frame (120 ms stereo)
-// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
+let mut pcm = vec![0f32; 5760 * channels as usize]; // scratch: max Opus frame (120 ms) × channels
+// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut last_dropped = connector.frames_dropped();
let mut last_kf_req: Option<Instant> = None;
@@ -221,7 +263,8 @@ fn pump(
while let Ok(pkt) = connector.next_audio(Duration::ZERO) {
if let (Some(player), Some(dec)) = (&player, opus_dec.as_mut()) {
match dec.decode_float(&pkt.data, &mut pcm, false) {
-Ok(samples) => player.push(pcm[..samples * 2].to_vec()),
+// `samples` is per-channel; the interleaved frame is `samples * channels`.
+Ok(samples) => player.push(pcm[..samples * channels as usize].to_vec()),
Err(e) => tracing::debug!(error = %e, "opus decode"),
}
}
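The buffer arithmetic in this file (scratch sized for the longest Opus frame, then sliced by the per-channel sample count) can be checked with a small standalone sketch. The helper names are hypothetical; the constants follow the comments in the diff (48 kHz, 120 ms maximum frame).

```rust
const SAMPLE_RATE: usize = 48_000;
const MAX_FRAME_MS: usize = 120;
/// Longest Opus frame per channel: 120 ms at 48 kHz = 5760 samples.
const MAX_SAMPLES_PER_CHANNEL: usize = SAMPLE_RATE * MAX_FRAME_MS / 1000;

/// Size of the decode scratch buffer for an interleaved layout.
fn scratch_len(channels: u8) -> usize {
    MAX_SAMPLES_PER_CHANNEL * channels as usize
}

/// The decoder reports samples PER CHANNEL; the interleaved frame that
/// should be pushed to playback is `samples_per_channel * channels` floats.
fn interleaved(pcm: &[f32], samples_per_channel: usize, channels: u8) -> &[f32] {
    &pcm[..samples_per_channel * channels as usize]
}
```

With a hard-coded `* 2`, a 20 ms 5.1 frame (960 samples per channel) would pass only 1920 of its 5760 floats to playback, which is the bug the changed slice avoids.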
+12
@@ -90,6 +90,14 @@ impl KnownHosts {
self.hosts.iter().find(|h| h.addr == addr && h.port == port)
}
/// Forget the entry with this fingerprint. Returns true if one was removed (the user
/// will have to pair/trust again to reconnect).
pub fn remove_by_fp(&mut self, fp_hex: &str) -> bool {
let before = self.hosts.len();
self.hosts.retain(|h| h.fp_hex != fp_hex);
self.hosts.len() != before
}
/// Insert or refresh an entry, keyed by fingerprint. `paired` only ever upgrades
/// (a later TOFU connect must not demote a PIN-paired host).
pub fn upsert(&mut self, entry: KnownHost) {
@@ -124,6 +132,9 @@ pub struct Settings {
pub inhibit_shortcuts: bool,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the decoder + playback layout.
pub audio_channels: u8,
}
impl Default for Settings {
@@ -137,6 +148,7 @@ impl Default for Settings {
compositor: "auto".into(),
inhibit_shortcuts: true,
mic_enabled: false,
audio_channels: 2,
}
}
}
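The `remove_by_fp` pattern above (retain everything that does not match, then compare lengths) is easy to verify standalone. This sketch uses a trimmed-down stand-in for the real `KnownHost`; its field set is an assumption made for illustration.

```rust
/// Reduced stand-in for the client's saved-host entry (illustrative fields only).
#[derive(Debug, Clone)]
struct KnownHost {
    name: String,
    fp_hex: String,
}

#[derive(Debug, Default)]
struct KnownHosts {
    hosts: Vec<KnownHost>,
}

impl KnownHosts {
    /// Forget every entry with this fingerprint; true if something was removed.
    fn remove_by_fp(&mut self, fp_hex: &str) -> bool {
        let before = self.hosts.len();
        self.hosts.retain(|h| h.fp_hex != fp_hex);
        self.hosts.len() != before
    }
}
```

Returning a `bool` lets a caller skip saving or refreshing the UI when the fingerprint was not present.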
+46
@@ -181,6 +181,52 @@ pub fn new(
// pinned connect; TOFU eligibility is irrelevant.
pair_optional: false,
};
// Forget this host (drops the pinned fingerprint — a later connect re-pairs).
// Confirmed first, since it's destructive and a misclick on the Deck is easy.
let remove_btn = gtk::Button::from_icon_name("user-trash-symbolic");
remove_btn.set_tooltip_text(Some("Remove saved host"));
remove_btn.set_valign(gtk::Align::Center);
remove_btn.add_css_class("flat");
{
let fp = k.fp_hex.clone();
let name = k.name.clone();
let saved_list = saved_list.clone();
let saved_label = saved_label.clone();
let row = row.clone();
remove_btn.connect_clicked(move |_| {
let dialog = adw::AlertDialog::new(
Some("Remove saved host?"),
Some(&format!(
"Forget “{name}”? You'll need to pair (or trust) it again to reconnect."
)),
);
dialog.add_responses(&[("cancel", "Cancel"), ("remove", "Remove")]);
dialog.set_response_appearance(
"remove",
adw::ResponseAppearance::Destructive,
);
dialog.set_default_response(Some("cancel"));
dialog.set_close_response("cancel");
{
// Scoped clones for the response handler so `row` survives for present().
let fp = fp.clone();
let saved_list = saved_list.clone();
let saved_label = saved_label.clone();
let row = row.clone();
dialog.connect_response(Some("remove"), move |_, _| {
let mut known = KnownHosts::load();
known.remove_by_fp(&fp);
let _ = known.save();
saved_list.remove(&row);
let empty = known.hosts.is_empty();
saved_list.set_visible(!empty);
saved_label.set_visible(!empty);
});
}
dialog.present(Some(&row));
});
}
row.add_suffix(&remove_btn);
let speed_btn = gtk::Button::from_icon_name("network-transmit-receive-symbolic");
speed_btn.set_tooltip_text(Some("Test network speed"));
speed_btn.set_valign(gtk::Align::Center);
+20
@@ -140,6 +140,16 @@ pub fn show(
input.add(&inhibit_row);
let audio = adw::PreferencesGroup::builder().title("Audio").build();
let surround_row = adw::ComboRow::builder()
.title("Audio channels")
.subtitle("Request stereo or surround (the host downmixes if its output has fewer)")
.model(&gtk::StringList::new(&[
"Stereo",
"5.1 Surround",
"7.1 Surround",
]))
.build();
audio.add(&surround_row);
let mic_row = adw::SwitchRow::builder()
.title("Stream microphone")
.subtitle("Send the default input device to the host's virtual microphone")
@@ -170,6 +180,11 @@ pub fn show(
compositor_row.set_selected(comp_i as u32);
inhibit_row.set_active(s.inhibit_shortcuts);
mic_row.set_active(s.mic_enabled);
surround_row.set_selected(match s.audio_channels {
6 => 1,
8 => 2,
_ => 0,
});
}
let dialog = adw::PreferencesDialog::new();
@@ -186,6 +201,11 @@ pub fn show(
.to_string();
s.inhibit_shortcuts = inhibit_row.is_active();
s.mic_enabled = mic_row.is_active();
s.audio_channels = match surround_row.selected() {
1 => 6,
2 => 8,
_ => 2,
};
s.save();
});
dialog.present(Some(parent));
+123
@@ -0,0 +1,123 @@
#!/usr/bin/env bash
# Capture host-free UI screenshots of the native Linux client under a virtual X
# display. Mirrors the iOS harness (clients/apple/tools/screenshots.sh): one app
# launch per scene (PUNKTFUNK_SHOT_SCENE), the app renders a mock-populated REAL
# view and prints `PF_SHOT_READY`, then we grab the X root window. No host, GPU, or
# live stream — only the chrome scenes (the stream page needs a live connector).
#
# cargo build --release -p punktfunk-client-linux
# bash clients/linux/tools/screenshots.sh # → clients/linux/screenshots/<scene>.png
# bash clients/linux/tools/screenshots.sh hosts pair # a subset
#
# Env knobs: BIN (client binary), OUT (output dir), GEOMETRY (Xvfb WxHxDepth),
# SETTLE (extra seconds after PF_SHOT_READY), SHOT_DISPLAY (X display), GSK_RENDERER
# (gl|ngl|cairo — gl/llvmpipe by default for full libadwaita fidelity).
set -euo pipefail
here="$(cd "$(dirname "${BASH_SOURCE[0]}")/.." && pwd)" # clients/linux
BIN="${BIN:-$here/../../target/release/punktfunk-client}"
OUT="${OUT:-$here/screenshots}"
# The client window maps at its 1100x720 default; with no WM under Xvfb it lands at the
# top-left, so keep the root just larger so the full window (incl. its CSD shadow) is
# captured by a root grab with only a thin margin to crop.
GEOMETRY="${GEOMETRY:-1280x800x24}"
SETTLE="${SETTLE:-1.2}"
SHOT_DISPLAY="${SHOT_DISPLAY:-:99}"
if [ "$#" -gt 0 ]; then SCENES=("$@"); else SCENES=(hosts settings trust pair); fi
[ -x "$BIN" ] || {
echo "client binary not found: $BIN (build it first: cargo build --release -p punktfunk-client-linux)" >&2
exit 1
}
# Isolated scratch HOME: the client generates its identity here on first run, and the
# saved-hosts grid is read from client-known-hosts.json, so seed mock hosts for the
# `hosts` scene (the dialogs/settings build their own mock state in-app).
WORK="$(mktemp -d)"
export HOME="$WORK"
mkdir -p "$HOME/.config/punktfunk"
cat >"$HOME/.config/punktfunk/client-known-hosts.json" <<'JSON'
{
"hosts": [
{ "name": "Living Room PC", "addr": "192.168.1.42", "port": 9777,
"fp_hex": "9f8e7d6c5b4a39281706f5e4d3c2b1a0998877665544332211ffeeddccbbaa00",
"paired": true },
{ "name": "Office", "addr": "192.168.1.50", "port": 9777,
"fp_hex": "a1b2c3d4e5f60718293a4b5c6d7e8f90112233445566778899aabbccddeeff00",
"paired": false }
]
}
JSON
# Software-rendered X session — no GPU/Wayland. GL/llvmpipe runs the real NGL renderer
# (cairo is documented-incomplete for 3D-transformed content / libadwaita transitions).
unset WAYLAND_DISPLAY
export DISPLAY="$SHOT_DISPLAY"
export GDK_BACKEND=x11
export LIBGL_ALWAYS_SOFTWARE=1
export GALLIUM_DRIVER="${GALLIUM_DRIVER:-llvmpipe}"
export GSK_RENDERER="${GSK_RENDERER:-gl}"
Xvfb "$SHOT_DISPLAY" -screen 0 "$GEOMETRY" -nolisten tcp >"$WORK/xvfb.log" 2>&1 &
XVFB_PID=$!
cleanup() {
kill "$XVFB_PID" 2>/dev/null || true
rm -rf "$WORK"
}
trap cleanup EXIT
# Wait for the display to accept connections.
for _ in $(seq 1 50); do
if command -v xdpyinfo >/dev/null 2>&1; then
xdpyinfo -display "$SHOT_DISPLAY" >/dev/null 2>&1 && break
else
[ -e "/tmp/.X11-unix/X${SHOT_DISPLAY#:}" ] && break
fi
sleep 0.1
done
capture() {
local out="$1"
if command -v import >/dev/null 2>&1; then
import -silent -window root "$out"
elif command -v scrot >/dev/null 2>&1; then
scrot -o "$out"
else
echo "no screenshot tool — install imagemagick or scrot" >&2
return 1
fi
}
mkdir -p "$OUT"
rc=0
for scene in "${SCENES[@]}"; do
: >"$WORK/log"
PUNKTFUNK_SHOT_SCENE="$scene" "$BIN" >"$WORK/log" 2>&1 &
pid=$!
ready=0
for _ in $(seq 1 200); do # up to ~20s
if grep -q "PF_SHOT_READY" "$WORK/log"; then
ready=1
break
fi
if ! kill -0 "$pid" 2>/dev/null; then break; fi
sleep 0.1
done
if [ "$ready" = 1 ]; then
sleep "$SETTLE"
if capture "$OUT/$scene.png"; then
echo "$scene → $OUT/$scene.png"
else
rc=1
fi
else
echo "$scene: client never signalled PF_SHOT_READY" >&2
sed 's/^/ /' "$WORK/log" >&2 || true
rc=1
fi
kill "$pid" 2>/dev/null || true
wait "$pid" 2>/dev/null || true
done
exit "$rc"
+3 -4
@@ -18,8 +18,7 @@ tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# LAN host discovery (`--discover`): browse the native `_punktfunk._udp` mDNS service the host
# advertises (same crate/version the host advertises with).
mdns-sd = "0.20"
-# Linux-only: --mic-test's Opus encoder (libopus). The mic UPLINK itself is portable —
-# only this synthetic-tone test rig needs the encoder.
-[target.'cfg(target_os = "linux")'.dependencies]
+# Opus: multistream DECODE of the host's audio plane (the surround validator) + `--mic-test`'s
+# encoder. libopus is already in the graph via `punktfunk-core`'s quic feature; this exposes the
+# name directly. Cross-platform (cmake-vendored), so the probe builds + validates everywhere.
opus = "0.3"
+51 -6
@@ -78,6 +78,10 @@ struct Args {
gamepad: GamepadPref,
/// `--bitrate KBPS` — request this encoder bitrate (kilobits/s); 0 = host default.
bitrate_kbps: u32,
/// `--audio-channels N` — request stereo (2), 5.1 (6) or 7.1 (8) audio; default 2. The probe
/// multistream-decodes the host's frames and asserts the per-channel sample count, so it's the
/// headless validator for the surround encode path.
audio_channels: u8,
/// `--launch ID` — ask the host to launch a library title in this session (a store-qualified
/// id from the host's `GET /api/v1/library`, e.g. `steam:570`). Host resolves it; `None` = none.
launch: Option<String>,
@@ -201,6 +205,11 @@ fn parse_args() -> Args {
compositor,
gamepad,
bitrate_kbps: get("--bitrate").and_then(|s| s.parse().ok()).unwrap_or(0),
audio_channels: punktfunk_core::audio::normalize_channels(
get("--audio-channels")
.and_then(|s| s.parse().ok())
.unwrap_or(2),
),
launch: get("--launch").map(str::to_string),
speed_test: get("--speed-test").and_then(|s| {
let (kbps, ms) = s.split_once(':')?;
@@ -385,13 +394,23 @@ async fn session(args: Args) -> Result<()> {
// `--launch ID` — host resolves it against its own library and runs it this session.
launch: args.launch.clone(),
// This headless tool just dumps the bitstream (no decode), so it can always claim
-// 10-bit support. Gated by env so latency runs stay on the 8-bit baseline:
-// PUNKTFUNK_CLIENT_10BIT=1 advertises VIDEO_CAP_10BIT to exercise the host Main10 path.
-video_caps: if std::env::var_os("PUNKTFUNK_CLIENT_10BIT").is_some() {
-punktfunk_core::quic::VIDEO_CAP_10BIT
-} else {
-0
+// 10-bit / 4:4:4 support. Gated by env so latency runs stay on the 8-bit 4:2:0 baseline:
+// PUNKTFUNK_CLIENT_10BIT=1 advertises VIDEO_CAP_10BIT (host Main10 path);
+// PUNKTFUNK_CLIENT_444=1 advertises VIDEO_CAP_444 (host HEVC 4:4:4 path) — verify the
+// resulting chroma with `ffprobe` on the `--out` .h265.
+video_caps: {
+let mut caps = 0u8;
+if std::env::var_os("PUNKTFUNK_CLIENT_10BIT").is_some() {
+caps |= punktfunk_core::quic::VIDEO_CAP_10BIT;
+}
+if std::env::var_os("PUNKTFUNK_CLIENT_444").is_some() {
+caps |= punktfunk_core::quic::VIDEO_CAP_444;
+}
+caps
},
+// `--audio-channels` (default stereo); the probe multistream-decodes + validates the
+// host's frames to exercise the surround encode path headlessly.
+audio_channels: args.audio_channels,
}
.encode(),
)
@@ -408,6 +427,8 @@ async fn session(args: Args) -> Result<()> {
bit_depth = welcome.bit_depth,
color = ?welcome.color,
hdr = welcome.color.is_hdr(),
chroma_444 = welcome.chroma_format == punktfunk_core::quic::CHROMA_IDC_444,
chroma_format_idc = welcome.chroma_format,
"session offer"
);
@@ -830,13 +851,37 @@ async fn session(args: Args) -> Result<()> {
hidout_pkts.clone(),
);
let conn2 = conn.clone();
// Build a multistream decoder for the host-RESOLVED layout so the probe actually decodes
// the surround stream (not just counts bytes) — the headless validator for the encode path.
let audio_channels = welcome.audio_channels;
tokio::spawn(async move {
use std::sync::atomic::Ordering::Relaxed;
let mut hdr_logged = false;
let layout = punktfunk_core::audio::layout_for(audio_channels, false);
let mut audio_dec =
opus::MSDecoder::new(48_000, layout.streams, layout.coupled, layout.mapping).ok();
let mut pcm = vec![0f32; 5760 * audio_channels as usize];
let mut audio_decoded_logged = false;
while let Ok(d) = conn2.read_datagram().await {
if let Some((_, _, opus)) = punktfunk_core::quic::decode_audio_datagram(&d) {
a.fetch_add(1, Relaxed);
ab.fetch_add(opus.len() as u64, Relaxed);
// Decode + validate: the per-channel sample count must be a legal Opus frame
// size; log the first success so a loopback test can assert surround decoded.
if let Some(dec) = audio_dec.as_mut() {
match dec.decode_float(opus, &mut pcm, false) {
Ok(samples) if !audio_decoded_logged => {
audio_decoded_logged = true;
tracing::info!(
channels = audio_channels,
samples_per_channel = samples,
"audio decoded (Opus multistream)"
);
}
Ok(_) => {}
Err(e) => tracing::debug!(error = %e, "probe audio decode"),
}
}
} else if punktfunk_core::quic::decode_rumble_datagram(&d).is_some() {
r.fetch_add(1, Relaxed);
} else if let Some(meta) = punktfunk_core::quic::decode_hdr_meta_datagram(&d) {
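The probe's comment says the per-channel sample count "must be a legal Opus frame size", though the code shown only logs it. One way such a check could look is sketched below. It assumes the host encodes single frames at one of the standard libopus encoder durations (2.5, 5, 10, 20, 40, 60, 80, 100 or 120 ms at 48 kHz); a packet bundling several frames could legitimately decode to other lengths. `is_standard_frame_len` is a hypothetical helper.

```rust
/// Per-channel sample counts for the standard Opus encoder frame durations
/// at 48 kHz: 2.5, 5, 10, 20, 40, 60, 80, 100 and 120 ms.
const STANDARD_FRAME_SAMPLES: [usize; 9] =
    [120, 240, 480, 960, 1920, 2880, 3840, 4800, 5760];

/// True when a decoded per-channel sample count matches one of those durations.
fn is_standard_frame_len(samples_per_channel: usize) -> bool {
    STANDARD_FRAME_SAMPLES.contains(&samples_per_channel)
}
```

A loopback test could assert this on the first decoded packet, next to the channel count that the probe already logs.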
+32 -2
@@ -39,6 +39,9 @@ const DECODERS: &[(&str, &str)] = &[
];
/// Bitrate presets in Mb/s; `0` = host default.
const BITRATES_MBPS: &[u32] = &[0, 10, 20, 30, 50, 80, 150];
/// Audio channel presets: `(channel count, display label)`. The host clamps to what it can
/// capture; the resolved count drives the decoder + WASAPI render layout.
const AUDIO_CHANNELS: &[(u8, &str)] = &[(2, "Stereo"), (6, "5.1 Surround"), (8, "7.1 Surround")];
#[derive(Clone, PartialEq)]
enum Screen {
@@ -598,6 +601,7 @@ fn connect(
compositor: CompositorPref::Auto,
gamepad: gamepad_pref,
bitrate_kbps: s.bitrate_kbps,
audio_channels: s.audio_channels,
mic_enabled: s.mic_enabled,
hdr_enabled: s.hdr_enabled,
decoder: DecoderPref::from_name(&s.decoder),
@@ -886,6 +890,23 @@ fn settings_page(ctx: &Arc<AppCtx>, set_screen: &AsyncSetState<Screen>) -> Eleme
s.save();
})
};
let ac_i = AUDIO_CHANNELS
.iter()
.position(|&(v, _)| v == s.audio_channels)
.unwrap_or(0) as i32;
let ac_names: Vec<String> = AUDIO_CHANNELS.iter().map(|&(_, l)| l.to_string()).collect();
let channels_combo = {
let ctx = ctx.clone();
ComboBox::new(ac_names)
.header("Audio channels")
.selected_index(ac_i)
.on_selection_changed(move |i: i32| {
let (v, _) = AUDIO_CHANNELS[(i.max(0) as usize).min(AUDIO_CHANNELS.len() - 1)];
let mut s = ctx.settings.lock().unwrap();
s.audio_channels = v;
s.save();
})
};
let header = grid((
text_block("Settings")
@@ -934,8 +955,17 @@ fn settings_page(ctx: &Arc<AppCtx>, set_screen: &AsyncSetState<Screen>) -> Eleme
.spacing(10.0),
);
-let audio_card =
-card(vstack((text_block("Audio").font_size(15.0).semibold(), mic_toggle)).spacing(10.0));
+let audio_card = card(
+vstack((
+text_block("Audio").font_size(15.0).semibold(),
+text_block("Request stereo or surround — the host downmixes if its output has fewer.")
+.font_size(12.0)
+.foreground(ThemeRef::SecondaryText),
+channels_combo,
+mic_toggle,
+))
+.spacing(10.0),
+);
page(vec![
header.into(),
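The combo box wiring in this file maps between a stored channel count and a list index in both directions. A standalone sketch of that mapping follows; the table copies `AUDIO_CHANNELS` from the diff, while `index_for` and `channels_for` are hypothetical helper names.

```rust
/// Channel presets: (channel count, display label), as in the settings page.
const AUDIO_CHANNELS: &[(u8, &str)] =
    &[(2, "Stereo"), (6, "5.1 Surround"), (8, "7.1 Surround")];

/// Index to preselect for a stored setting; unknown values fall back to stereo.
fn index_for(channels: u8) -> usize {
    AUDIO_CHANNELS
        .iter()
        .position(|&(v, _)| v == channels)
        .unwrap_or(0)
}

/// Channel count for a UI selection index. The index is clamped into range,
/// so a "nothing selected" value such as -1 cannot panic.
fn channels_for(index: i32) -> u8 {
    let i = (index.max(0) as usize).min(AUDIO_CHANNELS.len() - 1);
    AUDIO_CHANNELS[i].0
}
```

Clamping rather than indexing directly matters because selection callbacks can report a negative index when the selection is cleared.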
+28 -12
@@ -21,9 +21,9 @@ use std::time::Duration;
use wasapi::{DeviceEnumerator, Direction, SampleType, StreamMode, WaveFormat};
const SAMPLE_RATE: usize = 48_000;
+/// The microphone uplink stays stereo (the host's virtual mic is stereo). The render path is
+/// multichannel — its channel count + block align are runtime, driven by the host-resolved layout.
const CHANNELS: usize = 2;
-/// 48 kHz stereo f32: 2 channels * 4 bytes = 8 bytes per frame.
-const BLOCK_ALIGN: usize = CHANNELS * 4;
/// Mic frames are 20 ms (960 samples/channel) — any size ≤ 120 ms is fine host-side.
const MIC_FRAME: usize = 960;
@@ -34,9 +34,10 @@ pub struct AudioPlayer {
}
impl AudioPlayer {
-/// Spawn the WASAPI render thread. Failure (no render endpoint on this box) is
-/// survivable — the caller streams video-only.
-pub fn spawn() -> Result<AudioPlayer> {
+/// Spawn the WASAPI render thread for `channels` (2/6/8, canonical wire order
+/// FL FR FC LFE RL RR SL SR). Failure (no render endpoint on this box) is survivable — the
+/// caller streams video-only.
+pub fn spawn(channels: u8) -> Result<AudioPlayer> {
// 64 × 5 ms = 320 ms of slack between the pump and the WASAPI loop.
let (pcm_tx, pcm_rx) = std::sync::mpsc::sync_channel::<Vec<f32>>(64);
let stop = Arc::new(AtomicBool::new(false));
@@ -45,14 +46,14 @@ impl AudioPlayer {
let thread = std::thread::Builder::new()
.name("punktfunk-audio".into())
.spawn(move || {
-if let Err(e) = render_thread(pcm_rx, stop_t, ready_tx) {
+if let Err(e) = render_thread(pcm_rx, stop_t, ready_tx, channels) {
tracing::warn!(error = format!("{e:#}"), "audio playback thread ended");
}
})
.context("spawn audio thread")?;
match ready_rx.recv_timeout(Duration::from_secs(3)) {
Ok(Ok(())) => {
-tracing::info!("WASAPI render: 48 kHz stereo f32 (default endpoint)");
+tracing::info!(channels, "WASAPI render: 48 kHz f32 (default endpoint)");
Ok(AudioPlayer {
pcm_tx,
stop,
@@ -66,8 +67,8 @@ impl AudioPlayer {
}
}
-/// Queue one interleaved-stereo f32 chunk. Drops the chunk if the WASAPI side is wedged
-/// (the renderer conceals the gap; never block the session pump).
+/// Queue one interleaved f32 chunk (in the session's channel layout). Drops the chunk if the
+/// WASAPI side is wedged (the renderer conceals the gap; never block the session pump).
pub fn push(&self, pcm: Vec<f32>) {
if let Err(TrySendError::Disconnected(_)) = self.pcm_tx.try_send(pcm) {
// Thread already dead — Drop will reap it; nothing to do per-chunk.
@@ -88,6 +89,7 @@ fn render_thread(
pcm_rx: Receiver<Vec<f32>>,
stop: Arc<AtomicBool>,
ready: SyncSender<Result<()>>,
channels: u8,
) -> Result<()> {
if let Err(e) = wasapi::initialize_mta()
.ok()
@@ -97,12 +99,26 @@ fn render_thread(
return Ok(());
}
let res = (|| -> Result<()> {
+// F32LE interleaved: channels × 4 bytes/sample. Stereo (channels == 2) is byte-identical
+// to the old fixed path (mask 0x3, block align 8).
+let block_align = channels as usize * 4;
let device = DeviceEnumerator::new()
.context("DeviceEnumerator")?
.get_default_device(&Direction::Render)
.context("default render endpoint")?;
let mut audio_client = device.get_iaudioclient().context("IAudioClient")?;
-let desired = WaveFormat::new(32, 32, &SampleType::Float, SAMPLE_RATE, CHANNELS, None);
+// The explicit dwChannelMask is the wire order (FL FR FC LFE RL RR SL SR); 5.1 = 0x3F,
+// 7.1 = 0x63F. WASAPI delivers channels in ascending mask-bit order, which equals the wire
+// order, so the render mapping is the identity — no permute. `autoconvert` (below) lets the
+// audio engine downmix when the endpoint has fewer speakers.
+let desired = WaveFormat::new(
+32,
+32,
+&SampleType::Float,
+SAMPLE_RATE,
+channels as usize,
+Some(punktfunk_core::audio::wasapi_channel_mask(channels)),
+);
let (default_period, _min_period) =
audio_client.get_device_period().context("device period")?;
let mode = StreamMode::EventsShared {
@@ -139,10 +155,10 @@ fn render_thread(
if avail_frames == 0 {
continue;
}
-let want_bytes = avail_frames * BLOCK_ALIGN;
+let want_bytes = avail_frames * block_align;
// Prime to ~3 quanta; cap at ~1 quantum of slack beyond that; re-prime on drain.
-let target = (3 * want_bytes).clamp(720 * BLOCK_ALIGN, 9600 * BLOCK_ALIGN);
+let target = (3 * want_bytes).clamp(720 * block_align, 9600 * block_align);
while ring.len() > target.max(want_bytes) + want_bytes {
ring.pop_front();
}
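The mask values quoted in the render-thread comment (stereo `0x3`, 5.1 `0x3F`, 7.1 `0x63F`) follow from the standard Windows speaker-position bits. The sketch below rebuilds them from those bits; `channel_mask` and `block_align` are hypothetical stand-ins, not the real `punktfunk_core::audio::wasapi_channel_mask`, and treating any other count as stereo is an assumption.

```rust
// Speaker-position bits as defined for WAVEFORMATEXTENSIBLE.dwChannelMask.
const SPEAKER_FRONT_LEFT: u32 = 0x1;
const SPEAKER_FRONT_RIGHT: u32 = 0x2;
const SPEAKER_FRONT_CENTER: u32 = 0x4;
const SPEAKER_LOW_FREQUENCY: u32 = 0x8;
const SPEAKER_BACK_LEFT: u32 = 0x10;
const SPEAKER_BACK_RIGHT: u32 = 0x20;
const SPEAKER_SIDE_LEFT: u32 = 0x200;
const SPEAKER_SIDE_RIGHT: u32 = 0x400;

/// Channel mask for a 2/6/8-channel layout in wire order FL FR FC LFE RL RR SL SR.
/// Any other count is treated as stereo in this sketch.
fn channel_mask(channels: u8) -> u32 {
    let stereo = SPEAKER_FRONT_LEFT | SPEAKER_FRONT_RIGHT;
    let surround_51 = stereo
        | SPEAKER_FRONT_CENTER
        | SPEAKER_LOW_FREQUENCY
        | SPEAKER_BACK_LEFT
        | SPEAKER_BACK_RIGHT;
    match channels {
        6 => surround_51,
        8 => surround_51 | SPEAKER_SIDE_LEFT | SPEAKER_SIDE_RIGHT,
        _ => stereo,
    }
}

/// Bytes per interleaved f32 frame: channels × 4.
fn block_align(channels: u8) -> usize {
    channels as usize * 4
}
```

Since the bits ascend in the same sequence as the wire order, the number of set bits equals the channel count and no channel reordering is needed.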
+2
@@ -177,6 +177,8 @@ fn run_headless_cli(args: &[String], identity: (String, String)) {
compositor: CompositorPref::Auto,
gamepad: GamepadPref::Auto,
bitrate_kbps,
// Headless CLI path (test/scripting) — stereo baseline; the GUI sources this from settings.
audio_channels: 2,
mic_enabled: flag("--mic"),
hdr_enabled: !flag("--no-hdr"),
decoder,
+49 -6
@@ -23,6 +23,8 @@ pub struct SessionParams {
pub compositor: CompositorPref,
pub gamepad: GamepadPref,
pub bitrate_kbps: u32,
/// Requested audio channel count (2/6/8); the host echoes the resolved value.
pub audio_channels: u8,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Advertise 10-bit + HDR10 so the host may upgrade HDR content to a Main10/PQ stream.
@@ -94,6 +96,42 @@ fn now_ns() -> u64 {
.unwrap_or(0)
}
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
48_000,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
48_000, l.streams, l.coupled, l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
fn pump(
params: SessionParams,
ev_tx: async_channel::Sender<SessionEvent>,
@@ -122,6 +160,7 @@ fn pump(
}
0
},
params.audio_channels,
None, // launch: the Windows client has no library picker yet
params.pin,
Some(params.identity),
@@ -161,11 +200,14 @@ fn pump(
let mut hardware = decoder.is_hardware();
let mut hdr = false;
// Audio is best-effort: a session without it still streams. Gamepads are the
-// app-lifetime service's job (the UI attaches it on Connected).
-let player = audio::AudioPlayer::spawn()
+// app-lifetime service's job (the UI attaches it on Connected). Build the decoder + playback
+// from the host-RESOLVED channel count (never the request), so an older/clamping host that
+// resolves stereo is decoded as stereo.
+let channels = connector.audio_channels;
+let player = audio::AudioPlayer::spawn(channels)
.map_err(|e| tracing::warn!(error = %e, "audio disabled"))
.ok();
-let mut opus_dec = opus::Decoder::new(48_000, opus::Channels::Stereo)
+let mut opus_dec = AudioDec::new(channels)
.map_err(|e| tracing::warn!(error = %e, "opus decoder failed — audio disabled"))
.ok();
let _mic = params
@@ -184,8 +226,8 @@ fn pump(
let mut bytes_n = 0u64;
let mut decode_us_sum = 0u64;
let mut lat_us: Vec<u64> = Vec::with_capacity(256);
-let mut pcm = vec![0f32; 5760 * 2]; // decode scratch: max Opus frame (120 ms stereo)
-// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
+let mut pcm = vec![0f32; 5760 * channels as usize]; // scratch: max Opus frame (120 ms) × channels
+// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut last_dropped = connector.frames_dropped();
let mut last_kf_req: Option<Instant> = None;
@@ -253,7 +295,8 @@ fn pump(
while let Ok(pkt) = connector.next_audio(Duration::ZERO) {
if let (Some(player), Some(dec)) = (&player, opus_dec.as_mut()) {
match dec.decode_float(&pkt.data, &mut pcm, false) {
-Ok(samples) => player.push(pcm[..samples * 2].to_vec()),
+// `samples` is per-channel; the interleaved frame is `samples * channels`.
+Ok(samples) => player.push(pcm[..samples * channels as usize].to_vec()),
Err(e) => tracing::debug!(error = %e, "opus decode"),
}
}
+4
@@ -130,6 +130,9 @@ pub struct Settings {
pub inhibit_shortcuts: bool,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the decoder + WASAPI render layout.
pub audio_channels: u8,
/// Advertise 10-bit + HDR10 so the host upgrades HDR content to a Main10/PQ stream (the client
/// presents it on a 10-bit ST.2084 swapchain). No effect on SDR content.
pub hdr_enabled: bool,
@@ -148,6 +151,7 @@ impl Default for Settings {
compositor: "auto".into(),
inhibit_shortcuts: true,
mic_enabled: false,
audio_channels: 2,
hdr_enabled: true,
decoder: "auto".into(),
}
+7 -1
@@ -19,7 +19,7 @@ crate-type = ["lib", "cdylib", "staticlib"]
default = []
# Control-plane QUIC (pairing, config, reverse audio). tokio is permitted ONLY here,
# never on the per-frame hot path. Off by default so the core stays runtime-free.
-quic = ["dep:quinn", "dep:tokio", "dep:rustls", "dep:rcgen", "dep:rustls-pki-types", "dep:sha2", "dep:hmac", "dep:spake2"]
+quic = ["dep:quinn", "dep:tokio", "dep:rustls", "dep:rcgen", "dep:rustls-pki-types", "dep:sha2", "dep:hmac", "dep:spake2", "dep:opus"]
[dependencies]
reed-solomon-simd = "3.1" # GF(2^16) Leopard-RS, SIMD, O(n log n) — the wall-breaker (P2)
@@ -51,6 +51,12 @@ sha2 = { version = "0.10", optional = true }
hmac = { version = "0.12", optional = true }
spake2 = { version = "0.4", optional = true }
tokio = { version = "1", optional = true, features = ["rt-multi-thread", "net", "sync", "macros"] }
# In-core Opus (multistream) DECODE for the C-ABI `punktfunk_connection_next_audio_pcm` path —
# used by embedders without a multistream-capable Opus decoder (Apple's AudioToolbox is
# stereo-only). The Rust clients link `opus` themselves and decode the raw `next_audio` frames,
# so this only matters when the connection API (quic) is built. Same libopus the host vendors;
# cargo unifies the build. Multistream API: `opus::MSDecoder` (lib.rs:1187).
opus = { version = "0.3", optional = true }
# `libc` for batched UDP syscalls: `sendmmsg`/`recvmmsg` on Linux (the 1 Gbps+ lever) and the
# `recv(MSG_DONTWAIT)` drain on the other unix (Apple/BSD) targets, which have no `recvmmsg`
+219
@@ -467,6 +467,23 @@ pub struct PunktfunkConnection {
last: std::sync::Mutex<Option<crate::session::Frame>>,
/// Same, for `punktfunk_connection_next_audio` (independent of the video slot).
last_audio: std::sync::Mutex<Option<crate::client::AudioPacket>>,
/// Decode-in-core state for `punktfunk_connection_next_audio_pcm` (Apple / any embedder
/// without a multistream Opus decoder). The decoder is built lazily from the negotiated
/// `inner.audio_channels`; `pcm` is a fixed-capacity reusable buffer the returned pointer
/// borrows until the next PCM call (same contract as `last_audio`).
audio_pcm: std::sync::Mutex<AudioPcmState>,
}
/// Lazily-initialized in-core Opus decode state. A coupled-1-stream multistream decoder is
/// equivalent to a plain stereo decoder, so one [`opus::MSDecoder`] handles 2/6/8 channels.
#[cfg(feature = "quic")]
#[derive(Default)]
struct AudioPcmState {
decoder: Option<opus::MSDecoder>,
/// Interleaved f32 PCM, wire channel order. Pre-sized to the largest legal Opus frame
/// (120 ms @ 48 kHz = 5760 samples/ch) × 8 channels so decode never reallocates (which would
/// dangle the pointer handed to the embedder).
pcm: Vec<f32>,
}
/// `PunktfunkHidOutput::kind` — lightbar RGB (`r`/`g`/`b` valid).
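The doc comment on `AudioPcmState::pcm` relies on a pre-sized buffer so that a pointer handed across the C ABI is never invalidated by a reallocation. A minimal sketch of that idea, with a hypothetical `PcmScratch` type standing in for the real state and a plain copy standing in for the Opus decode:

```rust
const MAX_SAMPLES_PER_CHANNEL: usize = 5760; // 120 ms at 48 kHz
const MAX_CHANNELS: usize = 8;

/// Fixed-capacity scratch buffer (hypothetical stand-in for the in-core decode state).
struct PcmScratch {
    pcm: Vec<f32>,
}

impl PcmScratch {
    fn new() -> Self {
        // Allocate once for the worst case so later writes never grow the Vec.
        Self { pcm: vec![0.0; MAX_SAMPLES_PER_CHANNEL * MAX_CHANNELS] }
    }

    /// Copy one interleaved frame into the buffer and return the pointer and
    /// length an embedder would borrow. Returns None if the frame is too large,
    /// instead of growing (and therefore possibly moving) the buffer.
    fn write_frame(&mut self, frame: &[f32]) -> Option<(*const f32, usize)> {
        let dst = self.pcm.get_mut(..frame.len())?;
        dst.copy_from_slice(frame);
        Some((self.pcm.as_ptr(), frame.len()))
    }
}
```

The returned pointer stays at the same address across calls, so the "valid until the next PCM call" contract only concerns the contents, never the allocation.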
@@ -708,12 +725,18 @@ pub const PUNKTFUNK_VIDEO_CAP_10BIT: u8 = 0x01;
/// Video-capability bit for [`punktfunk_connect_ex5`] (`video_caps`): the client can present
/// BT.2020 PQ HDR10 (implies 10-bit). (Mirrors `quic::VIDEO_CAP_HDR`.)
pub const PUNKTFUNK_VIDEO_CAP_HDR: u8 = 0x02;
/// Video-capability bit for [`punktfunk_connect_ex5`] (`video_caps`): the client can decode a
/// full-chroma 4:4:4 HEVC stream (Range Extensions). The host emits 4:4:4 only when this is set,
/// the host opted in, the codec is HEVC, and the GPU supports it — else the stream stays 4:2:0 and
/// [`punktfunk_connection_chroma_format`] reports the real value. (Mirrors `quic::VIDEO_CAP_444`.)
pub const PUNKTFUNK_VIDEO_CAP_444: u8 = 0x04;
// Keep the ABI cap bits in lockstep with the wire constants (compile-time guard against drift).
#[cfg(feature = "quic")]
const _: () = {
assert!(PUNKTFUNK_VIDEO_CAP_10BIT == crate::quic::VIDEO_CAP_10BIT);
assert!(PUNKTFUNK_VIDEO_CAP_HDR == crate::quic::VIDEO_CAP_HDR);
assert!(PUNKTFUNK_VIDEO_CAP_444 == crate::quic::VIDEO_CAP_444);
};
// Keep the ABI gamepad constants in lockstep with the wire enum (compile-time guard against drift).
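The compile-time guard above keeps the public ABI bits equal to the wire constants. The pattern is sketched here with a local `wire` module standing in for `crate::quic` (an assumption so the sketch is self-contained), plus a hypothetical `video_caps` helper that ORs bits together the way the probe does.

```rust
/// Stand-in for the wire-level constants (the real ones live in the quic module).
mod wire {
    pub const VIDEO_CAP_10BIT: u8 = 0x01;
    pub const VIDEO_CAP_HDR: u8 = 0x02;
    pub const VIDEO_CAP_444: u8 = 0x04;
}

pub const PUNKTFUNK_VIDEO_CAP_10BIT: u8 = 0x01;
pub const PUNKTFUNK_VIDEO_CAP_HDR: u8 = 0x02;
pub const PUNKTFUNK_VIDEO_CAP_444: u8 = 0x04;

// Evaluated at compile time: the build fails if an ABI bit drifts from the wire value.
const _: () = {
    assert!(PUNKTFUNK_VIDEO_CAP_10BIT == wire::VIDEO_CAP_10BIT);
    assert!(PUNKTFUNK_VIDEO_CAP_HDR == wire::VIDEO_CAP_HDR);
    assert!(PUNKTFUNK_VIDEO_CAP_444 == wire::VIDEO_CAP_444);
};

/// Build a capability byte from independent feature switches (hypothetical helper).
fn video_caps(ten_bit: bool, chroma_444: bool) -> u8 {
    let mut caps = 0u8;
    if ten_bit {
        caps |= PUNKTFUNK_VIDEO_CAP_10BIT;
    }
    if chroma_444 {
        caps |= PUNKTFUNK_VIDEO_CAP_444;
    }
    caps
}
```

Because each capability is a distinct bit, the flags combine freely and a receiver can test each one with a bitwise AND.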
@@ -980,6 +1003,58 @@ pub unsafe extern "C" fn punktfunk_connect_ex5(
client_cert_pem: *const std::os::raw::c_char,
client_key_pem: *const std::os::raw::c_char,
timeout_ms: u32,
) -> *mut PunktfunkConnection {
// Delegate to the surround-aware variant requesting stereo (the pre-surround behaviour).
unsafe {
punktfunk_connect_ex6(
host,
port,
width,
height,
refresh_hz,
compositor,
gamepad,
bitrate_kbps,
video_caps,
2, // audio_channels = stereo
launch_id,
pin_sha256,
observed_sha256_out,
client_cert_pem,
client_key_pem,
timeout_ms,
)
}
}
/// Like [`punktfunk_connect_ex5`], but additionally requests the audio channel count:
/// `2` (stereo, the default behaviour of every earlier variant), `6` (5.1) or `8` (7.1). The host
/// clamps the request to what it can actually capture and echoes the resolved count via
/// [`punktfunk_connection_audio_channels`]; the `0xC9` audio frames are Opus-(multi)stream encoded
/// for that layout. A client that wants surround calls this; everything else inherits stereo.
///
/// # Safety
/// Same as [`punktfunk_connect`].
#[cfg(feature = "quic")]
#[no_mangle]
#[allow(clippy::too_many_arguments)]
pub unsafe extern "C" fn punktfunk_connect_ex6(
host: *const std::os::raw::c_char,
port: u16,
width: u32,
height: u32,
refresh_hz: u32,
compositor: u32,
gamepad: u32,
bitrate_kbps: u32,
video_caps: u8,
audio_channels: u8,
launch_id: *const std::os::raw::c_char,
pin_sha256: *const u8,
observed_sha256_out: *mut u8,
client_cert_pem: *const std::os::raw::c_char,
client_key_pem: *const std::os::raw::c_char,
timeout_ms: u32,
) -> *mut PunktfunkConnection {
let r = std::panic::catch_unwind(AssertUnwindSafe(|| {
if host.is_null() {
@@ -1029,6 +1104,7 @@ pub unsafe extern "C" fn punktfunk_connect_ex5(
gamepad,
bitrate_kbps,
video_caps,
crate::audio::normalize_channels(audio_channels),
launch,
pin,
identity,
@@ -1045,6 +1121,7 @@ pub unsafe extern "C" fn punktfunk_connect_ex5(
inner: c,
last: std::sync::Mutex::new(None),
last_audio: std::sync::Mutex::new(None),
audio_pcm: std::sync::Mutex::new(AudioPcmState::default()),
}))
}
Err(_) => std::ptr::null_mut(),
@@ -1250,6 +1327,121 @@ pub unsafe extern "C" fn punktfunk_connection_next_audio(
})
}
/// Read the audio channel count the host resolved for this session (from its Welcome): `2`
/// (stereo), `6` (5.1) or `8` (7.1). `*out` is filled when non-NULL. The `0xC9` Opus frames are
/// (multistream-)encoded for this layout; an embedder decoding raw frames itself must build its
/// decoder from THIS value (see [`crate::audio::layout_for`]) — or use
/// [`punktfunk_connection_next_audio_pcm`], which decodes in-core. Available immediately after a
/// successful connect (it doesn't change without a reconfigure).
///
/// # Safety
/// `c` is a valid connection handle; `out` is NULL or writable for one `u8`.
#[cfg(feature = "quic")]
#[no_mangle]
pub unsafe extern "C" fn punktfunk_connection_audio_channels(
c: *mut PunktfunkConnection,
out: *mut u8,
) -> PunktfunkStatus {
guard(|| {
let c = match unsafe { c.as_ref() } {
Some(c) => c,
None => return PunktfunkStatus::NullPointer,
};
if !out.is_null() {
// SAFETY: `out` is non-null and the caller guarantees it is writable for one `u8`.
unsafe { *out = c.inner.audio_channels };
}
PunktfunkStatus::Ok
})
}
/// One decoded audio frame from [`punktfunk_connection_next_audio_pcm`]: interleaved 32-bit
/// float PCM at 48 kHz, in the canonical wire channel order `FL FR FC LFE RL RR SL SR` (the
/// first `channels` of it). `samples` points at `frame_count * channels` floats and borrows
/// connection memory **until the next PCM call** on this handle.
#[cfg(feature = "quic")]
#[repr(C)]
pub struct PunktfunkAudioPcm {
/// Interleaved f32 samples (wire channel order), `frame_count * channels` long.
pub samples: *const f32,
/// Samples per channel in this frame.
pub frame_count: u32,
/// Channel count (2/6/8): the negotiated value, as reported by [`punktfunk_connection_audio_channels`].
pub channels: u8,
/// Source packet sequence number.
pub seq: u32,
/// Capture presentation timestamp (ns).
pub pts_ns: u64,
}
/// Pull the next audio frame and **decode it in-core** to interleaved f32 PCM — for embedders
/// without a multistream-capable Opus decoder (e.g. Apple, whose AudioToolbox Opus path is
/// stereo-only). The decoder is built once from the negotiated channel count and handles 2/6/8
/// channels (a 1-coupled-stream multistream decoder is exactly a stereo decoder). Same
/// timeout/closed semantics as [`punktfunk_connection_next_audio`]; `out->samples` borrows
/// connection memory until the next PCM call on this handle. Use EITHER this or
/// [`punktfunk_connection_next_audio`] on a given connection, from one dedicated audio thread —
/// not both (they share the underlying queue).
///
/// # Safety
/// `c` is a valid connection handle; `out` is writable for one [`PunktfunkAudioPcm`] (a NULL
/// `out` is rejected with `NullPointer`). At most one thread pulls audio.
#[cfg(feature = "quic")]
#[no_mangle]
pub unsafe extern "C" fn punktfunk_connection_next_audio_pcm(
c: *mut PunktfunkConnection,
out: *mut PunktfunkAudioPcm,
timeout_ms: u32,
) -> PunktfunkStatus {
guard(|| {
let c = match unsafe { c.as_ref() } {
Some(c) => c,
None => return PunktfunkStatus::NullPointer,
};
if out.is_null() {
return PunktfunkStatus::NullPointer;
}
let channels = crate::audio::normalize_channels(c.inner.audio_channels);
let pkt = match c
.inner
.next_audio(std::time::Duration::from_millis(timeout_ms as u64))
{
Ok(pkt) => pkt,
Err(e) => return e.status(),
};
let mut state = c.audio_pcm.lock().unwrap();
if state.decoder.is_none() {
let layout = crate::audio::layout_for(channels, false);
match opus::MSDecoder::new(48_000, layout.streams, layout.coupled, layout.mapping) {
Ok(d) => {
// Largest legal Opus frame is 120 ms = 5760 samples/ch.
state.pcm = vec![0f32; 5760 * channels as usize];
state.decoder = Some(d);
}
Err(_) => return PunktfunkStatus::Unsupported,
}
}
let AudioPcmState { decoder, pcm } = &mut *state;
let dec = decoder.as_mut().unwrap();
// `decode_float` divides the output buffer length by the channel count to get the
// per-channel capacity; an empty payload requests packet-loss concealment.
match dec.decode_float(&pkt.data, pcm, false) {
Ok(frame_count) => {
// SAFETY: `out` was checked non-null above and the caller guarantees it is writable.
unsafe {
*out = PunktfunkAudioPcm {
samples: pcm.as_ptr(),
frame_count: frame_count as u32,
channels,
seq: pkt.seq,
pts_ns: pkt.pts_ns,
};
}
PunktfunkStatus::Ok
}
Err(_) => PunktfunkStatus::BadPacket,
}
})
}
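The PCM contract described above (interleaved f32, `frame_count * channels` floats, wire channel order) can be illustrated with a small consumer-side sketch. It is not part of the punktfunk ABI: `deinterleave` and `MAX_FRAME_SAMPLES` are hypothetical names introduced here, and the sketch assumes the embedder has already copied the borrowed samples into a slice.

```rust
/// Largest legal Opus frame per channel: 120 ms at 48 kHz (the 5760 used to size the buffer).
pub const MAX_FRAME_SAMPLES: usize = 48_000 * 120 / 1000;

/// Hypothetical helper (not in the ABI): split one interleaved frame into per-channel planes.
/// `samples.len()` must be a multiple of `channels`, i.e. `frame_count * channels`.
pub fn deinterleave(samples: &[f32], channels: usize) -> Vec<Vec<f32>> {
    assert!(channels > 0 && samples.len() % channels == 0);
    let frame_count = samples.len() / channels;
    let mut planes: Vec<Vec<f32>> = vec![Vec::with_capacity(frame_count); channels];
    // Each chunk is one sample instant: slot i belongs to wire channel i.
    for frame in samples.chunks_exact(channels) {
        for (plane, &s) in planes.iter_mut().zip(frame) {
            plane.push(s);
        }
    }
    planes
}
```

With stereo input `[L0, R0, L1, R1]` this yields the planes `[L0, L1]` and `[R0, R1]`; for 6 or 8 channels the plane index is the wire slot (`FL FR FC LFE RL RR SL SR`).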
/// Pull the next rumble (force-feedback) update, waiting up to `timeout_ms`. Amplitudes
/// are 0..0xFFFF (`low` = low-frequency motor, `high` = high-frequency), `(0, 0)` = stop.
/// Same timeout/closed semantics as [`punktfunk_connection_next_audio`].
@@ -1414,6 +1606,33 @@ pub unsafe extern "C" fn punktfunk_connection_color_info(
})
}
/// Read the session's resolved chroma subsampling (from the host's Welcome) as the HEVC
/// `chroma_format_idc`: `1` = 4:2:0 (the default every pre-4:4:4 host produced), `3` = full-chroma
/// 4:4:4. `*out` is filled when non-NULL. The in-band SPS is authoritative; this lets the embedder
/// pre-size its decoder / pick a 4:4:4 pixel format up front. Available immediately after a
/// successful connect (it doesn't change without a reconfigure).
///
/// # Safety
/// `c` is a valid connection handle; `out` is NULL or writable for one `u8`.
#[cfg(feature = "quic")]
#[no_mangle]
pub unsafe extern "C" fn punktfunk_connection_chroma_format(
c: *mut PunktfunkConnection,
out: *mut u8,
) -> PunktfunkStatus {
guard(|| {
let c = match unsafe { c.as_ref() } {
Some(c) => c,
None => return PunktfunkStatus::NullPointer,
};
if !out.is_null() {
// SAFETY: `out` is non-null and the caller guarantees it is writable for one `u8`.
unsafe { *out = c.inner.chroma_format };
}
PunktfunkStatus::Ok
})
}
/// Send one input event to the host as a QUIC datagram (non-blocking enqueue).
///
/// # Safety
+298
@@ -0,0 +1,298 @@
//! Shared audio layout: the single source of truth for Opus (multi)stream surround across the
//! host, the GameStream compatibility path, and every client decoder.
//!
//! **Canonical wire channel order** is `FL FR FC LFE RL RR SL SR` (the GameStream/Moonlight
//! order, and the PipeWire/PulseAudio default map for 6/8 channels). Every host capturer
//! delivers PCM in this order and every client decodes into it, so the Opus multistream
//! `mapping` is the **identity** (`[0, 1, …, channels-1]`) on both ends — punktfunk owns the
//! encoder and every decoder, so the GFE-style pre-rotation Moonlight needs over SDP
//! (`gamestream::audio::surround_params`) is a GameStream-only concern and never touches the
//! native `punktfunk/1` path.
//!
//! Channel counts the protocol negotiates: `2` (stereo), `6` (5.1) and `8` (7.1). Anything
//! else clamps to stereo ([`normalize_channels`]).
/// Canonical wire channel positions; the index is the channel's slot in the interleaved PCM
/// frame. A count of N uses positions `0..N` (always a prefix of this 8-channel order).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
pub enum WirePos {
FrontLeft = 0,
FrontRight = 1,
FrontCenter = 2,
Lfe = 3,
RearLeft = 4,
RearRight = 5,
SideLeft = 6,
SideRight = 7,
}
/// The full 8-channel wire order; the N-channel order is its first N entries.
pub const WIRE_ORDER_8: [WirePos; 8] = {
use WirePos::*;
[
FrontLeft,
FrontRight,
FrontCenter,
Lfe,
RearLeft,
RearRight,
SideLeft,
SideRight,
]
};
/// One Opus (multi)stream layout. `mapping` is the libopus multistream mapping we encode AND
/// decode with — identity, since punktfunk owns both ends. `streams`/`coupled` give the
/// normal-quality coupling (FL,FR)+(FC,LFE) [+(RL,RR) on 7.1] with the remaining channels as
/// mono streams; high quality is one mono stream per channel. Bitrates match Sunshine's
/// per-config values (stereo keeps punktfunk's live-validated 128 kbps).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct OpusLayout {
/// Interleaved channel count (2, 6 or 8).
pub channels: u8,
/// Number of Opus streams in the multistream packet.
pub streams: u8,
/// How many of those streams are coupled (stereo) pairs.
pub coupled: u8,
/// libopus multistream channel mapping — identity `[0, 1, …, channels-1]`.
pub mapping: &'static [u8],
/// Target Opus bitrate in bits/sec (hard CBR; constant packet size, which GameStream's
/// audio FEC relies on).
pub bitrate: i32,
}
/// Stereo: a plain coupled pair. The 128 kbps live-validated config.
pub const LAYOUT_STEREO: OpusLayout = OpusLayout {
channels: 2,
streams: 1,
coupled: 1,
mapping: &[0, 1],
bitrate: 128_000,
};
/// 5.1 normal quality: (FL,FR)+(FC,LFE) coupled, RL+RR mono.
pub const LAYOUT_51: OpusLayout = OpusLayout {
channels: 6,
streams: 4,
coupled: 2,
mapping: &[0, 1, 2, 3, 4, 5],
bitrate: 256_000,
};
/// 5.1 high quality: one mono stream per channel.
pub const LAYOUT_51_HQ: OpusLayout = OpusLayout {
channels: 6,
streams: 6,
coupled: 0,
mapping: &[0, 1, 2, 3, 4, 5],
bitrate: 1_536_000,
};
/// 7.1 normal quality: (FL,FR)+(FC,LFE)+(RL,RR) coupled, SL+SR mono.
pub const LAYOUT_71: OpusLayout = OpusLayout {
channels: 8,
streams: 5,
coupled: 3,
mapping: &[0, 1, 2, 3, 4, 5, 6, 7],
bitrate: 450_000,
};
/// 7.1 high quality: one mono stream per channel.
pub const LAYOUT_71_HQ: OpusLayout = OpusLayout {
channels: 8,
streams: 8,
coupled: 0,
mapping: &[0, 1, 2, 3, 4, 5, 6, 7],
bitrate: 2_048_000,
};
/// Pick the layout for a negotiated channel count. Unknown counts fall back to stereo (clients
/// only ever request 2/6/8). `high_quality` selects the uncoupled high-bitrate config.
pub fn layout_for(channels: u8, high_quality: bool) -> &'static OpusLayout {
match (channels, high_quality) {
(6, false) => &LAYOUT_51,
(6, true) => &LAYOUT_51_HQ,
(8, false) => &LAYOUT_71,
(8, true) => &LAYOUT_71_HQ,
_ => &LAYOUT_STEREO,
}
}
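The stream accounting behind the layout table can be checked in isolation. The following is a minimal sketch, assuming only the `channels`/`streams`/`coupled` values listed in the constants above; `Layout`, `decoded_channels` and `TABLE` are hypothetical stand-ins, not the crate's own items.

```rust
/// Hypothetical stand-in for the `OpusLayout` fields that matter for the accounting.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Layout {
    pub channels: u8,
    pub streams: u8,
    pub coupled: u8,
}

/// Channels a (streams, coupled) pair decodes to: each coupled stream carries two channels,
/// each remaining stream carries one. Returns `None` when `coupled > streams` or on overflow.
pub fn decoded_channels(streams: u8, coupled: u8) -> Option<u8> {
    let mono = streams.checked_sub(coupled)?;
    coupled.checked_mul(2)?.checked_add(mono)
}

/// The five configurations from the table: stereo, 5.1, 5.1 HQ, 7.1, 7.1 HQ.
pub const TABLE: [Layout; 5] = [
    Layout { channels: 2, streams: 1, coupled: 1 },
    Layout { channels: 6, streams: 4, coupled: 2 },
    Layout { channels: 6, streams: 6, coupled: 0 },
    Layout { channels: 8, streams: 5, coupled: 3 },
    Layout { channels: 8, streams: 8, coupled: 0 },
];

/// True when every row satisfies `channels == 2 * coupled + (streams - coupled)`.
pub fn table_is_consistent() -> bool {
    TABLE
        .iter()
        .all(|l| decoded_channels(l.streams, l.coupled) == Some(l.channels))
}
```

Using checked arithmetic means a malformed row (for example more coupled streams than streams) is reported as inconsistent instead of wrapping around.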
/// Clamp an arbitrary (wire / requested) channel count to one the protocol negotiates. `0`,
/// absent, or any unsupported value becomes stereo.
pub fn normalize_channels(requested: u8) -> u8 {
match requested {
6 => 6,
8 => 8,
_ => 2,
}
}
// ---- per-platform channel-layout helpers (pure data; no platform deps) --------------------
/// Windows `WAVEFORMATEXTENSIBLE.dwChannelMask` for the wire layout.
///
/// NB 7.1 == `0x63F` (FL FR FC LFE **BL BR SL SR**), NOT `0xFF` — `0xFF` selects the
/// front-of-center pair FLC/FRC, the wrong speakers. WASAPI delivers channels in ascending
/// mask-bit order, which equals the wire order, so the decoded PCM needs no permutation.
pub const fn wasapi_channel_mask(channels: u8) -> u32 {
const FL: u32 = 0x1;
const FR: u32 = 0x2;
const FC: u32 = 0x4;
const LFE: u32 = 0x8;
const BL: u32 = 0x10; // back left (wire RL)
const BR: u32 = 0x20; // back right (wire RR)
const SL: u32 = 0x200; // side left
const SR: u32 = 0x400; // side right
match channels {
6 => FL | FR | FC | LFE | BL | BR, // 0x3F
8 => FL | FR | FC | LFE | BL | BR | SL | SR, // 0x63F
_ => FL | FR, // 0x3 (stereo)
}
}
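The `0x63F` versus `0xFF` distinction is easiest to see by expanding each mask into speaker names. This is a small sketch using the standard Windows speaker-position bit values; `SPEAKER_BITS` and `speakers_in` are hypothetical helpers introduced for illustration only.

```rust
/// Windows speaker-position bits in ascending order (the order WASAPI delivers channels in).
const SPEAKER_BITS: [(u32, &str); 11] = [
    (0x1, "FL"),
    (0x2, "FR"),
    (0x4, "FC"),
    (0x8, "LFE"),
    (0x10, "BL"),
    (0x20, "BR"),
    (0x40, "FLC"),
    (0x80, "FRC"),
    (0x100, "BC"),
    (0x200, "SL"),
    (0x400, "SR"),
];

/// Hypothetical helper: list the speakers a channel mask selects, in ascending bit order.
pub fn speakers_in(mask: u32) -> Vec<&'static str> {
    SPEAKER_BITS
        .iter()
        .filter(|&&(bit, _)| (mask & bit) != 0)
        .map(|&(_, name)| name)
        .collect()
}
```

Expanding `0x63F` gives the side pair `SL SR` in the last two slots, while `0xFF` gives `FLC FRC` there, which is why the latter drives the wrong speakers for the wire layout.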
/// PipeWire / SPA `enum spa_audio_channel` positions in wire order — identical to the host
/// capture side (`punktfunk-host` `audio::linux::spa_positions`): FL=3 FR=4 FC=5 LFE=6 SL=7
/// SR=8 RL=12 RR=13. Identity routing: the client sets these on its playback node so PipeWire
/// maps each wire slot to the matching speaker (and downmixes when the sink has fewer).
pub fn spa_positions(channels: u8) -> &'static [u32] {
const STEREO: [u32; 2] = [3, 4]; // FL FR
const C51: [u32; 6] = [3, 4, 5, 6, 12, 13]; // FL FR FC LFE RL RR
const C71: [u32; 8] = [3, 4, 5, 6, 12, 13, 7, 8]; // FL FR FC LFE RL RR SL SR
match channels {
6 => &C51,
8 => &C71,
_ => &STEREO,
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn layout_table_is_consistent() {
for l in [
&LAYOUT_STEREO,
&LAYOUT_51,
&LAYOUT_51_HQ,
&LAYOUT_71,
&LAYOUT_71_HQ,
] {
// Mapping is identity and exactly `channels` entries long.
assert_eq!(l.mapping.len(), l.channels as usize);
for (i, &m) in l.mapping.iter().enumerate() {
assert_eq!(m as usize, i, "mapping must be identity for {l:?}");
}
// libopus invariant: total channels == coupled*2 + (streams - coupled).
assert_eq!(
l.coupled * 2 + (l.streams - l.coupled),
l.channels,
"stream/coupled accounting for {l:?}"
);
assert!(l.coupled <= l.streams);
assert!(l.bitrate > 0);
}
}
#[test]
fn layout_for_picks_expected() {
assert_eq!(layout_for(2, false), &LAYOUT_STEREO);
assert_eq!(layout_for(6, false), &LAYOUT_51);
assert_eq!(layout_for(6, true), &LAYOUT_51_HQ);
assert_eq!(layout_for(8, false), &LAYOUT_71);
assert_eq!(layout_for(8, true), &LAYOUT_71_HQ);
// Unknown / 0 → stereo.
assert_eq!(layout_for(0, false), &LAYOUT_STEREO);
assert_eq!(layout_for(3, false), &LAYOUT_STEREO);
assert_eq!(layout_for(7, true), &LAYOUT_STEREO);
}
#[test]
fn normalize_clamps_to_negotiable() {
assert_eq!(normalize_channels(2), 2);
assert_eq!(normalize_channels(6), 6);
assert_eq!(normalize_channels(8), 8);
for bad in [0u8, 1, 3, 4, 5, 7, 9, 255] {
assert_eq!(normalize_channels(bad), 2, "{bad} must clamp to stereo");
}
}
#[test]
fn wasapi_masks_are_correct() {
assert_eq!(wasapi_channel_mask(2), 0x3);
assert_eq!(wasapi_channel_mask(6), 0x3F);
assert_eq!(wasapi_channel_mask(8), 0x63F); // NOT 0xFF
// Bit count must equal the channel count.
assert_eq!(wasapi_channel_mask(2).count_ones(), 2);
assert_eq!(wasapi_channel_mask(6).count_ones(), 6);
assert_eq!(wasapi_channel_mask(8).count_ones(), 8);
}
#[test]
fn spa_positions_match_wire_order() {
assert_eq!(spa_positions(2), &[3, 4]);
assert_eq!(spa_positions(6), &[3, 4, 5, 6, 12, 13]);
assert_eq!(spa_positions(8), &[3, 4, 5, 6, 12, 13, 7, 8]);
assert_eq!(spa_positions(2).len(), 2);
assert_eq!(spa_positions(6).len(), 6);
assert_eq!(spa_positions(8).len(), 8);
}
/// Real-libopus proof that the shared layout round-trips with channel identity: a tone fed
/// into wire channel N (host `opus::MSEncoder`) comes back out on channel N (client
/// `opus::MSDecoder`), for stereo / 5.1 / 7.1. This is the single guarantee the whole
/// feature rests on — encoder layout == decoder layout == identity mapping — so if a layout
/// constant is ever wrong, this fails. Gated on `quic` (where `opus` is a dependency).
#[cfg(feature = "quic")]
#[test]
fn multistream_layout_roundtrips_with_channel_identity() {
const SR: u32 = 48_000;
const SAMPLES: usize = 240; // 5 ms @ 48 kHz
for &channels in &[2u8, 6, 8] {
let l = layout_for(channels, false);
let ch = l.channels as usize;
let mut enc = opus::MSEncoder::new(
SR,
l.streams,
l.coupled,
l.mapping,
opus::Application::LowDelay,
)
.expect("MSEncoder");
enc.set_bitrate(opus::Bitrate::Bits(l.bitrate)).unwrap();
enc.set_vbr(false).unwrap();
let mut dec =
opus::MSDecoder::new(SR, l.streams, l.coupled, l.mapping).expect("MSDecoder");
for tone_ch in 0..ch {
let mut out = vec![0u8; 4000];
let mut energy = vec![0f64; ch];
// A few frames to clear the codec startup transient before measuring.
for f in 0..8 {
let mut frame = vec![0f32; SAMPLES * ch];
for t in 0..SAMPLES {
let phase = (f * SAMPLES + t) as f32 * 440.0 * 2.0 * std::f32::consts::PI
/ SR as f32;
frame[t * ch + tone_ch] = 0.5 * phase.sin();
}
let n = enc.encode_float(&frame, &mut out).unwrap();
let mut decoded = vec![0f32; SAMPLES * ch];
let got = dec.decode_float(&out[..n], &mut decoded, false).unwrap();
assert_eq!(got, SAMPLES, "{channels}ch frame size");
if f >= 4 {
for t in 0..SAMPLES {
for (c, e) in energy.iter_mut().enumerate() {
*e += (decoded[t * ch + c] as f64).powi(2);
}
}
}
}
let loudest = (0..ch)
.max_by(|&a, &b| energy[a].total_cmp(&energy[b]))
.unwrap();
assert_eq!(
loudest, tone_ch,
"{channels}ch: tone in channel {tone_ch} must come out on {tone_ch} (energies {energy:?})"
);
}
}
}
}
+34 -2
@@ -40,8 +40,9 @@ enum CtrlRequest {
/// mode, the host-resolved compositor backend, the host-resolved gamepad backend, the host's
/// certificate fingerprint, the resolved encoder bitrate (kbps), and the host↔client clock offset
/// (ns, host minus client; 0 = no skew correction / an old host that didn't answer the handshake).
/// The trailing `u8` is the resolved encode bit depth (8/10) and [`ColorInfo`] the resolved colour
/// signalling, both from the [`Welcome`].
/// The trailing `u8`s are the resolved encode bit depth (8/10), the chroma `chroma_format_idc`
/// (1 = 4:2:0, 3 = 4:4:4), and the resolved audio channel count (2/6/8), with [`ColorInfo`] the
/// resolved colour signalling — all from the [`Welcome`].
type Negotiated = (
Mode,
CompositorPref,
@@ -51,6 +52,8 @@ type Negotiated = (
i64,
u8,
ColorInfo,
u8,
u8,
);
/// Accumulated state of an in-flight / finished speed test. The data-plane pump mirrors the
@@ -202,6 +205,17 @@ pub struct NativeClient {
/// decoder/presenter from this. [`ColorInfo::SDR_BT709`] for an older host. The static HDR
/// mastering metadata (when [`ColorInfo::is_hdr`]) arrives via [`NativeClient::next_hdr_meta`].
pub color: ColorInfo,
/// The chroma subsampling the host resolved for this session ([`Welcome::chroma_format`]), as the
/// HEVC `chroma_format_idc`: [`quic::CHROMA_IDC_420`] (4:2:0, the default / older host) or
/// [`quic::CHROMA_IDC_444`] (full-chroma 4:4:4). The in-band SPS is authoritative; this lets the
/// client pre-size its decoder. `CHROMA_IDC_420` for an older host that didn't report it.
pub chroma_format: u8,
/// The audio channel count the host resolved for this session ([`Welcome::audio_channels`]):
/// `2` (stereo), `6` (5.1) or `8` (7.1). The client MUST build its Opus (multistream) decoder
/// from this value (via [`crate::audio::layout_for`]) — never from its own request — so an older
/// host that omits it (→ `2`) yields working stereo. The `0xC9` audio frames are encoded with the
/// matching layout.
pub audio_channels: u8,
}
/// Pin the calling thread to the user-interactive QoS class on Apple targets.
@@ -246,6 +260,9 @@ impl NativeClient {
// VIDEO_CAP_HDR) — the host upgrades to a 10-bit / HDR encode only when the matching bit is
// set. 0 = the 8-bit BT.709 stream every client understands.
video_caps: u8,
// Requested audio channel count (2 = stereo / 6 = 5.1 / 8 = 7.1); the host clamps to what it
// can capture and echoes the result in [`NativeClient::audio_channels`].
audio_channels: u8,
launch: Option<String>,
pin: Option<[u8; 32]>,
identity: Option<(String, String)>,
@@ -298,6 +315,7 @@ impl NativeClient {
gamepad,
bitrate_kbps,
video_caps,
audio_channels,
launch,
pin,
identity,
@@ -329,6 +347,8 @@ impl NativeClient {
clock_offset_ns,
bit_depth,
color,
chroma_format,
audio_channels,
) = match ready_rx.recv_timeout(timeout) {
Ok(Ok(t)) => t,
Ok(Err(e)) => return Err(e),
@@ -360,6 +380,8 @@ impl NativeClient {
clock_offset_ns,
bit_depth,
color,
chroma_format,
audio_channels,
})
}
@@ -666,6 +688,7 @@ struct WorkerArgs {
gamepad: GamepadPref,
bitrate_kbps: u32,
video_caps: u8,
audio_channels: u8,
launch: Option<String>,
pin: Option<[u8; 32]>,
identity: Option<(String, String)>,
@@ -697,6 +720,7 @@ async fn worker_main(args: WorkerArgs) {
gamepad,
bitrate_kbps,
video_caps,
audio_channels,
launch,
pin,
identity,
@@ -763,6 +787,8 @@ async fn worker_main(args: WorkerArgs) {
// VIDEO_CAP_10BIT | VIDEO_CAP_HDR). The host only upgrades to a 10-bit / HDR encode
// when the matching bit is set, so `0` stays an 8-bit BT.709 stream.
video_caps,
// Requested surround channel count; the host echoes the resolved value in Welcome.
audio_channels,
}
.encode(),
)
@@ -834,6 +860,8 @@ async fn worker_main(args: WorkerArgs) {
clock_offset_ns,
welcome.bit_depth,
welcome.color,
welcome.chroma_format,
welcome.audio_channels,
))
};
@@ -850,6 +878,8 @@ async fn worker_main(args: WorkerArgs) {
clock_offset_ns,
bit_depth,
color,
chroma_format,
audio_channels,
) = match setup.await {
Ok(t) => t,
Err(e) => {
@@ -866,6 +896,8 @@ async fn worker_main(args: WorkerArgs) {
clock_offset_ns,
bit_depth,
color,
chroma_format,
audio_channels,
)));
// Input task: embedder events → QUIC datagrams.
+1
@@ -25,6 +25,7 @@
#![forbid(unsafe_op_in_unsafe_fn)]
pub mod abi;
pub mod audio;
#[cfg(feature = "quic")]
pub mod client;
pub mod config;
+98 -7
@@ -78,12 +78,33 @@ pub struct Hello {
/// zero-length name/launch placeholder precedes it when those are absent so the offset stays
/// deterministic. Omitted by older clients (decodes to `0`).
pub video_caps: u8,
/// Requested audio channel count: `2` (stereo, default), `6` (5.1) or `8` (7.1). The host
/// resolves it against what it can capture and echoes the final count in
/// [`Welcome::audio_channels`], which is what both ends build their Opus (multistream)
/// codec from. Appended after `video_caps` as a single trailing byte; when it differs from
/// the stereo default the name/launch/video_caps placeholders are forced (0) so it lands at a
/// deterministic offset. Omitted by older clients / when `2` (decodes to `2`, i.e. stereo) so
/// the stereo wire form stays byte-identical to the pre-surround build.
pub audio_channels: u8,
}
/// [`Hello::video_caps`] bit: the client can decode a 10-bit (Main10) HEVC stream.
pub const VIDEO_CAP_10BIT: u8 = 0x01;
/// [`Hello::video_caps`] bit: the client can present BT.2020 PQ HDR10 (implies 10-bit).
pub const VIDEO_CAP_HDR: u8 = 0x02;
/// [`Hello::video_caps`] bit: the client can decode a full-chroma **4:4:4** HEVC stream (HEVC
/// Range Extensions / Rec.ITU-T H.265 `chroma_format_idc = 3`). The host emits 4:4:4 ONLY when this
/// bit is set, the host opted in (`PUNKTFUNK_444`), the codec is HEVC, **and** the GPU/driver
/// actually supports a 4:4:4 encode (probed) — otherwise the session stays 4:2:0 and
/// [`Welcome::chroma_format`] reflects the real resolved value. Independent of 10-bit/HDR (4:4:4 is a
/// chroma decision, bit depth is a depth decision; the two may combine where the hardware allows).
pub const VIDEO_CAP_444: u8 = 0x04;
/// HEVC `chroma_format_idc` for 4:2:0 — what every pre-4:4:4 build produced and the back-compat
/// default when a peer omits [`Welcome::chroma_format`].
pub const CHROMA_IDC_420: u8 = 1;
/// HEVC `chroma_format_idc` for full-chroma 4:4:4 (Range Extensions).
pub const CHROMA_IDC_444: u8 = 3;
/// Per-session colour signalling (CICP / ITU-T H.273 code points) the host resolved for the
/// encoded video, carried on [`Welcome`]. A client configures its decoder/presenter from these
@@ -198,6 +219,22 @@ pub struct Welcome {
/// [`ColorInfo::SDR_BT709`]. The client configures its decoder/presenter from this instead of
/// guessing from the bitstream; the mastering metadata arrives separately on [`HDR_META_MAGIC`].
pub color: ColorInfo,
/// The chroma subsampling the host actually encodes at, as the HEVC `chroma_format_idc`:
/// [`CHROMA_IDC_420`] (4:2:0, default / older host) or [`CHROMA_IDC_444`] (full-chroma 4:4:4,
/// enabled only when the client advertised [`VIDEO_CAP_444`] *and* the host could open a real
/// 4:4:4 encode). The client sizes its decoder/surface pool from this; the in-band SPS carries
/// the authoritative value, so this is a hint (and the honest-downgrade channel: if 4:4:4 was
/// requested but the GPU declined, this reads `CHROMA_IDC_420`). Appended after the colour
/// bytes as a single trailing byte; an older host that omits it decodes to [`CHROMA_IDC_420`].
pub chroma_format: u8,
/// The audio channel count the host actually resolved and **will** send on the `0xC9` plane:
/// `2` (stereo, default), `6` (5.1) or `8` (7.1). Echoes [`Hello::audio_channels`] clamped to
/// what the host can capture (Linux PipeWire always synthesizes the count; Windows WASAPI
/// loopback is clamped to the render endpoint's mix-format channels). The client builds its Opus
/// (multistream) decoder from THIS value via [`crate::audio::layout_for`] — never from its own
/// request — so an older host that omits the byte (→ `2`) always yields working stereo. Appended
/// after `chroma_format` as a single trailing byte.
pub audio_channels: u8,
}
/// `client → host`: data plane is bound, begin streaming.
@@ -630,10 +667,11 @@ impl Hello {
// so a Hello with neither name nor launch stays byte-identical to the bitrate-era form
// (26 bytes). When `launch` is present we must still emit name's length byte (0 for None)
// so `launch` lands at a deterministic offset.
// `video_caps` is the last trailing field, after `launch`; when it's present (non-zero)
// the name/launch length bytes must still be emitted (0 for absent) so it lands at a
// `video_caps`/`audio_channels` are the trailing fields, after `launch`; when either is
// present (video_caps non-zero / audio_channels not stereo) the name/launch length bytes
// AND the video_caps byte must still be emitted (0 / 0) so the later byte lands at a
// deterministic offset — the same discipline `launch` already imposes on `name`.
let need_placeholders = self.video_caps != 0;
let need_placeholders = self.video_caps != 0 || self.audio_channels != 2;
match (&self.name, &self.launch) {
(None, None) if !need_placeholders => {}
(name, _) => {
@@ -648,10 +686,15 @@ impl Hello {
b.push(l.len() as u8);
b.extend_from_slice(l.as_bytes());
}
// video_caps: single trailing byte. Last field.
if self.video_caps != 0 {
// video_caps: single trailing byte. Emitted when non-zero OR when audio_channels follows
// (so audio_channels lands at a deterministic offset right after it).
if self.video_caps != 0 || self.audio_channels != 2 {
b.push(self.video_caps);
}
// audio_channels: single trailing byte. Last field; omitted when stereo (default).
if self.audio_channels != 2 {
b.push(self.audio_channels);
}
b
}
@@ -714,6 +757,15 @@ impl Hello {
let launch_len = b.get(launch_off).copied().unwrap_or(0) as usize;
b.get(launch_off + 1 + launch_len).copied().unwrap_or(0)
},
// Optional trailing audio-channel byte, one past video_caps. Absent on an older client
// → stereo. Normalized so a corrupt/unsupported value can't build a bad decoder.
audio_channels: {
let name_len = b.get(26).copied().unwrap_or(0) as usize;
let launch_off = 27 + name_len;
let launch_len = b.get(launch_off).copied().unwrap_or(0) as usize;
let video_caps_off = launch_off + 1 + launch_len;
crate::audio::normalize_channels(b.get(video_caps_off + 1).copied().unwrap_or(2))
},
})
}
}
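The placeholder discipline for the optional `Hello` tail can be modelled on its own. The sketch below is a simplified model, not the actual encoder: it covers only the bytes after the fixed prefix, and `Tail`, `encode_tail`, `push_str` and `decode_tail` are hypothetical names. The rule it illustrates is that a field is emitted whenever it, or any later field, is non-default, so every later byte lands at a computable offset.

```rust
/// Hypothetical model of the optional fields that follow the fixed Hello prefix.
pub struct Tail {
    pub name: Option<String>,
    pub launch: Option<String>,
    pub video_caps: u8,
    pub audio_channels: u8,
}

/// Length-prefixed string; an absent string becomes a zero-length placeholder.
fn push_str(b: &mut Vec<u8>, s: Option<&str>) {
    let s = s.unwrap_or("");
    b.push(s.len() as u8); // sketch only: assumes the string is at most 255 bytes
    b.extend_from_slice(s.as_bytes());
}

/// Emit a field when it, or any field after it, differs from its default.
pub fn encode_tail(t: &Tail) -> Vec<u8> {
    let has_audio = t.audio_channels != 2;
    let has_caps = t.video_caps != 0 || has_audio;
    let has_launch = t.launch.is_some() || has_caps;
    let has_name = t.name.is_some() || has_launch;
    let mut b = Vec::new();
    if has_name {
        push_str(&mut b, t.name.as_deref());
    }
    if has_launch {
        push_str(&mut b, t.launch.as_deref());
    }
    if has_caps {
        b.push(t.video_caps);
    }
    if has_audio {
        b.push(t.audio_channels);
    }
    b
}

/// Returns `(video_caps, audio_channels)`; missing bytes decode to the defaults `(0, 2)`.
pub fn decode_tail(b: &[u8]) -> (u8, u8) {
    let name_len = b.first().copied().unwrap_or(0) as usize;
    let launch_off = 1 + name_len;
    let launch_len = b.get(launch_off).copied().unwrap_or(0) as usize;
    let caps_off = launch_off + 1 + launch_len;
    let video_caps = b.get(caps_off).copied().unwrap_or(0);
    // Anything other than 6 or 8 normalizes to stereo.
    let audio = match b.get(caps_off + 1).copied().unwrap_or(2) {
        6 => 6,
        8 => 8,
        _ => 2,
    };
    (video_caps, audio)
}
```

An all-default tail encodes to zero bytes, which is what keeps the stereo form byte-identical to the earlier wire format; requesting 5.1 alone costs four bytes (two empty length bytes, a zero `video_caps`, then `6`).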
@@ -747,6 +799,10 @@ impl Welcome {
b.push(self.color.transfer);
b.push(self.color.matrix);
b.push(self.color.full_range);
// Chroma subsampling at offset 64 — older clients stop before this → 4:2:0 (CHROMA_IDC_420).
b.push(self.chroma_format);
// Audio channel count at offset 65 — older clients stop before this → stereo (2).
b.push(self.audio_channels);
b
}
@@ -755,7 +811,8 @@ impl Welcome {
// scheme[22] pct[23] max_data[24..26] shard[26..28] encrypt[28] key[29..45]
// salt[45..49] frames[49..53] compositor[53] gamepad[54] bitrate_kbps[55..59]
// bit_depth[59] color.primaries[60] color.transfer[61] color.matrix[62] color.range[63]
// (everything from compositor on is an optional trailing byte; an older host stops earlier).
// chroma_format[64] audio_channels[65] (everything from compositor on is an optional
// trailing byte; an older host stops earlier).
if b.len() < 53 || &b[0..4] != MAGIC {
return Err(PunktfunkError::InvalidArg("bad Welcome"));
}
@@ -812,6 +869,15 @@ impl Welcome {
matrix: b.get(62).copied().unwrap_or(ColorInfo::MC_BT709),
full_range: b.get(63).copied().unwrap_or(0),
},
// Optional trailing chroma byte — absent on an older host (or an explicit 0 / unknown
// value) → 4:2:0. Only `CHROMA_IDC_444` flips the client to a 4:4:4 decode.
chroma_format: match b.get(64).copied() {
Some(CHROMA_IDC_444) => CHROMA_IDC_444,
_ => CHROMA_IDC_420,
},
// Optional trailing audio-channel byte — absent on an older host → stereo. Any
// non-{6,8} value normalizes to stereo so a corrupt byte never builds a bad decoder.
audio_channels: crate::audio::normalize_channels(b.get(65).copied().unwrap_or(2)),
})
}
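The back-compat behaviour of the two trailing `Welcome` bytes follows from reading fixed offsets with a default. A minimal sketch, assuming only the offsets 64 and 65 stated in the layout comment; `decode_trailer` is a hypothetical function and the constants are redeclared here so the block stands alone.

```rust
pub const CHROMA_IDC_420: u8 = 1;
pub const CHROMA_IDC_444: u8 = 3;

/// Hypothetical sketch: read `(chroma_format, audio_channels)` from a Welcome buffer.
/// A buffer that ends before an offset, or carries an unknown value there, gets the default.
pub fn decode_trailer(b: &[u8]) -> (u8, u8) {
    let chroma = match b.get(64).copied() {
        Some(CHROMA_IDC_444) => CHROMA_IDC_444,
        _ => CHROMA_IDC_420,
    };
    let audio = match b.get(65).copied() {
        Some(6) => 6,
        Some(8) => 8,
        _ => 2,
    };
    (chroma, audio)
}
```

Truncating the same 66-byte buffer to 65 or 64 bytes models an older host: the missing audio byte reads as stereo, and the missing chroma byte reads as 4:2:0.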
@@ -1809,6 +1875,8 @@ mod tests {
bitrate_kbps: 50_000,
bit_depth: 10,
color: ColorInfo::HDR10_BT2020_PQ,
chroma_format: CHROMA_IDC_444,
audio_channels: 2,
};
assert_eq!(Welcome::decode(&w.encode()).unwrap(), w);
}
@@ -1851,6 +1919,7 @@ mod tests {
name: Some("Test Device".into()),
launch: Some("steam:570".into()),
video_caps: VIDEO_CAP_10BIT,
audio_channels: 2,
};
assert_eq!(Hello::decode(&h.encode()).unwrap(), h);
let s = Start {
@@ -1930,6 +1999,7 @@ mod tests {
name: None,
launch: None,
video_caps: 0,
audio_channels: 2,
};
let enc = h.encode();
assert_eq!(enc.len(), 26);
@@ -1969,9 +2039,11 @@ mod tests {
bitrate_kbps: 120_000,
bit_depth: 10,
color: ColorInfo::HDR10_BT2020_PQ,
chroma_format: CHROMA_IDC_444,
audio_channels: 6, // 5.1 — exercises the non-default trailing byte
};
let wenc = w.encode();
assert_eq!(wenc.len(), 64); // 60 base + 4 colour bytes
assert_eq!(wenc.len(), 66); // 60 base + 4 colour + 1 chroma + 1 audio-channels byte
let legacy_w = Welcome::decode(&wenc[..53]).unwrap();
assert_eq!(legacy_w.compositor, CompositorPref::Auto);
assert_eq!(legacy_w.gamepad, GamepadPref::Auto);
@@ -1991,13 +2063,29 @@ mod tests {
let pre_color_w = Welcome::decode(&wenc[..60]).unwrap();
assert_eq!(pre_color_w.bit_depth, 10);
assert_eq!(pre_color_w.color, ColorInfo::SDR_BT709);
assert_eq!(pre_color_w.chroma_format, CHROMA_IDC_420); // pre-chroma host → 4:2:0
assert_eq!(legacy_w.color, ColorInfo::SDR_BT709);
assert_eq!(legacy_w.chroma_format, CHROMA_IDC_420);
// A pre-chroma (64-byte) Welcome carries colour but no chroma/audio bytes → 4:2:0 + stereo.
let pre_chroma_w = Welcome::decode(&wenc[..64]).unwrap();
assert_eq!(pre_chroma_w.color, ColorInfo::HDR10_BT2020_PQ);
assert_eq!(pre_chroma_w.chroma_format, CHROMA_IDC_420);
assert_eq!(pre_chroma_w.audio_channels, 2); // audio byte (offset 65) absent → stereo
// A pre-audio (65-byte) Welcome carries chroma but no audio byte → 4:4:4 + stereo.
let pre_audio_w = Welcome::decode(&wenc[..65]).unwrap();
assert_eq!(pre_audio_w.chroma_format, CHROMA_IDC_444);
assert_eq!(pre_audio_w.audio_channels, 2);
assert_eq!(Welcome::decode(&wenc).unwrap().bitrate_kbps, 120_000);
assert_eq!(Welcome::decode(&wenc).unwrap().bit_depth, 10); // full form carries it
assert_eq!(
Welcome::decode(&wenc).unwrap().color,
ColorInfo::HDR10_BT2020_PQ
);
assert_eq!(
Welcome::decode(&wenc).unwrap().chroma_format,
CHROMA_IDC_444
); // full form carries 4:4:4
assert_eq!(Welcome::decode(&wenc).unwrap().audio_channels, 6); // ...and 5.1
}
#[test]
@@ -2015,6 +2103,7 @@ mod tests {
name: Some("Enrico's MacBook".into()),
launch: None,
video_caps: 0,
audio_channels: 2,
};
let enc = base.encode();
assert_eq!(
@@ -2062,6 +2151,7 @@ mod tests {
name: None,
launch: None,
video_caps: 0,
audio_channels: 2,
};
// launch alone (no name): a zero-length name placeholder keeps the offset deterministic.
let with_launch = Hello {
@@ -2268,6 +2358,7 @@ mod tests {
name: None,
launch: None,
video_caps: 0,
audio_channels: 2,
}
.encode();
assert!(PairRequest::decode(&h).is_err(), "abi {abi} parsed as pair");
+14 -2
@@ -13,8 +13,10 @@ use std::process::Command;
fn native_libs() -> &'static [&'static str] {
if cfg!(target_os = "macos") {
// The workspace build unifies features into the staticlib, and `quic` pulls
// rustls's platform verifier → Security/CoreFoundation.
// rustls's platform verifier → Security/CoreFoundation, plus libopus (the in-core
// `next_audio_pcm` decode path) which the `abi.rs` object references.
&[
"-lopus",
"-liconv",
"-lm",
"-framework",
@@ -23,7 +25,17 @@ fn native_libs() -> &'static [&'static str] {
"CoreFoundation",
]
} else if cfg!(target_os = "linux") {
&["-lgcc_s", "-lutil", "-lrt", "-lpthread", "-lm", "-ldl"]
// `-lopus`: the `quic` feature pulls in-core Opus decode (`next_audio_pcm`), whose
// symbols the linked `abi.rs` object references. Before `-lm` (opus needs libm).
&[
"-lopus",
"-lgcc_s",
"-lutil",
"-lrt",
"-lpthread",
"-lm",
"-ldl",
]
} else {
&[]
}
+4 -7
@@ -61,9 +61,10 @@ utoipa-scalar = { version = "0.3", features = ["axum"] }
tower = { version = "0.5", features = ["util"] }
http-body-util = "0.1"
# Opus stereo encode for the host->client audio plane. The `opus` crate vendors libopus via
# `audiopus_sys` (cmake-built from source — no system lib, no vcpkg), so it builds on Windows MSVC
# too (needs CMake + NASM, both on the box). Both platforms that have an audio-capture backend.
# Opus encode for the host->client audio plane — stereo (`opus::Encoder`) AND 5.1/7.1 surround
# (`opus::MSEncoder`, the safe multistream API the crate exposes; no `audiopus_sys` needed). The
# crate vendors libopus (cmake-built from source — no system lib, no vcpkg), so it builds on Windows
# MSVC too (needs CMake + NASM, both on the box). Enabled on both platforms that have an audio-capture backend.
[target.'cfg(any(target_os = "linux", target_os = "windows"))'.dependencies]
opus = "0.3"
@@ -99,10 +100,6 @@ serde_json = "1"
rusqlite = { version = "0.40", features = ["bundled"] }
# Builds/validates the xkb keymap uploaded to the virtual keyboard + tracks modifier state.
xkbcommon = "0.8"
# The safe `opus` crate is stereo-only; surround (5.1/7.1) needs the libopus *multistream*
# encoder (`opus_multistream_encoder_*`). `audiopus_sys` is the sys layer `opus` already
# vendors (same libopus link), so this adds bindings, not a second copy of the library.
audiopus_sys = "0.2"
# libei (EI sender) for the portable input path on KWin/GNOME (RemoteDesktop portal).
# The `tokio` feature wires reis's event stream into tokio's reactor.
reis = { version = "0.6.1", features = ["tokio"] }
@@ -0,0 +1,73 @@
<?xml version="1.0" encoding="UTF-8"?>
<protocol name="fake_input">
<copyright>
SPDX-FileCopyrightText: 2015 Martin Gräßlin
SPDX-License-Identifier: LGPL-2.1-or-later
</copyright>
<interface name="org_kde_kwin_fake_input" version="4">
<description summary="Fake input manager">
This interface allows other processes to provide fake input events.
Its purpose is, on the one hand, to provide testing facilities like XTest
on X11 and, on the other, to support use cases like remote control (a remote
desktop server). The compositor gates the interface: it is only exposed
to clients authorized through their .desktop X-KDE-Wayland-Interfaces, so
binding it is the authorization — no per-event confirmation dialog.
</description>
<request name="authenticate">
<description summary="Information about the application requesting fake input">
A FakeInput is required to authenticate itself by providing the
application name and the reason for fake input. The compositor may use
this information to decide whether to allow or deny the request.
</description>
<arg name="application" type="string" summary="user visible name of the application requesting fake input"/>
<arg name="reason" type="string" summary="reason why fake input is requested"/>
</request>
<request name="pointer_motion">
<description summary="pointer motion event"/>
<arg name="delta_x" type="fixed" summary="X delta of the relative pointer motion"/>
<arg name="delta_y" type="fixed" summary="Y delta of the relative pointer motion"/>
</request>
<request name="button">
<description summary="pointer button event"/>
<arg name="button" type="uint" summary="evdev button code"/>
<arg name="state" type="uint" summary="button state, 0 released, 1 pressed"/>
</request>
<request name="axis">
<description summary="pointer axis (scroll) event"/>
<arg name="axis" type="uint" summary="wl_pointer.axis (0 vertical, 1 horizontal)"/>
<arg name="value" type="fixed" summary="axis value"/>
</request>
<request name="touch_down" since="2">
<description summary="touch down event"/>
<arg name="id" type="uint" summary="unique id of this touch point; must not be reused until up"/>
<arg name="x" type="fixed" summary="x coordinate in global compositor space"/>
<arg name="y" type="fixed" summary="y coordinate in global compositor space"/>
</request>
<request name="touch_motion" since="2">
<description summary="touch motion event"/>
<arg name="id" type="uint" summary="unique id of an existing touch point"/>
<arg name="x" type="fixed" summary="x coordinate in global compositor space"/>
<arg name="y" type="fixed" summary="y coordinate in global compositor space"/>
</request>
<request name="touch_up" since="2">
<description summary="touch up event"/>
<arg name="id" type="uint" summary="unique id of an existing touch point"/>
</request>
<request name="touch_cancel" since="2">
<description summary="cancel all current touch points"/>
</request>
<request name="touch_frame" since="2">
<description summary="end a set of touch events (atomic frame)"/>
</request>
<request name="pointer_motion_absolute" since="3">
<description summary="absolute pointer motion event"/>
<arg name="x" type="fixed" summary="x coordinate in global compositor space"/>
<arg name="y" type="fixed" summary="y coordinate in global compositor space"/>
</request>
<request name="keyboard_key" since="4">
<description summary="keyboard key event"/>
<arg name="button" type="uint" summary="evdev key code"/>
<arg name="state" type="uint" summary="key state, 0 released, 1 pressed"/>
</request>
</interface>
</protocol>
+8 -1
@@ -320,11 +320,18 @@ fn mic_pw_thread(
.into_inner();
let mut params = [Pod::from_bytes(&values).context("mic pod from bytes")?];
// RT_PROCESS: run the producer callback on PipeWire's realtime data loop, so the source is a
// *synchronous* graph node that joins its consumer's driver group and is actually driven. Without
// it the node is async/main-loop and, in the host's busy multi-stream graph (desktop-audio +
// video capture + the session), never acquires a driver — it stays suspended and its process()
// never fires, so every recorder hears pure silence (the long-standing "Linux host mic broken").
stream
.connect(
spa::utils::Direction::Output, // we PRODUCE samples (a source)
None,
pw::stream::StreamFlags::AUTOCONNECT | pw::stream::StreamFlags::MAP_BUFFERS,
pw::stream::StreamFlags::AUTOCONNECT
| pw::stream::StreamFlags::MAP_BUFFERS
| pw::stream::StreamFlags::RT_PROCESS,
&mut params,
)
.context("pw mic stream connect")?;
@@ -1,7 +1,9 @@
//! WASAPI loopback capture of the default render endpoint (system output) — the Windows analogue
//! of the PipeWire sink-monitor backend. Delivers interleaved f32 PCM at 48 kHz stereo, ready for
//! the existing Opus path with NO resampling (WASAPI shared-mode autoconvert does any SRC). WASAPI
//! objects are COM-apartment-bound and not `Send`, so they live on a dedicated thread (mirrors
//! of the PipeWire sink-monitor backend. Delivers interleaved f32 PCM at 48 kHz in the requested
//! channel count (stereo / 5.1 / 7.1, canonical wire order FL FR FC LFE RL RR SL SR via the
//! explicit `dwChannelMask`), ready for the Opus path with NO resampling (WASAPI shared-mode
//! autoconvert does any SRC + up/downmix to the requested layout). WASAPI objects are
//! COM-apartment-bound and not `Send`, so they live on a dedicated thread (mirrors
//! `linux::PwAudioCapturer`); only the channel + stop flag + join handle are in the struct.
use super::{AudioCapturer, SAMPLE_RATE};
@@ -14,9 +16,6 @@ use std::thread::{self, JoinHandle};
use std::time::Duration;
use wasapi::{DeviceEnumerator, Direction, SampleType, StreamMode, WaveFormat};
// 48 kHz stereo 32-bit float: 2 channels * 4 bytes = 8 bytes per frame.
const BLOCK_ALIGN: usize = 2 * 4;
pub struct WasapiLoopbackCapturer {
chunks: Receiver<Vec<f32>>,
channels: u32,
@@ -27,8 +26,8 @@ pub struct WasapiLoopbackCapturer {
impl WasapiLoopbackCapturer {
pub fn open(channels: u32) -> Result<WasapiLoopbackCapturer> {
anyhow::ensure!(
channels == 2,
"WASAPI loopback backend is stereo-only (got {channels})"
matches!(channels, 2 | 6 | 8),
"WASAPI loopback backend supports 2/6/8 channels (got {channels})"
);
let (tx, rx) = sync_channel::<Vec<f32>>(64);
let stop = Arc::new(AtomicBool::new(false));
@@ -39,7 +38,7 @@ impl WasapiLoopbackCapturer {
let join = thread::Builder::new()
.name("punktfunk-wasapi-audio".into())
.spawn(move || {
if let Err(e) = capture_thread(tx, stop_t, ready_tx) {
if let Err(e) = capture_thread(tx, stop_t, ready_tx, channels) {
tracing::error!(error = format!("{e:#}"), "wasapi loopback thread failed");
}
})
@@ -47,7 +46,8 @@ impl WasapiLoopbackCapturer {
match ready_rx.recv_timeout(Duration::from_secs(3)) {
Ok(Ok(())) => {
tracing::info!(
"WASAPI loopback capture: 48 kHz stereo f32 (default render endpoint)"
channels,
"WASAPI loopback capture: 48 kHz f32 (default render endpoint)"
);
Ok(WasapiLoopbackCapturer {
chunks: rx,
@@ -95,7 +95,10 @@ fn capture_thread(
tx: SyncSender<Vec<f32>>,
stop: Arc<AtomicBool>,
ready: SyncSender<Result<()>>,
channels: u32,
) -> Result<()> {
// Interleaved f32: channels * 4 bytes per frame.
let block_align = channels as usize * 4;
// COM must be initialized on THIS thread (MTA), before any device call.
if let Err(e) = wasapi::initialize_mta()
.ok()
@@ -106,16 +109,29 @@ fn capture_thread(
}
let res = (|| -> Result<()> {
// Loopback = capture the RENDER endpoint: get the default render device, but open a CAPTURE
// client with loopback=true over it.
// client with loopback=true over it. NOTE: the virtual mic (`super::wasapi_mic`) is guarded
// to NEVER target this same endpoint — otherwise the client's injected mic would be captured
// here and streamed back to the client (infinite echo). Keep that guard in sync if this
// device selection ever changes.
let device = DeviceEnumerator::new()
.context("DeviceEnumerator")?
.get_default_device(&Direction::Render)
.context("default render endpoint (loopback needs a render device)")?;
let mut audio_client = device.get_iaudioclient().context("IAudioClient")?;
// 48 kHz stereo f32 interleaved; autoconvert lets WASAPI's shared-mode SRC match the engine
// mix format to ours, so we never resample in Rust. Loopback is implied by capturing a
// RENDER device with Direction::Capture in shared mode (wasapi sets STREAMFLAGS_LOOPBACK).
let desired = WaveFormat::new(32, 32, &SampleType::Float, SAMPLE_RATE as usize, 2, None);
// 48 kHz f32 interleaved in the requested channel layout; autoconvert lets WASAPI's
// shared-mode SRC match the engine mix format to ours (incl. up/downmix to the requested
// channel count), so we never resample/remix in Rust. The explicit dwChannelMask pins the
// wire order (FL FR FC LFE RL RR SL SR; 7.1 = 0x63F, not 0xFF). Loopback is implied by
// capturing a RENDER device with Direction::Capture in shared mode (STREAMFLAGS_LOOPBACK).
let mask = punktfunk_core::audio::wasapi_channel_mask(channels as u8);
let desired = WaveFormat::new(
32,
32,
&SampleType::Float,
SAMPLE_RATE as usize,
channels as usize,
Some(mask),
);
let (default_period, _min_period) =
audio_client.get_device_period().context("device period")?;
let mode = StreamMode::EventsShared {
@@ -151,7 +167,7 @@ fn capture_thread(
Err(e) => return Err(anyhow!("get_next_packet_size: {e}")),
}
}
let whole = (bytes.len() / BLOCK_ALIGN) * BLOCK_ALIGN;
let whole = (bytes.len() / block_align) * block_align;
if whole == 0 {
continue;
}
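The whole-frame rounding above can be sketched in isolation. This is a minimal illustration, assuming interleaved little-endian f32 samples; `whole_frames_f32` is a hypothetical helper, not a function in this file, and it simply drops a trailing partial frame, which the real capture loop may handle differently.

```rust
/// Hypothetical helper (not part of this commit): keep only whole frames of
/// interleaved f32 PCM and decode them from little-endian bytes.
fn whole_frames_f32(bytes: &[u8], channels: usize) -> Vec<f32> {
    // One frame holds one 4-byte f32 sample per channel.
    let block_align = channels * 4;
    if block_align == 0 {
        return Vec::new();
    }
    // Round down to a multiple of the frame size; a trailing partial frame is dropped.
    let whole = (bytes.len() / block_align) * block_align;
    bytes[..whole]
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}
```

Deriving the frame size from the channel count, rather than a fixed stereo constant, is what lets the same rounding serve 2, 6 and 8 channels.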
@@ -5,8 +5,18 @@
//!
//! Target device, by friendly-name substring (first match wins; override with `PUNKTFUNK_MIC_DEVICE`):
//! "Steam Streaming Microphone" (ships with Steam Remote Play — exactly this purpose), VB-Audio
//! "CABLE Input", VoiceMeeter, or anything with "virtual" in the name. If none is present we return an
//! error with install guidance and the host runs without mic passthrough.
//! "CABLE Input", VoiceMeeter, or anything with "virtual" in the name. If none is present we
//! auto-install the Steam Streaming audio pair (see [`install_steam_audio_pair`]); failing that we
//! return an error with install guidance and the host runs without mic passthrough.
//!
//! **Anti-echo guard (the whole point of this being non-trivial).** The desktop-audio plane
//! ([`super::wasapi_cap`]) loopback-captures the **default render endpoint**. WASAPI loopback
//! captures the *mixed* output of an endpoint — i.e. everything any app renders to it, including
//! what THIS module writes. So if the virtual-mic target is the same device the loopback captures,
//! the client's uplinked mic is captured straight back into the host→client audio stream: an
//! infinite echo. [`find_device`] therefore **excludes the default render endpoint** from the
//! candidates — the mic is guaranteed to land on a different device. (Linux gets this for free: its
//! mic is a dedicated `Audio/Source` node, structurally separate from the monitored sink.)
//!
//! `push` enqueues decoded interleaved-f32 PCM into a bounded ring (drop-oldest beyond ~80 ms so mic
//! latency stays bounded); a dedicated COM-apartment thread renders it event-driven, filling silence
@@ -113,8 +123,23 @@ impl VirtualMic for WasapiVirtualMic {
}
}
/// Resolve the virtual-mic target among render endpoints by friendly-name. Logs all candidates so a
/// missing device is diagnosable.
/// The endpoint ID of the device the desktop-audio loopback records (the **default render
/// endpoint**, see [`super::wasapi_cap`]). The virtual mic must never target this device — injecting
/// there echoes the client's mic back into the host→client audio stream. `None` if it can't be
/// resolved (then [`find_device`] can't prove a candidate is safe and falls back to name-only
/// matching — no worse than before the guard existed).
fn default_render_id() -> Option<String> {
wasapi::DeviceEnumerator::new()
.ok()?
.get_default_device(&Direction::Render)
.ok()?
.get_id()
.ok()
}
/// Resolve the virtual-mic target among render endpoints by friendly-name, **excluding the endpoint
/// the loopback captures** (the [`default_render_id`] anti-echo guard). Logs all candidates so a
/// missing/skipped device is diagnosable.
fn find_device() -> Result<wasapi::Device> {
let enumerator = wasapi::DeviceEnumerator::new().context("DeviceEnumerator")?;
let collection = enumerator
@@ -124,8 +149,11 @@ fn find_device() -> Result<wasapi::Device> {
let want = std::env::var("PUNKTFUNK_MIC_DEVICE")
.ok()
.map(|s| s.to_lowercase());
// The device the loopback captures — a name match on it is rejected below (would echo).
let loopback_id = default_render_id();
let mut names = Vec::new();
let mut found = None;
let mut skipped_loopback = false;
for i in 0..n {
let Ok(dev) = collection.get_device_at_index(i) else {
continue;
@@ -137,16 +165,37 @@ fn find_device() -> Result<wasapi::Device> {
None => CANDIDATES.iter().any(|c| lname.contains(c)),
};
if hit && found.is_none() {
found = Some(dev);
// Anti-echo guard: never inject into the endpoint the loopback captures.
let is_loopback = match (dev.get_id().ok(), loopback_id.as_deref()) {
(Some(id), Some(lb)) => id == lb,
_ => false,
};
if is_loopback {
skipped_loopback = true;
tracing::warn!(device = %name,
"virtual-mic candidate is the loopback (default render) endpoint — skipping; \
injecting there would echo the client's mic into the desktop-audio stream");
} else {
found = Some(dev);
}
}
names.push(name);
}
found.ok_or_else(|| {
anyhow!(
"no virtual-mic device among render endpoints {names:?}. Install VB-Audio Virtual Cable \
or enable Steam Remote Play's microphone (Steam Streaming Microphone), or set \
PUNKTFUNK_MIC_DEVICE=<friendly-name substring>."
)
if skipped_loopback {
anyhow!(
"the only virtual-mic candidate among render endpoints {names:?} is the default \
playback device the host loopback-captures — injecting there would echo the mic \
back to the client. Add a SEPARATE virtual audio device for the mic (e.g. the Steam \
Streaming Microphone) or set a different default playback device, then reconnect."
)
} else {
anyhow!(
"no virtual-mic device among render endpoints {names:?}. Install VB-Audio Virtual \
Cable or enable Steam Remote Play's microphone (Steam Streaming Microphone), or set \
PUNKTFUNK_MIC_DEVICE=<friendly-name substring>."
)
}
})
}
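The anti-echo selection in `find_device` can be sketched without any WASAPI types. This is an illustration under stated assumptions: `Endpoint`, `PickError` and `pick_mic_target` are hypothetical names, and unlike the real function the sketch returns on the first safe match instead of continuing the loop to log every endpoint name.

```rust
/// Hypothetical stand-in for a render endpoint (the real code uses `wasapi::Device`).
#[derive(Debug, Clone, PartialEq)]
struct Endpoint {
    id: String,
    name: String,
}

#[derive(Debug, PartialEq)]
enum PickError {
    /// No endpoint name matched any candidate substring.
    NoCandidate,
    /// Every name match was the endpoint the loopback captures.
    OnlyLoopback,
}

/// Pick the first endpoint whose lowercased name contains a candidate substring,
/// skipping the endpoint whose ID equals `loopback_id`.
fn pick_mic_target<'a>(
    endpoints: &'a [Endpoint],
    candidates: &[&str],
    loopback_id: Option<&str>,
) -> Result<&'a Endpoint, PickError> {
    let mut skipped_loopback = false;
    for ep in endpoints {
        let lname = ep.name.to_lowercase();
        if !candidates.iter().any(|c| lname.contains(*c)) {
            continue;
        }
        // With an unknown loopback ID nothing can be proven unsafe, so this
        // comparison is false and the choice falls back to name-only matching.
        if loopback_id == Some(ep.id.as_str()) {
            skipped_loopback = true;
            continue;
        }
        return Ok(ep);
    }
    Err(if skipped_loopback {
        PickError::OnlyLoopback
    } else {
        PickError::NoCandidate
    })
}
```

Keeping the two failure cases distinct is what allows the two different error messages: one asks for a separate virtual device, the other for any virtual device at all.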
@@ -156,15 +205,15 @@ fn find_or_install_device() -> Result<wasapi::Device> {
match find_device() {
Ok(d) => Ok(d),
Err(e) => {
tracing::info!("no virtual mic device present — attempting auto-install");
// SAFETY: `try_install_virtual_mic` is `unsafe` only because it `LoadLibraryExW`s
tracing::info!("no usable virtual mic device present — attempting auto-install");
// SAFETY: `install_steam_audio_pair` is `unsafe` only because it `LoadLibraryExW`s
// `newdev.dll` and calls `DiInstallDriverW` through a `transmute`d function pointer;
// calling it imposes no extra precondition here (it takes no args and aliases nothing).
// Its internal contract holds: the `DiInstall` type matches the documented
// `BOOL DiInstallDriverW(HWND, PCWSTR, DWORD, PBOOL)` ABI, and it passes a
// NUL-terminated UTF-16 INF path with null/zero optional args. Invoked once on the
// dedicated mic thread.
if unsafe { try_install_virtual_mic() } {
if unsafe { install_steam_audio_pair() } {
find_device()
} else {
Err(e)
@@ -173,13 +222,26 @@ fn find_or_install_device() -> Result<wasapi::Device> {
}
}
/// Best-effort: install a virtual mic device so one exists without the user installing anything.
/// Mirrors Apollo's Steam Streaming Speakers install — Steam Remote Play ships
/// `SteamStreamingMicrophone.inf` next to the speakers INF, so install it via `DiInstallDriverW`
/// (loaded from `newdev.dll`, like Apollo, to avoid an extra windows-crate feature). Needs admin (the
/// host runs as SYSTEM). Returns true on success; false (no-op) if Steam isn't installed (INF absent),
/// the install is denied, or `PUNKTFUNK_NO_MIC_INSTALL` is set.
unsafe fn try_install_virtual_mic() -> bool {
/// Best-effort: install BOTH Steam Streaming audio devices (the "Steam pair") so mic passthrough
/// works out of the box and the host has a desktop-audio sink distinct from the mic. Steam Remote
/// Play ships `SteamStreamingMicrophone.inf` + `SteamStreamingSpeakers.inf`: the microphone gives the
/// virtual mic a target whose **capture** endpoint apps record from, and the speakers give a
/// **render** endpoint a headless box can loopback-capture that is NOT the mic — so the loopback and
/// the mic land on different devices and never echo (see [`find_device`]). Returns true if either
/// installed. No-op when Steam isn't installed (INFs absent), the install is denied (needs admin —
/// the host runs as SYSTEM), or `PUNKTFUNK_NO_MIC_INSTALL` is set.
unsafe fn install_steam_audio_pair() -> bool {
// Microphone first (the mic's actual target); speakers second (the distinct desktop-audio sink).
let mic = try_install_steam_audio("SteamStreamingMicrophone.inf");
let spk = try_install_steam_audio("SteamStreamingSpeakers.inf");
mic || spk
}
/// Install one Steam Streaming driver INF by filename via `DiInstallDriverW` (loaded from
/// `newdev.dll`, like Apollo, to avoid an extra windows-crate feature). See
/// [`install_steam_audio_pair`] for the contract; `inf_name` is a bare filename under Steam's
/// per-arch `drivers\Windows10\{arch}\` directory.
unsafe fn try_install_steam_audio(inf_name: &str) -> bool {
use windows::core::{s, w, PCWSTR};
use windows::Win32::Foundation::HWND;
use windows::Win32::System::Environment::ExpandEnvironmentStringsW;
@@ -197,12 +259,11 @@ unsafe fn try_install_virtual_mic() -> bool {
let subdir = "arm64";
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
let subdir = "x86";
let template: Vec<u16> = format!(
"%CommonProgramFiles(x86)%\\Steam\\drivers\\Windows10\\{subdir}\\SteamStreamingMicrophone.inf"
)
.encode_utf16()
.chain(std::iter::once(0))
.collect();
let template: Vec<u16> =
format!("%CommonProgramFiles(x86)%\\Steam\\drivers\\Windows10\\{subdir}\\{inf_name}")
.encode_utf16()
.chain(std::iter::once(0))
.collect();
let mut path = vec![0u16; 1024];
let n = ExpandEnvironmentStringsW(PCWSTR(template.as_ptr()), Some(path.as_mut_slice()));
if n == 0 || n as usize > path.len() {
@@ -210,7 +271,7 @@ unsafe fn try_install_virtual_mic() -> bool {
}
let Ok(newdev) = LoadLibraryExW(w!("newdev.dll"), None, LOAD_LIBRARY_SEARCH_SYSTEM32) else {
tracing::warn!("could not load newdev.dll — virtual-mic auto-install unavailable");
tracing::warn!("could not load newdev.dll — Steam-audio auto-install unavailable");
return false;
};
let Some(addr) = GetProcAddress(newdev, s!("DiInstallDriverW")) else {
@@ -226,13 +287,17 @@ unsafe fn try_install_virtual_mic() -> bool {
std::ptr::null_mut(),
) != 0;
if ok {
tracing::info!("installed the Steam Streaming Microphone virtual device");
tracing::info!(
inf = inf_name,
"installed a Steam Streaming virtual audio device"
);
std::thread::sleep(Duration::from_secs(5)); // let the audio subsystem register the endpoint
} else {
let err = windows::Win32::Foundation::GetLastError();
tracing::info!(
inf = inf_name,
?err,
"no virtual mic auto-installed (Steam absent / not admin) — see manual-install guidance"
"Steam-audio device not auto-installed (Steam absent / not admin) — see install guidance"
);
}
ok
+35 -10
@@ -62,6 +62,11 @@ pub struct OutputFormat {
/// HDR: the capturer converts to 10-bit (IDD-push FP16 → `Rgb10a2`; the DDA secure-desktop HDR hint).
/// `false` = 8-bit SDR.
pub hdr: bool,
/// Full-chroma 4:4:4 session: the capturer must keep full chroma — deliver packed **RGB**
/// (`Bgra` / `Rgb10a2`), NOT the subsampled `Nv12`/`P010` the Windows video-engine path produces by
/// default — because 4:4:4 can only be recovered from a full-chroma source. NVENC then does the
/// RGB→YUV444 CSC at encode (chroma_format_idc=3). `false` on every 4:2:0 session.
pub chroma_444: bool,
}
impl OutputFormat {
@@ -73,6 +78,8 @@ impl OutputFormat {
OutputFormat {
gpu: gpu_encode(),
hdr,
// The GameStream + spike paths are always 4:2:0 (4:4:4 is punktfunk/1-native only).
chroma_444: false,
}
}
}
@@ -361,13 +368,16 @@ pub fn open_portal_monitor() -> Result<Box<dyn Capturer>> {
#[cfg(target_os = "linux")]
pub fn capture_virtual_output(
vout: crate::vdisplay::VirtualOutput,
_want: OutputFormat,
want: OutputFormat,
_capture: crate::session_plan::CaptureBackend,
) -> Result<Box<dyn Capturer>> {
// The Linux host stays 8-bit (HDR is blocked upstream) and the portal negotiates its own format, so
// the `OutputFormat` is unused here; the capture backend is always the portal (the `CaptureBackend`
// arg is a Windows-only dispatch — ignored here).
linux::PortalCapturer::from_virtual_output(vout).map(|c| Box::new(c) as Box<dyn Capturer>)
// The Linux host stays 8-bit (HDR is blocked upstream) and the portal negotiates its own pixel
// format, so only `want.gpu` is honored here: it gates GPU zero-copy capture (the capture backend
// is always the portal — the `CaptureBackend` arg is a Windows-only dispatch). `gpu = false`
// (a 4:4:4 NVENC session) forces the CPU mmap path so the encoder gets CPU-resident RGB to swscale
// into YUV444P — otherwise it would receive CUDA frames and bail.
linux::PortalCapturer::from_virtual_output(vout, want.gpu)
.map(|c| Box::new(c) as Box<dyn Capturer>)
}
/// `PUNKTFUNK_NO_WGC=1` forces the pure single-process DDA (Desktop Duplication) path everywhere: it
@@ -394,6 +404,14 @@ pub fn capture_virtual_output(
})?;
let pref = vout.preferred_mode;
let keep = vout.keepalive;
// Full-chroma 4:4:4 needs a full-chroma RGB source. The IDD-push and WGC paths emit subsampled
// NV12/P010 by default, which can't reconstruct 4:4:4; route a 4:4:4 session to DDA, which delivers
// RGB (Bgra) when its `chroma_444` flag is set. (IDD-push/WGC 4:4:4 capture is a follow-up.)
if want.chroma_444 && capture != CaptureBackend::Dda {
tracing::info!("4:4:4 session — using DDA capture (RGB source) instead of {capture:?}");
return dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false, want.chroma_444)
.map(|c| Box::new(c) as Box<dyn Capturer>);
}
// P2 direct frame push (kill DDA): consume frames straight from the pf-vdisplay driver's shared
// ring — no Desktop Duplication, no win32u reparenting hook. Resolved once in the `SessionPlan`
// (was re-derived from `config().idd_push` here); `IddPush` takes the keepalive (owns the virtual
@@ -414,8 +432,15 @@ pub fn capture_virtual_output(
error = %format!("{e:#}"),
"IDD-push open/attach failed — falling back to DDA"
);
return dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false)
.map(|c| Box::new(c) as Box<dyn Capturer>);
return dxgi::DuplCapturer::open(
target,
pref,
keep,
want.gpu,
false,
want.chroma_444,
)
.map(|c| Box::new(c) as Box<dyn Capturer>);
}
}
}
@@ -426,7 +451,7 @@ pub fn capture_virtual_output(
// chosen backend (it owns the SudoVDA keepalive), so there's no open-time auto-fallback. The
// backend choice (`dda`/`dxgi`/`PUNKTFUNK_NO_WGC` → DDA, else WGC) is now resolved once in the plan.
if capture == CaptureBackend::Dda {
return dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false)
return dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false, want.chroma_444)
.map(|c| Box::new(c) as Box<dyn Capturer>);
}
// WGC default, with a watchdog'd DDA fallback. WGC's Direct3D11CaptureFramePool::CreateFreeThreaded
@@ -461,12 +486,12 @@ pub fn capture_virtual_output(
}
Ok(Err(e)) => {
tracing::warn!(error = %format!("{e:#}"), "WGC open failed — falling back to DDA");
dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false)
dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false, want.chroma_444)
.map(|c| Box::new(c) as Box<dyn Capturer>)
}
Err(_) => {
tracing::warn!("WGC open timed out (CreateFreeThreaded hang on the virtual display) — falling back to DDA");
dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false)
dxgi::DuplCapturer::open(target, pref, keep, want.gpu, false, want.chroma_444)
.map(|c| Box::new(c) as Box<dyn Capturer>)
}
}
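The routing rule in this file reduces to one decision: a 4:4:4 session always ends up on DDA, whatever backend the plan chose. A minimal sketch, assuming a simplified `Backend` enum; both `Backend` and `effective_backend` are hypothetical names, not the crate's `CaptureBackend` API.

```rust
/// Hypothetical mirror of the planned capture backend (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Backend {
    IddPush,
    Wgc,
    Dda,
}

/// A full-chroma session needs an RGB source, which per the comments above only
/// the DDA path delivers today; every other session keeps the planned backend.
fn effective_backend(planned: Backend, chroma_444: bool) -> Backend {
    if chroma_444 {
        Backend::Dda
    } else {
        planned
    }
}
```

The fallback arms forward `want.chroma_444` as well, so a session that reaches DDA through an IDD-push or WGC failure still gets the RGB source.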
+70 -5
@@ -40,6 +40,13 @@ pub struct PortalCapturer {
/// branch to tell "format never negotiated" (modifier/format mismatch) apart from "negotiated
/// but no buffers arrived" (compositor idle/unmapped) — the two black-screen root causes.
negotiated: Arc<AtomicBool>,
/// True only while the PipeWire stream is `Streaming`. [`try_latest`](Self::try_latest) reads it
/// to distinguish a static desktop (alive, no new buffers) from a dead source (left `Streaming`).
streaming: Arc<AtomicBool>,
/// When the stream first dropped out of `Streaming` with no new frame; used to grace a transient
/// renegotiation before declaring the source lost. Cleared whenever a frame arrives or the stream
/// is `Streaming`.
stall_since: Option<std::time::Instant>,
/// The PipeWire node this capturer consumes — surfaced in error messages for diagnosis.
node_id: u32,
/// Stops the PipeWire loop on teardown (sent in `Drop`). Without it a dropped or failed
@@ -82,21 +89,29 @@ impl PortalCapturer {
node_id,
"ScreenCast portal session started; connecting PipeWire"
);
Ok(spawn_pipewire(Some(fd), node_id, None)?.into_capturer(node_id, None))
// This portal path (GameStream / monitor capture) is always 4:2:0, so allow zero-copy as before.
Ok(spawn_pipewire(Some(fd), node_id, None, true)?.into_capturer(node_id, None))
}
/// Build a capturer from an already-created virtual output ([`crate::vdisplay::VirtualOutput`]):
/// connect PipeWire to its node (`remote_fd` selects portal-remote vs. default-daemon) and
/// take ownership of its keepalive so the output lives exactly as long as this capturer. This
/// is how the client's requested resolution becomes the captured resolution without scaling.
pub fn from_virtual_output(vout: crate::vdisplay::VirtualOutput) -> Result<PortalCapturer> {
/// `allow_zerocopy` mirrors [`OutputFormat::gpu`](crate::capture::OutputFormat): `false` forces the
/// CPU mmap path (a 4:4:4 NVENC session needs CPU-resident RGB), `true` keeps the GPU zero-copy
/// path subject to `PUNKTFUNK_ZEROCOPY`.
pub fn from_virtual_output(
vout: crate::vdisplay::VirtualOutput,
allow_zerocopy: bool,
) -> Result<PortalCapturer> {
tracing::info!(
node_id = vout.node_id,
allow_zerocopy,
"connecting PipeWire to virtual output"
);
let node_id = vout.node_id;
Ok(
spawn_pipewire(vout.remote_fd, node_id, vout.preferred_mode)?
spawn_pipewire(vout.remote_fd, node_id, vout.preferred_mode, allow_zerocopy)?
.into_capturer(node_id, Some(vout.keepalive)),
)
}
@@ -109,6 +124,7 @@ struct PwHandles {
frames: Receiver<CapturedFrame>,
active: Arc<AtomicBool>,
negotiated: Arc<AtomicBool>,
streaming: Arc<AtomicBool>,
quit: ::pipewire::channel::Sender<()>,
join: thread::JoinHandle<()>,
}
@@ -121,6 +137,8 @@ impl PwHandles {
frames: self.frames,
active: self.active,
negotiated: self.negotiated,
streaming: self.streaming,
stall_since: None,
node_id,
quit: Some(self.quit),
join: Some(self.join),
@@ -136,6 +154,12 @@ fn spawn_pipewire(
fd: Option<OwnedFd>,
node_id: u32,
preferred: Option<(u32, u32, u32)>,
// Allow GPU zero-copy capture (dmabuf→CUDA/VA). `false` forces the CPU mmap path even when
// `PUNKTFUNK_ZEROCOPY` is set — a 4:4:4 NVENC session needs CPU-resident RGB (the encoder
// swscales RGB→YUV444P; `hevc_nvenc` can't encode 4:4:4 from a CUDA RGB surface), so the session plan
// passes `gpu = false` for it. Without this, a 4:4:4 session under `PUNKTFUNK_ZEROCOPY=1` would
// get CUDA frames and the encoder would bail (`want_444 && cuda`).
allow_zerocopy: bool,
) -> Result<PwHandles> {
// Frames flow from the pipewire thread over a small bounded channel.
let (frame_tx, frame_rx) = sync_channel::<CapturedFrame>(8);
@@ -143,11 +167,13 @@ fn spawn_pipewire(
let active_cb = active.clone();
let negotiated = Arc::new(AtomicBool::new(false));
let negotiated_cb = negotiated.clone();
let streaming = Arc::new(AtomicBool::new(false));
let streaming_cb = streaming.clone();
// pipewire's own cross-thread channel: the receiver attaches to the loop and quits it; the
// sender lives on the capturer and fires in its `Drop`. Absolute `::pipewire` path — the
// inner `mod pipewire` shadows the crate name at this scope.
let (quit_tx, quit_rx) = ::pipewire::channel::channel::<()>();
let zerocopy = crate::zerocopy::enabled();
let zerocopy = allow_zerocopy && crate::zerocopy::enabled();
let join = thread::Builder::new()
.name("punktfunk-pipewire".into())
.spawn(move || {
@@ -157,6 +183,7 @@ fn spawn_pipewire(
frame_tx,
active_cb,
negotiated_cb,
streaming_cb,
zerocopy,
preferred,
quit_rx,
@@ -169,6 +196,7 @@ fn spawn_pipewire(
frames: frame_rx,
active,
negotiated,
streaming,
quit: quit_tx,
join,
})
@@ -219,6 +247,28 @@ impl Capturer for PortalCapturer {
}
}
}
if latest.is_some() || self.streaming.load(Ordering::Relaxed) {
// A frame arrived, or the source is alive but idle (static desktop) — normal. Clear any
// stall and repeat the last frame on `None`, exactly as before.
self.stall_since = None;
return Ok(latest);
}
// No new frame AND the stream has left `Streaming` (Paused/Unconnected/Error). The source
// went away — a compositor torn down on a Gaming↔Desktop switch, a removed virtual output.
// Grace a brief window (a transient mid-stream renegotiation can blip out of Streaming and
// back) before declaring it lost so the encode loop rebuilds in place rather than freezing
// on the last frame forever.
const STALL_GRACE: Duration = Duration::from_millis(1500);
let since = *self.stall_since.get_or_insert_with(std::time::Instant::now);
if since.elapsed() >= STALL_GRACE {
self.stall_since = None;
return Err(anyhow!(
"PipeWire source stalled (node {}): stream left Streaming for >{}ms with no frames \
— the compositor/virtual output went away (session switch?)",
self.node_id,
STALL_GRACE.as_millis()
));
}
Ok(latest)
}
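The stall-grace logic in `try_latest` can be sketched as a small state machine. This is an illustration, not the capturer itself: `StallTracker` is a hypothetical type, and the current time is passed in as a parameter (an assumption made for testability) where the real code calls `Instant::now`.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper mirroring the stall logic in `try_latest`; not part of this commit.
struct StallTracker {
    grace: Duration,
    stall_since: Option<Instant>,
}

impl StallTracker {
    fn new(grace: Duration) -> Self {
        Self { grace, stall_since: None }
    }

    /// Returns `true` once the source should be treated as lost.
    fn poll(&mut self, got_frame: bool, streaming: bool, now: Instant) -> bool {
        if got_frame || streaming {
            // A new frame, or an alive-but-idle source: clear any stall in progress.
            self.stall_since = None;
            return false;
        }
        // First poll outside `Streaming` starts the grace window; later polls reuse it.
        let since = *self.stall_since.get_or_insert(now);
        if now.duration_since(since) >= self.grace {
            // Reset so a rebuilt source starts with a fresh window.
            self.stall_since = None;
            return true;
        }
        false
    }
}
```

Clearing the timer on every frame or `Streaming` poll means a brief renegotiation blip never accumulates toward the grace limit; only an uninterrupted stall of the full window reports a loss.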
@@ -467,6 +517,10 @@ mod pipewire {
/// Set once a video format is agreed (`param_changed`), so a first-frame timeout can tell
/// "format never negotiated" apart from "negotiated but no buffers arrived".
negotiated: Arc<AtomicBool>,
/// True only while the PipeWire stream is in `Streaming` (the source is alive). Goes false on
/// `Paused`/`Unconnected`/`Error` — the source vanished (compositor torn down on a session
/// switch). Read by [`PortalCapturer::try_latest`] to surface a sustained drop as a loss.
streaming: Arc<AtomicBool>,
/// Present when zero-copy is enabled on NVIDIA: imports a dmabuf → CUDA device buffer.
importer: Option<crate::zerocopy::EglImporter>,
/// VAAPI zero-copy: hand the raw dmabuf to the encoder (which imports + GPU-CSCs it) instead
@@ -1056,6 +1110,7 @@ mod pipewire {
tx: SyncSender<CapturedFrame>,
active: Arc<AtomicBool>,
negotiated: Arc<AtomicBool>,
streaming: Arc<AtomicBool>,
zerocopy: bool,
preferred: Option<(u32, u32, u32)>,
quit_rx: pw::channel::Receiver<()>,
@@ -1150,6 +1205,7 @@ mod pipewire {
tx,
active,
negotiated,
streaming,
importer,
vaapi_passthrough,
nv12: crate::zerocopy::nv12_enabled(),
@@ -1174,8 +1230,17 @@ mod pipewire {
let _listener = stream
.add_local_listener_with_user_data(data)
.state_changed(|_stream, _ud, old, new| {
.state_changed(|_stream, ud, old, new| {
tracing::info!(?old, ?new, "pipewire stream state");
// Track whether the node is actively producing. A live source sits in `Streaming`
// (a static desktop just sends no buffers); anything else — `Paused`/`Unconnected`/
// `Error` — means the source went away (compositor died, virtual output removed on a
// Gaming↔Desktop switch). `try_latest` turns a sustained non-Streaming state into a
// capture-loss so the encode loop rebuilds instead of freezing on the last frame.
ud.streaming.store(
matches!(new, pw::stream::StreamState::Streaming),
Ordering::Relaxed,
);
})
.param_changed(|_stream, ud, id, param| {
let Some(param) = param else { return };
@@ -1,5 +1,5 @@
//! Input-desktop watcher (Windows) — the authoritative "normal vs secure desktop" signal for the
//! two-process secure-desktop design (design/windows-secure-desktop.md).
//! two-process secure-desktop design (design/archive/windows-secure-desktop.md).
//!
//! Windows switches the *input desktop* to "Winlogon" (the secure desktop) for UAC elevation, the
//! lock screen and the login screen, and back to "Default" for the normal session. WGC captures only
@@ -2010,6 +2010,10 @@ pub struct DuplCapturer {
/// first, retried (legacy DuplicateOutput can't capture HDR). Set for the secure-desktop DDA leg
/// when the SudoVDA is in HDR; threaded into every (re)duplication incl. ACCESS_LOST recovery.
want_hdr: bool,
/// Full-chroma 4:4:4 session: deliver packed RGB (`Bgra` SDR / `Rgb10a2` HDR) and SKIP the
/// video-engine RGB→YUV (NV12/P010) conversion — NVENC reconstructs 4:4:4 only from a full-chroma
/// source, so we hand it the RGB texture and it CSCs to YUV444 at encode (chroma_format_idc=3).
chroma_444: bool,
/// HDR (scRGB FP16) capture state. Set when the duplication surface is `R16G16B16A16_FLOAT`
/// (the desktop has HDR on). The frame can't be `CopyResource`d into a BGRA target, so the HDR
/// path copies it into an FP16 SRV texture, composites the cursor, then runs [`HdrConverter`] to
@@ -2087,6 +2091,8 @@ impl DuplCapturer {
// stage 5) so the capturer never re-derives the encode backend itself.
gpu: bool,
want_hdr: bool,
// 4:4:4 session → deliver RGB, skip the NV12/P010 video-engine conversion (see the field doc).
chroma_444: bool,
) -> Result<Self> {
// SAFETY: runs on the capture thread that will own this `DuplCapturer`. `install_gpu_pref_hook()`
// and the DPI-context calls take by-value handles / no args and touch only thread/process state;
@@ -2311,6 +2317,7 @@ impl DuplCapturer {
gpu_copy: None,
last_present: None,
want_hdr,
chroma_444,
hdr_fp16: is_hdr_init,
hdr_meta: hdr_meta_init,
fp16_src: None,
@@ -3088,7 +3095,10 @@ impl DuplCapturer {
// Video-engine path: scRGB FP16 → BT.2020 PQ P010 on the VIDEO engine (no 3D shader, and
// NVENC encodes P010 natively). Fall back to the HdrConverter pixel shader (3D) only if the
// video processor is unavailable.
if let Some(p010) = self.convert_to_yuv(&src, true) {
if let Some(p010) = (!self.chroma_444)
.then(|| self.convert_to_yuv(&src, true))
.flatten()
{
self.last_present = Some((p010.clone(), PixelFormat::P010));
return Ok(CapturedFrame {
width: self.width,
@@ -3148,7 +3158,10 @@ impl DuplCapturer {
// conversion AND NVENC's encode stay OFF the 3D engine — the only way to keep up when a
// game pins the 3D engine at ~100%. Fall back to handing NVENC the BGRA texture (it then
// does RGB→YUV internally on the 3D/compute engine).
if let Some(nv12) = self.convert_to_yuv(&gpu, false) {
if let Some(nv12) = (!self.chroma_444)
.then(|| self.convert_to_yuv(&gpu, false))
.flatten()
{
self.last_present = Some((nv12.clone(), PixelFormat::Nv12));
return Ok(CapturedFrame {
width: self.width,
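The `(!self.chroma_444).then(|| ...).flatten()` guard used in the two hunks above skips the conversion entirely on a 4:4:4 session and otherwise passes through whatever the conversion returned. A minimal sketch of the idiom follows; `convert_to_yuv` here is a stand-in that operates on integers (an assumption for illustration), not the real video-engine call.

```rust
/// Stand-in for the real conversion (hypothetical): succeeds only for even inputs.
fn convert_to_yuv(input: u32) -> Option<u32> {
    (input % 2 == 0).then_some(input / 2)
}

/// `bool::then` runs the closure only when the flag is true, yielding `Option<Option<T>>`;
/// `flatten` collapses that to `Option<T>`. With `chroma_444` set the closure never runs.
pub fn maybe_convert(chroma_444: bool, input: u32) -> Option<u32> {
    (!chroma_444).then(|| convert_to_yuv(input)).flatten()
}

/// The same behaviour written as a plain conditional, for comparison.
pub fn maybe_convert_plain(chroma_444: bool, input: u32) -> Option<u32> {
    if chroma_444 {
        None
    } else {
        convert_to_yuv(input)
    }
}
```

Both forms behave identically; the combinator form keeps the `if let Some(..) = ...` shape of the original call site unchanged.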
@@ -1,5 +1,5 @@
//! Host-side WGC helper relay (Windows two-process secure-desktop design,
//! design/windows-secure-desktop.md — step 4).
//! design/archive/windows-secure-desktop.md — step 4).
//!
//! WGC won't activate under the SYSTEM account, so the SYSTEM host can't capture the normal desktop
//! itself. Instead it spawns `punktfunk-host wgc-helper` in the **interactive user session** (so WGC works)
+6 -1
@@ -7,7 +7,7 @@
//! **Goal-1 stages 1–2** (`design/windows-host-rewrite.md` §2.2): stage 1 stood this up; stage 2 migrated the
//! genuinely-constant operator/dispatch knobs onto it (the dispatch-disagreement bug class: `idd_push`,
//! `capture_backend`, `encoder_pref`, `render_adapter`, `no_wgc`, the vdisplay backend select — plus the
//! plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit` and the multi-site `perf`/`compositor`/
//! plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit`/`four_four_four` and the multi-site `perf`/`compositor`/
//! `video_source`/`gamepad`). `SessionPlan` (stage 3) consumes it as the single owner of the
//! capture/topology/encoder decision.
//!
@@ -63,6 +63,10 @@ pub struct HostConfig {
pub zerocopy: bool,
/// `PUNKTFUNK_10BIT` — host policy gate for HEVC Main10 (only honored when the client also advertised 10-bit).
pub ten_bit: bool,
/// `PUNKTFUNK_444` — host policy gate for full-chroma HEVC 4:4:4 (Range Extensions). Honored only
/// when the client also advertised 4:4:4, the codec is HEVC, and the GPU/driver supports a 4:4:4
/// encode (probed) — otherwise the session stays 4:2:0. Independent of `ten_bit` (chroma vs depth).
pub four_four_four: bool,
/// `PUNKTFUNK_PERF` — per-stage timing instrumentation.
pub perf: bool,
/// `PUNKTFUNK_VIDEO_SOURCE` — GameStream video source select (`virtual` / `portal` / unset → synthetic).
@@ -112,6 +116,7 @@ impl HostConfig {
.unwrap_or(2),
zerocopy: flag("PUNKTFUNK_ZEROCOPY"),
ten_bit: flag("PUNKTFUNK_10BIT"),
four_four_four: flag("PUNKTFUNK_444"),
perf: flag("PUNKTFUNK_PERF"),
video_source: val("PUNKTFUNK_VIDEO_SOURCE"),
compositor: val("PUNKTFUNK_COMPOSITOR"),
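The `four_four_four` field doc above lists the conditions under which the host gate is honored: the client advertised 4:4:4, the codec is HEVC, and the GPU probe succeeded. A minimal sketch of that conjunction is shown below. `ChromaInputs` and `resolve_chroma` are hypothetical names introduced for illustration; the real negotiation lives elsewhere in the session glue and may involve further conditions.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Chroma {
    Yuv420,
    Yuv444,
}

/// Hypothetical bundle of the four inputs named in the field doc.
pub struct ChromaInputs {
    /// Host policy gate (the `PUNKTFUNK_444` flag).
    pub host_gate: bool,
    /// The client advertised 4:4:4 support.
    pub client_444: bool,
    /// The negotiated codec is HEVC.
    pub codec_is_hevc: bool,
    /// The GPU/driver probe confirmed a 4:4:4 encode.
    pub gpu_can_444: bool,
}

/// 4:4:4 only when every condition holds; anything else stays 4:2:0.
pub fn resolve_chroma(i: &ChromaInputs) -> Chroma {
    if i.host_gate && i.client_444 && i.codec_is_hevc && i.gpu_can_444 {
        Chroma::Yuv444
    } else {
        Chroma::Yuv420
    }
}
```

Because the result defaults to 4:2:0, a missing capability on either side degrades the session rather than failing it.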
+134 -3
@@ -29,6 +29,33 @@ pub enum Codec {
Av1,
}
/// Chroma subsampling the encoder emits, negotiated with the client (the `PUNKTFUNK_444` gate + the
/// client's `VIDEO_CAP_444` + a GPU probe). `Yuv420` is the universal default; `Yuv444` is HEVC-only,
/// native-protocol-only (GameStream stays 4:2:0), and the host only ever passes it after
/// [`can_encode_444`] confirmed the active backend supports it.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub enum ChromaFormat {
#[default]
Yuv420,
Yuv444,
}
impl ChromaFormat {
/// The HEVC `chroma_format_idc` this maps to: `1` (4:2:0) or `3` (4:4:4). Also the wire value
/// echoed in [`punktfunk_core::quic::Welcome::chroma_format`].
pub fn idc(self) -> u8 {
match self {
ChromaFormat::Yuv420 => punktfunk_core::quic::CHROMA_IDC_420,
ChromaFormat::Yuv444 => punktfunk_core::quic::CHROMA_IDC_444,
}
}
/// True for full-chroma 4:4:4.
pub fn is_444(self) -> bool {
matches!(self, ChromaFormat::Yuv444)
}
}
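For reference, the `idc` mapping above can be reproduced standalone. This sketch assumes the two wire constants are `1` and `3`, as the doc comment on `idc` states; their definitions in `punktfunk_core::quic` are not part of this diff. `from_idc` is a hypothetical inverse, added only to show how a receiver might reject unknown values.

```rust
/// Assumed values, taken from the doc comment (`1` = 4:2:0, `3` = 4:4:4).
pub const CHROMA_IDC_420: u8 = 1;
pub const CHROMA_IDC_444: u8 = 3;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Default)]
pub enum ChromaFormat {
    #[default]
    Yuv420,
    Yuv444,
}

impl ChromaFormat {
    /// The HEVC `chroma_format_idc` for this format.
    pub fn idc(self) -> u8 {
        match self {
            ChromaFormat::Yuv420 => CHROMA_IDC_420,
            ChromaFormat::Yuv444 => CHROMA_IDC_444,
        }
    }

    /// Hypothetical inverse: unknown values (e.g. `2`, 4:2:2) are rejected.
    pub fn from_idc(idc: u8) -> Option<Self> {
        match idc {
            CHROMA_IDC_420 => Some(ChromaFormat::Yuv420),
            CHROMA_IDC_444 => Some(ChromaFormat::Yuv444),
            _ => None,
        }
    }
}
```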
impl Codec {
/// The FFmpeg NVENC encoder name (selected by name, not codec id — the latter would
/// pick the software encoder).
@@ -89,6 +116,13 @@ pub struct EncoderCaps {
/// When `false`, `set_hdr_meta` is a no-op and no in-band grade reaches the client. Only the
/// Windows direct-NVENC path attaches it today.
pub supports_hdr_metadata: bool,
/// The opened encoder is actually producing a full-chroma 4:4:4 (`chroma_format_idc = 3`) stream.
/// `false` on every 4:2:0 session (the default) and on a backend that declined 4:4:4. Set by the
/// NVENC backends (Linux + Windows). The chroma is committed to the wire (`Welcome::chroma_format`)
/// from the pre-open probe, so this is a *post-open cross-check*: the session glue logs loudly if
/// the encoder's real chroma disagrees with what was negotiated (the in-band SPS is authoritative
/// for the decoder either way).
pub chroma_444: bool,
}
/// A hardware encoder. One per session; runs on the encode thread.
@@ -193,8 +227,29 @@ pub fn open_video(
bitrate_bps: u64,
cuda: bool,
bit_depth: u8,
chroma: ChromaFormat,
) -> Result<Box<dyn Encoder>> {
validate_dimensions(codec, width, height)?;
// Refresh/fps must be positive and sane: fps feeds the encoder time_base (`Rational(1, fps)`)
// and the pts→ns conversion (`pts * 1e9 / fps`), so 0 builds a 1/0 rational / divides by zero.
// The mid-stream Reconfigure path already guards `refresh_hz > 0`; enforcing it at this single
// open chokepoint makes EVERY path (initial Hello, GameStream ANNOUNCE, Reconfigure) safe
// regardless of which backend opens (security-review 2026-06-28 S5).
if fps == 0 || fps > 1000 {
anyhow::bail!("invalid refresh/fps {fps}: must be 1..=1000 Hz");
}
// 4:4:4 is HEVC-only. The negotiator should never pass `Yuv444` for another codec (it gates on
// `codec == H265`), but defend the contract here so a future caller can't silently emit a stream
// no decoder expects: a non-HEVC 4:4:4 request degrades to 4:2:0 with a warning.
let chroma = if chroma.is_444() && codec != Codec::H265 {
tracing::warn!(
?codec,
"4:4:4 requested for a non-HEVC codec — encoding 4:2:0"
);
ChromaFormat::Yuv420
} else {
chroma
};
#[cfg(target_os = "linux")]
{
// Pick the GPU encode backend. NVIDIA → NVENC/CUDA (the original path, unchanged);
@@ -203,8 +258,17 @@ pub fn open_video(
// its errors crisply instead of silently trying the other).
let pref = crate::config::config().encoder_pref.as_str();
let open_vaapi = || -> Result<Box<dyn Encoder>> {
vaapi::VaapiEncoder::open(codec, format, width, height, fps, bitrate_bps, bit_depth)
.map(|e| Box::new(e) as Box<dyn Encoder>)
vaapi::VaapiEncoder::open(
codec,
format,
width,
height,
fps,
bitrate_bps,
bit_depth,
chroma,
)
.map(|e| Box::new(e) as Box<dyn Encoder>)
};
match pref {
"nvenc" | "nvidia" | "cuda" => open_nvenc_probed(
@@ -216,6 +280,7 @@ pub fn open_video(
bitrate_bps,
cuda,
bit_depth,
chroma,
),
"vaapi" | "amd" | "intel" => open_vaapi(),
"auto" | "" => {
@@ -231,6 +296,7 @@ pub fn open_video(
bitrate_bps,
cuda,
bit_depth,
chroma,
)
} else {
open_vaapi()
@@ -260,6 +326,7 @@ pub fn open_video(
fps,
bitrate_bps,
bit_depth,
chroma,
)
.map(|e| Box::new(e) as Box<dyn Encoder>)
}
@@ -289,6 +356,7 @@ pub fn open_video(
fps,
bitrate_bps,
bit_depth,
chroma,
)
.map(|e| Box::new(e) as Box<dyn Encoder>)
}
@@ -333,6 +401,7 @@ pub fn open_video(
bitrate_bps,
cuda,
bit_depth,
chroma,
);
anyhow::bail!("video encode requires Linux or Windows")
}
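The two guards that `open_video` now applies before dispatching to a backend (the fps range check and the non-HEVC 4:4:4 downgrade) can be sketched in isolation. This is an illustrative reduction, not the function itself: `check_open_params` is a hypothetical name, the `H264` variant is assumed (only `H265` and `Av1` appear in the diff), and a `String` error stands in for `anyhow`.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Codec {
    H264, // assumed variant, not visible in the diff
    H265,
    Av1,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Chroma {
    Yuv420,
    Yuv444,
}

/// Returns the chroma the encoder should actually use, or an error for an unusable fps.
pub fn check_open_params(codec: Codec, fps: u32, chroma: Chroma) -> Result<Chroma, String> {
    // fps feeds a 1/fps time base and a pts * 1e9 / fps conversion, so 0 must be rejected.
    if fps == 0 || fps > 1000 {
        return Err(format!("invalid refresh/fps {fps}: must be 1..=1000 Hz"));
    }
    // 4:4:4 is HEVC-only: any other codec degrades to 4:2:0 instead of failing.
    if chroma == Chroma::Yuv444 && codec != Codec::H265 {
        return Ok(Chroma::Yuv420);
    }
    Ok(chroma)
}
```

The asymmetry is deliberate in the original: a bad fps is a hard error, while an impossible chroma request is a soft downgrade with a warning.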
@@ -355,6 +424,7 @@ fn open_nvenc_probed(
bitrate_bps: u64,
cuda: bool,
bit_depth: u8,
chroma: ChromaFormat,
) -> Result<Box<dyn Encoder>> {
const MIN_PROBE_BPS: u64 = 50_000_000;
let mut candidates = vec![bitrate_bps];
@@ -369,7 +439,9 @@ fn open_nvenc_probed(
}
let mut last: Option<anyhow::Error> = None;
for (i, &b) in candidates.iter().enumerate() {
match linux::NvencEncoder::open(codec, format, width, height, fps, b, cuda, bit_depth) {
match linux::NvencEncoder::open(
codec, format, width, height, fps, b, cuda, bit_depth, chroma,
) {
Ok(enc) => {
if i > 0 {
tracing::warn!(
@@ -446,6 +518,65 @@ pub fn vaapi_codec_support() -> CodecSupport {
})
}
/// Whether the active GPU encode backend can actually produce a full-chroma **4:4:4** HEVC stream.
/// Resolved (and cached, once) *before* the Welcome so the host advertises the chroma it will really
/// encode — the honest-downgrade channel. 4:4:4 is HEVC-only; the probe opens a tiny encoder on the
/// active backend (NVENC FREXT is broad on NVIDIA, but VAAPI / AMF / QSV 4:4:4 is hardware-specific,
/// so it must be probed, never assumed). Non-HEVC codecs are always `false`.
#[cfg(any(target_os = "linux", target_os = "windows"))]
pub fn can_encode_444(codec: Codec) -> bool {
use std::sync::OnceLock;
if codec != Codec::H265 {
return false;
}
static CACHE: OnceLock<bool> = OnceLock::new();
*CACHE.get_or_init(|| {
let supported = {
#[cfg(target_os = "linux")]
{
// Mirror open_video's backend dispatch: VAAPI (AMD/Intel) vs NVENC (NVIDIA).
if linux_zero_copy_is_vaapi() {
vaapi::probe_can_encode_444(codec)
} else {
linux::probe_can_encode_444(codec)
}
}
#[cfg(target_os = "windows")]
{
match windows_resolved_backend() {
WindowsBackend::Nvenc => {
#[cfg(feature = "nvenc")]
{
nvenc::probe_can_encode_444(codec)
}
#[cfg(not(feature = "nvenc"))]
{
false
}
}
WindowsBackend::Amf | WindowsBackend::Qsv => {
#[cfg(feature = "amf-qsv")]
{
let vendor = match windows_resolved_backend() {
WindowsBackend::Qsv => ffmpeg_win::WinVendor::Qsv,
_ => ffmpeg_win::WinVendor::Amf,
};
ffmpeg_win::probe_can_encode_444(vendor, codec)
}
#[cfg(not(feature = "amf-qsv"))]
{
false
}
}
WindowsBackend::Software => false,
}
}
};
tracing::info!(supported, "HEVC 4:4:4 encode capability probed");
supported
})
}
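The caching shape of `can_encode_444` (early return for non-HEVC, then a process-wide `OnceLock`) can be demonstrated with a counting stand-in for the probe. Everything below is a sketch: `expensive_probe`, `PROBE_RUNS`, and `can_encode_444_cached` are hypothetical names, and the probe simply returns `true`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how many times the stand-in probe actually ran.
static PROBE_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for opening a tiny real encoder (hypothetical; always succeeds here).
fn expensive_probe() -> bool {
    PROBE_RUNS.fetch_add(1, Ordering::Relaxed);
    true
}

/// Non-HEVC returns before touching the cache, so it never triggers the probe.
pub fn can_encode_444_cached(is_hevc: bool) -> bool {
    if !is_hevc {
        return false;
    }
    static CACHE: OnceLock<bool> = OnceLock::new();
    *CACHE.get_or_init(expensive_probe)
}

pub fn probe_runs() -> usize {
    PROBE_RUNS.load(Ordering::Relaxed)
}
```

One consequence of this design is that the answer is fixed for the life of the process: a result probed on one backend is not re-evaluated if the backend selection changes later.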
// ---------------------------------------------------------------------------------------------
// Windows backend selection (the analogue of the Linux nvidia_present / linux_zero_copy_is_vaapi
// logic). NVIDIA → NVENC, AMD → AMF, Intel → QSV; `auto` (default) reads the DXGI adapter vendor.
+205 -11
@@ -11,7 +11,7 @@
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]
use super::{Codec, EncodedFrame, Encoder};
use super::{ChromaFormat, Codec, EncodedFrame, Encoder};
use crate::capture::{CapturedFrame, FramePayload, PixelFormat};
use anyhow::{anyhow, bail, Context, Result};
use ffmpeg::format::Pixel;
@@ -19,9 +19,33 @@ use ffmpeg::util::frame::Video as VideoFrame;
use ffmpeg::{codec, encoder, Dictionary, Packet, Rational};
use ffmpeg_next as ffmpeg;
use std::os::raw::c_int;
use std::ptr;
use ffmpeg::ffi; // = ffmpeg_sys_next
/// swscale: nearest-neighbour scaler flag (`SWS_POINT`). We never rescale (src dims == dst dims), so
/// the resampler choice only governs the colour-conversion path; POINT is the cheapest.
const SWS_POINT: c_int = 0x10;
/// swscale colorspace id for ITU-R BT.709 (`SWS_CS_ITU709`) — the CSC coefficients for our RGB→YUV.
const SWS_CS_ITU709: c_int = 1;
/// The swscale *source* pixel format for a captured packed RGB/BGR layout (the real byte order, not
/// the NVENC-padded `*0` form). Used by the 4:4:4 RGB→YUV444P conversion path. Mirrors the VAAPI
/// CPU-input mapping; YUV/10-bit inputs can't feed this path (the 4:4:4 session forces packed RGB).
fn sws_src_pixel(format: PixelFormat) -> Result<Pixel> {
Ok(match format {
PixelFormat::Bgrx => Pixel::BGRZ, // bgr0
PixelFormat::Rgbx => Pixel::RGBZ, // rgb0
PixelFormat::Bgra => Pixel::BGRA,
PixelFormat::Rgba => Pixel::RGBA,
PixelFormat::Rgb => Pixel::RGB24,
PixelFormat::Bgr => Pixel::BGR24,
PixelFormat::Nv12 | PixelFormat::P010 | PixelFormat::Rgb10a2 => {
bail!("NVENC 4:4:4 CPU-input path supports packed RGB/BGR only; got {format:?}")
}
})
}
/// `AVCUDADeviceContext` (libavutil/hwcontext_cuda.h) — not in the ffmpeg-sys bindings (the
/// crate doesn't allowlist that header), so mirror its stable 3-pointer layout. We set the
/// first field to *our* `CUcontext` so NVENC shares the context the EGL importer maps into.
@@ -131,6 +155,10 @@ pub struct NvencEncoder {
frame: Option<VideoFrame>,
/// Zero-copy path: CUDA hwdevice/hwframes contexts (the encoder takes `AV_PIX_FMT_CUDA`).
cuda: Option<CudaHw>,
/// 4:4:4 path only: swscale context converting the captured packed RGB/BGR → planar YUV444P
/// (BT.709 limited) into [`Self::frame`], because `hevc_nvenc` only emits 4:4:4 from a YUV444
/// *input* (RGB-in is always 4:2:0). `None` on the ordinary 4:2:0 RGB path. Freed in `Drop`.
sws_444: Option<*mut ffi::SwsContext>,
src_format: PixelFormat,
expand: bool,
width: u32,
@@ -142,10 +170,12 @@ pub struct NvencEncoder {
force_kf: bool,
}
// `CudaHw` holds raw `AVBufferRef`s; the encoder lives on a single thread. The CPU encoder is
// already `Send` via ffmpeg-next; assert it for the CUDA fields too.
// `CudaHw` holds raw `AVBufferRef`s and `sws_444` a raw `SwsContext`; the encoder lives on a single
// thread. The CPU encoder is already `Send` via ffmpeg-next; assert it for the raw fields too.
// SAFETY: `NvencEncoder` owns an ffmpeg-next `Encoder`/`VideoFrame` (already `Send`) plus a `CudaHw`
// holding raw `AVBufferRef`s, which are not `Send` by default. The encoder is owned and driven by
// holding raw `AVBufferRef`s and an optional raw `SwsContext`, none of which are `Send` by default.
// The `SwsContext` is a self-contained swscale state object with no thread affinity, touched only
// through `&mut self` on the one encode thread. The encoder is owned and driven by
// exactly ONE thread — the per-session encode thread it is moved to — and is only touched through
// `&mut self` methods, so it is never aliased or accessed concurrently. The wrapped libav contexts
// (and the shared `CUcontext` the `CudaHw` references) have no thread affinity, so transferring
@@ -164,6 +194,7 @@ impl NvencEncoder {
bitrate_bps: u64,
cuda: bool,
bit_depth: u8,
chroma: ChromaFormat,
) -> Result<Self> {
// TODO(hdr): Linux 10-bit parity. Unlike the Windows raw-SDK path (which upconverts 8-bit
// ARGB → Main10 via pixelBitDepthMinus8), libavcodec hevc_nvenc needs a 10-bit input pixel
@@ -175,6 +206,18 @@ impl NvencEncoder {
"Linux NVENC 10-bit not yet wired — encoding 8-bit"
);
}
// Full-chroma 4:4:4 (HEVC Range Extensions). `hevc_nvenc` only emits 4:4:4 from a YUV444
// *input* frame — feeding RGB always subsamples to 4:2:0 regardless of profile (verified on
// the RTX 5070 Ti). So a 4:4:4 session swscales the captured RGB → YUV444P (BT.709 limited)
// and feeds that with `profile=rext`. The negotiator gates this to HEVC + the single-process
// CPU-capture topology, so `cuda` must be false here; defend the contract.
let want_444 = chroma.is_444() && codec == Codec::H265;
if want_444 && cuda {
bail!(
"NVENC 4:4:4 needs CPU RGB frames (the session forces non-zero-copy capture for \
4:4:4); got a CUDA frame — capture/encoder negotiation mismatch"
);
}
ffmpeg::init().context("ffmpeg init")?;
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
// SAFETY: `av_log_set_level` sets libav's global integer log level; `48` (= AV_LOG_DEBUG)
@@ -185,7 +228,14 @@ impl NvencEncoder {
let name = codec.nvenc_name();
let av_codec = encoder::find_by_name(name)
.ok_or_else(|| anyhow!("{name} not built into libavcodec"))?;
let (nvenc_pixel, expand) = nvenc_input(format);
let (rgb_pixel, rgb_expand) = nvenc_input(format);
// 4:4:4 feeds NVENC a planar YUV444P frame we produce by swscale; the ordinary path feeds the
// captured RGB straight in and lets NVENC's internal CSC subsample to 4:2:0.
let (nvenc_pixel, expand) = if want_444 {
(Pixel::YUV444P, false)
} else {
(rgb_pixel, rgb_expand)
};
let mut video = codec::context::Context::new_with_codec(av_codec)
.encoder()
@@ -234,12 +284,12 @@ impl NvencEncoder {
(*video.as_mut_ptr()).gop_size = -1;
}
// NV12 path: we did the RGB→YUV conversion ourselves as BT.709 *limited* range, so signal
// that in the bitstream VUI (colorspace/range/primaries/transfer) — otherwise the client
// decoder assumes a default and the picture comes out washed-out / wrong-contrast. The
// RGB-input paths leave these unset (NVENC's internal CSC writes its own VUI). Matches the
// Windows NV12 path's BT.709 limited-range signalling.
if matches!(format, PixelFormat::Nv12) {
// NV12 / 4:4:4 paths: we do the RGB→YUV conversion ourselves as BT.709 *limited* range
// (swscale), so signal that in the bitstream VUI (colorspace/range/primaries/transfer) —
// otherwise the client decoder assumes a default and the picture comes out washed-out /
// wrong-contrast. The RGB-input 4:2:0 path leaves these unset (NVENC's internal CSC writes
// its own VUI). Matches the Windows NV12 path's BT.709 limited-range signalling.
if matches!(format, PixelFormat::Nv12) || want_444 {
// SAFETY: same `video` builder — `raw = video.as_mut_ptr()` is the non-null, properly-
// aligned, sole-owned, not-yet-opened `AVCodecContext`. We set its four VUI colour enum
// fields to valid `AVColorSpace`/`AVColorRange`/`AVColorPrimaries`/`AVColorTransfer-
@@ -280,6 +330,45 @@ impl NvencEncoder {
None
};
// 4:4:4: build the RGB→YUV444P swscale (BT.709 limited, no rescale). Mirrors the VAAPI CPU
// path's RGB→NV12 scaler, but the dst is full-chroma planar 4:4:4.
let sws_444 = if want_444 {
let src_av = pixel_to_av(sws_src_pixel(format)?);
// SAFETY: `sws_getContext` allocates a swscale context for the given src/dst dims + pixel
// formats. Both dims are the encoder's positive `width`/`height` as `c_int`; `src_av` is a
// valid `AVPixelFormat` (from the `sws_src_pixel`-validated, packed-RGB-only source), the
// dst is YUV444P. The trailing filter/param pointers are null = "use defaults" (documented
// as accepted). No Rust memory is borrowed; the returned pointer is null-checked below.
let sws = unsafe {
ffi::sws_getContext(
width as c_int,
height as c_int,
src_av,
width as c_int,
height as c_int,
ffi::AVPixelFormat::AV_PIX_FMT_YUV444P,
SWS_POINT,
ptr::null_mut(),
ptr::null_mut(),
ptr::null(),
)
};
if sws.is_null() {
bail!("sws_getContext(RGB→YUV444P) failed");
}
// SAFETY: `sws` is the non-null context from the call above (null-checked). The ITU-709
// coefficient table from `sws_getCoefficients` is a process-lifetime libswscale static,
// reused for src+dst matrices; `sws_setColorspaceDetails` only reads it and writes scalar
// CSC settings into `sws` (limited-range dst: dstRange = 0). No Rust memory is passed.
unsafe {
let cs709 = ffi::sws_getCoefficients(SWS_CS_ITU709);
ffi::sws_setColorspaceDetails(sws, cs709, 1, cs709, 0, 0, 1 << 16, 1 << 16);
}
Some(sws)
} else {
None
};
// Low-latency NVENC tuning (plan §7 / linux-setup doc).
let mut opts = Dictionary::new();
opts.set("preset", "p1"); // fastest
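The swscale context configured above converts full-range RGB to BT.709 limited-range YUV 4:4:4. As a reference for what that means per pixel, here is a floating-point sketch using the standard BT.709 luma coefficients (Kr = 0.2126, Kb = 0.0722) and the 8-bit limited-range scaling (luma 16..235, chroma 16..240). `rgb_to_yuv444_bt709_limited` is a hypothetical helper; swscale itself works in fixed point, so its output may differ from this by a rounding step.

```rust
/// BT.709 luma coefficients.
const KR: f64 = 0.2126;
const KB: f64 = 0.0722;
const KG: f64 = 1.0 - KR - KB;

/// Convert one full-range 8-bit RGB pixel to 8-bit limited-range Y'CbCr (no subsampling).
pub fn rgb_to_yuv444_bt709_limited(r: u8, g: u8, b: u8) -> (u8, u8, u8) {
    // Normalise to 0.0..=1.0.
    let (r, g, b) = (r as f64 / 255.0, g as f64 / 255.0, b as f64 / 255.0);
    // Luma in 0..=1, chroma differences in -0.5..=0.5.
    let y = KR * r + KG * g + KB * b;
    let cb = (b - y) / (2.0 * (1.0 - KB));
    let cr = (r - y) / (2.0 * (1.0 - KR));
    // Limited range: Y spans 219 codes from 16, Cb/Cr span 224 codes centred on 128.
    let q = |v: f64| v.round().clamp(0.0, 255.0) as u8;
    (q(16.0 + 219.0 * y), q(128.0 + 224.0 * cb), q(128.0 + 224.0 * cr))
}
```

In a 4:4:4 stream every pixel keeps its own Cb and Cr sample, which is why the conversion above has no averaging or decimation step.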
@@ -288,6 +377,12 @@ impl NvencEncoder {
opts.set("bf", "0");
opts.set("delay", "0");
opts.set("forced-idr", "1"); // RFI/request_keyframe → real IDR under the infinite GOP
if want_444 {
// HEVC Range Extensions — the profile that carries chroma_format_idc=3. With a YUV444P
// input `hevc_nvenc` auto-selects it, but pin it explicitly so the chroma is never silently
// dropped on a future libavcodec.
opts.set("profile", "rext");
}
// Split-frame encode across both NVENC engines (GB203 has 2) when the pixel rate exceeds
// a single engine's HEVC capacity (~1 Gpix/s); e.g. 5120x1440@240 = 1.77 Gpix/s needs it,
@@ -321,6 +416,7 @@ impl NvencEncoder {
enc,
frame,
cuda: cuda_hw,
sws_444,
src_format: format,
expand,
width,
@@ -333,6 +429,15 @@ impl NvencEncoder {
}
impl Encoder for NvencEncoder {
fn caps(&self) -> super::EncoderCaps {
super::EncoderCaps {
// 4:4:4 iff this session opened the RGB→YUV444P swscale path (FREXT). RFI/HDR-SEI stay
// unsupported on libavcodec NVENC (the trait defaults).
chroma_444: self.sws_444.is_some(),
..super::EncoderCaps::default()
}
}
fn submit(&mut self, captured: &CapturedFrame) -> Result<()> {
anyhow::ensure!(
captured.width == self.width && captured.height == self.height,
@@ -411,6 +516,47 @@ impl NvencEncoder {
bytes.len(),
src_row * h
);
// 4:4:4: swscale the packed RGB straight into the planar YUV444P input frame (BT.709 limited),
// then send it — no byte-expand. The 4:2:0 RGB path (below) feeds NVENC packed RGB directly.
if let Some(sws) = self.sws_444 {
let frame = self
.frame
.as_mut()
.context("CPU frame missing (encoder opened in CUDA mode)")?;
// SAFETY: `format == self.src_format` and `bytes.len() >= src_row * h` (the `ensure!`s
// above), so `sws_scale` reads `h` rows of `src_row` bytes from `src_data[0] = bytes`
// (packed RGB is single-plane; the other src planes are null/0) — all in bounds. `sws` is
// the non-null context built in `open`. The dst is `frame`'s underlying `AVFrame`: its
// `data`/`linesize` in-struct arrays were sized for YUV444P by `VideoFrame::new`, and the
// 3 planes are each `width`×`height`. All pointers are live locals for this synchronous
// call; the encoder runs only on this thread (`unsafe impl Send`), so no aliasing/race.
unsafe {
let dst_av = frame.as_mut_ptr();
let src_data: [*const u8; 4] =
[bytes.as_ptr(), ptr::null(), ptr::null(), ptr::null()];
let src_stride: [c_int; 4] = [src_row as c_int, 0, 0, 0];
let r = ffi::sws_scale(
sws,
src_data.as_ptr(),
src_stride.as_ptr(),
0,
h as c_int,
(*dst_av).data.as_ptr(),
(*dst_av).linesize.as_ptr(),
);
if r < 0 {
bail!("sws_scale(RGB→YUV444P) failed ({r})");
}
}
frame.set_pts(Some(pts));
frame.set_kind(if idr {
ffmpeg::picture::Type::I
} else {
ffmpeg::picture::Type::None
});
self.enc.send_frame(frame).context("send_frame(444)")?;
return Ok(());
}
let frame = self
.frame
.as_mut()
@@ -526,3 +672,51 @@ impl NvencEncoder {
Ok(())
}
}
impl Drop for NvencEncoder {
fn drop(&mut self) {
if let Some(sws) = self.sws_444.take() {
// SAFETY: `sws` is the non-null `SwsContext` allocated by `sws_getContext` in `open` and
// owned exclusively by this encoder (taken out of the field so it can't be freed twice).
// `sws_freeContext` frees it; nothing else references it after this single-threaded drop.
unsafe { ffi::sws_freeContext(sws) };
}
}
}
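The `Drop` impl above follows a common pattern for an optional raw resource: hold it as `Option<*mut T>`, `take()` it in `drop`, and free it exactly once. The sketch below shows the same pattern without libswscale; `FakeCtx`, `fake_alloc`, and `fake_free` are hypothetical stand-ins built on `Box`, and the free counter exists only so the behaviour can be observed.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static FREED: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an opaque C context (hypothetical).
#[allow(dead_code)]
pub struct FakeCtx(u32);

fn fake_alloc() -> *mut FakeCtx {
    Box::into_raw(Box::new(FakeCtx(0)))
}

/// # Safety
/// `p` must come from `fake_alloc` and must not have been freed already.
unsafe fn fake_free(p: *mut FakeCtx) {
    // SAFETY: per the contract above, `p` is a live pointer from `Box::into_raw`.
    unsafe { drop(Box::from_raw(p)) };
    FREED.fetch_add(1, Ordering::Relaxed);
}

pub struct Owner {
    ctx: Option<*mut FakeCtx>,
}

impl Owner {
    pub fn new(want_ctx: bool) -> Self {
        Self { ctx: want_ctx.then(fake_alloc) }
    }
    pub fn has_ctx(&self) -> bool {
        self.ctx.is_some()
    }
}

impl Drop for Owner {
    fn drop(&mut self) {
        // `take()` leaves `None` behind, so the pointer cannot be freed twice.
        if let Some(p) = self.ctx.take() {
            // SAFETY: `p` came from `fake_alloc` in `new` and is owned solely by `self`.
            unsafe { fake_free(p) };
        }
    }
}

pub fn freed_count() -> usize {
    FREED.load(Ordering::Relaxed)
}
```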
/// Probe whether this NVIDIA GPU + driver + libavcodec can actually encode HEVC **4:4:4** (Range
/// Extensions). Opens a tiny real `hevc_nvenc` 4:4:4 session — the exact path [`NvencEncoder::open`]
/// takes for a live 4:4:4 stream — and reports whether it succeeded. HEVC-only; the result is cached
/// by the caller ([`crate::encode::can_encode_444`]). A GPU/driver/ffmpeg without RExt 4:4:4 fails
/// the open here, so the host resolves the session to 4:2:0 before the Welcome (honest downgrade).
pub fn probe_can_encode_444(codec: Codec) -> bool {
if codec != Codec::H265 {
return false;
}
if ffmpeg::init().is_err() {
return false;
}
// Quiet ffmpeg's open error on a GPU that lacks 4:4:4 — the probe failing is an expected outcome.
// SAFETY: libav initialized above; `av_log_{get,set}_level` only read/write the global int level
// (no pointer args) and are always sound post-init.
let prev = unsafe {
let p = ffi::av_log_get_level();
ffi::av_log_set_level(ffi::AV_LOG_FATAL);
p
};
let ok = NvencEncoder::open(
codec,
PixelFormat::Bgra,
640,
480,
30,
2_000_000,
false, // CPU input (the 4:4:4 path never uses CUDA)
8,
ChromaFormat::Yuv444,
)
.is_ok();
// SAFETY: restore the saved global log level (scalar arg, no pointers).
unsafe { ffi::av_log_set_level(prev) };
ok
}
@@ -160,6 +160,18 @@ pub fn probe_can_encode(codec: Codec) -> bool {
}
}
/// Whether the active VAAPI GPU can encode HEVC **4:4:4** (Range Extensions). **Deferred in v1 —
/// always `false`.** VAAPI HEVC 4:4:4 encode is narrow and vendor-specific (the lab's AMD Phoenix1 /
/// RDNA3 exposes only `VAProfileHEVCMain`/`Main10` `EncSlice`, no `Main444`), and there is no
/// validated hardware to build + verify the 4:4:4 surface/profile path against. Returning `false`
/// keeps the negotiation honest: a VAAPI host resolves every session to 4:2:0 before the Welcome, so
/// the client never builds a 4:4:4 decoder it would only get 4:2:0 frames for. (Follow-up: implement
/// and validate on an Intel Arc / RDNA4-class box that advertises a HEVC 4:4:4 encode entrypoint.)
pub fn probe_can_encode_444(_codec: Codec) -> bool {
tracing::info!("VAAPI HEVC 4:4:4 encode is not implemented yet — declining (encoding 4:2:0)");
false
}
/// Drain the encoder for one packet (shared poll logic).
fn poll_encoder(enc: &mut encoder::video::Encoder, fps: u32) -> Result<Option<EncodedFrame>> {
let mut pkt = Packet::empty();
@@ -848,6 +860,7 @@ pub struct VaapiEncoder {
unsafe impl Send for VaapiEncoder {}
impl VaapiEncoder {
#[allow(clippy::too_many_arguments)]
pub fn open(
codec: Codec,
format: PixelFormat,
@@ -856,10 +869,18 @@ impl VaapiEncoder {
fps: u32,
bitrate_bps: u64,
bit_depth: u8,
chroma: super::ChromaFormat,
) -> Result<Self> {
if bit_depth != 8 {
tracing::warn!(bit_depth, "VAAPI 10-bit not yet wired — encoding 8-bit");
}
// VAAPI 4:4:4 is deferred (see `probe_can_encode_444`): no validated AMD/Intel hardware in the
// lab exposes a HEVC 4:4:4 encode entrypoint, and the probe returns false so the host never
// negotiates 4:4:4 for a VAAPI session. If a request slips through, fall back to 4:2:0 rather
// than emit an unverified stream — the host signalled 4:2:0 in the Welcome anyway.
if chroma.is_444() {
tracing::warn!("VAAPI 4:4:4 encode not implemented — encoding 4:2:0");
}
ffmpeg::init().context("ffmpeg init")?;
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
// SAFETY: `av_log_set_level` sets libav's global integer log level; `48` (= AV_LOG_DEBUG)
@@ -31,7 +31,7 @@
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]
use super::{Codec, EncodedFrame, Encoder};
use super::{ChromaFormat, Codec, EncodedFrame, Encoder};
use crate::capture::{dxgi::D3d11Frame, CapturedFrame, FramePayload, PixelFormat};
use anyhow::{anyhow, bail, Context, Result};
use ffmpeg::format::Pixel;
@@ -241,6 +241,18 @@ unsafe fn open_win_encoder(
/// driver/runtime rejects codecs the video engine can't do (AV1 on pre-RDNA3 AMD / pre-Arc Intel,
/// or HEVC on a very old part). Used to build the GameStream codec advertisement so a client never
/// negotiates a codec the encoder can't open. Torn down immediately.
/// Whether the active AMD (AMF) / Intel (QSV) GPU can encode HEVC **4:4:4**. **Deferred in v1 —
/// always `false`.** AMF/QSV HEVC 4:4:4 encode is narrow (AMD RDNA3+, Intel Arc/Xe2+) and the
/// libavcodec profile/pixel-format incantation is vendor- and driver-specific: with a wrong profile,
/// `avcodec_open2` *silently* falls back to 4:2:0, so a positive probe would need a verify-by-frame,
/// and there is no AMD/Intel Windows box in the lab to build + validate that against. Returning
/// `false` keeps the negotiation honest: an AMF/QSV host resolves every session to 4:2:0 before the
/// Welcome. (Follow-up: implement + validate on an RDNA3+/Arc Windows box.)
pub fn probe_can_encode_444(_vendor: WinVendor, _codec: Codec) -> bool {
tracing::info!("AMF/QSV HEVC 4:4:4 encode is not implemented yet — declining (encoding 4:2:0)");
false
}
pub fn probe_can_encode(vendor: WinVendor, codec: Codec) -> bool {
if ffmpeg::init().is_err() {
return false;
@@ -1096,6 +1108,7 @@ pub struct FfmpegWinEncoder {
unsafe impl Send for FfmpegWinEncoder {}
impl FfmpegWinEncoder {
#[allow(clippy::too_many_arguments)]
pub fn open(
vendor: WinVendor,
@@ -1106,7 +1119,15 @@ impl FfmpegWinEncoder {
fps: u32,
bitrate_bps: u64,
bit_depth: u8,
chroma: ChromaFormat,
) -> Result<Self> {
// AMF/QSV 4:4:4 is deferred (see `probe_can_encode_444`): no validated AMD/Intel Windows
// hardware in the lab, and the AMF/QSV HEVC 4:4:4 profile/format incantations are vendor- and
// driver-specific (a wrong profile silently encodes 4:2:0). The probe returns false so the host
// never negotiates 4:4:4 for an AMF/QSV session; if a request slips through, fall back to 4:2:0.
if chroma.is_444() {
tracing::warn!("AMF/QSV 4:4:4 encode not implemented — encoding 4:2:0");
}
ffmpeg::init().context("ffmpeg init")?;
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
// SAFETY: `ffmpeg::init()` ran on the line above, so libav is initialised; `av_log_set_level`
@@ -16,7 +16,7 @@
// Every `unsafe` block / impl in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]
use super::{Codec, EncodedFrame, Encoder, EncoderCaps};
use super::{ChromaFormat, Codec, EncodedFrame, Encoder, EncoderCaps};
use crate::capture::{CapturedFrame, FramePayload, PixelFormat};
use anyhow::{anyhow, bail, Context, Result};
use std::collections::{HashMap, VecDeque};
@@ -57,6 +57,15 @@ pub struct NvencD3d11Encoder {
buffer_fmt: nv::NV_ENC_BUFFER_FORMAT,
/// Encoded bit depth (8 or 10). 10 → HEVC Main10 (NVENC upconverts the 8-bit ARGB input).
bit_depth: u8,
/// Full-chroma 4:4:4 (HEVC Range Extensions, `chroma_format_idc = 3`) requested for this session.
/// NVENC ingests the RGB (ARGB/ABGR10) input and CSCs it to YUV444 internally; the `FREXT` profile
/// and `chromaFormatIDC = 3` in the encode config carry the chroma. Gated on the GPU's
/// `NV_ENC_CAPS_SUPPORT_YUV444_ENCODE` (cleared in `query_caps` on a card that lacks it) and on an
/// RGB input format (NV12/P010 capture can't reconstruct 4:4:4). HEVC-only.
chroma_444: bool,
/// `NV_ENC_CAPS_SUPPORT_YUV444_ENCODE` from the caps probe — whether this GPU can 4:4:4 encode at
/// all. `chroma_444` is forced off when this is false (graceful downgrade to 4:2:0).
yuv444_supported: bool,
/// HDR: the capturer is delivering BT.2020 PQ 10-bit (`PixelFormat::Rgb10a2`) frames. Sets the
/// `ABGR10` input format + the BT.2020/PQ colour VUI. Derived per-frame from the capture format
/// (HDR can toggle mid-session); a change re-inits the session.
@@ -103,6 +112,7 @@ pub struct NvencD3d11Encoder {
unsafe impl Send for NvencD3d11Encoder {}
impl NvencD3d11Encoder {
#[allow(clippy::too_many_arguments)]
pub fn open(
codec: Codec,
_format: PixelFormat,
@@ -111,6 +121,7 @@ impl NvencD3d11Encoder {
fps: u32,
bitrate_bps: u64,
bit_depth: u8,
chroma: ChromaFormat,
) -> Result<Self> {
Ok(Self {
encoder: ptr::null_mut(),
@@ -122,6 +133,9 @@ impl NvencD3d11Encoder {
bitrate_bps,
buffer_fmt: nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_ARGB,
bit_depth,
// 4:4:4 is HEVC-only; the GPU-support gate is applied in `query_caps`.
chroma_444: chroma.is_444() && codec == Codec::H265,
yuv444_supported: false,
hdr: false,
hdr_meta: None,
regs: HashMap::new(),
@@ -209,6 +223,7 @@ impl NvencD3d11Encoder {
let wmax = self.get_cap(enc, nv::NV_ENC_CAPS::NV_ENC_CAPS_WIDTH_MAX);
let hmax = self.get_cap(enc, nv::NV_ENC_CAPS::NV_ENC_CAPS_HEIGHT_MAX);
let ten_bit = self.get_cap(enc, nv::NV_ENC_CAPS::NV_ENC_CAPS_SUPPORT_10BIT_ENCODE);
let yuv444 = self.get_cap(enc, nv::NV_ENC_CAPS::NV_ENC_CAPS_SUPPORT_YUV444_ENCODE);
let rfi = self.get_cap(
enc,
nv::NV_ENC_CAPS::NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION,
@@ -235,6 +250,13 @@ impl NvencD3d11Encoder {
self.bit_depth = 8;
self.hdr = false;
}
// Same for 4:4:4: a card without YUV444 encode falls back to 4:2:0. (The host already probed
// this via `probe_can_encode_444` before the Welcome, so this is a belt-and-braces guard.)
self.yuv444_supported = yuv444 != 0;
if self.chroma_444 && !self.yuv444_supported {
tracing::warn!("NVENC: this GPU can't 4:4:4 encode — falling back to 4:2:0");
self.chroma_444 = false;
}
self.rfi_supported = rfi != 0;
self.custom_vbv = custom_vbv != 0;
tracing::info!(
@@ -313,9 +335,31 @@ impl NvencD3d11Encoder {
cfg.encodeCodecConfig.hevcConfig.tier = 1;
cfg.encodeCodecConfig.hevcConfig.level = 0;
// 10-bit HEVC Main10 (HDR foundation): NVENC upconverts the 8-bit input; 8-bit leaves the
// preset default (Main) untouched.
if self.bit_depth == 10 {
// Chroma + bit depth. Full-chroma 4:4:4 (HEVC Range Extensions) takes precedence and composes
// with 10-bit (Main 4:4:4 10): NVENC ingests the RGB input (ARGB / ABGR10) and CSCs it to
// YUV444 internally when `chromaFormatIDC = 3` under the FREXT profile. Only valid on an RGB
// input — a subsampled NV12/P010 source can't reconstruct full chroma (so the capturer is
// forced to RGB for a 4:4:4 session, and we guard on the input format here too).
//
// ON-GLASS TODO (RTX box): confirm ARGB + chromaFormatIDC=3 + FREXT yields a *true* 4:4:4
// stream. NVENC's RGB→YUV CSC is documented to honor chromaFormatIDC (unlike libavcodec's
// wrapper, which always subsamples RGB to 4:2:0 — hence the Linux path feeds planar YUV444
// instead). If on-glass shows 4:2:0, the follow-up is a BGRA→AYUV shader feeding the native
// `NV_ENC_BUFFER_FORMAT_AYUV` 4:4:4 input format.
let rgb_input = matches!(
self.buffer_fmt,
nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_ARGB
| nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_ABGR10
);
if self.chroma_444 && rgb_input {
cfg.profileGUID = nv::NV_ENC_HEVC_PROFILE_FREXT_GUID;
cfg.encodeCodecConfig.hevcConfig.set_chromaFormatIDC(3);
if self.bit_depth == 10 {
cfg.encodeCodecConfig.hevcConfig.set_pixelBitDepthMinus8(2); // Main 4:4:4 10
}
} else if self.bit_depth == 10 {
// 10-bit HEVC Main10 (HDR foundation): NVENC upconverts the 8-bit input; 8-bit leaves the
// preset default (Main) untouched.
cfg.profileGUID = nv::NV_ENC_HEVC_PROFILE_MAIN10_GUID;
cfg.encodeCodecConfig.hevcConfig.set_pixelBitDepthMinus8(2); // 10 - 8
}
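The profile selection in this hunk reduces to a small decision table. A minimal sketch of that table as a pure function follows, so the precedence (4:4:4 first, then 10-bit) can be checked in isolation. `HevcProfile` and `select_hevc_config` are hypothetical names for illustration, not part of the NVENC bindings or this commit, and the `chroma_format_idc` of 1 for 4:2:0 is the HEVC value for that sampling rather than something the diff sets explicitly.

```rust
/// Hypothetical stand-in for the NVENC profile GUIDs referenced in the diff.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum HevcProfile {
    Main,
    Main10,
    Frext,
}

/// Returns (profile, chroma_format_idc, pixel_bit_depth_minus8).
/// Assumption: 4:4:4 is only honoured on an RGB input, as the diff's guard does.
fn select_hevc_config(chroma_444: bool, rgb_input: bool, bit_depth: u8) -> (HevcProfile, u8, u8) {
    let depth_minus8 = if bit_depth == 10 { 2 } else { 0 };
    if chroma_444 && rgb_input {
        // Range Extensions: full chroma, optionally combined with 10-bit.
        (HevcProfile::Frext, 3, depth_minus8)
    } else if bit_depth == 10 {
        // Main10 at 4:2:0.
        (HevcProfile::Main10, 1, 2)
    } else {
        // Preset default (Main) at 4:2:0, 8-bit.
        (HevcProfile::Main, 1, 0)
    }
}
```

A 4:4:4 request on a subsampled input falls through to the 4:2:0 branches, which matches the downgrade behaviour the comments describe.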
@@ -787,6 +831,9 @@ impl Encoder for NvencD3d11Encoder {
EncoderCaps {
supports_rfi: self.rfi_supported,
supports_hdr_metadata: self.hdr,
// Reflects what the session actually configured (cleared in `query_caps` if the GPU lacks
// YUV444 encode), so the glue can confirm 4:4:4 vs the negotiated request.
chroma_444: self.chroma_444,
}
}
@@ -904,3 +951,69 @@ impl Drop for NvencD3d11Encoder {
unsafe { self.teardown() };
}
}
/// Probe whether the active NVIDIA GPU can encode HEVC **4:4:4** (`NV_ENC_CAPS_SUPPORT_YUV444_ENCODE`).
/// Creates a throwaway hardware D3D11 device + NVENC session, queries the cap, and tears down. HEVC-only;
/// the result is cached by the caller ([`crate::encode::can_encode_444`]) and read *before* the Welcome
/// so the host advertises the chroma it can really encode (honest downgrade to 4:2:0 on a card without it).
pub fn probe_can_encode_444(codec: Codec) -> bool {
use windows::Win32::Foundation::HMODULE;
use windows::Win32::Graphics::Direct3D::{D3D_DRIVER_TYPE_HARDWARE, D3D_FEATURE_LEVEL_11_0};
use windows::Win32::Graphics::Direct3D11::{
D3D11CreateDevice, D3D11_CREATE_DEVICE_BGRA_SUPPORT, D3D11_SDK_VERSION,
};
if codec != Codec::H265 {
return false;
}
// SAFETY: a self-contained probe owning every handle it creates. `D3D11CreateDevice` (HARDWARE
// driver, NULL adapter) fills `device` or returns Err (→ false). `open_encode_session_ex` opens an
// NVENC session against that device's raw pointer (valid while `device` is held) or errors (→ false,
// tearing nothing down). `get_encode_caps` reads one scalar cap into `val` via the loaded API table.
// `destroy_encoder` frees the session exactly once; `device`/its context drop with the COM wrappers.
// No handle escapes this call and nothing runs concurrently.
unsafe {
let mut device: Option<ID3D11Device> = None;
if D3D11CreateDevice(
None,
D3D_DRIVER_TYPE_HARDWARE,
HMODULE::default(),
D3D11_CREATE_DEVICE_BGRA_SUPPORT,
Some(&[D3D_FEATURE_LEVEL_11_0]),
D3D11_SDK_VERSION,
Some(&mut device),
None,
None,
)
.is_err()
{
return false;
}
let Some(device) = device else { return false };
let mut params = nv::NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS {
version: nv::NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS_VER,
deviceType: nv::NV_ENC_DEVICE_TYPE::NV_ENC_DEVICE_TYPE_DIRECTX,
device: device.as_raw(),
apiVersion: nv::NVENCAPI_VERSION,
..Default::default()
};
let mut enc: *mut c_void = ptr::null_mut();
if (API.open_encode_session_ex)(&mut params, &mut enc)
.result_without_string()
.is_err()
{
return false;
}
let mut param = nv::NV_ENC_CAPS_PARAM {
version: nv::NV_ENC_CAPS_PARAM_VER,
capsToQuery: nv::NV_ENC_CAPS::NV_ENC_CAPS_SUPPORT_YUV444_ENCODE,
reserved: [0; 62],
};
let mut val: i32 = 0;
let ok = (API.get_encode_caps)(enc, nv::NV_ENC_CODEC_HEVC_GUID, &mut param, &mut val)
.result_without_string()
.is_ok()
&& val != 0;
let _ = (API.destroy_encoder)(enc);
ok
}
}
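The doc comment on `probe_can_encode_444` says the caller caches the result, since the probe creates and tears down a device and an encode session. The caching code is not part of this diff; one possible shape, assuming a single process-wide answer, is a `std::sync::OnceLock`. `fake_probe`, `can_encode_444_cached`, and `probe_runs` are hypothetical names used only for this sketch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the (stand-in) probe actually runs.
static PROBE_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the hardware probe. Assumption: the real probe is expensive,
/// so it should run at most once per process.
fn fake_probe() -> bool {
    PROBE_RUNS.fetch_add(1, Ordering::SeqCst);
    true
}

/// Hypothetical cached wrapper: the first call runs the probe, later calls
/// return the stored answer.
fn can_encode_444_cached() -> bool {
    static CACHE: OnceLock<bool> = OnceLock::new();
    *CACHE.get_or_init(fake_probe)
}

/// Number of times the probe body has executed so far.
fn probe_runs() -> usize {
    PROBE_RUNS.load(Ordering::SeqCst)
}
```

If the real cache is keyed per codec or per adapter, a map behind a lock would replace the single `OnceLock<bool>`.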
+71 -267
@@ -41,8 +41,6 @@ type Aes128CbcEnc = cbc::Encryptor<aes::Aes128>;
/// `RTP_PAYLOAD_TYPE_FEC 127`).
const AUDIO_PACKET_TYPE: u8 = 97;
const AUDIO_FEC_PACKET_TYPE: u8 = 127;
/// Stereo Opus bitrate (unchanged from the live-validated stereo path).
const OPUS_BITRATE: i32 = 128_000;
/// Audio FEC geometry (moonlight-common-c `RtpAudioQueue.h`: `RTPA_DATA_SHARDS 4`,
/// `RTPA_FEC_SHARDS 2`). Blocks are aligned: the client synthesizes the block base as
@@ -82,67 +80,20 @@ impl Default for AudioParams {
}
}
/// One Opus (multi)stream layout. Channel order is the GameStream/Moonlight order
/// FL FR FC LFE RL RR [SL SR]; `mapping` is the libopus multistream mapping we *encode*
/// with — identical to Sunshine's `audio.cpp stream_configs` (verified verbatim 2026-06-10):
/// identity mapping, so normal quality couples (FL,FR) and (FC,LFE) [+ (RL,RR) on 7.1] with
/// the remaining channels as mono streams; high quality is one mono stream per channel.
/// Bitrates are Sunshine's per-config values (stereo keeps punktfunk's existing 128 kbps).
pub struct OpusLayout {
pub channels: u8,
pub streams: u8,
pub coupled: u8,
pub mapping: &'static [u8],
pub bitrate: i32,
}
pub const LAYOUT_STEREO: OpusLayout = OpusLayout {
channels: 2,
streams: 1,
coupled: 1,
mapping: &[0, 1],
bitrate: OPUS_BITRATE,
};
pub const LAYOUT_51: OpusLayout = OpusLayout {
channels: 6,
streams: 4,
coupled: 2,
mapping: &[0, 1, 2, 3, 4, 5],
bitrate: 256_000,
};
pub const LAYOUT_51_HQ: OpusLayout = OpusLayout {
channels: 6,
streams: 6,
coupled: 0,
mapping: &[0, 1, 2, 3, 4, 5],
bitrate: 1_536_000,
};
pub const LAYOUT_71: OpusLayout = OpusLayout {
channels: 8,
streams: 5,
coupled: 3,
mapping: &[0, 1, 2, 3, 4, 5, 6, 7],
bitrate: 450_000,
};
pub const LAYOUT_71_HQ: OpusLayout = OpusLayout {
channels: 8,
streams: 8,
coupled: 0,
mapping: &[0, 1, 2, 3, 4, 5, 6, 7],
bitrate: 2_048_000,
// The Opus surround layout table (channel order FL FR FC LFE RL RR [SL SR], identity mapping,
// Sunshine's per-config bitrates) now lives in `punktfunk_core::audio`, shared with the native
// `punktfunk/1` path and every client decoder. Re-export the pieces the GameStream module + its
// RTSP SDP (`rtsp.rs`) reference; the GFE-specific `surround_params` SDP rotation stays below.
pub use punktfunk_core::audio::{
OpusLayout, LAYOUT_51, LAYOUT_51_HQ, LAYOUT_71, LAYOUT_71_HQ, LAYOUT_STEREO,
};
/// Pick the encoder layout for the negotiated session parameters. Unknown channel counts
/// fall back to stereo (the client can only request 2/6/8 — `AUDIO_CONFIGURATION_*` in
/// Pick the encoder layout for the negotiated session parameters. Thin wrapper over the shared
/// [`punktfunk_core::audio::layout_for`] keyed on this module's [`AudioParams`] (unknown channel
/// counts fall back to stereo; the client can only request 2/6/8 — `AUDIO_CONFIGURATION_*` in
/// Limelight.h).
pub fn layout_for(params: &AudioParams) -> &'static OpusLayout {
match (params.channels, params.high_quality) {
(6, false) => &LAYOUT_51,
(6, true) => &LAYOUT_51_HQ,
(8, false) => &LAYOUT_71,
(8, true) => &LAYOUT_71_HQ,
_ => &LAYOUT_STEREO,
}
punktfunk_core::audio::layout_for(params.channels, params.high_quality)
}
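The shared `punktfunk_core::audio::layout_for` is not shown in this diff. A minimal sketch of what it might look like, assuming it keeps the removed `match` and takes the channel count as `u8`, is below. The struct fields and constants mirror the removed lines; `is_consistent` is a hypothetical helper that checks the invariant these layouts rely on: each coupled stream carries two channels, so `streams + coupled` must equal the channel count for an identity mapping.

```rust
/// Mirror of the fields shown in the diff; the real type lives in `punktfunk_core::audio`.
pub struct OpusLayout {
    pub channels: u8,
    pub streams: u8,
    pub coupled: u8,
    pub mapping: &'static [u8],
    pub bitrate: i32,
}

pub const LAYOUT_STEREO: OpusLayout =
    OpusLayout { channels: 2, streams: 1, coupled: 1, mapping: &[0, 1], bitrate: 128_000 };
pub const LAYOUT_51: OpusLayout =
    OpusLayout { channels: 6, streams: 4, coupled: 2, mapping: &[0, 1, 2, 3, 4, 5], bitrate: 256_000 };
pub const LAYOUT_51_HQ: OpusLayout =
    OpusLayout { channels: 6, streams: 6, coupled: 0, mapping: &[0, 1, 2, 3, 4, 5], bitrate: 1_536_000 };
pub const LAYOUT_71: OpusLayout =
    OpusLayout { channels: 8, streams: 5, coupled: 3, mapping: &[0, 1, 2, 3, 4, 5, 6, 7], bitrate: 450_000 };
pub const LAYOUT_71_HQ: OpusLayout =
    OpusLayout { channels: 8, streams: 8, coupled: 0, mapping: &[0, 1, 2, 3, 4, 5, 6, 7], bitrate: 2_048_000 };

/// Assumed signature, taken from the call site in the diff.
pub fn layout_for(channels: u8, high_quality: bool) -> &'static OpusLayout {
    match (channels, high_quality) {
        (6, false) => &LAYOUT_51,
        (6, true) => &LAYOUT_51_HQ,
        (8, false) => &LAYOUT_71,
        (8, true) => &LAYOUT_71_HQ,
        _ => &LAYOUT_STEREO, // unknown counts fall back to stereo
    }
}

/// Hypothetical sanity check: the mapping covers every channel and indexes only
/// channels the streams actually produce.
pub fn is_consistent(l: &OpusLayout) -> bool {
    let produced = l.streams as usize + l.coupled as usize;
    l.coupled <= l.streams
        && l.mapping.len() == l.channels as usize
        && produced == l.channels as usize
        && l.mapping.iter().all(|&m| (m as usize) < produced)
}
```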
/// The `a=fmtp:97 surround-params=` digit string for a layout: channelCount, streams,
@@ -345,21 +296,21 @@ fn run(
}
/// Opus encoder for one session: the plain stereo encoder (the live-validated path, byte
/// identical) or a libopus multistream encoder for 5.1/7.1.
/// identical) or the safe `opus::MSEncoder` multistream encoder for 5.1/7.1. Both are
/// cross-platform (Linux + Windows) — surround no longer needs `audiopus_sys`.
#[cfg(any(target_os = "linux", target_os = "windows"))]
enum SessionEncoder {
Stereo(opus::Encoder),
// Surround needs the libopus *multistream* encoder via `audiopus_sys` (Linux-only dep).
#[cfg(target_os = "linux")]
Surround(MsEncoder),
Surround(opus::MSEncoder),
}
#[cfg(any(target_os = "linux", target_os = "windows"))]
impl SessionEncoder {
fn new(layout: &'static OpusLayout) -> Result<SessionEncoder> {
// RESTRICTED_LOWDELAY (`opus::Application::LowDelay`) + hard CBR, matching Sunshine — CBR
// keeps the Opus packet size constant, which the GameStream audio FEC (equal-length shards)
// relies on, and the client asserts a constant per-stream TOC.
if layout.channels == 2 {
// RESTRICTED_LOWDELAY + CBR, matching Sunshine — CBR keeps the Opus TOC byte
// constant, which the client asserts per stream.
let mut enc = opus::Encoder::new(
SAMPLE_RATE,
opus::Channels::Stereo,
@@ -370,138 +321,32 @@ impl SessionEncoder {
enc.set_vbr(false).ok();
Ok(SessionEncoder::Stereo(enc))
} else {
#[cfg(target_os = "linux")]
{
Ok(SessionEncoder::Surround(MsEncoder::new(layout)?))
}
#[cfg(not(target_os = "linux"))]
{
anyhow::bail!(
"surround audio ({} ch) needs the libopus multistream encoder (Linux only) — \
use a stereo session",
layout.channels
)
}
let mut enc = opus::MSEncoder::new(
SAMPLE_RATE,
layout.streams,
layout.coupled,
layout.mapping,
opus::Application::LowDelay,
)
.map_err(|e| anyhow::anyhow!("create Opus multistream encoder: {e}"))?;
enc.set_bitrate(opus::Bitrate::Bits(layout.bitrate)).ok();
enc.set_vbr(false).ok();
Ok(SessionEncoder::Surround(enc))
}
}
/// Encode one interleaved frame (`samples_per_channel * channels` f32s) into `out`,
/// returning the packet length.
fn encode_float(
&mut self,
frame: &[f32],
samples_per_channel: usize,
out: &mut [u8],
) -> Result<usize> {
// `samples_per_channel` only feeds the multistream (surround) encoder; stereo infers it.
#[cfg(not(target_os = "linux"))]
let _ = samples_per_channel;
/// Encode one interleaved frame into `out`, returning the packet length. Both encoders infer
/// the per-channel sample count from `frame.len()` and their channel count.
fn encode_float(&mut self, frame: &[f32], out: &mut [u8]) -> Result<usize> {
match self {
SessionEncoder::Stereo(enc) => enc.encode_float(frame, out).context("opus encode"),
#[cfg(target_os = "linux")]
SessionEncoder::Surround(enc) => enc.encode_float(frame, samples_per_channel, out),
SessionEncoder::Surround(enc) => enc
.encode_float(frame, out)
.context("opus multistream encode"),
}
}
}
/// RAII wrapper for `OpusMSEncoder` (the safe `opus` crate is stereo-only; the multistream
/// API comes from `audiopus_sys`, the same libopus the crate already links). Configured like
/// the stereo path: RESTRICTED_LOWDELAY, hard CBR, per-layout bitrate.
#[cfg(target_os = "linux")]
struct MsEncoder {
st: std::ptr::NonNull<audiopus_sys::OpusMSEncoder>,
}
// SAFETY: `MsEncoder` owns a unique `OpusMSEncoder` via `NonNull` (it is neither `Clone` nor
// `Sync`, so the pointer is never aliased). libopus's multistream encoder state is a self-contained
// heap allocation with no thread-local or thread-affine state, so moving ownership to another thread
// is sound; every method takes `&mut self`, keeping access single-threaded at any instant.
#[cfg(target_os = "linux")]
unsafe impl Send for MsEncoder {}
#[cfg(target_os = "linux")]
impl MsEncoder {
fn new(layout: &OpusLayout) -> Result<MsEncoder> {
use std::os::raw::c_int;
let mut err: c_int = 0;
// SAFETY: every scalar arg is a valid libopus input (sample rate, channel/stream/coupled
// counts, the RESTRICTED_LOWDELAY application constant). `layout.mapping.as_ptr()` addresses
// a 'static slice of exactly `layout.channels` bytes (every `OpusLayout` constant upholds
// that), which is the element count `opus_multistream_encoder_create` reads through it, and
// `&mut err` is a live local the call writes its status into. libopus copies the mapping into
// its own allocation, so the pointer need only be valid for the call; the returned pointer is
// null/`OPUS_OK`-checked below before any use.
let st = unsafe {
audiopus_sys::opus_multistream_encoder_create(
SAMPLE_RATE as i32,
layout.channels as c_int,
layout.streams as c_int,
layout.coupled as c_int,
layout.mapping.as_ptr(),
audiopus_sys::OPUS_APPLICATION_RESTRICTED_LOWDELAY,
&mut err,
)
};
let st = std::ptr::NonNull::new(st)
.filter(|_| err == audiopus_sys::OPUS_OK)
.ok_or_else(|| anyhow::anyhow!("opus_multistream_encoder_create failed ({err})"))?;
// SAFETY: `st` is the non-null encoder `opus_multistream_encoder_create` just returned, owned
// exclusively here. Each `opus_multistream_encoder_ctl` call passes a valid request constant
// with the single by-value `c_int` argument that request's variadic ABI expects
// (`OPUS_SET_BITRATE_REQUEST` → bitrate, `OPUS_SET_VBR_REQUEST` → 0). No pointer escapes the
// call and the encoder outlives it.
unsafe {
audiopus_sys::opus_multistream_encoder_ctl(
st.as_ptr(),
audiopus_sys::OPUS_SET_BITRATE_REQUEST,
layout.bitrate as c_int,
);
audiopus_sys::opus_multistream_encoder_ctl(
st.as_ptr(),
audiopus_sys::OPUS_SET_VBR_REQUEST,
0 as c_int, // hard CBR (constant packet size — also what audio FEC relies on)
);
}
Ok(MsEncoder { st })
}
fn encode_float(
&mut self,
frame: &[f32],
samples_per_channel: usize,
out: &mut [u8],
) -> Result<usize> {
// SAFETY: `self.st` is the live encoder from `new`. libopus reads `samples_per_channel *
// channels` f32s through `frame.as_ptr()`; every caller passes a `frame` of exactly that
// length together with the matching `samples_per_channel` (`audio_body`'s `frame_len =
// samples_per_channel * layout.channels`; the round-trip tests size identically), so the read
// stays in bounds. `out.as_mut_ptr()` is written for at most `out.len()` bytes, which is
// passed as the capacity bound. Both buffers are live locals outliving this synchronous call;
// the return value is range-checked before being used as a length.
let n = unsafe {
audiopus_sys::opus_multistream_encode_float(
self.st.as_ptr(),
frame.as_ptr(),
samples_per_channel as std::os::raw::c_int,
out.as_mut_ptr(),
out.len() as i32,
)
};
anyhow::ensure!(n > 0, "opus_multistream_encode_float failed ({n})");
Ok(n as usize)
}
}
#[cfg(target_os = "linux")]
impl Drop for MsEncoder {
fn drop(&mut self) {
// SAFETY: `self.st` is the encoder `opus_multistream_encoder_create` returned; this
// `MsEncoder` owns it uniquely and `drop` runs exactly once, so the destroy frees it once
// with no subsequent use.
unsafe { audiopus_sys::opus_multistream_encoder_destroy(self.st.as_ptr()) }
}
}
#[cfg(any(target_os = "linux", target_os = "windows"))]
fn audio_body(
cap: &mut dyn AudioCapturer,
@@ -565,7 +410,7 @@ fn audio_body(
*s = (*s * gain).clamp(-1.0, 1.0);
}
}
let n = enc.encode_float(&frame, samples_per_channel, &mut out)?;
let n = enc.encode_float(&frame, &mut out)?;
// AES-128-CBC the Opus payload (RTP header stays plaintext). Per-packet IV =
// BE32(rikeyid + seq) in [0..4], zero elsewhere; PKCS7 padding.
let iv_seq = (rikeyid as u32).wrapping_add(seq as u32);
@@ -775,41 +620,33 @@ mod tests {
/// Real-codec proof of the 5.1 mapping math: encode with our encoder layout, decode with
/// the mapping a stock Moonlight client derives from our advertised surround-params
/// (parse → GFE swap), and verify a tone fed into each input channel comes out on the
/// same output channel.
#[cfg(target_os = "linux")]
/// same output channel. Cross-platform via the safe `opus` crate — this also guards the
/// (now un-gated) Windows GameStream surround build.
#[test]
fn multistream_51_roundtrip_channel_identity() {
let layout = &LAYOUT_51;
let samples = 240; // 5 ms
let ch = layout.channels as usize;
// Client-side decoder mapping derived exactly as moonlight-common-c does.
// Client-side decoder mapping derived exactly as moonlight-common-c does (GFE swap).
let s = surround_params(layout, false);
let digits: Vec<u8> = s.bytes().map(|b| b - b'0').collect();
let client_mapping = client_swap(&digits[3..]);
let mut err = 0i32;
// SAFETY: scalar args are valid libopus inputs. `client_mapping.as_ptr()` addresses a
// `Vec<u8>` of exactly `ch` entries (derived from the advertised surround-params), which is
// the element count the decoder reads through it, and `&mut err` is a live local the call
// writes. The returned pointer is `OPUS_OK`/non-null-checked immediately below before use.
let dec = unsafe {
audiopus_sys::opus_multistream_decoder_create(
SAMPLE_RATE as i32,
ch as i32,
layout.streams as i32,
layout.coupled as i32,
client_mapping.as_ptr(),
&mut err,
)
};
assert_eq!(err, audiopus_sys::OPUS_OK);
assert!(!dec.is_null());
let mut dec =
opus::MSDecoder::new(SAMPLE_RATE, layout.streams, layout.coupled, &client_mapping)
.expect("multistream decoder");
for tone_ch in 0..ch {
let mut enc = MsEncoder::new(layout).unwrap();
let mut enc = opus::MSEncoder::new(
SAMPLE_RATE,
layout.streams,
layout.coupled,
layout.mapping,
opus::Application::LowDelay,
)
.expect("multistream encoder");
let mut out = vec![0u8; 1400];
let mut decoded = vec![0f32; samples * ch];
let mut energy = vec![0f64; ch];
// A few frames so the codec converges past its startup transient.
for f in 0..8 {
@@ -819,28 +656,15 @@ mod tests {
/ SAMPLE_RATE as f32;
frame[t * ch + tone_ch] = 0.5 * phase.sin();
}
let n = enc.encode_float(&frame, samples, &mut out).unwrap();
let n = enc.encode_float(&frame, &mut out).unwrap();
assert!(n > 0);
// SAFETY: `dec` is the non-null decoder asserted above. `out.as_ptr()` is read for
// the `n` encoded bytes just produced by `encode_float`; `decoded.as_mut_ptr()` is
// written for up to `samples * ch` f32s and `decoded` is exactly that long; `samples`
// is the per-channel frame size. All buffers are live locals outliving the call; the
// return is checked to equal `samples`.
let got = unsafe {
audiopus_sys::opus_multistream_decode_float(
dec,
out.as_ptr(),
n as i32,
decoded.as_mut_ptr(),
samples as i32,
0,
)
};
assert_eq!(got as usize, samples);
let mut decoded = vec![0f32; samples * ch];
let got = dec.decode_float(&out[..n], &mut decoded, false).unwrap();
assert_eq!(got, samples);
if f >= 4 {
for t in 0..samples {
for c in 0..ch {
energy[c] += (decoded[t * ch + c] as f64).powi(2);
for (c, e) in energy.iter_mut().enumerate() {
*e += (decoded[t * ch + c] as f64).powi(2);
}
}
}
@@ -854,9 +678,6 @@ mod tests {
(energies: {energy:?})"
);
}
// SAFETY: `dec` is the decoder `opus_multistream_decoder_create` returned; the test owns it
// and destroys it exactly once here, after the final decode — no later use, no double free.
unsafe { audiopus_sys::opus_multistream_decoder_destroy(dec) };
}
/// Live 5.1 capture → multistream encode → decode, against a real PipeWire session.
@@ -869,7 +690,15 @@ mod tests {
fn surround_capture_live() {
let mut cap = crate::audio::open_audio_capture(6).expect("open 6ch capture");
let layout = &LAYOUT_51;
let mut enc = MsEncoder::new(layout).unwrap();
let mut enc = opus::MSEncoder::new(
SAMPLE_RATE,
layout.streams,
layout.coupled,
layout.mapping,
opus::Application::LowDelay,
)
.unwrap();
enc.set_vbr(false).ok(); // hard CBR so packet sizes are constant (audio FEC relies on it)
let mut out = vec![0u8; 1400];
let mut acc: Vec<f32> = Vec::new();
let frame_len = 240 * 6;
@@ -880,49 +709,24 @@ mod tests {
acc.extend_from_slice(&chunk);
while acc.len() >= frame_len && packets < 100 {
let frame: Vec<f32> = acc.drain(..frame_len).collect();
let n = enc.encode_float(&frame, 240, &mut out).unwrap();
let n = enc.encode_float(&frame, &mut out).unwrap();
sizes.insert(n);
packets += 1;
}
}
// Hard CBR: every multistream packet must be the same size (audio FEC relies on it).
assert_eq!(sizes.len(), 1, "CBR sizes: {sizes:?}");
// And a stock client's decoder must accept them.
// And a stock client's GFE-derived decoder must accept them.
let s = surround_params(layout, false);
let digits: Vec<u8> = s.bytes().map(|b| b - b'0').collect();
let client_mapping = client_swap(&digits[3..]);
let mut err = 0i32;
// SAFETY: scalar args are valid; `client_mapping.as_ptr()` addresses a 6-entry `Vec<u8>`
// (matches the 6-channel layout the decoder reads through it), alive past the call, and
// `&mut err` is a live local. The pointer is `OPUS_OK`-checked before use.
let dec = unsafe {
audiopus_sys::opus_multistream_decoder_create(
48000,
6,
layout.streams as i32,
layout.coupled as i32,
client_mapping.as_ptr(),
&mut err,
)
};
assert_eq!(err, audiopus_sys::OPUS_OK);
let mut dec =
opus::MSDecoder::new(SAMPLE_RATE, layout.streams, layout.coupled, &client_mapping)
.unwrap();
let mut pcm = vec![0f32; 240 * 6];
// SAFETY: `dec` is the non-null decoder from create. `out.as_ptr()` is read for the CBR
// packet length passed in (`*sizes.first()`, a real encoded packet size in `out`);
// `pcm.as_mut_ptr()` is written for up to `240 * 6` f32s and `pcm` is exactly that long;
// `240` is the per-channel frame size. All buffers are live locals outliving the call.
let got = unsafe {
audiopus_sys::opus_multistream_decode_float(
dec,
out.as_ptr(),
*sizes.first().unwrap() as i32,
pcm.as_mut_ptr(),
240,
0,
)
};
// SAFETY: `dec` is owned by the test; destroyed exactly once here after the final decode.
unsafe { audiopus_sys::opus_multistream_decoder_destroy(dec) };
let got = dec
.decode_float(&out[..*sizes.first().unwrap()], &mut pcm, false)
.unwrap();
assert_eq!(got, 240);
}
}
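With the `samples_per_channel` argument gone, the encoders derive the frame size from `frame.len()` and the channel count, so a mis-sized buffer now surfaces only as an encode error. A small sketch of a pre-check follows, assuming the 48 kHz sample rate used in these tests. `samples_per_channel` and `VALID_48K_FRAME_SIZES` are hypothetical names, not part of the `opus` crate or this commit.

```rust
/// Per-channel frame sizes Opus accepts at 48 kHz (2.5, 5, 10, 20, 40, 60 ms).
const VALID_48K_FRAME_SIZES: [usize; 6] = [120, 240, 480, 960, 1920, 2880];

/// Hypothetical helper: derive the per-channel sample count from an interleaved
/// buffer length, or `None` if the buffer cannot be a valid 48 kHz Opus frame.
fn samples_per_channel(frame_len: usize, channels: usize) -> Option<usize> {
    if channels == 0 || frame_len % channels != 0 {
        return None;
    }
    let n = frame_len / channels;
    VALID_48K_FRAME_SIZES.contains(&n).then_some(n)
}
```

For the 5.1 tests above, `frame_len = 240 * 6` yields 240 samples per channel, the 5 ms frame the round-trip test uses.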
@@ -56,6 +56,9 @@ pub fn spawn(state: Arc<AppState>) -> Result<()> {
.spawn(move || {
// GCM scheme detected from the first authenticating packet; reused thereafter.
let mut detected: Option<Scheme> = None;
// Consecutive control-decrypt failures for this peer — throttles the warn log so a
// junk-packet flood can't spam unbounded lines (security-review 2026-06-28 #10).
let mut decrypt_fails: u64 = 0;
// Decoded keyboard/mouse is forwarded to a dedicated host-lifetime injector thread —
// NEVER injected inline, so a slow Wayland/libei/SendInput call can't head-block ENet
// keepalive/retransmit servicing on this thread. The injector owns non-Send compositor
@@ -77,6 +80,7 @@ pub fn spawn(state: Arc<AppState>) -> Result<()> {
Event::Disconnect { .. } => {
tracing::info!("control: client disconnected");
detected = None;
decrypt_fails = 0;
peer = None;
// Unplug the session's virtual pads.
pads = GamepadManager::new();
@@ -89,6 +93,7 @@ pub fn spawn(state: Arc<AppState>) -> Result<()> {
channel_id,
packet.data(),
&mut detected,
&mut decrypt_fails,
&inj_tx,
&mut pads,
);
@@ -163,6 +168,7 @@ fn on_receive(
_channel_id: u8,
d: &[u8],
detected: &mut Option<Scheme>,
decrypt_fails: &mut u64,
inj_tx: &Sender<InputEvent>,
pads: &mut GamepadManager,
) {
@@ -180,10 +186,20 @@ fn on_receive(
tracing::info!(?scheme, "control: GCM scheme locked in");
}
*detected = Some(scheme);
*decrypt_fails = 0;
pt
}
None => {
tracing::warn!(len = d.len(), "control: GCM decrypt failed");
// Throttle: a junk-packet flood must not spam one warn line per packet. Log the first
// failure, then only at exponentially-spaced counts (1, 2, 4, 8, …).
*decrypt_fails += 1;
if decrypt_fails.is_power_of_two() {
tracing::warn!(
len = d.len(),
fails = *decrypt_fails,
"control: GCM decrypt failed"
);
}
return;
}
};
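The throttle in this hunk logs only when the failure count is a power of two, so the number of log lines grows logarithmically with the number of bad packets. A self-contained sketch of the same idea is below. `FailThrottle` and `logged_counts` are hypothetical names for illustration; the commit keeps a bare `u64` counter instead.

```rust
/// Hypothetical wrapper around the consecutive-failure counter.
#[derive(Default)]
struct FailThrottle {
    fails: u64,
}

impl FailThrottle {
    /// Record one failure. Returns `Some(count)` when this failure should be
    /// logged, i.e. when the running count is 1, 2, 4, 8, and so on.
    fn record_failure(&mut self) -> Option<u64> {
        self.fails = self.fails.saturating_add(1);
        self.fails.is_power_of_two().then_some(self.fails)
    }

    /// Reset after a successful decrypt or a disconnect.
    fn reset(&mut self) {
        self.fails = 0;
    }
}

/// Feed `n` consecutive failures and collect the counts that would be logged.
fn logged_counts(n: u64) -> Vec<u64> {
    let mut throttle = FailThrottle::default();
    (0..n).filter_map(|_| throttle.record_failure()).collect()
}
```

A flood of 1000 junk packets produces 10 log lines under this scheme (counts 1 through 512), and a reset makes the next failure log again.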
+63 -4
@@ -90,6 +90,11 @@ pub struct LaunchSession {
pub fps: u32,
/// `/launch?appid=N` — selects the app-catalog entry (session recipe).
pub appid: u32,
/// Source IP of the paired HTTPS client that issued `/launch`. The unauthenticated RTSP/UDP
/// media plane binds to this so only the launching peer can start/own the stream — an
/// unpaired RTSP peer cannot ride a paired client's launch (security-review 2026-06-28 #4).
/// `None` if the address could not be captured (then RTSP falls back to launch-present only).
pub peer_ip: Option<std::net::IpAddr>,
}
/// Shared control-plane state used as the axum app state.
@@ -262,9 +267,10 @@ pub(crate) fn config_dir() -> PathBuf {
}
/// Create `dir` (and parents) owner-private — **0700** on Unix (so the host's secrets aren't readable
/// by other local users via a traversable config path). Best-effort on Windows: the dir inherits the
/// (Users-readable) `%ProgramData%` ACL, so secret *files* are individually locked down by
/// [`write_secret_file`]. Tightens an already-existing dir too.
/// by other local users via a traversable config path). On Windows, applies a restrictive DACL
/// ([`restrict_dir_to_system_admins`]) so a local unprivileged user can't pre-create / plant files in
/// the config tree (the default `%ProgramData%` ACL grants Users *create*; security-review
/// 2026-06-28 #3/#11). Tightens (and re-owns) an already-existing dir too.
pub(crate) fn create_private_dir(dir: &std::path::Path) -> std::io::Result<()> {
#[cfg(unix)]
{
@@ -281,7 +287,60 @@ pub(crate) fn create_private_dir(dir: &std::path::Path) -> std::io::Result<()> {
}
#[cfg(not(unix))]
{
std::fs::create_dir_all(dir)
let r = std::fs::create_dir_all(dir);
#[cfg(windows)]
restrict_dir_to_system_admins(dir);
r
}
}
/// Best-effort Windows DACL lockdown of the config *directory* (the companion to
/// [`restrict_to_system_admins`] for files). The default `%ProgramData%` ACL lets `BUILTIN\Users`
/// create subfolders/files (and become `CREATOR OWNER`), so a non-admin could pre-create the
/// `punktfunk` dir or plant a `host.env`/`apps.json` that the privileged SYSTEM service then trusts
/// (LPE; security-review 2026-06-28 #3). This re-owns the dir to Administrators (defeating a
/// pre-creation), strips inheritance, and sets an explicit DACL: SYSTEM/Administrators/OWNER full
/// (object+container inherit so child files/dirs inherit it), and Users **read-only** (so existing
/// reads of non-secret config keep working but a local user can no longer write/plant). Secret files
/// are additionally locked to SYSTEM/Admins by [`write_secret_file`]. Uses hard-coded SIDs
/// (locale-independent) and runs `icacls` via its absolute `%SystemRoot%` path; never fatal.
#[cfg(windows)]
fn restrict_dir_to_system_admins(dir: &std::path::Path) {
let icacls = std::env::var("SystemRoot")
.map(|r| format!("{r}\\System32\\icacls.exe"))
.unwrap_or_else(|_| "icacls".to_string());
// Reset ownership of the directory object to Administrators first, so a dir a non-admin may have
// pre-created can't keep OWNER control (an owner can always rewrite the DACL). No `/T` — re-owning
// the dir itself is what defeats the pre-creation; recursing a large captures tree each call is
// needless churn (secret files are individually owner-locked by `write_secret_file`).
let _ = std::process::Command::new(&icacls)
.arg(dir.as_os_str())
.args(["/setowner", "*S-1-5-32-544"]) // BUILTIN\Administrators
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
let status = std::process::Command::new(&icacls)
.arg(dir.as_os_str())
.args([
"/inheritance:r",
"/grant:r",
"*S-1-5-18:(OI)(CI)(F)", // NT AUTHORITY\SYSTEM
"/grant:r",
"*S-1-5-32-544:(OI)(CI)(F)", // BUILTIN\Administrators
"/grant:r",
"*S-1-3-4:(OI)(CI)(F)", // OWNER RIGHTS
"/grant:r",
"*S-1-5-32-545:(OI)(CI)(RX)", // BUILTIN\Users — read-only (no create/write → no plant)
])
.stdout(std::process::Stdio::null())
.stderr(std::process::Stdio::null())
.status();
match status {
Ok(s) if s.success() => {}
_ => tracing::warn!(
dir = %dir.display(),
"config-dir DACL hardening did not fully succeed — a local user may be able to plant config files"
),
}
}
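The second `icacls` invocation is a fixed argument list: strip inheritance, then one `/grant:r` per SID. A sketch that builds the same list from a table is below, which makes the SID-to-rights pairing easier to review and to test without running `icacls`. `GRANTS` and `dacl_args` are hypothetical names; the SIDs and rights strings are copied from the diff.

```rust
/// (SID, rights) pairs as they appear in the diff.
const GRANTS: [(&str, &str); 4] = [
    ("S-1-5-18", "(OI)(CI)(F)"),      // NT AUTHORITY\SYSTEM
    ("S-1-5-32-544", "(OI)(CI)(F)"),  // BUILTIN\Administrators
    ("S-1-3-4", "(OI)(CI)(F)"),       // OWNER RIGHTS
    ("S-1-5-32-545", "(OI)(CI)(RX)"), // BUILTIN\Users, read-only
];

/// Hypothetical helper: build the `icacls` arguments that follow the path.
/// `*SID` tells icacls to treat the value as a SID rather than an account name.
fn dacl_args(grants: &[(&str, &str)]) -> Vec<String> {
    let mut args = vec!["/inheritance:r".to_string()];
    for (sid, rights) in grants {
        args.push("/grant:r".to_string());
        args.push(format!("*{sid}:{rights}"));
    }
    args
}
```

The result can be passed to `std::process::Command::args` after the directory path, as the function in the diff does with its literal array.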
+13 -18
@@ -1,9 +1,14 @@
//! The nvhttp servers: plain HTTP on 47989 and mutual-TLS on 47984. Serves `/serverinfo`,
//! the `/pair` flow, `/applist`, and `/launch`/`/resume`/`/cancel`, plus a punktfunk-only
//! `/pin` endpoint to deliver the Moonlight-displayed PIN. Over HTTPS the client is
//! the `/pair` flow, `/applist`, and `/launch`/`/resume`/`/cancel`. Over HTTPS the client is
//! mutual-TLS-authenticated, so `/serverinfo` reports `PairStatus=1` there.
//!
//! The pairing PIN is delivered out-of-band ONLY through the bearer-authenticated management
//! API (`POST /api/v1/pair/pin`): the operator reads the PIN off the Moonlight client and
//! types it into the host console. There is deliberately NO unauthenticated nvhttp PIN
//! endpoint — one would let a network client submit its own displayed PIN and drive the whole
//! ceremony to a pinned cert with no operator consent (security-review 2026-06-28 #1).
use super::tls::PeerCertFingerprint;
use super::tls::{PeerAddr, PeerCertFingerprint};
use super::{serverinfo, AppState, LaunchSession, HTTPS_PORT, HTTP_PORT, RTSP_PORT};
use anyhow::{anyhow, Context, Result};
use axum::{
@@ -58,7 +63,6 @@ fn router(state: Arc<AppState>, https: bool) -> Router {
Router::new()
.route("/serverinfo", get(h_serverinfo))
.route("/pair", get(h_pair))
.route("/pin", get(h_pin))
.route("/applist", get(h_applist))
.route("/launch", get(h_launch))
.route("/resume", get(h_resume))
@@ -82,19 +86,6 @@ async fn h_serverinfo(
xml(serverinfo::serverinfo_xml(&st.host, https, paired))
}
async fn h_pin(
State(st): State<Arc<AppState>>,
Query(q): Query<HashMap<String, String>>,
) -> impl IntoResponse {
match q.get("pin").filter(|p| !p.is_empty()) {
Some(pin) => {
st.pairing.pin.submit(pin.clone());
"PIN accepted\n".to_string()
}
None => "usage: GET /pin?pin=NNNN\n".to_string(),
}
}
async fn h_applist(
State(st): State<Arc<AppState>>,
peer: Option<Extension<PeerCertFingerprint>>,
@@ -110,6 +101,7 @@ async fn h_applist(
async fn h_launch(
State(st): State<Arc<AppState>>,
peer: Option<Extension<PeerCertFingerprint>>,
addr: Option<Extension<PeerAddr>>,
Query(q): Query<HashMap<String, String>>,
) -> impl IntoResponse {
if !peer_is_paired(&peer, &st) {
@@ -117,7 +109,9 @@ async fn h_launch(
return xml(error_xml());
}
match launch(&st, &q) {
Ok(session) => {
Ok(mut session) => {
// Bind the (unauthenticated) RTSP/UDP media plane to this paired client's source IP.
session.peer_ip = addr.map(|Extension(PeerAddr(a))| a.ip());
*st.launch.lock().unwrap() = Some(session);
tracing::info!(
w = session.width,
@@ -193,6 +187,7 @@ fn launch(_st: &AppState, q: &HashMap<String, String>) -> Result<LaunchSession>
height,
fps,
appid,
peer_ip: None, // set by `h_launch` from the verified HTTPS peer address
})
}
@@ -17,9 +17,14 @@ use std::sync::Mutex;
use std::time::Duration;
use tokio::sync::Notify;
/// Out-of-band PIN delivery. Moonlight generates + displays a PIN; the user submits it
/// (via the management API's `POST /api/v1/pair/pin` or nvhttp's `GET /pin?pin=NNNN`).
/// `getservercert` parks until a PIN arrives.
/// Out-of-band PIN delivery. Moonlight generates + displays a PIN; the operator submits it
/// via the bearer-authenticated management API (`POST /api/v1/pair/pin`) only — there is no
/// unauthenticated nvhttp delivery path (a network client must never be able to submit its
/// own PIN; security-review 2026-06-28 #1). `getservercert` parks until a PIN arrives.
/// Max pairing handshakes parked in [`PinGate::take`] at once (each holds a slot for up to
/// 300s), bounding a pre-auth waiter flood. Real pairing is one operator-driven client at a time.
const MAX_PARKED_WAITERS: usize = 4;
pub struct PinGate {
pin: Mutex<Option<String>>,
notify: Notify,
@@ -48,7 +53,20 @@ impl PinGate {
}
async fn take(&self, timeout: Duration) -> Option<String> {
self.waiters.fetch_add(1, Ordering::SeqCst);
// Bound the number of pairing handshakes parked at once: each `getservercert` is
// pre-auth and parks for up to 300s, so without a cap an unpaired LAN peer could pin
// unbounded tasks + keep `awaiting_pin` asserted (security-review 2026-06-28 #12).
// Reserve a slot atomically; refuse (treated as "no PIN") once the cap is reached.
if self
.waiters
.fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| {
(n < MAX_PARKED_WAITERS).then_some(n + 1)
})
.is_err()
{
tracing::warn!("pairing: too many handshakes awaiting a PIN — refusing");
return None;
}
// Decrement on every exit path (PIN delivered, timeout, or future cancellation).
struct WaiterGuard<'a>(&'a AtomicUsize);
impl Drop for WaiterGuard<'_> {
@@ -117,7 +135,8 @@ impl Pairing {
tracing::info!(
uniqueid,
"pairing phase 1 (getservercert) — awaiting PIN: submit `GET /pin?pin=NNNN`"
"pairing phase 1 (getservercert) — awaiting PIN: deliver it via the management \
API `POST /api/v1/pair/pin` (operator reads the PIN off the Moonlight client)"
);
let pin = self
.pin
@@ -304,4 +323,28 @@ mod tests {
assert_eq!(pairing.pin.take(Duration::from_millis(10)).await, None);
assert!(!pairing.pin.awaiting_pin());
}
/// A pre-auth peer flood can park at most `MAX_PARKED_WAITERS` pairing handshakes; the next
/// `take` is refused immediately (returns `None` without parking), bounding the 300s-waiter DoS
/// (security-review 2026-06-28 #12).
#[tokio::test]
async fn pin_gate_caps_parked_waiters() {
let pairing = Arc::new(Pairing::new());
let mut handles = Vec::new();
for _ in 0..MAX_PARKED_WAITERS {
let p = pairing.clone();
handles.push(tokio::spawn(async move {
p.pin.take(Duration::from_secs(5)).await
}));
}
// Wait until all the slots are taken.
while pairing.pin.waiters.load(Ordering::SeqCst) < MAX_PARKED_WAITERS {
tokio::time::sleep(Duration::from_millis(2)).await;
}
// One more is refused right away (no parking), even with a long timeout.
assert_eq!(pairing.pin.take(Duration::from_secs(5)).await, None);
for h in handles {
h.abort();
}
}
}
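The waiter cap in `PinGate::take` combines an atomic reserve-or-refuse step with a drop guard that releases the slot on every exit path. A minimal standalone sketch of that pattern follows, using only the standard library. The names `MAX_SLOTS`, `SlotGuard`, and `try_reserve` are hypothetical and stand in for the host's own types; the sketch does not reproduce the real `PinGate`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical cap, standing in for `MAX_PARKED_WAITERS` from the diff.
pub const MAX_SLOTS: usize = 4;

/// RAII slot (hypothetical name): dropping it releases the reservation,
/// whether the holder finished, timed out, or was cancelled.
pub struct SlotGuard<'a>(&'a AtomicUsize);

impl Drop for SlotGuard<'_> {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Atomically reserve one slot, or return `None` once the cap is reached.
/// `fetch_update` retries the closure on contention, so the check and the
/// increment happen as one step and the counter never exceeds the cap.
pub fn try_reserve(counter: &AtomicUsize) -> Option<SlotGuard<'_>> {
    counter
        .fetch_update(Ordering::SeqCst, Ordering::SeqCst, |n| {
            (n < MAX_SLOTS).then_some(n + 1)
        })
        .ok()
        .map(|_| SlotGuard(counter))
}
```

A plain `fetch_add` followed by a separate comparison would let concurrent callers overshoot the cap briefly; folding the comparison into `fetch_update` avoids that window.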
+32 -22
@@ -14,7 +14,7 @@ use crate::encode::Codec;
use anyhow::{Context, Result};
use std::collections::HashMap;
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};
use std::net::{SocketAddr, TcpListener, TcpStream};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::time::Duration;
@@ -102,13 +102,12 @@ fn handle_conn(mut stream: TcpStream, state: Arc<AppState>) -> Result<()> {
"RTSP {} | {}", req.head.replace("\r\n", " | "),
if req.body.is_empty() { String::new() } else { format!("body: {}", req.body.replace("\r\n", " | ")) }
);
let resp = handle_request(&req, &state);
let resp = handle_request(&req, &state, peer);
stream.write_all(resp.as_bytes()).context("RTSP write")?;
stream.flush().ok();
// Close (FIN after the flushed response) so the client detects end-of-response.
let _ = stream.shutdown(std::net::Shutdown::Both);
}
let _ = peer;
Ok(())
}
@@ -171,7 +170,7 @@ fn parse_request(head: &str, body: String) -> Request {
}
}
fn handle_request(req: &Request, state: &AppState) -> String {
fn handle_request(req: &Request, state: &AppState, peer: Option<SocketAddr>) -> String {
match req.method.as_str() {
"OPTIONS" => response(
&req.cseq,
@@ -216,16 +215,30 @@ fn handle_request(req: &Request, state: &AppState) -> String {
response(&req.cseq, &[], None)
}
"PLAY" => {
// The RTSP/UDP media plane is UNAUTHENTICATED. A stream may start only for the paired
// client that completed the pairing-gated `/launch` (which set `state.launch`), and —
// when the launching IP is known — only from that same source IP. So an unpaired RTSP
// peer can neither start a stream on an idle host nor ride a paired client's active
// launch (security-review 2026-06-28 #4). `nvhttp` gates `/launch` on a pinned cert.
let launch = *state.launch.lock().unwrap();
let Some(ls) = launch else {
tracing::warn!(?peer, "RTSP PLAY — refused: no paired `/launch` session");
return response_status("401 Unauthorized", &req.cseq, &[], None);
};
if let (Some(want), Some(got)) = (ls.peer_ip, peer.map(|p| p.ip())) {
if want != got {
tracing::warn!(
%want, %got,
"RTSP PLAY — refused: peer IP does not match the launching client"
);
return response_status("401 Unauthorized", &req.cseq, &[], None);
}
}
let cfg = *state.stream.lock().unwrap();
match cfg {
Some(cfg) if !state.streaming.swap(true, Ordering::SeqCst) => {
// Resolve the launched catalog entry (session recipe) for the stream.
let app = state
.launch
.lock()
.unwrap()
.map(|l| l.appid)
.and_then(super::apps::by_id);
let app = super::apps::by_id(ls.appid);
tracing::info!(app = ?app.as_ref().map(|a| &a.title), "RTSP PLAY — starting video stream");
stream::start(
cfg,
@@ -243,18 +256,15 @@ fn handle_request(req: &Request, state: &AppState) -> String {
// Audio runs independently (Opus on UDP 48000, stereo or 5.1/7.1 multistream per
// the ANNOUNCE); it needs the launch key for the AES-CBC payload encryption the
// client expects.
let launch = *state.launch.lock().unwrap();
if let Some(ls) = launch {
if !state.audio_streaming.swap(true, Ordering::SeqCst) {
tracing::info!("RTSP PLAY — starting audio stream");
audio::start(
state.audio_streaming.clone(),
ls.gcm_key,
ls.rikeyid,
*state.audio_params.lock().unwrap(),
state.audio_cap.clone(),
);
}
if !state.audio_streaming.swap(true, Ordering::SeqCst) {
tracing::info!("RTSP PLAY — starting audio stream");
audio::start(
state.audio_streaming.clone(),
ls.gcm_key,
ls.rikeyid,
*state.audio_params.lock().unwrap(),
state.audio_cap.clone(),
);
}
response(&req.cseq, &[("Session", "DEADBEEFCAFE;timeout = 90")], None)
}
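The PLAY gate above reduces to a small decision function: refuse when no launch session exists, refuse when both IPs are known and differ, otherwise allow. The sketch below isolates that logic so it can be read and tested on its own. `Launch`, `PlayDecision`, and `gate_play` are hypothetical names for illustration, not items from the host crate.

```rust
use std::net::IpAddr;

/// Hypothetical stand-in for the part of `LaunchSession` the gate needs.
#[derive(Clone, Copy)]
pub struct Launch {
    /// Source IP recorded at `/launch`, if it was known.
    pub peer_ip: Option<IpAddr>,
}

/// Outcome of the PLAY gate (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum PlayDecision {
    Allow,
    NoLaunch,
    IpMismatch,
}

/// Mirror of the checks in the diff: a launch session is mandatory, and the
/// IP comparison applies only when both sides are known.
pub fn gate_play(launch: Option<Launch>, peer: Option<IpAddr>) -> PlayDecision {
    let Some(ls) = launch else {
        return PlayDecision::NoLaunch;
    };
    match (ls.peer_ip, peer) {
        (Some(want), Some(got)) if want != got => PlayDecision::IpMismatch,
        _ => PlayDecision::Allow,
    }
}
```

Note that, as in the diff, an unknown launch IP or an unknown RTSP peer address skips the comparison rather than failing it; the launch-session requirement is the only unconditional check.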
+140 -38
@@ -114,12 +114,12 @@ fn run(
// `video_cap`, since a reconnect at a different resolution needs a freshly-sized output; the
// output is released when this capturer drops at stream end (RAII via its keepalive).
if crate::config::config().video_source.as_deref() == Some("virtual") {
// The launched app picks the compositor (e.g. gamescope for game entries) and the
// nested command.
let compositor = app
.and_then(|a| a.compositor)
.map(Ok)
.unwrap_or_else(|| crate::vdisplay::detect().context("detect compositor"))?;
// Open the virtual-display source: pick the live compositor, normalize the session env
// (apply_session_env/apply_input_env — gamescope ATTACH/resize + KWin/Mutter retargeting,
// exactly like the native plane), create a virtual output at the client mode, and capture it.
// Re-runnable: the encode loop calls it again on a mid-stream capture loss to FOLLOW a
// Desktop<->Game switch.
let (mut capturer, compositor) = open_gs_virtual_source(cfg, app)?;
tracing::info!(
?compositor,
app = ?app.map(|a| &a.title),
@@ -127,31 +127,6 @@ fn run(
h = cfg.height,
"video source: virtual display (native client resolution)"
);
let mut vd = crate::vdisplay::open(compositor).context("open virtual display")?;
// Carry the resolved launch command on the backend instance (per-session) rather than a
// process-global env var, so concurrent sessions can't stomp each other's launch target.
vd.set_launch_command(app.and_then(|a| a.cmd.clone()));
let vout = vd
.create(punktfunk_core::Mode {
width: cfg.width,
height: cfg.height,
refresh_hz: cfg.fps,
})
.context("create virtual output at client resolution")?;
// `want_hdr=false`: the IDD-push backend (opt-in PUNKTFUNK_IDD_PUSH) has no monitor-HDR
// auto-detection — it converts its always-FP16 ring per this flag — and GameStream HDR is not
// negotiated into StreamConfig here, so an IDD-push GameStream session streams SDR even on an
// HDR desktop. (The default WGC backend DOES auto-detect HDR from the output colorspace, but
// IDD-push bypasses WGC.) Acceptable for the experimental IDD-push A/B path; HDR over IDD-push
// is wired only for punktfunk/1 (want_hdr = negotiated bit_depth >= 10). TODO: derive want_hdr
// from a GameStream HDR flag once StreamConfig carries one.
let mut capturer = capture::capture_virtual_output(
vout,
capture::OutputFormat::resolve(false),
crate::session_plan::CaptureBackend::resolve(),
)
.context("capture virtual output")?;
capturer.set_active(true);
// Launch the app's command now that capture is live, for the backends that DON'T nest it via
// set_launch_command above: Windows (no gamescope) and Linux kwin/mutter/wlroots (which stream
// the existing desktop, so the app must be spawned into the session to land on the streamed
@@ -171,8 +146,14 @@ fn run(
}
}
}
// Rebuild closure: re-open the source on a mid-stream capture loss, RE-DETECTING the live
// compositor — so a Desktop<->Game switch (at the client's fixed mode) is FOLLOWED in place
// without a Moonlight reconnect. (A resolution change can't be followed mid-stream on
// GameStream — WxH is locked at ANNOUNCE — but a session toggle keeps the negotiated mode.)
let rebuild = || open_gs_virtual_source(cfg, app).map(|(c, _)| c);
return stream_body(
&mut *capturer,
&mut capturer,
Some(&rebuild),
&sock,
cfg,
running,
@@ -200,8 +181,10 @@ fn run(
}
};
capturer.set_active(true);
// Portal/synthetic source: no compositor virtual output to re-detect, so no rebuild closure.
let result = stream_body(
&mut *capturer,
&mut capturer,
None,
&sock,
cfg,
running,
@@ -215,6 +198,53 @@ fn run(
result
}
/// Open the virtual-display video source for a GameStream session: pick the LIVE compositor + normalize
/// the session env (apply_session_env/apply_input_env — gamescope ATTACH/resize, KWin/Mutter
/// retargeting) exactly like the native plane (punktfunk1.rs resolve_compositor), create a virtual
/// output at the client's mode, and capture it. Returns the capturer (it owns the output's keepalive;
/// the stateless VirtualDisplay factory is dropped here) plus the resolved compositor. An apps.json
/// entry can PIN a compositor (skips the live detect/retarget). Re-run on a mid-stream capture loss to
/// FOLLOW a Desktop<->Game switch: it re-detects the now-live compositor and re-targets at it. Does NOT
/// launch the app (that happens once at stream start; a rebuild must not re-spawn it).
fn open_gs_virtual_source(
cfg: StreamConfig,
app: Option<&super::apps::AppEntry>,
) -> Result<(Box<dyn Capturer>, crate::vdisplay::Compositor)> {
let compositor = if let Some(c) = app.and_then(|a| a.compositor) {
c
} else {
let active = crate::vdisplay::detect_active_session();
crate::vdisplay::apply_session_env(&active);
let c = crate::vdisplay::compositor_for_kind(active.kind)
.map(Ok)
.unwrap_or_else(crate::vdisplay::detect)
.context("detect compositor")?;
crate::vdisplay::apply_input_env(c);
c
};
let mut vd = crate::vdisplay::open(compositor).context("open virtual display")?;
// Carry the resolved launch command on the backend instance (per-session) rather than a
// process-global env var, so concurrent sessions can't stomp each other's launch target.
vd.set_launch_command(app.and_then(|a| a.cmd.clone()));
let vout = vd
.create(punktfunk_core::Mode {
width: cfg.width,
height: cfg.height,
refresh_hz: cfg.fps,
})
.context("create virtual output at client resolution")?;
// want_hdr=false: GameStream HDR is not negotiated into StreamConfig here (the default WGC backend
// still auto-detects HDR from the output colorspace; only the opt-in IDD-push path streams SDR).
let capturer = capture::capture_virtual_output(
vout,
capture::OutputFormat::resolve(false),
crate::session_plan::CaptureBackend::resolve(),
)
.context("capture virtual output")?;
capturer.set_active(true);
Ok((capturer, compositor))
}
/// One frame's packets, handed from the encode thread to the send thread.
type PacketBatch = Vec<Vec<u8>>;
@@ -367,7 +397,11 @@ fn percentile(v: &mut [u32], q: f64) -> u32 {
/// (see [`spawn_sender`]) so a send spike can never stall capture/encode.
#[allow(clippy::too_many_arguments)]
fn stream_body(
capturer: &mut dyn Capturer,
// `&mut Box` (not `&mut dyn`) so a mid-stream capture-loss rebuild can SWAP the capturer in place.
capturer: &mut Box<dyn Capturer>,
// Re-open the video source on capture loss (virtual-display path → follow a Desktop<->Game switch);
// `None` for the portal/synthetic source, which has nothing to re-detect (propagate the error).
rebuild: Option<&dyn Fn() -> Result<Box<dyn Capturer>>>,
sock: &UdpSocket,
cfg: StreamConfig,
running: &Arc<AtomicBool>,
@@ -397,6 +431,9 @@ fn stream_body(
cfg.bitrate_kbps as u64 * 1000,
frame.is_cuda(),
8, // GameStream/Moonlight path: 8-bit (its own codec negotiation)
// GameStream/Moonlight stays 4:2:0 — stock Moonlight clients can't decode 4:4:4, and the
// protocol has no chroma negotiation. 4:4:4 is punktfunk/1-native only.
encode::ChromaFormat::Yuv420,
)
.context("open video encoder for stream")?;
// FEC overhead percent (Sunshine default 20). Override with PUNKTFUNK_FEC_PCT (0 = data-only).
@@ -459,7 +496,12 @@ fn stream_body(
// RFI capability is fixed for the session (probed at encoder open). Query it once so the
// recovery path skips the always-`false` invalidate call on encoders without NVENC RFI and
// forces a keyframe directly instead.
let supports_rfi = enc.caps().supports_rfi;
let mut supports_rfi = enc.caps().supports_rfi;
// Bound consecutive capture-loss rebuilds (a delivered frame clears the counter) so a permanently
// dead source can't loop forever — it ends the stream after the cap, falling back to a reconnect.
const MAX_REBUILDS: u32 = 5;
let mut rebuilds: u32 = 0;
while running.load(Ordering::SeqCst) {
let tick = Instant::now();
@@ -467,9 +509,69 @@ fn stream_body(
// armed (cheap Relaxed atomic, re-read each frame).
let measure = perf || stats.is_armed();
// Advance to the freshest captured frame if one arrived; otherwise reuse the last.
if let Some(f) = capturer.try_latest().context("capture frame")? {
frame = f;
uniq += 1;
match capturer.try_latest() {
Ok(Some(f)) => {
frame = f;
uniq += 1;
rebuilds = 0; // a delivered frame clears the consecutive-loss counter
}
Ok(None) => {} // no new frame — reuse the last (static/idle desktop)
Err(e) => {
// The capture source went away — the compositor was torn down on a Desktop<->Game
// switch, or the virtual output was removed. On the virtual-display path, re-detect the
// now-live compositor and re-attach IN PLACE (the send thread + packetizer + socket +
// RTP clock all survive), then force an IDR so Moonlight resyncs — so the stream FOLLOWS
// the switch with no client reconnect. Build the new source BEFORE dropping the old.
// Bounded by a counter + a ~40s budget; on exhaustion, end the stream (Moonlight
// reconnect). The portal/synthetic path has no rebuild closure → propagate as before.
let Some(rebuild) = rebuild else {
return Err(e).context("capture frame");
};
rebuilds += 1;
if rebuilds > MAX_REBUILDS {
return Err(e).context("capture lost — rebuild attempts exhausted");
}
tracing::warn!(error = %format!("{e:#}"), rebuild = rebuilds,
"gamestream: capture lost — rebuilding source in place (following a session switch)");
let rebuild_deadline = Instant::now() + Duration::from_secs(40);
let new_cap = loop {
match rebuild() {
Ok(c) => break c,
Err(e2) => {
if !running.load(Ordering::SeqCst) || Instant::now() >= rebuild_deadline
{
return Err(e2)
.context("capture lost — no source within the rebuild budget");
}
tracing::warn!(error = %format!("{e2:#}"),
"gamestream: source not up yet — retrying");
std::thread::sleep(Duration::from_millis(500));
}
}
};
*capturer = new_cap;
capturer.set_active(true);
frame = capturer.next_frame().context("first frame after rebuild")?;
// Re-open the encoder for the new source (same negotiated WxH → same SPS profile) and
// force an IDR so Moonlight resyncs on the first emitted AU.
enc = encode::open_video(
cfg.codec,
frame.format,
frame.width,
frame.height,
cfg.fps,
cfg.bitrate_kbps as u64 * 1000,
frame.is_cuda(),
8,
encode::ChromaFormat::Yuv420, // GameStream stays 4:2:0
)
.context("reopen encoder after rebuild")?;
supports_rfi = enc.caps().supports_rfi;
enc.request_keyframe();
next_frame = Instant::now();
tracing::info!("gamestream: source rebuilt — stream continues");
continue;
}
}
let t_cap = tick.elapsed();
// Honor a client recovery request. Prefer reference-frame invalidation (the encoder
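The inner rebuild loop in this hunk is a retry with a wall-clock budget and a stop flag. A generic sketch of that shape follows, assuming nothing about the capture types. `retry_until` and its parameters are hypothetical names; the 40 s budget and 500 ms pause from the diff become arguments here.

```rust
use std::time::{Duration, Instant};

/// Retry `open` until it succeeds, `keep_going` returns false, or `budget`
/// has elapsed. The last error is returned on give-up. (Hypothetical helper;
/// the diff inlines this loop instead.)
pub fn retry_until<T, E>(
    mut open: impl FnMut() -> Result<T, E>,
    keep_going: impl Fn() -> bool,
    budget: Duration,
    pause: Duration,
) -> Result<T, E> {
    let deadline = Instant::now() + budget;
    loop {
        match open() {
            Ok(v) => return Ok(v),
            Err(e) => {
                // Give up if the session was stopped or the budget is spent.
                if !keep_going() || Instant::now() >= deadline {
                    return Err(e);
                }
                std::thread::sleep(pause);
            }
        }
    }
}
```

The separate `MAX_REBUILDS` counter in the diff bounds how many times this whole retry sequence may start without a delivered frame in between, so the two limits cover different failure modes: a source that never comes up, and a source that comes up but keeps dying.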
+12 -4
@@ -24,6 +24,12 @@ use std::sync::Arc;
#[derive(Clone)]
pub(crate) struct PeerCertFingerprint(pub Option<String>);
/// The TCP source address of an HTTPS request, injected per-connection by [`serve_https`]. Used by
/// `/launch` to record which paired client owns the session so the unauthenticated RTSP/UDP media
/// plane can bind to that peer's IP (security-review 2026-06-28 #4).
#[derive(Clone, Copy)]
pub(crate) struct PeerAddr(pub SocketAddr);
/// HTTPS server that surfaces the verified client cert to handlers. `axum_server` can't expose the
/// peer cert, so this runs the rustls handshake itself (tokio-rustls), reads the peer certificate,
/// and serves the axum `Router` over hyper with the peer's fingerprint attached to every request as
@@ -39,7 +45,7 @@ pub(crate) async fn serve_https(
.await
.with_context(|| format!("bind HTTPS {bind}"))?;
loop {
let (tcp, _peer) = match listener.accept().await {
let (tcp, peer) = match listener.accept().await {
Ok(v) => v,
Err(e) => {
tracing::warn!(error = %e, "HTTPS accept failed");
@@ -63,14 +69,16 @@ pub(crate) async fn serve_https(
.peer_certificates()
.and_then(|c| c.first())
.map(|c| hex::encode(punktfunk_core::quic::endpoint::cert_fingerprint(c.as_ref())));
let peer = PeerCertFingerprint(fp);
let fp = PeerCertFingerprint(fp);
let addr = PeerAddr(peer);
let svc =
hyper::service::service_fn(move |req: hyper::Request<hyper::body::Incoming>| {
let app = app.clone();
let peer = peer.clone();
let fp = fp.clone();
async move {
let mut req = req.map(axum::body::Body::new);
req.extensions_mut().insert(peer);
req.extensions_mut().insert(fp);
req.extensions_mut().insert(addr);
app.oneshot(req).await // Router error is Infallible
}
});
+42 -12
@@ -24,6 +24,9 @@ pub trait InputInjector {
pub enum Backend {
/// wlroots virtual pointer + keyboard Wayland protocols — the headless-Sway path.
WlrVirtual,
/// KWin `org_kde_kwin_fake_input` — direct injection, no RemoteDesktop portal / approval dialog
/// (authorized by the host's `.desktop`). The headless KDE-Desktop path; what krdpserver uses.
KwinFakeInput,
/// libei via `reis` — Wayland-native (RemoteDesktop portal). Not yet implemented.
Libei,
/// libei directly against gamescope's own EIS socket (no portal): input lands in the
@@ -47,6 +50,16 @@ pub fn open(backend: Backend) -> Result<Box<dyn InputInjector>> {
anyhow::bail!("wlroots virtual input requires Linux + a Wayland compositor")
}
}
Backend::KwinFakeInput => {
#[cfg(target_os = "linux")]
{
Ok(Box::new(kwin_fake_input::KwinFakeInjector::open()?))
}
#[cfg(not(target_os = "linux"))]
{
anyhow::bail!("KWin fake_input requires Linux + a KWin Wayland session")
}
}
Backend::Libei => {
#[cfg(target_os = "linux")]
{
@@ -63,9 +76,7 @@ pub fn open(backend: Backend) -> Result<Box<dyn InputInjector>> {
#[cfg(target_os = "linux")]
{
Ok(Box::new(libei::LibeiInjector::open_with(
libei::EiSource::SocketPathFile(
crate::vdisplay::gamescope_ei_socket_file().into(),
),
libei::EiSource::SocketPathFile(crate::vdisplay::gamescope_ei_socket_file()),
)?))
}
#[cfg(not(target_os = "linux"))]
@@ -90,12 +101,18 @@ pub fn open(backend: Backend) -> Result<Box<dyn InputInjector>> {
/// Pick the injection backend for the current session. gamescope hosts its own EIS server (no
/// portal), so a gamescope session injects directly into it. wlroots/Sway only implements the
/// ScreenCast portal (no RemoteDesktop), so libei can't run there — use the wlr virtual-input
/// protocols. KWin and GNOME implement RemoteDesktop but not the wlr protocols, so use libei.
/// `PUNKTFUNK_INPUT_BACKEND=wlr|libei|gamescope|uinput` overrides the auto-detection.
/// protocols. **KWin** exposes `org_kde_kwin_fake_input` (direct injection, no portal / approval
/// dialog — the only headless-capable path; what krdpserver uses), so prefer it there. **GNOME**
/// has neither fake_input nor the wlr protocols, so it uses libei via the RemoteDesktop portal
/// (which needs a user to approve, or a pre-seeded grant — not truly headless).
/// `PUNKTFUNK_INPUT_BACKEND=wlr|kwin|libei|gamescope|uinput` overrides the auto-detection.
pub fn default_backend() -> Backend {
if let Ok(v) = std::env::var("PUNKTFUNK_INPUT_BACKEND") {
match v.trim().to_ascii_lowercase().as_str() {
"wlr" | "wlroots" | "wlrvirtual" => return Backend::WlrVirtual,
"kwin" | "fakeinput" | "fake_input" | "kwin-fake-input" => {
return Backend::KwinFakeInput
}
"libei" | "ei" | "portal" => return Backend::Libei,
"gamescope" | "gamescope-ei" => return Backend::GamescopeEi,
"uinput" => return Backend::Uinput,
@@ -112,16 +129,26 @@ pub fn default_backend() -> Backend {
}
#[cfg(not(target_os = "windows"))]
{
if crate::config::config()
.compositor
.as_deref()
.is_some_and(|v| v.trim().eq_ignore_ascii_case("gamescope"))
{
return Backend::GamescopeEi;
// An explicit compositor pick (set per connect / mid-stream) is the strongest signal.
let compositor = crate::config::config().compositor.clone();
if let Some(c) = compositor.as_deref() {
let c = c.trim();
if c.eq_ignore_ascii_case("gamescope") {
return Backend::GamescopeEi;
}
if c.eq_ignore_ascii_case("kwin") {
return Backend::KwinFakeInput;
}
if c.eq_ignore_ascii_case("wlroots") || c.eq_ignore_ascii_case("sway") {
return Backend::WlrVirtual;
}
// mutter (GNOME) falls through to the XDG_CURRENT_DESKTOP check below.
}
let desktop = std::env::var("XDG_CURRENT_DESKTOP").unwrap_or_default();
let d = desktop.to_ascii_uppercase();
if d.contains("KDE") || d.contains("GNOME") {
if d.contains("KDE") {
Backend::KwinFakeInput
} else if d.contains("GNOME") {
Backend::Libei
} else {
Backend::WlrVirtual
@@ -478,6 +505,9 @@ pub mod gamepad {
}
}
#[cfg(target_os = "linux")]
#[path = "inject/linux/kwin_fake_input.rs"]
mod kwin_fake_input;
#[cfg(target_os = "linux")]
#[path = "inject/linux/libei.rs"]
mod libei;
#[cfg(target_os = "windows")]
@@ -0,0 +1,209 @@
//! Headless input injection on KWin via the privileged `org_kde_kwin_fake_input` protocol — the
//! exact path KDE's own headless RDP server (`krdpserver`) uses. KWin advertises this restricted
//! global only to a client authorized through its installed `.desktop` `X-KDE-Wayland-Interfaces`
//! (we ship `io.unom.Punktfunk.Host.desktop`, which lists `org_kde_kwin_fake_input` alongside
//! `zkde_screencast_unstable_v1`). Binding the global IS the authorization, so injection needs **no
//! RemoteDesktop portal and no "Allow remote control?" dialog** — it works with no user present,
//! which the libei/portal path cannot. We connect as an ordinary Wayland client on the KWin session's
//! `$WAYLAND_DISPLAY` and translate events into fake-input requests; keyboard keys are raw Linux
//! evdev codes that KWin resolves through the session's own keymap (no keymap upload, unlike the wlr
//! virtual-keyboard path), and absolute pointer/touch coordinates are global compositor space — which
//! on a headless box (single per-session virtual output at the origin, scale 1) equals the streamed
//! output's pixels.
#![allow(clippy::all, dead_code, non_camel_case_types, non_snake_case, unused)]
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]
use super::{gs_button_to_evdev, vk_to_evdev, InputEvent, InputInjector};
use anyhow::{Context, Result};
use punktfunk_core::input::InputKind;
use wayland_client::protocol::wl_registry::{self, WlRegistry};
use wayland_client::{Connection, Dispatch, EventQueue, Proxy, QueueHandle};
// Generate the client bindings for the vendored protocol XML inline (no build.rs), exactly like the
// KWin virtual-output backend. Path is relative to CARGO_MANIFEST_DIR.
#[allow(clippy::all, dead_code, non_camel_case_types, non_snake_case, unused)]
pub mod fake {
use wayland_client;
use wayland_client::protocol::*;
pub mod __interfaces {
use wayland_client::protocol::__interfaces::*;
wayland_scanner::generate_interfaces!("protocols/fake-input.xml");
}
use self::__interfaces::*;
wayland_scanner::generate_client_code!("protocols/fake-input.xml");
}
use fake::org_kde_kwin_fake_input::OrgKdeKwinFakeInput as FakeInput;
/// Highest interface version we drive. `keyboard_key` arrived at v4; KWin advertises ≥4.
const MAX_VERSION: u32 = 4;
/// `wl_pointer.axis` values used by `axis`.
const AXIS_VERTICAL: u32 = 0;
const AXIS_HORIZONTAL: u32 = 1;
/// `code` value marking a horizontal scroll event (mirrors `gamestream::input` / the wlr backend).
const SCROLL_HORIZONTAL: u32 = 1;
/// Registry-bound globals (the Wayland dispatch state).
#[derive(Default)]
struct State {
fake: Option<FakeInput>,
}
impl Dispatch<WlRegistry, ()> for State {
fn event(
state: &mut Self,
registry: &WlRegistry,
event: wl_registry::Event,
_: &(),
_: &Connection,
qh: &QueueHandle<Self>,
) {
if let wl_registry::Event::Global {
name,
interface,
version,
} = event
{
if interface == "org_kde_kwin_fake_input" {
state.fake = Some(registry.bind(name, version.min(MAX_VERSION), qh, ()));
}
}
}
}
// fake_input emits no events.
impl Dispatch<FakeInput, ()> for State {
fn event(
_: &mut Self,
_: &FakeInput,
_: <FakeInput as Proxy>::Event,
_: &(),
_: &Connection,
_: &QueueHandle<Self>,
) {
}
}
pub struct KwinFakeInjector {
conn: Connection,
queue: EventQueue<State>,
state: State,
fake: FakeInput,
}
impl KwinFakeInjector {
pub fn open() -> Result<Self> {
let conn = Connection::connect_to_env()
.context("connect to KWin Wayland (is WAYLAND_DISPLAY set to the KWin socket?)")?;
let mut queue = conn.new_event_queue();
let qh = queue.handle();
let _registry = conn.display().get_registry(&qh, ());
let mut state = State::default();
queue
.roundtrip(&mut state)
.context("Wayland registry roundtrip")?;
let fake = state.fake.clone().context(
"KWin does not expose org_kde_kwin_fake_input to this client — install the host's \
.desktop (io.unom.Punktfunk.Host.desktop, X-KDE-Wayland-Interfaces) and re-login so \
KWin authorizes it (the grant is cached per-exe on first connect), or this is not a \
KWin session",
)?;
// Authenticate (the legacy handshake; for an interface-authorized client KWin accepts it
// without a dialog — same as krdpserver/krfb headless).
fake.authenticate("punktfunk".into(), "remote streaming input".into());
queue
.roundtrip(&mut state)
.context("fake_input authenticate roundtrip")?;
conn.flush().ok();
tracing::info!("KWin fake_input ready (headless keyboard/mouse/touch — no portal)");
Ok(Self {
conn,
queue,
state,
fake,
})
}
}
impl InputInjector for KwinFakeInjector {
fn inject(&mut self, event: &InputEvent) -> Result<()> {
match event.kind {
InputKind::MouseMove => {
self.fake.pointer_motion(event.x as f64, event.y as f64);
}
InputKind::MouseMoveAbs => {
let w = (event.flags >> 16) & 0xffff;
let h = event.flags & 0xffff;
if w > 0 && h > 0 {
let x = event.x.clamp(0, w as i32) as f64;
let y = event.y.clamp(0, h as i32) as f64;
self.fake.pointer_motion_absolute(x, y);
}
}
InputKind::MouseButtonDown | InputKind::MouseButtonUp => {
if let Some(btn) = gs_button_to_evdev(event.code) {
let st = u32::from(event.kind == InputKind::MouseButtonDown);
self.fake.button(btn, st);
}
}
InputKind::MouseScroll => {
// GameStream sends WHEEL_DELTA(120)-scaled units; a notch ≈ 15px. Vertical flips
// sign on the Wayland axis, horizontal passes through — same as the wlr backend.
let horizontal = event.code == SCROLL_HORIZONTAL;
let axis = if horizontal {
AXIS_HORIZONTAL
} else {
AXIS_VERTICAL
};
let notches = event.x as f64 / 120.0;
let sign = if horizontal { 1.0 } else { -1.0 };
self.fake.axis(axis, sign * notches * 15.0);
}
InputKind::KeyDown | InputKind::KeyUp => {
// Raw evdev keycode; KWin resolves it through the session's own keymap (and tracks
// modifier state itself, so no separate modifiers request is needed).
if let Some(evdev) = vk_to_evdev(event.code as u8) {
let st = u32::from(event.kind == InputKind::KeyDown);
self.fake.keyboard_key(evdev as u32, st);
} else {
tracing::debug!(vk = event.code, "unmapped VK keycode — dropped");
}
}
// Touch: id = event.code, coords in the client surface w×h packed into flags (same
// absolute mapping as MouseMoveAbs). Each event is its own frame.
InputKind::TouchDown | InputKind::TouchMove => {
let w = (event.flags >> 16) & 0xffff;
let h = event.flags & 0xffff;
if w > 0 && h > 0 {
let x = event.x.clamp(0, w as i32) as f64;
let y = event.y.clamp(0, h as i32) as f64;
if event.kind == InputKind::TouchDown {
self.fake.touch_down(event.code, x, y);
} else {
self.fake.touch_motion(event.code, x, y);
}
self.fake.touch_frame();
}
}
InputKind::TouchUp => {
self.fake.touch_up(event.code);
self.fake.touch_frame();
}
// Gamepads are injected through uinput, not the compositor.
InputKind::GamepadButton | InputKind::GamepadAxis => {}
}
// Surface protocol errors / disconnects, then push the batch to the compositor.
self.queue
.dispatch_pending(&mut self.state)
.context("wayland dispatch")?;
self.conn.flush().context("wayland flush")?;
Ok(())
}
}
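Three pieces of arithmetic in `inject` are easy to get wrong: unpacking the client surface size from `flags`, clamping absolute coordinates, and converting WHEEL_DELTA units to an axis value. The sketch below pulls them into pure functions. The function names are hypothetical; the constants (16-bit halves, 120 units per notch, 15 px per notch, sign flip on the vertical axis) are taken from the code above.

```rust
/// Unpack the client surface size from `flags`: width in the high 16 bits,
/// height in the low 16 bits. Returns `None` if either dimension is zero,
/// matching the `w > 0 && h > 0` guard in the injector.
pub fn surface_size(flags: u32) -> Option<(u32, u32)> {
    let w = (flags >> 16) & 0xffff;
    let h = flags & 0xffff;
    (w > 0 && h > 0).then_some((w, h))
}

/// Clamp an absolute coordinate into `[0, max]` and widen it to `f64`.
pub fn clamp_abs(v: i32, max: u32) -> f64 {
    v.clamp(0, max as i32) as f64
}

/// Convert a WHEEL_DELTA(120)-scaled delta into an axis value at 15 px per
/// notch. The vertical axis flips sign; the horizontal axis passes through.
pub fn scroll_axis_value(delta: i32, horizontal: bool) -> f64 {
    let notches = delta as f64 / 120.0;
    let sign = if horizontal { 1.0 } else { -1.0 };
    sign * notches * 15.0
}
```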
@@ -305,6 +305,19 @@ async fn connect_socket_file(file: &std::path::Path) -> Result<UnixStream> {
let deadline = std::time::Instant::now() + Duration::from_secs(15);
let mut logged = String::new();
loop {
// Defense-in-depth: never follow a symlinked relay file. It lives under `$XDG_RUNTIME_DIR`
// (per-user 0700) so a cross-user plant is already blocked, but refuse a symlink outright
// rather than read through one to an attacker-chosen target (a rogue EIS server would
// keylog/deny the session's input; security-review 2026-06-28 #6).
if std::fs::symlink_metadata(file)
.map(|m| m.file_type().is_symlink())
.unwrap_or(false)
{
return Err(anyhow!(
"EIS relay file {} is a symlink — refusing to follow it",
file.display()
));
}
if let Ok(s) = std::fs::read_to_string(file) {
let name = s.trim();
if !name.is_empty() {
+22 -3
@@ -577,10 +577,11 @@ impl LibraryProvider for EpicProvider {
if p.extension().and_then(|e| e.to_str()) != Some("item") {
continue;
}
let Ok(text) = std::fs::read_to_string(&p) else {
// `.item` manifests are small JSON; cap the read so a planted giant can't OOM the host.
let Some(bytes) = read_capped(&p, 1024 * 1024) else {
continue;
};
let Ok(v) = serde_json::from_str::<serde_json::Value>(&text) else {
let Ok(v) = serde_json::from_slice::<serde_json::Value>(&bytes) else {
continue;
};
if let Some(g) = epic_entry(&v, &art) {
@@ -650,6 +651,23 @@ fn epic_entry(
})
}
/// Read a launcher cache/manifest with a hard size cap, so a local unprivileged user can't plant a
/// multi-GB file under the launcher's (Users-writable) data dir that OOMs the privileged host when
/// it's loaded — then base64/JSON-decoded into further copies — during library enumeration
/// (security-review 2026-06-28 S4). Returns `None` if missing, empty, or over `max`. Mirrors the
/// Linux lutris-art reader's 1 MiB cap.
#[cfg(windows)]
fn read_capped(path: &Path, max: u64) -> Option<Vec<u8>> {
let meta = std::fs::metadata(path).ok()?;
if meta.len() == 0 || meta.len() > max {
if meta.len() > max {
tracing::warn!(path = %path.display(), len = meta.len(), max, "launcher cache exceeds size cap — skipping");
}
return None;
}
std::fs::read(path).ok()
}
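`read_capped` checks the size via metadata and then reads the whole file. As a sketch of one possible tightening (not what the commit does), the cap can also be enforced on the bytes actually read, so a file that grows between the size check and the read still cannot exceed `max`. `read_capped_bounded` is a hypothetical name.

```rust
use std::fs::File;
use std::io::Read;
use std::path::Path;

/// Hypothetical variant of a capped read: the limit applies to the bytes
/// actually read. Returns `None` if missing, empty, or over `max`.
fn read_capped_bounded(path: &Path, max: u64) -> Option<Vec<u8>> {
    let file = File::open(path).ok()?;
    let mut buf = Vec::new();
    // Read at most `max + 1` bytes; seeing the extra byte means the file is over the cap.
    file.take(max.saturating_add(1)).read_to_end(&mut buf).ok()?;
    if buf.is_empty() || buf.len() as u64 > max {
        return None;
    }
    Some(buf)
}
```

The trade-off is that an oversized file is read up to `max + 1` bytes before being rejected, whereas the metadata check rejects it without reading anything.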
/// Best-effort parse of `catcache.bin` (base64-encoded JSON array of catalog items) into
/// catalogItemId → [`Artwork`] from each item's `keyImages`. Empty map on any read/decode failure
/// (the format is community-reverse-engineered + can lag a fresh install → titles just show no art).
@@ -657,7 +675,8 @@ fn epic_entry(
fn epic_art_index(catcache: &Path) -> std::collections::HashMap<String, Artwork> {
use base64::Engine as _;
let mut map = std::collections::HashMap::new();
let Ok(raw) = std::fs::read(catcache) else {
// 32 MiB cap: comfortably fits a real catalog cache, blocks a planted giant (S4).
let Some(raw) = read_capped(catcache, 32 * 1024 * 1024) else {
return map;
};
let Ok(decoded) = base64::engine::general_purpose::STANDARD.decode(raw) else {
+10 -1
@@ -35,6 +35,9 @@ mod gamestream;
mod hdr;
mod inject;
#[cfg(target_os = "windows")]
#[path = "windows/install.rs"]
mod install;
#[cfg(target_os = "windows")]
#[path = "windows/interactive.rs"]
mod interactive;
mod library;
@@ -390,7 +393,7 @@ fn real_main() -> Result<()> {
}
// USER-session WGC helper (Windows two-process secure-desktop design): capture the EXISTING
// SudoVDA via WGC + NVENC, stream AUs on stdout to the SYSTEM host. Spawned by the host
// (CreateProcessAsUser), not run by hand. See design/windows-secure-desktop.md.
// (CreateProcessAsUser), not run by hand. See design/archive/windows-secure-desktop.md.
#[cfg(target_os = "windows")]
Some("wgc-helper") => {
let get = |flag: &str| {
@@ -422,6 +425,12 @@ fn real_main() -> Result<()> {
// that launches the host into the active interactive session.
#[cfg(target_os = "windows")]
Some("service") => service::main(&args[1..]),
// Install-time work the Windows installer delegates to the exe instead of locale-parsed
// PowerShell *files* (the ANSI-codepage parse-break root fix; see windows/install.rs).
#[cfg(target_os = "windows")]
Some("driver") => install::driver_main(&args[1..]),
#[cfg(target_os = "windows")]
Some("web") => install::web_main(&args[1..]),
Some("-h") | Some("--help") | Some("help") | None => {
print_usage();
Ok(())
+2
@@ -1680,6 +1680,7 @@ mod tests {
height: 1440,
fps: 120,
appid: 1,
peer_ip: None,
});
state.streaming.store(true, Ordering::SeqCst);
@@ -1805,6 +1806,7 @@ mod tests {
height: 1080,
fps: 60,
appid: 1,
peer_ip: None,
});
let del = axum::http::Request::delete("/api/v1/session")
+12 -17
@@ -11,9 +11,6 @@
use anyhow::{Context, Result};
use rand::RngCore;
use std::fs;
use std::io::Write;
#[cfg(unix)]
use std::os::unix::fs::{OpenOptionsExt, PermissionsExt};
use std::path::Path;
const ENV_VAR: &str = "PUNKTFUNK_MGMT_TOKEN";
@@ -38,9 +35,10 @@ pub fn load_or_generate() -> Result<String> {
rand::thread_rng().fill_bytes(&mut buf);
let token = hex::encode(buf);
let dir = crate::gamestream::config_dir();
fs::create_dir_all(&dir).with_context(|| format!("create {}", dir.display()))?;
// Owner-private dir (0700 Unix / DACL-locked Windows) so the token can't leak via the config path.
crate::gamestream::create_private_dir(&dir).with_context(|| format!("create {}", dir.display()))?;
write_token(&path, &token)?;
tracing::info!(path = %path.display(), "generated and persisted management API token (0600)");
tracing::info!(path = %path.display(), "generated and persisted management API token (owner-only)");
Ok(token)
}
@@ -55,19 +53,15 @@ fn parse_token(contents: &str) -> Option<String> {
(!tok.is_empty()).then(|| tok.to_string())
}
/// Write `PUNKTFUNK_MGMT_TOKEN=<token>` to `path`, mode 0600 (never briefly world-readable).
/// Write `PUNKTFUNK_MGMT_TOKEN=<token>` to `path` as an owner-only secret — 0600 on Unix AND
/// DACL-locked to SYSTEM/Administrators on Windows. Routes through the shared `write_secret_file` so
/// the mgmt bearer token (full admin authority) gets the SAME Windows lockdown as the host key; the
/// bespoke `cfg(unix)`-only writer used to leave it readable by any local user (security-review
/// 2026-06-28 #2).
fn write_token(path: &Path, token: &str) -> Result<()> {
let mut opts = fs::OpenOptions::new();
opts.write(true).create(true).truncate(true);
#[cfg(unix)]
opts.mode(0o600);
let mut f = opts
.open(path)
.with_context(|| format!("write {}", path.display()))?;
writeln!(f, "PUNKTFUNK_MGMT_TOKEN={token}")?;
#[cfg(unix)]
let _ = fs::set_permissions(path, fs::Permissions::from_mode(0o600));
Ok(())
let line = format!("PUNKTFUNK_MGMT_TOKEN={token}\n");
crate::gamestream::write_secret_file(path, line.as_bytes())
.with_context(|| format!("write {}", path.display()))
}
#[cfg(test)]
@@ -95,6 +89,7 @@ mod tests {
assert_eq!(parse_token(&read).as_deref(), Some("cafef00d"));
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let mode = fs::metadata(&path).unwrap().permissions().mode() & 0o777;
assert_eq!(mode, 0o600);
}
+349 -57
@@ -355,6 +355,15 @@ fn resolve_bitrate_kbps(requested: u32) -> u32 {
}
}
/// Resolve the audio channel count the session will capture + encode from the client's request.
/// Normalizes to one of 2 (stereo) / 6 (5.1) / 8 (7.1); anything else (older client, garbage)
/// becomes stereo. Both backends can produce the requested count (PipeWire pads/upmixes positions,
/// WASAPI loopback up/downmixes via AUTOCONVERTPCM), so no capability clamp is needed here — the
/// surround channels just carry up/downmixed content when the host's sink has fewer real channels.
fn resolve_audio_channels(requested: u8) -> u8 {
punktfunk_core::audio::normalize_channels(requested)
}
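The body of `punktfunk_core::audio::normalize_channels` is not part of this diff. Going only by the doc comment above (2, 6, or 8 are kept, anything else becomes stereo), a stand-in might look like the following. `normalize_channels_sketch` is a hypothetical name and the real function may differ.

```rust
/// Hypothetical stand-in for `normalize_channels`, based on the doc comment only:
/// keep 2 (stereo), 6 (5.1), or 8 (7.1); map everything else to stereo.
fn normalize_channels_sketch(requested: u8) -> u8 {
    match requested {
        2 | 6 | 8 => requested,
        _ => 2,
    }
}
```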
/// Static FEC override: `PUNKTFUNK_FEC_PCT`, when set, PINS the recovery percent and DISABLES
/// adaptive FEC — so a speed test / measurement keeps a fixed, known overhead. `None` ⇒ adaptive
/// FEC (the host sizes recovery to the loss the client reports). `0` disables FEC entirely.
@@ -488,7 +497,7 @@ async fn serve_session(
opts: &Punktfunk1Options,
audio_cap: &AudioCapSlot,
inj_tx: std::sync::mpsc::Sender<InputEvent>,
mic_tx: std::sync::mpsc::Sender<Vec<u8>>,
mic_tx: std::sync::mpsc::SyncSender<Vec<u8>>,
host_fp: &[u8; 32],
np: &NativePairing,
last_pairing: &std::sync::Mutex<Option<std::time::Instant>>,
@@ -588,9 +597,11 @@ async fn serve_session(
// we look it up in OUR library so a client can't inject a command). The bare-spawn gamescope
// backend picks this up via the `PUNKTFUNK_GAMESCOPE_APP` env fallback in `spawn` (on a shared
// desktop / attach-to-existing session it's a harmless no-op). This is the process-global env
// path — safe under today's ONE-session-at-a-time model; when concurrent native sessions land
// (`what's left` §3), resolve the command into the per-session VirtualDisplay via
// `set_launch_command` (as the GameStream path now does) so sessions can't stomp each other.
// path; the write is serialized via `vdisplay::with_env_lock` so concurrent native-session
// handshakes can't race the `set_var` (security-review 2026-06-28 #7). The remaining
// cross-session *value* confusion (B's launch id stomping A's pending gamescope spawn) wants
// the command resolved into the per-session VirtualDisplay via `set_launch_command` (as the
// GameStream path does) — a follow-up; the data-race UB is closed here.
if let Some(id) = hello.launch.as_deref() {
// Linux: resolve the id to a gamescope-nested command and stash it in the env the
// gamescope backend reads. Windows has no gamescope to nest into — the data plane launches
@@ -600,7 +611,9 @@ async fn serve_session(
match crate::library::launch_command(id) {
Some(cmd) => {
tracing::info!(launch_id = id, command = %cmd, "launching library title");
std::env::set_var("PUNKTFUNK_GAMESCOPE_APP", &cmd);
crate::vdisplay::with_env_lock(|| {
std::env::set_var("PUNKTFUNK_GAMESCOPE_APP", &cmd)
});
}
None => tracing::warn!(
launch_id = id,
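`vdisplay::with_env_lock` is defined outside this diff. A minimal sketch of the pattern it names, assuming a single process-wide mutex that wraps a closure, could look like this. `ENV_LOCK` and `with_env_lock_sketch` are hypothetical names.

```rust
use std::sync::Mutex;

/// Hypothetical process-wide lock; the real `with_env_lock` is not shown in the diff.
static ENV_LOCK: Mutex<()> = Mutex::new(());

/// Run `f` while holding the lock, so two callers never run their closures concurrently.
fn with_env_lock_sketch<R>(f: impl FnOnce() -> R) -> R {
    // A poisoned lock only means another holder panicked; the guard is still usable.
    let _guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    f()
}
```

A lock like this only serializes code that goes through it, so it helps to the extent that every environment write in the process is wrapped the same way.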
@@ -623,6 +636,17 @@ async fn serve_session(
"encoder bitrate"
);
// Resolve the audio channel count (client request → stereo / 5.1 / 7.1). The capturer opens
// at this count: PipeWire synthesizes the requested positions (padding with silence when the
// sink has fewer), WASAPI loopback up/downmixes via AUTOCONVERTPCM — so a client always gets
// the channels it asked for, and the Welcome echoes the value the audio thread will encode.
let audio_channels = resolve_audio_channels(hello.audio_channels);
tracing::info!(
requested = hello.audio_channels,
resolved = audio_channels,
"audio channels"
);
// Resolve the encode bit depth: HEVC Main10 only when the client advertised it AND the host
// opted in (PUNKTFUNK_10BIT). A client that can't decode 10-bit (caps bit clear, or an older
// client) always gets the 8-bit stream. PUNKTFUNK_10BIT is the host policy gate until a
@@ -642,6 +666,44 @@ async fn serve_session(
"encode bit depth"
);
// Resolve the chroma subsampling: full-chroma HEVC 4:4:4 only when ALL of — the host opted in
// (PUNKTFUNK_444), the client advertised VIDEO_CAP_444, the session is single-process (the
// two-process WGC relay encodes 4:2:0 in v1), and the active GPU/driver actually supports a
// 4:4:4 encode (probed, cached). The native path always encodes HEVC. We resolve this BEFORE
// the Welcome so `chroma_format` reflects what we'll really emit — the honest-downgrade
// channel: if any gate fails the client is told 4:2:0 before it builds its decoder. The probe
// opens a tiny encoder; it runs only when both opt-ins are set and is cached after the first.
let host_wants_444 = crate::config::config().four_four_four;
let client_supports_444 = hello.video_caps & punktfunk_core::quic::VIDEO_CAP_444 != 0;
let single_process = crate::session_plan::resolve_topology()
== crate::session_plan::SessionTopology::SingleProcess;
// The GPU probe opens a real (tiny) encoder on first use, so run it off the reactor like the
// compositor probe above (blocking probes → spawn_blocking). Short-circuit so it only runs when
// the cheap gates already pass. The result is cached process-wide (a negative latches until
// restart — acceptable: a GPU either supports HEVC 4:4:4 or it doesn't, and a transient open
// failure here is rare since the session's own encoder isn't open yet).
let gpu_supports_444 = if host_wants_444 && client_supports_444 && single_process {
tokio::task::spawn_blocking(|| {
crate::encode::can_encode_444(crate::encode::Codec::H265)
})
.await
.context("4:4:4 capability probe task")?
} else {
false
};
let chroma = if gpu_supports_444 {
crate::encode::ChromaFormat::Yuv444
} else {
crate::encode::ChromaFormat::Yuv420
};
tracing::info!(
chroma = ?chroma,
host_wants_444,
client_supports_444,
single_process,
"encode chroma"
);
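The 4:4:4 decision above is a chain of cheap gates followed by one expensive probe. The sketch below restates that short-circuit in isolation. `ChromaSketch`, `VIDEO_CAP_444_SKETCH` (whose bit value is an assumption, the real constant is not shown), and `resolve_chroma_sketch` are all hypothetical, and the probe is passed in as a closure so that its call count can be observed.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChromaSketch {
    Yuv420,
    Yuv444,
}

/// Hypothetical capability bit; the real `VIDEO_CAP_444` value is not shown in the diff.
const VIDEO_CAP_444_SKETCH: u32 = 1 << 0;

/// Resolve chroma from the cheap gates first. `probe` (the expensive GPU check)
/// runs only when host opt-in, client capability, and topology all pass.
fn resolve_chroma_sketch(
    host_wants_444: bool,
    client_caps: u32,
    single_process: bool,
    probe: impl FnOnce() -> bool,
) -> ChromaSketch {
    let client_supports_444 = client_caps & VIDEO_CAP_444_SKETCH != 0;
    if host_wants_444 && client_supports_444 && single_process && probe() {
        ChromaSketch::Yuv444
    } else {
        ChromaSketch::Yuv420
    }
}
```

Any failing gate yields 4:2:0, which matches the "honest-downgrade" intent: the value sent in the Welcome is the one the encoder will use.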
// Reserve a UDP port for the data plane (bind, read it back, rebind in UdpTransport).
let probe = std::net::UdpSocket::bind("0.0.0.0:0")?;
let udp_port = probe.local_addr()?.port();
@@ -691,6 +753,12 @@ async fn serve_session(
} else {
ColorInfo::SDR_BT709
},
// The chroma the encoder will actually emit (resolved + GPU-probed above) — 4:4:4 only
// when every gate passed, else 4:2:0. The client sizes its decoder from this.
chroma_format: chroma.idc(),
// The resolved audio channel count the audio thread will capture + Opus-(multi)stream
// encode (2/6/8). The client builds its decoder from this echoed value.
audio_channels,
};
io::write_msg(&mut send, &welcome.encode()).await?;
@@ -843,8 +911,9 @@ async fn serve_session(
while let Ok(d) = input_conn.read_datagram().await {
if let Some((_seq, _pts, opus)) = punktfunk_core::quic::decode_mic_datagram(&d) {
mic_count += 1;
// Host-lifetime mic service; a send error just means the host is shutting down.
let _ = mic_tx.send(opus.to_vec());
// Host-lifetime mic service (bounded queue): `try_send` drops the frame when the
// service is full or gone, never blocking this datagram loop (security-review S6).
let _ = mic_tx.try_send(opus.to_vec());
} else if let Some(rich) = punktfunk_core::quic::RichInput::decode(&d) {
rich_count += 1;
if rich_tx.send(rich).is_err() {
@@ -884,9 +953,10 @@ async fn serve_session(
let conn = conn.clone();
let stop = stop.clone();
let cap = audio_cap.clone();
let channels = welcome.audio_channels;
std::thread::Builder::new()
.name("punktfunk1-audio".into())
.spawn(move || audio_thread(conn, stop, cap))
.spawn(move || audio_thread(conn, stop, cap, channels))
.map_err(|e| tracing::error!(error = %e, "audio thread spawn failed — session continues without audio"))
.ok()
} else {
@@ -946,6 +1016,13 @@ async fn serve_session(
let launch_for_dp = hello.launch.clone();
let bitrate_kbps = welcome.bitrate_kbps; // resolved encoder bitrate (Hello clamped, or default)
let bit_depth = welcome.bit_depth; // resolved encode bit depth (8, or 10 when negotiated)
// Resolved chroma — derive the typed value back from the wire byte the Welcome carried (so the
// session uses exactly what the client was told). `Yuv444` only when the handshake gate passed.
let chroma = if welcome.chroma_format == punktfunk_core::quic::CHROMA_IDC_444 {
crate::encode::ChromaFormat::Yuv444
} else {
crate::encode::ChromaFormat::Yuv420
};
let stop_stream = stop.clone();
let fec_target_dp = fec_target.clone(); // data-plane handle to the adaptive-FEC target
let conn_stream = conn.clone(); // for sending the source's real HDR metadata (0xCE) mid-stream
@@ -1005,6 +1082,7 @@ async fn serve_session(
compositor,
bitrate_kbps,
bit_depth,
chroma,
probe_rx,
probe_result_tx,
fec_target: fec_target_dp,
@@ -1112,6 +1190,8 @@ const INJECTOR_REOPEN_BACKOFF: std::time::Duration = std::time::Duration::from_s
/// Mic is 48 kHz stereo — matches the Opus stereo decoder and the host→client audio layout.
const MIC_CHANNELS: u32 = 2;
/// Bound for the shared mic frame queue (drop-newest when full). See [`MicService::start`].
const MIC_QUEUE_CAP: usize = 64;
/// Host-lifetime virtual microphone, shared across punktfunk/1 sessions (mirror of
/// [`InjectorService`]). One thread owns the PipeWire `Audio/Source` + an Opus decoder; sessions
@@ -1119,12 +1199,16 @@ const MIC_CHANNELS: u32 = 2;
/// feeds the source. Opened lazily on the first frame, the source node persists across sessions
/// (no per-session registration churn), and reopens after a backoff if the source/decoder fails.
struct MicService {
tx: std::sync::mpsc::Sender<Vec<u8>>,
tx: std::sync::mpsc::SyncSender<Vec<u8>>,
}
impl MicService {
fn start() -> MicService {
let (tx, rx) = std::sync::mpsc::channel::<Vec<u8>>();
// Bounded so the host-lifetime mic queue (shared across all concurrent sessions) can't grow
// without limit under a near-line-rate flood; the producer drops the newest frame when full
// (audio is lossy by design) rather than buffering unboundedly (security-review 2026-06-28
// S6). 64 × 5-10 ms frames ≈ 0.3-0.6 s of slack, far more than the decode loop ever lags.
let (tx, rx) = std::sync::mpsc::sync_channel::<Vec<u8>>(MIC_QUEUE_CAP);
if let Err(e) = std::thread::Builder::new()
.name("punktfunk1-mic".into())
.spawn(move || mic_service_thread(rx))
@@ -1136,7 +1220,7 @@ impl MicService {
/// A sender a session forwards the client's Opus mic frames to. Cloned per session; dropping a
/// clone does NOT stop the service (it holds the original sender for the host life).
fn sender(&self) -> std::sync::mpsc::Sender<Vec<u8>> {
fn sender(&self) -> std::sync::mpsc::SyncSender<Vec<u8>> {
self.tx.clone()
}
}
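The switch from `channel` to `sync_channel` plus `try_send` gives drop-newest behavior when the queue is full. A small self-contained demonstration using only the standard library follows. `offer_frames` is a hypothetical helper that fills a queue with no consumer running and counts what was accepted and what was dropped.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Push `frames` into a bounded queue of capacity `cap` with no consumer running
/// and return (accepted, dropped). `try_send` never blocks: a full queue rejects
/// the newest frame and hands it back inside the error.
fn offer_frames(cap: usize, frames: usize) -> (usize, usize) {
    // `_rx` is kept alive so the channel stays connected for the whole loop.
    let (tx, _rx) = sync_channel::<Vec<u8>>(cap);
    let (mut accepted, mut dropped) = (0, 0);
    for _ in 0..frames {
        match tx.try_send(vec![0u8; 4]) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => dropped += 1,
        }
    }
    (accepted, dropped)
}
```

With a capacity of 64, as in `MIC_QUEUE_CAP`, offering 100 frames to a stalled consumer accepts the first 64 and drops the remaining 36.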
@@ -1151,14 +1235,17 @@ fn mic_service_thread(rx: std::sync::mpsc::Receiver<Vec<u8>>) {
/// The host-lifetime mic worker: lazily open the virtual mic + decoder, then Opus-decode each
/// forwarded frame and push the PCM into the source. Reopen (after [`INJECTOR_REOPEN_BACKOFF`])
/// on open failure or a decode error. Exits when every session sender and the service's own
/// sender drop (host shutdown), tearing the virtual mic down. Linux = PipeWire `Audio/Source`;
/// Windows = a virtual audio device's render endpoint (see `audio::wasapi_mic`).
/// only on a backend OPEN failure; a per-frame Opus DECODE error is just a dropped frame (it must
/// not tear down this mic, which is shared across every concurrent session — otherwise one paired
/// client's junk frames would deny everyone's mic; security-review 2026-06-28 S2). Exits when every
/// session sender and the service's own sender drop (host shutdown), tearing the virtual mic down.
/// Linux = PipeWire `Audio/Source`; Windows = a virtual audio device's render endpoint.
#[cfg(any(target_os = "linux", target_os = "windows"))]
fn mic_service_thread(rx: std::sync::mpsc::Receiver<Vec<u8>>) {
let mut mic: Option<Box<dyn crate::audio::VirtualMic>> = None;
let mut decoder: Option<opus::Decoder> = None;
let mut last_failed: Option<std::time::Instant> = None;
let mut decode_fails: u64 = 0;
let mut pcm = vec![0f32; 5760 * MIC_CHANNELS as usize]; // up to 120 ms scratch
for opus_frame in rx {
if opus_frame.is_empty() {
@@ -1194,12 +1281,16 @@ fn mic_service_thread(rx: std::sync::mpsc::Receiver<Vec<u8>>) {
Ok(samples_per_ch) => {
let total = (samples_per_ch * MIC_CHANNELS as usize).min(pcm.len());
m.push(&pcm[..total]);
decode_fails = 0;
}
Err(e) => {
tracing::warn!(error = %e, "mic opus decode failed — reopening");
mic = None;
decoder = None;
last_failed = Some(std::time::Instant::now());
// Malformed/garbage frame: drop it and keep the (shared) mic + decoder open. The
// next valid frame decodes normally; only a backend OPEN failure reopens. Throttle
// the log (1, 2, 4, … fails) so a junk flood can't spam.
decode_fails += 1;
if decode_fails.is_power_of_two() {
tracing::warn!(error = %e, fails = decode_fails, "mic opus decode failed — dropping frame");
}
}
}
}
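The power-of-two throttle on `decode_fails` makes the number of log lines grow logarithmically with the number of consecutive failures. The sketch below counts how many lines a run of failures would emit under that rule. `logged_lines` is a hypothetical helper for illustration.

```rust
/// Count how many of `fails` consecutive failures would be logged when only
/// the 1st, 2nd, 4th, 8th, ... failure emits a line.
fn logged_lines(fails: u64) -> u64 {
    (1..=fails).filter(|n| n.is_power_of_two()).count() as u64
}
```

For example, 1000 consecutive bad frames produce 10 log lines (failures 1, 2, 4, and so on up to 512). The counter resets to zero on the next successful decode.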
@@ -1381,8 +1472,14 @@ fn input_thread(
// left-button-down then turns every later click into a drag: windows move, but clicking buttons
// and text inputs does nothing). We synthesize the matching up-events when this session ends —
// see the release loop after the `break`.
let mut held_buttons: Vec<u32> = Vec::new();
let mut held_keys: Vec<u32> = Vec::new();
// Sets (not Vecs) so the presence test is O(1), not O(n) per event, and bounded by `MAX_HELD`
// so a client flooding distinct never-released codes can't grow the tracking state or spike the
// input thread (security-review 2026-06-28 S3). A real keyboard+mouse holds far fewer at once;
// codes past the cap simply aren't tracked for end-of-session release (worst case: one unreleased
// key on a pathological disconnect, which the injector's own state still bounds).
const MAX_HELD: usize = 256;
let mut held_buttons: std::collections::HashSet<u32> = std::collections::HashSet::new();
let mut held_keys: std::collections::HashSet<u32> = std::collections::HashSet::new();
loop {
match rx.recv_timeout(std::time::Duration::from_millis(4)) {
Ok(ev) => match ev.kind {
@@ -1400,14 +1497,18 @@ fn input_thread(
_ => {
// Track press/release so a mid-press disconnect can be undone below.
match ev.kind {
InputKind::MouseButtonDown if !held_buttons.contains(&ev.code) => {
held_buttons.push(ev.code)
InputKind::MouseButtonDown if held_buttons.len() < MAX_HELD => {
held_buttons.insert(ev.code);
}
InputKind::MouseButtonUp => held_buttons.retain(|&c| c != ev.code),
InputKind::KeyDown if !held_keys.contains(&ev.code) => {
held_keys.push(ev.code)
InputKind::MouseButtonUp => {
held_buttons.remove(&ev.code);
}
InputKind::KeyDown if held_keys.len() < MAX_HELD => {
held_keys.insert(ev.code);
}
InputKind::KeyUp => {
held_keys.remove(&ev.code);
}
InputKind::KeyUp => held_keys.retain(|&c| c != ev.code),
_ => {}
}
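The bounded-set tracking above can be read as a small data structure of its own. A sketch under that reading follows. `HeldSet` is a hypothetical wrapper, not a type in the host, and `drain_held` sorts its output only to make the result deterministic.

```rust
use std::collections::HashSet;

/// Hypothetical wrapper around the bounded-set pattern: track held codes up to `max`.
struct HeldSet {
    codes: HashSet<u32>,
    max: usize,
}

impl HeldSet {
    fn new(max: usize) -> Self {
        Self { codes: HashSet::new(), max }
    }

    /// Record a press. Returns whether the code is tracked after the call;
    /// a new code arriving at the cap is not tracked.
    fn press(&mut self, code: u32) -> bool {
        if self.codes.len() < self.max {
            self.codes.insert(code);
            true
        } else {
            self.codes.contains(&code)
        }
    }

    fn release(&mut self, code: u32) {
        self.codes.remove(&code);
    }

    /// Codes still held, i.e. the ones that need a synthesized release at session end.
    fn drain_held(&mut self) -> Vec<u32> {
        let mut v: Vec<u32> = self.codes.drain().collect();
        v.sort_unstable();
        v
    }
}
```

Inserting an already-present code is a no-op for a set, which is why the earlier `!contains` guard from the `Vec` version is no longer needed.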
// Pointer/keyboard → the host-lifetime injector service (one persistent
@@ -1493,33 +1594,88 @@ fn input_thread(
}
}
/// The audio thread: desktop capture → Opus (48 kHz stereo, 5 ms, CBR — same tuning as the
/// GameStream path) → `AUDIO_MAGIC` datagrams. QUIC already encrypts; no extra layer.
/// The capturer comes from (and returns to) the persistent slot — see [`AudioCapSlot`].
/// Opus encoder for the native audio plane: a plain stereo encoder (the live-validated,
/// byte-identical path) or a libopus *multistream* encoder for 5.1/7.1, both behind one
/// `encode_float`. Surround uses the safe `opus::MSEncoder` (no `audiopus_sys`).
#[cfg(any(target_os = "linux", target_os = "windows"))]
fn audio_thread(conn: quinn::Connection, stop: Arc<AtomicBool>, audio_cap: AudioCapSlot) {
use crate::audio::{CHANNELS, SAMPLE_RATE};
enum NativeAudioEnc {
Stereo(opus::Encoder),
Surround(opus::MSEncoder),
}
#[cfg(any(target_os = "linux", target_os = "windows"))]
impl NativeAudioEnc {
/// Build the encoder for `channels` (2/6/8), hard-CBR + RESTRICTED_LOWDELAY like the
/// GameStream path; bitrate from the shared layout table (stereo keeps the validated 128 kbps).
fn new(channels: u8) -> Result<NativeAudioEnc, opus::Error> {
if channels == 2 {
let mut e = opus::Encoder::new(
crate::audio::SAMPLE_RATE,
opus::Channels::Stereo,
opus::Application::LowDelay,
)?;
e.set_bitrate(opus::Bitrate::Bits(128_000)).ok();
e.set_vbr(false).ok();
Ok(NativeAudioEnc::Stereo(e))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
let mut e = opus::MSEncoder::new(
crate::audio::SAMPLE_RATE,
l.streams,
l.coupled,
l.mapping,
opus::Application::LowDelay,
)?;
e.set_bitrate(opus::Bitrate::Bits(l.bitrate)).ok();
e.set_vbr(false).ok();
Ok(NativeAudioEnc::Surround(e))
}
}
fn encode_float(&mut self, frame: &[f32], out: &mut [u8]) -> Result<usize, opus::Error> {
match self {
NativeAudioEnc::Stereo(e) => e.encode_float(frame, out),
NativeAudioEnc::Surround(e) => e.encode_float(frame, out),
}
}
}
/// The audio thread: desktop capture → Opus (48 kHz, 5 ms, CBR — same tuning as the GameStream
/// path) → `AUDIO_MAGIC` datagrams, at the negotiated `channels` (2 stereo / 6 = 5.1 / 8 = 7.1,
/// canonical wire order FL FR FC LFE RL RR SL SR). QUIC already encrypts; no extra layer. The
/// capturer comes from (and returns to) the persistent slot — see [`AudioCapSlot`].
#[cfg(any(target_os = "linux", target_os = "windows"))]
fn audio_thread(
conn: quinn::Connection,
stop: Arc<AtomicBool>,
audio_cap: AudioCapSlot,
channels: u8,
) {
use crate::audio::SAMPLE_RATE;
const FRAME_MS: usize = 5;
const SAMPLES_PER_FRAME: usize = SAMPLE_RATE as usize * FRAME_MS / 1000; // 240
let want = punktfunk_core::audio::normalize_channels(channels);
// Reuse the cached capturer ONLY when its channel count matches this session's; a stereo
// capturer left by a prior session must not feed a 5.1/7.1 session (the encoder + the client's
// decoder are sized for `want`, so a mismatched capturer would garble/desync the audio).
let capturer = match audio_cap.lock().unwrap().take() {
Some(mut c) => {
Some(mut c) if c.channels() == want as u32 => {
c.drain(); // discard audio captured between sessions
c
}
None => match crate::audio::open_audio_capture(CHANNELS as u32) {
Ok(c) => c,
Err(e) => {
tracing::warn!(error = %format!("{e:#}"), "punktfunk/1 audio unavailable — session continues without it");
return;
prev => {
drop(prev); // wrong channel count (or none): clean teardown, open fresh at `want`
match crate::audio::open_audio_capture(want as u32) {
Ok(c) => c,
Err(e) => {
tracing::warn!(error = %format!("{e:#}"), "punktfunk/1 audio unavailable — session continues without it");
return;
}
}
},
}
};
let mut enc = match opus::Encoder::new(
SAMPLE_RATE,
opus::Channels::Stereo,
opus::Application::LowDelay,
) {
let mut enc = match NativeAudioEnc::new(want) {
Ok(e) => e,
Err(e) => {
tracing::error!(error = %e, "opus encoder");
@@ -1527,12 +1683,11 @@ fn audio_thread(conn: quinn::Connection, stop: Arc<AtomicBool>, audio_cap: Audio
return;
}
};
enc.set_bitrate(opus::Bitrate::Bits(128_000)).ok();
enc.set_vbr(false).ok();
let frame_len = SAMPLES_PER_FRAME * CHANNELS;
let frame_len = SAMPLES_PER_FRAME * want as usize;
let mut acc: Vec<f32> = Vec::with_capacity(frame_len * 4);
let mut opus_buf = vec![0u8; 1500];
// Sized for the largest surround frame (7.1 HQ ≈ 1.3 KB at 5 ms); ample for normal quality.
let mut opus_buf = vec![0u8; 4096];
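The buffer sizes in this hunk follow from simple arithmetic on the 48 kHz sample rate and the 5 ms frame. The sketch below works the numbers. `frame_len` and `cbr_packet_bytes` are hypothetical helpers, and the packet-size formula is the approximate payload of a hard-CBR stream (bitrate times frame duration), not a value read from libopus.

```rust
const SAMPLE_RATE_HZ: usize = 48_000;
const FRAME_MS: usize = 5;

/// Interleaved f32 samples in one 5 ms frame for `channels` channels
/// (240 samples per channel at 48 kHz).
fn frame_len(channels: usize) -> usize {
    SAMPLE_RATE_HZ * FRAME_MS / 1000 * channels
}

/// Approximate payload size in bytes of one hard-CBR packet at `bitrate_bps`.
fn cbr_packet_bytes(bitrate_bps: usize) -> usize {
    bitrate_bps * FRAME_MS / 8 / 1000
}
```

Stereo therefore accumulates 480 samples per frame, 5.1 needs 1440, and 7.1 needs 1920. At the 128 kbps stereo setting a packet is about 80 bytes.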
let mut seq: u32 = 0;
// Reopen-with-backoff: hold the capturer in an Option so a mid-session capture-thread death
// (device unplug, daemon restart) reopens instead of muting the rest of a multi-hour session.
@@ -1542,14 +1697,17 @@ fn audio_thread(conn: quinn::Connection, stop: Arc<AtomicBool>, audio_cap: Audio
// restart). The first open already happened above; failing THAT still ends the session quietly.
let mut capturer = Some(capturer);
let mut last_failed: Option<std::time::Instant> = None;
tracing::info!("punktfunk/1 audio streaming (Opus 48 kHz stereo, 5 ms datagrams)");
tracing::info!(
channels = want,
"punktfunk/1 audio streaming (Opus 48 kHz, 5 ms datagrams)"
);
'session: while !stop.load(Ordering::SeqCst) {
if capturer.is_none() {
if last_failed.is_some_and(|t| t.elapsed() < INJECTOR_REOPEN_BACKOFF) {
std::thread::sleep(std::time::Duration::from_millis(200));
continue;
}
match crate::audio::open_audio_capture(CHANNELS as u32) {
match crate::audio::open_audio_capture(want as u32) {
Ok(c) => {
tracing::info!("punktfunk/1 audio capture reopened");
capturer = Some(c);
@@ -1599,7 +1757,12 @@ fn audio_thread(conn: quinn::Connection, stop: Arc<AtomicBool>, audio_cap: Audio
/// Stub: punktfunk/1 audio needs Linux or Windows (PipeWire / WASAPI capture + libopus); dev
/// builds on other platforms run sessions without it, same as when the capturer fails to open.
#[cfg(not(any(target_os = "linux", target_os = "windows")))]
fn audio_thread(_conn: quinn::Connection, _stop: Arc<AtomicBool>, _audio_cap: AudioCapSlot) {
fn audio_thread(
_conn: quinn::Connection,
_stop: Arc<AtomicBool>,
_audio_cap: AudioCapSlot,
_channels: u8,
) {
tracing::warn!("punktfunk/1 audio requires Linux or Windows — session continues without it");
}
@@ -2256,6 +2419,45 @@ struct SessionSwitch {
/// read (so no handshake plumbing). Opt-in via `PUNKTFUNK_SESSION_WATCH`; readiness of the new
/// backend is left to the encode thread's `build_pipeline_with_retry` (the watcher never writes
/// env). Exits when `stop` is set or the channel closes.
/// Whether to run the mid-stream session-switch watcher. An explicit `PUNKTFUNK_SESSION_WATCH` wins
/// (truthy → on; `0`/`false`/`no`/`off`/empty → off). When unset it defaults **on** for Steam HTPC
/// platforms (Bazzite / SteamOS) — which flip Gaming↔Desktop and need the host to follow the switch
/// mid-stream — and **off** elsewhere, preserving the opt-in default for plain desktop hosts.
fn session_watch_enabled() -> bool {
match std::env::var("PUNKTFUNK_SESSION_WATCH") {
Ok(v) => {
let v = v.trim();
!(v.is_empty()
|| v == "0"
|| v.eq_ignore_ascii_case("false")
|| v.eq_ignore_ascii_case("no")
|| v.eq_ignore_ascii_case("off"))
}
Err(_) => is_steam_htpc_platform(),
}
}
/// True on Bazzite or SteamOS (matched against os-release `ID`/`ID_LIKE`) — the platforms that flip
/// between Steam Gaming Mode and a Desktop session, where following a mid-stream switch is the
/// sensible default. Anything else (incl. non-Linux, where the file is absent) → false.
fn is_steam_htpc_platform() -> bool {
let Ok(os) = std::fs::read_to_string("/etc/os-release") else {
return false;
};
os.lines().any(|line| {
let line = line.trim();
let Some(val) = line
.strip_prefix("ID=")
.or_else(|| line.strip_prefix("ID_LIKE="))
else {
return false;
};
val.trim_matches('"')
.split_whitespace()
.any(|tok| tok.eq_ignore_ascii_case("bazzite") || tok.eq_ignore_ascii_case("steamos"))
})
}
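Both new functions mix I/O (environment, `/etc/os-release`) with parsing. As a sketch of how the parsing halves could be exercised without touching the system, the variants below take the text as input. `os_release_is_steam_htpc` and `watch_value_is_on` are hypothetical names, and the matching rules are copied from the functions above.

```rust
/// Hypothetical pure variant: scan os-release text for an `ID`/`ID_LIKE` token
/// naming Bazzite or SteamOS.
fn os_release_is_steam_htpc(text: &str) -> bool {
    text.lines().any(|line| {
        let line = line.trim();
        let Some(val) = line
            .strip_prefix("ID=")
            .or_else(|| line.strip_prefix("ID_LIKE="))
        else {
            return false;
        };
        val.trim_matches('"')
            .split_whitespace()
            .any(|tok| tok.eq_ignore_ascii_case("bazzite") || tok.eq_ignore_ascii_case("steamos"))
    })
}

/// Env-value half: anything other than empty/0/false/no/off counts as on.
fn watch_value_is_on(v: &str) -> bool {
    let v = v.trim();
    !(v.is_empty() || ["0", "false", "no", "off"].iter().any(|f| v.eq_ignore_ascii_case(f)))
}
```

A line such as `VERSION_ID=...` does not match because `strip_prefix` only matches at the start of the trimmed line.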
fn session_watcher_loop(tx: std::sync::mpsc::Sender<SessionSwitch>, stop: Arc<AtomicBool>) {
use crate::vdisplay;
const DEBOUNCE: std::time::Duration = std::time::Duration::from_secs(3);
@@ -2329,6 +2531,8 @@ struct SessionContext {
bitrate_kbps: u32,
/// Negotiated encode bit depth (8, or 10 = HEVC Main10).
bit_depth: u8,
/// Negotiated chroma subsampling (4:2:0, or 4:4:4 when the client + host + GPU all support it).
chroma: crate::encode::ChromaFormat,
/// Speed-test burst requests (see [`service_probes`]).
probe_rx: std::sync::mpsc::Receiver<ProbeRequest>,
/// Speed-test results back to the control task.
@@ -2359,12 +2563,12 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
// path now reads this typed `SessionPlan` instead of re-deriving from config at each dispatch site
// (the latent "capture and encode disagree on the backend" hazard, plan §2.4). `bit_depth` is the
// only per-session input — capture/topology/encoder are otherwise pure functions of `HostConfig`.
let plan = crate::session_plan::SessionPlan::resolve(ctx.bit_depth);
let plan = crate::session_plan::SessionPlan::resolve(ctx.bit_depth, ctx.chroma);
tracing::info!(?plan, "resolved session plan");
// Windows two-process secure-desktop path: when the host runs as SYSTEM (required for the secure
// desktop + SendInput), WGC can't activate in-process, so we capture the normal desktop via a
// helper spawned in the user session and relay its AUs. (Single-process WGC/DDA is used as the
// user, and stays the path on Linux.) See design/windows-secure-desktop.md.
// user, and stays the path on Linux.) See design/archive/windows-secure-desktop.md.
#[cfg(target_os = "windows")]
if plan.topology == crate::session_plan::SessionTopology::TwoProcessRelay {
return virtual_stream_relay(ctx);
@@ -2381,6 +2585,8 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
compositor,
bitrate_kbps,
bit_depth,
// The resolved chroma is already captured in `plan` (above); ignore the duplicate here.
chroma: _,
probe_rx,
probe_result_tx,
fec_target,
@@ -2491,9 +2697,9 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
// place when the box flips Gaming↔Desktop. When not spawned, session_rx just stays empty.
let mut compositor = compositor;
let (session_tx, session_rx) = std::sync::mpsc::channel::<SessionSwitch>();
let watch = std::env::var_os("PUNKTFUNK_SESSION_WATCH").is_some()
&& crate::config::config().compositor.is_none();
let watch = session_watch_enabled() && crate::config::config().compositor.is_none();
let _watcher = if watch {
tracing::info!("session watcher on — following a mid-stream Gaming↔Desktop switch");
let stop = stop.clone();
std::thread::Builder::new()
.name("punktfunk1-watcher".into())
@@ -2675,15 +2881,76 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
}
tracing::warn!(error = %format!("{e:#}"), rebuild = capture_rebuilds,
"capture lost — rebuilding pipeline in place");
let (new_cap, new_enc, new_frame, new_interval) =
build_pipeline_with_retry(&mut vd, cur_mode, bitrate_kbps, bit_depth, plan)
.context("rebuild after capture loss")?;
// A Bazzite/SteamOS Gaming↔Desktop switch tears the old compositor down and can take
// 15s+ to bring the new one up. Don't fail the session over that (the client would
// have to cold-reconnect, surfacing a "session failed") — keep retrying within a
// generous budget while the QUIC keepalive (its own thread) holds the connection,
// RE-DETECTING the live compositor each attempt so we follow the box to whatever
// session comes up: a fresh instance of the same compositor, OR a different one
// (the kind-change case the session watcher also handles). The client stays
// connected, frozen on the last frame, and the stream resumes when the new output
// appears — no reconnect.
const REBUILD_BUDGET: std::time::Duration = std::time::Duration::from_secs(40);
let rebuild_deadline = std::time::Instant::now() + REBUILD_BUDGET;
let (new_cap, new_enc, new_frame, new_interval) = loop {
// Follow the active session unless an explicit PUNKTFUNK_COMPOSITOR pin forbids
// retargeting (then we stick to the pinned backend and just rebuild it).
if crate::config::config().compositor.is_none() {
let active = crate::vdisplay::detect_active_session();
if let Some(c) = crate::vdisplay::compositor_for_kind(active.kind) {
crate::vdisplay::apply_session_env(&active);
crate::vdisplay::apply_input_env(c);
if c != compositor {
if matches!(
c,
crate::vdisplay::Compositor::Kwin
| crate::vdisplay::Compositor::Mutter
) {
crate::vdisplay::settle_desktop_portal(c);
}
match crate::vdisplay::open(c) {
Ok(v) => {
tracing::info!(from = compositor.id(), to = c.id(),
"capture loss: active session switched compositor — retargeting");
vd = v;
compositor = c;
}
Err(e2) => tracing::warn!(error = %format!("{e2:#}"),
"capture loss: opening the newly-detected compositor failed — retrying"),
}
}
}
}
match build_pipeline_with_retry(
&mut vd,
cur_mode,
bitrate_kbps,
bit_depth,
plan,
) {
Ok(p) => break p,
Err(e2) => {
if stop.load(Ordering::SeqCst)
|| std::time::Instant::now() >= rebuild_deadline
{
return Err(e2)
.context("capture lost — no compositor came up within the rebuild budget");
}
tracing::warn!(error = %format!("{e2:#}"),
"capture lost — new session not up yet, retrying");
}
}
};
capturer = new_cap;
enc = new_enc;
frame = new_frame;
interval = new_interval;
enc.request_keyframe(); // belt-and-suspenders; a fresh encoder opens on an IDR anyway
next = std::time::Instant::now();
tracing::info!(
compositor = compositor.id(),
"capture loss: pipeline rebuilt — stream resumes"
);
}
}
if perf && diag_at.elapsed() >= std::time::Duration::from_secs(2) {
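The rebuild loop in the hunk above is a retry with a deadline and a stop flag. A minimal, std-only sketch of that shape follows; `retry_within_budget` and its `pause` parameter are hypothetical names for illustration (the real loop relies on `build_pipeline_with_retry` for pacing and re-detects the compositor on every attempt).

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::time::{Duration, Instant};

/// Retry `attempt` until it succeeds, the budget elapses, or `stop` is set.
/// On giving up, the most recent error is returned to the caller.
/// NOTE: `pause` is an assumption of this sketch, not something the diff shows.
fn retry_within_budget<T, E>(
    budget: Duration,
    pause: Duration,
    stop: &AtomicBool,
    mut attempt: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    let deadline = Instant::now() + budget;
    loop {
        match attempt() {
            Ok(value) => return Ok(value),
            Err(err) => {
                // Same give-up condition as the hunk: stop requested or budget spent.
                if stop.load(Ordering::SeqCst) || Instant::now() >= deadline {
                    return Err(err);
                }
                std::thread::sleep(pause);
            }
        }
    }
}
```

Checking the deadline only after a failed attempt means at least one attempt always runs, even with a zero budget.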
@@ -2869,6 +3136,9 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
compositor,
bitrate_kbps,
bit_depth,
// The two-process WGC relay encodes 4:2:0 in v1 — the handshake's `single_process` gate already
// forced `chroma` to Yuv420 for this topology, so the helper + secure-desktop DDA stay 4:2:0.
chroma: _,
probe_rx,
probe_result_tx,
fec_target,
@@ -2979,6 +3249,7 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
// stage 5) so the DDA capturer doesn't re-derive it.
crate::capture::gpu_encode(),
hdr,
false, // the two-process relay path is 4:2:0 in v1
)
.context("open DDA for secure desktop")?;
cap.set_active(true);
@@ -2992,6 +3263,8 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
bitrate_kbps as u64 * 1000,
frame.is_cuda(),
bit_depth,
// Secure-desktop DDA on the two-process relay path: 4:2:0 in v1 (matches the helper).
crate::encode::ChromaFormat::Yuv420,
)
.context("open video encoder for DDA")?;
Ok(DdaPipe {
@@ -3391,6 +3664,9 @@ fn is_permanent_build_error(chain: &str) -> bool {
"could not find output", // KWin < 6.5.6: createVirtualOutput unsupported
"must be a node id", // PUNKTFUNK_GAMESCOPE_NODE not an integer
"is it installed", // gamescope / kscreen-doctor not on PATH
// 4:4:4 NVENC got a CUDA frame — should never happen now the Linux capturer honors gpu=false,
// but if it ever recurs, fail fast rather than retrying 8× (~90 s) and wedging the session.
"capture/encoder negotiation mismatch",
];
let lower = chain.to_ascii_lowercase();
PERMANENT.iter().any(|p| lower.contains(p))
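Because the error chain is lowercased before matching, every needle in the list has to be lowercase too, or it can never match. A small sketch of that invariant, using a shortened stand-in list (`PERMANENT_DEMO` and `is_permanent_demo` are hypothetical names):

```rust
/// Shortened stand-in for the real list; needles must be lowercase.
const PERMANENT_DEMO: &[&str] = &[
    "is it installed",
    "capture/encoder negotiation mismatch",
];

/// Case-insensitive substring match of an error chain against the needles.
fn is_permanent_demo(chain: &str) -> bool {
    let lower = chain.to_ascii_lowercase();
    PERMANENT_DEMO.iter().any(|needle| lower.contains(needle))
}
```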
@@ -3440,8 +3716,20 @@ fn build_pipeline(
bitrate_kbps as u64 * 1000,
frame.is_cuda(),
bit_depth,
plan.chroma,
)
.context("open video encoder")?;
// Post-open cross-check: the Welcome already committed `chroma_format` from the pre-open probe, so
// warn loudly if the encoder actually opened a different chroma than negotiated (the in-band SPS is
// authoritative for the decoder, but a mismatch means the probe and the live open disagreed).
let opened_444 = enc.caps().chroma_444;
if opened_444 != plan.chroma.is_444() {
tracing::warn!(
negotiated_444 = plan.chroma.is_444(),
opened_444,
"encoder chroma disagrees with the negotiated Welcome — the client was told the other value"
);
}
let interval = std::time::Duration::from_secs_f64(1.0 / effective_hz.max(1) as f64);
Ok((capturer, enc, frame, interval))
}
@@ -3880,6 +4168,7 @@ mod tests {
GamepadPref::Auto,
0,
0, // video_caps
2, // audio_channels (stereo)
None, // launch
None,
Some((cert.clone(), key.clone())),
@@ -3912,6 +4201,7 @@ mod tests {
GamepadPref::Auto,
0,
0, // video_caps
2, // audio_channels (stereo)
None, // launch
None,
Some((cert, key)),
@@ -3965,6 +4255,7 @@ mod tests {
GamepadPref::Auto,
0,
0, // video_caps
2, // audio_channels (stereo)
None, // launch
None,
None,
@@ -3990,6 +4281,7 @@ mod tests {
GamepadPref::Auto,
0,
0, // video_caps
2, // audio_channels (stereo)
None, // launch
Some(host_fp),
Some((cert.clone(), key.clone())),
+25 -5
@@ -106,17 +106,22 @@ pub struct SessionPlan {
/// The IDD-push HDR hint (`bit_depth >= 10`) — the want-HDR flag the capturer was passed before.
/// Non-IDD-push Windows backends ignore it and auto-detect HDR from the monitor; Linux is 8-bit.
pub hdr: bool,
/// Handshake-negotiated chroma subsampling (4:2:0, or full-chroma 4:4:4 when the client + host +
/// GPU all support it). Resolved before the Welcome; `Yuv420` on every backend that declined it.
pub chroma: crate::encode::ChromaFormat,
}
impl SessionPlan {
/// Resolve the whole plan once from [`config`](crate::config) + the negotiated `bit_depth`.
pub fn resolve(bit_depth: u8) -> Self {
/// Resolve the whole plan once from [`config`](crate::config) + the negotiated `bit_depth` and
/// `chroma`.
pub fn resolve(bit_depth: u8, chroma: crate::encode::ChromaFormat) -> Self {
SessionPlan {
capture: CaptureBackend::resolve(),
topology: resolve_topology(),
encoder: resolve_encoder(),
bit_depth,
hdr: bit_depth >= 10,
chroma,
}
}
@@ -124,9 +129,24 @@ impl SessionPlan {
/// (no second backend probe), `hdr` from the plan. Handed into `capture::capture_virtual_output` so the
/// capturer never re-derives the encode backend.
pub fn output_format(&self) -> crate::capture::OutputFormat {
let gpu = self.encoder.is_gpu();
// Linux NVENC 4:4:4: libavcodec `hevc_nvenc` only emits 4:4:4 from a YUV444 *input* frame —
// RGB-in is always subsampled to 4:2:0 (verified on the RTX 5070 Ti). So the encoder does an
// RGB→YUV444P swscale and needs CPU-resident RGB frames; force the zero-copy GPU capture off
// for a 4:4:4 NVENC session. (VAAPI 4:4:4, where the hardware supports it, keeps its dmabuf
// path via `scale_vaapi`; Windows NVENC ingests ARGB directly and stays GPU.)
#[cfg(target_os = "linux")]
let gpu = {
let force_cpu_for_nvenc_444 =
self.chroma.is_444() && !crate::encode::linux_zero_copy_is_vaapi();
gpu && !force_cpu_for_nvenc_444
};
crate::capture::OutputFormat {
gpu: self.encoder.is_gpu(),
gpu,
hdr: self.hdr,
// 4:4:4 needs a full-chroma source: on Windows this keeps the capturer on RGB (not the
// default NV12/P010 video-engine output) so NVENC can CSC to 4:4:4.
chroma_444: self.chroma.is_444(),
}
}
}
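The Linux `gpu` decision in `output_format` reduces to a small truth table. A sketch with the three inputs passed explicitly (the function name and parameters are hypothetical; the real code reads them from `self` and `crate::encode::linux_zero_copy_is_vaapi()`):

```rust
/// Whether capture should stay on the zero-copy GPU path (Linux logic only).
/// A 4:4:4 session on a non-VAAPI (NVENC) encoder is forced to CPU frames.
fn capture_on_gpu(encoder_is_gpu: bool, chroma_444: bool, zero_copy_is_vaapi: bool) -> bool {
    let force_cpu_for_nvenc_444 = chroma_444 && !zero_copy_is_vaapi;
    encoder_is_gpu && !force_cpu_for_nvenc_444
}
```

Only the combination "GPU encoder, 4:4:4, not VAAPI" flips a GPU session to CPU capture; a 4:2:0 session is unaffected.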
@@ -134,7 +154,7 @@ impl SessionPlan {
/// Process topology. On Windows this is the former `punktfunk1::should_use_helper` logic verbatim; on
/// every other platform the session is always single-process.
#[cfg(target_os = "windows")]
fn resolve_topology() -> SessionTopology {
pub(crate) fn resolve_topology() -> SessionTopology {
let cfg = crate::config::config();
// `NO_HELPER`/`NO_WGC` force single-process; IDD-push captures in-process in Session 0 (no helper);
// otherwise the helper runs when forced or when we're SYSTEM (in-process WGC can't activate there).
@@ -151,7 +171,7 @@ fn resolve_topology() -> SessionTopology {
}
#[cfg(not(target_os = "windows"))]
fn resolve_topology() -> SessionTopology {
pub(crate) fn resolve_topology() -> SessionTopology {
SessionTopology::SingleProcess
}
+2 -1
@@ -109,7 +109,8 @@ pub fn run(opts: Options) -> Result<()> {
opts.fps,
opts.bitrate_bps,
first.is_cuda(),
8, // spike synthetic harness: 8-bit
8, // spike synthetic harness: 8-bit
encode::ChromaFormat::Yuv420, // ...and 4:2:0
)
.context("open encoder")?;
+29 -7
@@ -358,13 +358,30 @@ fn find_wayland_socket(runtime: &str, uid: u32) -> Option<String> {
cands.into_iter().next().map(|(_, n)| n)
}
/// Serializes ALL process-global env mutation on the per-session setup path. `std::env::set_var`
/// concurrent with another thread's `set_var` (glibc `environ` realloc) is a data race = UB. With
/// the default concurrent native sessions each running `resolve_compositor` in its own
/// `spawn_blocking`, the per-session env retargeting would otherwise race and could crash the host
/// (security-review 2026-06-28 #7). Every env write on the setup path takes this lock; steady-state
/// streaming reads cached config, not env. This removes the memory-unsafety; it is NOT a full fix
/// for cross-session env *value* confusion (that needs per-session `SessionContext` threading, as the
/// GameStream/Windows path already does via `set_launch_command`).
pub static ENV_LOCK: std::sync::Mutex<()> = std::sync::Mutex::new(());
/// Run `f` with [`ENV_LOCK`] held. Use around any `set_var`/`remove_var` on the session-setup path.
pub fn with_env_lock<R>(f: impl FnOnce() -> R) -> R {
let _g = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
f()
}
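The `unwrap_or_else(|e| e.into_inner())` in `with_env_lock` makes the lock tolerate poisoning: a panic on one session's setup path does not make every later `lock()` fail. A std-only sketch of that behavior, with a separate demo lock (`DEMO_LOCK`, `with_demo_lock`, and `lock_survives_poison` are hypothetical names):

```rust
use std::sync::Mutex;

static DEMO_LOCK: Mutex<()> = Mutex::new(());

/// Same shape as `with_env_lock`: recover the guard even if the lock is poisoned.
fn with_demo_lock<R>(f: impl FnOnce() -> R) -> R {
    let _guard = DEMO_LOCK.lock().unwrap_or_else(|e| e.into_inner());
    f()
}

/// Poison the lock by panicking on another thread while it is held,
/// then check that a later caller can still take it.
fn lock_survives_poison() -> bool {
    let worker = std::thread::spawn(|| {
        with_demo_lock(|| -> () { panic!("simulated failure while holding the lock") })
    });
    let _ = worker.join(); // the panic is contained in the worker thread
    let poisoned = DEMO_LOCK.is_poisoned();
    let still_usable = with_demo_lock(|| 1 + 1) == 2;
    poisoned && still_usable
}
```

Recovering the guard is reasonable here because the mutex protects no data of its own (`Mutex<()>`), so there is no half-updated state to worry about after a panic.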
/// Write a detected session's [`SessionEnv`] into the process env so every backend (video capture
/// and input alike) that reads `WAYLAND_DISPLAY` / `XDG_RUNTIME_DIR` / `DBUS_SESSION_BUS_ADDRESS` /
/// `XDG_CURRENT_DESKTOP` at open time targets the live session. The host serves one session at a
/// time, so a process-global write is sound; the next connect re-detects and re-applies. Same
/// `set_var` discipline already used for `PUNKTFUNK_GAMESCOPE_APP` on the launch path.
/// `XDG_CURRENT_DESKTOP` at open time targets the live session. Serialized via [`ENV_LOCK`] so
/// concurrent session handshakes can't race the `set_var`s; the next connect re-detects and
/// re-applies. Same `set_var` discipline used for `PUNKTFUNK_GAMESCOPE_APP` on the launch path.
#[cfg(target_os = "linux")]
pub fn apply_session_env(active: &ActiveSession) {
let _env_guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
let e = &active.env;
std::env::set_var("XDG_RUNTIME_DIR", &e.xdg_runtime_dir);
std::env::set_var("DBUS_SESSION_BUS_ADDRESS", &e.dbus_session_bus_address);
@@ -455,9 +472,14 @@ pub fn settle_desktop_portal(_chosen: Compositor) {}
/// `PUNKTFUNK_GAMESCOPE_MANAGED` forces managed over either.
#[cfg(target_os = "linux")]
pub fn apply_input_env(chosen: Compositor) {
let _env_guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());
let backend = match chosen {
Compositor::Gamescope => "gamescope",
Compositor::Kwin | Compositor::Mutter => "libei",
// KWin: org_kde_kwin_fake_input — direct injection, no RemoteDesktop portal / approval
// dialog (headless, the krdpserver path), authorized by the host's shipped .desktop.
Compositor::Kwin => "kwin",
// GNOME has neither fake_input nor the wlr protocols → RemoteDesktop portal via libei.
Compositor::Mutter => "libei",
Compositor::Wlroots => "wlr",
};
std::env::set_var("PUNKTFUNK_INPUT_BACKEND", backend);
@@ -583,10 +605,10 @@ pub fn probe(compositor: Compositor) -> Result<()> {
}
/// Path of the file where the gamescope backend relays the nested session's `LIBEI_SOCKET`
/// (gamescope's EIS server) for the input injector.
/// (gamescope's EIS server) for the input injector. Under `$XDG_RUNTIME_DIR` (per-user 0700).
#[cfg(target_os = "linux")]
pub fn gamescope_ei_socket_file() -> &'static str {
gamescope::EI_SOCKET_FILE
pub fn gamescope_ei_socket_file() -> std::path::PathBuf {
gamescope::ei_socket_file()
}
/// Call when a client session ends: if the host-managed gamescope path took over a box's autologin
@@ -15,7 +15,7 @@
//! `inject/libei.rs`) — wired and live-validated.
use super::{Mode, VirtualDisplay, VirtualOutput};
use anyhow::{anyhow, Context, Result};
use anyhow::{anyhow, bail, Context, Result};
use std::process::{Child, Command, Stdio};
use std::time::{Duration, Instant};
@@ -110,12 +110,11 @@ impl VirtualDisplay for GamescopeDisplay {
// PUNKTFUNK_GAMESCOPE_NODE=<id|auto>; "auto" discovers the gamescope `Video/Source` node.
if let Ok(id) = std::env::var("PUNKTFUNK_GAMESCOPE_NODE") {
let node_id: u32 = if id.trim().eq_ignore_ascii_case("auto") {
find_gamescope_node().ok_or_else(|| {
anyhow!(
"PUNKTFUNK_GAMESCOPE_NODE=auto but no running gamescope Video/Source node \
was found: is the headless gamescope/Steam session up?"
)
})?
// Attach to the box-owned game-mode session, but FIRST make it run at the connecting
// client's resolution (the box is headless, so its game-mode mode is ours to set).
// Reuse if it already matches (fast, no restart); otherwise relaunch the box's own
// session at the client mode. Without this the client gets the box's default mode.
ensure_box_gamescope_mode(mode)?
} else {
id.parse()
.context("PUNKTFUNK_GAMESCOPE_NODE must be a node id or 'auto'")?
@@ -368,6 +367,150 @@ fn create_managed_session_steamos(mode: Mode) -> Result<VirtualOutput> {
})
}
/// ATTACH at the CLIENT's resolution: ensure the box's own game-mode session is running at `mode`'s
/// output size, then return its capture node. Reuses the running session if it already matches (no
/// restart — the rock-solid fast path a stable client always hits); otherwise reconfigures + restarts
/// the box's OWN autologin `gamescope-session-plus@<client>` unit at the client mode. Restarting the
/// box's own unit (rather than spawning a competing one) avoids the autologin-respawn fight the old
/// MANAGED path hit. A headless box has no physical panel, so its game-mode resolution is ours to set;
/// Steam restarts only on an actual resolution CHANGE.
fn ensure_box_gamescope_mode(mode: Mode) -> Result<u32> {
let target = (mode.width, mode.height);
// Fast path: already at the client's resolution — just attach to the live node.
if current_gamescope_output_size() == Some(target) {
if let Some(node) = find_gamescope_node() {
tracing::info!(
w = mode.width,
h = mode.height,
node,
"gamescope: box game-mode session already at the client's resolution — reusing"
);
return Ok(node);
}
}
let Some(unit) = running_autologin_gamescope_unit() else {
// No box-owned autologin session to reconfigure (a bare/foreign gamescope): attach to
// whatever node exists, accepting its resolution.
return find_gamescope_node().ok_or_else(|| {
anyhow!(
"no running gamescope Video/Source node — is the headless game mode up? \
(put the box into Steam Game Mode)"
)
});
};
tracing::info!(
from = ?current_gamescope_output_size(),
to_w = mode.width,
to_h = mode.height,
hz = mode.refresh_hz,
%unit,
"gamescope: relaunching the box game-mode session at the client's resolution"
);
// The session reads SCREEN_WIDTH/HEIGHT (+ CUSTOM_REFRESH_RATES) from the user-manager
// environment; set them and restart the box's own unit.
systemctl_user(&[
"set-environment",
&format!("SCREEN_WIDTH={}", mode.width),
&format!("SCREEN_HEIGHT={}", mode.height),
&format!("CUSTOM_REFRESH_RATES={}", mode.refresh_hz.max(1)),
]);
systemctl_user(&["restart", &unit]);
// Wait for the relaunched session to come up at the new size and publish its capture node. The
// node appears when gamescope is up (well before Steam finishes booting); the caller's
// first-frame retry absorbs Steam's cold start.
let deadline = Instant::now() + Duration::from_secs(45);
loop {
if current_gamescope_output_size() == Some(target) {
if let Some(node) = find_gamescope_node() {
tracing::info!(
node,
w = mode.width,
h = mode.height,
"gamescope: box game-mode session relaunched at the client's resolution"
);
return Ok(node);
}
}
if Instant::now() >= deadline {
bail!(
"box game-mode session did not come up at {}x{} within 45s after relaunch \
(Steam may still be booting)",
mode.width,
mode.height
);
}
std::thread::sleep(Duration::from_millis(500));
}
}
/// Output (capture) resolution `-W <w> -H <h>` of the running `gamescope` binary, parsed from its
/// `/proc/<pid>/cmdline`. `None` if no gamescope is running or the flags aren't present.
fn current_gamescope_output_size() -> Option<(u32, u32)> {
for entry in std::fs::read_dir("/proc").ok()?.flatten() {
let name = entry.file_name();
let Some(pid) = name.to_str() else { continue };
if !pid.bytes().all(|b| b.is_ascii_digit()) {
continue;
}
let Ok(raw) = std::fs::read(format!("/proc/{pid}/cmdline")) else {
continue;
};
let args: Vec<String> = raw
.split(|&b| b == 0)
.filter(|s| !s.is_empty())
.map(|s| String::from_utf8_lossy(s).into_owned())
.collect();
// Match the gamescope BINARY by argv[0]'s basename — NOT /proc/<pid>/exe, which is commonly
// unreadable for the gamescope process (returns empty). The session wrapper scripts run as
// bash/sh (argv[0] != gamescope), so they're excluded; the -W/-H presence check below is the
// final filter.
let is_gamescope = args
.first()
.map(|a0| a0.rsplit('/').next().unwrap_or(a0) == "gamescope")
.unwrap_or(false);
if !is_gamescope {
continue;
}
let flag = |names: &[&str]| -> Option<u32> {
args.iter().enumerate().find_map(|(i, a)| {
names
.contains(&a.as_str())
.then(|| args.get(i + 1).and_then(|v| v.parse().ok()))
.flatten()
})
};
if let (Some(w), Some(h)) = (
flag(&["-W", "--output-width"]),
flag(&["-H", "--output-height"]),
) {
return Some((w, h));
}
}
None
}
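The parsing in `current_gamescope_output_size` can be exercised without `/proc` by splitting it into pure functions. A simplified sketch (`split_cmdline` and `output_size` are hypothetical names, and unlike the original it only looks at the first occurrence of each flag):

```rust
/// Split a raw NUL-separated `/proc/<pid>/cmdline` buffer into arguments.
fn split_cmdline(raw: &[u8]) -> Vec<String> {
    raw.split(|&b| b == 0)
        .filter(|s| !s.is_empty())
        .map(|s| String::from_utf8_lossy(s).into_owned())
        .collect()
}

/// `(width, height)` if argv[0]'s basename is `gamescope` and both size flags parse.
fn output_size(args: &[String]) -> Option<(u32, u32)> {
    let a0 = args.first()?;
    if a0.rsplit('/').next().unwrap_or(a0) != "gamescope" {
        return None; // e.g. a bash/sh session wrapper
    }
    // Value following the first argument that matches any of `names`.
    let flag = |names: &[&str]| -> Option<u32> {
        args.iter()
            .position(|a| names.contains(&a.as_str()))
            .and_then(|i| args.get(i + 1))
            .and_then(|v| v.parse().ok())
    };
    Some((
        flag(&["-W", "--output-width"])?,
        flag(&["-H", "--output-height"])?,
    ))
}
```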
/// The running autologin gaming-mode unit (`gamescope-session-plus@<client>.service`), if any — the
/// box's own game-mode session, which [`ensure_box_gamescope_mode`] reconfigures + restarts.
fn running_autologin_gamescope_unit() -> Option<String> {
let out = Command::new("systemctl")
.args([
"--user",
"list-units",
"--type=service",
"--state=running",
"--no-legend",
"--plain",
"gamescope-session-plus@*.service",
])
.output()
.ok()?;
String::from_utf8_lossy(&out.stdout)
.lines()
.filter_map(|l| l.split_whitespace().next())
.find(|u| u.starts_with("gamescope-session-plus@") && u.ends_with(".service"))
.map(|u| u.to_string())
}
/// Stop every running autologin gaming-mode session (`gamescope-session-plus@*.service`) so its
/// single-instance Steam is free for our own host-managed session. Records the units so
/// [`schedule_restore_tv_session`] can restart them on disconnect. Our own session is the transient
@@ -527,11 +670,11 @@ pub fn start_restore_worker() -> std::sync::Arc<()> {
}
/// Point the libei injector at the running gamescope's EIS socket (it reads the relay file
/// [`EI_SOCKET_FILE`]). Best-effort — video still works without it (input just won't reach the
/// [`ei_socket_file`]). Best-effort — video still works without it (input just won't reach the
/// session). Shared by the attach and host-managed-session paths.
fn point_injector_at_eis() {
match find_gamescope_eis_socket() {
Some(sock) => match std::fs::write(EI_SOCKET_FILE, &sock) {
Some(sock) => match std::fs::write(ei_socket_file(), &sock) {
Ok(()) => {
tracing::info!(socket = %sock, "gamescope: pointed injector at the session's EIS socket")
}
@@ -627,18 +770,31 @@ fn stop_session(unit_name: &str) {
let _ = Command::new("systemctl")
.args(["--user", "stop", unit_name])
.status();
let _ = std::fs::remove_file(EI_SOCKET_FILE);
let _ = std::fs::remove_file(ei_socket_file());
}
/// File where the wrapper below writes gamescope's `LIBEI_SOCKET` (its EIS server socket),
/// read by the libei injector to drive input into the nested app. See [`crate::inject`].
pub const EI_SOCKET_FILE: &str = "/tmp/punktfunk-gamescope-ei";
/// File where the wrapper below writes gamescope's `LIBEI_SOCKET` (its EIS server socket), read by
/// the libei injector to drive input into the nested app. See [`crate::inject`].
///
/// Placed under `$XDG_RUNTIME_DIR` (a per-user, 0700 directory) — NOT a world-writable `/tmp` —
/// so a second unprivileged local user can neither read the relayed socket path nor pre-plant the
/// file to redirect the host's injector to a rogue EIS server (which would let them keylog or deny
/// the remote session's keyboard/mouse input; security-review 2026-06-28 #6). Falls back to `/tmp`
/// only if `XDG_RUNTIME_DIR` is unset (gamescope itself requires it, so this is rare); the reader
/// ([`crate::inject`]) additionally rejects a symlinked relay file as defense-in-depth.
pub fn ei_socket_file() -> std::path::PathBuf {
let runtime = crate::vdisplay::with_env_lock(|| std::env::var_os("XDG_RUNTIME_DIR"));
match runtime {
Some(rt) if !rt.is_empty() => std::path::PathBuf::from(rt).join("punktfunk-gamescope-ei"),
_ => std::path::PathBuf::from("/tmp/punktfunk-gamescope-ei"),
}
}
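The path choice in `ei_socket_file` is easier to check when the environment read is separated from the decision. A sketch that takes the `XDG_RUNTIME_DIR` value as a parameter (`relay_path` is a hypothetical name; the real function reads the variable under the env lock):

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Relay file location: under the per-user runtime dir when it is set and
/// non-empty, otherwise the `/tmp` fallback.
fn relay_path(runtime_dir: Option<OsString>) -> PathBuf {
    match runtime_dir {
        Some(rt) if !rt.is_empty() => PathBuf::from(rt).join("punktfunk-gamescope-ei"),
        _ => PathBuf::from("/tmp/punktfunk-gamescope-ei"),
    }
}
```

An empty value is treated like an unset one, so an exported but blank `XDG_RUNTIME_DIR` still resolves to the `/tmp` fallback rather than to a relative path.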
/// Spawn `gamescope --backend headless -W w -H h -r hz -- <app>`. The app comes from
/// `PUNKTFUNK_GAMESCOPE_APP` (default a no-op that just keeps gamescope alive — set it to a real
/// game/GL app for actual content, e.g. `steam -gamepadui` for the SteamOS-like session).
/// stdout/stderr go to `/tmp/punktfunk-gamescope.log`. The app is launched through a tiny shell
/// wrapper that relays gamescope's `LIBEI_SOCKET` (set for its children) to [`EI_SOCKET_FILE`]
/// wrapper that relays gamescope's `LIBEI_SOCKET` (set for its children) to [`ei_socket_file`]
/// so the input injector can connect to gamescope's EIS server from outside.
fn spawn(w: u32, h: u32, hz: u32, cmd: Option<&str>) -> Result<Child> {
// A non-empty per-session command (set via `set_launch_command`) wins; else the
@@ -648,10 +804,13 @@ fn spawn(w: u32, h: u32, hz: u32, cmd: Option<&str>) -> Result<Child> {
let app = cmd
.map(str::to_string)
.filter(|s| !s.trim().is_empty())
.or_else(|| std::env::var("PUNKTFUNK_GAMESCOPE_APP").ok())
// Read the env fallback under the shared env lock so it can't race a concurrent session's
// `set_var` of the same key (security-review 2026-06-28 #7).
.or_else(|| crate::vdisplay::with_env_lock(|| std::env::var("PUNKTFUNK_GAMESCOPE_APP").ok()))
.filter(|s| !s.trim().is_empty())
.unwrap_or_else(|| "sleep infinity".to_string());
let _ = std::fs::remove_file(EI_SOCKET_FILE); // stale socket path from a previous session
let relay = ei_socket_file();
let _ = std::fs::remove_file(&relay); // stale socket path from a previous session
let mut cmd = Command::new("gamescope");
cmd.args(["--backend", "headless"])
.args(["-W", &w.to_string()])
@@ -661,7 +820,10 @@ fn spawn(w: u32, h: u32, hz: u32, cmd: Option<&str>) -> Result<Child> {
.args([
"sh",
"-c",
&format!("printf %s \"$LIBEI_SOCKET\" > {EI_SOCKET_FILE}; exec \"$@\""),
&format!(
"printf %s \"$LIBEI_SOCKET\" > '{}'; exec \"$@\"",
relay.display()
),
"sh",
])
.args(app.split_whitespace())
@@ -854,7 +1016,7 @@ impl Drop for GamescopeProc {
let _ = self.0.wait();
// Clear the relayed EIS socket name so the host-lifetime injector can't reconnect to this
// now-dead session's socket between sessions (the stale path is the "Connection refused").
let _ = std::fs::remove_file(EI_SOCKET_FILE);
let _ = std::fs::remove_file(ei_socket_file());
}
}
@@ -6,8 +6,14 @@
//! node for it. The node lives on the user's default PipeWire daemon, so [`VirtualOutput::remote_fd`]
//! is `None` and capture connects to that daemon directly.
//!
//! Requirements: KWin must expose the privileged `zkde_screencast` global — a real Plasma session
//! authorizes it for its own clients; the headless test exposes it to bare clients via
//! Requirements: KWin must expose the privileged `zkde_screencast` global. It is a *restricted*
//! protocol — KWin advertises it only to a client whose installed `.desktop` lists it under
//! `X-KDE-Wayland-Interfaces` (KWin maps the connecting client to a `.desktop` by resolving
//! `/proc/<pid>/exe` against `Exec=`, then caches the grant per-executable for the session's life).
//! So an interactive Plasma session does NOT hand it to a bare client — the host packages ship
//! `io.unom.Punktfunk.Host.desktop` (`Exec=/usr/bin/punktfunk-host`,
//! `X-KDE-Wayland-Interfaces=zkde_screencast_unstable_v1,…`) so it is present before the host first
//! connects. The headless test path instead exposes it to bare clients via
//! `KWIN_WAYLAND_NO_PERMISSION_CHECKS=1`. The compositor backend must implement
//! `createVirtualOutput`: the **DRM backend** (any version) or the **VirtualBackend since KWin
//! 6.5.6** (`kwin_wayland --virtual`); on `--virtual` < 6.5.6 the request fails with
@@ -406,9 +412,11 @@ pub fn probe() -> Result<()> {
queue.roundtrip(&mut state).context("registry roundtrip")?;
if state.screencast.is_none() {
bail!(
"KWin is up but does not (yet) expose zkde_screencast_unstable_v1 — needs a real \
KDE session (or KWIN_WAYLAND_NO_PERMISSION_CHECKS=1), and KWin 6.5.6 for the \
headless virtual output"
"KWin is up but does not expose zkde_screencast_unstable_v1 to this client — KWin gates \
it on the host's .desktop X-KDE-Wayland-Interfaces (install \
io.unom.Punktfunk.Host.desktop with Exec=/usr/bin/punktfunk-host, then re-login so KWin \
re-reads it; the grant is cached per-exe on first connect), or set \
KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 for the headless test; needs KWin 6.5.6"
);
}
Ok(())
@@ -437,8 +445,9 @@ fn run(
let screencast = state.screencast.clone().ok_or_else(|| {
anyhow!(
"KWin does not expose zkde_screencast_unstable_v1 (need a real KDE session, or run \
KWin with KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 for the headless test)"
"KWin does not expose zkde_screencast_unstable_v1 to this client — install the host's \
.desktop (io.unom.Punktfunk.Host.desktop, X-KDE-Wayland-Interfaces) and re-login so \
KWin authorizes it, or run KWin with KWIN_WAYLAND_NO_PERMISSION_CHECKS=1 (headless test)"
)
})?;
@@ -0,0 +1,409 @@
//! `punktfunk-host driver install` / `web setup` - the install-time work the Windows installer's Inno
//! `[Run]` section delegates to the host EXE instead of locale-parsed PowerShell *files*.
//!
//! Why: Windows PowerShell 5.1 reads a BOM-less `.ps1` *file* in the machine's ANSI codepage, so on a
//! non-English locale a stray non-ASCII byte mis-decodes and the script aborts "unterminated string" -
//! exactly how the pf-vdisplay driver install silently failed on a German box. A compiled subcommand has
//! no such surface: the external tools it drives (`certutil`/`pnputil`/`nefconc`/`schtasks`/`netsh`/
//! `icacls`) are fixed string literals, not a file parsed in some codepage. (The installer's *inline*
//! `-Command` PowerShell in the `.iss` is unaffected - that's a command-line string, not a file read -
//! so it stays.) Sits next to `service install` (`service.rs`), the established Rust-owns-install pattern.
//!
//! Everything here is BEST-EFFORT: a hiccup warns but returns `Ok` - a non-zero exit would abort the
//! whole installer, and a missing driver only degrades the host to a physical display.
use anyhow::{bail, Context, Result};
use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};
// ── arg + command helpers ──────────────────────────────────────────────────────────────────────
fn flag_val(args: &[String], name: &str) -> Option<String> {
args.iter()
.position(|a| a == name)
.and_then(|i| args.get(i + 1))
.cloned()
}
fn flag_present(args: &[String], name: &str) -> bool {
args.iter().any(|a| a == name)
}
/// Run a command, discard output, return whether it succeeded.
fn run_quiet(cmd: &str, args: &[&str]) -> bool {
Command::new(cmd)
.args(args)
.stdout(Stdio::null())
.stderr(Stdio::null())
.status()
.map(|s| s.success())
.unwrap_or(false)
}
/// Run a command, capture stdout (lossy UTF-8); empty on failure.
fn run_capture(cmd: &str, args: &[&str]) -> String {
Command::new(cmd)
.args(args)
.output()
.map(|o| String::from_utf8_lossy(&o.stdout).into_owned())
.unwrap_or_default()
}
// ── `driver install [--gamepad] --dir <stage>` ─────────────────────────────────────────────────
pub fn driver_main(args: &[String]) -> Result<()> {
match args.first().map(String::as_str) {
Some("install") => driver_install(&args[1..]),
_ => bail!("usage: punktfunk-host driver install --dir <stage> [--gamepad]"),
}
}
fn driver_install(args: &[String]) -> Result<()> {
let dir =
PathBuf::from(flag_val(args, "--dir").context("driver install: --dir <stage> required")?);
let gamepad = flag_present(args, "--gamepad");
let (what, res) = if gamepad {
("gamepad", install_gamepad(&dir))
} else {
("pf-vdisplay", install_pf_vdisplay(&dir))
};
if let Err(e) = res {
// Never abort the installer on a driver failure (matches the old best-effort PS scripts).
eprintln!("warning: {what} driver install: {e:#} (the host degrades without it)");
}
Ok(())
}
/// Trust the bundled self-signed driver cert: machine `Root` (so the chain validates) + `TrustedPublisher`
/// (so PnP installs without a prompt).
fn trust_cert(dir: &Path) {
match first_with_ext(dir, "cer") {
Some(cer) => {
let cer = cer.to_string_lossy().into_owned();
for store in ["Root", "TrustedPublisher"] {
if !run_quiet("certutil", &["-addstore", "-f", store, &cer]) {
eprintln!("warning: certutil -addstore {store} failed for {cer}");
}
}
println!("trusted driver cert {cer} (Root + TrustedPublisher)");
}
None => eprintln!(
"warning: no .cer in {} - driver may not install silently",
dir.display()
),
}
}
fn install_pf_vdisplay(dir: &Path) -> Result<()> {
let inf = dir.join("pf_vdisplay.inf");
if !inf.exists() {
bail!("no pf_vdisplay.inf in {}", dir.display());
}
trust_cert(dir);
// Create the ROOT device node only if absent (a blind re-create spawns a phantom duplicate, and the
// host binds interface index 0). ALWAYS nefconc (a clean ROOT\DISPLAY node), NEVER devgen (which makes
// persistent SWD\DEVGEN software devices that survive reboot + registry deletion).
if pf_vdisplay_present() {
println!("pf-vdisplay device node already present - leaving it.");
} else if let Some(nef) = first_named(dir, "nefconc.exe") {
let (class, guid) = inf_class(&inf);
let ok = run_quiet(
&nef.to_string_lossy(),
&[
"--create-device-node",
"--hardware-id",
"root\\pf_vdisplay",
"--class-name",
&class,
"--class-guid",
&guid,
],
);
if ok {
println!("created root\\pf_vdisplay device node (nefconc)");
} else {
eprintln!("warning: nefconc --create-device-node failed");
}
} else {
eprintln!(
"warning: nefconc.exe not found in {} - cannot create the device node",
dir.display()
);
}
// Stage + bind the driver (idempotent; re-staging the same .inf is harmless).
if run_quiet(
"pnputil",
&["/add-driver", &inf.to_string_lossy(), "/install"],
) {
println!("pnputil /add-driver pf_vdisplay.inf /install ok");
} else {
eprintln!("warning: pnputil /add-driver /install failed (driver may not have installed)");
}
Ok(())
}
fn install_gamepad(dir: &Path) -> Result<()> {
let infs: Vec<PathBuf> = std::fs::read_dir(dir)
.with_context(|| format!("read {}", dir.display()))?
.flatten()
.map(|e| e.path())
.filter(|p| p.extension().is_some_and(|x| x.eq_ignore_ascii_case("inf")))
.collect();
if infs.is_empty() {
bail!("no driver .inf in {}", dir.display());
}
trust_cert(dir);
// Add each package to the store - no /install, no device node: the host SwDeviceCreate's the
// per-session devnode when a client forwards a pad, so PnP binds the store driver on demand.
for inf in &infs {
if run_quiet("pnputil", &["/add-driver", &inf.to_string_lossy()]) {
println!("pnputil /add-driver {} ok", file_name(inf));
} else {
eprintln!("warning: pnputil /add-driver {} failed", inf.display());
}
}
Ok(())
}
/// Is a punktfunk virtual-display device already enumerated? Matches the device ID / description, which
/// are NOT localized, so the substring check is locale-safe.
fn pf_vdisplay_present() -> bool {
let lo = run_capture("pnputil", &["/enum-devices", "/class", "Display"]).to_ascii_lowercase();
lo.contains("pf_vdisplay") || lo.contains("punktfunk virtual display")
}
/// Read `Class` + `ClassGuid` from an INF so the node matches the shipped driver; falls back to Display.
fn inf_class(inf: &Path) -> (String, String) {
let text = std::fs::read_to_string(inf).unwrap_or_default();
let (mut class, mut guid) = (None, None);
for line in text.lines() {
let t = line.trim();
if let Some(eq) = t.find('=') {
let key = t[..eq].trim().to_ascii_lowercase();
let val = t[eq + 1..]
.split(';')
.next()
.unwrap_or("")
.trim()
.to_string();
match key.as_str() {
"class" => class = Some(val),
"classguid" => guid = Some(val),
_ => {}
}
}
}
(
class
.filter(|c| !c.is_empty())
.unwrap_or_else(|| "Display".into()),
guid.filter(|g| !g.is_empty())
.unwrap_or_else(|| "{4d36e968-e325-11ce-bfc1-08002be10318}".into()),
)
}
// ── `web setup --app-dir <app> [--password-file <file>]` ────────────────────────────────────────
const WEB_TASK: &str = "PunktfunkWeb";
pub fn web_main(args: &[String]) -> Result<()> {
match args.first().map(String::as_str) {
Some("setup") => web_setup(&args[1..]),
_ => bail!("usage: punktfunk-host web setup --app-dir <app> [--password-file <file>]"),
}
}
fn web_setup(args: &[String]) -> Result<()> {
let app_dir =
PathBuf::from(flag_val(args, "--app-dir").context("web setup: --app-dir <app> required")?);
let pw_file = flag_val(args, "--password-file");
let data_dir = crate::gamestream::config_dir();
std::fs::create_dir_all(&data_dir).ok();
let pw_path = data_dir.join("web-password");
let token_path = data_dir.join("mgmt-token");
// 1. login password
set_web_password(&pw_path, pw_file.as_deref());
// 2. (upgrade-safe) stop any running console so the new task binds :3000 + the files unlock
stop_web_console();
// 3. register the PunktfunkWeb scheduled task
let cmd = app_dir.join("web").join("web-run.cmd");
if !cmd.exists() {
bail!("web launcher missing: {}", cmd.display());
}
register_web_task(&cmd)?;
// 4. firewall: inbound TCP 3000
if !run_quiet(
"netsh",
&[
"advfirewall",
"firewall",
"add",
"rule",
"name=punktfunk web console (TCP 3000)",
"dir=in",
"action=allow",
"protocol=TCP",
"localport=3000",
],
) {
eprintln!("warning: could not add the firewall rule for TCP 3000");
}
// 5. wait briefly for the host's mgmt token, then start (restart-on-failure picks it up otherwise)
for _ in 0..30 {
if token_path.exists() {
break;
}
std::thread::sleep(std::time::Duration::from_secs(1));
}
run_quiet("schtasks", &["/run", "/tn", WEB_TASK]);
println!("web console set up + started (http://<host-ip>:3000)");
Ok(())
}
/// Source: a non-empty `--password-file` (fresh install) > keep existing (upgrade) > random fallback.
/// Writes `PUNKTFUNK_UI_PASSWORD=<pw>\n` (LF, no BOM) + ACLs it to Administrators + SYSTEM only.
fn set_web_password(pw_path: &Path, pw_file: Option<&str>) {
let password = pw_file
.and_then(|f| std::fs::read_to_string(f).ok())
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
.or_else(|| {
if pw_path.exists() {
println!("keeping existing web console password");
None
} else {
Some(random_password())
}
});
if let Some(pw) = password {
// Create the file EMPTY first, lock its DACL, THEN write the secret — so the cleartext
// password is never present at the inherited (Users-readable) %ProgramData% ACL, even for
// the brief window before icacls runs (security-review 2026-06-28 #8).
if std::fs::write(pw_path, b"").is_err() {
eprintln!("warning: could not create {}", pw_path.display());
return;
}
// Lock down: drop inheritance, grant only Administrators (S-1-5-32-544) + SYSTEM (S-1-5-18).
let p = pw_path.to_string_lossy();
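if !run_quiet(
"icacls",
&[
&p,
"/inheritance:r",
"/grant:r",
"*S-1-5-32-544:F",
"*S-1-5-18:F",
],
) {
// Surface a failed lock-down instead of silently writing the secret under the inherited ACL.
eprintln!("warning: icacls could not lock down {}", pw_path.display());
}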
// Now write the secret into the already-locked file (truncate keeps the explicit DACL).
if std::fs::write(pw_path, format!("PUNKTFUNK_UI_PASSWORD={pw}\n")).is_err() {
eprintln!("warning: could not write {}", pw_path.display());
}
}
}
/// 20-char URL/shell-safe password (no `/ + =`), like web-init.sh / the old web-setup.ps1.
fn random_password() -> String {
use base64::Engine;
use rand::RngCore;
let mut b = [0u8; 24];
rand::thread_rng().fill_bytes(&mut b);
base64::engine::general_purpose::STANDARD
.encode(b)
.chars()
.filter(|c| !matches!(c, '/' | '+' | '='))
.take(20)
.collect()
}
/// Stop + reap a running console before re-registering (upgrade-safe): end the task AND kill the :3000
/// listener owner (runtime-agnostic - a prior install may have run node vs the current bun). The listener
/// is identified by the wildcard foreign address (`0.0.0.0:0`/`[::]:0`), so the localized state word
/// ("LISTENING"/"ABHOEREN"/...) is never parsed.
fn stop_web_console() {
run_quiet("schtasks", &["/end", "/tn", WEB_TASK]);
for line in run_capture("netstat", &["-ano", "-p", "tcp"]).lines() {
let toks: Vec<&str> = line.split_whitespace().collect();
if toks.len() >= 5
&& toks[0].eq_ignore_ascii_case("tcp")
&& toks[1].ends_with(":3000")
&& (toks[2] == "0.0.0.0:0" || toks[2] == "[::]:0")
{
let pid = toks[toks.len() - 1];
if !pid.is_empty() && pid.bytes().all(|b| b.is_ascii_digit()) {
run_quiet("taskkill", &["/PID", pid, "/F"]);
}
}
}
std::thread::sleep(std::time::Duration::from_secs(1));
}
/// Register the boot/SYSTEM/restart-on-failure task via a generated Task Scheduler XML (`schtasks /xml`,
/// no COM). The XML declares UTF-16, so it's written UTF-16LE+BOM.
fn register_web_task(cmd: &Path) -> Result<()> {
let xml = format!(
"<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n\
<Task version=\"1.2\" xmlns=\"http://schemas.microsoft.com/windows/2004/02/mit/task\">\n\
<RegistrationInfo><Description>punktfunk web management console (Nitro SSR on bun, :3000)</Description></RegistrationInfo>\n\
<Triggers><BootTrigger><Enabled>true</Enabled></BootTrigger></Triggers>\n\
<Principals><Principal id=\"Author\"><UserId>S-1-5-18</UserId><RunLevel>HighestAvailable</RunLevel></Principal></Principals>\n\
<Settings>\n\
<MultipleInstancesPolicy>IgnoreNew</MultipleInstancesPolicy>\n\
<DisallowStartIfOnBatteries>false</DisallowStartIfOnBatteries>\n\
<StopIfGoingOnBatteries>false</StopIfGoingOnBatteries>\n\
<StartWhenAvailable>true</StartWhenAvailable>\n\
<ExecutionTimeLimit>PT0S</ExecutionTimeLimit>\n\
<RestartOnFailure><Interval>PT1M</Interval><Count>10</Count></RestartOnFailure>\n\
</Settings>\n\
<Actions Context=\"Author\"><Exec><Command>{}</Command></Exec></Actions>\n\
</Task>",
xml_escape(&cmd.to_string_lossy())
);
let xml_path = std::env::temp_dir().join("punktfunk-web-task.xml");
write_utf16le_bom(&xml_path, &xml)?;
let ok = run_quiet(
"schtasks",
&[
"/create",
"/tn",
WEB_TASK,
"/xml",
&xml_path.to_string_lossy(),
"/f",
],
);
let _ = std::fs::remove_file(&xml_path);
if ok {
println!("registered scheduled task {WEB_TASK} -> {}", cmd.display());
Ok(())
} else {
bail!("schtasks /create {WEB_TASK} failed")
}
}
fn write_utf16le_bom(path: &Path, s: &str) -> Result<()> {
let mut bytes = vec![0xFFu8, 0xFE]; // UTF-16LE BOM
for u in s.encode_utf16() {
bytes.extend_from_slice(&u.to_le_bytes());
}
std::fs::write(path, bytes).with_context(|| format!("write {}", path.display()))
}
fn xml_escape(s: &str) -> String {
s.replace('&', "&amp;")
.replace('<', "&lt;")
.replace('>', "&gt;")
}
fn first_with_ext(dir: &Path, ext: &str) -> Option<PathBuf> {
std::fs::read_dir(dir)
.ok()?
.flatten()
.map(|e| e.path())
.find(|p| p.extension().is_some_and(|x| x.eq_ignore_ascii_case(ext)))
}
fn first_named(dir: &Path, name: &str) -> Option<PathBuf> {
let p = dir.join(name);
p.exists().then_some(p)
}
fn file_name(p: &Path) -> String {
p.file_name()
.unwrap_or_default()
.to_string_lossy()
.into_owned()
}
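The listener match in `stop_web_console` above can be isolated as a pure function, which makes the locale-independence easy to check. This is a minimal sketch, not code from the commit: `listener_pid` is a hypothetical name, and it assumes the `netstat -ano -p tcp` column order (protocol, local address, foreign address, state, PID).

```rust
/// Returns the PID that owns a TCP listener on `port`, if `line` is such a row.
/// Hypothetical helper; assumes columns: proto, local, foreign, state, pid.
fn listener_pid(line: &str, port: u16) -> Option<u32> {
    let toks: Vec<&str> = line.split_whitespace().collect();
    if toks.len() < 5 || !toks[0].eq_ignore_ascii_case("tcp") {
        return None;
    }
    // Compare the parsed local port rather than a string suffix.
    let (_, local_port) = toks[1].rsplit_once(':')?;
    if local_port.parse::<u16>().ok()? != port {
        return None;
    }
    // A listening socket has no peer, so netstat prints a wildcard foreign
    // address; the localized state word is never inspected.
    if toks[2] != "0.0.0.0:0" && toks[2] != "[::]:0" {
        return None;
    }
    // The PID is the last column.
    toks[toks.len() - 1].parse().ok()
}
```

Keeping the match free of process spawning means it can be unit-tested with captured `netstat` rows from any locale.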
+12 -4
@@ -114,13 +114,15 @@ pub fn main(args: &[String]) -> Result<()> {
/// stdout/stderr are redirected to `host.log` in the same dir.
pub fn service_log_path() -> PathBuf {
let dir = crate::gamestream::config_dir().join("logs");
let _ = std::fs::create_dir_all(&dir);
// DACL-locked (Users read-only, no create) so a local user can't pre-plant SYSTEM log files as
// reparse points / hardlinks to redirect the SYSTEM service's writes (security-review #11).
let _ = crate::gamestream::create_private_dir(&dir);
dir.join("service.log")
}
fn host_log_path() -> PathBuf {
let dir = crate::gamestream::config_dir().join("logs");
let _ = std::fs::create_dir_all(&dir);
let _ = crate::gamestream::create_private_dir(&dir);
dir.join("host.log")
}
@@ -684,7 +686,9 @@ fn ensure_default_host_env() -> Result<()> {
return Ok(());
}
if let Some(dir) = path.parent() {
std::fs::create_dir_all(dir).ok();
// DACL-lock the config dir on creation so a local user can't pre-create it and plant a
// host.env (which feeds the SYSTEM service's env + command line) — security-review #3.
crate::gamestream::create_private_dir(dir).ok();
}
let default = "# punktfunk host configuration (read by the Windows service).\n\
# KEY=VALUE per line; '#' comments. Restart the service after editing:\n\
@@ -707,7 +711,11 @@ fn ensure_default_host_env() -> Result<()> {
\n\
# Force a specific render GPU by name substring (multi-GPU boxes only):\n\
# PUNKTFUNK_RENDER_ADAPTER=4090\n";
std::fs::write(&path, default).with_context(|| format!("write {}", path.display()))?;
// Write host.env DACL-locked to SYSTEM/Administrators: it controls the SYSTEM service's
// environment + launched command line, so a local user must not be able to read or tamper with
// it (security-review 2026-06-28 #3).
crate::gamestream::write_secret_file(&path, default.as_bytes())
.with_context(|| format!("write {}", path.display()))?;
println!("Wrote default config: {}", path.display());
Ok(())
}
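The default `host.env` text above documents the format as `KEY=VALUE` per line with `#` comments. A minimal sketch of a reader for that format follows; the service's actual parser is not shown in this diff, so `parse_host_env` and its trimming rules are assumptions.

```rust
/// Parses `KEY=VALUE` lines, skipping blank lines and `#` comments.
/// Hypothetical helper; the real service parser may differ (e.g. on quoting).
fn parse_host_env(text: &str) -> Vec<(String, String)> {
    text.lines()
        .map(str::trim)
        .filter(|l| !l.is_empty() && !l.starts_with('#'))
        // Split on the first '=' only, so values may themselves contain '='.
        .filter_map(|l| l.split_once('='))
        .map(|(k, v)| (k.trim().to_string(), v.trim().to_string()))
        .filter(|(k, _)| !k.is_empty())
        .collect()
}
```

Because these values feed the SYSTEM service's environment and command line, a reader like this is only as trustworthy as the file's DACL, which is why the write goes through `write_secret_file`.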
@@ -1,5 +1,5 @@
//! USER-session WGC helper (Windows) — part of the two-process secure-desktop design
//! (design/windows-secure-desktop.md).
//! (design/archive/windows-secure-desktop.md).
//!
//! WGC won't activate under the SYSTEM account, but the host must run as SYSTEM for the secure
//! desktop. So the SYSTEM host spawns THIS helper in the interactive user session
@@ -98,6 +98,9 @@ pub fn run(opts: HelperOptions) -> Result<()> {
opts.bitrate_kbps as u64 * 1000,
false, // not cuda
opts.bit_depth, // 8, or 10 = Main10 (HDR auto-upgrades from the Rgb10a2 frame regardless)
// The two-process WGC relay helper encodes 4:2:0 in v1 (4:4:4 over the relay is a follow-up);
// the host gates 4:4:4 to the single-process topology.
encode::ChromaFormat::Yuv420,
)
.context("open NVENC")?;
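The comment in the hunk above says the host gates 4:4:4 to the single-process topology while the relay helper always encodes 4:2:0. A minimal sketch of such a gate follows. Only `ChromaFormat::Yuv420` appears in the diff; the `Yuv444` variant, the `Topology` enum, and `effective_chroma` are hypothetical names used for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ChromaFormat {
    Yuv420,
    Yuv444, // assumed variant name
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Topology {
    SingleProcess,
    TwoProcessRelay, // SYSTEM host + user-session WGC helper
}

/// Downgrades a 4:4:4 request to 4:2:0 unless the capture and encode
/// happen in a single process.
fn effective_chroma(requested: ChromaFormat, topology: Topology) -> ChromaFormat {
    match (requested, topology) {
        (ChromaFormat::Yuv444, Topology::SingleProcess) => ChromaFormat::Yuv444,
        _ => ChromaFormat::Yuv420,
    }
}
```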
+85
@@ -0,0 +1,85 @@
# design/ — design notes & deep-dive plans
Repo-internal design docs: architecture rationale, investigations, and the *why* behind decisions that
the code and [`../CLAUDE.md`](../CLAUDE.md) don't capture. **Authoritative current status lives in
[`../CLAUDE.md`](../CLAUDE.md)** ("Where the work stands" / "What's left"); the user-facing guides live in
`docs-site/`. These docs are kept trimmed: once work ships, the redundant implementation detail is dropped
(the code is the source of truth) and only the durable rationale + still-open items remain. Git history
holds the full originals.
## Index
| Doc | What it is | Status |
|-----|-----------|--------|
| [`implementation-plan.md`](implementation-plan.md) | Master design thesis (why GF(2¹⁶) FEC + Linux virtual displays; three-phase de-risking), architecture invariants, latency budget, risk register | **Design reference** — §0–7,9 kept; milestones → CLAUDE.md |
| [`apollo-comparison.md`](apollo-comparison.md) | Apollo↔punktfunk architecture map + file index + ~63-item transferable-improvement backlog (Windows-host focus) | **Reference + open backlog** — ~⅓ shipped (collapsed); rest open |
| [`security-review.md`](security-review.md) | Whole-project security audit (2026-06-21), 12 findings | **Audit trail** — 11 fixed/inherent; **#12 open** |
| [`ci.md`](ci.md) | CI/CD architecture: Gitea workflows, runners, release model, signing | **Evergreen reference** |
| [`linux-setup.md`](linux-setup.md) | Linux host bring-up (NVIDIA/headless) + troubleshooting | **Setup guide** (evergreen) |
| [`gamestream-host-plan.md`](gamestream-host-plan.md) | GameStream/Moonlight-compat host (P1.1–P1.6) | **Shipped** — stub + the 2 deferral decisions |
| [`stats-capture-plan.md`](stats-capture-plan.md) | Web-console performance capture | **Shipped** — stub |
| [`session-aware-host-followups.md`](session-aware-host-followups.md) | Session-aware host known limitations | **Open items** — #2/#3 shipped; #1, #4–8 parked |
| [`gamescope-multiuser.md`](gamescope-multiuser.md) | Per-session gamescope isolation (the 4 plumbing items) | **Deferred** — reference spec |
| [`host-latency-plan.md`](host-latency-plan.md) | Latency under GPU contention — 4-tier plan | **Partly shipped** — superseded by ↓; diagnostics + open tiers kept |
| [`gpu-contention-investigation.md`](gpu-contention-investigation.md) | GPU-contention root-cause + ranked levers (supersedes ↑) | **Active plan** — §5.A shipped; §5.B/C/E/F/G open |
| [`hdr-pipeline-plan.md`](hdr-pipeline-plan.md) | Glass-to-glass HDR | **Steps 0–3 shipped**; Step 4 (Linux) open |
| [`windows-host-rewrite.md`](windows-host-rewrite.md) | **Windows host — the single architecture/status/reference doc** (validated invariants, ops, open work) | **Active reference** |
| [`windows-build-and-packaging.md`](windows-build-and-packaging.md) | How the Windows host is built, signed, packaged (drivers-from-source, Inno, CI) | **Evergreen reference** |
| [`windows-service.md`](windows-service.md) | SYSTEM SCM service + secure-desktop deployment model | **Shipped** — stub + graceful-stop open item |
| [`windows-host.md`](windows-host.md) | (original 2026-06 plan) | **Redirect** → `windows-host-rewrite.md` |
| [`windows-virtual-display-rust-port.md`](windows-virtual-display-rust-port.md) | pf-vdisplay IddCx port + the "IDD-push is impossible on bare metal" finding | **Shipped** — P2 do-not-retry record kept |
| [`windows-dualsense-scoping.md`](windows-dualsense-scoping.md) | Virtual DualSense (UMDF2) decision + M0 bug lessons | **Shipped (M0–M4)** — public signing open |
| [`windows-dualsense-game-detection.md`](windows-dualsense-game-detection.md) | Native game-detection fix (SwDeviceCreate identity) | **Shipped** — on-glass test + GameInput open |
| [`windows-client-bootstrap.md`](windows-client-bootstrap.md) | Windows client architecture record + HDR guide + build gotchas | **Shipped** — on-glass validation open |
| [`apple-stage2-presenter.md`](apple-stage2-presenter.md) | Apple stage-2 (VTDecompressionSession + CAMetalLayer) presenter | **Shipped (opt-in)** — make-default + iOS open |
| [`game-library-stores.md`](game-library-stores.md) | Multi-store game library | **Phases 1–4 shipped** — 6 providers + 8 Qs open |
| [`dualsense-haptics.md`](dualsense-haptics.md) | DualSense advanced-haptics feasibility | **HID shipped**; audio haptics deferred (3 walls) |
| [`archive/windows-secure-desktop.md`](archive/windows-secure-desktop.md) | Two-process WGC secure-desktop design | **Archived** — shipped but now a fallback (IDD-push primary) |
Plus `research/gamestream-protocol-research.json` — raw Moonlight/GameStream wire reference (data, not prose).
## Consolidated open items
Still-open work scattered across the docs above, rolled up by theme so nothing is tracked in only one
buried doc. CLAUDE.md "What's left" is the headline list; this is the design-level detail. (→ names the
owning doc.)
**Latency / performance**
- Sub-frame pipelining — overlap encode+transmit within a frame; needs a direct NVENC SDK wrapper (~24 ms). → `implementation-plan`, `gamestream-host-plan`
- GPU-contention levers: correct async NVENC pipeline, auto-gated REALTIME GPU priority, clock/P-state pinning, frame-source escape (swapchain-hook/NvFBC/compose-flip), iGPU encode offload, PERF uniq-vs-fps instrumentation. → `gpu-contention-investigation` (§5.B/C/E/F/G), `host-latency-plan` (Tiers 1A/1B/3B/3C/3D/4)
- Apple stage-2 as default (after resolution/HDR checks) + smoothing/pacing policy + glass-to-glass numbers via `tools/latency-probe`. → `apple-stage2-presenter`
**HDR**
- Linux 10-bit HDR (Step 4): 8-bit→Main10 shim, true 10-bit PipeWire capture (blocked upstream — gamescope #2126), Linux-client P010 + GTK color management. → `hdr-pipeline-plan`
- GameStream HDR/10-bit (capture + metadata plumbing). → `gamestream-host-plan`
- Open Qs: MaxCLL source, GameStream SS_HDR_METADATA vs deliberate SDR, HLG sources, mid-session SDR-downgrade + SDR-for-SDR-client validation. → `hdr-pipeline-plan`
**Clients**
- Windows client on-glass validation (D3D11VA decode + HDR present + GUI on the RTX box) + RAWINPUT relative-mouse pointer-lock + per-host speed-test UI. → `windows-client-bootstrap`, `implementation-plan`
- iOS/iPadOS/tvOS stage-2 presenter variants. → `apple-stage2-presenter`, `implementation-plan`
- Android real-device validation (gamepad rumble/lightbar/DualSense, HDR10). → `implementation-plan`
**Windows host**
- Graceful stop signal — host is killed via TerminateProcess (skips RAII teardown → a stale virtual monitor can linger). → `windows-service`
- pf-vdisplay slot-reclaim on-glass reconnect-storm A/B; M4 driver-unification source-build validation; P2/P3 cleanup (D1-host lints, M6 scaffolding, M5 reshape WGC/DDA onto session/pipeline). → `windows-host-rewrite`
- Session-aware follow-ups: F44 gamescope teardown GPU-context corruption (#1, SIGKILL hypothesis); mid-stream-switch input-loss window; NVENC InitializeEncoder noise at 5K@240; NVENC HEVC ~800 Mbps cap (prefer AV1 above it); restore-guard/keep-warm coupling; Feature B (`PUNKTFUNK_SESSION_WATCH`) opt-in → default. → `session-aware-host-followups`
- Apollo backlog (~63 open) — highest-value: #9 Windows app launch (CreateProcessAsUserW), #7/#18 WASAPI device-loss recovery, #3 per-frame `IDXGIFactory::IsCurrent()`, #15 watchdog escalation, #14/#30/#56 abs-mouse through the real output rect, #10/#20/#32/#33 tray + browser-UI + in-binary service install + logs endpoint, #67/#68 frame pacing. → `apollo-comparison`
**Windows gamepads**
- DualSense public-distribution signing (EV cert + Microsoft Partner Center attestation — blocks public release); GameInput detection (reads VID/PID 0x0000 — may need a rank-3 KMDF USB-emulating bus driver); HidHide integration; minimum-OS / UMDFVERSION targeting; on-glass Cyberpunk glyph test. → `windows-dualsense-scoping`, `windows-dualsense-game-detection`
**GameStream**
- AV1 + surround 5.1/7.1 live stock-Moonlight confirmation (incl. FEC-under-loss); reconnect-at-new-mode robustness. → `gamestream-host-plan`, `implementation-plan`
**Game library**
- 6 remaining providers (Desktop/Flatpak, itch.io, Ubisoft Connect, Amazon Games, Battle.net, EA app); the `/library/art/<entryId>/<slot>` mgmt endpoint; refactor `library.rs` into a `library/` dir; 8 open design questions; optional SteamGridDB v2 enrichment. → `game-library-stores`
**Multi-user / sessions**
- gamescope per-session input/audio isolation (independent desktops) — the 4 plumbing items, deferred. → `gamescope-multiuser`, `implementation-plan`
**Security**
- **#12** — scope `NODE_TLS_REJECT_UNAUTHORIZED` to a per-request pinned agent (needs `bun add undici`); latent-only today, but **must fix before the web app gains any off-loopback server-side TLS**. → `security-review`
**Deferred / do-not-retry records** (kept so the dead ends aren't re-explored)
- DualSense audio-driven haptics — deferred until all 3 GO conditions are met. → `dualsense-haptics`
- IDD-push direct frame-push on bare-metal console capture — architecturally impossible (no presentation consumer for the swapchain). → `windows-virtual-display-rust-port`
+39 -136
@@ -1,5 +1,7 @@
# Apollo vs punktfunk — architecture map & transferable improvements
> **Status:** Reference doc — an Apollo↔punktfunk architecture map plus a 96-item transferable-improvement backlog. About a third of the backlog has since shipped or gone obsolete (those items are collapsed to one-liners below); the rest is still open with full citations. The **Re-verified status (2026-06-20)** section is the authoritative shipped-status record.
> Generated 2026-06-16 by the `apollo-vs-punktfunk` multi-agent workflow, then reconstructed from
> the run journal after the live run was interrupted. **Apollo** = `~/Apollo` (commit `adc5c5a0`),
> a C++ fork of Sunshine — a Moonlight-compatible streaming **host only** (no client of its own).
@@ -680,7 +682,7 @@ Both transports use the persistent `AudioCapSlot` (gamestream/audio.rs:251-257)
### Input handling & injection — 🔴 Apollo ahead
For the Windows host specifically, Apollo is ahead on input breadth and robustness. Apollo covers mouse (rel+abs), keyboard (with a static US-layout VK→scancode table for game compatibility), Unicode text, scroll, **touch + pen via CreateSyntheticPointerDevice**, and **both X360 and DS4** gamepads with rumble/LED/motion/touchpad/battery feedback (Apollo src/platform/windows/input.cpp). punktfunk's Windows host covers mouse/keyboard/scroll/X360-only; touch and pen are explicit no-ops (sendinput.rs:231-237), there is no Unicode text path (gamestream/input.rs:83-84), and only the Xbox 360 virtual pad exists on Windows. Apollo also has the more efficient secure-desktop model (retry-only) vs punktfunk's per-event reattach (sendinput.rs:97), and Apollo's task-pool queue + type-aware batching (Apollo src/input.cpp:1481-1571, 1208-1475) coalesces input spam off the network thread — punktfunk's GameStream path injects inline on the ENet thread (control.rs:207-211) with no batching anywhere. punktfunk's design is cleaner and its m3 path's session-end held-key release + backend-follow logic is genuinely nicer than Apollo, but those are punktfunk/1-specific; on the shared Windows-host injection surface Apollo is the more complete, battle-tested implementation. punktfunk's design/windows-secure-desktop.md already flags the retry-only refactor as planned-but-unshipped, confirming the gap.
For the Windows host specifically, Apollo is ahead on input breadth and robustness. Apollo covers mouse (rel+abs), keyboard (with a static US-layout VK→scancode table for game compatibility), Unicode text, scroll, **touch + pen via CreateSyntheticPointerDevice**, and **both X360 and DS4** gamepads with rumble/LED/motion/touchpad/battery feedback (Apollo src/platform/windows/input.cpp). punktfunk's Windows host covers mouse/keyboard/scroll/X360-only; touch and pen are explicit no-ops (sendinput.rs:231-237), there is no Unicode text path (gamestream/input.rs:83-84), and only the Xbox 360 virtual pad exists on Windows. Apollo also has the more efficient secure-desktop model (retry-only) vs punktfunk's per-event reattach (sendinput.rs:97), and Apollo's task-pool queue + type-aware batching (Apollo src/input.cpp:1481-1571, 1208-1475) coalesces input spam off the network thread — punktfunk's GameStream path injects inline on the ENet thread (control.rs:207-211) with no batching anywhere. punktfunk's design is cleaner and its m3 path's session-end held-key release + backend-follow logic is genuinely nicer than Apollo, but those are punktfunk/1-specific; on the shared Windows-host injection surface Apollo is the more complete, battle-tested implementation. punktfunk's design/archive/windows-secure-desktop.md already flags the retry-only refactor as planned-but-unshipped, confirming the gap.
**How punktfunk does it.**
@@ -748,7 +750,7 @@ For the Windows host specifically, Apollo is clearly ahead on this subsystem. Ap
- punktfunk has TWO app surfaces by design: the GameStream apps.json catalog (Moonlight compat) AND a richer punktfunk/1 library (Steam local scan + custom store + CDN art + uniform GameEntry grid). Apollo has only the apps.json catalog because it ships no client.
- punktfunk's launch security model is deliberately client-can't-inject: the client sends only a store-qualified id and the host resolves it against its OWN library (library.rs:394-412), with steam appid validated digits-only. Apollo trusts its own apps.json cmds (it has no untrusted remote launch id).
- punktfunk keeps NO async on the per-frame path; the SudoVDA watchdog pinger and capture are native threads. Apollo's libdisplaydevice RetryScheduler is its own machinery; punktfunk has no equivalent scheduler by choice (yet — see candidate improvements).
- punktfunk's Windows virtual display is the SOLE primary output (isolate_displays + CDS_SET_PRIMARY) specifically to capture the secure/Winlogon desktop — a deliberate, documented design (design/windows-secure-desktop.md) that goes beyond what stock Apollo needs.
- punktfunk's Windows virtual display is the SOLE primary output (isolate_displays + CDS_SET_PRIMARY) specifically to capture the secure/Winlogon desktop — a deliberate, documented design (design/archive/windows-secure-desktop.md) that goes beyond what stock Apollo needs.
**Transfer candidates from Apollo (6):** _Actually launch the app/game on Windows (CreateProcessAsUserW into the user session)_, _Display-config apply/revert with a retry scheduler and guaranteed revert on disconnect_, _Set HDR on the virtual display and advertise IsHdrSupported when the client requests it_, _Per-(app,client) stable virtual-display GUID instead of one fixed MONITOR_GUID_, _Inject per-app launch env (client res/fps/HDR/audio + status) for launch scripts_, _auto_detach heuristic for launcher-style apps (Steam/UWP) that exit immediately_ — see Part 4.
@@ -897,11 +899,11 @@ QPC values from `LastPresentTime`/`LastMouseUpdateTime` are translated to `stead
#### Transfer opportunities
- **Treat S_OK-with-no-change frames as timeouts via DXGI update flags** (sev high, medium) — In dxgi.rs acquire(), after a successful AcquireNextFrame, compute frame_update_flag = info.LastPresentTime != 0 (and/or info.AccumulatedFrames != 0) and mouse_update_flag from LastMouseUpdateTime/PointerShapeBufferSize. Always call update_cursor (mouse). If !frame_update_flag, ReleaseFrame and return Ok(None) (so next_frame repeats last_present) UNLESS the cursor moved and we need a recomposite — in which case recomposite onto the existing last_present texture instead of CopyResource'ing the source. This cuts idle/cursor-only GPU load and avoids re-encoding unchanged content.
- **Detect resolution/format change on the acquire hot path, not only during rebuild** (sev high, small) — In acquire(), after res.cast::<ID3D11Texture2D>(), call GetDesc and compare Width/Height/Format against self.width/height and the expected format (BGRA8 vs R16G16B16A16_FLOAT). On mismatch, ReleaseFrame and run the existing recreate_dupl path (or drop gpu_copy/staging/fp16/hdr10 textures and update width/height/hdr_fp16) so the encoder re-inits cleanly. This makes live resolution + HDR-toggle changes robust even when DDA doesn't fault.
- **Release the duplication device lock during idle to avoid encoder starvation** (sev medium, small) — Cap the per-acquire DDA timeout to a small value (e.g. 8-16ms) and, when it returns WAIT_TIMEOUT, std::thread::sleep a few ms with no outstanding AcquireNextFrame before retrying — so the encode thread can grab the device for NVENC setup/reinit. Keep the generous timeout only for first_frame. Low risk, directly mirrors Apollo's documented fix.
- **Detect resolution/format change on the acquire hot path, not only during rebuild** — SHIPPED (2026-06-20). [#2]
- **Release the duplication device lock during idle to avoid encoder starvation** — OBSOLETE / not-a-bug (2026-06-20). [#34]
- **Add client-framerate frame pacing with a high-precision timer** (sev medium, large) — Add an optional pacing layer (in dxgi.rs or the encode-loop caller in punktfunk1.rs/encode.rs) keyed on the negotiated client framerate: track a group start from the frame pts, sleep to the computed target with a Windows high-resolution timer (timeBeginPeriod or CREATE_WAITABLE_TIMER_HIGH_RESOLUTION), and snap near-integral refresh to integer divisors. This is the lever for steady pacing on odd refresh rates without changing the zero-copy design.
- **Harden GPU scheduling priority + SetMaximumFrameLatency + NVIDIA-HAGS NVENC-realtime avoidance** (sev medium, medium) — After D3D11CreateDevice in dxgi.rs (and the NVENC encoder device wherever it's built), query IDXGIDevice1::SetMaximumFrameLatency(1) and SetGPUThreadPriority; load gdi32 D3DKMTSetProcessSchedulingPriorityClass and request HIGH (not REALTIME) when the adapter is NVIDIA (VendorId 0x10DE) with HAGS on, REALTIME otherwise. Mirror the privilege-enable. Guard behind admin/SYSTEM (host already relaunches as SYSTEM).
- **Retry DuplicateOutput at startup and request encoder-supported formats via Output5** (sev medium, small) — In open() wrap DuplicateOutput in a short retry (2-3 tries, ~200ms apart, re-attach_input_desktop between) before bailing. Optionally cast the output to IDXGIOutput5 and call DuplicateOutput1 with an explicit format list (BGRA8 for SDR, R16G16B16A16_FLOAT for HDR) so the capture format is intentional rather than incidental, falling back to DuplicateOutput when Output5 is absent.
- **Harden GPU scheduling priority + SetMaximumFrameLatency + NVIDIA-HAGS NVENC-realtime avoidance** — SHIPPED (2026-06-20). [#47]
- **Retry DuplicateOutput at startup and request encoder-supported formats via Output5** — SHIPPED (2026-06-20). [#35]
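The still-open "S_OK-with-no-change" item in the list above boils down to a three-way decision per acquired frame. A minimal sketch follows, assuming plain fields that mirror `DXGI_OUTDUPL_FRAME_INFO`. The struct, enum, and function names are hypothetical, and combining the two frame signals with OR is one reading of the "and/or" in the item.

```rust
/// Plain mirror of the DXGI_OUTDUPL_FRAME_INFO fields used here (assumed layout).
#[derive(Debug, Clone, Copy, Default)]
struct FrameInfo {
    last_present_time: i64,
    accumulated_frames: u32,
    last_mouse_update_time: i64,
    pointer_shape_buffer_size: u32,
}

#[derive(Debug, PartialEq, Eq)]
enum AcquireAction {
    /// Desktop content changed: copy and encode as usual.
    EncodeNewFrame,
    /// Only the pointer changed: recomposite onto the last presented texture.
    RecompositeCursor,
    /// Nothing changed: release the frame and repeat the last one.
    RepeatLast,
}

fn classify_acquire(info: &FrameInfo, cursor_visible: bool) -> AcquireAction {
    let frame_update = info.last_present_time != 0 || info.accumulated_frames != 0;
    let mouse_update =
        info.last_mouse_update_time != 0 || info.pointer_shape_buffer_size != 0;
    if frame_update {
        AcquireAction::EncodeNewFrame
    } else if mouse_update && cursor_visible {
        AcquireAction::RecompositeCursor
    } else {
        AcquireAction::RepeatLast
    }
}
```

In the real capturer the cursor state would still be updated on every successful acquire, as the item notes; only the copy and encode work is skipped.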
### Windows.Graphics.Capture (WGC) path — Apollo vs punktfunk
@@ -1099,10 +1101,10 @@ punktfunk's cursor handling lives in `crates/punktfunk-host/src/capture/dxgi.rs`
#### Transfer opportunities
- ✅ **DONE (2026-06-16)** — **Split every cursor shape into an alpha image + an XOR image (two-pass composite)** (sev high, medium) — Refactor convert_pointer_shape in dxgi.rs to return two optional images (alpha, xor) mirroring Apollo's split. Store cursor_shape as Option<(alpha, xor)>, upload up to two SRVs in CursorCompositor, and in composite_cursor_gpu run the alpha pass with self.blend then the xor pass with self.blend_invert (skip empties). Drop the single cursor_invert flag.
- **Render the monochrome 'inverse of screen' pixels via the XOR pass instead of dropping them** (sev medium, small) — In convert_pointer_shape's monochrome branch (dxgi.rs:628-654), once the dual-pass split (above) exists, route code (1,1) to the XOR image as white and codes (0,0)/(0,1) to the alpha image as opaque black/white, matching Apollo's case mapping.
- ⊘ **ALREADY-HANDLED (2026-06-16; premise incorrect — DDA returns S_OK on pointer-only updates, punktfunk recomposites)** — **Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame** (sev high, large) — Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
- **Stop baking the cursor destructively into the repeated gpu_copy texture** (sev medium, medium) — Add a clean base texture: CopyResource(duplication -> clean_base), then CopyResource(clean_base -> gpu_copy) and composite onto gpu_copy. Repeat clean_base (cursor-free) plus a re-composite on repeats. Also create the cursor RTV once per gpu_copy and cache it rather than CreateRenderTargetView every composite (dxgi.rs:1181-1184).
- ✅ **Split every cursor shape into an alpha image + an XOR image (two-pass composite)** — SHIPPED (2026-06-16; capture/dxgi.rs). [#13]
- **Render the monochrome 'inverse of screen' pixels via the XOR pass instead of dropping them** — SHIPPED (2026-06-20). [#37]
- ⊘ **Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame** — NOT-A-BUG (2026-06-16; DDA returns S_OK on pointer-only updates and punktfunk recomposites). [#21]
- **Stop baking the cursor destructively into the repeated gpu_copy texture** — SHIPPED (2026-06-20). [#49]
- **Handle rotated outputs in cursor positioning** (sev low, medium) — Read rotation from DXGI_OUTDUPL_DESC.Rotation when opening/rebuilding the duplication (around dxgi.rs:888 and 1298), store it on DuplCapturer, and apply Apollo's rotation transform when computing the NDC rect in CursorCompositor::draw and when sampling the cursor texture in the VS.
- **Validate masked-color mask bytes and log illegal values** (sev low, small) — In the MASKED_COLOR branch of convert_pointer_shape (dxgi.rs:594-627), branch explicitly on mask==0x00 vs mask==0xFF and emit a tracing::warn! once for any other value, matching Apollo's guard, so future cursor-render bugs are observable.
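The masked-color validation item above can be sketched as an explicit three-way classification plus a warn-once latch. This is an illustration under stated assumptions, not the shipped code: `MaskOp`, `classify_mask`, and `should_warn_once` are hypothetical names, and `eprintln!` stands in for the `tracing::warn!` call the item mentions so the example needs only the standard library.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

#[derive(Debug, PartialEq, Eq)]
enum MaskOp {
    /// Mask 0x00: the RGB value replaces the screen pixel.
    Replace,
    /// Mask 0xFF: the RGB value is XORed with the screen pixel.
    Xor,
    /// Any other value is not defined for masked-color shapes.
    Illegal(u8),
}

fn classify_mask(mask: u8) -> MaskOp {
    match mask {
        0x00 => MaskOp::Replace,
        0xFF => MaskOp::Xor,
        other => MaskOp::Illegal(other),
    }
}

static WARNED: AtomicBool = AtomicBool::new(false);

/// Returns true exactly once per process, so a bad cursor shape logs a single line.
fn should_warn_once() -> bool {
    !WARNED.swap(true, Ordering::Relaxed)
}

/// Classifies a mask byte and logs the first illegal value seen.
fn checked_mask(mask: u8) -> MaskOp {
    let op = classify_mask(mask);
    if let MaskOp::Illegal(v) = op {
        if should_warn_once() {
            eprintln!("warning: illegal masked-color mask byte {v:#04x}");
        }
    }
    op
}
```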
@@ -1295,10 +1297,10 @@ punktfunk drives the **raw NVENC API** via `nvidia_video_codec_sdk::{sys, ENCODE
#### Transfer opportunities
- **Add real reference-frame invalidation (RFI) instead of always forcing IDR** (sev high, large) — In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire punktfunk1.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
- **Query nvEncGetEncodeCaps and gate config on real GPU capabilities** (sev medium, medium) — Add a `get_cap(cap: NV_ENC_CAPS) -> i32` helper in nvenc.rs after open_encode_session_ex (using API.get_encode_caps), verify codec_guid is in get_encode_guids, reject out-of-range WxH up front, and use SUPPORT_10BIT_ENCODE / SUPPORT_REF_PIC_INVALIDATION / SUPPORT_CUSTOM_VBV_BUF_SIZE to gate the corresponding config rather than assuming support. Surfaces clear errors instead of opaque InvalidParam.
- **Add real reference-frame invalidation (RFI) instead of always forcing IDR** — SHIPPED (2026-06-20; NVENC impl CI-pending). [#22]
- **Query nvEncGetEncodeCaps and gate config on real GPU capabilities** — SHIPPED (2026-06-20; CI-pending). [#51]
- **Use async encode with a Win32 completion event + timeout** (sev medium, medium) — In nvenc.rs, gate on NV_ENC_CAPS_ASYNC_ENCODE_SUPPORT, create a per-bitstream Win32 Event (windows::Win32::System::Threading::CreateEventW), set init.enableEncodeAsync=1, store the event in `pending`, set pic.completionEvent + lock.doNotWait=1, and in poll() WaitForSingleObject(ev, 100ms) before lock_bitstream — returning a clear timeout error instead of blocking forever.
- **Minimize NvEnc API/struct versions per codec for older-driver compatibility** (sev medium, medium) — Add a `min_api_version(codec)` (v11 for H264/HEVC, v12 for AV1) and a helper that rewrites the version word (and optionally the struct-revision byte) before each NvEnc struct is passed, mirroring nvenc_base.cpp:666-680. Set apiVersion in open_encode_session_ex (nvenc.rs:186) from it. Maximizes driver compatibility for the field.
- **Minimize NvEnc API/struct versions per codec for older-driver compatibility** — OBSOLETE (2026-06-20; handled by the SDK crate). [#53]
- **Add zeroReorderDelay/lookahead-off/lowDelayKeyFrameScale and always emit SDR VUI** (sev low, small) — In init_session set cfg.rcParams.zeroReorderDelay=1, enableLookahead=0, lowDelayKeyFrameScale=1 right after the CBR/VBV block (nvenc.rs:220-227). Add an SDR VUI branch (BT.709 primaries/transfer/matrix, limited range) alongside the existing HDR branch (:243) so every HEVC/H264 stream signals its colorspace.
- **Honor client slices-per-frame and offer NVENC intra-refresh** (sev low, medium) — Thread a slices-per-frame value from session negotiation into NvencD3d11Encoder::open and set hevcConfig/h264Config sliceMode=3 + sliceModeData in init_session; for AV1 set numTileRows/numTileColumns as nearest powers of two. Optionally add an intra-refresh config branch gated on NV_ENC_CAPS_SUPPORT_INTRA_REFRESH as an alternative recovery mode to RFI.
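The dedup/escalate-to-IDR decision mentioned in the RFI item can be modelled without any NVENC calls. The sketch below is an assumption about the shape of that logic, not a transcription of Apollo or nvenc.rs: `RfiTracker`, `RfiAction` and the exact overflow threshold (`last_encoded - first >= dpb_frames`) are hypothetical.

```rust
/// Outcome of one client RFI request (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum RfiAction {
    /// The range is covered by the previous invalidation; nothing to do.
    AlreadyInvalidated,
    /// Ask the encoder to invalidate these frames.
    Invalidate { first: u64, last: u64 },
    /// The range cannot be served from the DPB; fall back to a keyframe.
    ForceIdr,
}

/// Minimal bookkeeping: DPB depth, last encoded frame index, last RFI range.
pub struct RfiTracker {
    pub dpb_frames: u64,
    pub last_encoded: u64,
    pub last_rfi: Option<(u64, u64)>,
}

impl RfiTracker {
    pub fn request(&mut self, first: u64, last: u64) -> RfiAction {
        // A malformed range cannot be honoured; escalate.
        if first > last {
            return RfiAction::ForceIdr;
        }
        // Dedup: a repeat of an already-invalidated range is a no-op.
        if let Some((f, l)) = self.last_rfi {
            if first >= f && last <= l {
                return RfiAction::AlreadyInvalidated;
            }
        }
        // Overflow: the lost frame is older than anything the DPB still holds.
        if self.last_encoded.saturating_sub(first) >= self.dpb_frames {
            self.last_rfi = None;
            return RfiAction::ForceIdr;
        }
        self.last_rfi = Some((first, last));
        RfiAction::Invalidate { first, last }
    }
}
```

In the proposed design, `Invalidate` would map to the per-index encoder call and `ForceIdr` to the existing request_keyframe() fallback.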
@@ -1492,8 +1494,8 @@ punktfunk's SudoVDA backend lives in `crates/punktfunk-host/src/vdisplay/sudovda
- **Detect watchdog ping failures and escalate (re-open the device)** (sev high, medium) — In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in punktfunk1.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
- **Gate on SudoVDA protocol-version compatibility instead of only logging it** (sev medium, small) — In SudoVdaDisplay::new (sudovda.rs:412-432) parse {Major,Minor,Incremental} and compare against a compiled-in EXPECTED_PROTOCOL {Major:0,Minor:2}. If Major differs or our Minor > driver Minor, return Err with a 'driver too old / incompatible — update SudoVDA' message (and a distinct error variant the mgmt API can surface, like Apollo's VirtualDisplayDriverReady in nvhttp.cpp:936).
- **Retry device open with exponential backoff** (sev medium, small) — Wrap open_device in SudoVdaDisplay::new (sudovda.rs:412-413) in a 20→320ms backoff loop matching Apollo; on a session-time re-open after watchdog failure, allow a few retries with ~1s spacing.
- **Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU** (sev high, medium) — Add `const IOCTL_SET_RENDER_ADAPTER: u32 = ctl(0x802);` and a `#[repr(C)] struct SetRenderAdapterParams { luid: LUID }` in sudovda.rs. Before ADD in create() (sudovda.rs:448), enumerate DXGI adapters (reuse capture/dxgi.rs adapter-by-LUID/name helpers) to match the configured/encoder GPU and issue the IOCTL so the IDD's AddOut LUID matches the capture device's adapter.
- **Derive a stable per-client MonitorGuid instead of one global constant** (sev medium, medium) — Pass a client/session identifier into create() (thread it from the m3 handshake) and derive the GUID deterministically from it (e.g. hash the client cert fingerprint into a u128), replacing the constant at sudovda.rs:452-456 and the RemoveParams guid at sudovda.rs:568. Keep a fixed probe GUID for the startup encoder probe like Apollo's PROBE_DISPLAY_UUID.
- **Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU** — SHIPPED (2026-06-20). [#16]
- **Derive a stable per-client MonitorGuid instead of one global constant** — SHIPPED (2026-06-20). [#55]
- **Add millihertz CCD mode-set with ±1 Hz fallback and SDC_SAVE_TO_DATABASE persistence** (sev medium, medium) — In set_active_mode (sudovda.rs:146-265), after the integer DEVMODE attempt add a CCD path: QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS), match the path by GDI name, set sourceMode width/height and targetInfo.refreshRate = {hz,1000}, and call SetDisplayConfig with SDC_APPLY|SDC_USE_SUPPLIED_DISPLAY_CONFIG|SDC_SAVE_TO_DATABASE. Add an alt-rate (±1) retry mirroring virtual_display.cpp:294-300.
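The 20→320 ms backoff from the device-open item above can be sketched as a small generic helper. This is an illustration under assumptions: `retry_with_backoff` is a hypothetical name, the sleep function is injected so the loop can be exercised without waiting, and the real open_device call is represented by the `op` closure.

```rust
use std::time::Duration;

/// Retry `op`, doubling the delay after each failure from `first` up to and
/// including `last`. Returns the last error once the schedule is exhausted.
pub fn retry_with_backoff<T, E>(
    mut op: impl FnMut() -> Result<T, E>,
    first: Duration,
    last: Duration,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut delay = first;
    loop {
        match op() {
            Ok(v) => return Ok(v),
            // The schedule is used up: give the caller the final error.
            Err(e) if delay > last => return Err(e),
            Err(_) => {
                sleep(delay);
                delay *= 2;
            }
        }
    }
}
```

With 20 ms and 320 ms this makes six attempts with sleeps of 20, 40, 80, 160 and 320 ms. Production code would pass `std::thread::sleep` as the sleep function.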
### Windows host: running as SYSTEM, secure-desktop capture, session/desktop switching + D3D recreation, NVIDIA driver prefs (nvprefs), GPU/adapter preference, display isolation, mDNS publish
@@ -1555,7 +1557,7 @@ punktfunk's **secure-desktop / desktop-switch capture recovery is genuinely matu
##### Where punktfunk is weaker / missing / fragile
1. **No real Windows service — relies on a PsExec scheduled task.** The launch chain is a scheduled task → `PsExec64 -s -i 1` → `wscript.exe launch.vbs` → hidden `host-run.cmd` (`design/windows-host.md:78-84`). There is **no `SERVICE_CONTROL_SESSIONCHANGE` relaunch** — the doc even lists it as unimplemented "step 6" (`design/windows-secure-desktop.md:89`). PsExec is a 3rd-party SysInternals tool, not redistributable cleanly, and `-s -i 1` hard-codes session 1. None of the launch scripts (`launch.vbs`, `host-run.cmd`) are checked into the repo (only `scripts/headless/win-build.cmd` exists). This is the single biggest fragility vs Apollo's `sunshinesvc.cpp`.
1. **No real Windows service — relies on a PsExec scheduled task.** The launch chain is a scheduled task → `PsExec64 -s -i 1` → `wscript.exe launch.vbs` → hidden `host-run.cmd` (`design/windows-host.md:78-84`). There is **no `SERVICE_CONTROL_SESSIONCHANGE` relaunch** — the doc even lists it as unimplemented "step 6" (`design/archive/windows-secure-desktop.md:89`). PsExec is a 3rd-party SysInternals tool, not redistributable cleanly, and `-s -i 1` hard-codes session 1. None of the launch scripts (`launch.vbs`, `host-run.cmd`) are checked into the repo (only `scripts/headless/win-build.cmd` exists). This is the single biggest fragility vs Apollo's `sunshinesvc.cpp`.
2. **No nvprefs / NvAPI at all.** `grep` for `nvprefs|NvAPI|DRS_|PREFERRED_PSTATE|DXPRESENT` across the host returns nothing. No PREFERRED_PSTATE_MAX for the encoder, no OGL_CPL_PREFER_DXPRESENT (so GL/Vulkan fullscreen apps may not be capturable via WGC/DDA), and no undo-file crash safety.
3. **No DXGI GPU-preference / output-reparenting hook.** No MinHook of `NtGdiDdDDIGetCachedHybridQueryValue`. On a hybrid/Optimus box DXGI can reparent the SudoVDA output onto the render GPU and break DDA. punktfunk's "search all adapters" partly papers over this but does not prevent the reparenting itself.
4. **mDNS uses the cross-platform `mdns-sd` crate, not Windows-native `DnsServiceRegister`** (`discovery.rs:17`). It works, but it does NOT carry Apollo's RFC-1035 empty-TXT fix — and the GameStream/Moonlight mDNS path on Windows is unverified (`design/windows-host.md:46`). A non-RFC-compliant TXT can be rejected by Apple's resolver.
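The TXT concern in item 4 comes down to one encoding rule: TXT RDATA is a sequence of length-prefixed strings and must contain at least one, so "no attributes" is encoded as a single zero-length string. A minimal sketch of that rule follows; `encode_txt_rdata` is a hypothetical helper, not part of `mdns-sd` or discovery.rs.

```rust
/// Encode TXT RDATA as length-prefixed character-strings.
/// An empty attribute list becomes a single zero-length string (one 0x00
/// byte) rather than zero bytes of RDATA.
pub fn encode_txt_rdata(entries: &[&str]) -> Result<Vec<u8>, String> {
    if entries.is_empty() {
        return Ok(vec![0]);
    }
    let mut out = Vec::new();
    for entry in entries {
        let bytes = entry.as_bytes();
        // Each character-string carries a one-byte length prefix.
        if bytes.len() > 255 {
            return Err(format!("TXT entry too long: {} bytes", bytes.len()));
        }
        out.push(bytes.len() as u8);
        out.extend_from_slice(bytes);
    }
    Ok(out)
}
```

A regression test for option (a) in the transfer list could assert on exactly this property of whatever RDATA the discovery backend emits.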
@@ -1567,12 +1569,12 @@ punktfunk's **secure-desktop / desktop-switch capture recovery is genuinely matu
#### Transfer opportunities
- **Replace the PsExec scheduled-task launch with a real Windows service that relaunches the host on session change** (sev high, large) — Add a small Rust service binary (new crate or punktfunk-host `service` subcommand) using windows::Win32::System::Services (RegisterServiceCtrlHandlerEx, StartServiceCtrlDispatcher) that mirrors sunshinesvc.cpp: WTSGetActiveConsoleSessionId -> DuplicateTokenEx+SetTokenInformation(TokenSessionId) -> CreateProcessAsUserW(lpDesktop=winsta0\\default) into a kill-on-close job, accept SERVICE_ACCEPT_SESSIONCHANGE, and relaunch the host on a genuine console-session change. Ship an installer and drop the PsExec dependency.
- **Replace the PsExec scheduled-task launch with a real Windows service that relaunches the host on session change** — SHIPPED (2026-06-20). [#24]
- **Add an NvAPI driver-settings manager (PREFERRED_PSTATE_MAX + OGL_CPL_PREFER_DXPRESENT) with a crash-safe undo file** (sev medium, large) — Add a windows-only nvprefs module wrapping NvAPI DRS (load nvapi64 dynamically, treat NvAPI_Initialize failure as 'no NVIDIA, skip'). Create a 'punktfunk' app profile with PREFERRED_PSTATE_PREFER_MAX, set OGL_CPL_PREFER_DXPRESENT_ENABLED on the base profile behind a config flag, write an undo file under %ProgramData%\\punktfunk before global changes, and call it on session start (the new stream_will_start hook below).
- **Hook win32u!NtGdiDdDDIGetCachedHybridQueryValue to stop DXGI output-reparenting on hybrid/Optimus GPUs** (sev medium, medium) — Add a once-init in the Windows capture path (capture/dxgi.rs open) that installs the same hook via a minhook-rs/detour crate (or a manual IAT/inline hook) on NtGdiDdDDIGetCachedHybridQueryValue forcing STATE_UNSPECIFIED, plus SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2). Gate it to NVIDIA/hybrid boxes; it's process-lifetime so no teardown needed.
- **Hook win32u!NtGdiDdDDIGetCachedHybridQueryValue to stop DXGI output-reparenting on hybrid/Optimus GPUs** — SHIPPED (2026-06-20). [#57]
- **Add a Windows stream_will_start/stop hook: timer resolution, MMCSS, HIGH_PRIORITY_CLASS, display-required, headless Mouse Keys** (sev medium, medium) — Add a windows-only RAII guard invoked when a session starts (punktfunk1.rs/pipeline session setup) that raises timer resolution (NtSetTimerResolution or timeBeginPeriod(1)), DwmEnableMMCSS(true), SetPriorityClass(HIGH_PRIORITY_CLASS), and wraps the DXGI capture loop in SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED) (capture/dxgi.rs next_frame loop), reverting on drop. Optionally the headless Mouse-Keys trick for cursor visibility.
- **Use Windows-native DnsServiceRegister (or fix the TXT record) so Apple's mDNS resolver accepts the host** (sev low, medium) — Either (a) verify mdns-sd always emits an RFC-1035-valid TXT (never zero strings) and add a regression test, or (b) add a windows-only discovery backend using DnsServiceRegister via the windows crate's DNS APIs mirroring publish.cpp, including the single-empty-TXT workaround, so Apple NWBrowser/Moonlight discover the host reliably.
- **Add per-frame IDXGIFactory::IsCurrent reinit detection and switch the host clock to GetSystemTimePreciseAsFileTime** (sev medium, small) — In capture/dxgi.rs next_frame, query the cached IDXGIFactory's IsCurrent() once per loop and trigger the existing recreate path when it goes false (catches HDR/topology changes cleanly). Replace now_ns() on Windows with GetSystemTimePreciseAsFileTime converted to Unix-epoch ns so ClockProbe/ClockEcho skew correction stays accurate cross-machine.
- **Use Windows-native DnsServiceRegister (or fix the TXT record) so Apple's mDNS resolver accepts the host** — SHIPPED (2026-06-20). [#87]
- **Add per-frame IDXGIFactory::IsCurrent reinit detection and switch the host clock to GetSystemTimePreciseAsFileTime** — SHIPPED (2026-06-20). [#42]
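The clock half of the last item is plain arithmetic: a FILETIME counts 100 ns ticks since 1601-01-01 UTC, and the Unix epoch sits 11,644,473,600 seconds later. The sketch below shows only that conversion; `filetime_to_unix_ns` is a hypothetical name, and the two `u32` halves stand in for the `dwHighDateTime`/`dwLowDateTime` fields that GetSystemTimePreciseAsFileTime fills in.

```rust
/// 100 ns ticks between 1601-01-01 and 1970-01-01 (11_644_473_600 s).
const FILETIME_UNIX_EPOCH_TICKS: u64 = 116_444_736_000_000_000;

/// Convert the two halves of a FILETIME into nanoseconds since the Unix
/// epoch. Returns None for timestamps before 1970 or on overflow.
pub fn filetime_to_unix_ns(high: u32, low: u32) -> Option<u64> {
    let ticks = ((high as u64) << 32) | low as u64;
    ticks
        .checked_sub(FILETIME_UNIX_EPOCH_TICKS)?
        .checked_mul(100)
}
```

Returning `Option` keeps a bogus pre-1970 reading from wrapping into a huge timestamp that would poison skew correction.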
### Completeness critic — areas flagged as under-covered
@@ -1769,18 +1771,10 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
#### 1. Switch SendInput to retry-on-failure desktop reattach (drop per-event OpenInputDesktop)
*Area:* `cmp:input` · *Windows-host:* yes · *Severity:* high · *Effort:* small
- **Apollo does:** send_input() / inject_synthetic_pointer_input() call SendInput FIRST, and only on failure (0 injected) re-run syncThreadDesktop() (OpenInputDesktop(DF_ALLOWOTHERACCOUNTHOOK)+SetThreadDesktop) and retry once, tracking the desktop in a thread_local _lastKnownInputDesktop — src/platform/windows/input.cpp:477,499 + src/platform/windows/misc.cpp:251
- **punktfunk gap:** SendInputInjector::inject() calls reattach_input_desktop() (an OpenInputDesktop+SetThreadDesktop+CloseDesktop) at the TOP of EVERY event — crates/punktfunk-host/src/inject/sendinput.rs:97,50-69. This is a syscall triple per mouse-move; punktfunk's own design/windows-secure-desktop.md:78-80 lists this exact refactor (step 2) as planned but unshipped.
- **Proposal:** Inject first; cache the HDESK thread-local; only on a 0/partial SendInput result call reattach_input_desktop() and retry once. Use DF_ALLOWOTHERACCOUNTHOOK in the OpenInputDesktop access (sendinput.rs:52-56 currently passes DESKTOP_CONTROL_FLAGS(0)) so the secure desktop is reachable. Keeps the steady-state hot path to a single SendInput call.
**SHIPPED (2026-06-20)** — per-event OpenInputDesktop dropped for inject-first + retry-on-failure desktop reattach.
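The inject-first control flow can be shown with the OS calls abstracted away. This is a sketch under assumptions: `inject_with_reattach` is a hypothetical name, `send` stands in for a SendInput call returning the number of events injected, and `reattach` stands in for reattach_input_desktop().

```rust
/// Try to inject first; only when the OS reports fewer events than requested
/// reattach to the current input desktop and retry exactly once.
pub fn inject_with_reattach(
    events: usize,
    mut send: impl FnMut() -> usize,
    mut reattach: impl FnMut() -> bool,
) -> bool {
    // Steady state: one call, no desktop syscalls.
    if send() == events {
        return true;
    }
    // Failure path: the input desktop probably changed under us.
    if !reattach() {
        return false;
    }
    send() == events
}
```

The steady-state path costs a single `send` call; the desktop reattach is paid only after a failed injection.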
#### 2. Detect resolution/format change on the acquire hot path, not only during rebuild
*Area:* `win:capture-dxgi-dd` · *Windows-host:* yes · *Severity:* high · *Effort:* small
- **Apollo does:** Every frame Apollo reads src->GetDesc() and reinits if desc.Width/Height != width_before_rotation/height_before_rotation or capture_format != desc.Format (display_vram.cpp:1215-1236, display_ram.cpp:253-265, wgc 1662-1674).
- **punktfunk gap:** punktfunk only re-reads dimensions inside recreate_dupl (dxgi.rs:1298-1313). On the normal acquire path (dxgi.rs:1426-1492) it never validates the acquired texture's desc, so a mode change that doesn't raise ACCESS_LOST leads to CopyResource of a mismatched-size/format source into a stale gpu_copy/staging/fp16_src — silent corruption or a hard copy failure.
- **Proposal:** In acquire(), after res.cast::<ID3D11Texture2D>(), call GetDesc and compare Width/Height/Format against self.width/height and the expected format (BGRA8 vs R16G16B16A16_FLOAT). On mismatch, ReleaseFrame and run the existing recreate_dupl path (or drop gpu_copy/staging/fp16/hdr10 textures and update width/height/hdr_fp16) so the encoder re-inits cleanly. This makes live resolution + HDR-toggle changes robust even when DDA doesn't fault.
**SHIPPED (2026-06-20)** — acquire-path GetDesc check now catches resolution/format changes that don't raise ACCESS_LOST.
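The acquire-path comparison reduces to checking the acquired texture's description against what the pipeline was built for. A minimal sketch, with hypothetical `FrameDesc`, `PixelFormat` and `AcquireCheck` types standing in for the D3D11 texture description and the capturer's cached state:

```rust
/// The two capture formats named in the proposal.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PixelFormat {
    Bgra8,
    Rgba16Float,
}

/// The subset of a texture description that matters for reinit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct FrameDesc {
    pub width: u32,
    pub height: u32,
    pub format: PixelFormat,
}

/// Result of validating an acquired frame against the expected description.
#[derive(Debug, PartialEq, Eq)]
pub enum AcquireCheck {
    Unchanged,
    SizeChanged,
    FormatChanged,
}

pub fn check_acquired(expected: FrameDesc, got: FrameDesc) -> AcquireCheck {
    if (got.width, got.height) != (expected.width, expected.height) {
        AcquireCheck::SizeChanged
    } else if got.format != expected.format {
        AcquireCheck::FormatChanged
    } else {
        AcquireCheck::Unchanged
    }
}
```

Anything other than `Unchanged` would release the frame and take the existing recreate path before any copy touches the stale textures.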
#### 3. Per-frame IsCurrent() check to catch HDR/GPU/mode changes
*Area:* `win:capture-wgc` · *Windows-host:* yes · *Severity:* high · *Effort:* small
@@ -1790,36 +1784,13 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** Hold an IDXGIFactory1 in WgcCapturer (from the same adapter as make_device) and call IsCurrent() at the top of next_frame/wait_and_drain; on false, return the reinit signal. This pairs with wgc-size-format-reinit to give a complete change-detection story.
#### 4. Batched/GSO send for the GameStream video plane on Windows
*Area:* `cmp:protocol-streaming` · *Windows-host:* yes · *Severity:* high · *Effort:* medium · **✓ verified · ✅ DONE (2026-06-16)**
> **Resolution:** Implemented per the refined proposal. Added a reusable Windows-only
> `punktfunk_core::transport::send_uso_all(&UdpSocket, &[&[u8]]) -> io::Result<usize>` that reuses the
> native plane's proven `send_one_uso` + `uso` on/off latch + `uso_unsupported`, with the same
> uniform-size guard and ≤512-segment chunking. `gamestream/stream.rs` `sendmmsg_all` now has a
> `#[cfg(target_os="windows")]` arm that calls it per 16-packet paced burst (one `WSASendMsg` instead
> of 16 `send`s) and sends any remainder scalar; the Linux `sendmmsg` arm and a generic scalar arm are
> unchanged. PUNKTFUNK_GSO=0 kill-switch + auto-fallback inherited. Linux build unaffected;
> punktfunk-core type-checks for x86_64-pc-windows-msvc. Host Windows compile deferred to CI/dev box.
- **Apollo does:** Apollo sends every plane through platf::send_batch / send (one code path for all OSes; on Windows it uses real batched socket writes), and the video broadcast thread is the single transmit path (stream.cpp:1327, send batching at stream.cpp:1337 send_batch latency logger).
- **punktfunk gap:** The GameStream video sender's batched path is Linux-only: sendmmsg_all has a #[cfg(target_os="linux")] real implementation (stream.rs:147) and a #[cfg(not(target_os="linux"))] fallback that does one sock.send() per packet (stream.rs:185-191). On a Windows GameStream-compat host (capture IS wired for Windows via DXGI/WGC, capture.rs:261) every video datagram is an individual syscall — the native punktfunk/1 plane got Windows USO (transport/udp.rs:135) but the GameStream plane did not.
- **Proposal:** Route the GameStream video send thread through the same Windows WSASendMsg/USO + WSASend-batch path the native plane already implements in punktfunk-core transport/udp.rs (or factor that send helper into a shared module and call it from gamestream/stream.rs). Keeps GameStream-on-Windows from being syscall-bound at high bitrate.
- **Verify verdict:** `confirmed_gap` — PUNKTFUNK gap is real. The GameStream video send path uses a private `sendmmsg_all`: real `sendmmsg` only under `#[cfg(target_os="linux")]` (crates/punktfunk-host/src/gamestream/stream.rs:147-181), and a `#[cfg(not(target_os="linux"))]` fallback that does one `sock.send(p)` per packet (stream.rs:185-191). The paced sender calls it in PACE_CHUNK=16 bursts (stream.rs:230). It operates on a raw `std::net::UdpSocket` (stream.rs:66, cloned at :310), NOT the core `Transport` trait, so it does NOT pick up the native plane's USO. The GameStream host genuinely runs on Windows: `serve`/`gamestream` are not OS-gated (main.rs:81-83 dispatch is uncfg'd; gamestream/mod.rs declares `mod stream;` with no cfg), capture is wired for Windows (capture.rs:261-279 `capture_virtual_output` via SudoVDA+WGC/DXGI), and the module has explicit Windows handling (gamestream/mod.rs:209-210 APPDATA, :216-217 COMPUTERNAME). So on a Windows GameStream-compat host every video datagram is its own syscall. Meanwhile the native plane already has the answer: crates/punktfunk-core/src/transport/udp.rs:141-246 (`uso` state + `send_one_uso` via `WSASendMsg`+`UDP_SEND_MSG_SIZE`), wired default-on at udp.rs:610-647 (`send_gso`), called by session.rs:182. Also note GameStream video datagrams are uniform `blocksize` (= packet_size+16): data shards, the zero-padded last data shard, and FEC parity shards are all full blocksize (gamestream/video.rs:41-42,76,111-166) — the exact uniform-size precondition USO/GSO needs. APOLLO confirms the claimed unified path: `platf::send_batch` (src/platform/common.h:697) is the single video transmit call (src/stream.cpp:1598, in videoBroadcastThread, latency-logged at stream.cpp:1337); its Windows impl is real USO — `WSASendMsg` with a `UDP_SEND_MSG_SIZE` cmsg of `header_size+payload_size` (src/platform/windows/misc.cpp:1408,1499,1508), with a per-packet `send()` fallback (misc.cpp:1510-1587) "if USO is not supported ... 
caller will fall back to unbatched sends" (misc.cpp:1504-1505).
- **Refined:** Route the GameStream Windows video send through USO instead of per-packet `send`. Do NOT duplicate the WSASendMsg code — factor the native plane's USO helper out of `UdpTransport`. Extract `send_one_uso` + the `uso` enable/latch state + `uso_unsupported` + the uniform-size chunking loop (currently udp.rs:185-246 and the `send_gso` Windows body udp.rs:610-647) into a small `pub(crate)` free function in punktfunk-core, e.g. `transport::udp::send_packets_uso(socket: &UdpSocket, packets: &[&[u8]]) -> io::Result<usize>` that takes a raw connected `std::net::UdpSocket` (the GameStream sender already owns one) and applies USO with the same default-on + auto-fallback-to-per-packet + PUNKTFUNK_GSO=0 kill-switch semantics. Then rewrite gamestream/stream.rs `sendmmsg_all` so the `#[cfg(target_os="windows")]` arm calls that helper (the Linux arm keeps its sendmmsg; a `not(any(linux,windows))` arm keeps the scalar loop). GameStream packets are already uniform blocksize per the packetizer, so the USO uniform-size guard passes; the existing PACE_CHUNK=16 microburst pacing is unaffected (each chunk becomes one WSASendMsg). Add a Linux GSO arm too while there (same helper pattern) for parity, but USO/Windows is the point of this item. Keep the change inside punktfunk-core for the helper (one core, C-ABI-stable — no new public ABI surface needed, it's pub(crate)) and a ~10-line edit in the host. This respects: no async on frame path (native sockets only), no protocol change, no scaling change.
**SHIPPED (2026-06-16)** — Windows USO batched send for the GameStream video plane via the reusable `punktfunk_core::transport::send_uso_all` helper (one WSASendMsg per 16-packet paced burst, PUNKTFUNK_GSO=0 kill-switch + auto-fallback); Host Windows compile CI-pending.
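The uniform-size guard and segment cap that the helper relies on can be sketched as a pure batching step, separate from the socket call. This is an illustration only: `uso_batches` is a hypothetical name and is not the body of `send_uso_all`; it shows how a packet list splits into runs that a single segmented send could carry.

```rust
/// Split `packets` into consecutive runs where every packet has the same
/// length and no run exceeds `max_segments`. Each run is a candidate for one
/// segmented send; single-packet runs would go out as ordinary sends.
pub fn uso_batches<'a, 'b>(
    packets: &'a [&'b [u8]],
    max_segments: usize,
) -> Vec<&'a [&'b [u8]]> {
    assert!(max_segments > 0, "max_segments must be non-zero");
    let mut out = Vec::new();
    let mut start = 0;
    while start < packets.len() {
        let size = packets[start].len();
        let mut end = start + 1;
        // Extend the run while the size stays uniform and the cap allows it.
        while end < packets.len()
            && end - start < max_segments
            && packets[end].len() == size
        {
            end += 1;
        }
        out.push(&packets[start..end]);
        start = end;
    }
    out
}
```

Because GameStream video datagrams are all `blocksize` long, a 16-packet paced burst forms one run under the 512-segment cap described above.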
#### 5. Gate the GameStream HTTPS plane on the paired-cert allow-list
*Area:* `cmp:gamestream-http-pairing` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** Apollo defers TLS verification (nvhttp.cpp:88 sets verify_peer|verify_fail_if_no_peer_cert with a permissive OpenSSL cb, then the accept() override runs cert_chain.verify() post-handshake and stashes the matched named_cert_t into request->userp; every authenticated handler calls get_verified_cert(request) — nvhttp.cpp:665-667,915,1086,1172,1360 — so an unpaired cert is rejected with a proper XML body, not just accepted).
- **punktfunk gap:** punktfunk pins the client cert at pairing (pairing.rs:230-236) and loads it into AppState.paired (mod.rs:134) but NEVER consults it: tls.rs:38-45 verify_client_cert always returns assertion(), and /launch (nvhttp.rs:87-109) does no identity check. Any client that completed a TLS handshake — paired or not — can launch a session.
- **Proposal:** After the handshake, recover the peer cert (axum_server exposes the rustls connection / peer certs), SHA-256 it, and check it against AppState.paired in /launch, /resume, /applist, /cancel (and reflect the real result in serverinfo PairStatus). Keep verify_client_cert lenient for the handshake but reject unpaired identities at the handler with an XML error, mirroring Apollo's get_verified_cert pattern. This is the single highest-value GameStream-compat hardening item and applies equally to the Windows host.
**SHIPPED (2026-06-20)** — gamestream/tls.rs surfaces the verified peer cert (PeerCertFingerprint) and nvhttp.rs gates /launch /resume /applist /cancel on the paired-fingerprint set (closes the "any TLS client can launch" hole).
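The handler-level gate can be reduced to a set lookup on the peer certificate's SHA-256 fingerprint. The sketch below assumes the fingerprint has already been computed from the TLS connection; `require_paired` and `AuthError` are hypothetical names, not the types in nvhttp.rs.

```rust
use std::collections::HashSet;

/// SHA-256 of the peer certificate, computed elsewhere.
pub type Fingerprint = [u8; 32];

/// Why a request was refused (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum AuthError {
    /// The client presented no certificate at all.
    NoClientCert,
    /// The certificate is valid TLS-wise but was never paired.
    NotPaired,
}

/// Gate for authenticated handlers: only paired fingerprints pass.
pub fn require_paired(
    paired: &HashSet<Fingerprint>,
    peer: Option<&Fingerprint>,
) -> Result<(), AuthError> {
    match peer {
        None => Err(AuthError::NoClientCert),
        Some(fp) if paired.contains(fp) => Ok(()),
        Some(_) => Err(AuthError::NotPaired),
    }
}
```

Each gated handler would call this first and turn an `Err` into the XML error body, so the handshake can stay lenient while the endpoints do not.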
#### 6. Query NVENC encode capabilities before init and degrade gracefully
*Area:* `cmp:video-encode` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** nvenc_base.cpp:175-220 builds a get_encoder_cap lambda over nvEncGetEncodeCaps and checks NV_ENC_CAPS_WIDTH_MAX/HEIGHT_MAX (rejects with a clear message), SUPPORT_10BIT_ENCODE, SUPPORT_YUV444_ENCODE, SUPPORT_REF_PIC_INVALIDATION (toggles encoder_params.rfi), SUPPORT_CUSTOM_VBV_BUF_SIZE (nvenc_base.cpp:250-255), SUPPORT_CABAC (nvenc_base.cpp:311-315), SUPPORT_WEIGHTED_PREDICTION (nvenc_base.cpp:220), and SUPPORT_INTRA_REFRESH/SINGLE_SLICE_INTRA_REFRESH (nvenc_base.cpp:334-345). Each missing cap downgrades a feature instead of failing.
- **punktfunk gap:** crates/punktfunk-host/src/encode/nvenc.rs:131-323 init_session never calls nvEncGetEncodeCaps. Max W/H is only checked against a static per-codec constant (encode.rs:57-62) not the GPU's real cap; 10-bit Main10 is forced (nvenc.rs:233-237) without checking SUPPORT_10BIT_ENCODE; custom VBV (nvenc.rs:224-227) is set without checking SUPPORT_CUSTOM_VBV_BUF_SIZE. On an unsupported card these surface as opaque InvalidParam handled only by bitrate step-down, which masks the real cause.
- **Proposal:** Add a caps query in NvencD3d11Encoder::init_session right after open_encode_session_ex: build a get_cap(NV_ENC_CAPS) helper over nvEncGetEncodeCaps, validate encodeWidth/Height against WIDTH_MAX/HEIGHT_MAX with a clear error, gate the 10-bit path on SUPPORT_10BIT_ENCODE (fall back to 8-bit with a warning instead of failing), gate custom VBV on SUPPORT_CUSTOM_VBV_BUF_SIZE, and record an rfi-supported flag for the RFI work below.
**SHIPPED (2026-06-20)** — encode/nvenc.rs query_caps probes nvEncGetEncodeCaps and degrades gracefully (over-range reject, 10-bit→8-bit fallback, custom-VBV gate, RFI flag); Windows compile CI-pending.
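The degrade-instead-of-fail policy can be expressed as a pure function from queried capabilities to a session plan. This is a sketch under assumptions: `GpuCaps`, `Request`, `Plan` and `plan_session` are hypothetical, and the capability values would come from the real caps query rather than being constructed by hand.

```rust
/// Capabilities as reported by the encoder (hypothetical mirror of the caps).
#[derive(Debug, Clone, Copy)]
pub struct GpuCaps {
    pub width_max: u32,
    pub height_max: u32,
    pub ten_bit: bool,
    pub custom_vbv: bool,
    pub rfi: bool,
}

/// What the session asked for.
#[derive(Debug, Clone, Copy)]
pub struct Request {
    pub width: u32,
    pub height: u32,
    pub ten_bit: bool,
    pub custom_vbv: bool,
}

/// What the session actually gets, plus the downgrades that were applied.
#[derive(Debug, PartialEq, Eq)]
pub struct Plan {
    pub ten_bit: bool,
    pub custom_vbv: bool,
    pub rfi: bool,
    pub warnings: Vec<String>,
}

pub fn plan_session(caps: GpuCaps, req: Request) -> Result<Plan, String> {
    // Out-of-range size is the one hard failure, reported with a clear cause.
    if req.width > caps.width_max || req.height > caps.height_max {
        return Err(format!(
            "{}x{} exceeds encoder maximum {}x{}",
            req.width, req.height, caps.width_max, caps.height_max
        ));
    }
    let mut warnings = Vec::new();
    let ten_bit = req.ten_bit && caps.ten_bit;
    if req.ten_bit && !ten_bit {
        warnings.push("10-bit unsupported, falling back to 8-bit".to_string());
    }
    let custom_vbv = req.custom_vbv && caps.custom_vbv;
    if req.custom_vbv && !custom_vbv {
        warnings.push("custom VBV unsupported, using encoder default".to_string());
    }
    Ok(Plan { ten_bit, custom_vbv, rfi: caps.rfi, warnings })
}
```

Keeping the policy pure makes it testable on the Linux dev box even though the caps query itself only runs on Windows.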
#### 7. Detect default-render-device changes and reinit WASAPI capture
*Area:* `cmp:audio` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
@@ -1829,11 +1800,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** In wasapi_cap.rs, register a device-notification callback on the DeviceEnumerator; on default-render change, break the capture loop and reopen get_default_device(Render) + a fresh loopback IAudioClient (re-running the init block at wasapi_cap.rs:105-133). Surface it through the existing thread without tearing down the WasapiLoopbackCapturer handle so the session keeps streaming.
#### 8. Move GameStream input injection off the ENet service thread
*Area:* `cmp:input` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** The control thread only enqueues bytes + schedules a task; a pool thread pops one packet, batches later same-type packets while holding the queue lock, then RELEASES the lock before the (slow) SendInput/ViGEm call — src/input.cpp:1481-1520, 1639-1643. A slow OS input call never stalls the network thread.
- **punktfunk gap:** on_receive() calls inj.inject(&ev) synchronously inside the host.service() ENet loop — crates/punktfunk-host/src/gamestream/control.rs:84-91,207-211. A SendInput that blocks crossing a desktop switch (or a slow ViGEm update) head-blocks ENet handshake/keepalive/retransmit servicing. The m3 path already does this right (punktfunk1.rs:1300 → injector_service_thread).
- **Proposal:** Mirror the m3 design in the GameStream control thread: push decoded InputEvents onto an mpsc channel drained by a dedicated injector thread (reuse injector_service_thread or a sibling), so the ENet thread never blocks on SendInput/ViGEm. No async needed — native thread + std::sync::mpsc, consistent with the invariant.
**SHIPPED (2026-06-20)** — on_receive forwards to a shared crate::inject InjectorService thread (+ relative-mouse/scroll coalescing, #45); the ENet thread no longer blocks on injection.
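The hand-off described above, a native thread draining a `std::sync::mpsc` channel and coalescing relative mouse motion, can be sketched as follows. `InputEvent`, `coalesce` and `spawn_injector` here are simplified stand-ins, not the types in crate::inject; the `inject` closure represents the slow SendInput/ViGEm call.

```rust
use std::sync::mpsc;
use std::thread;

/// Simplified input event (hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputEvent {
    MouseRel { dx: i32, dy: i32 },
    Key { code: u16, down: bool },
}

/// Merge runs of adjacent relative-mouse events into one; other events keep
/// their order and act as barriers.
pub fn coalesce(events: Vec<InputEvent>) -> Vec<InputEvent> {
    let mut out: Vec<InputEvent> = Vec::with_capacity(events.len());
    for ev in events {
        if let (Some(InputEvent::MouseRel { dx, dy }), InputEvent::MouseRel { dx: ndx, dy: ndy }) =
            (out.last_mut(), ev)
        {
            *dx = dx.saturating_add(ndx);
            *dy = dy.saturating_add(ndy);
            continue;
        }
        out.push(ev);
    }
    out
}

/// Spawn the injector thread. The network thread only calls `send` on the
/// returned sender, which never blocks on the OS input call.
pub fn spawn_injector(
    mut inject: impl FnMut(InputEvent) + Send + 'static,
) -> (mpsc::Sender<InputEvent>, thread::JoinHandle<()>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        // Block for one event, then drain whatever else is already queued.
        while let Ok(first) = rx.recv() {
            let mut batch = vec![first];
            batch.extend(rx.try_iter());
            for ev in coalesce(batch) {
                inject(ev);
            }
        }
    });
    (tx, handle)
}
```

Dropping the last sender ends the loop, so session teardown is a `drop` followed by a `join`.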
#### 9. Actually launch the app/game on Windows (CreateProcessAsUserW into the user session)
*Area:* `cmp:process-launch` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
@@ -1864,18 +1831,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** In WgcCapturer::process_frame, call src.GetDesc() and compare Width/Height/Format against self.width/height and the expected format. On mismatch, return a Reinit error (add a capture_e::Reinit-equivalent to the Capturer contract or bail with a recognizable error the m3/stream loop maps to a capturer rebuild). Drop and re-create fp16_src/hdr10_out/bgra_copy when size changes.
#### 13. Split every cursor shape into an alpha image + an XOR image (two-pass composite)
*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* medium · **✅ DONE (2026-06-16)**
> **Resolution:** Implemented in `capture/dxgi.rs`. `convert_pointer_shape` now returns a `CursorShape`
> with optional `alpha`/`xor` layers; `CursorCompositor` holds `tex_alpha`/`tex_xor` and `draw_layer`
> renders each with its own blend (alpha = src-over + HDR scale; XOR = inversion, unscaled). MASKED_COLOR
> opaque pixels now go through the alpha pass (not the invert blend), and MONOCHROME `(1,1)` invert pixels
> now feed the XOR layer (previously approximated as solid black). CPU path blends both layers too.
> The `cursor_invert` flag was removed. Independently reviewed (ship); pending Windows CI/dev-VM compile.
- **Apollo does:** Apollo emits two BGRA images per shape — make_cursor_alpha_image (display_vram.cpp:279) and make_cursor_xor_image (display_vram.cpp:210) — and runs both an alpha-blend pass and an invert-blend pass in blend_cursor (display_vram.cpp:1448-1469), each skipped if its image is empty. MASKED_COLOR and MONOCHROME shapes legitimately need both.
- **punktfunk gap:** convert_pointer_shape (dxgi.rs:566) produces ONE image and cursor_invert (dxgi.rs:1133-1134) picks ONE blend for the whole shape, so a cursor mixing opaque and screen-inverting pixels (common I-beams and themed arrows) renders wrong; masked-color opaque pixels are even forced through the invert blend (dxgi.rs:612-624 + 1205).
- **Proposal:** Refactor convert_pointer_shape in dxgi.rs to return two optional images (alpha, xor) mirroring Apollo's split. Store cursor_shape as Option<(alpha, xor)>, upload up to two SRVs in CursorCompositor, and in composite_cursor_gpu run the alpha pass with self.blend then the xor pass with self.blend_invert (skip empties). Drop the single cursor_invert flag.
**SHIPPED (2026-06-16)** — two-pass cursor composite in capture/dxgi.rs (CursorShape alpha/xor layers, CursorCompositor draw_layer; MASKED_COLOR→alpha, MONOCHROME (1,1)→XOR; cursor_invert flag removed). Windows CI/dev-VM compile pending.
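For the MONOCHROME case, the alpha/XOR split follows directly from the AND/XOR mask truth table: (0,0) is opaque black, (0,1) is opaque white, (1,0) is transparent, and (1,1) inverts the screen. The sketch below works on already-unpacked per-pixel bits; `split_monochrome` and `CursorLayers` are hypothetical and much simpler than `convert_pointer_shape`, which also has to unpack the 1bpp buffer and handle the other shape types.

```rust
/// One BGRA pixel.
pub type Bgra = [u8; 4];

const TRANSPARENT: Bgra = [0, 0, 0, 0];
const BLACK: Bgra = [0, 0, 0, 255];
const WHITE: Bgra = [255, 255, 255, 255];

/// The two optional layers; a layer with no contributing pixel is None so
/// its pass can be skipped.
#[derive(Debug, PartialEq, Eq)]
pub struct CursorLayers {
    pub alpha: Option<Vec<Bgra>>,
    pub xor: Option<Vec<Bgra>>,
}

/// Split per-pixel AND/XOR mask bits into an alpha layer and an XOR layer.
pub fn split_monochrome(and_mask: &[bool], xor_mask: &[bool]) -> CursorLayers {
    assert_eq!(and_mask.len(), xor_mask.len(), "mask lengths must match");
    let mut alpha = vec![TRANSPARENT; and_mask.len()];
    let mut xor = vec![TRANSPARENT; and_mask.len()];
    let (mut any_alpha, mut any_xor) = (false, false);
    for (i, (&a, &x)) in and_mask.iter().zip(xor_mask).enumerate() {
        match (a, x) {
            (false, false) => { alpha[i] = BLACK; any_alpha = true; }
            (false, true) => { alpha[i] = WHITE; any_alpha = true; }
            (true, false) => {} // screen shows through
            // XOR with white inverts the screen pixel.
            (true, true) => { xor[i] = WHITE; any_xor = true; }
        }
    }
    CursorLayers {
        alpha: any_alpha.then_some(alpha),
        xor: any_xor.then_some(xor),
    }
}
```

An I-beam is the typical shape where both layers come back populated, which is the case a single whole-shape blend flag cannot represent.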
#### 14. Map absolute mouse through the real virtual-desktop / output rect, not a blind 0..65535 normalize
*Area:* `win:input-sendinput-vigem` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
@@ -1892,11 +1848,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** In the pinger thread in sudovda.rs (around 485-494), track a consecutive-failure counter; after N (3) failures set a shared AtomicBool 'driver_dead' on SudoVdaDisplay/keepalive and stop pinging. Surface it so the session loop in punktfunk1.rs treats a dead virtual display like ACCESS_LOST and re-opens (re-run open_device + re-create). Add a DriverStatus enum mirroring Apollo's DRIVER_STATUS.
#### 16. Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU
*Area:* `win:virtual-display-sudovda` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
- **Apollo does:** setRenderAdapterByName enumerates DXGI adapters, matches desc.Description, and issues SET_RENDER_ADAPTER with that adapter's LUID before every create (virtual_display.cpp:624-654, sudovda.h:109-128, called at main.cpp:369-371 and process.cpp:250-252).
- **punktfunk gap:** punktfunk defines no IOCTL_SET_RENDER_ADAPTER and never binds the render adapter (sudovda.rs:47-54). On a hybrid/multi-GPU box the IDD may render on the iGPU while NVENC + Desktop Duplication run on the dGPU, breaking or slowing zero-copy.
- **Proposal:** Add `const IOCTL_SET_RENDER_ADAPTER: u32 = ctl(0x802);` and a `#[repr(C)] struct SetRenderAdapterParams { luid: LUID }` in sudovda.rs. Before ADD in create() (sudovda.rs:448), enumerate DXGI adapters (reuse capture/dxgi.rs adapter-by-LUID/name helpers) to match the configured/encoder GPU and issue the IOCTL so the IDD's AddOut LUID matches the capture device's adapter.
**SHIPPED (2026-06-20)** — SET_RENDER_ADAPTER (IOCTL 0x802) now binds the IDD render GPU to the capture/encode adapter on hybrid/multi-GPU boxes.
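For reference, a sketch of how the control code and parameter struct from the proposal could be laid out. It assumes `ctl` follows the standard Windows `CTL_CODE` bit layout with `FILE_DEVICE_UNKNOWN`, `METHOD_BUFFERED`, and `FILE_ANY_ACCESS`; the source does not show the body of `ctl`, so that choice is an assumption, and `Luid` here is a local stand-in for the Windows `LUID` type.

```rust
// Standard Windows CTL_CODE inputs (assumed to be what `ctl` uses).
const FILE_DEVICE_UNKNOWN: u32 = 0x22;
const METHOD_BUFFERED: u32 = 0;
const FILE_ANY_ACCESS: u32 = 0;

/// CTL_CODE(DeviceType, Function, Method, Access) bit layout.
const fn ctl(function: u32) -> u32 {
    (FILE_DEVICE_UNKNOWN << 16) | (FILE_ANY_ACCESS << 14) | (function << 2) | METHOD_BUFFERED
}

pub const IOCTL_SET_RENDER_ADAPTER: u32 = ctl(0x802);

/// Local stand-in with the same layout as the Windows LUID (u32 low, i32 high).
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Luid {
    pub low_part: u32,
    pub high_part: i32,
}

/// Input buffer for the IOCTL: just the adapter LUID to render on.
#[repr(C)]
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SetRenderAdapterParams {
    pub luid: Luid,
}
```

Under these assumptions the code evaluates to `0x0022_2008`, and the parameter struct is 8 bytes, matching a bare LUID.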
#### 17. Add streaming_will_start/stop session-level latency tuning on Windows
*Area:* `win:critic` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
@@ -1913,43 +1865,16 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** On the capture thread, register an IMMNotificationClient (or poll GetDefaultAudioEndpoint) and treat a default-render change OR a device-invalidated error as a re-open: tear down the IAudioClient and re-acquire the new default endpoint in-place, like the Linux PipeWire reconnect discipline. Lives entirely in audio/wasapi_cap.rs.
#### 19. Implement true reference-frame invalidation with a multi-ref DPB instead of always-full-IDR
*Area:* `cmp:video-encode` · *Windows-host:* yes · *Severity:* high · *Effort:* large
- **Apollo does:** nvenc_base.cpp:268-281 sets maxNumRefFrames/maxNumRefFramesInDPB to 5 (HEVC/H264) and L0 to 1, enabling a deep DPB; invalidate_ref_frames (nvenc_base.cpp:574-610) calls nvEncInvalidateRefFrames per lost frame range, dedupes already-done ranges, falls back to IDR only when the range exceeds the DPB, and sets rfi_needs_confirmation so the next encoded frame is marked as the RFI fulfilment (nvenc_base.cpp:551-557, 490-491).
- **punktfunk gap:** crates/punktfunk-host/src/encode/nvenc.rs leaves ref frames at the preset default and exposes only request_keyframe (nvenc.rs:465-467) which always emits a full FORCE_IDR. gamestream/control.rs:163-177 collapses both RFI (0x0301) and request-IDR (0x0302) into the same full-IDR. A full IDR at high resolution is the multi-millisecond spike punktfunk's own infinite-GOP comments call out (linux.rs:197-201) — true RFI avoids it for recoverable loss.
- **Proposal:** Extend the Encoder trait with an invalidate_ref_frames(first,last) method (default: fall back to request_keyframe). In the Windows NVENC config set maxNumRefFramesInDPB/maxNumRefFrames>1 (and numRefL0=1) gated on SUPPORT_MULTIPLE_REF_FRAMES, implement invalidate_ref_frames via nvEncInvalidateRefFrames with the dedupe + IDR-fallback logic, and route control.rs 0x0301 to invalidate (carrying the lost frame range) while 0x0302 stays full-IDR.
**SHIPPED (2026-06-20)** — Encoder::invalidate_ref_frames added (Windows NVENC multi-ref DPB + nvEncInvalidateRefFrames; GameStream 0x0301 routes to invalidate); Linux degrades to IDR; NVENC impl CI-pending. See also #22.
#### 20. In-binary Windows service install + interactive-session launch
*Area:* `cmp:config-management` · *Windows-host:* yes · *Severity:* high · *Effort:* large
- **Apollo does:** config.cpp:1490-1534 handles the Windows shortcut/service launch dance inside the binary: --shortcut/--shortcut-admin handling, ShellExecuteExW(runas, --shortcut-admin) to self-elevate when the service isn't running, waits for the service, wait_for_ui_ready(), launch_ui(), then returns 1 so the foreground process does NOT also start a stream host. This is Sunshine/Apollo's mature service<->UI two-process split that makes one-click launch work.
- **punktfunk gap:** punktfunk has no service-install / self-elevation / interactive-session bring-up in the binary. Deployment is documented as a manual chain of external scripts — scheduled task -> PsExec64 -i 1 -> launch.vbs -> host-run.cmd (design/windows-host.md:77-96) — fragile and operator-hostile. main.rs has no install/service subcommand.
- **Proposal:** Add `punktfunk-host install`/`uninstall`/`service` subcommands (Windows-gated) that register a service or an Interactive/Highest scheduled task to launch the host in Session 1 (the documented requirement for DXGI duplication + SendInput), and the self-elevate-if-not-running shortcut path. Reuse the existing capture/wgc_relay CreateProcessAsUserW machinery already in the crate. This codifies the script chain into the binary without touching the per-frame path or core.
**SHIPPED (2026-06-20)** — in-binary punktfunk-host service subcommand installs/launches the host into the interactive session (PsExec chain dropped). See also #24.
#### 21. Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame
*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* large · **⊘ ALREADY-HANDLED (2026-06-16)**
> **Resolution — not a bug for punktfunk.** The gap below assumes a cursor moving over a static screen
> produces `AcquireNextFrame` **timeouts**. It does not: DXGI returns **S_OK for pointer-only updates**
> (`FrameInfo.LastMouseUpdateTime != 0`, `LastPresentTime == 0`), with the resource holding the
> (unchanged) desktop. `acquire()` always re-runs `present_acquired` on S_OK (`dxgi.rs:1407,1474`), which
> re-copies the desktop and recomposites the cursor at its new position. `last_present` is repeated only
> on a genuine `WAIT_TIMEOUT` (nothing changed) or a mid-rebuild gap — correct. The agent that raised this
> didn't account for DDA's pointer-update S_OK semantics, and the run was killed before the verify phase
> reached it. The only real delta from Apollo is a **perf** micro-opt (Apollo retains a clean copy and
> re-blends just the cursor rect, avoiding a full ~29 MB `CopyResource` per pointer update) — deferred as
> optional, pending evidence of GPU-copy pressure.
- **Apollo does:** Apollo treats a mouse-only update as a real update (display_vram.cpp:1162-1168) and keeps an intermediate D3D surface of the last desktop frame so it can copy surface->fresh image and re-blend the cursor at its new position with no new DDA frame (last_frame_variant state machine, display_vram.cpp:1239-1306).
- **punktfunk gap (as originally filed — see Resolution above; premise incorrect):** punktfunk only composites on a fresh AcquireNextFrame (dxgi.rs:1477); on timeout it repeats last_present (dxgi.rs:1547-1561) which has the OLD cursor position baked in, so a cursor moving over a static screen stutters/lags.
- **Proposal (superseded; only the perf variant remains):** Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
**NOT-A-BUG (2026-06-16)** — premise incorrect: DXGI returns S_OK for pointer-only updates (LastMouseUpdateTime != 0, LastPresentTime == 0) and acquire() recomposites the cursor at its new position; last_present is repeated only on a genuine WAIT_TIMEOUT. Only an optional perf micro-opt remains (Apollo re-blends just the cursor rect to avoid a full CopyResource per pointer update).
#### 22. Add real reference-frame invalidation (RFI) instead of always forcing IDR
*Area:* `win:nvenc-d3d11` · *Windows-host:* yes · *Severity:* high · *Effort:* large
- **Apollo does:** Apollo keeps a deep DPB (maxNumRefFrames 5/HEVC, 8/AV1) but pins L0 ref to 1 (nvenc_base.cpp:268-281), then on a loss event calls nvEncInvalidateRefFrames per-frame over the requested range, dedups against the last range, expands to the last-encoded index, escalates to IDR only if the range exceeds DPB depth, and tags the next frame rfi_needs_confirmation (nvenc_base.cpp:574-610). This lets the encoder re-reference an older still-valid frame rather than emit a multi-millisecond keyframe.
- **punktfunk gap:** punktfunk has NO invalidate path — request_keyframe() always forces a full IDR (nvenc.rs:437-442,465-467); punktfunk1.rs:2153 / gamestream/stream.rs:336 wire 'RFI' straight to a keyframe. Every recovery is a costly IDR spike, defeating the infinite-GOP design.
- **Proposal:** In nvenc.rs add `maxNumRefFramesInDPB`/`numRefL0=1` to the HEVC/H264/AV1 config in init_session, gate on a new caps query NV_ENC_CAPS_SUPPORT_REF_PIC_INVALIDATION, track last_encoded_frame_index + last_rfi_range, and add an `invalidate_ref_frames(first,last)` method on the Encoder trait (encode.rs:41-51) that calls API.invalidate_ref_frames per index with Apollo's dedup/escalate-to-IDR-on-overflow logic. Wire punktfunk1.rs RFI requests to it, falling back to request_keyframe() only when it returns false.
**SHIPPED (2026-06-20)** — real RFI via nvEncInvalidateRefFrames with dedup + IDR-on-overflow; control plane 0x0301 routes to invalidate. NVENC impl CI-pending. See #19.
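The dedup / expand / escalate decision described for #19 and #22 can be sketched as pure logic, separate from the NVENC call itself. `RfiTracker`, `RfiAction`, and `plan` are hypothetical names, and the order of checks follows the prose above rather than Apollo's exact source; treat the "exceeds DPB depth" comparison as an assumption.

```rust
/// What the encoder should do for one loss report.
#[derive(Debug, PartialEq, Eq)]
pub enum RfiAction {
    /// The range was covered by the previous request; nothing to do.
    AlreadyInvalidated,
    /// Invalidate these frame indices (inclusive), one call per index.
    Invalidate { first: u64, last: u64 },
    /// The range cannot be recovered from the DPB; emit a full IDR.
    ForceIdr,
}

pub struct RfiTracker {
    dpb_depth: u64,
    last_encoded: u64,
    last_rfi_range: Option<(u64, u64)>,
}

impl RfiTracker {
    pub fn new(dpb_depth: u64) -> Self {
        Self { dpb_depth, last_encoded: 0, last_rfi_range: None }
    }

    pub fn on_frame_encoded(&mut self, index: u64) {
        self.last_encoded = index;
    }

    pub fn plan(&mut self, first: u64, last: u64) -> RfiAction {
        if first > last {
            // Malformed range: fall back to the safe path.
            self.last_rfi_range = None;
            return RfiAction::ForceIdr;
        }
        if let Some((done_first, done_last)) = self.last_rfi_range {
            if first >= done_first && last <= done_last {
                return RfiAction::AlreadyInvalidated;
            }
        }
        // Frames encoded after the loss may reference it, so expand the range.
        let last = last.max(self.last_encoded);
        if last - first + 1 > self.dpb_depth {
            self.last_rfi_range = None;
            return RfiAction::ForceIdr;
        }
        self.last_rfi_range = Some((first, last));
        RfiAction::Invalidate { first, last }
    }
}
```

An `Encoder::invalidate_ref_frames(first, last)` implementation would map `Invalidate` to per-index invalidation calls and `ForceIdr` to `request_keyframe()`.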
#### 23. Add a DS4 (DualShock4) ViGEm target on Windows with type auto-selection, motion, touchpad, battery and timestamp pump
*Area:* `win:input-sendinput-vigem` · *Windows-host:* yes · *Severity:* high · *Effort:* large
@@ -1959,38 +1884,16 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Proposal:** In gamepad_windows.rs, add a DS4Wired branch via vigem_client::DualShock4Wired with a union/enum PadEntry. Resolve type from the decoded Arrival (precedence: explicit env/client choice > PS type > motion/touchpad caps > X360), mirroring the existing GAMEPAD-preference negotiation. Port Apollo's wTimestamp pump (5.333us units, re-send every 100ms), motion calibration constants (:157-170), and the touchpad byte packing (:1604-1608). Surface the LED color via the existing 0xCA/feedback plane.
#### 24. Replace the PsExec scheduled-task launch with a real Windows service that relaunches the host on session change
*Area:* `win:system-secure-desktop` · *Windows-host:* yes · *Severity:* high · *Effort:* large
- **Apollo does:** SunshineSvc.exe runs as LocalSystem in Session 0, loops on WTSGetActiveConsoleSessionId, clones its own token with DuplicateTokenEx(TokenPrimary)+SetTokenInformation(TokenSessionId) and CreateProcessAsUserW into winsta0\\default inside a per-session job object (JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE|BREAKAWAY_OK); opts into SERVICE_ACCEPT_SESSIONCHANGE and on WTS_CONSOLE_CONNECT terminates+relaunches the host in the new session (tools/sunshinesvc.cpp:95,111,239,256,267,276-294)
- **punktfunk gap:** punktfunk has no Windows service; launch is a PsExec64 -s -i 1 scheduled task hard-coded to session 1 (design/windows-host.md:78-84), with the SERVICE_CONTROL_SESSIONCHANGE relaunch listed as unimplemented step 6 (design/windows-secure-desktop.md:89). Launch scripts are not even in the repo.
- **Proposal:** Add a small Rust service binary (new crate or punktfunk-host `service` subcommand) using windows::Win32::System::Services (RegisterServiceCtrlHandlerEx, StartServiceCtrlDispatcher) that mirrors sunshinesvc.cpp: WTSGetActiveConsoleSessionId -> DuplicateTokenEx+SetTokenInformation(TokenSessionId) -> CreateProcessAsUserW(lpDesktop=winsta0\\default) into a kill-on-close job, accept SERVICE_ACCEPT_SESSIONCHANGE, and relaunch the host on a genuine console-session change. Ship an installer and drop the PsExec dependency.
**SHIPPED (2026-06-20)** — real Windows service relaunches the host on console-session change (SERVICE_ACCEPT_SESSIONCHANGE); PsExec scheduled-task dropped. See also #20.
#### 25. Elevate capture/encode/send thread priority on the host hot path
*Area:* `cmp:protocol-streaming` · *Windows-host:* yes · *Severity:* medium · *Effort:* small · **✓ verified**
- **Apollo does:** Apollo raises the transmit/capture thread priority: platf::adjust_thread_priority(thread_priority_e::critical) in the video broadcast thread (stream.cpp:1122) and ::high in the audio/control paths (stream.cpp:1333, 1672); the Windows impl is SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST/ABOVE_NORMAL) (platform/windows/misc.cpp:1081-1102).
- **punktfunk gap:** punktfunk names its hot-path threads (stream.rs:44 video, stream.rs:204 send, punktfunk1.rs:1804 send_loop, punktfunk1.rs:2017/2328 send threads) but never sets a scheduling priority — every host capture/encode/send thread runs at default priority. Only the macOS client elevates (client.rs:169). On a loaded Windows desktop the encode/send thread can be preempted, adding jitter the frame-pacing logic can't recover.
- **Proposal:** Add a cross-platform raise_current_thread_priority() helper (SetThreadPriority on Windows, optionally AvSetMmThreadCharacteristics for MMCSS; sched/nice on Linux) and call it at the top of the GameStream send thread, the native send_loop, and the encode thread. Cheap, high-value jitter reduction, no design impact.
- **Verify verdict:** `confirmed_gap` — punktfunk: NO thread-priority call exists anywhere in the workspace (grep for SetThreadPriority/sched_setscheduler/setpriority/AvSetMm/THREAD_PRIORITY across crates/ returned zero hits). Hot-path threads are named-only at default priority: GameStream video thread crates/punktfunk-host/src/gamestream/stream.rs:44-53 (thread::Builder name "punktfunk-video") and GameStream send thread stream.rs:204-206 ("punktfunk-send"); native send threads crates/punktfunk-host/src/punktfunk1.rs:2017-2033 and punktfunk1.rs:2328-2333 ("punktfunk-send"), and the native send_loop at punktfunk1.rs:1804 — all spawned with no priority set. The encode work shares the capture thread (punktfunk1.rs:2011-2013 "this thread captures+encodes ... and hands each AU to a dedicated send thread"), also default priority. The windows crate is ALREADY a dependency with the needed feature: crates/punktfunk-host/Cargo.toml:141 enables "Win32_System_Threading" (SetThreadPriority/GetCurrentThread available, zero new deps). Apollo: confirmed it raises priority on every hot-path thread — capture src/video.cpp:1295 (critical), encode src/video.cpp:2359 and 2396 (high), video send src/stream.cpp:1333 (high), control src/stream.cpp:1122 (critical), audio src/stream.cpp:1672 + src/audio.cpp:94/208. Windows impl is SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST/ABOVE_NORMAL) at src/platform/windows/misc.cpp:1081-1102, plus DwmEnableMMCSS(true) (misc.cpp:1139) and AvSetMmThreadCharacteristics("Pro Audio") for the audio-capture thread (src/platform/windows/audio.cpp:540). CRITICAL NUANCE: Apollo's adjust_thread_priority is effectively Windows-only — src/platform/linux/misc.cpp:362-364 is "// Unimplemented" and src/platform/macos/misc.mm:218-220 is "// Unimplemented".
- **Refined:** Add a small cross-platform helper raise_current_thread_priority(level) and call it at the TOP of each hot-path thread body (so the calling thread itself is elevated): the GameStream send thread (stream.rs:206), the GameStream video/capture+encode thread (stream.rs:46), the native send threads (punktfunk1.rs:2021 and punktfunk1.rs:2331 closures, before/at the start of send_loop), and the native capture+encode thread (the punktfunk1.rs run body that owns capture+encode, punktfunk1.rs ~2011+). Windows: SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_HIGHEST) for the send/network thread (latency-critical, matches Apollo's video-send=high but the punktfunk send thread also does FEC+seal so HIGHEST is defensible) and THREAD_PRIORITY_ABOVE_NORMAL for capture+encode — using the windows crate already on Cargo.toml:141, no new deps. Optionally associate the network/encode thread with MMCSS via AvSetMmThreadCharacteristics (needs the Win32_System_Threading "Games"/"Pro Audio" task + AVRT feature) for higher-fidelity scheduling under DWM load; treat as a follow-up, not the first cut. Linux (net-new beyond Apollo, since Apollo leaves it unimplemented and punktfunk is Linux-first): best-effort nice(-10)/setpriority on the send+encode threads — note SCHED_FIFO/RR requires CAP_SYS_NICE/rtprio limits the host won't have by default, so do NOT default to realtime; a plain niceness bump is the safe portable choice and silently no-ops without privilege. Make every priority call best-effort (log-and-continue on failure, exactly as Apollo does at misc.cpp:1104). No async, no per-frame allocation, no ABI surface change — purely thread-setup, so no design invariant is touched.
**SHIPPED (2026-06-20)** — hot-path capture/encode/send threads now elevate priority (Windows SetThreadPriority HIGHEST for send / ABOVE_NORMAL for capture+encode; best-effort niceness on Linux, no-ops without privilege), per the verified plan.
#### 43. Socket QoS / DSCP marking on the media sockets
*Area:* `cmp:protocol-streaming` · *Windows-host:* yes · *Severity:* medium · *Effort:* medium · **✓ verified**
- **Apollo does:** Apollo tags video and audio sockets for prioritized delivery: enable_socket_qos(...qos_data_type_e::video...) and (...audio...) called per session (stream.cpp:1917, stream.cpp:1938); the Windows impl uses qWAVE QOSCreateHandle/QOSAddSocketToFlow with DSCP tagging (platform/windows/misc.cpp:1616-1652), with Linux/macOS equivalents.
- **punktfunk gap:** punktfunk sets NO QoS/DSCP anywhere — grep for qos/DSCP/IP_TOS across crates/punktfunk-host and crates/punktfunk-core finds only the x-nv-vqos ANNOUNCE keys (rtsp.rs:278) and a macOS *client* pthread QoS (client.rs:169). Neither the GameStream sockets (stream.rs:66 bind, audio) nor the native data socket (transport/udp.rs) request link-layer/router priority.
- **Proposal:** Add a small per-OS helper to mark the video/audio/data UDP sockets: DSCP EF/AF41 via IP_TOS/IPV6_TCLASS on Linux/macOS, qWAVE QOSAddSocketToFlow on Windows (gated behind an env/config opt-in). Wire it into stream.rs socket setup and the native transport socket creation. Directly improves latency under contended Wi-Fi / shared uplink.
- **Verify verdict:** `confirmed_gap` — PUNKTFUNK — gap is real on every media socket. Native data plane crates/punktfunk-core/src/transport/udp.rs:359-365 (UdpTransport::connect) and :374-414 (connect_via_punch) grow SO_SNDBUF/SO_RCVBUF (:431-447 grow_buffers via socket2::SockRef) and set GSO/USO, but never set IP_TOS/IPV6_TCLASS/SO_PRIORITY/qWAVE. GameStream sockets are bare std UdpSocket with no QoS: video crates/punktfunk-host/src/gamestream/stream.rs:66, audio audio.rs:305, control control.rs:36. RTSP does NOT parse the GameStream qosTrafficType keys at all (grep qosTrafficType in crates/punktfunk-host → exit 1), and rtsp.rs only reads x-nv-vqos bitrate/fec/codec (rtsp.rs:278). The only QoS in the tree is a macOS *client* pthread QoS-class (core/src/client.rs:156-169) — unrelated to link-layer marking. socket2 is already a punktfunk-core dep (Cargo.toml:34), so DSCP via SockRef::set_tos is trivial to add. APOLLO — confirmed it does exactly this, on by default. Per-session calls: src/stream.cpp:1917 (video) and :1938 (audio) → platf::enable_socket_qos(..., videoQosType/audioQosType != 0). Those flags come from RTSP src/rtsp.cpp:1005-1006 and are DEFAULTED non-zero at src/rtsp.cpp:982-983 (x-nv-vqos qosTrafficType="5", x-nv-aqos="4"), so QoS is on for stock Moonlight. Linux impl: src/platform/linux/misc.cpp:797-851 sets IP_TOS/IPV6_TCLASS (DSCP 40=AF41 video, 48=CS6 audio, shifted <<2) plus SO_PRIORITY 5/6. Windows impl: src/platform/windows/misc.cpp:1616-1722 dynamically loads qwave.dll and uses QOSCreateHandle/QOSAddSocketToFlow with QOSTrafficTypeAudioVideo/Voice — and crucially returns nullptr (no-op) unless dscp_tagging is set (:1622-1625). macOS: src/platform/macos/misc.mm:446.
- **Refined:** Add a per-OS set_media_qos(socket, kind) helper. Linux/macOS: use the already-present socket2 — SockRef::set_tos(AF41<<2) for IPv4 / set_tclass_v6 for IPv6, plus SO_PRIORITY on Linux (video=5, audio=6, the max without CAP_NET_ADMIN; set AFTER TOS since TOS resets it — Apollo linux/misc.cpp:841-845). Wire it into UdpTransport::connect / connect_via_punch (the native punktfunk/1 data plane — the primary, highest-value target) behind an opt-in env (PUNKTFUNK_DSCP=1) and optionally a Config field, plus the GameStream stream.rs:66 / audio.rs:305 / control.rs:36 sockets. IMPORTANT Windows-host caveat (this is the user's focus and where the naive version fails): on Windows, plain IP_TOS setsockopt is silently stripped by the OS unless a registry/group-policy QoS policy ('Do not use NLA') is configured — which is exactly why Apollo uses qWAVE (QOSAddSocketToFlow) instead. So a one-line socket2 set_tos does NOT tag on the wire on Windows. To actually deliver value on the Windows host, port Apollo's qWAVE path (runtime LoadLibraryExA qwave.dll, QOSCreateHandle once, QOSAddSocketToFlow per socket with QOSTrafficTypeAudioVideo/Voice) including the dual-stack v4-mapped connect() workaround (windows/misc.cpp:1675-1700) — note our data socket is already connect()ed (udp.rs:361), which sidesteps most of that hack. Keep RAII teardown (QOSRemoveSocketFromFlow on drop) like Apollo's qos_t/deinit_t. This is purely socket-setup, off the per-frame path, no core C-ABI change, no async — fully compatible with all three design invariants.
**SHIPPED (2026-06-20)** — punktfunk_core::transport::qos set_media_qos marks the native + GameStream media sockets (DSCP CS5 video / CS6 audio via IP_TOS + Linux SO_PRIORITY 5/6, opt-in PUNKTFUNK_DSCP=1). Windows caveat: plain IP_TOS is a no-op on the wire without a qWAVE policy — porting Apollo's qWAVE path (QOSAddSocketToFlow) remains a documented follow-up.
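As a worked example of the marking arithmetic: the DSCP code point occupies the top six bits of the IPv4 TOS byte (and of the IPv6 traffic class), so the value passed to `IP_TOS` is the code point shifted left by two. The sketch below uses the CS5/CS6 values and the SO_PRIORITY 5/6 pair from the shipped note; `MediaKind`, `dscp_for`, `tos_byte`, and `dscp_opted_in` are illustrative names, not the actual `transport::qos` API. Note that DSCP 40 is CS5 in the standard code-point table, while AF41 is 34.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MediaKind {
    Video,
    Audio,
}

pub const DSCP_CS5: u8 = 40; // class selector 5
pub const DSCP_CS6: u8 = 48; // class selector 6

/// DSCP code point per media kind (values from the shipped note).
pub const fn dscp_for(kind: MediaKind) -> u8 {
    match kind {
        MediaKind::Video => DSCP_CS5,
        MediaKind::Audio => DSCP_CS6,
    }
}

/// TOS / traffic-class byte: DSCP in the top 6 bits, ECN bits left at zero.
pub const fn tos_byte(dscp: u8) -> u8 {
    (dscp & 0x3f) << 2
}

/// Linux SO_PRIORITY companion values (set after TOS, since TOS resets it).
pub const fn linux_so_priority(kind: MediaKind) -> u32 {
    match kind {
        MediaKind::Video => 5,
        MediaKind::Audio => 6,
    }
}

/// Opt-in check, given the raw value of the PUNKTFUNK_DSCP environment variable.
pub fn dscp_opted_in(env_value: Option<&str>) -> bool {
    env_value == Some("1")
}
```

So video sockets would carry a TOS byte of `0xA0` and audio sockets `0xC0`; on Windows the same byte has no on-wire effect without the qWAVE path noted above.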
#### 90. Bitrate-derived rate-control pacing (vs frame-interval-only)
*Area:* `cmp:protocol-streaming` · *Windows-host:* no · *Severity:* medium · *Effort:* medium · **✓ verified**
- **Apollo does:** Apollo paces each frame's packets at the *negotiated bitrate*: ratecontrol_packets_in_1ms = giga*80/100/1000/blocksize/8 (stream.cpp:1464) and sleeps the send loop to that per-millisecond budget across the frame (stream.cpp:1578-1627), so the sender shapes to the link's allotted rate, not just the frame deadline.
- **punktfunk gap:** Both punktfunk send pacers spread purely over the FRAME INTERVAL: the GameStream sender uses budget = frame_interval * 0.75 (stream.rs:209) and the native paced_submit uses budget to next frame's deadline * 0.9 (punktfunk1.rs:1752) — neither derives a packets-per-ms budget from cfg.bitrate_kbps (the bitrate is only used to open NVENC, stream.rs:275). A spiky IDR or VBR overshoot can still microburst above the negotiated rate within its frame window.
- **Proposal:** Compute a bitrate-derived per-millisecond send budget (like Apollo's ratecontrol_packets_in_1ms) from the negotiated bitrate and pace overflow to THAT rate inside paced_submit / spawn_sender, taking the min of the frame-interval budget and the bitrate budget. Smooths VBR bursts on rate-limited links without breaking the existing microburst fast-path.
- **Verify verdict:** `partial` — PUNKTFUNK gap is real: both pacers spread over the FRAME INTERVAL only, never the bitrate. GameStream sender: `let budget = frame_interval.mul_f32(0.75)` (crates/punktfunk-host/src/gamestream/stream.rs:209). Native paced_submit: `let budget = deadline.checked_duration_since(pace_start)...mul_f32(0.9)` (crates/punktfunk-host/src/punktfunk1.rs:1752-1755) where deadline = `next += interval` (punktfunk1.rs:2162) and `interval = Duration::from_secs_f64(1.0 / effective_hz...)` (punktfunk1.rs:2357). bitrate_kbps only configures NVENC (stream.rs:275; punktfunk1.rs:2306, 2694) and is never fed to the pacer. So far the gap claim holds. BUT the Apollo characterization in the proposal is FACTUALLY WRONG: Apollo's `size_t ratecontrol_packets_in_1ms = std::giga::num * 80 / 100 / 1000 / blocksize / 8;` (/home/enricobuehler/Apollo/src/stream.cpp:1464) is a HARDCODED 80% of 1 Gigabit/sec — a fixed constant. grep across stream.cpp shows the negotiated/session bitrate never enters this formula (only std::giga::num, blocksize, and the 80/100 constant appear at lines 1464/1578-1582/1625-1627). Apollo paces to a FIXED ~800 Mbps link ceiling regardless of negotiated bitrate; it is NOT "negotiated-bitrate pacing." punktfunk's own design notes deliberately reject clamping to negotiated bitrate: "The encoder is pixel-rate bound, not bitrate bound" (punktfunk1.rs:321) and the whole 1Gbps+ effort raised the ceiling (punktfunk1.rs:1617-1619, MAX_BITRATE_KBPS ~2 Gbps).
- **Refined:** Reject the proposal AS WRITTEN — its premise ("Apollo paces to the negotiated bitrate") is false; Apollo paces to a hardcoded 80%-of-1Gbps fixed link ceiling (stream.cpp:1464), and pacing to negotiated bitrate would actively regress punktfunk (VBR/IDR spikes legitimately exceed average bitrate, and punktfunk explicitly treats the encoder as pixel-rate-bound, not bitrate-bound — punktfunk1.rs:321). If anything is worth porting, it is the FIXED per-millisecond link-rate ceiling concept, not bitrate-derived pacing: optionally compute a fixed packets-per-ms budget from a configurable link-rate ceiling (default high, e.g. matching MAX_BITRATE_KBPS, env-overridable like PUNKTFUNK_PACE_BURST_KB) and take min(frame-interval budget, link-ceiling budget) inside paced_submit/spawn_sender — purely as a microburst smoother for rate-limited links, NOT tied to cfg.bitrate_kbps. Note punktfunk already has the microburst fast-path (burst_cap, punktfunk1.rs:2005-2009 / paced_submit:1734-1743) and frame-interval spreading, which together already address the "spiky IDR microburst" symptom the proposal cites. Recommend deferring unless a measured rate-limited-link regression appears; the current frame-interval + burst-cap pacing covers the cited risk.
**REJECTED / OBSOLETE (2026-06-20)** — proposal premise is false: Apollo paces to a hardcoded ~80%-of-1Gbps FIXED link ceiling (stream.cpp:1464), NOT the negotiated bitrate, and punktfunk is pixel-rate-bound by design (VBR/IDR spikes legitimately exceed average bitrate). Existing frame-interval + burst-cap pacing already covers the cited microburst risk; defer unless a measured rate-limited-link regression appears. (If anything, port the FIXED link-ceiling concept via an env knob like PUNKTFUNK_PACE_BURST_KB, not bitrate-derived pacing.)
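To make the "fixed ceiling, not negotiated bitrate" point concrete, here is the cited Apollo formula transcribed into Rust with the same integer division order. The function names are hypothetical, and the 1024-byte block size used in the usage note is an assumed value for illustration; the source does not state Apollo's actual `blocksize`.

```rust
use std::time::Duration;

/// Packets allowed per millisecond at `percent` of a fixed link rate.
/// Mirrors `giga * 80 / 100 / 1000 / blocksize / 8`; note that no session
/// bitrate appears anywhere in the inputs.
/// `block_size_bytes` must be non-zero.
pub const fn packets_per_ms(link_bits_per_sec: u64, percent: u64, block_size_bytes: u64) -> u64 {
    link_bits_per_sec * percent / 100 / 1000 / block_size_bytes / 8
}

/// Minimum time needed to send `packets` at the ceiling, or `None` if the
/// ceiling rounds down to zero packets per millisecond.
pub fn ceiling_send_time(packets: u64, pkts_per_ms: u64) -> Option<Duration> {
    if pkts_per_ms == 0 {
        return None;
    }
    Some(Duration::from_micros(packets * 1000 / pkts_per_ms))
}
```

With a 1 Gbit/s link, 80%, and the assumed 1024-byte blocks, `packets_per_ms` yields 97, so a 970-packet frame needs at least 10 ms at the ceiling regardless of what bitrate the session negotiated.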
#### 94. Consume the GameStream client loss-stats report
*Area:* `cmp:protocol-streaming` · *Windows-host:* no · *Severity:* low · *Effort:* small · **✓ verified**
@@ -2001,5 +1904,5 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
- **Verify verdict:** `confirmed_gap` — PUNKTFUNK gap is real. crates/punktfunk-host/src/gamestream/control.rs:165-177 — after decrypt, the only inner-type dispatch is `if matches!(inner, 0x0301 | 0x0302 | 0x0305)` → force_idr; everything else falls through to gamepad::decode (returns None for non-controller) then input::decode, which at crates/punktfunk-host/src/gamestream/input.rs:35 returns empty unless `type == 0x0206`. So a loss-stats packet (`0x0201`) is silently dropped — `on_receive` has no branch for it. A broad grep across crates/ for loss-stats/last-good-frame/0x0201 found nothing (only DXGI's unrelated "last good frame" comment at capture/dxgi.rs:751). The native plane has only end-of-burst ProbeResult bandwidth/loss telemetry (crates/punktfunk-core/src/client.rs:436, abi.rs:1499) — a one-shot speed test, NOT continuous in-stream loss feedback. APOLLO confirms the claim: src/stream.cpp:41 `#define IDX_LOSS_STATS 3`, src/stream.cpp:61 maps it to wire type `0x0201`, and src/stream.cpp:943-957 reads `int32_t *stats` with stats[0]=count, stats[1]=time-window ms, stats[3]=lastGoodFrame (logged at BOOST verbose). Wire offset confirmed: the map callback receives `next_payload = plaintext.data()+4` (src/stream.cpp:1104), i.e. the body AFTER the 4-byte `[type][payloadLength]` header — so stats[0..] is at body offset 0. Note: Apollo only LOGS it; it does not yet drive adaptive FEC/bitrate off it either.
- **Refined:** Add one branch to control.rs `on_receive`: when the decrypted `pt` inner type (LE u16 at pt[0..2]) == 0x0201 and pt.len() >= 20, decode the body as four LE i32 — pt[4..8]=loss_count, pt[8..12]=time_window_ms, pt[16..20]=last_good_frame (mirroring Apollo's stats[0]/stats[1]/stats[3]; verify endianness against a real Moonlight capture — moonlight-common-c writes these as host-order/LE, and punktfunk already treats control inner fields as LE). Initially log at debug/trace and optionally surface via an AtomicU32 in AppState or the mgmt API so the web console can show client-observed loss. Keep it read-only first. Caveat for the backlog: this is a low-value telemetry hook, NOT adaptive control. The actual lever (adaptive FEC % / bitrate de-rating) is a separate, larger piece of work that Apollo itself does not implement off this signal — do not over-scope. Place it next to the existing 0x0301/0x0302/0x0305 dispatch so the control hot path stays a single decrypt + cheap type match. windowsHost=false is correct: this is GameStream-plane, OS-independent, and the punktfunk/1 native plane is the higher-priority protocol — so prioritize accordingly.
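The decode step in the refined plan can be sketched as a standalone function. It follows the offsets given above (inner type at `pt[0..2]`, fields at `pt[4..8]`, `pt[8..12]`, `pt[16..20]`, all little-endian); `LossStats` and `decode_loss_stats` are hypothetical names, and the endianness still needs checking against a real Moonlight capture, as the plan notes.

```rust
/// GameStream control inner type for the client loss-stats report.
pub const INNER_LOSS_STATS: u16 = 0x0201;

#[derive(Debug, PartialEq, Eq)]
pub struct LossStats {
    pub loss_count: i32,
    pub time_window_ms: i32,
    pub last_good_frame: i32,
}

/// Decode a decrypted control payload `pt` laid out as
/// `[type: u16][payload_len: u16][four i32 fields]`, all little-endian.
/// Returns `None` for other inner types or short packets.
pub fn decode_loss_stats(pt: &[u8]) -> Option<LossStats> {
    if pt.len() < 20 {
        return None;
    }
    if u16::from_le_bytes([pt[0], pt[1]]) != INNER_LOSS_STATS {
        return None;
    }
    let i32_at = |off: usize| {
        i32::from_le_bytes([pt[off], pt[off + 1], pt[off + 2], pt[off + 3]])
    };
    Some(LossStats {
        loss_count: i32_at(4),
        time_window_ms: i32_at(8),
        // The third field (pt[12..16]) is skipped, mirroring stats[0]/[1]/[3].
        last_good_frame: i32_at(16),
    })
}
```

Because it only reads and returns a value, a call like this could sit next to the existing 0x0301/0x0302/0x0305 match and feed a debug log or counter without touching the force-IDR path.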
_(28 detailed; remaining 68 medium/low items are in the table above with citations available in Parts 2–3.)_
_(28 items had detail subsections: 16 shipped/obsolete ones are now collapsed to one-liners above, 12 still-open ones keep full citations; the remaining 68 medium/low items are in the table above with citations available in Parts 2–3.)_
+45 -115
@@ -1,126 +1,56 @@
---
title: "Apple Stage-2 Presenter (handoff)"
description: "Implementation plan for the explicit VTDecompressionSession → CAMetalLayer presenter — hand-paced present + true decode→present (glass-to-glass) measurement. Written so a Mac agent can pick it up."
description: "Design rationale + open items for the explicit VTDecompressionSession → CAMetalLayer presenter. Implementation shipped; this page is trimmed to the why + what's left."
---
> **Status update:** the stage-2 presenter described here has since been **built and live-validated**,
> shipping behind an opt-in flag (`AVSampleBufferDisplayLayer` remains the default known-good path).
> This page is preserved as the implementation/handoff record for that work.
> **Status:** SHIPPED behind the opt-in `punktfunk.presenter` flag (`AVSampleBufferDisplayLayer`
> stage-1 remains the default known-good path). Live-validated ~11 ms p50 capture→present (commit
> `7b10714`). Code: `clients/apple/Sources/PunktfunkKit/{Stage2Pipeline,MetalVideoPresenter,VideoDecoder,LatencyMeter}.swift`;
> Settings has a presenter picker (`DefaultsKey.presenter`, `SettingsView.swift`). This doc is trimmed
> to design rationale + open items — the shipped `.swift` code is the source of truth for the
> decode/present/measurement walkthrough.
The implementation plan for the **stage-2 Apple presenter**. The **stage-1** presenter feeds
compressed HEVC straight into `AVSampleBufferDisplayLayer`, which hardware-decodes **and presents
internally with no per-frame callback** — so we can't stamp decode or present, and we can't hand-pace.
Stage-2 takes explicit control: decode with `VTDecompressionSession`, present decoded frames through a
`CAMetalLayer` driven by a display link. Two wins: **~0.5 refresh off the present tail** (the biggest
client latency term at 60 Hz) and **true decode→present / glass-to-glass** numbers.
## Why stage 2 (design rationale)
The **stage-1** presenter feeds compressed HEVC straight into `AVSampleBufferDisplayLayer`, which
hardware-decodes **and presents internally with no per-frame callback** — so we can't stamp decode or
present, and we can't hand-pace. **Stage-2** takes explicit control: decode with
`VTDecompressionSession`, present decoded frames through a `CAMetalLayer` driven by a display link.
Two wins justify the extra machinery:
- **~0.5 refresh off the present tail** — the present tail is the biggest client latency term at 60 Hz;
display-link-driven present pops the newest-ready frame each vsync instead of letting the layer present
on its own internal schedule.
- **True decode→present / glass-to-glass measurement** — explicit decode-completion and present
timestamps make `capture→present` measurable (modulo the still-unmeasured host render→capture term).
All of this is **macOS/iOS/tvOS-only** — build + validate on a Mac (`swift build && swift test`, then
live against a Linux host). The host + connector side is already done: `PunktfunkConnection.clockOffsetNs`
(the connect-time skew offset, host minus client) is what makes the present timestamp cross-machine
valid. See [Status](/docs/status) and roadmap §12.
live against a Linux host). The host + connector side is already done:
`PunktfunkConnection.clockOffsetNs` (the connect-time skew offset, host minus client) is what makes the
present timestamp cross-machine valid. `skewCorrected` stays false when `clockOffsetNs == 0` (old host)
— then the numbers are same-host-only.
## Where it plugs into the existing code
## Architecture pattern (worth recording)
| Existing (stage-1) | Stage-2 change |
|---|---|
| `StreamPump` pulls AUs → `AnnexB.sampleBuffer` → `layer.enqueue` (compressed) | A `Stage2Pump` (or a mode flag on `StreamPump`) feeds AUs to `VTDecompressionSessionDecodeFrame` instead |
| `StreamView`/`StreamViewIOS` host an `AVSampleBufferDisplayLayer` | Host a `CAMetalLayer` (+ a display link); keep the input-capture + HUD overlay unchanged |
| `AnnexB.formatDescription(fromIDR:)` builds the format desc, refreshed on every IDR | **Reused** — it's the `VTDecompressionSession`'s format description; recreate the session when it changes |
| `LatencyMeter` records capture→client-receipt at `onFrame` | Extend to record **decode-completion** and **present** stages (below) |
Keep stage-1 behind a `UserDefaults` flag (e.g. `punktfunk.presenter = "stage1" | "stage2"`) so a
regression can fall back — `AVSampleBufferDisplayLayer` is the known-good path.
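
A minimal sketch of how such a fallback flag could be resolved. The `PresenterMode` enum and both functions are hypothetical names for illustration, and the `"stage1"`/`"stage2"` strings are only the example values suggested above; the shipped code may differ.

```swift
import Foundation

/// Hypothetical enum mirroring the two example flag values.
enum PresenterMode: String {
    case stage1
    case stage2

    /// A missing or unrecognised value resolves to stage-1, the known-good path.
    static func resolve(_ raw: String?) -> PresenterMode {
        guard let raw = raw, let mode = PresenterMode(rawValue: raw) else {
            return .stage1
        }
        return mode
    }
}

/// Reads the flag from a defaults store (the standard one unless overridden).
func currentPresenterMode(defaults: UserDefaults = .standard) -> PresenterMode {
    return PresenterMode.resolve(defaults.string(forKey: "punktfunk.presenter"))
}
```

Defaulting to stage-1 on any unexpected value means a typo in the flag can never select the newer path by accident.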
## Decode: VTDecompressionSession
1. Create the session from the IDR's `CMVideoFormatDescription`
(`AnnexB.formatDescription(fromIDR:)`):
```
VTDecompressionSessionCreate(
allocator: nil,
formatDescription: fmt,
decoderSpecification: nil, // hardware by default; no need to force
imageBufferAttributes: [
kCVPixelBufferMetalCompatibilityKey: true,
kCVPixelBufferPixelFormatTypeKey:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, // 8-bit SDR; 10-bit (…10BiPlanar) for HDR later
],
outputCallback: <C-callback>,
decompressionSessionOut: &session)
```
2. Per AU: build the same `CMSampleBuffer` as stage-1 (`AnnexB.sampleBuffer(au:format:)`, PTS =
`au.ptsNs` @ 1e9 timescale) and submit:
```
VTDecompressionSessionDecodeFrame(
session,
sampleBuffer: sampleBuffer,
flags: ._EnableAsynchronousDecompression, // allow asynchronous decode
frameRefcon: <pts or a boxed context>, // handed back to the output callback
infoFlagsOut: nil)
```
3. The **output callback** delivers `(status, infoFlags, imageBuffer: CVImageBuffer?, presentationTimeStamp, …)`.
`presentationTimeStamp` is `au.ptsNs` (the host capture clock). **Stamp decode-completion here**
(`CLOCK_REALTIME` ns), retain the `CVPixelBuffer`, and push `{pts, pixelBuffer, decodedNs}` into a
small NSLock-guarded ring (the "ready" queue) the display link drains.
4. **IDR / mode change**: when `AnnexB.formatDescription` yields a new desc, check
`VTDecompressionSessionCanAcceptFormatDescription`; if not, finish-and-recreate the session (same
trigger stage-1 uses to refresh `format`). On decoder error (`kVTVideoDecoderBadDataErr`, etc.) drop
to the next IDR — there's no out-of-band extradata; recovery keyframes re-carry the parameter sets.
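
The "drop to the next IDR" rule in step 4 can be sketched as a small gate in front of the decoder. `RecoveryGate` and its method names are hypothetical, and how an access unit is classified as an IDR is assumed to happen elsewhere.

```swift
/// Hypothetical gate: after a decode error, skip access units until the next IDR.
struct RecoveryGate {
    private(set) var waitingForIDR = false

    init() {}

    /// Call when the decoder reports an error such as kVTVideoDecoderBadDataErr.
    mutating func decoderFailed() {
        waitingForIDR = true
    }

    /// Returns true if this access unit should be submitted to the decoder.
    mutating func shouldDecode(isIDR: Bool) -> Bool {
        if isIDR {
            // An IDR re-carries the parameter sets, so decoding can resume here.
            waitingForIDR = false
        }
        return !waitingForIDR
    }
}
```

The pump would call `shouldDecode(isIDR:)` once per access unit and skip submission whenever it returns false.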
## Present: CAMetalLayer + display link
- `CAMetalLayer` (device = system default, `pixelFormat = .bgra8Unorm`, `framebufferOnly = true`,
`drawableSize` = stream WxH). The view: macOS `NSView`/iOS `UIView` whose `layerClass`/backing layer
is the `CAMetalLayer` (mirror `StreamView`/`StreamViewIOS`).
- **Display link** drives present: macOS `CVDisplayLink` (or `CADisplayLink` on macOS 14+),
iOS/tvOS `CADisplayLink`. Each callback carries the **target present timestamp** (`CVTimeStamp` /
`targetTimestamp`).
- Each vsync: pop the **newest** ready frame (drop older undisplayed ones — low-latency default; no
smoothing buffer to start), render a fullscreen quad sampling the **biplanar YUV** (luma +
chroma planes via `CVMetalTextureCache`) with a BT.709 YUV→RGB fragment shader, then
`commandBuffer.present(drawable)` (or `present(drawable, atTime:)`). **Stamp present time** for the
frame just shown (use the display link's target timestamp converted to `CLOCK_REALTIME`).
- Colorspace: BT.709 8-bit for now (matches the host's SDR). HDR (BT.2020/PQ, 10-bit `…10BiPlanar` +
EDR `CAMetalLayer.wantsExtendedDynamicRangeContent`) is a later tie-in with the HDR roadmap (§10).
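
As a rough illustration of the handoff between the decode callback and the display link, here is a one-slot "newest-ready" holder guarded by an `NSLock`. The type names are hypothetical, and the generic `Buffer` parameter stands in for `CVPixelBuffer` so the sketch does not depend on CoreVideo.

```swift
import Foundation

/// A decoded frame waiting to be shown. `Buffer` stands in for CVPixelBuffer.
struct ReadyFrame<Buffer> {
    let ptsNs: Int64
    let buffer: Buffer
    let decodedNs: Int64
}

/// Hypothetical lock-guarded slot that keeps only the newest decoded frame.
final class NewestReadySlot<Buffer> {
    private let lock = NSLock()
    private var frame: ReadyFrame<Buffer>?
    private var dropped = 0

    /// Called from the decode callback: replaces any frame not yet shown.
    func push(_ newFrame: ReadyFrame<Buffer>) {
        lock.lock()
        defer { lock.unlock() }
        if frame != nil { dropped += 1 }
        frame = newFrame
    }

    /// Called once per vsync: takes the newest frame, leaving the slot empty.
    func pop() -> ReadyFrame<Buffer>? {
        lock.lock()
        defer { lock.unlock() }
        let newest = frame
        frame = nil
        return newest
    }

    /// Number of frames that were overwritten before being shown.
    var droppedCount: Int {
        lock.lock()
        defer { lock.unlock() }
        return dropped
    }
}
```

Because the slot holds a strong reference to the buffer, the frame stays alive until the presenter pops it, and both critical sections are short enough not to stall the decode thread.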
### Cheaper intermediate (2a) if the Metal path is too big in one step
Decode with `VTDecompressionSession` (gets the **decode-completion timestamp** = capture→decoded),
then wrap the decoded `CVPixelBuffer` in a `CMSampleBuffer` and `enqueue` it into the existing
`AVSampleBufferDisplayLayer` (it accepts uncompressed pixel buffers too). This yields the decode term
**without** a Metal renderer — but **not** true present (the layer still presents internally). Ship 2a
first if useful; 2b (CAMetalLayer + display link) is required for the on-glass present stamp.
## Measurement (the whole point)
Extend `LatencyMeter` (or add per-stage meters) so each frame records three instants, all
`CLOCK_REALTIME` ns, all shifted by `connection.clockOffsetNs` to the host clock:
- **capture→decoded** = `decodedNs + offset - pts_ns` (VideoToolbox decode latency, cross-machine)
- **decode→present** = `presentedNs - decodedNs` (the present tail stage-2 shortens)
- **capture→present** = `presentedNs + offset - pts_ns`, **the glass-to-glass number** (modulo the
host render→capture term, still unmeasured; see roadmap §12)
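
The three terms can be written out as plain arithmetic. This is a sketch only: `FrameTimes`, `FrameLatency`, and `latency(of:offsetNs:)` are hypothetical names, and the offset is taken to be host minus client, as described for `clockOffsetNs`.

```swift
/// Per-frame instants in nanoseconds (hypothetical container).
struct FrameTimes {
    let ptsNs: Int64        // host capture clock
    let decodedNs: Int64    // client CLOCK_REALTIME at decode completion
    let presentedNs: Int64  // client CLOCK_REALTIME at present
}

/// The three derived latencies, in nanoseconds.
struct FrameLatency {
    let captureToDecodedNs: Int64
    let decodeToPresentNs: Int64
    let captureToPresentNs: Int64
}

/// `offsetNs` is host minus client, so adding it moves a client stamp onto the host clock.
func latency(of t: FrameTimes, offsetNs: Int64) -> FrameLatency {
    return FrameLatency(
        captureToDecodedNs: t.decodedNs + offsetNs - t.ptsNs,
        // Both stamps are on the client clock, so the offset cancels out.
        decodeToPresentNs: t.presentedNs - t.decodedNs,
        captureToPresentNs: t.presentedNs + offsetNs - t.ptsNs)
}
```

By construction, capture→present equals capture→decoded plus decode→present, which gives a cheap consistency check on the recorded stamps.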
Surface `capture→present` p50/p95 in the HUD (extend the existing `model.latency*` line in
`ContentView`). `skewCorrected` stays false when `clockOffsetNs == 0` (old host) — then the numbers are
same-host-only, as today.
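
For the HUD figures, one simple way to get p50/p95 from a window of samples is a nearest-rank percentile. This is an illustrative sketch with a hypothetical `percentile` helper; it is not necessarily how `LatencyMeter` computes its numbers.

```swift
/// Nearest-rank percentile over a window of samples; nil for an empty window.
func percentile(_ samples: [Int64], _ p: Double) -> Int64? {
    guard !samples.isEmpty else { return nil }
    let sorted = samples.sorted()
    // Nearest-rank: the smallest rank covering at least p percent of the samples.
    let rank = Int((p / 100.0 * Double(sorted.count)).rounded(.up))
    // Clamp so p = 0 and p = 100 stay inside the array.
    let index = min(max(rank, 1), sorted.count) - 1
    return sorted[index]
}
```

Calling `percentile(window, 50)` and `percentile(window, 95)` on the recent capture→present samples would yield the two values for the HUD line.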
## Validation
- `swift test`: add a decode-output test (decode a known IDR built like
`VideoToolboxRoundTripTests` → assert a `CVPixelBuffer` of the right dimensions + the
decode callback fires). Present is display-bound — validate it **live** via the HUD number.
- Live: connect to a Linux host (`punktfunk1-host --source virtual` on the GNOME box; see
[Ubuntu — GNOME](/docs/ubuntu-gnome)), confirm `capture→present` is a few ms over `capture→client`
and that `decode→present` shrank vs. an `AVSampleBufferDisplayLayer` baseline.
- Compare against the headless reference number: `punktfunk-probe` reports skew-corrected
capture→reassembled (~1.3 ms p50 GNOME box → dev box); capture→present should be that **+ decode +
present**.
## Gotchas
Async `VTDecompressionSession` callback → **1-slot newest-ready ring** → display-link-driven present:
- VT decode is **async**; the output callback runs on a VT-managed thread — don't block it, just stamp
+ enqueue. Retain the `CVPixelBuffer` until presented (the ring owns it).
- `VTDecompressionSessionDecodeFrame` wants the **same** `CMSampleBuffer` shape stage-1 builds (AVCC
length-prefixed NALs, in-band parameter sets in the format desc, never as extradata).
- `CAMetalLayer.drawableSize` must track mode changes (the host can `Reconfigure` mid-stream — watch
`PunktfunkConnection.mode`/the new-IDR dimensions).
- Don't add a jitter/smoothing buffer for the first cut — present newest-ready for lowest latency; a
pacing policy can come later if frames look uneven.
- Keep `clients/apple/README.md`'s "Stage 2" item + [Status](/docs/status) updated when this lands.
decode-completion (`CLOCK_REALTIME` ns) + enqueue. Retain the `CVPixelBuffer` until presented (the ring
owns it).
- Each vsync pops the **newest** ready frame and drops older undisplayed ones — low-latency default, no
smoothing buffer.
- Three per-frame instants (all `CLOCK_REALTIME` ns, all shifted by `clockOffsetNs` to the host clock):
**capture→decoded** = `decodedNs + offset - pts_ns`; **decode→present** = `presentedNs - decodedNs`
(the tail stage-2 shortens); **capture→present** = `presentedNs + offset - pts_ns`, the glass-to-glass
number.
## Open items
- **Make stage 2 the default** — after resolution / HDR edge-case checks (HDR = BT.2020/PQ, 10-bit
`…10BiPlanar` + EDR `CAMetalLayer.wantsExtendedDynamicRangeContent`; ties in with the HDR roadmap).
- **Glass-to-glass numbers via `tools/latency-probe`** — close the still-unmeasured host render→capture
term.
- **Smoothing / pacing policy** — present newest-ready for lowest latency today; a pacing policy can come
later if frames look uneven.
- **iOS / iPadOS / tvOS stage-2 variants.**
