Compare commits
17 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | 5bf787eb2b | | |
| | 0a6c9d8852 | | |
| | 0eedfb3c1f | | |
| | f6490f4c28 | | |
| | d01a8fd17a | | |
| | 3e7c9bd059 | | |
| | 7aa787a789 | | |
| | 3514702d8c | | |
| | 327a5fa828 | | |
| | 9777ed7fb3 | | |
| | ba68a98873 | | |
| | 22359f5dc8 | | |
| | 7e9023faad | | |
| | 5acc12d9e9 | | |
| | aed0bf0c2a | | |
| | b65745284e | | |
| | 8ca695eb4c | | |
+2
-2
@@ -1,9 +1,9 @@
 # Root build context is used only by web/Dockerfile, which needs web/ and
-# docs/api/openapi.json. Allowlist those; keep everything else (target/, .git, crates)
+# api/openapi.json. Allowlist those; keep everything else (target/, .git, crates)
 # out of the context upload.
 *
 !web
-!docs/api/openapi.json
+!api/openapi.json
 web/node_modules
 web/.output
 web/dist
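Reviewer note: the allowlist above relies on the documented `.dockerignore` rule that the last matching line decides whether a path is kept. The sketch below models only that rule, under stated assumptions: `matches` is a simplified stand-in (not Docker's real pattern matcher), and the sample paths such as `web/src/app.tsx` and `crates/foo/lib.rs` are hypothetical.

```rust
/// One `.dockerignore`-style rule: a pattern plus whether it is an
/// exception (`!pattern`) that re-includes what an earlier rule excluded.
struct Rule {
    pattern: &'static str,
    negated: bool,
}

/// Simplified matcher (an assumption, not Docker's implementation):
/// `*` matches every path; any other pattern matches the path itself
/// or anything beneath it.
fn matches(pattern: &str, path: &str) -> bool {
    pattern == "*"
        || path == pattern
        || path
            .strip_prefix(pattern)
            .map_or(false, |rest| rest.starts_with('/'))
}

/// Returns true if `path` would be left out of the build context.
/// The last rule that matches decides; with no match the path is kept.
fn is_excluded(rules: &[Rule], path: &str) -> bool {
    let mut excluded = false;
    for rule in rules {
        if matches(rule.pattern, path) {
            excluded = !rule.negated;
        }
    }
    excluded
}

/// The rules from the new side of the hunk, in file order.
fn allowlist_rules() -> Vec<Rule> {
    vec![
        Rule { pattern: "*", negated: false },
        Rule { pattern: "web", negated: true },
        Rule { pattern: "api/openapi.json", negated: true },
        Rule { pattern: "web/node_modules", negated: false },
        Rule { pattern: "web/.output", negated: false },
        Rule { pattern: "web/dist", negated: false },
    ]
}

fn main() {
    let rules = allowlist_rules();
    // Hypothetical paths, chosen to exercise each rule.
    for path in [
        "web/src/app.tsx",
        "web/node_modules/x.js",
        "api/openapi.json",
        "crates/foo/lib.rs",
    ] {
        println!("{path}: excluded = {}", is_excluded(&rules, path));
    }
}
```

Order matters in this model: moving `web/node_modules` above `!web` would re-include it, which is presumably why the exclusions under `web/` come last in the file.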
@@ -24,7 +24,7 @@ on:
 push:
 branches: [main]
 # The flatpak is the CLIENT — only rebuild when the client/core/manifest change, not on every
-# docs/host push (this is a heavy flatpak-builder run). Tags (v*, the client release) build too.
+# design/host push (this is a heavy flatpak-builder run). Tags (v*, the client release) build too.
 paths:
 - 'clients/linux/**'
 - 'crates/punktfunk-core/**'
@@ -1,5 +1,5 @@
 # One-shot provisioning of the WDK + cargo-wdk onto the persistent self-hosted windows-amd64 runner, so
-# the all-Rust UMDF drivers can build there (docs/windows-host-rewrite.md, M0). The runner has the base
+# the all-Rust UMDF drivers can build there (design/windows-host-rewrite.md, M0). The runner has the base
 # Windows SDK + MSVC + LLVM + Rust but NOT the WDK (no km/wdf/iddcx headers) or cargo-wdk.
 #
 # Dispatch manually (workflow_dispatch). Idempotent: re-running is a near no-op once provisioned. The
@@ -1,5 +1,5 @@
 # Windows driver workspace CI — runs on the self-hosted Windows runner (home-windows-1, host mode;
-# label windows-amd64). Part of the Windows-host rewrite (docs/windows-host-rewrite.md, M0).
+# label windows-amd64). Part of the Windows-host rewrite (design/windows-host-rewrite.md, M0).
 #
 # Stage 1 (this file): PROBE the runner's driver toolchain (WDK / EWDK / cargo-make / LLVM / the
 # inf2cat/stampinf/devgen/signtool tools) so we know what's provisioned BEFORE writing driver code,
@@ -96,6 +96,18 @@ jobs:
 # First-ever Windows lint coverage for the host (Linux CI never lints the windows-cfg code).
 run: cargo clippy -p punktfunk-host --features nvenc,amf-qsv -- -D warnings

+- name: Build + lint the HDR Vulkan layer (pf-vkhdr-layer)
+shell: pwsh
+# Standalone cdylib (own [workspace]) the installer bundles + registers (it lets Vulkan games
+# like Doom use HDR on the virtual display). Lint here so a regression fails CI instead of
+# silently shipping the host without the layer (pack-host-installer.ps1 builds it non-fatally).
+# Windows-only FFI (user32 + the vk_layer loader glue) → can't be linted on the Linux CI.
+run: |
+Push-Location packaging/windows/pf-vkhdr-layer
+cargo fmt --check; if ($LASTEXITCODE) { throw "pf-vkhdr-layer rustfmt" }
+cargo clippy --release -- -D warnings; if ($LASTEXITCODE) { throw "pf-vkhdr-layer clippy" }
+Pop-Location
+
 - name: Ensure Inno Setup
 shell: pwsh
 run: |
@@ -2,7 +2,7 @@

 Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protocol core
 (`punktfunk-core`) exposed over a C ABI and native clients per platform. Full design:
-[`docs/implementation-plan.md`](docs/implementation-plan.md). Status table: `README.md`.
+[`design/implementation-plan.md`](design/implementation-plan.md). Status table: `README.md`.

 ## Where the work stands

@@ -27,7 +27,15 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
 Input: mouse/keyboard (libei via RemoteDesktop portal on KWin/GNOME, gamescope's own EIS
 socket, wlr protocols on Sway) and **gamepads** (uinput X-Box-360 pads + rumble
 back-channel; validated live — pad created/destroyed with the session). Management REST API +
-checked-in OpenAPI doc (`mgmt.rs`).
+checked-in OpenAPI doc (`mgmt.rs`). **Web-console performance capture** (`stats_recorder.rs`,
+design: [`design/stats-capture-plan.md`](design/stats-capture-plan.md)): the operator arms stats
+recording from the web console, plays, stops, and reviews the run as graphs (per-stage latency
+breakdown · fps new/repeat · goodput · loss/FEC). A shared `Arc<StatsRecorder>` ring (the hot-path
+gate is a runtime `AtomicBool`, replacing the startup-only `PUNKTFUNK_PERF`) is fed by **both** the
+native `virtual_stream` and the GameStream encode loop at their existing ~2 s/~1 s aggregation
+boundary, and finished captures are saved as on-disk recordings
+(`~/.config/punktfunk/captures/*.json`) browsable/exportable from the console's **Performance** page
+(recharts). Endpoints `/api/v1/stats/*` (bearer-only). *Implemented; not yet on-glass validated.*
 - **Native protocol (`punktfunk/1`): full session planes, validated live.** QUIC
 control plane (`punktfunk-core` `quic` feature: Hello{mode}/Welcome{full Config}/Start), data
 plane = the hardened core `Session` over raw UDP with **GF(2¹⁶) Leopard FEC + AES-GCM**
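Reviewer note: the hunk above describes a shared `Arc<StatsRecorder>` whose hot-path gate is a runtime `AtomicBool`. A minimal sketch of that gating pattern follows, under stated assumptions: the `Sample` fields and the method names `start`, `record`, and `stop` are hypothetical, and a plain `Vec` stands in for the ring. It is not the code in `stats_recorder.rs`. The idempotent `start` and the `None` from `stop` mirror the start/stop semantics in the OpenAPI descriptions later in this diff.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// One aggregated sample; the field set is illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    fps: f32,
    goodput_mbps: f32,
}

/// Hypothetical stand-in for the shared recorder.
#[derive(Default)]
struct StatsRecorder {
    /// Runtime gate: the only thing the hot path touches while idle.
    armed: AtomicBool,
    /// A plain Vec here; the real recorder is described as a ring.
    samples: Mutex<Vec<Sample>>,
}

impl StatsRecorder {
    /// Arms a capture. Returns false and changes nothing if one is already running.
    fn start(&self) -> bool {
        let mut samples = self.samples.lock().unwrap();
        if self.armed.swap(true, Ordering::AcqRel) {
            return false;
        }
        samples.clear();
        true
    }

    /// Called by the streaming loops at their aggregation boundary.
    fn record(&self, sample: Sample) {
        // Cheap early-out: no lock is taken while nothing is armed.
        if !self.armed.load(Ordering::Acquire) {
            return;
        }
        let mut samples = self.samples.lock().unwrap();
        // Re-check under the lock so nothing lands after `stop` drained.
        if self.armed.load(Ordering::Acquire) {
            samples.push(sample);
        }
    }

    /// Disarms and hands back the samples; None if nothing was recording.
    fn stop(&self) -> Option<Vec<Sample>> {
        let mut samples = self.samples.lock().unwrap();
        if !self.armed.swap(false, Ordering::AcqRel) {
            return None;
        }
        Some(std::mem::take(&mut *samples))
    }
}

fn main() {
    let recorder = Arc::new(StatsRecorder::default());
    recorder.start();

    // A second producer thread shares the same recorder, as both encode loops would.
    let producer = Arc::clone(&recorder);
    std::thread::spawn(move || {
        producer.record(Sample { fps: 60.0, goodput_mbps: 42.0 });
    })
    .join()
    .unwrap();

    println!("{:?}", recorder.stop());
}
```

Compared with a startup-only environment flag, the atomic gate lets an operator arm and disarm recording while a session is live, at the cost of one atomic load per aggregation boundary when idle.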
@@ -104,9 +112,16 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
 captures the HDR desktop as FP16/Rgb10a2 (DDA FP16 for the secure desktop), the encoder forces HEVC
 Main10 + BT.2020 PQ (NVENC ABGR10/P010; AMF/QSV P010 + a swscale Rgb10a2→P010 fallback), the client
 auto-detects PQ from the HEVC VUI — gated by `PUNKTFUNK_10BIT` + client `VIDEO_CAP_10BIT`; **Windows
-host only** (the Linux host stays 8-bit, blocked upstream). **AMF/QSV is CI-green but not yet
-on-glass validated** (no AMD/Intel Windows box in the lab); NVENC is live-validated. Newer/less
-battle-tested than the Linux host. Packaging: `packaging/windows/`.
+host only** (the Linux host stays 8-bit, blocked upstream). **Vulkan-game HDR over the virtual
+display**: NVIDIA/AMD Vulkan ICDs refuse to *advertise* an HDR color space for a surface on an IddCx
+indirect display (so Vulkan games — Doom: The Dark Ages, id Tech, etc. — say "device does not support
+HDR"), even though the ICD happily *accepts + presents* a forced HDR swapchain there. A tiny always-on
+Vulkan **implicit layer** (`packaging/windows/pf-vkhdr-layer/`, `VK_LAYER_PUNKTFUNK_hdr_inject`)
+injects the `HDR10_ST2084`/scRGB surface formats into `vkGetPhysicalDeviceSurfaceFormats[2]KHR`,
+self-gated on the display's actual advanced-color state (no-op on SDR / real monitors); bundled +
+HKLM-registered by the installer. **Live-validated: Doom: The Dark Ages enables HDR over the virtual
+display.** **AMF/QSV is CI-green but not yet on-glass validated** (no AMD/Intel Windows box in the
+lab); NVENC is live-validated. Newer/less battle-tested than the Linux host. Packaging: `packaging/windows/`.

 ## What's left

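Reviewer note: the hunk above says the implicit layer injects HDR surface formats into the list the ICD reports, and only when the display is actually in advanced-color mode. The sketch below shows that filtering step in isolation, under stated assumptions: the enums are local stand-ins rather than real Vulkan bindings, `inject_hdr_formats` and `display_is_hdr` are hypothetical names, and the pairing of pixel formats with color spaces is a guess at what "the `HDR10_ST2084`/scRGB surface formats" means. The real layer also has to handle Vulkan's two-call count/fill convention, which is omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Format {
    B8G8R8A8Unorm,
    A2B10G10R10Unorm,
    R16G16B16A16Sfloat,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ColorSpace {
    SrgbNonlinear,
    Hdr10St2084,
    /// scRGB (linear extended sRGB).
    ExtendedSrgbLinear,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SurfaceFormat {
    format: Format,
    color_space: ColorSpace,
}

/// Formats the sketch adds; the exact pairs are an assumption.
const INJECTED: [SurfaceFormat; 2] = [
    SurfaceFormat {
        format: Format::A2B10G10R10Unorm,
        color_space: ColorSpace::Hdr10St2084,
    },
    SurfaceFormat {
        format: Format::R16G16B16A16Sfloat,
        color_space: ColorSpace::ExtendedSrgbLinear,
    },
];

/// Returns the driver-reported list, extended with the HDR entries when
/// the display is in advanced-color mode. On SDR displays it is a no-op.
fn inject_hdr_formats(reported: &[SurfaceFormat], display_is_hdr: bool) -> Vec<SurfaceFormat> {
    let mut out = reported.to_vec();
    if !display_is_hdr {
        return out;
    }
    for extra in INJECTED {
        // Skip anything the driver already advertises to avoid duplicates.
        if !out.contains(&extra) {
            out.push(extra);
        }
    }
    out
}

fn main() {
    let reported = [SurfaceFormat {
        format: Format::B8G8R8A8Unorm,
        color_space: ColorSpace::SrgbNonlinear,
    }];
    println!("SDR: {:?}", inject_hdr_formats(&reported, false));
    println!("HDR: {:?}", inject_hdr_formats(&reported, true));
}
```

Gating on the display state is what keeps an always-on layer safe: on an SDR display or a real monitor the application sees exactly what the driver reported.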
@@ -245,8 +260,8 @@ bash crates/punktfunk-core/tests/c/run.sh # standalone C-ABI link + round-trip
 ```

 Generated artifacts are **checked in** and CI fails on drift: `include/punktfunk_core.h`
-(cbindgen from `punktfunk-core/src/abi.rs`) and `docs/api/openapi.json` (regenerate with
-`cargo run -p punktfunk-host -- openapi > docs/api/openapi.json`; spec lives in `mgmt.rs`).
+(cbindgen from `punktfunk-core/src/abi.rs`) and `api/openapi.json` (regenerate with
+`cargo run -p punktfunk-host -- openapi > api/openapi.json`; spec lives in `mgmt.rs`).

 CI is Gitea Actions (`.gitea/workflows/`, guide: docs-site `ci.md`): `ci.yml` runs the
 workspace checks inside the `git.unom.io/unom/punktfunk-rust-ci` image plus web/docs-site
@@ -268,7 +283,7 @@ crates/punktfunk-host/
 zerocopy/{egl,cuda,vulkan}.rs dmabuf → CUDA → NVENC (tiled via EGL/GL, LINEAR via Vulkan)
 inject/{libei,wlr,gamepad,dualsense}.rs input backends (uinput xpad + UHID DualSense)
 encode/{nvenc,linux,vaapi,ffmpeg_win,sw}.rs per-GPU encoders (NVENC · Linux NVENC/CUDA · VAAPI · AMF/QSV · openh264)
-capture.rs · encode.rs · audio.rs · spike.rs · punktfunk1.rs · mgmt.rs · native_pairing.rs
+capture.rs · encode.rs · audio.rs · spike.rs · punktfunk1.rs · mgmt.rs · native_pairing.rs · stats_recorder.rs
 clients/probe/ punktfunk/1 reference/probe client (headless test/measurement tool)
 clients/linux/ native Linux client (GTK4/libadwaita · FFmpeg · PipeWire · SDL3)
 clients/windows/ native Windows client (WinUI 3 via windows-reactor · D3D11 · WASAPI · SDL3)
@@ -276,7 +291,7 @@ clients/apple/ native macOS/iOS/tvOS client (Swift · VideoToolbox · GameCon
 clients/android/ native Android client (Kotlin app + native/ Rust JNI core over punktfunk-core)
 clients/decky/ Steam Deck Decky plugin
 crates/punktfunk-host/src/{capture/dxgi,vdisplay/sudovda,encode/ffmpeg_win,inject/gamepad_windows,audio/wasapi_*,service}.rs Windows host backends
-web/ TanStack web console over the mgmt API (status · devices · pairing)
+web/ TanStack web console over the mgmt API (status · devices · pairing · performance graphs)
 packaging/ apt(deb) · RPM/COPR · Arch/sysext · Flatpak · Bazzite bootc · Windows host installer (per-dir READMEs)
 tools/{loss-harness,latency-probe}/ measurement (plan §10)
 scripts/ 60-punktfunk.rules · punktfunk-host.service · host.env.example · headless/
Generated
+332
@@ -2,6 +2,12 @@
|
||||
# It is not intended for manual editing.
|
||||
version = 3
|
||||
|
||||
[[package]]
|
||||
name = "adler2"
|
||||
version = "2.0.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "320119579fcad9c21884f5c4861d16174d0e06250625266f50fe6898340abefa"
|
||||
|
||||
[[package]]
|
||||
name = "aead"
|
||||
version = "0.5.2"
|
||||
@@ -735,6 +741,15 @@ dependencies = [
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crc32fast"
|
||||
version = "1.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9481c1c90cbf2ac953f07c8d4a58aa3945c425b7185c9154d67a65e4230da511"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "criterion"
|
||||
version = "0.5.1"
|
||||
@@ -1100,6 +1115,16 @@ version = "0.5.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1d674e81391d1e1ab681a28d99df07927c6d4aa5b027d7da16ba32d1d21ecd99"
|
||||
|
||||
[[package]]
|
||||
name = "flate2"
|
||||
version = "1.1.9"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "843fba2746e448b37e26a819579957415c8cef339bf08564fe8b7ddbd959573c"
|
||||
dependencies = [
|
||||
"crc32fast",
|
||||
"miniz_oxide",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "flume"
|
||||
version = "0.11.1"
|
||||
@@ -1751,12 +1776,115 @@ dependencies = [
|
||||
"tower-service",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "icu_collections"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "2984d1cd16c883d7935b9e07e44071dca8d917fd52ecc02c04d5fa0b5a3f191c"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"potential_utf",
|
||||
"utf8_iter",
|
||||
"yoke",
|
||||
"zerofrom",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "icu_locale_core"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "92219b62b3e2b4d88ac5119f8904c10f8f61bf7e95b640d25ba3075e6cac2c29"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"litemap",
|
||||
"tinystr",
|
||||
"writeable",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "icu_normalizer"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c56e5ee99d6e3d33bd91c5d85458b6005a22140021cc324cea84dd0e72cff3b4"
|
||||
dependencies = [
|
||||
"icu_collections",
|
||||
"icu_normalizer_data",
|
||||
"icu_properties",
|
||||
"icu_provider",
|
||||
"smallvec",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "icu_normalizer_data"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "da3be0ae77ea334f4da67c12f149704f19f81d1adf7c51cf482943e84a2bad38"
|
||||
|
||||
[[package]]
|
||||
name = "icu_properties"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "bee3b67d0ea5c2cca5003417989af8996f8604e34fb9ddf96208a033901e70de"
|
||||
dependencies = [
|
||||
"icu_collections",
|
||||
"icu_locale_core",
|
||||
"icu_properties_data",
|
||||
"icu_provider",
|
||||
"zerotrie",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "icu_properties_data"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8e2bbb201e0c04f7b4b3e14382af113e17ba4f63e2c9d2ee626b720cbce54a14"
|
||||
|
||||
[[package]]
|
||||
name = "icu_provider"
|
||||
version = "2.2.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "139c4cf31c8b5f33d7e199446eff9c1e02decfc2f0eec2c8d71f65befa45b421"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"icu_locale_core",
|
||||
"writeable",
|
||||
"yoke",
|
||||
"zerofrom",
|
||||
"zerotrie",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "id-arena"
|
||||
version = "2.3.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "3d3067d79b975e8844ca9eb072e16b31c3c1c36928edf9c6789548c524d0d954"
|
||||
|
||||
[[package]]
|
||||
name = "idna"
|
||||
version = "1.1.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "3b0875f23caa03898994f6ddc501886a45c7d3d62d04d2d90788d47be1b1e4de"
|
||||
dependencies = [
|
||||
"idna_adapter",
|
||||
"smallvec",
|
||||
"utf8_iter",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "idna_adapter"
|
||||
version = "1.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "cb68373c0d6620ef8105e855e7745e18b0d00d3bdb07fb532e434244cdb9a714"
|
||||
dependencies = [
|
||||
"icu_normalizer",
|
||||
"icu_properties",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "if-addrs"
|
||||
version = "0.15.0"
|
||||
@@ -2022,6 +2150,12 @@ version = "0.12.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53"
|
||||
|
||||
[[package]]
|
||||
name = "litemap"
|
||||
version = "0.8.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "92daf443525c4cce67b150400bc2316076100ce0b3686209eb8cf3c31612e6f0"
|
||||
|
||||
[[package]]
|
||||
name = "lock_api"
|
||||
version = "0.4.14"
|
||||
@@ -2116,6 +2250,16 @@ version = "0.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "68354c5c6bd36d73ff3feceb05efa59b6acb7626617f4962be322a825e61f79a"
|
||||
|
||||
[[package]]
|
||||
name = "miniz_oxide"
|
||||
version = "0.8.9"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1fa76a2c86f704bdb222d66965fb3d63269ce38518b83cb0575fca855ebb6316"
|
||||
dependencies = [
|
||||
"adler2",
|
||||
"simd-adler32",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "mio"
|
||||
version = "1.2.1"
|
||||
@@ -2548,6 +2692,15 @@ dependencies = [
|
||||
"universal-hash",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "potential_utf"
|
||||
version = "0.1.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0103b1cef7ec0cf76490e969665504990193874ea05c85ff9bab8b911d0a0564"
|
||||
dependencies = [
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "powerfmt"
|
||||
version = "0.2.0"
|
||||
@@ -2728,6 +2881,7 @@ dependencies = [
|
||||
"rand 0.8.6",
|
||||
"rcgen",
|
||||
"reis",
|
||||
"roxmltree",
|
||||
"rsa",
|
||||
"rusqlite",
|
||||
"rustls",
|
||||
@@ -2741,6 +2895,7 @@ dependencies = [
|
||||
"tower",
|
||||
"tracing",
|
||||
"tracing-subscriber",
|
||||
"ureq",
|
||||
"utoipa",
|
||||
"utoipa-axum",
|
||||
"utoipa-scalar",
|
||||
@@ -2752,6 +2907,7 @@ dependencies = [
|
||||
"wayland-scanner",
|
||||
"windows 0.62.2 (registry+https://github.com/rust-lang/crates.io-index)",
|
||||
"windows-service",
|
||||
"winreg",
|
||||
"x509-parser",
|
||||
"xkbcommon",
|
||||
]
|
||||
@@ -3054,6 +3210,15 @@ dependencies = [
|
||||
"windows-sys 0.52.0",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "roxmltree"
|
||||
version = "0.21.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f1964b10c76125c36f8afe190065a4bf9a87bf324842c05701330bba9f1cacbb"
|
||||
dependencies = [
|
||||
"memchr",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rpkg-config"
|
||||
version = "0.1.2"
|
||||
@@ -3555,6 +3720,12 @@ dependencies = [
|
||||
"rand_core 0.6.4",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "simd-adler32"
|
||||
version = "0.3.9"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "703d5c7ef118737c72f1af64ad2f6f8c5e1921f818cdcb97b8fe6fc69bf66214"
|
||||
|
||||
[[package]]
|
||||
name = "siphasher"
|
||||
version = "1.0.3"
|
||||
@@ -3637,6 +3808,12 @@ dependencies = [
|
||||
"wasm-bindgen",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "stable_deref_trait"
|
||||
version = "1.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6ce2be8dc25455e1f91df71bfa12ad37d7af1092ae736f3a6cd0e37bc7810596"
|
||||
|
||||
[[package]]
|
||||
name = "strsim"
|
||||
version = "0.11.1"
|
||||
@@ -3789,6 +3966,16 @@ dependencies = [
|
||||
"time-core",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tinystr"
|
||||
version = "0.8.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c8323304221c2a851516f22236c5722a72eaa19749016521d6dff0824447d96d"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"zerovec",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tinytemplate"
|
||||
version = "1.2.1"
|
||||
@@ -4094,6 +4281,40 @@ version = "0.9.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8ecb6da28b8a351d773b68d5825ac39017e680750f980f3a1a85cd8dd28a47c1"
|
||||
|
||||
[[package]]
|
||||
name = "ureq"
|
||||
version = "2.12.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "02d1a66277ed75f640d608235660df48c8e3c19f3b4edb6a263315626cc3c01d"
|
||||
dependencies = [
|
||||
"base64",
|
||||
"flate2",
|
||||
"log",
|
||||
"once_cell",
|
||||
"rustls",
|
||||
"rustls-pki-types",
|
||||
"url",
|
||||
"webpki-roots 0.26.11",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "url"
|
||||
version = "2.5.8"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ff67a8a4397373c3ef660812acab3268222035010ab8680ec4215f38ba3d0eed"
|
||||
dependencies = [
|
||||
"form_urlencoded",
|
||||
"idna",
|
||||
"percent-encoding",
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "utf8_iter"
|
||||
version = "1.0.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b6c140620e7ffbb22c2dee59cafe6084a59b5ffc27a8859a5f0d494b5d52b6be"
|
||||
|
||||
[[package]]
|
||||
name = "utf8parse"
|
||||
version = "0.2.2"
|
||||
@@ -4421,6 +4642,24 @@ dependencies = [
|
||||
"rustls-pki-types",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "webpki-roots"
|
||||
version = "0.26.11"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "521bc38abb08001b01866da9f51eb7c5d647a19260e00054a8c7fd5f9e57f7a9"
|
||||
dependencies = [
|
||||
"webpki-roots 1.0.8",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "webpki-roots"
|
||||
version = "1.0.8"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "bf85cb06032201fa7c6f829d7db5a7e5aa45bcc0655327713065f6f0576731bf"
|
||||
dependencies = [
|
||||
"rustls-pki-types",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wide"
|
||||
version = "0.7.33"
|
||||
@@ -4946,6 +5185,16 @@ dependencies = [
|
||||
"memchr",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "winreg"
|
||||
version = "0.56.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7d6f32a0ff4a9f6f01231eb2059cc85479330739333e0e58cadf03b6af2cca10"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"windows-sys 0.61.2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wit-bindgen"
|
||||
version = "0.51.0"
|
||||
@@ -5040,6 +5289,12 @@ dependencies = [
|
||||
"wasmparser",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "writeable"
|
||||
version = "0.6.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1ffae5123b2d3fc086436f8834ae3ab053a283cfac8fe0a0b8eaae044768a4c4"
|
||||
|
||||
[[package]]
|
||||
name = "x509-parser"
|
||||
version = "0.16.0"
|
||||
@@ -5083,6 +5338,29 @@ dependencies = [
|
||||
"time",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "yoke"
|
||||
version = "0.8.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "709fe23a0424b6a435d82152b1bd3fdfb0833487d5fa90d05d42762a9891fef5"
|
||||
dependencies = [
|
||||
"stable_deref_trait",
|
||||
"yoke-derive",
|
||||
"zerofrom",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "yoke-derive"
|
||||
version = "0.8.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "de844c262c8848816172cef550288e7dc6c7b7814b4ee56b3e1553f275f1858e"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
"synstructure",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zbus"
|
||||
version = "5.16.0"
|
||||
@@ -5159,12 +5437,66 @@ dependencies = [
|
||||
"syn",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerofrom"
|
||||
version = "0.1.8"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0ec05a11813ea801ff6d75110ad09cd0824ddba17dfe17128ea0d5f68e6c5272"
|
||||
dependencies = [
|
||||
"zerofrom-derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerofrom-derive"
|
||||
version = "0.1.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "11532158c46691caf0f2593ea8358fed6bbf68a0315e80aae9bd41fbade684a1"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
"synstructure",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zeroize"
|
||||
version = "1.8.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b97154e67e32c85465826e8bcc1c59429aaaf107c1e4a9e53c8d8ccd5eff88d0"
|
||||
|
||||
[[package]]
|
||||
name = "zerotrie"
|
||||
version = "0.2.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0f9152d31db0792fa83f70fb2f83148effb5c1f5b8c7686c3459e361d9bc20bf"
|
||||
dependencies = [
|
||||
"displaydoc",
|
||||
"yoke",
|
||||
"zerofrom",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerovec"
|
||||
version = "0.11.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "90f911cbc359ab6af17377d242225f4d75119aec87ea711a880987b18cd7b239"
|
||||
dependencies = [
|
||||
"yoke",
|
||||
"zerofrom",
|
||||
"zerovec-derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerovec-derive"
|
||||
version = "0.11.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "625dc425cab0dca6dc3c3319506e6593dcb08a9f387ea3b284dbd52a92c40555"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zmij"
|
||||
version = "1.0.21"
|
||||
|
||||
@@ -1,13 +1,16 @@
 # punktfunk

-**Low-latency desktop and game streaming, Linux-first.** Run the host on a Linux machine — or a
-Windows PC — with an NVIDIA GPU, connect from a Mac, PC, phone, tablet, or TV, and stream your desktop
+**Low-latency desktop and game streaming with first-class Linux and Windows hosts.** Run the host on
+a Linux machine or a Windows PC, connect from a Mac, PC, phone, tablet, or TV, and stream your desktop
 or games — each device at its **own native resolution and refresh rate**, over your local network.

 📖 **Documentation: [docs.punktfunk.unom.io](https://docs.punktfunk.unom.io)** — start with
 [How It Works](https://docs.punktfunk.unom.io/docs/how-it-works) or the
 [Quick Start](https://docs.punktfunk.unom.io/docs/quickstart).

+💬 **Community: [Discord](https://discord.gg/kaPNvzMuGU)** — chat, support, and **Android beta
+access** · **[r/Punktfunk](https://www.reddit.com/r/Punktfunk/)**.
+
 punktfunk pairs a **virtual-display streaming host** with native clients on every platform. It speaks
 the existing **GameStream** protocol, so any [Moonlight](https://moonlight-stream.org/) client works
 day one — and adds its own faster **`punktfunk/1`** protocol that breaks the ~1 Gbps FEC wall with a
@@ -19,6 +22,11 @@ protocol, FEC, and crypto, linked into the host and every client over a stable C
 - **Your device's exact mode.** For each client that connects, the host spins up a virtual display
 sized to that device — 1080p60 to a laptop, 1440p120 to a desktop, 4K to a TV, all at once. No
 letterboxing, no scaling, no rearranging your real monitors.
+- **A real virtual display on Windows, too.** On Linux the host uses per-compositor virtual outputs;
+on Windows you get the same on-the-fly virtual display — at the client's exact mode, no physical
+monitor or dummy HDMI plug, even on the secure desktop (UAC / lock screen). It also has **its own
+indirect display driver (IDD)** the host pushes finished frames straight into, rather than scraping
+a screen — tight, push-based integration that's unusual for a Windows streaming host.
 - **Low latency, GPU end to end.** Frames go straight from the compositor to the NVENC encoder with
 zero CPU copies (dmabuf → CUDA/Vulkan → NVENC), over a transport tuned for responsiveness rather
 than throughput. Stable 240 fps at 5120×1440; sub-millisecond capture-to-reassembly on a LAN.
@@ -124,7 +132,7 @@ clients/
 web/ web console (TanStack) over the management API — status · devices · pairing
 packaging/ apt · rpm / COPR · Arch · Flatpak · Bazzite bootc image
 docs-site/ public documentation site (Fumadocs) — https://docs.punktfunk.unom.io
-docs/ design notes & deep-dive plans
+design/ design notes & deep-dive plans
 include/punktfunk_core.h cbindgen-generated C header (checked in)
 tools/ latency-probe · loss-harness (measurement)
 ```
@@ -978,6 +978,309 @@
|
||||
}
|
||||
}
|
||||
},
|
||||
"/api/v1/stats/capture/live": {
|
||||
"get": {
|
||||
"tags": [
|
||||
"stats"
|
||||
],
|
||||
"summary": "Live in-progress capture",
|
||||
"description": "The full sample time-series of the capture currently recording, for live graphing. `404` when\nnothing is armed.",
|
||||
"operationId": "statsCaptureLive",
|
||||
"responses": {
|
||||
"200": {
|
||||
"description": "The in-progress capture (meta + samples so far)",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/Capture"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"401": {
|
||||
"description": "Missing or invalid bearer token",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"404": {
|
||||
"description": "No capture is currently recording",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"/api/v1/stats/capture/start": {
|
||||
"post": {
|
||||
"tags": [
|
||||
"stats"
|
||||
],
|
||||
"summary": "Start a stats capture",
|
||||
"description": "Arms a new performance-stats capture. Idempotent: if a capture is already running this returns\nthe current status unchanged. While armed, the streaming loops emit aggregated samples (~ every\n1–2 s) into the in-progress capture, readable live via `GET /stats/capture/live`.",
|
||||
"operationId": "statsCaptureStart",
|
||||
"responses": {
|
||||
"200": {
|
||||
"description": "Capture armed (or already running)",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/StatsStatus"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"401": {
|
||||
"description": "Missing or invalid bearer token",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"/api/v1/stats/capture/status": {
|
||||
"get": {
|
||||
"tags": [
|
||||
"stats"
|
||||
],
|
||||
"summary": "Stats capture status",
|
||||
"description": "Whether a capture is armed, its sample count, and start time. Poll this (e.g. every 2 s) to\ndrive the capture-control UI.",
|
||||
"operationId": "statsCaptureStatus",
|
||||
"responses": {
|
||||
"200": {
|
||||
"description": "In-progress capture status (idle when not armed)",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/StatsStatus"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"401": {
|
||||
"description": "Missing or invalid bearer token",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"/api/v1/stats/capture/stop": {
|
||||
"post": {
|
||||
"tags": [
|
||||
"stats"
|
||||
],
|
||||
"summary": "Stop the stats capture",
|
||||
"description": "Disarms the in-progress capture and writes it to disk atomically, returning its summary. If\nnothing was recording, returns `204 No Content`.",
|
||||
"operationId": "statsCaptureStop",
|
||||
"responses": {
|
||||
"200": {
|
||||
"description": "Capture stopped and saved",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/CaptureMeta"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"204": {
|
||||
"description": "Nothing was recording"
|
||||
},
|
||||
"401": {
|
||||
"description": "Missing or invalid bearer token",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
||||
"500": {
|
||||
"description": "Could not write the recording to disk",
|
||||
"content": {
|
||||
"application/json": {
|
||||
"schema": {
|
||||
"$ref": "#/components/schemas/ApiError"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
},
|
"/api/v1/stats/recordings": {
"get": {
"tags": [
"stats"
],
"summary": "List saved recordings",
"description": "Every saved capture's summary (the `meta` head only — not the sample body), newest first.",
"operationId": "statsRecordingsList",
"responses": {
"200": {
"description": "Saved capture summaries, newest first",
"content": {
"application/json": {
"schema": {
"type": "array",
"items": {
"$ref": "#/components/schemas/CaptureMeta"
}
}
}
}
},
"401": {
"description": "Missing or invalid bearer token",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
}
}
}
},
"/api/v1/stats/recordings/{id}": {
"get": {
"tags": [
"stats"
],
"summary": "Get a saved recording",
"description": "The full capture (meta + samples) for `id`, for graphing or download.",
"operationId": "statsRecordingGet",
"parameters": [
{
"name": "id",
"in": "path",
"description": "The recording id (its filename stem)",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"description": "The full capture",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/Capture"
}
}
}
},
"401": {
"description": "Missing or invalid bearer token",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
},
"404": {
"description": "No recording with that id",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
},
"500": {
"description": "The recording file is unreadable",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
}
}
},
"delete": {
"tags": [
"stats"
],
"summary": "Delete a saved recording",
"description": "Removes the recording `id` from disk. `404` if there is no such recording.",
"operationId": "statsRecordingDelete",
"parameters": [
{
"name": "id",
"in": "path",
"description": "The recording id (its filename stem)",
"required": true,
"schema": {
"type": "string"
}
}
],
"responses": {
"204": {
"description": "Recording deleted"
},
"401": {
"description": "Missing or invalid bearer token",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
},
"404": {
"description": "No recording with that id",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
},
"500": {
"description": "Could not delete the recording",
"content": {
"application/json": {
"schema": {
"$ref": "#/components/schemas/ApiError"
}
}
}
}
}
}
},
"/api/v1/status": {
"get": {
"tags": [
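The `statsCaptureStop` operation above documents four status codes (200, 204, 401, 500). A client might fold them into one outcome type along these lines. This is only a sketch: `StopOutcome` and `classify_stop_status` are hypothetical names that do not appear in the repository, and the HTTP call itself is left out.

```rust
/// Hypothetical client-side view of `POST /api/v1/stats/capture/stop`,
/// keyed on the status codes listed in the OpenAPI document.
#[derive(Debug, PartialEq)]
enum StopOutcome {
    /// 200: capture stopped and saved; the body is a `CaptureMeta`.
    Saved,
    /// 204: nothing was recording; there is no body.
    NothingRecording,
    /// 401: missing or invalid bearer token; the body is an `ApiError`.
    Unauthorized,
    /// 500: the recording could not be written to disk; the body is an `ApiError`.
    WriteFailed,
    /// Any status the spec does not list for this operation.
    Unexpected(u16),
}

/// Map a raw HTTP status code to the documented outcome.
fn classify_stop_status(status: u16) -> StopOutcome {
    match status {
        200 => StopOutcome::Saved,
        204 => StopOutcome::NothingRecording,
        401 => StopOutcome::Unauthorized,
        500 => StopOutcome::WriteFailed,
        other => StopOutcome::Unexpected(other),
    }
}
```

Treating 204 as its own variant rather than an error keeps "stop when idle" a harmless no-op for callers.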
@@ -1125,6 +1428,89 @@
}
}
},
"Capture": {
"type": "object",
"description": "A full capture: summary + the sample time-series. The wire + on-disk shape.",
"required": [
"meta",
"samples"
],
"properties": {
"meta": {
"$ref": "#/components/schemas/CaptureMeta"
},
"samples": {
"type": "array",
"items": {
"$ref": "#/components/schemas/StatsSample"
}
}
}
},
"CaptureMeta": {
"type": "object",
"description": "Capture summary — the filename stem plus the negotiated mode/codec/client. Stored at the head\nof each on-disk recording and listed standalone (without the sample body) by\n[`StatsRecorder::list`].",
"required": [
"id",
"started_unix_ms",
"duration_ms",
"kind",
"width",
"height",
"fps",
"codec",
"client",
"sample_count"
],
"properties": {
"client": {
"type": "string",
"description": "Short label / fingerprint prefix, or `\"\"` if unknown."
},
"codec": {
"type": "string",
"description": "`\"h264\" | \"hevc\" | \"av1\"`."
},
"duration_ms": {
"type": "integer",
"format": "int64",
"minimum": 0
},
"fps": {
"type": "integer",
"format": "int32",
"minimum": 0
},
"height": {
"type": "integer",
"format": "int32",
"minimum": 0
},
"id": {
"type": "string",
"description": "e.g. `\"2026-06-26T20-14-03Z_5120x1440\"` — also the filename stem."
},
"kind": {
"type": "string",
"description": "`\"native\" | \"gamestream\"`."
},
"sample_count": {
"type": "integer",
"format": "int32",
"minimum": 0
},
"started_unix_ms": {
"type": "integer",
"format": "int64",
"minimum": 0
},
"width": {
"type": "integer",
"format": "int32",
"minimum": 0
}
}
},
"CustomEntry": {
"type": "object",
"description": "A user-added title, persisted in `~/.config/punktfunk/library.json`. Same shape the API\nreturns and the web console edits.",
@@ -1595,6 +1981,144 @@
}
}
},
"StageTiming": {
"type": "object",
"description": "One pipeline stage's latency in an aggregation window (microseconds).",
"required": [
"name",
"p50_us",
"p99_us"
],
"properties": {
"name": {
"type": "string",
"description": "`\"capture\" | \"submit\" | \"encode\" | \"packetize\" | \"send\"` (path-dependent)."
},
"p50_us": {
"type": "number",
"format": "float"
},
"p99_us": {
"type": "number",
"format": "float"
}
}
},
"StatsSample": {
"type": "object",
"description": "One aggregated sample (~ every 2 s native, ~ every 1 s GameStream).",
"required": [
"t_ms",
"session_id",
"stages",
"fps",
"repeat_fps",
"mbps",
"bitrate_kbps",
"frames_dropped",
"packets_dropped",
"send_dropped",
"fec_recovered"
],
"properties": {
"bitrate_kbps": {
"type": "integer",
"format": "int32",
"description": "Configured target bitrate.",
"minimum": 0
},
"fec_recovered": {
"type": "integer",
"format": "int32",
"description": "FEC shards recovered this window (delta).",
"minimum": 0
},
"fps": {
"type": "number",
"format": "float",
"description": "Genuine NEW frames/s from the source."
},
"frames_dropped": {
"type": "integer",
"format": "int32",
"description": "Frames dropped this window (delta).",
"minimum": 0
},
"mbps": {
"type": "number",
"format": "float",
"description": "Transmit goodput (Mb/s)."
},
"packets_dropped": {
"type": "integer",
"format": "int32",
"description": "Packets dropped this window (receiver-side / reassembler, where known).",
"minimum": 0
},
"repeat_fps": {
"type": "number",
"format": "float",
"description": "Re-encoded holds/s (source-starvation indicator)."
},
"send_dropped": {
"type": "integer",
"format": "int32",
"description": "Host send-buffer overflow / EAGAIN this window (delta).",
"minimum": 0
},
"session_id": {
"type": "integer",
"format": "int32",
"description": "Disambiguates concurrent sessions (usually constant).",
"minimum": 0
},
"stages": {
"type": "array",
"items": {
"$ref": "#/components/schemas/StageTiming"
},
"description": "Ordered pipeline stages for this path."
},
"t_ms": {
"type": "integer",
"format": "int64",
"description": "Milliseconds since capture start (monotonic; stamped by [`StatsRecorder::push_sample`]).",
"minimum": 0
}
}
},
"StatsStatus": {
"type": "object",
"description": "Snapshot of the in-progress capture for the management API.",
"required": [
"armed",
"sample_count",
"started_unix_ms",
"kind"
],
"properties": {
"armed": {
"type": "boolean",
"description": "Capture currently running."
},
"kind": {
"type": "string",
"description": "Path of the in-progress capture (`\"\"` if idle)."
},
"sample_count": {
"type": "integer",
"format": "int32",
"description": "Samples in the in-progress capture.",
"minimum": 0
},
"started_unix_ms": {
"type": "integer",
"format": "int64",
"description": "Unix start time of the in-progress capture (`0` if idle).",
"minimum": 0
}
}
},
"StreamInfo": {
"type": "object",
"description": "RTSP-negotiated stream parameters.",
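Several `StatsSample` counters in this hunk are described as per-window deltas (`frames_dropped`, `send_dropped`, `fec_recovered`), so a whole-capture total is the sum over all samples. A minimal sketch of that aggregation follows, assuming a trimmed-down sample type. `SampleCounters`, `Totals`, and `total_counters` are illustrative names, not types from the codebase.

```rust
/// Illustrative subset of a `StatsSample`: only the counters the schema marks "(delta)".
#[derive(Debug, Clone, Default)]
struct SampleCounters {
    frames_dropped: u32,
    send_dropped: u32,
    fec_recovered: u32,
}

/// Whole-capture totals; u64 so a long capture cannot overflow the sums.
#[derive(Debug, Default, PartialEq)]
struct Totals {
    frames_dropped: u64,
    send_dropped: u64,
    fec_recovered: u64,
}

/// Sum the per-window deltas across every sample of a capture.
fn total_counters(samples: &[SampleCounters]) -> Totals {
    samples.iter().fold(Totals::default(), |mut acc, s| {
        acc.frames_dropped += u64::from(s.frames_dropped);
        acc.send_dropped += u64::from(s.send_dropped);
        acc.fec_recovered += u64::from(s.fec_recovered);
        acc
    })
}
```

Because the values are deltas, a graph can plot them directly per window and only needs this sum for a summary line.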
@@ -1696,6 +2220,10 @@
{
"name": "library",
"description": "Game library: installed-store titles (Steam) plus user-curated custom entries"
},
{
"name": "stats",
"description": "Streaming performance-stats capture: arm/stop a recording, read the live + saved time-series for graphing"
}
]
}
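The `StageTiming` schema in this spec reports `p50_us` and `p99_us` per pipeline stage, but the diff does not show how the host derives them. One common choice is the nearest-rank percentile over the window's latencies, sketched below as an assumption. `percentile_us` and `summarize_stage` are hypothetical helpers; only the `StageTiming` field names come from the schema.

```rust
/// Mirrors the three required fields of the `StageTiming` schema.
#[derive(Debug, PartialEq)]
struct StageTiming {
    name: String,
    p50_us: f32,
    p99_us: f32,
}

/// Nearest-rank percentile of an ascending-sorted slice (assumed method, not from the source).
fn percentile_us(sorted: &[f32], p: f64) -> Option<f32> {
    if sorted.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let n = sorted.len();
    // rank = ceil(p/100 * n), clamped into 1..=n, then converted to a 0-based index.
    let rank = ((p / 100.0) * n as f64).ceil() as usize;
    Some(sorted[rank.clamp(1, n) - 1])
}

/// Build one stage's summary from the raw per-frame latencies of a window.
fn summarize_stage(name: &str, latencies_us: &[f32]) -> Option<StageTiming> {
    let mut sorted = latencies_us.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    Some(StageTiming {
        name: name.to_string(),
        p50_us: percentile_us(&sorted, 50.0)?,
        p99_us: percentile_us(&sorted, 99.0)?,
    })
}
```

Nearest-rank always returns a value that was actually observed, which keeps the p99 honest on small windows.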
@@ -361,4 +361,4 @@ ever switched to a logged-in GUI session, re-adding macOS to the job's capture s
- Mid-stream renegotiation (resolution change without reconnect) is designed-for but not
implemented (the Welcome is one-shot today).
- Host-side gamepad injection needs `/dev/uinput` access on the box (udev rule from
`docs/linux-setup.md`).
`design/linux-setup.md`).
@@ -276,7 +276,7 @@ pub mod frame {
/// These were hand-duplicated as `OFF_*`/`SHM_*` constants in `inject/{gamepad,dualsense}_windows.rs`
/// and (as bare literals — `*view.add(140)`) in the standalone `xusb-driver`/`dualsense-driver`
/// workspaces, guarded only by "must match" comments — the top ABI-drift hazard the audit flagged
/// (`docs/windows-host-rewrite.md` §2.7). Owning them here with `Pod` derives + `offset_of!`
/// (`design/windows-host-rewrite.md` §2.7). Owning them here with `Pod` derives + `offset_of!`
/// asserts makes a one-sided edit a compile error.
///
/// The host creates the section (privileged, permissive DACL so the restricted WUDFHost token can
@@ -25,6 +25,14 @@ aes-gcm = "0.10"
cbc = { version = "0.1", features = ["alloc"] }
rand = "0.8"
hex = "0.4"
# Cover-art delivery in the game library: encode Lutris's local JPEGs into `data:` URLs and decode
# the Epic launcher's base64 `catcache.bin`. Cross-platform (Linux Lutris art + Windows Epic art).
base64 = "0.22"
# Blocking HTTP for the library cover-art warmer (no-auth GOG api.gog.com + Xbox displaycatalog),
# run on a background thread off the hot path. `ureq` is small + sync (no tokio here) and bundles
# webpki roots (no system cert dependency). Cross-platform so the fetch/parse code is compiled +
# checked everywhere even though only the Windows GOG/Xbox providers need it today.
ureq = "2"
rcgen = { version = "0.13", default-features = false, features = ["aws_lc_rs", "pem"] }
x509-parser = "0.16"
axum-server = { version = "0.7", features = ["tls-rustls"] }
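The `base64` dependency comment above describes inlining local cover-art JPEGs as `data:` URLs. To show the shape of such a URL without pulling in a crate, the sketch below uses a small hand-rolled standard-alphabet encoder. The real code uses the `base64` crate, and both `base64_encode` and `jpeg_data_url` are hypothetical names for illustration only.

```rust
/// Standard base64 alphabet (RFC 4648), with `=` padding applied below.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Minimal stand-in for the `base64` crate: encodes 3 input bytes into 4 output characters.
fn base64_encode(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() + 2) / 3 * 4);
    for chunk in input.chunks(3) {
        let b0 = u32::from(chunk[0]);
        let b1 = u32::from(*chunk.get(1).unwrap_or(&0));
        let b2 = u32::from(*chunk.get(2).unwrap_or(&0));
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[((n >> 18) & 63) as usize] as char);
        out.push(ALPHABET[((n >> 12) & 63) as usize] as char);
        // Missing input bytes become `=` padding rather than encoded zeros.
        if chunk.len() > 1 {
            out.push(ALPHABET[((n >> 6) & 63) as usize] as char);
        } else {
            out.push('=');
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[(n & 63) as usize] as char);
        } else {
            out.push('=');
        }
    }
    out
}

/// Wrap raw JPEG bytes in a self-contained `data:` URL.
fn jpeg_data_url(jpeg_bytes: &[u8]) -> String {
    format!("data:image/jpeg;base64,{}", base64_encode(jpeg_bytes))
}
```

A `data:` URL carries the image inside the JSON response itself, so the host does not need a separate image-serving endpoint.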
@@ -89,9 +97,6 @@ serde_json = "1"
# SQLite (cc, already needed for ffmpeg/opus) so there's no system libsqlite3 runtime dependency —
# clean for the deb/rpm/flatpak packaging. Opened read-only/immutable (Lutris may hold it open).
rusqlite = { version = "0.40", features = ["bundled"] }
# Inline Lutris's local cover-art JPEGs as `data:` URLs in the library (Lutris has no public CDN
# keyed by a stable id, unlike Steam/Heroic; a `data:` URL is self-contained — no host-served endpoint).
base64 = "0.22"
# Builds/validates the xkb keymap uploaded to the virtual keyboard + tracks modifier state.
xkbcommon = "0.8"
# The safe `opus` crate is stereo-only; surround (5.1/7.1) needs the libopus *multistream*
@@ -176,6 +181,12 @@ windows = { version = "0.62", features = [
# handler / ServiceManager install). Wraps the Win32 service API; the supervision loop itself uses
# the `windows` crate above.
windows-service = "0.7"
# Read the GOG.com install registry (HKLM\SOFTWARE\WOW6432Node\GOG.com\Games) for the GOG store
# provider — ergonomic + correct-by-construction vs. hand-rolled Reg* FFI for subkey enumeration.
winreg = "0.56"
# Parse each Xbox/Game-Pass game's MicrosoftGame.config (GDK manifest XML) for the Xbox store
# provider — a small read-only DOM is all we need (Identity/Executable/ShellVisuals/StoreId).
roxmltree = "0.21"
# Software H.264 encoder (GPU-less path + NVENC fallback). The default `source` feature statically
# compiles OpenH264 (BSD-2) — no system lib, builds on MSVC; nasm on PATH adds the SIMD fast path.
openh264 = "0.9"
@@ -13,6 +13,9 @@
//! when the client isn't talking. WASAPI objects are `!Send`, so they live entirely on that thread
//! (mirrors `WasapiLoopbackCapturer`).

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{VirtualMic, SAMPLE_RATE};
use anyhow::{anyhow, Context, Result};
use std::collections::VecDeque;
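The hunk above turns on `clippy::undocumented_unsafe_blocks`, which makes Clippy reject any `unsafe` block that lacks a preceding `// SAFETY:` comment. A minimal illustration of the shape the lint accepts is sketched below. `first_sample` is a made-up function, and the attribute is applied per item here rather than file-wide as in the diff.

```rust
/// Returns the first PCM sample, or `None` for an empty buffer.
#[deny(clippy::undocumented_unsafe_blocks)]
fn first_sample(samples: &[i16]) -> Option<i16> {
    if samples.is_empty() {
        return None;
    }
    // SAFETY: the slice was just checked to be non-empty, so index 0 is in bounds.
    Some(unsafe { *samples.get_unchecked(0) })
}
```

Plain `rustc` ignores the `clippy::` lint, so the check only fires under `cargo clippy`. Deleting the `// SAFETY:` line should then fail the build for this item.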
@@ -154,6 +157,13 @@ fn find_or_install_device() -> Result<wasapi::Device> {
Ok(d) => Ok(d),
Err(e) => {
tracing::info!("no virtual mic device present — attempting auto-install");
// SAFETY: `try_install_virtual_mic` is `unsafe` only because it `LoadLibraryExW`s
// `newdev.dll` and calls `DiInstallDriverW` through a `transmute`d function pointer;
// calling it imposes no extra precondition here (it takes no args and aliases nothing).
// Its internal contract holds: the `DiInstall` type matches the documented
// `BOOL DiInstallDriverW(HWND, PCWSTR, DWORD, PBOOL)` ABI, and it passes a
// NUL-terminated UTF-16 INF path with null/zero optional args. Invoked once on the
// dedicated mic thread.
if unsafe { try_install_virtual_mic() } {
find_device()
} else {
@@ -2,6 +2,10 @@
//! CPU-copy fallback (the portal delivers a CPU buffer; the encoder uploads it to the GPU
//! internally). Zero-copy dmabuf→NVENC import is deferred (plan §9 risk).

// Every unsafe block in this module tree carries a `// SAFETY:` proof; enforce it (unsafe-proof
// program). As a parent module this also covers the child modules (capture::windows/linux::*).
#![deny(clippy::undocumented_unsafe_blocks)]

use anyhow::Result;

/// Packed pixel layout of a [`CapturedFrame`]. The ScreenCast portal negotiates the
@@ -433,6 +437,11 @@ pub fn capture_virtual_output(
// DDA is the safety net (+ the secure-desktop path). The encode thread is set MTA so the WGC
// objects built on the watchdog thread (also MTA) are usable here; the keepalive is handed to WGC
// only on success, else to DDA. A hung watchdog thread is abandoned (holds no keepalive).
// SAFETY: `RoInitialize` is a combase FFI call that initializes the WinRT apartment for the calling
// thread. It takes the `RO_INIT_MULTITHREADED` enum by value and borrows no memory, so there is no
// pointer/lifetime/aliasing obligation; it is safe on any thread and idempotent — a second call on a
// thread already in a compatible apartment returns S_FALSE / RPC_E_CHANGED_MODE, which we discard.
// Runs on the encode thread that goes on to use the WGC (WinRT) objects built by the watchdog thread.
unsafe {
let _ = windows::Win32::System::WinRT::RoInitialize(
windows::Win32::System::WinRT::RO_INIT_MULTITHREADED,
@@ -17,6 +17,9 @@
//! instead of leaking it to process exit. The portal thread (when used) still parks on its zbus
//! connection until process exit.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{CapturedFrame, Capturer, DmabufFrame, FramePayload, PixelFormat};
use anyhow::{anyhow, Context, Result};
use std::os::fd::OwnedFd;
@@ -498,6 +501,12 @@ mod pipewire {

impl DmabufMap {
fn new(fd: i32, len: usize) -> Option<DmabufMap> {
// SAFETY: a null `addr` lets the kernel choose the mapping address; `fd` is a caller-owned
// dmabuf/MemFd fd, valid for the duration of this call, and `len` is the requested map length.
// `mmap` reads no Rust memory — it installs a fresh PROT_READ/MAP_SHARED page mapping and
// returns its base (or MAP_FAILED, checked below before `DmabufMap` adopts it). The returned
// region is a brand-new VMA, so it aliases no live Rust object, and it keeps the underlying
// object mapped independently of `fd` (which may be closed after this returns).
let ptr = unsafe {
libc::mmap(
std::ptr::null_mut(),
@@ -514,6 +523,11 @@ mod pipewire {

impl Drop for DmabufMap {
fn drop(&mut self) {
// SAFETY: `self.ptr`/`self.len` are exactly the base+length of a successful `mmap` in
// `DmabufMap::new` (constructed only when `ptr != MAP_FAILED`). This `DmabufMap` uniquely owns
// that mapping and `drop` runs once, so `munmap` releases a live mapping exactly once — no
// double-unmap. Every `&[u8]` derived from the mapping is bounded by this `DmabufMap`'s
// lifetime, so no borrow outlives the unmap.
unsafe {
libc::munmap(self.ptr, self.len);
}
@@ -719,6 +733,14 @@ mod pipewire {
if !ud.active.load(Ordering::Relaxed) {
return;
}
// SAFETY: `spa_buf` is the `*mut spa_buffer` of the PipeWire buffer we dequeued and still hold for
// this `.process` callback (not requeued until after `consume_frame` returns), so it is live. The
// block null-checks `spa_buf`, requires `n_datas != 0`, and null-checks the `datas` array pointer
// before forming any slice. `(*spa_buf).datas` points to `n_datas` libspa `spa_data` structs, and
// `pw::spa::buffer::Data` is `#[repr(transparent)]` over `spa_data` (the same cast
// `Buffer::datas_mut` performs — see the function doc), so the pointer cast + length describe
// exactly that array, in bounds. The PipeWire loop is single-threaded and owns the buffer here, so
// this `&mut` slice is the only reference to it (no aliasing/data race).
let datas: &mut [pw::spa::buffer::Data] = unsafe {
if spa_buf.is_null() || (*spa_buf).n_datas == 0 || (*spa_buf).datas.is_null() {
&mut []
@@ -783,6 +805,10 @@ mod pipewire {
// dup the fd so it survives the SPA buffer recycle — the encode thread
// imports it. (Content stability across the brief map+CSC window relies on
// the compositor's buffer-pool depth, like any zero-copy capture.)
// SAFETY: `datas[0].fd()` is the dmabuf fd owned by the live PipeWire buffer (valid
// for this callback). `fcntl(fd, F_DUPFD_CLOEXEC, 0)` reads only the integer fd,
// touches no Rust memory, and returns a fresh independent CLOEXEC duplicate (or -1).
// The original stays owned by PipeWire; the dup is a new fd we own (checked >= 0).
let dup =
unsafe { libc::fcntl(datas[0].fd() as i32, libc::F_DUPFD_CLOEXEC, 0) };
if dup >= 0 {
@@ -796,6 +822,10 @@ mod pipewire {
pts_ns,
format: fmt,
payload: FramePayload::Dmabuf(DmabufFrame {
// SAFETY: `dup` is the fresh fd `fcntl(F_DUPFD_CLOEXEC)` just returned
// (checked `dup >= 0`); nothing else owns it, so `OwnedFd` takes sole
// ownership and closes it exactly once on drop — no alias, no
// double-close.
fd: unsafe { OwnedFd::from_raw_fd(dup) },
fourcc,
modifier: ud.modifier,
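The two hunks above duplicate the buffer's fd with `fcntl(F_DUPFD_CLOEXEC)` and then hand the copy to `OwnedFd` so that it is closed exactly once. Outside the PipeWire callback, the same pattern can be written without `unsafe` through the standard library, as in the sketch below. It assumes a Unix target, and `dup_for_consumer` is an illustrative name, not a function from the codebase.

```rust
use std::fs::File;
use std::io;
use std::os::fd::{AsFd, OwnedFd};

/// Duplicate `source`'s descriptor so a consumer can keep using the underlying
/// object after the producer closes or recycles its own fd.
fn dup_for_consumer(source: &File) -> io::Result<OwnedFd> {
    // `try_clone_to_owned` returns a new descriptor that refers to the same open
    // file description; the returned `OwnedFd` closes it exactly once on drop.
    source.as_fd().try_clone_to_owned()
}
```

The duplicate is independent of the original descriptor: dropping the source `File` leaves the `OwnedFd` usable, which matches why the capture code dups before the SPA buffer is recycled.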
@@ -930,6 +960,11 @@ mod pipewire {
// cleanly if the real buffer is genuinely too small. MemPtr buffers (no fd) are same-process —
// trust `d.data()`.
let fd_len = if raw_fd > 0 {
// SAFETY: `libc::stat` is a C plain-old-data struct for which all-zero is a valid value, so
// `mem::zeroed()` is a sound initializer. `raw_fd` is the buffer's fd (`> 0` checked here) and
// valid for this callback; `fstat` writes metadata into `&mut st`, a live, aligned,
// correctly-sized stack `stat` that outlives the synchronous call. `st.st_size` is read only
// after the return value is confirmed `== 0`. `st` is a fresh local, so nothing aliases it.
unsafe {
let mut st: libc::stat = std::mem::zeroed();
(libc::fstat(raw_fd as i32, &mut st) == 0 && st.st_size > 0)
@@ -946,6 +981,14 @@ mod pipewire {
match DmabufMap::new(raw_fd as i32, map_len) {
Some(m) => {
_mapping = m;
// SAFETY: `_mapping` is the `DmabufMap` just stored; its `ptr`/`len` come from a
// successful `mmap` of `map_len` PROT_READ bytes, so `ptr` is non-null, page-aligned,
// and the VMA is one allocated object of `len` bytes valid for reads. In the common
// path `map_len == fd_len` (the fd's real size from `fstat`), so the mapping spans the
// whole object; the de-pad copy below is further bounded by the `offset <= buf.len()`
// and `needed > avail` guards. The `&[u8]` borrows `_mapping`, which lives to the end
// of `consume_frame`, so the slice never outlives the mapping, and the memory is only
// read here, so there is no aliasing/mutation.
Some(unsafe {
std::slice::from_raw_parts(_mapping.ptr as *const u8, _mapping.len)
})
@@ -1177,24 +1220,43 @@ mod pipewire {
// Latest-frame-only (OBS pattern): Mutter delivers buffers in bursts and
// recycles its pool; an older queued buffer carries a STALE frame. Drain all
// queued buffers, requeue the older ones, keep only the newest.
// SAFETY: `stream` is the live stream PipeWire passes into this `.process` callback on
// the loop thread, where `pw_stream_dequeue_buffer` is the documented call. It returns
// a `*mut pw_buffer` owned by the stream (or null when the queue is drained),
// null-checked before any use. The loop is single-threaded, so no concurrent access.
let mut newest = unsafe { stream.dequeue_raw_buffer() };
if newest.is_null() {
return;
}
let mut drained = 1u32;
loop {
// SAFETY: same stream/loop-thread contract as the dequeue above; each call returns
// the next stream-owned `*mut pw_buffer` or null (null-checked before use).
let next = unsafe { stream.dequeue_raw_buffer() };
if next.is_null() {
break;
}
// SAFETY: `newest` is a non-null `*mut pw_buffer` previously dequeued from this same
// stream and not yet requeued; `pw_stream_queue_buffer` hands ownership back to the
// stream. We immediately overwrite `newest = next`, so the requeued pointer is never
// touched again (no use-after-requeue). Loop thread, single-threaded.
unsafe { stream.queue_raw_buffer(newest) };
newest = next;
drained += 1;
}
// SAFETY: `newest` is the non-null buffer we still own (dequeued, not requeued);
// `.buffer` is a `*mut spa_buffer` field libpipewire populated. This is a single field
// load through a valid pointer — no mutation or aliasing.
let spa_buf = unsafe { (*newest).buffer };

// Inspect the newest buffer's header + first chunk for the diagnostic and the
// CORRUPTED skip. SPA_META_Header is optional — `hdr` may be null.
// SAFETY: `spa_buf` is the `*mut spa_buffer` of the buffer we still hold.
// `spa_buffer_find_meta_data` scans that buffer's metadata array for a `SPA_META_Header`
// of at least `size_of::<spa_meta_header>()` bytes and returns a pointer into the held
// buffer's metadata (or null). The size argument matches the struct the result is cast
// to, and the pointer stays valid as long as the buffer is held (until requeue). Null is
// handled below.
let hdr = unsafe {
spa::sys::spa_buffer_find_meta_data(
spa_buf,
@@ -1205,11 +1267,20 @@ mod pipewire {
let hdr_flags = if hdr.is_null() {
0u32
} else {
// SAFETY: reached only when `hdr` is non-null; it points to a `spa_meta_header`
// inside the live buffer's metadata (returned for a size >=
// `size_of::<spa_meta_header>()`, so `.flags` is in bounds). A single field read
// while the buffer is still held.
unsafe { (*hdr).flags }
};
// First data chunk's size + flags (used for the diagnostic + CORRUPTED check)
// and its data type (a dmabuf legitimately reports chunk size 0, so the size-0
// stale skip only applies to mappable SHM buffers).
// SAFETY: every dereference is guarded in order before any field read — `spa_buf`
// non-null, `n_datas > 0`, the `datas` (`*mut spa_data`) array non-null, and the first
// element's `chunk` (`*mut spa_chunk`) non-null. `d0` is that first `spa_data` and `c`
// its chunk; reading `(*d0).type_`, `(*c).size`, `(*c).flags` are in-bounds field loads
// of libspa structs inside the buffer we still hold. Single-threaded loop, no mutation.
let (chunk_size, chunk_flags, is_dmabuf) = unsafe {
if !spa_buf.is_null()
&& (*spa_buf).n_datas > 0
@@ -1246,11 +1317,17 @@ mod pipewire {
"capture: skipped a stale CORRUPTED/cursor buffer (GNOME)"
);
}
// SAFETY: `newest` is the non-null buffer we own (dequeued, never requeued on this
// skip path); hand it back to the stream exactly once and return without touching it
// again. Loop thread inside `.process`.
unsafe { stream.queue_raw_buffer(newest) };
return;
}

consume_frame(ud, spa_buf);
// SAFETY: `consume_frame` has finished reading `spa_buf` (and the `datas` borrows derived
// from `newest`), so requeuing the owned `newest` exactly once here is sound — no
// use-after-requeue. Loop thread inside `.process`.
unsafe { stream.queue_raw_buffer(newest) };
}));
if outcome.is_err() {
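The "latest-frame-only" loop in the `.process` callback above drains every queued buffer, hands the older ones straight back, and keeps only the newest. The same control flow can be modelled with safe containers, as in this sketch. A `VecDeque` stands in for the PipeWire stream queue and a `Vec` for the requeue side; `take_newest` is a hypothetical helper, not part of the capture code.

```rust
use std::collections::VecDeque;

/// Drain `queue`, pushing every buffer except the last onto `recycled`,
/// and return the newest one (or `None` if the queue was already empty).
fn take_newest<T>(queue: &mut VecDeque<T>, recycled: &mut Vec<T>) -> Option<T> {
    // The first dequeue decides whether there is anything to process at all.
    let mut newest = queue.pop_front()?;
    while let Some(next) = queue.pop_front() {
        // The previous candidate is stale: give it back, then adopt `next`.
        recycled.push(std::mem::replace(&mut newest, next));
    }
    Some(newest)
}
```

Because each stale buffer is moved into `recycled` at the moment it is superseded, ownership rules out touching it again, which is the property the `// SAFETY:` comments argue by hand for the raw pointers.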
@@ -15,6 +15,9 @@
//! composed while a session is live). Effectiveness can be build/driver-dependent; gated by
//! `PUNKTFUNK_FORCE_COMPOSED` (default ON; set =0 to disable).

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use windows::core::w;
@@ -48,6 +51,10 @@ impl ForceComposedFlip {
let st = stop.clone();
std::thread::Builder::new()
.name("composed-flip".into())
// SAFETY: `run` is this module's `unsafe fn` (it owns a desktop+window lifecycle via Win32
// FFI); it takes ownership of `st` (the stop `Arc<AtomicBool>`) and has no caller-side memory
// precondition. It is designed to own its thread for its whole duration — exactly the
// dedicated `composed-flip` thread spawned here.
.spawn(move || unsafe { run(st) })
.ok()?;
tracing::info!("force-composed-flip overlay started (Winlogon-aware)");
@@ -62,6 +69,9 @@ impl Drop for ForceComposedFlip {
}

extern "system" fn wndproc(hwnd: HWND, msg: u32, wp: WPARAM, lp: LPARAM) -> LRESULT {
// SAFETY: this is the window procedure the OS invokes with the window's own `hwnd` and a real
// message `(msg, wp, lp)`. `DefWindowProcW` performs default processing for exactly those
// parameters (all passed straight through by value); it borrows no Rust memory and is synchronous.
unsafe { DefWindowProcW(hwnd, msg, wp, lp) }
}
@@ -1,5 +1,5 @@
//! Input-desktop watcher (Windows) — the authoritative "normal vs secure desktop" signal for the
//! two-process secure-desktop design (docs/windows-secure-desktop.md).
//! two-process secure-desktop design (design/windows-secure-desktop.md).
//!
//! Windows switches the *input desktop* to "Winlogon" (the secure desktop) for UAC elevation, the
//! lock screen and the login screen, and back to "Default" for the normal session. WGC captures only
@@ -7,6 +7,9 @@
//! desktop's NAME (WTS session notifications miss UAC entirely, so the name is the reliable signal)
//! and publishes it as an atomic the capture mux + input path read.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use std::sync::atomic::{AtomicBool, AtomicU8, Ordering};
use std::sync::Arc;
use std::time::Duration;
@@ -33,6 +36,10 @@ impl DesktopWatcher {
// mux) sees the real state immediately. Otherwise a session that begins already on the secure
// desktop (e.g. a reconnect to a locked box) would read DESKTOP_NORMAL for the first poll
// interval and relay one stale normal-desktop frame — the "flash of the login screen" bug.
// SAFETY: `is_secure_desktop` is this module's `unsafe fn` — unsafe only because it calls Win32
// desktop FFI (`OpenInputDesktop`/`GetUserObjectInformationW`/`CloseDesktop`), with no caller
// precondition; it opens, names, and closes the input-desktop handle internally and is safe to
// call from any thread (here, on the thread running `DesktopWatcher::start`).
let initial = if unsafe { is_secure_desktop() } {
DESKTOP_SECURE
} else {
@@ -53,6 +60,9 @@ impl DesktopWatcher {
let mut candidate = initial;
let mut stable = 0u32;
while !st.load(Ordering::Relaxed) {
// SAFETY: same as in `start` — `is_secure_desktop` is self-contained Win32 desktop
// FFI with no caller precondition, called here on the dedicated `desktop-watch`
// polling thread.
let v = if unsafe { is_secure_desktop() } {
DESKTOP_SECURE
} else {
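The polling loop above tracks a `candidate` value and a `stable` counter, which suggests a debounce: a new desktop state is published only after it has been observed for several consecutive polls. The hunk does not show the rest of the loop, so the sketch below is an assumption about that logic. The `Debouncer` type, its `threshold` parameter, and the exact publish rule are all hypothetical.

```rust
/// Publishes a new state only after `threshold` consecutive identical observations.
struct Debouncer {
    published: u8,
    candidate: u8,
    stable: u32,
    threshold: u32,
}

impl Debouncer {
    /// Start with `initial` already published, mirroring `candidate = initial; stable = 0`.
    fn new(initial: u8, threshold: u32) -> Self {
        Debouncer { published: initial, candidate: initial, stable: 0, threshold }
    }

    /// Feed one poll result and return the currently published (debounced) state.
    fn observe(&mut self, v: u8) -> u8 {
        if v == self.candidate {
            self.stable = self.stable.saturating_add(1);
        } else {
            // A different reading restarts the run with this reading as the first hit.
            self.candidate = v;
            self.stable = 1;
        }
        if self.stable >= self.threshold {
            self.published = self.candidate;
        }
        self.published
    }
}
```

With a threshold of 2, a single-poll flicker never reaches the published value, at the cost of one extra poll interval of latency on real transitions.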
@@ -7,6 +7,9 @@
//! Validates only with a real GPU + an *activated* SudoVDA monitor (`DuplicateOutput` needs a live
//! WDDM output). Compiles on the GPU-less VM; the pure helpers are unit-tested there.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{CapturedFrame, Capturer, FramePayload, PixelFormat};
use anyhow::{anyhow, bail, Context, Result};
use std::ffi::c_void;
@@ -69,7 +72,12 @@ pub struct D3d11Frame {
pub texture: ID3D11Texture2D,
pub device: ID3D11Device,
}
// COM pointers, used only from the single owning thread.
// SAFETY: `D3d11Frame` owns an `ID3D11Texture2D` + `ID3D11Device`, which are COM interface pointers.
// D3D11 devices/resources use thread-safe (interlocked) COM reference counting, and the device is
// created free-threaded (`make_device` passes no `D3D11_CREATE_DEVICE_SINGLETHREADED`), so handing
// ownership of the frame to another thread — the capture→encode handoff — and releasing it there is
// sound. The value is moved, never aliased (no `Sync`), so there is no concurrent use of the
// single-threaded immediate context.
unsafe impl Send for D3d11Frame {}

pub fn pack_luid(luid: LUID) -> i64 {
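Review note: the `Send`-without-`Sync` argument above can be tried out without any D3D11 types. The sketch below is only an illustration under stated assumptions: `OwnedHandle` and `read_on_other_thread` are hypothetical names, and a boxed `u32` stands in for the COM interface pointer. It shows a raw-pointer wrapper (which is `!Send` and `!Sync` by default) being moved to another thread once `Send` is asserted, while still never being shared.

```rust
use std::thread;

/// Hypothetical stand-in for a COM-style wrapper: the raw pointer makes the
/// type `!Send` and `!Sync` unless we opt in explicitly.
struct OwnedHandle {
    ptr: *mut u32,
}

impl OwnedHandle {
    fn new(value: u32) -> Self {
        Self { ptr: Box::into_raw(Box::new(value)) }
    }

    fn get(&self) -> u32 {
        // SAFETY: `ptr` came from `Box::into_raw` in `new` and is freed only in `Drop`.
        unsafe { *self.ptr }
    }
}

impl Drop for OwnedHandle {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced by `Box::into_raw` and is released exactly once, here.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// SAFETY: the pointee is uniquely owned by this value, so moving the value moves sole
// ownership with it. We deliberately do NOT implement `Sync`, so it can never be shared.
unsafe impl Send for OwnedHandle {}

/// Moves the handle into a worker thread, reads it there, and drops it there.
fn read_on_other_thread(handle: OwnedHandle) -> u32 {
    thread::spawn(move || handle.get())
        .join()
        .expect("worker thread panicked")
}
```

Calling `read_on_other_thread(OwnedHandle::new(7))` hands the value to the worker and releases it on that thread, which mirrors the capture-to-encode handoff described in the comment. Removing the `unsafe impl Send` line makes the `thread::spawn` call fail to compile.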
@@ -295,6 +303,12 @@ unsafe fn d3dkmt_set_scheduling_priority_class(
fn elevate_process_gpu_priority() {
use std::sync::Once;
static ONCE: Once = Once::new();
// SAFETY: the closure calls two of this module's `unsafe fn`s — `enable_inc_base_priority`
// (adjusts the current-process token; it has no caller precondition and builds all its FFI args
// locally) and `d3dkmt_set_scheduling_priority_class` (loads gdi32 by name and calls the export).
// The latter requires `process` to be a valid process handle; `GetCurrentProcess()` returns the
// current-process pseudo-handle, which is always valid and needs no close. Runs once via
// `Once::call_once`; no raw pointers are dereferenced here.
ONCE.call_once(|| unsafe {
use windows::Win32::System::Threading::GetCurrentProcess;
let Some(prio) = configured_gpu_priority_class() else {
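Review note: the "runs once via `Once::call_once`" claim is easy to check in isolation. This is a minimal sketch, not the module's code: `init_once`, `runs_after`, and the `RUNS` counter are hypothetical names, and a counter increment stands in for the priority-elevation FFI calls.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

/// Counts how many times the one-time body actually executed.
static RUNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical one-time initializer: the closure body runs on the first call only.
fn init_once() {
    static ONCE: Once = Once::new();
    ONCE.call_once(|| {
        // Stand-in for the real one-time side effect.
        RUNS.fetch_add(1, Ordering::SeqCst);
    });
}

/// Calls `init_once` `calls` times and reports how often the body has run in total.
fn runs_after(calls: usize) -> usize {
    for _ in 0..calls {
        init_once();
    }
    RUNS.load(Ordering::SeqCst)
}
```

However many times `init_once` is called, from one thread or several, the body runs a single time and later callers wait until it has finished.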
@@ -538,6 +552,17 @@ unsafe extern "system" fn hybrid_query_hook(gpu_preference: *mut u32) -> i32 {
pub(crate) fn install_gpu_pref_hook() {
use std::sync::Once;
static HOOK: Once = Once::new();
// SAFETY: this one-time hook install only touches a region it has just validated.
// `LoadLibraryA("win32u.dll")` + `GetProcAddress("NtGdiDdDDIGetCachedHybridQueryValue")` yield the
// live base of the real exported function, so `target` is a valid executable code pointer to at
// least the 12 bytes the patch overwrites (an x64 prologue, per Apollo's verified hook). The two
// `ptr::copy_nonoverlapping`s each move exactly 12 bytes between the 12-byte stack arrays
// (`patch`/`readback`) and `target`, which `VirtualProtect(target, 12, PAGE_EXECUTE_READWRITE, …)`
// has just made writable (and is restored to `old` after) — source and dest never overlap (stack
// vs. loaded module image), so every access stays in mapped, in-bounds memory.
// `FlushInstructionCache` gets the current-process pseudo-handle + that same range. The DPI calls
// take by-value context handles / fill the live local `&mut old`/`&mut restore` for the duration of
// each synchronous call. Runs once via `Once::call_once`, before any DXGI use.
HOOK.call_once(|| unsafe {
use windows::Win32::System::LibraryLoader::{GetProcAddress, LoadLibraryA};
use windows::Win32::System::Memory::{
@@ -1389,6 +1414,14 @@ pub fn hdr_p010_selftest() -> Result<()> {
}
}

// SAFETY: this self-test creates its own D3D11 device + immediate context (`D3D11CreateDevice`,
// both checked non-null) and uses ONLY that device for the rest of the block: every
// `CreateTexture2D`/`CreateShaderResourceView`/`HdrP010Converter::{new,convert}`/`CopyResource`/
// `Map` is invoked on that device or its context, so all resources share one device and run on this
// single thread. The source texture's `D3D11_SUBRESOURCE_DATA` points at `fp16`, a live
// `Vec<u16>` of `W*H*4` samples with `SysMemPitch = W*8`, matching the W×H R16G16B16A16 texture;
// `fp16` outlives the synchronous `CreateTexture2D` that reads it. The mapped-pointer reads are
// proven individually at the `read_u16` closure below.
unsafe {
// Hardware D3D11 device (no adapter pin — the default GPU is fine for the self-test).
let mut device: Option<ID3D11Device> = None;
@@ -2038,7 +2071,11 @@ pub struct DuplCapturer {
dbg_cursor: u64,
_keepalive: Box<dyn Send>,
}
// COM objects used only from the one thread that owns the capturer (the encode thread).
// SAFETY: `DuplCapturer` holds D3D11 device/context/duplication COM pointers plus plain data. The
// device is created free-threaded (`make_device` sets no `D3D11_CREATE_DEVICE_SINGLETHREADED`) and
// COM reference counting is interlocked, so moving ownership of the whole capturer to another thread
// is sound. It is used by exactly one thread (the encode thread) at a time — moved to it once, never
// shared (no `Sync`) — so the single-threaded immediate context is never touched concurrently.
unsafe impl Send for DuplCapturer {}

impl DuplCapturer {
@@ -2051,6 +2088,13 @@ impl DuplCapturer {
gpu: bool,
want_hdr: bool,
) -> Result<Self> {
// SAFETY: runs on the capture thread that will own this `DuplCapturer`. `install_gpu_pref_hook()`
// and the DPI-context calls take by-value handles / no args and touch only thread/process state;
// `SetThreadExecutionState` takes a flags bitmask by value. `CreateDXGIFactory1` yields a live
// `IDXGIFactory1`, and every subsequent COM method (`EnumAdapters1`/`EnumOutputs`/`GetDesc1`/
// `GetDesc`/`cast`) is called on that factory or on an adapter/output it returned — each obtained
// through a checked `while let Ok(..)`/`?` — all from this one thread. No raw pointers are
// dereferenced; the borrowed strings/locals outlive each synchronous call.
unsafe {
// Stop DXGI hybrid-GPU output reparenting BEFORE we create the factory / enumerate outputs
// (the cause of the 0x887A0026 ACCESS_LOST churn on this hybrid box: RTX 4090 + AMD iGPU).
@@ -3207,6 +3251,11 @@ impl Capturer for DuplCapturer {
// the duplication up to 12 s). Better a few seconds of frozen-last-frame than dropping the stream.
let mut deadline = Instant::now() + Duration::from_secs(20);
loop {
// SAFETY: `acquire` is an `unsafe fn` because it drives the D3D11 immediate context + the
// output duplication, which must be touched only from the capturer's owning thread.
// `next_frame` runs on that one thread — `DuplCapturer` is `Send` but not `Sync`, so it is
// owned by a single (encode) thread for its whole life — and `&mut self` gives exclusive
// access for the call, satisfying that contract.
if let Some(f) = unsafe { self.acquire() }? {
self.ever_got_frame = true;
return Ok(f);
@@ -3253,6 +3302,8 @@ impl Capturer for DuplCapturer {
}

fn try_latest(&mut self) -> Result<Option<CapturedFrame>> {
// SAFETY: as in `next_frame` — `acquire` must run on the capturer's single owning thread, and
// `try_latest` is called on it (`DuplCapturer` is `Send`, not `Sync`); `&mut self` is exclusive.
unsafe { self.acquire() }
}

@@ -3264,11 +3315,19 @@ impl Capturer for DuplCapturer {
impl Drop for DuplCapturer {
fn drop(&mut self) {
if self.holding_frame {
// SAFETY: `self.dupl` is the live `IDXGIOutputDuplication` this capturer created and owns;
// `ReleaseFrame` is a valid COM method on it, called only when `holding_frame` records that a
// frame was acquired and not yet released (so it is not an unbalanced release). Drop runs on
// whichever thread owns the capturer — its sole owner, since it is `!Sync` — and the `&`
// borrow of the duplication outlives this synchronous call.
unsafe {
let _ = self.dupl.as_ref().map(|d| d.ReleaseFrame());
}
}
// Release the display/system-required execution state we took at open().
// SAFETY: `SetThreadExecutionState` is a Win32 FFI call taking an execution-state flag bitmask
// by value (`ES_CONTINUOUS` clears the display/system-required state taken at open); it borrows
// no Rust memory and is safe to call from any thread.
unsafe {
SetThreadExecutionState(ES_CONTINUOUS);
}

@@ -10,7 +10,10 @@
//! [`pf_driver_proto::frame`] (which OWNS the contract, with `const` size asserts) — both sides
//! `use` it, so drift is a compile error rather than a "must match" comment.

use super::dxgi::{make_device, D3d11Frame, HdrConverter, WinCaptureTarget};
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use super::dxgi::{make_device, D3d11Frame, HdrP010Converter, VideoConverter, WinCaptureTarget};
use super::{CapturedFrame, Capturer, FramePayload, PixelFormat};
use anyhow::{bail, Context, Result};
use pf_driver_proto::frame;
@@ -20,13 +23,12 @@ use std::time::{Duration, Instant, SystemTime, UNIX_EPOCH};
use windows::core::{w, Interface, HSTRING};
use windows::Win32::Foundation::{HANDLE, INVALID_HANDLE_VALUE, LUID};
use windows::Win32::Graphics::Direct3D11::{
ID3D11Device, ID3D11DeviceContext, ID3D11RenderTargetView, ID3D11ShaderResourceView,
ID3D11Texture2D, D3D11_BIND_RENDER_TARGET, D3D11_BIND_SHADER_RESOURCE,
D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX, D3D11_RESOURCE_MISC_SHARED_NTHANDLE,
D3D11_TEXTURE2D_DESC, D3D11_USAGE_DEFAULT,
ID3D11Device, ID3D11DeviceContext, ID3D11ShaderResourceView, ID3D11Texture2D,
D3D11_BIND_RENDER_TARGET, D3D11_BIND_SHADER_RESOURCE, D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX,
D3D11_RESOURCE_MISC_SHARED_NTHANDLE, D3D11_TEXTURE2D_DESC, D3D11_USAGE_DEFAULT,
};
use windows::Win32::Graphics::Dxgi::Common::{
DXGI_FORMAT, DXGI_FORMAT_B8G8R8A8_UNORM, DXGI_FORMAT_R10G10B10A2_UNORM,
DXGI_FORMAT, DXGI_FORMAT_B8G8R8A8_UNORM, DXGI_FORMAT_NV12, DXGI_FORMAT_P010,
DXGI_FORMAT_R16G16B16A16_FLOAT, DXGI_SAMPLE_DESC,
};
use windows::Win32::Graphics::Dxgi::{
@@ -205,21 +207,33 @@ pub struct IddPushCapturer {
/// cleared when a fresh frame resumes. If it stays set past the recovery window, `try_consume` drops
/// the session (recover-or-drop, no DDA).
recovering_since: Option<Instant>,
/// Host-owned ROTATING output ring NVENC encodes (texture + RTV per slot). Rotating it per frame is
/// the precondition for pipelining the encode loop: while NVENC encodes frame N's texture on the
/// ASIC, frame N+1's convert/copy writes a DIFFERENT texture on the 3D engine — the two overlap. The
/// HDR convert and the SDR copy both write into the current slot. Format = `out_format()` (Rgb10a2 in
/// HDR, Bgra in SDR); rebuilt on a display-mode flip. Built lazily.
out_ring: Vec<(ID3D11Texture2D, ID3D11RenderTargetView)>,
/// Host-owned ROTATING output ring NVENC encodes (one YUV texture per slot). Rotating it per frame
/// is the precondition for pipelining the encode loop: while NVENC encodes frame N's texture on the
/// ASIC, frame N+1's convert writes a DIFFERENT texture — the two overlap. Format = `out_format()`:
/// NV12 (SDR, BT.709 limited) or P010 (HDR, BT.2020 PQ limited), so NVENC takes native YUV and skips
/// its internal RGB→YUV CSC on the SM/3D engine the game saturates (plan §5.A). Rebuilt on a
/// display-mode flip. Built lazily.
out_ring: Vec<ID3D11Texture2D>,
out_idx: usize,
/// FP16 scRGB → `Rgb10a2` BT.2020 PQ converter, used while the display is HDR. Built lazily.
hdr_conv: Option<HdrConverter>,
/// BGRA slot → NV12 (BT.709 limited) on the dedicated D3D11 VIDEO engine, used while the display is
/// SDR — keeps the colour-convert OFF the contended 3D/compute engine. Built lazily; rebuilt on a
/// size/HDR flip.
video_conv: Option<VideoConverter>,
/// FP16 scRGB slot → P010 (BT.2020 PQ limited) via two shader passes, used while the display is HDR
/// (NVIDIA's VideoProcessor can't do RGB→P010). The passes run on the 3D engine, but it still skips
/// NVENC's internal SM-side CSC. Built lazily.
hdr_p010_conv: Option<HdrP010Converter>,
last_seq: u64,
last_present: Option<(ID3D11Texture2D, PixelFormat)>,
status_logged: bool,
_keepalive: Box<dyn Send>,
}
// COM objects used only from the owning (encode) thread.
// SAFETY: `IddPushCapturer` is `!Send` only because of its `*mut SharedHeader`/`*mut DebugBlock` raw
// pointers (and the COM interfaces). It is created, used, and dropped by a SINGLE thread — the owning
// capture/encode thread — never shared: the `ID3D11DeviceContext` is the device's IMMEDIATE context
// (single-threaded by D3D11 contract) and is only ever touched from that thread, and the header/
// dbg_block pointers (into mappings this struct owns) are only dereferenced there. `Send` transfers
// ownership to one thread at a time with NO concurrent access; we do not (and must not) claim `Sync`.
unsafe impl Send for IddPushCapturer {}

/// Build a permissive (Everyone:GenericAll) `SECURITY_ATTRIBUTES` so the restricted WUDFHost driver
@@ -337,6 +351,9 @@ impl IddPushCapturer {
// a fullscreen game can hold the virtual display at a different mode (esp. across a reconnect), so
// matching the actual mode lets the first frame flow instead of being dropped (game-capture bug
// GB1). Falls back to the negotiated mode when the CCD read is unavailable.
// SAFETY: `active_resolution` is an `unsafe fn` (Win32 CCD `QueryDisplayConfig`) that takes only a
// copy of the plain `u32` CCD target id and returns owned `(w, h)` values; it forms no borrows from
// us and validates the id internally, returning `None` on any failure (handled by `unwrap_or`).
let (w, h) =
unsafe { crate::win_display::active_resolution(target.target_id) }.unwrap_or((pw, ph));
if (w, h) != (pw, ph) {
@@ -355,6 +372,27 @@ impl IddPushCapturer {
// PROACTIVELY enable advanced color so HDR streams without the user toggling anything; an
// SDR-only client leaves the display alone (and still gets a tone-mapped picture, never a freeze,
// if the user does enable HDR).
// SAFETY: one block over the whole ring setup; every operation in it is sound:
// - `set_advanced_color`/`advanced_color_enabled` are `unsafe fn`s taking only a copy of the plain
// `u32` target id; they read/flip CCD display config and return owned values, borrowing nothing.
// - `CreateDXGIFactory1`, `EnumAdapterByLuid`, `make_device`, `permissive_sa`, `CreateFileMappingW`,
// `MapViewOfFile`, `CreateEventW`, and `create_ring_slots` are all `?`-checked, so every returned
// interface/handle/view is non-error before use; `&sa`/`&adapter`/`&device`/the `&HSTRING` names
// are live borrows that outlive each synchronous call, and `sa.lpSecurityDescriptor` stays valid
// because its backing `_psd` is held in scope for the whole block.
// - The header mapping is created AND viewed at `bytes == size_of::<SharedHeader>().max(64)`; the
// view's null is checked (`bail!` on failure, after which the owned `map` closes the mapping). The
// OS view base is page-aligned, so `section.ptr::<SharedHeader>()` is suitably aligned for a
// `SharedHeader`, and `write_bytes(.., 0, bytes)` plus the `(*header).field = ..` writes all stay
// within those `bytes` and write THROUGH the raw pointer without forming any `&mut`. The debug
// section is the same pattern at `dbg_bytes == size_of::<DebugBlock>()`, only entered when its
// own view is non-null.
// - The `magic` publish stores through `addr_of!((*header).magic) as *const AtomicU32`: `addr_of!`
// takes the field address without a reference; the field is a 4-aligned `u32` (valid for
// `AtomicU32`), and the `Release` store after the `Release` fence is the cross-process handshake
// that orders all preceding writes before the driver may observe `MAGIC`.
// - `header`/`dbg_block` point into the OS mappings, NOT into the `MappedSection` structs, so moving
// `section`/`dbg_section` into `me` leaves them valid (see the `MappedSection` doc comment).
unsafe {
// If we ENABLE advanced color for a 10-bit client, trust it (the driver will compose FP16) and
// size the ring FP16 directly — don't race the advanced_color_enabled poll, which may not have
@@ -504,7 +542,8 @@ impl IddPushCapturer {
recovering_since: None,
out_ring: Vec::new(),
out_idx: 0,
hdr_conv: None,
video_conv: None,
hdr_p010_conv: None,
last_seq: 0,
last_present: None,
status_logged: false,
@@ -523,7 +562,7 @@ impl IddPushCapturer {

/// Block (bounded) until the driver has ATTACHED to the host ring (`DRV_STATUS_OPENED`) **and published
/// a first frame**, else fail so the caller can fall back to DDA (audit §5.1 +
/// `docs/windows-host-rewrite.md` §2.5 — the GB1 game-capture fix).
/// `design/windows-host-rewrite.md` §2.5 — the GB1 game-capture fix).
///
/// Requiring the first frame — not just the attach — catches the *reconnect-into-a-broken-state* case:
/// a fullscreen game can leave the virtual display in a format/size that the driver's `publish()` guard
@@ -534,10 +573,16 @@ impl IddPushCapturer {
fn wait_for_attach(&self) -> Result<()> {
let deadline = Instant::now() + Duration::from_secs(4);
loop {
// Plain read: the driver writes this u32; an aligned u32 read can't tear (same access as
// SAFETY: `self.header` points into the live shared-header mapping this capturer owns (sized
// `>= size_of::<SharedHeader>()`, page-aligned), so the field read is in-bounds + aligned, and
// no reference into the shared region is formed. Plain read: the driver writes this `u32`
// cross-process, but an aligned `u32` read can't tear and `driver_status` is best-effort
// diagnostics — the real handshake is the atomic `magic`/`latest` (same access as
// log_driver_status_once).
let st = unsafe { (*self.header).driver_status };
if matches!(st, DRV_STATUS_TEX_FAIL | DRV_STATUS_NO_DEVICE1) {
// SAFETY: as above — an in-bounds, aligned `u32` read of a best-effort diagnostic field
// through the owned, live header mapping; no reference into the shared region is formed.
let detail = unsafe { (*self.header).driver_status_detail };
bail!(
"IDD-push driver failed to attach (driver_status={st} detail=0x{detail:08x} — \
@@ -560,6 +605,10 @@ impl IddPushCapturer {

#[inline]
fn latest(&self) -> u64 {
// SAFETY: `self.header` is the live, owned shared-header mapping (page-aligned, sized for a
// `SharedHeader`). `addr_of!((*self.header).latest)` forms the address of the `latest` field
// WITHOUT a reference; it is an 8-aligned `u64` (so valid for `AtomicU64`), and the `Acquire` load
// is the consumer half of the cross-process publish handshake (pairs with the driver's `Release`).
unsafe {
(*(std::ptr::addr_of!((*self.header).latest) as *const AtomicU64))
.load(Ordering::Acquire)
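Review note: the `addr_of!`-then-cast-to-atomic access used by `latest()` can be exercised on any platform. The sketch below is an illustration under assumptions, not the real header layout: `Header`, `store_latest`, `load_latest`, and `roundtrip` are hypothetical, and a stack value stands in for the shared mapping. The explicit `align(8)` is there so the `u64` field is valid for `AtomicU64` even on targets where a plain `u64` is only 4-aligned.

```rust
use std::ptr::{addr_of, addr_of_mut};
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a shared header; `latest` sits at offset 8.
#[repr(C, align(8))]
struct Header {
    magic: u32,
    status: u32,
    latest: u64,
}

/// # Safety
/// `header` must point to a live, properly aligned `Header`.
unsafe fn store_latest(header: *mut Header, value: u64) {
    // SAFETY: the caller guarantees `header` is live and aligned; `addr_of_mut!` takes the
    // field address without creating a `&mut`, and the field is 8-aligned.
    unsafe {
        let field = addr_of_mut!((*header).latest) as *const AtomicU64;
        (*field).store(value, Ordering::Release);
    }
}

/// # Safety
/// `header` must point to a live, properly aligned `Header`.
unsafe fn load_latest(header: *const Header) -> u64 {
    // SAFETY: same contract as `store_latest`; only the one field is touched, atomically.
    unsafe {
        let field = addr_of!((*header).latest) as *const AtomicU64;
        (*field).load(Ordering::Acquire)
    }
}

/// Publishes `value` through the raw pointer and reads it back the same way.
fn roundtrip(value: u64) -> u64 {
    let mut header = Header { magic: 0, status: 0, latest: 0 };
    let p: *mut Header = &mut header;
    // SAFETY: `p` points at the live local `header` for the whole block.
    unsafe {
        store_latest(p, value);
        load_latest(p)
    }
}
```

The point of going through `addr_of!` is that no `&Header` or `&mut Header` is ever formed over memory another process may be writing; only the single field is viewed as an atomic.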
@@ -571,6 +620,10 @@ impl IddPushCapturer {
if self.status_logged {
return;
}
// SAFETY: four in-bounds, aligned reads of the live, owned shared-header mapping. The driver writes
// these `u32`/`i32` diagnostic fields cross-process, but aligned word reads can't tear and these are
// best-effort status (the real handshake is the atomic `magic`/`latest`); no `&`/`&mut` reference
// into the shared region is formed.
let (status, detail, lo, hi) = unsafe {
(
(*self.header).driver_status,
@@ -610,6 +663,11 @@ impl IddPushCapturer {
tracing::warn!("IDD push DEBUG: no debug block");
return;
}
// SAFETY: `self.dbg_block` was just checked non-null (the early return above); it points into the
// owned `dbg_section` mapping sized exactly `size_of::<DebugBlock>()` and page-aligned, so it is
// valid + aligned for `DebugBlock`. `d` is a short-lived SHARED reference used only to read the
// fields below; we never form `&mut` into this region, and the driver's cross-process writes are
// aligned `u32`s that don't tear (best-effort bring-up diagnostics).
let d = unsafe { &*self.dbg_block };
tracing::error!(
run_core_entries = d.run_core_entries,
@@ -625,16 +683,17 @@ impl IddPushCapturer {
);
}

/// The output texture format + the [`PixelFormat`] it presents as, driven SOLELY by the DISPLAY's
/// HDR state (like the WGC path): HDR → `Rgb10a2` BT.2020 PQ → NVENC Main10, and the client
/// auto-detects PQ from the HEVC VUI; SDR → 8-bit `Bgra`. We do NOT gate HDR on the client's
/// advertised `VIDEO_CAP_10BIT` — clients under-report it (e.g. the Mac advertises 10-bit only when
/// its OWN display is HDR), yet all decode Main10 + auto-switch, exactly as on the WGC path.
/// The output texture format + the [`PixelFormat`] NVENC encodes, driven SOLELY by the DISPLAY's HDR
/// state (like the WGC path): HDR → `P010` (BT.2020 PQ 10-bit limited) → NVENC Main10, and the client
/// auto-detects PQ from the HEVC VUI; SDR → `Nv12` (BT.709 8-bit limited). Both are native YUV so
/// NVENC skips its internal RGB→YUV CSC on the contended SM (plan §5.A). We do NOT gate HDR on the
/// client's advertised `VIDEO_CAP_10BIT` — clients under-report it (e.g. the Mac advertises 10-bit
/// only when its OWN display is HDR), yet all decode Main10 + auto-switch, exactly as on the WGC path.
fn out_format(&self) -> (DXGI_FORMAT, PixelFormat) {
if self.display_hdr {
(DXGI_FORMAT_R10G10B10A2_UNORM, PixelFormat::Rgb10a2)
(DXGI_FORMAT_P010, PixelFormat::P010)
} else {
(DXGI_FORMAT_B8G8R8A8_UNORM, PixelFormat::Bgra)
(DXGI_FORMAT_NV12, PixelFormat::Nv12)
}
}

@@ -658,6 +717,10 @@ impl IddPushCapturer {
self.height = new_h;
let fmt = self.ring_format();
let new_gen = IDD_GENERATION.fetch_add(1, Ordering::Relaxed);
// SAFETY: `create_ring_slots` is an `unsafe fn` (it makes D3D11/DXGI COM calls); we pass a live
// borrow of `self.device` (the capturer's own device, on which the slots are created) plus plain
// `u32`/`DXGI_FORMAT` values, and `?` propagates any failure before the slots are used. Every
// returned slot's texture + keyed mutex belongs to that same `self.device`.
let new_slots = unsafe {
Self::create_ring_slots(
&self.device,
@@ -668,6 +731,12 @@ impl IddPushCapturer {
fmt,
)?
};
// SAFETY: `self.header` is the live, owned shared-header mapping (page-aligned, sized for a
// `SharedHeader`). The `latest`/`generation` stores go through `addr_of!`-formed field pointers (no
// references) of correctly-aligned `u64`/`u32` fields, valid for `AtomicU64`/`AtomicU32`; the
// `dxgi_format`/`width`/`height` writes are in-bounds raw writes through the pointer (no `&mut`).
// The `Release` fence + the `Release` `generation` store publish all preceding writes so the driver
// only re-attaches (`Acquire`) once the new textures + format are in place.
unsafe {
// Clear `latest` to the 0 sentinel (generation 0, which try_consume rejects). The real guard
// against consuming an unwritten new-ring slot is the generation tag in `latest`: a stale
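Review note: the publish ordering argued above (write the payload, then a `Release` store of `generation`, observed with `Acquire`) can be sketched in-process with two threads. This is an assumption-laden illustration: `Shared` and `publish_and_observe` are hypothetical, atomics stand in for the raw header fields, and a second thread stands in for the driver process.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical stand-in for the shared header: payload fields plus a generation flag.
struct Shared {
    width: AtomicU32,
    height: AtomicU32,
    generation: AtomicU32,
}

/// The writer fills in the payload and then publishes `generation` with `Release`;
/// the reader waits for the new generation with `Acquire` before reading the payload.
fn publish_and_observe(w: u32, h: u32) -> (u32, u32) {
    let shared = Arc::new(Shared {
        width: AtomicU32::new(0),
        height: AtomicU32::new(0),
        generation: AtomicU32::new(0),
    });

    let writer = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            // Payload writes may be relaxed: the Release store below orders them.
            shared.width.store(w, Ordering::Relaxed);
            shared.height.store(h, Ordering::Relaxed);
            shared.generation.store(1, Ordering::Release);
        })
    };

    // Spin until the new generation is visible. The Acquire load synchronizes with the
    // Release store, so the payload written before it is guaranteed to be visible too.
    while shared.generation.load(Ordering::Acquire) == 0 {
        std::hint::spin_loop();
    }
    let seen = (
        shared.width.load(Ordering::Relaxed),
        shared.height.load(Ordering::Relaxed),
    );

    writer.join().expect("writer thread panicked");
    seen
}
```

If the `generation` store were `Relaxed` instead, the reader could in principle observe the new generation while still seeing the old width and height, which is the failure the comment's `Release` store is there to rule out.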
@@ -688,6 +757,8 @@ impl IddPushCapturer {
self.generation = new_gen;
self.last_seq = 0;
self.out_ring.clear(); // the output format changed → rebuild lazily at the new format
self.video_conv = None; // converters are sized + HDR-specific → rebuild at the new mode
self.hdr_p010_conv = None;
self.out_idx = 0;
self.last_present = None;
Ok(())
@@ -701,9 +772,13 @@ impl IddPushCapturer {
return;
}
self.last_acm_poll = Instant::now();
// SAFETY: `advanced_color_enabled` is an `unsafe fn` taking only a copy of the plain `u32` target
// id; it performs a read-only CCD query and returns an owned `bool`, borrowing nothing from us.
let now_hdr = unsafe { crate::win_display::advanced_color_enabled(self.target_id) };
// Follow the display's ACTUAL resolution too — a fullscreen game can mode-set the virtual display
// out from under the negotiated size (game-capture bug GB1). Unknown read → keep our current size.
// SAFETY: `active_resolution` is an `unsafe fn` taking only a copy of the plain `u32` target id; it
// performs a read-only CCD query and returns owned `(w, h)` values, borrowing nothing from us.
let (now_w, now_h) = unsafe { crate::win_display::active_resolution(self.target_id) }
.unwrap_or((self.width, self.height));
if now_hdr == self.display_hdr && now_w == self.width && now_h == self.height {
@@ -742,31 +817,46 @@ impl IddPushCapturer {
Quality: 0,
},
Usage: D3D11_USAGE_DEFAULT,
BindFlags: (D3D11_BIND_RENDER_TARGET.0 | D3D11_BIND_SHADER_RESOURCE.0) as u32,
// RENDER_TARGET: the VIDEO processor (NV12) and the P010 shader passes both write here, and
// NVENC registers it as encode input — matching the WGC YUV ring.
BindFlags: D3D11_BIND_RENDER_TARGET.0 as u32,
CPUAccessFlags: 0,
MiscFlags: 0,
};
for _ in 0..OUT_RING {
let mut t: Option<ID3D11Texture2D> = None;
let mut rtv: Option<ID3D11RenderTargetView> = None;
// SAFETY: `CreateTexture2D` is called on `self.device` (the capturer's live D3D11 device);
// `&desc` is a fully-initialized stack `D3D11_TEXTURE2D_DESC`, the data arg is `None` (no
// initial data), and `Some(&mut t)` is a live out-parameter the call fills. `?` rejects a failed
// HRESULT before `t` is unwrapped, and the created texture belongs to `self.device`.
unsafe {
self.device
.CreateTexture2D(&desc, None, Some(&mut t))
.context("CreateTexture2D(IDD out ring)")?;
let t = t.context("null out-ring texture")?;
self.device
.CreateRenderTargetView(&t, None, Some(&mut rtv))
.context("CreateRenderTargetView(IDD out ring)")?;
self.out_ring.push((t, rtv.context("null out-ring rtv")?));
self.out_ring.push(t.context("null out-ring texture")?);
}
}
Ok(())
}

/// Build the HDR converter if not already built (HDR-display path only — an SDR display is a copy).
/// Build the per-mode YUV converter if not already built: a VIDEO-engine BGRA→NV12 processor on an
/// SDR display, or the FP16→P010 shader on an HDR display. Both keep NVENC's RGB→YUV CSC off the SM.
fn ensure_converter(&mut self) -> Result<()> {
if self.hdr_conv.is_none() {
self.hdr_conv = Some(unsafe { HdrConverter::new(&self.device)? });
if self.display_hdr {
if self.hdr_p010_conv.is_none() {
// SAFETY: `HdrP010Converter::new` is `unsafe` (it compiles D3D11 shaders + creates
// resources); we pass a live borrow of `self.device`, the device the converter's resources
// belong to, and `?` propagates any failure before the converter is stored.
self.hdr_p010_conv = Some(unsafe { HdrP010Converter::new(&self.device)? });
}
} else if self.video_conv.is_none() {
// SAFETY: `VideoConverter::new` is `unsafe` (it sets up the D3D11 VIDEO processor); we pass live
// borrows of `self.device` + its immediate `self.context` (single-threaded, this thread) plus
// plain `u32` dimensions, and `?` propagates any failure before it is stored. The converter's
// resources belong to that same device/context.
self.video_conv = Some(unsafe {
VideoConverter::new(&self.device, &self.context, self.width, self.height, false)?
});
}
Ok(())
}
@@ -801,16 +891,11 @@ impl IddPushCapturer {
return Ok(None);
}
self.ensure_out_ring()?;
// Build the HDR converter BEFORE acquiring the slot so nothing between Acquire and Release can
// Build the converter BEFORE acquiring the slot so nothing between Acquire and Release can
// `?`-return and leak the keyed-mutex lock (which would stall the driver on that slot).
if self.display_hdr {
self.ensure_converter()?;
}
self.ensure_converter()?;
let i = self.out_idx;
let (out, out_rtv) = {
let (t, rtv) = &self.out_ring[i];
(t.clone(), rtv.clone())
};
let out = self.out_ring[i].clone();
let (_, pf) = self.out_format();

// Hold the slot's keyed mutex only across the convert/copy into the host out-ring (NOT across the
@@ -824,16 +909,27 @@ impl IddPushCapturer {
let Some(_lock) = KeyedMutexGuard::acquire(&s.mutex, 0, 8) else {
return Ok(None);
};
// SAFETY: convert/copy on the owning (encode) thread's immediate context, holding the slot lock.
// SAFETY: convert on the owning (encode) thread's immediate context, holding the slot lock.
// A `?` here is leak-safe: `_lock` (the KeyedMutexGuard) drops on the early return, releasing
// the slot back to the driver.
unsafe {
if self.display_hdr {
// Sample the FP16 slot's SRV directly (no scratch copy) → BT.2020 PQ Rgb10a2.
if let Some(conv) = self.hdr_conv.as_ref() {
conv.convert(&self.context, &s.srv, &out_rtv, self.width, self.height);
// HDR: FP16 slot SRV → P010 (BT.2020 PQ) via the shader; NVENC takes native P010.
if let Some(conv) = self.hdr_p010_conv.as_ref() {
conv.convert(
&self.device,
&self.context,
&s.srv,
&out,
self.width,
self.height,
)?;
}
} else {
// SDR: the slot is already 8-bit BGRA — one copy into the out-ring (hidden by pipelining).
self.context.CopyResource(&out, &s.tex);
// SDR: BGRA slot → NV12 on the VIDEO engine; NVENC takes native NV12, no SM-side CSC.
if let Some(conv) = self.video_conv.as_ref() {
conv.convert(&s.tex, &out)?;
}
}
}
// `_lock` drops here → `ReleaseSync(0)`.
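Review note: the "a `?` here is leak-safe" claim rests on the guard's `Drop` running on every exit path. The sketch below illustrates that with plain Rust and no keyed mutex: `SlotLock`, `convert`, `consume`, and `lock_released_after` are all hypothetical names, and a `Cell<bool>` stands in for the lock state the driver would observe.

```rust
use std::cell::Cell;

/// Hypothetical RAII guard: marks the slot as held on acquire, releases it on drop.
struct SlotLock<'a> {
    held: &'a Cell<bool>,
}

impl<'a> SlotLock<'a> {
    /// Returns `None` if the slot is already held (the "skip this frame" case).
    fn acquire(held: &'a Cell<bool>) -> Option<Self> {
        if held.get() {
            return None;
        }
        held.set(true);
        Some(Self { held })
    }
}

impl Drop for SlotLock<'_> {
    fn drop(&mut self) {
        // Runs on normal scope exit AND on an early `?` return.
        self.held.set(false);
    }
}

/// Stand-in for a fallible convert step performed while the lock is held.
fn convert(fail: bool) -> Result<(), String> {
    if fail {
        Err("convert failed".to_string())
    } else {
        Ok(())
    }
}

/// Acquire, convert, release. The `?` may return early, but `_lock` still drops.
fn consume(held: &Cell<bool>, fail: bool) -> Result<Option<u32>, String> {
    let Some(_lock) = SlotLock::acquire(held) else {
        return Ok(None);
    };
    convert(fail)?;
    Ok(Some(1))
}

/// Reports whether the slot is free again after `consume` returns.
fn lock_released_after(fail: bool) -> bool {
    let held = Cell::new(false);
    let _ = consume(&held, fail);
    !held.get()
}
```

Binding the guard to `_lock` rather than `_` matters here: a bare `_` pattern would drop the guard immediately, releasing the slot before the convert step runs.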
@@ -861,7 +957,7 @@ impl IddPushCapturer {
// OUT_RING(3) > the max pipeline_depth(2) guarantees the rotated slot is not in flight.
let (src, pf) = self.last_present.clone()?;
let i = self.out_idx;
let dst = self.out_ring.get(i)?.0.clone();
let dst = self.out_ring.get(i)?.clone();
// SAFETY: GPU copy on the owning thread's immediate context; src/dst are our out-ring textures of
// identical format/size (src is a previous out-ring slot; dst the next).
unsafe {
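Review note: the "OUT_RING(3) > the max pipeline_depth(2)" invariant can be checked with a small simulation. This is a simplified model, not the encoder's actual bookkeeping: it assumes one slot is written per frame, that exactly the last `depth` written slots are still in flight, and the names `MAX_PIPELINE_DEPTH` and `write_slot_never_in_flight` are hypothetical.

```rust
use std::collections::VecDeque;

const OUT_RING: usize = 3;
const MAX_PIPELINE_DEPTH: usize = 2;

/// Simulates `frames` rotations of the ring. Returns `true` if the slot about to be
/// written is never one of the `depth` most recently submitted (still in-flight) slots.
fn write_slot_never_in_flight(frames: usize, depth: usize) -> bool {
    let mut in_flight: VecDeque<usize> = VecDeque::new();
    let mut idx = 0;
    for _ in 0..frames {
        if in_flight.contains(&idx) {
            return false; // would overwrite a texture the encoder may still be reading
        }
        // Submit this slot, then retire the oldest one once the pipeline is full.
        in_flight.push_back(idx);
        if in_flight.len() > depth {
            in_flight.pop_front();
        }
        idx = (idx + 1) % OUT_RING;
    }
    true
}
```

Under this model a depth of `MAX_PIPELINE_DEPTH` never collides, while a depth equal to `OUT_RING` collides as soon as the ring wraps, which is why the ring has to be strictly larger than the deepest pipeline.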
@@ -922,6 +1018,8 @@ pub fn spawn_observer(target: WinCaptureTarget, preferred: Option<(u32, u32, u32

/// The discrete render GPU LUID (where NVENC runs), falling back to the monitor's `OsAdapterLuid`.
fn resolve_render_adapter_luid_or(fallback_packed: i64) -> LUID {
// SAFETY: `resolve_render_adapter_luid` is an `unsafe fn` (it enumerates DXGI adapters) that takes no
// arguments and returns an owned `Option<LUID>`, borrowing nothing.
if let Some(l) = unsafe { crate::win_adapter::resolve_render_adapter_luid() } {
return l;
}
@@ -935,6 +1033,9 @@ impl Capturer for IddPushCapturer {
|
||||
fn next_frame(&mut self) -> Result<CapturedFrame> {
|
||||
let deadline = Instant::now() + Duration::from_secs(20);
|
||||
loop {
|
||||
// SAFETY: `self.event` is the live frame-ready `OwnedHandle` this capturer owns; its raw value
|
||||
// (borrowed for the call, so it outlives this synchronous wait) is a valid auto-reset event
|
||||
// handle. `WaitForSingleObject` only reads the handle; the 16 ms timeout bounds the wait.
|
||||
let _ = unsafe { WaitForSingleObject(HANDLE(self.event.as_raw_handle()), 16) };
|
||||
if let Some(f) = self.try_consume()? {
|
||||
return Ok(f);
|
||||
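The loop in this hunk pairs a short bounded wait (16 ms) with an overall deadline (20 s). A minimal sketch of that shape follows. It swaps in a std `mpsc` channel for the Win32 event plus `try_consume`, so it illustrates the control flow only, not the capturer's actual code; `next_item` and its error strings are hypothetical names.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Waits for one item, waking every `tick` to re-check an overall deadline.
/// Assumption: a channel stands in for "wait on the frame-ready event, then try to consume".
fn next_item<T>(rx: &Receiver<T>, tick: Duration, overall: Duration) -> Result<T, &'static str> {
    let deadline = Instant::now() + overall;
    loop {
        match rx.recv_timeout(tick) {
            Ok(v) => return Ok(v),
            // The producer dropped its sender: no item can ever arrive.
            Err(RecvTimeoutError::Disconnected) => return Err("producer gone"),
            // Bounded wait elapsed with nothing ready: fall through to the deadline check.
            Err(RecvTimeoutError::Timeout) => {}
        }
        if Instant::now() > deadline {
            return Err("timed out");
        }
    }
}
```

The short tick keeps the loop responsive to the deadline even when the producer never signals.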
@@ -944,6 +1045,9 @@ impl Capturer for IddPushCapturer {
|
||||
}
|
||||
if Instant::now() > deadline {
|
||||
self.log_debug_block();
|
||||
// SAFETY: four in-bounds, aligned reads of the live, owned shared-header mapping — the same
|
||||
// best-effort diagnostic fields as `log_driver_status_once` (aligned word reads can't tear;
|
||||
// no reference into the shared region is formed).
|
||||
let (st, detail, lo, hi) = unsafe {
|
||||
(
|
||||
(*self.header).driver_status,
|
||||
|
||||
@@ -16,6 +16,9 @@
|
||||
//! Limitation: WGC cannot capture the secure desktop (lock / UAC / login) — the caller falls back to
|
||||
//! the DDA backend ([`super::dxgi::DuplCapturer`]) for those (see capture.rs).
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::dxgi::{
|
||||
find_output, hdr_shader_p010_enabled, make_device, nudge_cursor_onto, D3d11Frame, HdrConverter,
|
||||
HdrP010Converter, VideoConverter, WinCaptureTarget,
|
||||
@@ -92,6 +95,10 @@ struct Deimpersonate(Option<HANDLE>);
|
||||
impl Drop for Deimpersonate {
|
||||
fn drop(&mut self) {
|
||||
if let Some(tok) = self.0.take() {
|
||||
// SAFETY: `RevertToSelf` takes no arguments and undoes the thread impersonation set during
|
||||
// WGC activation; `tok` is the impersonation token `HANDLE` from `impersonate_active_user`,
|
||||
// owned by this `Deimpersonate` and closed exactly once here (taken out of the `Option`, so
|
||||
// no double-close). Both are FFI calls borrowing no Rust memory.
|
||||
unsafe {
|
||||
let _ = RevertToSelf();
|
||||
let _ = CloseHandle(tok);
|
||||
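The `self.0.take()` in this `Drop` is what backs the "closed exactly once" claim in the SAFETY comment. A platform-neutral sketch of the pattern follows, assuming a hypothetical `FakeHandle` and `close_handle` in place of the Win32 `HANDLE` and `CloseHandle`; neither name is part of this change.

```rust
use std::cell::Cell;

thread_local! {
    // Counts calls to the stand-in close function so the behaviour can be checked.
    static CLOSES: Cell<u32> = Cell::new(0);
}

/// Hypothetical stand-in for a raw OS handle (not the real `HANDLE`).
#[derive(Clone, Copy)]
struct FakeHandle(u32);

/// Hypothetical stand-in for `CloseHandle`.
fn close_handle(_h: FakeHandle) {
    CLOSES.with(|c| c.set(c.get() + 1));
}

/// Number of times `close_handle` has run on this thread.
fn closes() -> u32 {
    CLOSES.with(|c| c.get())
}

/// Guard that releases its handle at most once.
struct Guard(Option<FakeHandle>);

impl Drop for Guard {
    fn drop(&mut self) {
        // `take()` leaves `None` behind, so the handle cannot be closed a second
        // time through this guard, and a guard built with `None` closes nothing.
        if let Some(h) = self.0.take() {
            close_handle(h);
        }
    }
}
```

A guard holding `None` (no impersonation was set up) drops without touching the close path, which matches the `Option<HANDLE>` field in `Deimpersonate`.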
@@ -174,7 +181,12 @@ pub struct WgcCapturer {
|
||||
_keepalive: Option<Box<dyn Send>>,
|
||||
}
|
||||
|
||||
// COM + WinRT pointers; confined to the single owning (encode) thread, like DuplCapturer.
|
||||
// SAFETY: like `DuplCapturer`. `WgcCapturer` holds D3D11 (free-threaded device/context) plus WGC WinRT
|
||||
// objects (`Direct3D11CaptureFramePool` etc., created free-threaded via `CreateFreeThreaded`). COM/WinRT
|
||||
// reference counting is interlocked, and the capturer is owned + used by exactly one encode thread,
|
||||
// moved to it once and never shared (no `Sync`), so transferring ownership across threads is sound. The
|
||||
// free-threaded `FrameArrived` callback touches only the `Arc<WgcSignal>` (itself `Send + Sync`), not
|
||||
// the capturer's COM fields.
|
||||
unsafe impl Send for WgcCapturer {}
|
||||
|
||||
impl WgcCapturer {
|
||||
@@ -182,6 +194,15 @@ impl WgcCapturer {
|
||||
/// [`attach_keepalive`](Self::attach_keepalive) only after open succeeds, so a failure leaves the
|
||||
/// keepalive with the caller to hand to the DDA fallback.
|
||||
pub fn open(target: WinCaptureTarget, preferred: Option<(u32, u32, u32)>) -> Result<Self> {
|
||||
// SAFETY: runs on the thread opening the WGC session. `RoInitialize` inits this thread's WinRT
|
||||
// apartment (idempotent; result ignored). `impersonate_active_user()` and `find_output()` are
|
||||
// this module's `unsafe fn`s whose contracts (call on the activating thread; pass a GDI name)
|
||||
// are met, and the impersonation is reverted by `_deimp`'s Drop on every return path. Every
|
||||
// COM/WinRT call thereafter operates on an object obtained + `?`-checked earlier in this same
|
||||
// block on this single thread — the `IDXGIOutput1` from `find_output`, the device/context from
|
||||
// `make_device`, the factory/interop/item/pool/session — and the `TypedEventHandler` closure
|
||||
// captures an `Arc<WgcSignal>` (Send+Sync) by move. No raw pointers are dereferenced; borrowed
|
||||
// locals outlive their synchronous calls.
|
||||
unsafe {
|
||||
// WGC is WinRT — the calling thread needs a COM/WinRT apartment for the GraphicsCaptureItem
|
||||
// activation factory (RoGetActivationFactory). Initialize MTA; ignore "already initialized"
|
||||
@@ -585,6 +606,15 @@ impl WgcCapturer {
|
||||
}
|
||||
|
||||
fn process_frame(&mut self, frame: Direct3D11CaptureFrame) -> Result<CapturedFrame> {
|
||||
// SAFETY: runs on the capturer's single owning thread. `frame` is a live
|
||||
// `Direct3D11CaptureFrame` from `self.pool`; `frame.Surface().cast::<IDirect3DDxgiInterfaceAccess
|
||||
// >().GetInterface()` yields the frame's backing `ID3D11Texture2D`, which belongs to
|
||||
// `self.device` (the pool was created on it via `CreateDirect3D11DeviceFromDXGIDevice`). Every
|
||||
// helper called here — `hdr_to_p010`, `convert_to_yuv`, `ensure_fp16_src`, `ensure_out_ring`,
|
||||
// `HdrConverter::convert`, `CopyResource`, `CreateRenderTargetView` — operates on
|
||||
// `self.device`/`self.context` and that same-device texture, so all resources share one device.
|
||||
// The frame is held in `self.held` until its async GPU read completes for the zero-copy paths.
|
||||
// Single-threaded immediate-context use; borrowed textures/SRVs/RTVs outlive each synchronous call.
|
||||
unsafe {
|
||||
let surface = frame.Surface().context("frame Surface")?;
|
||||
let access: IDirect3DDxgiInterfaceAccess = surface
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
//! Host-side WGC helper relay (Windows two-process secure-desktop design,
|
||||
//! docs/windows-secure-desktop.md — step 4).
|
||||
//! design/windows-secure-desktop.md — step 4).
|
||||
//!
|
||||
//! WGC won't activate under the SYSTEM account, so the SYSTEM host can't capture the normal desktop
|
||||
//! itself. Instead it spawns `punktfunk-host wgc-helper` in the **interactive user session** (so WGC works)
|
||||
@@ -13,6 +13,9 @@
|
||||
//! Wire framing (must match `wgc_helper::write_au`): per AU
|
||||
//! `[u32 magic "PFAU" LE][u32 len LE][u64 pts_ns LE][u8 keyframe][len bytes data]`.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use crate::capture::dxgi::WinCaptureTarget;
|
||||
use anyhow::{bail, Context, Result};
|
||||
use std::io::{BufRead, BufReader, Read};
|
||||
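The wire framing in the module doc of this hunk, `[u32 magic "PFAU" LE][u32 len LE][u64 pts_ns LE][u8 keyframe][len bytes data]`, can be read with a fixed 17-byte header followed by the payload. The sketch below is an illustration under stated assumptions, not the relay's actual reader: it assumes the magic is the ASCII bytes `PFAU` read as a little-endian `u32`, and `Au`, `read_au`, `encode_au` and `MAX_AU_LEN` are hypothetical names.

```rust
use std::io::{self, Read};

/// One access unit, with fields named after the framing comment (illustrative).
#[derive(Debug, PartialEq)]
struct Au {
    pts_ns: u64,
    keyframe: bool,
    data: Vec<u8>,
}

/// Assumption: the magic is the ASCII bytes "PFAU" interpreted as a little-endian u32.
const MAGIC: u32 = u32::from_le_bytes(*b"PFAU");

/// Assumed cap so a corrupt length field cannot force a huge allocation.
const MAX_AU_LEN: usize = 16 * 1024 * 1024;

fn read_au<R: Read>(r: &mut R) -> io::Result<Au> {
    // Header layout: magic(4) + len(4) + pts_ns(8) + keyframe(1) = 17 bytes.
    let mut hdr = [0u8; 17];
    r.read_exact(&mut hdr)?;
    let (mut b4, mut b8) = ([0u8; 4], [0u8; 8]);
    b4.copy_from_slice(&hdr[0..4]);
    if u32::from_le_bytes(b4) != MAGIC {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "bad AU magic"));
    }
    b4.copy_from_slice(&hdr[4..8]);
    let len = u32::from_le_bytes(b4) as usize;
    if len > MAX_AU_LEN {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "AU too large"));
    }
    b8.copy_from_slice(&hdr[8..16]);
    let mut data = vec![0u8; len];
    r.read_exact(&mut data)?;
    Ok(Au { pts_ns: u64::from_le_bytes(b8), keyframe: hdr[16] != 0, data })
}

/// Writer side of the same layout, used here only to round-trip the sketch.
fn encode_au(au: &Au) -> Vec<u8> {
    let mut out = Vec::with_capacity(17 + au.data.len());
    out.extend_from_slice(&MAGIC.to_le_bytes());
    out.extend_from_slice(&(au.data.len() as u32).to_le_bytes());
    out.extend_from_slice(&au.pts_ns.to_le_bytes());
    out.push(au.keyframe as u8);
    out.extend_from_slice(&au.data);
    out
}
```

Checking the magic and bounding `len` before allocating keeps a desynchronised pipe from turning into an out-of-memory condition.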
@@ -56,9 +59,15 @@ pub struct HelperRelay {
|
||||
rx: Receiver<RelayAu>,
|
||||
}
|
||||
|
||||
// HANDLEs are just kernel handle values; we own them for the relay's lifetime and close them on Drop.
|
||||
// SAFETY: every field is itself `Send`: the `proc`/`thread` `HANDLE`s are process-global kernel
|
||||
// handle values (plain integers valid from any thread, owned for the relay's lifetime and closed once
|
||||
// on Drop), `stdin_w` is a `Mutex<HANDLE>`, and `rx` is an mpsc `Receiver<RelayAu>` (which is `Send`).
|
||||
// The relay is moved to one thread and owned there, so transferring it across threads is sound.
|
||||
unsafe impl Send for HelperRelay {}
|
||||
unsafe impl Sync for HelperRelay {}
|
||||
// NOTE: `HelperRelay` is deliberately NOT `Sync`. Its `rx: Receiver<RelayAu>` is `!Sync` (std mpsc
|
||||
// is single-consumer), and the relay is only ever a single-owner local in the punktfunk1 two-process
|
||||
// mux loop — never shared by `&` across threads — so `Sync` is neither sound nor needed. (A prior
|
||||
// `unsafe impl Sync` here asserted more than the fields support; removed.)
|
||||
|
||||
/// Control byte on the helper's stdin: force the next encoded frame to be an IDR (client decode
|
||||
/// recovery). Mirrors `enc.request_keyframe()` in the single-process path.
|
||||
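The NOTE in this hunk rests on the distinction between `Send` (ownership may move to another thread) and `Sync` (a shared `&` may be used from several threads). A minimal sketch of the single-owner usage it describes follows, with a hypothetical `Relay` type that holds only a std `mpsc::Receiver`; the real `HelperRelay` has more fields.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical stand-in for the relay: it owns the consuming end of a std mpsc channel.
struct Relay {
    rx: Receiver<u32>,
}

/// Compile-time check: this call only type-checks if `T` is `Send`.
fn assert_send<T: Send>() {}

/// Builds a relay plus the sender that feeds it.
fn make_relay() -> (Sender<u32>, Relay) {
    let (tx, rx) = channel();
    (tx, Relay { rx })
}

/// Moves the relay to one consumer thread and drains it there.
fn drain_on_worker(relay: Relay) -> Vec<u32> {
    assert_send::<Relay>();
    // Ownership moves into the closure; no `&Relay` is shared between threads,
    // so `Sync` is never required for this usage.
    thread::spawn(move || relay.rx.iter().collect::<Vec<u32>>())
        .join()
        .expect("worker panicked")
}
```

Because `Receiver<T>` is `!Sync`, wrapping `Relay` in an `Arc` and using it from two threads would be rejected by the compiler, which is the property the removed `unsafe impl Sync` had overridden.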
@@ -84,6 +93,10 @@ impl HelperRelay {
|
||||
);
|
||||
tracing::info!(cmd = %cmdline, "spawning WGC helper in user session");
|
||||
|
||||
// SAFETY: `spawn_inner` is an `unsafe fn` only because it drives raw Win32 token/pipe/process
|
||||
// FFI; it imposes no caller-side memory precondition beyond valid arguments. `cmdline` is a live
|
||||
// `&str` borrowed for the synchronous call and `(w, h, hz)` are plain `u32`s. It validates its
|
||||
// own runtime requirements (active console session, SYSTEM token) and returns `Err` otherwise.
|
||||
unsafe { spawn_inner(&cmdline, w, h, hz) }
|
||||
}
|
||||
|
||||
@@ -108,6 +121,11 @@ impl HelperRelay {
|
||||
pub fn request_keyframe(&self) {
|
||||
let h = self.stdin_w.lock().unwrap();
|
||||
let mut written = 0u32;
|
||||
// SAFETY: `*h` is the host's write end of the helper's stdin pipe — a live `HANDLE` owned by
|
||||
// this `HelperRelay` (held under the `stdin_w` Mutex, locked here), closed only in Drop.
|
||||
// `WriteFile` reads the 1-byte `&[CTL_KEYFRAME]` buffer and writes the byte count into
|
||||
// `written`; both are live locals that outlive the synchronous call. A failure (helper gone) is
|
||||
// discarded as documented.
|
||||
unsafe {
|
||||
let _ = windows::Win32::Storage::FileSystem::WriteFile(
|
||||
*h,
|
||||
@@ -121,6 +139,10 @@ impl HelperRelay {
|
||||
|
||||
impl Drop for HelperRelay {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `self.proc`/`self.thread` are the child process/thread `HANDLE`s from
|
||||
// `CreateProcessAsUserW`, and `stdin_w` is the host's pipe write end — all owned by this
|
||||
// `HelperRelay` and closed exactly once here in Drop (no double-close). `TerminateProcess` and
|
||||
// the three `CloseHandle`s are FFI calls taking those handles by value, borrowing no Rust memory.
|
||||
unsafe {
|
||||
// Terminate the child first so its WGC capture + NVENC session tear down, then close our
|
||||
// handles (the reader threads end on the resulting broken pipe).
|
||||
@@ -364,10 +386,17 @@ fn au_reader(mut r: HandleReader, tx: SyncSender<RelayAu>) {
|
||||
|
||||
/// Minimal `Read` over a Win32 pipe HANDLE (the windows crate doesn't impl `Read` on HANDLE).
|
||||
struct HandleReader(HANDLE);
|
||||
// SAFETY: `HandleReader` owns a single pipe `HANDLE` (a process-global kernel handle value, valid from
|
||||
// any thread). It is moved into the dedicated reader thread and used only there (and closed once on
|
||||
// Drop), never shared — so transferring ownership across threads is sound.
|
||||
unsafe impl Send for HandleReader {}
|
||||
impl Read for HandleReader {
|
||||
fn read(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
|
||||
let mut read = 0u32;
|
||||
// SAFETY: `self.0` is the live read end of an anonymous pipe owned by this `HandleReader`
|
||||
// (closed only in Drop). `ReadFile` fills the caller-provided `buf` (writing at most `buf.len()`
|
||||
// bytes) and stores the count in `read`; both outlive the synchronous call. A broken pipe
|
||||
// surfaces as `Err` and is mapped to EOF below.
|
||||
let ok = unsafe {
|
||||
windows::Win32::Storage::FileSystem::ReadFile(self.0, Some(buf), Some(&mut read), None)
|
||||
};
|
||||
@@ -380,6 +409,8 @@ impl Read for HandleReader {
|
||||
}
|
||||
impl Drop for HandleReader {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `self.0` is the pipe `HANDLE` this `HandleReader` owns; `CloseHandle` (an FFI call
|
||||
// taking the handle by value) is invoked exactly once here in Drop, so there is no double-close.
|
||||
unsafe {
|
||||
let _ = CloseHandle(self.0);
|
||||
}
|
||||
@@ -391,6 +422,13 @@ impl Drop for HandleReader {
|
||||
pub fn running_as_system() -> bool {
|
||||
use windows::Win32::Security::{GetTokenInformation, TokenUser, TOKEN_QUERY, TOKEN_USER};
|
||||
use windows::Win32::System::Threading::{GetCurrentProcess, OpenProcessToken};
|
||||
// SAFETY: `OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &mut token)` opens the current-process
|
||||
// token (the pseudo-handle is always valid) into `token`, which is closed once before each return.
|
||||
// The first `GetTokenInformation` (null buffer) queries the required `len`; `buf` is then a
|
||||
// `Vec<u8>` of exactly `len` bytes and the second call fills it, so `&*(buf.as_ptr() as *const
|
||||
// TOKEN_USER)` reads a `TOKEN_USER` the kernel just wrote into a sufficiently-sized buffer (the
|
||||
// variable-length SID it points at also lies within `buf`, which outlives the borrow).
|
||||
// `is_local_system_sid` is this module's `unsafe fn`, given that in-buffer `PSID`. Safe on any thread.
|
||||
unsafe {
|
||||
let mut token = HANDLE::default();
|
||||
if OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &mut token).is_err() {
|
||||
|
||||
@@ -4,7 +4,7 @@
|
||||
//! environment before the host starts, and **for the knobs captured here the environment is constant for the
|
||||
//! process lifetime**, so a lazily-parsed global is equivalent to "parsed once at startup".
|
||||
//!
|
||||
//! **Goal-1 stages 1–2** (`docs/windows-host-rewrite.md` §2.2): stage 1 stood this up; stage 2 migrated the
|
||||
//! **Goal-1 stages 1–2** (`design/windows-host-rewrite.md` §2.2): stage 1 stood this up; stage 2 migrated the
|
||||
//! genuinely-constant operator/dispatch knobs onto it (the dispatch-disagreement bug class: `idd_push`,
|
||||
//! `capture_backend`, `encoder_pref`, `render_adapter`, `no_wgc`, the vdisplay backend select — plus the
|
||||
//! plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit` and the multi-site `perf`/`compositor`/
|
||||
|
||||
@@ -3,6 +3,9 @@
|
||||
//! RGB→YUV on the GPU, so no host-side CSC) and VAAPI on AMD/Intel (`*_vaapi`; the CPU-input
|
||||
//! fallback swscales RGB→NV12, the zero-copy path imports the capture dmabuf straight into a
|
||||
//! VA surface). One [`Encoder`] trait, selected in [`open_video`].
|
||||
// Every unsafe block in this module tree carries a `// SAFETY:` proof; enforce it (unsafe-proof
|
||||
// program). As a parent module this also covers the child modules (encode::windows/linux::*).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use crate::capture::{CapturedFrame, PixelFormat};
|
||||
use anyhow::Result;
|
||||
@@ -505,6 +508,14 @@ fn windows_gpu_vendor() -> Option<GpuVendor> {
|
||||
CreateDXGIFactory1, IDXGIFactory1, DXGI_ADAPTER_FLAG_SOFTWARE,
|
||||
};
|
||||
static CACHE: OnceLock<Option<GpuVendor>> = OnceLock::new();
|
||||
// SAFETY: `CreateDXGIFactory1` returns a fresh owned `IDXGIFactory1` COM object (refcounted by the
|
||||
// windows-rs wrapper, Released when the local drops); `.ok()?` bails on failure so `factory` is a
|
||||
// valid interface before any use. `EnumAdapters1(i)` hands back the i-th adapter as an owned
|
||||
// `IDXGIAdapter1` (or an error past the last adapter, which ends the loop). `GetDesc1()` returns the
|
||||
// `DXGI_ADAPTER_DESC1` by value (no out-pointer), so reading `desc.Flags`/`desc.VendorId` is plain
|
||||
// field access. Every call only touches COM objects this closure owns; the `OnceLock` runs the
|
||||
// closure once (no data race) and all interfaces are Released as the locals drop. No raw pointer is
|
||||
// dereferenced and nothing is aliased.
|
||||
*CACHE.get_or_init(|| unsafe {
|
||||
let factory: IDXGIFactory1 = CreateDXGIFactory1().ok()?;
|
||||
let mut i = 0u32;
|
||||
|
||||
@@ -8,6 +8,8 @@
|
||||
//! does *not* accept — we expand it to `rgb0` (one padding byte/pixel, no colour math).
|
||||
//! The encoder is opened *without* a global header so VPS/SPS/PPS are emitted in-band on
|
||||
//! every IDR — the output is both a playable raw Annex-B stream and self-contained AUs.
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::{Codec, EncodedFrame, Encoder};
|
||||
use crate::capture::{CapturedFrame, FramePayload, PixelFormat};
|
||||
@@ -79,6 +81,12 @@ impl CudaHw {
|
||||
|
||||
impl Drop for CudaHw {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `frames_ref`/`device_ref` are the two non-null `AVBufferRef`s `CudaHw::new` created
|
||||
// (it bails before returning `Self` if either alloc fails, so a live `CudaHw` always holds
|
||||
// both). `av_buffer_unref` drops one reference and nulls the pointer through the `&mut`. This
|
||||
// `Drop` runs exactly once and `CudaHw` owns these refs exclusively → no double-free /
|
||||
// use-after-free. Frames are unref'd before the device (the frames ctx internally refs the
|
||||
// device; refcounted, so the order is sound regardless).
|
||||
unsafe {
|
||||
ffi::av_buffer_unref(&mut self.frames_ref);
|
||||
ffi::av_buffer_unref(&mut self.device_ref);
|
||||
@@ -136,6 +144,13 @@ pub struct NvencEncoder {
|
||||
|
||||
// `CudaHw` holds raw `AVBufferRef`s; the encoder lives on a single thread. The CPU encoder is
|
||||
// already `Send` via ffmpeg-next; assert it for the CUDA fields too.
|
||||
// SAFETY: `NvencEncoder` owns an ffmpeg-next `Encoder`/`VideoFrame` (already `Send`) plus a `CudaHw`
|
||||
// holding raw `AVBufferRef`s, which are not `Send` by default. The encoder is owned and driven by
|
||||
// exactly ONE thread — the per-session encode thread it is moved to — and is only touched through
|
||||
// `&mut self` methods, so it is never aliased or accessed concurrently. The wrapped libav contexts
|
||||
// (and the shared `CUcontext` the `CudaHw` references) have no thread affinity, so transferring
|
||||
// ownership across threads is sound. This asserts `Send` (transfer) only, extending ffmpeg-next's
|
||||
// existing `Send` to the raw CUDA fields; `Sync` (shared `&`) is deliberately NOT implemented.
|
||||
unsafe impl Send for NvencEncoder {}
|
||||
|
||||
impl NvencEncoder {
|
||||
@@ -162,6 +177,9 @@ impl NvencEncoder {
|
||||
}
|
||||
ffmpeg::init().context("ffmpeg init")?;
|
||||
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
|
||||
// SAFETY: `av_log_set_level` sets libav's global integer log level; `48` (= AV_LOG_DEBUG)
|
||||
// is a valid level with no pointer args, and libav was just initialized by `ffmpeg::init()`
|
||||
// above — always sound.
|
||||
unsafe { ffi::av_log_set_level(48) }; // AV_LOG_DEBUG — surface NVENC hw-frame rejects
|
||||
}
|
||||
let name = codec.nvenc_name();
|
||||
@@ -195,6 +213,11 @@ impl NvencEncoder {
|
||||
.unwrap_or(1.0);
|
||||
let vbv_bits = ((bitrate_bps as f64 / fps.max(1) as f64) * vbv_frames as f64)
|
||||
.clamp(1.0, i32::MAX as f64);
|
||||
// SAFETY: `video` is the ffmpeg-next encoder builder wrapping a freshly-allocated
|
||||
// `AVCodecContext` that we hold by value and have not opened yet; `video.as_mut_ptr()` returns
|
||||
// that non-null, properly-aligned, exclusively-owned context. Writing the plain `rc_buffer_size`
|
||||
// int field before `open_with` is the supported way to set a field ffmpeg-next exposes no
|
||||
// setter for. Sole owner → no aliasing; synchronous in-bounds scalar write.
|
||||
unsafe {
|
||||
(*video.as_mut_ptr()).rc_buffer_size = vbv_bits as i32;
|
||||
}
|
||||
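The VBV sizing in this hunk is bits per frame times a frame count, clamped into the `i32` range before being written to `rc_buffer_size`. A small sketch of that arithmetic follows. The function name and parameter types are assumptions, since the hunk does not show the declarations.

```rust
/// Sketch of the VBV buffer size computation; parameter types are assumed.
fn vbv_bits(bitrate_bps: u64, fps: u32, vbv_frames: f64) -> i32 {
    // `fps.max(1)` guards against division by zero for a bogus frame rate.
    let bits_per_frame = bitrate_bps as f64 / fps.max(1) as f64;
    // Clamp into [1, i32::MAX] so the final cast can neither wrap nor yield zero.
    (bits_per_frame * vbv_frames).clamp(1.0, i32::MAX as f64) as i32
}
```

For example, 20 Mbit/s at 60 fps with a one-frame buffer gives 20,000,000 / 60, which truncates to 333,333 bits.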
@@ -204,6 +227,9 @@ impl NvencEncoder {
|
||||
// "freeze". NVENC emits one IDR at stream start, then P-frames only; `forced-idr` (below)
|
||||
// turns a client recovery request (RFI, via `request_keyframe`) into an IDR on demand.
|
||||
// This is the Moonlight/Sunshine low-latency model.
|
||||
// SAFETY: same `video` builder as above — a non-null, properly-aligned, sole-owned, not-yet-
|
||||
// opened `AVCodecContext`. We write the plain `gop_size` int field (= -1, infinite GOP) before
|
||||
// `open_with`, which ffmpeg-next has no setter for. No aliasing; synchronous scalar write.
|
||||
unsafe {
|
||||
(*video.as_mut_ptr()).gop_size = -1;
|
||||
}
|
||||
@@ -214,6 +240,10 @@ impl NvencEncoder {
|
||||
// RGB-input paths leave these unset (NVENC's internal CSC writes its own VUI). Matches the
|
||||
// Windows NV12 path's BT.709 limited-range signalling.
|
||||
if matches!(format, PixelFormat::Nv12) {
|
||||
// SAFETY: same `video` builder — `raw = video.as_mut_ptr()` is the non-null, properly-
|
||||
// aligned, sole-owned, not-yet-opened `AVCodecContext`. We set its four VUI colour enum
|
||||
// fields to valid `AVColorSpace`/`AVColorRange`/`AVColorPrimaries`/`AVColorTransfer-
|
||||
// Characteristic` variants before `open_with`. Sole owner → no aliasing; synchronous writes.
|
||||
unsafe {
|
||||
let raw = video.as_mut_ptr();
|
||||
(*raw).colorspace = ffi::AVColorSpace::AVCOL_SPC_BT709;
|
||||
@@ -228,7 +258,17 @@ impl NvencEncoder {
|
||||
// *before* open (NVENC derives the device from `hw_frames_ctx`).
|
||||
let cuda_hw = if cuda {
|
||||
let cu_ctx = crate::zerocopy::cuda::context().context("shared CUDA context")?;
|
||||
// SAFETY: `CudaHw::new` (an `unsafe fn`) requires libav initialized (the `ffmpeg::init()`
|
||||
// above ran) and a valid `CUcontext`; `cu_ctx` is the shared importer context from
|
||||
// `zerocopy::cuda::context()?`, non-null on the `Ok` path. `nvenc_pixel` is a valid `Pixel`
|
||||
// and `width`/`height` are the validated positive dims. It returns a RAII `CudaHw` wrapping
|
||||
// (not owning) `cu_ctx` and owning two `AVBufferRef`s freed on drop.
|
||||
let hw = unsafe { CudaHw::new(cu_ctx, nvenc_pixel, width, height)? };
|
||||
// SAFETY: `raw = video.as_mut_ptr()` is the non-null, sole-owned, not-yet-opened
|
||||
// `AVCodecContext`. We set `pix_fmt = CUDA` and attach NEW refs (`av_buffer_ref`) of
|
||||
// `hw.device_ref`/`hw.frames_ref` — both non-null (`CudaHw::new` guarantees) and from the
|
||||
// live `hw`, which is moved into `NvencEncoder.cuda` next to `enc` and so outlives the
|
||||
// encoder. The context owns its own refs (freed when the context closes). No aliasing.
|
||||
unsafe {
|
||||
let raw = video.as_mut_ptr();
|
||||
(*raw).pix_fmt = ffi::AVPixelFormat::AV_PIX_FMT_CUDA;
|
||||
@@ -428,6 +468,19 @@ impl NvencEncoder {
|
||||
// The device→device copy below uses our shared context directly; make it current on the
|
||||
// encode thread (ffmpeg pushes its own around the pool alloc, so order is fine).
|
||||
crate::zerocopy::cuda::make_current().context("CUDA context current (encode thread)")?;
|
||||
// SAFETY: `frames_ref` is the non-null CUDA frames ctx from `self.cuda` (unwrapped via
|
||||
// `.context(..)?` above), and the shared CUDA context was just made current on THIS thread
|
||||
// (`make_current()?`), the precondition for the device-pointer copies below.
|
||||
// * `av_frame_alloc` → `f` (null-checked). `av_hwframe_get_buffer(frames_ref, f, 0)` fills `f`
|
||||
// with a pooled CUDA surface (sets `data[]`/`linesize[]`/`buf[0]`/`hw_frames_ctx`); on
|
||||
// failure we free `f` and bail.
|
||||
// * For NV12 we read `(*f).data[0..2]` / `linesize[0..2]` (Y + interleaved UV), else
|
||||
// `data[0]`/`linesize[0]` — in-struct fields of the non-null `f`, valid for the surface dims
|
||||
// ffmpeg allocated — and pass them to the cuda copy helpers, which device→device copy `buf`
|
||||
// (the imported `DeviceBuffer`, owned by the caller and live for this call) into the surface.
|
||||
// * On copy error we free `f` and return. Otherwise we write `pts`/`pict_type` through `f` and
|
||||
// `avcodec_send_frame` it into the live owned `self.enc` context (which takes its own ref of
|
||||
// the pooled surface), then free our `f` ref exactly once. Single-threaded encoder → no race.
|
||||
unsafe {
|
||||
let mut f = ffi::av_frame_alloc();
|
||||
if f.is_null() {
|
||||
|
||||
@@ -19,6 +19,8 @@
|
||||
//! hwdevice/hwframes/buffersrc/buffersink calls go through `ffmpeg::ffi` (= `ffmpeg_sys_next`),
|
||||
//! as the CUDA encode path and the clients' decode paths already do. The encoder is opened
|
||||
//! *without* a global header, so VPS/SPS/PPS are in-band on every IDR.
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::{Codec, EncodedFrame, Encoder};
|
||||
use crate::capture::{CapturedFrame, DmabufFrame, FramePayload, PixelFormat};
|
||||
@@ -133,6 +135,14 @@ pub fn probe_can_encode(codec: Codec) -> bool {
|
||||
if ffmpeg::init().is_err() {
|
||||
return false;
|
||||
}
|
||||
// SAFETY: `ffmpeg::init()` returned Ok above, so libav is initialized. `av_log_get_level`/
|
||||
// `av_log_set_level` only read/write libav's global integer log level (no pointer args) and are
|
||||
// always sound to call post-init. `VaapiHw::new` (an `unsafe fn`) builds a VAAPI device + NV12
|
||||
// frames pool from the literal NV12/640x480/pool=2 args and hands back a RAII handle that unrefs
|
||||
// both `AVBufferRef`s on drop. `open_vaapi_encoder` (an `unsafe fn`) borrows `hw.device_ref`/
|
||||
// `hw.frames_ref` — the two non-null refs `VaapiHw::new` just created — and `av_buffer_ref`s them
|
||||
// into the encoder; `hw` is a live local for the whole match arm, so the borrows outlive the
|
||||
// synchronous call, and both `hw` and the probe encoder are dropped (RAII) when the arm ends.
|
||||
unsafe {
|
||||
// A missing VA device (non-VAAPI host, GPU-less CI) is an expected probe outcome — quiet
|
||||
// ffmpeg's "No VA display found" error for the probe, then restore the level.
|
||||
@@ -224,6 +234,12 @@ impl VaapiHw {
|
||||
|
||||
impl Drop for VaapiHw {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `frames_ref`/`device_ref` are the two non-null `AVBufferRef`s `VaapiHw::new`
|
||||
// created (it bails before constructing `Self` if either alloc fails, so a live `VaapiHw`
|
||||
// always holds both). `av_buffer_unref` drops one reference and nulls the pointer through the
|
||||
// `&mut`. This `Drop` runs exactly once and `VaapiHw` owns these refs exclusively, so there
|
||||
// is no double-free / use-after-free. Frames are unref'd before the device because the frames
|
||||
// ctx internally holds a ref on the device (refcounted, so the order is sound either way).
|
||||
unsafe {
|
||||
ffi::av_buffer_unref(&mut self.frames_ref);
|
||||
ffi::av_buffer_unref(&mut self.device_ref);
|
||||
@@ -252,7 +268,16 @@ impl CpuInner {
|
||||
) -> Result<Self> {
|
||||
let src_pixel = vaapi_sws_src(format)?;
|
||||
const POOL: c_int = 16;
|
||||
// SAFETY: `VaapiHw::new` (an `unsafe fn`) requires libav initialized — guaranteed because the
|
||||
// only path here is `VaapiEncoder::open` → `ensure_inner` → `CpuInner::open`, and `open` ran
|
||||
// `ffmpeg::init()`. The args are valid: NV12 sw_format, the validated positive `width`/`height`,
|
||||
// pool=16. It returns a RAII `VaapiHw` that unrefs its two `AVBufferRef`s on drop.
|
||||
let hw = unsafe { VaapiHw::new(ffi::AVPixelFormat::AV_PIX_FMT_NV12, width, height, POOL)? };
|
||||
// SAFETY: `open_vaapi_encoder` (an `unsafe fn`) borrows `hw.device_ref`/`hw.frames_ref` — both
|
||||
// non-null (`VaapiHw::new` guarantees it) and from the `hw` just built above, which is a live
|
||||
// local that outlives this synchronous call. The fn `av_buffer_ref`s them into the encoder, so
|
||||
// the encoder holds its own references; `hw` is also moved into the returned `CpuInner` next to
|
||||
// `enc`, keeping the device/frames alive for the encoder's whole lifetime.
|
||||
let enc = unsafe {
|
||||
open_vaapi_encoder(
|
||||
codec,
|
||||
@@ -266,6 +291,12 @@ impl CpuInner {
|
||||
};
|
||||
// swscale RGB→NV12, BT.709 limited (matches the VUI), no rescale.
|
||||
let src_av = pixel_to_av(src_pixel);
|
||||
// SAFETY: `sws_getContext` allocates a swscale context for the given src/dst dimensions and
|
||||
// pixel formats. All four dims are the encoder's positive `width`/`height` cast to `c_int`;
|
||||
// `src_av` is a valid `AVPixelFormat` (from `pixel_to_av` of the `vaapi_sws_src`-validated
|
||||
// `src_pixel`), the dst is NV12. The three trailing pointers (srcFilter, dstFilter, param) are
|
||||
// explicitly null = "use defaults", which the API documents as accepted. No Rust memory is
|
||||
// borrowed — only by-value ints/enums — and the returned pointer is null-checked just below.
|
||||
let sws = unsafe {
|
||||
ffi::sws_getContext(
|
||||
width as c_int,
|
||||
@@ -283,10 +314,23 @@ impl CpuInner {
|
||||
if sws.is_null() {
|
||||
bail!("sws_getContext(RGB→NV12) failed");
|
||||
}
|
||||
// SAFETY: `sws` is the non-null `SwsContext` from `sws_getContext` above (the `is_null()`
|
||||
// check immediately preceding returned false). `sws_getCoefficients(SWS_CS_ITU709)` returns a
|
||||
// pointer into a libswscale static const coefficient table valid for the whole process, reused
|
||||
// here for both the inverse (src) and forward (dst) matrices. `sws_setColorspaceDetails` only
|
||||
// reads those tables and writes scalar CSC settings into `sws`; the table pointer outlives the
|
||||
// synchronous call and no Rust memory is passed.
|
||||
unsafe {
|
||||
let cs709 = ffi::sws_getCoefficients(SWS_CS_ITU709);
|
||||
ffi::sws_setColorspaceDetails(sws, cs709, 1, cs709, 0, 0, 1 << 16, 1 << 16);
|
||||
}
|
||||
// SAFETY: `av_frame_alloc` returns a fresh, uniquely-owned heap `AVFrame` (null-checked — on
|
||||
// null we free the already-built `sws` and bail). We then write the plain `format`/`width`/
|
||||
// `height` fields through the non-null, properly-aligned `f` (sole owner, not yet shared).
|
||||
// `av_frame_get_buffer(f, 0)` allocates backing storage for those dims/format; on failure we
|
||||
// free `f` and `sws` (unwinding the half-built state) and bail. On success `f` is a fully-owned
|
||||
// NV12 frame stored in `CpuInner.nv12` and freed once in `CpuInner::drop`. `f` is a unique
|
||||
// fresh pointer, so none of these writes alias anything.
|
||||
let nv12 = unsafe {
|
||||
let f = ffi::av_frame_alloc();
|
||||
if f.is_null() {
|
||||
@@ -329,6 +373,18 @@ impl CpuInner {
|
||||
let h = self.height as usize;
|
||||
let src_row = w * self.src_format.bytes_per_pixel();
|
||||
anyhow::ensure!(bytes.len() >= src_row * h, "captured buffer too small");
|
||||
// SAFETY: The `ensure!`s above guarantee `format == self.src_format` and
|
||||
// `bytes.len() >= src_row * h`. `sws_scale` reads `h` rows of `src_row` bytes from
|
||||
// `src_data[0] = bytes.as_ptr()` (the other planes null/0 — packed RGB is single-plane), all
|
||||
// in bounds; `bytes`, `src_data`, `src_stride` are live locals for this synchronous call.
|
||||
// `self.sws` is the non-null context built in `open`; it writes into `self.nv12` (a non-null
|
||||
// owned frame whose `data`/`linesize` in-struct arrays were sized by `av_frame_get_buffer`).
|
||||
// `av_frame_alloc` (null-checked) yields a fresh `hwf`; `av_hwframe_get_buffer` pulls a pooled
|
||||
// VAAPI surface from the live non-null `self.hw.frames_ref`; `av_hwframe_transfer_data` uploads
|
||||
// the staged NV12 into it — both frames live, failures free `hwf` and bail. We then write
|
||||
// `pts`/`pict_type` through the non-null `hwf` and `avcodec_send_frame` it into the live
|
||||
// owned `self.enc` context (which takes its own ref), then free our `hwf` ref exactly once.
|
||||
// The encoder runs only on this thread (see `unsafe impl Send`), so no aliasing/data race.
|
||||
unsafe {
|
||||
let src_data: [*const u8; 4] = [bytes.as_ptr(), ptr::null(), ptr::null(), ptr::null()];
|
||||
let src_stride: [c_int; 4] = [src_row as c_int, 0, 0, 0];
|
||||
@@ -374,6 +430,12 @@ impl CpuInner {
|
||||
|
||||
impl Drop for CpuInner {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `self.nv12` (an owned `AVFrame`) and `self.sws` (an owned `SwsContext`) are each
|
||||
// freed exactly once here, guarded by `is_null()` so a never-set pointer is skipped (no double
|
||||
// free). `CpuInner` owns both exclusively and `Drop` runs once. `av_frame_free` takes `&mut`
|
||||
// and nulls the pointer. `self.enc`/`self.hw` are freed afterward by their own `Drop` impls;
|
||||
// the encoder holds its own `av_buffer_ref`'d device/frames copies, so field-drop order is
|
||||
// irrelevant to soundness.
|
||||
unsafe {
|
||||
if !self.nv12.is_null() {
|
||||
ffi::av_frame_free(&mut self.nv12);
|
||||
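Reviewer sketch (not part of the diff): the "guarded by `is_null()`, freed through `&mut`, pointer nulled" argument can be modelled without libav. `Frame`, `frame_free` and `Owner` below are toy stand-ins, assumed only for illustration; `frame_free` mimics the shape of an `av_frame_free`-style call, not its implementation.

```rust
use std::ptr;

struct Frame {
    _planes: Vec<u8>,
}

/// Toy free function: releases `*slot` if it is non-null and writes null back
/// through the `&mut`, so a second call is a no-op. Returns whether it freed.
fn frame_free(slot: &mut *mut Frame) -> bool {
    if (*slot).is_null() {
        return false;
    }
    // SAFETY: a non-null `*slot` was produced by `Box::into_raw` in `Owner::new`
    // and has not been freed yet, because every free nulls the slot.
    unsafe { drop(Box::from_raw(*slot)) };
    *slot = ptr::null_mut();
    true
}

struct Owner {
    frame: *mut Frame,
}

impl Owner {
    fn new() -> Self {
        let frame = Box::into_raw(Box::new(Frame { _planes: vec![0; 16] }));
        Owner { frame }
    }
}

impl Drop for Owner {
    fn drop(&mut self) {
        // Safe to call even if the frame was already released earlier.
        frame_free(&mut self.frame);
    }
}
```

Because the slot is nulled on free, an explicit early release followed by `Drop` does not double-free.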
@@ -417,6 +479,31 @@ impl DmabufInner {
|
||||
let drm_fourcc = crate::zerocopy::drm_fourcc(format)
|
||||
.ok_or_else(|| anyhow!("no DRM fourcc for {format:?} (VAAPI zero-copy)"))?;
|
||||
let node = render_node();
|
||||
// SAFETY: libav is initialized (`VaapiEncoder::open` ran `ffmpeg::init()` before
|
||||
// `ensure_inner` → `DmabufInner::open`). Every raw pointer dereferenced below is either freshly
|
||||
// allocated by the immediately-preceding ffmpeg call and null-checked, or an in-struct field of
|
||||
// such an object:
|
||||
// * `node` is a `CString` (from `render_node`) live for the whole block; its `.as_ptr()` is a
|
||||
// NUL-terminated path read only during `av_hwdevice_ctx_create`.
|
||||
// * `av_hwdevice_ctx_create(&mut drm_device, DRM, …)` / `…_create_derived(&mut vaapi_device,
|
||||
// VAAPI, drm_device, …)`: on `r < 0` the out-param stays null and we bail (the derive path
|
||||
// unrefs `drm_device` first); on success each is a non-null owned `AVBufferRef`.
|
||||
// * `av_hwframe_ctx_alloc(drm_device)` → `drm_frames` (null-checked); `(*drm_frames).data` is
|
||||
// its `AVHWFramesContext` payload, written before `av_hwframe_ctx_init`.
|
||||
// * `avfilter_graph_alloc` → `graph` (null-checked); `avfilter_get_by_name` returns a static
|
||||
// const `AVFilter` (process-lifetime) or null; `avfilter_graph_alloc_filter` allocates each
|
||||
// filter ctx inside `graph`; the four are null-checked together. `inst`/arg strings are
|
||||
// 'static C literals.
|
||||
// * `(*hwmap/scale).hw_device_ctx = av_buffer_ref(vaapi_device)` attaches a NEW ref owned by
|
||||
// the filter (freed by `avfilter_graph_free`); our `vaapi_device` ref is untouched.
|
||||
// * `av_buffersink_get_hw_frames_ctx(sink)` → `nv12_ctx` is a borrowed ref owned by the sink,
|
||||
// valid while `graph` lives (and `graph` is moved into the returned `DmabufInner`).
|
||||
// * `open_vaapi_encoder` borrows `vaapi_device` (our live owned ref) and `nv12_ctx` (sink's
|
||||
// live ref) and `av_buffer_ref`s both into the encoder.
|
||||
// Every early-error path unrefs the allocated buffers and frees the graph in the right order
|
||||
// before bailing; on success the four `AVBufferRef`s + `graph` + `src`/`sink` are moved into
|
||||
// `DmabufInner` and freed in its `Drop`. (Two former non-UB leaks, the extra `av_buffersrc_*`
|
||||
// ref and the final `?`, are fixed below.)
|
||||
unsafe {
|
||||
// DRM device (source dmabuf frames) + a VAAPI device derived from it (same GPU) for
|
||||
// hwmap/scale_vaapi/the encoder.
|
||||
@@ -509,7 +596,12 @@ impl DmabufInner {
|
||||
num: 1,
|
||||
den: fps as c_int,
|
||||
};
|
||||
-(*par).hw_frames_ctx = ffi::av_buffer_ref(drm_frames);
|
||||
// Assign `drm_frames` BORROWED (no extra ref): `av_buffersrc_parameters_set` takes its
|
||||
// own ref of `par->hw_frames_ctx` (via av_buffer_replace), and `av_free(par)` frees only
|
||||
// the struct, not the ref. Our single owned `drm_frames` ref is retained, lives in
|
||||
// `DmabufInner`, and is unref'd in `Drop`. Wrapping it in `av_buffer_ref` here would leak
|
||||
// that extra ref every session (the persistent listener would accumulate them).
|
||||
(*par).hw_frames_ctx = drm_frames;
|
||||
let r = ffi::av_buffersrc_parameters_set(src, par);
|
||||
ffi::av_free(par as *mut _);
|
||||
if r < 0 {
|
||||
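Reviewer sketch (not part of the diff): the reference-count reasoning in the comment above can be modelled with `Rc`. This is an analogy under stated assumptions, not libav code: `parameters_set` stands in for a callee that takes its own reference, and the parameter struct is assumed to be freed without releasing the pointer it holds.

```rust
use std::rc::Rc;

/// Models a callee that takes its OWN reference to whatever the parameter
/// struct points at and leaves the caller's pointer alone.
fn parameters_set(par_ctx: *const u32) -> Rc<u32> {
    // SAFETY: `par_ctx` comes from `Rc::as_ptr`/`Rc::into_raw` of a live `Rc<u32>`.
    unsafe {
        Rc::increment_strong_count(par_ctx);
        Rc::from_raw(par_ctx)
    }
}

/// Returns the strong count left on the context after the filter is torn down.
fn strong_count_after_session(wrap_in_extra_ref: bool) -> usize {
    let frames = Rc::new(0u32); // our single owned ref (the `drm_frames` role)
    let par_ctx: *const u32 = if wrap_in_extra_ref {
        Rc::into_raw(Rc::clone(&frames)) // extra ref that nobody ever releases
    } else {
        Rc::as_ptr(&frames) // borrowed: the count is unchanged
    };
    let filter_ref = parameters_set(par_ctx);
    // Freeing the parameter struct releases only the struct, never `par_ctx`.
    drop(filter_ref); // graph teardown releases the filter's own ref
    Rc::strong_count(&frames)
}
```

With the borrowed assignment the count returns to 1; with the extra wrap it stays at 2, which is the per-session leak the comment describes.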
@@ -564,7 +656,12 @@ impl DmabufInner {
|
||||
ffi::av_buffer_unref(&mut drm_device);
|
||||
bail!("filter sink has no VAAPI frames context");
|
||||
}
|
||||
-let enc = open_vaapi_encoder(
|
||||
// On encoder-open failure, free the graph + our owned buffer refs before bailing (matching
|
||||
// every error path above) so a failed session doesn't leak them. `nv12_ctx` is borrowed
|
||||
// from the sink (owned by `graph`), so `avfilter_graph_free` reclaims it — don't unref it
|
||||
// separately. On success the encoder takes its own ref of `vaapi_device`, and `drm_frames`/
|
||||
// `vaapi_device`/`drm_device`/`graph` move into `DmabufInner` (freed in `Drop`).
|
||||
let enc = match open_vaapi_encoder(
|
||||
codec,
|
||||
width,
|
||||
height,
|
||||
@@ -572,7 +669,16 @@ impl DmabufInner {
|
||||
bitrate_bps,
|
||||
vaapi_device,
|
||||
nv12_ctx,
|
||||
-)?;
|
||||
) {
|
||||
Ok(enc) => enc,
|
||||
Err(e) => {
|
||||
ffi::avfilter_graph_free(&mut graph);
|
||||
ffi::av_buffer_unref(&mut drm_frames);
|
||||
ffi::av_buffer_unref(&mut vaapi_device);
|
||||
ffi::av_buffer_unref(&mut drm_device);
|
||||
return Err(e);
|
||||
}
|
||||
};
|
||||
|
||||
tracing::info!(
|
||||
encoder = codec.vaapi_name(),
|
||||
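Reviewer sketch (not part of the diff): the explicit `match` repeats the teardown already written on the earlier error paths. One alternative, shown here only as an assumption about how it could be structured, is a scope guard that runs the cleanup on every early return and is disarmed on success. `Cleanup` and `open` are hypothetical names.

```rust
/// Hypothetical scope guard: runs `f` on drop unless `disarm` was called.
struct Cleanup<F: FnMut()> {
    f: F,
    armed: bool,
}

impl<F: FnMut()> Cleanup<F> {
    fn new(f: F) -> Self {
        Cleanup { f, armed: true }
    }
    /// Call on the success path, once ownership has moved elsewhere.
    fn disarm(mut self) {
        self.armed = false;
    }
}

impl<F: FnMut()> Drop for Cleanup<F> {
    fn drop(&mut self) {
        if self.armed {
            (self.f)();
        }
    }
}

/// Stand-in for `open`: `fail` simulates the encoder failing to open and
/// `freed` counts how many times the cleanup ran.
fn open(fail: bool, freed: &std::cell::Cell<u32>) -> Result<&'static str, String> {
    let guard = Cleanup::new(|| freed.set(freed.get() + 1));
    if fail {
        return Err("encoder open failed".to_string()); // guard drops here and cleans up
    }
    guard.disarm(); // success: resources now belong to the returned value
    Ok("encoder")
}
```

The trade-off is that the free order lives in one closure instead of being visible at each bail site, which may or may not suit code whose SAFETY comments argue about ordering.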
@@ -600,6 +706,23 @@ impl DmabufInner {
|
||||
dmabuf.fourcc,
|
||||
self.fourcc
|
||||
);
|
||||
// SAFETY: The `ensure!` above checked `dmabuf.fourcc == self.fourcc`.
|
||||
// * `std::mem::zeroed::<AVDRMFrameDescriptor>()` is sound: it is a `#[repr(C)]` POD of ints and
|
||||
// nested int-struct arrays (no `NonNull`/refs), for which all-zero is a valid bit pattern;
|
||||
// `Box` puts it on the heap with a unique owner.
|
||||
// * `dmabuf.fd.as_raw_fd()` is the fd of the caller's `&DmabufFrame`, which owns it for the
|
||||
// whole synchronous `submit`; we describe one object/layer/plane from its
|
||||
// fourcc/modifier/offset/stride and pass `object.size = 0` (ffmpeg queries the real size).
|
||||
// * `av_frame_alloc` → `drm` (null-checked); we set its scalar fields and
|
||||
// `hw_frames_ctx = av_buffer_ref(self.drm_frames)` (new ref of the live owned ctx).
|
||||
// * `data[0] = Box::into_raw(desc)` transfers the box into the frame; `buf[0] =
|
||||
// av_buffer_create(.., free_desc, ..)` registers a destructor that reclaims it exactly once
|
||||
// when the buffer's refcount hits zero — matched alloc/free, no leak/double-free.
|
||||
// * `av_buffersrc_add_frame_flags(self.src, drm, KEEP_REF)` pushes a ref into the live
|
||||
// buffersrc; KEEP_REF keeps our own `drm` ref, which we then `av_frame_free`. We pull the
|
||||
// converted surface with `av_buffersink_get_frame(self.sink, nv12)` BEFORE returning, so the
|
||||
// dmabuf (owned by the caller) is read while still valid. `nv12` is sent into the live owned
|
||||
// `self.enc` (takes its own ref) and our ref freed once. Single-threaded encoder → no race.
|
||||
unsafe {
|
||||
// Build a DRM-PRIME AVFrame describing the dmabuf (one object/fd, one layer/plane).
|
||||
let mut desc: Box<ffi::AVDRMFrameDescriptor> = Box::new(std::mem::zeroed());
|
||||
@@ -626,6 +749,11 @@ impl DmabufInner {
|
||||
// Own the descriptor so it frees with the frame (the fd is owned by the DmabufFrame,
|
||||
// which outlives this call — the graph reads the surface before submit returns).
|
||||
extern "C" fn free_desc(_opaque: *mut std::ffi::c_void, data: *mut u8) {
|
||||
// SAFETY: `data` is exactly the pointer produced by `Box::into_raw(desc)` and passed as
|
||||
// `av_buffer_create`'s first arg, which libav hands back verbatim to this callback. It
|
||||
// is a valid, uniquely-owned `Box<AVDRMFrameDescriptor>` raw pointer; libav invokes the
|
||||
// callback exactly once (when the last buffer ref drops), so `from_raw` + `drop`
|
||||
// reclaims it exactly once — no double-free. `_opaque` is unused (we passed null).
|
||||
unsafe { drop(Box::from_raw(data as *mut ffi::AVDRMFrameDescriptor)) };
|
||||
}
|
||||
(*drm).buf[0] = ffi::av_buffer_create(
|
||||
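Reviewer sketch (not part of the diff): the `Box::into_raw` hand-off paired with a C-ABI destructor can be exercised in isolation. `ToyBuffer` is a toy refcounted buffer invented for this example (it is NOT libav's `AVBufferRef`), and `Descriptor` is a stand-in for the real descriptor type.

```rust
use std::ffi::c_void;
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

#[repr(C)]
struct Descriptor {
    nb_objects: i32,
    fd: i32,
}

impl Drop for Descriptor {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

type FreeFn = extern "C" fn(opaque: *mut c_void, data: *mut u8);

/// Toy C-style refcounted buffer: calls `free` when the last ref is released.
struct ToyBuffer {
    data: *mut u8,
    free: FreeFn,
    refs: usize,
}

impl ToyBuffer {
    fn unref(&mut self) {
        self.refs -= 1;
        if self.refs == 0 {
            (self.free)(std::ptr::null_mut(), self.data);
        }
    }
}

extern "C" fn free_desc(_opaque: *mut c_void, data: *mut u8) {
    // SAFETY: `data` is the pointer from `Box::into_raw` in `demo`, handed back once.
    unsafe { drop(Box::from_raw(data as *mut Descriptor)) };
}

/// Returns how many descriptors were dropped after the first and second unref.
fn demo() -> (usize, usize) {
    let before = DROPS.load(Ordering::SeqCst);
    let desc = Box::new(Descriptor { nb_objects: 1, fd: -1 });
    let mut buf = ToyBuffer {
        data: Box::into_raw(desc) as *mut u8,
        free: free_desc,
        refs: 2, // e.g. our ref plus one taken by a consumer
    };
    buf.unref();
    let after_first = DROPS.load(Ordering::SeqCst) - before;
    buf.unref();
    (after_first, DROPS.load(Ordering::SeqCst) - before)
}
```

The descriptor survives the first release and is reclaimed exactly once when the count reaches zero.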
@@ -673,6 +801,13 @@ impl DmabufInner {
|
||||
|
||||
impl Drop for DmabufInner {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `graph`/`drm_frames`/`vaapi_device`/`drm_device` are the non-null objects
|
||||
// `DmabufInner::open` built and moved into `self` (open bails before constructing `Self` if any
|
||||
// alloc fails). `avfilter_graph_free` frees the graph (and the per-filter device refs it owns);
|
||||
// each `av_buffer_unref` drops one ref and nulls the pointer via `&mut`. `DmabufInner` owns all
|
||||
// four exclusively and `Drop` runs once → no double-free/use-after-free. The graph is freed
|
||||
// first (it holds refs on the devices), then frames, then the derived VAAPI device, then DRM.
|
||||
// (`self.enc` drops via ffmpeg-next afterward, holding its own refs.)
|
||||
unsafe {
|
||||
ffi::avfilter_graph_free(&mut self.graph);
|
||||
ffi::av_buffer_unref(&mut self.drm_frames);
|
||||
@@ -703,6 +838,13 @@ pub struct VaapiEncoder {
|
||||
}
|
||||
|
||||
// Raw FFI pointers; the encoder lives on a single thread (same contract as `NvencEncoder`).
|
||||
// SAFETY: `VaapiEncoder`'s `Inner` holds raw FFI pointers (`SwsContext`, `AVFrame`, `AVBufferRef`,
|
||||
// `AVFilterContext`, `AVCodecContext`) that are not `Send` by default. The encoder is owned and
|
||||
// driven by exactly ONE thread — the host's per-session encode thread it is moved (transferred) to —
|
||||
// and is only ever touched through `&mut self` methods, so it is never aliased or accessed
|
||||
// concurrently from two threads. None of the underlying libav/libswscale objects have thread
|
||||
// affinity (they are not thread-local), so transferring ownership across threads is sound. This
|
||||
// asserts `Send` (transfer) only; `Sync` (shared `&`) is deliberately NOT implemented.
|
||||
unsafe impl Send for VaapiEncoder {}
|
||||
|
||||
impl VaapiEncoder {
|
||||
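Reviewer sketch (not part of the diff): the "`Send` but deliberately not `Sync`" contract can be shown with a toy type. `Handle` is hypothetical and holds a plain heap pointer rather than any libav object; the point is only the ownership transfer to a single worker thread.

```rust
use std::thread;

/// Toy handle wrapping a raw pointer; raw pointers make it `!Send` by default.
struct Handle {
    state: *mut u64,
}

// SAFETY (for this toy only): `state` is a uniquely owned heap allocation with
// no thread affinity, and it is only reached through `&mut self` or `Drop`.
// `Sync` is deliberately not implemented.
unsafe impl Send for Handle {}

impl Handle {
    fn new() -> Self {
        Handle { state: Box::into_raw(Box::new(0)) }
    }
    fn bump(&mut self) -> u64 {
        // SAFETY: `state` is live and uniquely owned (see above).
        unsafe {
            *self.state += 1;
            *self.state
        }
    }
}

impl Drop for Handle {
    fn drop(&mut self) {
        // SAFETY: allocated by `Box::into_raw` in `new`, freed exactly once here.
        unsafe { drop(Box::from_raw(self.state)) };
    }
}

/// Moves the handle to a worker thread, which then owns and drives it alone.
fn run_on_worker(mut h: Handle, frames: u32) -> u64 {
    thread::spawn(move || {
        let mut last = 0;
        for _ in 0..frames {
            last = h.bump();
        }
        last
    })
    .join()
    .expect("worker panicked")
}
```

Without the `unsafe impl Send` the `thread::spawn` call does not compile, which is the check the SAFETY comment is justifying.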
@@ -720,6 +862,9 @@ impl VaapiEncoder {
|
||||
}
|
||||
ffmpeg::init().context("ffmpeg init")?;
|
||||
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
|
||||
// SAFETY: `av_log_set_level` sets libav's global integer log level; `48` (= AV_LOG_DEBUG)
|
||||
// is a valid level and there are no pointer args. libav was just initialized by the
|
||||
// `ffmpeg::init()` above, so the call is always sound.
|
||||
unsafe { ffi::av_log_set_level(48) };
|
||||
}
|
||||
// Validate the codec/format up front so a bad request fails at open, not on the first frame.
|
||||
|
||||
@@ -28,6 +28,8 @@
|
||||
//! through `ffmpeg::ffi` (= `ffmpeg_sys_next`), exactly as the Linux CUDA/VAAPI paths do. The
|
||||
//! `AVD3D11VADeviceContext`/`AVD3D11VAFramesContext` layouts are mirrored (the bindings don't
|
||||
//! allowlist `hwcontext_d3d11va.h`), as [`super::linux`] mirrors `AVCUDADeviceContext`.
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::{Codec, EncodedFrame, Encoder};
|
||||
use crate::capture::{dxgi::D3d11Frame, CapturedFrame, FramePayload, PixelFormat};
|
||||
@@ -243,6 +245,12 @@ pub fn probe_can_encode(vendor: WinVendor, codec: Codec) -> bool {
|
||||
if ffmpeg::init().is_err() {
|
||||
return false;
|
||||
}
|
||||
// SAFETY: `ffmpeg::init()` succeeded above, so libav's global state is initialised.
|
||||
// `av_log_get_level`/`av_log_set_level` are global scalar getters/setters with no pointer args.
|
||||
// `open_win_encoder` (the `unsafe fn`) is called with null `device_ref`/`frames_ref` (the system
|
||||
// path), so it touches no D3D11/hwcontext — it only allocates and opens a self-contained
|
||||
// libavcodec encoder that is dropped at the end of `.is_ok()`. We restore the prior log level and
|
||||
// no raw pointer escapes the block.
|
||||
unsafe {
|
||||
// A missing AMF/QSV runtime (wrong-vendor host, GPU-less CI) is an expected probe outcome —
|
||||
// quiet ffmpeg's open error for the probe, then restore the level.
|
||||
@@ -337,6 +345,10 @@ impl SystemInner {
|
||||
} else {
|
||||
ffi::AVPixelFormat::AV_PIX_FMT_NV12
|
||||
};
|
||||
// SAFETY: calls the `unsafe fn open_win_encoder` with null `device_ref`/`frames_ref`, so the
|
||||
// system path is taken (no hw device/frames context is touched); all other args are scalars.
|
||||
// The returned `encoder::video::Encoder` owns its `AVCodecContext` and frees it on drop; no raw
|
||||
// pointer is aliased.
|
||||
let enc = unsafe {
|
||||
open_win_encoder(
|
||||
vendor,
|
||||
@@ -352,6 +364,11 @@ impl SystemInner {
|
||||
ptr::null_mut(),
|
||||
)?
|
||||
};
|
||||
// SAFETY: `av_frame_alloc` returns a freshly-allocated, uniquely-owned `AVFrame` (null-checked
|
||||
// before any deref); writing `format`/`width`/`height` through `*f` stays inside that
|
||||
// allocation. `av_frame_get_buffer(f, 0)` allocates the backing planes — on failure we
|
||||
// `av_frame_free` the sole owner (no double-free) and bail; on success the raw `f` is moved into
|
||||
// `self.sw_frame` and freed exactly once in `Drop`.
|
||||
let sw_frame = unsafe {
|
||||
let f = ffi::av_frame_alloc();
|
||||
if f.is_null() {
|
||||
@@ -467,6 +484,18 @@ impl SystemInner {
|
||||
} else {
|
||||
DXGI_FORMAT_NV12
|
||||
};
|
||||
// SAFETY: `ensure_staging` builds a STAGING texture (CPU_ACCESS_READ) matching `dxgi_fmt` on
|
||||
// `frame.device` — the same `ID3D11Device` that owns `frame.texture` — and caches that device's
|
||||
// immediate context in `self.ctx`. `src`/`dst` are that device's textures of identical NV12/P010
|
||||
// format and dimensions, so `CopyResource` on the single-threaded immediate context is valid.
|
||||
// `Map(.., D3D11_MAP_READ)` succeeds on a staging texture and yields `map.pData` valid for the
|
||||
// whole resource; for NV12/P010 the luma plane is `H` rows at `RowPitch` and the chroma plane
|
||||
// follows at byte offset `RowPitch*H` (`H/2` rows), so `total = pitch*(H+⌈H/2⌉)` is exactly the
|
||||
// mapped extent and `from_raw_parts(base, total)` stays in-bounds. Each `copy_nonoverlapping`
|
||||
// reads a bounds-checked `mapped[..]` sub-slice (`row_bytes ≤ pitch`) and writes `row_bytes ≤
|
||||
// linesize` into the `av_frame_get_buffer`-allocated plane at row `y < H`, so every destination
|
||||
// offset is inside the frame's plane allocation; src and dst never alias. `Unmap` pairs `Map`,
|
||||
// then `send` (the `unsafe fn`) hands `sw_frame` to the encoder.
|
||||
unsafe {
|
||||
self.ensure_staging(&frame.device, dxgi_fmt)?;
|
||||
let staging = self.staging.clone().context("staging texture")?;
|
||||
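Reviewer sketch (not part of the diff): the plane arithmetic in the SAFETY comment (luma at `RowPitch`, chroma starting at `RowPitch*H`, total `pitch*(H+⌈H/2⌉)`) can be written over a safe slice. `copy_nv12` is a hypothetical function; it assumes 8-bit NV12 with an even `width`, so each luma and chroma row carries `width` meaningful bytes.

```rust
/// Copies a mapped NV12 surface into tightly packed luma and chroma planes.
/// Returns `None` if `mapped` is shorter than `pitch * (height + ceil(height / 2))`.
fn copy_nv12(
    mapped: &[u8],
    pitch: usize,
    width: usize,
    height: usize,
) -> Option<(Vec<u8>, Vec<u8>)> {
    let chroma_rows = (height + 1) / 2; // ceil(H / 2)
    let total = pitch.checked_mul(height + chroma_rows)?;
    if width > pitch || mapped.len() < total {
        return None;
    }
    let mut luma = Vec::with_capacity(width * height);
    for y in 0..height {
        let start = y * pitch;
        luma.extend_from_slice(&mapped[start..start + width]);
    }
    // The interleaved UV plane follows the luma plane at byte offset `pitch * height`.
    let chroma_base = pitch * height;
    let mut chroma = Vec::with_capacity(width * chroma_rows);
    for y in 0..chroma_rows {
        let start = chroma_base + y * pitch;
        chroma.extend_from_slice(&mapped[start..start + width]);
    }
    Some((luma, chroma))
}
```

Each row read stays below `total`, which is the same in-bounds claim the comment makes for the raw `copy_nonoverlapping` path.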
@@ -510,6 +539,14 @@ impl SystemInner {
|
||||
if self.ten_bit {
|
||||
bail!("ffmpeg_win: BGRA readback is 8-bit only (HDR needs the P010 capture path)");
|
||||
}
|
||||
// SAFETY: `ensure_staging` builds a B8G8R8A8 STAGING texture on `frame.device` and caches that
|
||||
// device's immediate context; `src`/`dst` are that device's textures of matching BGRA format,
|
||||
// so `CopyResource` on the single-threaded context is valid. `Map(READ)` on the staging texture
|
||||
// yields `base` valid for `pitch` × `h` rows. `ensure_sws` lazily builds the BGRA→NV12 context;
|
||||
// `sws_scale` reads `h` rows of `pitch` bytes from `base` (in-bounds — the staging surface is
|
||||
// `≥ pitch*h`) into the `sw_frame` planes addressed by its `data`/`linesize` (allocated for
|
||||
// `width`×`height` NV12). `Unmap` pairs `Map`; the cached `sws` is freed once in `Drop`. The
|
||||
// mapped read region never aliases the owned encoder frame.
|
||||
unsafe {
|
||||
self.ensure_staging(&frame.device, DXGI_FORMAT_B8G8R8A8_UNORM)?;
|
||||
let staging = self.staging.clone().context("staging texture")?;
|
||||
@@ -552,6 +589,13 @@ impl SystemInner {
|
||||
/// R10 shader output instead of P010. DXGI `R10G10B10A2_UNORM` (R in the low 10 bits, X2 alpha in
|
||||
/// the top 2) == FFmpeg `AV_PIX_FMT_X2BGR10LE`. UNTESTED on glass (no AMD/Intel Windows box).
|
||||
fn readback_rgb10(&mut self, frame: &D3d11Frame, pts: i64, idr: bool) -> Result<()> {
|
||||
// SAFETY: same shape as `readback_yuv`/`readback_bgra` — `ensure_staging` builds an
|
||||
// R10G10B10A2 STAGING texture on `frame.device` and caches its immediate context; `src`/`dst`
|
||||
// are that device's matching-format textures, so `CopyResource` on the single-threaded context
|
||||
// is valid. `Map(READ)` yields `base` valid for `pitch` × `h` rows. `ensure_sws` builds the
|
||||
// X2BGR10LE→P010 (BT.2020) context; `sws_scale` reads `h` rows of `pitch` bytes from `base`
|
||||
// (in-bounds) into the `sw_frame` P010 planes (`data`/`linesize`, allocated `width`×`height`).
|
||||
// `Unmap` pairs `Map`; `sws` is freed once in `Drop`. No aliasing between read and write.
|
||||
unsafe {
|
||||
self.ensure_staging(&frame.device, DXGI_FORMAT_R10G10B10A2_UNORM)?;
|
||||
let staging = self.staging.clone().context("staging texture")?;
|
||||
@@ -605,6 +649,12 @@ impl SystemInner {
|
||||
let h = self.height as usize;
|
||||
let src_row = w * format.bytes_per_pixel();
|
||||
anyhow::ensure!(bytes.len() >= src_row * h, "captured buffer too small");
|
||||
// SAFETY: `ensure_sws` lazily builds the (packed RGB/BGR)→NV12 context for this fixed src/dst
|
||||
// format pair. `src_data[0] = bytes.as_ptr()` with `src_stride[0] = src_row`; the `ensure!`
|
||||
// above guarantees `bytes` holds at least `src_row*h` bytes, so `sws_scale` reads `h` rows of
|
||||
// `src_row` bytes in-bounds and writes the `sw_frame` NV12 planes (`data`/`linesize`, allocated
|
||||
// `width`×`height`). `bytes` is borrowed for the call only and never aliases the owned
|
||||
// `sw_frame`. `send` then hands `sw_frame` to the encoder.
|
||||
unsafe {
|
||||
self.ensure_sws(
|
||||
pixel_to_av(sws_src(format)?),
|
||||
@@ -667,6 +717,10 @@ impl SystemInner {
|
||||
|
||||
impl Drop for SystemInner {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `sw_frame` is the `AVFrame` allocated in `open` (or null) — `av_frame_free` drops it
|
||||
// once and nulls the pointer through the `&mut`; `sws` is the cached `SwsContext` (or null) —
|
||||
// `sws_freeContext` frees it once. This `Drop` runs exactly once and `SystemInner` owns both
|
||||
// exclusively, so there is no double-free or use-after-free.
|
||||
unsafe {
|
||||
if !self.sw_frame.is_null() {
|
||||
ffi::av_frame_free(&mut self.sw_frame);
|
||||
@@ -745,6 +799,12 @@ impl D3d11Hw {
|
||||
|
||||
impl Drop for D3d11Hw {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `frames_ref`/`device_ref` are the two non-null `AVBufferRef`s `D3d11Hw::new` created
|
||||
// (it bails before constructing `Self` if either alloc/init fails, so a live `D3d11Hw` always
|
||||
// holds both). `av_buffer_unref` drops one reference and nulls the pointer through the `&mut`.
|
||||
// This `Drop` runs exactly once and `D3d11Hw` owns these refs exclusively → no double-free /
|
||||
// use-after-free. Frames are unref'd before the device because the frames ctx internally holds
|
||||
// a ref on the device (refcounted, so the order is sound either way).
|
||||
unsafe {
|
||||
ffi::av_buffer_unref(&mut self.frames_ref);
|
||||
ffi::av_buffer_unref(&mut self.device_ref);
|
||||
@@ -800,6 +860,18 @@ impl ZeroCopyInner {
|
||||
WinVendor::Qsv => (D3D11_BIND_DECODER.0 | D3D11_BIND_VIDEO_ENCODER.0) as u32,
|
||||
};
|
||||
const POOL: c_int = 8;
|
||||
// SAFETY: `D3d11Hw::new` wraps the capturer's `device` as a D3D11VA hwdevice (handing FFmpeg an
|
||||
// owned AddRef of it, balanced by FFmpeg's teardown Release) and builds an owned
|
||||
// device_ref/frames_ref pair freed by `D3d11Hw::Drop`; `hw` is a local, so it is dropped (and
|
||||
// both refs freed) on every early `return Err`. For QSV, `av_hwdevice_ctx_create_derived` and
|
||||
// `av_hwframe_ctx_create_derived` fill the null-initialised `qsv_device`/`qsv_frames` out-params
|
||||
// only on success (`r >= 0` checked); on the frames-derive failure we unref the already-created
|
||||
// `qsv_device` before bailing. `open_win_encoder` internally `av_buffer_ref`s the dev/frames
|
||||
// refs it is given (so ownership of `hw`'s and the derived refs stays here), and on its failure
|
||||
// we unref the still-owned derived `qsv_frames`/`qsv_device` (null for AMF → skipped) and return
|
||||
// — `hw` then drops its D3D11 refs. On success the derived refs are moved into `ZeroCopyInner`
|
||||
// (freed in its `Drop`) and the encoder holds its own AddRef'd copies. Every `AVBufferRef` is
|
||||
// unref'd exactly once across all paths — no leak, no double-free.
|
||||
unsafe {
|
||||
let hw = D3d11Hw::new(device, sw_av, bind_flags, width, height, POOL)?;
|
||||
let (pix_fmt, dev_ref, frames_ref, mut qsv_device, mut qsv_frames) = match vendor {
|
||||
@@ -887,6 +959,19 @@ impl ZeroCopyInner {
|
||||
}
|
||||
|
||||
fn submit(&mut self, frame: &D3d11Frame, pts: i64, idr: bool) -> Result<()> {
|
||||
// SAFETY: `d3d = av_frame_alloc()` is a fresh owned frame (null-checked) and is `av_frame_free`d
|
||||
// exactly once on every path below. `av_hwframe_get_buffer` fills it from the pool — on failure
|
||||
// we free it and bail. `(*d3d).data[0]` is the pool's texture-array and `data[1]` the array
|
||||
// index; `from_raw_borrowed` borrows that `ID3D11Texture2D` WITHOUT taking ownership (no Release
|
||||
// — the frame owns it) and is null-checked. `src` (the captured texture) and `dst` (the pooled
|
||||
// slice) live on the SAME D3D11 device wrapped by `self.hw`, and the caller guarantees
|
||||
// `captured.format == pool_format` before calling, so `CopySubresourceRegion(dst, dst_index, ..,
|
||||
// src, 0, ..)` on the single-threaded immediate context `self.ctx` is a valid same-format GPU
|
||||
// copy. For QSV the mapped `qsv` frame is a fresh owned frame whose `hw_frames_ctx` takes an
|
||||
// `av_buffer_ref` of `self.qsv_frames`; it is `av_frame_free`d (releasing that ref) on both the
|
||||
// map-failure and success paths. `avcodec_send_frame` only internally refs the input frame, so
|
||||
// the `av_frame_free(d3d)`/`av_frame_free(qsv)` afterwards are the sole owning frees — no leak,
|
||||
// no double-free, no use-after-free.
|
||||
unsafe {
|
||||
// Pull a pooled D3D11 surface; its data[0] is the pool's texture-ARRAY, data[1] the slice.
|
||||
let mut d3d = ffi::av_frame_alloc();
|
||||
@@ -959,6 +1044,11 @@ impl ZeroCopyInner {
|
||||
|
||||
impl Drop for ZeroCopyInner {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: `qsv_frames`/`qsv_device` are the derived QSV `AVBufferRef`s (or null for AMF); each
|
||||
// is `av_buffer_unref`'d once here (nulling the pointer through the `&mut`) — `ZeroCopyInner`
|
||||
// owns these handles exclusively and this `Drop` runs once, so no double-free. The `enc` and
|
||||
// `hw` fields free the encoder's AddRef'd copies and the D3D11 device/frames refs through their
|
||||
// own `Drop`, so all references stay balanced.
|
||||
unsafe {
|
||||
if !self.qsv_frames.is_null() {
|
||||
ffi::av_buffer_unref(&mut self.qsv_frames);
|
||||
@@ -996,6 +1086,13 @@ pub struct FfmpegWinEncoder {
|
||||
}
|
||||
|
||||
// Raw FFI pointers + COM objects; the encoder lives on a single thread (same contract as NVENC/VAAPI).
|
||||
// SAFETY: `FfmpegWinEncoder` owns raw libav pointers (`AVFrame`/`SwsContext`/`AVBufferRef`) and
|
||||
// windows-rs COM handles (`ID3D11Device`/`ID3D11DeviceContext`/textures) that are not auto-`Send`. The
|
||||
// session creates the encoder, drives `submit`/`poll`/`flush`, and drops it all on one dedicated encode
|
||||
// thread; it is never shared by reference across threads, and the D3D11 immediate context is only ever
|
||||
// touched from that thread. The only cross-thread action is the initial move to the encode thread,
|
||||
// after which every interior pointer/COM ref is used single-threaded — the same contract the
|
||||
// NVENC/VAAPI encoders rely on. No interior state is accessed concurrently.
|
||||
unsafe impl Send for FfmpegWinEncoder {}
|
||||
|
||||
impl FfmpegWinEncoder {
|
||||
@@ -1012,6 +1109,8 @@ impl FfmpegWinEncoder {
|
||||
) -> Result<Self> {
|
||||
ffmpeg::init().context("ffmpeg init")?;
|
||||
if std::env::var_os("PUNKTFUNK_FFMPEG_DEBUG").is_some() {
|
||||
// SAFETY: `ffmpeg::init()` ran on the line above, so libav is initialised; `av_log_set_level`
|
||||
// is a global scalar setter with no pointer arguments.
|
||||
unsafe { ffi::av_log_set_level(48) };
|
||||
}
|
||||
// Make sure the encoder name exists in this libavcodec build up front (clear error vs a
|
||||
|
||||
@@ -13,6 +13,9 @@
|
||||
//! Needs a real NVIDIA GPU at runtime (session creation fails otherwise) — compiles GPU-less, but
|
||||
//! `open`/`submit` only succeed on a GPU box. The software encoder (`super::sw`) is the fallback.
|
||||
|
||||
// Every `unsafe` block / impl in this file carries a `// SAFETY:` proof; enforce it.
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::{Codec, EncodedFrame, Encoder, EncoderCaps};
|
||||
use crate::capture::{CapturedFrame, FramePayload, PixelFormat};
|
||||
use anyhow::{anyhow, bail, Context, Result};
|
||||
@@ -88,7 +91,15 @@ pub struct NvencD3d11Encoder {
|
||||
init_device: *mut c_void,
|
||||
}
|
||||
|
||||
-// Raw NVENC handle + COM ptrs; confined to the single encode thread (like the Linux encoder).
|
||||
// SAFETY: the `!Send` fields are the raw NVENC session/device handles (`encoder`, `init_device`),
|
||||
// the raw NVENC bitstream/registered/mapped pointers carried in `bitstreams`/`regs`/`pending`, and
|
||||
// the `ID3D11Texture2D` COM refs — none of which may be touched concurrently from two threads. This
|
||||
// encoder is owned by exactly one thread: it is moved onto the host encode thread once at
|
||||
// construction, and every NVENC call and D3D11 access happens only from that thread thereafter
|
||||
// (`submit`/`poll`/`invalidate_ref_frames`/`Drop` all run there, like the Linux encoder). Moving the
|
||||
// handles across that single ownership-transfer boundary is sound because no NVENC/D3D11 call is in
|
||||
// flight during the move and the session and its D3D11 immediate context are never shared (`&`) or
|
||||
// used concurrently — so `Send` introduces no data race on the non-`Send` fields.
|
||||
unsafe impl Send for NvencD3d11Encoder {}
|
||||
|
||||
impl NvencD3d11Encoder {
|
||||
@@ -403,6 +414,17 @@ impl NvencD3d11Encoder {
|
||||
|
||||
/// Lazily create the session on the first frame's D3D11 device (so capture + encode share it).
|
||||
fn init_session(&mut self, device: &ID3D11Device) -> Result<()> {
|
||||
// SAFETY: every call below goes through a function pointer resolved once from the loaded
|
||||
// `nvidia_video_codec_sdk::ENCODE_API` (`nvEncodeAPI`) table, or through this type's own
|
||||
// `unsafe fn`s whose contract is met here. `query_caps`/`try_open_session` receive `device`,
|
||||
// the live `ID3D11Device` the caller pulled off the first frame; each returns either a valid
|
||||
// open NVENC session handle or an `Err`. `destroy_encoder` is only ever called on a handle a
|
||||
// `try_open_session` just returned (and `best` only when `!best.is_null()`), so it never frees
|
||||
// a dangling or null session. `create_bitstream_buffer` is passed `enc` — the one chosen live
|
||||
// session — and `&mut cb`, a `#[repr(C)] NV_ENC_CREATE_BITSTREAM_BUFFER` whose `version` is set
|
||||
// to `NV_ENC_CREATE_BITSTREAM_BUFFER_VER`; `cb` lives across the synchronous call and its
|
||||
// returned `bitstreamBuffer` is copied into `self.bitstreams` before `cb` drops. No handle
|
||||
// escapes the encode thread.
|
||||
unsafe {
|
||||
// Probe real GPU caps first (max dims / 10-bit / custom-VBV / RFI) so the config below is
|
||||
// gated on what this card supports and an out-of-range mode fails with a clear error
|
||||
@@ -589,6 +611,11 @@ impl Encoder for NvencD3d11Encoder {
|
||||
new = format!("{}x{}", captured.width, captured.height),
|
||||
"NVENC: capture device/size/HDR changed — re-initializing session"
|
||||
);
|
||||
// SAFETY: `teardown` (an `unsafe fn`) requires the encode thread with no NVENC call in
|
||||
// flight and a session whose cached regs/bitstreams/pending all belong to `self.encoder`.
|
||||
// All hold: this is the synchronous encode thread, `self.inited` so `self.encoder` is the
|
||||
// live session every cached resource was created against, and the previous frame's encode
|
||||
// has already been polled (synchronous submit→poll), so nothing is mid-encode.
|
||||
unsafe { self.teardown() };
|
||||
}
|
||||
if !self.inited {
|
||||
@@ -609,7 +636,14 @@ impl Encoder for NvencD3d11Encoder {
|
||||
self.bit_depth = 10;
|
||||
nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_ABGR10
|
||||
}
|
||||
-PixelFormat::Nv12 => nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_NV12,
|
||||
PixelFormat::Nv12 => {
|
||||
// NV12 is 8-bit 4:2:0. Force 8-bit so a transition from a prior P010 (10-bit) session
|
||||
// — or a 10-bit-negotiated client on an SDR display — re-inits at the matching depth.
|
||||
// Unlike ARGB (which NVENC upconverts to Main10), NV12 cannot feed a 10-bit session:
|
||||
// `register_resource` rejects it as InvalidParam (the HDR→SDR-toggle stream drop).
|
||||
self.bit_depth = 8;
|
||||
nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_NV12
|
||||
}
|
||||
_ => nv::NV_ENC_BUFFER_FORMAT::NV_ENC_BUFFER_FORMAT_ARGB,
|
||||
};
|
||||
let device = frame.device.clone();
|
||||
@@ -618,6 +652,21 @@ impl Encoder for NvencD3d11Encoder {
|
||||
}
|
||||
let slot = self.next % POOL;
|
||||
self.next += 1;
|
||||
// SAFETY: every NVENC call goes through a function pointer from the loaded `ENCODE_API` table
|
||||
// and takes `self.encoder`, the live session `init_session` just established (non-null on the
|
||||
// path that reaches here). `NV_ENC_REGISTER_RESOURCE rr` has `version =
|
||||
// NV_ENC_REGISTER_RESOURCE_VER` and registers `frame.texture` — a D3D11 texture from
|
||||
// `frame.device`, which is the SAME device the session was opened against (any device change
|
||||
// tears down and re-inits above, so `init_device == frame.device.as_raw()` here); the cloned
|
||||
// `ID3D11Texture2D` is kept alive in `regs` so NVENC's registration never outlives the texture.
|
||||
// `mp` (`NV_ENC_MAP_INPUT_RESOURCE`, version set) maps that registration and the map is recorded
|
||||
// in `pending` to be unmapped exactly once in `poll`/`teardown`. `pic` (`NV_ENC_PIC_PARAMS`,
|
||||
// version set) points `inputBuffer` at `mp.mappedResource` and `outputBitstream` at the live
|
||||
// pool bitstream `bitstreams[slot]`; the optional SEI scratch (`mastering_sei`/`cll_sei` and the
|
||||
// `sei` Vec whose `as_mut_ptr()` is written into the codec union) are stack locals that outlive
|
||||
// the synchronous `encode_picture`. Every `#[repr(C)]` param is a live local borrowed `&mut`
|
||||
// for the duration of its one synchronous call. (In-place encode without `CopyResource` is
|
||||
// sound because the encode loop is synchronous, as the module docs state.)
|
||||
unsafe {
|
||||
// Register the capturer's texture with NVENC once (cached by raw pointer), then encode it
|
||||
// IN PLACE — no `CopyResource` into an encoder-owned pool. This is the zero-copy win: the
|
||||
@@ -774,6 +823,12 @@ impl Encoder for NvencD3d11Encoder {
// We tag each input with `inputTimeStamp = frame_idx` (0,1,2,…), which is also the client's
// frame number (the packetizer numbers frames in submit order), so the client's lost-frame
// range maps 1:1 onto the timestamps NVENC invalidates here.
// SAFETY: `invalidate_ref_frames` is a function pointer from the loaded `ENCODE_API` table.
// `self.encoder` was checked non-null at the top of this fn and is the live session; this runs
// on the encode thread (like submit/poll), so there is no concurrent NVENC use. Each `ts` was
// clamped to `[oldest_in_dpb, frame_idx - 1]` above, so it names a frame still in the session's
// DPB; the call passes only that `u64` timestamp (no struct), so there is no struct-size or
// lifetime concern.
unsafe {
for ts in first..=last {
if (API.invalidate_ref_frames)(self.encoder, ts as u64)
@@ -792,6 +847,16 @@ impl Encoder for NvencD3d11Encoder {
let Some((bs, map, pts_ns)) = self.pending.pop_front() else {
return Ok(None);
};
// SAFETY: a non-empty `pending` implies `submit` ran, so `self.encoder` is the live session
// (`teardown` clears `pending` whenever it nulls the handle); all calls below use function
// pointers from the loaded `ENCODE_API` table on the encode thread. `NV_ENC_LOCK_BITSTREAM lock`
// (version = `NV_ENC_LOCK_BITSTREAM_VER`) locks `bs`, a pool bitstream a prior `encode_picture`
// targeted; `lock_bitstream` blocks until that encode finishes, so on success
// `lock.bitstreamBufferPtr` is non-null and points at `lock.bitstreamSizeInBytes` bytes of
// NVENC-owned, CPU-readable output valid until `unlock_bitstream`. The `from_raw_parts` slice is
// only read (copied via `to_vec()`) BEFORE `unlock_bitstream(bs)` — lock and unlock pair on the
// same buffer — so it never outlives the lock. `map` (the input resource paired with `bs` in
// `pending`) is unmapped here, after the encode completed, exactly once.
unsafe {
let mut lock = nv::NV_ENC_LOCK_BITSTREAM {
version: nv::NV_ENC_LOCK_BITSTREAM_VER,
@@ -831,6 +896,11 @@ impl Encoder for NvencD3d11Encoder {

impl Drop for NvencD3d11Encoder {
fn drop(&mut self) {
// SAFETY: `teardown` (an `unsafe fn`) needs the owning thread with no NVENC call in flight and
// a session whose cached resources all belong to `self.encoder`. At Drop this encoder is owned
// exclusively (no other reference can exist), runs on the encode thread it was confined to, and
// `teardown` early-returns when `self.encoder` is null; otherwise every cached reg/bitstream/
// pending was created against that live session. It runs exactly once (here).
unsafe { self.teardown() };
}
}

@@ -2,6 +2,8 @@
//! fallback when NVENC is unavailable). Low-latency screen-content config: single-reference,
//! no B-frames (Baseline), bitrate rate-control, in-band SPS/PPS each IDR, BT.709 limited range.
//! Synchronous: `submit` encodes immediately and stashes the AU for `poll` (no internal queue).
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{EncodedFrame, Encoder};
use crate::capture::{CapturedFrame, FramePayload, PixelFormat};
@@ -30,6 +32,12 @@ pub struct OpenH264Encoder {
}

// openh264's Encoder holds a raw C handle (not auto-Send); it lives on the single encode thread.
// SAFETY: `OpenH264Encoder` wraps `Oh264` (openh264's `Encoder`), which holds a raw C handle to the
// openh264 `ISVCEncoder` and is not auto-`Send`; the other fields (`YUVBuffer`, `Vec`, scalars,
// `Option<EncodedFrame>`) are plain owned data. The session creates the encoder, calls
// `submit`/`poll`/`flush`, and drops it all on one dedicated encode thread, never sharing it by
// reference across threads, so the C handle is only ever touched from a single thread. Moving the
// whole value to that thread is therefore sound — there is no concurrent access to the handle.
unsafe impl Send for OpenH264Encoder {}

impl OpenH264Encoder {

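The `unsafe impl Send` proof in the hunk above rests on two facts: the wrapper is the only owner of the raw handle, and the value is moved onto one thread and used only there. A minimal sketch of that pattern follows. `RawHandle` and `run_on_worker` are hypothetical names for illustration; they stand in for a C encoder handle and are not part of the openh264 API or of this repository.

```rust
use std::thread;

/// Hypothetical stand-in for a C encoder handle: it holds a raw heap pointer,
/// so the compiler does not derive `Send` for it automatically.
struct RawHandle {
    ptr: *mut u32,
}

impl RawHandle {
    fn new(v: u32) -> Self {
        RawHandle { ptr: Box::into_raw(Box::new(v)) }
    }

    /// `&mut self` keeps access exclusive at any instant.
    fn bump(&mut self) -> u32 {
        // SAFETY: `ptr` came from `Box::into_raw` in `new`, is owned uniquely by
        // this wrapper, and is freed only in `Drop`.
        unsafe {
            *self.ptr += 1;
            *self.ptr
        }
    }
}

impl Drop for RawHandle {
    fn drop(&mut self) {
        // SAFETY: `ptr` is the allocation made in `new`; `drop` runs exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}

// SAFETY: the pointee is a plain heap allocation with no thread affinity, the
// wrapper is its only owner, and every method takes `&mut self`.
unsafe impl Send for RawHandle {}

/// Moves the handle onto a dedicated worker thread and uses it only there.
fn run_on_worker(mut h: RawHandle, calls: u32) -> u32 {
    thread::spawn(move || {
        let mut last = 0;
        for _ in 0..calls {
            last = h.bump();
        }
        last // `h` is dropped here, on the worker thread
    })
    .join()
    .expect("worker panicked")
}
```

Without the `unsafe impl Send`, `thread::spawn` rejects the closure at compile time, which is why the real encoder needs the impl and the proof that accompanies it.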
@@ -17,6 +17,9 @@
//! data packets are consumed immediately and missing parity only costs loss recovery — so
//! the validated stereo path stays byte-identical (data packets only, exactly as before).

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]

#[cfg(any(target_os = "linux", target_os = "windows", test))]
use crate::audio::SAMPLE_RATE;
#[cfg(any(target_os = "linux", target_os = "windows"))]
@@ -409,7 +412,10 @@ struct MsEncoder {
st: std::ptr::NonNull<audiopus_sys::OpusMSEncoder>,
}

// The raw encoder state has no thread affinity; the session owns it on one thread at a time.
// SAFETY: `MsEncoder` owns a unique `OpusMSEncoder` via `NonNull` (it is neither `Clone` nor
// `Sync`, so the pointer is never aliased). libopus's multistream encoder state is a self-contained
// heap allocation with no thread-local or thread-affine state, so moving ownership to another thread
// is sound; every method takes `&mut self`, keeping access single-threaded at any instant.
#[cfg(target_os = "linux")]
unsafe impl Send for MsEncoder {}

@@ -418,6 +424,13 @@ impl MsEncoder {
fn new(layout: &OpusLayout) -> Result<MsEncoder> {
use std::os::raw::c_int;
let mut err: c_int = 0;
// SAFETY: every scalar arg is a valid libopus input (sample rate, channel/stream/coupled
// counts, the RESTRICTED_LOWDELAY application constant). `layout.mapping.as_ptr()` addresses
// a 'static slice of exactly `layout.channels` bytes (every `OpusLayout` constant upholds
// that), which is the element count `opus_multistream_encoder_create` reads through it, and
// `&mut err` is a live local the call writes its status into. libopus copies the mapping into
// its own allocation, so the pointer need only be valid for the call; the returned pointer is
// null/`OPUS_OK`-checked below before any use.
let st = unsafe {
audiopus_sys::opus_multistream_encoder_create(
SAMPLE_RATE as i32,
@@ -432,6 +445,11 @@ impl MsEncoder {
let st = std::ptr::NonNull::new(st)
.filter(|_| err == audiopus_sys::OPUS_OK)
.ok_or_else(|| anyhow::anyhow!("opus_multistream_encoder_create failed ({err})"))?;
// SAFETY: `st` is the non-null encoder `opus_multistream_encoder_create` just returned, owned
// exclusively here. Each `opus_multistream_encoder_ctl` call passes a valid request constant
// with the single by-value `c_int` argument that request's variadic ABI expects
// (`OPUS_SET_BITRATE_REQUEST` → bitrate, `OPUS_SET_VBR_REQUEST` → 0). No pointer escapes the
// call and the encoder outlives it.
unsafe {
audiopus_sys::opus_multistream_encoder_ctl(
st.as_ptr(),
@@ -453,6 +471,13 @@ impl MsEncoder {
samples_per_channel: usize,
out: &mut [u8],
) -> Result<usize> {
// SAFETY: `self.st` is the live encoder from `new`. libopus reads `samples_per_channel *
// channels` f32s through `frame.as_ptr()`; every caller passes a `frame` of exactly that
// length together with the matching `samples_per_channel` (`audio_body`'s `frame_len =
// samples_per_channel * layout.channels`; the round-trip tests size identically), so the read
// stays in bounds. `out.as_mut_ptr()` is written for at most `out.len()` bytes, which is
// passed as the capacity bound. Both buffers are live locals outliving this synchronous call;
// the return value is range-checked before being used as a length.
let n = unsafe {
audiopus_sys::opus_multistream_encode_float(
self.st.as_ptr(),
@@ -470,6 +495,9 @@ impl MsEncoder {
#[cfg(target_os = "linux")]
impl Drop for MsEncoder {
fn drop(&mut self) {
// SAFETY: `self.st` is the encoder `opus_multistream_encoder_create` returned; this
// `MsEncoder` owns it uniquely and `drop` runs exactly once, so the destroy frees it once
// with no subsequent use.
unsafe { audiopus_sys::opus_multistream_encoder_destroy(self.st.as_ptr()) }
}
}
@@ -761,6 +789,10 @@ mod tests {
let client_mapping = client_swap(&digits[3..]);

let mut err = 0i32;
// SAFETY: scalar args are valid libopus inputs. `client_mapping.as_ptr()` addresses a
// `Vec<u8>` of exactly `ch` entries (derived from the advertised surround-params), which is
// the element count the decoder reads through it, and `&mut err` is a live local the call
// writes. The returned pointer is `OPUS_OK`/non-null-checked immediately below before use.
let dec = unsafe {
audiopus_sys::opus_multistream_decoder_create(
SAMPLE_RATE as i32,
@@ -789,6 +821,11 @@ mod tests {
}
let n = enc.encode_float(&frame, samples, &mut out).unwrap();
assert!(n > 0);
// SAFETY: `dec` is the non-null decoder asserted above. `out.as_ptr()` is read for
// the `n` encoded bytes just produced by `encode_float`; `decoded.as_mut_ptr()` is
// written for up to `samples * ch` f32s and `decoded` is exactly that long; `samples`
// is the per-channel frame size. All buffers are live locals outliving the call; the
// return is checked to equal `samples`.
let got = unsafe {
audiopus_sys::opus_multistream_decode_float(
dec,
@@ -817,6 +854,8 @@ mod tests {
(energies: {energy:?})"
);
}
// SAFETY: `dec` is the decoder `opus_multistream_decoder_create` returned; the test owns it
// and destroys it exactly once here, after the final decode — no later use, no double free.
unsafe { audiopus_sys::opus_multistream_decoder_destroy(dec) };
}

@@ -853,6 +892,9 @@ mod tests {
let digits: Vec<u8> = s.bytes().map(|b| b - b'0').collect();
let client_mapping = client_swap(&digits[3..]);
let mut err = 0i32;
// SAFETY: scalar args are valid; `client_mapping.as_ptr()` addresses a 6-entry `Vec<u8>`
// (matches the 6-channel layout the decoder reads through it), alive past the call, and
// `&mut err` is a live local. The pointer is `OPUS_OK`-checked before use.
let dec = unsafe {
audiopus_sys::opus_multistream_decoder_create(
48000,
@@ -865,6 +907,10 @@ mod tests {
};
assert_eq!(err, audiopus_sys::OPUS_OK);
let mut pcm = vec![0f32; 240 * 6];
// SAFETY: `dec` is the non-null decoder from create. `out.as_ptr()` is read for the CBR
// packet length passed in (`*sizes.first()`, a real encoded packet size in `out`);
// `pcm.as_mut_ptr()` is written for up to `240 * 6` f32s and `pcm` is exactly that long;
// `240` is the per-channel frame size. All buffers are live locals outliving the call.
let got = unsafe {
audiopus_sys::opus_multistream_decode_float(
dec,
@@ -875,6 +921,7 @@ mod tests {
0,
)
};
// SAFETY: `dec` is owned by the test; destroyed exactly once here after the final decode.
unsafe { audiopus_sys::opus_multistream_decoder_destroy(dec) };
assert_eq!(got, 240);
}

@@ -1,7 +1,7 @@
//! Pairing crypto primitives (control plane only — distinct from `punktfunk_core`'s AES-GCM
//! data-plane sealing). GameStream pairing uses: AES-128-**ECB** with **no padding**,
//! SHA-256 (host appversion major ≥ 7), and RSA-PKCS1v15-SHA256 signatures. See the
//! `serverinfo + pairing` section of `docs/research/gamestream-protocol-research.json`.
//! `serverinfo + pairing` section of `design/research/gamestream-protocol-research.json`.

use aes::cipher::generic_array::GenericArray;
use aes::cipher::{BlockDecrypt, BlockEncrypt, KeyInit};

@@ -1,7 +1,7 @@
//! GameStream (P1) control plane — what a stock Moonlight/Artemis client talks to around
//! the media streams: mDNS discovery, the nvhttp serverinfo + pairing HTTP(S) API, RTSP,
//! and the ENet control stream. `tokio`/`axum` live here (control plane, I/O-bound — never
//! the per-frame hot path; that is `punktfunk_core`'s P1 wire codec). See `docs/gamestream-host-plan.md`.
//! the per-frame hot path; that is `punktfunk_core`'s P1 wire codec). See `design/gamestream-host-plan.md`.
//!
//! Status: P1.1 — mDNS `_nvstream._tcp` advertisement + `/serverinfo`. Pairing, RTSP, and
//! the media streams follow (see the GameStream host task list / plan).
@@ -125,12 +125,21 @@ pub struct AppState {
/// (avoids a PipeWire stream setup per reconnect); drained on reuse so no stale audio is
/// sent, dropped + reopened when a session negotiates a different channel count.
pub audio_cap: std::sync::Arc<std::sync::Mutex<Option<Box<dyn crate::audio::AudioCapturer>>>>,
/// Shared streaming-stats recorder (web-console capture/graph). The GameStream encode loop
/// reads `is_armed()` per frame and emits samples; the same `Arc` is shared with the mgmt API
/// and the native punktfunk/1 loops so one capture spans whichever path is streaming.
pub stats: Arc<crate::stats_recorder::StatsRecorder>,
}

impl AppState {
/// Fresh control-plane state: no active session; the pairing allow-list is loaded from
/// disk (pairings persist across restarts).
pub fn new(host: Host, identity: cert::ServerIdentity) -> AppState {
/// disk (pairings persist across restarts). `stats` is the shared recorder handed to both the
/// mgmt API and the streaming loops.
pub fn new(
host: Host,
identity: cert::ServerIdentity,
stats: Arc<crate::stats_recorder::StatsRecorder>,
) -> AppState {
AppState {
host,
identity,
@@ -145,6 +154,7 @@ impl AppState {
rfi_range: std::sync::Arc::new(std::sync::Mutex::new(None)),
video_cap: std::sync::Arc::new(std::sync::Mutex::new(None)),
audio_cap: std::sync::Arc::new(std::sync::Mutex::new(None)),
stats,
}
}
}
@@ -166,7 +176,10 @@ pub fn serve(
) -> Result<()> {
let host = Host::detect()?;
let identity = cert::ServerIdentity::load_or_create().context("host certificate")?;
let state = Arc::new(AppState::new(host, identity));
// The shared streaming-stats recorder: one handle for the mgmt API, the GameStream encode loop
// (via `AppState`), and the native punktfunk/1 loops (passed to `punktfunk1::serve`).
let stats = crate::stats_recorder::StatsRecorder::new(crate::stats_recorder::default_dir());
let state = Arc::new(AppState::new(host, identity, stats.clone()));
// The native plane always runs, so the shared native-pairing handle (linking the QUIC ceremony
// and the management API) always exists.
let np = Arc::new(
@@ -206,8 +219,8 @@ pub fn serve(
);
tokio::try_join!(
nvhttp::run(state.clone()),
crate::mgmt::run(state.clone(), mgmt, Some(np.clone())),
crate::punktfunk1::serve(native_opts, np),
crate::mgmt::run(state.clone(), mgmt, Some(np.clone()), stats.clone()),
crate::punktfunk1::serve(native_opts, np, stats.clone()),
)?;
} else {
// Secure default: native punktfunk/1 + management API only (no GameStream surface).
@@ -217,8 +230,8 @@ pub fn serve(
(GameStream OFF — pass --gamestream for stock-Moonlight compat)"
);
tokio::try_join!(
crate::mgmt::run(state.clone(), mgmt, Some(np.clone())),
crate::punktfunk1::serve(native_opts, np),
crate::mgmt::run(state.clone(), mgmt, Some(np.clone()), stats.clone()),
crate::punktfunk1::serve(native_opts, np, stats.clone()),
)?;
}
Ok(())

@@ -291,7 +291,10 @@ mod tests {
https_port: HTTPS_PORT,
};
let identity = super::super::cert::ServerIdentity::ephemeral().expect("ephemeral identity");
Arc::new(AppState::new(host, identity))
let stats = crate::stats_recorder::StatsRecorder::new(
std::env::temp_dir().join(format!("pf-nvhttp-stats-{}", std::process::id())),
);
Arc::new(AppState::new(host, identity, stats))
}

fn fp_of(der: &[u8]) -> String {

@@ -1,7 +1,7 @@
//! The 4-phase GameStream pairing state machine (over HTTP), keyed by `uniqueid`. Proves
//! both sides know the PIN (via the SHA-256(salt||pin) AES-ECB key) and own their certs
//! (RSA signatures), then pins the client cert. The final `pairchallenge` happens over
//! HTTPS (handled in `nvhttp`). Byte-exact spec: `docs/research/…-research.json`.
//! HTTPS (handled in `nvhttp`). Byte-exact spec: `design/research/…-research.json`.

use super::cert::ServerIdentity;
use super::crypto;

@@ -234,6 +234,7 @@ fn handle_request(req: &Request, state: &AppState) -> String {
state.force_idr.clone(),
state.rfi_range.clone(),
state.video_cap.clone(),
state.stats.clone(),
);
}
Some(_) => tracing::info!("RTSP PLAY — stream already running"),

@@ -3,6 +3,9 @@
//! either real portal desktop capture (`PUNKTFUNK_VIDEO_SOURCE=portal`, the portal PipeWire path) or
//! a synthetic test pattern (default). Runs on its own native thread.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]

use super::video::{FrameType, VideoPacketizer};
use super::VIDEO_PORT;
use crate::capture::{self, Capturer, FastSyntheticCapturer};
@@ -45,6 +48,7 @@ pub fn start(
force_idr: Arc<AtomicBool>,
rfi_range: RfiSlot,
video_cap: CapturerSlot,
stats: Arc<crate::stats_recorder::StatsRecorder>,
) {
let _ = std::thread::Builder::new()
.name("punktfunk-video".into())
@@ -57,6 +61,7 @@ pub fn start(
&force_idr,
&rfi_range,
&video_cap,
&stats,
) {
tracing::error!(error = %format!("{e:#}"), "video stream failed");
}
@@ -65,6 +70,7 @@ pub fn start(
});
}

#[allow(clippy::too_many_arguments)]
fn run(
cfg: StreamConfig,
app: Option<&super::apps::AppEntry>,
@@ -72,6 +78,9 @@ fn run(
force_idr: &AtomicBool,
rfi_range: &std::sync::Mutex<Option<(i64, i64)>>,
video_cap: &std::sync::Mutex<Option<Box<dyn Capturer>>>,
// Shared stats recorder for the web-console capture/graph. Threaded into `stream_body` (the
// encode loop); per-frame sample emission is wired by a later pass.
stats: &Arc<crate::stats_recorder::StatsRecorder>,
) -> Result<()> {
// GameStream capture/encode thread: apply Windows session tuning (no-op off Windows).
crate::session_tuning::on_hot_thread();
@@ -97,6 +106,8 @@ fn run(
sock.connect(client)
.context("connect client video endpoint")?;
tracing::info!(%client, "video: client endpoint learned");
// Short label for web-console stats captures: the client's peer IP.
let client_label = client.ip().to_string();

// Native client-resolution source: create a compositor virtual output sized to the client's
// request and capture it (no scaling). Self-contained — deliberately NOT pooled in
@@ -141,7 +152,35 @@ fn run(
)
.context("capture virtual output")?;
capturer.set_active(true);
return stream_body(&mut *capturer, &sock, cfg, running, force_idr, rfi_range);
// Launch the app's command now that capture is live, for the backends that DON'T nest it via
// set_launch_command above: Windows (no gamescope) and Linux kwin/mutter/wlroots (which stream
// the existing desktop, so the app must be spawned into the session to land on the streamed
// output). Linux gamescope already nested it via set_launch_command, so skip it there.
#[cfg(windows)]
let launch_here = true;
#[cfg(target_os = "linux")]
let launch_here = compositor != crate::vdisplay::Compositor::Gamescope;
#[cfg(any(windows, target_os = "linux"))]
if launch_here {
if let Some(cmd) = app
.and_then(|a| a.cmd.as_deref())
.filter(|c| !c.trim().is_empty())
{
if let Err(e) = crate::library::launch_gamestream_command(cmd) {
tracing::warn!(command = %cmd, error = %e, "gamestream: could not launch app");
}
}
}
return stream_body(
&mut *capturer,
&sock,
cfg,
running,
force_idr,
rfi_range,
stats,
&client_label,
);
}

// Reuse the persistent capturer (one screencast session → clean reconnect); create it on
@@ -161,7 +200,16 @@ fn run(
}
};
capturer.set_active(true);
let result = stream_body(&mut *capturer, &sock, cfg, running, force_idr, rfi_range);
let result = stream_body(
&mut *capturer,
&sock,
cfg,
running,
force_idr,
rfi_range,
stats,
&client_label,
);
capturer.set_active(false);
*video_cap.lock().unwrap() = Some(capturer);
result
@@ -188,6 +236,10 @@ fn sendmmsg_all(sock: &UdpSocket, pkts: &[Vec<u8>]) -> std::io::Result<()> {
let mut hdrs: Vec<libc::mmsghdr> = iovs
.iter_mut()
.map(|iov| {
// SAFETY: `libc::mmsghdr` is a plain `#[repr(C)]` struct of integers and raw
// pointers, for which an all-zero bit pattern is valid (null pointers / zero
// lengths); the fields we rely on (`msg_iov`, `msg_iovlen`) are overwritten on the
// next two lines before the struct is handed to the kernel.
let mut h: libc::mmsghdr = unsafe { std::mem::zeroed() };
h.msg_hdr.msg_iov = iov;
h.msg_hdr.msg_iovlen = 1;
@@ -196,6 +248,13 @@ fn sendmmsg_all(sock: &UdpSocket, pkts: &[Vec<u8>]) -> std::io::Result<()> {
.collect();
let mut off = 0usize;
while off < hdrs.len() {
// SAFETY: `fd` is `sock`'s live raw fd (`sock` outlives the call). `hdrs[off..]
// .as_mut_ptr()` is a live slice of `(hdrs.len() - off)` `mmsghdr`s — exactly the count
// passed — into which the kernel writes each `msg_len`. Each header's `msg_iov` points
// into `iovs` (a local that outlives this call, with `msg_iovlen == 1` matching its one
// entry) and each `iovec.iov_base` points into the `chunk` packet buffers (the caller's
// `pkts`, alive for the call); the kernel only reads those payloads. Flags 0; the return
// is error-/progress-checked before advancing `off`.
let n = unsafe {
libc::sendmmsg(fd, hdrs[off..].as_mut_ptr(), (hdrs.len() - off) as u32, 0)
};
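The `while off < hdrs.len()` loop in the hunk above resubmits the unsent tail until the kernel has accepted every message. The diff does not show how the return value is handled, so the following is only a sketch of that partial-progress pattern. `send_all` and the `send_some` closure are hypothetical; the closure stands in for `libc::sendmmsg` so the example runs on any platform, and the zero-progress and `EINTR` handling are assumptions rather than the repository's behavior.

```rust
use std::io;

/// Sends every item of `items`, calling `send_some` on the unsent tail until all
/// items are accepted. `send_some` is a hypothetical stand-in for a batched
/// syscall: it returns how many leading items of the tail it accepted.
fn send_all<T>(
    items: &[T],
    mut send_some: impl FnMut(&[T]) -> io::Result<usize>,
) -> io::Result<()> {
    let mut off = 0usize;
    while off < items.len() {
        match send_some(&items[off..]) {
            // Zero progress would loop forever, so surface it as an error (assumption).
            Ok(0) => return Err(io::Error::new(io::ErrorKind::WriteZero, "no progress")),
            // Never advance past the end, even if the callee over-reports.
            Ok(n) => off += n.min(items.len() - off),
            // Retry when interrupted by a signal (assumption).
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

A caller that accepts at most two items per call needs three calls to drain five items, which is the behavior the `off` cursor exists to support.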
@@ -293,8 +352,20 @@ fn spawn_sender(
Ok(())
}

/// Percentile of a slice (sorts it in place first). `q` in `0.0..=1.0`. Used for the web-console
/// stats sample's per-stage p50/p99.
fn percentile(v: &mut [u32], q: f64) -> u32 {
if v.is_empty() {
return 0;
}
v.sort_unstable();
let i = ((v.len() as f64 * q) as usize).min(v.len() - 1);
v[i]
}

/// The encode → packetize loop, over a borrowed capturer. Sending runs on a dedicated thread
/// (see [`spawn_sender`]) so a send spike can never stall capture/encode.
#[allow(clippy::too_many_arguments)]
fn stream_body(
capturer: &mut dyn Capturer,
sock: &UdpSocket,
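The `percentile` helper added in the hunk above picks the element at index `floor(len * q)` of the sorted samples, clamped to the last index. A small sketch of how that rule behaves on one window of stage timings is shown here. The function body repeats the helper from the diff; `p50_p99` is a hypothetical wrapper added only for illustration.

```rust
/// Same index rule as the `percentile` helper in the diff: sort, then take the
/// element at `floor(len * q)`, clamped to the last index.
fn percentile(v: &mut [u32], q: f64) -> u32 {
    if v.is_empty() {
        return 0;
    }
    v.sort_unstable();
    let i = ((v.len() as f64 * q) as usize).min(v.len() - 1);
    v[i]
}

/// Hypothetical convenience wrapper: both quantiles for one stage's window of
/// per-frame timings in microseconds.
fn p50_p99(samples_us: &mut [u32]) -> (u32, u32) {
    (percentile(samples_us, 0.50), percentile(samples_us, 0.99))
}
```

With ten samples of 100 to 1000 µs in steps of 100, `p50_p99` yields `(600, 1000)`: index 5 is the upper of the two middle values, and the clamp makes p99 the maximum for any window of fewer than 100 samples.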
@@ -302,6 +373,11 @@ fn stream_body(
running: &Arc<AtomicBool>,
force_idr: &AtomicBool,
rfi_range: &std::sync::Mutex<Option<(i64, i64)>>,
// Shared stats recorder. The encode loop reads `stats.is_armed()` per frame to decide whether
// to accumulate the per-stage split, then emits a `StatsSample` at its 1 s aggregation boundary.
stats: &Arc<crate::stats_recorder::StatsRecorder>,
// Short client label (peer IP) seeded into the capture meta on the first armed registration.
client_label: &str,
) -> Result<()> {
// The first frame establishes the authoritative size/format for the encoder.
let mut frame = capturer.next_frame().context("capture first frame")?;
@@ -365,6 +441,19 @@ fn stream_body(
let perf = crate::config::config().perf;
let (mut mx_cap, mut mx_enc, mut mx_pkt, mut mx_send, mut mx_pkts, mut uniq) =
(0u128, 0u128, 0u128, 0u128, 0usize, 0u32);
// Web-console stats accumulation (active when `perf` OR a capture is armed): per-stage vectors
// for p50/p99, the goodput bytes queued to the sender this window, the previous window's
// dropped-frame count for delta computation, and the registration id cached on the first sample.
let codec_name = match cfg.codec {
Codec::H264 => "h264",
Codec::H265 => "hevc",
Codec::Av1 => "av1",
};
let mut sid: Option<u32> = None;
let (mut v_cap, mut v_enc, mut v_pkt, mut v_send): (Vec<u32>, Vec<u32>, Vec<u32>, Vec<u32>) =
(Vec::new(), Vec::new(), Vec::new(), Vec::new());
let mut bytes_win: u64 = 0;
let mut last_dropped_batches: u64 = 0;
// Absolute next-frame deadline — the single pacing clock for the loop.
let mut next_frame = Instant::now();
// RFI capability is fixed for the session (probed at encoder open). Query it once so the
@@ -374,6 +463,9 @@ fn stream_body(

while running.load(Ordering::SeqCst) {
let tick = Instant::now();
// Measure per-stage timing when `PUNKTFUNK_PERF` is set OR a web-console stats capture is
// armed (cheap Relaxed atomic, re-read each frame).
let measure = perf || stats.is_armed();
// Advance to the freshest captured frame if one arrived; otherwise reuse the last.
if let Some(f) = capturer.try_latest().context("capture frame")? {
frame = f;
@@ -414,9 +506,19 @@ fn stream_body(
// Hand the frame's packets to the send thread; never block here. A full queue means
// the sender is behind — drop this batch (FEC/RFI covers the client) and keep encoding.
let n = batch.len();
// Goodput this window = bytes actually queued to the sender (a dropped batch never reaches
// the wire, so it's excluded). Summed only when measuring, to keep the idle path free.
let batch_bytes: u64 = if measure {
batch.iter().map(|p| p.len() as u64).sum()
} else {
0
};
if n > 0 {
match batch_tx.try_send(batch) {
Ok(()) => sent_batches += 1,
Ok(()) => {
sent_batches += 1;
bytes_win += batch_bytes;
}
Err(std::sync::mpsc::TrySendError::Full(_)) => {
dropped_batches += 1;
if dropped_batches.is_power_of_two() {
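The hunk above hands each batch to the send thread with `try_send` and drops it when the bounded queue is full, counting goodput only for batches that were actually queued. A self-contained sketch of that drop-on-full pattern with `std::sync::mpsc::sync_channel` follows. The `SendStats` struct, the `bounded` and `offer` functions, and the `Batch` alias are hypothetical names for illustration, and returning `false` on a disconnected receiver is an assumption about how a caller might want to stop.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// One frame's packets.
type Batch = Vec<Vec<u8>>;

/// Hypothetical counters mirroring the loop's `sent_batches`, `dropped_batches`
/// and `bytes_win` accumulators.
#[derive(Default)]
struct SendStats {
    sent_batches: u64,
    dropped_batches: u64,
    bytes_queued: u64,
}

/// Bounded queue between the encode loop and the send thread.
fn bounded(cap: usize) -> (SyncSender<Batch>, Receiver<Batch>) {
    sync_channel(cap)
}

/// Offers a batch without blocking. Returns `false` only when the receiver is
/// gone, so the caller can stop streaming.
fn offer(tx: &SyncSender<Batch>, batch: Batch, stats: &mut SendStats) -> bool {
    if batch.is_empty() {
        return true;
    }
    // Size must be taken before `try_send` moves the batch.
    let bytes: u64 = batch.iter().map(|p| p.len() as u64).sum();
    match tx.try_send(batch) {
        Ok(()) => {
            stats.sent_batches += 1;
            stats.bytes_queued += bytes; // goodput counts queued bytes only
            true
        }
        Err(TrySendError::Full(_)) => {
            stats.dropped_batches += 1; // sender is behind: drop, keep encoding
            true
        }
        Err(TrySendError::Disconnected(_)) => false,
    }
}
```

Measuring the byte count before the send is what lets a dropped batch be excluded from goodput, since the batch itself is consumed by `try_send` either way.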
@@ -428,17 +530,26 @@ fn stream_body(
}
}
}
if perf {
if measure {
let t_send = tick.elapsed();
mx_cap = mx_cap.max(t_cap.as_micros());
mx_enc = mx_enc.max((t_enc - t_cap).as_micros());
mx_pkt = mx_pkt.max((t_pkt - t_enc).as_micros());
mx_send = mx_send.max((t_send - t_pkt).as_micros());
let cap_us = t_cap.as_micros();
let enc_us = (t_enc - t_cap).as_micros();
let pkt_us = (t_pkt - t_enc).as_micros();
let send_us = (t_send - t_pkt).as_micros();
mx_cap = mx_cap.max(cap_us);
mx_enc = mx_enc.max(enc_us);
mx_pkt = mx_pkt.max(pkt_us);
mx_send = mx_send.max(send_us);
mx_pkts = mx_pkts.max(n);
v_cap.push(cap_us as u32);
v_enc.push(enc_us as u32);
v_pkt.push(pkt_us as u32);
v_send.push(send_us as u32);
}

fps_count += 1;
if fps_t.elapsed() >= Duration::from_secs(1) {
let secs = fps_t.elapsed().as_secs_f64();
if perf {
// Max µs/stage this second: cap=drain channel, enc=submit (zero-copy device
// copy + NVENC), pkt=poll+FEC+packetize, send=paced packet send. `uniq`=new
@@ -453,12 +564,6 @@ fn stream_body(
max_pkts = mx_pkts,
"video: streaming (perf)"
);
mx_cap = 0;
mx_enc = 0;
mx_pkt = 0;
mx_send = 0;
mx_pkts = 0;
uniq = 0;
} else {
tracing::info!(
fps = fps_count,
@@ -467,6 +572,68 @@ fn stream_body(
"video: streaming"
);
}
// Web-console capture: build the aggregated sample. The host send side exposes no
// receiver-side packet loss / FEC-recovery / send-buffer EAGAIN counters, so those stay
// 0 (not fabricated); `frames_dropped` is the per-frame send-queue overflow delta.
if stats.is_armed() {
let session_id = *sid.get_or_insert_with(|| {
stats.register_session(
"gamestream",
cfg.width,
cfg.height,
cfg.fps,
codec_name,
client_label,
)
});
let sample = crate::stats_recorder::StatsSample {
t_ms: 0, // stamped by push_sample from the capture's monotonic start
session_id,
stages: vec![
crate::stats_recorder::StageTiming {
name: "capture".into(),
p50_us: percentile(&mut v_cap, 0.50) as f32,
p99_us: percentile(&mut v_cap, 0.99) as f32,
},
crate::stats_recorder::StageTiming {
name: "encode".into(),
p50_us: percentile(&mut v_enc, 0.50) as f32,
p99_us: percentile(&mut v_enc, 0.99) as f32,
},
crate::stats_recorder::StageTiming {
name: "packetize".into(),
p50_us: percentile(&mut v_pkt, 0.50) as f32,
p99_us: percentile(&mut v_pkt, 0.99) as f32,
},
crate::stats_recorder::StageTiming {
name: "send".into(),
p50_us: percentile(&mut v_send, 0.50) as f32,
p99_us: percentile(&mut v_send, 0.99) as f32,
},
],
fps: (uniq as f64 / secs) as f32,
repeat_fps: (fps_count.saturating_sub(uniq) as f64 / secs) as f32,
mbps: (bytes_win as f64 * 8.0 / secs / 1_000_000.0) as f32,
bitrate_kbps: cfg.bitrate_kbps,
frames_dropped: dropped_batches.saturating_sub(last_dropped_batches) as u32,
packets_dropped: 0,
send_dropped: 0,
fec_recovered: 0,
};
stats.push_sample(session_id, sample);
}
mx_cap = 0;
mx_enc = 0;
mx_pkt = 0;
mx_send = 0;
mx_pkts = 0;
uniq = 0;
v_cap.clear();
v_enc.clear();
v_pkt.clear();
v_send.clear();
bytes_win = 0;
last_dropped_batches = dropped_batches;
fps_count = 0;
fps_t = Instant::now();
}

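The sample built in the hunk above derives its rates from per-window counters: unique frames per second, repeated frames per second, goodput in Mb/s, and the dropped-batch delta since the previous window. A minimal sketch of that arithmetic is given below. The `Window` struct and `rates` function are hypothetical, and the reading of `uniq` as "iterations that picked up a newly captured frame" is an assumption, since its full definition lies outside the lines shown.

```rust
/// Hypothetical per-window counters mirroring the loop's accumulators.
struct Window {
    bytes_queued: u64,    // bytes handed to the sender this window
    frames: u32,          // loop iterations (frames submitted)
    unique: u32,          // assumed: iterations that saw a newly captured frame
    dropped_total: u64,   // running dropped-batch counter
    dropped_at_start: u64, // that counter's value when the window began
    secs: f64,            // measured window length in seconds
}

/// Returns `(fps, repeat_fps, mbps, frames_dropped)` for one window.
fn rates(w: &Window) -> (f32, f32, f32, u32) {
    let fps = (w.unique as f64 / w.secs) as f32;
    // `saturating_sub` guards against `unique` exceeding `frames`.
    let repeat_fps = (w.frames.saturating_sub(w.unique) as f64 / w.secs) as f32;
    // bytes -> bits -> per second -> megabits (decimal, 10^6)
    let mbps = (w.bytes_queued as f64 * 8.0 / w.secs / 1_000_000.0) as f32;
    let dropped = w.dropped_total.saturating_sub(w.dropped_at_start) as u32;
    (fps, repeat_fps, mbps, dropped)
}
```

For a one-second window with 2,500,000 bytes queued, 60 iterations of which 45 were unique, and the drop counter moving from 4 to 7, `rates` returns `(45.0, 15.0, 20.0, 3)`. Dividing by the measured `secs` rather than a nominal one second keeps the rates correct when the aggregation boundary fires late.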
@@ -3,7 +3,7 @@
//! `RTP_PACKET(12, big-endian) + reserved[4] + NV_VIDEO_PACKET(16, little-endian) + payload`
//! and the frame's bitstream is prefixed with an 8-byte `video_short_frame_header_t`, then
//! striped into ≤4 FEC blocks of ≤255 shards. Byte-exact spec:
//! `docs/research/gamestream-protocol-research.json` (video plane).
//! `design/research/gamestream-protocol-research.json` (video plane).
//!
//! FEC (P1.5): each block carries `m = ⌈k·pct/100⌉` Reed–Solomon parity shards generated by
//! `punktfunk_core::fec::Gf8Coder` (the nanors-compatible Cauchy GF(2⁸) coder). Crucially, RS runs

@@ -15,6 +15,9 @@
//! `<linux/uinput.h>` on x86_64. `/dev/uinput` needs a udev rule + `input` group membership
//! (see `scripts/60-punktfunk.rules`); creation fails with a clear error otherwise.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use crate::gamestream::gamepad::{self, GamepadFrame, MAX_PADS};
use anyhow::{bail, Result};
use std::collections::HashMap;
@@ -215,6 +218,11 @@ const _: () = {
};

fn ioctl_int(fd: i32, req: libc::c_ulong, arg: libc::c_int, what: &str) -> Result<()> {
// SAFETY: every caller passes one of UI_SET_EVBIT/KEYBIT/FFBIT/UI_DEV_CREATE/UI_DEV_DESTROY as
// `req` — all integer-argument ioctls whose third arg the kernel takes BY VALUE, so nothing is
// dereferenced through `arg` and no memory must outlive the call. The only precondition is `fd`
// being a valid open descriptor; callers pass the live `/dev/uinput` fd, and even a stale fd
// would merely return -1/EBADF (reported below), never UB.
if unsafe { libc::ioctl(fd, req, arg) } < 0 {
bail!("{what}: {}", std::io::Error::last_os_error());
}
@@ -222,6 +230,12 @@ fn ioctl_int(fd: i32, req: libc::c_ulong, arg: libc::c_int, what: &str) -> Resul
}

fn ioctl_ptr<T>(fd: i32, req: libc::c_ulong, arg: *mut T, what: &str) -> Result<()> {
    // SAFETY: `fd` is the caller's live `/dev/uinput` fd. Every call site passes `&mut x` for a live,
    // uniquely-borrowed `#[repr(C)]` `x: T` whose size matches the struct the request number encodes
    // (UI_DEV_SETUP=0x405c_5503 → 0x5c=92=size_of::<UinputSetup>(); UI_ABS_SETUP → 0x1c=28; the FF
    // upload/erase ioctls → 0x68/0x0c — all pinned by the `size_of` asserts above). The kernel copies
    // exactly that many bytes in/out through `arg`; the `&mut` keeps the pointee alive and unaliased
    // for the whole synchronous call.
    if unsafe { libc::ioctl(fd, req, arg) } < 0 {
        bail!("{what}: {}", std::io::Error::last_os_error());
    }

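The SAFETY note on `ioctl_ptr` reads the struct size out of the request number (`0x405c_5503` → `0x5c` = 92). A minimal sketch of that decoding follows, assuming the generic Linux `_IOC` bit layout that x86_64 uses (some other architectures use a different split). `IocFields` and `decode_ioc` are hypothetical names for illustration, not part of this file.

```rust
/// Decoded fields of a Linux ioctl request number under the generic `_IOC` layout
/// (assumed here, as on x86_64): nr = bits 0..8, type = bits 8..16,
/// size = bits 16..30, dir = bits 30..32.
#[derive(Debug, PartialEq, Eq)]
struct IocFields {
    dir: u8,
    size: usize,
    ty: u8,
    nr: u8,
}

/// Hypothetical helper: split a request number into its four bit fields.
fn decode_ioc(req: u64) -> IocFields {
    IocFields {
        dir: ((req >> 30) & 0x3) as u8,
        size: ((req >> 16) & 0x3fff) as usize,
        ty: ((req >> 8) & 0xff) as u8,
        nr: (req & 0xff) as u8,
    }
}
```

Applied to `0x405c_5503`, this gives direction 1 (write), size 92, type `b'U'`, and command number 3, which matches the size quoted in the comment.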
@@ -251,6 +265,9 @@ pub struct VirtualPad {
impl VirtualPad {
    pub fn create(index: usize, identity: PadIdentity) -> Result<VirtualPad> {
        use std::os::fd::FromRawFd;
        // SAFETY: `c"/dev/uinput"` is a 'static NUL-terminated C string literal; `as_ptr()` yields a
        // valid pointer the kernel only reads as a filesystem path. `open` returns a fresh fd (or -1)
        // and retains nothing; no Rust memory is aliased or handed to the kernel beyond that 'static path.
        let raw = unsafe {
            libc::open(
                c"/dev/uinput".as_ptr(),
@@ -264,6 +281,9 @@ impl VirtualPad {
                std::io::Error::last_os_error()
            );
        }
        // SAFETY: `raw >= 0` here (the `< 0` branch above already bailed), so it is a freshly-opened fd
        // from `libc::open` that is not stored or owned anywhere else. Transferring it to `OwnedFd` makes
        // this the unique owner, which will `close` it exactly once on drop (no double-close, no leak).
        let fd = unsafe { OwnedFd::from_raw_fd(raw) };

        ioctl_int(raw, UI_SET_EVBIT, EV_KEY as i32, "UI_SET_EVBIT(EV_KEY)")?;
@@ -356,6 +376,11 @@ impl VirtualPad {
            code,
            value,
        };
        // SAFETY: `ev` is a live local `#[repr(C)]` struct of all-integer fields with no padding bytes
        // (timeval=16 + u16 + u16 + i32 = 24, the size asserted above), so every byte is initialized and
        // valid to read as `u8`. The pointer is non-null and `u8`-aligned (align 1), the length is exactly
        // `size_of::<InputEventRaw>()` so the slice spans precisely `ev`'s bytes (in bounds), and `ev`
        // outlives `bytes` (used immediately below) with no concurrent mutation (single-threaded local).
        let bytes = unsafe {
            std::slice::from_raw_parts(
                &ev as *const _ as *const u8,
@@ -363,6 +388,10 @@ impl VirtualPad {
            )
        };
        // Best-effort: a full kernel queue drops the event; the next frame re-syncs state.
        // SAFETY: `self.fd` is the live uinput `OwnedFd` (borrowed via `as_raw_fd`, so it stays open for
        // the call); `bytes` is the slice above backed by the still-live local `ev`. `write` only READS
        // exactly `bytes.len()` bytes from `bytes.as_ptr()` (in bounds) and retains nothing past return,
        // so the buffer outlives the synchronous call and the read-only access cannot race or alias.
        let _ = unsafe {
            libc::write(
                self.fd.as_raw_fd(),
@@ -404,6 +433,10 @@ impl VirtualPad {
        let raw = self.fd.as_raw_fd();
        let mut buf = [0u8; std::mem::size_of::<InputEventRaw>()];
        loop {
            // SAFETY: `raw` is the live raw fd of `self.fd` (the non-blocking uinput device). `buf` is a
            // live local `[u8; size_of::<InputEventRaw>()]`; `buf.as_mut_ptr()` is a valid writable pointer
            // to its `buf.len()` bytes. `read` writes AT MOST `buf.len()` bytes (in bounds), the buffer
            // outlives this synchronous call, and `buf` is borrowed uniquely here (no alias/race).
            let n = unsafe { libc::read(raw, buf.as_mut_ptr() as *mut libc::c_void, buf.len()) };
            if n != buf.len() as isize {
                break; // EAGAIN / short read — queue drained
@@ -415,6 +448,10 @@ impl VirtualPad {
                unsafe { std::ptr::read_unaligned(buf.as_ptr() as *const InputEventRaw) };
            match (ev.type_, ev.code) {
                (EV_UINPUT, UI_FF_UPLOAD) => {
                    // SAFETY: `UinputFfUpload` is `#[repr(C)]` over integers (`u32`, `i32`) and two
                    // `FfEffect`s (integers + `[u8; 32]`); all-zero is a valid bit pattern for every field
                    // (no bool/NonZero/enum/reference niche), so `zeroed` yields a fully-initialized valid
                    // value — `request_id` is then set below and the rest filled by UI_BEGIN_FF_UPLOAD.
                    let mut up: UinputFfUpload = unsafe { std::mem::zeroed() };
                    up.request_id = ev.value as u32;
                    if ioctl_ptr(raw, UI_BEGIN_FF_UPLOAD, &mut up, "UI_BEGIN_FF_UPLOAD").is_ok() {
@@ -442,6 +479,9 @@ impl VirtualPad {
                    }
                }
                (EV_UINPUT, UI_FF_ERASE) => {
                    // SAFETY: `UinputFfErase` is `#[repr(C)]` over three integer fields (`u32`, `i32`,
                    // `u32`); all-zero is a valid bit pattern for each, so `zeroed` produces a fully-valid
                    // initialized value — `request_id` is set below and `effect_id` filled by the ioctl.
                    let mut er: UinputFfErase = unsafe { std::mem::zeroed() };
                    er.request_id = ev.value as u32;
                    if ioctl_ptr(raw, UI_BEGIN_FF_ERASE, &mut er, "UI_BEGIN_FF_ERASE").is_ok() {
@@ -492,6 +532,9 @@ impl VirtualPad {

impl Drop for VirtualPad {
    fn drop(&mut self) {
        // SAFETY: `self.fd` is still the live owned uinput fd here (the `OwnedFd` field is closed only
        // AFTER this `drop` body returns), borrowed by `as_raw_fd`. UI_DEV_DESTROY takes its argument
        // (0) BY VALUE, so nothing is dereferenced or aliased; the ioctl just tears down the device.
        let _ = unsafe { libc::ioctl(self.fd.as_raw_fd(), UI_DEV_DESTROY, 0) };
    }
}

@@ -5,6 +5,9 @@
//! keymap, and translate events into virtual pointer/keyboard requests, tracking modifier state
//! so the compositor resolves shifted keysyms correctly.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{gs_button_to_evdev, vk_to_evdev, InputEvent, InputInjector};
use anyhow::{bail, Context, Result};
use punktfunk_core::input::InputKind;
@@ -264,10 +267,17 @@ impl InputInjector for WlrootsInjector {
/// Create an anonymous in-memory file holding `s` + a trailing NUL (for the keymap fd).
fn memfd_with(s: &str) -> Result<std::fs::File> {
    let name = b"punktfunk-keymap\0";
    // SAFETY: `name` is a byte-string literal with an explicit trailing NUL, so `name.as_ptr()` is a
    // valid NUL-terminated C string; `memfd_create` only reads that name (copying it) and creates an
    // anonymous file, returning a fresh fd (or -1). `MFD_CLOEXEC` is a valid flag. The 'static literal
    // outlives the synchronous call and nothing aliases it. The result is checked `< 0` below.
    let fd = unsafe { libc::memfd_create(name.as_ptr() as *const libc::c_char, libc::MFD_CLOEXEC) };
    if fd < 0 {
        bail!("memfd_create failed: {}", std::io::Error::last_os_error());
    }
    // SAFETY: `fd` is the fresh memfd `memfd_create` just returned and checked `>= 0`; it is a unique
    // open fd nothing else owns, so `File` takes sole ownership and closes it exactly once on drop —
    // no alias, no double-close.
    let mut f = unsafe { std::fs::File::from_raw_fd(fd) };
    f.write_all(s.as_bytes()).context("write keymap")?;
    f.write_all(&[0]).context("write keymap NUL")?;

@@ -41,7 +41,8 @@ pub(super) const SHM_MAGIC: u32 = pf_driver_proto::gamepad::PAD_MAGIC; // "PFDS"
pub(super) const OFF_INPUT: usize = core::mem::offset_of!(pf_driver_proto::gamepad::PadShm, input);
pub(super) const OFF_OUT_SEQ: usize =
    core::mem::offset_of!(pf_driver_proto::gamepad::PadShm, out_seq);
pub(super) const OFF_OUTPUT: usize = core::mem::offset_of!(pf_driver_proto::gamepad::PadShm, output);
pub(super) const OFF_OUTPUT: usize =
    core::mem::offset_of!(pf_driver_proto::gamepad::PadShm, output);
/// Device-type selector the driver reads to choose which HID identity/descriptor it serves: 0 =
/// DualSense (the default — the section is zeroed), 1 = DualShock 4.
pub(super) const OFF_DEVTYPE: usize =
@@ -108,7 +109,7 @@ pub(super) struct SwDeviceProfile<'a> {
/// `profile.instance`). The returned `HSWDEVICE` owns it — `SwDeviceClose` removes it on drop, so the
/// pad appears/disappears with the session and nothing persists.
///
/// **Game-detection identity** (see `docs/windows-dualsense-game-detection.md`). `HIDD_ATTRIBUTES`
/// **Game-detection identity** (see `design/windows-dualsense-game-detection.md`). `HIDD_ATTRIBUTES`
/// alone (VID/PID via the IOCTL) satisfies SDL/HIDAPI/RawInput, but a native PS5 path (libScePad-
/// style raw HID) classifies the *connection type* by walking from the HID child to its parent
/// (`CM_Get_Parent`) and string-matching `"USB"`/`"BTHENUM"` in that parent's

@@ -187,8 +187,10 @@ impl XusbWinPad {
    #[allow(clippy::too_many_arguments)]
    fn write_state(&mut self, buttons: u16, lt: u8, rt: u8, lx: i16, ly: i16, rx: i16, ry: i16) {
        self.packet = self.packet.wrapping_add(1);
        // SAFETY: base points at SHM_SIZE bytes; all offsets are in range.
        let base = self.shm.base();
        // SAFETY: `base` is the start of the mapped section (`SHM_SIZE` bytes, owned by `Shm`); every
        // `OFF_*` is a fixed in-range offset into it and `write_unaligned` handles the unaligned field
        // writes. Single owner (`&mut self`), so no concurrent writer races these stores.
        unsafe {
            std::ptr::write_unaligned(base.add(OFF_BUTTONS) as *mut u16, buttons);
            *base.add(OFF_LT) = lt;

@@ -5,6 +5,9 @@
//! thread stays bound to its desktop and only reattaches (`OpenInputDesktop`/`SetThreadDesktop`) when
//! `SendInput` reports a short write (the input desktop switched) — no per-event reattach overhead.

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]

use anyhow::Result;
use punktfunk_core::input::{InputEvent, InputKind};
use std::mem::size_of;
@@ -35,7 +38,12 @@ pub struct SendInputInjector {
    desktop: Option<HDESK>,
}

// Only ever used from the host's single injector thread.
// SAFETY: `SendInputInjector` holds only an `Option<HDESK>` (a desktop handle). The host creates
// and drives it from a single dedicated injector thread; the handle is opened, rebound, and closed
// on whichever thread owns the value, and the type is not `Sync`, so there is never concurrent
// access. A desktop `HDESK` is not thread-affine for ownership (`CloseDesktop` works from any
// thread; `SetThreadDesktop` rebinds the current thread), so transferring ownership via `Send` is
// sound.
unsafe impl Send for SendInputInjector {}

impl SendInputInjector {
@@ -49,6 +57,12 @@ impl SendInputInjector {
    /// Bind this thread to the desktop currently receiving input. UAC / lock screen / Ctrl-Alt-Del
    /// swap the input desktop; `SendInput` silently no-ops unless our thread is on it.
    fn reattach_input_desktop(&mut self) {
        // SAFETY: `OpenInputDesktop`/`SetThreadDesktop`/`CloseDesktop` are FFI calls passed only
        // by-value args (constant desktop flags, a `bool`, an access mask). `OpenInputDesktop`
        // yields an owned `HDESK` only on `Ok`; we then either install it with `SetThreadDesktop`
        // (closing the previously-owned handle exactly once) or close the fresh handle on failure —
        // so every handle is closed exactly once and none is used after close. `SetThreadDesktop`
        // only rebinds this calling thread, which is where the injector runs.
        unsafe {
            match OpenInputDesktop(
                DESKTOP_CONTROL_FLAGS(0),
@@ -75,12 +89,17 @@ impl SendInputInjector {
    /// switched out from under us, e.g. into UAC/lock) do we reattach to the now-current input desktop
    /// and retry once. This serves both the normal and secure desktops with no steady-state overhead.
    fn send(&mut self, inputs: &[INPUT]) -> Result<()> {
        // SAFETY: `inputs` is a live `&[INPUT]` slice that outlives this synchronous `SendInput`
        // call; `size_of::<INPUT>()` is the exact per-element stride Win32 requires as `cbSize`. The
        // call only reads the array (one event per element) and returns the count injected.
        let n = unsafe { SendInput(inputs, size_of::<INPUT>() as i32) };
        if n as usize == inputs.len() {
            return Ok(());
        }
        // Short write → the input desktop likely changed. Reattach + retry once.
        self.reattach_input_desktop();
        // SAFETY: same as the first `SendInput` — `inputs` is the identical live slice outliving the
        // call and `cbSize == size_of::<INPUT>()`; only re-issued after reattaching the input desktop.
        let n = unsafe { SendInput(inputs, size_of::<INPUT>() as i32) };
        if n as usize != inputs.len() {
            anyhow::bail!(
@@ -95,6 +114,9 @@ impl SendInputInjector {
impl Drop for SendInputInjector {
    fn drop(&mut self) {
        if let Some(h) = self.desktop.take() {
            // SAFETY: `h` is the `HDESK` this injector owned (moved out of `self.desktop`);
            // `CloseDesktop` runs once here in `Drop` on that still-valid handle, with no later use —
            // no double close.
            unsafe {
                let _ = CloseDesktop(h);
            }
@@ -216,7 +238,11 @@ impl InputInjector for SendInputInjector {
            }
            InputKind::KeyDown | InputKind::KeyUp => {
                let down = event.kind == InputKind::KeyDown;
                let vk = (event.code & 0xff) as u16; // client sends Windows VK
                // client sends Windows VK
                let vk = (event.code & 0xff) as u16;
                // SAFETY: `MapVirtualKeyExW` is a pure value translation (VK → scancode); all three
                // args are by-value (`u32`, the `MAPVK_VK_TO_VSC_EX` map-type constant, a `None`
                // HKL). It dereferences no pointer and returns a `u32` — FFI-`unsafe` only.
                let sc_ex = unsafe { MapVirtualKeyExW(vk as u32, MAPVK_VK_TO_VSC_EX, None) };
                if sc_ex == 0 {
                    return Ok(()); // unmappable -> drop
@@ -264,6 +290,8 @@ fn key(ki: KEYBDINPUT) -> INPUT {
}

fn virtual_desktop_rect() -> (i32, i32, i32, i32) {
    // SAFETY: each `GetSystemMetrics` takes a single by-value `SYSTEM_METRICS_INDEX` constant and
    // returns an `i32`; it dereferences no pointer and has no side effects — FFI-`unsafe` only.
    unsafe {
        (
            GetSystemMetrics(SM_XVIRTUALSCREEN),

@@ -548,6 +548,621 @@ fn heroic_launch_prefix() -> Option<String> {
    flatpak.then(|| "flatpak run com.heroicgameslauncher.hgl".into())
}

// ---------------------------------------------------------------------------------------
// Epic Games Store (Windows) — reads the launcher's local `.item` manifests under ProgramData
// (no auth, launcher need not run). Cover art from the base64 `catcache.bin` (public Epic CDN).
// ---------------------------------------------------------------------------------------

/// Reads the Epic Games Launcher's local install manifests. Windows-only. Best-effort: empty when
/// the launcher (or its manifest dir) isn't present.
#[cfg(windows)]
pub struct EpicProvider;

#[cfg(windows)]
impl LibraryProvider for EpicProvider {
    fn store(&self) -> &'static str {
        "epic"
    }

    fn list(&self) -> Vec<GameEntry> {
        let data = epic_data_dir();
        let Ok(rd) = std::fs::read_dir(data.join("Manifests")) else {
            return Vec::new();
        };
        // Parse the (best-effort) artwork cache ONCE: catalogItemId -> Artwork.
        let art = epic_art_index(&data.join("Catalog").join("catcache.bin"));
        let mut games = Vec::new();
        for entry in rd.flatten() {
            let p = entry.path();
            if p.extension().and_then(|e| e.to_str()) != Some("item") {
                continue;
            }
            let Ok(text) = std::fs::read_to_string(&p) else {
                continue;
            };
            let Ok(v) = serde_json::from_str::<serde_json::Value>(&text) else {
                continue;
            };
            if let Some(g) = epic_entry(&v, &art) {
                games.push(g);
            }
        }
        games
    }
}

/// `%ProgramData%\Epic\EpicGamesLauncher\Data` (machine-wide, SYSTEM-readable).
#[cfg(windows)]
fn epic_data_dir() -> PathBuf {
    std::env::var_os("ProgramData")
        .map(PathBuf::from)
        .unwrap_or_else(|| PathBuf::from("C:\\ProgramData"))
        .join("Epic")
        .join("EpicGamesLauncher")
        .join("Data")
}

/// Map one `.item` manifest to a [`GameEntry`], or `None` if it isn't a launchable game. Uses
/// Playnite's proven EXCLUSION filter (skip `UE_*` Unreal components; skip a DLC/addon unless it is
/// `addons/launchable`) rather than a positive `games`-category match, which can drop legit titles.
#[cfg(windows)]
fn epic_entry(
    v: &serde_json::Value,
    art: &std::collections::HashMap<String, Artwork>,
) -> Option<GameEntry> {
    let s = |k: &str| v.get(k).and_then(|x| x.as_str());
    let app_name = s("AppName")?.to_string();
    if app_name.starts_with("UE_") {
        return None; // Unreal Engine component, not a game
    }
    let cats: Vec<&str> = v
        .get("AppCategories")
        .and_then(|c| c.as_array())
        .map(|a| a.iter().filter_map(|x| x.as_str()).collect())
        .unwrap_or_default();
    if cats.contains(&"addons") && !cats.contains(&"addons/launchable") {
        return None; // non-launchable DLC/addon
    }
    // Drop stale records whose install dir is gone.
    let install = s("InstallLocation")?;
    if !Path::new(install).is_dir() {
        return None;
    }
    let title = s("DisplayName").unwrap_or(&app_name).to_string();
    let namespace = s("CatalogNamespace").unwrap_or("");
    let catalog = s("CatalogItemId").unwrap_or("");
    // The robust launch form is the namespace:catalogItemId:appName triple; fall back to the bare
    // appName when those ids are absent (some manifests lack them) — never drop the launch entirely.
    let value = if !namespace.is_empty() && !catalog.is_empty() {
        format!("{namespace}:{catalog}:{app_name}")
    } else {
        app_name.clone()
    };
    Some(GameEntry {
        id: format!("epic:{app_name}"),
        store: "epic".into(),
        title,
        art: art.get(catalog).cloned().unwrap_or_default(),
        launch: Some(LaunchSpec {
            kind: "epic".into(),
            value,
        }),
    })
}

/// Best-effort parse of `catcache.bin` (base64-encoded JSON array of catalog items) into
/// catalogItemId → [`Artwork`] from each item's `keyImages`. Empty map on any read/decode failure
/// (the format is community-reverse-engineered + can lag a fresh install → titles just show no art).
#[cfg(windows)]
fn epic_art_index(catcache: &Path) -> std::collections::HashMap<String, Artwork> {
    use base64::Engine as _;
    let mut map = std::collections::HashMap::new();
    let Ok(raw) = std::fs::read(catcache) else {
        return map;
    };
    let Ok(decoded) = base64::engine::general_purpose::STANDARD.decode(raw) else {
        return map;
    };
    let Ok(items) = serde_json::from_slice::<serde_json::Value>(&decoded) else {
        return map;
    };
    let Some(arr) = items.as_array() else {
        return map;
    };
    for item in arr {
        let Some(cat) = item
            .get("id")
            .or_else(|| item.get("catalogItemId"))
            .and_then(|v| v.as_str())
        else {
            continue;
        };
        let Some(images) = item.get("keyImages").and_then(|v| v.as_array()) else {
            continue;
        };
        let mut art = Artwork::default();
        for img in images {
            let (Some(ty), Some(url)) = (
                img.get("type").and_then(|v| v.as_str()),
                img.get("url").and_then(|v| v.as_str()),
            ) else {
                continue;
            };
            if !(url.starts_with("http://") || url.starts_with("https://")) {
                continue;
            }
            match ty {
                "DieselGameBoxTall" => art.portrait = Some(url.to_string()),
                "DieselGameBox" => art.hero = Some(url.to_string()),
                "DieselGameBoxLogo" => art.logo = Some(url.to_string()),
                _ => {}
            }
        }
        if art.portrait.is_some() || art.hero.is_some() || art.logo.is_some() {
            map.insert(cat.to_string(), art);
        }
    }
    map
}

/// Build the `com.epicgames.launcher://` launch URI from a stored launch value — the triple
/// `<namespace>:<catalogItemId>:<appName>` (colons URL-encoded), or a bare `<appName>` fallback.
/// Each part is charset-validated (host-derived, but belt-and-suspenders) so no shell/URI injection.
#[cfg(windows)]
fn epic_launch_uri(value: &str) -> Option<String> {
    let ok = |s: &str| {
        !s.is_empty()
            && s.bytes()
                .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
    };
    let inner = match value.split(':').collect::<Vec<_>>().as_slice() {
        [ns, cat, app] if ok(ns) && ok(cat) && ok(app) => format!("{ns}%3A{cat}%3A{app}"),
        [app] if ok(app) => (*app).to_string(),
        _ => return None,
    };
    Some(format!(
        "com.epicgames.launcher://apps/{inner}?action=launch&silent=true"
    ))
}

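As a worked illustration of the URI rules described for `epic_launch_uri`, here is a standalone sketch of the same logic without the `#[cfg(windows)]` gate so it can run anywhere. The helper `part_ok` is a hypothetical name (the diff uses an inline closure), and the sample ids used with it are made up.

```rust
/// Charset check applied to every part: non-empty, ASCII alphanumerics plus `.`, `_`, `-`.
/// Hypothetical free function standing in for the inline closure in the diff.
fn part_ok(s: &str) -> bool {
    !s.is_empty()
        && s.bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}

/// Portable copy of the launch-URI rules: accept either a three-part
/// `namespace:catalogItemId:appName` value or a bare `appName`; reject anything else.
fn epic_launch_uri(value: &str) -> Option<String> {
    let parts: Vec<&str> = value.split(':').collect();
    let inner = match parts.as_slice() {
        // The colons between the three ids are emitted URL-encoded as %3A.
        [ns, cat, app] if part_ok(ns) && part_ok(cat) && part_ok(app) => {
            format!("{ns}%3A{cat}%3A{app}")
        }
        [app] if part_ok(app) => (*app).to_string(),
        _ => return None,
    };
    Some(format!(
        "com.epicgames.launcher://apps/{inner}?action=launch&silent=true"
    ))
}
```

A two-part value such as `a:b`, an empty string, or any part containing a space or slash is rejected, so nothing outside the allowed charset reaches the URI.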
// ---------------------------------------------------------------------------------------
// GOG (Windows) — registry-indexed installs + each game's `goggame-<id>.info` for a direct-exe
// launch (no Galaxy needed, dodges its cold-start/anti-cheat). Art (api.gog.com) is a follow-up.
// ---------------------------------------------------------------------------------------

/// Reads the GOG.com install registry + per-game `.info` files. Windows-only. Best-effort: empty
/// when GOG isn't installed.
#[cfg(windows)]
pub struct GogProvider;

#[cfg(windows)]
impl LibraryProvider for GogProvider {
    fn store(&self) -> &'static str {
        "gog"
    }

    fn list(&self) -> Vec<GameEntry> {
        gog_games()
    }
}

#[cfg(windows)]
fn gog_games() -> Vec<GameEntry> {
    use winreg::enums::HKEY_LOCAL_MACHINE;
    use winreg::RegKey;
    // 32-bit GOG writes under WOW6432Node; a 64-bit process reads the explicit path directly.
    let Ok(games_key) =
        RegKey::predef(HKEY_LOCAL_MACHINE).open_subkey("SOFTWARE\\WOW6432Node\\GOG.com\\Games")
    else {
        return Vec::new();
    };
    let mut out = Vec::new();
    for sub in games_key.enum_keys().flatten() {
        // The subkey name IS the GOG product id.
        let Ok(k) = games_key.open_subkey(&sub) else {
            continue;
        };
        let Ok(path) = k.get_value::<String, _>("PATH") else {
            continue;
        };
        if !Path::new(&path).is_dir() {
            continue;
        }
        let title = k
            .get_value::<String, _>("GAMENAME")
            .unwrap_or_else(|_| sub.clone());
        // Resolve the primary play task (exe + args + workdir) from goggame-<id>.info; skip if absent.
        let Some((exe, args, workdir)) = gog_play_task(&path, &sub) else {
            continue;
        };
        let id = format!("gog:{sub}");
        // Art (public api.gog.com) is resolved off the hot path by the background warmer; read
        // whatever it has cached (title-only until warmed).
        let art = cached_art(&id).unwrap_or_default();
        out.push(GameEntry {
            id,
            store: "gog".into(),
            title,
            art,
            launch: Some(LaunchSpec {
                kind: "gog".into(),
                value: format!("{exe}\t{args}\t{workdir}"),
            }),
        });
    }
    out
}

/// The primary play task from `<install>\goggame-<id>.info`: `(absolute exe, args, working dir)`.
/// Prefers `isPrimary` + `FileTask`, else the first `FileTask`. Paths are resolved against `install`.
#[cfg(windows)]
fn gog_play_task(install: &str, id: &str) -> Option<(String, String, String)> {
    let text =
        std::fs::read_to_string(Path::new(install).join(format!("goggame-{id}.info"))).ok()?;
    let v: serde_json::Value = serde_json::from_str(&text).ok()?;
    let tasks = v.get("playTasks")?.as_array()?;
    let is_file =
        |t: &serde_json::Value| t.get("type").and_then(|s| s.as_str()) == Some("FileTask");
    let pick = tasks
        .iter()
        .find(|t| {
            t.get("isPrimary")
                .and_then(|b| b.as_bool())
                .unwrap_or(false)
                && is_file(t)
        })
        .or_else(|| tasks.iter().find(|t| is_file(t)))?;
    let rel = pick.get("path").and_then(|s| s.as_str())?;
    let exe = Path::new(install).join(rel);
    let args = pick
        .get("arguments")
        .and_then(|s| s.as_str())
        .unwrap_or("")
        .to_string();
    let workdir = pick
        .get("workingDir")
        .and_then(|s| s.as_str())
        .map(|w| Path::new(install).join(w))
        .unwrap_or_else(|| Path::new(install).to_path_buf());
    Some((
        exe.to_string_lossy().into_owned(),
        args,
        workdir.to_string_lossy().into_owned(),
    ))
}

/// Build the spawn `(command line, working dir)` for a `gog` launch value (`exe \t args \t workdir`,
/// all host-resolved from the operator's own disk). Direct exe — no shell, no Galaxy.
#[cfg(windows)]
fn gog_spawn(value: &str) -> Option<(String, Option<PathBuf>)> {
    let mut parts = value.split('\t');
    let exe = parts.next().filter(|s| !s.is_empty())?;
    let args = parts.next().unwrap_or("");
    let workdir = parts.next().filter(|s| !s.is_empty()).map(PathBuf::from);
    let cmdline = if args.trim().is_empty() {
        format!("\"{exe}\"")
    } else {
        format!("\"{exe}\" {args}")
    };
    Some((cmdline, workdir))
}

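The tab-separated `gog` launch value is written by `gog_games` and read back by `gog_spawn`. A minimal round-trip sketch of that encoding follows, under the assumption that none of the three fields contains a tab character. `gog_pack` and `gog_unpack` are hypothetical names: `gog_unpack` copies the splitting rules of `gog_spawn` without the `#[cfg(windows)]` gate, and the sample paths are invented.

```rust
use std::path::PathBuf;

/// Pack the three fields the way the `gog` launch value stores them (tab-separated).
/// Hypothetical helper; the diff builds this string inline with `format!`.
fn gog_pack(exe: &str, args: &str, workdir: &str) -> String {
    format!("{exe}\t{args}\t{workdir}")
}

/// Same splitting rules as `gog_spawn`: a missing or empty exe rejects the value,
/// empty args yield a quoted exe only, and an empty workdir becomes `None`.
fn gog_unpack(value: &str) -> Option<(String, Option<PathBuf>)> {
    let mut parts = value.split('\t');
    let exe = parts.next().filter(|s| !s.is_empty())?;
    let args = parts.next().unwrap_or("");
    let workdir = parts.next().filter(|s| !s.is_empty()).map(PathBuf::from);
    let cmdline = if args.trim().is_empty() {
        format!("\"{exe}\"")
    } else {
        format!("\"{exe}\" {args}")
    };
    Some((cmdline, workdir))
}
```

Packing `C:\Games\Demo\demo.exe`, `-windowed`, and `C:\Games\Demo` and unpacking the result yields the quoted exe followed by `-windowed`, with the working directory preserved.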
// ---------------------------------------------------------------------------------------
// Xbox / Microsoft Store / Game Pass (Windows) — scans the flat-file `XboxGames` install dirs
// (no auth) for GDK games (each has a Content\MicrosoftGame.config). Launch via the AUMID
// (shell:AppsFolder\<PFN>!<AppId>) in the interactive session. Cover art (displaycatalog) deferred.
// ---------------------------------------------------------------------------------------

/// Reads installed Xbox / Game Pass / Store GDK games from the flat-file install dirs. Windows-only.
/// Best-effort: empty when no `XboxGames` dir exists.
#[cfg(windows)]
pub struct XboxProvider;

#[cfg(windows)]
impl LibraryProvider for XboxProvider {
    fn store(&self) -> &'static str {
        "xbox"
    }

    fn list(&self) -> Vec<GameEntry> {
        xbox_games()
    }
}

/// Scan each fixed drive's default `<drive>:\XboxGames` for GDK games — the presence of
/// `Content\MicrosoftGame.config` is the game marker (so we list games, not ordinary UWP apps). A
/// custom install folder (set via the undocumented `.GamingRoot`) isn't covered; the default folder
/// is the common case. Non-GDK pure-UWP Store games (under the ACL-locked WindowsApps) are missed too.
#[cfg(windows)]
fn xbox_games() -> Vec<GameEntry> {
    let mut games = Vec::new();
    for letter in b'C'..=b'Z' {
        let root = PathBuf::from(format!("{}:\\XboxGames", letter as char));
        let Ok(rd) = std::fs::read_dir(&root) else {
            continue;
        };
        for entry in rd.flatten() {
            let title_dir = entry.path();
            let cfg = title_dir.join("Content").join("MicrosoftGame.config");
            if !cfg.is_file() {
                continue;
            }
            let Ok(text) = std::fs::read_to_string(&cfg) else {
                continue;
            };
            let folder = title_dir
                .file_name()
                .map(|f| f.to_string_lossy().into_owned());
            let Some((name, app_id, title, store_id)) = xbox_parse_config(&text, folder.as_deref())
            else {
                continue;
            };
            let Some(pfn) = xbox_pfn(&name) else {
                tracing::debug!(package = %name, "xbox: no AppRepository entry → can't resolve PFN, skipping");
                continue;
            };
            let id_key = if store_id.is_empty() {
                pfn.clone()
            } else {
                store_id
            };
            let id = format!("xbox:{id_key}");
            // Art (unofficial displaycatalog, keyed by StoreId) is resolved off the hot path by the
            // background warmer; read whatever it has cached (title-only until warmed / if no StoreId).
            let art = cached_art(&id).unwrap_or_default();
            games.push(GameEntry {
                id,
                store: "xbox".into(),
                title,
                art,
                launch: Some(LaunchSpec {
                    kind: "aumid".into(),
                    value: format!("{pfn}!{app_id}"),
                }),
            });
        }
    }
    games.sort_by(|a, b| a.id.cmp(&b.id));
    games.dedup_by(|a, b| a.id == b.id); // same game on two drives → one entry
    games
}

/// Parse the fields we need from a `MicrosoftGame.config`: `(Identity Name, AppId, title, StoreId)`.
|
||||
/// AppId is the `<Executable>`'s `Id` (the AUMID app id, typically "Game"). The title prefers
|
||||
/// `ShellVisuals@DefaultDisplayName`, but that can be an unresolved `ms-resource:` ref → fall back to
|
||||
/// the install folder name, then the package name.
|
||||
#[cfg(windows)]
|
||||
fn xbox_parse_config(text: &str, folder: Option<&str>) -> Option<(String, String, String, String)> {
|
||||
let doc = roxmltree::Document::parse(text).ok()?;
|
||||
let root = doc.root_element();
|
||||
let name = root
|
||||
.children()
|
||||
.find(|n| n.has_tag_name("Identity"))?
|
||||
.attribute("Name")?
|
||||
.to_string();
|
||||
let app_id = root
|
||||
.children()
|
||||
.find(|n| n.has_tag_name("ExecutableList"))
|
||||
.and_then(|el| {
|
||||
el.children()
|
||||
.filter(|n| n.has_tag_name("Executable"))
|
||||
.find_map(|e| e.attribute("Id"))
|
||||
})?
|
||||
.to_string();
|
||||
let ddn = root
|
||||
.children()
|
||||
.find(|n| n.has_tag_name("ShellVisuals"))
|
||||
.and_then(|sv| sv.attribute("DefaultDisplayName"))
|
||||
.filter(|s| !s.is_empty() && !s.starts_with("ms-resource"));
|
||||
let title = ddn
|
||||
.map(String::from)
|
||||
.or_else(|| folder.map(String::from))
|
||||
.unwrap_or_else(|| name.clone());
|
||||
let store_id = root
|
||||
.children()
|
||||
.find(|n| n.has_tag_name("StoreId"))
|
||||
.and_then(|n| n.text())
|
||||
.unwrap_or("")
|
||||
.to_string();
|
||||
Some((name, app_id, title, store_id))
|
||||
}
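
The title fallback that `xbox_parse_config` documents (display name, then install folder, then package name) can be sketched in isolation. This is a minimal illustration, not the code from the change: `pick_title` and its parameter names are hypothetical, and it takes already-extracted strings rather than parsing XML.

```rust
/// Hypothetical helper mirroring the documented title fallback order.
fn pick_title(display_name: Option<&str>, folder: Option<&str>, package: &str) -> String {
    display_name
        // An empty or `ms-resource:` value is an unresolved reference, not a usable title.
        .filter(|s| !s.is_empty() && !s.starts_with("ms-resource"))
        // Next choice: the install folder name, if the caller knows it.
        .or(folder)
        // Last resort: the package identity name, which is always present.
        .unwrap_or(package)
        .to_string()
}
```

Keeping the chain on `Option<&str>` until the final `to_string()` means only the chosen title is allocated.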

/// Resolve a package's PackageFamilyName by finding its
/// `AppRepository\Packages\<PackageFullName>` dir (machine-wide, SYSTEM-readable) and reducing the
/// full name to `Name_PublisherHash`. This READS the authoritative PFN — never compute the hash.
#[cfg(windows)]
fn xbox_pfn(identity: &str) -> Option<String> {
    let pkgs = PathBuf::from(std::env::var_os("ProgramData")?)
        .join("Microsoft")
        .join("Windows")
        .join("AppRepository")
        .join("Packages");
    let prefix = format!("{identity}_");
    for e in std::fs::read_dir(&pkgs).ok()?.flatten() {
        let dn = e.file_name().to_string_lossy().into_owned();
        if dn.starts_with(&prefix) {
            if let Some(pfn) = pfn_from_full(&dn, identity) {
                return Some(pfn);
            }
        }
    }
    None
}

/// PackageFamilyName from a PackageFullName dir name
/// (`Name_Version_Arch_ResourceId_PublisherHash`) → `Name_PublisherHash`. The hash is the last
/// `_`-segment; `Name` is the caller's identity.
#[cfg(windows)]
fn pfn_from_full(dir_name: &str, identity: &str) -> Option<String> {
    let hash = dir_name.rsplit('_').next()?;
    (!hash.is_empty() && hash != dir_name).then(|| format!("{identity}_{hash}"))
}
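
As a rough illustration of the full-name layout described for `pfn_from_full`, the sketch below splits a PackageFullName into its five `_`-separated fields and keeps the first and last. `family_name_from_full` is a hypothetical name, and unlike the function above it takes `Name` from the directory name instead of from the caller, which assumes that no field contains `_`.

```rust
/// Hypothetical variant: derive `Name_PublisherHash` from the full name alone.
fn family_name_from_full(full_name: &str) -> Option<String> {
    // Layout: Name_Version_Arch_ResourceId_PublisherHash (ResourceId is often empty,
    // which shows up as a double underscore).
    let parts: Vec<&str> = full_name.split('_').collect();
    if parts.len() != 5 || parts[0].is_empty() || parts[4].is_empty() {
        return None; // not a recognisable PackageFullName under this assumption
    }
    Some(format!("{}_{}", parts[0], parts[4]))
}
```

The stricter five-field check rejects directory names that merely share a prefix, at the cost of depending on the assumed layout.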

// ---------------------------------------------------------------------------------------
// Cover-art resolver + cache (shared by the Windows GOG + Xbox providers, which have no local
// art). A disk cache is the source of truth read by all_games() (so the list/launch path never
// blocks on the network); a host-lifetime background warmer fetches uncached art (GOG's public
// api.gog.com + Xbox's displaycatalog, both no-auth) and persists it. Cross-platform so the
// HTTP/JSON code is compiled + checked everywhere; the warmer simply finds nothing to fetch on a
// host whose stores all carry their own art (Steam CDN / Heroic CDN / Lutris data: URLs).
// ---------------------------------------------------------------------------------------

/// The persisted art cache: GameEntry id → resolved [`Artwork`]. An entry's PRESENCE means "already
/// resolved" (even an empty Artwork = fetched, none found) so the warmer never re-fetches it.
fn art_cache() -> &'static std::sync::Mutex<std::collections::HashMap<String, Artwork>> {
    static CACHE: std::sync::OnceLock<
        std::sync::Mutex<std::collections::HashMap<String, Artwork>>,
    > = std::sync::OnceLock::new();
    CACHE.get_or_init(|| {
        let loaded = std::fs::read_to_string(art_cache_path())
            .ok()
            .and_then(|s| serde_json::from_str(&s).ok())
            .unwrap_or_default();
        std::sync::Mutex::new(loaded)
    })
}

/// The art cache lives in the canonical HOST config dir (`%ProgramData%\punktfunk` on Windows /
/// `~/.config/punktfunk` on Linux — gamestream::config_dir, NOT the legacy XDG/HOME `config_dir`
/// below that the custom store still uses).
fn art_cache_path() -> PathBuf {
    crate::gamestream::config_dir().join("library-art-cache.json")
}

/// The cached art for a library id, if it has been resolved (positive or negative). `None` = not yet
/// warmed → the provider shows title-only until the warmer fills it in.
fn cached_art(id: &str) -> Option<Artwork> {
    art_cache().lock().unwrap().get(id).cloned()
}

/// Record resolved art for a library id + persist the cache (write-then-rename; best-effort).
fn store_art(id: &str, art: Artwork) {
    let mut cache = art_cache().lock().unwrap();
    cache.insert(id.to_string(), art);
    if let Ok(json) = serde_json::to_string(&*cache) {
        let path = art_cache_path();
        if let Some(dir) = path.parent() {
            let _ = std::fs::create_dir_all(dir);
        }
        let tmp = path.with_extension("json.tmp");
        if std::fs::write(&tmp, json).is_ok() {
            let _ = std::fs::rename(&tmp, &path);
        }
    }
}
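
The write-then-rename step in `store_art` can be isolated into a small helper. This is a sketch under assumptions: `write_atomic` is a hypothetical name, and it reports errors to the caller instead of ignoring them as the best-effort code above does.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: replace `path` with `contents` via a sibling temp file.
fn write_atomic(path: &Path, contents: &str) -> io::Result<()> {
    if let Some(dir) = path.parent() {
        fs::create_dir_all(dir)?; // make sure the target directory exists
    }
    // Same directory as the target, so the rename stays on one filesystem.
    let tmp = path.with_extension("json.tmp");
    fs::write(&tmp, contents)?;
    // Swap the finished temp file into place; an existing target is replaced.
    fs::rename(&tmp, path)
}
```

Because the temp file sits next to the target, a reader should see either the old cache or the new one rather than a half-written file.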

/// Start the host-lifetime cover-art warmer: every few minutes, fetch + cache art for any library
/// entry whose store needs a network lookup (GOG / Xbox) and isn't cached yet. Idempotent — once
/// everything is cached a pass makes no network calls (and a host with only self-art stores never
/// fetches at all). Call once from `serve()`; the returned handle can be dropped to detach it.
pub fn start_art_warmer() -> std::thread::JoinHandle<()> {
    std::thread::Builder::new()
        .name("pf-art-warmer".into())
        .spawn(|| loop {
            warm_art_once();
            std::thread::sleep(std::time::Duration::from_secs(300));
        })
        .expect("spawn art warmer thread")
}

/// One warming pass: resolve uncached GOG/Xbox art. Other stores carry their own art (Steam CDN
/// template, Heroic CDN URLs, Lutris data: URLs, custom user URLs) and are skipped.
fn warm_art_once() {
    for g in all_games() {
        if cached_art(&g.id).is_some() {
            continue;
        }
        let Some((store, localid)) = g.id.split_once(':') else {
            continue;
        };
        let art = match store {
            "gog" => fetch_gog_art(localid),
            // The xbox id is the StoreId when present, else the PFN (contains '_', no displaycatalog
            // entry) → cache empty for those so they aren't retried every pass.
            "xbox" if !localid.contains('_') => fetch_xbox_art(localid),
            "xbox" => Artwork::default(),
            _ => continue, // steam/heroic/lutris/custom resolve their own art
        };
        store_art(&g.id, art);
    }
}
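
The routing rule in `warm_art_once` (which ids trigger a fetch, which are cached as empty, which are skipped) can be expressed as a pure function that is easy to test without network access. `ArtAction` and `plan_art` are hypothetical names introduced only for this sketch.

```rust
/// Hypothetical description of what one warming pass would do for an id.
#[derive(Debug, PartialEq)]
enum ArtAction {
    FetchGog(String),
    FetchXbox(String),
    CacheEmpty, // Xbox entry keyed by PFN: no catalog lookup is possible
    Skip,       // the store carries its own art, or the id is malformed
}

fn plan_art(id: &str) -> ArtAction {
    // Library ids have the shape `<store>:<local id>`.
    let Some((store, local_id)) = id.split_once(':') else {
        return ArtAction::Skip;
    };
    match store {
        "gog" => ArtAction::FetchGog(local_id.to_string()),
        // A StoreId has no underscore; a PFN fallback always does.
        "xbox" if !local_id.contains('_') => ArtAction::FetchXbox(local_id.to_string()),
        "xbox" => ArtAction::CacheEmpty,
        _ => ArtAction::Skip,
    }
}
```

Separating the decision from the fetch keeps the network-dependent part small and leaves the id rules covered by plain unit tests.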

/// HTTP GET + parse JSON with a bounded timeout. `None` on any network/parse failure (best-effort —
/// art is non-essential, so a failure just leaves the title-only card).
fn fetch_json(url: &str) -> Option<serde_json::Value> {
    let agent = ureq::AgentBuilder::new()
        .timeout(std::time::Duration::from_secs(10))
        .build();
    let body = agent.get(url).call().ok()?.into_string().ok()?;
    serde_json::from_str(&body).ok()
}

/// Make a protocol-relative URL (`//host/...`, common in GOG + MS catalog responses) absolute https.
fn abs_url(u: &str) -> String {
    u.strip_prefix("//")
        .map(|rest| format!("https://{rest}"))
        .unwrap_or_else(|| u.to_string())
}

/// GOG cover art via the public (no-auth) product API. Field names / URL shapes are GOG-specific and
/// best-effort (worth on-box confirmation); a wrong URL just degrades to the title card client-side.
fn fetch_gog_art(product_id: &str) -> Artwork {
    let Some(v) = fetch_json(&format!(
        "https://api.gog.com/products/{product_id}?expand=images"
    )) else {
        return Artwork::default();
    };
    let img = |k: &str| {
        v.get("images")
            .and_then(|i| i.get(k))
            .and_then(|u| u.as_str())
            .map(abs_url)
    };
    Artwork {
        portrait: img("verticalCover"),
        hero: img("background"),
        logo: img("logo2x"),
        header: img("logo"),
    }
}

/// Xbox cover art via the (unofficial, no-auth) Microsoft display catalog, keyed by StoreId. Best-
/// effort: the endpoint is internal/unstable, so on drift this just yields no art (title-only).
fn fetch_xbox_art(store_id: &str) -> Artwork {
    let Some(v) = fetch_json(&format!(
        "https://displaycatalog.mp.microsoft.com/v7.0/products/{store_id}?market=US&languages=en-us&fieldsTemplate=Details"
    )) else {
        return Artwork::default();
    };
    let images = v
        .get("Products")
        .and_then(|p| p.as_array())
        .and_then(|a| a.first())
        .and_then(|p| p.get("LocalizedProperties"))
        .and_then(|l| l.as_array())
        .and_then(|a| a.first())
        .and_then(|lp| lp.get("Images"))
        .and_then(|i| i.as_array());
    let mut art = Artwork::default();
    for img in images.into_iter().flatten() {
        let (Some(purpose), Some(uri)) = (
            img.get("ImagePurpose").and_then(|v| v.as_str()),
            img.get("Uri").and_then(|v| v.as_str()),
        ) else {
            continue;
        };
        let url = abs_url(uri);
        match purpose {
            "Poster" => art.portrait = Some(url),
            "SuperHeroArt" | "Hero" => art.hero = Some(url),
            "Logo" => art.logo = Some(url),
            "BoxArt" => art.header = Some(url),
            _ => {}
        }
    }
    art
}

// ---------------------------------------------------------------------------------------
// Custom store (user-curated entries, persisted + CRUD'd via the mgmt API)
// ---------------------------------------------------------------------------------------
@@ -768,6 +1383,32 @@ fn windows_launch_for(spec: &LaunchSpec) -> Option<(String, Option<std::path::Pa
            };
            Some((cmdline, None))
        }
        // Epic: open the (host-built, validated) com.epicgames.launcher:// URI via explorer.exe — a
        // concrete EXE that resolves the registered protocol handler as the user; the URI is a single
        // argv element (no shell, no cmd /c). Same pattern as the steam explorer fallback.
        "epic" => epic_launch_uri(&spec.value).map(|uri| (format!("explorer.exe \"{uri}\""), None)),
        // GOG: spawn the resolved game exe directly (host-derived from goggame-<id>.info), no Galaxy.
        "gog" => gog_spawn(&spec.value),
        // Xbox/Game Pass: activate the UWP/GDK package by its AUMID (<PFN>!<AppId>) via explorer's
        // shell:AppsFolder — which runs in the interactive user session (UWP activation fails as
        // SYSTEM/session-0; spawn_in_active_session uses the user token). Guard the charset (the value
        // is host-derived from MicrosoftGame.config + AppRepository, but belt-and-suspenders).
        "aumid" => {
            let valid = spec.value.split_once('!').is_some_and(|(pfn, app)| {
                let part = |s: &str| {
                    !s.is_empty()
                        && s.bytes()
                            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
                };
                part(pfn) && part(app)
            });
            valid.then(|| {
                (
                    format!("explorer.exe \"shell:AppsFolder\\{}\"", spec.value),
                    None,
                )
            })
        }
        // Operator-typed custom command (host-owned, never client-set): run it through the shell in the
        // interactive session. `cmd.exe /c` is acceptable here precisely because the value is operator
        // input — the same trust as the operator typing it — not a client-influenced string.
@@ -795,6 +1436,38 @@ fn steam_exe() -> Option<std::path::PathBuf> {
    None
}

/// Launch a GameStream `apps.json` command (operator-typed, trusted — never client-set) into the live
/// session, AFTER capture is up. Used by the GameStream path for the backends that DON'T nest the
/// command via [`VirtualDisplay::set_launch_command`]: Windows (no gamescope) and Linux
/// kwin/mutter/wlroots (which stream the existing desktop). The caller skips this for Linux gamescope,
/// which already nested it. On Windows it runs in the interactive USER session (the host is SYSTEM);
/// on Linux the host is already inside the user's graphical session, so a plain spawn lands the app on
/// the streamed (primary) output.
#[cfg(any(windows, target_os = "linux"))]
pub fn launch_gamestream_command(cmd: &str) -> Result<()> {
    let cmd = cmd.trim();
    anyhow::ensure!(!cmd.is_empty(), "empty command");
    #[cfg(windows)]
    {
        // cmd.exe /c is fine here: the value is the host operator's own apps.json command, not a
        // client-influenced string (same trust as the custom-store `command` kind).
        let pid = crate::interactive::spawn_in_active_session(&format!("cmd.exe /c {cmd}"), None)
            .context("spawn gamestream command in the interactive session")?;
        tracing::info!(command = %cmd, pid, "gamestream: launched app in the interactive session");
        Ok(())
    }
    #[cfg(target_os = "linux")]
    {
        let child = std::process::Command::new("sh")
            .arg("-c")
            .arg(cmd)
            .spawn()
            .context("spawn gamestream command")?;
        tracing::info!(command = %cmd, pid = child.id(), "gamestream: launched app into the session");
        Ok(())
    }
}

/// The full library: every store's titles merged + the custom entries, sorted by title.
pub fn all_games() -> Vec<GameEntry> {
    let mut games = SteamProvider.list();
@@ -805,6 +1478,13 @@ pub fn all_games() -> Vec<GameEntry> {
        games.extend(LutrisProvider.list());
        games.extend(HeroicProvider.list());
    }
    // Windows store providers (their launchers are Windows-only): Epic + GOG + Xbox/Game Pass.
    #[cfg(windows)]
    {
        games.extend(EpicProvider.list());
        games.extend(GogProvider.list());
        games.extend(XboxProvider.list());
    }
    games.extend(load_custom().into_iter().map(GameEntry::from));
    games.sort_by_key(|g| g.title.to_lowercase());
    games
@@ -1048,6 +1728,20 @@ mod tests {
            windows_launch_for(&cmd).unwrap().0,
            "cmd.exe /c notepad.exe"
        );
        // Xbox AUMID → explorer shell:AppsFolder activation; a value without '!' is rejected.
        let aumid = LaunchSpec {
            kind: "aumid".into(),
            value: "Microsoft.X_8wekyb3d8bbwe!Game".into(),
        };
        assert_eq!(
            windows_launch_for(&aumid).unwrap().0,
            "explorer.exe \"shell:AppsFolder\\Microsoft.X_8wekyb3d8bbwe!Game\""
        );
        assert!(windows_launch_for(&LaunchSpec {
            kind: "aumid".into(),
            value: "no-bang".into()
        })
        .is_none());
        // Empty / unknown kinds → no recipe.
        assert!(windows_launch_for(&LaunchSpec {
            kind: "command".into(),
@@ -1060,4 +1754,116 @@ mod tests {
        })
        .is_none());
    }

    #[cfg(windows)]
    #[test]
    fn epic_filters_and_builds_launch() {
        let dir = std::env::temp_dir().join(format!("pf-epic-test-{}", std::process::id()));
        std::fs::create_dir_all(&dir).unwrap();
        let inst = dir.to_string_lossy().into_owned();
        let empty = std::collections::HashMap::new();
        // Normal game with the full triple → kept, triple launch value.
        let game = serde_json::json!({
            "AppName": "Fortnite", "DisplayName": "Fortnite", "CatalogNamespace": "fn",
            "CatalogItemId": "abc123", "InstallLocation": inst.clone(),
            "AppCategories": ["public", "games", "applications"]
        });
        let e = epic_entry(&game, &empty).expect("game kept");
        assert_eq!(e.id, "epic:Fortnite");
        assert_eq!(e.launch.as_ref().unwrap().value, "fn:abc123:Fortnite");
        // UE component, non-launchable addon, and a missing install dir are all skipped.
        let ue = serde_json::json!({"AppName":"UE_5.3","InstallLocation":inst.clone(),"AppCategories":["engines"]});
        assert!(epic_entry(&ue, &empty).is_none());
        let dlc =
            serde_json::json!({"AppName":"DLC","InstallLocation":inst,"AppCategories":["addons"]});
        assert!(epic_entry(&dlc, &empty).is_none());
        let gone = serde_json::json!({"AppName":"Gone","InstallLocation":"C:\\nope-xyz","AppCategories":["games"]});
        assert!(epic_entry(&gone, &empty).is_none());
        std::fs::remove_dir_all(&dir).ok();
    }

    #[cfg(windows)]
    #[test]
    fn epic_launch_uri_triple_bare_and_guard() {
        assert_eq!(
            epic_launch_uri("fn:abc:Fortnite").as_deref(),
            Some("com.epicgames.launcher://apps/fn%3Aabc%3AFortnite?action=launch&silent=true")
        );
        assert_eq!(
            epic_launch_uri("Fortnite").as_deref(),
            Some("com.epicgames.launcher://apps/Fortnite?action=launch&silent=true")
        );
        assert!(epic_launch_uri("bad part:x:y").is_none()); // a space → rejected
        assert!(epic_launch_uri("").is_none());
    }

    #[cfg(windows)]
    #[test]
    fn gog_spawn_parses_and_guards() {
        let (cmd, wd) = gog_spawn("C:\\Games\\W3\\witcher3.exe\t--skip\tC:\\Games\\W3").unwrap();
        assert_eq!(cmd, "\"C:\\Games\\W3\\witcher3.exe\" --skip");
        assert_eq!(wd, Some(std::path::PathBuf::from("C:\\Games\\W3")));
        let (cmd2, wd2) = gog_spawn("C:\\g.exe").unwrap();
        assert_eq!(cmd2, "\"C:\\g.exe\"");
        assert!(wd2.is_none());
        assert!(gog_spawn("").is_none());
    }

    #[cfg(windows)]
    #[test]
    fn gog_play_task_picks_primary_filetask() {
        let dir = std::env::temp_dir().join(format!("pf-gog-test-{}", std::process::id()));
        std::fs::create_dir_all(&dir).unwrap();
        let id = "1207658924";
        std::fs::write(
            dir.join(format!("goggame-{id}.info")),
            r#"{"playTasks":[
                {"isPrimary":false,"type":"FileTask","path":"other.exe"},
                {"isPrimary":true,"type":"FileTask","path":"bin\\game.exe","arguments":"-w","workingDir":"bin"}
            ]}"#,
        )
        .unwrap();
        let (exe, args, wd) = gog_play_task(&dir.to_string_lossy(), id).unwrap();
        std::fs::remove_dir_all(&dir).ok();
        assert!(exe.ends_with("bin\\game.exe"), "exe={exe}");
        assert_eq!(args, "-w");
        assert!(wd.ends_with("bin"), "wd={wd}");
    }

    #[cfg(windows)]
    #[test]
    fn xbox_parse_config_and_pfn() {
        let xml = r#"<?xml version="1.0" encoding="utf-8"?>
<Game configVersion="1">
  <Identity Name="Microsoft.624F8B84B80" Publisher="CN=Microsoft" Version="1.0.0.0" />
  <ExecutableList>
    <Executable Name="gamelaunchhelper.exe" Id="Game" />
  </ExecutableList>
  <StoreId>9NBLGGH4R315</StoreId>
  <ShellVisuals DefaultDisplayName="Halo Infinite" Square150x150Logo="x.png" />
</Game>"#;
        let (name, app_id, title, store_id) = xbox_parse_config(xml, Some("HaloInfinite")).unwrap();
        assert_eq!(name, "Microsoft.624F8B84B80");
        assert_eq!(app_id, "Game");
        assert_eq!(title, "Halo Infinite");
        assert_eq!(store_id, "9NBLGGH4R315");
        // An ms-resource DefaultDisplayName is unresolvable → fall back to the install folder name.
        let xml2 = r#"<Game><Identity Name="Pkg.Name"/>
            <ExecutableList><Executable Id="App"/></ExecutableList>
            <ShellVisuals DefaultDisplayName="ms-resource:DisplayName"/></Game>"#;
        let (_, app2, title2, sid2) = xbox_parse_config(xml2, Some("MyGameFolder")).unwrap();
        assert_eq!(app2, "App");
        assert_eq!(title2, "MyGameFolder");
        assert_eq!(sid2, "");
        // PackageFamilyName reduced from a PackageFullName dir name (the hash is the last segment).
        assert_eq!(
            pfn_from_full(
                "Microsoft.624F8B84B80_1.0.0.0_x64__8wekyb3d8bbwe",
                "Microsoft.624F8B84B80"
            )
            .as_deref(),
            Some("Microsoft.624F8B84B80_8wekyb3d8bbwe")
        );
        assert!(pfn_from_full("NoUnderscore", "NoUnderscore").is_none());
    }
}
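
The charset guard that the `aumid` launch kind applies before building the `explorer.exe` command line can be sketched as a standalone function. `aumid_command` is a hypothetical name; the allowed characters and the command shape follow the `aumid` test expectations in this diff.

```rust
/// Hypothetical standalone version of the AUMID guard: `<PFN>!<AppId>` in, command line out.
fn aumid_command(aumid: &str) -> Option<String> {
    // Exactly one `!` separates the package family name from the app id.
    let (pfn, app) = aumid.split_once('!')?;
    // Both halves must be non-empty and limited to [A-Za-z0-9._-].
    let ok = |s: &str| {
        !s.is_empty()
            && s.bytes()
                .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
    };
    (ok(pfn) && ok(app)).then(|| format!("explorer.exe \"shell:AppsFolder\\{aumid}\""))
}
```

Rejecting quotes, spaces and a second `!` means the value cannot break out of the quoted argument, even though it is already host-derived.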
@@ -13,6 +13,9 @@
//! attaches none, the export yields an already-signaled sync_file (poll returns immediately) — no
//! wait, no harm, and `waited=false` tells us the driver doesn't fence (so zero-copy would still race).

// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use std::os::fd::RawFd;

// linux/dma-buf.h ioctls on the DMA_BUF_BASE ('b' = 0x62) magic. _IOWR = dir(3)<<30 | size<<16 | base<<8 | nr.
@@ -40,6 +43,11 @@ pub fn wait_read_ready(dmabuf_fd: RawFd, timeout_ms: i32) -> std::io::Result<boo
        flags: DMA_BUF_SYNC_READ,
        fd: -1,
    };
    // SAFETY: `dmabuf_fd` is a live dmabuf fd supplied by the caller (borrowed for this call; we
    // never close it). `DMA_BUF_IOCTL_EXPORT_SYNC_FILE` encodes `size_of::<DmaBufExportSyncFile>()`
    // — the exact byte count the kernel copies — and `&mut req` is a live, correctly-sized
    // `#[repr(C)]` struct the EXPORT_SYNC_FILE ioctl reads (`flags`) and writes (`fd`). `req`
    // outlives this synchronous call and is not aliased elsewhere.
    let r = unsafe { libc::ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &mut req) };
    if r < 0 {
        return Err(std::io::Error::last_os_error());
@@ -54,11 +62,21 @@ pub fn wait_read_ready(dmabuf_fd: RawFd, timeout_ms: i32) -> std::io::Result<boo
        revents: 0,
    };
    // Non-blocking probe: not-yet-signaled (poll==0) means the producer is still rendering.
    // SAFETY: `&mut pfd` points at a single live `libc::pollfd` and `nfds == 1` matches that one
    // element; `pfd.fd` is `sync_fd`, the sync_file fd just exported (already checked `>= 0`).
    // `poll` reads `fd`/`events` and writes `revents` for this non-blocking (timeout 0) probe, then
    // returns — `pfd` outlives the call and aliases nothing.
    let pending = unsafe { libc::poll(&mut pfd, 1, 0) } == 0;
    if pending {
        pfd.revents = 0;
        // SAFETY: same live single-element `pfd` (its `revents` reset to 0 just above), `nfds == 1`,
        // and `sync_fd` still open. This blocking `poll` (up to `timeout_ms`) waits for the render
        // fence to signal; it reads `fd`/`events`, writes `revents`, and returns before `pfd` ends.
        unsafe { libc::poll(&mut pfd, 1, timeout_ms) }; // block until the render fence signals
    }
    // SAFETY: `sync_fd` is the sync_file fd the EXPORT_SYNC_FILE ioctl created and handed us to own;
    // this point is reached only when `sync_fd >= 0`, this `close` runs exactly once on it, and it is
    // never used afterward — no double-close or use-after-close.
    unsafe { libc::close(sync_fd) };
    Ok(pending)
}
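
The `_IOWR` encoding mentioned in the dma-buf comments (`dir(3)<<30 | size<<16 | base<<8 | nr`) can be worked through for the export ioctl. This sketch assumes the common Linux ioctl bit layout used on x86 and ARM; `iowr_request` and `ExportSyncFile` are illustrative names, and the struct layout and request number 2 are taken from `linux/dma-buf.h`.

```rust
/// Illustrative helper: build an `_IOWR` request number (direction = read | write = 3).
const fn iowr_request(base: u8, nr: u8, size: usize) -> u64 {
    const DIR_READ_WRITE: u64 = 3;
    (DIR_READ_WRITE << 30) | ((size as u64) << 16) | ((base as u64) << 8) | (nr as u64)
}

/// Mirrors `struct dma_buf_export_sync_file`: a `u32` flags field and an `i32` fd.
#[repr(C)]
struct ExportSyncFile {
    flags: u32,
    fd: i32,
}

/// 'b' (0x62) is the dma-buf magic; the argument size is encoded in bits 16 and up.
const EXPORT_SYNC_FILE: u64 = iowr_request(b'b', 2, std::mem::size_of::<ExportSyncFile>());
```

With an 8-byte argument this works out to `0xC0086202`: `0xC0000000` for the direction, `0x00080000` for the size, `0x6200` for the magic and `0x02` for the number.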
@@ -8,6 +8,8 @@
//! verified (ioctl numbers + a live signal→wait round trip), ready to wire in the moment a producer
//! gains working `SPA_META_SyncTimeline`.
#![allow(dead_code)]
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]
//!
//! Compositors that render directly into the PipeWire buffer pool (Mutter's virtual
//! monitors) hand buffers over at GPU-submit time; on drivers without implicit dmabuf
@@ -81,6 +83,8 @@ pub struct DrmSync {
impl DrmSync {
    pub fn open() -> Result<DrmSync> {
        let path = c"/dev/dri/renderD128";
        // SAFETY: `path` is a 'static NUL-terminated C string literal; `open` only reads it as a
        // filesystem path and returns an fd (or -1). No Rust memory is aliased or handed to the kernel.
        let fd = unsafe { libc::open(path.as_ptr(), libc::O_RDWR | libc::O_CLOEXEC) };
        if fd < 0 {
            bail!("open /dev/dri/renderD128 for syncobj ops: {}", errno());
@@ -94,6 +98,9 @@ impl DrmSync {
            fd: syncobj_fd,
            ..Default::default()
        };
        // SAFETY: `self.fd` is the live render-node fd from `open`; the request number encodes
        // `size_of::<DrmSyncobjHandle>()` (the bytes the kernel copies), and `&mut req` is a live,
        // correctly-sized `#[repr(C)]` struct the FD_TO_HANDLE ioctl reads (`fd`) and writes (`handle`).
        let r = unsafe { libc::ioctl(self.fd, DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, &mut req) };
        if r < 0 {
            bail!("SYNCOBJ_FD_TO_HANDLE: {}", errno());
@@ -106,6 +113,8 @@ impl DrmSync {
            handle,
            ..Default::default()
        };
        // SAFETY: `self.fd` is the live render-node fd; `DRM_IOCTL_SYNCOBJ_DESTROY` encodes
        // `size_of::<DrmSyncobjDestroy>()`, and `&mut req` is a live correctly-sized struct the kernel reads.
        unsafe { libc::ioctl(self.fd, DRM_IOCTL_SYNCOBJ_DESTROY, &mut req) };
    }

@@ -117,6 +126,8 @@ impl DrmSync {
            tv_sec: 0,
            tv_nsec: 0,
        };
        // SAFETY: `CLOCK_MONOTONIC` is a valid clock id and `&mut now` is a live `libc::timespec` the
        // kernel fills in; the call returns before `now` is read, so there is no aliasing/lifetime issue.
        unsafe { libc::clock_gettime(libc::CLOCK_MONOTONIC, &mut now) };
        let deadline = now.tv_sec * 1_000_000_000 + now.tv_nsec + timeout_ms as i64 * 1_000_000;
        let handles = [handle];
@@ -129,6 +140,11 @@ impl DrmSync {
            flags: DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT,
            ..Default::default()
        };
        // SAFETY: `self.fd` is the live render-node fd; the request number encodes
        // `size_of::<DrmSyncobjTimelineWait>()`; `&mut req` is a live correctly-sized struct. Its
        // `handles`/`points` u64 fields hold the addresses of the local `handles`/`points` arrays, which
        // outlive this synchronous call, and `count_handles == 1` matches their length — so every kernel
        // read through those addresses stays in bounds.
        let r = unsafe { libc::ioctl(self.fd, DRM_IOCTL_SYNCOBJ_TIMELINE_WAIT, &mut req) };
        let saved = errno();
        self.destroy(handle);
@@ -151,6 +167,10 @@ impl DrmSync {
            count_handles: 1,
            flags: 0,
        };
        // SAFETY: `self.fd` is the live render-node fd; the request number encodes
        // `size_of::<DrmSyncobjTimelineArray>()`; `&mut req` is a live correctly-sized struct whose
        // `handles`/`points` u64 fields address the local `handles`/`points` arrays (alive for this
        // synchronous call, `count_handles == 1` matching their length).
        let r = unsafe { libc::ioctl(self.fd, DRM_IOCTL_SYNCOBJ_TIMELINE_SIGNAL, &mut req) };
        let saved = errno();
        self.destroy(handle);
@@ -163,6 +183,8 @@ impl DrmSync {

impl Drop for DrmSync {
    fn drop(&mut self) {
        // SAFETY: `self.fd` is the fd `open` returned; this `DrmSync` owns it exclusively and `close`
        // runs exactly once (here, in `Drop`), so there is no double-close or use-after-close.
        unsafe { libc::close(self.fd) };
    }
}
@@ -203,14 +225,19 @@ mod tests {
        const CREATE: u64 = iowr(0xBF, std::mem::size_of::<Create>());
        const HANDLE_TO_FD: u64 = iowr(0xC1, std::mem::size_of::<DrmSyncobjHandle>());
        let mut c = Create::default();
        // SAFETY: `sync.fd` is the live render-node fd; `CREATE` encodes `size_of::<Create>()`, and
        // `&mut c` is a live correctly-sized struct the kernel fills (`handle`).
        assert!(unsafe { libc::ioctl(sync.fd, CREATE, &mut c) } >= 0);
        let mut h = DrmSyncobjHandle {
            handle: c.handle,
            ..Default::default()
        };
        // SAFETY: `sync.fd` is live; `HANDLE_TO_FD` encodes `size_of::<DrmSyncobjHandle>()`; `&mut h`
        // is a live correctly-sized struct (the kernel reads `handle`, writes `fd`).
        assert!(unsafe { libc::ioctl(sync.fd, HANDLE_TO_FD, &mut h) } >= 0);
        sync.signal_point(h.fd, 1).expect("signal");
        sync.wait_point(h.fd, 1, 100).expect("wait after signal");
        // SAFETY: `h.fd` is the fd HANDLE_TO_FD just exported; we own it and close it exactly once here.
        unsafe { libc::close(h.fd) };
        sync.destroy(c.handle);
    }

@@ -11,6 +11,8 @@
//! thread) and ffmpeg's `hevc_nvenc` (encode thread); each thread makes it current before use.

#![allow(non_camel_case_types, non_snake_case)]
// Every `unsafe` block/impl below carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
#![deny(clippy::undocumented_unsafe_blocks)]

use anyhow::{bail, Result};
use std::os::raw::{c_int, c_uint, c_void};
@@ -128,8 +130,14 @@ struct CudaApi {
    ) -> CUresult,
    cuDestroyExternalMemory: unsafe extern "C" fn(CUexternalMemory) -> CUresult,
}
// The resolved fn pointers are plain addresses into a process-lifetime mapping; safe to share.
// SAFETY: every field is a bare `extern "C" fn` address into the leaked, process-lifetime
// `libcuda` mapping (`cuda_api` `forget`s the `Library`, so it is never unloaded) — an immutable
// value with no interior mutability and no thread affinity. Moving the table to another thread
// cannot dangle (the code it points at stays mapped) or race (the fields are read-only).
unsafe impl Send for CudaApi {}
// SAFETY: as above — the table is a set of immutable fn-pointer addresses with no interior
// mutability, so concurrent shared reads from multiple threads cannot race; the driver entry
// points they address are themselves thread-safe.
unsafe impl Sync for CudaApi {}

/// `CUresult` returned by the wrappers when `libcuda` isn't loaded (no NVIDIA driver). Non-zero so
@@ -143,6 +151,14 @@ static CUDA_API: OnceLock<Option<CudaApi>> = OnceLock::new();
/// (the expected case on AMD/Intel hosts) — logged at debug, not an error.
fn cuda_api() -> Option<&'static CudaApi> {
    CUDA_API
        // SAFETY: `Library::new` runs `libcuda.so.1`'s initializers — it is the trusted NVIDIA
        // driver library, so loading has no unexpected effects; `?`/`None` handle its absence.
        // Each `lib.get::<T>(name)` asserts the symbol's real ABI equals `T`: every NUL-terminated
        // name is a documented CUDA Driver API entry point and `T` is the exact
        // `unsafe extern "C" fn(..)` signature from cuda.h/cudaGL.h (`_v2` for ctx/mem ops). Each
        // `Symbol` only borrows `lib` until the end of the struct-literal statement; we deref-copy
        // the raw fn-pointer out first, then `forget(lib)` leaks the mapping so those addresses
        // stay valid for the whole process. Runs once under the `OnceLock` init — no aliasing.
        .get_or_init(|| unsafe {
            let lib = libloading::Library::new("libcuda.so.1")
                .or_else(|_| libloading::Library::new("libcuda.so"))
@@ -361,6 +377,12 @@ pub fn read_plane_to_host(
        Height: height,
        ..Default::default()
    };
    // SAFETY: `copy_blocking` is unsafe because it issues a CUDA copy; its contract is a valid
    // descriptor with the shared context current (the caller's responsibility — self-test path).
    // `&copy` is a live local `#[repr(C)] CUDA_MEMCPY2D` that outlives the synchronous call:
    // `srcDevice`/`srcPitch` are the caller's live pitched device plane, `dstHost` addresses the
    // freshly-allocated `host` `Vec` of exactly `width_bytes*height` bytes, and `WidthInBytes`×
    // `Height` fit both. The copy is synchronous, so `host` is fully written before we return it.
    unsafe { copy_blocking(&copy, "cuMemcpy2DAsync_v2(dev->host)")? };
    Ok(host)
}
@@ -369,7 +391,13 @@ pub fn read_plane_to_host(
|
||||
/// in a `OnceLock`; the raw `CUcontext` is thread-safe to make current from any thread.
|
||||
#[derive(Clone, Copy)]
|
||||
pub struct Context(pub CUcontext);
|
||||
// SAFETY: `CUcontext` is an opaque CUDA driver handle, not a dereferenceable Rust pointer. It is
|
||||
// created once and never destroyed (process lifetime), and the only thing done with it is
|
||||
// `cuCtxSetCurrent`, which the Driver API explicitly allows from any thread — so transferring the
|
||||
// handle to another thread cannot dangle or race (the driver owns the synchronization).
|
||||
unsafe impl Send for Context {}
|
||||
// SAFETY: as above — the wrapped handle is an immutable opaque address and the driver does all the
|
||||
// synchronization, so sharing `&Context` across threads is sound.
|
||||
unsafe impl Sync for Context {}
|
||||
|
||||
static CONTEXT: OnceLock<Context> = OnceLock::new();
|
||||
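The `Send`/`Sync` reasoning above can be sketched in isolation. This is a minimal illustration, not the crate's code: `RawHandle`, `Handle`, `shared_handle`, and `read_from_threads` are hypothetical names, and a fixed fake address stands in for a real `CUcontext`. It shows that a raw-pointer newtype needs both impls before it can live in a `static OnceLock`, and that every thread then observes the same value.

```rust
use std::ffi::c_void;
use std::sync::OnceLock;
use std::thread;

/// Hypothetical stand-in for an opaque driver handle such as `CUcontext`.
type RawHandle = *mut c_void;

#[derive(Clone, Copy)]
pub struct Handle(pub RawHandle);

// SAFETY (sketch): the wrapped value is only ever treated as an opaque address.
// Nothing here dereferences it, so moving or sharing it cannot race on Rust memory.
unsafe impl Send for Handle {}
// SAFETY (sketch): same argument; `&Handle` exposes nothing but an immutable address.
unsafe impl Sync for Handle {}

// Without the two impls above this static would not compile, because
// `OnceLock<T>` is only `Sync` when `T: Send + Sync`.
static HANDLE: OnceLock<Handle> = OnceLock::new();

/// Returns the process-wide handle as an address, creating it on first use.
pub fn shared_handle() -> usize {
    // A fake, never-dereferenced address stands in for a real driver call.
    HANDLE.get_or_init(|| Handle(0x1000usize as RawHandle)).0 as usize
}

/// Reads the shared handle from `n` threads and returns what each one saw.
pub fn read_from_threads(n: usize) -> Vec<usize> {
    let joins: Vec<_> = (0..n).map(|_| thread::spawn(shared_handle)).collect();
    joins.into_iter().map(|j| j.join().unwrap()).collect()
}
```

Calling `read_from_threads(4)` initializes the handle at most once, however many threads race on the first access.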
@@ -382,6 +410,12 @@ pub fn context() -> Result<CUcontext> {
|
||||
if cuda_api().is_none() {
|
||||
bail!("libcuda.so.1 not available — no NVIDIA driver (CUDA zero-copy disabled)");
|
||||
}
|
||||
// SAFETY: we returned above unless `cuda_api()` is `Some`, so every wrapper here forwards into
|
||||
// the live, leaked `libcuda` table rather than the not-loaded stub. `cuInit(0)` passes the
|
||||
// API-required flags value 0. `&mut dev`/`&mut ctx` are live, zero/null-initialized stack
|
||||
// out-params the driver writes the device handle / new context into; each outlives its
|
||||
// synchronous call and they are distinct locals (no aliasing). `cuCtxCreate_v2` yields a valid
|
||||
// `CUcontext` on success (`ck` bails otherwise), which becomes the block's value.
|
||||
let ctx = unsafe {
|
||||
ck(cuInit(0), "cuInit")?;
|
||||
let mut dev: CUdevice = 0;
|
||||
@@ -401,6 +435,10 @@ pub fn context() -> Result<CUcontext> {
|
||||
/// Make the shared context current on the calling thread (required before any CUDA op here).
|
||||
pub fn make_current() -> Result<()> {
|
||||
let ctx = context()?;
|
||||
// SAFETY: `ctx` came from `context()?`, so it is the live shared `CUcontext` and the driver
|
||||
// table is present. `cuCtxSetCurrent` binds that opaque handle to the calling thread; it takes
|
||||
// no Rust-memory pointer and is thread-safe (affects only this thread's current context), so
|
||||
// there is no aliasing or lifetime hazard.
|
||||
unsafe { ck(cuCtxSetCurrent(ctx), "cuCtxSetCurrent") }
|
||||
}
|
||||
|
||||
@@ -423,6 +461,12 @@ fn copy_stream() -> CUstream {
|
||||
if let Some(s) = cell.get() {
|
||||
return s;
|
||||
}
|
||||
// SAFETY: `copy_stream` runs with the shared context current (its doc contract), so the
|
||||
// wrappers forward into the live `libcuda` table. `&mut least`/`&mut greatest` are live
|
||||
// stack `i32`s the driver fills with the priority range; `&mut s` is a live null-init
|
||||
// `CUstream` the driver writes the new stream into. All out-params outlive their
|
||||
// synchronous calls and are distinct locals. On any non-zero result we fall back to a null
|
||||
// (NULL-stream) value and never read an uninitialized handle.
|
||||
let stream = unsafe {
|
||||
let (mut least, mut greatest) = (0i32, 0i32);
|
||||
if cuCtxGetStreamPriorityRange(&mut least, &mut greatest) != 0 {
|
||||
@@ -459,6 +503,11 @@ unsafe fn copy_blocking(copy: &CUDA_MEMCPY2D, what: &str) -> Result<()> {
|
||||
fn alloc_pitched(width: u32, height: u32) -> Result<(CUdeviceptr, usize)> {
|
||||
let mut ptr: CUdeviceptr = 0;
|
||||
let mut pitch: usize = 0;
|
||||
// SAFETY: `cuMemAllocPitch_v2` allocates a pitched device buffer (the wrapper forwards to the
|
||||
// live table on any path that reached allocation). `&mut ptr` (`CUdeviceptr`) and `&mut pitch`
|
||||
// (`usize`) are live, distinct stack out-params the driver writes the allocation pointer and
|
||||
// its pitch into; both outlive the synchronous call. Width/height/element-size are by-value
|
||||
// ints. No aliasing — two separate locals.
|
||||
unsafe {
|
||||
ck(
|
||||
cuMemAllocPitch_v2(
|
||||
@@ -486,6 +535,10 @@ fn alloc_pitched_nv12(
|
||||
let mut y_pitch: usize = 0;
|
||||
let mut uv_ptr: CUdeviceptr = 0;
|
||||
let mut uv_pitch: usize = 0;
|
||||
// SAFETY: two independent `cuMemAllocPitch_v2` calls (wrapper → live table). `&mut y_ptr`/
|
||||
// `&mut y_pitch` and `&mut uv_ptr`/`&mut uv_pitch` are live, distinct stack out-params the
|
||||
// driver writes each plane's pointer and pitch into; all outlive their synchronous calls. The
|
||||
// dimension/element-size args are by-value ints. No aliasing — four separate locals.
|
||||
unsafe {
|
||||
ck(
|
||||
cuMemAllocPitch_v2(
|
||||
@@ -524,6 +577,13 @@ struct PoolInner {
|
||||
|
||||
impl Drop for PoolInner {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: the pool only exists because allocation succeeded, so the driver table is live.
|
||||
// `PoolInner` drops only once every `DeviceBuffer` that referenced it (each holds an `Arc`
|
||||
// clone) has been recycled, so `free`/`free_uv` hold every outstanding allocation exactly
|
||||
// once and nothing else still uses them — no double-free or use-after-free. We make the
|
||||
// shared context current first (drop may run off the allocating thread) so `cuMemFree_v2`
|
||||
// targets the right context. Each `p` is a `CUdeviceptr` previously returned by
|
||||
// `cuMemAllocPitch_v2`; results are ignored (best-effort teardown).
|
||||
unsafe {
|
||||
if let Some(c) = CONTEXT.get() {
|
||||
let _ = cuCtxSetCurrent(c.0);
|
||||
@@ -697,6 +757,12 @@ impl Drop for DeviceBuffer {
|
||||
}
|
||||
} else {
|
||||
// The buffer may be freed on the encode thread; cuMemFree needs a current context.
|
||||
// SAFETY: this is the un-pooled branch (`pool` is `None`), so this `DeviceBuffer`
|
||||
// exclusively owns `self.ptr` (and `self.uv`'s `uv_ptr`), each returned by
|
||||
// `cuMemAllocPitch_v2` and freed exactly once here — `drop` runs once and the
|
||||
// `self.ptr == 0` guard above skips the sentinel/empty case, so no double-free. We set
|
||||
// the shared context current first because drop may run on a thread where it isn't, and
|
||||
// `cuMemFree_v2` needs it. Wrapper → live table; results ignored (teardown).
|
||||
unsafe {
|
||||
if let Some(c) = CONTEXT.get() {
|
||||
let _ = cuCtxSetCurrent(c.0);
|
||||
@@ -745,6 +811,16 @@ impl RegisteredTexture {
|
||||
/// unmap. The copy is synchronized (on our priority stream) before unmap so `dst` is ready
|
||||
/// before the source dmabuf is recycled. Always unmaps, even if the copy errors.
|
||||
pub fn copy_mapped_to(&mut self, dst: &DeviceBuffer) -> Result<()> {
|
||||
// SAFETY: `self.resource` is the valid `CUgraphicsResource` from a successful `register_gl`
|
||||
// (its only constructor), so the wrappers forward to the live table; the caller holds the
|
||||
// GL+CUDA contexts current (the registration's contract). `cuGraphicsMapResources` maps
|
||||
// `count == 1` resource via `&mut self.resource` (a live field) on the default stream;
|
||||
// `cuGraphicsSubResourceGetMappedArray` writes the mapped `CUarray` into the live local
|
||||
// `array` (index 0, mip 0). On failure we unmap and bail (balanced). `&copy` is a live
|
||||
// local `CUDA_MEMCPY2D` outliving the synchronous `copy_blocking`: `srcArray` is valid
|
||||
// while mapped, `dstDevice`/`dstPitch` are `dst`'s live allocation, `width*4`×`height` fit
|
||||
// both. `copy_blocking` syncs before we unmap, so the array stays valid through the copy;
|
||||
// we always unmap afterward (even on error), keeping the map/unmap pair balanced.
|
||||
unsafe {
|
||||
ck(
|
||||
cuGraphicsMapResources(1, &mut self.resource, std::ptr::null_mut()),
|
||||
@@ -783,6 +859,14 @@ impl RegisteredTexture {
|
||||
width_bytes: usize,
|
||||
height: usize,
|
||||
) -> Result<()> {
|
||||
// SAFETY: identical contract to `copy_mapped_to` — `self.resource` is the valid
|
||||
// `CUgraphicsResource` from `register_gl` (wrappers → live table; caller holds GL+CUDA
|
||||
// contexts current). Map `count == 1` resource via the live `&mut self.resource`; the
|
||||
// mapped `CUarray` is written into the live local `array` (index 0, mip 0); on failure we
|
||||
// unmap and bail (balanced). `&copy` is a live local outliving the synchronous
|
||||
// `copy_blocking`: `srcArray` valid while mapped, `dstDevice`/`dstPitch` are the caller's
|
||||
// live plane, `width_bytes`×`height` fit it. We always unmap afterward, even on copy error,
|
||||
// so the map/unmap pair stays balanced and the array outlives the copy.
|
||||
unsafe {
|
||||
ck(
|
||||
cuGraphicsMapResources(1, &mut self.resource, std::ptr::null_mut()),
|
||||
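The "always unmap, even if the copy errors" contract in the two hunks above can be modelled without any CUDA calls. The sketch below is an assumption-laden illustration: `with_mapped` and `demo` are hypothetical helpers, and closures stand in for `cuGraphicsMapResources`, the copy, and the unmap call. It only shows the control flow that keeps the pair balanced.

```rust
use std::cell::RefCell;

/// Runs `body` between `map` and `unmap`. If `map` fails there is nothing to undo;
/// once it succeeds, `unmap` runs exactly once whether or not `body` fails.
pub fn with_mapped<T, E>(
    map: impl FnOnce() -> Result<(), E>,
    body: impl FnOnce() -> Result<T, E>,
    unmap: impl FnOnce(),
) -> Result<T, E> {
    map()?;
    let out = body(); // hold the result instead of returning early with `?`
    unmap(); // balanced with the successful `map` above
    out
}

/// Records the call order for a run whose body optionally fails.
pub fn demo(body_fails: bool) -> (bool, Vec<&'static str>) {
    let log = RefCell::new(Vec::new());
    let result: Result<(), &'static str> = with_mapped(
        || {
            log.borrow_mut().push("map");
            Ok(())
        },
        || {
            log.borrow_mut().push("copy");
            if body_fails { Err("copy failed") } else { Ok(()) }
        },
        || log.borrow_mut().push("unmap"),
    );
    (result.is_ok(), log.into_inner())
}
```

The key choice is storing the body's result in `out` rather than applying `?` to it, so the unmap step cannot be skipped by an early return.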
@@ -847,6 +931,10 @@ pub fn copy_device_to_device(
|
||||
Height: src.height as usize,
|
||||
..Default::default()
|
||||
};
|
||||
// SAFETY: `copy_blocking` is unsafe (issues a CUDA copy); the caller must have the shared
|
||||
// context current (documented). `&copy` is a live local device→device `CUDA_MEMCPY2D` outliving
|
||||
// the synchronous call: `srcDevice`/`srcPitch` are `src`'s live allocation, `dstDevice`/
|
||||
// `dstPitch` the caller's live region, `width*4`×`height` within both. Wrapper → live table.
|
||||
unsafe { copy_blocking(&copy, "cuMemcpy2DAsync_v2(dev->dev)") }
|
||||
}
|
||||
|
||||
@@ -888,6 +976,12 @@ pub fn copy_nv12_to_device(
|
||||
Height: h / 2,
|
||||
..Default::default()
|
||||
};
|
||||
// SAFETY: two unsafe `copy_blocking` device→device copies; the caller must have the shared
|
||||
// context current (documented). `&y`/`&uv` are live local `CUDA_MEMCPY2D`s outliving each
|
||||
// synchronous call. All four device pointers are valid: `src.ptr`/`src_uv_ptr` come from a live
|
||||
// NV12 `DeviceBuffer` (its `.uv` presence was checked via `ok_or_else`), `y_dst`/`uv_dst` are
|
||||
// the caller's live NVENC surface planes; the luma copy is `w`×`h`, the chroma copy
|
||||
// `(w/2)*2`×`h/2`, each within its planes. Wrappers → live table.
|
||||
unsafe {
|
||||
copy_blocking(&y, "cuMemcpy2DAsync_v2(nv12 Y dev->dev)")?;
|
||||
copy_blocking(&uv, "cuMemcpy2DAsync_v2(nv12 UV dev->dev)")
|
||||
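The extents quoted in the SAFETY comment above (luma `w`×`h`, chroma `(w/2)*2`×`h/2`) can be checked with plain arithmetic. This is a sketch assuming 8-bit NV12 with interleaved U/V bytes; `PlaneExtent`, `nv12_extents`, and `fits` are hypothetical names, not items from the crate.

```rust
/// Byte width and row count of one 2D plane copy.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct PlaneExtent {
    pub width_bytes: usize,
    pub rows: usize,
}

/// Extents of the luma and chroma copies for a `w` x `h` pixel NV12 frame.
pub fn nv12_extents(w: usize, h: usize) -> (PlaneExtent, PlaneExtent) {
    // Luma: one byte per pixel.
    let luma = PlaneExtent { width_bytes: w, rows: h };
    // Chroma: w/2 interleaved (U, V) pairs per row, 2 bytes per pair, h/2 rows.
    let chroma = PlaneExtent { width_bytes: (w / 2) * 2, rows: h / 2 };
    (luma, chroma)
}

/// True when a copy of `extent` stays inside a plane allocated with
/// `pitch` bytes per row and `rows` rows.
pub fn fits(extent: PlaneExtent, pitch: usize, rows: usize) -> bool {
    extent.width_bytes <= pitch && extent.rows <= rows
}
```

For an odd width the integer division rounds the chroma row down to an even byte count, so the chroma copy is never wider than the luma row.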
@@ -897,6 +991,12 @@ pub fn copy_nv12_to_device(
|
||||
impl Drop for RegisteredTexture {
|
||||
fn drop(&mut self) {
|
||||
if !self.resource.is_null() {
|
||||
// SAFETY: `self.resource` is non-null (just checked) and is the valid
|
||||
// `CUgraphicsResource` from `register_gl`, owned exclusively by this `RegisteredTexture`
|
||||
// and unregistered exactly once here (drop runs once) — no use-after-free or
|
||||
// double-unregister. `cuGraphicsUnregisterResource` releases the GL↔CUDA registration;
|
||||
// wrapper → live table (the resource exists ⇒ the driver was present). Result ignored
|
||||
// (best-effort teardown).
|
||||
unsafe {
|
||||
let _ = cuGraphicsUnregisterResource(self.resource);
|
||||
}
|
||||
@@ -913,7 +1013,11 @@ pub struct ExternalDmabuf {
|
||||
pub size: u64,
|
||||
}
|
||||
|
||||
// Raw driver handles; used from the single capture thread but moved with the importer.
|
||||
// SAFETY: the fields are opaque CUDA driver handles — an external-memory handle and a device
|
||||
// pointer — not dereferenceable Rust memory, and the value is uniquely owned (no `Clone`). It is
|
||||
// used from a single capture thread but constructed on / moved between threads with the importer;
|
||||
// transferring these handles is sound because uniqueness rules out aliasing and they are destroyed
|
||||
// exactly once in `Drop`. Only `Send` (not `Sync`) is asserted, matching the single-thread use.
|
||||
unsafe impl Send for ExternalDmabuf {}
|
||||
|
||||
impl ExternalDmabuf {
|
||||
@@ -921,6 +1025,9 @@ impl ExternalDmabuf {
|
||||
/// from then on) and map its full `size` bytes to a device pointer. The shared context
|
||||
/// must be current.
|
||||
pub fn import(fd: i32, size: u64) -> Result<ExternalDmabuf> {
|
||||
// SAFETY: `libc::dup` only reads the integer `fd` and returns a new descriptor (or -1); it
|
||||
// touches no Rust memory and `fd` is the caller's still-owned dmabuf fd (not consumed
|
||||
// here). No aliasing or lifetime concern — a pure syscall on an integer.
|
||||
let dup = unsafe { libc::dup(fd) };
|
||||
if dup < 0 {
|
||||
bail!("dup(dmabuf fd) failed");
|
||||
@@ -938,8 +1045,17 @@ impl ExternalDmabuf {
|
||||
};
|
||||
desc.handle[0] = dup as u32 as u64; // union member `int fd` (little-endian low bytes)
|
||||
let mut ext: CUexternalMemory = std::ptr::null_mut();
|
||||
// SAFETY: `cuImportExternalMemory` imports the memory described by `&desc`, a live local
|
||||
// `#[repr(C)] CUDA_EXTERNAL_MEMORY_HANDLE_DESC` (cuda.h 64-bit layout) that outlives this
|
||||
// synchronous call: `type_` is OPAQUE_FD, `handle[0]` holds the dup'd fd in the union's
|
||||
// `int fd` low bytes, `size` is set. `&mut ext` is a live null-init out-param the driver
|
||||
// writes the imported handle into. The driver takes ownership of the fd only on success.
|
||||
// Distinct locals → no aliasing. Wrapper → live table (caller holds the context current).
|
||||
let r = unsafe { cuImportExternalMemory(&mut ext, &desc) };
|
||||
if r != 0 {
|
||||
// SAFETY: import failed (`r != 0`), so the driver did NOT take ownership of `dup`; we
|
||||
// still own it and close it exactly once here on the error path (the success path never
|
||||
// closes it — the driver does). `libc::close` acts on the integer fd alone.
|
||||
unsafe { libc::close(dup) }; // import failed → the driver did not take the fd
|
||||
bail!("cuImportExternalMemory failed ({r}) — LINEAR dmabuf import unsupported?");
|
||||
}
|
||||
@@ -949,8 +1065,17 @@ impl ExternalDmabuf {
|
||||
..Default::default()
|
||||
};
|
||||
let mut ptr: CUdeviceptr = 0;
|
||||
// SAFETY: maps a device pointer from `ext` (the valid `CUexternalMemory` just imported) per
|
||||
// `&buf`, a live local `CUDA_EXTERNAL_MEMORY_BUFFER_DESC` (offset 0, full `size`) that
|
||||
// outlives this synchronous call. `&mut ptr` is a live zero-init out-param the driver writes
|
||||
// the mapped device address into; distinct locals → no aliasing. Wrapper → live table
|
||||
// (context current).
|
||||
let r = unsafe { cuExternalMemoryGetMappedBuffer(&mut ptr, ext, &buf) };
|
||||
if r != 0 {
|
||||
// SAFETY: mapping failed; `ext` is the valid `CUexternalMemory` we imported and
|
||||
// exclusively own. We destroy it exactly once here on the error path (the success path
|
||||
// instead moves it into the returned `ExternalDmabuf`, whose `Drop` destroys it),
|
||||
// releasing the fd the driver took — no double-destroy or use-after-free.
|
||||
unsafe {
|
||||
let _ = cuDestroyExternalMemory(ext);
|
||||
}
|
||||
@@ -962,6 +1087,12 @@ impl ExternalDmabuf {
|
||||
|
||||
impl Drop for ExternalDmabuf {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: this `ExternalDmabuf` only exists after a successful import, so the driver table
|
||||
// is live. It exclusively owns `self.ptr` (the mapped buffer) and `self.ext` (the external
|
||||
// memory), each torn down exactly once here (drop runs once; guarded by `!= 0` / `!null`) —
|
||||
// no double-free or use-after-free. We make the shared context current first because drop
|
||||
// may run off the import thread, and we free the mapped buffer before destroying its
|
||||
// backing external memory. Results ignored (best-effort teardown).
|
||||
unsafe {
|
||||
if let Some(c) = CONTEXT.get() {
|
||||
let _ = cuCtxSetCurrent(c.0);
|
||||
@@ -996,5 +1127,10 @@ pub fn copy_pitched_to_buffer(
|
||||
};
|
||||
// copy_blocking syncs our priority stream before returning, so the copy is complete before the
|
||||
// dmabuf is requeued to the producer.
|
||||
// SAFETY: `copy_blocking` is unsafe (issues a CUDA copy); the caller must have the shared
|
||||
// context current (documented). `&copy` is a live local device→device `CUDA_MEMCPY2D` outliving
|
||||
// the synchronous call: `srcDevice`/`srcPitch` are the caller's live mapped span (e.g. an
|
||||
// `ExternalDmabuf`), `dstDevice`/`dstPitch` are `dst`'s live allocation, `width*4`×`height`
|
||||
// within both. Wrapper → live table.
|
||||
unsafe { copy_blocking(&copy, "cuMemcpy2DAsync_v2(ext->dev)") }
|
||||
}
|
||||
|
||||
@@ -12,6 +12,8 @@
|
||||
//! owned [`DeviceBuffer`] so the dmabuf can be returned to the compositor immediately.
|
||||
|
||||
#![allow(non_upper_case_globals)]
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::cuda::{self, DeviceBuffer};
|
||||
use anyhow::{bail, ensure, Context as _, Result};
|
||||
@@ -415,6 +417,14 @@ impl Nv12Blit {
|
||||
|
||||
impl Drop for Nv12Blit {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: these GL names (textures/FBOs/VAO/programs) were all created by THIS `Nv12Blit`
|
||||
// in `Nv12Blit::new` on the current GL context, which is still current because the owning
|
||||
// `EglImporter` is dropped on its single capture thread (fields drop before
|
||||
// `EglImporter::drop`, which never releases the context). `glDelete*` takes a count + a
|
||||
// pointer to that many names: `&self.y_tex`/`&self.vao` are `&u32` to one live field (n=1);
|
||||
// `[self.y_fbo, self.uv_fbo].as_ptr()` points at a 2-element temporary that lives for the
|
||||
// whole `glDeleteFramebuffers` call (n=2 matches). The symbols dispatch through libGL
|
||||
// (libglvnd) to the driver for the current context. Each name is deleted exactly once.
|
||||
unsafe {
|
||||
glDeleteTextures(1, &self.y_tex);
|
||||
glDeleteTextures(1, &self.uv_tex);
|
||||
@@ -459,7 +469,14 @@ pub struct EglImporter {
|
||||
render_fd: c_int,
|
||||
}
|
||||
|
||||
// The EGL handles are confined to the capture thread; the struct is moved there once.
|
||||
// SAFETY: `EglImporter` owns thread-affine handles — an EGLDisplay/contexts made current on one
|
||||
// thread, a loaded GL proc pointer, a `gbm_device*`, a raw fd, and CUDA-registered GL textures —
|
||||
// none safe to touch concurrently. It is constructed inside `pipewire_thread` on the dedicated
|
||||
// `punktfunk-pipewire` thread, and every method (`import*`, `supported_modifiers`, `Drop`) runs on
|
||||
// that same thread; it is never accessed through a shared `&` from another thread. `Send` asserts
|
||||
// only that transferring *ownership* is sound (needed so the importer can live in the PipeWire
|
||||
// stream's user-data, whose API imposes a `Send` bound) — the live handles are never used
|
||||
// off-thread. `Sync` is deliberately NOT implied.
|
||||
unsafe impl Send for EglImporter {}
|
||||
|
||||
impl EglImporter {
|
||||
@@ -470,16 +487,38 @@ impl EglImporter {
|
||||
// to the same DRM device CUDA-GL interop associates with, which the EGL device platform
|
||||
// did not (cuGraphicsGLRegisterImage rejected device-platform GL textures).
|
||||
let path = std::ffi::CString::new("/dev/dri/renderD128").unwrap();
|
||||
// SAFETY: `path` is a live local `CString` (built from a string with no interior NUL, so it
|
||||
// is NUL-terminated); `path.as_ptr()` is a valid pointer to that buffer which outlives this
|
||||
// synchronous `open`. `open` only reads the path and returns a new fd (or -1); it neither
|
||||
// retains the pointer nor writes through it, so there is no aliasing or lifetime hazard.
|
||||
let render_fd = unsafe { libc::open(path.as_ptr(), libc::O_RDWR | libc::O_CLOEXEC) };
|
||||
ensure!(render_fd >= 0, "open /dev/dri/renderD128 for GBM");
|
||||
// SAFETY: `render_fd` is the live DRM render-node fd just returned by `open` and checked
|
||||
// `>= 0`. `gbm_create_device` (libgbm, linked above) builds a `gbm_device` over that fd and
|
||||
// returns a `*mut gbm_device` (or null); it borrows but does not take ownership of the fd,
|
||||
// which `EglImporter` keeps open and closes only in `Drop` after `gbm_device_destroy`. No
|
||||
// Rust-owned memory is passed, so there is nothing to alias.
|
||||
let gbm = unsafe { gbm_create_device(render_fd) };
|
||||
if gbm.is_null() {
|
||||
// SAFETY: reached only when `gbm_create_device` failed (null) — the fd was not consumed
|
||||
// and no `EglImporter` exists yet to close it again, so this `close` runs exactly once on
|
||||
// the live `render_fd`, releasing it before the error return. No double-close.
|
||||
unsafe { libc::close(render_fd) };
|
||||
anyhow::bail!("gbm_create_device failed");
|
||||
}
|
||||
|
||||
// SAFETY: `Egl::load_required` dlopens the system libEGL and binds its entry points,
|
||||
// trusting that libEGL (libglvnd) is a genuine EGL 1.5 implementation whose core symbols
|
||||
// match the ABI the `khronos_egl` `EGL1_5` bindings declare. No Rust memory is passed; the
|
||||
// returned instance is afterwards used only through the safe `khronos_egl` wrappers.
|
||||
let egl: Egl =
|
||||
unsafe { Egl::load_required() }.context("load libEGL (EGL 1.5 dynamic instance)")?;
|
||||
// SAFETY: `gbm` is the non-null `gbm_device*` created just above (checked), and
|
||||
// `EGL_PLATFORM_GBM_KHR` is exactly the platform enum that pairs with a GBM device as the
|
||||
// native-display handle, so the `gbm as NativeDisplayType` cast hands EGL a valid native
|
||||
// display for the requested platform. `&[egl::ATTRIB_NONE]` is a properly terminated, empty
|
||||
// attribute array borrowed for this synchronous call; EGL only reads it and returns an
|
||||
// `EGLDisplay`, retaining no pointer into Rust memory.
|
||||
let display = unsafe {
|
||||
egl.get_platform_display(
|
||||
EGL_PLATFORM_GBM_KHR,
|
||||
@@ -533,6 +572,13 @@ impl EglImporter {
|
||||
.context("eglCreateContext(OpenGL)")?;
|
||||
egl.make_current(display, None, None, Some(gl_ctx))
|
||||
.context("eglMakeCurrent surfaceless (needs EGL_KHR_surfaceless_context)")?;
|
||||
// SAFETY: the GL context was made current on this thread just above, which `eglGetProcAddress`
|
||||
// requires to return a usable pointer. The non-null (`?`-checked) pointer it returns for
|
||||
// "glEGLImageTargetTexture2DOES" is the driver's implementation of that GL-OES entry point,
|
||||
// whose real ABI is `void(GLenum, GLeglImageOES)` = `(u32, *mut c_void)` `extern "system"`.
|
||||
// `EglImageTargetFn` is declared with exactly that signature, so the transmute only retypes a
|
||||
// same-size, same-ABI thin function pointer (no value/representation change). The function is
|
||||
// present because `EGL_EXT_image_dma_buf_import` was asserted on this display above.
|
||||
let egl_image_target: EglImageTargetFn = unsafe {
|
||||
std::mem::transmute(
|
||||
egl.get_proc_address("glEGLImageTargetTexture2DOES")
|
||||
@@ -543,6 +589,10 @@ impl EglImporter {
|
||||
// Create the shared CUDA context up front so import() is pure hot path.
|
||||
cuda::context().context("create CUDA context")?;
|
||||
|
||||
// SAFETY: `egl::NO_CONTEXT` is EGL's defined sentinel (a null handle) for "no context";
|
||||
// `Context::from_ptr` only stores the handle (it never dereferences it), so wrapping the
|
||||
// null sentinel is sound and yields exactly the `EGL_NO_CONTEXT` value that
|
||||
// `eglCreateImage(EGL_LINUX_DMA_BUF_EXT)` requires as its context argument later.
|
||||
let no_ctx = unsafe { egl::Context::from_ptr(egl::NO_CONTEXT) };
|
||||
tracing::info!(
|
||||
"zero-copy EGL importer ready (GBM platform + GL texture interop, dma_buf_import + modifiers)"
|
||||
@@ -602,8 +652,21 @@ impl EglImporter {
|
||||
let Some(sym) = self.egl.get_proc_address("eglQueryDmaBufModifiersEXT") else {
|
||||
return Vec::new();
|
||||
};
|
||||
// SAFETY: `sym` is the non-null pointer `eglGetProcAddress("eglQueryDmaBufModifiersEXT")`
|
||||
// returned (the `let-else` already bailed on `None`) — the driver's implementation of that
|
||||
// EGL extension entry point. `QueryFn` is declared with that function's exact documented ABI
|
||||
// (`EGLDisplay, EGLint, EGLint, EGLuint64KHR*, EGLBoolean*, EGLint* -> EGLBoolean`), all
|
||||
// `extern "system"`, so the transmute only retypes a same-size, same-ABI thin fn pointer.
|
||||
let query: QueryFn = unsafe { std::mem::transmute(sym) };
|
||||
let dpy = self.display.as_ptr();
|
||||
// SAFETY: `dpy` is this importer's live, initialized `EGLDisplay`; `query` is the proc loaded
|
||||
// just above. The first call passes null out-arrays with `max_modifiers == 0`, which the
|
||||
// extension defines as "write only the count" — it writes solely through `&mut count` (a live
|
||||
// local `i32`). For the second call, `mods`/`ext` are freshly allocated `Vec`s of exactly
|
||||
// `count` elements and `max_modifiers == count`, so the driver writes at most `count`
|
||||
// `u64`/`u32` entries (in bounds) plus the actual count through `&mut n` (a live local). All
|
||||
// four Rust addresses outlive these synchronous calls and alias nothing else. `truncate` only
|
||||
// shrinks, so even a misbehaving `n > count` cannot read out of bounds.
|
||||
unsafe {
|
||||
let mut count: i32 = 0;
|
||||
if query(
|
||||
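The two-call "count, then fill" protocol described in the SAFETY comment above can be sketched against a mock. Everything here is hypothetical: `mock_query` only imitates the shape of a C query such as `eglQueryDmaBufModifiersEXT`, and `query_all` is not the crate's function. The sketch shows why sizing the buffer from the first call, and truncating by the second call's count, keeps all writes and reads in bounds.

```rust
/// Hypothetical stand-in for a C-style query. With `max <= 0` (or a null `out`)
/// it only reports how many entries exist; otherwise it writes at most `max`
/// entries into `out` and reports how many it wrote.
///
/// # Safety
/// When `max > 0`, `out` must point to at least `max` writable `u64`s.
unsafe fn mock_query(available: &[u64], max: i32, out: *mut u64, n: &mut i32) -> bool {
    if max <= 0 || out.is_null() {
        *n = available.len() as i32;
        return true;
    }
    let write = available.len().min(max as usize);
    // SAFETY: the caller guarantees `max` writable slots and `write <= max`.
    unsafe { std::ptr::copy_nonoverlapping(available.as_ptr(), out, write) };
    *n = write as i32;
    true
}

/// Fetches every entry using the count-then-fill protocol.
pub fn query_all(available: &[u64]) -> Vec<u64> {
    let mut count: i32 = 0;
    // SAFETY: `max == 0` with a null array means "write only the count".
    let ok = unsafe { mock_query(available, 0, std::ptr::null_mut(), &mut count) };
    if !ok || count <= 0 {
        return Vec::new();
    }
    let mut mods = vec![0u64; count as usize];
    let mut n: i32 = 0;
    // SAFETY: `mods` holds exactly `count` elements and `max == count`.
    let ok = unsafe { mock_query(available, count, mods.as_mut_ptr(), &mut n) };
    if !ok {
        return Vec::new();
    }
    // `truncate` only shrinks, so a bogus `n > count` cannot expose extra memory.
    mods.truncate(n.max(0) as usize);
    mods
}
```

Using `truncate` rather than `set_len` is the defensive part: a count larger than the allocation is simply ignored.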
@@ -699,6 +762,10 @@ impl EglImporter {
|
||||
]);
|
||||
}
|
||||
attrs.push(egl::ATTRIB_NONE);
|
||||
// SAFETY: `eglCreateImage(EGL_LINUX_DMA_BUF_EXT, ...)` mandates a NULL `EGLClientBuffer`
|
||||
// (the source is described entirely by the attribute list built above), so wrapping
|
||||
// `null_mut()` is the required value. `from_ptr` only stores the pointer without
|
||||
// dereferencing it, so constructing it from null is sound.
|
||||
let client = unsafe { egl::ClientBuffer::from_ptr(std::ptr::null_mut()) };
|
||||
let image = self
|
||||
.egl
|
||||
@@ -733,11 +800,21 @@ impl EglImporter {
|
||||
) -> Result<DeviceBuffer> {
|
||||
cuda::make_current()?;
|
||||
if self.blit.as_ref().map(|b| (b.width, b.height)) != Some((width, height)) {
|
||||
// SAFETY: `GlBlit::new` requires the GL context current on the calling thread and a
|
||||
// current CUDA context. Both hold: this runs on the capture thread where
|
||||
// `EglImporter::new` made the GL context current and never released it, and
|
||||
// `cuda::make_current()?` ran at the top of this function. `width`/`height` are plain
|
||||
// `Copy` frame dimensions.
|
||||
self.blit = Some(unsafe { GlBlit::new(width, height)? });
|
||||
}
|
||||
let egl_image_target = self.egl_image_target;
|
||||
let blit = self.blit.as_mut().unwrap();
|
||||
// SAFETY: GL + CUDA contexts current on this thread; `image` is a valid EGLImage.
|
||||
// SAFETY: `GlBlit::run` requires a current GL context and a valid `EGLImage`. The GL context
|
||||
// is current on this capture thread (made current in `EglImporter::new`, never released) and
|
||||
// `cuda::make_current()` ran above; `egl_image_target` is the `glEGLImageTargetTexture2DOES`
|
||||
// pointer loaded in `new`; `image` is the raw handle of the live `EGLImage` that
|
||||
// `import_inner` created with `eglCreateImage` and destroys only AFTER this call returns, so
|
||||
// it stays valid for the whole synchronous `run`.
|
||||
unsafe { blit.run(egl_image_target, image)? };
|
||||
// Persistent registration (mapped per frame) + a pooled buffer — no per-frame
|
||||
// cuGraphicsGLRegisterImage / cuMemAllocPitch.
|
||||
@@ -757,11 +834,21 @@ impl EglImporter {
|
||||
) -> Result<DeviceBuffer> {
|
||||
cuda::make_current()?;
|
||||
if self.nv12_blit.as_ref().map(|b| (b.width, b.height)) != Some((width, height)) {
|
||||
// SAFETY: `Nv12Blit::new` requires the GL context current on the calling thread and a
|
||||
// current CUDA context. Both hold: this runs on the capture thread where
|
||||
// `EglImporter::new` made the GL context current and never released it, and
|
||||
// `cuda::make_current()?` ran at the top of this function. `width`/`height` are plain
|
||||
// `Copy` frame dimensions.
|
||||
self.nv12_blit = Some(unsafe { Nv12Blit::new(width, height)? });
|
||||
}
|
||||
let egl_image_target = self.egl_image_target;
|
||||
let blit = self.nv12_blit.as_mut().unwrap();
|
||||
// SAFETY: GL + CUDA contexts current on this thread; `image` is a valid EGLImage.
|
||||
// SAFETY: `Nv12Blit::run` requires a current GL context and a valid `EGLImage`. The GL
|
||||
// context is current on this capture thread (made current in `EglImporter::new`, never
|
||||
// released) and `cuda::make_current()` ran above; `egl_image_target` is the
|
||||
// `glEGLImageTargetTexture2DOES` pointer loaded in `new`; `image` is the raw handle of the
|
||||
// live `EGLImage` that `import_inner` created with `eglCreateImage` and destroys only AFTER
|
||||
// this call returns, so it stays valid for the whole synchronous `run`.
|
||||
unsafe { blit.run(egl_image_target, image)? };
|
||||
let dst = blit.pool.get()?;
|
||||
cuda::copy_mapped_nv12(&mut blit.y_registered, &mut blit.uv_registered, &dst)?;
|
||||
@@ -787,9 +874,22 @@ impl EglImporter {
|
||||
);
|
||||
cuda::make_current()?;
|
||||
if self.nv12_blit.as_ref().map(|b| (b.width, b.height)) != Some((width, height)) {
|
||||
// SAFETY: `Nv12Blit::new` requires the GL context current on the calling thread and a
|
||||
// current CUDA context. Both hold: this self-test path runs on the thread that owns this
|
||||
// `EglImporter` with its GL context current, and `cuda::make_current()?` ran just above.
|
||||
// `width`/`height` are plain `Copy` scalars.
|
||||
self.nv12_blit = Some(unsafe { Nv12Blit::new(width, height)? });
|
||||
}
|
||||
let blit = self.nv12_blit.as_mut().unwrap();
|
||||
// SAFETY: runs on the thread that owns this `EglImporter` with its GL context current.
|
||||
// `blit.src_tex` is a texture this `Nv12Blit` owns; `glTexStorage2D` allocates immutable
|
||||
// RGBA8 storage exactly once (guarded by `test_src_storage`) sized `width×height`.
|
||||
// `glTexSubImage2D` then uploads exactly `width×height` RGBA8 texels, reading `width*height*4`
|
||||
// bytes from `rgba.as_ptr()`; the caller already asserted `rgba.len() == width*height*4`, rows
|
||||
// are `width*4` bytes (a multiple of the default 4-byte unpack alignment, so no row-padding
|
||||
// over-read), and `rgba` is a live borrow that outlives this synchronous upload. `run_passes`
|
||||
// then needs only the current GL context (no further Rust pointers). All GL names are this
|
||||
// blit's own, alias no other live object, and nothing is retained past the calls.
|
||||
unsafe {
|
||||
// Upload the host RGBA into `src_tex` (an immutable GL_RGBA8 backing must exist first;
|
||||
// the live path never allocates it — it retargets `src_tex` via EGLImage instead).
|
||||
@@ -824,9 +924,16 @@ impl EglImporter {
|
||||
impl Drop for EglImporter {
|
||||
fn drop(&mut self) {
|
||||
if !self.gbm.is_null() {
|
||||
// SAFETY: `self.gbm` is the non-null `gbm_device*` from `gbm_create_device` in `new`
|
||||
// (checked non-null here), owned exclusively by this `EglImporter` and destroyed exactly
|
||||
// once (in `Drop`). It is freed BEFORE `render_fd` is closed below — the correct order,
|
||||
// since the device borrowed that fd for its lifetime.
|
||||
unsafe { gbm_device_destroy(self.gbm) };
|
||||
}
|
||||
if self.render_fd >= 0 {
|
||||
// SAFETY: `self.render_fd` is the fd `open` returned in `new` (checked `>= 0`), owned
|
||||
// exclusively by this `EglImporter`; this `close` runs exactly once, after the gbm device
|
||||
// that borrowed it has been destroyed. No double-close or use-after-close.
|
||||
unsafe { libc::close(self.render_fd) };
|
||||
}
|
||||
}
|
||||
|
||||
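The teardown order argued in the `Drop` comments above (destroy the device first, then close the fd it borrowed) can be modelled with a logging sketch. `Importer`, `Log`, and `teardown_order` are hypothetical; boolean flags stand in for the `gbm_device*` and the render-node fd, and no real resources are touched.

```rust
use std::cell::RefCell;
use std::rc::Rc;

type Log = Rc<RefCell<Vec<&'static str>>>;

/// Hypothetical owner of a "device" built over a "descriptor" it also owns.
pub struct Importer {
    device_live: bool,
    fd_open: bool,
    log: Log,
}

impl Importer {
    pub fn new(log: Log) -> Self {
        Importer { device_live: true, fd_open: true, log }
    }
}

impl Drop for Importer {
    fn drop(&mut self) {
        // The dependent resource goes first: the device borrowed the descriptor.
        if self.device_live {
            self.log.borrow_mut().push("destroy device");
            self.device_live = false;
        }
        // Only then release the descriptor the device was built over.
        if self.fd_open {
            self.log.borrow_mut().push("close fd");
            self.fd_open = false;
        }
    }
}

/// Drops one importer and returns the order in which its resources were released.
pub fn teardown_order() -> Vec<&'static str> {
    let log: Log = Rc::new(RefCell::new(Vec::new()));
    drop(Importer::new(Rc::clone(&log)));
    let order = log.borrow().clone();
    order
}
```

Sequencing both releases inside one `Drop` body makes the order explicit instead of relying on field declaration order.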
@@ -16,6 +16,9 @@
|
||||
//! a stream's life). Falls back cleanly: any init/import error disables the importer and the
|
||||
//! CPU mmap path takes over.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::cuda::{self, DeviceBuffer};
|
||||
use anyhow::{anyhow, bail, Context as _, Result};
|
||||
use ash::vk;
|
||||
@@ -51,12 +54,27 @@ pub struct VkBridge {
|
||||
dst: Option<DstBuf>,
|
||||
}
|
||||
|
||||
// Confined to the capture thread; moved there once.
|
||||
// SAFETY: `VkBridge` owns ash Vulkan handles (instance/device/queue/command pool+buffer/fence), a
|
||||
// CUDA external-memory mapping, and an fd→buffer cache — none `Sync`, and a single queue +
|
||||
// command buffer must be externally synchronized. It is created inside `EglImporter::import_linear`
|
||||
// on the dedicated `punktfunk-pipewire` capture thread and every method (`import_linear`, `Drop`)
|
||||
// runs on that thread; it is never shared via `&` across threads. `Send` asserts only that
|
||||
// transferring ownership is sound (so the bridge can live inside the `Send` `EglImporter`); the live
|
||||
// handles are never touched off-thread, and `Sync` is deliberately NOT implied.
|
||||
unsafe impl Send for VkBridge {}
|
||||
|
||||
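The `Send`-but-not-`Sync` argument in the SAFETY note above can be illustrated with a minimal sketch. `RawOwner` and `run_on_worker` are hypothetical stand-ins, not part of this change: a type that owns a raw pointer is `!Send` by default, and the `unsafe impl Send` asserts only that handing the whole value to one other thread is sound.

```rust
use std::ptr::NonNull;
use std::thread;

/// Hypothetical stand-in for a handle-owning type (this is NOT the real `VkBridge`).
struct RawOwner {
    // `NonNull<T>` is neither `Send` nor `Sync`, so `RawOwner` is not either by default.
    handle: NonNull<u64>,
}

impl RawOwner {
    fn new(value: u64) -> Self {
        // Turn a Box into an owned raw pointer; it is reclaimed in `Drop`.
        let ptr = Box::into_raw(Box::new(value));
        RawOwner {
            handle: NonNull::new(ptr).expect("Box::into_raw never returns null"),
        }
    }

    fn bump(&mut self) -> u64 {
        // SAFETY: `handle` points at the live allocation this value owns exclusively, and
        // `&mut self` rules out any concurrent access through this owner.
        unsafe {
            *self.handle.as_ptr() += 1;
            *self.handle.as_ptr()
        }
    }
}

impl Drop for RawOwner {
    fn drop(&mut self) {
        // SAFETY: the pointer came from `Box::into_raw` in `new` and is freed exactly once, here.
        unsafe { drop(Box::from_raw(self.handle.as_ptr())) };
    }
}

// SAFETY: only ownership transfer is asserted. The allocation has a single owner, no `&RawOwner`
// is ever shared across threads, and `Sync` is deliberately not implemented.
unsafe impl Send for RawOwner {}

/// Moves the owner to a worker thread, uses it there, and drops it there.
fn run_on_worker(mut owner: RawOwner) -> u64 {
    thread::spawn(move || owner.bump())
        .join()
        .expect("worker thread panicked")
}
```

Calling `run_on_worker(RawOwner::new(41))` moves the value once and never touches it from the spawning thread again, which is the same discipline the comment describes for the capture thread.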
impl VkBridge {
|
||||
/// Bring up Vulkan on the NVIDIA GPU with the external-memory extensions.
|
||||
pub fn new() -> Result<VkBridge> {
|
||||
// SAFETY: standard ash bring-up — every call is `unsafe` only because ash cannot statically
|
||||
// verify Vulkan handle/CreateInfo validity. `ash::Entry::load` dlopens a real system
|
||||
// libvulkan. Each `*CreateInfo`/`AllocateInfo` is built by ash's builders from locals (`app`,
|
||||
// `exts`, `prio`, `qci`, and the inline infos) that all live for the duration of the
|
||||
// synchronous `create_*`/`enumerate_*` call that reads them — in particular the
|
||||
// `enabled_extension_names(&exts)` and `queue_priorities(&prio)` borrows outlive their calls.
|
||||
// Every handle passed (`instance`, `phys`, `device`, `qf`, `cmd_pool`) was just created and
|
||||
// checked via `?`/`ok_or_else` in this same function, so no invalid handle is ever used. This
|
||||
// constructor shares nothing across threads.
|
||||
unsafe {
|
||||
let entry = ash::Entry::load().context("load libvulkan")?;
|
||||
let app = vk::ApplicationInfo::default().api_version(vk::API_VERSION_1_1);
|
||||
@@ -294,6 +312,19 @@ impl VkBridge {
|
||||
height: u32,
|
||||
pool: &cuda::BufferPool,
|
||||
) -> Result<DeviceBuffer> {
|
||||
// SAFETY: `fd` is the live dmabuf fd handed in by the caller (borrowed; `import_src` dup's it
|
||||
// internally and Vulkan owns the dup). `libc::lseek` only queries the fd's size. The unsafe
|
||||
// `import_src`/`ensure_dst` are called with a valid fd and a checked size. The bounds are
|
||||
// proven: `import_src` asserts `size >= span` (so the cached `src_size >= span`),
|
||||
// `copy_size = src_size.min(span)`, and `ensure_dst(copy_size)` makes `dst` at least
|
||||
// `copy_size` — so the GPU `cmd_copy_buffer` of `copy_size` bytes reads/writes within both
|
||||
// buffers, and the later CUDA pitched copy reading `[offset, span)` from `dst.cuda.ptr` (where
// `span = offset + stride*height <= copy_size`) stays inside the freshly-copied region. The
|
||||
// `*Info`/`region`/`cmds`/`submit` are locals that outlive the synchronous calls reading them.
|
||||
// `cmd`/`queue`/`fence` are this bridge's own handles, used on this single thread only. The
|
||||
// host-side `wait_for_fences` fully retires the Vulkan copy BEFORE CUDA reads the shared
|
||||
// memory, so there is no GPU write/read data race. `dst` is an `&self.dst` shared borrow that
|
||||
// does not alias the `&self.device` calls.
|
||||
unsafe {
|
||||
let span = offset as u64 + stride as u64 * height as u64;
|
||||
if !self.src_cache.contains_key(&fd) {
|
||||
@@ -347,6 +378,15 @@ impl VkBridge {
|
||||
|
||||
impl Drop for VkBridge {
|
||||
fn drop(&mut self) {
|
||||
// SAFETY: runs once when the bridge is dropped on its owning capture thread.
|
||||
// `device_wait_idle` first drains all in-flight GPU work, so no queued command still
|
||||
// references these objects. Every handle freed (the `src_cache` buffers+memories, the `dst`
|
||||
// buffer+memory, `fence`, `cmd_pool`, `device`, `instance`) was created by this `VkBridge`
|
||||
// and owned exclusively by it, so each `destroy_*`/`free_*` runs exactly once with no
|
||||
// double-free, in dependency order (child objects before `device`, `device` before
|
||||
// `instance`). `dst.cuda` is dropped after `free_memory`, which is safe because CUDA holds
|
||||
// its own dup'd OPAQUE_FD reference to the underlying allocation. No other thread touches
|
||||
// these handles.
|
||||
unsafe {
|
||||
let _ = self.device.device_wait_idle();
|
||||
for (_, s) in self.src_cache.drain() {
|
||||
|
||||
@@ -13,6 +13,10 @@
|
||||
|
||||
// Scaffold: trait methods and config paths are defined ahead of their backends.
|
||||
#![allow(dead_code)]
|
||||
// Unsafe-proof program: every `unsafe {}` / `unsafe impl` in the crate must carry a `// SAFETY:`
// proof of why it is sound. This crate-root deny is the permanent, catch-all gate (it also covers
// any future module); individual files keep their own `#![deny(...)]` as belt-and-suspenders.
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
mod audio;
|
||||
mod capture;
|
||||
@@ -46,6 +50,7 @@ mod service;
|
||||
mod session_plan;
|
||||
mod session_tuning;
|
||||
mod spike;
|
||||
mod stats_recorder;
|
||||
mod vdisplay;
|
||||
#[cfg(target_os = "windows")]
|
||||
#[path = "windows/wgc_helper.rs"]
|
||||
@@ -385,7 +390,7 @@ fn real_main() -> Result<()> {
|
||||
}
|
||||
// USER-session WGC helper (Windows two-process secure-desktop design): capture the EXISTING
|
||||
// SudoVDA via WGC + NVENC, stream AUs on stdout to the SYSTEM host. Spawned by the host
|
||||
// (CreateProcessAsUser), not run by hand. See docs/windows-secure-desktop.md.
|
||||
// (CreateProcessAsUser), not run by hand. See design/windows-secure-desktop.md.
|
||||
#[cfg(target_os = "windows")]
|
||||
Some("wgc-helper") => {
|
||||
let get = |flag: &str| {
|
||||
@@ -700,7 +705,7 @@ SPIKE OPTIONS:
|
||||
|
||||
NOTES:
|
||||
'portal' needs headless Sway + xdg-desktop-portal-wlr running in this session
|
||||
(see docs/linux-setup.md). 'synthetic' needs no capture session and always runs.
|
||||
(see design/linux-setup.md). 'synthetic' needs no capture session and always runs.
|
||||
Encoded AUs are written to a playable file AND (unless --no-loopback) fed through a
|
||||
punktfunk_core host→client loopback that reassembles and byte-verifies each one.
|
||||
Both 'serve' and 'punktfunk1-host' advertise the native service over mDNS
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
//! The API is versioned under `/api/v1` and described by an OpenAPI 3.1 document generated
|
||||
//! at compile time with `utoipa` — `punktfunk-host openapi` prints it for client codegen, the
|
||||
//! running server serves it at `/api/v1/openapi.json` plus interactive docs at `/api/docs`,
|
||||
//! and a copy is checked in at `docs/api/openapi.json` (a test fails if it drifts, like the
|
||||
//! and a copy is checked in at `api/openapi.json` (a test fails if it drifts, like the
|
||||
//! cbindgen header).
|
||||
//!
|
||||
//! Security: binds loopback by default, serves HTTPS with the host's identity cert, and requires
|
||||
@@ -20,6 +20,7 @@ use crate::gamestream::{
|
||||
tls::{serve_https, PeerCertFingerprint},
|
||||
AppState, APP_VERSION, AUDIO_PORT, CONTROL_PORT, GFE_VERSION, RTSP_PORT, VIDEO_PORT,
|
||||
};
|
||||
use crate::stats_recorder::{Capture, CaptureMeta, StatsStatus};
|
||||
use anyhow::{Context, Result};
|
||||
use axum::{
|
||||
extract::{Path, Request, State},
|
||||
@@ -66,6 +67,9 @@ struct MgmtState {
|
||||
/// Native (punktfunk/1) pairing — shared with the QUIC host when the unified `serve --native`
|
||||
/// runs it. `None` ⇒ GameStream-only host (the native endpoints report `enabled: false`).
|
||||
native: Option<Arc<crate::native_pairing::NativePairing>>,
|
||||
/// Shared streaming-stats recorder — the same handle the streaming loops emit into, so an
|
||||
/// operator can arm/stop a capture here and review/list/delete saved recordings.
|
||||
stats: Arc<crate::stats_recorder::StatsRecorder>,
|
||||
token: Option<String>,
|
||||
/// The port we serve on, echoed in [`PortMap`] so a client can persist a full endpoint map.
|
||||
port: u16,
|
||||
@@ -77,6 +81,7 @@ pub async fn run(
|
||||
state: Arc<AppState>,
|
||||
opts: Options,
|
||||
native: Option<Arc<crate::native_pairing::NativePairing>>,
|
||||
stats: Arc<crate::stats_recorder::StatsRecorder>,
|
||||
) -> Result<()> {
|
||||
// The mgmt API is HTTPS + token-authenticated ALWAYS (even on loopback): `parse_serve`
|
||||
// guarantees a token (CLI flag / env / persisted ~/.config/punktfunk/mgmt-token / generated).
|
||||
@@ -100,7 +105,7 @@ pub async fn run(
|
||||
auth = "mTLS (paired cert) or bearer (required)",
|
||||
"management API listening over HTTPS (docs at /api/docs, spec at /api/v1/openapi.json)"
|
||||
);
|
||||
let app = app(state, Some(token), opts.bind.port(), native);
|
||||
let app = app(state, Some(token), opts.bind.port(), native, stats);
|
||||
serve_https(opts.bind, app, tls).await
|
||||
}
|
||||
|
||||
@@ -110,10 +115,12 @@ fn app(
|
||||
token: Option<String>,
|
||||
port: u16,
|
||||
native: Option<Arc<crate::native_pairing::NativePairing>>,
|
||||
stats: Arc<crate::stats_recorder::StatsRecorder>,
|
||||
) -> Router {
|
||||
let shared = Arc::new(MgmtState {
|
||||
app: state,
|
||||
native,
|
||||
stats,
|
||||
token,
|
||||
port,
|
||||
});
|
||||
@@ -158,13 +165,19 @@ fn api_router_parts() -> (Router<Arc<MgmtState>>, utoipa::openapi::OpenApi) {
|
||||
.routes(routes!(request_idr))
|
||||
.routes(routes!(get_library))
|
||||
.routes(routes!(create_custom_game))
|
||||
.routes(routes!(update_custom_game, delete_custom_game)),
|
||||
.routes(routes!(update_custom_game, delete_custom_game))
|
||||
.routes(routes!(stats_capture_start))
|
||||
.routes(routes!(stats_capture_stop))
|
||||
.routes(routes!(stats_capture_status))
|
||||
.routes(routes!(stats_capture_live))
|
||||
.routes(routes!(stats_recordings_list))
|
||||
.routes(routes!(stats_recording_get, stats_recording_delete)),
|
||||
)
|
||||
.split_for_parts()
|
||||
}
|
||||
|
||||
/// The OpenAPI document as pretty JSON — what `punktfunk-host openapi` prints and what is
|
||||
/// checked in at `docs/api/openapi.json` for client codegen.
|
||||
/// checked in at `api/openapi.json` for client codegen.
|
||||
pub fn openapi_json() -> String {
|
||||
let (_, api) = api_router_parts();
|
||||
let mut json = api.to_pretty_json().expect("serialize OpenAPI document");
|
||||
@@ -190,6 +203,7 @@ pub fn openapi_json() -> String {
|
||||
(name = "native", description = "Native punktfunk/1 pairing: arm a window, display the host PIN, manage paired devices"),
|
||||
(name = "session", description = "Active streaming session control"),
|
||||
(name = "library", description = "Game library: installed-store titles (Steam) plus user-curated custom entries"),
|
||||
(name = "stats", description = "Streaming performance-stats capture: arm/stop a recording, read the live + saved time-series for graphing"),
|
||||
)
|
||||
)]
|
||||
struct ApiDoc;
|
||||
@@ -1218,6 +1232,185 @@ async fn delete_custom_game(Path(id): Path<String>) -> Response {
|
||||
}
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------------------
|
||||
// Streaming stats capture (design/stats-capture-plan.md §2)
|
||||
// ---------------------------------------------------------------------------------------
|
||||
|
||||
/// Start a stats capture
|
||||
///
|
||||
/// Arms a new performance-stats capture. Idempotent: if a capture is already running this returns
|
||||
/// the current status unchanged. While armed, the streaming loops emit aggregated samples (~ every
|
||||
/// 1–2 s) into the in-progress capture, readable live via `GET /stats/capture/live`.
|
||||
#[utoipa::path(
|
||||
post,
|
||||
path = "/stats/capture/start",
|
||||
tag = "stats",
|
||||
operation_id = "statsCaptureStart",
|
||||
responses(
|
||||
(status = OK, description = "Capture armed (or already running)", body = StatsStatus),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_capture_start(State(st): State<Arc<MgmtState>>) -> Json<StatsStatus> {
|
||||
let status = st.stats.start();
|
||||
tracing::info!(
|
||||
started_unix_ms = status.started_unix_ms,
|
||||
"management API: stats capture armed"
|
||||
);
|
||||
Json(status)
|
||||
}
|
||||
|
||||
/// Stop the stats capture
|
||||
///
|
||||
/// Disarms the in-progress capture and writes it to disk atomically, returning its summary. If
|
||||
/// nothing was recording, returns `204 No Content`.
|
||||
#[utoipa::path(
|
||||
post,
|
||||
path = "/stats/capture/stop",
|
||||
tag = "stats",
|
||||
operation_id = "statsCaptureStop",
|
||||
responses(
|
||||
(status = OK, description = "Capture stopped and saved", body = CaptureMeta),
|
||||
(status = NO_CONTENT, description = "Nothing was recording"),
|
||||
(status = INTERNAL_SERVER_ERROR, description = "Could not write the recording to disk", body = ApiError),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_capture_stop(State(st): State<Arc<MgmtState>>) -> Response {
|
||||
match st.stats.stop() {
|
||||
Ok(Some(meta)) => {
|
||||
tracing::info!(id = %meta.id, samples = meta.sample_count, "management API: stats capture saved");
|
||||
(StatusCode::OK, Json(meta)).into_response()
|
||||
}
|
||||
Ok(None) => StatusCode::NO_CONTENT.into_response(),
|
||||
Err(e) => api_error(
|
||||
StatusCode::INTERNAL_SERVER_ERROR,
|
||||
&format!("could not save capture: {e}"),
|
||||
),
|
||||
}
|
||||
}
|
||||
|
||||
/// Stats capture status
|
||||
///
|
||||
/// Whether a capture is armed, its sample count, and start time. Poll this (e.g. every 2 s) to
|
||||
/// drive the capture-control UI.
|
||||
#[utoipa::path(
|
||||
get,
|
||||
path = "/stats/capture/status",
|
||||
tag = "stats",
|
||||
operation_id = "statsCaptureStatus",
|
||||
responses(
|
||||
(status = OK, description = "In-progress capture status (idle when not armed)", body = StatsStatus),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_capture_status(State(st): State<Arc<MgmtState>>) -> Json<StatsStatus> {
|
||||
Json(st.stats.status())
|
||||
}
|
||||
|
||||
/// Live in-progress capture
///
/// The full sample time-series of the capture currently recording, for live graphing. `404` when
/// nothing is armed.
#[utoipa::path(
    get,
    path = "/stats/capture/live",
    tag = "stats",
    operation_id = "statsCaptureLive",
    responses(
        (status = OK, description = "The in-progress capture (meta + samples so far)", body = Capture),
        (status = NOT_FOUND, description = "No capture is currently recording", body = ApiError),
        (status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
    )
)]
async fn stats_capture_live(State(st): State<Arc<MgmtState>>) -> Response {
    match st.stats.live_snapshot() {
        Some(capture) => Json(capture).into_response(),
        None => api_error(StatusCode::NOT_FOUND, "no capture is currently recording"),
    }
}
|
||||
/// List saved recordings
|
||||
///
|
||||
/// Every saved capture's summary (the `meta` head only — not the sample body), newest first.
|
||||
#[utoipa::path(
|
||||
get,
|
||||
path = "/stats/recordings",
|
||||
tag = "stats",
|
||||
operation_id = "statsRecordingsList",
|
||||
responses(
|
||||
(status = OK, description = "Saved capture summaries, newest first", body = [CaptureMeta]),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_recordings_list(State(st): State<Arc<MgmtState>>) -> Json<Vec<CaptureMeta>> {
|
||||
Json(st.stats.list())
|
||||
}
|
||||
|
||||
/// Get a saved recording
|
||||
///
|
||||
/// The full capture (meta + samples) for `id`, for graphing or download.
|
||||
#[utoipa::path(
|
||||
get,
|
||||
path = "/stats/recordings/{id}",
|
||||
tag = "stats",
|
||||
operation_id = "statsRecordingGet",
|
||||
params(("id" = String, Path, description = "The recording id (its filename stem)")),
|
||||
responses(
|
||||
(status = OK, description = "The full capture", body = Capture),
|
||||
(status = NOT_FOUND, description = "No recording with that id", body = ApiError),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
(status = INTERNAL_SERVER_ERROR, description = "The recording file is unreadable", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_recording_get(State(st): State<Arc<MgmtState>>, Path(id): Path<String>) -> Response {
|
||||
match st.stats.load(&id) {
|
||||
Ok(capture) => Json(capture).into_response(),
|
||||
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
|
||||
api_error(StatusCode::NOT_FOUND, "no recording with that id")
|
||||
}
|
||||
Err(e) => api_error(
|
||||
StatusCode::INTERNAL_SERVER_ERROR,
|
||||
&format!("could not read recording: {e}"),
|
||||
),
|
||||
}
|
||||
}
|
||||
|
||||
/// Delete a saved recording
|
||||
///
|
||||
/// Removes the recording `id` from disk. `404` if there is no such recording.
|
||||
#[utoipa::path(
|
||||
delete,
|
||||
path = "/stats/recordings/{id}",
|
||||
tag = "stats",
|
||||
operation_id = "statsRecordingDelete",
|
||||
params(("id" = String, Path, description = "The recording id (its filename stem)")),
|
||||
responses(
|
||||
(status = NO_CONTENT, description = "Recording deleted"),
|
||||
(status = NOT_FOUND, description = "No recording with that id", body = ApiError),
|
||||
(status = UNAUTHORIZED, description = "Missing or invalid bearer token", body = ApiError),
|
||||
(status = INTERNAL_SERVER_ERROR, description = "Could not delete the recording", body = ApiError),
|
||||
)
|
||||
)]
|
||||
async fn stats_recording_delete(
|
||||
State(st): State<Arc<MgmtState>>,
|
||||
Path(id): Path<String>,
|
||||
) -> Response {
|
||||
match st.stats.delete(&id) {
|
||||
Ok(()) => {
|
||||
tracing::info!(id, "management API: recording deleted");
|
||||
StatusCode::NO_CONTENT.into_response()
|
||||
}
|
||||
Err(e) if e.kind() == std::io::ErrorKind::NotFound => {
|
||||
api_error(StatusCode::NOT_FOUND, "no recording with that id")
|
||||
}
|
||||
Err(e) => api_error(
|
||||
StatusCode::INTERNAL_SERVER_ERROR,
|
||||
&format!("could not delete recording: {e}"),
|
||||
),
|
||||
}
|
||||
}
|
||||
|
||||
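The start/stop contract documented on the handlers above (start is idempotent; stop yields a summary, or nothing when idle) can be sketched with the standard library only. `RecorderSketch`, `Status`, and `stop_status_code` are hypothetical names for illustration; the real `StatsRecorder` is not shown in this diff and may differ.

```rust
use std::sync::Mutex;

/// Hypothetical status snapshot (loosely modelled on the fields the handlers log).
#[derive(Debug, Clone, PartialEq)]
struct Status {
    started_unix_ms: u64,
    sample_count: usize,
}

/// Hypothetical recorder: `None` means idle, `Some` holds the start time and the samples so far.
#[derive(Default)]
struct RecorderSketch {
    active: Mutex<Option<(u64, Vec<f64>)>>,
}

impl RecorderSketch {
    /// Idempotent arm: a second call while armed returns the existing capture's status unchanged.
    fn start(&self, now_ms: u64) -> Status {
        let mut guard = self.active.lock().unwrap();
        let (started, samples) = guard.get_or_insert_with(|| (now_ms, Vec::new()));
        Status {
            started_unix_ms: *started,
            sample_count: samples.len(),
        }
    }

    /// Samples are recorded only while armed; otherwise the call is a no-op.
    fn push(&self, sample: f64) {
        if let Some((_, samples)) = self.active.lock().unwrap().as_mut() {
            samples.push(sample);
        }
    }

    /// Disarm and return the sample count, or `None` when nothing was recording.
    fn stop(&self) -> Option<usize> {
        self.active.lock().unwrap().take().map(|(_, s)| s.len())
    }
}

/// Maps the stop outcome to the documented HTTP statuses: 200 with a body, or 204 when idle.
fn stop_status_code(outcome: Option<usize>) -> u16 {
    match outcome {
        Some(_) => 200,
        None => 204,
    }
}
```

Keeping the idle case as `None` rather than an error is what lets the stop handler answer `204 No Content` without treating a redundant stop as a failure.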
// ---------------------------------------------------------------------------------------
|
||||
// Tests
|
||||
// ---------------------------------------------------------------------------------------
|
||||
@@ -1231,6 +1424,15 @@ mod tests {
|
||||
use std::net::{IpAddr, Ipv4Addr};
|
||||
use tower::ServiceExt;
|
||||
|
||||
/// A throwaway stats recorder rooted in a unique temp dir (never touches the real config dir).
fn test_stats() -> Arc<crate::stats_recorder::StatsRecorder> {
    crate::stats_recorder::StatsRecorder::new(std::env::temp_dir().join(format!(
        "pf-mgmt-stats-{}-{:p}",
        std::process::id(),
        &0u8 as *const u8
    )))
}
|
||||
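One caveat about the helper above: `&0u8` is likely a promoted constant, so its `{:p}` address may be identical on every call within a process, which would make the directory unique per process rather than per call. If per-call uniqueness is wanted, a process-wide counter is one option. This is a sketch only; `unique_temp_dir` and `NEXT_DIR` are hypothetical names, not part of this change.

```rust
use std::path::PathBuf;
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide counter; each call to `unique_temp_dir` takes the next value.
static NEXT_DIR: AtomicU64 = AtomicU64::new(0);

/// Builds a temp-dir path that differs per process (pid) and per call (counter).
/// The directory itself is not created here.
fn unique_temp_dir(prefix: &str) -> PathBuf {
    let n = NEXT_DIR.fetch_add(1, Ordering::Relaxed);
    std::env::temp_dir().join(format!("{prefix}-{}-{n}", std::process::id()))
}
```

With this shape, two tests running in the same process get different roots, so saved recordings from one cannot show up in another's listing.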
fn test_state() -> Arc<AppState> {
|
||||
let host = Host {
|
||||
hostname: "test-host".into(),
|
||||
@@ -1240,18 +1442,20 @@ mod tests {
|
||||
https_port: HTTPS_PORT,
|
||||
};
|
||||
let identity = ServerIdentity::ephemeral().expect("ephemeral identity");
|
||||
Arc::new(AppState::new(host, identity))
|
||||
Arc::new(AppState::new(host, identity, test_stats()))
|
||||
}
|
||||
|
||||
// The mgmt API now always requires auth, so the router always has a token. A test that passes
|
||||
// `None` gets the default "test-secret" (and `send` auto-attaches the matching bearer); a test
|
||||
// that passes an explicit token exercises a mismatch (e.g. `bearer_token_is_enforced`).
|
||||
fn test_app(state: Arc<AppState>, token: Option<&str>) -> Router {
|
||||
let stats = state.stats.clone();
|
||||
app(
|
||||
state,
|
||||
Some(token.unwrap_or("test-secret").to_string()),
|
||||
DEFAULT_PORT,
|
||||
None,
|
||||
stats,
|
||||
)
|
||||
}
|
||||
|
||||
@@ -1261,11 +1465,13 @@ mod tests {
|
||||
) -> Router {
|
||||
// Auth required always; the paired-cert tests inject a fingerprint (cert branch wins), the
|
||||
// rest authenticate via the `send`-attached default bearer.
|
||||
let stats = state.stats.clone();
|
||||
app(
|
||||
state,
|
||||
Some("test-secret".to_string()),
|
||||
DEFAULT_PORT,
|
||||
Some(np),
|
||||
stats,
|
||||
)
|
||||
}
|
||||
|
||||
@@ -1580,7 +1786,9 @@ mod tests {
|
||||
bind: "127.0.0.1:0".parse().unwrap(),
|
||||
token: Some(" ".into()),
|
||||
};
|
||||
let err = run(test_state(), opts, None).await.unwrap_err();
|
||||
let err = run(test_state(), opts, None, test_stats())
|
||||
.await
|
||||
.unwrap_err();
|
||||
assert!(err.to_string().contains("no token"), "{err}");
|
||||
}
|
||||
|
||||
@@ -1663,14 +1871,14 @@ mod tests {
|
||||
serde_json::json!([{}])
|
||||
);
|
||||
|
||||
let checked_in = include_str!("../../../docs/api/openapi.json");
|
||||
let checked_in = include_str!("../../../api/openapi.json");
|
||||
// Compare content, not line-ending style: the generated `json` is LF (serde_json), but git
|
||||
// may check the file out CRLF on Windows.
|
||||
assert_eq!(
|
||||
json.trim().replace('\r', ""),
|
||||
checked_in.trim().replace('\r', ""),
|
||||
"docs/api/openapi.json is stale — regenerate with: \
|
||||
cargo run -p punktfunk-host -- openapi > docs/api/openapi.json"
|
||||
"api/openapi.json is stale — regenerate with: \
|
||||
cargo run -p punktfunk-host -- openapi > api/openapi.json"
|
||||
);
|
||||
}
|
||||
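The drift check above compares content rather than line-ending style. A minimal standalone sketch of that normalization, using the same `trim` plus `replace('\r', "")` steps (the helper names `normalize` and `same_content` are hypothetical):

```rust
/// Strips surrounding whitespace and every carriage return, so LF and CRLF checkouts of the
/// same text normalize to the same string.
fn normalize(s: &str) -> String {
    s.trim().replace('\r', "")
}

/// True when two documents differ at most in line-ending style or surrounding whitespace.
fn same_content(a: &str, b: &str) -> bool {
    normalize(a) == normalize(b)
}
```

A real content difference still fails the comparison, so the check keeps its value as a staleness gate on Windows checkouts.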
|
||||
|
||||
@@ -22,6 +22,9 @@
|
||||
//! Trust: the host serves with its persistent identity (`~/.config/punktfunk/cert.pem`, shared
|
||||
//! with GameStream pairing) and logs the SHA-256 fingerprint clients pin.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use anyhow::{anyhow, Context, Result};
|
||||
use punktfunk_core::config::{CompositorPref, FecConfig, FecScheme, GamepadPref, Role};
|
||||
use punktfunk_core::input::{InputEvent, InputKind};
|
||||
@@ -76,6 +79,9 @@ pub struct Punktfunk1Options {
|
||||
|
||||
/// The native (punktfunk/1) trust store + on-demand arming PIN, shared with the management API.
|
||||
use crate::native_pairing::NativePairing;
|
||||
/// The shared streaming-stats recorder (web-console capture/graph), shared with the management API
|
||||
/// and the GameStream loop; threaded into each session's `SessionContext`.
|
||||
use crate::stats_recorder::StatsRecorder;
|
||||
|
||||
/// Minimum spacing between accepted pairing ceremonies (bounds online PIN guessing — with
|
||||
/// SPAKE2 an attacker already gets only one guess per ceremony; this caps the rate).
|
||||
@@ -111,7 +117,11 @@ pub fn run(opts: Punktfunk1Options) -> Result<()> {
|
||||
opts.pairing_pin.clone(),
|
||||
opts.allow_pairing || opts.require_pairing,
|
||||
)?);
|
||||
rt.block_on(serve(opts, np))
|
||||
// Standalone `punktfunk1-host` has no mgmt API to arm capture, so this recorder stays disarmed
|
||||
// (harmless — the loops' `is_armed()` gate is always false). The unified `serve` shares one
|
||||
// recorder across mgmt + both streaming paths instead.
|
||||
let stats = StatsRecorder::new(crate::stats_recorder::default_dir());
|
||||
rt.block_on(serve(opts, np, stats))
|
||||
}
|
||||
|
||||
fn fingerprint_hex(fp: &[u8; 32]) -> String {
|
||||
@@ -154,7 +164,11 @@ pub(crate) fn native_serve_opts(cfg: &NativeServe) -> Punktfunk1Options {
|
||||
}
|
||||
}
|
||||
|
||||
pub(crate) async fn serve(opts: Punktfunk1Options, np: Arc<NativePairing>) -> Result<()> {
|
||||
pub(crate) async fn serve(
|
||||
opts: Punktfunk1Options,
|
||||
np: Arc<NativePairing>,
|
||||
stats: Arc<StatsRecorder>,
|
||||
) -> Result<()> {
|
||||
let identity = crate::gamestream::cert::ServerIdentity::load_or_create()
|
||||
.context("load host identity (~/.config/punktfunk)")?;
|
||||
let fingerprint = endpoint::fingerprint_of_pem(&identity.cert_pem)
|
||||
@@ -209,6 +223,10 @@ pub(crate) async fn serve(opts: Punktfunk1Options, np: Arc<NativePairing>) -> Re
|
||||
// restores the box's autologin gaming session on idle, not per-disconnect — see
|
||||
// `vdisplay::restore_managed_session`). Held for serve()'s lifetime; dropping it stops it.
|
||||
let _restore_worker = crate::vdisplay::start_restore_worker();
|
||||
// Host-lifetime cover-art warmer: fetches + caches GOG/Xbox cover art (no-auth api.gog.com /
|
||||
// displaycatalog) off the hot path so `all_games()` (the library list + launch resolve) never
|
||||
// blocks on the network. A no-op on a host whose stores all carry their own art.
|
||||
let _art_warmer = crate::library::start_art_warmer();
|
||||
// Pairing state (arming PIN + trust store) is shared with the management API. If it was armed
|
||||
// at startup (the CLI flags), surface the PIN the headless operator reads from the log; the
|
||||
// web console arms it on demand instead (a fresh, time-limited PIN).
|
||||
@@ -269,6 +287,7 @@ pub(crate) async fn serve(opts: Punktfunk1Options, np: Arc<NativePairing>) -> Re
|
||||
let audio_cap = audio_cap.clone();
|
||||
let np = np.clone();
|
||||
let last_pairing = last_pairing.clone();
|
||||
let stats = stats.clone();
|
||||
let inj_tx = injector.sender();
|
||||
let mic_tx = mic_service.sender();
|
||||
sessions.spawn(async move {
|
||||
@@ -282,6 +301,7 @@ pub(crate) async fn serve(opts: Punktfunk1Options, np: Arc<NativePairing>) -> Re
|
||||
&fingerprint,
|
||||
&np,
|
||||
&last_pairing,
|
||||
stats,
|
||||
)
|
||||
.await
|
||||
{
|
||||
@@ -472,6 +492,7 @@ async fn serve_session(
|
||||
host_fp: &[u8; 32],
|
||||
np: &NativePairing,
|
||||
last_pairing: &std::sync::Mutex<Option<std::time::Instant>>,
|
||||
stats: Arc<StatsRecorder>,
|
||||
) -> Result<()> {
|
||||
let peer = conn.remote_address();
|
||||
|
||||
@@ -928,6 +949,12 @@ async fn serve_session(
|
||||
let stop_stream = stop.clone();
|
||||
let fec_target_dp = fec_target.clone(); // data-plane handle to the adaptive-FEC target
|
||||
let conn_stream = conn.clone(); // for sending the source's real HDR metadata (0xCE) mid-stream
|
||||
let stats_dp = stats; // data-plane handle to the shared stats recorder
|
||||
// Short label for web-console stats captures: the client's cert-fingerprint prefix, else its
|
||||
// peer IP (no fingerprint = anonymous TOFU/--open client).
|
||||
let client_label = endpoint::peer_fingerprint(&conn)
|
||||
.map(|fp| fingerprint_hex(&fp)[..12].to_string())
|
||||
.unwrap_or_else(|| conn.remote_address().ip().to_string());
|
||||
let result: Result<()> = async {
|
||||
tokio::task::spawn_blocking(move || -> Result<()> {
|
||||
// Wait briefly for the client to hole-punch our data port, then stream to its OBSERVED
|
||||
@@ -982,6 +1009,8 @@ async fn serve_session(
|
||||
probe_result_tx,
|
||||
fec_target: fec_target_dp,
|
||||
conn: conn_stream,
|
||||
stats: stats_dp,
|
||||
client_label,
|
||||
#[cfg(target_os = "windows")]
|
||||
launch: launch_for_dp,
|
||||
})
|
||||
@@ -1940,6 +1969,21 @@ struct FrameMsg {
|
||||
    deadline: std::time::Instant,
    /// capture→encoded latency (µs), measured on the encode thread, carried for the perf histogram.
    encode_us: u32,
    /// Per-stage µs splits, measured on the capture/encode thread (0 when neither `PUNKTFUNK_PERF`
    /// nor a stats capture is armed). The send thread accumulates them for the web-console sample:
    /// `cap_us` = `try_latest` (ring read + colour convert), `submit_us` = NVENC `encode_picture`
    /// launch, `wait_us` = `lock_bitstream` (the scheduling wait + ASIC encode = the "encode" stage).
    cap_us: u32,
    submit_us: u32,
    wait_us: u32,
    /// This frame is a re-encoded hold (the source had no fresh frame): a source-starvation signal
    /// the send thread folds into `repeat_fps`.
    repeat: bool,
    /// Whether the per-stage splits (`cap_us`/`submit_us`/`wait_us`) were actually measured at
    /// capture time (`perf` was on or a stats capture was armed). The send thread trusts this
    /// instead of re-reading `is_armed()`, so a capture that arms while frames are already in flight
    /// doesn't fold their zeroed splits into the first window's percentiles.
    was_measured: bool,
}
|
||||
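The reason for `was_measured` can be shown with a small sketch: unmeasured frames carry zeroed splits, and folding them into a window drags its percentiles down. `Frame`, `percentile_sketch`, and `window_p50` are hypothetical; the crate's own `percentile` helper is not shown in this diff, and the nearest-rank rule here is an assumption.

```rust
/// Hypothetical, trimmed-down frame record (only the fields this sketch needs).
#[derive(Clone, Copy)]
struct Frame {
    wait_us: u32,
    was_measured: bool,
}

/// Nearest-rank percentile over a window; sorts in place and returns 0 for an empty window.
fn percentile_sketch(v: &mut [u32], q: f64) -> u32 {
    if v.is_empty() {
        return 0;
    }
    v.sort_unstable();
    let idx = ((v.len() - 1) as f64 * q).round() as usize;
    v[idx.min(v.len() - 1)]
}

/// Median `wait_us` over a window, counting only frames whose splits were really measured.
fn window_p50(frames: &[Frame]) -> u32 {
    let mut wait: Vec<u32> = frames
        .iter()
        .filter(|f| f.was_measured)
        .map(|f| f.wait_us)
        .collect();
    percentile_sketch(&mut wait, 0.50)
}
```

For a window of two unmeasured frames (splits of 0) followed by measured waits of 100, 200, and 300 µs, the gated median is 200; without the gate the two zeros would pull it to 100.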
/// The dedicated send thread: it owns the whole [`Session`] (so no socket clone or shared stats are
|
||||
@@ -1961,6 +2005,11 @@ pub(crate) fn boost_thread_priority(critical: bool) {
|
||||
// capture/encode (critical) and send (non-critical).
|
||||
crate::session_tuning::on_hot_thread();
|
||||
#[cfg(target_os = "windows")]
|
||||
// SAFETY: `GetCurrentThread()` returns the constant pseudo-handle for the calling thread — always
|
||||
// valid, thread-local in meaning, and never closed (no leak/double-close). `SetThreadPriority`
|
||||
// takes that handle plus a `THREAD_PRIORITY_*` value the windows crate defines (HIGHEST or
|
||||
// ABOVE_NORMAL here); it only reprioritizes this OS thread, borrows no Rust memory, and its
|
||||
// `Result` is matched (a failure is logged, never UB). No pointers, lifetimes, or aliasing.
|
||||
unsafe {
|
||||
use windows::Win32::System::Threading::{
|
||||
GetCurrentThread, SetThreadPriority, THREAD_PRIORITY_ABOVE_NORMAL,
|
||||
@@ -1988,6 +2037,10 @@ pub(crate) fn boost_thread_priority(critical: bool) {
|
||||
// realtime CPU class can preempt the compositor AND the game's own render thread, adding the
|
||||
// very frame-time we refuse to add (opt-in only — see PUNKTFUNK_SCHED_RR).
|
||||
let nice = if critical { -10 } else { -5 };
|
||||
// SAFETY: `setpriority` takes three by-value integers and no pointers, so there is nothing to
|
||||
// alias or outlive. `PRIO_PROCESS` with `who == 0` targets the calling task on Linux and
|
||||
// `nice` is in range; the call only adjusts this thread's scheduling nice value and returns an
|
||||
// `int` we inspect. No memory is touched.
|
||||
let rc = unsafe { libc::setpriority(libc::PRIO_PROCESS, 0, nice) };
|
||||
if rc == 0 {
|
||||
tracing::debug!(critical, nice, "thread nice raised");
|
||||
@@ -2004,6 +2057,19 @@ pub(crate) fn boost_thread_priority(critical: bool) {
|
||||
}
|
||||
}
|
||||
|
||||
/// Everything the send thread needs to emit web-console stats samples at its 2 s aggregation
/// boundary: the shared recorder (whose `is_armed()` gates emission) plus the negotiated
/// mode/codec/client to seed the capture's `CaptureMeta` on the first armed registration.
struct SendStats {
    rec: Arc<StatsRecorder>,
    width: u32,
    height: u32,
    fps: u32,
    codec: &'static str,
    client: String,
    bitrate_kbps: u32,
}
|
||||
#[allow(clippy::too_many_arguments)]
|
||||
fn send_loop(
|
||||
mut session: Session,
|
||||
@@ -2014,6 +2080,7 @@ fn send_loop(
|
||||
perf: bool,
|
||||
burst_cap: usize,
|
||||
fec_target: Arc<AtomicU8>,
|
||||
stats: SendStats,
|
||||
) {
|
||||
boost_thread_priority(false); // transmit thread: above-normal (Apollo's encoder-thread level)
|
||||
let mut last_perf = std::time::Instant::now();
|
||||
@@ -2022,6 +2089,16 @@ fn send_loop(
|
||||
let mut encode_us: Vec<u32> = Vec::new();
|
||||
let mut pace_us: Vec<u32> = Vec::new();
|
||||
let (mut paced_frames, mut immediate_frames) = (0u64, 0u64);
|
||||
// Web-console stats accumulation (active when `perf` OR the recorder is armed): the per-stage
|
||||
// split carried on each FrameMsg, the new-vs-repeat frame split, the cached registration id, and
|
||||
// the previous window's loss snapshot for delta computation.
|
||||
let mut sid: Option<u32> = None;
|
||||
let (mut cap_v, mut submit_v, mut wait_v): (Vec<u32>, Vec<u32>, Vec<u32>) =
|
||||
(Vec::new(), Vec::new(), Vec::new());
|
||||
let (mut new_frames, mut repeat_frames) = (0u64, 0u64);
|
||||
let mut last_frames_dropped = 0u64;
|
||||
let mut last_packets_dropped = 0u64;
|
||||
let mut last_fec_recovered = 0u64;
|
||||
loop {
|
||||
if stop.load(Ordering::SeqCst) {
|
||||
break;
|
||||
@@ -2042,9 +2119,24 @@ fn send_loop(
|
||||
burst_cap,
|
||||
) {
|
||||
Ok(stat) => {
|
||||
if perf {
|
||||
if perf || stats.rec.is_armed() {
|
||||
// `encode_us`/`pace_us`/fps are valid for every frame (always measured),
|
||||
// including the Windows relay + tail-drain frames. The cap/submit/wait splits
|
||||
// are only real when the frame was measured at capture time — a frame captured
|
||||
// before the stats capture was armed carries zeroed splits, so skip those (an empty
|
||||
// window → `percentile()` returns 0) rather than pull the percentiles down.
|
||||
encode_us.push(msg.encode_us);
|
||||
pace_us.push(stat.spread_us);
|
||||
if msg.was_measured {
|
||||
cap_v.push(msg.cap_us);
|
||||
submit_v.push(msg.submit_us);
|
||||
wait_v.push(msg.wait_us);
|
||||
}
|
||||
if msg.repeat {
|
||||
repeat_frames += 1;
|
||||
} else {
|
||||
new_frames += 1;
|
||||
}
|
||||
if stat.paced {
|
||||
paced_frames += 1;
|
||||
} else {
|
||||
@@ -2060,31 +2152,91 @@ fn send_loop(
|
||||
Err(std::sync::mpsc::RecvTimeoutError::Timeout) => {}
|
||||
Err(std::sync::mpsc::RecvTimeoutError::Disconnected) => break, // encode thread done
|
||||
}
|
||||
if perf && last_perf.elapsed() >= std::time::Duration::from_secs(2) {
|
||||
if last_perf.elapsed() >= std::time::Duration::from_secs(2) {
|
||||
let s = session.stats();
|
||||
let secs = last_perf.elapsed().as_secs_f64();
|
||||
// Attempted (sealed) transmit rate; `send_dropped` is what didn't reach the wire.
|
||||
let tx_mbps = (s.bytes_sent - last_bytes) as f64 * 8.0 / secs / 1_000_000.0;
|
||||
tracing::info!(
|
||||
tx_mbps = format!("{tx_mbps:.0}"),
|
||||
send_dropped = s.packets_send_dropped - last_send_dropped,
|
||||
send_dropped_total = s.packets_send_dropped,
|
||||
encode_us_p50 = percentile(&mut encode_us, 0.50),
|
||||
encode_us_p99 = percentile(&mut encode_us, 0.99),
|
||||
pace_us_p50 = percentile(&mut pace_us, 0.50),
|
||||
pace_us_p99 = percentile(&mut pace_us, 0.99),
|
||||
pace_us_max = pace_us.last().copied().unwrap_or(0),
|
||||
immediate_frames,
|
||||
paced_frames,
|
||||
"perf"
|
||||
);
|
||||
if perf {
|
||||
tracing::info!(
|
||||
tx_mbps = format!("{tx_mbps:.0}"),
|
||||
send_dropped = s.packets_send_dropped - last_send_dropped,
|
||||
send_dropped_total = s.packets_send_dropped,
|
||||
encode_us_p50 = percentile(&mut encode_us, 0.50),
|
||||
encode_us_p99 = percentile(&mut encode_us, 0.99),
|
||||
pace_us_p50 = percentile(&mut pace_us, 0.50),
|
||||
pace_us_p99 = percentile(&mut pace_us, 0.99),
|
||||
pace_us_max = pace_us.last().copied().unwrap_or(0),
|
||||
immediate_frames,
|
||||
paced_frames,
|
||||
"perf"
|
||||
);
|
||||
}
|
||||
// Web-console capture: this thread owns `session.stats()`, so it emits the COMPLETE
|
||||
// sample — the cap/submit/encode split carried over from the capture thread plus this
|
||||
// window's pacing/goodput/loss. Loss fields are deltas vs the previous window's snapshot.
|
||||
if stats.rec.is_armed() {
|
||||
let session_id = *sid.get_or_insert_with(|| {
|
||||
stats.rec.register_session(
|
||||
"native",
|
||||
stats.width,
|
||||
stats.height,
|
||||
stats.fps,
|
||||
stats.codec,
|
||||
&stats.client,
|
||||
)
|
||||
});
|
||||
let sample = crate::stats_recorder::StatsSample {
|
||||
t_ms: 0, // stamped by push_sample from the capture's monotonic start
|
||||
session_id,
|
||||
stages: vec![
|
||||
crate::stats_recorder::StageTiming {
|
||||
name: "capture".into(),
|
||||
p50_us: percentile(&mut cap_v, 0.50) as f32,
|
||||
p99_us: percentile(&mut cap_v, 0.99) as f32,
|
||||
},
|
||||
crate::stats_recorder::StageTiming {
|
||||
name: "submit".into(),
|
||||
p50_us: percentile(&mut submit_v, 0.50) as f32,
|
||||
p99_us: percentile(&mut submit_v, 0.99) as f32,
|
||||
},
|
||||
crate::stats_recorder::StageTiming {
|
||||
name: "encode".into(),
|
||||
p50_us: percentile(&mut wait_v, 0.50) as f32,
|
||||
p99_us: percentile(&mut wait_v, 0.99) as f32,
|
||||
},
|
||||
crate::stats_recorder::StageTiming {
|
||||
name: "send".into(),
|
||||
p50_us: percentile(&mut pace_us, 0.50) as f32,
|
||||
p99_us: percentile(&mut pace_us, 0.99) as f32,
|
||||
},
|
||||
],
|
||||
fps: (new_frames as f64 / secs) as f32,
|
||||
repeat_fps: (repeat_frames as f64 / secs) as f32,
|
||||
mbps: tx_mbps as f32,
|
||||
bitrate_kbps: stats.bitrate_kbps,
|
||||
frames_dropped: s.frames_dropped.saturating_sub(last_frames_dropped) as u32,
|
||||
packets_dropped: s.packets_dropped.saturating_sub(last_packets_dropped) as u32,
|
||||
send_dropped: s.packets_send_dropped.saturating_sub(last_send_dropped) as u32,
|
||||
fec_recovered: s.fec_recovered_shards.saturating_sub(last_fec_recovered) as u32,
|
||||
};
|
||||
stats.rec.push_sample(session_id, sample);
|
||||
}
|
||||
last_perf = std::time::Instant::now();
|
||||
last_bytes = s.bytes_sent;
|
||||
last_send_dropped = s.packets_send_dropped;
|
||||
last_frames_dropped = s.frames_dropped;
|
||||
last_packets_dropped = s.packets_dropped;
|
||||
last_fec_recovered = s.fec_recovered_shards;
|
||||
encode_us.clear();
|
||||
pace_us.clear();
|
||||
cap_v.clear();
|
||||
submit_v.clear();
|
||||
wait_v.clear();
|
||||
paced_frames = 0;
|
||||
immediate_frames = 0;
|
||||
new_frames = 0;
|
||||
repeat_frames = 0;
|
||||
}
|
||||
}
|
||||
}
|
||||
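The 2 s boundary above combines two small techniques: a percentile over the window's samples that yields 0 for an empty window, and a per-window delta taken from cumulative counters. A minimal sketch of both follows. The names `percentile_sketch` and `window_delta` are hypothetical; the host's real `percentile()` body is not shown in this diff, so the nearest-rank rule used here is an assumption, and the clamp to `u32::MAX` differs from the plain `as u32` cast in the diff.

```rust
/// Nearest-rank percentile over one aggregation window.
/// Hypothetical stand-in for the host's `percentile()` helper (its body is not in the diff).
/// Sorts the window in place and returns 0 when the window is empty.
fn percentile_sketch(v: &mut Vec<u32>, q: f64) -> u32 {
    if v.is_empty() {
        return 0; // empty window: nothing was measured, report 0
    }
    v.sort_unstable();
    let q = q.clamp(0.0, 1.0);
    let idx = ((v.len() - 1) as f64 * q).round() as usize;
    v[idx.min(v.len() - 1)]
}

/// Per-window delta of a cumulative counter.
/// Saturates at 0 if the counter ever moves backwards, then stores the new snapshot.
/// Assumption: clamps to `u32::MAX` instead of truncating.
fn window_delta(total: u64, last: &mut u64) -> u32 {
    let delta = total.saturating_sub(*last);
    *last = total;
    delta.min(u32::MAX as u64) as u32
}
```

For example, with `last = 5`, a cumulative total of 12 reports a delta of 7 and leaves `last` at 12, so the next window starts from the new snapshot.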
@@ -2185,6 +2337,13 @@ struct SessionContext {
|
||||
fec_target: Arc<AtomicU8>,
|
||||
/// The QUIC control connection (carries host→client 0xCE source-HDR metadata mid-stream).
|
||||
conn: quinn::Connection,
|
||||
/// Shared streaming-stats recorder. The capture loop reads `is_armed()` per frame to decide
|
||||
/// whether to measure the per-stage split; the send thread builds + pushes the aggregated
|
||||
/// `StatsSample` at its 2 s boundary.
|
||||
stats: Arc<StatsRecorder>,
|
||||
/// Short client label (cert-fingerprint prefix, else peer IP) seeded into the capture meta on
|
||||
/// the first armed stats registration.
|
||||
client_label: String,
|
||||
/// Windows: the store-qualified library id to launch into the interactive user session once
|
||||
/// capture is live (no gamescope nesting on Windows). `None` = no launch requested. Linux uses the
|
||||
/// gamescope `PUNKTFUNK_GAMESCOPE_APP` path resolved at handshake, so this field is Windows-only.
|
||||
@@ -2205,7 +2364,7 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
// Windows two-process secure-desktop path: when the host runs as SYSTEM (required for the secure
|
||||
// desktop + SendInput), WGC can't activate in-process, so we capture the normal desktop via a
|
||||
// helper spawned in the user session and relay its AUs. (Single-process WGC/DDA is used as the
|
||||
// user, and stays the path on Linux.) See docs/windows-secure-desktop.md.
|
||||
// user, and stays the path on Linux.) See design/windows-secure-desktop.md.
|
||||
#[cfg(target_os = "windows")]
|
||||
if plan.topology == crate::session_plan::SessionTopology::TwoProcessRelay {
|
||||
return virtual_stream_relay(ctx);
|
||||
@@ -2226,6 +2385,8 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
probe_result_tx,
|
||||
fec_target,
|
||||
conn,
|
||||
stats,
|
||||
client_label,
|
||||
#[cfg(target_os = "windows")]
|
||||
launch,
|
||||
} = ctx;
|
||||
@@ -2294,6 +2455,17 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
// The bounded channel applies backpressure (the encode thread blocks if the send falls behind,
|
||||
// so frames slow down rather than a dropped frame freezing the infinite-GOP stream).
|
||||
let (frame_tx, frame_rx) = std::sync::mpsc::sync_channel::<FrameMsg>(3);
|
||||
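The backpressure described above comes from the channel's fixed capacity: once three frames are queued, a further blocking `send` waits until the receiver takes one. A minimal sketch that makes the capacity observable without blocking is shown here. It uses `try_send` purely for illustration (the diff itself uses the blocking `send`), and `accepted_before_full` is a hypothetical helper.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Fills a bounded channel that nobody is draining and reports how many
/// messages were accepted before the channel reported itself full.
/// Hypothetical helper for illustration only.
fn accepted_before_full(cap: usize) -> usize {
    // Keep the receiver alive (bound to `_rx`) so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<u32>(cap);
    let mut accepted = 0usize;
    loop {
        match tx.try_send(accepted as u32) {
            Ok(()) => accepted += 1,
            // A blocking `send` would wait here instead of returning.
            Err(TrySendError::Full(_)) => return accepted,
            Err(TrySendError::Disconnected(_)) => return accepted,
        }
    }
}
```

With a capacity of 3, as in the diff, the fourth queued frame is the one that would make the encode thread wait.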
// The send thread emits the web-console stats sample (it owns `session.stats()`); clone the
|
||||
// recorder so the capture loop keeps its own handle for the per-frame `is_armed()` gate.
|
||||
let send_stats = SendStats {
|
||||
rec: stats.clone(),
|
||||
width: mode.width,
|
||||
height: mode.height,
|
||||
fps: mode.refresh_hz,
|
||||
codec: "hevc",
|
||||
client: client_label,
|
||||
bitrate_kbps,
|
||||
};
|
||||
let send_thread = std::thread::Builder::new()
|
||||
.name("punktfunk-send".into())
|
||||
.spawn({
|
||||
@@ -2308,6 +2480,7 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
perf,
|
||||
burst_cap,
|
||||
fec_target,
|
||||
send_stats,
|
||||
)
|
||||
}
|
||||
})
|
||||
@@ -2352,6 +2525,12 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
// compositing), NOT an encoder problem. Logged every 2 s when `PUNKTFUNK_PERF`.
|
||||
let (mut diag_new, mut diag_repeat) = (0u64, 0u64);
|
||||
let mut diag_at = std::time::Instant::now();
|
||||
// Per-stage latency breakdown (PUNKTFUNK_PERF): per-call µs for the GPU-bound stages so we see
|
||||
// exactly where the capture→encoded latency goes — cap=try_latest (ring read + colour convert),
|
||||
// submit=encode_picture launch, wait=lock_bitstream (the scheduling wait + ASIC encode, the one
|
||||
// that dominates under a GPU-saturating game).
|
||||
let (mut st_cap, mut st_submit, mut st_wait): (Vec<u32>, Vec<u32>, Vec<u32>) =
|
||||
(Vec::new(), Vec::new(), Vec::new());
|
||||
while !stop.load(Ordering::SeqCst) && std::time::Instant::now() < deadline {
|
||||
// Mid-stream session switch (the box flipped Gaming↔Desktop): rebuild the WHOLE backend in
|
||||
// place — a different compositor at the SAME client mode — keeping the Session + send thread
|
||||
@@ -2458,13 +2637,31 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
tracing::debug!("forcing keyframe (client decode recovery)");
|
||||
enc.request_keyframe();
|
||||
}
|
||||
match capturer.try_latest() {
|
||||
// Measure the per-stage split when `PUNKTFUNK_PERF` is set OR a web-console stats capture is
|
||||
// armed (a cheap Relaxed atomic, re-read each frame). The values feed the existing perf log
|
||||
// unchanged and ride each FrameMsg to the send thread, which builds the aggregated sample.
|
||||
let measure = perf || stats.is_armed();
|
||||
let t_cap = std::time::Instant::now();
|
||||
let cap_result = capturer.try_latest();
|
||||
let cap_us = if measure {
|
||||
t_cap.elapsed().as_micros() as u32
|
||||
} else {
|
||||
0
|
||||
};
|
||||
if perf {
|
||||
st_cap.push(cap_us);
|
||||
}
|
||||
let mut repeat = false;
|
||||
match cap_result {
|
||||
Ok(Some(f)) => {
|
||||
frame = f;
|
||||
diag_new += 1;
|
||||
capture_rebuilds = 0; // a delivered frame clears the consecutive-loss counter
|
||||
}
|
||||
Ok(None) => diag_repeat += 1, // no new frame (static desktop / mid-rebuild) — repeat the last
|
||||
Ok(None) => {
|
||||
diag_repeat += 1; // no new frame (static desktop / mid-rebuild) — repeat the last
|
||||
repeat = true;
|
||||
}
|
||||
// The capture source died (PipeWire/compositor thread ended, virtual output gone). Rather
|
||||
// than tear the whole session down — the client has no reconnect path and would have to
|
||||
// cold-restart the handshake — rebuild the pipeline IN PLACE at the current mode, exactly
|
||||
@@ -2497,6 +2694,20 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
"capture diag: NEW frames from the source vs REPEATS (low new_fps at high send rate ⇒ \
|
||||
the source isn't producing frames, not an encode stall)"
|
||||
);
|
||||
let wait_max = st_wait.iter().copied().max().unwrap_or(0);
|
||||
tracing::info!(
|
||||
cap_us_p50 = percentile(&mut st_cap, 0.50),
|
||||
cap_us_p99 = percentile(&mut st_cap, 0.99),
|
||||
submit_us_p50 = percentile(&mut st_submit, 0.50),
|
||||
submit_us_p99 = percentile(&mut st_submit, 0.99),
|
||||
wait_us_p50 = percentile(&mut st_wait, 0.50),
|
||||
wait_us_p99 = percentile(&mut st_wait, 0.99),
|
||||
wait_us_max = wait_max,
|
||||
"stage perf (µs/call): cap=try_latest(ring+convert) submit=encode_picture wait=lock_bitstream(sched+ASIC)"
|
||||
);
|
||||
st_cap.clear();
|
||||
st_submit.clear();
|
||||
st_wait.clear();
|
||||
diag_new = 0;
|
||||
diag_repeat = 0;
|
||||
diag_at = std::time::Instant::now();
|
||||
@@ -2515,7 +2726,16 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
// capturer hands a rotating ring of output textures, so it returns >1; other capturers default 1.
|
||||
let depth = capturer.pipeline_depth().max(1);
|
||||
let capture_ns = now_ns();
|
||||
let t_submit = std::time::Instant::now();
|
||||
enc.submit(&frame).context("encoder submit")?;
|
||||
let submit_us = if measure {
|
||||
t_submit.elapsed().as_micros() as u32
|
||||
} else {
|
||||
0
|
||||
};
|
||||
if perf {
|
||||
st_submit.push(submit_us);
|
||||
}
|
||||
// This frame's pacing deadline (the next frame's due time); the send thread spreads a big frame
|
||||
// up to here. Each in-flight frame carries its own (capture_ns, deadline) for when it's polled.
|
||||
next += interval;
|
||||
@@ -2526,7 +2746,17 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
// the oldest submitted frame's AU — matching `inflight.pop_front()`.
|
||||
let mut send_gone = false;
|
||||
while inflight.len() >= depth {
|
||||
let au = match enc.poll().context("encoder poll")? {
|
||||
let t_wait = std::time::Instant::now();
|
||||
let polled = enc.poll().context("encoder poll")?;
|
||||
let wait_us = if measure {
|
||||
t_wait.elapsed().as_micros() as u32
|
||||
} else {
|
||||
0
|
||||
};
|
||||
if perf {
|
||||
st_wait.push(wait_us);
|
||||
}
|
||||
let au = match polled {
|
||||
Some(au) => au,
|
||||
None => break, // no AU ready for a submitted frame (shouldn't happen — poll blocks)
|
||||
};
|
||||
@@ -2552,6 +2782,11 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
flags,
|
||||
deadline,
|
||||
encode_us,
|
||||
cap_us,
|
||||
submit_us,
|
||||
wait_us,
|
||||
repeat,
|
||||
was_measured: measure,
|
||||
};
|
||||
// Hand to the send thread; this blocks (backpressure) if it's behind. An Err means it
|
||||
// exited (send failure / stop) — end the encode loop too.
|
||||
@@ -2579,12 +2814,19 @@ fn virtual_stream(ctx: SessionContext) -> Result<()> {
|
||||
FLAG_PIC as u32
|
||||
};
|
||||
let encode_us = (now_ns().saturating_sub(cap_ns) / 1000) as u32;
|
||||
// End-of-stream tail drain: the per-stage split isn't measured here (the capture loop has
|
||||
// exited), so leave it zero — these last few frames are negligible for the aggregates.
|
||||
let msg = FrameMsg {
|
||||
data: au.data,
|
||||
capture_ns: cap_ns,
|
||||
flags,
|
||||
deadline,
|
||||
encode_us,
|
||||
cap_us: 0,
|
||||
submit_us: 0,
|
||||
wait_us: 0,
|
||||
repeat: false,
|
||||
was_measured: false,
|
||||
};
|
||||
if frame_tx.send(msg).is_err() {
|
||||
break;
|
||||
@@ -2631,6 +2873,8 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
probe_result_tx,
|
||||
fec_target,
|
||||
conn: _conn,
|
||||
stats,
|
||||
client_label,
|
||||
launch,
|
||||
} = ctx;
|
||||
tracing::info!(
|
||||
@@ -2669,6 +2913,11 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
// The secure-desktop HDR drop (for the DDA leg) keys off the monitor's real state in the mux loop.
|
||||
#[cfg(target_os = "windows")]
|
||||
if bit_depth >= 10 {
|
||||
// SAFETY: `set_advanced_color` is marked `unsafe` only because it drives the Win32 CCD API
|
||||
// internally; it takes `target_id` by value (Copy `u32` — this session's live SudoVDA
|
||||
// monitor's CCD target id) and sizes + owns every buffer it hands the OS on its own stack.
|
||||
// We pass no pointers, so nothing must outlive the call and there is no aliasing; an
|
||||
// unknown/absent target id simply returns false.
|
||||
unsafe {
|
||||
if crate::win_display::set_advanced_color(target.target_id, true) {
|
||||
// Let the colorspace change settle before WGC creates its capture item / detects HDR.
|
||||
@@ -2760,7 +3009,18 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
* 1024;
|
||||
|
||||
// Same encode|send split as the single-process path: this thread relays AUs, a dedicated send
|
||||
// thread owns the Session and does FEC+seal+paced-send.
|
||||
// thread owns the Session and does FEC+seal+paced-send. The relay encodes in the helper process,
|
||||
// so this path's FrameMsgs carry no cap/submit/encode split (those stages stay 0 in the sample);
|
||||
// the send thread still emits fps/goodput/pacing/loss from `session.stats()`.
|
||||
let send_stats = SendStats {
|
||||
rec: stats,
|
||||
width: mode.width,
|
||||
height: mode.height,
|
||||
fps: effective_hz,
|
||||
codec: "hevc",
|
||||
client: client_label,
|
||||
bitrate_kbps,
|
||||
};
|
||||
let (frame_tx, frame_rx) = std::sync::mpsc::sync_channel::<FrameMsg>(3);
|
||||
let send_thread = std::thread::Builder::new()
|
||||
.name("punktfunk-send".into())
|
||||
@@ -2776,6 +3036,7 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
perf,
|
||||
burst_cap,
|
||||
fec_target,
|
||||
send_stats,
|
||||
)
|
||||
}
|
||||
})
|
||||
@@ -2838,6 +3099,11 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
flags,
|
||||
deadline: std::time::Instant::now() + interval,
|
||||
encode_us,
|
||||
cap_us: 0,
|
||||
submit_us: 0,
|
||||
wait_us: 0,
|
||||
repeat: false,
|
||||
was_measured: false,
|
||||
};
|
||||
let ok = frame_tx.send(msg).is_ok();
|
||||
if ok {
|
||||
@@ -2904,8 +3170,12 @@ fn virtual_stream_relay(ctx: SessionContext) -> Result<()> {
|
||||
// desktop (the drop just churned + still went black). Instead, if the monitor is in HDR,
|
||||
// open DDA in HDR (FP16 DuplicateOutput1 → BT.2020 PQ Main10); the normal-desktop DDA
|
||||
// overlay/flip issues that drove us to WGC don't apply to the composed Winlogon UI.
|
||||
let hdr =
|
||||
unsafe { crate::win_display::advanced_color_enabled(target.target_id) };
|
||||
// SAFETY: `advanced_color_enabled` is `unsafe` only because it queries the Win32 CCD
|
||||
// API; it takes `target_id` by value (the live SudoVDA monitor's CCD target id) and
|
||||
// allocates + owns every buffer it passes the OS internally. No caller pointer is
|
||||
// involved, so nothing must outlive the call and there is no aliasing; a missing
|
||||
// target id just yields false.
|
||||
let hdr = unsafe { crate::win_display::advanced_color_enabled(target.target_id) };
|
||||
dda = None; // reopen to capture the secure desktop
|
||||
match open_dda(&target, cur_mode.width, cur_mode.height, effective_hz, hdr) {
|
||||
Ok(mut p) => {
|
||||
@@ -3330,12 +3600,27 @@ mod tests {
|
||||
unsafe fn pull_verified(conn: *mut punktfunk_core::abi::PunktfunkConnection, count: u32) {
|
||||
use punktfunk_core::error::PunktfunkStatus;
|
||||
let mut got = 0u32;
|
||||
// SAFETY: the inferred type is the `#[repr(C)]` POD `PunktfunkFrame` (a raw `*const u8`, a
|
||||
// `usize`, and integer fields); all-zero is a valid bit pattern for every field (a null
|
||||
// `data`, `len == 0`). It is only ever read after `next_au` below fully overwrites it on `Ok`,
|
||||
// so the zeroed value is never observed.
|
||||
let mut frame = unsafe { std::mem::zeroed() };
|
||||
while got < count {
|
||||
// SAFETY: `conn` is the live, non-null `*mut PunktfunkConnection` from `punktfunk_connect`
|
||||
// (the caller asserts non-null and does not close it until after this returns), meeting the
|
||||
// ABI's "valid handle". `&mut frame` is an exclusive, writable borrow of the local
|
||||
// `PunktfunkFrame` that outlives this synchronous call. This single test thread is the only
|
||||
// video puller, satisfying the one-video-thread rule.
|
||||
match unsafe {
|
||||
punktfunk_core::abi::punktfunk_connection_next_au(conn, &mut frame, 2000)
|
||||
} {
|
||||
PunktfunkStatus::Ok => {
|
||||
// SAFETY: on `Ok`, `next_au` set `frame.data`/`frame.len` to the reassembled AU
|
||||
// buffer the connection owns; per the ABI contract that borrow stays valid until
|
||||
// the NEXT `next_au` call on this handle. We read the whole slice here (the assert
|
||||
// + length-checked indexing) before the loop's next `next_au`, and `conn` outlives
|
||||
// it — so the pointer is live, exactly `len` bytes, read-only, single-threaded (no
|
||||
// aliasing/use-after-free).
|
||||
let data = unsafe { std::slice::from_raw_parts(frame.data, frame.len) };
|
||||
let idx = u32::from_le_bytes(data[0..4].try_into().unwrap());
|
||||
assert_eq!(
|
||||
@@ -3383,6 +3668,11 @@ mod tests {
|
||||
// Session 1: TOFU (no pin) — observe the host fingerprint.
|
||||
let addr = std::ffi::CString::new("127.0.0.1").unwrap();
|
||||
let mut observed = [0u8; 32];
|
||||
// SAFETY: `addr` is a live `CString` ("127.0.0.1") whose `as_ptr()` is the NUL-terminated
|
||||
// UTF-8 host string the contract requires; `pin_sha256`/cert/key are NULL (all permitted), and
|
||||
// `observed.as_mut_ptr()` is the local `[u8; 32]` — exactly the 32 writable bytes the contract
|
||||
// demands, not aliased during the call. Every pointer references a live local that outlives the
|
||||
// blocking connect.
|
||||
let conn = unsafe {
|
||||
punktfunk_connect(
|
||||
addr.as_ptr(),
|
||||
@@ -3401,26 +3691,28 @@ mod tests {
|
||||
assert_ne!(observed, [0u8; 32], "fingerprint not reported");
|
||||
|
||||
let (mut w, mut h, mut hz) = (0u32, 0u32, 0u32);
|
||||
assert_eq!(
|
||||
unsafe { punktfunk_connection_mode(conn, &mut w, &mut h, &mut hz) },
|
||||
PunktfunkStatus::Ok
|
||||
);
|
||||
// SAFETY: `conn` is the live, non-null connection handle just asserted above; `&mut w/h/hz` are
|
||||
// exclusive, writable borrows of local `u32`s that outlive this synchronous call — the three
|
||||
// writable out-params the contract names.
|
||||
let st = unsafe { punktfunk_connection_mode(conn, &mut w, &mut h, &mut hz) };
|
||||
assert_eq!(st, PunktfunkStatus::Ok);
|
||||
assert_eq!((w, h, hz), (1280, 720, 60));
|
||||
|
||||
// Mid-stream renegotiation: request a new mode, the host acks on the control
|
||||
// stream, and punktfunk_connection_mode reflects the switch.
|
||||
assert_eq!(
|
||||
unsafe {
|
||||
punktfunk_core::abi::punktfunk_connection_request_mode(conn, 1920, 1080, 144)
|
||||
},
|
||||
PunktfunkStatus::Ok
|
||||
);
|
||||
// SAFETY: `conn` is the live, non-null connection handle (the only pointer arg); the remaining
|
||||
// arguments are by-value integers. The handle outlives this non-blocking enqueue.
|
||||
let st = unsafe {
|
||||
punktfunk_core::abi::punktfunk_connection_request_mode(conn, 1920, 1080, 144)
|
||||
};
|
||||
assert_eq!(st, PunktfunkStatus::Ok);
|
||||
let deadline = std::time::Instant::now() + std::time::Duration::from_secs(5);
|
||||
loop {
|
||||
assert_eq!(
|
||||
unsafe { punktfunk_connection_mode(conn, &mut w, &mut h, &mut hz) },
|
||||
PunktfunkStatus::Ok
|
||||
);
|
||||
// SAFETY: same as the earlier `punktfunk_connection_mode` call — `conn` is the live handle
|
||||
// and `&mut w/h/hz` are exclusive writable borrows of locals that outlive this synchronous
|
||||
// call.
|
||||
let st = unsafe { punktfunk_connection_mode(conn, &mut w, &mut h, &mut hz) };
|
||||
assert_eq!(st, PunktfunkStatus::Ok);
|
||||
if (w, h, hz) == (1920, 1080, 144) {
|
||||
break;
|
||||
}
|
||||
@@ -3431,6 +3723,8 @@ mod tests {
|
||||
std::thread::sleep(std::time::Duration::from_millis(20));
|
||||
}
|
||||
|
||||
// SAFETY: `pull_verified` requires a live connection handle it alone pulls video from; `conn` is
|
||||
// the open, non-null handle from `punktfunk_connect` and this is the only thread touching it.
|
||||
unsafe { pull_verified(conn, 25) };
|
||||
|
||||
let ev = punktfunk_core::input::InputEvent {
|
||||
@@ -3441,13 +3735,19 @@ mod tests {
|
||||
y: 2,
|
||||
flags: 0,
|
||||
};
|
||||
assert_eq!(
|
||||
unsafe { punktfunk_connection_send_input(conn, &ev) },
|
||||
PunktfunkStatus::Ok
|
||||
);
|
||||
// SAFETY: `conn` is the live handle; `&ev` borrows the local `InputEvent`, valid and immutable
|
||||
// for this synchronous enqueue — the contract's "valid InputEvent" pointer.
|
||||
let st = unsafe { punktfunk_connection_send_input(conn, &ev) };
|
||||
assert_eq!(st, PunktfunkStatus::Ok);
|
||||
// SAFETY: `conn` was returned by `punktfunk_connect` and is never used after this call (session
|
||||
// 2 below uses a fresh `conn2`); `close` takes ownership and frees the handle exactly once.
|
||||
unsafe { punktfunk_connection_close(conn) };
|
||||
|
||||
// Session 2 (same host process — the listener survived): pin the fingerprint.
|
||||
// SAFETY: as for session 1 — `addr` is the live NUL-terminated host string; here
|
||||
// `observed.as_ptr()` is the 32-byte pin (the fingerprint captured above, a valid `[u8; 32]`),
|
||||
// `observed_sha256_out` is NULL and cert/key are NULL. All pointers reference live locals for
|
||||
// the duration of the blocking connect.
|
||||
let conn2 = unsafe {
|
||||
punktfunk_connect(
|
||||
addr.as_ptr(),
|
||||
@@ -3463,11 +3763,17 @@ mod tests {
|
||||
)
|
||||
};
|
||||
assert!(!conn2.is_null(), "pinned reconnect failed");
|
||||
// SAFETY: `conn2` is the live, non-null pinned handle, pulled only from this thread —
|
||||
// `pull_verified`'s requirement.
|
||||
unsafe { pull_verified(conn2, 25) };
|
||||
// SAFETY: `conn2` came from `punktfunk_connect` and is not used after this; `close` frees it once.
|
||||
unsafe { punktfunk_connection_close(conn2) };
|
||||
|
||||
// Session 3: a wrong pin must be rejected by the handshake.
|
||||
let bad = [0xAAu8; 32];
|
||||
// SAFETY: same shape as the prior connects — `addr` is the live host string, `bad.as_ptr()` is
|
||||
// the 32-byte `[0xAA; 32]` pin, and out/cert/key are NULL; all reference live locals across the
|
||||
// blocking call. (The handshake is expected to fail and return NULL here, which is sound.)
|
||||
let conn3 = unsafe {
|
||||
punktfunk_connect(
|
||||
addr.as_ptr(),
|
||||
@@ -3487,6 +3793,8 @@ mod tests {
|
||||
// The host saw the rejected handshake attempt as session 3? No — a TLS-failed
|
||||
// handshake never yields a connection, so accept() is still waiting. Connect once
|
||||
// more (TOFU) to complete the host's third session and let it exit.
|
||||
// SAFETY: same as session 1's connect — `addr` is the live host string, pin/out/cert/key all
|
||||
// NULL; the pointers reference live locals for the duration of the blocking connect.
|
||||
let conn4 = unsafe {
|
||||
punktfunk_connect(
|
||||
addr.as_ptr(),
|
||||
@@ -3502,7 +3810,9 @@ mod tests {
|
||||
)
|
||||
};
|
||||
assert!(!conn4.is_null());
|
||||
// SAFETY: `conn4` is the live, non-null handle, pulled only from this thread.
|
||||
unsafe { pull_verified(conn4, 25) };
|
||||
// SAFETY: `conn4` came from `punktfunk_connect` and is unused after this; `close` frees it once.
|
||||
unsafe { punktfunk_connection_close(conn4) };
|
||||
|
||||
host.join().unwrap().unwrap();
|
||||
@@ -3546,6 +3856,9 @@ mod tests {
|
||||
paired_store: None, // unused: the shared `np` IS the store handle
|
||||
},
|
||||
np_host,
|
||||
StatsRecorder::new(
|
||||
std::env::temp_dir().join(format!("pf-approval-stats-{}", std::process::id())),
|
||||
),
|
||||
))
|
||||
});
|
||||
std::thread::sleep(std::time::Duration::from_millis(500));
|
||||
|
||||
@@ -1,7 +1,7 @@
|
||||
//! `SessionPlan` — the per-session capture / topology / encoder decision, resolved **once** from
|
||||
//! [`HostConfig`](crate::config) (+ the handshake-negotiated bit depth) into a typed, logged value.
|
||||
//!
|
||||
//! **Goal-1 stage 3** (`docs/windows-host-rewrite.md` §2.2): before this, the Windows session decision was
|
||||
//! **Goal-1 stage 3** (`design/windows-host-rewrite.md` §2.2): before this, the Windows session decision was
|
||||
//! re-derived at three call sites — the capture backend inside `capture::capture_virtual_output`, the
|
||||
//! process topology in `punktfunk1::should_use_helper`, and the encode backend in
|
||||
//! `encode::windows_resolved_backend` — each reading [`config`](crate::config) independently, with no
|
||||
|
||||
@@ -9,7 +9,10 @@
|
||||
//! Raw C-ABI FFI (winmm/kernel32/dwmapi/avrt) rather than the `windows` crate so it builds without
|
||||
//! pulling new windows-rs features. No-op on non-Windows. Per-thread effects (MMCSS, execution
|
||||
//! state) auto-revert at thread exit (= session end); the process-wide bits revert at process exit.
|
||||
//! See `docs/host-latency-plan.md` Tier 3A.
|
||||
//! See `design/host-latency-plan.md` Tier 3A.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
#[cfg(target_os = "windows")]
|
||||
mod imp {
|
||||
@@ -49,6 +52,10 @@ mod imp {
|
||||
/// Process-wide tuning, applied exactly once. Reverts at process exit. Best-effort: each call is
|
||||
/// independent and a failure is ignored (e.g. a non-elevated host may not get HIGH class).
|
||||
fn tune_process_once() {
|
||||
// SAFETY: each call is a C-ABI FFI into winmm/kernel32/dwmapi declared with a matching
|
||||
// `extern "system"` signature; every argument is a plain integer (no pointers/buffers escape),
|
||||
// and `GetCurrentProcess()` returns the current-process pseudo-handle (a constant, always valid,
|
||||
// never closed). The body runs inside `get_or_init`, so it executes exactly once per process.
|
||||
PROCESS_TUNED.get_or_init(|| unsafe {
|
||||
// 1 ms timer granularity (default ~15.6 ms) — the floor for precise frame pacing and the
|
||||
// encode|send split's sub-ms sleeps.
|
||||
@@ -70,6 +77,11 @@ mod imp {
|
||||
/// thread exits, so a session that ends tears them down without explicit bookkeeping.
|
||||
pub fn on_hot_thread() {
|
||||
tune_process_once();
|
||||
// SAFETY: C-ABI FFI declared with matching `extern "system"` signatures. SetThreadExecutionState
|
||||
// takes only flag bits. `task` is a local NUL-terminated UTF-16 buffer ("Games\0") alive for the
|
||||
// whole block, so `task.as_ptr()` is a valid LPCWSTR for the call, and `&mut idx` is a live local
|
||||
// u32 the call writes the task index into. The returned MMCSS handle is intentionally leaked (the
|
||||
// OS reverts the characteristics at thread exit), so there is nothing to free or double-free.
|
||||
unsafe {
|
||||
SetThreadExecutionState(ES_CONTINUOUS | ES_DISPLAY_REQUIRED | ES_SYSTEM_REQUIRED);
|
||||
let task: Vec<u16> = "Games\0".encode_utf16().collect();
|
||||
|
||||
@@ -0,0 +1,553 @@
|
||||
//! Shared streaming-stats recorder (`design/stats-capture-plan.md` §1). One
|
||||
//! [`StatsRecorder`] handle is created once in the unified host entry
|
||||
//! (`gamestream::serve`) alongside [`crate::native_pairing::NativePairing`], and shared with
|
||||
//! **both** the management API ([`crate::mgmt`]) and the streaming loops (threaded through
|
||||
//! [`crate::punktfunk1::serve`] → `SessionContext` and into the GameStream encode loop). The
|
||||
//! operator arms a capture from the web console, plays a session, stops, and reviews the
|
||||
//! captured time-series as graphs; captures are saved to disk and survive a host restart.
|
||||
//!
|
||||
//! Hot-path discipline: [`StatsRecorder::is_armed`] is a cheap `Relaxed` atomic load (re-read
|
||||
//! per frame); sample construction happens only at the loops' existing ~2 s / ~1 s aggregation
|
||||
//! boundary, never per frame. Memory is bounded ([`MAX_SAMPLES`]); the on-disk write is atomic
|
||||
//! (temp + rename); and capture ids are path-traversal-safe.
|
||||
|
||||
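The hot-path discipline described above can be sketched as a relaxed atomic flag plus a timing wrapper that only reads the clock when measurement is wanted. This is a minimal sketch under assumptions: `ArmedGate` and `timed_if` are hypothetical names, not the recorder's actual types, and the real `StatsRecorder` holds more state than a single flag.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Instant;

/// Hypothetical stand-in for the recorder's armed flag.
struct ArmedGate {
    armed: AtomicBool,
}

impl ArmedGate {
    fn new() -> Arc<Self> {
        Arc::new(Self { armed: AtomicBool::new(false) })
    }
    fn arm(&self) {
        self.armed.store(true, Ordering::Relaxed);
    }
    fn disarm(&self) {
        self.armed.store(false, Ordering::Relaxed);
    }
    /// Cheap per-frame check: a single relaxed load, no lock.
    fn is_armed(&self) -> bool {
        self.armed.load(Ordering::Relaxed)
    }
}

/// Runs `work` and returns its result with the elapsed microseconds,
/// or 0 when `measure` is false (the clock is not read at all in that case).
fn timed_if<T>(measure: bool, work: impl FnOnce() -> T) -> (T, u32) {
    if !measure {
        return (work(), 0);
    }
    let t = Instant::now();
    let out = work();
    (out, t.elapsed().as_micros() as u32)
}
```

A capture loop would call `timed_if(perf || gate.is_armed(), || ...)` once per stage and carry the resulting microseconds along with the frame.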
use serde::{Deserialize, Serialize};
|
||||
use std::path::PathBuf;
|
||||
use std::sync::atomic::{AtomicBool, AtomicU32, Ordering};
|
||||
use std::sync::{Arc, Mutex};
|
||||
use std::time::Instant;
|
||||
use utoipa::ToSchema;
|
||||
|
||||
/// Cap on samples kept in one capture: ≈ 3 h at one sample / 2 s. On overflow we stop appending
|
||||
/// (keeping the oldest — a saved recording must keep its start), never dropping the front and never
|
||||
/// growing unbounded.
|
||||
const MAX_SAMPLES: usize = 5400;
|
||||
|
||||
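The cap works out as 5400 samples × 2 s = 10 800 s = 3 h. The keep-the-oldest overflow policy described above can be sketched as follows; `push_bounded` is a hypothetical helper, and the real recorder's append path is not shown in this part of the diff.

```rust
/// Appends `sample` while the buffer is below `cap`.
/// On overflow the NEW sample is discarded, so the start of the recording is kept
/// and the buffer never grows past `cap`. Returns whether the sample was stored.
/// Hypothetical helper for illustration only.
fn push_bounded<T>(samples: &mut Vec<T>, sample: T, cap: usize) -> bool {
    if samples.len() >= cap {
        return false;
    }
    samples.push(sample);
    true
}
```

Dropping from the back rather than the front means a long session truncates its tail instead of losing the moment the capture was armed.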
/// One pipeline stage's latency in an aggregation window (microseconds).
|
||||
#[derive(Serialize, Deserialize, ToSchema, Clone, Debug)]
|
||||
pub struct StageTiming {
|
||||
/// `"capture" | "submit" | "encode" | "packetize" | "send"` (path-dependent).
|
||||
pub name: String,
|
||||
pub p50_us: f32,
|
||||
pub p99_us: f32,
|
||||
}
|
||||
|
||||
/// One aggregated sample (~ every 2 s native, ~ every 1 s GameStream).
|
||||
#[derive(Serialize, Deserialize, ToSchema, Clone, Debug)]
|
||||
pub struct StatsSample {
|
||||
/// Milliseconds since capture start (monotonic; stamped by [`StatsRecorder::push_sample`]).
|
||||
pub t_ms: u64,
|
||||
/// Disambiguates concurrent sessions (usually constant).
|
||||
pub session_id: u32,
|
||||
/// Ordered pipeline stages for this path.
|
||||
pub stages: Vec<StageTiming>,
|
||||
/// Genuine NEW frames/s from the source.
|
||||
pub fps: f32,
|
||||
/// Re-encoded holds/s (source-starvation indicator).
|
||||
pub repeat_fps: f32,
|
||||
/// Transmit goodput (Mb/s).
|
||||
pub mbps: f32,
|
||||
/// Configured target bitrate.
|
||||
pub bitrate_kbps: u32,
|
||||
/// Frames dropped this window (delta).
|
||||
pub frames_dropped: u32,
|
||||
/// Packets dropped this window (receiver-side / reassembler, where known).
|
||||
pub packets_dropped: u32,
|
||||
/// Host send-buffer overflow / EAGAIN this window (delta).
|
||||
pub send_dropped: u32,
|
||||
/// FEC shards recovered this window (delta).
|
||||
pub fec_recovered: u32,
|
||||
}
|
||||
|
||||
/// Capture summary — the filename stem plus the negotiated mode/codec/client. Stored at the head
|
||||
/// of each on-disk recording and listed standalone (without the sample body) by
|
||||
/// [`StatsRecorder::list`].
|
||||
#[derive(Serialize, Deserialize, ToSchema, Clone, Debug)]
|
||||
pub struct CaptureMeta {
|
||||
/// e.g. `"2026-06-26T20-14-03Z_5120x1440"` — also the filename stem.
|
||||
pub id: String,
|
||||
pub started_unix_ms: u64,
|
||||
pub duration_ms: u64,
|
||||
/// `"native" | "gamestream"`.
|
||||
pub kind: String,
|
||||
pub width: u32,
|
||||
pub height: u32,
|
||||
pub fps: u32,
|
||||
/// `"h264" | "hevc" | "av1"`.
|
||||
pub codec: String,
|
||||
/// Short label / fingerprint prefix, or `""` if unknown.
|
||||
pub client: String,
|
||||
pub sample_count: u32,
|
||||
}
|
||||
|
||||
/// A full capture: summary + the sample time-series. The wire + on-disk shape.
|
||||
#[derive(Serialize, Deserialize, ToSchema, Clone, Debug)]
|
||||
pub struct Capture {
|
||||
pub meta: CaptureMeta,
|
||||
pub samples: Vec<StatsSample>,
|
||||
}
|
||||
|
||||
/// Snapshot of the in-progress capture for the management API.
#[derive(Serialize, Deserialize, ToSchema, Clone, Debug)]
pub struct StatsStatus {
    /// Capture currently running.
    pub armed: bool,
    /// Samples in the in-progress capture.
    pub sample_count: u32,
    /// Unix start time (ms) of the in-progress capture (`0` if idle).
    pub started_unix_ms: u64,
    /// Streaming path of the in-progress capture (`"native" | "gamestream"`); `""` if idle or
    /// before the first session registers.
    pub kind: String,
}
|
||||
|
||||
/// Mode/codec/client seeded on the first [`StatsRecorder::register_session`] of a capture.
|
||||
#[derive(Clone)]
|
||||
struct MetaSeed {
|
||||
kind: String,
|
||||
width: u32,
|
||||
height: u32,
|
||||
fps: u32,
|
||||
codec: String,
|
||||
client: String,
|
||||
}
|
||||
|
||||
/// The in-progress capture (present iff armed).
struct Live {
    /// Monotonic clock origin for sample `t_ms`.
    started: Instant,
    started_unix_ms: u64,
    /// Seeded once, on the first session registration.
    meta: Option<MetaSeed>,
    samples: Vec<StatsSample>,
    /// Set once the sample cap is hit (further samples are dropped). `push_sample` reads it so
    /// the overflow warning is logged only once.
    truncated: bool,
}
|
||||
|
||||
/// Shared streaming-stats recorder: an arm/disarm flag (the hot-path gate), the in-progress
|
||||
/// capture, and the on-disk capture directory.
|
||||
pub struct StatsRecorder {
|
||||
dir: PathBuf,
|
||||
/// The hot-path gate — a `Relaxed` load per frame; never blocks the frame thread.
|
||||
armed: AtomicBool,
|
||||
/// The in-progress capture. Locks recover a poisoned guard (`unwrap_or_else(|e| e.into_inner())`,
|
||||
/// as in `vdisplay::gamescope`) rather than `unwrap()`: a panic somewhere must never make stats
|
||||
/// recording crash an otherwise-healthy stream. The critical sections only push/clone/format, so
|
||||
/// poisoning is near-impossible anyway — this is belt-and-suspenders.
|
||||
live: Mutex<Option<Live>>,
|
||||
next_sid: AtomicU32,
|
||||
}
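
The poison-recovery pattern described on `live` can be sketched in isolation. This is a minimal illustration under stated assumptions, not the host's code: the `Counter` type and the `bump` and `poison` functions are hypothetical names. A thread panics while holding the lock, and a later caller still gets a usable guard through `into_inner()` instead of propagating the panic.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical shared state standing in for the recorder's `live` slot.
struct Counter {
    hits: Mutex<u32>,
}

/// Increment under the lock, recovering the guard if a previous holder panicked.
fn bump(c: &Counter) -> u32 {
    let mut guard = c.hits.lock().unwrap_or_else(|e| e.into_inner());
    *guard += 1;
    *guard
}

/// Poison the mutex on purpose: panic on another thread while it holds the guard.
fn poison(c: &Arc<Counter>) {
    let c2 = Arc::clone(c);
    let _ = thread::spawn(move || {
        let _guard = c2.hits.lock().unwrap();
        panic!("simulated panic while holding the lock");
    })
    .join();
}
```

The trade-off is that the recovered data may reflect a half-finished update by the panicking thread; that is acceptable only when, as the comment on `live` argues, the critical sections are simple enough for this not to matter.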
|
||||
|
||||
/// The default captures directory: `~/.config/punktfunk/captures/` (next to `cert.pem`),
/// resolved via the same config-dir helper the rest of the host uses.
pub fn default_dir() -> PathBuf {
    crate::gamestream::config_dir().join("captures")
}
|
||||
|
||||
/// `id` charset gate, matching `^[A-Za-z0-9._-]+$`: the exact charset `capture_id` emits (which
/// deliberately uses dashes, not colons, so the stem is a valid Windows filename). We additionally
/// reject `.` and `..`, which the charset alone would allow, so an id can never name the current or
/// parent directory. The charset already excludes `/` and `\`, so `dir.join("<id>.json")` is always
/// a single child of `dir`. Defense in depth: the endpoints are bearer-authed.
fn valid_id(id: &str) -> bool {
    !id.is_empty()
        && id != "."
        && id != ".."
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}
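
To make the charset gate concrete, here is a small self-contained sketch of how such a check keeps a joined path inside its directory. The `valid_id` body mirrors the rule described above; `child_path` is a hypothetical helper introduced only for this illustration.

```rust
use std::path::{Path, PathBuf};

/// Charset gate: `^[A-Za-z0-9._-]+$`, additionally rejecting `.` and `..`.
fn valid_id(id: &str) -> bool {
    !id.is_empty()
        && id != "."
        && id != ".."
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}

/// Hypothetical helper: resolve `<dir>/<id>.json`, or `None` for a rejected id.
/// Because separators never pass the gate, the result is always a direct child of `dir`.
fn child_path(dir: &Path, id: &str) -> Option<PathBuf> {
    valid_id(id).then(|| dir.join(format!("{id}.json")))
}
```

Separators, colons, and empty strings are all rejected before any filesystem call is made, so traversal attempts such as `../secret` never reach `join`.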
|
||||
|
||||
fn unix_ms_now() -> u64 {
    std::time::SystemTime::now()
        .duration_since(std::time::UNIX_EPOCH)
        .map(|d| d.as_millis() as u64)
        .unwrap_or(0)
}
|
||||
|
||||
/// A human-readable, filesystem-safe capture id from the start time + mode, e.g.
/// `2026-06-26T20-14-03Z_5120x1440`. Dashes (not colons) in the time so it's a valid Windows
/// filename; matches [`valid_id`].
fn capture_id(unix_ms: u64, width: u32, height: u32) -> String {
    let secs = (unix_ms / 1000) as i64;
    let days = secs.div_euclid(86_400);
    let tod = secs.rem_euclid(86_400);
    let (y, mo, d) = civil_from_days(days);
    let (h, mi, s) = (tod / 3600, (tod % 3600) / 60, tod % 60);
    format!("{y:04}-{mo:02}-{d:02}T{h:02}-{mi:02}-{s:02}Z_{width}x{height}")
}
|
||||
|
||||
/// Civil (Y, M, D) from a count of days since the Unix epoch (Howard Hinnant's `civil_from_days`).
fn civil_from_days(z: i64) -> (i64, u32, u32) {
    let z = z + 719_468;
    // `div_euclid` already floors for negative `z`, so Hinnant's `z - 146_096` pre-adjustment
    // (needed only with truncating division) must not be applied on top of it.
    let era = z.div_euclid(146_097);
    let doe = z - era * 146_097; // [0, 146096]
    let yoe = (doe - doe / 1460 + doe / 36524 - doe / 146_096) / 365; // [0, 399]
    let y = yoe + era * 400;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // [0, 365]
    let mp = (5 * doy + 2) / 153; // [0, 11]
    let d = (doy - (153 * mp + 2) / 5 + 1) as u32; // [1, 31]
    let m = if mp < 10 { mp + 3 } else { mp - 9 }; // [1, 12]
    (if m <= 2 { y + 1 } else { y }, m as u32, d)
}
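
As a worked example of the id format, the sketch below reproduces the date arithmetic in a standalone form and checks it against the id quoted in the doc comment. It is an illustration, not the host's exact code: the era is computed with plain floor division (`div_euclid`), which gives the same result as the function above for any non-negative day count, and the input timestamp `1_782_504_843_000` is simply the Unix time in milliseconds of 2026-06-26T20:14:03Z.

```rust
/// Civil (year, month, day) from days since 1970-01-01 (Howard Hinnant's algorithm),
/// with the era taken by floor division.
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468; // shift the origin to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z - era * 146_097; // day of era, [0, 146096]
    let yoe = (doe - doe / 1460 + doe / 36_524 - doe / 146_096) / 365; // [0, 399]
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of March-based year
    let mp = (5 * doy + 2) / 153; // March-based month, [0, 11]
    let d = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let m = if mp < 10 { mp + 3 } else { mp - 9 };
    let y = yoe + era * 400 + if m <= 2 { 1 } else { 0 };
    (y, m as u32, d)
}

/// Filesystem-safe id: UTC start time with dashes instead of colons, plus the mode.
fn capture_id(unix_ms: u64, width: u32, height: u32) -> String {
    let secs = (unix_ms / 1000) as i64;
    let (days, tod) = (secs.div_euclid(86_400), secs.rem_euclid(86_400));
    let (y, mo, d) = civil_from_days(days);
    let (h, mi, s) = (tod / 3600, (tod % 3600) / 60, tod % 60);
    format!("{y:04}-{mo:02}-{d:02}T{h:02}-{mi:02}-{s:02}Z_{width}x{height}")
}
```

For that timestamp the day count is 20 630 and the time of day is 72 843 s, which decompose to 2026-06-26 and 20:14:03. A capture with no registered session gets the mode `0x0`, since width and height default to zero.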
|
||||
|
||||
impl StatsRecorder {
|
||||
/// Create the recorder, creating `dir` (owner-private, best-effort) if missing.
|
||||
pub fn new(dir: PathBuf) -> Arc<Self> {
|
||||
if let Err(e) = crate::gamestream::create_private_dir(&dir) {
|
||||
tracing::warn!(dir = %dir.display(), error = %e, "could not create stats captures dir");
|
||||
}
|
||||
Arc::new(StatsRecorder {
|
||||
dir,
|
||||
armed: AtomicBool::new(false),
|
||||
live: Mutex::new(None),
|
||||
next_sid: AtomicU32::new(0),
|
||||
})
|
||||
}
|
||||
|
||||
/// The hot-path gate: cheap `Relaxed` load, called per frame to decide whether to measure.
|
||||
pub fn is_armed(&self) -> bool {
|
||||
self.armed.load(Ordering::Relaxed)
|
||||
}
|
||||
|
||||
/// Arm a new capture. No-op if already armed (returns the current status).
|
||||
pub fn start(&self) -> StatsStatus {
|
||||
let mut guard = self.live.lock().unwrap_or_else(|e| e.into_inner());
|
||||
if guard.is_none() {
|
||||
*guard = Some(Live {
|
||||
started: Instant::now(),
|
||||
started_unix_ms: unix_ms_now(),
|
||||
meta: None,
|
||||
samples: Vec::new(),
|
||||
truncated: false,
|
||||
});
|
||||
// Publish AFTER the live capture exists, so a frame thread that observes `armed` always
|
||||
// finds a capture to push into.
|
||||
self.armed.store(true, Ordering::Relaxed);
|
||||
}
|
||||
status_of(guard.as_ref())
|
||||
}
|
||||
|
||||
/// A streaming loop announces itself when it first records while armed. Seeds the capture's
|
||||
/// `CaptureMeta` (kind/w/h/fps/codec/client) on the FIRST registration; returns a session id
|
||||
/// to stamp on the loop's samples.
|
||||
pub fn register_session(
|
||||
&self,
|
||||
kind: &'static str,
|
||||
w: u32,
|
||||
h: u32,
|
||||
fps: u32,
|
||||
codec: &str,
|
||||
client: &str,
|
||||
) -> u32 {
|
||||
let sid = self.next_sid.fetch_add(1, Ordering::Relaxed);
|
||||
let mut guard = self.live.lock().unwrap_or_else(|e| e.into_inner());
|
||||
if let Some(live) = guard.as_mut() {
|
||||
if live.meta.is_none() {
|
||||
live.meta = Some(MetaSeed {
|
||||
kind: kind.to_string(),
|
||||
width: w,
|
||||
height: h,
|
||||
fps,
|
||||
codec: codec.to_string(),
|
||||
client: client.to_string(),
|
||||
});
|
||||
}
|
||||
}
|
||||
sid
|
||||
}
|
||||
|
||||
/// Append one aggregated sample (called from the loops' existing ~2 s / ~1 s boundary). The
|
||||
/// `t_ms` is (re)stamped here from the capture's monotonic start, so callers may leave it `0`.
|
||||
/// Bounded at [`MAX_SAMPLES`]: on overflow we stop appending (oldest kept) and flag truncation.
|
||||
/// A no-op when nothing is armed (e.g. a `stop()` raced the frame boundary).
|
||||
pub fn push_sample(&self, session_id: u32, mut sample: StatsSample) {
|
||||
let mut guard = self.live.lock().unwrap_or_else(|e| e.into_inner());
|
||||
let Some(live) = guard.as_mut() else { return };
|
||||
if live.samples.len() >= MAX_SAMPLES {
|
||||
if !live.truncated {
|
||||
live.truncated = true;
|
||||
tracing::warn!(
|
||||
max = MAX_SAMPLES,
|
||||
"stats capture hit the sample cap — further samples dropped (oldest kept)"
|
||||
);
|
||||
}
|
||||
return;
|
||||
}
|
||||
sample.session_id = session_id;
|
||||
sample.t_ms = live.started.elapsed().as_millis() as u64;
|
||||
live.samples.push(sample);
|
||||
}
|
||||
|
||||
/// Disarm + finalize: write `<dir>/<id>.json` atomically (temp + rename) and return its meta.
|
||||
/// `Ok(None)` if nothing was recording.
|
||||
pub fn stop(&self) -> std::io::Result<Option<CaptureMeta>> {
|
||||
// Clear the hot-path gate first so frame threads stop building samples immediately.
|
||||
self.armed.store(false, Ordering::Relaxed);
|
||||
let Some(live) = self.live.lock().unwrap_or_else(|e| e.into_inner()).take() else {
|
||||
return Ok(None);
|
||||
};
|
||||
let meta = meta_of(&live);
|
||||
let capture = Capture {
|
||||
meta: meta.clone(),
|
||||
samples: live.samples,
|
||||
};
|
||||
let bytes = serde_json::to_vec(&capture).map_err(std::io::Error::other)?;
|
||||
// Atomic replace: write a sibling temp then rename, so a crash mid-write can't leave a half
|
||||
// file. The id is generated (always `valid_id`), so this only ever names a child of `dir`.
|
||||
let path = self.dir.join(format!("{}.json", meta.id));
|
||||
let tmp = self.dir.join(format!("{}.json.tmp", meta.id));
|
||||
std::fs::write(&tmp, &bytes)?;
|
||||
std::fs::rename(&tmp, &path)?;
|
||||
Ok(Some(meta))
|
||||
}
|
||||
|
||||
/// The in-progress capture status (idle = `armed: false`, zeroed fields).
|
||||
pub fn status(&self) -> StatsStatus {
|
||||
status_of(self.live.lock().unwrap_or_else(|e| e.into_inner()).as_ref())
|
||||
}
|
||||
|
||||
/// A clone of the in-progress capture for live graphing (`None` when idle).
|
||||
pub fn live_snapshot(&self) -> Option<Capture> {
|
||||
let guard = self.live.lock().unwrap_or_else(|e| e.into_inner());
|
||||
let live = guard.as_ref()?;
|
||||
Some(Capture {
|
||||
meta: meta_of(live),
|
||||
samples: live.samples.clone(),
|
||||
})
|
||||
}
|
||||
|
||||
/// All saved recordings, newest first. Each file is read in full, but only its `meta` is
/// deserialized (serde skips the `samples` array).
|
||||
pub fn list(&self) -> Vec<CaptureMeta> {
|
||||
/// Parse only the `meta` head — serde skips the (large) `samples` array.
|
||||
#[derive(Deserialize)]
|
||||
struct MetaOnly {
|
||||
meta: CaptureMeta,
|
||||
}
|
||||
let mut out: Vec<CaptureMeta> = Vec::new();
|
||||
let Ok(entries) = std::fs::read_dir(&self.dir) else {
|
||||
return out;
|
||||
};
|
||||
for entry in entries.flatten() {
|
||||
let path = entry.path();
|
||||
if path.extension().and_then(|e| e.to_str()) != Some("json") {
|
||||
continue;
|
||||
}
|
||||
if let Ok(bytes) = std::fs::read(&path) {
|
||||
if let Ok(parsed) = serde_json::from_slice::<MetaOnly>(&bytes) {
|
||||
out.push(parsed.meta);
|
||||
}
|
||||
}
|
||||
}
|
||||
out.sort_by_key(|m| std::cmp::Reverse(m.started_unix_ms));
|
||||
out
|
||||
}
|
||||
|
||||
/// Load a saved recording by id. Rejects a path-unsafe id (and a missing file) as `NotFound`.
|
||||
pub fn load(&self, id: &str) -> std::io::Result<Capture> {
|
||||
let path = self.recording_path(id)?;
|
||||
let bytes = std::fs::read(&path)?;
|
||||
serde_json::from_slice(&bytes)
|
||||
.map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidData, e))
|
||||
}
|
||||
|
||||
/// Delete a saved recording by id. Rejects a path-unsafe id (and a missing file) as `NotFound`.
|
||||
pub fn delete(&self, id: &str) -> std::io::Result<()> {
|
||||
let path = self.recording_path(id)?;
|
||||
std::fs::remove_file(&path)
|
||||
}
|
||||
|
||||
/// Resolve `dir/<id>.json` after validating `id`. A rejected id is `NotFound` (defense in
|
||||
/// depth: never let an attacker-shaped id escape `dir`).
|
||||
fn recording_path(&self, id: &str) -> std::io::Result<PathBuf> {
|
||||
if !valid_id(id) {
|
||||
return Err(std::io::Error::new(
|
||||
std::io::ErrorKind::NotFound,
|
||||
"invalid recording id",
|
||||
));
|
||||
}
|
||||
Ok(self.dir.join(format!("{id}.json")))
|
||||
}
|
||||
}
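
The temp + rename write used by `stop` can be sketched on its own. This is a minimal illustration assuming the temp file is a sibling of the target (same directory, hence same filesystem); `write_atomic` is a hypothetical name, and the sketch does not `fsync`, so it protects against a half-written file being visible under the final name, not against data loss on power failure.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Write `bytes` to `path` by first writing `<path>.tmp` next to it, then renaming.
/// Readers see either the old file or the complete new one, never a partial write.
fn write_atomic(path: &Path, bytes: &[u8]) -> io::Result<()> {
    // Build the sibling temp name by appending `.tmp` to the full file name.
    let mut tmp = path.as_os_str().to_owned();
    tmp.push(".tmp");
    let tmp = PathBuf::from(tmp);

    fs::write(&tmp, bytes)?;
    // `rename` replaces an existing destination file.
    fs::rename(&tmp, path)
}
```

If the write fails, the target is untouched; if the rename fails, a stray `.tmp` file can remain, which is harmless to a listing that only considers `.json` extensions.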
|
||||
|
||||
/// Build the live `StatsStatus` from the optional in-progress capture.
|
||||
fn status_of(live: Option<&Live>) -> StatsStatus {
|
||||
match live {
|
||||
Some(l) => StatsStatus {
|
||||
armed: true,
|
||||
sample_count: l.samples.len() as u32,
|
||||
started_unix_ms: l.started_unix_ms,
|
||||
kind: l.meta.as_ref().map(|m| m.kind.clone()).unwrap_or_default(),
|
||||
},
|
||||
None => StatsStatus {
|
||||
armed: false,
|
||||
sample_count: 0,
|
||||
started_unix_ms: 0,
|
||||
kind: String::new(),
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
/// Compute the `CaptureMeta` for an in-progress or finalizing capture (id derived from the start
|
||||
/// time + negotiated mode; duration from the monotonic start).
|
||||
fn meta_of(live: &Live) -> CaptureMeta {
|
||||
let (kind, width, height, fps, codec, client) = match &live.meta {
|
||||
Some(m) => (
|
||||
m.kind.clone(),
|
||||
m.width,
|
||||
m.height,
|
||||
m.fps,
|
||||
m.codec.clone(),
|
||||
m.client.clone(),
|
||||
),
|
||||
None => (String::new(), 0, 0, 0, String::new(), String::new()),
|
||||
};
|
||||
CaptureMeta {
|
||||
id: capture_id(live.started_unix_ms, width, height),
|
||||
started_unix_ms: live.started_unix_ms,
|
||||
duration_ms: live.started.elapsed().as_millis() as u64,
|
||||
kind,
|
||||
width,
|
||||
height,
|
||||
fps,
|
||||
codec,
|
||||
client,
|
||||
sample_count: live.samples.len() as u32,
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn temp_dir() -> PathBuf {
|
||||
// A per-call unique dir: a process-wide counter (NOT a timestamp, which collides when tests
|
||||
// run in parallel within the same millisecond — one test's cleanup would then wipe another's
|
||||
// dir mid-run).
|
||||
static COUNTER: AtomicU32 = AtomicU32::new(0);
|
||||
let n = COUNTER.fetch_add(1, Ordering::Relaxed);
|
||||
let p = std::env::temp_dir().join(format!("pf-stats-{}-{}", std::process::id(), n));
|
||||
let _ = std::fs::remove_dir_all(&p);
|
||||
p
|
||||
}
|
||||
|
||||
fn sample() -> StatsSample {
|
||||
StatsSample {
|
||||
t_ms: 0,
|
||||
session_id: 0,
|
||||
stages: vec![StageTiming {
|
||||
name: "capture".into(),
|
||||
p50_us: 100.0,
|
||||
p99_us: 200.0,
|
||||
}],
|
||||
fps: 60.0,
|
||||
repeat_fps: 0.0,
|
||||
mbps: 25.0,
|
||||
bitrate_kbps: 20_000,
|
||||
frames_dropped: 0,
|
||||
packets_dropped: 0,
|
||||
send_dropped: 0,
|
||||
fec_recovered: 0,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn arm_record_save_load_delete() {
|
||||
let dir = temp_dir();
|
||||
let rec = StatsRecorder::new(dir.clone());
|
||||
assert!(!rec.is_armed());
|
||||
assert!(!rec.status().armed);
|
||||
// A push while idle is a no-op (no live capture).
|
||||
rec.push_sample(0, sample());
|
||||
|
||||
let st = rec.start();
|
||||
assert!(st.armed);
|
||||
assert!(rec.is_armed());
|
||||
let sid = rec.register_session("native", 5120, 1440, 240, "hevc", "abcd");
|
||||
rec.push_sample(sid, sample());
|
||||
rec.push_sample(sid, sample());
|
||||
assert_eq!(rec.status().sample_count, 2);
|
||||
assert_eq!(rec.status().kind, "native");
|
||||
assert!(rec.live_snapshot().is_some());
|
||||
|
||||
let meta = rec.stop().unwrap().expect("a capture was recording");
|
||||
assert_eq!(meta.sample_count, 2);
|
||||
assert_eq!(meta.kind, "native");
|
||||
assert_eq!(meta.width, 5120);
|
||||
assert!(meta.id.ends_with("_5120x1440"), "id was {}", meta.id);
|
||||
assert!(!rec.is_armed());
|
||||
assert!(rec.live_snapshot().is_none());
|
||||
// Stop with nothing recording → Ok(None).
|
||||
assert!(rec.stop().unwrap().is_none());
|
||||
|
||||
// It is listed and loadable.
|
||||
let list = rec.list();
|
||||
assert_eq!(list.len(), 1);
|
||||
assert_eq!(list[0].id, meta.id);
|
||||
let loaded = rec.load(&meta.id).unwrap();
|
||||
assert_eq!(loaded.samples.len(), 2);
|
||||
assert_eq!(loaded.meta.codec, "hevc");
|
||||
|
||||
// Delete removes it; a second delete is NotFound.
|
||||
rec.delete(&meta.id).unwrap();
|
||||
assert!(rec.list().is_empty());
|
||||
assert_eq!(
|
||||
rec.delete(&meta.id).unwrap_err().kind(),
|
||||
std::io::ErrorKind::NotFound
|
||||
);
|
||||
let _ = std::fs::remove_dir_all(&dir);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rejects_path_traversal_ids() {
|
||||
let dir = temp_dir();
|
||||
let rec = StatsRecorder::new(dir.clone());
|
||||
for bad in [
|
||||
"../secret",
|
||||
"..",
|
||||
".",
|
||||
"a/b",
|
||||
"a\\b",
|
||||
"",
|
||||
"/etc/passwd",
|
||||
"x/../../y",
|
||||
] {
|
||||
assert_eq!(
|
||||
rec.load(bad).unwrap_err().kind(),
|
||||
std::io::ErrorKind::NotFound,
|
||||
"load({bad:?}) must be rejected as NotFound"
|
||||
);
|
||||
assert_eq!(
|
||||
rec.delete(bad).unwrap_err().kind(),
|
||||
std::io::ErrorKind::NotFound,
|
||||
"delete({bad:?}) must be rejected as NotFound"
|
||||
);
|
||||
}
|
||||
let _ = std::fs::remove_dir_all(&dir);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn samples_are_bounded() {
|
||||
let dir = temp_dir();
|
||||
let rec = StatsRecorder::new(dir.clone());
|
||||
rec.start();
|
||||
for _ in 0..(MAX_SAMPLES + 50) {
|
||||
rec.push_sample(0, sample());
|
||||
}
|
||||
assert_eq!(rec.status().sample_count as usize, MAX_SAMPLES);
|
||||
let _ = std::fs::remove_dir_all(&dir);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn start_is_idempotent_while_armed() {
|
||||
let dir = temp_dir();
|
||||
let rec = StatsRecorder::new(dir.clone());
|
||||
rec.start();
|
||||
rec.register_session("native", 1920, 1080, 60, "hevc", "");
|
||||
rec.push_sample(0, sample());
|
||||
// A second start must NOT wipe the in-progress capture.
|
||||
let st = rec.start();
|
||||
assert!(st.armed);
|
||||
assert_eq!(st.sample_count, 1);
|
||||
let _ = std::fs::remove_dir_all(&dir);
|
||||
}
|
||||
}
|
||||
@@ -13,6 +13,9 @@
|
||||
//! owned keepalive whose `Drop` releases the output (RAII — no explicit `destroy`). Capture
|
||||
//! consumes the node via [`crate::capture::capture_virtual_output`].
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use anyhow::Result;
|
||||
pub use punktfunk_core::Mode;
|
||||
#[cfg(target_os = "linux")]
|
||||
@@ -225,6 +228,8 @@ pub fn compositor_for_kind(kind: ActiveKind) -> Option<Compositor> {
|
||||
#[cfg(target_os = "linux")]
|
||||
fn default_runtime_dir() -> String {
|
||||
std::env::var("XDG_RUNTIME_DIR").unwrap_or_else(|_| {
|
||||
// SAFETY: `getuid()` is a parameterless POSIX call that always succeeds and touches no
|
||||
// memory — it just returns the calling process's real uid. Nothing is aliased or freed.
|
||||
let uid = unsafe { libc::getuid() };
|
||||
format!("/run/user/{uid}")
|
||||
})
|
||||
@@ -245,6 +250,8 @@ fn default_bus(runtime: &str) -> String {
|
||||
#[cfg(target_os = "linux")]
|
||||
pub fn detect_active_session() -> ActiveSession {
|
||||
use std::os::unix::fs::MetadataExt;
|
||||
// SAFETY: `getuid()` is a parameterless POSIX call that always succeeds and touches no memory —
|
||||
// it just returns the calling process's real uid. Nothing is aliased or freed.
|
||||
let uid = unsafe { libc::getuid() };
|
||||
let xdg_runtime_dir = default_runtime_dir();
|
||||
let dbus = default_bus(&xdg_runtime_dir);
|
||||
@@ -615,12 +622,12 @@ mod gamescope;
|
||||
#[cfg(target_os = "linux")]
|
||||
#[path = "vdisplay/linux/kwin.rs"]
|
||||
mod kwin;
|
||||
#[cfg(target_os = "linux")]
|
||||
#[path = "vdisplay/linux/mutter.rs"]
|
||||
mod mutter;
|
||||
#[cfg(target_os = "windows")]
|
||||
#[path = "vdisplay/windows/manager.rs"]
|
||||
pub(crate) mod manager;
|
||||
#[cfg(target_os = "linux")]
|
||||
#[path = "vdisplay/linux/mutter.rs"]
|
||||
mod mutter;
|
||||
#[cfg(target_os = "windows")]
|
||||
#[path = "vdisplay/windows/pf_vdisplay.rs"]
|
||||
pub(crate) mod pf_vdisplay;
|
||||
|
||||
@@ -15,6 +15,8 @@
|
||||
//! the KWin session's environment.
|
||||
|
||||
#![allow(clippy::all, dead_code, non_camel_case_types, non_snake_case, unused)]
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use super::{Mode, VirtualDisplay, VirtualOutput};
|
||||
use anyhow::{anyhow, bail, Context, Result};
|
||||
@@ -495,6 +497,11 @@ fn run(
|
||||
events: libc::POLLIN,
|
||||
revents: 0,
|
||||
};
|
||||
// SAFETY: `&mut pfd` points at a single live, fully-initialized `libc::pollfd` on the stack, and
|
||||
// the count `1` matches that one-element array, so `poll` reads `fd`/`events` and writes `revents`
|
||||
// strictly within `pfd`. `pfd.fd` is the Wayland connection's fd, valid because `conn` (and the
|
||||
// `prepare_read` guard) are alive across the call. `poll` blocks up to 200 ms and writes only
|
||||
// `revents`; `pfd` outlives the synchronous call and aliases nothing (a fresh local).
|
||||
let r = unsafe { libc::poll(&mut pfd, 1, 200) };
|
||||
if r > 0 && (pfd.revents & libc::POLLIN) != 0 {
|
||||
let _ = guard.read();
|
||||
|
||||
@@ -13,6 +13,9 @@
|
||||
//! its `Drop` releases the refcount (a *stale* lease — its monitor was preempted + recreated under it —
|
||||
//! is a no-op, so it can never tear down the live monitor).
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use std::os::windows::io::{AsRawHandle, OwnedHandle};
|
||||
use std::sync::atomic::{AtomicBool, AtomicU32, AtomicU64, Ordering};
|
||||
use std::sync::{Arc, Mutex, Once, OnceLock};
|
||||
@@ -60,8 +63,12 @@ pub(crate) trait VdisplayDriver: Send + Sync {
|
||||
///
|
||||
/// # Safety
|
||||
/// `dev` must be the live control handle from [`open`](Self::open).
|
||||
unsafe fn add_monitor(&self, dev: HANDLE, mode: Mode, render_luid: Option<LUID>)
|
||||
-> Result<AddedMonitor>;
|
||||
unsafe fn add_monitor(
|
||||
&self,
|
||||
dev: HANDLE,
|
||||
mode: Mode,
|
||||
render_luid: Option<LUID>,
|
||||
) -> Result<AddedMonitor>;
|
||||
/// REMOVE the monitor identified by `key`.
|
||||
///
|
||||
/// # Safety
|
||||
@@ -147,7 +154,8 @@ pub(crate) fn init(driver: Box<dyn VdisplayDriver>) -> &'static VirtualDisplayMa
|
||||
/// The process-wide manager. Panics if reached before a backend called [`init`] — by construction a
|
||||
/// session is only ever created after `vdisplay::open` constructed the backend (which calls `init`).
|
||||
pub(crate) fn vdm() -> &'static VirtualDisplayManager {
|
||||
VDM.get().expect("VirtualDisplayManager used before a backend initialised it")
|
||||
VDM.get()
|
||||
.expect("VirtualDisplayManager used before a backend initialised it")
|
||||
}
|
||||
|
||||
impl VirtualDisplayManager {
|
||||
@@ -161,6 +169,10 @@ impl VirtualDisplayManager {
|
||||
if let Some(d) = self.device.get() {
|
||||
return Ok(HANDLE(d.as_raw_handle()));
|
||||
}
|
||||
// SAFETY: `VdisplayDriver::open` is `unsafe` only because it issues SetupAPI + `DeviceIoControl`
|
||||
// FFI in the caller's apartment; `ensure_device` runs that on the acquiring thread under the
|
||||
// `state` lock (callers hold it), so there is no concurrent open. `open` has no handle
|
||||
// precondition to uphold, and the `OwnedHandle` it returns is the sole owner of the device.
|
||||
let (handle, watchdog_s) = unsafe { self.driver.open()? };
|
||||
self.watchdog_s.store(watchdog_s, Ordering::Relaxed);
|
||||
let raw = HANDLE(handle.as_raw_handle());
|
||||
@@ -171,9 +183,7 @@ impl VirtualDisplayManager {
|
||||
/// The live control handle for the pinger/linger threads (lock-free: the device never changes once
|
||||
/// opened). `None` only before the first acquire opened it.
|
||||
fn device_handle(&self) -> Option<HANDLE> {
|
||||
self.device
|
||||
.get()
|
||||
.map(|d| HANDLE(d.as_raw_handle()))
|
||||
self.device.get().map(|d| HANDLE(d.as_raw_handle()))
|
||||
}
|
||||
|
||||
/// Open + initialise the backend (validates the driver is present). Mirrors the old
|
||||
@@ -196,8 +206,7 @@ impl VirtualDisplayManager {
|
||||
// client is gone). A REUSED IddCx swap-chain is DEAD, so joining it hands a black screen —
|
||||
// PREEMPT: tear the old monitor down (its key/topology are restored) and create a fresh one. The
|
||||
// old session's lease is gen-stamped, so its later drop is a no-op and can't tear down the new one.
|
||||
if idd_push_mode()
|
||||
&& matches!(*state, MgrState::Active { .. } | MgrState::Lingering { .. })
|
||||
if idd_push_mode() && matches!(*state, MgrState::Active { .. } | MgrState::Lingering { .. })
|
||||
{
|
||||
if let MgrState::Active { mon, .. } | MgrState::Lingering { mon, .. } =
|
||||
std::mem::replace(&mut *state, MgrState::Idle)
|
||||
@@ -206,6 +215,10 @@ impl VirtualDisplayManager {
|
||||
old_target = mon.target_id,
|
||||
"IDD-push reconnect — preempting the prior session, recreating a fresh monitor"
|
||||
);
|
||||
// SAFETY: `teardown` requires `dev` to be the live control handle; `dev` is the value
|
||||
// `ensure_device()` returned above (the device is cached in the `OnceLock` and never
|
||||
// closed for the manager's lifetime). `mon` was moved out of the prior `Active`/
|
||||
// `Lingering` state by `mem::replace`, so it is exclusively owned here — no aliasing.
|
||||
unsafe { self.teardown(dev, mon) };
|
||||
// Let the OS finish the ASYNC monitor departure before the next ADD; a back-to-back
|
||||
// REMOVE→ADD races the teardown and the ADD IOCTL is rejected under reconnect churn.
|
||||
@@ -219,21 +232,37 @@ impl VirtualDisplayManager {
|
||||
if let MgrState::Active { mon, refs } = &mut *state {
|
||||
*refs += 1;
|
||||
if mon.mode != mode {
|
||||
// SAFETY: `reconfigure` only manipulates the live display topology via the CCD/GDI
|
||||
// helpers and needs an exclusive `&mut Monitor`. `mon` is the `&mut` into the current
|
||||
// `Active` state, held under the `state` lock, so nothing else reconfigures it concurrently.
|
||||
unsafe { self.reconfigure(mon, mode) };
|
||||
}
|
||||
tracing::info!(refs = *refs, backend = self.driver.name(), "virtual monitor reused (concurrent / reconfigure session)");
|
||||
tracing::info!(
|
||||
refs = *refs,
|
||||
backend = self.driver.name(),
|
||||
"virtual monitor reused (concurrent / reconfigure session)"
|
||||
);
|
||||
return Ok(self.output_for(mon));
|
||||
}
|
||||
|
||||
// Idle or Lingering: repurpose a lingering monitor / create a fresh one → Active{refs:1}.
|
||||
let mon = match std::mem::replace(&mut *state, MgrState::Idle) {
|
||||
MgrState::Lingering { mut mon, .. } => {
|
||||
tracing::info!(backend = self.driver.name(), "virtual monitor reused (reconnect within the linger window)");
|
||||
tracing::info!(
|
||||
backend = self.driver.name(),
|
||||
"virtual monitor reused (reconnect within the linger window)"
|
||||
);
|
||||
if mon.mode != mode {
|
||||
// SAFETY: `reconfigure` needs an exclusive `&mut Monitor` and only touches the live
|
||||
// display topology. `mon` is the local monitor just moved out of the `Lingering`
|
||||
// state (sole owner), and we hold the `state` lock — no concurrent reconfigure.
|
||||
unsafe { self.reconfigure(&mut mon, mode) };
|
||||
}
|
||||
mon
|
||||
}
|
||||
// SAFETY: `create_monitor` requires `dev` to be the live control handle; `dev` is the
|
||||
// handle `ensure_device()` returned above (cached in the `OnceLock`, never closed for the
|
||||
// manager's lifetime), and we hold the `state` lock.
|
||||
MgrState::Idle => unsafe { self.create_monitor(dev, mode)? },
|
||||
MgrState::Active { .. } => unreachable!("handled above"),
|
||||
};
|
||||
@@ -262,17 +291,27 @@ impl VirtualDisplayManager {
|
||||
/// # Safety
|
||||
/// `dev` must be the live control handle.
|
||||
unsafe fn create_monitor(&'static self, dev: HANDLE, mode: Mode) -> Result<Monitor> {
|
||||
// SAFETY: `create_monitor`'s own `# Safety` contract guarantees `dev` is the live control
|
||||
// handle; we forward it unchanged to `add_monitor`, whose precondition is exactly that.
|
||||
// `resolve_render_pin()` returns an `Option<LUID>` by value (plain `Copy`), so no borrowed
|
||||
// memory crosses the call.
|
||||
let added = unsafe { self.driver.add_monitor(dev, mode, resolve_render_pin())? };
|
||||
|
||||
// Mandatory keepalive: ping inside the watchdog window or the driver tears all displays down.
|
||||
// The pinger reaches the singleton for both the device + the driver — no raw-handle smuggle.
|
||||
let stop = Arc::new(AtomicBool::new(false));
|
||||
let interval = Duration::from_millis(self.watchdog_s.load(Ordering::Relaxed) as u64 * 1000 / 3);
|
||||
let interval =
|
||||
Duration::from_millis(self.watchdog_s.load(Ordering::Relaxed) as u64 * 1000 / 3);
|
||||
let stop_t = stop.clone();
|
||||
let pinger = thread::spawn(move || {
|
||||
let mut warned = false;
|
||||
while !stop_t.load(Ordering::Relaxed) {
|
||||
if let Some(h) = vdm().device_handle() {
|
||||
// SAFETY: `ping` requires `dev` to be the live control handle. `h` is from
|
||||
// `device_handle()` (the `Some` branch) — the `OnceLock<Arc<OwnedHandle>>` that,
|
||||
// once set, is never cleared or closed for the process lifetime, so the handle is
|
||||
// live for this call. The pinger thread only spins while the `&'static` manager
|
||||
// singleton (and thus the device) lives.
|
||||
match unsafe { vdm().driver.ping(h) } {
|
||||
Ok(()) => warned = false,
|
||||
Err(e) => {
|
||||
@@ -292,6 +331,9 @@ impl VirtualDisplayManager {
|
||||
let mut gdi_name = None;
|
||||
for _ in 0..15 {
|
||||
thread::sleep(Duration::from_millis(200));
|
||||
// SAFETY: `resolve_gdi_name` is `unsafe` for its CCD (QueryDisplayConfig) FFI; it takes a
|
||||
// plain `Copy` `u32` target id by value and returns an owned `String`, so no caller memory
|
||||
// is borrowed across the call.
|
||||
if let Some(n) = unsafe { resolve_gdi_name(added.target_id) } {
|
||||
gdi_name = Some(n);
|
||||
break;
|
||||
@@ -308,6 +350,9 @@ impl VirtualDisplayManager {
|
||||
// display(s) first via the atomic CCD path promotes the IDD to a composited primary with no
|
||||
// MODE_CHANGE storm. Opt out with PUNKTFUNK_NO_ISOLATE=1.
|
||||
if std::env::var("PUNKTFUNK_NO_ISOLATE").is_err() {
|
||||
// SAFETY: `isolate_displays_ccd` is `unsafe` for its CCD topology FFI; it takes a
|
||||
// `Copy` `u32` by value and returns an owned `SavedConfig` snapshot (no borrowed
|
||||
// memory crosses). It runs under the `state` lock, the sole mutator of the topology.
|
||||
ccd_saved = unsafe { isolate_displays_ccd(added.target_id) };
|
||||
} else {
|
||||
tracing::info!("display isolation skipped (PUNKTFUNK_NO_ISOLATE) — IDD stays extended");
|
||||
@@ -339,10 +384,15 @@ impl VirtualDisplayManager {
|
||||
/// Touches the live display topology via the CCD/GDI helpers.
|
||||
unsafe fn reconfigure(&self, mon: &mut Monitor, mode: Mode) {
|
||||
tracing::info!(
|
||||
old = format!("{}x{}@{}", mon.mode.width, mon.mode.height, mon.mode.refresh_hz),
|
||||
old = format!(
|
||||
"{}x{}@{}",
|
||||
mon.mode.width, mon.mode.height, mon.mode.refresh_hz
|
||||
),
|
||||
new = format!("{}x{}@{}", mode.width, mode.height, mode.refresh_hz),
|
||||
"virtual-display: reconfiguring reused monitor to the new client mode"
|
||||
);
|
||||
// SAFETY: `resolve_gdi_name` is `unsafe` for its CCD FFI; it takes the `Copy` `u32`
|
||||
// `mon.target_id` by value and returns an owned `String`, so nothing borrowed crosses the call.
|
||||
if let Some(n) = unsafe { resolve_gdi_name(mon.target_id) } {
|
||||
mon.gdi_name = Some(n);
|
||||
}
|
||||
@@ -365,10 +415,16 @@ impl VirtualDisplayManager {
|
||||
if let Some(saved) = &mon.ccd_saved {
|
||||
restore_displays_ccd(saved);
|
||||
}
|
||||
// SAFETY: `teardown`'s own `# Safety` contract guarantees `dev` is the live control handle, and
|
||||
// `remove_monitor` requires exactly that. `&mon.key` borrows the `MonitorKey` inside the
|
||||
// still-owned `mon`, alive for this synchronous IOCTL, so the pointer the driver reads stays valid.
|
||||
if let Err(e) = unsafe { self.driver.remove_monitor(dev, &mon.key) } {
|
||||
tracing::warn!("virtual-display REMOVE failed: {e:#}");
|
||||
} else {
|
||||
tracing::info!(backend = self.driver.name(), "virtual-display monitor removed");
|
||||
tracing::info!(
|
||||
backend = self.driver.name(),
|
||||
"virtual-display monitor removed"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -385,10 +441,16 @@ impl VirtualDisplayManager {
|
||||
return;
|
||||
}
|
||||
*state = match std::mem::replace(&mut *state, MgrState::Idle) {
|
||||
MgrState::Active { mon, refs } if refs > 1 => MgrState::Active { mon, refs: refs - 1 },
|
||||
MgrState::Active { mon, refs } if refs > 1 => MgrState::Active {
|
||||
mon,
|
||||
refs: refs - 1,
|
||||
},
|
||||
MgrState::Active { mon, .. } => {
|
||||
let ms = linger_ms();
|
||||
tracing::info!(linger_ms = ms, "virtual-display: last session left — lingering before teardown");
|
||||
tracing::info!(
|
||||
linger_ms = ms,
|
||||
"virtual-display: last session left — lingering before teardown"
|
||||
);
|
||||
MgrState::Lingering {
|
||||
mon,
|
||||
until: Instant::now() + Duration::from_millis(ms),
|
||||
@@ -470,6 +532,10 @@ impl VirtualDisplayManager {
|
||||
}
|
||||
};
|
||||
if let Some(mon) = taken {
|
||||
// SAFETY: `teardown` requires `dev` to be the live control handle; `dev` is from
|
||||
// `self.device_handle()` (the `Some` checked just above), i.e. the cached
|
||||
// `OwnedHandle` live for the process lifetime. `mon` was moved out of the
|
||||
// `Lingering` state under the `state` lock, so it is exclusively owned here.
|
||||
unsafe { self.teardown(dev, mon) };
|
||||
}
|
||||
})
|
||||
@@ -503,9 +569,13 @@ fn idd_push_mode() -> bool {
|
||||
/// ACCESS_LOST storm SudoVDA hit when pinned).
|
||||
fn resolve_render_pin() -> Option<LUID> {
|
||||
if crate::config::config().render_adapter.is_some() {
|
||||
// SAFETY: `resolve_render_adapter_luid` is `unsafe` only for its DXGI factory FFI; it takes no
|
||||
// arguments and returns an `Option<LUID>` by value, so there is no input/borrow to keep valid.
|
||||
unsafe { crate::win_adapter::resolve_render_adapter_luid() }
|
||||
} else if crate::config::config().idd_push {
|
||||
tracing::info!("IDD push: pinning the discrete render GPU (SET_RENDER_ADAPTER)");
|
||||
// SAFETY: as above — `resolve_render_adapter_luid` takes no arguments and returns an
|
||||
// `Option<LUID>` by value; the `unsafe` covers only its DXGI factory enumeration FFI.
|
||||
unsafe { crate::win_adapter::resolve_render_adapter_luid() }
|
||||
} else {
|
||||
tracing::info!(
|
||||
|
||||
@@ -6,7 +6,7 @@
|
||||
//!
|
||||
//! Control surface: a device-interface-GUID + `CreateFileW` + `DeviceIoControl` IOCTL protocol, with
|
||||
//! the wire contract OWNED by [`pf_driver_proto::control`] (versioned + `#[repr(C)] Pod` structs,
|
||||
//! NOT the SudoVDA ABI). No DLL, no named pipe. See `docs/windows-host-rewrite.md`.
|
||||
//! NOT the SudoVDA ABI). No DLL, no named pipe. See `design/windows-host-rewrite.md`.
|
||||
//!
|
||||
//! This is a faithful clone of [`super::sudovda`] (the shipping fallback) repointed at the new driver:
|
||||
//! same reference-counted/lingering monitor lifecycle, same CCD isolation + active-mode forcing — those
|
||||
@@ -14,6 +14,9 @@
|
||||
//! target id, so the CCD/DXGI code works unchanged). Only the driver-specific bits (GUID, IOCTL codes,
|
||||
//! request/reply structs, the version handshake) differ, per `pf_driver_proto`.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use std::ffi::c_void;
|
||||
use std::mem::size_of;
|
||||
use std::os::windows::io::{FromRawHandle, OwnedHandle};
|
||||
@@ -144,15 +147,26 @@ impl VdisplayDriver for PfVdisplayDriver {
|
||||
}
|
||||
|
||||
unsafe fn open(&self) -> Result<(OwnedHandle, u32)> {
|
||||
// SAFETY: `open_device` is `unsafe` only because it issues SetupAPI enumeration + `CreateFileW`
|
||||
// FFI; it takes no arguments and returns an owned raw `HANDLE` (or `Err`). Called here on the
|
||||
// backend-init thread, with no precondition beyond a valid thread context.
|
||||
let device = unsafe { open_device()? };
|
||||
// HARD protocol-version check (unlike SudoVDA's best-effort log): a mismatched host/driver pair
|
||||
// fails loudly here rather than corrupting the IOCTL stream.
|
||||
let mut info_buf = [0u8; size_of::<control::InfoReply>()];
|
||||
// SAFETY: `ioctl` requires `h` to be a valid device handle and its slices to be valid for the
|
||||
// call. `device` is the live handle just returned by `open_device`. `IOCTL_GET_INFO` takes no
|
||||
// input (`&[]`) and writes into `info_buf`, a stack `[u8; size_of::<InfoReply>()]` whose length
|
||||
// is passed as the output size — so `DeviceIoControl` can't write OOB — and which outlives this
|
||||
// synchronous call.
|
||||
unsafe { ioctl(device, control::IOCTL_GET_INFO, &[], &mut info_buf) }
|
||||
.context("pf-vdisplay IOCTL_GET_INFO (version handshake)")?;
|
||||
let info: control::InfoReply =
|
||||
bytemuck::pod_read_unaligned(&info_buf[..size_of::<control::InfoReply>()]);
|
||||
if info.protocol_version != pf_driver_proto::PROTOCOL_VERSION {
|
||||
// SAFETY: `device` is the valid raw handle from `open_device` and has NOT yet been wrapped
|
||||
// in an `OwnedHandle` (that happens only on the success path below), so this error path is
|
||||
// the sole owner closing it exactly once — no double-close.
|
||||
unsafe {
|
||||
let _ = CloseHandle(device);
|
||||
}
|
||||
@@ -171,12 +185,19 @@ impl VdisplayDriver for PfVdisplayDriver {
|
||||
);
|
||||
// Reap monitors orphaned by a crashed previous host — a FIRST-CLASS op (driver returns SUCCESS).
|
||||
let mut none: [u8; 0] = [];
|
||||
// SAFETY: `device` is the live handle from `open_device` (still owned here, before it is wrapped
|
||||
// below). `IOCTL_CLEAR_ALL` has no input and no output: `&[]` and the empty `none` slice pass
|
||||
// zero-length buffers, so nothing is read or written through them.
|
||||
if unsafe { ioctl(device, control::IOCTL_CLEAR_ALL, &[], &mut none) }.is_ok() {
|
||||
tracing::info!("cleared orphaned virtual monitors on host startup");
|
||||
} else {
|
||||
tracing::warn!("pf-vdisplay IOCTL_CLEAR_ALL failed on startup (continuing)");
|
||||
}
|
||||
Ok((
|
||||
// SAFETY: `device` is the valid handle from `open_device`, still owned here and NOT closed
|
||||
// on this success path (the error paths above close it and return). `from_raw_handle`'s
|
||||
// contract — caller owns a valid handle — holds, so ownership transfers cleanly into the
|
||||
// `OwnedHandle`: exactly one owner, which `CloseHandle`s it on drop.
|
||||
unsafe { OwnedHandle::from_raw_handle(device.0 as _) },
|
||||
watchdog_s,
|
||||
))
|
||||
@@ -199,6 +220,9 @@ impl VdisplayDriver for PfVdisplayDriver {
|
||||
// SET_RENDER_ADAPTER (opt-in; pf-vdisplay IMPLEMENTS it). Non-fatal on failure: the driver reports
|
||||
// its real render LUID in the shared header, so the host binds correctly even if this is ignored.
|
||||
if let Some(luid) = render_luid {
|
||||
// SAFETY: `add_monitor`'s `# Safety` contract guarantees `dev` is the live control handle,
|
||||
// which is `set_render_adapter`'s precondition; we forward it unchanged. `luid` is a plain
|
||||
// `Copy` `LUID` passed by value — no borrow crosses the call.
|
||||
match unsafe { set_render_adapter(dev, luid) } {
|
||||
Ok(()) => tracing::info!(
|
||||
luid = format!("{:08x}:{:08x}", luid.HighPart, luid.LowPart),
|
||||
@@ -210,14 +234,17 @@ impl VdisplayDriver for PfVdisplayDriver {
|
||||
}
|
||||
}
|
||||
let mut out = [0u8; size_of::<control::AddReply>()];
|
||||
unsafe { ioctl(dev, control::IOCTL_ADD, bytemuck::bytes_of(&add), &mut out) }.with_context(
|
||||
|| {
|
||||
// SAFETY: per `add_monitor`'s contract `dev` is the live control handle. `bytemuck::bytes_of(&add)`
|
||||
// borrows the local `AddRequest` (alive across this synchronous call) as the input bytes, and
|
||||
// `out` is a stack `[u8; size_of::<AddReply>()]` whose length bounds the kernel's write — both
|
||||
// buffers outlive the call.
|
||||
unsafe { ioctl(dev, control::IOCTL_ADD, bytemuck::bytes_of(&add), &mut out) }
|
||||
.with_context(|| {
|
||||
format!(
|
||||
"pf-vdisplay ADD {}x{}@{}",
|
||||
mode.width, mode.height, mode.refresh_hz
|
||||
)
|
||||
},
|
||||
)?;
|
||||
})?;
|
||||
// `pod_read_unaligned` (NOT `from_bytes`): `out` is a stack `[u8; N]` with no guaranteed 4-byte
|
||||
// alignment, and `from_bytes` PANICS on a mismatch. This copies into an aligned `AddReply`.
|
||||
let reply: control::AddReply =
|
||||
@@ -260,11 +287,24 @@ impl VdisplayDriver for PfVdisplayDriver {
|
||||
session_id: *session_id,
|
||||
};
|
||||
let mut none: [u8; 0] = [];
|
||||
unsafe { ioctl(dev, control::IOCTL_REMOVE, bytemuck::bytes_of(&req), &mut none) }.map(|_| ())
|
||||
// SAFETY: per `remove_monitor`'s contract `dev` is the live control handle. `bytes_of(&req)`
|
||||
// borrows the local `RemoveRequest` for the duration of this synchronous call as the input
|
||||
// bytes; `none` is empty, so there is no output buffer.
|
||||
unsafe {
|
||||
ioctl(
|
||||
dev,
|
||||
control::IOCTL_REMOVE,
|
||||
bytemuck::bytes_of(&req),
|
||||
&mut none,
|
||||
)
|
||||
}
|
||||
.map(|_| ())
|
||||
}
|
||||
|
||||
unsafe fn ping(&self, dev: HANDLE) -> Result<()> {
|
||||
let mut none: [u8; 0] = [];
|
||||
// SAFETY: per `ping`'s contract `dev` is the live control handle. `IOCTL_PING` has no input
|
||||
// (`&[]`) and no output (`none` is empty), so no memory is read or written through the buffers.
|
||||
unsafe { ioctl(dev, control::IOCTL_PING, &[], &mut none) }.map(|_| ())
|
||||
}
|
||||
}
|
||||
@@ -292,7 +332,11 @@ impl VirtualDisplay for PfVdisplayDisplay {
|
||||
|
||||
/// Readiness probe: can we open the pf-vdisplay control device?
|
||||
pub fn probe() -> Result<()> {
|
||||
// SAFETY: `open_device` is `unsafe` only for its SetupAPI + `CreateFileW` FFI; no arguments, returns
|
||||
// an owned raw `HANDLE` (or `Err`).
|
||||
let h = unsafe { open_device()? };
|
||||
// SAFETY: `h` is the handle just opened by `open_device` in this function, owned here and not yet
|
||||
// handed anywhere else, so this closes it exactly once — no double-close, no use-after-close.
|
||||
unsafe {
|
||||
let _ = CloseHandle(h);
|
||||
}
|
||||
@@ -301,6 +345,9 @@ pub fn probe() -> Result<()> {
|
||||
|
||||
/// Is the pf-vdisplay driver present (device interface enumerable)?
|
||||
pub fn is_available() -> bool {
|
||||
// SAFETY: `open_device` returns an owned raw `HANDLE`; on `Ok(h)` the handle is moved into the
|
||||
// closure (sole owner) and closed exactly once via `CloseHandle`, on `Err` there is nothing to
|
||||
// close — so no double-close and no leak of an opened handle. The `unsafe` covers both FFI calls.
|
||||
unsafe { open_device().map(|h| CloseHandle(h)).is_ok() }
|
||||
}
|
||||
|
||||
|
||||
@@ -15,6 +15,9 @@
|
||||
//! that is correct for launching *our own* streamer, but a store launcher needs the real user's token
|
||||
//! for activation + auth). The host process itself stays SYSTEM.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use anyhow::{bail, Context, Result};
|
||||
use std::path::Path;
|
||||
use windows::core::{PCWSTR, PWSTR};
|
||||
@@ -40,6 +43,8 @@ use windows::Win32::System::Threading::{
|
||||
/// user is logged on (a pre-login / freshly-booted box can stream the login desktop but cannot
|
||||
/// auto-launch a store title until someone signs in).
|
||||
pub fn spawn_in_active_session(cmdline: &str, workdir: Option<&Path>) -> Result<u32> {
|
||||
// SAFETY: `spawn_inner` is unsafe only for its Win32 FFI; it has no caller-side preconditions — it
|
||||
// validates the session/token itself and owns every handle it opens — so calling it is always sound.
|
||||
unsafe { spawn_inner(cmdline, workdir) }
|
||||
}
|
||||
|
||||
|
||||
@@ -21,6 +21,9 @@
|
||||
//! loaded into the service's environment and carried to the host child. Logs land in
|
||||
//! `%ProgramData%\punktfunk\logs\`.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use anyhow::{bail, Context, Result};
|
||||
use std::ffi::{c_void, OsString};
|
||||
use std::os::windows::io::{AsRawHandle, FromRawHandle, OwnedHandle};
|
||||
@@ -205,14 +208,19 @@ fn run_service() -> Result<()> {
|
||||
|
||||
// Two manual-reset events: STOP (set once, never reset) and SESSION (set on a console
|
||||
// connect/disconnect, reset by the supervisor after it reacts).
|
||||
// SAFETY: CreateEventW with null attributes (None), manual-reset=true, initial-state=false and a null
|
||||
// name passes no pointers into Rust memory; it returns a fresh, owned event HANDLE (or Err, via `?`).
|
||||
// Nothing aliases or outlives the call.
|
||||
let stop_raw =
|
||||
unsafe { CreateEventW(None, true, false, PCWSTR::null()) }.context("CreateEvent stop")?;
|
||||
// SAFETY: as above — a second fresh manual-reset event; no pointers into Rust memory, no aliasing.
|
||||
let session_raw = unsafe { CreateEventW(None, true, false, PCWSTR::null()) }
|
||||
.context("CreateEvent session")?;
|
||||
// Own each event handle (the OS reaps them at process exit); the handler reaches them through the
|
||||
// OnceLocks, while `supervise` waits on the borrowed `HANDLE`s. SAFETY: each is a fresh CreateEventW
|
||||
// handle we own — take ownership exactly once.
|
||||
let stop_owned = unsafe { OwnedHandle::from_raw_handle(stop_raw.0) };
|
||||
// SAFETY: `session_raw` is the other fresh CreateEventW handle nothing else owns — take ownership once.
|
||||
let session_owned = unsafe { OwnedHandle::from_raw_handle(session_raw.0) };
|
||||
let stop = HANDLE(stop_owned.as_raw_handle());
|
||||
let session = HANDLE(session_owned.as_raw_handle());
|
||||
@@ -226,6 +234,9 @@ fn run_service() -> Result<()> {
|
||||
match control {
|
||||
ServiceControl::Stop | ServiceControl::Preshutdown | ServiceControl::Shutdown => {
|
||||
if let Some(h) = event_handle(&STOP_EVENT) {
|
||||
// SAFETY: `h` borrows the STOP event HANDLE from the STOP_EVENT OwnedHandle, set for
|
||||
// the whole process lifetime and never closed before exit, so it is open here; SetEvent
|
||||
// only signals the event and passes no Rust memory.
|
||||
unsafe { SetEvent(h) }.ok();
|
||||
}
|
||||
ServiceControlHandlerResult::NoError
|
||||
@@ -237,6 +248,9 @@ fn run_service() -> Result<()> {
|
||||
ConsoleConnect | ConsoleDisconnect | SessionLogon
|
||||
) {
|
||||
if let Some(h) = event_handle(&SESSION_EVENT) {
|
||||
// SAFETY: `h` borrows the SESSION event HANDLE from the SESSION_EVENT OwnedHandle,
|
||||
// alive for the whole process lifetime and never closed before exit; SetEvent only
|
||||
// signals the event and passes no Rust memory.
|
||||
unsafe { SetEvent(h) }.ok();
|
||||
}
|
||||
}
|
||||
@@ -297,6 +311,8 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
// Kill-on-close job so a service crash never orphans the SYSTEM host; BREAKAWAY_OK lets the host
|
||||
// still spawn the WGC helper. Owned: dropping it at function exit (KILL_ON_JOB_CLOSE) reaps any
|
||||
// straggler still inside it — no manual CloseHandle(job).
|
||||
// SAFETY: `make_job` is unsafe only for its Win32 FFI; it has no caller preconditions and creates +
|
||||
// immediately takes RAII ownership of the job object, so calling it here is sound.
|
||||
let job = unsafe { make_job() }.context("create job object")?;
|
||||
|
||||
let mut restarts: u32 = 0;
|
||||
@@ -304,6 +320,8 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
if wait_one(stop, 0) {
|
||||
break;
|
||||
}
|
||||
// SAFETY: WTSGetActiveConsoleSessionId takes no arguments and returns the active console session
|
||||
// id (or 0xFFFFFFFF); it passes no pointers, so the call is always sound.
|
||||
let session = unsafe { WTSGetActiveConsoleSessionId() };
|
||||
if session == 0xFFFF_FFFF {
|
||||
// No interactive session yet (boot / fully logged out). Wait, but wake on stop/session.
|
||||
@@ -311,12 +329,17 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
if wait_any(&[stop, session_ev], 3000) == Some(0) {
|
||||
break;
|
||||
}
|
||||
// SAFETY: `session_ev` is the SESSION event HANDLE borrowed from the SESSION_EVENT OwnedHandle,
|
||||
// alive for the process lifetime; ResetEvent only clears its signalled state, no Rust memory.
|
||||
unsafe { ResetEvent(session_ev) }.ok();
|
||||
continue;
|
||||
}
|
||||
|
||||
// BORROW the owned job handle for AssignProcessToJobObject inside spawn_host.
|
||||
let job_h = HANDLE(job.as_raw_handle());
|
||||
// SAFETY: `spawn_host` is unsafe only for its Win32 FFI. `session` is a valid console session id
|
||||
// (checked != 0xFFFFFFFF above), `cmdline`/`workdir` are live borrows for the call, and `job_h`
|
||||
// borrows the still-live `job` OwnedHandle — every argument is valid for the call's duration.
|
||||
let child = match unsafe { spawn_host(session, &cmdline, &workdir, job_h) } {
|
||||
Ok(child) => child,
|
||||
Err(e) => {
|
||||
@@ -340,6 +363,9 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
match reason {
|
||||
Some(0) => {
|
||||
// Stop: terminate the child and exit (the `child` drop closes its handles).
|
||||
// SAFETY: `proc_h` is a HANDLE copy of the still-live `child.process` OwnedHandle (not
|
||||
// dropped until end of iteration), so the process handle is open; TerminateProcess only
|
||||
// signals termination by handle and passes no Rust memory.
|
||||
unsafe {
|
||||
let _ = TerminateProcess(proc_h, 0);
|
||||
}
|
||||
@@ -347,7 +373,10 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
}
|
||||
Some(1) => {
|
||||
// Session change: relaunch only if the active console session actually moved.
|
||||
// SAFETY: `session_ev` borrows the process-lifetime SESSION_EVENT OwnedHandle; ResetEvent
|
||||
// only clears its signalled state and passes no Rust memory.
|
||||
unsafe { ResetEvent(session_ev) }.ok();
|
||||
// SAFETY: WTSGetActiveConsoleSessionId takes no arguments and passes no pointers.
|
||||
let now = unsafe { WTSGetActiveConsoleSessionId() };
|
||||
if now != session {
|
||||
tracing::info!(
|
||||
@@ -355,6 +384,8 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
new = now,
|
||||
"console session changed — relaunching host"
|
||||
);
|
||||
// SAFETY: `proc_h` copies the still-live `child.process` OwnedHandle (dropped only at
|
||||
// end of iteration), so the handle is open; TerminateProcess only signals by handle.
|
||||
unsafe {
|
||||
let _ = TerminateProcess(proc_h, 0);
|
||||
}
|
||||
@@ -363,6 +394,8 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
}
|
||||
// Same session (e.g. a stray notification) — keep waiting on the same child.
|
||||
let r = wait_any(&[stop, proc_h], INFINITE);
|
||||
// SAFETY: `proc_h` copies the still-live `child.process` OwnedHandle (dropped only at end
|
||||
// of iteration), so the handle is open; TerminateProcess only signals by handle.
|
||||
unsafe {
|
||||
let _ = TerminateProcess(proc_h, 0);
|
||||
}
|
||||
@@ -394,11 +427,17 @@ fn supervise(stop: HANDLE, session_ev: HANDLE) -> Result<()> {
|
||||
|
||||
/// `true` if `h` is signalled within `ms`.
|
||||
fn wait_one(h: HANDLE, ms: u32) -> bool {
|
||||
// SAFETY: `&[h]` is a live one-element HANDLE slice the caller keeps open across the wait; the kernel
|
||||
// reads exactly one handle (the binding derives the count from the slice length), bWaitAll=false,
|
||||
// `ms` is a timeout — no pointers escape and the array is only read for this synchronous call.
|
||||
unsafe { WaitForMultipleObjects(&[h], false, ms) == WAIT_OBJECT_0 }
|
||||
}
|
||||
|
||||
/// Wait on several handles; returns the index of the first signalled, or `None` on timeout.
|
||||
fn wait_any(handles: &[HANDLE], ms: u32) -> Option<usize> {
|
||||
// SAFETY: `handles` is a live slice the caller keeps open across the wait; WaitForMultipleObjects
|
||||
// reads exactly `handles.len()` handles (the binding derives the count from the slice), bWaitAll=false,
|
||||
// `ms` is a timeout — the array is only read for this synchronous call and no pointers escape it.
|
||||
let r = unsafe { WaitForMultipleObjects(handles, false, ms) };
|
||||
let idx = r.0.wrapping_sub(WAIT_OBJECT_0.0);
|
||||
(idx < handles.len() as u32).then_some(idx as usize)
|
||||
|
||||
@@ -1,5 +1,5 @@
|
||||
//! USER-session WGC helper (Windows) — part of the two-process secure-desktop design
|
||||
//! (docs/windows-secure-desktop.md).
|
||||
//! (design/windows-secure-desktop.md).
|
||||
//!
|
||||
//! WGC won't activate under the SYSTEM account, but the host must run as SYSTEM for the secure
|
||||
//! desktop. So the SYSTEM host spawns THIS helper in the interactive user session
|
||||
@@ -12,6 +12,9 @@
|
||||
//!
|
||||
//! Wire framing on stdout, per AU: `[u32 len LE][u64 pts_ns LE][u8 keyframe][len bytes data]`.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use crate::capture::{dxgi::WinCaptureTarget, wgc::WgcCapturer, Capturer};
|
||||
use crate::encode::{self, Codec};
|
||||
use anyhow::{Context, Result};
|
||||
@@ -72,6 +75,9 @@ pub fn run(opts: HelperOptions) -> Result<()> {
|
||||
.name("pf-present-trigger".into())
|
||||
.spawn(move || {
|
||||
tracing::info!("present-trigger: starting D3D present loop on the virtual display");
|
||||
// SAFETY: `present_trigger` is unsafe only for its Win32/D3D11 FFI; it has no caller
|
||||
// preconditions (it creates and exclusively owns its own window, device, and swapchain on
|
||||
// this dedicated thread), so the call is sound.
|
||||
if let Err(e) = unsafe { present_trigger(w, h) } {
|
||||
tracing::warn!("present-trigger error: {e:#}");
|
||||
}
|
||||
|
||||
@@ -8,6 +8,9 @@
|
||||
//! them, which let the SudoVDA backend be dropped without losing them (audit §9 / Goal 2 — done). The
|
||||
//! plan's `windows/display_ccd.rs`. Extracted verbatim from the former SudoVDA backend before its removal.
|
||||
|
||||
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it (unsafe-proof program).
|
||||
#![deny(clippy::undocumented_unsafe_blocks)]
|
||||
|
||||
use std::mem::size_of;
|
||||
|
||||
use windows::core::PCWSTR;
|
||||
@@ -16,9 +19,9 @@ use windows::Win32::Devices::Display::{
|
||||
QueryDisplayConfig, SetDisplayConfig, DISPLAYCONFIG_DEVICE_INFO_GET_ADVANCED_COLOR_INFO,
|
||||
DISPLAYCONFIG_DEVICE_INFO_GET_SOURCE_NAME, DISPLAYCONFIG_DEVICE_INFO_SET_ADVANCED_COLOR_STATE,
|
||||
DISPLAYCONFIG_GET_ADVANCED_COLOR_INFO, DISPLAYCONFIG_MODE_INFO, DISPLAYCONFIG_PATH_INFO,
|
||||
DISPLAYCONFIG_SET_ADVANCED_COLOR_STATE, DISPLAYCONFIG_SOURCE_DEVICE_NAME, QDC_ONLY_ACTIVE_PATHS,
|
||||
SDC_ALLOW_CHANGES, SDC_APPLY, SDC_FORCE_MODE_ENUMERATION, SDC_SAVE_TO_DATABASE,
|
||||
SDC_USE_SUPPLIED_DISPLAY_CONFIG,
|
||||
DISPLAYCONFIG_SET_ADVANCED_COLOR_STATE, DISPLAYCONFIG_SOURCE_DEVICE_NAME,
|
||||
QDC_ONLY_ACTIVE_PATHS, SDC_ALLOW_CHANGES, SDC_APPLY, SDC_FORCE_MODE_ENUMERATION,
|
||||
SDC_SAVE_TO_DATABASE, SDC_USE_SUPPLIED_DISPLAY_CONFIG,
|
||||
};
|
||||
use windows::Win32::Graphics::Gdi::{
|
||||
ChangeDisplaySettingsExW, EnumDisplaySettingsW, CDS_TEST, CDS_UPDATEREGISTRY, DEVMODEW,
|
||||
@@ -202,6 +205,10 @@ pub(crate) fn set_active_mode(gdi_name: &str, mode: Mode) {
|
||||
dmSize: size_of::<DEVMODEW>() as u16,
|
||||
..Default::default()
|
||||
};
|
||||
// SAFETY: `wname` is a live NUL-terminated UTF-16 device name (built above) whose pointer stays
|
||||
// valid for the call; `&mut dm` is a live DEVMODEW with `dmSize` set that EnumDisplaySettingsW
|
||||
// fills in for mode index `i`. Both outlive this synchronous call; the API only reads the name
|
||||
// and writes `dm`, so nothing aliases.
|
||||
let ok = unsafe {
|
||||
EnumDisplaySettingsW(
|
||||
PCWSTR(wname.as_ptr()),
|
||||
@@ -269,6 +276,9 @@ pub(crate) fn set_active_mode(gdi_name: &str, mode: Mode) {
|
||||
dmDisplayFrequency: chosen_hz,
|
||||
..Default::default()
|
||||
};
|
||||
// SAFETY: `wname` is a live NUL-terminated UTF-16 device name and `&dm` is a live DEVMODEW describing
|
||||
// the requested mode; both outlive the call. CDS_TEST only validates the mode (no apply), the two
|
||||
// trailing args are null, and the API only reads its inputs.
|
||||
let test = unsafe {
|
||||
ChangeDisplaySettingsExW(PCWSTR(wname.as_ptr()), Some(&dm), None, CDS_TEST, None)
|
||||
};
|
||||
@@ -282,6 +292,9 @@ pub(crate) fn set_active_mode(gdi_name: &str, mode: Mode) {
|
||||
);
|
||||
return;
|
||||
}
|
||||
// SAFETY: same inputs as the CDS_TEST call above — `wname` (live NUL-terminated device name) and
|
||||
// `&dm` (live DEVMODEW) both outlive the call; CDS_UPDATEREGISTRY applies the already-validated mode,
|
||||
// and the API only reads its inputs.
|
||||
let apply = unsafe {
|
||||
ChangeDisplaySettingsExW(
|
||||
PCWSTR(wname.as_ptr()),
|
||||
|
||||
@@ -43,7 +43,7 @@ Apollo is host-only. A stream flows: **nvhttp** (HTTPS pairing + serverinfo/appl
|
||||
| Apollo — Audio capture, encode, transport (Windows host) | `audio.cpp`; `audio.h`; `audio.cpp`; `common.h`; `stream.cpp` | `audio.rs`; `audio/wasapi_cap.rs`; `audio/linux.rs`; `gamestream/audio.rs`; `punktfunk1.rs` |
|
||||
| Apollo (Sunshine fork) — Input handling & injection | `input.cpp`; `input.cpp`; `keylayout.h`; `misc.cpp` | — |
|
||||
| Apollo: App/process launch & display configuration (Windows host) | `process.cpp`; `display_device.cpp`; `process.h`; `virtual_display.h`; `misc.cpp`; `utils.cpp` | `vdisplay/sudovda.rs`; `vdisplay.rs`; `gamestream/apps.rs`; `library.rs`; `punktfunk1.rs`; `capture/wgc_relay.rs` |
|
||||
| Apollo: Config, management/web UI, system tray | `config.h`; `config.cpp`; `confighttp.cpp`; `confighttp.h`; `system_tray.cpp`; `system_tray.h` | `mgmt.rs`; `mgmt_token.rs`; `main.rs`; `native_pairing.rs`; `library.rs`; `docs/windows-host.md` |
|
||||
| Apollo: Config, management/web UI, system tray | `config.h`; `config.cpp`; `confighttp.cpp`; `confighttp.h`; `system_tray.cpp`; `system_tray.h` | `mgmt.rs`; `mgmt_token.rs`; `main.rs`; `native_pairing.rs`; `library.rs`; `design/windows-host.md` |
|
||||
|
||||
### Apollo — Protocol & streaming (RTP/FEC/ENet/RTSP/crypto)
|
||||
|
||||
@@ -354,7 +354,7 @@ The `formats[]` table (258-277) maps 2/6/8 channels to Stereo/5.1/7.1 with the G
|
||||
- **Decouple ingest from injection via task-pool queue with lock-then-release batching** — The control-stream thread only enqueues bytes and schedules a task (src/input.cpp:1639-1643). A pool thread pops one packet, coalesces later same-type packets into it while holding the queue lock, then RELEASES the lock before the (potentially slow) SendInput/ViGEm call (src/input.cpp:1486-1520). — _For a low-latency streaming host this is the core anti-head-of-line-blocking pattern: a slow OS input call (e.g. SendInput crossing a desktop switch) never stalls the network/control thread, and bursts of mouse/scroll/controller packets collapse to one OS event per drain. punktfunk should mirror this: never call SendInput on the QUIC/control thread._
|
||||
- **Type-aware packet batching with batch_result_e (batched / not_batchable / terminate_batch)** — batch() overloads (src/input.cpp:1208-1475) sum relative-mouse deltas and scroll amounts (with __builtin_add_overflow guards that terminate the batch on 16-bit overflow), take the latest absolute position, and collapse controller/touch/pen move/hover runs. terminate_batch stops at a state-changing event (button change, eventType change, active-mask change) so ordering semantics are preserved; not_batchable skips a non-matching controller but keeps scanning. — _Moonlight 'spams controller packets even when not necessary' (src/input.cpp:282). Batching cuts injected-event count under load without dropping state transitions — directly reduces input-to-screen jitter and OS overhead._
|
||||
- **VK→scancode injection with normalization fallback ladder** — keyboard_update (src/platform/windows/input.cpp:608) prefers KEYEVENTF_SCANCODE using the static US-English VK_TO_SCANCODE_MAP (keylayout.h). If the client flagged the VK as non-normalized (SS_KBE_FLAG_NON_NORMALIZED) it falls back to MapVirtualKey under config::input.always_send_scancodes (excluding VK_LWIN/RWIN/PAUSE which misbehave), else sends a raw VK event. A curated switch adds KEYEVENTF_EXTENDEDKEY for the extended-key set (arrows, nav cluster, RWIN/RMENU/RCONTROL, numpad divide, apps). — _Many games read DirectInput/raw scancodes, not VK events; sending scancodes is essential for in-game key compatibility. The extended-key flag is required or arrow keys / right-modifiers misfire. This is a concrete table+logic punktfunk's Windows VK path can adopt verbatim._
|
||||
- **Desktop-switch retry on every SendInput / InjectSyntheticPointerInput** — send_input (src/platform/windows/input.cpp:477) and inject_synthetic_pointer_input (line 499) retry once after calling syncThreadDesktop() (misc.cpp:251 — OpenInputDesktop(DF_ALLOWOTHERACCOUNTHOOK)+SetThreadDesktop) when the call fails and the input desktop handle changed, tracked in a thread_local _lastKnownInputDesktop. — _On Windows the input desktop changes on UAC prompts, lock screen, and Ctrl+Alt+Del (secure desktop / Winlogon). Without re-binding the thread to the new desktop, all injected input silently fails. This is exactly the secure-desktop problem area called out in punktfunk's docs/memory — Apollo solves it cheaply per-call rather than with a second process._
|
||||
- **Desktop-switch retry on every SendInput / InjectSyntheticPointerInput** — send_input (src/platform/windows/input.cpp:477) and inject_synthetic_pointer_input (line 499) retry once after calling syncThreadDesktop() (misc.cpp:251 — OpenInputDesktop(DF_ALLOWOTHERACCOUNTHOOK)+SetThreadDesktop) when the call fails and the input desktop handle changed, tracked in a thread_local _lastKnownInputDesktop. — _On Windows the input desktop changes on UAC prompts, lock screen, and Ctrl+Alt+Del (secure desktop / Winlogon). Without re-binding the thread to the new desktop, all injected input silently fails. This is exactly the secure-desktop problem area called out in punktfunk's design/memory — Apollo solves it cheaply per-call rather than with a second process._
|
||||
- **ViGEm dual-target gamepad with client-negotiated type selection** — alloc_gamepad (src/platform/windows/input.cpp:1175) picks X360 vs DS4 by precedence: explicit config (x360/ds4) > client-reported LI_CTYPE_PS/XBOX > motion_as_ds4 if accel/gyro present > touchpad_as_ds4 > default X360. It warns when capabilities (motion/touchpad/RGB) will be lost on X360. DS4 path packs motion, touchpad, and battery into DS4_REPORT_EX. — _DS4 is the only ViGEm target that carries gyro/accel, touchpad, and lightbar; X360 is the safe default. punktfunk already does client-negotiated pad type — Apollo's capability-driven auto-selection (motion/touchpad presence → DS4) and the explicit 'feature will be lost' warnings are a more refined policy worth porting._
|
||||
- **DS4 timestamped resend loop (ds4_update_ts_and_send)** — Every DS4 report advances wTimestamp by elapsed time in 5.333µs units and re-arms a 100ms repeat_task (src/platform/windows/input.cpp:1454-1481), so the 16-bit timestamp never stalls/overflows even when no new input arrives. — _'Some applications require updated timestamp values to register DS4 input' (line 1450). Without the heartbeat, motion-aware games ignore a held DS4. Non-obvious gotcha that any DS4-emulating host must replicate._
|
||||
- **Synthetic pen/touch via InjectSyntheticPointerInput with periodic refresh and slot compaction** — Per-client synthetic pointer devices (CreateSyntheticPointerDevice, Win10 1809+). Touch slots are kept contiguous via perform_touch_compaction (line 715, required by the API), edge-triggered flags (DOWN/UP/CANCELED/UPDATE) are cleared after each frame (line 900/1020), and a 50ms repeat task (ISPI_REPEAT_INTERVAL) re-injects held state because Windows auto-cancels untouched interactions after ~1s. — _Touch/pen are stateful, slot-indexed, and self-cancelling — a fundamentally different injection model than mouse/keyboard. If punktfunk grows touch/pen, this is the reference for the Windows-specific contiguity + refresh requirements._
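The DS4 timestamp heartbeat in the list above comes down to a small piece of wrapping arithmetic. The following is a minimal sketch, not Apollo's code: the function name `advance_ds4_timestamp` and the constant `DS4_TICK_NS` are hypothetical, and it assumes the report field is a free-running 16-bit counter in 5.333 µs ticks, as the bullet describes. It drops any sub-tick remainder, which the real implementation may or may not carry forward.

```rust
use std::time::Duration;

/// Assumed DS4 tick length in nanoseconds (5.333 µs per tick).
const DS4_TICK_NS: u128 = 5_333;

/// Advance a 16-bit DS4 report timestamp by `elapsed`.
///
/// The counter is free-running, so the addition wraps instead of saturating.
/// Hypothetical helper for illustration only.
fn advance_ds4_timestamp(current: u16, elapsed: Duration) -> u16 {
    let ticks = elapsed.as_nanos() / DS4_TICK_NS;
    // Truncating to u16 is intentional: only the low 16 bits are reported.
    current.wrapping_add(ticks as u16)
}
```

With a 100 ms repeat interval each resend moves the counter by roughly 18,751 ticks, well under one full 16-bit wrap, so a consumer that looks at deltas still sees an unambiguous step.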
|
||||
@@ -479,7 +479,7 @@ A single static `struct tray` (l.112) holds icon path, tooltip, a fixed menu arr
|
||||
- **Per-vendor encoder enum string translators** — Whole namespaces (nv/amd/qsv/vt/sw, config.cpp l.53-357) map human strings ('ultralowlatency','cqp','superfast') to encoder SDK integer constants, with low-latency presets as the DEFAULTS (e.g. amd usage = ultralowlatency l.469-471, sw preset 'superfast'/'zerolatency' l.451-453, nvenc realtime HAGS + high-power mode on by default l.457-459). — _Defaults are explicitly tuned for latency, not quality — the encoder is configured ultra-low-latency out of the box. A low-latency host's config defaults should bias the same way; this is the concrete table punktfunk can port for AMD/QSV/VT vendor parity._
|
||||
- **Embedded HTTPS server sharing the host TLS identity** — confighttp uses SimpleWeb::Server<HTTPS> seeded with nvhttp.cert/pkey (confighttp.cpp l.1511) — the SAME cert the Moonlight/GameStream pairing uses — on a fixed port offset (PORT_HTTPS=1 → base+1). — _One identity, one cert, management UI and stream control on adjacent ports. punktfunk already shares its cert.pem between GameStream pairing and punktfunk/1; the lesson is the web console can reuse it rather than carrying a separate mgmt TLS story._
|
||||
- **Single-string session cookie with salted-hash validation** — authenticate() (l.179) validates hex(hash(cookie + salt)) against an in-memory sessionCookie with a 15-day steady_clock expiry; login (l.1469) rand_alphabet(64) the raw cookie and stores only its hash. checkIPOrigin gates by pc/lan/wan BEFORE auth. — _Contrast with punktfunk's mgmt API (bearer token in ~/.config/punktfunk/mgmt-token + web login gate). Apollo's cookie+IP-origin model is simpler for a desktop single-operator host and avoids a static long-lived token; worth considering for the web console's UX._
|
||||
- **Windows service↔UI self-elevation handshake** — config::parse (l.1490-1534): a non-admin Start-Menu shortcut self-relaunches as admin (ShellExecuteExW 'runas' --shortcut-admin l.1511), starts the service, wait_for_ui_ready() polls the Win32 TCP table for the LISTEN socket (entry_handler.cpp l.236), then launch_ui(), and returns 1 so the shortcut process never starts a stream. — _This is the mature answer to the exact problem punktfunk's Windows host hit (docs/windows-host.md 'secure-desktop two-process design', Session-0 vs interactive session). Apollo solves UI-launch-from-service cleanly; the TCP-table readiness poll is directly portable._
|
||||
- **Windows service↔UI self-elevation handshake** — config::parse (l.1490-1534): a non-admin Start-Menu shortcut self-relaunches as admin (ShellExecuteExW 'runas' --shortcut-admin l.1511), starts the service, wait_for_ui_ready() polls the Win32 TCP table for the LISTEN socket (entry_handler.cpp l.236), then launch_ui(), and returns 1 so the shortcut process never starts a stream. — _This is the mature answer to the exact problem punktfunk's Windows host hit (design/windows-host.md 'secure-desktop two-process design', Session-0 vs interactive session). Apollo solves UI-launch-from-service cleanly; the TCP-table readiness poll is directly portable._
|
||||
- **Tray thread DACL hardening for SYSTEM-context survival** — init_tray() (l.143-197) adds an EXPLICIT_ACCESS ACE granting SYNCHRONIZE to Everyone on the current thread handle before registering the icon, and busy-waits for GetShellWindow() (l.201) so the icon registers reliably across logoff/logon. — _When the host runs as a Windows service (SYSTEM), Explorer can't open the thread to detect termination → ghost tray icons forever. punktfunk's Windows host, if it ever runs as a service with a tray, needs this exact DACL fix._
|
||||
- **JSON-list config values parsed via ptree wrapping** — Multi-line bracketed values (global_prep_cmd, server_cmd, dd_mode_remapping) are extracted as raw strings by the flat parser, wrapped in a synthetic JSON object, then parsed by boost ptree (list_prep_cmd_f l.949, mode_remapping_from_view l.411). — _A pragmatic hybrid: flat key=value for the human-editable 90%, embedded JSON for structured fields, without committing to full-JSON config. Shows how to grow a flat config without a rewrite._
|
||||
|
||||
@@ -680,7 +680,7 @@ Both transports use the persistent `AudioCapSlot` (gamestream/audio.rs:251-257)
|
||||
|
||||
### Input handling & injection — 🔴 Apollo ahead
|
||||
|
||||
For the Windows host specifically, Apollo is ahead on input breadth and robustness. Apollo covers mouse (rel+abs), keyboard (with a static US-layout VK→scancode table for game compatibility), Unicode text, scroll, **touch + pen via CreateSyntheticPointerDevice**, and **both X360 and DS4** gamepads with rumble/LED/motion/touchpad/battery feedback (Apollo src/platform/windows/input.cpp). punktfunk's Windows host covers mouse/keyboard/scroll/X360-only; touch and pen are explicit no-ops (sendinput.rs:231-237), there is no Unicode text path (gamestream/input.rs:83-84), and only the Xbox 360 virtual pad exists on Windows. Apollo also has the more efficient secure-desktop model (retry-only) vs punktfunk's per-event reattach (sendinput.rs:97), and Apollo's task-pool queue + type-aware batching (Apollo src/input.cpp:1481-1571, 1208-1475) coalesces input spam off the network thread — punktfunk's GameStream path injects inline on the ENet thread (control.rs:207-211) with no batching anywhere. punktfunk's design is cleaner and its m3 path's session-end held-key release + backend-follow logic is genuinely nicer than Apollo, but those are punktfunk/1-specific; on the shared Windows-host injection surface Apollo is the more complete, battle-tested implementation. punktfunk's docs/windows-secure-desktop.md already flags the retry-only refactor as planned-but-unshipped, confirming the gap.
|
||||
For the Windows host specifically, Apollo is ahead on input breadth and robustness. Apollo covers mouse (rel+abs), keyboard (with a static US-layout VK→scancode table for game compatibility), Unicode text, scroll, **touch + pen via CreateSyntheticPointerDevice**, and **both X360 and DS4** gamepads with rumble/LED/motion/touchpad/battery feedback (Apollo src/platform/windows/input.cpp). punktfunk's Windows host covers mouse/keyboard/scroll/X360-only; touch and pen are explicit no-ops (sendinput.rs:231-237), there is no Unicode text path (gamestream/input.rs:83-84), and only the Xbox 360 virtual pad exists on Windows. Apollo also has the more efficient secure-desktop model (retry-only) vs punktfunk's per-event reattach (sendinput.rs:97), and Apollo's task-pool queue + type-aware batching (Apollo src/input.cpp:1481-1571, 1208-1475) coalesces input spam off the network thread — punktfunk's GameStream path injects inline on the ENet thread (control.rs:207-211) with no batching anywhere. punktfunk's design is cleaner and its m3 path's session-end held-key release + backend-follow logic is genuinely nicer than Apollo, but those are punktfunk/1-specific; on the shared Windows-host injection surface Apollo is the more complete, battle-tested implementation. punktfunk's design/windows-secure-desktop.md already flags the retry-only refactor as planned-but-unshipped, confirming the gap.
|
||||
|
||||
|
||||
**How punktfunk does it.**
|
||||
@@ -748,7 +748,7 @@ For the Windows host specifically, Apollo is clearly ahead on this subsystem. Ap
|
||||
- punktfunk has TWO app surfaces by design: the GameStream apps.json catalog (Moonlight compat) AND a richer punktfunk/1 library (Steam local scan + custom store + CDN art + uniform GameEntry grid). Apollo has only the apps.json catalog because it ships no client.
|
||||
- punktfunk's launch security model is deliberately client-can't-inject: the client sends only a store-qualified id and the host resolves it against its OWN library (library.rs:394-412), with steam appid validated digits-only. Apollo trusts its own apps.json cmds (it has no untrusted remote launch id).
|
||||
- punktfunk keeps NO async on the per-frame path; the SudoVDA watchdog pinger and capture are native threads. Apollo's libdisplaydevice RetryScheduler is its own machinery; punktfunk has no equivalent scheduler by choice (yet — see candidate improvements).
|
||||
- punktfunk's Windows virtual display is the SOLE primary output (isolate_displays + CDS_SET_PRIMARY) specifically to capture the secure/Winlogon desktop — a deliberate, documented design (docs/windows-secure-desktop.md) that goes beyond what stock Apollo needs.
|
||||
- punktfunk's Windows virtual display is the SOLE primary output (isolate_displays + CDS_SET_PRIMARY) specifically to capture the secure/Winlogon desktop — a deliberate, documented design (design/windows-secure-desktop.md) that goes beyond what stock Apollo needs.
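The client-can't-inject launch model from the list above can be sketched as a lookup that never executes client-supplied text. This is an illustration under assumptions: the `store:key` id format, the function name `resolve_launch`, and the map-based library are all hypothetical and are not taken from `library.rs`.

```rust
use std::collections::HashMap;

/// Resolve a client-supplied, store-qualified id against the host's own library.
///
/// Hypothetical sketch: the client only names an entry; the command that runs
/// always comes from the host-side map, never from the request itself.
fn resolve_launch<'a>(id: &str, library: &'a HashMap<String, String>) -> Option<&'a str> {
    // Assumed id shape: "<store>:<key>". Anything else is rejected.
    let (store, key) = id.split_once(':')?;
    // Steam app ids must be non-empty and digits-only.
    if store == "steam" && (key.is_empty() || !key.bytes().all(|b| b.is_ascii_digit())) {
        return None;
    }
    // Unknown ids resolve to nothing rather than falling back to the raw string.
    library.get(id).map(String::as_str)
}
```

Because the return value borrows from the library, a caller cannot accidentally pass the request string on to a process launcher.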
|
||||
|
||||
**Transfer candidates from Apollo (6):** _Actually launch the app/game on Windows (CreateProcessAsUserW into the user session)_, _Display-config apply/revert with a retry scheduler and guaranteed revert on disconnect_, _Set HDR on the virtual display and advertise IsHdrSupported when the client requests it_, _Per-(app,client) stable virtual-display GUID instead of one fixed MONITOR_GUID_, _Inject per-app launch env (client res/fps/HDR/audio + status) for launch scripts_, _auto_detach heuristic for launcher-style apps (Steam/UWP) that exit immediately_ — see Part 4.
|
||||
|
||||
@@ -765,7 +765,7 @@ On the API itself punktfunk is arguably ahead (versioned `/api/v1`, compile-time
|
||||
punktfunk splits the control surface into three pieces and deliberately keeps them OUT of the host binary where Apollo bundles them in.
|
||||
|
||||
##### 1. Management plane = a versioned REST API only (`crates/punktfunk-host/src/mgmt.rs`)
|
||||
- An axum `Router` (`mgmt.rs:166` `fn app`) under `/api/v1`, single source of truth shared between the live server and the `openapi` subcommand (`mgmt.rs:195` `api_router_parts`, `main.rs:86`). The OpenAPI 3.1 doc is generated at compile time with `utoipa` and a checked-in copy is drift-tested against `docs/api/openapi.json` (`mgmt.rs:1582` `openapi_document_is_complete_and_checked_in`). This is a real maturity advantage over Apollo, which has no machine-readable API spec.
|
||||
- An axum `Router` (`mgmt.rs:166` `fn app`) under `/api/v1`, single source of truth shared between the live server and the `openapi` subcommand (`mgmt.rs:195` `api_router_parts`, `main.rs:86`). The OpenAPI 3.1 doc is generated at compile time with `utoipa` and a checked-in copy is drift-tested against `api/openapi.json` (`mgmt.rs:1582` `openapi_document_is_complete_and_checked_in`). This is a real maturity advantage over Apollo, which has no machine-readable API spec.
|
||||
- Routes: host info/capabilities/port map (`mgmt.rs:590`), live status (`mgmt.rs:671`), paired GameStream clients list/unpair (`mgmt.rs:707`,`752`), the GameStream PIN flow (`mgmt.rs:789`,`814`), the native punktfunk/1 pairing surface — arm/disarm/status/list/unpair (`mgmt.rs:870`-`994`), **delegated pairing approval** via a pending-device queue (`mgmt.rs:1011`,`1049`,`1094`), session stop + force-IDR (`mgmt.rs:1120`,`1144`), and game-library CRUD (`mgmt.rs:1171`-`1252`).
|
||||
- **HTTPS always, even on loopback** (`mgmt.rs:75` `run`): it runs the rustls handshake itself via tokio-rustls so it can surface the verified peer cert to handlers (`mgmt.rs:115` `serve_https`), reusing the host's persistent identity cert that clients already pin (`mgmt.rs:90`).
|
||||
- **Dual auth** (`mgmt.rs:518` `require_auth`): a paired native client authenticates by its **mTLS certificate fingerprint** (matched against the native paired store, no token needed); everyone else (the web console / admin) uses a bearer token compared in constant time (`mgmt.rs:551` `token_eq` via SHA-256 digest compare). `/api/v1/health` is the only unauthenticated route. This is stronger than Apollo's single-global-session-cookie scheme (Apollo `confighttp.cpp` has exactly one `std::string sessionCookie`).
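The constant-time token comparison mentioned in the dual-auth bullet can be sketched as an XOR fold over two fixed-length digests. This is a sketch under assumptions, not the contents of `token_eq`: it assumes both the presented and the stored token have already been hashed to 32-byte SHA-256 digests elsewhere (the standard library has no SHA-256), and the name `digest_eq` is hypothetical.

```rust
/// Compare two 32-byte digests without an early exit.
///
/// Hypothetical helper: hashing both tokens first gives equal-length inputs,
/// so neither the token length nor the position of the first mismatch is
/// revealed by how long the comparison takes.
fn digest_eq(a: &[u8; 32], b: &[u8; 32]) -> bool {
    let mut diff = 0u8;
    for i in 0..32 {
        // Accumulate every differing bit; never branch on intermediate results.
        diff |= a[i] ^ b[i];
    }
    diff == 0
}
```

An optimizing compiler is free to restructure a loop like this, so production code usually leans on a vetted constant-time crate such as `subtle` instead of a hand-rolled fold.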
|
||||
@@ -784,7 +784,7 @@ A token always exists with zero operator steps: env `PUNKTFUNK_MGMT_TOKEN` wins,
|
||||
There is no system tray, no balloon notifications, and no "open the UI in the browser" entry point anywhere in `crates/punktfunk-host`. Apollo has a full cross-platform tray (`system_tray.cpp`) with state-driven icon/notification updates and menu callbacks.
|
||||
|
||||
##### 6. Windows launch story = scripts, not in-binary
|
||||
The two-process secure-desktop design exists for *capture* (`main.rs:204` `wgc-helper` subcommand + `capture/wgc_relay.rs` `CreateProcessAsUserW`), but the service/desktop launch dance is handled by external scripts (scheduled task -> PsExec64 -> launch.vbs -> host-run.cmd; `docs/windows-host.md:77-96`). punktfunk has no in-binary service install, no self-elevation, no "launch UI in browser", and no tray — all of which Apollo bakes into `config.cpp`/`entry_handler.cpp`/`system_tray.cpp`.
|
||||
The two-process secure-desktop design exists for *capture* (`main.rs:204` `wgc-helper` subcommand + `capture/wgc_relay.rs` `CreateProcessAsUserW`), but the service/desktop launch dance is handled by external scripts (scheduled task -> PsExec64 -> launch.vbs -> host-run.cmd; `design/windows-host.md:77-96`). punktfunk has no in-binary service install, no self-elevation, no "launch UI in browser", and no tray — all of which Apollo bakes into `config.cpp`/`entry_handler.cpp`/`system_tray.cpp`.
|
||||
|
||||
|
||||
**Intentional divergences (by design, not gaps):**
|
||||
@@ -1555,14 +1555,14 @@ punktfunk's **secure-desktop / desktop-switch capture recovery is genuinely matu
|
||||
|
||||
##### Where punktfunk is weaker / missing / fragile
|
||||
|
||||
1. **No real Windows service — relies on a PsExec scheduled task.** The launch chain is a scheduled task → `PsExec64 -s -i 1` → `wscript.exe launch.vbs` → hidden `host-run.cmd` (`docs/windows-host.md:78-84`). There is **no `SERVICE_CONTROL_SESSIONCHANGE` relaunch** — the doc even lists it as unimplemented "step 6" (`docs/windows-secure-desktop.md:89`). PsExec is a 3rd-party SysInternals tool, not redistributable cleanly, and `-s -i 1` hard-codes session 1. None of the launch scripts (`launch.vbs`, `host-run.cmd`) are checked into the repo (only `scripts/headless/win-build.cmd` exists). This is the single biggest fragility vs Apollo's `sunshinesvc.cpp`.
|
||||
1. **No real Windows service — relies on a PsExec scheduled task.** The launch chain is a scheduled task → `PsExec64 -s -i 1` → `wscript.exe launch.vbs` → hidden `host-run.cmd` (`design/windows-host.md:78-84`). There is **no `SERVICE_CONTROL_SESSIONCHANGE` relaunch** — the doc even lists it as unimplemented "step 6" (`design/windows-secure-desktop.md:89`). PsExec is a 3rd-party SysInternals tool, not redistributable cleanly, and `-s -i 1` hard-codes session 1. None of the launch scripts (`launch.vbs`, `host-run.cmd`) are checked into the repo (only `scripts/headless/win-build.cmd` exists). This is the single biggest fragility vs Apollo's `sunshinesvc.cpp`.
|
||||
2. **No nvprefs / NvAPI at all.** `grep` for `nvprefs|NvAPI|DRS_|PREFERRED_PSTATE|DXPRESENT` across the host returns nothing. No PREFERRED_PSTATE_MAX for the encoder, no OGL_CPL_PREFER_DXPRESENT (so GL/Vulkan fullscreen apps may not be capturable via WGC/DDA), and no undo-file crash safety.
|
||||
3. **No DXGI GPU-preference / output-reparenting hook.** No MinHook of `NtGdiDdDDIGetCachedHybridQueryValue`. On a hybrid/Optimus box DXGI can reparent the SudoVDA output onto the render GPU and break DDA. punktfunk's "search all adapters" partly papers over this but does not prevent the reparenting itself.
|
||||
4. **mDNS uses the cross-platform `mdns-sd` crate, not Windows-native `DnsServiceRegister`** (`discovery.rs:17`). It works, but it does NOT carry Apollo's RFC-1035 empty-TXT fix — and the GameStream/Moonlight mDNS path on Windows is unverified (`docs/windows-host.md:46`). A non-RFC-compliant TXT can be rejected by Apple's resolver.
|
||||
4. **mDNS uses the cross-platform `mdns-sd` crate, not Windows-native `DnsServiceRegister`** (`discovery.rs:17`). It works, but it does NOT carry Apollo's RFC-1035 empty-TXT fix — and the GameStream/Moonlight mDNS path on Windows is unverified (`design/windows-host.md:46`). A non-RFC-compliant TXT can be rejected by Apple's resolver.
|
||||
5. **No stream-start system tuning.** No `NtSetTimerResolution`/`timeBeginPeriod`, no `DwmEnableMMCSS`, no `SetPriorityClass(HIGH_PRIORITY_CLASS)`, no `SetThreadExecutionState(ES_DISPLAY_REQUIRED)`, no WLAN media-streaming mode, no Mouse-Keys-on-headless trick. (Linux has none of this either, but on Windows these are real latency/jitter levers Apollo proves out.)
|
||||
6. **No `factory->IsCurrent()` per-frame check.** punktfunk reacts to errors from `AcquireNextFrame` but does not proactively detect HDR/topology changes the way Apollo does each frame (`display_base.cpp:235`) — it relies on ACCESS_LOST firing, which it usually does, but IsCurrent is the cleaner signal.
|
||||
7. **No `is_user_session_locked()` / CCD pre-flight.** Before a mode-set or isolation, Apollo checks `WTSQuerySessionInformationW` + `SetDisplayConfig(SDC_VALIDATE)` (`utils.cpp:184-237`); punktfunk just attempts and handles failure, which can thrash the display during a lock.
|
||||
8. **Clock epoch is `SystemTime::now()` (`dxgi.rs:1530`), not `GetSystemTimePreciseAsFileTime`.** The doc itself flags this as a cross-machine-latency risk (`docs/windows-host.md:284-286`); std SystemTime on Windows historically has coarser (~1–15 ms) resolution than the precise FILETIME API, which can corrupt the ClockProbe/ClockEcho skew handshake.
|
||||
8. **Clock epoch is `SystemTime::now()` (`dxgi.rs:1530`), not `GetSystemTimePreciseAsFileTime`.** The doc itself flags this as a cross-machine-latency risk (`design/windows-host.md:284-286`); std SystemTime on Windows historically has coarser (~1–15 ms) resolution than the precise FILETIME API, which can corrupt the ClockProbe/ClockEcho skew handshake.
|
||||
|
||||
|
||||
#### Transfer opportunities
|
||||
@@ -1772,7 +1772,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
|
||||
*Area:* `cmp:input` · *Windows-host:* yes · *Severity:* high · *Effort:* small
|
||||
|
||||
- **Apollo does:** send_input() / inject_synthetic_pointer_input() call SendInput FIRST, and only on failure (0 injected) re-run syncThreadDesktop() (OpenInputDesktop(DF_ALLOWOTHERACCOUNTHOOK)+SetThreadDesktop) and retry once, tracking the desktop in a thread_local _lastKnownInputDesktop — src/platform/windows/input.cpp:477,499 + src/platform/windows/misc.cpp:251
|
||||
- **punktfunk gap:** SendInputInjector::inject() calls reattach_input_desktop() (an OpenInputDesktop+SetThreadDesktop+CloseDesktop) at the TOP of EVERY event — crates/punktfunk-host/src/inject/sendinput.rs:97,50-69. This is a syscall triple per mouse-move; punktfunk's own docs/windows-secure-desktop.md:78-80 lists this exact refactor (step 2) as planned but unshipped.
|
||||
- **punktfunk gap:** SendInputInjector::inject() calls reattach_input_desktop() (an OpenInputDesktop+SetThreadDesktop+CloseDesktop) at the TOP of EVERY event — crates/punktfunk-host/src/inject/sendinput.rs:97,50-69. This is a syscall triple per mouse-move; punktfunk's own design/windows-secure-desktop.md:78-80 lists this exact refactor (step 2) as planned but unshipped.
|
||||
- **Proposal:** Inject first; cache the HDESK thread-local; only on a 0/partial SendInput result call reattach_input_desktop() and retry once. Use DF_ALLOWOTHERACCOUNTHOOK in the OpenInputDesktop access (sendinput.rs:52-56 currently passes DESKTOP_CONTROL_FLAGS(0)) so the secure desktop is reachable. Keeps the steady-state hot path to a single SendInput call.
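The inject-first, reattach-only-on-failure control flow in the proposal can be sketched independently of the Win32 calls. In the sketch below the function name `inject_with_retry` and both closures are hypothetical stand-ins: `inject` would wrap `SendInput` and return the number of events accepted, and `reattach` would wrap `reattach_input_desktop()` and report whether the thread was rebound to a different input desktop.

```rust
/// Inject once; on a short or zero result, rebind the desktop and retry once.
///
/// Hypothetical sketch of the retry-only model. `expected` is the number of
/// events submitted; `inject` returns how many the OS accepted.
fn inject_with_retry<I, R>(expected: usize, mut inject: I, mut reattach: R) -> bool
where
    I: FnMut() -> usize,
    R: FnMut() -> bool,
{
    // Steady state: a single injection call and nothing else.
    if inject() == expected {
        return true;
    }
    // Failure path: only retry if the input desktop actually changed.
    if !reattach() {
        return false;
    }
    inject() == expected
}
```

The point of the shape is that the desktop syscalls move entirely onto the failure path, so a stream of mouse moves on an unchanged desktop pays for one call per event.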
|
||||
|
||||
#### 2. Detect resolution/format change on the acquire hot path, not only during rebuild
|
||||
@@ -1846,7 +1846,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
|
||||
*Area:* `cmp:config-management` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
|
||||
|
||||
- **Apollo does:** system_tray.cpp builds a single static tray struct with a menu (Open/Force-stop/Reset-display/Restart/Quit, l.112-141) and pushes state changes from the streaming pipeline — update_tray_playing/pausing/stopped/launch_error/require_pin/paired/client_connected (l.238-412) each swap the icon + raise a balloon notification; init_tray hardens the thread DACL so the icon survives running as SYSTEM (l.143-204); a 50 ms polling thread drives it (tray_thread_worker l.415).
|
||||
- **punktfunk gap:** No tray code exists anywhere in crates/punktfunk-host (grep for tray/notify-rust/balloon returns nothing). On Windows the host runs windowless as SYSTEM in Session 1 via external scripts (docs/windows-host.md:77-84) with the only operator feedback being a redirected log file — there is no visible, clickable status/control surface for a desktop user.
|
||||
- **punktfunk gap:** No tray code exists anywhere in crates/punktfunk-host (grep for tray/notify-rust/balloon returns nothing). On Windows the host runs windowless as SYSTEM in Session 1 via external scripts (design/windows-host.md:77-84) with the only operator feedback being a redirected log file — there is no visible, clickable status/control surface for a desktop user.
|
||||
- **Proposal:** Add an optional system-tray plane behind a feature/flag using a Rust tray crate (e.g. tray-icon) spawned on its own native thread (no async on the per-frame path). Drive it from the existing AppState atomics/locks already exposed by mgmt.rs get_status (streaming/audio_streaming/pin_pending/session) — poll or push on state change to swap icon + show balloons (connected, pairing PIN, launch error). Menu items call the SAME primitives the API uses (stop_session, force_idr, native arm-pairing, quit). On Windows replicate Apollo's thread-DACL hardening so the icon shows when launched as SYSTEM in the interactive session.
|
||||
|
||||
#### 11. Treat S_OK-with-no-change frames as timeouts via DXGI update flags
|
||||
@@ -1923,7 +1923,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
|
||||
*Area:* `cmp:config-management` · *Windows-host:* yes · *Severity:* high · *Effort:* large
|
||||
|
||||
- **Apollo does:** config.cpp:1490-1534 handles the Windows shortcut/service launch dance inside the binary: --shortcut/--shortcut-admin handling, ShellExecuteExW(runas, --shortcut-admin) to self-elevate when the service isn't running, waits for the service, wait_for_ui_ready(), launch_ui(), then returns 1 so the foreground process does NOT also start a stream host. This is Sunshine/Apollo's mature service<->UI two-process split that makes one-click launch work.
|
||||
- **punktfunk gap:** punktfunk has no service-install / self-elevation / interactive-session bring-up in the binary. Deployment is documented as a manual chain of external scripts — scheduled task -> PsExec64 -i 1 -> launch.vbs -> host-run.cmd (docs/windows-host.md:77-96) — fragile and operator-hostile. main.rs has no install/service subcommand.
|
||||
- **punktfunk gap:** punktfunk has no service-install / self-elevation / interactive-session bring-up in the binary. Deployment is documented as a manual chain of external scripts — scheduled task -> PsExec64 -i 1 -> launch.vbs -> host-run.cmd (design/windows-host.md:77-96) — fragile and operator-hostile. main.rs has no install/service subcommand.
|
||||
- **Proposal:** Add `punktfunk-host install`/`uninstall`/`service` subcommands (Windows-gated) that register a service or an Interactive/Highest scheduled task to launch the host in Session 1 (the documented requirement for DXGI duplication + SendInput), and the self-elevate-if-not-running shortcut path. Reuse the existing capture/wgc_relay CreateProcessAsUserW machinery already in the crate. This codifies the script chain into the binary without touching the per-frame path or core.
|
||||
|
||||
#### 21. Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame
|
||||
@@ -1962,7 +1962,7 @@ GameStream `SO_SNDBUF`), **#8** (move GameStream input injection off the ENet se
|
||||
*Area:* `win:system-secure-desktop` · *Windows-host:* yes · *Severity:* high · *Effort:* large
|
||||
|
||||
- **Apollo does:** SunshineSvc.exe runs as LocalSystem in Session 0, loops on WTSGetActiveConsoleSessionId, clones its own token with DuplicateTokenEx(TokenPrimary)+SetTokenInformation(TokenSessionId) and CreateProcessAsUserW into winsta0\\default inside a per-session job object (JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE|BREAKAWAY_OK); opts into SERVICE_ACCEPT_SESSIONCHANGE and on WTS_CONSOLE_CONNECT terminates+relaunches the host in the new session (tools/sunshinesvc.cpp:95,111,239,256,267,276-294)
|
||||
- **punktfunk gap:** punktfunk has no Windows service; launch is a PsExec64 -s -i 1 scheduled task hard-coded to session 1 (docs/windows-host.md:78-84), with the SERVICE_CONTROL_SESSIONCHANGE relaunch listed as unimplemented step 6 (docs/windows-secure-desktop.md:89). Launch scripts are not even in the repo.
|
||||
- **punktfunk gap:** punktfunk has no Windows service; launch is a PsExec64 -s -i 1 scheduled task hard-coded to session 1 (design/windows-host.md:78-84), with the SERVICE_CONTROL_SESSIONCHANGE relaunch listed as unimplemented step 6 (design/windows-secure-desktop.md:89). Launch scripts are not even in the repo.
|
||||
- **Proposal:** Add a small Rust service binary (new crate or punktfunk-host `service` subcommand) using windows::Win32::System::Services (RegisterServiceCtrlHandlerEx, StartServiceCtrlDispatcher) that mirrors sunshinesvc.cpp: WTSGetActiveConsoleSessionId -> DuplicateTokenEx+SetTokenInformation(TokenSessionId) -> CreateProcessAsUserW(lpDesktop=winsta0\\default) into a kill-on-close job, accept SERVICE_ACCEPT_SESSIONCHANGE, and relaunch the host on a genuine console-session change. Ship an installer and drop the PsExec dependency.
|
||||
|
||||
#### 25. Elevate capture/encode/send thread priority on the host hot path
|
||||
@@ -40,7 +40,7 @@ the GPU/compositor stack of the box it runs on). What is:
|
||||
|
||||
| Image | Source | Notes |
|
||||
|---|---|---|
|
||||
| `git.unom.io/unom/punktfunk-web` | `web/Dockerfile` (repo-root context — orval needs `docs/api/openapi.json`) | Nitro `bun` bundle; `PORT` (3000) and `PUNKTFUNK_MGMT_URL` env at runtime |
|
||||
| `git.unom.io/unom/punktfunk-web` | `web/Dockerfile` (repo-root context — orval needs `api/openapi.json`) | Nitro `bun` bundle; `PORT` (3000) and `PUNKTFUNK_MGMT_URL` env at runtime |
|
||||
| `git.unom.io/unom/punktfunk-docs` | `docs-site/Dockerfile` | This site; `PORT` (3000) |
|
||||
| `git.unom.io/unom/punktfunk-rust-ci` | `ci/rust-ci.Dockerfile` | Ubuntu 26.04 + FFmpeg 8/PipeWire/GL/GBM dev libs + a libcuda **link stub** (driver userspace, no kernel module) + pinned rustup — the container `ci.yml`'s Rust job runs in |
|
||||
|
||||
@@ -0,0 +1,246 @@
|
||||
# Stats capture & graphing — design
|
||||
|
||||
Goal: let an operator **enable performance-stats capture from the web console**, play a
session, **stop**, and **review the captured time-series as graphs** in the web console.
Captures are **saved to disk** (browse/compare past sessions; survive host restart) and
cover **both** streaming paths: native punktfunk/1 (`virtual_stream`) and GameStream/Moonlight
(`gamestream/stream.rs`).

This builds on the existing per-stage instrumentation (today gated by `PUNKTFUNK_PERF=1`,
stdout-only, read once at startup). We make recording **runtime-toggleable**, route the same
aggregates into a **shared ring → on-disk recording**, and expose it over the mgmt REST API +
web console.
|
||||
|
||||
---
|
||||
|
||||
## 1. Host: shared `StatsRecorder`
|
||||
|
||||
New module `crates/punktfunk-host/src/stats_recorder.rs`. One `Arc<StatsRecorder>` is created
once in the unified host entry (`gamestream::serve`, the `serve` subcommand) alongside
`Arc<NativePairing>`, and shared with **both** the mgmt API (`MgmtState`) and the streaming
loops (threaded through `punktfunk1::serve` → `SessionContext` → `virtual_stream`/`send_loop`,
and into the GameStream encode loop). Mirror the existing `NativePairing` Arc-sharing pattern
exactly.
|
||||
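The sharing pattern described above, one recorder behind an `Arc` with a cheap armed check on the streaming side, can be sketched with the standard library alone. This is an illustration under assumptions: the `Recorder` type, its `samples` field, and `run_stream_loop` are hypothetical stand-ins, not the planned `StatsRecorder`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the shared recorder.
struct Recorder {
    armed: AtomicBool,
    samples: Mutex<Vec<u64>>,
}

impl Recorder {
    fn new() -> Arc<Self> {
        Arc::new(Self {
            armed: AtomicBool::new(false),
            samples: Mutex::new(Vec::new()),
        })
    }

    /// Cheap check for the hot path: one relaxed atomic load, no lock.
    fn is_armed(&self) -> bool {
        self.armed.load(Ordering::Relaxed)
    }

    /// Called from the management side to begin a capture.
    fn start(&self) {
        self.armed.store(true, Ordering::Relaxed);
    }

    /// Called from a streaming loop; takes the lock only while armed.
    fn record(&self, t_ms: u64) {
        if !self.is_armed() {
            return;
        }
        self.samples.lock().unwrap().push(t_ms);
    }
}

/// Hypothetical streaming loop that owns its own clone of the Arc.
fn run_stream_loop(rec: Arc<Recorder>, frames: u64) {
    for t in 0..frames {
        rec.record(t);
    }
}
```

While no capture is armed, the streaming loop pays for one atomic load per aggregate and never touches the mutex, which keeps the disabled case close to free.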
|
||||
### Data model (serde + utoipa `ToSchema`; this is the wire + on-disk shape)
|
||||
|
||||
```rust
/// One pipeline stage's latency in a window (microseconds).
pub struct StageTiming {
    pub name: String, // "capture" | "submit" | "encode" | "packetize" | "send"
    pub p50_us: f32,
    pub p99_us: f32,
}

/// One aggregated sample (~ every 2 s native, ~ every 1 s GameStream).
pub struct StatsSample {
    pub t_ms: u64,               // ms since capture start (monotonic, from a stored Instant)
    pub session_id: u32,         // disambiguates concurrent sessions (usually constant)
    pub stages: Vec<StageTiming>, // ordered pipeline stages for this path
    pub fps: f32,                // genuine NEW frames/s from the source
    pub repeat_fps: f32,         // re-encoded holds/s (source-starvation indicator)
    pub mbps: f32,               // tx goodput (Mb/s)
    pub bitrate_kbps: u32,       // configured target bitrate
    pub frames_dropped: u32,     // delta in this window
    pub packets_dropped: u32,    // delta (receiver-side / reassembler), where known
    pub send_dropped: u32,       // delta (host send-buffer overflow / EAGAIN)
    pub fec_recovered: u32,      // delta (shards recovered)
}

pub struct CaptureMeta {
    pub id: String, // "2026-06-26T20-14-03Z_5120x1440"; also the filename stem
    pub started_unix_ms: u64,
    pub duration_ms: u64,
    pub kind: String, // "native" | "gamestream"
    pub width: u32,
    pub height: u32,
    pub fps: u32,
    pub codec: String,  // "h264" | "hevc" | "av1"
    pub client: String, // short label / fingerprint prefix, or "" if unknown
    pub sample_count: u32,
}

pub struct Capture {
    pub meta: CaptureMeta,
    pub samples: Vec<StatsSample>,
}

pub struct StatsStatus {
    pub armed: bool,          // capture currently running
    pub sample_count: u32,    // samples in the in-progress capture
    pub started_unix_ms: u64, // 0 if idle
    pub kind: String,         // path of the in-progress capture, "" if idle
}
```
|
||||
|
||||
Stage sets per path (ordered, roughly the per-frame critical path so stacking is meaningful):
|
||||
- **native**: `capture` (try_latest ring read + color convert), `submit` (NVENC enqueue),
|
||||
`encode` (lock_bitstream = NVENC schedule + ASIC — the dominant stage under GPU load),
|
||||
`send` (paced_submit: seal + FEC + pace + sendmmsg).
|
||||
- **gamestream**: `capture`, `encode`, `packetize` (poll+FEC+packetize), `send`.
|
||||
|
||||
> Native naming: today's vectors are `st_cap`→`capture`, `st_submit`→`submit`,
|
||||
> `st_wait`→`encode`, `pace_us`→`send`. (`encode_us` total ≈ capture+submit+encode; we do not
|
||||
> emit it as a stage to avoid double-counting — it's implied by the stack.)
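
The `p50_us`/`p99_us` fields have to be derived from the per-frame stage times collected during a window. A minimal sketch of one way to do that, assuming a nearest-rank percentile over a `Vec<u32>` of microsecond samples. The helper names `percentile_us` and `stage_timing` are hypothetical, the plan does not prescribe a percentile method, and the serde/utoipa derives are left out to keep the block dependency-free:

```rust
/// Mirrors the plan's `StageTiming` (serde/utoipa derives omitted in this sketch).
#[derive(Debug, Clone, PartialEq)]
pub struct StageTiming {
    pub name: String,
    pub p50_us: f32,
    pub p99_us: f32,
}

/// Nearest-rank percentile over an already-sorted slice; `q` is in 0.0..=1.0.
/// Returns 0.0 for an empty window (assumption: an idle stage reports zero).
fn percentile_us(sorted: &[u32], q: f64) -> f32 {
    if sorted.is_empty() {
        return 0.0;
    }
    // rank = ceil(q * n), clamped to 1..=n, then converted to a 0-based index.
    let rank = (q * sorted.len() as f64).ceil() as usize;
    let idx = rank.clamp(1, sorted.len()) - 1;
    sorted[idx] as f32
}

/// Collapse one window of per-frame samples (µs) into a `StageTiming`.
/// Sorts in place, then clears the buffer so it can be reused next window.
pub fn stage_timing(name: &str, samples: &mut Vec<u32>) -> StageTiming {
    samples.sort_unstable();
    let timing = StageTiming {
        name: name.to_string(),
        p50_us: percentile_us(samples, 0.50),
        p99_us: percentile_us(samples, 0.99),
    };
    samples.clear();
    timing
}
```

Sorting happens once per aggregation window rather than per frame, so the per-frame cost stays a single `Vec::push`.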
|
||||
|
||||
### Recorder API
|
||||
|
||||
```rust
pub struct StatsRecorder { /* dir, armed: AtomicBool, live: Mutex<Option<Live>>, next_sid: AtomicU32 */ }

impl StatsRecorder {
    pub fn new(dir: PathBuf) -> Arc<Self>; // creates dir (0700) if missing

    pub fn is_armed(&self) -> bool; // cheap Relaxed atomic load — called on the hot path

    /// Arm a new capture. No-op if already armed (returns current status).
    pub fn start(&self) -> StatsStatus;

    /// A streaming loop announces itself when it first records while armed.
    /// Seeds CaptureMeta (kind/w/h/fps/codec/client) on the FIRST registration. Returns session_id.
    pub fn register_session(&self, kind: &'static str, w: u32, h: u32, fps: u32, codec: &str, client: &str) -> u32;

    /// Append one aggregated sample (called from the loops' existing ~2 s/~1 s boundary).
    /// Bounded: cap at MAX_SAMPLES (e.g. 5400 ≈ 3 h @ 2 s). On overflow, stop appending and
    /// set a `truncated` flag (DO NOT drop oldest — a saved recording must keep its start).
    pub fn push_sample(&self, session_id: u32, sample: StatsSample);

    /// Disarm + finalize: write <dir>/<id>.json atomically, clear live, return saved meta.
    pub fn stop(&self) -> std::io::Result<Option<CaptureMeta>>;

    pub fn status(&self) -> StatsStatus;
    pub fn live_snapshot(&self) -> Option<Capture>; // clone of the in-progress capture for live graphing

    pub fn list(&self) -> Vec<CaptureMeta>; // scan dir, parse meta only, newest first
    pub fn load(&self, id: &str) -> std::io::Result<Capture>;
    pub fn delete(&self, id: &str) -> std::io::Result<()>;
}
```
|
||||
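The overflow rule documented on `push_sample` (stop appending, set `truncated`, never evict the oldest sample) can be sketched as a small append-only buffer. This is an illustration only: `BoundedLog` is a hypothetical name, and where the real `truncated` flag lives is not specified in the data model above.

```rust
/// Illustrative cap; the plan suggests 5400 (about 3 h at one sample per 2 s).
pub const MAX_SAMPLES: usize = 5400;

/// Append-only buffer that keeps the start of a recording and refuses new
/// items once full, instead of evicting the oldest ones.
pub struct BoundedLog<T> {
    items: Vec<T>,
    cap: usize,
    truncated: bool,
}

impl<T> BoundedLog<T> {
    pub fn with_cap(cap: usize) -> Self {
        Self { items: Vec::new(), cap, truncated: false }
    }

    /// Returns true if the item was stored, false if it was dropped because
    /// the cap was already reached (in which case `truncated` is set).
    pub fn push(&mut self, item: T) -> bool {
        if self.items.len() >= self.cap {
            self.truncated = true;
            return false;
        }
        self.items.push(item);
        true
    }

    pub fn items(&self) -> &[T] {
        &self.items
    }

    pub fn is_truncated(&self) -> bool {
        self.truncated
    }
}
```

A ring buffer would bound memory just as well, but it would discard the beginning of a long capture, which is exactly what the doc comment rules out.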
|
||||
Invariants / safety:
|
||||
- **No async on the per-frame path.** `is_armed()` is a `Relaxed` atomic load; sample
|
||||
construction happens only at the existing 2 s / 1 s aggregation boundary, never per frame.
|
||||
- **`id` is path-traversal-safe.** `load`/`delete` MUST reject any id not matching
|
||||
`^[A-Za-z0-9._-]+$` (no `/`, no `..`, no `:` — keep it a valid Windows filename), and only ever
|
||||
join `dir/<id>.json`. Return NotFound on reject. (Endpoints are bearer-authed, but defend in
|
||||
depth.)
|
||||
- **Bounded memory.** `MAX_SAMPLES` cap; on overflow, stop appending and keep the oldest samples (never evict, never grow unbounded).
|
||||
- **Atomic disk write.** Write to `<id>.json.tmp` then rename, so a crash mid-write can't leave
|
||||
a half file. Pretty-print not required; compact JSON is fine.
|
||||
- Captures dir: `~/.config/punktfunk/captures/` (next to `cert.pem` etc.). Resolve via the same
|
||||
config-dir helper the rest of the host uses.
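
The id check and the tmp-then-rename write from the invariants above can be sketched with the standard library alone. The function names are hypothetical, and the explicit `..` rejection is an extra defensive assumption layered on top of the character allowlist (which on its own would let a dots-only id through):

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// True if `id` matches `^[A-Za-z0-9._-]+$` and contains no `..` sequence.
pub fn is_safe_id(id: &str) -> bool {
    !id.is_empty()
        && !id.contains("..")
        && id
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'.' | b'_' | b'-'))
}

/// Resolve `<dir>/<id>.json`, rejecting unsafe ids with `NotFound`.
pub fn capture_path(dir: &Path, id: &str) -> io::Result<PathBuf> {
    if !is_safe_id(id) {
        return Err(io::Error::new(io::ErrorKind::NotFound, "no such recording"));
    }
    Ok(dir.join(format!("{}.json", id)))
}

/// Write `bytes` to `<id>.json.tmp`, flush it to disk, then rename it over
/// the final path so readers never observe a half-written file.
pub fn write_atomic(dir: &Path, id: &str, bytes: &[u8]) -> io::Result<PathBuf> {
    let final_path = capture_path(dir, id)?;
    let tmp_path = dir.join(format!("{}.json.tmp", id));
    let mut file = fs::File::create(&tmp_path)?;
    file.write_all(bytes)?;
    file.sync_all()?;
    drop(file); // close the handle before renaming
    fs::rename(&tmp_path, &final_path)?;
    Ok(final_path)
}
```

Returning `NotFound` for a rejected id means a caller cannot distinguish "invalid" from "missing", which matches the reject behavior described in the list.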
|
||||
|
||||
### Runtime gating change (the key behavioral change)
|
||||
|
||||
Today the loops measure per-stage timing only `if perf` (a startup bool). Change the per-frame
|
||||
**measurement** predicate to `let measure = perf || recorder.is_armed();`, re-evaluated each
|
||||
frame (cheap atomic). Then at the aggregation boundary:
|
||||
- if `perf` → keep the existing `tracing::info!` log line (unchanged behavior);
|
||||
- if `recorder.is_armed()` → also build a `StatsSample` and `push_sample`.
|
||||
|
||||
So `PUNKTFUNK_PERF=1` still works exactly as before, AND the web toggle now works at runtime
|
||||
with zero startup flags.
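
The gating rule above reduces to three booleans derived from `perf` and the armed flag. A minimal sketch, assuming a stand-in `ArmedFlag` type that models only the atomic inside `StatsRecorder`; `Gate` and `gate` are hypothetical names used for illustration:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Stand-in for the armed flag inside `StatsRecorder` (only the gate is modeled).
pub struct ArmedFlag {
    armed: AtomicBool,
}

impl ArmedFlag {
    pub const fn new() -> Self {
        Self { armed: AtomicBool::new(false) }
    }

    /// Cheap enough to call once per frame.
    pub fn is_armed(&self) -> bool {
        self.armed.load(Ordering::Relaxed)
    }

    pub fn set(&self, on: bool) {
        self.armed.store(on, Ordering::Relaxed)
    }
}

/// What a streaming loop should do, given the startup flag and the runtime toggle.
#[derive(Debug, PartialEq, Eq)]
pub struct Gate {
    pub measure: bool,     // take per-stage timestamps this frame
    pub log_line: bool,    // emit the existing perf log at the window boundary
    pub push_sample: bool, // build a sample and push it at the window boundary
}

pub fn gate(perf: bool, flag: &ArmedFlag) -> Gate {
    let armed = flag.is_armed();
    Gate {
        measure: perf || armed,
        log_line: perf,
        push_sample: armed,
    }
}
```

Because the log line depends only on `perf`, arming a capture from the web console does not change what `PUNKTFUNK_PERF=1` prints.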
|
||||
|
||||
### Where each loop emits the sample
|
||||
|
||||
- **native** (`punktfunk1.rs`): the cap/submit/encode(`st_wait`) splits live in the capture
|
||||
thread; `mbps`/`send_dropped`/`bytes` and `session.stats()` live in the send thread. Emit the
|
||||
complete sample from **one** place. Cleanest: carry the per-frame `cap_us/submit_us/wait_us`
|
||||
(and a `repeat: bool`) on `FrameMsg` to the send thread (it already carries `encode_us`), so
|
||||
`send_loop` builds the whole sample at its existing 2 s boundary where `session.stats()` is
|
||||
already read. Compute `frames_dropped/packets_dropped/send_dropped/fec_recovered` as deltas vs
|
||||
the previous window's `Session::stats()` snapshot (the loop already tracks `last_bytes` /
|
||||
`last_send_dropped` — extend that bookkeeping). `register_session` is called once with the
|
||||
negotiated mode/codec and the client label.
|
||||
- **gamestream** (`gamestream/stream.rs`): the encode loop already tracks per-stage max each
|
||||
1 s. Add p50/p99 accumulation (small per-stage `Vec<u32>` like the native path) and, when
|
||||
`perf || recorder.is_armed()`, emit a `StatsSample` with stages
|
||||
`[capture, encode, packetize, send]` + fps (unique new frames) + mbps + whatever loss/byte
|
||||
counters that path exposes (use 0 where a counter doesn't exist; do NOT fabricate). Call
|
||||
`register_session("gamestream", ...)` with the GameStream-negotiated mode/codec/client.
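
The "deltas vs the previous window's snapshot" bookkeeping described for the native path can be sketched as a small tracker. The `Counters` shape is an assumption made for illustration; the real `Session::stats()` type is not shown in this plan, and `DeltaTracker` is a hypothetical name:

```rust
/// Cumulative counters, as they might be read from a stats snapshot each window.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct Counters {
    pub frames_dropped: u64,
    pub packets_dropped: u64,
    pub send_dropped: u64,
    pub fec_recovered: u64,
}

/// Per-window deltas, matching the `u32` delta fields on `StatsSample`.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct WindowDelta {
    pub frames_dropped: u32,
    pub packets_dropped: u32,
    pub send_dropped: u32,
    pub fec_recovered: u32,
}

/// Remembers the previous snapshot so each window reports only what changed.
pub struct DeltaTracker {
    last: Counters,
}

impl DeltaTracker {
    pub fn new(baseline: Counters) -> Self {
        Self { last: baseline }
    }

    /// Returns the change since the previous call and stores `now` as the new baseline.
    pub fn advance(&mut self, now: Counters) -> WindowDelta {
        // saturating_sub yields 0 if a counter was reset; min() clamps to u32::MAX.
        fn delta(now: u64, last: u64) -> u32 {
            now.saturating_sub(last).min(u32::MAX as u64) as u32
        }
        let out = WindowDelta {
            frames_dropped: delta(now.frames_dropped, self.last.frames_dropped),
            packets_dropped: delta(now.packets_dropped, self.last.packets_dropped),
            send_dropped: delta(now.send_dropped, self.last.send_dropped),
            fec_recovered: delta(now.fec_recovered, self.last.fec_recovered),
        };
        self.last = now;
        out
    }
}
```

Seeding the tracker with the snapshot taken when the capture is armed keeps the first sample from reporting everything dropped since the session began.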
|
||||
|
||||
Threading: add `stats: Arc<StatsRecorder>` to `SessionContext` and the GameStream stream
|
||||
setup; the standalone `punktfunk1-host` subcommand (no mgmt) passes a fresh recorder (harmless,
|
||||
just unused).
|
||||
|
||||
---
|
||||
|
||||
## 2. Host: mgmt REST API (`mgmt.rs`)
|
||||
|
||||
Add `stats: Arc<StatsRecorder>` to `MgmtState`. Register handlers in `api_router_parts()` via
|
||||
`routes!()` with `#[utoipa::path]`. All under `/api/v1`, **bearer-token only** (operator
|
||||
actions — do NOT add them to the mTLS `cert_may_access` read-only allowlist). All bodies/returns
|
||||
derive `ToSchema`; errors use the `ApiJson`/`ApiError` envelope. Tag every operation `stats`.
|
||||
|
||||
| Method & path | fn (operationId) | body → returns |
|---|---|---|
| POST `/api/v1/stats/capture/start` | `stats_capture_start` | — → `StatsStatus` |
| POST `/api/v1/stats/capture/stop` | `stats_capture_stop` | — → `CaptureMeta` (200) / 204-ish if nothing was recording |
| GET `/api/v1/stats/capture/status` | `stats_capture_status` | → `StatsStatus` |
| GET `/api/v1/stats/capture/live` | `stats_capture_live` | → `Capture` (in-progress; 404/empty if idle) |
| GET `/api/v1/stats/recordings` | `stats_recordings_list` | → `Vec<CaptureMeta>` |
| GET `/api/v1/stats/recordings/{id}` | `stats_recording_get` | → `Capture` |
| DELETE `/api/v1/stats/recordings/{id}` | `stats_recording_delete` | → `StatsStatus`/204 |
|
||||
|
||||
Register the new `ToSchema` types with the OpenApi derive's `components(schemas(...))` list.
|
||||
Then regenerate the checked-in spec:
|
||||
|
||||
```
cargo run -p punktfunk-host -- openapi > api/openapi.json
```
|
||||
|
||||
CI fails on drift — the regenerated `api/openapi.json` MUST be committed.
|
||||
|
||||
---
|
||||
|
||||
## 3. Web console (`web/`)
|
||||
|
||||
New page **"Performance"** following the established route → section/index (fetch) →
|
||||
section/view (presentational) pattern, registered in the `NAV` array (`app-shell.tsx`) with a
|
||||
lucide icon (`Activity` or `LineChart`).
|
||||
|
||||
- Route: `web/src/routes/stats.tsx` → `createFileRoute('/stats')` → `SectionStats`.
|
||||
- Section: `web/src/sections/Stats/index.tsx` (orval hooks) + `view.tsx` (presentational,
|
||||
i18n via Paraglide `m.*`). Use `Section`, `QueryState`, `Card`/`CardHeader`/`CardTitle`/
|
||||
`CardContent`, `Button`, `Badge` from `web/src/components/ui`.
|
||||
- Charts: **add `recharts`** to `web/package.json` (no chart lib exists today). Render charts
|
||||
**client-only** (a mounted guard) so SSR doesn't choke on `ResponsiveContainer`'s 0-width
|
||||
measure. Theme via existing CSS variables / brand violet, dark-mode aware.
|
||||
|
||||
Data hooks come from regenerated orval (`bun run api:gen` after the host's openapi.json is
|
||||
updated): `useStatsCaptureStatus`, `useStatsCaptureStart`, `useStatsCaptureStop`,
|
||||
`useStatsCaptureLive`, `useStatsRecordingsList`, `useStatsRecordingGet`,
|
||||
`useStatsRecordingDelete` (exact names per orval's tag/operationId convention — verify against
|
||||
generated output and adjust the view imports to match).
|
||||
|
||||
UI layout:
|
||||
1. **Capture control card** — Start/Stop button (mutations; invalidate status query on
|
||||
success), a "Recording…"/"Idle" `Badge`, elapsed time + live sample count
|
||||
(`useStatsCaptureStatus`, `refetchInterval: 2000`). On Start, the live chart appears.
|
||||
2. **Live chart** (visible while armed; `useStatsCaptureLive`, `refetchInterval: 2000`) — the
|
||||
latency stage breakdown as a **stacked area** (capture/submit/encode/send in µs, the
|
||||
"where does the time go" view), with fps and mbps as secondary line charts.
|
||||
3. **Recordings card** — table from `useStatsRecordingsList`: time, kind badge, resolution,
|
||||
codec, duration, sample count; row actions **View** (select → detail), **Download** (export
|
||||
the `Capture` JSON via the recording GET), **Delete** (mutation, confirm).
|
||||
4. **Recording detail** — when a recording (or the live capture) is selected, render the full
|
||||
graph set from its `samples`:
|
||||
- Latency stage breakdown (stacked area, µs) — primary bottleneck view; p99 overlay toggle.
|
||||
- Throughput: fps (new vs repeat) + mbps.
|
||||
- Health: frames_dropped / packets_dropped / send_dropped / fec_recovered over time.
|
||||
|
||||
i18n: add keys to `web/messages/en.json` + `de.json` (nav label, titles, button/labels) and
|
||||
regenerate Paraglide. Keep both locales in sync.
|
||||
|
||||
---
|
||||
|
||||
## 4. Verification / done-criteria
|
||||
|
||||
- `cargo build -p punktfunk-host` (and `--workspace`), `cargo clippy --workspace --all-targets
|
||||
  -- -D warnings`, `cargo fmt --all --check` — green.
|
||||
- `cargo run -p punktfunk-host -- openapi > api/openapi.json` — committed, no drift.
|
||||
- `PUNKTFUNK_PERF=1` stdout behavior unchanged (no regression to the existing perf log).
|
||||
- Web: orval regen clean, typecheck/build green, charts render client-side.
|
||||
- CLAUDE.md status note + this plan updated.
|
||||
- Adversarial review: hot-path stays sync + bounded; `id` path-traversal-safe; OpenAPI/orval no
|
||||
drift; SSR-safe charts; both paths actually emit samples.
|
||||
@@ -51,7 +51,7 @@ back — i.e. the Windows analogue of the **GTK4 Linux client** (`clients/linux`
|
||||
which is the architectural template. The Windows client is close to a 1:1 port of the Linux client
|
||||
with the platform layers swapped.
|
||||
|
||||
## Locked decisions (from the Windows-host/client plan, `docs/windows-host.md` + project memory)
|
||||
## Locked decisions (from the Windows-host/client plan, `design/windows-host.md` + project memory)
|
||||
|
||||
- **Pure Rust.** `windows-rs` + **Windows App SDK "Reactor"** (WinUI 3 from Rust, merged windows-rs
|
||||
PR #4479). No C++/C#. De-risk Reactor + `SwapChainPanel` FIRST — it's the only novel/uncertain
|
||||
@@ -165,6 +165,6 @@ Windows client should mirror it:
|
||||
- **Core client API:** `crates/punktfunk-core/src/client.rs` (`NativeClient`).
|
||||
- **Protocol:** `crates/punktfunk-core/src/quic.rs` (`Hello.video_caps`, `Welcome.bit_depth`,
|
||||
`VIDEO_CAP_10BIT`/`VIDEO_CAP_HDR`).
|
||||
- **Full Windows plan + SudoVDA/host details:** `docs/windows-host.md`.
|
||||
- **Full Windows plan + SudoVDA/host details:** `design/windows-host.md`.
|
||||
- **Host HDR conversion (for the inverse math):** `crates/punktfunk-host/src/capture/dxgi.rs`
|
||||
(`HDR_PS`, `HdrConverter`) + `crates/punktfunk-host/src/encode/nvenc.rs` (BT.2020/PQ VUI).
|
||||
@@ -34,7 +34,7 @@ which kept the live-validated host working at every step. The driver, by contras
|
||||
|---|---|---|
|
||||
| **Goal 1** — clean, layered host architecture | ✅ **DONE** | `config.rs` (`HostConfig`), `session_plan.rs` (`SessionPlan`), `SessionContext`, `windows/`+`linux/` confinement (`38c68c3`), `VirtualDisplayManager` (§2.5), `EncoderCaps` (`0ccd0fe`) |
|
||||
| **Goal 2** — drop every trace of SudoVDA | ✅ **DONE** | reach-in decoupled (F1: `d638a93`/`e60cda3` → `win_adapter`/`win_display`), then the `sudovda.rs` backend + the dual-backend select **deleted** (this branch) — pf-vdisplay is the sole Windows virtual-display backend |
|
||||
| **Goal 3** — minimize `unsafe` + P0 lints | 🟡 **PARTIAL** (**box-validated**) | driver `deny(unsafe_op_in_unsafe_fn)` (`a755d6e`); **`OwnedHandle`/RAII rollout** — `idd_push.rs` (`011607e`, view-leak fix) + `service.rs` child/job (`4c95ba7`) + the 3 gamepad backends via shared `gamepad_raii.rs` (`e5c2b4e`) + the IDD-push `KeyedMutexGuard` hot loop (`6585643`); **driver `pod_init!`** (`bf57704`, 27→1). **On-glass clean: host clippy `-D warnings` + driver build** (RTX box; `bd05bc8` fixed 11 lints the gate surfaced). Remaining: host-crate P0 lints (deferred — churn>value), the `service.rs` SCM-handler event smuggling (deliberately left) |
|
||||
| **Goal 3** — minimize `unsafe` + P0 lints | 🟡 **PARTIAL** (**box-validated**) | driver `deny(unsafe_op_in_unsafe_fn)` (`a755d6e`); **`OwnedHandle`/RAII rollout** — `idd_push.rs` (`011607e`, view-leak fix) + `service.rs` child/job (`4c95ba7`) + the 3 gamepad backends via shared `gamepad_raii.rs` (`e5c2b4e`) + the IDD-push `KeyedMutexGuard` hot loop (`6585643`) + the **SCM STOP/SESSION events** → `OnceLock<OwnedHandle>` (`61c02e6`, runtime-validated: clean ~1 s `sc stop`); **driver `pod_init!`** (`bf57704`, 27→1). **On-glass clean: host clippy `-D warnings` + driver build** (RTX box; `bd05bc8` fixed 11 lints the gate surfaced). The host-side raw-handle smuggling is fully retired; only host-crate P0 lints remain (deferred — churn>value) |
|
||||
| **M0** — proto ABI + driver toolchain + `/INTEGRITYCHECK` + `iddcx` | ✅ **DONE** | `pf-driver-proto`; vendored `windows-drivers-rs` 0.5.1; `clear-force-integrity.ps1`; CI-green |
|
||||
| **M1** — new IddCx driver, first light + HDR | ✅ **DONE (on-glass)** | STEP 0–8 (`d7a9fbf`…`cd59151`); HDR live ("Mac connects WITH HDR", `6399d28`) |
|
||||
| **M2** — IDD-push capture + NVENC, glass-to-glass | ✅ **DONE (on-glass)** | 5120×1440@240 HDR zero-copy; integrated into the host path |
|
||||
@@ -226,14 +226,16 @@ These are expensive empirical wins; keep them intact when touching the code:
|
||||
`unsafe fn`s need an inner `unsafe {}`). Stage it **per-module, Linux-first** (item-level `#[deny]` on
|
||||
`linux/zerocopy/cuda.rs`/`egl.rs`, `encode/linux/vaapi.rs` — locally verifiable), then the Windows
|
||||
modules (CI-gated), then promote to crate-level. The driver already has the deny.
|
||||
5. **D2 — `OwnedHandle` / RAII rollout.** ✅ **done** — `capture/windows/idd_push.rs` (`011607e`: a
|
||||
`MappedSection` RAII for the mapping handle **+** the leaked `MapViewOfFile` view, + `OwnedHandle` for the
|
||||
event / ring-slot shared handles); `windows/service.rs` (`4c95ba7`: the child process/thread + Job
|
||||
handles, ~9 `CloseHandle` deleted); and the **three gamepad backends** (`e5c2b4e`: a shared
|
||||
5. **D2 — `OwnedHandle` / RAII rollout.** ✅ **DONE (complete).** `capture/windows/idd_push.rs` (`011607e`:
|
||||
a `MappedSection` RAII for the mapping handle **+** the leaked `MapViewOfFile` view, + `OwnedHandle` for
|
||||
the event / ring-slot shared handles); `windows/service.rs` (`4c95ba7`: the child process/thread + Job
|
||||
handles, ~9 `CloseHandle` deleted); the **three gamepad backends** (`e5c2b4e`: a shared
|
||||
`inject/windows/gamepad_raii.rs` — `Shm` for the section+view, `SwDevice` for the devnode — replacing the
|
||||
duplicated `create_shm_section` + three hand-written `Drop`s). **Remaining (deliberately left):** the
|
||||
`service.rs` `AtomicIsize` STOP/SESSION events — smuggled into the C SCM handler, a separate riskier
|
||||
redesign. `manager.rs`/`pf_vdisplay.rs` already used the pattern.
|
||||
duplicated `create_shm_section` + three hand-written `Drop`s); and the **SCM STOP/SESSION events**
|
||||
(`61c02e6`: `AtomicIsize` raw-`isize` smuggle → `OnceLock<OwnedHandle>` the capture-free C handler reads,
|
||||
owned for the process lifetime — also closes a latent close-then-signal window). **Runtime-validated on
|
||||
the RTX box**: swapped in, `sc start` → RUNNING, `sc stop` → clean STOPPED in ~1 s (not a timeout-kill),
|
||||
original restored. `manager.rs`/`pf_vdisplay.rs` already used the pattern.
|
||||
6. **Hot-loop `KeyedMutexGuard` ✅ done** (`6585643`) — the IDD-push consume loop's hand-written
|
||||
`AcquireSync`/`ReleaseSync` (with its "don't `?`-return between them or you leak the lock + stall the
|
||||
driver" caveat) is now a RAII guard scoped to the convert/copy block: same release point (latency
|
||||
@@ -426,5 +428,5 @@ This file replaces five docs (recoverable from git history):
|
||||
- `windows-host-rewrite-game-capture-bug.md` (the GB1 investigation + fix) — **fixed**; the resolution is
|
||||
§2.5 (capture). The full investigation narrative is in git history.
|
||||
|
||||
(The older `docs/windows-host.md`, a pre-rewrite implementation plan from 2026-06-22, is a separate
|
||||
(The older `design/windows-host.md`, a pre-rewrite implementation plan from 2026-06-22, is a separate
|
||||
lineage and is left as-is.)
|
||||
+2
-2
@@ -16,8 +16,8 @@ sidebar, and the landing page). It reads [`public/openapi.json`](public/openapi.
|
||||
|
||||
```sh
|
||||
# from the repo root — regenerate the spec, then copy the snapshot in:
|
||||
cargo run -p punktfunk-host -- openapi > docs/api/openapi.json
|
||||
cp docs/api/openapi.json docs-site/public/openapi.json
|
||||
cargo run -p punktfunk-host -- openapi > api/openapi.json
|
||||
cp api/openapi.json docs-site/public/openapi.json
|
||||
```
|
||||
|
||||
## Develop
|
||||
|
||||
@@ -64,8 +64,10 @@ DualSense feedback, automatic host discovery, PIN pairing with pinned reconnects
|
||||
overlay — with D-pad and game-controller focus navigation for the couch. It builds from the
|
||||
`clients/android` directory (Kotlin + a shared Rust core).
|
||||
|
||||
Install it from **Google Play** — see [Install a Client](/docs/install-client#android). Open the app,
|
||||
pick your host, [pair](/docs/pairing) once, and stream.
|
||||
The app is in **Google Play Internal Testing** — request a tester invite on our
|
||||
[**Discord**](https://discord.gg/kaPNvzMuGU) and we'll add you (see
|
||||
[Install a Client](/docs/install-client#android)). Once added, open the app, pick your host,
|
||||
[pair](/docs/pairing) once, and stream.
|
||||
|
||||
## Windows desktop client
|
||||
|
||||
|
||||
@@ -16,14 +16,22 @@ monitor. When the client disconnects, the virtual display goes away.
|
||||
That's why a 1080p60 laptop and a 1440p120 desktop can stream from the same host **at the same time**,
|
||||
each at its own mode — they each get their own virtual display.
|
||||
|
||||
How the virtual display is created depends on your desktop:
|
||||
How the virtual display is created depends on your host:
|
||||
|
||||
| Desktop | How |
|
||||
| Host | How |
|
||||
|---|---|
|
||||
| **GNOME** (Mutter) | A virtual monitor via the screen-cast API |
|
||||
| **KDE Plasma** (KWin) | A virtual output via KWin's screencast |
|
||||
| **Bazzite / Steam** (gamescope) | A nested gamescope session launched at the client's mode |
|
||||
| **Sway** (wlroots) | A headless output added to the running session |
|
||||
| **Windows** | A virtual-display driver — including punktfunk's own **indirect display driver** the host pushes frames straight into — a real virtual display, no physical monitor, even on the secure desktop |
|
||||
|
||||
That last one is the distinctive part on Windows: rather than only capturing an existing screen,
|
||||
punktfunk has **its own indirect display driver (IDD)**, and the host can push finished frames
|
||||
**straight into the driver**. You get the same on-the-fly virtual display the Linux compositors give
|
||||
you — at the client's exact mode, with no physical monitor or dummy HDMI dongle, and even on the
|
||||
secure desktop (UAC / lock screen). That tight, push-based integration is unusual among Windows
|
||||
streaming hosts.
|
||||
|
||||
## From screen to GPU to wire
|
||||
|
||||
|
||||
@@ -1,19 +1,21 @@
|
||||
---
|
||||
title: Introduction
|
||||
description: Low-latency desktop and game streaming from a Linux host to any of your devices.
|
||||
description: Low-latency desktop and game streaming from a Linux or Windows host to any of your devices.
|
||||
---
|
||||
|
||||
import { Cards, Card } from 'fumadocs-ui/components/card'
|
||||
|
||||
**punktfunk** streams your Linux desktop or games to your other devices — a laptop, a Mac, a tablet,
|
||||
**punktfunk** streams your desktop or games to your other devices — a laptop, a Mac, a tablet,
|
||||
a TV — at low latency and at **each device's own resolution and refresh rate**. Run the host on a
|
||||
Linux machine with an NVIDIA GPU, connect a client, and you're streaming.
|
||||
Linux machine or a Windows PC, connect a client, and you're streaming.
|
||||
|
||||
It's built for the things that make streaming feel native:
|
||||
|
||||
- **Your device's exact mode.** The host spins up a virtual display sized to the client that's
|
||||
connecting — 1080p60 to your laptop, 1440p120 to your desktop, 4K to your TV — at the same time.
|
||||
No letterboxing, no scaling, no juggling your real monitors.
|
||||
No letterboxing, no scaling, no juggling your real monitors. On Windows that's punktfunk's own
|
||||
indirect display driver, frames pushed straight in — so there's no physical monitor or dummy HDMI
|
||||
plug to deal with, even on the secure desktop.
|
||||
- **Low latency, GPU end to end.** Frames go straight from the compositor to the GPU encoder
|
||||
(NVENC) with zero CPU copies, and over a transport tuned for responsiveness rather than throughput.
|
||||
- **Works with the apps you already have.** punktfunk speaks the GameStream protocol, so any
|
||||
@@ -34,11 +36,17 @@ It's built for the things that make streaming feel native:
|
||||
|
||||
## What you need
|
||||
|
||||
- A **Linux host** with an **NVIDIA GPU** (for the NVENC hardware encoder) running one of the
|
||||
[supported setups](/docs/requirements): **Ubuntu** (GNOME or KDE), **Fedora** (KDE), or **Bazzite**.
|
||||
A native [**Windows host**](/docs/windows-host) (NVIDIA-only) is also available.
|
||||
- A **host** with a supported GPU — either a **Linux** machine running one of the
|
||||
[supported setups](/docs/requirements) (**Ubuntu** GNOME or KDE, **Fedora** KDE, or **Bazzite**), or
|
||||
a **[Windows](/docs/windows-host) PC**.
|
||||
- A **client device** to stream to — there are native apps for **macOS, iOS/iPadOS, tvOS, Linux,
|
||||
Windows, and Android**, plus any device that runs **Moonlight**.
|
||||
- Both on the **same network** (LAN or VPN). punktfunk is designed for a trusted local network.
|
||||
|
||||
Ready? Head to the [Quick Start](/docs/quickstart).
|
||||
|
||||
## Community
|
||||
|
||||
Questions, help, or want to try the **Android beta**? Join the
|
||||
[**Discord**](https://discord.gg/kaPNvzMuGU) — request a tester invite there — or
|
||||
[**r/Punktfunk**](https://www.reddit.com/r/Punktfunk/) on Reddit.
|
||||
|
||||
@@ -20,7 +20,7 @@ Whichever client you install, the first connection needs a one-time [pairing](/d
|
||||
| **Windows** | [Signed MSIX](#windows) from the package registry |
|
||||
| **macOS** | [Notarized `.dmg`](#macos) from the releases page |
|
||||
| **iPhone / iPad / Apple TV** | [TestFlight beta](#ios-ipados-apple-tv) |
|
||||
| **Android / Android TV** | [Google Play](#android) |
|
||||
| **Android / Android TV** | [Beta — request access](#android) |
|
||||
| Anything else (browser, old phone, TV) | [Moonlight](/docs/moonlight) |
|
||||
|
||||
## Linux desktop (Flatpak)
|
||||
@@ -118,12 +118,16 @@ Open the app, and your hosts appear automatically under *On this network*.
|
||||
|
||||
## Android
|
||||
|
||||
The Android client (phone + Android TV) is on **Google Play**:
|
||||
The Android client (phone + Android TV) is in **Google Play Internal Testing**. To try it, request a
|
||||
tester invite on our [**Discord**](https://discord.gg/kaPNvzMuGU) and we'll add your Google account to
|
||||
the test track:
|
||||
|
||||
**[Request access on Discord →](https://discord.gg/kaPNvzMuGU)**
|
||||
|
||||
Once you're added, install it from Google Play, then open the app and pick your host:
|
||||
|
||||
**[Get punktfunk on Google Play →](https://play.google.com/store/apps/details?id=io.unom.punktfunk)**
|
||||
|
||||
Install, open the app, and pick your host. _(The app is in testing — if the listing isn't visible
|
||||
to you yet, you'll need to be added to the test track.)_
|
||||
_(only resolves once your account is on the tester list)_
|
||||
|
||||
## Anything else — Moonlight
|
||||
|
||||
|
||||
@@ -84,9 +84,10 @@ session unit — see [Bazzite](/docs/bazzite).
|
||||
|
||||
## Windows
|
||||
|
||||
> punktfunk is Linux-first, but a native **Windows host** also ships — a signed installer with an SCM
|
||||
> service and a bundled virtual-display driver. It's **NVIDIA-only** (NVENC) and newer than the Linux
|
||||
> host. (Not to be confused with the Windows *client*, which streams *to* a Windows PC.)
|
||||
> punktfunk has first-class **Linux and Windows** hosts. On Windows it ships as a signed installer
|
||||
> with an SCM service and a virtual-display driver — including punktfunk's own **indirect display
|
||||
> driver** the host pushes frames straight into. The Windows host is newer than the Linux host. (Not
|
||||
> to be confused with the Windows *client*, which streams *to* a Windows PC.)
|
||||
|
||||
On Windows the host runs as a `LocalSystem` service that launches into the interactive session, so it
|
||||
captures the secure desktop (UAC / lock screen) and survives reboots with nobody logged in — the same
|
||||
|
||||
@@ -23,9 +23,9 @@ A high-level view of where punktfunk stands. The ordered plan of work is on the
|
||||
|
||||
## What works today
|
||||
|
||||
punktfunk is a low-latency desktop and game streaming **host** — Linux-first (Linux + NVIDIA, NVENC),
|
||||
with a newer **NVIDIA-only Windows host** too — and native **clients** on macOS, iOS/iPadOS/tvOS,
|
||||
Linux, Windows, and Android.
|
||||
punktfunk is a low-latency desktop and game streaming **host** with first-class **Linux and Windows**
|
||||
support — and native **clients** on macOS, iOS/iPadOS/tvOS, Linux, Windows, and Android. (The Windows
|
||||
host is newer than the Linux host.)
|
||||
|
||||
- **Two protocols.** The host speaks the **GameStream** protocol, so any **Moonlight**
|
||||
client works out of the box, plus its own lower-latency **`punktfunk/1`** protocol
|
||||
|
||||
@@ -1,15 +1,17 @@
|
||||
---
|
||||
title: "Windows Host"
|
||||
description: "Run the punktfunk streaming host on a Windows PC with an NVIDIA GPU."
|
||||
description: "Run the punktfunk streaming host on a Windows PC — a first-class, virtual-display host."
|
||||
---
|
||||
|
||||
|
||||
**Status: implemented and shipping — NVIDIA-only, x64-only.** punktfunk is Linux-first, but it also
|
||||
runs as a native **Windows host**: a signed installer registers a `LocalSystem` service that streams
|
||||
**Status: implemented and shipping — x64-only.** Alongside the Linux host, punktfunk runs as a
|
||||
first-class native **Windows host**: a signed installer registers a `LocalSystem` service that streams
|
||||
your Windows desktop or games to any punktfunk or Moonlight client, at the client's exact resolution
|
||||
via a virtual display — including **HDR10** (10-bit BT.2020 PQ) when your Windows desktop is in HDR
|
||||
mode. It's newer and less battle-tested than the Linux host, and it is built specifically around
|
||||
NVIDIA hardware. (The Linux host is 8-bit only — HDR there is blocked upstream.)
|
||||
via a **virtual display** — including **HDR10** (10-bit BT.2020 PQ) when your Windows desktop is in HDR
|
||||
mode. punktfunk has its own **indirect display driver (IDD)** that the host pushes finished frames
|
||||
straight into, so you get a real on-the-fly virtual display with no physical monitor or dummy HDMI
|
||||
plug — even on the secure desktop (UAC / lock screen). The Windows host is newer and less
|
||||
battle-tested than the Linux host. (The Linux host is 8-bit only — HDR there is blocked upstream.)
|
||||
|
||||
> This page is about the Windows **host** (streaming *from* a Windows PC). To stream *to* a Windows
|
||||
> PC, see the [Windows client](/docs/clients#windows-desktop-client).
|
||||
|
||||
@@ -19,6 +19,8 @@ export function baseOptions(): BaseLayoutProps {
|
||||
{ text: 'API', url: '/api' },
|
||||
{ text: 'Website', url: 'https://punktfunk.unom.io' },
|
||||
{ text: 'Source code', url: 'https://git.unom.io/unom/punktfunk' },
|
||||
{ text: 'Discord', url: 'https://discord.gg/kaPNvzMuGU' },
|
||||
{ text: 'Reddit', url: 'https://www.reddit.com/r/Punktfunk/' },
|
||||
],
|
||||
}
|
||||
}
|
||||
|
||||
@@ -13,8 +13,8 @@ function Home() {
|
||||
<BrandMark className="size-20 drop-shadow-[0_8px_30px_rgba(108,91,243,0.45)]" />
|
||||
<Wordmark className="h-12 md:h-14" />
|
||||
<p className="max-w-xl text-fd-muted-foreground">
|
||||
Linux-first, low-latency desktop and game streaming — a shared Rust protocol
|
||||
core with native clients per platform.
|
||||
Low-latency desktop and game streaming with first-class Linux and Windows
|
||||
hosts — a shared Rust protocol core with native clients on every platform.
|
||||
</p>
|
||||
<Link
|
||||
to="/docs/$"
|
||||
|
||||
@@ -94,7 +94,7 @@ package_punktfunk-host() {
|
||||
install -Dm0644 "$R/scripts/host.env.example" "$pkgdir/usr/share/punktfunk/host.env.example"
|
||||
install -Dm0644 "$R/packaging/bazzite/host.env" "$pkgdir/usr/share/punktfunk/host.env.bazzite"
|
||||
install -Dm0644 "$R/packaging/kde/host.env" "$pkgdir/usr/share/punktfunk/host.env.kde"
|
||||
install -Dm0644 "$R/docs/api/openapi.json" "$pkgdir/usr/share/punktfunk/openapi.json"
|
||||
install -Dm0644 "$R/api/openapi.json" "$pkgdir/usr/share/punktfunk/openapi.json"
|
||||
install -Dm0644 "$R/LICENSE-MIT" "$pkgdir/usr/share/licenses/punktfunk-host/LICENSE-MIT"
|
||||
install -Dm0644 "$R/LICENSE-APACHE" "$pkgdir/usr/share/licenses/punktfunk-host/LICENSE-APACHE"
|
||||
install -Dm0644 "$R/README.md" "$pkgdir/usr/share/doc/punktfunk-host/README.md"
|
||||
|
||||
@@ -257,7 +257,7 @@ journalctl --user -u punktfunk-host -f
|
||||
|
||||
> ⚠️ **There is no firewall script or firewall doc in the repo.** The ports below are derived
|
||||
> directly from the code constants (`crates/punktfunk-host/src/gamestream/mod.rs`, `mgmt.rs`) and
|
||||
> the GameStream-host port-map (`docs/gamestream-host-plan.md`). Treat the `firewall-cmd` lines as recommended-but-verified,
> the GameStream-host port-map (`design/gamestream-host-plan.md`). Treat the `firewall-cmd` lines as recommended-but-verified,
> not a checked-in script.
**GameStream / Moonlight ports** (fixed; Moonlight derives them from the HTTP base). These only apply
@@ -57,7 +57,7 @@ install -Dm0644 scripts/headless/punktfunk-sink.conf "$SHAREDIR/headless/punkt
install -Dm0644 scripts/host.env.example "$SHAREDIR/host.env.example"
install -Dm0644 packaging/bazzite/host.env "$SHAREDIR/host.env.bazzite"
install -Dm0644 packaging/kde/host.env "$SHAREDIR/host.env.kde"
install -Dm0644 docs/api/openapi.json "$SHAREDIR/openapi.json"
install -Dm0644 api/openapi.json "$SHAREDIR/openapi.json"
install -Dm0644 LICENSE-MIT "$DOCDIR/LICENSE-MIT"
install -Dm0644 LICENSE-APACHE "$DOCDIR/LICENSE-APACHE"
install -Dm0644 README.md "$DOCDIR/README.md"
@@ -224,7 +224,7 @@ install -Dm0644 packaging/kde/host.env %{buildroot}%{_datadir}/%
# Bazzite KDE Desktop-mode one-shot setup (KWIN_WAYLAND_NO_PERMISSION_CHECKS + RemoteDesktop grant).
install -d %{buildroot}%{_datadir}/%{name}/bazzite
install -Dm0755 packaging/bazzite/kde-desktop-setup.sh %{buildroot}%{_datadir}/%{name}/bazzite/kde-desktop-setup.sh
install -Dm0644 docs/api/openapi.json %{buildroot}%{_datadir}/%{name}/openapi.json
install -Dm0644 api/openapi.json %{buildroot}%{_datadir}/%{name}/openapi.json
%if %{with web}
# --- web console subpackage (punktfunk-web) ---
@@ -246,7 +246,7 @@ install -Dm0644 web/web.env.example %{buildroot}%{_datadir}/punkt
%files
%license LICENSE-MIT LICENSE-APACHE
%doc README.md docs/implementation-plan.md packaging/README.md
%doc README.md design/implementation-plan.md packaging/README.md
%{_bindir}/punktfunk-host
%{_udevrulesdir}/60-punktfunk.rules
%{_prefix}/lib/sysctl.d/99-punktfunk-net.conf
@@ -76,10 +76,11 @@ read it from `%ProgramData%\punktfunk\web-password`.
| `reset-pf-vdisplay.ps1` | **Dev:** recover a wedged driver — stop host → reap ghost monitor nodes → reload the adapter → start host (no reboot). See *Dev iteration* below. |
| `redeploy-pf-vdisplay.ps1` | **Dev:** one-shot redeploy — (optional) build → stop host → `deploy-dev.ps1 -Install` → reload adapter → start host. |
| `nvenc/nvenc.def`, `nvenc/gen-nvenc-importlib.ps1` | Synthesise `nvencodeapi.lib` for the `--features nvenc` link (llvm-dlltool / lib.exe). |
| `pf-vkhdr-layer/` | **HDR Vulkan layer** (standalone `cdylib`): lets Vulkan games (Doom: The Dark Ages, etc.) enable HDR over the virtual display by advertising the HDR surface formats the NVIDIA/AMD ICDs hide on an indirect display. Built by the packer, laid into `{app}\vklayer`, registered under `HKLM64\…\Khronos\Vulkan\ImplicitLayers` (opt-out *Install the HDR Vulkan layer* task). Self-gated on the display's HDR state. See its README. |
> **Vendored driver:** pf-vdisplay is our **all-Rust IddCx** virtual display (UMDF2), built from
> `packaging/windows/drivers/`. It replaced the vendored SudoVDA C++ driver — full story in
> [`docs/windows-virtual-display-rust-port.md`](../../docs/windows-virtual-display-rust-port.md). The
> [`design/windows-virtual-display-rust-port.md`](../../design/windows-virtual-display-rust-port.md). The
> **signed** output (`pf_vdisplay.dll`/`.inf`/`.cat` + `punktfunk-driver.cer`; signer
> `punktfunk-ds-test` — the same cert the gamepad drivers ship, Class=Display, HWID `root\pf_vdisplay`)
> is checked in under `pf-vdisplay/`. To refresh it after a driver-source change, rebuild + re-sign with
Some files were not shown because too many files have changed in this diff