Files
punktfunk/docs/security-review.md
T
enricobuehler 3c55ec37fa
apple / swift (push) Successful in 56s
windows-host / package (push) Successful in 2m25s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m8s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
android / android (push) Successful in 4m42s
ci / rust (push) Successful in 4m44s
ci / web (push) Successful in 30s
ci / docs-site (push) Successful in 35s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 57s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m0s
deb / build-publish (push) Successful in 2m10s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m43s
flatpak / build-publish (push) Successful in 3m59s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m28s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m13s
fix(security): remaining audit findings — mgmt admin gate, RTSP DoS bounds, FEC drop, ALPN, ct-compare
Addresses the lower-severity findings from docs/security-review.md (#4-#12). Each fix was
adversarially re-reviewed (5-agent pass); two review catches folded in (the Apple client's
GET /library cert path; an RTSP header-cap bypass + a spawn-panic counter leak).

- #4 [low] mgmt mTLS-paired-cert no longer grants full admin. A paired STREAMING cert authorizes
  only a read-only allowlist (GET /host,/compositors,/status,/clients,/native/clients,/library);
  every state-changing route and every PIN-exposing route (/pair, /native/pair) requires the
  operator's bearer token. New cert_auth_is_a_read_only_allowlist test. (/library kept on the
  allowlist — the native clients browse it cert-only; its mutations stay token-only.)
- #6 [low] RTSP pre-auth DoS bounds: a concurrent-connection cap (RAII slot guard), a per-read
  timeout (slow-loris), and Content-Length/header/message size caps — closing an unauthenticated
  slow-loris / memory-growth / thread-exhaustion vector on TCP 48010.
- #11 [info] A FEC reconstruction failure is now a counted drop (discard the block, keep the
  session) instead of being stream-fatal — a lossy link can't be torn down by one bad block.
- #10 [info] Fixed ALPN ("pkf1") on both native QUIC endpoints (defense-in-depth; a deliberate
  coordinated client+host upgrade — a new host rejects an ALPN-less old client).
- #8 [info] Constant-time GameStream pairing phase-4 hash compare (crypto::ct_eq).
- #7 [low] New VirtualDisplay::set_launch_command carries the launch command per-session on the
  GameStream path (no process-global env stomp under concurrent sessions); native path keeps the
  env under today's single-session model (documented; plumb per-session with concurrent sessions).
- #5 [low] Legacy GameStream GCM nonce reuse: documented as inherent to Nvidia's old-style control
  encryption (Apollo/Moonlight identical; key is client-known) — unfixable on the legacy wire; the
  real fix is V2 control-encryption negotiation. Code comment at control.rs.
- #9 [info] GameStream plain-HTTP pairing: documented (inherent to GFE compat; use punktfunk/1).
- #12 [low] Web global NODE_TLS_REJECT_UNAUTHORIZED: fix designed (undici dispatcher scoped to the
  loopback mgmt fetch) but DEFERRED — needs `bun add undici` in the web build env; reverted to keep
  the web working. Latent-only (the loopback mgmt fetch is the console's only outbound TLS).

fmt + clippy -D warnings clean; 94 host + core tests green; no C-ABI/OpenAPI drift. (The HDR
Steps 1-2 client work in the tree is the user's parallel WIP — deliberately NOT included here.)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:50:24 +00:00

37 KiB
Raw Blame History

punktfunk — security audit (2026-06-21)

Whole-project audit by a 10-surface multi-agent review; every finding adversarially verified (reachability, attacker-control, existing mitigation). 10 surfaces · 20 raw findings → 18 confirmed/partial, 2 refuted. Threat model: a malicious network client (pre- and post-pairing) is the primary adversary; also an on-path MITM and a local unprivileged user (the host is privileged).

Remediation status (2026-06-21)

All 12 confirmed findings have been addressed — fixed, or documented where a fix isn't safely possible:

# Sev Status
#1 high FIXED (3526517) — secret files 0600 + dir 0700 / Windows icacls DACL
#2 high FIXED (3526517) — single-use SPAKE2 PIN (consumed at the host key-confirmation)
#3 med FIXED (3526517) — RTSP packetSize bounded + saturating packetizer math
#4 low FIXED — mgmt mTLS-cert auth restricted to a read-only allowlist; admin/state-changing routes require the bearer token
#5 low DOCUMENTED (won't-fix on legacy) — legacy GameStream GCM nonce reuse is inherent to Nvidia's old-style control encryption (Apollo/Moonlight identical); the GCM key is client-known. Real fix = V2 control-encryption negotiation; use punktfunk/1 for untrusted nets. Code comment at control.rs rumble loop.
#6 low FIXED — RTSP Content-Length/header caps + per-read timeout + concurrent-connection cap
#7 low FIXED (GameStream) / DOCUMENTED (native) — new VirtualDisplay::set_launch_command carries the launch command per-session (GameStream); native path keeps the env (safe under today's single-session model; plumb per-session with concurrent sessions)
#8 info FIXED — constant-time GameStream phase-4 hash compare (crypto::ct_eq)
#9 info DOCUMENTED — GameStream pairing over plain HTTP is inherent to GFE compat; steer untrusted networks to the SPAKE2 native plane
#10 info FIXED — fixed ALPN (pkf1) on both QUIC endpoints (coordinated client+host upgrade)
#11 info FIXED — FEC reconstruction failure is now a counted drop, not stream-fatal
#12 low DEFERRED (fix ready, reverted) — the scoped-dispatcher fix (undici Agent on proxyRequest's fetch option) is designed and the mechanism verified sound (h3 honors the fetch option), but it needs undici added as a web dependency (bun add undici + lockfile regen), which requires the web build env — not available here. Reverted to keep the web build/proxy working. Latent-only: the loopback mgmt fetch is the web console's ONLY outbound TLS, so the global env weakens nothing today. Apply with: cd web && bun add undici, then scope rejectUnauthorized:false to the mgmt fetch and drop the global env.

Executive summary

Overall the punktfunk host is a security-conscious codebase with a strong cryptographic and wire-parsing core: the FEC/reassembler path bounds every attacker-controlled length field before allocation, AES-GCM is used correctly with per-direction nonce separation and seq-as-AAD on the native plane, and the native trust model (SPAKE2 PIN binding both cert fingerprints, fingerprint pinning that still verifies the real TLS handshake signature) is genuinely sound. The most serious real defects are (1) local secret-disclosure of the host's master private key (key.pem) — written with no restrictive mode/ACL while the far-less-sensitive mgmt token is carefully 0600 — which on Windows (%ProgramData% default Users-read ACL, LocalSystem service) is a near-certain cross-privilege host-impersonation primitive, and (2) the native SPAKE2 PIN ceremony permitting unlimited online guesses against a static, non-rotating 4-digit PIN (no disarm-on-failure, no lockout), which contradicts the documented "one online guess" guarantee and lets a pre-auth LAN attacker brute-force pairing of a fully-trusted rogue client in a few hours against the default standalone/CLI flow. Dominant themes: file-permission hygiene on secrets is inconsistent (the secure pattern exists but is applied selectively), pairing throttling relies on a single global rate-limit rather than attempt-bounding, and authorization is overbroad (any streaming-paired cert is also a full mgmt admin). The remaining findings are a contained pre-auth RTSP video-thread DoS (unbounded packetSize and Content-Length), a legacy GameStream control-stream GCM nonce-reuse that is muted by modern V2 negotiation and being key-gated, and several defense-in-depth nits (non-constant-time GameStream hash compare, no QUIC ALPN, cross-session env-var launch confusion, global NODE_TLS_REJECT_UNAUTHORIZED). No memory-unsafety or RCE was found on attacker wire bytes; panics are safe Rust and isolated by panic=unwind. Net: a solid foundation whose highest-leverage fixes are tightening secret file permissions and making the PIN single-use/lockout-bounded.

Findings (ranked by severity × exploitability)

🟠 #1 [HIGH] Host master private key (key.pem) written with no restrictive file mode / ACL — local secret disclosure enabling full host impersonation

Surface: secrets-availability
Refs: crates/punktfunk-host/src/gamestream/cert.rs:36-44, crates/punktfunk-host/src/gamestream/mod.rs:216-232, crates/punktfunk-host/src/mgmt_token.rs:58-70, crates/punktfunk-host/src/service.rs:605-627, crates/punktfunk-host/src/native_pairing.rs:116-126

Why it ranks here / impact: Ranked #1 because it is the highest verdict-adjusted severity (high, three corroborating findings merged) and the most reliably exploitable post-foothold: key.pem is the single trust root for ALL surfaces — GameStream TLS server cert, GameStream pairing signing key, the punktfunk/1 QUIC identity every client pins, and the mgmt HTTPS cert — so its disclosure yields full host impersonation/MITM that defeats client fingerprint pinning, plus the mgmt bearer token is likewise unprotected on Windows. ServerIdentity::load_or_create writes it with a bare fs::write (no mode) and create_dir_all (no DACL). On Windows the leak is near-certain and umask-independent: config_dir() is %ProgramData%\punktfunk, whose default ACL grants BUILTIN\Users read, and the host runs as LocalSystem — any local unprivileged user reads the SYSTEM service's key; the mgmt-token 0o600 hardening is #[cfg(unix)] so it is a no-op there. On Linux the file lands at umask (commonly 0664/0644, verified live as world-readable) and is reachable cross-user whenever the home/config chain is traversable. The project demonstrably knows the secure pattern (mgmt_token.rs uses OpenOptions::mode(0o600)+set_permissions) but applies it to the less-sensitive token and not the master key. Local-only (adversary #3), not pre-auth/network, which caps it below critical.

Fix: Write key.pem (and cert.pem) via OpenOptions::mode(0o600) + a follow-up set_permissions(0o600) on Unix, mirroring mgmt_token.rs; create config_dir() with DirBuilder::mode(0o700). On Windows set an explicit DACL granting only SYSTEM+Administrators on the punktfunk %ProgramData% subtree and per-file on key.pem / mgmt-token / *paired.json (or relocate the key under a SYSTEM-only path), since the default ProgramData ACL is Users-readable. Extend the same hardening to client-key.pem and the persisted trust stores. Add a regression test asserting 0600 on key.pem on Unix.

🟠 #2 [HIGH] Native SPAKE2 PIN ceremony allows unlimited online guesses against a static 4-digit PIN — pre-auth brute-force to a fully-trusted rogue client

Surface: pairing-pin
Refs: crates/punktfunk-host/src/punktfunk1.rs:388-446, crates/punktfunk-host/src/punktfunk1.rs:475-491, crates/punktfunk-host/src/punktfunk1.rs:82, crates/punktfunk-host/src/native_pairing.rs:189-234, crates/punktfunk-host/src/native_pairing.rs:128-131, crates/punktfunk-host/src/mgmt.rs:841-842

Why it ranks here / impact: Ranked #2: high severity AND pre-auth + fully attacker-controlled, the strongest exploitability combination among the high-rated issues — gated only on pairing being armed and an hours-long active window. Merges the three pairing-pin brute-force findings (they share one root cause: no disarm/rotate-on-failure and no attempt budget). pair_ceremony logs a warning and returns Err on a wrong PIN but never calls np.disarm() or rotates the PIN; current_pin() returns the same value forever (cleared only by TTL or operator); the only throttle is one process-wide 2s PAIRING_COOLDOWN. The PIN space is 10,000. Critically the standalone punktfunk1-host default (--require-pairing forces allow_pairing) arms with expires_at:None at startup, so the indefinite static-PIN window is the DEFAULT for that binary, not an opt-in. At ~1 guess/2s the space exhausts in ~5.5h worst / ~2.8h avg, and on success the attacker's cert is permanently pinned, granting input injection, screen capture and app launch. This directly contradicts the documented 'one online guess, no offline dictionary' claim — the offline-dictionary resistance from SPAKE2 holds, but the online single-guess limit is simply not implemented. Mitigations partial: the web/mgmt arm path is TTL-bounded (15..600s), confining the worst case to the CLI/standalone mode.

Fix: Make a failed confirmation consume the PIN: on ok==false in pair_ceremony, call np.disarm() (or rotate to a fresh random PIN) so a single wrong guess closes the window — this is what actually delivers the documented 'one online guess'. Add a per-window failed-attempt budget (auto-disarm after N>=1 failures), give the CLI no-expiry arm path a default expiry, and disarm after a SUCCESSFUL pair too. Keep the 2s cooldown as defence-in-depth and raise the web-armed PIN to 6 digits.

🟡 #3 [MEDIUM] Pre-auth RTSP ANNOUNCE packetSize underflows/panics the GameStream video pipeline (div-by-zero / OOB slice / allocation amplification)

Surface: gamestream-parsing
Refs: crates/punktfunk-host/src/gamestream/rtsp.rs:275, crates/punktfunk-host/src/gamestream/video.rs:55-89, crates/punktfunk-host/src/gamestream/stream.rs:322

Why it ranks here / impact: Ranked #3: medium and fully pre-auth + attacker-controlled — the highest-exploitability of the medium-and-below tier. The RTSP listener on TCP 48010 performs no TLS/pairing/auth; an unauthenticated peer drives OPTIONS→ANNOUNCE→PLAY (+ a UDP ping to the video port) and the video thread starts on state.stream alone, no paired session required. x-nv-video[0].packetSize is read with no bound and flows into VideoPacketizer::new where payload_per_shard = packet_size - 16: packetSize==16 → pps==0 → div-by-zero panic; packetSize<16 → underflow → OOB slice panic; packetSize==17 → one byte/shard → per-frame datagram flood. Reliable remote pre-auth DoS of a privileged media service, made stickier because the panic unwinds before running.store(false) leaving the session wedged until restart. Calibrated medium (not higher) because it is a SAFE Rust panic (checked slice access, no memory corruption/UB) isolated to the punktfunk-video thread by panic=unwind — the host process and other listeners survive; not RCE.

Fix: Validate packet_size in stream_config() before building StreamConfig: reject packetSize below a sane floor (e.g. < 64) and clamp to a sane max (e.g. <= 2048). Additionally harden VideoPacketizer::new to use checked/saturating arithmetic and refuse construction (or fall back to a default) when packet_size < SHARD_HEADER-16 so the per-frame path never sees pps==0 or a wrapped payload_per_shard. Also store(false) on the unwind path so a panic doesn't wedge the session. Add a regression test over packetSize in {0,15,16,17}.

🔵 #4 [LOW] Any paired punktfunk/1 streaming client gets full management-API authority via the mTLS-paired-cert auth path (no streaming-vs-admin separation)

Surface: authz-trust
Refs: crates/punktfunk-host/src/mgmt.rs:459-488, crates/punktfunk-host/src/mgmt.rs:466-470

Why it ranks here / impact: Ranked #4: low but a genuine post-auth privilege over-broadening with concrete admin impact. require_auth grants any verified peer cert whose fingerprint is in the native paired store full unscoped access to every /api/v1 route — the SAME paired set that admits a device to stream. So a device paired purely to watch the screen can DELETE /clients/{fp} (unpair others), POST /native/pair/arm (open a pairing window and read the PIN), approve arbitrary knocking devices, DELETE /session, and CRUD the library; there is no role/scope check anywhere in the router. The native client presents its identity via TLS client auth on both ports, so the credential is genuinely usable against mgmt. Bounded to low because it requires being an already-paired (operator-trusted) device and the mgmt port binds loopback by default — remote reach needs an explicit routable --mgmt-bind (and the mTLS path then bypasses the token requirement).

Fix: Separate streaming trust from management trust: keep a distinct admin allow-list (or an admin flag on a paired entry) for the mTLS mgmt path, or restrict mTLS-cert auth to read-only endpoints and require the bearer token for state-changing/admin routes. At minimum gate the pairing-administration endpoints (arm/approve/unpair) and session/library mutation behind the bearer token only.

🔵 #5 [LOW] GameStream legacy control-stream AES-GCM nonce reuse across directions (host rumble vs client input share key+nonce)

Surface: crypto
Refs: crates/punktfunk-host/src/gamestream/control.rs:373-400, crates/punktfunk-host/src/gamestream/control.rs:257-266, crates/punktfunk-host/src/gamestream/control.rs:67,106-114

Why it ranks here / impact: Ranked #5: a real, correctly-identified catastrophic-class crypto defect (AES-GCM (key,nonce) reuse) but adjusted to low because reachability and impact are heavily muted. The legacy NonceKind branches apply no direction separation (other => other), so host rumble (rumble_seq from 0) and client control (seq from 0) under the shared rikey produce identical (key,nonce). BUT: (1) it only triggers on the legacy auto-detected scheme — modern moonlight-common-c negotiates the V2 scheme which flips marker[0] to 'H' and is direction-separated, so the default path is safe; the doc claim 'the legacy path — which we hit' is stale; (2) the rikey is delivered only over the mTLS /launch, so a pure MITM cannot derive the key — only a paired client can; (3) a paired client can already legitimately send any client→host control message (in-scope-by-design), so forgery is largely redundant and the only genuinely new gain is recovering low-value rumble keystream / forging rumble to its own client. Post-auth, conditional path.

Fix: Separate the two directions' nonce spaces for the legacy schemes too — set a reserved high bit/byte of the legacy IV for host-originated packets (mirror the V2 'H' marker), or better, HKDF-derive an independent host→client key from the rikey with a direction label so host and client never share a GCM key. Never let host rumble and client input share (key,nonce).

🔵 #6 [LOW] RTSP request Content-Length / header size unbounded with no read timeout or connection cap — pre-auth slow-loris / memory-growth DoS

Surface: gamestream-parsing
Refs: crates/punktfunk-host/src/gamestream/rtsp.rs:82-106, crates/punktfunk-host/src/gamestream/rtsp.rs:24-48

Why it ranks here / impact: Ranked #6: low, pre-auth and attacker-controlled but a rate-limited resource DoS, not unsafety or auth bypass. read_message parses content-length and computes total = end+4+content_len with no cap, looping buf.extend_from_slice until buf.len()>=total; the header scan is likewise unbounded and there is no body/header cap, no read/write timeout, and one unbounded native thread is spawned per connection with no global limit. Growth is bounded by attacker send rate (no pre-allocation), so it is slow exhaustion rather than instant OOM; the stronger lever is thread/FD exhaustion from many idle slow-loris connections at near-zero bandwidth. On a privileged LAN-facing plaintext listener with zero defensive caps.

Fix: Cap Content-Length and total header size to small constants (e.g. reject content_len > 64 KiB, total header > 16 KiB) and close on violation. Add a read timeout so a slow-loris connection cannot pin a thread indefinitely, and bound concurrent RTSP connections.

🔵 #7 [LOW] Per-session launch command carried via process-global PUNKTFUNK_GAMESCOPE_APP env var, stomped under concurrent native sessions (cross-session launch confusion)

Surface: privilege-process-launch
Refs: crates/punktfunk-host/src/punktfunk1.rs:560-571, crates/punktfunk-host/src/punktfunk1.rs:140, crates/punktfunk-host/src/vdisplay/gamescope.rs:629-647

Why it ranks here / impact: Ranked #7: low, post-auth cross-session isolation bug, explicitly NOT command injection. serve_session does std::env::set_var(PUNKTFUNK_GAMESCOPE_APP) per accepted connection with a stale comment claiming 'one session at a time', but DEFAULT_MAX_CONCURRENT=4 sessions run concurrently and the var is read in gamescope::spawn during VirtualDisplay::create — a genuine TOCTOU where client B's launch overwrites what client A's bare-spawn reads, and the never-cleared value leaks into a later no-launch client. Impact is capped because cmd always resolves through library::launch_command (digit-validated Steam appids / operator-only custom store), so the worst case is launching a DIFFERENT operator-approved title or a stale title — and it only affects the gamescope bare-spawn backend (kwin/mutter/wlroots/attach ignore the var).

Fix: Stop carrying the per-session launch command in a process-global env var. Plumb the resolved command through the VirtualDisplay::create call / per-session context (e.g. a field on Mode or a per-session GamescopeDisplay), and on the bare-spawn path pass it explicitly to spawn(); clear/scope it so a stale value never leaks to the next client.

#8 [INFO] GameStream pairing phase-4 hash compare is not constant-time

Surface: pairing-pin
Refs: crates/punktfunk-host/src/gamestream/pairing.rs:226-247

Why it ranks here / impact: Ranked #8: info / hardening only — a real variable-time == on attacker-influenced 32-byte SHA-256 digests, but not weaponizable. The compared expected mixes in host-random server_challenge that is never disclosed (so the attacker can neither compute nor aim at the target), the attacker cannot steer client_hash to a chosen value without the PIN key, and any mismatch removes the session (map.remove) forcing a fresh ceremony with new randomness — so there is no stable secret to recover prefix-by-prefix and no path from timing to PIN recovery or match forgery. Worth fixing for consistency since the codebase already has ct_eq for the native ceremony.

Fix: Use a constant-time comparator (subtle::ConstantTimeEq or the project's existing ct_eq) for hash_ok, matching the constant-time discipline already used in the native SPAKE2 ceremony.

#9 [INFO] GameStream pairing ceremony runs over plain HTTP — inherited GFE brute-forceable-PIN / MITM weakness

Surface: authz-trust
Refs: crates/punktfunk-host/src/gamestream/nvhttp.rs:33, crates/punktfunk-host/src/gamestream/nvhttp.rs:215-264, crates/punktfunk-host/src/gamestream/pairing.rs:102-247

Why it ranks here / impact: Ranked #9: info — real but intentional Moonlight-compat behavior, on record rather than a regression. The whole /pair flow (incl. phase-4 cert pinning) is on plain HTTP 47989 with no transport confidentiality and no rate-limiting; the AES key is pin_key(salt,pin) = SHA-256(salt||pin)[..16] feeding AES-128-ECB, so an on-path attacker observing a legitimate pairing can offline-brute-force the 4-digit PIN and forge a clientpairingsecret to get a cert pinned. This is the well-known GFE/Sunshine construction, fixed by interop, and is precisely why punktfunk/1's SPAKE2 path exists; it requires an active MITM during an operator-initiated pairing within the 300s window. A paired GameStream client is in-scope-by-design.

Fix: Inherent to GameStream compatibility — document it and steer users to punktfunk/1 (SPAKE2) for untrusted networks. Optionally rate-limit pairing sessions per uniqueid/IP and tighten/expire the awaiting-PIN window aggressively.

#10 [INFO] No ALPN configured on the native QUIC server/client (cross-protocol confusion hardening absent)

Surface: cert-tls-identity
Refs: crates/punktfunk-core/src/quic.rs:1335-1354, crates/punktfunk-core/src/quic.rs:1412-1448

Why it ranks here / impact: Ranked #10: info — factually correct (no alpn_protocols set on either endpoint; the cert.pem identity is shared with GameStream TLS) but no reachable confusion attack. ALPACA-style attacks need two TLS services sharing a cert on the SAME transport; here GameStream is TLS-over-TCP and punktfunk/1 is TLS-in-QUIC (UDP) — not cross-reachable — and there is exactly one QUIC server so ALPN would make no authorization decision. Trust is already enforced by fingerprint pinning + app-layer Hello/Welcome magic. Cheap future-proofing only.

Fix: Set a fixed ALPN on both endpoints (e.g. rustls_cfg.alpn_protocols = vec![b"pkf1".to_vec()]) so a mismatched protocol is rejected during the TLS handshake — defense-in-depth against ever multiplexing protocols on the QUIC endpoint.

#11 [INFO] FEC reconstruct error on the receive path is stream-fatal — code-contract inconsistency (not an exploitable DoS)

Surface: core-wire-deser
Refs: crates/punktfunk-core/src/packet.rs:411, crates/punktfunk-core/src/session.rs:283-289, clients/probe/src/main.rs:959, crates/punktfunk-host/src/spike.rs:251

Why it ranks here / impact: Ranked last: info — a correctly-identified contract inconsistency with NO demonstrable exploit. Reassembler::push propagates coder.reconstruct(...)? and both real receive-side callers treat any non-NoFrame error as fatal, inconsistent with the surrounding 'malformed = silent drop, never fatal' discipline. But every Err arm was traced unreachable from hostile input: header firewall + block-geometry pinning guarantee equal-length, correctly-counted shards; reconstruct is only called once received>=data_shards; Config::validate rejects odd/zero shard_payload before any decode; and MDS Reed-Solomon decodes any data_shards distinct shards. Reaching the reassembler also requires an AES-GCM-decryptable packet, so it is the connected host (not a port-sprayer), and it is client-side only — the privileged host never runs the reassembler on attacker bytes. Pure defense-in-depth hardening.

Fix: Make a FEC reconstruction failure a counted drop rather than stream-fatal: in Reassembler::push match coder.reconstruct(...) and on Err bump packets_dropped (or a fec_failed counter), discard the block, and return Ok(None). Reserve poll_frame's Err for genuinely fatal conditions (role misuse, transport teardown), matching the discipline documented at packet.rs:298-300.

🔵 #12 [LOW] Web console sets NODE_TLS_REJECT_UNAUTHORIZED=0 process-globally — latent footgun disabling all outbound TLS verification

Surface: deps-config-exposure
Refs: web/.env.example:22-24, web/web.env.example:11-14, web/server/util/auth.ts:17-22, web/vite.config.ts:23

Why it ranks here / impact: Ranked #12: low and not currently exploitable (attackerControlled false), included as a latent defense-in-depth defect. NODE_TLS_REJECT_UNAUTHORIZED=0 disables certificate validation for every outbound TLS connection the Node process makes, but the only current server-side outbound hop is the loopback proxy to https://127.0.0.1:47990 (CDN/art fetches are browser-side), and a loopback connection cannot be MITM'd — so impact is nil today. Real impact materializes silently if anyone later adds a server-side off-host HTTPS call (update check, webhook, metadata fetch) or points PUNKTFUNK_MGMT_URL off-loopback.

Fix: Do not disable TLS verification globally. Pin the host's self-signed cert for the single loopback fetch: pass an https.Agent with the host cert as ca (or rejectUnauthorized:false on that one Agent only) to the proxyRequest fetch in server/routes/api/[...].ts, and drop NODE_TLS_REJECT_UNAUTHORIZED from the deployment env.

Cross-cutting themes

  • Inconsistent secret file-permission hygiene: the secure 0600/ACL pattern exists (mgmt-token) but is applied selectively, leaving the master private key and trust stores at umask/default-ACL — the highest-impact local-privilege gap, acute on Windows %ProgramData% + LocalSystem.
  • Pairing throttling is rate-based, not attempt-bounded: a single global 2s cooldown and a static non-rotating 4-digit PIN with no disarm-on-failure/lockout means the documented 'one online guess' property is not actually implemented for the native ceremony.
  • Overbroad authorization / collapsed trust tiers: 'paired to stream' equals 'paired to administer' (mgmt mTLS), and GameStream pairing inherits the plaintext-HTTP brute-forceable-PIN model — coarse trust boundaries where finer scopes are warranted.
  • Pre-auth attack surface on the GameStream/RTSP listeners with missing input bounds and resource caps (unbounded packetSize, unbounded Content-Length, no timeouts/connection caps) — contained DoS, but on a privileged plaintext service.
  • Stale concurrency assumptions and process-global mutable state (legacy GCM nonce direction, PUNKTFUNK_GAMESCOPE_APP env var) that were safe under a since-removed 'one session at a time' invariant and now cause cross-session confusion / crypto reuse.
  • Strong, well-tested cryptographic and memory-safety core (bounded wire parsing, correct AEAD/SPAKE2/pinning, catch_unwind FFI, panic=unwind isolation) — the foundation is solid; the residual risk is in operational hardening and trust-tier granularity, not in unsafe/RCE.

Prioritized remediation (do in this order)

  1. Lock down secret files: write key.pem (and cert.pem) 0600 + create config_dir 0700 on Unix using the existing mgmt_token OpenOptions::mode pattern, and set an explicit SYSTEM+Administrators-only DACL on the punktfunk %ProgramData% subtree / key.pem / mgmt-token / *paired.json on Windows. Extend to client-key.pem; add a 0600 regression test.
  2. Make the native PIN single-use and lockout-bounded: disarm or rotate the PIN on a failed SPAKE2 confirmation, add a per-window failed-attempt budget, give the CLI no-expiry arm path a default expiry, and disarm after a successful pair — this is what delivers the documented 'one online guess'.
  3. Bound the RTSP video path: validate/clamp x-nv-video[0].packetSize (floor ~64, cap ~2048) in stream_config() and use checked/saturating arithmetic in VideoPacketizer::new so pps==0 / underflow can never occur; store(false) on the unwind path; add a {0,15,16,17} regression test.
  4. Cap RTSP request parsing: enforce a Content-Length and total-header-size limit, add a read timeout, and bound concurrent connections so a pre-auth peer cannot slow-loris exhaust threads/memory.
  5. Separate streaming trust from management trust: require the mgmt bearer token (not just a paired streaming cert) for state-changing and pairing-administration routes (arm/approve/unpair/session/library), or keep a distinct admin allow-list.
  6. Fix the legacy GameStream GCM nonce reuse: HKDF-derive an independent host→client key from the rikey (direction label), or mirror the V2 'H' direction marker into the legacy IV so host rumble and client input never share (key,nonce).
  7. Stop carrying the per-session gamescope launch command in a process-global env var: plumb it through the per-session VirtualDisplay::create/context and clear it when no launch is requested, eliminating cross-session stomping under concurrency.
  8. Apply the cheap hardening nits: constant-time compare for the GameStream phase-4 hash (use ct_eq), set a fixed ALPN ('pkf1') on both QUIC endpoints, make FEC reconstruct failures a counted drop instead of stream-fatal, and replace the global NODE_TLS_REJECT_UNAUTHORIZED with a cert-pinned https.Agent scoped to the loopback mgmt fetch.

Security controls done right (positives)

  • Defense-in-depth wire parsing: every attacker-controllable FEC/reassembler header field is bounded against negotiated limits BEFORE any allocation keyed on it (packet.rs:328-343) — shard_bytes exact-match, data/total/block counts in range, indices in bounds, frame_bytes<=max — with no integer overflow in the size math and regression tests (rejects_oversized_shard_counts, rejects_inconsistent_block_geometry_without_panicking).
  • Reassembler memory is bounded to a 16-frame reorder window with prune-on-push and completed-frame dedup (packet.rs:451-468), so a flood of distinct frame indices cannot grow memory unboundedly and late shards cannot resurrect emitted frames.
  • AEAD gates the reassembler: on an encrypted session open_from_wire verifies the GCM tag (with seq as AAD) before any bytes reach push (session.rs:120-131), so an attacker cannot reach the reassembler without the session key; oversized-datagram truncation is always detectable (recv buffers MAX+1, len>MAX dropped).
  • Native AES-GCM is correct and misuse-resistant: 96-bit nonce = 4-byte salt || 8-byte BE seq with seq also as AEAD AAD (tampering fails the tag, not shifts the nonce), per-direction salt-bit separation gives disjoint nonce spaces under the shared key, a fresh CSPRNG 128-bit key per session, and Config::validate rejects encrypt=true with an all-zero key (crypto.rs, session.rs).
  • Host-side data-plane datagram decoders (mic / RichInput / HidOutput / InputEvent / gamepad) are all length-checked, Option-returning and non-fatal — the privileged host drops anything malformed and keeps draining, never reassembles attacker video, and never panics on truncated/hostile input.
  • punktfunk/1 trust establishment is sound: PinVerify rejects a fingerprint mismatch AND still performs real TLS 1.2/1.3 CertificateVerify signature checks (not stubbed), so an active MITM cannot replay the host's public cert to satisfy a pin without the private key (quic.rs:1547-1608) — the single most important thing to get right, done correctly.
  • SPAKE2 PIN pairing for the native plane is built correctly: a balanced PAKE binding BOTH cert fingerprints as identities and into the key-confirmation transcript, a wrong PIN yields a different key (one online guess, no offline dictionary, no error oracle), MITM with different certs per leg reaches no shared key, and confirmation MACs use a constant-time ct_eq — all exercised by tests.
  • Authorization is cleanly split from authentication on both planes: AcceptAnyClientCert verifiers accept any self-signed cert at the handshake but still verify the handshake signature, so the post-handshake fingerprint proves key possession, and authorization is then enforced against the paired allow-list (--require-pairing default fail-closed; certless peers rejected).
  • GameStream post-pair endpoints (applist/launch/resume/cancel) are gated by peer_is_paired() requiring a pinned mutual-TLS client cert, fail-closed for certless/unknown/None peers, with a dedicated regression test (nvhttp.rs:46-55, 303-328).
  • No command injection on the launch surface: client-supplied launch ids resolve against the host's OWN catalog (client can only pick an existing title), Steam appids are digits-only validated (with a 570; rm -rf ~ rejection test), custom commands come only from the operator mgmt store, and gamescope::spawn uses argv (Command::new + args), never /bin/sh -c.
  • Client-controlled display dimensions are validated (encode::validate_dimensions: zero/odd/over-max rejected) on both the initial Hello and mid-stream Reconfigure before reaching encoders or the privileged SudoVDA ADD ioctl, which marshals a fixed #[repr(C)] struct and only selects driver-advertised modes.
  • mgmt API is authenticated on every route except /health even on loopback (fails closed on a blank token), with constant-time SHA-256-digest token comparison and a CSPRNG token persisted 0600 on Unix (O_CREAT mode + follow-up set_permissions, never briefly world-readable).
  • Attacker-controlled device names are sanitized before logging/storage/UI (C0/C1 controls, Unicode bidi/format overrides, BOM stripped, length-capped, fingerprint fallback), blocking log/console-injection and trusted-device spoofing in the approval UI.
  • The pending-knock / delegated-approval queue is bounded (PENDING_CAP=32, LRU eviction) and time-bounded (10-min TTL), in-memory only, so a LAN scanner cannot grow it unboundedly or poison the persistent trust store.
  • Both trust stores are persisted atomically (temp + rename) with in-memory rollback on a failed persist, so a crash or full disk mid-write cannot truncate the allow-list and silently lock out or un-gate paired clients.
  • C-ABI boundary is hardened: config_from_ptr enforces the struct_size skew guard before dereferencing, every narrowing field is range-checked before truncation, every data-processing entry point is wrapped in catch_unwind returning Panic (no unwind across FFI), and null/zero-length handling is consistent and safe.
  • No secret material is logged anywhere — PINs, GCM/rikey keys, nonces/salts, the mgmt token, and private keys never reach tracing/println; pairing-failure logs include only the sanitized device name + fingerprint.
  • Crypto stack is current and free of disclosed-vuln versions (rustls 0.23.40, quinn 0.11.9, ring 0.17.14, aes-gcm 0.10.3, spake2 0.4.0), ring-only with no aws-lc C dep; rsa 0.9.10 (RUSTSEC-2023-0071 Marvin) is used ONLY for sign/verify, never decryption, so the vulnerable path is not exercised.
  • The Windows service launches the host with a correctly-scoped duplicated SYSTEM token (only the session id retargeted), a fixed winsta0\default desktop, a command line built from current_exe + an operator-controlled host.env subcommand (never network input), and a kill-on-job-close job object so a crash never orphans the SYSTEM host.

Refuted (investigated, NOT vulnerabilities)

  • [unsafe-ffi-cabi] Free/close FFI entry points run Drop without catch_unwind — a panic in teardown is UB across the C boundary — Verified all cited code. abi.rs:272-276 (punktfunk_session_free) and abi.rs:1627-1631 (punktfunk_connection_close) are verbatim as described — both call drop(Box::from_raw(..)) outside guard()/catch_unwind, while the guard helper (abi.rs:168-170) wraps catch_unwind. The doc at abi.rs:11 ("every entry point is wrapped in catch_unwind") is literally inaccurate for these two — that doc-vs-code discrepancy is REAL.

But the finding's security claim — "the unwind would cross the extern "C" frame, which is undefined behavior" — is REFUTED by the build toolchain. This crate is edition 2021 built with rustc 1.96.0. Since Rust 1.81 (Sept 2024), an unwind that reaches a non--unwind extern "C" boundary is a defined, safe process abort, not UB. None of these functions use extern "C-unwind". So the worst possible outcome is a clean abort, which is also exactly what catch_unwind→PunktfunkStatus::Panic would avoid only by returning a status — but free/close return void, so there is no status to return anyway.

Moreover the precondition (a panicking Drop) does not exist in the code. NativeClient::drop discards the worker join result with let _ = w.join(), so a worker-thread panic/poison cannot re-enter Drop. Config::drop only zeroizes. Session has no custom Drop. The transport Drop closes a socket. There is no .unwrap(), no Mutex::lock, and no result-propagating thread join on any teardown path — the finding's speculated panic sources are not present.

The finding is correctly self-rated as informational and explicitly non-attacker-controlled / non-pre-auth. The accurate residual is a documentation inconsistency (the module doc overstates the catch_unwind invariant), not a security weakness. The substantive recommendation (keep Drop impls panic-free) is already satisfied. Net: refuted as a vulnerability; severity info, and even as a code-quality nit the UB framing is incorrect for the current compiler. Worth at most a one-line doc fix to say free/close intentionally abort-on-panic.

  • [unsafe-ffi-cabi] C-ABI pointer/length contracts (32-byte fingerprint buffers, caller buffers) are trusted, not validated — standard FFI, embedder-only — All three cited spans are present and behave exactly as described. abi.rs:877 from_raw_parts(pin_sha256, 32) (after null-check at 873), abi.rs:905 from_raw_parts_mut(observed_sha256_out, 32) (after null-check at 903), abi.rs:1003 from_raw_parts_mut(host_sha256_out, 32) (after null-check at 990). The submit_frame/send_mic claim is also accurate: host_submit_frame (283-302, with an extra null+len guard) and send_mic (1233-1253) build slices from caller (data,len). These are C-ABI entry points whose pointer/length arguments come from the trusted embedding app (PunktfunkKit/GTK/WinUI), not from wire bytes. None of the threat-model adversaries can influence them: the malicious network client (pre- or post-auth) controls protocol bytes that flow through the embedder, not the embedder's own FFI call arguments; a MITM is irrelevant; a local unprivileged user cannot call into another process's loaded library. Each function carries a documented Safety contract and null-checks its pointers, and the hard-coded 32 matches the fingerprint type, so even a buggy-but-honest caller passing a correctly-sized buffer is safe. This is idiomatic, sound FFI — not a vulnerability. The finding's own posture (info, attackerControlled=false, preAuth=false, no fix required, listed for the record) is correct and well-calibrated. In vuln terms this is refuted (not a vuln / not reachable by any in-scope adversary), consistent with the info classification.