Commit Graph

33 Commits

Author SHA1 Message Date
enricobuehler 8672026e97 fix(host): clear clippy doc_lazy_continuation in the 4:4:4 docs
apple / swift (push) Failing after 7s
apple / screenshots (push) Has been skipped
android / android (push) Successful in 3m17s
ci / rust (push) Successful in 1m17s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 58s
windows-host / package (push) Successful in 7m27s
deb / build-publish (push) Successful in 2m54s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m39s
docker / deploy-docs (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
A line-wrap put `+`/`*`-style markers at the start of two doc lines, which
clippy (Windows host job, rust 1.96) reads as markdown list items whose
unindented follow-on lines trip `doc_lazy_continuation` under `-D warnings`:

  - encode/windows/nvenc.rs `chroma_444` field doc (the failing Windows-host
    clippy job): "+ chromaFormatIDC = 3" → "and chromaFormatIDC = 3".
  - encode/linux/vaapi.rs `probe_can_encode_444` doc: "+ validate" → "and
    validate" (last line, didn't fire yet, but fragile — fixed pre-emptively).

Pure doc rewording, no behaviour change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:38:07 +00:00
enricobuehler 75627c8afe feat(audio): end-to-end 5.1/7.1 surround across the native path + all clients
apple / swift (push) Failing after 10s
release / apple (push) Failing after 7s
apple / screenshots (push) Has been skipped
audit / cargo-audit (push) Failing after 1m19s
windows-host / package (push) Failing after 2m44s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Failing after 39s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Failing after 39s
windows / build (aarch64-pc-windows-msvc) (push) Failing after 45s
android / android (push) Successful in 5m17s
windows / build (x86_64-pc-windows-msvc) (push) Failing after 45s
ci / web (push) Successful in 57s
ci / docs-site (push) Successful in 56s
ci / rust (push) Successful in 9m19s
ci / bench (push) Successful in 4m40s
decky / build-publish (push) Successful in 26s
deb / build-publish (push) Successful in 2m57s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 33s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m56s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m35s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m20s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 53s
flatpak / build-publish (push) Successful in 4m22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m50s
Adds negotiated 5.1/7.1 surround to the punktfunk/1 protocol and every client
(previously stereo-only):

- core: new shared `audio` layout table (LAYOUT_51/71 + identity multistream
  mapping, canonical wire order FL FR FC LFE RL RR SL SR); Hello/Welcome
  `audio_channels` negotiation via the trailing-byte back-compat pattern (old
  peers fall back to stereo); C-ABI `punktfunk_connect_ex6`,
  `punktfunk_connection_audio_channels`, and in-core multistream decode
  `punktfunk_connection_next_audio_pcm` for embedders without a multistream
  Opus decoder. Real-libopus channel-identity round-trip test.
- host: native audio thread captures + Opus-(multi)stream-encodes at the
  negotiated count (with a cross-session cached-capturer channel-mismatch fix);
  GameStream surround unified onto the safe `opus::MSEncoder`, dropping
  `audiopus_sys` (~4 unsafe blocks) and un-gating Windows GameStream surround;
  WASAPI loopback capture relaxed to 2/6/8 with the correct dwChannelMask.
- clients: Linux (PipeWire), Windows (WASAPI), Android (AAudio) decode via
  `opus::MSDecoder` + render multichannel; Apple decodes in-core to PCM →
  AVAudioEngine with an explicit wire-order channel layout; each gains a
  Stereo/5.1/7.1 setting. `punktfunk-probe --audio-channels N` is the headless
  validator.

Verified on Linux: core/host/linux/probe test suites + the Android Rust
(cargo-ndk) build, clippy -D warnings, and rustfmt all green. Windows/Apple
builds, all on-glass checks, and the live native loopback are pending (CI / a
free box).

Also lands the concurrent in-tree HEVC 4:4:4 host work (PUNKTFUNK_444): it
shares the same touched files (quic.rs, punktfunk1.rs, encode/*, ...) and so
cannot be committed separately from the surround changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:11:05 +00:00
enricobuehler 7aa787a789 docs(host): prove the last 3 files + crate-root deny (unsafe-proof program 4/N, final)
Completes the unsafe-proof program now that the parallel WIP has landed:

- idd_push.rs (25 sites), nvenc.rs (7), punktfunk1.rs (21): a SAFETY proof on
  every unsafe block — D3D11/DXGI COM (same-device textures, immediate-context
  single-thread, keyed-mutex-held convert), the NVENC SDK table (versioned POD,
  register/map/lock-bitstream pairing), cross-process shm reads (atomic
  magic/generation handshake), and the C-ABI harness (each call cross-checked
  against its abi.rs `# Safety` doc). No SUSPECT (UB) blocks.
- capture.rs / encode.rs: the parent-module deny is restored (their WIP children
  are now proven), and main.rs gains a crate-root
  #![deny(clippy::undocumented_unsafe_blocks)] — the permanent catch-all gate so
  no future unsafe block anywhere in the crate can land without a proof.
- Fixed 4 blocks the agents missed: unsafe blocks nested inside `assert_eq!(...)`
  macro args (the comment-above-statement didn't associate) — hoisted to a `let`.
- rustfmt-canonicalized the Windows files (the agents' SAFETY comments + some
  pre-existing 1.9.0 drift) so `cargo fmt --all --check` is clean.

Verified: cargo clippy -p punktfunk-host --all-targets -- -D warnings AND
cargo fmt -p punktfunk-host --check both green with the crate-root deny active.
Windows cfg(windows) re-verified on the box next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 09:57:00 +00:00
enricobuehler 3514702d8c feat(windows-host): IDD-push encodes native NV12/P010 (skip NVENC's SM-side CSC)
GPU-contention work (host-latency plan §5.A): the IDD-push output ring now hands
NVENC native YUV instead of RGB, so NVENC skips its internal RGB→YUV colour
conversion on the SM/3D engine the running game saturates.

- idd_push.rs: out_ring is now NV12 (SDR, BT.709 limited) via a D3D11 VIDEO-engine
  BGRA→NV12 VideoConverter (keeps the CSC off the contended 3D/compute engine), or
  P010 (HDR, BT.2020 PQ limited) via the FP16→P010 shader (NVIDIA's VideoProcessor
  can't do RGB→P010). The ring drops its per-slot RTV (textures only), matching the
  WGC YUV ring; converters rebuild on a size/HDR flip.
- nvenc.rs: NV12 input forces bit_depth=8 so an HDR→SDR toggle (or a 10-bit-
  negotiated client on an SDR display) re-inits the session at the matching depth —
  NV12 can't feed a 10-bit session (register_resource rejects it).
- punktfunk1.rs: per-stage latency instrumentation under PUNKTFUNK_PERF
  (cap=try_latest, submit=encode_picture, wait=lock_bitstream µs p50/p99/max) to
  pinpoint where capture→encoded latency goes under GPU saturation.
2026-06-26 09:35:23 +00:00
enricobuehler 327a5fa828 docs(host): prove unsafe blocks in the Windows + cross-platform files + gate them (unsafe-proof program 3/N)
Continues the unsafe-proof program across the Windows/cross-platform host files
(~75 blocks, 21 files), each with a SAFETY proof of the real invariant and a
per-file #![deny(clippy::undocumented_unsafe_blocks)] gate:

  capture/windows: dxgi.rs, wgc_relay.rs, wgc.rs, desktop_watch.rs, composed_flip.rs
                   (windows-rs COM: interface validity, same-D3D11-device textures,
                    immediate-context single-thread, borrowed args outlive the call)
  windows: service.rs (SCM/token/CreateProcessAsUserW/event handles — OwnedHandle
           liveness, no double-close/signal race), win_display, wgc_helper, interactive
  vdisplay/windows: manager.rs, pf_vdisplay.rs (SwDeviceCreate/IddCx/ioctl handle
                    liveness via the OnceLock VDM singleton + OwnedHandle)
  encode/windows: ffmpeg_win.rs (full AVBufferRef refcount audit — balanced, NO leaks,
                  unlike the vaapi sibling), sw.rs
  cross-platform: gamestream/audio.rs (libopus), gamestream/stream.rs (sendmmsg),
                  inject/windows/sendinput.rs, audio/windows/wasapi_mic.rs,
                  session_tuning.rs, vdisplay.rs

Two findings (handled separately):
- wgc_relay.rs `unsafe impl Sync for HelperRelay` is UNSOUND (its mpsc Receiver is
  !Sync) though not live-exploited — marked SUSPECT inline; fix pending box check
  (it touches the in-flight punktfunk1.rs).
- capture.rs / encode.rs (PARENT modules of the WIP idd_push.rs / nvenc.rs) do NOT
  get the file deny yet — it would propagate the lint into the undocumented WIP
  children. The deny lands there once those are documented (after the WIP commits).

Linux-visible parts verified green (cargo clippy -p punktfunk-host --all-targets
-- -D warnings). The cfg(windows) deny gates are box-verified next.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 09:23:25 +00:00
enricobuehler 9777ed7fb3 fix(host/vaapi): plug two AVBufferRef leaks in DmabufInner::open
Surfaced while writing the unsafe-soundness proofs (2/N): both are refcount
leaks (sound — never dangling/double-free — so the SAFETY proofs held, but real
bugs on the persistent punktfunk1-host listener that opens a fresh encoder per
session).

1. Per-session leak: `par->hw_frames_ctx = av_buffer_ref(drm_frames)` created a
   second owned ref. `av_buffersrc_parameters_set` takes its OWN ref of
   `par->hw_frames_ctx`, and `av_free(par)` frees only the struct, not the ref —
   so the extra ref leaked every session, pinning the DRM frames ctx + device.
   Fix: assign `drm_frames` borrowed (the standard ffmpeg pattern); our single
   owned ref lives in DmabufInner and is unref'd in Drop.

2. Error-path leak: the final `open_vaapi_encoder(...)?` returned without the
   unref ladder every other error path runs, leaking graph/drm_frames/
   vaapi_device/drm_device on encoder-open failure. Fix: match + clean up before
   returning (nv12_ctx is borrowed from the sink → freed by graph teardown).

cargo clippy -p punktfunk-host --all-targets -- -D warnings clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 09:02:54 +00:00
enricobuehler ba68a98873 docs(host): prove every unsafe block in the Linux FFI files + gate them (unsafe-proof program 2/N)
Continues the structural unsafe-proof program (every unsafe carries a documented
proof of soundness; the file gains #![deny(clippy::undocumented_unsafe_blocks)]
so it stays proven). This batch covers all 10 remaining pure-Linux files
(104 blocks), each proof stating the REAL invariant — not boilerplate:

  zerocopy/cuda.rs (26)   leaked process-lifetime libcuda fn-ptr table; opaque
                          CUcontext never dereferenced; free-exactly-once via the
                          Arc<Mutex<PoolInner>> ownership graph; dmabuf fd take/close split
  zerocopy/egl.rs (18)    eglGetProcAddress'd procs with the GL context current;
                          EGLImage liveness; the two-call modifier-query bounds
  zerocopy/vulkan.rs (4)  copy-bounds arithmetic (src_size>=span); Send = thread
                          confinement to the punktfunk-pipewire thread
  dmabuf_fence.rs (4)     poll/ioctl/close fd liveness + ownership
  capture/linux/mod.rs (16)  spa_data repr(transparent) cast; null-checked spa
                          derefs; single-loop-thread buffer ownership until requeue
  inject/linux/gamepad.rs (10)  uinput ioctl request-number ↔ struct-size match
                          (static-asserted); InputEventRaw no-padding for the byte cast
  encode/linux/vaapi.rs (15) + encode/linux/mod.rs (9)  ffmpeg object ownership/
                          free ladders; VAAPI/DRM graph; Send = single-thread transfer
  inject/linux/wlr.rs (2), vdisplay/linux/kwin.rs (1)

No memory-unsafety SUSPECT blocks were found — the unsafe is sound. The vaapi
agent did flag two real AVBufferRef *leaks* (not UB) in DmabufInner::open; marked
inline with NOTE(leak) and addressed in a follow-up.

Verified: cargo clippy -p punktfunk-host --all-targets -- -D warnings is clean
(each file's deny gate hard-errors on any undocumented block).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-26 09:00:30 +00:00
enricobuehler 0ccd0fe676 feat(windows-host): EncoderCaps — query RFI/HDR-SEI caps (Goal-1 stage 5, tightening 3)
The last §2.3 seam-trait tightening: give `Encoder` a `caps() -> EncoderCaps`
so the session glue routes by *query* instead of relying on the no-op/`false`
defaults of `invalidate_ref_frames`/`set_hdr_meta`.

`EncoderCaps { supports_rfi, supports_hdr_metadata }` is a cheap `Copy` struct.
The trait gains a default `caps()` returning `EncoderCaps::default()` (all
false) — correct for every SDR/libavcodec backend (Linux NVENC, VAAPI, AMF/QSV,
software openh264), so they need no change. Only the Windows direct-NVENC path
(`NvencD3d11Encoder`) overrides it, reporting the real `rfi_supported` (probed
once at open via `nvEncGetEncodeCaps`) and `hdr` (HDR-SEI on keyframes).

Consumer: the GameStream encode loop (`gamestream/stream.rs`) hoists
`supports_rfi` once before the loop and gates the loss-recovery path on it —
`!(supports_rfi && enc.invalidate_ref_frames(..))` forces a keyframe directly
on non-RFI encoders instead of making an always-`false` call every loss event.
Behaviour-preserving (same keyframe/RFI outcome), one fewer no-op call, intent
explicit. The native host (punktfunk1) uses FEC+keyframes, no RFI consumer.

Linux `cargo clippy -p punktfunk-host --all-targets -D warnings` clean; the
three edited files are rustfmt-clean. The NVENC override is Windows-only
(1:1 with the existing impl style) → CI/on-glass gate.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 21:27:20 +00:00
enricobuehler 38c68c33e5 refactor(windows-host): confine platform code under windows/ + linux/ folders (Goal-1 stage 6)
Move 36 platform-specific files into per-module `windows/` and `linux/` subfolders (and the
shared HID codecs into `inject/proto/`):
  capture/{windows,linux}/  encode/{windows,linux}/  inject/{windows,linux,proto}/
  audio/{windows,linux}/  vdisplay/{windows,linux}/
  src/windows/ (service, wgc_helper, win_adapter, win_display)
  src/linux/  (dmabuf_fence, drm_sync, zerocopy/)

Done with `#[path]`, NOT a module rename: every file moves into its folder while the
`crate::*::*` module names stay FLAT, so all caller paths and every internal `super::`/`crate::`
reference are unchanged — only the parent `mod` decls gained `#[path = "..."]`. This is the
codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in
behaviour with ZERO reference churn, far lower risk than collapsing to a single
`crate::capture::windows::` namespace (that deeper rename is an optional follow-on; this delivers
the cfg-sprawl folder confinement the stage is about). Done LAST, after the semantic stages, so
the path churn didn't fight them.

Verified: Linux cargo check + clippy (-D warnings) clean; my mod-decl changes fmt-clean (the 3
remaining fmt diffs are pre-existing local-rustfmt-version skew that moved with their files); all
36 `#[path]` targets exist; no internal `#[path]`/`include!`/file-child-mod in any moved file
(the inline `mod X {` blocks are self-contained). Box build to follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:53:45 +00:00
enricobuehler e5057f6cc1 feat(windows-host): finish HostConfig migration — resolve operator/dispatch knobs once (Goal-1 stage 2)
Migrate 31 genuinely-constant operator/dispatch env::var sites onto HostConfig, so the
capture/topology/encoder decision reads ONE owner instead of being recomputed at each call
site (the latent bug where capture and encode could disagree on the resolved backend, plan §2.4):
idd_push x7, no_wgc, capture_backend, render_adapter, encoder_pref (Linux open_video +
linux_zero_copy_is_vaapi), the Windows vdisplay-backend select, plus the plan-named
secure_dda/idd_depth/zerocopy/ten_bit and the multi-site perf x4 / compositor x5 /
video_source x3 / gamepad. Each HostConfig field's parser is byte-identical to the read it
replaced, so old==new by construction (the plan's "a flipped bool is a silent regression" guard).

Scope correction — the plan's "~64 sites / Linux XDG+compositor included / grep env::var -> 0"
was unsafe as written. Two classes are deliberately KEPT as live reads and documented in config.rs:

  * Runtime-mutated session vars. vdisplay::apply_session_env REWRITES the process env on every
    connect (the Bazzite Gaming<->Desktop follow): WAYLAND_DISPLAY, XDG_CURRENT_DESKTOP,
    XDG_RUNTIME_DIR, DBUS_SESSION_BUS_ADDRESS, and the derived PUNKTFUNK_INPUT_BACKEND,
    GAMESCOPE_SESSION/NODE, KWIN/MUTTER_VIRTUAL_PRIMARY, FORCE_SHM. Parsing these once would
    freeze them at startup and silently break session-following — they are NOT constant.
  * Single-use local tuning with no resolve-once benefit (and FEC_PCT even has two different
    semantics): FEC_PCT, VIDEO_DROP, VBV_FRAMES, SPLIT_ENCODE, PACE_BURST_KB, the dxgi timing
    knobs, the *_LIVE/test gates, plus path/dynamic reads (config-dir, PATH search,
    env-forward-to-child). PUNKTFUNK_ZEROCOPY is split on purpose: Windows presence-semantics
    moved to the field; Linux keeps its own truthy (1|true|yes|on) parser.

Verified: Linux cargo check + clippy (-D warnings) + fmt clean on the touched files. The
Windows-only edits are 1:1 substitutions; they get a real Windows compile on the box with Stage 3.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 17:24:00 +00:00
enricobuehler 72eeedc4da feat(windows): AMD (AMF) + Intel (QSV) hardware encode on the Windows host
The Windows host was NVIDIA-only (NVENC) with an openh264 software fallback. Add
AMD AMF and Intel QSV via libavcodec — the Windows analogue of the Linux VAAPI
backend — so one installer serves all three GPU vendors.

- encode/ffmpeg_win.rs: new WinVendor{Amf,Qsv} encoder. System-memory NV12/P010
  readback (default, robust) + opt-in zero-copy D3D11 (PUNKTFUNK_ZEROCOPY: shares
  the capturer's ID3D11Device; AMF takes AV_PIX_FMT_D3D11, QSV derives a QSV frames
  ctx and maps) with a system fallback for the format-group mismatch the capturer's
  video-processor fallback can produce. HDR Main10 (P010 + BT.2020/PQ VUI; an
  Rgb10a2->P010 swscale covers the shader fallback).
- encode.rs: Codec::amf_name/qsv_name; open_video + windows_resolved_backend()
  resolve PUNKTFUNK_ENCODER=auto|nvenc|amf|qsv|sw via a DXGI adapter VendorId probe.
- capture/dxgi.rs: gpu_mode mirrors the resolved backend (D3D11 NV12/P010 for AMF/QSV).
- gamestream/serverinfo.rs: GPU-aware codec advertisement (windows_codec_support;
  AV1 gated to RDNA3+/Arc, like the VAAPI path).
- Cargo.toml: amf-qsv feature (optional ffmpeg-next in the windows target block).
- CI/installer: windows-host.yml sets FFMPEG_DIR + builds --features nvenc,amf-qsv;
  the Inno installer bundles the FFmpeg DLLs; host.env default nvenc -> auto.

CI-green target; AMF/QSV not yet on-glass validated (no AMD/Intel Windows box in the
lab) — NVENC stays live-validated. An adversarial-review pass caught + fixed real
FFI bugs (AV_PIX_FMT_P010 is a macro -> P010LE; windows-rs 0.62 GetImmediateContext/
GetDesc1 return Result; AV_HWFRAME_MAP_* is a bindgen enum with no BitOr).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-22 10:31:54 +00:00
enricobuehler 3526517eb1 feat: HDR Step-0 colour-metadata transport + security-audit hardening
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 57s
ci / web (push) Successful in 39s
ci / docs-site (push) Successful in 38s
windows-host / package (push) Successful in 3m26s
android / android (push) Successful in 3m40s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m24s
deb / build-publish (push) Successful in 2m10s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m22s
decky / build-publish (push) Successful in 25s
ci / bench (push) Successful in 4m44s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 1m4s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m7s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m45s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 30s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
flatpak / build-publish (push) Successful in 4m17s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m30s
docker / deploy-docs (push) Successful in 23s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
Two strands, entangled in punktfunk1.rs, committed together (one builds-green tree).

HDR pipeline Step 0 — glass-to-glass colour-metadata transport (docs/hdr-pipeline-plan.md):
- Protocol/ABI: ColorInfo on the Welcome + a 0xCE HdrMeta datagram carry the source colour
  space + HDR10 static mastering metadata (quic.rs, abi.rs connect_ex5 fixing caps=0).
- New platform-independent, unit-tested HDR static-metadata helpers (hdr.rs): chromaticities
  (1/50000), mastering luminance (0.0001 cd/m2), MaxCLL/MaxFALL in HDR10/ST.2086 units.
- Capture/encode hooks (capture.rs, encode.rs set_hdr_meta) + Linux client / probe plumbing.

Security-audit hardening — top 3 from docs/security-review.md, each adversarially verified:
- #1 [HIGH] Secret file permissions. The host key.pem/cert.pem and both trust stores are now
  written owner-only: 0600 + dir 0700 on Unix (mirrors mgmt_token), best-effort
  SYSTEM/Administrators/OWNER-only icacls DACL on Windows (%ProgramData% is Users-readable).
  Closes a local key-disclosure -> host-impersonation gap. New gamestream::{create_private_dir,
  write_secret_file} + a 0600 regression test.
- #2 [HIGH] Native SPAKE2 PIN is single-use. The PIN is consumed the moment the host sends its
  key-confirmation (which lets the client test its one guess), before reading the proof, so any
  completed attempt -- right OR wrong -- disarms the window. A wrong PIN isn't observable
  host-side (the client aborts before sending its proof), so consuming on first attempt is what
  delivers the documented "one online guess" instead of an unbounded brute-force of the static
  4-digit PIN. Test verifies single-use.
- #3 [MEDIUM] RTSP packetSize is bounded ([64,2048] in stream_config) and VideoPacketizer::new
  uses saturating .max(1), killing a PRE-AUTH div-by-zero/underflow panic of the video thread.
  Tests for {0,15,16,17} + out-of-range rejection.

fmt + clippy -D warnings clean; full workspace test suite green (93 host tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 09:07:59 +00:00
enricobuehler 450bcf1e7b feat(host): Apollo-backlog hardening — cert gate, NVENC RFI, media QoS, async injector
A pass over the apollo-comparison backlog (re-verified against current code).
Lands four items end-to-end plus a Windows-DualSense scoping doc.

- #5/#92/#26 — GameStream paired-cert allow-list. tls.rs surfaces the verified
  peer cert to handlers (serve_https + PeerCertFingerprint, now shared with the
  mgmt API instead of duplicated); nvhttp gates /launch /resume /applist /cancel
  on AppState.paired and reports a real PairStatus; save_paired writes atomically
  (temp+rename). Closes the "mTLS accepts any client cert" hole. + regression test.

- #6/#51/#19/#22 — NVENC caps query -> reference-frame invalidation. nvenc.rs
  query_caps probes nvEncGetEncodeCaps (max dims / 10-bit / custom-VBV / RFI),
  rejecting over-range modes and degrading 10-bit->8-bit instead of an opaque
  InvalidParam. New Encoder::invalidate_ref_frames (default false -> caller
  keyframes); the Windows NVENC path implements real RFI (multi-ref DPB +
  nvEncInvalidateRefFrames, dedup + IDR-on-overflow). control.rs decodes the
  0x0301 lost-frame range (Apollo's IDX_INVALIDATE_REF_FRAMES) -> AppState.rfi_range
  -> encode loop, falling back to a keyframe. NOTE: the Windows NVENC impl is
  RTX-box/CI-pending (can't compile on Linux); adversarially reviewed vs the SDK.

- #43/#72 — media socket QoS + buffer growth. New punktfunk_core::transport::qos:
  grow_socket_buffers (factored out the native plane's 32MB SO_SNDBUF growth so the
  GameStream sockets reuse it) + set_media_qos (opt-in PUNKTFUNK_DSCP=1: DSCP CS5
  video / CS6 audio + Linux SO_PRIORITY, Apollo's scheme). Wired into UdpTransport
  and the GameStream video/audio sockets. Windows IP_TOS needs qWAVE (follow-up).

- #8/#45 — GameStream input injection off the ENet service thread. on_receive no
  longer injects inline (a slow inject head-blocked ENet keepalive/retransmit); it
  forwards to a dedicated injector thread. The hardened InjectorService moved from
  punktfunk1 into crate::inject (shared by both planes) + a coalesce step that sums
  adjacent relative-mouse/scroll deltas while preserving button/key/abs ordering.

Docs: re-verified apollo-comparison.md status (22 items already done/obsolete since
the snapshot) + windows-dualsense-scoping.md (ViGEm can't emulate a DualSense; real
DS5 on Windows needs a VHF virtual-HID driver — web-research pass pending).

fmt + clippy -D warnings clean; full workspace test suite green; no C-ABI/OpenAPI drift.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-21 00:06:30 +00:00
enricobuehler 333f66b45b fix(host/serverinfo): don't advertise an empty codec mask when the VAAPI probe finds nothing
apple / swift (push) Successful in 54s
windows-host / package (push) Successful in 2m21s
android / android (push) Successful in 3m30s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / rust (push) Successful in 5m56s
deb / build-publish (push) Successful in 3m9s
ci / bench (push) Successful in 4m40s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
decky / build-publish (push) Successful in 11s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 17s
The Phase 3 GPU-aware codec mask (6922e1c) probes VAAPI on any non-NVIDIA host.
On a GPU-less box (CI container: no /dev/nvidia* -> `auto` picks VAAPI, but there's
no VA display) the probe returns all-false, so the mask was 0 -- the host
advertised NO codecs, and the serverinfo unit test failed.

Fall back to the static superset when the probe yields nothing (VAAPI wasn't
usable, not "the GPU encodes nothing"); quiet ffmpeg's expected "No VA display"
error during the probe; and assert the test against codec_mode_support() rather
than a hardcoded 65793 so it's deterministic regardless of the build host's GPU.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 11:52:17 +00:00
enricobuehler 6922e1c467 feat(host): VAAPI codec probe + AMD/Intel packaging + neutral logs (Phase 3)
apple / swift (push) Successful in 55s
ci / rust (push) Failing after 1m35s
ci / web (push) Successful in 28s
windows-host / package (push) Successful in 2m23s
ci / docs-site (push) Successful in 30s
android / android (push) Successful in 3m24s
deb / build-publish (push) Successful in 3m22s
decky / build-publish (push) Successful in 14s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m48s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m50s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 18s
Polish for AMD/Intel support:
- GameStream serverinfo advertises only codecs the GPU can ACTUALLY encode on
  the VAAPI backend (probed once by opening a tiny encoder per codec). AV1
  encode is narrow (Intel Arc/Xe2+, AMD RDNA3+/RDNA4) and an old iGPU may lack
  HEVC, so a Moonlight client never negotiates a codec the encoder can't open.
  NVENC/Windows keep the Moonlight-validated static mask. Validated on a Radeon
  780M: h264/h265/av1 all probe true -> mask unchanged (65793).
- Packaging: Recommends mesa-va-drivers + intel-media-va-driver (deb) /
  mesa-va-drivers + intel-media-driver (rpm) so the auto-selected VAAPI backend
  works out of the box on AMD/Intel; NVIDIA boxes can --no-install-recommends.
  (Fedora note: stock mesa-va-drivers disables HEVC/AV1 -- needs the freeworld
  variant from RPM Fusion.)
- De-NVIDIA-fy the user-facing encoder log/context strings ("open NVENC" ->
  "open video encoder") now that VAAPI is a first-class backend.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 10:41:37 +00:00
enricobuehler 708c62788d feat(host/encode): VAAPI zero-copy dmabuf import (AMD/Intel GPU CSC)
apple / swift (push) Successful in 57s
ci / rust (push) Successful in 1m39s
ci / web (push) Successful in 32s
ci / docs-site (push) Successful in 31s
android / android (push) Successful in 3m29s
windows-host / package (push) Successful in 3m39s
deb / build-publish (push) Successful in 3m7s
decky / build-publish (push) Successful in 22s
ci / bench (push) Successful in 4m43s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m24s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 22s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m22s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
Phase 2 of AMD/Intel support: the VAAPI encoder now takes the capture dmabuf
directly and does the RGB->NV12 colour conversion on the GPU's video engine,
eliminating the host-side de-pad + swscale CSC + upload the CPU path pays.

- capture: a vendor-neutral FramePayload::Dmabuf (dup'd fd + fourcc/modifier/
  layout). When zero-copy is on, the EGL->CUDA importer is unavailable (any
  non-NVIDIA host), and the backend is VAAPI, the capturer advertises LINEAR
  dmabuf and hands the raw buffer to the encoder instead of CPU-copying it.
- encode/vaapi: the encoder self-configures from the first frame's payload (no
  open_video signature change). The dmabuf arm wraps the buffer as an
  AV_PIX_FMT_DRM_PRIME frame and pushes it through a filter graph
  buffer(drm_prime) -> hwmap(vaapi) -> scale_vaapi=nv12 -> buffersink; the
  encoder takes NV12 surfaces straight from the sink. The Phase 1 CPU-upload
  path is kept as the other arm (used when capture produces CPU frames).

Live-validated on a Radeon 780M (real Sway/xdpw desktop capture): correct,
pixel-perfect HEVC, and ~10x less host CPU at 1440p (4.2s -> 0.4s of CPU for
300 frames) -- the de-pad/CSC/upload moves to the GPU. NVIDIA unchanged
(zero-copy still imports to CUDA; the passthrough path only engages on
non-NVIDIA hosts).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-20 09:57:00 +00:00
enricobuehler b390dd883b feat(host/encode): VAAPI encode backend for AMD/Intel GPUs (Linux)
apple / swift (push) Successful in 54s
windows-host / package (push) Successful in 2m52s
android / android (push) Successful in 3m4s
ci / rust (push) Successful in 1m18s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 27s
ci / bench (push) Successful in 4m32s
deb / build-publish (push) Successful in 2m56s
decky / build-publish (push) Successful in 22s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m59s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 15s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m41s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 7m27s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m10s
docker / deploy-docs (push) Successful in 39s
The Linux host was NVENC/CUDA-only. Add a VAAPI encoder — one libavcodec
backend (h264/hevc/av1_vaapi) covering both AMD (Mesa radeonsi) and Intel
(iHD) — behind the existing `Encoder` trait, and turn `open_video`'s Linux
arm into a vendor dispatcher: `PUNKTFUNK_ENCODER=auto|nvenc|vaapi` (default
auto: NVENC when a CUDA frame or /dev/nvidia* is present, else VAAPI). The
NVIDIA path is unchanged — auto resolves to NVENC on an NVIDIA box and the
bitrate-probe loop moved verbatim into `open_nvenc_probed`.

`VaapiEncoder` mirrors the NVENC hwframes pattern with AV_HWDEVICE_TYPE_VAAPI.
The CPU-input path swscales packed RGB -> NV12 (BT.709 limited, VUI signalled)
and uploads into a pooled VA surface (av_hwframe_transfer_data), preserving the
low-latency model (infinite GOP, on-demand forced IDR, async_depth=1, CBR when
the driver supports it). It works on a non-NVIDIA box with no capture changes:
the capturer already falls back to CPU frames when its EGL->CUDA importer can't
initialise (no libcuda).

Live-validated on a Radeon 780M (RDNA3): hevc/h264/av1_vaapi all encode,
HEVC/H264 decode cleanly with correct BT.709-limited colours, infinite GOP
preserved. Zero-copy dmabuf import (the high-res perf lever) is next.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-19 15:35:49 +00:00
enricobuehler 1fc6f73784 perf(host/linux): NV12 GPU convert — feed NVENC native YUV, off the contended SM (Tier 2A)
apple / swift (push) Successful in 54s
windows-host / package (push) Failing after 2m18s
ci / web (push) Successful in 32s
ci / rust (push) Failing after 5m2s
decky / build-publish (push) Successful in 11s
android / android (push) Failing after 49s
ci / docs-site (push) Successful in 35s
ci / bench (push) Failing after 3m15s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m49s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 15s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 40s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 28s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Successful in 5m54s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 11s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m36s
The Linux zero-copy tiled-GL path can now produce NV12 (BT.709 limited range)
on the GPU and feed NVENC native YUV, deleting NVENC's internal RGB->YUV CSC —
which runs on the SM/3D-compute engine a saturating game pins at 100% (the
game-vs-encode contention headache). Windows already does this via the D3D11
video processor; this closes the Linux gap. See docs/host-latency-plan.md §2A.

Gated behind PUNKTFUNK_NV12 (default OFF → the RGB/BGRx path is byte-for-byte
unchanged; zero regression). Only the tiled EGL/GL path converts; the
LINEAR/Vulkan-bridge (gamescope) path stays RGB.

- zerocopy/egl.rs: Nv12Blit — BT.709 limited Y pass (R8, full-res) + UV pass
  (RG8, half-res, GL_LINEAR 2x2 average); both CUDA-registered; import_nv12.
- zerocopy/cuda.rs: two-plane DeviceBuffer (Y W*H@1B + interleaved UV
  (W/2)*2 x H/2), paired Y+UV pool, copy_mapped_nv12 + copy_nv12_to_device,
  on the per-thread priority stream (dmabuf-recycle sync preserved).
- encode/linux.rs: nvenc_input(Nv12)->NV12; submit_cuda copies two planes into
  NVENC's surface; VUI signalled BT.709 limited (colorspace/range/primaries/trc).
- capture/linux.rs: gate (PUNKTFUNK_NV12 && tiled), report format Nv12.
- main.rs + zerocopy/mod.rs: `nv12-selftest` subcommand.

Validated on RTX 5070 Ti two ways: (1) nv12-selftest — synthetic RGBA->NV12
round-trip vs a BT.709 reference, max abs error Y=0.56/U=0.33/V=0.26 LSB;
(2) live capture->NV12->NVENC->decode of animated red content matches the RGB
path's colour (avg RGB 230,18,18 vs 231,18,20). build/clippy/fmt green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 23:39:11 +00:00
enricobuehler 9771aa8815 fix(host/windows): binary-search clamp NVENC bitrate to the codec-level max (not ×¾ step-down)
ci / web (push) Successful in 28s
ci / rust (push) Successful in 1m42s
ci / docs-site (push) Successful in 28s
apple / swift (push) Successful in 55s
android / android (push) Successful in 1m55s
deb / build-publish (push) Successful in 2m29s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
ci / bench (push) Successful in 4m27s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m54s
When a client requests a bitrate above the GPU's HEVC/AV1 level ceiling, NVENC rejects
initialize_encoder. The old probe stepped the rate down by ×¾ each retry, undershooting
the real ceiling badly (a 1 Gbps request landed ~300 Mbps even with the level cap near
800). Replace it with a binary search over [floor, requested] that converges (±20 Mbps)
on the HIGHEST rate NVENC accepts and clamps to that — so the stream uses the full
codec-level bitrate. Factored the session open/config/init into try_open_session() for
the probe; split-encode rejection is disambiguated from a bitrate-cap rejection (retry
once with split disabled) and the floor fallback also tries split-disabled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 13:42:00 +00:00
enricobuehler a4df75132a fix(host/windows): HEVC/AV1 HIGH tier so high client bitrates aren't quartered
android / android (push) Successful in 1m56s
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m35s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
deb / build-publish (push) Successful in 2m26s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 6s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 6s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m36s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m8s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m8s
docker / deploy-docs (push) Successful in 18s
NVENC defaulted to Main tier, whose per-level bitrate ceiling at 5K (HEVC Level 6.2
Main ≈ 240 Mbps) made initialize_encoder reject a high client bitrate; the existing
probe-and-step-down then silently dropped a ~1 Gbps request by ×¾ to ~240-320 Mbps —
visible color/motion compression on fast scenes. Set HIGH tier (≈800 Mbps for HEVC,
higher for AV1) + autoselect level so the requested bitrate goes through. `tier`/`level`
are u32 (HIGH=1, AUTOSELECT=0) shared across the HEVC/AV1 union offset; the step-down
remains as a safety net. Not yet built/validated on-box (box offline).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 13:33:31 +00:00
enricobuehler 4cc57d5c39 perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
apple / swift (push) Successful in 56s
ci / rust (push) Successful in 1m36s
android / android (push) Successful in 1m56s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 28s
deb / build-publish (push) Successful in 2m26s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m33s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m58s
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game:
the per-frame capture→encode path shared the 3D engine with the game and got
scheduled behind it. Rework to minimize 3D-engine work per frame:

- VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so
  NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA
  (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input.
- GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass
  HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1).
- WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast
  CopyResource to decouple its single-frame acquire/release from the async convert.
- Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split
  (cap_wait / encode / write).

Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+),
latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV
convert (no fixed-function unit on NVIDIA — VideoProcessing engine reads 0%) waiting
behind an uncapped game under WDDM context time-slicing; Linux avoids it via
gamescope capping the game to the display refresh.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-17 13:08:03 +00:00
enricobuehler b9f4cf1f3e fix(host/windows): don't 2-way-split-encode Main10 — it's SLOWER on Ada (fixes broken HDR animations)
apple / swift (push) Successful in 53s
audit / cargo-audit (push) Failing after 1m9s
android / android (push) Successful in 2m3s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
ci / web (push) Successful in 29s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m31s
ci / rust (push) Successful in 4m26s
decky / build-publish (push) Successful in 11s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
flatpak / build-publish (push) Successful in 3m34s
deb / build-publish (push) Successful in 6m55s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m25s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m10s
The "broken animations in HDR" was an encode-throughput cliff, not the ACCESS_LOST churn. Measured at
5120x1440@240 HEVC Main10 on the RTX 4090: forced 2-way split-encode = 7.6 ms/frame (~131 fps, well
over the 4.17 ms/240fps budget → choppy), while SINGLE engine = 2.8-3.9 ms/frame (~256-357 fps, fits
240). The split/merge overhead dominates for 10-bit; a single Ada NVENC engine already handles 5K@240
Main10 comfortably. So the split decision now forces DISABLE for Main10 (bit_depth >= 10), keeping the
existing forced-2 only for 8-bit above 1 Gpix/s. PUNKTFUNK_SPLIT_ENCODE still overrides. Added a
split-mode log line.

Validated live on the 4090: encode_us_p50 7.6 ms → 3.9 ms at 5K240 HDR with no env override.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 21:40:28 +00:00
enricobuehler bbabc04bca feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an
8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the
capture losing the stream across a secure-desktop transition.

Protocol (punktfunk-core/quic.rs):
- Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`,
  both as optional trailing bytes (back-compat). client-rs advertises 10-bit via
  PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives
  the native clients). Regenerated punktfunk_core.h.

Windows host:
- 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND
  PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10,
  pixelBitDepthMinus8).
- HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to
  an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit
  (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI
  (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the
  BGRA path. Reacts dynamically to the HDR toggle.
- Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s,
  repeating the last good frame between attempts (format-tagged last_present). During a
  secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the
  send loop for seconds so the client timed out and disconnected — now the session stays
  fed (frozen) until the desktop returns. Also seeds a black frame on recovery.

Apple client (PunktfunkKit):
- Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010,
  and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged.
  Switches automatically on a mid-session HDR toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 20:28:52 +00:00
enricobuehler 26fbd9ec64 perf(host/windows): zero-copy NVENC — encode the capturer's texture in place (halve 3D-engine load)
ci / rust (push) Failing after 43s
apple / swift (push) Successful in 53s
ci / web (push) Successful in 35s
android / android (push) Successful in 1m45s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m35s
decky / build-publish (push) Successful in 32s
deb / build-publish (push) Successful in 2m21s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 17s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m59s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3m52s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 21s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m37s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m4s
docker / deploy-docs (push) Successful in 18s
The Windows host pegged the GPU 3D engine at ~97% during high-fps desktop streaming — measured (per-
process GPU-engine counters) as OUR process, not DWM. Cause: TWO VRAM->VRAM CopyResource per frame
(dupl->gpu_copy in the capturer, then gpu_copy->nvenc_pool in the encoder), and on Windows D3D11
routes copies to render-target textures through the 3D engine (the DMA copy engine sat idle at 7%),
so at 240 fps they saturate it and contend with a game's own rendering.

Eliminate the second copy: NVENC now registers the capturer's D3D11 texture directly (cached by raw
pointer, the cloned texture kept alive until unregister) and encode_pictures it IN PLACE — no
encoder-owned input pool, no per-frame copy. Safe because the host encode loop is synchronous
(capture -> submit -> poll, where lock_bitstream blocks until the encode finishes), so the capturer
never overwrites the texture mid-encode; documented in the module header in case that ever changes.

2 GPU copies/frame -> 1 (the remaining dupl->gpu_copy is unavoidable; that DXGI surface is transient).
Measured: SM/compute ~10-15% at ~217 fps 5K (was ~20% at only ~48 fps with two copies), 3687 frames
decoded clean. Windows-only; Linux/macOS unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 17:33:07 +00:00
enricobuehler c830246037 feat(host/windows): UDP send offload + NVENC 2-way split-encode (1 Gbps+ / 5K@240)
apple / swift (push) Successful in 53s
audit / cargo-audit (push) Failing after 1m7s
ci / rust (push) Failing after 40s
android / android (push) Successful in 2m11s
ci / web (push) Successful in 29s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
flatpak / build-publish (push) Successful in 3m42s
deb / build-publish (push) Successful in 6m58s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m30s
docker / deploy-docs (push) Successful in 30s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m10s
The Windows host couldn't sustain high-throughput / high-fps streams — two gaps vs the Linux host,
both found via live RTX 4090 measurement (PERF timing + nvidia-smi per-engine attribution):

- UDP Send Offload (USO). punktfunk-core's UdpTransport sent one packet per `send` syscall on
  Windows (send_batch/send_gso were Linux-only), capping throughput at high packet rates. Add a
  Windows `send_gso` override using `WSASendMsg` + `UDP_SEND_MSG_SIZE` (the Windows analogue of
  Linux UDP GSO) via windows-sys — one syscall segments a coalesced <=512-segment super-buffer to
  the connected peer. On by default with auto-fallback (PUNKTFUNK_GSO=0 disables, error latches
  off); plugs into the existing paced send path. SO_SNDBUF (32MB) was already cross-platform.

- NVENC 2-way split-frame encoding. A single Ada NVENC session tops out ~0.8 Gpix/s, so 5K@240
  (1.77 Gpix/s) took ~8 ms/frame -> a ~125 fps ceiling at high motion (the in-game stutter). Set
  NV_ENC_INITIALIZE_PARAMS.splitEncodeMode = TWO_FORCED above ~1 Gpix/s (matching the Linux
  libavcodec split_encode_mode path) to use both 4090 encoders — measured ~8 ms -> ~4 ms/frame at
  throughput. Env override PUNKTFUNK_SPLIT_ENCODE; init-failure fallback disables it (e.g. H264).

Windows-only paths; Linux/macOS unaffected. Builds clean on x86_64-pc-windows-msvc.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 16:52:59 +00:00
enricobuehler f4b4a6c1e4 feat(host/windows): native res, cursor, secure-desktop capture, windowless SYSTEM launch
apple / swift (push) Successful in 52s
ci / rust (push) Failing after 36s
ci / web (push) Successful in 31s
android / android (push) Successful in 1m52s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m19s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m57s
docker / deploy-docs (push) Successful in 17s
Live-validated Mac <-> RTX 4090 at the display's native 5120x1440@240:

- Resolution: set_active_mode enumerates the IDD's advertised modes and sets the
  requested resolution at the best supported refresh (keeps 5120x1440@240; no more
  silent fallback to the 1080p OS default when an exact mode is briefly unavailable).
- Bitrate auto-cap: NVENC init probes and steps the average bitrate down to the GPU's
  codec-level max so a high client bitrate connects (matches the Linux host; we do not
  split NVENC sessions).
- Mouse cursor: DXGI duplication excludes the HW cursor; capture the pointer
  shape/position (GetFramePointerShape) and GPU-composite it before NVENC. Color cursors
  alpha-blend; masked-color (the text I-beam) uses an INV_DEST_COLOR inversion blend so
  the caret inverts the screen and shows on any background (no black box); monochrome
  handled too.
- Secure desktop (lock / login / UAC): run as SYSTEM in the interactive session, follow
  the input desktop via SetThreadDesktop, and on the WinSta switch recreate the D3D11
  device and re-resolve the virtual output's GDI name from the stable SudoVDA target id
  (the name changes across the topology rebuild; the old failure hunted the stale
  \\.\DISPLAYn and dropped). ACCESS_LOST / INVALID_CALL / device-removed are recoverable,
  and a mid-stream resolution change is followed (capturer + NVENC re-init at the new
  size). isolate_displays detaches other monitors so Winlogon renders to the virtual
  output. One real session recovered 1012 desktop switches and completed cleanly.

Windows-only backends; Linux/macOS unaffected. Builds clean on x86_64-pc-windows-msvc.
Deployment (windowless SYSTEM launch via PsExec + hidden VBScript) documented in
docs/windows-host.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 15:46:34 +00:00
enricobuehler 2448a33698 style(host/windows): rustfmt the Windows backends
apple / swift (push) Successful in 55s
android / android (push) Failing after 1m53s
ci / web (push) Failing after 17s
ci / docs-site (push) Successful in 42s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 7s
ci / rust (push) Failing after 3m5s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 7s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 2s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 0s
flatpak / build-publish (push) Failing after 0s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Failing after 1m43s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m15s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 01:50:16 +00:00
enricobuehler 69ba6ec45d feat(host/windows): NVENC D3D11 hardware encoder (--features nvenc)
android / android (push) Failing after 36s
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 55s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m35s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m32s
docker / deploy-docs (push) Successful in 17s
Zero-copy capture->encode on the GPU via the raw NVENC API (nvidia_video_codec_sdk sys + ENCODE_API; the safe wrapper is CUDA-only). Opens an NV_ENC_DEVICE_TYPE_DIRECTX session on the SAME ID3D11Device as the DXGI capturer (carried on the new FramePayload::D3d11), registers a pool of BGRA textures once, CopyResources each captured texture in and encode_picture; CBR/ULL, infinite GOP, P-only, forced-IDR for RFI. The DXGI capturer gains a D3D11 zero-copy output (selected, like the encoder, by PUNKTFUNK_ENCODER=nvenc) so capture+encode share textures.

OFF by default (the nvenc feature pulls the NVENC SDK + cudarc): the default Windows host links without it (openh264 path). cudarc builds toolkit-less via the SDK ci-check feature (dynamic-loading). At link time --features nvenc needs nvencodeapi.lib (NVENC SDK, or an import lib generated from the driver's nvEncodeAPI64.dll) on PUNKTFUNK_NVENC_LIB_DIR. Both default and --features nvenc builds validated to compile+link GPU-less on the VM (import lib generated from the driver DLL). Runtime needs a real NVIDIA GPU.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 01:39:46 +00:00
enricobuehler cbbeaa5c29 feat(host/windows): openh264 software H.264 encoder (GPU-less path)
apple / swift (push) Successful in 53s
android / android (push) Failing after 1m31s
ci / rust (push) Failing after 45s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m37s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m46s
docker / deploy-docs (push) Successful in 18s
Windows Encoder impl via the openh264 crate (statically-bundled, BSD-2): low-latency screen-content config (Baseline/no-B-frames, bitrate RC, BT.709 limited, near-infinite GOP + forced-IDR recovery via request_keyframe), packed CPU pixels (BGRx/BGRA/RGB/RGBA/RGBx/BGR) -> I420 -> AnnexB with in-band SPS/PPS each IDR. Synchronous: submit encodes immediately, poll hands back the one AU, flush is a no-op. Windows open_video factory selects it (PUNKTFUNK_ENCODER=software|nvenc|auto; NVENC arm lands later), H.264-only with a clear error otherwise, SW bitrate ceiling. Unit-tested live on the VM: synthetic BGRx -> AnnexB IDR + SPS NAL. Unblocks the GPU-less capture->encode->FEC->send pipeline. Compiles clean on Windows + Linux.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 00:43:19 +00:00
enricobuehler a95984bb4f feat(client-linux): feature parity with the Swift client
Everything the macOS app does that stage 1 lacked, before any new
feature work (user directive):

- Input capture is now a deliberate, reversible STATE (Moonlight-
  style): engaged on stream start and click-into-video (the engaging
  click is suppressed), released by Ctrl+Alt+Shift+Q (toggles) or
  focus loss; held keys/buttons are flushed host-side on release;
  cursor hiding + shortcut inhibition follow the state; HUD hint when
  released. Per-session window handlers disconnect with the page.
- Gamepads: app-lifetime SDL service (GamepadManager parity) — pad
  list + "Forwarded controller" pin in Settings (auto = most recent),
  "Automatic" pad TYPE resolves from the physical pad at connect;
  DualSense touchpad contacts + ~250 Hz motion samples on the 0xCC
  plane (Swift GamepadWire scale constants); feedback grows adaptive-
  trigger replay and player LEDs via raw DS5 effects packets (the
  wire's 11-byte blocks drop into SDL_SendGamepadEffect verbatim);
  held pad state zeroed on pad switch/detach. sdl3 "hidapi" feature.
- Microphone uplink: PipeWire capture -> Opus 20 ms -> 0xCB datagrams
  (validated live: host received 711 mic packets), Settings toggle.
- Speed test per saved host (Swift's "Test Network Speed…"): 2 s
  probe burst, goodput/loss + recommended ~70 % bitrate, one-tap apply.
- Settings: host compositor preference (sent in the Hello), native-
  display resolution/refresh resolved from the window's monitor at
  connect (new default), bitrate ceiling to 3 Gbit/s.
- Hosts page: saved/trusted hosts section for direct pinned reconnect
  (mDNS not required), rebuilt on every page return.

Deliberately not ported: audio device pickers (PipeWire routing owns
this on Linux), resize-to-request_mode (not wired in Swift either),
pointer-lock relative mouse (stage-2 presenter, needs raw Wayland).
DualSense fidelity needs a physical pad to live-verify.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 21:11:52 +00:00
enricobuehler a8a6224fd8 fix(encode): bound per-frame size with a tight VBV buffer
ci / rust (push) Failing after 36s
ci / web (push) Failing after 36s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
docker / deploy-docs (push) Successful in 17s
ci / docs-site (push) Failing after 39s
apple / swift (push) Successful in 1m16s
NVENC ran CBR (bit_rate == max_bit_rate, rc=cbr) but never set rc_buffer_size,
so it used a loose default VBV. A high-motion P-frame was then allowed to spike
to many times the average frame size; the extra packets overflow the depth-2
send queue (newest frame dropped) and the kernel UDP buffer (WouldBlock drops),
which the client sees as framedrops/jitter — and on the infinite-GOP GameStream
path as old/stale frames flashing until the next RFI.

Set a tight ~1-frame VBV (rc_buffer_size = bitrate/fps) so the encoder holds
frame size roughly constant and absorbs motion as a momentary QP/quality dip
instead — the Sunshine/Moonlight low-latency model. Tunable via
PUNKTFUNK_VBV_FRAMES (default 1.0); larger trades burst tolerance for motion
quality. Fixes both the punktfunk/1 and GameStream paths (shared encoder).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 20:58:46 +00:00
enricobuehler 136390514d build: support FFmpeg 7.x and 8.x; fix RPM spec GPU link deps
ci / rust (push) Has been cancelled
punktfunk-host builds unchanged against either FFmpeg 7.x (libavcodec 61) or 8.x
(libavcodec 62) — ffmpeg-sys-next auto-detects the system version, and the host's
ffmpeg FFI only touches long-stable APIs. Confirmed by building + running live on a
Bazzite F43 box (FFmpeg 7.1.3): full gamescope capture → zero-copy dmabuf→CUDA →
NVENC H.265 at 1280x720x60, p50 ~0.96 ms. Just doc/spec accuracy, no code change:

- encode/linux.rs + CLAUDE.md: drop the "FFmpeg 8 only" claim; note 7.x/8.x both work.
- rpm spec: add the missing zero-copy GPU build deps the link actually needs —
  pkgconfig(gl) + pkgconfig(gbm) (mesa) — and document that -lcuda needs libcuda.so at
  link time (NVIDIA host, or the CUDA toolkit stub on a headless COPR/koji builder).
  Tracked for a proper fix: make the cuda/gbm/GL FFI dlopen-based like khronos-egl so
  the RPM builds on a GPU-less host.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 09:12:59 +00:00
enricobuehler bfd64ce871 rename: lumen → punktfunk, everywhere
ci / rust (push) Has been cancelled
Full project rename, decided 2026-06-10:
- Crates/binaries: punktfunk-core / punktfunk-host / punktfunk-client-rs.
- C ABI: punktfunk_* symbols, Punktfunk* types, include/punktfunk_core.h,
  PUNKTFUNK_FEATURE_QUIC guard (header regenerated; cbindgen renames updated, incl.
  PUNKTFUNK_BTN_*/PUNKTFUNK_AXIS_* wire constants).
- Protocol: punktfunk/1 — control-plane magic LMN1 → PKF1, nonce salt lmn1 → pkf1.
  WIRE BREAK: clients must be rebuilt from this revision.
- Env knobs: PUNKTFUNK_VIDEO_SOURCE / PUNKTFUNK_COMPOSITOR / PUNKTFUNK_ZEROCOPY / ….
- Host config dir: ~/.config/punktfunk (the box's dir was migrated in place — the
  persistent identity is unchanged, pinned fingerprints stay valid).
- Swift package: PunktfunkKit + PunktfunkCore.xcframework + PunktfunkConnection
  (Sources/PunktfunkClient app + tests renamed with it); build-xcframework.sh updated.
- scripts/: 60-punktfunk.rules, punktfunk-host.service; OpenAPI doc regenerated.

Also: scripts/headless/run-headless-kde.sh — full headless Plasma bringup. Root cause of
"desktop but no apps/settings" over the stream: plasmashell launched without
XDG_MENU_PREFIX=plasma-, so the launcher resolved a nonexistent applications.menu and
rendered an empty menu. The script sets the complete KDE session env (menu prefix,
KDE_FULL_SESSION, session version) and rebuilds ksycoca before starting plasmashell.

Gate: 97/97 tests, clippy -D warnings (both feature sets), fmt, C-ABI harness PASS,
zero lumen references left outside .git.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-10 13:11:59 +00:00