6c2942ee457a4be759063e1e57f66627718ff8ab
87 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
6c2942ee45 |
fix(fmt): remove extra blank line in dxgi.rs
android / android (push) Has been cancelled
apple / screenshots (push) Has been cancelled
apple / swift (push) Has been cancelled
ci / web (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
deb / build-publish (push) Has been cancelled
ci / rust (push) Has been cancelled
decky / build-publish (push) Has been cancelled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Has been cancelled
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Has been cancelled
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Has been cancelled
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Has been cancelled
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Has been cancelled
docker / deploy-docs (push) Has been cancelled
windows-host / package (push) Failing after 11s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> |
||
|
|
188b26b584 |
fix build
windows-drivers / probe-and-proto (push) Successful in 27s
ci / rust (push) Failing after 1m8s
apple / swift (push) Successful in 1m8s
windows-drivers / driver-build (push) Successful in 1m14s
windows-host / package (push) Failing after 12s
ci / web (push) Successful in 54s
ci / docs-site (push) Successful in 1m8s
android / android (push) Successful in 3m23s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m22s
decky / build-publish (push) Successful in 25s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 46s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 7s
ci / bench (push) Successful in 5m1s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 10s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 10s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 9m11s
docker / deploy-docs (push) Successful in 8s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m49s
|
||
|
|
080c55dbf7 |
refactor(host/windows): collapse Windows capture to IDD-push only
apple / swift (push) Successful in 1m5s
ci / rust (push) Failing after 1m29s
windows-host / package (push) Failing after 1m11s
ci / web (push) Successful in 56s
ci / docs-site (push) Successful in 1m4s
android / android (push) Successful in 3m35s
apple / screenshots (push) Successful in 5m30s
deb / build-publish (push) Successful in 3m18s
decky / build-publish (push) Successful in 27s
ci / bench (push) Successful in 4m39s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 34s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m38s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m23s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 52s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m7s
docker / deploy-docs (push) Failing after 12m53s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
Remove DXGI Desktop Duplication (DuplCapturer), Windows.Graphics.Capture
(WgcCapturer), the two-process SYSTEM+helper relay (virtual_stream_relay /
HelperRelay / DesktopWatcher / composed_flip), and the five source files that
implemented them. IDD direct-push is now the sole Windows capture path; the
session topology is always SingleProcess.
Deleted files: wgc.rs, wgc_relay.rs, desktop_watch.rs, composed_flip.rs,
windows/wgc_helper.rs (+ wgc-helper subcommand in main.rs).
dxgi.rs is kept but carved to shared GPU primitives only (make_device,
HdrP010Converter, VideoConverter, install_gpu_pref_hook, WinCaptureTarget,
pack_luid) — ~2237 lines of DDA-only code removed; imports cleaned.
capture.rs: IDD-push open failure fails the session cleanly (no fallback).
Adds capturer_supports_444() — returns false on Windows (IDD-push 4:4:4 is a
follow-up), replacing the stale single_process gate in 4:4:4 negotiation.
session_plan.rs: CaptureBackend{Dda,Wgc} and SessionTopology::TwoProcessRelay
removed. config.rs: no_helper/force_helper/no_wgc/capture_backend/secure_dda
removed. merged_env_block relocated from wgc_relay to windows/interactive.rs.
Linux cargo check clean.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
||
|
|
b5f02000d6 |
fix(host/security): scope Windows shared-section SDDL to SYSTEM+LocalService (close #5)
The gamepad host<->UMDF-driver shared sections (Global\pfds-shm-*, pfxusb-shm-*) and the IDD-push frame ring/event (Global\pfvd-*) were created with `D:(A;;GA;;;WD)` — GENERIC_ALL to **Everyone** — on the assumption the driver's WUDFHost ran under a restricted token needing broad access. So any local unprivileged user could OpenFileMapping the section to inject controller input, tamper the trusted HID channel, or read captured screen frames (security-review 2026-06-28 #5). On-box validation (RTX box, 2026-06-29) disproved the restricted-token premise: the WUDFHost token is NT AUTHORITY\LocalService (S-1-5-19), SYSTEM integrity, with ZERO restricted SIDs. So the section only needs SYSTEM (the host creates + writes it) and LocalService (the driver opens it). Scope both SDDL sites to `D:(A;;GA;;;SY)(A;;GA;;;LS)`; rename the now-misnamed `permissive_sa` -> `shared_object_sa`; correct the stale "restricted-token / Everyone" docs. Validated live: a full DualSense + 1280x720x60 session — 6943 frames received, HID output round-tripped, device status OK (pf_dualsense + pf_vdisplay WUDFHosts both LocalService open the scoped sections fine), while OpenFileMapping from a non-SYSTEM admin session now returns ACCESS_DENIED (was a granted handle under WD). Host-only change (the SDDL is set when the host CREATES the section); drivers unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
bee1f0416d |
chore(licensing): LGPL FFmpeg swap, third-party notices, attribution hygiene
The MIT OR Apache-2.0 SOURCE license is clean (audit found no copied copyleft); the
gaps were all binary-distribution (Layer-2). This makes the shipped artifacts honest:
- Windows host + client: bundled FFmpeg BtbN gpl-shared -> lgpl-shared (AMF/QSV/decode
unaffected; the GPL-only x264/x265 were never used), and ship the FFmpeg LGPL notice
+ license text in the installer + MSIX (licenses/).
- THIRD-PARTY-NOTICES.txt generated + bundled into installer/MSIX/deb/rpm. Offline
generator (scripts/gen-third-party-notices.{py,sh}) + cargo-about config (about.toml/
.hbs) with a permissive-only accepted-license allow-list as a copyleft regression gate.
- Reword the win32u GPU-preference hook comments to reflect independent reimplementation
(no Apollo/Sunshine GPL-3.0 source copied).
- README dual-license + inbound=outbound contributor clause + non-affiliation trademark
disclaimer; new CONTRIBUTING.md.
- LICENSE files into the standalone driver + vk-layer workspaces; deb copyright holder
aligned to "unom and the punktfunk contributors".
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
75627c8afe |
feat(audio): end-to-end 5.1/7.1 surround across the native path + all clients
apple / swift (push) Failing after 10s
release / apple (push) Failing after 7s
apple / screenshots (push) Has been skipped
audit / cargo-audit (push) Failing after 1m19s
windows-host / package (push) Failing after 2m44s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Failing after 39s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Failing after 39s
windows / build (aarch64-pc-windows-msvc) (push) Failing after 45s
android / android (push) Successful in 5m17s
windows / build (x86_64-pc-windows-msvc) (push) Failing after 45s
ci / web (push) Successful in 57s
ci / docs-site (push) Successful in 56s
ci / rust (push) Successful in 9m19s
ci / bench (push) Successful in 4m40s
decky / build-publish (push) Successful in 26s
deb / build-publish (push) Successful in 2m57s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 33s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m56s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m35s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m20s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 53s
flatpak / build-publish (push) Successful in 4m22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m51s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m50s
Adds negotiated 5.1/7.1 surround to the punktfunk/1 protocol and every client (previously stereo-only): - core: new shared `audio` layout table (LAYOUT_51/71 + identity multistream mapping, canonical wire order FL FR FC LFE RL RR SL SR); Hello/Welcome `audio_channels` negotiation via the trailing-byte back-compat pattern (old peers fall back to stereo); C-ABI `punktfunk_connect_ex6`, `punktfunk_connection_audio_channels`, and in-core multistream decode `punktfunk_connection_next_audio_pcm` for embedders without a multistream Opus decoder. Real-libopus channel-identity round-trip test. - host: native audio thread captures + Opus-(multi)stream-encodes at the negotiated count (with a cross-session cached-capturer channel-mismatch fix); GameStream surround unified onto the safe `opus::MSEncoder`, dropping `audiopus_sys` (~4 unsafe blocks) and un-gating Windows GameStream surround; WASAPI loopback capture relaxed to 2/6/8 with the correct dwChannelMask. - clients: Linux (PipeWire), Windows (WASAPI), Android (AAudio) decode via `opus::MSDecoder` + render multichannel; Apple decodes in-core to PCM → AVAudioEngine with an explicit wire-order channel layout; each gains a Stereo/5.1/7.1 setting. `punktfunk-probe --audio-channels N` is the headless validator. Verified on Linux: core/host/linux/probe test suites + the Android Rust (cargo-ndk) build, clippy -D warnings, and rustfmt all green. Windows/Apple builds, all on-glass checks, and the live native loopback are pending (CI / a free box). Also lands the concurrent in-tree HEVC 4:4:4 host work (PUNKTFUNK_444): it shares the same touched files (quic.rs, punktfunk1.rs, encode/*, ...) and so cannot be committed separately from the surround changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
94c556f0e3 |
fix(host/capture): recover from compositor loss instead of freezing
apple / swift (push) Successful in 1m1s
apple / screenshots (push) Successful in 5m7s
windows-host / package (push) Successful in 7m26s
android / android (push) Successful in 4m50s
ci / rust (push) Successful in 4m51s
ci / web (push) Successful in 50s
ci / docs-site (push) Successful in 54s
deb / build-publish (push) Successful in 2m29s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m37s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 9m1s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
When the compositor is torn down mid-stream (a Gaming↔Desktop switch removes
the virtual output), its PipeWire stream leaves Streaming for Paused rather
than disconnecting. try_latest treated that as Ok(None) ("static desktop —
repeat the last frame"), so the stream froze on the last frame forever and
neither recovery path fired: the capture-loss rebuild keys on Err, and the
session watcher keys on a session-KIND change (a desktop→desktop new KWin
instance is the same kind).
Track the PipeWire stream state via state_changed (a `streaming` flag) and,
in try_latest, surface a sustained non-Streaming state (1.5s grace for a
transient renegotiation blip) as a capture-loss Err — which the encode loop
already handles by rebuilding the pipeline in place. A static desktop stays
Streaming, so no false trigger. Complements the now-default session watcher.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
7b99b41ede |
docs(design): trim shipped plans, consolidate cluster, add index
Much of design/ described work that has since shipped. Trim each doc to
its durable rationale + still-open items (the code is the source of truth
for shipped detail; git history holds the full originals).
- Shipped plans -> status stubs: stats-capture, gamestream-host-plan,
apple-stage2-presenter, windows-service.
- Trimmed completed-out / open-kept: implementation-plan, hdr-pipeline,
host-latency, gpu-contention (fixed stale status table), game-library,
linux-setup (fixed m0->spike + stale zero-copy claim),
session-aware-host-followups, windows-client-bootstrap,
windows-dualsense-{scoping,game-detection}, windows-virtual-display,
security-review (per-finding status table; #12 still open),
apollo-comparison (shipped backlog collapsed to one-liners).
- Windows-host cluster consolidated: windows-host.md -> redirect into
windows-host-rewrite.md (whose stale scorecard is corrected -- goal1 is
merged, M4 done); windows-secure-desktop.md archived (now a fallback
behind IDD-push primary).
- Kept evergreen: ci.md, gamescope-multiuser.md, windows-build-and-packaging.md.
- New design/README.md: per-doc status table + consolidated open-items
roll-up so nothing is tracked in only one buried doc.
- Repoint 5 code comments to the archived secure-desktop doc path.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
f6490f4c28 |
fix: complete the docs/→design/ and openapi→api/ rename references
The file moves (docs/ → design/, docs/api/openapi.json → api/openapi.json) landed
in
|
||
|
|
3e7c9bd059 |
fix(host): remove unsound unsafe impl Sync for HelperRelay
apple / swift (push) Failing after 0s
release / apple (push) Failing after 0s
apple / screenshots (push) Has been skipped
windows-drivers / probe-and-proto (push) Successful in 29s
audit / cargo-audit (push) Failing after 1m20s
windows-drivers / driver-build (push) Successful in 1m14s
android / android (push) Failing after 2m5s
ci / web (push) Successful in 46s
ci / docs-site (push) Successful in 1m3s
windows-host / package (push) Successful in 6m46s
ci / bench (push) Successful in 4m34s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m25s
ci / rust (push) Successful in 8m36s
decky / build-publish (push) Successful in 22s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m11s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 59s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m37s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m3s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 29s
deb / build-publish (push) Successful in 7m50s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m52s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 1m5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m33s
flatpak / build-publish (push) Successful in 3m56s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m46s
docker / deploy-docs (push) Successful in 22s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m26s
The one genuine soundness defect the unsafe-proof program surfaced (flagged SUSPECT in program 3/N). `HelperRelay` holds an `rx: Receiver<RelayAu>`, which is `!Sync` (std mpsc is single-consumer), so asserting `Sync` claimed more than the fields support — an `Arc<HelperRelay>` recv'd from two threads would compile and be UB. It was never live-exploited, and it turns out `Sync` is also unnecessary: the relay is a single-owner `mut relay` local in the punktfunk1 two-process mux loop (recv_timeout/try_recv/request_keyframe all called on the owning thread; no `Arc`, no `thread::spawn` capturing it). So the fix is simply to delete the impl — the struct keeps its sound `unsafe impl Send` (needed for the raw `HANDLE` fields), which is all the code uses. Box-verified: cargo clippy -p punktfunk-host --features nvenc --target x86_64-pc-windows-msvc -- -D warnings stays green without the Sync impl. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
7aa787a789 |
docs(host): prove the last 3 files + crate-root deny (unsafe-proof program 4/N, final)
Completes the unsafe-proof program now that the parallel WIP has landed: - idd_push.rs (25 sites), nvenc.rs (7), punktfunk1.rs (21): a SAFETY proof on every unsafe block — D3D11/DXGI COM (same-device textures, immediate-context single-thread, keyed-mutex-held convert), the NVENC SDK table (versioned POD, register/map/lock-bitstream pairing), cross-process shm reads (atomic magic/generation handshake), and the C-ABI harness (each call cross-checked against its abi.rs `# Safety` doc). No SUSPECT (UB) blocks. - capture.rs / encode.rs: the parent-module deny is restored (their WIP children are now proven), and main.rs gains a crate-root #![deny(clippy::undocumented_unsafe_blocks)] — the permanent catch-all gate so no future unsafe block anywhere in the crate can land without a proof. - Fixed 4 blocks the agents missed: unsafe blocks nested inside `assert_eq!(...)` macro args (the comment-above-statement didn't associate) — hoisted to a `let`. - rustfmt-canonicalized the Windows files (the agents' SAFETY comments + some pre-existing 1.9.0 drift) so `cargo fmt --all --check` is clean. Verified: cargo clippy -p punktfunk-host --all-targets -- -D warnings AND cargo fmt -p punktfunk-host --check both green with the crate-root deny active. Windows cfg(windows) re-verified on the box next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
3514702d8c |
feat(windows-host): IDD-push encodes native NV12/P010 (skip NVENC's SM-side CSC)
GPU-contention work (host-latency plan §5.A): the IDD-push output ring now hands NVENC native YUV instead of RGB, so NVENC skips its internal RGB→YUV colour conversion on the SM/3D engine the running game saturates. - idd_push.rs: out_ring is now NV12 (SDR, BT.709 limited) via a D3D11 VIDEO-engine BGRA→NV12 VideoConverter (keeps the CSC off the contended 3D/compute engine), or P010 (HDR, BT.2020 PQ limited) via the FP16→P010 shader (NVIDIA's VideoProcessor can't do RGB→P010). The ring drops its per-slot RTV (textures only), matching the WGC YUV ring; converters rebuild on a size/HDR flip. - nvenc.rs: NV12 input forces bit_depth=8 so an HDR→SDR toggle (or a 10-bit- negotiated client on an SDR display) re-inits the session at the matching depth — NV12 can't feed a 10-bit session (register_resource rejects it). - punktfunk1.rs: per-stage latency instrumentation under PUNKTFUNK_PERF (cap=try_latest, submit=encode_picture, wait=lock_bitstream µs p50/p99/max) to pinpoint where capture→encoded latency goes under GPU saturation. |
||
|
|
327a5fa828 |
docs(host): prove unsafe blocks in the Windows + cross-platform files + gate them (unsafe-proof program 3/N)
Continues the unsafe-proof program across the Windows/cross-platform host files
(~75 blocks, 21 files), each with a SAFETY proof of the real invariant and a
per-file #![deny(clippy::undocumented_unsafe_blocks)] gate:
capture/windows: dxgi.rs, wgc_relay.rs, wgc.rs, desktop_watch.rs, composed_flip.rs
(windows-rs COM: interface validity, same-D3D11-device textures,
immediate-context single-thread, borrowed args outlive the call)
windows: service.rs (SCM/token/CreateProcessAsUserW/event handles — OwnedHandle
liveness, no double-close/signal race), win_display, wgc_helper, interactive
vdisplay/windows: manager.rs, pf_vdisplay.rs (SwDeviceCreate/IddCx/ioctl handle
liveness via the OnceLock VDM singleton + OwnedHandle)
encode/windows: ffmpeg_win.rs (full AVBufferRef refcount audit — balanced, NO leaks,
unlike the vaapi sibling), sw.rs
cross-platform: gamestream/audio.rs (libopus), gamestream/stream.rs (sendmmsg),
inject/windows/sendinput.rs, audio/windows/wasapi_mic.rs,
session_tuning.rs, vdisplay.rs
Two findings (handled separately):
- wgc_relay.rs `unsafe impl Sync for HelperRelay` is UNSOUND (its mpsc Receiver is
!Sync) though not live-exploited — marked SUSPECT inline; fix pending box check
(it touches the in-flight punktfunk1.rs).
- capture.rs / encode.rs (PARENT modules of the WIP idd_push.rs / nvenc.rs) do NOT
get the file deny yet — it would propagate the lint into the undocumented WIP
children. The deny lands there once those are documented (after the WIP commits).
Linux-visible parts verified green (cargo clippy -p punktfunk-host --all-targets
-- -D warnings). The cfg(windows) deny gates are box-verified next.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
ba68a98873 |
docs(host): prove every unsafe block in the Linux FFI files + gate them (unsafe-proof program 2/N)
Continues the structural unsafe-proof program (every unsafe carries a documented
proof of soundness; the file gains #![deny(clippy::undocumented_unsafe_blocks)]
so it stays proven). This batch covers all 10 remaining pure-Linux files
(104 blocks), each proof stating the REAL invariant — not boilerplate:
zerocopy/cuda.rs (26) leaked process-lifetime libcuda fn-ptr table; opaque
CUcontext never dereferenced; free-exactly-once via the
Arc<Mutex<PoolInner>> ownership graph; dmabuf fd take/close split
zerocopy/egl.rs (18) eglGetProcAddress'd procs with the GL context current;
EGLImage liveness; the two-call modifier-query bounds
zerocopy/vulkan.rs (4) copy-bounds arithmetic (src_size>=span); Send = thread
confinement to the punktfunk-pipewire thread
dmabuf_fence.rs (4) poll/ioctl/close fd liveness + ownership
capture/linux/mod.rs (16) spa_data repr(transparent) cast; null-checked spa
derefs; single-loop-thread buffer ownership until requeue
inject/linux/gamepad.rs (10) uinput ioctl request-number ↔ struct-size match
(static-asserted); InputEventRaw no-padding for the byte cast
encode/linux/vaapi.rs (15) + encode/linux/mod.rs (9) ffmpeg object ownership/
free ladders; VAAPI/DRM graph; Send = single-thread transfer
inject/linux/wlr.rs (2), vdisplay/linux/kwin.rs (1)
No memory-unsafety SUSPECT blocks were found — the unsafe is sound. The vaapi
agent did flag two real AVBufferRef *leaks* (not UB) in DmabufInner::open; marked
inline with NOTE(leak) and addressed in a follow-up.
Verified: cargo clippy -p punktfunk-host --all-targets -- -D warnings is clean
(each file's deny gate hard-errors on any undocumented block).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
bd05bc8c30 |
fix(windows): clippy/build cleanups the on-glass build surfaced (-D warnings)
Built the host crate (`cargo clippy --features nvenc -D warnings`) and the driver workspace (`cargo build`) on the RTX box — the project's intended Windows gate, which `cargo check` (what the goal1/§2.5 work used) never runs. It surfaced lint issues accumulated across the goal1 / §2.5 / this-session Windows work: - 9× redundant `as *mut c_void` after `.as_raw_handle()` (already `*mut c_void`): idd_push.rs (3, this session), service.rs (3, this session), manager.rs (3, pre-existing §2.5 — my OwnedHandle work copied the idiom). Removed the casts + the now-unused `use std::ffi::c_void` in idd_push.rs / manager.rs (service still uses it). - `if_same_then_else` in session_plan.rs::resolve_topology (pre-existing goal1 stage 3): collapsed the two `false` arms into one condition (behavior identical). - `unused_unsafe` in the driver `pod_init!` macro: it expands at call sites already inside an `unsafe` block, where its own `unsafe` is redundant — `#[allow( unused_unsafe)]` (needed at the non-unsafe sites, redundant at the nested ones). After these, BOTH builds are clean on the box — validating the whole session's blind Windows + driver work compiles + passes clippy on real hardware. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
658564353c |
refactor(windows-host): KeyedMutexGuard RAII for the IDD-push consume hot loop (Goal-3, hw-validated)
The IDD-push consume loop acquired the slot's keyed mutex by hand (`AcquireSync(0,8)` … work … `ReleaseSync(0)`), with a comment warning that a `?`-return between acquire and release would leak the lock and stall the driver on that slot — the reason the HDR converter is built *before* the acquire. Replace with a `KeyedMutexGuard` RAII (acquire → `ReleaseSync` on drop), scoped to JUST the convert/copy block so the lock releases at the EXACT same point as before (the driver gets the slot back immediately; not held across the rest of `try_consume`). Now the release can't be skipped on any early return/panic — the leak footgun is gone by construction, and the hot loop has no raw `ReleaseSync`. Behavior/latency-equivalent (same acquire params, same release point). Windows- only (CI + on-glass gated); to be validated on the RTX box (host clippy build + a PERF=1 latency A/B vs the shipping binary — the change should show no delta). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
011607ec10 |
refactor(windows-host): RAII for IDD-push handles/views — fix a leak (Goal-3 unsafe reduction #1)
The IDD-push capturer held raw `HANDLE`s for the shared header mapping, the
frame-ready event, the debug section, and each ring slot's shared texture, with
manual `CloseHandle` scattered across two `Drop` impls — and the MapViewOfFile
VIEWS (header/dbg_block) were never UnmapViewOfFile'd (a real view leak).
- New `MappedSection { handle: OwnedHandle, view }` RAII: `Drop` UnmapViewOfFile's
the view THEN the `OwnedHandle` closes the mapping (unmap-before-close).
- `map`+`header` → `section: MappedSection` (+ a cached `header` ptr borrowing into
it, declared after `section` for drop order); same for `dbg_map`+`dbg_block`.
- `event: HANDLE` → `OwnedHandle` (borrowed as `HANDLE(as_raw_handle() as *mut
c_void)` for WaitForSingleObject); `HostSlot.shared` → `OwnedHandle` (its manual
`Drop` deleted). Removed the manual `CloseHandle`s + the `CloseHandle` import.
Net: deletes two `Drop` impls' worth of manual handle/view teardown and fixes the
view leak — fewer unsafe ops, RAII-correct. Behavior preserved (recreate_ring
writes the header in place; the keepalive still drops last so REMOVE is last).
Windows-only (CI-gated); adversarially reviewed (no double-free / UAF / dangling
header; handle interop matches manager.rs). Linux check unaffected.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
00cf51d610 |
refactor: rename pf-vdisplay-proto -> pf-driver-proto (it spans all drivers)
The shared host<->driver ABI crate already contains more than the virtual display: the IDD-push frame ring + control plane AND the gamepad shared-memory layouts (XusbShm / PadShm). "pf-vdisplay-proto" was a misnomer — the name now represents all the drivers it serves. Mechanical rename, no behavior change: - git mv crates/pf-vdisplay-proto -> crates/pf-driver-proto (package name + path-deps in the host crate and the driver workspace). - pf_vdisplay_proto -> pf_driver_proto across host + driver Rust, both Cargo.lock files, the workspace members, the CI path triggers (windows-drivers.yml), and the docs/INF comments. The runtime Global\pfvd-* shared-object names are a SEPARATE contract and are deliberately untouched (host<->driver name matching). - The pf-vdisplay DRIVER crate + its INF service name (Root\pf_vdisplay, UmdfService=pf_vdisplay, pf_vdisplay.dll) are unchanged — only the full `pf_vdisplay_proto` token was replaced, never the `pf_vdisplay` driver name. Linux-verified: cargo test -p pf-driver-proto (const size-asserts compile) + cargo clippy -p punktfunk-host -D warnings clean; Cargo.lock regenerated. The driver-workspace side (path-dep + imports + its Cargo.lock) is Windows-CI-gated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
84a3b95f17 |
refactor(windows-host): delete the SudoVDA backend — pf-vdisplay is the sole vdisplay (Goal 2)
Goal 2 ("drop every trace of SudoVDA") is done. The SudoVDA driver is no longer
shipped (only pf-vdisplay; the old vdisplay-driver tree was deleted in
|
||
|
|
0255a8289c |
docs(windows-host): consolidate 5 scattered docs into one current source of truth
The Windows-host docs were scattered across a design plan, a staged-refactor plan, an audit, an audit-remediation tracker, and a game-capture-bug analysis — several badly stale (the audit/remediation predate the Goal-1 branch landing and call DONE items "not started"). Verified the true state of every audit finding / goal / milestone against current code+git (4-agent workflow), then rewrote windows-host-rewrite.md as ONE consolidated, accurate doc: - §1 Status scorecard (Goals 1-3, M0-M6, GB1, audit P0/P1/P2) with DONE/PARTIAL/ OPEN + commit evidence. - §2 Architecture as-built (layering, HostConfig→SessionPlan→SessionContext, the VirtualDisplayManager ownership model, IDD-push-primary capture incl. secure desktop + GB1 recovery, encode/EncoderCaps, pf-vdisplay-proto, the driver, service/packaging). - §3 Validated invariants (the jewels). - §4 Prioritized open tasks (the genuine remaining work). - §5 Operations (RTX-box recipe, CI, env, build). - §6 Deep reference (/INTEGRITYCHECK answer, the 6 iddcx bindgen knobs, the driver port checklist, resolved decisions). Deleted the four now-redundant docs (content folded in; history in git): windows-host-goal1-plan.md, windows-host-rewrite-audit.md, windows-host-rewrite-remediation.md, windows-host-rewrite-game-capture-bug.md. Repointed the 6 code/proto/driver doc-comment refs that targeted them at the consolidated windows-host-rewrite.md sections. Linux cargo check clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
15202011c1 |
refactor(windows-host): §2.5 step 1 — delete the dead/write-only monitor-lifecycle code
Removes the cruft the §2.5 ownership-model rewrite would otherwise carry forward, and corrects a
false invariant the docs described:
* CURRENT_MON_GEN (sudovda) — the "current monitor generation" global was WRITE-ONLY. It was
stored on every mgr_acquire (both backends) but its only reader, idd_push's `my_gen`, was set
and NEVER read. The "session capturer re-checks the monitor gen each frame and bails on a
reconnect" behaviour the doc describes was never wired — per-frame staleness is the SEPARATE
ring FrameToken.generation / IDD_GENERATION mechanism (which works and is untouched). So the
monitor-gen-via-WinCaptureTarget carry the design proposed is unnecessary. Deleted the static,
its stores in both backends, the pf_vdisplay import, and idd_push's dead `my_gen` field/read.
(MON_GEN — the lease-generation counter behind the stale-lease no-op — is REAL and kept.)
* IDD_PERSIST + open_or_reuse + IddReuseHandle (idd_push) — a persistent-capturer reuse path
from an early prototype, defined but with ZERO callers across the crate. Deleted, plus the now
-orphaned `use std::sync::Mutex` and the now-dead `set_client_10bit` setter.
Windows-only; grep confirms no remaining references to any deleted symbol. Box build to follow.
First of the incremental §2.5 steps (user-approved OnceLock VirtualDisplayManager design).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
38c68c33e5 |
refactor(windows-host): confine platform code under windows/ + linux/ folders (Goal-1 stage 6)
Move 36 platform-specific files into per-module `windows/` and `linux/` subfolders (and the
shared HID codecs into `inject/proto/`):
capture/{windows,linux}/ encode/{windows,linux}/ inject/{windows,linux,proto}/
audio/{windows,linux}/ vdisplay/{windows,linux}/
src/windows/ (service, wgc_helper, win_adapter, win_display)
src/linux/ (dmabuf_fence, drm_sync, zerocopy/)
Done with `#[path]`, NOT a module rename: every file moves into its folder while the
`crate::*::*` module names stay FLAT, so all caller paths and every internal `super::`/`crate::`
reference are unchanged — only the parent `mod` decls gained `#[path = "..."]`. This is the
codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in
behaviour with ZERO reference churn, far lower risk than collapsing to a single
`crate::capture::windows::` namespace (that deeper rename is an optional follow-on; this delivers
the cfg-sprawl folder confinement the stage is about). Done LAST, after the semantic stages, so
the path churn didn't fight them.
Verified: Linux cargo check + clippy (-D warnings) clean; my mod-decl changes fmt-clean (the 3
remaining fmt diffs are pre-existing local-rustfmt-version skew that moved with their files); all
36 `#[path]` targets exist; no internal `#[path]`/`include!`/file-child-mod in any moved file
(the inline `mod X {` blocks are self-contained). Box build to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
a0427cd2a3 |
feat(windows-host): OutputFormat into the capturer — kill the dxgi back-reference (Goal-1 stage 5, tightening 1)
The headline §2.3 seam tightening (the explicit Stage-3 deferral; §5's "highest-severity
coupling"): the capturer is now TOLD its output format instead of re-deriving the encode backend.
New `capture::OutputFormat { gpu, hdr }`, resolved once per session and passed INTO
capture_virtual_output:
* native punktfunk/1 path: `SessionPlan::output_format()` (gpu = encoder.is_gpu(), from the
already-resolved plan.encoder — no second probe; hdr = plan.hdr).
* GameStream + spike paths: `OutputFormat::resolve(hdr)` (gpu from the single `gpu_encode()`
source, which maps windows_resolved_backend()).
`capture/dxgi.rs DuplCapturer::open` takes `gpu` in and its internal
`!matches!(windows_resolved_backend(), Software)` recompute is DELETED — the capture layer no
longer re-calls the encode layer (the back-reference that could let capture and encode disagree
on whether frames are GPU-resident, plan §2.3/§5). The relay's secure-desktop DDA passes
`gpu_encode()` likewise.
Behavior-preserving: the `gpu` passed in equals the value the capturer used to compute (same
encode-backend resolution). The DDA opens keep `want_hdr=false` (the SDR fallback, unchanged).
Tightenings 2 (HDR/release -> VirtualLease) and 3 (EncoderCaps) split off: (2) needs the
monitor-generation carried on the lease + the keepalive becoming Box<dyn VirtualLease> — that's
the §2.5 ownership-model change (CURRENT_MON_GEN / sudovda::wait_for_monitor_released), so it
moves there; (3) is a small additive follow-on. Documented in the plan.
Verified: Linux cargo check + clippy (-D warnings) + fmt clean. Box build to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
e5057f6cc1 |
feat(windows-host): finish HostConfig migration — resolve operator/dispatch knobs once (Goal-1 stage 2)
Migrate 31 genuinely-constant operator/dispatch env::var sites onto HostConfig, so the
capture/topology/encoder decision reads ONE owner instead of being recomputed at each call
site (the latent bug where capture and encode could disagree on the resolved backend, plan §2.4):
idd_push x7, no_wgc, capture_backend, render_adapter, encoder_pref (Linux open_video +
linux_zero_copy_is_vaapi), the Windows vdisplay-backend select, plus the plan-named
secure_dda/idd_depth/zerocopy/ten_bit and the multi-site perf x4 / compositor x5 /
video_source x3 / gamepad. Each HostConfig field's parser is byte-identical to the read it
replaced, so old==new by construction (the plan's "a flipped bool is a silent regression" guard).
Scope correction — the plan's "~64 sites / Linux XDG+compositor included / grep env::var -> 0"
was unsafe as written. Two classes are deliberately KEPT as live reads and documented in config.rs:
* Runtime-mutated session vars. vdisplay::apply_session_env REWRITES the process env on every
connect (the Bazzite Gaming<->Desktop follow): WAYLAND_DISPLAY, XDG_CURRENT_DESKTOP,
XDG_RUNTIME_DIR, DBUS_SESSION_BUS_ADDRESS, and the derived PUNKTFUNK_INPUT_BACKEND,
GAMESCOPE_SESSION/NODE, KWIN/MUTTER_VIRTUAL_PRIMARY, FORCE_SHM. Parsing these once would
freeze them at startup and silently break session-following — they are NOT constant.
* Single-use local tuning with no resolve-once benefit (and FEC_PCT even has two different
semantics): FEC_PCT, VIDEO_DROP, VBV_FRAMES, SPLIT_ENCODE, PACE_BURST_KB, the dxgi timing
knobs, the *_LIVE/test gates, plus path/dynamic reads (config-dir, PATH search,
env-forward-to-child). PUNKTFUNK_ZEROCOPY is split on purpose: Windows presence-semantics
moved to the field; Linux keeps its own truthy (1|true|yes|on) parser.
Verified: Linux cargo check + clippy (-D warnings) + fmt clean on the touched files. The
Windows-only edits are 1:1 substitutions; they get a real Windows compile on the box with Stage 3.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
c87bfe0e7b |
feat(windows-host): IDD-push recovers from a game mode-set, else drops (game-capture bug GB1)
The bug: a fullscreen game mode-sets the virtual display (format/size); the driver's publish() guard then drops every frame; the host's ring — fixed at the session-negotiated mode — never adapts -> frozen picture, then black on reconnect. RECOVER (no DDA, per the chosen design): the ring now TRACKS the display's actual mode. At open it is sized to the display's actual resolution (new win_display::active_resolution, CCD/GDI) — so reconnecting while a game holds a different mode just works. Mid-session, the 250ms poll (was HDR-only) now also follows the active resolution; on any descriptor change (size or HDR) it recreates the ring at the new mode (recreate_ring generalized to a new size) -> the driver re-attaches -> frames resume at the game's mode. No freeze, no reconnect needed. DROP if unrecoverable: a descriptor change starts a recovery clock (recovering_since); if no fresh frame resumes within 3s (e.g. an exclusive-flip the host can't follow), try_consume bails -> the session ends cleanly -> the client reconnects, instead of freezing forever. A pure idle desktop (no mode change) never triggers this. Verified: host clippy (nvenc) clean on the RTX box. NEEDS ON-GLASS (Doom repro on .158): confirm the poll sees the mode-set, the ring recreates + recovers, the encoder+client adapt to the size change; tune the 3s window. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
f98ab07dd6 |
feat(windows-host): IDD-push first-frame failover to DDA (game-capture bug GB1 pt1)
wait_for_attach now requires the driver to publish a FIRST frame, not just attach (DRV_STATUS_OPENED). A fullscreen game can leave the virtual display in a format/size the driver's publish() guard rejects -> the driver ATTACHES but silently drops every frame; previously the host sailed past open() and only died on next_frame's 20s deadline (the 'reconnect = black + working audio' symptom). Now open() fails -> capture.rs falls back to DDA (reusing the C1 fallback) -> the game is captured + visible after a reconnect. Safe at open: the OS composites the freshly-activated virtual display, so a frame arrives within ~1s — a normal/idle open isn't false-failed; only a genuinely-broken display (no frame in 4s) falls back (and DDA is a working path, so even a false-positive degrades gracefully). GB1 Stage 1a (docs/windows-host-rewrite-game-capture-bug.md P3). The mid-session-without-reconnect live failover (composing capturer) is the next piece. Verified: host clippy (nvenc) clean on the RTX box. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e60cda3939 |
refactor(windows-host): move CCD/HDR display helpers to a neutral module — F1 complete (audit §9)
Moved the remaining 6 SudoVDA reach-in helpers + SavedConfig (resolve_gdi_name, set_advanced_color, advanced_color_enabled, set_active_mode, isolate/restore_displays_ccd) verbatim from vdisplay::sudovda into a backend-neutral crate::win_display module (the plan's windows/display_ccd.rs). The capturers (idd_push/dxgi/wgc), pf_vdisplay, and punktfunk1 now depend on these as PEERS via crate::win_display instead of reaching into the SudoVDA backend. With win_adapter (F1 pt1), all 7 reach-in helpers are now neutral — the circular reach-in is broken, so SudoVDA can eventually be deleted (Goal 2) without losing the display utilities. sudovda re-exports the ones it still uses internally; its now-unused CCD/GDI imports were removed. Verified: host clippy (nvenc) clean on the RTX box; Linux check clean (the new modules are #[cfg(windows)]). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
d638a93e04 |
refactor(windows-host): move resolve_render_adapter_luid to a neutral module (audit §9 / F1 pt 1)
The discrete-render-GPU LUID picker was display-utility living in the SudoVDA backend; moved it verbatim to a backend-neutral crate::win_adapter module (the plan's windows/adapter.rs). The IDD-push capturer + pf-vdisplay backend now depend on it as a PEER instead of reaching into vdisplay::sudovda — the first step in breaking the circular reach-in so SudoVDA can eventually be dropped (Goal 2). sudovda re-exports it for its own callers. Remaining F1 increments: the CCD/HDR helpers (resolve_gdi_name, set_advanced_color, advanced_color_enabled, set_active_mode, isolate/restore_displays_ccd) → a neutral win_display module. Verified: host clippy (nvenc) clean on the RTX box. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
b0d28380b5 |
feat(windows-host): rotate out-ring on repeat + size HDR ring at open (audit §5.3/§5.4)
§5.3 (C3): repeat_last() now copies the last frame into a FRESH rotated out-ring slot instead of re-handing last_present's slot, so a repeat (static desktop) never re-hands a slot still encoding under pipeline_depth>1. OUT_RING(3) > max depth(2) keeps the rotated slot free — the out-ring rotation contract now holds for repeats too, not just the synchronous-loop assumption. §5.4 (C4): when enabling advanced color for a 10-bit client, trust set_advanced_color success and size the ring FP16 directly, instead of racing the advanced_color_enabled poll (which could size SDR while the driver composes FP16 -> format mismatch -> an immediate ring recreate + dropped first frames). Verified: host clippy (nvenc) clean on the RTX box. On-glass to confirm: HDR-client first-frame + static-desktop pipelining. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
ed583650a6 |
feat(windows-host): IDD-push attach fallback to DDA, not the 20s black bail (audit §5.1)
open() now hands the keepalive BACK on failure (the WGC attach_keepalive pattern) so the caller can fall back instead of tearing the virtual display down. Added a bounded wait_for_attach() that polls the driver's DRV_STATUS_OPENED — it checks ATTACH status, not frame arrival, so it never false-fails on an idle desktop that has composed no frame yet.
An attach failure (e.g. a hybrid-GPU render mismatch -> DRV_STATUS_TEX_FAIL, or the driver never opening the ring within 4s) now fails open() -> capture.rs falls back to DDA, instead of next_frame's 20s deadline leaving the session black. Pairs with the driver SET_RENDER_ADAPTER fix (
|
||
|
|
e2f004589c |
feat(windows-drivers): STEP 6 — IDD-push FramePublisher (driver) + host migration to proto::frame
apple / swift (push) Failing after 1s
apple / screenshots (push) Has been skipped
windows-drivers / probe-and-proto (push) Successful in 19s
windows-drivers / driver-build (push) Successful in 1m9s
ci / rust (push) Successful in 1m31s
ci / web (push) Successful in 42s
ci / docs-site (push) Successful in 1m2s
android / android (push) Successful in 3m50s
deb / build-publish (push) Successful in 2m37s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
windows-host / package (push) Successful in 5m20s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m37s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m32s
docker / deploy-docs (push) Successful in 16s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m19s
The driver now publishes each acquired swap-chain surface into the host-created shared ring (the IDD-push path) — the full glass-to-glass transport is code-complete. Both sides use the canonical pf_vdisplay_proto::frame layout (lockstep by compile-error, not "must match" comments). Driver compiles + LOADS on-glass (adapter inits, Status=OK; no regression — the publisher is dormant until a frame is acquired); host cargo check green; adversarially reviewed (no blockers — token layout, keyed-mutex key 0, names by target_id, and the format guard all match the host consumer). - new driver frame_transport.rs: FramePublisher OPENS the host ring by target_id (OpenFileMapping header + magic Acquire readiness gate + OpenEvent + OpenSharedResourceByName RING_LEN keyed-mutex textures), writes its render LUID + DRV_STATUS back into the header; publish() is NON-BLOCKING (round-robin 0ms try-acquire -> CopyResource -> ReleaseSync -> FrameToken::pack store Release -> SetEvent; drops the frame if every slot is busy or the surface format != the ring format). Manual handle/view cleanup on every try_open early return; RAII Drop (slots -> unmap -> CloseHandle). Layout/consts/names/token all from pf_vdisplay_proto::frame. - swap_chain_processor.rs run_core: lazy rate-limited attach (every ~30 frames) + is_stale re-attach (mid-session HDR ring recreate); publishes buffer.MetaData.pSurface via IDXGIResource::from_raw_borrowed (preserves IddCx's refcount) BEFORE IddCxSwapChainFinishedProcessingFrame. run/run_core gain the render LUID; callbacks.rs assign_swap_chain passes it. - host idd_push.rs migrated onto pf_vdisplay_proto::frame (deleted the hand-rolled SharedHeader / MAGIC / VERSION / RING_LEN / DRV_STATUS_* / name fns / token packing) — pure refactor, byte-identical, no behavior or gating change. DebugBlock + DXGI_SHARED_RESOURCE_RW kept local (not in the proto). - driver windows crate gains Win32_System_Memory (MapViewOfFile/OpenFileMappingW/...); rustfmt'd the whole driver workspace (incl. wdk-probe — fmt-only). Built via the ultracode flow: STEP-6 map workflow -> agent-implement -> box build (driver + host both green; caught nothing this time) -> adversarial-verify-agent (no blockers) -> FrameToken::pack hardening -> deploy (loads). Glass-to-glass frame validation awaits a composited session (per the parity finding: this headless box yields 0 frames for the proven SudoVDA path too). FOLLOW-UPs: port the optional Global\pfvd-dbg DebugBlock triage channel to the new driver; STEP 7 HDR; STEP 8 drop SudoVDA. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e2c9bfd3d9 |
feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s
HDR (display-driven, matching the WGC path): - CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows offers "Use HDR" on the virtual display. The host FOLLOWS the display's live advanced-color state, recreating the shared ring at the matching format (FP16 in HDR / BGRA in SDR) on a toggle — no freeze. - Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT). Generic HDR10 mastering SEI on every IDR. - Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach kill the toggle-time garbage frame and any stale-ring read. Perf: - Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1 before polling N so the convert/copy on the 3D engine overlaps the NVENC encode of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous). - Rotating host output ring (OUT_RING) so the in-flight encode and the next convert never touch the same texture. - HDR converts directly from the keyed-mutex slot's SRV into the output ring (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in. The slot mutex is held only across the convert/copy, not the encode. RING_LEN 3->6 for publish headroom. - Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low new_fps at a high send rate means the source isn't compositing, not an encode stall). Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes ~180 new fps, encode 240 fps @ ~4.3 ms p50. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
72eeedc4da |
feat(windows): AMD (AMF) + Intel (QSV) hardware encode on the Windows host
The Windows host was NVIDIA-only (NVENC) with an openh264 software fallback. Add
AMD AMF and Intel QSV via libavcodec — the Windows analogue of the Linux VAAPI
backend — so one installer serves all three GPU vendors.
- encode/ffmpeg_win.rs: new WinVendor{Amf,Qsv} encoder. System-memory NV12/P010
readback (default, robust) + opt-in zero-copy D3D11 (PUNKTFUNK_ZEROCOPY: shares
the capturer's ID3D11Device; AMF takes AV_PIX_FMT_D3D11, QSV derives a QSV frames
ctx and maps) with a system fallback for the format-group mismatch the capturer's
video-processor fallback can produce. HDR Main10 (P010 + BT.2020/PQ VUI; an
Rgb10a2->P010 swscale covers the shader fallback).
- encode.rs: Codec::amf_name/qsv_name; open_video + windows_resolved_backend()
resolve PUNKTFUNK_ENCODER=auto|nvenc|amf|qsv|sw via a DXGI adapter VendorId probe.
- capture/dxgi.rs: gpu_mode mirrors the resolved backend (D3D11 NV12/P010 for AMF/QSV).
- gamestream/serverinfo.rs: GPU-aware codec advertisement (windows_codec_support;
AV1 gated to RDNA3+/Arc, like the VAAPI path).
- Cargo.toml: amf-qsv feature (optional ffmpeg-next in the windows target block).
- CI/installer: windows-host.yml sets FFMPEG_DIR + builds --features nvenc,amf-qsv;
the Inno installer bundles the FFmpeg DLLs; host.env default nvenc -> auto.
CI-green target; AMF/QSV not yet on-glass validated (no AMD/Intel Windows box in the
lab) — NVENC stays live-validated. An adversarial-review pass caught + fixed real
FFI bugs (AV_PIX_FMT_P010 is a macro -> P010LE; windows-rs 0.62 GetImmediateContext/
GetDesc1 return Result; AV_HWFRAME_MAP_* is a bindgen enum with no BitOr).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
3526517eb1 |
feat: HDR Step-0 colour-metadata transport + security-audit hardening
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 57s
ci / web (push) Successful in 39s
ci / docs-site (push) Successful in 38s
windows-host / package (push) Successful in 3m26s
android / android (push) Successful in 3m40s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m24s
deb / build-publish (push) Successful in 2m10s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m22s
decky / build-publish (push) Successful in 25s
ci / bench (push) Successful in 4m44s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 1m4s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m7s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m45s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 30s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m37s
flatpak / build-publish (push) Successful in 4m17s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m30s
docker / deploy-docs (push) Successful in 23s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
Two strands, entangled in punktfunk1.rs, committed together (one builds-green tree). HDR pipeline Step 0 — glass-to-glass colour-metadata transport (docs/hdr-pipeline-plan.md): - Protocol/ABI: ColorInfo on the Welcome + a 0xCE HdrMeta datagram carry the source colour space + HDR10 static mastering metadata (quic.rs, abi.rs connect_ex5 fixing caps=0). - New platform-independent, unit-tested HDR static-metadata helpers (hdr.rs): chromaticities (1/50000), mastering luminance (0.0001 cd/m2), MaxCLL/MaxFALL in HDR10/ST.2086 units. - Capture/encode hooks (capture.rs, encode.rs set_hdr_meta) + Linux client / probe plumbing. Security-audit hardening — top 3 from docs/security-review.md, each adversarially verified: - #1 [HIGH] Secret file permissions. The host key.pem/cert.pem and both trust stores are now written owner-only: 0600 + dir 0700 on Unix (mirrors mgmt_token), best-effort SYSTEM/Administrators/OWNER-only icacls DACL on Windows (%ProgramData% is Users-readable). Closes a local key-disclosure -> host-impersonation gap. New gamestream::{create_private_dir, write_secret_file} + a 0600 regression test. - #2 [HIGH] Native SPAKE2 PIN is single-use. The PIN is consumed the moment the host sends its key-confirmation (which lets the client test its one guess), before reading the proof, so any completed attempt -- right OR wrong -- disarms the window. A wrong PIN isn't observable host-side (the client aborts before sending its proof), so consuming on first attempt is what delivers the documented "one online guess" instead of an unbounded brute-force of the static 4-digit PIN. Test verifies single-use. - #3 [MEDIUM] RTSP packetSize is bounded ([64,2048] in stream_config) and VideoPacketizer::new uses saturating .max(1), killing a PRE-AUTH div-by-zero/underflow panic of the video thread. Tests for {0,15,16,17} + out-of-range rejection. fmt + clippy -D warnings clean; full workspace test suite green (93 host tests). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
708c62788d |
feat(host/encode): VAAPI zero-copy dmabuf import (AMD/Intel GPU CSC)
apple / swift (push) Successful in 57s
ci / rust (push) Successful in 1m39s
ci / web (push) Successful in 32s
ci / docs-site (push) Successful in 31s
android / android (push) Successful in 3m29s
windows-host / package (push) Successful in 3m39s
deb / build-publish (push) Successful in 3m7s
decky / build-publish (push) Successful in 22s
ci / bench (push) Successful in 4m43s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m24s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 22s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m22s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m53s
Phase 2 of AMD/Intel support: the VAAPI encoder now takes the capture dmabuf directly and does the RGB->NV12 colour conversion on the GPU's video engine, eliminating the host-side de-pad + swscale CSC + upload the CPU path pays. - capture: a vendor-neutral FramePayload::Dmabuf (dup'd fd + fourcc/modifier/ layout). When zero-copy is on, the EGL->CUDA importer is unavailable (any non-NVIDIA host), and the backend is VAAPI, the capturer advertises LINEAR dmabuf and hands the raw buffer to the encoder instead of CPU-copying it. - encode/vaapi: the encoder self-configures from the first frame's payload (no open_video signature change). The dmabuf arm wraps the buffer as an AV_PIX_FMT_DRM_PRIME frame and pushes it through a filter graph buffer(drm_prime) -> hwmap(vaapi) -> scale_vaapi=nv12 -> buffersink; the encoder takes NV12 surfaces straight from the sink. The Phase 1 CPU-upload path is kept as the other arm (used when capture produces CPU frames). Live-validated on a Radeon 780M (real Sway/xdpw desktop capture): correct, pixel-perfect HEVC, and ~10x less host CPU at 1440p (4.2s -> 0.4s of CPU for 300 frames) -- the de-pad/CSC/upload moves to the GPU. NVIDIA unchanged (zero-copy still imports to CUDA; the passthrough path only engages on non-NVIDIA hosts). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5e27f65f2e |
fix(host/capture): mmap the buffer fd ourselves — xdpw MemFd over-reads MAP_BUFFERS
apple / swift (push) Successful in 55s
windows-host / package (push) Successful in 2m28s
android / android (push) Successful in 10m10s
ci / web (push) Successful in 32s
ci / docs-site (push) Successful in 29s
ci / rust (push) Successful in 11m44s
deb / build-publish (push) Successful in 3m7s
decky / build-publish (push) Successful in 34s
ci / bench (push) Successful in 4m44s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 15s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m57s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m51s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 21s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m21s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m8s
docker / deploy-docs (push) Successful in 20s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m54s
The CPU de-pad path trusted PipeWire's MAP_BUFFERS slice (`d.data()`, length = `data.maxsize`). xdg-desktop-portal-wlr hands MemFd ScreenCast buffers whose maxsize exceeds the bytes PipeWire actually maps into our process, so reading to maxsize ran off the end of the mapping and SIGSEGV'd the capture thread — crashing every CPU-path capture on Sway/wlroots (and thus any non-NVIDIA host, which has no CUDA zero-copy importer and always falls back to this path). mmap the fd ourselves, sized to its real length (fstat), for any fd-backed buffer (MemFd SHM or DmaBuf); fall back to `d.data()` then drop. The existing `needed > avail` guard now drops cleanly instead of over-reading. This also subsumes the original "MAP_BUFFERS didn't map a Vulkan dmabuf" fallback. Verified: fixes real Sway-desktop portal capture -> VAAPI HEVC on a Radeon 780M (correct image + colours); the NVIDIA zero-copy path (returns before this code) and the NVIDIA/KWin CPU path (self-mmap, fd_len == maxsize) both still work. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
aef552f04a |
feat(host/windows): HDR scRGB→P010 in a shader — NVENC native P010, off the SM
apple / swift (push) Successful in 55s
deb / build-publish (push) Successful in 3m9s
decky / build-publish (push) Successful in 13s
ci / rust (push) Successful in 1m14s
ci / web (push) Successful in 30s
ci / docs-site (push) Successful in 30s
windows-host / package (push) Failing after 2m19s
android / android (push) Successful in 3m12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
ci / bench (push) Successful in 4m38s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m42s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m47s
docker / deploy-docs (push) Successful in 18s
On the Windows WGC HDR path the FP16 scRGB capture was fed to NVENC as R10G10B10A2 (BT.2020 PQ), and NVENC did the RGB→YUV CSC internally on the contended SM — adding to the encode_ms wall under a GPU-saturating game. (NVIDIA's D3D11 VideoProcessor can't do RGB→P010 for HDR; that path renders green, confirmed live — so the convert must be ours.) New `HdrP010Converter` fuses the tone-map with the BT.2020 RGB→YUV matrix and emits P010 (10-bit limited range) directly: a luma pass → an R16_UNORM plane RTV (full-res) and a chroma pass → an R16G16_UNORM plane RTV (half-res, 2x2 box average) of a DXGI_FORMAT_P010 texture. NVENC then takes native P010 and skips its SM-side convert. Gated behind PUNKTFUNK_HDR_SHADER_P010 (default OFF → the existing R10→NVENC path is byte-for-byte unchanged). Colour validated by a new `hdr-p010-selftest` subcommand: a synthetic scRGB pattern → P010 → readback, compared to a BT.2020 PQ 10-bit reference — max abs error Y=0.99 / Cb=0.82 / Cr=0.75 codes on an RTX 4090. Live-validated HDR colours correct (no green). Build + clippy (--features nvenc -D warnings) green on x86_64-pc-windows-msvc. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
1fc6f73784 |
perf(host/linux): NV12 GPU convert — feed NVENC native YUV, off the contended SM (Tier 2A)
apple / swift (push) Successful in 54s
windows-host / package (push) Failing after 2m18s
ci / web (push) Successful in 32s
ci / rust (push) Failing after 5m2s
decky / build-publish (push) Successful in 11s
android / android (push) Failing after 49s
ci / docs-site (push) Successful in 35s
ci / bench (push) Failing after 3m15s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 3m49s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 15s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 40s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 28s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Successful in 5m54s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 11s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m36s
The Linux zero-copy tiled-GL path can now produce NV12 (BT.709 limited range) on the GPU and feed NVENC native YUV, deleting NVENC's internal RGB->YUV CSC — which runs on the SM/3D-compute engine a saturating game pins at 100% (the game-vs-encode contention headache). Windows already does this via the D3D11 video processor; this closes the Linux gap. See docs/host-latency-plan.md §2A. Gated behind PUNKTFUNK_NV12 (default OFF → the RGB/BGRx path is byte-for-byte unchanged; zero regression). Only the tiled EGL/GL path converts; the LINEAR/Vulkan-bridge (gamescope) path stays RGB. - zerocopy/egl.rs: Nv12Blit — BT.709 limited Y pass (R8, full-res) + UV pass (RG8, half-res, GL_LINEAR 2x2 average); both CUDA-registered; import_nv12. - zerocopy/cuda.rs: two-plane DeviceBuffer (Y W*H@1B + interleaved UV (W/2)*2 x H/2), paired Y+UV pool, copy_mapped_nv12 + copy_nv12_to_device, on the per-thread priority stream (dmabuf-recycle sync preserved). - encode/linux.rs: nvenc_input(Nv12)->NV12; submit_cuda copies two planes into NVENC's surface; VUI signalled BT.709 limited (colorspace/range/primaries/trc). - capture/linux.rs: gate (PUNKTFUNK_NV12 && tiled), report format Nv12. - main.rs + zerocopy/mod.rs: `nv12-selftest` subcommand. Validated on RTX 5070 Ti two ways: (1) nv12-selftest — synthetic RGBA->NV12 round-trip vs a BT.709 reference, max abs error Y=0.56/U=0.33/V=0.26 LSB; (2) live capture->NV12->NVENC->decode of animated red content matches the RGB path's colour (avg RGB 230,18,18 vs 231,18,20). build/clippy/fmt green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
9c8fa9340c |
refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes
apple / swift (push) Failing after 40s
audit / cargo-audit (push) Failing after 1m12s
windows-msix / package (push) Successful in 1m37s
windows / build (push) Successful in 1m14s
android / android (push) Successful in 4m48s
ci / web (push) Successful in 27s
ci / rust (push) Successful in 4m21s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 4m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 19s
deb / build-publish (push) Successful in 6m3s
flatpak / build-publish (push) Successful in 4m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m16s
docker / deploy-docs (push) Successful in 18s
Two bodies of work in one commit (the rename moved files the fixes also touched). Naming/structure cleanup (pre-launch): - Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host, m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source-> Punktfunk1Options/Punktfunk1Source. - Clients consolidated out of crates/ into clients/: punktfunk-client-rs-> clients/probe (crate punktfunk-probe), client-linux->clients/linux, client-windows->clients/windows, punktfunk-android->clients/android/native (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI contract is unchanged). crates/ now holds only core + host. - Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site, kept only in docs/implementation-plan.md. docs/m2-plan.md-> docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated. Client loss-recovery (video froze and never recovered after a brief drop): - Export punktfunk_connection_frames_dropped through the C ABI (the core already tracked it for the client keyframe-recovery loop; it was never reachable from the ABI clients). Regenerated punktfunk_core.h. - Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll frames_dropped and request a keyframe when it climbs -- the same loss-driven recovery Linux/Windows already had. Under infinite GOP the decoder silently conceals reference-missing frames, so the decode-error trigger rarely fires. Apple rumble robustness (worked then went spotty -- DualSense + Xbox): - Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio interruption / server reset) and drop the permanent `broken` latch on a transient drive failure; latch only when the controller truly has no haptics. - Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging. Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift. Not runnable on this box (verify in CI): Gitea workflows, gradle/Android, flatpak, Swift/decky. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
4cc57d5c39 |
perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
apple / swift (push) Successful in 56s
ci / rust (push) Successful in 1m36s
android / android (push) Successful in 1m56s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 28s
deb / build-publish (push) Successful in 2m26s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m33s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m58s
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game: the per-frame capture→encode path shared the 3D engine with the game and got scheduled behind it. Rework to minimize 3D-engine work per frame: - VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input. - GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1). - WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast CopyResource to decouple its single-frame acquire/release from the async convert. - Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split (cap_wait / encode / write). Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+), latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV convert (no fixed-function unit on NVIDIA — VideoProcessing engine reads 0%) waiting behind an uncapped game under WDDM context time-slicing; Linux avoids it via gamescope capping the game to the display refresh. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
ad0cb1b582 |
feat(host/windows): capture the secure desktop in HDR via DDA (no SDR drop)
ci / web (push) Successful in 32s
ci / rust (push) Successful in 1m26s
android / android (push) Failing after 43s
apple / swift (push) Successful in 55s
deb / build-publish (push) Successful in 2m24s
decky / build-publish (push) Successful in 22s
ci / bench (push) Successful in 4m30s
ci / docs-site (push) Successful in 28s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 16s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4m1s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m31s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 21s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m15s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 7m46s
docker / deploy-docs (push) Successful in 21s
The secure-desktop DDA leg went black with HDR on: legacy DuplicateOutput (the SDR-era API) can't capture an FP16/HDR desktop, and dropping the SudoVDA out of HDR is denied on the Winlogon desktop (so the SDR-drop attempt just churned and stayed black). Instead capture HDR natively on the DDA path — the capturer already has the full FP16→BT.2020 PQ→R10G10B10A2 conversion (hdr_fp16 path), it just never requested FP16. Thread a want_hdr flag into duplicate_output: for an HDR session request DuplicateOutput1 with FP16 first and retry it (5×) instead of bailing to the HDR-incapable legacy fallback. The secure-desktop mux now reads the monitor's real HDR state and opens DDA in HDR when set — no advanced-color toggling at all. The normal-desktop DDA overlay/flip issues that pushed us to WGC don't apply to the composed Winlogon UI. want_hdr is threaded through every (re)duplication incl. ACCESS_LOST recovery. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
0ce2e37faf |
refactor(host/windows): clean up DDA path + add a proper Windows service
Final cleanup after the DDA-parity work, plus an end-user service to replace the PsExec/VBS/scheduled-task launch chain. Cleanup (behavior-preserving): - sudovda.rs: drop the dead legacy GDI isolate_displays/restore_displays (CCD is the sole isolation path), the always-empty Monitor.isolated field, and the vestigial reassert_isolation + PUNKTFUNK_ISOLATE_DISPLAYS knob; fix stale comments. - dxgi.rs: downgrade leftover debug warns/infos (DuplicateOutput1 retry, FALLBACKS, hook-hits, AcquireNextFrame idle timeout) to debug!; remove the PUNKTFUNK_NO_CURSOR per-frame test knob. Windows service (src/service.rs, `punktfunk-host service`): - SCM supervisor (windows-service crate) that duplicates its LocalSystem token, retargets it to the active console session, and CreateProcessAsUserW's the host there (Sunshine/Apollo model) — relaunching on exit and console session switch, inside a kill-on-close job object so a service crash never orphans the host. - install/uninstall/start/stop/status subcommands: one elevated `service install` registers an auto-start LocalSystem service + firewall rules + a default host.env. - Config moves to %ProgramData%\punktfunk\host.env; config_dir() now resolves to %ProgramData%\punktfunk on Windows (replacing the APPDATA=C:\Users\Public hack), with a PUNKTFUNK_CONFIG_DIR override. Logs land in %ProgramData%\punktfunk\logs\. - merged_env_block (shared with the WGC helper) now also carries RUST_LOG. - docs/windows-service.md + scripts/windows/host.env.example; windows-host.md updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
f469dfcc76 |
chore(host/windows): clean up DDA capture — fix unused imports, quiet secure-desktop log, sane retry default
- Remove 4 unused imports (PCWSTR in composed_flip, anyhow macro + SizeInt32 in wgc, Write in wgc_relay). - DuplicateOutput1 retry defaults to N=1 (immediate legacy): on the secure desktop DuplicateOutput1 is LOGON_UI-only so it always refuses, and the release-before-reduplicate + gentle recovery keep the legacy dup stable; retrying there only blocked. Still env-tunable (PUNKTFUNK_DUP_RETRY_N/_MS). - Throttle the 'using legacy DuplicateOutput' warning (expected + once-per-gentle- recovery on secure) so a lock dwell doesn't flood the log. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
dc734c711b |
fix(host/windows): re-sync thread desktop on EVERY recovery (symmetric enter/leave secure)
User's observation: entering UAC/lock works instantly, but clicking OUT of it breaks (with the disconnect sound) — Apollo's enter and leave are symmetric. Root cause: attach_input_desktop() (SetThreadDesktop to the current input desktop) was gated behind is_secure_desktop() in recreate_dupl, so: - Default->Winlogon (enter): is_secure==true -> re-attach to Winlogon -> works. - Winlogon->Default (leave): is_secure==false -> SKIP re-attach -> the capture thread stays stuck on the now-gone Winlogon desktop -> every rebuild fails -> no frames -> client timeout -> session ends -> SudoVDA removed (the disconnect sound). Fix: call attach_input_desktop() UNCONDITIONALLY on every rebuild (Apollo calls syncThreadDesktop before every duplicate), so leaving secure re-attaches to the returned desktop. reassert_isolation stays secure-only. Also stop leaking the HDESK (CloseDesktop right after SetThreadDesktop, like Apollo) so calling it on every recovery is safe. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9a9214a2d8 |
fix(host/windows): gentle DDA recovery — stop the tight teardown/recreate loop
Per the user's insight: on the secure (Winlogon) desktop the duplication dies on
every independent-flip, and our tight recovery loop tore it down + recreated it
hundreds of times/sec — that release/recreate cycle is the real kernel stress,
and it stalled the send thread long enough that the client timed out ('display
disconnected'). Normal-desktop streaming is already solid (per-session GUID
killed the collision); this only changes the loss-recovery cadence.
Gentle recovery (user chose 'keep session alive'):
- cap the cheap re-duplicate to PUNKTFUNK_RECOVER_MS (default 250ms, was 5ms)
- cap the heavy new-device rebuild to PUNKTFUNK_REBUILD_MS (default 1500ms, was
250ms) — it's the costliest teardown, throttled hardest
- repeat the last frame between attempts (no busy-spin, no 8ms sleep)
~200/s -> ~4/s teardown/recreate during a secure dwell. The session survives
lock/UAC (frozen/laggy secure screen, then clean resume on unlock) instead of
churning the kernel into a disconnect. Both cadences env-tunable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
ce84861e3a |
fix(host/windows): DuplicateOutput1 retry wait 200ms (Apollo's value), env-tunable
The old-dup kernel teardown takes ~200ms (Apollo waits exactly that), so the previous 2-16ms retries were too short and still fell through to the churning legacy dup. Bump to PUNKTFUNK_DUP_RETRY_MS (default 200) x PUNKTFUNK_DUP_RETRY_N (default 6) so the robust DuplicateOutput1 dup wins the race. Env-tunable for on-box dialing without a rebuild. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
eb451d8bc6 |
fix(host/windows): retry DuplicateOutput1 to ride out the old-dup teardown race
User's insight, and it fits the evidence exactly: in duplicate_output the FIRST
DuplicateOutput1 (called microseconds after the caller releases the old
duplication via self.dupl=None) returns E_ACCESSDENIED, but the legacy
DuplicateOutput a beat later SUCCEEDS — the only difference is TIMING. The
kernel-side teardown of the just-released duplication is async, so the immediate
DuplicateOutput1 races it ('output still duplicated' -> E_ACCESSDENIED). We then
fell straight through to legacy DuplicateOutput, which 'succeeds' into a FRAGILE
dup that churns ACCESS_LOST/MODE_CHANGE every few ms on this cross-GPU IDD
(causing the post-login freeze + UAC-confirm drop).
Fix: retry DuplicateOutput1 up to 5x with escalating 2/4/8/16 ms waits before
falling back to legacy, so the teardown finishes and the ROBUST DuplicateOutput1
dup succeeds (no churn). Bounded (~30 ms worst case) so a genuine failure still
falls back quickly. This is exactly Apollo's 2x/200ms retry rationale.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
1e1e5ce9b5 |
fix(host/windows): Option-handle the multi-line dupl.GetFramePointerShape call too
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
da43b5e8d3 |
fix(host/windows): release the old duplication before re-duplicating (THE born-lost bug)
DuplicateOutput1 returned E_ACCESSDENIED ~8815x even with PER_MONITOR_AWARE_V2 confirmed on the capture thread (thread_is_v2=true) — so DPI was NOT the cause. The real cause: DXGI permits only ONE IDXGIOutputDuplication per output, and on ACCESS_LOST you MUST release the old one before re-duplicating. Our recovery (try_reduplicate / recreate_dupl) created the NEW duplication while the OLD self.dupl was still alive → the output stayed held → DuplicateOutput1 E_ACCESSDENIED and the legacy fallback returned a BORN-LOST dup. It never converged because there was always exactly one stale dup alive at creation time. The initial open() works precisely because there's no prior dup; Apollo is clean because it releases (dup.reset()) before every re-DuplicateOutput. Fix: make self.dupl an Option and set it to None (drop → release the output) BEFORE duplicate_output in try_reduplicate and before reopen_duplication in recreate_dupl, then Some(new). acquire() gets a None-guard that synthesizes ACCESS_LOST (routes into recovery) so a transient None can't panic. All ReleaseFrame/AcquireNextFrame sites updated for the Option. This is the documented DDA recovery requirement and the one thing that distinguished our failing DuplicateOutput1 from Apollo's working one. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
c8fb4822a2 |
fix(host/windows): per-thread Per-Monitor-V2 DPI awareness so DuplicateOutput1 succeeds
The remaining born-lost ACCESS_LOST storm traces to ONE thing: our
IDXGIOutput5::DuplicateOutput1 returns E_ACCESSDENIED (0x80070005) ~4370x, so
we fall back to legacy DuplicateOutput, which yields a BORN-LOST duplication on
this hybrid box. Apollo's DuplicateOutput1 SUCCEEDS on the identical
desktop/output/4090-device → a working dup, clean capture.
Root cause: DuplicateOutput1 REQUIRES Per-Monitor-Aware-V2. At startup our
SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2) FAILS with E_ACCESSDENIED
('already set' — a manifest/runtime locked the process to a lower awareness),
and GetAwarenessFromDpiAwarenessContext reports 2 for BOTH Per-Monitor V1 and
V2, so the earlier 'awareness=2' was misleading — the process is likely V1,
which DuplicateOutput1 rejects with E_ACCESSDENIED. (Legacy DuplicateOutput has
no V2 requirement, so it 'worked' but born-lost.)
Fix: SetThreadDpiAwarenessContext(PER_MONITOR_AWARE_V2) on the capture thread
in open() — a per-thread override that takes regardless of the process default,
so DuplicateOutput1 can succeed (the working dup Apollo gets). Logs set_ok +
thread_is_v2 (via AreDpiAwarenessContextsEqual) to confirm V2 actually applied.
Topology fixes (sole display, no MODE_CHANGE) and the recovery backstops stay.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|