e2c9bfd3d94c90e8df7d795f83bdeaf5d25252f9
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
e2c9bfd3d9 |
feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
apple / swift (push) Successful in 1m4s
windows-host / package (push) Successful in 6m28s
windows-msix / package (arm64, C:\Users\Public\ffmpeg-arm64, aarch64-pc-windows-msvc, C:\t-a64) (push) Successful in 1m14s
windows-msix / package (x64, C:\Users\Public\ffmpeg, x86_64-pc-windows-msvc, C:\t) (push) Successful in 1m10s
release / apple (push) Successful in 7m53s
android / android (push) Successful in 10m33s
ci / web (push) Successful in 44s
windows / build (aarch64-pc-windows-msvc) (push) Successful in 3m4s
ci / docs-site (push) Successful in 53s
ci / rust (push) Successful in 12m22s
windows / build (x86_64-pc-windows-msvc) (push) Successful in 1m11s
apple / screenshots (push) Successful in 5m24s
deb / build-publish (push) Successful in 3m16s
decky / build-publish (push) Successful in 21s
ci / bench (push) Successful in 4m42s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 27s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 2m34s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 2m42s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 2m13s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 47s
flatpak / build-publish (push) Successful in 4m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m5s
docker / deploy-docs (push) Successful in 25s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m44s
HDR (display-driven, matching the WGC path): - CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows offers "Use HDR" on the virtual display. The host FOLLOWS the display's live advanced-color state, recreating the shared ring at the matching format (FP16 in HDR / BGRA in SDR) on a toggle — no freeze. - Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT). Generic HDR10 mastering SEI on every IDR. - Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach kill the toggle-time garbage frame and any stale-ring read. Perf: - Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1 before polling N so the convert/copy on the 3D engine overlaps the NVENC encode of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous). - Rotating host output ring (OUT_RING) so the in-flight encode and the next convert never touch the same texture. - HDR converts directly from the keyed-mutex slot's SRV into the output ring (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in. The slot mutex is held only across the convert/copy, not the encode. RING_LEN 3->6 for publish headroom. - Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low new_fps at a high send rate means the source isn't compositing, not an encode stall). Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes ~180 new fps, encode 240 fps @ ~4.3 ms p50. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9c8fa9340c |
refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes
apple / swift (push) Failing after 40s
audit / cargo-audit (push) Failing after 1m12s
windows-msix / package (push) Successful in 1m37s
windows / build (push) Successful in 1m14s
android / android (push) Successful in 4m48s
ci / web (push) Successful in 27s
ci / rust (push) Successful in 4m21s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 4m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 19s
deb / build-publish (push) Successful in 6m3s
flatpak / build-publish (push) Successful in 4m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m16s
docker / deploy-docs (push) Successful in 18s
Two bodies of work in one commit (the rename moved files the fixes also touched). Naming/structure cleanup (pre-launch): - Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host, m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source-> Punktfunk1Options/Punktfunk1Source. - Clients consolidated out of crates/ into clients/: punktfunk-client-rs-> clients/probe (crate punktfunk-probe), client-linux->clients/linux, client-windows->clients/windows, punktfunk-android->clients/android/native (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI contract is unchanged). crates/ now holds only core + host. - Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site, kept only in docs/implementation-plan.md. docs/m2-plan.md-> docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated. Client loss-recovery (video froze and never recovered after a brief drop): - Export punktfunk_connection_frames_dropped through the C ABI (the core already tracked it for the client keyframe-recovery loop; it was never reachable from the ABI clients). Regenerated punktfunk_core.h. - Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll frames_dropped and request a keyframe when it climbs -- the same loss-driven recovery Linux/Windows already had. Under infinite GOP the decoder silently conceals reference-missing frames, so the decode-error trigger rarely fires. Apple rumble robustness (worked then went spotty -- DualSense + Xbox): - Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio interruption / server reset) and drop the permanent `broken` latch on a transient drive failure; latch only when the controller truly has no haptics. - Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging. Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift. Not runnable on this box (verify in CI): Gitea workflows, gradle/Android, flatpak, Swift/decky. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
3b3e8b4ba9 |
perf(host/windows): elevate capture/encode/send thread CPU priority (Apollo-parity)
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m36s
android / android (push) Successful in 2m5s
ci / web (push) Successful in 29s
ci / docs-site (push) Successful in 29s
deb / build-publish (push) Successful in 2m31s
decky / build-publish (push) Successful in 15s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m28s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m20s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m58s
Apollo runs its capture thread at CRITICAL and its encoder thread at ABOVE_NORMAL; we set none. Our GPU work is already HIGH priority, but the GPU scheduler can only favour commands we've SUBMITTED — a normal-priority thread descheduled by a CPU-heavy game submits the convert/encode late, so the HIGH GPU priority never bites (consistent with the measured "NVENC engine idle yet the encode waits ~15 ms"). Raise the WGC helper's capture+encode loop and the single-process capture+encode loop to THREAD_PRIORITY_HIGHEST, and the transmit thread to ABOVE_NORMAL, via a cross-platform boost_thread_priority() (Windows-only effect — the Linux host caps the game via gamescope so its threads aren't starved). Not yet built/validated on the GPU box (it's down); the cross-platform side compiles (cargo check) and the Windows calls are cross-checked against the windows-0.62 API. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
4cc57d5c39 |
perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
apple / swift (push) Successful in 56s
ci / rust (push) Successful in 1m36s
android / android (push) Successful in 1m56s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 28s
deb / build-publish (push) Successful in 2m26s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m33s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m58s
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game: the per-frame capture→encode path shared the 3D engine with the game and got scheduled behind it. Rework to minimize 3D-engine work per frame: - VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input. - GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1). - WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast CopyResource to decouple its single-frame acquire/release from the async convert. - Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split (cap_wait / encode / write). Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+), latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV convert (no fixed-function unit on NVIDIA — VideoProcessing engine reads 0%) waiting behind an uncapped game under WDDM context time-slicing; Linux avoids it via gamescope capping the game to the display refresh. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9f50b3930d |
feat(host/windows): two-process secure-desktop step 4 — spawn helper + relay AUs
The SYSTEM host now sources the normal-desktop video from a user-session WGC helper instead of capturing in-process (WGC won't activate as SYSTEM). New `capture/wgc_relay.rs`: `HelperRelay::spawn` launches `m3-host wgc-helper` in the interactive user session via CreateProcessAsUserW (WTSQueryUserToken → DuplicateTokenEx(TokenPrimary) → lpDesktop="winsta0\\default", CREATE_NO_WINDOW) with three anonymous pipes — stdout (framed Annex-B AUs → parsed back to RelayAu), stdin (control: force-keyframe), stderr (helper logs → host tracing). The host holds the SudoVDA keepalive (sole isolation/topology owner); the helper captures by GDI name only. m3.rs: `virtual_stream` dispatches to the new `virtual_stream_relay` when `should_use_helper()` (running as SYSTEM, or PUNKTFUNK_FORCE_HELPER; disable with PUNKTFUNK_NO_HELPER). The relay loop feeds the existing send thread — same FEC/seal/paced-send path. Reconfigure rebuilds the output + re-spawns the helper; keyframe requests forward over the control pipe; helper pts_ns (same-machine monotonic clock) is used directly as capture_ns. Disconnect ends the stream (step 6 adds the relaunch watchdog). wgc_helper.rs: reads the stdin control byte to request an IDR; --bit-depth flag threaded through so SDR 10-bit (Main10) negotiation reaches the helper's encoder. cfg-gated windows-only; Linux/macOS build unaffected. Step 5 (DesktopWatcher mux to host DDA on the Winlogon secure desktop) is next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
a0f6cddc70 |
feat(host/windows): WGC helper subcommand (two-process secure-desktop, step 3)
`m3-host wgc-helper --target-id N --gdi NAME --mode WxHxHz --bitrate K`: the USER-session half of the two-process secure-desktop design (docs/windows-secure-desktop.md). Opens WGC on the EXISTING SudoVDA output by GDI name only (never creates a virtual output — a second topology owner re-trips the ACCESS_LOST born-lost storm), encodes via NVENC, and ships framed Annex-B AUs on stdout for the SYSTEM host to relay onto the live QUIC session: `[u32 magic "PFAU"][u32 len][u64 pts_ns][u8 keyframe][data]`. tracing → stderr so stdout stays the pure AU stream. cfg-gated windows-only; Linux build unaffected. scripts/headless/win-build.cmd: the canonical box build script (sets PUNKTFUNK_BUILD_VERSION so build.rs stamps the version + the NVENC LIB path). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |