b0d28380b50a802affaa7dcabd202571b39bbd40
59 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b0d28380b5 |
feat(windows-host): rotate out-ring on repeat + size HDR ring at open (audit §5.3/§5.4)
§5.3 (C3): repeat_last() now copies the last frame into a FRESH rotated out-ring slot instead of re-handing last_present's slot, so a repeat (static desktop) never re-hands a slot still encoding under pipeline_depth>1. OUT_RING(3) > max depth(2) keeps the rotated slot free -- the out-ring rotation contract now holds for repeats too, not just the synchronous-loop assumption.
§5.4 (C4): when enabling advanced color for a 10-bit client, trust set_advanced_color success and size the ring FP16 directly, instead of racing the advanced_color_enabled poll (which could size SDR while the driver composes FP16 -> format mismatch -> an immediate ring recreate + dropped first frames).
Verified: host clippy (nvenc) clean on the RTX box. On-glass to confirm: HDR-client first-frame + static-desktop pipelining.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
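The rotation contract from §5.3 can be illustrated with a small index-only model. This is a sketch under stated assumptions, not the host's code: `OutRing`, `acquire` and `in_flight_slots` are hypothetical names, and only the constants (a ring of 3, a maximum pipeline depth of 2) come from the message above.

```rust
use std::collections::VecDeque;

/// Ring size taken from the commit message (OUT_RING = 3).
pub const OUT_RING: usize = 3;

/// Hypothetical model: tracks which output slots are still being encoded.
pub struct OutRing {
    next: usize,
    depth: usize,
    in_flight: VecDeque<usize>,
}

impl OutRing {
    /// `depth` is the pipeline depth; it must stay below OUT_RING so that
    /// the rotated slot is always free.
    pub fn new(depth: usize) -> Self {
        assert!(depth >= 1 && depth < OUT_RING);
        Self { next: 0, depth, in_flight: VecDeque::new() }
    }

    /// Hand out the next rotated slot, for a new frame OR a repeat.
    /// When the pipeline is full, the oldest encode is treated as retired.
    pub fn acquire(&mut self) -> usize {
        if self.in_flight.len() == self.depth {
            self.in_flight.pop_front();
        }
        let slot = self.next;
        // The contract: a slot that is still encoding is never re-handed.
        assert!(!self.in_flight.contains(&slot));
        self.next = (self.next + 1) % OUT_RING;
        self.in_flight.push_back(slot);
        slot
    }

    /// Slots currently considered in flight, oldest first.
    pub fn in_flight_slots(&self) -> Vec<usize> {
        self.in_flight.iter().copied().collect()
    }
}
```

Because a repeat goes through the same `acquire` path as a new frame in this model, the internal assertion holds for any depth below `OUT_RING`.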
||
|
|
ed583650a6 |
feat(windows-host): IDD-push attach fallback to DDA, not the 20s black bail (audit §5.1)
open() now hands the keepalive BACK on failure (the WGC attach_keepalive pattern) so the caller can fall back instead of tearing the virtual display down. Added a bounded wait_for_attach() that polls the driver's DRV_STATUS_OPENED — it checks ATTACH status, not frame arrival, so it never false-fails on an idle desktop that has composed no frame yet.
An attach failure (e.g. a hybrid-GPU render mismatch -> DRV_STATUS_TEX_FAIL, or the driver never opening the ring within 4s) now fails open() -> capture.rs falls back to DDA, instead of next_frame's 20s deadline leaving the session black. Pairs with the driver SET_RENDER_ADAPTER fix (
|
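A bounded attach wait of the kind described above might look like the following sketch. The `DrvStatus` enum, the `Pending` variant, `AttachError` and the closure-based `poll` parameter are all assumptions made for illustration; the source only names DRV_STATUS_OPENED, DRV_STATUS_TEX_FAIL and the 4s bound.

```rust
use std::time::{Duration, Instant};

/// Hypothetical mirror of the driver status word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DrvStatus {
    Pending, // assumed: the driver has not opened the ring yet
    Opened,  // stands in for DRV_STATUS_OPENED
    TexFail, // stands in for DRV_STATUS_TEX_FAIL
}

#[derive(Debug, PartialEq, Eq)]
pub enum AttachError {
    DriverFailed(DrvStatus),
    TimedOut,
}

/// Poll the ATTACH status (not frame arrival) until it resolves or the
/// deadline passes. An idle desktop that has composed no frame still
/// succeeds as soon as the driver reports `Opened`.
pub fn wait_for_attach(
    mut poll: impl FnMut() -> DrvStatus,
    timeout: Duration,
    interval: Duration,
) -> Result<(), AttachError> {
    let deadline = Instant::now() + timeout;
    loop {
        match poll() {
            DrvStatus::Opened => return Ok(()),
            DrvStatus::TexFail => return Err(AttachError::DriverFailed(DrvStatus::TexFail)),
            DrvStatus::Pending => {}
        }
        if Instant::now() >= deadline {
            return Err(AttachError::TimedOut);
        }
        std::thread::sleep(interval);
    }
}
```

In this model a caller would treat either error as the signal to fall back to DDA rather than waiting on a frame deadline.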
||
|
|
e2f004589c |
feat(windows-drivers): STEP 6 — IDD-push FramePublisher (driver) + host migration to proto::frame
The driver now publishes each acquired swap-chain surface into the host-created shared ring (the IDD-push path) -- the full glass-to-glass transport is code-complete. Both sides use the canonical pf_vdisplay_proto::frame layout (lockstep by compile-error, not "must match" comments). Driver compiles + LOADS on-glass (adapter inits, Status=OK; no regression -- the publisher is dormant until a frame is acquired); host cargo check green; adversarially reviewed (no blockers -- token layout, keyed-mutex key 0, names by target_id, and the format guard all match the host consumer).
- new driver frame_transport.rs: FramePublisher OPENS the host ring by target_id (OpenFileMapping header + magic Acquire readiness gate + OpenEvent + OpenSharedResourceByName RING_LEN keyed-mutex textures), writes its render LUID + DRV_STATUS back into the header; publish() is NON-BLOCKING (round-robin 0ms try-acquire -> CopyResource -> ReleaseSync -> FrameToken::pack store Release -> SetEvent; drops the frame if every slot is busy or the surface format != the ring format). Manual handle/view cleanup on every try_open early return; RAII Drop (slots -> unmap -> CloseHandle). Layout/consts/names/token all from pf_vdisplay_proto::frame.
- swap_chain_processor.rs run_core: lazy rate-limited attach (every ~30 frames) + is_stale re-attach (mid-session HDR ring recreate); publishes buffer.MetaData.pSurface via IDXGIResource::from_raw_borrowed (preserves IddCx's refcount) BEFORE IddCxSwapChainFinishedProcessingFrame. run/run_core gain the render LUID; callbacks.rs assign_swap_chain passes it.
- host idd_push.rs migrated onto pf_vdisplay_proto::frame (deleted the hand-rolled SharedHeader / MAGIC / VERSION / RING_LEN / DRV_STATUS_* / name fns / token packing) -- pure refactor, byte-identical, no behavior or gating change. DebugBlock + DXGI_SHARED_RESOURCE_RW kept local (not in the proto).
- driver windows crate gains Win32_System_Memory (MapViewOfFile/OpenFileMappingW/...); rustfmt'd the whole driver workspace (incl. wdk-probe -- fmt-only).
Built via the ultracode flow: STEP-6 map workflow -> agent-implement -> box build (driver + host both green; caught nothing this time) -> adversarial-verify-agent (no blockers) -> FrameToken::pack hardening -> deploy (loads).
Glass-to-glass frame validation awaits a composited session (per the parity finding: this headless box yields 0 frames for the proven SudoVDA path too).
FOLLOW-UPs: port the optional Global\pfvd-dbg DebugBlock triage channel to the new driver; STEP 7 HDR; STEP 8 drop SudoVDA.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
e2c9bfd3d9 |
feat(windows): pf-vdisplay IDD-push — HDR + pipelined zero-copy capture
HDR (display-driven, matching the WGC path):
- CTA-861.3 HDR EDID (BT.2020 primaries + HDR Static Metadata block) so Windows offers "Use HDR" on the virtual display. The host FOLLOWS the display's live advanced-color state, recreating the shared ring at the matching format (FP16 in HDR / BGRA in SDR) on a toggle -- no freeze.
- Always emit Main10/BT.2020-PQ Rgb10a2 while the display is HDR; the client auto-detects PQ from the HEVC VUI (clients under-report VIDEO_CAP_10BIT). Generic HDR10 mastering SEI on every IDR.
- Generation-tagged `latest` (gen<<40|seq<<8|slot) + driver `is_stale` re-attach kill the toggle-time garbage frame and any stale-ring read.
Perf:
- Pipeline the encode loop (Capturer::pipeline_depth; IDD-push = 2): submit N+1 before polling N so the convert/copy on the 3D engine overlaps the NVENC encode of N on the ASIC. PUNKTFUNK_IDD_DEPTH overrides (1 = synchronous).
- Rotating host output ring (OUT_RING) so the in-flight encode and the next convert never touch the same texture.
- HDR converts directly from the keyed-mutex slot's SRV into the output ring (drops the redundant slot->fp16 scratch copy); SDR copies the BGRA slot in. The slot mutex is held only across the convert/copy, not the encode. RING_LEN 3->6 for publish headroom.
- Capture-health diagnostic: new_fps vs repeat_fps under PUNKTFUNK_PERF (a low new_fps at a high send rate means the source isn't compositing, not an encode stall).
Validated live on the RTX box: 5120x1440@240 HDR streams; driver composes ~180 new fps, encode 240 fps @ ~4.3 ms p50.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
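The generation-tagged token `gen<<40|seq<<8|slot` can be sketched as a pack/unpack pair. The shifts come from the message; the field widths they imply (8-bit slot, 32-bit sequence, 24-bit generation in a 64-bit word) and every name below (`Token`, `pack`, `unpack`, `token_is_stale`) are assumptions for illustration, not the `pf_vdisplay_proto::frame` API.

```rust
/// Hypothetical decoded form of the `latest` token.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Token {
    pub generation: u32, // only the low 24 bits fit above bit 40
    pub seq: u32,
    pub slot: u8,
}

const SLOT_BITS: u32 = 8;
const GEN_SHIFT: u32 = 40;
const GEN_MASK: u64 = (1 << 24) - 1;

/// gen<<40 | seq<<8 | slot
pub fn pack(t: Token) -> u64 {
    ((t.generation as u64 & GEN_MASK) << GEN_SHIFT)
        | ((t.seq as u64) << SLOT_BITS)
        | t.slot as u64
}

/// Inverse of `pack`; the `as` casts truncate to each field's width.
pub fn unpack(raw: u64) -> Token {
    Token {
        generation: (raw >> GEN_SHIFT) as u32,
        seq: (raw >> SLOT_BITS) as u32,
        slot: raw as u8,
    }
}

/// A token from an older ring generation must not be read: after a ring
/// recreate (for example an HDR toggle) the slot contents are unrelated.
pub fn token_is_stale(raw: u64, current_generation: u32) -> bool {
    unpack(raw).generation != (current_generation & GEN_MASK as u32)
}
```

Keeping the generation in the top bits means a single 64-bit atomic store publishes the ring identity, frame sequence and slot together, which is presumably why a stale-ring read can be rejected without extra locking.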
||
|
|
72eeedc4da |
feat(windows): AMD (AMF) + Intel (QSV) hardware encode on the Windows host
The Windows host was NVIDIA-only (NVENC) with an openh264 software fallback. Add
AMD AMF and Intel QSV via libavcodec — the Windows analogue of the Linux VAAPI
backend — so one installer serves all three GPU vendors.
- encode/ffmpeg_win.rs: new WinVendor{Amf,Qsv} encoder. System-memory NV12/P010
readback (default, robust) + opt-in zero-copy D3D11 (PUNKTFUNK_ZEROCOPY: shares
the capturer's ID3D11Device; AMF takes AV_PIX_FMT_D3D11, QSV derives a QSV frames
ctx and maps) with a system fallback for the format-group mismatch the capturer's
video-processor fallback can produce. HDR Main10 (P010 + BT.2020/PQ VUI; an
Rgb10a2->P010 swscale covers the shader fallback).
- encode.rs: Codec::amf_name/qsv_name; open_video + windows_resolved_backend()
resolve PUNKTFUNK_ENCODER=auto|nvenc|amf|qsv|sw via a DXGI adapter VendorId probe.
- capture/dxgi.rs: gpu_mode mirrors the resolved backend (D3D11 NV12/P010 for AMF/QSV).
- gamestream/serverinfo.rs: GPU-aware codec advertisement (windows_codec_support;
AV1 gated to RDNA3+/Arc, like the VAAPI path).
- Cargo.toml: amf-qsv feature (optional ffmpeg-next in the windows target block).
- CI/installer: windows-host.yml sets FFMPEG_DIR + builds --features nvenc,amf-qsv;
the Inno installer bundles the FFmpeg DLLs; host.env default nvenc -> auto.
CI-green target; AMF/QSV not yet on-glass validated (no AMD/Intel Windows box in the
lab) — NVENC stays live-validated. An adversarial-review pass caught + fixed real
FFI bugs (AV_PIX_FMT_P010 is a macro -> P010LE; windows-rs 0.62 GetImmediateContext/
GetDesc1 return Result; AV_HWFRAME_MAP_* is a bindgen enum with no BitOr).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
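The `PUNKTFUNK_ENCODER=auto|nvenc|amf|qsv|sw` resolution could be sketched as below. `Backend` and `resolve_backend` are hypothetical names, and falling back to software for an unrecognized vendor is an assumption; the PCI vendor IDs are the standard ones (0x10DE NVIDIA, 0x1002 AMD, 0x8086 Intel). The real code obtains the ID from a DXGI adapter probe, which this sketch takes as a plain argument.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Nvenc,
    Amf,
    Qsv,
    Software,
}

/// Resolve the encoder backend from the env-style setting and the
/// adapter's PCI vendor ID (assumed to be probed elsewhere).
pub fn resolve_backend(setting: &str, vendor_id: u32) -> Result<Backend, String> {
    match setting.trim().to_ascii_lowercase().as_str() {
        // Explicit choices win regardless of the installed GPU.
        "nvenc" => Ok(Backend::Nvenc),
        "amf" => Ok(Backend::Amf),
        "qsv" => Ok(Backend::Qsv),
        "sw" => Ok(Backend::Software),
        // `auto` follows the adapter vendor.
        "auto" => Ok(match vendor_id {
            0x10DE => Backend::Nvenc,
            0x1002 => Backend::Amf,
            0x8086 => Backend::Qsv,
            _ => Backend::Software, // assumption: unknown vendor -> software
        }),
        other => Err(format!("unknown encoder setting: {other}")),
    }
}
```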
||
|
|
3526517eb1 |
feat: HDR Step-0 colour-metadata transport + security-audit hardening
Two strands, entangled in punktfunk1.rs, committed together (one builds-green tree).
HDR pipeline Step 0 -- glass-to-glass colour-metadata transport (docs/hdr-pipeline-plan.md):
- Protocol/ABI: ColorInfo on the Welcome + a 0xCE HdrMeta datagram carry the source colour space + HDR10 static mastering metadata (quic.rs, abi.rs connect_ex5 fixing caps=0).
- New platform-independent, unit-tested HDR static-metadata helpers (hdr.rs): chromaticities (1/50000), mastering luminance (0.0001 cd/m2), MaxCLL/MaxFALL in HDR10/ST.2086 units.
- Capture/encode hooks (capture.rs, encode.rs set_hdr_meta) + Linux client / probe plumbing.
Security-audit hardening -- top 3 from docs/security-review.md, each adversarially verified:
- #1 [HIGH] Secret file permissions. The host key.pem/cert.pem and both trust stores are now written owner-only: 0600 + dir 0700 on Unix (mirrors mgmt_token), best-effort SYSTEM/Administrators/OWNER-only icacls DACL on Windows (%ProgramData% is Users-readable). Closes a local key-disclosure -> host-impersonation gap. New gamestream::{create_private_dir, write_secret_file} + a 0600 regression test.
- #2 [HIGH] Native SPAKE2 PIN is single-use. The PIN is consumed the moment the host sends its key-confirmation (which lets the client test its one guess), before reading the proof, so any completed attempt -- right OR wrong -- disarms the window. A wrong PIN isn't observable host-side (the client aborts before sending its proof), so consuming on first attempt is what delivers the documented "one online guess" instead of an unbounded brute-force of the static 4-digit PIN. Test verifies single-use.
- #3 [MEDIUM] RTSP packetSize is bounded ([64,2048] in stream_config) and VideoPacketizer::new uses saturating .max(1), killing a PRE-AUTH div-by-zero/underflow panic of the video thread. Tests for {0,15,16,17} + out-of-range rejection.
fmt + clippy -D warnings clean; full workspace test suite green (93 host tests).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
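The unit conventions named for the HDR static-metadata helpers (chromaticities in steps of 1/50000, mastering luminance in steps of 0.0001 cd/m2) can be illustrated with two conversions. The function names are hypothetical and this is not the content of hdr.rs; only the unit scales come from the message.

```rust
/// CIE 1931 chromaticity coordinate (0.0..=1.0) to 1/50000 units.
pub fn chromaticity_to_units(c: f64) -> u16 {
    (c * 50_000.0).round().clamp(0.0, u16::MAX as f64) as u16
}

/// Mastering-display luminance in cd/m2 (nits) to 0.0001 cd/m2 units.
pub fn luminance_to_units(nits: f64) -> u32 {
    (nits * 10_000.0).round().clamp(0.0, u32::MAX as f64) as u32
}
```

For example, the BT.2020 red primary (x = 0.708, y = 0.292) becomes (35400, 14600), and a 1000 cd/m2 mastering peak becomes 10000000.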
||
|
|
708c62788d |
feat(host/encode): VAAPI zero-copy dmabuf import (AMD/Intel GPU CSC)
Phase 2 of AMD/Intel support: the VAAPI encoder now takes the capture dmabuf directly and does the RGB->NV12 colour conversion on the GPU's video engine, eliminating the host-side de-pad + swscale CSC + upload the CPU path pays.
- capture: a vendor-neutral FramePayload::Dmabuf (dup'd fd + fourcc/modifier/layout). When zero-copy is on, the EGL->CUDA importer is unavailable (any non-NVIDIA host), and the backend is VAAPI, the capturer advertises LINEAR dmabuf and hands the raw buffer to the encoder instead of CPU-copying it.
- encode/vaapi: the encoder self-configures from the first frame's payload (no open_video signature change). The dmabuf arm wraps the buffer as an AV_PIX_FMT_DRM_PRIME frame and pushes it through a filter graph buffer(drm_prime) -> hwmap(vaapi) -> scale_vaapi=nv12 -> buffersink; the encoder takes NV12 surfaces straight from the sink. The Phase 1 CPU-upload path is kept as the other arm (used when capture produces CPU frames).
Live-validated on a Radeon 780M (real Sway/xdpw desktop capture): correct, pixel-perfect HEVC, and ~10x less host CPU at 1440p (4.2s -> 0.4s of CPU for 300 frames) -- the de-pad/CSC/upload moves to the GPU. NVIDIA unchanged (zero-copy still imports to CUDA; the passthrough path only engages on non-NVIDIA hosts).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5e27f65f2e |
fix(host/capture): mmap the buffer fd ourselves — xdpw MemFd over-reads MAP_BUFFERS
The CPU de-pad path trusted PipeWire's MAP_BUFFERS slice (`d.data()`, length = `data.maxsize`). xdg-desktop-portal-wlr hands MemFd ScreenCast buffers whose maxsize exceeds the bytes PipeWire actually maps into our process, so reading to maxsize ran off the end of the mapping and SIGSEGV'd the capture thread -- crashing every CPU-path capture on Sway/wlroots (and thus any non-NVIDIA host, which has no CUDA zero-copy importer and always falls back to this path).
mmap the fd ourselves, sized to its real length (fstat), for any fd-backed buffer (MemFd SHM or DmaBuf); fall back to `d.data()` then drop. The existing `needed > avail` guard now drops cleanly instead of over-reading. This also subsumes the original "MAP_BUFFERS didn't map a Vulkan dmabuf" fallback.
Verified: fixes real Sway-desktop portal capture -> VAAPI HEVC on a Radeon 780M (correct image + colours); the NVIDIA zero-copy path (returns before this code) and the NVIDIA/KWin CPU path (self-mmap, fd_len == maxsize) both still work.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
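The `needed > avail` guard can be shown in isolation. This is a minimal sketch with hypothetical names (`BufferInfo`, `avail`, `frame_fits`); taking the smaller of the advertised and real lengths is a conservative assumption, not something the message states, and the actual mmap/fstat calls are left out.

```rust
/// Hypothetical summary of one fd-backed buffer.
#[derive(Debug, Clone, Copy)]
pub struct BufferInfo {
    /// Size advertised by the producer (PipeWire's `maxsize`).
    pub maxsize: u64,
    /// Real length of the fd as reported by fstat.
    pub fd_len: u64,
}

/// Bytes that are safe to read: never more than the fd really holds.
pub fn avail(info: BufferInfo) -> u64 {
    info.fd_len.min(info.maxsize)
}

/// True when a frame of `stride * height` bytes fits inside `avail`.
/// A multiplication overflow counts as "does not fit", so the caller
/// drops the frame instead of over-reading.
pub fn frame_fits(stride: u64, height: u64, avail: u64) -> bool {
    match stride.checked_mul(height) {
        Some(needed) => needed <= avail,
        None => false,
    }
}
```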
||
|
|
aef552f04a |
feat(host/windows): HDR scRGB→P010 in a shader — NVENC native P010, off the SM
On the Windows WGC HDR path the FP16 scRGB capture was fed to NVENC as R10G10B10A2 (BT.2020 PQ), and NVENC did the RGB→YUV CSC internally on the contended SM -- adding to the encode_ms wall under a GPU-saturating game. (NVIDIA's D3D11 VideoProcessor can't do RGB→P010 for HDR; that path renders green, confirmed live -- so the convert must be ours.)
New `HdrP010Converter` fuses the tone-map with the BT.2020 RGB→YUV matrix and emits P010 (10-bit limited range) directly: a luma pass → an R16_UNORM plane RTV (full-res) and a chroma pass → an R16G16_UNORM plane RTV (half-res, 2x2 box average) of a DXGI_FORMAT_P010 texture. NVENC then takes native P010 and skips its SM-side convert.
Gated behind PUNKTFUNK_HDR_SHADER_P010 (default OFF → the existing R10→NVENC path is byte-for-byte unchanged). Colour validated by a new `hdr-p010-selftest` subcommand: a synthetic scRGB pattern → P010 → readback, compared to a BT.2020 PQ 10-bit reference -- max abs error Y=0.99 / Cb=0.82 / Cr=0.75 codes on an RTX 4090. Live-validated HDR colours correct (no green). Build + clippy (--features nvenc -D warnings) green on x86_64-pc-windows-msvc.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
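The matrix-and-quantize half of that conversion can be written as a CPU reference. This is a sketch, not the shader: it assumes the input is already PQ-encoded non-linear R'G'B' in 0..=1 (the tone-map step is omitted), uses the standard BT.2020 non-constant-luminance coefficients, and the names `bt2020_limited_10bit` and `p010_word` are hypothetical.

```rust
// BT.2020 luma coefficients (non-constant luminance).
const KR: f64 = 0.2627;
const KB: f64 = 0.0593;
const KG: f64 = 1.0 - KR - KB;

/// Non-linear R'G'B' (0..=1) to 10-bit limited-range Y'CbCr codes.
pub fn bt2020_limited_10bit(r: f64, g: f64, b: f64) -> (u16, u16, u16) {
    let y = KR * r + KG * g + KB * b;
    let cb = (b - y) / (2.0 * (1.0 - KB)); // -0.5..=0.5
    let cr = (r - y) / (2.0 * (1.0 - KR)); // -0.5..=0.5
    let q = |v: f64| v.round().clamp(0.0, 1023.0) as u16;
    // Limited range: Y' spans 64..=940, chroma spans 64..=960 around 512.
    (q(64.0 + 876.0 * y), q(512.0 + 896.0 * cb), q(512.0 + 896.0 * cr))
}

/// P010 stores each 10-bit code in the high bits of a 16-bit word.
pub fn p010_word(code: u16) -> u16 {
    code << 6
}
```

A reference like this is the kind of thing a self-test can compare a GPU readback against, one code value at a time.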
||
|
|
1fc6f73784 |
perf(host/linux): NV12 GPU convert — feed NVENC native YUV, off the contended SM (Tier 2A)
The Linux zero-copy tiled-GL path can now produce NV12 (BT.709 limited range) on the GPU and feed NVENC native YUV, deleting NVENC's internal RGB->YUV CSC -- which runs on the SM/3D-compute engine a saturating game pins at 100% (the game-vs-encode contention headache). Windows already does this via the D3D11 video processor; this closes the Linux gap. See docs/host-latency-plan.md §2A.
Gated behind PUNKTFUNK_NV12 (default OFF → the RGB/BGRx path is byte-for-byte unchanged; zero regression). Only the tiled EGL/GL path converts; the LINEAR/Vulkan-bridge (gamescope) path stays RGB.
- zerocopy/egl.rs: Nv12Blit -- BT.709 limited Y pass (R8, full-res) + UV pass (RG8, half-res, GL_LINEAR 2x2 average); both CUDA-registered; import_nv12.
- zerocopy/cuda.rs: two-plane DeviceBuffer (Y W*H@1B + interleaved UV (W/2)*2 x H/2), paired Y+UV pool, copy_mapped_nv12 + copy_nv12_to_device, on the per-thread priority stream (dmabuf-recycle sync preserved).
- encode/linux.rs: nvenc_input(Nv12)->NV12; submit_cuda copies two planes into NVENC's surface; VUI signalled BT.709 limited (colorspace/range/primaries/trc).
- capture/linux.rs: gate (PUNKTFUNK_NV12 && tiled), report format Nv12.
- main.rs + zerocopy/mod.rs: `nv12-selftest` subcommand.
Validated on RTX 5070 Ti two ways: (1) nv12-selftest -- synthetic RGBA->NV12 round-trip vs a BT.709 reference, max abs error Y=0.56/U=0.33/V=0.26 LSB; (2) live capture->NV12->NVENC->decode of animated red content matches the RGB path's colour (avg RGB 230,18,18 vs 231,18,20). build/clippy/fmt green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
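The two-plane sizes quoted for the DeviceBuffer (Y at W*H bytes, interleaved UV at (W/2)*2 bytes per row over H/2 rows) work out as in the sketch below. `Nv12Layout` is a hypothetical helper, and it assumes tightly packed planes with even dimensions; real GPU allocations usually carry a row pitch that this ignores.

```rust
/// Hypothetical tightly packed NV12 layout (no row padding).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Nv12Layout {
    pub width: usize,
    pub height: usize,
}

impl Nv12Layout {
    /// 4:2:0 subsampling needs even, non-zero dimensions.
    pub fn new(width: usize, height: usize) -> Option<Self> {
        if width == 0 || height == 0 || width % 2 != 0 || height % 2 != 0 {
            None
        } else {
            Some(Self { width, height })
        }
    }

    /// Full-resolution luma plane, one byte per pixel.
    pub fn y_bytes(&self) -> usize {
        self.width * self.height
    }

    /// One interleaved U,V pair per 2x2 luma block: (W/2)*2 bytes per row.
    pub fn uv_row_bytes(&self) -> usize {
        (self.width / 2) * 2
    }

    /// Half-height chroma plane.
    pub fn uv_bytes(&self) -> usize {
        self.uv_row_bytes() * (self.height / 2)
    }

    pub fn total_bytes(&self) -> usize {
        self.y_bytes() + self.uv_bytes()
    }

    /// Offset inside the UV plane of the U sample covering luma pixel
    /// (x, y); the matching V sample is the next byte.
    pub fn uv_offset(&self, x: usize, y: usize) -> usize {
        (y / 2) * self.uv_row_bytes() + (x / 2) * 2
    }
}
```

For 1920x1080 this gives 2073600 luma bytes plus 1036800 chroma bytes, i.e. 1.5 bytes per pixel.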
||
|
|
9c8fa9340c |
refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes
Two bodies of work in one commit (the rename moved files the fixes also touched).
Naming/structure cleanup (pre-launch):
- Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host, m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source->Punktfunk1Options/Punktfunk1Source.
- Clients consolidated out of crates/ into clients/: punktfunk-client-rs->clients/probe (crate punktfunk-probe), client-linux->clients/linux, client-windows->clients/windows, punktfunk-android->clients/android/native (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI contract is unchanged). crates/ now holds only core + host.
- Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site, kept only in docs/implementation-plan.md. docs/m2-plan.md->docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated.
Client loss-recovery (video froze and never recovered after a brief drop):
- Export punktfunk_connection_frames_dropped through the C ABI (the core already tracked it for the client keyframe-recovery loop; it was never reachable from the ABI clients). Regenerated punktfunk_core.h.
- Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll frames_dropped and request a keyframe when it climbs -- the same loss-driven recovery Linux/Windows already had. Under infinite GOP the decoder silently conceals reference-missing frames, so the decode-error trigger rarely fires.
Apple rumble robustness (worked then went spotty -- DualSense + Xbox):
- Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio interruption / server reset) and drop the permanent `broken` latch on a transient drive failure; latch only when the controller truly has no haptics.
- Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging.
Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift.
Not runnable on this box (verify in CI): Gitea workflows, gradle/Android, flatpak, Swift/decky.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
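The loss-driven recovery described above (poll a dropped-frame counter, request a keyframe when it climbs) reduces to remembering the last observed value. `LossMonitor` and its method are hypothetical names; the real clients read the counter through `punktfunk_connection_frames_dropped`, which this sketch replaces with a plain argument.

```rust
/// Hypothetical helper: remembers the last observed drop counter.
#[derive(Debug, Default)]
pub struct LossMonitor {
    last_dropped: u64,
}

impl LossMonitor {
    pub fn new() -> Self {
        Self::default()
    }

    /// Feed the current cumulative `frames_dropped` value. Returns true
    /// exactly when the counter has climbed since the previous poll,
    /// which is the cue to ask the host for a keyframe.
    pub fn should_request_keyframe(&mut self, frames_dropped: u64) -> bool {
        let climbed = frames_dropped > self.last_dropped;
        self.last_dropped = frames_dropped;
        climbed
    }
}
```

Triggering on the counter rather than on decode errors matters under infinite GOP, where, as the message notes, the decoder conceals reference-missing frames and may never report an error.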
||
|
|
4cc57d5c39 |
perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game: the per-frame capture→encode path shared the 3D engine with the game and got scheduled behind it. Rework to minimize 3D-engine work per frame:
- VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input.
- GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1).
- WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast CopyResource to decouple its single-frame acquire/release from the async convert.
- Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split (cap_wait / encode / write).
Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+), latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV convert (no fixed-function unit on NVIDIA -- VideoProcessing engine reads 0%) waiting behind an uncapped game under WDDM context time-slicing; Linux avoids it via gamescope capping the game to the display refresh.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
ad0cb1b582 |
feat(host/windows): capture the secure desktop in HDR via DDA (no SDR drop)
The secure-desktop DDA leg went black with HDR on: legacy DuplicateOutput (the SDR-era API) can't capture an FP16/HDR desktop, and dropping the SudoVDA out of HDR is denied on the Winlogon desktop (so the SDR-drop attempt just churned and stayed black).
Instead capture HDR natively on the DDA path -- the capturer already has the full FP16→BT.2020 PQ→R10G10B10A2 conversion (hdr_fp16 path), it just never requested FP16. Thread a want_hdr flag into duplicate_output: for an HDR session request DuplicateOutput1 with FP16 first and retry it (5×) instead of bailing to the HDR-incapable legacy fallback. The secure-desktop mux now reads the monitor's real HDR state and opens DDA in HDR when set -- no advanced-color toggling at all. The normal-desktop DDA overlay/flip issues that pushed us to WGC don't apply to the composed Winlogon UI. want_hdr is threaded through every (re)duplication incl. ACCESS_LOST recovery.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
0ce2e37faf |
refactor(host/windows): clean up DDA path + add a proper Windows service
Final cleanup after the DDA-parity work, plus an end-user service to replace the PsExec/VBS/scheduled-task launch chain.
Cleanup (behavior-preserving):
- sudovda.rs: drop the dead legacy GDI isolate_displays/restore_displays (CCD is the sole isolation path), the always-empty Monitor.isolated field, and the vestigial reassert_isolation + PUNKTFUNK_ISOLATE_DISPLAYS knob; fix stale comments.
- dxgi.rs: downgrade leftover debug warns/infos (DuplicateOutput1 retry, FALLBACKS, hook-hits, AcquireNextFrame idle timeout) to debug!; remove the PUNKTFUNK_NO_CURSOR per-frame test knob.
Windows service (src/service.rs, `punktfunk-host service`):
- SCM supervisor (windows-service crate) that duplicates its LocalSystem token, retargets it to the active console session, and CreateProcessAsUserW's the host there (Sunshine/Apollo model), relaunching on exit and console session switch, inside a kill-on-close job object so a service crash never orphans the host.
- install/uninstall/start/stop/status subcommands: one elevated `service install` registers an auto-start LocalSystem service + firewall rules + a default host.env.
- Config moves to %ProgramData%\punktfunk\host.env; config_dir() now resolves to %ProgramData%\punktfunk on Windows (replacing the APPDATA=C:\Users\Public hack), with a PUNKTFUNK_CONFIG_DIR override. Logs land in %ProgramData%\punktfunk\logs\.
- merged_env_block (shared with the WGC helper) now also carries RUST_LOG.
- docs/windows-service.md + scripts/windows/host.env.example; windows-host.md updated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
f469dfcc76 |
chore(host/windows): clean up DDA capture — fix unused imports, quiet secure-desktop log, sane retry default
- Remove 4 unused imports (PCWSTR in composed_flip, anyhow macro + SizeInt32 in wgc, Write in wgc_relay).
- DuplicateOutput1 retry defaults to N=1 (immediate legacy): on the secure desktop DuplicateOutput1 is LOGON_UI-only so it always refuses, and the release-before-reduplicate + gentle recovery keep the legacy dup stable; retrying there only blocked. Still env-tunable (PUNKTFUNK_DUP_RETRY_N/_MS).
- Throttle the 'using legacy DuplicateOutput' warning (expected + once-per-gentle-recovery on secure) so a lock dwell doesn't flood the log.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
dc734c711b |
fix(host/windows): re-sync thread desktop on EVERY recovery (symmetric enter/leave secure)
User's observation: entering UAC/lock works instantly, but clicking OUT of it breaks (with the disconnect sound); Apollo's enter and leave are symmetric.
Root cause: attach_input_desktop() (SetThreadDesktop to the current input desktop) was gated behind is_secure_desktop() in recreate_dupl, so:
- Default->Winlogon (enter): is_secure==true -> re-attach to Winlogon -> works.
- Winlogon->Default (leave): is_secure==false -> SKIP re-attach -> the capture thread stays stuck on the now-gone Winlogon desktop -> every rebuild fails -> no frames -> client timeout -> session ends -> SudoVDA removed (the disconnect sound).
Fix: call attach_input_desktop() UNCONDITIONALLY on every rebuild (Apollo calls syncThreadDesktop before every duplicate), so leaving secure re-attaches to the returned desktop. reassert_isolation stays secure-only. Also stop leaking the HDESK (CloseDesktop right after SetThreadDesktop, like Apollo) so calling it on every recovery is safe.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9a9214a2d8 |
fix(host/windows): gentle DDA recovery — stop the tight teardown/recreate loop
Per the user's insight: on the secure (Winlogon) desktop the duplication dies on
every independent-flip, and our tight recovery loop tore it down + recreated it
hundreds of times/sec — that release/recreate cycle is the real kernel stress,
and it stalled the send thread long enough that the client timed out ('display
disconnected'). Normal-desktop streaming is already solid (per-session GUID
killed the collision); this only changes the loss-recovery cadence.
Gentle recovery (user chose 'keep session alive'):
- cap the cheap re-duplicate to PUNKTFUNK_RECOVER_MS (default 250ms, was 5ms)
- cap the heavy new-device rebuild to PUNKTFUNK_REBUILD_MS (default 1500ms, was
250ms) — it's the costliest teardown, throttled hardest
- repeat the last frame between attempts (no busy-spin, no 8ms sleep)
~200/s -> ~4/s teardown/recreate during a secure dwell. The session survives
lock/UAC (frozen/laggy secure screen, then clean resume on unlock) instead of
churning the kernel into a disconnect. Both cadences env-tunable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
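A minimal sketch of the throttled recovery cadence described in 9a9214a2d8. The `Throttle` type and the `parse_ms` helper are hypothetical names for illustration, not the repository's code; only the env var names and their 250 ms / 1500 ms defaults come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Parse a millisecond override (for example the value of PUNKTFUNK_RECOVER_MS),
/// falling back to the default when the variable is unset or malformed.
/// Hypothetical helper, not taken from the repository.
pub fn parse_ms(raw: Option<&str>, default_ms: u64) -> Duration {
    let ms = raw
        .and_then(|s| s.trim().parse::<u64>().ok())
        .unwrap_or(default_ms);
    Duration::from_millis(ms)
}

/// Allows at most one recovery attempt per `min_interval`.
/// Between attempts the caller would repeat the last frame instead of spinning.
pub struct Throttle {
    min_interval: Duration,
    last: Option<Instant>,
}

impl Throttle {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last: None }
    }

    /// Returns true (and records `now`) when an attempt is allowed.
    pub fn try_fire(&mut self, now: Instant) -> bool {
        match self.last {
            Some(prev) if now.duration_since(prev) < self.min_interval => false,
            _ => {
                self.last = Some(now);
                true
            }
        }
    }
}
```

A caller might build the cheap re-duplicate limiter as `Throttle::new(parse_ms(std::env::var("PUNKTFUNK_RECOVER_MS").ok().as_deref(), 250))` and a second one for the heavy rebuild with a 1500 ms default.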
||
|
|
ce84861e3a |
fix(host/windows): DuplicateOutput1 retry wait 200ms (Apollo's value), env-tunable
The old-dup kernel teardown takes ~200ms (Apollo waits exactly that), so the previous 2-16ms retries were too short and still fell through to the churning legacy dup. Bump to PUNKTFUNK_DUP_RETRY_MS (default 200) x PUNKTFUNK_DUP_RETRY_N (default 6) so the robust DuplicateOutput1 dup wins the race. Env-tunable for on-box dialing without a rebuild. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
eb451d8bc6 |
fix(host/windows): retry DuplicateOutput1 to ride out the old-dup teardown race
User's insight, and it fits the evidence exactly: in duplicate_output the FIRST
DuplicateOutput1 (called microseconds after the caller releases the old
duplication via self.dupl=None) returns E_ACCESSDENIED, but the legacy
DuplicateOutput a beat later SUCCEEDS — the only difference is TIMING. The
kernel-side teardown of the just-released duplication is async, so the immediate
DuplicateOutput1 races it ('output still duplicated' -> E_ACCESSDENIED). We then
fell straight through to legacy DuplicateOutput, which 'succeeds' into a FRAGILE
dup that churns ACCESS_LOST/MODE_CHANGE every few ms on this cross-GPU IDD
(causing the post-login freeze + UAC-confirm drop).
Fix: retry DuplicateOutput1 up to 5x with escalating 2/4/8/16 ms waits before
falling back to legacy, so the teardown finishes and the ROBUST DuplicateOutput1
dup succeeds (no churn). Bounded (~30 ms worst case) so a genuine failure still
falls back quickly. This is exactly Apollo's 2x/200ms retry rationale.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
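The bounded retry in eb451d8bc6 can be sketched as a generic helper. This is an illustration under assumptions: `retry_with_backoff` is a hypothetical name, and the closure stands in for the real DuplicateOutput1 call. With waits of 2/4/8/16 ms it makes at most five attempts and sleeps at most 30 ms in total, matching the figures in the message.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Try once, then retry after each wait in `waits_ms` until one attempt succeeds.
/// The closure receives the zero-based attempt index.
/// Returns the first success, or the error from the final attempt.
pub fn retry_with_backoff<T, E>(
    waits_ms: &[u64],
    mut attempt: impl FnMut(usize) -> Result<T, E>,
) -> Result<T, E> {
    let mut result = attempt(0);
    for (i, ms) in waits_ms.iter().enumerate() {
        if result.is_ok() {
            break;
        }
        // Give the asynchronous teardown of the old duplication time to finish.
        sleep(Duration::from_millis(*ms));
        result = attempt(i + 1);
    }
    result
}
```

On a final `Err` the caller would fall back to the legacy path, as the commit describes.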
||
|
|
1e1e5ce9b5 |
fix(host/windows): Option-handle the multi-line dupl.GetFramePointerShape call too
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
da43b5e8d3 |
fix(host/windows): release the old duplication before re-duplicating (THE born-lost bug)
DuplicateOutput1 returned E_ACCESSDENIED ~8815x even with PER_MONITOR_AWARE_V2 confirmed on the capture thread (thread_is_v2=true) — so DPI was NOT the cause. The real cause: DXGI permits only ONE IDXGIOutputDuplication per output, and on ACCESS_LOST you MUST release the old one before re-duplicating. Our recovery (try_reduplicate / recreate_dupl) created the NEW duplication while the OLD self.dupl was still alive → the output stayed held → DuplicateOutput1 E_ACCESSDENIED and the legacy fallback returned a BORN-LOST dup. It never converged because there was always exactly one stale dup alive at creation time. The initial open() works precisely because there's no prior dup; Apollo is clean because it releases (dup.reset()) before every re-DuplicateOutput. Fix: make self.dupl an Option and set it to None (drop → release the output) BEFORE duplicate_output in try_reduplicate and before reopen_duplication in recreate_dupl, then Some(new). acquire() gets a None-guard that synthesizes ACCESS_LOST (routes into recovery) so a transient None can't panic. All ReleaseFrame/AcquireNextFrame sites updated for the Option. This is the documented DDA recovery requirement and the one thing that distinguished our failing DuplicateOutput1 from Apollo's working one. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
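The release-before-reduplicate rule can be modelled without DXGI. In the sketch below, `Output`, `Duplication`, and `Capturer` are hypothetical stand-ins (not the real windows-rs types): the mock output allows only one live duplication, and dropping the handle releases it. It shows why assigning `Some(new)` directly fails (the right-hand side runs while the old handle is still alive) and why setting the field to `None` first succeeds.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Mock output that permits a single live duplication (stand-in for the DXGI rule).
#[derive(Clone, Default)]
pub struct Output {
    held: Rc<Cell<bool>>,
}

/// Mock duplication handle; dropping it releases the output.
pub struct Duplication {
    held: Rc<Cell<bool>>,
}

impl Output {
    pub fn duplicate(&self) -> Result<Duplication, &'static str> {
        if self.held.get() {
            return Err("E_ACCESSDENIED"); // output is still duplicated
        }
        self.held.set(true);
        Ok(Duplication { held: Rc::clone(&self.held) })
    }
}

impl Drop for Duplication {
    fn drop(&mut self) {
        self.held.set(false);
    }
}

pub struct Capturer {
    output: Output,
    dupl: Option<Duplication>,
}

impl Capturer {
    pub fn open(output: Output) -> Result<Self, &'static str> {
        let dupl = output.duplicate()?;
        Ok(Self { output, dupl: Some(dupl) })
    }

    /// Buggy order: the new duplication is requested while the old one is alive.
    pub fn reduplicate_buggy(&mut self) -> Result<(), &'static str> {
        self.dupl = Some(self.output.duplicate()?);
        Ok(())
    }

    /// Fixed order: drop the old duplication first, then create the new one.
    pub fn reduplicate(&mut self) -> Result<(), &'static str> {
        self.dupl = None;
        self.dupl = Some(self.output.duplicate()?);
        Ok(())
    }

    pub fn has_dupl(&self) -> bool {
        self.dupl.is_some()
    }
}
```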
||
|
|
c8fb4822a2 |
fix(host/windows): per-thread Per-Monitor-V2 DPI awareness so DuplicateOutput1 succeeds
The remaining born-lost ACCESS_LOST storm traces to ONE thing: our
IDXGIOutput5::DuplicateOutput1 returns E_ACCESSDENIED (0x80070005) ~4370x, so
we fall back to legacy DuplicateOutput, which yields a BORN-LOST duplication on
this hybrid box. Apollo's DuplicateOutput1 SUCCEEDS on the identical
desktop/output/4090-device → a working dup, clean capture.
Root cause: DuplicateOutput1 REQUIRES Per-Monitor-Aware-V2. At startup our
SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2) FAILS with E_ACCESSDENIED
('already set' — a manifest/runtime locked the process to a lower awareness),
and GetAwarenessFromDpiAwarenessContext reports 2 for BOTH Per-Monitor V1 and
V2, so the earlier 'awareness=2' was misleading — the process is likely V1,
which DuplicateOutput1 rejects with E_ACCESSDENIED. (Legacy DuplicateOutput has
no V2 requirement, so it 'worked' but born-lost.)
Fix: SetThreadDpiAwarenessContext(PER_MONITOR_AWARE_V2) on the capture thread
in open() — a per-thread override that takes regardless of the process default,
so DuplicateOutput1 can succeed (the working dup Apollo gets). Logs set_ok +
thread_is_v2 (via AreDpiAwarenessContextsEqual) to confirm V2 actually applied.
Topology fixes (sole display, no MODE_CHANGE) and the recovery backstops stay.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
769fd96b87 |
fix(host/windows): stop SudoVDA MODE_CHANGE_IN_PROGRESS storm — don't force IDD primary by default
ROOT CAUSE (verified by multi-agent compare vs Apollo + adversarial review): set_active_mode() applied the SudoVDA mode with CDS_UPDATEREGISTRY | CDS_GLOBAL | CDS_SET_PRIMARY + DM_POSITION(0,0), promoting the freshly-added IDD to PRIMARY at the virtual-screen origin and persisting it globally. On this box (baseline active display = a 1024x768 basic 'WinDisc') that primary-promotion contests the existing display so the desktop topology never reaches a stable fixed point → every DuplicateOutput/AcquireNextFrame during the unending settle returns DXGI_ERROR_MODE_CHANGE_IN_PROGRESS (0x887A0025). Apollo, live on this EXACT box with an empty config, never promotes primary and captures the same SudoVDA at 5120x1440 with zero DXGI errors. (Ruled out earlier on the live box: win32u hook, DPI, independent-flip/overlay, isolation, render pin.)
Fixes (subtractive, gated per adversarial review):
- sudovda.rs set_active_mode: default to CDS_UPDATEREGISTRY only (no primary promotion, no GLOBAL, no DM_POSITION) = Apollo-parity for the multi-display default. Promote to primary (CDS_GLOBAL|CDS_SET_PRIMARY+DM_POSITION) ONLY when PUNKTFUNK_ISOLATE_DISPLAYS=1 (sole display, where a blank extended IDD would otherwise yield no frames). Avoids regressing headless/isolated + mid-stream Reconfigure.
- dxgi.rs acquire: treat MODE_CHANGE_IN_PROGRESS (0x887A0025) as a TRANSIENT (Ok(None), repeat last frame, wait it out) instead of falling through to the fatal Err arm → cold-rebuild → create()→set_active_mode (which re-issued the mode change and amplified the storm).
- dxgi.rs acquire: remove the born-lost cold-rebuild escape: it re-created the SudoVDA (IOCTL REMOVE/ADD = the audible PnP chime the user heard) and never converged; now repeat last frame in-process (never tear the IDD down mid-session, like Apollo). Overlay + cheap-spin/HDR recovery left intact.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
63b63a4010 |
fix(host/windows): instrument + harden DDA against the born-lost ACCESS_LOST storm
The hybrid RTX4090+iGPU box storms DXGI_ERROR_ACCESS_LOST (0x887A0026) + MODE_CHANGE_IN_PROGRESS (0x887A0025) ~3s after first frame: every rebuilt duplication is born-lost (created OK, first AcquireNextFrame instantly ACCESS_LOST), seeds black, retries forever. The steady-state m3 loop calls try_latest()->acquire() which returns Ok(None) on every recovery, so the cold-rebuild escape (MAX_CAPTURE_REBUILDS) was unreachable -> frozen stream.
Multi-agent root-cause + adversarial review point at the win32u GPU-pref hook being ineffective (patched on the main thread, no FlushInstructionCache, never verified) rather than the synthesis's independent-flip theory (Apollo has no overlay yet is stable on this exact box). This build instruments + applies the safe, high-probability fixes:
- Hook: FlushInstructionCache after the inline patch (cross-thread i-cache); read back the 12 patched bytes and error! if they didn't land; per-call hit counter (hybrid_hook_hits) logged after open -- hits==0 proves the hook is off DXGI's reparent path.
- DPI: log SetProcessDpiAwarenessContext result + effective awareness (need 2=PER_MONITOR for DuplicateOutput1; explains the 100% E_ACCESSDENIED).
- SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED|ES_SYSTEM_REQUIRED) at capture open, restored on Drop -- stop IDD idle-invalidation (Apollo does this too).
- Born-lost escape: count consecutive born-lost rebuilds; on the NORMAL desktop (never the secure/Winlogon dwell) escalate to Err after ~5s so the m3 loop cold-rebuilds the whole pipeline instead of freezing on the last frame.
Diagnostic-forward: one test now tells us hook-hits + DPI awareness + whether ExecutionState/desktop-sync alone fixes it, and the stream self-recovers instead of wedging.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
3237ca31cd |
feat(host/windows): capture via IDXGIOutput5::DuplicateOutput1 (Apollo's capture API)
The one major capture-API difference left vs Apollo: punktfunk used legacy IDXGIOutput1::DuplicateOutput; Apollo uses IDXGIOutput5::DuplicateOutput1 with a format list, the modern path that's more robust to overlay/format changes (a candidate for the SudoVDA-on-hybrid 0x887A0026 churn). Add a duplicate_output() helper used at all 3 duplication sites (open, reopen_duplication, try_reduplicate): QI to IDXGIOutput5 and DuplicateOutput1, falling back to legacy DuplicateOutput. DuplicateOutput1 requires per-monitor-v2 DPI awareness, so set that at process start alongside the GPU-pref hook (matches Apollo). Format list is BGRA8-only for now (SDR test): DuplicateOutput1 returns the first format it can CONVERT to, so FP16-first would hand back FP16 even on SDR and trip the HDR path. Real FP16/HDR capture (with IDXGIOutput6 colorspace detection) is the follow-up once the churn is settled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
7cfeddc770 |
fix(host/windows): install the GPU-preference hook at process start (before any DXGI)
The win32u hook only works if it patches before DXGI caches the hybrid preference. It was installed in DuplCapturer::open (first capture), but the SudoVDA render-adapter selection creates a DXGI factory during virtual-display setup — seconds earlier — so the preference was already cached and the hook had no effect (churn persisted; log showed "render adapter chosen" at :02, "hook installed" at :04). Call install_gpu_pref_hook() at the top of real_main(), before any command runs, so it beats the first DXGI factory. (open() still calls it too; Once makes the earliest call win.) Also fix the cosmetic function-cast-as-integer warning. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
a01f8a2f58 |
feat(host/windows): port Apollo's win32u GPU-preference hook (fix hybrid-GPU DDA churn)
Root cause of the ACCESS_LOST (0x887A0026) churn + context-change freeze, found live: the box is a HYBRID system (RTX 4090 + AMD Radeon iGPU + SudoVDA). DXGI does hybrid GPU-preference resolution and REPARENTS the SudoVDA output between adapters (SET_RENDER_ADAPTER is ignored — the IDD lands on the iGPU 0x23664 while we duplicate on the 4090 0x15768), which constantly invalidates Desktop Duplication. Apollo runs fine on this same box because it hooks this away. Port Apollo's hook: replace win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue to always report D3DKMT_GPU_PREFERENCE_STATE_UNSPECIFIED, so DXGI skips preference resolution and never reparents the output → DDA stays on one adapter. Installed once before the first DXGI factory/enumeration (DuplCapturer::open). We fully replace the function (never call the original) so a 12-byte absolute-jmp prologue patch suffices — no detour crate / C length-disassembler dependency, just VirtualProtect. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
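For illustration, a 12-byte absolute jump on x86-64 is commonly encoded as `mov rax, imm64` followed by `jmp rax`. The commit does not state which encoding the hook uses, so treating it as this one is an assumption, and `abs_jmp_stub` is a hypothetical helper name. The sketch only builds the bytes; writing them over a function prologue (VirtualProtect, then the copy) is deliberately left out.

```rust
/// Build the 12-byte stub `mov rax, <target>; jmp rax` for x86-64.
/// Layout: 48 B8 <8-byte little-endian address> FF E0.
pub fn abs_jmp_stub(target: u64) -> [u8; 12] {
    let mut stub = [0u8; 12];
    stub[0] = 0x48; // REX.W prefix
    stub[1] = 0xB8; // mov rax, imm64
    stub[2..10].copy_from_slice(&target.to_le_bytes());
    stub[10] = 0xFF; // jmp r/m64 ...
    stub[11] = 0xE0; // ... with rax as the operand
    stub
}

/// Compare bytes read back from the patched location against the intended stub,
/// the kind of check one would run after applying an inline patch.
pub fn patch_landed(readback: &[u8], expected: &[u8; 12]) -> bool {
    readback.len() >= 12 && readback[..12] == expected[..]
}
```

Because the replaced function is never called again, no trampoline for the overwritten prologue bytes is needed, which is why a fixed-size stub is enough.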
||
|
|
61fd75dc33 |
fix(host/windows): re-isolate/re-attach desktop ONLY on the secure desktop
recreate_dupl called reassert_isolation (a display-TOPOLOGY change via isolate_displays) + attach_input_desktop on EVERY ACCESS_LOST rebuild — 200× in a 6 s SDR session. A topology change itself invalidates the freshly-rebuilt duplication, so the next acquire is ACCESS_LOST → recreate → reassert → a self-feeding 0x887A0026 churn that freezes the stream and never recovers across context changes (lock / login / post-login). Gate both behind is_secure_desktop(): the heavy topology work runs only on the actual Winlogon (secure/login) desktop — where a physical monitor can grab the secure desktop off our virtual output. Routine churn, the lock screen, and post-login are all on the normal desktop, so they take a light re-duplicate with no topology meddling. Apollo isolates once at startup; its recovery just re-duplicates — this matches that. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d11f2bf800 |
fix(host/windows): stop the DDA freeze — kill the HDR format-change storm + throttle ACCESS_LOST recovery
Two freeze drivers found live on the RTX box (DDA-only, 5K@240 HDR SudoVDA):
Step 1 — the per-frame format-change check (
|
||
|
|
995db69387 |
fix(host/windows): detect format/size change on the DDA acquire path
DDA only re-read the duplication format/size on rebuild (recreate_dupl) and initial open. A mid-stream HDR<->SDR flip (FP16<->BGRA — e.g. the SudoVDA output dropping out of HDR for the secure desktop) or a resolution change that does NOT raise ACCESS_LOST left hdr_fp16/width/height stale, so present_acquired copied into a mismatched-format/size target — the secure-desktop "works once, then HDR breaks" symptom. Re-read the acquired texture's desc every frame (as Apollo does) and rebuild on a real change instead of presenting a mismatched frame; throttled like the ACCESS_LOST path so a flapping toggle can't hammer DuplicateOutput. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
6ea52b0372 |
feat(host/windows): SDR-while-secure — drop SudoVDA out of HDR on Winlogon so DDA captures it
When the DDA-on-secure path is enabled (PUNKTFUNK_SECURE_DDA=1), the mux now toggles the SudoVDA's advanced-color (HDR) state via the CCD API (sudovda::set_advanced_color → DisplayConfigSetDeviceInfo + DISPLAYCONFIG_SET_ADVANCED_COLOR_STATE): on entering the secure (Winlogon) desktop it disables HDR so the lock/UAC renders SDR/composed (no fullscreen independent-flip → DDA can duplicate it instead of storming ACCESS_LOST/black), opens DDA fresh on the now-SDR output; on returning to normal it re-enables HDR and rebuilds the helper so WGC re-detects the restored colorspace. Also debounce the DesktopWatcher (publish a Default↔Winlogon change only after it is stable ~80ms) so transient flaps during the transition don't thrash the mux. Default (no flag) is unchanged: WGC stays live through a lock, no DDA switch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
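The debounce mentioned above (publish a desktop change only once it has been stable for about 80 ms) could look roughly like this. `Debouncer` and `Desktop` are hypothetical names for illustration; the real DesktopWatcher may be structured differently.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Desktop {
    Default,
    Winlogon,
}

/// Publishes a new value only after it has been observed continuously
/// for at least `stable_for`.
pub struct Debouncer<T> {
    stable_for: Duration,
    published: T,
    candidate: Option<(T, Instant)>,
}

impl<T: PartialEq + Clone> Debouncer<T> {
    pub fn new(initial: T, stable_for: Duration) -> Self {
        Self { stable_for, published: initial, candidate: None }
    }

    /// Feed one poll result. Returns Some(value) when a change is published.
    pub fn observe(&mut self, value: T, now: Instant) -> Option<T> {
        if value == self.published {
            // Flapped back to the published state: forget the candidate.
            self.candidate = None;
            return None;
        }
        let same = matches!(&self.candidate, Some((c, _)) if *c == value);
        if !same {
            // First sighting of this value: start its stability timer.
            self.candidate = Some((value, now));
            return None;
        }
        let since = self.candidate.as_ref().map(|(_, t)| *t).unwrap_or(now);
        if now.duration_since(since) >= self.stable_for {
            self.published = value.clone();
            self.candidate = None;
            Some(value)
        } else {
            None
        }
    }
}
```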
||
|
|
555ec2a3b7 |
Revert "fix(host/windows): rebuild the output fresh on every WGC↔DDA source switch"
This reverts commit
|
||
|
|
3f191ba2ea |
fix(host/windows): rebuild the output fresh on every WGC↔DDA source switch
Key insight (from the user): a fresh RECONNECT shows the secure desktop but the live transition does not — so the difference is what a fresh session does that the live switch skipped. A reconnect runs build() = REMOVE + fresh ADD of the SudoVDA monitor + re-isolate + a fresh capturer; the live transition instead reused the session-start output (created while on the NORMAL desktop), which goes born-lost (ACCESS_LOST storm → black) on the secure desktop. Fix: virtual_stream_relay now calls build() on EVERY source switch (both WGC→DDA and DDA→WGC), then opens DDA on the new target for secure / uses the fresh helper for normal. This makes each transition equivalent to the reconnect that works — fixing both the WGC→DDA cutover (secure desktop now in the clean output state DDA can duplicate) and the DDA→WGC cutover (a fresh helper's first frame is its opening IDR, so await_idr clears immediately instead of waiting on a wedged helper). Costs a ~1-2s rebuild per transition, acceptable for UAC/lock events. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
ef4786387e |
feat(host/windows): force-composed-flip overlay to capture the secure desktop
The secure (Winlogon: UAC/lock/login) desktop presents via fullscreen independent-flip/MPO — it scans out bypassing DWM composition, so DXGI Desktop Duplication returns born-lost DXGI_ERROR_ACCESS_LOST (the client sees black; the UAC only "flashes" during the brief composed transition). Confirmed live: stable 4090 LUID across the storm (NOT reparenting) on an FP16 HDR output, recovering only when the screen changes. Fix (non-input, no system-wide registry change): capture/composed_flip.rs keeps a tiny click-through near-invisible TOPMOST LAYERED window alive on the current input desktop. Any visible window on the output disqualifies independent-flip → DWM composites → DDA can capture. A dedicated thread follows the input desktop (Default↔Winlogon) and recreates the window there on each switch (a window is bound to its desktop), re-asserting topmost + pumping messages every 200ms. Started for the two-process stream's lifetime; gated by PUNKTFUNK_FORCE_COMPOSED (default on, =0 to disable). Needs GENERIC_ALL on OpenInputDesktop for DESKTOP_CREATEWINDOW (0x80070005 otherwise). Validated: overlay creates on the Default desktop; live lock test pending. Also includes SET_RENDER_ADAPTER (sudovda.rs, Apollo item #16): pins the IDD render GPU to the NVENC GPU before ADD — issued + accepted live, though the secure-desktop storm was proven to be independent-flip (stable LUID), not reparenting, so it's correctness/hygiene here rather than this bug's fix. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
6d7301ccf5 |
fix(windows): two-pass cursor compositing (alpha + XOR) in DXGI capture
A single DXGI cursor shape can need BOTH an alpha-blended layer AND a screen-inverting (XOR) layer at once — a masked-color text I-beam (opaque hot-spot + inverting bar) or a monochrome cursor mixing opaque and invert pixels. The old path produced ONE BGRA image per shape and picked ONE blend (cursor_invert) for the whole shape, so such mixed cursors rendered wrong (masked-color opaque pixels forced through the invert blend; monochrome (AND=1,XOR=1) invert pixels approximated as solid black). Port Apollo/Sunshine's decomposition: convert_pointer_shape now returns a CursorShape with optional alpha/xor layers; CursorCompositor holds tex_alpha + tex_xor and draw_layer renders each with its own blend (alpha = src-over, HDR-scaled; XOR = inversion, unscaled — it operates on the framebuffer reference). The CPU software path blends both layers too. Empty layers are never uploaded or drawn. Removes the single cursor_invert flag. Fixes #13 in docs/apollo-comparison.md. Independently reviewed (ship); Windows-only code — compile verified by CI / dev VM. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
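As a sketch of the two-layer idea for the monochrome case, the function below maps one (AND, XOR) mask bit pair to an alpha-layer pixel, an XOR-layer pixel, or nothing, using the conventional Windows cursor mask semantics. `LayeredPixel`, `decompose_mono`, and `layers_used` are hypothetical names, not the `convert_pointer_shape` implementation; the BGRA pixel values chosen here are assumptions for illustration.

```rust
/// One decomposed cursor pixel: BGRA for the alpha layer and/or the XOR layer.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct LayeredPixel {
    pub alpha: Option<[u8; 4]>,
    pub xor: Option<[u8; 4]>,
}

/// Monochrome cursor semantics per pixel:
///   AND=0, XOR=0 -> opaque black      (alpha layer)
///   AND=0, XOR=1 -> opaque white      (alpha layer)
///   AND=1, XOR=0 -> transparent       (no layer)
///   AND=1, XOR=1 -> invert the screen (XOR layer)
pub fn decompose_mono(and_bit: bool, xor_bit: bool) -> LayeredPixel {
    match (and_bit, xor_bit) {
        (false, false) => LayeredPixel { alpha: Some([0, 0, 0, 255]), xor: None },
        (false, true) => LayeredPixel { alpha: Some([255, 255, 255, 255]), xor: None },
        (true, false) => LayeredPixel::default(),
        (true, true) => LayeredPixel { alpha: None, xor: Some([255, 255, 255, 255]) },
    }
}

/// Report which layers are non-empty, so that an empty layer can be skipped
/// entirely (neither uploaded nor drawn).
pub fn layers_used(pixels: &[(bool, bool)]) -> (bool, bool) {
    let mut alpha = false;
    let mut xor = false;
    for &(a, x) in pixels {
        let px = decompose_mono(a, x);
        alpha |= px.alpha.is_some();
        xor |= px.xor.is_some();
    }
    (alpha, xor)
}
```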
||
|
|
7bf2899301 |
fix(host/windows): secure-desktop black screen — capture the real frame, don't seed black
Root cause (confirmed live: "black until I pressed a key, then the image came back"): the secure desktop (lock/login/UAC) is STATIC, and DXGI Desktop Duplication only emits a frame on CHANGE. On the normal→secure switch the duplication is rebuilt (recreate_dupl / try_reduplicate), and we then SEEDED A BLACK frame as last_present, which the static secure desktop never replaced (no change-frame) until the user pressed a key. So we streamed black.
Fix: after rebuilding the duplication, CAPTURE the current desktop frame instead of seeding black. A freshly-created duplication's first AcquireNextFrame returns the full current desktop; grab it and present it. New `present_acquired` factors the frame-processing out of `acquire`; both recovery paths now call it:
- recreate_dupl: after adopting the new duplication, acquire+present the real frame (born-lost ACCESS_LOST / no-initial-frame → seed black as fallback and let the 250ms-throttled caller retry: a brief flash, then real content).
- try_reduplicate: adopt-first, then capture its probe frame (was discarded).
Also (independently-correct safe fixes, per the adversarial review):
- DesktopWatcher computes the current desktop synchronously in start() before returning, so a session that begins on the secure desktop (reconnect to a locked box) doesn't relay one stale normal-desktop frame (the "flash").
- DuplCapturer::open reasserts SudoVDA isolation at open time (mirrors recreate_dupl), which forces the secure desktop back onto the virtual output if a lock/UAC re-attached a physical monitor.
- Instrumentation: dbg_black_seeds counter + a throttled warn when black is seeded, and an info when a real secure-desktop frame is captured on recovery.
Pending: the user's real-lock smoke test on the 4090 (a headless PsExec LockWorkStation runs as SYSTEM and can't lock an interactive session, so this must be validated with an actual lock).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
8d6cbb81fe |
fix(host/windows): merge host PUNKTFUNK_* env into the WGC helper's environment
CreateProcessAsUserW gives the spawned helper the *user's* environment block, so the host's PUNKTFUNK_ENCODER=nvenc (and ZEROCOPY/PERF/…) were dropped and the helper fell back to the software (H.264-only) encoder — the client negotiated H265 → "WGC helper exited". `merged_env_block` now parses the user block, strips any PUNKTFUNK_* it carried, overlays this (host) process's PUNKTFUNK_* vars, and passes the merged UTF-16 block. Validated live on the RTX 4090 (host as SYSTEM): the helper spawns via CreateProcessAsUserW, runs WGC with no hang (HDR FP16 BT.2020 PQ), opens NVENC (D3D11 Main10), and relays AUs over the pipe — client-rs decoded 411 HEVC Main-10 frames over the LAN. Step 4 (spawn + relay) complete. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
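The merge can be sketched as a pure function over key/value pairs. This is an illustration, not the repository's `merged_env_block`: the signature is assumed, and the case-insensitive prefix match is an assumption based on Windows treating variable names case-insensitively. The output follows the Windows environment block layout, `KEY=VALUE` entries each ending in a NUL with one extra NUL closing the block, encoded as UTF-16.

```rust
/// True for variables the host owns (prefix match, case-insensitive).
fn is_ours(key: &str) -> bool {
    key.to_ascii_uppercase().starts_with("PUNKTFUNK_")
}

/// Merge the user's environment with the host's PUNKTFUNK_* variables and
/// encode the result as a UTF-16 environment block.
pub fn merged_env_block(user: &[(String, String)], host: &[(String, String)]) -> Vec<u16> {
    // 1. Keep everything from the user block except PUNKTFUNK_* entries.
    let mut merged: Vec<(&str, &str)> = user
        .iter()
        .filter(|(k, _)| !is_ours(k))
        .map(|(k, v)| (k.as_str(), v.as_str()))
        .collect();
    // 2. Overlay the host's PUNKTFUNK_* entries (and nothing else from the host).
    merged.extend(
        host.iter()
            .filter(|(k, _)| is_ours(k))
            .map(|(k, v)| (k.as_str(), v.as_str())),
    );
    // 3. Encode as "K=V\0K=V\0\0".
    let mut block: Vec<u16> = Vec::new();
    for (k, v) in merged {
        block.extend(format!("{k}={v}").encode_utf16());
        block.push(0);
    }
    if block.is_empty() {
        block.push(0); // an empty block is still closed by two NULs
    }
    block.push(0);
    block
}
```

A block like this would be passed with the CREATE_UNICODE_ENVIRONMENT creation flag, since the entries are UTF-16.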
||
|
|
140209bbfc |
feat(host/windows): two-process secure-desktop step 5 — DDA mux on Winlogon
`virtual_stream_relay` now muxes the AU source by input desktop. A DesktopWatcher (SYSTEM-only Winlogon-name poll) drives it: the user-session WGC helper relay feeds the normal (Default) desktop; the host's OWN DDA capturer+encoder — opened lazily on the first secure transition, on the same SudoVDA target with a no-op keepalive (the host still holds the real isolation owner) — captures the secure (Winlogon: UAC/lock/login) desktop that WGC can't see. Every switch latches "wait for IDR" and forces the now-active source to emit a keyframe (the two encoders keep independent infinite-GOP state, so the client must resume on an IDR); returning to the helper also drains its stale buffered AUs first. Reconfigure drops the stale-target DDA; keyframe requests route to the live source. Send path (FEC/seal/paced-send) unchanged. Also: wgc_relay gains try_recv (drain on switch-back); open_dda takes dims as args (avoids a closure borrow of the reassigned cur_mode); the forward! macro returns bool with `break 'outer` at the call site (no in-macro label hygiene). cfg-gated windows-only. Live validation (UAC switch over a session) pending. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
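The "wait for IDR" latch can be sketched as a small state machine. `Mux`, `Source`, and `Au` below are hypothetical types for illustration and do not mirror the actual `virtual_stream_relay` loop: a switch arms the latch, and access units from the newly active source are dropped until its first keyframe arrives.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Helper, // user-session WGC helper (normal desktop)
    Dda,    // host's own DDA capturer (secure desktop)
}

/// Minimal stand-in for an encoded access unit.
pub struct Au {
    pub source: Source,
    pub keyframe: bool,
}

pub struct Mux {
    active: Source,
    await_idr: bool,
}

impl Mux {
    /// Assumes the stream's first AU is already a keyframe, so no latch at start.
    pub fn new(active: Source) -> Self {
        Self { active, await_idr: false }
    }

    /// Switch the active source. Returns true when the caller should ask the
    /// new source to emit a keyframe.
    pub fn switch_to(&mut self, source: Source) -> bool {
        if source == self.active {
            return false;
        }
        self.active = source;
        self.await_idr = true;
        true
    }

    /// Decide whether an AU may be forwarded to the send path.
    pub fn forward(&mut self, au: &Au) -> bool {
        if au.source != self.active {
            return false; // stale AU from the inactive encoder
        }
        if self.await_idr {
            if !au.keyframe {
                return false; // client cannot resume on a non-IDR frame
            }
            self.await_idr = false;
        }
        true
    }
}
```

The latch matters because the two encoders keep independent GOP state, so a delta frame from one cannot be decoded against reference frames from the other.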
||
|
|
9f50b3930d |
feat(host/windows): two-process secure-desktop step 4 — spawn helper + relay AUs
The SYSTEM host now sources the normal-desktop video from a user-session WGC helper instead of capturing in-process (WGC won't activate as SYSTEM). New `capture/wgc_relay.rs`: `HelperRelay::spawn` launches `m3-host wgc-helper` in the interactive user session via CreateProcessAsUserW (WTSQueryUserToken → DuplicateTokenEx(TokenPrimary) → lpDesktop="winsta0\\default", CREATE_NO_WINDOW) with three anonymous pipes — stdout (framed Annex-B AUs → parsed back to RelayAu), stdin (control: force-keyframe), stderr (helper logs → host tracing). The host holds the SudoVDA keepalive (sole isolation/topology owner); the helper captures by GDI name only. m3.rs: `virtual_stream` dispatches to the new `virtual_stream_relay` when `should_use_helper()` (running as SYSTEM, or PUNKTFUNK_FORCE_HELPER; disable with PUNKTFUNK_NO_HELPER). The relay loop feeds the existing send thread — same FEC/seal/paced-send path. Reconfigure rebuilds the output + re-spawns the helper; keyframe requests forward over the control pipe; helper pts_ns (same-machine monotonic clock) is used directly as capture_ns. Disconnect ends the stream (step 6 adds the relaunch watchdog). wgc_helper.rs: reads the stdin control byte to request an IDR; --bit-depth flag threaded through so SDR 10-bit (Main10) negotiation reaches the helper's encoder. cfg-gated windows-only; Linux/macOS build unaffected. Step 5 (DesktopWatcher mux to host DDA on the Winlogon secure desktop) is next. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
80e222d3b8 |
feat(host/windows): DesktopWatcher (secure-desktop detection) — step 1 of the two-process build
Polls the input-desktop name (OpenInputDesktop + GetUserObjectInformationW(UOI_NAME)) on its own thread → Default/Winlogon atomic; the authoritative normal-vs-secure signal for the capture mux + input path (WTS notifications miss UAC). Not yet wired into the mux (needs the SYSTEM host + WGC helper, steps 3-5 in docs/windows-secure-desktop.md). NOTE: detecting the secure desktop requires the host to run as SYSTEM (a user-token process can't OpenInputDesktop the Winlogon desktop). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
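A platform-neutral sketch of the watcher pattern described in the commit above: a thread polls a desktop-name source and publishes Default/Winlogon through an atomic. The `read_name` closure stands in for the OpenInputDesktop + GetUserObjectInformationW(UOI_NAME) call. All names here, the bounded `rounds` loop, and the choice to keep the previous value when a read fails are assumptions made for illustration.

```rust
use std::sync::atomic::{AtomicU8, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum DesktopKind {
    Default = 0,
    Secure = 1,
}

/// Map an input-desktop name to a kind; anything but "Winlogon" is
/// treated as the normal desktop (an assumption of this sketch).
pub fn classify(name: &str) -> DesktopKind {
    if name.eq_ignore_ascii_case("Winlogon") {
        DesktopKind::Secure
    } else {
        DesktopKind::Default
    }
}

/// Read the published state from any thread.
pub fn load_kind(state: &AtomicU8) -> DesktopKind {
    if state.load(Ordering::Relaxed) == DesktopKind::Secure as u8 {
        DesktopKind::Secure
    } else {
        DesktopKind::Default
    }
}

/// Poll `read_name` on its own thread and publish the result.
/// A failed read (None) leaves the last known state untouched.
pub fn spawn_watcher<F>(
    state: Arc<AtomicU8>,
    mut read_name: F,
    interval: Duration,
    rounds: usize,
) -> thread::JoinHandle<()>
where
    F: FnMut() -> Option<String> + Send + 'static,
{
    thread::spawn(move || {
        for _ in 0..rounds {
            if let Some(name) = read_name() {
                state.store(classify(&name) as u8, Ordering::Relaxed);
            }
            thread::sleep(interval);
        }
    })
}
```

The capture mux and the input path would then only call `load_kind`, never the Win32 APIs directly.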
|
|
5c2bcbc2a2 |
docs(windows): secure-desktop two-process design + WGC impersonation attempt (vestigial)
apple / swift (push) Successful in 55s
android / android (push) Has been cancelled
ci / rust (push) Has been cancelled
ci / web (push) Has been cancelled
ci / docs-site (push) Has been cancelled
ci / bench (push) Has been cancelled
deb / build-publish (push) Has been cancelled
decky / build-publish (push) Has been cancelled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Has been cancelled
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Has been cancelled
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Has been cancelled
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Has been cancelled
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Has been cancelled
docker / deploy-docs (push) Has been cancelled
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Has been cancelled
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Has been cancelled
Validated design for adding secure-desktop (UAC/lock/login) coverage on top of the shipped WGC animation fix. Key verified constraint: WGC won't activate under SYSTEM (0x80070424) even with thread-level ImpersonateLoggedOnUser, and DDA+SendInput on Winlogon need LOCAL_SYSTEM — so one process can't do both. Architecture: SYSTEM host (QUIC + SudoVDA + DDA-secure + SendInput + AU mux) + a USER-session WGC helper (CreateProcessAsUser) that relays encoded Annex-B AUs over a named pipe; the host muxes helper-AUs (normal desktop) vs its own DDA encoder (secure desktop), switched by a desktop-name watcher. No shared GPU texture (rejected — MIC/keyed-mutex pain); just AU bytes. docs/windows-secure-desktop.md has the ordered, box-testable steps. The impersonate_active_user() in wgc.rs is kept as a harmless no-op (under a user-token process WTSQueryUserToken fails → no impersonation → WGC works natively); it does NOT make WGC work under SYSTEM (the two-process design uses a real user process for WGC instead). + Win32_System_RemoteDesktop. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
28ab448a29 |
feat(host/windows): WGC capture backend (overlay/HDR-correct) with watchdog'd DDA fallback
android / android (push) Failing after 46s
apple / swift (push) Successful in 54s
ci / rust (push) Failing after 1m16s
ci / web (push) Successful in 31s
ci / docs-site (push) Successful in 27s
deb / build-publish (push) Successful in 2m23s
decky / build-publish (push) Successful in 10s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m31s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m50s
The capture-architecture reset from the research: add a Windows.Graphics.Capture (WGC) backend that captures the COMPOSED desktop, including the overlay/independent-flip/MPO planes DXGI Desktop Duplication misses, which structurally fixes the frozen HDR animations + video (proven live: a WGC frame decodes to the real 5120x1440 HDR content DDA freezes on). It reuses the whole pipeline unchanged: the WGC frame's GPU texture → same scRGB→BT.2020-PQ shader → NVENC zero-copy; the OS composites the cursor (IsCursorCaptureEnabled) so no manual cursor pass. crates/punktfunk-host/src/capture/wgc.rs; find_output/make_device/HdrConverter/nudge_cursor_onto made pub(crate) for reuse.

Reliability findings + mitigations (live on the RTX 4090):
- WGC can't activate under the SYSTEM account (0x80070424); it needs the interactive user token. The host must run as the user for WGC (run.cmd: drop PsExec -s). DDA still needs SYSTEM for the secure desktop; that token reconciliation (impersonation) is the remaining task.
- WGC's Direct3D11CaptureFramePool::CreateFreeThreaded intermittently HANGS on the headless SudoVDA (IddCx) display, correlated with accumulated SudoVDA churn (failed REMOVEs leaving lingering displays); a clean state opens reliably. Since it's a blocking hang, capture_virtual_output runs WGC open on a watchdog thread with a 5s timeout and falls back to DDA on hang/error, so the session is NEVER left black: WGC when it opens (fixed animations), DDA otherwise. First-frame nudge added (WGC fires FrameArrived on change; a static desktop otherwise never delivers the first frame).
- Default WGC; PUNKTFUNK_CAPTURE=dda forces DDA. DDA path unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
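The watchdog'd open from the commit above (run the potentially hanging WGC open on its own thread, wait up to a timeout, fall back to DDA on hang or error) can be sketched with a channel and `recv_timeout`. The `Backend` enum and the closure signature are hypothetical stand-ins for the real capturer types.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    Wgc,
    Dda,
}

/// Run `open_wgc` on a watchdog thread. If it succeeds within `timeout`
/// use WGC; on error or hang fall back to DDA.
pub fn open_with_watchdog<F>(open_wgc: F, timeout: Duration) -> Backend
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // If the open hangs forever this thread is simply abandoned;
        // a late send fails harmlessly once the receiver is dropped.
        let _ = tx.send(open_wgc());
    });
    match rx.recv_timeout(timeout) {
        Ok(Ok(())) => Backend::Wgc,
        // Open failed, timed out, or the thread died: use DDA.
        Ok(Err(_)) | Err(_) => Backend::Dda,
    }
}
```

A blocking hang cannot be cancelled from outside, which is why the sketch leaks the stuck thread rather than trying to join it.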
|
|
b1e95a386f |
fix(host/windows): tiered DXGI recovery — cheap re-DuplicateOutput for the HDR ACCESS_LOST churn
apple / swift (push) Successful in 53s
ci / web (push) Successful in 28s
android / android (push) Successful in 1m46s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 11s
ci / rust (push) Successful in 1m4s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m17s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m56s
The HDR path produced a constant ACCESS_LOST churn during real desktop activity (window resize / Start menu / DWM transitions): the duplication keeps getting invalidated but the OUTPUT stays valid (probe passes — 0 born-lost over 72 rebuilds). The old recovery did a FULL rebuild (new device + factory) on every loss, which re-inits NVENC + seeds black + was throttled to 4x/s → mostly-frozen, re-init churn = "broken animations". Now recovery is tiered (mirrors Sunshine): try_reduplicate() does a fresh DuplicateOutput on the EXISTING device+output — no new device, so NO encoder re-init, NO black seed, gpu_copy/HDR textures/last_present kept → frames resume immediately. Only a genuine output loss (secure-desktop switch) or a dead device (DEVICE_REMOVED/RESET) falls back to the full, throttled recreate_dupl. Both paths probe the new duplication and reject a born-lost one. Validated synthetically (1080p60 + 5120x1440@240 HDR): pipeline stable, 0 churn, frames flow. The real-desktop churn needs live validation (can't synthesize DWM animations). Secure-desktop "UI never appears in-session" is a separate issue (output gone in-session; only a fresh monitor re-add works) — still open. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
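A minimal sketch of the tiered recovery described in the commit above, with the DXGI calls replaced by closures. The enums, the `recover` function, and the rule that a failed cheap attempt falls through to the full rebuild are assumptions for illustration; the real `try_reduplicate` / `recreate_dupl` also throttle the full path, which is omitted here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Loss {
    AccessLost,
    DeviceRemoved,
    DeviceReset,
}

/// Result of probing a freshly created duplication.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Probe {
    Frame,
    WaitTimeout,
    AccessLost,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Reduplicate,
    FullRecreate,
}

/// A duplication is adopted only when the probe shows it is live;
/// a born-lost one (immediate ACCESS_LOST) is rejected.
pub fn adopt(probe: Probe) -> bool {
    !matches!(probe, Probe::AccessLost)
}

/// Try the cheap tier first for a plain ACCESS_LOST, then the full
/// rebuild. Each closure returns None when creation itself failed.
pub fn recover(
    loss: Loss,
    cheap: impl FnOnce() -> Option<Probe>,
    full: impl FnOnce() -> Option<Probe>,
) -> Option<Tier> {
    if loss == Loss::AccessLost && cheap().map_or(false, adopt) {
        return Some(Tier::Reduplicate);
    }
    if full().map_or(false, adopt) {
        Some(Tier::FullRecreate)
    } else {
        None // caller repeats the last frame and retries later
    }
}
```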
|
|
0a3b92d994 |
fix(host/windows): HDR cursor brightness (203-nit) + probe-before-adopt recovery; windows-client bootstrap doc
apple / swift (push) Successful in 55s
android / android (push) Successful in 2m43s
ci / web (push) Successful in 31s
ci / docs-site (push) Successful in 37s
ci / bench (push) Successful in 1m35s
ci / rust (push) Successful in 7m7s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m33s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m33s
docker / deploy-docs (push) Successful in 18s
- HDR cursor: sRGB→linear decode + scale to HDR graphics white (PUNKTFUNK_HDR_CURSOR_NITS, default 203 per BT.2408) in the FP16 cursor composite, so it's no longer ~2.5x too dim. SDR path unchanged; the masked-color (I-beam) inversion blend left unscaled. Cursor cbuffer widened 16→32 + bound to PS. (Validated live: cursor now correct brightness in HDR.)
- Secure-desktop recovery: recreate_dupl now PROBES the rebuilt duplication with a 50ms AcquireNextFrame and only adopts it when live (Ok/WAIT_TIMEOUT); a born-lost one (immediate ACCESS_LOST) is dropped so the caller repeats the last frame + retries. Plus reassert_isolation() re-detaches physical displays on every recovery (re-routing the secure/HDR desktop to the virtual output, the delta a fresh reconnect has). NOTE: the born-lost ACCESS_LOST storm in HDR is NOT yet resolved by these; it is still under investigation (animations/secure-UI/cursor-trail in HDR remain).
- docs/windows-client-bootstrap.md: handoff for the native Windows Rust client (windows-rs Reactor + WinUI 3 SwapChainPanel, D3D11VA decode, WASAPI audio, SDL3 input; ports crates/punktfunk-client-linux; 10-bit/HDR present; dev boxes + gotchas).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
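The cursor brightness fix in the commit above amounts to decoding sRGB to linear light and scaling to the target white level. A CPU-side sketch of that arithmetic follows; it assumes scRGB's convention that 1.0 corresponds to 80 nits, which is consistent with the "~2.5x too dim" figure (203 / 80 ≈ 2.54). The function names are hypothetical, and the real work happens in the FP16 cursor composite shader.

```rust
/// scRGB reference white: a linear value of 1.0 is 80 nits (assumed here).
pub const SCRGB_REFERENCE_NITS: f32 = 80.0;
/// Default HDR graphics white, per the commit (BT.2408).
pub const DEFAULT_CURSOR_NITS: f32 = 203.0;

/// Standard sRGB electro-optical transfer function (per channel, 0..1).
pub fn srgb_to_linear(c: f32) -> f32 {
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// Convert an sRGB cursor pixel to scRGB at the requested white level.
pub fn cursor_to_scrgb(rgb: [f32; 3], nits: f32) -> [f32; 3] {
    let scale = nits / SCRGB_REFERENCE_NITS;
    rgb.map(|c| srgb_to_linear(c) * scale)
}
```

With the default of 203 nits, a pure white cursor pixel lands at roughly 2.54 in scRGB instead of 1.0.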
|
|
bbabc04bca |
feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an 8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the capture losing the stream across a secure-desktop transition. Protocol (punktfunk-core/quic.rs): - Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`, both as optional trailing bytes (back-compat). client-rs advertises 10-bit via PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives the native clients). Regenerated punktfunk_core.h. Windows host: - 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10, pixelBitDepthMinus8). - HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the BGRA path. Reacts dynamically to the HDR toggle. - Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s, repeating the last good frame between attempts (format-tagged last_present). During a secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the send loop for seconds so the client timed out and disconnected — now the session stays fed (frozen) until the desktop returns. Also seeds a black frame on recovery. Apple client (PunktfunkKit): - Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010, and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged. Switches automatically on a mid-session HDR toggle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
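The negotiation rule from the commit above (capabilities arrive as an optional trailing byte, and the host enables 10-bit only when the client advertised it AND the host opted in) can be sketched as follows. The flag values, the `base_len` parameter, and the function names are illustrative assumptions; the real wire layout lives in punktfunk-core/quic.rs.

```rust
// Hypothetical bit assignments, not taken from the source.
pub const VIDEO_CAP_10BIT: u8 = 0x01;
pub const VIDEO_CAP_HDR: u8 = 0x02;

/// Read `video_caps` from the optional byte that follows the fixed part
/// of a Hello message. An older client that sends no trailing byte is
/// treated as advertising nothing (back-compat).
pub fn parse_video_caps(hello: &[u8], base_len: usize) -> u8 {
    hello.get(base_len).copied().unwrap_or(0)
}

/// Pick the stream bit depth: 10 only when both sides agree, else 8,
/// so an 8-bit client is never sent a stream it cannot decode.
pub fn negotiate_bit_depth(client_caps: u8, host_10bit_enabled: bool) -> u8 {
    if host_10bit_enabled && (client_caps & VIDEO_CAP_10BIT) != 0 {
        10
    } else {
        8
    }
}
```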
|
|
f4b4a6c1e4 |
feat(host/windows): native res, cursor, secure-desktop capture, windowless SYSTEM launch
apple / swift (push) Successful in 52s
ci / rust (push) Failing after 36s
ci / web (push) Successful in 31s
android / android (push) Successful in 1m52s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m19s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m57s
docker / deploy-docs (push) Successful in 17s
Live-validated Mac <-> RTX 4090 at the display's native 5120x1440@240: - Resolution: set_active_mode enumerates the IDD's advertised modes and sets the requested resolution at the best supported refresh (keeps 5120x1440@240; no more silent fallback to the 1080p OS default when an exact mode is briefly unavailable). - Bitrate auto-cap: NVENC init probes and steps the average bitrate down to the GPU's codec-level max so a high client bitrate connects (matches the Linux host; we do not split NVENC sessions). - Mouse cursor: DXGI duplication excludes the HW cursor; capture the pointer shape/position (GetFramePointerShape) and GPU-composite it before NVENC. Color cursors alpha-blend; masked-color (the text I-beam) uses an INV_DEST_COLOR inversion blend so the caret inverts the screen and shows on any background (no black box); monochrome handled too. - Secure desktop (lock / login / UAC): run as SYSTEM in the interactive session, follow the input desktop via SetThreadDesktop, and on the WinSta switch recreate the D3D11 device and re-resolve the virtual output's GDI name from the stable SudoVDA target id (the name changes across the topology rebuild; the old failure hunted the stale \\.\DISPLAYn and dropped). ACCESS_LOST / INVALID_CALL / device-removed are recoverable, and a mid-stream resolution change is followed (capturer + NVENC re-init at the new size). isolate_displays detaches other monitors so Winlogon renders to the virtual output. One real session recovered 1012 desktop switches and completed cleanly. Windows-only backends; Linux/macOS unaffected. Builds clean on x86_64-pc-windows-msvc. Deployment (windowless SYSTEM launch via PsExec + hidden VBScript) documented in docs/windows-host.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
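The mode selection described in the commit above (keep the requested resolution and take the best refresh the display actually advertises) can be sketched like this. The `Mode` struct and the tie-breaking policy (highest refresh not above the request, else the lowest available) are assumptions; the real `set_active_mode` enumerates modes from the IDD.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Mode {
    pub width: u32,
    pub height: u32,
    pub refresh_hz: u32,
}

/// Pick a mode with exactly the requested resolution. Among those,
/// prefer the highest refresh that does not exceed `wanted_hz`; if
/// every candidate is faster than requested, take the slowest one.
/// Returns None when the resolution is not advertised at all.
pub fn pick_mode(modes: &[Mode], width: u32, height: u32, wanted_hz: u32) -> Option<Mode> {
    let matching: Vec<Mode> = modes
        .iter()
        .copied()
        .filter(|m| m.width == width && m.height == height)
        .collect();
    matching
        .iter()
        .copied()
        .filter(|m| m.refresh_hz <= wanted_hz)
        .max_by_key(|m| m.refresh_hz)
        .or_else(|| matching.iter().copied().min_by_key(|m| m.refresh_hz))
}
```

Returning `None` rather than silently picking another resolution lets the caller decide how to handle a missing mode.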
|
|
7654b20b2a |
fix(host/windows): NVENC capture on real GPU + HOME-less config dir
apple / swift (push) Successful in 54s
android / android (push) Failing after 1m44s
ci / rust (push) Successful in 1m18s
ci / web (push) Successful in 28s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 1m50s
decky / build-publish (push) Successful in 10s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Failing after 2m56s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m4s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m48s
docker / deploy-docs (push) Successful in 17s
Validated live on an RTX 4090 (Windows 11) host streaming to the Rust reference client over the LAN: SudoVDA virtual display → DXGI Desktop Duplication (D3D11 zero-copy) → NVENC HEVC → punktfunk/1. 720p60 and 1080p60 both clean (181 / 177 frames, 0 mismatched, p50 1.6 / 3.45 ms cross-machine), coexisting with Apollo.

Two real-hardware bugs the GPU-less VM couldn't surface:
- DXGI capturer: the SudoVDA virtual monitor's DXGI output is enumerated under the GPU that *renders* it (the 4090, LUID 0x15df6), NOT under the SudoVDA "adapter" LUID SudoVDA reports (0x23276). Restricting the output search to that LUID found nothing → "adapter has no output named \\.\DISPLAYn". Now search ALL adapters for the GDI name, bind the D3D11 device to whichever adapter exposes it (NVENC then shares that device), with a settle-retry (the output appears a beat after display creation) and topology logging.
- native_pairing / apps: keyed config paths off raw $HOME, which a Windows service/scheduled-task context doesn't set → "HOME unset" hard-fail at m3-host startup. Route both through gamestream::config_dir(), which falls back to %APPDATA% on Windows (cert/paired/apps now under AppData\Roaming).

clippy -D warnings + build green on x86_64-pc-windows-msvc (default and --features nvenc) and Linux (78/78 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
9c61b03101 |
feat(host/windows): ViGEm rumble back-channel + Windows clippy clean
android / android (push) Failing after 21s
ci / web (push) Failing after 10s
ci / docs-site (push) Failing after 1s
ci / bench (push) Failing after 0s
deb / build-publish (push) Failing after 0s
decky / build-publish (push) Failing after 1s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 0s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Failing after 1s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 0s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 1s
flatpak / build-publish (push) Failing after 0s
apple / swift (push) Successful in 53s
ci / rust (push) Failing after 2m35s
Wire the host→client rumble path on Windows, the analogue of the Linux uinput EV_FF read loop: a game's force-feedback on the virtual Xbox 360 pad is delivered by ViGEm's notification API (`request_notification` → `spawn_thread`, gated by the crate's `unstable_xtarget_notification` feature). A per-pad background thread stores the latest motor levels; `pump_rumble` relays changes to the client on the universal 0xCA plane (motors scaled 0..255 → 0..65535). Dropping the target aborts the notification, so the thread exits with the session. Live verification still needs a physical pad.

Also fix the Windows backends' clippy debt: these modules are cfg-excluded from Linux CI, so `clippy -D warnings` never saw them, and the VM's rustc 1.96 clippy is stricter on shared code than the CI image:
- dxgi: manual checked division → checked_div().map_or
- sendinput: `x = x | y` → `x |= y`
- sudovda: `.then(|| ptr)` → `.then_some(ptr)`
- m3 pick_compositor: drop the needless early return (match form)
- m3 resolve_compositor: Windows arm is a tail expr, not `return`

All Windows backends now build + clippy clean (default and --features nvenc); Linux unaffected (fmt/clippy/check green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
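A sketch of the relay step described in the commit above: remember the last motor levels sent, forward only changes, and widen 0..255 to 0..65535. Multiplying by 257 is one way to map 255 exactly onto 65535; whether the real `pump_rumble` scales this way is an assumption, as are the `RumblePump` type and its idle starting state.

```rust
/// Widen an 8-bit motor level to 16 bits (255 * 257 == 65535).
pub fn scale_motor(level: u8) -> u16 {
    u16::from(level) * 257
}

/// Tracks the last (large, small) motor pair relayed to the client.
pub struct RumblePump {
    last_sent: (u8, u8),
}

impl RumblePump {
    /// Start from "both motors idle" so nothing is sent until a change.
    pub fn new() -> Self {
        RumblePump { last_sent: (0, 0) }
    }

    /// Return the scaled pair to send, or None when nothing changed.
    pub fn pump(&mut self, latest: (u8, u8)) -> Option<(u16, u16)> {
        if latest == self.last_sent {
            return None;
        }
        self.last_sent = latest;
        Some((scale_motor(latest.0), scale_motor(latest.1)))
    }
}
```

Sending only on change keeps the back-channel quiet while a game holds a constant rumble level.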
|
|
2448a33698 |
style(host/windows): rustfmt the Windows backends
apple / swift (push) Successful in 55s
android / android (push) Failing after 1m53s
ci / web (push) Failing after 17s
ci / docs-site (push) Successful in 42s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 7s
ci / rust (push) Failing after 3m5s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 7s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 2s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 0s
flatpak / build-publish (push) Failing after 0s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Failing after 1m43s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m15s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
69ba6ec45d |
feat(host/windows): NVENC D3D11 hardware encoder (--features nvenc)
android / android (push) Failing after 36s
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 55s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m35s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m32s
docker / deploy-docs (push) Successful in 17s
Zero-copy capture->encode on the GPU via the raw NVENC API (nvidia_video_codec_sdk sys + ENCODE_API; the safe wrapper is CUDA-only). Opens an NV_ENC_DEVICE_TYPE_DIRECTX session on the SAME ID3D11Device as the DXGI capturer (carried on the new FramePayload::D3d11), registers a pool of BGRA textures once, CopyResources each captured texture in and encode_picture; CBR/ULL, infinite GOP, P-only, forced-IDR for RFI. The DXGI capturer gains a D3D11 zero-copy output (selected, like the encoder, by PUNKTFUNK_ENCODER=nvenc) so capture+encode share textures. OFF by default (the nvenc feature pulls the NVENC SDK + cudarc): the default Windows host links without it (openh264 path). cudarc builds toolkit-less via the SDK ci-check feature (dynamic-loading). At link time --features nvenc needs nvencodeapi.lib (NVENC SDK, or an import lib generated from the driver's nvEncodeAPI64.dll) on PUNKTFUNK_NVENC_LIB_DIR. Both default and --features nvenc builds validated to compile+link GPU-less on the VM (import lib generated from the driver DLL). Runtime needs a real NVIDIA GPU. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |