02b1be652d30e80213aad4f8bca80de57c7c8c68
247 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
1f7b8eba66 |
feat(host/windows): auto-install a virtual mic device (Steam Streaming Microphone)
So Windows mic passthrough works without the user installing anything: when no virtual-mic
device is present, install Steam Remote Play's SteamStreamingMicrophone.inf (ships under
Steam\drivers\Windows10\{arch}\ next to the speakers INF Apollo uses) via DiInstallDriverW
loaded from newdev.dll — the same mechanism Apollo uses for Steam Streaming Speakers — then
re-find the device. Needs admin (the host runs as SYSTEM); best-effort and safe (no-op if
Steam absent / INF not found / PUNKTFUNK_NO_MIC_INSTALL), falling back to the manual-install
guidance (VB-Audio Cable) otherwise.
Not yet built/validated on the box (down); FFI cross-checked against windows-0.62. Whether
Steam ships SteamStreamingMicrophone.inf at that path is to be confirmed on the box.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
a7daed5797 |
feat(host/windows): client→host mic passthrough via a virtual audio device
The host received the client's mic uplink (0xCB Opus) but dropped it on Windows ("requires
Linux"). Windows has no user-mode way to CREATE a capture endpoint, so target an existing
virtual audio device and write the decoded mic PCM into its RENDER endpoint — the device's
CAPTURE endpoint then surfaces as a microphone host apps record from (the inverse of a
virtual cable). New audio::wasapi_mic::WasapiVirtualMic: finds the device by friendly-name
(Steam Streaming Microphone / VB-Audio CABLE Input / VoiceMeeter / "virtual", override with
PUNKTFUNK_MIC_DEVICE), opens a WASAPI shared event-driven RENDER client (48 kHz stereo f32,
autoconvert), and a dedicated COM thread writes a bounded (~80 ms drop-oldest) inject queue
with silence-fill. open_virtual_mic() gets a Windows arm; mic_service_thread (Opus decode →
push) now compiles for windows too (opus is already a windows dep). Clear error + install
guidance when no virtual device is present.
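A minimal sketch of the bounded drop-oldest queue with silence-fill described above. The type and method names (`InjectQueue`, `push`, `pop_into`) are hypothetical and not taken from the commit; the real `WasapiVirtualMic` also involves a COM thread and a WASAPI render client, which this sketch leaves out.

```rust
use std::collections::VecDeque;

/// Bounded PCM queue: drops the oldest samples on overflow and pads with
/// silence on underrun. Illustrative only; not the commit's actual type.
pub struct InjectQueue {
    samples: VecDeque<f32>,
    capacity: usize,
}

impl InjectQueue {
    /// `capacity` is in interleaved samples. About 80 ms of 48 kHz stereo
    /// is 48_000 * 2 * 80 / 1000 = 7_680 samples.
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 0, "capacity must be non-zero");
        Self { samples: VecDeque::with_capacity(capacity), capacity }
    }

    /// Push decoded PCM; returns how many old samples were dropped to make room.
    pub fn push(&mut self, pcm: &[f32]) -> usize {
        let mut dropped = 0;
        for &s in pcm {
            if self.samples.len() == self.capacity {
                self.samples.pop_front();
                dropped += 1;
            }
            self.samples.push_back(s);
        }
        dropped
    }

    /// Fill `out` from the queue; any shortfall is written as silence (0.0).
    /// Returns how many samples actually came from the queue.
    pub fn pop_into(&mut self, out: &mut [f32]) -> usize {
        let mut taken = 0;
        for slot in out.iter_mut() {
            match self.samples.pop_front() {
                Some(s) => {
                    *slot = s;
                    taken += 1;
                }
                None => *slot = 0.0,
            }
        }
        taken
    }
}
```

Dropping the oldest samples bounds the added latency when the network delivers audio in bursts, while the silence-fill keeps the render client fed when the uplink stalls.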
Linux/cross-platform side cargo-checks; the Windows path is built/validated when the box is
back (the wasapi render API was cross-checked against the docs + the existing capture path).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
3b3e8b4ba9 |
perf(host/windows): elevate capture/encode/send thread CPU priority (Apollo-parity)
Apollo runs its capture thread at CRITICAL and its encoder thread at ABOVE_NORMAL; we set none. Our GPU work is already HIGH priority, but the GPU scheduler can only favour commands we've SUBMITTED: a normal-priority thread descheduled by a CPU-heavy game submits the convert/encode late, so the HIGH GPU priority never bites (consistent with the measured "NVENC engine idle yet the encode waits ~15 ms").
Raise the WGC helper's capture+encode loop and the single-process capture+encode loop to THREAD_PRIORITY_HIGHEST, and the transmit thread to ABOVE_NORMAL, via a cross-platform boost_thread_priority() (Windows-only effect; the Linux host caps the game via gamescope so its threads aren't starved).
Not yet built/validated on the GPU box (it's down); the cross-platform side compiles (cargo check) and the Windows calls are cross-checked against the windows-0.62 API.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
9771aa8815 |
fix(host/windows): binary-search clamp NVENC bitrate to the codec-level max (not ×¾ step-down)
When a client requests a bitrate above the GPU's HEVC/AV1 level ceiling, NVENC rejects initialize_encoder. The old probe stepped the rate down by ×¾ each retry, undershooting the real ceiling badly (a 1 Gbps request landed ~300 Mbps even with the level cap near 800).
Replace it with a binary search over [floor, requested] that converges (±20 Mbps) on the HIGHEST rate NVENC accepts and clamps to that, so the stream uses the full codec-level bitrate.
Factored the session open/config/init into try_open_session() for the probe; split-encode rejection is disambiguated from a bitrate-cap rejection (retry once with split disabled) and the floor fallback also tries split-disabled.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
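The binary-search clamp can be sketched roughly as follows. This is an illustration under assumptions, not the commit's code: `clamp_bitrate` and the `accepts` closure are hypothetical stand-ins for a probe built on `try_open_session()`, rates are in Mbps, and acceptance is assumed to be monotonic (every rate below an accepted one is also accepted).

```rust
/// Find the highest rate in [floor, requested] that `accepts` returns true
/// for, to within `tolerance`. Returns None if even `floor` is rejected.
/// Hypothetical helper; `accepts` stands in for an encoder-open probe.
pub fn clamp_bitrate(
    floor: u32,
    requested: u32,
    tolerance: u32,
    mut accepts: impl FnMut(u32) -> bool,
) -> Option<u32> {
    let requested = requested.max(floor);
    // Fast path: the requested rate is fine as-is.
    if accepts(requested) {
        return Some(requested);
    }
    // Nothing in the range works if the floor itself is rejected.
    if !accepts(floor) {
        return None;
    }
    // Invariant: `lo` is accepted, `hi` is rejected.
    let (mut lo, mut hi) = (floor, requested);
    let tolerance = tolerance.max(1);
    while hi - lo > tolerance {
        let mid = lo + (hi - lo) / 2;
        if accepts(mid) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    Some(lo)
}
```

With a 20 Mbps tolerance, a 50 to 1000 Mbps range needs about six probes after the two boundary checks, and the result sits within the tolerance of the true ceiling instead of up to 25% below it as with a fixed ×¾ step.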
||
|
|
a4df75132a |
fix(host/windows): HEVC/AV1 HIGH tier so high client bitrates aren't quartered
NVENC defaulted to Main tier, whose per-level bitrate ceiling at 5K (HEVC Level 6.2 Main ≈ 240 Mbps) made initialize_encoder reject a high client bitrate; the existing probe-and-step-down then silently dropped a ~1 Gbps request by ×¾ to ~240-320 Mbps, causing visible color/motion compression on fast scenes.
Set HIGH tier (≈800 Mbps for HEVC, higher for AV1) + autoselect level so the requested bitrate goes through. `tier`/`level` are u32 (HIGH=1, AUTOSELECT=0) shared across the HEVC/AV1 union offset; the step-down remains as a safety net.
Not yet built/validated on-box (box offline).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
4cc57d5c39 |
perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game: the per-frame capture→encode path shared the 3D engine with the game and got scheduled behind it. Rework to minimize 3D-engine work per frame:
- VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input.
- GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1).
- WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast CopyResource to decouple its single-frame acquire/release from the async convert.
- Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split (cap_wait / encode / write).
Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+), latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV convert (no fixed-function unit on NVIDIA; the VideoProcessing engine reads 0%) waiting behind an uncapped game under WDDM context time-slicing; Linux avoids it via gamescope capping the game to the display refresh.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
15d3d423fa |
feat(decky): full-featured Gaming-Mode client — fullscreen page, pairing, focus-correct launch
The plugin was a QAM launcher whose stream never appeared, with no
pairing. Three fixes, plus a headless --pair mode on the GTK client:
- Stream actually starts (MoonDeck's proven mechanism): gamescope only
focuses the process tree Steam launched via reaper, so a flatpak
spawned from the (root) backend is invisible. The frontend now
registers ONE hidden non-Steam shortcut pointing at bin/punktfunkrun.sh,
passes the host as the shortcut's Steam launch options, and starts it
with SteamClient.Apps.RunGame — gamescope then fullscreen-focuses it.
The wrapper execs `flatpak run io.unom.Punktfunk --connect <host>`.
- Fullscreen page: routerHook.addRoute("/punktfunk") — host list,
per-host Pair/Stream, and a settings section (resolution/refresh/
bitrate/gamepad/mic, written to client-gtk-settings.json).
- Pairing: a gamepad-navigable PIN keypad. The host shows the PIN; the
backend runs the SPAKE2 ceremony headlessly via the client's new
`--pair <PIN> --connect host` CLI mode (app.rs), persisting the host
as paired so the stream then connects silently. Same flatpak =>
shared identity store, verified live (ceremony against a real host).
- Backend (main.py): discover / pair / runner_info / get_settings /
set_settings / kill_stream; uses DECKY_USER_HOME so paths resolve to
the deck user's flatpak install regardless of the plugin's root flag.
CI (decky.yml) and the sideload packager now ship bin/punktfunkrun.sh.
The Steam-shortcut launch and headless-pairing env follow MoonDeck
exactly but need a Deck in Gaming Mode to fully confirm.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
67608944f0 |
feat(client-linux): controller + keyboard shortcuts to exit fullscreen
On the Steam Deck there was no way out of fullscreen: no F11 key, and the header bar (with the fullscreen button) is hidden while fullscreen.
- Controller: a Moonlight-style escape chord (L1+R1+Start+Select) held together leaves fullscreen and releases input capture. The gamepad service latches the chord (fires once per press) and signals the stream page over an async channel. The chord is four simultaneous buttons that no game uses as a deliberate combo, so it can't trigger during play.
- Keyboard: F11 already toggled fullscreen (checked before input forwarding, so it works while captured); it is now surfaced.
- Discoverability: entering fullscreen flashes a 4s hint listing both exits (F11 · L1+R1+Start+Select).
The escape future is aborted on page-hidden so a stale session can't act on the shared window.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
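A rough sketch of the "latch the chord, fire once per press" behaviour. The bit assignments and the `ChordLatch` type are hypothetical, chosen for illustration; the real gamepad service's button mapping and async signalling are not shown in the commit message.

```rust
/// Button bits; these values are illustrative, not the client's real mapping.
pub const L1: u16 = 1 << 0;
pub const R1: u16 = 1 << 1;
pub const START: u16 = 1 << 2;
pub const SELECT: u16 = 1 << 3;
pub const ESCAPE_CHORD: u16 = L1 | R1 | START | SELECT;

/// Edge-triggered detector: reports the chord once when it becomes fully
/// held, then stays quiet until at least one chord button is released.
#[derive(Default)]
pub struct ChordLatch {
    held: bool,
}

impl ChordLatch {
    /// Feed the current button bitmask; returns true only on the poll where
    /// the chord transitions from "not all held" to "all held".
    pub fn update(&mut self, buttons: u16) -> bool {
        let down = buttons & ESCAPE_CHORD == ESCAPE_CHORD;
        let fired = down && !self.held;
        self.held = down;
        fired
    }
}
```

Calling `update` on every gamepad poll and sending one message per `true` result gives the once-per-press semantics, so holding the chord does not toggle fullscreen repeatedly.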
||
|
|
d5757980f8 |
style(host): rustfmt — align video_caps comment in m3 test call-sites
cargo fmt --all over the merged connect() call-sites (the video_caps/launch args landed without a fmt pass). Comment-alignment only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
41b289780f |
Merge remote-tracking branch 'origin/main'
|
||
|
|
64b167946f |
fix(client-linux): VAAPI green screen on AMD — flatten NV12 planes across DRM layers
First AMD test (Steam Deck, Mesa radeonsi) showed a mostly-green image with red whites, the classic fingerprint of NV12 chroma read as 0.
Root cause (confirmed against FFmpeg/GTK/mpv source): FFmpeg's VAAPI export uses VA_EXPORT_SURFACE_SEPARATE_LAYERS unconditionally, so an NV12 surface comes back as TWO single-plane layers, layers[0]=R8 (luma) and layers[1]=GR88 (chroma), sharing one object/fd, with the UV plane reached via offset. map_dmabuf took layers[0] only and used its format (R8) as the GTK fourcc, so GdkDmabufTexture got a luma-only texture with the chroma plane dropped → chroma defaults to 0 → green field, red highlights.
Fix (matches mpv's dmabuf_interop_gl flatten pattern):
- Derive the combined fourcc from the decoder's sw_format (AVHWFramesContext.sw_format → NV12 → DRM_FORMAT_NV12), NOT from the per-plane component formats. The frame format is absent from the separate-layer descriptor and must be deduced from sw_format.
- Flatten every plane across every layer in declared order (Y then UV), each with its own fd (objects[plane.object_index].fd), offset, pitch.
- One-time descriptor dump (objects/layers/formats/modifier) so a new driver's real layout is visible in the logs.
- Unit test locks the DRM FourCC magic numbers (NV12=0x3231564e).
Software decode (swscale, reads colorspace from the VUI) was always correct, which isolated the bug to this path. PUNKTFUNK_DECODER=software is the immediate workaround on an un-rebuilt binary. Awaiting Steam Deck reconfirm (no AMD VAAPI on the NVIDIA dev box to live-verify).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
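To make the flatten step concrete, here is a small sketch. The `fourcc` packing follows the little-endian convention of the Linux `drm_fourcc.h` header; the `Plane`, `Layer`, and `FlatPlane` structs and the `flatten` function are hypothetical simplifications of a DRM PRIME frame descriptor, not the client's actual `map_dmabuf` types.

```rust
/// Pack four ASCII bytes into a DRM FourCC (first character in the low byte).
pub const fn fourcc(a: u8, b: u8, c: u8, d: u8) -> u32 {
    (a as u32) | ((b as u32) << 8) | ((c as u32) << 16) | ((d as u32) << 24)
}

pub const DRM_FORMAT_NV12: u32 = fourcc(b'N', b'V', b'1', b'2');
pub const DRM_FORMAT_R8: u32 = fourcc(b'R', b'8', b' ', b' ');
pub const DRM_FORMAT_GR88: u32 = fourcc(b'G', b'R', b'8', b'8');

/// Simplified stand-ins for the descriptor's plane and layer entries.
pub struct Plane {
    pub object_index: usize,
    pub offset: u32,
    pub pitch: u32,
}

pub struct Layer {
    pub format: u32, // per-layer component format (R8, GR88), not the frame format
    pub planes: Vec<Plane>,
}

#[derive(Debug, PartialEq)]
pub struct FlatPlane {
    pub fd: i32,
    pub offset: u32,
    pub pitch: u32,
}

/// Walk every plane of every layer in declared order and resolve each plane's
/// fd through its object index. Returns None on an out-of-range index.
pub fn flatten(object_fds: &[i32], layers: &[Layer]) -> Option<Vec<FlatPlane>> {
    let mut out = Vec::new();
    for layer in layers {
        for p in &layer.planes {
            let fd = *object_fds.get(p.object_index)?;
            out.push(FlatPlane { fd, offset: p.offset, pitch: p.pitch });
        }
    }
    Some(out)
}
```

The flattened plane list is then paired with the frame-level fourcc (NV12, derived from the software format) rather than with either layer's `format` field.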
||
|
|
9537efdcd5 |
feat(client/windows): HDR10 (BT.2020 PQ) decode + present
Light up the dormant 10-bit/HDR path end to end on the Windows client.
- core: NativeClient::connect gains a video_caps param threaded into the Hello. The Windows client advertises VIDEO_CAP_10BIT | VIDEO_CAP_HDR; every other caller (the C ABI shim, Linux, Android, host test connects) passes 0, so the 8-bit BT.709 path is unchanged. The host already gates a Main10/PQ encode on these bits + PUNKTFUNK_10BIT.
- video.rs: a PQ frame (color_trc == SMPTE2084) converts 10-bit YUV → X2BGR10 (== DXGI R10G10B10A2) with the BT.2020 matrix via sws_setColorspaceDetails; swscale applies only the matrix + range, so the PQ-encoded samples pass through untouched.
- present.rs: on an HDR frame the swapchain flips in place (ResizeBuffers) to R10G10B10A2 + DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020 + HDR10 metadata; the passthrough shader is unchanged and the compositor maps PQ→display. Switched to ALPHA_MODE_IGNORE so the 10-bit padding bits don't render transparent. SDR stays 8-bit B8G8R8A8.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
5cbd249d09 |
fix(client/windows): first on-glass pass — component routing, pointer lock, stats HUD
The first real run on a display surfaced three issues the headless/dev-VM build never hit:
- Route each hook-using screen (hosts/pair/stream) as its own component() instead of calling it with the shared cx. Calling hooks on the parent cx changed the hook order when the screen flipped, tripping reactor's Rules-of-Hooks guard and aborting the moment you navigated to the stream page.
- Mouse: replace the absolute path (which swallowed WM_MOUSEMOVE and so froze the OS cursor, snapping the host pointer back to one point) with proper pointer lock: hide + ClipCursor + recentre, shipping relative MouseMove scaled by the Contain-fit factor. Ctrl+Alt+Shift+Q now actually toggles capture: track modifier state from the hook's own event stream (GetAsyncKeyState doesn't see keys we suppress in our own LL hook), and flush held keys/buttons on release so nothing sticks on the host.
- Add the stats HUD overlay (mode · fps · Mb/s · capture→client/decode latency), mirroring the Apple client. Stats live in root state and reach the stream page as a prop (a child's own async-state update is pruned when props are unchanged), fed by a small poll thread.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
ad0cb1b582 |
feat(host/windows): capture the secure desktop in HDR via DDA (no SDR drop)
The secure-desktop DDA leg went black with HDR on: legacy DuplicateOutput (the SDR-era API) can't capture an FP16/HDR desktop, and dropping the SudoVDA out of HDR is denied on the Winlogon desktop (so the SDR-drop attempt just churned and stayed black).
Instead capture HDR natively on the DDA path: the capturer already has the full FP16→BT.2020 PQ→R10G10B10A2 conversion (hdr_fp16 path), it just never requested FP16. Thread a want_hdr flag into duplicate_output: for an HDR session request DuplicateOutput1 with FP16 first and retry it (5×) instead of bailing to the HDR-incapable legacy fallback. The secure-desktop mux now reads the monitor's real HDR state and opens DDA in HDR when set, with no advanced-color toggling at all.
The normal-desktop DDA overlay/flip issues that pushed us to WGC don't apply to the composed Winlogon UI. want_hdr is threaded through every (re)duplication incl. ACCESS_LOST recovery.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
69765bad93 |
fix(host/windows): drop the SudoVDA to SDR for the secure DDA leg, verified
Keep HDR OFF for the DDA (secure-desktop) path rather than bailing to WGC: the DDA capturer is SDR-only (BGRA8), so an HDR SudoVDA makes the Winlogon capture black.
On the secure transition, drop the monitor out of HDR and VERIFY it took (re-read advanced_color_enabled, retry up to 6×200ms) before opening DDA, because the CCD toggle can transiently fail (rc=5) or lag. Restore HDR on return to the WGC normal-desktop leg. Logs clearly if the drop can't be applied (e.g. denied on the Winlogon desktop).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
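The "apply, then verify, retrying up to 6×200ms" pattern might look like the sketch below. `apply_and_verify` and its two closures are hypothetical: `apply` stands in for the advanced-color toggle and `verify` for re-reading the monitor state; neither is the commit's actual function.

```rust
use std::{thread, time::Duration};

/// Run `apply`, then check `verify`; repeat up to `attempts` times with
/// `delay` between tries. Returns true as soon as a verification succeeds.
/// Hypothetical helper for illustration only.
pub fn apply_and_verify(
    attempts: u32,
    delay: Duration,
    mut apply: impl FnMut(),
    mut verify: impl FnMut() -> bool,
) -> bool {
    for attempt in 0..attempts {
        apply();
        if verify() {
            return true;
        }
        // No point sleeping after the final failed attempt.
        if attempt + 1 < attempts {
            thread::sleep(delay);
        }
    }
    false
}
```

Verifying by reading the state back, rather than trusting the toggle's return code, covers both failure modes the message mentions: a call that fails transiently and a call that succeeds but takes effect late.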
||
|
|
af6787c0bd |
fix(host/windows): honor the SudoVDA's real HDR state (stop wiping the user's HDR toggle)
HDR streamed nothing and "didn't persist" because build() forced the SudoVDA's advanced-color state to match the handshake bit_depth on every build. With an 8-bit-negotiated session (the common case: clients advertise no 10-bit cap) that meant set_advanced_color(false) on every connect, wiping a user's deliberate Windows HDR toggle on the virtual display.
But the whole pipeline already follows the monitor's REAL HDR state: WGC captures FP16 when HDR is on, NVENC forces Main10 + BT.2020 PQ from the 10-bit capture format regardless of the negotiated depth (encode/nvenc.rs), and the client auto-detects PQ from the HEVC VUI. So the negotiated bit_depth must NOT drive the monitor's colorspace.
- build(): only ever ENABLE HDR (proactively, for a negotiated 10-bit session); never force it off. A user-enabled HDR session now persists and flows end-to-end.
- secure-desktop mux: gate the HDR→SDR drop (for the DDA leg) on the monitor's ACTUAL advanced-color state at switch time, not bit_depth, so an HDR session with an 8-bit handshake still drops correctly for Winlogon and restores after.
- sudovda: add advanced_color_enabled() reader (DISPLAYCONFIG_GET_ADVANCED_COLOR_INFO).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
0ce2e37faf |
refactor(host/windows): clean up DDA path + add a proper Windows service
Final cleanup after the DDA-parity work, plus an end-user service to replace the PsExec/VBS/scheduled-task launch chain.
Cleanup (behavior-preserving):
- sudovda.rs: drop the dead legacy GDI isolate_displays/restore_displays (CCD is the sole isolation path), the always-empty Monitor.isolated field, and the vestigial reassert_isolation + PUNKTFUNK_ISOLATE_DISPLAYS knob; fix stale comments.
- dxgi.rs: downgrade leftover debug warns/infos (DuplicateOutput1 retry, FALLBACKS, hook-hits, AcquireNextFrame idle timeout) to debug!; remove the PUNKTFUNK_NO_CURSOR per-frame test knob.
Windows service (src/service.rs, `punktfunk-host service`):
- SCM supervisor (windows-service crate) that duplicates its LocalSystem token, retargets it to the active console session, and CreateProcessAsUserW's the host there (Sunshine/Apollo model), relaunching on exit and console session switch, inside a kill-on-close job object so a service crash never orphans the host.
- install/uninstall/start/stop/status subcommands: one elevated `service install` registers an auto-start LocalSystem service + firewall rules + a default host.env.
- Config moves to %ProgramData%\punktfunk\host.env; config_dir() now resolves to %ProgramData%\punktfunk on Windows (replacing the APPDATA=C:\Users\Public hack), with a PUNKTFUNK_CONFIG_DIR override. Logs land in %ProgramData%\punktfunk\logs\.
- merged_env_block (shared with the WGC helper) now also carries RUST_LOG.
- docs/windows-service.md + scripts/windows/host.env.example; windows-host.md updated.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
6d611cf889 |
feat(host/windows): reference-counted SudoVDA monitor lifecycle (reuse on quick reconnect, teardown when idle)
User: tearing down + recreating the monitor per session is wrong both ways — a
fixed GUID collides on overlapping sessions, but a per-session GUID makes a new
screen on every reconnect; host-lifetime would leave a phantom display for
physical-screen users. Correct model = rock-solid state machine.
Replace the per-session create/REMOVE with a host-level reference-counted
manager (global MGR):
- States: Idle / Active{refs} / Lingering{until}.
- Connect (acquire): Idle→create; Lingering→reuse (cancel teardown, reconfigure
if the mode changed) — the quick-reconnect reuse, no new screen/PnP chime;
Active→refs++ (concurrent / Reconfigure-overlap), reconfigure on a mode change.
- Disconnect (release, via the MonitorLease keepalive Drop): refs-- ; at 0 →
Lingering(now + PUNKTFUNK_MONITOR_LINGER_MS, default 10s).
- Background timer: Lingering past its deadline → REMOVE the monitor → Idle, so a
physical screen returns ~10s after streaming stops.
Eliminates BOTH the cross-session REMOVE collision (teardown only at refs==0 +
expired grace) and the new-screen-on-reconnect, without a persistent phantom
display. The control-device handle is opened once (host-level) — a handle, not a
screen. SudoVdaDisplay is now a marker; the old create() body is create_monitor.
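The Idle / Active{refs} / Lingering{until} transitions above can be sketched as a small state machine. This is a simplified model under assumptions: `MonitorManager`, `Action`, and the method names are hypothetical, time is passed in explicitly so the logic is testable, and the real manager's locking, mode reconfiguration, and IOCTLs are omitted.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum MonitorState {
    Idle,
    Active { refs: u32 },
    Lingering { until: Instant },
}

/// What the caller should do to the virtual monitor after a transition.
#[derive(Debug, PartialEq)]
pub enum Action {
    Create,      // Idle -> Active: add the monitor
    Reuse,       // Lingering -> Active: cancel teardown, keep the monitor
    Share,       // Active -> Active: another concurrent session
    StartLinger, // last session left: begin the grace period
    Remove,      // grace period expired: remove the monitor
    Nothing,
}

pub struct MonitorManager {
    state: MonitorState,
    linger: Duration,
}

impl MonitorManager {
    pub fn new(linger: Duration) -> Self {
        Self { state: MonitorState::Idle, linger }
    }

    pub fn state(&self) -> MonitorState {
        self.state
    }

    /// A session connects.
    pub fn acquire(&mut self) -> Action {
        let (action, refs) = match self.state {
            MonitorState::Idle => (Action::Create, 1),
            MonitorState::Lingering { .. } => (Action::Reuse, 1),
            MonitorState::Active { refs } => (Action::Share, refs + 1),
        };
        self.state = MonitorState::Active { refs };
        action
    }

    /// A session disconnects at time `now`.
    pub fn release(&mut self, now: Instant) -> Action {
        match self.state {
            MonitorState::Active { refs } if refs > 1 => {
                self.state = MonitorState::Active { refs: refs - 1 };
                Action::Nothing
            }
            MonitorState::Active { .. } => {
                self.state = MonitorState::Lingering { until: now + self.linger };
                Action::StartLinger
            }
            _ => Action::Nothing,
        }
    }

    /// Background timer tick: tear down only once the grace period has passed.
    pub fn tick(&mut self, now: Instant) -> Action {
        match self.state {
            MonitorState::Lingering { until } if now >= until => {
                self.state = MonitorState::Idle;
                Action::Remove
            }
            _ => Action::Nothing,
        }
    }
}
```

Because removal only happens from `Lingering` after the deadline, a reconnect inside the grace window returns `Reuse` and never reaches the teardown path.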
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
ca375c7ce8 |
fix(host/windows): WGC mux — reuse the SudoVDA monitor + helper across secure switches (no teardown/recreate)
User: re-adding WGC brought back the teardown/recreate bug (audible disconnect/connect on the secure<->normal switch).
Cause: the secure->normal switch called build() = vd.create() = IOCTL_REMOVE old SudoVDA monitor + IOCTL_ADD new one + respawn the helper, the same teardown/recreate kernel stress we just eliminated from DDA, now on the mux path.
Apply the same learning (reuse, don't tear down): the SudoVDA monitor and WGC helper persist for the whole session; only the host-DDA leg opens (on secure) and closes (on normal). On returning to normal, RESUME the still-alive helper (drain its secure-dwell backlog + request a keyframe) instead of rebuilding. The HDR-session colorspace restore (set_advanced_color(true) + helper rebuild) is kept ONLY for bit_depth>=10; an SDR session never changed the colorspace, so it needs no rebuild at all. The secure switch already reuses the monitor (open_dda on the existing target).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
e8d885fb4f |
fix(host/windows): WGC relay — set SudoVDA color to match session bit depth at build (kill persisted HDR)
Re-test still broken: the WGC helper captured HDR FP16 BT.2020 PQ from the FIRST frame (before any switch), feeding the 8-bit SDR encoder → broken normal-desktop image. Root cause: the SudoVDA's advanced-color (HDR) state PERSISTS on the monitor across sessions, so the 8-bit session inherited HDR left enabled by the earlier broken toggle — and gating the per-switch toggles can't undo a state that's already on at start. Fix: in build() (runs on initial create + every mode-switch/return-from-secure rebuild), force set_advanced_color(target, bit_depth>=10) BEFORE spawning the WGC helper, with a 250ms settle if it changed. An 8-bit session now always captures SDR via WGC (matching the encoder); 10-bit keeps HDR. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d2e536d299 |
fix(host/windows): WGC relay — don't force HDR on SDR sessions across the secure mux
Re-enabling the WGC relay brought back a broken image on the secure->normal switch. Log root cause: on returning to the normal desktop the relay called set_advanced_color(target, true) to 'restore HDR', so the rebuilt WGC helper captured HDR FP16 BT.2020 PQ while the session encoder is 8-bit SDR -> format mismatch (the 'HDR gets restored when flipping back to WGC' bug). Gate BOTH set_advanced_color toggles on bit_depth>=10. An SDR (8-bit) session now stays SDR across WGC<->DDA switches (no HDR force, no needless topology change); HDR sessions keep the drop-on-secure / restore-on-normal behavior. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
f469dfcc76 |
chore(host/windows): clean up DDA capture — fix unused imports, quiet secure-desktop log, sane retry default
- Remove 4 unused imports (PCWSTR in composed_flip, anyhow macro + SizeInt32 in wgc, Write in wgc_relay).
- DuplicateOutput1 retry defaults to N=1 (immediate legacy): on the secure desktop DuplicateOutput1 is LOGON_UI-only so it always refuses, and the release-before-reduplicate + gentle recovery keep the legacy dup stable; retrying there only blocked. Still env-tunable (PUNKTFUNK_DUP_RETRY_N/_MS).
- Throttle the 'using legacy DuplicateOutput' warning (expected + once-per-gentle-recovery on secure) so a lock dwell doesn't flood the log.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
dc734c711b |
fix(host/windows): re-sync thread desktop on EVERY recovery (symmetric enter/leave secure)
User's observation: entering UAC/lock works instantly, but clicking OUT of it breaks (with the disconnect sound) — Apollo's enter and leave are symmetric. Root cause: attach_input_desktop() (SetThreadDesktop to the current input desktop) was gated behind is_secure_desktop() in recreate_dupl, so: - Default->Winlogon (enter): is_secure==true -> re-attach to Winlogon -> works. - Winlogon->Default (leave): is_secure==false -> SKIP re-attach -> the capture thread stays stuck on the now-gone Winlogon desktop -> every rebuild fails -> no frames -> client timeout -> session ends -> SudoVDA removed (the disconnect sound). Fix: call attach_input_desktop() UNCONDITIONALLY on every rebuild (Apollo calls syncThreadDesktop before every duplicate), so leaving secure re-attaches to the returned desktop. reassert_isolation stays secure-only. Also stop leaking the HDESK (CloseDesktop right after SetThreadDesktop, like Apollo) so calling it on every recovery is safe. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
9a9214a2d8 |
fix(host/windows): gentle DDA recovery — stop the tight teardown/recreate loop
Per the user's insight: on the secure (Winlogon) desktop the duplication dies on
every independent-flip, and our tight recovery loop tore it down + recreated it
hundreds of times/sec — that release/recreate cycle is the real kernel stress,
and it stalled the send thread long enough that the client timed out ('display
disconnected'). Normal-desktop streaming is already solid (per-session GUID
killed the collision); this only changes the loss-recovery cadence.
Gentle recovery (user chose 'keep session alive'):
- cap the cheap re-duplicate to PUNKTFUNK_RECOVER_MS (default 250ms, was 5ms)
- cap the heavy new-device rebuild to PUNKTFUNK_REBUILD_MS (default 1500ms, was
250ms) — it's the costliest teardown, throttled hardest
- repeat the last frame between attempts (no busy-spin, no 8ms sleep)
~200/s -> ~4/s teardown/recreate during a secure dwell. The session survives
lock/UAC (frozen/laggy secure screen, then clean resume on unlock) instead of
churning the kernel into a disconnect. Both cadences env-tunable.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
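The recovery cadence above amounts to a minimum-interval throttle with an environment-tunable period. The sketch below shows that idea only; `env_ms` and `Throttle` are hypothetical names, and the real recovery loop is not reproduced here.

```rust
use std::time::{Duration, Instant};

/// Reads a millisecond tunable from the environment, falling back to a default.
fn env_ms(name: &str, default_ms: u64) -> Duration {
    let ms = std::env::var(name)
        .ok()
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(default_ms);
    Duration::from_millis(ms)
}

/// Allows an expensive recovery step at most once per `min_interval`.
struct Throttle {
    min_interval: Duration,
    last: Option<Instant>,
}

impl Throttle {
    fn new(min_interval: Duration) -> Self {
        Self { min_interval, last: None }
    }

    /// Returns true if the step may run now; otherwise the caller would
    /// repeat the last frame and try again later.
    fn ready(&mut self, now: Instant) -> bool {
        match self.last {
            Some(prev) if now.duration_since(prev) < self.min_interval => false,
            _ => {
                self.last = Some(now);
                true
            }
        }
    }
}
```

With one throttle built from `env_ms("PUNKTFUNK_RECOVER_MS", 250)` and another from `env_ms("PUNKTFUNK_REBUILD_MS", 1500)`, the cheap re-duplicate runs at most about four times per second, which matches the ~4/s figure quoted in the message.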
|
||
|
|
2f7c021cac |
fix(host/windows): per-session SudoVDA monitor GUID (stop overlapping-session monitor teardown)
User observed: 'display disconnected' + freeze with NO context change, and 'first switch happy, subsequent slower, then chaos under stress'. Log shows the cause: MONITOR_GUID was a FIXED constant, so overlapping sessions (a client RECONNECTING after a freeze before the old session tore down, or concurrent sessions) all map to the SAME SudoVDA monitor (same GUID -> IOCTL_ADD reuses target 257). When the old session ends, its IOCTL_REMOVE tears the monitor down OUT FROM UNDER the live session -> 'display disconnected' + the late E_INVALIDARG/MODE_CHANGE failures (output vanished mid-session) -> cascade. Fix: next_monitor_guid() returns a unique GUID per (process, session) [base GUID with low 48-bit node = pid<<16 | session#]; create() threads it into AddParams AND the keepalive (which REMOVEs by it). Each session now owns its own monitor; one ending can't kill another. (The 200ms DuplicateOutput1 retry confirmed working — 'succeeded on retry' logged; the residual failures were this collision, not the race.) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
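The node packing described above (low 48-bit node = pid<<16 | session#) can be illustrated as follows. This is a sketch under assumptions: the session counter is taken to fit in 16 bits, the GUID is treated as a plain 16-byte array whose last six bytes are the node, and both function names are hypothetical rather than the body of next_monitor_guid().

```rust
/// Packs (pid, session counter) into a 48-bit node value: pid << 16 | session#.
/// Assumes the session counter fits in 16 bits.
fn monitor_node(pid: u32, session: u16) -> u64 {
    ((pid as u64) << 16) | session as u64
}

/// Replaces the last 6 bytes (the node field) of a 16-byte GUID with `node`,
/// leaving the first 10 bytes of the base GUID untouched.
fn with_node(base: [u8; 16], node: u64) -> [u8; 16] {
    let mut guid = base;
    // The low 48 bits of `node` are the last 6 bytes of its big-endian form.
    guid[10..16].copy_from_slice(&node.to_be_bytes()[2..8]);
    guid
}
```

Two sessions in the same process differ in the session bits and two processes differ in the pid bits, so no two live sessions share a GUID as long as the counter does not wrap.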
||
|
|
ce84861e3a |
fix(host/windows): DuplicateOutput1 retry wait 200ms (Apollo's value), env-tunable
The old-dup kernel teardown takes ~200ms (Apollo waits exactly that), so the previous 2-16ms retries were too short and still fell through to the churning legacy dup. Bump to PUNKTFUNK_DUP_RETRY_MS (default 200) x PUNKTFUNK_DUP_RETRY_N (default 6) so the robust DuplicateOutput1 dup wins the race. Env-tunable for on-box dialing without a rebuild. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
eb451d8bc6 |
fix(host/windows): retry DuplicateOutput1 to ride out the old-dup teardown race
User's insight, and it fits the evidence exactly: in duplicate_output the FIRST
DuplicateOutput1 (called microseconds after the caller releases the old
duplication via self.dupl=None) returns E_ACCESSDENIED, but the legacy
DuplicateOutput a beat later SUCCEEDS — the only difference is TIMING. The
kernel-side teardown of the just-released duplication is async, so the immediate
DuplicateOutput1 races it ('output still duplicated' -> E_ACCESSDENIED). We then
fell straight through to legacy DuplicateOutput, which 'succeeds' into a FRAGILE
dup that churns ACCESS_LOST/MODE_CHANGE every few ms on this cross-GPU IDD
(causing the post-login freeze + UAC-confirm drop).
Fix: retry DuplicateOutput1 up to 5x with escalating 2/4/8/16 ms waits before
falling back to legacy, so the teardown finishes and the ROBUST DuplicateOutput1
dup succeeds (no churn). Bounded (~30 ms worst case) so a genuine failure still
falls back quickly. This is exactly Apollo's 2x/200ms retry rationale.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
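The bounded retry described here (five attempts with 2/4/8/16 ms waits in between, then the legacy fallback) can be sketched generically. `retry_then_fallback` is a hypothetical helper; the closures stand in for DuplicateOutput1 and legacy DuplicateOutput.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Tries `primary` up to `waits_ms.len() + 1` times, sleeping between attempts,
/// then falls back. Returns the value and whether the primary path produced it.
fn retry_then_fallback<T, E>(
    waits_ms: &[u64],
    mut primary: impl FnMut() -> Result<T, E>,
    fallback: impl FnOnce() -> Result<T, E>,
) -> Result<(T, bool), E> {
    for attempt in 0..=waits_ms.len() {
        if let Ok(v) = primary() {
            return Ok((v, true));
        }
        // Sleep only between attempts, not after the last one.
        if let Some(&ms) = waits_ms.get(attempt) {
            sleep(Duration::from_millis(ms));
        }
    }
    fallback().map(|v| (v, false))
}
```

Called with `&[2, 4, 8, 16]`, the total sleep before falling back is 2 + 4 + 8 + 16 = 30 ms, which is the worst case quoted in the message.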
|
||
|
|
1e1e5ce9b5 |
fix(host/windows): Option-handle the multi-line dupl.GetFramePointerShape call too
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
da43b5e8d3 |
fix(host/windows): release the old duplication before re-duplicating (THE born-lost bug)
DuplicateOutput1 returned E_ACCESSDENIED ~8815x even with PER_MONITOR_AWARE_V2 confirmed on the capture thread (thread_is_v2=true) — so DPI was NOT the cause. The real cause: DXGI permits only ONE IDXGIOutputDuplication per output, and on ACCESS_LOST you MUST release the old one before re-duplicating. Our recovery (try_reduplicate / recreate_dupl) created the NEW duplication while the OLD self.dupl was still alive → the output stayed held → DuplicateOutput1 E_ACCESSDENIED and the legacy fallback returned a BORN-LOST dup. It never converged because there was always exactly one stale dup alive at creation time. The initial open() works precisely because there's no prior dup; Apollo is clean because it releases (dup.reset()) before every re-DuplicateOutput. Fix: make self.dupl an Option and set it to None (drop → release the output) BEFORE duplicate_output in try_reduplicate and before reopen_duplication in recreate_dupl, then Some(new). acquire() gets a None-guard that synthesizes ACCESS_LOST (routes into recovery) so a transient None can't panic. All ReleaseFrame/AcquireNextFrame sites updated for the Option. This is the documented DDA recovery requirement and the one thing that distinguished our failing DuplicateOutput1 from Apollo's working one. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
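The release-before-reduplicate rule can be demonstrated without DXGI by modelling an output that admits one live duplication at a time. Everything in this sketch (`Output`, `Dup`, `Capturer`) is a stand-in invented for illustration; only the `Option` drop-then-create ordering mirrors what the commit describes.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Stand-in for an output that permits only one live duplication at a time.
#[derive(Default)]
struct Output {
    held: Rc<Cell<bool>>,
}

/// Stand-in for the duplication object; dropping it releases the output.
struct Dup {
    held: Rc<Cell<bool>>,
}

impl Drop for Dup {
    fn drop(&mut self) {
        self.held.set(false);
    }
}

impl Output {
    fn duplicate(&self) -> Result<Dup, &'static str> {
        // `replace` returns the previous flag: true means someone still holds it.
        if self.held.replace(true) {
            return Err("E_ACCESSDENIED: output still duplicated");
        }
        Ok(Dup { held: Rc::clone(&self.held) })
    }
}

struct Capturer {
    output: Output,
    dupl: Option<Dup>,
}

impl Capturer {
    /// Recovery: release the old duplication first, then create the new one.
    fn try_reduplicate(&mut self) -> Result<(), &'static str> {
        self.dupl = None; // drop the old duplication, releasing the output
        self.dupl = Some(self.output.duplicate()?);
        Ok(())
    }
}
```

Creating the new duplication while the old one is still alive fails in this model for the same reason the message gives: the output is still held at creation time.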
||
|
|
c8fb4822a2 |
fix(host/windows): per-thread Per-Monitor-V2 DPI awareness so DuplicateOutput1 succeeds
The remaining born-lost ACCESS_LOST storm traces to ONE thing: our
IDXGIOutput5::DuplicateOutput1 returns E_ACCESSDENIED (0x80070005) ~4370x, so
we fall back to legacy DuplicateOutput, which yields a BORN-LOST duplication on
this hybrid box. Apollo's DuplicateOutput1 SUCCEEDS on the identical
desktop/output/4090-device → a working dup, clean capture.
Root cause: DuplicateOutput1 REQUIRES Per-Monitor-Aware-V2. At startup our
SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2) FAILS with E_ACCESSDENIED
('already set' — a manifest/runtime locked the process to a lower awareness),
and GetAwarenessFromDpiAwarenessContext reports 2 for BOTH Per-Monitor V1 and
V2, so the earlier 'awareness=2' was misleading — the process is likely V1,
which DuplicateOutput1 rejects with E_ACCESSDENIED. (Legacy DuplicateOutput has
no V2 requirement, so it 'worked' but born-lost.)
Fix: SetThreadDpiAwarenessContext(PER_MONITOR_AWARE_V2) on the capture thread
in open() — a per-thread override that takes regardless of the process default,
so DuplicateOutput1 can succeed (the working dup Apollo gets). Logs set_ok +
thread_is_v2 (via AreDpiAwarenessContextsEqual) to confirm V2 actually applied.
Topology fixes (sole display, no MODE_CHANGE) and the recovery backstops stay.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
c60a05dbe9 |
fix(host/windows): make SudoVDA the sole display via clean CCD (the IDD needs to be primary/composited)
Live result of the previous build: the MODE_CHANGE_IN_PROGRESS storm was FIXED
(0 occurrences) by dropping primary-promotion — but it exposed the regression
the review predicted: a non-primary EXTENDED SudoVDA is NOT DWM-composited on
this box, so DDA gets born-lost ACCESS_LOST (0x887a0026) + black frames. The
IDD genuinely must be the sole/primary/composited display here.
Apollo reaches that end state ('Virtual Desktop: 5120x1440', sole display) via
Windows AUTO-promoting the real WDDM display over the box's leftover 1024x768
basic display — but Windows does NOT auto-promote for us, leaving the IDD
extended. So make it sole explicitly, the clean way:
- create(): deactivate the other display(s) via the atomic CCD path
(isolate_displays_ccd) by DEFAULT (opt out with PUNKTFUNK_NO_ISOLATE). Drop
the legacy per-device GDI detach from the path (it misses iGPU-attached
monitors and churns; kept #[allow(dead_code)] for reference).
- set_active_mode(): CDS_UPDATEREGISTRY only — set the mode in place, NO
CDS_SET_PRIMARY / CDS_GLOBAL / DM_POSITION. A sole display is already primary,
so there's nothing to contest → no MODE_CHANGE storm (that storm came from
promoting primary at (0,0) WHILE the basic display was still active).
Net: sole SudoVDA → primary → composited → capturable, with no topology
contest. Keeps the prior MODE_CHANGE-as-transient handling + removed born-lost
escape as backstops.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
|
||
|
|
769fd96b87 |
fix(host/windows): stop SudoVDA MODE_CHANGE_IN_PROGRESS storm — don't force IDD primary by default
ROOT CAUSE (verified by multi-agent compare vs Apollo + adversarial review): set_active_mode() applied the SudoVDA mode with CDS_UPDATEREGISTRY | CDS_GLOBAL | CDS_SET_PRIMARY + DM_POSITION(0,0) — promoting the freshly-added IDD to PRIMARY at the virtual-screen origin and persisting it globally. On this box (baseline active display = a 1024x768 basic 'WinDisc') that primary-promotion contests the existing display so the desktop topology never reaches a stable fixed point → every DuplicateOutput/AcquireNextFrame during the unending settle returns DXGI_ERROR_MODE_CHANGE_IN_PROGRESS (0x887A0025). Apollo, live on this EXACT box with an empty config, never promotes primary and captures the same SudoVDA at 5120x1440 with zero DXGI errors. (Ruled out earlier on the live box: win32u hook, DPI, independent-flip/overlay, isolation, render pin.)
Fixes (subtractive, gated per adversarial review):
- sudovda.rs set_active_mode: default to CDS_UPDATEREGISTRY only (no primary promotion, no GLOBAL, no DM_POSITION) = Apollo-parity for the multi-display default. Promote to primary (CDS_GLOBAL|CDS_SET_PRIMARY+DM_POSITION) ONLY when PUNKTFUNK_ISOLATE_DISPLAYS=1 (sole display, where a blank extended IDD would otherwise yield no frames). Avoids regressing headless/isolated + mid-stream Reconfigure.
- dxgi.rs acquire: treat MODE_CHANGE_IN_PROGRESS (0x887A0025) as a TRANSIENT (Ok(None), repeat last frame, wait it out) instead of falling through to the fatal Err arm → cold-rebuild → create()→set_active_mode (which re-issued the mode change and amplified the storm).
- dxgi.rs acquire: remove the born-lost cold-rebuild escape — it re-created the SudoVDA (IOCTL REMOVE/ADD = the audible PnP chime the user heard) and never converged; now repeat last frame in-process (never tear the IDD down mid-session, like Apollo). Overlay + cheap-spin/HDR recovery left intact.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
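The transient-versus-fatal split in the acquire path can be sketched as a plain classification over the two HRESULT values quoted in these commits. The `Next` enum and `classify` function are hypothetical; routing ACCESS_LOST to an in-place re-duplicate is taken from the neighbouring commits, not from this one.

```rust
/// DXGI HRESULTs named in the commit messages.
const DXGI_ERROR_MODE_CHANGE_IN_PROGRESS: u32 = 0x887A_0025;
const DXGI_ERROR_ACCESS_LOST: u32 = 0x887A_0026;

/// What the acquire loop does next (hypothetical enum, for illustration).
#[derive(Debug, PartialEq)]
enum Next {
    /// Transient: report no new frame and repeat the last one.
    RepeatLastFrame,
    /// The duplication is gone: release it and re-duplicate in place.
    Reduplicate,
    /// Anything else is surfaced to the caller.
    Fatal(u32),
}

fn classify(hr: u32) -> Next {
    match hr {
        DXGI_ERROR_MODE_CHANGE_IN_PROGRESS => Next::RepeatLastFrame,
        DXGI_ERROR_ACCESS_LOST => Next::Reduplicate,
        other => Next::Fatal(other),
    }
}
```

Keeping MODE_CHANGE_IN_PROGRESS out of the fatal arm is the point: a fatal result would trigger the cold rebuild that re-issues the mode change.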
||
|
|
900089c44c |
fix(host/windows): don't pin SudoVDA render adapter by default (Apollo parity)
GROUND TRUTH from Apollo streaming live on this exact box (empty config): captures the SudoVDA at 5120x1440@240 on the RTX 4090 with ZERO ACCESS_LOST / born-lost / MODE_CHANGE -- clean, no overlay, no isolation, no render pin. That disproves the independent-flip theory (a sole SudoVDA captures fine here) and points at something WE do that Apollo doesn't. The concrete culprit: we call SET_RENDER_ADAPTER, which this driver IGNORES (logs 'render adapter DIFFERS from pinned add=0x23664 pinned=0x15768') and the IDD ends up rendering on adapter 0x23664 while its DXGI output is enumerated under the 4090 (0x15768) where we create the capture device -- a cross-GPU mismatch that is the real source of the perpetual ACCESS_LOST + MODE_CHANGE_IN_PROGRESS (0x887A0025) storm. Apollo never pins (empty config), so its IDD stays on its natural adapter, aligned with capture. Make the render pin OPT-IN (PUNKTFUNK_RENDER_ADAPTER=<name>); default to NOT pinning, matching Apollo. The startup log now shows the resulting AddOut LUID so we can confirm the IDD lands on the 4090. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
cd72164db2 |
fix(host/windows): keep multi-display (Apollo parity) instead of sole-display isolation
CONFIRMED on the live RTX4090+iGPU box: hook fires+verified, DPI=2, overlay running, yet the stream STILL freezes -- born-lost dropped but MODE_CHANGE_IN_PROGRESS (0x887A0025) churn took over (2284x) and frames go stale. Root cause is the topology itself: create() makes SudoVDA the SOLE active display (CDS_SET_PRIMARY + isolate_displays + isolate_displays_ccd), and a sole display on a hybrid box goes into fullscreen independent-flip / MPO that Desktop Duplication cannot capture. Apollo is rock solid on this EXACT box because it does the opposite: it keeps the physical monitor ACTIVE and arranges the virtual display alongside it (rearrangeVirtualDisplayForLowerRight, 'Do not change the primary'). Multi-display is DWM-composited, so the output never independent-flips. Make isolation OPT-IN (PUNKTFUNK_ISOLATE_DISPLAYS=1) and default to NOT isolating -- match Apollo's multi-display topology. SudoVDA stays primary (so it carries the shell -> frames) but other monitors stay active, which disables independent-flip. reassert_isolation honors the same flag (re-isolating mid-stream would itself trigger the storm). Keeps the overlay + born-lost escape as belt-and-suspenders. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
5f84c5785c |
fix(host/windows): force-composed-flip overlay in the single-process DDA path
CONFIRMED root cause via instrumented build: hook_hits=1+ (win32u hook fires, verified-patched) and DPI awareness=2 (PER_MONITOR), yet the born-lost ACCESS_LOST storm persists with 100% DuplicateOutput1 E_ACCESSDENIED. That rules out reparenting (the hook works) and DPI -> it is fullscreen independent-flip / MPO: the SudoVDA virtual display, isolated as the SOLE active output, scans out one plane on one display, bypassing DWM composition, so Desktop Duplication gets a born-lost duplication. Apollo never hits this because it runs WITH a physical monitor attached (multi-display is already DWM-composited); we isolate to sole-display, so we must force composition ourselves. The fix already existed (ForceComposedFlip, a tiny topmost layered overlay that disqualifies independent-flip) but was only wired into the WGC relay path's secure branch, which PUNKTFUNK_NO_WGC=1 disables. Wire it into virtual_stream unconditionally (DDA owns the normal desktop here, where the storm is). Held for the session; Drop tears it down; PUNKTFUNK_FORCE_COMPOSED=0 disables. Keeps the prior build's born-lost escape as a safety net. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
63b63a4010 |
fix(host/windows): instrument + harden DDA against the born-lost ACCESS_LOST storm
The hybrid RTX4090+iGPU box storms DXGI_ERROR_ACCESS_LOST (0x887A0026) + MODE_CHANGE_IN_PROGRESS (0x887A0025) ~3s after first frame: every rebuilt duplication is born-lost (created OK, first AcquireNextFrame instantly ACCESS_LOST), seeds black, retries forever. The steady-state m3 loop calls try_latest()->acquire() which returns Ok(None) on every recovery, so the cold-rebuild escape (MAX_CAPTURE_REBUILDS) was unreachable -> frozen stream.
Multi-agent root-cause + adversarial review point at the win32u GPU-pref hook being ineffective (patched on the main thread, no FlushInstructionCache, never verified) rather than the synthesis's independent-flip theory (Apollo has no overlay yet is stable on this exact box). This build instruments + applies the safe, high-probability fixes:
- Hook: FlushInstructionCache after the inline patch (cross-thread i-cache); read back the 12 patched bytes and error! if they didn't land; per-call hit counter (hybrid_hook_hits) logged after open -- hits==0 proves the hook is off DXGI's reparent path.
- DPI: log SetProcessDpiAwarenessContext result + effective awareness (need 2=PER_MONITOR for DuplicateOutput1; explains the 100% E_ACCESSDENIED).
- SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED|ES_SYSTEM_REQUIRED) at capture open, restored on Drop -- stop IDD idle-invalidation (Apollo does this too).
- Born-lost escape: count consecutive born-lost rebuilds; on the NORMAL desktop (never the secure/Winlogon dwell) escalate to Err after ~5s so the m3 loop cold-rebuilds the whole pipeline instead of freezing on the last frame.
Diagnostic-forward: one test now tells us hook-hits + DPI awareness + whether ExecutionState/desktop-sync alone fixes it, and the stream self-recovers instead of wedging.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
60bb9727d6 |
fix(host/windows): correct SetDisplayConfig slice signature + local DISPLAYCONFIG_PATH_ACTIVE
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
2ac1014e8e |
fix(host/windows): CCD-based display isolation (detach hybrid-attached monitors)
The freeze on context change is the lock/login rendering on a PHYSICAL monitor instead of the captured SudoVDA display. Root cause: the legacy isolate_displays (EnumDisplayDevices + ChangeDisplaySettings) found NOTHING to detach on this hybrid box (4090 + AMD iGPU) — an iGPU-attached monitor isn't flagged ATTACHED_TO_DESKTOP in the GDI enum, so it's never detached and the secure desktop lands on it while the virtual output freezes. (Log: isolate ran, logged zero "detaching" lines.) Add CCD-based isolation (QueryDisplayConfig(QDC_ONLY_ACTIVE_PATHS) + SetDisplayConfig) — the API Apollo uses, which sees every active path. Deactivate all active paths except the SudoVDA target's, leaving the virtual display the sole desktop so ALL content (incl. Winlogon) renders to it. Runs alongside the legacy pass (now a no-op fallback); the original topology is saved and restored on teardown before REMOVE. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
3237ca31cd |
feat(host/windows): capture via IDXGIOutput5::DuplicateOutput1 (Apollo's capture API)
The one major capture-API difference left vs Apollo: punktfunk used legacy IDXGIOutput1::DuplicateOutput; Apollo uses IDXGIOutput5::DuplicateOutput1 with a format list, the modern path that's more robust to overlay/format changes (a candidate for the SudoVDA-on-hybrid 0x887A0026 churn). Add a duplicate_output() helper used at all 3 duplication sites (open, reopen_duplication, try_reduplicate): QI to IDXGIOutput5 and DuplicateOutput1, falling back to legacy DuplicateOutput. DuplicateOutput1 requires per-monitor-v2 DPI awareness, so set that at process start alongside the GPU-pref hook (matches Apollo). Format list is BGRA8-only for now (SDR test): DuplicateOutput1 returns the first format it can CONVERT to, so FP16-first would hand back FP16 even on SDR and trip the HDR path. Real FP16/HDR capture (with IDXGIOutput6 colorspace detection) is the follow-up once the churn is settled. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
7cfeddc770 |
fix(host/windows): install the GPU-preference hook at process start (before any DXGI)
The win32u hook only works if it patches before DXGI caches the hybrid preference. It was installed in DuplCapturer::open (first capture), but the SudoVDA render-adapter selection creates a DXGI factory during virtual-display setup — seconds earlier — so the preference was already cached and the hook had no effect (churn persisted; log showed "render adapter chosen" at :02, "hook installed" at :04). Call install_gpu_pref_hook() at the top of real_main(), before any command runs, so it beats the first DXGI factory. (open() still calls it too; Once makes the earliest call win.) Also fix the cosmetic function-cast-as-integer warning. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
a01f8a2f58 |
feat(host/windows): port Apollo's win32u GPU-preference hook (fix hybrid-GPU DDA churn)
Root cause of the ACCESS_LOST (0x887A0026) churn + context-change freeze, found live: the box is a HYBRID system (RTX 4090 + AMD Radeon iGPU + SudoVDA). DXGI does hybrid GPU-preference resolution and REPARENTS the SudoVDA output between adapters (SET_RENDER_ADAPTER is ignored — the IDD lands on the iGPU 0x23664 while we duplicate on the 4090 0x15768), which constantly invalidates Desktop Duplication. Apollo runs fine on this same box because it hooks this away. Port Apollo's hook: replace win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue to always report D3DKMT_GPU_PREFERENCE_STATE_UNSPECIFIED, so DXGI skips preference resolution and never reparents the output → DDA stays on one adapter. Installed once before the first DXGI factory/enumeration (DuplCapturer::open). We fully replace the function (never call the original) so a 12-byte absolute-jmp prologue patch suffices — no detour crate / C length-disassembler dependency, just VirtualProtect. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
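For reference, one common 12-byte absolute jump on x86-64 is `mov rax, imm64` followed by `jmp rax`. The commit does not say which encoding the hook uses, so the sketch below is an assumption that merely fits the stated 12-byte size; `abs_jmp_patch` is a hypothetical helper that builds the bytes and does not patch anything.

```rust
/// Encodes `mov rax, imm64; jmp rax` (x86-64), which is exactly 12 bytes.
/// Assumed encoding; the actual hook may use a different sequence.
fn abs_jmp_patch(target: u64) -> [u8; 12] {
    let mut patch = [0u8; 12];
    patch[0] = 0x48; // REX.W prefix
    patch[1] = 0xB8; // mov rax, imm64
    patch[2..10].copy_from_slice(&target.to_le_bytes()); // little-endian address
    patch[10] = 0xFF; // jmp r/m64 ...
    patch[11] = 0xE0; // ... with rax as the operand
    patch
}
```

Because the replaced function is never called again, no trampoline back to the original prologue is needed, which is why a fixed-size overwrite is enough.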
||
|
|
61fd75dc33 |
fix(host/windows): re-isolate/re-attach desktop ONLY on the secure desktop
recreate_dupl called reassert_isolation (a display-TOPOLOGY change via isolate_displays) + attach_input_desktop on EVERY ACCESS_LOST rebuild — 200× in a 6 s SDR session. A topology change itself invalidates the freshly-rebuilt duplication, so the next acquire is ACCESS_LOST → recreate → reassert → a self-feeding 0x887A0026 churn that freezes the stream and never recovers across context changes (lock / login / post-login). Gate both behind is_secure_desktop(): the heavy topology work runs only on the actual Winlogon (secure/login) desktop — where a physical monitor can grab the secure desktop off our virtual output. Routine churn, the lock screen, and post-login are all on the normal desktop, so they take a light re-duplicate with no topology meddling. Apollo isolates once at startup; its recovery just re-duplicates — this matches that. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
d11f2bf800 |
fix(host/windows): stop the DDA freeze — kill the HDR format-change storm + throttle ACCESS_LOST recovery
Two freeze drivers found live on the RTX box (DDA-only, 5K@240 HDR SudoVDA):
Step 1 — the per-frame format-change check (
|
||
|
|
995db69387 |
fix(host/windows): detect format/size change on the DDA acquire path
DDA only re-read the duplication format/size on rebuild (recreate_dupl) and initial open. A mid-stream HDR<->SDR flip (FP16<->BGRA — e.g. the SudoVDA output dropping out of HDR for the secure desktop) or a resolution change that does NOT raise ACCESS_LOST left hdr_fp16/width/height stale, so present_acquired copied into a mismatched-format/size target — the secure-desktop "works once, then HDR breaks" symptom. Re-read the acquired texture's desc every frame (as Apollo does) and rebuild on a real change instead of presenting a mismatched frame; throttled like the ACCESS_LOST path so a flapping toggle can't hammer DuplicateOutput. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
3d04ce92a1 |
feat(host/windows): PUNKTFUNK_NO_WGC — force single-process DDA everywhere
A single test flag to bring up / validate DDA on its own and as the base for the secure-desktop work. When set it (1) skips WGC in capture_virtual_output (forces dxgi::DuplCapturer, same as PUNKTFUNK_CAPTURE=dda) and (2) makes should_use_helper return false, so even a SYSTEM host bypasses the two-process WGC relay and captures in-process with one DDA capturer for both the normal AND the secure desktop — Apollo's model. All the WGC / relay code stays compiled; unset the flag to restore. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
6ea52b0372 |
feat(host/windows): SDR-while-secure — drop SudoVDA out of HDR on Winlogon so DDA captures it
When the DDA-on-secure path is enabled (PUNKTFUNK_SECURE_DDA=1), the mux now toggles the SudoVDA's advanced-color (HDR) state via the CCD API (sudovda::set_advanced_color → DisplayConfigSetDeviceInfo + DISPLAYCONFIG_SET_ADVANCED_COLOR_STATE): on entering the secure (Winlogon) desktop it disables HDR so the lock/UAC renders SDR/composed (no fullscreen independent-flip → DDA can duplicate it instead of storming ACCESS_LOST/black), opens DDA fresh on the now-SDR output; on returning to normal it re-enables HDR and rebuilds the helper so WGC re-detects the restored colorspace. Also debounce the DesktopWatcher (publish a Default↔Winlogon change only after it is stable ~80ms) so transient flaps during the transition don't thrash the mux. Default (no flag) is unchanged: WGC stays live through a lock, no DDA switch. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
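The DesktopWatcher debounce (publish a change only once it has been stable for about 80 ms) can be sketched as below. `Desktop` and `Debouncer` are hypothetical types for illustration; how the watcher actually samples the input desktop is not shown.

```rust
use std::time::{Duration, Instant};

#[derive(Debug, Clone, Copy, PartialEq)]
enum Desktop {
    Default,
    Winlogon,
}

/// Publishes a desktop change only after the new value has been stable for `hold`.
struct Debouncer {
    published: Desktop,
    pending: Option<(Desktop, Instant)>,
    hold: Duration,
}

impl Debouncer {
    fn new(initial: Desktop, hold: Duration) -> Self {
        Self { published: initial, pending: None, hold }
    }

    /// Feed one raw observation; returns Some(new) when a stable change is published.
    fn observe(&mut self, seen: Desktop, now: Instant) -> Option<Desktop> {
        if seen == self.published {
            self.pending = None; // flapped back: discard the candidate
            return None;
        }
        match self.pending {
            Some((cand, since)) if cand == seen => {
                if now.duration_since(since) >= self.hold {
                    self.published = seen;
                    self.pending = None;
                    return Some(seen);
                }
            }
            // First sighting of a new value: start its stability timer.
            _ => self.pending = Some((seen, now)),
        }
        None
    }
}
```

A flap back to the published desktop clears the candidate, so the mux only sees transitions that survived the full hold period.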
||
|
|
be18797df8 |
feat(client): request a recovery keyframe on unrecoverable loss
Under infinite GOP the punktfunk/1 plane has no periodic IDR — the only recovery keyframe is one the client requests. But the reassembler drops unrecoverable AUs silently (frames_dropped) and hands the decoder reference-missing delta frames that libavcodec conceals and returns Ok for, so keying recovery off a decode error mostly never fires under real loss → a long/permanent freeze. Surface the data-plane pump's Session.frames_dropped to NativeClient via a shared atomic (NativeClient::frames_dropped()), updated every pump iteration so it stays current through a total-loss drought. The Linux and Windows client video loops watch it and call request_keyframe() when it climbs, throttled to 100 ms (the decode stays wedged for several frames until the IDR lands). macOS already does this; client-rs doesn't decode. Resolves reliability backlog #2. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
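The client-side rule (request a keyframe when the shared drop counter climbs, at most once per 100 ms) can be sketched like this. `KeyframeWatch` and `should_request` are hypothetical names; the caller would invoke request_keyframe() whenever the method returns true.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::{Duration, Instant};

/// Watches a shared drop counter and decides when to ask the host for a keyframe.
struct KeyframeWatch {
    last_seen: u64,
    last_request: Option<Instant>,
    min_gap: Duration,
}

impl KeyframeWatch {
    fn new(min_gap: Duration) -> Self {
        Self { last_seen: 0, last_request: None, min_gap }
    }

    /// Returns true when the counter climbed and the throttle window has passed.
    fn should_request(&mut self, frames_dropped: &AtomicU64, now: Instant) -> bool {
        let dropped = frames_dropped.load(Ordering::Relaxed);
        if dropped <= self.last_seen {
            return false;
        }
        if let Some(prev) = self.last_request {
            if now.duration_since(prev) < self.min_gap {
                // Keep last_seen unchanged so the loss is handled after the window.
                return false;
            }
        }
        self.last_seen = dropped;
        self.last_request = Some(now);
        true
    }
}
```

Leaving `last_seen` untouched while throttled means drops that arrive inside the window still trigger a request once the window passes.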
||
|
|
55d5a4278f |
fix(host): self-heal capture loss + audio-thread death mid-session
Two steady-state faults previously bubbled a bare `?` to conn.close / silently muted the rest of a session. Recover in place instead.
#4 — capture loss (virtual_stream): a mid-session capture stall/disconnect (`try_latest` Err: PipeWire/compositor thread ended, virtual output gone) ended the whole session — and the native client has no reconnect path, so it had to cold-restart the handshake. Now rebuild the pipeline IN PLACE at the current mode via build_pipeline_with_retry (same primitive the mode/session switch uses), force a keyframe, and only propagate when the bounded retry is exhausted. A consecutive-rebuild cap stops a flapping source from looping the client through endless cold IDRs. Track the live mode so a rebuild after a mode switch targets the right mode (also fixes the session-switch rebuild using the stale mode).
#3 — native audio thread (audio_thread): broke the loop on ANY next_chunk Err, spawned once per session and never restarted, so a transient 5 s quiet-sink timeout permanently muted a multi-hour session. Make a quiet sink return an empty chunk (not an Err) in both backends so only a genuinely dead capture thread is an Err, and reopen-with-backoff (INJECTOR_REOPEN_BACKOFF) on death, keeping the Opus encoder + monotonic seq. Documents the next_chunk contract; also makes the GameStream audio sender survive quiet sinks for free.
Resolves reliability backlog #3 and #4.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
e8619c2362 |
fix(host/windows): keep WGC through the secure desktop by default (DDA-secure opt-in)
Regression fix. The DDA-on-secure mux + force-composed overlay + rebuild-on-switch
made the stream worse than just staying on WGC: DDA can't reliably capture the
secure desktop's HDR independent-flip (storms ACCESS_LOST → instant black), and
rebuilding the output on every Default↔Winlogon flip thrashed (frequent freezes).
Meanwhile the WGC helper STAYS LIVE through a lock/UAC.
So make the DDA-on-secure path OPT-IN (PUNKTFUNK_SECURE_DDA=1, or the test
toggle). By default the mux keeps WGC the whole session — the DesktopWatcher and
the force-composed overlay aren't even started — so a lock/UAC no longer black-
screens or freezes the stream. The DDA-secure machinery stays in the tree for
future experimentation behind the flag.
(Reverts the rebuild-on-every-switch change
|
||
|
|
555ec2a3b7 |
Revert "fix(host/windows): rebuild the output fresh on every WGC↔DDA source switch"
This reverts commit
|