Commit Graph

24 Commits

Author SHA1 Message Date
enricobuehler eb451d8bc6 fix(host/windows): retry DuplicateOutput1 to ride out the old-dup teardown race
User's insight, and it fits the evidence exactly: in duplicate_output the FIRST
DuplicateOutput1 (called microseconds after the caller releases the old
duplication via self.dupl=None) returns E_ACCESSDENIED, but the legacy
DuplicateOutput a beat later SUCCEEDS — the only difference is TIMING. The
kernel-side teardown of the just-released duplication is async, so the immediate
DuplicateOutput1 races it ('output still duplicated' -> E_ACCESSDENIED). We then
fell straight through to legacy DuplicateOutput, which 'succeeds' into a FRAGILE
dup that churns ACCESS_LOST/MODE_CHANGE every few ms on this cross-GPU IDD
(causing the post-login freeze + UAC-confirm drop).

Fix: retry DuplicateOutput1 up to 5x with escalating 2/4/8/16 ms waits before
falling back to legacy, so the teardown finishes and the ROBUST DuplicateOutput1
dup succeeds (no churn). Bounded (~30 ms worst case) so a genuine failure still
falls back quickly. This is exactly Apollo's 2x/200ms retry rationale.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 16:02:22 +00:00
enricobuehler 1e1e5ce9b5 fix(host/windows): Option-handle the multi-line dupl.GetFramePointerShape call too
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:41:41 +00:00
enricobuehler da43b5e8d3 fix(host/windows): release the old duplication before re-duplicating (THE born-lost bug)
DuplicateOutput1 returned E_ACCESSDENIED ~8815x even with PER_MONITOR_AWARE_V2
confirmed on the capture thread (thread_is_v2=true) — so DPI was NOT the cause.
The real cause: DXGI permits only ONE IDXGIOutputDuplication per output, and on
ACCESS_LOST you MUST release the old one before re-duplicating. Our recovery
(try_reduplicate / recreate_dupl) created the NEW duplication while the OLD
self.dupl was still alive → the output stayed held → DuplicateOutput1
E_ACCESSDENIED and the legacy fallback returned a BORN-LOST dup. It never
converged because there was always exactly one stale dup alive at creation
time. The initial open() works precisely because there's no prior dup; Apollo
is clean because it releases (dup.reset()) before every re-DuplicateOutput.

Fix: make self.dupl an Option and set it to None (drop → release the output)
BEFORE duplicate_output in try_reduplicate and before reopen_duplication in
recreate_dupl, then Some(new). acquire() gets a None-guard that synthesizes
ACCESS_LOST (routes into recovery) so a transient None can't panic. All
ReleaseFrame/AcquireNextFrame sites updated for the Option.

This is the documented DDA recovery requirement and the one thing that
distinguished our failing DuplicateOutput1 from Apollo's working one.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:40:50 +00:00
enricobuehler c8fb4822a2 fix(host/windows): per-thread Per-Monitor-V2 DPI awareness so DuplicateOutput1 succeeds
The remaining born-lost ACCESS_LOST storm traces to ONE thing: our
IDXGIOutput5::DuplicateOutput1 returns E_ACCESSDENIED (0x80070005) ~4370x, so
we fall back to legacy DuplicateOutput, which yields a BORN-LOST duplication on
this hybrid box. Apollo's DuplicateOutput1 SUCCEEDS on the identical
desktop/output/4090-device → a working dup, clean capture.

Root cause: DuplicateOutput1 REQUIRES Per-Monitor-Aware-V2. At startup our
SetProcessDpiAwarenessContext(PER_MONITOR_AWARE_V2) FAILS with E_ACCESSDENIED
('already set' — a manifest/runtime locked the process to a lower awareness),
and GetAwarenessFromDpiAwarenessContext reports 2 for BOTH Per-Monitor V1 and
V2, so the earlier 'awareness=2' was misleading — the process is likely V1,
which DuplicateOutput1 rejects with E_ACCESSDENIED. (Legacy DuplicateOutput has
no V2 requirement, so it 'worked' but born-lost.)

Fix: SetThreadDpiAwarenessContext(PER_MONITOR_AWARE_V2) on the capture thread
in open() — a per-thread override that takes regardless of the process default,
so DuplicateOutput1 can succeed (the working dup Apollo gets). Logs set_ok +
thread_is_v2 (via AreDpiAwarenessContextsEqual) to confirm V2 actually applied.
Topology fixes (sole display, no MODE_CHANGE) and the recovery backstops stay.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:29:17 +00:00
enricobuehler 769fd96b87 fix(host/windows): stop SudoVDA MODE_CHANGE_IN_PROGRESS storm — don't force IDD primary by default
ROOT CAUSE (verified by multi-agent compare vs Apollo + adversarial review):
set_active_mode() applied the SudoVDA mode with CDS_UPDATEREGISTRY | CDS_GLOBAL
| CDS_SET_PRIMARY + DM_POSITION(0,0) — promoting the freshly-added IDD to
PRIMARY at the virtual-screen origin and persisting it globally. On this box
(baseline active display = a 1024x768 basic 'WinDisc') that primary-promotion
contests the existing display so the desktop topology never reaches a stable
fixed point → every DuplicateOutput/AcquireNextFrame during the unending
settle returns DXGI_ERROR_MODE_CHANGE_IN_PROGRESS (0x887A0025). Apollo, live
on this EXACT box with an empty config, never promotes primary and captures
the same SudoVDA at 5120x1440 with zero DXGI errors. (Ruled out earlier on the
live box: win32u hook, DPI, independent-flip/overlay, isolation, render pin.)

Fixes (subtractive, gated per adversarial review):
- sudovda.rs set_active_mode: default to CDS_UPDATEREGISTRY only (no primary
  promotion, no GLOBAL, no DM_POSITION) = Apollo-parity for the multi-display
  default. Promote to primary (CDS_GLOBAL|CDS_SET_PRIMARY+DM_POSITION) ONLY
  when PUNKTFUNK_ISOLATE_DISPLAYS=1 (sole display, where a blank extended IDD
  would otherwise yield no frames). Avoids regressing headless/isolated +
  mid-stream Reconfigure.
- dxgi.rs acquire: treat MODE_CHANGE_IN_PROGRESS (0x887A0025) as a TRANSIENT
  (Ok(None), repeat last frame, wait it out) instead of falling through to the
  fatal Err arm → cold-rebuild → create()→set_active_mode (which re-issued the
  mode change and amplified the storm).
- dxgi.rs acquire: remove the born-lost cold-rebuild escape — it re-created the
  SudoVDA (IOCTL REMOVE/ADD = the audible PnP chime the user heard) and never
  converged; now repeat last frame in-process (never tear the IDD down mid-
  session, like Apollo). Overlay + cheap-spin/HDR recovery left intact.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 14:59:42 +00:00
enricobuehler 63b63a4010 fix(host/windows): instrument + harden DDA against the born-lost ACCESS_LOST storm
The hybrid RTX4090+iGPU box storms DXGI_ERROR_ACCESS_LOST (0x887A0026) +
MODE_CHANGE_IN_PROGRESS (0x887A0025) ~3s after first frame: every rebuilt
duplication is born-lost (created OK, first AcquireNextFrame instantly
ACCESS_LOST), seeds black, retries forever. The steady-state m3 loop calls
try_latest()->acquire() which returns Ok(None) on every recovery, so the
cold-rebuild escape (MAX_CAPTURE_REBUILDS) was unreachable -> frozen stream.

Multi-agent root-cause + adversarial review point at the win32u GPU-pref hook
being ineffective (patched on the main thread, no FlushInstructionCache, never
verified) rather than the synthesis's independent-flip theory (Apollo has no
overlay yet is stable on this exact box).

This build instruments + applies the safe, high-probability fixes:
- Hook: FlushInstructionCache after the inline patch (cross-thread i-cache);
  read back the 12 patched bytes and error! if they didn't land; per-call hit
  counter (hybrid_hook_hits) logged after open -- hits==0 proves the hook is
  off DXGI's reparent path.
- DPI: log SetProcessDpiAwarenessContext result + effective awareness (need
  2=PER_MONITOR for DuplicateOutput1; explains the 100% E_ACCESSDENIED).
- SetThreadExecutionState(ES_CONTINUOUS|ES_DISPLAY_REQUIRED|ES_SYSTEM_REQUIRED)
  at capture open, restored on Drop -- stop IDD idle-invalidation (Apollo does
  this too).
- Born-lost escape: count consecutive born-lost rebuilds; on the NORMAL desktop
  (never the secure/Winlogon dwell) escalate to Err after ~5s so the m3 loop
  cold-rebuilds the whole pipeline instead of freezing on the last frame.

Diagnostic-forward: one test now tells us hook-hits + DPI awareness + whether
ExecutionState/desktop-sync alone fixes it, and the stream self-recovers
instead of wedging.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 14:02:55 +00:00
enricobuehler 3237ca31cd feat(host/windows): capture via IDXGIOutput5::DuplicateOutput1 (Apollo's capture API)
The one major capture-API difference left vs Apollo: punktfunk used legacy
IDXGIOutput1::DuplicateOutput; Apollo uses IDXGIOutput5::DuplicateOutput1 with a
format list, the modern path that's more robust to overlay/format changes (a
candidate for the SudoVDA-on-hybrid 0x887A0026 churn). Add a duplicate_output()
helper used at all 3 duplication sites (open, reopen_duplication, try_reduplicate):
QI to IDXGIOutput5 and DuplicateOutput1, falling back to legacy DuplicateOutput.
DuplicateOutput1 requires per-monitor-v2 DPI awareness, so set that at process
start alongside the GPU-pref hook (matches Apollo).

Format list is BGRA8-only for now (SDR test): DuplicateOutput1 returns the first
format it can CONVERT to, so FP16-first would hand back FP16 even on SDR and trip
the HDR path. Real FP16/HDR capture (with IDXGIOutput6 colorspace detection) is the
follow-up once the churn is settled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 13:00:37 +00:00
enricobuehler 7cfeddc770 fix(host/windows): install the GPU-preference hook at process start (before any DXGI)
The win32u hook only works if it patches before DXGI caches the hybrid preference.
It was installed in DuplCapturer::open (first capture), but the SudoVDA
render-adapter selection creates a DXGI factory during virtual-display setup —
seconds earlier — so the preference was already cached and the hook had no effect
(churn persisted; log showed "render adapter chosen" at :02, "hook installed" at
:04). Call install_gpu_pref_hook() at the top of real_main(), before any command
runs, so it beats the first DXGI factory. (open() still calls it too; Once makes
the earliest call win.) Also fix the cosmetic function-cast-as-integer warning.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:39:50 +00:00
enricobuehler a01f8a2f58 feat(host/windows): port Apollo's win32u GPU-preference hook (fix hybrid-GPU DDA churn)
Root cause of the ACCESS_LOST (0x887A0026) churn + context-change freeze, found
live: the box is a HYBRID system (RTX 4090 + AMD Radeon iGPU + SudoVDA). DXGI does
hybrid GPU-preference resolution and REPARENTS the SudoVDA output between adapters
(SET_RENDER_ADAPTER is ignored — the IDD lands on the iGPU 0x23664 while we
duplicate on the 4090 0x15768), which constantly invalidates Desktop Duplication.
Apollo runs fine on this same box because it hooks this away.

Port Apollo's hook: replace win32u.dll!NtGdiDdDDIGetCachedHybridQueryValue to always
report D3DKMT_GPU_PREFERENCE_STATE_UNSPECIFIED, so DXGI skips preference resolution
and never reparents the output → DDA stays on one adapter. Installed once before the
first DXGI factory/enumeration (DuplCapturer::open). We fully replace the function
(never call the original) so a 12-byte absolute-jmp prologue patch suffices — no
detour crate / C length-disassembler dependency, just VirtualProtect.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:31:54 +00:00
enricobuehler 61fd75dc33 fix(host/windows): re-isolate/re-attach desktop ONLY on the secure desktop
recreate_dupl called reassert_isolation (a display-TOPOLOGY change via
isolate_displays) + attach_input_desktop on EVERY ACCESS_LOST rebuild — 200×
in a 6 s SDR session. A topology change itself invalidates the freshly-rebuilt
duplication, so the next acquire is ACCESS_LOST → recreate → reassert → a
self-feeding 0x887A0026 churn that freezes the stream and never recovers across
context changes (lock / login / post-login).

Gate both behind is_secure_desktop(): the heavy topology work runs only on the
actual Winlogon (secure/login) desktop — where a physical monitor can grab the
secure desktop off our virtual output. Routine churn, the lock screen, and
post-login are all on the normal desktop, so they take a light re-duplicate with
no topology meddling. Apollo isolates once at startup; its recovery just
re-duplicates — this matches that.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 12:07:16 +00:00
enricobuehler d11f2bf800 fix(host/windows): stop the DDA freeze — kill the HDR format-change storm + throttle ACCESS_LOST recovery
Two freeze drivers found live on the RTX box (DDA-only, 5K@240 HDR SudoVDA):

Step 1 — the per-frame format-change check (995db69) mis-fired EVERY frame in HDR
(827+/session): self.hdr_fp16 is derived from the duplication ModeDesc (FP16
scanout mode), but legacy DuplicateOutput always hands back 8-bit BGRA, so the
acquired-texture format never equals hdr_fp16 → a rebuild storm (each rebuild
re-inits device+NVENC → freeze). Make the acquire check SIZE-only; a real
HDR<->SDR toggle still arrives as ACCESS_LOST → recreate_dupl re-detects it.

Step 3 — ACCESS_LOST (0x887A0026) churn: HDR overlay/MPO flips invalidate the
duplication continuously and the recovery loop had no rate limit (the 250ms
throttle guarded only the full rebuild, not the cheap try_reduplicate), so it
spun DuplicateOutput + up-to-16ms Acquire and starved the encode thread. Add a
last_recover throttle capping ALL recovery attempts to ~one per 5ms; between
attempts return None so the caller repeats the last frame, paced at the frame
interval (no busy-spin, encode thread keeps running).

Real FP16 HDR capture (DuplicateOutput1) + per-loss desktop-reisolation cleanup
are the next steps; validate this in SDR first.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 11:54:23 +00:00
enricobuehler 995db69387 fix(host/windows): detect format/size change on the DDA acquire path
DDA only re-read the duplication format/size on rebuild (recreate_dupl) and
initial open. A mid-stream HDR<->SDR flip (FP16<->BGRA — e.g. the SudoVDA output
dropping out of HDR for the secure desktop) or a resolution change that does NOT
raise ACCESS_LOST left hdr_fp16/width/height stale, so present_acquired copied
into a mismatched-format/size target — the secure-desktop "works once, then HDR
breaks" symptom. Re-read the acquired texture's desc every frame (as Apollo does)
and rebuild on a real change instead of presenting a mismatched frame; throttled
like the ACCESS_LOST path so a flapping toggle can't hammer DuplicateOutput.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 11:18:53 +00:00
enricobuehler 6d7301ccf5 fix(windows): two-pass cursor compositing (alpha + XOR) in DXGI capture
A single DXGI cursor shape can need BOTH an alpha-blended layer AND a
screen-inverting (XOR) layer at once — a masked-color text I-beam (opaque
hot-spot + inverting bar) or a monochrome cursor mixing opaque and invert
pixels. The old path produced ONE BGRA image per shape and picked ONE blend
(cursor_invert) for the whole shape, so such mixed cursors rendered wrong
(masked-color opaque pixels forced through the invert blend; monochrome
(AND=1,XOR=1) invert pixels approximated as solid black).

Port Apollo/Sunshine's decomposition: convert_pointer_shape now returns a
CursorShape with optional alpha/xor layers; CursorCompositor holds tex_alpha
+ tex_xor and draw_layer renders each with its own blend (alpha = src-over,
HDR-scaled; XOR = inversion, unscaled — it operates on the framebuffer
reference). The CPU software path blends both layers too. Empty layers are
never uploaded or drawn. Removes the single cursor_invert flag.

Fixes #13 in docs/apollo-comparison.md. Independently reviewed (ship);
Windows-only code — compile verified by CI / dev VM.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 09:48:34 +00:00
enricobuehler 7bf2899301 fix(host/windows): secure-desktop black screen — capture the real frame, don't seed black
apple / swift (push) Successful in 56s
android / android (push) Failing after 54s
ci / web (push) Successful in 39s
ci / docs-site (push) Successful in 31s
ci / rust (push) Failing after 2m15s
deb / build-publish (push) Successful in 2m4s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
ci / bench (push) Successful in 4m52s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 4m11s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 3m29s
docker / deploy-docs (push) Failing after 6s
Root cause (confirmed live: "black until I pressed a key, then the image came
back"): the secure desktop (lock/login/UAC) is STATIC, and DXGI Desktop
Duplication only emits a frame on CHANGE. On the normal→secure switch the
duplication is rebuilt (recreate_dupl / try_reduplicate), and we then SEEDED A
BLACK frame as last_present — which the static secure desktop never replaced
(no change-frame) until the user pressed a key. So we streamed black.

Fix: after rebuilding the duplication, CAPTURE the current desktop frame instead
of seeding black. A freshly-created duplication's first AcquireNextFrame returns
the full current desktop; grab it and present it. New `present_acquired` factors
the frame-processing out of `acquire`; both recovery paths now call it:
- recreate_dupl: after adopting the new duplication, acquire+present the real
  frame (born-lost ACCESS_LOST / no-initial-frame → seed black as fallback and
  let the 250ms-throttled caller retry — a brief flash, then real content).
- try_reduplicate: adopt-first, then capture its probe frame (was discarded).

Also (independently-correct safe fixes, per the adversarial review):
- DesktopWatcher computes the current desktop synchronously in start() before
  returning, so a session that begins on the secure desktop (reconnect to a
  locked box) doesn't relay one stale normal-desktop frame (the "flash").
- DuplCapturer::open reasserts SudoVDA isolation at open time (mirrors
  recreate_dupl) — forces the secure desktop back onto the virtual output if a
  lock/UAC re-attached a physical monitor.
- Instrumentation: dbg_black_seeds counter + a throttled warn when black is
  seeded, and an info when a real secure-desktop frame is captured on recovery.

Pending: the user's real-lock smoke test on the 4090 (a headless PsExec
LockWorkStation runs as SYSTEM and can't lock an interactive session, so this
must be validated with an actual lock).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 09:15:33 +00:00
enricobuehler 28ab448a29 feat(host/windows): WGC capture backend (overlay/HDR-correct) with watchdog'd DDA fallback
android / android (push) Failing after 46s
apple / swift (push) Successful in 54s
ci / rust (push) Failing after 1m16s
ci / web (push) Successful in 31s
ci / docs-site (push) Successful in 27s
deb / build-publish (push) Successful in 2m23s
decky / build-publish (push) Successful in 10s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 5s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m31s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m50s
The capture-architecture reset from the research: add a Windows.Graphics.Capture (WGC) backend that
captures the COMPOSED desktop — including the overlay/independent-flip/MPO planes DXGI Desktop
Duplication misses — which structurally fixes the frozen HDR animations + video (proven live: a WGC
frame decodes to the real 5120x1440 HDR content DDA freezes on). It reuses the whole pipeline
unchanged: the WGC frame's GPU texture → same scRGB→BT.2020-PQ shader → NVENC zero-copy; the OS
composites the cursor (IsCursorCaptureEnabled) so no manual cursor pass. crates/punktfunk-host/src/
capture/wgc.rs; find_output/make_device/HdrConverter/nudge_cursor_onto made pub(crate) for reuse.

Reliability findings + mitigations (live on the RTX 4090):
- WGC can't activate under the SYSTEM account (0x80070424) — it needs the interactive user token. The
  host must run as the user for WGC (run.cmd: drop PsExec -s). DDA still needs SYSTEM for the secure
  desktop — that token reconciliation (impersonation) is the remaining task.
- WGC's Direct3D11CaptureFramePool::CreateFreeThreaded intermittently HANGS on the headless SudoVDA
  (IddCx) display, correlated with accumulated SudoVDA churn (failed REMOVEs leaving lingering
  displays); clean-state opens reliably. Since it's a blocking hang, capture_virtual_output runs WGC
  open on a watchdog thread with a 5s timeout and falls back to DDA on hang/error — the session is
  NEVER left black: WGC when it opens (fixed animations), DDA otherwise. First-frame nudge added (WGC
  fires FrameArrived on change; a static desktop otherwise never delivers the first frame).
- Default WGC; PUNKTFUNK_CAPTURE=dda forces DDA. DDA path unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 06:32:54 +00:00
enricobuehler b1e95a386f fix(host/windows): tiered DXGI recovery — cheap re-DuplicateOutput for the HDR ACCESS_LOST churn
apple / swift (push) Successful in 53s
ci / web (push) Successful in 28s
android / android (push) Successful in 1m46s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 11s
ci / rust (push) Successful in 1m4s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m17s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m56s
The HDR path produced a constant ACCESS_LOST churn during real desktop activity (window resize /
Start menu / DWM transitions): the duplication keeps getting invalidated but the OUTPUT stays valid
(probe passes — 0 born-lost over 72 rebuilds). The old recovery did a FULL rebuild (new device +
factory) on every loss, which re-inits NVENC + seeds black + was throttled to 4x/s → mostly-frozen,
re-init churn = "broken animations".

Now recovery is tiered (mirrors Sunshine): try_reduplicate() does a fresh DuplicateOutput on the
EXISTING device+output — no new device, so NO encoder re-init, NO black seed, gpu_copy/HDR
textures/last_present kept → frames resume immediately. Only a genuine output loss (secure-desktop
switch) or a dead device (DEVICE_REMOVED/RESET) falls back to the full, throttled recreate_dupl.
Both paths probe the new duplication and reject a born-lost one.

Validated synthetically (1080p60 + 5120x1440@240 HDR): pipeline stable, 0 churn, frames flow. The
real-desktop churn needs live validation (can't synthesize DWM animations). Secure-desktop "UI never
appears in-session" is a separate issue (output gone in-session; only a fresh monitor re-add works) —
still open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 21:31:14 +00:00
enricobuehler 0a3b92d994 fix(host/windows): HDR cursor brightness (203-nit) + probe-before-adopt recovery; windows-client bootstrap doc
apple / swift (push) Successful in 55s
android / android (push) Successful in 2m43s
ci / web (push) Successful in 31s
ci / docs-site (push) Successful in 37s
ci / bench (push) Successful in 1m35s
ci / rust (push) Successful in 7m7s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m33s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m33s
docker / deploy-docs (push) Successful in 18s
- HDR cursor: sRGB→linear decode + scale to HDR graphics white (PUNKTFUNK_HDR_CURSOR_NITS, default
  203 per BT.2408) in the FP16 cursor composite, so it's no longer ~2.5x too dim. SDR path unchanged;
  the masked-color (I-beam) inversion blend left unscaled. Cursor cbuffer widened 16→32 + bound to PS.
  (Validated live: cursor now correct brightness in HDR.)
- Secure-desktop recovery: recreate_dupl now PROBES the rebuilt duplication with a 50ms
  AcquireNextFrame and only adopts it when live (Ok/WAIT_TIMEOUT); a born-lost one (immediate
  ACCESS_LOST) is dropped so the caller repeats the last frame + retries. Plus reassert_isolation()
  re-detaches physical displays on every recovery (re-routing the secure/HDR desktop to the virtual
  output, the delta a fresh reconnect has). NOTE: the born-lost ACCESS_LOST storm in HDR is NOT yet
  resolved by these — still under investigation (animations/secure-UI/cursor-trail in HDR remain).
- docs/windows-client-bootstrap.md: handoff for the native Windows Rust client (windows-rs Reactor +
  WinUI 3 SwapChainPanel, D3D11VA decode, WASAPI audio, SDL3 input; ports crates/punktfunk-client-linux;
  10-bit/HDR present; dev boxes + gotchas).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 21:20:42 +00:00
enricobuehler bbabc04bca feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an
8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the
capture losing the stream across a secure-desktop transition.

Protocol (punktfunk-core/quic.rs):
- Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`,
  both as optional trailing bytes (back-compat). client-rs advertises 10-bit via
  PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives
  the native clients). Regenerated punktfunk_core.h.

Windows host:
- 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND
  PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10,
  pixelBitDepthMinus8).
- HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to
  an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit
  (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI
  (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the
  BGRA path. Reacts dynamically to the HDR toggle.
- Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s,
  repeating the last good frame between attempts (format-tagged last_present). During a
  secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the
  send loop for seconds so the client timed out and disconnected — now the session stays
  fed (frozen) until the desktop returns. Also seeds a black frame on recovery.

Apple client (PunktfunkKit):
- Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010,
  and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged.
  Switches automatically on a mid-session HDR toggle.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 20:28:52 +00:00
enricobuehler f4b4a6c1e4 feat(host/windows): native res, cursor, secure-desktop capture, windowless SYSTEM launch
apple / swift (push) Successful in 52s
ci / rust (push) Failing after 36s
ci / web (push) Successful in 31s
android / android (push) Successful in 1m52s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m19s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m57s
docker / deploy-docs (push) Successful in 17s
Live-validated Mac <-> RTX 4090 at the display's native 5120x1440@240:

- Resolution: set_active_mode enumerates the IDD's advertised modes and sets the
  requested resolution at the best supported refresh (keeps 5120x1440@240; no more
  silent fallback to the 1080p OS default when an exact mode is briefly unavailable).
- Bitrate auto-cap: NVENC init probes and steps the average bitrate down to the GPU's
  codec-level max so a high client bitrate connects (matches the Linux host; we do not
  split NVENC sessions).
- Mouse cursor: DXGI duplication excludes the HW cursor; capture the pointer
  shape/position (GetFramePointerShape) and GPU-composite it before NVENC. Color cursors
  alpha-blend; masked-color (the text I-beam) uses an INV_DEST_COLOR inversion blend so
  the caret inverts the screen and shows on any background (no black box); monochrome
  handled too.
- Secure desktop (lock / login / UAC): run as SYSTEM in the interactive session, follow
  the input desktop via SetThreadDesktop, and on the WinSta switch recreate the D3D11
  device and re-resolve the virtual output's GDI name from the stable SudoVDA target id
  (the name changes across the topology rebuild; the old failure hunted the stale
  \\.\DISPLAYn and dropped). ACCESS_LOST / INVALID_CALL / device-removed are recoverable,
  and a mid-stream resolution change is followed (capturer + NVENC re-init at the new
  size). isolate_displays detaches other monitors so Winlogon renders to the virtual
  output. One real session recovered 1012 desktop switches and completed cleanly.

Windows-only backends; Linux/macOS unaffected. Builds clean on x86_64-pc-windows-msvc.
Deployment (windowless SYSTEM launch via PsExec + hidden VBScript) documented in
docs/windows-host.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 15:46:34 +00:00
enricobuehler 7654b20b2a fix(host/windows): NVENC capture on real GPU + HOME-less config dir
apple / swift (push) Successful in 54s
android / android (push) Failing after 1m44s
ci / rust (push) Successful in 1m18s
ci / web (push) Successful in 28s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 1m50s
decky / build-publish (push) Successful in 10s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 4s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 5s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Failing after 2m56s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m4s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m48s
docker / deploy-docs (push) Successful in 17s
Validated live on an RTX 4090 (Windows 11) host streaming to the Rust
reference client over the LAN: SudoVDA virtual display → DXGI Desktop
Duplication (D3D11 zero-copy) → NVENC HEVC → punktfunk/1. 720p60 and
1080p60 both clean (181 / 177 frames, 0 mismatched, p50 1.6 / 3.45 ms
cross-machine), coexisting with Apollo. Two real-hardware bugs the
GPU-less VM couldn't surface:

- DXGI capturer: the SudoVDA virtual monitor's DXGI output is enumerated
  under the GPU that *renders* it (the 4090, LUID 0x15df6), NOT under the
  SudoVDA "adapter" LUID SudoVDA reports (0x23276). Restricting the output
  search to that LUID found nothing → "adapter has no output named
  \\.\DISPLAYn". Now search ALL adapters for the GDI name, bind the D3D11
  device to whichever adapter exposes it (NVENC then shares that device),
  with a settle-retry (the output appears a beat after display creation)
  and topology logging.

- native_pairing / apps: keyed config paths off raw $HOME, which a Windows
  service/scheduled-task context doesn't set → "HOME unset" hard-fail at
  m3-host startup. Route both through gamestream::config_dir(), which falls
  back to %APPDATA% on Windows (cert/paired/apps now under AppData\Roaming).

clippy -D warnings + build green on x86_64-pc-windows-msvc (default and
--features nvenc) and Linux (78/78 tests).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 09:18:15 +00:00
enricobuehler 9c61b03101 feat(host/windows): ViGEm rumble back-channel + Windows clippy clean
android / android (push) Failing after 21s
ci / web (push) Failing after 10s
ci / docs-site (push) Failing after 1s
ci / bench (push) Failing after 0s
deb / build-publish (push) Failing after 0s
decky / build-publish (push) Failing after 1s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 0s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Failing after 1s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 0s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 1s
flatpak / build-publish (push) Failing after 0s
apple / swift (push) Successful in 53s
ci / rust (push) Failing after 2m35s
Wire the host→client rumble path on Windows, the analogue of the Linux
uinput EV_FF read loop: a game's force-feedback on the virtual Xbox 360
pad is delivered by ViGEm's notification API (`request_notification` →
`spawn_thread`, gated by the crate's `unstable_xtarget_notification`
feature). A per-pad background thread stores the latest motor levels;
`pump_rumble` relays changes to the client on the universal 0xCA plane
(motors scaled 0..255 → 0..65535). Dropping the target aborts the
notification, so the thread exits with the session. Live verification
still needs a physical pad.

Also fix the Windows backends' clippy debt — these modules are cfg-
excluded from Linux CI, so `clippy -D warnings` never saw them, and the
VM's rustc 1.96 clippy is stricter on shared code than the CI image:
- dxgi: manual checked division → checked_div().map_or
- sendinput: `x = x | y` → `x |= y`
- sudovda: `.then(|| ptr)` → `.then_some(ptr)`
- m3 pick_compositor: drop the needless early return (match form)
- m3 resolve_compositor: Windows arm is a tail expr, not `return`

All Windows backends now build + clippy clean (default and --features
nvenc); Linux unaffected (fmt/clippy/check green).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 07:43:40 +00:00
enricobuehler 2448a33698 style(host/windows): rustfmt the Windows backends
apple / swift (push) Successful in 55s
android / android (push) Failing after 1m53s
ci / web (push) Failing after 17s
ci / docs-site (push) Successful in 42s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 7s
ci / rust (push) Failing after 3m5s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 7s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 2s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 0s
flatpak / build-publish (push) Failing after 0s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Failing after 1m43s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m15s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 01:50:16 +00:00
enricobuehler 69ba6ec45d feat(host/windows): NVENC D3D11 hardware encoder (--features nvenc)
android / android (push) Failing after 36s
ci / rust (push) Failing after 45s
apple / swift (push) Successful in 55s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m35s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m32s
docker / deploy-docs (push) Successful in 17s
Zero-copy capture->encode on the GPU via the raw NVENC API (nvidia_video_codec_sdk sys + ENCODE_API; the safe wrapper is CUDA-only). Opens an NV_ENC_DEVICE_TYPE_DIRECTX session on the SAME ID3D11Device as the DXGI capturer (carried on the new FramePayload::D3d11), registers a pool of BGRA textures once, CopyResources each captured texture in and encode_picture; CBR/ULL, infinite GOP, P-only, forced-IDR for RFI. The DXGI capturer gains a D3D11 zero-copy output (selected, like the encoder, by PUNKTFUNK_ENCODER=nvenc) so capture+encode share textures.

OFF by default (the nvenc feature pulls the NVENC SDK + cudarc): the default Windows host links without it (openh264 path). cudarc builds toolkit-less via the SDK ci-check feature (dynamic-loading). At link time --features nvenc needs nvencodeapi.lib (NVENC SDK, or an import lib generated from the driver's nvEncodeAPI64.dll) on PUNKTFUNK_NVENC_LIB_DIR. Both default and --features nvenc builds validated to compile+link GPU-less on the VM (import lib generated from the driver DLL). Runtime needs a real NVIDIA GPU.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 01:39:46 +00:00
enricobuehler 9c2499fd45 feat(host/windows): DXGI Desktop Duplication capture backend
apple / swift (push) Successful in 53s
android / android (push) Failing after 2m25s
ci / web (push) Successful in 28s
ci / docs-site (push) Failing after 19s
ci / rust (push) Failing after 52s
decky / build-publish (push) Successful in 11s
ci / bench (push) Successful in 1m36s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m22s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m42s
docker / deploy-docs (push) Successful in 21s
Windows Capturer via DXGI Desktop Duplication: create a D3D11 device on the SudoVDA adapter (by LUID), find the matching output (by GDI name), DuplicateOutput, and per AcquireNextFrame copy the desktop into a CPU-readable staging texture -> tightly-packed BGRA (FramePayload::Cpu, feeds the openh264 software encoder GPU-lessly). Handles WAIT_TIMEOUT (reuse last frame) and ACCESS_LOST (re-duplicate). Adds FramePayload::D3d11(D3d11Frame) for the future NVENC zero-copy path, and a VirtualOutput.win_capture identity (adapter LUID + GDI name) carried out of the SudoVDA backend. Pure helpers (pack_luid/gdi_name_matches/depad_bgra) unit-tested on the VM; the live duplication path needs a real GPU + an activated SudoVDA monitor. Compiles clean on Windows + Linux.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-15 01:06:21 +00:00