docs(apollo): mark cursor #13 done, reclassify #21 as already-handled
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
apple / swift (push) Successful in 54s
ci / rust (push) Failing after 1m21s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
android / android (push) Failing after 5m44s
ci / bench (push) Failing after 3m26s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m5s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m3s
docker / deploy-docs (push) Successful in 21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m50s

#13 (two-pass alpha+XOR cursor) implemented in capture/dxgi.rs. #21
(composite moved cursor without a new desktop frame) is already handled:
DXGI returns S_OK for pointer-only updates so punktfunk recomposites in
present_acquired; the original premise (stutter via timeout) was incorrect.
Adds status banner + per-item resolution notes in Part 4 and Part 3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 09:49:20 +00:00
parent 6d7301ccf5
commit ba4e9a8672
+37 -8
@@ -1099,9 +1099,9 @@ punktfunk's cursor handling lives in `crates/punktfunk-host/src/capture/dxgi.rs`
 #### Transfer opportunities
-- **Split every cursor shape into an alpha image + an XOR image (two-pass composite)** (sev high, medium) — Refactor convert_pointer_shape in dxgi.rs to return two optional images (alpha, xor) mirroring Apollo's split. Store cursor_shape as Option<(alpha, xor)>, upload up to two SRVs in CursorCompositor, and in composite_cursor_gpu run the alpha pass with self.blend then the xor pass with self.blend_invert (skip empties). Drop the single cursor_invert flag.
+- **DONE (2026-06-16)** **Split every cursor shape into an alpha image + an XOR image (two-pass composite)** (sev high, medium) — Refactor convert_pointer_shape in dxgi.rs to return two optional images (alpha, xor) mirroring Apollo's split. Store cursor_shape as Option<(alpha, xor)>, upload up to two SRVs in CursorCompositor, and in composite_cursor_gpu run the alpha pass with self.blend then the xor pass with self.blend_invert (skip empties). Drop the single cursor_invert flag.
 - **Render the monochrome 'inverse of screen' pixels via the XOR pass instead of dropping them** (sev medium, small) — In convert_pointer_shape's monochrome branch (dxgi.rs:628-654), once the dual-pass split (above) exists, route code (1,1) to the XOR image as white and codes (0,0)/(0,1) to the alpha image as opaque black/white, matching Apollo's case mapping.
-- **Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame** (sev high, large) — Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
+- **ALREADY-HANDLED (2026-06-16; premise incorrect — DDA returns S_OK on pointer-only updates, punktfunk recomposites)** **Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame** (sev high, large) — Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
 - **Stop baking the cursor destructively into the repeated gpu_copy texture** (sev medium, medium) — Add a clean base texture: CopyResource(duplication -> clean_base), then CopyResource(clean_base -> gpu_copy) and composite onto gpu_copy. Repeat clean_base (cursor-free) plus a re-composite on repeats. Also create the cursor RTV once per gpu_copy and cache it rather than CreateRenderTargetView every composite (dxgi.rs:1181-1184).
 - **Handle rotated outputs in cursor positioning** (sev low, medium) — Read rotation from DXGI_OUTDUPL_DESC.Rotation when opening/rebuilding the duplication (around dxgi.rs:888 and 1298), store it on DuplCapturer, and apply Apollo's rotation transform when computing the NDC rect in CursorCompositor::draw and when sampling the cursor texture in the VS.
 - **Validate masked-color mask bytes and log illegal values** (sev low, small) — In the MASKED_COLOR branch of convert_pointer_shape (dxgi.rs:594-627), branch explicitly on mask==0x00 vs mask==0xFF and emit a tracing::warn! once for any other value, matching Apollo's guard, so future cursor-render bugs are observable.
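The monochrome case mapping described in the list above (codes (0,0)/(0,1) to the alpha image, code (1,1) to the XOR image) can be sketched roughly as follows. This is a minimal illustration and not the code in `dxgi.rs`: `CursorLayers` and `split_monochrome` are hypothetical names, the AND and XOR masks are assumed to be already split into two 1-bpp buffers with MSB-first bit order, and storing inverting pixels as opaque white in the XOR layer is an assumption.

```rust
/// Two BGRA layers derived from one monochrome pointer shape (hypothetical type).
/// A layer is `None` when it would be fully transparent, so its pass can be skipped.
#[derive(Debug, PartialEq)]
pub struct CursorLayers {
    pub alpha: Option<Vec<u8>>,
    pub xor: Option<Vec<u8>>,
}

const OPAQUE_BLACK: [u8; 4] = [0x00, 0x00, 0x00, 0xFF];
const OPAQUE_WHITE: [u8; 4] = [0xFF, 0xFF, 0xFF, 0xFF];

/// Reads bit `x` of a 1-bpp row, most significant bit first (assumed bit order).
fn bit(row: &[u8], x: usize) -> bool {
    row[x / 8] & (0x80u8 >> (x % 8)) != 0
}

/// Splits a monochrome shape into an alpha layer and an XOR layer.
/// `pitch` is the number of bytes per mask row.
pub fn split_monochrome(
    and_mask: &[u8],
    xor_mask: &[u8],
    width: usize,
    height: usize,
    pitch: usize,
) -> CursorLayers {
    let mut alpha = vec![0u8; width * height * 4];
    let mut xor = vec![0u8; width * height * 4];
    let (mut any_alpha, mut any_xor) = (false, false);

    for y in 0..height {
        let and_row = &and_mask[y * pitch..(y + 1) * pitch];
        let xor_row = &xor_mask[y * pitch..(y + 1) * pitch];
        for x in 0..width {
            let px = (y * width + x) * 4;
            match (bit(and_row, x), bit(xor_row, x)) {
                // (0,0): opaque black, drawn by the alpha pass.
                (false, false) => {
                    alpha[px..px + 4].copy_from_slice(&OPAQUE_BLACK);
                    any_alpha = true;
                }
                // (0,1): opaque white, drawn by the alpha pass.
                (false, true) => {
                    alpha[px..px + 4].copy_from_slice(&OPAQUE_WHITE);
                    any_alpha = true;
                }
                // (1,0): transparent, the screen shows through in both layers.
                (true, false) => {}
                // (1,1): inverse of screen, stored as white in the XOR layer.
                (true, true) => {
                    xor[px..px + 4].copy_from_slice(&OPAQUE_WHITE);
                    any_xor = true;
                }
            }
        }
    }

    CursorLayers {
        alpha: any_alpha.then_some(alpha),
        xor: any_xor.then_some(xor),
    }
}
```

Returning `None` for an empty layer mirrors the "skip empties" wording above: a shape with no inverting pixels would need only the alpha pass.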
@@ -1591,6 +1591,17 @@ punktfunk's **secure-desktop / desktop-switch capture recovery is genuinely matu
 96 candidates, Windows-host first, then severity, then effort. **✓V** = passed the independent
 adversarial-verify pass. *Area* is the investigation that surfaced it.
+> **Status updates**
+> - **2026-06-16 — #13 ✅ DONE.** Two-pass cursor compositing (alpha + XOR layers) implemented in
+> `capture/dxgi.rs` (`CursorShape`, `convert_pointer_shape` decomposition, `CursorCompositor::set_shapes`/
+> `draw_layer`, two-pass GPU + CPU composite). Independently reviewed (ship). Pending: Windows CI/dev-VM compile.
+> - **2026-06-16 — #21 ⊘ ALREADY-HANDLED (not a bug).** The premise is wrong for punktfunk: DXGI
+> `AcquireNextFrame` returns **S_OK for pointer-only updates** (`LastMouseUpdateTime != 0`,
+> `LastPresentTime == 0`), and `acquire()` always re-runs `present_acquired` on S_OK (`dxgi.rs:1407,1474`),
+> re-copying the desktop and recompositing the cursor at its new position. `last_present` is repeated
+> only on a genuine `WAIT_TIMEOUT` (nothing changed) or a rebuild gap — correct. No stutter from this
+> cause. The only real (perf-only) delta is the redundant full-surface copy per pointer update; deferred.
 | # | Improvement | Area | Win | Sev | Eff | ✓V |
 |---|---|---|---|---|---|---|
@@ -1606,7 +1617,7 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
 | 10 | Native system tray with state-driven icon + notifications | cmp:config-management | Y | high | medium | |
 | 11 | Treat S_OK-with-no-change frames as timeouts via DXGI update flags | win:capture-dxgi-dd | Y | high | medium | |
 | 12 | Detect size/format change in WGC and signal reinit | win:capture-wgc | Y | high | medium | |
-| 13 | Split every cursor shape into an alpha image + an XOR image (two-pass composite) | win:cursor-compositing | Y | high | medium | |
+| 13 | **DONE** Split every cursor shape into an alpha image + an XOR image (two-pass composite) | win:cursor-compositing | Y | high | medium | |
 | 14 | Map absolute mouse through the real virtual-desktop / output rect, not a blind 0..65535 normalize | win:input-sendinput-vigem | Y | high | medium | |
 | 15 | Detect watchdog ping failures and escalate (re-open the device) | win:virtual-display-sudovda | Y | high | medium | |
 | 16 | Add SET_RENDER_ADAPTER (IOCTL 0x802) to bind the IDD render GPU to the capture/encode GPU | win:virtual-display-sudovda | Y | high | medium | |
@@ -1614,7 +1625,7 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
 | 18 | Recover WASAPI loopback from default-device change and AUDCLNT_E_DEVICE_INVALIDATED | win:critic | Y | high | medium | |
 | 19 | Implement true reference-frame invalidation with a multi-ref DPB instead of always-full-IDR | cmp:video-encode | Y | high | large | |
 | 20 | In-binary Windows service install + interactive-session launch | cmp:config-management | Y | high | large | |
-| 21 | Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame | win:cursor-compositing | Y | high | large | |
+| 21 | **ALREADY-HANDLED** Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame | win:cursor-compositing | Y | high | large | |
 | 22 | Add real reference-frame invalidation (RFI) instead of always forcing IDR | win:nvenc-d3d11 | Y | high | large | |
 | 23 | Add a DS4 (DualShock4) ViGEm target on Windows with type auto-selection, motion, touchpad, battery and timestamp pump | win:input-sendinput-vigem | Y | high | large | |
 | 24 | Replace the PsExec scheduled-task launch with a real Windows service that relaunches the host on session change | win:system-secure-desktop | Y | high | large | |
@@ -1781,7 +1792,14 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
 - **Proposal:** In WgcCapturer::process_frame, call src.GetDesc() and compare Width/Height/Format against self.width/height and the expected format. On mismatch, return a Reinit error (add a capture_e::Reinit-equivalent to the Capturer contract or bail with a recognizable error the m3/stream loop maps to a capturer rebuild). Drop and re-create fp16_src/hdr10_out/bgra_copy when size changes.
 #### 13. Split every cursor shape into an alpha image + an XOR image (two-pass composite)
-*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* medium
+*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* medium · **✅ DONE (2026-06-16)**
+> **Resolution:** Implemented in `capture/dxgi.rs`. `convert_pointer_shape` now returns a `CursorShape`
+> with optional `alpha`/`xor` layers; `CursorCompositor` holds `tex_alpha`/`tex_xor` and `draw_layer`
+> renders each with its own blend (alpha = src-over + HDR scale; XOR = inversion, unscaled). MASKED_COLOR
+> opaque pixels now go through the alpha pass (not the invert blend), and MONOCHROME `(1,1)` invert pixels
+> now feed the XOR layer (previously approximated as solid black). CPU path blends both layers too.
+> The `cursor_invert` flag was removed. Independently reviewed (ship); pending Windows CI/dev-VM compile.
 - **Apollo does:** Apollo emits two BGRA images per shape — make_cursor_alpha_image (display_vram.cpp:279) and make_cursor_xor_image (display_vram.cpp:210) — and runs both an alpha-blend pass and an invert-blend pass in blend_cursor (display_vram.cpp:1448-1469), each skipped if its image is empty. MASKED_COLOR and MONOCHROME shapes legitimately need both.
 - **punktfunk gap:** convert_pointer_shape (dxgi.rs:566) produces ONE image and cursor_invert (dxgi.rs:1133-1134) picks ONE blend for the whole shape, so a cursor mixing opaque and screen-inverting pixels (common I-beams and themed arrows) renders wrong; masked-color opaque pixels are even forced through the invert blend (dxgi.rs:612-624 + 1205).
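The two-pass idea from the Resolution for #13 (an alpha pass, then an invert pass, each skipped when its layer is absent) can be illustrated on the CPU with a small sketch. It is written under stated assumptions and is not punktfunk's composite code: `composite_two_pass`, `blend_alpha`, and `blend_xor` are hypothetical names, the cursor layers are assumed to be the same size as the target region, the inversion is modeled as a bitwise XOR of the color channels, and the HDR scaling mentioned in the Resolution is left out.

```rust
/// Pass 1 helper: source-over blend of one straight-alpha BGRA cursor pixel
/// onto one BGRA destination pixel. The destination alpha byte is left alone.
fn blend_alpha(dst: &mut [u8], src: &[u8]) {
    let a = src[3] as u32;
    for c in 0..3 {
        let mixed = src[c] as u32 * a + dst[c] as u32 * (255 - a) + 127;
        dst[c] = (mixed / 255) as u8;
    }
}

/// Pass 2 helper: inversion modeled as bitwise XOR of the color channels
/// (an assumption; a GPU blend state can only approximate this).
/// An all-zero source pixel leaves the destination unchanged.
fn blend_xor(dst: &mut [u8], src: &[u8]) {
    for c in 0..3 {
        dst[c] ^= src[c];
    }
}

/// Composites up to two cursor layers onto `frame` (tightly packed BGRA).
/// A missing layer means its pass is skipped entirely.
pub fn composite_two_pass(frame: &mut [u8], alpha: Option<&[u8]>, xor: Option<&[u8]>) {
    if let Some(layer) = alpha {
        for (d, s) in frame.chunks_exact_mut(4).zip(layer.chunks_exact(4)) {
            blend_alpha(d, s);
        }
    }
    if let Some(layer) = xor {
        for (d, s) in frame.chunks_exact_mut(4).zip(layer.chunks_exact(4)) {
            blend_xor(d, s);
        }
    }
}
```

Because each pixel of a shape lands in at most one layer, the order of the two passes does not change the result for pixels that are opaque in one layer and empty in the other.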
@@ -1837,11 +1855,22 @@ adversarial-verify pass. *Area* is the investigation that surfaced it.
 - **Proposal:** Add `punktfunk-host install`/`uninstall`/`service` subcommands (Windows-gated) that register a service or an Interactive/Highest scheduled task to launch the host in Session 1 (the documented requirement for DXGI duplication + SendInput), and the self-elevate-if-not-running shortcut path. Reuse the existing capture/wgc_relay CreateProcessAsUserW machinery already in the crate. This codifies the script chain into the binary without touching the per-frame path or core.
 #### 21. Composite the moved cursor onto a clean copy even when DDA returns no new desktop frame
-*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* large
+*Area:* `win:cursor-compositing` · *Windows-host:* yes · *Severity:* high · *Effort:* large · **⊘ ALREADY-HANDLED (2026-06-16)**
+> **Resolution — not a bug for punktfunk.** The gap below assumes a cursor moving over a static screen
+> produces `AcquireNextFrame` **timeouts**. It does not: DXGI returns **S_OK for pointer-only updates**
+> (`FrameInfo.LastMouseUpdateTime != 0`, `LastPresentTime == 0`), with the resource holding the
+> (unchanged) desktop. `acquire()` always re-runs `present_acquired` on S_OK (`dxgi.rs:1407,1474`), which
+> re-copies the desktop and recomposites the cursor at its new position. `last_present` is repeated only
+> on a genuine `WAIT_TIMEOUT` (nothing changed) or a mid-rebuild gap — correct. The agent that raised this
+> didn't account for DDA's pointer-update S_OK semantics, and the run was killed before the verify phase
+> reached it. The only real delta from Apollo is a **perf** micro-opt (Apollo retains a clean copy and
+> re-blends just the cursor rect, avoiding a full ~29 MB `CopyResource` per pointer update) — deferred as
+> optional, pending evidence of GPU-copy pressure.
 - **Apollo does:** Apollo treats a mouse-only update as a real update (display_vram.cpp:1162-1168) and keeps an intermediate D3D surface of the last desktop frame so it can copy surface->fresh image and re-blend the cursor at its new position with no new DDA frame (last_frame_variant state machine, display_vram.cpp:1239-1306).
-- **punktfunk gap:** punktfunk only composites on a fresh AcquireNextFrame (dxgi.rs:1477); on timeout it repeats last_present (dxgi.rs:1547-1561) which has the OLD cursor position baked in, so a cursor moving over a static screen stutters/lags.
+- **punktfunk gap (as originally filed — see Resolution above; premise incorrect):** punktfunk only composites on a fresh AcquireNextFrame (dxgi.rs:1477); on timeout it repeats last_present (dxgi.rs:1547-1561) which has the OLD cursor position baked in, so a cursor moving over a static screen stutters/lags.
-- **Proposal:** Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
+- **Proposal (superseded; only the perf variant remains):** Keep a clean intermediate copy of the last desktop frame (an extra DEFAULT texture). In acquire (dxgi.rs:1341), when AcquireNextFrame times out but update_cursor saw a position change (LastMouseUpdateTime changed) and the cursor is visible, copy the clean intermediate into gpu_copy and re-run composite_cursor_gpu, then return that as a fresh frame instead of repeating last_present.
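The distinction the Resolution for #21 rests on (S_OK with only a pointer timestamp set versus a genuine timeout) can be written down as a small classifier. This is a sketch under assumptions, not the logic in `acquire()`: `FrameTimes`, `AcquireOutcome`, `classify`, and `repeats_last_present` are hypothetical names, the two fields stand in for `LastPresentTime` and `LastMouseUpdateTime` of `DXGI_OUTDUPL_FRAME_INFO`, and `None` stands in for a timed-out `AcquireNextFrame` call.

```rust
/// The two timestamps of interest, copied out of the frame info (hypothetical type).
#[derive(Clone, Copy, Debug)]
pub struct FrameTimes {
    pub last_present_time: i64,
    pub last_mouse_update_time: i64,
}

/// What one acquire attempt produced.
#[derive(Debug, PartialEq, Eq)]
pub enum AcquireOutcome {
    /// S_OK and the desktop image changed.
    DesktopUpdate,
    /// S_OK, desktop unchanged, pointer position or shape changed.
    PointerOnly,
    /// S_OK but neither timestamp is set.
    Unchanged,
    /// The wait timed out; nothing was acquired.
    Timeout,
}

/// Classifies an acquire attempt. `None` models the timeout case.
pub fn classify(info: Option<FrameTimes>) -> AcquireOutcome {
    match info {
        None => AcquireOutcome::Timeout,
        Some(t) if t.last_present_time != 0 => AcquireOutcome::DesktopUpdate,
        Some(t) if t.last_mouse_update_time != 0 => AcquireOutcome::PointerOnly,
        Some(_) => AcquireOutcome::Unchanged,
    }
}

/// Mirrors the behavior described in the Resolution: every S_OK outcome is
/// re-copied and recomposited, and only a timeout repeats the previous frame.
pub fn repeats_last_present(outcome: &AcquireOutcome) -> bool {
    matches!(outcome, AcquireOutcome::Timeout)
}
```

Under this model a cursor moving over a static screen yields `PointerOnly`, which takes the recomposite path, so the old cursor position is not repeated.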
 #### 22. Add real reference-frame invalidation (RFI) instead of always forcing IDR
 *Area:* `win:nvenc-d3d11` · *Windows-host:* yes · *Severity:* high · *Effort:* large