Files
punktfunk/docs/windows-host-goal1-plan.md
T
enricobuehler a0427cd2a3 feat(windows-host): OutputFormat into the capturer — kill the dxgi back-reference (Goal-1 stage 5, tightening 1)
The headline §2.3 seam tightening (the explicit Stage-3 deferral; §5's "highest-severity
coupling"): the capturer is now TOLD its output format instead of re-deriving the encode backend.

New `capture::OutputFormat { gpu, hdr }`, resolved once per session and passed INTO
capture_virtual_output:
  * native punktfunk/1 path: `SessionPlan::output_format()` (gpu = encoder.is_gpu(), from the
    already-resolved plan.encoder — no second probe; hdr = plan.hdr).
  * GameStream + spike paths: `OutputFormat::resolve(hdr)` (gpu from the single `gpu_encode()`
    source, which maps windows_resolved_backend()).

`capture/dxgi.rs DuplCapturer::open` takes `gpu` in and its internal
`!matches!(windows_resolved_backend(), Software)` recompute is DELETED — the capture layer no
longer re-calls the encode layer (the back-reference that could let capture and encode disagree
on whether frames are GPU-resident, plan §2.3/§5). The relay's secure-desktop DDA passes
`gpu_encode()` likewise.

Behavior-preserving: the `gpu` passed in equals the value the capturer used to compute (same
encode-backend resolution). The DDA opens keep `want_hdr=false` (the SDR fallback, unchanged).

Tightenings 2 (HDR/release -> VirtualLease) and 3 (EncoderCaps) split off: (2) needs the
monitor-generation carried on the lease + the keepalive becoming Box<dyn VirtualLease> — that's
the §2.5 ownership-model change (CURRENT_MON_GEN / sudovda::wait_for_monitor_released), so it
moves there; (3) is a small additive follow-on. Documented in the plan.

Verified: Linux cargo check + clippy (-D warnings) + fmt clean. Box build to follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:37:48 +00:00

125 lines
10 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Goal-1 (clean, layered host architecture) — staged execution plan
The design is in [`windows-host-rewrite.md`](windows-host-rewrite.md) §2.22.4. This file is the **ordered,
independently-shippable execution plan**, because the host is **live-validated** (GameStream + punktfunk/1,
NVENC + IDD-push on-glass) and Goal-1 rewires its session/config/dispatch flow — so every stage must
**preserve behavior**, compile + box-verify on its own, and be committed before the next starts. The plan's
own §14 makes the §1 preservation checklist a mandatory per-module assert contract; honour it.
## Why staged (not one big rewrite)
`main` is at parity and shipping. A monolithic rewrite would put the validated host in a broken
intermediate state for a long window and make a regression impossible to bisect. Each stage below is a
behaviour-preserving transform with its own verification, so a regression is caught at the stage that
introduced it.
## Stages (ordered; each = goal · files · risk · verify)
**Stage 1 — `HostConfig` foundation. ✅ DONE (this commit).**
`config.rs`: typed `HostConfig` parsed ONCE from env (`idd_push`/`encoder_pref`/`no_helper`/`force_helper`).
Migrated the two highest-churn dispatch reads onto it (`encode::windows_resolved_backend`,
`punktfunk1::should_use_helper`). Risk: low (env constant at runtime → identical behaviour). Verify: box
`cargo check --features nvenc`.
**Stage 2 — finish `HostConfig` + resolve-once. ✅ DONE (this commit).**
Migrated **31** genuinely-constant operator/dispatch sites onto `HostConfig`: `idd_push` ×7 (the
capture/topology disagreement knob), `no_wgc`, `capture_backend`, `render_adapter`, `encoder_pref` (Linux),
the Windows vdisplay-backend select, plus the plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit` and the
multi-site `perf` ×4 / `compositor` ×5 / `video_source` ×3 / `gamepad`. Each `HostConfig` field's parser is
**byte-identical** to the read it replaced, so `old == new` by construction (the §1 "flipped bool" guard).
**Scope correction (the plan's "~64 sites / Linux XDG+compositor / grep→0" was unsafe as written):** two
classes of `env::var` read are deliberately **kept live** and documented in `config.rs`:
- **Runtime-mutated session vars.** On Linux, `vdisplay::apply_session_env` *rewrites the process env on
every connect* (the Bazzite Gaming↔Desktop follow): `WAYLAND_DISPLAY`, `XDG_CURRENT_DESKTOP`,
`XDG_RUNTIME_DIR`, `DBUS_SESSION_BUS_ADDRESS`, and the derived `PUNKTFUNK_INPUT_BACKEND`,
`PUNKTFUNK_GAMESCOPE_SESSION/NODE`, `PUNKTFUNK_KWIN/MUTTER_VIRTUAL_PRIMARY`, `PUNKTFUNK_FORCE_SHM`.
Parse-once would freeze them at startup → silent session-following regression. They are NOT constant.
- **Single-use local tuning** (no resolve-once benefit, call-site-local default/clamp, and `FEC_PCT` even has
*two different* semantics): `FEC_PCT`, `VIDEO_DROP`, `VBV_FRAMES`, `SPLIT_ENCODE`, `PACE_BURST_KB`, the
`capture/dxgi.rs` timing knobs, the `*_LIVE`/test gates, plus path/dynamic reads (config-dir, `PATH`
search, env-forward-to-child). `PUNKTFUNK_ZEROCOPY` is split on purpose: Windows presence-semantics moved
to the field; Linux keeps its own truthy parser.
Risk: medium (semantics-preservation). Verify: Linux `cargo check`/`clippy`/`fmt` green (the Windows-only
edits are 1:1 substitutions, compile-verified on the box as part of Stage 3's build).
**Stage 3 — `SessionPlan` (the single biggest clarity lever, plan §2.4). ✅ DONE (box-build + on-glass validated).**
New `src/session_plan.rs`: a `Copy` `SessionPlan { capture, topology, encoder, bit_depth, hdr }` resolved
**once** from `HostConfig` (+ the negotiated `bit_depth`) in `virtual_stream`, logged, and threaded through
`build_pipeline_with_retry`/`build_pipeline`. The three dispatch points now read it:
- **capture** — `capture::capture_virtual_output` takes a `CaptureBackend` IN (was re-deriving from
`config().idd_push`/`capture_backend`/`no_wgc`); `CaptureBackend::resolve()` is the one resolver (also
used by the GameStream + spike call sites).
- **topology** — `virtual_stream` reads `plan.topology` (`should_use_helper` deleted; its logic is
`session_plan::resolve_topology`, verbatim). The IDD-preempt guard reads `plan.capture` too.
- **encoder** — recorded as `EncoderBackend` from `encode::windows_resolved_backend` (config-backed +
GPU-vendor cached since stage 2, already a single source). Threading `encoder`/`input_format` into the
encoder + capturer opens (which removes the `dxgi.rs` back-reference) is **stage 5**.
Every decision is provably equivalent to the pre-stage-3 scattered reads (same `config()` + cached probes),
so it is behavior-preserving. Risk: medium-high (rewires the deployed decision). Verify:
- **Box build ✅** — `cargo check -p punktfunk-host --features nvenc` (the deployed config: NVENC SDK +
`cudarc` + `encode/nvenc.rs`) is **clean, zero warnings**, on the RTX box (`192.168.1.173`), in an
isolated worktree. This also covers stage 2's Windows-only edits (their first real Windows compile).
- **On-glass ✅** — deployed my Stage-3 host into the SCM service (Session-1 launch, the real IDD-push
environment) on the RTX box and drove a `punktfunk-probe` loopback session. The host logged
`resolved session plan { capture: IddPush, topology: SingleProcess, encoder: Nvenc, bit_depth: 8,
hdr: false }` — the **correct** resolution for the deployed config (IDD_PUSH + VDISPLAY=pf + nvenc) —
and routed correctly (IDD-push capturer → shared ring → IDD→DDA fallback). This box has a pre-existing
**hybrid-GPU IDD render-adapter mismatch** (driver renders on the iGPU `af4825`, host ring on the 4090
`294d29`) that yielded no published frame in this loopback scenario; an **A/B against the shipping
binary reproduced the identical `frames=0`**, proving the no-frame is environmental, **not** a Stage-3
regression. Stage 3 is behavior-equivalent to the shipping host. Box restored to its deployed state.
**Stage 4 — `SessionContext` (the arg-bundling). ✅ DONE (box-build validated). `SessionFactory`/`Session::drop` deferred to §2.5 — see below.**
Bundled the 13-positional-argument `#[allow(too_many_arguments)]` session entry (`virtual_stream` **and**
`virtual_stream_relay`) into one owned `SessionContext` struct, moved into the stream thread. The receivers
move in (`virtual_stream` is their only consumer), retiring the `&Receiver` borrow plumbing. **Behavior-
identical by construction**: each function destructures the context into the same local names at the top, so
the ~400-line loop bodies are byte-for-byte unchanged. Removed both `#[allow(too_many_arguments)]` attrs.
**Scoped deliberately.** The plan's `SessionFactory.build()` owning a `vdm.lease(mode) → open_capturer →
open_encoder → spawn` RAII chain with `Session::drop` as the *only* teardown is **coupled to §2.5's
ownership-model rewrite** — it needs a host-side `VirtualDisplayManager`/`MonitorLease` that does not exist
yet (the lifecycle still lives in the `CURRENT_MON_GEN`/`IDD_SETUP_LOCK` globals + the per-compositor
`vdisplay` backends). The current teardown is **already drop-based** (the capturer owns the keepalive whose
`Drop` releases the monitor — "restore displays before REMOVE" lives there; only `send_thread.join()` is
explicit), and it is the validated shipping path. Wrapping the deployed reconfig/switch/rebuild loop in a
`Session::drop` for a behavior-preserving change would add real regression risk for marginal gain. So the
`SessionFactory`/`Session::drop`/`vdm.lease` work is folded into §2.5 (its natural home); this stage delivers
the concrete, safe arg-bundling. Risk: low (behavior-identical). Verify: Linux + box build (the relay
destructure is the only Windows-only piece); the teardown on-glass gate moves to the §2.5 work.
**Stage 5 — seam-trait tightenings (plan §2.3). 🟡 Tightening 1 ✅ DONE (box-build validated); 2→§2.5, 3 follow-on.**
The three §2.3 tightenings have different coupling, so they split:
- **(1) `OutputFormat` into the capturer ✅** — the headline (the explicit Stage-3 deferral; §5's
"highest-severity coupling"). New `capture::OutputFormat { gpu, hdr }`, resolved once per session and
passed **into** `capture_virtual_output` (`SessionPlan::output_format()` for the native path —
`gpu = encoder.is_gpu()`, no second probe; `OutputFormat::resolve()` for the GameStream/spike paths).
`dxgi::DuplCapturer::open` takes `gpu` in and **its `windows_resolved_backend()` recompute is deleted**
capture no longer re-derives the encode backend. Behavior-preserving (the `gpu` passed in equals the value
the capturer used to compute). Linux + box-build clean.
- **(2) HDR/release → `VirtualLease`** — **moved to §2.5.** `await_released` as a lease method needs the
monitor-generation carried *on the lease* (today it's the `CURRENT_MON_GEN` global + the
`sudovda::wait_for_monitor_released` free fn), and the keepalive becoming `Box<dyn VirtualLease>` is the
ownership-model change. It belongs with the `VirtualDisplayManager`/`MonitorLease` work, not bolted on here.
- **(3) `EncoderCaps`** — small additive follow-on (query optional encoder capabilities instead of default
no-ops); not blocking. Tracked for the next seam pass.
Risk: medium (Tightening 1 is behavior-preserving + Windows-only → box-compile is the gate; on-glass parity is
the same env-limited story as Stage 3).
**Stage 6 — `src/windows/` tree (cfg-sprawl confinement, plan §2.2).**
Move the Windows backends under `src/windows/` + `capture/windows/`, `encode/windows/`, `inject/windows/`,
`audio/windows/`, `vdisplay/windows.rs` behind one `#[cfg(windows)] mod windows;` seam. Pure file move +
mod/use-path updates — behaviour-identical. Risk: low-but-huge (dozens of files; compile-verify catches all).
Do LAST so the earlier semantic stages don't fight path churn. Verify: Linux + box build.
## Guardrails (mandatory, plan §14)
- Each stage is its own commit; box-verify before moving on.
- Stages 35 touch the deployed path → **on-glass re-test** (NVENC + IDD-push, a mode switch, a
connect/disconnect cycle) before the next stage.
- Preserve every `PUNKTFUNK_*` var's exact semantics; when in doubt, assert old==new at the call site.