a0427cd2a3
The headline §2.3 seam tightening (the explicit Stage-3 deferral; §5's "highest-severity
coupling"): the capturer is now TOLD its output format instead of re-deriving the encode backend.
New `capture::OutputFormat { gpu, hdr }`, resolved once per session and passed INTO
capture_virtual_output:
* native punktfunk/1 path: `SessionPlan::output_format()` (gpu = encoder.is_gpu(), from the
already-resolved plan.encoder — no second probe; hdr = plan.hdr).
* GameStream + spike paths: `OutputFormat::resolve(hdr)` (gpu from the single `gpu_encode()`
source, which maps windows_resolved_backend()).
`capture/dxgi.rs DuplCapturer::open` takes `gpu` in and its internal
`!matches!(windows_resolved_backend(), Software)` recompute is DELETED — the capture layer no
longer re-calls the encode layer (the back-reference that could let capture and encode disagree
on whether frames are GPU-resident, plan §2.3/§5). The relay's secure-desktop DDA passes
`gpu_encode()` likewise.
Behavior-preserving: the `gpu` passed in equals the value the capturer used to compute (same
encode-backend resolution). The DDA opens keep `want_hdr=false` (the SDR fallback, unchanged).
Tightenings 2 (HDR/release -> VirtualLease) and 3 (EncoderCaps) split off: (2) needs the
monitor-generation carried on the lease + the keepalive becoming Box<dyn VirtualLease> — that's
the §2.5 ownership-model change (CURRENT_MON_GEN / sudovda::wait_for_monitor_released), so it
moves there; (3) is a small additive follow-on. Documented in the plan.
Verified: Linux cargo check + clippy (-D warnings) + fmt clean. Box build to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
125 lines
10 KiB
Markdown
125 lines
10 KiB
Markdown
# Goal-1 (clean, layered host architecture) — staged execution plan
|
||
|
||
The design is in [`windows-host-rewrite.md`](windows-host-rewrite.md) §2.2–2.4. This file is the **ordered,
|
||
independently-shippable execution plan**, because the host is **live-validated** (GameStream + punktfunk/1,
|
||
NVENC + IDD-push on-glass) and Goal-1 rewires its session/config/dispatch flow — so every stage must
|
||
**preserve behavior**, compile + box-verify on its own, and be committed before the next starts. The plan's
|
||
own §14 makes the §1 preservation checklist a mandatory per-module assert contract; honour it.
|
||
|
||
## Why staged (not one big rewrite)
|
||
|
||
`main` is at parity and shipping. A monolithic rewrite would put the validated host in a broken
|
||
intermediate state for a long window and make a regression impossible to bisect. Each stage below is a
|
||
behaviour-preserving transform with its own verification, so a regression is caught at the stage that
|
||
introduced it.
|
||
|
||
## Stages (ordered; each = goal · files · risk · verify)
|
||
|
||
**Stage 1 — `HostConfig` foundation. ✅ DONE (this commit).**
|
||
`config.rs`: typed `HostConfig` parsed ONCE from env (`idd_push`/`encoder_pref`/`no_helper`/`force_helper`).
|
||
Migrated the two highest-churn dispatch reads onto it (`encode::windows_resolved_backend`,
|
||
`punktfunk1::should_use_helper`). Risk: low (env constant at runtime → identical behaviour). Verify: box
|
||
`cargo check --features nvenc`.
|
||
|
||
**Stage 2 — finish `HostConfig` + resolve-once. ✅ DONE (this commit).**
|
||
Migrated **31** genuinely-constant operator/dispatch sites onto `HostConfig`: `idd_push` ×7 (the
|
||
capture/topology disagreement knob), `no_wgc`, `capture_backend`, `render_adapter`, `encoder_pref` (Linux),
|
||
the Windows vdisplay-backend select, plus the plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit` and the
|
||
multi-site `perf` ×4 / `compositor` ×5 / `video_source` ×3 / `gamepad`. Each `HostConfig` field's parser is
|
||
**byte-identical** to the read it replaced, so `old == new` by construction (the §1 "flipped bool" guard).
|
||
|
||
**Scope correction (the plan's "~64 sites / Linux XDG+compositor / grep→0" was unsafe as written):** two
|
||
classes of `env::var` read are deliberately **kept live** and documented in `config.rs`:
|
||
- **Runtime-mutated session vars.** On Linux, `vdisplay::apply_session_env` *rewrites the process env on
|
||
every connect* (the Bazzite Gaming↔Desktop follow): `WAYLAND_DISPLAY`, `XDG_CURRENT_DESKTOP`,
|
||
`XDG_RUNTIME_DIR`, `DBUS_SESSION_BUS_ADDRESS`, and the derived `PUNKTFUNK_INPUT_BACKEND`,
|
||
`PUNKTFUNK_GAMESCOPE_SESSION/NODE`, `PUNKTFUNK_KWIN/MUTTER_VIRTUAL_PRIMARY`, `PUNKTFUNK_FORCE_SHM`.
|
||
Parse-once would freeze them at startup → silent session-following regression. They are NOT constant.
|
||
- **Single-use local tuning** (no resolve-once benefit, call-site-local default/clamp, and `FEC_PCT` even has
|
||
*two different* semantics): `FEC_PCT`, `VIDEO_DROP`, `VBV_FRAMES`, `SPLIT_ENCODE`, `PACE_BURST_KB`, the
|
||
`capture/dxgi.rs` timing knobs, the `*_LIVE`/test gates, plus path/dynamic reads (config-dir, `PATH`
|
||
search, env-forward-to-child). `PUNKTFUNK_ZEROCOPY` is split on purpose: Windows presence-semantics moved
|
||
to the field; Linux keeps its own truthy parser.
|
||
|
||
Risk: medium (semantics-preservation). Verify: Linux `cargo check`/`clippy`/`fmt` green (the Windows-only
|
||
edits are 1:1 substitutions, compile-verified on the box as part of Stage 3's build).
|
||
|
||
**Stage 3 — `SessionPlan` (the single biggest clarity lever, plan §2.4). ✅ DONE (box-build + on-glass validated).**
|
||
New `src/session_plan.rs`: a `Copy` `SessionPlan { capture, topology, encoder, bit_depth, hdr }` resolved
|
||
**once** from `HostConfig` (+ the negotiated `bit_depth`) in `virtual_stream`, logged, and threaded through
|
||
`build_pipeline_with_retry`/`build_pipeline`. The three dispatch points now read it:
|
||
- **capture** — `capture::capture_virtual_output` takes a `CaptureBackend` IN (was re-deriving from
|
||
`config().idd_push`/`capture_backend`/`no_wgc`); `CaptureBackend::resolve()` is the one resolver (also
|
||
used by the GameStream + spike call sites).
|
||
- **topology** — `virtual_stream` reads `plan.topology` (`should_use_helper` deleted; its logic is
|
||
`session_plan::resolve_topology`, verbatim). The IDD-preempt guard reads `plan.capture` too.
|
||
- **encoder** — recorded as `EncoderBackend` from `encode::windows_resolved_backend` (config-backed +
|
||
GPU-vendor cached since stage 2, already a single source). Threading `encoder`/`input_format` into the
|
||
encoder + capturer opens (which removes the `dxgi.rs` back-reference) is **stage 5**.
|
||
|
||
Every decision is provably equivalent to the pre-stage-3 scattered reads (same `config()` + cached probes),
|
||
so it is behavior-preserving. Risk: medium-high (rewires the deployed decision). Verify:
|
||
- **Box build ✅** — `cargo check -p punktfunk-host --features nvenc` (the deployed config: NVENC SDK +
|
||
`cudarc` + `encode/nvenc.rs`) is **clean, zero warnings**, on the RTX box (`192.168.1.173`), in an
|
||
isolated worktree. This also covers stage 2's Windows-only edits (their first real Windows compile).
|
||
- **On-glass ✅** — deployed my Stage-3 host into the SCM service (Session-1 launch, the real IDD-push
|
||
environment) on the RTX box and drove a `punktfunk-probe` loopback session. The host logged
|
||
`resolved session plan { capture: IddPush, topology: SingleProcess, encoder: Nvenc, bit_depth: 8,
|
||
hdr: false }` — the **correct** resolution for the deployed config (IDD_PUSH + VDISPLAY=pf + nvenc) —
|
||
and routed correctly (IDD-push capturer → shared ring → IDD→DDA fallback). This box has a pre-existing
|
||
**hybrid-GPU IDD render-adapter mismatch** (driver renders on the iGPU `af4825`, host ring on the 4090
|
||
`294d29`) that yielded no published frame in this loopback scenario; an **A/B against the shipping
|
||
binary reproduced the identical `frames=0`**, proving the no-frame is environmental, **not** a Stage-3
|
||
regression. Stage 3 is behavior-equivalent to the shipping host. Box restored to its deployed state.
|
||
|
||
**Stage 4 — `SessionContext` (the arg-bundling). ✅ DONE (box-build validated). `SessionFactory`/`Session::drop` deferred to §2.5 — see below.**
|
||
Bundled the 13-positional-argument `#[allow(too_many_arguments)]` session entry (`virtual_stream` **and**
|
||
`virtual_stream_relay`) into one owned `SessionContext` struct, moved into the stream thread. The receivers
|
||
move in (`virtual_stream` is their only consumer), retiring the `&Receiver` borrow plumbing. **Behavior-
|
||
identical by construction**: each function destructures the context into the same local names at the top, so
|
||
the ~400-line loop bodies are byte-for-byte unchanged. Removed both `#[allow(too_many_arguments)]` attrs.
|
||
|
||
**Scoped deliberately.** The plan's `SessionFactory.build()` owning a `vdm.lease(mode) → open_capturer →
|
||
open_encoder → spawn` RAII chain with `Session::drop` as the *only* teardown is **coupled to §2.5's
|
||
ownership-model rewrite** — it needs a host-side `VirtualDisplayManager`/`MonitorLease` that does not exist
|
||
yet (the lifecycle still lives in the `CURRENT_MON_GEN`/`IDD_SETUP_LOCK` globals + the per-compositor
|
||
`vdisplay` backends). The current teardown is **already drop-based** (the capturer owns the keepalive whose
|
||
`Drop` releases the monitor — "restore displays before REMOVE" lives there; only `send_thread.join()` is
|
||
explicit), and it is the validated shipping path. Wrapping the deployed reconfig/switch/rebuild loop in a
|
||
`Session::drop` for a behavior-preserving change would add real regression risk for marginal gain. So the
|
||
`SessionFactory`/`Session::drop`/`vdm.lease` work is folded into §2.5 (its natural home); this stage delivers
|
||
the concrete, safe arg-bundling. Risk: low (behavior-identical). Verify: Linux + box build (the relay
|
||
destructure is the only Windows-only piece); the teardown on-glass gate moves to the §2.5 work.
|
||
|
||
**Stage 5 — seam-trait tightenings (plan §2.3). 🟡 Tightening 1 ✅ DONE (box-build validated); 2→§2.5, 3 follow-on.**
|
||
The three §2.3 tightenings have different coupling, so they split:
|
||
- **(1) `OutputFormat` into the capturer ✅** — the headline (the explicit Stage-3 deferral; §5's
|
||
"highest-severity coupling"). New `capture::OutputFormat { gpu, hdr }`, resolved once per session and
|
||
passed **into** `capture_virtual_output` (`SessionPlan::output_format()` for the native path —
|
||
`gpu = encoder.is_gpu()`, no second probe; `OutputFormat::resolve()` for the GameStream/spike paths).
|
||
`dxgi::DuplCapturer::open` takes `gpu` in and **its `windows_resolved_backend()` recompute is deleted** —
|
||
capture no longer re-derives the encode backend. Behavior-preserving (the `gpu` passed in equals the value
|
||
the capturer used to compute). Linux + box-build clean.
|
||
- **(2) HDR/release → `VirtualLease`** — **moved to §2.5.** `await_released` as a lease method needs the
|
||
monitor-generation carried *on the lease* (today it's the `CURRENT_MON_GEN` global + the
|
||
`sudovda::wait_for_monitor_released` free fn), and the keepalive becoming `Box<dyn VirtualLease>` is the
|
||
ownership-model change. It belongs with the `VirtualDisplayManager`/`MonitorLease` work, not bolted on here.
|
||
- **(3) `EncoderCaps`** — small additive follow-on (query optional encoder capabilities instead of default
|
||
no-ops); not blocking. Tracked for the next seam pass.
|
||
|
||
Risk: medium (Tightening 1 is behavior-preserving + Windows-only → box-compile is the gate; on-glass parity is
|
||
the same env-limited story as Stage 3).
|
||
|
||
**Stage 6 — `src/windows/` tree (cfg-sprawl confinement, plan §2.2).**
|
||
Move the Windows backends under `src/windows/` + `capture/windows/`, `encode/windows/`, `inject/windows/`,
|
||
`audio/windows/`, `vdisplay/windows.rs` behind one `#[cfg(windows)] mod windows;` seam. Pure file move +
|
||
mod/use-path updates — behaviour-identical. Risk: low-but-huge (dozens of files; compile-verify catches all).
|
||
Do LAST so the earlier semantic stages don't fight path churn. Verify: Linux + box build.
|
||
|
||
## Guardrails (mandatory, plan §14)
|
||
|
||
- Each stage is its own commit; box-verify before moving on.
|
||
- Stages 3–5 touch the deployed path → **on-glass re-test** (NVENC + IDD-push, a mode switch, a
|
||
connect/disconnect cycle) before the next stage.
|
||
- Preserve every `PUNKTFUNK_*` var's exact semantics; when in doubt, assert old==new at the call site.
|