38c68c33e5
Move 36 platform-specific files into per-module `windows/` and `linux/` subfolders (and the
shared HID codecs into `inject/proto/`):
capture/{windows,linux}/ encode/{windows,linux}/ inject/{windows,linux,proto}/
audio/{windows,linux}/ vdisplay/{windows,linux}/
src/windows/ (service, wgc_helper, win_adapter, win_display)
src/linux/ (dmabuf_fence, drm_sync, zerocopy/)
Done with `#[path]`, NOT a module rename: every file moves into its folder while the
`crate::*::*` module names stay FLAT, so all caller paths and every internal `super::`/`crate::`
reference are unchanged — only the parent `mod` decls gained `#[path = "..."]`. This is the
codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in
behaviour with ZERO reference churn, far lower risk than collapsing to a single
`crate::capture::windows::` namespace (that deeper rename is an optional follow-on; this delivers
the cfg-sprawl folder confinement the stage is about). Done LAST, after the semantic stages, so
the path churn didn't fight them.
Verified: Linux cargo check + clippy (-D warnings) clean; my mod-decl changes fmt-clean (the 3
remaining fmt diffs are pre-existing local-rustfmt-version skew that moved with their files); all
36 `#[path]` targets exist; no internal `#[path]`/`include!`/file-child-mod in any moved file
(the inline `mod X {` blocks are self-contained). Box build to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
135 lines
11 KiB
Markdown
135 lines
11 KiB
Markdown
# Goal-1 (clean, layered host architecture) — staged execution plan
|
||
|
||
The design is in [`windows-host-rewrite.md`](windows-host-rewrite.md) §2.2–2.4. This file is the **ordered,
|
||
independently-shippable execution plan**, because the host is **live-validated** (GameStream + punktfunk/1,
|
||
NVENC + IDD-push on-glass) and Goal-1 rewires its session/config/dispatch flow — so every stage must
|
||
**preserve behavior**, compile + box-verify on its own, and be committed before the next starts. The plan's
|
||
own §14 makes the §1 preservation checklist a mandatory per-module assert contract; honour it.
|
||
|
||
## Why staged (not one big rewrite)
|
||
|
||
`main` is at parity and shipping. A monolithic rewrite would put the validated host in a broken
|
||
intermediate state for a long window and make a regression impossible to bisect. Each stage below is a
|
||
behaviour-preserving transform with its own verification, so a regression is caught at the stage that
|
||
introduced it.
|
||
|
||
## Stages (ordered; each = goal · files · risk · verify)
|
||
|
||
**Stage 1 — `HostConfig` foundation. ✅ DONE (this commit).**
|
||
`config.rs`: typed `HostConfig` parsed ONCE from env (`idd_push`/`encoder_pref`/`no_helper`/`force_helper`).
|
||
Migrated the two highest-churn dispatch reads onto it (`encode::windows_resolved_backend`,
|
||
`punktfunk1::should_use_helper`). Risk: low (env constant at runtime → identical behaviour). Verify: box
|
||
`cargo check --features nvenc`.
|
||
|
||
**Stage 2 — finish `HostConfig` + resolve-once. ✅ DONE (this commit).**
|
||
Migrated **31** genuinely-constant operator/dispatch sites onto `HostConfig`: `idd_push` ×7 (the
|
||
capture/topology disagreement knob), `no_wgc`, `capture_backend`, `render_adapter`, `encoder_pref` (Linux),
|
||
the Windows vdisplay-backend select, plus the plan-named `secure_dda`/`idd_depth`/`zerocopy`/`ten_bit` and the
|
||
multi-site `perf` ×4 / `compositor` ×5 / `video_source` ×3 / `gamepad`. Each `HostConfig` field's parser is
|
||
**byte-identical** to the read it replaced, so `old == new` by construction (the §1 "flipped bool" guard).
|
||
|
||
**Scope correction (the plan's "~64 sites / Linux XDG+compositor / grep→0" was unsafe as written):** two
|
||
classes of `env::var` read are deliberately **kept live** and documented in `config.rs`:
|
||
- **Runtime-mutated session vars.** On Linux, `vdisplay::apply_session_env` *rewrites the process env on
|
||
every connect* (the Bazzite Gaming↔Desktop follow): `WAYLAND_DISPLAY`, `XDG_CURRENT_DESKTOP`,
|
||
`XDG_RUNTIME_DIR`, `DBUS_SESSION_BUS_ADDRESS`, and the derived `PUNKTFUNK_INPUT_BACKEND`,
|
||
`PUNKTFUNK_GAMESCOPE_SESSION/NODE`, `PUNKTFUNK_KWIN/MUTTER_VIRTUAL_PRIMARY`, `PUNKTFUNK_FORCE_SHM`.
|
||
Parse-once would freeze them at startup → silent session-following regression. They are NOT constant.
|
||
- **Single-use local tuning** (no resolve-once benefit, call-site-local default/clamp, and `FEC_PCT` even has
|
||
*two different* semantics): `FEC_PCT`, `VIDEO_DROP`, `VBV_FRAMES`, `SPLIT_ENCODE`, `PACE_BURST_KB`, the
|
||
`capture/dxgi.rs` timing knobs, the `*_LIVE`/test gates, plus path/dynamic reads (config-dir, `PATH`
|
||
search, env-forward-to-child). `PUNKTFUNK_ZEROCOPY` is split on purpose: Windows presence-semantics moved
|
||
to the field; Linux keeps its own truthy parser.
|
||
|
||
Risk: medium (semantics-preservation). Verify: Linux `cargo check`/`clippy`/`fmt` green (the Windows-only
|
||
edits are 1:1 substitutions, compile-verified on the box as part of Stage 3's build).
|
||
|
||
**Stage 3 — `SessionPlan` (the single biggest clarity lever, plan §2.4). ✅ DONE (box-build + on-glass validated).**
|
||
New `src/session_plan.rs`: a `Copy` `SessionPlan { capture, topology, encoder, bit_depth, hdr }` resolved
|
||
**once** from `HostConfig` (+ the negotiated `bit_depth`) in `virtual_stream`, logged, and threaded through
|
||
`build_pipeline_with_retry`/`build_pipeline`. The three dispatch points now read it:
|
||
- **capture** — `capture::capture_virtual_output` takes a `CaptureBackend` IN (was re-deriving from
|
||
`config().idd_push`/`capture_backend`/`no_wgc`); `CaptureBackend::resolve()` is the one resolver (also
|
||
used by the GameStream + spike call sites).
|
||
- **topology** — `virtual_stream` reads `plan.topology` (`should_use_helper` deleted; its logic is
|
||
`session_plan::resolve_topology`, verbatim). The IDD-preempt guard reads `plan.capture` too.
|
||
- **encoder** — recorded as `EncoderBackend` from `encode::windows_resolved_backend` (config-backed +
|
||
GPU-vendor cached since stage 2, already a single source). Threading `encoder`/`input_format` into the
|
||
encoder + capturer opens (which removes the `dxgi.rs` back-reference) is **stage 5**.
|
||
|
||
Every decision is provably equivalent to the pre-stage-3 scattered reads (same `config()` + cached probes),
|
||
so it is behavior-preserving. Risk: medium-high (rewires the deployed decision). Verify:
|
||
- **Box build ✅** — `cargo check -p punktfunk-host --features nvenc` (the deployed config: NVENC SDK +
|
||
`cudarc` + `encode/nvenc.rs`) is **clean, zero warnings**, on the RTX box (`192.168.1.173`), in an
|
||
isolated worktree. This also covers stage 2's Windows-only edits (their first real Windows compile).
|
||
- **On-glass ✅** — deployed my Stage-3 host into the SCM service (Session-1 launch, the real IDD-push
|
||
environment) on the RTX box and drove a `punktfunk-probe` loopback session. The host logged
|
||
`resolved session plan { capture: IddPush, topology: SingleProcess, encoder: Nvenc, bit_depth: 8,
|
||
hdr: false }` — the **correct** resolution for the deployed config (IDD_PUSH + VDISPLAY=pf + nvenc) —
|
||
and routed correctly (IDD-push capturer → shared ring → IDD→DDA fallback). This box has a pre-existing
|
||
**hybrid-GPU IDD render-adapter mismatch** (driver renders on the iGPU `af4825`, host ring on the 4090
|
||
`294d29`) that yielded no published frame in this loopback scenario; an **A/B against the shipping
|
||
binary reproduced the identical `frames=0`**, proving the no-frame is environmental, **not** a Stage-3
|
||
regression. Stage 3 is behavior-equivalent to the shipping host. Box restored to its deployed state.
|
||
|
||
**Stage 4 — `SessionContext` (the arg-bundling). ✅ DONE (box-build validated). `SessionFactory`/`Session::drop` deferred to §2.5 — see below.**
|
||
Bundled the 13-positional-argument `#[allow(too_many_arguments)]` session entry (`virtual_stream` **and**
|
||
`virtual_stream_relay`) into one owned `SessionContext` struct, moved into the stream thread. The receivers
|
||
move in (`virtual_stream` is their only consumer), retiring the `&Receiver` borrow plumbing. **Behavior-
|
||
identical by construction**: each function destructures the context into the same local names at the top, so
|
||
the ~400-line loop bodies are byte-for-byte unchanged. Removed both `#[allow(too_many_arguments)]` attrs.
|
||
|
||
**Scoped deliberately.** The plan's `SessionFactory.build()` owning a `vdm.lease(mode) → open_capturer →
|
||
open_encoder → spawn` RAII chain with `Session::drop` as the *only* teardown is **coupled to §2.5's
|
||
ownership-model rewrite** — it needs a host-side `VirtualDisplayManager`/`MonitorLease` that does not exist
|
||
yet (the lifecycle still lives in the `CURRENT_MON_GEN`/`IDD_SETUP_LOCK` globals + the per-compositor
|
||
`vdisplay` backends). The current teardown is **already drop-based** (the capturer owns the keepalive whose
|
||
`Drop` releases the monitor — "restore displays before REMOVE" lives there; only `send_thread.join()` is
|
||
explicit), and it is the validated shipping path. Wrapping the deployed reconfig/switch/rebuild loop in a
|
||
`Session::drop` for a behavior-preserving change would add real regression risk for marginal gain. So the
|
||
`SessionFactory`/`Session::drop`/`vdm.lease` work is folded into §2.5 (its natural home); this stage delivers
|
||
the concrete, safe arg-bundling. Risk: low (behavior-identical). Verify: Linux + box build (the relay
|
||
destructure is the only Windows-only piece); the teardown on-glass gate moves to the §2.5 work.
|
||
|
||
**Stage 5 — seam-trait tightenings (plan §2.3). 🟡 Tightening 1 ✅ DONE (box-build validated); 2→§2.5, 3 follow-on.**
|
||
The three §2.3 tightenings have different coupling, so they split:
|
||
- **(1) `OutputFormat` into the capturer ✅** — the headline (the explicit Stage-3 deferral; §5's
|
||
"highest-severity coupling"). New `capture::OutputFormat { gpu, hdr }`, resolved once per session and
|
||
passed **into** `capture_virtual_output` (`SessionPlan::output_format()` for the native path —
|
||
`gpu = encoder.is_gpu()`, no second probe; `OutputFormat::resolve()` for the GameStream/spike paths).
|
||
`dxgi::DuplCapturer::open` takes `gpu` in and **its `windows_resolved_backend()` recompute is deleted** —
|
||
capture no longer re-derives the encode backend. Behavior-preserving (the `gpu` passed in equals the value
|
||
the capturer used to compute). Linux + box-build clean.
|
||
- **(2) HDR/release → `VirtualLease`** — **moved to §2.5.** `await_released` as a lease method needs the
|
||
monitor-generation carried *on the lease* (today it's the `CURRENT_MON_GEN` global + the
|
||
`sudovda::wait_for_monitor_released` free fn), and the keepalive becoming `Box<dyn VirtualLease>` is the
|
||
ownership-model change. It belongs with the `VirtualDisplayManager`/`MonitorLease` work, not bolted on here.
|
||
- **(3) `EncoderCaps`** — small additive follow-on (query optional encoder capabilities instead of default
|
||
no-ops); not blocking. Tracked for the next seam pass.
|
||
|
||
Risk: medium (Tightening 1 is behavior-preserving + Windows-only → box-compile is the gate; on-glass parity is
|
||
the same env-limited story as Stage 3).
|
||
|
||
**Stage 6 — `windows/` + `linux/` tree confinement (cfg-sprawl, plan §2.2). ✅ DONE (Linux + box-build validated).**
|
||
Moved **36 platform-specific files** into per-module `windows/` and `linux/` folders (and the shared HID
|
||
codecs into `inject/proto/`): `capture/{windows,linux}/`, `encode/{windows,linux}/`,
|
||
`inject/{windows,linux,proto}/`, `audio/{windows,linux}/`, `vdisplay/{windows,linux}/`, and the top-level
|
||
`src/windows/` (service, wgc_helper, win_adapter, win_display) + `src/linux/` (dmabuf_fence, drm_sync,
|
||
zerocopy/).
|
||
|
||
**Done with `#[path]`, not a module rename** — every file moves into its folder while the `crate::*::*` module
|
||
names stay **flat**, so all caller paths and every internal `super::`/`crate::` reference are **unchanged**
|
||
(only the parent `mod` decls gained `#[path = "…"]`). This is the codebase's existing pattern (inject's
|
||
`gamepad_windows`) and makes the move byte-identical in behaviour with **zero reference churn** — far lower
|
||
risk than collapsing to a single `crate::capture::windows::` namespace (that deeper rename is an optional
|
||
follow-on; this delivers the folder confinement the stage is about). Done LAST, after the semantic stages.
|
||
Verify: Linux `cargo check`/`clippy`/`fmt` clean; all 36 `#[path]` targets exist; no internal
|
||
`#[path]`/`include!`/file-child-`mod` in any moved file; **box `cargo check --features nvenc` clean**.
|
||
|
||
## Guardrails (mandatory, plan §14)
|
||
|
||
- Each stage is its own commit; box-verify before moving on.
|
||
- Stages 3–5 touch the deployed path → **on-glass re-test** (NVENC + IDD-push, a mode switch, a
|
||
connect/disconnect cycle) before the next stage.
|
||
- Preserve every `PUNKTFUNK_*` var's exact semantics; when in doubt, assert old==new at the call site.
|