Files
punktfunk/docs/windows-host-goal1-plan.md
T
enricobuehler 38c68c33e5 refactor(windows-host): confine platform code under windows/ + linux/ folders (Goal-1 stage 6)
Move 36 platform-specific files into per-module `windows/` and `linux/` subfolders (and the
shared HID codecs into `inject/proto/`):
  capture/{windows,linux}/  encode/{windows,linux}/  inject/{windows,linux,proto}/
  audio/{windows,linux}/  vdisplay/{windows,linux}/
  src/windows/ (service, wgc_helper, win_adapter, win_display)
  src/linux/  (dmabuf_fence, drm_sync, zerocopy/)

Done with `#[path]`, NOT a module rename: every file moves into its folder while the
`crate::*::*` module names stay FLAT, so all caller paths and every internal `super::`/`crate::`
reference are unchanged — only the parent `mod` decls gained `#[path = "..."]`. This is the
codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in
behaviour with ZERO reference churn, far lower risk than collapsing to a single
`crate::capture::windows::` namespace (that deeper rename is an optional follow-on; this delivers
the cfg-sprawl folder confinement the stage is about). Done LAST, after the semantic stages, so
the path churn didn't fight them.

Verified: Linux cargo check + clippy (-D warnings) clean; my mod-decl changes fmt-clean (the 3
remaining fmt diffs are pre-existing local-rustfmt-version skew that moved with their files); all
36 `#[path]` targets exist; no internal `#[path]`/`include!`/file-child-mod in any moved file
(the inline `mod X {` blocks are self-contained). Box build to follow.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-25 18:53:45 +00:00

11 KiB
Raw Blame History

Goal-1 (clean, layered host architecture) — staged execution plan

The design is in windows-host-rewrite.md §2.22.4. This file is the ordered, independently-shippable execution plan, because the host is live-validated (GameStream + punktfunk/1, NVENC + IDD-push on-glass) and Goal-1 rewires its session/config/dispatch flow — so every stage must preserve behavior, compile + box-verify on its own, and be committed before the next starts. The plan's own §14 makes the §1 preservation checklist a mandatory per-module assert contract; honour it.

Why staged (not one big rewrite)

main is at parity and shipping. A monolithic rewrite would put the validated host in a broken intermediate state for a long window and make a regression impossible to bisect. Each stage below is a behaviour-preserving transform with its own verification, so a regression is caught at the stage that introduced it.

Stages (ordered; each = goal · files · risk · verify)

Stage 1 — HostConfig foundation. DONE (this commit). config.rs: typed HostConfig parsed ONCE from env (idd_push/encoder_pref/no_helper/force_helper). Migrated the two highest-churn dispatch reads onto it (encode::windows_resolved_backend, punktfunk1::should_use_helper). Risk: low (env constant at runtime → identical behaviour). Verify: box cargo check --features nvenc.

Stage 2 — finish HostConfig + resolve-once. DONE (this commit). Migrated 31 genuinely-constant operator/dispatch sites onto HostConfig: idd_push ×7 (the capture/topology disagreement knob), no_wgc, capture_backend, render_adapter, encoder_pref (Linux), the Windows vdisplay-backend select, plus the plan-named secure_dda/idd_depth/zerocopy/ten_bit and the multi-site perf ×4 / compositor ×5 / video_source ×3 / gamepad. Each HostConfig field's parser is byte-identical to the read it replaced, so old == new by construction (the §1 "flipped bool" guard).

Scope correction (the plan's "~64 sites / Linux XDG+compositor / grep→0" was unsafe as written): two classes of env::var read are deliberately kept live and documented in config.rs:

  • Runtime-mutated session vars. On Linux, vdisplay::apply_session_env rewrites the process env on every connect (the Bazzite Gaming↔Desktop follow): WAYLAND_DISPLAY, XDG_CURRENT_DESKTOP, XDG_RUNTIME_DIR, DBUS_SESSION_BUS_ADDRESS, and the derived PUNKTFUNK_INPUT_BACKEND, PUNKTFUNK_GAMESCOPE_SESSION/NODE, PUNKTFUNK_KWIN/MUTTER_VIRTUAL_PRIMARY, PUNKTFUNK_FORCE_SHM. Parse-once would freeze them at startup → silent session-following regression. They are NOT constant.
  • Single-use local tuning (no resolve-once benefit, call-site-local default/clamp, and FEC_PCT even has two different semantics): FEC_PCT, VIDEO_DROP, VBV_FRAMES, SPLIT_ENCODE, PACE_BURST_KB, the capture/dxgi.rs timing knobs, the *_LIVE/test gates, plus path/dynamic reads (config-dir, PATH search, env-forward-to-child). PUNKTFUNK_ZEROCOPY is split on purpose: Windows presence-semantics moved to the field; Linux keeps its own truthy parser.

Risk: medium (semantics-preservation). Verify: Linux cargo check/clippy/fmt green (the Windows-only edits are 1:1 substitutions, compile-verified on the box as part of Stage 3's build).

Stage 3 — SessionPlan (the single biggest clarity lever, plan §2.4). DONE (box-build + on-glass validated). New src/session_plan.rs: a Copy SessionPlan { capture, topology, encoder, bit_depth, hdr } resolved once from HostConfig (+ the negotiated bit_depth) in virtual_stream, logged, and threaded through build_pipeline_with_retry/build_pipeline. The three dispatch points now read it:

  • capturecapture::capture_virtual_output takes a CaptureBackend IN (was re-deriving from config().idd_push/capture_backend/no_wgc); CaptureBackend::resolve() is the one resolver (also used by the GameStream + spike call sites).
  • topologyvirtual_stream reads plan.topology (should_use_helper deleted; its logic is session_plan::resolve_topology, verbatim). The IDD-preempt guard reads plan.capture too.
  • encoder — recorded as EncoderBackend from encode::windows_resolved_backend (config-backed + GPU-vendor cached since stage 2, already a single source). Threading encoder/input_format into the encoder + capturer opens (which removes the dxgi.rs back-reference) is stage 5.

Every decision is provably equivalent to the pre-stage-3 scattered reads (same config() + cached probes), so it is behavior-preserving. Risk: medium-high (rewires the deployed decision). Verify:

  • Box build cargo check -p punktfunk-host --features nvenc (the deployed config: NVENC SDK + cudarc + encode/nvenc.rs) is clean, zero warnings, on the RTX box (192.168.1.173), in an isolated worktree. This also covers stage 2's Windows-only edits (their first real Windows compile).
  • On-glass — deployed my Stage-3 host into the SCM service (Session-1 launch, the real IDD-push environment) on the RTX box and drove a punktfunk-probe loopback session. The host logged resolved session plan { capture: IddPush, topology: SingleProcess, encoder: Nvenc, bit_depth: 8, hdr: false } — the correct resolution for the deployed config (IDD_PUSH + VDISPLAY=pf + nvenc) — and routed correctly (IDD-push capturer → shared ring → IDD→DDA fallback). This box has a pre-existing hybrid-GPU IDD render-adapter mismatch (driver renders on the iGPU af4825, host ring on the 4090 294d29) that yielded no published frame in this loopback scenario; an A/B against the shipping binary reproduced the identical frames=0, proving the no-frame is environmental, not a Stage-3 regression. Stage 3 is behavior-equivalent to the shipping host. Box restored to its deployed state.

Stage 4 — SessionContext (the arg-bundling). DONE (box-build validated). SessionFactory/Session::drop deferred to §2.5 — see below. Bundled the 13-positional-argument #[allow(too_many_arguments)] session entry (virtual_stream and virtual_stream_relay) into one owned SessionContext struct, moved into the stream thread. The receivers move in (virtual_stream is their only consumer), retiring the &Receiver borrow plumbing. Behavior- identical by construction: each function destructures the context into the same local names at the top, so the ~400-line loop bodies are byte-for-byte unchanged. Removed both #[allow(too_many_arguments)] attrs.

Scoped deliberately. The plan's SessionFactory.build() owning a vdm.lease(mode) → open_capturer → open_encoder → spawn RAII chain with Session::drop as the only teardown is coupled to §2.5's ownership-model rewrite — it needs a host-side VirtualDisplayManager/MonitorLease that does not exist yet (the lifecycle still lives in the CURRENT_MON_GEN/IDD_SETUP_LOCK globals + the per-compositor vdisplay backends). The current teardown is already drop-based (the capturer owns the keepalive whose Drop releases the monitor — "restore displays before REMOVE" lives there; only send_thread.join() is explicit), and it is the validated shipping path. Wrapping the deployed reconfig/switch/rebuild loop in a Session::drop for a behavior-preserving change would add real regression risk for marginal gain. So the SessionFactory/Session::drop/vdm.lease work is folded into §2.5 (its natural home); this stage delivers the concrete, safe arg-bundling. Risk: low (behavior-identical). Verify: Linux + box build (the relay destructure is the only Windows-only piece); the teardown on-glass gate moves to the §2.5 work.

Stage 5 — seam-trait tightenings (plan §2.3). 🟡 Tightening 1 DONE (box-build validated); 2→§2.5, 3 follow-on. The three §2.3 tightenings have different coupling, so they split:

  • (1) OutputFormat into the capturer — the headline (the explicit Stage-3 deferral; §5's "highest-severity coupling"). New capture::OutputFormat { gpu, hdr }, resolved once per session and passed into capture_virtual_output (SessionPlan::output_format() for the native path — gpu = encoder.is_gpu(), no second probe; OutputFormat::resolve() for the GameStream/spike paths). dxgi::DuplCapturer::open takes gpu in and its windows_resolved_backend() recompute is deleted — capture no longer re-derives the encode backend. Behavior-preserving (the gpu passed in equals the value the capturer used to compute). Linux + box-build clean.
  • (2) HDR/release → VirtualLeasemoved to §2.5. await_released as a lease method needs the monitor-generation carried on the lease (today it's the CURRENT_MON_GEN global + the sudovda::wait_for_monitor_released free fn), and the keepalive becoming Box<dyn VirtualLease> is the ownership-model change. It belongs with the VirtualDisplayManager/MonitorLease work, not bolted on here.
  • (3) EncoderCaps — small additive follow-on (query optional encoder capabilities instead of default no-ops); not blocking. Tracked for the next seam pass.

Risk: medium (Tightening 1 is behavior-preserving + Windows-only → box-compile is the gate; on-glass parity is the same env-limited story as Stage 3).

Stage 6 — windows/ + linux/ tree confinement (cfg-sprawl, plan §2.2). DONE (Linux + box-build validated). Moved 36 platform-specific files into per-module windows/ and linux/ folders (and the shared HID codecs into inject/proto/): capture/{windows,linux}/, encode/{windows,linux}/, inject/{windows,linux,proto}/, audio/{windows,linux}/, vdisplay/{windows,linux}/, and the top-level src/windows/ (service, wgc_helper, win_adapter, win_display) + src/linux/ (dmabuf_fence, drm_sync, zerocopy/).

Done with #[path], not a module rename — every file moves into its folder while the crate::*::* module names stay flat, so all caller paths and every internal super::/crate:: reference are unchanged (only the parent mod decls gained #[path = "…"]). This is the codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in behaviour with zero reference churn — far lower risk than collapsing to a single crate::capture::windows:: namespace (that deeper rename is an optional follow-on; this delivers the folder confinement the stage is about). Done LAST, after the semantic stages. Verify: Linux cargo check/clippy/fmt clean; all 36 #[path] targets exist; no internal #[path]/include!/file-child-mod in any moved file; box cargo check --features nvenc clean.

Guardrails (mandatory, plan §14)

  • Each stage is its own commit; box-verify before moving on.
  • Stages 35 touch the deployed path → on-glass re-test (NVENC + IDD-push, a mode switch, a connect/disconnect cycle) before the next stage.
  • Preserve every PUNKTFUNK_* var's exact semantics; when in doubt, assert old==new at the call site.