133e25849d27eecc1f32690c25c0d61b90c6ba94
6 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ffc0b07b46 |
feat(protocol,host): negotiate video codec + add a GPU-less software (openh264) encode path
Phase 1 of codec negotiation, and the Linux software H.264 encode path it unblocks. **Codec negotiation (core `quic`):** - `Hello.video_codecs` (bitfield: CODEC_H264/HEVC/AV1) — the client advertises what it can decode; appended as a trailing byte (older client → 0 = HEVC-only, back-compat). - `Welcome.codec` — the single codec the host resolved and will emit; trailing byte (older host → HEVC). - `resolve_codec(client, host_capable)` picks the shared codec (precedence HEVC > AV1 > H.264) or `None` → the host refuses honestly rather than sending an undecodable stream. - Roundtrip + back-compat tests; cbindgen exports the CODEC_* constants. **Software encoder (host):** - The openh264 `OpenH264Encoder` (was Windows-only) is now built on Linux too — it's platform-agnostic (consumes CPU RGB `CapturedFrame`s, statically-bundled openh264). `openh264` moved to the shared linux+windows Cargo target. - `PUNKTFUNK_ENCODER=software` selects it: `open_video` gains a `software` branch (H.264 only), and `session_plan::resolve_encoder` / `capture::gpu_encode` resolve `EncoderBackend::Software` → `output_format().gpu = false`, so the portal capturer delivers CPU RGB. Explicit-only (auto never picks it — a box with a dead driver still has /dev/nvidiactl and would mis-resolve NVENC). **Host codec resolution (`punktfunk1`):** - The native path no longer hardcodes HEVC: it resolves the codec from the client's advertised set ∩ the host's capability (`Codec::host_wire_caps`: software→H.264, else HEVC), threads it through `SessionPlan.codec`, and opens the encoder + validates reconfigures at that codec. A software host + HEVC-only client is refused with a clear error. - 4:4:4 is gated on HEVC (it's HEVC-only). **Probe:** advertises H264|HEVC|AV1 and logs the resolved codec. Validated on the GPU-less dev box: negotiation is live end-to-end (probe advertises 0x07 → host resolves H.264 → Welcome reports it → plan = Software/H264), and the openh264 unit test (CPU RGB → AnnexB IDR) now runs on Linux. Full capture→encode still needs a GPU on this box — every compositor screencast path (KWin GL, gamescope VK_EXT_physical_device_drm, wlroots EGL) requires one; software render (llvmpipe/pixman) can't be captured — so this box exercises negotiation + encoder, not live capture. The software path unblocks GPU-less-*encode* boxes that still have a display GPU. Phase 2 (clients advertising real codecs + decoding per Welcome.codec) is a follow-up. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
38c68c33e5 |
refactor(windows-host): confine platform code under windows/ + linux/ folders (Goal-1 stage 6)
Move 36 platform-specific files into per-module `windows/` and `linux/` subfolders (and the
shared HID codecs into `inject/proto/`):
capture/{windows,linux}/ encode/{windows,linux}/ inject/{windows,linux,proto}/
audio/{windows,linux}/ vdisplay/{windows,linux}/
src/windows/ (service, wgc_helper, win_adapter, win_display)
src/linux/ (dmabuf_fence, drm_sync, zerocopy/)
Done with `#[path]`, NOT a module rename: every file moves into its folder while the
`crate::*::*` module names stay FLAT, so all caller paths and every internal `super::`/`crate::`
reference are unchanged — only the parent `mod` decls gained `#[path = "..."]`. This is the
codebase's existing pattern (inject's gamepad_windows) and makes the move byte-identical in
behaviour with ZERO reference churn, far lower risk than collapsing to a single
`crate::capture::windows::` namespace (that deeper rename is an optional follow-on; this delivers
the cfg-sprawl folder confinement the stage is about). Done LAST, after the semantic stages, so
the path churn didn't fight them.
Verified: Linux cargo check + clippy (-D warnings) clean; my mod-decl changes fmt-clean (the 3
remaining fmt diffs are pre-existing local-rustfmt-version skew that moved with their files); all
36 `#[path]` targets exist; no internal `#[path]`/`include!`/file-child-mod in any moved file
(the inline `mod X {` blocks are self-contained). Box build to follow.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
||
|
|
4cc57d5c39 |
perf(host/windows): move capture→encode off the 3D engine (NV12/P010 video-processor path, zero-copy, GPU priority)
apple / swift (push) Successful in 56s
ci / rust (push) Successful in 1m36s
android / android (push) Successful in 1m56s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 28s
deb / build-publish (push) Successful in 2m26s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
ci / bench (push) Successful in 4m33s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 7m58s
The Windows host capped at ~60 fps with 35-40 ms latency on a GPU-heavy game: the per-frame capture→encode path shared the 3D engine with the game and got scheduled behind it. Rework to minimize 3D-engine work per frame: - VideoConverter (D3D11 video processor): capture → NVENC-native NV12/P010 so NVENC skips its internal RGB→YUV (a 3D/compute step). Wired into both DDA (dxgi.rs) and WGC (wgc.rs). New PixelFormat::Nv12/P010 + NVENC YUV input. - GPU scheduling hardening (Apollo-style): D3DKMTSetProcessSchedulingPriorityClass HIGH, absolute SetGPUThreadPriority, SetMaximumFrameLatency(1). - WGC SDR zero-copy (hold pool frames; no CopyResource). DDA keeps a fast CopyResource to decouple its single-frame acquire/release from the async convert. - Pipelined helper encode loop (PUNKTFUNK_ENCODE_DEPTH, default 1) + perf split (cap_wait / encode / write). Live on the RTX 4090: hard 60 fps ceiling removed (now scene-scaling 40-200+), latency much reduced. Residual cap in GPU-pinned scenes is the irreducible RGB→YUV convert (no fixed-function unit on NVIDIA — VideoProcessing engine reads 0%) waiting behind an uncapped game under WDDM context time-slicing; Linux avoids it via gamescope capping the game to the display refresh. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> |
||
|
|
bbabc04bca |
feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an 8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the capture losing the stream across a secure-desktop transition. Protocol (punktfunk-core/quic.rs): - Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`, both as optional trailing bytes (back-compat). client-rs advertises 10-bit via PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives the native clients). Regenerated punktfunk_core.h. Windows host: - 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10, pixelBitDepthMinus8). - HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the BGRA path. Reacts dynamically to the HDR toggle. - Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s, repeating the last good frame between attempts (format-tagged last_present). During a secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the send loop for seconds so the client timed out and disconnected — now the session stays fed (frozen) until the desktop returns. Also seeds a black frame on recovery. Apple client (PunktfunkKit): - Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010, and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged. Switches automatically on a mid-session HDR toggle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
2448a33698 |
style(host/windows): rustfmt the Windows backends
apple / swift (push) Successful in 55s
android / android (push) Failing after 1m53s
ci / web (push) Failing after 17s
ci / docs-site (push) Successful in 42s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 7s
ci / rust (push) Failing after 3m5s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 12s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 7s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Failing after 2s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Failing after 0s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Failing after 0s
flatpak / build-publish (push) Failing after 0s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 0s
docker / deploy-docs (push) Has been skipped
deb / build-publish (push) Failing after 1m43s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m15s
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |
||
|
|
cbbeaa5c29 |
feat(host/windows): openh264 software H.264 encoder (GPU-less path)
apple / swift (push) Successful in 53s
android / android (push) Failing after 1m31s
ci / rust (push) Failing after 45s
ci / web (push) Successful in 27s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m37s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 3s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
flatpak / build-publish (push) Failing after 2s
deb / build-publish (push) Successful in 3m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Failing after 1m21s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Failing after 1m46s
docker / deploy-docs (push) Successful in 18s
Windows Encoder impl via the openh264 crate (statically-bundled, BSD-2): low-latency screen-content config (Baseline/no-B-frames, bitrate RC, BT.709 limited, near-infinite GOP + forced-IDR recovery via request_keyframe), packed CPU pixels (BGRx/BGRA/RGB/RGBA/RGBx/BGR) -> I420 -> AnnexB with in-band SPS/PPS each IDR. Synchronous: submit encodes immediately, poll hands back the one AU, flush is a no-op. Windows open_video factory selects it (PUNKTFUNK_ENCODER=software|nvenc|auto; NVENC arm lands later), H.264-only with a clear error otherwise, SW bitrate ceiling. Unit-tested live on the VM: synthetic BGRx -> AnnexB IDR + SPS NAL. Unblocks the GPU-less capture->encode->FEC->send pipeline. Compiles clean on Windows + Linux. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> |