feat(clients/windows): all-vendor video pipeline rewrite + app icon + hosts-page tiles
Decode+present rewrite (first real pixels on glass for this client):

- Decode: FFmpeg D3D11VA on NVIDIA/AMD/Intel. get_format now only returns AV_PIX_FMT_D3D11 and lets libavcodec build the decode pool from hw_device_ctx (hand-built frames contexts failed three different ways: NVIDIA rejects DECODER|SHADER_RESOURCE arrays, BindFlags=0 fails texture creation, Intel rejects non-128-aligned HEVC surfaces at the first SubmitDecoderBuffers). A DXVA profile probe before the hwdevice commits hardware-vs-software up front instead of burning the opening IDR; extra_hw_frames covers the frames the client holds.
- Present: the decoded slice is copied with ONE display-size-boxed CopySubresourceRegion (a planar slice is a single subresource in D3D11; the old two-copy D3D12-style code silently no-opped - the black screen) into a sampleable NV12/P010 texture, per-plane SRVs + YUV->RGB shaders.
- New dedicated render thread (render.rs): presenting is decoupled from the XAML thread; frame-latency-waitable swapchain + SetMaximumFrameLatency(1), newest-wins drain after the wait, crossbeam frame channel with pts for a capture->presented p50 log.
- HiDPI: pixel-sized buffers + SetMatrixTransform(96/dpi) - was blurry at 125/150 % scaling.
- Software fallback now feeds the same shaders (swscale -> NV12/P010 planes -> two dynamic plane textures); ps_rgba/X2BGR10 path deleted, hw/sw colour math identical.
- Adapter selection for hybrid boxes: PUNKTFUNK_ADAPTER > the window's monitor's adapter > default; PUNKTFUNK_D3D_DEBUG=1 debug layer.
- Session pump: request_keyframe at start and on hw->sw demotion (infinite GOP would otherwise sit on a black screen).

Validated live on the Arc Pro + RTX 3500 Ada laptop against the local Windows host: 60 fps D3D11VA on both vendors, software path, GUI on glass.
Also: embedded app icon (build.rs winresource + WM_SETICON, MSIX Square44x44 targetsize assets, pack-msix stages them) and the hosts-page tile rework (tap-to-connect tiles with sibling overflow menu - fixes forget-also-connects - in-tile rename editor, add-host modal via root state).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
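
The HiDPI fix in the commit message (pixel-sized buffers plus an inverse `96/dpi` transform) comes down to two small calculations. The sketch below only illustrates that arithmetic; `dip_to_pixels` and `inverse_dpi_scale` are hypothetical names and are not taken from `render.rs`, which this excerpt does not show.

```rust
/// Physical pixel extent for a panel dimension given in DIPs (1 DIP = 1/96 inch).
/// Hypothetical helper for illustration only.
fn dip_to_pixels(dip: f64, dpi: f64) -> u32 {
    // Round to the nearest pixel and never return a zero-sized buffer.
    (dip * dpi / 96.0).round().max(1.0) as u32
}

/// Inverse scale to apply through the swapchain's matrix transform so that a
/// pixel-sized buffer is composed back into the panel's DIP-sized box.
fn inverse_dpi_scale(dpi: f64) -> f64 {
    96.0 / dpi
}
```

At 150 % scaling (144 DPI) a 1280-DIP-wide panel would get a 1920-pixel buffer and a scale of 2/3, so one buffer pixel maps to one physical pixel instead of being stretched.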
@@ -102,7 +102,17 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
|
|||||||
**pf-vdisplay** virtual display (`capture/windows/idd_push.rs`, `vdisplay/windows/pf_vdisplay.rs`;
|
**pf-vdisplay** virtual display (`capture/windows/idd_push.rs`, `vdisplay/windows/pf_vdisplay.rs`;
|
||||||
DXGI Desktop Duplication / WGC as fallbacks, `capture/windows/dxgi.rs`), GPU encode (NVENC
|
DXGI Desktop Duplication / WGC as fallbacks, `capture/windows/dxgi.rs`), GPU encode (NVENC
|
||||||
`--features nvenc`; AMD/Intel `--features amf-qsv`), SendInput + the in-house UMDF gamepad drivers
|
`--features nvenc`; AMD/Intel `--features amf-qsv`), SendInput + the in-house UMDF gamepad drivers
|
||||||
(`inject/windows/`), WASAPI loopback + virtual mic (`audio/windows/wasapi_*`). Ships as a **signed
|
(`inject/windows/`), WASAPI loopback + virtual mic (`audio/windows/wasapi_*`). **Keyboard wire
|
||||||
|
convention: US-positional VKs** (every first-party client sends the physical key's US-layout VK;
|
||||||
|
the Windows client derives it from the scancode, NOT the layout-resolved `vkCode`) — the Windows
|
||||||
|
injector resolves them via a fixed table mirroring the Linux `vk_to_evdev` (never through a
|
||||||
|
keyboard layout: the SYSTEM service thread's layout re-reads positions as characters — the
|
||||||
|
German y↔z / ö→ü scramble), while GameStream/Moonlight VKs are layout-semantic
|
||||||
|
(`KEY_FLAG_SEMANTIC_VK`, resolved under the foreground app's layout, Sunshine's model). Linux
|
||||||
|
renders positions under the session compositor's layout (libei) or the virtual keyboard's
|
||||||
|
uploaded keymap (Sway/wlroots — honors `XKB_DEFAULT_LAYOUT` et al., default US); the Android
|
||||||
|
client reads `KeyEvent.scanCode` first so a user-selected physical-keyboard layout can't
|
||||||
|
re-map keycodes semantically. Ships as a **signed
|
||||||
Inno Setup installer** that registers a `LocalSystem` SCM service launching into the interactive
|
Inno Setup installer** that registers a `LocalSystem` SCM service launching into the interactive
|
||||||
session for secure-desktop (UAC/lock-screen) capture (`windows/service.rs`), bundles the
|
session for secure-desktop (UAC/lock-screen) capture (`windows/service.rs`), bundles the
|
||||||
pf-vdisplay driver + the FFmpeg DLLs (+ VB-CABLE for the virtual mic), and is published by
|
pf-vdisplay driver + the FFmpeg DLLs (+ VB-CABLE for the virtual mic), and is published by
|
||||||
@@ -224,23 +234,39 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
|
|||||||
**Windows stage 1 done 2026-06-15** (`clients/windows`, binary
|
**Windows stage 1 done 2026-06-15** (`clients/windows`, binary
|
||||||
`punktfunk-client`): pure-Rust **WinUI 3** UI via **windows-reactor** (a declarative React-like
|
`punktfunk-client`): pure-Rust **WinUI 3** UI via **windows-reactor** (a declarative React-like
|
||||||
framework backed by WinUI; PR #4499 added the `SwapChainPanel` widget + `set_swap_chain`). The
|
framework backed by WinUI; PR #4499 added the `SwapChainPanel` widget + `set_swap_chain`). The
|
||||||
video is a **`SwapChainPanel`** bound to a **D3D11 composition swapchain** (WARP fallback for
|
video is a **`SwapChainPanel`** bound to a **D3D11 composition swapchain**, presented from a
|
||||||
the GPU-less dev box; runtime-compiled fullscreen-triangle shaders, Contain-fit letterbox),
|
**dedicated render thread** (`render.rs`, 2026-07-02 rewrite — presenting never touches or is
|
||||||
driven by reactor's per-frame `on_rendering`. **FFmpeg HEVC decode with a D3D11VA
|
stalled by the XAML thread): frame-latency-waitable swapchain + `SetMaximumFrameLatency(1)`
|
||||||
zero-copy hardware path** (`gpu.rs` shares one D3D11 device — hardware+`VIDEO_SUPPORT`, WARP
|
(≤1 queued present, newest-wins drain after the wait, so a stream faster than the display drops
|
||||||
fallback, multithread-protected — between the decoder and presenter; the decoder outputs
|
backlog before any GPU work), **HiDPI-correct** (pixel-sized buffers + `SetMatrixTransform`
|
||||||
NV12/P010 `ID3D11Texture2D` array slices with `BIND_SHADER_RESOURCE` and the presenter samples
|
96/DPI — DIP-sized buffers were blurry at 125/150 %), Contain-fit letterbox, WARP fallback.
|
||||||
them via per-plane SRVs + YUV→RGB shaders — NV12/BT.709, P010/BT.2020-PQ; **software CPU decode
|
**FFmpeg decode with a D3D11VA hardware path on all vendors** (`gpu.rs` shares one D3D11 device
|
||||||
stays as the robust fallback**, auto-selected with a `DecoderPref` override). **HDR10**: the
|
between decoder + presenter, adapter picked by console pref `PUNKTFUNK_ADAPTER` > the window's
|
||||||
client advertises 10-bit/HDR (Settings toggle), detects PQ in-band (`transfer == SMPTE2084`),
|
monitor's adapter > default; `PUNKTFUNK_D3D_DEBUG=1` adds the debug layer): the decode pool is
|
||||||
and flips the swapchain to `R10G10B10A2` + ST.2084 with HDR10 metadata. **WASAPI** render + mic
|
**decoder-only bind, sized/aligned by libavcodec itself** (get_format returns `AV_PIX_FMT_D3D11`
|
||||||
capture, **SDL3** gamepads (rumble/lightbar/DualSense), `mdns-sd` discovery, and the full trust
|
and lets `hw_device_ctx` drive — three hand-built-frames-context strikes are why: NVIDIA rejects
|
||||||
surface — all **in-app**: a polished WinUI shell (host cards w/ monogram + status pills,
|
`DECODER|SHADER_RESOURCE` arrays, `BindFlags=0` fails texture creation, and Intel rejects
|
||||||
|
non-128-aligned HEVC surfaces at the first `SubmitDecoderBuffers`), a DXVA **profile probe**
|
||||||
|
before the hwdevice commits software-vs-hardware up front (no burned first IDR), and the
|
||||||
|
presenter copies the decoded slice with ONE display-size-boxed `CopySubresourceRegion` (a planar
|
||||||
|
slice is a single subresource in D3D11 — the old two-copy D3D12-style code silently no-opped =
|
||||||
|
the black screen) into its sampleable NV12/P010 texture → per-plane SRVs + YUV→RGB shaders
|
||||||
|
(NV12/BT.709, P010/BT.2020-PQ). **Software CPU decode is the fallback** (auto-selected,
|
||||||
|
`DecoderPref` override, mid-session demotion + keyframe re-request) and now feeds the SAME
|
||||||
|
shaders (swscale → NV12/P010 planes → two dynamic plane textures) so hw/sw colour math is
|
||||||
|
identical. **HDR10**: the client advertises 10-bit/HDR (Settings toggle, gated on an HDR
|
||||||
|
display), detects PQ in-band (`transfer == SMPTE2084`), and flips the swapchain to
|
||||||
|
`R10G10B10A2` + ST.2084 with HDR10 metadata (0xCE mastering metadata plumbed). **WASAPI** render
|
||||||
|
+ mic capture, **SDL3** gamepads (rumble/lightbar/DualSense), `mdns-sd` discovery, and the full
|
||||||
|
trust surface — all **in-app**: a polished WinUI shell (host tiles w/ monogram + status pills,
|
||||||
`InfoBar` errors/hints, `ToggleSwitch` settings, status-chip stream HUD showing GPU/CPU decode +
|
`InfoBar` errors/hints, `ToggleSwitch` settings, status-chip stream HUD showing GPU/CPU decode +
|
||||||
HDR), host list (live mDNS + saved + manual), settings (resolution/refresh/decoder/bitrate/HDR/
|
HDR), host list (live mDNS + saved + manual), settings (resolution/refresh/decoder/bitrate/HDR/
|
||||||
mic), SPAKE2 PIN pairing screen, TOFU, pinned-fp-mismatch re-pair. **(D3D11VA + HDR present + the
|
mic), SPAKE2 PIN pairing screen, TOFU, pinned-fp-mismatch re-pair. **Live-validated 2026-07-02
|
||||||
GUI polish are written against the windows-rs/reactor APIs but not yet on-glass validated — the
|
on the hybrid laptop (Intel Arc Pro iGPU + RTX 3500 Ada) against the local Windows host**:
|
||||||
dev VM is headless/WARP; needs the RTX box.)** **Stream input** is Win32 low-level hooks (`WH_KEYBOARD_LL`/`WH_MOUSE_LL`) — reactor
|
D3D11VA hardware decode 60 fps on BOTH vendors (headless, `PUNKTFUNK_ADAPTER`-forced; NVIDIA
|
||||||
|
0.2 ms decode, Intel 0.2 ms), software path, and the GUI on glass (real decoded desktop pixels,
|
||||||
|
GPU-decode HUD chip, ~18 ms capture→decoded p50 over loopback — dominated by the host's 60 Hz
|
||||||
|
virtual-display capture cadence). HDR-on-glass still pending. **Stream input** is Win32 low-level hooks (`WH_KEYBOARD_LL`/`WH_MOUSE_LL`) — reactor
|
||||||
exposes no raw key/pointer events; native Windows VK + absolute mouse (client-rect Contain-fit) +
|
exposes no raw key/pointer events; native Windows VK + absolute mouse (client-rect Contain-fit) +
|
||||||
wheel, Ctrl+Alt+Shift+Q capture toggle. `--headless`/`--discover` keep CLI paths. Builds + clippy
|
wheel, Ctrl+Alt+Shift+Q capture toggle. `--headless`/`--discover` keep CLI paths. Builds + clippy
|
||||||
+ fmt green on **`x86_64-pc-windows-msvc` and `aarch64-pc-windows-msvc`** — the latter
|
+ fmt green on **`x86_64-pc-windows-msvc` and `aarch64-pc-windows-msvc`** — the latter
|
||||||
@@ -268,9 +294,9 @@ Low-latency desktop/game streaming stack, Linux-first, with a shared Rust protoc
|
|||||||
loopback E2E (TOFU connect → clock skew → HEVC negotiate → shared-D3D11 + D3D11VA init → WASAPI →
|
loopback E2E (TOFU connect → clock skew → HEVC negotiate → shared-D3D11 + D3D11VA init → WASAPI →
|
||||||
session end; synthetic payload isn't decodable so decode output stays unvalidated), speed-test
|
session end; synthetic payload isn't decodable so decode output stays unvalidated), speed-test
|
||||||
E2E. The WinUI window itself CANNOT be launched from SSH (session-0 → WinAppSDK 0x80070005,
|
E2E. The WinUI window itself CANNOT be launched from SSH (session-0 → WinAppSDK 0x80070005,
|
||||||
pre-existing) — GUI on-glass validation still pending (needs the console session, e.g. PsExec -i 1).
|
pre-existing; needs the console session, e.g. PsExec -i 1).
|
||||||
Next: **on-glass validation** of the D3D11VA decode + HDR present + GUI (console session on the
|
Next: **HDR on-glass validation** (Windows host with `PUNKTFUNK_10BIT` → the HDR laptop
|
||||||
RTX box), then RAWINPUT relative-mouse pointer-lock.
|
display), then RAWINPUT relative-mouse pointer-lock.
|
||||||
**Android stage 1 done** (`clients/android`, Kotlin app + `native/` Rust JNI core linking
|
**Android stage 1 done** (`clients/android`, Kotlin app + `native/` Rust JNI core linking
|
||||||
`punktfunk-core`; phone + Android TV): NDK `AMediaCodec` hardware HEVC decode → `SurfaceView` incl.
|
`punktfunk-core`; phone + Android TV): NDK `AMediaCodec` hardware HEVC decode → `SurfaceView` incl.
|
||||||
**HDR10** (Main10/BT.2020 PQ) with low-latency tuning + a live stats HUD (`decode.rs`/`stats.rs`),
|
**HDR10** (Main10/BT.2020 PQ) with low-latency tuning + a live stats HUD (`decode.rs`/`stats.rs`),
|
||||||
|
|||||||
@@ -770,6 +770,15 @@ dependencies = [
|
|||||||
"itertools 0.10.5",
|
"itertools 0.10.5",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "crossbeam-channel"
|
||||||
|
version = "0.5.15"
|
||||||
|
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||||
|
checksum = "82b8f8f868b36967f9606790d1903570de9ceaf870a7bf9fbbd3016d636a2cb2"
|
||||||
|
dependencies = [
|
||||||
|
"crossbeam-utils",
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "crossbeam-deque"
|
name = "crossbeam-deque"
|
||||||
version = "0.8.6"
|
version = "0.8.6"
|
||||||
@@ -2760,6 +2769,7 @@ version = "0.4.2"
|
|||||||
dependencies = [
|
dependencies = [
|
||||||
"anyhow",
|
"anyhow",
|
||||||
"async-channel",
|
"async-channel",
|
||||||
|
"crossbeam-channel",
|
||||||
"ffmpeg-next",
|
"ffmpeg-next",
|
||||||
"mdns-sd",
|
"mdns-sd",
|
||||||
"opus",
|
"opus",
|
||||||
@@ -2772,6 +2782,7 @@ dependencies = [
|
|||||||
"wasapi",
|
"wasapi",
|
||||||
"windows 0.62.2 (git+https://github.com/microsoft/windows-rs?rev=b4129fcc1ae81eec8bf1217539883db821bca3a1)",
|
"windows 0.62.2 (git+https://github.com/microsoft/windows-rs?rev=b4129fcc1ae81eec8bf1217539883db821bca3a1)",
|
||||||
"windows-reactor",
|
"windows-reactor",
|
||||||
|
"winresource",
|
||||||
]
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
@@ -5106,6 +5117,16 @@ dependencies = [
|
|||||||
"windows-sys 0.61.2",
|
"windows-sys 0.61.2",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
[[package]]
|
||||||
|
name = "winresource"
|
||||||
|
version = "0.1.31"
|
||||||
|
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||||
|
checksum = "0986a8b1d586b7d3e4fe3d9ea39fb451ae22869dcea4aa109d287a374d866087"
|
||||||
|
dependencies = [
|
||||||
|
"toml 1.1.2+spec-1.1.0",
|
||||||
|
"version_check",
|
||||||
|
]
|
||||||
|
|
||||||
[[package]]
|
[[package]]
|
||||||
name = "wit-bindgen"
|
name = "wit-bindgen"
|
||||||
version = "0.57.1"
|
version = "0.57.1"
|
||||||
|
|||||||
@@ -39,6 +39,8 @@ windows = { git = "https://github.com/microsoft/windows-rs", rev = "b4129fcc1ae8
|
|||||||
"Win32_Graphics_Gdi",
|
"Win32_Graphics_Gdi",
|
||||||
"Win32_System_Console",
|
"Win32_System_Console",
|
||||||
"Win32_System_LibraryLoader",
|
"Win32_System_LibraryLoader",
|
||||||
|
"Win32_System_Threading",
|
||||||
|
"Win32_UI_HiDpi",
|
||||||
"Win32_UI_Input_KeyboardAndMouse",
|
"Win32_UI_Input_KeyboardAndMouse",
|
||||||
"Win32_UI_WindowsAndMessaging",
|
"Win32_UI_WindowsAndMessaging",
|
||||||
] }
|
] }
|
||||||
@@ -57,8 +59,15 @@ sdl3 = { version = "0.18", features = ["build-from-source", "hidapi"] }
|
|||||||
|
|
||||||
mdns-sd = "0.20"
|
mdns-sd = "0.20"
|
||||||
async-channel = "2"
|
async-channel = "2"
|
||||||
|
# The decoded-frame channel (session pump → render thread): crossbeam because the render loop
|
||||||
|
# blocks with `recv_timeout`, for which async-channel has no synchronous analogue.
|
||||||
|
crossbeam-channel = "0.5"
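
The blocking receive mentioned in the comment above, combined with the "newest-wins drain" from the commit message, could look roughly like this. It is a minimal sketch that swaps in `std::sync::mpsc` (which also offers `recv_timeout` and `try_recv`) so it compiles without extra crates; `recv_newest` is a hypothetical name, and the real render loop also waits on the swapchain's frame-latency handle, which is not modelled here.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Block up to `timeout` for one frame, then drain whatever else is queued
/// and keep only the newest. Returns `Ok(None)` on timeout and `Err(())`
/// once every sender has hung up and the queue is empty.
fn recv_newest<T>(rx: &Receiver<T>, timeout: Duration) -> Result<Option<T>, ()> {
    let mut newest = match rx.recv_timeout(timeout) {
        Ok(frame) => frame,
        Err(RecvTimeoutError::Timeout) => return Ok(None),
        Err(RecvTimeoutError::Disconnected) => return Err(()),
    };
    // Newest-wins: older queued frames are dropped here, before any
    // per-frame work would be spent on them.
    while let Ok(later) = rx.try_recv() {
        newest = later;
    }
    Ok(Some(newest))
}
```

The timeout lets the loop wake up periodically (for example to notice a shutdown request) even when no frames arrive.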
|
||||||
serde = { version = "1", features = ["derive"] }
|
serde = { version = "1", features = ["derive"] }
|
||||||
serde_json = "1"
|
serde_json = "1"
|
||||||
anyhow = "1"
|
anyhow = "1"
|
||||||
tracing = "0.1"
|
tracing = "0.1"
|
||||||
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
|
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
|
||||||
|
|
||||||
|
# Embeds the app icon as an exe resource (build.rs) — Windows hosts only (rc.exe from the SDK).
|
||||||
|
[target.'cfg(windows)'.build-dependencies]
|
||||||
|
winresource = "0.1"
|
||||||
|
|||||||
@@ -0,0 +1,18 @@
|
|||||||
|
//! Embed the Windows version-info + icon resources into `punktfunk-client.exe`. The icon drives
|
||||||
|
//! Explorer / Alt-Tab / the unpackaged taskbar, and `app::run` stamps it onto the WinUI window's
|
||||||
|
//! title bar via `WM_SETICON` (the MSIX taskbar/Start icons come from the package assets instead).
|
||||||
|
|
||||||
|
fn main() {
|
||||||
|
// cfg(windows) is the HOST (skips the Linux/macOS workspace stub build); CARGO_CFG_WINDOWS
|
||||||
|
// is the TARGET (both the x64 and the cross-compiled ARM64 Windows builds pass).
|
||||||
|
#[cfg(windows)]
|
||||||
|
if std::env::var_os("CARGO_CFG_WINDOWS").is_some() {
|
||||||
|
let icon = "../../packaging/windows/branding/punktfunk.ico";
|
||||||
|
println!("cargo:rerun-if-changed={icon}");
|
||||||
|
winresource::WindowsResource::new()
|
||||||
|
// Ordinal 1 — app/mod.rs loads it by this id for WM_SETICON.
|
||||||
|
.set_icon_with_id(icon, "1")
|
||||||
|
.compile()
|
||||||
|
.expect("embed windows icon resource");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
[Binary image files in this commit: 10, with sizes 785 B, 913 B, 1.1 KiB, 11 KiB, 1.4 KiB, 1.5 KiB, 1.7 KiB, 1.8 KiB, 2.2 KiB, 2.8 KiB]
@@ -106,6 +106,25 @@ Copy-Item (Join-Path $assets '*') (Join-Path $layout 'Assets') -Force
|
|||||||
$manifest = (Get-Content -Raw $manifestTemplate).Replace('{VERSION}', $Version).Replace('{PUBLISHER}', $Publisher).Replace('{ARCH}', $Arch)
|
$manifest = (Get-Content -Raw $manifestTemplate).Replace('{VERSION}', $Version).Replace('{PUBLISHER}', $Publisher).Replace('{ARCH}', $Arch)
|
||||||
Set-Content -Path (Join-Path $layout 'AppxManifest.xml') -Value $manifest -Encoding UTF8
|
Set-Content -Path (Join-Path $layout 'AppxManifest.xml') -Value $manifest -Encoding UTF8
|
||||||
|
|
||||||
|
# --- resource index (resources.pri) ---
|
||||||
|
# The shell resolves the manifest's logo assets through MRT, so the qualified variants
|
||||||
|
# (Square44x44Logo.targetsize-*_altform-unplated.png — the alpha-transparent taskbar icons) only
|
||||||
|
# take effect if a pri indexes them; without one the taskbar falls back to plating the base
|
||||||
|
# 44x44 onto a solid square (the white-cornered icon). makepri's default config indexes the
|
||||||
|
# layout's asset files AND merges any existing .pri it finds (reactor's staged WinUI resources)
|
||||||
|
# via its PRI indexer, yielding one combined resources.pri. Output lands outside the layout
|
||||||
|
# first — the reactor pri is an input while indexing — then replaces it.
|
||||||
|
$makepri = Find-SdkTool 'makepri.exe'
|
||||||
|
$priconfig = Join-Path $OutDir 'priconfig.xml'
|
||||||
|
New-Item -ItemType Directory -Force -Path $OutDir | Out-Null
|
||||||
|
& $makepri createconfig /cf $priconfig /dq en-US /o
|
||||||
|
if ($LASTEXITCODE -ne 0) { throw "makepri createconfig failed ($LASTEXITCODE)" }
|
||||||
|
$priOut = Join-Path $OutDir 'resources.pri'
|
||||||
|
if (Test-Path $priOut) { Remove-Item $priOut -Force }
|
||||||
|
& $makepri new /pr $layout /cf $priconfig /mn (Join-Path $layout 'AppxManifest.xml') /of $priOut /o
|
||||||
|
if ($LASTEXITCODE -ne 0) { throw "makepri new failed ($LASTEXITCODE)" }
|
||||||
|
Move-Item $priOut (Join-Path $layout 'resources.pri') -Force
|
||||||
|
|
||||||
Write-Host "layout assembled at $layout :"
|
Write-Host "layout assembled at $layout :"
|
||||||
Get-ChildItem $layout -Recurse -File | ForEach-Object { " $($_.FullName.Substring($layout.Length + 1))" }
|
Get-ChildItem $layout -Recurse -File | ForEach-Object { " $($_.FullName.Substring($layout.Length + 1))" }
|
||||||
|
|
||||||
|
|||||||
@@ -1,5 +1,6 @@
|
|||||||
//! The hosts page: saved (trusted/paired) hosts with per-host actions (speed test, forget),
|
//! The hosts page: saved (trusted/paired) hosts and live mDNS discovery as tap-to-connect
|
||||||
//! live mDNS discovery, and a manual connect entry.
|
//! tiles in a responsive grid, with a per-host "…" menu (connect / speed test / rename /
|
||||||
|
//! forget) and a manual connect entry — the same card layout as the Linux and Apple clients.
|
||||||
|
|
||||||
use super::connect::initiate;
|
use super::connect::initiate;
|
||||||
use super::speed::SpeedState;
|
use super::speed::SpeedState;
|
||||||
@@ -9,74 +10,190 @@ use crate::discovery::DiscoveredHost;
|
|||||||
use crate::trust::KnownHosts;
|
use crate::trust::KnownHosts;
|
||||||
use windows_reactor::*;
|
use windows_reactor::*;
|
||||||
|
|
||||||
|
/// Overflow-menu item labels — `on_menu_item_clicked` reports the clicked item by its text.
|
||||||
|
const MENU_CONNECT: &str = "Connect";
|
||||||
|
const MENU_SPEED: &str = "Test network speed\u{2026}";
|
||||||
|
const MENU_RENAME: &str = "Rename\u{2026}";
|
||||||
|
const MENU_FORGET: &str = "Forget\u{2026}";
|
||||||
|
|
||||||
|
/// Tile-grid metrics: minimum tile width before dropping a column, and the gap between tiles.
|
||||||
|
const TILE_MIN_WIDTH: f64 = 320.0;
|
||||||
|
const TILE_GAP: f64 = 12.0;
|
||||||
|
|
||||||
/// Props for the hosts page: the services plus the changing discovery/status data that must
|
/// Props for the hosts page: the services plus the changing discovery/status data that must
|
||||||
/// drive its re-render (compared by value, so a new host list or error refreshes the page).
|
/// drive its re-render (compared by value, so a new host list or error refreshes the page).
|
||||||
|
///
|
||||||
|
/// `forget` and `rename` are the per-host action state, and they live in ROOT (not this page's
|
||||||
|
/// own `use_state`) on purpose: the "…" overflow is a WinUI `MenuFlyout`, whose item clicks are
|
||||||
|
/// wired directly in the reactor backend (`add_Click`) and so bypass the normal event-dispatch
|
||||||
|
/// flush — a *sync* child `SetState` from that handler marks state dirty but never pumps the
|
||||||
|
/// reconciler, so nothing re-renders. Root `AsyncSetState` re-renders the whole tree; because
|
||||||
|
/// these values are props, the changed value propagates back into this page (a child's own async
|
||||||
|
/// state would be memoised away when its props are unchanged). `(fp_hex, _)` in each identifies
|
||||||
|
/// the target saved host; `rename`'s second field is the in-progress draft name.
|
||||||
#[derive(Clone)]
|
#[derive(Clone)]
|
||||||
pub(crate) struct HostsProps {
|
pub(crate) struct HostsProps {
|
||||||
pub(crate) svc: Svc,
|
pub(crate) svc: Svc,
|
||||||
pub(crate) hosts: Vec<DiscoveredHost>,
|
pub(crate) hosts: Vec<DiscoveredHost>,
|
||||||
pub(crate) status: String,
|
pub(crate) status: String,
|
||||||
|
pub(crate) forget: Option<(String, String)>,
|
||||||
|
pub(crate) rename: Option<(String, String)>,
|
||||||
|
/// Whether the "Add host" modal is open. Root state (like `forget`/`rename`), not the page's
|
||||||
|
/// own `use_state`: a child component's sync `SetState` marks its slot dirty but does not
|
||||||
|
/// re-render when its props are otherwise unchanged, so the toggle wouldn't take.
|
||||||
|
pub(crate) show_add: bool,
|
||||||
|
pub(crate) set_forget: AsyncSetState<Option<(String, String)>>,
|
||||||
|
pub(crate) set_rename: AsyncSetState<Option<(String, String)>>,
|
||||||
|
pub(crate) set_show_add: AsyncSetState<bool>,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl PartialEq for HostsProps {
|
impl PartialEq for HostsProps {
|
||||||
fn eq(&self, other: &Self) -> bool {
|
fn eq(&self, other: &Self) -> bool {
|
||||||
self.svc == other.svc && self.hosts == other.hosts && self.status == other.status
|
// Setters are identity-stable; only the value fields drive re-render.
|
||||||
|
self.svc == other.svc
|
||||||
|
&& self.hosts == other.hosts
|
||||||
|
&& self.status == other.status
|
||||||
|
&& self.forget == other.forget
|
||||||
|
&& self.rename == other.rename
|
||||||
|
&& self.show_add == other.show_add
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// A clickable host row: monogram + name/address + optional action buttons + status pill +
|
/// A host tile. The tap-to-connect summary (monogram, name, address, status row) and the
|
||||||
/// chevron. `actions` land between the text and the pill (saved hosts: speed test / forget).
|
/// optional "…" menu button are SIBLINGS overlaid in one grid cell, never nested: WinUI bubbles
|
||||||
fn host_card(
|
/// `Tapped` out of buttons (reactor doesn't mark it handled), so a button inside the tap target
|
||||||
|
/// would fire both its own click and the tile's connect (the old forget-also-connects bug).
|
||||||
|
fn host_tile(
|
||||||
name: &str,
|
name: &str,
|
||||||
sub: &str,
|
sub: &str,
|
||||||
badge: &str,
|
status_row: Element,
|
||||||
actions: Vec<Element>,
|
menu: Option<Button>,
|
||||||
on_tap: impl Fn() + 'static,
|
on_tap: Option<Box<dyn Fn()>>,
|
||||||
) -> Element {
|
) -> Element {
|
||||||
let kind = match badge {
|
let mut summary = border(
|
||||||
"Paired" => Pill::Good,
|
|
||||||
"Open" => Pill::Neutral,
|
|
||||||
_ => Pill::Accent, // Trusted / PIN
|
|
||||||
};
|
|
||||||
card(
|
|
||||||
grid((
|
|
||||||
avatar(name)
|
|
||||||
.grid_column(0)
|
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
|
||||||
vstack((
|
vstack((
|
||||||
text_block(name).font_size(15.0).semibold(),
|
avatar(name)
|
||||||
|
.width(44.0)
|
||||||
|
.height(44.0)
|
||||||
|
.horizontal_alignment(HorizontalAlignment::Left),
|
||||||
|
text_block(name)
|
||||||
|
.font_size(15.0)
|
||||||
|
.semibold()
|
||||||
|
.margin(edges(0.0, 12.0, 0.0, 0.0)),
|
||||||
text_block(sub)
|
text_block(sub)
|
||||||
.font_size(12.0)
|
.font_size(12.0)
|
||||||
.foreground(ThemeRef::SecondaryText),
|
.font_family("Consolas")
|
||||||
))
|
|
||||||
.spacing(2.0)
|
|
||||||
.grid_column(1)
|
|
||||||
.vertical_alignment(VerticalAlignment::Center)
|
|
||||||
.margin(edges(12.0, 0.0, 0.0, 0.0)),
|
|
||||||
hstack(actions)
|
|
||||||
.spacing(4.0)
|
|
||||||
.grid_column(2)
|
|
||||||
.vertical_alignment(VerticalAlignment::Center)
|
|
||||||
.margin(edges(0.0, 0.0, 10.0, 0.0)),
|
|
||||||
pill(badge, kind)
|
|
||||||
.grid_column(3)
|
|
||||||
.vertical_alignment(VerticalAlignment::Center)
|
|
||||||
.margin(edges(0.0, 0.0, 10.0, 0.0)),
|
|
||||||
text_block("\u{203A}")
|
|
||||||
.font_size(18.0)
|
|
||||||
.foreground(ThemeRef::SecondaryText)
|
.foreground(ThemeRef::SecondaryText)
|
||||||
.grid_column(4)
|
.margin(edges(0.0, 2.0, 0.0, 0.0)),
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
status_row,
|
||||||
))
|
))
|
||||||
.columns([
|
.spacing(0.0),
|
||||||
GridLength::Auto,
|
)
|
||||||
GridLength::Star(1.0),
|
.background(hit_test_backstop())
|
||||||
GridLength::Auto,
|
.padding(uniform(18.0));
|
||||||
GridLength::Auto,
|
if let Some(f) = on_tap {
|
||||||
GridLength::Auto,
|
summary = summary.on_tapped(f);
|
||||||
]),
|
}
|
||||||
|
|
||||||
|
let mut children: Vec<Element> = vec![summary.into()];
|
||||||
|
if let Some(m) = menu {
|
||||||
|
children.push(
|
||||||
|
m.horizontal_alignment(HorizontalAlignment::Right)
|
||||||
|
.vertical_alignment(VerticalAlignment::Top)
|
||||||
|
.margin(edges(0.0, 8.0, 8.0, 0.0))
|
||||||
|
.into(),
|
||||||
|
);
|
||||||
|
}
|
||||||
|
card_flush(grid(children)).into()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// The status row at the bottom of a tile: presence dot + Online/Offline, plus the trust chip.
|
||||||
|
fn status_row(online: Option<bool>, badge: &str, kind: Pill) -> Element {
|
||||||
|
let mut items: Vec<Element> = Vec::new();
|
||||||
|
if let Some(online) = online {
|
||||||
|
items.push(
|
||||||
|
presence_dot(online)
|
||||||
|
.vertical_alignment(VerticalAlignment::Center)
|
||||||
|
.into(),
|
||||||
|
);
|
||||||
|
items.push(
|
||||||
|
text_block(if online { "Online" } else { "Offline" })
|
||||||
|
.font_size(11.0)
|
||||||
|
.foreground(ThemeRef::SecondaryText)
|
||||||
|
.vertical_alignment(VerticalAlignment::Center)
|
||||||
|
.into(),
|
||||||
|
);
|
||||||
|
}
|
||||||
|
items.push(
|
||||||
|
pill(badge, kind)
|
||||||
|
.vertical_alignment(VerticalAlignment::Center)
|
||||||
|
.into(),
|
||||||
|
);
|
||||||
|
hstack(items)
|
||||||
|
.spacing(6.0)
|
||||||
|
.margin(edges(0.0, 12.0, 0.0, 0.0))
|
||||||
|
.into()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Lay tiles into a `cols`-wide grid of equal-width star columns (rows share the height of
|
||||||
|
/// their tallest tile, so a grid row always lines up).
|
||||||
|
fn tile_grid(tiles: Vec<Element>, cols: usize) -> Element {
|
||||||
|
let rows = tiles.len().div_ceil(cols);
|
||||||
|
let mut children = Vec::with_capacity(tiles.len());
|
||||||
|
for (i, t) in tiles.into_iter().enumerate() {
|
||||||
|
children.push(t.grid_row((i / cols) as i32).grid_column((i % cols) as i32));
|
||||||
|
}
|
||||||
|
grid(children)
|
||||||
|
.columns(vec![GridLength::Star(1.0); cols])
|
||||||
|
.rows(vec![GridLength::Auto; rows])
|
||||||
|
.column_spacing(TILE_GAP)
|
||||||
|
.row_spacing(TILE_GAP)
|
||||||
|
.into()
|
||||||
|
}
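
The index arithmetic in `tile_grid` above can be checked in isolation. This is a small sketch, not part of the commit: `grid_cells` is a hypothetical helper that returns the row count and the `(row, column)` cell of each tile using the same `div_ceil`, `i / cols` and `i % cols` expressions.

```rust
/// Row count and (row, column) cell for each of `n` tiles in a `cols`-wide
/// grid. `cols` must be at least 1 (division by zero otherwise).
fn grid_cells(n: usize, cols: usize) -> (usize, Vec<(usize, usize)>) {
    // Enough rows to hold every tile, with a partially filled last row.
    let rows = n.div_ceil(cols);
    // Tiles fill left to right, then wrap to the next row.
    let cells = (0..n).map(|i| (i / cols, i % cols)).collect();
    (rows, cells)
}
```

Five tiles in three columns give two rows, with the fourth and fifth tile landing in the first two cells of the second row.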
|
||||||
|
|
||||||
|
/// The in-tile rename editor (ContentDialog can't hold a text field): name box + save/cancel.
|
||||||
|
/// No tap-to-connect while editing — a click into the box would bubble `Tapped` to the region.
|
||||||
|
fn rename_editor(
|
||||||
|
draft: &str,
|
||||||
|
fp: String,
|
||||||
|
set_rename: AsyncSetState<Option<(String, String)>>,
|
||||||
|
) -> Element {
|
||||||
|
let commit = {
|
||||||
|
let (fp, draft, sr) = (fp.clone(), draft.to_string(), set_rename.clone());
|
||||||
|
move || {
|
||||||
|
let name = draft.trim();
|
||||||
|
if !name.is_empty() {
|
||||||
|
let mut known = KnownHosts::load();
|
||||||
|
if let Some(h) = known.hosts.iter_mut().find(|h| h.fp_hex == fp) {
|
||||||
|
h.name = name.to_string();
|
||||||
|
}
|
||||||
|
let _ = known.save();
|
||||||
|
}
|
||||||
|
sr.call(None);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
let on_changed = {
|
||||||
|
let sr = set_rename.clone();
|
||||||
|
move |s: String| sr.call(Some((fp.clone(), s)))
|
||||||
|
};
|
||||||
|
card(
|
||||||
|
vstack((
|
||||||
|
text_box(draft)
|
||||||
|
.placeholder("Host name")
|
||||||
|
.on_changed(on_changed),
|
||||||
|
hstack((
|
||||||
|
button("Save")
|
||||||
|
.accent()
|
||||||
|
.icon(SymbolGlyph::Accept)
|
||||||
|
.on_click(commit),
|
||||||
|
button("Cancel")
|
||||||
|
.subtle()
|
||||||
|
.on_click(move || set_rename.call(None)),
|
||||||
|
))
|
||||||
|
.spacing(4.0),
|
||||||
|
))
|
||||||
|
.spacing(10.0),
|
||||||
)
|
)
|
||||||
.on_tapped(on_tap)
|
|
||||||
.into()
|
.into()
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -87,11 +204,23 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
let set_screen = &props.svc.set_screen;
|
let set_screen = &props.svc.set_screen;
|
||||||
let set_status = &props.svc.set_status;
|
let set_status = &props.svc.set_status;
|
||||||
let (manual, set_manual) = cx.use_state(String::new());
|
let (manual, set_manual) = cx.use_state(String::new());
|
||||||
// Pending "forget host" confirmation: `(fp_hex, name)` of the saved host to drop. Drives the
|
// "Add host" modal open state lives in ROOT (see `HostsProps`).
|
||||||
// ContentDialog below; sync state, so setting it re-renders this page.
|
let show_add = props.show_add;
|
||||||
let (forget, set_forget) = cx.use_state(Option::<(String, String)>::None);
|
let set_show_add = &props.set_show_add;
|
||||||
|
// Forget confirmation and in-progress rename live in ROOT state (see `HostsProps`) — the
|
||||||
|
// overflow menu's flyout clicks can't re-render off a sync setter. Both are `(fp_hex, _)`.
|
||||||
|
let forget = props.forget.clone();
|
||||||
|
let rename = props.rename.clone();
|
||||||
|
let set_forget = &props.set_forget;
|
||||||
|
let set_rename = &props.set_rename;
|
||||||
let known = KnownHosts::load();
|
let known = KnownHosts::load();
|
||||||
|
|
||||||
|
// Responsive column count from the live window width (re-renders on resize): as many
|
||||||
|
// TILE_MIN_WIDTH columns as fit the page's content width, at least one.
|
||||||
|
let window = cx.use_inner_size();
|
||||||
|
let content_w = (window.width - 64.0).clamp(TILE_MIN_WIDTH, 1120.0);
|
||||||
|
let cols = (((content_w + TILE_GAP) / (TILE_MIN_WIDTH + TILE_GAP)).floor() as usize).max(1);
|
||||||
|
|
||||||
let mut body: Vec<Element> = Vec::new();
|
let mut body: Vec<Element> = Vec::new();
|
||||||
|
|
||||||
// Header: title block + Settings button.
|
// Header: title block + Settings button.
|
||||||
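Note on the responsive column count in the hunk above: `n` tiles of minimum width `w` separated by gap `g` need `n*w + (n-1)*g` of content width, which rearranges to `n <= (content + g) / (w + g)`. The sketch below reproduces that arithmetic on its own. The constant values and the `column_count` name are assumptions for illustration (the real `TILE_MIN_WIDTH` and `TILE_GAP` values are not visible in this diff); the 64 px padding and 1120 px cap are taken from the hunk.

```rust
/// Hypothetical stand-ins for the crate's constants; the real values are not shown here.
const TILE_MIN_WIDTH: f64 = 260.0;
const TILE_GAP: f64 = 12.0;

/// Number of tile columns that fit a window `window_width` DIPs wide (hypothetical helper).
fn column_count(window_width: f64) -> usize {
    // Subtract the page padding, then keep the content width between one tile and the cap.
    let content_w = (window_width - 64.0).clamp(TILE_MIN_WIDTH, 1120.0);
    // n tiles need n*w + (n-1)*g, so n <= (content + g) / (w + g).
    let cols = ((content_w + TILE_GAP) / (TILE_MIN_WIDTH + TILE_GAP)).floor() as usize;
    // The clamp already guarantees one tile fits; `max(1)` is a cheap safety net.
    cols.max(1)
}
```

With these assumed constants, the default 1000 px window yields three columns and anything wider than the 1120 px cap stays at four.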
@@ -105,17 +234,25 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
.spacing(2.0)
|
.spacing(2.0)
|
||||||
.grid_column(0)
|
.grid_column(0)
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
.vertical_alignment(VerticalAlignment::Center),
|
||||||
button("Settings")
|
hstack((
|
||||||
.icon(SymbolGlyph::Setting)
|
button("Add host")
|
||||||
|
.accent()
|
||||||
|
.icon(SymbolGlyph::Add)
|
||||||
.on_click({
|
.on_click({
|
||||||
|
let sa = set_show_add.clone();
|
||||||
|
move || sa.call(true)
|
||||||
|
}),
|
||||||
|
button("Settings").icon(SymbolGlyph::Setting).on_click({
|
||||||
let ss = set_screen.clone();
|
let ss = set_screen.clone();
|
||||||
move || ss.call(Screen::Settings)
|
move || ss.call(Screen::Settings)
|
||||||
})
|
}),
|
||||||
|
))
|
||||||
|
.spacing(8.0)
|
||||||
.grid_column(1)
|
.grid_column(1)
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
.vertical_alignment(VerticalAlignment::Center),
|
||||||
))
|
))
|
||||||
.columns([GridLength::Star(1.0), GridLength::Auto])
|
.columns([GridLength::Star(1.0), GridLength::Auto])
|
||||||
.margin(edges(0.0, 0.0, 0.0, 6.0))
|
.margin(edges(0.0, 0.0, 0.0, 10.0))
|
||||||
.into(),
|
.into(),
|
||||||
);
|
);
|
||||||
|
|
||||||
@@ -129,10 +266,18 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
// Saved (trusted/paired) hosts — reachable even when mDNS isn't.
|
// Saved (trusted/paired) hosts — reachable even when mDNS isn't. A saved host that's also
|
||||||
|
// being advertised right now shows as Online (and is deduped out of the discovery section).
|
||||||
if !known.hosts.is_empty() {
|
if !known.hosts.is_empty() {
|
||||||
body.push(section("SAVED HOSTS"));
|
body.push(section("SAVED HOSTS"));
|
||||||
|
let mut tiles: Vec<Element> = Vec::new();
|
||||||
for k in &known.hosts {
|
for k in &known.hosts {
|
||||||
|
// Rust 2021 (no let-chains): match the "this tile is being renamed" case explicitly.
|
||||||
|
if matches!(&rename, Some((fp, _)) if fp == &k.fp_hex) {
|
||||||
|
let (fp, draft) = rename.clone().unwrap();
|
||||||
|
tiles.push(rename_editor(&draft, fp, set_rename.clone()));
|
||||||
|
continue;
|
||||||
|
}
|
||||||
let target = Target {
|
let target = Target {
|
||||||
name: k.name.clone(),
|
name: k.name.clone(),
|
||||||
addr: k.addr.clone(),
|
addr: k.addr.clone(),
|
||||||
@@ -140,45 +285,72 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
fp_hex: Some(k.fp_hex.clone()),
|
fp_hex: Some(k.fp_hex.clone()),
|
||||||
pair_optional: false,
|
pair_optional: false,
|
||||||
};
|
};
|
||||||
// Per-host actions: measure the path (probe burst → recommended bitrate) and forget
|
let online = hosts
|
||||||
// (drops the pinned fingerprint — a later connect re-pairs).
|
.iter()
|
||||||
let speed_btn = {
|
.any(|h| h.fp_hex == k.fp_hex || (h.addr == k.addr && h.port == k.port));
|
||||||
|
let menu = {
|
||||||
let (svc, target) = (props.svc.clone(), target.clone());
|
let (svc, target) = (props.svc.clone(), target.clone());
|
||||||
button("Test")
|
let (sf, sr) = (set_forget.clone(), set_rename.clone());
|
||||||
.icon(SymbolGlyph::Sync)
|
let (fp, name) = (k.fp_hex.clone(), k.name.clone());
|
||||||
|
button("")
|
||||||
|
.icon(SymbolGlyph::More)
|
||||||
.subtle()
|
.subtle()
|
||||||
.on_click(move || {
|
.tooltip("More options")
|
||||||
|
.automation_name("More options")
|
||||||
|
.menu_flyout(vec![
|
||||||
|
menu_item(MENU_CONNECT),
|
||||||
|
menu_item(MENU_SPEED),
|
||||||
|
menu_item(MENU_RENAME),
|
||||||
|
menu_separator(),
|
||||||
|
menu_item(MENU_FORGET),
|
||||||
|
])
|
||||||
|
.on_menu_item_clicked(move |item: String| match item.as_str() {
|
||||||
|
MENU_CONNECT => {
|
||||||
|
initiate(&svc.ctx, target.clone(), &svc.set_screen, &svc.set_status)
|
||||||
|
}
|
||||||
|
MENU_SPEED => {
|
||||||
*svc.ctx.shared.target.lock().unwrap() = target.clone();
|
*svc.ctx.shared.target.lock().unwrap() = target.clone();
|
||||||
// New run: invalidate any still-in-flight probe and reset the screen.
|
// New run: invalidate any still-in-flight probe, reset the screen.
|
||||||
svc.ctx
|
svc.ctx
|
||||||
.shared
|
.shared
|
||||||
.speed_gen
|
.speed_gen
|
||||||
.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
|
.fetch_add(1, std::sync::atomic::Ordering::SeqCst);
|
||||||
svc.set_speed.call(SpeedState::Running);
|
svc.set_speed.call(SpeedState::Running);
|
||||||
svc.set_screen.call(Screen::SpeedTest);
|
svc.set_screen.call(Screen::SpeedTest);
|
||||||
|
}
|
||||||
|
MENU_RENAME => sr.call(Some((fp.clone(), name.clone()))),
|
||||||
|
MENU_FORGET => sf.call(Some((fp.clone(), name.clone()))),
|
||||||
|
_ => {}
|
||||||
})
|
})
|
||||||
};
|
};
|
||||||
let forget_btn = {
|
|
||||||
let (sf, fp, name) = (set_forget.clone(), k.fp_hex.clone(), k.name.clone());
|
|
||||||
button("Forget")
|
|
||||||
.icon(SymbolGlyph::Delete)
|
|
||||||
.subtle()
|
|
||||||
.on_click(move || sf.call(Some((fp.clone(), name.clone()))))
|
|
||||||
};
|
|
||||||
let (ctx2, ss, st) = (ctx.clone(), set_screen.clone(), set_status.clone());
|
let (ctx2, ss, st) = (ctx.clone(), set_screen.clone(), set_status.clone());
|
||||||
body.push(host_card(
|
tiles.push(host_tile(
|
||||||
&k.name,
|
&k.name,
|
||||||
&format!("{}:{}", k.addr, k.port),
|
&format!("{}:{}", k.addr, k.port),
|
||||||
|
status_row(
|
||||||
|
Some(online),
|
||||||
if k.paired { "Paired" } else { "Trusted" },
|
if k.paired { "Paired" } else { "Trusted" },
|
||||||
vec![speed_btn.into(), forget_btn.into()],
|
if k.paired { Pill::Good } else { Pill::Info },
|
||||||
move || initiate(&ctx2, target.clone(), &ss, &st),
|
),
|
||||||
|
Some(menu),
|
||||||
|
Some(Box::new(move || initiate(&ctx2, target.clone(), &ss, &st))),
|
||||||
));
|
));
|
||||||
}
|
}
|
||||||
|
body.push(tile_grid(tiles, cols));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Discovered hosts.
|
// Discovered hosts not already saved above.
|
||||||
body.push(section("ON YOUR NETWORK"));
|
body.push(section("ON THIS NETWORK"));
|
||||||
if hosts.is_empty() {
|
let discovered: Vec<&DiscoveredHost> = hosts
|
||||||
|
.iter()
|
||||||
|
.filter(|h| {
|
||||||
|
!known.hosts.iter().any(|k| {
|
||||||
|
(!h.fp_hex.is_empty() && k.fp_hex == h.fp_hex)
|
||||||
|
|| (k.addr == h.addr && k.port == h.port)
|
||||||
|
})
|
||||||
|
})
|
||||||
|
.collect();
|
||||||
|
if discovered.is_empty() {
|
||||||
body.push(
|
body.push(
|
||||||
card(
|
card(
|
||||||
hstack((
|
hstack((
|
||||||
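Note on the discovery dedupe in the hunk above: a discovered host is hidden when a saved host has the same fingerprint or the same `addr:port`, and the `!h.fp_hex.is_empty()` guard keeps two hosts with empty fingerprints from matching each other. A minimal sketch of that filter follows, using simplified stand-in structs (`Known`, `Discovered`, and `unsaved` are hypothetical names, and `u16` for the port is an assumption, not the crate's actual types).

```rust
/// Simplified stand-ins for the saved and discovered host records (hypothetical).
struct Known {
    fp_hex: String,
    addr: String,
    port: u16,
}

struct Discovered {
    fp_hex: String,
    addr: String,
    port: u16,
}

/// Discovered hosts that are not already covered by a saved host.
fn unsaved<'a>(hosts: &'a [Discovered], known: &[Known]) -> Vec<&'a Discovered> {
    hosts
        .iter()
        .filter(|h| {
            !known.iter().any(|k| {
                // Match on fingerprint only when the advert actually carries one,
                // otherwise two empty fingerprints would compare equal.
                (!h.fp_hex.is_empty() && k.fp_hex == h.fp_hex)
                    // Fall back to the endpoint for hosts advertised without a fingerprint.
                    || (k.addr == h.addr && k.port == h.port)
            })
        })
        .collect()
}
```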
@@ -190,7 +362,8 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
.into(),
|
.into(),
|
||||||
);
|
);
|
||||||
} else {
|
} else {
|
||||||
for h in hosts {
|
let mut tiles: Vec<Element> = Vec::new();
|
||||||
|
for h in discovered {
|
||||||
let target = Target {
|
let target = Target {
|
||||||
name: h.name.clone(),
|
name: h.name.clone(),
|
||||||
addr: h.addr.clone(),
|
addr: h.addr.clone(),
|
||||||
@@ -199,69 +372,22 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
pair_optional: h.pair == "optional",
|
pair_optional: h.pair == "optional",
|
||||||
};
|
};
|
||||||
let (ctx2, ss, st) = (ctx.clone(), set_screen.clone(), set_status.clone());
|
let (ctx2, ss, st) = (ctx.clone(), set_screen.clone(), set_status.clone());
|
||||||
let badge = if h.pair == "required" { "PIN" } else { "Open" };
|
let (badge, kind) = if h.pair == "required" {
|
||||||
body.push(host_card(
|
("PIN", Pill::Info)
|
||||||
|
} else {
|
||||||
|
("Open", Pill::Neutral)
|
||||||
|
};
|
||||||
|
tiles.push(host_tile(
|
||||||
&h.name,
|
&h.name,
|
||||||
&format!("{}:{}", h.addr, h.port),
|
&format!("{}:{}", h.addr, h.port),
|
||||||
badge,
|
status_row(None, badge, kind),
|
||||||
Vec::new(),
|
None,
|
||||||
move || initiate(&ctx2, target.clone(), &ss, &st),
|
Some(Box::new(move || initiate(&ctx2, target.clone(), &ss, &st))),
|
||||||
));
|
));
|
||||||
}
|
}
|
||||||
|
body.push(tile_grid(tiles, cols));
|
||||||
}
|
}
|
||||||
|
|
||||||
// Manual connection.
|
|
||||||
body.push(section("CONNECT MANUALLY"));
|
|
||||||
let connect_manual = {
|
|
||||||
let (ctx2, ss, st, text) = (
|
|
||||||
ctx.clone(),
|
|
||||||
set_screen.clone(),
|
|
||||||
set_status.clone(),
|
|
||||||
manual.clone(),
|
|
||||||
);
|
|
||||||
move || {
|
|
||||||
let text = text.trim();
|
|
||||||
if text.is_empty() {
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
let (addr, port) = match text.rsplit_once(':') {
|
|
||||||
Some((a, p)) => (a.to_string(), p.parse().unwrap_or(9777)),
|
|
||||||
None => (text.to_string(), 9777),
|
|
||||||
};
|
|
||||||
initiate(
|
|
||||||
&ctx2,
|
|
||||||
Target {
|
|
||||||
name: addr.clone(),
|
|
||||||
addr,
|
|
||||||
port,
|
|
||||||
fp_hex: None,
|
|
||||||
pair_optional: false,
|
|
||||||
},
|
|
||||||
&ss,
|
|
||||||
&st,
|
|
||||||
);
|
|
||||||
}
|
|
||||||
};
|
|
||||||
body.push(
|
|
||||||
card(
|
|
||||||
grid((
|
|
||||||
text_box(manual)
|
|
||||||
.placeholder("host or host:port")
|
|
||||||
.on_changed(move |s| set_manual.call(s))
|
|
||||||
.grid_column(0)
|
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
|
||||||
button("Connect")
|
|
||||||
.accent()
|
|
||||||
.icon(SymbolGlyph::Forward)
|
|
||||||
.on_click(connect_manual)
|
|
||||||
.grid_column(1)
|
|
||||||
.margin(edges(8.0, 0.0, 0.0, 0.0)),
|
|
||||||
))
|
|
||||||
.columns([GridLength::Star(1.0), GridLength::Auto]),
|
|
||||||
)
|
|
||||||
.into(),
|
|
||||||
);
|
|
||||||
|
|
||||||
// Forget confirmation (modal; shown while `forget` holds a pending host). Confirmed first,
|
// Forget confirmation (modal; shown while `forget` holds a pending host). Confirmed first,
|
||||||
// since it's destructive and re-establishing trust needs a fresh pairing.
|
// since it's destructive and re-establishing trust needs a fresh pairing.
|
||||||
if let Some((fp, name)) = forget {
|
if let Some((fp, name)) = forget {
|
||||||
@@ -287,5 +413,88 @@ pub(crate) fn hosts_page(props: &HostsProps, cx: &mut RenderCx) -> Element {
|
|||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
page(body)
|
let page = page_wide(body);
|
||||||
|
if !show_add {
|
||||||
|
return page;
|
||||||
|
}
|
||||||
|
|
||||||
|
// "Add host" modal: a scrim + centered card. It's an in-tree overlay, not a WinUI
|
||||||
|
// ContentDialog, because ContentDialog is text-only in windows-reactor (no room for a text
|
||||||
|
// field). The scrim border fills the cell and is hit-testable, so it blocks the page behind;
|
||||||
|
// it closes only via Cancel/Connect (a scrim tap would bubble `Tapped` up from the card too).
|
||||||
|
let connect_manual = {
|
||||||
|
let (ctx2, ss, st, text, sa) = (
|
||||||
|
ctx.clone(),
|
||||||
|
set_screen.clone(),
|
||||||
|
set_status.clone(),
|
||||||
|
manual.clone(),
|
||||||
|
set_show_add.clone(),
|
||||||
|
);
|
||||||
|
move || {
|
||||||
|
let text = text.trim();
|
||||||
|
if text.is_empty() {
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
let (addr, port) = match text.rsplit_once(':') {
|
||||||
|
Some((a, p)) => (a.to_string(), p.parse().unwrap_or(9777)),
|
||||||
|
None => (text.to_string(), 9777),
|
||||||
|
};
|
||||||
|
sa.call(false);
|
||||||
|
initiate(
|
||||||
|
&ctx2,
|
||||||
|
Target {
|
||||||
|
name: addr.clone(),
|
||||||
|
addr,
|
||||||
|
port,
|
||||||
|
fp_hex: None,
|
||||||
|
pair_optional: false,
|
||||||
|
},
|
||||||
|
&ss,
|
||||||
|
&st,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
let modal = dialog_surface(
|
||||||
|
vstack((
|
||||||
|
text_block("Add a host").font_size(20.0).bold(),
|
||||||
|
text_block(
|
||||||
|
"Enter the host's IP address or name. Append :port only for a non-standard port \
|
||||||
|
(the default is 9777).",
|
||||||
|
)
|
||||||
|
.font_size(13.0)
|
||||||
|
.wrap()
|
||||||
|
.foreground(ThemeRef::SecondaryText),
|
||||||
|
text_box(manual)
|
||||||
|
.header("Address")
|
||||||
|
.placeholder("192.168.1.20 or my-pc.local")
|
||||||
|
.on_changed(move |s| set_manual.call(s))
|
||||||
|
.margin(edges(0.0, 6.0, 0.0, 0.0)),
|
||||||
|
hstack((
|
||||||
|
button("Connect")
|
||||||
|
.accent()
|
||||||
|
.icon(SymbolGlyph::Forward)
|
||||||
|
.on_click(connect_manual),
|
||||||
|
button("Cancel").on_click({
|
||||||
|
let sa = set_show_add.clone();
|
||||||
|
move || sa.call(false)
|
||||||
|
}),
|
||||||
|
))
|
||||||
|
.spacing(8.0)
|
||||||
|
.horizontal_alignment(HorizontalAlignment::Right)
|
||||||
|
.margin(edges(0.0, 6.0, 0.0, 0.0)),
|
||||||
|
))
|
||||||
|
.spacing(12.0),
|
||||||
|
)
|
||||||
|
.max_width(460.0)
|
||||||
|
.horizontal_alignment(HorizontalAlignment::Center)
|
||||||
|
.vertical_alignment(VerticalAlignment::Center)
|
||||||
|
.margin(uniform(24.0));
|
||||||
|
|
||||||
|
let scrim = border(modal).background(Color {
|
||||||
|
a: 140,
|
||||||
|
r: 0,
|
||||||
|
g: 0,
|
||||||
|
b: 0,
|
||||||
|
});
|
||||||
|
grid(vec![page, scrim.into()]).into()
|
||||||
}
|
}
|
||||||
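Note on the address parsing in `connect_manual` above: the input is trimmed, split on its last `:`, and the port falls back to 9777 when it is absent or fails to parse. The standalone sketch below mirrors that logic; `parse_target`, `DEFAULT_PORT`, and the `u16` port type are assumptions for illustration, since the diff only shows the closure body.

```rust
/// Default port, as stated in the "Add a host" dialog text.
const DEFAULT_PORT: u16 = 9777;

/// Split "host" or "host:port" into its parts (hypothetical helper).
/// Returns `None` for blank input, matching the early return in the closure.
fn parse_target(input: &str) -> Option<(String, u16)> {
    let text = input.trim();
    if text.is_empty() {
        return None;
    }
    Some(match text.rsplit_once(':') {
        // An unparsable port silently falls back to the default, as in the diff.
        Some((host, port)) => (host.to_string(), port.parse().unwrap_or(DEFAULT_PORT)),
        None => (text.to_string(), DEFAULT_PORT),
    })
}
```

One consequence of splitting on the last colon is that a bare IPv6 literal such as `::1` is read as host `:` with port 1, so this form of parsing only suits IPv4 addresses and host names, which is what the dialog placeholder suggests.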
|
|||||||
@@ -35,7 +35,6 @@ use crate::discovery::{self, DiscoveredHost};
|
|||||||
use crate::gamepad::GamepadService;
|
use crate::gamepad::GamepadService;
|
||||||
use crate::session::Stats;
|
use crate::session::Stats;
|
||||||
use crate::trust::Settings;
|
use crate::trust::Settings;
|
||||||
use crate::video::DecodedFrame;
|
|
||||||
use hosts::HostsProps;
|
use hosts::HostsProps;
|
||||||
use punktfunk_core::client::NativeClient;
|
use punktfunk_core::client::NativeClient;
|
||||||
use speed::{SpeedProps, SpeedState};
|
use speed::{SpeedProps, SpeedState};
|
||||||
@@ -99,7 +98,7 @@ impl PartialEq for Svc {
|
|||||||
/// Cross-thread handoff from the session pump (off-thread) to the stream page (UI thread).
|
/// Cross-thread handoff from the session pump (off-thread) to the stream page (UI thread).
|
||||||
#[derive(Default)]
|
#[derive(Default)]
|
||||||
pub(crate) struct Shared {
|
pub(crate) struct Shared {
|
||||||
pub(crate) handoff: Mutex<Option<(Arc<NativeClient>, async_channel::Receiver<DecodedFrame>)>>,
|
pub(crate) handoff: Mutex<Option<(Arc<NativeClient>, crate::session::FrameRx)>>,
|
||||||
pub(crate) target: Mutex<Target>,
|
pub(crate) target: Mutex<Target>,
|
||||||
/// Latest stream stats, written by the session's event loop and mirrored into reactor state
|
/// Latest stream stats, written by the session's event loop and mirrored into reactor state
|
||||||
/// by the HUD poll thread to drive the overlay.
|
/// by the HUD poll thread to drive the overlay.
|
||||||
@@ -129,6 +128,7 @@ pub fn run(identity: (String, String), gamepad: GamepadService) -> windows_react
|
|||||||
gamepad,
|
gamepad,
|
||||||
shared: Arc::new(Shared::default()),
|
shared: Arc::new(Shared::default()),
|
||||||
});
|
});
|
||||||
|
apply_window_icon_when_ready();
|
||||||
App::new()
|
App::new()
|
||||||
.title("Punktfunk")
|
.title("Punktfunk")
|
||||||
.inner_size(1000.0, 720.0)
|
.inner_size(1000.0, 720.0)
|
||||||
@@ -136,12 +136,66 @@ pub fn run(identity: (String, String), gamepad: GamepadService) -> windows_react
|
|||||||
.render(move |cx| root(cx, &ctx))
|
.render(move |cx| root(cx, &ctx))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Stamp the embedded app icon (build.rs, resource ordinal 1) onto the top-level window once it
|
||||||
|
/// exists: `WM_SETICON` drives the title bar and Alt-Tab (plus the taskbar for unpackaged runs;
|
||||||
|
/// the MSIX taskbar/Start icons come from the package assets). windows-reactor creates its
|
||||||
|
/// window icon-less and exposes no handle before `App::render` blocks, so a short background
|
||||||
|
/// poll finds our own window by its (unique) title.
|
||||||
|
fn apply_window_icon_when_ready() {
|
||||||
|
use windows::Win32::Foundation::{LPARAM, WPARAM};
|
||||||
|
use windows::Win32::System::LibraryLoader::GetModuleHandleW;
|
||||||
|
use windows::Win32::UI::WindowsAndMessaging::{
|
||||||
|
FindWindowW, GetSystemMetrics, LoadImageW, SendMessageW, ICON_BIG, ICON_SMALL, IMAGE_ICON,
|
||||||
|
LR_DEFAULTCOLOR, SM_CXICON, SM_CXSMICON, WM_SETICON,
|
||||||
|
};
|
||||||
|
let _ = std::thread::Builder::new()
|
||||||
|
.name("pf-window-icon".into())
|
||||||
|
.spawn(|| unsafe {
|
||||||
|
for _ in 0..100 {
|
||||||
|
if let Ok(hwnd) = FindWindowW(None, windows::core::w!("Punktfunk")) {
|
||||||
|
let Ok(module) = GetModuleHandleW(None) else {
|
||||||
|
return;
|
||||||
|
};
|
||||||
|
// Small (title bar) and big (Alt-Tab) at their native metrics, both from
|
||||||
|
// the multi-size .ico so nothing is scaled at draw time.
|
||||||
|
for (which, metric) in [(ICON_SMALL, SM_CXSMICON), (ICON_BIG, SM_CXICON)] {
|
||||||
|
let px = GetSystemMetrics(metric);
|
||||||
|
if let Ok(icon) = LoadImageW(
|
||||||
|
Some(module.into()),
|
||||||
|
windows::core::PCWSTR(1 as *const u16),
|
||||||
|
IMAGE_ICON,
|
||||||
|
px,
|
||||||
|
px,
|
||||||
|
LR_DEFAULTCOLOR,
|
||||||
|
) {
|
||||||
|
SendMessageW(
|
||||||
|
hwnd,
|
||||||
|
WM_SETICON,
|
||||||
|
Some(WPARAM(which as usize)),
|
||||||
|
Some(LPARAM(icon.0 as isize)),
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
std::thread::sleep(std::time::Duration::from_millis(50));
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
fn root(cx: &mut RenderCx, ctx: &Arc<AppCtx>) -> Element {
|
fn root(cx: &mut RenderCx, ctx: &Arc<AppCtx>) -> Element {
|
||||||
let (screen, set_screen) = cx.use_async_state(Screen::Hosts);
|
let (screen, set_screen) = cx.use_async_state(Screen::Hosts);
|
||||||
let (hosts, set_hosts) = cx.use_async_state(Vec::<DiscoveredHost>::new());
|
let (hosts, set_hosts) = cx.use_async_state(Vec::<DiscoveredHost>::new());
|
||||||
let (status, set_status) = cx.use_async_state(String::new());
|
let (status, set_status) = cx.use_async_state(String::new());
|
||||||
let (hud, set_hud) = cx.use_async_state(stream::HudSample::default());
|
let (hud, set_hud) = cx.use_async_state(stream::HudSample::default());
|
||||||
let (speed, set_speed) = cx.use_async_state(SpeedState::Running);
|
let (speed, set_speed) = cx.use_async_state(SpeedState::Running);
|
||||||
|
// Per-host action state for the hosts page. Root, not page-local: the "…" overflow is a WinUI
|
||||||
|
// MenuFlyout whose item clicks are wired straight in the reactor backend, bypassing the normal
|
||||||
|
// event-dispatch flush — a sync page-local setter marks state dirty but never re-renders. See
|
||||||
|
// `hosts::HostsProps`.
|
||||||
|
let (forget, set_forget) = cx.use_async_state(Option::<(String, String)>::None);
|
||||||
|
let (rename, set_rename) = cx.use_async_state(Option::<(String, String)>::None);
|
||||||
|
let (show_add, set_show_add) = cx.use_async_state(false);
|
||||||
|
|
||||||
// Continuous LAN discovery (spawned once).
|
// Continuous LAN discovery (spawned once).
|
||||||
cx.use_effect((), {
|
cx.use_effect((), {
|
||||||
@@ -183,6 +237,43 @@ fn root(cx: &mut RenderCx, ctx: &Arc<AppCtx>) -> Element {
|
|||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
|
// Screen-entrance animation: each navigation slides the new screen up a few px while fading it
|
||||||
|
// in (the Windows-Settings drill-in). It's a manual tween, not a composition animation, because
|
||||||
|
// reactor's DSL exposes no static transform/translation setter and its one-shot animations run
|
||||||
|
// from the visual's CURRENT value (a shown element is already at opacity 1, so nothing to fade
|
||||||
|
// from). So a worker thread steps a 0 → 1 `progress` after each navigation; the wrapper maps it
|
||||||
|
// to opacity (= progress) and a top margin (= (1-progress)·offset). The page components are
|
||||||
|
// memoised on unchanged props, so each step is just a cheap root re-render updating two props.
|
||||||
|
// A generation guard (bumped per navigation) stops a superseded tween so rapid nav can't fight.
|
||||||
|
let anim_gen = cx.use_ref(std::sync::Arc::new(std::sync::atomic::AtomicU64::new(0)));
|
||||||
|
let (anim, set_anim) = cx.use_async_state((Option::<Screen>::None, 1.0f64));
|
||||||
|
cx.use_effect(screen.clone(), {
|
||||||
|
let (s, set_anim, gen) = (screen.clone(), set_anim.clone(), anim_gen.borrow().clone());
|
||||||
|
move || {
|
||||||
|
use std::sync::atomic::Ordering::SeqCst;
|
||||||
|
let mine = gen.fetch_add(1, SeqCst) + 1;
|
||||||
|
std::thread::spawn(move || {
|
||||||
|
const STEPS: u32 = 14;
|
||||||
|
for i in 0..=STEPS {
|
||||||
|
if gen.load(SeqCst) != mine {
|
||||||
|
return; // a newer navigation superseded this tween
|
||||||
|
}
|
||||||
|
let p = f64::from(i) / f64::from(STEPS);
|
||||||
|
let eased = 1.0 - (1.0 - p).powi(3); // ease-out cubic
|
||||||
|
set_anim.call((Some(s.clone()), eased));
|
||||||
|
std::thread::sleep(std::time::Duration::from_millis(16));
|
||||||
|
}
|
||||||
|
});
|
||||||
|
}
|
||||||
|
});
|
||||||
|
// Progress for THIS screen: 0 until the tween for it starts (fresh navigation starts hidden +
|
||||||
|
// offset, no flash), 1 once settled. A stale value for another screen reads as 0.
|
||||||
|
let progress = if anim.0.as_ref() == Some(&screen) {
|
||||||
|
anim.1
|
||||||
|
} else {
|
||||||
|
0.0
|
||||||
|
};
|
||||||
|
|
||||||
// Each hook-using screen is mounted as its own component so its hooks are isolated from
|
// Each hook-using screen is mounted as its own component so its hooks are isolated from
|
||||||
// root's (root's own hooks above stay a stable prefix regardless of which screen renders).
|
// root's (root's own hooks above stay a stable prefix regardless of which screen renders).
|
||||||
let svc = Svc {
|
let svc = Svc {
|
||||||
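Note on the screen-entrance tween in the hunk above: each step maps a linear `p` in 0..=1 through an ease-out cubic, and the wrapper uses the eased value as opacity and `(1 - progress) * 22.0` as the top margin. The sketch below isolates that mapping so the endpoints are easy to check. `ease_out_cubic` and `tween_frame` are hypothetical names; the step count (14) and offset (22 px) come from the diff.

```rust
/// Ease-out cubic: fast at the start, settling gently at 1.0.
fn ease_out_cubic(p: f64) -> f64 {
    1.0 - (1.0 - p).powi(3)
}

/// Opacity and top margin for step `step` of `steps` (hypothetical helper).
fn tween_frame(step: u32, steps: u32, offset_px: f64) -> (f64, f64) {
    let p = f64::from(step) / f64::from(steps);
    let progress = ease_out_cubic(p);
    // Opacity follows progress directly; the margin shrinks to zero as it settles.
    (progress, (1.0 - progress) * offset_px)
}
```

At 14 steps with a 16 ms sleep per step the sleeps alone total about 224 ms, and the first step is fully transparent and offset by the whole 22 px, which is why a fresh navigation does not flash.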
@@ -191,8 +282,21 @@ fn root(cx: &mut RenderCx, ctx: &Arc<AppCtx>) -> Element {
|
|||||||
set_status: set_status.clone(),
|
set_status: set_status.clone(),
|
||||||
set_speed: set_speed.clone(),
|
set_speed: set_speed.clone(),
|
||||||
};
|
};
|
||||||
match screen {
|
let body = match &screen {
|
||||||
Screen::Hosts => component(hosts::hosts_page, HostsProps { svc, hosts, status }),
|
Screen::Hosts => component(
|
||||||
|
hosts::hosts_page,
|
||||||
|
HostsProps {
|
||||||
|
svc,
|
||||||
|
hosts,
|
||||||
|
status,
|
||||||
|
forget,
|
||||||
|
rename,
|
||||||
|
show_add,
|
||||||
|
set_forget,
|
||||||
|
set_rename,
|
||||||
|
set_show_add,
|
||||||
|
},
|
||||||
|
),
|
||||||
// connecting_page / request_access_page / settings_page / licenses_page use no hooks
|
// connecting_page / request_access_page / settings_page / licenses_page use no hooks
|
||||||
// (they never touch `cx`), so calling them inline is sound.
|
// (they never touch `cx`), so calling them inline is sound.
|
||||||
Screen::Connecting => connect::connecting_page(ctx, &status),
|
Screen::Connecting => connect::connecting_page(ctx, &status),
|
||||||
@@ -202,5 +306,21 @@ fn root(cx: &mut RenderCx, ctx: &Arc<AppCtx>) -> Element {
|
|||||||
Screen::Pair => component(pair::pair_page, svc),
|
Screen::Pair => component(pair::pair_page, svc),
|
||||||
Screen::SpeedTest => component(speed::speed_page, SpeedProps { svc, state: speed }),
|
Screen::SpeedTest => component(speed::speed_page, SpeedProps { svc, state: speed }),
|
||||||
Screen::Stream => component(stream::stream_page, StreamProps { svc, hud }),
|
Screen::Stream => component(stream::stream_page, StreamProps { svc, hud }),
|
||||||
|
};
|
||||||
|
|
||||||
|
// The Stream screen owns the SwapChainPanel + per-frame present; never wrap it in an animated
|
||||||
|
// opacity/offset layer. Everything else slides + fades in on navigation.
|
||||||
|
if matches!(screen, Screen::Stream) {
|
||||||
|
return body;
|
||||||
}
|
}
|
||||||
|
let offset = (1.0 - progress) * 22.0;
|
||||||
|
border(body)
|
||||||
|
.opacity(progress)
|
||||||
|
.margin(Thickness {
|
||||||
|
left: 0.0,
|
||||||
|
top: offset,
|
||||||
|
right: 0.0,
|
||||||
|
bottom: 0.0,
|
||||||
|
})
|
||||||
|
.into()
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,12 +1,14 @@
|
|||||||
//! The stream page: a `SwapChainPanel` bound to the D3D11 composition swapchain in
|
//! The stream page: a `SwapChainPanel` whose composition swapchain is created (and bound) once on
|
||||||
//! [`crate::present`], driven by reactor's per-frame `on_rendering`, with a status-chip HUD
|
//! the UI thread, then handed — presenter and all — to the dedicated render thread
|
||||||
//! overlay (mode · decode path · HDR · fps/throughput/latency · capture hint).
|
//! ([`crate::render`]), which presents decoded frames at stream cadence. The page itself only
|
||||||
|
//! forwards panel size/DPI changes and draws the status-chip HUD overlay (mode · decode path ·
|
||||||
|
//! HDR · fps/throughput/latency · capture hint).
|
||||||
|
|
||||||
use super::style::{edges, uniform};
|
use super::style::{edges, uniform};
|
||||||
use super::Svc;
|
use super::Svc;
|
||||||
use crate::present::Presenter;
|
use crate::present::Presenter;
|
||||||
|
use crate::render::{self, RenderThread};
|
||||||
use crate::session::Stats;
|
use crate::session::Stats;
|
||||||
use crate::video::DecodedFrame;
|
|
||||||
use punktfunk_core::client::NativeClient;
|
use punktfunk_core::client::NativeClient;
|
||||||
use punktfunk_core::config::Mode;
|
use punktfunk_core::config::Mode;
|
||||||
use std::cell::RefCell;
|
use std::cell::RefCell;
|
||||||
@@ -35,37 +37,34 @@ impl PartialEq for StreamProps {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// UI-thread-only present context: the D3D11 presenter plus the decoded-frame receiver.
|
|
||||||
struct PresentCtx {
|
|
||||||
presenter: Presenter,
|
|
||||||
frames: async_channel::Receiver<DecodedFrame>,
|
|
||||||
}
|
|
||||||
|
|
||||||
thread_local! {
|
thread_local! {
|
||||||
static PRESENT: RefCell<Option<PresentCtx>> = const { RefCell::new(None) };
|
/// Frames + host clock offset, stashed by the mount effect for `on_ready` (which fires later,
|
||||||
static PENDING_FRAMES: RefCell<Option<async_channel::Receiver<DecodedFrame>>> =
|
/// once the native panel exists).
|
||||||
const { RefCell::new(None) };
|
static PENDING: RefCell<Option<(crate::session::FrameRx, i64)>> = const { RefCell::new(None) };
|
||||||
|
/// The live render thread; stopped + joined by the unmount cleanup (before panel teardown).
|
||||||
|
static RENDER: RefCell<Option<RenderThread>> = const { RefCell::new(None) };
|
||||||
}
|
}
|
||||||
|
|
||||||
fn present_newest(ctx: &mut PresentCtx) {
|
/// The app window's DPI (96 when the window can't be found — then DIPs == pixels). Reactor's
|
||||||
// Apply the latest source HDR mastering metadata (from the session pump's 0xCE drain) before
|
/// `on_resize` reports DIPs and exposes no CompositionScale, so the window DPI is the scale.
|
||||||
// presenting — a cheap no-op in the presenter when unchanged.
|
fn window_dpi() -> u32 {
|
||||||
if let Some(meta) = *crate::present::LATEST_HDR_META.lock().unwrap() {
|
use windows::Win32::UI::HiDpi::GetDpiForWindow;
|
||||||
ctx.presenter.set_hdr_metadata(meta);
|
use windows::Win32::UI::WindowsAndMessaging::FindWindowW;
|
||||||
|
unsafe {
|
||||||
|
FindWindowW(None, windows::core::w!("Punktfunk"))
|
||||||
|
.ok()
|
||||||
|
.map(|h| GetDpiForWindow(h))
|
||||||
|
.filter(|d| *d > 0)
|
||||||
|
.unwrap_or(96)
|
||||||
}
|
}
|
||||||
// Drain to the newest decoded frame (drop any backlog) and hand it to the presenter by value —
|
|
||||||
// the GPU zero-copy path retains the decoder surface across re-presents, so ownership matters.
|
|
||||||
let mut newest = None;
|
|
||||||
while let Ok(f) = ctx.frames.try_recv() {
|
|
||||||
newest = Some(f);
|
|
||||||
}
|
|
||||||
ctx.presenter.present(newest);
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub(crate) fn stream_page(props: &StreamProps, cx: &mut RenderCx) -> Element {
|
pub(crate) fn stream_page(props: &StreamProps, cx: &mut RenderCx) -> Element {
|
||||||
let ctx = &props.svc.ctx;
|
let ctx = &props.svc.ctx;
|
||||||
// Take the connector + frames handoff once on mount; keep the connector alive (and for input)
|
// Take the connector + frames handoff once on mount; keep the connector alive (and for input)
|
||||||
// in a use_ref, stash frames for `on_ready`, install the input hooks (and remove on unmount).
|
// in a use_ref, stash frames for `on_ready`, install the input hooks. The cleanup stops the
|
||||||
|
// render thread FIRST (it must not present into a panel that's tearing down), then removes
|
||||||
|
// the input hooks.
|
||||||
let connector_ref = cx.use_ref::<Option<Arc<NativeClient>>>(None);
|
let connector_ref = cx.use_ref::<Option<Arc<NativeClient>>>(None);
|
||||||
cx.use_effect_with_cleanup((), {
|
cx.use_effect_with_cleanup((), {
|
||||||
let shared = ctx.shared.clone();
|
let shared = ctx.shared.clone();
|
||||||
@@ -74,54 +73,58 @@ pub(crate) fn stream_page(props: &StreamProps, cx: &mut RenderCx) -> Element {
|
|||||||
move || {
|
move || {
|
||||||
if let Some((connector, frames)) = shared.handoff.lock().unwrap().take() {
|
if let Some((connector, frames)) = shared.handoff.lock().unwrap().take() {
|
||||||
let mode = connector.mode();
|
let mode = connector.mode();
|
||||||
|
let clock_offset = connector.clock_offset_ns;
|
||||||
connector_ref.set(Some(connector.clone()));
|
connector_ref.set(Some(connector.clone()));
|
||||||
PENDING_FRAMES.with(|c| *c.borrow_mut() = Some(frames));
|
PENDING.with(|c| *c.borrow_mut() = Some((frames, clock_offset)));
|
||||||
crate::input::install(connector, mode, inhibit);
|
crate::input::install(connector, mode, inhibit);
|
||||||
}
|
}
|
||||||
Some(crate::input::uninstall)
|
Some(|| {
|
||||||
|
RENDER.with(|c| {
|
||||||
|
if let Some(mut rt) = c.borrow_mut().take() {
|
||||||
|
rt.stop_and_join();
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
PENDING.with(|c| c.borrow_mut().take());
|
||||||
let rendering = cx.use_ref::<Option<Rendering>>(None);
|
crate::input::uninstall();
|
||||||
cx.use_effect((), {
|
})
|
||||||
let rendering = rendering.clone();
|
|
||||||
move || {
|
|
||||||
if let Ok(r) = on_rendering(move || {
|
|
||||||
PRESENT.with(|cell| {
|
|
||||||
if let Some(ctx) = cell.borrow_mut().as_mut() {
|
|
||||||
present_newest(ctx);
|
|
||||||
}
|
|
||||||
});
|
|
||||||
}) {
|
|
||||||
rendering.set(Some(r));
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
|
|
||||||
let mode = connector_ref.borrow().as_ref().map(|c| c.mode());
|
let mode = connector_ref.borrow().as_ref().map(|c| c.mode());
|
||||||
grid((
|
grid((
|
||||||
swap_chain_panel()
|
swap_chain_panel()
|
||||||
.on_ready(|panel| match Presenter::new(1280, 720) {
|
.on_ready(|panel| {
|
||||||
|
// Placeholder size — the first `on_resize` (fired after the first layout pass)
|
||||||
|
// resizes to the panel's real pixel size.
|
||||||
|
let dpi = window_dpi();
|
||||||
|
match Presenter::new(1280, 720, dpi) {
|
||||||
Ok(p) => {
|
Ok(p) => {
|
||||||
if let Err(e) = panel.set_swap_chain(p.swap_chain()) {
|
if let Err(e) = panel.set_swap_chain(p.swap_chain()) {
|
||||||
tracing::error!(error = %e, "set_swap_chain");
|
tracing::error!(error = %e, "set_swap_chain");
|
||||||
|
return;
|
||||||
}
|
}
|
||||||
if let Some(frames) = PENDING_FRAMES.with(|c| c.borrow_mut().take()) {
|
if let Some((frames, clock_offset)) =
|
||||||
PRESENT.with(|cell| {
|
PENDING.with(|c| c.borrow_mut().take())
|
||||||
*cell.borrow_mut() = Some(PresentCtx {
|
{
|
||||||
presenter: p,
|
let shared = render::RenderShared::new(1280, 720, dpi);
|
||||||
frames,
|
RENDER.with(|cell| {
|
||||||
|
*cell.borrow_mut() =
|
||||||
|
Some(render::spawn(p, frames, shared, clock_offset));
|
||||||
});
|
});
|
||||||
});
|
tracing::info!(dpi, "stream presenter bound — render thread started");
|
||||||
tracing::info!("stream presenter bound to SwapChainPanel");
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
Err(e) => tracing::error!(error = %e, "create presenter"),
|
Err(e) => tracing::error!(error = %e, "create presenter"),
|
||||||
|
}
|
||||||
})
|
})
|
||||||
.on_resize(|w, h| {
|
.on_resize(|w, h| {
|
||||||
PRESENT.with(|cell| {
|
// DIPs → physical pixels; the presenter maps back via SetMatrixTransform.
|
||||||
if let Some(ctx) = cell.borrow_mut().as_mut() {
|
let dpi = window_dpi();
|
||||||
ctx.presenter.resize(w as u32, h as u32);
|
let px = |v: f64| (v * f64::from(dpi) / 96.0).round() as u32;
|
||||||
|
RENDER.with(|cell| {
|
||||||
|
if let Some(rt) = cell.borrow().as_ref() {
|
||||||
|
rt.shared().set_dpi(dpi);
|
||||||
|
rt.shared().set_size(px(w), px(h));
|
||||||
}
|
}
|
||||||
});
|
});
|
||||||
}),
|
}),
|
||||||
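The DIP-to-pixel step in `on_resize` above can be checked in isolation. The following is a minimal sketch, not the client's code: `dip_to_px` and `inverse_scale` are hypothetical names, and the only facts taken from the diff are the `v * dpi / 96` rounding and the 96/DPI scale that `SetMatrixTransform` is described as applying.

```rust
/// Convert a length in DIPs (1/96 inch) to physical pixels at `dpi`.
/// Mirrors the `px` closure in the diff; the function name is hypothetical.
fn dip_to_px(dips: f64, dpi: u32) -> u32 {
    (dips * f64::from(dpi) / 96.0).round() as u32
}

/// Scale factor that maps a pixel-sized buffer back into DIP space (96 / dpi).
/// Hypothetical helper; the real presenter hands this scale to the swapchain.
fn inverse_scale(dpi: u32) -> f32 {
    96.0 / dpi as f32
}
```

At 125 % scaling (120 DPI) a 1280-DIP-wide panel needs a 1600-pixel buffer, and the matching transform scale is 0.8, so the two operations cancel and the video is sampled 1:1.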
|
|||||||
@@ -27,26 +27,67 @@ pub(crate) fn card(child: impl Into<Element>) -> Border {
|
|||||||
.padding(uniform(16.0))
|
.padding(uniform(16.0))
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Card chrome with no padding — for cards whose interactive regions (tap-to-connect area vs.
|
||||||
|
/// action buttons) must own their padding so hit areas reach the card edges.
|
||||||
|
pub(crate) fn card_flush(child: impl Into<Element>) -> Border {
|
||||||
|
card(child).padding(uniform(0.0))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// An OPAQUE modal/dialog surface. `card`'s `CardBackground` is a translucent acrylic brush — fine
|
||||||
|
/// layered on the page, but a floating dialog over a scrim needs a solid fill or the content behind
|
||||||
|
/// bleeds through (looks "transparent"). `SolidBackground` is the opaque base-layer brush.
|
||||||
|
pub(crate) fn dialog_surface(child: impl Into<Element>) -> Border {
|
||||||
|
border(child.into())
|
||||||
|
.background(ThemeRef::SolidBackground)
|
||||||
|
.border_brush(ThemeRef::SurfaceStroke)
|
||||||
|
.border_thickness(uniform(1.0))
|
||||||
|
.corner_radius(8.0)
|
||||||
|
.padding(uniform(20.0))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A fully transparent brush: paints nothing but (unlike a null background) makes the whole
|
||||||
|
/// element hit-testable, so a tap region catches clicks in its blank space too.
|
||||||
|
pub(crate) fn hit_test_backstop() -> Color {
|
||||||
|
Color {
|
||||||
|
a: 0,
|
||||||
|
r: 0,
|
||||||
|
g: 0,
|
||||||
|
b: 0,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
/// A small all-caps section label above a group of cards.
|
/// A small all-caps section label above a group of cards.
|
||||||
pub(crate) fn section(label: &str) -> Element {
|
pub(crate) fn section(label: &str) -> Element {
|
||||||
text_block(label)
|
text_block(label)
|
||||||
.font_size(12.0)
|
.font_size(12.0)
|
||||||
.semibold()
|
.semibold()
|
||||||
.foreground(ThemeRef::SecondaryText)
|
.foreground(ThemeRef::SecondaryText)
|
||||||
.margin(edges(2.0, 10.0, 0.0, 0.0))
|
.margin(edges(2.0, 14.0, 0.0, 2.0))
|
||||||
.into()
|
.into()
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Wrap a screen's children in a scrollable, centred, max-width column.
|
/// Wrap a screen's children in a scrollable, centred, max-width column. Alignment stays the
|
||||||
|
/// default Stretch: with a MaxWidth that still centres the column, but the children get the
|
||||||
|
/// column's REAL width — an explicit Center would size the column to its content and leave
|
||||||
|
/// every card at its minimum width no matter how large the window is.
|
||||||
pub(crate) fn page(children: Vec<Element>) -> Element {
|
pub(crate) fn page(children: Vec<Element>) -> Element {
|
||||||
let col = vstack(children)
|
let col = vstack(children)
|
||||||
.spacing(10.0)
|
.spacing(10.0)
|
||||||
.max_width(640.0)
|
.max_width(640.0)
|
||||||
.horizontal_alignment(HorizontalAlignment::Center)
|
|
||||||
.margin(edges(24.0, 24.0, 24.0, 40.0));
|
.margin(edges(24.0, 24.0, 24.0, 40.0));
|
||||||
scroll_view(col).into()
|
scroll_view(col).into()
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Like [`page`], but wider and airier, for screens whose cards lay out in a responsive grid
|
||||||
|
/// and should use the window instead of a narrow settings column.
|
||||||
|
pub(crate) fn page_wide(children: Vec<Element>) -> Element {
|
||||||
|
let col = vstack(children)
|
||||||
|
.spacing(14.0)
|
||||||
|
.max_width(1120.0)
|
||||||
|
.margin(edges(32.0, 28.0, 32.0, 48.0));
|
||||||
|
scroll_view(col).into()
|
||||||
|
}
|
||||||
|
|
||||||
/// A page header: a large bold title on the left, one action button on the right.
|
/// A page header: a large bold title on the left, one action button on the right.
|
||||||
pub(crate) fn page_header(title: &str, action: Button) -> Element {
|
pub(crate) fn page_header(title: &str, action: Button) -> Element {
|
||||||
grid((
|
grid((
|
||||||
@@ -103,7 +144,9 @@ pub(crate) fn avatar(name: &str) -> Border {
|
|||||||
text_block(initial)
|
text_block(initial)
|
||||||
.font_size(17.0)
|
.font_size(17.0)
|
||||||
.semibold()
|
.semibold()
|
||||||
.foreground(ThemeRef::AccentText)
|
// NOT ThemeRef::AccentText — that's accent-COLOURED text for normal surfaces;
|
||||||
|
// on an accent fill it's accent-on-accent (unreadable). This is the on-accent brush.
|
||||||
|
.foreground(ThemeRef::custom("TextOnAccentFillColorPrimaryBrush"))
|
||||||
.horizontal_alignment(HorizontalAlignment::Center)
|
.horizontal_alignment(HorizontalAlignment::Center)
|
||||||
.vertical_alignment(VerticalAlignment::Center),
|
.vertical_alignment(VerticalAlignment::Center),
|
||||||
)
|
)
|
||||||
@@ -116,20 +159,39 @@ pub(crate) fn avatar(name: &str) -> Border {
|
|||||||
/// Pill chip colour intent.
|
/// Pill chip colour intent.
|
||||||
#[derive(Clone, Copy)]
|
#[derive(Clone, Copy)]
|
||||||
pub(crate) enum Pill {
|
pub(crate) enum Pill {
|
||||||
Accent,
|
Info,
|
||||||
Good,
|
Good,
|
||||||
Neutral,
|
Neutral,
|
||||||
}
|
}
|
||||||
|
|
||||||
/// A small rounded status chip (paired/PIN/HDR/etc.).
|
/// A small rounded status chip (paired/PIN/HDR/etc.) — subtle tinted fills with matching
|
||||||
|
/// system foregrounds (the InfoBar palette), never solid accent (white-on-bright is unreadable).
|
||||||
pub(crate) fn pill(text: &str, kind: Pill) -> Border {
|
pub(crate) fn pill(text: &str, kind: Pill) -> Border {
|
||||||
let (bg, fg) = match kind {
|
let (bg, fg) = match kind {
|
||||||
Pill::Accent => (ThemeRef::Accent, ThemeRef::AccentText),
|
Pill::Info => (
|
||||||
|
ThemeRef::SystemAttentionBackground,
|
||||||
|
ThemeRef::SystemAttention,
|
||||||
|
),
|
||||||
Pill::Good => (ThemeRef::SystemSuccessBackground, ThemeRef::SystemSuccess),
|
Pill::Good => (ThemeRef::SystemSuccessBackground, ThemeRef::SystemSuccess),
|
||||||
Pill::Neutral => (ThemeRef::SubtleFill, ThemeRef::SecondaryText),
|
Pill::Neutral => (ThemeRef::SubtleFill, ThemeRef::SecondaryText),
|
||||||
};
|
};
|
||||||
border(text_block(text).font_size(11.0).semibold().foreground(fg))
|
border(text_block(text).font_size(11.0).semibold().foreground(fg))
|
||||||
.background(bg)
|
.background(bg)
|
||||||
|
.border_brush(ThemeRef::CardStroke)
|
||||||
|
.border_thickness(uniform(1.0))
|
||||||
.corner_radius(10.0)
|
.corner_radius(10.0)
|
||||||
.padding(edges(9.0, 3.0, 9.0, 3.0))
|
.padding(edges(9.0, 2.0, 9.0, 2.0))
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A small presence dot (host online/offline).
|
||||||
|
pub(crate) fn presence_dot(online: bool) -> Border {
|
||||||
|
border(vstack(Vec::<Element>::new()))
|
||||||
|
.background(if online {
|
||||||
|
ThemeRef::SystemSuccess
|
||||||
|
} else {
|
||||||
|
ThemeRef::SystemNeutral
|
||||||
|
})
|
||||||
|
.corner_radius(4.0)
|
||||||
|
.width(8.0)
|
||||||
|
.height(8.0)
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -7,6 +7,16 @@
|
|||||||
//! pull it from a process-global `OnceLock` (initialised on whichever thread asks first: the
|
//! pull it from a process-global `OnceLock` (initialised on whichever thread asks first: the
|
||||||
//! session pump when it builds the decoder, or the UI thread when it builds the presenter).
|
//! session pump when it builds the decoder, or the UI thread when it builds the presenter).
|
||||||
//!
|
//!
|
||||||
|
//! **Adapter selection** (matters on hybrid boxes — e.g. an Intel iGPU driving the panel next to
|
||||||
|
//! an NVIDIA dGPU): `PUNKTFUNK_ADAPTER` (index or case-insensitive name substring) wins; else the
|
||||||
|
//! adapter whose output owns the monitor our window is on — that's the adapter DWM composes that
|
||||||
|
//! monitor with, so presents are copy-free and decode runs on the near GPU; else the default
|
||||||
|
//! adapter. Deliberately NOT "the adapter with the best decoder": if the monitor's adapter can't
|
||||||
|
//! decode the codec we demote to software, which beats a per-frame cross-adapter present copy.
|
||||||
|
//!
|
||||||
|
//! `PUNKTFUNK_D3D_DEBUG=1` adds the D3D11 debug layer (validation messages in the debugger /
|
||||||
|
//! DebugView) — invaluable for present-path bugs, which D3D11 otherwise drops silently.
|
||||||
|
//!
|
||||||
//! **Thread-safety.** windows-rs COM interfaces are deliberately `!Send`/`!Sync` — thread-safety
|
//! **Thread-safety.** windows-rs COM interfaces are deliberately `!Send`/`!Sync` — thread-safety
|
||||||
//! is per-object, not universal. An `ID3D11Device` and its immediate context become free-threaded
|
//! is per-object, not universal. An `ID3D11Device` and its immediate context become free-threaded
|
||||||
//! once `ID3D11Multithread::SetMultithreadProtected(TRUE)` is set, which FFmpeg's D3D11VA backend
|
//! once `ID3D11Multithread::SetMultithreadProtected(TRUE)` is set, which FFmpeg's D3D11VA backend
|
||||||
@@ -20,12 +30,15 @@ use anyhow::{anyhow, Result};
|
|||||||
use std::sync::OnceLock;
|
use std::sync::OnceLock;
|
||||||
use windows::core::Interface;
|
use windows::core::Interface;
|
||||||
use windows::Win32::Graphics::Direct3D::{
|
use windows::Win32::Graphics::Direct3D::{
|
||||||
D3D_DRIVER_TYPE_HARDWARE, D3D_DRIVER_TYPE_WARP, D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
|
D3D_DRIVER_TYPE, D3D_DRIVER_TYPE_HARDWARE, D3D_DRIVER_TYPE_UNKNOWN, D3D_DRIVER_TYPE_WARP,
|
||||||
|
D3D_FEATURE_LEVEL_11_0, D3D_FEATURE_LEVEL_11_1,
|
||||||
};
|
};
|
||||||
use windows::Win32::Graphics::Direct3D11::{
|
use windows::Win32::Graphics::Direct3D11::{
|
||||||
D3D11CreateDevice, ID3D11Device, ID3D11DeviceContext, ID3D11Multithread,
|
D3D11CreateDevice, ID3D11Device, ID3D11DeviceContext, ID3D11Multithread,
|
||||||
D3D11_CREATE_DEVICE_BGRA_SUPPORT, D3D11_CREATE_DEVICE_VIDEO_SUPPORT, D3D11_SDK_VERSION,
|
D3D11_CREATE_DEVICE_BGRA_SUPPORT, D3D11_CREATE_DEVICE_DEBUG, D3D11_CREATE_DEVICE_FLAG,
|
||||||
|
D3D11_CREATE_DEVICE_VIDEO_SUPPORT, D3D11_SDK_VERSION,
|
||||||
};
|
};
|
||||||
|
use windows::Win32::Graphics::Dxgi::{CreateDXGIFactory1, IDXGIAdapter, IDXGIFactory1};
|
||||||
|
|
||||||
pub struct SharedDevice {
|
pub struct SharedDevice {
|
||||||
pub device: ID3D11Device,
|
pub device: ID3D11Device,
|
||||||
@@ -60,26 +73,123 @@ fn create() -> Option<SharedDevice> {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
fn create_device() -> Result<SharedDevice> {
|
/// The adapter's human-readable description, for the logs.
|
||||||
// Preference order: a hardware adapter with video support (enables D3D11VA); the same without
|
fn adapter_name(adapter: &IDXGIAdapter) -> String {
|
||||||
// the VIDEO flag (a driver that rejects it still presents + software-decodes); finally WARP for
|
unsafe {
|
||||||
// the GPU-less box. BGRA_SUPPORT is required for the composition swapchain in every case.
|
adapter
|
||||||
let attempts = [
|
.GetDesc()
|
||||||
(D3D_DRIVER_TYPE_HARDWARE, true, true),
|
.map(|d| {
|
||||||
(D3D_DRIVER_TYPE_HARDWARE, false, true),
|
String::from_utf16_lossy(&d.Description)
|
||||||
(D3D_DRIVER_TYPE_WARP, false, false),
|
.trim_end_matches('\0')
|
||||||
];
|
.to_string()
|
||||||
for (driver, video, hardware) in attempts {
|
})
|
||||||
let flags = if video {
|
.unwrap_or_else(|_| "<unknown adapter>".into())
|
||||||
D3D11_CREATE_DEVICE_BGRA_SUPPORT | D3D11_CREATE_DEVICE_VIDEO_SUPPORT
|
}
|
||||||
} else {
|
}
|
||||||
D3D11_CREATE_DEVICE_BGRA_SUPPORT
|
|
||||||
|
/// Resolve an explicit adapter: `PUNKTFUNK_ADAPTER` (index or case-insensitive name substring)
|
||||||
|
/// wins; else the adapter whose output owns the monitor the app window is on (see module docs);
|
||||||
|
/// else `None` → the default adapter (also the headless-CLI path, where no window exists).
|
||||||
|
fn resolve_adapter() -> Option<IDXGIAdapter> {
|
||||||
|
let factory: IDXGIFactory1 = unsafe { CreateDXGIFactory1() }.ok()?;
|
||||||
|
let adapters: Vec<IDXGIAdapter> = {
|
||||||
|
let mut v = Vec::new();
|
||||||
|
let mut i = 0u32;
|
||||||
|
while let Ok(a) = unsafe { factory.EnumAdapters1(i) } {
|
||||||
|
i += 1;
|
||||||
|
if let Ok(a) = a.cast::<IDXGIAdapter>() {
|
||||||
|
v.push(a);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
v
|
||||||
};
|
};
|
||||||
|
|
||||||
|
if let Ok(pref) = std::env::var("PUNKTFUNK_ADAPTER") {
|
||||||
|
let pref = pref.trim();
|
||||||
|
let found = if let Ok(idx) = pref.parse::<usize>() {
|
||||||
|
adapters.get(idx).cloned()
|
||||||
|
} else {
|
||||||
|
let needle = pref.to_lowercase();
|
||||||
|
adapters
|
||||||
|
.iter()
|
||||||
|
.find(|a| adapter_name(a).to_lowercase().contains(&needle))
|
||||||
|
.cloned()
|
||||||
|
};
|
||||||
|
match &found {
|
||||||
|
Some(a) => {
|
||||||
|
tracing::info!(pref, adapter = %adapter_name(a), "PUNKTFUNK_ADAPTER matched")
|
||||||
|
}
|
||||||
|
None => tracing::warn!(pref, "PUNKTFUNK_ADAPTER matched no adapter; falling back to automatic selection"),
|
||||||
|
}
|
||||||
|
if found.is_some() {
|
||||||
|
return found;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// The adapter driving the monitor our window sits on: DWM composes that monitor with it, so
|
||||||
|
// presenting from it is copy-free (a hybrid box's other adapter would pay a cross-adapter
|
||||||
|
// copy per frame).
|
||||||
|
let monitor = unsafe {
|
||||||
|
use windows::Win32::Graphics::Gdi::{MonitorFromWindow, MONITOR_DEFAULTTONULL};
|
||||||
|
use windows::Win32::UI::WindowsAndMessaging::FindWindowW;
|
||||||
|
let hwnd = FindWindowW(None, windows::core::w!("Punktfunk")).ok()?;
|
||||||
|
MonitorFromWindow(hwnd, MONITOR_DEFAULTTONULL)
|
||||||
|
};
|
||||||
|
if monitor.is_invalid() {
|
||||||
|
return None;
|
||||||
|
}
|
||||||
|
for adapter in &adapters {
|
||||||
|
let mut oi = 0u32;
|
||||||
|
while let Ok(output) = unsafe { adapter.EnumOutputs(oi) } {
|
||||||
|
oi += 1;
|
||||||
|
if let Ok(desc) = unsafe { output.GetDesc() } {
|
||||||
|
if desc.Monitor == monitor {
|
||||||
|
tracing::info!(adapter = %adapter_name(adapter), "using the window's monitor adapter");
|
||||||
|
return Some(adapter.clone());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
None
|
||||||
|
}
|
||||||
|
|
||||||
|
fn create_device() -> Result<SharedDevice> {
|
||||||
|
// Preference order: the resolved adapter (or the default hardware adapter) with video support
|
||||||
|
// (enables D3D11VA); the same without the VIDEO flag (a driver that rejects it still presents +
|
||||||
|
// software-decodes); finally WARP for the GPU-less box. BGRA_SUPPORT is required for the
|
||||||
|
// composition swapchain in every case. An explicit adapter requires D3D_DRIVER_TYPE_UNKNOWN.
|
||||||
|
let adapter = resolve_adapter();
|
||||||
|
let attempts: [(Option<&IDXGIAdapter>, D3D_DRIVER_TYPE, bool, bool); 3] = match &adapter {
|
||||||
|
Some(a) => [
|
||||||
|
(Some(a), D3D_DRIVER_TYPE_UNKNOWN, true, true),
|
||||||
|
(Some(a), D3D_DRIVER_TYPE_UNKNOWN, false, true),
|
||||||
|
(None, D3D_DRIVER_TYPE_WARP, false, false),
|
||||||
|
],
|
||||||
|
None => [
|
||||||
|
(None, D3D_DRIVER_TYPE_HARDWARE, true, true),
|
||||||
|
(None, D3D_DRIVER_TYPE_HARDWARE, false, true),
|
||||||
|
(None, D3D_DRIVER_TYPE_WARP, false, false),
|
||||||
|
],
|
||||||
|
};
|
||||||
|
// The debug layer needs the SDK layers installed (Graphics Tools); when they're missing the
|
||||||
|
// creation fails, so each attempt retries without the flag rather than failing the ladder.
|
||||||
|
let debug = std::env::var("PUNKTFUNK_D3D_DEBUG").is_ok_and(|v| v == "1");
|
||||||
|
for (adapter, driver, video, hardware) in attempts {
|
||||||
|
let mut flags = D3D11_CREATE_DEVICE_BGRA_SUPPORT;
|
||||||
|
if video {
|
||||||
|
flags |= D3D11_CREATE_DEVICE_VIDEO_SUPPORT;
|
||||||
|
}
|
||||||
|
let flag_sets: &[D3D11_CREATE_DEVICE_FLAG] = if debug {
|
||||||
|
&[flags | D3D11_CREATE_DEVICE_DEBUG, flags]
|
||||||
|
} else {
|
||||||
|
&[flags]
|
||||||
|
};
|
||||||
|
for &flags in flag_sets {
|
||||||
let mut device = None;
|
let mut device = None;
|
||||||
let mut context = None;
|
let mut context = None;
|
||||||
let r = unsafe {
|
let r = unsafe {
|
||||||
D3D11CreateDevice(
|
D3D11CreateDevice(
|
||||||
None,
|
adapter,
|
||||||
driver,
|
driver,
|
||||||
None,
|
None,
|
||||||
flags,
|
flags,
|
||||||
@@ -92,22 +202,23 @@ fn create_device() -> Result<SharedDevice> {
|
|||||||
};
|
};
|
||||||
if r.is_ok() {
|
if r.is_ok() {
|
||||||
let (device, context) = (device.unwrap(), context.unwrap());
|
let (device, context) = (device.unwrap(), context.unwrap());
|
||||||
// Make the device + immediate context free-threaded: the decoder (D3D11VA video context,
|
// Make the device + immediate context free-threaded: the decoder (D3D11VA video
|
||||||
// pump thread) and the presenter (immediate context, UI thread) both touch this device.
|
// context, pump thread) and the presenter (immediate context, render thread) both
|
||||||
// FFmpeg also sets this during hwdevice init, but doing it up front keeps the
|
// touch this device. FFmpeg also sets this during hwdevice init, but doing it up
|
||||||
// cross-thread `Send`/`Sync` sound from the moment the device exists.
|
// front keeps the cross-thread `Send`/`Sync` sound from the moment the device exists.
|
||||||
if let Ok(mt) = context.cast::<ID3D11Multithread>() {
|
if let Ok(mt) = context.cast::<ID3D11Multithread>() {
|
||||||
unsafe {
|
unsafe {
|
||||||
let _ = mt.SetMultithreadProtected(true); // returns the prior state; ignore
|
let _ = mt.SetMultithreadProtected(true); // returns the prior state; ignore
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
tracing::info!(
|
tracing::info!(
|
||||||
driver = if hardware {
|
adapter = %adapter.map(adapter_name).unwrap_or_else(|| if hardware {
|
||||||
"hardware"
|
"default".into()
|
||||||
} else {
|
} else {
|
||||||
"WARP (software)"
|
"WARP (software)".into()
|
||||||
},
|
}),
|
||||||
video,
|
video,
|
||||||
|
debug = (flags & D3D11_CREATE_DEVICE_DEBUG).0 != 0,
|
||||||
"shared D3D11 device created"
|
"shared D3D11 device created"
|
||||||
);
|
);
|
||||||
return Ok(SharedDevice {
|
return Ok(SharedDevice {
|
||||||
@@ -117,6 +228,7 @@ fn create_device() -> Result<SharedDevice> {
|
|||||||
});
|
});
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
}
|
||||||
Err(anyhow!(
|
Err(anyhow!(
|
||||||
"D3D11CreateDevice failed for both hardware and WARP"
|
"D3D11CreateDevice failed for both hardware and WARP"
|
||||||
))
|
))
|
||||||
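The `PUNKTFUNK_ADAPTER` matching rule (index, else case-insensitive name substring) is easy to exercise without DXGI. This sketch runs over plain strings; `match_adapter` is a hypothetical name, and the adapter names in the usage note are made up for illustration.

```rust
/// Resolve a preference string against a list of adapter names.
/// A value that parses as an integer is treated as an index; anything else
/// is a case-insensitive substring match. Hypothetical helper, not the
/// function in the diff, which works on `IDXGIAdapter` objects.
fn match_adapter(names: &[&str], pref: &str) -> Option<usize> {
    let pref = pref.trim();
    if let Ok(idx) = pref.parse::<usize>() {
        // Out-of-range indices match nothing rather than wrapping.
        return (idx < names.len()).then_some(idx);
    }
    let needle = pref.to_lowercase();
    names.iter().position(|n| n.to_lowercase().contains(&needle))
}
```

With the names `["Example iGPU", "Example dGPU 3500"]`, the preferences `"1"` and `"dgpu"` both select the second entry. One consequence of this rule is that a purely numeric preference such as `"3500"` is always read as an index, so it would match nothing here; a name fragment containing a letter avoids that.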
|
|||||||
@@ -13,7 +13,10 @@
|
|||||||
//! sub-pixel remainder carried so slow drags aren't lost), then warp the cursor back to centre so
|
//! sub-pixel remainder carried so slow drags aren't lost), then warp the cursor back to centre so
|
||||||
//! it never reaches a screen edge. This is why the old absolute path froze: swallowing
|
//! it never reaches a screen edge. This is why the old absolute path froze: swallowing
|
||||||
//! `WM_MOUSEMOVE` pinned the OS cursor, so `pt` never travelled and the absolute coordinate
|
//! `WM_MOUSEMOVE` pinned the OS cursor, so `pt` never travelled and the absolute coordinate
|
||||||
//! snapped to one point. Keys carry the native Windows VK directly (the wire contract).
|
//! snapped to one point. Keys carry the **US-positional VK** for the pressed physical key (the
|
||||||
|
//! punktfunk wire contract shared by every first-party client — see [`scan_to_positional_vk`]):
|
||||||
|
//! the hook's layout-resolved `vkCode` must NOT go on the wire, or a non-US pair re-maps
|
||||||
|
//! positions through two layouts (German: y↔z swapped, ü lands on ö).
|
||||||
//!
|
//!
|
||||||
//! **Capture state machine** (parity with the GTK/Swift clients): capture engages at stream
|
//! **Capture state machine** (parity with the GTK/Swift clients): capture engages at stream
|
||||||
//! start, **Ctrl+Alt+Shift+Q** releases it (handing the cursor back to the local desktop), and a
|
//! start, **Ctrl+Alt+Shift+Q** releases it (handing the cursor back to the local desktop), and a
|
||||||
@@ -35,9 +38,9 @@ use windows::Win32::UI::Input::KeyboardAndMouse::VK_Q;
|
|||||||
use windows::Win32::UI::WindowsAndMessaging::{
|
use windows::Win32::UI::WindowsAndMessaging::{
|
||||||
CallNextHookEx, ClipCursor, GetClientRect, GetForegroundWindow, SetCursorPos,
|
CallNextHookEx, ClipCursor, GetClientRect, GetForegroundWindow, SetCursorPos,
|
||||||
SetWindowsHookExW, ShowCursor, UnhookWindowsHookEx, HC_ACTION, HHOOK, KBDLLHOOKSTRUCT,
|
SetWindowsHookExW, ShowCursor, UnhookWindowsHookEx, HC_ACTION, HHOOK, KBDLLHOOKSTRUCT,
|
||||||
LLMHF_INJECTED, MSLLHOOKSTRUCT, WH_KEYBOARD_LL, WH_MOUSE_LL, WM_KEYUP, WM_LBUTTONDOWN,
|
LLKHF_EXTENDED, LLMHF_INJECTED, MSLLHOOKSTRUCT, WH_KEYBOARD_LL, WH_MOUSE_LL, WM_KEYUP,
|
||||||
WM_LBUTTONUP, WM_MBUTTONDOWN, WM_MBUTTONUP, WM_MOUSEHWHEEL, WM_MOUSEMOVE, WM_MOUSEWHEEL,
|
WM_LBUTTONDOWN, WM_LBUTTONUP, WM_MBUTTONDOWN, WM_MBUTTONUP, WM_MOUSEHWHEEL, WM_MOUSEMOVE,
|
||||||
WM_RBUTTONDOWN, WM_RBUTTONUP, WM_SYSKEYUP, WM_XBUTTONDOWN, WM_XBUTTONUP,
|
WM_MOUSEWHEEL, WM_RBUTTONDOWN, WM_RBUTTONUP, WM_SYSKEYUP, WM_XBUTTONDOWN, WM_XBUTTONUP,
|
||||||
};
|
};
|
||||||
|
|
||||||
struct State {
|
struct State {
|
||||||
@@ -269,7 +272,17 @@ unsafe extern "system" fn kbd_proc(code: i32, wparam: WPARAM, lparam: LPARAM) ->
|
|||||||
if !st.inhibit_shortcuts && is_system_shortcut(st, vk) {
|
if !st.inhibit_shortcuts && is_system_shortcut(st, vk) {
|
||||||
return unsafe { CallNextHookEx(None, code, wparam, lparam) };
|
return unsafe { CallNextHookEx(None, code, wparam, lparam) };
|
||||||
}
|
}
|
||||||
let v = vk as u8;
|
// Wire key: the US-positional VK for this physical key (module docs), derived
|
||||||
|
// from the scancode. `vkCode` is layout-semantic and only passes through for
|
||||||
|
// keys the table doesn't cover — extended keys and everything outside the
|
||||||
|
// typing area, where positional == semantic (plus injected events with
|
||||||
|
// scanCode 0 from remapping tools, best-effort).
|
||||||
|
let ext = (kb.flags.0 & LLKHF_EXTENDED.0) != 0;
|
||||||
|
let v = if ext {
|
||||||
|
vk as u8
|
||||||
|
} else {
|
||||||
|
scan_to_positional_vk(kb.scanCode as u16).unwrap_or(vk as u8)
|
||||||
|
};
|
||||||
if up {
|
if up {
|
||||||
if st.held_keys.remove(&v) {
|
if st.held_keys.remove(&v) {
|
||||||
send(&st.connector, InputKind::KeyUp, v as u32, 0, 0, 0);
|
send(&st.connector, InputKind::KeyUp, v as u32, 0, 0, 0);
|
||||||
@@ -397,3 +410,95 @@ fn button(st: &mut State, id: u32, down: bool) {
|
|||||||
send(&c, InputKind::MouseButtonUp, id, 0, 0, 0);
|
send(&c, InputKind::MouseButtonUp, id, 0, 0, 0);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Set-1 make scancode → US-positional VK for the layout-**variant** typing area (letters, digit
|
||||||
|
/// row, OEM punctuation, the ISO 102nd key) — the exact inverse of the host injector's positional
|
||||||
|
/// table and the Windows analogue of the Linux client's `evdev_to_vk`. Keys not listed (F-row,
|
||||||
|
/// nav cluster, numpad, modifiers — plus every E0-extended key, which the caller filters out)
|
||||||
|
/// have layout-invariant VKs, so the hook's `vkCode` is already correct for them.
|
||||||
|
fn scan_to_positional_vk(scan: u16) -> Option<u8> {
|
||||||
|
Some(match scan {
|
||||||
|
0x02..=0x0A => (scan - 0x02) as u8 + 0x31, // 1..9
|
||||||
|
0x0B => 0x30, // 0
|
||||||
|
0x0C => 0xBD, // -_ VK_OEM_MINUS (DE: ß)
|
||||||
|
0x0D => 0xBB, // =+ VK_OEM_PLUS
|
||||||
|
0x10 => 0x51, // Q
|
||||||
|
0x11 => 0x57, // W
|
||||||
|
0x12 => 0x45, // E
|
||||||
|
0x13 => 0x52, // R
|
||||||
|
0x14 => 0x54, // T
|
||||||
|
0x15 => 0x59, // Y position (QWERTZ: the Z key)
|
||||||
|
0x16 => 0x55, // U
|
||||||
|
0x17 => 0x49, // I
|
||||||
|
0x18 => 0x4F, // O
|
||||||
|
0x19 => 0x50, // P
|
||||||
|
0x1A => 0xDB, // [{ VK_OEM_4 (DE: ü)
|
||||||
|
0x1B => 0xDD, // ]} VK_OEM_6
|
||||||
|
0x1E => 0x41, // A
|
||||||
|
0x1F => 0x53, // S
|
||||||
|
0x20 => 0x44, // D
|
||||||
|
0x21 => 0x46, // F
|
||||||
|
0x22 => 0x47, // G
|
||||||
|
0x23 => 0x48, // H
|
||||||
|
0x24 => 0x4A, // J
|
||||||
|
0x25 => 0x4B, // K
|
||||||
|
0x26 => 0x4C, // L
|
||||||
|
0x27 => 0xBA, // ;: VK_OEM_1 (DE: ö)
|
||||||
|
0x28 => 0xDE, // '" VK_OEM_7 (DE: ä)
|
||||||
|
0x29 => 0xC0, // `~ VK_OEM_3 (DE: ^)
|
||||||
|
0x2B => 0xDC, // \| VK_OEM_5
|
||||||
|
0x2C => 0x5A, // Z position (QWERTZ: the Y key)
|
||||||
|
0x2D => 0x58, // X
|
||||||
|
0x2E => 0x43, // C
|
||||||
|
0x2F => 0x56, // V
|
||||||
|
0x30 => 0x42, // B
|
||||||
|
0x31 => 0x4E, // N
|
||||||
|
0x32 => 0x4D, // M
|
||||||
|
0x33 => 0xBC, // ,< VK_OEM_COMMA
|
||||||
|
0x34 => 0xBE, // .> VK_OEM_PERIOD
|
||||||
|
0x35 => 0xBF, // /? VK_OEM_2
|
||||||
|
0x56 => 0xE2, // <>| VK_OEM_102 (ISO)
|
||||||
|
_ => return None,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use super::*;
|
||||||
|
|
||||||
|
/// The German-scramble regression pins: the physical keys a QWERTZ board labels Z/Y/ö/ü must
|
||||||
|
/// leave this client as their US-position VKs, regardless of the local layout's vkCode.
|
||||||
|
#[test]
|
||||||
|
fn positional_pins_for_the_qwertz_scramble() {
|
||||||
|
assert_eq!(scan_to_positional_vk(0x15), Some(0x59)); // QWERTZ Z key → VK_Y (US position)
|
||||||
|
assert_eq!(scan_to_positional_vk(0x2C), Some(0x5A)); // QWERTZ Y key → VK_Z (US position)
|
||||||
|
assert_eq!(scan_to_positional_vk(0x27), Some(0xBA)); // ö key → VK_OEM_1 (US ;: position)
|
||||||
|
assert_eq!(scan_to_positional_vk(0x1A), Some(0xDB)); // ü key → VK_OEM_4 (US [{ position)
|
||||||
|
assert_eq!(scan_to_positional_vk(0x28), Some(0xDE)); // ä key → VK_OEM_7 (US '" position)
|
||||||
|
assert_eq!(scan_to_positional_vk(0x0C), Some(0xBD)); // ß key → VK_OEM_MINUS (US -_ position)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Keys outside the layout-variant typing area stay un-mapped (vkCode passes through).
|
||||||
|
#[test]
|
||||||
|
fn invariant_keys_fall_through() {
|
||||||
|
for scan in [
|
||||||
|
0x01u16, 0x0E, 0x0F, 0x1C, 0x1D, 0x2A, 0x36, 0x38, 0x39, 0x3B, 0x45, 0x57,
|
||||||
|
] {
|
||||||
|
assert_eq!(scan_to_positional_vk(scan), None, "scan 0x{scan:02X}");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Exactly the 48 typing-area keys are covered (10 digits + 26 letters + 12 OEM), and every
|
||||||
|
/// mapping is unique — two physical keys must never collapse onto one wire VK.
|
||||||
|
#[test]
|
||||||
|
fn table_covers_the_typing_area_bijectively() {
|
||||||
|
let mapped: Vec<(u16, u8)> = (0u16..=0xFF)
|
||||||
|
.filter_map(|sc| scan_to_positional_vk(sc).map(|vk| (sc, vk)))
|
||||||
|
.collect();
|
||||||
|
assert_eq!(mapped.len(), 48);
|
||||||
|
let mut vks: Vec<u8> = mapped.iter().map(|&(_, vk)| vk).collect();
|
||||||
|
vks.sort_unstable();
|
||||||
|
vks.dedup();
|
||||||
|
assert_eq!(vks.len(), 48, "duplicate wire VK in the positional table");
|
||||||
|
}
|
||||||
|
}
|
||||||
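To see why the hook filters on the extended flag before consulting the table, here is a reduced sketch of the wire-key decision. The table is deliberately trimmed to three entries taken from the diff, and `positional_vk` and `wire_key` are hypothetical stand-ins for `scan_to_positional_vk` and the inline logic in `kbd_proc`.

```rust
/// Trimmed stand-in for the positional table (three entries copied from the diff).
fn positional_vk(scan: u16) -> Option<u8> {
    match scan {
        0x15 => Some(0x59), // US Y position (the key a QWERTZ board labels Z)
        0x2C => Some(0x5A), // US Z position (the key a QWERTZ board labels Y)
        0x35 => Some(0xBF), // US /? position, VK_OEM_2
        _ => None,
    }
}

/// Pick the VK that goes on the wire: extended keys keep the hook's vkCode,
/// everything else prefers the scancode-derived positional VK.
fn wire_key(scan: u16, vk_code: u8, extended: bool) -> u8 {
    if extended {
        vk_code
    } else {
        positional_vk(scan).unwrap_or(vk_code)
    }
}
```

On a German layout the key labelled Z arrives as scancode `0x15` with `vkCode` `0x5A`; the sketch sends `0x59` instead. Numpad divide shares scancode `0x35` with the `/?` key but carries the extended flag, so it keeps its `VK_DIVIDE` (`0x6F`) rather than being rewritten to `VK_OEM_2`.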
|
|||||||
@@ -35,6 +35,8 @@ mod input;
|
|||||||
#[cfg(windows)]
|
#[cfg(windows)]
|
||||||
mod present;
|
mod present;
|
||||||
#[cfg(windows)]
|
#[cfg(windows)]
|
||||||
|
mod render;
|
||||||
|
#[cfg(windows)]
|
||||||
mod session;
|
mod session;
|
||||||
#[cfg(windows)]
|
#[cfg(windows)]
|
||||||
mod trust;
|
mod trust;
|
||||||
|
|||||||
@@ -1,17 +1,29 @@
|
|||||||
//! Direct3D11 presenter for a WinUI 3 `SwapChainPanel`. It draws a decoded frame Contain-fit into a
|
//! Direct3D11 presenter for a WinUI 3 `SwapChainPanel`. It draws a decoded frame Contain-fit into a
|
||||||
//! **composition** flip-model swapchain, which the reactor stream page binds to the panel via
|
//! **composition** flip-model swapchain, which the reactor stream page binds to the panel via
|
||||||
//! `SwapChainPanelHandle::set_swap_chain`.
|
//! `SwapChainPanelHandle::set_swap_chain`. After that one UI-thread bind, the presenter lives on
|
||||||
|
//! the dedicated render thread ([`crate::render`]) — presenting never touches (or is stalled by)
|
||||||
|
//! the XAML thread.
|
||||||
//!
|
//!
|
||||||
//! Two frame sources, one swapchain:
|
//! Two frame sources, one pair of YUV shaders (identical colour math for both):
|
||||||
//!
|
//!
|
||||||
//! * **GPU (zero-copy)** — [`crate::video::GpuFrame`] is a decoder-owned NV12/P010 `ID3D11Texture2D`
|
//! * **GPU (D3D11VA)** — [`crate::video::GpuFrame`] is a slice of the decoder-only NV12/P010
|
||||||
//! array slice (D3D11VA). We create per-plane shader-resource views over the slice and convert
|
//! texture array. One `CopySubresourceRegion` with a display-size box moves the slice — **both
|
||||||
//! YUV→RGB in a pixel shader: NV12 via BT.709 (`ps_nv12`), P010 via BT.2020 with the PQ transfer
|
//! planes; in D3D11 a planar slice is a single subresource** (unlike D3D12) — into our
|
||||||
//! left intact (`ps_p010`). No CPU copy. The decoder uses the **same** shared device
|
//! sampleable texture, which per-plane SRVs (R8/R8G8, R16/R16G16) expose to the shaders. The
|
||||||
//! ([`crate::gpu`]) so the texture is bindable here.
|
//! source box is mandatory: the decode array is coded-size (e.g. 1920×1088), the target
|
||||||
//! * **CPU upload** — [`crate::video::CpuFrame`] is packed RGBA (SDR) or X2BGR10 (HDR) from the
|
//! display-size (1920×1080), and D3D11 silently drops size-mismatched full-resource copies.
|
||||||
//! software decoder; we upload it into a dynamic texture and draw it with a passthrough shader
|
//! * **CPU upload** — [`crate::video::CpuFrame`] carries NV12/P010 planes from the software
|
||||||
//! (`ps_rgba`). The fallback path.
|
//! decoder; they upload into two dynamic plane textures feeding the same SRV slots/shaders.
|
||||||
|
//!
|
||||||
|
//! **Pacing**: the swapchain is created with `DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT`
|
||||||
|
//! and `SetMaximumFrameLatency(1)` (flagless fallback for odd drivers). The render thread waits
|
||||||
|
//! on the latency waitable before drawing, so at most one present is ever queued (minimum compose
|
||||||
|
//! latency) and a stream faster than the display drops frames *before* any GPU work. Every
|
||||||
|
//! `ResizeBuffers` must re-pass the creation flags — that's `swap_flags`.
|
||||||
|
//!
|
||||||
|
//! **HiDPI**: buffers are sized in physical pixels and `IDXGISwapChain2::SetMatrixTransform`
|
||||||
|
//! (scale 96/DPI) maps them to the panel's DIP coordinate space — without it XAML samples a
|
||||||
|
//! DIP-sized buffer up and the video is blurry at 125/150 % scaling.
|
||||||
//!
|
//!
|
||||||
//! **HDR10**: when a frame is BT.2020 PQ the swapchain flips to `R10G10B10A2` +
|
//! **HDR10**: when a frame is BT.2020 PQ the swapchain flips to `R10G10B10A2` +
|
||||||
//! `DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020` (+ HDR10 metadata) via `ResizeBuffers`/
|
//! `DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020` (+ HDR10 metadata) via `ResizeBuffers`/
|
||||||
@@ -21,21 +33,23 @@
|
|||||||
//! All `windows` types here come from the same windows-rs commit as `windows-reactor`, so the
|
//! All `windows` types here come from the same windows-rs commit as `windows-reactor`, so the
|
||||||
//! `IDXGISwapChain1` handed to `set_swap_chain` satisfies reactor's `windows_core::Interface`.
|
//! `IDXGISwapChain1` handed to `set_swap_chain` satisfies reactor's `windows_core::Interface`.
|
||||||
|
|
||||||
use crate::video::{DecodedFrame, GpuFrame};
|
use crate::video::{CpuFrame, DecodedFrame, GpuFrame};
|
||||||
use anyhow::{anyhow, Context, Result};
|
use anyhow::{anyhow, Context, Result};
|
||||||
use windows::core::{Interface, PCSTR};
|
use windows::core::{Interface, PCSTR};
|
||||||
|
use windows::Win32::Foundation::{CloseHandle, HANDLE, WAIT_OBJECT_0};
|
||||||
use windows::Win32::Graphics::Direct3D::Fxc::{D3DCompile, D3DCOMPILE_OPTIMIZATION_LEVEL3};
|
use windows::Win32::Graphics::Direct3D::Fxc::{D3DCompile, D3DCOMPILE_OPTIMIZATION_LEVEL3};
|
||||||
use windows::Win32::Graphics::Direct3D::{
|
use windows::Win32::Graphics::Direct3D::{
|
||||||
ID3DBlob, D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST, D3D_SRV_DIMENSION_TEXTURE2DARRAY,
|
ID3DBlob, D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST, D3D_SRV_DIMENSION_TEXTURE2D,
|
||||||
};
|
};
|
||||||
use windows::Win32::Graphics::Direct3D11::*;
|
use windows::Win32::Graphics::Direct3D11::*;
|
||||||
use windows::Win32::Graphics::Dxgi::Common::*;
|
use windows::Win32::Graphics::Dxgi::Common::*;
|
||||||
use windows::Win32::Graphics::Dxgi::*;
|
use windows::Win32::Graphics::Dxgi::*;
|
||||||
|
use windows::Win32::System::Threading::WaitForSingleObject;
|
||||||
|
|
||||||
// One vertex shader (fullscreen triangle) + three pixel shaders, selected per frame source. tex0 is
|
// One vertex shader (fullscreen triangle) + two pixel shaders, selected per frame colour space.
|
||||||
// RGBA (passthrough) or the luma plane; tex1 is the chroma plane. The YUV→RGB matrices fold the
|
// tex0 is the luma plane, tex1 the chroma plane. The YUV→RGB matrices fold the limited→full range
|
||||||
// limited→full range scale into the coefficients; for P010 the R16 sample is rescaled (×65535/65472)
|
// scale into the coefficients; for P010 the R16 sample is rescaled (×65535/65472) to undo the
|
||||||
// to undo the 10-bits-in-the-high-bits packing, then converted with BT.2020 NCL, PQ preserved.
|
// 10-bits-in-the-high-bits packing, then converted with BT.2020 NCL, PQ preserved.
|
||||||
const SHADER_HLSL: &str = r#"
|
const SHADER_HLSL: &str = r#"
|
||||||
struct VSOut { float4 pos : SV_Position; float2 uv : TEXCOORD0; };
|
struct VSOut { float4 pos : SV_Position; float2 uv : TEXCOORD0; };
|
||||||
VSOut vs_main(uint vid : SV_VertexID) {
|
VSOut vs_main(uint vid : SV_VertexID) {
|
||||||
@@ -49,8 +63,6 @@ Texture2D tex0 : register(t0);
|
|||||||
Texture2D tex1 : register(t1);
|
Texture2D tex1 : register(t1);
|
||||||
SamplerState smp : register(s0);
|
SamplerState smp : register(s0);
|
||||||
|
|
||||||
float4 ps_rgba(VSOut i) : SV_Target { return tex0.Sample(smp, i.uv); }
|
|
||||||
|
|
||||||
float4 ps_nv12(VSOut i) : SV_Target {
|
float4 ps_nv12(VSOut i) : SV_Target {
|
||||||
float y = tex0.Sample(smp, i.uv).r;
|
float y = tex0.Sample(smp, i.uv).r;
|
||||||
float2 uv = tex1.Sample(smp, i.uv).rg;
|
float2 uv = tex1.Sample(smp, i.uv).rg;
|
||||||
@@ -77,46 +89,53 @@ float4 ps_p010(VSOut i) : SV_Target {
|
|||||||
}
|
}
|
||||||
"#;
|
"#;
|
||||||
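The shader comment above mentions rescaling the P010 `R16` sample by 65535/65472 to undo the 10-bits-in-the-high-bits packing. The arithmetic can be checked on the CPU with a minimal sketch; both function names are hypothetical and are not part of this file.

```rust
/// Normalised value an R16_UNORM view would return for a 10-bit code stored
/// P010-style. (Hypothetical helper, for illustration only.)
fn r16_unorm_sample(code10: u16) -> f32 {
    // P010 keeps the 10 significant bits in the high bits; the low 6 bits are zero.
    let stored = (code10 & 0x3FF) << 6;
    stored as f32 / 65535.0
}

/// Undo the packing so that the maximum code (1023, stored as 65472) maps to 1.0.
fn rescale_p010(sample: f32) -> f32 {
    sample * (65535.0 / 65472.0)
}
```

After the rescale, a code `c` comes out as approximately `c / 1023`, the normalised 10-bit value that the limited-range matrix coefficients expect.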
|
|
||||||
/// A bound GPU frame: per-plane SRVs over the decoder's texture-array slice, plus the `GpuFrame`
|
/// The currently bound frame: per-plane SRVs (over the GPU sample texture or the CPU plane
|
||||||
/// itself kept alive so the decoder won't recycle the slice while we re-present it.
|
/// textures) + the colour space that picks the shader. Redraws (resize, letterbox) re-present it.
|
||||||
struct GpuView {
|
struct Bound {
|
||||||
y: ID3D11ShaderResourceView,
|
y: ID3D11ShaderResourceView,
|
||||||
c: ID3D11ShaderResourceView,
|
c: ID3D11ShaderResourceView,
|
||||||
/// Held only for its `Drop` (returns the decoder surface to the reuse pool) — never read.
|
hdr: bool,
|
||||||
#[allow(dead_code)]
|
|
||||||
frame: GpuFrame,
|
|
||||||
}
|
|
||||||
|
|
||||||
/// Current draw source.
|
|
||||||
#[derive(Clone, Copy, PartialEq)]
|
|
||||||
enum Mode {
|
|
||||||
Empty,
|
|
||||||
Rgba,
|
|
||||||
Nv12,
|
|
||||||
P010,
|
|
||||||
}
|
}
|
||||||
|
|
||||||
pub struct Presenter {
|
pub struct Presenter {
|
||||||
device: ID3D11Device,
|
device: ID3D11Device,
|
||||||
context: ID3D11DeviceContext,
|
context: ID3D11DeviceContext,
|
||||||
vs: ID3D11VertexShader,
|
vs: ID3D11VertexShader,
|
||||||
ps_rgba: ID3D11PixelShader,
|
|
||||||
ps_nv12: ID3D11PixelShader,
|
ps_nv12: ID3D11PixelShader,
|
||||||
ps_p010: ID3D11PixelShader,
|
ps_p010: ID3D11PixelShader,
|
||||||
sampler: ID3D11SamplerState,
|
sampler: ID3D11SamplerState,
|
||||||
swap: IDXGISwapChain1,
|
swap: IDXGISwapChain1,
|
||||||
|
    /// Creation flags — MUST be re-passed to every `ResizeBuffers` or the call fails.
|
||||||
|
swap_flags: u32,
|
||||||
|
/// The frame-latency waitable (owned; closed in `Drop`), `None` on the flagless fallback.
|
||||||
|
waitable: Option<HANDLE>,
|
||||||
rtv: Option<ID3D11RenderTargetView>,
|
rtv: Option<ID3D11RenderTargetView>,
|
||||||
/// CPU-upload texture + SRV + dimensions; recreated when the decoded size/format changes.
|
/// GPU path: sampleable copy target for the decoded slice — `(tex, w, h, ten_bit)`, recreated
|
||||||
cpu_tex: Option<(ID3D11Texture2D, ID3D11ShaderResourceView, u32, u32)>,
|
/// when the decoded size/bit depth changes. Format must equal the decode array's (NV12/P010).
|
||||||
/// Bound zero-copy GPU frame (held to keep its decoder surface alive).
|
sample_tex: Option<(ID3D11Texture2D, u32, u32, bool)>,
|
||||||
gpu: Option<GpuView>,
|
/// The last GPU frame, held until the NEXT bind so its decode surface stays out of the reuse
|
||||||
mode: Mode,
|
/// pool at least until this frame's copy has been queued ahead of any later decoder write.
|
||||||
|
gpu_frame: Option<GpuFrame>,
|
||||||
|
/// CPU path: dynamic luma + chroma plane textures + their SRVs — `(y, uv, y_srv, uv_srv, w, h,
|
||||||
|
/// ten_bit)`, recreated when the decoded size/bit depth changes.
|
||||||
|
#[allow(clippy::type_complexity)]
|
||||||
|
plane_tex: Option<(
|
||||||
|
ID3D11Texture2D,
|
||||||
|
ID3D11Texture2D,
|
||||||
|
ID3D11ShaderResourceView,
|
||||||
|
ID3D11ShaderResourceView,
|
||||||
|
u32,
|
||||||
|
u32,
|
||||||
|
bool,
|
||||||
|
)>,
|
||||||
|
bound: Option<Bound>,
|
||||||
/// Source frame dimensions, for the Contain-fit letterbox.
|
/// Source frame dimensions, for the Contain-fit letterbox.
|
||||||
src_w: u32,
|
src_w: u32,
|
||||||
src_h: u32,
|
src_h: u32,
|
||||||
/// Panel (swapchain) size in pixels, updated on resize.
|
/// Panel (swapchain) size in physical pixels + the window DPI, updated on resize.
|
||||||
panel_w: u32,
|
panel_w: u32,
|
||||||
panel_h: u32,
|
panel_h: u32,
|
||||||
|
dpi: u32,
|
||||||
/// Whether the swapchain is currently in 10-bit HDR10 (R10G10B10A2 + ST.2084) mode.
|
/// Whether the swapchain is currently in 10-bit HDR10 (R10G10B10A2 + ST.2084) mode.
|
||||||
hdr: bool,
|
hdr: bool,
|
||||||
/// The source's static HDR mastering metadata received over the protocol (`0xCE`), applied via
|
/// The source's static HDR mastering metadata received over the protocol (`0xCE`), applied via
|
||||||
@@ -126,45 +145,71 @@ pub struct Presenter {
|
|||||||
}
|
}
|
||||||
|
|
||||||
/// Latest source HDR mastering metadata, written by the session pump (`session.rs`, the sole
|
/// Latest source HDR mastering metadata, written by the session pump (`session.rs`, the sole
|
||||||
/// `next_hdr_meta` consumer) and read by `present_newest` on the UI thread — decoupled so the
|
/// `next_hdr_meta` consumer) and read by the render thread before each present — decoupled so the
|
||||||
/// presenter doesn't need the connector. One session at a time on the client, so a single slot.
|
/// presenter doesn't need the connector. One session at a time on the client, so a single slot.
|
||||||
pub static LATEST_HDR_META: std::sync::Mutex<Option<punktfunk_core::quic::HdrMeta>> =
|
pub static LATEST_HDR_META: std::sync::Mutex<Option<punktfunk_core::quic::HdrMeta>> =
|
||||||
std::sync::Mutex::new(None);
|
std::sync::Mutex::new(None);
|
||||||
|
|
||||||
impl Presenter {
|
impl Presenter {
|
||||||
/// Create the presenter on the process-wide shared D3D11 device (the one the decoder uses), plus
|
/// Create the presenter on the process-wide shared D3D11 device (the one the decoder uses), plus
|
||||||
/// the composition swapchain + shaders, sized to the panel.
|
/// the composition swapchain + shaders, sized to the panel in physical pixels at `dpi`.
|
||||||
pub fn new(width: u32, height: u32) -> Result<Presenter> {
|
pub fn new(width: u32, height: u32, dpi: u32) -> Result<Presenter> {
|
||||||
let shared = crate::gpu::shared().ok_or_else(|| anyhow!("no shared D3D11 device"))?;
|
let shared = crate::gpu::shared().ok_or_else(|| anyhow!("no shared D3D11 device"))?;
|
||||||
let device = shared.device.clone();
|
let device = shared.device.clone();
|
||||||
let context = shared.context.clone();
|
let context = shared.context.clone();
|
||||||
let (vs, ps_rgba, ps_nv12, ps_p010, sampler) = build_pipeline(&device)?;
|
let (vs, ps_nv12, ps_p010, sampler) = build_pipeline(&device)?;
|
||||||
let swap = create_composition_swapchain(&device, width.max(1), height.max(1))?;
|
let (swap, swap_flags) =
|
||||||
Ok(Presenter {
|
create_composition_swapchain(&device, width.max(1), height.max(1))?;
|
||||||
|
// ≤1 queued present: the render thread blocks on the waitable, so a frame is only drawn
|
||||||
|
// when the compositor is ready to take it — the newest-wins drain happens after the wait.
|
||||||
|
let waitable = (swap_flags & DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT.0 as u32
|
||||||
|
!= 0)
|
||||||
|
.then(|| unsafe {
|
||||||
|
let sc2: IDXGISwapChain2 = swap.cast().ok()?;
|
||||||
|
sc2.SetMaximumFrameLatency(1).ok()?;
|
||||||
|
let h = sc2.GetFrameLatencyWaitableObject();
|
||||||
|
(!h.is_invalid()).then_some(h)
|
||||||
|
})
|
||||||
|
.flatten();
|
||||||
|
let p = Presenter {
|
||||||
device,
|
device,
|
||||||
context,
|
context,
|
||||||
vs,
|
vs,
|
||||||
ps_rgba,
|
|
||||||
ps_nv12,
|
ps_nv12,
|
||||||
ps_p010,
|
ps_p010,
|
||||||
sampler,
|
sampler,
|
||||||
swap,
|
swap,
|
||||||
|
swap_flags,
|
||||||
|
waitable,
|
||||||
rtv: None,
|
rtv: None,
|
||||||
cpu_tex: None,
|
sample_tex: None,
|
||||||
gpu: None,
|
gpu_frame: None,
|
||||||
mode: Mode::Empty,
|
plane_tex: None,
|
||||||
|
bound: None,
|
||||||
src_w: 1,
|
src_w: 1,
|
||||||
src_h: 1,
|
src_h: 1,
|
||||||
panel_w: width.max(1),
|
panel_w: width.max(1),
|
||||||
panel_h: height.max(1),
|
panel_h: height.max(1),
|
||||||
|
dpi: dpi.max(96),
|
||||||
hdr: false,
|
hdr: false,
|
||||||
hdr_meta: None,
|
hdr_meta: None,
|
||||||
})
|
};
|
||||||
|
p.apply_dpi_matrix();
|
||||||
|
Ok(p)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Block until the swapchain can take another present (≤ `timeout_ms`). True when a present
|
||||||
|
/// slot is free; also true on the flagless fallback (no throttle available, just present).
|
||||||
|
pub fn wait_present_slot(&self, timeout_ms: u32) -> bool {
|
||||||
|
match self.waitable {
|
||||||
|
Some(h) => unsafe { WaitForSingleObject(h, timeout_ms) == WAIT_OBJECT_0 },
|
||||||
|
None => true,
|
||||||
|
}
|
||||||
}
|
}
|
||||||
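The pacing comments describe a newest-wins drain that runs after the latency wait. A minimal sketch of that drain follows, assuming a plain `std::sync::mpsc` channel as a stand-in for the crossbeam frame channel; `drain_newest` is a hypothetical name, not a function in this commit.

```rust
/// Drain everything currently queued and keep only the newest item.
/// Returns the newest item (if any) and how many older items were discarded.
fn drain_newest<T>(rx: &std::sync::mpsc::Receiver<T>) -> (Option<T>, usize) {
    let mut newest: Option<T> = None;
    let mut dropped = 0;
    // try_recv never blocks: the loop ends as soon as the queue is empty
    // (or the sender has gone away).
    while let Ok(item) = rx.try_recv() {
        if newest.replace(item).is_some() {
            dropped += 1;
        }
    }
    (newest, dropped)
}
```

Draining after the wait rather than before it means that frames arriving while the thread was blocked replace older ones before any GPU work is issued for them.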
|
|
||||||
/// Update the source HDR mastering metadata (from the `0xCE` plane). Stored for the next HDR
|
/// Update the source HDR mastering metadata (from the `0xCE` plane). Stored for the next HDR
|
||||||
/// swapchain switch, and applied immediately if already presenting HDR. A no-op when unchanged
|
/// swapchain switch, and applied immediately if already presenting HDR. A no-op when unchanged
|
||||||
/// (so it's cheap to call every frame from the present loop).
|
/// (so it's cheap to call every frame from the render loop).
|
||||||
pub fn set_hdr_metadata(&mut self, meta: punktfunk_core::quic::HdrMeta) {
|
pub fn set_hdr_metadata(&mut self, meta: punktfunk_core::quic::HdrMeta) {
|
||||||
if self.hdr_meta == Some(meta) {
|
if self.hdr_meta == Some(meta) {
|
||||||
return;
|
return;
|
||||||
@@ -180,28 +225,54 @@ impl Presenter {
|
|||||||
&self.swap
|
&self.swap
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Resize the back buffers to the panel's new size (drops the stale RTV).
|
/// Resize the back buffers to the panel's new size in physical pixels at `dpi` (drops the
|
||||||
pub fn resize(&mut self, width: u32, height: u32) {
|
/// stale RTV, re-applies the DIP↔pixel matrix).
|
||||||
if width == 0 || height == 0 || (width == self.panel_w && height == self.panel_h) {
|
pub fn resize(&mut self, width: u32, height: u32, dpi: u32) {
|
||||||
|
let dpi = dpi.max(96);
|
||||||
|
if width == 0
|
||||||
|
|| height == 0
|
||||||
|
|| (width == self.panel_w && height == self.panel_h && dpi == self.dpi)
|
||||||
|
{
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
self.rtv = None; // release all back-buffer refs before ResizeBuffers
|
self.rtv = None; // release all back-buffer refs before ResizeBuffers
|
||||||
unsafe {
|
unsafe {
|
||||||
let _ = self.swap.ResizeBuffers(
|
if let Err(e) = self.swap.ResizeBuffers(
|
||||||
0,
|
0,
|
||||||
width,
|
width,
|
||||||
height,
|
height,
|
||||||
DXGI_FORMAT_UNKNOWN,
|
DXGI_FORMAT_UNKNOWN,
|
||||||
DXGI_SWAP_CHAIN_FLAG(0),
|
DXGI_SWAP_CHAIN_FLAG(self.swap_flags as i32),
|
||||||
);
|
) {
|
||||||
|
tracing::warn!(error = %e, "ResizeBuffers failed");
|
||||||
|
return;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
self.panel_w = width;
|
self.panel_w = width;
|
||||||
self.panel_h = height;
|
self.panel_h = height;
|
||||||
|
self.dpi = dpi;
|
||||||
|
self.apply_dpi_matrix();
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Present one decoded frame (Contain-fit) — or, when `frame` is `None`, re-present the last one
|
/// Map the pixel-sized buffers into the panel's DIP coordinate space (scale 96/DPI) — XAML
|
||||||
/// (or black). Called from the reactor `on_rendering` per-frame callback on the UI thread. Takes
|
/// otherwise stretches whatever size the buffers are to the panel's DIP bounds (blurry).
|
||||||
/// the frame by value so the GPU path can retain the decoder surface across re-presents.
|
fn apply_dpi_matrix(&self) {
|
||||||
|
let s = 96.0 / self.dpi as f32;
|
||||||
|
if let Ok(sc2) = self.swap.cast::<IDXGISwapChain2>() {
|
||||||
|
let m = DXGI_MATRIX_3X2_F {
|
||||||
|
_11: s,
|
||||||
|
_22: s,
|
||||||
|
..Default::default()
|
||||||
|
};
|
||||||
|
if let Err(e) = unsafe { sc2.SetMatrixTransform(&m) } {
|
||||||
|
tracing::warn!(error = %e, "SetMatrixTransform failed");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
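The HiDPI mapping above relies on two numbers: the physical-pixel buffer size and the 96/DPI scale on the matrix diagonal. A small sketch of that arithmetic follows. The helper names and the round-to-nearest choice are assumptions; how the caller actually derives the pixel size is not shown in this diff.

```rust
/// Physical-pixel buffer size for a panel measured in DIPs.
/// (Hypothetical helper; rounding to nearest is an assumption.)
fn physical_size(dip_w: f32, dip_h: f32, dpi: u32) -> (u32, u32) {
    let scale = dpi.max(96) as f32 / 96.0;
    (
        (dip_w * scale).round().max(1.0) as u32,
        (dip_h * scale).round().max(1.0) as u32,
    )
}

/// The value placed in `_11` and `_22` of the swapchain matrix: 96 / DPI.
fn dip_scale(dpi: u32) -> f32 {
    96.0 / dpi.max(96) as f32
}
```

At 150 % scaling (144 DPI) a 1280×720 DIP panel gets 1920×1080 buffers, and the 2/3 scale maps them back onto the panel's DIP bounds without resampling.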
|
|
||||||
|
/// Present one decoded frame (Contain-fit) — or, when `frame` is `None`, re-present the last
|
||||||
|
/// one (or black). Called from the render thread. Takes the frame by value: the GPU path
|
||||||
|
/// retains the decoder surface until the next bind.
|
||||||
pub fn present(&mut self, frame: Option<DecodedFrame>) {
|
pub fn present(&mut self, frame: Option<DecodedFrame>) {
|
||||||
match frame {
|
match frame {
|
||||||
Some(DecodedFrame::Cpu(c)) => {
|
Some(DecodedFrame::Cpu(c)) => {
|
||||||
@@ -210,20 +281,14 @@ impl Presenter {
|
|||||||
}
|
}
|
||||||
if let Err(e) = self.upload(&c) {
|
if let Err(e) = self.upload(&c) {
|
||||||
tracing::warn!(error = %e, "frame upload failed");
|
tracing::warn!(error = %e, "frame upload failed");
|
||||||
} else {
|
|
||||||
self.mode = Mode::Rgba;
|
|
||||||
self.src_w = c.width;
|
|
||||||
self.src_h = c.height;
|
|
||||||
self.gpu = None; // drop any held GPU frame
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
Some(DecodedFrame::Gpu(g)) => {
|
Some(DecodedFrame::Gpu(g)) => {
|
||||||
if g.hdr != self.hdr {
|
if g.hdr != self.hdr {
|
||||||
self.set_hdr(g.hdr);
|
self.set_hdr(g.hdr);
|
||||||
}
|
}
|
||||||
match self.bind_gpu(g) {
|
if let Err(e) = self.bind_gpu(g) {
|
||||||
Ok(()) => {}
|
tracing::warn!(error = %e, "GPU frame bind failed");
|
||||||
Err(e) => tracing::warn!(error = %e, "GPU frame bind failed"),
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
None => {}
|
None => {}
|
||||||
@@ -231,46 +296,102 @@ impl Presenter {
|
|||||||
self.draw();
|
self.draw();
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Build per-plane SRVs over the decoded texture-array slice and retain the frame.
|
/// Copy the decoded slice into our sampleable texture and build per-plane SRVs over it. The
|
||||||
|
/// decode array is decoder-only (NVIDIA won't bind a decoder array as a shader resource), so
|
||||||
|
/// it can't be sampled directly — one GPU-to-GPU copy makes the frame sampleable on every
|
||||||
|
/// vendor. D3D11 planar semantics: the slice is ONE subresource (both planes copy together),
|
||||||
|
/// and the source box is display-size (the array is coded-size; a full-resource copy would
|
||||||
|
/// size-mismatch and be silently dropped).
|
||||||
fn bind_gpu(&mut self, g: GpuFrame) -> Result<()> {
|
fn bind_gpu(&mut self, g: GpuFrame) -> Result<()> {
|
||||||
let tex: ID3D11Texture2D = unsafe {
|
let src: ID3D11Texture2D = unsafe {
|
||||||
let raw = g.texture_ptr();
|
let raw = g.texture_ptr();
|
||||||
ID3D11Texture2D::from_raw_borrowed(&raw)
|
ID3D11Texture2D::from_raw_borrowed(&raw)
|
||||||
.ok_or_else(|| anyhow!("null D3D11 texture"))?
|
.ok_or_else(|| anyhow!("null D3D11 texture"))?
|
||||||
.clone()
|
.clone()
|
||||||
};
|
};
|
||||||
// NV12: R8 luma + R8G8 chroma. P010: R16 luma + R16G16 chroma (10 bits in the high bits).
|
self.ensure_sample_tex(g.width, g.height, g.ten_bit)?;
|
||||||
let (fy, fc) = if g.hdr {
|
let dst = self.sample_tex.as_ref().unwrap().0.clone();
|
||||||
(DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16G16_UNORM)
|
// Even-aligned luma coordinates (NV12/P010 chroma is 2×2 subsampled).
|
||||||
} else {
|
let src_box = D3D11_BOX {
|
||||||
(DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8G8_UNORM)
|
left: 0,
|
||||||
|
top: 0,
|
||||||
|
front: 0,
|
||||||
|
right: g.width & !1,
|
||||||
|
bottom: g.height & !1,
|
||||||
|
back: 1,
|
||||||
};
|
};
|
||||||
let y = self.array_srv(&tex, fy, g.index)?;
|
unsafe {
|
||||||
let c = self.array_srv(&tex, fc, g.index)?;
|
self.context
|
||||||
self.mode = if g.hdr { Mode::P010 } else { Mode::Nv12 };
|
.CopySubresourceRegion(&dst, 0, 0, 0, 0, &src, g.index, Some(&src_box));
|
||||||
|
}
|
||||||
|
let (fy, fc) = plane_formats(g.ten_bit);
|
||||||
|
let y = self.plane_srv(&dst, fy)?;
|
||||||
|
let c = self.plane_srv(&dst, fc)?;
|
||||||
|
if g.ten_bit != g.hdr {
|
||||||
|
warn_bitdepth_mismatch_once(g.ten_bit, g.hdr);
|
||||||
|
}
|
||||||
self.src_w = g.width;
|
self.src_w = g.width;
|
||||||
self.src_h = g.height;
|
self.src_h = g.height;
|
||||||
self.gpu = Some(GpuView { y, c, frame: g });
|
self.bound = Some(Bound { y, c, hdr: g.hdr });
|
||||||
|
// Hold the frame until the next bind: its decode surface stays out of the reuse pool
|
||||||
|
// until this copy is queued ahead of any later decoder write (previous frame drops here).
|
||||||
|
self.gpu_frame = Some(g);
|
||||||
Ok(())
|
Ok(())
|
||||||
}
|
}
|
||||||
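`bind_gpu` copies a display-size box out of a coded-size decode array. The sketch below shows, under stated assumptions, how the two sizes relate: `display_box` mirrors the `& !1` rounding in the source box, and `coded_size` is a hypothetical helper that rounds up to a power-of-two alignment (16 reproduces the 1080 to 1088 example from the module docs).

```rust
/// (right, bottom) of the copy box in luma pixels, rounded down to even
/// because NV12/P010 chroma is 2x2 subsampled.
fn display_box(width: u32, height: u32) -> (u32, u32) {
    (width & !1, height & !1)
}

/// Round `display` up to a multiple of `align`, which must be a power of two.
/// (Hypothetical helper; the decoder picks the real alignment.)
fn coded_size(display: u32, align: u32) -> u32 {
    (display + align - 1) & !(align - 1)
}
```

Whenever `coded_size` exceeds the display size, a full-resource copy would see mismatched dimensions, which is why the box is mandatory.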
|
|
||||||
/// A shader-resource view over a single slice of a texture array, reinterpreting the plane
|
/// Ensure the sampleable copy texture matches the decoded frame's size + bit depth (NV12 for
|
||||||
/// format (the NV12/P010 sub-format trick D3D11 allows on video textures).
|
/// 8-bit, P010 for 10-bit — the same format as the decode array, a `CopySubresourceRegion`
|
||||||
fn array_srv(
|
/// requirement), recreating it on a change.
|
||||||
|
fn ensure_sample_tex(&mut self, w: u32, h: u32, ten_bit: bool) -> Result<()> {
|
||||||
|
if matches!(&self.sample_tex, Some((_, tw, th, tb)) if *tw == w && *th == h && *tb == ten_bit)
|
||||||
|
{
|
||||||
|
return Ok(());
|
||||||
|
}
|
||||||
|
let desc = D3D11_TEXTURE2D_DESC {
|
||||||
|
Width: w,
|
||||||
|
Height: h,
|
||||||
|
MipLevels: 1,
|
||||||
|
ArraySize: 1,
|
||||||
|
Format: if ten_bit {
|
||||||
|
DXGI_FORMAT_P010
|
||||||
|
} else {
|
||||||
|
DXGI_FORMAT_NV12
|
||||||
|
},
|
||||||
|
SampleDesc: DXGI_SAMPLE_DESC {
|
||||||
|
Count: 1,
|
||||||
|
Quality: 0,
|
||||||
|
},
|
||||||
|
Usage: D3D11_USAGE_DEFAULT,
|
||||||
|
BindFlags: D3D11_BIND_SHADER_RESOURCE.0 as u32,
|
||||||
|
CPUAccessFlags: 0,
|
||||||
|
MiscFlags: 0,
|
||||||
|
};
|
||||||
|
let tex = unsafe {
|
||||||
|
let mut t = None;
|
||||||
|
self.device
|
||||||
|
.CreateTexture2D(&desc, None, Some(&mut t))
|
||||||
|
.context("CreateTexture2D (sample target)")?;
|
||||||
|
t.ok_or_else(|| anyhow!("null sample texture"))?
|
||||||
|
};
|
||||||
|
self.sample_tex = Some((tex, w, h, ten_bit));
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A shader-resource view over one plane of a single (non-array) NV12/P010 texture — the
|
||||||
|
/// R8/R8G8 (or R16/R16G16) format selects the luma vs. chroma plane (the D3D11 video
|
||||||
|
/// sub-format trick).
|
||||||
|
fn plane_srv(
|
||||||
&self,
|
&self,
|
||||||
tex: &ID3D11Texture2D,
|
tex: &ID3D11Texture2D,
|
||||||
format: DXGI_FORMAT,
|
format: DXGI_FORMAT,
|
||||||
slice: u32,
|
|
||||||
) -> Result<ID3D11ShaderResourceView> {
|
) -> Result<ID3D11ShaderResourceView> {
|
||||||
let desc = D3D11_SHADER_RESOURCE_VIEW_DESC {
|
let desc = D3D11_SHADER_RESOURCE_VIEW_DESC {
|
||||||
Format: format,
|
Format: format,
|
||||||
ViewDimension: D3D_SRV_DIMENSION_TEXTURE2DARRAY,
|
ViewDimension: D3D_SRV_DIMENSION_TEXTURE2D,
|
||||||
Anonymous: D3D11_SHADER_RESOURCE_VIEW_DESC_0 {
|
Anonymous: D3D11_SHADER_RESOURCE_VIEW_DESC_0 {
|
||||||
Texture2DArray: D3D11_TEX2D_ARRAY_SRV {
|
Texture2D: D3D11_TEX2D_SRV {
|
||||||
MostDetailedMip: 0,
|
MostDetailedMip: 0,
|
||||||
MipLevels: 1,
|
MipLevels: 1,
|
||||||
FirstArraySlice: slice,
|
|
||||||
ArraySize: 1,
|
|
||||||
},
|
},
|
||||||
},
|
},
|
||||||
};
|
};
|
||||||
@@ -278,37 +399,109 @@ impl Presenter {
|
|||||||
let mut srv = None;
|
let mut srv = None;
|
||||||
self.device
|
self.device
|
||||||
.CreateShaderResourceView(tex, Some(&desc), Some(&mut srv))
|
.CreateShaderResourceView(tex, Some(&desc), Some(&mut srv))
|
||||||
.context("CreateShaderResourceView (array slice)")?;
|
.context("CreateShaderResourceView (plane)")?;
|
||||||
srv.ok_or_else(|| anyhow!("null SRV"))
|
srv.ok_or_else(|| anyhow!("null SRV"))
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Upload a software-decoded frame's two planes into the dynamic plane textures (created to
|
||||||
|
/// match size/bit depth), feeding the same SRV slots + shaders as the GPU path.
|
||||||
|
fn upload(&mut self, frame: &CpuFrame) -> Result<()> {
|
||||||
|
let (w, h) = (frame.width, frame.height);
|
||||||
|
let rebuild = !matches!(&self.plane_tex,
|
||||||
|
Some((.., tw, th, tb)) if *tw == w && *th == h && *tb == frame.ten_bit);
|
||||||
|
if rebuild {
|
||||||
|
let (fy, fc) = plane_formats(frame.ten_bit);
|
||||||
|
let y = self.dynamic_tex(w, h, fy)?;
|
||||||
|
let uv = self.dynamic_tex(w.div_ceil(2), h.div_ceil(2), fc)?;
|
||||||
|
let y_srv = self.plane_srv(&y, fy)?;
|
||||||
|
let uv_srv = self.plane_srv(&uv, fc)?;
|
||||||
|
self.plane_tex = Some((y, uv, y_srv, uv_srv, w, h, frame.ten_bit));
|
||||||
|
}
|
||||||
|
let (y, uv, y_srv, uv_srv, ..) = self.plane_tex.as_ref().unwrap();
|
||||||
|
let bytes = if frame.ten_bit { 2 } else { 1 };
|
||||||
|
self.map_rows(y, &frame.y, frame.y_stride, w as usize * bytes, h as usize)?;
|
||||||
|
self.map_rows(
|
||||||
|
uv,
|
||||||
|
&frame.uv,
|
||||||
|
frame.uv_stride,
|
||||||
|
w.div_ceil(2) as usize * 2 * bytes,
|
||||||
|
h.div_ceil(2) as usize,
|
||||||
|
)?;
|
||||||
|
self.src_w = w;
|
||||||
|
self.src_h = h;
|
||||||
|
self.bound = Some(Bound {
|
||||||
|
y: y_srv.clone(),
|
||||||
|
c: uv_srv.clone(),
|
||||||
|
hdr: frame.hdr,
|
||||||
|
});
|
||||||
|
self.gpu_frame = None; // drop any held GPU frame
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
fn dynamic_tex(&self, w: u32, h: u32, format: DXGI_FORMAT) -> Result<ID3D11Texture2D> {
|
||||||
|
let desc = D3D11_TEXTURE2D_DESC {
|
||||||
|
Width: w,
|
||||||
|
Height: h,
|
||||||
|
MipLevels: 1,
|
||||||
|
ArraySize: 1,
|
||||||
|
Format: format,
|
||||||
|
SampleDesc: DXGI_SAMPLE_DESC {
|
||||||
|
Count: 1,
|
||||||
|
Quality: 0,
|
||||||
|
},
|
||||||
|
Usage: D3D11_USAGE_DYNAMIC,
|
||||||
|
BindFlags: D3D11_BIND_SHADER_RESOURCE.0 as u32,
|
||||||
|
CPUAccessFlags: D3D11_CPU_ACCESS_WRITE.0 as u32,
|
||||||
|
MiscFlags: 0,
|
||||||
|
};
|
||||||
|
unsafe {
|
||||||
|
let mut t = None;
|
||||||
|
self.device
|
||||||
|
.CreateTexture2D(&desc, None, Some(&mut t))
|
||||||
|
.context("CreateTexture2D (plane)")?;
|
||||||
|
t.ok_or_else(|| anyhow!("null plane texture"))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
    /// Map-discard `tex` and copy `rows` rows of `row_bytes` bytes each from `src` (stride `src_pitch`).
|
||||||
|
fn map_rows(
|
||||||
|
&self,
|
||||||
|
tex: &ID3D11Texture2D,
|
||||||
|
src: &[u8],
|
||||||
|
src_pitch: usize,
|
||||||
|
row_bytes: usize,
|
||||||
|
rows: usize,
|
||||||
|
) -> Result<()> {
|
||||||
|
unsafe {
|
||||||
|
let mut mapped = D3D11_MAPPED_SUBRESOURCE::default();
|
||||||
|
self.context
|
||||||
|
.Map(tex, 0, D3D11_MAP_WRITE_DISCARD, 0, Some(&mut mapped))
|
||||||
|
.context("Map plane texture")?;
|
||||||
|
let dst = mapped.pData as *mut u8;
|
||||||
|
let dst_pitch = mapped.RowPitch as usize;
|
||||||
|
let n = row_bytes.min(src_pitch);
|
||||||
|
for r in 0..rows {
|
||||||
|
std::ptr::copy_nonoverlapping(
|
||||||
|
src.as_ptr().add(r * src_pitch),
|
||||||
|
dst.add(r * dst_pitch),
|
||||||
|
n,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
self.context.Unmap(tex, 0);
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
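The strided row copy in `map_rows` can be expressed without raw pointers when both sides are slices. This is a sketch with a hypothetical `copy_rows` helper, not the code in this commit; it adds length checks that the pointer version leaves to the caller.

```rust
/// Copy `rows` rows of `row_bytes` from `src` (stride `src_pitch`) into `dst`
/// (stride `dst_pitch`). Returns false, copying nothing, if either buffer is
/// too small for the request.
fn copy_rows(
    dst: &mut [u8],
    dst_pitch: usize,
    src: &[u8],
    src_pitch: usize,
    row_bytes: usize,
    rows: usize,
) -> bool {
    // Never copy more per row than either stride can hold.
    let n = row_bytes.min(src_pitch).min(dst_pitch);
    if rows == 0 || n == 0 {
        return true;
    }
    // The last row only needs `n` bytes, not a full stride.
    let need = |pitch: usize| (rows - 1) * pitch + n;
    if src.len() < need(src_pitch) || dst.len() < need(dst_pitch) {
        return false;
    }
    for r in 0..rows {
        dst[r * dst_pitch..r * dst_pitch + n]
            .copy_from_slice(&src[r * src_pitch..r * src_pitch + n]);
    }
    true
}
```

The two pitches usually differ: the driver chooses the mapped row pitch, while the software decoder chooses the source stride.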
|
|
||||||
fn draw(&mut self) {
|
fn draw(&mut self) {
|
||||||
let Ok(rtv) = self.rtv() else {
|
let Ok(rtv) = self.rtv() else {
|
||||||
return;
|
return;
|
||||||
};
|
};
|
||||||
let (pw, ph) = (self.panel_w, self.panel_h);
|
let (pw, ph) = (self.panel_w, self.panel_h);
|
||||||
// Resolve the current source's shader + the (up to two) SRVs to bind — cheap interface
|
|
||||||
// clones. Each arm yields `Option<(&pixel_shader, [Option<SRV>; 2])>`.
|
|
||||||
let binding = match self.mode {
|
|
||||||
Mode::Rgba => self
|
|
||||||
.cpu_tex
|
|
||||||
.as_ref()
|
|
||||||
.map(|(_, srv, _, _)| (&self.ps_rgba, [Some(srv.clone()), None])),
|
|
||||||
Mode::Nv12 => self
|
|
||||||
.gpu
|
|
||||||
.as_ref()
|
|
||||||
.map(|g| (&self.ps_nv12, [Some(g.y.clone()), Some(g.c.clone())])),
|
|
||||||
Mode::P010 => self
|
|
||||||
.gpu
|
|
||||||
.as_ref()
|
|
||||||
.map(|g| (&self.ps_p010, [Some(g.y.clone()), Some(g.c.clone())])),
|
|
||||||
Mode::Empty => None,
|
|
||||||
};
|
|
||||||
unsafe {
|
unsafe {
|
||||||
let c = &self.context;
|
let c = &self.context;
|
||||||
c.ClearRenderTargetView(&rtv, &[0.0, 0.0, 0.0, 1.0]);
|
c.ClearRenderTargetView(&rtv, &[0.0, 0.0, 0.0, 1.0]);
|
||||||
if let Some((ps, srvs)) = binding {
|
if let Some(bound) = &self.bound {
|
||||||
// Contain-fit viewport: scale to the smaller axis, centre, letterbox the rest.
|
// Contain-fit viewport: scale to the smaller axis, centre, letterbox the rest.
|
||||||
let (ww, wh, vfw, vfh) = (
|
let (ww, wh, vfw, vfh) = (
|
||||||
pw as f32,
|
pw as f32,
|
||||||
@@ -332,8 +525,15 @@ impl Presenter {
|
|||||||
c.IASetInputLayout(None);
|
c.IASetInputLayout(None);
|
||||||
c.IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
|
c.IASetPrimitiveTopology(D3D_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
|
||||||
c.VSSetShader(&self.vs, None);
|
c.VSSetShader(&self.vs, None);
|
||||||
c.PSSetShader(ps, None);
|
c.PSSetShader(
|
||||||
c.PSSetShaderResources(0, Some(&srvs));
|
if bound.hdr {
|
||||||
|
&self.ps_p010
|
||||||
|
} else {
|
||||||
|
&self.ps_nv12
|
||||||
|
},
|
||||||
|
None,
|
||||||
|
);
|
||||||
|
c.PSSetShaderResources(0, Some(&[Some(bound.y.clone()), Some(bound.c.clone())]));
|
||||||
c.PSSetSamplers(0, Some(&[Some(self.sampler.clone())]));
|
c.PSSetSamplers(0, Some(&[Some(self.sampler.clone())]));
|
||||||
c.Draw(3, 0);
|
c.Draw(3, 0);
|
||||||
}
|
}
|
||||||
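The Contain-fit viewport in `draw` scales to the smaller axis ratio, centres the image, and letterboxes the rest. The hunk elides the actual computation, so the following is only a sketch of the usual formula; `contain_fit` is a hypothetical name.

```rust
/// Contain-fit viewport as (x, y, width, height) in panel pixels.
fn contain_fit(panel_w: u32, panel_h: u32, src_w: u32, src_h: u32) -> (f32, f32, f32, f32) {
    let (pw, ph) = (panel_w.max(1) as f32, panel_h.max(1) as f32);
    let (sw, sh) = (src_w.max(1) as f32, src_h.max(1) as f32);
    // The smaller axis ratio is the largest scale at which both axes still fit.
    let scale = (pw / sw).min(ph / sh);
    let (w, h) = (sw * scale, sh * scale);
    // Centre the scaled image; what remains is the letterbox or pillarbox border.
    ((pw - w) / 2.0, (ph - h) / 2.0, w, h)
}
```

For a 1920×1080 source on a 1920×1200 panel this gives a full-width viewport with 60-pixel bars above and below.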
@@ -347,7 +547,6 @@ impl Presenter {
|
|||||||
/// PQ-encoded BT.2020 for HDR, so the colour space is all the compositor needs.
|
/// PQ-encoded BT.2020 for HDR, so the colour space is all the compositor needs.
|
||||||
fn set_hdr(&mut self, on: bool) {
|
fn set_hdr(&mut self, on: bool) {
|
||||||
self.rtv = None; // release back-buffer refs before ResizeBuffers
|
self.rtv = None; // release back-buffer refs before ResizeBuffers
|
||||||
self.cpu_tex = None; // CPU texture format changes (R10G10B10A2 vs R8G8B8A8)
|
|
||||||
let format = if on {
|
let format = if on {
|
||||||
DXGI_FORMAT_R10G10B10A2_UNORM
|
DXGI_FORMAT_R10G10B10A2_UNORM
|
||||||
} else {
|
} else {
|
||||||
@@ -359,7 +558,7 @@ impl Presenter {
|
|||||||
self.panel_w,
|
self.panel_w,
|
||||||
self.panel_h,
|
self.panel_h,
|
||||||
format,
|
format,
|
||||||
DXGI_SWAP_CHAIN_FLAG(0),
|
DXGI_SWAP_CHAIN_FLAG(self.swap_flags as i32),
|
||||||
) {
|
) {
|
||||||
tracing::warn!(error = %e, "ResizeBuffers for HDR switch failed");
|
tracing::warn!(error = %e, "ResizeBuffers for HDR switch failed");
|
||||||
return;
|
return;
|
||||||
@@ -389,6 +588,7 @@ impl Presenter {
|
|||||||
self.apply_hdr_metadata();
|
self.apply_hdr_metadata();
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
self.apply_dpi_matrix(); // belt-and-braces: keep the DIP mapping across the format switch
|
||||||
tracing::info!(hdr = on, "swapchain colour mode switched");
|
tracing::info!(hdr = on, "swapchain colour mode switched");
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -410,68 +610,6 @@ impl Presenter {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
fn upload(&mut self, frame: &crate::video::CpuFrame) -> Result<()> {
|
|
||||||
let (w, h) = (frame.width, frame.height);
|
|
||||||
let need_new = !matches!(&self.cpu_tex, Some((_, _, tw, th)) if *tw == w && *th == h);
|
|
||||||
if need_new {
|
|
||||||
let format = if self.hdr {
|
|
||||||
DXGI_FORMAT_R10G10B10A2_UNORM
|
|
||||||
} else {
|
|
||||||
DXGI_FORMAT_R8G8B8A8_UNORM
|
|
||||||
};
|
|
||||||
let desc = D3D11_TEXTURE2D_DESC {
|
|
||||||
Width: w,
|
|
||||||
Height: h,
|
|
||||||
MipLevels: 1,
|
|
||||||
ArraySize: 1,
|
|
||||||
Format: format,
|
|
||||||
SampleDesc: DXGI_SAMPLE_DESC {
|
|
||||||
Count: 1,
|
|
||||||
Quality: 0,
|
|
||||||
},
|
|
||||||
Usage: D3D11_USAGE_DYNAMIC,
|
|
||||||
BindFlags: D3D11_BIND_SHADER_RESOURCE.0 as u32,
|
|
||||||
CPUAccessFlags: D3D11_CPU_ACCESS_WRITE.0 as u32,
|
|
||||||
MiscFlags: 0,
|
|
||||||
};
|
|
||||||
let texture = unsafe {
|
|
||||||
let mut t = None;
|
|
||||||
self.device
|
|
||||||
.CreateTexture2D(&desc, None, Some(&mut t))
|
|
||||||
.context("CreateTexture2D")?;
|
|
||||||
t.unwrap()
|
|
||||||
};
|
|
||||||
let srv = unsafe {
|
|
||||||
let mut s = None;
|
|
||||||
self.device
|
|
||||||
.CreateShaderResourceView(&texture, None, Some(&mut s))
|
|
||||||
.context("CreateShaderResourceView")?;
|
|
||||||
s.unwrap()
|
|
||||||
};
|
|
||||||
self.cpu_tex = Some((texture, srv, w, h));
|
|
||||||
}
|
|
||||||
let (texture, _, _, _) = self.cpu_tex.as_ref().unwrap();
|
|
||||||
unsafe {
|
|
||||||
let mut mapped = D3D11_MAPPED_SUBRESOURCE::default();
|
|
||||||
self.context
|
|
||||||
.Map(texture, 0, D3D11_MAP_WRITE_DISCARD, 0, Some(&mut mapped))
|
|
||||||
.context("Map video texture")?;
|
|
||||||
let dst = mapped.pData as *mut u8;
|
|
||||||
let dst_pitch = mapped.RowPitch as usize;
|
|
||||||
let src_pitch = frame.stride;
|
|
||||||
let row_bytes = (w as usize) * 4;
|
|
||||||
for y in 0..h as usize {
|
|
||||||
std::ptr::copy_nonoverlapping(
|
|
||||||
frame.pixels.as_ptr().add(y * src_pitch),
|
|
||||||
dst.add(y * dst_pitch),
|
|
||||||
row_bytes.min(src_pitch),
|
|
||||||
);
|
|
||||||
}
|
|
||||||
self.context.Unmap(texture, 0);
|
|
||||||
}
|
|
||||||
Ok(())
|
|
||||||
}
|
|
||||||
|
|
||||||
fn rtv(&mut self) -> Result<ID3D11RenderTargetView> {
|
fn rtv(&mut self) -> Result<ID3D11RenderTargetView> {
|
||||||
if self.rtv.is_none() {
|
if self.rtv.is_none() {
|
||||||
let back: ID3D11Texture2D = unsafe { self.swap.GetBuffer(0).context("GetBuffer")? };
|
let back: ID3D11Texture2D = unsafe { self.swap.GetBuffer(0).context("GetBuffer")? };
|
||||||
@@ -488,18 +626,53 @@ impl Presenter {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// A composition flip-model swapchain (no HWND) for binding to a XAML `SwapChainPanel`.
|
impl Drop for Presenter {
|
||||||
|
fn drop(&mut self) {
|
||||||
|
if let Some(h) = self.waitable.take() {
|
||||||
|
unsafe {
|
||||||
|
let _ = CloseHandle(h);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Luma + chroma plane view formats for NV12 (8-bit) vs P010 (10-in-16-bit).
|
||||||
|
fn plane_formats(ten_bit: bool) -> (DXGI_FORMAT, DXGI_FORMAT) {
|
||||||
|
if ten_bit {
|
||||||
|
(DXGI_FORMAT_R16_UNORM, DXGI_FORMAT_R16G16_UNORM)
|
||||||
|
} else {
|
||||||
|
(DXGI_FORMAT_R8_UNORM, DXGI_FORMAT_R8G8_UNORM)
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// The host couples 10-bit ⟺ HDR today; a mismatch means the shader's transfer/matrix assumption
|
||||||
|
/// is off for this stream (rendered anyway — approximate colour beats no picture).
|
||||||
|
fn warn_bitdepth_mismatch_once(ten_bit: bool, hdr: bool) {
|
||||||
|
use std::sync::atomic::{AtomicBool, Ordering};
|
||||||
|
static ONCE: AtomicBool = AtomicBool::new(true);
|
||||||
|
if ONCE.swap(false, Ordering::Relaxed) {
|
||||||
|
tracing::warn!(
|
||||||
|
ten_bit,
|
||||||
|
hdr,
|
||||||
|
"bit depth / HDR mismatch — colour may be approximate"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// A composition flip-model swapchain (no HWND) for binding to a XAML `SwapChainPanel`, with the
|
||||||
|
/// frame-latency waitable when the driver allows it. Returns the swapchain + the flags it was
|
||||||
|
/// created with (every `ResizeBuffers` must re-pass them).
|
||||||
fn create_composition_swapchain(
|
fn create_composition_swapchain(
|
||||||
device: &ID3D11Device,
|
device: &ID3D11Device,
|
||||||
width: u32,
|
width: u32,
|
||||||
height: u32,
|
height: u32,
|
||||||
) -> Result<IDXGISwapChain1> {
|
) -> Result<(IDXGISwapChain1, u32)> {
|
||||||
let dxdev: IDXGIDevice = device.cast().context("IDXGIDevice cast")?;
|
let dxdev: IDXGIDevice = device.cast().context("IDXGIDevice cast")?;
|
||||||
let factory: IDXGIFactory2 = unsafe {
|
let factory: IDXGIFactory2 = unsafe {
|
||||||
let adapter = dxdev.GetAdapter().context("GetAdapter")?;
|
let adapter = dxdev.GetAdapter().context("GetAdapter")?;
|
||||||
adapter.GetParent().context("GetParent (IDXGIFactory2)")?
|
adapter.GetParent().context("GetParent (IDXGIFactory2)")?
|
||||||
};
|
};
|
||||||
let desc = DXGI_SWAP_CHAIN_DESC1 {
|
let mut desc = DXGI_SWAP_CHAIN_DESC1 {
|
||||||
Width: width,
|
Width: width,
|
||||||
Height: height,
|
Height: height,
|
||||||
Format: DXGI_FORMAT_B8G8R8A8_UNORM,
|
Format: DXGI_FORMAT_B8G8R8A8_UNORM,
|
||||||
@@ -512,16 +685,24 @@ fn create_composition_swapchain(
|
|||||||
BufferCount: 2,
|
BufferCount: 2,
|
||||||
Scaling: DXGI_SCALING_STRETCH,
|
Scaling: DXGI_SCALING_STRETCH,
|
||||||
SwapEffect: DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL,
|
SwapEffect: DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL,
|
||||||
// IGNORE (opaque), not PREMULTIPLIED: the video fills the panel and the HDR `X2BGR10`
|
// IGNORE (opaque), not PREMULTIPLIED: the video fills the panel with opaque RGB either way.
|
||||||
// upload leaves the 2 padding/alpha bits 0 — premultiplied alpha would then make HDR frames
|
|
||||||
// transparent. Opaque is correct for a full-frame video surface either way.
|
|
||||||
AlphaMode: DXGI_ALPHA_MODE_IGNORE,
|
AlphaMode: DXGI_ALPHA_MODE_IGNORE,
|
||||||
Flags: 0,
|
Flags: DXGI_SWAP_CHAIN_FLAG_FRAME_LATENCY_WAITABLE_OBJECT.0 as u32,
|
||||||
};
|
};
|
||||||
unsafe {
|
unsafe {
|
||||||
factory
|
match factory.CreateSwapChainForComposition(device, &desc, None) {
|
||||||
|
Ok(sc) => Ok((sc, desc.Flags)),
|
||||||
|
Err(e) => {
|
||||||
|
// Odd driver/WARP combinations can reject the waitable — fall back to plain
|
||||||
|
// Present(1) pacing rather than failing the stream page.
|
||||||
|
tracing::warn!(error = %e, "waitable swapchain rejected — creating without");
|
||||||
|
desc.Flags = 0;
|
||||||
|
let sc = factory
|
||||||
.CreateSwapChainForComposition(device, &desc, None)
|
.CreateSwapChainForComposition(device, &desc, None)
|
||||||
.context("CreateSwapChainForComposition")
|
.context("CreateSwapChainForComposition")?;
|
||||||
|
Ok((sc, 0))
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -531,11 +712,9 @@ fn build_pipeline(
|
|||||||
ID3D11VertexShader,
|
ID3D11VertexShader,
|
||||||
ID3D11PixelShader,
|
ID3D11PixelShader,
|
||||||
ID3D11PixelShader,
|
ID3D11PixelShader,
|
||||||
ID3D11PixelShader,
|
|
||||||
ID3D11SamplerState,
|
ID3D11SamplerState,
|
||||||
)> {
|
)> {
|
||||||
let vs_blob = compile(SHADER_HLSL, "vs_main", "vs_5_0")?;
|
let vs_blob = compile(SHADER_HLSL, "vs_main", "vs_5_0")?;
|
||||||
let rgba_blob = compile(SHADER_HLSL, "ps_rgba", "ps_5_0")?;
|
|
||||||
let nv12_blob = compile(SHADER_HLSL, "ps_nv12", "ps_5_0")?;
|
let nv12_blob = compile(SHADER_HLSL, "ps_nv12", "ps_5_0")?;
|
||||||
let p010_blob = compile(SHADER_HLSL, "ps_p010", "ps_5_0")?;
|
let p010_blob = compile(SHADER_HLSL, "ps_p010", "ps_5_0")?;
|
||||||
unsafe {
|
unsafe {
|
||||||
@@ -543,10 +722,6 @@ fn build_pipeline(
|
|||||||
device
|
device
|
||||||
.CreateVertexShader(blob_bytes(&vs_blob), None, Some(&mut vs))
|
.CreateVertexShader(blob_bytes(&vs_blob), None, Some(&mut vs))
|
||||||
.context("CreateVertexShader")?;
|
.context("CreateVertexShader")?;
|
||||||
let mut ps_rgba = None;
|
|
||||||
device
|
|
||||||
.CreatePixelShader(blob_bytes(&rgba_blob), None, Some(&mut ps_rgba))
|
|
||||||
.context("CreatePixelShader (rgba)")?;
|
|
||||||
let mut ps_nv12 = None;
|
let mut ps_nv12 = None;
|
||||||
device
|
device
|
||||||
.CreatePixelShader(blob_bytes(&nv12_blob), None, Some(&mut ps_nv12))
|
.CreatePixelShader(blob_bytes(&nv12_blob), None, Some(&mut ps_nv12))
|
||||||
@@ -569,7 +744,6 @@ fn build_pipeline(
|
|||||||
.context("CreateSamplerState")?;
|
.context("CreateSamplerState")?;
|
||||||
Ok((
|
Ok((
|
||||||
vs.unwrap(),
|
vs.unwrap(),
|
||||||
ps_rgba.unwrap(),
|
|
||||||
ps_nv12.unwrap(),
|
ps_nv12.unwrap(),
|
||||||
ps_p010.unwrap(),
|
ps_p010.unwrap(),
|
||||||
sampler.unwrap(),
|
sampler.unwrap(),
|
||||||
|
|||||||
@@ -0,0 +1,204 @@
|
|||||||
|
//! The dedicated video render thread: decoded frames flow session pump → bounded channel → here →
|
||||||
|
//! `Presenter::present`. Presenting off the XAML thread means UI jank (layout, input, dialogs)
|
||||||
|
//! never stalls video, and a filled present queue never blocks the UI thread — the two failure
|
||||||
|
//! modes of the old present-from-`on_rendering` design.
|
||||||
|
//!
|
||||||
|
//! Pacing: block on the channel (the host paces the stream), then on the swapchain's
|
||||||
|
//! frame-latency waitable (≤1 queued present — see `present.rs`), then drain to the NEWEST frame
|
||||||
|
//! so a stream faster than the display drops backlog before any GPU work. The UI thread only
|
||||||
|
//! writes panel size/DPI into [`RenderShared`] atomics; the loop applies them before the next
|
||||||
|
//! draw (and redraws the held frame after a resize — fresh back buffers are blank).
|
||||||
|
|
||||||
|
use crate::present::Presenter;
|
||||||
|
use crate::session::FrameRx;
|
||||||
|
use crossbeam_channel::RecvTimeoutError;
|
||||||
|
use std::sync::atomic::{AtomicBool, AtomicU32, AtomicU64, Ordering};
|
||||||
|
use std::sync::Arc;
|
||||||
|
use std::time::{Duration, Instant};
|
||||||
|
|
||||||
|
/// UI-thread → render-thread state. Size is packed into ONE atomic (w<<32|h) so a resize never
|
||||||
|
/// tears into a (new-width, old-height) pair.
|
||||||
|
pub struct RenderShared {
|
||||||
|
size_px: AtomicU64,
|
||||||
|
dpi: AtomicU32,
|
||||||
|
stop: AtomicBool,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl RenderShared {
|
||||||
|
pub fn new(width: u32, height: u32, dpi: u32) -> Arc<RenderShared> {
|
||||||
|
Arc::new(RenderShared {
|
||||||
|
size_px: AtomicU64::new(pack(width, height)),
|
||||||
|
dpi: AtomicU32::new(dpi),
|
||||||
|
stop: AtomicBool::new(false),
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn set_size(&self, width: u32, height: u32) {
|
||||||
|
self.size_px.store(pack(width, height), Ordering::Relaxed);
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn set_dpi(&self, dpi: u32) {
|
||||||
|
self.dpi.store(dpi, Ordering::Relaxed);
|
||||||
|
}
|
||||||
|
|
||||||
|
fn snapshot(&self) -> (u32, u32, u32) {
|
||||||
|
let s = self.size_px.load(Ordering::Relaxed);
|
||||||
|
((s >> 32) as u32, s as u32, self.dpi.load(Ordering::Relaxed))
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn pack(w: u32, h: u32) -> u64 {
|
||||||
|
((w as u64) << 32) | h as u64
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Handle owned by the stream page; stops + joins the thread on unmount (and on drop, so a
|
||||||
|
/// navigation away can't leak a presenting thread).
|
||||||
|
pub struct RenderThread {
|
||||||
|
shared: Arc<RenderShared>,
|
||||||
|
join: Option<std::thread::JoinHandle<()>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl RenderThread {
|
||||||
|
pub fn shared(&self) -> &Arc<RenderShared> {
|
||||||
|
&self.shared
|
||||||
|
}
|
||||||
|
|
||||||
|
pub fn stop_and_join(&mut self) {
|
||||||
|
self.shared.stop.store(true, Ordering::SeqCst);
|
||||||
|
if let Some(j) = self.join.take() {
|
||||||
|
let _ = j.join();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Drop for RenderThread {
|
||||||
|
fn drop(&mut self) {
|
||||||
|
self.stop_and_join();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Moves the presenter (COM interfaces, `!Send` by default) onto the render thread. Sound here:
|
||||||
|
/// the shared device + immediate context are multithread-protected (see `crate::gpu`), D3D/DXGI
|
||||||
|
/// objects are apartment-agile, and after this one handoff the swapchain/RTV/context calls happen
|
||||||
|
/// only on the render thread — the same single-owner discipline as `SharedDevice`.
|
||||||
|
struct SendPresenter(Presenter);
|
||||||
|
unsafe impl Send for SendPresenter {}
|
||||||
|
|
||||||
|
/// Spawn the render thread. `frames` carries `(frame, capture pts_ns)`; `clock_offset_ns` maps our
|
||||||
|
/// wall clock onto the host's so the logged present latency is end-to-end (same math as the pump).
|
||||||
|
pub fn spawn(
|
||||||
|
presenter: Presenter,
|
||||||
|
frames: FrameRx,
|
||||||
|
shared: Arc<RenderShared>,
|
||||||
|
clock_offset_ns: i64,
|
||||||
|
) -> RenderThread {
|
||||||
|
let boxed = SendPresenter(presenter);
|
||||||
|
let shared_w = shared.clone();
|
||||||
|
let join = std::thread::Builder::new()
|
||||||
|
.name("pf-render".into())
|
||||||
|
.spawn(move || run(boxed, frames, shared_w, clock_offset_ns))
|
||||||
|
.expect("spawn render thread");
|
||||||
|
RenderThread {
|
||||||
|
shared,
|
||||||
|
join: Some(join),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn now_ns() -> u64 {
|
||||||
|
std::time::SystemTime::now()
|
||||||
|
.duration_since(std::time::UNIX_EPOCH)
|
||||||
|
.map(|d| d.as_nanos() as u64)
|
||||||
|
.unwrap_or(0)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// The window DPI, polled ~1 Hz as belt-and-braces for a monitor move that changes DPI without a
|
||||||
|
/// `SizeChanged` (same DIP size on both screens). `None` when the window isn't up (headless).
|
||||||
|
fn poll_window_dpi() -> Option<u32> {
|
||||||
|
use windows::Win32::UI::HiDpi::GetDpiForWindow;
|
||||||
|
use windows::Win32::UI::WindowsAndMessaging::FindWindowW;
|
||||||
|
unsafe {
|
||||||
|
let hwnd = FindWindowW(None, windows::core::w!("Punktfunk")).ok()?;
|
||||||
|
match GetDpiForWindow(hwnd) {
|
||||||
|
0 => None,
|
||||||
|
d => Some(d),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn run(presenter: SendPresenter, frames: FrameRx, shared: Arc<RenderShared>, clock_offset_ns: i64) {
|
||||||
|
let mut p = presenter.0;
|
||||||
|
let mut applied = (0u32, 0u32, 0u32); // last (w, h, dpi) handed to the presenter
|
||||||
|
let mut presented = 0u32;
|
||||||
|
let mut dropped = 0u32;
|
||||||
|
let mut lat_us: Vec<u64> = Vec::with_capacity(256);
|
||||||
|
let mut window_start = Instant::now();
|
||||||
|
let mut last_dpi_poll = Instant::now();
|
||||||
|
|
||||||
|
loop {
|
||||||
|
if shared.stop.load(Ordering::SeqCst) {
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
let first = match frames.recv_timeout(Duration::from_millis(50)) {
|
||||||
|
Ok(f) => Some(f),
|
||||||
|
Err(RecvTimeoutError::Timeout) => None,
|
||||||
|
Err(RecvTimeoutError::Disconnected) => break,
|
||||||
|
};
|
||||||
|
|
||||||
|
if last_dpi_poll.elapsed() >= Duration::from_secs(1) {
|
||||||
|
last_dpi_poll = Instant::now();
|
||||||
|
if let Some(dpi) = poll_window_dpi() {
|
||||||
|
shared.set_dpi(dpi);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
let snap = shared.snapshot();
|
||||||
|
let resized = snap != applied && snap.0 > 0 && snap.1 > 0;
|
||||||
|
if resized {
|
||||||
|
p.resize(snap.0, snap.1, snap.2);
|
||||||
|
applied = snap;
|
||||||
|
}
|
||||||
|
if first.is_none() && !resized {
|
||||||
|
continue; // nothing new to show — don't burn GPU re-presenting a static frame
|
||||||
|
}
|
||||||
|
|
||||||
|
// Throttle to the compositor: with ≤1 present outstanding this returns as DWM frees a
|
||||||
|
// slot, and frames decoded meanwhile are drained below so the newest is what's drawn.
|
||||||
|
if !p.wait_present_slot(1000) {
|
||||||
|
tracing::debug!("frame-latency waitable timed out — presenting anyway");
|
||||||
|
}
|
||||||
|
let mut newest = first;
|
||||||
|
while let Ok(f) = frames.try_recv() {
|
||||||
|
if newest.is_some() {
|
||||||
|
dropped += 1;
|
||||||
|
}
|
||||||
|
newest = Some(f);
|
||||||
|
}
|
||||||
|
|
||||||
|
// The session pump is the sole 0xCE consumer and stashes the latest metadata in `LATEST_HDR_META` (updates are rare).
|
||||||
|
if let Some(meta) = *crate::present::LATEST_HDR_META.lock().unwrap() {
|
||||||
|
p.set_hdr_metadata(meta);
|
||||||
|
}
|
||||||
|
|
||||||
|
let pts_ns = newest.as_ref().map(|(_, pts)| *pts);
|
||||||
|
p.present(newest.map(|(f, _)| f));
|
||||||
|
presented += 1;
|
||||||
|
if let Some(pts) = pts_ns {
|
||||||
|
// Capture→presented, host-clock corrected — the glass-side companion to the pump's
|
||||||
|
// capture→decoded p50.
|
||||||
|
let lat = (now_ns() as i128 + clock_offset_ns as i128 - pts as i128).max(0) as u64;
|
||||||
|
if lat > 0 && lat < 10_000_000_000 {
|
||||||
|
lat_us.push(lat / 1000);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if window_start.elapsed() >= Duration::from_secs(1) {
|
||||||
|
lat_us.sort_unstable();
|
||||||
|
let p50 = lat_us.get(lat_us.len() / 2).copied().unwrap_or(0);
|
||||||
|
tracing::debug!(presented, dropped, present_p50_us = p50, "render window");
|
||||||
|
window_start = Instant::now();
|
||||||
|
presented = 0;
|
||||||
|
dropped = 0;
|
||||||
|
lat_us.clear();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
tracing::info!("render thread exiting");
|
||||||
|
}
|
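The capture-to-presented bookkeeping in `run` can be read in isolation. The following is a minimal sketch using only the standard library; `present_latency_us` and `p50` are hypothetical helper names for illustration and do not exist in this change. It mirrors the arithmetic above: the signed clock offset is applied in `i128` so nothing wraps, implausible samples (non-positive, or 10 s and over) are discarded, and the median is the upper-middle element of the sorted window.

```rust
/// Hypothetical helper: capture-to-present latency in microseconds, or `None`
/// when the sample is implausible (non-positive, or 10 seconds and above).
fn present_latency_us(now_ns: u64, clock_offset_ns: i64, pts_ns: u64) -> Option<u64> {
    // Widen to i128 so adding a negative offset or subtracting a later pts cannot wrap.
    let lat = now_ns as i128 + clock_offset_ns as i128 - pts_ns as i128;
    if lat > 0 && lat < 10_000_000_000 {
        Some((lat / 1000) as u64)
    } else {
        None
    }
}

/// Hypothetical helper: median of one logging window (upper-middle element for
/// an even count), or 0 when the window is empty.
fn p50(samples: &mut [u64]) -> u64 {
    samples.sort_unstable();
    samples.get(samples.len() / 2).copied().unwrap_or(0)
}
```

Dropping out-of-range samples rather than clamping them keeps a bad clock offset from dragging the logged median toward zero.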
||||||
@@ -74,9 +74,13 @@ pub enum SessionEvent {
|
|||||||
Stats(Stats),
|
Stats(Stats),
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// Decoded frames + their host-capture `pts_ns`, session pump → render thread (crossbeam so that
|
||||||
|
/// thread can block with a timeout — async-channel has no `recv_timeout`).
|
||||||
|
pub type FrameRx = crossbeam_channel::Receiver<(DecodedFrame, u64)>;
|
||||||
|
|
||||||
pub struct SessionHandle {
|
pub struct SessionHandle {
|
||||||
pub events: async_channel::Receiver<SessionEvent>,
|
pub events: async_channel::Receiver<SessionEvent>,
|
||||||
pub frames: async_channel::Receiver<DecodedFrame>,
|
pub frames: FrameRx,
|
||||||
pub stop: Arc<AtomicBool>,
|
pub stop: Arc<AtomicBool>,
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -131,13 +135,15 @@ pub fn run_speed_probe(
|
|||||||
|
|
||||||
pub fn start(params: SessionParams) -> SessionHandle {
|
pub fn start(params: SessionParams) -> SessionHandle {
|
||||||
let (ev_tx, ev_rx) = async_channel::unbounded();
|
let (ev_tx, ev_rx) = async_channel::unbounded();
|
||||||
// Tiny frame queue, newest wins: force_send displaces the oldest when the UI lags.
|
// Tiny frame queue, newest wins: the pump displaces the oldest when the renderer lags (it
|
||||||
let (frame_tx, frame_rx) = async_channel::bounded(2);
|
// keeps a Receiver clone for exactly that).
|
||||||
|
let (frame_tx, frame_rx) = crossbeam_channel::bounded(2);
|
||||||
let stop = Arc::new(AtomicBool::new(false));
|
let stop = Arc::new(AtomicBool::new(false));
|
||||||
let stop_w = stop.clone();
|
let stop_w = stop.clone();
|
||||||
|
let frame_rx_pump = frame_rx.clone();
|
||||||
std::thread::Builder::new()
|
std::thread::Builder::new()
|
||||||
.name("punktfunk-session".into())
|
.name("punktfunk-session".into())
|
||||||
.spawn(move || pump(params, ev_tx, frame_tx, stop_w))
|
.spawn(move || pump(params, ev_tx, frame_tx, frame_rx_pump, stop_w))
|
||||||
.expect("spawn session thread");
|
.expect("spawn session thread");
|
||||||
SessionHandle {
|
SessionHandle {
|
||||||
events: ev_rx,
|
events: ev_rx,
|
||||||
@@ -192,7 +198,8 @@ impl AudioDec {
|
|||||||
fn pump(
|
fn pump(
|
||||||
params: SessionParams,
|
params: SessionParams,
|
||||||
ev_tx: async_channel::Sender<SessionEvent>,
|
ev_tx: async_channel::Sender<SessionEvent>,
|
||||||
frame_tx: async_channel::Sender<DecodedFrame>,
|
frame_tx: crossbeam_channel::Sender<(DecodedFrame, u64)>,
|
||||||
|
frame_rx: FrameRx,
|
||||||
stop: Arc<AtomicBool>,
|
stop: Arc<AtomicBool>,
|
||||||
) {
|
) {
|
||||||
let connector = match NativeClient::connect(
|
let connector = match NativeClient::connect(
|
||||||
@@ -285,6 +292,11 @@ fn pump(
|
|||||||
})
|
})
|
||||||
.flatten();
|
.flatten();
|
||||||
|
|
||||||
|
// Force an immediate IDR (with in-band parameter sets) rather than waiting for the host's own
|
||||||
|
// first keyframe — under infinite GOP a late/missed IDR means the decoder sits on
|
||||||
|
// "PPS id out of range" (a black screen) until one arrives.
|
||||||
|
let _ = connector.request_keyframe();
|
||||||
|
|
||||||
let clock_offset = connector.clock_offset_ns;
|
let clock_offset = connector.clock_offset_ns;
|
||||||
let mut total_frames = 0u64;
|
let mut total_frames = 0u64;
|
||||||
let mut window_start = Instant::now();
|
let mut window_start = Instant::now();
|
||||||
@@ -304,7 +316,17 @@ fn pump(
|
|||||||
match connector.next_frame(Duration::from_millis(4)) {
|
match connector.next_frame(Duration::from_millis(4)) {
|
||||||
Ok(frame) => {
|
Ok(frame) => {
|
||||||
let t0 = Instant::now();
|
let t0 = Instant::now();
|
||||||
match decoder.decode(&frame.data) {
|
// A D3D11VA→software demotion (see `Decoder::decode`) starts a FRESH decoder that
|
||||||
|
// has none of the stream's parameter sets; under infinite GOP it would sit on
|
||||||
|
// "PPS id out of range" forever. Detect the transition and force a new IDR so the
|
||||||
|
// rebuilt decoder resynchronizes immediately.
|
||||||
|
let was_hw = decoder.is_hardware();
|
||||||
|
let decoded = decoder.decode(&frame.data);
|
||||||
|
if was_hw && !decoder.is_hardware() {
|
||||||
|
tracing::info!("decoder demoted to software — requesting keyframe to resync");
|
||||||
|
let _ = connector.request_keyframe();
|
||||||
|
}
|
||||||
|
match decoded {
|
||||||
Ok(Some(decoded)) => {
|
Ok(Some(decoded)) => {
|
||||||
total_frames += 1;
|
total_frames += 1;
|
||||||
hdr = decoded.hdr();
|
hdr = decoded.hdr();
|
||||||
@@ -330,7 +352,13 @@ fn pump(
|
|||||||
decode_us_sum += t0.elapsed().as_micros() as u64;
|
decode_us_sum += t0.elapsed().as_micros() as u64;
|
||||||
frames_n += 1;
|
frames_n += 1;
|
||||||
bytes_n += frame.data.len() as u64;
|
bytes_n += frame.data.len() as u64;
|
||||||
let _ = frame_tx.force_send(decoded);
|
// Newest wins: displace the oldest queued frame when the renderer lags.
|
||||||
|
if let Err(crossbeam_channel::TrySendError::Full(item)) =
|
||||||
|
frame_tx.try_send((decoded, frame.pts_ns))
|
||||||
|
{
|
||||||
|
let _ = frame_rx.try_recv();
|
||||||
|
let _ = frame_tx.try_send(item);
|
||||||
|
}
|
||||||
}
|
}
|
||||||
Ok(None) => {}
|
Ok(None) => {}
|
||||||
// Survivable (loss until the next IDR/RFI recovery) — keep feeding.
|
// Survivable (loss until the next IDR/RFI recovery) — keep feeding.
|
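The newest-wins policy shared by the pump (displace the oldest on a full queue) and the render thread (drain to the newest) can be sketched on its own. This is a simplified stand-in, not the code in this change: it swaps the bounded crossbeam channel and the pump's `Receiver` clone for a mutex-guarded `VecDeque` so that it runs on the standard library alone, and `NewestWins`, `push` and `take_newest` are hypothetical names.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical newest-wins queue: a std-only stand-in for the bounded
/// channel plus the producer-side Receiver clone.
struct NewestWins<T> {
    cap: usize,
    q: Mutex<VecDeque<T>>,
}

impl<T> NewestWins<T> {
    fn new(cap: usize) -> Self {
        NewestWins { cap, q: Mutex::new(VecDeque::with_capacity(cap)) }
    }

    /// Producer side: enqueue `item`, dropping the OLDEST entry first when full.
    /// Returns true when an entry was displaced.
    fn push(&self, item: T) -> bool {
        let mut q = self.q.lock().unwrap();
        let displaced = q.len() >= self.cap;
        if displaced {
            q.pop_front();
        }
        q.push_back(item);
        displaced
    }

    /// Consumer side: keep only the newest queued entry and report how many
    /// older entries were skipped.
    fn take_newest(&self) -> (Option<T>, usize) {
        let mut q = self.q.lock().unwrap();
        let skipped = q.len().saturating_sub(1);
        let newest = q.pop_back();
        q.clear();
        (newest, skipped)
    }
}
```

In this sketch the displacement is atomic because it happens under one lock. The channel-based version performs `try_recv` and `try_send` as two separate steps, which is acceptable there because losing either frame still leaves a recent one queued.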
||||||
|
|||||||
@@ -2,15 +2,24 @@
|
|||||||
//!
|
//!
|
||||||
//! Two backends, picked at session start (override via [`DecoderPref`] / the Settings UI):
|
//! Two backends, picked at session start (override via [`DecoderPref`] / the Settings UI):
|
||||||
//!
|
//!
|
||||||
//! * **D3D11VA** (any GPU): libavcodec decodes on the GPU straight into `ID3D11Texture2D`s that
|
//! * **D3D11VA** (any GPU — the vendor-agnostic DXVA path on NVIDIA/AMD/Intel): libavcodec decodes
|
||||||
//! carry `D3D11_BIND_SHADER_RESOURCE`, so the presenter samples the decoded NV12/P010 surface
|
//! on the GPU into an `ID3D11Texture2D` decode array (decoder-only bind — NVIDIA rejects a
|
||||||
//! directly — **zero copy** (no swscale, no CPU readback, no per-frame upload). The textures are
|
//! decoder array that is also a shader resource). The presenter copies each decoded slice into
|
||||||
//! created by the process-wide shared device ([`crate::gpu`]) the presenter also draws with, which
|
//! its own sampleable NV12/P010 texture and converts YUV→RGB in a shader — one cheap GPU-to-GPU
|
||||||
//! is what makes them bindable there. This is the big latency/throughput win over software decode.
|
//! copy per frame (no swscale, no CPU readback). The decode array is created by the process-wide
|
||||||
//! * **Software**: libavcodec on the CPU + swscale to a packed 4-byte format the presenter uploads
|
//! shared device ([`crate::gpu`]) the presenter also draws with, so the copy stays on-GPU. This
|
||||||
//! (`RGBA` for SDR, `X2BGR10` for HDR). The fallback on a GPU-less box (WARP), when D3D11VA init
|
//! is the big latency/throughput win over software.
|
||||||
//! fails, or when a mid-session hardware error demotes us — the host's IDR/RFI recovery
|
//! * **Software**: libavcodec on the CPU + swscale to the same planar layout the hardware path
|
||||||
//! resynchronizes on the next keyframe either way.
|
//! produces (NV12, or P010 for 10-bit) — the presenter uploads the two planes and runs the SAME
|
||||||
|
//! YUV→RGB shaders, so hw/sw colour math is identical. The fallback on a GPU-less box (WARP),
|
||||||
|
//! when D3D11VA init fails, or when a mid-session hardware error demotes us — the host's
|
||||||
|
//! IDR/RFI recovery resynchronizes on the next keyframe either way.
|
||||||
|
//!
|
||||||
|
//! D3D11VA viability is settled **before the session's first frame** by two probes: the adapter
|
||||||
|
//! must expose the negotiated codec's DXVA decode profile ([`decode_profile_supported`] — hwaccel
|
||||||
|
//! init otherwise only fails at the first AU, burning the IDR), and it must be able to create the
|
||||||
|
//! decode surface pool ([`d3d11va_decode_supported`]). Either failing commits to software decode
|
||||||
|
//! from frame one (a clean, gap-free stream) instead of dying mid-stream.
|
||||||
//!
|
//!
|
||||||
//! Both run `AV_CODEC_FLAG_LOW_DELAY`; the host encodes zero-reorder streams (no B-frames, in-band
|
//! Both run `AV_CODEC_FLAG_LOW_DELAY`; the host encodes zero-reorder streams (no B-frames, in-band
|
||||||
//! parameter sets on every IDR), so decode is strictly one-in/one-out.
|
//! parameter sets on every IDR), so decode is strictly one-in/one-out.
|
||||||
@@ -25,7 +34,9 @@ use ffmpeg::util::frame::Video as AvFrame;
|
|||||||
use ffmpeg_next as ffmpeg;
|
use ffmpeg_next as ffmpeg;
|
||||||
use std::ffi::c_void;
|
use std::ffi::c_void;
|
||||||
use std::ptr;
|
use std::ptr;
|
||||||
use windows::core::Interface; // ID3D11Device::clone().into_raw() for the FFmpeg hwdevice ctx
|
use windows::core::{Interface, GUID};
|
||||||
|
use windows::Win32::Graphics::Direct3D11::{ID3D11Device, ID3D11VideoDevice};
|
||||||
|
use windows::Win32::Graphics::Dxgi::Common::{DXGI_FORMAT, DXGI_FORMAT_NV12, DXGI_FORMAT_P010};
|
||||||
|
|
||||||
/// Which decode backend to use; the Settings UI persists this as a string.
|
/// Which decode backend to use; the Settings UI persists this as a string.
|
||||||
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
|
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
|
||||||
@@ -69,21 +80,27 @@ impl DecodedFrame {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Packed 4-byte-per-pixel frame for a D3D11 dynamic-texture upload (which takes a row pitch). The
|
/// A software-decoded frame in the same planar layout the hardware path produces: an NV12 (or
|
||||||
/// bytes are `R8G8B8A8` for SDR and `X2BGR10` (== DXGI `R10G10B10A2`, R in the low 10 bits) for HDR.
|
/// P010 for 10-bit) luma plane + interleaved chroma plane, each with its swscale row stride
|
||||||
|
/// (≥ the row bytes — swscale pads rows for SIMD). The presenter uploads them into two dynamic
|
||||||
|
/// plane textures sampled by the same shaders as the D3D11VA path.
|
||||||
pub struct CpuFrame {
|
pub struct CpuFrame {
|
||||||
pub width: u32,
|
pub width: u32,
|
||||||
pub height: u32,
|
pub height: u32,
|
||||||
/// Row stride in bytes (≥ width*4 — swscale pads rows for SIMD).
|
/// Luma plane (`W×H` samples, 1 byte each; 2 for 10-bit) + its row stride in bytes.
|
||||||
pub stride: usize,
|
pub y: Vec<u8>,
|
||||||
pub pixels: Vec<u8>,
|
pub y_stride: usize,
|
||||||
/// BT.2020 PQ HDR10 frame: `pixels` is `X2BGR10` and the presenter switches to a 10-bit
|
/// Interleaved chroma plane (`⌈W/2⌉×⌈H/2⌉` UV pairs) + its row stride in bytes.
|
||||||
/// R10G10B10A2 + ST.2084 swapchain. `false` = ordinary 8-bit BT.709 SDR.
|
pub uv: Vec<u8>,
|
||||||
|
pub uv_stride: usize,
|
||||||
|
/// P010 sample layout (10 bits in the high bits of 16) vs NV12. Selects texture/SRV formats.
|
||||||
|
pub ten_bit: bool,
|
||||||
|
/// BT.2020 PQ HDR10 vs ordinary BT.709 SDR. Selects shader + swapchain colour space.
|
||||||
pub hdr: bool,
|
pub hdr: bool,
|
||||||
}
|
}
|
||||||
|
|
||||||
/// A decoded frame still on the GPU: a D3D11 texture **array** plus the slice index the decoder
|
/// A decoded frame still on the GPU: a D3D11 texture **array** plus the slice index the decoder
|
||||||
/// wrote this frame into. The presenter creates per-plane shader-resource views over the slice and
|
/// wrote this frame into. The presenter copies the slice into its own sampleable texture and
|
||||||
/// converts YUV→RGB in a pixel shader. The underlying surface stays alive — and out of the decoder's
|
/// converts YUV→RGB in a pixel shader. The underlying surface stays alive — and out of the decoder's
|
||||||
/// reuse pool — for exactly as long as `guard` (an `av_frame_clone` of the decoded frame) lives.
|
/// reuse pool — for exactly as long as `guard` (an `av_frame_clone` of the decoded frame) lives.
|
||||||
pub struct GpuFrame {
|
pub struct GpuFrame {
|
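The plane layout described in the `CpuFrame` docs above (a `W×H` luma plane plus `⌈W/2⌉×⌈H/2⌉` interleaved UV pairs, each with a row stride that may exceed the row bytes) can be illustrated with a small sketch. `plane_layout` and `copy_plane` are hypothetical helpers, not functions from this change; the real upload goes through mapped D3D11 textures, whereas this only shows the size arithmetic and a stride-aware row copy on plain byte slices.

```rust
/// Hypothetical helper: tightly packed plane geometry for an NV12 (8-bit) or
/// P010 (10-in-16-bit) frame, as
/// (luma_row_bytes, luma_rows, chroma_row_bytes, chroma_rows).
fn plane_layout(width: u32, height: u32, ten_bit: bool) -> (usize, usize, usize, usize) {
    let bytes_per_sample = if ten_bit { 2 } else { 1 };
    let (w, h) = (width as usize, height as usize);
    let chroma_w = (w + 1) / 2; // ceil(W/2) UV pairs per chroma row
    let chroma_h = (h + 1) / 2; // ceil(H/2) chroma rows
    // Each UV pair is two samples (U then V), interleaved in one plane.
    (w * bytes_per_sample, h, chroma_w * 2 * bytes_per_sample, chroma_h)
}

/// Hypothetical helper: copy `rows` rows of `row_bytes` each between two
/// buffers whose strides may differ and may be padded beyond `row_bytes`.
fn copy_plane(
    src: &[u8],
    src_stride: usize,
    dst: &mut [u8],
    dst_stride: usize,
    row_bytes: usize,
    rows: usize,
) {
    for y in 0..rows {
        let s = &src[y * src_stride..y * src_stride + row_bytes];
        dst[y * dst_stride..y * dst_stride + row_bytes].copy_from_slice(s);
    }
}
```

Copying exactly `row_bytes` per row, rather than a whole stride, leaves padding in the source and destination untouched when the two pitches differ.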
||||||
@@ -91,16 +108,20 @@ pub struct GpuFrame {
|
|||||||
pub height: u32,
|
pub height: u32,
|
||||||
/// Texture-array slice this frame occupies (`AVFrame::data[1]`).
|
/// Texture-array slice this frame occupies (`AVFrame::data[1]`).
|
||||||
pub index: u32,
|
pub index: u32,
|
||||||
/// BT.2020 PQ HDR10 (P010, ST.2084) vs ordinary 8-bit BT.709 SDR (NV12). The present path keys
|
/// The decode pool is P010 (10 bits in the high bits) vs NV12 — from the frames context's
|
||||||
/// SRV format + shader off this (the host couples 10-bit ⟺ HDR).
|
/// `sw_format`. The presenter keys its copy-texture/SRV formats off this: they must match the
|
||||||
|
/// source array exactly for `CopySubresourceRegion`.
|
||||||
|
pub ten_bit: bool,
|
||||||
|
/// BT.2020 PQ HDR10 (ST.2084 transfer) vs ordinary BT.709 SDR. Selects shader + swapchain
|
||||||
|
/// colour space only (the host couples 10-bit ⟺ HDR today, but formats key off `ten_bit`).
|
||||||
pub hdr: bool,
|
pub hdr: bool,
|
||||||
guard: D3d11FrameGuard,
|
guard: D3d11FrameGuard,
|
||||||
}
|
}
|
||||||
|
|
||||||
impl GpuFrame {
|
impl GpuFrame {
|
||||||
/// The decoder's D3D11 texture array holding this frame's slice, borrowed from the live cloned
|
/// The decoder's D3D11 texture array holding this frame's slice, borrowed from the live cloned
|
||||||
/// `AVFrame`. Construct the windows-rs interface on the thread that will use it (the presenter /
|
/// `AVFrame`. Construct the windows-rs interface on the thread that will use it (the render
|
||||||
/// UI thread): COM interfaces are `!Send`, but the raw pointer is fine to carry across threads.
|
/// thread): COM interfaces are `!Send`, but the raw pointer is fine to carry across threads.
|
||||||
pub fn texture_ptr(&self) -> *mut c_void {
|
pub fn texture_ptr(&self) -> *mut c_void {
|
||||||
unsafe { (*self.guard.0).data[0] as *mut c_void }
|
unsafe { (*self.guard.0).data[0] as *mut c_void }
|
||||||
}
|
}
|
||||||
@@ -108,7 +129,7 @@ impl GpuFrame {
|
|||||||
|
|
||||||
/// Owns a cloned decoded `AVFrame` (which refs the D3D11 surface in the decoder pool). Dropping it
|
/// Owns a cloned decoded `AVFrame` (which refs the D3D11 surface in the decoder pool). Dropping it
|
||||||
/// releases the surface back for reuse. The clone is plain refcounted data; freeing it from the
|
/// releases the surface back for reuse. The clone is plain refcounted data; freeing it from the
|
||||||
/// presenter thread is fine.
|
/// render thread is fine.
|
||||||
pub struct D3d11FrameGuard(*mut ffmpeg::ffi::AVFrame);
|
pub struct D3d11FrameGuard(*mut ffmpeg::ffi::AVFrame);
|
||||||
unsafe impl Send for D3d11FrameGuard {}
|
unsafe impl Send for D3d11FrameGuard {}
|
||||||
impl Drop for D3d11FrameGuard {
|
impl Drop for D3d11FrameGuard {
|
||||||
@@ -139,6 +160,7 @@ pub fn ffmpeg_codec_id(wire: u8) -> ffmpeg::codec::Id {
|
|||||||
|
|
||||||
/// The `quic` codec bitfield this client can decode — whatever FFmpeg has a decoder for (HEVC/H.264
|
/// The `quic` codec bitfield this client can decode — whatever FFmpeg has a decoder for (HEVC/H.264
|
||||||
/// always; AV1 when built in). Advertised to the host so it never emits a codec we can't decode.
|
/// always; AV1 when built in). Advertised to the host so it never emits a codec we can't decode.
|
||||||
|
/// Deliberately NOT gated on the DXVA profiles: software decode covers anything FFmpeg can.
|
||||||
pub fn decodable_codecs() -> u8 {
|
pub fn decodable_codecs() -> u8 {
|
||||||
let _ = ffmpeg::init();
|
let _ = ffmpeg::init();
|
||||||
let mut bits = 0u8;
|
let mut bits = 0u8;
|
||||||
@@ -160,7 +182,7 @@ impl Decoder {
|
|||||||
if pref != DecoderPref::Software {
|
if pref != DecoderPref::Software {
|
||||||
match D3d11vaDecoder::new(codec_id) {
|
match D3d11vaDecoder::new(codec_id) {
|
||||||
Ok(d) => {
|
Ok(d) => {
|
||||||
tracing::info!(?codec_id, "D3D11VA hardware decode active (zero-copy)");
|
tracing::info!(?codec_id, "D3D11VA hardware decode active");
|
||||||
return Ok(Decoder {
|
return Ok(Decoder {
|
||||||
backend: Backend::D3d11va(d),
|
backend: Backend::D3d11va(d),
|
||||||
codec_id,
|
codec_id,
|
||||||
@@ -180,7 +202,7 @@ impl Decoder {
|
|||||||
})
|
})
|
||||||
}
|
}
|
||||||
|
|
||||||
/// True for the zero-copy hardware backend (shown in the stream HUD).
|
/// True for the GPU hardware backend (shown in the stream HUD).
|
||||||
pub fn is_hardware(&self) -> bool {
|
pub fn is_hardware(&self) -> bool {
|
||||||
matches!(self.backend, Backend::D3d11va(_))
|
matches!(self.backend, Backend::D3d11va(_))
|
||||||
}
|
}
|
||||||
@@ -203,12 +225,73 @@ impl Decoder {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// --- DXVA decode-profile probe --------------------------------------------------------
|
||||||
|
|
||||||
|
/// DXVA decode-profile GUIDs (`dxva.h`), defined locally so no extra windows-rs feature or
|
||||||
|
/// metadata surface is pulled in for four constants.
|
||||||
|
const PROFILE_H264_VLD_NOFGT: GUID = GUID::from_u128(0x1b81be68_a0c7_11d3_b984_00c04f2e73c5);
|
||||||
|
const PROFILE_HEVC_VLD_MAIN: GUID = GUID::from_u128(0x5b11d51b_2f4c_4452_bcc3_09f2a1160cc0);
|
||||||
|
const PROFILE_HEVC_VLD_MAIN10: GUID = GUID::from_u128(0x107af0e0_ef1a_4d19_aba8_67a163073d13);
|
||||||
|
const PROFILE_AV1_VLD_PROFILE0: GUID = GUID::from_u128(0xb8be4ccb_cf53_46ba_8d59_d6b8a6da5d2a);
|
||||||
|
|
||||||
|
/// Checks that the shared device's adapter exposes a DXVA decode profile for `codec_id` (`Err` if not). Checked before
|
||||||
|
/// building the FFmpeg hwdevice because hwaccel selection (`get_format`) only runs on the FIRST
|
||||||
|
/// access unit — an unsupported profile would otherwise burn the opening IDR and recover through
|
||||||
|
/// the mid-stream demotion path instead of committing to software up front. Also logs (once) the
|
||||||
|
/// adapter's full profile list plus Main10 availability — the forensics for a new GPU/driver.
|
||||||
|
fn decode_profile_supported(device: &ID3D11Device, codec_id: ffmpeg::codec::Id) -> Result<()> {
|
||||||
|
let video: ID3D11VideoDevice = device
|
||||||
|
.cast()
|
||||||
|
.context("device lacks ID3D11VideoDevice (created without VIDEO_SUPPORT)")?;
|
||||||
|
let profiles: Vec<GUID> = unsafe {
|
||||||
|
let n = video.GetVideoDecoderProfileCount();
|
||||||
|
(0..n)
|
||||||
|
.filter_map(|i| video.GetVideoDecoderProfile(i).ok())
|
||||||
|
.collect()
|
||||||
|
};
|
||||||
|
log_profiles_once(&profiles);
|
||||||
|
|
||||||
|
let (wanted, format, name): (GUID, DXGI_FORMAT, &str) = match codec_id {
|
||||||
|
ffmpeg::codec::Id::H264 => (PROFILE_H264_VLD_NOFGT, DXGI_FORMAT_NV12, "H.264 VLD NoFGT"),
|
||||||
|
ffmpeg::codec::Id::HEVC => (PROFILE_HEVC_VLD_MAIN, DXGI_FORMAT_NV12, "HEVC Main"),
|
||||||
|
ffmpeg::codec::Id::AV1 => (PROFILE_AV1_VLD_PROFILE0, DXGI_FORMAT_NV12, "AV1 Profile 0"),
|
||||||
|
other => bail!("no DXVA profile known for {other:?}"),
|
||||||
|
};
|
||||||
|
let ok = profiles.contains(&wanted)
|
||||||
|
&& unsafe { video.CheckVideoDecoderFormat(&wanted, format) }
|
||||||
|
.map(|b| b.as_bool())
|
||||||
|
.unwrap_or(false);
|
||||||
|
if !ok {
|
||||||
|
bail!("adapter exposes no {name} decode profile");
|
||||||
|
}
|
||||||
|
// 10-bit (a mid-session HDR upgrade needs Main10): informational — if it's missing the
|
||||||
|
// decode error → software demotion + keyframe re-request path covers the switch.
|
||||||
|
if codec_id == ffmpeg::codec::Id::HEVC {
|
||||||
|
let main10 = profiles.contains(&PROFILE_HEVC_VLD_MAIN10)
|
||||||
|
&& unsafe { video.CheckVideoDecoderFormat(&PROFILE_HEVC_VLD_MAIN10, DXGI_FORMAT_P010) }
|
||||||
|
.map(|b| b.as_bool())
|
||||||
|
.unwrap_or(false);
|
||||||
|
tracing::info!(main10, "HEVC Main10 (10-bit/HDR) decode profile");
|
||||||
|
}
|
||||||
|
Ok(())
|
||||||
|
}
|
||||||
|
|
||||||
|
/// One-time dump of the adapter's DXVA decode profiles.
|
||||||
|
fn log_profiles_once(profiles: &[GUID]) {
|
||||||
|
use std::sync::atomic::{AtomicBool, Ordering};
|
||||||
|
static ONCE: AtomicBool = AtomicBool::new(true);
|
||||||
|
if ONCE.swap(false, Ordering::Relaxed) {
|
||||||
|
let list: Vec<String> = profiles.iter().map(|g| format!("{g:?}")).collect();
|
||||||
|
tracing::info!(count = profiles.len(), profiles = ?list, "adapter DXVA decode profiles");
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
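The one-time logging in `log_profiles_once` relies on an atomic swap, so only the first caller sees `true`. A minimal sketch of that pattern in isolation follows; `claim_first` is a hypothetical name used for illustration and is not part of this change.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Returns `true` for exactly one caller: the one that flips the flag from
/// `true` to `false`. Every later call observes `false`.
/// Hypothetical helper, shown only to illustrate the swap-based "once" guard.
fn claim_first(flag: &AtomicBool) -> bool {
    // `swap` stores `false` and returns the previous value in one atomic step,
    // so two racing threads cannot both read `true`.
    flag.swap(false, Ordering::Relaxed)
}
```

`Relaxed` ordering should be enough here because the flag guards nothing but the log line itself; a guard that publishes data to other threads would need stronger ordering.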
// --- software backend ---------------------------------------------------------------
|
// --- software backend ---------------------------------------------------------------
|
||||||
|
|
||||||
struct SoftwareDecoder {
|
struct SoftwareDecoder {
|
||||||
decoder: ffmpeg::decoder::Video,
|
decoder: ffmpeg::decoder::Video,
|
||||||
/// Rebuilt whenever the decoded format/size **or output format** changes (mid-stream
|
/// Rebuilt whenever the decoded format/size **or output format** changes (mid-stream
|
||||||
/// `Reconfigure`, or an SDR↔HDR flip): `(ctx, src_fmt, w, h, dst_fmt)`.
|
/// `Reconfigure`, or an 8↔10-bit flip): `(ctx, src_fmt, w, h, dst_fmt)`.
|
||||||
sws: Option<(scaling::Context, Pixel, u32, u32, Pixel)>,
|
sws: Option<(scaling::Context, Pixel, u32, u32, Pixel)>,
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -241,36 +324,24 @@ impl SoftwareDecoder {
|
|||||||
Ok(out)
|
Ok(out)
|
||||||
}
|
}
|
||||||
|
|
||||||
/// Convert the decoded YUV frame to a packed 4-byte format the presenter uploads directly:
|
/// Convert the decoded planar YUV to the hardware path's layout: NV12 for 8-bit, P010 for
|
||||||
/// SDR → `RGBA` (BT.709), HDR (SMPTE ST.2084 / PQ transfer) → `X2BGR10` (== DXGI R10G10B10A2)
|
/// 10-bit — a chroma interleave (and 10→16-high-bits shift), NOT a colour conversion. The
|
||||||
/// using the BT.2020 matrix. For HDR the PQ-encoded values pass through unchanged (swscale only
|
/// matrix/range/transfer handling all lives in the presenter's shaders, shared with the
|
||||||
/// applies the YUV→RGB matrix + range, never the transfer) — exactly what an HDR10 swapchain wants.
|
/// D3D11VA path, so software frames are bit-comparable with hardware ones.
|
||||||
fn convert(&mut self, frame: &AvFrame) -> Result<CpuFrame> {
|
fn convert(&mut self, frame: &AvFrame) -> Result<CpuFrame> {
|
||||||
use ffmpeg::color::TransferCharacteristic;
|
use ffmpeg::color::TransferCharacteristic;
|
||||||
let (fmt, w, h) = (frame.format(), frame.width(), frame.height());
|
let (fmt, w, h) = (frame.format(), frame.width(), frame.height());
|
||||||
let hdr = frame.color_transfer_characteristic() == TransferCharacteristic::SMPTE2084;
|
let hdr = frame.color_transfer_characteristic() == TransferCharacteristic::SMPTE2084;
|
||||||
let dst = if hdr { Pixel::X2BGR10LE } else { Pixel::RGBA };
|
// Source bit depth from the pix-fmt descriptor (stable FFmpeg public API).
|
||||||
|
let ten_bit = unsafe {
|
||||||
|
let desc = ffmpeg::ffi::av_pix_fmt_desc_get(fmt.into());
|
||||||
|
!desc.is_null() && (*desc).comp[0].depth > 8
|
||||||
|
};
|
||||||
|
let dst = if ten_bit { Pixel::P010LE } else { Pixel::NV12 };
|
||||||
let rebuild = !matches!(&self.sws, Some((_, f, sw, sh, d)) if *f == fmt && *sw == w && *sh == h && *d == dst);
|
let rebuild = !matches!(&self.sws, Some((_, f, sw, sh, d)) if *f == fmt && *sw == w && *sh == h && *d == dst);
|
||||||
if rebuild {
|
if rebuild {
|
||||||
let mut ctx = scaling::Context::get(fmt, w, h, dst, w, h, scaling::Flags::POINT)
|
let ctx = scaling::Context::get(fmt, w, h, dst, w, h, scaling::Flags::POINT)
|
||||||
.context("swscale context")?;
|
.context("swscale context")?;
|
||||||
if hdr {
|
|
||||||
// BT.2020 non-constant-luminance YUV (limited range) → full-range RGB. swscale
|
|
||||||
// applies only the matrix + range here, so the samples stay PQ-encoded.
|
|
||||||
unsafe {
|
|
||||||
let coef = ffmpeg::ffi::sws_getCoefficients(ffmpeg::ffi::SWS_CS_BT2020);
|
|
||||||
ffmpeg::ffi::sws_setColorspaceDetails(
|
|
||||||
ctx.as_mut_ptr(),
|
|
||||||
coef,
|
|
||||||
0, // src range: limited (video)
|
|
||||||
coef,
|
|
||||||
1, // dst range: full
|
|
||||||
0,
|
|
||||||
1 << 16,
|
|
||||||
1 << 16, // brightness / contrast / saturation defaults (16.16)
|
|
||||||
);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
self.sws = Some((ctx, fmt, w, h, dst));
|
self.sws = Some((ctx, fmt, w, h, dst));
|
||||||
}
|
}
|
||||||
let (sws, ..) = self.sws.as_mut().unwrap();
|
let (sws, ..) = self.sws.as_mut().unwrap();
|
||||||
@@ -279,8 +350,11 @@ impl SoftwareDecoder {
|
|||||||
Ok(CpuFrame {
|
Ok(CpuFrame {
|
||||||
width: w,
|
width: w,
|
||||||
height: h,
|
height: h,
|
||||||
stride: conv.stride(0),
|
y: conv.data(0).to_vec(),
|
||||||
pixels: conv.data(0).to_vec(),
|
y_stride: conv.stride(0),
|
||||||
|
uv: conv.data(1).to_vec(),
|
||||||
|
uv_stride: conv.stride(1),
|
||||||
|
ten_bit,
|
||||||
hdr,
|
hdr,
|
||||||
})
|
})
|
||||||
}
|
}
|
||||||
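As a rough illustration of the two-plane layout `convert` now produces, the sketch below computes plane sizes for a tightly packed NV12 or P010 image. `plane_sizes` is a hypothetical helper, and it assumes no row padding; the real `CpuFrame` carries `y_stride` / `uv_stride`, which may exceed the packed row width.

```rust
/// Byte sizes `(luma, chroma)` of a tightly packed NV12 (8-bit) or P010 (10-bit) image.
/// Hypothetical helper for illustration; assumes stride == packed row width.
fn plane_sizes(width: usize, height: usize, ten_bit: bool) -> (usize, usize) {
    // NV12 stores one byte per sample; P010 stores each sample in a 16-bit word.
    let bytes_per_sample = if ten_bit { 2 } else { 1 };
    let luma = width * height * bytes_per_sample;
    // Chroma is subsampled 2x horizontally and vertically, and U and V are
    // interleaved in the same row, so a chroma row holds 2 * ceil(width / 2) samples.
    let chroma_row = 2 * ((width + 1) / 2) * bytes_per_sample;
    let chroma = chroma_row * ((height + 1) / 2);
    (luma, chroma)
}
```

For a 1920x1080 frame this gives a chroma plane half the size of the luma plane in both formats, which is why one pair of plane textures can serve both the 8-bit and the 10-bit path.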
@@ -295,11 +369,16 @@ impl SoftwareDecoder {
|
|||||||
// decoded surfaces transfer out through D3d11FrameGuard.
|
// decoded surfaces transfer out through D3d11FrameGuard.
|
||||||
|
|
||||||
const AVERROR_EAGAIN: i32 = -11; // -EAGAIN
|
const AVERROR_EAGAIN: i32 = -11; // -EAGAIN
|
||||||
const D3D11_BIND_SHADER_RESOURCE: u32 = 0x8; // <d3d11.h>; FFmpeg ORs D3D11_BIND_DECODER itself
|
|
||||||
|
/// D3D11VA decode surface pool depth: the zero-reorder DPB (1–2 refs) + the bounded decoded channel
|
||||||
|
/// (2) + the frame the presenter currently holds (until its copy flushes) + one in-flight decode —
|
||||||
|
/// 12 is comfortable. A GPU that can't create the pool at all is gated out by
|
||||||
|
/// `d3d11va_decode_supported` and the session uses software decode.
|
||||||
|
const DECODE_POOL_SIZE: i32 = 12;
|
||||||
|
|
||||||
/// `hwcontext_d3d11va.h` — `AVHWDeviceContext::hwctx`. Leaving `lock` null makes FFmpeg install an
|
/// `hwcontext_d3d11va.h` — `AVHWDeviceContext::hwctx`. Leaving `lock` null makes FFmpeg install an
|
||||||
/// `ID3D11Multithread` default lock + set multithread protection on `device_context` during init,
|
/// `ID3D11Multithread` default lock + set multithread protection on `device_context` during init,
|
||||||
/// which is what lets the presenter share this device's immediate context from the UI thread.
|
/// which is what lets the presenter share this device's immediate context from the render thread.
|
||||||
#[repr(C)]
|
#[repr(C)]
|
||||||
struct AVD3D11VADeviceContext {
|
struct AVD3D11VADeviceContext {
|
||||||
device: *mut c_void, // ID3D11Device*
|
device: *mut c_void, // ID3D11Device*
|
||||||
@@ -311,70 +390,79 @@ struct AVD3D11VADeviceContext {
|
|||||||
lock_ctx: *mut c_void,
|
lock_ctx: *mut c_void,
|
||||||
}
|
}
|
||||||
|
|
||||||
/// `hwcontext_d3d11va.h` — `AVHWFramesContext::hwctx`. `BindFlags` lets us add
|
/// `hwcontext_d3d11va.h` — `AVHWFramesContext::hwctx`. The header is explicit: "The user must at
|
||||||
/// `D3D11_BIND_SHADER_RESOURCE` so the decoded array texture is sampleable (zero copy).
|
/// least set D3D11_BIND_DECODER if the frames context is to be used for video decoding" — a
|
||||||
|
/// user-built frames context gets NO default (BindFlags 0 → `CreateTexture2D` E_INVALIDARG); the
|
||||||
|
/// automatic OR-in lives only in libavcodec's own frames-param path, which we bypass.
|
||||||
#[repr(C)]
|
#[repr(C)]
|
||||||
struct AVD3D11VAFramesContext {
|
struct AVD3D11VAFramesContext {
|
||||||
texture: *mut c_void, // ID3D11Texture2D* (null → FFmpeg allocates the pool)
|
texture: *mut c_void, // ID3D11Texture2D* (null → FFmpeg allocates the pool)
|
||||||
bind_flags: u32, // UINT BindFlags
|
bind_flags: u32, // UINT BindFlags
|
||||||
misc_flags: u32, // UINT MiscFlags
|
misc_flags: u32, // UINT MiscFlags
|
||||||
|
texture_infos: *mut c_void, // AVD3D11FrameDescriptor* (FFmpeg-managed)
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/// `D3D11_BIND_DECODER` — the decode pool's ONLY bind flag. Adding `D3D11_BIND_SHADER_RESOURCE`
|
||||||
|
/// is what NVIDIA rejects on a decoder texture ARRAY; the presenter samples via its own copy.
|
||||||
|
const BIND_DECODER: u32 = 0x200;
|
||||||
|
|
||||||
fn averr(what: &str, code: i32) -> anyhow::Error {
|
fn averr(what: &str, code: i32) -> anyhow::Error {
|
||||||
anyhow!("{what}: {}", ffmpeg::Error::from(code))
|
anyhow!("{what}: {}", ffmpeg::Error::from(code))
|
||||||
}
|
}
|
||||||
|
|
||||||
/// libavcodec's `get_format` callback: accept the D3D11 hw surface, building a frames context whose
|
/// libavcodec's `get_format` callback: pick the D3D11 hw surface format and nothing else.
|
||||||
/// textures carry `BIND_SHADER_RESOURCE` (so the presenter can sample them). Returning anything but
|
/// Deliberately does NOT build a frames context — with `hw_device_ctx` set and `hw_frames_ctx`
|
||||||
/// `AV_PIX_FMT_D3D11` aborts hardware decode → the session demotes to software.
|
/// left null, libavcodec derives the decode pool itself (`ff_decode_get_hw_frames_ctx`), applying
|
||||||
|
/// every vendor quirk: DXVA surface alignment (128 for HEVC/AV1), DPB-based pool sizing, and the
|
||||||
|
/// decoder-only `D3D11_BIND_DECODER` flags. A hand-built context validated on NVIDIA was rejected
|
||||||
|
/// by Intel at the first `SubmitDecoderBuffers` (E_INVALIDARG) — the vendor-proof path is the one
|
||||||
|
/// the ffmpeg CLI/mpv ship. Returning anything but `AV_PIX_FMT_D3D11` aborts hardware decode →
|
||||||
|
/// the session demotes to software.
|
||||||
unsafe extern "C" fn get_format_d3d11(
|
unsafe extern "C" fn get_format_d3d11(
|
||||||
avctx: *mut ffmpeg::ffi::AVCodecContext,
|
avctx: *mut ffmpeg::ffi::AVCodecContext,
|
||||||
mut list: *const ffmpeg::ffi::AVPixelFormat,
|
mut list: *const ffmpeg::ffi::AVPixelFormat,
|
||||||
) -> ffmpeg::ffi::AVPixelFormat {
|
) -> ffmpeg::ffi::AVPixelFormat {
|
||||||
use ffmpeg::ffi::*;
|
use ffmpeg::ffi::*;
|
||||||
unsafe {
|
unsafe {
|
||||||
let mut found = false;
|
if (*avctx).hw_device_ctx.is_null() {
|
||||||
|
return AVPixelFormat::AV_PIX_FMT_NONE;
|
||||||
|
}
|
||||||
while *list != AVPixelFormat::AV_PIX_FMT_NONE {
|
while *list != AVPixelFormat::AV_PIX_FMT_NONE {
|
||||||
if *list == AVPixelFormat::AV_PIX_FMT_D3D11 {
|
if *list == AVPixelFormat::AV_PIX_FMT_D3D11 {
|
||||||
found = true;
|
return AVPixelFormat::AV_PIX_FMT_D3D11;
|
||||||
break;
|
|
||||||
}
|
}
|
||||||
list = list.add(1);
|
list = list.add(1);
|
||||||
}
|
}
|
||||||
if !found {
|
AVPixelFormat::AV_PIX_FMT_NONE
|
||||||
return AVPixelFormat::AV_PIX_FMT_NONE;
|
|
||||||
}
|
}
|
||||||
let device_ref = (*avctx).hw_device_ctx;
|
|
||||||
if device_ref.is_null() {
|
|
||||||
return AVPixelFormat::AV_PIX_FMT_NONE;
|
|
||||||
}
|
}
|
||||||
let frames_ref = av_hwframe_ctx_alloc(device_ref);
|
|
||||||
|
/// Predict whether D3D11VA decode will work by building the kind of pool libavcodec derives for the decoder:
|
||||||
|
/// allocate an `AVHWFramesContext` (decoder-only pool, no shader-resource bind) and initialize it,
|
||||||
|
/// which creates the real NV12 decode surface array. On a GPU/driver that can't create the pool this
|
||||||
|
/// fails here, up front, so the session commits to software decode from the first frame (a clean,
|
||||||
|
/// gap-free stream) rather than decoding the IDR then dying mid-stream on a texture error that a
|
||||||
|
/// software demotion can't reliably recover from (the host's infinite GOP won't re-send an IDR).
|
||||||
|
unsafe fn d3d11va_decode_supported(hw_device: *mut ffmpeg::ffi::AVBufferRef) -> bool {
|
||||||
|
use ffmpeg::ffi::*;
|
||||||
|
unsafe {
|
||||||
|
let frames_ref = av_hwframe_ctx_alloc(hw_device);
|
||||||
if frames_ref.is_null() {
|
if frames_ref.is_null() {
|
||||||
return AVPixelFormat::AV_PIX_FMT_NONE;
|
return false;
|
||||||
}
|
}
|
||||||
let frames = (*frames_ref).data as *mut AVHWFramesContext;
|
let frames = (*frames_ref).data as *mut AVHWFramesContext;
|
||||||
(*frames).format = AVPixelFormat::AV_PIX_FMT_D3D11;
|
(*frames).format = AVPixelFormat::AV_PIX_FMT_D3D11;
|
||||||
let sw = if (*avctx).sw_pix_fmt != AVPixelFormat::AV_PIX_FMT_NONE {
|
(*frames).sw_format = AVPixelFormat::AV_PIX_FMT_NV12;
|
||||||
(*avctx).sw_pix_fmt
|
(*frames).width = 1920;
|
||||||
} else {
|
(*frames).height = 1152; // 128-aligned 1080p surface (the HEVC DXVA alignment, see get_format)
|
||||||
AVPixelFormat::AV_PIX_FMT_NV12
|
(*frames).initial_pool_size = DECODE_POOL_SIZE;
|
||||||
};
|
// Decoder-only bind flag, matching the pool libavcodec derives (see get_format).
|
||||||
(*frames).sw_format = sw;
|
|
||||||
(*frames).width = (*avctx).coded_width;
|
|
||||||
(*frames).height = (*avctx).coded_height;
|
|
||||||
// DPB + a few in-flight (decoded channel + the presenter's held frame); the host's
|
|
||||||
// zero-reorder stream needs only a small DPB, so 20 is comfortable headroom.
|
|
||||||
(*frames).initial_pool_size = 20;
|
|
||||||
let fhw = (*frames).hwctx as *mut AVD3D11VAFramesContext;
|
let fhw = (*frames).hwctx as *mut AVD3D11VAFramesContext;
|
||||||
(*fhw).bind_flags = D3D11_BIND_SHADER_RESOURCE;
|
(*fhw).bind_flags = BIND_DECODER;
|
||||||
let r = av_hwframe_ctx_init(frames_ref);
|
let r = av_hwframe_ctx_init(frames_ref);
|
||||||
if r < 0 {
|
|
||||||
let mut fr = frames_ref;
|
let mut fr = frames_ref;
|
||||||
av_buffer_unref(&mut fr);
|
av_buffer_unref(&mut fr);
|
||||||
return AVPixelFormat::AV_PIX_FMT_NONE;
|
r >= 0
|
||||||
}
|
|
||||||
(*avctx).hw_frames_ctx = frames_ref; // decoder takes ownership
|
|
||||||
AVPixelFormat::AV_PIX_FMT_D3D11
|
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
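The probe's 1920x1152 surface comes from rounding 1080 up to the 128-pixel alignment mentioned for HEVC. A small sketch of that arithmetic follows; `align_up` is a hypothetical helper for illustration, not a function from this change or from FFmpeg.

```rust
/// Rounds `value` up to the next multiple of `align`.
/// `align` must be a power of two (128 in the HEVC/AV1 DXVA case described above).
/// Hypothetical helper for illustration only.
fn align_up(value: u32, align: u32) -> u32 {
    debug_assert!(align.is_power_of_two());
    // Adding `align - 1` pushes any non-multiple past the next boundary;
    // masking off the low bits then snaps back down onto that boundary.
    (value + align - 1) & !(align - 1)
}
```

With a 128 alignment, a height of 1080 becomes 1152, while a width of 1920 is already a multiple of 128 and stays unchanged.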
@@ -395,6 +483,8 @@ impl D3d11vaDecoder {
|
|||||||
if !shared.hardware {
|
if !shared.hardware {
|
||||||
bail!("shared device is WARP (no hardware video decode)");
|
bail!("shared device is WARP (no hardware video decode)");
|
||||||
}
|
}
|
||||||
|
// The adapter must expose the codec's DXVA profile — checked here, not at the first AU.
|
||||||
|
decode_profile_supported(&shared.device, codec_id)?;
|
||||||
unsafe {
|
unsafe {
|
||||||
// Build a D3D11VA hwdevice context around the *shared* device, so decoded textures live
|
// Build a D3D11VA hwdevice context around the *shared* device, so decoded textures live
|
||||||
// on the same device the presenter samples + draws with.
|
// on the same device the presenter samples + draws with.
|
||||||
@@ -417,6 +507,15 @@ impl D3d11vaDecoder {
|
|||||||
bail!("av_hwdevice_ctx_init: {}", ffmpeg::Error::from(r));
|
bail!("av_hwdevice_ctx_init: {}", ffmpeg::Error::from(r));
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// Up-front viability probe (see `d3d11va_decode_supported`): a GPU/driver that can't
|
||||||
|
// create the decode surface pool commits to software NOW, so it decodes cleanly from the
|
||||||
|
// first frame instead of failing mid-stream (which a demotion can't reliably recover).
|
||||||
|
if !d3d11va_decode_supported(hw_device) {
|
||||||
|
let mut hw = hw_device;
|
||||||
|
ffi::av_buffer_unref(&mut hw);
|
||||||
|
bail!("GPU can't create the D3D11VA decode surface pool — using software decode");
|
||||||
|
}
|
||||||
|
|
||||||
let codec = ffi::avcodec_find_decoder(codec_id.into());
|
let codec = ffi::avcodec_find_decoder(codec_id.into());
|
||||||
if codec.is_null() {
|
if codec.is_null() {
|
||||||
let mut hw = hw_device;
|
let mut hw = hw_device;
|
||||||
@@ -427,7 +526,11 @@ impl D3d11vaDecoder {
|
|||||||
(*ctx).hw_device_ctx = ffi::av_buffer_ref(hw_device);
|
(*ctx).hw_device_ctx = ffi::av_buffer_ref(hw_device);
|
||||||
(*ctx).get_format = Some(get_format_d3d11);
|
(*ctx).get_format = Some(get_format_d3d11);
|
||||||
(*ctx).flags |= ffi::AV_CODEC_FLAG_LOW_DELAY as i32;
|
(*ctx).flags |= ffi::AV_CODEC_FLAG_LOW_DELAY as i32;
|
||||||
(*ctx).thread_count = 1; // hwaccel: threads only add latency
|
// hwaccel: threads only add latency.
|
||||||
|
(*ctx).thread_count = 1;
|
||||||
|
// On top of the DPB-based pool libavcodec sizes for us: the bounded decoded channel
|
||||||
|
// (2) + the frame the presenter holds until its copy flushes + margin.
|
||||||
|
(*ctx).extra_hw_frames = 4;
|
||||||
let r = ffi::avcodec_open2(ctx, codec, ptr::null_mut());
|
let r = ffi::avcodec_open2(ctx, codec, ptr::null_mut());
|
||||||
if r < 0 {
|
if r < 0 {
|
||||||
let mut ctx = ctx;
|
let mut ctx = ctx;
|
||||||
@@ -499,6 +602,7 @@ impl D3d11vaDecoder {
|
|||||||
width: (*self.frame).width as u32,
|
width: (*self.frame).width as u32,
|
||||||
height: (*self.frame).height as u32,
|
height: (*self.frame).height as u32,
|
||||||
index: (*self.frame).data[1] as usize as u32,
|
index: (*self.frame).data[1] as usize as u32,
|
||||||
|
ten_bit,
|
||||||
hdr,
|
hdr,
|
||||||
guard: D3d11FrameGuard(cloned),
|
guard: D3d11FrameGuard(cloned),
|
||||||
};
|
};
|
||||||
@@ -532,7 +636,7 @@ fn log_layout_once(width: u32, height: u32, index: u32, hdr: bool, ten_bit: bool
|
|||||||
slice = index,
|
slice = index,
|
||||||
hdr,
|
hdr,
|
||||||
ten_bit,
|
ten_bit,
|
||||||
"D3D11VA first frame (zero-copy)"
|
"D3D11VA first frame"
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|||||||