fix(host/windows): HDR cursor brightness (203-nit) + probe-before-adopt recovery; windows-client bootstrap doc
apple / swift (push) Successful in 55s
android / android (push) Successful in 2m43s
ci / web (push) Successful in 31s
ci / docs-site (push) Successful in 37s
ci / bench (push) Successful in 1m35s
ci / rust (push) Successful in 7m7s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 2m18s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m33s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m33s
docker / deploy-docs (push) Successful in 18s
- HDR cursor: sRGB→linear decode + scale to HDR graphics white (PUNKTFUNK_HDR_CURSOR_NITS,
  default 203 per BT.2408) in the FP16 cursor composite, so it's no longer ~2.5x too dim. SDR path
  unchanged; the masked-color (I-beam) inversion blend left unscaled. Cursor cbuffer widened 16→32
  + bound to PS. (Validated live: cursor now correct brightness in HDR.)
- Secure-desktop recovery: recreate_dupl now PROBES the rebuilt duplication with a 50ms
  AcquireNextFrame and only adopts it when live (Ok/WAIT_TIMEOUT); a born-lost one (immediate
  ACCESS_LOST) is dropped so the caller repeats the last frame + retries. Plus reassert_isolation()
  re-detaches physical displays on every recovery (re-routing the secure/HDR desktop to the virtual
  output, the delta a fresh reconnect has).
  NOTE: the born-lost ACCESS_LOST storm in HDR is NOT yet resolved by these — still under
  investigation (animations/secure-UI/cursor-trail in HDR remain).
- docs/windows-client-bootstrap.md: handoff for the native Windows Rust client (windows-rs Reactor +
  WinUI 3 SwapChainPanel, D3D11VA decode, WASAPI audio, SDL3 input; ports
  crates/punktfunk-client-linux; 10-bit/HDR present; dev boxes + gotchas).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -206,8 +206,21 @@ VOut main(uint vid : SV_VertexID) {
 const CURSOR_PS: &str = r"
 Texture2D tx : register(t0);
 SamplerState sm : register(s0);
+// b0 is shared with the VS: float4 rect, then the HDR cursor params. For SDR white_mul=1 / decode=0
+// so this is a no-op (returns the raw sampled BGRA, blended in the display's native sRGB space). For
+// HDR the cursor is composited onto a LINEAR scRGB FP16 surface where 1.0 = 80 nits, so we sRGB→
+// linear decode (correct alpha blending + no dark edge fringe) and scale to HDR graphics white
+// (~203 nits → white_mul = 203/80) so the cursor isn't ~2.5x too dim vs the HDR desktop.
+cbuffer C : register(b0) { float4 rect; float white_mul; float decode; float2 pad; };
+float3 srgb_to_linear(float3 c) {
+    return c <= 0.04045 ? c / 12.92 : pow((c + 0.055) / 1.055, 2.4);
+}
 float4 main(float4 pos : SV_POSITION, float2 uv : TEXCOORD0) : SV_TARGET {
-    return tx.Sample(sm, uv);
+    float4 s = tx.Sample(sm, uv);
+    float3 rgb = s.rgb;
+    if (decode > 0.5) { rgb = srgb_to_linear(rgb); }
+    rgb *= white_mul;
+    return float4(rgb, s.a);
 }
 ";
 
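The decode-then-scale step in the pixel shader above can be checked on the CPU. The following is a minimal sketch, not code from the commit: it mirrors the shader's piecewise sRGB decode and the `nits / 80` scRGB scaling, and the helper name `cursor_rgb_to_scrgb` is hypothetical.

```rust
/// CPU mirror of the shader's per-channel sRGB decode (piecewise sRGB transfer function).
fn srgb_to_linear(c: f32) -> f32 {
    if c <= 0.04045 {
        c / 12.92
    } else {
        ((c + 0.055) / 1.055).powf(2.4)
    }
}

/// scRGB multiplier for a target white level, given that scRGB 1.0 corresponds to 80 nits.
fn white_mul(nits: f32) -> f32 {
    nits / 80.0
}

/// Hypothetical helper: map one sRGB-encoded cursor pixel to linear scRGB at `nits` white.
fn cursor_rgb_to_scrgb(rgb: [f32; 3], nits: f32) -> [f32; 3] {
    let m = white_mul(nits);
    rgb.map(|c| srgb_to_linear(c) * m)
}
```

With the default of 203 nits, a fully white cursor pixel lands at roughly 2.54 in scRGB, which is the "~2.5x" factor the commit message refers to.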
@@ -267,7 +280,7 @@ impl CursorCompositor {
         device.CreatePixelShader(&psb, None, Some(&mut ps))?;
 
         let cbd = D3D11_BUFFER_DESC {
-            ByteWidth: 16,
+            ByteWidth: 32, // float4 rect + (white_mul, decode, pad, pad) for the HDR cursor PS
             Usage: D3D11_USAGE_DYNAMIC,
             BindFlags: D3D11_BIND_CONSTANT_BUFFER.0 as u32,
             CPUAccessFlags: D3D11_CPU_ACCESS_WRITE.0 as u32,
@@ -375,6 +388,13 @@ impl CursorCompositor {
         cx: i32,
         cy: i32,
         invert: bool,
+        // HDR (decode=true): sRGB→linear decode + scale the cursor to `white_mul` × 80 nits, so a
+        // white cursor hits HDR graphics white (~203 nits) not 80. SDR passes white_mul=1.0,
+        // decode=false → the PS returns the raw sample (blended in the display's native sRGB space).
+        // The inversion (masked-color / I-beam) blend operates on the framebuffer reference, so it is
+        // left unscaled/undecoded even in HDR.
+        white_mul: f32,
+        decode: bool,
     ) {
         let (srv, cw, ch) = match &self.tex {
             Some(t) => t,
@@ -384,13 +404,19 @@ impl CursorCompositor {
         let x1 = ((cx + *cw as i32) as f32 / fw as f32) * 2.0 - 1.0;
         let y0 = 1.0 - (cy as f32 / fh as f32) * 2.0;
         let y1 = 1.0 - ((cy + *ch as i32) as f32 / fh as f32) * 2.0;
-        let rect = [x0, y0, x1, y1];
+        let (mul, dec) = if invert {
+            (1.0_f32, 0.0_f32)
+        } else {
+            (white_mul, if decode { 1.0 } else { 0.0 })
+        };
+        // cbuf layout: [rect.x, rect.y, rect.z, rect.w, white_mul, decode, pad, pad] (32 bytes).
+        let cb = [x0, y0, x1, y1, mul, dec, 0.0, 0.0];
         let mut mapped = D3D11_MAPPED_SUBRESOURCE::default();
         if ctx
             .Map(&self.cbuf, 0, D3D11_MAP_WRITE_DISCARD, 0, Some(&mut mapped))
             .is_ok()
         {
-            std::ptr::copy_nonoverlapping(rect.as_ptr(), mapped.pData as *mut f32, 4);
+            std::ptr::copy_nonoverlapping(cb.as_ptr(), mapped.pData as *mut f32, cb.len());
             ctx.Unmap(&self.cbuf, 0);
         }
         let vp = D3D11_VIEWPORT {
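The constant-buffer packing in this hunk can be isolated into a pure function and checked without a D3D11 device. This is a sketch under that assumption; `pack_cursor_cb` is a hypothetical name, and only the layout (rect, `white_mul`, `decode`, two pad floats) is taken from the diff.

```rust
/// Pack the cursor constants in the layout the shader's `cbuffer C` expects:
/// rect (4 floats), white_mul, decode flag, then two pad floats. Total: 8 x f32 = 32 bytes.
fn pack_cursor_cb(rect: [f32; 4], white_mul: f32, decode: bool, invert: bool) -> [f32; 8] {
    // The inversion (masked-color) blend is left unscaled and undecoded, as in the diff.
    let (mul, dec) = if invert {
        (1.0_f32, 0.0_f32)
    } else {
        (white_mul, if decode { 1.0 } else { 0.0 })
    };
    [rect[0], rect[1], rect[2], rect[3], mul, dec, 0.0, 0.0]
}
```

D3D11 requires a constant buffer's `ByteWidth` to be a multiple of 16, which is why the two pad floats round the six used values up to 32 bytes, matching the `ByteWidth: 32` change earlier in the commit.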
@@ -412,6 +438,7 @@ impl CursorCompositor {
         ctx.VSSetShader(&self.vs, None);
         ctx.PSSetShader(&self.ps, None);
         ctx.VSSetConstantBuffers(0, Some(&[Some(self.cbuf.clone())]));
+        ctx.PSSetConstantBuffers(0, Some(&[Some(self.cbuf.clone())])); // white_mul/decode for the PS
         ctx.PSSetShaderResources(0, Some(&[Some(srv.clone())]));
         ctx.PSSetSamplers(0, Some(&[Some(self.sampler.clone())]));
         ctx.IASetInputLayout(None);
@@ -1110,8 +1137,11 @@ impl DuplCapturer {
         }
     }
 
-    /// Composite the cursor onto the GPU frame texture (zero-copy path).
-    unsafe fn composite_cursor_gpu(&mut self, gpu: &ID3D11Texture2D) -> Result<()> {
+    /// Composite the cursor onto the GPU frame texture (zero-copy path). `hdr` = the target is the
+    /// linear scRGB FP16 surface (HDR path) — the cursor is then sRGB→linear decoded and scaled to
+    /// HDR graphics white (PUNKTFUNK_HDR_CURSOR_NITS, default 203, per BT.2408) so it isn't ~2.5×
+    /// too dim; SDR composites the raw cursor in the display's native sRGB space.
+    unsafe fn composite_cursor_gpu(&mut self, gpu: &ID3D11Texture2D, hdr: bool) -> Result<()> {
         // Diagnostic kill-switch: skip the GPU cursor composite entirely (PUNKTFUNK_NO_CURSOR=1) to
         // isolate its cost on the 3D engine. The per-frame render-target view + draw to the 5K target
         // is the suspect for the high 3D usage under heavy desktop change.
@@ -1151,6 +1181,18 @@ impl DuplCapturer {
             .CreateRenderTargetView(gpu, None, Some(&mut rtv))?;
         let rtv = rtv.context("cursor rtv")?;
         let (cx, cy) = self.cursor_pos;
+        // HDR graphics-white target in nits → scRGB multiplier (scRGB 1.0 = 80 nits). Default 203
+        // (BT.2408); PUNKTFUNK_HDR_CURSOR_NITS overrides without a rebuild. SDR → 1.0, no decode.
+        let white_mul = if hdr {
+            let nits = std::env::var("PUNKTFUNK_HDR_CURSOR_NITS")
+                .ok()
+                .and_then(|s| s.parse::<f32>().ok())
+                .filter(|n| n.is_finite() && *n > 0.0)
+                .unwrap_or(203.0);
+            nits / 80.0
+        } else {
+            1.0
+        };
         self.cursor.as_ref().unwrap().draw(
             &self.context,
             &rtv,
@@ -1159,6 +1201,8 @@ impl DuplCapturer {
             cx,
             cy,
             self.cursor_invert,
+            white_mul,
+            hdr, // decode sRGB→linear only on the HDR (linear FP16) target
         );
         Ok(())
     }
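The override parsing for `PUNKTFUNK_HDR_CURSOR_NITS` can be exercised in isolation by passing the raw value in rather than reading the environment. A minimal sketch of the same chain follows; the function name `hdr_white_mul` is hypothetical, while the filter and the 203-nit fallback follow the diff.

```rust
/// Turn an optional raw override string into the scRGB white multiplier.
/// Unparsable, non-finite, zero, or negative values fall back to 203 nits.
fn hdr_white_mul(raw: Option<&str>) -> f32 {
    let nits = raw
        .and_then(|s| s.parse::<f32>().ok())
        // "nan" and "inf" parse successfully as f32, so they must be rejected explicitly.
        .filter(|n| n.is_finite() && *n > 0.0)
        .unwrap_or(203.0);
    nits / 80.0 // scRGB 1.0 = 80 nits
}
```

In the host this would be called as `hdr_white_mul(std::env::var("PUNKTFUNK_HDR_CURSOR_NITS").ok().as_deref())`. Note that the value is not trimmed, so an override with surrounding whitespace falls back to the default.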
@@ -1183,10 +1227,41 @@ impl DuplCapturer {
             self.gdi_name = n;
         }
         attach_input_desktop();
+        // Re-route the secure (Winlogon) desktop back to the virtual output. The lock/UAC switch can
+        // re-attach a physical monitor so the secure desktop lands there and our virtual output goes
+        // perpetually ACCESS_LOST; re-isolating (as a fresh session's `create` does) is the delta that
+        // makes in-session recovery work like a reconnect. Idempotent/cheap when already isolated.
+        crate::vdisplay::sudovda::reassert_isolation(&self.gdi_name);
         let (dev, ctx, out, dupl) = reopen_duplication(&self.gdi_name)?; // Err → caller repeats + retries
-        // A desktop switch can come back at a different size (e.g. the user session applies its own
-        // resolution on login). Adopt it: update dimensions and drop the staging/gpu copies so they
-        // reallocate. NVENC re-inits at the new size when it sees the frame.
+
+        // PROBE before adopting. During the unsettled Winlogon switch DuplicateOutput SUCCEEDS but the
+        // duplication is "born-lost" — the first AcquireNextFrame immediately returns ACCESS_LOST.
+        // Adopting it (swapping into self + seeding black) is exactly what produced the perpetual
+        // rebuild→born-lost storm (lost=2097) where the secure desktop never appeared. So gate adoption
+        // on a probe: Ok (a frame) or WAIT_TIMEOUT (alive but idle) ⇒ live, adopt; any other error ⇒
+        // born-lost, drop the locals and bail so the caller repeats the last frame and retries on the
+        // 250ms throttle. Once the topology settles (and reassert_isolation has taken), a probe passes
+        // and we adopt a LIVE duplication of the secure desktop.
+        {
+            let mut info = DXGI_OUTDUPL_FRAME_INFO::default();
+            let mut res: Option<IDXGIResource> = None;
+            match dupl.AcquireNextFrame(50, &mut info, &mut res) {
+                Ok(()) => {
+                    let _ = dupl.ReleaseFrame();
+                }
+                Err(e) if e.code() == DXGI_ERROR_WAIT_TIMEOUT => {}
+                Err(e) => {
+                    return Err(anyhow!(
+                        "rebuilt duplication is born-lost (probe AcquireNextFrame: {:#x}) — \
+                         topology not settled yet",
+                        e.code().0
+                    ));
+                }
+            }
+        }
+        // A desktop switch can come back at a different size (e.g. the user session applies its own
+        // resolution on login). Adopt it: update dimensions and drop the staging/gpu copies so they
+        // reallocate. NVENC re-inits at the new size when it sees the frame.
         let dd: DXGI_OUTDUPL_DESC = dupl.GetDesc();
         let (nw, nh) = (dd.ModeDesc.Width, dd.ModeDesc.Height);
         tracing::info!(
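The probe-before-adopt gate above reduces to a small decision function that can be tested without DXGI. The sketch below is illustrative only: the `Probe` and `Decision` types and the `recover` loop are hypothetical stand-ins, and `ACCESS_LOST` is the `DXGI_ERROR_ACCESS_LOST` HRESULT value written out as an assumption rather than imported from `windows-rs`.

```rust
/// Assumed HRESULT value of DXGI_ERROR_ACCESS_LOST, used only for illustration.
const ACCESS_LOST: i32 = 0x887A_0026_u32 as i32;

/// Hypothetical outcome of the 50 ms probe AcquireNextFrame.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Probe {
    Frame,       // Ok: a frame arrived (the real code releases it again)
    WaitTimeout, // alive but idle
    Failed(i32), // any other error, e.g. ACCESS_LOST: "born-lost"
}

#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Adopt,
    RetryLater(i32),
}

/// Adopt only a live duplication; otherwise keep the old state and retry later.
fn decide(p: Probe) -> Decision {
    match p {
        Probe::Frame | Probe::WaitTimeout => Decision::Adopt,
        Probe::Failed(hr) => Decision::RetryLater(hr),
    }
}

/// Run up to `max_attempts` throttled recovery ticks; returns the attempt that was adopted.
fn recover(mut probe: impl FnMut() -> Probe, max_attempts: usize) -> Option<usize> {
    (1..=max_attempts).find(|_| decide(probe()) == Decision::Adopt)
}
```

Treating `WaitTimeout` as success is the important design choice: an idle secure desktop produces no frames within 50 ms, yet the duplication is still valid.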
@@ -1317,7 +1392,7 @@ impl DuplCapturer {
             self.context.CopyResource(&src, &tex);
             let _ = self.dupl.ReleaseFrame();
             self.holding_frame = false;
-            self.composite_cursor_gpu(&src)?; // onto the FP16 surface (RTV works on FP16)
+            self.composite_cursor_gpu(&src, true)?; // onto the FP16 surface (HDR: decode + nits scale)
             self.ensure_hdr10_out()?;
             let out = self.hdr10_out.clone().context("hdr10 out texture")?;
             if self.hdr_conv.is_none() {
@@ -1355,7 +1430,7 @@ impl DuplCapturer {
             self.context.CopyResource(&gpu, &tex);
             let _ = self.dupl.ReleaseFrame();
             self.holding_frame = false;
-            self.composite_cursor_gpu(&gpu)?;
+            self.composite_cursor_gpu(&gpu, false)?;
             self.last_present = Some((gpu.clone(), PixelFormat::Bgra));
             return Ok(Some(CapturedFrame {
                 width: self.width,
@@ -347,6 +347,20 @@ unsafe fn restore_displays(saved: &[(String, DEVMODEW)]) {
     }
 }
 
+/// Re-detach physical displays so the secure (Winlogon) desktop keeps rendering to the virtual
+/// output — for the in-session DXGI capture recovery (dxgi.rs `recreate_dupl`). The lock/UAC/login
+/// switch can re-attach a physical monitor (the secure desktop then lands on IT and our virtual
+/// output goes perpetually ACCESS_LOST — the "born-lost" storm); re-running the isolate routes the
+/// secure desktop back to the virtual output, mirroring what a fresh session's `create` does (the
+/// delta that makes a reconnect work where in-session recovery didn't). Idempotent + cheap: when
+/// nothing besides `gdi_name` is attached, [`isolate_displays`] finds nothing to detach and commits
+/// nothing — so this is safe to call on every throttled recovery tick (no display thrash).
+pub(crate) fn reassert_isolation(gdi_name: &str) {
+    unsafe {
+        let _ = isolate_displays(gdi_name);
+    }
+}
+
 unsafe fn open_device() -> Result<HANDLE> {
     let hdev = SetupDiGetClassDevsW(
         Some(&SUVDA_INTERFACE),
@@ -0,0 +1,132 @@
# Windows native client — bootstrap handoff

A handoff for an agent picking up the **native Windows punktfunk/1 client**. The host side is done
and live-validated on a real RTX 4090; the client is the remaining piece. This doc is the concrete
starting point: the locked decisions, the reference code to port, the stack swaps, the dev loop, and
the gotchas. Read it top to bottom, then start at **Phase 1** (de-risk Reactor first).

## What we're building

A native Windows client that connects to a punktfunk/1 host (`serve --native` / `m3-host`), decodes
HEVC, presents it low-latency, plays Opus audio, and captures local mouse/keyboard/gamepad to send
back — i.e. the Windows analogue of the **GTK4 Linux client** (`crates/punktfunk-client-linux`),
which is the architectural template. The Windows client is close to a 1:1 port of the Linux client
with the platform layers swapped.

## Locked decisions (from the Windows-host/client plan, `docs/windows-host.md` + project memory)

- **Pure Rust.** `windows-rs` + **Windows App SDK "Reactor"** (WinUI 3 from Rust, merged windows-rs
  PR #4479). No C++/C#. De-risk Reactor + `SwapChainPanel` FIRST — it's the only novel/uncertain
  piece; everything else is a known-good port.
- **Links `punktfunk-core` directly** (Cargo path dep, `features = ["quic"]`) — **no C ABI**, exactly
  like the GTK client. `NativeClient` is already `Sync` (mutexed plane receivers), so it drops into a
  UI app cleanly. The C ABI (`punktfunk_connect` + `next_au`/`next_audio`/`next_rumble`/`next_hidout`/
  `send_input`/`send_rich_input`) is the *Apple* path; the native Rust clients call
  `crates/punktfunk-core/src/client.rs` (`NativeClient`) methods directly.
- **Video widget = WinUI 3 `SwapChainPanel`** (built-in), fed a D3D11 swapchain via
  `ISwapChainPanelNative::SetSwapChain`.
- **Decode = FFmpeg-next + D3D11VA** (HEVC; **Main10** for 10-bit/HDR — see below).
- **Audio playback = WASAPI render** + Opus decode (`opus` crate, vendors libopus via cmake; set
  `CMAKE_POLICY_VERSION_MINIMUM=3.5`).
- **Input capture→send**: the client captures LOCAL input and sends it. Mouse (abs + relative) +
  keyboard via the **inverse VK table** (port `keymap.rs`); gamepad via **SDL3** (already a workspace
  dep, cross-platform) → `NativeClient::send_input`/`send_rich_input`. (`SendInput`/`ViGEm` are
  HOST-side injection — not used by the client.)
- **Discovery = `mdns-sd`** (cross-platform, browses `_punktfunk._udp`).
- **Trust = shared client identity + SPAKE2 PIN pairing + TOFU** (port `trust.rs`; same identity
  files/logic as the other native clients).

## The reference: `crates/punktfunk-client-linux/src/`

Port these files (near 1:1; only the platform layers change):

| Linux file | Role | Windows swap |
|---|---|---|
| `main.rs` / `app.rs` | app shell, lifecycle | WinUI 3 `App`/`Window` via Reactor |
| `ui_hosts.rs` | host list / connect screen | WinUI 3 page |
| `ui_settings.rs` | settings | WinUI 3 page |
| `ui_stream.rs` | the streaming view | WinUI 3 page hosting `SwapChainPanel` |
| `video.rs` | FFmpeg decode + present | FFmpeg **D3D11VA** → D3D11 swapchain in `SwapChainPanel` |
| `audio.rs` | Opus decode + playback | **WASAPI render** (was PipeWire) |
| `session.rs` | `NativeClient` connect + plane pumps | **reuse almost verbatim** (core is cross-platform) |
| `trust.rs` | identity, PIN, TOFU | **reuse almost verbatim** |
| `discovery.rs` | mDNS browse | **reuse verbatim** (`mdns-sd`) |
| `keymap.rs` | inverse VK table | reuse; Windows VK is the native source so this is *simpler* |
| `gamepad.rs` | SDL3 pad capture + rumble/feedback | **reuse almost verbatim** (SDL3 is cross-platform) |

`session.rs`, `trust.rs`, `discovery.rs`, `keymap.rs`, `gamepad.rs` are mostly platform-neutral
(they touch `punktfunk-core` + SDL3 + mdns, all cross-platform) — expect to reuse them with minimal
changes. The real work is `video.rs` (D3D11VA + swapchain), `audio.rs` (WASAPI), and the WinUI shell.

## 10-bit + HDR (NEW — landed this session, the client MUST handle it)

The host now negotiates and emits **HEVC Main10 + BT.2020 PQ HDR10** when the captured desktop is
HDR (and 10-bit SDR Main10 when negotiated). The Apple client already does the matching present; the
Windows client should mirror it:

- **Advertise caps** in the `Hello`: `video_caps = VIDEO_CAP_10BIT | VIDEO_CAP_HDR`
  (`crates/punktfunk-core/src/quic.rs`). The host enables 10-bit only if the client advertised it.
  (The native-client connector in `client.rs` currently hardcodes `video_caps: 0` with a TODO —
  thread the real caps through when you wire decode; or detect HDR purely in-band, see next.)
- **Detect HDR in-band** from the HEVC VUI (transfer characteristics = SMPTE ST 2084 / PQ), exactly
  like the Apple client's `VideoDecoder.isHDRFormat` (`clients/apple/Sources/PunktfunkKit/`). This
  handles a mid-session HDR toggle without renegotiation. `Welcome.bit_depth` (8/10) is also available.
- **Decode** Main10 → **P010** (10-bit) via D3D11VA.
- **Present HDR**: swapchain in `DXGI_FORMAT_R10G10B10A2_UNORM` (or `R16G16B16A16_FLOAT`),
  `IDXGISwapChain3::SetColorSpace1(DXGI_COLOR_SPACE_RGB_FULL_G2084_NONE_P2020)` +
  `SetHDRMetaData` for HDR10; the host's stream is BT.2020 PQ, so present PQ. For SDR, the existing
  `DXGI_FORMAT_B8G8R8A8_UNORM` + BT.709 path. (The host-side HDR conversion math is in
  `crates/punktfunk-host/src/capture/dxgi.rs` `HDR_PS`/`HdrConverter` if you need the inverse.)

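The in-band HDR detection described above amounts to checking the VUI code points defined in ITU-T H.273. A minimal sketch follows, assuming the decoder exposes the two code points as plain integers; the `Vui` struct, the `Present` enum, and the function names are hypothetical and not part of `punktfunk-core` or FFmpeg.

```rust
/// H.273 transfer_characteristics code point for SMPTE ST 2084 (PQ).
const TRC_SMPTE_ST_2084: u8 = 16;
/// H.273 colour_primaries code point for BT.2020.
const PRI_BT2020: u8 = 9;

/// Hypothetical view of the colour description signalled in the HEVC VUI.
#[derive(Debug, Clone, Copy)]
struct Vui {
    colour_primaries: u8,
    transfer_characteristics: u8,
}

#[derive(Debug, PartialEq, Eq)]
enum Present {
    Sdr,     // existing 8-bit BGRA + BT.709 path
    Hdr10Pq, // 10-bit swapchain with the G2084/P2020 colour space
}

/// Pick the present path per frame, so a mid-session HDR toggle needs no renegotiation.
fn present_mode(v: Vui) -> Present {
    if v.transfer_characteristics == TRC_SMPTE_ST_2084 {
        Present::Hdr10Pq
    } else {
        Present::Sdr
    }
}

/// Sanity check: a PQ stream from this host is expected to carry BT.2020 primaries.
fn is_bt2020(v: Vui) -> bool {
    v.colour_primaries == PRI_BT2020
}
```

Keying the decision on the transfer characteristic alone mirrors the bullet above; whether the real client should also require BT.2020 primaries is left open here.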
## Dev boxes

- **No-GPU dev box (UI + connect + software decode):** `ssh "Enrico Bühler"@192.168.1.57` — Win11 Pro
  25H2 (build 26200), QEMU Q35, 8 vCPU/12 GB, **no working GPU** (so no NVENC, no D3D11VA hardware
  decode — use FFmpeg software decode here; this box is for UI/connect/protocol work). Has Rust 1.96
  MSVC, VS 2026 + VC tools + Win SDK, Win App Runtime 2.2, SudoVDA + Parsec VDD.
- **Real-GPU box (HDR / hardware decode / end-to-end):** `ssh "Enrico Bühler"@192.168.1.174` — Win11,
  RTX 4090, runs the host. Use it to test the client against a live HDR host.

### Dev-loop gotchas (both boxes)

- **Build under an ASCII path** (`C:\Users\Public\…`). The username "Enrico Bühler" has a `ü` → MSVC
  `LNK1201` PDB-write failure under `~/Developer`.
- **Toolchain gaps:** `winget install NASM.NASM Kitware.CMake LLVM.LLVM` (aws-lc-rs on the quic path,
  ffmpeg-sys needs libclang).
- **`CMAKE_POLICY_VERSION_MINIMUM=3.5`** in the build env (CMake 4 rejects libopus's old minimum).
- **File transfer = `sftp`** (scp is broken under the PowerShell DefaultShell):
  `printf 'put %s /C:/Users/Public/REL/PATH\n' LOCAL | sftp -b - "Enrico Bühler@192.168.1.57"` —
  note the **leading slash** `/C:/…`. Let the VM regenerate its own `Cargo.lock` (don't transfer it).
- **Windows clippy is stricter** than Linux CI and `cfg(windows)` code is excluded from Linux CI →
  run `cargo clippy -p punktfunk-client-windows -- -D warnings` ON THE VM before committing.
- Work on `main`; fetch+merge `origin/main` before pushing.

## Suggested phased plan

1. **De-risk Reactor (do this first).** A windows-rs Reactor (WinUI 3) hello-world that hosts a
   `SwapChainPanel` and presents a cleared D3D11 swapchain into it. Confirm the windows-rs Reactor
   version/API (PR #4479) and `ISwapChainPanelNative::SetSwapChain` interop. If Reactor proves too
   raw, the fallback is `winit` + a child HWND swapchain, but try Reactor first per the decision.
2. **Crate scaffold.** `crates/punktfunk-client-windows`, `[target.'cfg(windows)'.dependencies]`:
   `punktfunk-core { path, features=["quic"] }`, `windows`, the Reactor crate, `ffmpeg-next`, `opus`,
   `sdl3`, `mdns-sd`, `anyhow`, `tracing`. Mirror `crates/punktfunk-client-linux/Cargo.toml`.
3. **Connect + control plane.** Port `session.rs` + `trust.rs`; validate headless against the 4090
   box (`m3-host`/`serve --native`) — handshake, PIN/TOFU, plane counters — before any UI/decode.
4. **Decode + present.** FFmpeg D3D11VA → `SwapChainPanel`. SDR (8-bit BGRA) first, then **P010 +
   HDR colorspace** (see the HDR section).
5. **Audio.** WASAPI render + Opus decode (port `audio.rs`).
6. **Input.** Mouse + keyboard capture→send (port `keymap.rs`), gamepad via SDL3 (port `gamepad.rs`),
   feedback from `next_rumble`/`next_hidout`.
7. **Discovery + UI.** Port `discovery.rs` + `ui_hosts.rs` + `ui_settings.rs` to WinUI pages.

## Key references

- **Template:** `crates/punktfunk-client-linux/src/*` (the client to port).
- **Apple HDR present** (the pattern to mirror): `clients/apple/Sources/PunktfunkKit/{VideoDecoder,
  MetalVideoPresenter,Stage2Pipeline}.swift` — in-band PQ detection, P010 decode, EDR present.
- **Core client API:** `crates/punktfunk-core/src/client.rs` (`NativeClient`).
- **Protocol:** `crates/punktfunk-core/src/quic.rs` (`Hello.video_caps`, `Welcome.bit_depth`,
  `VIDEO_CAP_10BIT`/`VIDEO_CAP_HDR`).
- **Full Windows plan + SudoVDA/host details:** `docs/windows-host.md`.
- **Host HDR conversion (for the inverse math):** `crates/punktfunk-host/src/capture/dxgi.rs`
  (`HDR_PS`, `HdrConverter`) + `crates/punktfunk-host/src/encode/nvenc.rs` (BT.2020/PQ VUI).
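For the "inverse math" mentioned in the last reference, the transfer-function half is the SMPTE ST 2084 (PQ) curve. The sketch below is written from the published constants, not copied from `HDR_PS` or `HdrConverter`; it covers only the PQ code value to nits mapping and its inverse, not the BT.709 to BT.2020 gamut matrix, and the function names are hypothetical.

```rust
// SMPTE ST 2084 constants.
const M1: f64 = 2610.0 / 16384.0;
const M2: f64 = 2523.0 / 4096.0 * 128.0;
const C1: f64 = 3424.0 / 4096.0;
const C2: f64 = 2413.0 / 4096.0 * 32.0;
const C3: f64 = 2392.0 / 4096.0 * 32.0;

/// PQ EOTF: non-linear code value in [0, 1] to absolute luminance in nits (0..=10000).
fn pq_to_nits(e: f64) -> f64 {
    let p = e.clamp(0.0, 1.0).powf(1.0 / M2);
    let num = (p - C1).max(0.0);
    let den = C2 - C3 * p;
    10_000.0 * (num / den).powf(1.0 / M1)
}

/// Inverse PQ EOTF: luminance in nits to non-linear code value in [0, 1].
fn nits_to_pq(nits: f64) -> f64 {
    let y = (nits / 10_000.0).clamp(0.0, 1.0).powf(M1);
    ((C1 + C2 * y) / (1.0 + C3 * y)).powf(M2)
}

/// Decoded PQ value to linear scRGB, where scRGB 1.0 = 80 nits (as on the host's FP16 surface).
fn pq_to_scrgb(e: f64) -> f64 {
    pq_to_nits(e) / 80.0
}
```

As a cross-check against the cursor change in this commit, 203 nits encodes to a PQ code value of about 0.58, and decoding that value back gives roughly 2.54 in scRGB.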