feat(hdr): Windows HDR10 + 10-bit end-to-end, negotiated; non-blocking capture recovery
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
apple / swift (push) Successful in 54s
ci / rust (push) Successful in 1m32s
android / android (push) Successful in 1m49s
ci / web (push) Successful in 26s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m36s
decky / build-publish (push) Successful in 12s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
deb / build-publish (push) Successful in 2m20s
flatpak / build-publish (push) Successful in 4m6s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m11s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m32s
Adds true HDR (BT.2020 PQ) and 10-bit (HEVC Main10) streaming, negotiated so an 8-bit/SDR client is never sent a stream it can't decode, plus a robust fix for the capture losing the stream across a secure-desktop transition. Protocol (punktfunk-core/quic.rs): - Hello gains `video_caps` (VIDEO_CAP_10BIT / VIDEO_CAP_HDR), Welcome gains `bit_depth`, both as optional trailing bytes (back-compat). client-rs advertises 10-bit via PUNKTFUNK_CLIENT_10BIT; the connector advertises 0 for now (in-band detection drives the native clients). Regenerated punktfunk_core.h. Windows host: - 10-bit Main10: host enables it only when the client advertised VIDEO_CAP_10BIT AND PUNKTFUNK_10BIT is set; threaded through open_video → NVENC (profile Main10, pixelBitDepthMinus8). - HDR: when the captured desktop is scRGB FP16 (R16G16B16A16_FLOAT, HDR on), copy it to an FP16 surface, composite the cursor there, convert scRGB → BT.2020 PQ 10-bit (R10G10B10A2) via a shader, and encode HEVC Main10 with the BT.2020/PQ colour VUI (ABGR10 input). Fixes the freeze + cursor-trail that came from feeding FP16 into the BGRA path. Reacts dynamically to the HDR toggle. - Capture recovery: rebuild is now a single NON-BLOCKING attempt, throttled to ~4×/s, repeating the last good frame between attempts (format-tagged last_present). During a secure-desktop dwell SudoVDA's output is gone; the old blocking 12 s retry starved the send loop for seconds so the client timed out and disconnected — now the session stays fed (frozen) until the desktop returns. Also seeds a black frame on recovery. Apple client (PunktfunkKit): - Detects HDR in-band from the stream VUI (PQ transfer function), decodes to 10-bit P010, and presents via an rgba16Float + BT.2020 PQ CAMetalLayer with EDR; SDR path unchanged. Switches automatically on a mid-session HDR toggle. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -7,6 +7,7 @@
|
||||
// (which fires on the main runloop). The Metal objects + texture cache are touched only here.
|
||||
|
||||
#if canImport(Metal) && canImport(QuartzCore)
|
||||
import CoreGraphics
|
||||
import CoreVideo
|
||||
import Metal
|
||||
import QuartzCore
|
||||
@@ -44,6 +45,27 @@ fragment float4 pf_frag(VOut in [[stage_in]],
|
||||
float b = y + 1.8556 * u;
|
||||
return float4(saturate(float3(r, g, b)), 1.0);
|
||||
}
|
||||
|
||||
// HDR: 10-bit P010 (BT.2020, limited range), Y'CbCr that is PQ-encoded. We apply the BT.2020
|
||||
// matrix to get PQ-encoded R'G'B' and output it as-is — the CAMetalLayer's itur_2100_PQ colour
|
||||
// space + EDR tells the compositor the samples are PQ, so it does the PQ→display mapping. No EOTF
|
||||
// here (matching the host, which emitted BT.2020 PQ). P010 stores the 10-bit code in the high bits
|
||||
// of each 16-bit sample, so an .r16Unorm sample reads ~code/1023 (the /1024 vs /1023 error is < 0.1%).
|
||||
fragment float4 pf_frag_hdr(VOut in [[stage_in]],
|
||||
texture2d<float> lumaTex [[texture(0)]],
|
||||
texture2d<float> chromaTex [[texture(1)]]) {
|
||||
constexpr sampler s(filter::linear, address::clamp_to_edge);
|
||||
float y = lumaTex.sample(s, in.uv).r;
|
||||
float2 c = chromaTex.sample(s, in.uv).rg;
|
||||
// BT.2020 10-bit limited (video) range → full-range PQ R'G'B'.
|
||||
y = (y - 64.0/1023.0) * (1023.0/876.0);
|
||||
float u = (c.x - 512.0/1023.0) * (1023.0/896.0);
|
||||
float v = (c.y - 512.0/1023.0) * (1023.0/896.0);
|
||||
float r = y + 1.4746 * v;
|
||||
float g = y - 0.16455 * u - 0.57135 * v;
|
||||
float b = y + 1.8814 * u;
|
||||
return float4(saturate(float3(r, g, b)), 1.0);
|
||||
}
|
||||
"""
|
||||
|
||||
public final class MetalVideoPresenter {
|
||||
@@ -52,8 +74,13 @@ public final class MetalVideoPresenter {
|
||||
|
||||
private let device: MTLDevice
|
||||
private let queue: MTLCommandQueue
|
||||
private let pipeline: MTLRenderPipelineState
|
||||
/// SDR (BT.709 8-bit NV12 → bgra8) and HDR (BT.2020 PQ 10-bit P010 → rgba16Float) pipelines.
|
||||
/// Selected per frame by `render`; the layer is reconfigured when the mode flips (HDR toggle).
|
||||
private let pipelineSDR: MTLRenderPipelineState
|
||||
private let pipelineHDR: MTLRenderPipelineState
|
||||
private var textureCache: CVMetalTextureCache?
|
||||
/// Current layer configuration — switched lazily in `configure(hdr:)` when a frame's mode differs.
|
||||
private var hdrActive = false
|
||||
|
||||
/// nil if Metal is unavailable (no GPU / a headless CI) — the caller falls back to stage-1.
|
||||
public init?() {
|
||||
@@ -64,11 +91,17 @@ public final class MetalVideoPresenter {
|
||||
self.queue = queue
|
||||
do {
|
||||
let library = try device.makeLibrary(source: shaderSource, options: nil)
|
||||
let desc = MTLRenderPipelineDescriptor()
|
||||
desc.vertexFunction = library.makeFunction(name: "pf_vtx")
|
||||
desc.fragmentFunction = library.makeFunction(name: "pf_frag")
|
||||
desc.colorAttachments[0].pixelFormat = .bgra8Unorm
|
||||
pipeline = try device.makeRenderPipelineState(descriptor: desc)
|
||||
let vtx = library.makeFunction(name: "pf_vtx")
|
||||
let sdr = MTLRenderPipelineDescriptor()
|
||||
sdr.vertexFunction = vtx
|
||||
sdr.fragmentFunction = library.makeFunction(name: "pf_frag")
|
||||
sdr.colorAttachments[0].pixelFormat = .bgra8Unorm
|
||||
pipelineSDR = try device.makeRenderPipelineState(descriptor: sdr)
|
||||
let hdr = MTLRenderPipelineDescriptor()
|
||||
hdr.vertexFunction = vtx
|
||||
hdr.fragmentFunction = library.makeFunction(name: "pf_frag_hdr")
|
||||
hdr.colorAttachments[0].pixelFormat = .rgba16Float // EDR-capable
|
||||
pipelineHDR = try device.makeRenderPipelineState(descriptor: hdr)
|
||||
} catch {
|
||||
return nil
|
||||
}
|
||||
@@ -102,14 +135,40 @@ public final class MetalVideoPresenter {
|
||||
if layer.drawableSize != size { layer.drawableSize = size }
|
||||
}
|
||||
|
||||
/// Draw one decoded frame to the next drawable and present it. Returns true on success;
|
||||
/// Reconfigure the layer for SDR or HDR when the stream mode flips (HDR toggle). HDR uses an
|
||||
/// rgba16Float drawable + a BT.2020 PQ colour space + EDR, so the compositor PQ-maps to the
|
||||
/// display; SDR uses the plain 8-bit sRGB path. Main-thread only (called from `render`).
|
||||
private func configure(hdr: Bool) {
|
||||
guard hdr != hdrActive else { return }
|
||||
hdrActive = hdr
|
||||
if hdr {
|
||||
layer.pixelFormat = .rgba16Float
|
||||
layer.colorspace = CGColorSpace(name: CGColorSpace.itur_2100_PQ)
|
||||
#if os(macOS)
|
||||
layer.wantsExtendedDynamicRangeContent = true
|
||||
#endif
|
||||
} else {
|
||||
layer.pixelFormat = .bgra8Unorm
|
||||
layer.colorspace = nil
|
||||
#if os(macOS)
|
||||
layer.wantsExtendedDynamicRangeContent = false
|
||||
#endif
|
||||
}
|
||||
}
|
||||
|
||||
/// Draw one decoded frame to the next drawable and present it. `isHDR` selects the 10-bit
|
||||
/// BT.2020 PQ path (P010 input) vs the 8-bit BT.709 path (NV12 input). Returns true on success;
|
||||
/// false when there's no drawable yet, a texture couldn't be made, or Metal errored — the
|
||||
/// caller then doesn't stamp a present for this frame.
|
||||
@discardableResult
|
||||
public func render(_ pixelBuffer: CVPixelBuffer) -> Bool {
|
||||
public func render(_ pixelBuffer: CVPixelBuffer, isHDR: Bool = false) -> Bool {
|
||||
configure(hdr: isHDR)
|
||||
// P010 stores 10-bit luma/chroma in 16-bit samples → R16/RG16; NV12 is 8-bit → R8/RG8.
|
||||
let lumaFmt: MTLPixelFormat = isHDR ? .r16Unorm : .r8Unorm
|
||||
let chromaFmt: MTLPixelFormat = isHDR ? .rg16Unorm : .rg8Unorm
|
||||
guard let textureCache,
|
||||
let luma = makeTexture(pixelBuffer, plane: 0, format: .r8Unorm, cache: textureCache),
|
||||
let chroma = makeTexture(pixelBuffer, plane: 1, format: .rg8Unorm, cache: textureCache)
|
||||
let luma = makeTexture(pixelBuffer, plane: 0, format: lumaFmt, cache: textureCache),
|
||||
let chroma = makeTexture(pixelBuffer, plane: 1, format: chromaFmt, cache: textureCache)
|
||||
else { return false }
|
||||
|
||||
// The hosting view owns drawableSize (aspect-fit to its bounds); skip until it's laid
|
||||
@@ -127,7 +186,7 @@ public final class MetalVideoPresenter {
|
||||
guard let encoder = commandBuffer.makeRenderCommandEncoder(descriptor: pass) else {
|
||||
return false
|
||||
}
|
||||
encoder.setRenderPipelineState(pipeline)
|
||||
encoder.setRenderPipelineState(isHDR ? pipelineHDR : pipelineSDR)
|
||||
encoder.setFragmentTexture(CVMetalTextureGetTexture(luma), index: 0)
|
||||
encoder.setFragmentTexture(CVMetalTextureGetTexture(chroma), index: 1)
|
||||
encoder.drawPrimitives(type: .triangle, vertexStart: 0, vertexCount: 3)
|
||||
|
||||
@@ -144,7 +144,7 @@ public final class Stage2Pipeline {
|
||||
/// converted to `CLOCK_REALTIME` (see `realtimeNs(forDisplayLinkTimestamp:)`).
|
||||
public func renderTick(targetPresentNs: Int64) {
|
||||
guard let frame = ring.take() else { return }
|
||||
guard presenter.render(frame.pixelBuffer) else { return }
|
||||
guard presenter.render(frame.pixelBuffer, isHDR: frame.isHDR) else { return }
|
||||
presentMeter.record(ptsNs: frame.ptsNs, atNs: targetPresentNs, offsetNs: offsetNs)
|
||||
}
|
||||
|
||||
|
||||
@@ -17,8 +17,11 @@ public struct ReadyFrame: @unchecked Sendable {
|
||||
public let ptsNs: UInt64
|
||||
/// Client `CLOCK_REALTIME` instant decode completed, in nanoseconds.
|
||||
public let decodedNs: Int64
|
||||
/// The decoded image (NV12 biplanar, Metal-compatible).
|
||||
/// The decoded image — 8-bit NV12 biplanar (SDR) or 10-bit P010 biplanar (HDR), Metal-compatible.
|
||||
public let pixelBuffer: CVPixelBuffer
|
||||
/// True when the stream is HDR (BT.2020 PQ): the buffer is 10-bit P010 and the presenter must
|
||||
/// configure EDR + BT.2020 PQ output. Derived from the decoded buffer's pixel format.
|
||||
public let isHDR: Bool
|
||||
}
|
||||
|
||||
/// The C output callback can't capture context, so VideoToolbox hands it the refcon we set at
|
||||
@@ -116,8 +119,22 @@ public final class VideoDecoder: @unchecked Sendable {
|
||||
format = nil
|
||||
}
|
||||
|
||||
/// `lock` held. Replace the session with one for `newFormat`. NV12 video-range, Metal-
|
||||
/// compatible output (10-bit/HDR is a later tie-in — see the plan).
|
||||
/// True when `newFormat` carries a PQ (SMPTE ST 2084) or HLG transfer function — i.e. the host
|
||||
/// is sending HDR (BT.2020). VideoToolbox populates the transfer-function extension from the
|
||||
/// HEVC VUI, so this tracks the *stream*, switching dynamically when the user toggles HDR
|
||||
/// (the host re-emits parameter sets with the new VUI → a new format desc → session rebuild).
|
||||
static func isHDRFormat(_ format: CMVideoFormatDescription) -> Bool {
|
||||
guard
|
||||
let tf = CMFormatDescriptionGetExtension(
|
||||
format, extensionKey: kCMFormatDescriptionExtension_TransferFunction)
|
||||
else { return false }
|
||||
let s = tf as? String
|
||||
return s == (kCMFormatDescriptionTransferFunction_SMPTE_ST_2084_PQ as String)
|
||||
|| s == (kCMFormatDescriptionTransferFunction_ITU_R_2100_HLG as String)
|
||||
}
|
||||
|
||||
/// `lock` held. Replace the session with one for `newFormat`. SDR streams decode to 8-bit NV12;
|
||||
/// HDR streams (BT.2020 PQ) decode to 10-bit P010 so the presenter can drive EDR.
|
||||
private func createSessionLocked(format newFormat: CMVideoFormatDescription) -> Bool {
|
||||
if let session {
|
||||
VTDecompressionSessionWaitForAsynchronousFrames(session)
|
||||
@@ -126,10 +143,14 @@ public final class VideoDecoder: @unchecked Sendable {
|
||||
session = nil
|
||||
format = nil
|
||||
|
||||
let hdr = Self.isHDRFormat(newFormat)
|
||||
let pixelFormat =
|
||||
hdr
|
||||
? kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange // P010 (10-bit)
|
||||
: kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange // NV12 (8-bit)
|
||||
let imageAttrs: [CFString: Any] = [
|
||||
kCVPixelBufferMetalCompatibilityKey: true,
|
||||
kCVPixelBufferPixelFormatTypeKey:
|
||||
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange,
|
||||
kCVPixelBufferPixelFormatTypeKey: pixelFormat,
|
||||
]
|
||||
var callback = VTDecompressionOutputCallbackRecord(
|
||||
decompressionOutputCallback: decoderOutputCallback,
|
||||
@@ -160,6 +181,11 @@ public final class VideoDecoder: @unchecked Sendable {
|
||||
// pts was stamped at timescale 1e9 (AnnexB.sampleBuffer); normalize defensively.
|
||||
let p = CMTimeConvertScale(pts, timescale: 1_000_000_000, method: .default)
|
||||
let ptsNs = p.value > 0 ? UInt64(p.value) : 0
|
||||
onDecoded(ReadyFrame(ptsNs: ptsNs, decodedNs: decodedNs, pixelBuffer: imageBuffer))
|
||||
// HDR iff the decoder produced a 10-bit P010 buffer (we only request P010 for PQ streams).
|
||||
let isHDR =
|
||||
CVPixelBufferGetPixelFormatType(imageBuffer)
|
||||
== kCVPixelFormatType_420YpCbCr10BiPlanarVideoRange
|
||||
onDecoded(
|
||||
ReadyFrame(ptsNs: ptsNs, decodedNs: decodedNs, pixelBuffer: imageBuffer, isHDR: isHDR))
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user