fix(host/windows): don't 2-way-split-encode Main10 — it's SLOWER on Ada (fixes broken HDR animations)
apple / swift (push) Successful in 53s
audit / cargo-audit (push) Failing after 1m9s
android / android (push) Successful in 2m3s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 6s
ci / web (push) Successful in 29s
ci / docs-site (push) Successful in 29s
ci / bench (push) Successful in 1m31s
ci / rust (push) Successful in 4m26s
decky / build-publish (push) Successful in 11s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 5s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 3s
flatpak / build-publish (push) Successful in 3m34s
deb / build-publish (push) Successful in 6m55s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m25s
docker / deploy-docs (push) Successful in 18s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 5m10s

The "broken animations in HDR" was an encode-throughput cliff, not the ACCESS_LOST churn. Measured at
5120x1440@240 HEVC Main10 on the RTX 4090: forced 2-way split-encode = 7.6 ms/frame (~131 fps, well
over the 4.17 ms/240fps budget → choppy), while SINGLE engine = 2.8-3.9 ms/frame (~256-357 fps, fits
240). The split/merge overhead dominates for 10-bit; a single Ada NVENC engine already handles 5K@240
Main10 comfortably. So the split decision now forces DISABLE for Main10 (bit_depth >= 10), keeping the
existing forced-2 only for 8-bit above 1 Gpix/s. PUNKTFUNK_SPLIT_ENCODE still overrides. Added a
split-mode log line.

Validated live on the 4090: encode_us_p50 7.6 ms → 3.9 ms at 5K240 HDR with no env override.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-15 21:40:28 +00:00
parent b1e95a386f
commit b9f4cf1f3e
10 changed files with 1103 additions and 0 deletions
+14
View File
@@ -158,11 +158,25 @@ impl NvencD3d11Encoder {
}
Some("3") => nv::NV_ENC_SPLIT_ENCODE_MODE::NV_ENC_SPLIT_THREE_FORCED_MODE as u32,
Some("2") => nv::NV_ENC_SPLIT_ENCODE_MODE::NV_ENC_SPLIT_TWO_FORCED_MODE as u32,
// Main10 (10-bit / HDR): 2-way split is measurably SLOWER on Ada — at 5120x1440@240
// Main10, forced-2 took 7.6 ms/frame (~131 fps) vs 2.8 ms (~357 fps) single-engine
// (the split/merge overhead dominates for 10-bit). A single Ada NVENC engine already
// handles 5K@240 Main10 well under the 4.17 ms budget, so DON'T split — splitting was
// the "broken animations in HDR" (the stream capped at ~131 fps). Env still overrides.
_ if self.bit_depth >= 10 => {
nv::NV_ENC_SPLIT_ENCODE_MODE::NV_ENC_SPLIT_DISABLE_MODE as u32
}
_ if pixel_rate > 1_000_000_000 => {
nv::NV_ENC_SPLIT_ENCODE_MODE::NV_ENC_SPLIT_TWO_FORCED_MODE as u32
}
_ => nv::NV_ENC_SPLIT_ENCODE_MODE::NV_ENC_SPLIT_AUTO_MODE as u32,
};
tracing::info!(
split_mode,
bit_depth = self.bit_depth,
pixel_rate,
"NVENC split-encode mode (0=disable 1=auto-forced 2=two 3=three 4=auto)"
);
let enc = loop {
// 1. open the session bound to the D3D11 device.
let mut params = nv::NV_ENC_OPEN_ENCODE_SESSION_EX_PARAMS {