fix(host/windows): tiered DXGI recovery — cheap re-DuplicateOutput for the HDR ACCESS_LOST churn
apple / swift (push) Successful in 53s
ci / web (push) Successful in 28s
android / android (push) Successful in 1m46s
ci / docs-site (push) Successful in 30s
ci / bench (push) Successful in 1m49s
decky / build-publish (push) Successful in 11s
ci / rust (push) Successful in 1m4s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 5s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 3s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 4s
deb / build-publish (push) Successful in 3m24s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 5m17s
docker / deploy-docs (push) Successful in 17s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 4m56s
The HDR path produced a constant ACCESS_LOST churn during real desktop activity (window resize / Start menu / DWM transitions): the duplication keeps getting invalidated but the OUTPUT stays valid (the probe passes: 0 born-lost over 72 rebuilds). The old recovery did a FULL rebuild (new device + factory) on every loss, which re-inits NVENC, seeds black, and was throttled to 4x/s. The result was a mostly-frozen stream with constant re-init churn, i.e. the "broken animations".

Now recovery is tiered (mirrors Sunshine): try_reduplicate() does a fresh DuplicateOutput on the EXISTING device+output. No new device means NO encoder re-init and NO black seed; gpu_copy/HDR textures/last_present are kept, so frames resume immediately. Only a genuine output loss (secure-desktop switch) or a dead device (DEVICE_REMOVED/RESET) falls back to the full, throttled recreate_dupl. Both paths probe the new duplication and reject a born-lost one.

Validated synthetically (1080p60 + 5120x1440@240 HDR): pipeline stable, 0 churn, frames flow. The real-desktop churn needs live validation (can't synthesize DWM animations). Secure-desktop "UI never appears in-session" is a separate issue (output gone in-session; only a fresh monitor re-add works) and is still open.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
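The tiering described in the commit message can be modelled without any DXGI types. The following is only a sketch, not the code in this commit: `Loss`, `Recovery`, `Backend`, `FakeBackend` and `Recoverer` are hypothetical names introduced for illustration. Only the order of the two tiers, the device-dead check and the 250 ms gap between full rebuilds are taken from the change itself.

```rust
use std::time::{Duration, Instant};

/// Why capture was lost (hypothetical model of the three DXGI error codes).
#[derive(Clone, Copy, Debug, PartialEq)]
enum Loss {
    AccessLost,
    DeviceRemoved,
    DeviceReset,
}

/// Which tier handled one loss event.
#[derive(Debug, PartialEq)]
enum Recovery {
    Reduplicated,  // cheap tier: same device + output, new duplication
    Rebuilt,       // full tier: new device + duplication
    RebuildFailed, // full tier ran but the output is still gone
    Throttled,     // full tier not due yet: caller repeats the last frame
}

/// Hypothetical stand-in for the capturer's two recovery routines.
trait Backend {
    fn try_reduplicate(&mut self) -> bool;
    fn recreate(&mut self) -> bool;
}

/// Test double: both routines succeed only while the output is live.
struct FakeBackend {
    output_live: bool,
    reduplicate_calls: u32,
    rebuild_calls: u32,
}

impl Backend for FakeBackend {
    fn try_reduplicate(&mut self) -> bool {
        self.reduplicate_calls += 1;
        self.output_live
    }
    fn recreate(&mut self) -> bool {
        self.rebuild_calls += 1;
        self.output_live
    }
}

struct Recoverer {
    last_rebuild: Option<Instant>,
    min_gap: Duration,
}

impl Recoverer {
    fn recover(&mut self, loss: Loss, now: Instant, backend: &mut impl Backend) -> Recovery {
        // A removed/reset device cannot be re-duplicated, so skip the cheap tier.
        let device_dead = matches!(loss, Loss::DeviceRemoved | Loss::DeviceReset);
        if !device_dead && backend.try_reduplicate() {
            return Recovery::Reduplicated;
        }
        // Full rebuild, at most once per `min_gap`.
        let due = self
            .last_rebuild
            .map_or(true, |t| now.duration_since(t) >= self.min_gap);
        if !due {
            return Recovery::Throttled;
        }
        self.last_rebuild = Some(now);
        if backend.recreate() {
            Recovery::Rebuilt
        } else {
            Recovery::RebuildFailed
        }
    }
}
```

In this model the cheap tier is never throttled, which matches the intent stated above: an ACCESS_LOST on a live output should cost one re-duplicate, while a long secure-desktop dwell costs at most one rebuild attempt per gap.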
@@ -1207,6 +1207,37 @@ impl DuplCapturer {
         Ok(())
     }
 
+    /// CHEAP recovery for the ACCESS_LOST *churn*: re-`DuplicateOutput` on the EXISTING device +
+    /// output. No new device/factory, so the encoder is NOT re-initialized and no black is seeded —
+    /// the existing `gpu_copy`/HDR textures/`last_present` are kept and frames resume immediately. This
+    /// is the right recovery for the HDR overlay-flip churn (the duplication is invalidated but the
+    /// output is still live). Returns false when the output can't be re-duplicated (desktop switch /
+    /// output gone) so the caller falls back to the full [`recreate_dupl`]. Probes the new duplication
+    /// (like recreate_dupl) so a born-lost one is rejected rather than adopted.
+    unsafe fn try_reduplicate(&mut self) -> bool {
+        if self.holding_frame {
+            let _ = self.dupl.ReleaseFrame();
+            self.holding_frame = false;
+        }
+        let dupl = match self.output.DuplicateOutput(&self.device) {
+            Ok(d) => d,
+            Err(_) => return false,
+        };
+        // Short probe (hot path): a born-lost duplication returns ACCESS_LOST immediately regardless
+        // of the timeout; only the alive-but-idle case waits the full 16ms, and idle = nothing moving.
+        let mut info = DXGI_OUTDUPL_FRAME_INFO::default();
+        let mut res: Option<IDXGIResource> = None;
+        match dupl.AcquireNextFrame(16, &mut info, &mut res) {
+            Ok(()) => {
+                let _ = dupl.ReleaseFrame();
+            }
+            Err(e) if e.code() == DXGI_ERROR_WAIT_TIMEOUT => {}
+            Err(_) => return false, // born-lost on the same output → need the full rebuild
+        }
+        self.dupl = dupl;
+        true
+    }
+
     /// ONE rebuild attempt — deliberately non-blocking. ACCESS_LOST fires on desktop switches
     /// (normal ↔ Winlogon secure: lock/login/UAC) and on the mode change we issue at create. We
     /// re-attach to the now-current input desktop and recreate the D3D11 device + duplication on it
@@ -1349,25 +1380,36 @@ impl DuplCapturer {
                     || e.code() == DXGI_ERROR_DEVICE_RESET =>
             {
                 self.dbg_lost += 1;
-                // THROTTLED, NON-BLOCKING recovery. During a secure-desktop dwell the SudoVDA output
-                // is gone, so a rebuild fails for the whole visit. We must NOT block retrying (that
-                // starves the encode/send loop → the client times out → disconnect — the bug). Try a
-                // rebuild at most ~4×/s; between attempts return "no new frame" so next_frame repeats
-                // the last good frame, keeping the client fed (frozen) until the desktop returns. A
-                // brief sleep on the throttled path avoids busy-spinning on the dead duplication.
+                // TIERED recovery. The HDR path produces a constant ACCESS_LOST *churn*: the
+                // duplication keeps getting invalidated (overlay/MPO flips that HDR makes aggressive)
+                // but the OUTPUT stays valid — a probe passes, the dup lives briefly, dies, repeats.
+                // For that, the cheap fix is a fresh DuplicateOutput on the SAME device+output: no new
+                // device/factory → NO encoder re-init, NO black seed → frames stay near-continuous
+                // (this is what makes HDR animations smooth). Only a genuine output loss (secure-desktop
+                // switch, where DISPLAY10 is gone) or a dead device needs the full rebuild — and THAT
+                // is throttled so a long secure dwell doesn't hammer DuplicateOutput / starve the
+                // client (between attempts we repeat the last frame).
+                let device_dead =
+                    e.code() == DXGI_ERROR_DEVICE_REMOVED || e.code() == DXGI_ERROR_DEVICE_RESET;
+                if self.dbg_lost % 64 == 1 {
+                    tracing::warn!(
+                        lost = self.dbg_lost,
+                        code = format!("{:#x}", e.code().0),
+                        "DXGI capture lost — recovering (cheap re-duplicate, full rebuild if output gone)"
+                    );
+                }
+                if !device_dead && self.try_reduplicate() {
+                    // Cheap recovery succeeded; the next acquire gets frames on the same device.
+                    self.first_frame = true;
+                    return Ok(None);
+                }
+                // Output gone / device dead → full rebuild (new device), throttled.
                 let now = Instant::now();
                 let due = self.last_rebuild.map_or(true, |t| {
                     now.duration_since(t) >= Duration::from_millis(250)
                 });
                 if due {
                     self.last_rebuild = Some(now);
-                    if self.dbg_lost % 8 == 1 {
-                        tracing::warn!(
-                            lost = self.dbg_lost,
-                            code = format!("{:#x}", e.code().0),
-                            "DXGI capture lost (desktop switch?) — repeating last frame, retrying rebuild"
-                        );
-                    }
                     if self.recreate_dupl().is_ok() {
                         self.first_frame = true;
                     }
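The born-lost probe used by try_reduplicate comes down to a three-way verdict on a single short acquire. The sketch below restates that rule with plain Rust types so it can be read and tested off Windows. `Probe` and `adopt_if_alive` are hypothetical names, and the mapping from real DXGI results to `Probe` values is assumed rather than shown.

```rust
/// Hypothetical outcome of one short acquire on a freshly created duplication.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Probe {
    /// A frame arrived within the timeout (the caller releases it again).
    Frame,
    /// Nothing moved on screen within the timeout: alive but idle.
    Timeout,
    /// Any other failure, e.g. an immediate ACCESS_LOST: born-lost.
    Failed,
}

/// Returns the candidate only if the probe shows it is usable.
/// `None` tells the caller to fall back to the full rebuild.
fn adopt_if_alive<D>(candidate: D, probe: impl FnOnce(&D) -> Probe) -> Option<D> {
    match probe(&candidate) {
        // Both outcomes mean the duplication is attached to a live output.
        Probe::Frame | Probe::Timeout => Some(candidate),
        // Born-lost: drop the candidate instead of adopting it.
        Probe::Failed => None,
    }
}
```

Treating a timeout as success is what keeps the probe cheap on the hot path: per the comment in the diff, only the alive-but-idle case waits out the full 16 ms, and in that case no frame is being missed.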