refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes
apple / swift (push) Failing after 40s
audit / cargo-audit (push) Failing after 1m12s
windows-msix / package (push) Successful in 1m37s
windows / build (push) Successful in 1m14s
android / android (push) Successful in 4m48s
ci / web (push) Successful in 27s
ci / rust (push) Successful in 4m21s
ci / docs-site (push) Successful in 31s
ci / bench (push) Successful in 4m39s
decky / build-publish (push) Successful in 11s
docker / build-push (--build-arg FEDORA_VERSION=44, ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora44-rpm) (push) Successful in 5s
docker / build-push (., web/Dockerfile, punktfunk-web) (push) Successful in 4s
docker / build-push (ci, ci/fedora-rpm.Dockerfile, punktfunk-fedora-rpm) (push) Successful in 4s
docker / build-push (ci, ci/rust-ci.Dockerfile, punktfunk-rust-ci) (push) Successful in 4s
docker / build-push (docs-site, docs-site/Dockerfile, punktfunk-docs) (push) Successful in 19s
deb / build-publish (push) Successful in 6m3s
flatpak / build-publish (push) Successful in 4m13s
rpm / build-publish (bazzite, punktfunk-fedora-rpm) (push) Successful in 8m15s
rpm / build-publish (fedora-44, punktfunk-fedora44-rpm) (push) Successful in 8m16s
docker / deploy-docs (push) Successful in 18s

Two bodies of work in one commit (the rename moved files the fixes also touched).

Naming/structure cleanup (pre-launch):
- Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host,
  m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source->
  Punktfunk1Options/Punktfunk1Source.
- Clients consolidated out of crates/ into clients/: punktfunk-client-rs->
  clients/probe (crate punktfunk-probe), client-linux->clients/linux,
  client-windows->clients/windows, punktfunk-android->clients/android/native
  (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI
  contract is unchanged). crates/ now holds only core + host.
- Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site,
  kept only in docs/implementation-plan.md. docs/m2-plan.md->
  docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated.

Client loss-recovery (video froze and never recovered after a brief drop):
- Export punktfunk_connection_frames_dropped through the C ABI (the core already
  tracked it for the client keyframe-recovery loop; it was never reachable from
  the ABI clients). Regenerated punktfunk_core.h.
- Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll
  frames_dropped and request a keyframe when it climbs -- the same loss-driven
  recovery Linux/Windows already had. Under infinite GOP the decoder silently
  conceals reference-missing frames, so the decode-error trigger rarely fires.

Apple rumble robustness (worked then went spotty -- DualSense + Xbox):
- Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio
  interruption / server reset) and drop the permanent `broken` latch on a
  transient drive failure; latch only when the controller truly has no haptics.
- Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging.

Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift.
Not runnable on this box (verify in CI): Gitea workflows, gradle/Android,
flatpak, Swift/decky.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-18 21:03:55 +00:00
parent 1faa6c6ad4
commit 9c8fa9340c
110 changed files with 534 additions and 341 deletions
@@ -58,7 +58,13 @@ private final class RumbleRenderer: @unchecked Sendable {
private var controller: GCController?
private var low: Motor?
private var high: Motor?
// `broken` latches OFF only for a controller that genuinely has no haptics engine (an Xbox pad
// on an OS that doesn't expose rumble through GameController, a Siri Remote) nothing to retry
// until the controller changes. A transient engine failure does NOT latch it; it tears down for
// a lazy rebuild instead, so a single hiccup can't kill rumble for the whole session.
private var broken = false
/// Last logged active/silent state for a one-line transition log, not per-event spam.
private var wasActive = false
func retarget(_ c: GCController?) {
queue.async {
@@ -70,8 +76,14 @@ private final class RumbleRenderer: @unchecked Sendable {
func apply(low lowAmp: UInt16, high highAmp: UInt16) {
queue.async {
let active = lowAmp != 0 || highAmp != 0
if active != self.wasActive {
self.wasActive = active
log.debug(
"rumble: \(active ? "active" : "stop", privacy: .public) low=\(lowAmp, privacy: .public) high=\(highAmp, privacy: .public)")
}
guard !self.broken else { return }
if (lowAmp != 0 || highAmp != 0), self.low == nil, self.high == nil {
if active, self.low == nil, self.high == nil {
self.setup()
}
if self.high != nil {
@@ -92,7 +104,15 @@ private final class RumbleRenderer: @unchecked Sendable {
/// high = right/light the Xbox/XInput convention the wire carries); one combined
/// engine otherwise, driven by whichever amplitude is stronger.
private func setup() {
guard let haptics = controller?.haptics else { return }
guard let haptics = controller?.haptics else {
// No haptics engine at all an Xbox controller on an OS/firmware that doesn't expose
// rumble through GameController (works on Android via the standard Vibrator path, but
// Apple's support is controller/OS-dependent), or a Siri Remote. Nothing to retry until
// the controller changes; latch off (retarget clears it) and say so once.
log.info("rumble: active controller exposes no haptics engine — rumble unavailable")
broken = true
return
}
let localities = haptics.supportedLocalities
if localities.contains(.leftHandle), localities.contains(.rightHandle) {
low = makeMotor(haptics, .leftHandle)
@@ -100,13 +120,28 @@ private final class RumbleRenderer: @unchecked Sendable {
} else {
low = makeMotor(haptics, .default)
}
if low == nil && high == nil {
broken = true // no usable engine (e.g. Siri Remote) stay silent
if low == nil, high == nil {
// Haptics present but no engine could be built right now (server busy / a transient
// error). Do NOT latch broken the next nonzero amplitude retries setup().
log.warning("rumble: haptics present but engine setup failed — will retry on next rumble")
}
}
private func makeMotor(_ haptics: GCDeviceHaptics, _ locality: GCHapticsLocality) -> Motor? {
guard let engine = haptics.createEngine(withLocality: locality) else { return nil }
// The haptic server can stop or reset the engine out from under us app backgrounding, an
// audio-session interruption (a call, Siri, another audio app), or a server crash. Left
// unhandled the players go dead and every later rumble throws, latching rumble off for the
// rest of the session (the "rumble worked, then went spotty" failure). Tear down on the
// serial queue so the next nonzero amplitude lazily rebuilds the engine, instead.
engine.stoppedHandler = { [weak self] reason in
log.info("rumble: haptic engine stopped (reason \(reason.rawValue, privacy: .public)) — will rebuild")
self?.queue.async { self?.teardown() }
}
engine.resetHandler = { [weak self] in
log.info("rumble: haptic engine reset — will rebuild")
self?.queue.async { self?.teardown() }
}
do {
try engine.start()
let event = CHHapticEvent(
@@ -141,14 +176,19 @@ private final class RumbleRenderer: @unchecked Sendable {
}
motor = m
} catch {
log.warning("haptic update failed — rumble disabled: \(error, privacy: .public)")
// A transient failure (the engine stopped/reset between its handler firing and now).
// Tear down so the next nonzero amplitude rebuilds do NOT latch rumble off for the
// session (that was the old "spotty" behaviour).
log.warning("rumble: haptic update failed — rebuilding: \(error, privacy: .public)")
teardown()
broken = true
}
}
private func teardown() {
for m in [low, high].compactMap({ $0 }) {
// Drop the handlers before stopping so stop() can't re-enter teardown via stoppedHandler.
m.engine.stoppedHandler = nil
m.engine.resetHandler = nil
try? m.player.stop(atTime: CHHapticTimeImmediate)
m.engine.stop()
}
@@ -362,6 +362,21 @@ public final class PunktfunkConnection {
_ = punktfunk_connection_request_keyframe(h)
}
/// Cumulative access units the hostclient reassembler dropped as unrecoverable (FEC couldn't
/// rebuild them). The video pump polls this and calls `requestKeyframe()` when it climbs the
/// correct loss trigger under the host's infinite GOP, where unrecoverable loss yields
/// reference-missing delta frames the decoder *silently conceals* (a frozen / garbage picture,
/// no decode error and no `.failed` layer), so a decode-error trigger rarely fires. Monotonic
/// for the session; 0 after close. Cheap (an atomic load) safe to poll every pump iteration.
public func framesDropped() -> UInt64 {
abiLock.lock()
defer { abiLock.unlock() }
guard let h = handle, !closeRequested else { return 0 }
var out: UInt64 = 0
_ = punktfunk_connection_frames_dropped(h, &out)
return out
}
/// The currently active session mode (updated by accepted `requestMode` switches).
public func currentMode() -> (width: UInt32, height: UInt32, refreshHz: UInt32) {
abiLock.lock()
@@ -113,8 +113,21 @@ public final class Stage2Pipeline {
let recovery = recovery
let thread = Thread {
var format: CMVideoFormatDescription?
var lastFramesDropped = connection.framesDropped()
while token.isLive {
do {
// Loss recovery (the primary recovery path). The reassembler drops unrecoverable
// AUs (framesDropped) and the decoder then conceals the reference-missing delta
// frames that follow often rendering them WITHOUT an error callback so the
// onDecodeError trigger rarely fires after a real network blip. Ask the host for
// a fresh IDR whenever the drop count climbs (throttled in KeyframeRecovery).
// Polled every iteration so a total-loss drought recovers the moment packets
// resume and the reassembler counts the gap.
let dropped = connection.framesDropped()
if dropped > lastFramesDropped {
lastFramesDropped = dropped
recovery.request()
}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
@@ -46,27 +46,44 @@ final class StreamPump {
let thread = Thread {
var format: CMVideoFormatDescription?
var lastKeyframeRequest = Date.distantPast
var lastFramesDropped = connection.framesDropped()
// Coalesced host keyframe request: the decode stays wedged for several frames until
// the IDR lands, so requesting on every frame would flood the control stream.
func requestKeyframeThrottled() {
let now = Date()
if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
connection.requestKeyframe()
lastKeyframeRequest = now
}
}
while token.isLive {
do {
// Loss recovery (the primary recovery path). Under the host's infinite GOP the
// only recovery keyframe is one we request. The reassembler drops unrecoverable
// AUs (framesDropped); the decoder then *conceals* the reference-missing delta
// frames that follow a frozen / garbage picture, WITHOUT flipping the layer to
// .failed so the .failed check below rarely fires after a real network blip.
// Ask the host for a fresh IDR whenever the drop count climbs. Polled every
// iteration (not just per AU) so a total-loss drought still recovers the moment
// packets resume and the reassembler counts the gap.
let dropped = connection.framesDropped()
if dropped > lastFramesDropped {
lastFramesDropped = dropped
requestKeyframeThrottled()
}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
format = f // refreshed on every IDR (mode changes included)
}
if layer.status == .failed {
// Decode wedged: flush and re-gate on the next in-band parameter sets
// (resuming with a delta frame can't recover), AND ask the host for a
// fresh IDR. With the host's infinite GOP the next keyframe could be
// far off, so without the request the picture stays frozen the
// intermittent first-connect freeze. Throttled: the layer stays .failed
// across several polls until the IDR lands, and one request suffices.
// Decode wedged hard (the cold-first-connect case a lost/corrupt opening
// IDR): flush and re-gate on the next in-band parameter sets (resuming with
// a delta frame can't recover), AND ask the host for a fresh IDR. Throttled:
// the layer stays .failed across several polls until the IDR lands.
layer.flush()
format = AnnexB.formatDescription(fromIDR: au.data)
let now = Date()
if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
connection.requestKeyframe()
lastKeyframeRequest = now
}
requestKeyframeThrottled()
}
guard let f = format,
let sample = AnnexB.sampleBuffer(au: au, format: f),