refactor: drop milestone names + consolidate clients; loss-recovery & rumble fixes

Two bodies of work in one commit (the rename moved files the fixes also touched).

Naming/structure cleanup (pre-launch):
- Host modules m3.rs->punktfunk1.rs, m0.rs->spike.rs; CLI m3-host->punktfunk1-host,
  m0->spike; bare `punktfunk-host` now prints help. Types M3Options/M3Source->
  Punktfunk1Options/Punktfunk1Source.
- Clients consolidated out of crates/ into clients/: punktfunk-client-rs->
  clients/probe (crate punktfunk-probe), client-linux->clients/linux,
  client-windows->clients/windows, punktfunk-android->clients/android/native
  (crate punktfunk-client-android; kept [lib] name=punktfunk_android so the JNI
  contract is unchanged). crates/ now holds only core + host.
- Milestone codes M0-M4 purged from code/CLI/CLAUDE.md/README/docs/docs-site,
  kept only in docs/implementation-plan.md. docs/m2-plan.md->
  docs/gamestream-host-plan.md. CI/gradle/flatpak paths updated.

Client loss-recovery (video froze and never recovered after a brief drop):
- Export punktfunk_connection_frames_dropped through the C ABI (the core already
  tracked it for the client keyframe-recovery loop; it was never reachable from
  the ABI clients). Regenerated punktfunk_core.h.
- Apple (StreamPump + Stage2Pipeline) and Android (decode.rs) now poll
  frames_dropped and request a keyframe when it climbs -- the same loss-driven
  recovery Linux/Windows already had. Under infinite GOP the decoder silently
  conceals reference-missing frames, so the decode-error trigger rarely fires.
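
The loss-driven trigger described above can be sketched in Rust. This is a minimal model, not the code from this commit: `LossRecovery`, `poll`, and `min_interval` are hypothetical names, and the 250 ms interval used in the usage note is borrowed from the Swift `StreamPump` throttle shown in the diff below.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not from the commit): turns a monotonic
/// frames-dropped counter into throttled keyframe requests.
pub struct LossRecovery {
    last_dropped: u64,
    last_request: Option<Instant>,
    min_interval: Duration,
}

impl LossRecovery {
    /// `initial_dropped` is the counter value sampled before the pump loop starts.
    pub fn new(initial_dropped: u64, min_interval: Duration) -> Self {
        Self { last_dropped: initial_dropped, last_request: None, min_interval }
    }

    /// Feed the latest cumulative drop count. Returns true when the caller
    /// should ask the host for a keyframe now.
    pub fn poll(&mut self, dropped: u64, now: Instant) -> bool {
        if dropped <= self.last_dropped {
            return false; // no new unrecoverable loss since the last poll
        }
        self.last_dropped = dropped;
        // Coalesce: one request per interval is enough, because the picture
        // stays wedged for several frames until the IDR lands.
        let due = match self.last_request {
            None => true,
            Some(t) => now.duration_since(t) > self.min_interval,
        };
        if due {
            self.last_request = Some(now);
        }
        due
    }
}
```

A pump loop would call `poll(connection_frames_dropped, Instant::now())` once per iteration with `min_interval` set to something like `Duration::from_millis(250)`, and issue the keyframe request whenever it returns true. As in the Swift version, a rise that lands inside the throttle window is consumed without a second request.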

Apple rumble robustness (worked then went spotty -- DualSense + Xbox):
- Add CHHapticEngine stopped/reset handlers (rebuild on app background / audio
  interruption / server reset) and drop the permanent `broken` latch on a
  transient drive failure; latch only when the controller truly has no haptics.
- Surface swallowed SDL set_rumble errors on Linux/Windows + diagnostic logging.
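
The latch rule in the first bullet can be modelled as a small state machine. The sketch below is platform-neutral Rust rather than the Swift `RumbleRenderer` from the diff; `RumbleState`, `Probe`, and the method names are hypothetical and exist only to illustrate when `broken` latches and when it does not.

```rust
/// Hypothetical outcome of trying to build a haptics engine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Probe {
    NoHaptics,   // controller exposes no haptics at all: permanent
    SetupFailed, // haptics present, engine could not start right now: transient
    Ready,       // engine built and started
}

/// Hypothetical model of the renderer's latch and engine state.
#[derive(Debug, Default)]
pub struct RumbleState {
    broken: bool,
    engine_up: bool,
}

impl RumbleState {
    pub fn is_broken(&self) -> bool {
        self.broken
    }

    /// Apply one rumble update. `probe` is what a setup attempt would yield
    /// if one is needed. Returns true when a live engine received the update.
    pub fn apply(&mut self, active: bool, probe: Probe) -> bool {
        if self.broken {
            return false;
        }
        if active && !self.engine_up {
            match probe {
                // Only a controller without haptics latches rumble off.
                Probe::NoHaptics => {
                    self.broken = true;
                    return false;
                }
                // Transient failure: stay unlatched so the next update retries.
                Probe::SetupFailed => return false,
                Probe::Ready => self.engine_up = true,
            }
        }
        self.engine_up
    }

    /// Engine stopped or reset underneath us: tear down, rebuild lazily.
    pub fn on_engine_stopped(&mut self) {
        self.engine_up = false;
    }

    /// A new controller was selected: clear the latch and start over.
    pub fn retarget(&mut self) {
        self.broken = false;
        self.engine_up = false;
    }
}
```

The property of interest is that `on_engine_stopped` and `Probe::SetupFailed` never set `broken`, so a single interruption costs at most the updates that arrive before the next successful rebuild.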

Verified: cargo build/clippy/fmt --workspace, C-ABI harness, header drift.
Not runnable on this box (verify in CI): Gitea workflows, gradle/Android,
flatpak, Swift/decky.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 21:03:55 +00:00
parent 1faa6c6ad4
commit 9c8fa9340c
110 changed files with 534 additions and 341 deletions
+4 -4
@@ -20,8 +20,8 @@ full session: video AUs, **Opus audio** (`nextAudio()`), **rumble** (`nextRumble
**DualSense feedback** (`nextHidOutput()` — lightbar, player LEDs, adaptive-trigger
effects), input incl. gamepads + DualSense touchpad/motion (`sendTouchpad`/`sendMotion`),
and **cert pinning + TOFU** (`pinSHA256:`/`hostFingerprint`) — see
`m3.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`punktfunk-host m3-host`) is a persistent listener:
`punktfunk1.rs::tests::c_abi_connection_roundtrip` (three sequential sessions: TOFU, pinned
reconnect, wrong-pin rejection). The host (`punktfunk-host punktfunk1-host`) is a persistent listener:
reconnect at will during development.
What's here, all compiled and tested on macOS (Xcode 26.5 / Swift 6.3):
@@ -127,10 +127,10 @@ bash test-loopback.sh # full loopback proof: builds punktfunk
# (synthetic source — runs on macOS), streams
# byte-verified frames into the Swift client
# against the real host (Linux box, see CLAUDE.md "Running on this box") — m3-host is a
# against the real host (Linux box, see CLAUDE.md "Running on this box") — punktfunk1-host is a
# persistent listener, reconnect at will:
# PUNKTFUNK_COMPOSITOR=gamescope PUNKTFUNK_GAMESCOPE_APP=vkcube PUNKTFUNK_ZEROCOPY=1 \
# cargo run -rp punktfunk-host -- m3-host --source virtual --seconds 60
# cargo run -rp punktfunk-host -- punktfunk1-host --source virtual --seconds 60
PUNKTFUNK_REMOTE_HOST=<box-ip> swift test --filter RemoteFirstLightTests # headless
# (+ PUNKTFUNK_REMOTE_PORT / PUNKTFUNK_REMOTE_COMPOSITOR=gamescope|kwin|… /
# PUNKTFUNK_REMOTE_PIN=<arming-pin> for the remote pairing test)
@@ -58,7 +58,13 @@ private final class RumbleRenderer: @unchecked Sendable {
private var controller: GCController?
private var low: Motor?
private var high: Motor?
// `broken` latches OFF only for a controller that genuinely has no haptics engine (an Xbox pad
// on an OS that doesn't expose rumble through GameController, a Siri Remote): nothing to retry
// until the controller changes. A transient engine failure does NOT latch it; it tears down for
// a lazy rebuild instead, so a single hiccup can't kill rumble for the whole session.
private var broken = false
/// Last logged active/silent state for a one-line transition log, not per-event spam.
private var wasActive = false
func retarget(_ c: GCController?) {
queue.async {
@@ -70,8 +76,14 @@ private final class RumbleRenderer: @unchecked Sendable {
func apply(low lowAmp: UInt16, high highAmp: UInt16) {
queue.async {
let active = lowAmp != 0 || highAmp != 0
if active != self.wasActive {
self.wasActive = active
log.debug(
"rumble: \(active ? "active" : "stop", privacy: .public) low=\(lowAmp, privacy: .public) high=\(highAmp, privacy: .public)")
}
guard !self.broken else { return }
if (lowAmp != 0 || highAmp != 0), self.low == nil, self.high == nil {
if active, self.low == nil, self.high == nil {
self.setup()
}
if self.high != nil {
@@ -92,7 +104,15 @@ private final class RumbleRenderer: @unchecked Sendable {
/// high = right/light, the Xbox/XInput convention the wire carries); one combined
/// engine otherwise, driven by whichever amplitude is stronger.
private func setup() {
guard let haptics = controller?.haptics else { return }
guard let haptics = controller?.haptics else {
// No haptics engine at all: an Xbox controller on an OS/firmware that doesn't expose
// rumble through GameController (works on Android via the standard Vibrator path, but
// Apple's support is controller/OS-dependent), or a Siri Remote. Nothing to retry until
// the controller changes; latch off (retarget clears it) and say so once.
log.info("rumble: active controller exposes no haptics engine — rumble unavailable")
broken = true
return
}
let localities = haptics.supportedLocalities
if localities.contains(.leftHandle), localities.contains(.rightHandle) {
low = makeMotor(haptics, .leftHandle)
@@ -100,13 +120,28 @@ private final class RumbleRenderer: @unchecked Sendable {
} else {
low = makeMotor(haptics, .default)
}
if low == nil && high == nil {
broken = true // no usable engine (e.g. Siri Remote) stay silent
if low == nil, high == nil {
// Haptics present but no engine could be built right now (server busy / a transient
// error). Do NOT latch broken; the next nonzero amplitude retries setup().
log.warning("rumble: haptics present but engine setup failed — will retry on next rumble")
}
}
private func makeMotor(_ haptics: GCDeviceHaptics, _ locality: GCHapticsLocality) -> Motor? {
guard let engine = haptics.createEngine(withLocality: locality) else { return nil }
// The haptic server can stop or reset the engine out from under us: app backgrounding, an
// audio-session interruption (a call, Siri, another audio app), or a server crash. Left
// unhandled the players go dead and every later rumble throws, latching rumble off for the
// rest of the session (the "rumble worked, then went spotty" failure). Tear down on the
// serial queue instead, so the next nonzero amplitude lazily rebuilds the engine.
engine.stoppedHandler = { [weak self] reason in
log.info("rumble: haptic engine stopped (reason \(reason.rawValue, privacy: .public)) — will rebuild")
self?.queue.async { self?.teardown() }
}
engine.resetHandler = { [weak self] in
log.info("rumble: haptic engine reset — will rebuild")
self?.queue.async { self?.teardown() }
}
do {
try engine.start()
let event = CHHapticEvent(
@@ -141,14 +176,19 @@ private final class RumbleRenderer: @unchecked Sendable {
}
motor = m
} catch {
log.warning("haptic update failed — rumble disabled: \(error, privacy: .public)")
// A transient failure (the engine stopped/reset between its handler firing and now).
// Tear down so the next nonzero amplitude rebuilds; do NOT latch rumble off for the
// session (that was the old "spotty" behaviour).
log.warning("rumble: haptic update failed — rebuilding: \(error, privacy: .public)")
teardown()
broken = true
}
}
private func teardown() {
for m in [low, high].compactMap({ $0 }) {
// Drop the handlers before stopping so stop() can't re-enter teardown via stoppedHandler.
m.engine.stoppedHandler = nil
m.engine.resetHandler = nil
try? m.player.stop(atTime: CHHapticTimeImmediate)
m.engine.stop()
}
@@ -362,6 +362,21 @@ public final class PunktfunkConnection {
_ = punktfunk_connection_request_keyframe(h)
}
/// Cumulative access units the host→client reassembler dropped as unrecoverable (FEC couldn't
/// rebuild them). The video pump polls this and calls `requestKeyframe()` when it climbs: the
/// correct loss trigger under the host's infinite GOP, where unrecoverable loss yields
/// reference-missing delta frames the decoder *silently conceals* (a frozen / garbage picture,
/// no decode error and no `.failed` layer), so a decode-error trigger rarely fires. Monotonic
/// for the session; 0 after close. Cheap (an atomic load); safe to poll every pump iteration.
public func framesDropped() -> UInt64 {
abiLock.lock()
defer { abiLock.unlock() }
guard let h = handle, !closeRequested else { return 0 }
var out: UInt64 = 0
_ = punktfunk_connection_frames_dropped(h, &out)
return out
}
/// The currently active session mode (updated by accepted `requestMode` switches).
public func currentMode() -> (width: UInt32, height: UInt32, refreshHz: UInt32) {
abiLock.lock()
@@ -113,8 +113,21 @@ public final class Stage2Pipeline {
let recovery = recovery
let thread = Thread {
var format: CMVideoFormatDescription?
var lastFramesDropped = connection.framesDropped()
while token.isLive {
do {
// Loss recovery (the primary recovery path). The reassembler drops unrecoverable
// AUs (framesDropped) and the decoder then conceals the reference-missing delta
// frames that follow, often rendering them WITHOUT an error callback, so the
// onDecodeError trigger rarely fires after a real network blip. Ask the host for
// a fresh IDR whenever the drop count climbs (throttled in KeyframeRecovery).
// Polled every iteration so a total-loss drought recovers the moment packets
// resume and the reassembler counts the gap.
let dropped = connection.framesDropped()
if dropped > lastFramesDropped {
lastFramesDropped = dropped
recovery.request()
}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
@@ -46,27 +46,44 @@ final class StreamPump {
let thread = Thread {
var format: CMVideoFormatDescription?
var lastKeyframeRequest = Date.distantPast
var lastFramesDropped = connection.framesDropped()
// Coalesced host keyframe request: the decode stays wedged for several frames until
// the IDR lands, so requesting on every frame would flood the control stream.
func requestKeyframeThrottled() {
let now = Date()
if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
connection.requestKeyframe()
lastKeyframeRequest = now
}
}
while token.isLive {
do {
// Loss recovery (the primary recovery path). Under the host's infinite GOP the
// only recovery keyframe is one we request. The reassembler drops unrecoverable
// AUs (framesDropped); the decoder then *conceals* the reference-missing delta
// frames that follow (a frozen / garbage picture) WITHOUT flipping the layer to
// .failed, so the .failed check below rarely fires after a real network blip.
// Ask the host for a fresh IDR whenever the drop count climbs. Polled every
// iteration (not just per AU) so a total-loss drought still recovers the moment
// packets resume and the reassembler counts the gap.
let dropped = connection.framesDropped()
if dropped > lastFramesDropped {
lastFramesDropped = dropped
requestKeyframeThrottled()
}
guard let au = try connection.nextAU(timeoutMs: 100) else { continue }
onFrame?(au)
if let f = AnnexB.formatDescription(fromIDR: au.data) {
format = f // refreshed on every IDR (mode changes included)
}
if layer.status == .failed {
// Decode wedged: flush and re-gate on the next in-band parameter sets
// (resuming with a delta frame can't recover), AND ask the host for a
// fresh IDR. With the host's infinite GOP the next keyframe could be
// far off, so without the request the picture stays frozen: the
// intermittent first-connect freeze. Throttled: the layer stays .failed
// across several polls until the IDR lands, and one request suffices.
// Decode wedged hard (the cold-first-connect case: a lost/corrupt opening
// IDR): flush and re-gate on the next in-band parameter sets (resuming with
// a delta frame can't recover), AND ask the host for a fresh IDR. Throttled:
// the layer stays .failed across several polls until the IDR lands.
layer.flush()
format = AnnexB.formatDescription(fromIDR: au.data)
let now = Date()
if now.timeIntervalSince(lastKeyframeRequest) > 0.25 {
connection.requestKeyframe()
lastKeyframeRequest = now
}
requestKeyframeThrottled()
}
guard let f = format,
let sample = AnnexB.sampleBuffer(au: au, format: f),
@@ -1,7 +1,7 @@
// Integration: the Swift wrapper against a real punktfunk/1 host over QUIC + UDP on loopback,
// the Swift twin of punktfunk-host's m3.rs::c_abi_connection_roundtrip, this time through the
// statically linked xcframework. Driven by clients/apple/test-loopback.sh, which builds and
// starts `punktfunk-host m3-host --source synthetic` and sets PUNKTFUNK_LOOPBACK_PORT.
// starts `punktfunk-host punktfunk1-host --source synthetic` and sets PUNKTFUNK_LOOPBACK_PORT.
import XCTest
@testable import PunktfunkKit
@@ -11,7 +11,7 @@ final class LoopbackIntegrationTests: XCTestCase {
guard let portStr = ProcessInfo.processInfo.environment["PUNKTFUNK_LOOPBACK_PORT"],
let port = UInt16(portStr)
else {
throw XCTSkip("needs a running m3-host — use clients/apple/test-loopback.sh")
throw XCTSkip("needs a running punktfunk1-host — use clients/apple/test-loopback.sh")
}
let conn = try PunktfunkConnection(
@@ -139,7 +139,7 @@ final class LoopbackIntegrationTests: XCTestCase {
guard let portStr = env["PUNKTFUNK_PAIRING_PORT"], let port = UInt16(portStr),
let pin = env["PUNKTFUNK_PAIRING_PIN"]
else {
throw XCTSkip("needs an armed m3-host — use clients/apple/test-loopback.sh")
throw XCTSkip("needs an armed punktfunk1-host — use clients/apple/test-loopback.sh")
}
let identity = try generateIdentity()
@@ -5,7 +5,7 @@
//
// Run (host side, on the Linux box):
// PUNKTFUNK_COMPOSITOR=gamescope PUNKTFUNK_GAMESCOPE_APP=vkcube PUNKTFUNK_ZEROCOPY=1 \
// punktfunk-host m3-host --source virtual --seconds 120
// punktfunk-host punktfunk1-host --source virtual --seconds 120
// Then here:
// PUNKTFUNK_REMOTE_HOST=192.168.1.70 swift test --filter RemoteFirstLightTests
@@ -54,7 +54,7 @@ final class RemoteFirstLightTests: XCTestCase {
func testRemoteAudioBothDirections() throws {
let env = ProcessInfo.processInfo.environment
guard let host = env["PUNKTFUNK_REMOTE_HOST"] else {
throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start m3-host --source virtual there)")
throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start punktfunk1-host --source virtual there)")
}
let port = env["PUNKTFUNK_REMOTE_PORT"].flatMap(UInt16.init) ?? 9777
@@ -106,7 +106,7 @@ final class RemoteFirstLightTests: XCTestCase {
func testRemoteStreamDecodesToPixels() throws {
let env = ProcessInfo.processInfo.environment
guard let host = env["PUNKTFUNK_REMOTE_HOST"] else {
throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start m3-host --source virtual there)")
throw XCTSkip("set PUNKTFUNK_REMOTE_HOST (and start punktfunk1-host --source virtual there)")
}
let port = env["PUNKTFUNK_REMOTE_PORT"].flatMap(UInt16.init) ?? 9777
// PUNKTFUNK_REMOTE_COMPOSITOR=kwin|gamescope|… asks the host for a specific
+2 -2
@@ -22,10 +22,10 @@ trap 'kill "${HOST_PID:-}" "${PAIR_PID:-}" 2>/dev/null || true' EXIT
# The open host also scripts a feedback burst (rumble + DualSense hidout) right after the
# handshake, so the Swift test can assert the host→client feedback planes end to end.
HOME="$CFG/open" XDG_CONFIG_HOME="$CFG/open/.config" PUNKTFUNK_TEST_FEEDBACK=1 \
target/release/punktfunk-host m3-host --port "$PORT" --source synthetic --frames 300 &
target/release/punktfunk-host punktfunk1-host --port "$PORT" --source synthetic --frames 300 &
HOST_PID=$!
HOME="$CFG/paired" XDG_CONFIG_HOME="$CFG/paired/.config" \
target/release/punktfunk-host m3-host --port "$PAIR_PORT" --source synthetic --frames 300 \
target/release/punktfunk-host punktfunk1-host --port "$PAIR_PORT" --source synthetic --frames 300 \
--require-pairing >"$PAIR_LOG" 2>&1 &
PAIR_PID=$!
sleep 1