feat(audio): end-to-end 5.1/7.1 surround across the native path + all clients

Adds negotiated 5.1/7.1 surround to the punktfunk/1 protocol and every client
(previously stereo-only):

- core: new shared `audio` layout table (LAYOUT_51/71 + identity multistream
  mapping, canonical wire order FL FR FC LFE RL RR SL SR); Hello/Welcome
  `audio_channels` negotiation via the trailing-byte back-compat pattern (old
  peers fall back to stereo); C-ABI `punktfunk_connect_ex6`,
  `punktfunk_connection_audio_channels`, and in-core multistream decode
  `punktfunk_connection_next_audio_pcm` for embedders without a multistream
  Opus decoder. Real-libopus channel-identity round-trip test.
- host: native audio thread captures + Opus-(multi)stream-encodes at the
  negotiated count (with a cross-session cached-capturer channel-mismatch fix);
  GameStream surround unified onto the safe `opus::MSEncoder`, dropping
  `audiopus_sys` (~4 unsafe blocks) and un-gating Windows GameStream surround;
  WASAPI loopback capture relaxed to 2/6/8 with the correct dwChannelMask.
- clients: Linux (PipeWire), Windows (WASAPI), Android (AAudio) decode via
  `opus::MSDecoder` + render multichannel; Apple decodes in-core to PCM →
  AVAudioEngine with an explicit wire-order channel layout; each gains a
  Stereo/5.1/7.1 setting. `punktfunk-probe --audio-channels N` is the headless
  validator.
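
The trailing-byte back-compat pattern named in the core bullet can be
sketched roughly as below. This is a toy model, not the punktfunk/1 wire
format: the 4-byte legacy body, `encode_hello` and `requested_channels` are
hypothetical names invented for illustration. The only behaviour taken from
this commit is that a peer which omits the byte is treated as stereo and
that counts other than 2/6/8 normalize to stereo. The host-side clamp to
what it can capture is not modelled.

```rust
// Hypothetical sketch: a fixed-size legacy body followed by one OPTIONAL byte.
// Old peers stop after the body; new peers append the requested channel count.
const LEGACY_LEN: usize = 4; // assumed size, for illustration only

/// Encode a toy Hello. `None` models an old peer that knows nothing of surround.
fn encode_hello(legacy_body: &[u8; LEGACY_LEN], audio_channels: Option<u8>) -> Vec<u8> {
    let mut out = legacy_body.to_vec();
    if let Some(ch) = audio_channels {
        out.push(ch); // the trailing byte
    }
    out
}

/// Read the requested channel count from a toy Hello.
/// A missing trailing byte, or any value other than 6 or 8, means stereo.
fn requested_channels(msg: &[u8]) -> u8 {
    match msg.get(LEGACY_LEN).copied() {
        Some(6) => 6,
        Some(8) => 8,
        _ => 2,
    }
}
```

With this shape an old client talking to a new host parses as a stereo
request, and an old host simply never reads the extra byte a new client sent.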

Verified on Linux: core/host/linux/probe test suites + the Android Rust
(cargo-ndk) build, clippy -D warnings, and rustfmt all green. Windows/Apple
builds, all on-glass checks, and the live native loopback are pending (CI / a
free box).

Also lands the concurrent in-tree HEVC 4:4:4 host work (PUNKTFUNK_444): it
shares the same touched files (quic.rs, punktfunk1.rs, encode/*, ...) and so
cannot be committed separately from the surround changes.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-28 21:11:05 +00:00
parent 6383e5f4fd
commit 75627c8afe
51 changed files with 2254 additions and 494 deletions
@@ -163,7 +163,7 @@ fun ConnectScreen(settings: Settings, onConnected: (Long) -> Unit) {
targetHost, targetPort, w, h, hz,
id.certPem, id.privateKeyPem, pinHex ?: "",
settings.bitrateKbps, settings.compositor, gamepadPref,
hdrEnabled,
hdrEnabled, settings.audioChannels,
)
}
connecting = false
@@ -16,6 +16,9 @@ data class Settings(
val bitrateKbps: Int = 0,
val compositor: Int = 0,
val gamepad: Int = 0,
/** Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
* can capture; the resolved count drives the decoder + AAudio layout. */
val audioChannels: Int = 2,
val micEnabled: Boolean = false,
/** Show the live stats overlay (FPS / throughput / latency) during a stream. */
val statsHudEnabled: Boolean = true,
@@ -39,6 +42,7 @@ class SettingsStore(context: Context) {
bitrateKbps = prefs.getInt(K_BITRATE, 0),
compositor = prefs.getInt(K_COMPOSITOR, 0),
gamepad = prefs.getInt(K_GAMEPAD, 0),
audioChannels = prefs.getInt(K_AUDIO_CH, 2),
micEnabled = prefs.getBoolean(K_MIC, false),
statsHudEnabled = prefs.getBoolean(K_HUD, true),
trackpadMode = prefs.getBoolean(K_TRACKPAD, true),
@@ -52,6 +56,7 @@ class SettingsStore(context: Context) {
.putInt(K_BITRATE, s.bitrateKbps)
.putInt(K_COMPOSITOR, s.compositor)
.putInt(K_GAMEPAD, s.gamepad)
.putInt(K_AUDIO_CH, s.audioChannels)
.putBoolean(K_MIC, s.micEnabled)
.putBoolean(K_HUD, s.statsHudEnabled)
.putBoolean(K_TRACKPAD, s.trackpadMode)
@@ -65,6 +70,7 @@ class SettingsStore(context: Context) {
const val K_BITRATE = "bitrate_kbps"
const val K_COMPOSITOR = "compositor"
const val K_GAMEPAD = "gamepad"
const val K_AUDIO_CH = "audio_channels"
const val K_MIC = "mic_enabled"
const val K_HUD = "stats_hud_enabled"
const val K_TRACKPAD = "trackpad_mode"
@@ -133,6 +139,13 @@ val REFRESH_OPTIONS = listOf(
240 to "240 Hz",
)
/** (channel count, label). 2 = stereo (default), 6 = 5.1, 8 = 7.1. */
val AUDIO_CHANNEL_OPTIONS = listOf(
2 to "Stereo",
6 to "5.1 Surround",
8 to "7.1 Surround",
)
/** (kbps, label). `0` = host default. */
val BITRATE_OPTIONS = listOf(
0 to "Automatic",
@@ -104,6 +104,12 @@ fun SettingsScreen(initial: Settings, onChange: (Settings) -> Unit, onBack: () -
}
SettingsGroup("Audio") {
SettingDropdown(
label = "Audio channels",
options = AUDIO_CHANNEL_OPTIONS,
selected = s.audioChannels,
) { ch -> update(s.copy(audioChannels = ch)) }
ToggleRow(
title = "Microphone",
subtitle = "Send your mic to the host's virtual microphone",
@@ -45,6 +45,7 @@ object NativeBridge {
compositorPref: Int,
gamepadPref: Int,
hdrEnabled: Boolean,
audioChannels: Int,
): Long
/** 64-hex SHA-256 of the cert the host presented on [handle]; valid after a successful connect. */
@@ -1,7 +1,11 @@
//! Android audio playback (android-only): pull Opus packets from the connector, decode to
//! interleaved f32 stereo, and feed AAudio (LowLatency) via its realtime data callback through a
//! jitter ring. Mirrors [`crate::decode`]: one thread we own (the Opus decode producer) plus a
//! shutdown flag; the realtime callback thread is owned by AAudio.
//! interleaved f32 (stereo or 5.1/7.1 surround), and feed AAudio (LowLatency) via its realtime data
//! callback through a jitter ring. Mirrors [`crate::decode`]: one thread we own (the Opus decode
//! producer) plus a shutdown flag; the realtime callback thread is owned by AAudio.
//!
//! The layout is the host-RESOLVED channel count (`NativeClient::audio_channels`, negotiated at
//! connect), so an older/clamping host that can only capture stereo is decoded + played as stereo.
//! 2 = stereo / 6 = 5.1 / 8 = 7.1, in the canonical wire order FL FR FC LFE RL RR SL SR.
//!
//! The ring started as a port of `punktfunk-client-linux/src/audio.rs`, but AAudio — unlike
//! PipeWire, which adaptively rate-matches the stream and absorbs a shallow buffer — hands us a raw
@@ -26,36 +30,72 @@ use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};
use std::sync::Arc;
use std::time::Duration;
const CHANNELS: usize = 2;
const SAMPLE_RATE: i32 = 48_000;
/// Decoded-chunk hand-off depth: 64 × 5 ms = 320 ms slack (matches the core's AUDIO_QUEUE).
const RING_CHUNKS: usize = 64;
/// Opus decode scratch: worst-case 120 ms stereo frame (5760 samples/ch × 2 ch).
const PCM_SCRATCH: usize = 5760 * CHANNELS;
// --- Jitter-ring depths, in interleaved-f32 samples (all expressed in ms via `MS`). -----------
// --- Jitter-ring depths, in MILLISECONDS (scaled to interleaved-f32 samples at runtime). --------
// The channel count is negotiated, not a compile-time const, so these are kept in ms and multiplied
// by `ms` (interleaved-f32 samples per millisecond at the resolved layout) inside `start`.
// Unlike the Linux client (PipeWire adaptively rate-matches the stream to the graph clock, masking
// host↔DAC drift + a shallow ring), AAudio hands us a raw callback and we own the buffer: drift and
// WiFi power-save bunching land as underruns/overflows = crackle. So Android runs a deliberately
// deeper, smoothly-managed ring than Linux — keep the two clients' depths intentionally divergent.
/// Interleaved f32 samples per millisecond (48 kHz × 2 ch).
const MS: usize = (SAMPLE_RATE as usize / 1000) * CHANNELS; // 96
/// Prime/target floor: fill to ~40 ms before playing (and after a sustained drain). Deep enough to
/// ride out WiFi arrival jitter + clock drift; the dominant Android-only anti-crackle lever.
const PRIME_FLOOR: usize = 40 * MS;
const PRIME_FLOOR_MS: usize = 40;
/// Ceiling for the burst-scaled target (so a large quantum can't push the prime depth too high).
const PRIME_CEIL: usize = 80 * MS;
const PRIME_CEIL_MS: usize = 80;
/// Drop-oldest headroom above the target before trimming — a ~80 ms band swallows an arrival burst
/// without overflowing.
const JITTER_HEADROOM: usize = 80 * MS;
const JITTER_HEADROOM_MS: usize = 80;
/// Hard latency bound: never let the ring exceed ~150 ms (the only thing that caps added latency).
const HARD_CAP: usize = 150 * MS;
const HARD_CAP_MS: usize = 150;
/// Re-prime (go silent to refill) only after this many CONSECUTIVE empty callbacks, so one transient
/// drain doesn't manufacture a fresh 40 ms silence (the old `if ring.is_empty()` re-primed instantly).
const DEPRIME_AFTER_CALLBACKS: u32 = 5;
/// Throttle the AAudio XRun-driven HW-buffer grow check (cheap, but no need to poll every quantum).
const XRUN_CHECK_EVERY: u32 = 128;
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table. Mirrors the Linux client's `AudioDec`.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
SAMPLE_RATE as u32,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
SAMPLE_RATE as u32,
l.streams,
l.coupled,
l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
/// Diagnostics — written by the decode thread + the realtime callback, logged periodically. The
/// audio analogue of the video `fed`/`rendered` counters (we can't "screenshot" sound).
#[derive(Default)]
@@ -74,9 +114,20 @@ pub struct AudioPlayback {
}
impl AudioPlayback {
/// Open AAudio (LowLatency, 48 kHz/stereo/f32) with a realtime callback draining a jitter ring,
/// then spawn the Opus decode thread. `None` on failure (the caller leaves video streaming).
/// Open AAudio (LowLatency, 48 kHz/f32, the host-resolved channel layout) with a realtime
/// callback draining a jitter ring, then spawn the Opus decode thread. `None` on failure (the
/// caller leaves video streaming).
pub fn start(client: Arc<NativeClient>) -> Option<AudioPlayback> {
// Build playback from the host-RESOLVED channel count (never the request): 2 = stereo /
// 6 = 5.1 / 8 = 7.1, canonical wire order FL FR FC LFE RL RR SL SR.
let channels = punktfunk_core::audio::normalize_channels(client.audio_channels) as usize;
// Interleaved f32 samples per millisecond at this layout (48 kHz × channels); the ms-
// denominated jitter-ring depths scale by it.
let ms = (SAMPLE_RATE as usize / 1000) * channels;
let prime_floor = PRIME_FLOOR_MS * ms;
let prime_ceil = PRIME_CEIL_MS * ms;
let jitter_headroom = JITTER_HEADROOM_MS * ms;
let hard_cap_max = HARD_CAP_MS * ms;
let counters = Arc::new(Counters::default());
let (tx, rx) = sync_channel::<Vec<f32>>(RING_CHUNKS);
// Recycle free-list: drained PCM buffers go BACK to the decode thread to be refilled, so the
@@ -92,13 +143,13 @@ impl AudioPlayback {
// before the trim below = the hard cap plus one full channel of 5 ms (480-f32) frames — the
// punktfunk protocol always sends 5 ms Opus frames (host `audio_thread`); a larger frame
// would force a one-time realloc, asserted (not silently corrupted) in `decode_loop`.
let mut ring: VecDeque<f32> = VecDeque::with_capacity(HARD_CAP + RING_CHUNKS * 5 * MS);
let mut ring: VecDeque<f32> = VecDeque::with_capacity(hard_cap_max + RING_CHUNKS * 5 * ms);
let mut primed = false;
let mut empties: u32 = 0; // consecutive empty callbacks (de-prime hysteresis)
let mut cb_count: u32 = 0; // callbacks since open (throttles the XRun grow check)
let mut last_xrun: i32 = 0; // last AAudio XRun count we grew the buffer for
let callback = move |s: &AudioStream, data: *mut c_void, num_frames: i32| {
let want = num_frames as usize * CHANNELS;
let want = num_frames as usize * channels;
// SAFETY: AAudio provides `num_frames * channel_count` F32 slots at `data`.
let out = unsafe { std::slice::from_raw_parts_mut(data as *mut f32, want) };
// Drain decoded chunks into the ring WITHOUT freeing on the RT thread: `drain(..)` empties
@@ -108,11 +159,11 @@ impl AudioPlayback {
ring.extend(chunk.drain(..));
let _ = free_tx.try_send(chunk);
}
// Jitter buffer: prime to ~40 ms (PRIME_FLOOR) before playing and after a sustained drain;
// Jitter buffer: prime to ~40 ms (prime_floor) before playing and after a sustained drain;
// drop-oldest only above a wide ~120 ms band. Decoupled from the AAudio burst `want` (tiny
// on the LowLatency MMAP path) so the depth doesn't collapse to a single quantum.
let target = (3 * want).clamp(PRIME_FLOOR, PRIME_CEIL);
let hard_cap = (target + JITTER_HEADROOM).min(HARD_CAP);
let target = (3 * want).clamp(prime_floor, prime_ceil);
let hard_cap = (target + jitter_headroom).min(hard_cap_max);
while ring.len() > hard_cap {
ring.pop_front();
}
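The millisecond-denominated depths above only become sample counts once the channel count is known. A minimal standalone sketch of that arithmetic follows, reusing the constants and the clamp/min expressions from this hunk. The helper names `samples_per_ms` and `ring_bounds` are hypothetical and exist only for this illustration; in the commit the same math is inlined in `start` and the callback.

```rust
const SAMPLE_RATE: usize = 48_000;
const PRIME_FLOOR_MS: usize = 40;
const PRIME_CEIL_MS: usize = 80;
const JITTER_HEADROOM_MS: usize = 80;
const HARD_CAP_MS: usize = 150;

/// Interleaved f32 samples per millisecond at the given channel count.
fn samples_per_ms(channels: usize) -> usize {
    (SAMPLE_RATE / 1000) * channels
}

/// (target, hard_cap) in interleaved samples for a callback that asks for
/// `num_frames` frames. Hypothetical helper mirroring the inlined expressions.
fn ring_bounds(num_frames: usize, channels: usize) -> (usize, usize) {
    let ms = samples_per_ms(channels);
    let want = num_frames * channels;
    // Burst-scaled target, held between the 40 ms floor and the 80 ms ceiling.
    let target = (3 * want).clamp(PRIME_FLOOR_MS * ms, PRIME_CEIL_MS * ms);
    // Drop-oldest threshold: target plus headroom, never above the 150 ms cap.
    let hard_cap = (target + JITTER_HEADROOM_MS * ms).min(HARD_CAP_MS * ms);
    (target, hard_cap)
}
```

For a small 96-frame burst the stereo bounds come out at 3840 and 11520 samples (40 ms and 120 ms), and the 7.1 bounds at 15360 and 46080 samples, which are the same durations at four times the sample count.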
@@ -166,7 +217,11 @@ impl AudioPlayback {
.ok()?
.direction(AudioDirection::Output)
.sample_rate(SAMPLE_RATE)
.channel_count(CHANNELS as i32)
// The wire order (FL FR FC LFE RL RR SL SR) is the standard AAudio/Android channel
// order, so this is an IDENTITY mapping — no permute. AAudio infers the 5.1/7.1 mask
// from `channel_count` (the ndk crate's builder exposes no setChannelMask); the host
// captures + Opus-encodes in exactly this order.
.channel_count(channels as i32)
.format(AudioFormat::PCM_Float)
.performance_mode(AudioPerformanceMode::LowLatency)
.sharing_mode(AudioSharingMode::Shared)
@@ -206,7 +261,7 @@ impl AudioPlayback {
let sd = shutdown.clone();
let join = std::thread::Builder::new()
.name("pf-audio".into())
.spawn(move || decode_loop(client, tx, free_rx, sd, counters))
.spawn(move || decode_loop(client, tx, free_rx, sd, counters, channels))
.ok();
Some(AudioPlayback {
@@ -236,29 +291,34 @@ fn decode_loop(
free_rx: Receiver<Vec<f32>>,
shutdown: Arc<AtomicBool>,
counters: Arc<Counters>,
channels: usize,
) {
let mut dec = match opus::Decoder::new(SAMPLE_RATE as u32, opus::Channels::Stereo) {
// Interleaved f32 samples per millisecond at this layout — the ring's 5 ms reserve check below.
let ms = (SAMPLE_RATE as usize / 1000) * channels;
// Opus decode scratch: worst-case 120 ms frame (5760 samples/ch) × channels.
let pcm_scratch = 5760 * channels;
let mut dec = match AudioDec::new(channels as u8) {
Ok(d) => d,
Err(e) => {
log::error!("audio: opus decoder init: {e} — audio disabled");
return;
}
};
let mut pcm = vec![0f32; PCM_SCRATCH];
let mut pcm = vec![0f32; pcm_scratch];
let mut window_peak = 0f32; // loudest |sample| since the last log — tells a tone from silence
while !shutdown.load(Ordering::Relaxed) {
match client.next_audio(Duration::from_millis(5)) {
Ok(pkt) => match dec.decode_float(&pkt.data, &mut pcm, false) {
Ok(samples) => {
let n = samples * CHANNELS;
let n = samples * channels;
for &s in &pcm[..n] {
window_peak = window_peak.max(s.abs());
}
// The ring's pre-reservation in `start` assumes the protocol's 5 ms (≤480-f32)
// The ring's pre-reservation in `start` assumes the protocol's 5 ms (≤480-f32/ch)
// frames; a larger frame would force a one-time realloc on the RT thread. Catch a
// future host frame-size change here in debug, not as a silent audio glitch.
debug_assert!(
n <= 5 * MS,
n <= 5 * ms,
"audio frame {n} f32 exceeds the 5 ms ring reserve"
);
let count = counters.opus_decoded.fetch_add(1, Ordering::Relaxed) + 1;
@@ -266,7 +326,7 @@ fn decode_loop(
// free-list is momentarily empty (startup / after a backpressure drop).
let mut buf = free_rx
.try_recv()
.unwrap_or_else(|_| Vec::with_capacity(PCM_SCRATCH));
.unwrap_or_else(|_| Vec::with_capacity(pcm_scratch));
buf.clear();
buf.extend_from_slice(&pcm[..n]);
match tx.try_send(buf) {
@@ -140,10 +140,12 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeGenerateIde
}
/// `NativeBridge.nativeConnect(host, port, w, h, hz, certPem, keyPem, pinHex, bitrateKbps,
/// compositorPref, gamepadPref): Long`. `certPem`/`keyPem` empty = anonymous, else presented as the
/// persistent identity. `pinHex` empty = TOFU (read `nativeHostFingerprint` after), else 64-hex
/// SHA-256 to pin the host (mismatch → 0). `bitrateKbps` 0 = host default. `compositorPref`/
/// `gamepadPref` are `CompositorPref`/`GamepadPref` wire bytes (0 = Auto; unknown → Auto).
/// compositorPref, gamepadPref, hdrEnabled, audioChannels): Long`. `certPem`/`keyPem` empty =
/// anonymous, else presented as the persistent identity. `pinHex` empty = TOFU (read
/// `nativeHostFingerprint` after), else 64-hex SHA-256 to pin the host (mismatch → 0). `bitrateKbps`
/// 0 = host default. `compositorPref`/`gamepadPref` are `CompositorPref`/`GamepadPref` wire bytes
/// (0 = Auto; unknown → Auto). `audioChannels` is the requested surround layout (2/6/8; normalized,
/// anything else → stereo) — the host clamps it and the resolved count drives playback.
/// Returns an opaque handle, or 0 on failure (logged).
#[no_mangle]
#[allow(clippy::too_many_arguments)]
@@ -162,6 +164,7 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeConnect<'lo
compositor_pref: jint,
gamepad_pref: jint,
hdr_enabled: jboolean,
audio_channels: jint,
) -> jlong {
let host: String = match env.get_string(&host) {
Ok(s) => s.into(),
@@ -213,6 +216,11 @@ pub extern "system" fn Java_io_unom_punktfunk_kit_NativeBridge_nativeConnect<'lo
} else {
0
},
// Requested surround layout (2 = stereo / 6 = 5.1 / 8 = 7.1). The host clamps to what it can
// capture and echoes the resolved count in `connector.audio_channels`, which drives the
// decoder + AAudio layout (read in `crate::audio::AudioPlayback::start`). Anything else
// normalizes to stereo here.
punktfunk_core::audio::normalize_channels(audio_channels.clamp(0, u8::MAX as jint) as u8),
None, // launch: default app
pin, // Some → Crypto on host-fp mismatch
identity, // owned (cert, key) PEM, or None (anonymous)
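The `clamp` before the `as u8` narrowing in this hunk matters because a plain `as` cast keeps only the low 8 bits of the `jint`. The sketch below illustrates the difference. `normalize_channels` here is a stand-in with assumed behaviour (6 and 8 pass through, everything else becomes 2, as the doc comment describes), not the real `punktfunk_core::audio::normalize_channels`, and both `from_jint*` helpers are hypothetical.

```rust
/// Stand-in for the core helper (assumed behaviour: 6 and 8 pass, all else -> 2).
fn normalize_channels(requested: u8) -> u8 {
    match requested {
        6 | 8 => requested,
        _ => 2,
    }
}

/// Narrow a JNI `jint` (an i32) as the diff does: clamp into u8 range, then normalize.
fn from_jint(audio_channels: i32) -> u8 {
    normalize_channels(audio_channels.clamp(0, u8::MAX as i32) as u8)
}

/// The same narrowing WITHOUT the clamp, to show what the clamp guards against.
fn from_jint_unclamped(audio_channels: i32) -> u8 {
    normalize_channels(audio_channels as u8)
}
```

Without the clamp, an out-of-range value such as 262 or -250 wraps to 6 and would be accepted as a 5.1 request. With it, both saturate to a value that normalizes to stereo.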
@@ -25,6 +25,7 @@ struct ContentView: View {
@AppStorage(DefaultsKey.compositor) private var compositor = 0
@AppStorage(DefaultsKey.gamepadType) private var gamepadType = 0
@AppStorage(DefaultsKey.bitrateKbps) private var bitrateKbps = 0
@AppStorage(DefaultsKey.audioChannels) private var audioChannels = 2
@AppStorage(DefaultsKey.fullscreenWhileStreaming) private var fullscreenWhileStreaming = true
@AppStorage(DefaultsKey.hudEnabled) private var hudEnabled = true
@AppStorage(DefaultsKey.hudPlacement) private var hudPlacement = HUDPlacement.topTrailing.rawValue
@@ -252,6 +253,7 @@ struct ContentView: View {
setting: PunktfunkConnection.GamepadType(
rawValue: UInt32(clamping: gamepadType)) ?? .auto),
bitrateKbps: UInt32(clamping: bitrateKbps),
audioChannels: UInt8(clamping: audioChannels),
launchID: launchID,
allowTofu: host.pinnedSHA256 == nil)
}
@@ -351,6 +353,7 @@ struct ContentView: View {
compositor: pref,
gamepad: pad,
bitrateKbps: bitrate,
audioChannels: UInt8(clamping: audioChannels),
autoTrust: true)
}
}
@@ -99,6 +99,7 @@ final class SessionModel: ObservableObject {
compositor: PunktfunkConnection.Compositor = .auto,
gamepad: PunktfunkConnection.GamepadType = .auto,
bitrateKbps: UInt32 = 0,
audioChannels: UInt8 = 2,
hdrEnabled: Bool = true,
launchID: String? = nil,
allowTofu: Bool = false,
@@ -137,7 +138,7 @@ final class SessionModel: ObservableObject {
width: width, height: height, refreshHz: hz,
pinSHA256: pin, identity: identity, compositor: compositor,
gamepad: gamepad, bitrateKbps: bitrateKbps, videoCaps: videoCaps,
launchID: launchID) }
audioChannels: audioChannels, launchID: launchID) }
await MainActor.run { [weak self] in
guard let self else { return }
// The user may have abandoned this attempt (window closed, another host
@@ -25,6 +25,7 @@ struct SettingsView: View {
@AppStorage(DefaultsKey.libraryEnabled) private var libraryEnabled = false
@AppStorage(DefaultsKey.fullscreenWhileStreaming) private var fullscreenWhileStreaming = true
@AppStorage(DefaultsKey.micEnabled) private var micEnabled = true
@AppStorage(DefaultsKey.audioChannels) private var audioChannels = 2
@AppStorage(DefaultsKey.hudEnabled) private var hudEnabled = true
@AppStorage(DefaultsKey.hudPlacement) private var hudPlacement = HUDPlacement.topTrailing.rawValue
@ObservedObject private var gamepads = GamepadManager.shared
@@ -173,6 +174,10 @@ struct SettingsView: View {
TVSelectionRow(title: "Stream mode", options: options, selection: modeTag)
TVSelectionRow(
title: "Bitrate", options: bitrateOptions, selection: $bitrateKbps)
TVSelectionRow(
title: "Audio channels",
options: [("Stereo", 2), ("5.1 Surround", 6), ("7.1 Surround", 8)],
selection: $audioChannels)
if bitrateKbps > 1_000_000 {
Label(Self.gigabitWarning, systemImage: "exclamationmark.triangle.fill")
.font(.caption)
@@ -271,6 +276,11 @@ struct SettingsView: View {
@ViewBuilder private var audioSection: some View {
Section {
Picker("Audio channels", selection: $audioChannels) {
Text("Stereo").tag(2)
Text("5.1 Surround").tag(6)
Text("7.1 Surround").tag(8)
}
#if os(macOS)
Picker("Speaker", selection: $speakerUID) {
Text("System default").tag("")
@@ -15,6 +15,9 @@ public enum DefaultsKey {
public static let gamepadType = "punktfunk.gamepadType"
public static let gamepadID = "punktfunk.gamepadID"
public static let bitrateKbps = "punktfunk.bitrateKbps"
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the in-core decode + AVAudioEngine layout.
public static let audioChannels = "punktfunk.audioChannels"
public static let micEnabled = "punktfunk.micEnabled"
public static let speakerUID = "punktfunk.speakerUID"
public static let micUID = "punktfunk.micUID"
@@ -235,6 +235,12 @@ public final class PunktfunkConnection {
/// drain `nextHdrMeta`.
public var isHDR: Bool { colorTransfer == 16 || colorTransfer == 18 }
/// The audio channel count the host resolved for this session (the Welcome's echo of the
/// requested `audioChannels`, clamped to what the host can capture): `2` (stereo), `6` (5.1)
/// or `8` (7.1). Build the playback layout from THIS, never the request. `2` for an older host.
/// PCM from `nextAudioPcm` is interleaved in the canonical wire order FL FR FC LFE RL RR SL SR.
public private(set) var resolvedAudioChannels: UInt8 = 2
/// Connect and start a session at the requested mode (the host creates a native virtual
/// output at exactly this size/refresh). Blocks up to `timeoutMs`.
///
@@ -264,6 +270,7 @@ public final class PunktfunkConnection {
gamepad: GamepadType = .auto,
bitrateKbps: UInt32 = 0,
videoCaps: UInt8 = 0,
audioChannels: UInt8 = 2,
launchID: String? = nil,
timeoutMs: UInt32 = 10_000
) throws {
@@ -279,16 +286,16 @@ public final class PunktfunkConnection {
withOptionalCString(launchID) { launch in
if let pin = pinSHA256 {
return pin.withUnsafeBytes { p in
punktfunk_connect_ex5(
punktfunk_connect_ex6(
cs, port, width, height, refreshHz, compositor.rawValue,
gamepad.rawValue, bitrateKbps, videoCaps, launch,
gamepad.rawValue, bitrateKbps, videoCaps, audioChannels, launch,
p.bindMemory(to: UInt8.self).baseAddress, &observed,
cert, key, timeoutMs)
}
}
return punktfunk_connect_ex5(
return punktfunk_connect_ex6(
cs, port, width, height, refreshHz, compositor.rawValue,
gamepad.rawValue, bitrateKbps, videoCaps, launch,
gamepad.rawValue, bitrateKbps, videoCaps, audioChannels, launch,
nil, &observed, cert, key, timeoutMs)
}
}
@@ -320,6 +327,9 @@ public final class PunktfunkConnection {
colorMatrix = mtx
colorFullRange = fullRange != 0
bitDepth = depth
var ac: UInt8 = 2
_ = punktfunk_connection_audio_channels(handle, &ac)
resolvedAudioChannels = ac
}
/// A bandwidth speed-test measurement (see `startSpeedTest`). Partial until `done`.
@@ -468,6 +478,50 @@ public final class PunktfunkConnection {
}
}
/// One decoded audio frame from `nextAudioPcm`: interleaved 32-bit float at 48 kHz, in the
/// canonical wire channel order FL FR FC LFE RL RR SL SR (the first `channels`).
public struct AudioPCM: Sendable {
/// Interleaved f32 samples (`frameCount * channels` long), wire channel order.
public let samples: [Float]
/// Samples per channel.
public let frameCount: Int
/// Channel count (2/6/8), equal to `resolvedAudioChannels`.
public let channels: Int
public let ptsNs: UInt64
public let seq: UInt32
}
/// Pull the next audio frame, **decoded in-core** to interleaved f32 PCM. Apple's AudioToolbox
/// Opus path is stereo-only, so surround (and, for uniformity, stereo too) is decoded by the
/// Rust core (libopus multistream) and handed back as PCM. nil on timeout, throws `.closed` once
/// the session ended. Drain from a dedicated audio thread (do NOT also call `nextAudio`; they
/// share the underlying queue). The returned `samples` are copied out, so the buffer is owned.
public func nextAudioPcm(timeoutMs: UInt32 = 100) throws -> AudioPCM? {
audioLock.lock()
defer { audioLock.unlock() }
guard let h = liveHandle() else { throw PunktfunkClientError.closed }
var out = PunktfunkAudioPcm()
let rc = punktfunk_connection_next_audio_pcm(h, &out, timeoutMs)
switch rc {
case statusOK:
let channels = Int(out.channels)
let total = Int(out.frame_count) * channels
guard let base = out.samples, total > 0 else { return nil }
// Copy: the pointer borrows connection memory only until the next PCM call.
let samples = Array(UnsafeBufferPointer(start: base, count: total))
return AudioPCM(
samples: samples, frameCount: Int(out.frame_count),
channels: channels, ptsNs: out.pts_ns, seq: out.seq)
case statusNoFrame:
return nil
case statusClosed:
throw PunktfunkClientError.closed
default:
throw PunktfunkClientError.status(rc)
}
}
/// Pull the next force-feedback update for the GCController haptics engine:
/// `(pad, lowFrequency, highFrequency)` with 0...0xFFFF amplitudes, (0, 0) = stop.
/// Drain from the (single) feedback thread, alongside `nextHidOutput`.
@@ -19,13 +19,13 @@ import os
private let log = Logger(subsystem: "io.unom.punktfunk", category: "audio")
/// SPSC-ish jitter ring (interleaved stereo float), drain thread → render callback.
/// The unfair lock is held for microseconds; fine at render-callback rates. Priming:
/// SPSC-ish jitter ring (interleaved float, `channels` per frame), drain thread → render
/// callback. The unfair lock is held for microseconds; fine at render-callback rates. Priming:
/// reads return silence until enough is buffered (at least `prefill`, and at least one
/// packet more than the device's render quantum; large-buffer devices would otherwise
/// chronically out-demand the prefill and oscillate prime → dropout → re-prime), and an
/// underrun re-primes, concealing jitter as one short dip instead of sustained crackle.
/// All counts stay even (whole stereo frames), so L/R interleave can never flip.
/// All counts stay whole frames (multiples of `channels`), so the interleave can never slip.
final class AudioRing: @unchecked Sendable {
private var buf: [Float]
private var readIdx = 0
@@ -34,12 +34,14 @@ final class AudioRing: @unchecked Sendable {
private var renderQuantum = 0
private let prefill: Int
private let highWater: Int
private let channels: Int
private let lock = OSAllocatedUnfairLock()
/// `capacity`/`prefill` in samples (interleaved 2 per frame, both must be even).
init(capacity: Int, prefill: Int) {
/// `capacity`/`prefill` in samples (interleaved `channels` per frame, both whole frames).
init(capacity: Int, prefill: Int, channels: Int) {
buf = [Float](repeating: 0, count: capacity)
self.prefill = prefill
self.channels = channels
highWater = prefill * 4
}
@@ -74,8 +76,8 @@ final class AudioRing: @unchecked Sendable {
renderQuantum = max(renderQuantum, count)
let available = writeIdx - readIdx
if !primed {
// 480 samples = one 5 ms host packet of slack beyond the device's demand.
if available >= max(prefill, renderQuantum + 480) {
// One 5 ms host packet (240 frames × channels) of slack beyond the device's demand.
if available >= max(prefill, renderQuantum + 240 * channels) {
primed = true
} else {
for i in 0..<count { out[i] = 0 }
@@ -113,10 +115,55 @@ private final class StopFlag: @unchecked Sendable {
/// Render-block-owned scratch storage: freed exactly when the closure (and thus the
/// last possible render call) is released, never racing CoreAudio.
private final class ScratchBuffer {
let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 8192 * 2)
// 8192 frames × up to 8 channels (7.1); the render block caps `frames` at 8192.
let ptr = UnsafeMutablePointer<Float>.allocate(capacity: 8192 * 8)
deinit { ptr.deallocate() }
}
/// CoreAudio channel layout for the canonical wire order FL FR FC LFE RL RR [SL SR]. nil for
/// stereo (the standard layout is correct). For 5.1/7.1 we list explicit channel labels via
/// `kAudioChannelLayoutTag_UseChannelDescriptions`; preset tags (DTS_5_1 etc.) don't reliably
/// match Moonlight's order. NB: the 7.1 mapping (verified against the WASAPI 0x63F + SPA orderings):
/// wire idx 4-5 = RL/RR = the WAVE *back* pair LeftSurround/RightSurround; idx 6-7 = SL/SR = the
/// WAVE *side* pair LeftSurroundDirect/RightSurroundDirect. (Using RearSurround* for 6-7 would
/// swap side/back vs the Windows/Linux clients.)
private func wireChannelLayout(channels: Int) -> AVAudioChannelLayout? {
let labels: [AudioChannelLabel]
switch channels {
case 6:
labels = [
kAudioChannelLabel_Left, kAudioChannelLabel_Right, kAudioChannelLabel_Center,
kAudioChannelLabel_LFEScreen, kAudioChannelLabel_LeftSurround,
kAudioChannelLabel_RightSurround,
]
case 8:
labels = [
kAudioChannelLabel_Left, kAudioChannelLabel_Right, kAudioChannelLabel_Center,
kAudioChannelLabel_LFEScreen,
kAudioChannelLabel_LeftSurround, kAudioChannelLabel_RightSurround, // wire RL/RR (back)
kAudioChannelLabel_LeftSurroundDirect, kAudioChannelLabel_RightSurroundDirect, // wire SL/SR (side)
]
default:
return nil
}
let size = MemoryLayout<AudioChannelLayout>.size
+ (labels.count - 1) * MemoryLayout<AudioChannelDescription>.stride
let raw = UnsafeMutableRawPointer.allocate(byteCount: size, alignment: 16)
defer { raw.deallocate() }
let layout = raw.bindMemory(to: AudioChannelLayout.self, capacity: 1)
layout.pointee.mChannelLayoutTag = kAudioChannelLayoutTag_UseChannelDescriptions
layout.pointee.mChannelBitmap = AudioChannelBitmap(rawValue: 0)
layout.pointee.mNumberChannelDescriptions = UInt32(labels.count)
let descs = UnsafeMutableBufferPointer(
start: &layout.pointee.mChannelDescriptions, count: labels.count)
for (i, lbl) in labels.enumerated() {
descs[i] = AudioChannelDescription(
mChannelLabel: lbl, mChannelFlags: AudioChannelFlags(rawValue: 0),
mCoordinates: (0, 0, 0))
}
return AVAudioChannelLayout(layout: layout)
}
public final class SessionAudio {
private let connection: PunktfunkConnection
private let flag = StopFlag()
@@ -229,9 +276,13 @@ public final class SessionAudio {
// MARK: - Playback (host speaker)
private func startPlayback(speakerUID: String) {
// 1 s of interleaved stereo capacity, ~20 ms prefill: four 5 ms host packets of
// jitter absorption before the first sample plays.
let ring = AudioRing(capacity: 96_000, prefill: 1920)
// Build the playback layout from the host-RESOLVED channel count (never the request):
// 2 = stereo / 6 = 5.1 / 8 = 7.1, canonical wire order FL FR FC LFE RL RR SL SR.
let channels = Int(connection.resolvedAudioChannels)
// 1 s interleaved capacity, ~20 ms prefill (four 5 ms host packets of jitter absorption
// before the first sample plays), both scaled by the channel count.
let ring = AudioRing(
capacity: 48_000 * channels, prefill: 960 * channels, channels: channels)
let engine = AVAudioEngine()
#if os(macOS)
@@ -247,21 +298,32 @@ public final class SessionAudio {
}
#endif
// Engine-native deinterleaved float; the render block deinterleaves from the ring.
guard let format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)
else { return }
// Engine-native deinterleaved float; the render block deinterleaves from the ring. Surround
// uses an explicit wire-order channel layout; the mixer downmixes to the output device when
// it has fewer speakers (e.g. an iPhone's stereo built-ins). (Explicit if/else rather than
// map/flatMap so it's correct whether the channelLayout initializer is failable or not.)
var format: AVAudioFormat?
if channels == 2 {
format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channels: 2)
} else if let layout = wireChannelLayout(channels: channels) {
format = AVAudioFormat(standardFormatWithSampleRate: 48_000, channelLayout: layout)
}
guard let format else {
log.error("could not build \(channels)-channel audio format — audio disabled")
return
}
let scratch = ScratchBuffer() // block-owned; freed with the closure
let source = AVAudioSourceNode(format: format) { _, _, frameCount, abl -> OSStatus in
let frames = Int(frameCount)
guard frames <= 8192 else { return kAudioUnitErr_TooManyFramesToProcess }
ring.read(into: scratch.ptr, count: frames * 2)
ring.read(into: scratch.ptr, count: frames * channels)
let buffers = UnsafeMutableAudioBufferListPointer(abl)
if buffers.count >= 2,
let left = buffers[0].mData?.assumingMemoryBound(to: Float.self),
let right = buffers[1].mData?.assumingMemoryBound(to: Float.self) {
for f in 0..<frames {
left[f] = scratch.ptr[f * 2]
right[f] = scratch.ptr[f * 2 + 1]
// Deinterleave the wire-order interleaved ring into the engine's per-channel buses.
if buffers.count >= channels {
for ch in 0..<channels {
if let dst = buffers[ch].mData?.assumingMemoryBound(to: Float.self) {
for f in 0..<frames { dst[f] = scratch.ptr[f * channels + ch] }
}
}
}
return noErr
@@ -292,29 +354,20 @@ public final class SessionAudio {
stateLock.unlock()
let thread = Thread { [connection, flag, drainDone] in
defer { drainDone.signal() }
guard let decoder = try? OpusDecoder(framesPerPacket: 240),
let pcm = AVAudioPCMBuffer(
pcmFormat: decoder.pcmFormat, frameCapacity: 5760)
else {
log.error("Opus decoder unavailable — audio playback disabled")
return
}
// Decode happens IN-CORE (libopus multistream), because AudioToolbox's Opus path is
// stereo-only; the result is handed back as interleaved f32 PCM in wire channel order.
while !flag.isStopped {
let packet: AudioPacket?
let pcm: PunktfunkConnection.AudioPCM?
do {
packet = try connection.nextAudio(timeoutMs: 100)
pcm = try connection.nextAudioPcm(timeoutMs: 100)
} catch {
break // session closed
}
guard let packet else { continue }
do {
let frames = try decoder.decode(packet.data, into: pcm)
if frames > 0, let p = pcm.floatChannelData?[0] {
ring.write(p, count: Int(frames) * 2)
guard let pcm, pcm.frameCount > 0 else { continue }
pcm.samples.withUnsafeBufferPointer { p in
if let base = p.baseAddress {
ring.write(base, count: pcm.frameCount * pcm.channels)
}
} catch {
// One corrupt packet != a dead stream; skip it.
log.warning("audio decode failed: \(error.localizedDescription)")
}
}
}
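The render block above copies the wire-order interleaved ring into one buffer per channel. A minimal sketch of the same deinterleave step, written in Rust for consistency with the other clients; the helper name `deinterleave` is hypothetical and is not part of this change:

```rust
/// Split an interleaved buffer (frame-major, wire channel order) into one
/// plane per channel. Hypothetical helper, for illustration only.
/// Returns `None` when the input is not a whole number of frames.
fn deinterleave(interleaved: &[f32], channels: usize) -> Option<Vec<Vec<f32>>> {
    if channels == 0 || interleaved.len() % channels != 0 {
        return None;
    }
    let frames = interleaved.len() / channels;
    let mut planes: Vec<Vec<f32>> =
        (0..channels).map(|_| Vec::with_capacity(frames)).collect();
    // Sample `ch` of frame `f` lives at index `f * channels + ch`.
    for frame in interleaved.chunks_exact(channels) {
        for (ch, &sample) in frame.iter().enumerate() {
            planes[ch].push(sample);
        }
    }
    Some(planes)
}
```

With `channels == 2` this reduces to the old left/right split, which is why the stereo path needs no special case in the render block.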
+2
@@ -452,6 +452,7 @@ fn speed_test(app: Rc<App>, req: ConnectRequest) {
GamepadPref::Auto,
0, // bitrate_kbps (host default)
0, // video_caps: the Linux client has no 10-bit/HDR present path yet
2, // audio_channels: speed-test probe, stereo
None, // launch: speed-test probe connect, no game
pin,
Some(identity),
@@ -573,6 +574,7 @@ fn start_session(app: Rc<App>, req: ConnectRequest, pin: Option<[u8; 32]>) {
},
bitrate_kbps: s.bitrate_kbps,
mic_enabled: s.mic_enabled,
audio_channels: s.audio_channels,
pin,
identity: app.identity.clone(),
};
+21 -10
@@ -27,16 +27,17 @@ pub struct AudioPlayer {
}
impl AudioPlayer {
/// Spawn the PipeWire playback thread. Failure (no PipeWire in the session) is
/// survivable — the caller streams video-only.
pub fn spawn() -> Result<AudioPlayer> {
/// Spawn the PipeWire playback thread for `channels` (2/6/8, canonical wire order
/// FL FR FC LFE RL RR SL SR). Failure (no PipeWire in the session) is survivable — the
/// caller streams video-only.
pub fn spawn(channels: u32) -> Result<AudioPlayer> {
// 64 × 5 ms = 320 ms of slack between the pump and the PipeWire loop.
let (pcm_tx, pcm_rx) = std::sync::mpsc::sync_channel::<Vec<f32>>(64);
let (quit_tx, quit_rx) = pipewire::channel::channel::<Terminate>();
let thread = std::thread::Builder::new()
.name("punktfunk-audio".into())
.spawn(move || {
if let Err(e) = pw_thread(pcm_rx, quit_rx) {
if let Err(e) = pw_thread(pcm_rx, quit_rx, channels as usize) {
tracing::warn!(error = %e, "audio playback thread ended");
}
})
@@ -48,8 +49,8 @@ impl AudioPlayer {
})
}
/// Queue one interleaved-stereo f32 chunk. Drops the chunk if the PipeWire side is
/// wedged (the renderer conceals the gap; never block the session pump).
/// Queue one interleaved f32 chunk (in the session's channel layout). Drops the chunk if the
/// PipeWire side is wedged (the renderer conceals the gap; never block the session pump).
pub fn push(&self, pcm: Vec<f32>) {
if let Err(TrySendError::Disconnected(_)) = self.pcm_tx.try_send(pcm) {
// Thread already dead — Drop will reap it; nothing to do per-chunk.
@@ -71,11 +72,14 @@ struct PlayerData {
rx: Receiver<Vec<f32>>,
ring: VecDeque<f32>,
primed: bool,
/// Interleaved channel count this stream was opened with (2/6/8).
channels: usize,
}
fn pw_thread(
pcm_rx: Receiver<Vec<f32>>,
quit_rx: pipewire::channel::Receiver<Terminate>,
channels: usize,
) -> Result<()> {
use pipewire as pw;
use pw::{properties::properties, spa};
@@ -115,6 +119,7 @@ fn pw_thread(
rx: pcm_rx,
ring: VecDeque::new(),
primed: false,
channels,
};
let _listener = stream
@@ -130,19 +135,19 @@ fn pw_thread(
while let Ok(chunk) = ud.rx.try_recv() {
ud.ring.extend(chunk);
}
let stride = 4 * CHANNELS; // F32LE interleaved
let stride = 4 * ud.channels; // F32LE interleaved
let datas = buffer.datas_mut();
if datas.is_empty() {
return;
}
let data = &mut datas[0];
let want_frames = data.data().map(|s| s.len() / stride).unwrap_or(0);
let want = want_frames * CHANNELS;
let want = want_frames * ud.channels;
// Adaptive jitter buffer (same shape as the host's virtual mic): prime to
// ~3 quanta, cap at ~1 quantum of slack beyond that, re-prime after a
// genuine drain.
let target = (3 * want).clamp(720 * CHANNELS, 9600 * CHANNELS);
let target = (3 * want).clamp(720 * ud.channels, 9600 * ud.channels);
while ud.ring.len() > target.max(want) + want {
ud.ring.pop_front();
}
@@ -182,7 +187,13 @@ fn pw_thread(
let mut info = AudioInfoRaw::new();
info.set_format(AudioFormat::F32LE);
info.set_rate(SAMPLE_RATE);
info.set_channels(CHANNELS as u32);
info.set_channels(channels as u32);
// Channel positions in canonical wire order (FL FR FC LFE RL RR SL SR) so PipeWire routes each
// slot to the matching speaker (and downmixes when the sink has fewer). Identity, no permute.
let order = punktfunk_core::audio::spa_positions(channels as u8);
let mut positions = [0u32; 64];
positions[..order.len()].copy_from_slice(order);
info.set_position(positions);
let obj = pw::spa::pod::Object {
type_: pw::spa::utils::SpaTypes::ObjectParamFormat.as_raw(),
id: pw::spa::param::ParamType::EnumFormat.as_raw(),
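The jitter-buffer arithmetic in the process callback is easier to check in isolation. The sketch below restates it as two free functions; `jitter_target` and `trim_ring` are hypothetical names, and the one-shot `drain` stands in for the diff's `pop_front` loop (it removes the same number of samples):

```rust
use std::collections::VecDeque;

/// Samples to hold before playback starts: about three quanta, clamped to
/// 720..=9600 frames. `want` is the quantum in interleaved samples.
fn jitter_target(want: usize, channels: usize) -> usize {
    (3 * want).clamp(720 * channels, 9600 * channels)
}

/// Drop the oldest samples until the ring holds at most one quantum of slack
/// beyond the target. Returns how many samples were dropped.
fn trim_ring(ring: &mut VecDeque<f32>, want: usize, channels: usize) -> usize {
    let limit = jitter_target(want, channels).max(want) + want;
    let excess = ring.len().saturating_sub(limit);
    drop(ring.drain(..excess));
    excess
}
```

Because `want` and both clamp bounds are multiples of `channels`, the limit falls on a frame boundary, so trimming a ring that holds whole frames never splits a frame.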
+50 -7
@@ -20,6 +20,8 @@ pub struct SessionParams {
pub compositor: CompositorPref,
pub gamepad: GamepadPref,
pub bitrate_kbps: u32,
/// Requested audio channel count (2/6/8); the host echoes the resolved value.
pub audio_channels: u8,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Pinned host fingerprint; `None` = trust on first use (caller persists the observed one).
@@ -83,6 +85,42 @@ fn now_ns() -> u64 {
.unwrap_or(0)
}
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
48_000,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
48_000, l.streams, l.coupled, l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
fn pump(
params: SessionParams,
ev_tx: async_channel::Sender<SessionEvent>,
@@ -96,7 +134,8 @@ fn pump(
params.compositor,
params.gamepad,
params.bitrate_kbps,
0, // video_caps: the Linux client has no 10-bit/HDR present path yet
0, // video_caps: the Linux client has no 10-bit/HDR present path yet
params.audio_channels,
None, // launch: the Linux client has no library picker yet
params.pin,
Some(params.identity),
@@ -134,11 +173,14 @@ fn pump(
}
};
// Audio is best-effort: a session without it still streams. Gamepads are the
// app-lifetime service's job (the UI attaches it on Connected).
let player = audio::AudioPlayer::spawn()
// app-lifetime service's job (the UI attaches it on Connected). Build the decoder + playback
// from the host-RESOLVED channel count (never the request), so an older/clamping host that
// resolves stereo is decoded as stereo.
let channels = connector.audio_channels;
let player = audio::AudioPlayer::spawn(channels as u32)
.map_err(|e| tracing::warn!(error = %e, "audio disabled"))
.ok();
let mut opus_dec = opus::Decoder::new(48_000, opus::Channels::Stereo)
let mut opus_dec = AudioDec::new(channels)
.map_err(|e| tracing::warn!(error = %e, "opus decoder failed — audio disabled"))
.ok();
let _mic = params
@@ -157,8 +199,8 @@ fn pump(
let mut bytes_n = 0u64;
let mut decode_us_sum = 0u64;
let mut lat_us: Vec<u64> = Vec::with_capacity(256);
let mut pcm = vec![0f32; 5760 * 2]; // decode scratch: max Opus frame (120 ms stereo)
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut pcm = vec![0f32; 5760 * channels as usize]; // scratch: max Opus frame (120 ms) × channels
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut last_dropped = connector.frames_dropped();
let mut last_kf_req: Option<Instant> = None;
@@ -221,7 +263,8 @@ fn pump(
while let Ok(pkt) = connector.next_audio(Duration::ZERO) {
if let (Some(player), Some(dec)) = (&player, opus_dec.as_mut()) {
match dec.decode_float(&pkt.data, &mut pcm, false) {
Ok(samples) => player.push(pcm[..samples * 2].to_vec()),
// `samples` is per-channel; the interleaved frame is `samples * channels`.
Ok(samples) => player.push(pcm[..samples * channels as usize].to_vec()),
Err(e) => tracing::debug!(error = %e, "opus decode"),
}
}
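The scratch buffer and the pushed slice both scale with the resolved channel count. A small sketch of that arithmetic, assuming the 48 kHz rate used throughout this change; `MAX_FRAME`, `scratch_len` and `decoded` are illustrative names, not identifiers from the diff:

```rust
const SAMPLE_RATE: usize = 48_000;
/// An Opus packet carries at most 120 ms: 48_000 * 120 / 1000 = 5760
/// samples per channel.
const MAX_FRAME: usize = SAMPLE_RATE * 120 / 1000;

/// Length of the interleaved decode scratch buffer for `channels`.
fn scratch_len(channels: u8) -> usize {
    MAX_FRAME * channels as usize
}

/// The decoded part of the scratch buffer. `samples` is the per-channel
/// count the decoder returned, so the interleaved length is
/// `samples * channels`.
fn decoded(pcm: &[f32], samples: usize, channels: u8) -> &[f32] {
    &pcm[..samples * channels as usize]
}
```

For a 5 ms 5.1 packet the decoder reports 240 samples per channel, so 1440 interleaved floats go to the player.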
+4
@@ -132,6 +132,9 @@ pub struct Settings {
pub inhibit_shortcuts: bool,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the decoder + playback layout.
pub audio_channels: u8,
}
impl Default for Settings {
@@ -145,6 +148,7 @@ impl Default for Settings {
compositor: "auto".into(),
inhibit_shortcuts: true,
mic_enabled: false,
audio_channels: 2,
}
}
}
+20
@@ -140,6 +140,16 @@ pub fn show(
input.add(&inhibit_row);
let audio = adw::PreferencesGroup::builder().title("Audio").build();
let surround_row = adw::ComboRow::builder()
.title("Audio channels")
.subtitle("Request stereo or surround (the host downmixes if its output has fewer)")
.model(&gtk::StringList::new(&[
"Stereo",
"5.1 Surround",
"7.1 Surround",
]))
.build();
audio.add(&surround_row);
let mic_row = adw::SwitchRow::builder()
.title("Stream microphone")
.subtitle("Send the default input device to the host's virtual microphone")
@@ -170,6 +180,11 @@ pub fn show(
compositor_row.set_selected(comp_i as u32);
inhibit_row.set_active(s.inhibit_shortcuts);
mic_row.set_active(s.mic_enabled);
surround_row.set_selected(match s.audio_channels {
6 => 1,
8 => 2,
_ => 0,
});
}
let dialog = adw::PreferencesDialog::new();
@@ -186,6 +201,11 @@ pub fn show(
.to_string();
s.inhibit_shortcuts = inhibit_row.is_active();
s.mic_enabled = mic_row.is_active();
s.audio_channels = match surround_row.selected() {
1 => 6,
2 => 8,
_ => 2,
};
s.save();
});
dialog.present(Some(parent));
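The combo row stores a list index while the settings store a channel count, so the two `match` blocks above must be inverses. A sketch of the pair as standalone functions, with hypothetical names, that makes the round trip easy to test:

```rust
/// Combo-row index for a stored channel count; unknown values show "Stereo".
fn index_for_channels(channels: u8) -> u32 {
    match channels {
        6 => 1,
        8 => 2,
        _ => 0,
    }
}

/// Channel count for a combo-row index; unknown indices fall back to stereo.
fn channels_for_index(index: u32) -> u8 {
    match index {
        1 => 6,
        2 => 8,
        _ => 2,
    }
}
```

The fallback arms mean a corrupt settings value or an unset selection resolves to stereo instead of an unsupported layout.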
+3 -4
@@ -18,8 +18,7 @@ tracing-subscriber = { version = "0.3", features = ["env-filter"] }
# LAN host discovery (`--discover`): browse the native `_punktfunk._udp` mDNS service the host
# advertises (same crate/version the host advertises with).
mdns-sd = "0.20"
# Linux-only: --mic-test's Opus encoder (libopus). The mic UPLINK itself is portable —
# only this synthetic-tone test rig needs the encoder.
[target.'cfg(target_os = "linux")'.dependencies]
# Opus: multistream DECODE of the host's audio plane (the surround validator) + `--mic-test`'s
# encoder. libopus is already in the graph via `punktfunk-core`'s quic feature; this exposes the
# name directly. Cross-platform (cmake-vendored), so the probe builds + validates everywhere.
opus = "0.3"
+51 -6
@@ -78,6 +78,10 @@ struct Args {
gamepad: GamepadPref,
/// `--bitrate KBPS` — request this encoder bitrate (kilobits/s); 0 = host default.
bitrate_kbps: u32,
/// `--audio-channels N` — request stereo (2), 5.1 (6) or 7.1 (8) audio; default 2. The probe
/// multistream-decodes the host's frames and asserts the per-channel sample count, so it's the
/// headless validator for the surround encode path.
audio_channels: u8,
/// `--launch ID` — ask the host to launch a library title in this session (a store-qualified
/// id from the host's `GET /api/v1/library`, e.g. `steam:570`). Host resolves it; `None` = none.
launch: Option<String>,
@@ -201,6 +205,11 @@ fn parse_args() -> Args {
compositor,
gamepad,
bitrate_kbps: get("--bitrate").and_then(|s| s.parse().ok()).unwrap_or(0),
audio_channels: punktfunk_core::audio::normalize_channels(
get("--audio-channels")
.and_then(|s| s.parse().ok())
.unwrap_or(2),
),
launch: get("--launch").map(str::to_string),
speed_test: get("--speed-test").and_then(|s| {
let (kbps, ms) = s.split_once(':')?;
@@ -385,13 +394,23 @@ async fn session(args: Args) -> Result<()> {
// `--launch ID` — host resolves it against its own library and runs it this session.
launch: args.launch.clone(),
// This headless tool just dumps the bitstream (no decode), so it can always claim
// 10-bit support. Gated by env so latency runs stay on the 8-bit baseline:
// PUNKTFUNK_CLIENT_10BIT=1 advertises VIDEO_CAP_10BIT to exercise the host Main10 path.
video_caps: if std::env::var_os("PUNKTFUNK_CLIENT_10BIT").is_some() {
punktfunk_core::quic::VIDEO_CAP_10BIT
} else {
0
// 10-bit / 4:4:4 support. Gated by env so latency runs stay on the 8-bit 4:2:0 baseline:
// PUNKTFUNK_CLIENT_10BIT=1 advertises VIDEO_CAP_10BIT (host Main10 path);
// PUNKTFUNK_CLIENT_444=1 advertises VIDEO_CAP_444 (host HEVC 4:4:4 path) — verify the
// resulting chroma with `ffprobe` on the `--out` .h265.
video_caps: {
let mut caps = 0u8;
if std::env::var_os("PUNKTFUNK_CLIENT_10BIT").is_some() {
caps |= punktfunk_core::quic::VIDEO_CAP_10BIT;
}
if std::env::var_os("PUNKTFUNK_CLIENT_444").is_some() {
caps |= punktfunk_core::quic::VIDEO_CAP_444;
}
caps
},
// `--audio-channels` (default stereo); the probe multistream-decodes + validates the
// host's frames to exercise the surround encode path headlessly.
audio_channels: args.audio_channels,
}
.encode(),
)
@@ -408,6 +427,8 @@ async fn session(args: Args) -> Result<()> {
bit_depth = welcome.bit_depth,
color = ?welcome.color,
hdr = welcome.color.is_hdr(),
chroma_444 = welcome.chroma_format == punktfunk_core::quic::CHROMA_IDC_444,
chroma_format_idc = welcome.chroma_format,
"session offer"
);
@@ -830,13 +851,37 @@ async fn session(args: Args) -> Result<()> {
hidout_pkts.clone(),
);
let conn2 = conn.clone();
// Build a multistream decoder for the host-RESOLVED layout so the probe actually decodes
// the surround stream (not just counts bytes) — the headless validator for the encode path.
let audio_channels = welcome.audio_channels;
tokio::spawn(async move {
use std::sync::atomic::Ordering::Relaxed;
let mut hdr_logged = false;
let layout = punktfunk_core::audio::layout_for(audio_channels, false);
let mut audio_dec =
opus::MSDecoder::new(48_000, layout.streams, layout.coupled, layout.mapping).ok();
let mut pcm = vec![0f32; 5760 * audio_channels as usize];
let mut audio_decoded_logged = false;
while let Ok(d) = conn2.read_datagram().await {
if let Some((_, _, opus)) = punktfunk_core::quic::decode_audio_datagram(&d) {
a.fetch_add(1, Relaxed);
ab.fetch_add(opus.len() as u64, Relaxed);
// Decode + validate: the per-channel sample count must be a legal Opus frame
// size; log the first success so a loopback test can assert surround decoded.
if let Some(dec) = audio_dec.as_mut() {
match dec.decode_float(opus, &mut pcm, false) {
Ok(samples) if !audio_decoded_logged => {
audio_decoded_logged = true;
tracing::info!(
channels = audio_channels,
samples_per_channel = samples,
"audio decoded (Opus multistream)"
);
}
Ok(_) => {}
Err(e) => tracing::debug!(error = %e, "probe audio decode"),
}
}
} else if punktfunk_core::quic::decode_rumble_datagram(&d).is_some() {
r.fetch_add(1, Relaxed);
} else if let Some(meta) = punktfunk_core::quic::decode_hdr_meta_datagram(&d) {
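The probe comment says the per-channel sample count must be a legal Opus size, but the hunk only logs the count. One way to turn that into an explicit check is sketched below. It assumes 48 kHz output and relies on the Opus rule that a packet lasts a multiple of 2.5 ms, up to 120 ms; `is_legal_packet_len` is a hypothetical helper, not code from this change:

```rust
/// True when `samples_per_channel` is a possible decoded length for one
/// Opus packet at 48 kHz: a non-zero multiple of 2.5 ms (120 samples),
/// no longer than 120 ms (5760 samples).
fn is_legal_packet_len(samples_per_channel: usize) -> bool {
    const STEP: usize = 120; // 2.5 ms at 48 kHz
    const MAX: usize = 5760; // 120 ms at 48 kHz
    samples_per_channel != 0
        && samples_per_channel <= MAX
        && samples_per_channel % STEP == 0
}
```

The host's 5 ms packets decode to 240 samples per channel, which passes; a count such as 100 would point at a decoder built for the wrong layout or a corrupt packet.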
+32 -2
@@ -39,6 +39,9 @@ const DECODERS: &[(&str, &str)] = &[
];
/// Bitrate presets in Mb/s; `0` = host default.
const BITRATES_MBPS: &[u32] = &[0, 10, 20, 30, 50, 80, 150];
/// Audio channel presets: `(channel count, display label)`. The host clamps to what it can
/// capture; the resolved count drives the decoder + WASAPI render layout.
const AUDIO_CHANNELS: &[(u8, &str)] = &[(2, "Stereo"), (6, "5.1 Surround"), (8, "7.1 Surround")];
#[derive(Clone, PartialEq)]
enum Screen {
@@ -598,6 +601,7 @@ fn connect(
compositor: CompositorPref::Auto,
gamepad: gamepad_pref,
bitrate_kbps: s.bitrate_kbps,
audio_channels: s.audio_channels,
mic_enabled: s.mic_enabled,
hdr_enabled: s.hdr_enabled,
decoder: DecoderPref::from_name(&s.decoder),
@@ -886,6 +890,23 @@ fn settings_page(ctx: &Arc<AppCtx>, set_screen: &AsyncSetState<Screen>) -> Eleme
s.save();
})
};
let ac_i = AUDIO_CHANNELS
.iter()
.position(|&(v, _)| v == s.audio_channels)
.unwrap_or(0) as i32;
let ac_names: Vec<String> = AUDIO_CHANNELS.iter().map(|&(_, l)| l.to_string()).collect();
let channels_combo = {
let ctx = ctx.clone();
ComboBox::new(ac_names)
.header("Audio channels")
.selected_index(ac_i)
.on_selection_changed(move |i: i32| {
let (v, _) = AUDIO_CHANNELS[(i.max(0) as usize).min(AUDIO_CHANNELS.len() - 1)];
let mut s = ctx.settings.lock().unwrap();
s.audio_channels = v;
s.save();
})
};
let header = grid((
text_block("Settings")
@@ -934,8 +955,17 @@ fn settings_page(ctx: &Arc<AppCtx>, set_screen: &AsyncSetState<Screen>) -> Eleme
.spacing(10.0),
);
let audio_card =
card(vstack((text_block("Audio").font_size(15.0).semibold(), mic_toggle)).spacing(10.0));
let audio_card = card(
vstack((
text_block("Audio").font_size(15.0).semibold(),
text_block("Request stereo or surround — the host downmixes if its output has fewer.")
.font_size(12.0)
.foreground(ThemeRef::SecondaryText),
channels_combo,
mic_toggle,
))
.spacing(10.0),
);
page(vec![
header.into(),
+28 -12
@@ -21,9 +21,9 @@ use std::time::Duration;
use wasapi::{DeviceEnumerator, Direction, SampleType, StreamMode, WaveFormat};
const SAMPLE_RATE: usize = 48_000;
/// The microphone uplink stays stereo (the host's virtual mic is stereo). The render path is
/// multichannel — its channel count + block align are runtime, driven by the host-resolved layout.
const CHANNELS: usize = 2;
/// 48 kHz stereo f32: 2 channels * 4 bytes = 8 bytes per frame.
const BLOCK_ALIGN: usize = CHANNELS * 4;
/// Mic frames are 20 ms (960 samples/channel) — any size ≤ 120 ms is fine host-side.
const MIC_FRAME: usize = 960;
@@ -34,9 +34,10 @@ pub struct AudioPlayer {
}
impl AudioPlayer {
/// Spawn the WASAPI render thread. Failure (no render endpoint on this box) is
/// survivable — the caller streams video-only.
pub fn spawn() -> Result<AudioPlayer> {
/// Spawn the WASAPI render thread for `channels` (2/6/8, canonical wire order
/// FL FR FC LFE RL RR SL SR). Failure (no render endpoint on this box) is survivable — the
/// caller streams video-only.
pub fn spawn(channels: u8) -> Result<AudioPlayer> {
// 64 × 5 ms = 320 ms of slack between the pump and the WASAPI loop.
let (pcm_tx, pcm_rx) = std::sync::mpsc::sync_channel::<Vec<f32>>(64);
let stop = Arc::new(AtomicBool::new(false));
@@ -45,14 +46,14 @@ impl AudioPlayer {
let thread = std::thread::Builder::new()
.name("punktfunk-audio".into())
.spawn(move || {
if let Err(e) = render_thread(pcm_rx, stop_t, ready_tx) {
if let Err(e) = render_thread(pcm_rx, stop_t, ready_tx, channels) {
tracing::warn!(error = format!("{e:#}"), "audio playback thread ended");
}
})
.context("spawn audio thread")?;
match ready_rx.recv_timeout(Duration::from_secs(3)) {
Ok(Ok(())) => {
tracing::info!("WASAPI render: 48 kHz stereo f32 (default endpoint)");
tracing::info!(channels, "WASAPI render: 48 kHz f32 (default endpoint)");
Ok(AudioPlayer {
pcm_tx,
stop,
@@ -66,8 +67,8 @@ impl AudioPlayer {
}
}
/// Queue one interleaved-stereo f32 chunk. Drops the chunk if the WASAPI side is wedged
/// (the renderer conceals the gap; never block the session pump).
/// Queue one interleaved f32 chunk (in the session's channel layout). Drops the chunk if the
/// WASAPI side is wedged (the renderer conceals the gap; never block the session pump).
pub fn push(&self, pcm: Vec<f32>) {
if let Err(TrySendError::Disconnected(_)) = self.pcm_tx.try_send(pcm) {
// Thread already dead — Drop will reap it; nothing to do per-chunk.
@@ -88,6 +89,7 @@ fn render_thread(
pcm_rx: Receiver<Vec<f32>>,
stop: Arc<AtomicBool>,
ready: SyncSender<Result<()>>,
channels: u8,
) -> Result<()> {
if let Err(e) = wasapi::initialize_mta()
.ok()
@@ -97,12 +99,26 @@ fn render_thread(
return Ok(());
}
let res = (|| -> Result<()> {
// F32LE interleaved: channels × 4 bytes/sample. Stereo (channels == 2) is byte-identical
// to the old fixed path (mask 0x3, block align 8).
let block_align = channels as usize * 4;
let device = DeviceEnumerator::new()
.context("DeviceEnumerator")?
.get_default_device(&Direction::Render)
.context("default render endpoint")?;
let mut audio_client = device.get_iaudioclient().context("IAudioClient")?;
let desired = WaveFormat::new(32, 32, &SampleType::Float, SAMPLE_RATE, CHANNELS, None);
// The explicit dwChannelMask is the wire order (FL FR FC LFE RL RR SL SR); 5.1 = 0x3F,
// 7.1 = 0x63F. WASAPI delivers channels in ascending mask-bit order, which equals the wire
// order, so the render mapping is the identity — no permute. `autoconvert` (below) lets the
// audio engine downmix when the endpoint has fewer speakers.
let desired = WaveFormat::new(
32,
32,
&SampleType::Float,
SAMPLE_RATE,
channels as usize,
Some(punktfunk_core::audio::wasapi_channel_mask(channels)),
);
let (default_period, _min_period) =
audio_client.get_device_period().context("device period")?;
let mode = StreamMode::EventsShared {
@@ -139,10 +155,10 @@ fn render_thread(
if avail_frames == 0 {
continue;
}
let want_bytes = avail_frames * BLOCK_ALIGN;
let want_bytes = avail_frames * block_align;
// Prime to ~3 quanta; cap at ~1 quantum of slack beyond that; re-prime on drain.
let target = (3 * want_bytes).clamp(720 * BLOCK_ALIGN, 9600 * BLOCK_ALIGN);
let target = (3 * want_bytes).clamp(720 * block_align, 9600 * block_align);
while ring.len() > target.max(want_bytes) + want_bytes {
ring.pop_front();
}
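The identity-mapping claim in the `dwChannelMask` comment can be checked from the standard Windows speaker bits. The sketch below is a stand-in for `punktfunk_core::audio::wasapi_channel_mask`, whose real implementation is not shown in this diff; the constant and function names here are illustrative:

```rust
// Standard WAVEFORMATEXTENSIBLE speaker-position bits.
const FRONT_LEFT: u32 = 0x1;
const FRONT_RIGHT: u32 = 0x2;
const FRONT_CENTER: u32 = 0x4;
const LOW_FREQUENCY: u32 = 0x8;
const BACK_LEFT: u32 = 0x10;
const BACK_RIGHT: u32 = 0x20;
const SIDE_LEFT: u32 = 0x200;
const SIDE_RIGHT: u32 = 0x400;

/// Canonical wire order FL FR FC LFE RL RR SL SR as speaker bits.
const WIRE_ORDER: [u32; 8] = [
    FRONT_LEFT, FRONT_RIGHT, FRONT_CENTER, LOW_FREQUENCY,
    BACK_LEFT, BACK_RIGHT, SIDE_LEFT, SIDE_RIGHT,
];

/// Channel mask for a supported count (2/6/8): the OR of the first
/// `channels` wire-order bits. `None` for any other count.
fn channel_mask(channels: u8) -> Option<u32> {
    match channels {
        2 | 6 | 8 => Some(
            WIRE_ORDER[..channels as usize]
                .iter()
                .fold(0u32, |mask, bit| mask | bit),
        ),
        _ => None,
    }
}

/// WASAPI orders channels by ascending mask bit, so the wire order needs no
/// permutation exactly when its bits are strictly increasing.
fn wire_order_is_ascending() -> bool {
    WIRE_ORDER.windows(2).all(|pair| pair[0] < pair[1])
}
```

This yields 0x3 for stereo, 0x3F for 5.1 and 0x63F for 7.1, matching the values quoted in the comment.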
+49 -6
@@ -23,6 +23,8 @@ pub struct SessionParams {
pub compositor: CompositorPref,
pub gamepad: GamepadPref,
pub bitrate_kbps: u32,
/// Requested audio channel count (2/6/8); the host echoes the resolved value.
pub audio_channels: u8,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Advertise 10-bit + HDR10 so the host may upgrade HDR content to a Main10/PQ stream.
@@ -94,6 +96,42 @@ fn now_ns() -> u64 {
.unwrap_or(0)
}
/// Opus decoder for the audio plane: a plain stereo decoder (the validated path) or a multistream
/// decoder for 5.1/7.1, both behind one `decode_float`. Built from the host-RESOLVED channel count
/// via the shared layout table.
enum AudioDec {
Stereo(opus::Decoder),
Surround(opus::MSDecoder),
}
impl AudioDec {
fn new(channels: u8) -> Result<AudioDec, opus::Error> {
if channels == 2 {
Ok(AudioDec::Stereo(opus::Decoder::new(
48_000,
opus::Channels::Stereo,
)?))
} else {
let l = punktfunk_core::audio::layout_for(channels, false);
Ok(AudioDec::Surround(opus::MSDecoder::new(
48_000, l.streams, l.coupled, l.mapping,
)?))
}
}
fn decode_float(
&mut self,
input: &[u8],
out: &mut [f32],
fec: bool,
) -> Result<usize, opus::Error> {
match self {
AudioDec::Stereo(d) => d.decode_float(input, out, fec),
AudioDec::Surround(d) => d.decode_float(input, out, fec),
}
}
}
fn pump(
params: SessionParams,
ev_tx: async_channel::Sender<SessionEvent>,
@@ -122,6 +160,7 @@ fn pump(
}
0
},
params.audio_channels,
None, // launch: the Windows client has no library picker yet
params.pin,
Some(params.identity),
@@ -161,11 +200,14 @@ fn pump(
let mut hardware = decoder.is_hardware();
let mut hdr = false;
// Audio is best-effort: a session without it still streams. Gamepads are the
// app-lifetime service's job (the UI attaches it on Connected).
let player = audio::AudioPlayer::spawn()
// app-lifetime service's job (the UI attaches it on Connected). Build the decoder + playback
// from the host-RESOLVED channel count (never the request), so an older/clamping host that
// resolves stereo is decoded as stereo.
let channels = connector.audio_channels;
let player = audio::AudioPlayer::spawn(channels)
.map_err(|e| tracing::warn!(error = %e, "audio disabled"))
.ok();
let mut opus_dec = opus::Decoder::new(48_000, opus::Channels::Stereo)
let mut opus_dec = AudioDec::new(channels)
.map_err(|e| tracing::warn!(error = %e, "opus decoder failed — audio disabled"))
.ok();
let _mic = params
@@ -184,8 +226,8 @@ fn pump(
let mut bytes_n = 0u64;
let mut decode_us_sum = 0u64;
let mut lat_us: Vec<u64> = Vec::with_capacity(256);
let mut pcm = vec![0f32; 5760 * 2]; // decode scratch: max Opus frame (120 ms stereo)
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut pcm = vec![0f32; 5760 * channels as usize]; // scratch: max Opus frame (120 ms) × channels
// Loss recovery: watch the host→client unrecoverable-drop count and ask for an IDR when it climbs.
let mut last_dropped = connector.frames_dropped();
let mut last_kf_req: Option<Instant> = None;
@@ -253,7 +295,8 @@ fn pump(
while let Ok(pkt) = connector.next_audio(Duration::ZERO) {
if let (Some(player), Some(dec)) = (&player, opus_dec.as_mut()) {
match dec.decode_float(&pkt.data, &mut pcm, false) {
Ok(samples) => player.push(pcm[..samples * 2].to_vec()),
// `samples` is per-channel; the interleaved frame is `samples * channels`.
Ok(samples) => player.push(pcm[..samples * channels as usize].to_vec()),
Err(e) => tracing::debug!(error = %e, "opus decode"),
}
}
+4
@@ -130,6 +130,9 @@ pub struct Settings {
pub inhibit_shortcuts: bool,
/// Stream the default microphone to the host's virtual mic source.
pub mic_enabled: bool,
/// Requested audio channel count: 2 (stereo), 6 (5.1) or 8 (7.1). The host clamps to what it
/// can capture; the resolved count drives the decoder + WASAPI render layout.
pub audio_channels: u8,
/// Advertise 10-bit + HDR10 so the host upgrades HDR content to a Main10/PQ stream (the client
/// presents it on a 10-bit ST.2084 swapchain). No effect on SDR content.
pub hdr_enabled: bool,
@@ -148,6 +151,7 @@ impl Default for Settings {
compositor: "auto".into(),
inhibit_shortcuts: true,
mic_enabled: false,
audio_channels: 2,
hdr_enabled: true,
decoder: "auto".into(),
}