punktfunk/crates/punktfunk-host/src/audio/windows/wasapi_mic.rs
enricobuehler 2c7ded0f3c
fix(host/audio): rebuild mic passthrough — eager, self-healing virtual mic on both hosts
Mic passthrough silently died on real hosts. Root causes, all fixed:

- No liveness anywhere: a PipeWire restart (Linux) or any WASAPI device
  error (Windows) killed the backend worker; push() fed the dead queue
  for the rest of the host's life. VirtualMic now has a liveness
  contract (push -> bool, alive(), discard()) and the new shared
  audio::MicPump reopens with backoff, probing on an idle heartbeat so
  the mic heals BETWEEN sessions too. Validated live: systemctl restart
  pipewire -> node back in ~0.5 s, tone flows through the reopened
  backend.

- Lazy creation: the mic device didn't exist until the first 0xCB
  frame, but games bind their capture device at launch and never
  re-follow. The pump opens eagerly at host start (node exists with
  zero clients, elected default source).

- Windows headless dead-end: with VB-CABLE as the ONLY render endpoint
  (exactly what the installer ships), the anti-echo guard rejected the
  cable as the default render endpoint -> mic permanently dead. The new
  wiring_plan (pure, unit-tested on every platform) assigns the mic its
  endpoint FIRST (cable reserved for the mic), points the loopback at a
  DIFFERENT endpoint, and the capture side now yields (explicit
  endpoint or honest error) instead of the mic dying. Plan recomputed
  per (re)open — endpoints churn at boot/logon/driver installs.

- Stale bursts: buffered audio from a previous session played into a
  newly-attached recorder (observed live). Timestamped chunks + a
  consumer-gap check in the process callback age everything past 1 s.
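
The Windows wiring rule from the third bullet (mic picks first, loopback
gets a different endpoint) can be pictured with a small pure function. This
is only a sketch under stated assumptions: `Plan`, `plan`, `MIC_HINTS` and
the substring matching are hypothetical names and rules chosen for
illustration, not the actual `wiring_plan` code.

```rust
/// Hypothetical result type; the real `wiring_plan` output is not shown here.
#[derive(Debug, PartialEq)]
struct Plan {
    /// Render endpoint the mic PCM is injected into (reserved for the mic).
    mic_render: Option<String>,
    /// Render endpoint the desktop-audio loopback captures; never the mic's.
    loopback_render: Option<String>,
}

/// Assumed preference order, loosely following the candidate list in the module docs.
const MIC_HINTS: [&str; 4] = [
    "cable input",
    "steam streaming microphone",
    "voicemeeter",
    "virtual",
];

fn plan(render_endpoints: &[&str], mic_override: Option<&str>) -> Plan {
    // Case-insensitive substring match against the endpoint friendly names.
    let find = |needle: &str| {
        render_endpoints
            .iter()
            .find(|name| name.to_lowercase().contains(&needle.to_lowercase()))
            .map(|name| (*name).to_owned())
    };
    // 1. The mic picks first: explicit override, else the first hint that matches.
    let mic_render = match mic_override {
        Some(needle) => find(needle),
        None => MIC_HINTS.iter().find_map(|hint| find(*hint)),
    };
    // 2. The loopback takes the first endpoint that is NOT the mic's.
    //    `None` here means the capture side has to report an error.
    let loopback_render = render_endpoints
        .iter()
        .find(|name| Some(**name) != mic_render.as_deref())
        .map(|name| (*name).to_owned());
    Plan { mic_render, loopback_render }
}
```

With the cable as the only render endpoint, this sketch gives the mic the
cable and leaves the loopback empty, which is the "honest error" outcome
described above rather than a dead mic.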

The Linux node mechanism stays the stream-based Audio/Source with
RT_PROCESS + priority.session: the canonical null-audio-sink adapter
recipe was tested on this box (PipeWire 1.6.2) and never gets a clock
(QUANT 0 -> pure silence), and WirePlumber reroutes a feeder targeting
it to the default sink (echo). Decision documented in the module docs.

Live-validated on this box (synthetic host + probe --mic-test,
pw-record): eager node, both attach orderings, PipeWire-restart
self-heal, post-session silence. Windows side compile/CI + on-glass
validation pending.
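
The self-heal behaviour validated above rests on the liveness contract
(push -> bool) plus a reopen loop with backoff. A minimal sketch of that
idea follows. The `Mic` trait, `Pump` type and the backoff bounds are
hypothetical stand-ins, not the real `VirtualMic` / `audio::MicPump` code,
and the idle heartbeat probe is left out for brevity.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the liveness half of the `VirtualMic` contract.
trait Mic {
    /// Returns false once the backend worker has died.
    fn push(&self, pcm: &[f32]) -> bool;
}

// Assumed backoff bounds; the commit message does not state the real values.
const MIN_BACKOFF: Duration = Duration::from_millis(250);
const MAX_BACKOFF: Duration = Duration::from_secs(8);

/// Hypothetical pump: owns the backend and reopens it with exponential backoff.
struct Pump<F: FnMut() -> Option<Box<dyn Mic>>> {
    open: F,
    mic: Option<Box<dyn Mic>>,
    backoff: Duration,
    next_try: Instant,
}

impl<F: FnMut() -> Option<Box<dyn Mic>>> Pump<F> {
    fn new(open: F, now: Instant) -> Self {
        Pump { open, mic: None, backoff: MIN_BACKOFF, next_try: now }
    }

    /// Feed PCM; returns true if a live backend accepted it.
    fn push(&mut self, pcm: &[f32], now: Instant) -> bool {
        if self.mic.as_ref().map_or(false, |m| m.push(pcm)) {
            return true;
        }
        // No backend, or it reported itself dead: drop it and try to reopen.
        self.mic = None;
        if now < self.next_try {
            return false; // still inside the backoff window
        }
        match (self.open)() {
            Some(mic) => {
                self.backoff = MIN_BACKOFF;
                let accepted = mic.push(pcm);
                self.mic = Some(mic);
                accepted
            }
            None => {
                // Failed open: wait, then double the delay up to the cap.
                self.next_try = now + self.backoff;
                self.backoff = (self.backoff * 2).min(MAX_BACKOFF);
                false
            }
        }
    }
}
```

Passing `now` in explicitly keeps the sketch testable without sleeping; a
real pump would more likely read the clock itself.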

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-03 20:41:19 +00:00

334 lines
14 KiB
Rust

//! WASAPI virtual microphone (Windows) — the inverse of [`super::wasapi_cap`]. Windows has no
//! user-mode way to *create* a capture (microphone) endpoint, so we target an EXISTING virtual audio
//! device and write the client's decoded mic PCM into that device's **render** endpoint; the device's
//! **capture** endpoint then surfaces as a microphone that host apps can record from.
//!
//! The target comes from the [`audio_control::wire_now`] plan (recomputed on every open): VB-Audio
//! "CABLE Input" (bundled by the installer — the dedicated mic target), the Steam Streaming
//! Microphone, VoiceMeeter, or anything with "virtual" in the name; `PUNKTFUNK_MIC_DEVICE` overrides.
//! The plan reserves the mic target and points the desktop-audio loopback at a DIFFERENT endpoint, so
//! injecting here can never echo into the host→client audio stream (see
//! [`wiring_plan`](super::wiring_plan) for the precedence rules and the headless cable-only case).
//! If no candidate is present we auto-install the Steam Streaming audio pair (see
//! [`install_steam_audio_pair`]); failing that we return an error with install guidance and the
//! caller (the mic pump) retries with backoff — a cable that appears later (driver install finishing
//! after boot) is picked up without a host restart.
//!
//! **Liveness.** Any WASAPI error in the render loop (endpoint invalidated/removed, audio engine
//! restart) exits the worker thread, which flips the `alive` flag — [`VirtualMic::push`] then
//! returns `false` and the pump reopens (re-planning, so endpoint churn re-resolves). Before this
//! existed, the first device change silently killed mic passthrough for the rest of the host's life.
//!
//! `push` enqueues decoded interleaved-f32 PCM into a bounded ring (drop-oldest beyond ~80 ms so mic
//! latency stays bounded); a dedicated COM-apartment thread renders it event-driven, filling silence
//! when the client isn't talking. WASAPI objects are `!Send`, so they live entirely on that thread
//! (mirrors `WasapiLoopbackCapturer`).
// Every `unsafe` block in this file carries a `// SAFETY:` proof; enforce it.
#![deny(clippy::undocumented_unsafe_blocks)]

use super::{audio_control, VirtualMic, SAMPLE_RATE};
use anyhow::{anyhow, Context, Result};
use std::collections::VecDeque;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{sync_channel, SyncSender};
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;
use wasapi::{Direction, SampleType, StreamMode, WaveFormat};

const CHANNELS: u32 = 2;
/// 48 kHz stereo f32: 2 channels * 4 bytes.
const BLOCK_ALIGN: usize = 2 * 4;
/// Bound the inject queue at ~80 ms so the passed-through mic stays low-latency (drop oldest beyond).
const MAX_QUEUE_BYTES: usize = (SAMPLE_RATE as usize * 80 / 1000) * BLOCK_ALIGN;
pub struct WasapiVirtualMic {
    queue: Arc<Mutex<VecDeque<u8>>>,
    stop: Arc<AtomicBool>,
    /// False once the render thread has exited (device error or stop) — the pump's reopen signal.
    alive: Arc<AtomicBool>,
    join: Option<JoinHandle<()>>,
}
impl WasapiVirtualMic {
    pub fn open(channels: u32) -> Result<Self> {
        anyhow::ensure!(
            channels == CHANNELS,
            "virtual mic is stereo-only (got {channels})"
        );
        let queue = Arc::new(Mutex::new(VecDeque::<u8>::new()));
        let stop = Arc::new(AtomicBool::new(false));
        let alive = Arc::new(AtomicBool::new(true));
        // Bring-up handshake: report the resolved device (or the error) before returning, so a missing
        // virtual-mic device surfaces as Err (the caller retries with backoff) not a silent dead thread.
        let (ready_tx, ready_rx) = sync_channel::<Result<String>>(1);
        let (q, st, al) = (queue.clone(), stop.clone(), alive.clone());
        let join = thread::Builder::new()
            .name("punktfunk-wasapi-mic".into())
            .spawn(move || {
                if let Err(e) = render_thread(q, st, ready_tx) {
                    tracing::error!(error = %format!("{e:#}"), "wasapi virtual-mic thread failed");
                }
                // Normal stop or device error alike: this instance is done — the pump reopens.
                al.store(false, Ordering::Release);
            })
            .context("spawn wasapi mic thread")?;
        match ready_rx.recv_timeout(Duration::from_secs(5)) {
            Ok(Ok(name)) => {
                tracing::info!(device = %name,
                    "WASAPI virtual mic ready (client mic → this device's render endpoint)");
                Ok(WasapiVirtualMic {
                    queue,
                    stop,
                    alive,
                    join: Some(join),
                })
            }
            Ok(Err(e)) => Err(e),
            Err(_) => {
                // The worker is detached once `join` drops here. Ask it to stop so a slow
                // bring-up (e.g. a driver auto-install) cannot leave an orphaned render
                // thread holding the endpoint after we have already reported failure.
                stop.store(true, Ordering::SeqCst);
                Err(anyhow!("wasapi virtual-mic init timed out"))
            }
        }
    }
}
impl Drop for WasapiVirtualMic {
    fn drop(&mut self) {
        self.stop.store(true, Ordering::SeqCst);
        if let Some(j) = self.join.take() {
            let _ = j.join();
        }
    }
}
impl VirtualMic for WasapiVirtualMic {
    fn push(&self, pcm: &[f32]) -> bool {
        if !self.alive.load(Ordering::Acquire) {
            return false;
        }
        let Ok(mut q) = self.queue.lock() else {
            return false;
        };
        q.reserve(pcm.len() * 4);
        for &s in pcm {
            q.extend(s.to_le_bytes());
        }
        // Drop-oldest to keep latency bounded (mic is real-time; stale audio is worse than dropped).
        if q.len() > MAX_QUEUE_BYTES {
            let excess = q.len() - MAX_QUEUE_BYTES;
            q.drain(..excess);
        }
        true
    }

    fn alive(&self) -> bool {
        self.alive.load(Ordering::Acquire)
    }

    fn discard(&self) {
        if let Ok(mut q) = self.queue.lock() {
            q.clear();
        }
    }

    fn channels(&self) -> u32 {
        CHANNELS
    }
}
/// Resolve the mic inject target from the wiring plan, auto-installing the Steam Streaming pair
/// when nothing usable exists (then re-planning). Runs on the COM-initialized render thread.
fn resolve_target() -> Result<(wasapi::Device, String)> {
    let mut wiring = audio_control::wire_now();
    if wiring.mic_render.is_none() {
        tracing::info!("no usable virtual mic device present — attempting auto-install");
        // SAFETY: `install_steam_audio_pair` is `unsafe` only because it `LoadLibraryExW`s
        // `newdev.dll` and calls `DiInstallDriverW` through a `transmute`d function pointer;
        // calling it imposes no extra precondition here (it takes no args and aliases nothing).
        // Its internal contract holds: the `DiInstall` type matches the documented
        // `BOOL DiInstallDriverW(HWND, PCWSTR, DWORD, PBOOL)` ABI, and it passes a
        // NUL-terminated UTF-16 INF path with null/zero optional args. Invoked once on the
        // dedicated mic thread.
        if unsafe { install_steam_audio_pair() } {
            wiring = audio_control::wire_now();
        }
    }
    let Some(ep) = wiring.mic_render else {
        anyhow::bail!(
            "no virtual-mic render endpoint on this box. Install VB-Audio Virtual Cable (the host \
             installer bundles it) or enable Steam Remote Play's microphone (Steam Streaming \
             Microphone), or set PUNKTFUNK_MIC_DEVICE=<friendly-name substring>."
        );
    };
    let name = ep.0.clone();
    Ok((audio_control::open_endpoint(&ep)?, name))
}
/// Best-effort: install BOTH Steam Streaming audio devices (the "Steam pair") so mic passthrough
/// works out of the box and the host has a desktop-audio sink distinct from the mic. Steam Remote
/// Play ships `SteamStreamingMicrophone.inf` + `SteamStreamingSpeakers.inf`: the microphone gives the
/// virtual mic a target whose **capture** endpoint apps record from, and the speakers give a
/// **render** endpoint a headless box can loopback-capture that is NOT the mic — so the loopback and
/// the mic land on different devices and never echo (see [`super::wiring_plan`]). Returns true if
/// either installed. No-op when Steam isn't installed (INFs absent), the install is denied (needs
/// admin — the host runs as SYSTEM), or `PUNKTFUNK_NO_MIC_INSTALL` is set.
unsafe fn install_steam_audio_pair() -> bool {
    // Microphone first (the mic's actual target); speakers second (the distinct desktop-audio sink).
    let mic = try_install_steam_audio("SteamStreamingMicrophone.inf");
    let spk = try_install_steam_audio("SteamStreamingSpeakers.inf");
    mic || spk
}

/// Install one Steam Streaming driver INF by filename via `DiInstallDriverW` (loaded from
/// `newdev.dll`, like Apollo, to avoid an extra windows-crate feature). See
/// [`install_steam_audio_pair`] for the contract; `inf_name` is a bare filename under Steam's
/// per-arch `drivers\Windows10\{arch}\` directory.
unsafe fn try_install_steam_audio(inf_name: &str) -> bool {
    use windows::core::{s, w, PCWSTR};
    use windows::Win32::Foundation::HWND;
    use windows::Win32::System::Environment::ExpandEnvironmentStringsW;
    use windows::Win32::System::LibraryLoader::{
        GetProcAddress, LoadLibraryExW, LOAD_LIBRARY_SEARCH_SYSTEM32,
    };

    if std::env::var_os("PUNKTFUNK_NO_MIC_INSTALL").is_some() {
        return false;
    }
    // Steam ships per-arch driver INFs under `Steam\drivers\Windows10\{arch}\`.
    #[cfg(target_arch = "x86_64")]
    let subdir = "x64";
    #[cfg(target_arch = "aarch64")]
    let subdir = "arm64";
    #[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
    let subdir = "x86";
    let template: Vec<u16> =
        format!("%CommonProgramFiles(x86)%\\Steam\\drivers\\Windows10\\{subdir}\\{inf_name}")
            .encode_utf16()
            .chain(std::iter::once(0))
            .collect();
    let mut path = vec![0u16; 1024];
    let n = ExpandEnvironmentStringsW(PCWSTR(template.as_ptr()), Some(path.as_mut_slice()));
    if n == 0 || n as usize > path.len() {
        return false;
    }
    let Ok(newdev) = LoadLibraryExW(w!("newdev.dll"), None, LOAD_LIBRARY_SEARCH_SYSTEM32) else {
        tracing::warn!("could not load newdev.dll — Steam-audio auto-install unavailable");
        return false;
    };
    let Some(addr) = GetProcAddress(newdev, s!("DiInstallDriverW")) else {
        return false;
    };
    // BOOL DiInstallDriverW(HWND hwndParent, PCWSTR InfPath, DWORD Flags, PBOOL NeedReboot)
    type DiInstall = unsafe extern "system" fn(HWND, PCWSTR, u32, *mut i32) -> i32;
    let f: DiInstall = std::mem::transmute(addr);
    let ok = f(
        HWND(std::ptr::null_mut()),
        PCWSTR(path.as_ptr()),
        0,
        std::ptr::null_mut(),
    ) != 0;
    if ok {
        tracing::info!(
            inf = inf_name,
            "installed a Steam Streaming virtual audio device"
        );
        std::thread::sleep(Duration::from_secs(5)); // let the audio subsystem register the endpoint
    } else {
        let err = windows::Win32::Foundation::GetLastError();
        tracing::info!(
            inf = inf_name,
            ?err,
            "Steam-audio device not auto-installed (Steam absent / not admin) — see install guidance"
        );
    }
    ok
}
fn render_thread(
    queue: Arc<Mutex<VecDeque<u8>>>,
    stop: Arc<AtomicBool>,
    ready: SyncSender<Result<String>>,
) -> Result<()> {
    if let Err(e) = wasapi::initialize_mta()
        .ok()
        .context("CoInitializeEx (MTA)")
    {
        let _ = ready.send(Err(e));
        return Ok(());
    }
    // Open + start the render stream. The WASAPI objects must outlive the loop, so build them here and
    // keep them (a closure that *returned* them would drop them); on any failure report Err and exit.
    let setup = (|| -> Result<(wasapi::AudioClient, wasapi::AudioRenderClient, wasapi::Handle, String)> {
        let (device, name) = resolve_target()?;
        let mut audio_client = device.get_iaudioclient().context("IAudioClient")?;
        // 48 kHz stereo f32; autoconvert lets WASAPI shared-mode SRC match the device mix format.
        let desired = WaveFormat::new(
            32,
            32,
            &SampleType::Float,
            SAMPLE_RATE as usize,
            CHANNELS as usize,
            None,
        );
        let (default_period, _min) = audio_client.get_device_period().context("device period")?;
        let mode = StreamMode::EventsShared {
            autoconvert: true,
            buffer_duration_hns: default_period,
        };
        audio_client
            .initialize_client(&desired, &Direction::Render, &mode)
            .context("initialize render client")?;
        let h_event = audio_client.set_get_eventhandle().context("event handle")?;
        let render_client = audio_client
            .get_audiorenderclient()
            .context("IAudioRenderClient")?;
        // Pre-fill the whole buffer with silence so the stream starts cleanly (no startup glitch).
        let buf_frames = audio_client.get_buffer_size().context("buffer size")? as usize;
        let _ = render_client.write_to_device(buf_frames, &vec![0u8; buf_frames * BLOCK_ALIGN], None);
        audio_client.start_stream().context("start render stream")?;
        Ok((audio_client, render_client, h_event, name))
    })();
    let (audio_client, render_client, h_event, name) = match setup {
        Ok(t) => t,
        Err(e) => {
            let _ = ready.send(Err(anyhow!("{e:#}")));
            return Ok(());
        }
    };
    let _ = ready.send(Ok(name));
    // Any error below (endpoint invalidated/removed, engine restart) propagates out of the loop,
    // ending the thread — the `alive` flag flips in the spawn wrapper and the pump reopens.
    let mut buf: Vec<u8> = Vec::new();
    while !stop.load(Ordering::Relaxed) {
        // The device signals when it wants more data; finite timeout keeps `stop` responsive.
        if h_event.wait_for_event(100).is_err() {
            continue;
        }
        let space = audio_client
            .get_available_space_in_frames()
            .context("available space")? as usize;
        if space == 0 {
            continue;
        }
        let need = space * BLOCK_ALIGN;
        if buf.len() < need {
            buf.resize(need, 0);
        }
        // Silence base; overwrite with queued mic PCM (zero-pad the tail when the client is quiet).
        buf[..need].fill(0);
        {
            // A poisoned queue lock leaves through the normal error path, so `alive` still
            // flips in the spawn wrapper; an `unwrap` panic here would skip that store.
            let mut q = queue
                .lock()
                .map_err(|_| anyhow!("mic queue mutex poisoned"))?;
            let n = q.len().min(need);
            for (i, b) in q.drain(..n).enumerate() {
                buf[i] = b;
            }
        }
        render_client
            .write_to_device(space, &buf[..need], None)
            .context("write_to_device")?;
    }
    audio_client.stop_stream().ok();
    Ok(())
}