punktfunk/design/gamepad-driver-health.md
enricobuehler c21549c136
feat(host/windows,drivers): gamepad driver attach/heartbeat health surfaced in logs
The gamepad drivers have no IOCTL plane (hidclass gates the stack), so
until now the host had ZERO visibility into whether a driver ever
bound: a pad could be "created" with no driver installed and nothing
was logged. Two health fields are carved from reserved shm space
(layout-compatible; pf-driver-proto pins the offsets): driver_proto —
stamped by pf-xusb at device add + per serviced XInput IOCTL (movement
= the game-visible path) and by pf-dualsense/DS4 from its ~125Hz timer
— and driver_heartbeat. Host-side, every pad owns a DriverAttach
watcher fed from the existing service() poll: INFO on attach (WARN on
proto mismatch), and after 3s of silence ONE diagnosis WARN combining
a cached pnputil /enum-drivers store check, the devnode's CM problem
code (CM_Locate_DevNodeW/CM_Get_DevNode_Status on the instance id now
captured from the create callback, with plain-language hints: 28 = not
installed, 52 = signature/Memory Integrity, …) and the driver's debug
log path. Also fixes a real bug both SwDeviceCreate wrappers shared:
the 10s WaitForSingleObject result was ignored and the callback
HRESULT zero-initialised, so a PnP timeout read as SUCCESS (now E_FAIL
init + explicit timeout error). Failure-mode table:
design/gamepad-driver-health.md.

Linux workspace green; Windows host + drivers CI-compile only, on-box
recipe at the bottom of the design doc.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-07-02 16:33:56 +00:00


Windows gamepad-driver health: failure modes and how each is surfaced

Written for the "host doesn't see the client's gamepad" class of bug report (2026-07-02). The Windows virtual pads have many silent ways to fail: the stack spans the host process, a named shared-memory section, a PnP software devnode, a UMDF driver in its own WUDFHost.exe, and finally the game's input API (XInput / HID / SDL). Before this work the host logged only its own create calls, so a pad could "exist" with no driver installed and nothing was ever logged. This document enumerates the failure modes and states, for each, how it is now detected and what the log line says. All host lines land in stderr / %ProgramData%\punktfunk\logs\host.log (service) and in the in-memory ring served at GET /api/v1/logs, which feeds the web console Logs page.

The health signals

The gamepad drivers have no IOCTL plane (hidclass gates the device stack), so the only cross-process channel is the shared section itself. Two fields were carved out of reserved space (layout-compatible; old drivers simply never write them, pf-driver-proto pins the offsets):

| field | XusbShm | PadShm | writer | meaning |
|---|---|---|---|---|
| driver_proto | @32 | @144 | driver | GAMEPAD_PROTO_VERSION once attached; 0 = no driver on this section |
| driver_heartbeat | @36 | @148 | driver | XUSB: +1 per serviced XInput IOCTL (game-visible path). DS/DS4: +1 per ~8 ms timer tick (liveness) |
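
A minimal sketch of how a host-side reader could pull the two fields out of a mapped section, using the offsets from the table above. It assumes each field is a 4-byte little-endian integer (the offsets are 4 bytes apart, but the width is not stated here) and that the section is available as a plain byte slice. The names `DriverHealth`, `read_health` and the offset constants are hypothetical; the real layout is pinned by pf-driver-proto, and a real reader of live shared memory would need volatile or atomic loads rather than slice indexing.

```rust
/// Byte offsets of the health fields, copied from the table above.
pub const XUSB_DRIVER_PROTO: usize = 32;
pub const XUSB_DRIVER_HEARTBEAT: usize = 36;
pub const PAD_DRIVER_PROTO: usize = 144;
pub const PAD_DRIVER_HEARTBEAT: usize = 148;

/// Snapshot of the two driver-written fields (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DriverHealth {
    /// 0 means no driver has attached to this section.
    pub proto: u32,
    pub heartbeat: u32,
}

/// Read a little-endian u32 at `offset`; None if the view is too short.
/// ASSUMPTION: the fields are 4-byte little-endian integers.
fn read_u32_le(view: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let b = view.get(offset..end)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Read both health fields from a copy of the section.
pub fn read_health(view: &[u8], proto_off: usize, heartbeat_off: usize) -> Option<DriverHealth> {
    Some(DriverHealth {
        proto: read_u32_le(view, proto_off)?,
        heartbeat: read_u32_le(view, heartbeat_off)?,
    })
}
```

An all-zero section reads as `proto == 0`, which is also what a pre-health driver looks like, since it never writes either field.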

Host side, every pad owns a DriverAttach watcher (inject/windows/gamepad_raii.rs), fed from the existing service() poll. State machine, each transition logs exactly once:

  • driver_proto != 0 → INFO gamepad driver attached to the shared section (with late=true if it came after the warning); WARN on a proto/host version mismatch.
  • 3 s of silence → one diagnosis WARN combining: driver-store check (pnputil /enum-drivers, cached once per process, only run on the failure path), devnode PnP status (CM_Locate_DevNodeW + CM_Get_DevNode_Status on the instance id captured from the SwDeviceCreate callback, with a plain-language hint per CM problem code), and the driver's own debug log path.
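
The state machine above can be sketched roughly as follows. This is an illustration, not the code in gamepad_raii.rs: `AttachWatcher`, `Event` and `ATTACH_TIMEOUT` are hypothetical names, the caller is assumed to pass in the current `driver_proto` value and a timestamp on every poll, and the sketch assumes a version mismatch replaces the attach INFO rather than accompanying it.

```rust
use std::time::{Duration, Instant};

/// Silence threshold before the one-shot diagnosis (3 s per the text above).
const ATTACH_TIMEOUT: Duration = Duration::from_secs(3);

/// What the caller should log for a transition (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Event {
    /// INFO: driver attached; `late` is true if the WARN already fired.
    Attached { late: bool },
    /// WARN: driver wrote a proto version the host does not expect.
    ProtoMismatch { driver: u32, host: u32 },
    /// WARN: no driver wrote to the section within the timeout.
    AttachTimeout,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    Waiting,
    Warned,
    Attached,
}

pub struct AttachWatcher {
    host_proto: u32,
    created_at: Instant,
    state: State,
}

impl AttachWatcher {
    pub fn new(host_proto: u32, created_at: Instant) -> Self {
        AttachWatcher { host_proto, created_at, state: State::Waiting }
    }

    /// Feed one poll sample. Returns Some only when a transition happens,
    /// so each transition is reported exactly once.
    pub fn poll(&mut self, driver_proto: u32, now: Instant) -> Option<Event> {
        let state = self.state;
        match state {
            State::Attached => None,
            State::Waiting | State::Warned if driver_proto != 0 => {
                self.state = State::Attached;
                if driver_proto != self.host_proto {
                    // ASSUMPTION: the mismatch WARN replaces the attach INFO.
                    Some(Event::ProtoMismatch { driver: driver_proto, host: self.host_proto })
                } else {
                    Some(Event::Attached { late: state == State::Warned })
                }
            }
            State::Waiting
                if now.saturating_duration_since(self.created_at) >= ATTACH_TIMEOUT =>
            {
                self.state = State::Warned;
                Some(Event::AttachTimeout)
            }
            State::Waiting | State::Warned => None,
        }
    }
}
```

Keeping the expensive checks (driver store, devnode status) behind the `AttachTimeout` event matches the "only run on the failure path" note: a healthy pad never pays for them.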

Failure modes

| # | failure | cause examples | detection | surfaced as |
|---|---|---|---|---|
| 1 | Driver package not installed | fresh box, installer's driver install --gamepad skipped/failed, package pruned | attach timeout → pnputil /enum-drivers misses pf_xusb.inf/pf_dualsense.inf | WARN driver package NOT in the driver store — run: punktfunk-host.exe driver install --gamepad |
| 2 | Package present but binding failed | certificate not in Root/TrustedPublisher, Memory Integrity (HVCI) rejects it, stale DriverVer kept the old binary | attach timeout → devnode problem code (28 = drivers not installed, 52 = signature rejected, 31/39 = load failure) | WARN with the CM problem code + hint |
| 3 | Driver bound but crashed / never started | WUDFHost crash, WdfDeviceCreate/queue failure inside the driver | attach timeout → devnode status shows driver_loaded/started flags; the driver's own log (C:\Users\Public\pf*-driver.log) has the failing WDF call | WARN referencing both |
| 4 | SwDeviceCreate fails outright | not Administrator/SYSTEM, PnP wedged, _ in enumerator (E_INVALIDARG) | existing error path (unchanged) | WARN SwDeviceCreate failed; … devnode unavailable, pad continues on the out-of-band fallback |
| 5 | SwDeviceCreate callback never fires | PnP service hung | was silently mis-read as success (zero-init HRESULT(0) + ignored WaitForSingleObject return). Fixed: result inits to E_FAIL, the wait result is checked | ERROR enumeration callback never fired (10s) — PnP may be wedged |
| 6 | Driver attached, then WUDFHost died mid-session | crash, killed | driver_heartbeat freezes (DS/DS4: timer-driven, so a freeze is conclusive; XUSB: only advances while a game polls, so absence is not an error) | field exists for a future stall check; not auto-warned yet (XUSB semantics make a generic rule false-positive-prone) |
| 7 | Version skew host↔driver | new host + old installed driver (or vice versa) | driver_proto ≠ host's GAMEPAD_PROTO_VERSION; pre-health drivers read as never-attached | WARN driver/host protocol mismatch — update the drivers (mismatch) / the mode-1 diagnosis text notes the pre-health case |
| 8 | Whole backend latched off | first pad creation failed → broken latch disables pads for the session | existing behaviour, now with remedy text | ERROR …controller input disabled until the next client connect (install/repair: punktfunk-host.exe driver install --gamepad) |
| 9 | Section created but game can't see the pad | XInput slot ordering, HidHide-style HID filters on the game process, RPCS3 pad-handler config, GameInput's instance-path VID/PID parse | not host-detectable: outside our process and the driver's stack. XUSB driver_heartbeat advancing proves "some XInput client polls us", which brackets the problem to the game's side | diagnosis text points at the driver log; the client-side controller view (Android "Connected controllers") covers the other end of the chain |
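
For mode 2, the mapping from CM problem code to plain-language hint might look something like the sketch below. Only the four codes named in the table are covered, and the hint strings and the function names `problem_hint` and `diagnosis_line` are illustrative; the wording the host actually logs is not reproduced here.

```rust
/// Map a devnode CM problem code to a short hint.
/// Covers only the codes listed in the failure-mode table; the strings
/// are illustrative, not the host's actual log text.
pub fn problem_hint(code: u32) -> &'static str {
    match code {
        28 => "drivers not installed",
        52 => "signature rejected (certificate trust or Memory Integrity)",
        31 | 39 => "driver load failure",
        _ => "no hint for this code",
    }
}

/// Build one human-readable diagnosis fragment for the WARN line.
pub fn diagnosis_line(code: u32) -> String {
    format!("devnode problem code {}: {}", code, problem_hint(code))
}
```

Falling through to a generic string for unknown codes keeps the numeric code in the log, so a code missing from the table can still be looked up by hand.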

What deliberately did NOT change

  • The broken latch stays one-way per session (retry loops against a missing driver would spam PnP); the log line now says so and gives the remedy.
  • No mgmt-API health endpoint yet — the log ring is the surfacing channel. If the web console ever grows a "gamepad health" card, DriverAttach is the state to expose.
  • The DS/DS4 heartbeat is not yet watched for mid-session stalls (mode 6): worth adding once the XUSB/DS semantics split is encoded (DS freeze = conclusive, XUSB freeze = normal when no game polls).
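
One way the XUSB/DS semantics split could be encoded once a stall check is added, shown only as a sketch of the idea. Nothing like this exists in the host yet, and `PadKind`, `Stall` and `classify` are hypothetical names. The caller is assumed to sample `driver_heartbeat` twice, with a gap much longer than the ~8 ms timer tick.

```rust
/// Which heartbeat semantics apply to a pad (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PadKind {
    /// XUSB: heartbeat advances only while a game polls XInput.
    Xusb,
    /// DS/DS4: heartbeat advances from the driver's own timer.
    TimerDriven,
}

/// Outcome of comparing two heartbeat samples.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stall {
    /// Counter moved: the driver is servicing the section.
    Alive,
    /// Counter frozen on a timer-driven pad: the driver is gone or hung.
    Conclusive,
    /// Counter frozen on XUSB: normal when no game is polling.
    Inconclusive,
}

/// Classify two samples taken some interval apart.
/// Any change counts as progress, so counter wraparound is harmless.
pub fn classify(kind: PadKind, previous: u32, current: u32) -> Stall {
    if current != previous {
        return Stall::Alive;
    }
    match kind {
        PadKind::TimerDriven => Stall::Conclusive,
        PadKind::Xusb => Stall::Inconclusive,
    }
}
```

Only `Stall::Conclusive` would justify a WARN; treating the XUSB case as inconclusive avoids the false positives that a generic rule would produce.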

Validation status

  • Linux-side: workspace build/tests/clippy green; pf-driver-proto layout asserts pin the new offsets (compile-time).
  • Windows host code + both drivers: compile-checked in CI only (this box cannot cross-build the native deps); not yet on-box validated. On-box test recipe: stop the service, pnputil /delete-driver the gamepad package, connect a client with a pad → expect the mode-1 WARN in the console Logs page within ~3 s of the pad arriving; reinstall drivers → expect the attached (late=true) INFO on the next session.