Both the unified host (serve --native) and standalone m3-host now advertise the native punktfunk/1 service over mDNS (_punktfunk._udp) — the analogue of the GameStream _nvstream._tcp advert. TXT records carry proto, the host cert fingerprint (fp, the value clients pin), the pairing requirement (pair=required|optional), and the host id. New crate::discovery module, wired into m3::serve so both host entry points get it; best-effort, never blocks streaming (--connect always works). Client gains `punktfunk-client-rs --discover [SECS]`: browses the LAN and prints each host (name, addr:port, pairing, fingerprint), then exits. Apple clients browse the same service natively via NWBrowser (service type + TXT keys are the contract). Validated cross-LAN: the dev box discovered the GNOME-box appliance (pair=required) and a standalone synthetic host (pair=optional); fingerprint and pairing state correct in both. Also refresh the now-stale sendmmsg caveat in the bitrate doc (batched/paced send landed + validated to 1 Gbps) and mark the encode|send thread split done in §12. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
+32
-7
@@ -289,17 +289,42 @@ buffer; `sendmmsg`/`recvmmsg` batching; the capture-timestamp anchor placement.
|
||||
the ~150 Mbps@60 frame size where drops began). Plus **per-frame instrumentation** (PUNKTFUNK_PERF):
|
||||
`encode_us` + `pace_us` p50/p99/max + immediate-vs-paced counts, so the cap is tunable against real
|
||||
numbers. **Validate with the LAN soak before raising the cap** (`send_dropped` must stay 0).
|
||||
- **Done & live (`b295a5b`; validated on the GNOME box 2026-06-12):** **encode|send thread split**
|
||||
on the native path — a dedicated `send_loop` thread owns the `Session` and does seal+pace+send+
|
||||
probes; the encode thread captures+encodes+handles reconfig and hands `FrameMsg` over a bounded
|
||||
`sync_channel(3)` with backpressure. Removes the serialization (~2–8 ms @60–120 fps) and is the
|
||||
substrate the slice wrapper needs. Real-NIC soak (host on the Ubuntu/GNOME box, client over the
|
||||
LAN): `send_dropped=0` at 720p60 / 1080p120, and a 1 Gbps probe pushed 625 MB in 5 s clean.
|
||||
- **Bigger bets (ordered, deferred — need real-NIC/GPU/Mac validation):**
|
||||
1. **Encode|send thread split** on the native path (port GameStream's `spawn_sender` + depth-2
|
||||
channel; `seal_frame` stays on the encode thread, `send_sealed` on a send thread) — removes the
|
||||
serialization (~2–8 ms @60–120 fps), and is the substrate the slice wrapper needs.
|
||||
2. **Wall-clock skew handshake + glass-to-glass probe** (`tools/latency-probe`) — measures the two
|
||||
1. **Wall-clock skew handshake + glass-to-glass probe** (`tools/latency-probe`) — measures the two
|
||||
biggest unmeasured terms (render→capture, decode→present); client present-stamp vs the AU's
|
||||
`pts_ns` (already attached).
|
||||
3. **CUDA stream+event** to drop one of two redundant `cuCtxSynchronize` in `submit_cuda` (keep the
|
||||
2. **CUDA stream+event** to drop one of two redundant `cuCtxSynchronize` in `submit_cuda` (keep the
|
||||
copy) — ~0.1–0.4 ms@720p, ~1 ms@5K; only if per-stage timing proves the sync is on the path.
|
||||
4. **Stage-2 Apple presenter** (`VTDecompressionSession` → `CAMetalLayer`, hand-paced) — ~0.5 refresh
|
||||
3. **Stage-2 Apple presenter** (`VTDecompressionSession` → `CAMetalLayer`, hand-paced) — ~0.5 refresh
|
||||
off the present tail (biggest client win at 60 Hz); gate on the probe proving present is real.
|
||||
5. **NVENC slice-mode wrapper** (roadmap §2 sub-frame pipelining) — per-slice transmit overlaps
|
||||
4. **NVENC slice-mode wrapper** (roadmap §2 sub-frame pipelining) — per-slice transmit overlaps
|
||||
encode+send within a frame (~3–6 ms at 4K/5K/IDR); large + driver-ABI-fragile, on top of the
|
||||
thread split, only after measurement justifies it.
|
||||
|
||||
## 13. Native-protocol LAN auto-discovery ✅ *(done — 2026-06-12, validated cross-LAN)*
|
||||
|
||||
The native protocol had no discovery — clients connected by `--connect HOST:PORT` only, while
|
||||
GameStream already auto-discovered via mDNS (`_nvstream._tcp`). Now both the unified host
|
||||
(`serve --native`) and standalone `m3-host` advertise the native service over mDNS:
|
||||
|
||||
- **Service**: `_punktfunk._udp.local.` (UDP — punktfunk/1 is QUIC; the advertised port is the QUIC
|
||||
control/data port). Host side: `crate::discovery::advertise_native`, wired into `m3::serve` so
|
||||
both host entry points get it; best-effort (a discovery failure never blocks streaming —
|
||||
`--connect` always works). The advert is held for the host's lifetime (RAII unregister).
|
||||
- **TXT records**: `proto=punktfunk/1`, `fp=<host cert SHA-256>` (the value a client pins — advisory
|
||||
over unauthenticated mDNS, TOFU/pinning still verifies on connect), `pair=required|optional`
|
||||
(so a picker knows up front whether the PIN ceremony is needed), `id=<host uniqueid>` (dedup).
|
||||
- **Client**: `punktfunk-client-rs --discover [SECS]` browses and prints each host (name, addr:port,
|
||||
pairing, fingerprint), then exits. Apple clients browse the same service natively via NWBrowser
|
||||
(Bonjour) — no Rust-connector dependency; this section's service type + TXT keys are the contract.
|
||||
- **Validated**: cross-LAN — dev box discovered the GNOME-box appliance
|
||||
(`home-worker-3 192.168.1.248:9777 pair=required fp=1dcf3a…`) and a standalone synthetic host
|
||||
(`pair=optional`); fingerprint + pairing state correct in both.
|
||||
- **Next** (not done): wire NWBrowser discovery into the Apple client UI (host picker); the
|
||||
host-side contract above is all it needs.
|
||||
|
||||
Reference in New Issue
Block a user