Merge remote-tracking branch 'origin/main'

# Conflicts:
#	docs-site/content/docs/meta.json
2026-06-21 00:07:36 +00:00
30 changed files with 573 additions and 1209 deletions
+126
@@ -0,0 +1,126 @@
---
title: "Apple Stage-2 Presenter (handoff)"
description: "Implementation plan for the explicit VTDecompressionSession → CAMetalLayer presenter — hand-paced present + true decode→present (glass-to-glass) measurement. Written so a Mac agent can pick it up."
---
> **Status update:** the stage-2 presenter described here has since been **built and live-validated**,
> shipping behind an opt-in flag (`AVSampleBufferDisplayLayer` remains the default known-good path).
> This page is preserved as the implementation/handoff record for that work.
The implementation plan for the **stage-2 Apple presenter**. The **stage-1** presenter feeds
compressed HEVC straight into `AVSampleBufferDisplayLayer`, which hardware-decodes **and presents
internally with no per-frame callback** — so we can't stamp decode or present, and we can't hand-pace.
Stage-2 takes explicit control: decode with `VTDecompressionSession`, present decoded frames through a
`CAMetalLayer` driven by a display link. Two wins: **~0.5 refresh off the present tail** (the biggest
client latency term at 60 Hz) and **true decode→present / glass-to-glass** numbers.
All of this is **macOS/iOS/tvOS-only** — build + validate on a Mac (`swift build && swift test`, then
live against a Linux host). The host + connector side is already done: `PunktfunkConnection.clockOffsetNs`
(the connect-time skew offset, host minus client) is what makes the present timestamp cross-machine
valid. See [Status](/docs/status) and roadmap §12.
## Where it plugs into the existing code
| Existing (stage-1) | Stage-2 change |
|---|---|
| `StreamPump` pulls AUs → `AnnexB.sampleBuffer` → `layer.enqueue` (compressed) | A `Stage2Pump` (or a mode flag on `StreamPump`) feeds AUs to `VTDecompressionSessionDecodeFrame` instead |
| `StreamView`/`StreamViewIOS` host an `AVSampleBufferDisplayLayer` | Host a `CAMetalLayer` (+ a display link); keep the input-capture + HUD overlay unchanged |
| `AnnexB.formatDescription(fromIDR:)` builds the format desc, refreshed on every IDR | **Reused** — it's the `VTDecompressionSession`'s format description; recreate the session when it changes |
| `LatencyMeter` records capture→client-receipt at `onFrame` | Extend to record **decode-completion** and **present** stages (below) |
Keep stage-1 behind a `UserDefaults` flag (e.g. `punktfunk.presenter = "stage1" | "stage2"`) so a
regression can fall back — `AVSampleBufferDisplayLayer` is the known-good path.
## Decode: VTDecompressionSession
1. Create the session from the IDR's `CMVideoFormatDescription`
(`AnnexB.formatDescription(fromIDR:)`):
```
VTDecompressionSessionCreate(
allocator: nil,
formatDescription: fmt,
decoderSpecification: nil, // hardware by default; no need to force
imageBufferAttributes: [
kCVPixelBufferMetalCompatibilityKey: true,
kCVPixelBufferPixelFormatTypeKey:
kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange, // 8-bit SDR; 10-bit (…10BiPlanar) for HDR later
],
outputCallback: <C-callback>,
decompressionSessionOut: &session)
```
2. Per AU: build the same `CMSampleBuffer` as stage-1 (`AnnexB.sampleBuffer(au:format:)`, PTS =
`au.ptsNs` @ 1e9 timescale) and submit:
```
VTDecompressionSessionDecodeFrame(session, sampleBuffer,
flags: ._EnableAsynchronousDecompression,
frameRefcon: <pts or a boxed context>, infoFlagsOut: nil)
```
3. The **output callback** delivers `(status, infoFlags, imageBuffer: CVImageBuffer?, presentationTimeStamp, …)`.
`presentationTimeStamp` is `au.ptsNs` (the host capture clock). **Stamp decode-completion here**
(`CLOCK_REALTIME` ns), retain the `CVPixelBuffer`, and push `{pts, pixelBuffer, decodedNs}` into a
small NSLock-guarded ring (the "ready" queue) the display link drains.
4. **IDR / mode change**: when `AnnexB.formatDescription` yields a new desc, check
`VTDecompressionSessionCanAcceptFormatDescription`; if not, finish-and-recreate the session (same
trigger stage-1 uses to refresh `format`). On decoder error (`kVTVideoDecoderBadDataErr`, etc.) drop
to the next IDR — there's no out-of-band extradata; recovery keyframes re-carry the parameter sets.
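The "drop to the next IDR" recovery rule in step 4 can be sketched as a small gate in front of the decoder. This is a minimal illustration in Rust, not the client's Swift code: `DecodeGate` and its method names are hypothetical, and the only behaviour taken from the text is that nothing is submitted after a decoder error until an IDR arrives.

```rust
/// Hypothetical gate deciding whether an access unit (AU) may be submitted
/// to the decoder. Not part of the punktfunk code base.
#[derive(Debug)]
pub struct DecodeGate {
    /// True until the first IDR, and again after a decoder error.
    waiting_for_idr: bool,
}

impl DecodeGate {
    /// Start closed: there is no usable decoder session before the first IDR.
    pub fn new() -> Self {
        Self { waiting_for_idr: true }
    }

    /// Call when the decoder reports an error (bad data, etc.).
    pub fn on_decode_error(&mut self) {
        self.waiting_for_idr = true;
    }

    /// Returns true if this AU should be submitted to the decoder.
    /// An IDR always reopens the gate, because recovery keyframes
    /// re-carry the parameter sets in-band.
    pub fn should_submit(&mut self, is_idr: bool) -> bool {
        if is_idr {
            self.waiting_for_idr = false;
        }
        !self.waiting_for_idr
    }
}
```

Keeping the gate separate from the session object means the same check applies whether the session was recreated or merely hit an error.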
## Present: CAMetalLayer + display link
- `CAMetalLayer` (device = system default, `pixelFormat = .bgra8Unorm`, `framebufferOnly = true`,
`drawableSize` = stream WxH). The view: macOS `NSView`/iOS `UIView` whose `layerClass`/backing layer
is the `CAMetalLayer` (mirror `StreamView`/`StreamViewIOS`).
- **Display link** drives present: macOS `CVDisplayLink` (or `CADisplayLink` on macOS 14+),
iOS/tvOS `CADisplayLink`. Each callback carries the **target present timestamp** (`CVTimeStamp` /
`targetTimestamp`).
- Each vsync: pop the **newest** ready frame (drop older undisplayed ones — low-latency default; no
smoothing buffer to start), render a fullscreen quad sampling the **biplanar YUV** (luma +
chroma planes via `CVMetalTextureCache`) with a BT.709 YUV→RGB fragment shader, then
`commandBuffer.present(drawable)` (or `present(drawable, atTime:)`). **Stamp present time** for the
frame just shown (use the display link's target timestamp converted to `CLOCK_REALTIME`).
- Colorspace: BT.709 8-bit for now (matches the host's SDR). HDR (BT.2020/PQ, 10-bit `…10BiPlanar` +
EDR `CAMetalLayer.wantsExtendedDynamicRangeContent`) is a later tie-in with the HDR roadmap (§10).
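The newest-ready policy above (the decode callback pushes, the display link takes the newest frame and drops the rest) can be sketched as a lock-guarded queue. This is a Rust illustration under assumptions: `ReadyQueue`, `ReadyFrame`, and the generic `B` standing in for a retained `CVPixelBuffer` are hypothetical names, and the real client would use an `NSLock`-guarded ring in Swift.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// One decoded frame waiting to be presented. `B` stands in for the
/// retained pixel buffer (hypothetical; a CVPixelBuffer in the real client).
#[derive(Debug, Clone, PartialEq)]
pub struct ReadyFrame<B> {
    pub pts_ns: u64,
    pub decoded_ns: u64,
    pub pixel_buffer: B,
}

/// Small bounded "ready" queue shared by the decode callback and the display link.
pub struct ReadyQueue<B> {
    frames: Mutex<VecDeque<ReadyFrame<B>>>,
    capacity: usize,
}

impl<B> ReadyQueue<B> {
    pub fn new(capacity: usize) -> Self {
        let capacity = capacity.max(1);
        Self { frames: Mutex::new(VecDeque::with_capacity(capacity)), capacity }
    }

    /// Decode-callback side: enqueue without blocking for long.
    /// Evicts the oldest frame when full; returns true if one was evicted.
    pub fn push(&self, frame: ReadyFrame<B>) -> bool {
        let mut q = self.frames.lock().unwrap();
        let evicted = q.len() >= self.capacity;
        if evicted {
            q.pop_front();
        }
        q.push_back(frame);
        evicted
    }

    /// Display-link side: take the newest frame and drop all older,
    /// undisplayed ones. Returns the frame and how many were dropped.
    pub fn pop_newest(&self) -> Option<(ReadyFrame<B>, usize)> {
        let mut q = self.frames.lock().unwrap();
        let newest = q.pop_back()?;
        let dropped = q.len();
        q.clear();
        Some((newest, dropped))
    }
}
```

Returning the dropped count from `pop_newest` is a design choice for this sketch: it gives a HUD or log a cheap signal for how often decode outruns the display.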
### Cheaper intermediate (2a) if the Metal path is too big in one step
Decode with `VTDecompressionSession` (gets the **decode-completion timestamp** = capture→decoded),
then wrap the decoded `CVPixelBuffer` in a `CMSampleBuffer` and `enqueue` it into the existing
`AVSampleBufferDisplayLayer` (it accepts uncompressed pixel buffers too). This yields the decode term
**without** a Metal renderer — but **not** true present (the layer still presents internally). Ship 2a
first if useful; 2b (CAMetalLayer + display link) is required for the on-glass present stamp.
## Measurement (the whole point)
Extend `LatencyMeter` (or add per-stage meters) so each frame records three instants, all
`CLOCK_REALTIME` ns, all shifted by `connection.clockOffsetNs` to the host clock:
- **capture→decoded** = `decodedNs + offset - pts_ns` (VideoToolbox decode latency, cross-machine)
- **decode→present** = `presentedNs - decodedNs` (the present tail stage-2 shortens)
- **capture→present** = `presentedNs + offset - pts_ns`: **the glass-to-glass number** (modulo the
host render→capture term, still unmeasured; see roadmap §12)
Surface `capture→present` p50/p95 in the HUD (extend the existing `model.latency*` line in
`ContentView`). `skewCorrected` stays false when `clockOffsetNs == 0` (old host) — then the numbers are
same-host-only, as today.
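The three stage latencies and the p50/p95 summary can be written out as plain arithmetic. This is a Rust sketch under assumptions: `StageLatencies`, `stage_latencies`, and `percentile` are hypothetical names (the real `LatencyMeter` lives in the Swift client), all inputs are nanoseconds, and `clock_offset_ns` is host minus client as described above. The nearest-rank percentile is one simple choice and may not match what the HUD uses.

```rust
/// Per-frame stage latencies in nanoseconds (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StageLatencies {
    pub capture_to_decoded_ns: i64,
    pub decode_to_present_ns: i64,
    pub capture_to_present_ns: i64,
    /// False when the host supplied no offset (clock_offset_ns == 0).
    pub skew_corrected: bool,
}

/// `pts_ns` is on the host capture clock; `decoded_ns` and `presented_ns`
/// are client CLOCK_REALTIME stamps; `clock_offset_ns` is host minus client.
pub fn stage_latencies(
    pts_ns: i64,
    decoded_ns: i64,
    presented_ns: i64,
    clock_offset_ns: i64,
) -> StageLatencies {
    StageLatencies {
        // Shift the client stamp onto the host clock, then subtract capture time.
        capture_to_decoded_ns: decoded_ns + clock_offset_ns - pts_ns,
        // Both stamps are on the client clock, so no offset is needed.
        decode_to_present_ns: presented_ns - decoded_ns,
        capture_to_present_ns: presented_ns + clock_offset_ns - pts_ns,
        skew_corrected: clock_offset_ns != 0,
    }
}

/// Nearest-rank percentile (p in 0..=100) over a set of samples.
pub fn percentile(samples: &[i64], p: f64) -> Option<i64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_unstable();
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.clamp(1, sorted.len()) - 1])
}
```

By construction, capture→present equals capture→decoded plus decode→present, which is a useful sanity check when wiring the stamps.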
## Validation
- `swift test`: add a decode-output test (decode a known IDR built like
`VideoToolboxRoundTripTests` → assert a `CVPixelBuffer` of the right dimensions + the
decode callback fires). Present is display-bound — validate it **live** via the HUD number.
- Live: connect to a Linux host (`punktfunk1-host --source virtual` on the GNOME box; see
[Ubuntu — GNOME](/docs/ubuntu-gnome)), confirm `capture→present` is a few ms over `capture→client`
and that `decode→present` shrank vs. an `AVSampleBufferDisplayLayer` baseline.
- Compare against the headless reference number: `punktfunk-probe` reports skew-corrected
capture→reassembled (~1.3 ms p50 GNOME box → dev box); capture→present should be that **+ decode +
present**.
## Gotchas
- VT decode is **async**; the output callback runs on a VT-managed thread — don't block it, just stamp
+ enqueue. Retain the `CVPixelBuffer` until presented (the ring owns it).
- `VTDecompressionSessionDecodeFrame` wants the **same** `CMSampleBuffer` shape stage-1 builds (AVCC
length-prefixed NALs, in-band parameter sets in the format desc, never as extradata).
- `CAMetalLayer.drawableSize` must track mode changes (the host can `Reconfigure` mid-stream — watch
`PunktfunkConnection.mode`/the new-IDR dimensions).
- Don't add a jitter/smoothing buffer for the first cut — present newest-ready for lowest latency; a
pacing policy can come later if frames look uneven.
- Keep `clients/apple/README.md`'s "Stage 2" item + [Status](/docs/status) updated when this lands.
+119
@@ -0,0 +1,119 @@
---
title: "CI & Docker"
description: "Gitea Actions setup — workflows, the dockerized pieces, and the runners."
---
CI runs on **Gitea Actions** (`git.unom.io`, org `unom`). The workflows live in
`.gitea/workflows/`; they run across Linux, macOS, and Windows runners and push a few images to the
Gitea container registry.
## Workflows
| Workflow | Trigger | Runner | What it does |
|---|---|---|---|
| `ci.yml` | push to `main`, PRs | Linux | Rust workspace (fmt · clippy `-D warnings` · build · test · C-ABI harness · generated-header drift) inside the `punktfunk-rust-ci` image; `web/` and `docs-site/` build + typecheck in `oven/bun:1` |
| `docker.yml` | push to `main`, `v*` tags, manual | Linux | Builds + pushes the images below (`latest` + `sha-<short>` tags) |
| `apple.yml` | push to `main`, PRs, manual | macOS | Rust core → `PunktfunkCore.xcframework` → `swift build` + `swift test` in `clients/apple` |
| `release.yml` | `v*` tags, manual | macOS | Production Apple builds: sandboxed macOS `.dmg` (Developer ID, notarized, stapled) attached to the Gitea release + macOS/iOS/tvOS archives uploaded to TestFlight |
| `windows-msix.yml` | push to `main`, `v*` tags, manual | Windows | Builds the Windows client for `x86_64`/`aarch64` and packages signed MSIX artifacts |
## Dockerized pieces
The host and the native clients are intentionally **not** containerized (the host needs
the GPU/compositor stack of the box it runs on). What is:
| Image | Source | Notes |
|---|---|---|
| `git.unom.io/unom/punktfunk-web` | `web/Dockerfile` (repo-root context — orval needs `docs/api/openapi.json`) | Nitro `bun` bundle; `PORT` (3000) and `PUNKTFUNK_MGMT_URL` env at runtime |
| `git.unom.io/unom/punktfunk-docs` | `docs-site/Dockerfile` | This site; `PORT` (3000) |
| `git.unom.io/unom/punktfunk-rust-ci` | `ci/rust-ci.Dockerfile` | Ubuntu 26.04 + FFmpeg 8/PipeWire/GL/GBM dev libs + a libcuda **link stub** (driver userspace, no kernel module) + pinned rustup — the container `ci.yml`'s Rust job runs in |
Registry pushes authenticate with a repo Actions secret holding a registry token (a PAT
with `write:package`; the login username in `docker.yml` is the token owner, not the
push actor).
## Runners
- **Linux runner** — runs the Rust/web/docs jobs (as docker containers) and the image
build+push jobs.
- **macOS runner** — an Apple-silicon Mac running macOS, a **host-mode** `act_runner`
(upstream now ships it as `gitea-runner`) provisioned by
[`scripts/ci/setup-macos-runner.sh`](https://git.unom.io/unom/punktfunk/src/branch/main/scripts/ci/setup-macos-runner.sh):
rustup (+ both darwin targets for the universal xcframework), Node.js (host-mode runners
execute JS actions via `node` from PATH — nothing auto-provisions it), the runner binary
in `~/.local/bin`, state under `~/ci/act-runner/` (config, `.runner` registration,
`runner.log`), kept alive by the `io.gitea.act_runner` **root LaunchDaemon** — it cannot
be a user LaunchAgent: macOS Local Network privacy silently blocks LAN dials
("no route to host") from unbundled CLI binaries in gui/user launchd domains, while
system daemons are exempt. Needs full **Xcode** for `xcodebuild -create-xcframework`
(CLT alone only covers `swift build/test`); if `xcode-select` still points at CLT, the
script auto-detects `/Applications/Xcode*.app` and bakes a `DEVELOPER_DIR` override into
the daemon environment — no `xcode-select -s` required.
- **Windows runner** — builds and packages the native Windows client (MSIX) for the
release matrix.
Re-provisioning is idempotent — re-running `scripts/ci/setup-macos-runner.sh` on the macOS
runner with a fresh `GITEA_RUNNER_TOKEN` (org `unom` → Settings → Actions → Runners →
Create new runner) re-registers it without manual cleanup.
## Apple releases
`release.yml` produces the production client builds on the Mac runner. All three app
targets share the bundle ID **`io.unom.punktfunk`** (one App Store listing, universal
purchase — effectively unchangeable after first submission). Signing is **not** secret-based:
the runner uses its **login keychain** directly, so install the **Developer ID Application**,
**Apple Distribution**, and (for the Mac App Store `.pkg`) **3rd Party Mac Developer
Installer** identities once via Xcode, with the WWDR intermediate present so they show as
valid. The only secrets are `ASC_API_KEY_P8`/`ASC_API_KEY_ID`/`ASC_API_ISSUER_ID` (App Store
Connect API key — notarization + TestFlight upload). Per-platform state:
- **macOS (Developer ID)** — sandboxed app (`Config/Punktfunk-macOS.entitlements`) → export
→ `notarytool` → stapled `.dmg` on the Gitea release.
- **macOS (App Store)** — manual-signed archive (Apple Distribution + the *Punktfunk macOS
App Store Distribution* profile) → upload to TestFlight. App Sandbox is **mandatory** here
and is now declared (app-sandbox + network client/server + audio-input + bluetooth/usb).
Prereqs (one-time, Apple portal): add the **macOS platform** to the App Store Connect app
record (universal purchase), install the Mac App Store distribution profile + the installer
cert above. `continue-on-error` until those exist.
- **iOS** — archive + upload to TestFlight (`method: app-store-connect`,
`destination: upload`). Crypto is declared exempt (`ITSAppUsesNonExemptEncryption`,
`Config/Info.plist`) so builds don't stall on the compliance question.
- **tvOS** — archive + upload to TestFlight (Rust core built from tier-3 targets, nightly
`-Zbuild-std` via `build-xcframework.sh`).
Entitlements differ per platform: `Config/Punktfunk-macOS.entitlements` (App
Sandbox is macOS-only) for the macOS app, and the shared `Config/Punktfunk.entitlements`
(keychain-access-groups only) for iOS/tvOS — `com.apple.security.app-sandbox` is invalid on
iOS/tvOS and would fail upload validation.
The runner needs a **release (non-beta) Xcode** — App Store processing rejects beta-SDK
builds, and a beta is unusable for the Rust side too: a newer-than-OS ld emits dylibs the
running dyld rejects ("mis-aligned LINKEDIT string pool"), killing every proc-macro build
with a misleading `E0463 can't find crate`. `build-xcframework.sh` therefore resolves
toolchains itself: non-beta Xcode for everything; with only CLT + a beta present it
builds macOS slices against CLT (packaging via any Xcode — `-create-xcframework` does no
linking) and **refuses iOS/tvOS slices** (CLT has no iOS SDK).
## Deployment
`docker.yml`'s `deploy-docs` job ships this docs site after every image push: it syncs
`compose.production.yml` to the docs server and runs `docker compose pull && up -d` there
over SSH, driven by a small set of deploy secrets (`DEPLOY_HOST` / `DEPLOY_USER` /
`DEPLOY_PORT` / `DEPLOY_SSH_KEY`). A reverse proxy in front of that server serves the
container as <https://docs.punktfunk.unom.io>. The host and the web console are NOT
deployed — the console fronts a punktfunk host's management API on whatever box runs the
host.
## Troubleshooting
- **macOS runner offline** — check `~/ci/act-runner/runner.log` on the runner; restart with
`sudo launchctl kickstart -k system/io.gitea.act_runner`. "no route to host" in the log
means the daemon is running in a gui/user domain again — see the Local Network note
above.
- **`apple.yml` fails at the xcframework step** — Xcode missing or unselected:
`sudo xcode-select -s /Applications/Xcode.app/Contents/Developer` and accept the license
(`sudo xcodebuild -license accept`), then re-run.
- **Rust job can't pull `punktfunk-rust-ci`** — the runner host's docker daemon needs a
`docker login git.unom.io` if the org/registry isn't anonymously readable.
- **Stale builder image after toolchain/dep changes** — `docker.yml` re-pushes it on every
`main` push; a manual `workflow_dispatch` of `docker.yml` forces a rebuild.
+5 -1
@@ -1,4 +1,8 @@
# DualSense advanced (audio-driven) haptics — feasibility & scoping
---
title: "DualSense Haptics"
description: "Feasibility and scoping for audio-driven DualSense haptics."
---
**Status: scoped, NO-GO for now (deferred).** Advanced voice-coil haptics on the DualSense are
driven by the controller's **USB audio interface** (4-channel surround, the back two channels carry
+73
@@ -0,0 +1,73 @@
---
title: "gamescope Multi-User Isolation (deferred)"
description: "Research + design for concurrent INDEPENDENT gamescope desktops (multi-user), and why it's deferred. The shared-desktop multi-view case already landed."
---
**Status: deferred (2026-06-12).** Concurrent sessions landed for the **shared-desktop multi-view**
case — multiple devices viewing/controlling the *same* KWin/Mutter/wlroots desktop ([Status](/docs/status)).
This page captures the research for the *other* model — **independent desktops** (each client its own
gamescope instance: the multi-user / cloud-gaming-on-one-box case) — and why it's parked. Pick this
up from here if the use case becomes a priority.
## What landed vs what this is
| Model | Backends | Input | Audio | Status |
|---|---|---|---|---|
| **Shared-desktop multi-view** | kwin / mutter / wlroots | shared (all drive one desktop) | shared (all hear one desktop) | ✅ **landed** — correct semantics: stream *your* desktop to laptop + TV at once |
| **Independent desktops (multi-user)** | gamescope | **per-session** (each drives its own game) | **per-session** | ⏸ **deferred** — this page |
For independent desktops, shared input/audio is *wrong* — each user must drive and hear only their own
session. gamescope is the natural fit: each `create()` spawns a fresh nested compositor (own
rendering, own EIS input socket). The blocker is that the host's input/audio/mic are host-lifetime
**shared** services, and the gamescope EIS socket is relayed through a single global file.
## Current architecture (the research)
Each gamescope **process is per-session** (`vdisplay/gamescope.rs::create()` spawns one; the
`VirtualOutput.keepalive` owns it). But:
- **EIS input socket — single global file.** gamescope exports `LIBEI_SOCKET` for its children; a
shell wrapper relays it to the fixed path `/tmp/punktfunk-gamescope-ei` (`EI_SOCKET_FILE`).
**Two concurrent instances overwrite each other's socket name** in that one file.
- **Injector — one host-lifetime `!Send` service.** `punktfunk1.rs::InjectorService` opens **one**
`inject::open(backend)` for the whole run and forwards events over an mpsc channel. It was made
shared deliberately (the portal `CreateSession` churn wedged KWin's EIS — "EIS setup timed out").
For gamescope it reads the one global socket file, so all sessions' input lands in whichever
instance wrote last.
- **Audio — global default-sink monitor.** `audio::open_audio_capture()` sets
`STREAM_CAPTURE_SINK` and autoconnects to the host's **default sink monitor** (PW_ID_ANY) — the
whole system's output, not a per-gamescope node. gamescope exposes **no per-instance audio node**.
- **Mic — one global `Audio/Source`.** `MicService` feeds one PipeWire source named `punktfunk-mic`;
all clients' mic uplinks mix into it.
- Per-session already (no work): the gamescope process, the PipeWire video node, and the uinput
gamepads.
## What it would take
1. **Per-instance EIS socket** — give each gamescope a unique relay file
(`/tmp/punktfunk-gamescope-{id}-ei`) and carry the path on `VirtualOutput` (new field) so the
session can find its own socket.
2. **Per-session injector** — for gamescope sessions, create a **per-session** injector bound to that
socket (its own thread, since `InputInjector` is `!Send`), instead of the shared `InjectorService`.
Keep the shared service for the portal backends (kwin/mutter) where shared input is correct.
Ordering nuance: the input thread is wired before the gamescope socket exists, so the per-session
injector must open **lazily** (on first event, by which time gamescope is up) or be created after
`build_pipeline`.
3. **Per-session audio (the bigger piece).** gamescope has no per-instance audio node, but audio
*is* isolatable: create a **per-session PipeWire null-sink**, route that gamescope's apps to it
(`PULSE_SINK` / a target node on the spawn env), and capture **that sink's monitor** per session.
This is the largest addition — null-sink create/teardown + routing + per-session capture.
4. **Per-session mic** — a virtual `Audio/Source` per session (`punktfunk-mic-{id}`), routed into
that gamescope, instead of the one global source.
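Steps 1 and 2 can be sketched together: a unique relay path per session, and an injector that opens only on first use. This is a Rust illustration under assumptions: `ei_socket_file`, `LazyInjector`, and the `u32` session id are hypothetical, and the open step is passed in as a closure so the sketch does not guess at the real `inject::open` signature.

```rust
use std::path::{Path, PathBuf};

/// Per-instance relay file, following the `/tmp/punktfunk-gamescope-{id}-ei`
/// pattern from step 1. The `u32` id type is an assumption.
pub fn ei_socket_file(session_id: u32) -> PathBuf {
    PathBuf::from(format!("/tmp/punktfunk-gamescope-{session_id}-ei"))
}

/// Hypothetical per-session injector that opens lazily, so it can be wired
/// up before the gamescope socket exists. `I` is whatever injector type the
/// host uses; it stays on the thread that owns this struct.
pub struct LazyInjector<I> {
    socket_file: PathBuf,
    injector: Option<I>,
}

impl<I> LazyInjector<I> {
    pub fn new(session_id: u32) -> Self {
        Self { socket_file: ei_socket_file(session_id), injector: None }
    }

    /// Open on the first call (for example on the first input event) and
    /// reuse the injector afterwards. A failed open leaves the slot empty,
    /// so a later event retries.
    pub fn get_or_open<E>(
        &mut self,
        open: impl FnOnce(&Path) -> Result<I, E>,
    ) -> Result<&mut I, E> {
        if self.injector.is_none() {
            self.injector = Some(open(&self.socket_file)?);
        }
        Ok(self.injector.as_mut().expect("injector was just opened"))
    }
}
```

Retrying on a failed open is a choice made for this sketch: it covers the ordering nuance in step 2, where the first event could arrive before gamescope has written its socket name.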
## Why deferred
- It's a **large multi-file refactor** — the whole input path (per-instance sockets + per-session
injector + the lazy-open ordering), **plus** per-session null-sink audio routing, **plus** per-session
mic — for a **niche** use case (multiple independent users gaming on one box).
- The **common** concurrency case — stream one desktop to several of *your own* devices — is the
shared-desktop multi-view model, which **already landed and is the correct semantics** for it.
- No correctness gap in what shipped: concurrent sessions work today; this is purely the *additional*
independent-desktops model.
Revisit when there's a real multi-user requirement. The plumbing list above is the whole job.
+5 -1
@@ -1,4 +1,8 @@
# GameStream host: stream to a stock Moonlight client
---
title: "GameStream Host"
description: "Stream to a stock Moonlight client on a client-sized virtual display."
---
The shippable milestone (plan §8). A stock Moonlight/Artemis client discovers this host,
pairs, launches, and gets video (then input, then audio) on a client-sized virtual display.
+16 -12
@@ -1,8 +1,12 @@
# punktfunk — Implementation Plan
---
title: "Implementation Plan"
description: "The full design: protocol core, milestones, and architecture."
---
*A ground-up low-latency desktop streaming stack, built Linux-first, with a shared Rust protocol core and native clients per platform.*
> `punktfunk` is a placeholder codename — rename freely. It fits the lowercase house style (`unom`, `played`, `remplir`) and reads as "glass-to-glass light," which is the whole point.
> The name `punktfunk` fits the lowercase house style (`unom`, `played`, `remplir`) and reads as "glass-to-glass light," which is the whole point.
---
@@ -28,7 +32,7 @@ Two concrete gaps justify a new project rather than another fork:
**Explicit non-goals (at least at first):**
- Windows *host* support (Sunshine/Apollo already do this well; no gap to fill).
- Internet/NAT-traversal relay infrastructure (LAN/VPN first; you already run Headscale/NetBird — lean on that).
- Internet/NAT-traversal relay infrastructure (LAN/VPN first; lean on an existing mesh VPN such as Headscale/NetBird/Tailscale).
- Reinventing encoders/decoders (bind to FFmpeg + vendor SDKs; never rewrite codecs).
- A bespoke compositor (drive existing ones; only consider a dedicated headless compositor as a *deployment mode*, see §6).
@@ -165,7 +169,7 @@ This is the differentiator and the most fragmented part. Two deployment models
**Per-compositor (Model A) runtime virtual-output creation:**
- **KWin / Plasma 6 (recommended MVP target — matches your CachyOS/KDE daily driver and where the gap is loudest):** KWin can create virtual outputs; KRdp already does this internally for remote sessions. Drive it via the KWin DBus interface; capture via `xdg-desktop-portal-kde` ScreenCast (PipeWire); inject input via the RemoteDesktop portal or `reis`.
- **KWin / Plasma 6 (recommended MVP target — a common KDE daily-driver setup, and where the gap is loudest):** KWin can create virtual outputs; KRdp already does this internally for remote sessions. Drive it via the KWin DBus interface; capture via `xdg-desktop-portal-kde` ScreenCast (PipeWire); inject input via the RemoteDesktop portal or `reis`.
- **wlroots (Sway/Hyprland — fastest to *prototype* the pipeline):** enable the headless backend (`WLR_BACKENDS=…,headless`), then `swaymsg create_output` / `hyprctl output create headless`. Capture via `wlr-screencopy` or the portal. Simplest API; good for validating capture→encode→send before fighting KWin/Mutter.
- **Mutter / GNOME:** virtual monitors via the headless backend; runtime creation via Mutter DBus (`org.gnome.Mutter.*` — partly experimental). Capture via `xdg-desktop-portal-gnome` ScreenCast.
@@ -223,7 +227,7 @@ Sizing is rough and relative (Spike / S / M / L) for a focused solo dev; treat a
**M4 — P2 transport: break the wall (L).** Add `punktfunk/1` negotiation; swap to `reed-solomon-simd` GF(2¹⁶) with multi-block per-frame framing; optional QUIC control/audio. Write a minimal **Rust** reference client (decode via VAAPI, present via wgpu/Vulkan) to exercise it. *Acceptance:* a stable stream above 1.4 Gbps at 5120×1440@240 with loss recovery working; latency unchanged vs. M2.
**M5 — Apple client (L).** Swift + VideoToolbox + Metal + SwiftUI, linking `punktfunk-core` via the C header. *Acceptance:* the Mac Studio plays a stream at native resolution/refresh.
**M5 — Apple client (L).** Swift + VideoToolbox + Metal + SwiftUI, linking `punktfunk-core` via the C header. *Acceptance:* a Mac plays a stream at native resolution/refresh.
**M6 — Feature surface (M, ongoing).** Mic passthrough as a proper encrypted, per-client reverse audio stream (the thing the upstream PR got wrong); HDR signalling; per-client identity/permissions; pause/resume. *Acceptance:* feature parity with Apollo on the items you care about, plus mic done right.
@@ -239,7 +243,7 @@ Sizing is rough and relative (Spike / S / M / L) for a focused solo dev; treat a
| Encoder/decoder can't sustain 1.77 Gpx/s @ 240 | Med | High | Measure in M0/M4 on real silicon; this is a hardware ceiling no rewrite fixes — discover it before P2, not after |
| Frame pacing eats more time than expected | High | Med | M3 measurement harness first; treat pacing as a first-class subsystem, not a polish step |
| Scope creep into a full Moonlight replacement | High | High | P1 (stock-client compat) is the firewall: it forces you to ship value before writing a client |
| Solo bandwidth vs. your other projects (ENRW thesis, played) | High | Med | M2 is a complete, useful artifact on its own; the plan is safe to pause after any milestone |
| Solo bandwidth vs. other projects | High | Med | M2 is a complete, useful artifact on its own; the plan is safe to pause after any milestone |
---
@@ -250,7 +254,7 @@ Sizing is rough and relative (Spike / S / M / L) for a focused solo dev; treat a
- **Loss resilience:** `tc netem` to inject loss/jitter/reorder; verify FEC recovery and graceful degradation.
- **Pacing:** log present timestamps vs. client vsync; alert on stalls and duplicate/dropped frames.
- **Soak:** multi-hour streams; watch for buffer growth, fd leaks, encoder session exhaustion.
- **Hardware matrix:** your NVIDIA box (NVENC), an AMD/Intel box (VAAPI), Mac Studio (VideoToolbox decode). Catch driver quirks early.
- **Hardware matrix:** an NVIDIA box (NVENC), an AMD/Intel box (VAAPI), a Mac (VideoToolbox decode). Catch driver quirks early.
---
@@ -272,7 +276,7 @@ punktfunk/
│ │ ├── src/vdisplay/ # trait + kwin/wlroots/mutter impls
│ │ ├── src/input/ # reis + uinput
│ │ └── src/web/ # axum config/pairing API
│ └── punktfunk-client-rs/ # reference Rust client (M4)
│ └── punktfunk-probe/ # reference Rust client (M4)
├── clients/
│ ├── apple/ # Swift package, imports punktfunk_core.h (M5)
│ └── android/ # Kotlin + JNI (later)
@@ -286,10 +290,10 @@ punktfunk/
## 12. Immediate next actions (first week)
1. **Stand up the workspace** with `punktfunk-core` (empty ABI + `cbindgen`) and `punktfunk-host` skeletons; CI on your Gitea (you already have BuildKit pipelines).
2. **M0 spike on wlroots:** headless output → PipeWire capture → NVENC/VAAPI encode → playable file. This validates the riskiest *pipeline* assumptions in days, on your real GPU.
3. **Read KRdp's source** for how KDE creates virtual outputs and casts them — it's the closest existing reference for the KWin path you'll need in M2.
4. **Decide P1 protocol depth:** confirm exactly which `serverinfo`/RTSP/pairing messages a current Moonlight client requires for a successful connect, so M2's compat surface is scoped precisely (this is also the question to take back to the dev who mentioned the 1G limit).
1. **Stand up the workspace** with `punktfunk-core` (empty ABI + `cbindgen`) and `punktfunk-host` skeletons; wire up CI (Gitea Actions, BuildKit-based pipelines).
2. **M0 spike on wlroots:** headless output → PipeWire capture → NVENC/VAAPI encode → playable file. This validates the riskiest *pipeline* assumptions in days, on real GPU hardware.
3. **Read KRdp's source** for how KDE creates virtual outputs and casts them — it's the closest existing reference for the KWin path needed in M2.
4. **Decide P1 protocol depth:** confirm exactly which `serverinfo`/RTSP/pairing messages a current Moonlight client requires for a successful connect, so M2's compat surface is scoped precisely.
---