# Linux host setup — NVIDIA GPU VM (pipeline spike + GameStream host)

> **Status:** Setup guide — still current and in active use (referenced from
> `clients/apple/README.md`). The pipeline spike and the GameStream host are **shipped**
> (see `design/gamestream-host-plan.md` + `CLAUDE.md`). This doc is trimmed to the bring-up
> steps + gotchas; the §4 crate-API walkthrough was folded into `CLAUDE.md` "Pinned crate facts".

How to bring up the build environment for the punktfunk Linux host on an NVIDIA-GPU Ubuntu VM
and run the **pipeline spike** (capture→encode). `punktfunk-core` already builds and is tested
cross-platform; this is about the platform backends in `crates/punktfunk-host`.

> Target **Ubuntu 24.04 (noble)**: Sway 1.9, FFmpeg 6.1.1, xdg-desktop-portal 1.18.
> 22.04 (jammy) ships Sway 1.7 / FFmpeg 4.4 — too old for this path; build from source or
> upgrade. Package names/versions below were verified against the live Ubuntu archive.

## 1. Bootstrap

```sh
git clone git@git.unom.io:unom/punktfunk.git && cd punktfunk && git checkout m1-punktfunk-core
bash scripts/bootstrap-ubuntu.sh
```

It **verifies** the (already-installed) NVIDIA + NVENC stack, installs the Rust toolchain
(rustup) and the build/runtime deps (PipeWire, xdg-desktop-portal + the wlroots backend,
Sway, Wayland/DRM/EGL/GBM/VA dev libs, capture tools), **gates** the FFmpeg `-dev`
headers so it can't clobber your custom NVENC FFmpeg, and drops headless-Sway + portal
config templates into `~/.config` (only if absent). It does **not** reboot or edit GRUB.

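
The "only if absent" behaviour for config templates can be sketched as below. This is an illustration of the idea, not the code in `scripts/bootstrap-ubuntu.sh`; the helper name `install_if_absent` and its messages are hypothetical.

```sh
# Hypothetical helper: copy a template into place unless the target already exists.
# An existing file is never overwritten, so local edits survive a re-run.
install_if_absent() {
  src=$1
  dst=$2
  if [ -e "$dst" ]; then
    echo "keep   $dst (already present)"
    return 0
  fi
  mkdir -p "$(dirname "$dst")"
  cp "$src" "$dst"
  echo "create $dst"
}
```

A call would look like `install_if_absent path/to/template "$HOME/.config/sway/config"`, where both paths are placeholders rather than the script's real layout.
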
After it runs, sanity-check the core on Linux:

```sh
cargo test --workspace  # 21 tests; same suite that's green on macOS
```

## 2. NVIDIA prerequisites (one-time, may need a reboot)

Wayland on NVIDIA requires KMS modeset. The bootstrap checks it; if it isn't `Y`:

```sh
echo 'options nvidia-drm modeset=1 fbdev=1' | sudo tee /etc/modprobe.d/nvidia-drm.conf
sudo update-initramfs -u && sudo reboot
cat /sys/module/nvidia_drm/parameters/modeset  # must print Y after reboot
```

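
For scripting the same check, a minimal sketch follows. The function name `modeset_enabled` is hypothetical, and the parameter path is taken as an argument only so the check can be exercised on a machine without the NVIDIA module loaded.

```sh
# Hypothetical helper: succeed only if the parameter file is readable and contains "Y".
# Defaults to the nvidia_drm modeset parameter shown above.
modeset_enabled() {
  param_file=${1:-/sys/module/nvidia_drm/parameters/modeset}
  [ -r "$param_file" ] && [ "$(cat "$param_file")" = "Y" ]
}
```

Used as `modeset_enabled || echo "nvidia-drm modeset is off"`, it gives a non-zero exit status both when the value is not `Y` and when the module is not loaded at all.
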
- Driver **≥ 535** is the floor for headless wlroots (EGL/dmabuf); 550+ recommended.
- **Install the NVIDIA GL/EGL userspace, not just `nvidia-utils`:**
  `sudo apt install libnvidia-gl-<NNN>` (matching the driver, e.g. `libnvidia-gl-595`).
  `nvidia-utils-NNN` ships nvidia-smi + NVENC but **not** `libEGL_nvidia.so.0` or the GLVND
  vendor JSON (`/usr/share/glvnd/egl_vendor.d/10_nvidia.json`). Without them libglvnd falls
  back to Mesa, wlroots can't init EGL on the GPU and drops to the **pixman** software
  renderer — and the ScreenCast portal then fails to negotiate a buffer format
  (`unable to receive a valid format from wlr_screencopy`). Verify after install:
  `ls /usr/share/glvnd/egl_vendor.d/10_nvidia.json && ldconfig -p | grep libEGL_nvidia`.
  A correct GPU Sway logs `EGL vendor: NVIDIA` and a list of DMA-BUF formats.
- **Join the `render` + `video` groups:** `sudo usermod -aG render,video $USER`, then
  **re-login** (group changes only apply to new logins). wlroots opens
  `/dev/dri/renderD128` (group `render`) and `/dev/dri/card*` (group `video`), both 0660;
  without membership Sway aborts with `Permission denied`. (`scripts/headless/*.sh` bridge a
  not-yet-re-logged-in shell with `sg render`, but re-login is the clean fix.)
- A **headless VM GPU exposes no DRM connectors** — that's expected. We don't use the DRM
  backend; `WLR_BACKENDS=headless` renders to an offscreen GBM/EGL surface and creates a
  virtual `HEADLESS-1` output. Use the render node `/dev/dri/renderD128`.
- **NVENC in a VM:** full PCI **passthrough** = bare-metal NVENC, no license. **vGPU**
  needs a valid license (vWS) or NVENC runs degraded — the bootstrap's smoke-encode tells
  you if it actually works. Consumer GeForce cards also cap concurrent NVENC sessions
  (~8); datacenter/RTX-pro are effectively unlimited — relevant once we serve many clients.

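
Two of the prerequisites above, the driver floor and the group membership, lend themselves to a quick preflight check. The sketch below is an assumption about how one might script it, not part of the bootstrap; `driver_meets_floor` and `has_groups` are hypothetical names, and both take their input as arguments so they can be tried without a GPU.

```sh
# Succeed if a driver version string such as "550.120" meets the floor (default 535).
driver_meets_floor() {
  version=$1
  floor=${2:-535}
  major=${version%%.*}            # keep only the part before the first dot
  case $major in
    ''|*[!0-9]*) return 1 ;;      # empty or non-numeric: treat as failing the check
  esac
  [ "$major" -ge "$floor" ]
}

# Succeed if the space-separated group list in $1 contains every remaining argument.
has_groups() {
  have=" $1 "
  shift
  for want in "$@"; do
    case $have in
      *" $want "*) ;;             # whole-word match, so "renderers" does not count
      *) return 1 ;;
    esac
  done
  return 0
}
```

On a real host the inputs could come from `nvidia-smi --query-gpu=driver_version --format=csv,noheader` and `id -nG`, for example `has_groups "$(id -nG)" render video`. Note that `id -nG` reflects the current login, so it only shows the new groups after the re-login described above.
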
## 3. Bring up the headless compositor + prove capture→NVENC

```sh
# shell 1 — start headless GPU Sway on the shared user bus (blocks; -d for debug log)
bash scripts/headless/run-headless-sway.sh  # success logs "EGL vendor: NVIDIA"

# shell 2 — same user: set the client mode, import the portal env, write the env file
bash scripts/headless/prepare-session.sh 2560x1440@60Hz
source /tmp/punktfunk-sway-env.sh
swaymsg -t get_outputs  # confirm HEADLESS-1 active
swaymsg exec foot  # optional: animated content to capture
bash scripts/headless/capture-smoke-test.sh  # wf-recorder (wlr-screencopy) -> hevc_nvenc
ffprobe /tmp/punktfunk-headless-test.mkv  # confirm a real H.265 stream
```

`wf-recorder` uses `wlr-screencopy` directly (no portal/D-Bus) — the fastest way to
de-risk the GPU encode path. **Note:** screencopy encodes straight to a file and *cannot*
feed PipeWire; the real integration uses the ScreenCast portal (see the pipeline spike). If shell 1 logged
a Mesa/EGL fallback (or Sway dropped to pixman) instead of `EGL vendor: NVIDIA`, install the
NVIDIA GL userspace (§2) — the portal cannot capture a pixman output.

**An idle headless output produces no frames** (its frame clock is driven by damage); give
it a real refresh mode (`prepare-session.sh` does) *and* run something animated
(`swaymsg exec foot`) or the capture will be ~1 frame.

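
To catch the "~1 frame" failure automatically, one option is to count the frames in the captured file and compare against the expected rate. The sketch below is an assumption, not part of the helper scripts: `count_frames` and `capture_looks_live` are hypothetical names, and the 50% threshold is an arbitrary choice.

```sh
# Count decoded video frames in a file (needs ffprobe; decodes the whole stream).
count_frames() {
  ffprobe -v error -select_streams v:0 -count_frames \
    -show_entries stream=nb_read_frames -of csv=p=0 "$1"
}

# Succeed if the frame count reaches at least half of seconds * fps.
# The 50% margin is an assumption, chosen only to separate "live" from "idle".
capture_looks_live() {
  frames=$1
  seconds=$2
  fps=$3
  [ "$frames" -ge $(( seconds * fps / 2 )) ]
}
```

For a capture that is assumed to last 5 s at 60 Hz, `capture_looks_live "$(count_frames /tmp/punktfunk-headless-test.mkv)" 5 60` would fail on an idle output and pass once animated content is running.
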
The wlroots-on-NVIDIA env workarounds (`WLR_RENDERER=gles2`, `WLR_NO_HARDWARE_CURSORS=1`,
`GBM_BACKEND=nvidia-drm`, `sway --unsupported-gpu`, …) live in
`scripts/headless/env.sh` — `source` it before launching anything Wayland.

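
As a rough picture of what such an environment file sets, the sketch below exports only the variables named in this guide. It is not the contents of `scripts/headless/env.sh`, which may set more; the comments are a reading of what each variable does.

```sh
# Sketch only: the variables named in this guide, not the real scripts/headless/env.sh.
export WLR_BACKENDS=headless        # offscreen output; no DRM connectors needed
export WLR_RENDERER=gles2           # GLES2 renderer on the GPU
export WLR_NO_HARDWARE_CURSORS=1    # fall back to software cursors
export GBM_BACKEND=nvidia-drm       # select NVIDIA's GBM backend

# --unsupported-gpu is a Sway command-line flag, not an environment variable:
#   sway --unsupported-gpu
```

Because these are plain exports, they only affect processes started from the shell that sourced the file, which is why the file has to be sourced in each new shell.
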
## 4. The spike proper — wire it into `punktfunk-core`

Goal (plan §8): headless output → PipeWire ScreenCast → NVENC → a playable file, then feed
the encoded access units into a `punktfunk_core::Session` (host role). The module seams exist
in `crates/punktfunk-host/src/{vdisplay,capture,encode,inject,pipeline}.rs`.

**Status: implemented and verified end-to-end** in `crates/punktfunk-host` (`spike.rs`,
`capture/linux.rs`, `encode/linux.rs`). After the §3 bring-up:

```sh
source /tmp/punktfunk-sway-env.sh
swaymsg exec foot  # animated content
# Live portal capture → NVENC HEVC → playable file, with each AU also round-tripped
# through a punktfunk_core host→client Session (FEC + packetize + reassemble) and verified:
cargo run -p punktfunk-host -- spike --source portal --seconds 5 --out /tmp/punktfunk-spike.h265
ffprobe /tmp/punktfunk-spike.h265
# No capture session needed (encode + core only): --source synthetic
```

Verified result: `1920x1080` HEVC, ~300 frames in 5s, `punktfunk-core loopback … 0 mismatches`.
The portal negotiates packed **`RGB` (24-bit, 3 bytes per pixel)** on wlroots; the encoder expands it to
`rgb0` (one pad byte/pixel, no colour math) since NVENC accepts `rgb0`/`bgr0` but not
`rgb24`. **GPU zero-copy is now implemented on all paths** (tiled dmabuf → EGL/GL → CUDA;
LINEAR dmabuf → Vulkan bridge → CUDA → NVENC — see `CLAUDE.md`); the `capture` module keeps a
`cpu_bytes` fallback for inputs that can't be imported.

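
The size cost of that pad byte is easy to work out for the verified `1920x1080` resolution. The arithmetic below is a worked example only; `frame_bytes` is a hypothetical helper, and nothing here reflects how the encoder allocates its buffers.

```sh
# Bytes per frame for a given width, height and bytes-per-pixel count.
frame_bytes() {
  echo $(( $1 * $2 * $3 ))
}

width=1920
height=1080
rgb=$(frame_bytes "$width" "$height" 3)    # packed RGB as negotiated by the portal
rgb0=$(frame_bytes "$width" "$height" 4)   # rgb0 as accepted by NVENC
echo "RGB:  $rgb bytes/frame"
echo "rgb0: $rgb0 bytes/frame"
echo "pad:  $(( rgb0 - rgb )) bytes/frame"
```

This works out to 6220800 bytes per frame for packed RGB and 8294400 for `rgb0`, so the expansion adds one third to each frame (2073600 bytes) without touching any colour values.
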
Crate/API details (`ashpd` 0.13 screencast handshake, `pipewire` 0.9 frame pull, `ffmpeg-next`
8.x encoder selection, `reis`/uinput input) now live in `CLAUDE.md` "Pinned crate facts", which
is the source of truth. The FFmpeg-prefix override (`export FFMPEG_DIR=/that/prefix`) and the
bindgen `LIBCLANG_PATH` knob are covered in the troubleshooting table below.

The **GameStream host** built on this spike is shipped — `serverinfo`/RTSP/pairing for a stock
Moonlight client, a per-compositor virtual output created on connect, input via reis/uinput.
See `design/gamestream-host-plan.md` + `CLAUDE.md`.

## Troubleshooting

| Symptom | Fix |
|---|---|
| Sway aborts on NVIDIA | add `--unsupported-gpu` (the helper scripts do) |
| `not a KMS device` / no connectors | expected on a headless VM GPU — use `WLR_BACKENDS=headless`, not the DRM backend |
| Sway won't start at all | `WLR_RENDERER_ALLOW_SOFTWARE=1 WLR_RENDERER=pixman` to prove the pipeline, then fix EGL |
| ScreenCast portal finds no output | ensure `xdg-desktop-portal-wlr` is running in the same session, `XDG_CURRENT_DESKTOP=sway`, and `~/.config/xdg-desktop-portal-wlr/config` has `output_name=HEADLESS-1` |
| `Cannot load libnvidia-encode.so.1` | NVENC runtime lib missing (driver) or unlicensed vGPU |
| `cargo build` can't find FFmpeg | `export FFMPEG_DIR=$(pkg-config --variable=prefix libavcodec)` or point `PKG_CONFIG_PATH` at the custom build |
| bindgen: libclang not found | `export LIBCLANG_PATH=$(llvm-config --libdir)` |

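
For the portal row above, a minimal config can be written by hand if the bootstrap template is missing. This is a sketch: `write_portal_config` is a hypothetical helper, the `[screencast]` section and `chooser_type=none` are assumptions based on the xdg-desktop-portal-wlr config format rather than on the project's template, and unlike the bootstrap it overwrites an existing file.

```sh
# Hypothetical helper: write a minimal xdg-desktop-portal-wlr config under the
# given config root (default ~/.config). Overwrites any existing file there.
write_portal_config() {
  root=${1:-$HOME/.config}
  mkdir -p "$root/xdg-desktop-portal-wlr"
  cat > "$root/xdg-desktop-portal-wlr/config" <<'EOF'
[screencast]
output_name=HEADLESS-1
chooser_type=none
EOF
}
```

With no interactive chooser configured, the portal is expected to pick the named output directly, which is what a headless session needs. Restart `xdg-desktop-portal-wlr` after changing the file.
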
## Open items

None — the pipeline spike and the GameStream host it seeds are both shipped (see
`design/gamestream-host-plan.md` + `CLAUDE.md`). This file remains as the host-box bring-up guide.