diff --git a/design/README.md b/design/README.md
index dc97608..e1733db 100644
--- a/design/README.md
+++ b/design/README.md
@@ -24,6 +24,7 @@ holds the full originals.
 | [`multi-user-profiles.md`](multi-user-profiles.md) | Multi-user / profiles end to end: map a client to a real host OS user account (own isolated desktop), web-console config, per-profile passcode | **Design, schema-of-record** — not yet implemented |
 | [`host-latency-plan.md`](host-latency-plan.md) | Latency under GPU contention — 4-tier plan | **Partly shipped** — superseded by ↓; diagnostics + open tiers kept |
 | [`gpu-contention-investigation.md`](gpu-contention-investigation.md) | GPU-contention root-cause + ranked levers (supersedes ↑) | **Active plan** — §5.A shipped; §5.B/C/E/F/G open |
+| [`vrr-plan.md`](vrr-plan.md) | VRR over punktfunk/1 — skip the VRR virtual display (client panel follows stream cadence; virtual display Hz = sampling grid); frame-driven host loop + VFR rate control + present-on-arrival clients | **Design** — not yet implemented |
 | [`hdr-pipeline-plan.md`](hdr-pipeline-plan.md) | Glass-to-glass HDR | **Steps 0–3 shipped**; Step 4 (Linux) open |
 | [`windows-host-rewrite.md`](windows-host-rewrite.md) | **Windows host — the single architecture/status/reference doc** (validated invariants, ops, open work) | **Active reference** |
 | [`windows-build-and-packaging.md`](windows-build-and-packaging.md) | How the Windows host is built, signed, packaged (drivers-from-source, Inno, CI) | **Evergreen reference** |
@@ -53,6 +54,7 @@ owning doc.)
 - Sub-frame pipelining — overlap encode+transmit within a frame; needs a direct NVENC SDK wrapper (~2–4 ms). → `implementation-plan`, `gamestream-host-plan`
 - GPU-contention levers: correct async NVENC pipeline, auto-gated REALTIME GPU priority, clock/P-state pinning, frame-source escape (swapchain-hook/NvFBC/compose-flip), iGPU encode offload, PERF uniq-vs-fps instrumentation. → `gpu-contention-investigation` (§5.B/C/E/F/G), `host-latency-plan` (Tiers 1A/1B/3B/3C/3D/4)
 - Apple stage-2 as default (after resolution/HDR checks) + smoothing/pacing policy + glass-to-glass numbers via `tools/latency-probe`. → `apple-stage2-presenter`
+- VRR / frame-driven cadence: Stage A client present-on-arrival (Android/Linux → iOS → Windows), Stage B host frame-driven loop + VFR rate control + `FLAG_REPEAT`/`VIDEO_CAP_VRR`, Stage C gamescope `--adaptive-sync` headless experiment. → `vrr-plan`
 
 **HDR**
 - Linux 10-bit HDR (Step 4): 8-bit→Main10 shim, true 10-bit PipeWire capture (blocked upstream — gamescope #2126), Linux-client P010 + GTK color management. → `hdr-pipeline-plan`
diff --git a/design/vrr-plan.md b/design/vrr-plan.md
new file mode 100644
index 0000000..a2bd311
--- /dev/null
+++ b/design/vrr-plan.md
@@ -0,0 +1,158 @@
+# VRR over punktfunk/1 — design
+
+> **Status:** DESIGN — investigation complete (2026-07-03), nothing implemented. Key architectural
+> decision recorded here: **no VRR virtual display is needed** — client-side VRR is driven purely by
+> presentation cadence, so the host's virtual display Hz becomes a *sampling grid* decoupled from the
+> client panel. punktfunk/1-native only; GameStream/Moonlight stays fixed-cadence (stock clients).
+
+Goal: end-to-end variable refresh — the game's real frame pacing reaches the client's VRR panel
+instead of being resampled onto a fixed grid twice (host pacer, client vsync). Gains are both
+latency (the fixed-cadence quantization at capture and present is now the dominant remaining
+latency term — the Windows-client loopback p50 of ~18 ms is dominated by the 60 Hz virtual-display
+cadence while the wire is sub-millisecond) and smoothness (an 85 fps game on a 120 Hz grid presents
+as an irregular 8.3/16.7 ms alternation — judder baked in at the source that no client can undo).
+
+## The core decision: skip the VRR virtual display
+
+A VRR panel is not "driven at a framerate" by any API — it follows the presentation cadence. If the
+client presents each frame on arrival, the panel refreshes at the stream's cadence, whatever it is.
+So client VRR needs **frame-driven host emission + present-on-arrival clients**, and no VRR anywhere
+on the host display stack. This sidesteps two otherwise-hard blockers entirely:
+
+- **IddCx (Windows host) has no VRR support at all** (through 1.10, which pf-vdisplay is built
+  against): no VRR DDI, no VRR in the virtual EDID, and GPU control panels don't even list indirect
+  displays as VRR-capable. Not fixable by us; the community IDD projects' "can we fake it" issue is
+  open and unanswered.
+- **KWin/Mutter/wlroots virtual outputs are fixed-mode** (KWin hardcodes 60 Hz + out-of-band
+  `kscreen-doctor` custom modes, `vdisplay/linux/kwin.rs:101,138`; Mutter defaults 60 with the
+  `PUNKTFUNK_MUTTER_VIRTUAL_REFRESH` opt-in, `mutter.rs:244-258`; Sway takes one
+  `--custom WxH@Hz`, `wlroots.rs:93`).
+
+What a true-VRR virtual display *would* add is confined to the source end. It removes two residuals:
+(1) **sampling quantization → pacing wobble** — the game's output is sampled on the virtual
+display's fixed grid, and the game's true present times never reach the wire (our `pts_ns` is
+stamped at capture, already grid-aligned); (2) **up to one virtual-vblank of host latency** (a
+frame completed just after a composite waits for the next grid tick). Both scale with the grid:
+at 240 Hz the grid is 4.2 ms — pacing error ~±2 ms (below the ~4–5 ms perceptibility threshold)
+and ≤4.2 ms added latency. The high-Hz machinery already exists on every backend, and the Linux
+compositors composite on damage, so a 240 Hz virtual mode costs GPU work proportional to the game's
+actual fps, not 240 composites/s.
+
+**Negotiation semantics shift**: today the client requests its native WxH@Hz and the mode's Hz means
+"the cadence you'll receive." Under client-VRR the virtual display Hz is the *sampling grid* (pick
+it high), while the client's panel VRR range governs presentation only.
+
+## Where the pieces stand (investigation findings)
+
+### Wire — already ~90 % ready
+
+- Every packet carries a wall-clock **capture** timestamp: `PacketHeader.pts_ns` is the first field
+  of the 40-byte header (`punktfunk-core/src/packet.rs:52-68`), threaded to `Frame.pts_ns` and ABI
+  `PunktfunkFrame.pts_ns`. Epoch = ns since UNIX epoch, stamped host-side via `SystemTime::now()`
+  (`punktfunk1.rs:100-105`). Plus a monotonic per-AU `frame_index`.
+- The clock-skew offset is **ABI-exposed**: `punktfunk_connection_clock_offset_ns`
+  (`abi.rs:2121-2137`; NTP-style min-RTT estimate, `quic.rs:417-426`). A client can convert host
+  capture time to its own clock — the raw material for a timestamp-scheduled presenter, and
+  something Moonlight fundamentally lacks (its "frame pacing" guesses; we have a measured offset).
+- **FEC, keepalives, and reorder are rate-agnostic**: FEC is self-describing per packet and adapts
+  on loss; QUIC keepalive is 4 s/8 s; the reassembler window is frame-count-based
+  (`REORDER_WINDOW = 16`, `packet.rs:47`). Nothing in the data plane divides by fps.
+
+Missing (all small): a `FLAG_REPEAT` (or `FLAG_NEW`) bit in the already-end-to-end
+`PacketHeader.user_flags` (free bits above `FLAG_PIC/EOF/SOF/PROBE`, `packet.rs:30-36` — no header
+size change); `VIDEO_CAP_VRR = 0x08` in `video_caps` (`quic.rs:107-116`) mirrored to the ABI
+constant with the lockstep assert (`abi.rs:856-864`); an append-only Hello/Welcome trailing field
+for the client's panel refresh range (the same trailing-byte back-compat pattern used 7×). One real
+caveat: **`Reconfigure`/`Reconfigured` are fixed-length, not tail-extensible** (decode requires
+exact lengths, `quic.rs:1029,1057`) — a mid-stream VRR toggle/range change needs a new typed
+control message, not a field append.
+
+### Host — fixed cadence is a consumer-loop choice, not a capture limitation
+
+Every capture producer is already push/event-driven: PipeWire delivers a buffer per composite on
+all Linux backends (damage-driven on kwin/mutter/wlroots — a static desktop produces *nothing*;
+gamescope pushes per output frame at its `-r` rate); the pf-vdisplay ring publishes one frame per
+DWM present and signals a frame-ready event, returning `E_PENDING` when DWM composed nothing
+(`swap_chain_processor.rs:306-333`). The fixed cadence is imposed entirely by the encode loops: the
+`next += 1/effective_hz` pacer (`punktfunk1.rs:3336,3398-3401,3606`; GameStream analogue
+`gamestream/stream.rs:805-808`) re-samples via `try_latest()` and **re-encodes the last frame as a
+synthetic repeat** when nothing new arrived (`punktfunk1.rs:3169-3179`) — repeats go on the wire
+indistinguishable from new frames (the `repeat` bool is host-internal stats only).
+
+Smallest cadence change: block on the existing `next_frame()` (Linux `recv_timeout`, IDD
+`WaitForSingleObject` on the frame-ready event) and submit one encode per delivered frame, keeping
+an **idle-timeout repeat** so a damage-idle desktop still emits keepalive frames. The wire PTS is
+already wall-clock, so timestamps survive unchanged.
+
+The load-bearing fixed-fps assumption is **rate control**: both encoder paths run CBR with a
+~1-frame VBV sized `bitrate/fps` and feed `frame_idx` as the encoder PTS
+(`encode/linux/mod.rs:280-297` — `time_base(1/fps)`, VBV `bitrate/fps × PUNKTFUNK_VBV_FRAMES`
+default 1; `encode/windows/nvenc.rs:663-672,787-788` — `frameRateNum = fps`, VBV `bitrate/fps`;
+PTS = `frame_idx` at `nvenc.rs:1189-1206` / `mod.rs:167,535`). Variable intervals won't corrupt
+ordering, but a game at 85 fps in a "240 Hz-grid" session gets a `bitrate/240` budget per frame,
+so it reaches only ~35 % (85/240) of the bitrate target, and bursts fight the 1-frame VBV. VFR
+needs the real capture PTS on the encoder timeline, plus either `frameRate` budgeted at the
+*expected* rate with a laxer VBV or a move to VBR/CQ. This is the one real technical knot.
+
+### Clients — all four are vsync-locked newest-wins today
+
+No client has any tearing/VRR/present-immediate path; `clock_offset_ns` is used only for the
+latency HUD. Queue depth is 1–2 slots newest-wins everywhere; no de-jitter buffer anywhere.
+
+| Client | Today | Present-on-arrival path |
+|---|---|---|
+| Android | `releaseOutputBuffer(render=true)` immediately on the newest drained buffer (`native/src/decode.rs:274-334`); `setFrameRate` fixed hint (`decode.rs:100`) | **Closest**: present is already arrival-driven; switch `setFrameRate` to the change-strategy variant (seamless vs. always) so a VRR panel follows |
+| Linux | `set_paintable` on frame arrival; GTK/compositor frame clock scans out (`ui_stream.rs:475-588`) | Arrival side done; needs compositor VRR (GNOME/KDE enable VRR for fullscreen apps — the fullscreen `GtkGraphicsOffload` dmabuf direct-scanout path is exactly the eligible case) |
+| Apple | Main-runloop `CADisplayLink` at fixed display/stream cadence + 1-slot `ReadyRing` (`SessionPresenter.swift:69-76`, `Stage2Pipeline.swift:15-37`); macOS `displaySyncEnabled=false` is *not* tearing — WindowServer still composites at vsync (`MetalVideoPresenter.swift:193-200`) | iOS/iPadOS ProMotion: wide `CAFrameRateRange` + drive render from the decode callback instead of the link. macOS: WindowServer-limited (Moonlight reports VRR-follows-stream fullscreen only) |
+| Windows | Render thread waits on the swapchain latency waitable (DWM vblank cadence), then calls `Present(1)`; no `ALLOW_TEARING` anywhere (`render.rs:157-225`, `present.rs:161-173,540`) | **Hardest**: the composition `SwapChainPanel` swapchain can't tear or independent-flip. Plausible route: arrival-driven presents through DWM's windowed VRR (windowed G-Sync/FreeSync; DWM composes on demand, panel follows); needs on-glass validation, else a fullscreen HWND swapchain mode |
+
+### Client pacing policy: scheduled present, not raw arrival
+
+Raw present-on-arrival replays network+encode jitter onto the panel. Better: present at
+`pts_ns + clock_offset_ns + D` for a small constant `D`. The shared clock absorbs jitter up to `D`
+and reproduces the *host-side* cadence exactly (still grid-quantized at the source; see residual
+(1)). `D` is a smoothness-vs-latency knob; on LAN it can be near zero. All the data for this is
+already on the wire today.
+
+## Staging
+
+1. **Stage A — client-only, no protocol change.** Timestamp-scheduled / present-on-arrival on
+   VRR-capable displays. Order: Android + Linux (architecturally ready) → iOS ProMotion → Windows
+   (DWM windowed-VRR validation) → macOS fullscreen. Biggest single latency win: removes an average
+   of ½ and a worst case of 1 client refresh (~8.3/16.7 ms at 60 Hz, halved at 120 Hz).
+2. **Stage B — host, native path only.** Frame-driven consumer loop + idle-repeat keepalive;
+   real-PTS encoder timeline + VFR-tolerant rate control; `FLAG_REPEAT` on the wire;
+   `VIDEO_CAP_VRR` + panel-range negotiation; grid-Hz mode semantics. Kills the capture-side
+   quantization down to the grid and stops burning encode on synthetic repeats.
+3. **Stage C — optional gamescope experiment.** gamescope has `--adaptive-sync` and it works even
+   nested per upstream #1694; we don't pass it in the headless spawn
+   (`vdisplay/linux/gamescope.rs:975-980`), and whether the *headless* backend honors it is
+   unverified (untestable until the dev VM's GPU passthrough returns). If it works, it removes even
+   the sampling grid on the path that matters most for gaming, at near-zero implementation cost.
+   An optimization, not the architecture. KWin/Mutter/wlroots/IddCx true VRR: upstream-blocked,
+   do not pursue.
+
+## Open questions / risks
+
+- **VFR rate control per encoder**: exact NVENC/VAAPI/AMF/QSV recipe (real-timestamp `time_base`
+  vs max-rate + enlarged VBV vs VBR/CQ); interaction with the 1-frame-VBV latency property we rely
+  on. The main Stage-B risk item.
+- **Does gamescope headless honor `--adaptive-sync`?** (Stage C gate; needs the GPU back.)
+- **DWM windowed VRR with a composition swapchain**: does arrival-cadence presenting through the
+  XAML `SwapChainPanel` actually drive a G-Sync/FreeSync panel variably? On-glass validation gates
+  the Windows-client stage-A entry.
+- **Panel VRR floor / LFC**: the idle-keepalive repeat cadence sets the stream's minimum rate. If
+  it sits below a panel's ~48 Hz floor, the client compositor/driver's LFC should handle frame
+  doubling; verify this, and don't park the keepalive interval right at a floor boundary.
+- **Android**: seamless (`CHANGE_FRAME_RATE_ONLY_IF_SEAMLESS`) vs non-seamless switch strategy,
+  and real-device VRR panel coverage.
+- **Hello semantics**: how a VRR-capable client picks the grid Hz to request (host advertises its
+  max grid? client just asks 240 and the host clamps like today's mode ladder?).
+
+## External evidence (2026-07-03)
+
+- gamescope `--adaptive-sync` works in nested mode: [ValveSoftware/gamescope#1694](https://github.com/ValveSoftware/gamescope/issues/1694)
+- IddCx has no VRR path; community "can we fake it" open/unanswered: [Virtual-Display-Driver#24](https://github.com/itsmikethetech/Virtual-Display-Driver/issues/24), [IddCx DDI index](https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/iddcx/)
+- Client VRR panels do follow Moonlight's stream cadence in practice (and it's messy — our shared
+  clock is the differentiator): [moonlight-qt#1545](https://github.com/moonlight-stream/moonlight-qt/issues/1545), macOS fullscreen-only [moonlight-qt#1509](https://github.com/moonlight-stream/moonlight-qt/issues/1509)
+- Mutter `RecordVirtual` derives refresh from PipeWire; VRR only on real monitors: [mutter!1154](https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1154)