claude-code: [FEATURE] Relay mode: split UI from agentic loop for low-bandwidth/high-latency connections (1 participant)

claude-code · 2026-04-30 17:54:54

anthropics/claude-code#55087 • Fetched 2026-05-01 05:46:35

Author

kimn1944

Participants

kimn1944

Timeline (top)

labeled ×4

Root Cause

Current experience: Running Claude Code directly is unusable because each agentic turn uploads a growing context over the slow link. Running it over SSH on the server makes the TUI sluggish: every streamed token and every tool output is rendered as terminal escape sequences that travel back over the link. Even mosh doesn't help because it can't predict full-screen TUI redraws.

Fix / Workaround

The workaround (SSH to a remote machine) trades one problem for another. Running Claude Code over SSH to a machine with fast internet solves the bandwidth problem, but introduces keystroke latency. Claude Code is a full-screen TUI — every screen redraw (streaming tokens, tool output rendering, status bar updates, spinner animation) must travel back as terminal escape sequences. Even mosh can't help: its local echo prediction only works for simple shell prompts, not full-screen alternate-screen-buffer applications. Mosh falls back to pure server-side rendering for TUI apps, and its state-synchronization protocol is still gated by RTT for acknowledgments (~2 updates/sec on 500ms RTT).

Preflight Checklist

I have searched existing requests and this feature hasn't been requested yet
This is a single feature request (not multiple features)

Problem Statement

Claude Code's architecture requires the full conversation context — user prompts, tool call results, and LLM responses — to round-trip between the local machine and Anthropic's API on every turn. For users on slow, high-latency, or unstable connections (remote regions, mobile tethering, VPN-constrained corporate networks), this makes Claude Code effectively unusable.

Tool outputs dominate bandwidth. A single Read of a large file, a Grep across a codebase, or a Bash command producing verbose output can be tens of KB. This data is generated locally, sent to the API, and the response streams back — all over the constrained link. On a 50 Kbps connection, a single 20 KB tool result takes ~3 seconds just to upload, and this compounds across multi-turn agentic loops with dozens of tool calls.
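The "~3 seconds" figure above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming 1 KB = 1000 bytes and 1 Kbps = 1000 bits per second, and ignoring protocol overhead, latency, and compression; the function name `upload_seconds` is only illustrative.

```python
def upload_seconds(payload_bytes: int, link_bits_per_second: float) -> float:
    """Time to push payload_bytes over the link, ignoring overhead and latency."""
    return payload_bytes * 8 / link_bits_per_second


# A single 20 KB tool result on a 50 Kbps link.
single_result = upload_seconds(20_000, 50_000)

# Thirty such results in one agentic session, counting each result only once.
thirty_results = 30 * single_result

print(f"one 20 KB result: {single_result:.1f} s")
print(f"thirty results:   {thirty_results:.0f} s")
```

Under these assumptions one result costs about 3.2 seconds of pure upload time, and thirty cost about a minute and a half before any re-sent context is counted.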

Proposed Solution

A client-server split ("relay mode") where:

A thin local client runs on the user's machine, handling only:

User input (responsive, zero-latency typing)
Displaying streamed LLM text output
Permission prompts (approve/deny tool use)

A relay server runs on a machine with fast internet (cloud VM, workstation, etc.), handling:

The full agentic loop (tool calls execute locally on the relay machine)
API communication with Anthropic (fast, low-latency)
Workspace file access (codebase lives on the relay machine)

The wire between them carries only:

Upstream: user prompts (small, text only — typically < 1 KB)
Downstream: LLM text output and permission requests (semantic content only — no escape sequences, no tool result payloads, no framebuffer diffs)
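To make the downstream rule concrete, here is a rough sketch of how a relay server might decide what crosses the wire. The issue does not define a wire format, so the event shape and the type names (`assistant_text`, `permission_request`, `tool_result`) are hypothetical, as is the function `downstream_frames`; none of this is Claude Code's actual schema.

```python
import json
from typing import Iterable, Iterator

# Hypothetical event types; only these are forwarded to the thin client.
FORWARDED_TYPES = {"assistant_text", "permission_request"}


def downstream_frames(events: Iterable[dict]) -> Iterator[str]:
    """Yield newline-delimited JSON frames for events the thin client needs."""
    for event in events:
        if event.get("type") in FORWARDED_TYPES:
            yield json.dumps(event, separators=(",", ":")) + "\n"


# Example session on the relay: one text chunk, one large tool result,
# and one permission request.
events = [
    {"type": "assistant_text", "text": "Reading the config file..."},
    {"type": "tool_result", "content": "x" * 10_000},
    {"type": "permission_request", "tool": "Bash", "command": "make test"},
]

frames = list(downstream_frames(events))
bytes_sent = sum(len(frame.encode("utf-8")) for frame in frames)
print(f"{len(frames)} frames, {bytes_sent} bytes sent downstream")
```

In this sketch the 10 KB tool result never leaves the relay; only the two small semantic frames are sent, which is the property the proposal depends on.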

UX could look like:

claude --relay-server --port 9847          # on the remote machine
claude --relay-connect host:port           # on the local machine

Transport could run over an SSH tunnel for auth/encryption, or use its own TLS + token auth.

Claude Code already has building blocks for this: --input-format stream-json, --output-format stream-json, --include-partial-messages, and permission mode flags.

Alternative Solutions

| Alternative | Why insufficient |
| --- | --- |
| SSH + mosh | Mosh prediction fails for full-screen TUIs; still bandwidth-heavy for screen diffs |
| VS Code Remote SSH + extension | Extension runs remotely but VS Code Remote protocol still sends substantial UI data; not available for terminal-only users |
| API compression (#13911) | Helps marginally but doesn't address the fundamental issue: tool outputs shouldn't cross the slow link at all |
| `--print` with remote execution | Works per-turn but loses interactive features (permission prompts, conversation continuity requires manual `--resume` chaining) |

Priority

High - Significant impact on productivity

Feature Category

Performance and speed

Use Case Example

Developer on a 50 Kbps / 500ms RTT connection (rural area, travel, VPN-constrained corporate network) needs to use Claude Code against a codebase on a remote server.

With relay mode: User runs claude --relay-server on the remote machine, sets up an SSH tunnel, and runs claude --relay-connect localhost:9847 locally. Typing is instant. Tool calls (file reads, greps, bash commands) execute on the remote machine and never cross the slow link. Only the LLM's text responses stream back — roughly 100x less bandwidth than the current architecture.

| Scenario | Current (direct) | SSH/mosh | Relay mode |
| --- | --- | --- | --- |
| LLM streams 500 tokens | ~50 KB upload (context) + stream back | Terminal diffs of full TUI | ~2 KB streamed text |
| Tool reads 10 KB file | 10 KB upload to API | 10 KB rendered as terminal diffs | 0 bytes (stays on relay) |
| Grep returns 200 matches | Rendered + uploaded | Terminal diffs | 0 bytes |
| Multi-turn 20-tool session | Cumulative context grows each turn | All rendered remotely | Only final text per turn |
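The "cumulative context grows each turn" row can be illustrated with a simple model. This is a sketch under stated assumptions: every turn re-sends the whole conversation so far (as the Problem Statement describes), the starting context is the ~50 KB from the first row, each tool call adds a 10 KB result, and model output added to the context is ignored. The function name `cumulative_upload_bytes` is illustrative only.

```python
def cumulative_upload_bytes(base_context: int, tool_result: int, turns: int) -> int:
    """Total bytes uploaded when every turn re-sends the whole context so far."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context        # the whole context goes up on this turn
        context += tool_result  # the new tool result joins the next turn's context
    return total


total_bytes = cumulative_upload_bytes(50_000, 10_000, 20)
seconds_at_50_kbps = total_bytes * 8 / 50_000

print(f"total upload: {total_bytes / 1_000_000:.1f} MB")
print(f"time at 50 Kbps: {seconds_at_50_kbps / 60:.1f} minutes")
```

Under this model a 20-tool session uploads about 2.9 MB in the direct architecture, which is several minutes of upload time on a 50 Kbps link. In relay mode the same tool results stay on the relay machine.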

Additional Context

Why "just get better internet" isn't an answer:

Many developers work from locations where fast connections aren't available — rural areas, developing regions, travel, satellite internet
Enterprise VPNs often impose bandwidth caps and add 100-200ms latency that users can't bypass
Agentic workloads amplify the problem: 5-30 tool calls per task, each adding a round trip, making marginal connections completely impractical

Related issues:

#53719 — CLI-as-remote-control-client (closest architectural neighbor: proposes attaching a local CLI to a remote session, but motivated by terminal preference and assumes a fast link between client and server; doesn't address bandwidth/latency on the wire)
#25570 — iOS thin client via claude --serve (similar split UI/backend pattern, but motivated by mobile access; tool execution stays on the user's local machine)
#49790 — Persistent remote sessions surviving disconnect (complementary)
#22408 — High-latency SSH rendering fix (addressed flickering, not the fundamental bandwidth problem)
#49657 — Multi-device session sync (overlapping motivation)

extent analysis

TL;DR

Implement a client-server split, "relay mode", to reduce bandwidth usage by executing tool calls on a remote machine with fast internet and streaming only LLM text output to the local client.

Guidance

Investigate using existing building blocks such as --input-format stream-json, --output-format stream-json, and permission mode flags to implement the relay mode.
Design a transport protocol for the wire between the local client and relay server, potentially using SSH tunnel for auth/encryption or TLS + token auth.
Develop a UX for users to easily set up and connect to the relay server, such as the proposed claude --relay-server and claude --relay-connect commands.
Test the relay mode with various scenarios, including different tool calls and multi-turn sessions, to ensure it significantly reduces bandwidth usage and improves performance.
Consider implementing additional features, such as persistent remote sessions and multi-device session sync, to complement the relay mode.

Example

claude --relay-server --port 9847  # on the remote machine
claude --relay-connect host:port  # on the local machine

Notes

The proposed solution requires significant changes to the existing architecture and may affect the system's security, scalability, and maintainability. The relay itself may also introduce new latency or performance issues if not properly optimized.

Recommendation

Implement the proposed client-server split ("relay mode"). It addresses the fundamental issue of tool outputs crossing the slow link and has the potential to significantly improve performance for users with slow or high-latency connections.
