openclaw - ✅(Solved) Fix Claude CLI sessions reset on every gateway restart due to ephemeral loopback port in mcpConfigHash [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#64386 · Fetched 2026-04-11 06:15:08
Comments: 0 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): referenced ×3, cross-referenced ×2

Every gateway restart silently wipes all persisted Claude CLI session memory. The first turn after a restart for any session logs cli session reset: provider=claude-cli reason=mcp and starts a fresh claude -p instead of claude --resume <id>. Users experience this as "the agent suddenly forgot everything."

Reproducible on main. Affects any deployment using the claude-cli backend with the loopback MCP bridge (i.e. the default configuration since #35676).

Root Cause

Two independently-correct commits interact badly:

  1. 12100719b8 — "fix: preserve cli sessions across model changes" introduced mcpConfigHash as a CLI-session-reuse invalidation key in src/agents/cli-session.ts. At that time the hashed mergedConfig contained only user-authored MCP state (plugin .mcp.json, inline mcpServers from bundle manifests) — stable across restarts, so hashing was safe.

  2. 3de09fbe74 — "fix: restore claude cli loopback mcp bridge (#35676)" merged the in-gateway loopback server into the same mergedConfig via additionalConfig in prepareCliBundleMcpConfig (src/agents/cli-runner/prepare.ts:116-138). The loopback URL is constructed in src/gateway/mcp-http.loopback-runtime.ts:22-38:

    url: `http://127.0.0.1:${port}/mcp`,

    The port comes from src/gateway/mcp-http.ts:26 which calls startMcpLoopbackServer(port = 0) — OS-assigned ephemeral port, different on every gateway start. That literal port ends up in the JSON that src/agents/cli-runner/bundle-mcp.ts:301-302 hashes:

    const serializedConfig = `${JSON.stringify(params.mergedConfig, null, 2)}\n`;
    const mcpConfigHash = crypto.createHash("sha256").update(serializedConfig).digest("hex");

Result: mcpConfigHash changes on every gateway start, so resolveCliSessionReuse (src/agents/cli-session.ts:148-151) returns { invalidatedReason: "mcp" } on the first turn of every previously-persisted session after every restart.

The auth token is not the culprit — it's referenced as ${OPENCLAW_MCP_TOKEN} and resolved via env, so it never enters the hashed bytes. The port is the sole offender because it is a literal in the URL.
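The instability described above can be reproduced in isolation. The following is a minimal sketch, not the gateway's code: the `McpConfig` type, the `openclaw` server key, and the `Authorization` header are assumptions made for illustration; only the URL template, the `${OPENCLAW_MCP_TOKEN}` placeholder, and the two hashing lines come from the issue.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal shape; the real BundleMcpConfig type is not shown in the issue.
type McpConfig = {
  mcpServers: Record<string, { url: string; headers?: Record<string, string> }>;
};

// Mirrors the two hashing lines quoted from bundle-mcp.ts.
function hashConfig(config: McpConfig): string {
  const serialized = `${JSON.stringify(config, null, 2)}\n`;
  return createHash("sha256").update(serialized).digest("hex");
}

// Loopback entry as the issue describes it: literal port, env-placeholder token.
// The server key and header name are assumptions.
function loopbackConfig(port: number): McpConfig {
  return {
    mcpServers: {
      openclaw: {
        url: `http://127.0.0.1:${port}/mcp`,
        // Plain string, not a template literal: the placeholder stays unresolved.
        headers: { Authorization: "Bearer ${OPENCLAW_MCP_TOKEN}" },
      },
    },
  };
}

const beforeRestart = hashConfig(loopbackConfig(62949));
const afterRestart = hashConfig(loopbackConfig(51734));
const samePortAgain = hashConfig(loopbackConfig(62949));

console.log(beforeRestart === afterRestart); // false: the port is part of the hashed bytes
console.log(beforeRestart === samePortAgain); // true: nothing else varies between runs
```

Because the token appears only as an unresolved placeholder string, it contributes the same bytes on every run; the port is the only input that changes.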

Fix / Workaround

Beyond the immediate bug, there is a layering issue: bundle-mcp.ts treats all entries in mcpServers as equivalent user-authored config, but the loopback entry is gateway-internal runtime state. Its contribution to session identity should be "is OpenClaw's own tool surface attached" (boolean), not "which ephemeral port did we bind today." Any future ephemeral value merged into mergedConfig (PID, tempdir path, per-start identifier) will reintroduce the same class of bug.

Compute mcpConfigHash from the user-authored mergedConfig before additionalConfig (the loopback) is merged in, then merge the loopback on top only for writing the actual mcp.json / CLI args. Rough shape in src/agents/cli-runner/bundle-mcp.ts#prepareCliBundleMcpConfig:

const hashableConfig = mergedConfig; // user-authored only
if (params.additionalConfig) {
  mergedConfig = applyMergePatch(mergedConfig, params.additionalConfig) as BundleMcpConfig;
}
return await prepareModeSpecificBundleMcpConfig({
  mode,
  backend: params.backend,
  mergedConfig,                // includes loopback — used to write mcp.json
  hashSource: hashableConfig,  // excludes loopback — used for session identity
  env: params.env,
});
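To see the effect of separating the hash source from the written config, here is a self-contained sketch of the same shape. It is an illustration under assumptions: `mergeOverlay` is a simplified, non-mutating stand-in for `applyMergePatch`, and `prepare`, `McpConfig`, and the server names are hypothetical.

```typescript
import { createHash } from "node:crypto";

type McpServer = { url?: string; command?: string };
type McpConfig = { mcpServers: Record<string, McpServer> };

function hashConfig(config: McpConfig): string {
  return createHash("sha256").update(`${JSON.stringify(config, null, 2)}\n`).digest("hex");
}

// Simplified stand-in for applyMergePatch: non-mutating, replaces entries by key.
function mergeOverlay(base: McpConfig, overlay: McpConfig): McpConfig {
  return { mcpServers: { ...base.mcpServers, ...overlay.mcpServers } };
}

// Hypothetical reduction of prepareCliBundleMcpConfig to the part that matters here.
function prepare(params: { userConfig: McpConfig; additionalConfig?: McpConfig }) {
  const hashSource = params.userConfig; // user-authored only
  const mergedConfig = params.additionalConfig
    ? mergeOverlay(params.userConfig, params.additionalConfig)
    : params.userConfig;
  return {
    writtenConfig: mergedConfig, // includes loopback, would be written to mcp.json
    mcpConfigHash: hashConfig(hashSource), // excludes loopback, used for session identity
  };
}

const user: McpConfig = { mcpServers: { docs: { command: "docs-mcp" } } };
const loopback = (port: number): McpConfig => ({
  mcpServers: { openclaw: { url: `http://127.0.0.1:${port}/mcp` } },
});

const run1 = prepare({ userConfig: user, additionalConfig: loopback(62949) });
const run2 = prepare({ userConfig: user, additionalConfig: loopback(51734) });
const run3 = prepare({
  userConfig: { mcpServers: { ...user.mcpServers, extra: { command: "extra-mcp" } } },
  additionalConfig: loopback(51734),
});

console.log(run1.mcpConfigHash === run2.mcpConfigHash); // true: port churn no longer matters
console.log(run2.writtenConfig.mcpServers["openclaw"]?.url); // http://127.0.0.1:51734/mcp
console.log(run2.mcpConfigHash === run3.mcpConfigHash); // false: user config still invalidates
```

The written config keeps the current port so the CLI can reach the bridge, while the hash depends only on what the user authored.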

PR fix notes

PR #64393: fix(cli): exclude loopback overlay from mcpConfigHash

Description (problem / solution / changelog)

Summary

  • Snapshots user-authored MCP state in prepareCliBundleMcpConfig before merging the loopback overlay (additionalConfig) and passes it to prepareModeSpecificBundleMcpConfig as hashSource.
  • mcpConfigHash is now computed from that snapshot instead of the loopback-merged config. Session identity is therefore port-agnostic.
  • Actual mcp.json / CLI args still embed the current loopback port so the CLI process can reach the bridge — only the hash changes.

Fixes #64386.

Why

The loopback MCP bridge binds to an OS-assigned ephemeral port on every gateway start (startMcpLoopbackServer(port = 0) in src/gateway/mcp-http.ts). createMcpLoopbackServerConfig embeds that port as a literal in the server URL, which was then merged into mergedConfig in prepareCliBundleMcpConfig and hashed in prepareModeSpecificBundleMcpConfig. As a result, mcpConfigHash changed on every gateway restart, and resolveCliSessionReuse returned { invalidatedReason: "mcp" } on the first turn of every previously persisted Claude CLI session. The agent then started a fresh claude -p instead of running claude --resume, losing all conversation memory.

The OpenClaw auth token is referenced as ${OPENCLAW_MCP_TOKEN} and resolved via env, so it never entered the hashed bytes. The port was the sole offender because it is a literal in the URL.

This was the result of two otherwise-correct commits interacting:

  • 12100719b8 "fix: preserve cli sessions across model changes" introduced mcpConfigHash. At the time, the hashed config only contained user-authored MCP state — stable across restarts.
  • 3de09fbe74 "fix: restore claude cli loopback mcp bridge (#35676)" started merging the loopback overlay into the same mergedConfig, silently turning a stable hash into a per-restart-random one.

The layering fix is to treat the loopback as gateway-internal runtime state that shouldn't contribute to session identity. Any other ephemeral value merged via additionalConfig in the future (PID, tempdir, per-start identifier) is now automatically safe for the same reason.

Test plan

New tests in src/agents/cli-runner/bundle-mcp.test.ts:

  • Hash stability: run prepareCliBundleMcpConfig twice with identical user MCP state but two different loopback ports (62949, 51734); assert mcpConfigHash is identical. Also assert the two written mcp.json files contain the correct per-run port so the bridge is still reachable.
  • Hash sensitivity (not over-corrective): same loopback URL both runs, but the second run adds a real plugin MCP server via createBundleProbePlugin; assert the hash changes.
  • Existing bundle-mcp.test.ts and cli-session.test.ts tests still pass.
  • oxlint clean on touched files.
  • tsc -p tsconfig.json --noEmit clean.

Notes

  • prepareModeSpecificBundleMcpConfig gained an optional hashSource param; when absent it falls back to hashing mergedConfig, so all existing callers are unaffected.
  • The hash snapshot is a shallow { mcpServers: { ...mergedConfig.mcpServers } } — sufficient because applyMergePatch replaces entries by key and the snapshot is not mutated afterwards.
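The snapshot note can be illustrated with a small sketch. It deliberately uses a hypothetical in-place merge (`mergeInPlace`, not the project's `applyMergePatch`) to show the aliasing problem a shallow copy of `mcpServers` avoids. A shallow copy would not protect against in-place edits to an individual server entry; the sketch assumes entries are only added or replaced by key.

```typescript
type McpServer = { url?: string; command?: string };
type McpConfig = { mcpServers: Record<string, McpServer> };

// Hypothetical merge that mutates its target, chosen to expose aliasing.
function mergeInPlace(target: McpConfig, overlay: McpConfig): McpConfig {
  Object.assign(target.mcpServers, overlay.mcpServers);
  return target;
}

const overlay: McpConfig = {
  mcpServers: { openclaw: { url: "http://127.0.0.1:62949/mcp" } },
};

// Plain alias: the would-be hash source and the merged config are the same object.
const aliasedBase: McpConfig = { mcpServers: { docs: { command: "docs-mcp" } } };
const aliasedHashSource = aliasedBase;
mergeInPlace(aliasedBase, overlay);
const aliasedKeys = Object.keys(aliasedHashSource.mcpServers);
console.log(aliasedKeys); // [ 'docs', 'openclaw' ]: the loopback leaked into the hash source

// Shallow snapshot: a new mcpServers object that later key-level merges do not touch.
const base: McpConfig = { mcpServers: { docs: { command: "docs-mcp" } } };
const snapshot: McpConfig = { mcpServers: { ...base.mcpServers } };
mergeInPlace(base, overlay);
const snapshotKeys = Object.keys(snapshot.mcpServers);
console.log(snapshotKeys); // [ 'docs' ]: user-authored entries only
```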

🤖 Generated with Claude Code

Changed files

  • src/agents/cli-runner/bundle-mcp.test.ts (modified, +336/-0)
  • src/agents/cli-runner/bundle-mcp.ts (modified, +86/-1)
  • src/agents/cli-runner/prepare.ts (modified, +1/-0)
  • src/agents/cli-session.test.ts (modified, +48/-0)
  • src/agents/cli-session.ts (modified, +15/-1)

PR #64557: fix: exclude ephemeral loopback port from CLI session mcpConfigHash (closes #64386)

Description (problem / solution / changelog)

Summary

  • Problem: Every gateway restart silently wipes all persisted Claude CLI session memory. The first turn after restart resets the session because mcpConfigHash changes.
  • Why it matters: Users experience "the agent suddenly forgot everything" after every gateway restart. Affects all deployments using the claude-cli backend with the loopback MCP bridge (default since #35676).
  • What changed: Compute mcpConfigHash from user-authored mergedConfig before additionalConfig (containing the ephemeral loopback server URL) is merged in. The loopback config is still merged for the actual mcp.json / CLI args, just excluded from the session identity hash.
  • What did NOT change: The actual MCP config written to disk still includes the loopback server. User-authored MCP config changes still correctly invalidate sessions.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #64386
  • This PR fixes a bug or regression

Root Cause (if applicable)

  • Root cause: Two independently-correct commits interact badly: (1) 12100719b8 introduced mcpConfigHash for CLI session reuse, (2) 3de09fbe74 merged the loopback MCP server (with OS-assigned ephemeral port) into the same mergedConfig that gets hashed. The ephemeral port changes on every gateway restart → hash changes → all sessions invalidated.
  • Missing detection / guardrail: No test asserts hash stability across simulated gateway restarts with port churn.
  • Contributing context: The loopback URL is constructed with port=0 (OS-assigned), so http://127.0.0.1:<random-port>/mcp produces a different hash every time.

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
  • Target test or file: src/agents/cli-runner/bundle-mcp.test.ts
  • Scenario the test should lock in: Hash computed at run N+1 equals hash at run N when only the loopback port differs
  • If no new test is added, why not: Keeping PR minimal — happy to add if requested

User-visible / Behavior Changes

  • CLI sessions now persist across gateway restarts (previously silently reset every restart)
  • Session reuse is still correctly invalidated when user-authored MCP config changes

Diagram (if applicable)

Before:
gateway restart -> new ephemeral port -> hash(mergedConfig + loopback) changes -> session reset

After:
gateway restart -> new ephemeral port -> hash(mergedConfig only, no loopback) unchanged -> session preserved

Security Impact (required)

  • New permissions/capabilities? No
  • Auth boundary changes? No
  • Secrets/token exposure risk? No
  • New external calls? No
  • Sandbox/isolation changes? No

Evidence

  • Code trace confirms ephemeral port is sole source of hash instability (auth token uses env var reference, not literal)
  • pnpm check passes
  • AI-assisted: fix authored by Claude Code, reviewed and verified manually

Human Verification (required)

  • Verified scenarios: Traced hash computation path, confirmed additionalConfig merge happens after hash
  • Edge cases checked: No additionalConfig (hash unchanged), only additionalConfig changes (hash unchanged, correct), user MCP config changes (hash changes, correct)
  • What you did not verify: Live gateway restart test

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No

Risks and Mitigations

    • Risk: If additionalConfig ever carries user-meaningful config that should affect session identity, changes to it would no longer invalidate sessions.
    • Mitigation: Currently additionalConfig only contains the loopback server (gateway-internal runtime state). The function signature documents this separation clearly.

Changed files

  • src/agents/cli-runner/bundle-mcp.ts (modified, +8/-1)

Evidence from a live gateway

2026-04-10T07:03:25 [gateway] signal SIGTERM received
2026-04-10T07:03:58 [gateway] MCP loopback server listening on http://127.0.0.1:62949/mcp
2026-04-10T07:03:58 [gateway] ready
2026-04-10T07:40:43 [agent] cli session reset: provider=claude-cli reason=mcp   <- session A, first turn after restart
2026-04-10T07:56:47 [agent] cli session reset: provider=claude-cli reason=mcp   <- session B, first turn after restart

One reset per distinct session, on its first post-restart turn — exactly the pattern expected when the stored hash was computed against a previous ephemeral port.
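The "one reset per session, then normal reuse" pattern follows from a plain hash comparison. The sketch below is a hypothetical simplification: `resolveReuse`, `runTurn`, the session ids, and the placeholder hash strings are invented for illustration and are not the actual resolveCliSessionReuse implementation.

```typescript
type StoredSession = { sessionId: string; mcpConfigHash: string };

// Hypothetical stand-in for the reuse check: compare the persisted hash with
// the hash computed for the current run.
function resolveReuse(
  stored: StoredSession | undefined,
  currentHash: string,
): { sessionId?: string; invalidatedReason?: "mcp" } {
  if (!stored) return {};
  if (stored.mcpConfigHash !== currentHash) return { invalidatedReason: "mcp" };
  return { sessionId: stored.sessionId };
}

// Sessions persisted before the restart, hashed against the previous port.
const sessions = new Map<string, StoredSession>([
  ["A", { sessionId: "a-1", mcpConfigHash: "hash-with-old-port" }],
  ["B", { sessionId: "b-1", mcpConfigHash: "hash-with-old-port" }],
]);

// One agent turn: resume when the hash matches, otherwise start fresh and
// persist the new hash so later turns in the same gateway lifetime can resume.
function runTurn(key: string, currentHash: string): string {
  const reuse = resolveReuse(sessions.get(key), currentHash);
  if (reuse.sessionId) return `claude --resume ${reuse.sessionId}`;
  sessions.set(key, { sessionId: `${key.toLowerCase()}-2`, mcpConfigHash: currentHash });
  return reuse.invalidatedReason ? `reset reason=${reuse.invalidatedReason}` : "fresh session";
}

const hashAfterRestart = "hash-with-new-port";
const firstTurnA = runTurn("A", hashAfterRestart);
const secondTurnA = runTurn("A", hashAfterRestart);
const firstTurnB = runTurn("B", hashAfterRestart);

console.log(firstTurnA); // reset reason=mcp
console.log(secondTurnA); // claude --resume a-2
console.log(firstTurnB); // reset reason=mcp
```

Each session pays the reset once, on whichever turn first compares its stored hash with the new one, which matches the staggered timestamps in the log.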

Why tests did not catch it

  • src/agents/cli-runner/bundle-mcp.test.ts asserts hash presence and format (/^[0-9a-f]{64}$/) but never asserts stability under loopback port churn.
  • src/agents/cli-session.test.ts tests resolveCliSessionReuse with hand-crafted hashes; there is no end-to-end test that the hash computed at run N+1 equals the hash persisted at run N across a simulated gateway restart.

Suggested fix

Compute mcpConfigHash from the user-authored mergedConfig before additionalConfig (the loopback) is merged in, then merge the loopback on top only for writing the actual mcp.json / CLI args. The rough shape for src/agents/cli-runner/bundle-mcp.ts#prepareCliBundleMcpConfig is the snippet under Fix / Workaround above: keep the user-authored config as hashableConfig, apply additionalConfig to mergedConfig, and pass hashableConfig as hashSource.

prepareModeSpecificBundleMcpConfig then hashes params.hashSource ?? params.mergedConfig.

Regression tests to add

  1. Run prepareCliBundleMcpConfig twice with identical user MCP state but two different loopback ports; assert prepared1.mcpConfigHash === prepared2.mcpConfigHash.
  2. Add or modify a real plugin MCP server between the two runs; assert the hash does change (so the fix is not over-corrective).

Regression window

2026-04-04 (merge of #35676) → present. Any release that ships both 12100719b8 and 3de09fbe74 is affected.

Extent analysis

TL;DR

Compute mcpConfigHash from the user-authored mergedConfig before merging the loopback configuration to prevent session resets after gateway restarts.

Guidance

  • Modify src/agents/cli-runner/bundle-mcp.ts to compute mcpConfigHash from the user-authored mergedConfig before merging additionalConfig.
  • Update prepareModeSpecificBundleMcpConfig to use params.hashSource for hashing, which excludes the loopback configuration.
  • Add regression tests to ensure mcpConfigHash stability under loopback port changes and correctness when user-authored MCP state changes.
  • Verify the fix by checking that cli session reset logs are no longer present after gateway restarts for previously-persisted sessions.
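The log check in the last bullet could be scripted along these lines. This is a sketch that assumes the log format shown in the issue's evidence excerpt; `resetsAfterLastReady` is a hypothetical helper, not part of the project. A reason=mcp reset can still be legitimate when user-authored MCP config changed, so a non-empty result is a prompt to investigate rather than proof of a regression.

```typescript
const RESET_MARKER = "cli session reset: provider=claude-cli reason=mcp";

// Returns the reset lines logged after the most recent "[gateway] ready" line.
// If no ready line is present, the whole log is scanned.
function resetsAfterLastReady(log: string): string[] {
  const lines = log
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const lastReady = lines.map((line) => line.endsWith("[gateway] ready")).lastIndexOf(true);
  return lines.slice(lastReady + 1).filter((line) => line.includes(RESET_MARKER));
}

// Lines taken from the issue's evidence excerpt (pre-fix behaviour).
const sampleLog = [
  "2026-04-10T07:03:25 [gateway] signal SIGTERM received",
  "2026-04-10T07:03:58 [gateway] MCP loopback server listening on http://127.0.0.1:62949/mcp",
  "2026-04-10T07:03:58 [gateway] ready",
  "2026-04-10T07:40:43 [agent] cli session reset: provider=claude-cli reason=mcp",
  "2026-04-10T07:56:47 [agent] cli session reset: provider=claude-cli reason=mcp",
].join("\n");

const resets = resetsAfterLastReady(sampleLog);
console.log(resets.length); // 2
```

With the fix in place and no user-side MCP changes, the same check against a post-restart log would be expected to return an empty list.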

Notes

The suggested fix assumes that the loopback configuration is the sole cause of the mcpConfigHash changes. Additional testing may be necessary to ensure that other ephemeral values are not merged into mergedConfig.

Recommendation

Apply the suggested fix to compute mcpConfigHash from the user-authored mergedConfig to prevent session resets after gateway restarts. This fix addresses the root cause of the issue and provides a stable solution for session persistence.
