claude-code - 💡(How to fix) MCP connection cache race condition in clearServerCache causes 'Already connected to a transport' crash on session resume [1 participant]

GitHub stats
anthropics/claude-code#48260 · Fetched 2026-04-16 07:04:54
Comments: 0 · Participants: 1 · Timeline: 4 · Reactions: 0
Timeline (top): labeled ×4


Bug description

clearServerCache() in src/services/mcp/client.ts has a race condition where connectToServer.cache.delete(key) executes after the onclose handler has already cleared and repopulated the cache with a fresh connection. This causes the delete to remove the new entry, leaving stale references that crash with "Already connected to a transport" on the next connect() call.

Reproduction

  1. Start a session with at least one MCP server configured
  2. Let the session idle for hours (sandbox TTL expires, session goes cold)
  3. Resume the session — JSONL is restored, CLI subprocess spawns
  4. CLI detects stale MCP connections → calls clearServerCache() → crash

This is intermittent — it depends on the async timing between cleanup(), onclose, and reconnection.

Root cause

In clearServerCache() (client.ts ~L1648):

export async function clearServerCache(name, serverRef) {
  const key = getServerCacheKey(name, serverRef)

  try {
    const wrappedClient = await connectToServer(name, serverRef)  // [1] get from cache
    if (wrappedClient.type === 'connected') {
      await wrappedClient.cleanup()  // [2] close transport → triggers onclose
    }
  } catch {}

  connectToServer.cache.delete(key)  // [4] PROBLEM: deletes NEW entry from [3]
}

When cleanup() calls client.close():

  • The transport's onclose handler fires synchronously (L1374)
  • onclose calls connectToServer.cache.delete(key) (L1396) — clears the old entry ✓
  • onclose then calls originalOnclose() (L1399) → useManageMCPConnections's reconnect logic
  • Reconnect calls connectToServer(name, config) → cache miss → creates new connection [3]
  • Back in clearServerCache, L1666 connectToServer.cache.delete(key) runs — deletes the new connection [3]

Now the new connection [3] is tracked in useManageMCPConnections state but is missing from the memoize cache. The next call to connectToServer misses the cache and creates yet another connection while the previous one's transport is still alive, which produces "Already connected to a transport".
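The sequence above can be reproduced in a small stand-alone model. This is only a sketch under stated assumptions: `makeModel`, `connect`, `clearBuggy`, and `demoRace` are hypothetical names, a plain `Map` stands in for the memoize cache, and `cleanup()` imitates the onclose handler by deleting the entry and reconnecting immediately. It is not the actual client.ts code.

```typescript
// Simplified model of the race. All names are hypothetical stand-ins.
type Conn = { id: number; cleanup: () => Promise<void> };

function makeModel() {
  const cache = new Map<string, Promise<Conn>>(); // stands in for connectToServer.cache
  const reconnects: Promise<Conn>[] = [];         // connections created by "onclose"
  let nextId = 1;

  // Stand-in for the memoized connectToServer: cache hit or create a new entry.
  function connect(key: string): Promise<Conn> {
    const hit = cache.get(key);
    if (hit) return hit;
    const conn: Conn = {
      id: nextId++,
      cleanup: async () => {
        cache.delete(key);             // onclose clears the old entry
        reconnects.push(connect(key)); // reconnect logic creates and caches a new one [3]
      },
    };
    const created = Promise.resolve(conn);
    cache.set(key, created);
    return created;
  }

  // Ordering described in the issue: cleanup first, delete last.
  async function clearBuggy(key: string): Promise<void> {
    try {
      const client = await connect(key); // [1]
      await client.cleanup();            // [2]
    } catch {}
    cache.delete(key);                   // [4] removes the entry created at [3]
  }

  return { cache, reconnects, connect, clearBuggy };
}

async function demoRace() {
  const m = makeModel();
  await m.connect('srv');                    // connection 1
  await m.clearBuggy('srv');                 // reconnect creates connection 2, then it is evicted
  const reconnected = await m.reconnects[0];
  const stillCached = m.cache.has('srv');    // false: connection 2 is alive but uncached
  const next = await m.connect('srv');       // cache miss, so connection 3 is created
  return { reconnectedId: reconnected.id, stillCached, nextId: next.id };
}
```

In this model the reconnected connection is still referenced by its owner but no longer reachable through the cache, so the next lookup builds a third connection next to it.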

Suggested fix

Delete the cache entry before cleanup, and retrieve the cached promise directly instead of calling the memoized function:

export async function clearServerCache(name, serverRef) {
  const key = getServerCacheKey(name, serverRef)
  const cached = connectToServer.cache.get(key)   // grab reference
  connectToServer.cache.delete(key)                // delete FIRST
  fetchToolsForClient.cache.delete(name)
  fetchResourcesForClient.cache.delete(name)
  fetchCommandsForClient.cache.delete(name)

  if (cached) {
    try {
      const client = await cached
      if (client.type === 'connected') await client.cleanup()
    } catch {}
  }
}

This way, when onclose fires during cleanup(), its cache.delete(key) is a no-op (already deleted). Any reconnection triggered by onclose stores a new entry that won't be clobbered.
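To illustrate why the reordered delete leaves the reconnected entry alone, here is a minimal model of the suggested fix. It is a sketch, not the real implementation: `connect`, `clearFixed`, and `demoFixed` are hypothetical names, a plain `Map` stands in for the memoize cache, and `cleanup()` imitates onclose by deleting the entry and reconnecting right away.

```typescript
// Simplified model of the suggested ordering. All names are hypothetical stand-ins.
type Conn = { id: number; cleanup: () => Promise<void> };

const cache = new Map<string, Promise<Conn>>(); // stands in for connectToServer.cache
const reconnects: Promise<Conn>[] = [];         // connections created by "onclose"
let nextId = 1;

// Stand-in for the memoized connectToServer: cache hit or create a new entry.
function connect(key: string): Promise<Conn> {
  const hit = cache.get(key);
  if (hit) return hit;
  const conn: Conn = {
    id: nextId++,
    cleanup: async () => {
      cache.delete(key);             // no-op once clearFixed has already deleted the entry
      reconnects.push(connect(key)); // reconnect stores a fresh entry
    },
  };
  const created = Promise.resolve(conn);
  cache.set(key, created);
  return created;
}

// Suggested ordering: grab the cached promise, delete first, clean up afterwards.
async function clearFixed(key: string): Promise<void> {
  const cached = cache.get(key);
  cache.delete(key);
  if (cached) {
    try {
      const client = await cached;
      await client.cleanup();
    } catch {}
  }
}

async function demoFixed() {
  await connect('srv');                 // connection 1
  await clearFixed('srv');              // reconnect creates connection 2 and it stays cached
  const reconnected = await reconnects[0];
  const again = await connect('srv');   // cache hit: same connection, no third one
  return { reconnectedId: reconnected.id, reused: again === reconnected, size: cache.size };
}
```

Here the final lookup returns the reconnected connection instead of creating another one, which is the behavior the fix is aiming for.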

Environment

  • Claude Code: latest (observed via @anthropic-ai/claude-agent-sdk query() API)
  • OS: Linux (Kubernetes pod)
  • MCP server: custom stdio server
  • Trigger: session resume after hours of inactivity

extent analysis

TL;DR

Delete the cache entry before cleanup in clearServerCache to prevent deleting the new connection entry.

Guidance

  • Identify the clearServerCache function in src/services/mcp/client.ts and modify it to delete the cache entry before calling cleanup().
  • Retrieve the cached promise directly using connectToServer.cache.get(key) instead of calling the memoized function.
  • Verify that the onclose handler's cache.delete(key) call is a no-op after the cache entry has been deleted.
  • Test the modified clearServerCache function to ensure it resolves the "Already connected to a transport" crash.

Example

export async function clearServerCache(name, serverRef) {
  const key = getServerCacheKey(name, serverRef)
  const cached = connectToServer.cache.get(key)   // grab reference
  connectToServer.cache.delete(key)                // delete FIRST
  // ...
}

Notes

This fix assumes that the onclose handler's reconnect logic is correct and will store a new entry in the cache. If the reconnect logic is flawed, additional modifications may be necessary.
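If further hardening turned out to be necessary, one possible option (not part of the issue's suggested fix) is an identity-guarded delete, which removes a cache entry only when it is still the one the caller captured. The sketch below uses a hypothetical helper `deleteIfSame` and assumes the cache behaves like a `Map` with `get` and `delete`.

```typescript
// Hypothetical helper: delete the entry only if the cache still holds `expected`.
function deleteIfSame<K, V>(cache: Map<K, V>, key: K, expected: V): boolean {
  if (cache.get(key) !== expected) return false; // someone replaced the entry; leave it
  return cache.delete(key);
}

// Usage: a stale reference cannot evict a newer entry stored under the same key.
const demoCache = new Map<string, Promise<number>>();
const oldEntry = Promise.resolve(1);
const newEntry = Promise.resolve(2);
demoCache.set('srv', newEntry);

const removedStale = deleteIfSame(demoCache, 'srv', oldEntry);   // entry is kept
const keptAfterStale = demoCache.has('srv');
const removedCurrent = deleteIfSame(demoCache, 'srv', newEntry); // entry is removed
```

The comparison is by reference, so it only works when callers hold the exact promise object that was stored in the cache.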

Recommendation

Apply the suggested fix to the clearServerCache function, as it directly addresses the identified race condition and prevents the deletion of the new connection entry.
