claude-code: MCP servers don't respawn after process death mid-session [2 comments, 3 participants]

GitHub stats
anthropics/claude-code#45146 (fetched 2026-04-09 08:12:10)
Comments: 2 · Participants: 3 · Timeline events: 5 · Reactions: 0
Timeline (top): labeled ×3, commented ×2

When an MCP server process dies (killed, crashed, OOM, or stale code cleanup), the session permanently loses access to that server's tools. There is no recovery mechanism — the user must start a new session.

Description

When an MCP server process dies (killed, crashed, OOM, or stale code cleanup), the session permanently loses access to that server's tools. There is no recovery mechanism — the user must start a new session.

Reproduction

  1. Start a Claude Code session with MCP servers configured (e.g., a custom Substack MCP server)
  2. Use pkill or kill to terminate the MCP server process
  3. Attempt to use any tool from that MCP server
  4. Result: "MCP server disconnected" — tools are gone for the rest of the session
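
The steps above can be approximated outside Claude Code to see what a parent process observes when a stdio child dies. This is a minimal sketch, not Claude Code's actual behavior: `FAKE_SERVER` and `simulate_server_death` are hypothetical names, and the child is a stand-in that only blocks on stdin rather than a real MCP server.

```python
import subprocess
import sys

# Hypothetical stand-in for a stdio MCP server: it just blocks reading stdin.
FAKE_SERVER = [sys.executable, "-c", "import sys; sys.stdin.read()"]


def simulate_server_death():
    """Start the stand-in server, kill it, and report what the parent sees."""
    proc = subprocess.Popen(
        FAKE_SERVER,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        bufsize=0,  # unbuffered pipes, so a failed write surfaces immediately
    )
    proc.kill()           # stands in for step 2 (kill / pkill)
    proc.wait(timeout=5)  # reap the child so returncode is populated

    write_failed = False
    try:
        proc.stdin.write(b"{}\n")  # step 3: try to talk to the dead server
    except OSError:  # BrokenPipeError is a subclass of OSError
        write_failed = True
    finally:
        proc.stdin.close()
        proc.stdout.close()
    return proc.returncode, write_failed
```

Both signals, a populated exit code and a failed write on the pipe, are available to the parent as soon as the child is gone, which is what makes automatic recovery plausible.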

Expected Behavior

Claude Code should detect MCP server process death and attempt to respawn the server using the original configuration from .claude.json or .mcp.json. If respawn fails after N retries, report the failure clearly.
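
A rough sketch of the requested behavior follows. It assumes the config file stores each server under an `mcpServers` key with `command`, `args`, and `env` fields; verify that against your own `.mcp.json` before relying on it. The function names, the retry count, and the startup window are illustrative choices, not anything Claude Code is known to implement.

```python
import json
import os
import subprocess


def load_server_spec(config_text, name):
    """Return (argv, env) for one server entry.

    Assumes an `mcpServers` mapping with `command`, `args`, and `env` keys.
    """
    entry = json.loads(config_text)["mcpServers"][name]
    argv = [entry["command"], *entry.get("args", [])]
    env = {**os.environ, **entry.get("env", {})}
    return argv, env


def respawn_from_config(config_text, name, max_retries=3, startup_delay=0.5):
    """Respawn a stdio server from its original config, or fail clearly."""
    argv, env = load_server_spec(config_text, name)
    last_error = None
    for _ in range(max_retries):
        try:
            proc = subprocess.Popen(
                argv, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE
            )
        except OSError as exc:  # e.g. the command no longer exists
            last_error = f"spawn failed: {exc}"
            continue
        try:
            # If the child exits inside the startup window, treat it as failed.
            code = proc.wait(timeout=startup_delay)
        except subprocess.TimeoutExpired:
            return proc  # still running after the startup window
        proc.stdin.close()
        proc.stdout.close()
        last_error = f"exited during startup with code {code}"
    raise RuntimeError(
        f"MCP server {name!r} did not respawn after {max_retries} attempts "
        f"({last_error})"
    )
```

Re-reading the config on each respawn, rather than caching the parsed command, would also pick up edits made while the session is running, which matters for the actively developed servers described under Context.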

Context

This is a significant pain point in environments where:

  • MCP server code is actively being developed and updated (stale processes cache old code)
  • Multiple Claude Code sessions share MCP servers and one kills a shared process
  • MCP servers crash due to transient errors (network timeouts, API rate limits)
  • Long-running sessions need persistent tool access

Currently the only workaround is starting a new session, which loses all conversation context.

Related Issues

This compounds with other MCP connection fragility:

  • Playwright MCP hardcodes WebSocket UUIDs that go stale on browser restart (workaround: --cdp-endpoint http://localhost:PORT)
  • Claude-in-Chrome native messaging host conflicts between Claude Desktop and Claude Code (Desktop's host wins the race)

Environment

  • Claude Code v2.1.96
  • macOS Darwin 25.3.0
  • MCP servers: stdio-based (Python, Node)

extent analysis

TL;DR

Implementing a mechanism to detect and respawn MCP server processes when they die or become unresponsive is likely to resolve the issue of permanent tool loss.

Guidance

  • Investigate adding a process monitoring and respawn mechanism that detects when an MCP server process dies, for example with a library like psutil for Python or node-ps-utils for Node.
  • Consider implementing a retry mechanism with a limited number of attempts (e.g., N retries) to respawn the MCP server process before reporting a failure.
  • Review the .claude.json and .mcp.json configuration files to ensure they contain the necessary information to respawn the MCP server process with the original configuration.
  • Evaluate the potential for using a more robust connection mechanism, such as a message queue or a service discovery protocol, to improve the resilience of MCP server connections.
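
The first two bullets can also be sketched with the standard library alone, since a parent can ask a child it spawned whether it has exited. The `SupervisedServer` class below is hypothetical: it checks liveness lazily (before each use) and restarts within a fixed budget, with an exponential backoff whose values are arbitrary placeholders.

```python
import subprocess
import time


class SupervisedServer:
    """Hypothetical wrapper that restarts a stdio child when it has exited."""

    def __init__(self, argv, max_restarts=3, backoff=0.1):
        self.argv = argv
        self.max_restarts = max_restarts
        self.backoff = backoff
        self.restarts = 0
        self.proc = self._spawn()

    def _spawn(self):
        return subprocess.Popen(
            self.argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )

    def ensure_alive(self):
        """Return a live process, restarting the server if it has exited."""
        while self.proc.poll() is not None:  # non-None means the child exited
            if self.restarts >= self.max_restarts:
                raise RuntimeError(
                    f"server exited with code {self.proc.returncode}; "
                    f"restart budget of {self.max_restarts} exhausted"
                )
            # Back off a little longer before each successive restart.
            time.sleep(self.backoff * 2 ** self.restarts)
            self.restarts += 1
            # Release the dead child's pipes before replacing it.
            self.proc.stdin.close()
            self.proc.stdout.close()
            self.proc = self._spawn()
        return self.proc
```

A caller would invoke `ensure_alive()` immediately before sending each request. A child that crashes right after a restart is not caught by the same call, only by the next one, so a real implementation would likely pair this with a startup handshake.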

Example

A simple example of a respawn mechanism in Python using psutil could be:

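import time

import psutil


def respawn_mcp_server(process_config, max_attempts=3, startup_delay=1.0):
    """Try to restart an MCP server; return the process, or None on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            # Start a new process from the original configuration
            process = psutil.Popen(
                process_config['command'], **process_config.get('kwargs', {})
            )
            # Give the server a moment to start (or to fail fast)
            time.sleep(startup_delay)
            # poll() returns None while the child is alive. It is used here
            # instead of is_running(), which may still report True for a
            # child that has exited but has not been reaped yet.
            if process.poll() is None:
                return process
            print(f"Attempt {attempt}: MCP server exited with code {process.returncode}")
        except (OSError, psutil.Error) as e:
            # Log the error and continue to the next attempt
            print(f"Attempt {attempt}: error respawning MCP server: {e}")
            time.sleep(startup_delay)
    # Report failure after max_attempts
    print("Failed to respawn MCP server")
    return None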

Notes

The implementation details of the respawn mechanism will depend on the specific requirements and constraints of the Claude Code and MCP server environments.

Recommendation

Apply a workaround by implementing a custom respawn mechanism, as upgrading to a fixed version is not mentioned in the issue. This will allow for a more robust and resilient connection to MCP servers, reducing the impact of process deaths and crashes.
