claude-code - 💡(How to fix) Fix claude -p wedges (600s timeout) when invoked via Python subprocess.run with --model sonnet, --output-format json, and large prompt content (macOS arm64) [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#57787Fetched 2026-05-11 03:25:24
View on GitHub
Comments
1
Participants
2
Timeline
5
Reactions
0
Timeline (top)
labeled ×4commented ×1

claude -p reliably wedges (no progress for 600s, 4/4 attempts) when invoked from Python via subprocess.run with this specific combination of arguments and prompt content:

  • --model sonnet
  • --output-format json
  • prompt of ~11KB delivered via stdin
  • specific prompt content (a classification task with a JSON taxonomy + bookmark records — see negative control below)

The same call shape with --model haiku succeeds in ~46s. The same prompt via bash argv (instead of Python subprocess) succeeds. A synthetic 11.8KB prompt of similar length but different content via the same Python harness with sonnet succeeds in ~45s — so the wedge appears to depend on prompt content, not just length.

Reproducible on demand on macOS arm64 (Mac Mini, Darwin 25.4.0). Self-contained repro artifacts published at https://github.com/misterwiggles/agent-tars (commit ac78b1d, path bridge/Monitoring/wedge-repro-2026-05-10/).

This was discovered as the root cause of a 10-day silent-failure incident in a downstream automation pipeline; the surface fix (switching to --model haiku) is in place, but the underlying behavior looks like a bug worth filing because:

  1. The failure mode is silent — no error, no progress, just a hung subprocess until the wall-timeout kills it.
  2. It's specific to a real, common invocation pattern (Python subprocess + stdin prompt + sonnet + JSON output).
  3. It's macOS-specific (we have no Linux repro yet).

Error Message

  1. The failure mode is silent — no error, no progress, just a hung subprocess until the wall-timeout kills it. claude -p ... --model sonnet invoked from Python via subprocess.run with stdin prompt should either complete or fail with an error. It should not hang silently for 10× the typical response time.

Root Cause

This was discovered as the root cause of a 10-day silent-failure incident in a downstream automation pipeline; the surface fix (switching to --model haiku) is in place, but the underlying behavior looks like a bug worth filing because:

Fix Action

Fix / Workaround

Workarounds in use

Code Example

bridge/Monitoring/wedge-repro-2026-05-10/
├── real-classify-prompt.txt    # ~11.3KB prompt (deterministic classification task)
├── test-real-classify.py       # WEDGE: runs claude -p with sonnet via subprocess.run
└── test-real-haiku.py          # NEGATIVE CONTROL: same shape, --model haiku, succeeds ~46s

---

import subprocess, time
with open('/tmp/real-classify-prompt.txt') as f:
    prompt = f.read()
print(f"Prompt: {len(prompt)} bytes; testing with haiku")
t0 = time.time()
try:
    r = subprocess.run(
        ["claude", "-p", "--output-format", "json", "--model", "haiku"],
        input=prompt, capture_output=True, text=True, timeout=120,
    )
    print(f"haiku stdin: rc={r.returncode}, t={time.time()-t0:.1f}s, stdout_len={len(r.stdout)}")
    if r.returncode != 0:
        print(f"stderr: {r.stderr[:300]}")
except subprocess.TimeoutExpired:
    print(f"WEDGED at {time.time()-t0:.1f}s")
RAW_BUFFERClick to expand / collapse

Summary

claude -p reliably wedges (no progress for 600s, 4/4 attempts) when invoked from Python via subprocess.run with this specific combination of arguments and prompt content:

  • --model sonnet
  • --output-format json
  • prompt of ~11KB delivered via stdin
  • specific prompt content (a classification task with a JSON taxonomy + bookmark records — see negative control below)

The same call shape with --model haiku succeeds in ~46s. The same prompt via bash argv (instead of Python subprocess) succeeds. A synthetic 11.8KB prompt of similar length but different content via the same Python harness with sonnet succeeds in ~45s — so the wedge appears to depend on prompt content, not just length.

Reproducible on demand on macOS arm64 (Mac Mini, Darwin 25.4.0). Self-contained repro artifacts published at https://github.com/misterwiggles/agent-tars (commit ac78b1d, path bridge/Monitoring/wedge-repro-2026-05-10/).

This was discovered as the root cause of a 10-day silent-failure incident in a downstream automation pipeline; the surface fix (switching to --model haiku) is in place, but the underlying behavior looks like a bug worth filing because:

  1. The failure mode is silent — no error, no progress, just a hung subprocess until the wall-timeout kills it.
  2. It's specific to a real, common invocation pattern (Python subprocess + stdin prompt + sonnet + JSON output).
  3. It's macOS-specific (we have no Linux repro yet).

Repro

Three files, self-contained:

bridge/Monitoring/wedge-repro-2026-05-10/
├── real-classify-prompt.txt    # ~11.3KB prompt (deterministic classification task)
├── test-real-classify.py       # WEDGE: runs claude -p with sonnet via subprocess.run
└── test-real-haiku.py          # NEGATIVE CONTROL: same shape, --model haiku, succeeds ~46s

test-real-haiku.py (the negative control, simpler to read):

import subprocess, time
with open('/tmp/real-classify-prompt.txt') as f:
    prompt = f.read()
print(f"Prompt: {len(prompt)} bytes; testing with haiku")
t0 = time.time()
try:
    r = subprocess.run(
        ["claude", "-p", "--output-format", "json", "--model", "haiku"],
        input=prompt, capture_output=True, text=True, timeout=120,
    )
    print(f"haiku stdin: rc={r.returncode}, t={time.time()-t0:.1f}s, stdout_len={len(r.stdout)}")
    if r.returncode != 0:
        print(f"stderr: {r.stderr[:300]}")
except subprocess.TimeoutExpired:
    print(f"WEDGED at {time.time()-t0:.1f}s")

The wedge case (test-real-classify.py) is identical except "sonnet" instead of "haiku" in the argv. Both read the same real-classify-prompt.txt from disk.

Steps

  1. Clone https://github.com/misterwiggles/agent-tars at commit ac78b1d.
  2. Copy bridge/Monitoring/wedge-repro-2026-05-10/real-classify-prompt.txt to /tmp/real-classify-prompt.txt.
  3. Run python3 bridge/Monitoring/wedge-repro-2026-05-10/test-real-haiku.py → expect success in ~46s.
  4. Run python3 bridge/Monitoring/wedge-repro-2026-05-10/test-real-classify.py → expect wedge (no output for 600s, then subprocess.TimeoutExpired).

Expected behavior

claude -p ... --model sonnet invoked from Python via subprocess.run with stdin prompt should either complete or fail with an error. It should not hang silently for 10× the typical response time.

Actual behavior

The subprocess hangs with no output. kill -0 on the PID confirms the process is alive but making no I/O progress. We've observed this pattern survive past 600s; it requires SIGKILL to clear.

Environment

  • OS: macOS Darwin 25.4.0 (build 25E253), arm64 (Apple Silicon)
  • Hardware: Mac Mini (M-series)
  • claude CLI version: 2.1.138 (Claude Code)
  • Python: Python 3.9.6 (system /usr/bin/python3)
  • Reproducibility: 4/4 attempts wedged at 600s wall-timeout
  • Prompt size: 11361 bytes (~11.1KB, 102 lines)
  • Prompt content type: classification task — JSON taxonomy + 21 bookmark records, asking for structured JSON output

Negative controls (all PASS — establishes the wedge envelope)

VariationResult
Same Python harness, --model haiku instead of sonnet~46s
Same prompt via bash argv: claude -p "$(cat real-classify-prompt.txt)" --output-format json --model sonnetSucceeds
Synthetic 11.8KB prompt (similar length, different content) + Python harness + sonnet + JSON~45s

The first two narrow the wedge to {Python subprocess + stdin prompt + sonnet + JSON}. The third is the most interesting: a similar-length prompt with different content does not trigger the wedge, suggesting the issue is content-dependent within the {Python + stdin + sonnet + JSON} envelope.

Suspected mechanism

A sample capture of a wedged process showed the main thread blocked in macOS CFRunLoop / mach_msg_overwrite, suggesting an interaction between:

  • the Bun-bundled claude.exe MCP setup pipes,
  • Python subprocess.run's stdio buffering, and
  • something specific about sonnet's larger model-loading or context-handling footprint relative to haiku.

This is speculation — the symptom is "main thread parks in mach_msg_overwrite and never wakes." We haven't been able to localize whether the Bun side is waiting for stdio that never arrives, or whether the MCP-init handshake stalls on a specific buffer-fill condition that the larger sonnet startup window exposes.

Workarounds in use

  • Switching to --model haiku (works for non-sonnet-required tasks).
  • Switching to bash argv invocation (works but loses stdin streaming for prompts that exceed argv limits).

Why we're filing this

The downstream impact in our case was a 10-day silent-failure window in an automated bookmark-classification pipeline. The wrapper around the failing subprocess was using a kill -9 followed by wait pattern that silently aborted on the dead-PID exit code, which masked the wedge from log-watching. We've fixed the wrapper, but the underlying wedge is what we'd like upstream to see.

If anyone on the team wants additional probes run (system-call traces, sample dumps at different points, prompt-size sweeps to find the wedge boundary), reach out via this issue — the repro is reproducible on demand on the affected machine.


Public artifacts: https://github.com/misterwiggles/agent-tars/tree/ac78b1d/bridge/Monitoring/wedge-repro-2026-05-10

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

claude -p ... --model sonnet invoked from Python via subprocess.run with stdin prompt should either complete or fail with an error. It should not hang silently for 10× the typical response time.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING