claude-code - 💡(How to fix) Fix Expose Claude Code harness process-state to a per-instance state file

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Fix Action

Fix / Workaround

None of these can distinguish a permission-prompted instance (correctly idle) from a hung instance (incorrectly idle). The current mitigation is a sticky-false-positive flag that suppresses alerts after the user acknowledges the false positive once — this is a patch on the symptom. Process-state externalization resolves the class at root.

Code Example

{
  "state": "awaiting_permission",
  "tool": "Bash",
  "tool_args_summary": "git push origin main",
  "since": "2026-05-24T08:30:00Z",
  "pid": 12345
}
RAW_BUFFERClick to expand / collapse

Problem

External tooling that monitors multiple Claude Code instances cannot distinguish "agent is waiting for user input" from "agent is stuck in a loop or hung." Both produce identical external signals — silence on stdout, no JSONL mtime updates. The harness internally knows its state (showing a permission prompt, idle at REPL, executing a tool, generating a response), but that state is not exposed.

For a single interactive user, this distinction is moot — they see the prompt. For multi-instance orchestration (fleet supervision, automated monitoring, idle-detection layers), it produces a high-frequency false-positive class: monitoring infrastructure flags instances as "stuck" when they are actually waiting on the user to approve a permission prompt.

Proposed solution

Have the Claude Code harness write its current state to a small file on a state-change cadence. Suggested location: alongside the existing .claude/ config, or a well-known per-project state directory.

Example schema:

{
  "state": "awaiting_permission",
  "tool": "Bash",
  "tool_args_summary": "git push origin main",
  "since": "2026-05-24T08:30:00Z",
  "pid": 12345
}

State values (suggested):

  • awaiting_permission — permission prompt shown, waiting for Allow/Deny
  • idle_repl — at the REPL, waiting for user input
  • executing_tool — tool call in progress (tool name in tool field)
  • generating — model is generating a response
  • awaiting_user_question — model has asked the user a question via AskUserQuestion

The file should be written atomically (write-then-rename) on every state transition. Stale files (process no longer running) should be detectable via the embedded pid.

Use case

I run an orchestrated set of ~13 Claude Code instances, each with its own working directory. A separate monitoring instance polls the set looking for stuck or hung instances. Currently the only external signals available are:

  • Transcript JSONL mtime (updates only when a turn completes)
  • A custom append-only log emitted at turn boundaries
  • Process existence (PID check)

None of these can distinguish a permission-prompted instance (correctly idle) from a hung instance (incorrectly idle). The current mitigation is a sticky-false-positive flag that suppresses alerts after the user acknowledges the false positive once — this is a patch on the symptom. Process-state externalization resolves the class at root.

Backward compatibility

Pure additive feature. New file appears in a well-known location; absent the feature, downstream tooling falls back to existing signals. No breaking change.

Adjacent considerations

  • A subscribe/notify path (named pipe, websocket, or fsnotify-watchable file) would be ideal for low-latency probes, but a plain file polled at 1-2s cadence is sufficient for the use case.
  • If a single state file is too invasive, an opt-in flag (CLAUDE_CODE_EXPORT_STATE=1 env var or settings.json field) would unblock the use case without affecting users who don't need it.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Expose Claude Code harness process-state to a per-instance state file