claude-code - 💡(How to fix) Fix [Bug] stream-json system.init session_id is a per-invocation tag in --resume mode (re #8069) [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#58760Fetched 2026-05-14 03:40:11
View on GitHub
Comments
1
Participants
2
Timeline
6
Reactions
0
Author
Timeline (top)
labeled ×5commented ×1

Error Message

For interactive CLI usage this is mostly cosmetic. For orchestrators that integrate claude -p as a subprocess and rely on system.init.session_id for retry-on-transient-error logic, this is a silent data-corruption bug: the tag overwrites the canonical SID in their state store, and subsequent --resume <tag> calls fail with No conversation found. An orchestrator we run drives claude -p subprocesses for a multi-phase code-generation pipeline. Per-phase session IDs are persisted in a SQLite column for use as --resume targets across phases and for transient-error retry.

Root Cause

I'm not opening a PR on this because the fix lives in claude-code's session manager and the correct approach (1 vs 2) is a product decision. Happy to provide more orchestrator-side details or run additional diagnostics if useful.

Fix Action

Fix / Workaround

Code reference: selfcoding/backend/app/services/claude_service.py @ 8b524c5 — the gate if on_session_start is not None and _current_resume is None: is the necessary workaround.

Code Example

cd "$(mktemp -d)"
git init -q

# 1. Fresh session — system.init.session_id == disk SID. 
claude -p "say hi briefly" \
       --output-format stream-json --verbose \
       --permission-mode bypassPermissions 2>/dev/null \
  | jq -r 'select(.type=="system" and .subtype=="init") | .session_id' \
  | head -1
# → e.g. d8fe6a3d-81ae-46cd-8ab2-137e3bfde15c
SID=d8fe6a3d-81ae-46cd-8ab2-137e3bfde15c
ls ~/.claude/projects/$(pwd | sed 's|/|-|g')/${SID}.jsonl
# → exists ✓

# 2. Resume — system.init.session_id is a brand-new UUID.claude -p "what did you just say?" --resume "$SID" \
       --output-format stream-json --verbose \
       --permission-mode bypassPermissions 2>/dev/null \
  | jq -r 'select(.type=="system" and .subtype=="init") | .session_id' \
  | head -1
# → e.g. 360c5f35-9e0e-4bdb-964c-30cd0ef8de04   (different from $SID)
ls ~/.claude/projects/$(pwd | sed 's|/|-|g')/360c5f35*.jsonl
# → no such file ✗
ls -la ~/.claude/projects/$(pwd | sed 's|/|-|g')/${SID}.jsonl
# → mtime updated — disk session IS being appended to under $SID
RAW_BUFFERClick to expand / collapse

[Bug] stream-json system.init.session_id is a per-invocation tag in --resume mode, causing orchestrator retry to fail with "No conversation found"

Re-opening the topic of #8069 (closed as not_planned on 2026-01-09, locked since 2026-01-16) with fresh production-orchestrator evidence on 2.1.139.

TL;DR

When claude -p --resume <X> --output-format stream-json is invoked:

  • The disk session file (~/.claude/projects/<cwd-encoded>/<X>.jsonl) is correctly appended to throughout the invocation. ✅
  • But the system.init event emitted on stdout has session_id: <NEW_UUID> where <NEW_UUID> ≠ X, AND <NEW_UUID> has no corresponding .jsonl file on disk. ❌

For interactive CLI usage this is mostly cosmetic. For orchestrators that integrate claude -p as a subprocess and rely on system.init.session_id for retry-on-transient-error logic, this is a silent data-corruption bug: the tag overwrites the canonical SID in their state store, and subsequent --resume <tag> calls fail with No conversation found.

Reproduction (2.1.139, macOS, subscription auth)

Start a fresh session, capture the disk SID, then resume and observe system.init.session_id:

cd "$(mktemp -d)"
git init -q

# 1. Fresh session — system.init.session_id == disk SID. ✅
claude -p "say hi briefly" \
       --output-format stream-json --verbose \
       --permission-mode bypassPermissions 2>/dev/null \
  | jq -r 'select(.type=="system" and .subtype=="init") | .session_id' \
  | head -1
# → e.g. d8fe6a3d-81ae-46cd-8ab2-137e3bfde15c
SID=d8fe6a3d-81ae-46cd-8ab2-137e3bfde15c
ls ~/.claude/projects/$(pwd | sed 's|/|-|g')/${SID}.jsonl
# → exists ✓

# 2. Resume — system.init.session_id is a brand-new UUID. ❌
claude -p "what did you just say?" --resume "$SID" \
       --output-format stream-json --verbose \
       --permission-mode bypassPermissions 2>/dev/null \
  | jq -r 'select(.type=="system" and .subtype=="init") | .session_id' \
  | head -1
# → e.g. 360c5f35-9e0e-4bdb-964c-30cd0ef8de04   (different from $SID)
ls ~/.claude/projects/$(pwd | sed 's|/|-|g')/360c5f35*.jsonl
# → no such file ✗
ls -la ~/.claude/projects/$(pwd | sed 's|/|-|g')/${SID}.jsonl
# → mtime updated — disk session IS being appended to under $SID ✓

Production impact (orchestrator perspective)

An orchestrator we run drives claude -p subprocesses for a multi-phase code-generation pipeline. Per-phase session IDs are persisted in a SQLite column for use as --resume targets across phases and for transient-error retry.

The integration was built against the assumption that system.init.session_id is the canonical disk SID (which it is for fresh sessions). Reality is otherwise for --resume invocations. On 2026-05-11 a PLAN phase hit a transient stop_sequence mid-revision, the retry layer pulled the tag from system.init.session_id and issued claude -p --resume <tag>, claude responded with No conversation found with session ID: <tag> (since no <tag>.jsonl exists), the failure escalated to terminal, and ~1h of accumulated Codex review + revision work was unrecoverable.

We've worked around it by:

  1. Never trusting system.init.session_id in --resume calls.
  2. Keeping the SID captured on the first (no---resume) invocation as the canonical value, ignoring all later system.init.session_id emissions for that conversation.
  3. Re-using the canonical SID for every --resume call AND for retry-resume after transient failures.

Code reference: selfcoding/backend/app/services/claude_service.py @ 8b524c5 — the gate if on_session_start is not None and _current_resume is None: is the necessary workaround.

Suggested fixes (any one would resolve)

  1. Most surgical: in --resume <X> mode, emit session_id: <X> (the canonical) in system.init instead of a fresh tag. Matches user expectation, fixes #8069 / #12235 / #10806 wholesale.
  2. Additive: keep the new tag for diagnostic purposes, but add a sibling field — original_session_id or resumed_from — on system.init so integrators can recover the canonical. This is what #12235 and #10806 explicitly asked for.
  3. Docs-only fallback: if the current behavior is intentional (e.g. the tag is meaningful for some internal purpose), document it clearly in Run Claude Code programmatically so SDK integrators know not to trust system.init.session_id in resume mode.

Option 1 or 2 unblock orchestrator implementations. Option 3 at least prevents future integrators from stepping on the same mine.

Related closed issues (all auto-closed; not fixed)

  • #8069 (canonical, not_planned, locked)
  • #12235 (closed as duplicate of #8069, locked)
  • #10806 (closed as duplicate of #8069, locked)
  • #23948 (partial-session-corruption-on-resume — related, separate symptom)
  • #21067 (resume hangs on large tool outputs — related, separate symptom)

Environment

  • Claude Code: 2.1.139
  • Platform: macOS 25.4.0 (darwin), arm64
  • Auth: subscription (Max)
  • Output format: stream-json --verbose

Notes

I'm not opening a PR on this because the fix lives in claude-code's session manager and the correct approach (1 vs 2) is a product decision. Happy to provide more orchestrator-side details or run additional diagnostics if useful.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING