openclaw: [Bug]: [AI-Assisted] Silent agent stall: claude-cli `result` event sums cache_read across tool sub-calls, trips broken compaction gate


openclaw · 2026-05-23 01:44:29


[AI-Assisted] With the claude-cli provider (stream-json), result.cache_read_input_tokens is summed across the turn's tool-use sub-calls rather than reported per call. This inflates sessionEntry.totalTokens 6–13× on tool-heavy turns and trips the preemptive-compaction gate in cli-compaction-mgvpa49z.js. Compaction then returns compacted: false, because OpenClaw does not own conversation state for claude-cli sessions, and the agent silently stops responding on every channel until /reset.


Fix / Workaround

Secondary finding: result.usage.iterations carries only the LAST API call's stats — it is NOT a per-call breakdown, so it can't be used as a workaround. Per-call data must come from the deduped chain of assistant events (events with identical usage blocks are partial streams of the same call).

Disclosure (AI-assisted, per CONTRIBUTING.md): The analysis, reproduction design, and writeup were authored by Claude (Anthropic) collaborating with @TrophyUncle. The reproduction was run on @TrophyUncle's live OpenClaw 2026.5.18 host. The silent-stall behavior, the inflated sessions.json values, and the effect of the local mitigation patches were all observed in real production traffic, not AI-generated simulation.


Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Steps to reproduce

1. Install OpenClaw 2026.5.18 and configure the claude provider via the bundled claude-cli backend in stream-json mode with the 1M (tengu_cobalt_raccoon) long-context profile.
2. Run any non-trivial coding-agent conversation that uses multiple tool calls per turn (Bash/Edit/Read chains, etc.).
3. After several tool-heavy turns, inspect ~/.openclaw/agents/<agentId>/sessions/sessions.json for the active channel. cacheRead and totalTokens climb far above any single API call's possible cache_read (e.g., cacheRead: 882580 on a session whose largest single-call cache_read in claude-cli's own JSONL is ~111K).
4. Continue the conversation. Once /status reports tokens >1M, the agent stops responding on every channel (Telegram, dashboard, etc.). No error surfaces. /reset clears the stuck session-store entry.

Reproduced deterministically on the reporter's host across multiple sessions.

Expected behavior

result.cache_read_input_tokens should report the value for the final API sub-call of the turn (matching what each assistant event reports per-call), not the sum across sub-calls. sessionEntry.totalTokens should not exceed contextTokens for a single turn. Preemptive compaction should fire on actual budget pressure, not on accounting artifacts. The agent should not silently stall on every channel without surfacing the condition.

Actual behavior

Controlled test (2026-05-22):

Test A: single turn, no tool calls. result.cache_read_input_tokens matched the single API call's value.
Test B: single turn, three Bash tool calls (forces three API sub-calls). Per-call cache_read_input_tokens from the three assistant events: 18499, 27142, and a third logged only as "(intermediate)". The result event reported:
```
result.cache_read_input_tokens = 45641
```
45641 = 18499 + 27142, which is the sum of the per-call values, not the last one. This confirms the SUM-vs-LAST hypothesis.
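The SUM-vs-LAST comparison can be sketched as follows. This is a minimal illustration, not OpenClaw's or claude-cli's code: the `StreamEvent` shape and the `perCallCacheReads` helper are hypothetical, and the sample data uses only the two values quoted for Test B plus one assumed duplicate event standing in for a partial stream. It relies on the report's observation that assistant events with identical usage blocks belong to the same API call.

```typescript
// Hypothetical minimal shape of a stream-json event; only fields used here are modeled.
interface Usage {
  cache_read_input_tokens: number;
}
interface StreamEvent {
  type: "assistant" | "result";
  usage: Usage;
}

// Collapse consecutive assistant events that carry an identical usage block,
// treating them as partial streams of the same API call.
function perCallCacheReads(events: StreamEvent[]): number[] {
  const calls: number[] = [];
  let previousKey: string | undefined;
  for (const event of events) {
    if (event.type !== "assistant") continue;
    const key = JSON.stringify(event.usage);
    if (key !== previousKey) calls.push(event.usage.cache_read_input_tokens);
    previousKey = key;
  }
  return calls;
}

// Illustrative events: the two values quoted for Test B plus one assumed duplicate.
const testB: StreamEvent[] = [
  { type: "assistant", usage: { cache_read_input_tokens: 18499 } },
  { type: "assistant", usage: { cache_read_input_tokens: 18499 } }, // assumed partial stream
  { type: "assistant", usage: { cache_read_input_tokens: 27142 } },
  { type: "result", usage: { cache_read_input_tokens: 45641 } },
];

const calls = perCallCacheReads(testB);
const sumOfCalls = calls.reduce((total, value) => total + value, 0);
const lastCall = calls[calls.length - 1];

console.log({ calls, sumOfCalls, lastCall });
// sumOfCalls is 45641 (what the result event reported); lastCall is 27142.
```

Under these assumptions the result event matches the sum of the deduped per-call values, while a per-call reading would be the last element of `calls`.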

Downstream effect: the inflated cache_read enters OpenClaw via dist/claude-live-session-8ZWPgp8B.js (single pick("cache_read_input_tokens")), flows into dist/session-store-BzMhQnMs.js's updateSessionStoreAfterAgentRun, and lands in sessionEntry.totalTokens via deriveSessionTotalTokens (dist/usage-C0ofRzYq.js). The compaction gate at dist/cli-compaction-mgvpa49z.js then takes Math.max(estimatedPromptTokens, sessionEntry.totalTokens) — the inflated snapshot wins whenever it exceeds the real prompt size — and triggers compaction.

Compaction then fails because OpenClaw does not own conversation state for claude-cli sessions (see Additional information). compactCliTranscript returns {ok: true, compacted: false, reason: "no real conversation messages"} and warn-bails. Net effect: silent stall on every channel until /reset.
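To make the gate behavior concrete, here is a hedged sketch of the `Math.max` comparison described above. The `GateInput` type, the `shouldCompact` helper, and the `promptBudget` value of 900000 are hypothetical illustrations (the real budget comes from promptBudgetBeforeReserve, whose value is not given in this report); only the snapshot value 906776 comes from the sample sessions.json entry.

```typescript
interface GateInput {
  estimatedPromptTokens: number; // estimate of the real prompt size
  sessionTotalTokens?: number;   // snapshot from the session store, possibly inflated
  promptBudget: number;          // hypothetical stand-in for promptBudgetBeforeReserve
}

// The gate uses whichever figure is larger, so an inflated snapshot always wins.
function gateTokens(input: GateInput): number {
  return Math.max(input.estimatedPromptTokens, input.sessionTotalTokens ?? 0);
}

function shouldCompact(input: GateInput): boolean {
  return gateTokens(input) > input.promptBudget;
}

// Illustrative numbers: a prompt near the largest single-call cache read (~111K)
// against the inflated snapshot from the sample session entry.
const withInflatedSnapshot = shouldCompact({
  estimatedPromptTokens: 111_000,
  sessionTotalTokens: 906_776,
  promptBudget: 900_000,
});
const withoutSnapshot = shouldCompact({
  estimatedPromptTokens: 111_000,
  promptBudget: 900_000,
});

console.log({ withInflatedSnapshot, withoutSnapshot });
```

With these illustrative inputs the gate fires only because of the snapshot; the estimated prompt alone is far below the budget.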

OpenClaw version

2026.5.18 (build 50a2481)

Operating system

Ubuntu 26.04 LTS (WSL2 on Windows; kernel 6.6.114.1-microsoft-standard-WSL2)

Install method

npm global (~/.npm-global/lib/node_modules/openclaw)

Model

anthropic/claude-opus-4-7

Provider / routing chain

openclaw -> claude-cli (stream-json) -> Anthropic API

Additional provider/model setup details

Long-context profile: tengu_cobalt_raccoon (1M cap, contextTokens: 1048576).
claude-cli version 2.1.147.
Node v24.15.0.
Single auth profile; no proxy or routing gateway between OpenClaw and Anthropic.

Logs, screenshots, and evidence

# Sample sessions.json entry showing inflation (active channel mid-stall):
{
  "inputTokens": 18,
  "outputTokens": 11834,
  "cacheRead": 882580,
  "cacheWrite": 24178,
  "totalTokens": 906776,
  "contextTokens": 1048576,
  "totalTokensFresh": true
}

# Cross-check: max cache_read_input_tokens ever produced by any single turn
# in claude-cli's own JSONL records: ~111K. Session-store cacheRead (~882K) is ~8x that.

# Test B (three-Bash turn) result-event line:
result.cache_read_input_tokens = 45641
# Per-call values from the same turn's three `assistant` events:
assistant[0].cache_read_input_tokens = 18499
assistant[1].cache_read_input_tokens = 27142
assistant[2].cache_read_input_tokens = (intermediate)
# 45641 = 18499 + 27142 (SUM, not LAST)

# Production WebSocket observation: manual /compact returns ok:true but no state change
[ws] -> sessions.compact { ... }
[ws] <- sessions.compact ok=true compacted=false reason="no real conversation messages" 9797ms

# Filesystem evidence for state-ownership issue:
# `agent:main:telegram:direct:<id>` sessionFile path in sessions.json — file does not exist on disk.
# 18 of 18 IDs in usageFamilySessionIds for that key are missing from both
#   ~/.openclaw/agents/main/sessions/ and ~/.claude/projects/<workspace>/.
# By contrast, agent:main:main (non-claude-cli session) DOES have its JSONL on disk
# at the documented path. Persistence works in general — just not for claude-cli channels.


Raw stream-json JSONLs (`cache-read-test-2026-05-22-A.jsonl`, `cache-read-test-2026-05-22-B.jsonl`) and the full analysis worksheet (`cache-read-test-2026-05-22-analysis.md`) are available; happy to attach in a follow-up comment.
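A quick arithmetic check on the sample entry above shows how much of the total is cache reads. The numbers are copied from the entry; the observation that totalTokens equals inputTokens + cacheRead + cacheWrite holds for this sample and is not a confirmed description of how deriveSessionTotalTokens works.

```typescript
// Values copied from the sample sessions.json entry in this report.
const sampleEntry = {
  inputTokens: 18,
  outputTokens: 11834,
  cacheRead: 882580,
  cacheWrite: 24178,
  totalTokens: 906776,
  contextTokens: 1048576,
};

// For this sample, the total matches the prompt-side fields with output excluded.
const promptSideSum =
  sampleEntry.inputTokens + sampleEntry.cacheRead + sampleEntry.cacheWrite;

// Share of the total contributed by cache reads.
const cacheReadShare = sampleEntry.cacheRead / sampleEntry.totalTokens;

// Ratio against the approximate largest single-turn cache read (~111K) in claude-cli's JSONL.
const inflationFactor = sampleEntry.cacheRead / 111_000;

console.log({
  promptSideSum,
  cacheReadShare: cacheReadShare.toFixed(3),
  inflationFactor: inflationFactor.toFixed(2),
});
```

Cache reads make up roughly 97% of the stored total here, so any overcount in cache_read_input_tokens translates almost directly into the figure the compaction gate sees.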

Impact and severity

Affected: any OpenClaw deployment driving claude-cli (stream-json) on tool-heavy workloads. Coding-agent workflows are the worst case because each turn includes multiple tool-use sub-calls, multiplying the inflation factor.
Severity: High. The agent silently stops responding on every channel (Telegram, dashboard, any active session). No error surfaces in normal UI. Only recovery is manual /reset, which loses the running session.
Frequency: Deterministic once inflated sessionEntry.totalTokens exceeds promptBudgetBeforeReserve for the configured context profile. On tool-heavy work at the 1M (tengu_cobalt_raccoon) profile, observed reproducibly within a single long session. Less aggressive profiles delay onset but don't prevent it.
Consequence: Loss of in-progress agent work on /reset; silent failure mode is particularly bad because the user has no signal anything is wrong until the agent stops mid-task.

Additional information

Why compaction also fails — architectural mismatch (claude-cli provider):

When OpenClaw drives claude-cli as a provider, OpenClaw does not own the conversation state — claude-cli does, via its own session JSONLs under ~/.claude/projects/<workspace>/<cli-session-id>.jsonl and its --resume mechanism. OpenClaw's own session file at the path stored in sessions.json (~/.openclaw/agents/<agentId>/sessions/<openclaw-sid>.jsonl) is empty or non-existent for claude-cli-backed channels in our deployment (see Logs section for evidence). The compaction pipeline (sanitizeSessionHistory → validateReplayTurns → dedupeDuplicateUserMessagesForCompaction → limitHistoryTurns → sanitizeToolUseResultPairing → containsRealConversationMessages) opens that empty session file, containsRealConversationMessages returns false, and the wrapper returns compacted: false. This means manual /compact is also a silent no-op for these sessions.
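The no-op path can be sketched like this. Everything here is a hypothetical stand-in: `hasRealMessages` only approximates what containsRealConversationMessages might check, and `describeOutcome` is an assumed helper showing one way the condition could be surfaced instead of passing silently. Only the returned object shape and reason string are taken from the report.

```typescript
interface Message {
  role: string;
  content: string;
}
interface CompactResult {
  ok: boolean;
  compacted: boolean;
  reason?: string;
}

// Hypothetical approximation: a message is "real" if it is a non-empty user or assistant turn.
function hasRealMessages(messages: Message[]): boolean {
  return messages.some(
    (m) => (m.role === "user" || m.role === "assistant") && m.content.trim().length > 0,
  );
}

// With an empty or missing session file, the history is empty and nothing can be compacted.
function compactTranscript(messages: Message[]): CompactResult {
  if (!hasRealMessages(messages)) {
    return { ok: true, compacted: false, reason: "no real conversation messages" };
  }
  return { ok: true, compacted: true }; // actual summarization omitted in this sketch
}

// Assumed helper: turn an ok-but-not-compacted result into a visible message.
function describeOutcome(outcome: CompactResult): string {
  if (outcome.ok && !outcome.compacted) {
    return `compaction skipped: ${outcome.reason ?? "unknown reason"}`;
  }
  return outcome.ok ? "compaction applied" : "compaction failed";
}

const emptySessionOutcome = compactTranscript([]);
console.log(describeOutcome(emptySessionOutcome));
```

The point of the sketch is that `ok: true` with `compacted: false` is indistinguishable from success unless the caller inspects both fields.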

Any fix that tries to "make compaction work better" on the OpenClaw side will fail unless it first addresses state ownership. Structural options:

(a) Disable compaction entirely for claude-cli sessions. Most honest — admits OpenClaw doesn't own the state to compact.
(b) Read claude-cli's JSONLs to populate session.messages before compaction. Heavier lift, but principled.
(c) Delegate to claude-cli's own compaction via --resume + internal /compact invocation. Relies on claude-cli's compaction working in this mode, which is also limited.

Proposed fix — preferred (upstream root cause):

Coordinate with the Anthropic claude-cli team: the stream-json result event's cache_read_input_tokens should report the final turn's value, not the sum across nested tool-use sub-call cache reads. Clarifying that result.usage.iterations is not a per-call breakdown would also help downstream parsers.

Proposed fix — OpenClaw-side workarounds (two layers, independent):

Layer 1 (accounting): In dist/claude-live-session-8ZWPgp8B.js, take cache_read_input_tokens from the most-recent deduped assistant event rather than the result line. Restores trustworthy /status numbers.
Layer 2 (gate): Either (a) defensive clamp in dist/session-store-BzMhQnMs.js to mark totalTokens stale when it exceeds contextTokens, OR (b) short-circuit resolveSessionTokenSnapshot in dist/cli-compaction-mgvpa49z.js to disable the compaction gate for claude-cli sessions. Option (b) is structurally correct given that compaction cannot succeed for these sessions anyway.
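The two Layer 2 options might look roughly like the sketch below. It is not the reporter's patch: the `SessionEntry` shape, the `provider` field, and both function names are assumptions made for illustration, and the 1200000 figure is an invented over-window value.

```typescript
interface SessionEntry {
  provider: string;          // hypothetical field identifying the backend
  totalTokens: number;
  contextTokens: number;
  totalTokensFresh: boolean;
}

// Layer 2a (sketch): mark the snapshot stale when it exceeds the context window.
function clampSessionEntry(entry: SessionEntry): SessionEntry {
  if (entry.totalTokens > entry.contextTokens) {
    return { ...entry, totalTokensFresh: false };
  }
  return entry;
}

// Layer 2b (sketch): never feed a stored snapshot to the gate for claude-cli sessions.
function resolveSnapshotForGate(entry: SessionEntry): number | undefined {
  if (entry.provider === "claude-cli") return undefined;
  return entry.totalTokensFresh ? entry.totalTokens : undefined;
}

const inflatedEntry: SessionEntry = {
  provider: "claude-cli",
  totalTokens: 1_200_000, // illustrative value above the 1M window
  contextTokens: 1_048_576,
  totalTokensFresh: true,
};

const clampedEntry = clampSessionEntry(inflatedEntry);
const gateSnapshot = resolveSnapshotForGate(inflatedEntry);
console.log({ fresh: clampedEntry.totalTokensFresh, gateSnapshot });
```

One design consideration: a clamp keyed on contextTokens would not catch a snapshot that sits between the prompt budget and the window (if the budget is smaller than the window), which may be one reason to apply both layers.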

Local mitigations currently applied on the reporter's host (OpenClaw 2026.5.18; both patches in dist/, with .bak backups; survive restart, not upgrade):

session-store-BzMhQnMs.js — defensive exceedsContextWindow clamp (Layer 2a).
cli-compaction-mgvpa49z.js — resolveSessionTokenSnapshot returns undefined unconditionally (Layer 2b, structural).

Both currently active; the agent has run through tool-heavy sessions for several hours without recurrence of the silent-stall behavior. Happy to share the patch diffs in a follow-up comment.

Related issues:

#84305 (closed) — same family ("token accounting outpaces compaction") on Codex runtime
#85025 (open) — adjacent (defaults causing unbounded transcript growth); not the same root cause

Last known good / first known bad: N/A — this OpenClaw install has been on 2026.5.18 since first install; we do not have a known-good prior version on this host. The inflation pattern appears consistent with the per-call usage semantics being assumed throughout dist/claude-live-session-*.js, so it likely predates 2026.5.18.
