openclaw - 💡(How to fix) Fix Multi-tool turn replay produces orphan tool_use blocks after session compaction (v2026.4.x regression) [6 comments, 4 participants]

openclaw2026-04-30 06:27:59

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#74907•Fetched 2026-05-01 05:40:06

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Participants

Timeline (top)

commented ×6mentioned ×1subscribed ×1

LLM request rejected: messages.2: `tool_use` ids were found without `tool_result` blocks
immediately after: <id1>, <id2>, <id3>, <id4>. Each `tool_use` block must have a corresponding
`tool_result` block in the next message.

The error fires on first user message after the upgrade and persists until /new. Sessions that have only single-tool turns or are short enough not to trigger compaction are unaffected.

Error Message

After upgrading from 2026.3.8 to 2026.4.x (currently installed: 2026.4.27), the gateway started producing 400 errors from Anthropic on long-running sessions that contain multi-tool assistant turns. The error is: The error fires on first user message after the upgrade and persists until /new. Sessions that have only single-tool turns or are short enough not to trigger compaction are unaffected. 4. Gateway logs embedded run agent end ... isError=true with the tool_use ids ... without tool_result blocks error. (numbers above are the original pre-compaction overflow that started the failure cascade; later runs operated on an already-compacted session and still hit the same orphan-tool_use error)

A fresh gateway boot with an in-tact, untouched JSONL/trajectory pair reproduces the error reliably without any user input — the integrity check fires during session load, then the first prompt to Anthropic 400s.

Root Cause

LLM request rejected: messages.2: `tool_use` ids were found without `tool_result` blocks
immediately after: <id1>, <id2>, <id3>, <id4>. Each `tool_use` block must have a corresponding
`tool_result` block in the next message.

The error fires on first user message after the upgrade and persists until /new. Sessions that have only single-tool turns or are short enough not to trigger compaction are unaffected.

Code Example

LLM request rejected: messages.2: `tool_use` ids were found without `tool_result` blocks
immediately after: <id1>, <id2>, <id3>, <id4>. Each `tool_use` block must have a corresponding
`tool_result` block in the next message.

---

overflow-precheck diagId: ovf-mol18r0g-2jH_Ew
sessionKey:        agent:main:main
provider:          anthropic/claude-opus-4-6
estimatedPromptTokens: 19,412,981
overflowTokens:        19,232,981
toolResultReducibleChars: 3,382,326
route:                 compact_then_truncate

RAW_BUFFERClick to expand / collapse

Summary

LLM request rejected: messages.2: `tool_use` ids were found without `tool_result` blocks
immediately after: <id1>, <id2>, <id3>, <id4>. Each `tool_use` block must have a corresponding
`tool_result` block in the next message.

The error fires on first user message after the upgrade and persists until /new. Sessions that have only single-tool turns or are short enough not to trigger compaction are unaffected.

Reproduction signature

Run a session against an anthropic-messages provider over multiple days until it accumulates many turns where the assistant calls 2+ tools in parallel (each emitted as role:"assistant" with multiple toolCall content blocks, followed by N separate role:"toolResult" messages).
Let the session grow large enough to trigger compaction (or simply continue until the prompt approaches the model budget).
Send any user message — even hey.
Gateway logs embedded run agent end ... isError=true with the tool_use ids ... without tool_result blocks error.
Auth profile is then cooled-down with reason=format, and subsequent requests fail with Provider anthropic is in cooldown (all profiles unavailable).

Diagnostic IDs from one affected install

overflow-precheck diagId: ovf-mol18r0g-2jH_Ew
sessionKey:        agent:main:main
provider:          anthropic/claude-opus-4-6
estimatedPromptTokens: 19,412,981
overflowTokens:        19,232,981
toolResultReducibleChars: 3,382,326
route:                 compact_then_truncate

(numbers above are the original pre-compaction overflow that started the failure cascade; later runs operated on an already-compacted session and still hit the same orphan-tool_use error)

What I confirmed by tracing the data

The <session>.jsonl file can be sanitized (rewriting every role:"toolResult" message to role:"user" text, and stripping toolCall blocks from assistant messages). After sanitizing, the JSONL contains zero toolCall blocks and zero toolResult messages.
On the next gateway start, the log shows agent/embedded session file repaired: rewrote 1 assistant message(s) — and the very next request fails with the same orphan tool_use IDs as before.
That means the integrity-check / repair step is sourcing the broken state from somewhere other than the JSONL. Inspection of <session>.trajectory.jsonl shows the orphan IDs are present in 6 separate prompt.submitted records (14 occurrences total). The <session>.trajectory-path.json pointer file references that trajectory.
A fresh gateway boot with an in-tact, untouched JSONL/trajectory pair reproduces the error reliably without any user input — the integrity check fires during session load, then the first prompt to Anthropic 400s.

Likely cause (best guess from outside)

The compactor or session-replay path keeps the assistant-with-multiple-tool_use message but does not group the N consecutive role:"toolResult" messages into a single role:"user" message containing N tool_result content blocks before sending to Anthropic. Anthropic enforces strict pairing: an assistant tool_use message must be immediately followed by one user message containing matching tool_result blocks for every id. If only one is paired (or the toolResult messages are summarized away while the tool_use remains), the request 400s.

The session-repair / integrity-check step then re-injects the broken assistant turn from .trajectory.jsonl on every gateway restart, defeating manual session repair from outside.

What works around it (today)

/new clears the symptom.
Cron-spawned sessions (short-lived, no accumulated history) are unaffected.
Live multi-tool turns within a fresh session continue to work normally.

The class of users affected is anyone who keeps a single long-running agent session (Telegram-bot-style use case) past the point where the prompt budget triggers compaction.

Suggested fix areas

In the JSONL → API converter: group consecutive role:"toolResult" messages into a single role:"user" message with N tool_result blocks, indexed by toolCallId.
In the compactor: never summarize/strip toolResult messages whose toolCallId matches a toolCall block in a still-retained assistant message.
In the session-repair step: do not re-inject assistant toolCall blocks from trajectory if the corresponding toolResult messages have been intentionally pruned from the JSONL — or perform the pruning symmetrically across both files.

Happy to provide more redacted log excerpts on request, or to test a fix locally.

Versions

Affected: openclaw 2026.4.12 through 2026.4.27 (still reproduces on the latest)
Last known working: 2026.3.8
Provider: anthropic via anthropic-messages API, model claude-sonnet-4-6 and claude-opus-4-6 (both reproduce)
Node: v22.22.0, macOS

extent analysis

TL;DR

The most likely fix involves modifying the JSONL to API converter to group consecutive role:"toolResult" messages into a single role:"user" message with N tool_result blocks.

Guidance

Review the JSONL to API converter to ensure it correctly handles consecutive role:"toolResult" messages and groups them into a single role:"user" message.
Verify that the compactor does not summarize or strip toolResult messages whose toolCallId matches a toolCall block in a still-retained assistant message.
Investigate the session-repair step to prevent re-injection of assistant toolCall blocks from trajectory if the corresponding toolResult messages have been pruned from the JSONL.
Test the fix locally to ensure it resolves the issue without introducing new problems.

Example

No code snippet is provided as the issue requires a deeper understanding of the system's architecture and the specific implementation of the JSONL to API converter, compactor, and session-repair step.

Notes

The issue seems to be related to the handling of toolResult messages and their pairing with tool_use messages. The fix may require modifications to multiple components of the system. It is essential to thoroughly test any changes to ensure they do not introduce new issues.

Recommendation

Apply a workaround by modifying the JSONL to API converter to correctly group consecutive toolResult messages, as this is the most likely cause of the issue. This change should be made with caution and thoroughly tested to avoid introducing new problems.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #autograd error #model save/load #optimization #mixed precision

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Multi-tool turn replay produces orphan tool_use blocks after session compaction (v2026.4.x regression) [6 comments, 4 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Summary

Reproduction signature

Diagnostic IDs from one affected install

What I confirmed by tracing the data

Likely cause (best guess from outside)

What works around it (today)

Suggested fix areas

Versions

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Multi-tool turn replay produces orphan tool_use blocks after session compaction (v2026.4.x regression) [6 comments, 4 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Summary

Reproduction signature

Diagnostic IDs from one affected install

What I confirmed by tracing the data

Likely cause (best guess from outside)

What works around it (today)

Suggested fix areas

Versions

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING