openclaw - 💡(How to fix) Fix [Bug]: Session not recovered after gateway crash — orphaned tool_use blocks cause permanent LLM rejection [1 participants]

GitHub stats
openclaw/openclaw#53781, fetched 2026-04-08 01:23:29
Comments: 0, Participants: 1, Timeline events: 1 (cross-referenced ×1), Reactions: 0

Summary

When the gateway crashes mid-tool-execution, the session JSONL contains a tool_use block with no corresponding tool_result. On restart, every subsequent LLM request is permanently rejected by the Anthropic API:

LLM request rejected: messages.91: tool_use ids were found without tool_result blocks
immediately after: functionswrite0374f29f4. Each tool_use block must have a corresponding
tool_result block in the next message.

The agent is stuck in an unrecoverable loop — it cannot read its own history, cannot patch the session, and cannot self-heal. Manual intervention (session clear + re-brief) is required every time this happens.

Root Cause

Gateway crash (e.g. Discord WS 1006 — see #53644) kills the Node process between:

  1. Writing the tool_use block to the session JSONL
  2. Executing the tool and writing the tool_result

On reload, the session history is structurally invalid per Anthropic's API contract.
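To make the failure mode concrete, here is a minimal sketch that detects the orphaned tool_use in a session file written up to step 1 but not step 2. The record shape (one `{ role, content }` message per line, Anthropic Messages API style) and the helper name `findOrphanedToolUseIds` are assumptions for illustration; OpenClaw's actual JSONL layout may differ.

```javascript
// Hypothetical record shape: one JSON message per line, { role, content },
// with content as an array of blocks (Anthropic Messages API style).
function findOrphanedToolUseIds(jsonlText) {
  const messages = [];
  for (const line of jsonlText.split('\n')) {
    if (line.trim() === '') continue;
    try {
      messages.push(JSON.parse(line));
    } catch {
      // Assumption: a crash may also leave a truncated line; skip anything unparseable.
    }
  }

  // Every tool_use id is pending until a tool_result references it.
  const pending = new Set();
  for (const message of messages) {
    const blocks = Array.isArray(message.content) ? message.content : [];
    for (const block of blocks) {
      if (block.type === 'tool_use') pending.add(block.id);
      if (block.type === 'tool_result') pending.delete(block.tool_use_id);
    }
  }
  return [...pending];
}

// Simulated session file: the process died after step 1, before step 2.
const crashedSession = [
  JSON.stringify({ role: 'user', content: [{ type: 'text', text: 'Save the report' }] }),
  JSON.stringify({
    role: 'assistant',
    content: [{ type: 'tool_use', id: 'toolu_example_1', name: 'write', input: { path: 'report.md' } }],
  }),
].join('\n');

console.log(findOrphanedToolUseIds(crashedSession)); // [ 'toolu_example_1' ]
```

Any id this returns marks a history that the provider will reject until a matching tool_result is added.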

Reproduction Steps

  1. Agent invokes a tool (e.g. write, exec)
  2. Gateway process crashes or is killed before the tool result is recorded
  3. Gateway restarts, session reloads
  4. Any message to the agent returns the tool_use ids were found without tool_result error indefinitely

Expected Behavior

On session load, OpenClaw should scan the last N messages for unpaired tool_use blocks and inject a synthetic tool_result with an error payload:

{
  "type": "tool_result",
  "tool_use_id": "<orphaned_id>",
  "content": "[Gateway crash: tool execution did not complete. Please retry.]",
  "is_error": true
}

This restores API validity and lets the agent acknowledge the failure and continue.
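The scan over the last N messages could look roughly like the sketch below. It only reports where a synthetic tool_result is needed and what it would contain; it does not modify the history. The function name `syntheticResultsForTail`, the `afterIndex` field, and the message shape (`{ role, content: [blocks] }`, tool_use blocks carrying `id`) are assumptions, not OpenClaw's API.

```javascript
// Hypothetical helper, not part of OpenClaw. Assumes Anthropic-style messages.
const CRASH_NOTE = '[Gateway crash: tool execution did not complete. Please retry.]';

function syntheticResultsForTail(messages, lastN) {
  const start = Math.max(0, messages.length - lastN);
  const found = [];

  for (let i = start; i < messages.length; i += 1) {
    const message = messages[i];
    if (message.role !== 'assistant' || !Array.isArray(message.content)) continue;

    // Collect the tool_result ids present in the very next message, if any.
    const next = messages[i + 1];
    const nextBlocks = next && Array.isArray(next.content) ? next.content : [];
    const answered = new Set(
      nextBlocks.filter((b) => b.type === 'tool_result').map((b) => b.tool_use_id),
    );

    for (const block of message.content) {
      if (block.type === 'tool_use' && !answered.has(block.id)) {
        found.push({
          afterIndex: i, // the synthetic result belongs right after this message
          result: { type: 'tool_result', tool_use_id: block.id, content: CRASH_NOTE, is_error: true },
        });
      }
    }
  }
  return found;
}

const history = [
  { role: 'user', content: [{ type: 'text', text: 'Run the build' }] },
  {
    role: 'assistant',
    content: [{ type: 'tool_use', id: 'toolu_example_2', name: 'exec', input: { command: 'npm run build' } }],
  },
];

console.log(JSON.stringify(syntheticResultsForTail(history, 10), null, 2));
```

Limiting the scan to the tail keeps session load cheap on long histories, on the assumption that a crash can only orphan the most recent tool call.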

Suggested Fix

In the session hydration / history-replay path, before sending messages to the LLM provider:

  1. Walk the message array and build a set of seen tool_use IDs
  2. For each tool_use ID, verify a tool_result follows in the next message
  3. If any are unpaired, inject a synthetic error tool_result immediately after the orphaned tool_use
  4. Log a warning so the user knows a recovery injection occurred

This is a pure defensive measure — the agent recovers gracefully and can retry the failed tool call.

Impact

  • Agent becomes permanently unresponsive after any mid-tool gateway crash
  • Session context is lost (users must clear and re-brief the agent)
  • No user-configurable workaround exists today
  • Related to #53644 (gateway crash on Discord WS 1006) — that issue causes the crash; this issue describes the downstream consequence

Environment

  • OpenClaw v2026.3.12
  • Node.js v24.13.1
  • Windows 10 x64
  • Provider: anthropic (claude-sonnet-4-6)

extent analysis

Fix Plan

To resolve the issue, we need to implement a session hydration fix that injects synthetic tool_result blocks for unpaired tool_use IDs. Here are the steps:

  • Modify the session hydration code to walk the message array and build a set of seen tool_use IDs.
  • For each tool_use ID, verify a tool_result follows in the next message.
  • If any are unpaired, inject a synthetic error tool_result immediately after the orphaned tool_use.
  • Log a warning to notify the user of the recovery injection.

Example code snippet:

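// Sketch only. Assumes Anthropic Messages API shape: each message is
// { role, content } with content as an array of blocks; a tool_use block
// carries `id`, a tool_result block carries `tool_use_id`. OpenClaw's
// on-disk session format may differ and need mapping first.
const CRASH_NOTE = '[Gateway crash: tool execution did not complete. Please retry.]';

const hydrateSession = (messages) => {
  const repaired = [];

  for (let i = 0; i < messages.length; i += 1) {
    const message = messages[i];
    repaired.push(message);
    if (message.role !== 'assistant' || !Array.isArray(message.content)) continue;

    const toolUseIds = message.content
      .filter((block) => block.type === 'tool_use')
      .map((block) => block.id);
    if (toolUseIds.length === 0) continue;

    // Results must sit in the message immediately after the tool_use.
    const next = messages[i + 1];
    const nextIsUser = Boolean(next) && next.role === 'user' && Array.isArray(next.content);
    const answered = new Set(
      (nextIsUser ? next.content : [])
        .filter((block) => block.type === 'tool_result')
        .map((block) => block.tool_use_id),
    );

    const synthetic = toolUseIds
      .filter((id) => !answered.has(id))
      .map((id) => ({ type: 'tool_result', tool_use_id: id, content: CRASH_NOTE, is_error: true }));
    if (synthetic.length === 0) continue;

    console.warn(`Session recovery: injected ${synthetic.length} synthetic tool_result block(s)`);

    if (nextIsUser) {
      // Merge into the following user message, tool_result blocks first.
      repaired.push({ ...next, content: [...synthetic, ...next.content] });
      i += 1; // the following message has been emitted already
    } else {
      repaired.push({ role: 'user', content: synthetic });
    }
  }

  return repaired;
};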

Verification

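To verify the fix, reproduce the crash (kill the gateway process while a tool call is in flight), restart the gateway, and send a message to the agent. The agent should acknowledge the failed tool call and continue processing messages instead of returning the tool_use ids were found without tool_result error.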
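Before testing against the live provider, the repaired history can be checked offline. The sketch below lists every tool_use that lacks a tool_result in the immediately following user message. The name `findPairingViolations` and the message shape are assumptions for illustration, and the check covers only the pairing rule quoted in the error message, not every provider-side validation.

```javascript
// Hypothetical checker. Assumes Anthropic-style messages: { role, content: [blocks] }.
function findPairingViolations(messages) {
  const violations = [];

  messages.forEach((message, index) => {
    if (message.role !== 'assistant' || !Array.isArray(message.content)) return;

    // Only tool_result blocks in the next user message count as a match.
    const next = messages[index + 1];
    const nextBlocks =
      next && next.role === 'user' && Array.isArray(next.content) ? next.content : [];
    const resultIds = new Set(
      nextBlocks.filter((b) => b.type === 'tool_result').map((b) => b.tool_use_id),
    );

    for (const block of message.content) {
      if (block.type === 'tool_use' && !resultIds.has(block.id)) {
        violations.push({ messageIndex: index, toolUseId: block.id });
      }
    }
  });

  return violations;
}

const brokenHistory = [
  { role: 'user', content: [{ type: 'text', text: 'Write the file' }] },
  {
    role: 'assistant',
    content: [{ type: 'tool_use', id: 'toolu_example_3', name: 'write', input: { path: 'notes.txt' } }],
  },
];

const repairedHistory = [
  ...brokenHistory,
  {
    role: 'user',
    content: [{
      type: 'tool_result',
      tool_use_id: 'toolu_example_3',
      content: '[Gateway crash: tool execution did not complete. Please retry.]',
      is_error: true,
    }],
  },
];

console.log(findPairingViolations(brokenHistory).length);   // 1
console.log(findPairingViolations(repairedHistory).length); // 0
```

An empty result after hydration suggests the recovery injection covered every orphaned tool call in the session.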

Extra Tips

  • Make sure to log a warning when injecting synthetic results to notify the user of the recovery injection.
  • Consider implementing a retry mechanism for failed tool calls to improve user experience.
  • Review the gateway crash issue (#53644) and implement a fix to prevent crashes from occurring in the first place.
