openclaw - ✅(Solved) Fix Circuit breaker kill run leaves orphaned tool calls in persistent message history [1 pull requests, 1 comments, 2 participants]

openclaw • 2026-05-13 02:29:54

GitHub stats

openclaw/openclaw#81252 • Fetched 2026-05-14 03:34:07

Author

ygc3817922006-sketch

Participants

clawsweeper[bot]

ygc3817922006-sketch

Timeline (top)

closed ×1, commented ×1, cross-referenced ×1, mentioned ×1

Error Message

content: [{ type: "text", text: "No result provided" }], isError: true,

Root Cause

The circuit breaker performs a hard abort of the current run without rolling back the message history. The messages table already contains assistant role messages with toolCall content, but no matching tool role messages with toolResult content.

transform-messages.js (part of the pi runtime) then encounters these unpaired tool calls during message preparation and generates the synthetic error as a defensive measure.
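
The unpaired state described above can be detected with a single scan over the history. The following is a minimal sketch, not the pi runtime's actual code: the `Message` shape, the `toolCallId` field, and the `findOrphanToolCallIds` helper are assumptions made for illustration.

```typescript
// Hypothetical, simplified message shapes; the real pi runtime schema may differ.
type TextPart = { type: "text"; text: string };
type ToolCall = { type: "toolCall"; id: string; name: string };

type Message =
  | { role: "user"; content: TextPart[] }
  | { role: "assistant"; content: Array<TextPart | ToolCall> }
  | { role: "toolResult"; toolCallId: string; content: TextPart[]; isError: boolean };

// Return the ids of tool calls that never received a matching tool result.
function findOrphanToolCallIds(messages: Message[]): string[] {
  // First pass: remember every call id that has a result somewhere.
  const answered = new Set<string>();
  for (const m of messages) {
    if (m.role === "toolResult") answered.add(m.toolCallId);
  }
  // Second pass: collect assistant tool calls with no recorded result.
  const orphans: string[] = [];
  for (const m of messages) {
    if (m.role !== "assistant") continue;
    for (const part of m.content) {
      if (part.type === "toolCall" && !answered.has(part.id)) orphans.push(part.id);
    }
  }
  return orphans;
}

const history: Message[] = [
  { role: "user", content: [{ type: "text", text: "list files" }] },
  { role: "assistant", content: [{ type: "toolCall", id: "call_1", name: "exec" }] },
  { role: "toolResult", toolCallId: "call_1", content: [{ type: "text", text: "a.txt" }], isError: false },
  // The run is aborted here, so call_2 never gets a result.
  { role: "assistant", content: [{ type: "toolCall", id: "call_2", name: "read" }] },
];

const orphanIds = findOrphanToolCallIds(history);
```

In this example only `call_2` is reported, which is the kind of entry that later triggers the synthetic "No result provided" error.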

Fix Action

Fixed

Fixed by PR: fix(agents): repair persisted tool result pairing (https://github.com/openclaw/openclaw/pull/81397)

PR fix notes

PR #81397: fix(agents): repair persisted tool result pairing

Repository: openclaw/openclaw
Author: stainlu
State: open | merged: False
Link: https://github.com/openclaw/openclaw/pull/81397

Description (problem / solution / changelog)

Summary

Problem: interrupted or killed tool runs can leave persisted session JSONL with toolResult entries separated from their assistant tool-call entry, duplicated, or orphaned.
Why it matters: session-file repair runs before OpenClaw loads a transcript. If the durable JSONL keeps invalid tool-result ordering, the same session can keep failing on later turns after restart.
What changed: the session-file repair pass now moves matching persisted tool results next to their assistant tool call and drops duplicate or orphan persisted tool results before loading the transcript.
What did NOT change: runtime/provider replay can still synthesize missing generic tool results when a provider policy needs it. The durable disk repair does not invent missing generic outputs; it only preserves and reorders real persisted entries or removes invalid persisted entries. The existing Codex-specific session-file repair for aborted tool outputs remains unchanged.
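
The repair behavior summarized above (move the real matching result next to its call, drop duplicates and orphans, never invent missing results) can be sketched on an in-memory list of entries. This is an illustrative approximation only, not the code in src/agents/session-file-repair.ts: the `Entry` shape and the `repairToolResultPairing` function are assumptions, and only the counter names are borrowed from the console output in this PR.

```typescript
// Hypothetical, simplified transcript entry; the real JSONL schema may differ.
type Entry =
  | { id: string; role: "user"; text: string }
  | { id: string; role: "assistant"; toolCallIds: string[] }
  | { id: string; role: "toolResult"; toolCallId: string; text: string };

type RepairReport = {
  entries: Entry[];
  movedToolResults: number;
  droppedDuplicateToolResults: number;
  droppedOrphanToolResults: number;
};

function repairToolResultPairing(entries: Entry[]): RepairReport {
  // Collect every tool-call id announced by an assistant entry.
  const knownCalls = new Set<string>();
  for (const e of entries) {
    if (e.role === "assistant") {
      for (const callId of e.toolCallIds) knownCalls.add(callId);
    }
  }

  // Keep the first real result per call id; count duplicates and orphans.
  const firstResult = new Map<string, { entry: Entry; index: number }>();
  let droppedDuplicateToolResults = 0;
  let droppedOrphanToolResults = 0;
  entries.forEach((e, index) => {
    if (e.role !== "toolResult") return;
    if (!knownCalls.has(e.toolCallId)) droppedOrphanToolResults++;
    else if (firstResult.has(e.toolCallId)) droppedDuplicateToolResults++;
    else firstResult.set(e.toolCallId, { entry: e, index });
  });

  // Rebuild the transcript with each kept result directly after its call.
  const out: Entry[] = [];
  let movedToolResults = 0;
  entries.forEach((e, index) => {
    if (e.role === "toolResult") return; // re-emitted next to its call
    out.push(e);
    if (e.role !== "assistant") return;
    let emitted = 0;
    for (const callId of e.toolCallIds) {
      const hit = firstResult.get(callId);
      if (!hit) continue; // missing results are NOT synthesized
      if (hit.index !== index + 1 + emitted) movedToolResults++;
      out.push(hit.entry);
      emitted++;
    }
  });

  return { entries: out, movedToolResults, droppedDuplicateToolResults, droppedOrphanToolResults };
}

const corrupted: Entry[] = [
  { id: "msg-user", role: "user", text: "run the tool" },
  { id: "msg-assistant", role: "assistant", toolCallIds: ["call_1"] },
  { id: "msg-user-followup", role: "user", text: "any update?" },
  { id: "msg-tool-result", role: "toolResult", toolCallId: "call_1", text: "done" },
  { id: "msg-tool-result-dup", role: "toolResult", toolCallId: "call_1", text: "done" },
  { id: "msg-orphan", role: "toolResult", toolCallId: "call_9", text: "stray" },
];

const report = repairToolResultPairing(corrupted);
```

The sketch works on a copy and returns a new array, which leaves room for a caller to back up the original file before rewriting it, as the PR describes for the durable JSONL.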

Change Type

Scope

Linked Issue/PR

Closes #58608
This PR fixes a bug or regression

Real behavior proof

Behavior or issue addressed: persisted session JSONL with a displaced matching toolResult, a duplicate toolResult, and an orphan toolResult.
Real environment tested: local macOS OpenClaw checkout, current main JSONL session-file repair path, real temp session file, and production repairSessionFileIfNeeded.
Exact steps or command run after this patch: ran a local node --import tsx --input-type=module command that wrote a corrupted session.jsonl, invoked repairSessionFileIfNeeded, then read the durable JSONL back from disk.
Evidence after fix: console output from that command:

OpenClaw console output: repair result {
  "repaired": true,
  "movedToolResults": 1,
  "droppedDuplicateToolResults": 1,
  "droppedOrphanToolResults": 1,
  "hasBackup": true
}
OpenClaw console output: repaired role order session > user > assistant > toolResult > user
OpenClaw console output: repaired ids proof-session > msg-user > msg-assistant > msg-tool-result > msg-user-followup
OpenClaw console output: preserved moved result parent preserved-parent

Observed result after fix: the durable JSONL transcript is rewritten so the real matching tool result sits immediately after the assistant tool call, the duplicate and orphan tool results are removed, the moved entry metadata is preserved, and the original session file is backed up.
What was not tested: a live provider call after killing an active tool run, because this patch is isolated to deterministic durable transcript repair.

Root Cause

Root cause: transcript replay already knew how to repair invalid tool-call/tool-result order in memory, but session-file repair did not repair persisted tool-result pairing before loading JSONL transcript entries.
Missing detection / guardrail: existing disk repair covered malformed JSONL, malformed messages, blank user text, empty assistant error turns, and Codex missing outputs, but did not cover toolResult entries that were displaced, duplicated, or orphaned in the persisted transcript.
Contributing context: interruption paths can persist partial tool-call sequences. Once persisted, the invalid entries survive restart and can poison later context rebuilds for the session.

Regression Test Plan

Coverage level that should have caught this:
- Unit test
- Seam / integration test
- End-to-end test
- Existing coverage already sufficient
Target test or file: src/agents/session-file-repair.test.ts
Scenario the test should lock in: session-file repair moves a real matching late tool result next to its assistant tool call, drops duplicate persisted tool results, and drops orphan persisted tool results without synthesizing missing generic results.
Why this is the smallest reliable guardrail: it exercises the actual durable JSONL repair boundary and atomic rewrite/backup behavior without requiring provider scheduling or process-kill timing.
Existing test that already covers this: in-memory replay repair coverage exists in src/agents/session-transcript-repair.test.ts, but durable session-file repair did not have persisted-entry coverage for this corruption shape.

User-visible / Behavior Changes

Sessions with persisted tool-call/tool-result pairing corruption can recover on restart instead of repeatedly failing during later context rebuilds.

Diagram

Before:
session.jsonl:
assistant(toolCall call_1) -> user follow-up -> toolResult(call_1) -> duplicate/orphan result
load/replay later -> invalid durable transcript can fail again

After:
session-file repair:
assistant(toolCall call_1) -> toolResult(call_1) -> user follow-up
duplicate/orphan persisted results removed

Security Impact

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No

Repro + Verification

Environment

OS: macOS
Runtime/container: Node 22 repo test wrapper and JSONL session-file repair
Model/provider: N/A
Integration/channel: N/A
Relevant config: temp session JSONL file

Steps

1. Write a session JSONL file with an assistant toolCall entry.
2. Persist a user follow-up before the matching toolResult.
3. Persist a duplicate matching toolResult and an unrelated orphan toolResult.
4. Run repairSessionFileIfNeeded.
5. Read the repaired session JSONL back from disk.

Expected

The matching toolResult is moved next to the assistant tool call.
Duplicate and orphan persisted toolResult entries are removed.
Missing generic tool results are not synthesized into durable state.
The original session file is backed up.

Actual

Before this patch, session-file repair left the invalid tool-result entries in place.
After this patch, the persisted JSONL entries are repaired deterministically.

Evidence

Attach at least one:

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers

Human Verification

Verified scenarios: moved displaced matching tool result, duplicate persisted tool result dropped, orphan persisted tool result dropped, moved entry metadata preserved, backup file written.
Edge cases checked: malformed-line repair still works, empty assistant error-turn repair still works, blank user repair still works, delivered trailing assistant messages remain untouched, Codex-specific missing-output repair remains intact, in-memory replay repair coverage still passes.
What you did not verify: live provider call after killing an active tool run.

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No

Risks and Mitigations

Risk: durable repair could accidentally rewrite valid transcript shape.
- Mitigation: the repair only touches toolResult entries whose ids match a visible assistant tool call span, duplicate ids, or tool results with no matching assistant tool call. Regression tests assert delivered assistant turns and unrelated non-message entries are preserved.
Risk: durable repair could invent fake generic tool output.
- Mitigation: this pass intentionally does not synthesize missing generic tool results; synthetic missing-output repair remains runtime/provider-specific. The existing Codex session-file aborted repair is left as-is.

Validation

pnpm docs:list
pnpm test src/agents/session-file-repair.test.ts src/agents/session-transcript-repair.test.ts src/agents/pi-embedded-runner.sanitize-session-history.test.ts
pnpm exec oxfmt --check --threads=1 CHANGELOG.md docs/reference/transcript-hygiene.md src/agents/session-file-repair.ts src/agents/session-file-repair.test.ts
git diff --check
pnpm changed:lanes --base upstream/main --json
pnpm check:changed --base upstream/main

Changed files

CHANGELOG.md (modified, +1/-0)
docs/reference/transcript-hygiene.md (modified, +1/-1)
src/agents/session-file-repair.test.ts (modified, +176/-0)
src/agents/session-file-repair.ts (modified, +180/-13)



Design Issue: Orphaned tool calls persist after circuit breaker kill run

Environment

OpenClaw version: 2026.5.7
Runtime: pi
OS: macOS

Problem Description

When Loop Detection's globalCircuitBreakerThreshold triggers a kill run, the current message run is terminated. However, incomplete tool calls (toolCall without toolResult) are left in the persistent message history (lcm.db).

On every subsequent request in the same conversation, transform-messages.js detects these orphaned tool calls and inserts a synthetic error:

content: [{ type: "text", text: "No result provided" }],
isError: true,

This creates a permanent pollution cycle:

1. Circuit breaker kills a run that has emitted toolCall messages
2. The corresponding toolResult messages are never generated
3. The orphaned toolCalls remain in lcm.db
4. transform-messages.js synthesizes "No result provided" on every future turn
5. The model sees this error in its history and may repeat tool calls or report failures
6. User interprets this as "tools are broken for this model"

Evidence from message history (lcm.db)

SELECT message_id, role, seq, substr(content, 1, 100)
FROM messages
WHERE conversation_id = 81
  AND content LIKE '%No result provided%';

Returns dozens of assistant messages across hundreds of turns, all containing the synthetic error.

Even after:

Disabling Loop Detection ("enabled": false)
Restarting gateway
Changing model provider

…the orphaned tool calls and synthetic errors remain in the persistent history.

Root cause analysis

Because the circuit breaker aborts the run without rolling back the message history, transform-messages.js (part of the pi runtime) encounters these unpaired tool calls during message preparation and generates the synthetic error as a defensive measure.

Expected behavior

When a run is killed by circuit breaker (or any hard abort), the message history should be restored to a consistent state — either:

1. Transactionally remove the incomplete assistant messages that contain tool calls without results, OR
2. Mark the orphaned tool calls as cancelled/aborted with a distinct status, rather than the generic and misleading "No result provided", OR
3. Provide an operator command to "sanitize" a conversation's message history by removing or repairing orphaned tool call pairs.
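
As an illustration of the second option, the sketch below appends an explicit "aborted" result for every unanswered tool call instead of leaving the pairing to be patched on each later turn. It is only a sketch under stated assumptions: the `HistoryMessage` shape, the `status` field, and `markOrphansAborted` are hypothetical and do not exist in OpenClaw as far as this issue shows.

```typescript
// Hypothetical shapes; the `status` field is an assumed extension, not an existing schema.
type Part = { type: "text"; text: string } | { type: "toolCall"; id: string; name: string };

type HistoryMessage =
  | { role: "user"; content: Part[] }
  | { role: "assistant"; content: Part[] }
  | { role: "toolResult"; toolCallId: string; status: "ok" | "error" | "aborted"; text: string };

// Return a new history where each orphaned tool call is followed by an "aborted" result.
function markOrphansAborted(messages: HistoryMessage[], reason: string): HistoryMessage[] {
  const answered = new Set<string>();
  for (const m of messages) {
    if (m.role === "toolResult") answered.add(m.toolCallId);
  }

  const out: HistoryMessage[] = [];
  for (const m of messages) {
    out.push(m);
    if (m.role !== "assistant") continue;
    for (const part of m.content) {
      if (part.type !== "toolCall" || answered.has(part.id)) continue;
      // Distinct status so the model and the operator can tell an abort from a tool failure.
      out.push({ role: "toolResult", toolCallId: part.id, status: "aborted", text: `Aborted: ${reason}` });
    }
  }
  return out;
}

const killedRun: HistoryMessage[] = [
  { role: "user", content: [{ type: "text", text: "read the config" }] },
  { role: "assistant", content: [{ type: "toolCall", id: "call_7", name: "read" }] },
  { role: "user", content: [{ type: "text", text: "are you still there?" }] },
];

const sanitized = markOrphansAborted(killedRun, "run killed by circuit breaker");
```

Running the same function over an already sanitized history adds nothing, so an operator "sanitize" command built on it could be applied repeatedly without piling up entries.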

Steps to reproduce

1. Enable Loop Detection with aggressive thresholds (e.g., warningThreshold: 4, criticalThreshold: 5, globalCircuitBreakerThreshold: 6)
2. Use a model prone to tool/thinking loops (e.g., accounts/fireworks/routers/kimi-k2p6-turbo)
3. Trigger the circuit breaker (model repeats tool calls 6+ times)
4. Disable Loop Detection and restart gateway
5. Continue the same conversation and observe that "No result provided" persists across turns
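
To make the thresholds in step 1 concrete, here is a rough sketch of how a repeat counter might map onto them. The threshold names come from this issue. The `LoopDetectionConfig` type, the `classifyRepeatCount` function, and the "reached when the count is greater than or equal to the threshold" rule are assumptions, and OpenClaw's real Loop Detection may count repeats differently.

```typescript
// Threshold names come from the issue; everything else is a hypothetical sketch.
type LoopDetectionConfig = {
  enabled: boolean;
  warningThreshold: number;
  criticalThreshold: number;
  globalCircuitBreakerThreshold: number;
};

type LoopLevel = "ok" | "warning" | "critical" | "circuit-breaker";

// Assumes a threshold is reached when the repeat count is >= its value.
function classifyRepeatCount(repeats: number, cfg: LoopDetectionConfig): LoopLevel {
  if (!cfg.enabled) return "ok";
  if (repeats >= cfg.globalCircuitBreakerThreshold) return "circuit-breaker";
  if (repeats >= cfg.criticalThreshold) return "critical";
  if (repeats >= cfg.warningThreshold) return "warning";
  return "ok";
}

const aggressive: LoopDetectionConfig = {
  enabled: true,
  warningThreshold: 4,
  criticalThreshold: 5,
  globalCircuitBreakerThreshold: 6,
};

// Levels for 3, 4, 5 and 6 repeated tool calls.
const levels = [3, 4, 5, 6].map((n) => classifyRepeatCount(n, aggressive));
```

With thresholds this close together, a model has only two repeats between the first warning and the kill, which is why the reproduction reaches the circuit breaker quickly.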

Additional context

This was initially misdiagnosed as a Fireworks/Kimi model-specific bug because the synthetic error only becomes visible when using models that generate frequent tool calls.
The actual tool execution layer (exec, read, write) works correctly when tested in isolation; the failure is caused by the polluted history feeding back into the model.
External subprocesses (e.g., user-launched gbrain dream via Bun) can compound the issue by creating resource contention that makes new tool executions fail, adding new orphaned tool calls on top of the historical ones.
