openclaw - 💡(How to fix) Fix Compaction: `after_compaction` not emitted when `result.compacted:false`; validation: single-quote delimiter trips tool-caller retries


OpenClaw issue — two bugs that conspire to break LLM tool callers under context pressure

Version observed: gateway 2026.5.7 (production at observation time, upgraded from 5.4 on 2026-05-08), provider gemma-pc → llama-server, model qwen/qwen3.5-9b, contextWindow 65695.

Status as of 2026-05-14: both bugs verified still present in 2026.5.14 HEAD. Line references in this issue have been updated to the 5.14 source tree.

Relation to existing issues (Bug 1 only):

  • #78300 (closed, fixed in 2026.5.5) — addressed the symptom "empty fallback summaries loop until 60s timeout". Bug 1 here is a different symptom (observers hang waiting for after_compaction) of overlapping but distinct root cause (event emission gated on compacted:true).
  • #78478 (closed, "treat toolResult entries as valid cut points") — proposes widening cut-point types so safeguard can compact tool-heavy sessions. Complementary: even when cut-point detection widens, the case "policy legitimately decides nothing to compact" (e.g. second consecutive compaction request, already-collapsed history) still returns compacted:false, and observers still hang.

In short: prior fixes reduced how often compacted:false is returned. This issue is about what happens when it is correctly returned — observers should still be notified.

Compaction config: agents.defaults.compaction.mode: "default" (reserveTokensFloor: 30000, maxActiveTranscriptBytes: 200000).

This document tracks two separate but related bugs reproduced live on 2026-05-09. Both surface as user-visible HTTP 500s on a voice channel (POST /api/voice/turn) and are amplified when the context engine decides to compact mid-turn.


Bug 1 — compaction-hook returns "synthetic safeguard" N→N, then internal recovery is too slow for the client

Symptom

A voice turn that triggers a tool with a large result (homeassistant__GetLiveContext, web_search returning grounded summary, etc.) overflows the model context window. The bundled compaction hook then logs a no-op:

[agent/embedded] embedded run agent end ... isError=true
  error=Context overflow: prompt too large for the model.
  rawError=400 request (66336 tokens) exceeds the available context size (65792 tokens)
[plugins] [compaction-hook] before messages=120 turn=99c3e686
[plugins] [compaction-hook] after (synthetic safeguard) 120->120
[plugins] voice turn failed: Context overflow: prompt too large for the model.

After the 500 reaches the client, the gateway internally retries 3× with the same prompt, then falls through to tool-result-truncation:

attempt 1/3 → 65959 tokens (overflow)
attempt 2/3 → 65964 tokens (overflow)
attempt 3/3 → 66000 tokens (overflow)
[context-overflow-recovery] Attempting tool result truncation for gemma-pc/qwen/qwen3.5-9b
[tool-result-truncation] Truncated 23 tool result(s) in session (maxChars=3500)
[context-overflow-recovery] Truncated 23 tool result(s); retrying prompt → success

Recovery via tool-result-truncation works correctly, but it only triggers as a last resort after 3 failed attempts (~2.5 minutes wall-clock from the initial overflow). By then the client has long since timed out.
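The last-resort step in the log above can be pictured roughly as follows. This is a minimal sketch, not OpenClaw's implementation: the `ToolResult` shape and the `truncateToolResults` name are assumptions made for illustration, and only the `maxChars=3500` value comes from the log.

```typescript
// Hypothetical shapes for illustration; not OpenClaw's real types.
interface ToolResult {
  toolName: string;
  content: string;
}

interface TruncationOutcome {
  results: ToolResult[];
  truncatedCount: number;
}

// Cap every tool result at maxChars and report how many were shortened.
function truncateToolResults(results: ToolResult[], maxChars: number): TruncationOutcome {
  let truncatedCount = 0;
  const capped = results.map((r) => {
    if (r.content.length <= maxChars) return r;
    truncatedCount += 1;
    return { ...r, content: r.content.slice(0, maxChars) };
  });
  return { results: capped, truncatedCount };
}

const outcome = truncateToolResults(
  [
    { toolName: "homeassistant__GetLiveContext", content: "x".repeat(10000) },
    { toolName: "web_search", content: "short result" },
  ],
  3500, // maxChars value seen in the log above
);
console.log(`Truncated ${outcome.truncatedCount} tool result(s)`);
```

A step like this is cheap compared with three identical retries, which is why its position at the end of the recovery sequence matters for latency-sensitive clients.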

A second, distinct manifestation of the same root cause (observed 2026-05-09 00:49–00:50)

When the session has been freshly compacted (down to messages=1), a subsequent heavy turn can trigger reactive compaction even though the message count is low. The first attempt returns compacted: false because the policy decides there are "no real conversation messages to summarize" yet — but the plugin voice hook still gets stuck waiting for after_compaction:

00:49:40  [compaction-hook] before messages=7
00:49:43  [compaction-hook] after (synthetic safeguard) 7->7   ← 3s timer, plugin gives up
00:50:02  [compaction-hook] after 7->1                          ← real compaction completes 19s later
00:50:04  [ws] sessions.compact 49746ms
00:50:16  [agent/embedded] failover decision: stage=assistant decision=surface_error
          reason=timeout from=gemma-pc/qwen/qwen3.5-9b
          ← model timed out during the 49s compaction window → HTTP 500

So the same bug surfaces as either:

  • (a) inner attempt loop never makes progress, late tool-result-truncation saves the session but client has 500'd, or
  • (b) compaction succeeds on a delayed retry, but the model run that triggered it hits its own requestTimeoutMs waiting for the lane to clear, and the gateway emits surface_error: timeout → 500.

Hypothesis & exact source

In src/agents/pi-embedded-runner/compact.queued.ts:244-265 (verified against 2026.5.14 HEAD), both runPostCompactionSideEffects() and the hookRunner.runAfterCompaction invocation are gated on result.ok && result.compacted. The legitimate "no real conversation messages" return path in src/agents/pi-embedded-runner/compact.ts produces {ok:true, compacted:false} and silently bypasses observers:

// compact.queued.ts:244-265 (current 5.14)
if (engineOwnsCompaction && result.ok && result.compacted) {
  await runPostCompactionSideEffects({ ... });
}
if (
  result.ok &&
  result.compacted &&                                    // ← Bug 1: gate
  hookRunner?.hasHooks?.("after_compaction") &&
  hookRunner.runAfterCompaction
) {
  await hookRunner.runAfterCompaction({ ... });
}
// else: nothing — observers waiting for after_compaction time out

Note: #78300 (2026.5.5) widened the anchor detection used by containsRealConversationMessages so that custom-message/bash/branch-summary entries are no longer treated as empty. This reduces but does not eliminate the compacted:false return path — the policy can still legitimately decide there's nothing to compact (e.g. on the second consecutive compaction request, when the previous one already collapsed history), and observers still hang on that path.

The plugin voice compaction-hook.ts waits 3s for after_compaction (after a before_compaction), then synthesizes a fake N→N event so the client doesn't deadlock — but that synthetic event is misleading: it tells downstream consumers "compaction ran and did nothing" when reality is "compaction hasn't run yet, wait."
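The wait-then-synthesize behavior described above could at least label its placeholder explicitly, so downstream consumers can tell "no event arrived in time" from "compaction ran and did nothing". The sketch below is an assumption about how such a waiter might look; the `AfterCompaction` shape, the `subscribe` callback and the `source` field are hypothetical and not taken from compaction-hook.ts.

```typescript
// Hypothetical event shape; field names are assumptions for illustration.
type AfterCompaction =
  | { source: "engine"; before: number; after: number }
  | { source: "synthetic_timeout"; before: number; after: number };

// Resolve with the engine's event if it arrives in time, otherwise with a
// clearly labeled placeholder instead of an indistinguishable N->N event.
function waitForAfterCompaction(
  subscribe: (listener: (after: number) => void) => () => void,
  before: number,
  timeoutMs: number,
): Promise<AfterCompaction> {
  return new Promise((resolve) => {
    let unsubscribe: () => void = () => {};
    let settled = false;
    const finish = (event: AfterCompaction) => {
      if (settled) return;
      settled = true;
      clearTimeout(timer);
      unsubscribe();
      resolve(event);
    };
    const timer = setTimeout(
      () => finish({ source: "synthetic_timeout", before, after: before }),
      timeoutMs,
    );
    unsubscribe = subscribe((after) => finish({ source: "engine", before, after }));
    // If the listener fired synchronously, release the subscription now.
    if (settled) unsubscribe();
  });
}

// A subscribe function that never fires, mimicking the compacted:false path.
const neverFires = (_listener: (after: number) => void) => () => {};
waitForAfterCompaction(neverFires, 7, 10).then((event) => {
  console.log(event.source, `${event.before}->${event.after}`);
});
```

This only makes the consumer-side workaround more honest. It does not remove the need for the engine to emit the event itself.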

Proposed fix

Always emit after_compaction from compact.queued.ts:248-265 even when result.compacted === false, with a payload that distinguishes:

api.emit("after_compaction", {
  messageCount: <unchanged>,
  compactedCount: 0,
  reason: "no_real_conversation_messages" | "policy_skipped",
  tokenCount: ...
});

Existing test coverage in src/agents/pi-embedded-runner/compact.hooks.test.ts (fixtures around containsRealConversationMessages → false) should make this a small change.
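To make the proposed payload concrete, here is a minimal sketch of building an after_compaction payload for every successful result rather than only for compacted ones. Only `ok` and `compacted` come from the source quoted above. The other field names, the meaning of `compactedCount` (messages removed) and the `toAfterCompactionPayload` helper are assumptions, and `tokenCount` is left out because its source is not shown in this issue.

```typescript
// Hypothetical types; only `ok` and `compacted` appear in the issue text.
type SkipReason = "no_real_conversation_messages" | "policy_skipped";

interface CompactionResult {
  ok: boolean;
  compacted: boolean;
  messageCountBefore: number;
  messageCountAfter: number;
  skipReason?: SkipReason;
}

interface AfterCompactionPayload {
  messageCount: number;
  compactedCount: number;
  reason?: SkipReason;
}

// Build a payload for every successful result. Failures return undefined
// and are assumed to be reported through a separate path.
function toAfterCompactionPayload(result: CompactionResult): AfterCompactionPayload | undefined {
  if (!result.ok) return undefined;
  if (result.compacted) {
    return {
      messageCount: result.messageCountAfter,
      compactedCount: result.messageCountBefore - result.messageCountAfter,
    };
  }
  return {
    messageCount: result.messageCountBefore, // unchanged: nothing was compacted
    compactedCount: 0,
    reason: result.skipReason ?? "policy_skipped",
  };
}

const skipped = toAfterCompactionPayload({
  ok: true,
  compacted: false,
  messageCountBefore: 7,
  messageCountAfter: 7,
  skipReason: "no_real_conversation_messages",
});
console.log(skipped);
```

With a payload like this, an observer can treat `compactedCount: 0` plus a `reason` as a definitive answer and stop waiting, instead of falling back to a timer.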

Alternative or complementary: expose a tunable tools.compaction.recoveryStrategy: "compaction-first" | "truncation-first" that lets agents bias toward the path that actually reduces tokens for their workload.


Bug 2 — Validation error format uses single-quote delimiters, which LLM tool callers misparse and copy back as the property name

Symptom

Reproduced live 2026-05-09 01:04–01:05. Qwen 3.5 9B (a tool-call-trained model) on the voice agent enters a 17-attempt retry loop on cron.add because each retry produces a malformed property name:

[ws] ⇄ res ✗ cron.add 1ms errorCode=INVALID_REQUEST
  errorMessage=invalid cron.add params: at root: unexpected property 'sessionTarget':'

[tools] cron failed: invalid cron.add params: at root: unexpected property 'sessionTarget':'
  raw_params={"action":"add","job":{
    "name":"Promemoria 1 (2 minuti)",
    "schedule":{"kind":"at","at":"2026-05-09T01:05:00+02:00"},
    "payload":{"kind":"agentTurn","message":"Promemoria 1: sono passati due minuti."},
    "sessionTarget':":"isolated"   ← here
  }}

The literal key emitted by the model is sessionTarget': (the real field name followed by a stray ' and :). The model is parsing back the validator's own error message: it sees unexpected property 'sessionTarget':' in the previous tool result, interprets the single-quote delimiters as part of the property name, and produces a "corrected" payload that is still wrong. Retries do not converge: loopDetection genericRepeat issues a warning at count=10, but the model keeps retrying identical malformed payloads until criticalThreshold (or until aborted manually via chat.abort).

This is exactly the pattern documented in your own src/agents/loop-detection/* warnings: "WARNING: You have called cron 10 times with identical arguments. If this is not making progress, stop retrying and report the task as failed." The model can't escape because the malformed key is in its prompt now (in the prior tool result), so it keeps copying it forward.
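For readers unfamiliar with this kind of guard, a generic repeat detector can be sketched as below. This is not the code in src/agents/loop-detection; the `RepeatDetector` class and its thresholds are hypothetical, with 10 taken from the warning text and 20 chosen arbitrarily as a stand-in for criticalThreshold.

```typescript
type RepeatVerdict = "ok" | "warn" | "critical";

// Hypothetical detector; thresholds are parameters, not OpenClaw defaults.
class RepeatDetector {
  private readonly counts = new Map<string, number>();
  private readonly warnAt: number;
  private readonly criticalAt: number;

  constructor(warnAt: number, criticalAt: number) {
    this.warnAt = warnAt;
    this.criticalAt = criticalAt;
  }

  // Count calls keyed by tool name plus serialized arguments.
  // Note: JSON.stringify is sensitive to key order, which is fine for
  // byte-identical retries like the ones in this issue.
  record(tool: string, args: unknown): RepeatVerdict {
    const key = `${tool}:${JSON.stringify(args)}`;
    const count = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, count);
    if (count >= this.criticalAt) return "critical";
    if (count >= this.warnAt) return "warn";
    return "ok";
  }
}

const detector = new RepeatDetector(10, 20);
const badArgs = { action: "add", job: { "sessionTarget':": "isolated" } };
let verdict: RepeatVerdict = "ok";
for (let i = 0; i < 10; i++) {
  verdict = detector.record("cron", badArgs);
}
console.log(verdict); // "warn" on the tenth identical call
```

A warning of this kind adds text to the prompt but does not remove the malformed key from the earlier tool result, which is consistent with the model continuing to copy it.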

Exact source

src/gateway/protocol/index.ts:764 (verified against 2026.5.14 HEAD; the line was at 717 in 5.7):

parts.push(`${where}: unexpected property '${additionalProperty}'`);
//                                          ↑                    ↑
//                                          single quotes hardcoded

Single quotes are not valid string delimiters in JSON. LLMs trained to emit JSON tool calls treat single-quote-bracketed tokens inconsistently — some include the trailing ' in the property name when copying, especially under context pressure or when the surrounding token stream contains other escape characters.
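The ambiguity is easy to reproduce outside the gateway. In the sketch below, `formatSingleQuoted` mirrors the template quoted above (the function name and sample values are ours), and the two regular expressions stand in for two plausible ways a reader, human or model, might locate the end of the property name.

```typescript
// Mirrors the template quoted above; `where` and the name are sample values.
function formatSingleQuoted(where: string, name: string): string {
  return `${where}: unexpected property '${name}'`;
}

// The malformed key from the log, echoed back by the validator.
const message = formatSingleQuoted("at root", "sessionTarget':");
console.log(message); // at root: unexpected property 'sessionTarget':'

// Two readings of the same message:
const shortest = message.match(/'(.*?)'/)?.[1]; // stop at the first closing quote
const longest = message.match(/'(.*)'/)?.[1]; // run to the last quote on the line

console.log(shortest); // sessionTarget
console.log(longest); // sessionTarget':
```

Because the message contains three single quotes, both readings are defensible, and nothing in the format says which one the validator meant.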

Proposed fix

Use double quotes (matching JSON spec) or backticks (visually distinct from any string syntax the model might emit):

// Option A — double quotes (matches JSON):
parts.push(`${where}: unexpected property "${additionalProperty}"`);

// Option B — backticks (no ambiguity with any JSON syntax):
parts.push(`${where}: unexpected property \`${additionalProperty}\``);

This is a one-line change with no test impact other than updating the literal in any test fixture that asserts on the error string verbatim.
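A possible refinement of Option A, offered as a suggestion and not as something the maintainers have endorsed, is to let `JSON.stringify` do the quoting. For ordinary names the output is identical to Option A, and names that themselves contain quotes are escaped so the message can be parsed back exactly. Both helper names below are hypothetical.

```typescript
// Hypothetical helper: JSON.stringify quotes and escapes the name.
function formatUnexpectedProperty(where: string, name: string): string {
  return `${where}: unexpected property ${JSON.stringify(name)}`;
}

// Recover the exact name by parsing the JSON string literal at the end.
function parseUnexpectedProperty(message: string): string | undefined {
  const marker = "unexpected property ";
  const start = message.indexOf(marker);
  if (start === -1) return undefined;
  try {
    const parsed: unknown = JSON.parse(message.slice(start + marker.length));
    return typeof parsed === "string" ? parsed : undefined;
  } catch {
    return undefined;
  }
}

const plain = formatUnexpectedProperty("at root", "sessionTarget");
console.log(plain); // at root: unexpected property "sessionTarget"

const awkward = formatUnexpectedProperty("at root", "sessionTarget':");
console.log(parseUnexpectedProperty(awkward)); // sessionTarget':
```

Test fixtures that assert on the error string verbatim would need the same literal update as with Option A.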

Why this matters more than it looks

OpenClaw is positioning itself as the agent runtime for tool-using LLMs. Validator error messages are part of the tool-calling protocol — they get fed back to the model as tool result content, and the model is expected to self-correct from them. If the format is ambiguous to LLM parsers, you create unrecoverable retry loops on otherwise-fixable schema errors. This is especially bad for smaller open-weight models (Qwen 3.5 9B, Llama 3.1 8B, etc.) where tool call JSON discipline is already brittle.


Repro shape (both bugs together)

  1. OpenClaw 2026.5.7 with compaction.mode: "default".
  2. Voice agent on a tool-call-capable open-weight model in the 7B–14B class (we use qwen/qwen3.5-9b via llama-server). HA tool registered (homeassistant__GetLiveContext returning ~50 entities).
  3. Build a session up to ~60k tokens of context with a mix of HA tool results and conversation.
  4. Issue a turn that triggers a tool call with large output, or asks the model to schedule a cron job that includes the optional sessionTarget field.
  5. For Bug 1: observe [compaction-hook] after (synthetic safeguard) N→N followed either by late tool-result-truncation recovery or by surface_error: timeout from the model run.
  6. For Bug 2: observe cron failed: invalid ... unexpected property 'sessionTarget':' and the model retrying with a malformed key copied from the error message itself.

Workarounds we adopted (consumer side)

  • Voice client: chat.abort exposed on /api/voice/abort so the user can recover without a full session reset.
  • Voice client: sessions.compact exposed on /api/voice/compact for proactive shrink (currently disabled: the reactive engine in 5.7 handles it well enough on its own, and a proactive compact during an active run races with that run on the session lane).
  • agents.defaults.models["gemma-pc/qwen/qwen3.5-9b"].providerConfig.timeoutSeconds: 120 to absorb the recovery window in Bug 1 case (b). Workaround, not a fix — the model still loses the turn.
  • For Bug 2, no workaround on the consumer side without invasive hooks that rewrite tool result content before it reaches the model. We'd rather fix the format upstream.
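For completeness, the kind of consumer-side rewrite mentioned in the last bullet might look like the sketch below. We did not deploy this. The `requoteValidationError` name is hypothetical, the pattern targets only the message shape seen in the logs above, and how such a function would be wired into the tool-result path is left open.

```typescript
// Hypothetical rewrite step applied to tool result text before the model sees it.
// The greedy match runs to the last single quote on the line, so it assumes the
// quoted name is the final thing on that line, as in the logged messages.
function requoteValidationError(toolResultText: string): string {
  return toolResultText.replace(
    /unexpected property '(.*)'/g,
    (_match, name: string) => `unexpected property ${JSON.stringify(name)}`,
  );
}

const logged =
  "[tools] cron failed: invalid cron.add params: at root: unexpected property 'sessionTarget':'";
console.log(requoteValidationError(logged));
```

A rewrite like this is brittle by design: it depends on the exact wording of the upstream message, which is one reason an upstream format change is preferable.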

Ask

Two small fixes that together make compaction-under-pressure recoverable for LLM tool callers:

  1. compact.queued.ts:244-265: always emit after_compaction, with compactedCount: 0 and a reason discriminator when the compaction policy deliberately skipped. This lets observers (plugins, side-channel relays, UI bubbles) react correctly instead of timing out.
  2. protocol/index.ts:764 (717 in 5.7): switch the unexpected property '${name}' template to double-quote or backtick delimiters so LLM tool callers don't copy the delimiter into the property name on retry.

Either fix in isolation would help — but together they remove the most common path to an unrecoverable 500 on a voice/agent channel.


Repo context (for OpenClaw maintainers)

  • Track: voice frontend on top of OpenClaw, Qwen 3.5 9B GGUF on llama-server, custom plugin channel voice with agent voice on a gemma-pc provider.
  • Local OpenClaw source verified against: 2026.5.14 HEAD as of 2026-05-14.
  • Full gateway log trace (~5 minutes) covering both bugs available on request.
