openclaw: Auto-compaction silently no-ops on tool-loop preflight overflow because token count is not propagated

GitHub stats
openclaw/openclaw#68930 · Fetched 2026-04-20 12:04:21
Comments: 0 · Participants: 1 · Timeline: 1 (closed ×1) · Reactions: 0

When the embedded pi runner trips its preflight context-overflow guard during a tool loop, auto-compaction silently no-ops and the session is "restarted" (new sessionId, fragmented context) — even when an LCM-style context engine like lossless-claw is fully able to compact. The root cause is an instrumentation gap: the preflight error string carries no token count, the recovery handler can't recover one, and contextEngine.compact() is then called without currentTokenCount. The compactor evaluates only persisted history, returns "already under target", and the session is fragmented.

Versions

  • openclaw: 2026.4.9 (commit 0512059)
  • lossless-claw: @martian-engineering/[email protected]
  • Node: tested on Linux 6.8.0-107-generic
  • Provider in use at repro: openai-codex/gpt-5.4 (OAuth), runtime contextTokens: 272000, reserveTokensFloor: 50000

Reproduction

  1. Run a session through openai-codex/gpt-5.4 (or any model where the configured contextTokens is well below native contextWindow).
  2. Drive a tool-heavy turn (e.g., a chain of Grep / large Read / MCP search calls) so the transient prompt grows past contextTokens * 0.9 * 4 chars while persisted history is still small.
  3. Observe gateway logs:
[agent] [context-overflow-diag] sessionKey=... provider=openai-codex/gpt-5.4 source=assistantError messages=197 ... compactionAttempts=0 observedTokens=unknown error=Context overflow: estimated context size exceeds safe threshold during tool loop.
[agent] context overflow detected (attempt 1/3); attempting auto-compaction for openai-codex/gpt-5.4
[agent] auto-compaction failed for openai-codex/gpt-5.4: already under target
... Auto-compaction failed (...). Restarting session ... -> <new-sessionId> and retrying.

The observedTokens=unknown field is the smoking gun: the recovery path never learned the size of the live prompt. The session is then restarted, fragmenting context (the user-visible "context limit exceeded" experience).
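For a sense of scale, the trigger point in step 2 can be worked out from the reported configuration. This is a minimal sketch that assumes the guard's character budget is exactly contextTokens * 0.9 * 4, as the reproduction step states. The function and parameter names are hypothetical and do not come from the openclaw source.

```javascript
// Hypothetical helper: reproduces the arithmetic from reproduction step 2.
// Assumption: a 90% safety factor and 4 characters per token, as the issue states.
function preflightCharBudget(contextTokens, safetyPercent = 90, charsPerToken = 4) {
  // Integer arithmetic avoids floating-point drift from multiplying by 0.9.
  const safeTokens = Math.floor((contextTokens * safetyPercent) / 100);
  return safeTokens * charsPerToken;
}

// contextTokens: 272000 is the runtime value listed under Versions.
const budgetChars = preflightCharBudget(272000);
console.log(budgetChars); // 979200 characters, about 244800 estimated tokens
```

With the reported settings, the guard would therefore trip once the transient prompt passes roughly 979,200 characters, regardless of how small the persisted history is.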

Root cause

  1. installToolResultContextGuard (dist/pi-embedded-Vw-lS5ti.js ~line 18577) throws a hardcoded preflight error:

    const PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE = "Context overflow: estimated context size exceeds safe threshold during tool loop.";
    ...
    if (exceedsPreemptiveOverflowThreshold({ messages: contextMessages, maxContextChars }))
      throw new Error(PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE);

    The string contains no number.

  2. The overflow recovery path (dist/pi-embedded-Vw-lS5ti.js ~line 34115) tries to recover the count by parsing the error text:

    const observedOverflowTokens = extractObservedOverflowTokenCount(errorText);
    ...
    ...observedOverflowTokens !== void 0 ? { currentTokenCount: observedOverflowTokens } : {},

    extractObservedOverflowTokenCount (in dist/errors-DVZmaL5J.js) only matches:

    • prompt is too long: NNN tokens > MMM maximum
    • requested NNN tokens
    • resulted in NNN tokens

    None of these match the preflight string, so observedOverflowTokens is undefined and currentTokenCount is omitted.

  3. contextEngine.compact() (lossless-claw engine.ts:3771-3782) without currentTokenCount evaluates against persisted history only:

    const observedTokens = this.normalizeObservedTokenCount(params.currentTokenCount ?? lp.currentTokenCount);
    const decision = observedTokens !== undefined
      ? await this.compaction.evaluate(conversationId, tokenBudget, observedTokens)
      : await this.compaction.evaluate(conversationId, tokenBudget);

    Persisted history is small because the bloat sits in transient tool-loop messages that have not been committed yet. The decision therefore returns "below threshold" or, if compaction is attempted anyway, "already under target" (lines 3831-3834 / 3883).

  4. Openclaw treats the "already under target" outcome as a compaction failure and restarts the session, fragmenting context.
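Steps 3 and 4 can be illustrated with a small stand-in for the compaction decision. This is a sketch under stated assumptions, not lossless-claw's implementation: evaluateCompaction and its fields are hypothetical, and the token numbers are illustrative only.

```javascript
// Hypothetical stand-in for the compaction decision; not lossless-claw's code.
// Assumption: without an observed count, only persisted history is measured.
function evaluateCompaction({ persistedTokens, tokenBudget, observedTokens }) {
  const measured = observedTokens !== undefined
    ? Math.max(persistedTokens, observedTokens)
    : persistedTokens;
  return measured > tokenBudget
    ? { shouldCompact: true, measured }
    : { shouldCompact: false, measured, reason: "already under target" };
}

// Illustrative numbers: small persisted history, large transient tool-loop prompt.
const persistedTokens = 40000;
const tokenBudget = 222000;
const liveTokens = 250000;

const withoutCount = evaluateCompaction({ persistedTokens, tokenBudget });
const withCount = evaluateCompaction({ persistedTokens, tokenBudget, observedTokens: liveTokens });

console.log(withoutCount.shouldCompact, withoutCount.reason); // false already under target
console.log(withCount.shouldCompact); // true
```

The same conversation yields opposite decisions depending only on whether the live count reaches the compactor, which is the gap the issue describes.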

Suggested fix

Either fix is sufficient; applying both is best.

Fix A: embed the estimated token count in the preflight error

Smallest possible change. The recovery extractor's existing requested NNN tokens regex then matches.

// dist/pi-embedded-Vw-lS5ti.js — installToolResultContextGuard
const estimateCache = createMessageCharEstimateCache();
const estimatedChars = estimateContextChars$1(contextMessages, estimateCache);
if (estimatedChars > maxContextChars) {
  const estimatedTokens = Math.ceil(estimatedChars / 4);
  throw new Error(`${PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE} (requested ${estimatedTokens} tokens)`);
}
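The round trip that Fix A relies on can be checked in isolation. The extractor below is a hypothetical re-creation built from the three message shapes listed in the root cause; the real regexes in dist/errors-DVZmaL5J.js are not quoted in the issue and may differ.

```javascript
// Hypothetical re-creation of the extractor, based only on the three message
// shapes the issue lists. The real patterns may be stricter or looser.
const OVERFLOW_TOKEN_PATTERNS = [
  /prompt is too long:\s*(\d+)\s*tokens\s*>\s*\d+\s*maximum/i,
  /requested\s+(\d+)\s+tokens/i,
  /resulted in\s+(\d+)\s+tokens/i,
];

function extractObservedTokens(errorText) {
  for (const pattern of OVERFLOW_TOKEN_PATTERNS) {
    const match = pattern.exec(errorText);
    if (match) return Number(match[1]);
  }
  return undefined; // no pattern matched, so the count stays unknown
}

const PREFLIGHT_MESSAGE =
  "Context overflow: estimated context size exceeds safe threshold during tool loop.";

// Original message: it contains no number, so there is nothing to extract.
console.log(extractObservedTokens(PREFLIGHT_MESSAGE)); // undefined

// Fix A message: the "requested NNN tokens" shape now matches.
const estimatedTokens = Math.ceil(1000000 / 4); // illustrative estimate
const fixedMessage = `${PREFLIGHT_MESSAGE} (requested ${estimatedTokens} tokens)`;
console.log(extractObservedTokens(fixedMessage)); // 250000
```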

Fix B: bypass the error-text round-trip for preflight overflows

When the runner knows the estimated token count (because it just computed it for the threshold check), pass it directly into the recovery call instead of stringifying it into an error and re-parsing. Throw a typed PreemptiveOverflowError carrying { estimatedTokens } and have the recovery handler read the field directly.

This is more robust than relying on regex round-tripping and makes the contract explicit.
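A minimal sketch of what Fix B could look like follows. The issue names PreemptiveOverflowError and its estimatedTokens field; everything else here (resolveCurrentTokenCount, buildCompactParams, the parseFromText fallback) is a hypothetical illustration and not openclaw's recovery code.

```javascript
// Typed error as proposed in Fix B. The constructor shape is an assumption.
class PreemptiveOverflowError extends Error {
  constructor(message, estimatedTokens) {
    super(message);
    this.name = "PreemptiveOverflowError";
    this.estimatedTokens = estimatedTokens;
  }
}

// Hypothetical recovery helper: prefer the typed field, fall back to text parsing.
function resolveCurrentTokenCount(error, parseFromText) {
  if (error instanceof PreemptiveOverflowError && Number.isFinite(error.estimatedTokens)) {
    return error.estimatedTokens;
  }
  return parseFromText(String(error?.message ?? error));
}

// Mirrors the conditional spread quoted in the root cause: omit the key when unknown.
function buildCompactParams(error, parseFromText) {
  const count = resolveCurrentTokenCount(error, parseFromText);
  return { ...(count !== undefined ? { currentTokenCount: count } : {}) };
}

const noTextMatch = () => undefined; // stands in for an extractor that finds nothing
const typedError = new PreemptiveOverflowError(
  "Context overflow: estimated context size exceeds safe threshold during tool loop.",
  250000,
);

console.log(buildCompactParams(typedError, noTextMatch)); // { currentTokenCount: 250000 }
console.log(buildCompactParams(new Error("something else"), noTextMatch)); // {}
```

Keeping the text-parsing fallback means provider errors that already carry a count keep working, while preflight overflows no longer depend on the message wording.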

Workaround

Local patch script (idempotent) is documented at ~/.openclaw/local-patches/apply-preflight-overflow-patch.sh in the reporter's environment — it implements Fix A directly on the bundled JS. After applying and restarting the gateway, the recovery handler receives the token count, lossless-claw evaluates against the live size, compacts the leaf chunks, and the session no longer fragments.
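The reporter's script is not included in the issue, so its contents are unknown. As a rough illustration of what "idempotent" means for such a patch, the sketch below rewrites the throw site in a string of bundle source and does nothing on a second run. applyPreflightPatch is hypothetical; the helper names inside the replacement text are the ones quoted in the Fix A snippet.

```javascript
// Hypothetical sketch of an idempotent patch; not the reporter's actual script.
const ORIGINAL_THROW = "throw new Error(PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE);";

// Replacement text for the bundle. It is a plain string here, so the ${...}
// parts are only evaluated later, inside the patched bundle.
const PATCHED_THROW =
  "throw new Error(`${PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE} (requested " +
  "${Math.ceil(estimateContextChars$1(contextMessages, createMessageCharEstimateCache()) / 4)}" +
  " tokens)`);";

function applyPreflightPatch(bundleSource) {
  // Idempotence check: a second run finds the patched text and does nothing.
  if (bundleSource.includes(PATCHED_THROW)) {
    return { source: bundleSource, changed: false };
  }
  if (!bundleSource.includes(ORIGINAL_THROW)) {
    throw new Error("Preflight throw site not found; the bundle may have changed.");
  }
  // split/join substitutes literally. String.replace would give "$" sequences
  // in the replacement text a special meaning, which this avoids.
  return { source: bundleSource.split(ORIGINAL_THROW).join(PATCHED_THROW), changed: true };
}

const sampleSource = `if (tooBig) ${ORIGINAL_THROW}`;
const firstRun = applyPreflightPatch(sampleSource);
const secondRun = applyPreflightPatch(firstRun.source);
console.log(firstRun.changed, secondRun.changed); // true false
```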

Extent analysis

TL;DR

The most likely fix is to embed the estimated token count in the preflight error or bypass the error-text round-trip for preflight overflows, allowing the recovery handler to receive the token count and compact the context correctly.

Guidance

  • To fix the issue, consider implementing one of the suggested fixes: embed the estimated token count in the preflight error or bypass the error-text round-trip for preflight overflows.
  • Verify the fix in the gateway logs: the context-overflow-diag line should no longer report observedTokens=unknown, and auto-compaction should no longer fail with "already under target".
  • If implementing Fix A, update the installToolResultContextGuard function to include the estimated token count in the error message.
  • If implementing Fix B, create a typed PreemptiveOverflowError carrying the estimated token count and update the recovery handler to read the field directly.
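The log check in the guidance above can be scripted. This sketch assumes the key=value layout of the context-overflow-diag line quoted in the reproduction; readObservedTokens is a hypothetical helper, and the exact format of the field once a count is available is an assumption.

```javascript
// Hypothetical log check, assuming the key=value layout shown in the issue's logs.
function readObservedTokens(logLine) {
  const match = /\bobservedTokens=(\S+)/.exec(logLine);
  if (!match) return { present: false };
  const value = match[1];
  // Assumption: a fixed build reports a plain integer in this field.
  return /^\d+$/.test(value)
    ? { present: true, known: true, tokens: Number(value) }
    : { present: true, known: false };
}

// Diagnostic line copied from the reproduction logs (elisions as in the issue).
const diagLine =
  "[agent] [context-overflow-diag] sessionKey=... provider=openai-codex/gpt-5.4 " +
  "source=assistantError messages=197 ... compactionAttempts=0 observedTokens=unknown " +
  "error=Context overflow: estimated context size exceeds safe threshold during tool loop.";

console.log(readObservedTokens(diagLine)); // { present: true, known: false }
```

A result with known: false on a preflight overflow indicates the count is still not reaching the recovery handler.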

Example

// Fix A: embed the estimated token count in the preflight error
const estimateCache = createMessageCharEstimateCache();
const estimatedChars = estimateContextChars$1(contextMessages, estimateCache);
if (estimatedChars > maxContextChars) {
  const estimatedTokens = Math.ceil(estimatedChars / 4);
  throw new Error(`${PREEMPTIVE_CONTEXT_OVERFLOW_MESSAGE} (requested ${estimatedTokens} tokens)`);
}

Notes

Both fixes rely on the runner's estimate, which in the Fix A snippet is a characters-divided-by-4 heuristic rather than a tokenizer count. If the estimate is far off, compaction may not behave as expected. The workaround script lives in the reporter's environment and may not apply elsewhere.

Recommendation

Apply Fix A first: it is the smallest change, and it lets the recovery extractor's existing "requested NNN tokens" regex pick up the estimated count. Fix B remains the more robust follow-up, since it removes the dependency on error-text parsing.
