openclaw: [Bug] Embedded agent context overflows 200k limit without auto-compaction firing [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#71430 · Fetched 2026-04-26 05:12:46
Comments: 1 · Participants: 2 · Timeline events: 3 · Reactions: 0
Timeline (top): closed ×1, commented ×1, cross-referenced ×1


Summary

A long-running embedded agent (Anthropic Claude Opus 4.7, 200k context window) exceeded its hard context limit (~300k observed total tokens, 150% of cap) without the documented auto-compaction routine triggering. Result: model responses are silently truncated, completion events do not fire, and the downstream channel (Telegram) receives partial or no output.

Environment

  • OpenClaw runtime: 2026.4.20
  • Agent: agent:kusanagi:telegram:direct:<redacted>
  • Model: claude-opus-4-7 (200k context)
  • Session transcript path: ~/.openclaw/agents/kusanagi/sessions/<redacted>.jsonl
  • Observed at: 2026-04-24 21:00 KST

Note to maintainers: This is filed with awareness of #68580 and #67473 (same root cause: threshold-based auto-compaction not firing). We opened a separate issue because our observations go beyond those reports: we went over the cliff past the 200k hard limit (≈300k cumulative) in a multi-agent setup where shared-knowledge files compound the load, and we have specific runtime-telemetry questions for the agent-side workaround. Happy to have this merged or closed as a duplicate if maintainers prefer a single thread.

Repro

  1. Run a long-lived embedded agent across many cross-agent + Captain turns
  2. Watch totalTokens grow past 200k (in our case contextTokens=200000, with totalTokens trending toward 300k+ because of cumulative session history)
  3. No auto-compaction event fires; no warning surfaced to either agent or operator
  4. Subsequent turns produce truncated / empty responses; completion stop reasons appear normal but text is missing or partial
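
To pin down the turn at which step 2 occurs, an operator could scan the session transcript offline. The sketch below is only an illustration: it assumes each JSONL record may carry a numeric top-level `totalTokens` field, which the issue does not confirm. The real transcript schema may name or nest usage data differently, and `first_overflow` is a hypothetical helper, not an OpenClaw API.

```python
import json
from typing import Iterable, Optional


def first_overflow(lines: Iterable[str], context_limit: int = 200_000) -> Optional[int]:
    """Return the 1-based line number of the first record whose
    totalTokens exceeds context_limit, or None if no record does.

    ASSUMPTION: records expose a top-level numeric "totalTokens" key.
    """
    for lineno, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue  # blank line
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip partially written or non-JSON lines
        total = record.get("totalTokens") if isinstance(record, dict) else None
        if isinstance(total, (int, float)) and total > context_limit:
            return lineno
    return None
```

Since a file object iterates over its lines, the function could be called as `first_overflow(open(path, encoding="utf-8"))` on a transcript file.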

Expected

  • Auto-compaction triggers at a documented threshold (e.g., 80% of context window)
  • Compaction event is observable: agent sees a system signal, operator sees a metric/log line
  • After compaction, agent resumes with summarized history + recent verbatim tail
  • If compaction fails, agent is informed (not silent)
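
The expected behaviour can be sketched as follows. This is a hypothetical illustration, not OpenClaw's compaction code: `summarize` stands in for whatever summarizer the runtime uses, and the 0.8 default is the example figure from the list above (160,000 tokens for a 200k window).

```python
from typing import Callable, List


def should_compact(total_tokens: int, context_limit: int, threshold: float = 0.8) -> bool:
    """True once usage reaches the threshold fraction of the context window."""
    return total_tokens >= threshold * context_limit


def compact(
    messages: List[str],
    keep_tail: int,
    summarize: Callable[[List[str]], str],
) -> List[str]:
    """Replace older messages with one summary and keep the newest
    keep_tail messages verbatim.

    ASSUMPTION: summarize is a caller-supplied placeholder.
    """
    if keep_tail < 0:
        raise ValueError("keep_tail must be non-negative")
    if len(messages) <= keep_tail:
        return list(messages)  # nothing old enough to summarize
    split = len(messages) - keep_tail
    head, tail = messages[:split], messages[split:]
    return [summarize(head)] + tail
```

Keeping the tail verbatim preserves the exact wording of recent turns, while the summary bounds how much the older history costs.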

Actual

  • No compaction event observed
  • No warning surfaced
  • Agent continues producing turns that silently truncate or fail to deliver
  • Operator only discovers via downstream symptom (Telegram output missing)

Impact

  • Strategic agents (planning, multi-step) become unreliable past ~150k tokens
  • Loss of agent-side strategic context = lost proposals, lost plans, manual recovery cost
  • Multi-agent setups amplify this: shared knowledge files loaded per session push every agent closer to the cliff

Diagnostics requested

  • What is the documented auto-compaction trigger threshold? Is it per-model or global?
  • Is auto-compaction enabled by default for embedded agents, or opt-in?
  • Is there a way for the agent itself to query current totalTokens / contextLimit ratio at runtime? (We would use this for self-monitoring per NOW-003 Layer 3.)
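
If the runtime did expose these counters, the Layer 3 self-check could be as small as the sketch below. The function names and level labels are hypothetical. How an agent would actually obtain totalTokens and the context limit is exactly the open question above.

```python
def context_usage(total_tokens: int, context_limit: int) -> float:
    """Fraction of the context window in use (can exceed 1.0 on overflow)."""
    if context_limit <= 0:
        raise ValueError("context_limit must be positive")
    return total_tokens / context_limit


def usage_level(ratio: float, warn_at: float = 0.8) -> str:
    """Map a usage ratio to a coarse label an agent could act on.

    ASSUMPTION: the labels and the 0.8 warning point are illustrative.
    """
    if ratio >= 1.0:
        return "overflow"
    if ratio >= warn_at:
        return "warn"
    return "ok"
```

With the figures from this report, `context_usage(300_000, 200_000)` gives 1.5, which `usage_level` labels as "overflow".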

Suggested fix direction

  • Make compaction threshold a configurable per-agent knob (default 80% of context limit)
  • Emit observable compaction events (log + metric)
  • On compaction failure or threshold-without-fire, fail loud (not silent truncation)
  • Expose context-usage telemetry to the agent itself via a system message or tool
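
A minimal sketch of the first three bullets, under stated assumptions: `CompactionPolicy`, `CompactionError`, and `guard_turn` are hypothetical names, `run_compaction` is a caller-supplied stand-in that returns the token count after compaction, and events are emitted through the standard `logging` module only (metric emission is left out).

```python
import logging
from dataclasses import dataclass
from typing import Callable

log = logging.getLogger("compaction_sketch")


class CompactionError(RuntimeError):
    """Raised instead of letting a turn proceed with an overfull context."""


@dataclass(frozen=True)
class CompactionPolicy:
    context_limit: int
    threshold: float = 0.8  # per-agent knob; 0.8 mirrors the suggested default

    @property
    def trigger_tokens(self) -> int:
        return int(self.context_limit * self.threshold)


def guard_turn(
    total_tokens: int,
    policy: CompactionPolicy,
    run_compaction: Callable[[], int],
) -> int:
    """Return the token count to proceed with, compacting if needed."""
    if total_tokens < policy.trigger_tokens:
        return total_tokens
    log.warning("compaction.start total=%d trigger=%d", total_tokens, policy.trigger_tokens)
    try:
        after = run_compaction()
    except Exception as exc:
        log.error("compaction.failed total=%d", total_tokens)
        raise CompactionError("compaction failed") from exc
    if after >= policy.context_limit:
        log.error("compaction.insufficient after=%d limit=%d", after, policy.context_limit)
        raise CompactionError("context still over limit after compaction")
    log.info("compaction.done before=%d after=%d", total_tokens, after)
    return after
```

Raising on both failure paths means the caller has to handle the condition explicitly, which is the opposite of the silent truncation described in this report.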

Cross-references

  • NOW-003 (b) — telegram in-flight loss (related downstream symptom)
  • NOW-003 Layer 3 — agent-side self-monitoring as workaround
  • NOW-003 (h) — MemKraft solution evaluation (potential alternative)

Notes

Reported as part of multi-agent operating model hardening (Captain JS, 4-agent setup). The combination of (this issue) + sendChatAction in-flight loss caused a full-cycle context loss this morning (2026-04-25) that took 4 cross-session round-trips to recover. Will share session jsonl excerpts on request.

Extent analysis

TL;DR

Implement a configurable auto-compaction threshold and emit observable compaction events to prevent silent truncation of agent responses.

Guidance

  • Review the documented auto-compaction trigger threshold and its configuration to understand why compaction did not fire. The 80% figure in the issue is the reporter's example, not a confirmed documented value.
  • Consider implementing a per-agent configurable compaction threshold to allow for more fine-grained control over when compaction occurs.
  • Develop a mechanism for the agent to query its current totalTokens / contextLimit ratio at runtime to enable self-monitoring and proactive compaction.
  • Investigate why the auto-compaction routine failed to trigger and ensure that compaction events are properly logged and surfaced to the operator.

Example

No code snippet is provided due to the lack of specific implementation details in the issue.

Notes

The issue highlights the need for more robust auto-compaction and monitoring mechanisms in the embedded agent. The suggested fix direction provides a good starting point, but further investigation is required to determine the root cause of the failed auto-compaction.

Recommendation

Apply a workaround by implementing a per-agent configurable compaction threshold and exposing context-usage telemetry to the agent itself, as this will allow for more proactive management of the context window and prevent silent truncation of agent responses.
