openclaw: [runtime] Add compaction rate-limit guardrail (minIntervalSeconds + maxPerHour) to prevent compaction storms [1 comment, 2 participants]

openclaw • 2026-05-06 08:39:23

openclaw/openclaw#78367•Fetched 2026-05-07 03:37:45

Author

blazing-mj

Participants

blazing-mj

clawsweeper[bot]

Timeline (top)

commented ×1

A single agent session can enter an unbounded compaction loop, consuming significant token budget and producing 4–7 MB checkpoint files within minutes. The runtime currently has no rate limit on compactions per session.

Root Cause

compactionSafeguardExtension in compaction-successor-transcript-ZByj7D6a.js (around line 3592 in current dist) has no minimum-interval check. If the post-compaction state immediately re-trips the threshold, another compaction fires on the very next turn.

Repro

1. Configure session.dmScope: "main" (the default in some older configs).
2. Have a single high-traffic peer (e.g. a user on a primary channel) generate sustained inbound traffic while the catch-all agent:<id>:main session is at >70% context.
3. Each turn ends with a compaction; each compaction restores enough history that the next turn refills the budget; the cycle repeats every 1–3 minutes.

Observed in production: one session accumulated 89 compactions in ~24 hours (~2.4 MB transcript, ~700 MB of historical checkpoints).

Proposed fix

Two new config keys under agents.defaults.compaction:

{
  "minIntervalSeconds": 60,   // refuse compaction if Δt since previous < this
  "maxPerHour": 12             // refuse if rolling-hour count exceeds this
}
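As a rough illustration of how the two keys might be typed and defaulted, here is a minimal TypeScript sketch. Only the field names and the default values (60 and 12) come from this proposal; `CompactionRateLimitConfig`, `DEFAULT_RATE_LIMIT`, and `resolveRateLimit` are hypothetical names, and rejecting negative or non-finite values is an assumption, not existing openclaw behavior.

```typescript
// Field names and defaults are taken from the proposal; everything else is hypothetical.
interface CompactionRateLimitConfig {
  minIntervalSeconds: number; // minimum gap between two compactions of one session
  maxPerHour: number;         // cap on compactions inside a rolling hour
}

const DEFAULT_RATE_LIMIT: CompactionRateLimitConfig = {
  minIntervalSeconds: 60,
  maxPerHour: 12,
};

// Merge user-supplied overrides onto the defaults and reject unusable values.
function resolveRateLimit(
  raw: Partial<CompactionRateLimitConfig> = {},
): CompactionRateLimitConfig {
  const merged: CompactionRateLimitConfig = { ...DEFAULT_RATE_LIMIT, ...raw };
  if (!Number.isFinite(merged.minIntervalSeconds) || merged.minIntervalSeconds < 0) {
    throw new RangeError("minIntervalSeconds must be a non-negative number");
  }
  if (!Number.isFinite(merged.maxPerHour) || merged.maxPerHour < 0) {
    throw new RangeError("maxPerHour must be a non-negative number");
  }
  return merged;
}

// Overriding one key keeps the default for the other.
const resolved = resolveRateLimit({ maxPerHour: 6 });
```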

When either guard trips:

Return { cancel: true, reason: "compaction throttled" } from the safeguard extension
Emit a warn-level log entry with the session key and the trip reason
Optionally fire a lifecycle event so monitoring can surface it
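The guard behavior listed above could be sketched roughly as follows. This is not the actual `compactionSafeguardExtension` code: `CompactionThrottle`, `GuardResult`, the `trip` field, and the injected `now`/`warn` functions are hypothetical, and the session key is a placeholder. The sketch assumes that only compactions that were allowed to proceed count toward both limits, and that "exceeds" means the attempt is refused once `maxPerHour` compactions already sit inside the rolling hour.

```typescript
interface CompactionRateLimit {
  minIntervalSeconds: number;
  maxPerHour: number;
}

// `cancel` and `reason` follow the proposal; `trip` is a hypothetical extra field.
type GuardResult =
  | { cancel: false }
  | { cancel: true; reason: string; trip: string };

const HOUR_MS = 60 * 60 * 1000;

class CompactionThrottle {
  // Timestamps (ms) of allowed compactions per session key, oldest first.
  private readonly history = new Map<string, number[]>();
  private readonly limits: CompactionRateLimit;
  private readonly now: () => number;
  private readonly warn: (message: string) => void;

  constructor(
    limits: CompactionRateLimit,
    now: () => number = () => Date.now(),
    warn: (message: string) => void = (message) => console.warn(message),
  ) {
    this.limits = limits;
    this.now = now;
    this.warn = warn;
  }

  check(sessionKey: string): GuardResult {
    const t = this.now();
    // Keep only compactions inside the rolling one-hour window.
    const recent = (this.history.get(sessionKey) ?? []).filter((ts) => ts > t - HOUR_MS);
    const last = recent[recent.length - 1];

    let trip: string | undefined;
    if (last !== undefined && t - last < this.limits.minIntervalSeconds * 1000) {
      trip = `minIntervalSeconds=${this.limits.minIntervalSeconds}`;
    } else if (recent.length >= this.limits.maxPerHour) {
      trip = `maxPerHour=${this.limits.maxPerHour}`;
    }

    if (trip !== undefined) {
      this.history.set(sessionKey, recent);
      // Warn-level entry carrying the session key and the trip reason.
      this.warn(`compaction throttled session=${sessionKey} trip=${trip}`);
      return { cancel: true, reason: "compaction throttled", trip };
    }

    recent.push(t); // record the compaction that is allowed to proceed
    this.history.set(sessionKey, recent);
    return { cancel: false };
  }
}

// Scenario with a fake clock: compaction at t=0, attempts at t=10s and t=70s.
let clock = 0;
const warnings: string[] = [];
const throttle = new CompactionThrottle(
  { minIntervalSeconds: 60, maxPerHour: 12 },
  () => clock,
  (message) => { warnings.push(message); },
);
const first = throttle.check("agent:example:main");  // proceeds
clock = 10 * 1000;
const second = throttle.check("agent:example:main"); // cancelled (10s < 60s)
clock = 70 * 1000;
const third = throttle.check("agent:example:main");  // proceeds (70s >= 60s)
```

Injecting the clock and the logger keeps the guard deterministic under test; a real implementation would presumably use the runtime's own logger and lifecycle events instead.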

The default values are conservative: 60s minimum and 12/hour both leave plenty of headroom for legitimate heavy-traffic sessions while turning a runaway loop into a clear, bounded failure mode that surfaces in logs instead of silently burning tokens.

Why "tune `maxHistoryShare` lower" is not enough

We tested aggressive trimming (maxHistoryShare: 0.4, reserveTokens: 24000). This reduces the probability of a re-trip but does not eliminate the loop: a session with a persistently high inbound rate still compacts every few minutes. A first-class rate limit is the only way to bound it.

Effort

~30 lines of source changes plus the two schema entries. Tests: simulate a compaction at t=0, then attempt another at t=10s with minIntervalSeconds=60 → expect cancel; attempt again at t=70s → expect proceed.

Context

Discovered during a P0 incident where Alfred's main session compacted 89× before manual intervention. Full incident notes available on request. Related but distinct from the dmScope: "main" routing magnet (which we resolved with dmScope: "per-channel-peer" in our config); a guardrail at the compaction layer would have caught this even with bad routing config.
