openclaw - ✅ (Solved) fix(compaction): retry loop can burn tokens when summarizer model is unavailable [2 pull requests, 1 participant]

GitHub stats
openclaw/openclaw#58838 (fetched 2026-04-08 02:32:02)
Comments: 0
Participants: 1
Timeline: 2
Reactions: 0
Timeline (top): cross-referenced ×2

Fix Action

Fixed

PR fix notes

PR #58846: fix(compaction): add circuit breaker for retry loops

Description (problem / solution / changelog)

Fixes #58838

Summary

When compaction repeatedly fails (model errors, timeouts, malformed output), retryAsync burns tokens without progress. Add CompactionCircuitBreaker that stops attempts after N consecutive failures and resumes after a cooldown period.

  • CompactionCircuitBreaker class with closed/open/half-open states
  • Default: opens after 3 consecutive failures, resets after 60s cooldown
  • Wire into summarizeChunks() as optional parameter (backward compatible)
  • When circuit is open, falls back to previousSummary instead of retrying
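
The behavior listed above can be sketched as a small class. This is a minimal sketch, not the code from the PR: the method names (canAttempt, recordSuccess, recordFailure), the options shape, and the injectable now clock are assumptions made for illustration. Only the three states, the defaults (3 failures, 60s cooldown), and reset() come from the PR description.

```typescript
// Hypothetical sketch; the real class in compaction-circuit-breaker.ts may differ.
type BreakerState = "closed" | "open" | "half-open";

interface BreakerOptions {
  maxFailures?: number; // consecutive failures before opening (PR default: 3)
  cooldownMs?: number; // wait before a probe is allowed (PR default: 60s)
  now?: () => number; // injectable clock; an assumption, added for testability
}

class CompactionCircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(opts: BreakerOptions = {}) {
    this.maxFailures = opts.maxFailures ?? 3;
    this.cooldownMs = opts.cooldownMs ?? 60_000;
    this.now = opts.now ?? (() => Date.now());
  }

  // The state is derived from the time at which the circuit opened.
  get state(): BreakerState {
    if (this.openedAt === null) return "closed";
    return this.now() - this.openedAt >= this.cooldownMs ? "half-open" : "open";
  }

  // Attempts are allowed while closed, or as a probe while half-open.
  canAttempt(): boolean {
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    const wasHalfOpen = this.state === "half-open";
    this.failures += 1;
    // A failed probe reopens immediately; otherwise open at the threshold.
    if (wasHalfOpen || this.failures >= this.maxFailures) {
      this.openedAt = this.now();
    }
  }

  reset(): void {
    this.failures = 0;
    this.openedAt = null;
  }
}
```

A caller would check canAttempt() before summarizing a chunk, return previousSummary when it is false, and report the outcome through recordSuccess() or recordFailure().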

Test plan

  • State transitions: closed → open (after N failures)
  • State transitions: open → half-open (after cooldown)
  • State transitions: half-open → closed (on success) / open (on failure)
  • Counter reset on mid-sequence success
  • Default config values (3 failures, 60s cooldown)
  • reset() clears all state
  • Fake timers for deterministic time-based tests
  • Existing compaction tests pass (10 tests)

Changed files

  • src/agents/compaction-circuit-breaker.test.ts (added, +111/-0)
  • src/agents/compaction-circuit-breaker.ts (added, +99/-0)
  • src/agents/compaction.ts (modified, +37/-22)

PR #58857: fix(compaction): add circuit breaker for retry loops

Description (problem / solution / changelog)

Fixes #58838

Summary

When compaction repeatedly fails (model errors, timeouts, malformed output), retryAsync burns tokens without progress. Add CompactionCircuitBreaker that stops attempts after N consecutive failures and resumes after a cooldown period.

  • CompactionCircuitBreaker class with closed/open/half-open states
  • Default: opens after 3 consecutive failures, resets after 60s cooldown
  • Wire into summarizeChunks() as optional parameter (backward compatible)
  • When circuit is open, falls back to previousSummary instead of retrying

Test plan

  • State transitions: closed → open → half-open → closed/open
  • Counter reset on mid-sequence success
  • Default config values (3 failures, 60s cooldown)
  • reset() clears all state
  • Fake timers for deterministic time-based tests
  • Existing compaction tests pass (10 tests)
  • tsgo --noEmit clean, oxlint clean, oxfmt --check clean

Changed files

  • src/agents/compaction-circuit-breaker.test.ts (added, +111/-0)
  • src/agents/compaction-circuit-breaker.ts (added, +99/-0)
  • src/agents/compaction.ts (modified, +43/-22)

Bug Description

summarizeChunks() in src/agents/compaction.ts uses retryAsync() with 3 attempts per chunk to handle transient errors during compaction. However, when the summarizer model is persistently unavailable (overloaded, rate-limited, or misconfigured), the retry loop burns tokens across multiple chunks without making progress.

In a multi-chunk compaction scenario (common with long channel conversations), this means:

  1. Chunk 1: 3 retries → fail
  2. Chunk 2: 3 retries → fail
  3. Chunk N: 3 retries → fail

Each retry sends the chunk to the model, consuming input tokens even though the response fails. With large chunks (~4K tokens each), a 5-chunk compaction can waste 60K+ tokens before finally throwing.
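
The 60K figure follows from simple multiplication, which the sketch below reproduces. The function name and the optional cap parameter are hypothetical. The two capped results depend on how the breaker counts failures, which the issue does not specify, so they are labeled as assumptions.

```typescript
// Estimate input tokens spent on failed compaction attempts.
// maxFailedAttempts is a hypothetical cap modelling a circuit breaker.
function wastedInputTokens(
  chunks: number,
  attemptsPerChunk: number,
  tokensPerChunk: number,
  maxFailedAttempts: number = Infinity,
): number {
  const attempts = Math.min(chunks * attemptsPerChunk, maxFailedAttempts);
  return attempts * tokensPerChunk;
}

// Figures from the issue: 5 chunks, 3 attempts each, ~4K tokens per chunk.
const withoutBreaker = wastedInputTokens(5, 3, 4_000); // 60000

// Assumption: every failed attempt counts, so the breaker opens after 3 attempts.
const cappedPerAttempt = wastedInputTokens(5, 3, 4_000, 3); // 12000

// Assumption: only a chunk that exhausts its retries counts, so 3 chunks x 3 attempts.
const cappedPerChunk = wastedInputTokens(5, 3, 4_000, 9); // 36000
```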

Expected Behavior

After N consecutive compaction failures, the system should stop attempting compaction and fall back to the previous summary (or a placeholder). This prevents cascading token waste when the summarizer is down.

A circuit breaker pattern would be appropriate:

  • Closed (normal): compaction attempts proceed
  • Open (after N failures): compaction is skipped, fallback used
  • Half-open (after cooldown): one test attempt allowed
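
The three states above can be read as a small transition table. The sketch below is one way to write it as a pure function. The event names are hypothetical labels chosen for illustration and do not appear in the issue.

```typescript
type BreakerState = "closed" | "open" | "half-open";

// Hypothetical event names:
// "failure" is a failure below the threshold, "nth-failure" is the Nth consecutive one.
type BreakerEvent = "success" | "failure" | "nth-failure" | "cooldown-elapsed";

function nextState(state: BreakerState, event: BreakerEvent): BreakerState {
  switch (state) {
    case "closed":
      // Only the Nth consecutive failure opens the circuit.
      return event === "nth-failure" ? "open" : "closed";
    case "open":
      // Nothing is attempted while open; only the cooldown moves it on.
      return event === "cooldown-elapsed" ? "half-open" : "open";
    case "half-open":
      // The single test attempt decides: success closes, any failure reopens.
      if (event === "success") return "closed";
      if (event === "failure" || event === "nth-failure") return "open";
      return "half-open";
  }
}

// Compaction is skipped (and the fallback used) only while the circuit is open.
function shouldAttempt(state: BreakerState): boolean {
  return state !== "open";
}
```

Keeping the transitions in a pure function makes each row of the table checkable in isolation, without timers or a real summarizer.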

Affected Code

  • src/agents/compaction.ts:263: retryAsync() call in summarizeChunks()
  • Called from summarizeWithFallback() and summarizeInStages()

extent analysis

TL;DR

Implement a circuit breaker pattern to prevent excessive token waste when the summarizer model is persistently unavailable.

Guidance

  • Identify the number of consecutive failures (N) that should trigger the circuit breaker to open and prevent further compaction attempts.
  • Modify the retryAsync() call in summarizeChunks() to incorporate the circuit breaker logic, switching between closed, open, and half-open states based on the number of failures and a cooldown period.
  • Implement a fallback mechanism to use the previous summary or a placeholder when the circuit breaker is open.
  • Add a cooldown period after which the circuit breaker moves from open to half-open and allows a single test attempt.

Example

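// Sketch of a circuit breaker around compaction (values are illustrative)
const MAX_FAILURES = 3; // N: consecutive failures before the circuit opens
const COOLDOWN_MS = 60_000; // cooldown before a half-open test attempt

let consecutiveFailures = 0;
let circuitBreakerState = 'closed'; // 'closed' | 'open' | 'half-open'
let openedAt = 0;

// attemptCompaction stands in for the retryAsync() call; it must return a promise
async function summarizeChunks(attemptCompaction, fallbackSummary) {
  // Once the cooldown has elapsed, allow a single test attempt
  if (circuitBreakerState === 'open' && Date.now() - openedAt >= COOLDOWN_MS) {
    circuitBreakerState = 'half-open';
  }
  if (circuitBreakerState === 'open') {
    // Fall back to the previous summary or a placeholder
    return fallbackSummary;
  }

  try {
    // Await the attempt so that a rejection is caught below
    const result = await attemptCompaction();
    consecutiveFailures = 0;
    circuitBreakerState = 'closed';
    return result;
  } catch (error) {
    consecutiveFailures++;
    // A failed half-open attempt reopens the circuit immediately
    if (circuitBreakerState === 'half-open' || consecutiveFailures >= MAX_FAILURES) {
      circuitBreakerState = 'open';
      openedAt = Date.now();
    }
    throw error;
  }
}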

Notes

The exact implementation of the circuit breaker pattern may vary depending on the specific requirements and constraints of the system. The cooldown period and the number of consecutive failures (N) should be tuned based on the system's behavior and the summarizer model's characteristics.

Recommendation

Implement the circuit breaker so that compaction stops after repeated failures and falls back to the previous summary until the summarizer recovers. Both linked PRs (#58846 and #58857) take this approach.
