openclaw - ✅ (Solved) Fix [Bug]: active-memory plugin infinite retry on API timeout causes token waste [1 pull request, 2 comments, 3 participants]

GitHub stats
openclaw/openclaw#74054 · Fetched 2026-04-30 06:29:21
Comments: 2 · Participants: 3 · Timeline events: 7 · Reactions: 1
Timeline (top): cross-referenced ×3, commented ×2, closed ×1, referenced ×1

When using the xiaomi/mimo-v2-flash model with the active-memory plugin, API calls frequently time out (15s threshold), but the plugin keeps retrying indefinitely because there is no retry limit configuration. This causes:

  1. Token waste (API requests are sent and tokens consumed even on timeout)
  2. Context overflow when messages pile up
  3. Stuck sessions that remain in processing state for 10+ minutes

Error Message

error=Context overflow: estimated context size exceeds safe threshold during tool loop

Root Cause

From source code analysis (/dist/extensions/active-memory/index.js):

  1. Plugin has no retry limit configuration: The plugin only handles single-call timeout (15s), returns status: "timeout", and does not retry internally.

  2. Agent-runner layer has no retry limit: When active-memory times out, the agent-runner triggers failover decisions but has no configurable retry limit.

  3. Token already consumed on timeout: The API request is sent and tokens are consumed even when the response times out.

Fix Action

Temporary Workaround

Disable the active-memory plugin or use a faster model:

"active-memory": {
  "enabled": false
}

Or configure with a faster fallback model:

"active-memory": {
  "config": {
    "model": "deepseek/deepseek-v4-flash",
    "modelFallback": "deepseek/deepseek-v4-flash"
  }
}

PR fix notes

PR #74158: feat(active-memory): add timeout circuit breaker to skip recall after consecutive failures

Description (problem / solution / changelog)

Summary

  • Add a per-agent/model circuit breaker to the Active Memory plugin that skips recall after consecutive timeouts, preventing token waste and stuck sessions.
  • Add two new configurable options: circuitBreakerMaxTimeouts (default: 3) and circuitBreakerCooldownMs (default: 60s).
  • Add 4 focused tests covering circuit breaker trip, reset, and config normalization.

Closes #74054

Problem

When using a slow-responding model (e.g., xiaomi/mimo-v2-flash with 15s+ response times) with the Active Memory plugin, API calls frequently time out, but the plugin continues to attempt recall on every eligible prompt. Since timeout results are deliberately not cached, each new message triggers another full recall attempt against the same slow model. This consumes tokens, causes context overflow, and leaves sessions stuck in the processing state for 10+ minutes.

The root cause: Active Memory has no plugin-owned mechanism to stop trying after repeated failures against the same provider/model.

Design

Circuit Breaker State

A module-level Map<string, CircuitBreakerEntry> keyed by agentId:provider/model tracks:

  • consecutiveTimeouts: count of consecutive timeout/timeout_partial results
  • lastTimeoutAt: timestamp of the last timeout

Behavior

  1. Before each recall attempt: Check if the circuit breaker is open (consecutive timeouts >= threshold AND within cooldown window). If open, return { status: "timeout", elapsedMs: 0 } immediately without calling the subagent.
  2. After timeout/timeout_partial: Increment the counter via recordCircuitBreakerTimeout().
  3. After ok/empty (success): Reset the counter via resetCircuitBreaker().
  4. After cooldown expires: The next attempt goes through (one retry allowed), and if it succeeds the breaker resets; if it times out again the breaker re-trips.
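
The four behaviors above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the entry fields and the helper names recordCircuitBreakerTimeout and resetCircuitBreaker come from the description, while breakerKey, isCircuitBreakerOpen, BreakerConfig, and the explicit `now` parameter are assumptions made to keep the example self-contained and testable.

```typescript
// Sketch only. recordCircuitBreakerTimeout / resetCircuitBreaker are named in
// the PR text; breakerKey, isCircuitBreakerOpen, BreakerConfig, and the
// explicit `now` argument are assumptions for illustration.
interface CircuitBreakerEntry {
  consecutiveTimeouts: number;
  lastTimeoutAt: number; // epoch milliseconds of the most recent timeout
}

interface BreakerConfig {
  circuitBreakerMaxTimeouts: number;
  circuitBreakerCooldownMs: number;
}

// Module-level state, keyed by agentId:provider/model.
const breakers = new Map<string, CircuitBreakerEntry>();

function breakerKey(agentId: string, provider: string, model: string): string {
  return `${agentId}:${provider}/${model}`;
}

// Open means: threshold reached AND still inside the cooldown window.
function isCircuitBreakerOpen(key: string, cfg: BreakerConfig, now: number): boolean {
  const entry = breakers.get(key);
  if (!entry) return false;
  const tripped = entry.consecutiveTimeouts >= cfg.circuitBreakerMaxTimeouts;
  const coolingDown = now - entry.lastTimeoutAt < cfg.circuitBreakerCooldownMs;
  return tripped && coolingDown;
}

// Called after a timeout or timeout_partial result.
function recordCircuitBreakerTimeout(key: string, now: number): void {
  const entry = breakers.get(key) ?? { consecutiveTimeouts: 0, lastTimeoutAt: 0 };
  entry.consecutiveTimeouts += 1;
  entry.lastTimeoutAt = now;
  breakers.set(key, entry);
}

// Called after an ok or empty result.
function resetCircuitBreaker(key: string): void {
  breakers.delete(key);
}
```

Because the counter is not cleared when the cooldown expires, a single further timeout on the probe attempt refreshes lastTimeoutAt and the breaker is open again, which matches step 4.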

Config Options

| Option | Type | Default | Range |
| --- | --- | --- | --- |
| circuitBreakerMaxTimeouts | integer | 3 | 1–20 |
| circuitBreakerCooldownMs | integer | 60000 | 5000–600000 |

Both are added to the plugin manifest schema with additionalProperties: false preserved, and include UI hints.
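
The defaults and ranges above suggest a normalization step along these lines. This is a hedged sketch: normalizeCircuitBreakerConfig, clampInteger, and NormalizedBreakerConfig are hypothetical names, and truncating non-integer input is an assumption, since the description only states that defaults and clamping are covered by tests.

```typescript
// Sketch only. Defaults (3 / 60000) and ranges (1-20 / 5000-600000) come from
// the PR description; the function names and the choice to truncate
// fractional values are assumptions.
interface NormalizedBreakerConfig {
  circuitBreakerMaxTimeouts: number;
  circuitBreakerCooldownMs: number;
}

function clampInteger(value: unknown, fallback: number, min: number, max: number): number {
  // Missing or non-numeric input falls back to the documented default.
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  return Math.min(max, Math.max(min, Math.trunc(value)));
}

function normalizeCircuitBreakerConfig(raw: Record<string, unknown>): NormalizedBreakerConfig {
  return {
    circuitBreakerMaxTimeouts: clampInteger(raw["circuitBreakerMaxTimeouts"], 3, 1, 20),
    circuitBreakerCooldownMs: clampInteger(raw["circuitBreakerCooldownMs"], 60000, 5000, 600000),
  };
}
```

With this shape, an empty config yields the defaults and out-of-range values are pulled back to the nearest bound rather than rejected.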

Files Changed

| File | Change |
| --- | --- |
| extensions/active-memory/index.ts | Circuit breaker state, helpers, integration into maybeResolveActiveRecall, config normalization, __testing exports |
| extensions/active-memory/index.test.ts | 4 new tests: breaker trips after consecutive timeouts, breaker resets on success after cooldown, config defaults, config clamping |
| extensions/active-memory/openclaw.plugin.json | New config schema properties and UI hints |

Test Plan

  • pnpm test extensions/active-memory — 99 tests (95 existing + 4 new), all pass
  • New test cases:
    • Circuit breaker trips after maxTimeouts consecutive timeouts (subagent not called again)
    • Circuit breaker resets after cooldown + successful recall
    • Config normalization produces correct defaults (3 / 60000)
    • Config clamping enforces min bounds (1 / 5000)

Notes

  • Circuit breaker is extension-owned, not a generic runner feature — this follows the architecture principle that extension-specific behavior stays in the extension.
  • The breaker keys are scoped by agentId:provider/model to avoid cross-agent or cross-model interference.
  • The existing behavior of not caching timeout results is preserved; the circuit breaker is an additional layer above the cache.
  • Backward compatible: new config options are optional with sensible defaults.

Changed files

  • extensions/active-memory/index.test.ts (modified, +143/-0)
  • extensions/active-memory/index.ts (modified, +98/-0)
  • extensions/active-memory/openclaw.plugin.json (modified, +10/-0)

RAW_BUFFER

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

When using the xiaomi/mimo-v2-flash model with the active-memory plugin, API calls frequently time out (15s threshold), but the plugin keeps retrying indefinitely because there is no retry limit configuration. This causes:

  1. Token waste (API requests are sent and tokens consumed even on timeout)
  2. Context overflow when messages pile up
  3. Stuck sessions that remain in processing state for 10+ minutes

Steps to reproduce

  1. Configure OpenClaw with a slow-responding model (e.g., xiaomi/mimo-v2-flash with 15s+ response times)
  2. Enable active-memory plugin with default settings
  3. Have multiple conversations that trigger active-memory recall
  4. Observe repeated timeout events in gateway logs
  5. Check token usage - tokens are consumed despite timeouts

Expected behavior

When an API call times out:

  1. The plugin should stop retrying after a configurable number of attempts
  2. Token consumption should be minimized on timeout
  3. The system should gracefully degrade (skip memory recall) instead of infinite retry

Actual behavior

  1. No retry limit: The agent-runner layer retries indefinitely on timeout
  2. Token waste: Each timeout still consumes tokens (API request is sent)
  3. Context overflow: Messages pile up during retry loops, causing Context overflow: estimated context size exceeds safe threshold
  4. Stuck sessions: Sessions remain in processing state for 10+ minutes
  5. No circuit breaker: No mechanism to stop retries after repeated failures

OpenClaw version

2026.4.26 (be8c246)

Install method

npm global

Model

xiaomi/mimo-v2-flash

Provider / routing chain

openclaw -> xiaomi (direct API)

Additional provider/model setup details

"active-memory": {
  "enabled": true,
  "config": {
    "agents": ["main"],
    "queryMode": "recent",
    "promptStyle": "balanced",
    "maxSummaryChars": 220,
    "logging": true,
    "timeoutMs": 15000
  }
}

Logs, screenshots, and evidence

API Timeout Events

[02:16:05.824] active-memory: agent=main session=agent:main:main 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               start timeoutMs=15000 queryChars=962

[02:16:24.172] active-memory: agent=main session=agent:main:main 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               done status=timeout elapsedMs=18350 summaryChars=0

[03:01:58.709] active-memory: agent=main session=... 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               start timeoutMs=15000 queryChars=434

[03:02:33.908] active-memory: agent=main session=... 
               activeProvider=xiaomi activeModel=mimo-v2-flash 
               done status=timeout elapsedMs=35201 summaryChars=0

Context Overflow

[03:06:57.804] [context-overflow-diag] sessionKey=agent:main:feishu:group:... 
               provider=xiaomi/mimo-v2-flash 
               source=assistantError 
               messages=116 
               error=Context overflow: estimated context size exceeds safe threshold during tool loop

Stuck Session

[03:15:46] stuck session: sessionId=main 
           sessionKey=agent:main:feishu:group:... 
           state=processing 
           age=493s (8+ minutes)

[03:17:46] stuck session: ... age=613s (10+ minutes)

Token Consumption

  • Timeout rate: 100% (4/4 calls timed out)
  • Average response time: 20.89s (threshold: 15s)
  • Longest response: 35.20s (135% over threshold)
  • Estimated token waste: Hundreds of thousands of tokens
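
Figures like these can be recomputed from the gateway log excerpts. The sketch below assumes the `done status=... elapsedMs=...` fragment shown in the logs above stays on one line; summarizeRecallLog and percentOverThreshold are hypothetical helper names, not part of OpenClaw.

```typescript
// Sketch only: helper names are hypothetical; the log fragment format
// ("done status=<status> elapsedMs=<n>") is copied from the excerpts above.
interface RecallStats {
  total: number;
  timeouts: number;
  maxElapsedMs: number;
}

function summarizeRecallLog(log: string): RecallStats {
  const pattern = /done status=(\w+) elapsedMs=(\d+)/g;
  const stats: RecallStats = { total: 0, timeouts: 0, maxElapsedMs: 0 };
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(log)) !== null) {
    stats.total += 1;
    // Counts both "timeout" and "timeout_partial" as timeouts.
    if ((match[1] ?? "").indexOf("timeout") === 0) stats.timeouts += 1;
    stats.maxElapsedMs = Math.max(stats.maxElapsedMs, Number(match[2]));
  }
  return stats;
}

// How far a response overshot the configured timeout, in percent.
function percentOverThreshold(elapsedMs: number, timeoutMs: number): number {
  return ((elapsedMs - timeoutMs) / timeoutMs) * 100;
}
```

For the longest logged call, percentOverThreshold(35201, 15000) is roughly 134.7, which rounds to the 135% quoted above.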

Impact and severity

Affected: All users using slow-responding models with active-memory plugin

Severity: High

  • Token waste (direct cost impact)
  • Session stuck (blocks user interactions)
  • Context overflow (degrades system stability)

Frequency: 100% timeout rate observed during testing

Consequence:

  • Unnecessary token consumption
  • Poor user experience (long waits, no responses)
  • System instability (stuck sessions, context overflow)

Additional information

Root Cause Analysis

From source code analysis (/dist/extensions/active-memory/index.js):

  1. Plugin has no retry limit configuration: The plugin only handles single-call timeout (15s), returns status: "timeout", and does not retry internally.

  2. Agent-runner layer has no retry limit: When active-memory times out, the agent-runner triggers failover decisions but has no configurable retry limit.

  3. Token already consumed on timeout: The API request is sent and tokens are consumed even when the response times out.

Suggested Fixes

  1. Add retry limit configuration: Add maxRetries config option to active-memory plugin (e.g., 0-3 retries)

  2. Add circuit breaker: Stop retries after N consecutive failures

  3. Add timeout-aware token tracking: Track and report token consumption on timeout

  4. Improve timeout handling: Cancel API request on client-side timeout to potentially save tokens
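
Suggested fix 4 could look something like the sketch below, which aborts the in-flight request when the client-side timeout fires. callWithTimeout and TimedResult are hypothetical names, and the sketch assumes the underlying request honors a standard AbortSignal. Whether a provider stops generating (and billing) after the connection is aborted is not established by this issue, so any saving is only potential.

```typescript
// Sketch only: callWithTimeout and TimedResult are hypothetical. It assumes
// the caller's request function accepts a standard AbortSignal.
type TimedResult<T> =
  | { status: "ok"; value: T; elapsedMs: number }
  | { status: "timeout"; elapsedMs: number };

async function callWithTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
): Promise<TimedResult<T>> {
  const controller = new AbortController();
  const startedAt = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<undefined>((resolve) => {
    timer = setTimeout(() => {
      controller.abort(); // tell the in-flight request to stop
      resolve(undefined);
    }, timeoutMs);
  });
  try {
    const outcome = await Promise.race([run(controller.signal), timedOut]);
    const elapsedMs = Date.now() - startedAt;
    // The signal is only ever aborted by the timer above.
    if (controller.signal.aborted) return { status: "timeout", elapsedMs };
    return { status: "ok", value: outcome as T, elapsedMs };
  } catch (error) {
    // A request that rejects because it was aborted still counts as a timeout.
    if (controller.signal.aborted) {
      return { status: "timeout", elapsedMs: Date.now() - startedAt };
    }
    throw error;
  } finally {
    clearTimeout(timer);
  }
}
```

A caller would pass the signal through to its HTTP client, for example `callWithTimeout((signal) => fetch(url, { signal }), 15000)`, where `url` stands in for whatever endpoint the recall request targets.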

Temporary Workaround

Disable the active-memory plugin or use a faster model:

"active-memory": {
  "enabled": false
}

Or configure with a faster fallback model:

"active-memory": {
  "config": {
    "model": "deepseek/deepseek-v4-flash",
    "modelFallback": "deepseek/deepseek-v4-flash"
  }
}

extent analysis

TL;DR

Implement a retry limit and circuit breaker in the active-memory plugin to prevent indefinite retries and token waste on API timeouts.

Guidance

  • Add a maxRetries configuration option to the active-memory plugin to limit the number of retries on timeout.
  • Implement a circuit breaker mechanism to stop retries after a specified number of consecutive failures.
  • Consider adding timeout-aware token tracking to report and minimize token consumption on timeouts.
  • Evaluate the feasibility of canceling API requests on client-side timeouts to potentially save tokens.

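Example (hypothetical option names; PR #74158 instead adds circuitBreakerMaxTimeouts and circuitBreakerCooldownMs)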

"active-memory": {
  "config": {
    "maxRetries": 3,
    "circuitBreaker": {
      "threshold": 5,
      "resetTimeout": 30000
    }
  }
}

Notes

The suggested fixes require modifications to the active-memory plugin and potentially the agent-runner layer. The temporary workaround of disabling the active-memory plugin or using a faster model may not be suitable for all use cases.

Recommendation

Until a version containing the circuit breaker from PR #74158 is installed, apply the workaround: disable the active-memory plugin or configure a faster fallback model. Either option immediately mitigates the token waste and stuck sessions.


