openclaw: Memory plugin auto-recall embeds full conversation context, causing context length errors and timeouts

openclaw/openclaw#70871 · Fetched 2026-04-24 10:38:30 · Comments: 0 · Participants: 1 · Timeline: 0 · Reactions: 0


Summary

Auto-recall for memory plugins (specifically openclaw-mem0 in open-source mode) fails on conversations with large context because the full conversation is passed to the embedding model instead of a focused query.

Environment

  • OpenClaw: 2026.4.15
  • Plugin: openclaw-mem0 1.0.10
  • Mode: open-source (self-hosted)
  • Embedding model: Ollama nomic-embed-text (~8K token context limit)
  • Vector store: Qdrant (localhost)
  • OS: Windows 11

Observed Behavior

  1. Context length error when conversation includes large content (PDFs, long code blocks):

    [plugins] openclaw-mem0: recall failed: ResponseError: the input length exceeds the context length
  2. 8-second timeout on moderate-to-large conversations:

    [plugins] openclaw-mem0: recall timed out after 8000ms, skipping
  3. Normal short conversations work fine — recall succeeds and memories are injected.

Expected Behavior

  • Auto-recall should extract a focused query from the conversation (e.g., last user message, ~500 tokens max) rather than embedding the entire context
  • Large attachments, PDFs, or code blocks in the conversation should not affect recall
  • The embedding input should be deterministic and bounded
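
The expected behavior above can be sketched as a small query builder. This is a minimal sketch, not OpenClaw's or openclaw-mem0's actual code: the `ChatMessage` shape, the `buildRecallQuery` name, and the 4-characters-per-token estimate are all assumptions made for illustration.

```typescript
// Hypothetical message shape; OpenClaw's real types may differ.
interface ChatMessage {
  role: "user" | "assistant" | "system";
  content: string;
}

// Rough heuristic (assumption): about 4 characters per token for English
// text. This is not the tokenizer nomic-embed-text actually uses.
const CHARS_PER_TOKEN = 4;

// Build a bounded recall query from the most recent non-empty user message.
function buildRecallQuery(messages: ChatMessage[], maxTokens = 500): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  for (let i = messages.length - 1; i >= 0; i--) {
    const msg = messages[i];
    if (msg && msg.role === "user" && msg.content.trim().length > 0) {
      // Collapse whitespace so the same input always yields the same query.
      const text = msg.content.replace(/\s+/g, " ").trim();
      return text.slice(0, maxChars);
    }
  }
  return ""; // No user message found; the caller should skip recall.
}
```

Because the output length depends only on `maxTokens`, a pasted PDF or long code block cannot push the embedding input past the model's limit. The sketch keeps the head of the message; keeping the tail instead may work better when users paste a document and ask their question at the end.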

Root Causes (Analysis)

  1. Full context passed to embedding: The plugin receives the entire conversation context for recall, then attempts to embed it. This exceeds embedding model limits.

  2. Hardcoded 8s timeout: The 8000ms timeout is not configurable. Even when the input fits within the embedding model's limit, embedding a large context can take longer than 8 seconds and time out.

  3. No graceful degradation: When embedding fails, recall fails entirely rather than falling back to a truncated query.
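
Graceful degradation (root cause 3) could take the form of a retry loop that shortens the input each time the provider rejects it. The sketch below is illustrative only: `Embedder`, `embedWithFallback`, and the halving strategy are assumptions, and the error check simply matches the log text quoted in this issue.

```typescript
// Hypothetical embedder signature; it stands in for whatever call the
// plugin makes to its embedding provider.
type Embedder = (input: string) => Promise<number[]>;

// Matches the error text from the gateway log quoted in this issue.
function isContextLengthError(err: unknown): boolean {
  return String(err).includes("exceeds the context length");
}

// Try to embed `query`; on a context-length error, halve the input and
// retry until it fits or `minChars` is reached. Other errors are rethrown.
async function embedWithFallback(
  embed: Embedder,
  query: string,
  minChars = 256,
): Promise<number[] | null> {
  let input = query;
  for (;;) {
    try {
      return await embed(input);
    } catch (err) {
      if (!isContextLengthError(err)) throw err;
      if (input.length <= minChars) return null; // Give up; caller skips recall.
      const nextLength = Math.max(minChars, Math.floor(input.length / 2));
      input = input.slice(0, nextLength);
    }
  }
}
```

Each retry costs another round trip to the embedding provider, so this is a safety net rather than a substitute for bounding the query up front.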

Suggested Fixes

  1. Bound the recall query: OpenClaw should pass a bounded context slice to memory plugins (e.g., last N tokens, configurable)

  2. Expose recallTimeoutMs config: Allow users to increase timeout for slower embedding providers

  3. Plugin contract clarification: Document whether plugins are responsible for truncation, or whether OpenClaw guarantees bounded input
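
Suggested fix 2 might look roughly like the following. This is a sketch under stated assumptions: `recallTimeoutMs` is the option name proposed in this issue, not an existing setting, and `RecallConfig`, `withTimeout`, and `recallWithTimeout` are hypothetical names.

```typescript
// Hypothetical config shape; `recallTimeoutMs` is the option proposed in
// this issue and does not exist in OpenClaw today.
interface RecallConfig {
  recallTimeoutMs: number;
}

// Reject if `task` does not settle within `ms` milliseconds.
function withTimeout<T>(task: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`recall timed out after ${ms}ms`)),
      ms,
    );
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      },
    );
  });
}

// Run a recall function under the configured timeout. Returns null on
// timeout or failure so the caller can continue without memories.
async function recallWithTimeout<T>(
  recall: () => Promise<T>,
  config: RecallConfig,
): Promise<T | null> {
  try {
    return await withTimeout(recall(), config.recallTimeoutMs);
  } catch {
    return null;
  }
}
```

A user with a slow local embedding provider could then pass something like `{ recallTimeoutMs: 30000 }` instead of being held to 8000ms. Note that the timeout only stops waiting; it does not cancel the underlying embedding request.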

Workaround

None currently. Auto-recall fails silently on large contexts; users must use manual memory_search tool calls.

Reproduction Steps

  1. Configure openclaw-mem0 in open-source mode with Ollama embeddings
  2. Enable autoRecall: true
  3. Send a message with a large PDF attachment or paste a long document
  4. Observe gateway logs showing recall failure

Related

This may also need changes in the openclaw-mem0 plugin to truncate before embedding, but the architectural decision of what gets passed to plugins is OpenClaw's responsibility.


Happy to provide additional logs or test configurations if helpful.

extent analysis

TL;DR

Bound the recall query by passing a truncated context slice to memory plugins to prevent exceeding embedding model limits.

Guidance

  • Verify the current implementation of the openclaw-mem0 plugin to determine if it attempts to embed the entire conversation context.
  • Consider modifying the OpenClaw plugin interface to guarantee bounded input, such as passing only the last N tokens of the conversation.
  • Evaluate the feasibility of exposing a recallTimeoutMs configuration option to allow users to adjust the timeout for slower embedding providers.
  • Investigate implementing a fallback mechanism for recall failures, such as truncating the query or using a default embedding.

Notes

The solution may require changes to both the OpenClaw core and the openclaw-mem0 plugin. The exact implementation will depend on the plugin's current behavior and the desired architectural design.

Recommendation

Modify the OpenClaw plugin interface to pass a bounded context slice to memory plugins. This is the most direct fix, because it keeps the embedding input under the model's context limit regardless of conversation size. Until such a change lands, the only option is the manual memory_search tool call noted under Workaround.
