openclaw - How to fix: Retry loop duplicates user message hundreds of times in context window on rate limit [1 comment, 2 participants]

GitHub stats
openclaw/openclaw#57880 (fetched 2026-04-08 01:56:32)
Comments: 1 · Participants: 2 · Timeline: 1 · Reactions: 1


Bug

When the upstream provider (Anthropic Claude) rate-limits a request, OpenClaw retries delivery, but each retry re-appends the user's inbound message to the session context. As a result, the same user message appears 500+ times in the agent's context window.

Expected behavior

  • After N failed retries, either fall back to the next model in the fallback chain, or stop retrying and notify the user
  • Deduplicate repeated identical inbound messages so the context window isn't polluted

Actual behavior

  • The same user message was injected into the context 500+ times
  • The agent received a massively bloated context with identical content
  • No fallback to the next provider was triggered (fallback chain only activates on new session creation, not retries within existing sessions)
  • The agent eventually responded but with an enormous context overhead
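
The symptom can be reproduced with a toy model. The sketch below is not OpenClaw's code: `buggy_handle`, `make_flaky_send`, and `RateLimitError` are hypothetical names, and the only assumption carried over from the report is that the append happens inside the retry loop.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for the provider's rate-limit error."""


def buggy_handle(message, context, send, max_attempts):
    """Mimics the reported behavior: the append sits inside the retry loop."""
    for _ in range(max_attempts):
        context.append(message)  # re-appended on every attempt
        try:
            return send(context)
        except RateLimitError:
            continue
    return None


def make_flaky_send(failures):
    """Return a send() that raises RateLimitError `failures` times, then succeeds."""
    state = {"left": failures}

    def send(context):
        if state["left"] > 0:
            state["left"] -= 1
            raise RateLimitError()
        return len(context)  # pretend the reply reports the context size

    return send


context = []
seen = buggy_handle({"id": "m1", "text": "hello"}, context, make_flaky_send(499), 1000)
print(len(context), seen)  # 500 500
```

With 499 rate-limited attempts followed by one success, the provider finally receives a context holding 500 copies of the same message, which matches the scale described in the report.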

Impact

  • Wastes tokens (context full of duplicated messages)
  • May cause the agent to hit context length limits
  • Makes it harder for the agent to parse the actual user intent
  • Costs money on large-context models like Opus

Environment

  • OpenClaw version: latest (2026-03-30)
  • Primary model: anthropic/claude-opus-4-6
  • Fallback chain: opus → sonnet → gpt-5.4
  • Channel: Telegram

Suggested fix

  1. Dedup: If the same message_id is already in the session context, don't re-append it
  2. Retry limit: After N retries (e.g. 3-5), either fall back to the next model or return an error to the user
  3. Extend fallback chain scope: Make fallback work for retries within existing sessions, not just new session creation
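
The dedup step (item 1) could look roughly like this. It is a minimal sketch: `SessionContext` and `append_once` are hypothetical names rather than OpenClaw's actual data structures, and the message ID format is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SessionContext:
    """Hypothetical session store; not OpenClaw's actual data structure."""
    messages: list = field(default_factory=list)
    seen_ids: set = field(default_factory=set)

    def append_once(self, message_id, text):
        """Append the message unless its ID was already recorded.

        Returns True if the message was appended, False if it was a duplicate.
        """
        if message_id in self.seen_ids:
            return False
        self.seen_ids.add(message_id)
        self.messages.append({"id": message_id, "text": text})
        return True


session = SessionContext()
results = [session.append_once("tg:42", "hello") for _ in range(500)]
print(len(session.messages), results.count(True))  # 1 1
```

Keying on the message ID rather than the message text means two distinct messages with identical content are both kept, while a retried delivery of the same message is dropped.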

extent analysis

Fix Plan

To address the issue, we'll implement the following steps:

  • Deduplication: Check if a message ID already exists in the session context before appending it.
  • Retry Limit: Introduce a retry limit (e.g., 3-5 attempts) before falling back to the next model or returning an error.
  • Fallback Chain Extension: Modify the fallback chain to trigger on retries within existing sessions.

Example Code

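import time


class RateLimitError(Exception):
    """Placeholder for the provider's rate-limit exception."""


# deliver_message, get_current_model, get_fallback_chain, switch_to_model and
# return_error_to_user are placeholders for the application's own hooks.

def process_message(message_id, session_context, max_retries=3):
    # Dedup: append the message only if its ID is not already in the
    # session context (assumed here to be a list of message IDs)
    if message_id not in session_context:
        session_context.append(message_id)

    # Attempt delivery at most max_retries times; retries never touch the context
    for attempt in range(max_retries):
        try:
            # Deliver message to upstream provider
            deliver_message(message_id)
            return
        except RateLimitError:
            # Wait before retrying, except after the final attempt
            if attempt < max_retries - 1:
                time.sleep(1)

    # Retry limit exceeded, fall back to next model
    fall_back_to_next_model()

def fall_back_to_next_model():
    # Get current model and fallback chain (assumed to be an ordered list)
    current_model = get_current_model()
    fallback_chain = get_fallback_chain()

    # Find the models that follow the current one in the fallback chain
    try:
        position = fallback_chain.index(current_model)
    except ValueError:
        position = -1  # current model not in the chain: start from the first entry
    remaining = fallback_chain[position + 1:]

    # Switch to next model if available
    if remaining:
        switch_to_model(remaining[0])
    else:
        # No more models in fallback chain, return error to user
        return_error_to_user()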

Verification

To verify the fix, test the following scenarios:

  • Send a message with a unique ID and verify it's appended to the session context only once.
  • Simulate a rate limit error and verify the retry limit is enforced (e.g., 3-5 attempts).
  • Test the fallback chain by simulating a rate limit error and verifying the next model is used.
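
These scenarios can be scripted against a fake provider. The sketch below is self-contained and makes several assumptions: `send_with_fallback` is a hypothetical function under test, `FakeProvider` is a test double, and the model names are shortened from the fallback chain in the report.

```python
class RateLimitError(Exception):
    """Hypothetical stand-in for the provider's rate-limit error."""


def send_with_fallback(message_id, context, chain, send, max_retries=3):
    """Hypothetical function under test: append once, bounded retries per model."""
    if message_id not in context:
        context.append(message_id)
    for model in chain:
        for _ in range(max_retries):
            try:
                return send(model, context)
            except RateLimitError:
                continue
    raise RuntimeError("all models rate-limited")


class FakeProvider:
    """Records every call and rate-limits the models listed in `limited`."""

    def __init__(self, limited):
        self.limited = set(limited)
        self.calls = []

    def __call__(self, model, context):
        self.calls.append(model)
        if model in self.limited:
            raise RateLimitError()
        return f"reply from {model}"


def test_single_append_and_fallback():
    provider = FakeProvider(limited={"opus"})
    context = []
    reply = send_with_fallback("m1", context, ["opus", "sonnet"], provider)
    assert context == ["m1"]                            # appended exactly once
    assert provider.calls == ["opus"] * 3 + ["sonnet"]  # retry limit enforced
    assert reply == "reply from sonnet"                 # next model was used


def test_error_when_chain_exhausted():
    provider = FakeProvider(limited={"opus", "sonnet"})
    try:
        send_with_fallback("m1", [], ["opus", "sonnet"], provider)
    except RuntimeError:
        pass
    else:
        raise AssertionError("expected RuntimeError")
    assert len(provider.calls) == 6  # 3 attempts per model, then give up


test_single_append_and_fallback()
test_error_when_chain_exhausted()
print("ok")
```

The fake provider raises immediately and the function under test does not sleep, so the tests run without real delays.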

Extra Tips

  • Monitor how often the retry limit is reached and tune it to balance successful delivery against excessive retries.
  • Consider implementing a backoff strategy (e.g., exponential backoff) to reduce the load on the upstream provider during retries.
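
One common form of the backoff tip is exponential backoff with full jitter. This is a minimal sketch: `backoff_delay` and its `base`, `cap`, and `rng` parameters are illustrative names and defaults, not values taken from OpenClaw.

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0, rng=random.random):
    """Delay in seconds before retry number `attempt` (0-based), with full jitter.

    The ceiling doubles with each attempt up to `cap`; the actual delay is a
    random fraction of that ceiling so concurrent clients do not retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling


# Upper bounds for the first six attempts (jitter disabled by fixing rng at 1.0)
print([backoff_delay(n, rng=lambda: 1.0) for n in range(6)])
# [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

In a retry loop the result would be passed to `time.sleep` between attempts; the cap keeps a long run of failures from producing unbounded waits.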
