openclaw: Gateway enters infinite model-switch loop when all auth profiles fail

openclaw · 2026-03-30 19:42:50

openclaw/openclaw#57905 • Fetched 2026-04-08 01:56:17

Author

klausbot98-bot

Problem

When all auth profiles enter cooldown (e.g., Anthropic OAuth token rejected + no xAI key), the gateway enters an infinite model-switch loop cycling every ~1 second. The gateway becomes completely unresponsive and cannot process any messages. The loop persists across restarts because session state is recreated from sessions.json.

Expected behavior

When all models/providers fail, the gateway should send an error message to the user and stop retrying — not spin forever.

Environment

OpenClaw 2026.3.28
macOS (arm64)
Telegram channel

Steps to reproduce

1. Configure Anthropic as the primary provider, with fallbacks to providers that have no key configured.
2. The Anthropic OAuth token is rejected (401).
3. The auth profile enters cooldown.
4. The gateway cycles through the fallbacks, all of which fail, and loops back to the primary.
5. The loop repeats about once per second and consumes all gateway resources.
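
The failure mode in these steps can be illustrated with a small simulation. This is only a sketch of the general pattern, not OpenClaw's actual code: the `PROVIDERS` table and the function names `try_provider`, `looping_switch`, and `single_pass_switch` are hypothetical. It contrasts a fallback chain that wraps around to the primary with one that tries each provider once and then gives up.

```python
from itertools import cycle, islice

# Hypothetical provider table: False means authentication is unavailable.
PROVIDERS = {"anthropic": False, "xai": False}


def try_provider(name: str) -> bool:
    """Stand-in for a real request; fails whenever auth is unavailable."""
    return PROVIDERS[name]


def looping_switch(order, max_iterations):
    """Wrap-around fallback. Without the iteration cap used here for the
    demo, this would never return while every provider keeps failing."""
    attempts = []
    for name in islice(cycle(order), max_iterations):
        attempts.append(name)
        if try_provider(name):
            break
    return attempts


def single_pass_switch(order):
    """Bounded fallback: try each provider once, then report failure."""
    attempts = []
    for name in order:
        attempts.append(name)
        if try_provider(name):
            return name, attempts
    return None, attempts


order = ["anthropic", "xai"]
looped = looping_switch(order, max_iterations=10)
winner, tried = single_pass_switch(order)
print(len(looped), winner, tried)
```

The wrap-around version uses every iteration it is given, while the single-pass version stops after two attempts and returns `None`, which the caller can turn into a user-facing error.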

Additional context

Manual cooldown reset in auth-profiles.json did not take effect on restart
The configure wizard also left stale claude-cli/ model prefixes which compounded the issue

Fix Plan

To resolve the infinite model-switch loop, the gateway needs to detect when every auth profile is in cooldown and, at that point, stop retrying instead of cycling back to the primary.

Code Changes

We will introduce a new variable allProfilesFailed to track when all profiles have entered cooldown. We will also add a check to prevent retries when this condition is met.

# auth_manager.py
class AuthManager:
    def __init__(self):
        self.allProfilesFailed = False
        # ... existing code ...

    def switch_model(self):
        if self.allProfilesFailed:
            # Send error message to user and stop retrying
            self.send_error_message("All auth profiles have failed. Please check your configuration.")
            return

        # ... existing code to switch models ...

    def update_profile_status(self, profile, status):
        # Record the new status first (assumed to be what the existing code does)
        profile.status = status
        # Recompute on every change, not only on "cooldown", so the flag
        # clears again once any profile recovers
        self.allProfilesFailed = bool(self.profiles) and all(
            p.status == "cooldown" for p in self.profiles
        )
        if self.allProfilesFailed:
            # Prevent retries
            self.retry_timeout = None
        # ... existing code ...
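
For experimenting with the guard outside the gateway, a self-contained variant might look like the following. Everything here is an assumption made for illustration: the `Profile` dataclass, the `sent` list standing in for the user-facing channel, and the `notified` flag (added so the error goes out once, not on every call) are not taken from the OpenClaw code base.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    name: str
    status: str = "ok"  # "ok" or "cooldown"


@dataclass
class AuthManager:
    profiles: list
    sent: list = field(default_factory=list)  # captured user-facing messages
    all_profiles_failed: bool = False
    notified: bool = False

    def update_profile_status(self, profile, status):
        profile.status = status
        # Recompute on every change so recovery clears the flag.
        self.all_profiles_failed = bool(self.profiles) and all(
            p.status == "cooldown" for p in self.profiles
        )
        if not self.all_profiles_failed:
            self.notified = False

    def switch_model(self):
        """Return a usable profile, or None after reporting failure once."""
        if self.all_profiles_failed:
            if not self.notified:
                self.sent.append("All auth profiles have failed.")
                self.notified = True
            return None
        return next((p for p in self.profiles if p.status != "cooldown"), None)


anthropic, xai = Profile("anthropic"), Profile("xai")
manager = AuthManager([anthropic, xai])

manager.update_profile_status(anthropic, "cooldown")
first = manager.switch_model()  # falls back to the remaining profile

manager.update_profile_status(xai, "cooldown")
manager.switch_model()  # reports the failure
manager.switch_model()  # stays quiet: already reported
```

Returning `None` gives the caller an explicit stop signal, so the decision to retry is never left to a loop that has no exit condition.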

Configuration Changes

No configuration changes are required for this fix.

Verification

To verify the fix, follow these steps:

Reproduce the issue by having all auth profiles enter cooldown.
Verify that the gateway sends an error message to the user and stops retrying.
Check the logs to ensure that the allProfilesFailed variable is set correctly and retries are prevented.
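
The second verification step could also be automated. The test below is a rough sketch against a hypothetical `dispatch` function, not the gateway's real entry point. It injects a provider call that always rejects with a 401-style error, then checks that each provider is tried exactly once and that an error reply comes back.

```python
import unittest


def dispatch(message, providers, call):
    """Hypothetical gateway entry point: one pass over providers, then an error."""
    for name in providers:
        try:
            return call(name, message)
        except PermissionError:
            continue  # auth failed for this provider; try the next one
    return "error: all providers failed"


class AllProvidersFailTest(unittest.TestCase):
    def test_stops_after_one_pass(self):
        calls = []

        def always_rejects(name, message):
            calls.append(name)
            raise PermissionError("401")

        reply = dispatch("hi", ["anthropic", "xai"], always_rejects)
        # Each provider is attempted once, with no wrap-around to the primary.
        self.assertEqual(calls, ["anthropic", "xai"])
        self.assertTrue(reply.startswith("error"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AllProvidersFailTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```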

Extra Tips

With multiple auth profiles, recompute allProfilesFailed whenever any profile's status changes, so the flag clears as soon as one profile recovers.
Consider adding a timeout or a maximum number of retries, so the gateway gives up after a bounded number of attempts instead of hammering the providers.
Review the auth-profiles.json file to ensure that the cooldown reset is properly applied after a restart.
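
The retry limit suggested above can be sketched as a capped exponential backoff with a fixed retry budget. This is one possible design, not something the issue prescribes. The names `backoff_delays` and `run_with_retries`, and all default values, are assumptions chosen for the example.

```python
def backoff_delays(base=1.0, factor=2.0, cap=60.0, max_retries=5):
    """Yield at most max_retries delays, growing by factor up to cap."""
    delay = base
    for _ in range(max_retries):
        yield min(delay, cap)
        delay *= factor


def run_with_retries(attempt, sleep, **kwargs):
    """Call attempt() until it succeeds or the retry budget is spent."""
    if attempt():
        return True
    for delay in backoff_delays(**kwargs):
        sleep(delay)
        if attempt():
            return True
    return False  # budget exhausted: the caller should report an error


# Record the requested waits instead of actually sleeping.
waits = []
ok = run_with_retries(lambda: False, waits.append, base=1.0, cap=8.0, max_retries=5)
print(ok, waits)
```

Passing `sleep` in as a parameter keeps the example instant to run. In real use, `time.sleep` would take its place.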
