openclaw - 💡(How to fix) Fix Codex OAuth refresh failures can wedge an agent for hours without clear alerting or aggressive profile rotation

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

When Codex OAuth refresh starts timing out or a profile later returns 401 Unauthorized: Your authentication token has been invalidated, OpenClaw can keep retrying inside the same provider/auth lane for hours without surfacing a clear operator-visible incident and without rotating aggressively enough across other valid profiles.

Error Message

On my local OpenClaw runtime, Dev was effectively down for about a day even though some Codex subscriptions were still valid. The system did retry and sometimes fell back from openai/gpt-5.5 to openai/gpt-5.4, but that was still inside the same Codex auth path and did not resolve the incident.

Root Cause

This failure mode makes the system look like the model or subscription is just "down" while the actual problem is auth refresh/profile management. The current behavior costs hours of silent degraded service and forces manual log spelunking to understand what happened.

RAW_BUFFERClick to expand / collapse

Summary

When Codex OAuth refresh starts timing out or a profile later returns 401 Unauthorized: Your authentication token has been invalidated, OpenClaw can keep retrying inside the same provider/auth lane for hours without surfacing a clear operator-visible incident and without rotating aggressively enough across other valid profiles.

Observed behavior

On my local OpenClaw runtime, Dev was effectively down for about a day even though some Codex subscriptions were still valid. The system did retry and sometimes fell back from openai/gpt-5.5 to openai/gpt-5.4, but that was still inside the same Codex auth path and did not resolve the incident.

What it did wrong:

  • kept retrying a failing Codex OAuth refresh path for many hours
  • did not raise a clear operator-visible alert saying a subscription/auth profile was failing
  • did not rotate hard enough across other configured valid profiles after repeated refresh timeouts
  • later encountered a hard 401 invalidated on one profile after the prolonged timeout loop

Expected behavior

After repeated refresh timeouts or an invalidated-token response, OpenClaw should:

  • classify this as an auth-profile/subscription incident, not a generic long-running retry
  • surface a visible alert to the operator
  • quarantine the failing profile/provider path sooner
  • rotate across other valid configured profiles more aggressively
  • avoid leaving an agent effectively down for hours while only logging internal fallback churn

Concrete evidence

From ~/.openclaw/logs/gateway.err.log:

  • 2026-05-23T12:42:10.723-07:00 embedded run failover decision ... rawError=auth refresh request timed out after 10s
  • 2026-05-23T12:43:19.371-07:00 Embedded agent failed before reply: All models failed (2): openai/gpt-5.5: auth refresh request timed out after 10s (timeout) | openai/gpt-5.4: auth refresh request timed out after 10s (timeout)
  • this same pattern repeats across cron and direct session runs through 2026-05-24T13:27:55.170-07:00
  • 2026-05-24T14:28:22.074-07:00 embedded run failover decision ... reason=auth ... rawError=unexpected status 401 Unauthorized: Your authentication token has been invalidated. Please try signing in again.

Current runtime state after restart/reauth showed multiple configured openai-codex OAuth profiles, including both Pro and Free accounts, but the outage path was not surfaced as a clear profile-selection/auth incident.

Environment

  • OpenClaw 2026.5.22
  • runtime from local gateway logs
  • Codex provider via OAuth profiles
  • Discord-triggered agent plus cron-triggered runs both affected

Why this matters

This failure mode makes the system look like the model or subscription is just "down" while the actual problem is auth refresh/profile management. The current behavior costs hours of silent degraded service and forces manual log spelunking to understand what happened.

Suggested fix areas

  • auth failure classification for repeated Codex refresh timeouts
  • provider/profile quarantine and rotation policy
  • operator-visible incident/alert path
  • session status surface making it obvious which auth profile is failing and why

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

After repeated refresh timeouts or an invalidated-token response, OpenClaw should:

  • classify this as an auth-profile/subscription incident, not a generic long-running retry
  • surface a visible alert to the operator
  • quarantine the failing profile/provider path sooner
  • rotate across other valid configured profiles more aggressively
  • avoid leaving an agent effectively down for hours while only logging internal fallback churn

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING