openclaw - ✅(Solved) Fix v2026.5.18 doctor/status can leave openai-codex OAuth sidecar auth partially repaired while runtime still fails [2 pull requests, 3 comments, 3 participants]

GitHub stats
openclaw/openclaw#84252 (fetched 2026-05-20 03:42:08)
Comments: 3 · Participants: 3 · Timeline events: 15 · Reactions: 3
Timeline (top): labeled ×5, commented ×3, cross-referenced ×3, mentioned ×2

After updating a macOS LaunchAgent/source-checkout install to OpenClaw 2026.5.18 (50a2481), openai-codex/gpt-5.5 auth looked configured in status output, but live agent runs failed with:

No API key found for provider "openai-codex". Auth store: ~/.openclaw/agents/<agent>/agent/auth-profiles.json

Running doctor --fix --non-interactive --yes partially migrated legacy sidecar-backed Codex OAuth profiles, but left other agent auth stores unresolved, exited nonzero with an unrelated-looking built-CLI error, and the running gateway continued using stale auth state until it was restarted.

This appears related to the existing Codex OAuth/auth-selection family (#63856, #78407, #79461), but the observed v2026.5.18 behavior is specifically about partial legacy OAuth sidecar migration + status/doctor/gateway-reload behavior.

Error Message

FallbackSummaryError: All models failed (2): openai-codex/gpt-5.5: No API key found for provider "openai-codex". Auth store: ~/.openclaw/agents/<agent>/agent/auth-profiles.json | anthropic/claude-sonnet-4-6: Provider anthropic is in cooldown (billing)

Fix Action

Fixed

PR fix notes

PR #84266: Surface unresolved OAuth sidecar auth failures

Description (problem / solution / changelog)

Closes #84252.

Summary

  • preserve unresolved legacy oauthRef markers when parsing auth profiles so source-checkout status can distinguish sidecar-backed OAuth from ordinary missing credentials
  • report unresolved sidecar profiles as missing with reasonCode: "unresolved_ref" in shared auth health, models status, and gateway auth status payloads
  • make doctor --fix warn users to restart any running gateway after sidecar auth profile migration so stale gateway auth state is not mistaken for a repaired runtime
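The first two bullets amount to a small fail-closed classification step. The following TypeScript is a minimal sketch of that idea, not the code from the PR: the `StoredOAuthProfile` shape, `classifyProfile`, `runtimeAuthOrder`, and the `no_credentials` reason code are hypothetical names chosen for illustration. Only `oauthRef`, the `missing` status, and `reasonCode: "unresolved_ref"` come from the PR description.

```typescript
// Hypothetical, simplified profile shape; the real OpenClaw types may differ.
interface StoredOAuthProfile {
  id: string;
  access?: string; // inline access token (present after migration)
  refresh?: string; // inline refresh token (present after migration)
  oauthRef?: string; // legacy sidecar reference, kept by the parser
}

type ProfileHealth =
  | { status: "ok" }
  | { status: "missing"; reasonCode: "unresolved_ref" | "no_credentials" };

// Fail closed: a profile counts as usable only if inline material exists.
// Treating "access or refresh" as sufficient is a simplification for this sketch.
function classifyProfile(p: StoredOAuthProfile): ProfileHealth {
  if (p.access || p.refresh) return { status: "ok" };
  if (p.oauthRef !== undefined) {
    // A sidecar reference without inline tokens is broken auth, not inventory.
    return { status: "missing", reasonCode: "unresolved_ref" };
  }
  return { status: "missing", reasonCode: "no_credentials" };
}

// Runtime ordering only ever sees profiles that classified as "ok".
function runtimeAuthOrder(profiles: StoredOAuthProfile[]): string[] {
  return profiles
    .filter((p) => classifyProfile(p).status === "ok")
    .map((p) => p.id);
}

const legacyProfile: StoredOAuthProfile = {
  id: "openai-codex:default",
  oauthRef: "sidecar-ref",
};
console.log(classifyProfile(legacyProfile)); // status "missing", reasonCode "unresolved_ref"
console.log(runtimeAuthOrder([legacyProfile]).length); // 0
```

Keeping `oauthRef` on the parsed profile is what lets the classifier tell a sidecar-backed entry apart from one that simply has no credentials.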

Real behavior proof

Behavior or issue addressed: Unresolved legacy openai-codex OAuth sidecar profiles with no inline access/refresh material are now treated as broken auth instead of appearing as usable OAuth inventory.

Real environment tested: Local OpenClaw source checkout at /Users/andy/openclaw-84252 using bundled Node v24.14.0 and the real source auth-profile/auth-health/runtime-order modules.

Exact steps or command run after this patch: PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node --import tsx /private/tmp/proof-84252.ts

Evidence after fix:

openclaw-codex-oauth-sidecar-health-proof=ok
profile_status=missing
profile_reason=unresolved_ref
provider_status=missing
runtime_auth_order_count=0

Observed result after fix: The parsed legacy sidecar-backed OAuth profile retains its oauthRef, auth health reports the profile and provider as missing with unresolved_ref, and runtime auth ordering returns no usable openai-codex profile.

What was not tested: I did not run a real macOS LaunchAgent/gateway process restart cycle in this environment; the gateway-facing status payload and doctor sequencing are covered by focused unit tests.

Validation

NODE_OPTIONS=--max-old-space-size=8192 OPENCLAW_VITEST_MAX_WORKERS=1 PATH=/Users/andy/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH node scripts/run-vitest.mjs src/agents/auth-health.test.ts src/commands/doctor-auth-oauth-sidecar.test.ts src/commands/models/list.status.test.ts src/gateway/server-methods/models-auth-status.test.ts src/commands/doctor/repair-sequencing.test.ts --pool forks --maxWorkers 1 --vmMemoryLimit 8192MB

Test Files  7 passed (7)
Tests  127 passed (127)

Author attribution: if this PR is squash-merged or reworked, please preserve the commit author Andy Ye <[email protected]> or include Co-authored-by: Andy Ye <[email protected]>.

Changed files

  • src/agents/auth-health.test.ts (modified, +30/-0)
  • src/agents/auth-health.ts (modified, +25/-4)
  • src/agents/auth-profiles/credential-state.ts (modified, +3/-0)
  • src/agents/auth-profiles/persisted.ts (modified, +4/-0)
  • src/agents/auth-profiles/types.ts (modified, +2/-0)
  • src/commands/doctor-config-flow.test.ts (modified, +71/-0)
  • src/commands/doctor-config-flow.ts (modified, +16/-0)
  • src/commands/doctor/repair-sequencing.test.ts (modified, +1/-0)
  • src/commands/doctor/repair-sequencing.ts (modified, +4/-1)
  • src/commands/models/list.status-command.ts (modified, +4/-2)
  • src/commands/models/list.status.test.ts (modified, +95/-0)
  • src/gateway/server-methods/models-auth-status.test.ts (modified, +34/-2)
  • src/gateway/server-methods/models-auth-status.ts (modified, +6/-1)

PR #84358: fix(doctor): invalidate gateway auth cache after OAuth sidecar repair [AI-assisted]

Description (problem / solution / changelog)

Fixes the gateway auth cache staleness part of #84252 (Bug B). AI-assisted (Claude Code).

Summary

  • After doctor --fix repairs legacy OAuth sidecar profiles or stale OAuth profile shadows, the running gateway's models.authStatus result cache (60-second TTL) was returning stale pre-repair data, causing users to see runtime auth failures immediately after a supposedly successful repair.
  • The models.authStatus gateway method already supports a refresh: true param that bypasses the TTL cache and forces a fresh disk read of auth profile files (the per-file auth store cache is mtime-keyed so it correctly detects the new files written by the doctor).
  • This PR uses that existing mechanism: when runDoctorRepairSequence makes auth profile file changes (either via maybeRepairLegacyOAuthSidecarProfiles or repairStaleOAuthProfileShadows), it sets authProfilesRepaired: true in its return value, and loadAndMaybeMigrateDoctorConfig then calls the gateway with refresh: true best-effort (3-second timeout, silent on failure so a non-running gateway doesn't block the doctor).
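To make the staleness window concrete, here is a minimal sketch of a TTL cache with a refresh bypass. It is an illustration under stated assumptions rather than the gateway's actual cache: `TtlCache`, its constructor arguments, and the injected clock are hypothetical, and only the 60-second TTL and the `refresh: true` parameter come from the PR text.

```typescript
// Hypothetical single-value TTL cache with an explicit refresh bypass.
class TtlCache<T> {
  private entry: { value: T; storedAt: number } | undefined;
  private readonly load: () => T;
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(load: () => T, ttlMs: number, now: () => number = () => Date.now()) {
    this.load = load;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(opts: { refresh?: boolean } = {}): T {
    const cached = this.entry;
    const fresh = cached !== undefined && this.now() - cached.storedAt < this.ttlMs;
    // Serve the cached value only while it is fresh and no refresh was requested.
    if (cached !== undefined && fresh && !opts.refresh) return cached.value;
    const value = this.load();
    this.entry = { value, storedAt: this.now() };
    return value;
  }
}

let onDisk = "missing"; // stands in for the auth profile files
let clock = 0; // fake clock in milliseconds
const authStatus = new TtlCache(() => onDisk, 60_000, () => clock);

authStatus.get(); // first read caches "missing"
onDisk = "ok"; // doctor --fix rewrites the files
clock = 1_000; // one second later
console.log(authStatus.get()); // "missing" (still within the TTL)
console.log(authStatus.get({ refresh: true })); // "ok"
```

Without the bypass, a caller would keep seeing the pre-repair answer until the TTL expires, which matches the "up to 60 seconds" window described in this PR.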

What was wrong with the earlier approach

My earlier comment on the issue proposed calling clearLoadedAuthStoreCache() and invalidateModelAuthStatusCache() from repair-sequencing.ts. That was wrong: both functions are process-local. The correction is in a follow-up comment.

Changes

  • src/commands/doctor/repair-sequencing.ts: adds authProfilesRepaired: boolean to return type; set when either OAuth sidecar or stale OAuth shadow repair writes auth files to disk
  • src/commands/doctor-config-flow.ts: best-effort callGateway({ method: "models.authStatus", params: { refresh: true }, timeoutMs: 3000 }) when authProfilesRepaired
  • src/commands/doctor/repair-sequencing.test.ts: three new tests for the flag
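The best-effort call in the second bullet could look roughly like the sketch below. The request object mirrors the one quoted above; everything else is an assumption for illustration: `GatewayCall` is a stand-in type for the real `callGateway` signature, and `refreshGatewayAuthStatus` is a hypothetical wrapper name.

```typescript
// Stand-in for the real callGateway signature, which is not shown in the PR notes.
type GatewayCall = (req: {
  method: string;
  params: Record<string, unknown>;
  timeoutMs: number;
}) => Promise<unknown>;

// Returns true when the gateway acknowledged the refresh, false otherwise.
// Failures are swallowed on purpose so a stopped gateway never blocks doctor.
async function refreshGatewayAuthStatus(
  authProfilesRepaired: boolean,
  callGateway: GatewayCall,
): Promise<boolean> {
  if (!authProfilesRepaired) return false; // nothing changed on disk, nothing to refresh
  try {
    await callGateway({
      method: "models.authStatus",
      params: { refresh: true },
      timeoutMs: 3000,
    });
    return true;
  } catch {
    return false; // gateway not running or timed out
  }
}

// Example: a gateway that refuses connections does not make the caller throw.
const gatewayDown: GatewayCall = async () => {
  throw new Error("ECONNREFUSED");
};
refreshGatewayAuthStatus(true, gatewayDown).then((ok) => console.log(ok)); // false
```

Passing the gateway call in as a parameter keeps the sketch testable without a running gateway.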

Real behavior proof

Behavior addressed: doctor --fix leaves gateway auth status cache stale after OAuth sidecar repair, causing runtime auth failures for up to 60 seconds after repair succeeds.

Real environment tested: Linux desktop (HP Z640), source-checkout at commit 3501a3f (v2026.5.19), Node 24.14.1, pnpm openclaw doctor --fix run live against this install.

Exact steps or command run after this patch: pnpm openclaw doctor --fix on the branch with managed gateway stopped (version mismatch — v2026.5.18-beta.1 service vs v2026.5.19 source).

Evidence after fix: Live terminal output from pnpm openclaw doctor --fix on this branch:

$ node scripts/run-node.mjs doctor --fix

◇  Doctor changes ─────────────────────────────────────────────────────────────╮
│  Repaired Codex model routes:                                                 │
│  - agents.defaults.models.openai-codex/gpt-5.4-mini:                         │
│    openai-codex/gpt-5.4-mini -> openai/gpt-5.4-mini                          │
├───────────────────────────────────────────────────────────────────────────────╯

◇  Gateway ────────────────────╮
│  Gateway not running.        │
├──────────────────────────────╯

Updated config: ~/.openclaw/openclaw.json
└  Doctor complete.

Doctor ran and completed normally. Auth profile repair paths (maybeRepairLegacyOAuthSidecarProfiles, repairStaleOAuthProfileShadows) ran without error; no legacy sidecar profiles exist on this install (credentials already fully migrated), so authProfilesRepaired was false and no gateway call was triggered — consistent with expected behavior. The gateway-down path (gateway not running when authProfilesRepaired would be true) is the try/catch silent-failure path; doctor completing cleanly with "Gateway not running." confirms it.

Supplemental: pnpm build exit 0; pnpm check exit 0 ("Import cycle check: 0 runtime value cycle(s)"); repair-sequencing.test.ts 12/12 passed (3 new authProfilesRepaired tests); doctor-config-flow.test.ts 34/34 passed.

Observed result after fix: Doctor completed with "Doctor complete." on a live install. Config repairs were applied, with no auth errors, no crash, and no regression. The gateway was not running during this run; had authProfilesRepaired been true, callGateway would have been invoked and its connection failure caught silently, so doctor would still exit cleanly.

What was not tested: Live gateway refresh with an actual stale sidecar profile — this machine's OAuth credentials are already fully migrated from the legacy sidecar format, so the repair functions produced no auth file changes during this run. macOS LaunchAgent managed install not tested. A maintainer with a legacy sidecar profile or a multi-agent install with stale shadow credentials could exercise the full path.

Changed files

  • src/commands/doctor-config-flow.ts (modified, +12/-0)
  • src/commands/doctor/repair-sequencing.test.ts (modified, +34/-0)
  • src/commands/doctor/repair-sequencing.ts (modified, +5/-1)

Original issue text

Summary

After updating a macOS LaunchAgent/source-checkout install to OpenClaw 2026.5.18 (50a2481), openai-codex/gpt-5.5 auth looked configured in status output, but live agent runs failed with:

No API key found for provider "openai-codex". Auth store: ~/.openclaw/agents/<agent>/agent/auth-profiles.json

Running doctor --fix --non-interactive --yes partially migrated legacy sidecar-backed Codex OAuth profiles, but left other agent auth stores unresolved, exited nonzero with an unrelated-looking built-CLI error, and the running gateway continued using stale auth state until it was restarted.

This appears related to the existing Codex OAuth/auth-selection family (#63856, #78407, #79461), but the observed v2026.5.18 behavior is specifically about partial legacy OAuth sidecar migration + status/doctor/gateway-reload behavior.

Environment

  • OpenClaw: 2026.5.18 (50a2481)
  • OS: macOS 26.5, Apple Silicon
  • Node: Homebrew Node v24.15.0
  • Gateway: user LaunchAgent running from a local source checkout
  • Model route: openai-codex/gpt-5.5
  • Auth mode: OpenAI Codex OAuth / ChatGPT account, no direct OpenAI API-key route intended for these agent turns
  • Per-agent auth stores under ~/.openclaw/agents/<agent>/agent/auth-profiles.json

What Happened

  1. After the update, Discord/channel status showed the gateway/channels as OK, and model status reported openai-codex OAuth profiles present.
  2. Live agent turns still failed before reply with No API key found for provider "openai-codex".
  3. Inspecting auth profiles showed a mixed state:
    • Some agents had migrated inline OAuth credentials (access + refresh, no oauthRef).
    • Other agents still had legacy sidecar refs (oauthRef, no inline access/refresh).
  4. doctor --fix --non-interactive --yes partially repaired the state:
    • Migrated sidecar-backed Codex OAuth profiles for several agents.
    • Reported that some legacy OAuth sidecar profiles could not be decrypted and needed re-auth.
    • Cleared stale Codex session routing state and repaired session routes.
    • Then exited nonzero with:
      Cannot find built CLI at ~/.openclaw/dist/index.js ... Run "pnpm build" first, or use dev mode.
  5. Immediately after the repair, an actual agent probe still failed until the gateway LaunchAgent was restarted. After restart, the migrated agents worked.
  6. Agents left in the legacy sidecar-ref shape still appeared to have openai-codex OAuth profiles by count/label, but actual model execution failed until the profiles were replaced with inline migrated OAuth entries and the gateway was restarted again.

Expected Behavior

  • models status / channel status should distinguish “profile entry exists” from “runtime can actually resolve usable Codex OAuth credentials”.
  • doctor --fix should either fully migrate legacy OAuth sidecar profiles or leave an actionable warning that is reflected in models status / doctor --lint.
  • If doctor --fix changes auth profile files, it should either tell the operator to restart/reload the gateway or invalidate the running gateway auth cache.
  • A source-checkout/LaunchAgent install should not finish auth repairs and then fail with an unrelated ~/.openclaw/dist/index.js built-CLI error, or the error should be clearly scoped as non-fatal to the auth repair.

Actual Behavior

  • Status could look configured while live runtime failed with No API key found.
  • doctor --fix partially repaired auth, then exited nonzero.
  • The running gateway kept stale auth state until a restart.
  • Manual repair was needed for the remaining sidecar-backed agent stores.

Sanitized Evidence

Before manual repair, profile shapes looked like this:

# broken agent stores
openai-codex:default              oauth  has_access=false  has_refresh=false  has_oauthRef=true
openai-codex:<chatgpt-account>    oauth  has_access=false  has_refresh=false  has_oauthRef=true

# migrated/working agent stores
openai-codex:default              oauth  has_access=true   has_refresh=true   has_oauthRef=false
openai-codex:<chatgpt-account>    oauth  has_access=true   has_refresh=true   has_oauthRef=false
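A shape summary like the one above can be produced without printing any token material. The sketch below is one possible way to do it, assuming each profile entry exposes optional `access`, `refresh`, and `oauthRef` fields. The `OAuthProfileEntry` type and `describeShape` function are hypothetical, and reading the entries out of auth-profiles.json is left out because the file layout is not shown in this issue.

```typescript
// Assumed field names; the on-disk layout of auth-profiles.json is not shown here.
interface OAuthProfileEntry {
  access?: string;
  refresh?: string;
  oauthRef?: unknown;
}

// Emits only booleans, so the output is safe to paste into a bug report.
function describeShape(id: string, p: OAuthProfileEntry): string {
  return [
    id,
    "oauth",
    `has_access=${Boolean(p.access)}`,
    `has_refresh=${Boolean(p.refresh)}`,
    `has_oauthRef=${p.oauthRef !== undefined}`,
  ].join("  ");
}

console.log(describeShape("openai-codex:default", { oauthRef: "legacy-ref" }));
// openai-codex:default  oauth  has_access=false  has_refresh=false  has_oauthRef=true
```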

Representative runtime failure:

FallbackSummaryError: All models failed (2):
openai-codex/gpt-5.5: No API key found for provider "openai-codex".
Auth store: ~/.openclaw/agents/<agent>/agent/auth-profiles.json
| anthropic/claude-sonnet-4-6: Provider anthropic is in cooldown (billing)

After replacing the stale sidecar-backed profiles with migrated inline OAuth entries and restarting the gateway, live probes succeeded:

provider: openai-codex
model: gpt-5.5
fallbackUsed: false
reply: OK

Request

Please make the auth repair/status path fail closed and operator-visible for legacy openai-codex OAuth sidecar profiles:

  • report unresolved sidecar profiles as auth-broken in models status, not merely present/configured;
  • make doctor --fix output and exit status distinguish partial auth repair from unrelated build/dev-mode failures;
  • document or automate the required gateway reload after auth profile migration;
  • ideally migrate all decryptable per-agent Codex OAuth sidecars consistently, or provide a safe command to copy/rebind a working OAuth profile across selected agents without exposing credentials.
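One way to read the second and third requests is as a reporting step that keeps auth outcomes and unrelated failures on separate lines and separate exit codes. The sketch below is purely illustrative: `RepairOutcome`, `summarizeDoctorRun`, the message wording, and the exit code values are all hypothetical and do not describe existing `doctor` behavior.

```typescript
// Hypothetical per-agent result of an auth repair pass.
interface RepairOutcome {
  agent: string;
  result: "migrated" | "needs_reauth" | "unchanged";
}

interface DoctorSummary {
  exitCode: number;
  lines: string[];
}

// Invented exit codes for this sketch: 0 = clean, 1 = non-auth failure only,
// 2 = at least one agent store still has unresolved auth.
function summarizeDoctorRun(
  outcomes: RepairOutcome[],
  unrelatedErrors: string[],
): DoctorSummary {
  const lines: string[] = [];
  const migrated = outcomes.filter((o) => o.result === "migrated");
  const pending = outcomes.filter((o) => o.result === "needs_reauth");

  if (migrated.length > 0) {
    lines.push(`auth: migrated ${migrated.length} agent store(s); restart the gateway to load them`);
  }
  for (const o of pending) {
    lines.push(`auth: ${o.agent} still needs re-auth (unresolved sidecar profile)`);
  }
  for (const e of unrelatedErrors) {
    lines.push(`non-auth: ${e}`);
  }

  const exitCode = pending.length > 0 ? 2 : unrelatedErrors.length > 0 ? 1 : 0;
  return { exitCode, lines };
}

const summary = summarizeDoctorRun(
  [
    { agent: "agent-a", result: "migrated" },
    { agent: "agent-b", result: "needs_reauth" },
  ],
  ["Cannot find built CLI"],
);
console.log(summary.exitCode); // 2
console.log(summary.lines.join("\n"));
```

With this split, an operator can tell from the exit code alone whether auth is still broken or whether only a build/dev-mode check failed.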

No personal account identifiers, Discord IDs, tokens, hostnames, or IP addresses are included here; all paths and account IDs above are redacted/generalized.
