openclaw - ✅(Solved) Fix [Bug]: doctor --fix silently migrates intentional openai-codex/ config to openai/, breaking PI+OAuth runtime and causing 3-4x token inflation [4 pull requests, 9 comments, 5 participants]

GitHub stats
openclaw/openclaw#84038 · Fetched 2026-05-20 03:44:54
Comments: 9 · Participants: 5 · Timeline: 53 · Reactions: 4
Timeline (top): referenced ×14 · labeled ×10 · commented ×9 · cross-referenced ×8

The native Codex runtime produces 3–4× higher token usage compared to the OpenClaw PI runtime for the same GPT-5.x requests. This is a known upstream issue that OpenClaw cannot fix — but doctor --fix silently forces users onto the broken runtime by migrating intentional openai-codex/ configs to openai/, removing the agentRuntime: { id: "pi" } override in the process. Users have no way to opt out of this migration short of manually reverting after every doctor run.

PR fix notes

PR #84142: fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted]

Description (problem / solution / changelog)

Summary

  • Problem: openclaw doctor --fix silently overwrote an explicit agentRuntime: { id: "pi" } opt-out with { id: "codex" } while migrating legacy openai-codex/* model refs to canonical openai/* form, flipping the user from the PI runtime back onto the native Codex runtime (3-4× token inflation per request, per #84038).
  • Solution: Skip the default-codex pin synthesis on the canonical model entry when a legacy openai-codex/* form of that model ref already has an explicit non-default agentRuntime.id (anything other than auto / default / missing). rewriteModelsMap then carries the legacy pin forward through its existing legacy-wins fallback branch.
  • What changed: src/commands/doctor/shared/codex-route-warnings.ts (+24, -0). New helper legacyEntryHasExplicitNonDefaultRuntimePin(models, canonicalModelRef) plus one guard in ensureCodexRuntimePolicy before setModelRuntimePolicy.
  • What did NOT change (scope boundary): Session-store stale-pin clearing (repairCodexSessionStoreRoutes, clearStaleSessionRuntimePins) is untouched. Top-level agents.defaults.agentRuntime clearing (clearLegacyAgentRuntimePolicy) is untouched. The migration of model refs themselves (openai-codex/X → openai/X) is unchanged. auto / default runtime pins remain overwritable.
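The PR body names the new helper but does not show its implementation. The following is a minimal sketch of what such a check might look like, assuming the legacy form of a canonical ref is obtained by swapping the openai/ prefix for openai-codex/. The helper name and its two parameters come from the summary above; the types, the toLegacyCodexRef function, and the normalisation of the id are assumptions made for illustration.

```typescript
// Sketch only: the helper name comes from the PR summary, the body is an assumption.
type AgentRuntime = { id?: string };
type ModelEntry = { agentRuntime?: AgentRuntime; [key: string]: unknown };
type ModelsMap = Record<string, ModelEntry | undefined>;

// Runtime ids that the PR summary treats as "no concrete preference".
const NON_CONCRETE_RUNTIME_IDS = new Set(["auto", "default"]);

// Hypothetical mapping from a canonical ref back to its legacy form.
function toLegacyCodexRef(canonicalModelRef: string): string | undefined {
  const prefix = "openai/";
  return canonicalModelRef.startsWith(prefix)
    ? `openai-codex/${canonicalModelRef.slice(prefix.length)}`
    : undefined;
}

function legacyEntryHasExplicitNonDefaultRuntimePin(
  models: ModelsMap | undefined,
  canonicalModelRef: string,
): boolean {
  const legacyRef = toLegacyCodexRef(canonicalModelRef);
  if (!models || !legacyRef) return false;
  const id = models[legacyRef]?.agentRuntime?.id?.trim().toLowerCase();
  // A missing or empty id is not a pin; auto and default stay overwritable.
  if (!id) return false;
  return !NON_CONCRETE_RUNTIME_IDS.has(id);
}
```

In this sketch an explicit { id: "codex" } on the legacy entry also counts as a pin, which matches the "anything other than auto / default / missing" wording in the summary.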

Motivation

The native Codex runtime currently produces 3-4× the token usage of the PI runtime for the same GPT-5.x request (the upstream Codex issue OpenClaw cannot fix at the provider layer). Users who explicitly pin agentRuntime: { id: "pi" } to opt out of this need that opt-out to survive doctor --fix. Every doctor run today silently puts them back on the broken runtime.

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor required for the fix
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

  • Closes #84038
  • Related #83315 (same "preserve user fields during legacy migration" pattern in tools.web.search; cited as design precedent below)
  • This PR fixes a bug or regression

Real behavior proof

  • Behavior or issue addressed: doctor --fix overwrites agents.defaults.models["openai-codex/gpt-5.4"].agentRuntime = { id: "pi" } with { id: "codex" } while migrating the legacy openai-codex/* model ref to openai/*, silently re-enrolling the user into the native Codex runtime they explicitly opted out of.

  • Real environment tested: Local fresh worktree off upstream/main f07c87405c on Linux 6.17.0 (Node 22.22.1, pnpm 11.1.0). Two sibling worktrees: baseline (upstream/main, no fix) at /tmp/openclaw-84038-baseline and fix branch at /tmp/openclaw-84038. Production maybeRepairCodexRoutes (the same function src/commands/doctor/repair-sequencing.ts:82 calls during openclaw doctor --fix) invoked directly against the exact 4-key user config from #84038.

  • Exact steps or command run after this patch:

    # In each worktree (baseline and fix branch):
    node --import tsx ./_proof-trace.mjs
    #   _proof-trace.mjs imports { maybeRepairCodexRoutes }
    #   from src/commands/doctor/shared/codex-route-warnings.ts and calls it
    #   with shouldRepair: true on the issue body's exact config.
  • Evidence after fix:

Terminal capture from this branch, copied live output:

=== INPUT CONFIG (user's #84038 setup) ===
  {
    "agents": {
      "defaults": {
        "model": { "primary": "openai-codex/gpt-5.4" },
        "models": {
          "openai-codex/gpt-5.4": { "agentRuntime": { "id": "pi" } }
        }
      }
    },
    "auth": { "order": { "openai-codex": ["openai-codex:user@example.com"] } },
    "plugins": { "entries": { "codex": { "enabled": false } } }
  }

  === DOCTOR CHANGES ===
  Repaired Codex model routes:
  - agents.defaults.model.primary: openai-codex/gpt-5.4 -> openai/gpt-5.4.
  - agents.defaults.models.openai-codex/gpt-5.4: openai-codex/gpt-5.4 -> openai/gpt-5.4.

  === POST-MIGRATION CONFIG (agents block) ===
  {
    "defaults": {
      "model": { "primary": "openai/gpt-5.4" },
      "models": {
        "openai/gpt-5.4": { "agentRuntime": { "id": "pi" } }
      }
    }
  }

agentRuntime preserved as { id: pi }?: YES
  • Observed result after fix: the legacy openai-codex/gpt-5.4 model ref is migrated to canonical openai/gpt-5.4, the corresponding map entry follows the rename, and the user's explicit agentRuntime: { id: "pi" } opt-out is preserved on the migrated entry. The previous "Set agents.defaults.models.openai/gpt-5.4.agentRuntime.id to 'codex'" line is correctly absent because no synthesized default needs setting, and plugins.entries.codex.enabled stays false as the user requested (no spurious auto-enable, because no route now claims the codex runtime).

  • What was not tested: live gateway runtime / actual provider request emission (the runtime resolution side is exercised by existing resolveAgentHarnessPolicy tests in codex-route-warnings.test.ts at L3000-3006 and L3036-3042 covering both pi and codex shapes), and the macOS install path the original reporter is on (the migration code path is platform-agnostic, so the issue surface should reproduce / fix on macOS the same way).

  • Before evidence (baseline, same script same input on upstream/main f07c87405c):

    === DOCTOR CHANGES (BEFORE FIX) ===
    Repaired Codex model routes:
    - agents.defaults.model.primary: openai-codex/gpt-5.4 -> openai/gpt-5.4.
    - agents.defaults.models.openai-codex/gpt-5.4: openai-codex/gpt-5.4 -> openai/gpt-5.4.
    Set agents.defaults.models.openai/gpt-5.4.agentRuntime.id to "codex" so repaired OpenAI refs keep Codex auth routing.
    Enabled plugins.entries.codex because configured agent routes use Codex runtime.
    
    === POST-MIGRATION CONFIG (agents block) ===
    {
      "defaults": {
        "model": { "primary": "openai/gpt-5.4" },
        "models": {
          "openai/gpt-5.4": { "agentRuntime": { "id": "codex" } }
        }
      }
    }
    
    agentRuntime preserved as { id: pi }?: NO (got {"id":"codex"})

Root Cause

  • Root cause: in rewriteAgentModelRefs, the AGENT_MODEL_CONFIG_KEYS loop processes model first and calls preserveCodexRuntimePolicyForNewHits → ensureCodexRuntimePolicy(modelRef="openai/X") while the canonical entry models["openai/X"] does not yet exist. ensureCodexRuntimePolicy synthesizes { agentRuntime: { id: "codex" } } on a fresh empty entry. rewriteModelsMap then renames openai-codex/X → openai/X, and its { ...legacyRecord, ...canonicalRecord } spread places the just-synthesized codex pin last, silently dropping the user's { id: "pi" }.
  • Missing detection / guardrail: ensureCodexRuntimePolicy only looked at the canonical entry's existing agentRuntime when deciding whether to synthesize. It did not consider that a legacy form of the same model ref in the same models map might carry an explicit non-default pin the user expected to keep through the migration.
  • Contributing context (if known): two existing tests in codex-route-warnings.test.ts already encode the maintainer intent to preserve such pins — "preserves explicit model-scoped runtime pins when repairing legacy model map keys" (L2974) for the models-map-only case, and "overwrites non-concrete model-scoped runtime pins when preserving Codex route intent" (L3009) for the auto/default case. The combined-config path (legacy ref present in BOTH model.primary and models[...]) was the previously uncovered seam.
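The spread-order effect described in the root cause can be reproduced in isolation. This is a small illustration of object spread semantics, not OpenClaw code; the alias field is a hypothetical extra property added to show that non-colliding legacy fields do survive.

```typescript
type ModelEntry = { agentRuntime?: { id: string }; alias?: string };

// The user's legacy entry (alias is a hypothetical extra field).
const legacyRecord: ModelEntry = { agentRuntime: { id: "pi" }, alias: "work" };
// The entry synthesized on the canonical key before the rename.
const canonicalRecord: ModelEntry = { agentRuntime: { id: "codex" } };

// Later spreads win on key collisions, so the canonical pin replaces the user's pin.
const merged: ModelEntry = { ...legacyRecord, ...canonicalRecord };

console.log(merged.agentRuntime?.id); // prints "codex"
console.log(merged.alias); // prints "work"
```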

Regression Test Plan

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
    • End-to-end test
    • Existing coverage already sufficient
  • Target test or file: src/commands/doctor/shared/codex-route-warnings.test.ts
  • Scenario the test should lock in: user has agents.defaults.model.primary = "openai-codex/X" AND agents.defaults.models["openai-codex/X"].agentRuntime = { id: "pi" }. After maybeRepairCodexRoutes({ shouldRepair: true }), the migrated entry models["openai/X"] must have agentRuntime = { id: "pi" }, and models["openai-codex/X"] must be gone.
  • Why this is the smallest reliable guardrail: the bug is purely in the rewrite sequence inside rewriteAgentModelRefs; a unit-level config-in/config-out assertion against maybeRepairCodexRoutes exercises the full repair pipeline (model-slot rewrite, ensureCodexRuntimePolicy decision, rewriteModelsMap rename + merge) without needing a real gateway or provider auth wiring.
  • Existing test that already covers this (if any): No. L2974 is similar but covers the model.primary-absent variant which already passes today; the combined-config variant from #84038 was uncovered.
  • If no new test is added, why not: N/A — added as a new it(...) block (RED on the prior commit cd0f8f800e, GREEN on the fix commit 9d932635ae).

User-visible / Behavior Changes

  • openclaw doctor --fix no longer rewrites agentRuntime.id to "codex" on a canonical openai/X entry when a legacy openai-codex/X form of the same model ref carries an explicit non-default pin (e.g. { id: "pi" }). The legacy pin survives the rename.
  • As a direct downstream consequence, plugins.entries.codex is no longer auto-enabled in that scenario (the migrated route now claims the pi runtime, not codex, so enableCodexPluginForRequiredRoutes does not need to touch the plugin entry). This matches the user's stated intent in #84038 ("explicitly disabled the codex plugin"). Users who left agentRuntime unset, or set it to auto / default / codex, see no behavior change.

Diagram

Before fix:
[user config]
  model.primary = "openai-codex/X"
  models["openai-codex/X"].agentRuntime = { id: "pi" }
        |
        v
[AGENT_MODEL_CONFIG_KEYS loop processes "model"]
  -> rewrite model.primary to "openai/X" (hit added)
  -> preserveCodexRuntimePolicyForNewHits -> ensureCodexRuntimePolicy("openai/X")
     -> models["openai/X"] does not exist yet
     -> SYNTHESIZE models["openai/X"] = { agentRuntime: { id: "codex" } }
        |
        v
[rewriteModelsMap renames legacy key]
  -> legacyRecord  = { agentRuntime: { id: "pi" } }     # user's pin
  -> canonicalRecord = { agentRuntime: { id: "codex" } } # synthesized above
  -> merge spread { ...legacy, ...canonical }
     -> canonical wins -> { agentRuntime: { id: "codex" } }   # USER'S PIN LOST

After fix:
[same user config]
        |
        v
[AGENT_MODEL_CONFIG_KEYS loop processes "model"]
  -> rewrite model.primary to "openai/X" (hit added)
  -> preserveCodexRuntimePolicyForNewHits -> ensureCodexRuntimePolicy("openai/X")
     -> models["openai/X"] does not exist yet
     -> legacyEntryHasExplicitNonDefaultRuntimePin(models, "openai/X") = true
        (legacy form "openai-codex/X" has agentRuntime.id = "pi")
     -> EARLY RETURN, no synthesis
        |
        v
[rewriteModelsMap renames legacy key]
  -> legacyRecord  = { agentRuntime: { id: "pi" } }
  -> canonicalRecord = undefined   (nothing synthesized)
  -> falsy branch -> canonicalEntry ?? legacyEntry -> legacyEntry
     -> models["openai/X"] = { agentRuntime: { id: "pi" } }   # PIN PRESERVED
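The two flows in the diagram can be sketched as one small function with the guard switched on or off. This is a simplified stand-in for the repair sequence, assuming only a model.primary slot and a models map; the real rewriteAgentModelRefs, ensureCodexRuntimePolicy, and rewriteModelsMap handle more cases, and every name below other than the config keys is hypothetical.

```typescript
// Simplified stand-in for the doctor repair flow in the diagram; not the real OpenClaw code.
type Entry = { agentRuntime?: { id: string } };
type Models = Record<string, Entry | undefined>;
type Defaults = { model: { primary: string }; models: Models };

const LEGACY_PREFIX = "openai-codex/";
const CANONICAL_PREFIX = "openai/";

const toCanonical = (ref: string): string =>
  ref.startsWith(LEGACY_PREFIX) ? CANONICAL_PREFIX + ref.slice(LEGACY_PREFIX.length) : ref;

// auto / default / missing do not count as a concrete pin.
function hasConcretePin(entry: Entry | undefined): boolean {
  const id = entry?.agentRuntime?.id;
  return id !== undefined && id !== "auto" && id !== "default";
}

function migrate(input: Defaults, withGuard: boolean): Defaults {
  const models: Models = { ...input.models };
  const legacyRef = input.model.primary;
  const canonicalRef = toCanonical(legacyRef);

  // Step 1: the model slot is handled first. A codex pin is synthesized unless the
  // canonical entry is already pinned or (with the guard) the legacy entry is pinned.
  const skipSynthesis =
    hasConcretePin(models[canonicalRef]) || (withGuard && hasConcretePin(models[legacyRef]));
  if (!skipSynthesis) {
    models[canonicalRef] = { ...models[canonicalRef], agentRuntime: { id: "codex" } };
  }

  // Step 2: the legacy map key is renamed; canonical fields win on collision.
  if (legacyRef !== canonicalRef && legacyRef in models) {
    const legacyRecord = models[legacyRef];
    const canonicalRecord = models[canonicalRef];
    models[canonicalRef] = canonicalRecord ? { ...legacyRecord, ...canonicalRecord } : legacyRecord;
    delete models[legacyRef];
  }
  return { model: { primary: canonicalRef }, models };
}

const userConfig: Defaults = {
  model: { primary: "openai-codex/gpt-5.4" },
  models: { "openai-codex/gpt-5.4": { agentRuntime: { id: "pi" } } },
};

console.log(migrate(userConfig, false).models["openai/gpt-5.4"]?.agentRuntime?.id); // prints "codex"
console.log(migrate(userConfig, true).models["openai/gpt-5.4"]?.agentRuntime?.id); // prints "pi"
```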

Security Impact

  • New permissions/capabilities? No
  • Secrets/tokens handling changed? No
  • New/changed network calls? No
  • Command/tool execution surface changed? No
  • Data access scope changed? No

The patch is a local config-rewrite decision change. No auth, secret, capability, network, or filesystem surface changes.

Repro + Verification

Environment

  • OS: Linux 6.17.0-14-generic
  • Runtime/container: Node 22.22.1, pnpm 11.1.0
  • Model/provider: N/A (config-mutation path, no provider request emitted during the migration)
  • Integration/channel (if any): N/A
  • Relevant config (redacted): the top-level keys from the issue body (agents, auth, plugins), verbatim as posted in #84038.

Steps

  1. Check out upstream/main f07c87405c in worktree A and fix/84038-preserve-pi-runtime in worktree B.
  2. In each worktree, run pnpm install --prefer-offline.
  3. Drop the same _proof-trace.mjs into each worktree. The script imports maybeRepairCodexRoutes from src/commands/doctor/shared/codex-route-warnings.ts and calls it with shouldRepair: true on the issue's exact config.
  4. In each worktree, run node --import tsx ./_proof-trace.mjs and compare stdout.

Expected

  • After fix: models["openai/gpt-5.4"].agentRuntime = { id: "pi" }. No Set agents.defaults.models.openai/gpt-5.4.agentRuntime.id to "codex" line in changes. No Enabled plugins.entries.codex line.

Actual

  • After fix: matches expected (see Evidence above).
  • Before fix (baseline): models["openai/gpt-5.4"].agentRuntime = { id: "codex" }, with both the Set ... and Enabled ... lines present in changes.

Evidence

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Live-output evidence captured in the Real behavior proof section above. Repro test landed in src/commands/doctor/shared/codex-route-warnings.test.ts (it("preserves an explicit non-default agentRuntime pin on the legacy model entry during migration (#84038)", ...)). Lint, typecheck, and the full codex-route-warnings.test.ts + repair-sequencing.test.ts sweep all pass; these are supplemental to the live-output evidence.

Human Verification

  • Verified scenarios:
    • User's exact 4-key config from #84038 → migration runs, pin preserved (live node --import tsx output captured both directions).
    • Same config on upstream/main baseline → migration runs, pin overwritten (matches the user's reported regression).
    • Existing scenarios kept locked in: 92/92 in codex-route-warnings.test.ts, 10/10 in repair-sequencing.test.ts. Specifically L2974 ("preserves explicit model-scoped runtime pins when repairing legacy model map keys" — pi pin preserved when only models map is set, no model.primary) and L3009 ("overwrites non-concrete model-scoped runtime pins when preserving Codex route intent" — auto pin still gets overwritten by codex).
  • Edge cases checked:
    • agentRuntime: { id: "auto" } legacy pin → not preserved (still overwritten by codex synthesis; matches existing maintainer intent).
    • agentRuntime: { id: "default" } legacy pin → not preserved (same as auto).
    • No legacy agentRuntime at all → unchanged (codex synthesis still happens).
    • User has both legacy and canonical entries with explicit pins → canonical entry's pin already has explicit value, ensureCodexRuntimePolicy's existing pre-check returns early, no synthesis happens, rewriteModelsMap keeps canonical via its current spread order.
  • What you did not verify: live gateway / provider request emission (the runtime resolution is exercised by existing resolveAgentHarnessPolicy tests in the same file); the macOS install path the original reporter is on (the migration logic is platform-agnostic).
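The verified scenarios and edge cases above amount to a small decision table. The sketch below encodes that table as a pure function so the expected outcomes can be read in one place; expectedRuntimeAfterMigration is a hypothetical name, and the function describes the outcomes listed in this PR rather than the code that produces them.

```typescript
type RuntimeId = string | undefined;

// auto / default / missing are treated as "no concrete pin".
const isConcrete = (id: RuntimeId): id is string =>
  id !== undefined && id !== "auto" && id !== "default";

// Expected agentRuntime.id on the migrated openai/X entry, per the scenarios listed above.
function expectedRuntimeAfterMigration(legacyId: RuntimeId, canonicalId: RuntimeId): string {
  if (isConcrete(canonicalId)) return canonicalId; // canonical pin already explicit: kept
  if (isConcrete(legacyId)) return legacyId; // explicit legacy pin survives the rename
  return "codex"; // auto / default / missing: codex is still synthesized
}

// [legacy pin, canonical pin, expected result]
const cases: Array<[RuntimeId, RuntimeId, string]> = [
  ["pi", undefined, "pi"],
  ["auto", undefined, "codex"],
  ["default", undefined, "codex"],
  [undefined, undefined, "codex"],
  ["pi", "codex", "codex"],
];

for (const [legacyId, canonicalId, expected] of cases) {
  const actual = expectedRuntimeAfterMigration(legacyId, canonicalId);
  console.log(`${legacyId ?? "(none)"} + ${canonicalId ?? "(none)"} -> ${actual}`, actual === expected);
}
```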

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? Yes
  • Config/env changes? No
  • Migration needed? No
  • If yes, exact upgrade steps: N/A. Users with no agentRuntime pin (the vast majority) see identical doctor behavior. Users on a non-default explicit pin keep it after upgrade — no manual reapply needed anymore.

Risks and Mitigations

  • Risk: a user who genuinely WANTED their agentRuntime: { id: "pi" } (or similar non-codex pin) on a legacy openai-codex/X ref to be REPLACED by { id: "codex" } during migration no longer gets that automatic replacement.
    • Mitigation: this is the bug, not a new risk. The previous behavior was silently incorrect — an explicit pin meant the user did not want default codex. Existing test "overwrites non-concrete model-scoped runtime pins when preserving Codex route intent" (L3009) keeps auto / default overwritable for users who want default-policy behavior. A user who actually wants codex can set { id: "codex" } explicitly, which the fix also leaves alone (existing ensureCodexRuntimePolicy early-exit on concrete non-default pin handles this case).
  • Risk: behavior change to plugins.entries.codex auto-enable. When the user's preserved pin is non-codex, the migrated route no longer claims the codex runtime, so the codex plugin is no longer auto-enabled.
    • Mitigation: this matches stated user intent in #84038 ("explicitly disabled the codex plugin"). Auto-enable still fires for users without an explicit non-default pin.

AI-assisted disclosure

  • AI-assisted — diagnosis and fix authored with Claude Opus 4.7 in a NexCore agent session. Real-behavior proof above was captured by directly invoking the production maybeRepairCodexRoutes function from a Node script on the human-operator's machine using the exact user config from the issue body. Co-author trailer: Co-Authored-By: Nex <[email protected]>.
  • I understand what the code does, the trigger sequence inside rewriteAgentModelRefs, why ensureCodexRuntimePolicy is the right place for the early-return, and the trade-off vs reordering rewriteModelsMap (a second considered approach, rejected because it shifts an existing snapshot test's expected changes[] ordering for no semantic gain).
  • Session log available on request; not attached to keep PR body focused.
  • codex review --base origin/main was not run because Codex CLI is not installed on the human-operator's local machine. Equivalent diff-review steps performed: typecheck (pnpm check:changed --base upstream/main, lane check-prod-types + check-test-types), lint (oxlint 0 warnings 0 errors on 8650 files / 217 rules), oxfmt --check clean, repro test exercises the exact behavior.
  • Will resolve bot review conversations after addressing them; will not leave bot threads dangling for maintainers.

Changed files

  • src/commands/doctor/shared/codex-route-warnings.test.ts (modified, +33/-0)
  • src/commands/doctor/shared/codex-route-warnings.ts (modified, +24/-0)

PR #84150: fix(doctor): preserve non-codex agentRuntime during model ref migration

Description (problem / solution / changelog)

Summary

doctor --fix silently migrates intentional openai-codex/ config with agentRuntime: { id: "pi" } to openai/ with agentRuntime: { id: "codex" }, breaking PI+OAuth runtime and causing 3-4x token inflation.

Root Cause

In codex-route-warnings.ts, rewriteAgentModelRefs() processes model config keys (triggering ensureCodexRuntimePolicy which writes codex runtime) before migrating the models map. When the models map is later merged, the generated codex entry overwrites the user's pi entry.

Additionally, clearLegacyAgentRuntimePolicy unconditionally deletes agent-level agentRuntime regardless of runtime ID.

Fix

  1. Reorder: Move rewriteModelsMap before the AGENT_MODEL_CONFIG_KEYS loop so legacy entries with explicit runtime pins are merged onto canonical keys first. ensureCodexRuntimePolicy then sees the existing pi runtime and skips.

  2. Guard: clearLegacyAgentRuntimePolicy now only deletes runtime pins with ID codex, auto, default, or absent. Non-codex runtimes like pi are preserved.
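The guard in step 2 can be sketched as a predicate on the runtime id. This assumes the agent-level config is a plain object with an optional agentRuntime field; shouldClearRuntimePin and clearLegacyAgentRuntimePolicySketch are hypothetical names, and the real clearLegacyAgentRuntimePolicy may differ in signature and behavior.

```typescript
type AgentConfig = { agentRuntime?: { id?: string }; [key: string]: unknown };

// Ids the PR description lists as safe to clear (an absent id is handled separately).
const CLEARABLE_RUNTIME_IDS = new Set(["codex", "auto", "default"]);

function shouldClearRuntimePin(id: string | undefined): boolean {
  return id === undefined || CLEARABLE_RUNTIME_IDS.has(id.trim().toLowerCase());
}

// Returns a copy without agentRuntime when the pin is clearable, otherwise the input unchanged.
function clearLegacyAgentRuntimePolicySketch(agent: AgentConfig): AgentConfig {
  if (!("agentRuntime" in agent)) return agent;
  if (!shouldClearRuntimePin(agent.agentRuntime?.id)) return agent;
  const next: AgentConfig = { ...agent };
  delete next.agentRuntime;
  return next;
}
```

With this shape, an agent-level { id: "pi" } pin is returned untouched, while codex, auto, default, and id-less pins are dropped as before.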

Tests

1 new regression test verifying openai-codex/gpt-5.4 with agentRuntime.id: "pi" migrates correctly. All existing codex-route-warnings tests pass.

Fixes #84038

Changed files

  • src/commands/doctor/shared/codex-route-warnings.test.ts (modified, +28/-1)
  • src/commands/doctor/shared/codex-route-warnings.ts (modified, +9/-9)

PR #84196: fix(doctor): preserve legacy agentRuntime when merging codex route config

Description (problem / solution / changelog)

Problem

openclaw doctor --fix silently migrates openai-codex/* config to openai/*, but the merge order causes canonical's agentRuntime to overwrite user-explicit agentRuntime settings.

Impact: Users who explicitly configured PI runtime to work around Codex bugs get silently migrated back to Codex, causing 3-4x token inflation.

Root Cause

src/commands/doctor/shared/codex-route-warnings.ts:1314 — rewriteModelsMap merges canonical into legacy, but canonical fields (including agentRuntime) take priority over legacy.

Fix

Flip merge order from { ...legacyRecord, ...canonicalRecord } to { ...canonicalRecord, ...legacyRecord } so user-explicit settings (like PI runtime) take priority over canonical defaults.
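The effect of the flipped spread can be seen on two small inputs. This is an illustration of object spread semantics only, with mergeLegacyWins as a hypothetical wrapper around the one-line change; it is not the code at the cited location.

```typescript
type Entry = { agentRuntime?: { id: string } };

// Flipped order: legacy fields are spread last, so they win on key collisions.
function mergeLegacyWins(legacyRecord: Entry, canonicalRecord: Entry): Entry {
  return { ...canonicalRecord, ...legacyRecord };
}

const synthesized: Entry = { agentRuntime: { id: "codex" } };

console.log(mergeLegacyWins({ agentRuntime: { id: "pi" } }, synthesized).agentRuntime?.id); // prints "pi"
console.log(mergeLegacyWins({ agentRuntime: { id: "auto" } }, synthesized).agentRuntime?.id); // prints "auto"
```

The second call shows a side effect of this approach taken on its own: a legacy auto pin would also win over the synthesized codex pin, which differs from the auto / default handling described in PR #84142.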

Test

All existing tests pass.

Fixes #84038

Changed files

  • src/commands/doctor/shared/codex-route-warnings.ts (modified, +1/-1)

PR #84362: fix(doctor): preserve explicit agentRuntime pin during codex model migration [AI-assisted]

Description (problem / solution / changelog)

Makes https://github.com/openclaw/openclaw/pull/84142 merge-ready for the ClawSweeper automerge loop. The edit pass should inspect the live PR diff, review comments, and failing checks; rebase if needed; keep the contributor branch credited; and stop only when validation is green or an external blocker is proven.

ClawSweeper 🐠 replacement reef notes:

  • Repair fallback: GitHub rejected the repair branch push because it updates workflow files and the ClawSweeper app token does not have workflows permission

Inherited issue-closing references from the source PR: Closes #84038

Co-author credit kept:

fish notes: model gpt-5.5, reasoning high; reviewed against 41e5043d9b2c.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/commands/doctor/shared/codex-route-warnings.test.ts (modified, +281/-0)
  • src/commands/doctor/shared/codex-route-warnings.ts (modified, +107/-10)


Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

The native Codex runtime produces 3–4× higher token usage compared to the OpenClaw PI runtime for the same GPT-5.x requests. This is a known upstream issue that OpenClaw cannot fix — but doctor --fix silently forces users onto the broken runtime by migrating intentional openai-codex/ configs to openai/, removing the agentRuntime: { id: "pi" } override in the process. Users have no way to opt out of this migration short of manually reverting after every doctor run.

Steps to reproduce

  1. Configure an explicit PI-runtime setup to avoid the broken Codex runtime:
    {
      "agents": {
        "defaults": {
          "model": {
            "primary": "openai-codex/gpt-5.4"
          },
          "models": {
            "openai-codex/gpt-5.4": {
              "agentRuntime": { "id": "pi" }
            }
          }
        }
      },
      "auth": {
        "order": {
          "openai-codex": ["openai-codex:user@example.com"]
        }
      },
      "plugins": {
        "entries": { "codex": { "enabled": false } }
      }
    }
  2. Run openclaw doctor --fix (e.g. as part of a version update)
  3. Observe doctor output:
    Repaired Codex model routes:
    - agents.defaults.model.primary: openai-codex/gpt-5.4 -> openai/gpt-5.4
    - agents.defaults.models.openai-codex/gpt-5.4: openai-codex/gpt-5.4 -> openai/gpt-5.4
  4. Restart the gateway
  5. Send any message to the agent and observe token usage or check openclaw models status

Expected behavior

  • Doctor respects an explicitly configured openai-codex/ primary and agentRuntime: { id: "pi" } override. If the user has deliberately opted out of the native Codex runtime (because it is buggy and burns tokens), doctor should not re-enroll them.
  • If doctor must migrate model references, it should at minimum preserve the agentRuntime config from the old entry and copy it to the new one.
  • Ideally: doctor warns and asks before applying breaking model-route changes, similar to how it already shows interactive summaries for other migrations.

Actual behavior

  • primary is silently changed to openai/gpt-5.4
  • The models["openai-codex/gpt-5.4"] entry — including agentRuntime: { id: "pi" } — is removed entirely
  • All model aliases pointing to openai-codex/gpt-5.4 are dropped
  • The gateway starts with the native Codex runtime active
  • Token usage per turn is 3–4× higher than before the migration (observed with GPT-5.4; even worse with GPT-5.5)
  • openclaw models status shows openai/gpt-5.4 · api-key (env: OPENAI_API_KEY) instead of openai-codex/gpt-5.4 · oauth — auth path also silently switched

OpenClaw version

from 2026.5.12 onwards

Operating system

macOS 26.5

Install method

No response

Model

openai-codex/gpt-5.x

Provider / routing chain

Desired (PI runtime, working): User message → openai-codex/gpt-5.4 → auth: oauth (openai-codex:user@example.com) → runtime: OpenClaw PI → tokens_prompt: baseline (1×)

Additional provider/model setup details

Provider / Routing Chain

Desired (PI runtime, working):
  User message
    → openai-codex/gpt-5.4
    → auth: oauth (openai-codex:user@example.com)
    → runtime: OpenClaw PI
    → tokens_prompt: baseline (1×)

Actual after doctor (native Codex runtime, broken):
  User message
    → openai/gpt-5.4
    → auth: api-key (env: OPENAI_API_KEY)   ← wrong auth path
    → runtime: native Codex harness          ← broken runtime
    → tokens_prompt: 3–4× baseline           ← token inflation

The OPENAI_API_KEY is present in this environment for unrelated features (Talk/Voice). With primary: openai/gpt-5.4, OpenClaw silently prefers the direct API key path even when auth.order.openai-codex is configured. This auth-path ambiguity is the original reason for adopting the explicit openai-codex/ namespace.

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

Proposed Fix / Feature Request

The root issue is the native Codex runtime itself — it is producing significantly inflated token counts compared to the PI runtime for equivalent workloads. This appears to be an upstream Codex issue that OpenClaw cannot directly fix at the provider level.

Until that is resolved, users need a stable way to opt out. Specifically:

  1. Respect agentRuntime: { id: "pi" } as an explicit opt-out signal. Doctor should not overwrite a model entry that carries this key without at minimum a warning.
  2. Surface the Codex runtime as experimental / opt-in, not the default path that doctor steers users toward. Users should be able to choose whether they want the native Codex harness or the PI runtime — and that choice should survive doctor --fix.
  3. If migration is unavoidable, carry agentRuntime and alias config forward to the new model ref automatically.

Workaround

Manually reapply after every doctor --fix:

    openclaw config set agents.defaults.model.primary "openai-codex/gpt-5.4"
    openclaw config set agents.defaults.models["openai-codex/gpt-5.4"].agentRuntime.id "pi"
    openclaw gateway restart --force
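Because the workaround has to be reapplied after each doctor run, it may help to detect a lost pin before restarting the gateway. The sketch below compares two already-parsed config objects (for example a backup taken before doctor --fix and the config afterwards) and reports model entries whose runtime pin changed. findChangedRuntimePins and the DoctorConfig type are hypothetical, and loading the config files is left out because their location is not stated in the issue.

```typescript
type Entry = { agentRuntime?: { id?: string } };
type DoctorConfig = { agents?: { defaults?: { models?: Record<string, Entry | undefined> } } };

// Lists pins from `before` that are missing or different in `after`,
// looking under both the original key and its openai/ form.
function findChangedRuntimePins(before: DoctorConfig, after: DoctorConfig): string[] {
  const beforeModels = before.agents?.defaults?.models ?? {};
  const afterModels = after.agents?.defaults?.models ?? {};
  const changed: string[] = [];
  for (const [ref, entry] of Object.entries(beforeModels)) {
    const pinned = entry?.agentRuntime?.id;
    if (!pinned) continue; // nothing was pinned, so nothing can be lost
    const canonicalRef = ref.replace(/^openai-codex\//, "openai/");
    const current = (afterModels[ref] ?? afterModels[canonicalRef])?.agentRuntime?.id;
    if (current !== pinned) changed.push(`${ref}: ${pinned} -> ${current ?? "(missing)"}`);
  }
  return changed;
}

const beforeDoctor: DoctorConfig = {
  agents: { defaults: { models: { "openai-codex/gpt-5.4": { agentRuntime: { id: "pi" } } } } },
};
const afterDoctor: DoctorConfig = {
  agents: { defaults: { models: { "openai/gpt-5.4": { agentRuntime: { id: "codex" } } } } },
};

console.log(findChangedRuntimePins(beforeDoctor, afterDoctor)); // prints [ 'openai-codex/gpt-5.4: pi -> codex' ]
```

An empty result means every pinned runtime survived the run, in which case the manual reapply can be skipped.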
