openclaw - ✅(Solved) Fix [Bug]: openai-codex OAuth runtime fails on 2026.4.9 with 403 HTML; 2026.3.28 works [1 pull requests, 2 comments, 2 participants]

FWo100 · 2026-04-10T07:38:25Z




openclaw/openclaw#64174•Fetched 2026-04-11 06:16:01


Author

FWo100

Participants

FWo100

mjamiv

Timeline (top)

commented ×2, labeled ×2, cross-referenced ×1

On the same Ubuntu VPS and openai-codex OAuth setup, OpenClaw 2026.4.9 fails to return runtime chat replies in Control UI, WhatsApp, and WeCom, while 2026.3.28 works.

Error Message

Control UI gets stuck on "writing", or the channel returns an error instead of a normal reply.

The raw runtime symptom repeated in the gateway logs is an upstream `403 <html>...` (logged as `"error":"403 <html>..."`), while the surfaced user-facing error text varies. As a result, Control UI becomes unusable for normal chat replies, and channel users receive misleading error messages instead of agent responses.

Root Cause

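The issue itself only establishes a regression: on the same Ubuntu VPS and openai-codex OAuth setup, OpenClaw 2026.4.9 fails to return runtime chat replies in Control UI, WhatsApp, and WeCom, while 2026.3.28 works. The linked PR #64286 describes the underlying problem as an OpenAI Codex OAuth scope gap in OpenClaw's login wrapper, combined with failure reporting that mislabels the resulting HTML 403 responses.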

Fix Action

Fixed

Fixed by PR: openai-codex: fix auth scope handling and classify provider/runtime failures (https://github.com/openclaw/openclaw/pull/64286)

PR fix notes

PR #64286: openai-codex: fix auth scope handling and classify provider/runtime failures

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64286

Description (problem / solution / changelog)

Summary

This is PR 2 of the GPT-5.4 / Codex agentic runtime parity program tracked in #64227 and scoped by #64229.

It fixes the maintained-source OpenAI Codex OAuth scope gap in OpenClaw's login wrapper and adds a separate provider/runtime failure taxonomy that makes auth-scope, refresh, HTML 403, proxy, DNS, timeout, schema, sandbox-blocked, and replay-invalid failures observable in logs and easier to explain to users.

What changed

- normalize OpenAI Codex authorize URLs so the required scopes are always present:
  - `openid`
  - `profile`
  - `email`
  - `offline_access`
  - `model.request`
  - `api.responses.write`
- add `classifyProviderRuntimeFailureKind(...)` as a typed provider/runtime failure classifier
- keep the older failover-reason contract intact instead of widening it in this slice
- thread `providerRuntimeFailureKind` through embedded-run observation fields and lifecycle logging
- surface more truthful user-facing copy for:
  - OAuth refresh failures
  - missing OpenAI Codex scopes
  - HTML 403 auth failures
  - proxy/tunnel misroutes
  - replay-invalid failures
- add focused regressions for scope failures, refresh failures, HTML 403, proxy, DNS, timeout, schema, sandbox-blocked, and replay-invalid paths
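
The scope normalization described above could look roughly like the following. This is a minimal sketch, not the PR's actual code: the helper name `ensureRequiredScopes` and the example authorize URL are assumptions made for illustration; only the six scope names come from the PR description.

```typescript
// Scopes listed in the PR description as required for OpenAI Codex OAuth.
const REQUIRED_SCOPES: readonly string[] = [
  "openid",
  "profile",
  "email",
  "offline_access",
  "model.request",
  "api.responses.write",
];

// Hypothetical helper (not taken from the PR): returns the authorize URL with
// any missing required scopes appended to its space-separated `scope` parameter.
function ensureRequiredScopes(authorizeUrl: string): string {
  const url = new URL(authorizeUrl);
  const existing = (url.searchParams.get("scope") ?? "")
    .split(/\s+/)
    .filter((scope) => scope.length > 0);

  // Keep whatever the caller already requested, in its original order.
  const merged = [...existing];
  for (const scope of REQUIRED_SCOPES) {
    if (!merged.includes(scope)) {
      merged.push(scope);
    }
  }

  url.searchParams.set("scope", merged.join(" "));
  return url.toString();
}

// Usage with a placeholder authorize URL (assumed, not the real endpoint).
const normalized = ensureRequiredScopes(
  "https://auth.example.test/authorize?client_id=abc&scope=openid%20email",
);
const normalizedScopes = (new URL(normalized).searchParams.get("scope") ?? "").split(" ");
```

Appending rather than replacing keeps any extra scopes a caller already requested, which seems the safer default when the goal is only to guarantee a minimum set.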

Why

GPT-5.4 / Codex failures in OpenClaw are still too easy to misdiagnose as generic model stops. This slice makes the auth/runtime layer tell the truth before we move on to tool-contract and parity-harness work.

Non-goals

does not implement tool compatibility work from #64230
does not implement permission truthfulness work from #64231
does not implement replay/liveness hardening from #64232
does not implement the benchmark harness from #64233
does not widen the generic failover-reason enum for every caller in this slice

Builds on prior groundwork

#45176
#48592
#53702
#55206
#44019

Validation

Focused checks run:

- `CI=1 pnpm exec vitest run src/commands/openai-codex-oauth.test.ts src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts src/agents/failover-error.test.ts src/agents/pi-embedded-error-observation.test.ts src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts`
- repo hook gate during commit:
  - `pnpm check:no-conflict-markers`
  - `pnpm tool-display:check`
  - `pnpm check:host-env-policy:swift`
  - `pnpm tsgo`
  - `node scripts/prepare-extension-package-boundary-artifacts.mjs`
  - `pnpm lint`
  - `pnpm lint:webhook:no-low-level-body-read`
  - `pnpm lint:auth:no-pairing-store-group`
  - `pnpm lint:auth:pairing-account-scope`

Linked issues

Closes #64229
Refs #64227
Refs #64133
Refs #64174
Refs #64092
Refs #57399
Refs #62672

Changed files

src/agents/failover-error.test.ts (modified, +10/-0)
src/agents/pi-embedded-error-observation.test.ts (modified, +14/-0)
src/agents/pi-embedded-error-observation.ts (modified, +23/-4)
src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts (modified, +67/-0)
src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts (modified, +79/-0)
src/agents/pi-embedded-helpers.ts (modified, +2/-0)
src/agents/pi-embedded-helpers/errors.ts (modified, +219/-4)
src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts (modified, +22/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.ts (modified, +16/-3)
src/commands/openai-codex-oauth.test.ts (modified, +28/-3)
src/plugins/provider-openai-codex-oauth.ts (modified, +40/-1)

Code Example

Observed log patterns on 2026.4.9 include:

- `embedded_run_agent_end`
  - `"error":"403 <html>..."`
  - `"failoverReason":"auth"`
  - `"provider":"openai-codex"`
  - `"model":"gpt-5.4"`

- Followed by surfaced user-facing messages such as:
  - `LLM request failed: DNS lookup for the provider endpoint failed.`
  - `⚠️ API rate limit reached. Please try again later.`

Other observed evidence:
- VPS resolver checks for `api.openai.com` succeeded
- `openclaw models status --agent main --probe` succeeded on the intended Codex OAuth profile
- Downgrading from 2026.4.9 to 2026.3.28 restored normal behavior on the same VPS and same setup
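
The mismatch above (a raw `403 <html>...` surfaced as a DNS or rate-limit message) is the kind of problem a failure classifier is meant to prevent. The following is a minimal sketch of the idea only. It is not the PR's `classifyProviderRuntimeFailureKind(...)`, whose signature and categories are not shown in this report; the type `FailureKindSketch`, the function `classifyRawErrorSketch`, and the matching patterns are all assumptions.

```typescript
// Hypothetical failure kinds, loosely following categories named in the PR text.
type FailureKindSketch =
  | "auth_html_403"
  | "dns"
  | "timeout"
  | "rate_limit"
  | "unknown";

// Hypothetical classifier over the raw upstream error text.
function classifyRawErrorSketch(raw: string): FailureKindSketch {
  const text = raw.trim().toLowerCase();

  // Check the HTML 403 case first, so an HTML error page that happens to
  // mention other keywords is not mislabelled as DNS or rate limiting.
  if (/^403\b/.test(text) && /<html|<!doctype html/.test(text)) {
    return "auth_html_403";
  }
  if (/enotfound|eai_again|getaddrinfo/.test(text)) {
    return "dns";
  }
  if (/etimedout|timed out|timeout/.test(text)) {
    return "timeout";
  }
  if (/^429\b|rate limit/.test(text)) {
    return "rate_limit";
  }
  return "unknown";
}

// The raw symptom from the report classifies as an auth-style HTML 403.
const observedKind = classifyRawErrorSketch("403 <html>...");
```

Ordering the checks from most specific to least specific is the main design choice here: the status code plus body shape is stronger evidence than a keyword match anywhere in the text.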


Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

On the same Ubuntu VPS and openai-codex OAuth setup, OpenClaw 2026.4.9 fails to return runtime chat replies in Control UI, WhatsApp, and WeCom, while 2026.3.28 works.

Steps to reproduce

1. Start the same self-hosted Ubuntu VPS setup on OpenClaw 2026.4.9 with openai-codex OAuth.
2. Open a fresh Control UI chat, or send `/new` and then a simple text message in WhatsApp or WeCom.
3. Observe that Control UI gets stuck on "writing" or the channel returns an error instead of a normal reply.
4. Check gateway logs and observe repeated `403 <html>...` for `"provider":"openai-codex"` and `"model":"gpt-5.4"`.
5. Downgrade the same setup to 2026.3.28 and repeat the same tests.
6. Observe that replies work again on 2026.3.28.

Expected behavior

On the same VPS and same openai-codex OAuth setup, Control UI, WhatsApp, and WeCom should return normal replies, as observed on OpenClaw 2026.3.28.

Actual behavior

On 2026.4.9, Control UI main gets stuck on "writing", and WhatsApp/WeCom return misleading user-facing errors such as `LLM request failed: DNS lookup for the provider endpoint failed.` or `⚠️ API rate limit reached. Please try again later.` Meanwhile, the gateway logs repeatedly show an upstream `403 <html>...` with `"failoverReason":"auth"`, `"provider":"openai-codex"`, and `"model":"gpt-5.4"`.

OpenClaw version

2026.4.9 (0512059)

Operating system

Ubuntu 22.04.5 LTS

Install method

npm global

Model

openai-codex/gpt-5.4

Provider / routing chain

openclaw -> openai-codex OAuth -> https://chatgpt.com/backend-api

Additional provider/model setup details

Node version: v24.13.1
Main LLM path is openai-codex OAuth only
No non-Codex fallback is configured
openclaw models status --agent main --probe succeeds for the intended profile while runtime chats still fail
Per-agent probes for live handler agents also showed the intended Codex profile probing OK
One malformed main/agent/models.json entry (https:/chatgpt.com/backend-api) was found and fixed to https://chatgpt.com/backend-api, but the broader 2026.4.9 runtime failure persisted
SSE transport was forced in config and did not resolve the issue
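
The malformed `https:/chatgpt.com/backend-api` entry mentioned above is easy to miss, because WHATWG-style URL parsers generally accept a single slash after `https:` and normalize it silently. A plain string check is therefore more reliable for spotting it. The sketch below is an illustration only: `findMalformedBaseUrls` is a hypothetical helper, and how base URLs are actually laid out inside `models.json` is not shown in this report.

```typescript
// Hypothetical helper: returns the entries that do not start with a proper
// "http://" or "https://" prefix followed by a host character.
function findMalformedBaseUrls(baseUrls: string[]): string[] {
  const wellFormedPrefix = /^https?:\/\/[^/\s]/;
  return baseUrls.filter((candidate) => !wellFormedPrefix.test(candidate));
}

// The malformed and corrected values quoted in the report.
const suspicious = findMalformedBaseUrls([
  "https:/chatgpt.com/backend-api",
  "https://chatgpt.com/backend-api",
]);
```

As the report notes, fixing this entry did not resolve the broader 2026.4.9 failure, so a check like this rules out one variable rather than explaining the regression.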

Logs, screenshots, and evidence

Observed log patterns on 2026.4.9 include:

- `embedded_run_agent_end`
  - `"error":"403 <html>..."`
  - `"failoverReason":"auth"`
  - `"provider":"openai-codex"`
  - `"model":"gpt-5.4"`

- Followed by surfaced user-facing messages such as:
  - `LLM request failed: DNS lookup for the provider endpoint failed.`
  - `⚠️ API rate limit reached. Please try again later.`

Other observed evidence:
- VPS resolver checks for `api.openai.com` succeeded
- `openclaw models status --agent main --probe` succeeded on the intended Codex OAuth profile
- Downgrading from 2026.4.9 to 2026.3.28 restored normal behavior on the same VPS and same setup
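
To confirm how often the pattern above occurs, the gateway log can be scanned for matching records. The sketch below assumes the log is available as one JSON object per line and that the fields are named `error` and `provider` as in the fragments quoted above; the actual log layout is not shown in this report, and `countHtml403` and `parseRecordSketch` are hypothetical helpers.

```typescript
// Assumed record shape, based only on the field names quoted in the report.
interface RunEndRecordSketch {
  error?: string;
  failoverReason?: string;
  provider?: string;
  model?: string;
}

// Parses one log line, returning undefined for anything that is not a JSON object.
function parseRecordSketch(line: string): RunEndRecordSketch | undefined {
  try {
    const value: unknown = JSON.parse(line);
    return typeof value === "object" && value !== null
      ? (value as RunEndRecordSketch)
      : undefined;
  } catch {
    return undefined;
  }
}

// Counts records for the given provider whose error text is a 403 with an HTML body.
function countHtml403(lines: string[], provider: string): number {
  let count = 0;
  for (const line of lines) {
    const record = parseRecordSketch(line);
    if (
      record !== undefined &&
      record.provider === provider &&
      typeof record.error === "string" &&
      /^403\s*<html/i.test(record.error)
    ) {
      count += 1;
    }
  }
  return count;
}

// Sample lines made up for illustration; only the field values mirror the report.
const sampleCount = countHtml403(
  [
    '{"error":"403 <html>...","failoverReason":"auth","provider":"openai-codex","model":"gpt-5.4"}',
    '{"error":"429 rate limit","provider":"openai-codex"}',
    "not json",
    '{"error":"403 <html>...","provider":"other"}',
  ],
  "openai-codex",
);
```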

Impact and severity

Affected users/systems/channels:

Control UI main
WhatsApp channel
WeCom channel
Same self-hosted Ubuntu VPS setup

Severity: High; blocks normal agent replies across the main UI and both tested channels

Frequency: Repeated across observed attempts on 2026.4.9

Consequence: Control UI becomes unusable for normal chat replies and channel users receive misleading error messages instead of agent responses

Additional information

Last known good version: 2026.3.28
First known bad version observed: 2026.4.9
Downgrading back to 2026.3.28 restored normal behavior
This issue reproduced across Control UI, WhatsApp, and WeCom on the same VPS
The repeated raw runtime symptom observed in logs was upstream 403 <html>..., while the surfaced user-facing error text varied

extent analysis

TL;DR

Until a release containing a fix for 2026.4.9 is available, the practical workaround is to downgrade OpenClaw to 2026.3.28, the last known good version.

Guidance

Verify that the openai-codex OAuth setup and configuration are correct and unchanged between versions 2026.3.28 and 2026.4.9.
Check the gateway logs for any other error patterns or clues that might indicate a specific issue with the openai-codex integration in version 2026.4.9.
Test the openclaw models status --agent main --probe command again in version 2026.4.9 to confirm that the issue is not related to model configuration or probing.
Consider reaching out to the OpenClaw community or support team for further assistance, as this issue may be related to a regression or bug in version 2026.4.9.

Example

No specific code snippet is provided, as the issue seems to be related to a version-specific bug or regression.

Notes

The fact that downgrading to version 2026.3.28 resolves the issue suggests that the problem is likely related to a change or bug introduced in version 2026.4.9. However, without further information or debugging, it is difficult to pinpoint the exact cause.

Recommendation

Downgrade to version 2026.3.28, as it is the last known good version where the issue did not occur. This will allow for normal agent replies to function across the main UI and channels until a fix for version 2026.4.9 is available.
