openclaw - ✅(Solved) Fix agents: classify replay/liveness failures and prevent silent long-task abandonment [3 pull requests, 1 participant]

100yenadmin · 2026-04-10T09:55:00Z

Harden replay and long-task liveness so GPT-5.4 does not silently disappear mid-task or have replay failures misread as model behavior.

openclaw/openclaw#64232 • Fetched 2026-04-11 06:15:46

Author: 100yenadmin
Participants: 100yenadmin
Timeline (top): cross-referenced ×4

Root Cause

Harden replay and long-task liveness so GPT-5.4 does not silently disappear mid-task or have replay failures misread as model behavior.

Fix Action

Fixed

Fixed by PR: openai-codex: fix auth scope handling and classify provider/runtime failures (https://github.com/openclaw/openclaw/pull/64286)
Fixed by PR: agents: add OpenAI/Codex tool compatibility and replay/liveness state (https://github.com/openclaw/openclaw/pull/64300)
Fixed by PR: openai-codex: classify runtime failures and make full access truthful (https://github.com/openclaw/openclaw/pull/64439)

PR fix notes

PR #64286: openai-codex: fix auth scope handling and classify provider/runtime failures

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/64286

Description (problem / solution / changelog)

Summary

This is PR 2 of the GPT-5.4 / Codex agentic runtime parity program tracked in #64227 and scoped by #64229.

It fixes the maintained-source OpenAI Codex OAuth scope gap in OpenClaw's login wrapper and adds a separate provider/runtime failure taxonomy that makes auth-scope, refresh, HTML 403, proxy, DNS, timeout, schema, sandbox-blocked, and replay-invalid failures observable in logs and easier to explain to users.

What changed

- normalize OpenAI Codex authorize URLs so the required scopes are always present:
  - openid
  - profile
  - email
  - offline_access
  - model.request
  - api.responses.write
- add classifyProviderRuntimeFailureKind(...) as a typed provider/runtime failure classifier
- keep the older failover-reason contract intact instead of widening it in this slice
- thread providerRuntimeFailureKind through embedded-run observation fields and lifecycle logging
- surface more truthful user-facing copy for:
  - OAuth refresh failures
  - missing OpenAI Codex scopes
  - HTML 403 auth failures
  - proxy/tunnel misroutes
  - replay-invalid failures
- add focused regressions for scope failures, refresh failures, HTML 403, proxy, DNS, timeout, schema, sandbox-blocked, and replay-invalid paths
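
The scope normalization in the first item can be sketched roughly as follows. This is a minimal illustration, not the code in src/plugins/provider-openai-codex-oauth.ts: the helper name ensureRequiredScopes is an assumption, and only the six scope names come from the PR description.

```typescript
// Scope names taken from the PR description; everything else is hypothetical.
const REQUIRED_CODEX_SCOPES = [
  "openid",
  "profile",
  "email",
  "offline_access",
  "model.request",
  "api.responses.write",
];

// Hypothetical helper: returns the authorize URL with every required scope present.
function ensureRequiredScopes(authorizeUrl: string): string {
  const url = new URL(authorizeUrl);
  // OAuth 2.0 carries scopes as a space-delimited list in one `scope` parameter.
  const existing = (url.searchParams.get("scope") ?? "")
    .split(/\s+/)
    .filter((s) => s.length > 0);
  // Keep the caller's scopes in order, then append any required scope that is missing.
  const merged = [...existing];
  for (const scope of REQUIRED_CODEX_SCOPES) {
    if (!merged.includes(scope)) merged.push(scope);
  }
  url.searchParams.set("scope", merged.join(" "));
  return url.toString();
}
```

Appending rather than replacing keeps any extra scopes a caller already requested, which seems the safer default for a login wrapper; the actual change may handle this differently.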

Why

GPT-5.4 / Codex failures in OpenClaw are still too easy to misdiagnose as generic model stops. This slice makes the auth/runtime layer tell the truth before we move on to tool-contract and parity-harness work.

Non-goals

does not implement tool compatibility work from #64230
does not implement permission truthfulness work from #64231
does not implement replay/liveness hardening from #64232
does not implement the benchmark harness from #64233
does not widen the generic failover-reason enum for every caller in this slice

Builds on prior groundwork

#45176
#48592
#53702
#55206
#44019

Validation

Focused checks run:

- CI=1 pnpm exec vitest run src/commands/openai-codex-oauth.test.ts src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts src/agents/failover-error.test.ts src/agents/pi-embedded-error-observation.test.ts src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts
- repo hook gate during commit:
  - pnpm check:no-conflict-markers
  - pnpm tool-display:check
  - pnpm check:host-env-policy:swift
  - pnpm tsgo
  - node scripts/prepare-extension-package-boundary-artifacts.mjs
  - pnpm lint
  - pnpm lint:webhook:no-low-level-body-read
  - pnpm lint:auth:no-pairing-store-group
  - pnpm lint:auth:pairing-account-scope

Linked issues

Closes #64229
Refs #64227
Refs #64133
Refs #64174
Refs #64092
Refs #57399
Refs #62672

Changed files

src/agents/failover-error.test.ts (modified, +10/-0)
src/agents/pi-embedded-error-observation.test.ts (modified, +14/-0)
src/agents/pi-embedded-error-observation.ts (modified, +23/-4)
src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts (modified, +67/-0)
src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts (modified, +79/-0)
src/agents/pi-embedded-helpers.ts (modified, +2/-0)
src/agents/pi-embedded-helpers/errors.ts (modified, +219/-4)
src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts (modified, +22/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.ts (modified, +16/-3)
src/commands/openai-codex-oauth.test.ts (modified, +28/-3)
src/plugins/provider-openai-codex-oauth.ts (modified, +40/-1)

PR #64300: agents: add OpenAI/Codex tool compatibility and replay/liveness state

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/64300

Description (problem / solution / changelog)

Summary

keep the provider-owned OpenAI/Codex tool-compat layer via the existing provider hook surface
add replay/liveness state surfacing so long-running embedded runs stop disappearing silently
compact the original Contracts 2 and 5 into one execution-correctness PR in the GPT-5.4 / Codex parity program tracked by #64227

Scope

Refs #64230
Refs #64232
Refs #64227
combines provider-owned tool compatibility with replay/liveness hardening
no auth / permission truthfulness changes in this PR
no self-elected continuation scope from #38780
no benchmark harness work from #64233

What changed

add an openai tool-compat family to buildProviderToolCompatFamilyHooks(...)
gate the family to native OpenAI/OpenAI Codex response routes only
normalize provider-owned parameter-free and missing-object-shape tool schemas for strict OpenAI/Codex routes
surface provider-owned diagnostics for remaining strict-schema incompatibilities
attach the compat hooks in extensions/openai/index.ts so OpenAI and OpenAI Codex providers both expose them
add replay/liveness state to embedded run results and lifecycle surfaces
classify replay/liveness outcomes as observable working, paused, blocked, or abandoned states instead of silent disappearance
preserve replay-invalid truth across compaction retries after mutating tool side effects
add focused regressions for replay/liveness surfacing alongside the existing tool-compat coverage
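
As a rough illustration of the schema normalization item above, the sketch below gives parameter-free tools and schemas without an explicit object shape a well-formed top-level object schema. The types and the function name normalizeToolSchema are hypothetical and are not the hooks returned by buildProviderToolCompatFamilyHooks(...); the real normalization rules may be stricter.

```typescript
type JsonSchema = { [key: string]: unknown };

// Hypothetical tool shape for illustration only.
interface ToolDefinition {
  name: string;
  parameters?: JsonSchema;
}

function normalizeToolSchema(tool: ToolDefinition): ToolDefinition {
  const params = tool.parameters;
  // Parameter-free tool: supply an explicit empty object schema.
  if (params === undefined || Object.keys(params).length === 0) {
    return { ...tool, parameters: { type: "object", properties: {} } };
  }
  // Missing object shape: the schema exists but omits `type` or `properties`.
  const declaredType = params["type"];
  if (declaredType === undefined || declaredType === "object") {
    return {
      ...tool,
      parameters: {
        ...params,
        type: "object",
        properties: params["properties"] ?? {},
      },
    };
  }
  // Other top-level types are returned unchanged in this sketch; the PR says
  // remaining incompatibilities are reported as diagnostics instead.
  return tool;
}
```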

Validation

pnpm build
CI=1 pnpm exec vitest run src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts src/agents/pi-embedded-subscribe.handlers.compaction.test.ts src/agents/pi-embedded-subscribe.handlers.tools.test.ts src/agents/pi-embedded-runner/run/attempt.spawn-workspace.test.ts src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts

Non-goals

does not supersede #64229 or #64231
does not add tool-name or argument aliases
does not change generic runner behavior outside provider-owned hooks and replay/liveness surfacing

Changed files

CHANGELOG.md (modified, +1/-0)
extensions/openai/index.test.ts (modified, +78/-0)
extensions/openai/index.ts (modified, +3/-0)
src/agents/pi-embedded-runner/run.incomplete-turn.test.ts (modified, +43/-0)
src/agents/pi-embedded-runner/run.overflow-compaction.test.ts (modified, +23/-0)
src/agents/pi-embedded-runner/run.timeout-triggered-compaction.test.ts (modified, +1/-0)
src/agents/pi-embedded-runner/run.ts (modified, +80/-0)
src/agents/pi-embedded-runner/run/attempt.spawn-workspace.test-support.ts (modified, +6/-0)
src/agents/pi-embedded-runner/run/attempt.ts (modified, +18/-5)
src/agents/pi-embedded-runner/run/incomplete-turn.ts (modified, +45/-0)
src/agents/pi-embedded-runner/run/retry-limit.ts (modified, +5/-0)
src/agents/pi-embedded-runner/run/types.ts (modified, +7/-0)
src/agents/pi-embedded-runner/types.ts (modified, +4/-0)
src/agents/pi-embedded-subscribe.handlers.compaction.ts (modified, +4/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts (modified, +67/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.ts (modified, +27/-1)
src/agents/pi-embedded-subscribe.handlers.tools.test.ts (modified, +92/-0)
src/agents/pi-embedded-subscribe.handlers.tools.ts (modified, +5/-0)
src/agents/pi-embedded-subscribe.handlers.types.ts (modified, +6/-0)
src/agents/pi-embedded-subscribe.subscribe-embedded-pi-session.subscribeembeddedpisession.test.ts (modified, +38/-0)
src/agents/pi-embedded-subscribe.ts (modified, +21/-0)
src/agents/pi-embedded-subscribe.types.ts (modified, +2/-0)
src/auto-reply/reply/dispatch-from-config.ts (modified, +2/-2)
src/plugin-sdk/provider-tools.test.ts (modified, +244/-0)
src/plugin-sdk/provider-tools.ts (modified, +286/-1)
src/plugins/contracts/provider-family-plugin-tests.test.ts (modified, +1/-0)

PR #64439: openai-codex: classify runtime failures and make full access truthful

Repository: openclaw/openclaw
Author: 100yenadmin
State: closed | merged: True
Link: https://github.com/openclaw/openclaw/pull/64439

Description (problem / solution / changelog)

Summary

This is the compact runtime-truthfulness slice of the GPT-5.4 / Codex parity program tracked in #64227.

It combines the original Contract 1 auth/runtime truthfulness work from #64229 with the Contract 4 permission truthfulness work from #64231, so OpenClaw tells the truth about both provider/runtime failures and whether /elevated full is actually available.

Scope

Closes #64229
Closes #64231
Refs #64227
combines auth/runtime failure classification with truthful full-access surfacing
no tool-compat or replay/liveness scope in this PR
no benchmark harness scope in this PR

What changed

- normalize OpenAI Codex authorize URLs so the required scopes are always present:
  - openid
  - profile
  - email
  - offline_access
  - model.request
  - api.responses.write
- add typed provider/runtime failure classification for:
  - auth_scope
  - auth_refresh
  - auth_html_403
  - proxy
  - dns
  - timeout
  - schema
  - sandbox_blocked
  - replay_invalid
  - unknown
- thread providerRuntimeFailureKind through embedded-run observation fields and lifecycle logging
- surface more truthful user-facing copy for scope failures, refresh failures, HTML 403 auth failures, proxy/tunnel misroutes, and replay-invalid failures
- extend embedded elevated metadata with fullAccessAvailable and fullAccessBlockedReason
- advertise /elevated full only when auto-approved host exec is actually available for the current runtime
- update current exec hints so unavailable full access is explained precisely instead of being suggested as if it were always possible
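
To make the failure taxonomy above concrete, here is a minimal sketch of a typed classifier. The ten kind names come from the PR description; the input shape, the function name classifyFailureSketch, and every matching rule are assumptions for illustration and almost certainly differ from the real classifyProviderRuntimeFailureKind(...).

```typescript
type ProviderRuntimeFailureKind =
  | "auth_scope"
  | "auth_refresh"
  | "auth_html_403"
  | "proxy"
  | "dns"
  | "timeout"
  | "schema"
  | "sandbox_blocked"
  | "replay_invalid"
  | "unknown";

// Hypothetical input shape; the real classifier's signature is not shown in the PR text.
interface FailureSignal {
  status?: number;
  contentType?: string;
  code?: string;
  message: string;
}

function classifyFailureSketch(signal: FailureSignal): ProviderRuntimeFailureKind {
  const msg = signal.message.toLowerCase();
  const isHtml = (signal.contentType ?? "").toLowerCase().includes("text/html");

  // Assumption: an HTML body on a 403 is a login or block page, not an API error.
  if (signal.status === 403 && isHtml) return "auth_html_403";
  if (msg.includes("scope")) return "auth_scope";
  if (msg.includes("refresh")) return "auth_refresh";
  // HTTP 407 is Proxy Authentication Required.
  if (signal.status === 407 || msg.includes("proxy") || msg.includes("tunnel")) return "proxy";
  // Node.js reports failed DNS lookups with ENOTFOUND or EAI_AGAIN.
  if (signal.code === "ENOTFOUND" || signal.code === "EAI_AGAIN") return "dns";
  if (signal.code === "ETIMEDOUT" || msg.includes("timed out")) return "timeout";
  if (msg.includes("schema")) return "schema";
  if (msg.includes("sandbox")) return "sandbox_blocked";
  if (msg.includes("replay")) return "replay_invalid";
  // Anything unmatched stays explicit rather than being folded into another kind.
  return "unknown";
}
```

Returning "unknown" as a named kind, instead of falling back to a generic model-stop message, is what lets logs and user-facing copy distinguish "unclassified" from "the model stopped".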

Validation

full repo check stack completed while landing the combined branch commits
pnpm exec vitest run src/commands/openai-codex-oauth.test.ts src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts src/agents/failover-error.test.ts src/agents/pi-embedded-error-observation.test.ts src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts src/agents/pi-embedded-runner.buildembeddedsandboxinfo.test.ts src/agents/system-prompt.test.ts src/auto-reply/reply/get-reply-run.exec-hint.test.ts

Non-goals

does not supersede #64230 or #64232
does not widen the generic failover-reason enum for every caller in this slice
does not introduce a new permission system
does not change exec enforcement in bash-tools.exec

Changed files

CHANGELOG.md (modified, +1/-0)
extensions/qa-lab/src/live-transports/telegram/telegram-live.runtime.test.ts (modified, +6/-2)
src/agents/bash-tools.exec-types.ts (modified, +3/-0)
src/agents/pi-embedded-error-observation.test.ts (modified, +14/-0)
src/agents/pi-embedded-error-observation.ts (modified, +23/-4)
src/agents/pi-embedded-helpers.formatassistanterrortext.test.ts (modified, +71/-0)
src/agents/pi-embedded-helpers.isbillingerrormessage.test.ts (modified, +84/-0)
src/agents/pi-embedded-helpers.ts (modified, +2/-0)
src/agents/pi-embedded-helpers/errors.ts (modified, +224/-4)
src/agents/pi-embedded-runner.buildembeddedsandboxinfo.test.ts (modified, +65/-1)
src/agents/pi-embedded-runner/sandbox-info.ts (modified, +33/-3)
src/agents/pi-embedded-runner/types.ts (modified, +4/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.test.ts (modified, +22/-0)
src/agents/pi-embedded-subscribe.handlers.lifecycle.ts (modified, +6/-2)
src/agents/system-prompt.test.ts (modified, +30/-1)
src/agents/system-prompt.ts (modified, +46/-7)
src/auto-reply/reply.directive.directive-behavior.defaults-think-low-reasoning-capable-models-no.test.ts (modified, +2/-1)
src/auto-reply/reply/commands-system-prompt.test.ts (modified, +34/-0)
src/auto-reply/reply/commands-system-prompt.ts (modified, +12/-0)
src/auto-reply/reply/get-reply-run.exec-hint.test.ts (modified, +13/-0)
src/auto-reply/reply/get-reply-run.media-only.test.ts (modified, +8/-4)
src/auto-reply/reply/get-reply-run.ts (modified, +25/-1)
src/commands/openai-codex-oauth.test.ts (modified, +53/-3)
src/media/base64.ts (modified, +32/-3)
src/plugins/provider-openai-codex-oauth.test.ts (added, +24/-0)
src/plugins/provider-openai-codex-oauth.ts (modified, +47/-1)

Parent: #64227

Summary

Harden replay and long-task liveness so GPT-5.4 does not silently disappear mid-task or have replay failures misread as model behavior.

Scope

replay-invalid classification
stale continuation / compaction abandonment handling
liveness states for working, paused, blocked, and abandoned
hardening only around current replay/liveness seams; no duplication of #38780

Acceptance

replay failures classify as replay failures
long-running tasks cannot silently vanish without a user-visible state transition
compaction/replay transitions preserve truthful progress state
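
One way to read the second acceptance criterion is that every state change must go through a single notifying path, and a run that stops making progress must be moved to abandoned explicitly. The sketch below illustrates that idea under stated assumptions: the four state names come from the issue, while the LivenessTracker class, its method names, and the stale-window rule are hypothetical and not OpenClaw's implementation.

```typescript
type LivenessState = "working" | "paused" | "blocked" | "abandoned";

interface LivenessEvent {
  from: LivenessState;
  to: LivenessState;
  reason: string;
  at: number;
}

// Hypothetical tracker: all names and the timing rule are illustrative.
class LivenessTracker {
  private state: LivenessState = "working";
  private lastHeartbeat: number;
  private readonly staleAfterMs: number;
  private readonly notify: (event: LivenessEvent) => void;

  constructor(staleAfterMs: number, notify: (event: LivenessEvent) => void, now: number) {
    this.staleAfterMs = staleAfterMs;
    this.notify = notify;
    this.lastHeartbeat = now;
  }

  get current(): LivenessState {
    return this.state;
  }

  // Record progress (for example a tool result or a streamed token).
  heartbeat(now: number): void {
    this.lastHeartbeat = now;
  }

  // The only way to change state, so no transition can skip the notification.
  transition(to: LivenessState, reason: string, now: number): void {
    if (to === this.state) return;
    const from = this.state;
    this.state = to;
    this.notify({ from, to, reason, at: now });
  }

  // Called periodically; a working run with no recent heartbeat becomes abandoned.
  checkStale(now: number): void {
    if (this.state === "working" && now - this.lastHeartbeat > this.staleAfterMs) {
      this.transition("abandoned", "no progress within stale window", now);
    }
  }
}
```

Passing timestamps in explicitly keeps the sketch deterministic and easy to test; a real runner would more likely read a clock and drive checkStale from a timer.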

extent analysis

TL;DR

Implement liveness states and harden replay handling to prevent silent task disappearance and misclassification of replay failures.

Guidance

Introduce distinct liveness states (working, paused, blocked, abandoned) to track task progress and prevent silent disappearance.
Enhance replay handling to correctly classify replay failures and prevent them from being misread as model behavior.
Implement compaction and replay transition logic to preserve truthful progress state.
Review and refine the stale continuation and compaction abandonment handling so that a stalled continuation or compaction produces a user-visible state transition instead of a silent stop.

Example

No explicit code example can be provided without more context, but the implementation should focus on introducing these liveness states and enhancing replay handling logic.

Notes

The provided information lacks specific technical details, so the guidance is focused on the general approach to addressing the issue. The actual implementation may vary depending on the underlying system architecture and technology stack.

Recommendation

Apply workaround: Implement the suggested liveness states and replay handling enhancements to harden the system against silent task disappearance and replay failures. This approach is recommended because it directly addresses the identified issues and provides a clear path to improving system reliability.
