openclaw: [Bug]: Codex app-server stalls after `item/completed`, then aborts without recovery/status [4 comments, 3 participants]

openclaw · 2026-05-19 09:45:20

openclaw/openclaw#84076 • Fetched 2026-05-20 03:44:22

Timeline: labeled ×10, commented ×4, cross-referenced ×1

Fix / Workaround

Workaround in this environment: avoid the Codex app-server runtime for user-facing chat lanes until this recovery gap is fixed. For OpenAI GPT models, forcing harness=pi is only viable if the OpenAI provider credentials have api.responses.write; otherwise the normal OpenAI Responses API path fails with HTTP 401.

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

OpenClaw 2026.5.18 still loses productive Codex app-server turns when the last observed current-turn notification is item/completed and no turn/completed follows.

The already-merged fixes for #78756 and #82171 appear to be present in this installation. The current behavior is therefore not a missing-fix case, but a remaining recovery/turn-semantics problem:

- the session lane enters `processing`
- diagnostics report `active_work_without_progress`
- `lastProgress=codex_app_server:notification:item/completed`
- `recovery=none`
- after `turnCompletionIdleTimeoutMs`, OpenClaw aborts the run
- no useful visible recovery/status is delivered for the failed work
- already-started work is not resumed

This makes chat lanes look silent or stuck and can drop real work after a completed tool call.

Steps to reproduce

1. Run OpenClaw with a user-facing chat lane (reproduced here in a Discord channel and a Telegram direct chat).
2. Configure an OpenAI GPT model to use the Codex app-server runtime.
3. Disable model fallbacks so that an Anthropic fallback does not hide the Codex failure.
4. Set `plugins.entries.codex.config.appServer.turnCompletionIdleTimeoutMs` to `180000` to prove which watchdog fires.
5. In Discord, ask the agent to do a multi-step file-producing task, for example building a static multi-page web presence from existing project drafts.
6. Observe that the assistant completes one tool item and that no `turn/completed` arrives afterwards.
7. Watch diagnostics until the completion-idle timeout fires.

Relevant redacted config used during the test:

{
  "agents": {
    "defaults": {
      "model": {
        "primary": "openai/gpt-5.5",
        "fallbacks": []
      },
      "timeoutSeconds": 900
    }
  },
  "plugins": {
    "entries": {
      "codex": {
        "config": {
          "appServer": {
            "turnCompletionIdleTimeoutMs": 180000
          }
        }
      }
    }
  }
}
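The two settings that matter for this reproduction can be read back from the config to confirm the test conditions. This is a minimal sketch, assuming the redacted config above is available as JSON text; the helper name `repro_settings` is hypothetical and not part of OpenClaw.

```python
import json

# Redacted config from this report, embedded as text for the sketch.
CONFIG_TEXT = """
{
  "agents": {"defaults": {"model": {"primary": "openai/gpt-5.5", "fallbacks": []},
                          "timeoutSeconds": 900}},
  "plugins": {"entries": {"codex": {"config": {"appServer": {
      "turnCompletionIdleTimeoutMs": 180000}}}}}
}
"""


def repro_settings(config: dict) -> dict:
    """Hypothetical helper: extract the values relevant to this reproduction."""
    defaults = config["agents"]["defaults"]
    app_server = config["plugins"]["entries"]["codex"]["config"]["appServer"]
    return {
        # An empty list means no fallback can mask the Codex failure.
        "fallbacks_disabled": defaults["model"]["fallbacks"] == [],
        "idle_timeout_s": app_server["turnCompletionIdleTimeoutMs"] / 1000,
        "run_timeout_s": defaults["timeoutSeconds"],
    }


settings = repro_settings(json.loads(CONFIG_TEXT))
print(settings)
```

With these values the completion-idle timeout (180 s) is well below the overall run timeout (900 s), so the completion-idle watchdog is the one expected to fire first.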

Discord reproduction sequence from the session JSONL:

2026-05-19T08:07:43.604Z user prompt from Discord
2026-05-19T08:07:43.995Z assistant toolCall: bash mkdir -p /home/casper/.openclaw/workspace/artifacts/maria-ward-smartphone-start/site/assets/img
2026-05-19T08:07:44.092Z toolResult: completed exitCode 0 durationMs 0

No subsequent assistant work was written for the requested site build before timeout. The only filesystem result was directory creation.

Expected behavior

OpenClaw should not silently drop a productive Codex app-server turn after a completed tool item if the turn is still expected to continue.

At minimum, if OpenClaw decides the app-server turn is unrecoverably incomplete because turn/completed never arrived, it should:

- release the session lane
- send a visible channel status explaining the failed turn
- preserve enough state to allow the user to retry/resume
- avoid misleading explanations such as user/UI interruption when the log cause is `turn_completion_idle_timeout`
- avoid losing already-started work without a user-visible failure/recovery message
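The minimum handling listed above could look roughly like the following. This is only an illustrative sketch: `Lane` and `handle_incomplete_turn` are hypothetical stand-ins and do not correspond to actual OpenClaw types or functions.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Lane:
    """Hypothetical stand-in for a session lane; not an OpenClaw type."""
    state: str = "processing"
    messages: list = field(default_factory=list)
    saved_recovery: Optional[dict] = None


def handle_incomplete_turn(lane: Lane, turn_id: str, cause: str, last_item: str) -> None:
    # 1. Preserve enough state for a later retry/resume.
    lane.saved_recovery = {"turnId": turn_id, "cause": cause, "lastItem": last_item}
    # 2. Tell the user what happened, naming the logged cause instead of
    #    implying a user/UI interruption.
    lane.messages.append(
        f"The turn did not finish ({cause}); last activity was {last_item}. "
        "You can retry or resume."
    )
    # 3. Release the lane so queued messages are no longer blocked.
    lane.state = "idle"


lane = Lane()
handle_incomplete_turn(
    lane,
    turn_id="019e3f43-8034-7001-88af-70ffeb9bdb43",
    cause="turn_completion_idle_timeout",
    last_item="item/completed",
)
print(lane.state, lane.messages[0])
```

State is saved before the lane is released so that a message dequeued immediately afterwards can already see the recovery record.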

Better behavior would distinguish:

- completed tool call followed by expected assistant continuation
- genuinely terminal item completion
- missing/late `turn/completed`
- app-server still computing vs. app-server protocol dead-air
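One way to picture these distinctions is a small classifier over the last observed item and a liveness signal. This is a sketch under stated assumptions: the item type names (`tool_result`, `assistant_message`) and the `server_reports_activity` flag are hypothetical, and this report does not establish that the app-server exposes such a signal.

```python
from enum import Enum


class StallKind(Enum):
    EXPECTED_CONTINUATION = "tool call completed, assistant continuation expected"
    TERMINAL_ITEM = "terminal item completed, turn/completed missing or late"
    STILL_COMPUTING = "app-server still computing"
    PROTOCOL_DEAD_AIR = "app-server protocol dead-air"


def classify_stall(last_item_type: str, server_reports_activity: bool) -> StallKind:
    """Hypothetical classification of a turn that has gone quiet."""
    if server_reports_activity:
        # Assumed liveness signal: the server is working, just not emitting items.
        return StallKind.STILL_COMPUTING
    if last_item_type == "tool_result":
        # A tool result normally precedes more assistant work.
        return StallKind.EXPECTED_CONTINUATION
    if last_item_type == "assistant_message":
        # Final text was produced; only the turn/completed notification is missing.
        return StallKind.TERMINAL_ITEM
    return StallKind.PROTOCOL_DEAD_AIR


print(classify_stall("tool_result", False).value)
```

Under these assumptions, both reproductions in this report would fall into the first category: the last item was a completed tool call and nothing followed.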

Actual behavior

The run is aborted after the completion idle timeout. Diagnostics explicitly say recovery=none.

In the Discord reproduction, only a directory was created; no requested site files were produced. The user saw typing/activity disappear and no useful recovery surfaced.

Subsequent status questions can create confusing assistant explanations that imply a user/UI abort, even though the durable gateway evidence for the original run points to turn_completion_idle_timeout.

OpenClaw version

OpenClaw 2026.5.18 (50a2481)

Operating system

Ubuntu

Install method

npm global

Model

gpt-5.5

Provider / routing chain

openai-codex/gpt-5.5 -> Codex app-server harness -> OpenClaw embedded run -> Discord/Telegram chat lane

Additional provider/model setup details

Fallbacks were disabled during the primary test:

"fallbacks": []

This was intentional to avoid an Anthropic fallback hiding the Codex app-server failure.

turnCompletionIdleTimeoutMs was deliberately raised to 180000 during testing. The same pattern had previously been observed around the default shorter idle behavior; raising the timeout made it clear which watchdog fired.

Earlier tests with fallbacks enabled caused additional confusing behavior: OpenClaw fell back to Anthropic, then hit context overflow/compaction and separate message tool delivery errors.

Related issues/PRs:

- #78756: Codex app-server turns time out after 60s despite meaningful progress
- #79667: fix(codex): ignore account updates for turn liveness
- #82171: Codex app-server can stall after the last current-turn item completes without turn/completed
- #82172: fix(codex): fail fast after quiescent turn completion stalls

Logs, screenshots, and evidence

Gateway log excerpts:


2026-05-19T08:04:04.805Z [agent/embedded]
strict-agentic execution contract active:
runId=fa6f5365-411f-4028-8985-a9ec7a9b35a4
sessionId=ac54314e-d1ad-4145-b8fe-932309953759
provider=openai-codex/gpt-5.5 harness=codex



2026-05-19T08:07:04.822Z [diagnostic]
stalled session:
sessionId=ac54314e-d1ad-4145-b8fe-932309953759
sessionKey=agent:main:discord:channel:1497109509825626232
state=processing age=142s queueDepth=1
reason=active_work_without_progress
classification=stalled_agent_run
activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:item/completed
lastProgressAge=141s
recovery=none



2026-05-19T08:07:34.819Z [diagnostic]
stalled session:
sessionId=ac54314e-d1ad-4145-b8fe-932309953759
sessionKey=agent:main:discord:channel:1497109509825626232
state=processing age=172s queueDepth=1
reason=active_work_without_progress
classification=stalled_agent_run
activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:item/completed
lastProgressAge=171s
recovery=none



2026-05-19T08:07:43.435Z [agent/embedded]
codex app-server turn idle timed out waiting for completion
{
  threadId: "019e3f2d-b7f2-7443-ab96-4e72fe219fe1",
  turnId: "019e3f43-8034-7001-88af-70ffeb9bdb43",
  idleMs: 180003,
  timeoutMs: 180000,
  lastActivityReason: "notification:item/completed",
  lastNotificationMethod: "item/completed"
}



2026-05-19T08:07:43.457Z [agent/embedded]
codex app-server client retired after timed-out turn
{
  threadId: "019e3f2d-b7f2-7443-ab96-4e72fe219fe1",
  turnId: "019e3f43-8034-7001-88af-70ffeb9bdb43",
  reason: "turn_completion_idle_timeout",
  clearedSharedClient: true
}



2026-05-19T08:07:44.198Z [agent/embedded]
embedded run failover decision
{
  runId: "fa6f5365-411f-4028-8985-a9ec7a9b35a4",
  stage: "assistant",
  decision: "surface_error",
  failoverReason: "timeout",
  profileFailureReason: "timeout",
  provider: "openai-codex",
  model: "gpt-5.5",
  fallbackConfigured: false,
  timedOut: true,
  aborted: true
}


While diagnosing the Discord stall from Telegram, the Telegram direct session itself hit the same failure mode.


2026-05-19T08:14:59.977Z [agent/embedded]
strict-agentic execution contract active:
runId=6e9f7eb1-5418-4d5c-aabc-df8a1e7f7619
sessionId=9578d939-b2fd-4ec9-b65b-8a93348ca570
provider=openai-codex/gpt-5.5 harness=codex



2026-05-19T08:17:38.070Z [diagnostic]
stalled session:
sessionId=9578d939-b2fd-4ec9-b65b-8a93348ca570
sessionKey=agent:main:telegram:direct:287384854
state=processing age=129s queueDepth=1
reason=active_work_without_progress
classification=stalled_agent_run
activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:item/completed
lastProgressAge=129s
recovery=none



2026-05-19T08:18:08.068Z [diagnostic]
stalled session:
sessionId=9578d939-b2fd-4ec9-b65b-8a93348ca570
sessionKey=agent:main:telegram:direct:287384854
state=processing age=159s queueDepth=1
reason=active_work_without_progress
classification=stalled_agent_run
activeWorkKind=embedded_run
lastProgress=codex_app_server:notification:item/completed
lastProgressAge=159s
recovery=none



2026-05-19T08:18:29.525Z [agent/embedded]
codex app-server turn idle timed out waiting for completion
{
  threadId: "019e3ef4-0e36-7b32-b9e1-36b98cc115a8",
  turnId: "019e3f4d-7f38-74e2-82fc-2557e24a98b1",
  idleMs: 180001,
  timeoutMs: 180000,
  lastActivityReason: "notification:item/completed",
  lastNotificationMethod: "item/completed"
}



2026-05-19T08:18:30.061Z [agent/embedded]
embedded run failover decision
{
  runId: "6e9f7eb1-5418-4d5c-aabc-df8a1e7f7619",
  stage: "assistant",
  decision: "surface_error",
  failoverReason: "timeout",
  profileFailureReason: "timeout",
  provider: "openai-codex",
  model: "gpt-5.5",
  fallbackConfigured: false,
  timedOut: true,
  aborted: true
}
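The timeout entries above carry enough information to recover when the last activity was seen: subtract `idleMs` from the timestamp of the timeout log line. A short worked calculation, using only values from the excerpts (the helper name `last_activity` is illustrative):

```python
from datetime import datetime, timedelta, timezone


def last_activity(timeout_logged_at: str, idle_ms: int) -> datetime:
    """Return the time of the last observed activity before the watchdog fired."""
    fired = datetime.strptime(timeout_logged_at, "%Y-%m-%dT%H:%M:%S.%fZ")
    return fired.replace(tzinfo=timezone.utc) - timedelta(milliseconds=idle_ms)


# Discord run: timeout logged at 08:07:43.435Z with idleMs 180003.
discord = last_activity("2026-05-19T08:07:43.435Z", 180003)
# Telegram run: timeout logged at 08:18:29.525Z with idleMs 180001.
telegram = last_activity("2026-05-19T08:18:29.525Z", 180001)

print(discord.isoformat(timespec="milliseconds"))   # 2026-05-19T08:04:43.432+00:00
print(telegram.isoformat(timespec="milliseconds"))  # 2026-05-19T08:15:29.524+00:00
```

For the Discord run this puts the last `item/completed` roughly 39 seconds after the run started at 08:04:04.805Z, and about 141 seconds before the 08:07:04.822Z diagnostic, which matches its `lastProgressAge=141s`.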

Impact and severity

Severity: high for user-facing chat lanes using Codex app-server.

Impact:

- User-facing Discord/Telegram lanes can appear silent or stuck.
- Real work may be dropped after a completed tool call.
- Diagnostics say `recovery=none`, leaving no clear user-facing recovery path.
- The failure can be confused with a user/UI abort even though logs show `turn_completion_idle_timeout`.
- Increasing `turnCompletionIdleTimeoutMs` only delays the abort; it does not solve recovery.

Additional information

Why #78756 and #82171 do not fully cover this:

The fixes appear to be present and working in a narrow sense:

- account/rate-limit updates are not prolonging this stall indefinitely
- the session does not wait for the 30-minute terminal cap
- the configured completion-idle watchdog fires

However, that still leaves a correctness/recovery gap:

- productive work can be aborted after the last observed `item/completed`
- no useful visible recovery is emitted
- no resume/retry path is provided
- the lane is not self-healing in a user-meaningful way

This looks like a remaining bug adjacent to #82171: the fail-fast behavior prevents long hangs, but it does not provide correct turn semantics or recovery when turn/completed is missing.

Suggested fix direction:

1. Preserve and expose a structured recovery result when `turn_completion_idle_timeout` fires after `item/completed`.
2. Emit a visible channel message when a user-facing lane aborts due to missing `turn/completed`, including the last completed item/tool and retry guidance.
3. Add a retry/resume mechanism that restarts the turn with a compact summary of already-completed tool calls and their results.
4. Improve app-server protocol handling so that if the final observed current-turn item is a tool result, OpenClaw does not treat silence as terminal without preserving recovery.
5. Add diagnostics that distinguish:
   - `turn/completed` missing after assistant final text
   - `turn/completed` missing after tool result where more assistant work is expected
   - raw response completion stalls
   - user/UI aborts
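The retry/resume idea could be sketched as building a compact summary of the completed tool calls to seed a restarted turn. This is an illustration only: `CompletedToolCall` and `build_resume_summary` are hypothetical names, and the wording of the summary is an assumption rather than anything OpenClaw emits today.

```python
from dataclasses import dataclass


@dataclass
class CompletedToolCall:
    """Hypothetical record of a tool call that finished before the stall."""
    tool: str
    command: str
    exit_code: int


def build_resume_summary(cause: str, completed: list) -> str:
    """Build a compact summary that a restarted turn could be seeded with."""
    lines = [f"Previous turn ended without turn/completed (cause: {cause})."]
    if completed:
        lines.append("Already completed tool calls:")
        for call in completed:
            lines.append(f"- {call.tool}: {call.command} (exit code {call.exit_code})")
    else:
        lines.append("No tool calls completed before the stall.")
    lines.append("Continue the original task; do not repeat completed steps.")
    return "\n".join(lines)


# The single tool call recorded in the Discord reproduction.
summary = build_resume_summary(
    "turn_completion_idle_timeout",
    [CompletedToolCall(
        "bash",
        "mkdir -p /home/casper/.openclaw/workspace/artifacts/"
        "maria-ward-smartphone-start/site/assets/img",
        0,
    )],
)
print(summary)
```

The same summary text could double as the visible channel message, since it names the logged cause and the last completed tool call.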
