openclaw - 💡(How to fix) Fix Codex app-server startup retries can exhaust before replacement server is ready [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#83959Fetched 2026-05-20 03:46:02
View on GitHub
Comments
1
Participants
2
Timeline
7
Reactions
1
Author
Timeline (top)
labeled ×4commented ×1mentioned ×1subscribed ×1

On OpenClaw v2026.5.18, a scheduled background agent turn using the Codex harness failed during app-server startup with:

Error: codex app-server client is closed

The local logs showed repeated startup-close handling in the same shape:

codex app-server connection closed during startup; restarting app-server and retrying

After the retry budget was exhausted, a later turn succeeded shortly afterward, which suggests the replacement app-server may still have been warming up when the immediate retries ran.

Error Message

Error: codex app-server client is closed

Root Cause

A scheduled/background turn can fail before reaching user work, even though the Codex route becomes viable moments later. This is especially noisy for cron-style automation because it reports a job failure even when there is no task-level or repository-level problem.

Fix Action

Fix / Workaround

Local mitigation tested

A local mitigation increased the startup-close retry budget and added a short bounded backoff between startup retry attempts. After restarting the gateway, a trivial Codex agent smoke test completed successfully and no new startup-close exhaustion was observed.

Code Example

Error: codex app-server client is closed

---

codex app-server connection closed during startup; restarting app-server and retrying
RAW_BUFFERClick to expand / collapse

Summary

On OpenClaw v2026.5.18, a scheduled background agent turn using the Codex harness failed during app-server startup with:

Error: codex app-server client is closed

The local logs showed repeated startup-close handling in the same shape:

codex app-server connection closed during startup; restarting app-server and retrying

After the retry budget was exhausted, a later turn succeeded shortly afterward, which suggests the replacement app-server may still have been warming up when the immediate retries ran.

Impact

A scheduled/background turn can fail before reaching user work, even though the Codex route becomes viable moments later. This is especially noisy for cron-style automation because it reports a job failure even when there is no task-level or repository-level problem.

Expected behavior

When the app-server connection closes during startup, OpenClaw should tolerate a short replacement-server warm-up window before declaring the Codex harness failed.

Useful behavior would be:

  • retry with a small backoff instead of immediately reusing the same failing window
  • use a retry budget large enough for the replacement app-server to bind and become ready
  • preserve the hard failure if the app-server remains unavailable after that bounded window
  • make the terminal error distinguish startup lifecycle exhaustion from task/model failure

Local mitigation tested

A local mitigation increased the startup-close retry budget and added a short bounded backoff between startup retry attempts. After restarting the gateway, a trivial Codex agent smoke test completed successfully and no new startup-close exhaustion was observed.

Related

Possibly related to #79495, which covers shared Codex app-server client eviction across agents. This issue is narrower: even when restarting the app-server is the intended recovery path, the startup retry loop can exhaust before the replacement process is ready.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

When the app-server connection closes during startup, OpenClaw should tolerate a short replacement-server warm-up window before declaring the Codex harness failed.

Useful behavior would be:

  • retry with a small backoff instead of immediately reusing the same failing window
  • use a retry budget large enough for the replacement app-server to bind and become ready
  • preserve the hard failure if the app-server remains unavailable after that bounded window
  • make the terminal error distinguish startup lifecycle exhaustion from task/model failure

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING