openclaw - 💡(How to fix) Fix Gateway restart can leave Codex custom tool call without output [1 pull requests]


A gateway restart can abort an in-flight Codex app-server custom tool call after Codex has persisted the custom_tool_call item but before OpenClaw records the matching custom_tool_call_output. The next Codex app-server resume keeps loading that invalid native transcript and repeatedly logs:

Custom tool call output is missing for call id: call_<redacted>

Error Message

2026-05-20T21:31:38.479Z WARN still draining 2 active task(s) and 1 active embedded run(s) before restart
2026-05-20T21:31:38.502Z WARN wait for active embedded runs timed out...
2026-05-20T21:31:38.506Z WARN active embedded run drain grace reached; aborting active run(s) before restart
2026-05-20T21:31:38.522Z WARN drain timeout reached; proceeding with restart

first: 2026-05-20T21:33:12.565Z codex app-server stderr: ... ERROR codex_core::util: Custom tool call output is missing for call id: call_<redacted>
last:  2026-05-20T23:12:07.703Z codex app-server stderr: ... ERROR codex_core::util: Custom tool call output is missing for call id: call_<redacted>

This produces repeated scary ERROR stderr entries in the gateway log even though the gateway is healthy. It also risks making future Codex app-server resume behavior depend on an invalid provider-owned transcript until someone manually edits the rollout JSONL.

Root Cause

The restart drain timed out and aborted the embedded run in the window after Codex had persisted the custom_tool_call item but before OpenClaw recorded the matching custom_tool_call_output. The rollout transcript is left with an orphaned call, so every later Codex app-server resume reloads it and logs:

Custom tool call output is missing for call id: call_<redacted>

Fix Action

Fixed

Proof from a live OpenClaw 2026.5.19 update

During the local update/restart, the gateway drain timed out while an embedded run was active:

2026-05-20T21:31:38.479Z WARN still draining 2 active task(s) and 1 active embedded run(s) before restart
2026-05-20T21:31:38.502Z WARN wait for active embedded runs timed out...
2026-05-20T21:31:38.506Z WARN active embedded run drain grace reached; aborting active run(s) before restart
2026-05-20T21:31:38.522Z WARN drain timeout reached; proceeding with restart

After restart, the app-server repeatedly reported the missing output. Redacted local log proof:

custom missing count: 29
first: 2026-05-20T21:33:12.565Z codex app-server stderr: ... ERROR codex_core::util: Custom tool call output is missing for call id: call_<redacted>
last:  2026-05-20T23:12:07.703Z codex app-server stderr: ... ERROR codex_core::util: Custom tool call output is missing for call id: call_<redacted>
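
A summary like the one above can be produced from the gateway log with a short script. The following is a minimal sketch, assuming each log line starts with an ISO-8601 UTC timestamp as in the excerpts; the function name `summarize_missing_outputs` and the shape of the returned dictionary are illustrative and not part of OpenClaw or Codex.

```python
import re
from typing import Iterable, Optional

# Matches the Codex app-server error text quoted in this report.
MISSING_OUTPUT = re.compile(r"Custom tool call output is missing for call id: (\S+)")
# Assumption: every log line begins with an ISO-8601 UTC timestamp.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z)")


def summarize_missing_outputs(lines: Iterable[str]) -> dict:
    """Count missing-output errors and record the first and last timestamps."""
    count = 0
    first: Optional[str] = None
    last: Optional[str] = None
    call_ids = set()
    for line in lines:
        match = MISSING_OUTPUT.search(line)
        if not match:
            continue
        count += 1
        call_ids.add(match.group(1))
        stamp = TIMESTAMP.match(line)
        if stamp:
            if first is None:
                first = stamp.group(1)
            last = stamp.group(1)
    return {"count": count, "first": first, "last": last, "call_ids": sorted(call_ids)}
```

Passing an open log file to the function (a file object iterates line by line) yields the count, the first and last timestamps, and the distinct call ids, which helps confirm whether a single orphaned call explains every error.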

The persisted Codex native transcript confirmed the orphan:

$OPENCLAW_HOME/agents/<agent-id>/agent/codex-home/sessions/2026/05/20/rollout-<timestamp>-<thread-id>.jsonl
line 28: response_item payload.type=custom_tool_call call_id=call_<redacted> name=exec
(no matching custom_tool_call_output existed before manual local repair)
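
The same check can be scripted against a rollout file. This is a hedged sketch: it assumes each JSONL line is an object with a top-level `type` of `response_item` and a `payload` carrying `type` and `call_id`, which is inferred from the summary line above rather than from a documented Codex schema. The helper name `find_orphaned_calls` is hypothetical.

```python
import json
from typing import Iterable, List, Tuple


def find_orphaned_calls(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, call_id) for custom_tool_call items with no output."""
    pending = {}  # call_id -> 1-based line number where the call was persisted
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that do not parse, e.g. a truncated final write
        if not isinstance(record, dict) or record.get("type") != "response_item":
            continue
        payload = record.get("payload")
        if not isinstance(payload, dict):
            continue
        call_id = payload.get("call_id")
        if payload.get("type") == "custom_tool_call" and call_id:
            pending.setdefault(call_id, number)
        elif payload.get("type") == "custom_tool_call_output":
            pending.pop(call_id, None)  # the call has a matching output
    return sorted((number, call_id) for call_id, number in pending.items())
```

An empty result means every custom tool call in the file has a matching output; any returned pair points at the line to inspect before the next resume.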

The gateway itself recovered and stayed healthy, so the problem is transcript hygiene after a restart abort, not gateway liveness:

GET /readyz -> 200 {"ready":true,...}
GET /healthz -> 200
dashboard localhost -> 200

Expected behavior

OpenClaw should not leave Codex-owned native transcripts in a state that Codex rejects on every future resume. On restart abort/recovery, it should either:

  • avoid interrupting between custom tool call persistence and output persistence, or
  • repair the bound Codex rollout transcript by inserting a synthetic interrupted/cancelled custom_tool_call_output for orphaned custom_tool_call records before resume.
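
The second option can be sketched as a pure function over the rollout lines. This is an illustration under stated assumptions, not the fix that was merged: the record layout (`type`, `payload.type`, `payload.call_id`) is inferred from the transcript summary in this report, the `output` field name and the note text are guesses, and real rollout records may carry additional fields (such as timestamps) that a synthetic entry would also need. The function name `repair_rollout` is hypothetical.

```python
import json
from typing import List


def repair_rollout(lines: List[str], note: str = "aborted by gateway restart") -> List[str]:
    """Return the rollout lines with a synthetic output after each orphaned call."""
    parsed = []
    answered = set()  # call ids that already have an output record
    for raw in lines:
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            record = None  # keep unparseable lines untouched
        payload = None
        if isinstance(record, dict) and record.get("type") == "response_item":
            candidate = record.get("payload")
            if isinstance(candidate, dict):
                payload = candidate
        parsed.append((raw, payload))
        if payload and payload.get("type") == "custom_tool_call_output":
            answered.add(payload.get("call_id"))

    repaired = []
    for raw, payload in parsed:
        repaired.append(raw)
        if payload and payload.get("type") == "custom_tool_call":
            call_id = payload.get("call_id")
            if call_id and call_id not in answered:
                # Assumed shape for the synthetic record; adjust to the real schema.
                synthetic = {
                    "type": "response_item",
                    "payload": {
                        "type": "custom_tool_call_output",
                        "call_id": call_id,
                        "output": note,
                    },
                }
                repaired.append(json.dumps(synthetic))
                answered.add(call_id)
    return repaired
```

The function is idempotent: running it on already repaired lines returns them unchanged. Returning a new list rather than editing the file in place leaves it to the caller to back up the original and write the result only while the app-server is not using the transcript.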

Impact

This produces repeated scary ERROR entries on stderr in the gateway log even though the gateway is healthy. It also risks leaving future Codex app-server resumes dependent on an invalid provider-owned transcript until someone manually edits the rollout JSONL.
