openclaw - [Bug]: Codex warm turns spend ~7.5s in auth/start-options/tool setup before prompt submission


Summary

On OpenClaw 2026.5.18, Codex warm turns can still spend most of their latency budget in OpenClaw harness setup before the prompt is submitted to the already-running Codex app-server.

This is distinct from a full Codex app-server cold start. In the measured warm turn, the shared Codex app-server client was reused successfully, but OpenClaw still spent about 7.5s before prompt.submitted. The actual Codex/model completion wait was only about 3.1s.

Environment

  • OpenClaw: 2026.5.18
  • Commit: 50a2481652b6a62d573ece3cead60400dc77020d
  • Platform: Linux x64, Docker gateway
  • Runtime/provider: Codex app-server / openai-codex
  • Model: gpt-5.5
  • Channel: Telegram and Control UI both show the slow turns; this report focuses on the Codex harness path, not Telegram delivery.
  • Gateway process was warm and healthy.
  • Codex app-server child process remained alive between turns.

Observed warm-turn timing

Local temporary instrumentation around the Codex app-server harness produced this trace for a short test prompt on an already-warm session:

Total harness send:                ~10.8s
Prompt submitted at:               ~7.5s
Actual completion wait:            ~3.1s

Pre-submit overhead:
  auth_cache:                      ~1.5s
  dynamic_tools.build:             ~1.9s
  getSharedCodexAppServerClient:   ~3.4s
    auth_profile.resolve:          ~1.6s
    managed_options.resolve:       ~0.16s
    bridge_options.resolve:        ~1.6s
    key.build:                     ~0.003s
  thread_lifecycle:                ~0.1s
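
As a quick sanity check on the trace, the top-level pre-submit phases can be summed and compared with the prompt.submitted timestamp. The sketch below only reuses the approximate figures reported above; the variable names are illustrative, and because the measurements are rounded the results are approximate as well.

```typescript
// Approximate phase durations in seconds, copied from the trace.
const topLevelPhases: Array<[string, number]> = [
  ["auth_cache", 1.5],
  ["dynamic_tools.build", 1.9],
  ["getSharedCodexAppServerClient", 3.4],
  ["thread_lifecycle", 0.1],
];

// Sub-phases reported inside getSharedCodexAppServerClient.
const sharedClientSubPhases: Array<[string, number]> = [
  ["auth_profile.resolve", 1.6],
  ["managed_options.resolve", 0.16],
  ["bridge_options.resolve", 1.6],
  ["key.build", 0.003],
];

// Add up the durations of a list of [name, seconds] pairs.
function sumSeconds(phases: Array<[string, number]>): number {
  return phases.reduce((acc, [, seconds]) => acc + seconds, 0);
}

const promptSubmittedAt = 7.5;
const totalHarnessSend = 10.8;

// About 6.9s of the 7.5s is attributed to a named phase.
const attributed = sumSeconds(topLevelPhases);
// About 0.6s before prompt.submitted is not covered by the listed phases.
const unattributed = promptSubmittedAt - attributed;
// About 3.36s, consistent with the ~3.4s reported for the parent phase.
const subPhaseTotal = sumSeconds(sharedClientSubPhases);
// Roughly 0.69: about two thirds of the turn elapses before submission.
const preSubmitShare = promptSubmittedAt / totalHarnessSend;

console.log({ attributed, unattributed, subPhaseTotal, preSubmitShare });
```

Read this way, the named phases explain most of the pre-submit delay, and the shared-client sub-phases add up to their parent.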

The shared-client trace showed that the app-server client itself was warm and reused:

phase=lookup clients=1 hasEntry=true hasClient=true hasPromise=true transport=stdio commandSource=resolved-managed
phase=return_client reusedClient=true reusedPromise=true clients=1

So the issue is not simply that every turn starts a new app-server. The expensive part is the per-turn setup work before reaching the cache hit and before submitting the prompt.
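
For anyone collecting similar traces, a small parser makes it easy to check mechanically that a turn was a warm cache hit. This is a minimal sketch that assumes the key=value line shape shown above; that shape comes from local temporary instrumentation, not from a stable OpenClaw log format, and parseTraceLine and isWarmCacheHit are hypothetical helper names.

```typescript
// Parse a "key=value key=value" diagnostic line into a flat record.
function parseTraceLine(line: string): Record<string, string> {
  const fields: Record<string, string> = {};
  for (const token of line.trim().split(/\s+/)) {
    const eq = token.indexOf("=");
    if (eq > 0) {
      fields[token.slice(0, eq)] = token.slice(eq + 1);
    }
  }
  return fields;
}

// A turn counts as a warm cache hit when the lookup found a live client
// and the return phase reports that the client was reused.
function isWarmCacheHit(lookupLine: string, returnLine: string): boolean {
  const lookup = parseTraceLine(lookupLine);
  const ret = parseTraceLine(returnLine);
  return (
    lookup.phase === "lookup" &&
    lookup.hasEntry === "true" &&
    lookup.hasClient === "true" &&
    ret.phase === "return_client" &&
    ret.reusedClient === "true"
  );
}

const lookupLine =
  "phase=lookup clients=1 hasEntry=true hasClient=true hasPromise=true transport=stdio commandSource=resolved-managed";
const returnLine =
  "phase=return_client reusedClient=true reusedPromise=true clients=1";

const warmHit = isWarmCacheHit(lookupLine, returnLine);
console.log(warmHit);
```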

Source-level review

The current source appears to match the trace:

extensions/codex/src/app-server/run-attempt.ts

The run startup path resolves auth/account state and builds tools every turn:

resolveCodexAppServerAuthAccountCacheKey(...)
loadCodexBundleMcpThreadConfig(...)
buildDynamicTools(...)

Relevant area:

run-attempt.ts
  startupAuthProfileId = resolveCodexAppServerAuthProfileIdForAgent(...)
  startupAuthAccountCacheKey = await resolveCodexAppServerAuthAccountCacheKey(...)
  bundleMcpThreadConfig = await loadCodexBundleMcpThreadConfig(...)
  tools = await buildDynamicTools(...)

buildDynamicTools(...) dynamically imports/constructs the OpenClaw coding tool surface each turn, then filters it for model/plugin/allowlist/vision inputs.

extensions/codex/src/app-server/shared-client.ts

getSharedCodexAppServerClient(...) resolves auth profile, managed start options, and bridged start options before looking in the shared-client map:

agentDir = ...
authProfileId = resolveCodexAppServerAuthProfileIdForAgent(...)
managedStartOptions = await resolveManagedCodexAppServerStartOptions(...)
startOptions = await bridgeCodexAppServerStartOptions(...)
key = codexAppServerStartOptionsKey(...)
entry = getOrCreateSharedClientEntry(state, key)

That means even a warm shared client still pays the auth/start-options cost before discovering that the client can be reused.
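
The ordering problem can be pictured with a small synchronous model. This is a sketch under stated assumptions, not the OpenClaw implementation: WarmFirstClientCache, Resolvers, resolveStartOptionsKey, and startClient are all hypothetical stand-ins, and the real functions are asynchronous. It remembers the last resolved key per agentDir so that a warm turn can reach the client map without re-running resolution.

```typescript
// All names below are hypothetical stand-ins, not OpenClaw APIs.
type Client = { id: number };

interface Resolvers {
  // Stands in for the auth-profile plus managed/bridged start-options work.
  resolveStartOptionsKey(agentDir: string): string;
  startClient(key: string): Client;
}

class WarmFirstClientCache {
  private readonly keyByAgentDir = new Map<string, string>();
  private readonly clientByKey = new Map<string, Client>();
  private readonly resolvers: Resolvers;

  constructor(resolvers: Resolvers) {
    this.resolvers = resolvers;
  }

  getClient(agentDir: string): Client {
    // Fast path: reuse the previously resolved key and skip resolution.
    const cachedKey = this.keyByAgentDir.get(agentDir);
    if (cachedKey !== undefined) {
      const warm = this.clientByKey.get(cachedKey);
      if (warm !== undefined) return warm;
    }
    // Slow path: resolve first, then look up or start the client.
    const key = this.resolvers.resolveStartOptionsKey(agentDir);
    this.keyByAgentDir.set(agentDir, key);
    let client = this.clientByKey.get(key);
    if (client === undefined) {
      client = this.resolvers.startClient(key);
      this.clientByKey.set(key, client);
    }
    return client;
  }

  // Must be called whenever auth or config changes; otherwise the fast path
  // could return a client that was started with stale options.
  invalidate(agentDir: string): void {
    this.keyByAgentDir.delete(agentDir);
  }
}

let resolveCalls = 0;
let startCalls = 0;
const clientCache = new WarmFirstClientCache({
  resolveStartOptionsKey: (agentDir) => {
    resolveCalls += 1;
    return `key:${agentDir}`;
  },
  startClient: () => {
    startCalls += 1;
    return { id: startCalls };
  },
});

const firstClient = clientCache.getClient("/agents/a");  // cold: resolves and starts
const secondClient = clientCache.getClient("/agents/a"); // warm: no resolver call
clientCache.invalidate("/agents/a");
const thirdClient = clientCache.getClient("/agents/a");  // re-resolves, same key, same client
console.log({ resolveCalls, startCalls });
```

The trade-off is correctness of invalidation: the fast path is only safe if every auth or config change reliably clears the remembered key.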

extensions/codex/src/app-server/auth-bridge.ts

resolveCodexAppServerAuthProfileIdForAgent(...) calls:

ensureCodexAppServerAuthProfileStore(...)
resolveCodexAppServerAuthProfileId(...)

bridgeCodexAppServerStartOptions(...) then calls ensureCodexAppServerAuthProfileStore(...) again and checks whether inherited OpenAI API-key env vars should be cleared.

In the measured warm turn, these two phases (auth_profile.resolve and bridge_options.resolve, about 1.6s each) together accounted for roughly 3.2s of the pre-submit delay.
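
One way to avoid preparing the auth profile store twice within a single turn is a per-turn memo that both callers share. The following is a minimal synchronous sketch: TurnContext, ensureStore, resolveAuthProfileId, and bridgeStartOptions are hypothetical stand-ins for illustration and do not mirror the real signatures in auth-bridge.ts.

```typescript
// Hypothetical per-turn context: memoizes expensive lookups for one turn only.
class TurnContext {
  private readonly memo = new Map<string, unknown>();

  once<T>(name: string, compute: () => T): T {
    if (this.memo.has(name)) return this.memo.get(name) as T;
    const value = compute();
    this.memo.set(name, value);
    return value;
  }
}

type AuthProfileStore = { defaultProfile: string };

let storeLoads = 0;
// Stand-in for the expensive store preparation; not the real function.
function ensureStore(agentDir: string): AuthProfileStore {
  storeLoads += 1;
  return { defaultProfile: `${agentDir}#default` };
}

// First caller: resolves the profile id from the store.
function resolveAuthProfileId(ctx: TurnContext, agentDir: string): string {
  const store = ctx.once(`store:${agentDir}`, () => ensureStore(agentDir));
  return store.defaultProfile;
}

// Second caller: needs the same store and reuses the memoized result.
function bridgeStartOptions(ctx: TurnContext, agentDir: string): { authProfileId: string } {
  const store = ctx.once(`store:${agentDir}`, () => ensureStore(agentDir));
  return { authProfileId: store.defaultProfile };
}

const turn = new TurnContext();
const profileId = resolveAuthProfileId(turn, "/agents/a");
const bridged = bridgeStartOptions(turn, "/agents/a");
console.log({ storeLoads, profileId, bridged });
```

Because the memo lives only for one turn, it cannot serve stale auth state across turns. With asynchronous functions, the same idea applies by memoizing the promise rather than the resolved value.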

Expected behavior

After the Codex app-server client is warm and the session is already bound, a trivial turn should not spend multiple seconds re-resolving stable auth/profile/start-options/tool-schema state before prompt.submitted.

The warm path should get much closer to Codex App/CLI latency on the same account, especially for short prompts with no tool calls.

Actual behavior

A warm turn can still spend most of its latency before prompt submission:

~7.5s OpenClaw setup before prompt.submitted
~3.1s actual Codex/model completion

This makes interactive Telegram/Control UI turns feel much slower than the native Codex app, even when the model response itself is quick.

Why this matters

From an operator perspective, this looks like “Codex/OpenClaw is slow” or “the VPS is too weak,” but the trace suggests a more specific warm-path harness overhead:

  • auth/profile resolution is repeated
  • bridge/start-options resolution is repeated
  • dynamic tool schema construction is repeated
  • shared-client cache lookup happens only after some expensive preparation

This also makes self-hosted gateways more sensitive to CPU pressure because the gateway main process does substantial per-turn setup work before the model call begins.

Proposed fix direction

Potential improvements:

  1. Cache/memoize Codex auth profile resolution and auth account cache key by agentDir + authProfileId + authProfileStore/config fingerprint, with invalidation when auth/config changes.
  2. Move the shared-client cache lookup earlier, or cache the resolved shared-client key/start options so a warm client does not pay auth_profile.resolve and bridge_options.resolve on every turn.
  3. Cache dynamic tool specs for stable agentId/sessionKey/model/tool allowlist/plugin config inputs, or split static tool schema construction from per-turn message/channel context.
  4. Add first-class diagnostics for:
    • auth_cache
    • dynamic_tools.build
    • client_factory.pre_lookup
    • client_factory.cache_hit
    • prompt.submitted
    • completion.wait

That would let operators distinguish actual model latency from OpenClaw harness overhead.
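
The caching ideas in items 1 and 3 share one shape: reuse a value across turns while a fingerprint of the underlying auth/config state is unchanged. The sketch below illustrates that shape for tool specs. FingerprintCache, ToolSpecInputs, toolSpecKey, and buildToolSpecs are hypothetical names, and the fingerprint strings are placeholders for whatever hash of the relevant config would actually be used.

```typescript
// Hypothetical cross-turn cache: an entry is reused only while the caller's
// fingerprint of the underlying auth/config state is unchanged.
class FingerprintCache<V> {
  private readonly entries = new Map<string, { fingerprint: string; value: V }>();

  get(key: string, fingerprint: string, compute: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.fingerprint === fingerprint) return hit.value;
    const value = compute();
    this.entries.set(key, { fingerprint, value });
    return value;
  }
}

interface ToolSpecInputs {
  agentId: string;
  model: string;
  allowlist: string[];
}

// Stable key from the inputs that shape the tool schema; sorting a copy of
// the allowlist makes its order irrelevant.
function toolSpecKey(inputs: ToolSpecInputs): string {
  return JSON.stringify([inputs.agentId, inputs.model, inputs.allowlist.slice().sort()]);
}

let toolBuilds = 0;
// Stand-in for the expensive schema construction; not the real buildDynamicTools.
function buildToolSpecs(inputs: ToolSpecInputs): string[] {
  toolBuilds += 1;
  return inputs.allowlist.map((name) => `${inputs.model}:${name}`);
}

const toolCache = new FingerprintCache<string[]>();
const toolInputs: ToolSpecInputs = { agentId: "main", model: "gpt-5.5", allowlist: ["read", "edit"] };
const reordered: ToolSpecInputs = { agentId: "main", model: "gpt-5.5", allowlist: ["edit", "read"] };

const specsA = toolCache.get(toolSpecKey(toolInputs), "config-v1", () => buildToolSpecs(toolInputs));
// Same key (order-insensitive) and same fingerprint: reused without rebuilding.
const specsB = toolCache.get(toolSpecKey(reordered), "config-v1", () => buildToolSpecs(reordered));
// Fingerprint changed: the entry is rebuilt.
const specsC = toolCache.get(toolSpecKey(toolInputs), "config-v2", () => buildToolSpecs(toolInputs));
console.log({ toolBuilds });
```

Per-turn message or channel context would stay outside the cached value and be applied after lookup, which is the split that item 3 describes.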

Related issues

Related but not identical:

  • #78947: older broader report about native Codex runtime latency on 2026.5.6.
  • #84037: steady-state Codex app-server CPU/helper overhead.
  • #84662: runaway native Codex history growth from persisted per-turn runtime context.

This issue is specifically about warm-path per-turn setup overhead before prompt submission on 2026.5.18, even when the app-server shared client is already alive and reused.

Suggested test

Add a harness-level test or diagnostic smoke test that runs two consecutive Codex app-server turns on the same native thread and records:

client cache state
pre-lookup auth/start-options time
dynamic tool build time
prompt.submitted timestamp
completion.wait duration

The test should assert that warm-client setup on the second turn stays below a small threshold, or at least expose the timings in diagnostics.
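
The shape of such a test could look like the sketch below. It is only a skeleton: PhaseRecorder, runSimulatedTurn, and the 250ms budget are hypothetical, the turn is simulated with a fake clock rather than a real Codex app-server, and the phase names are borrowed from the diagnostics proposed above.

```typescript
// Hypothetical phase recorder with an injectable clock so the test is deterministic.
class PhaseRecorder {
  readonly durationsMs = new Map<string, number>();
  private readonly now: () => number;

  constructor(now: () => number) {
    this.now = now;
  }

  measure<T>(phase: string, work: () => T): T {
    const start = this.now();
    try {
      return work();
    } finally {
      const elapsed = this.now() - start;
      this.durationsMs.set(phase, (this.durationsMs.get(phase) ?? 0) + elapsed);
    }
  }
}

// Fake clock: simulated work advances time instead of sleeping.
let fakeNowMs = 0;
const advance = (ms: number): void => {
  fakeNowMs += ms;
};

// Simulated turn; a real test would drive the actual harness here.
function runSimulatedTurn(recorder: PhaseRecorder, warm: boolean): void {
  recorder.measure("client_factory.pre_lookup", () => advance(warm ? 5 : 3400));
  recorder.measure("dynamic_tools.build", () => advance(warm ? 10 : 1900));
  recorder.measure("completion.wait", () => advance(3100));
}

// Everything except the model wait counts as pre-submit setup.
function preSubmitMs(recorder: PhaseRecorder): number {
  let total = 0;
  recorder.durationsMs.forEach((ms, phase) => {
    if (phase !== "completion.wait") total += ms;
  });
  return total;
}

const coldTurn = new PhaseRecorder(() => fakeNowMs);
runSimulatedTurn(coldTurn, false);
const warmTurn = new PhaseRecorder(() => fakeNowMs);
runSimulatedTurn(warmTurn, true);

const WARM_SETUP_BUDGET_MS = 250; // arbitrary illustrative threshold
const coldSetupMs = preSubmitMs(coldTurn);
const warmSetupMs = preSubmitMs(warmTurn);
if (warmSetupMs >= WARM_SETUP_BUDGET_MS) {
  throw new Error(`warm setup took ${warmSetupMs}ms, budget is ${WARM_SETUP_BUDGET_MS}ms`);
}
console.log({ coldSetupMs, warmSetupMs });
```

Even without a hard threshold, recording the per-phase durations for both turns gives operators the separation between harness overhead and completion wait that this report asks for.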
