openclaw: Improve Codex app-server steady-state CPU and helper process overhead [1 comment, 2 participants]

GitHub stats: openclaw/openclaw#84037, fetched 2026-05-20 03:44:56. Comments: 1. Participants: 2. Timeline events: 7 (labeled ×6, commented ×1). Reactions: 1.

Summary

On OpenClaw 2026.5.18, a gateway using the Codex runtime can still show substantial steady CPU use from the gateway process and the Codex app-server, plus short-lived openclaw / openclaw-hooks child processes that briefly consume high CPU.

This is visible even when the gateway /health endpoint is fast on a quiet host. The user-facing concern is that interactive sessions feel slower or more jittery than expected, especially under host contention or when multiple CLI/tool probes are spawned.

This is related to, but not exactly the same as:

  • #78947: native Codex runtime latency for trivial turns
  • #79495: Codex app-server shared client eviction across agents
  • #82070: post-2026.5.12 CLI cold-start regression

The ask here is to improve the Codex app-server/runtime path's steady-state CPU profile and reduce the amount of expensive child process / hook work needed during normal interactive gateway use.

Environment

  • OpenClaw: 2026.5.18 (50a2481)
  • Platform: Linux x64
  • Node: v22.22.1
  • Runtime: OpenAI Codex / openai-codex/gpt-5.5
  • Gateway mode: long-running local gateway on port 18789
  • Useful env already enabled:
    • NODE_COMPILE_CACHE=/var/tmp/openclaw-compile-cache
    • OPENCLAW_NO_RESPAWN=1
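
For anyone reproducing this setup from a script, the two settings above can be applied to a child process environment as in the minimal sketch below. The helper name `gateway_env` is hypothetical, and the command used to start the gateway is not given in this report, so it is left to the caller.

```python
import os


def gateway_env(base=None, cache_dir="/var/tmp/openclaw-compile-cache"):
    """Return a copy of the environment with the two settings from this report applied."""
    env = dict(os.environ if base is None else base)
    # Directory Node uses for its on-disk compile cache.
    env["NODE_COMPILE_CACHE"] = cache_dir
    # Value exactly as listed in the Environment section of this report.
    env["OPENCLAW_NO_RESPAWN"] = "1"
    return env
```

A caller would pass the result as `env=gateway_env()` to `subprocess.run` together with whatever start command the local install uses.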

Private account names, hostnames, channel names, and local workspace paths are intentionally omitted.

Measurements

Baseline with full bundled extensions

Before pruning bundled extensions:

  • dist/extensions: 94 directories / 1,566 files / 8.0 MB
  • /health idle, 20 samples:
    • min: 3 ms
    • p50: 5 ms
    • p95: 7 ms
    • max: 9 ms
    • avg: 6 ms
  • openclaw --version, 5 samples:
    • min: 1.528 s
    • p50: 1.644 s
    • max: 2.734 s
    • avg: 1.961 s
  • 10s CPU sample:
    • gateway node process: ~29%
    • Codex app-server process: ~28%
  • Gateway RSS: roughly 850-910 MB during the test window
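
The /health figures above could be collected with something like the following sketch. It is not the tool used for this report: the loopback URL (port 18789 and the /health path are from this report, the host and plain HTTP are assumptions), the pause between requests, and the nearest-rank percentile method are all illustrative choices.

```python
import time
import urllib.request


def summarize(samples_ms):
    """Return min/p50/p95/max/avg for a non-empty list of latencies in ms."""
    ordered = sorted(samples_ms)

    def pct(p):
        # Nearest-rank percentile, computed with integer arithmetic:
        # rank = ceil(p * n / 100), clamped to at least 1.
        rank = max(1, (p * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "p50": pct(50),
        "p95": pct(95),
        "max": ordered[-1],
        "avg": sum(ordered) / len(ordered),
    }


def sample_health(url="http://127.0.0.1:18789/health", count=20, pause_s=0.2):
    """Time `count` sequential GET requests and return the latencies in ms."""
    latencies = []
    for _ in range(count):
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=5) as resp:
            resp.read()
        latencies.append((time.perf_counter() - start) * 1000)
        time.sleep(pause_s)
    return latencies
```

Against a running gateway, `summarize(sample_health())` returns one dictionary per run. Sequential requests with a pause keep the probe itself from adding load, which matters when the thing being measured is sensitivity to CPU pressure.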

After pruning the extensions directory

The local install was pruned to a small set of required extensions:

  • dist/extensions: 12 directories / 193 files / 1.1 MB
  • retained categories included provider/runtime/core/media facade plugins such as openai, openrouter, deepseek, memory-core, media-understanding-core, image-generation-core, video-generation-core, document-extract, and web-readability

With host CPU contention removed (steal=0.0%), /health became very fast:

  • /health, 20 samples:
    • min: 2.3 ms
    • p50: 3.0 ms
    • p90: 3.3 ms
    • p95: 3.4 ms
    • max: 3.9 ms
    • avg: 3.0 ms
  • Gateway RSS: roughly 790 MB
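
Because the comparison above depends on steal being near zero, it helps to record steal alongside each latency run. The sketch below reads the aggregate `cpu` line of /proc/stat on Linux twice and reports steal as a share of all elapsed CPU ticks. The function names and the 5-second default interval are illustrative, and the report does not say which tool produced the steal=0.0% figure.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into integer tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:]]


def steal_percent(before, after):
    """Steal share of all CPU ticks elapsed between two samples, in percent."""
    # Field order: user nice system idle iowait irq softirq steal [guest guest_nice].
    # Guest time is already included in user/nice, so only the first eight are summed.
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    if len(deltas) < 8:
        raise ValueError("kernel does not report a steal field")
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total


def read_cpu_line(path="/proc/stat"):
    with open(path) as f:
        return parse_cpu_line(f.readline())


def measure_steal(interval_s=5.0):
    """Sample /proc/stat twice, interval_s apart, and return steal in percent."""
    first = read_cpu_line()
    time.sleep(interval_s)
    return steal_percent(first, read_cpu_line())
```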

So extension pruning helps the light health path and memory footprint, but it does not fully address Codex runtime CPU overhead:

  • 15s CPU sample after host contention was fixed:
    • gateway process: ~15.8%
    • Codex app-server process: ~38.2%
  • Earlier under host contention, the same setup showed /health p95 in multi-second territory and gateway CPU above 60%, which suggests the runtime becomes very sensitive to CPU pressure.
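
Per-process CPU figures like the ones above can be reproduced on Linux by differencing utime + stime from /proc/<pid>/stat over a fixed window. This is a sketch under that assumption, not the method used for this report. The PIDs of the gateway and the Codex app-server must be looked up separately, and the result is expressed as a percentage of one core.

```python
import os
import time


def cpu_ticks_from_stat(stat_text):
    """Return utime + stime (in clock ticks) from the contents of /proc/<pid>/stat."""
    # The command name is parenthesised and may contain spaces,
    # so split after the last ')'.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU use over the interval as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def read_ticks(pid):
    with open(f"/proc/{pid}/stat") as f:
        return cpu_ticks_from_stat(f.read())


def sample_pids(pids, interval_s=15.0):
    """Sample several PIDs over one shared window; returns {pid: percent}."""
    hz = os.sysconf("SC_CLK_TCK")
    before = {pid: read_ticks(pid) for pid in pids}
    start = time.monotonic()
    time.sleep(interval_s)
    elapsed = time.monotonic() - start
    return {
        pid: cpu_percent(before[pid], read_ticks(pid), elapsed, hz)
        for pid in pids
    }
```

Sampling both processes over the same window keeps the two numbers comparable. A process that exits during the window raises `FileNotFoundError`, so short-lived helper processes need separate handling.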

During tests, short-lived openclaw / openclaw-hooks children also appeared and briefly used high CPU. When those settled and the host was quiet, /health recovered immediately.

Why this matters

For a self-hosted always-on gateway, the Codex runtime should ideally have a low idle/steady-state CPU profile. A persistent ~15-40% CPU load from the gateway/app-server combination makes interactive channels more sensitive to host pressure and makes the system feel sluggish even when the actual model call is not the only bottleneck.

The issue is not just extension discovery:

  • Reducing dist/extensions from 94 dirs / 1,566 files to 12 dirs / 193 files improved the health path.
  • But Codex app-server CPU remained material, and transient child process/hook activity still caused visible jitter.

Requested improvements

Please consider optimizing the Codex app-server/runtime path in these areas:

  1. Reduce steady-state CPU use from the Codex app-server when it is idle or waiting for work.
  2. Avoid spawning expensive short-lived openclaw / hook helper processes in hot interactive paths where possible.
  3. Add lightweight diagnostics that separate:
    • gateway event-loop delay
    • Codex app-server CPU
    • hook relay / child process CPU
    • host CPU steal / contention
  4. Document recommended production settings for self-hosted Codex runtime deployments, including NODE_COMPILE_CACHE, OPENCLAW_NO_RESPAWN, and whether a minimal bundled extension set is supported.

Expected behavior

When no model turn is actively running and the host is not CPU-starved:

  • gateway /health should stay in low single-digit milliseconds
  • Codex app-server should not consume a large sustained CPU share
  • transient helper processes should not dominate CPU during routine status/health/tool dispatch paths

Actual behavior

On 2026.5.18, after enabling the known startup optimizations and pruning extensions, health checks can be excellent on a quiet host, but the Codex app-server/runtime path still shows material CPU use and jitter under modest pressure.

This makes the runtime feel less stable than expected for an always-on self-hosted gateway.
