openclaw: perf(codex): plugin cold load ~3.3 s; investigation findings + lever options

TL;DR

The @openclaw/codex plugin takes 3.27 s to load on cold start, versus 43 ms for anthropic in the same TUI run (codex is about 76× slower). PR #84649 cut other plugin-system thrash but explicitly left codex/anthropic on the table; anthropic is now fine, codex remains the outlier.

The investigation below identifies where the time goes, describes two experiments that didn't help (single-file bundle, V8 compile cache), and lists the realistic lever options for follow-up.

Data

Headline

[plugin-load-profile] phase=discovery plugin=codex elapsedMs=3268.3 source=~/.openclaw/npm/node_modules/@openclaw/codex/dist/index.js
[plugin-load-profile] phase=discovery plugin=anthropic elapsedMs=43.0

ajv.compile breakdown

Per-call timings around the 7 ajv.compile(schema) calls in extensions/codex/src/app-server/protocol-validators.ts, captured by adding inline [codex-load-time] probes to the installed codex dist:

| Phase | ms |
| --- | --- |
| compile:ThreadResumeResponse | 29.6 |
| compile:ThreadStartResponse | 21.8 |
| compile:TurnCompletedNotification | 13.5 |
| compile:TurnStartResponse | 12.9 |
| compile:DynamicToolCallParams | 9.5 |
| compile:ErrorNotification | 3.5 |
| compile:ModelListResponse | 3.0 |
| ajv-new | 1.6 |
| sum of 7 ajv compiles | 93.8 |
| protocol-validators module-body total | 95.8 |
| config module-body total (zod) | 5.1 |

93.8 of the 95.8 ms in protocol-validators is ajv compile. zod in config.ts is negligible. The worst two schemas (ThreadResumeResponse, ThreadStartResponse) are 2,630-line JSON Schemas; compiling each from JSON Schema → JS validator at every fresh process is the cost.
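The per-call numbers above came from inline probes. As a minimal sketch of what such a probe could look like: the `timed` helper, the `timings` array, and the log shape are hypothetical; only the `[codex-load-time]` prefix comes from this write-up.

```typescript
import { performance } from "node:perf_hooks";

// Collected measurements, so a summary can be printed after module load.
const timings: Array<{ label: string; ms: number }> = [];

// Hypothetical helper: runs a synchronous `fn`, records and logs how long
// it took, and returns whatever `fn` returned.
function timed<T>(label: string, fn: () => T): T {
  const start = performance.now();
  try {
    return fn();
  } finally {
    const ms = performance.now() - start;
    timings.push({ label, ms });
    console.log(`[codex-load-time] ${label} ${ms.toFixed(1)}ms`);
  }
}

// In the real module each compile call would be wrapped, for example:
//   const validate = timed("compile:ThreadStartResponse", () => ajv.compile(schema));
// Stand-in workload so this sketch runs on its own:
const sum = timed("demo:sum", () => [1, 2, 3].reduce((a, b) => a + b, 0));
```

Because the probe wraps a synchronous call and returns its result unchanged, it can be dropped around each `ajv.compile(schema)` without altering behavior.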

Accounting

| Bucket | ms | % of 3268 |
| --- | --- | --- |
| ajv.compile (7 schemas) | 93.8 | 2.9% |
| zod schema setup in config.ts | 5.1 | 0.2% |
| All CJS module bodies (top-21 by selfMs from Module._load hook) | ~95 | 2.9% |
| ESM source-fetch + compile (sum of registerHooks.load lines) | 3.7 | 0.1% |
| Unaccounted (= ESM module body execution) | ~3070 | ~94% |

The ~94% gap is ESM module body execution for codex's ~30 published chunks plus its npm deps (zod, ajv, ws, pi-coding-agent, pi-ai, openai, @modelcontextprotocol/sdk). The load hook of Node 22's module.registerHooks only exposes source-fetch + compile, not body execution, so a CPU profile is the only way to decompose it further.
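The unaccounted figure is simply the headline total minus the measured buckets. A short sketch of that arithmetic, using the numbers from the accounting table (variable names are illustrative):

```typescript
// Headline elapsedMs for codex and the measured buckets, in milliseconds.
const totalMs = 3268.3;
const buckets: Array<{ name: string; ms: number }> = [
  { name: "ajv.compile (7 schemas)", ms: 93.8 },
  { name: "zod schema setup in config.ts", ms: 5.1 },
  { name: "CJS module bodies (approximate)", ms: 95 },
  { name: "ESM source-fetch + compile", ms: 3.7 },
];

// Share of the total, formatted as a percentage with one decimal place.
const pct = (ms: number): string => ((ms / totalMs) * 100).toFixed(1) + "%";

const accountedMs = buckets.reduce((acc, b) => acc + b.ms, 0);
const unaccountedMs = totalMs - accountedMs;

for (const b of buckets) {
  console.log(`${b.name}: ${b.ms} ms (${pct(b.ms)})`);
}
console.log(`unaccounted: ${Math.round(unaccountedMs)} ms (${pct(unaccountedMs)})`);
```

Everything that the hooks could measure adds up to under 200 ms, which is why the remaining ~3 s has to be attributed by other means.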

Experiments tried

V8 compile cache (NODE_COMPILE_CACHE) — within noise

Repeated runs alternating fresh/reuse cache on the current chunked dist:

| Run | Cache | codex elapsedMs |
| --- | --- | --- |
| 1 | fresh | 1018 |
| 2 | reuse | 955 |
| 3 | reuse | 929 |
| 4 | fresh | 921 |

Run-to-run variance (~100 ms) exceeds the cache effect. Not a lever on the current dist.
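To make the "within noise" reading concrete, the sketch below compares the mean fresh-vs-reuse difference against the spread between the two fresh runs, using the four measurements above (helper names are illustrative):

```typescript
// codex elapsedMs from the four alternating runs.
const runs = [
  { cache: "fresh", ms: 1018 },
  { cache: "reuse", ms: 955 },
  { cache: "reuse", ms: 929 },
  { cache: "fresh", ms: 921 },
];

const msFor = (cache: string): number[] =>
  runs.filter((r) => r.cache === cache).map((r) => r.ms);
const mean = (xs: number[]): number => xs.reduce((a, b) => a + b, 0) / xs.length;
const spread = (xs: number[]): number => Math.max(...xs) - Math.min(...xs);

// Average saving attributable to reusing the compile cache.
const cacheEffectMs = mean(msFor("fresh")) - mean(msFor("reuse"));
// Variation between runs with the cache state held fixed (fresh vs. fresh).
const freshSpreadMs = spread(msFor("fresh"));

console.log({ cacheEffectMs, freshSpreadMs });
```

With only two runs per condition this is a rough comparison rather than a statistical test, but the difference between conditions is several times smaller than the variation within one condition.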

Single-file bundle (esbuild --bundle) — 5× regression

Inlined codex's 30 chunks into a single dist/index.js with npm deps left external:

| Variant | Cache | codex elapsedMs |
| --- | --- | --- |
| Bundled | fresh | 5183 |
| Bundled | reuse | 1578 |

Bundling made cold load about 5× slower with a fresh cache (5183 ms vs. 921–1018 ms chunked) and about 1.7× slower with a reused cache (1578 ms vs. 929–955 ms). esbuild --bundle collapses await import(...) dynamic imports into the bundle, defeating the laziness the chunked dist depends on. Codex's harness.ts lazily imports run-attempt, side-question, compact, etc.; bundling loads those eagerly at register.

The current 30-chunk shape is already optimized for cold load; bundling into one file is the wrong direction.
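For readers unfamiliar with the pattern, here is a minimal sketch of the kind of lazy loading the chunked dist relies on. It uses `node:zlib` as a stand-in for a heavy module such as run-attempt; the `loadZlib` and `compress` names are hypothetical and not taken from codex.

```typescript
// Cached promise for the lazily loaded module; stays undefined until first use.
let zlibPromise: Promise<typeof import("node:zlib")> | undefined;

// Starts loading the module on the first call and reuses the same promise
// afterwards, so the module is loaded at most once per process.
function loadZlib(): Promise<typeof import("node:zlib")> {
  if (!zlibPromise) {
    zlibPromise = import("node:zlib");
  }
  return zlibPromise;
}

// Only callers of this function pay the load cost; merely importing the
// file that defines it does not.
async function compress(text: string): Promise<Buffer> {
  const zlib = await loadZlib();
  return zlib.gzipSync(Buffer.from(text, "utf8"));
}
```

A static `import` at the top of the file would instead load the dependency as soon as the plugin is registered, which is the behavior the bundling experiment ended up with.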

Affects gateway too

Yes. Every fresh Node process that loads codex pays the full cold-load cost (3.27 s in the headline measurement), including the gateway daemon on every restart. The ajv compiles are module-cached, so they run once per process, but every fresh process repeats them. (Raised by @kevins8 in Slack.)

Lever options

1. Pre-compile ajv validators with ajv's standaloneCode (~95 ms)

Run standaloneCode(ajv, validator) (exported from ajv/dist/standalone; the Ajv instance must be created with code: { source: true }) at codex publish time, emit the generated validator JS files, and replace the ajv.compile() calls with static imports of those files. This saves ~95 ms per fresh process, including every gateway restart. Optionally drop ajv as a runtime dep (saves another ~30 ms from not loading the lib, plus a ~30–50% bundle size reduction).
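A minimal sketch of the publish-time step, assuming ajv v8 is installed. The `standaloneCode` function and the `code.source` / `code.esm` options are part of ajv's standalone API; the schema, the export name, and the `buildValidatorModule` helper are hypothetical placeholders, not the real codex schemas or script.

```typescript
import Ajv from "ajv";
import type { SchemaObject } from "ajv";
import standaloneCode from "ajv/dist/standalone";

// Illustrative schema; the real inputs would be the seven protocol schemas.
const schemas: Record<string, SchemaObject> = {
  validateErrorNotification: {
    type: "object",
    properties: { message: { type: "string" } },
    required: ["message"],
  },
};

// Returns the source text of an ES module that exports one validator
// function per key of `input`.
function buildValidatorModule(input: Record<string, SchemaObject>): string {
  // `code.source` is required for standalone output; `code.esm` emits
  // `export` statements instead of CommonJS `exports.*` assignments.
  const ajv = new Ajv({ code: { source: true, esm: true } });
  const exportsMap: Record<string, string> = {};
  for (const [exportName, schema] of Object.entries(input)) {
    ajv.addSchema(schema, exportName); // register the schema under a key
    exportsMap[exportName] = exportName; // export name -> schema key
  }
  return standaloneCode(ajv, exportsMap);
}

const moduleSource = buildValidatorModule(schemas);
// A build script would write `moduleSource` to a file under dist/ (for
// example with fs.writeFileSync) and the plugin would import it statically.
```

Depending on the keywords a schema uses, the generated module may still import small runtime helpers from ajv, so whether ajv can be dropped entirely as a runtime dependency needs to be checked against the actual generated output.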

Scope: ~50-line build script in extensions/codex/scripts/, prebuild hook in codex's package.json, modify protocol-validators.ts to import precompiled validators. Optionally write a ~10-line errorsText replacement to drop ajv.
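The errorsText replacement mentioned above could be as small as the following sketch. It mirrors the default output shape of ajv's `errorsText` ("data" followed by the instance path and the message, joined with ", "); the interface name is hypothetical and only the two fields used here are declared.

```typescript
// The subset of ajv's ErrorObject that the formatter needs.
interface ValidationErrorLike {
  instancePath: string;
  message?: string;
}

// Stand-in for ajv.errorsText(errors) with the same defaults:
// separator ", " and data variable name "data".
function errorsText(
  errors: readonly ValidationErrorLike[] | null | undefined,
  options: { separator?: string; dataVar?: string } = {},
): string {
  const separator = options.separator ?? ", ";
  const dataVar = options.dataVar ?? "data";
  if (!errors || errors.length === 0) return "No errors";
  return errors
    .map((e) => `${dataVar}${e.instancePath} ${e.message}`)
    .join(separator);
}

// Example: format the errors left on a precompiled validator after a failed call.
const text = errorsText([
  { instancePath: "/thread/id", message: "must be string" },
  { instancePath: "", message: "must have required property 'turns'" },
]);
```

Keeping the output shape the same as ajv's default would limit the churn in tests or log greps that match on the message text.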

Pros:

  • Clean, shippable independently
  • Net bundle size decrease (compiled validators replace inlined schemas + ajv runtime, est. 30–50% smaller)
  • Fires on every fresh process, including every gateway restart
  • Doesn't touch core, doesn't touch other plugins

Paper-cuts:

  • New build step + CI plumbing + snapshot test for schema drift
  • ajv version pinning + lockstep with codex re-publish (no more automatic uptake of ajv bug fixes)
  • If dropping ajv, error message string shape changes (any test or log grep matching errorsText output needs updating)
  • Stack traces from validation point into generated code (source maps possible but extra plumbing)
  • Schemas no longer "live" at runtime (introspection requires shipping schemas separately as data files)
  • New contributors see import compiled from "./..." instead of ajv.compile(schema) — one indirection further

Bottom line: ~3% of total codex load. Clean win if you want it, but won't move the needle on the user-facing 3 s.

2. ESM module body work (the ~3 s blind spot)

The dominant cost is ESM body evaluation of codex's chunks plus its heavy deps. A CPU profile is needed to identify which specific modules are responsible before any lever here can be picked.

NODE_OPTIONS='--cpu-prof --cpu-prof-name=codex.cpuprofile' \
  pnpm openclaw tui --local

Likely candidates based on the CJS portion we could capture: pi-ai, pi-coding-agent, openai SDK, @modelcontextprotocol/sdk.

Once dominant modules are known, possible directions:

  • Further defer codex's eager top-level imports (more await import())
  • Trim what codex consumes from each dep (subpath imports vs whole-SDK)
  • Upgrade or swap heavy deps for lighter alternatives

This is where any large win lives, but it requires a CPU profile to target.
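Once a profile has been captured with the command above, one way to get a first ranking is to sum self time per script URL. The sketch below assumes the standard .cpuprofile JSON layout (`nodes`, `samples`, `timeDeltas` in microseconds, as in the Chrome DevTools Protocol `Profiler.Profile` type); the function name is hypothetical, and attributing each delta to the sample at the same index is an approximation.

```typescript
// Subset of the .cpuprofile JSON that this sketch reads.
interface CpuProfile {
  nodes: Array<{ id: number; callFrame: { functionName: string; url: string } }>;
  samples: number[]; // node id observed at each sample
  timeDeltas: number[]; // microseconds between consecutive samples
}

// Approximate self time per script URL in milliseconds, largest first.
function selfMsByUrl(profile: CpuProfile): Array<[string, number]> {
  const urlById = new Map<number, string>();
  for (const node of profile.nodes) {
    // Frames without a URL are native or VM-internal code.
    urlById.set(node.id, node.callFrame.url || "(native)");
  }
  const totals = new Map<string, number>();
  profile.samples.forEach((nodeId, i) => {
    const url = urlById.get(nodeId) ?? "(unknown)";
    const deltaMs = (profile.timeDeltas[i] ?? 0) / 1000;
    totals.set(url, (totals.get(url) ?? 0) + deltaMs);
  });
  return Array.from(totals.entries()).sort((a, b) => b[1] - a[1]);
}

// Usage sketch (file name taken from the --cpu-prof-name flag above):
//   const profile = JSON.parse(fs.readFileSync("codex.cpuprofile", "utf8"));
//   console.table(selfMsByUrl(profile).slice(0, 20));
```

Grouping by URL rather than by function keeps the output aligned with the question being asked here, namely which modules dominate load time.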

3. V8 compile cache

Measured within run-to-run noise on the current chunked dist. Free to enable, but not a primary lever.

Measurement caveat

The 3268 ms headline is from a foreground pnpm openclaw tui --local run, most likely with a cold disk cache. Back-to-back script-driven runs with the OS file cache warm sit around 950 ms. Both numbers are real for different scenarios:

  • 3.27 s is what users hit on first TUI of the day / first gateway boot after deploy.
  • 0.95 s is the "warm disk, only V8 cache cold" baseline that back-to-back script runs measure.

Relative deltas between lever experiments are consistent regardless.

Recommendation

  1. CPU profile first to identify the dominant ~3 s ESM body cost. Without that, lever 2 is a shot in the dark.
  2. Pre-compile ajv as a clean independent ~95 ms win if you want it shipped now (doesn't depend on (1)).
  3. Skip single-file bundling and V8 compile cache — measured non-wins.

Methodology

  • OPENCLAW_PLUGIN_LOAD_PROFILE=1 (existing) — per-plugin total elapsed.
  • OPENCLAW_PLUGIN_LOAD_TRACE=1 (added during investigation in src/plugins/plugin-load-profile.ts) — Module._load CJS hook + module.registerHooks ESM hook + top-N summary per outermost withProfile scope.
  • [codex-load-time] probes added inline to the installed codex dist's protocol-validators-*.js and config-*.js to time each ajv.compile + module body total.
  • Bundle experiment via npx esbuild dist/index.js --bundle --format=esm --platform=node --packages=external; reverted after measurement.
