openclaw - 💡(How to fix) Fix Cold-start optimization: ship pre-warmed Node compile cache + provide opt-in to skip self-respawn [1 comments, 2 participants]

openclaw/openclaw#74087 · Fetched 2026-04-30 06:28:39

OpenClaw's cold start on stateless container runtimes (Cloud Run, Lambda, Fly Machines, fresh K8s pods) currently spends ~50s in V8 parse+compile of the bundled chunk graph before any log line is emitted, plus ~5–10s in a self-respawn fork. Two small upstream changes would let every install benefit:

  1. Ship a pre-warmed `NODE_COMPILE_CACHE`, either as a build artifact or via an `openclaw warm-cache` subcommand
  2. Add an opt-in flag (env var or CLI) for users who can guarantee `NODE_OPTIONS` and `NODE_EXTRA_CA_CERTS` are already set, so `dist/entry.js` skips its self-respawn fork

I've implemented both downstream in our sandbox image and measured a ~6× speedup on the parse+compile portion, plus a further ~5–10s saved by skipping the respawn. Filing here in case upstream wants to adopt either.


Background

The `openclaw gateway` cold boot has two distinct slow windows on container restarts:

Window 1 — process spawn → first log line (~48s on 2026.4.15)

No log lines are emitted during this window; it's pure V8 work parsing+compiling the ~12 MB bundle in `dist/` (~2,150 chunks). On a cold disk + cold V8 heap this is the bulk of cold-start time.

Window 2 — dist/entry.js self-respawn (~5–10s)

`buildCliRespawnPlan()` (`dist/entry.js`, line ~30) checks whether `NODE_OPTIONS` contains `--disable-warning=ExperimentalWarning` and whether `NODE_EXTRA_CA_CERTS` is set. If either is missing, it `spawn()`s a child Node.js process with the corrected env and uses the parent only as a stdio bridge. On Cloud Run cold starts this means two full Node.js boots back-to-back.

This is correct default behaviour for interactive CLI use, but for production container deployments where the operator controls the env, it's pure overhead.
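The decision described above can be sketched roughly as follows. This is not the actual `buildCliRespawnPlan()` source: the function name `sketchRespawnPlan` and the returned object shape are hypothetical, and only the two conditions come from the description. The `*_READY` markers mentioned in the workaround further down suggest the real check looks at more than this, so treat it as an illustration of the idea only.

```javascript
// Hypothetical sketch of the respawn decision; not OpenClaw's actual code.
const REQUIRED_FLAG = "--disable-warning=ExperimentalWarning";

function sketchRespawnPlan(env) {
  const nodeOptions = env.NODE_OPTIONS ?? "";
  const hasFlag = nodeOptions.split(/\s+/).includes(REQUIRED_FLAG);
  const hasCaCerts = Boolean(env.NODE_EXTRA_CA_CERTS);

  // Both conditions hold: stay in-process, no second Node.js boot.
  if (hasFlag && hasCaCerts) return null;

  // Otherwise describe the environment a child process would be started with.
  const childEnv = { ...env };
  if (!hasFlag) {
    childEnv.NODE_OPTIONS = `${nodeOptions} ${REQUIRED_FLAG}`.trim();
  }
  // Resolving a CA bundle path is left out of this sketch; we only record the gap.
  const missing = [];
  if (!hasFlag) missing.push("NODE_OPTIONS");
  if (!hasCaCerts) missing.push("NODE_EXTRA_CA_CERTS");
  return { env: childEnv, missing };
}
```

A `null` result corresponds to the fast path; any other result corresponds to the second Node.js boot that Window 2 measures.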

Reproducer (host: openclaw-server, Linux 6.17 GCE n2-standard, Node 22.12)

# Cold (no compile cache)
$ time openclaw gateway --help >/dev/null 2>&1
real    0m53.740s
user    0m17.093s
sys     0m2.702s

# Populate compile cache
$ rm -rf /tmp/oc-cache
$ NODE_COMPILE_CACHE=/tmp/oc-cache openclaw gateway --help >/dev/null
$ find /tmp/oc-cache -type f | wc -l
1175
$ du -sh /tmp/oc-cache
6.9M

# Warm (compile cache hit)
$ time NODE_COMPILE_CACHE=/tmp/oc-cache openclaw gateway --help >/dev/null
real    0m9.078s
user    0m13.569s
sys     0m1.059s

| Scenario | Wall time |
| --- | --- |
| No compile cache, cold disk | 53.7s |
| Compile cache populated (6.9 MB / 1,175 files) | 9.1s |

That's a ~6× speedup from the compile cache alone. With extension priming (importing `dist/extensions/{slack,googlechat,msteams,...}/index.js`), the cache grows to ~3,663 entries / 21 MB and also warms the channel-plugin chunks.
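For reference, the ~6× figure follows directly from the two `real` times in the reproducer. A minimal sketch of the arithmetic (the helper name `compare` is just for illustration):

```javascript
// Wall-clock ("real") times from the reproducer, in seconds.
const coldSeconds = 53.74;
const warmSeconds = 9.078;

// Ratio and absolute difference between a cold and a warm run.
function compare(cold, warm) {
  return { speedup: cold / warm, savedSeconds: cold - warm };
}

const { speedup, savedSeconds } = compare(coldSeconds, warmSeconds);
console.log(`speedup: ${speedup.toFixed(1)}x, saved: ${savedSeconds.toFixed(1)}s`);
// prints "speedup: 5.9x, saved: 44.7s"
```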

Proposed upstream changes

Change 1: openclaw warm-cache subcommand

A first-class command that operators can run during image build:

openclaw warm-cache --output /opt/openclaw-compile-cache [--extensions slack,googlechat,msteams]

It would:

  1. Set `NODE_COMPILE_CACHE` to the output path
  2. Internally invoke `openclaw gateway --help` (forks Node, walks the gateway-cli chunk graph)
  3. Optionally import each requested channel-extension entry point under `try { await import(...) } catch {}` (extensions throw without a runtime context, but V8 still parses+compiles them, populating the cache before throwing)
  4. Print a summary: `{entries: 3663, sizeMB: 21}`

Operators then bake the output dir into their image and set `NODE_COMPILE_CACHE` at runtime.
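A rough sketch of what such a priming step could look like, assuming `openclaw` is on `PATH` and the extension entry paths are supplied by the caller. The names `warmCache` and `summarizeCache`, the option names, and the 60-second timeout are all hypothetical; only the overall steps come from the list above. Each extension is imported in its own child process so that a throwing entry point cannot take down the priming run.

```javascript
// Hypothetical priming script. Only the overall steps come from the proposal;
// function names, option names, and the timeout value are assumptions.
import { spawnSync } from "node:child_process";
import fs from "node:fs";
import path from "node:path";
import { pathToFileURL } from "node:url";

// Recursively count cache files and bytes so a summary can be printed.
function summarizeCache(dir) {
  let entries = 0;
  let bytes = 0;
  for (const item of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, item.name);
    if (item.isDirectory()) {
      const sub = summarizeCache(full);
      entries += sub.entries;
      bytes += sub.bytes;
    } else if (item.isFile()) {
      entries += 1;
      bytes += fs.statSync(full).size;
    }
  }
  return { entries, bytes };
}

function warmCache({ output, extensionEntries = [] }) {
  // Step 1: point the compile cache at the output directory for child processes.
  fs.mkdirSync(output, { recursive: true });
  const env = { ...process.env, NODE_COMPILE_CACHE: output };

  // Step 2: a child process walks the gateway-cli chunk graph.
  spawnSync("openclaw", ["gateway", "--help"], { env, stdio: "ignore" });

  // Step 3: best-effort import of each extension entry in its own child process.
  // The timeout keeps a hanging import from stalling an image build.
  for (const entry of extensionEntries) {
    const url = pathToFileURL(path.resolve(entry)).href;
    const code = `import(${JSON.stringify(url)}).catch(() => {})`;
    spawnSync(process.execPath, ["-e", code], { env, stdio: "ignore", timeout: 60_000 });
  }

  // Step 4: report what ended up in the cache directory.
  const { entries, bytes } = summarizeCache(output);
  return { entries, sizeMB: Math.round(bytes / (1024 * 1024)) };
}

// Example call (not executed here):
// console.log(warmCache({
//   output: "/opt/openclaw-compile-cache",
//   extensionEntries: ["dist/extensions/slack/index.js"],
// }));
```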

Change 2 (alternative): `postinstall` hook populates a default cache location

`scripts/postinstall-bundled-plugins.mjs` already runs at install. It could also run `openclaw warm-cache --output $packageRoot/.compile-cache` so any `npm install -g openclaw` user gets warm boots automatically. `openclaw.mjs` would default `NODE_COMPILE_CACHE` to that path if the env var isn't set.

Caveat: cache is keyed on Node `major.minor.patch` + arch, so the postinstall hook must run on the deployment target (true for npm-global installs, not for distros that vendor a prebuilt openclaw).
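The launcher-side default could be sketched as below. Setting `process.env.NODE_COMPILE_CACHE` after startup would not enable the cache for the already-running process, so this sketch uses `module.enableCompileCache()` from `node:module` (available in Node 22.1+) instead. The helper names and the idea of returning `null` when the operator has already set the variable are assumptions, not OpenClaw's actual launcher code.

```javascript
// Hypothetical launcher helper; not the actual openclaw.mjs.
import nodeModule from "node:module";
import path from "node:path";

// Returns the default directory to enable, or null when the operator
// already set NODE_COMPILE_CACHE (Node honors that variable at startup).
function pickDefaultCacheDir(env, packageRoot) {
  if (env.NODE_COMPILE_CACHE) return null;
  return path.join(packageRoot, ".compile-cache");
}

function enableDefaultCompileCache(env, packageRoot) {
  const dir = pickDefaultCacheDir(env, packageRoot);
  if (dir === null) return null;
  // Older Node versions lack enableCompileCache(); the step is skipped there.
  return nodeModule.enableCompileCache?.(dir) ?? null;
}

// Example call near the top of a launcher, before other modules are loaded:
// enableDefaultCompileCache(process.env, "/usr/lib/node_modules/openclaw");
```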

Change 3: opt-in to skip self-respawn

Add a single env var (e.g. `OPENCLAW_SKIP_RESPAWN=1`) or CLI flag (`--no-respawn`) that bypasses `buildCliRespawnPlan()` entirely, returning `null` immediately. Operators who set their own `NODE_OPTIONS` and `NODE_EXTRA_CA_CERTS` (which is the production-deploy norm) can use it.

Today the workaround is to set all four of:

  • `NODE_OPTIONS="--disable-warning=ExperimentalWarning"`
  • `NODE_EXTRA_CA_CERTS=...`
  • `OPENCLAW_NODE_OPTIONS_READY=1`
  • `OPENCLAW_NODE_EXTRA_CA_CERTS_READY=1`

…which works (and is what we ship), but the latter two undocumented internal flags should ideally not be the public escape hatch.
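A minimal sketch of the proposed opt-out, using the example names from this proposal (`OPENCLAW_SKIP_RESPAWN`, `--no-respawn`). Neither exists upstream today, and the wrapper function and its `buildPlan` parameter are hypothetical stand-ins for the existing logic.

```javascript
// Hypothetical opt-out check; the env var and flag are proposals, not shipped options.
function shouldSkipRespawn(env, argv) {
  return env.OPENCLAW_SKIP_RESPAWN === "1" || argv.includes("--no-respawn");
}

// Wraps an existing plan builder: return null (no respawn) when the operator opted out.
function buildPlanWithOptOut(env, argv, buildPlan) {
  if (shouldSkipRespawn(env, argv)) return null;
  return buildPlan(env);
}

// Example call (not executed here):
// const plan = buildPlanWithOptOut(process.env, process.argv.slice(2), buildCliRespawnPlan);
```

Checking the opt-out before any other work keeps the fast path to a single string comparison and an array scan.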

Change 4 (more ambitious): V8 startup snapshot

`node --build-snapshot` (~stable since Node 22) would be even faster than compile cache (skips link + instantiate too), but doesn't support ESM entry points with top-level `await` and `import.meta`, both of which `openclaw.mjs` and `dist/entry.js` use today. A CJS bootstrap shim that loads the ESM graph via dynamic `import()` could unlock this. Out of scope for a quick win, but worth tracking.

What we're shipping downstream

We deploy OpenClaw in stateless Cloud Run sandboxes (one per user). We've added these to our `Dockerfile`:

ENV NODE_COMPILE_CACHE=/opt/openclaw-compile-cache
COPY scripts/prime-openclaw-compile-cache.mjs /tmp/prime.mjs
RUN mkdir -p $NODE_COMPILE_CACHE && node /tmp/prime.mjs && chmod -R a-w $NODE_COMPILE_CACHE

ENV NODE_OPTIONS="--disable-warning=ExperimentalWarning"
ENV NODE_EXTRA_CA_CERTS=/etc/ssl/certs/ca-certificates.crt
ENV OPENCLAW_NODE_OPTIONS_READY=1
ENV OPENCLAW_NODE_EXTRA_CA_CERTS_READY=1

Where `prime-openclaw-compile-cache.mjs` runs `openclaw gateway --help` (forks) and best-effort imports channel entry points. Full source: https://github.com/shipcalm/marvin/blob/feat/sandbox-coldstart-compile-cache/infrastructure/docker/scripts/prime-openclaw-compile-cache.mjs

Result: 35–54s shaved from cold start (out of a measured ~96s baseline for one of our deployments). PR: https://github.com/shipcalm/marvin/pull/3009

Happy to extract this into an upstream PR if there's interest. Let me know which (if any) of changes 1–3 you'd accept and I'll prep something.

Related closed/open issues

  • #48644 (closed today as fixed by lazy-channel-entry contracts) — orthogonal: that addressed plugin module-evaluation cost; this issue addresses the V8 parse+compile cost of the gateway-cli bundle itself, which lazy plugin loading doesn't touch.
  • #70533 (open) — also orthogonal: that's about `tools.allow` filtering at plugin-discovery time; reduces heap, not parse time.

extent analysis

TL;DR

To speed up OpenClaw's cold start on stateless container runtimes, consider implementing a pre-warmed NODE_COMPILE_CACHE or adding an opt-in flag to skip the self-respawn fork in dist/entry.js.

Guidance

  • Implement an `openclaw warm-cache` subcommand (proposed, not yet available): a first-class command that operators run during image build to populate the compile cache. In the reporter's measurement, a populated cache cut `openclaw gateway --help` from 53.7s to 9.1s (~6×).
  • Add an opt-in flag to skip self-respawn (proposed): an environment variable (e.g., `OPENCLAW_SKIP_RESPAWN=1`) or CLI flag (`--no-respawn`) that bypasses `buildCliRespawnPlan()` and saves an additional ~5–10s.
  • Verify cache population: until the subcommand exists, set `NODE_COMPILE_CACHE` to an empty directory, run `openclaw gateway --help` once to populate it, then compare wall time with and without the cache.
  • Test the respawn skip: today this means setting the four environment variables from the workaround; compare wall time with and without them.

Example

openclaw warm-cache --output /opt/openclaw-compile-cache

This proposed command would populate the compile cache at the given path; setting `NODE_COMPILE_CACHE` to that path at runtime then speeds up subsequent runs of OpenClaw.

Notes

The proposed changes are orthogonal to existing issues (#48644 and #70533) and address the V8 parse+compile cost of the gateway-cli bundle itself. The openclaw warm-cache subcommand and opt-in flag can be implemented independently, and their effectiveness can be measured separately.

Recommendation

Until upstream adopts either change, apply the downstream workaround: prime `NODE_COMPILE_CACHE` during image build and set the four environment variables listed above so `dist/entry.js` skips its self-respawn. Together these give a significant cold-start speedup on stateless container runtimes.
