openclaw - ✅(Solved) Fix [Bug]: Gateway cold start is dominated by bundled runtime mirror work (not deps install) on Docker Desktop + WSL [1 pull request, 3 comments, 3 participants]

GitHub stats
openclaw/openclaw#73339 · Fetched 2026-04-29 06:20:47
Comments: 3 · Participants: 3 · Timeline events: 7 · Reactions: 0
Timeline (top): commented ×3, labeled ×2, closed ×1, cross-referenced ×1

On Windows 11 with Docker Desktop + WSL, an openclaw-gateway cold start takes a little over 3 minutes to reach the “http server listening” log line.

This does not look like “full npm reinstall every startup”. In my case, bundled runtime deps install finished in about 27 seconds, but the gateway then spent much longer in the bundled runtime mirror path while holding .openclaw-runtime-mirror.lock.

During that time:

  • container healthcheck stayed in starting and later became unhealthy
  • /healthz checks exceeded the current 5s timeout
  • if I manually stopped the container during this phase, a stale .openclaw-runtime-mirror.lock remained in the persisted config volume

Also, on Windows/WSL the persisted plugin-runtime-deps tree could remain undeletable even after containers were removed until I ran:

wsl --shutdown

Fix Action

Fixed

PR fix notes

PR #73364: fix: reduce mirror lock hold time on slow filesystems (fixes #73339)

Description (problem / solution / changelog)

Summary

On Docker Desktop + WSL (overlayfs inside containers), the bundled runtime mirror fingerprint computation takes ~3 minutes because every readdir/stat call traverses the overlay layer stack. Previously this ran while .openclaw-runtime-mirror.lock was held, blocking every other process waiting on that lock for the full duration.

This change introduces precomputeBundledRuntimeMirrorMetadata() and an optional precomputedSourceFingerprint parameter for refreshBundledPluginRuntimeMirrorRoot(). The caller now computes the fingerprint before acquiring the lock, then passes it in.

Lock hold time drops from fingerprint + copy (~200s on affected systems) to copy only (~20s), a ~90% reduction.
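The reordering can be sketched as follows. This is a minimal, self-contained illustration and not the code from PR #73364: `withLock`, `computeFingerprint`, `refreshMirror`, `mirrorOld`, and `mirrorNew` are hypothetical stand-ins, because the actual signatures of `withBundledRuntimeDepsFilesystemLock()`, `precomputeBundledRuntimeMirrorMetadata()`, and `refreshBundledPluginRuntimeMirrorRoot()` are not shown on this page. The sketch records an event trace so the order of work relative to the lock is visible.

```typescript
// Hypothetical stand-ins; the real OpenClaw helpers and signatures are not shown here.
const events: string[] = [];
let lockHeld = false;

// Stand-in for the filesystem lock: runs fn while the lock is "held".
function withLock<T>(fn: () => T): T {
  if (lockHeld) throw new Error("lock already held");
  lockHeld = true;
  events.push("lock:acquired");
  try {
    return fn();
  } finally {
    events.push("lock:released");
    lockHeld = false;
  }
}

// Stand-in for the expensive readdir/stat walk over the source root.
function computeFingerprint(sourceRoot: string): string {
  events.push("fingerprint:start");
  events.push("fingerprint:done");
  return `fp:${sourceRoot}`;
}

// Stand-in for the mirror refresh; reuses a precomputed fingerprint when given.
function refreshMirror(sourceRoot: string, precomputedFingerprint?: string): string {
  const fingerprint = precomputedFingerprint ?? computeFingerprint(sourceRoot);
  events.push("copy");
  return fingerprint;
}

// Before: the fingerprint is computed while the lock is held.
function mirrorOld(sourceRoot: string): string {
  return withLock(() => refreshMirror(sourceRoot));
}

// After: the fingerprint is computed first, then passed in under the lock.
function mirrorNew(sourceRoot: string): string {
  const fingerprint = computeFingerprint(sourceRoot);
  return withLock(() => refreshMirror(sourceRoot, fingerprint));
}

// Runs one variant and returns the order in which events happened.
function trace(mirror: (sourceRoot: string) => string): string[] {
  events.length = 0;
  mirror("/dist");
  return [...events];
}

console.log(trace(mirrorOld).join(" > "));
console.log(trace(mirrorNew).join(" > "));
```

In the second trace the fingerprint events appear before `lock:acquired`, so only the copy step contributes to lock hold time. Keeping the precomputed fingerprint optional, as the PR describes, lets existing callers keep working unchanged.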

Changes

  • src/plugins/bundled-runtime-mirror.ts:

    • Added precomputedSourceFingerprint?: string parameter to refreshBundledPluginRuntimeMirrorRoot() (backward-compatible, optional)
    • Added precomputeBundledRuntimeMirrorMetadata() exported helper to compute fingerprint + resolved source root outside the lock
    • createBundledRuntimeMirrorMetadata() accepts an optional pre-computed fingerprint
  • src/plugins/loader.ts and src/plugins/bundled-runtime-root.ts:

    • mirrorBundledPluginRuntimeRoot(): pre-computes fingerprint before withBundledRuntimeDepsFilesystemLock(), passes it to refreshBundledPluginRuntimeMirrorRoot() inside the lock
    • mirrorCanonicalBundledRuntimeDistRoot(): same optimization

Testing

The fix was validated by analyzing the code path. The fingerprint computation (the expensive part) is now done outside the lock, so the lock is only held during the directory copy (~20s) instead of fingerprint + copy (~200s). This directly addresses the reported 176-second lock hold time on Docker Desktop + WSL.

Fixes openclaw/openclaw#73339

Changed files

  • src/plugins/bundled-runtime-mirror.ts (modified, +32/-6)
  • src/plugins/bundled-runtime-root.ts (modified, +20/-0)
  • src/plugins/loader.ts (modified, +20/-0)

Raw issue body

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

No

Summary

Bug type

Startup / performance bug

Steps to reproduce

  1. Run OpenClaw via Docker Compose on Windows 11 with Docker Desktop + WSL
  2. Mount host config dir into /home/node/.openclaw
  3. Delete the existing plugin-runtime-deps/openclaw-2026.4.26-* stage root so startup is cold
  4. Start the gateway
  5. Observe:
    • bundled runtime deps stage quickly
    • startup then spends a long time before http server listening
    • healthcheck may become unhealthy
  6. Stop the container during the slow-start phase and inspect the persisted stage root for stale .openclaw-runtime-mirror.lock

Expected behavior

  • cold startup should not block for 3+ minutes in bundled runtime prep/mirror
  • healthcheck should not mark the container unhealthy while startup is still legitimately progressing
  • interrupted startup should not leave stale runtime mirror locks in persisted state

Actual behavior

Observed startup timeline:

2026-04-28T06:04:48Z [gateway] starting...
2026-04-28T06:04:49Z [plugins] staging bundled runtime deps before gateway startup
2026-04-28T06:05:16Z [plugins] installed bundled runtime deps before gateway startup in 27160ms
2026-04-28T06:08:10Z [gateway] starting HTTP server...
2026-04-28T06:08:11Z [gateway] http server listening (...; 203.3s)

So total startup was about 203 seconds, but actual bundled deps install was only about 27 seconds.
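The split can be checked from the log timestamps alone. A small sketch follows; the `timeline` keys are labels chosen here for illustration, not OpenClaw identifiers, and because the log has one-second resolution the results differ slightly from the 27160ms and 203.3s that the gateway itself reports.

```typescript
// Timestamps copied from the startup log above.
const timeline = {
  gatewayStarting: "2026-04-28T06:04:48Z",
  depsStagingStarted: "2026-04-28T06:04:49Z",
  depsInstalled: "2026-04-28T06:05:16Z",
  httpServerStarting: "2026-04-28T06:08:10Z",
  httpServerListening: "2026-04-28T06:08:11Z",
};

// Whole seconds between two ISO 8601 timestamps.
function secondsBetween(startIso: string, endIso: string): number {
  return (Date.parse(endIso) - Date.parse(startIso)) / 1000;
}

const depsInstallSec = secondsBetween(timeline.depsStagingStarted, timeline.depsInstalled); // 27
const postInstallGapSec = secondsBetween(timeline.depsInstalled, timeline.httpServerStarting); // 174
const totalSec = secondsBetween(timeline.gatewayStarting, timeline.httpServerListening); // 203

// Share of the cold start spent between "deps installed" and "starting HTTP server".
const postInstallShare = postInstallGapSec / totalSec; // about 0.86

console.log({ depsInstallSec, postInstallGapSec, totalSec, postInstallShare });
```

Roughly 86% of the cold start falls in the gap after deps install, which is the phase the report attributes to the bundled runtime mirror.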

During the long middle phase:

  • node_modules already existed
  • .openclaw-runtime-deps.lock was already gone
  • .openclaw-runtime-mirror.lock still existed
  • the staged runtime dist directory kept changing while the mirror lock was held

OpenClaw version

2026.4.26

Operating system

Windows 11

Install method

Docker

Model

none

Provider / routing chain

none

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

Affected: users running OpenClaw via Docker Desktop on Windows with WSL-backed volumes, especially when ~/.openclaw/plugin-runtime-deps is persisted across restarts

Severity: moderate to high for self-hosted users, because startup regularly takes minutes and failed/interrupted starts leave behind state that is hard to clean up

Frequency: reproducible on cold starts in this environment; stale lock/undeletable stage-root behavior appears whenever startup is interrupted during the bundled runtime mirror phase

Consequence:

  • gateway readiness is delayed by several minutes
  • Docker healthcheck may mark the container unhealthy even though startup is still progressing
  • interrupted starts can leave stale .openclaw-runtime-mirror.lock
  • cleanup/retry can require wsl --shutdown before the persisted stage root becomes deletable again

Additional information

Environment

  • OpenClaw: 2026.4.26
  • Host: Windows 11
  • Docker Desktop + WSL
  • Config mounted from host to /home/node/.openclaw
  • Observed on: 2026-04-28

Healthcheck symptom

After the server was already listening, Docker health still reported:

Health check exceeded timeout (5s)

Example state:

Status=running Health=unhealthy FailingStreak=13

Persisted stage root involved

C:\Users\Ryuu\.openclaw\plugin-runtime-deps\openclaw-2026.4.26-f53b52ad6d21

If startup was interrupted, stale mirror lock state could remain there.
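One common way to keep a leftover lock from blocking the next start is to treat a lock as stale once its owner has stopped refreshing it. The sketch below is an assumption for illustration only: the issue does not describe the format of .openclaw-runtime-mirror.lock, so the `LockRecord` fields, the heartbeat idea, and `decideLockAction` are all hypothetical and are not taken from OpenClaw.

```typescript
// Hypothetical lock record; the real lock file format is not described in this issue.
interface LockRecord {
  ownerId: string;       // assumption: a per-process random id written by the owner
  heartbeatAtMs: number; // assumption: last time the owner refreshed the lock
}

type LockDecision = "acquire" | "wait" | "reclaim-stale";

// Decide what a newly started process should do about an existing lock record.
function decideLockAction(
  existing: LockRecord | undefined,
  nowMs: number,
  staleAfterMs: number,
): LockDecision {
  if (existing === undefined) return "acquire";
  const silentForMs = nowMs - existing.heartbeatAtMs;
  return silentForMs > staleAfterMs ? "reclaim-stale" : "wait";
}

// A lock left behind by a container that was stopped 10 minutes ago.
const leftover: LockRecord = { ownerId: "stopped-container", heartbeatAtMs: 0 };
console.log(decideLockAction(leftover, 600000, 60000)); // "reclaim-stale"

// A lock whose owner refreshed it 5 seconds ago is still respected.
const active: LockRecord = { ownerId: "running-container", heartbeatAtMs: 595000 };
console.log(decideLockAction(active, 600000, 60000)); // "wait"
```

A heartbeat is used here instead of lock age or a recorded PID for two reasons. A legitimate mirror run can hold the lock for minutes on this platform, so age alone would misclassify it, and a PID written by one container does not identify a live process in the next container.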

A Windows/WSL-specific detail made retries more painful. After startup was interrupted, the persisted staged runtime root under:

C:\Users\Ryuu\.openclaw\plugin-runtime-deps\openclaw-2026.4.26-f53b52ad6d21

could remain undeletable even after all Docker containers were removed and Docker Desktop was closed.

I was only able to delete the directory after running:

wsl --shutdown

So this looks like two layers of trouble interacting:

  1. OpenClaw cold startup spends a long time in bundled runtime mirror work
  2. if startup is interrupted, Windows/WSL can keep handles on the persisted staged runtime tree, which makes cleanup and retry harder

This does not seem to be the root cause of the slow startup itself, but it makes failed or interrupted startup attempts much more difficult to recover from.

Notes

This feels related to the broader bundled plugin/runtime startup cost issues already reported elsewhere, but this repro is specifically:

  • Docker
  • Windows + WSL
  • persisted plugin-runtime-deps
  • long hold of .openclaw-runtime-mirror.lock
  • healthcheck timeout after gateway eventually listens

extent analysis

TL;DR

The slow startup most likely calls for optimizing the bundled runtime mirror process, for example by shortening how long .openclaw-runtime-mirror.lock is held, and for making the healthcheck tolerate longer startup times.

Guidance

  1. Investigate the bundled runtime mirror process: Analyze the code responsible for the bundled runtime mirror to understand why it's taking so long and if there are any potential optimizations.
  2. Improve the locking mechanism: Consider narrowing the work done under .openclaw-runtime-mirror.lock so that the lock is not held for an extended period during startup.
  3. Enhance the healthcheck: Modify the healthcheck to account for longer startup times, potentially by increasing the timeout or implementing a more sophisticated check to determine if the startup is still progressing.
  4. Clean up persisted stage root: After interrupting startup, ensure that the persisted stage root is properly cleaned up to prevent stale locks and undeletable directories.

Notes

The issue seems to be related to the broader bundled plugin/runtime startup cost issues, but this specific repro is focused on Docker, Windows + WSL, and persisted plugin-runtime-deps. The slow startup and healthcheck timeout are the primary concerns.

Recommendation

Apply a workaround by increasing the healthcheck timeout to account for longer startup times, and investigate the bundled runtime mirror process for potential optimizations. This will help mitigate the issue until a more permanent fix can be implemented.
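To see why a start period matters for a slow cold start, the following sketch simulates Docker's healthcheck bookkeeping: failures during the start period do not count toward the retry limit unless a probe has already succeeded. It is a simplified model, not Docker's implementation. The 30s interval and 3 retries are Docker's documented defaults and are assumed here rather than taken from the project's compose file, and the 203s readiness time comes from the log in the issue.

```typescript
type HealthStatus = "starting" | "healthy" | "unhealthy";

interface ProbeResult {
  atSec: number; // seconds since container start
  ok: boolean;   // false also covers a probe that timed out
}

// Returns the container health status after each probe.
function simulateHealth(
  probes: ProbeResult[],
  startPeriodSec: number,
  retries: number,
): HealthStatus[] {
  let status: HealthStatus = "starting";
  let failingStreak = 0;
  let hasSucceeded = false;
  const history: HealthStatus[] = [];
  for (const probe of probes) {
    if (probe.ok) {
      hasSucceeded = true;
      failingStreak = 0;
      status = "healthy";
    } else if (hasSucceeded || probe.atSec >= startPeriodSec) {
      // Failure counts only after the start period or after a first success.
      failingStreak += 1;
      if (failingStreak >= retries) status = "unhealthy";
    }
    history.push(status);
  }
  return history;
}

// Probes every 30s; they fail until the gateway listens at about 203s.
const coldStartProbes: ProbeResult[] = [];
for (let atSec = 30; atSec <= 240; atSec += 30) {
  coldStartProbes.push({ atSec, ok: atSec > 203 });
}

console.log(simulateHealth(coldStartProbes, 0, 3).join(","));
console.log(simulateHealth(coldStartProbes, 240, 3).join(","));
```

With no start period the container is marked unhealthy at the third failed probe; with a 240s start period it stays in starting until the first success. A start period only covers the startup window. The issue also reports 5s probe timeouts after the server was listening, which this setting would not address.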
