
openclaw - ✅(Solved) Fix [Bug]: Gateway cold start is dominated by bundled runtime mirror work (not deps install) on Docker Desktop + WSL [1 pull requests, 3 comments, 3 participants]

openclaw · 2026-04-28 06:32:48


GitHub stats

openclaw/openclaw#73339 • Fetched 2026-04-29 06:20:47


Timeline (top): commented ×3, labeled ×2, closed ×1, cross-referenced ×1

On Windows 11 with Docker Desktop + WSL, openclaw-gateway cold start takes about 3+ minutes before reaching http server listening.

This does not look like “full npm reinstall every startup”. In my case, bundled runtime deps install finished in about 27 seconds, but the gateway then spent much longer in the bundled runtime mirror path while holding .openclaw-runtime-mirror.lock.

During that time:

- container healthcheck stayed in starting and later became unhealthy
- /healthz checks exceeded the current 5s timeout
- if I manually stopped the container during this phase, a stale .openclaw-runtime-mirror.lock remained in the persisted config volume

Also, on Windows/WSL the persisted plugin-runtime-deps tree could remain undeletable even after containers were removed until I ran:

wsl --shutdown

Root Cause

Affected: users running OpenClaw via Docker Desktop on Windows with WSL-backed volumes, especially when ~/.openclaw/plugin-runtime-deps is persisted across restarts

Severity: moderate to high for self-hosted users, because startup regularly takes minutes and failed/interrupted starts leave behind state that is hard to clean up

Frequency: reproducible on cold starts in this environment; stale lock/undeletable stage-root behavior appears whenever startup is interrupted during the bundled runtime mirror phase

Consequence:

- gateway readiness is delayed by several minutes
- Docker healthcheck may mark the container unhealthy even though startup is still progressing
- interrupted starts can leave stale .openclaw-runtime-mirror.lock
- cleanup/retry can require wsl --shutdown before the persisted stage root becomes deletable again

Fix Action

Fixed

Fixed by PR: fix: reduce mirror lock hold time on slow filesystems (fixes #73339) (https://github.com/openclaw/openclaw/pull/73364)

PR fix notes

PR #73364: fix: reduce mirror lock hold time on slow filesystems (fixes #73339)

Repository: openclaw/openclaw
Author: 1yihui
State: closed | merged: False
Link: https://github.com/openclaw/openclaw/pull/73364

Description (problem / solution / changelog)

Summary

On Docker Desktop + WSL (overlayfs inside containers), the bundled runtime mirror fingerprint computation takes ~3 minutes because every readdir/stat call traverses the overlay layer stack. Previously this was done inside the .openclaw-runtime-mirror.lock, blocking all other processes for the full duration.

This change introduces precomputeBundledRuntimeMirrorMetadata() and an optional precomputedSourceFingerprint parameter for refreshBundledPluginRuntimeMirrorRoot(). The caller now computes the fingerprint before acquiring the lock, then passes it in.

Lock hold time drops from ~fingerprint + ~copy (~200s on affected systems) to just ~copy (~20s), a ~90% reduction.

Changes

src/plugins/bundled-runtime-mirror.ts:
- Added precomputedSourceFingerprint?: string parameter to refreshBundledPluginRuntimeMirrorRoot() (backward-compatible, optional)
- Added precomputeBundledRuntimeMirrorMetadata() exported helper to compute fingerprint + resolved source root outside the lock
- createBundledRuntimeMirrorMetadata() accepts an optional pre-computed fingerprint
src/plugins/loader.ts and src/plugins/bundled-runtime-root.ts:
- mirrorBundledPluginRuntimeRoot(): pre-computes fingerprint before withBundledRuntimeDepsFilesystemLock(), passes it to refreshBundledPluginRuntimeMirrorRoot() inside the lock
- mirrorCanonicalBundledRuntimeDistRoot(): same optimization
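The ordering change described above can be illustrated with a minimal in-process sketch. Every name here (`withLock`, `computeFingerprint`, `refreshMirror`) is a hypothetical stand-in invented for illustration; none of them is the OpenClaw function or its signature, and the real lock is a filesystem lock rather than a boolean flag. The sketch only shows the idea that the expensive fingerprint runs before the lock is taken and is then passed in.

```typescript
// Hypothetical stand-ins; these are NOT the OpenClaw functions or their signatures.
type MirrorEvent = "fingerprint" | "lock-acquired" | "copy" | "lock-released";

const events: MirrorEvent[] = [];
let lockHeld = false;

// Minimal in-process lock: runs `fn` while the lock is marked as held.
function withLock<T>(fn: () => T): T {
  if (lockHeld) throw new Error("lock already held");
  lockHeld = true;
  events.push("lock-acquired");
  try {
    return fn();
  } finally {
    lockHeld = false;
    events.push("lock-released");
  }
}

// Stands in for the expensive readdir/stat walk over the source tree.
function computeFingerprint(files: Record<string, string>): string {
  events.push("fingerprint");
  return Object.keys(files)
    .sort()
    .map((name) => `${name}:${files[name].length}`)
    .join("|");
}

// Copies the source into the mirror; reuses a precomputed fingerprint when given.
function refreshMirror(
  source: Record<string, string>,
  mirror: { files: Record<string, string>; fingerprint?: string },
  precomputedFingerprint?: string,
): void {
  const fingerprint = precomputedFingerprint ?? computeFingerprint(source);
  if (mirror.fingerprint === fingerprint) return; // mirror is already current
  events.push("copy");
  mirror.files = { ...source };
  mirror.fingerprint = fingerprint;
}

const source = { "a.js": "export {}", "b.js": "export const b = 1" };
const mirror: { files: Record<string, string>; fingerprint?: string } = { files: {} };

// Fingerprint first, outside the lock; only the copy runs while the lock is held.
const fingerprint = computeFingerprint(source);
withLock(() => refreshMirror(source, mirror, fingerprint));

console.log(events.join(" -> "));
// prints "fingerprint -> lock-acquired -> copy -> lock-released"
```

One trade-off of this ordering is that a fingerprint computed outside the lock can be out of date if the source tree changes before the lock is acquired. Whether that matters depends on how the source root is managed, which the PR notes do not describe.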

Testing

The fix was validated by analyzing the code path. The fingerprint computation (the expensive part) is now done outside the lock, so the lock is only held during the directory copy (~20s) instead of fingerprint + copy (~200s). This directly addresses the reported 176-second lock hold time on Docker Desktop + WSL.

Fixes openclaw/openclaw#73339

Changed files

src/plugins/bundled-runtime-mirror.ts (modified, +32/-6)
src/plugins/bundled-runtime-root.ts (modified, +20/-0)
src/plugins/loader.ts (modified, +20/-0)



Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Summary

Bug type

Startup / performance bug

Summary

On Windows 11 with Docker Desktop + WSL, openclaw-gateway cold start takes about 3+ minutes before reaching http server listening.

During that time:

- container healthcheck stayed in starting and later became unhealthy
- /healthz checks exceeded the current 5s timeout
- if I manually stopped the container during this phase, a stale .openclaw-runtime-mirror.lock remained in the persisted config volume

Also, on Windows/WSL the persisted plugin-runtime-deps tree could remain undeletable even after containers were removed until I ran:

wsl --shutdown

Steps to reproduce

1. Run OpenClaw via Docker Compose on Windows 11 with Docker Desktop + WSL
2. Mount host config dir into /home/node/.openclaw
3. Delete the existing plugin-runtime-deps/openclaw-2026.4.26-* stage root so startup is cold
4. Start the gateway
5. Observe:
   - bundled runtime deps stage quickly
   - startup then spends a long time before http server listening
   - healthcheck may become unhealthy
6. Stop the container during the slow-start phase and inspect the persisted stage root for stale .openclaw-runtime-mirror.lock

Expected behavior

- cold startup should not block for 3+ minutes in bundled runtime prep/mirror
- healthcheck should not mark the container unhealthy while startup is still legitimately progressing
- interrupted startup should not leave stale runtime mirror locks in persisted state
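One common way to meet the third expectation is to let the next startup decide whether a leftover lock is stale instead of trusting its mere presence. The sketch below is an assumption-laden illustration, not OpenClaw's behavior: the issue does not describe what the lock file contains, so the `LockRecord` shape, the `isLockStale` helper, and the age threshold are all hypothetical.

```typescript
// Hypothetical lock record; the real lock file's contents are not described in the issue.
interface LockRecord {
  pid: number;
  createdAtMs: number;
}

// A lock is treated as stale when its owner is gone or it has outlived maxAgeMs.
function isLockStale(
  lock: LockRecord,
  nowMs: number,
  maxAgeMs: number,
  isProcessAlive: (pid: number) => boolean,
): boolean {
  if (!isProcessAlive(lock.pid)) return true;
  return nowMs - lock.createdAtMs > maxAgeMs;
}

const lock: LockRecord = { pid: 4242, createdAtMs: 0 };
const tenMinutesMs = 600_000;

const ownerGone = isLockStale(lock, 1_000, tenMinutesMs, () => false); // true
const stillFresh = isLockStale(lock, 1_000, tenMinutesMs, () => true); // false
const expired = isLockStale(lock, 700_000, tenMinutesMs, () => true); // true

console.log({ ownerGone, stillFresh, expired });
```

A PID check alone is weak evidence when the lock lives on a persisted volume, because a restarted container can reuse the same PID as the process that was interrupted. That is why the sketch also applies an age limit.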

Actual behavior

Observed startup timeline:

2026-04-28T06:04:48Z [gateway] starting...
2026-04-28T06:04:49Z [plugins] staging bundled runtime deps before gateway startup
2026-04-28T06:05:16Z [plugins] installed bundled runtime deps before gateway startup in 27160ms
2026-04-28T06:08:10Z [gateway] starting HTTP server...
2026-04-28T06:08:11Z [gateway] http server listening (...; 203.3s)

So total startup was about 203 seconds, but actual bundled deps install was only about 27 seconds.
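The split can be checked directly from the log timestamps. The snippet below uses only the five timestamps from the timeline above; the variable names are illustrative.

```typescript
// Timestamps copied from the observed startup timeline.
const t = {
  start: Date.parse("2026-04-28T06:04:48Z"),
  stagingBegin: Date.parse("2026-04-28T06:04:49Z"),
  depsInstalled: Date.parse("2026-04-28T06:05:16Z"),
  httpStarting: Date.parse("2026-04-28T06:08:10Z"),
  listening: Date.parse("2026-04-28T06:08:11Z"),
};

const seconds = (fromMs: number, toMs: number): number => (toMs - fromMs) / 1000;

const depsInstall = seconds(t.stagingBegin, t.depsInstalled); // 27
const afterDeps = seconds(t.depsInstalled, t.httpStarting); // 174
const total = seconds(t.start, t.listening); // 203
const afterDepsShare = Math.round((afterDeps / total) * 100); // 86

console.log({ depsInstall, afterDeps, total, afterDepsShare });
```

At one-second log resolution, roughly 174 of the 203 seconds (about 86%) fall between the end of the deps install and the start of the HTTP server.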

During the long middle phase:

- node_modules already existed
- .openclaw-runtime-deps.lock was already gone
- .openclaw-runtime-mirror.lock still existed
- the staged runtime dist directory kept changing while the mirror lock was held

OpenClaw version

2026.4.26

Operating system

Windows 11

Install method

docker

Model

none

Provider / routing chain

none

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

- gateway readiness is delayed by several minutes
- Docker healthcheck may mark the container unhealthy even though startup is still progressing
- interrupted starts can leave stale .openclaw-runtime-mirror.lock
- cleanup/retry can require wsl --shutdown before the persisted stage root becomes deletable again

Additional information

Environment

OpenClaw: 2026.4.26
Host: Windows 11
Docker Desktop + WSL
Config mounted from host to /home/node/.openclaw
Observed on: 2026-04-28

Healthcheck symptom

After the server was already listening, Docker health still reported:

Health check exceeded timeout (5s)

Example state:

Status=running Health=unhealthy FailingStreak=13

Persisted stage root involved

C:\Users\Ryuu\.openclaw\plugin-runtime-deps\openclaw-2026.4.26-f53b52ad6d21

If startup was interrupted, stale mirror lock state could remain there.

A Windows/WSL-specific detail made retries more painful:

after interrupting startup, the persisted staged runtime root under:

C:\Users\Ryuu\.openclaw\plugin-runtime-deps\openclaw-2026.4.26-f53b52ad6d21

could remain undeletable even after all Docker containers were removed and Docker Desktop was closed.

I was only able to delete the directory after running:

wsl --shutdown

So this looks like two layers of trouble interacting:

- OpenClaw cold startup spends a long time in bundled runtime mirror work
- if startup is interrupted, Windows/WSL can keep handles on the persisted staged runtime tree, which makes cleanup and retry harder

This does not seem to be the root cause of the slow startup itself, but it makes failed or interrupted startup attempts much more difficult to recover from.

Notes

This feels related to the broader bundled plugin/runtime startup cost issues already reported elsewhere, but this repro is specifically:

- Docker
- Windows + WSL
- persisted plugin-runtime-deps
- long hold of .openclaw-runtime-mirror.lock
- healthcheck timeout after gateway eventually listens

extent analysis

TL;DR

The most likely fix for the slow startup is to optimize the bundled runtime mirror step, for example by shortening how long .openclaw-runtime-mirror.lock is held. Adjusting the healthcheck to tolerate longer startup times addresses the unhealthy status but not the delay itself.

Guidance

- Investigate the bundled runtime mirror process: Analyze the code responsible for the bundled runtime mirror to understand why it's taking so long and if there are any potential optimizations.
- Improve the locking mechanism: Consider enhancing the locking mechanism for .openclaw-runtime-mirror.lock to prevent it from being held for an extended period, causing the slow startup.
- Enhance the healthcheck: Modify the healthcheck to account for longer startup times, potentially by increasing the timeout or implementing a more sophisticated check to determine if the startup is still progressing.
- Clean up persisted stage root: After interrupting startup, ensure that the persisted stage root is properly cleaned up to prevent stale locks and undeletable directories.


Notes

The issue seems to be related to the broader bundled plugin/runtime startup cost issues, but this specific repro is focused on Docker, Windows + WSL, and persisted plugin-runtime-deps. The slow startup and healthcheck timeout are the primary concerns.

Recommendation

Apply a workaround by increasing the healthcheck timeout to account for longer startup times, and investigate the bundled runtime mirror process for potential optimizations. This will help mitigate the issue until a more permanent fix can be implemented.
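Beyond a longer timeout, a health endpoint could report whether startup is still making progress. The following is a rough sketch under stated assumptions, not how the gateway's /healthz works: the `StartupState` shape, the phase names, and `evaluateHealth` are hypothetical. It also assumes the endpoint is served before the slow phase finishes, which the log in this issue suggests is not currently the case.

```typescript
// Hypothetical startup state; the phase names are illustrative only.
interface StartupState {
  phase: "staging-deps" | "mirroring-runtime" | "ready";
  lastProgressAtMs: number;
}

interface HealthResult {
  httpStatus: 200 | 503;
  body: { ready: boolean; phase: string };
}

// Healthy while ready, or while startup made progress within the last stallAfterMs.
function evaluateHealth(
  state: StartupState,
  nowMs: number,
  stallAfterMs: number,
): HealthResult {
  const ready = state.phase === "ready";
  const stalled = !ready && nowMs - state.lastProgressAtMs > stallAfterMs;
  return {
    httpStatus: stalled ? 503 : 200,
    body: { ready, phase: state.phase },
  };
}

const mirroring: StartupState = { phase: "mirroring-runtime", lastProgressAtMs: 170_000 };

const progressing = evaluateHealth(mirroring, 175_000, 30_000); // httpStatus 200, ready false
const stalled = evaluateHealth(mirroring, 260_000, 30_000); // httpStatus 503

console.log(progressing.httpStatus, stalled.httpStatus);
```

Because the function reads only in-memory state, a handler built on it can answer well inside a 5s timeout even while the filesystem is busy.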
