openclaw - ✅(Solved) Fix [Bug]: Synchronous plugin discovery blocks event loop for minutes on Windows with multiple agents [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#67869Fetched 2026-04-17 08:29:12
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Author
Participants
Timeline (top)
labeled ×2closed ×1cross-referenced ×1

discoverOpenClawPlugins() in discovery-*.js performs ~62,000 synchronous filesystem calls (realpathSync, existsSync, readFileSync) when scanning 102+ bundled extensions. On Windows, each full scan blocks the event loop for ~85 seconds.

The in-memory discovery cache has a default TTL of 1 second (DEFAULT_DISCOVERY_CACHE_MS = 1e3), and the cache key includes workspaceDir. With multiple agents configured on different workspaces, each agent initialization triggers an independent full rescan — the cache expires or misses due to different keys.

With 4 agents on 4 workspaces: 4 scans × ~85s = 5-7 minutes of completely blocked event loop on every boot. During this time, Discord's 15-second gateway readiness timeout fails repeatedly, creating an infinite zombie/restart loop where the bot never comes online.

Root Cause

discoverOpenClawPlugins() in discovery-*.js performs ~62,000 synchronous filesystem calls (realpathSync, existsSync, readFileSync) when scanning 102+ bundled extensions. On Windows, each full scan blocks the event loop for ~85 seconds.

The in-memory discovery cache has a default TTL of 1 second (DEFAULT_DISCOVERY_CACHE_MS = 1e3), and the cache key includes workspaceDir. With multiple agents configured on different workspaces, each agent initialization triggers an independent full rescan — the cache expires or misses due to different keys.

With 4 agents on 4 workspaces: 4 scans × ~85s = 5-7 minutes of completely blocked event loop on every boot. During this time, Discord's 15-second gateway readiness timeout fails repeatedly, creating an infinite zombie/restart loop where the bot never comes online.

Fix Action

Fix / Workaround

Workaround Patched DEFAULT_DISCOVERY_CACHE_MS from 1e3 to 36e5 (1 hour) directly in the bundled discovery-*.js. The OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS environment variable did not propagate to the daemon worker spawned by the Windows Scheduled Task.

PR fix notes

PR #67940: fix: reuse shared plugin discovery cache across workspaces

Description (problem / solution / changelog)

Summary

  • cache bundled and global plugin discovery separately from workspace-specific discovery
  • add a regression test proving shared roots are not rescanned across workspace cache misses

Closes #67869

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/plugins/discovery.test.ts (modified, +52/-1)
  • src/plugins/discovery.ts (modified, +138/-81)
RAW_BUFFERClick to expand / collapse

Bug type

Regression (worked before, now fails)

Beta release blocker

No

Summary

discoverOpenClawPlugins() in discovery-*.js performs ~62,000 synchronous filesystem calls (realpathSync, existsSync, readFileSync) when scanning 102+ bundled extensions. On Windows, each full scan blocks the event loop for ~85 seconds.

The in-memory discovery cache has a default TTL of 1 second (DEFAULT_DISCOVERY_CACHE_MS = 1e3), and the cache key includes workspaceDir. With multiple agents configured on different workspaces, each agent initialization triggers an independent full rescan — the cache expires or misses due to different keys.

With 4 agents on 4 workspaces: 4 scans × ~85s = 5-7 minutes of completely blocked event loop on every boot. During this time, Discord's 15-second gateway readiness timeout fails repeatedly, creating an infinite zombie/restart loop where the bot never comes online.

Steps to reproduce

Configure OpenClaw gateway on Windows with 2+ agents, each with a different workspace path Enable the Discord channel Start the gateway (openclaw gateway start) Observe the log: after "starting channels and sidecars...", the log freezes for minutes

Expected behavior

Plugin discovery should not block the event loop. At minimum, the default cache TTL should be much longer than 1 second — a 1-second TTL effectively means no caching during boot since each initialization step takes longer than 1 second.

Suggested Fixes (in order of impact) Make plugin discovery async — replace realpathSync, existsSync, readFileSync with async equivalents. This is the root fix. Increase default cache TTL — DEFAULT_DISCOVERY_CACHE_MS should be at least 60 seconds, or ideally cache for the lifetime of the process with explicit invalidation on config change. Cache the expensive scan (bundled + global roots) separately from the workspace root — the bundled extensions directory (102+ dirs) is the same for every agent. Only the workspace extensions directory differs. Caching the stock/global scan independently would eliminate redundant rescans.

Actual behavior

The event loop blocks completely for 5-7 minutes after "starting channels and sidecars...". During this time:

HTTP health endpoint accepts connections but never responds (timeout) Discord WebSocket cannot send/receive heartbeats Discord gateway readiness times out after 15 seconds, flagging the connection as "zombie" Auto-restart reconnects to Discord, but the event loop is still blocked by the next agent's discovery scan The cycle repeats: connect → block → zombie → restart → connect → block The bot process appears alive (watchdog reports UP) but cannot process any messages Eventually the bot may briefly connect once all scans finish, then dies minutes later when a subsequent event triggers another discovery call

OpenClaw version

2026.4.11

Operating system

Windows 10 Home 10.0.19045

Install method

No response

Model

anthropic/claude-sonnet-4-6

Provider / routing chain

anthropic, openai (both enabled)

Additional provider/model setup details

Workaround Patched DEFAULT_DISCOVERY_CACHE_MS from 1e3 to 36e5 (1 hour) directly in the bundled discovery-*.js. The OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS environment variable did not propagate to the daemon worker spawned by the Windows Scheduled Task.

Environment OS: Windows 10 Home 10.0.19045 Node: 22.22.0 OpenClaw: 2026.4.11 Agents: 4 (main, discord, hermes, sherlock) with separate workspaces Bundled extensions: 102+

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

TL;DR

The most likely fix is to make plugin discovery asynchronous by replacing synchronous filesystem calls with their asynchronous equivalents.

Guidance

  • Identify and replace realpathSync, existsSync, and readFileSync with their asynchronous counterparts (realpath, exists, and readFile) in the discoverOpenClawPlugins function.
  • Consider increasing the default cache TTL (DEFAULT_DISCOVERY_CACHE_MS) to a higher value, such as 60 seconds or more, to reduce the frequency of plugin discovery scans.
  • Cache the expensive scan of bundled extensions separately from the workspace root to eliminate redundant rescans.
  • Verify that the OPENCLAW_PLUGIN_DISCOVERY_CACHE_MS environment variable is properly propagated to the daemon worker spawned by the Windows Scheduled Task.

Example

// Before
const fs = require('fs');
const path = require('path');
// ...
const stats = fs.realpathSync(pluginPath);
const exists = fs.existsSync(pluginPath);
const fileContent = fs.readFileSync(pluginPath, 'utf8');

// After
const fs = require('fs').promises;
const path = require('path');
// ...
const stats = await fs.realpath(pluginPath);
const exists = await fs.exists(pluginPath);
const fileContent = await fs.readFile(pluginPath, 'utf8');

Notes

The provided workaround of patching DEFAULT_DISCOVERY_CACHE_MS to 1 hour may not be a permanent solution, as it only delays the issue. Making plugin discovery asynchronous is the recommended fix.

Recommendation

Apply the workaround of increasing the default cache TTL and make plugin discovery asynchronous to fix the issue. This will prevent the event loop from blocking and allow the bot to process messages.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

Plugin discovery should not block the event loop. At minimum, the default cache TTL should be much longer than 1 second — a 1-second TTL effectively means no caching during boot since each initialization step takes longer than 1 second.

Suggested Fixes (in order of impact) Make plugin discovery async — replace realpathSync, existsSync, readFileSync with async equivalents. This is the root fix. Increase default cache TTL — DEFAULT_DISCOVERY_CACHE_MS should be at least 60 seconds, or ideally cache for the lifetime of the process with explicit invalidation on config change. Cache the expensive scan (bundled + global roots) separately from the workspace root — the bundled extensions directory (102+ dirs) is the same for every agent. Only the workspace extensions directory differs. Caching the stock/global scan independently would eliminate redundant rescans.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - ✅(Solved) Fix [Bug]: Synchronous plugin discovery blocks event loop for minutes on Windows with multiple agents [1 pull requests, 1 participants]