openclaw - 💡(How to fix) Fix [Bug]: sidecars.channels stalls 145–625s on startup due to redundant loadOpenClawPlugins call from shouldSuppressBuiltInModel [2 comments, 2 participants]

openclaw/openclaw#73645 · Fetched 2026-04-29 06:17:02
Comments: 2 · Participants: 2 · Timeline events: 5 · Reactions: 0
Timeline (top): commented ×2, labeled ×2, closed ×1

On every gateway startup, the sidecars.channels phase stalls for 145–625 seconds before the gateway becomes ready.

Root Cause

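According to the attached CPU profile, the stall comes from plugin loading on the model-prewarm path: prewarmConfiguredPrimaryModel reaches shouldSuppressBuiltInModel, which resolves down to resolveRuntimePluginRegistry → loadOpenClawPlugins. That stack consumes the majority of the 171,439 ms of scripting time, and the reporter describes the loadOpenClawPlugins call as redundant.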

Fix / Workaround

The issue does not document a fix or workaround. The reporter states that a full call-chain analysis, an explanation of the cache-miss mechanism, and three suggested fix options are available on request, and offers to test patches and capture additional profiles.


Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

On every gateway startup, the sidecars.channels phase stalls for 145–625 seconds before the gateway becomes ready.

Steps to reproduce

  1. Start the gateway with OPENCLAW_GATEWAY_STARTUP_TRACE=1.
  2. Observe the sidecars.channels trace value. On any system with ≥ 1 configured plugin, it will be in the 100–600 s range on a cold start.
  3. Attach --cpu-prof --cpu-prof-interval=1000 to NODE_OPTIONS (or run with it already set). Load the resulting .cpuprofile in Chrome DevTools → Performance → Load profile. The dominant call stack from ~20 s onward is shouldSuppressBuiltInModel → … → loadOpenClawPlugins.

A CPU profile captured during a 625 s stall is attached (openclaw-stall.cpuprofile). The resolveRuntimePluginRegistry → loadOpenClawPlugins stack consumes the majority of the 171,439 ms of scripting time.
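The dominant stack can also be checked outside DevTools. The following is a minimal sketch, assuming the profile follows the usual V8 .cpuprofile JSON layout (nodes with callFrame and children, plus samples and timeDeltas in microseconds). It sums inclusive time per function name; the function and type names are illustrative, and the demo data is synthetic, not taken from openclaw-stall.cpuprofile.

```typescript
// Minimal subset of the .cpuprofile JSON shape that this sketch relies on.
interface ProfileNode {
  id: number;
  callFrame: { functionName: string };
  children?: number[];
}
interface CpuProfile {
  nodes: ProfileNode[];
  samples: number[];    // id of the leaf node observed at each sample
  timeDeltas: number[]; // microseconds between adjacent samples
}

// Sum inclusive time (ms) per function name: each sample's delta is credited
// to every distinct function on the sampled stack. Crediting delta i to
// sample i is an approximation that is adequate for ranking hot stacks.
function inclusiveMsByFunction(profile: CpuProfile): Map<string, number> {
  const nodeById = new Map<number, ProfileNode>();
  const parentOf = new Map<number, number>();
  for (const node of profile.nodes) {
    nodeById.set(node.id, node);
    for (const child of node.children ?? []) parentOf.set(child, node.id);
  }
  const totals = new Map<string, number>();
  for (let i = 0; i < profile.samples.length; i++) {
    const deltaMs = (profile.timeDeltas[i] ?? 0) / 1000;
    const seen = new Set<string>(); // avoid double counting recursive frames
    let id: number | undefined = profile.samples[i];
    while (id !== undefined) {
      const name = nodeById.get(id)?.callFrame.functionName || "(anonymous)";
      if (!seen.has(name)) {
        seen.add(name);
        totals.set(name, (totals.get(name) ?? 0) + deltaMs);
      }
      id = parentOf.get(id);
    }
  }
  return totals;
}

// Synthetic three-node profile that mimics the tail of the reported stack.
const demoProfile: CpuProfile = {
  nodes: [
    { id: 1, callFrame: { functionName: "(root)" }, children: [2] },
    { id: 2, callFrame: { functionName: "shouldSuppressBuiltInModel" }, children: [3] },
    { id: 3, callFrame: { functionName: "loadOpenClawPlugins" } },
  ],
  samples: [3, 3, 2],
  timeDeltas: [1000, 1000, 500],
};

const demoTotals = inclusiveMsByFunction(demoProfile);
for (const [name, ms] of demoTotals) {
  console.log(`${name}: ${ms} ms inclusive`);
}
```

To apply this to the real attachment, parse the file contents with JSON.parse and pass the result to inclusiveMsByFunction; a large inclusive total for loadOpenClawPlugins would match the stack reported above.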

Expected behavior

NOT_ENOUGH_INFO

Actual behavior

On every cold start, the gateway emits a startup trace line indicating sidecars.channels takes 145–625 seconds:

[gateway] startup trace: sidecars.channels 278801.0ms total=305025.6ms
[gateway] startup trace: sidecars.channels 625543.0ms total=645929.4ms

During this window, the main event loop thread runs at ~100% CPU (V8 worker and libuv threads idle). The Telegram polling watchdog logs:

[telegram] Polling stall detected (no completed getUpdates for 250.7s); forcing restart.

After sidecars.channels completes, the gateway becomes responsive and operates normally until next restart.
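To compare restarts, the startup trace lines can be turned into numbers. This is a small sketch whose pattern is inferred only from the two sample lines in this report; other phases or versions may format their trace lines differently, and parseStartupTrace is a hypothetical helper, not part of OpenClaw.

```typescript
// Parsed form of one "[gateway] startup trace" line.
interface StartupTrace {
  phase: string;
  phaseMs: number;
  totalMs: number;
}

// Assumed line shape: "[gateway] startup trace: <phase> <ms>ms total=<ms>ms".
const TRACE_RE = /^\[gateway\] startup trace: (\S+) ([\d.]+)ms total=([\d.]+)ms$/;

// Returns null for lines that do not match the assumed shape.
function parseStartupTrace(line: string): StartupTrace | null {
  const m = TRACE_RE.exec(line.trim());
  if (m === null) return null;
  return { phase: m[1], phaseMs: Number(m[2]), totalMs: Number(m[3]) };
}

const traceLines = [
  "[gateway] startup trace: sidecars.channels 278801.0ms total=305025.6ms",
  "[gateway] startup trace: sidecars.channels 625543.0ms total=645929.4ms",
];

for (const traceLine of traceLines) {
  const trace = parseStartupTrace(traceLine);
  if (trace !== null) {
    const share = ((trace.phaseMs / trace.totalMs) * 100).toFixed(1);
    console.log(`${trace.phase}: ${(trace.phaseMs / 1000).toFixed(1)} s (${share}% of startup)`);
  }
}
```

For the two lines quoted here, sidecars.channels accounts for roughly 91% and 97% of total startup time, which is why the rest of the report focuses on that phase.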

OpenClaw version

2026.4.25

Operating system

Linux x86-64 (Docker container, Node.js v24.14.0)

Install method

docker

Model

openrouter/x-ai/grok-4.1-fast

Provider / routing chain

openclaw -> openrouter -> x-ai (grok-4.1-fast)
Fallbacks: openrouter/google/gemini-3-flash-preview, openrouter/anthropic/claude-haiku-4.5

Additional provider/model setup details

Primary model for main agent and subagents: openrouter/x-ai/grok-4.1-fast
Fallbacks: openrouter/google/gemini-3-flash-preview, openrouter/anthropic/claude-haiku-4.5
OPENROUTER_API_KEY is set in the container environment.
No model whitelist or catalog cap is configured.

openclaw-stall.cpuprofile

Logs, screenshots, and evidence

Gateway startup trace lines from two consecutive cold starts:

[gateway] startup trace: sidecars.channels 278801.0ms total=305025.6ms
[gateway] startup trace: sidecars.channels 625543.0ms total=645929.4ms

Telegram polling watchdog log during a stall:

[telegram] Polling stall detected (no completed getUpdates for 250.7s); forcing restart.

Thread state during the stall (top -H output, PID 20 = openclaw-gateway main thread):
- Main event loop thread: ~100% CPU
- V8 worker threads: idle
- libuv threads: idle

CPU profile attached: openclaw-stall.cpuprofile (captured during a 625s stall, 171,439 ms scripting time, dominant call stack: prewarmConfiguredPrimaryModel → resolveModel → resolveExplicitModelWithRegistry → shouldSuppressBuiltInModel → resolveBuiltInModelSuppression → resolveProviderBuiltInModelSuppression → resolveProviderPluginsForCatalogHooks → resolveProviderPluginsForHooks → resolvePluginProviders → resolveRuntimePluginRegistry → loadOpenClawPlugins).

Impact and severity

Affected: Any user running OpenClaw 2026.4.25 with at least one configured plugin and a primary model. Reproduces every cold start on the affected system.

Severity: Blocks workflow. The gateway is unresponsive to all incoming requests (Telegram messages, WebSocket connections, HTTP) for the duration of the stall (145–625 s observed). The Telegram polling watchdog logs "Polling stall detected" and forces a restart, which compounds the issue — restart triggers another stall.

Frequency: Always on cold start. Observed across 3 consecutive restarts on the same host with durations of 145s, 279s, and 625s.

Consequence: Agents do not respond to user messages during the stall window. Messages sent to the bot during this window may time out or be dropped depending on the channel adapter's queue behavior.

Additional information

NOT_ENOUGH_INFO

We do not have evidence of a known-good prior version. The bug was discovered through investigation of unrelated symptoms (Telegram bot non-responsiveness) and traced to the sidecars.channels phase via the OPENCLAW_GATEWAY_STARTUP_TRACE=1 environment variable. We did not test against earlier OpenClaw versions to determine when the regression was introduced.

The full call chain analysis, cache-miss mechanism explanation, and three suggested fix options are available — happy to provide them if helpful, or to test patches and capture additional profiles.

extent analysis

TL;DR

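The most likely fix for the gateway startup stall is to stop shouldSuppressBuiltInModel from triggering a fresh loadOpenClawPlugins call during model prewarm, for example by reusing an already-loaded plugin registry. The CPU profile shows that stack consuming the majority of the scripting time.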

Guidance

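  • Determine why resolveRuntimePluginRegistry falls through to loadOpenClawPlugins on the prewarmConfiguredPrimaryModel path instead of reusing an existing registry; the reporter attributes this to a cache miss.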
  • Analyze the CPU profile attached to the issue to understand the call stack and identify areas for improvement.
  • Consider implementing caching or lazy loading for plugins to reduce the load time.
  • Test patches and capture additional profiles to verify the effectiveness of the optimizations.

Example

The issue does not include the source of loadOpenClawPlugins or its callers, so no project-specific patch can be given here.
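The caching idea from the guidance can still be illustrated generically. The sketch below is not OpenClaw code: PluginRegistry, Loader, memoizeLoader, and the string cache key are hypothetical stand-ins, and it assumes the registry can be keyed by a stable value derived from the configuration.

```typescript
// Hypothetical stand-in types; the real registry shape is not shown in the issue.
type PluginRegistry = { plugins: string[] };
type Loader = (configKey: string) => PluginRegistry;

// Wrap an expensive loader so repeated calls with the same key reuse the result.
function memoizeLoader(load: Loader): Loader {
  const cache = new Map<string, PluginRegistry>();
  return (configKey) => {
    const hit = cache.get(configKey);
    if (hit !== undefined) return hit;
    const registry = load(configKey);
    cache.set(configKey, registry);
    return registry;
  };
}

let loadCount = 0;
const expensiveLoad: Loader = (configKey) => {
  loadCount += 1; // stands in for the costly plugin scan
  return { plugins: [`plugin-for-${configKey}`] };
};

const cachedLoad = memoizeLoader(expensiveLoad);
cachedLoad("default");
cachedLoad("default");
cachedLoad("default");
console.log(`loads performed: ${loadCount}`);
```

The cache only helps if every caller derives the same key for the same configuration. A key that differs between call sites would produce repeated misses, which is the kind of behavior the reporter's cache-miss description suggests.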

Notes

The issue lacks information about the loadOpenClawPlugins function and its implementation, making it difficult to provide a more specific solution. Additionally, the issue does not provide a known-good prior version, making it challenging to determine when the regression was introduced.

Recommendation

Prioritize removing or caching the repeated loadOpenClawPlugins call on the model-prewarm path, since the CPU profile identifies that stack as the dominant cost during the stall. The issue documents no user-side workaround.
