openclaw - 💡(How to fix) Fix gateway: prewarmConfiguredPrimaryModel blocks channel startup for ~4 minutes on every restart [1 comment, 2 participants]

openclaw · 2026-04-30 03:14:05

GitHub stats

openclaw/openclaw#74776 • Fetched 2026-05-01 05:41:28

Author

Eric-CN-MS

Participants

clawsweeper[bot]

Eric-CN-MS

Timeline (top)

closed ×1 · commented ×1 · renamed ×1

After every gateway restart, there is a ~4 minute gap before channels (Feishu, Telegram, etc.) become ready, and the first message after restart takes 3–6 minutes to process. During message processing, all WebSocket/HTTP connections drop simultaneously.

Summary

After tracing the code paths, benchmarking each component, and testing the APIs directly, the root cause was identified: channel startup waits for prewarmConfiguredPrimaryModel to finish.

Environment

OpenClaw: 2026.4.26
OS: Windows 11 (x64)
Node.js: v24.13.1
Primary model: github-copilot/claude-sonnet-4.6

Root Cause: `prewarmConfiguredPrimaryModel` blocks channel startup

File: dist/server.impl-hNr66nDN.js

In startGatewaySidecars(), channels are started after prewarmConfiguredPrimaryModel() completes:

// dist/server.impl-hNr66nDN.js ~line 8554
await prewarmConfiguredPrimaryModel({ cfg: params.cfg, log: params.log });
await params.startChannels();   // ← blocked until above resolves

Inside prewarmConfiguredPrimaryModel, ensureOpenClawModelsJson() is called:

await ensureOpenClawModelsJson(params.cfg, agentDir);

ensureOpenClawModelsJson calls planOpenClawModelsJson which fetches the latest model list from the provider API (GitHub Copilot /models endpoint) to rebuild models.json. This network call has no short timeout, and on this setup consistently takes ~4 minutes before resolving (possibly due to retry/backoff behavior or a slow response for enterprise Copilot endpoints).
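The effect of that ordering can be reproduced in isolation. The sketch below is not OpenClaw code: the function names mirror the issue, but the bodies are hypothetical stubs, and a 200 ms sleep stands in for the ~4 minute fetch. It only illustrates that two sequential awaits delay channel startup by the full warm-up time, while starting both together does not.

```javascript
// Hypothetical stand-ins: the names mirror the issue, the bodies are stubs.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulates a warm-up stuck on a slow model-list fetch
// (200 ms here, ~4 minutes in the report).
async function prewarmConfiguredPrimaryModel() {
  await sleep(200);
}

// Builds a startChannels stub that records when it was called, relative to t0.
function makeStartChannels(t0, marks) {
  return async function startChannels() {
    marks.channelsStartedAt = Date.now() - t0;
  };
}

// Same ordering as the snippet above: channels wait for the warm-up.
async function sequentialStartup() {
  const marks = {};
  const startChannels = makeStartChannels(Date.now(), marks);
  await prewarmConfiguredPrimaryModel();
  await startChannels();
  return marks.channelsStartedAt; // roughly the warm-up duration
}

// Both calls are issued before anything is awaited: channels do not wait.
async function concurrentStartup() {
  const marks = {};
  const startChannels = makeStartChannels(Date.now(), marks);
  await Promise.all([prewarmConfiguredPrimaryModel(), startChannels()]);
  return marks.channelsStartedAt; // close to zero
}
```

Calling `sequentialStartup()` resolves with a value near the simulated warm-up time, while `concurrentStartup()` resolves with a value near zero, even though both functions finish at about the same moment.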

Evidence:

Every restart shows the same ~4 minute gap between [client ready] and channel startup
models.json write timestamp (12:44) matches exactly when channels finally started (12:43:06 + ~4 min)
Disabling store.vector.enabled had zero effect on the gap, ruling out memory-core
All individual API calls tested directly are fast: Copilot token ~517ms, embedding ~1s, chat ~1.7s
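Per-step timings like the ones listed above can be collected with a small wrapper around each awaited call. This is a generic sketch, not the tooling used in the report; `timed` and its `log` parameter are hypothetical names introduced here.

```javascript
// Await `fn`, then report how long it took. The duration is logged even if
// `fn` throws, so a slow failure is still visible.
async function timed(label, fn, log = console.log) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    log(`${label} took ${Math.round(performance.now() - start)} ms`);
  }
}
```

Wrapping a suspect step, for example `await timed("models.json refresh", () => someStep())` where `someStep` is whatever call is under suspicion, yields one log line per step that can be compared with the gap seen in the startup log.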

Startup timeline (every restart)

08:50:56  starting channels and sidecars...
08:50:58  hooks loaded
08:51:43  [client ready]
           ← ~4 minutes: prewarmConfiguredPrimaryModel / ensureOpenClawModelsJson →
08:55:38  feishu / browser / telegram all start simultaneously
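The size of the gap follows directly from the two timestamps around it. A small helper makes the arithmetic explicit; it assumes both timestamps fall on the same day, and the function names are illustrative only.

```javascript
// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(hms) {
  const [h, m, s] = hms.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Gap between two same-day timestamps, as seconds and as "Xm Ys".
function gap(from, to) {
  const total = toSeconds(to) - toSeconds(from);
  return { seconds: total, text: `${Math.floor(total / 60)}m ${total % 60}s` };
}

// From [client ready] to the moment the channels start.
const stall = gap("08:51:43", "08:55:38");
console.log(stall.text); // 3m 55s
```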

Secondary Issue: First message slow + simultaneous channel disconnects

After channels start, the first message triggers memory_search which calls ensureProviderInitialized(). Additionally, the first model API call (with full system prompt) takes longer than the Feishu WebSocket idle timeout (~60s), causing:

write ECONNABORTED on Feishu WS + reconnect at the 60s mark
Telegram Polling stall detected at the ~120s mark
Both disconnect at exactly the same moment — confirming event loop / connection starvation
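One way to check the event loop starvation hypothesis is to measure how late a periodic timer fires while the first message is being processed. The probe below is a minimal sketch of that idea and is not part of OpenClaw; `startLagProbe` is a hypothetical helper. Node.js also provides `monitorEventLoopDelay` in `node:perf_hooks` for the same purpose.

```javascript
// Start a probe that ticks every `intervalMs` and records how late the
// latest tick was. Large values mean the event loop was blocked.
function startLagProbe(intervalMs = 20) {
  let last = performance.now();
  let maxLagMs = 0;
  const timer = setInterval(() => {
    const now = performance.now();
    // Time since the previous tick, minus the interval we asked for.
    maxLagMs = Math.max(maxLagMs, now - last - intervalMs);
    last = now;
  }, intervalMs);
  return {
    // Stop the probe and return the worst lag observed, in milliseconds.
    stop() {
      clearInterval(timer);
      return maxLagMs;
    },
  };
}
```

If the value returned by `stop()` is in the range of seconds around the time of the disconnects, the loop was blocked and heartbeats could not be sent. If it stays in the low milliseconds, the cause is more likely on the network or connection side.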

Expected Behavior

prewarmConfiguredPrimaryModel should not block channel startup. Either run it concurrently with channel initialization, or add a short timeout (e.g. 5–10s) so channels are not held waiting.
ensureOpenClawModelsJson should use a short timeout for the startup fetch, falling back to the existing cached models.json if the API call times out.
Channels should be ready to accept messages while model warmup proceeds in the background.
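The timeout-with-fallback behavior described above could look roughly like the following. This is a sketch under stated assumptions, not the project's implementation: `withTimeoutFallback`, `fetchModelList`, and `readCachedModels` are hypothetical names, and the 5000 ms default is simply the lower end of the range suggested above.

```javascript
// Resolve with the task's result, or with `fallback()` if the task takes
// longer than `ms`. This only stops waiting; it does not cancel the task.
// A rejection from the task itself is propagated, not replaced.
async function withTimeoutFallback(task, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ timedOut: true }), ms);
  });
  try {
    const outcome = await Promise.race([
      task.then((value) => ({ timedOut: false, value })),
      timeout,
    ]);
    return outcome.timedOut ? await fallback() : outcome.value;
  } finally {
    clearTimeout(timer);
  }
}

// Hypothetical usage: both callbacks are stand-ins supplied by the caller.
async function loadModelsForStartup(fetchModelList, readCachedModels, ms = 5000) {
  return withTimeoutFallback(fetchModelList(), ms, readCachedModels);
}
```

Because the slow request is left running rather than aborted, its result could still be used to refresh models.json later. Whether that is desirable depends on how the cache is written, which the issue does not describe.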

What was ruled out

❌ Session context size (149K tokens) — new sessions exhibit the same behavior
❌ memory-core local embedding provider blocking — actual provider is github-copilot (remote)
❌ GitHub Copilot API latency — all endpoints respond in <2s when called directly
❌ SQLite vector search blocking — FTS queries take <1ms, full scan 77ms
❌ store.vector.enabled = false — no effect on the 4-minute gap

Workaround

None available currently. The 4-minute delay on every restart is unavoidable with the current code path.

extent analysis

TL;DR

Change the call site in startGatewaySidecars() so that prewarmConfiguredPrimaryModel runs concurrently with channel initialization, or bound it with a short timeout, so that it no longer blocks channel startup.

Guidance

Locate the call to prewarmConfiguredPrimaryModel in startGatewaySidecars() (dist/server.impl-hNr66nDN.js, around line 8554) and run it concurrently with channel initialization, for example with Promise.all.
Alternatively, bound the prewarmConfiguredPrimaryModel call with a short timeout (e.g., 5-10 seconds) so that channel startup proceeds even if warm-up has not finished.
Give the startup fetch in ensureOpenClawModelsJson a short timeout, and fall back to the existing cached models.json if the API call times out.
Run model warm-up in the background so that channels can accept messages while it is still in progress.
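The background warm-up option can be sketched as follows. The function and its injected parameters (`prewarm`, `startChannels`, `log`) are hypothetical stand-ins rather than the actual OpenClaw signatures. The point is that warm-up is started but not awaited, and that its errors are caught so they cannot surface as unhandled rejections.

```javascript
// Start channels without waiting for model warm-up.
// `prewarm`, `startChannels`, and `log` are hypothetical injected stand-ins.
async function startSidecars({ prewarm, startChannels, log }) {
  // Deliberately not awaited. Wrapping in a promise chain also catches a
  // synchronous throw from prewarm(); the catch keeps failures non-fatal.
  const warmup = Promise.resolve()
    .then(() => prewarm())
    .catch((err) => log(`model warm-up failed: ${err.message}`));

  await startChannels(); // resolves as soon as channels are up

  // Callers that need the model ready can still await `warmup`.
  return { warmup };
}
```

The first request after a restart may still be slow if it arrives before warm-up completes, but channels are connected and able to receive it.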

Example

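// Run prewarmConfiguredPrimaryModel concurrently with channel initialization.
// Both calls begin immediately; awaiting the combined promise keeps errors
// inside the surrounding async function instead of leaving them unhandled.
await Promise.all([
  prewarmConfiguredPrimaryModel({ cfg: params.cfg, log: params.log }),
  params.startChannels()
]);
// Channels and model warm-up are complete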

Notes

Channel startup currently waits for prewarmConfiguredPrimaryModel, which produces the ~4-minute delay. Running it concurrently or bounding it with a short timeout removes the wait but does not explain it. The slow call inside ensureOpenClawModelsJson should still be investigated, since the report found that the same API endpoints respond in under 2 seconds when called directly.

Recommendation

Change the call site so that prewarmConfiguredPrimaryModel runs concurrently with channel initialization, or bound it with a short timeout. Either change lets channels start accepting messages while the primary model is still warming up.
