openclaw: First message after restart stalls ~42s (bundled provider tool factories time out sequentially)

GitHub stats
openclaw/openclaw#73808 · Fetched 2026-04-29 06:14:51
Comments: 0 · Participants: 1 · Timeline: 3 (cross-referenced ×2, subscribed ×1) · Reactions: 0

Bug Description

After a clean gateway startup (~6s), the first user message stalls for ~42-55 seconds before the assistant begins responding. Subsequent messages are fast (~3s API round-trip). The stall occurs in the tool initialization phase, where bundled provider plugins run their tool factories sequentially and unconfigured providers time out one by one.

Root Cause

On the first message (embedded run), the gateway initializes tool factories for ALL bundled provider plugins, including providers that are not configured or explicitly disabled. Each unconfigured provider's tool factory makes a network call that times out after ~13 seconds. With 3+ unconfigured providers timing out sequentially, the total stall is ~42s.

The root cause is per-embedded-run provider scanning instead of using cached results from startup resolution. The gateway already resolves providers at startup — tool factories should reuse that result rather than re-probing on first message.
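The reuse described above can be sketched as a small memoizing wrapper. This is only an illustration under stated assumptions: `ProviderResolutionCache`, `ResolvedProvider`, and the probe callback are hypothetical names, not OpenClaw APIs, and the real resolution result may carry more than an id and a flag.

```typescript
// Hypothetical names throughout; none of these are OpenClaw APIs.
interface ResolvedProvider {
  id: string;
  configured: boolean;
}

type Probe = () => Promise<ResolvedProvider[]>;

class ProviderResolutionCache {
  private readonly probe: Probe;
  private resolved: Promise<ResolvedProvider[]> | undefined;

  constructor(probe: Probe) {
    this.probe = probe;
  }

  // The first caller (startup) triggers the probe; later callers
  // (the first embedded run) share the same promise.
  resolve(): Promise<ResolvedProvider[]> {
    if (this.resolved === undefined) {
      this.resolved = this.probe().catch((err: unknown) => {
        this.resolved = undefined; // do not cache a failed resolution
        throw err;
      });
    }
    return this.resolved;
  }
}

let probeCalls = 0;
const cache = new ProviderResolutionCache(async () => {
  probeCalls += 1; // stands in for the expensive network probe
  return [
    { id: "openai-codex", configured: true },
    { id: "xai", configured: false },
  ];
});

async function startupThenFirstMessage(): Promise<number> {
  await cache.resolve(); // gateway startup
  await cache.resolve(); // first embedded run reuses the startup result
  return probeCalls;
}
```

Caching the promise rather than the resolved value means a first message that arrives while startup resolution is still in flight waits on the same probe instead of starting a second one.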

Steps to Reproduce

  1. Start gateway (completes in ~6s)
  2. Send first message via any channel (Telegram, Control UI, API)
  3. Observe ~42-55s delay before assistant response begins
  4. Send a second message — response in ~3s

Expected Behavior

First message response time should be comparable to subsequent messages (~3-5s), since provider resolution already happened at startup.

Actual Behavior

~42s stall on first message. Tool initialization re-probes all providers including:

  • Unconfigured bundled providers (e.g., xai) that time out after ~13s each
  • Providers that were already resolved during startup
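The reported numbers are consistent with simple addition. As a rough model (an assumption, not a measurement), suppose three factories each wait out a ~13s timeout and the normal ~3s round-trip follows. The hypothetical helper `estimateStall` compares running them one after another with running them concurrently:

```typescript
// Hypothetical model of the stall; the inputs are the figures from the report.
interface StallEstimate {
  sequentialSeconds: number;
  parallelSeconds: number;
}

function estimateStall(
  timeoutsSeconds: number[], // one entry per provider whose factory times out
  baselineSeconds: number,   // normal API round-trip once tools are ready
): StallEstimate {
  const sum = timeoutsSeconds.reduce((total, t) => total + t, 0);
  const longest = timeoutsSeconds.length > 0 ? Math.max(...timeoutsSeconds) : 0;
  return {
    sequentialSeconds: sum + baselineSeconds,     // timeouts add up
    parallelSeconds: longest + baselineSeconds,   // only the slowest one counts
  };
}

const estimate = estimateStall([13, 13, 13], 3);
console.log(estimate); // { sequentialSeconds: 42, parallelSeconds: 16 }
```

Under this model, a fourth unconfigured provider would push the sequential figure to 55s, which matches the upper end of the observed ~42-55s range.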

Workaround

  • Disabling individual bundled providers via config (e.g., xai: { enabled: false }) reduces the number of timeouts but does not eliminate the stall, because other unconfigured providers are still probed
  • No complete workaround exists short of patching out the tool factory initialization

Environment

  • OpenClaw v2026.4.26
  • macOS Sequoia 15.x, Mac Mini M4 Pro
  • Primary model: openai-codex/gpt-5.5 (Codex OAuth)
  • Active-memory plugin disabled (not the cause — disabling it alone didn't fix the issue)

Related Issues

  • #73805 — prewarmConfiguredPrimaryModel 30s timeout (same provider resolution code path, but at startup not first-message)
  • #62364 — Slow startup with multiple providers

Suggestion

  1. Cache provider/tool factory results from startup and reuse on first embedded run (don't re-probe)
  2. Skip tool factory initialization for providers that are not configured or have 0 models in registry
  3. If re-probing is necessary, run tool factories in parallel rather than sequentially
  4. Add a short timeout (2-3s) for non-primary provider tool factories — a 13s timeout per provider is excessive for an optimization step
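Suggestions 2 to 4 can be combined in one place. The sketch below is a minimal illustration, assuming a hypothetical `ProviderPlugin` shape with a `configured` flag and an async `createTools` factory; the actual plugin interface is not shown in this report. For simplicity it applies the short timeout to every factory, whereas the suggestion limits it to non-primary providers.

```typescript
// Hypothetical plugin shape; not the actual OpenClaw interface.
interface ProviderPlugin {
  id: string;
  configured: boolean;
  createTools: () => Promise<string[]>; // resolves to tool names
}

// Resolves to `fallback` if `work` has not settled within `ms`.
// Rejections from `work` that arrive before the deadline still propagate.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

async function initTools(plugins: ProviderPlugin[], timeoutMs: number): Promise<string[]> {
  // Suggestion 2: skip providers that are not configured.
  const eligible = plugins.filter((p) => p.configured);
  // Suggestion 3: start all factories at once instead of one by one.
  // Suggestion 4: cap each factory with a short timeout.
  const perPlugin = await Promise.all(
    eligible.map((p) => withTimeout<string[]>(p.createTools(), timeoutMs, [])),
  );
  return perPlugin.reduce<string[]>((all, tools) => all.concat(tools), []);
}

const sleep = (ms: number) => new Promise<void>((done) => setTimeout(done, ms));

// Scaled-down demo: delays are in milliseconds, not seconds.
let unconfiguredCalls = 0;
const plugins: ProviderPlugin[] = [
  { id: "fast", configured: true, createTools: async () => { await sleep(5); return ["search"]; } },
  { id: "slow", configured: true, createTools: async () => { await sleep(400); return ["late"]; } },
  { id: "unconfigured", configured: false, createTools: async () => { unconfiguredCalls += 1; return []; } },
];

initTools(plugins, 100).then((tools) => console.log(tools));
```

With these demo values the slow factory is dropped once the 100 ms cap expires and the unconfigured factory is never invoked, so the worst-case wait is bounded by the single cap rather than by the sum of all timeouts.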

extent analysis

TL;DR

Cache provider results from startup to avoid re-probing on the first embedded run, reducing the stall time.

Guidance

  • Identify and disable unconfigured bundled providers via config to reduce the number of timeouts. Per the report, this shortens the stall but does not eliminate it.
  • Consider implementing a caching mechanism to store provider/tool factory results from startup and reuse them on the first embedded run.
  • Review the tool factory initialization process to determine if it can be optimized to run in parallel or with a shorter timeout.
  • Investigate the possibility of adding a short timeout (2-3s) for non-primary provider tool factories.

Notes

The provided suggestions are based on the root cause analysis and may require further investigation and testing to implement.

Recommendation

Cache provider results from startup and reuse them on the first embedded run. This is a code change rather than a config workaround (the report notes that no complete workaround exists), but it directly addresses the identified root cause and could significantly reduce the stall time.
