openclaw - 💡(How to fix) Fix [Performance] ~18s pre-flight overhead per agent turn on 2026.4.29 [3 comments, 4 participants]

GitHub stats
openclaw/openclaw#75861 (fetched 2026-05-02 05:28:49)
Comments: 3
Participants: 4
Timeline: 7
Reactions: 5
Timeline (top): commented ×3, closed ×1, cross-referenced ×1, subscribed ×1


[Performance] ~18s pre-flight overhead per agent turn on 2026.4.29

Version: OpenClaw 2026.4.29 (a448042)
Environment: Mac Studio M1 Max, 64GB, macOS (Darwin arm64), Node v25.9.0
Channel: Feishu (WebSocket), DeepSeek V4 Flash model

Summary

Every agent turn incurs a fixed ~18s overhead before the model API call even starts. Combined with startup stages (~10-11s), a simple Feishu message takes 30-40s to get a response. Additionally, persistent event-loop blocking of 7-8s is observed via liveness warnings.

Measured Data (from gateway.err.log)

Pre-flight stages (stream-ready) — consistent across 20+ runs

prep stages: ... phase=stream-ready totalMs=18325 stages=
  workspace-sandbox:   0ms
  core-plugin-tools:   6216ms   ← ~6.2s
  bootstrap-context:     5ms
  bundle-tools:        1104ms   ← ~1.1s
  system-prompt:       4986ms   ← ~5.0s
  session-resource-loader: 1075ms ← ~1.1s
  stream-setup:        4939ms   ← ~5.0s
  total:              18325ms   ← ~18.3s

All subsequent runs show the same distribution (core-plugin-tools ~6.2s, system-prompt ~5.0s, stream-setup ~5.0s, total ~18.3-18.5s).
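To make the ranking explicit, the breakdown above can be converted into per-stage shares of the total. This is a minimal sketch that uses only the figures from the log; the `stageShares` helper is hypothetical and not part of OpenClaw.

```typescript
// Stage durations in milliseconds, copied from the prep-stages log above.
const prepStages: Record<string, number> = {
  "workspace-sandbox": 0,
  "core-plugin-tools": 6216,
  "bootstrap-context": 5,
  "bundle-tools": 1104,
  "system-prompt": 4986,
  "session-resource-loader": 1075,
  "stream-setup": 4939,
};

interface StageShare {
  name: string;
  ms: number;
  percent: number; // share of the summed total, rounded to one decimal place
}

// Hypothetical helper: rank stages by duration and compute each stage's share.
function stageShares(stages: Record<string, number>): { totalMs: number; ranked: StageShare[] } {
  const totalMs = Object.values(stages).reduce((sum, ms) => sum + ms, 0);
  const ranked = Object.entries(stages)
    .map(([name, ms]) => ({
      name,
      ms,
      percent: totalMs === 0 ? 0 : Math.round((ms / totalMs) * 1000) / 10,
    }))
    .sort((a, b) => b.ms - a.ms);
  return { totalMs, ranked };
}

const { totalMs, ranked } = stageShares(prepStages);
console.log(`total: ${totalMs}ms`);
for (const stage of ranked) {
  console.log(`${stage.name.padEnd(24)} ${String(stage.ms).padStart(5)}ms  ${stage.percent}%`);
}
```

The seven stages sum to the logged total of 18325ms, and the three largest (core-plugin-tools, system-prompt, stream-setup) together account for about 88% of it.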

Startup stages — another ~10s before pre-flight

startup stages: ... phase=attempt-dispatch totalMs=10693 stages=
  model-resolution:    2387ms   ← ~2.4s
  auth:                4454ms   ← ~4.5s  ← suspicious
  attempt-dispatch:    3849ms   ← ~3.8s

The auth stage at 4.5s is particularly notable — it seems disproportionately long for auth resolution.

Liveness warnings — event loop blocking 7-8s

liveness warning: reasons=event_loop_delay interval=31s
  eventLoopDelayP99Ms=3026.2  eventLoopDelayMaxMs=8002.7
  eventLoopUtilization=0.781  cpuCoreRatio=0.795

This pattern repeats every 30-36s throughout the logs: the event loop is regularly blocked for 7-8s (one raw excerpt below records a 10.5s maximum), meaning the Node.js main thread is periodically stalled.
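When scanning a long log for these warnings, it can help to extract the numeric fields and compare them against a budget. The sketch below assumes the `key=value` layout shown in the warning above; `parseLivenessFields` is a hypothetical helper, and the 500ms budget is taken from the "Expected" figure stated in this issue.

```typescript
// Hypothetical helper: extract the numeric key=value fields from a liveness warning.
// Non-numeric fields such as reasons=event_loop_delay are skipped.
function parseLivenessFields(line: string): Record<string, number> {
  const fields: Record<string, number> = {};
  const pattern = /(\w+)=(\d+(?:\.\d+)?)/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(line)) !== null) {
    fields[m[1]] = Number(m[2]);
  }
  return fields;
}

const warning =
  "liveness warning: reasons=event_loop_delay interval=31s " +
  "eventLoopDelayP99Ms=3026.2 eventLoopDelayMaxMs=8002.7 " +
  "eventLoopUtilization=0.781 cpuCoreRatio=0.795";

const fields = parseLivenessFields(warning);

// Assumed budget: the issue lists "< 500ms" as the expected event loop max delay.
const MAX_DELAY_BUDGET_MS = 500;
const maxDelayMs = fields["eventLoopDelayMaxMs"];
const overBudget = maxDelayMs > MAX_DELAY_BUDGET_MS;
console.log(`max delay ${maxDelayMs}ms, over budget: ${overBudget}`);
```

For the sample line, the maximum delay of 8002.7ms is roughly 16 times the 500ms budget.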

Agent wait times

One observed agent.wait call: 315808ms (5+ minutes). Another: 48033ms (48s).

Suspected bottlenecks

  1. core-plugin-tools (~6.2s): Likely scanning/loading all available tools on every turn.
  2. system-prompt (~5.0s): Prompt construction with context assembly.
  3. auth stage (~4.5s): Persistently high for a lightweight auth check — may involve unnecessary I/O or re-validation per turn.
  4. Event loop blocking (7-8s): Suggests CPU-intensive synchronous work blocking the event loop, possibly related to tool loading or prompt serialization.

Expected vs Actual

Metric                 Expected   Actual
Pre-flight overhead    < 3s       ~18s
First token latency    < 5s       30-40s
Event loop max delay   < 500ms    7-8s

Raw log excerpts (redacted — internal IDs truncated)

2026-05-02T03:54:16.522+08:00 [agent/embedded] [trace:embedded-run] prep stages: ... totalMs=18325 stages=workspace-sandbox:0ms@0ms,core-plugin-tools:6216ms@6216ms,system-prompt:4986ms@12311ms,stream-setup:4939ms@18325ms
2026-05-02T03:54:18.510+08:00 [diagnostic] liveness warning: reasons=event_loop_delay interval=36s eventLoopDelayMaxMs=7017.1
2026-05-02T03:56:51.515+08:00 [agent/embedded] [trace:embedded-run] startup stages: ... totalMs=10693 stages=model-resolution:2387ms,auth:4454ms,attempt-dispatch:3849ms
2026-05-02T03:56:51.517+08:00 [diagnostic] liveness warning: ... eventLoopDelayMaxMs=10502.5
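The `stages=` field in these trace lines can be parsed mechanically to compare runs. The following sketch is built only from the two line shapes shown above; `parseStages` is a hypothetical helper, and reading the value after `@` as cumulative elapsed time is an assumption based on how the numbers line up.

```typescript
interface StageTiming {
  name: string;
  durationMs: number;
  // Value after "@" in the log, or null when absent (as in the startup-stages line).
  // It appears to be the cumulative elapsed time, which is an assumption.
  offsetMs: number | null;
}

// Hypothetical helper: parse "name:123ms@456ms,other:789ms" after "stages=".
function parseStages(line: string): StageTiming[] {
  const start = line.indexOf("stages=");
  if (start === -1) return [];
  const field = line.slice(start + "stages=".length);
  const pattern = /([\w-]+):(\d+)ms(?:@(\d+)ms)?/g;
  const stages: StageTiming[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(field)) !== null) {
    stages.push({
      name: m[1],
      durationMs: Number(m[2]),
      offsetMs: m[3] === undefined ? null : Number(m[3]),
    });
  }
  return stages;
}

const prepLine =
  "2026-05-02T03:54:16.522+08:00 [agent/embedded] [trace:embedded-run] prep stages: ... " +
  "totalMs=18325 stages=workspace-sandbox:0ms@0ms,core-plugin-tools:6216ms@6216ms," +
  "system-prompt:4986ms@12311ms,stream-setup:4939ms@18325ms";

const stages = parseStages(prepLine);
const listedMs = stages.reduce((sum, s) => sum + s.durationMs, 0);
console.log(stages.map((s) => `${s.name}=${s.durationMs}ms`).join(", "));
console.log(`listed stages sum to ${listedMs}ms of 18325ms`);
```

The four stages in this excerpt sum to 16141ms. The remaining 2184ms matches the bootstrap-context, bundle-tools, and session-resource-loader durations (5 + 1104 + 1075) from the full breakdown, which suggests the excerpt simply omits those stages.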

Steps to reproduce

  1. Run OpenClaw 2026.4.29 with Feishu WebSocket channel
  2. Send any message to the agent
  3. Observe gateway.err.log for prep stages totalMs and liveness warnings
  4. First response typically arrives 30-40s after message send

Impact

  • Poor UX on chat channels (Feishu, Telegram, etc.)
  • Cron jobs queue up and compound delays
  • Event loop blocking affects all concurrent sessions

extent analysis

TL;DR

Optimize the core-plugin-tools, system-prompt, and auth stages to reduce the pre-flight overhead and alleviate event loop blocking.

Guidance

  1. Investigate core-plugin-tools: Analyze the tool loading process to determine if it can be optimized or cached to reduce the 6.2s overhead.
  2. Review system-prompt construction: Examine the prompt assembly process to identify potential bottlenecks and optimize the context assembly.
  3. Verify auth stage: Check the authentication process to ensure it's not performing unnecessary I/O or re-validation per turn, and optimize it if possible.
  4. Address event loop blocking: Identify the CPU-intensive synchronous work causing the 7-8s event loop blocking and refactor it to be asynchronous or optimize its performance.
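For item 4, one common way to keep long synchronous work from stalling the event loop is to process it in batches and yield between them. The sketch below illustrates that general technique; `processInBatches` and its batch size are illustrative assumptions, not OpenClaw code, and the issue does not confirm which work is actually blocking.

```typescript
// Process items in small batches, yielding to the event loop between batches
// so timers, I/O callbacks, and other sessions get a chance to run.
async function processInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  work: (item: T) => R,
): Promise<R[]> {
  if (batchSize < 1) {
    throw new RangeError("batchSize must be at least 1");
  }
  const results: R[] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    for (const item of items.slice(i, i + batchSize)) {
      results.push(work(item));
    }
    // Yield: a zero-delay timer lets pending callbacks run before the next batch.
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return results;
}

// Usage: double five numbers, two per batch.
processInBatches([1, 2, 3, 4, 5], 2, (n) => n * 2).then((doubled) => {
  console.log(doubled.join(","));
});
```

Batching shortens individual stalls but does not reduce the total CPU work; for heavily CPU-bound steps, moving the work to a worker thread may be a better fit.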

Example

No specific code snippet can be provided without more context, but optimizing the core-plugin-tools stage might involve caching frequently used tools or lazy-loading them.
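As an illustration of the caching idea only, the sketch below loads tool definitions lazily and reuses them on later turns. Every name in it (`ToolDefinition`, `ToolCache`, the cache key) is hypothetical; the issue does not show how OpenClaw actually loads tools.

```typescript
// Hypothetical shape of a tool definition; OpenClaw's real types are not shown in the issue.
interface ToolDefinition {
  name: string;
  description: string;
}

// Cache tool definitions per key so the expensive loader runs once per key
// instead of on every agent turn.
class ToolCache {
  private readonly cache = new Map<string, ToolDefinition[]>();
  private readonly load: (key: string) => ToolDefinition[];

  constructor(load: (key: string) => ToolDefinition[]) {
    this.load = load;
  }

  // Lazy load: the loader is invoked only on the first request for a key.
  get(key: string): ToolDefinition[] {
    let tools = this.cache.get(key);
    if (tools === undefined) {
      tools = this.load(key);
      this.cache.set(key, tools);
    }
    return tools;
  }

  // Call when plugins or configuration change so stale tools are not served.
  invalidate(key?: string): void {
    if (key === undefined) {
      this.cache.clear();
    } else {
      this.cache.delete(key);
    }
  }
}

// Usage: count how often the (stand-in) loader actually runs.
let loadCount = 0;
const toolCache = new ToolCache((key) => {
  loadCount += 1;
  return [{ name: `${key}-tool`, description: "example tool" }];
});

toolCache.get("workspace-a");
toolCache.get("workspace-a");
console.log(loadCount); // 1
```

The main design question is invalidation: a cache like this is only safe if something clears it when the installed plugins or their configuration change.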

Notes

The provided data suggests that the pre-flight overhead and event loop blocking are the primary causes of the performance issues. However, without more information about the specific implementation, it's challenging to provide a more detailed solution.

Recommendation

Prioritize the core-plugin-tools, system-prompt, and auth stages, since the measurements point to them as the most likely causes of the delay. The issue does not mention a release that fixes the problem, so upgrading is not a known remedy.
