openclaw: Telegram direct chat replies spend ~25–31s in embedded agent startup/prep before streaming [3 comments, 4 participants]

openclaw/openclaw#75725 · Fetched 2026-05-02 05:31:09
Comments: 3 · Participants: 4 · Timeline events: 7 · Reactions: 3
Timeline (top): commented ×3, cross-referenced ×2, closed ×1, subscribed ×1

Telegram direct-chat replies are consistently slow because every normal message appears to pay ~25–31s of embedded agent startup/prep overhead before model streaming begins. This happens even for trivial messages like pingpong.

This looks like a runtime/performance issue rather than model generation latency: the trace shows most time is spent in model-resolution, auth, core-plugin-tools, system-prompt, and stream-setup before the assistant can start replying.


Environment

  • OpenClaw: 2026.4.29 (a448042)
  • OS: macOS 26.3.1 arm64
  • Gateway runtime: Node 22.22.1
  • CLI Node observed by openclaw status: Node 25.8.0
  • Gateway mode: local loopback LaunchAgent
  • Channel: Telegram direct chat
  • Model: openai-codex/gpt-5.5
  • Session context at time of testing: ~80k / 272k tokens (~30%)

What I tried

I reduced the runtime/config surface to rule out obvious local configuration overhead:

  • Restricted plugins to only openai and telegram
  • Enabled fast mode for the main agent / model config
  • Disabled heartbeat system-prompt section injection
  • Set main agent skillsLimits.maxSkillsPromptChars = 0
  • Reduced main agent context limits
  • Restarted the Gateway
  • Confirmed Telegram messages were received and replied to

Even after these changes, each simple direct chat message still spent roughly 25–31s in startup/prep.

Representative trace samples

Sample 1

startup totalMs=12616
  workspace:0ms
  runtime-plugins:1ms
  hooks:1ms
  model-resolution:6785ms
  auth:3330ms
  context-engine:1ms
  attempt-dispatch:2498ms

prep totalMs=16492
  workspace-sandbox:4ms
  skills:0ms
  core-plugin-tools:6281ms
  bootstrap-context:267ms
  bundle-tools:984ms
  system-prompt:3718ms
  session-resource-loader:959ms
  agent-session:2ms
  stream-setup:4277ms

Total before stream-ready: ~29.1s.

Sample 2

startup totalMs=12849
  model-resolution:6916ms
  auth:3353ms
  attempt-dispatch:2578ms

prep totalMs=16636
  core-plugin-tools:6604ms
  bundle-tools:986ms
  system-prompt:3683ms
  session-resource-loader:993ms
  stream-setup:4351ms

Total before stream-ready: ~29.5s.

Sample 3

startup totalMs=11430
  model-resolution:6175ms
  auth:3008ms
  attempt-dispatch:2246ms

prep totalMs=13794
  core-plugin-tools:5125ms
  bundle-tools:839ms
  system-prompt:3084ms
  session-resource-loader:926ms
  stream-setup:3813ms

Total before stream-ready: ~25.2s.

Sample 4

startup totalMs=15625
  model-resolution:8736ms
  auth:4321ms
  attempt-dispatch:2567ms

prep totalMs=15681
  core-plugin-tools:5473ms
  bundle-tools:892ms
  system-prompt:4301ms
  session-resource-loader:902ms
  stream-setup:4104ms

Total before stream-ready: ~31.3s.
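
The "Total before stream-ready" figures are the sum of the two phase totals. As a minimal sketch, the following parses the Sample 1 trace text and reproduces that arithmetic. The parser and its regular expressions are written for this illustration and assume the trace format shown above; they are not part of OpenClaw.

```python
import re

# Trace text copied verbatim from Sample 1 above.
SAMPLE_1 = """\
startup totalMs=12616
  workspace:0ms
  runtime-plugins:1ms
  hooks:1ms
  model-resolution:6785ms
  auth:3330ms
  context-engine:1ms
  attempt-dispatch:2498ms

prep totalMs=16492
  workspace-sandbox:4ms
  skills:0ms
  core-plugin-tools:6281ms
  bootstrap-context:267ms
  bundle-tools:984ms
  system-prompt:3718ms
  session-resource-loader:959ms
  agent-session:2ms
  stream-setup:4277ms
"""

# Assumed line shapes: "<phase> totalMs=<n>" and "  <stage>:<n>ms".
HEADER = re.compile(r"^(\w+) totalMs=(\d+)$")
STAGE = re.compile(r"^\s+([\w-]+):(\d+)ms$")


def parse_trace(text):
    """Return {phase: {"total": ms, "stages": {name: ms}}} for one sample."""
    phases = {}
    current = None
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            current = {"total": int(header.group(2)), "stages": {}}
            phases[header.group(1)] = current
            continue
        stage = STAGE.match(line)
        if stage and current is not None:
            current["stages"][stage.group(1)] = int(stage.group(2))
    return phases


phases = parse_trace(SAMPLE_1)
for name, phase in phases.items():
    # In the full Sample 1 trace, the stages add up to the reported totalMs.
    assert sum(phase["stages"].values()) == phase["total"], name

before_stream_ms = sum(phase["total"] for phase in phases.values())
print(f"{before_stream_ms / 1000:.1f}s before stream-ready")  # 29.1s
```

The stage-sum assertion holds only for Sample 1. Samples 2–4 list just the dominant stages, so their stage sums fall slightly short of the reported totalMs.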

Current interpretation

The main cost appears to be fixed per-turn embedded agent preparation, not the Telegram API or the LLM generating the response.

Likely optimization areas:

  • Cache model resolution across turns
  • Cache auth/profile resolution across turns
  • Cache tool inventory / plugin tool preparation
  • Cache or incrementally build system prompts
  • Avoid full embedded agent prep for trivial/direct chat messages
  • Provide a lightweight reply path for normal chat messages
  • Keep the runner/session warm between turns
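
The caching items in the list above could be sketched as a small time-bounded cache wrapped around each expensive step. This is a language-neutral illustration in Python, not OpenClaw code: `TtlCache`, `resolve_model_slowly`, and the 300-second TTL are all hypothetical names and values chosen for the example.

```python
import time


class TtlCache:
    """Tiny in-memory cache: recompute a value only after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the expensive call
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        # Call this when configuration or credentials change.
        self._entries.pop(key, None)


calls = {"count": 0}


def resolve_model_slowly(model_id):
    # Hypothetical stand-in for the per-turn model-resolution step.
    calls["count"] += 1
    return {"id": model_id, "resolved": True}


cache = TtlCache(ttl_seconds=300)
for _turn in range(3):
    model = cache.get_or_compute(
        ("model", "openai-codex/gpt-5.5"),
        lambda: resolve_model_slowly("openai-codex/gpt-5.5"),
    )

print(calls["count"])  # 1: only the first turn pays for resolution
```

For auth/profile resolution, the TTL would need to stay below the credential lifetime, and any cached entry should be invalidated when the configuration changes.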

Notes

There was separate Telegram/TUN network instability earlier in the setup, but after the Telegram channel became usable, the per-turn embedded agent trace still consistently showed ~25–31s before stream-ready. So the trace above appears to be a separate runtime overhead issue.

extent analysis

TL;DR

Implement caching for model resolution, auth/profile resolution, and tool inventory/plugin tool preparation to reduce the fixed per-turn embedded agent preparation time.

Guidance

  • Investigate caching for model-resolution and auth, which took 6.2–8.7s and 3.0–4.3s per turn across the four samples (means of about 7153ms and 3503ms).
  • Optimize core-plugin-tools by caching or incrementally building the tool inventory; it took 5.1–6.6s per turn (mean of about 5871ms).
  • Consider a lightweight reply path for normal chat messages that bypasses full embedded agent preparation.
  • Review system-prompt (3.1–4.3s) and stream-setup (3.8–4.4s), which together add about 6.9–8.4s per turn.
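
One way to picture the lightweight reply path from the list above is a warm runner that is prepared once per session and rebuilt only when the configuration changes. The sketch below is illustrative Python under that assumption. `PreparedRunner`, `WarmRunnerPool`, and the fingerprint strings are hypothetical and do not correspond to any known OpenClaw API.

```python
from dataclasses import dataclass, field


@dataclass
class PreparedRunner:
    """Hypothetical bundle of everything the prep phase builds."""
    fingerprint: str
    tools: list = field(default_factory=list)
    system_prompt: str = ""


class WarmRunnerPool:
    """Keep one prepared runner per session; rebuild only on config change."""

    def __init__(self, prepare):
        self._prepare = prepare  # the expensive preparation function
        self._runners = {}       # session_id -> PreparedRunner
        self.prepare_calls = 0

    def runner_for(self, session_id, fingerprint):
        runner = self._runners.get(session_id)
        if runner is None or runner.fingerprint != fingerprint:
            self.prepare_calls += 1
            runner = self._prepare(fingerprint)
            self._runners[session_id] = runner
        return runner


def prepare(fingerprint):
    # Stand-in for the core-plugin-tools, system-prompt and stream-setup work.
    return PreparedRunner(
        fingerprint,
        tools=["openai", "telegram"],
        system_prompt="placeholder prompt",
    )


pool = WarmRunnerPool(prepare)
for _message in ["ping", "pong", "hello"]:
    pool.runner_for("telegram:direct", fingerprint="plugins=openai,telegram")

# A changed fingerprint models a configuration change and forces a rebuild.
pool.runner_for("telegram:direct", fingerprint="plugins=openai")
print(pool.prepare_calls)  # 2
```

Keying the rebuild on a configuration fingerprint rather than a timer means trivial follow-up messages reuse the prepared state, while a plugin or model change still triggers full preparation.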

Notes

The provided trace samples indicate a consistent overhead in the model-resolution, auth, and core-plugin-tools components, suggesting that caching or optimization in these areas could significantly reduce the per-turn preparation time.
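
To check which stages dominate, the per-stage means across the four samples can be computed directly. This is a small analysis sketch: the numbers are copied from Samples 1–4, and the stage selection and table layout are choices made for this example.

```python
from statistics import mean

# Dominant stage timings in ms, copied from Samples 1-4.
SAMPLES = {
    "model-resolution": [6785, 6916, 6175, 8736],
    "auth": [3330, 3353, 3008, 4321],
    "attempt-dispatch": [2498, 2578, 2246, 2567],
    "core-plugin-tools": [6281, 6604, 5125, 5473],
    "bundle-tools": [984, 986, 839, 892],
    "system-prompt": [3718, 3683, 3084, 4301],
    "session-resource-loader": [959, 993, 926, 902],
    "stream-setup": [4277, 4351, 3813, 4104],
}

# startup totalMs + prep totalMs for each sample.
TOTALS = [12616 + 16492, 12849 + 16636, 11430 + 13794, 15625 + 15681]

mean_total = mean(TOTALS)

# Rank stages by mean duration, slowest first.
ranked = sorted(SAMPLES.items(), key=lambda item: mean(item[1]), reverse=True)
for stage, values in ranked:
    avg = mean(values)
    print(f"{stage:<24}{avg:>8.0f} ms  {avg / mean_total:6.1%}")
```

On these numbers, model-resolution, auth, and core-plugin-tools account for roughly 57% of the mean pre-stream time. Adding system-prompt and stream-setup brings the share to about 85%.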

Recommendation

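No configuration-level workaround was found: restricting plugins, enabling fast mode, and trimming prompt and context limits still left roughly 25–31s of overhead per turn. The fix therefore belongs in the runtime: cache model resolution, auth/profile resolution, and tool inventory/plugin tool preparation across turns, since these stages are among the largest contributors to the fixed per-turn preparation time.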
