openclaw: Telegram direct chat replies spend ~25–31s in embedded agent startup/prep before streaming [3 comments, 4 participants]

openclaw · 2026-05-01 15:54:52


openclaw/openclaw#75725 · Fetched 2026-05-02 05:31:09

Timeline: commented ×3, cross-referenced ×2, closed ×1, subscribed ×1

Telegram direct-chat replies are consistently slow because every normal message appears to pay ~25–31s of embedded agent startup/prep overhead before model streaming begins. This happens even for trivial messages like ping → pong.

This looks like a runtime/performance issue rather than model generation latency: the trace shows most time is spent in model-resolution, auth, core-plugin-tools, system-prompt, and stream-setup before the assistant can start replying.


Environment

OpenClaw: 2026.4.29 (a448042)
OS: macOS 26.3.1 arm64
Gateway runtime: Node 22.22.1
CLI Node observed by openclaw status: Node 25.8.0
Gateway mode: local loopback LaunchAgent
Channel: Telegram direct chat
Model: openai-codex/gpt-5.5
Session context at time of testing: ~80k / 272k tokens (~30%)

What I tried

I reduced the runtime/config surface to rule out obvious local configuration overhead:

Restricted plugins to only openai and telegram
Enabled fast mode for the main agent / model config
Disabled heartbeat system-prompt section injection
Set main agent skillsLimits.maxSkillsPromptChars = 0
Reduced main agent context limits
Restarted the Gateway
Confirmed Telegram messages were received and replied to

Even after these changes, each simple direct chat message still spent roughly 25–31s in startup/prep.

Representative trace samples

Sample 1

startup totalMs=12616
  workspace:0ms
  runtime-plugins:1ms
  hooks:1ms
  model-resolution:6785ms
  auth:3330ms
  context-engine:1ms
  attempt-dispatch:2498ms

prep totalMs=16492
  workspace-sandbox:4ms
  skills:0ms
  core-plugin-tools:6281ms
  bootstrap-context:267ms
  bundle-tools:984ms
  system-prompt:3718ms
  session-resource-loader:959ms
  agent-session:2ms
  stream-setup:4277ms

Total before stream-ready: ~29.1s.

Sample 2

startup totalMs=12849
  model-resolution:6916ms
  auth:3353ms
  attempt-dispatch:2578ms

prep totalMs=16636
  core-plugin-tools:6604ms
  bundle-tools:986ms
  system-prompt:3683ms
  session-resource-loader:993ms
  stream-setup:4351ms

Total before stream-ready: ~29.5s.

Sample 3

startup totalMs=11430
  model-resolution:6175ms
  auth:3008ms
  attempt-dispatch:2246ms

prep totalMs=13794
  core-plugin-tools:5125ms
  bundle-tools:839ms
  system-prompt:3084ms
  session-resource-loader:926ms
  stream-setup:3813ms

Total before stream-ready: ~25.2s.

Sample 4

startup totalMs=15625
  model-resolution:8736ms
  auth:4321ms
  attempt-dispatch:2567ms

prep totalMs=15681
  core-plugin-tools:5473ms
  bundle-tools:892ms
  system-prompt:4301ms
  session-resource-loader:902ms
  stream-setup:4104ms

Total before stream-ready: ~31.3s.
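The "total before stream-ready" figures above are the sum of the `startup` and `prep` totals. A minimal sketch of that arithmetic, assuming the trace text has exactly the shape shown in the samples (the function names here are illustrative and not part of OpenClaw):

```python
import re

# Section headers look like "startup totalMs=11430".
HEADER_RE = re.compile(r"^(\w+) totalMs=(\d+)\s*$")
# Phase lines look like "  model-resolution:6175ms".
PHASE_RE = re.compile(r"^\s*([a-z-]+):(\d+)ms\s*$")


def parse_trace(text):
    """Return {section: {"total": ms, "phases": {name: ms}}} for one sample."""
    sections = {}
    current = None
    for line in text.splitlines():
        header = HEADER_RE.match(line)
        if header:
            current = {"total": int(header.group(2)), "phases": {}}
            sections[header.group(1)] = current
            continue
        phase = PHASE_RE.match(line)
        if phase and current is not None:
            current["phases"][phase.group(1)] = int(phase.group(2))
    return sections


def ms_before_stream_ready(sections):
    """Sum the reported totalMs of every section (startup + prep)."""
    return sum(section["total"] for section in sections.values())


def slowest_phases(sections, n=3):
    """Return the n slowest (phase, ms) pairs across all sections."""
    flat = [
        (name, ms)
        for section in sections.values()
        for name, ms in section["phases"].items()
    ]
    return sorted(flat, key=lambda item: item[1], reverse=True)[:n]


# Sample 3 from the issue, copied verbatim.
SAMPLE_3 = """\
startup totalMs=11430
  model-resolution:6175ms
  auth:3008ms
  attempt-dispatch:2246ms

prep totalMs=13794
  core-plugin-tools:5125ms
  bundle-tools:839ms
  system-prompt:3084ms
  session-resource-loader:926ms
  stream-setup:3813ms
"""

trace = parse_trace(SAMPLE_3)
print(ms_before_stream_ready(trace))  # 25224, i.e. ~25.2s
print(slowest_phases(trace))
```

For Sample 3 the three slowest phases are model-resolution, core-plugin-tools, and stream-setup, which together account for roughly 15.1s of the 25.2s.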

Current interpretation

The main cost appears to be fixed per-turn embedded agent preparation, not the Telegram API or the LLM generating the response.

Likely optimization areas:

Cache model resolution across turns
Cache auth/profile resolution across turns
Cache tool inventory / plugin tool preparation
Cache or incrementally build system prompts
Avoid full embedded agent prep for trivial/direct chat messages
Provide a lightweight reply path for normal chat messages
Keep the runner/session warm between turns
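One way to picture the caching items in this list is a small time-to-live cache placed in front of a slow per-turn step. This is only a sketch: `TtlCache` and `resolve_model` are hypothetical names, and whether OpenClaw's model-resolution, auth, or tool-inventory results are actually safe to reuse across turns is an assumption the issue does not confirm.

```python
import time


class TtlCache:
    """Reuse the result of an expensive step until its entry expires."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the slow call
        value = compute()
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key):
        """Drop an entry, e.g. after a config change or an auth failure."""
        self._entries.pop(key, None)


calls = []


def resolve_model():
    """Hypothetical stand-in for the ~6-9s model-resolution step."""
    calls.append("resolve")
    return {"provider": "openai-codex", "model": "gpt-5.5"}


cache = TtlCache(ttl_seconds=300)
first = cache.get_or_compute("openai-codex/gpt-5.5", resolve_model)
second = cache.get_or_compute("openai-codex/gpt-5.5", resolve_model)
print(len(calls))  # 1: the second turn reused the cached result
```

For auth in particular, the TTL would need to stay below the credential lifetime, and a failed request should call `invalidate` so the next turn resolves fresh credentials.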

Notes

There was separate Telegram/TUN network instability earlier in the setup, but after the Telegram channel became usable, the per-turn embedded agent trace still consistently showed ~25–31s before stream-ready. So the trace above appears to be a separate runtime overhead issue.

extent analysis

TL;DR

Implement caching for model resolution, auth/profile resolution, and tool inventory/plugin tool preparation to reduce the fixed per-turn embedded agent preparation time.

Guidance

Investigate caching for model-resolution and auth, which took 6785ms and 3330ms in Sample 1 and averaged about 7.2s and 3.5s across the four samples.
Optimize core-plugin-tools by caching or incrementally building the tool inventory; it took 6281ms in Sample 1 and averaged about 5.9s across the four samples.
Consider implementing a lightweight reply path for normal chat messages to bypass full embedded agent preparation.
Review system-prompt (3.1–4.3s per turn) and stream-setup (3.8–4.4s per turn), which together add roughly 7–8.5s to every sample.
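To rank the phases, the four samples can be averaged per phase. The sketch below hardcodes the values from the samples in the issue and leaves out the phases that only appear in Sample 1; the `summarize` helper is illustrative.

```python
from statistics import mean

# Per-phase timings in ms, copied from Samples 1-4 in order.
SAMPLES_MS = {
    "model-resolution": [6785, 6916, 6175, 8736],
    "auth": [3330, 3353, 3008, 4321],
    "attempt-dispatch": [2498, 2578, 2246, 2567],
    "core-plugin-tools": [6281, 6604, 5125, 5473],
    "bundle-tools": [984, 986, 839, 892],
    "system-prompt": [3718, 3683, 3084, 4301],
    "session-resource-loader": [959, 993, 926, 902],
    "stream-setup": [4277, 4351, 3813, 4104],
}


def summarize(samples):
    """Return (phase, average, min, max) rows sorted by average, slowest first."""
    rows = [(name, mean(v), min(v), max(v)) for name, v in samples.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


for name, avg, low, high in summarize(SAMPLES_MS):
    print(f"{name:<24} avg={avg:7.1f}ms  range={low}-{high}ms")
```

On these numbers model-resolution has the highest average (7153ms), followed by core-plugin-tools (about 5871ms), so those two are the most promising candidates for caching.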

Example

No code snippet is provided as the issue description does not include specific code references.

Notes

The provided trace samples indicate a consistent overhead in the model-resolution, auth, and core-plugin-tools components, suggesting that caching or optimization in these areas could significantly reduce the per-turn preparation time.

Recommendation

Apply workaround: Implement caching for model resolution, auth/profile resolution, and tool inventory/plugin tool preparation to reduce the fixed per-turn embedded agent preparation time, as these components are identified as the primary contributors to the overhead.


#api #optimization #prompt issue #agent setup #task chaining
