openclaw - 💡(How to fix) Fix observability: emit per-MCP-server sub-spans inside bundle-tools so cold-start cost is attributable [1 comments, 2 participants]

openclaw2026-05-14 01:30:42

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#81595•Fetched 2026-05-14 03:30:36

View on GitHub

Comments

Participants

Timeline

Reactions

Author

mlopez14

Participants

clawsweeper[bot]

mlopez14

Timeline (top)

commented ×1cross-referenced ×1

The [trace:embedded-run] prep stages line and the timeline diagnostic (openclaw.diagnostics.v1 envelope) both expose bundle-tools as a single aggregate duration. On hosts with multiple stdio MCP servers configured, this stage routinely accounts for the dominant share of per-turn prep time — but it's a black box: there's no way to attribute the cost to a specific MCP server, plugin tool registry, or sub-step.

This makes regressions and duplicate-child situations (e.g. #75621) much harder to diagnose than they need to be, and makes proposed fixes like #80131 (tool bundling memoization) harder to verify after the fact.

Error Message

{"span":"mcp.tools_list","server":"desktop-automation","durationMs":11420,"toolCount":6,"warn":"slow"},

Root Cause

Fix Action

Fix / Workaround

Happy to test patches against this workload — I already have a CSV parser of the prep-stages line and a baseline corpus across two days of bimodal traffic.

Code Example

[trace:embedded-run] prep stages: phase=stream-ready totalMs=21441
  workspace-sandbox:1ms, skills:0ms, core-plugin-tools:320ms,
  bootstrap-context:27ms,
  bundle-tools:20931ms,            ← black box
  system-prompt:5ms, session-resource-loader:137ms,
  agent-session:0ms, stream-setup:20ms

---

{"span":"bundle-tools","durationMs":20931,"children":[
  {"span":"mcp.tools_list","server":"filesystem","durationMs":3120,"toolCount":11},
  {"span":"mcp.tools_list","server":"github","durationMs":1840,"toolCount":29},
  {"span":"mcp.tools_list","server":"desktop-automation","durationMs":11420,"toolCount":6,"warn":"slow"},
  {"span":"mcp.tools_list","server":"graphify-orchestrator","durationMs":1505,"toolCount":7},
  {"span":"mcp.tools_list","server":"graphify-trader","durationMs":1490,"toolCount":7},
  {"span":"mcp.tools_list","server":"graphify-openclaw","durationMs":1456,"toolCount":7},
  {"span":"core-plugin-tools.assemble","durationMs":80}
]}

---

[trace:embedded-run] bundle-tools detail: filesystem:3120ms(11) github:1840ms(29) desktop-automation:11420ms(6,slow) graphify-orchestrator:1505ms(7) graphify-trader:1490ms(7) graphify-openclaw:1456ms(7) core-plugin-tools.assemble:80ms

RAW_BUFFERClick to expand / collapse

Summary

Concrete observation

On a Linux/WSL host running OpenClaw 2026.5.7 with 6 stdio MCP servers configured (filesystem, github, desktop-automation via WSL→cmd.exe, and 3× graphify.serve), bundle-tools shows a bimodal distribution:

Fast mode: ~8 s (event loop quiet)
Slow mode: ~17–20 s (event loop contended by concurrent worker subagents)
Outliers >50 s (one timeout at the 30 s mark on filesystem MCP)

[trace:embedded-run] prep stages: phase=stream-ready totalMs=21441
  workspace-sandbox:1ms, skills:0ms, core-plugin-tools:320ms,
  bootstrap-context:27ms,
  bundle-tools:20931ms,            ← black box
  system-prompt:5ms, session-resource-loader:137ms,
  agent-session:0ms, stream-setup:20ms

All evidence points at sequential tools/list cold-start across the MCP children (re-spawned per prep cycle), with one slow child dominating, but there is no log signal that names which child is slow. Diagnosis required ad-hoc ps/strace/timing experiments.

Proposal

Emit nested sub-spans inside bundle-tools to the timeline diagnostic (openclaw.diagnostics.v1), one per MCP server / tool source. Also include a compact summary line in the existing aggregate [trace:embedded-run] prep stages log when diagnostics.flags includes timeline or a new bundle-tools.detail flag.

Suggested shape (JSONL timeline):

{"span":"bundle-tools","durationMs":20931,"children":[
  {"span":"mcp.tools_list","server":"filesystem","durationMs":3120,"toolCount":11},
  {"span":"mcp.tools_list","server":"github","durationMs":1840,"toolCount":29},
  {"span":"mcp.tools_list","server":"desktop-automation","durationMs":11420,"toolCount":6,"warn":"slow"},
  {"span":"mcp.tools_list","server":"graphify-orchestrator","durationMs":1505,"toolCount":7},
  {"span":"mcp.tools_list","server":"graphify-trader","durationMs":1490,"toolCount":7},
  {"span":"mcp.tools_list","server":"graphify-openclaw","durationMs":1456,"toolCount":7},
  {"span":"core-plugin-tools.assemble","durationMs":80}
]}

Compact one-liner (for the existing log channel):

[trace:embedded-run] bundle-tools detail: filesystem:3120ms(11) github:1840ms(29) desktop-automation:11420ms(6,slow) graphify-orchestrator:1505ms(7) graphify-trader:1490ms(7) graphify-openclaw:1456ms(7) core-plugin-tools.assemble:80ms

Either format is enough to identify the slowest server at a glance.

Why this matters now

Two open performance issues would directly benefit:

#80131 (per-request auth and tool bundling dominate TTFT) — proposes memoizing bundle-tools. With sub-spans we could quantify the actual saving per MCP server and prove the fix lands.
#75621 (duplicate stdio MCP children) — currently diagnosed by inspecting ps after the fact. Per-server timing would surface the duplicate (two children responding to the same (server, config) key) as duplicated sub-spans or anomalous totals.

It also helps users with many MCP servers decide which servers are pulling weight vs being dead-weight cold-start cost.

Backwards compatibility

New sub-spans should only emit when diagnostics.flags includes timeline (already gated) or a new dedicated flag — no change to default behavior.
The aggregate bundle-tools value in the existing [trace:embedded-run] prep stages line stays as-is, so downstream parsers keep working.

Environment

OpenClaw 2026.5.7 (eeef486)
Node v22.22.2
WSL2 / Linux 6.6.87.2-microsoft-standard-WSL2
6 stdio MCP servers configured (filesystem, github, desktop-automation via cmd.exe, 3× graphify.serve)
Channels: Discord, Telegram
Gateway up ~52 h, ~300 prep traces/day, no regression — the cost is structural

Offer

Happy to test patches against this workload — I already have a CSV parser of the prep-stages line and a baseline corpus across two days of bimodal traffic.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#latency issue #model loading #dependency error #configuration error #environment variable

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix observability: emit per-MCP-server sub-spans inside bundle-tools so cold-start cost is attributable [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

Code Example

Summary

Concrete observation

Proposal

Why this matters now

Backwards compatibility

Environment

Offer

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix observability: emit per-MCP-server sub-spans inside bundle-tools so cold-start cost is attributable [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

Code Example

Summary

Concrete observation

Proposal

Why this matters now

Backwards compatibility

Environment

Offer

Still need to ship something?

RELATED_DISCOVERY

TRENDING