openclaw: Embedded agent runs spend ~45-53s before stream-ready on local OpenAI-compatible backend; Telegram getMe health checks appear to bypass configured SOCKS proxy [1 comment, 2 participants]

openclaw/openclaw#75834, fetched 2026-05-02 05:29:19



Summary

I am seeing consistently slow embedded/pi agent runs on OpenClaw 2026.4.29 even though the configured local OpenAI-compatible backend is fast when called directly. Each agent turn spends ~19-23s before attempt-dispatch and ~46-53s before stream-ready, with repeated diagnostic liveness warnings showing event loop delay / high CPU utilization.

There is also a possibly separate Telegram networking issue: channels.telegram.proxy is configured with a SOCKS5 proxy and direct curl --socks5-hostname ... getMe is fast, but OpenClaw still logs fetch-timeout for https://api.telegram.org/.../getMe. This suggests at least one Telegram health/probe code path may not be honoring channels.telegram.proxy.

I can split this into two issues if preferred; reporting together because both show up in the same affected gateway and the Telegram timeouts make latency debugging noisier.

Environment

  • OpenClaw: 2026.4.29 (a448042)
  • Install/update mode reported by openclaw status --deep: pnpm · up to date · npm latest 2026.4.29
  • OS: Arch Linux, linux 7.0.2-arch1-1 (x64)
  • Node: v25.9.0
  • Gateway service: systemd user service, gateway on loopback 127.0.0.1:18789
  • Gateway status: running, connectivity probe OK
  • Model provider: local OpenAI-compatible endpoint

Relevant sanitized config:

{
  "agents": {
    "defaults": {
      "contextInjection": "continuation-skip",
      "bootstrapMaxChars": 12000,
      "bootstrapTotalMaxChars": 60000
    }
  },
  "models": {
    "providers": {
      "<provider>": {
        "api": "openai-completions",
        "baseUrl": "http://localhost:2455/v1"
      }
    }
  },
  "channels": {
    "telegram": {
      "proxy": "socks5://<socks-proxy-host>:1080"
    }
  }
}

Direct backend checks

The backend itself does not appear to explain the latency:

  • GET http://localhost:2455/v1/models: fast
  • direct simple POST /v1/chat/completions: about 1.35s
  • direct streaming chat completion: first line/token around 1.4s
  • POST /v1/completions returned 405; the backend does support /v1/chat/completions, and the OpenClaw provider uses api: "openai-completions", which the docs describe as the right adapter for /v1/chat/completions-compatible backends.
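
The direct timings above can be reproduced with a small script. The following is a minimal Python sketch, not part of OpenClaw: `build_request` and `time_to_first_line` are hypothetical helper names, and the `"stream": true` request field is an assumption based on the usual OpenAI-compatible chat API. The base URL and model name are the ones used elsewhere in this report.

```python
import json
import time
import urllib.request


def build_request(base_url: str, model: str) -> urllib.request.Request:
    """Build a streaming chat-completions POST (hypothetical helper)."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": "Reply with OK only."}],
        "max_tokens": 16,
        "stream": True,  # assumption: backend accepts OpenAI-style streaming
    }).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/chat/completions",
        data=body,  # supplying data makes urllib send a POST
        headers={"Content-Type": "application/json"},
    )


def time_to_first_line(base_url: str, model: str, timeout: float = 30.0) -> float:
    """Return seconds elapsed until the first response body line arrives."""
    req = build_request(base_url, model)
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        resp.readline()  # blocks until the first line of the body is available
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_to_first_line("http://localhost:2455/v1", "gpt-5.5")
    print(f"first streamed line after {elapsed:.2f}s")
```

Running this on the gateway host while an agent turn is in flight would also show whether the backend itself slows down under load, or whether the delay stays entirely inside the gateway.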

Actual behavior

For normal Telegram/group-session turns and isolated watchdog agent turns, embedded-run trace timings are consistently around:

  • startup to attempt-dispatch: ~18.8-23.1s
  • prep to stream-ready: ~46.3-53.1s
  • repeated expensive prep stages:
    • core-plugin-tools: ~5.5-6.8s
    • bundle-tools: ~6.8-8.7s
    • system-prompt: ~15.9-17.7s
    • stream-setup: ~16.7-19.0s
  • liveness warnings during/around runs, sometimes with event loop delay >17s and eventLoopUtilization near 1
  • one observed cleanup timeout: step=pi-trajectory-flush timeoutMs=10000

This remains true after lowering bootstrap limits and setting contextInjection: "continuation-skip".

Expected behavior

With a local /v1/chat/completions backend responding in ~1-1.5s and no large bootstrap files, an embedded agent turn should not spend ~45-53s before reaching stream-ready. If expensive prompt/tool/session setup is expected, it would be helpful to know which config controls it; currently the trace suggests CPU/event-loop-bound synchronous work inside the gateway/embedded runtime.

For Telegram, when channels.telegram.proxy is configured, Telegram API requests such as getMe health/probe calls should use that proxy consistently, or logs/status should identify which path is bypassing it.

Reproduction / diagnostics performed

  1. Confirm gateway is running:

openclaw gateway status
openclaw status --deep

Status shows gateway reachable, Telegram OK, security audit clean.

  2. Confirm local backend speed directly:

curl -sS http://localhost:2455/v1/models
curl -sS http://localhost:2455/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{"model":"gpt-5.5","messages":[{"role":"user","content":"Reply with OK only."}],"max_tokens":16}'

Observed simple chat completion around 1.35s; streaming first line around 1.4s.

  3. Send normal OpenClaw agent turns and inspect logs:

grep -E "trace:embedded-run|liveness warning|pi-trajectory-flush" /tmp/openclaw/openclaw-2026-05-02.log | tail -50

  4. Configure Telegram SOCKS proxy, restart the gateway, and compare a manual Telegram API call via the proxy against the OpenClaw logs:

# configured in openclaw.json
channels.telegram.proxy = "socks5://<socks-proxy-host>:1080"

curl --socks5-hostname <socks-proxy-host>:1080 \
  -sS -o /tmp/tg-getme.json \
  -w 'time=%{time_total} code=%{http_code}\n' \
  'https://api.telegram.org/bot<redacted>/getMe'

Manual proxy result: time=0.404185 code=200.

Direct non-proxied Telegram call from the same host timed out after 20s. OpenClaw still logged fetch-timeout for getMe after proxy config + restart.

Sanitized log snippets

Startup stage example:

[trace:embedded-run] startup stages: phase=attempt-dispatch totalMs=20016 stages=workspace:1ms@1ms,runtime-plugins:1ms@2ms,hooks:0ms@2ms,model-resolution:2160ms@2162ms,auth:8348ms@10510ms,context-engine:0ms@10510ms,attempt-dispatch:9506ms@20016ms

Prep stage examples:

[trace:embedded-run] prep stages: phase=stream-ready totalMs=49740 stages=workspace-sandbox:8ms@8ms,skills:0ms@8ms,core-plugin-tools:5994ms@6002ms,bootstrap-context:37ms@6039ms,bundle-tools:7613ms@13652ms,system-prompt:17142ms@30794ms,session-resource-loader:1309ms@32103ms,agent-session:3ms@32106ms,stream-setup:17633ms@49739ms

[trace:embedded-run] prep stages: phase=stream-ready totalMs=53114 stages=workspace-sandbox:7ms@7ms,skills:0ms@7ms,core-plugin-tools:6469ms@6476ms,bootstrap-context:7ms@6483ms,bundle-tools:8660ms@15143ms,system-prompt:17695ms@32838ms,session-resource-loader:1228ms@34066ms,agent-session:1ms@34067ms,stream-setup:19047ms@53114ms
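
To compare many such lines, the stage list can be parsed and ranked. This is a rough Python sketch that assumes the `name:<duration>ms@<offset>ms` layout seen in the lines above holds for every trace line; `parse_trace` and `slowest` are hypothetical helper names, not OpenClaw APIs.

```python
import re

# One "name:<duration>ms@<cumulative offset>ms" entry in the stages= list.
STAGE_RE = re.compile(r"([\w-]+):(\d+)ms@(\d+)ms")

# First prep-stage line from this report, copied verbatim.
SAMPLE = (
    "[trace:embedded-run] prep stages: phase=stream-ready totalMs=49740 "
    "stages=workspace-sandbox:8ms@8ms,skills:0ms@8ms,core-plugin-tools:5994ms@6002ms,"
    "bootstrap-context:37ms@6039ms,bundle-tools:7613ms@13652ms,"
    "system-prompt:17142ms@30794ms,session-resource-loader:1309ms@32103ms,"
    "agent-session:3ms@32106ms,stream-setup:17633ms@49739ms"
)


def parse_trace(line: str) -> dict:
    """Parse one trace line into its phase, total, and per-stage durations."""
    phase = re.search(r"phase=([\w-]+)", line).group(1)
    total_ms = int(re.search(r"totalMs=(\d+)", line).group(1))
    stages_field = line.split("stages=", 1)[1]
    stages = [(name, int(dur)) for name, dur, _offset in STAGE_RE.findall(stages_field)]
    return {"phase": phase, "total_ms": total_ms, "stages": stages}


def slowest(trace: dict, n: int = 4) -> list:
    """Return the n slowest stages as (name, duration_ms, share_of_total)."""
    ranked = sorted(trace["stages"], key=lambda stage: stage[1], reverse=True)
    return [(name, dur, dur / trace["total_ms"]) for name, dur in ranked[:n]]


if __name__ == "__main__":
    for name, dur, share in slowest(parse_trace(SAMPLE)):
        print(f"{name:20s} {dur:6d} ms  {share:.1%}")
```

For the first prep line above, the four slowest stages (stream-setup, system-prompt, bundle-tools, core-plugin-tools) add up to 48382 ms of the 49740 ms total, roughly 97%, so the remaining stages are effectively negligible.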

Liveness / cleanup examples:

liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=31s eventLoopDelayP99Ms=22431.1 eventLoopDelayMaxMs=22431.1 eventLoopUtilization=1 cpuCoreRatio=1.005 active=0 waiting=0 queued=0

agent cleanup timed out: step=pi-trajectory-flush timeoutMs=10000
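
The liveness line can be turned into numbers for a quick sanity check. The sketch below is illustrative Python, with two assumptions that the issue does not confirm: that `interval` is the sampling window and that `eventLoopDelayMaxMs` is the longest single stall within that window. `parse_liveness` and `stall_fraction` are hypothetical names.

```python
# Liveness line from this report, copied verbatim.
SAMPLE = (
    "liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu "
    "interval=31s eventLoopDelayP99Ms=22431.1 eventLoopDelayMaxMs=22431.1 "
    "eventLoopUtilization=1 cpuCoreRatio=1.005 active=0 waiting=0 queued=0"
)


def parse_liveness(line: str) -> dict:
    """Split the key=value tokens of a liveness warning into a dict."""
    _, _, rest = line.partition("liveness warning:")
    fields = {}
    for token in rest.split():
        key, sep, value = token.partition("=")
        if not sep:
            continue  # skip tokens that are not key=value pairs
        try:
            fields[key] = float(value)
        except ValueError:
            fields[key] = value  # non-numeric, e.g. reasons=... or interval=31s
    return fields


def stall_fraction(fields: dict) -> float:
    """Longest event-loop delay as a fraction of the sampling interval.

    Assumes interval is given in seconds with a trailing "s".
    """
    interval_ms = float(str(fields["interval"]).rstrip("s")) * 1000.0
    return fields["eventLoopDelayMaxMs"] / interval_ms


if __name__ == "__main__":
    parsed = parse_liveness(SAMPLE)
    print(f"max stall covers {stall_fraction(parsed):.0%} of the interval")
```

Under those assumptions, the sample line describes a single stall of about 22.4s inside a 31s window, which is in the same range as the system-prompt and stream-setup stage durations.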

Telegram proxy-related example:

[fetch-timeout] {"timeoutMs":10000,"elapsedMs":14948,"operation":"fetchWithTimeout","url":"https://api.telegram.org/bot<redacted>/getMe"} fetch timeout reached; aborting operation
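
One detail in this line is worth quantifying: the reported elapsedMs exceeds the configured timeoutMs. The Python sketch below extracts the JSON payload and computes that overshoot; `parse_fetch_timeout` and `overshoot_ms` are hypothetical names, and the interpretation that follows is a hypothesis rather than a confirmed diagnosis.

```python
import json

# Fetch-timeout line from this report, copied verbatim.
SAMPLE = (
    '[fetch-timeout] {"timeoutMs":10000,"elapsedMs":14948,'
    '"operation":"fetchWithTimeout",'
    '"url":"https://api.telegram.org/bot<redacted>/getMe"} '
    "fetch timeout reached; aborting operation"
)


def parse_fetch_timeout(line: str) -> dict:
    """Decode the JSON object that follows the [fetch-timeout] tag."""
    start = line.index("{")
    # raw_decode tolerates the trailing free-text message after the object.
    payload, _end = json.JSONDecoder().raw_decode(line[start:])
    return payload


def overshoot_ms(payload: dict) -> int:
    """How long after the configured timeout the abort was actually recorded."""
    return payload["elapsedMs"] - payload["timeoutMs"]


if __name__ == "__main__":
    info = parse_fetch_timeout(SAMPLE)
    print(f"{info['operation']} overshot its timeout by {overshoot_ms(info)} ms")
```

An abort recorded almost 5s after a 10s deadline would be consistent with the timer callback itself being delayed by a blocked event loop. If so, some getMe timeouts could be a side effect of the latency problem rather than, or in addition to, a proxy bypass, which is one more reason to check both paths separately.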

Notes / hypotheses

  • Direct model inference is fast, so the delay seems to be inside OpenClaw embedded/pi agent preparation rather than the provider backend.
  • contextInjection: "continuation-skip" and reduced bootstrap limits did not materially change the timings.
  • The repeated system-prompt and stream-setup costs look CPU/event-loop-bound from gateway liveness warnings.
  • pi-trajectory-flush timeout may be related, or just another symptom of event loop blockage.
  • The Telegram getMe timeout may be a separate bug where a status/health/probe path uses raw fetch rather than the configured Telegram proxy.

Happy to run a more targeted debug build/trace if there is a recommended flag for breaking down system-prompt and stream-setup further.

extent analysis

TL;DR

The most likely fix involves investigating and optimizing the system-prompt and stream-setup stages in the OpenClaw embedded agent preparation, which seem to be CPU/event-loop-bound, and ensuring the Telegram proxy is correctly used for all API requests.

Guidance

  • Investigate the system-prompt and stream-setup stages to understand what causes their high latency and optimize their performance.
  • Verify that the Telegram proxy configuration is correctly applied to all Telegram API requests, including health/probe calls.
  • Consider running a targeted debug build with additional tracing to break down the system-prompt and stream-setup stages further.
  • Review the gateway's event loop utilization and CPU usage to identify potential bottlenecks.

Example

No specific code snippet is provided as the issue seems to be related to the configuration and performance of the OpenClaw embedded agent and Telegram proxy.

Notes

The issue appears to be complex and may require a deeper investigation into the OpenClaw embedded agent and Telegram proxy configuration. The provided information suggests that the delay is not caused by the model inference itself, but rather by the preparation stages within OpenClaw.

Recommendation

No confirmed workaround is identified in the issue. The practical next steps are to profile the system-prompt and stream-setup stages and to determine which Telegram request path bypasses channels.telegram.proxy; the root cause remains unclear and needs further investigation.

