
openclaw - [Bug]: 2026.4.29 gateway instability: Active Memory timeouts, embedded-run prep latency, event-loop pressure [1 comment, 2 participants]

harleymdsavage · 2026-05-02T18:47:14Z



GitHub stats

openclaw/openclaw#76212 • Fetched 2026-05-03 04:40:40

Author: harleymdsavage

Participants: clawsweeper[bot], harleymdsavage

Timeline (top): closed ×1, commented ×1

After upgrading to OpenClaw 2026.4.29, the gateway became noticeably unstable and slow. The main symptoms are high per-turn latency, Active Memory timeouts, repeated embedded-run startup/prep overhead, event-loop delay warnings, and slow control/session RPCs. Temporarily disabling Active Memory did not resolve the overall instability, so Active Memory appears to be affected by the broader embedded-run/gateway performance regression rather than being the only root cause.

This looks related to the current 2026.4.29 regression cluster, especially #76123, #76166, #76047, #76048, and the superseded Active Memory report #76043.



Bug type

Regression (worked better before, now unstable/slow)

Summary

This looks related to the current 2026.4.29 regression cluster, especially #76123, #76166, #76047, #76048, and the superseded Active Memory report #76043.

Environment

- OpenClaw: `2026.4.29 (a448042)`
- Install/update channel: stable / npm latest
- OS: Ubuntu Linux x64
- Node: `v22.22.2`
- Gateway: systemd service, local loopback gateway
- Primary model route: `openai-codex/gpt-5.5`
- Active Memory model route: `zai/glm-5.1`
- Telegram direct channel is enabled and generally reachable

Observed behavior

1. Normal direct-message turns are slow or appear stuck.
2. Active Memory runs time out during prompt preparation / hidden embedded runs.
3. Gateway logs show large embedded-run startup/prep costs before model output begins.
4. Gateway logs show event-loop/liveness warnings under normal use.
5. `sessions.list` and `node.list` calls are repeatedly slow enough to show up in logs and likely contribute to gateway pressure.
6. Disabling Active Memory alone did not fix the broader instability.
7. Telegram health reports OK, but there are transient outbound Bot API failures such as `sendChatAction` network failures.

Local evidence / log excerpts

Representative gateway log excerpts, sanitized:

```text
[plugins] active-memory: agent=main session=<direct-session> start timeoutMs=30000 model=zai/glm-5.1
[plugins] [hooks] before_prompt_build handler from active-memory failed: timed out after 60000ms
[diagnostic] lane task error: lane=main durationMs=60719 error="CommandLaneTaskTimeoutError: Command lane \"main\" task timed out after 60000ms"
[plugins] active-memory: agent=main session=<direct-session> done status=timeout elapsedMs=60825 summaryChars=0
[agent/embedded] embedded run failover decision: stage=assistant decision=surface_error reason=timeout from=zai/glm-5.1
```
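One detail in this excerpt is easy to miss: the run starts with `timeoutMs=30000` but is only reported as timed out after roughly 60 seconds. The following is a minimal sketch that makes the gap explicit by comparing the two values. It assumes only the line shapes shown in the excerpt; the function name `timeout_overshoot_ms` is hypothetical and not part of OpenClaw.

```python
import re

# Sample lines copied from the excerpt above.
LOG = """\
[plugins] active-memory: agent=main session=<direct-session> start timeoutMs=30000 model=zai/glm-5.1
[plugins] [hooks] before_prompt_build handler from active-memory failed: timed out after 60000ms
[plugins] active-memory: agent=main session=<direct-session> done status=timeout elapsedMs=60825 summaryChars=0
"""


def timeout_overshoot_ms(log: str) -> int:
    """Return elapsedMs minus the configured timeoutMs for one recall run."""
    configured = int(re.search(r"start timeoutMs=(\d+)", log).group(1))
    elapsed = int(re.search(r"status=timeout elapsedMs=(\d+)", log).group(1))
    return elapsed - configured


overshoot = timeout_overshoot_ms(LOG)
print(f"recall ran {overshoot} ms past its configured timeout")
```

Here the recall ran about 30.8 s past its own 30 s budget, which suggests the effective limit was the 60000 ms hook/lane timeout rather than the plugin's configured value.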

Embedded-run startup/prep examples:

```text
[trace:embedded-run] startup stages: phase=attempt-dispatch totalMs=16889 stages=runtime-plugins:9907ms,model-resolution:1763ms,auth:2627ms,attempt-dispatch:2591ms
[trace:embedded-run] prep stages: phase=stream-ready totalMs=20049 stages=core-plugin-tools:9575ms,bundle-tools:877ms,system-prompt:4323ms,session-resource-loader:894ms,stream-setup:4309ms
```
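To see which stage dominates each trace line, the `stages=` field can be split and ranked. This is a rough sketch that assumes the `name:NNNms` comma-separated layout seen in the two lines above; `parse_stages` is a hypothetical helper written for this illustration.

```python
import re

# The two trace lines from the excerpt above.
LINES = [
    "[trace:embedded-run] startup stages: phase=attempt-dispatch totalMs=16889 "
    "stages=runtime-plugins:9907ms,model-resolution:1763ms,auth:2627ms,attempt-dispatch:2591ms",
    "[trace:embedded-run] prep stages: phase=stream-ready totalMs=20049 "
    "stages=core-plugin-tools:9575ms,bundle-tools:877ms,system-prompt:4323ms,"
    "session-resource-loader:894ms,stream-setup:4309ms",
]


def parse_stages(line: str) -> tuple[int, dict[str, int]]:
    """Extract totalMs and the per-stage durations from one trace line."""
    total = int(re.search(r"totalMs=(\d+)", line).group(1))
    stages_field = re.search(r"stages=(\S+)", line).group(1)
    stages = {}
    for item in stages_field.split(","):
        name, ms = item.rsplit(":", 1)
        stages[name] = int(ms.removesuffix("ms"))
    return total, stages


for line in LINES:
    total, stages = parse_stages(line)
    slowest = max(stages, key=stages.get)
    share = stages[slowest] / total
    print(f"total={total}ms slowest={slowest} ({stages[slowest]}ms, {share:.0%})")
```

In both lines the plugin-related stage is the largest single cost: `runtime-plugins` takes 9907 of 16889 ms during startup and `core-plugin-tools` takes 9575 of 20049 ms during prep.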

Event-loop/liveness example:

```text
[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu eventLoopDelayP99Ms=9412 eventLoopUtilization=0.995 cpuCoreRatio=1.088
```
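For readers triaging similar warnings, the fields can be pulled into typed values. This is a sketch that assumes the `key=value` layout of the line above; `parse_liveness` is a hypothetical helper. It also assumes the field names follow the usual Node.js meaning, where event-loop utilization is the fraction of time the loop was busy rather than idle.

```python
# The liveness warning from the excerpt above.
LINE = (
    "[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu "
    "eventLoopDelayP99Ms=9412 eventLoopUtilization=0.995 cpuCoreRatio=1.088"
)


def parse_liveness(line: str) -> dict:
    """Split the key=value tokens of a liveness warning into typed fields."""
    fields = {}
    for token in line.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return {
        "reasons": fields["reasons"].split(","),
        "eventLoopDelayP99Ms": int(fields["eventLoopDelayP99Ms"]),
        "eventLoopUtilization": float(fields["eventLoopUtilization"]),
        "cpuCoreRatio": float(fields["cpuCoreRatio"]),
    }


warning = parse_liveness(LINE)
delay_s = warning["eventLoopDelayP99Ms"] / 1000
busy_pct = warning["eventLoopUtilization"] * 100
print(f"reasons={warning['reasons']}")
print(f"p99 event-loop delay: {delay_s:.1f} s, loop busy: {busy_pct:.1f}%")
```

Read this way, the sample reports a p99 delay of about 9.4 s with the loop busy 99.5% of the time, so any timer or I/O callback on the gateway could plausibly wait several seconds before running.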

Slow RPC examples:

```text
[ws] res sessions.list ~2180ms-2210ms
[ws] res node.list ~2180ms-2640ms
```
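A small filter can surface RPCs like these from a larger log. The sketch below matches only the summarized `~MINms-MAXms` form used in this report, which may differ from the gateway's raw log format. The 1000 ms threshold and the name `slow_rpcs` are assumptions chosen for illustration.

```python
import re

# The two summarized RPC lines from the excerpt above.
LINES = [
    "[ws] res sessions.list ~2180ms-2210ms",
    "[ws] res node.list ~2180ms-2640ms",
]

# Method name, then the lower and upper bound of the reported range.
PATTERN = re.compile(r"\[ws\] res (\S+) ~(\d+)ms-(\d+)ms")


def slow_rpcs(lines, threshold_ms):
    """Map method name to (min_ms, max_ms) when max_ms reaches threshold_ms."""
    slow = {}
    for line in lines:
        match = PATTERN.search(line)
        if match and int(match.group(3)) >= threshold_ms:
            slow[match.group(1)] = (int(match.group(2)), int(match.group(3)))
    return slow


for method, (low, high) in slow_rpcs(LINES, threshold_ms=1000).items():
    print(f"{method}: {low}-{high} ms")
```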

Telegram transient outbound failure example:

```text
[telegram] sendChatAction failed: Network request for 'sendChatAction' failed
```

Active Memory config details

Active Memory is enabled only for the main agent and direct chats. The local config is already reduced/conservative:

```json
{
  "enabled": true,
  "agents": ["main"],
  "allowedChatTypes": ["direct"],
  "model": "zai/glm-5.1",
  "modelFallback": "zai/glm-5.1",
  "queryMode": "message",
  "promptStyle": "preference-only",
  "timeoutMs": 30000,
  "maxSummaryChars": 120,
  "circuitBreakerMaxTimeouts": 1,
  "circuitBreakerCooldownMs": 600000,
  "recentUserTurns": 0,
  "recentAssistantTurns": 0,
  "recentUserChars": 300,
  "recentAssistantChars": 40,
  "cacheTtlMs": 120000
}
```
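The two `circuitBreaker*` settings suggest that a single timeout should suspend recall for ten minutes. The sketch below shows one plausible reading of those semantics. It is not OpenClaw's implementation: the class `TimeoutCircuitBreaker`, its method names, and the reset-on-success behavior are all assumptions made for illustration.

```python
import time


class TimeoutCircuitBreaker:
    """Blocks calls for cooldown_ms once max_timeouts timeouts accumulate."""

    def __init__(self, max_timeouts: int, cooldown_ms: int, clock=time.monotonic):
        self.max_timeouts = max_timeouts
        self.cooldown_s = cooldown_ms / 1000
        self.clock = clock  # injectable so the behavior can be checked without waiting
        self.timeouts = 0
        self.open_until = 0.0

    def allow(self) -> bool:
        """True when recall may run; False while the breaker is open."""
        return self.clock() >= self.open_until

    def record_timeout(self) -> None:
        """Count a timeout and open the breaker once the limit is reached."""
        self.timeouts += 1
        if self.timeouts >= self.max_timeouts:
            self.open_until = self.clock() + self.cooldown_s
            self.timeouts = 0

    def record_success(self) -> None:
        """A successful recall clears the timeout count (assumed behavior)."""
        self.timeouts = 0


# Values from the config above, driven by a fake clock measured in seconds.
now = [0.0]
breaker = TimeoutCircuitBreaker(max_timeouts=1, cooldown_ms=600000, clock=lambda: now[0])
breaker.record_timeout()
now[0] = 599.0
print("allowed at 599 s:", breaker.allow())
now[0] = 600.0
print("allowed at 600 s:", breaker.allow())
```

Under this reading, the first timeout in the log excerpt should have kept Active Memory out of the turn path for the next ten minutes, so repeated 60 s stalls within that window would point to the breaker not engaging or to a different component blocking the lane.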

Expected behavior

- Simple direct-message turns should not spend tens of seconds in embedded-run prep before reaching the model.
- Active Memory should either return bounded recall quickly or degrade gracefully without blocking the user-visible turn for ~60s.
- `sessions.list` / `node.list` should not materially contribute to gateway event-loop pressure during normal use.
- Transient Telegram typing/send failures should not leave the system feeling wedged.
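The second expectation, bounded recall with graceful degradation, can be sketched as a deadline wrapper that returns an empty summary instead of failing the turn. This is an illustration in Python's `asyncio`, not the gateway's code: `recall_with_deadline`, `slow_recall`, and `fast_recall` are hypothetical names, and the sample summary text is invented.

```python
import asyncio


async def recall_with_deadline(recall, timeout_s: float, fallback: str = "") -> str:
    """Await recall() but give up after timeout_s and return the fallback."""
    try:
        return await asyncio.wait_for(recall(), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Degrade gracefully: the turn continues without a memory summary.
        return fallback


async def slow_recall() -> str:
    await asyncio.sleep(5)  # stands in for a recall model that stalls
    return "user prefers short answers"


async def fast_recall() -> str:
    return "user prefers short answers"


async def main() -> None:
    print(repr(await recall_with_deadline(slow_recall, timeout_s=0.1)))
    print(repr(await recall_with_deadline(fast_recall, timeout_s=0.1)))


asyncio.run(main())
```

A timer-based deadline like this only fires when the event loop is free to run it, so a loop that is saturated by other work can still delay the fallback well past the nominal timeout.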

Related issues / possible overlap

- #76123: 2026.4.29 performance regression: latency, stuck sessions, event-loop blocking
- #76166: Control UI repeatedly calls slow `sessions.list`
- #76047: event-loop saturation involving temp-file pressure / `node.list`
- #76048: ZAI GLM-5 reasoning output routed to hidden thinking instead of visible text
- #76043: Active Memory embedded-run startup overhead, superseded by canonical embedded-run prep tracker
- #76174 / #76176: embedded-run/provider/Telegram hang symptom family

Notes

This report is intentionally sanitized and omits hostnames, domains, usernames, chat identifiers, local filesystem paths, and tokens.

extent analysis

TL;DR

Downgrade OpenClaw to a version prior to 2026.4.29 or wait for a patch release that addresses the performance regression and instability issues.

Guidance

Review the related issues (#76123, #76166, #76047, #76048, #76043) to understand the scope of the performance regression and potential workarounds.
Consider temporarily disabling Active Memory or adjusting its configuration to reduce the load on the gateway.
Monitor the gateway logs for event-loop warnings and slow RPC calls to identify potential bottlenecks.
Verify that the Telegram direct channel is properly configured and reachable to rule out external factors contributing to the instability.

Example

No code snippet is provided as the issue is related to a specific version of OpenClaw and its configuration.

Notes

The issue is likely related to the 2026.4.29 regression cluster, and downgrading or waiting for a patch release may be the most effective solution. However, adjusting the Active Memory configuration or disabling it temporarily may help mitigate the instability.

Recommendation

Apply workaround: Downgrade OpenClaw to a version prior to 2026.4.29 to avoid the performance regression and instability issues until a patch release is available.


#api #ssr #model compatibility #GPU setup #container setup
