openclaw - 💡(How to fix) Fix [Bug]: 2026.4.29 embedded runner auth/dispatch latency causes restart loop on Synology NAS (Docker + QQ bot) [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#75756Fetched 2026-05-02 05:30:42
View on GitHub
Comments
1
Participants
2
Timeline
3
Reactions
2
Timeline (top)
closed ×1commented ×1labeled ×1

After auto-updating to OpenClaw 2026.4.29, the gateway enters a restart loop on a Synology NAS (Docker container). Each incoming message triggers auth:14s + dispatch:17s in the embedded runner, blocking the Node.js event loop for 30+ seconds. The health monitor detects the stall and kills the gateway, which restarts and repeats.

Error Message

Gateway starts, QQ bots connect, but any user message triggers:

Root Cause

This appears to be the same root cause as #75676, #75668, #75650 — the plugin bundled runtime-deps staging mechanism introduced in 2026.4.24 blocks the Node.js main thread synchronously. On resource-constrained Linux hosts (NAS, Raspberry Pi, VPS), this makes the gateway unusable.

Fix Action

Fix / Workaround

Summary

After auto-updating to OpenClaw 2026.4.29, the gateway enters a restart loop on a Synology NAS (Docker container). Each incoming message triggers auth:14s + dispatch:17s in the embedded runner, blocking the Node.js event loop for 30+ seconds. The health monitor detects the stall and kills the gateway, which restarts and repeats.

[trace:embedded-run] totalMs=33303 auth:14643ms, attempt-dispatch:17610ms

What did NOT fix it

  • Disabling memorySearch — auth/dispatch still 25-33s
  • Disabling memory-core plugin — no improvement
RAW_BUFFERClick to expand / collapse

Bug type

Crash (process/app exits or hangs)

Beta release blocker

No

Summary

Bug type

Regression (worked before, now fails)

Summary

After auto-updating to OpenClaw 2026.4.29, the gateway enters a restart loop on a Synology NAS (Docker container). Each incoming message triggers auth:14s + dispatch:17s in the embedded runner, blocking the Node.js event loop for 30+ seconds. The health monitor detects the stall and kills the gateway, which restarts and repeats.

Environment

  • OpenClaw: 2026.4.29 (a448042)
  • OS: Synology NAS, Debian 12 Docker container
  • CPU: 4-core Celeron J3455
  • Memory: 9.7 GB
  • Node: v22
  • Channels: QQ bot (4 accounts)
  • Model: zhipu/glm-5-turbo
  • Plugins: qqbot, anthropic, active-memory, memory-core, memory-wiki, mempalace

Observed behavior

Gateway starts, QQ bots connect, but any user message triggers:

[trace:embedded-run] totalMs=33303 auth:14643ms, attempt-dispatch:17610ms

[diagnostic] liveness warning: eventLoopDelayP99Ms=173006.7 eventLoopUtilization=1

stuck session: sessionKey=agent:main:main state=processing age=173s

Cascading failure: event loop blocked → QQ WebSocket timeout → all 4 bots disconnect → health monitor kills gateway → restart → repeat.

What did NOT fix it

  • Disabling memorySearch — auth/dispatch still 25-33s
  • Disabling memory-core plugin — no improvement

What DID fix it

Rollback to 2026.4.23. sessions.list: 250ms (vs 8000-170000ms on .29), zero restarts.

Related

  • #75676 #75668 #75650 #75687 #75718

Steps to reproduce

  1. Install OpenClaw 2026.4.29 on a Synology NAS Docker container with QQ bot channel enabled (4 bot accounts)
  2. Start gateway, wait for QQ bots to connect (WebSocket connected, Gateway ready)
  3. Send any message to any QQ bot account
  4. Observe: gateway hangs for 30+ seconds, liveness warning fires, QQ bots disconnect, gateway restarts
  5. Repeat on every incoming message

Expected behavior

After receiving a user message, the agent should begin responding within 1-2 seconds (as it does on 2026.4.23). The embedded runner startup stages (auth + dispatch) should complete in under 1 second total, not 30+ seconds. The gateway should remain stable without restart loops.

Actual behavior

Each incoming message triggers embedded-run startup with auth:14s + dispatch:17s (total 30-33s), blocking the Node.js event loop. eventLoopDelayP99Ms reaches 173000ms (173 seconds). Health monitor detects the stall and restarts the gateway. All QQ bot WebSocket connections drop during the block. The cycle repeats on every new message, causing 3-4 consecutive restarts before temporarily stabilizing.

OpenClaw version

2026.4.29 (a448042)

Operating system

Synology NAS, Debian 12 Docker container

Install method

Node: v22

Model

zhipu/glm-5-turbo

Provider / routing chain

zhipu

Additional provider/model setup details

Provider: zhipu (ZhipuAI / BigModel) Endpoint: https://open.bigmodel.cn/api/poding/paas/v4 Model: glm-5-turbo via Coding Plan API keys (separate keys per agent) Embedding: embedding-3 via openai-compatible provider pointing to same ZhipuAI endpoint Embedding was ruled out as cause — disabling memorySearch entirely did not fix the issue.

Logs, screenshots, and evidence

Impact and severity

Severity: High — gateway is completely unusable on 2026.4.29. Every user message triggers a crash-restart cycle.

Impact: All 4 QQ bot accounts (serving different business functions: sales assistant, accounting, smart home, private chat) are affected simultaneously. The NAS has been running OpenClaw 24/7 since March 2026 — this is the first version that made it completely non-functional.

Workaround: Roll back to 2026.4.23 (last known good version confirmed by multiple users in #75676, #75687, #75718).

Additional information

This appears to be the same root cause as #75676, #75668, #75650 — the plugin bundled runtime-deps staging mechanism introduced in 2026.4.24 blocks the Node.js main thread synchronously. On resource-constrained Linux hosts (NAS, Raspberry Pi, VPS), this makes the gateway unusable.

Community consensus: 2026.4.23 is the last stable version. All versions from .24 through .29 exhibit various forms of this regression across Windows, macOS, and Linux platforms.

The NAS environment has mihomo-core proxy available at 127.0.0.1:7890 (host network mode) for npm access, and plugin-runtime-deps cache is preserved across container restarts via volume mount. The staging still blocks on every startup despite the cache being present (consistent with #75718).

extent analysis

TL;DR

Rolling back to OpenClaw version 2026.4.23 is the most likely fix for the gateway restart loop issue on Synology NAS.

Guidance

  • The issue seems to be caused by a regression introduced in version 2026.4.24, which affects the plugin bundled runtime-deps staging mechanism and blocks the Node.js main thread synchronously.
  • Disabling certain plugins or features, such as memorySearch or memory-core, does not resolve the issue.
  • The problem is reproducible on resource-constrained Linux hosts, including NAS and Raspberry Pi devices.
  • Rolling back to version 2026.4.23 has been confirmed to fix the issue by multiple users.

Example

No code snippet is provided as the issue is related to a specific version of OpenClaw and its interaction with the underlying system.

Notes

The issue appears to be a known regression affecting versions 2026.4.24 through 2026.4.29, and the community consensus is that 2026.4.23 is the last stable version.

Recommendation

Apply the workaround by rolling back to OpenClaw version 2026.4.23, as it has been confirmed to fix the issue by multiple users and is considered the last stable version.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

After receiving a user message, the agent should begin responding within 1-2 seconds (as it does on 2026.4.23). The embedded runner startup stages (auth + dispatch) should complete in under 1 second total, not 30+ seconds. The gateway should remain stable without restart loops.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

openclaw - 💡(How to fix) Fix [Bug]: 2026.4.29 embedded runner auth/dispatch latency causes restart loop on Synology NAS (Docker + QQ bot) [1 comments, 2 participants]