openclaw - 💡(How to fix) Fix Gateway crash loop: 'Response timeout' unhandled rejection after upgrade to 2026.5.27 (stable from 2026.5.20)

openclaw 2026-05-30 05:19:14


After upgrading from 2026.5.20 to 2026.5.27, the Gateway enters a crash loop where it runs for 5–30 minutes, then crashes with an unhandled Response timeout rejection and launchd restarts it automatically. This repeats indefinitely, making the system unusable as a production tool.

Downgrading to 2026.5.20 (via snapshot restore) resolves the issue — the same environment runs stable for days without crashes.


Environment

Platform: macOS 25.5.0 (arm64)
Node: v24.14.0
OpenClaw: 2026.5.27
Previous stable version: 2026.5.20 (ran fine for 6 days)

Issue: Gateway repeatedly crashes with "Response timeout" every 5–30 minutes after upgrade to 2026.5.27


Crash Pattern (all 5 crashes, identical root cause)

Each crash event is captured in the stability logs with identical characteristics:

Error: Response timeout
Stack trace: (empty — no JavaScript stack frames)

Common pattern across all crashes:

session.stalled (×4 per crash, at 140 s, 170 s, 200 s, and 230 s, i.e. 30 s apart), reason: active_work_without_progress
session.long_running — reason: active_work, kind: model_call
exec.process.completed with timedOut: true, durationMs: 60013 — one exec command timed out at ~60s
Hundreds of queue.lane.enqueue / queue.lane.dequeue pairs — queue severely backlogged
Gateway crashes with unhandled rejection → launchd restarts → repeats
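
The pattern above can be checked mechanically against a stability snapshot. The following is a minimal sketch only: the snapshot schema is not shown in this report, so the StabilityEvent shape, the field names, and both function names are assumptions made for illustration, not OpenClaw's actual format or API.

```typescript
// Hypothetical event shape: the real stability snapshot schema is not shown in this report.
interface StabilityEvent {
  type: string; // e.g. "session.stalled" or "queue.lane.enqueue"
  timedOut?: boolean; // assumed to be present on exec.process.completed events
}

interface CrashSignature {
  stalledCount: number; // number of session.stalled events
  longRunning: boolean; // at least one session.long_running event
  execTimedOut: boolean; // at least one exec command ended with timedOut: true
  queueEvents: number; // total enqueue + dequeue events, a rough sign of queue churn
}

// Reduce a list of events to the handful of signals named in the crash pattern.
function summarize(events: StabilityEvent[]): CrashSignature {
  const count = (type: string): number =>
    events.filter((e) => e.type === type).length;
  return {
    stalledCount: count("session.stalled"),
    longRunning: count("session.long_running") > 0,
    execTimedOut: events.some(
      (e) => e.type === "exec.process.completed" && e.timedOut === true
    ),
    queueEvents: count("queue.lane.enqueue") + count("queue.lane.dequeue"),
  };
}

// True when a snapshot shows the stall, long-running, and exec-timeout signals together.
function matchesCrashPattern(sig: CrashSignature): boolean {
  return sig.stalledCount >= 4 && sig.longRunning && sig.execTimedOut;
}
```

Running summarize over each snapshot and comparing the results would show whether a later crash deviates from the pattern seen in the first five.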

Crash Timeline (May 30, 2026)

Time (UTC)	Event
~03:23	First crash (pid 26804, uptime 32 min)
~03:53	Second crash (pid 32982, uptime 9.7 min)
~04:13	Third crash (pid 35879, uptime 7.3 min)
~04:23	Fourth crash (pid 37323, uptime 5.2 min)
~04:29	Fifth crash (pid 37670, uptime 6.2 min)

Gateway was restarted after each crash by launchd, but the new process immediately encountered the same blocked state and crashed again within minutes.

Suspected Root Cause

The session.stalled + session.long_running events with activeWorkKind: "model_call" indicate that an LLM API call (via any channel — QQ, Feishu, WeChat, etc.) is hanging indefinitely, causing:

The session blocks waiting for a model response that never arrives
All other sessions queue up behind the blocked one
Memory and queue depth grow
Eventually the Gateway crashes with "Response timeout" as an unhandled rejection

This is NOT related to any single plugin or channel. It affects the entire Gateway because it shares a single event loop.
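
To make the suspected mechanism concrete, here is a minimal sketch of bounding a model call with a timeout and handling the resulting error at the call site. It is not OpenClaw code: withTimeout, hangingModelCall, and handleMessage are hypothetical names, and the error text is reused from the crash report only for illustration.

```typescript
// Rejects with "Response timeout" if `work` has not settled within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("Response timeout")), ms);
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (err) => {
        clearTimeout(timer);
        reject(err);
      }
    );
  });
}

// Hypothetical stand-in for a model call that never answers.
function hangingModelCall(): Promise<string> {
  return new Promise<string>(() => {});
}

// The caller catches the timeout, so a hung call fails one request
// instead of surfacing as an unhandled rejection.
async function handleMessage(
  call: () => Promise<string>,
  ms: number
): Promise<string> {
  try {
    return await withTimeout(call(), ms);
  } catch (err) {
    return `model call failed: ${(err as Error).message}`;
  }
}
```

Note that this only stops waiting; it does not cancel the underlying request. Actually aborting the call would need something like an AbortSignal passed to whatever client issues it.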

Relevant Context

active-memory plugin was running (configured with qianfan/ernie-4.5-turbo-20260402) but was not the cause: it was disabled at 03:53 as a diagnostic attempt, and crashes continued unchanged
All configured channels were active: qqbot, feishu, openclaw-weixin
The crash pattern suggests a regression in LLM call timeout/error handling introduced in 2026.5.27

Request

Please investigate LLM call handling changes between 2026.5.20 and 2026.5.27 that could cause:

Model calls to hang indefinitely without proper timeout enforcement
Blocked sessions to propagate back-pressure to the entire Gateway
"Response timeout" errors to become unhandled rejections rather than being caught

If there is a known regression, a patch in the next stable release would be greatly appreciated.
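
For the third point in the request, one common way a rejection ends up unhandled in Node is a promise that is started but never awaited and has no rejection handler attached. Whether that is what happens in 2026.5.27 is unknown; the sketch below only illustrates the general guard, and fireAndForget and failures are hypothetical names.

```typescript
// Collects failures instead of letting them escape as unhandled rejections.
const failures: string[] = [];

// Starts `task` without requiring the caller to await it,
// but always attaches a rejection handler to the resulting promise.
function fireAndForget(
  label: string,
  task: () => Promise<unknown>
): Promise<void> {
  return Promise.resolve()
    .then(task) // also turns a synchronous throw inside `task` into a rejection
    .then(
      () => {},
      (err: unknown) => {
        const message = err instanceof Error ? err.message : String(err);
        failures.push(`${label}: ${message}`);
      }
    );
}
```

With a guard like this, a "Response timeout" raised inside background work is recorded against the session that caused it rather than terminating the process.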

Stability Log Locations

All crash stability snapshots are stored in:

~/.openclaw/logs/stability/openclaw-stability-2026-05-30T*.json

Full gateway-restart.log is at:

~/.openclaw/logs/gateway-restart.log
