openclaw - WhatsApp 408 disconnects in 2026.4.27 are caused by event-loop blocking up to 100s, not Baileys [1 comment, 2 participants]

openclaw · 2026-04-30 15:07:24

openclaw/openclaw#75122•Fetched 2026-05-01 05:37:57

Author

rutipo

Participants

clawsweeper[bot]

rutipo

Timeline (top)

commented ×1

The chronic WhatsApp status 408 — Connection Terminated disconnects we see in 2026.4.27 (also 2026.4.24) are not a Baileys bug — the new liveness diagnostic subsystem in 2026.4.27 shows the OpenClaw gateway's Node event loop is being blocked for tens of seconds at a time. While the loop is blocked, Baileys' keepalive ping cannot fire, so WhatsApp Web closes the socket at ~60s with HTTP 408. Any in-flight outbound reply is dropped (no replay).

Workaround in use

Hourly cron resets the direct-DM session transcript when it crosses size or token-pct thresholds, keeping LLM call latency short enough that replies sometimes land before the next 408. This is operationally noisy and shouldn't be needed.

Environment

OpenClaw 2026.4.27 (cbc2ba0) (also reproduced on 2026.4.24)
Bundled @whiskeysockets/baileys (this is the npm latest)
Node 22.22.2, Linux 6.8.0-106-generic, Contabo VPS — 4 cores / 7.8 GB / load 1.09
Single WhatsApp account, gateway in local mode, loopback bind
Model: openai-codex/gpt-5.5

Capacity is not the issue. The event loop is being blocked synchronously.

Evidence — liveness warnings from a single 15-min window today

16:38:22  eventLoopDelayMaxMs=5226     util=0.188  active=0 queued=0  (startup)
16:40:26  eventLoopDelayMaxMs=100059   util=1.000  active=0 queued=0  ★ 100s block
16:43:31  eventLoopDelayMaxMs=7222     util=0.306  active=1 queued=1
16:48:41  eventLoopDelayMaxMs=97039    util=0.992  active=0 queued=0  ★ 97s block
16:53:00  eventLoopDelayMaxMs=32799    util=0.905  active=1 queued=1  ★ 33s block

Notice the active=0 queued=0 rows: the loop is blocked even when the agent runtime says nothing is in flight, so the cause isn't simply LLM latency.
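
To make the idle rows easier to spot in a longer log, the warnings can be filtered mechanically. The following is a minimal sketch, assuming the liveness lines keep the layout shown in the excerpt above; the function names and the 30s threshold are illustrative and not part of OpenClaw.

```typescript
// Shape of one liveness warning, using the field names from the log excerpt.
interface LivenessSample {
  time: string;
  eventLoopDelayMaxMs: number;
  util: number;
  active: number;
  queued: number;
}

// Parse a line such as
// "16:40:26  eventLoopDelayMaxMs=100059   util=1.000  active=0 queued=0".
// Returns null when the line does not match that layout (assumed format).
function parseLivenessLine(line: string): LivenessSample | null {
  const match = line.match(
    /^(\d{2}:\d{2}:\d{2})\s+eventLoopDelayMaxMs=(\d+)\s+util=([\d.]+)\s+active=(\d+)\s+queued=(\d+)/
  );
  if (!match) return null;
  return {
    time: String(match[1]),
    eventLoopDelayMaxMs: Number(match[2]),
    util: Number(match[3]),
    active: Number(match[4]),
    queued: Number(match[5]),
  };
}

// Keep only samples where the loop was blocked for at least thresholdMs
// while the agent runtime reported nothing active and nothing queued.
function idleBlocks(lines: string[], thresholdMs: number): LivenessSample[] {
  return lines
    .map(parseLivenessLine)
    .filter((s): s is LivenessSample => s !== null)
    .filter(
      (s) =>
        s.eventLoopDelayMaxMs >= thresholdMs && s.active === 0 && s.queued === 0
    );
}
```

Run against the five lines above with a 30000 ms threshold, this keeps the 16:40:26 and 16:48:41 rows and drops the 16:53:00 row, which had work in flight.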

Symptom timing (deterministic)

inbound     408 close   Δ
16:43:21    16:44:24    63s
16:52:16    16:53:00    44s
10:08:54    10:10:37    103s
10:14:03    10:15:46    103s
10:39:39    10:40:31    52s

Bimodal at ~44-63s and ~103s (1m43s), consistent with Baileys' default keepAliveIntervalMs (30s) + defaultQueryTimeoutMs (60s) firing during event-loop blocks.
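
The deltas in the table can be recomputed from the timestamps. This is a small sketch, assuming both timestamps of a pair fall on the same day (no midnight wrap); the helper names are illustrative.

```typescript
// Convert "HH:MM:SS" to seconds since midnight.
function toSeconds(clock: string): number {
  const [h = 0, m = 0, s = 0] = clock.split(":").map(Number);
  return h * 3600 + m * 60 + s;
}

// Seconds between the inbound message and the 408 close (same-day assumption).
function deltaSeconds(inbound: string, closed: string): number {
  return toSeconds(closed) - toSeconds(inbound);
}

// The five (inbound, 408 close) pairs from the table.
const timingPairs: Array<[string, string]> = [
  ["16:43:21", "16:44:24"],
  ["16:52:16", "16:53:00"],
  ["10:08:54", "10:10:37"],
  ["10:14:03", "10:15:46"],
  ["10:39:39", "10:40:31"],
];

const deltas = timingPairs.map(([inbound, closed]) => deltaSeconds(inbound, closed));
// deltas is [63, 44, 103, 103, 52]
```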

Hypotheses to investigate (the diagnostic subsystem already collects the data; please surface its async-resource trace alongside the warning)

Memory subsystem fallback: chunks_vec not updated — sqlite-vec unavailable. Vector recall degraded. — does the FTS fallback do a synchronous scan on each agent turn?
Session transcript loading: a 6.6 MB / 386-line .jsonl correlated with the worst blocks. Does loadSessionTranscript parse JSON synchronously on the main thread?
Plugin runtime-deps re-extraction at startup (we hit the 100s block at 16:40, two minutes after process start): are the tarballs extracted synchronously?
There is also a stale-tree GC bug: plugin-runtime-deps/ accumulates a directory per OpenClaw version forever. After two upgrades we had 3.5 GB of stale trees alongside the active one. Manual rm -rf was needed.
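
If transcript parsing does turn out to be synchronous, one possible mitigation is to parse the .jsonl in small batches and hand control back to the event loop between them, so timers such as a keepalive ping can still fire. The sketch below is hypothetical: it does not reflect how loadSessionTranscript is actually implemented, and all names in it are made up for illustration.

```typescript
// Split an array into consecutive batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Parse one batch of JSONL lines, skipping blank lines.
function parseJsonlBatch(lines: string[]): unknown[] {
  return lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
}

// Hypothetical cooperative loader: parses batchSize lines at a time and
// yields to the event loop after each batch via a zero-delay timer.
async function loadTranscriptCooperatively(
  text: string,
  batchSize = 20
): Promise<unknown[]> {
  const records: unknown[] = [];
  for (const batch of chunk(text.split("\n"), batchSize)) {
    records.push(...parseJsonlBatch(batch));
    await new Promise<void>((resolve) => setTimeout(resolve, 0));
  }
  return records;
}
```

Batching only bounds the pause per batch. A single very large line is still parsed in one synchronous JSON.parse call, so a transcript dominated by a few huge records would need a different approach, such as a worker thread.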

Asks

Identify and fix what blocks the event loop. The diagnostic subsystem already detects it — please log the async-resource at the time of the block.
On WhatsApp socket close after a queued outbound, replay the reply on reconnect rather than dropping it.
Expose Baileys timeout/keepalive options in channels.whatsapp so operators can mitigate without waiting on a fix.
Garbage-collect stale plugin-runtime-deps/openclaw-<version>-* trees on upgrade.
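
For the last ask, the selection step could look roughly like the following. It is a sketch that assumes only the directory pattern named above, openclaw-<version>-*; the suffixes in the usage example are invented, and the function returns candidates rather than deleting anything.

```typescript
// Given the directory names found under plugin-runtime-deps/ and the version
// that is currently running, return the trees that belong to other versions.
// The trailing "-" in the prefix keeps "2026.4.2" from matching "2026.4.27".
function staleRuntimeDepsTrees(
  dirNames: string[],
  activeVersion: string
): string[] {
  const activePrefix = `openclaw-${activeVersion}-`;
  return dirNames.filter(
    (name) => name.startsWith("openclaw-") && !name.startsWith(activePrefix)
  );
}

// Usage with made-up directory suffixes:
const staleTrees = staleRuntimeDepsTrees(
  ["openclaw-2026.4.24-aaa", "openclaw-2026.4.27-bbb", "unrelated-dir"],
  "2026.4.27"
);
// staleTrees contains only "openclaw-2026.4.24-aaa"
```

Returning a list instead of removing directories directly leaves room for a dry-run log line before anything is deleted.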

extent analysis

TL;DR

The event loop blockage in OpenClaw's Node.js application is likely caused by synchronous operations, such as JSON parsing or database queries, which need to be identified and fixed to prevent WhatsApp connection terminations.

Guidance

Investigate the hypotheses provided, such as memory subsystem fallback, session transcript loading, and plugin runtime-deps re-extraction, to identify the synchronous operation causing the event loop blockage.
Use the diagnostic subsystem to log the async-resource trace alongside the warning to gain more insights into the issue.
Consider exposing Baileys timeout/keepalive options in channels.whatsapp to allow operators to mitigate the issue temporarily.
Implement garbage collection for stale plugin-runtime-deps/openclaw-<version>-* trees on upgrade to prevent accumulation of unnecessary data.

Example

No code snippet is provided as the issue requires identification of the specific synchronous operation causing the event loop blockage.

Notes

The provided data suggests that the issue is not related to capacity, but rather to synchronous operations blocking the event loop. The workaround in use, which involves hourly cron resets of the direct-DM session transcript, is operationally noisy and should not be necessary once the root cause is fixed.

Recommendation

Apply a workaround by exposing Baileys timeout/keepalive options in channels.whatsapp to allow operators to mitigate the issue temporarily, while investigating and fixing the root cause of the event loop blockage.
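
A rough sketch of what exposing those options might involve is shown below. keepAliveIntervalMs and defaultQueryTimeoutMs are the Baileys option names quoted in the issue, with the defaults it cites; the surrounding type, the function, and the idea that channels.whatsapp would carry these keys are assumptions, since no such config exists yet.

```typescript
// The two Baileys socket options named in the issue.
interface WhatsAppSocketTuning {
  keepAliveIntervalMs: number;
  defaultQueryTimeoutMs: number;
}

// Defaults as quoted in the issue: 30s keepalive, 60s query timeout.
const BAILEYS_DEFAULTS: WhatsAppSocketTuning = {
  keepAliveIntervalMs: 30_000,
  defaultQueryTimeoutMs: 60_000,
};

// Merge operator overrides (hypothetical channels.whatsapp keys) over the
// defaults and reject values that are not positive finite numbers.
function resolveSocketTuning(
  overrides: Partial<WhatsAppSocketTuning> = {}
): WhatsAppSocketTuning {
  const merged: WhatsAppSocketTuning = {
    keepAliveIntervalMs:
      overrides.keepAliveIntervalMs ?? BAILEYS_DEFAULTS.keepAliveIntervalMs,
    defaultQueryTimeoutMs:
      overrides.defaultQueryTimeoutMs ?? BAILEYS_DEFAULTS.defaultQueryTimeoutMs,
  };
  if (!Number.isFinite(merged.keepAliveIntervalMs) || merged.keepAliveIntervalMs <= 0) {
    throw new RangeError("keepAliveIntervalMs must be a positive number");
  }
  if (!Number.isFinite(merged.defaultQueryTimeoutMs) || merged.defaultQueryTimeoutMs <= 0) {
    throw new RangeError("defaultQueryTimeoutMs must be a positive number");
  }
  return merged;
}
```

The resolved object would then be passed along when the gateway creates its Baileys socket. Tuning these values can only soften the symptom: a 100s block still outlasts any client-side timeout the server is willing to tolerate.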
