openclaw: WhatsApp reconnect loop lacks exponential backoff after status 499 disconnects

openclaw/openclaw#60626 (fetched 2026-04-08 02:49:01)

Summary

When the WhatsApp Web connection drops with status 499, OpenClaw enters a reconnect loop that retries every ~60 seconds indefinitely without exponential backoff. This creates a storm of reconnect attempts, log spam, and unnecessary resource consumption.

Observed Behavior

The reconnect cycle works like this:

  1. WhatsApp heartbeat detects no messages for 30+ minutes
  2. Forces reconnect → connection closed with status 499
  3. Schedules retry 1/12 in ~2 seconds → reconnects successfully
  4. 60 seconds later, heartbeat fires again → still no messages since the original lastInboundAt, so the 30-minute threshold is still exceeded
  5. Forces another reconnect → goto step 2

The cycle repeats because the heartbeat uses the original lastInboundAt timestamp (which never gets updated since no actual messages arrive), so every new connection immediately triggers a new timeout detection.
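
The loop can be reproduced in isolation. The sketch below is a simplified model, not OpenClaw's code: `shouldReconnect`, `simulate`, and the constants are hypothetical names, with the 30-minute threshold and 60-second check interval taken from this report. It counts forced reconnects over two simulated hours of silence, with and without resetting the baseline on reconnect.

```javascript
// Simplified model of the reported loop; all names are hypothetical.
const THRESHOLD_MS = 30 * 60 * 1000; // 30-minute silence threshold (from the report)
const CHECK_INTERVAL_MS = 60 * 1000; // heartbeat check interval (from the report)

// Heartbeat decision: has the connection been silent for too long?
function shouldReconnect(now, lastInboundAt) {
  return now - lastInboundAt >= THRESHOLD_MS;
}

// Count forced reconnects over `checks` heartbeat ticks with no inbound messages.
// When resetOnReconnect is false, lastInboundAt stays stale, as described above.
function simulate(checks, resetOnReconnect) {
  let now = 0;
  let lastInboundAt = 0;
  let reconnects = 0;
  for (let i = 0; i < checks; i++) {
    now += CHECK_INTERVAL_MS;
    if (shouldReconnect(now, lastInboundAt)) {
      reconnects++;
      if (resetOnReconnect) lastInboundAt = now;
    }
  }
  return reconnects;
}

const staleBaseline = simulate(120, false);
const resetBaseline = simulate(120, true);
console.log({ staleBaseline, resetBaseline }); // { staleBaseline: 91, resetBaseline: 4 }
```

With a stale baseline, every check past the 30-minute mark forces a reconnect (91 in two hours). Resetting the baseline alone reduces that to one reconnect per 30 minutes of silence.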

This went on for hours (2:04 PM to 8:48 PM on April 2, 2026) generating hundreds of reconnect cycles.

Log Evidence

14:04:58 No messages received in 39m - restarting connection
14:06:01 No messages received in 40m - restarting connection  
14:07:05 No messages received in 41m - restarting connection
...
14:36:45 No messages received in 71m - restarting connection
...
20:41:06 No messages received in 30m - restarting connection
20:42:10 No messages received in 31m - restarting connection
20:43:14 No messages received in 32m - restarting connection
...continued until gateway was killed by auto-update at 20:48
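
To check the cadence of the loop, lines like those above can be tallied with a short script. This is a sketch that assumes the log lines have exactly the shape shown in the excerpt (`HH:MM:SS No messages received in Nm - restarting connection`); `parseRestartLine` and `restartGaps` are hypothetical helper names, and the real log file may carry extra fields.

```javascript
// Matches lines shaped like the excerpt above; the format is an assumption.
const LINE_RE =
  /^(\d{2}):(\d{2}):(\d{2}) No messages received in (\d+)m - restarting connection\s*$/;

// Parse one log line; returns null for anything that does not match (e.g. "...").
function parseRestartLine(line) {
  const m = LINE_RE.exec(line);
  if (!m) return null;
  const [, hh, mm, ss, silentMin] = m;
  return {
    secondsOfDay: Number(hh) * 3600 + Number(mm) * 60 + Number(ss),
    silentMinutes: Number(silentMin),
  };
}

// Gaps in seconds between consecutive restart lines.
function restartGaps(lines) {
  const entries = lines.map(parseRestartLine).filter((e) => e !== null);
  const gaps = [];
  for (let i = 1; i < entries.length; i++) {
    gaps.push(entries[i].secondsOfDay - entries[i - 1].secondsOfDay);
  }
  return gaps;
}

const sample = [
  "14:04:58 No messages received in 39m - restarting connection",
  "14:06:01 No messages received in 40m - restarting connection  ",
  "14:07:05 No messages received in 41m - restarting connection",
];
console.log(restartGaps(sample)); // [ 63, 64 ]
```

Non-matching lines are skipped, so a gap computed across an elided stretch of the log reflects the elision rather than the real retry interval.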

Each cycle also triggers the false creds.json corruption restore (see #60625).

Root Cause Analysis

Two issues compound:

  1. No backoff between heartbeat-driven reconnects: After a successful reconnect, the heartbeat should reset its "time since last message" counter or at minimum apply exponential backoff before forcing another reconnect.

  2. lastInboundAt is not reset on reconnect: The heartbeat keeps comparing against the original last-message timestamp. Since no new messages arrive (likely because it's nighttime and nobody is messaging), every 60-second heartbeat check immediately exceeds the 30-minute threshold and forces yet another reconnect.

Expected Behavior

  • After a reconnect, the heartbeat timer should reset (using the reconnect time as the new baseline)
  • If multiple consecutive reconnects fail to receive messages, apply exponential backoff (e.g., 30min → 1h → 2h → 4h cap)
  • After N consecutive failed reconnect cycles (e.g., 5-10), stop attempting and log an error suggesting manual intervention
  • The 91MB error log generated by this loop should not be possible in normal operation
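
The backoff schedule and the give-up limit described above could be expressed as a single function. This is a minimal sketch under assumptions: `nextSilenceThreshold` and the constants are hypothetical names, and the limit of 8 cycles is an arbitrary pick from the suggested 5-10 range.

```javascript
const BASE_MS = 30 * 60 * 1000;    // 30-minute starting threshold (from the report)
const CAP_MS = 4 * 60 * 60 * 1000; // 4-hour cap (from the report)
const MAX_SILENT_CYCLES = 8;       // illustrative pick from the suggested 5-10 range

// Silence threshold to apply after `silentCycles` consecutive reconnects that
// received no inbound traffic. Returns null once the limit is reached, meaning:
// stop reconnecting and log an error asking for manual intervention.
function nextSilenceThreshold(silentCycles) {
  if (silentCycles >= MAX_SILENT_CYCLES) return null;
  return Math.min(BASE_MS * 2 ** silentCycles, CAP_MS);
}

// Print the schedule in minutes.
const minutes = [];
for (let n = 0; n <= MAX_SILENT_CYCLES; n++) {
  const t = nextSilenceThreshold(n);
  minutes.push(t === null ? "stop" : t / 60000);
}
console.log(minutes.join(" -> "));
// 30 -> 60 -> 120 -> 240 -> 240 -> 240 -> 240 -> 240 -> stop
```

The counter would be reset to zero whenever a real inbound message arrives, so an account that is merely quiet overnight returns to the 30-minute threshold as soon as traffic resumes.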

Impact

  • Generated 91MB of error log in one night
  • Constant WhatsApp reconnection churn
  • Combined with the update that followed, left the gateway down for ~21 hours

Environment

  • OpenClaw: 2026.4.2 (observed on 2026.3.31 before update)
  • OS: macOS 25.3.0 (ARM64)
  • Node: v25.6.1
  • WhatsApp account type: Personal (linked device)

Related

  • #60625 (false creds.json corruption warning)

extent analysis

TL;DR

Implement exponential backoff and reset the lastInboundAt timestamp after a successful reconnect to prevent indefinite reconnect loops.

Guidance

  • Introduce exponential backoff between heartbeat-driven reconnects to prevent a storm of reconnect attempts.
  • Reset the lastInboundAt timestamp after a successful reconnect to ensure the heartbeat timer uses the correct baseline.
  • Consider implementing a limit on consecutive failed reconnect cycles to prevent indefinite looping.
  • Review the logging mechanism to prevent excessive log generation in case of repeated reconnect attempts.
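
For the logging point, one possible approach is to throttle repeated messages so that a runaway loop cannot grow the log without bound. The sketch below is generic and not tied to OpenClaw's logger: `createThrottledLogger`, its parameters, and the `wa-restart` key are hypothetical.

```javascript
// Hypothetical helper: emit a repeated message at most once per window and
// report how many identical messages were suppressed in between.
// `sink` and `now` are injectable so the behaviour can be checked with a fake clock.
function createThrottledLogger(windowMs, sink = console.warn, now = Date.now) {
  const state = new Map(); // key -> { lastEmitAt, suppressed }
  return function log(key, message) {
    const t = now();
    const entry = state.get(key);
    if (entry && t - entry.lastEmitAt < windowMs) {
      entry.suppressed++;
      return false; // suppressed
    }
    const suffix =
      entry && entry.suppressed > 0
        ? ` (${entry.suppressed} similar messages suppressed)`
        : "";
    sink(message + suffix);
    state.set(key, { lastEmitAt: t, suppressed: 0 });
    return true; // emitted
  };
}

// Usage with a fake clock: one restart message per minute, 10-minute window.
let fakeNow = 0;
const emitted = [];
const logRestart = createThrottledLogger(10 * 60 * 1000, (m) => emitted.push(m), () => fakeNow);
for (let minute = 0; minute <= 10; minute++) {
  fakeNow = minute * 60 * 1000;
  logRestart("wa-restart", "No messages received - restarting connection");
}
console.log(emitted);
```

In this run only the first and last calls are emitted, and the last one carries a count of the nine suppressed messages. Throttling limits log growth but does not fix the loop itself, so it complements the backoff rather than replacing it.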

Example

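// Sketch of resetting lastInboundAt and applying exponential backoff.
// Names are illustrative; an inbound-message handler should also set
// lastInboundAt = Date.now() and backoffTimeout = BASE_TIMEOUT.
const BASE_TIMEOUT = 30 * 60 * 1000;            // 30-minute silence threshold
const MAX_BACKOFF_TIMEOUT = 4 * 60 * 60 * 1000; // 4-hour cap
let lastInboundAt = Date.now();
let backoffTimeout = BASE_TIMEOUT;

function calculateExponentialBackoff(currentTimeout) {
  // Double the wait each time, up to the cap
  return Math.min(currentTimeout * 2, MAX_BACKOFF_TIMEOUT);
}

function reconnect() {
  // Reconnect logic...
  lastInboundAt = Date.now(); // Use the reconnect time as the new baseline
  backoffTimeout = calculateExponentialBackoff(backoffTimeout); // Apply exponential backoff
  setTimeout(checkHeartbeat, backoffTimeout);
}

function checkHeartbeat() {
  const silentFor = Date.now() - lastInboundAt;
  if (silentFor >= backoffTimeout) reconnect();
  else setTimeout(checkHeartbeat, backoffTimeout - silentFor);
}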

Notes

The provided solution assumes that the lastInboundAt timestamp and heartbeat logic are accessible and modifiable. Additional considerations may be necessary to handle edge cases and ensure a robust solution.

Recommendation

Apply workaround: Implement exponential backoff and reset the lastInboundAt timestamp after a successful reconnect to prevent indefinite reconnect loops. This approach addresses the immediate issue and prevents excessive resource consumption and log generation.
