openclaw - 💡(How to fix) Fix 2026.5.19-beta.1 — gateway becomes unresponsive and silently respawns within ~2 minutes on a working v2026.5.18 setup [1 comments, 2 participants]

openclaw/openclaw#83934 (fetched 2026-05-20 03:46:22)

Bug: 2026.5.19-beta.1 — gateway becomes unresponsive and silently respawns within ~2 minutes of startup on a working v2026.5.18 setup

TL;DR

Upgrading a working 2026.5.18 install to 2026.5.19-beta.1 causes the gateway to develop severe event-loop pressure, exceed its own memory-pressure heap threshold, and then silently respawn (no SIGTERM, no shutdown log, no kernel OOM) within ~2 minutes of gateway ready. While the gateway is unresponsive, Telegram inbounds never reach the agent layer ([agent/embedded] strict-agentic execution contract active never appears), Discord logs gateway READY wait timed out after 15000ms, and health WS calls take 10+ seconds. Downgrading to 2026.5.18 on the same install with the same config restores normal behavior (memory drops from ~1.4 GB back to ~480 MB, the event loop is quiet, channels recover).

This appears specific to 2026.5.19-beta.1. The same config, including a fresh openclaw doctor --fix migration of legacy sidecar OAuth profiles and openai-codex/* → openai/* model refs, is healthy under 2026.5.18.

Environment

  • OpenClaw: 2026.5.19-beta.1 (ba9034b), fresh install via npm install -g openclaw@2026.5.19-beta.1 (both user-local /home/<user>/.local/lib/node_modules/openclaw via openclaw update, and system /usr/lib/node_modules/openclaw via sudo npm install -g)
  • Previous working version: 2026.5.18 (50a2481), same install paths
  • Install kind: package (npm/pnpm)
  • OS: Ubuntu Linux x86_64, systemd-managed
  • Service: /etc/systemd/system/openclaw.service (system-level, User=<user>)
  • Node: 22.x
  • Agent backend: openai-codex OAuth (post doctor --fix migration; auth-profiles.json has inline access + refresh tokens, no sidecar oauthRef)
  • Channels: Telegram (long-polling) + Discord
  • Memory backend: qmd
  • Session scope: session.dmScope: per-channel-peer
  • Config last touched by: 2026.5.19-beta.1 (meta.lastTouchedVersion); originally migrated by 2026.5.18's doctor --fix

Symptom timeline (real log lines)

After sudo systemctl restart openclaw on 2026.5.19-beta.1:

02:15:00.711  [gateway] loading configuration…
02:15:11.337  [gateway] agent model: openai/gpt-5.5 (thinking=medium, fast=off)
02:15:11.338  [gateway] http server listening (6 plugins …; 10.4s)
02:15:11.655  [gateway] ready
02:15:17.153  [gateway] startup model warmup timed out after 5000ms; continuing without waiting
02:15:17.360  [telegram] Inbound message telegram:<id> -> @<bot> (direct, 98 chars)
02:15:49.733  [discord] gateway READY wait timed out after 15000ms; reconnecting with backoff (attempt 1)
02:15:49.772  [ws] closed before connect …
02:15:49.781  [ws] closed before connect …
02:16:00.992  [ws] ⇄ res ✓ health 10482ms                       ← 10.5 s for a cached health ping
02:16:17.642  [ws] ⇄ res ✓ sessions.delete 1275ms              ← external watchdog firing
02:16:19.747  [diagnostics/memory] memory pressure: level=warning reason=heap_threshold rssBytes=1364066304 heapUsedBytes=1119640208 thresholdBytes=1073741824
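
To make the memory-pressure line easier to read, the byte counts can be converted to MiB and compared against the threshold. The following is a minimal sketch, not part of OpenClaw: the helper names parse_memory_pressure and mib are hypothetical, and the key=value layout is assumed only from the single log line quoted above.

```python
import re

# Matches key=value pairs such as rssBytes=1364066304 (layout assumed from the log line above).
_FIELD = re.compile(r"(\w+)=(\S+)")


def parse_memory_pressure(line: str) -> dict:
    """Return the key=value fields of a memory-pressure line; *Bytes fields become ints."""
    fields = {}
    for key, value in _FIELD.findall(line):
        fields[key] = int(value) if key.endswith("Bytes") else value
    return fields


def mib(n_bytes: int) -> float:
    """Convert a byte count to MiB."""
    return n_bytes / (1024 * 1024)


# The exact line from the symptom timeline.
LINE = (
    "02:16:19.747  [diagnostics/memory] memory pressure: level=warning "
    "reason=heap_threshold rssBytes=1364066304 heapUsedBytes=1119640208 "
    "thresholdBytes=1073741824"
)

if __name__ == "__main__":
    f = parse_memory_pressure(LINE)
    over = f["heapUsedBytes"] / f["thresholdBytes"]
    print(f"rss={mib(f['rssBytes']):.1f} MiB heapUsed={mib(f['heapUsedBytes']):.1f} MiB "
          f"threshold={mib(f['thresholdBytes']):.1f} MiB heap/threshold={over:.3f}")
```

For the quoted line, the threshold is exactly 1 GiB and heapUsedBytes is already above it roughly 68 seconds after gateway ready.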

Two consecutive 98-char Telegram inbounds (02:14:26 and 02:15:17) never produced an [agent/embedded] strict-agentic execution contract active log line — the runtime never began processing them.

Between the two inbounds the gateway PID changed (3493197 → 3493789) with no SIGTERM, no [shutdown] started, no journal exit message, and no kernel OOM (verified via dmesg and journalctl). The process simply disappeared and systemd Restart=always brought up a replacement.
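
A silent respawn like this can be caught by sampling the unit's main PID and flagging any change. This is a rough sketch under stated assumptions: it relies on systemctl show --property MainPID --value, uses the unit name openclaw from this report, and the function names and the 1 s / 180 s polling defaults are illustrative choices rather than anything OpenClaw ships.

```python
import subprocess
import time


def read_main_pid(unit: str = "openclaw") -> int:
    """Ask systemd for the unit's current main PID (0 while the unit has no main process)."""
    out = subprocess.run(
        ["systemctl", "show", "--property", "MainPID", "--value", unit],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    return int(out or 0)


def find_respawns(samples):
    """Given (timestamp, pid) samples, return (timestamp, old_pid, new_pid) per PID change.

    Samples with pid 0 (unit between restarts) are skipped so that
    old -> 0 -> new is still reported as a single respawn.
    """
    changes = []
    last = None
    for ts, pid in samples:
        if pid == 0:
            continue
        if last is not None and pid != last:
            changes.append((ts, last, pid))
        last = pid
    return changes


def watch(unit: str = "openclaw", interval: float = 1.0, duration: float = 180.0):
    """Poll the unit for `duration` seconds and return any respawns seen."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.time(), read_main_pid(unit)))
        time.sleep(interval)
    return find_respawns(samples)


if __name__ == "__main__":
    for ts, old, new in watch():
        print(f"{time.strftime('%H:%M:%S', time.localtime(ts))} PID changed {old} -> {new}")
```

Starting watch() right after sudo systemctl restart openclaw gives a timestamp for the PID change that can then be lined up against journalctl and dmesg output.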

What does NOT happen on 2026.5.18 with the same config

After downgrade to 2026.5.18 (system path: sudo npm install -g openclaw@2026.5.18; user-local: openclaw update --channel stable --tag 2026.5.18 --yes) and sudo systemctl restart openclaw:

  • Steady-state memory: ~480 MB (vs ~1.4 GB on beta)
  • No [diagnostics/memory] warnings
  • No event-loop degradation warnings
  • health calls back to ~70 ms
  • Discord and Telegram both reach connected within ~10 s
  • Telegram inbounds correctly hit [agent/embedded] strict-agentic execution contract active and start a turn (turn still wedges per #82681, but that's the pre-existing wedge, not the new failure mode)

Same openclaw.json, same auth-profiles.json, same Telegram/Discord configuration, same systemd unit. Only the version differs.

Reproduction

  1. Start from a healthy 2026.5.18 install with openai-codex OAuth (post doctor --fix so profile is inline, model refs are openai/*).
  2. openclaw update --channel beta --tag 2026.5.19-beta.1 --yes (user-local install).
  3. sudo npm install -g openclaw@2026.5.19-beta.1 (system install, required because the system service spawns /usr/bin/openclaw gateway → /usr/lib/node_modules/openclaw/dist/index.js).
  4. Stop the auto-re-enabled user-level openclaw-gateway.service if present (systemctl --user stop openclaw-gateway && systemctl --user disable openclaw-gateway).
  5. sudo systemctl restart openclaw.
  6. Wait ~90–120 s.
  7. Observe: heap pressure warning, multi-second health latencies, lost Telegram inbounds, silent gateway respawn.
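
For step 7, the multi-second latencies can be pulled out of a journal slice mechanically. The sketch below assumes the [ws] response format seen in the excerpts in this report; the slow_ws_calls name and the 1000 ms cutoff are illustrative assumptions, chosen only because healthy health calls are reported at ~70 ms.

```python
import re

# Format assumed from lines like "[ws] ⇄ res ✓ health 10482ms" in this report.
_WS_RES = re.compile(r"\[ws\] ⇄ res ✓ (?P<method>\S+) (?P<ms>\d+)ms")


def slow_ws_calls(lines, threshold_ms: int = 1000):
    """Return (method, latency_ms) for every [ws] response at or above threshold_ms."""
    slow = []
    for line in lines:
        m = _WS_RES.search(line)
        if m and int(m["ms"]) >= threshold_ms:
            slow.append((m["method"], int(m["ms"])))
    return slow


# Lines copied from the symptom timeline.
SAMPLE = [
    "02:15:11.655  [gateway] ready",
    "02:15:49.772  [ws] closed before connect …",
    "02:16:00.992  [ws] ⇄ res ✓ health 10482ms",
    "02:16:17.642  [ws] ⇄ res ✓ sessions.delete 1275ms",
]

if __name__ == "__main__":
    for method, ms in slow_ws_calls(SAMPLE):
        print(f"{method}: {ms} ms")
```

Running the same scan on a 2026.5.18 journal slice should return an empty list if health calls are back at ~70 ms.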

Possibly relevant deltas in 2026.5.19-beta.1

Working off the published changelog, candidates worth bisecting first:

  • #83474 "Codex app-server: complete OpenClaw dynamic tool diagnostics at the request boundary…" — the wedge fix this beta was supposed to ship. If its diagnostics path holds references in a way that prevents GC, this could plausibly drive the heap growth.
  • #82981 "Codex app-server: rotate oversized native Codex threads before resume and cap dynamic tool-result text entering native Codex sessions"
  • #83454 "Codex app-server: scope OpenClaw prompt guidance by runtime surface…"

I have no source-level confirmation; these are just the Codex app-server changes that look most likely to influence startup-time memory.

What I'd find useful from a maintainer

  • Confirmation that this is reproducible on a comparable install (Linux + system service + Codex OAuth)
  • Whether 2026.5.19-beta.2 is expected to drop with a fix before 2026.5.19 stable
  • If not, whether downgrading to 2026.5.18 is the recommended interim for production Codex-OAuth setups — given that 2026.5.19-beta.1 is currently the only release containing the #82681 wedge fix

Happy to attach extended journal slices, the post-doctor --fix config (sanitized), heap dumps, or node --inspect traces if useful.
