openclaw - Feature: Auto-rollback on gateway startup failure (safe mode)

Problem

Users who rely on OpenClaw as their primary remote interface (e.g., via Telegram) and are away from the host machine can get permanently locked out if a config change causes the gateway to fail at startup.

Concrete scenario:

  • User is traveling, only has a phone
  • Pushes a config change remotely
  • Gateway restarts and fails to boot (bad model config, channel error, etc.)
  • Telegram bot is down, no SSH, nothing

There is currently no built-in mechanism to survive a bad-config boot loop.

Proposed Solution

A gateway-level safe mode with automatic config rollback:

  1. gateway.safeConfig — points to a minimal known-safe config (e.g. openclaw.safe.json)
  2. Boot health monitor — after startup, gateway enters a "probation" period (~60s). If healthy → marks config as known-good. If not → increments failure counter.
  3. Auto-rollback — after N consecutive failures in a time window (e.g. 3 failures in 3 min), gateway swaps in the known-good backup (or safe config) before retrying.
  4. CLI support — openclaw gateway restart --safe
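The probation and failure-window logic in steps 2 and 3 can be sketched as follows. This is a minimal illustration, not OpenClaw code: the class name `BootMonitor`, its method names, and the default values (3 failures, 180 s window, 60 s probation) are assumptions taken from the example numbers above. Timestamps are passed in explicitly so the logic can be exercised without waiting on a real clock.

```python
from collections import deque


class BootMonitor:
    """Hypothetical helper: tracks failed boots in a sliding time window."""

    def __init__(self, max_failures=3, window_s=180.0, probation_s=60.0):
        self.max_failures = max_failures
        self.window_s = window_s
        self.probation_s = probation_s
        self.failures = deque()  # timestamps (seconds) of recent failed boots

    def survived_probation(self, started_at, now):
        """True once the gateway has stayed up for the whole probation period."""
        return now - started_at >= self.probation_s

    def mark_healthy(self):
        """A healthy boot breaks the streak, so only consecutive failures count."""
        self.failures.clear()

    def record_failure(self, now):
        """Register a failed boot and drop failures that fell out of the window."""
        self.failures.append(now)
        self._prune(now)

    def should_rollback(self, now):
        """True when enough failures happened inside the time window."""
        self._prune(now)
        return len(self.failures) >= self.max_failures

    def _prune(self, now):
        while self.failures and now - self.failures[0] > self.window_s:
            self.failures.popleft()
```

A supervisor would call `record_failure` whenever the gateway exits before `survived_probation` becomes true, and swap in the backup config as soon as `should_rollback` returns true.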

Safe exit codes (should NOT trigger rollback):

  • 0 (clean exit), 143 (SIGTERM, i.e. 128 + 15), 78 (duplicate gateway)
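A small sketch of how a supervisor might apply this exit-code filter. The function names are hypothetical. One detail worth noting if the supervisor is written in Python: `subprocess` reports a child killed by signal N as return code -N, while a shell reports it as 128 + N, so the sketch normalizes negative codes before checking the list.

```python
# Exit codes from the list above that should not count toward rollback.
SAFE_EXIT_CODES = frozenset({0, 143, 78})


def normalize_exit_code(returncode):
    """Map subprocess-style negative codes (killed by signal N) to 128 + N."""
    return 128 - returncode if returncode < 0 else returncode


def counts_as_failure(returncode):
    """True when the exit should increment the failure counter."""
    return normalize_exit_code(returncode) not in SAFE_EXIT_CODES
```

For example, `counts_as_failure(-15)` is false because a SIGTERM kill normalizes to 143, whereas `counts_as_failure(-9)` (SIGKILL, normalized to 137) is true.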

Current Workaround

We built this externally with a bash wrapper + systemd drop-in: systemd → bash wrapper → node gateway

  • Gateway exits with error → counter++
  • 3 failures in 3 min → restore known-good → restart
  • If backup also fails → restore minimal safe config → restart

But this should be a first-class feature, not a user-land hack.
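The two-stage escalation in the workaround (known-good backup first, minimal safe config second) could look roughly like this. It is a sketch under assumptions, not the actual bash wrapper: the tier names, the function names, and the file paths passed in are hypothetical, and the restore is modeled as a plain file copy over the active config.

```python
import shutil

# Escalation order: current config, then known-good backup, then minimal safe config.
TIERS = ("active", "known-good", "safe")


def next_tier(current):
    """Escalate one step; stay on 'safe' once there is nothing further to try."""
    index = TIERS.index(current)
    return TIERS[min(index + 1, len(TIERS) - 1)]


def restore(tier, active_path, known_good_path, safe_path):
    """Overwrite the active config with the backup for the given tier.

    Returns False for the 'active' tier, where there is nothing to restore.
    """
    source = {"known-good": known_good_path, "safe": safe_path}.get(tier)
    if source is None:
        return False
    shutil.copyfile(source, active_path)
    return True
```

Each time the failure threshold is reached, the wrapper would call `next_tier`, then `restore`, then restart the gateway. Once the `safe` tier has been applied there is nothing left to fall back to, so repeated failures at that point need operator attention.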

Why this matters

OpenClaw is often the sole remote access channel to a machine. A resilient gateway is critical infrastructure for the "control your computer from anywhere" use case.
