openclaw - ✅(Solved) Fix: Gateway enters restart storm when gateway.mode is missing, making low-resource hosts unresponsive [1 pull request, 1 comment, 2 participants]

GitHub stats
openclaw/openclaw#63912 (fetched 2026-04-10 03:41:40)
Comments: 1
Participants: 2
Timeline: 5
Reactions: 0
Timeline (top): referenced ×3, commented ×1, cross-referenced ×1

When gateway.mode is not set in the configuration file, the gateway process exits with code 1 on startup. Because the generated systemd unit uses Restart=always without StartLimitBurst/StartLimitIntervalSec, systemd restarts the process every 5 seconds indefinitely. On low-resource hosts (1 vCPU / 2 GiB), this restart loop consumes enough CPU to render SSH and VNC completely unresponsive, requiring a hard reboot.

Error Message

  gateway.mode is missing
  Scheduled restart job, restart counter is at 27
  Started OpenClaw Gateway -> Main process exited, code=exited, status=1/FAILURE

Root Cause

  • src/daemon/systemd-unit.ts: the generated unit uses Restart=always + RestartSec=5 but has no StartLimitBurst or StartLimitIntervalSec
  • src/cli/gateway-cli/run.ts: config guard failures exit with code 1 (same as runtime errors), so systemd treats them as restartable failures

Fix / Workaround

I have a patch ready that:

  1. Adds StartLimitBurst=5 + StartLimitIntervalSec=60 to the systemd unit template
  2. Uses EX_CONFIG (exit 78) for config validation failures
  3. Adds RestartPreventExitStatus=78 so systemd never retries config errors
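
Items 2 and 3 depend on the gateway reporting configuration errors with a dedicated exit status. The following is a minimal TypeScript sketch of that idea, not the code from src/cli/gateway-cli/run.ts: the GatewayConfig shape, the function name exitCodeForStartup, and the constant names are hypothetical. Only the value 78 (EX_CONFIG in sysexits.h) and the previous exit code 1 come from the issue.

```typescript
// Exit statuses. 78 is EX_CONFIG from sysexits.h; 1 is the generic failure
// code the issue says the gateway used for every startup failure.
const EX_OK = 0;
const EX_GENERIC_FAILURE = 1;
const EX_CONFIG = 78;

// Hypothetical, simplified shape of the gateway configuration.
interface GatewayConfig {
  gateway?: { mode?: string };
}

// Returns the exit status a startup attempt should report.
// `runtimeError` stands in for any failure that happens after the
// configuration has been accepted.
function exitCodeForStartup(config: GatewayConfig, runtimeError: boolean): number {
  const mode = config.gateway?.mode;
  if (mode === undefined || mode.trim() === "") {
    // Misconfiguration: retrying cannot help, so report a distinct status.
    return EX_CONFIG;
  }
  return runtimeError ? EX_GENERIC_FAILURE : EX_OK;
}
```

With a distinct status for "misconfigured", RestartPreventExitStatus=78 in the unit can tell systemd not to retry, while ordinary crashes (status 1) remain restartable.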

PR fix notes

PR #63913: fix(daemon): prevent systemd restart storm on config validation failure [AI-assisted]

Description (problem / solution / changelog)

What

Prevent the OpenClaw gateway systemd service from entering an unbounded restart loop when configuration validation fails (e.g. gateway.mode is missing).

Closes #63912

Why

When gateway.mode is unset, the gateway exits with code 1. Because the generated systemd unit uses Restart=always without any StartLimit* guards, systemd restarts the process every 5 seconds indefinitely. On low-resource hosts (1 vCPU / 2 GiB), this consumes enough CPU to make SSH and VNC completely unresponsive, requiring a hard reboot.

This is a real-world scenario: fresh npm install -g openclaw@latest + openclaw onboard --install-daemon without completing the configuration wizard leaves gateway.mode unset while the systemd service is already installed and running.

Changes

  1. src/daemon/systemd-unit.ts: Add StartLimitBurst=5 and StartLimitIntervalSec=60 to the [Unit] section so systemd stops restarting after 5 failures within 60 seconds. Add RestartPreventExitStatus=78 to the [Service] section so configuration errors are never retried.

  2. src/cli/gateway-cli/run.ts: Use EX_CONFIG (exit code 78, from sysexits.h) instead of exit code 1 for configuration guard failures. This lets systemd distinguish "misconfigured, don't retry" from "runtime crash, worth retrying".

  3. src/daemon/systemd-unit.test.ts: Add assertions for the new systemd directives.

  4. src/cli/gateway-cli/run.option-collisions.test.ts: Update the expected exit code from 1 to 78 for the config-missing guard test.
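
To make the placement of the directives in change 1 concrete, here is a hedged sketch of a unit renderer. It is not the project's systemd-unit.ts: UnitOptions, renderGatewayUnit, and the WantedBy=default.target line are assumptions for illustration, and the ExecStart value is a placeholder. The directive names and values are the ones listed in the PR.

```typescript
// Hypothetical inputs for rendering a user-level systemd unit.
interface UnitOptions {
  description: string;
  execStart: string; // placeholder command line, not the real gateway command
}

// Renders a unit with the restart guards described in the PR.
// StartLimit* go in [Unit]; Restart* go in [Service].
function renderGatewayUnit(opts: UnitOptions): string {
  return [
    "[Unit]",
    `Description=${opts.description}`,
    "StartLimitBurst=5",
    "StartLimitIntervalSec=60",
    "",
    "[Service]",
    `ExecStart=${opts.execStart}`,
    "Restart=always",
    "RestartSec=5",
    "RestartPreventExitStatus=78",
    "",
    "[Install]",
    "WantedBy=default.target", // assumption: a typical target for user units
    "",
  ].join("\n");
}
```

A test in the spirit of change 3 would then assert that the rendered text contains the three new directives in the expected sections.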

Testing

  • Existing tests updated to cover new behavior
  • 4 files changed, 15 insertions(+), 2 deletions(-)
  • No refactoring or unrelated changes mixed in

AI Disclosure

  • Mark as AI-assisted in the PR title
  • Degree of testing: tests updated; CI will validate
  • I understand what the code does — the fix uses standard sysexits.h exit codes and well-documented systemd directives
  • The bug was encountered firsthand on a real server (Ubuntu 20.04, 1C2G, restart counter reached 27)

Notes

  • The RestartSec=5 value was already the project's recommended default; this PR does not change it.
  • Existing service-audit.ts already validates RestartSec — the new StartLimit* directives align with that existing audit philosophy.
  • Note for existing users: running openclaw gateway install --force after upgrading will regenerate the systemd unit with the new guards.

Changed files

  • CHANGELOG.md (modified, +1/-0)
  • src/cli/gateway-cli/run.option-collisions.test.ts (modified, +1/-1)
  • src/cli/gateway-cli/run.ts (modified, +10/-3)
  • src/daemon/systemd-unit.test.ts (modified, +3/-0)
  • src/daemon/systemd-unit.ts (modified, +3/-0)

Raw issue body

Summary

When gateway.mode is not set in the configuration file, the gateway process exits with code 1 on startup. Because the generated systemd unit uses Restart=always without StartLimitBurst/StartLimitIntervalSec, systemd restarts the process every 5 seconds indefinitely. On low-resource hosts (1 vCPU / 2 GiB), this restart loop consumes enough CPU to render SSH and VNC completely unresponsive, requiring a hard reboot.

Environment

  • OpenClaw version: 2026.4.9 (npm install)
  • OS: Ubuntu 20.04.3 LTS
  • Host: 1 vCPU / 2 GiB RAM (Alibaba Cloud ECS)
  • Install method: npm install -g openclaw@latest

Steps to reproduce

  1. Install OpenClaw on a low-resource Linux host
  2. Do not run openclaw onboard or otherwise set gateway.mode in config
  3. Start the gateway daemon via systemd (openclaw onboard --install-daemon or manual systemctl --user start openclaw-gateway)
  4. Observe the restart loop via journalctl --user -u openclaw-gateway

Current behavior

  gateway.mode is missing
  Scheduled restart job, restart counter is at 27
  Started OpenClaw Gateway -> Main process exited, code=exited, status=1/FAILURE

The process restarts every 5 seconds indefinitely. On a 1C2G host, SSH becomes unresponsive within ~1 minute, requiring a hard reboot.
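
When triaging a host in this state, the restart counter in the journal is the quickest signal of how far the loop has progressed. The sketch below pulls the highest counter out of captured journal text; maxRestartCounter is a hypothetical helper, and it assumes the "restart counter is at N" wording shown in the log excerpt above.

```typescript
// Extracts the highest restart counter seen in journal output.
// Returns null when no "restart counter is at N" message is present.
function maxRestartCounter(journalText: string): number | null {
  const pattern = /restart counter is at (\d+)/g;
  let max: number | null = null;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(journalText)) !== null) {
    const value = parseInt(match[1], 10);
    if (max === null || value > max) {
      max = value;
    }
  }
  return max;
}
```

The input would typically be the saved output of journalctl --user -u openclaw-gateway.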

Expected behavior

  1. The gateway should not enter an unbounded restart loop for a configuration error
  2. systemd should stop restarting after a reasonable number of attempts (e.g. 5 within 60s)
  3. Configuration validation failures should use a distinct exit code (e.g. EX_CONFIG=78 from sysexits.h) so systemd can distinguish "misconfigured" from "runtime crash"
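
The three expectations can be checked against a deliberately simplified model of systemd's restart handling. The sketch below is not systemd's implementation: it assumes Restart=always, a process that exits immediately on every start, and a fixed-window start rate limiter. RestartPolicy and countStarts are hypothetical names. The "before" policy assumes the stock systemd defaults of 5 starts per 10 seconds, which a 5-second restart delay never reaches.

```typescript
// Simplified restart policy for a unit with Restart=always.
interface RestartPolicy {
  restartSec: number;            // delay between exit and next start, must be > 0
  startLimitBurst: number;       // starts allowed per window
  startLimitIntervalSec: number; // window length in seconds
  restartPreventExitStatus: number[];
}

// Counts how many times the service is started within `horizonSec`
// when every start exits immediately with `exitStatus`.
function countStarts(policy: RestartPolicy, exitStatus: number, horizonSec: number): number {
  if (policy.restartSec <= 0) throw new Error("restartSec must be positive");
  let starts = 0;
  let windowBegin = 0;
  let windowCount = 0;
  for (let now = 0; now <= horizonSec; now += policy.restartSec) {
    if (windowCount === 0 || now - windowBegin > policy.startLimitIntervalSec) {
      windowBegin = now; // open a new rate-limit window
      windowCount = 1;
    } else if (windowCount < policy.startLimitBurst) {
      windowCount += 1;
    } else {
      break; // start limit hit: the unit stays failed
    }
    starts += 1;
    if (policy.restartPreventExitStatus.indexOf(exitStatus) >= 0) {
      break; // this exit status is never retried
    }
  }
  return starts;
}

const beforePatch: RestartPolicy = {
  restartSec: 5, startLimitBurst: 5, startLimitIntervalSec: 10, restartPreventExitStatus: [],
};
const afterPatch: RestartPolicy = {
  restartSec: 5, startLimitBurst: 5, startLimitIntervalSec: 60, restartPreventExitStatus: [78],
};
```

Under these assumptions, countStarts(beforePatch, 1, 300) keeps starting the service for the whole five-minute horizon, countStarts(afterPatch, 1, 300) stops after the fifth start, and countStarts(afterPatch, 78, 300) stops after the first.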

Root cause analysis

  • src/daemon/systemd-unit.ts: the generated unit uses Restart=always + RestartSec=5 but has no StartLimitBurst or StartLimitIntervalSec
  • src/cli/gateway-cli/run.ts: config guard failures exit with code 1 (same as runtime errors), so systemd treats them as restartable failures

Proposed fix

I have a patch ready that:

  1. Adds StartLimitBurst=5 + StartLimitIntervalSec=60 to the systemd unit template
  2. Uses EX_CONFIG (exit 78) for config validation failures
  3. Adds RestartPreventExitStatus=78 so systemd never retries config errors

Will open a PR shortly.

extent analysis

TL;DR

To prevent the unbounded restart loop, set gateway.mode in the configuration file or apply a patch that adds StartLimitBurst and StartLimitIntervalSec to the systemd unit template.

Guidance

  • Verify that gateway.mode is set in the configuration file to prevent the restart loop.
  • Check the systemd unit file for StartLimitBurst and StartLimitIntervalSec settings to ensure they are configured to prevent excessive restarts.
  • Consider applying the proposed patch to update the systemd unit template and exit codes for configuration validation failures.
  • Test the changes by restarting the gateway daemon via systemd and monitoring the restart behavior using journalctl.
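
The unit-file check in the second bullet can be automated. This is a minimal sketch that reports which guards are absent from a unit file's text; missingRestartGuards is a hypothetical helper and is unrelated to the project's service-audit.ts. It only checks that the keys are present, not their values or sections.

```typescript
// Returns the names of restart guards that are missing from a unit file.
function missingRestartGuards(unitText: string): string[] {
  const required = ["StartLimitBurst", "StartLimitIntervalSec", "RestartPreventExitStatus"];
  const present: { [key: string]: boolean } = {};
  for (const rawLine of unitText.split("\n")) {
    const line = rawLine.trim();
    // Skip blanks and comments ("#" and ";" both start comments in unit files).
    if (line === "" || line.charAt(0) === "#" || line.charAt(0) === ";") continue;
    const eq = line.indexOf("=");
    if (eq > 0) {
      present[line.slice(0, eq).trim()] = true;
    }
  }
  return required.filter((key) => present[key] !== true);
}
```

An empty result means all three guards from the proposed patch are present; any returned name points at a directive to add or a unit to regenerate.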

Example

The fix itself is a code change in PR #63913 (systemd unit template and exit code). On an affected host, the immediate workaround is a configuration update: set gateway.mode in the configuration file.

Notes

The proposed fix is tracked in PR #63913. Once a release includes it, existing installations need to regenerate the systemd unit with openclaw gateway install --force to pick up the new guards.

Recommendation

Apply the workaround by setting gateway.mode in the configuration file until the proposed patch is merged and available. This will prevent the restart loop and allow for stable operation of the gateway daemon.
