openclaw - ✅ (Solved) Fix: Auto-update is not atomic: config/plugin version mismatch causes repeated crash loops [1 pull request, 1 participant]

GitHub stats
openclaw/openclaw#58041 · Fetched 2026-04-08 01:54:38
Comments: 0 · Participants: 1 · Timeline: 3 (cross-referenced ×2, referenced ×1) · Reactions: 1

Auto-update consistently causes gateway crash loops because it is not atomic — config or plugin manifests get updated to require a newer version while the binary remains at the old version (or vice versa). This has caused 3 separate overnight outages in 6 days on our setup.

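Error Message

plugins.entries.discord: plugin requires OpenClaw >=2026.3.28, but this host is 2026.3.24; skipping load
(19 plugins failed with same error)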

Root Cause

Auto-update is not atomic. A single run can leave the installation in a mixed state:

  • npm packages are updated (including plugin manifests with new version requirements)
  • The config file is updated (via doctor/migration)
  • The LaunchAgent plist entrypoint is NOT updated
  • The update is sometimes applied only partially, leaving the binary at the old version while config/plugins expect the new version

Fix Action

Workaround

Running openclaw update manually works correctly every time; only auto-update fails.

PR fix notes

PR #58228: fix(feishu): resolve correct accountId for subagent group replies

Description (problem / solution / changelog)

Fixes #58107

Problem: When multiple Feishu group agents (xixi, ling, aoao, weiwei) send messages in a group chat, only the main agent's messages are delivered correctly. Subagents were using the wrong accountId ('default') instead of their assigned accountId.

Root Cause: The Feishu channel's handleAction function was using ctx.accountId, which may be undefined or 'default' for subagents. Subagents have their own agentAccountId in the tool execution context, but this wasn't being used.

Solution: Modified extensions/feishu/src/channel.ts to prioritize agentAccountId (available for subagents) over ctx.accountId when resolving the account to use for sending messages.

Changes

  • In handleAction function, added logic to use agentAccountId from context when available
  • Updated all references to ctx.accountId within the function to use the effective accountId
  • This ensures subagents like xixi, ling, aoao, weiwei use their own Feishu accounts when sending group messages
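
The precedence described in these changes can be illustrated with a minimal sketch. This is not the code from the PR: the function name resolveEffectiveAccountId, the shape of the context object, and the final 'default' fallback are assumptions made for illustration.

```javascript
// Hypothetical sketch of the precedence described above; the context shape
// and the 'default' fallback are assumptions, not the code from the PR.
function resolveEffectiveAccountId(ctx) {
  // Prefer the subagent's own account, then the channel context, then the default.
  return ctx.agentAccountId ?? ctx.accountId ?? 'default';
}

console.log(resolveEffectiveAccountId({ agentAccountId: 'xixi', accountId: 'default' })); // xixi
console.log(resolveEffectiveAccountId({ accountId: undefined })); // default
```

Resolving the account once at the top of the handler and reusing the result is what keeps every send, thread-reply, and reaction action on the same account.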

Testing: The fix ensures that:

  1. Main agent continues to work as before
  2. Subagents use their assigned accountId (xixi, ling, aoao, weiwei) instead of default
  3. All Feishu message sending actions (send, thread-reply, reactions, etc.) respect the agent's accountId

Changed files

  • extensions/feishu/src/channel.ts (modified, +25/-23)
  • extensions/feishu/src/channel.ts.backup (added, +1190/-0)
  • src/infra/update-runner.ts (modified, +94/-0)
  • src/infra/update-startup.ts (modified, +33/-1)

Environment

  • macOS 15.x (arm64, Mac mini)
  • Node: v22.22.0
  • OpenClaw: experienced across v2026.3.13 → v2026.3.23-2 → v2026.3.24 → v2026.3.28
  • LaunchAgent plist with KeepAlive: true

Crash Pattern

Every auto-update follows the same failure mode:

  1. Auto-update triggers during stableDelayHours window
  2. npm i -g openclaw@latest runs, updates packages
  3. Gateway receives SIGTERM for restart
  4. On restart, version mismatch between config/plugins and binary
  5. Gateway fails to start → crash loop → launchd eventually sends SIGKILL with reason "inefficient"
  6. Manual openclaw update required to recover

Incident 1 (Mar 25): Entrypoint change

  • v2026.3.13 → v2026.3.23-2
  • New version changed entrypoint from dist/index.js → dist/entry.js
  • LaunchAgent plist not updated by auto-update
  • Gateway crash-looped ~10 hours overnight

Incident 2 (Mar 26): Same entrypoint issue

  • v2026.3.23-2 → v2026.3.24
  • Same plist/entrypoint mismatch
  • openclaw doctor flagged it but nobody was around to act
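
Incidents 1 and 2 both come down to the plist pointing at a script that no longer exists after the update. A preflight check for that condition might look like the sketch below. The function name findMissingEntrypoints and the example paths are hypothetical, and the sketch assumes the ProgramArguments array has already been read out of the LaunchAgent plist by some other means.

```javascript
// Hypothetical preflight: given the ProgramArguments array from a LaunchAgent
// plist, return the script paths that no longer exist on disk.
// `exists` is injectable so the check can be exercised without touching the filesystem.
function findMissingEntrypoints(programArguments, exists = require('node:fs').existsSync) {
  return programArguments
    .filter((arg) => arg.endsWith('.js'))
    .filter((scriptPath) => !exists(scriptPath));
}

// Example with a stubbed filesystem where only dist/entry.js exists after the update.
// The paths are made up for illustration.
const installed = new Set(['/opt/openclaw/dist/entry.js']);
const missing = findMissingEntrypoints(
  ['/usr/local/bin/node', '/opt/openclaw/dist/index.js', 'gateway'],
  (p) => installed.has(p),
);
console.log(missing); // [ '/opt/openclaw/dist/index.js' ]
```

Running a check like this before sending SIGTERM would let the updater refuse to restart, or repair the plist first, instead of handing launchd a broken entrypoint.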

Incident 3 (Mar 31): Plugin version requirements

  • v2026.3.24 → v2026.3.28
  • Auto-update wrote config compatible with v2026.3.28
  • Binary stayed at v2026.3.24
  • Every plugin (including discord, imessage) refused to load:
    plugins.entries.discord: plugin requires OpenClaw >=2026.3.28, but this host is 2026.3.24; skipping load
    (19 plugins failed with same error)
  • Gateway could not start at all

Expected Behavior

Auto-update should do at least one of the following:

  1. Be fully atomic — update binary, config, plist, and plugins in one transaction, or roll back on failure
  2. Run openclaw doctor --fix as part of the update before restarting
  3. Validate config against the current binary version before applying config changes
  4. Not modify config to require a version that is not yet running
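
The "one transaction, or roll back on failure" behavior in item 1 can be sketched as a generic apply-or-undo runner. This is only an illustration of the pattern: runAtomically and the step names are hypothetical, the steps are synchronous for brevity, and nothing here reflects how OpenClaw's updater is actually structured.

```javascript
// Generic apply-or-roll-back runner. The step names below are hypothetical
// stand-ins for "install binary", "migrate config", "rewrite plist", and so on.
function runAtomically(steps) {
  const completed = [];
  try {
    for (const step of steps) {
      step.apply();
      completed.push(step);
    }
  } catch (err) {
    // Undo completed steps in reverse order, then surface the original failure.
    for (const step of completed.reverse()) {
      step.rollback();
    }
    throw err;
  }
}

// Demo: the third step fails, so the first two are rolled back.
const events = [];
const makeStep = (name, shouldFail = false) => ({
  apply() {
    if (shouldFail) throw new Error(`${name} failed`);
    events.push(`apply ${name}`);
  },
  rollback() {
    events.push(`rollback ${name}`);
  },
});

try {
  runAtomically([makeStep('binary'), makeStep('config'), makeStep('plist', true)]);
} catch (err) {
  console.log(err.message); // plist failed
}
console.log(events);
```

In the demo the recorded events are the two applies followed by their rollbacks in reverse order, which is the property that prevents a new config from being left beside an old binary.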

Related Issues

  • #25595 — Same root cause (config becomes invalid after update, doctor --fix not auto-run)
  • #54861 — launchd kills service due to rapid restart cycle after auto-update

All three issues stem from the same fundamental problem: auto-update is not atomic and has no rollback mechanism.

extent analysis

Fix Plan

To address the non-atomic auto-update issue, we will implement the following steps:

  • Modify the auto-update script to run openclaw doctor --fix after updating packages and before restarting the gateway.
  • Validate config against the current binary version before applying config changes.
  • Implement a rollback mechanism in case of update failure.

Here's an example of how the modified auto-update script could look:

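#!/bin/bash
set -euo pipefail

# Record the installed version so a failed update can be rolled back to it
PREVIOUS_VERSION="$(npm ls -g openclaw --depth=0 --json | node -p 'JSON.parse(require("fs").readFileSync(0, "utf8")).dependencies.openclaw.version')"

# Update packages
npm i -g openclaw@latest

# Run doctor --fix to ensure config and plugins are compatible, then validate
# config against the current binary version. validate-config is the proposed
# command described below; it does not exist yet.
if ! openclaw doctor --fix || ! openclaw validate-config; then
  # Roll back to the recorded version if repair or validation fails
  npm i -g "openclaw@${PREVIOUS_VERSION}"
  echo "Update failed, rolled back to ${PREVIOUS_VERSION}" >&2
  exit 1
fi

# Restart gateway
pkill -SIGTERM openclaw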

We will also add a validate-config command to OpenClaw to check if the config is compatible with the current binary version:

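// openclaw.js
const { version } = require('./package.json');

// ...

// Compare dotted versions such as 2026.3.23-2 numerically, segment by segment
function compareVersions(a, b) {
  const pa = a.split(/[.-]/).map(Number);
  const pb = b.split(/[.-]/).map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// getConfig() and config.requiredVersion are placeholders for the real loader and field
async function validateConfig() {
  const config = await getConfig();
  const requiredVersion = config.requiredVersion;
  // Fail only when the running binary is older than the config requires
  if (requiredVersion && compareVersions(version, requiredVersion) < 0) {
    throw new Error(`Config requires version >=${requiredVersion}, but current version is ${version}`);
  }
  return true;
}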

Verification

To verify that the fix worked, we will:

  • Test the auto-update script with different scenarios (e.g., updating to a new version, rolling back to a previous version).
  • Monitor the gateway for crash loops and verify that it restarts correctly after an update.
  • Check the logs to ensure that openclaw doctor --fix is run correctly and that the config is validated against the current binary version.

Extra Tips

  • It's essential to test the auto-update script thoroughly to ensure that it works correctly in all scenarios.
  • Consider adding additional logging and monitoring to detect and respond to update failures.
  • Review the openclaw doctor --fix command to ensure it correctly updates the config and plugins to be compatible with the new binary version.
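
As one possible shape for the monitoring suggested above, the sketch below flags a crash loop from a list of process start times. The function name isCrashLooping and the thresholds are illustrative assumptions; how start times are collected (for example from logs) is left open.

```javascript
// Returns true when at least `maxRestarts` starts fall within any window of
// `windowMs` milliseconds. Thresholds are illustrative, not OpenClaw defaults.
function isCrashLooping(startTimestamps, maxRestarts, windowMs) {
  const sorted = [...startTimestamps].sort((a, b) => a - b);
  for (let i = 0; i + maxRestarts - 1 < sorted.length; i++) {
    // Compare the first and last start in each run of `maxRestarts` consecutive starts.
    if (sorted[i + maxRestarts - 1] - sorted[i] <= windowMs) return true;
  }
  return false;
}

// Five starts within eight seconds count as a crash loop for a one-minute window.
console.log(isCrashLooping([0, 2000, 4000, 6000, 8000], 5, 60000)); // true
```

A detector like this could trigger an alert or an automatic rollback before launchd gives up on the service.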
