openclaw - ✅(Solved) Fix openclaw boot: invalid plugin config crashes the whole worker (no graceful degradation) [1 pull requests, 1 comments, 2 participants]

GitHub stats
openclaw/openclaw#70371 · Fetched 2026-04-23 07:25:34
Comments: 1 · Participants: 2 · Timeline: 2 (commented ×1, cross-referenced ×1) · Reactions: 0

When a plugin's runtime config fails its declared schema during openclaw startup, the entire worker process exits with code 1 instead of marking that single plugin disabled and continuing. This is a single-tenant single-plugin failure mode that takes down everything else (channels, other plugins, scheduled jobs, etc).


PR fix notes

PR #70394: fix(plugins): degrade gracefully instead of crashing worker on invalid config

Description (problem / solution / changelog)

Fix #70371

Problem: A single invalid plugin config schema causes the entire worker/boot path to crash instead of degrading gracefully.

Root Cause: Both loadPluginMetadataRegistrySnapshot and ensurePluginRegistryLoaded set throwOnLoadError: true. These functions run in mode: "validate" paths where the goal is to check and report — not crash the worker.

Fix: Remove throwOnLoadError: true from both functions. Error status and diagnostics are recorded correctly; only the fatal throw is removed.

Changes:

  • src/plugins/runtime/metadata-registry-loader.ts: throwOnLoadError: true → false
  • src/plugins/runtime/runtime-registry-loader.ts: throwOnLoadError: true → false
  • src/plugins/runtime/runtime-registry-loader.test.ts: update expected value
  • src/plugins/loader.cli-metadata.test.ts: add regression test
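
To illustrate the effect of the flag described above, here is a minimal sketch. Only the option name throwOnLoadError comes from the PR description; loadRegistry, PluginRecord, and the validator shape are hypothetical and are not the real openclaw loader code. The sketch shows that the error status and diagnostics are recorded in both modes, and that only the fatal throw differs.

```typescript
// Hypothetical sketch: every name except throwOnLoadError is illustrative.
type PluginStatus = "loaded" | "error";

interface PluginRecord {
  id: string;
  status: PluginStatus;
  diagnostics: string[];
}

interface LoadOptions {
  throwOnLoadError: boolean;
}

// A validator returns a list of problems; an empty list means the config is valid.
type Validator = (config: unknown) => string[];

function loadRegistry(
  entries: Array<{ id: string; config: unknown; validate: Validator }>,
  options: LoadOptions,
): PluginRecord[] {
  const records: PluginRecord[] = [];
  for (const entry of entries) {
    const problems = entry.validate(entry.config);
    if (problems.length > 0) {
      // The error status and diagnostics are recorded in both modes.
      records.push({ id: entry.id, status: "error", diagnostics: problems });
      if (options.throwOnLoadError) {
        // Only this fatal throw differs between the two settings.
        throw new Error(`plugin ${entry.id}: ${problems.join("; ")}`);
      }
      continue;
    }
    records.push({ id: entry.id, status: "loaded", diagnostics: [] });
  }
  return records;
}
```

With throwOnLoadError set to false, a caller can inspect the returned records and decide how to report the failed entry, rather than losing the whole process to one bad config.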

Changed files

  • extensions/qqbot/src/utils/text-parsing.test.ts (modified, +4/-0)
  • extensions/qqbot/src/utils/text-parsing.ts (modified, +2/-2)
  • src/agents/anthropic-vertex-stream.test.ts (modified, +22/-0)
  • src/agents/anthropic-vertex-stream.ts (modified, +9/-1)
  • src/auto-reply/reply/reply-delivery.test.ts (modified, +12/-20)
  • src/auto-reply/reply/reply-delivery.ts (modified, +31/-10)
  • src/plugins/loader.cli-metadata.test.ts (modified, +29/-0)
  • src/plugins/runtime/metadata-registry-loader.ts (modified, +1/-1)
  • src/plugins/runtime/runtime-registry-loader.test.ts (modified, +1/-1)
  • src/plugins/runtime/runtime-registry-loader.ts (modified, +1/-1)


Reproduction

  1. Install any openclaw plugin whose manifest declares a config field with a pattern or other strict validator (example: communityRepo: { type: "string", pattern: "^[^/]+/[^/]+$" }).
  2. Have the runtime openclaw.json initialize that field as "" (e.g. via a config-reset workflow that uses empty defaults).
  3. Boot the worker.
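
The validation failure in step 2 can be checked in isolation. The pattern below is copied from the example manifest field in step 1; the helper name matchesRepoPattern is an assumption made for this illustration and is not part of openclaw.

```typescript
// Pattern taken from the example manifest field: an "owner/repo" shape.
const communityRepoPattern = new RegExp("^[^/]+/[^/]+$");

// Hypothetical helper: returns true when the value satisfies the pattern.
function matchesRepoPattern(value: string): boolean {
  return communityRepoPattern.test(value);
}

console.log(matchesRepoPattern(""));           // false: the empty default fails
console.log(matchesRepoPattern("owner/repo")); // true
console.log(matchesRepoPattern("owner/"));     // false: nothing after the slash
```

An empty string cannot match because the pattern requires at least one non-slash character on each side of a single slash, so any reset workflow that writes "" produces an invalid config.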

Observed (live worker logs, 2026-04-22 17:07 UTC)

F {"0":"Config invalid","logLevelId":5,"logLevelName":"ERROR",...}
F {"0":"File: /mnt/openclaw-state/orgs/<org>/openclaw.json",...}
F {"0":"Problem:",...}
F {"0":"  - plugins.entries.community-feedback.config.communityRepo: invalid config: must match pattern \"^[^/]+/[^/]+$\"",...}
F {"0":"Run: openclaw doctor --fix",...}
F Run: openclaw doctor --fix
[container exits 1 → CrashLoopBackOff]

The whole worker crash-looped (Probe of StartUp failed with status code: 1 × 80+ events) until a human edited the openclaw.json to provide a valid value for the offending field. WhatsApp channel went down. Trip-broadcast poller went down. Concierge for the entire tenant went down — all because of one optional plugin's empty config.

Suggested behavior

When config validation fails for a single plugin entry, the loader should:

  1. Log the error clearly (it already does this — keep it).
  2. Mark that plugin entry as disabled (or failed) in the in-memory registry. Skip its register(). Surface the failure in openclaw doctor output.
  3. Continue booting the rest of the gateway / other plugins / channels.
  4. The plugin's tools become unavailable; the agent gets a clear "plugin not loaded" error if they're called.
  5. Optionally: emit a structured event (log + maybe webhook) so an operator dashboard can detect the disabled state.

This matches how openclaw already handles plugin LOAD failures (silent skip per "Plugins fail SILENTLY if wrong" — repo memory). Config-validation failures should be no harsher: invalid config = same as failed load = skip + continue.
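
The suggested behavior can be sketched as a boot loop that records a failed entry and moves on. This is a minimal illustration under stated assumptions: bootPlugins, resolveTool, PluginEntry, and Registry are hypothetical names, and the real openclaw registry and register() signatures may differ.

```typescript
// Hypothetical sketch of the suggested loader behavior; no name here is
// taken from the openclaw codebase.
interface PluginEntry {
  id: string;
  config: Record<string, unknown>;
  // Returns a list of problems; an empty list means the config is valid.
  validate: (config: Record<string, unknown>) => string[];
  // Returns the names of the tools the plugin provides.
  register: () => string[];
}

interface Registry {
  tools: Map<string, string>;      // tool name -> owning plugin id
  disabled: Map<string, string[]>; // plugin id -> validation problems
}

function bootPlugins(entries: PluginEntry[], log: (msg: string) => void): Registry {
  const registry: Registry = { tools: new Map(), disabled: new Map() };
  for (const entry of entries) {
    const problems = entry.validate(entry.config);
    if (problems.length > 0) {
      // Step 1: log the error clearly.
      log(`Config invalid for plugin ${entry.id}: ${problems.join("; ")}`);
      // Step 2: mark the entry disabled and skip its register().
      registry.disabled.set(entry.id, problems);
      // Step 3: continue booting the remaining plugins.
      continue;
    }
    for (const tool of entry.register()) {
      registry.tools.set(tool, entry.id);
    }
  }
  return registry;
}

function resolveTool(registry: Registry, pluginId: string, tool: string): string {
  if (registry.disabled.has(pluginId)) {
    // Step 4: calls into a disabled plugin get a clear error.
    throw new Error(`plugin not loaded: ${pluginId}`);
  }
  if (registry.tools.get(tool) !== pluginId) {
    throw new Error(`unknown tool: ${tool}`);
  }
  return `${pluginId}/${tool}`;
}
```

Keeping the validation problems in the registry, instead of only logging them, is what would let a diagnostic command or a structured event (step 5) report the disabled state later.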

Why this matters

Plugin authors can't always anticipate every operator misconfiguration. With strict validators + boot-time crash, any plugin author who ships a non-optional config field with a regex effectively has the power to take down the whole worker if the operator doesn't seed a valid value. Today the safe pattern is "all config optional + runtime-check in tool execute()". That's overly defensive and pushes validation to call time instead of boot time, where it should be.

Acceptance criteria

  • Config validation failure for plugin entry X causes plugin X to be disabled but worker continues to boot.
  • openclaw doctor lists disabled plugins with the failing config-path → schema-rule that caused it.
  • Other plugins, channels, and gateway are unaffected.
  • openclaw doctor --fix either prompts to seed a valid value OR strips the offending field and re-validates.
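
As an illustration of the second criterion, the sketch below formats a list of disabled plugins with the failing config path and schema rule. The output layout and the names formatDoctorReport, DisabledPlugin, and ConfigProblem are assumptions; the issue does not specify what openclaw doctor should print.

```typescript
// Hypothetical report formatter; the real doctor output format is not specified.
interface ConfigProblem {
  path: string; // e.g. the config path reported by the validator
  rule: string; // e.g. the schema rule that failed
}

interface DisabledPlugin {
  id: string;
  problems: ConfigProblem[];
}

function formatDoctorReport(disabled: DisabledPlugin[]): string[] {
  if (disabled.length === 0) {
    return ["No disabled plugins."];
  }
  const lines: string[] = [];
  for (const plugin of disabled) {
    lines.push(`Plugin ${plugin.id}: disabled`);
    for (const problem of plugin.problems) {
      // One line per failing config path and the rule it violated.
      lines.push(`  - ${problem.path} → ${problem.rule}`);
    }
  }
  return lines;
}

const report = formatDoctorReport([
  {
    id: "community-feedback",
    problems: [
      {
        path: "plugins.entries.community-feedback.config.communityRepo",
        rule: 'must match pattern "^[^/]+/[^/]+$"',
      },
    ],
  },
]);
console.log(report.join("\n"));
```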

Workaround we use today

Make every plugin config field optional and runtime-check in the tool's execute(). Throw a friendly user-visible error from the tool if a required field is missing. This works but means plugin authors lose the ergonomics of declarative schema validation and must re-implement validation imperatively in every tool entry-point.
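
A minimal sketch of this workaround follows, assuming a hypothetical tool signature. The field name communityRepo and its pattern come from the issue; CommunityFeedbackConfig, execute's parameters, and the error text are illustrative and do not reflect the actual plugin code.

```typescript
// Hypothetical config type: the field is optional so boot-time validation cannot fail on it.
interface CommunityFeedbackConfig {
  communityRepo?: string;
}

// The check that the manifest pattern used to perform declaratively.
const REPO_PATTERN = new RegExp("^[^/]+/[^/]+$");

// Hypothetical tool entry point: validates at call time instead of boot time.
function execute(config: CommunityFeedbackConfig, message: string): string {
  const repo = config.communityRepo;
  if (!repo || !REPO_PATTERN.test(repo)) {
    // Friendly, user-visible error instead of a worker crash.
    throw new Error('community-feedback is not configured: set communityRepo to "owner/repo".');
  }
  return `Would file feedback in ${repo}: ${message}`;
}
```

The cost described above is visible here: the pattern check now lives in the tool body, and every other tool entry point that depends on the field would need the same guard.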

Related downstream work

  • sidekykai/sidekyk-verticals#55 — plugin that triggered this (community-feedback). We're also patching the plugin to make communityRepo optional as a workaround until upstream lands a fix.

Severity

Medium-to-high. Single-plugin config bugs cause tenant-wide outage. Workaround exists but adds boilerplate to every plugin and gives up declarative validation.

Extent analysis

TL;DR

Modify the openclaw loader to mark a plugin as disabled and continue booting when its runtime config fails schema validation, rather than exiting the entire worker process.

Guidance

  • Identify the plugin whose config is causing the validation failure and verify that its manifest declares a config field with a strict validator (e.g., pattern).
  • Update the openclaw loader to catch config validation errors, log the error, and mark the plugin as disabled in the in-memory registry, skipping its register() call.
  • Ensure that the loader continues booting the rest of the gateway, plugins, and channels after marking the plugin as disabled.
  • Enhance openclaw doctor to list disabled plugins with the failing config-path and schema rule that caused the failure.

Example

No code snippet is provided as the issue does not contain explicit code references.

Notes

The current workaround of making every plugin config field optional and runtime-checking in the tool's execute() method is not ideal, as it loses the benefits of declarative schema validation.

Recommendation

Apply a workaround by modifying the plugin to make its config fields optional and implementing runtime checks, until the upstream fix is implemented. This will prevent tenant-wide outages due to single-plugin config bugs.
