openclaw - 💡(How to fix) Fix RFC: config.patch safety guardrails — dry-run validation, auto-backup, and post-apply doctor [1 comments, 1 participants]

openclaw · 2026-03-27 04:13:26

GitHub stats

openclaw/openclaw#55556 • Fetched 2026-04-08 01:38:04

Author: sydneygemstone-sudo

Participants: sydneygemstone-sudo

Timeline (top): closed ×1, commented ×1, locked ×1

Problem

config.patch is a high-stakes operation in OpenClaw. While the current implementation does validate the merged config against the Zod schema (with .strict() mode rejecting unknown keys) before writing to disk, there are still operational gaps that make config management risky in agent-driven environments:

No dry-run/preview mode: Agents cannot preview what a patch would change before committing. They must apply the patch to discover if it's valid, which triggers a restart even for exploratory changes.
No automatic backup: If a patch passes schema validation but introduces a runtime issue (valid schema, bad runtime behavior), there's no built-in rollback path.
Every successful patch triggers a restart: Even for hot-reloadable paths, config.patch schedules SIGUSR1 (#43803, #46310), which can kill running ACP sessions (#52440).
LLM hallucination feedback loop: When an agent's config.patch is rejected, the error message may not clearly indicate what went wrong. The agent may retry with a different (also wrong) approach, wasting cycles.

These gaps are documented across several open issues:

#43803: config.patch sends SIGUSR1 for hot-reloadable paths
#46310: config.patch unconditionally schedules restart
#52440: ACP sessions killed by gateway restart
#43150: Config write race condition causes lost updates
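
The hallucination feedback loop described in the list of gaps is partly an error-reporting problem, so here is a minimal sketch of what a more actionable rejection message might look like. Everything in it is an assumption for illustration: `formatIssues` is a hypothetical helper, the issue shape (`code`, `path`, `message`, `keys`) is modeled on Zod-style validation issues, and `timeoutSecs` is a made-up misspelled key.

```javascript
// Hypothetical formatter: turn validation issues into one actionable line each.
// The issue shape (code, path, message, keys) is modeled on Zod-style issues
// and is an assumption about what the gateway has available.
function formatIssues(issues) {
  return issues.map((issue) => {
    const where = issue.path.length > 0 ? issue.path.join(".") : "(root)";
    if (issue.code === "unrecognized_keys") {
      return `${where}: unknown key(s) ${issue.keys.join(", ")}; remove or rename them`;
    }
    return `${where}: ${issue.message}`;
  });
}

// Example input: one unknown key and one type error (both invented).
const lines = formatIssues([
  {
    code: "unrecognized_keys",
    path: ["agents", "defaults"],
    keys: ["timeoutSecs"],
    message: "Unrecognized key(s) in object: 'timeoutSecs'",
  },
  {
    code: "invalid_type",
    path: ["agents", "defaults", "timeoutSeconds"],
    message: "Expected number, received string",
  },
]);
console.log(lines.join("\n"));
```

Naming the exact path and the offending key gives an agent something concrete to correct, rather than prompting another guess.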

Real-world scenario

In multi-agent setups:

Agent dispatches a config.patch to enable a feature
The patch is schema-valid, gets written, triggers SIGUSR1
A running ACP session (Codex doing a 5-minute coding task) gets killed
The agent doesn't know the ACP session was killed because there's no coordination

Proposed Improvements

1. Dry-run / preview mode

Add a dryRun: true option to config.patch that:

Validates the patch against the schema (same path as current validation)
Returns a diff preview showing what paths would change
Does not write to disk or schedule restart
Returns the full validation result so agents can make informed decisions

This is especially valuable for LLM agents that want to "plan" a config change before executing it.
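
The diff preview could be computed roughly as follows. This is a minimal sketch, assuming patches use dotted-path keys as in the example flow; `getPath` and `previewPatch` are hypothetical names rather than OpenClaw APIs, and schema validation is left out to keep the focus on the diff.

```javascript
// Hypothetical helper: read a dotted path such as "agents.defaults.timeoutSeconds".
function getPath(obj, dottedPath) {
  return dottedPath
    .split(".")
    .reduce((node, key) => (node == null ? undefined : node[key]), obj);
}

// Hypothetical dry-run preview: reports [oldValue, newValue] for every path
// whose value would change. It never mutates `current` and writes nothing.
function previewPatch(current, patch) {
  const diff = {};
  for (const [path, newValue] of Object.entries(patch)) {
    const oldValue = getPath(current, path);
    // JSON comparison is enough for plain config values in this sketch.
    if (JSON.stringify(oldValue) !== JSON.stringify(newValue)) {
      diff[path] = [oldValue, newValue];
    }
  }
  return { ok: true, dryRun: true, diff };
}

const current = { agents: { defaults: { timeoutSeconds: 172800 } } };
const preview = previewPatch(current, { "agents.defaults.timeoutSeconds": 600 });
console.log(JSON.stringify(preview.diff));
// prints {"agents.defaults.timeoutSeconds":[172800,600]}
```

Because the function only reads the current config, an agent can call it repeatedly while planning without any side effects.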

2. Auto-backup with rollback

Before every config.patch write:

Snapshot the current config to a backup (e.g., openclaw.json.bak or timestamped)
Keep the last N backups (configurable, default 3)
Provide a config.rollback command to restore the previous snapshot

This provides a safety net for valid-but-problematic changes.
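
One way the snapshot, retention, and rollback steps might fit together is sketched below using Node's built-in `fs` module. The `.bak.<timestamp>` naming, the function signatures, and the injectable `now` parameter are assumptions made for illustration; the RFC only specifies the behavior, not the layout.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: list backups for a config file, newest first.
// The ".bak.<timestamp>" suffix is an assumed naming scheme.
function listBackups(configPath) {
  const prefix = `${path.basename(configPath)}.bak.`;
  return fs
    .readdirSync(path.dirname(configPath))
    .filter((name) => name.startsWith(prefix))
    .sort((a, b) => Number(b.slice(prefix.length)) - Number(a.slice(prefix.length)));
}

// Copy the config to a timestamped backup, then keep only the newest `keep`.
function backupConfig(configPath, keep = 3, now = Date.now()) {
  const dir = path.dirname(configPath);
  fs.copyFileSync(configPath, path.join(dir, `${path.basename(configPath)}.bak.${now}`));
  for (const stale of listBackups(configPath).slice(keep)) {
    fs.unlinkSync(path.join(dir, stale));
  }
}

// Restore the newest backup over the live config; false if none exists.
function rollbackConfig(configPath) {
  const [newest] = listBackups(configPath);
  if (!newest) return false;
  fs.copyFileSync(path.join(path.dirname(configPath), newest), configPath);
  return true;
}

// Usage in a throwaway directory, with fake timestamps 1..5.
const tmpDir = fs.mkdtempSync(path.join(os.tmpdir(), "openclaw-"));
const configPath = path.join(tmpDir, "openclaw.json");
for (let version = 1; version <= 5; version++) {
  fs.writeFileSync(configPath, JSON.stringify({ version }));
  backupConfig(configPath, 3, version);
}
fs.writeFileSync(configPath, JSON.stringify({ version: 6 }));
rollbackConfig(configPath);
console.log(fs.readFileSync(configPath, "utf8")); // prints {"version":5}
```

Timestamped names avoid overwriting the only backup with a bad config, which a single `openclaw.json.bak` file would risk on the second consecutive patch.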

3. Restart-aware patching

When config.patch detects that changed paths are hot-reloadable:

Skip SIGUSR1 and apply hot-reload instead
Only schedule restart for paths that truly require it
Warn if running ACP/subagent sessions would be affected by the restart
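
The restart decision described above can be sketched as a small planning function. The prefix list and the `logging.level` and `gateway.port` paths are invented for illustration, and `isHotReloadable` and `planApply` are hypothetical names; the real set of hot-reloadable paths would have to come from the gateway.

```javascript
// Assumed for illustration only: which config prefixes can be hot-reloaded.
// The real set would come from the gateway, not from this list.
const HOT_RELOADABLE_PREFIXES = ["logging", "ui"];

function isHotReloadable(changedPath) {
  return HOT_RELOADABLE_PREFIXES.some(
    (prefix) => changedPath === prefix || changedPath.startsWith(`${prefix}.`)
  );
}

// Hypothetical planner: a restart is needed if ANY changed path is not
// hot-reloadable; the ACP warning only matters when a restart will happen.
function planApply(changedPaths, activeAcpSessions) {
  const restartPaths = changedPaths.filter((p) => !isHotReloadable(p));
  const requiresRestart = restartPaths.length > 0;
  const plan = { requiresRestart, restartPaths, activeAcpSessions };
  if (requiresRestart && activeAcpSessions > 0) {
    plan.warning = `${activeAcpSessions} active ACP sessions will be terminated by restart`;
  }
  return plan;
}

console.log(planApply(["logging.level"], 2));
console.log(planApply(["logging.level", "gateway.port"], 2));
```

A mixed patch is treated as restart-requiring as a whole, since hot-reloading part of it and restarting for the rest would still interrupt running sessions.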

Example Flow

Agent: config.patch({ "agents.defaults.timeoutSeconds": 600 }, { dryRun: true })
Gateway: { 
  ok: true, 
  dryRun: true,
  diff: { "agents.defaults.timeoutSeconds": [172800, 600] },
  requiresRestart: true,
  activeAcpSessions: 2,
  warning: "2 active ACP sessions will be terminated by restart"
}
// Agent decides to wait until ACP sessions complete before applying
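
On the agent side, the decision in the last line of the flow could be expressed as a small policy function. This is a sketch under the assumption that the dry-run response has exactly the fields shown in the flow; `decide` and its return values are hypothetical.

```javascript
// Hypothetical agent-side policy: decide what to do with a dry-run response.
function decide(response) {
  if (!response.ok) return "fix-patch"; // validation failed, revise the patch
  if (Object.keys(response.diff).length === 0) return "skip"; // nothing would change
  if (response.requiresRestart && response.activeAcpSessions > 0) return "defer";
  return "apply";
}

// The response from the example flow.
const response = {
  ok: true,
  dryRun: true,
  diff: { "agents.defaults.timeoutSeconds": [172800, 600] },
  requiresRestart: true,
  activeAcpSessions: 2,
  warning: "2 active ACP sessions will be terminated by restart",
};
console.log(decide(response)); // prints "defer"
```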

Note on current validation

Credit where due: the current config.patch implementation already validates against the Zod schema with .strict() before writing, so unknown keys are rejected at the gateway level. This RFC focuses on the remaining gaps (preview, backup, restart coordination) rather than schema validation.

extent analysis

Fix Plan

To address the operational gaps in the config.patch operation, we will implement the following changes:

Add a dryRun option to config.patch to preview changes without writing to disk
Implement automatic backup and rollback for config.patch operations
Modify config.patch to skip restart for hot-reloadable paths and warn about affected ACP sessions

Step-by-Step Solution

1. Add dryRun option:
- Update the config.patch function to accept a dryRun option
- If dryRun is true, validate the patch against the schema and return a diff preview without writing to disk
2. Implement auto-backup and rollback:
- Before writing to disk, create a backup of the current config
- Store the last N backups (configurable, default 3)
- Add a config.rollback command to restore the previous snapshot
3. Modify restart behavior:
- Check if changed paths are hot-reloadable
- If hot-reloadable, apply hot-reload instead of scheduling a restart
- Warn if running ACP sessions would be affected by the restart

Example Code

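// Sketch only. Helpers such as validatePatchAgainstSchema, calculateDiff,
// mergePatch, getConfig, writeConfig, readConfig, getLastBackupPath,
// pruneBackups, getChangedPaths, getHotReloadablePaths, applyHotReload,
// scheduleRestart, getActiveAcpSessions, and warnAboutAffectedAcpSessions
// are hypothetical and are not defined here.

// config.patch function with dryRun option
function configPatch(patch, options = {}) {
  const { dryRun = false } = options;
  const validation = validatePatchAgainstSchema(patch);
  if (validation.error) {
    return { ok: false, error: validation.error };
  }
  // Diff against the current config so the preview shows old and new values
  const diff = calculateDiff(getConfig(), patch);
  if (dryRun) {
    // Preview only: nothing is written and no restart is scheduled
    return { ok: true, dryRun: true, diff };
  }
  // Snapshot first so a valid-but-problematic change can be rolled back
  backupConfig();
  writeConfig("openclaw.json", mergePatch(getConfig(), patch));
  const { requiresRestart } = applyPatch(patch);
  return { ok: true, dryRun: false, diff, requiresRestart };
}

// Auto-backup and rollback implementation
function backupConfig(maxBackups = 3) {
  const currentConfig = getConfig();
  const backupPath = `openclaw.json.bak.${Date.now()}`;
  writeConfig(backupPath, currentConfig);
  pruneBackups(maxBackups); // keep only the last N snapshots
}

function rollbackConfig() {
  const lastBackupPath = getLastBackupPath();
  // Named previousConfig so it does not shadow backupConfig()
  const previousConfig = readConfig(lastBackupPath);
  writeConfig("openclaw.json", previousConfig);
}

// Modified restart behavior
function applyPatch(patch) {
  const changedPaths = getChangedPaths(patch);
  const hotReloadablePaths = getHotReloadablePaths(changedPaths);
  // Restart if at least one changed path cannot be hot-reloaded
  const requiresRestart = hotReloadablePaths.length < changedPaths.length;
  if (requiresRestart) {
    // Warn before scheduling, since only a restart affects ACP sessions
    const activeAcpSessions = getActiveAcpSessions();
    if (activeAcpSessions > 0) {
      warnAboutAffectedAcpSessions(activeAcpSessions);
    }
    scheduleRestart();
  } else {
    applyHotReload(hotReloadablePaths);
  }
  return { requiresRestart };
}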

Verification

To verify the fix, test the following scenarios:

config.patch with dryRun: true returns a diff preview without writing to disk
`config.patch
