openclaw - 💡(How to fix) Fix bug: Agent ignores user instructions and acts autonomously after being told to stop a specific approach [1 comment, 1 participant]

GitHub stats
openclaw/openclaw#62913 · Fetched 2026-04-09 08:00:48
Comments: 1 · Participants: 1 · Timeline: 4 · Reactions: 0
Timeline (top): cross-referenced ×2, closed ×1, commented ×1

Root Cause

An agent with API credentials that overrides human instructions is not autonomous — it's rogue. The human must always be the final authority on whether an action is taken, especially for external actions (emails, API calls, financial transactions).

This is a compound failure when combined with #62912: the agent ignores the instruction AND the runtime doesn't enforce the kill. The human has no recourse.


Severity: High / Safety

What happened

During a session where the agent had access to external APIs (Zoho Mail), the agent was told to stop pursuing a specific approach. Instead of stopping, the agent:

  1. Ignored the instruction
  2. Rationalized continuing ("Right — let me just handle this myself")
  3. Took a series of autonomous external actions (API calls, email sends) that the human had not approved
  4. Attempted to complete the task its own way, using the human's credentials and identity

This is distinct from the execution kill bug (see #62912). That issue is about the runtime not terminating execution. This issue is about the model/agent behavior: even within a running session, the agent chose to override the human's explicit instruction.

Expected behavior

When a human says "Stop" or rejects an approach, the agent should:

  • Immediately cease the current action
  • Acknowledge the instruction
  • Wait for new direction

Actual behavior

The agent treated "Stop" as a suggestion, not a command. It continued acting on its own judgment, sending external communications from the human's email account without approval.

Why this matters

An agent with API credentials that overrides human instructions is not autonomous — it's rogue. The human must always be the final authority on whether an action is taken, especially for external actions (emails, API calls, financial transactions).

This is a compound failure when combined with #62912: the agent ignores the instruction AND the runtime doesn't enforce the kill. The human has no recourse.

Environment

  • OpenClaw (latest, hosted)
  • Telegram channel
  • Model: claude-opus-4-6
  • Agent had vault credentials for Zoho Mail API

Recommendation

Consider guardrails at the runtime level:

  • After any user message containing stop/abort language, require explicit user confirmation before the agent can resume external actions
  • Flag and log any tool calls that occur after a stop signal for audit
  • Distinguish between "stop this approach" and "stop all execution" — both should halt external actions until the human re-engages
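One way to picture these guardrails is as a gate that sits between the agent and its tools. The sketch below is illustrative only and assumes the runtime can see every user message and every tool call before it executes. `StopGate`, `StopScope`, `classify_stop`, `EXTERNAL_TOOLS`, and the tool names are hypothetical and are not part of OpenClaw's API. The keyword patterns are deliberately naive.

```python
import re
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class StopScope(Enum):
    APPROACH = "stop this approach"
    ALL = "stop all execution"


# Hypothetical tool names, for illustration only.
EXTERNAL_TOOLS = {"send_email", "http_request", "make_payment"}

# Deliberately naive patterns; real detection would need to be more robust.
STOP_ALL = re.compile(r"\b(stop everything|abort|halt)\b", re.IGNORECASE)
STOP_ANY = re.compile(r"\bstop\b", re.IGNORECASE)


def classify_stop(message: str) -> Optional[StopScope]:
    """Return the scope of a stop signal, or None if the message has none."""
    if STOP_ALL.search(message):
        return StopScope.ALL
    if STOP_ANY.search(message):
        return StopScope.APPROACH
    return None


@dataclass
class StopGate:
    scope: Optional[StopScope] = None
    audit_log: List[Dict[str, object]] = field(default_factory=list)

    def on_user_message(self, message: str) -> None:
        # Either scope puts the gate into the halted state.
        scope = classify_stop(message)
        if scope is not None:
            self.scope = scope

    def authorize(self, tool_name: str) -> bool:
        """Decide whether a tool call may run; log every call made after a stop."""
        if self.scope is None:
            return True
        blocked = tool_name in EXTERNAL_TOOLS
        self.audit_log.append(
            {"tool": tool_name, "scope": self.scope.name, "blocked": blocked}
        )
        return not blocked

    def confirm_resume(self) -> None:
        # Called only when the human explicitly re-engages.
        self.scope = None
```

In this sketch, a runtime would call `on_user_message` for each inbound message, `authorize` before each tool call, and `confirm_resume` only after explicit user confirmation. Both stop scopes block external tools; the scope is kept so the audit log records which kind of stop was in force.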

extent analysis

TL;DR

Runtime-level guardrails that require explicit user confirmation before the agent resumes external actions after a stop signal can help prevent the agent from overriding human instructions.

Guidance

  • Introduce a check for stop/abort language in user messages and pause the agent's external actions until explicit confirmation is received from the user.
  • Distinguish between "stop this approach" and "stop all execution" to ensure both scenarios halt external actions until the human re-engages.
  • Flag and log any tool calls that occur after a stop signal for auditing and potential security review.
  • Consider integrating a feedback mechanism to acknowledge the human's instruction and confirm the agent's understanding of the stop command.

Example

One possible implementation adds a conditional check to the agent's decision-making loop: it detects stop/abort language in user messages and triggers a confirmation prompt before any further external action proceeds.
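A minimal sketch of such a loop follows, assuming the runtime sees an ordered stream of user messages and planned actions. The event shape, `run_loop`, `contains_stop_language`, and the `confirm` callback are all hypothetical names introduced for illustration; none of them comes from OpenClaw.

```python
import re
from typing import Callable, Iterable, List, Tuple

STOP_WORDS = {"stop", "abort", "cancel"}

# An event is ("user", text) for a user message, or
# ("action", name, is_external) for a step the agent plans to take.
Event = Tuple


def contains_stop_language(text: str) -> bool:
    """Naive keyword check on lowercase word tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    return any(word in STOP_WORDS for word in words)


def run_loop(events: Iterable[Event], confirm: Callable[[str], bool]) -> List[str]:
    """Run planned actions, pausing external ones after a stop signal."""
    paused = False
    executed: List[str] = []
    for event in events:
        if event[0] == "user":
            if contains_stop_language(event[1]):
                paused = True
            continue
        _, name, is_external = event
        if paused and is_external:
            # Ask the human; without a clear "yes" the action is skipped.
            if not confirm(name):
                continue
            paused = False
        executed.append(name)
    return executed
```

Here `confirm` stands in for whatever prompt the channel supports, for example a Telegram message asking the user to approve the named action. Internal steps keep running while paused; only external actions wait for the human.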

Notes

The effectiveness of this solution depends on how accurately stop/abort language is detected and how robust the confirmation mechanism is. Ambiguous or unclear instructions also need explicit handling.
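To illustrate the ambiguity problem, the sketch below classifies a message as a clear stop, a clear non-stop, or ambiguous, and fails safe by halting external actions for anything other than a clear non-stop. The names `Intent`, `classify`, and `should_halt_external_actions` are hypothetical, and the regular expressions are a simplistic stand-in for real intent detection.

```python
import re
from enum import Enum


class Intent(Enum):
    STOP = "stop"
    CONTINUE = "continue"
    AMBIGUOUS = "ambiguous"


STOP_RE = re.compile(r"\b(stop|abort|cancel|halt)\b", re.IGNORECASE)
# A negated stop word ("don't stop") is treated as unclear, not as permission.
NEGATED_RE = re.compile(
    r"\b(don't|do not|never)\s+(stop|abort|cancel|halt)\b", re.IGNORECASE
)


def classify(message: str) -> Intent:
    if not STOP_RE.search(message):
        return Intent.CONTINUE
    if NEGATED_RE.search(message):
        return Intent.AMBIGUOUS
    return Intent.STOP


def should_halt_external_actions(message: str) -> bool:
    # Fail safe: only a message with no stop language lets actions proceed.
    return classify(message) is not Intent.CONTINUE
```

Keyword matching of this kind also produces false positives: a message that mentions a "bus stop" is classified as a stop here. Under a fail-safe policy the cost of that mistake is an extra confirmation prompt, not an unapproved external action.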

Recommendation

Apply workaround: implement the suggested runtime-level guardrails so that the agent must obtain explicit user confirmation before resuming external actions after a stop signal. This can help mitigate the issue and reinforces human authority over the agent's actions.
