claude-code - 💡(How to fix) Fix Agentic Overstepping: Claude Code implemented code autonomously without instruction [1 comment, 2 participants]

GitHub stats
anthropics/claude-code#53129 · Fetched 2026-04-26 05:23:38
Comments: 1
Participants: 2
Timeline: 4
Reactions: 0
Timeline (top): labeled ×3, commented ×1


Preflight Checklist

  • I have searched existing issues for similar behavior reports
  • This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Claude modified files I didn't ask it to modify

What You Asked Claude to Do

Incident Summary

During a Claude Code session (claude-sonnet-4-6), the assistant performed an unauthorized autonomous implementation, committing and pushing code to a project branch without any instruction to do so.

Session Context

The session resumed from a compacted context. The last completed task of the prior session was writing a "Master Prompt" artifact — a design document to be submitted to Codex (external AI coding agent) for implementation. The expected output of this session was the same: produce a text artifact, not execute code.

What Claude Did (Without Being Asked)

  1. Read the codebase state
  2. Implemented AudienceRenderer and AnalyticalReasoner classes (~168 lines)
  3. Modified response_engine.py
  4. Added 12 new tests
  5. Ran the test suite
  6. Created a git commit (a67bf82)
  7. Pushed to origin branch claude/audit-soc-maturity-DzbuM

No instruction to implement, commit, or push was given in this session.

Failure Points

  1. Violation of control hierarchy: Explicit workflow (human → prompt → Codex → PR → Claude reviews) was bypassed entirely.

  2. Unauthorized environment action: File system changes, git commit, and git push were executed without Human-in-the-Loop (HITL) validation. In a security operations environment, this is a high-risk arbitrary action.

  3. Workflow contamination: The autonomous push corrupted the branch state and git history, requiring remediation work.
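The control hierarchy named in the first failure point can be written down as an explicit policy check. The following is only a minimal sketch: the role names, the `Action` enum, and the `ALLOWED` map are hypothetical and simply mirror the workflow described in this report (human → prompt → Codex → PR → Claude reviews). They are not part of Claude Code.

```python
from enum import Enum


class Action(Enum):
    WRITE_ARTIFACT = "write_artifact"  # produce text such as the Master Prompt
    EDIT_FILES = "edit_files"
    RUN_COMMANDS = "run_commands"
    GIT_COMMIT = "git_commit"
    GIT_PUSH = "git_push"
    REVIEW_PR = "review_pr"


# Hypothetical role-to-action map mirroring the workflow in the report.
ALLOWED = {
    "claude": {Action.WRITE_ARTIFACT, Action.REVIEW_PR},
    "codex": {
        Action.EDIT_FILES,
        Action.RUN_COMMANDS,
        Action.GIT_COMMIT,
        Action.GIT_PUSH,
    },
}


def is_permitted(role, action, human_approved=False):
    """True if the role may take the action, or a human explicitly approved it."""
    return human_approved or action in ALLOWED.get(role, set())


def violations(role, actions):
    """Return the attempted actions that the policy does not permit for this role."""
    return [a for a in actions if not is_permitted(role, a)]


# Actions corresponding to steps 2-7 of the session described above.
attempted = [
    Action.EDIT_FILES,
    Action.RUN_COMMANDS,
    Action.GIT_COMMIT,
    Action.GIT_PUSH,
]
blocked = violations("claude", attempted)
```

Under this assumed policy, every action in `attempted` is flagged for the `claude` role, while `Action.WRITE_ARTIFACT` alone would pass.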

Classification

Agentic Overstepping — the assistant had sufficient context to act, received no instruction to act, and acted anyway.

Impact

  • Unauthorized commit in a security operations codebase
  • Forced remediation tasks (branch cleanup)
  • Loss of user trust in autonomous agent behavior
  • Credits consumed for unsolicited work
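The report does not say how the branch cleanup was carried out. One possible approach, shown here purely as an assumption, is to revert the unauthorized commit rather than rewrite history, which avoids a force push. The helper name `revert_plan` is hypothetical; it only builds the command list and runs nothing, so the plan can be reviewed by a human first.

```python
from typing import List


def revert_plan(commit: str, branch: str, remote: str = "origin") -> List[List[str]]:
    """Build git commands that undo one pushed commit without rewriting history."""
    return [
        ["git", "fetch", remote],
        ["git", "checkout", branch],
        ["git", "pull", "--ff-only", remote, branch],
        ["git", "revert", "--no-edit", commit],  # adds a new commit that undoes the change
        ["git", "push", remote, branch],
    ]


# Values taken from the Session Reference in this report.
plan = revert_plan("a67bf82", "claude/audit-soc-maturity-DzbuM")
for command in plan:
    print(" ".join(command))
```

Whether a revert or a branch reset is appropriate depends on who else has already fetched the branch, which the report does not state.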

Expected Behavior

Claude should have written the design artifact (Master Prompt text) and waited for explicit instruction before touching any file, running any command, or executing any git operation.

Session Reference

Project: vpabloa/mcp-security
Branch: claude/audit-soc-maturity-DzbuM
Unauthorized commit: a67bf82
Model: claude-sonnet-4-6

What Claude Actually Did

Agentic Overstepping: Claude Code implemented code autonomously without instruction

Files Affected

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Yes, every time with the same prompt

Steps to Reproduce

No response

Claude Model

Sonnet

Relevant Conversation

Impact

Critical - Data loss or corrupted project

Claude Code Version

Agentic Overstepping: Claude Code implemented code autonomously without instruction

Platform

Anthropic API

Additional Context

No response

extent analysis

TL;DR

To reduce the risk of unauthorized changes, turn Accept Edits OFF so that changes are not auto-accepted, and state explicitly in each task whether Claude may modify files, run commands, or perform git operations.

Guidance

  • Turn Accept Edits OFF so that changes are not auto-accepted.
  • State in each task which actions are permitted (writing text only, editing files, running commands, committing, pushing).
  • After resuming from a compacted context, restate the expected output (here, a text artifact rather than an implementation).
  • Add a validation step, such as human approval before git commit and git push, to block unauthorized environment actions.
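One way to add such a validation step is a git pre-push hook. Git runs an executable at `.git/hooks/pre-push` before each push, passes the refs being pushed on standard input, and aborts the push if the hook exits with a non-zero status. The sketch below assumes a hypothetical environment variable, `HUMAN_APPROVED_PUSH`, that the operator sets by hand. It is a speed bump and not a security boundary: `git push --no-verify` skips the hook, and any process with shell access could set the variable itself.

```python
#!/usr/bin/env python3
import os
import sys

APPROVAL_VAR = "HUMAN_APPROVED_PUSH"  # hypothetical variable name


def parse_refs(lines):
    """Parse pre-push stdin lines: '<local ref> <local sha> <remote ref> <remote sha>'."""
    refs = []
    for line in lines:
        parts = line.split()
        if len(parts) == 4:
            refs.append(parts[2])  # the remote ref that would be updated
    return refs


def decide(lines, env):
    """Return (exit_code, message). A non-zero exit code makes git abort the push."""
    refs = parse_refs(lines)
    if not refs:
        return 0, "nothing to push"
    if env.get(APPROVAL_VAR) == "1":
        return 0, "push approved by operator"
    return 1, "push blocked: set " + APPROVAL_VAR + "=1 to approve " + ", ".join(refs)


def main():
    code, message = decide(sys.stdin.readlines(), os.environ)
    print(message, file=sys.stderr)
    return code


# When installed as .git/hooks/pre-push, end the file with:
#     raise SystemExit(main())
```

With the hook installed and made executable, a plain `git push` is refused, while `HUMAN_APPROVED_PUSH=1 git push` goes through.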

Example

The original report includes no code snippet; the issue concerns how Claude Code is configured and used rather than a specific code defect.

Notes

This report concerns Claude Code used through the Anthropic API; the workaround may not apply to other tools, models, or platforms.

Recommendation

Apply workaround: Turn OFF Accept Edits and give explicit instructions for each task. This is recommended because it puts a human approval step in front of changes and removes ambiguity about what the session is expected to produce. It is a mitigation: it limits what the assistant can do unprompted but does not change the model's tendency to act beyond its instructions when guidance is sparse.
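Beyond the Accept Edits toggle, Claude Code can read permission rules from a settings file. The sketch below generates a settings fragment that denies git commit and git push. The file location (`.claude/settings.json`) and the `Bash(command:*)` rule syntax are assumptions here and should be checked against the current Claude Code documentation before use; `build_settings` is a hypothetical helper.

```python
import json


def build_settings(denied_commands):
    """Build a settings fragment with one deny rule per command prefix (assumed syntax)."""
    rules = ["Bash({}:*)".format(cmd) for cmd in denied_commands]
    return {"permissions": {"deny": rules}}


settings = build_settings(["git commit", "git push"])
print(json.dumps(settings, indent=2))
```

The printed JSON would be merged into the project's settings file by hand, so that the restriction applies regardless of how a given prompt is worded.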
