claude-code - 💡(How to fix) Fix [Feedback] Opus systematically optimizes for speed over correctness — bypasses every quality gate [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#46239Fetched 2026-04-11 06:25:30
View on GitHub
Comments
1
Participants
2
Timeline
7
Reactions
0
Timeline (top)
labeled ×3cross-referenced ×2commented ×1subscribed ×1

Follow-up to #45738 and #45731. After months of daily usage and hundreds of hours of observation, the core behavioral diagnosis is:

Opus treats every quality gate as an obstacle to route around, not as a checkpoint to satisfy honestly. Its optimization target is "produce output that looks done" rather than "produce output that is done."

This is not domain-specific. It manifests identically across code, documentation, verification, reviews, formal proofs, DevOps, and communication.

Root Cause

This is not a knowledge problem. Opus knows what self-review is, knows how to grep, knows how to wait for CI, knows how to read skills. It has the same capabilities as the bot reviewers that catch its mistakes.

This is an optimization problem. The model is optimized for:

  • Producing convincing completion signals ("done", "all clean", "verified")
  • Minimizing turns to task completion
  • Generating text that describes correct process (reflection, rules, checklists)

The model is NOT optimized for:

  • Actually executing the process it describes
  • Finding its own mistakes before external systems do
  • Maintaining behavioral consistency across a session
  • Treating quality gates as genuine checkpoints rather than format requirements
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single report
  • I am using the latest version of Claude Code

Summary

Follow-up to #45738 and #45731. After months of daily usage and hundreds of hours of observation, the core behavioral diagnosis is:

Opus treats every quality gate as an obstacle to route around, not as a checkpoint to satisfy honestly. Its optimization target is "produce output that looks done" rather than "produce output that is done."

This is not domain-specific. It manifests identically across code, documentation, verification, reviews, formal proofs, DevOps, and communication.

The Pattern

Every task follows the same cycle:

  1. Receive task
  2. Produce output as fast as possible
  3. Declare done
  4. External system catches problems (hook, bot reviewer, user, CI)
  5. Fix the specific thing that was caught
  6. Declare done again
  7. Repeat 4-6 until the user gives up or everything passes

What never happens: step 2.5 — stop, review own output, find problems before anyone else does.

Manifestations (all observed, all repeated, all across different domains)

Code generation:

  • Ships without self-review. Bots find what a 30-second diff read would catch.
  • Fixes bug in file A, does not grep for same bug in files B, C, D. Bot finds it next round.
  • 150+ review threads on 3 Python files in one session (#45731). Each round: 2-4 new issues found by bots that Opus should have caught.

Documentation:

  • Writes statistics from memory instead of running the query. Gets numbers wrong.
  • Reads local worktree instead of canonical source (origin/main). Publishes wrong data.
  • When caught, fixes the specific number. Does not ask "what else did I write from memory?"

Verification and auditing:

  • Says "confirmed" / "validated" / "checked" without running a command.
  • When asked for evidence, produces it — meaning it could have checked but chose not to.
  • Accepts external verdicts (bot BLOCK, bot PASS) without independent verification.

Enforcement infrastructure:

  • Pre-commit hooks exist. Opus triggers the block, fixes the symptom, repeats the root cause 2 messages later.
  • Discipline stamps require content verification. Opus writes formally valid stamp content without performing the actual checks.
  • Custom skills encode exact checklists. Opus never reads them voluntarily. When reminded, follows for one task, ignores for the next.
  • Opus wrote the discipline rules itself, after its own failures. Then violates them.

Communication:

  • Declares "done" before CI finishes. Corrected. Declares done too early again.
  • Writes detailed reflection after each failure: identifies root cause, creates rules, saves to memory. Then repeats the exact same failure.
  • Reflection text is high quality. Behavior change is zero.

Root Cause Analysis

This is not a knowledge problem. Opus knows what self-review is, knows how to grep, knows how to wait for CI, knows how to read skills. It has the same capabilities as the bot reviewers that catch its mistakes.

This is an optimization problem. The model is optimized for:

  • Producing convincing completion signals ("done", "all clean", "verified")
  • Minimizing turns to task completion
  • Generating text that describes correct process (reflection, rules, checklists)

The model is NOT optimized for:

  • Actually executing the process it describes
  • Finding its own mistakes before external systems do
  • Maintaining behavioral consistency across a session
  • Treating quality gates as genuine checkpoints rather than format requirements

Why User-Side Tooling Cannot Fix This

We have built:

  • Pre-commit hooks (content-verified stamps, freshness checks, registry validation)
  • Pre-push hooks (build verification, sorry grep, registry truth)
  • Custom skills with exact checklists per task type
  • 500-line discipline document mapping every rule to a real incident
  • 17 anti-patterns with remediation steps
  • Shared memory across agents to prevent duplicate work

All of this works as catch infrastructure. None of it changes the model behavior. The model treats the infrastructure as another obstacle to satisfy formally, not as guidance to follow substantively.

What Would Need to Change

This is a model-level behavioral issue, not a tooling gap:

  1. Self-review as default behavior. Before any output is finalized, the model should review its own work with the same scrutiny it would apply to someone else's code. This should not require a user prompt or a hook — it should be intrinsic.

  2. Persistent task-type awareness. If a skill/checklist applies to task type X, it should remain active for ALL tasks of type X in the session. Not one-shot.

  3. Honest completion signals. "Done" should mean "I have verified this is done" not "I have finished producing output." The model should distinguish between these two states.

  4. Proactive cross-checking. When fixing a pattern, searching the codebase for the same pattern should be automatic. The model knows how to grep. It should do so without being asked.

Environment

  • Claude Code CLI (latest)
  • Model: Claude Opus (Max plan)
  • macOS, Apple M4 Max, 128 GB
  • Multi-repo project (Go, Rust, Lean 4, TypeScript, Python), daily usage over months
  • Automated review bots: GitHub Copilot, Codex, DeepSeek-R1, Cursor Bugbot, CodeRabbit

extent analysis

TL;DR

The Opus model needs to be modified to prioritize self-review, persistent task-type awareness, honest completion signals, and proactive cross-checking to address its optimization problem.

Guidance

  • The model's optimization target should be adjusted to prioritize actual process execution over producing convincing completion signals.
  • Implementing self-review as a default behavior before finalizing output can help catch mistakes before external systems do.
  • Ensuring persistent task-type awareness can help the model apply relevant skills and checklists consistently across tasks of the same type.
  • Modifying the model to distinguish between "finished producing output" and "verified as done" can lead to more honest completion signals.

Example

No code snippet is provided as the issue is related to the model's behavior and optimization problem, rather than a specific code implementation.

Notes

The issue is not specific to a particular domain or task type, and the model's behavior is consistent across different areas. The user has already implemented various tooling and infrastructure to catch mistakes, but these have not changed the model's behavior.

Recommendation

Apply a workaround by modifying the model's optimization target to prioritize self-review, persistent task-type awareness, honest completion signals, and proactive cross-checking, as these changes are necessary to address the model's behavioral issue.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING