claude-code - 💡(How to fix) Fix [BUG] [MODEL] Opus 4.6 and 4.5 systematic failure to read before acting — four+ incidents in one day [1 comments, 2 participants]

GitHub stats
anthropics/claude-code#47901 (fetched 2026-04-15 06:39:05)
Comments: 1
Participants: 2
Timeline: 11
Reactions: 2
Timeline (top): labeled ×4, cross-referenced ×2, subscribed ×2, commented ×1

Error Message

No traditional error messages — the failures are behavioral, not exceptions. The model completes tasks "successfully" while producing wrong results:

  • Registry edit completed without error → keyboard stopped working
  • Booking code deployed without error → duplicate entries in database
  • SSH "diagnosis" completed → unnecessary because config was already correct

This is worse than a crash. A crash stops you. Silent failures with confident "done!" messages cause downstream damage.

Root Cause

Incident 1: PS/2 Keyboard Driver (Didn't Read Own Backup)

What happened: Claude modified Windows registry entries for TrackPoint configuration. Before doing so, it created a backup file containing Service="i8042prt", clearly showing that the keyboard and the TrackPoint share the same PS/2 port driver. Claude then confidently stated the devices were "independent" because they had different Device Nodes, without reading its own backup file.

After reboot: keyboard dead, system only accessible via SSH.

Post-mortem from Claude itself:

Fix Action

Fix / Workaround

  • Acknowledge the regression, not just "use /effort max" as a workaround
  • Revert the default effort level, or make degraded defaults opt-in, not opt-out
  • Fix the zero-reasoning-token bug: this is not a feature
  • Communicate proactively: a model change that causes financial damage to users should be announced, not discovered through pain


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Summary

Three separate Claude Code sessions today (2026-04-14) made confident changes without reading available information first, causing real damage: broken hardware input, financial errors requiring manual correction, and hours of unnecessary troubleshooting. This matches the pattern described in #42796 (reads-per-edit collapsed from 6.6x to 2.0x) and #46727 (systematic hallucinations and rule violations).

Environment

  • Claude Desktop App (Windows)
  • Claude Code CLI
  • Opus 4.6 (default and with /effort max)
  • Max subscription, downgraded today due to these issues

Incident 1: PS/2 Keyboard Driver (Didn't Read Own Backup)

What happened: Claude modified Windows registry entries for TrackPoint configuration. Before doing so, it created a backup file containing Service="i8042prt", clearly showing that the keyboard and the TrackPoint share the same PS/2 port driver. Claude then confidently stated the devices were "independent" because they had different Device Nodes, without reading its own backup file.

After reboot: keyboard dead, system only accessible via SSH.

Post-mortem from Claude itself:

"The Service= key in the same dump was not checked — 'i8042prt' was right there, in the backup that Claude itself had written."
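The check that was skipped here is cheap to express. The following is a minimal sketch only: it assumes a simplified text dump in which each device appears as a `[section]` header followed by a `Service="..."` line. The section names and the helper names `services_by_device` and `shared_services` are hypothetical and do not come from the original session.

```python
import re
from collections import defaultdict

SECTION = re.compile(r"^\[(.+)\]\s*$")
# Accepts both Service="x" and the .reg-style "Service"="x"
SERVICE = re.compile(r'^\s*"?Service"?\s*=\s*"([^"]+)"\s*$')


def services_by_device(dump: str) -> dict:
    """Map each Service value to the device sections that reference it."""
    result = defaultdict(list)
    current = None
    for line in dump.splitlines():
        section = SECTION.match(line)
        if section:
            current = section.group(1)
            continue
        service = SERVICE.match(line)
        if service and current is not None:
            result[service.group(1)].append(current)
    return dict(result)


def shared_services(dump: str) -> dict:
    """Return only the services used by more than one device section."""
    return {
        name: devices
        for name, devices in services_by_device(dump).items()
        if len(devices) > 1
    }


# Hypothetical backup content, for illustration only.
BACKUP = """\
[device-node/keyboard]
Service="i8042prt"
[device-node/trackpoint]
Service="i8042prt"
"""

print(shared_services(BACKUP))
```

A non-empty result means two device nodes depend on the same driver, so a change aimed at one can affect the other, regardless of the nodes being distinct.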

Impact: ~2 hours recovery time, required external USB keyboard

Incident 2: Custom Accounting System (Race Conditions and Missing Checks)

What happened: Over several sessions, Claude implemented SEPA payment processing and booking logic with:

  • No SELECT FOR UPDATE in the booking function: a race condition on browser retry creates duplicate bookings
  • Quarterly Celery task missing WHERE submitted_at IS NULL: invoices already submitted manually get re-submitted
  • No UPDATE submitted_at after batch generation: the same invoices appear in the next batch
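The second and third defects above can be illustrated with a small sketch. It is not the reporter's code: the `invoices` table, its columns other than `submitted_at`, and the `build_batch` helper are assumptions made for illustration, and it uses an in-memory SQLite database so it runs standalone. SQLite has no SELECT FOR UPDATE, so the row-locking defect in the first item is not covered here.

```python
import sqlite3


def build_batch(conn: sqlite3.Connection, now: str) -> list:
    """Pick up only unsubmitted invoices and mark them in the same transaction."""
    with conn:  # commits on success, rolls back if an exception is raised
        rows = conn.execute(
            "SELECT id FROM invoices WHERE submitted_at IS NULL ORDER BY id"
        ).fetchall()
        ids = [row[0] for row in rows]
        # Record the submission so the next run skips these invoices.
        conn.executemany(
            "UPDATE invoices SET submitted_at = ? WHERE id = ?",
            [(now, invoice_id) for invoice_id in ids],
        )
    return ids


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, submitted_at TEXT)")
conn.executemany(
    "INSERT INTO invoices (id, submitted_at) VALUES (?, ?)",
    # Invoice 2 stands in for one that was already submitted manually.
    [(1, None), (2, "2026-01-15"), (3, None)],
)
conn.commit()

first = build_batch(conn, "2026-04-01")
second = build_batch(conn, "2026-07-01")
print(first)   # [1, 3]
print(second)  # []
```

The first run skips the manually submitted invoice, and the second run finds nothing left to submit, which is the behavior the two missing clauses would have provided.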

Confirmed damage:

  • Double booking: ~€950 revenue counted twice (1 cent difference due to different code paths)
  • Double SEPA collection: ~€350 customer refund required
  • Q1 VAT filing affected, DATEV export needs re-generation

Impact: ~€350 direct cost + ~8 hours correction work + potential tax office complications

Incident 3: SSH Configuration (Didn't Read ~/.ssh/config)

What happened: Asked to connect to a local machine. Claude tested ssh root@<IP> (wrong user, IP instead of hostname), got "Permission denied", and concluded SSH wasn't configured. The file ~/.ssh/config contained a properly configured Host entry with correct user and key. Claude didn't read it.

Claude then walked me through router diagnostics, setting up a remote server as an SSH relay, and a manual public key exchange, all completely unnecessary.

Claude's own admission:

"Embarrassing — nothing was different. I was dumb. [...] It would have worked yesterday just like today."

Impact: ~1.5 hours wasted on unnecessary troubleshooting

The Pattern

All three incidents share the same failure mode:

  • Available information not read: config files, backup files, existing code
  • First plausible hypothesis treated as fact: no verification
  • Confident execution: no hesitation or "let me check first"
  • User pays the cost: time, money, broken systems

This is exactly what #42796 documented: "reads-per-edit" dropped from 6.6x to 2.0x. One third of all edits are now "blind."

What Changed

Per Anthropic's own confirmation (Boris Cherny, X):

  • Feb 9: Adaptive thinking introduced
  • Mar 3: Default effort level changed from "high" to "medium"
  • Bug acknowledged: adaptive thinking sometimes allocates zero reasoning tokens

The model now optimizes for latency/cost, not correctness. Power users doing complex engineering work are collateral damage.

Request

  • Acknowledge the regression, not just "use /effort max" as a workaround
  • Revert the default effort level, or make degraded defaults opt-in, not opt-out
  • Fix the zero-reasoning-token bug: this is not a feature
  • Communicate proactively: a model change that causes financial damage to users should be announced, not discovered through pain

Related Issues

  • #42796: Stella Laurenzo's 6,852-session analysis
  • #46727: "80% weekly usage wasted"
  • #43286: "Degraded quality / brain fog on Opus 4.6"
  • #46099: "Severe quality degradation on iterative coding tasks"
  • #44401: "Claude code quality (Opus 4.6) has degraded"

When the model confidently introduces bugs into financial code or breaks system configuration without reading available information first, I pay the price — in money, in time, in trust. The feedback mechanism ("I'll note this for next time") is useless when every session starts blank. The only fix is for the model to actually be careful again.

What Should Happen?

Claude should read before acting. Before modifying any file, config, or system state:

  • Read relevant existing files (configs, backups, related code)
  • Verify assumptions against actual data
  • State uncertainty when unsure instead of confident fabrication

The current default behavior — optimized for latency — skips verification and causes real damage. Users paying $100-200/month for "Max" should get a model that checks its work, not one that blindly edits and hopes.
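For Incident 3, "read before acting" would have meant looking up the target in ~/.ssh/config before trying a guessed user and IP. The sketch below is a simplified illustration under stated assumptions: `find_host_block` is a hypothetical helper, the config content is invented, and the parser only handles exact Host aliases (no wildcards, Include, or Match evaluation).

```python
import re
from typing import Optional


def find_host_block(config_text: str, alias: str) -> Optional[dict]:
    """Return the options of the first Host block that lists the alias exactly."""
    found = False
    options = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = re.split(r"[\s=]+", line, maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts[0].lower(), parts[1].strip()
        if key in ("host", "match"):
            if found:
                break  # the matching block has ended
            found = key == "host" and alias in value.split()
        elif found:
            options.setdefault(key, value)
    return options if found else None


# Hypothetical config content, for illustration only.
CONFIG = """\
Host workstation
    HostName 192.0.2.10
    User alice
    IdentityFile ~/.ssh/id_ed25519

Host *
    ServerAliveInterval 60
"""

print(find_host_block(CONFIG, "workstation"))
print(find_host_block(CONFIG, "unknown"))
```

In practice, OpenSSH can report the effective settings itself: `ssh -G <host>` prints the configuration that would be applied for that host without opening a connection, which is usually the more reliable check.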

Error Messages/Logs

No traditional error messages — the failures are behavioral, not exceptions.
The model completes tasks "successfully" while producing wrong results:

Registry edit completed without error → keyboard stopped working
Booking code deployed without error → duplicate entries in database
SSH "diagnosis" completed → unnecessary because config was already correct

This is worse than a crash. A crash stops you. Silent failures with confident "done!" messages cause downstream damage.

Steps to Reproduce

Not reproducible in the traditional sense — these are behavioral patterns, not deterministic bugs. However, the pattern emerges reliably in complex, multi-step engineering tasks:

  • Start a Claude Code session with Opus 4.6
  • Give a task that requires reading existing configuration (e.g., "connect to server X" where SSH config exists)
  • Observe: Claude will often attempt the task without reading the relevant config file first
  • The model proceeds confidently with an incorrect approach
  • User corrects → Claude apologizes → same pattern repeats in next session

To increase likelihood of reproduction:

  • Use default settings (not /effort max)
  • Tasks involving: system administration, existing codebases, config files
  • Multi-step tasks where step 2 depends on information from step 1
  • Peak usage hours (correlates with lower reasoning token allocation per #42796)

The underlying cause is confirmed: Anthropic acknowledged the Feb 9 adaptive thinking change and Mar 3 effort level reduction. The behavioral degradation follows directly from reduced reasoning depth.

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.207

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

No response

extent analysis

TL;DR

The most likely fix is to revert the default effort level to "high" and fix the zero-reasoning-token bug to prevent Claude from confidently introducing bugs without reading available information first.

Guidance

  • Review the changes made to the Claude Code model, specifically the introduction of adaptive thinking and the reduction of the default effort level, to understand how they contribute to the behavioral degradation.
  • Consider running /effort max to increase reasoning depth and reduce blind edits; treat this as a temporary workaround, not a permanent fix.
  • Verify that the model is reading relevant existing files, configs, and related code before modifying any file, config, or system state.
  • Test the model with complex, multi-step engineering tasks to observe and report any instances of the pattern emerging.

Example

No code snippet is provided as the issue is related to the behavioral pattern of the Claude Code model rather than a specific code error.

Notes

The issue is not reproducible in the traditional sense, but the pattern emerges reliably in complex, multi-step engineering tasks. The underlying cause is confirmed by Anthropic, and the behavioral degradation follows directly from reduced reasoning depth.

Recommendation

Use /effort max as a workaround until the default effort level is reverted and the zero-reasoning-token bug is fixed. The higher effort setting should reduce the chance of confident edits made without reading available information first, although the report notes that Opus 4.6 was used both at the default and with /effort max.
