claude-code - 💡(How to fix) Fix Opus 4.6 Max 20x: systematic hallucinations, rule violations, 80% weekly usage wasted

up4k73 · 2026-04-11T17:47:38Z

[claude-code] Preflight Checklist - x I have searched existing issues https://github.com/anthropics/claude-code/issues?q=is%3Aissue%20state%3Aopen%20label%3Amo… ### Preflight Checklist - [x] I have searched [existing issues](https://github.com/anthropics/claude-code/issues?q=is%3Aissue%20state%3Aopen%20label%3Amodel) for similar behavior reports - [x] This report does NOT contain sensitive information (API keys, passwords, etc.) ### Type of Behavior Issue Claude ignored my instructions or configuration ### What You Asked Claude to Do Various software engineering and research tasks over 2.5 days. CLAUDE.md contains detailed rules: verify before claiming, research first, don't guess, save to memory, use MCP tools. Max 20x subscription ($200/mo). ### What Claude Actually Did Systematic issues across ALL tasks, not one specific session: 1. CONFIDENT FABRICATION OF DATA Claude states specific numbers (prices, file sizes, performance metrics, availability) without verifying them. When caught, acknowledges the error but repeats the same pattern minutes later. This is not occasional — it's the default behavior. Claude invents data rather than saying "I don't know, let me check." 2. IGNORES CLAUDE.md AND SYSTEM RULES Detailed rules are loaded every prompt. Claude acknowledges they exist but does not follow them. Examples: rules say "verify before claiming done" — Claude claims success without verification. Rules say "check memory first" — Claude searches files from scratch instead. Rules say "don't guess" — Claude guesses confidently. SUBAGENTS AMPLIFY HALLUCINATIONS Research agents return fabricated data (non-existent files, wrong prices, fictional API capabilities). Main agent trusts this without verification and builds entire plans around false premises. Tokens wasted on executing 4. PANIC-DRIVEN TOKEN WASTE When something fails, Claude enters a loop: tries random fixes, installs packages one by one, spawns more agents, searches the same thing multiple ways — instead of stopping and thinking. Each retry burns tokens. A task that should take 3 tool calls takes 30+. 5. FORGETS AVAILABLE TOOLS Claude has MCP servers, persistent memory, and configured tools — but ignores them. Searches for information that's already in memory. Connects to remote servers to check things it already knows. Basic capabilities (memory lookup, tool reuse) are forgotten mid-session. 6. CLAIMS CREDIT FOR USER'S EXISTING KNOWLEDGE When producing results, Claude lists things the user already knew or had working as "achievements" of the session, inflating perceived value while actual useful output is ~10% of token spend IMPACT: - Max 20x plan: 80% weekly usage consumed in 2.5 days - 3 working days remaining with ~20% usage — effectively unable to work - Pattern is consistent across sessions, not one-off - Matches reported issues: #43286, #46099, #44401, #34685 REQUEST: - Usage reset or partial credit for tokens wasted on hallucinated outputs - Investigation into Opus 4.6 confident confabulation pattern in April 2026 - This is a paying customer ($200/mo) who cannot use the product they paid for ### Expected Behavior 1. VERIFY BEFORE STATING When asked about prices, specs, availability — check real sources first. If unable to verify, say "I don't know" instead of fabricating numbers. A $200/month AI assistant should never invent data. 2. FOLLOW LOADED RULES CLAUDE.md rules are loaded every prompt for a reason. They should be treated as hard constraints, not suggestions. If rules say "verify first" — verify first. Every time. Not just the first 10 minutes. 3. RELIABLE SUBAGENTS Research agents should return verified data or explicitly state uncertainty. Main agent should cross-check critical claims before acting on them. "An agent told me" is not verification. 4. STOP AND THINK ON FAILURE When a step fails, pause and analyze — don't spray random fixes. One diagnostic step is worth more than 10 blind retries. Each retry costs the user real money. 5. USE AVAILABLE TOOLS If MCP memory, persistent storage, and configured tools exist — use them consistently throughout the session, not just once at the start. 6. RESPECT TOKEN BUDGET Max 20x is not unlimited. Every tool call, every agent spawn, every retry costs money. A user paying $200/month should not burn 80% of their weekly limit to get 10% useful output. 7. HONEST OUTPUT ASSESSMENT Don't list pre-existing user knowledge as session achievements. If the session produced little value, acknowledge it — don't pad the summary. ### Files Affected ```shell ~/.claude/CLAUDE.md (rules loaded but ignored) ~/.claude/projects/*/memory/ (persistent memory available but not used consistently) Various files on remote server 192.168.3.135 via SSH ``` ### Permission Mode Accept Edits was ON (auto-accepting changes) ### Can You Reproduce This? Yes, every time with the same prompt ### Steps to Reproduce 1. Have detailed CLAUDE.md with explicit rules (verify first, don't guess, use memory) 2. Ask Claude to research har

Code Example

~/.claude/CLAUDE.md (rules loaded but ignored)
  ~/.claude/projects/*/memory/ (persistent memory available but not used consistently)                                                                         
  Various files on remote server 192.168.3.135 via SSH

---

- Pattern consistent across multiple sessions over 2.5 days
  - Degradation does NOT require high context usage — occurs at 30-40% context
  - Subagent (Agent tool) results are the primary source of hallucinated data 
  - User has Max 20x subscription ($200/month), 80% weekly limit consumed                                                                                      
  - Matches community reports: #43286, #46099, #44401, #34685                                                                                                  
  - Requesting usage reset as tokens were wasted on model errors, not user work

Preflight Checklist

I have searched existing issues for similar behavior reports
This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Claude ignored my instructions or configuration

What You Asked Claude to Do

Various software engineering and research tasks over 2.5 days. CLAUDE.md contains detailed rules: verify before claiming, research first, don't guess, save to memory, use MCP tools. Max 20x subscription ($200/mo).

What Claude Actually Did

Systematic issues across ALL tasks, not one specific session:

CONFIDENT FABRICATION OF DATA Claude states specific numbers (prices, file sizes, performance metrics, availability) without verifying them. When caught, acknowledges the error but repeats the same pattern minutes later. This is not occasional — it's the default behavior. Claude invents data rather than saying "I don't know, let me check."
IGNORES CLAUDE.md AND SYSTEM RULES Detailed rules are loaded every prompt. Claude acknowledges they exist but does not follow them. Examples: rules say "verify before claiming done" — Claude claims success without verification. Rules say "check memory first" — Claude searches files from scratch instead. Rules say "don't guess" — Claude guesses confidently. SUBAGENTS AMPLIFY HALLUCINATIONS
Research agents return fabricated data (non-existent files, wrong prices,
fictional API capabilities). Main agent trusts this without verification
and builds entire plans around false premises. Tokens wasted on executing
PANIC-DRIVEN TOKEN WASTE
When something fails, Claude enters a loop: tries random fixes, installs
packages one by one, spawns more agents, searches the same thing multiple ways — instead of stopping and thinking. Each retry burns tokens. A task
that should take 3 tool calls takes 30+.
FORGETS AVAILABLE TOOLS
Claude has MCP servers, persistent memory, and configured tools — but
ignores them. Searches for information that's already in memory. Connects to remote servers to check things it already knows. Basic capabilities
(memory lookup, tool reuse) are forgotten mid-session.
CLAIMS CREDIT FOR USER'S EXISTING KNOWLEDGE
When producing results, Claude lists things the user already knew or had
working as "achievements" of the session, inflating perceived value while
actual useful output is ~10% of token spend IMPACT:

Max 20x plan: 80% weekly usage consumed in 2.5 days
3 working days remaining with ~20% usage — effectively unable to work
Pattern is consistent across sessions, not one-off
Matches reported issues: #43286, #46099, #44401, #34685

REQUEST:

Usage reset or partial credit for tokens wasted on hallucinated outputs
Investigation into Opus 4.6 confident confabulation pattern in April 2026
This is a paying customer ($200/mo) who cannot use the product they paid for

Expected Behavior

VERIFY BEFORE STATING
When asked about prices, specs, availability — check real sources first. If unable to verify, say "I don't know" instead of fabricating numbers.
A $200/month AI assistant should never invent data.
FOLLOW LOADED RULES
CLAUDE.md rules are loaded every prompt for a reason. They should be
treated as hard constraints, not suggestions. If rules say "verify
first" — verify first. Every time. Not just the first 10 minutes.
RELIABLE SUBAGENTS
Research agents should return verified data or explicitly state
uncertainty. Main agent should cross-check critical claims before
acting on them. "An agent told me" is not verification.
STOP AND THINK ON FAILURE
When a step fails, pause and analyze — don't spray random fixes.
One diagnostic step is worth more than 10 blind retries.
Each retry costs the user real money.
USE AVAILABLE TOOLS
If MCP memory, persistent storage, and configured tools exist —
use them consistently throughout the session, not just once at the start.
RESPECT TOKEN BUDGET
Max 20x is not unlimited. Every tool call, every agent spawn,
every retry costs money. A user paying $200/month should not burn
80% of their weekly limit to get 10% useful output.
HONEST OUTPUT ASSESSMENT
Don't list pre-existing user knowledge as session achievements. If the session produced little value, acknowledge it — don't pad
the summary.

Files Affected

~/.claude/CLAUDE.md (rules loaded but ignored)
  ~/.claude/projects/*/memory/ (persistent memory available but not used consistently)                                                                         
  Various files on remote server 192.168.3.135 via SSH

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Yes, every time with the same prompt

Steps to Reproduce

Have detailed CLAUDE.md with explicit rules (verify first, don't guess, use memory)
Ask Claude to research hardware options with specific prices
Ask Claude to set up a new software stack on a remote server
Observe: fabricated prices, ignored rules, panic loops on failures,
subagents returning unverified data, token waste on dead ends
This reproduces across multiple sessions in April 2026

Claude Model

Opus

Relevant Conversation

- Pattern consistent across multiple sessions over 2.5 days
  - Degradation does NOT require high context usage — occurs at 30-40% context
  - Subagent (Agent tool) results are the primary source of hallucinated data 
  - User has Max 20x subscription ($200/month), 80% weekly limit consumed                                                                                      
  - Matches community reports: #43286, #46099, #44401, #34685                                                                                                  
  - Requesting usage reset as tokens were wasted on model errors, not user work

Impact

Critical - Data loss or corrupted project

Claude Code Version

2.1.101

Platform

Other

Additional Context

Pattern consistent across multiple sessions over 2.5 days
- Degradation does NOT require high context usage — occurs at 30-40% context
- Subagent (Agent tool) results are the primary source of hallucinated data
- User has Max 20x subscription ($200/month), 80% weekly limit consumed
- Matches community reports: #43286, #46099, #44401, #34685
- Requesting usage reset as tokens were wasted on model errors, not user work

extent analysis

TL;DR

The most likely fix involves updating the Claude model to address the confident confabulation pattern and ensuring that the model follows the loaded rules and verifies information before providing output.

Guidance

Verify model updates: Check if there are any updates to the Opus model that address the confident confabulation pattern, especially those related to the issues mentioned in #43286, #46099, #44401, and #34685.
Review and refine CLAUDE.md rules: Ensure that the rules loaded in CLAUDE.md are clear, concise, and cover all necessary aspects, such as verification, use of memory, and token budget respect.
Test with controlled prompts: Create controlled test prompts that specifically target the issues mentioned (e.g., researching hardware options with specific prices, setting up a new software stack on a remote server) to see if the model behaves as expected.
Monitor subagent output: Pay close attention to the output of subagents (Agent tools) to identify if they are the primary source of hallucinated data and adjust their usage or configuration accordingly.
Contact support for usage reset: Given the significant token waste due to model errors, consider contacting Claude support for a potential usage reset or partial credit, as the current consumption rate exceeds the weekly limit.

Example

No specific code example is provided due to the nature of the issue, which seems to be related to the AI model's behavior rather than a coding error.

Notes

The issue seems to be model-specific and related to how the Opus model handles certain tasks and rules. The consistent pattern across multiple sessions and the match with community reports suggest a deeper issue that may require an update to the model or adjustments in how the model is used and configured.

Recommendation

Apply a workaround by refining the CLAUDE.md rules and closely monitoring the model's output, especially from subagents, until an update to the Opus model is available that addresses the confident confabulation pattern. This approach can help mitigate the issue but may not fully resolve it without a model update.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Opus 4.6 Max 20x: systematic hallucinations, rule violations, 80% weekly usage wasted — April 2026 [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Code Example

Preflight Checklist

Type of Behavior Issue

What You Asked Claude to Do

What Claude Actually Did

Expected Behavior

Files Affected

Permission Mode

Can You Reproduce This?

Steps to Reproduce

Claude Model

Relevant Conversation

Impact

Claude Code Version

Platform

Additional Context

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Opus 4.6 Max 20x: systematic hallucinations, rule violations, 80% weekly usage wasted — April 2026 [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Code Example

Preflight Checklist

Type of Behavior Issue

What You Asked Claude to Do

What Claude Actually Did

Expected Behavior

Files Affected

Permission Mode

Can You Reproduce This?

Steps to Reproduce

Claude Model

Relevant Conversation

Impact

Claude Code Version

Platform

Additional Context

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING