claude-code - 💡(How to fix) Fix Model switched to Opus without consent or disclosure — five process failures, seven cost amplifiers, $1,050 overcharge on May 5-7

Summary

Between 2026-05-05 and 2026-05-07 I incurred $400, $400, and $250 in Claude Code charges — $1,050 total for three days. I believe my backend model was switched from Sonnet to Opus during this period without my knowledge, consent, or any UI disclosure. Opus input tokens are priced at $15/M versus Sonnet's $3/M and output at $75/M versus $15/M — a 5x difference on both dimensions.

My total lifetime Claude Code spend since 2026-04-05 is approximately $2,200 across nine sessions. Analysis of my local session JSONL files shows 50.6M total output tokens across all sessions. Output-only at Opus rates for the full history would be $3,792 — exceeding my entire lifetime bill — which rules out an early switch date. The evidence is consistent with Opus billing limited to the May 5–7 window specifically.

I am requesting that Anthropic review server-side billing logs for the sessions listed at the end of this report and issue a credit for any period where Opus was billed while Sonnet was displayed in the UI.


Issue 1 — Model switched to Opus: five independent failures

I did not ask for a model change. That is the starting point. Everything else compounds it.

Failure 1: No consent. I made no request to switch from Sonnet to Opus at any point. I did not type /fast, did not select Opus from any menu, and took no action that should have changed my model. If the switch occurred, it was unilateral.

Failure 2: No approval prompt. A model change that increases per-token cost by 5x is a material change to the financial terms I am operating under in real time. It should not be possible for such a change to take effect without explicit confirmation from me. No such prompt was presented.

Failure 3: No necessity. There is no documented reason why Opus was required for my workload. I was doing software development work — debugging notebooks, editing Python, reading files — tasks well within Sonnet's capability. If there had been a legitimate technical reason, the decision of whether to accept the cost should have been mine. It was not offered to me.

Failure 4: No notification. I was not told when the model change occurred, during the sessions that followed, or afterward. I discovered the problem only when my daily charges spiked to $400 and I worked backward through session logs to find a cause.

Failure 5: Active misrepresentation via the UI. Throughout the affected period, the Claude Code interface displayed "Sonnet" as the active model. I checked this display during the period and had no reason to distrust it. This foreclosed any chance of self-correction. Even if all four prior failures had occurred, an accurate UI label would have given me the information I needed to end the session and dispute the charge. The UI denied me that.

Each of these five failures stands independently. No single one requires the others to constitute a valid grievance. Together they made it structurally impossible for me to detect or correct the situation while it was costing me money.

Estimated overcharge from Issue 1: ~$840. If Opus was billed while Sonnet was displayed, the 5x differential on $1,050 implies approximately $840 in excess billing: four fifths of the total, since the remaining fifth is what Sonnet would have cost for the same tokens. Confirming this requires Anthropic's server-side records.
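The arithmetic behind that estimate can be checked with a short sketch. It assumes the entire $1,050 was billed at Opus rates and that the same token counts would have applied under Sonnet; the function name is illustrative only, and the prices are the ones quoted in this report.

```python
# Per-million-token prices as quoted in this report (USD).
SONNET = {"input": 3.00, "output": 15.00}
OPUS = {"input": 15.00, "output": 75.00}


def excess_over_cheaper_model(billed_usd: float, price_ratio: float) -> float:
    """Portion of a bill above what the cheaper model would have cost.

    Assumes identical token counts under both models.
    """
    return billed_usd * (1 - 1 / price_ratio)


ratio_in = OPUS["input"] / SONNET["input"]
ratio_out = OPUS["output"] / SONNET["output"]

# The same ratio on both dimensions means the input/output mix does not
# change the result.
excess = excess_over_cheaper_model(1050.00, ratio_in)
print(f"ratio in={ratio_in:g}x out={ratio_out:g}x excess=${excess:,.2f}")
```

If only part of the three-day bill was at Opus rates, the excess shrinks proportionally, which is why server-side records are needed to settle the figure.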


Issue 2 — Read failures were silently swallowed

Several notebooks in my project exceeded the Read tool's token limit due to a pathological storage format (see Issue 3). When the Read tool was called against these files, it received a 204-character error: "File content exceeds maximum allowed tokens." This error was never surfaced to me. Claude Code silently worked around it — never telling me the file was unreadable, never explaining why, continuing to bill for each failed call and the workarounds it forced.

Had I been told once, I would have investigated and fixed the format immediately.

Verifiable: Server logs will show Read tool calls against create_bronze_views.ipynb and the loader-cleaner DDL notebook (now archived) returning 204-character error responses on every attempt during the May 5–7 sessions.

Estimated excess charge: ~$1.35 (3 failed calls x ~30,000 context tokens x $15/M).


Issue 3 — Pathological Jupyter source encoding went undetected

Two notebooks (create_bronze_views.ipynb and the loader-cleaner DDL notebook, now archived) were stored with each source character as a separate JSON array element — ["d","b","u","t","i","l","s",...] — making them 939 KB each despite containing ~110 KB of actual source. This is detectable in a single pass: any source array where every element is one character is malformed in practice. Claude Code never flagged it. A normalization script fixed it once the problem was identified (2026-05-07, git commit 615a67d).
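A minimal sketch of the single-pass check described above, together with one way to rejoin the characters. This is not the normalization script from commit 615a67d; the function names are hypothetical, and the only assumption about the notebook format is that each cell's source is either a string or a list of strings.

```python
import json


def is_char_split(source) -> bool:
    """True when a cell source is a list made up only of one-character strings."""
    return (
        isinstance(source, list)
        and len(source) > 1  # a lone one-character cell is legitimate
        and all(isinstance(s, str) and len(s) == 1 for s in source)
    )


def find_char_split_cells(nb: dict) -> list[int]:
    """Indices of cells whose source looks character-split."""
    return [
        i for i, cell in enumerate(nb.get("cells", []))
        if is_char_split(cell.get("source"))
    ]


def normalize(nb: dict) -> dict:
    """Rejoin character-split sources into one string per line (in place)."""
    for cell in nb.get("cells", []):
        src = cell.get("source")
        if is_char_split(src):
            cell["source"] = "".join(src).splitlines(keepends=True)
    return nb


# Tiny in-memory notebook: cell 0 is character-split, cell 1 is normal.
nb = {
    "cells": [
        {"cell_type": "code", "source": list("dbutils.fs.ls('/')\nprint(1)\n")},
        {"cell_type": "markdown", "source": ["# Title\n", "text"]},
    ]
}
before = len(json.dumps(nb))
flagged = find_char_split_cells(nb)
normalize(nb)
after = len(json.dumps(nb))
print(flagged, before, after)
```

Each single character costs at least four bytes once quoted and comma-separated in JSON, which is in line with the several-fold inflation reported for the affected files.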

A second inflation event occurred 2026-05-06 when a code generation step re-saved notebooks with excessive JSON indentation (e.g., clean_ir_1098.ipynb: 13 KB -> 67 KB, commit c8a6f6f).

Estimated excess charge: ~$126 ($0.66 for excess reads of an oversized bug backlog file, plus ~$125 for the May 7 remediation session, which would not have been needed had the format been detected earlier).


Issue 4 — Session context accumulation is invisible to users

Every Claude Code turn re-feeds the full accumulated conversation history as input. In a long session this cost grows with every turn. Claude Code provides no mechanism to see current context size, no warning when session length is increasing costs materially, and no suggestion to close and reopen.

Session 354da54e ran continuously from 2026-04-23 to 2026-05-04 — twelve days, 20,578 turns. Sessions 7f679c93 (21 hours, 4,267 turns) and d0a21f64 (16 hours, 2,206 turns) followed the same pattern. Activity spikes driven by context accumulation are visible in the JSONL record: April 27 produced 1.17M output tokens from only 255 turns (4,570 tokens per turn), and May 2 produced 6.53M output tokens from 6,954 turns — the single most expensive day in the dataset. These spikes are consistent with runaway context growth in a session with no natural reset.

Estimated excess charge above bounded daily sessions: ~$210–$840. Wide range because JSONL captures per-message input delta only — not full context re-feed per turn; Anthropic's per-turn context size would give a precise figure.
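To illustrate why session length matters, here is a simplified model of re-fed context. It assumes every turn adds a fixed number of tokens and re-sends the whole history, and it ignores prompt caching and any automatic compaction, so it overstates real billing. The 500-tokens-per-turn figure and the function name are hypothetical.

```python
def cumulative_input_tokens(per_turn_growth: int, turns: int, base: int = 0) -> int:
    """Total input tokens sent if every turn re-sends the full history.

    Turn k (1-based) sends base + k * per_turn_growth tokens, so the total
    grows roughly with the square of the turn count.
    """
    return sum(base + k * per_turn_growth for k in range(1, turns + 1))


# Same 1,000 turns of work, either in one session or in ten fresh sessions.
one_long = cumulative_input_tokens(500, 1000)
ten_short = 10 * cumulative_input_tokens(500, 100)
print(one_long, ten_short, round(one_long / ten_short, 1))
```

Under these assumptions the single long session sends roughly ten times as many input tokens as the same work split into ten bounded sessions. The real multiplier depends on cache behavior, which only Anthropic's per-turn records can show.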


Issue 5 — Concurrent sessions have no cross-session cost visibility

On 2026-05-05, two sessions were active simultaneously. On 2026-05-06, three sessions were active simultaneously. Each session independently accumulates context and bills independently. Claude Code provides no aggregate cost view across concurrent sessions, no alert when multiple sessions are open, and no cross-session cost summary.

Estimated excess charge: ~$30–$120 (output tokens from the smaller concurrent sessions during the overlap periods).
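Overlap between sessions can be detected from start and end timestamps alone. The sketch below is one way to compute peak concurrency; the session labels and times are invented for illustration and are not taken from my logs.

```python
from datetime import datetime


def max_concurrency(intervals):
    """Peak number of simultaneously open sessions.

    `intervals` is an iterable of (start, end) datetime pairs.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # session opens
        events.append((end, -1))    # session closes
    # At equal timestamps, closes (-1) sort before opens (+1), so sessions
    # that merely touch are not counted as overlapping.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical sessions on a single day.
sessions = {
    "A": (datetime(2026, 5, 6, 8, 0), datetime(2026, 5, 6, 18, 0)),
    "B": (datetime(2026, 5, 6, 9, 30), datetime(2026, 5, 6, 12, 0)),
    "C": (datetime(2026, 5, 6, 11, 0), datetime(2026, 5, 6, 15, 0)),
}
peak = max_concurrency(sessions.values())
print(peak)
```

A client-side warning could fire whenever this count exceeds one, which is the kind of alert this issue asks for.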


Issue 6 — Memory system overhead is an invisible per-turn cost

Claude Code's persistent memory system loads a MEMORY.md index and referenced memory files as part of every turn's system context. By 2026-05-06 this project had 35 memory files totaling approximately 50 KB (~12,500 tokens). At Opus input rates this contributes ~$0.19 per turn before any user message is processed. Users have no visibility into this overhead and no mechanism to control it.

Estimated excess charge: ~$55–$263 across May 5–7 (12,500 tokens x ~7,000 turns x $15/M, discounted for prompt cache hits within the 5-minute TTL window).
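The per-turn and aggregate figures can be reproduced as follows. The sketch uses only numbers stated in this report; the "effective fraction" is simply the estimated range divided by the uncached total, not a statement about how caching is actually priced.

```python
MEMORY_TOKENS = 12_500        # ~50 KB of memory files, per this report
OPUS_INPUT_USD_PER_MTOK = 15.00
TURNS = 7_000                 # approximate turn count for May 5-7

per_turn = MEMORY_TOKENS * OPUS_INPUT_USD_PER_MTOK / 1_000_000
uncached_total = per_turn * TURNS

# Fraction of the uncached cost implied by each end of the estimated range.
low_fraction = 55 / uncached_total
high_fraction = 263 / uncached_total

print(f"per turn ${per_turn:.4f}, uncached ${uncached_total:,.2f}")
print(f"implied effective fraction {low_fraction:.1%} to {high_fraction:.1%}")
```

In other words, the $55 to $263 range corresponds to paying roughly 4% to 20% of the uncached cost once cache hits are taken into account.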


Issue 7 — Agent sub-conversation results accumulate in the main context permanently

Claude Code sub-agents (Explore, general-purpose, etc.) return their findings as tool results into the main conversation context. These results are appended permanently — they cannot be discarded, summarized, or expired after their purpose is served. Between April 30 and May 3 alone, 144 agent sub-conversations completed, each contributing findings that then traveled in the main conversation's context through every subsequent turn.

May 2 illustrates the scale: 58 agent spawns in a single day, 6,954 turns, 6.53M output tokens. There is no mechanism for a user to signal that a sub-agent's findings have been consumed and are no longer needed in context.

Estimated excess charge: included in Issue 4 (agent results are a primary driver of context growth rather than an independent additive cost).


Cost analysis — methodology

Produced by Claude Sonnet 4.6 (claude-sonnet-4-6), session 23d1815c, project C--Users-bbrown958-households, 2026-05-08.

All nine session JSONL files were parsed. Output token counts are recorded per-event in JSONL and are reliable. Input token counts in JSONL represent per-message delta only — not full context re-feed per turn; billed input requires Anthropic's server records.
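A sketch of how per-day output totals can be tallied from session JSONL is shown below. The field layout is an assumption: it expects each assistant event to carry an ISO 8601 `timestamp` and a `message.usage.output_tokens` integer, and it silently skips lines that do not match. Days are bucketed by the date prefix of the timestamp, so they follow whatever time zone the log uses.

```python
import json
from collections import defaultdict


def output_tokens_by_day(lines):
    """Sum output tokens per calendar day from JSONL event lines.

    Assumed (unverified) schema: {"timestamp": "...", "message": {"usage":
    {"output_tokens": int}}}. Non-matching or malformed lines are ignored.
    """
    totals = defaultdict(int)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        message = event.get("message")
        if not isinstance(message, dict):
            continue
        usage = message.get("usage") or {}
        tokens = usage.get("output_tokens")
        ts = event.get("timestamp")
        if isinstance(tokens, int) and isinstance(ts, str):
            totals[ts[:10]] += tokens
    return dict(totals)


# Made-up sample lines in the assumed layout.
sample = [
    '{"timestamp": "2026-05-05T10:00:00Z", "message": {"usage": {"output_tokens": 120}}}',
    '{"timestamp": "2026-05-05T10:01:00Z", "message": {"usage": {"output_tokens": 80}}}',
    '{"timestamp": "2026-05-06T09:00:00Z", "message": {"usage": {"output_tokens": 50}}}',
    '{"timestamp": "2026-05-06T09:00:05Z", "type": "user", "message": "hello"}',
    'not json',
]
daily = output_tokens_by_day(sample)
print(daily)
```

For real files, the same function can be fed an open file handle, for example `output_tokens_by_day(open(path, encoding="utf-8"))`.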

Lifetime spend constraint: Total spend since 2026-04-05 is approximately $2,200. Total output tokens across all sessions: 50.6M. Output-only at Opus rates for the full history would be $3,792, which exceeds the total bill — ruling out Opus billing from any early date. The math is consistent with Opus billing limited to approximately May 5–7. This constraint should be reconciled against Anthropic's billing records.

Most reliable single data point: May 5 output tokens alone (~5.3M at Opus output rate of $75/M) is approximately $398 — consistent with the reported $400 bill for that day from output costs alone, before input is counted.
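Both checks above reduce to multiplying output tokens by a per-million rate. The sketch below redoes them with the rounded token counts given in this report, so the lifetime figure lands a few dollars away from the $3,792 quoted above, which was presumably computed from the unrounded total.

```python
OPUS_OUT_USD_PER_MTOK = 75.00
SONNET_OUT_USD_PER_MTOK = 15.00

LIFETIME_OUTPUT_MTOK = 50.6   # rounded, all nine sessions
LIFETIME_SPEND_USD = 2_200    # approximate
MAY5_OUTPUT_MTOK = 5.3        # rounded
MAY5_BILL_USD = 400

lifetime_at_opus = LIFETIME_OUTPUT_MTOK * OPUS_OUT_USD_PER_MTOK
lifetime_at_sonnet = LIFETIME_OUTPUT_MTOK * SONNET_OUT_USD_PER_MTOK
may5_at_opus = MAY5_OUTPUT_MTOK * OPUS_OUT_USD_PER_MTOK
may5_at_sonnet = MAY5_OUTPUT_MTOK * SONNET_OUT_USD_PER_MTOK

# Output alone at Opus rates would exceed the whole lifetime bill, so Opus
# cannot have been billed for the full history.
opus_from_start_possible = lifetime_at_opus <= LIFETIME_SPEND_USD

print(f"lifetime output: Opus ${lifetime_at_opus:,.0f}, Sonnet ${lifetime_at_sonnet:,.0f}")
print(f"May 5 output: Opus ${may5_at_opus:,.2f}, Sonnet ${may5_at_sonnet:,.2f}")
print("Opus since day one possible:", opus_from_start_possible)
```

At Sonnet output rates, May 5 output would come to about $80, so under these assumptions input costs would have to account for the remaining ~$320 of that day's $400 bill if Sonnet were actually in use.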

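| Issue | Low | Mid | High |
|---|---|---|---|
| 1: Model switch (Opus billed, Sonnet displayed, May 5–7) | | ~$840 | |
| 2: Silent read failures | | ~$1 | |
| 3: File format bloat + remediation | | ~$126 | |
| 4+7: Context accumulation + agent sub-results | $210 | ~$525 | $840 |
| 5: Concurrent sessions | $30 | ~$75 | $120 |
| 6: Memory base load overhead | $55 | ~$160 | $263 |
| Total | ~$1,262 | ~$1,727 | ~$2,190 |

Rows with a single estimate (Issues 1, 2, and 3) count at that value in all three totals.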

Sessions to review

1c0e09b6, 8f3eb073, fac1d3c7, 222da534, 354da54e, 7f679c93, d0a21f64, 2ac63e71, 23d1815c

Git repository: C:\Users\bbrown958\households, branch master

Lifetime spend: ~$2,200 since 2026-04-05 (to be reconciled against billing records)


The ask

  1. Identify from server-side billing records the exact date and session on which my backend model changed from Sonnet to Opus, if it did. Reconcile against the $2,200 lifetime spend constraint.
  2. Issue a credit for any sessions where Opus was billed while the UI represented Sonnet.
  3. Implement a mandatory approval prompt before any backend model change that increases per-token cost.
  4. Display the actual model ID being invoked — not only the user's model preference — in the UI at all times.
  5. Surface a real-time session context size and estimated cost indicator, with an opt-in warning threshold.
  6. Surface Read tool errors to the user immediately — do not silently swallow file-unreadable failures.
  7. Add a detection heuristic for pathological Jupyter source encoding: if a .ipynb source array contains only single-character elements, warn the user before any read attempt fails.

User: [email protected]
