claude-code - 💡(How to fix) Fix [BUG] Silent model switch to Opus 4.7 [1M] mid-session caused ~4× quota burn [2 comments, 2 participants]

claude-code · 2026-04-16 21:22:39

anthropics/claude-code#49541•Fetched 2026-04-17 08:38:12

Author

StevenJohnson998

Participants

junaidtitan

StevenJohnson998

Timeline (top)

labeled ×4, cross-referenced ×3, commented ×2, subscribed ×1

Root Cause

Max context per request grew from ~250K (pre-switch) to 650K+ (post-switch), because the 1M variant does not auto-compact at the 200K boundary the way the 200K model does. Each subsequent turn re-reads that full context from cache, so the burn rate scales with context size.
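
To make the scaling concrete, here is a minimal sketch of the arithmetic. It assumes the simplest possible model, in which every turn re-reads a context of fixed size from cache; the turn count is hypothetical, and real sessions grow their context gradually rather than holding it constant.

```python
def cache_read_tokens(context_tokens: int, turns: int) -> int:
    """Total cache-read tokens if every turn re-reads a fixed-size context."""
    return context_tokens * turns


TURNS = 40  # hypothetical number of turns, chosen only for illustration

before = cache_read_tokens(250_000, TURNS)  # ~250K context (pre-switch)
after = cache_read_tokens(650_000, TURNS)   # ~650K context (post-switch)

print(f"{before:,} vs {after:,} cache-read tokens -> {after / before:.1f}x")
```

Under this simplification the ratio depends only on the context sizes (650K / 250K = 2.6), not on the number of turns.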

Fix / Workaround

In /model, the only Opus 4.7 option is Opus 4.7 (1M context). There is no 200K variant of 4.7 exposed. Users who want to stay on a 200K-context Opus have to downgrade to 4.6, assuming it remains available.

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Today I burned through my entire 5h quota in half an hour of work, which has never happened in months of daily usage. After digging into the transcripts I found the cause: my Claude Code session was silently switched from claude-opus-4-6 to claude-opus-4-7 mid-session, and the only Opus 4.7 variant exposed in /model is the 1M-context one. Looking at a single session transcript, the model field changed mid-conversation:

11:58 → 16:17 UTC: claude-opus-4-6
16:17 UTC onward: claude-opus-4-7 ← auto-switched, no prompt, no notice

Over the previous 15 days, every transcript shows claude-opus-4-6 exclusively. April 16 is the first appearance of claude-opus-4-7 in my logs.

Impact on token usage.

Cache-read tokens per 10-minute bucket in the affected session:

Window	cache_read tokens	Notes
13:00–13:10	~4.9 M	Opus 4.6, 200K context
16:10–16:20	~13 M	right before switch
16:40–17:00	~60 M	right after switch to 4.7 [1M]
20:10–20:30	~38 M	still 4.7 [1M]

Session totals for the day: 391 M cache-read tokens, 10 M cache-creation, 948 K output across ~14h of combined session time. 5 hours quota gone in half an hour.
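
The rows in the table cover windows of different lengths (two span 10 minutes, two span 20), so a per-minute rate makes them easier to compare. The sketch below assumes each figure is the total for its labeled window; the table header says "per 10-minute bucket", so this reading is an assumption, and the values are the approximate ones quoted above.

```python
# (window label, window length in minutes, approx. cache-read tokens in millions)
windows = [
    ("13:00-13:10", 10, 4.9),   # Opus 4.6, 200K context
    ("16:10-16:20", 10, 13.0),  # right before the switch
    ("16:40-17:00", 20, 60.0),  # right after the switch to 4.7 [1M]
    ("20:10-20:30", 20, 38.0),  # still 4.7 [1M]
]


def per_minute(minutes: int, millions: float) -> float:
    """Cache-read rate in millions of tokens per minute."""
    return millions / minutes


rates = {label: per_minute(minutes, millions) for label, minutes, millions in windows}
baseline = rates["16:10-16:20"]  # last window before the switch

for label, rate in rates.items():
    print(f"{label}: {rate:.2f} M tokens/min ({rate / baseline:.1f}x pre-switch)")
```

Read this way, the window right after the switch runs at about 3.0 M tokens per minute, roughly 2.3× the window just before it and about 6× the early-afternoon window.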

Happy to share more details privately if useful for reproduction.

What Should Happen?

Ask

Don't silently switch model variants mid-session. If a user started on a 200K model, keep them on a 200K model unless they opt in.
Expose a 200K variant of Opus 4.7 in /model, like the 4.6 behavior. Not everyone wants to pay the 3–4× cache-read cost of the 1M variant by default.
Warn users when context grows past a threshold (e.g., 300K) on 1M-context models, since the quota impact is invisible in the current UI.
Document the cache-read cost implications of the 1M variant — "1M context" sounds like a pure upgrade, but the cost profile is very different.

Error Messages/Logs

Steps to Reproduce

The switch itself is server-side and not user-triggerable.

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

2.1.112

Platform

Other

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

Edit (April 16th):

Additional context from community reports

Multiple users on r/ClaudeAI report the same burn-rate issue since 2026-04-16 (see thread: "Opus 4.7 is 50% more expensive with context regression?!"). Independent measurements indicate:

Tokenizer change: Opus 4.7 consumes ~1.35× more tokens than 4.6 for identical input, acknowledged by Anthropic's Boris Cherny on X as "by design for better quality."
Context recall regression (MRCR v2 benchmark):
    256K: 4.6 = 91.9% → 4.7 = 59.2%
    1M: 4.6 = 78.3% → 4.7 = 32.2%
Anthropic states limits were raised to compensate, but has not disclosed by how much.

The compound effect with the silent mid-session model switch (my case) is a ~4× burn rate increase, which matches my transcript data. For users who never intended to opt into the 1M variant, this is a regression presented as an upgrade.
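
One rough way to see where a figure near 4× could come from is to multiply the two factors reported in this issue: the ~1.35× tokenizer change and the growth in max context per request from ~250K to 650K+. The sketch below treats them as independent, which is an assumption; if the 650K measurement already reflects the new tokenizer, this double counts.

```python
TOKENIZER_FACTOR = 1.35   # reported token inflation, Opus 4.7 vs 4.6
CONTEXT_BEFORE = 250_000  # approx. max context per request before the switch
CONTEXT_AFTER = 650_000   # approx. max context per request after the switch

context_factor = CONTEXT_AFTER / CONTEXT_BEFORE
compound = TOKENIZER_FACTOR * context_factor  # assumes the factors are independent

print(f"context growth {context_factor:.2f}x, compound {compound:.2f}x")
```

The product is about 3.5×, in the same range as the ~4× observed; since the context figure is a lower bound ("650K+"), the real multiplier could be higher.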

extent analysis

TL;DR

The most likely fix is for Claude Code to expose a 200K variant of Opus 4.7 in /model and to stop switching model variants mid-session without notice. The switch to the 1M variant is what raised the cache-read token burn rate.

Guidance

Check the model field in your session transcripts for a mid-session switch from claude-opus-4-6 to claude-opus-4-7, which would explain an increased token burn rate.
Check the cache-read tokens per 10-minute bucket in the affected session to confirm the impact of the model switch on token usage.
Consider downgrading to Opus 4.6 if the 200K context variant is required, assuming it remains available.
Monitor the community reports and Anthropic's responses for updates on the tokenizer change and context recall regression in Opus 4.7.
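
The first two checks can be scripted. The following is a minimal sketch, not an official tool: it assumes transcripts are JSON Lines files in which each record carries a `timestamp` (ISO 8601) and a `message` object with `model` and `usage.cache_read_input_tokens`. That record layout and the helper names (`scan`, `load_jsonl`) are assumptions for illustration; adjust the field lookups to match what your transcripts actually contain.

```python
import json
from collections import defaultdict
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def scan(records):
    """Return (model_switches, cache_read_tokens_per_10_minute_bucket)."""
    switches = []
    buckets = defaultdict(int)
    previous = None
    for rec in records:
        msg = rec.get("message")
        if not isinstance(msg, dict) or "timestamp" not in rec:
            continue  # skip records without the assumed layout
        model = msg.get("model")
        if not model:
            continue
        ts = parse_ts(rec["timestamp"])
        if previous is not None and model != previous:
            switches.append((ts.isoformat(), previous, model))
        previous = model
        # Round the timestamp down to the start of its 10-minute bucket.
        bucket = ts.replace(minute=ts.minute - ts.minute % 10, second=0, microsecond=0)
        usage = msg.get("usage") or {}
        buckets[bucket.isoformat()] += usage.get("cache_read_input_tokens", 0)
    return switches, dict(buckets)


def load_jsonl(path):
    """Load one transcript file; the path is whatever your setup uses."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


# Hypothetical records, made up to show the output shape.
sample = [
    {"timestamp": "2026-04-16T16:12:00Z",
     "message": {"model": "claude-opus-4-6", "usage": {"cache_read_input_tokens": 100}}},
    {"timestamp": "2026-04-16T16:17:30Z",
     "message": {"model": "claude-opus-4-7", "usage": {"cache_read_input_tokens": 300}}},
    {"timestamp": "2026-04-16T16:21:00Z",
     "message": {"model": "claude-opus-4-7", "usage": {"cache_read_input_tokens": 500}}},
]
switches, buckets = scan(sample)
print(switches)
print(buckets)
```

To run it on a real transcript, pass `load_jsonl(path)` to `scan` in place of the sample records. A non-empty `switches` list means the model changed mid-session, and the bucket totals show whether cache reads jumped afterwards.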

Example

The issue concerns model selection and configuration in Claude Code, so there is no code fix to apply on the user side.

Notes

The issue is specific to the Opus 4.7 model and its 1M context variant, which has a different cache-read cost profile compared to the 200K context variant of Opus 4.6. The silent mid-session model switch and lack of a 200K variant of Opus 4.7 in /model contribute to the increased token burn rate.

Recommendation

Until a 200K variant of Opus 4.7 is exposed in /model or the silent mid-session switch is resolved, downgrade to Opus 4.6 if you need a 200K context window. Opus 4.7 with 1M context currently produces a significantly higher cache-read token burn rate, which can exhaust quota unexpectedly.
