claude-code - 💡(How to fix) Fix [Bug] Auto-compact triggers far below 1M window on Opus 4.7 [1M] — severe token burn regression on Max 5x (post-2.1.116) [2 comments, 2 participants]

warioishere · 2026-04-25T08:49:02Z

[claude-code] On claude-opus-4-7 1m , auto-compact now triggers far below the 1M context window despite 1M being selected and shown in /model . The behavior ch… On `claude-opus-4-7[1m]`, auto-compact now triggers **far below the 1M context window** despite 1M being selected and shown in `/model`. The behavior changed **~2 days ago** — prior to that, compact correctly triggered near the actual ~1M limit. No user-side configuration changes. In addition to the premature compact, **token consumption per turn has increased substantially**: 5-hour Max plan windows are now exhausted within a fraction of previously-normal usage on identical coding workflows. ## Workaround Manual downgrade to 2.1.116 + auto-updater off: ```bash curl -fL -o ~/.local/share/claude/versions/2.1.116 \ https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude chmod +x ~/.local/share/claude/versions/2.1.116 ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc ``` ## Summary On `claude-opus-4-7[1m]`, auto-compact now triggers **far below the 1M context window** despite 1M being selected and shown in `/model`. The behavior changed **~2 days ago** — prior to that, compact correctly triggered near the actual ~1M limit. No user-side configuration changes. In addition to the premature compact, **token consumption per turn has increased substantially**: 5-hour Max plan windows are now exhausted within a fraction of previously-normal usage on identical coding workflows. ## Environment - **Claude Code version (current):** 2.1.119 - **Onset:** ~2 days ago (cannot pinpoint the exact version where the regression appeared, possibly 2.1.118 or 2.1.119, or a server-side flag rollout — see below) - **Model:** `claude-opus-4-7[1m]` (1M context, confirmed in `/model`) - **Plan:** Max 5x - **Settings (unchanged for months):** - `alwaysThinkingEnabled: true` - `effortLevel: "high"` - `ultrathink` used in-session for hard tasks - **OS:** Linux (Ubuntu) - **Install method:** native (single-binary install) ## Behavior ### Before (~2+ days ago) — correct - Auto-compact triggered close to 1M tokens (near the actual context limit) - Token consumption per turn was reasonable - 5-hour rate-limit window comfortably covered a normal coding day - Long coding sessions sustainable end-to-end ### Now — regression - Auto-compact triggers **much earlier than before** — well below the 1M window. The exact percentage varies between sessions, but consistently far below where it used to fire. - Severe token burn: **5-hour limits exhausted in a small fraction of normal time** on identical workflows - Constant compacts invalidate cache, re-inject CLAUDE.md / memory, slow iteration significantly - Sessions are practically unusable ### Ruled out - CLAUDE.md size: 667 bytes (negligible) - Memory files: ~12K total (negligible) - MCP servers: 0 installed - Plugins: 0 installed - No settings change in months — these settings worked fine before ## Note on root cause Because the regression appeared abruptly without a user-side update event, this could be either: - a CLI version regression (auto-update happened in the same window), **or** - a **server-side / GrowthBook flag rollout** that silently changed the resolved compact threshold (see #46331 for prior precedent) A downgrade to 2.1.116 has been performed as a workaround; behavior on that version will be reported back if the issue persists or resolves. ## Workaround Manual downgrade to 2.1.116 + auto-updater off: ```bash curl -fL -o ~/.local/share/claude/versions/2.1.116 \ https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude chmod +x ~/.local/share/claude/versions/2.1.116 ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc ``` ## Likely related issues - #36014 — autocompact triggers at 17% on 1M context model (effectiveWindow likely capped at 200K) — **matches this symptom** - #52981 — auto-compact triggers at ~8% context usage (84K/1M) - #46331 — GrowthBook experiment `tengu_amber_redwood` silently reduces autocompact window - #50803 — 1M context window not auto-applied on Max plan; `--model` flag drops `[1m]` suffix - #52522 — adjacent: documents 2.1.117 threshold-detection change with subscriber-impact concerns - #52153 — excessive token consumption per prompt on Opus 4.7 1M ## Ask 1. Confirm whether the 1M window detection has regressed for Opus 4.7 on Max 5x accounts (or any plan) within the last few days, either via CLI release or server-side flag 2. Check whether a GrowthBook flag (e.g. `tengu_amber_redwood` or similar) is targeting accounts and overriding the resolved compact threshold below the model's actual window 3. Either fix the detection or expose the **resolved** compact threshold in `/context` output (currently it shows the model window but not the actual percent at which compact will fire), so users can verify what the CLI is actually using 4. Given the sub

claude-code2026-04-25 08:49:02

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#53199•Fetched 2026-04-26 05:21:48

View on GitHub

Comments

Participants

Timeline

Reactions

Author

warioishere

Participants

github-actions[bot]

warioishere

Timeline (top)

labeled ×5commented ×2cross-referenced ×1renamed ×1

On claude-opus-4-7[1m], auto-compact now triggers far below the 1M context window despite 1M being selected and shown in /model. The behavior changed ~2 days ago — prior to that, compact correctly triggered near the actual ~1M limit. No user-side configuration changes.

In addition to the premature compact, token consumption per turn has increased substantially: 5-hour Max plan windows are now exhausted within a fraction of previously-normal usage on identical coding workflows.

Root Cause

Note on root cause

Fix Action

Workaround

Manual downgrade to 2.1.116 + auto-updater off:

curl -fL -o ~/.local/share/claude/versions/2.1.116 \
  https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude
chmod +x ~/.local/share/claude/versions/2.1.116
ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude
echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc

Code Example

curl -fL -o ~/.local/share/claude/versions/2.1.116 \
  https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude
chmod +x ~/.local/share/claude/versions/2.1.116
ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude
echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc

RAW_BUFFERClick to expand / collapse

Summary

Environment

Claude Code version (current): 2.1.119
Onset: ~2 days ago (cannot pinpoint the exact version where the regression appeared, possibly 2.1.118 or 2.1.119, or a server-side flag rollout — see below)
Model: claude-opus-4-7[1m] (1M context, confirmed in /model)
Plan: Max 5x
Settings (unchanged for months):
- alwaysThinkingEnabled: true
- effortLevel: "high"
- ultrathink used in-session for hard tasks
OS: Linux (Ubuntu)
Install method: native (single-binary install)

Behavior

Before (~2+ days ago) — correct

Auto-compact triggered close to 1M tokens (near the actual context limit)
Token consumption per turn was reasonable
5-hour rate-limit window comfortably covered a normal coding day
Long coding sessions sustainable end-to-end

Now — regression

Auto-compact triggers much earlier than before — well below the 1M window. The exact percentage varies between sessions, but consistently far below where it used to fire.
Severe token burn: 5-hour limits exhausted in a small fraction of normal time on identical workflows
Constant compacts invalidate cache, re-inject CLAUDE.md / memory, slow iteration significantly
Sessions are practically unusable

Ruled out

CLAUDE.md size: 667 bytes (negligible)
Memory files: ~12K total (negligible)
MCP servers: 0 installed
Plugins: 0 installed
No settings change in months — these settings worked fine before

Note on root cause

Because the regression appeared abruptly without a user-side update event, this could be either:

a CLI version regression (auto-update happened in the same window), or
a server-side / GrowthBook flag rollout that silently changed the resolved compact threshold (see #46331 for prior precedent)

A downgrade to 2.1.116 has been performed as a workaround; behavior on that version will be reported back if the issue persists or resolves.

Workaround

Manual downgrade to 2.1.116 + auto-updater off:

curl -fL -o ~/.local/share/claude/versions/2.1.116 \
  https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude
chmod +x ~/.local/share/claude/versions/2.1.116
ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude
echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc

Likely related issues

#36014 — autocompact triggers at 17% on 1M context model (effectiveWindow likely capped at 200K) — matches this symptom
#52981 — auto-compact triggers at ~8% context usage (84K/1M)
#46331 — GrowthBook experiment tengu_amber_redwood silently reduces autocompact window
#50803 — 1M context window not auto-applied on Max plan; --model flag drops [1m] suffix
#52522 — adjacent: documents 2.1.117 threshold-detection change with subscriber-impact concerns
#52153 — excessive token consumption per prompt on Opus 4.7 1M

Ask

Confirm whether the 1M window detection has regressed for Opus 4.7 on Max 5x accounts (or any plan) within the last few days, either via CLI release or server-side flag
Check whether a GrowthBook flag (e.g. tengu_amber_redwood or similar) is targeting accounts and overriding the resolved compact threshold below the model's actual window
Either fix the detection or expose the resolved compact threshold in /context output (currently it shows the model window but not the actual percent at which compact will fire), so users can verify what the CLI is actually using
Given the subscriber cost impact (5x plan exhausted on identical workflows), surface defaults/threshold changes in release notes with explicit "Impact on subscribers" callouts going forward

extent analysis

TL;DR

Downgrade to version 2.1.116 as a temporary workaround to mitigate the premature auto-compact issue and excessive token consumption.

Guidance

Verify if the issue persists after downgrading to 2.1.116 to determine if the problem is version-specific.
Check the /context output to see if the resolved compact threshold is being overridden by a server-side flag.
Monitor token consumption and auto-compact triggers to identify any patterns or correlations with specific settings or workflows.
Review release notes for any changes to defaults or thresholds that may impact subscribers, especially regarding the 1M context window detection.

Example

The provided workaround script can be used to downgrade to 2.1.116:

curl -fL -o ~/.local/share/claude/versions/2.1.116 \
  https://downloads.claude.ai/claude-code-releases/2.1.116/linux-x64/claude
chmod +x ~/.local/share/claude/versions/2.1.116
ln -sfn ~/.local/share/claude/versions/2.1.116 ~/.local/bin/claude
echo 'export DISABLE_AUTOUPDATER=1' >> ~/.bashrc

Notes

The root cause of the issue is uncertain, and it may be related to a server-side flag rollout or a CLI version regression. Further investigation is needed to determine the exact cause.

Recommendation

Apply the workaround by downgrading to 2.1.116, as it has been reported to resolve the issue temporarily. This will help mitigate the premature auto-compact and excessive token consumption until a permanent fix is available.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#memory optimization #batch processing #GPU compatibility #latency issue #model loading

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix [Bug] Auto-compact triggers far below 1M window on Opus 4.7 [1M] — severe token burn regression on Max 5x (post-2.1.116) [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Note on root cause

Fix Action

Workaround

Code Example

Summary

Environment

Behavior

Before (~2+ days ago) — correct

Now — regression

Ruled out

Note on root cause

Workaround

Likely related issues

Ask

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix [Bug] Auto-compact triggers far below 1M window on Opus 4.7 [1M] — severe token burn regression on Max 5x (post-2.1.116) [2 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Note on root cause

Fix Action

Workaround

Code Example

Summary

Environment

Behavior

Before (~2+ days ago) — correct

Now — regression

Ruled out

Note on root cause

Workaround

Likely related issues

Ask

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING