claude-code - 💡(How to fix) Fix [BUG] Auto-compact triggers consistently at ~367K tokens on Opus 4.7 [1m] in Claude Code 2.1.119 (regression of #52519?) [2 comments, 2 participants]

claude-code · 2026-04-27 20:13:44


GitHub stats

anthropics/claude-code#54056 • Fetched 2026-04-28 06:40:26

Author: mikerCZ

Participants: github-actions[bot], mikerCZ

Timeline (top): labeled ×5, commented ×2, cross-referenced ×1


Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Claude Code 2.1.119 auto-compacts Opus 4.7 [1m] sessions at a hard threshold of **~367K tokens**, despite the model being configured with the 1M context window.

Per the v2.1.117 changelog, the threshold computation against a 200K window for Opus 4.7 was supposed to be fixed — but on a MAX plan with `"model": "opus[1m]"`, compaction now consistently fires around 36.7% of the advertised 1M window.

This is not a single-incident hiccup. In one working day (2026-04-27), auto-compact fired **4 times** at near-identical thresholds:

| Time (UTC) | Peak tokens before compact | After compact |
|------------|---------------------------:|--------------:|
| 15:41:55   | 366,432                    | 92,518        |
| 16:52:52   | 366,989                    | 102,144       |
| 17:53:47   | 360,677                    | 97,646        |
| 18:40:47   | 366,790                    | 105,886       |

Tokens = `input_tokens + cache_read_input_tokens + cache_creation_input_tokens`, extracted directly from the session JSONL in `~/.claude/projects/.../<session>.jsonl`. Cache reads exceed 200K (max observed: 359,497), which confirms the request is being routed to the 1M variant on the backend — but the local auto-compact threshold appears to be calculated against a different (smaller) window.
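The token total used throughout this report can be reproduced in a few lines of Python. This is a minimal sketch: the three field names are the `message.usage` counters named above, while the helper name `total_context_tokens` and the sample values are illustrative assumptions, not part of Claude Code.

```python
# Counters summed by the report to get "tokens" for one assistant turn.
USAGE_KEYS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def total_context_tokens(usage: dict) -> int:
    """Return the sum of the three input-side counters; missing or null fields count as 0."""
    return sum(int(usage.get(key) or 0) for key in USAGE_KEYS)


# Made-up usage record, for illustration only.
sample_usage = {
    "input_tokens": 12,
    "cache_read_input_tokens": 300_000,
    "cache_creation_input_tokens": 5_000,
}
print(total_context_tokens(sample_usage))  # 305012
```

Treating missing fields as zero mirrors the `// 0` fallback in the `jq` expression used later in the report.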

Distribution of token usage during the session (50k buckets):

| Bucket (tokens)   | Count |
|-------------------|------:|
| 50,000–99,999     | 37    |
| 100,000–149,999   | 70    |
| 150,000–199,999   | 225   |
| 200,000–249,999   | 257   |
| 250,000–299,999   | 302   |
| 300,000–349,999   | 215   |
| 350,000–399,999   | 88 (session never exceeds ~367K) |
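A distribution like this one can be built by flooring each per-turn token total to a 50K bucket. The sketch below shows one way to do that; the function name `bucket_counts` and the sample totals are hypothetical, and the report does not say how its own histogram was generated.

```python
from collections import Counter


def bucket_counts(totals, width=50_000):
    """Count token totals per fixed-width bucket, keyed by 'low-high' labels."""
    # Floor each total to the lower edge of its bucket.
    counts = Counter((total // width) * width for total in totals)
    return {f"{low}-{low + width - 1}": n for low, n in sorted(counts.items())}


# Made-up totals, for illustration only.
for label, n in bucket_counts([60_000, 99_999, 100_000, 366_989]).items():
    print(f"{label:>15} : {n}")
```

If the top bucket is only partly filled and nothing lands above it, that is consistent with a ceiling rather than sessions that simply ended early.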


The session never exceeded 367K. This is a hard ceiling, not a natural conversation end.

Related issues that suggest this bug class has recurred multiple times:
- #52519 — 2.1.117 fix for Opus 4.7 inflated `/context` percentages and early auto-compact
- #42375 — Auto-compaction at ~6% on Opus 4.6 (1M)
- #34158 — Context limit reached at ~200K despite 1M model
- #34363 — Autocompact buffer larger than 200K window
- #46372 — MAX stuck at 200K after model switch / cancelled compaction

What Should Happen?

With `"model": "opus[1m]"` (resolves to `claude-opus-4-7[1m]`) on a MAX plan, auto-compaction should trigger near the 1M boundary — i.e. ~835K tokens (83.5% of 1M), as documented for the 1M context window.

Specifically:
- The local auto-compact threshold should match the same window that is used for routing (the request is clearly routed as 1M because cache_read goes above 200K).
- A user with `"model": "opus[1m]"` should be able to use roughly 800K+ tokens before triggering auto-compaction.
- `effectiveWindow` reported internally should equal 1,000,000 for this configuration.
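The gap between expected and observed behaviour is simple arithmetic. The sketch below assumes, as the report does, that auto-compact fires at 83.5% of the effective window; the constant and function names are hypothetical and do not come from the Claude Code source.

```python
# Assumption taken from the report: compaction triggers at 83.5% of the window.
AUTO_COMPACT_FRACTION = 0.835


def expected_threshold(window_tokens: int) -> int:
    """Token count at which auto-compact would fire for a given window."""
    return round(window_tokens * AUTO_COMPACT_FRACTION)


def fraction_of_window(observed_tokens: int, window_tokens: int) -> float:
    """Share of the window that was actually in use when compaction fired."""
    return observed_tokens / window_tokens


print(expected_threshold(1_000_000))                     # 835000
print(f"{fraction_of_window(366_989, 1_000_000):.1%}")   # 36.7%
```

The highest observed peak (366,989) sits at 36.7% of a 1M window, less than half of the 835K the configuration should allow.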

Error Messages/Logs

No explicit error message is shown — auto-compact just fires silently. There are no `autocompact:` log lines in `~/.claude/debug/` (directory exists but is empty), and no log lines in `~/.claude/telemetry/` mentioning `effectiveWindow` or `threshold`.

The only evidence is reconstructed from the session JSONL by extracting `message.usage` and detecting large drops in total token count (input + cache_read + cache_creation):


=== COMPACT EVENTS (drops > 50k tokens) ===
COMPACT: 366432 -> 92518   at 2026-04-27T15:41:55.346Z
COMPACT: 366989 -> 102144  at 2026-04-27T16:52:52.466Z
COMPACT: 360677 -> 97646   at 2026-04-27T17:53:47.153Z
COMPACT: 366790 -> 105886  at 2026-04-27T18:40:47.823Z


Suggestion: please log `effectiveWindow` and `threshold` somewhere user-accessible so users can self-diagnose this class of issue.

Steps to Reproduce

1. On macOS, install Claude Code 2.1.119 with a MAX plan (subscriptionType `"max"`).
2. In `~/.claude/settings.json` set:
   ```json
   {
     "model": "opus[1m]"
   }
   ```
3. Start a new session: `claude`
4. Run `/context` and confirm the denominator reads `1000k`.
5. Use the session normally: file reads via `Read`, subagents via `Task`, MCP calls (e.g. `memory`), etc. Allow the conversation to grow.
6. Watch the total token usage (input + cache_read + cache_creation) climb past 350K.
7. Observe that auto-compact fires consistently around **360–367K tokens**, well below the expected ~835K threshold for a 1M window.

To verify post-hoc, extract token timeline from the session JSONL:

SESSION=~/.claude/projects/-Users-<you>/<session-uuid>.jsonl

jq -r 'select(.message.usage) | "\(.timestamp)|\((.message.usage.input_tokens // 0) + (.message.usage.cache_read_input_tokens // 0) + (.message.usage.cache_creation_input_tokens // 0))"' "$SESSION" \
  | awk -F'|' 'NR>1 && prev > $2 + 50000 {print "COMPACT: " prev " -> " $2 " at " $1} {prev=$2}'

Expected: zero compact events until ~835K. Actual: multiple compact events at ~367K.
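For readers without `jq`, the same check can be sketched in Python. This is a rough equivalent of the pipeline above, not the author's script: it assumes each JSONL record carries `timestamp` and `message.usage` as the `jq` expression implies, and the function name `find_compact_events` is hypothetical.

```python
import json

USAGE_KEYS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def find_compact_events(lines, min_drop=50_000):
    """Return (before, after, timestamp) for each drop larger than min_drop tokens."""
    events = []
    prev = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        message = record.get("message")
        usage = message.get("usage") if isinstance(message, dict) else None
        if not usage:
            continue  # skip records without usage data, like jq's select()
        total = sum(int(usage.get(key) or 0) for key in USAGE_KEYS)
        # Same condition as the awk filter: previous total exceeds current by > min_drop.
        if prev is not None and prev > total + min_drop:
            events.append((prev, total, record.get("timestamp")))
        prev = total
    return events


# Usage (the path is a placeholder, as in the shell example):
#   with open("<session-uuid>.jsonl") as fh:
#       for before, after, ts in find_compact_events(fh):
#           print(f"COMPACT: {before} -> {after} at {ts}")
```

The function accepts any iterable of lines, so it works on an open file handle or on a list of strings in a test.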


### Claude Model

Opus

### Is this a regression?

I don't know

### Last Working Version

Unknown — I noticed the issue today on 2.1.119. I do not have data points from earlier 2.1.x versions to confirm exactly when it started.

### Claude Code Version

2.1.119

### Platform

Anthropic API

### Operating System

macOS

### Terminal/Shell

Terminal.app (macOS)

### Additional Information

**Environment variables (none of these are set):**
- `CLAUDE_CODE_AUTO_COMPACT_WINDOW`: not set
- `DISABLE_AUTO_COMPACT`: not set
- `CLAUDE_AUTOCOMPACT_PCT_OVERRIDE`: not set
- `CLAUDE_CODE_BLOCKING_LIMIT_OVERRIDE`: not set
- `CLAUDE_CODE_DISABLE_1M_CONTEXT`: not set

**Relevant `~/.claude/settings.json` excerpt:**
```json
{
  "model": "opus[1m]",
  "effortLevel": "high",
  "remoteControlAtStartup": true,
  "permissions": {
    "additionalDirectories": [
      "/etc", "/usr/local", "/opt", "/Library", "/Applications", "/private/etc"
    ]
  },
  "hooks": {
    "PreToolUse": [
      // 8 PreToolUse hooks (Edit/Write protect, Bash secrets check, conventional commits, etc.)
    ]
  },
  "mcpServers": {
    "memory": { "autoApprove": true },
    "firebase": { "command": "npx", "args": ["-y", "firebase-tools@latest", "mcp", "--dir", "..."] }
  }
}
```

**Possible root cause hypothesis:**
The 1M-context detection works for the request routing path (cache_read goes above 200K, confirming 1M variant on the backend), but the auto-compact threshold is computed via a different code path that uses a stale or smaller window value. 367K corresponds to roughly 83.5% of an effective window of ~440K, suggesting some intermediate value rather than 200K or 1M.
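The ~440K figure in the hypothesis comes from inverting the threshold formula. The sketch below works that out for the four observed peaks; it assumes the 83.5% trigger fraction cited in the report, and the names used are illustrative only.

```python
# Peak token totals recorded just before each compaction (from the report's table).
OBSERVED_PEAKS = [366_432, 366_989, 360_677, 366_790]

# Assumption from the report: compaction fires at 83.5% of the effective window.
AUTO_COMPACT_FRACTION = 0.835


def implied_window(peak_tokens: int, fraction: float = AUTO_COMPACT_FRACTION) -> float:
    """Effective window size that would put the threshold exactly at peak_tokens."""
    return peak_tokens / fraction


for peak in OBSERVED_PEAKS:
    print(f"{peak:>7,} -> implied window ~{implied_window(peak):,.0f}")
```

All four peaks imply a window of roughly 432K to 440K. Each peak is the last reading before compaction rather than the threshold itself, so these values are best read as lower bounds on whatever window the client actually used.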

extent analysis

TL;DR

The auto-compact threshold in Claude Code 2.1.119 appears to be calculated against the wrong window, so compaction fires prematurely at around 367K tokens instead of near the expected ~835K threshold (83.5% of the 1M window).

Guidance

- Verify that the `effectiveWindow` value is being reported correctly internally and matches the expected 1M window for the Opus 4.7 model.
- Check the calculation of the auto-compact threshold to ensure it is using the correct window value, rather than a stale or smaller value.
- Consider adding logging for `effectiveWindow` and `threshold` to facilitate self-diagnosis of this issue.
- Review the code path for request routing and auto-compact threshold calculation to identify any discrepancies or intermediate values that may be causing the issue.

Example

No code snippet is provided as the issue is related to the internal calculation of the auto-compact threshold and requires further investigation.

Notes

The root cause of the issue is unclear, but it is suspected to be related to a discrepancy between the request routing path and the auto-compact threshold calculation. Further investigation is needed to determine the exact cause and implement a fix.

Recommendation

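Until a fix is available, monitor token usage and intervene manually before the session reaches the ~367K ceiling, for example by running `/compact` at a convenient stopping point. This does not raise the ceiling or restore the full 1M window, but it lets you choose when compaction happens instead of having it fire mid-task.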
