claude-code - 💡(How to fix) Fix [BUG] Compaction threshold regression in 2.1.144+: MCP tool definitions over-counted, auto-compact fires at ~74K (vs ~143K in 2.1.143)

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Auto-compaction threshold collapsed by ~50% in CC 2.1.144 and was not fixed in 2.1.146 or 2.1.147. The regression is gated on the presence of MCP tool definitions: with MCPs enabled, compaction fires at a real input-token count of ~74K (vs ~143K in 2.1.143 and earlier). With MCPs disabled, compaction fires at the historical ~135K threshold even on 2.1.144+. Both auto-compact and manual /compact are affected (Claude Code's TUI fullness signal seems to share the broken count, so users type /compact early too).

Root Cause

In healthy versions, auto-compact (gap < 5s after prior turn) and manual /compact (gap ≥ 30s) sit at different thresholds because users type manual at their own perceived "context full" point, not the heuristic point.

Fix Action

Fix / Workaround

Workaround (current)

  • Pin to 2.1.143 (only fully-clean version observed across the 47h capture)
  • Or disable unused MCP servers in ~/.claude.json to push back into the 135K threshold band on 2.1.144+
  • Once a conversation crosses ~100 messages, message volume dilutes the MCP over-count and the threshold returns to 140K — affected primarily during the first 50-100 turns of a new session
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues — closest match is #59650 (deferred MCP schemas inflating cache reads, different bug, see Related below)
  • This is a single bug report
  • I am using the latest version of Claude Code (2.1.147 confirmed broken)

Related

  • #59650 — Deferred MCP tool schemas inflate cache reads — adjacent (also MCP/token inflation), but that's about cache space and bills. This is a regression in compaction trigger threshold introduced in 2.1.144 specifically. Both can coexist.
  • #50015 — Auto-compaction fires without pre-compaction warning — describes a related but earlier regression; this issue narrows the new threshold-collapse to MCP tool definitions and a specific version range.

Summary

Auto-compaction threshold collapsed by ~50% in CC 2.1.144 and was not fixed in 2.1.146 or 2.1.147. The regression is gated on the presence of MCP tool definitions: with MCPs enabled, compaction fires at a real input-token count of ~74K (vs ~143K in 2.1.143 and earlier). With MCPs disabled, compaction fires at the historical ~135K threshold even on 2.1.144+. Both auto-compact and manual /compact are affected (Claude Code's TUI fullness signal seems to share the broken count, so users type /compact early too).

Empirical Evidence

Captured from a logging proxy in front of a self-hosted Anthropic-protocol endpoint over 2026-04-28 → 2026-05-21. 411 compaction trigger events with the preceding turn's server-reported usage.input_tokens. Identical Anthropic protocol traffic across all CC versions; the only thing that changes is the client.

Per-version median of prev_input_tokens at compaction trigger (latest 47-hour log, n=411):

cc_versionnauto countmanual countp50 prev_inputp90 prev_input
2.1.92202140,121
2.1.138505137,236143,968
2.1.141202150,137
2.1.14220019142,372143,725
2.1.143101143,709
2.1.144318728674,312143,295
2.1.1464914374,29785,446
2.1.14780881,077155,942

Day-over-day collapse:

daynp50
2026-05-1927142,540
2026-05-2040143,052
2026-05-2134473,620 ← 2.1.144 dominant

MCP Discriminator (the smoking gun)

Within 2.1.144, splitting the 333 events by bytes of mcp__* tool definitions in the request:

bytes_mcp bucketnmedian MCP toolsp50 prev_inputregressedhealthy
<50 KB240135,762915
50-100 KB58972,12650
100-150 KB30413274,52525252

Zero MCPs → 135K threshold (healthy). 100+ KB of MCP definitions → 74K threshold (regressed). Same client, same model, same protocol — only the presence of MCP tool definitions toggles the behavior.

The threshold gap (143K healthy − 74K regressed = ~69K) is the same order of magnitude as the byte-count of MCP tool definitions reduced to tokens (~116K bytes ÷ ~3 chars/token ≈ 39K tokens × ~1.8× over-count). Long underscore-separated MCP tool names like mcp__atlassian__jira_search_users tokenize at ~13 cl100k_base tokens vs 1 token for Bash — a ratio that may have changed across tokenizer/serializer revisions in the 2.1.143 → 2.1.144 diff.

Auto vs Manual Collapse (shared count source)

In healthy versions, auto-compact (gap < 5s after prior turn) and manual /compact (gap ≥ 30s) sit at different thresholds because users type manual at their own perceived "context full" point, not the heuristic point.

In 2.1.144 they collapse to the same threshold (auto p50 = 67K, manual p50 = 75K), strongly suggesting the same broken token-counter feeds both the auto-compact heuristic AND whatever fullness indicator Claude Code shows the user. Users see the indicator firing too early and reflexively /compact.

Reproduction

  1. Use Claude Code 2.1.144 (or 2.1.146/147) configured with ≥1 MCP server (we see it with mcp-atlassian, mcp-phabricator, mcp-bitbucket, playwright, mattermost — total 132 tool definitions, ~116 KB)
  2. Start a new conversation, do work that accumulates context normally
  3. Observe: auto-compact fires at ~74K tokens of real prompt content (server-reported), well below historical ~140K
  4. Repeat with all MCP servers disabled: auto-compact fires at ~135K, healthy

Environment

  • Versions confirmed broken: 2.1.144.*, 2.1.146.*, 2.1.147.*
  • Last clean version observed: 2.1.143.7a0
  • Skipped/unobserved: 2.1.145.* (zero events captured — possibly skipped release)
  • Backend: self-hosted Anthropic-protocol endpoint serving glm-5 (irrelevant to bug — server only sees the request after CC's compaction decision is made)
  • 5 MCP servers enabled, 132 MCP tool definitions, ~116 KB of MCP tool JSON

Suggested Investigation

  1. Diff client-side context-budget / token-counting code between 2.1.143 and 2.1.144. The change is likely a single tokenizer model swap, a tool-serialization change (e.g., adding boilerplate around each tool definition), or a constant added to the per-tool overhead.
  2. Verify with a side-by-side: count tools_size_in_tokens for the same MCP definitions on both versions; expect the 2.1.144 number to be ~1.8× larger.
  3. The fix should restore the 2.1.143 behavior so auto-compact fires at ~140K real input regardless of tool composition.

Workaround (current)

  • Pin to 2.1.143 (only fully-clean version observed across the 47h capture)
  • Or disable unused MCP servers in ~/.claude.json to push back into the 135K threshold band on 2.1.144+
  • Once a conversation crosses ~100 messages, message volume dilutes the MCP over-count and the threshold returns to 140K — affected primarily during the first 50-100 turns of a new session

Data

Full per-event captures (411 trigger events with all extracted features, per-version histograms, MCP byte composition, time-gap classification) are available on request — happy to share a compaction_events.jsonl + tool_features.jsonl from our proxy.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [BUG] Compaction threshold regression in 2.1.144+: MCP tool definitions over-counted, auto-compact fires at ~74K (vs ~143K in 2.1.143)