claude-code - 💡(How to fix) Fix System prompt size grew ~70K tokens between v2.1.89 and v2.1.96, making sessions unusable without frequent manual /compact [4 comments, 3 participants]

Wishbringer71 · 2026-04-08T12:40:19Z



anthropics/claude-code#45188•Fetched 2026-04-09 08:11:14


Between Claude Code 2.1.89 and 2.1.96 (approximately 5 days, early April 2026), the initial system prompt size grew by approximately 70K tokens. This is large enough to make sessions effectively unusable: with auto-compact disabled (as I had it configured at the time), the context would fill before I could complete any non-trivial task, and Claude Code stopped accepting input entirely. I had to introduce a manual /compact workflow as an emergency measure just to keep working.

Fix / Workaround

I had auto-compact disabled during this period (I was testing with it off). Once the system prompt grew past ~100K tokens, sessions filled up before any multi-step task could complete, and Claude Code stopped accepting new input. I had to run manual /compact repeatedly as an emergency workaround just to keep the tool usable. This persisted until I re-enabled auto-compact and tuned the threshold.


Environment

Claude Code versions affected: 2.1.89 → 2.1.91 → 2.1.92 → 2.1.96
OS: Windows 11 (Git Bash shell)
Models: claude-opus-4-6 / claude-sonnet-4-6
Setup: standard user install + a few user plugins (LSPs, code-review, feature-dev, hookify)

Measurement Method

I correlated the usage.cache_creation_input_tokens field of the first assistant message in each session JSONL against the version field, across all sessions under ~/.claude/projects/<id>/*.jsonl.

The first assistant message has no cache hit yet, so its cache_creation_input_tokens reflects the full initial system prompt assembled by the binary — independent of any additionalContext injected by UserPromptSubmit hooks.

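import json
from collections import defaultdict
from pathlib import Path
from statistics import median

# Replace <your-project-id> with the directory name of the project to inspect.
proj = Path.home() / ".claude" / "projects" / "<your-project-id>"
by_version = defaultdict(list)

for jsonl in proj.glob("*.jsonl"):
    first_cache = None
    version = None
    with jsonl.open(encoding="utf-8") as f:
        for line in f:
            try:
                d = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip truncated or corrupt lines
            # Remember the first version string seen in this session.
            if version is None and d.get("version"):
                version = d["version"]
            usage = (d.get("message") or {}).get("usage") or {}
            cct = usage.get("cache_creation_input_tokens")
            # The first non-zero value is the uncached initial prompt size.
            if cct:
                first_cache = cct
                break
    if first_cache and version:
        by_version[version].append(first_cache)

def version_key(v):
    # Order dotted versions numerically so "2.1.100" sorts after "2.1.96".
    return [int(p) if p.isdigit() else 0 for p in v.split(".")]

for v in sorted(by_version, key=version_key):
    vals = by_version[v]
    print(f"{v}: n={len(vals):3d}  median={median(vals):6.0f}  max={max(vals):6d}")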

Observed Data

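| Date         | CC version      | Median first-cache (tokens) | Δ vs baseline |
|--------------|-----------------|-----------------------------|---------------|
| 04-01..04-02 | 2.1.89 / 2.1.90 | ~38–52K                     | baseline      |
| 04-03        | 2.1.91          | ~80–88K                     | +40K          |
| 04-04..04-07 | 2.1.92          | ~85–90K                     | +40K (stable) |
| 04-08        | 2.1.96          | ~112–119K                   | +70K total    |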

Two distinct step changes: +40K at 2.1.91, +30K at 2.1.96. Total growth: ~70K in 5 days.
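
The step sizes quoted above can be checked against the reported ranges. This is a minimal sketch that assumes the midpoint of each range is a fair stand-in for the median; the `ranges_k` dictionary and the midpoint choice are illustrative and not part of the original measurement.

```python
# Reported median first-cache ranges, in thousands of tokens.
ranges_k = {
    "2.1.89 / 2.1.90": (38, 52),
    "2.1.91": (80, 88),
    "2.1.92": (85, 90),
    "2.1.96": (112, 119),
}

def midpoint(lo_hi):
    """Midpoint of a (low, high) range; an assumed proxy for the median."""
    lo, hi = lo_hi
    return (lo + hi) / 2

def step_changes(ranges):
    """Return (version, midpoint, delta vs previous row, delta vs baseline) rows."""
    rows = []
    baseline = prev = None
    for version, lo_hi in ranges.items():
        mid = midpoint(lo_hi)
        if baseline is None:
            baseline = prev = mid
        rows.append((version, mid, mid - prev, mid - baseline))
        prev = mid
    return rows

for version, mid, step, total in step_changes(ranges_k):
    print(f"{version:16s} mid={mid:6.1f}K  step={step:+6.1f}K  total={total:+6.1f}K")
```

With midpoints, the two jumps come out at roughly +39K and +28K, about +70K in total, which is consistent with the rounded figures.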

Impact

Even with auto-compact enabled at the default threshold, the effective usable context has shrunk dramatically: a 70K larger baseline means 70K less room for actual conversation and tool output.
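
To make the lost headroom concrete, here is a rough sketch. The 200K-token context window is an assumption made for illustration (the issue does not state the window size), and the two baselines are approximate midpoints of the measured ranges.

```python
CONTEXT_WINDOW = 200_000  # assumed window size, not stated in the issue

def usable_tokens(baseline, window=CONTEXT_WINDOW):
    """Tokens left for conversation and tool output after the initial prompt."""
    return max(window - baseline, 0)

before = usable_tokens(45_000)   # approximate 2.1.89 baseline
after = usable_tokens(115_000)   # approximate 2.1.96 baseline

print(f"usable before: {before:,} tokens")
print(f"usable after:  {after:,} tokens")
print(f"headroom lost: {before - after:,} tokens")
```

Under these assumptions the usable budget drops from about 155K to about 85K tokens before any conversation has taken place.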

What I Already Ruled Out

UserPromptSubmit hooks (additionalContext): affects later cache blocks, not the first cache_creation_input_tokens. Verified by disabling my hook chain — first-cache size unchanged.
MCP servers: removed candidate MCPs, re-measured — first-cache size unchanged.
Plugin count: removed 5 stale LSP plugins, re-measured — first-cache size unchanged.
Session state / bistability: sessions started with /clear show ~50–60K smaller first-cache than sessions resuming work. Both classes show the same version-correlated step changes, so the jumps are not session-state artifacts.

The growth correlates exactly and exclusively with binary version bumps.
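
Each elimination test above amounts to a before/after comparison of the first-cache size of two fresh sessions. A minimal sketch of that check follows, reusing the same JSONL fields as the measurement script. The helper names `first_cache_tokens` and `compare`, and the file names in the usage comment, are hypothetical.

```python
import json
from pathlib import Path

def first_cache_tokens(session_path):
    """Return the first non-zero cache_creation_input_tokens in a session JSONL, or None."""
    with Path(session_path).open(encoding="utf-8") as f:
        for line in f:
            try:
                record = json.loads(line)
            except json.JSONDecodeError:
                continue  # skip malformed lines
            if not isinstance(record, dict):
                continue
            message = record.get("message")
            usage = message.get("usage") if isinstance(message, dict) else None
            tokens = usage.get("cache_creation_input_tokens") if isinstance(usage, dict) else None
            if tokens:
                return tokens
    return None

def compare(before_path, after_path):
    """Difference in first-cache size between two sessions (after minus before)."""
    before = first_cache_tokens(before_path)
    after = first_cache_tokens(after_path)
    if before is None or after is None:
        return None
    return after - before

# Usage (hypothetical file names): one session recorded with the hook chain
# enabled and one with it disabled. A result near 0 suggests the hook chain
# does not contribute to the initial prompt.
# print(compare("with_hooks.jsonl", "without_hooks.jsonl"))
```

Comparing sessions started the same way (both after /clear, or both resumed) keeps the session-state difference described above out of the result.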

Suspected Sources of Growth

Based on inspection of what appears to have changed in the system prompt area between these versions:

Skill listing attachment (skill_listing type) — my session shows ~7KB for 47 skills injected on every turn
New or expanded injection-defense / safety layers
New tool descriptions (Skill tool, browser-automation tool suite, Plugin subagent descriptions)

Asks

Confirm whether the ~70K growth between 2.1.89 and 2.1.96 is intentional and which components account for it.
Consider lazy-loading or deduplicating large fixed blocks (skill listings, agent descriptions, plugin metadata) so they only appear when the relevant capability is actually needed.
Expose first-cache / baseline system-prompt size in /status or a debug flag so users can monitor it without parsing session JSONLs.
Document the expected baseline range per version in the CHANGELOG so users can anticipate and plan around context budget changes.
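
Until a built-in indicator exists, the per-version data collected by the measurement script can be scanned for step changes. The sketch below assumes a `by_version` mapping of version string to a list of first-cache sizes, as built by that script. The function names, the 20K threshold, and the sample values are illustrative only; the samples are made up to resemble the reported ranges and are not measured data.

```python
from statistics import median

def version_key(version):
    """Sort key that orders dotted versions numerically ("2.1.100" after "2.1.96")."""
    return [int(part) if part.isdigit() else 0 for part in version.split(".")]

def find_step_changes(by_version, threshold=20_000):
    """Return (prev_version, version, delta) where the median first-cache grew by >= threshold."""
    ordered = sorted(by_version, key=version_key)
    medians = [(v, median(by_version[v])) for v in ordered if by_version[v]]
    jumps = []
    for (prev_v, prev_m), (cur_v, cur_m) in zip(medians, medians[1:]):
        delta = cur_m - prev_m
        if delta >= threshold:
            jumps.append((prev_v, cur_v, delta))
    return jumps

# Illustrative sample values only, not measured data.
sample = {
    "2.1.89": [38_000, 45_000, 52_000],
    "2.1.91": [80_000, 84_000, 88_000],
    "2.1.92": [85_000, 87_000, 90_000],
    "2.1.96": [112_000, 116_000, 119_000],
}
for prev_v, cur_v, delta in find_step_changes(sample):
    print(f"{prev_v} -> {cur_v}: +{delta:,.0f} tokens")
```

A threshold well above the session-to-session spread within one version keeps ordinary variation from being reported as a jump.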

Reproducibility

Anyone with multiple session JSONLs spanning versions 2.1.89–2.1.96 can run the script above and reproduce the version-vs-size correlation in a few seconds.

extent analysis

TL;DR

The initial system prompt grew by roughly 70K tokens between Claude Code 2.1.89 and 2.1.96. The practical workaround is to re-enable auto-compact and tune its threshold. A lasting fix, such as lazy-loading or deduplicating large fixed blocks, would have to come from the Claude Code maintainers.

Guidance

Re-enable auto-compact and adjust its threshold so the context does not fill up before a task completes.
Reducing the initial prompt itself (lazy-loading or deduplicating skill listings, agent descriptions, and plugin metadata) is a change the issue requests from the maintainers, not something a user can configure.
Monitor the first-cache size by parsing session JSONLs. The issue requests a /status field or debug flag that would expose this directly.
The issue also asks for baseline ranges to be documented per version in the CHANGELOG. Until that happens, re-measure after each upgrade.

Example

The issue gives no configuration snippet for the auto-compact workaround. The only code it provides is the measurement script under Measurement Method.

Notes

The exact cause of the growth is not explicitly stated, but it is suspected to be related to changes in the system prompt area, such as skill listing attachment, injection-defense/safety layers, and new tool descriptions. The provided script can be used to reproduce the version-vs-size correlation.

Recommendation

Re-enable auto-compact and tune its threshold as a workaround until the prompt growth is addressed upstream. Reducing the baseline through lazy-loading or deduplication of large fixed blocks remains a request to the maintainers.
