claude-code - 💡(How to fix) Fix Repeated context injection via JSONL attachments causes linear token bloat across long sessions [2 comments, 2 participants]

anthropics/claude-code#50998, fetched 2026-04-20 12:07:19

Claude Code persists every hook result, built-in task_reminder, IDE events, skill-list changes, tool-call side-events, subagent progress, and last-prompt as separate attachment / last-prompt / progress entries in the session JSONL under <USER_HOME>/.claude/projects/<PROJECT_ID>/<SESSION_ID>.jsonl. On every subsequent API turn, history reconstruction re-includes these entries in the prompt payload, so the same content is re-transmitted many times within one session, producing linear token bloat that users perceive as "the same context being injected on every turn."

In a 150-minute live monitoring window across five concurrent sessions, this report documents:

  • Linear, unbounded growth of per-turn API payload (~1,100 tokens/min) in a continuously-used session, reaching 661,706 tokens = 66% of a 1 M-token context window in 148 minutes.
  • Auto-compaction is observable but does not help: three spikes of cache_creation_input_tokens at 455K~558K each, immediately followed by cache_read returning to its pre-compaction 500K level.
  • /clear is a temporary reset, not a fix: 319 reset events were logged across four sessions; every reset was followed by rapid re-accumulation.
  • Per-tool-use fan-out: one Bash call produces ~3 attachment/hook_success entries.
  • Subagent silos: Agent / Task invocations create separate JSONLs of up to 6.4 MB each, on top of the main session.
  • Global CLI history (<USER_HOME>/.claude/history.jsonl) retains every input across all sessions/projects forever with no documented purge command.

All findings are reproducible across concurrent sessions on the same machine and persist after upgrading to the latest Claude Code v2.1.113.


Root Cause

v2.1.113 does not address the root cause.

Fix Action

Fix / Workaround

Readings:

  • attachment / entries ratio on v2.1.113 is 40~54% — equal to or worse than older versions.

  • task_reminder itemCount=0 emission on v2.1.113: decreases in some sessions, increases in others — the patch is not a coalescing fix; it is situational.

  • last-prompt continues to persist ~7× copies of each unique prompt on v2.1.113 (no improvement).

  • CLI restart keeps the same sessionId and writes to the same JSONL — prior accumulation carries over and is re-included in future API payloads.

  • Cost: API token consumption grows faster than the information content of the conversation. 5-minute prompt-cache TTL does not absorb this because session JSONL entries change on every turn, invalidating cache boundaries downstream of the first modification. In a 148-minute continuously-used session, the per-turn API payload grew linearly by ~1,100 tokens/min and reached 661,706 tokens — 66% of a 1M-token context window (§15.1).

  • Latency: Larger payloads → slower first-token time and slower auto-compaction triggers. Auto-compaction itself introduces 455K~558K-token spikes (§15.2).

  • Context window: Usable context shrinks faster than the visible conversation warrants, forcing premature /clear or /compact. Heavy /clear users were observed resetting 211 times in a single session (<PROJECT_B>, §15.3) without ever escaping re-accumulation; the user who never /clear-ed hit 66% of the 1M window.

  • User experience: Users perceive "the same context injected repeatedly" without a clear mitigation path because the cause is internal and survives restarts, upgrades, /clear, and /compact.

  • Privacy (secondary): <USER_HOME>/.claude/history.jsonl retains every CLI input across all projects forever, with no documented purge command (§14).

Workarounds the user tried (limited effect)

Code Example

session       size        attachment   task_reminder   last-prompt
-------       ----        ----------   -------------   -----------
<PROJECT_A>   2.2 MB          532            16             54
<PROJECT_B>   2.5 MB          429            14             62
<PROJECT_C>   7.2 MB        1,413            32            168
<PROJECT_D>   3.4 MB          992            11             81
<PROJECT_E>   870 KB          154             9             20

---

entry type              count    bytes        %
attachment              1,413    1.66 MB     22%
assistant                 860    1.79 MB     25%
user                      684    2.71 MB     37%
permission-mode           176       20 KB     0%
last-prompt               168       26 KB     0%
file-history-snapshot      97      297 KB     4%
custom-title               84        9 KB     0%
agent-name                 84        9 KB     0%
system                     66       52 KB     1%
queue-operation            11        2 KB     0%

---

hash            repeats   size/copy   cumulative waste
<HASH_1>          16 ×     56 B          896 B    task_reminder itemCount=0 (empty list)
<HASH_2>           9 ×     51 B          459 B
<HASH_3>           5 ×  8,730 B       43,650 B    (large repeated payload)
<HASH_4>           5 ×  2,931 B       14,655 B
<HASH_5>           5 ×    456 B        2,280 B    specific task object persisted 5×
<HASH_6>           3 ×  1,396 B        4,188 B    specific task object persisted 3×

---

attachment.type           role
------------------------  -----------------------------------------------------
hook_success              user + plugin hook results (dominant, 89 / 300 entries in sample)
task_reminder             built-in TodoWrite nag
opened_file_in_ide        IDE file-open tracking
deferred_tools_delta      tool list updates
skill_listing             skill list snapshot
command_permissions       permission mode changes
date_change               system clock date rollover
hook_non_blocking_error   hook execution failures
hook_additional_context   hook-injected additionalContext
hook_system_message       hook-emitted system messages

---

itemCount distribution (<PROJECT_C> session):
  0: 16   ← "task tools haven't been used recently" reminder
  1:  5
  3:  3
  4:  5
  7:  1
 10:  1
 13:  1

---

total last-prompt entries:  168
unique content hashes:       22
mean copies per unique:    7.6 ×

---

session       turns   first-turn     25%       50%       75%      last-turn
-------       -----   ----------   -------   -------   -------   ---------
<PROJECT_C>     860       52,572    80,656   152,242   285,677     498,255
<PROJECT_B>     390       86,804   189,077   349,696   422,434     497,299
<PROJECT_D>     530       89,619   185,527   251,836   343,565     431,287
<PROJECT_E>     105       92,183   134,735   168,584   198,054     242,501

---

session       input   cache_read   cache_creation   output     TOTAL
-------       -----   ----------   --------------   ------    --------
<PROJECT_C>       1      497,533              721    1,662     498,255
<PROJECT_B>       1      496,604              694    1,123     497,299
<PROJECT_D>       1      430,331              955    1,274     431,287
<PROJECT_E>       6      231,371           11,124    3,200     242,501

---

entry 0  file-history-snapshot     (auto-generated by Claude Code, no timestamp)
entry 1  attachment/hook_success          hookName = SessionStart:clear
entry 2  attachment/hook_system_message   hookName = SessionStart:clear  content = "bkit v2.1.7"
entry 3  attachment/hook_success          hookName = SessionStart:clear
entry 4  user        <local-command-caveat></local-command-caveat>
entry 5  user        <command-name>/clear</command-name>
entry 6  system/local_command             (empty stdout)

---

session       version    entries   attach   tr:itemCount=0   last-p(unique)
--------      -------    -------   ------   --------------   --------------
<PROJECT_C>    2.1.110     1,937      945          16             -
<PROJECT_C>    2.1.112       522      189           0             -
<PROJECT_C>    2.1.113       533      279           0         168 (22 uniq, 7.6×)
<PROJECT_E>    2.1.112       288      136           5             -
<PROJECT_E>    2.1.113        23        9           0          20 ( 8 uniq, 2.5×)
<PROJECT_D>    2.1.112     1,546      793           4             -
<PROJECT_D>    2.1.113       369      199           7          81 (11 uniq, 7.4×)
<PROJECT_B>    2.1.112       714      249           1             -
<PROJECT_B>    2.1.113       395      187           1          63 ( 8 uniq, 7.9×)

---

[user-input]
file-history-snapshot                      (auto-generated by Claude Code)
assistant: thinking
assistant: tool_use                         ← Bash call #1
attachment/hook_success                     ← PreToolUse:Bash hook
attachment/hook_success                     ← PreToolUse:Bash hook (second)
user: tool_result
attachment/hook_success                     ← PostToolUse:Bash hook
assistant: thinking
assistant: text
assistant: tool_use                         ← Bash call #2
...

---

total subagent JSONL files on machine : 1,091
largest single subagent JSONL         : 6,378,381 B (6.4 MB)
largest subagent internal composition : progress 2,615  assistant 163  user 97  system 9

---

<USER_HOME>/.claude/history.jsonl   846,561 B
lines                               2,674
schema                              {display, pastedContents, project, sessionId, timestamp}
sample display values               "/exit", "/clear", <free-form user prompts>

---

elapsed    size          total_tokens   cache_r      cache_c      input
-------    ------------  ------------   ----------   ---------    -----
  0m        7,273,633        498,255       497,533         721        1
 10m        7,300,616        500,013        44,488     455,519        6   ← auto-compaction
 20m        7,524,612        524,962       522,689       2,272        1
 60m        8,092,572        582,403        44,488     537,909        6   ← auto-compaction
 70m        8,293,253        602,914        44,488     558,420        6   ← auto-compaction
120m        8,606,324        638,367       637,545         821        1
140m        8,850,869        661,706       661,594         111        1
148m        8,850,869        661,706       661,594         111        1  (final)

148 min Δ:  size +1.58 MB  /  tokens +163,451  (1,100 tokens/min, linear)
peak:       661,706 tokens = 66% of a 1 M-token context window

---

time    cache_r before   cache_r after   cache_c (that turn)   billable amplification
-----   --------------   -------------   -------------------   -------------------------
 10m         497,533          44,488            455,519        cache_c billed at 125% of input
 60m          ~538,000         44,488            537,909         equivalent to ~672 K input tokens
 70m          ~558,000         44,488            558,420         equivalent to ~698 K input tokens
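
The "billable amplification" column follows from the 125% multiplier stated in the first row. A minimal sketch of that arithmetic, assuming the multiplier applies uniformly to cache_creation_input_tokens (the function name is hypothetical and not part of Claude Code):

```python
CACHE_WRITE_MULTIPLIER = 1.25  # from the table above: cache_c billed at 125% of input


def equivalent_input_tokens(cache_creation_tokens, multiplier=CACHE_WRITE_MULTIPLIER):
    """Express a cache-creation spike as the plain input-token count that would cost the same."""
    return cache_creation_tokens * multiplier


# cache_c values of the three auto-compaction turns in the table above
for minute, cache_c in ((10, 455_519), (60, 537_909), (70, 558_420)):
    equiv_k = equivalent_input_tokens(cache_c) / 1000
    print(f'{minute:>3}m  cache_c={cache_c:>7,}  ~{equiv_k:.0f} K input-equivalent')
```

The 60m and 70m rows come out at roughly 672 K and 698 K, matching the table.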

---

session                      snapshots   reset events   peak tokens   final tokens
--------------------------   ---------   ------------   -----------   ------------
<PROJECT_C>                        230              0       661,706        661,706   (monotonic)
<PROJECT_A>                        411            103       266,927        266,927   (final re-accumulation to peak)
<PROJECT_B>                        502            211       562,491              0   (ended in /clear state)
<PROJECT_D>                        213              1       558,418              0   (single late reset)
<PROJECT_E> (this session)         173              4       290,986         89,945   (repeated reset / accumulate)

---

total snapshots across 5 sessions             : 1,529
combined JSONL size change in 150 min         : +0.65 MB net (with four sessions resetting)
longest uninterrupted accumulation            : 148 min → +163 K tokens linear
largest peak observed                         : 661,706 tokens (66 % of 1 M window)
auto-compaction events observed in one session: 3
total /clear-induced resets observed          : 319 across 4 sessions

---

<USER_HOME>/.claude/projects/<PROJECT_A>/<SESSION_ID_A>.jsonl  (2.2 MB)
<USER_HOME>/.claude/projects/<PROJECT_B>/<SESSION_ID_B>.jsonl  (2.5 MB)
<USER_HOME>/.claude/projects/<PROJECT_C>/<SESSION_ID_C>.jsonl  (7.2~8.9 MB, grew during monitoring)
<USER_HOME>/.claude/projects/<PROJECT_D>/<SESSION_ID_D>.jsonl  (3.4~4.6 MB)
<USER_HOME>/.claude/projects/<PROJECT_E>/<SESSION_ID_E>.jsonl  (0.1~1.2 MB, reset during monitoring)
<USER_HOME>/.claude/projects/<PROJECT_*>/subagents/agent-*.jsonl   (1,091 files machine-wide, largest 6.4 MB)
<USER_HOME>/.claude/history.jsonl                               (846,561 B, 2,674 entries, machine-wide)

---

<USER_HOME>/.claude/plugins//monitor.log    (JSON Lines, 314 snapshots at 30 s intervals,
                                              09:22~12:00 local time, 321 KB)

---

{
  "ts": "2026-04-20T11:48:32",
  "sessions": [
    {
      "session": "<PROJECT_C>",
      "size": 8850869,
      "mtime_ago_s": 4,
      "last_total_tokens": 661706,
      "last_input": 1,
      "last_cache_r": 661594,
      "last_cache_c": 111,
      "last_out": 1662,
      "last_version": "2.1.113"
    }
  ]
}

---

# Usage: python this_script.py <USER_HOME>/.claude/projects/<PROJECT_ID>/<SESSION_ID>.jsonl
import json, hashlib, collections, sys

path = sys.argv[1]
counts = collections.Counter()
hashes = collections.Counter()
task_reminder_items = collections.Counter()
attachment_types = collections.Counter()
version_split = collections.defaultdict(lambda: {'entries':0, 'attach':0, 'tr0':0, 'lp':0})
last_prompt_hashes = set()
last_prompts = 0

with open(path, 'r', encoding='utf-8') as f:
    for line in f:
        try:
            d = json.loads(line)
        except Exception:
            continue
        t = d.get('type', '')
        v = d.get('version', 'unknown')
        counts[t] += 1
        version_split[v]['entries'] += 1
        if t == 'attachment':
            version_split[v]['attach'] += 1
            att = d.get('attachment', {})
            attachment_types[att.get('type', '?')] += 1
            h = hashlib.md5(json.dumps(att, sort_keys=True).encode()).hexdigest()[:12]
            hashes[h] += 1
            if att.get('type') == 'task_reminder':
                ic = att.get('itemCount', -1)
                task_reminder_items[ic] += 1
                if ic == 0:
                    version_split[v]['tr0'] += 1
        elif t == 'last-prompt':
            last_prompts += 1
            version_split[v]['lp'] += 1
            last_prompt_hashes.add(hashlib.md5(d.get('lastPrompt', '').encode()).hexdigest()[:12])

print('entry type counts           :', dict(counts))
print('attachment.type catalog     :', dict(attachment_types))
print('top repeated attachment     :', hashes.most_common(10))
print('task_reminder itemCount dist:', dict(task_reminder_items))
print(f'last-prompt: {last_prompts} entries, {len(last_prompt_hashes)} unique')
print('by version                  :', dict(version_split))

---

"""
Snapshot every 30 s until today's 12:00 local. Logs size + last assistant usage
(input / cache_read / cache_creation / output) for every session JSONL modified
within the last 10 minutes. One JSON line per snapshot appended to monitor.log.
"""
import json, os, glob, time, datetime

HERE = os.path.dirname(os.path.abspath(__file__))
LOG = os.path.join(HERE, 'monitor.log')
PROJECTS_ROOT = os.path.expanduser('~/.claude/projects')


def last_assistant_usage(path):
    try:
        last = None
        with open(path, 'r', encoding='utf-8') as f:
            for line in f:
                try:
                    d = json.loads(line)
                except Exception:
                    continue
                if d.get('type') == 'assistant' and d.get('message', {}).get('usage'):
                    last = d
        if not last:
            return None
        u = last['message']['usage']
        return {
            'v': last.get('version', ''),
            'in': u.get('input_tokens', 0),
            'cache_r': u.get('cache_read_input_tokens', 0),
            'cache_c': u.get('cache_creation_input_tokens', 0),
            'out': u.get('output_tokens', 0),
        }
    except Exception:
        return None


def snapshot():
    now = time.time()
    files = glob.glob(os.path.join(PROJECTS_ROOT, '**', '*.jsonl'), recursive=True)
    active = []
    for f in files:
        try:
            mt = os.path.getmtime(f)
            sz = os.path.getsize(f)
        except OSError:
            continue
        if now - mt < 600:
            active.append((f, mt, sz))
    active.sort(key=lambda x: -x[1])
    sessions = []
    for f, mt, sz in active[:8]:
        u = last_assistant_usage(f) or {}
        total = u.get('in', 0) + u.get('cache_r', 0) + u.get('cache_c', 0)
        rel = f[len(PROJECTS_ROOT):]
        for ch in ('/', os.sep):
            rel = rel.lstrip(ch)
        label = rel
        for ch in ('/', os.sep):
            if ch in label:
                label = label.split(ch, 1)[0]
                break
        sessions.append({
            'session': label, 'size': sz, 'mtime_ago_s': int(now - mt),
            'last_total_tokens': total,
            'last_input': u.get('in', 0),
            'last_cache_r': u.get('cache_r', 0),
            'last_cache_c': u.get('cache_c', 0),
            'last_out': u.get('out', 0),
            'last_version': u.get('v', ''),
        })
    return {
        'ts': datetime.datetime.now().isoformat(timespec='seconds'),
        'sessions': sessions,
    }


def main():
    now = datetime.datetime.now()
    end = datetime.datetime.combine(now.date(), datetime.time(12, 0, 0))
    if end <= now:
        end = end + datetime.timedelta(days=1)
    with open(LOG, 'a', encoding='utf-8') as fp:
        fp.write(json.dumps({'ts': now.isoformat(timespec='seconds'), 'event': 'monitor_start',
                             'end_target': end.isoformat(timespec='seconds')}) + '\n')
        fp.flush()
        while datetime.datetime.now() < end:
            fp.write(json.dumps(snapshot(), ensure_ascii=False) + '\n')
            fp.flush()
            time.sleep(30)
        fp.write(json.dumps({'ts': datetime.datetime.now().isoformat(timespec='seconds'),
                             'event': 'monitor_end'}) + '\n')


if __name__ == '__main__':
    main()

---

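# Post-process monitor.log: count reset events and peak tokens per session.
import json, time

p = 'monitor.log'
# Keep only snapshot lines; monitor_start / monitor_end events carry no 'sessions' key.
with open(p, encoding='utf-8') as fp:
    snaps = [json.loads(l) for l in fp if l.strip() and 'sessions' in l]


def e(ts):
    """Convert a local ISO-8601 timestamp (seconds precision) to epoch seconds."""
    return time.mktime(time.strptime(ts, '%Y-%m-%dT%H:%M:%S'))


series = {}
for s in snaps:
    for ses in s['sessions']:
        series.setdefault(ses['session'], []).append((e(s['ts']), ses['size'],
                                                      ses['last_total_tokens'],
                                                      ses['last_cache_r'],
                                                      ses['last_cache_c']))
for name, rec in series.items():
    # A reset is a drop of more than 50% from a snapshot that was above 50,000 tokens.
    resets = sum(1 for i in range(1, len(rec))
                 if rec[i-1][2] > 50000 and rec[i][2] < rec[i-1][2] * 0.5)
    peak = max(rec, key=lambda r: r[2])
    print(f'{name}: snaps={len(rec)} resets={resets} peak_tokens={peak[2]}')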

[Claude Code] Repeated context injection via JSONL attachments causes linear token bloat across long sessions

Where to submit: https://github.com/anthropics/claude-code/issues
Label suggestion: bug, performance, context-management



Environment

Field                          Value
-----------------------------  --------------------------------------------------------------
Claude Code versions observed  2.1.110 → 2.1.112 → 2.1.113 (auto-update within same sessions)
Platform                       Windows 11 Pro 10.0.26100
Shell                          bash (Git Bash on Windows)
Node.js                        v24.11.1
Model                          Opus 4.7 (1M context)
OS locale                      Korean (language: "korean")

Same sessionId is reused across CLI restarts and version upgrades, so accumulated attachment / last-prompt entries carry over — a restart does not reset the re-injection problem.


Reproduction

Any long-running session reproduces this. Below are five concurrent sessions on the same machine; all show the same structural pattern.

session       size        attachment   task_reminder   last-prompt
-------       ----        ----------   -------------   -----------
<PROJECT_A>   2.2 MB          532            16             54
<PROJECT_B>   2.5 MB          429            14             62
<PROJECT_C>   7.2 MB        1,413            32            168
<PROJECT_D>   3.4 MB          992            11             81
<PROJECT_E>   870 KB          154             9             20

Minimal steps to reproduce:

  1. Start a Claude Code session in any project.
  2. Use tools (Bash/Edit/Write/Read) and invoke skills/agents for 30+ turns.
  3. Inspect <USER_HOME>/.claude/projects/<PROJECT_ID>/<SESSION_ID>.jsonl.
  4. Observe the entry-type distribution and repeated content.

Evidence

1. Entry-type distribution in a 7.2 MB session (<PROJECT_C>)

entry type              count    bytes        %
attachment              1,413    1.66 MB     22%
assistant                 860    1.79 MB     25%
user                      684    2.71 MB     37%
permission-mode           176       20 KB     0%
last-prompt               168       26 KB     0%
file-history-snapshot      97      297 KB     4%
custom-title               84        9 KB     0%
agent-name                 84        9 KB     0%
system                     66       52 KB     1%
queue-operation            11        2 KB     0%

attachment alone is 22% of the JSONL. Combined with last-prompt and file-history-snapshot, Claude-Code-generated metadata exceeds 26% of persisted session bytes.
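
A distribution like the one above can be approximated by summing the byte length of each JSONL line per `type` field. The following is a minimal sketch, assuming one JSON object per line with a top-level `type` key, as the analysis script in this report does. The helper name `byte_share_by_type` is hypothetical, and the sample lines are synthetic.

```python
import collections
import json


def byte_share_by_type(lines):
    """Return {entry_type: (count, bytes, percent_of_total)} for an iterable of JSONL lines."""
    counts = collections.Counter()
    sizes = collections.Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip blank or malformed lines
        if not isinstance(entry, dict):
            continue
        entry_type = entry.get('type', '')
        counts[entry_type] += 1
        sizes[entry_type] += len(line.encode('utf-8'))
    total = sum(sizes.values()) or 1
    return {t: (counts[t], sizes[t], round(100 * sizes[t] / total, 1)) for t in counts}


# Tiny synthetic sample (not real session data).
sample = [
    json.dumps({'type': 'user', 'message': {'content': 'hi'}}),
    json.dumps({'type': 'attachment', 'attachment': {'type': 'hook_success'}}),
    json.dumps({'type': 'attachment', 'attachment': {'type': 'task_reminder', 'itemCount': 0}}),
]
for entry_type, (n, size, pct) in sorted(byte_share_by_type(sample).items()):
    print(f'{entry_type:<12} {n:>3} {size:>6} B {pct:>5}%')
```

To run it against a real session, pass an open file object instead of `sample`, for example `with open(path, encoding='utf-8') as f: byte_share_by_type(f)`.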

2. Repeated attachment content — hash-based duplication analysis

hash            repeats   size/copy   cumulative waste
<HASH_1>          16 ×     56 B          896 B    task_reminder itemCount=0 (empty list)
<HASH_2>           9 ×     51 B          459 B
<HASH_3>           5 ×  8,730 B       43,650 B    (large repeated payload)
<HASH_4>           5 ×  2,931 B       14,655 B
<HASH_5>           5 ×    456 B        2,280 B    specific task object persisted 5×
<HASH_6>           3 ×  1,396 B        4,188 B    specific task object persisted 3×
  • Total attachment bytes: ~1.72 MB
  • After content-hash deduplication: ~1.56 MB
  • Bytes wasted on duplicates in storage alone: ~154 KB (~9%)

That 9% is storage only. The amplification is in the API payload: each past attachment is re-included on every subsequent API turn, so one duplicate attachment in storage becomes duplicate re-transmission on every future turn.
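
The storage-side figure can be estimated by hashing each attachment body and counting every copy after the first as waste. This is a sketch under stated assumptions: the attachment body lives under the `attachment` key (as in the analysis script in this report), sizes are measured on a canonical re-serialization rather than the raw line, and SHA-256 is used here in place of the truncated MD5 from that script. `duplicate_waste` is a hypothetical helper name.

```python
import hashlib
import json


def duplicate_waste(lines):
    """Return (total_bytes, deduplicated_bytes, wasted_bytes) over attachment entries."""
    first_size = {}  # content hash -> size of one copy
    total = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue
        if not isinstance(entry, dict) or entry.get('type') != 'attachment':
            continue
        canon = json.dumps(entry.get('attachment', {}), sort_keys=True).encode('utf-8')
        digest = hashlib.sha256(canon).hexdigest()[:12]
        total += len(canon)
        first_size.setdefault(digest, len(canon))
    deduped = sum(first_size.values())
    return total, deduped, total - deduped


# Synthetic sample: one attachment stored three times, one stored once.
repeated = {'type': 'attachment', 'attachment': {'type': 'task_reminder', 'itemCount': 0}}
single = {'type': 'attachment', 'attachment': {'type': 'hook_success', 'hookName': 'PreToolUse:Bash'}}
sample = [json.dumps(repeated)] * 3 + [json.dumps(single), json.dumps({'type': 'user'})]

total, deduped, wasted = duplicate_waste(sample)
print(f'total={total} B  deduplicated={deduped} B  wasted={wasted} B')
```

This measures storage only. It does not model how often each stored entry is re-sent in later API requests.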

3. attachment.type catalog — not just hook output

Live observation on v2.1.113 shows Claude Code persists at least ten distinct attachment types, covering almost every in-session state change:

attachment.type           role
------------------------  -----------------------------------------------------
hook_success              user + plugin hook results (dominant, 89 / 300 entries in sample)
task_reminder             built-in TodoWrite nag
opened_file_in_ide        IDE file-open tracking
deferred_tools_delta      tool list updates
skill_listing             skill list snapshot
command_permissions       permission mode changes
date_change               system clock date rollover
hook_non_blocking_error   hook execution failures
hook_additional_context   hook-injected additionalContext
hook_system_message       hook-emitted system messages

Every one of these is persisted as a separate JSONL entry and is a candidate for re-injection on subsequent turns.

4. task_reminder is emitted by Claude Code itself (not user hooks)

itemCount distribution (<PROJECT_C> session):
  0: 16   ← "task tools haven't been used recently" reminder
  1:  5
  3:  3
  4:  5
  7:  1
 10:  1
 13:  1

The itemCount: 0 variant is the built-in <system-reminder>The task tools haven't been used recently. If you're working on tasks that would benefit from tracking progress, consider using TaskCreate ...</system-reminder> nag. It is stored as a separate attachment entry sixteen times in one session. A built-in Claude Code reminder is a primary contributor — user plugins/hooks are not the cause.

5. last-prompt entries duplicate user input

total last-prompt entries:  168
unique content hashes:       22
mean copies per unique:    7.6 ×

The same user prompt string is persisted up to 16 times per session. A 2 KB user message therefore occupies up to 32 KB of JSONL through this mechanism alone — before considering its inclusion in subsequent API payloads.
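
The copies-per-prompt figures can be reproduced by grouping last-prompt entries on their text. A minimal sketch, assuming the prompt text is stored under the `lastPrompt` key (the field the analysis script in this report reads); `last_prompt_copies` is a hypothetical helper name and the sample is synthetic.

```python
import collections
import json


def last_prompt_copies(lines):
    """Return (total_entries, unique_prompts, mean_copies, max_copies) for last-prompt entries."""
    copies = collections.Counter()
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue
        if isinstance(entry, dict) and entry.get('type') == 'last-prompt':
            copies[entry.get('lastPrompt', '')] += 1
    total = sum(copies.values())
    unique = len(copies)
    mean = total / unique if unique else 0.0
    worst = max(copies.values(), default=0)
    return total, unique, mean, worst


# Synthetic sample: prompt "a" persisted three times, prompt "b" once.
sample = [json.dumps({'type': 'last-prompt', 'lastPrompt': p}) for p in ('a', 'a', 'b', 'a')]
print(last_prompt_copies(sample))

# The figures above: 168 entries over 22 unique hashes.
print(round(168 / 22, 1))
```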

6. <system-reminder> tags are not stored in the user-message body

When parsing message.content[].text for type: user entries specifically, zero <system-reminder> tags are found in the <PROJECT_C> session, even though those tags appear in the live prompt. This proves:

  • <system-reminder> blocks are runtime-injected during API payload construction, not persisted.
  • The persistence that causes bloat is not the reminders themselves — it is the attachment / last-prompt entries that Claude Code reads back in and re-attaches on every turn.
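
The tag check described above can be sketched as follows. It assumes `message.content` in a user entry is either a plain string or a list of blocks with a `text` field. The list form is what the text above describes, and the string form is handled only as a precaution. `count_reminder_tags` is a hypothetical helper name.

```python
import json


def count_reminder_tags(lines, tag='<system-reminder>'):
    """Count occurrences of `tag` in the text of type: user entries only."""
    hits = 0
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue
        if not isinstance(entry, dict) or entry.get('type') != 'user':
            continue
        message = entry.get('message')
        if not isinstance(message, dict):
            continue
        content = message.get('content') or []
        if isinstance(content, str):
            hits += content.count(tag)
            continue
        for block in content:
            if isinstance(block, dict) and isinstance(block.get('text'), str):
                hits += block['text'].count(tag)
    return hits


# Synthetic sample: the tag appears in an attachment entry but not in any user entry.
sample = [
    json.dumps({'type': 'user', 'message': {'content': [{'type': 'text', 'text': 'run the tests'}]}}),
    json.dumps({'type': 'attachment', 'attachment': {'type': 'task_reminder',
                                                     'content': '<system-reminder>...'}}),
]
print(count_reminder_tags(sample))
```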

7. Cross-session confirmation rules out user-side causes

The 5 sessions above are in 5 different projects with different CLAUDE.md, different plugin activity, and different workflows. All show the same attachment / task_reminder / last-prompt accumulation pattern. This is a Claude Code binary behavior, not a user misconfiguration.

8. Direct API-payload evidence — usage fields returned by Anthropic API

The message.usage object in each assistant entry is server-returned and non-forgeable. It reports the exact token counts Anthropic's API billed for the preceding request. Plotting these across a session shows the payload growth directly.

session       turns   first-turn     25%       50%       75%      last-turn
-------       -----   ----------   -------   -------   -------   ---------
<PROJECT_C>     860       52,572    80,656   152,242   285,677     498,255
<PROJECT_B>     390       86,804   189,077   349,696   422,434     497,299
<PROJECT_D>     530       89,619   185,527   251,836   343,565     431,287
<PROJECT_E>     105       92,183   134,735   168,584   198,054     242,501

Last-turn breakdown:

session       input   cache_read   cache_creation   output     TOTAL
-------       -----   ----------   --------------   ------    --------
<PROJECT_C>       1      497,533              721    1,662     498,255
<PROJECT_B>       1      496,604              694    1,123     497,299
<PROJECT_D>       1      430,331              955    1,274     431,287
<PROJECT_E>       6      231,371           11,124    3,200     242,501

Key observations:

  • input_tokens is 1~6 across every sampled last turn — essentially no "genuinely new" content is sent raw.
  • cache_read_input_tokens carries 231K~497K — the entire conversation (including every historical attachment, tool_result, last-prompt, assistant output) is re-included in every request via prompt cache.
  • cache_creation_input_tokens = 721~11,124 per turn — this is the increment appended to cache each turn, which then compounds into the next turn's cache_read.
  • Growth is monotonic and at least linear in turn count: <PROJECT_C> sampled at turns 0 / 215 / 430 / 645 / 859 yields 52K / 80K / 152K / 285K / 498K, averaging ~520 tokens of permanent context growth per turn (445,683 tokens over 859 turns).
  • <PROJECT_C>'s last turn occupies 49.8% of the 1M-token context window before any new user input. Earlier turns in that same session fit comfortably — the growth is structural, not driven by user content.
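
The per-turn totals in the tables above are the sum of input, cache_read and cache_creation tokens from each assistant entry's `message.usage`. A minimal sketch of extracting that series and its average growth, using the usage field names read by the monitoring script in this report; the helper names are hypothetical and the sample session is synthetic.

```python
import json


def turn_totals(lines):
    """Per-turn prompt size: input + cache_read + cache_creation, from assistant entries."""
    totals = []
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue
        if not isinstance(entry, dict) or entry.get('type') != 'assistant':
            continue
        message = entry.get('message')
        usage = message.get('usage') if isinstance(message, dict) else None
        if not usage:
            continue
        totals.append(usage.get('input_tokens', 0)
                      + usage.get('cache_read_input_tokens', 0)
                      + usage.get('cache_creation_input_tokens', 0))
    return totals


def mean_growth_per_turn(totals):
    """Average token growth between consecutive turns, first turn to last."""
    if len(totals) < 2:
        return 0.0
    return (totals[-1] - totals[0]) / (len(totals) - 1)


# Synthetic three-turn session (illustrative values, not real data).
sample = [json.dumps({'type': 'assistant', 'message': {'usage': {
    'input_tokens': 1, 'cache_read_input_tokens': read, 'cache_creation_input_tokens': 500}}})
    for read in (1_000, 1_500, 2_000)]
totals = turn_totals(sample)
print(totals, mean_growth_per_turn(totals))

# Endpoints of the <PROJECT_C> row above: 52,572 at turn 0 and 498,255 at turn 859.
print(round((498_255 - 52_572) / 859))
```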

9. Mapping cache growth back to attachment accumulation

From the attachment analysis above, <PROJECT_C> contains ~1.72 MB of attachment bytes across 1,413 attachment entries, averaging 1.2 KB per attachment ≈ 300 tokens. The per-turn cache_creation_input_tokens range of 721~11,124 observed in §8 corresponds to roughly 2~36 attachment-equivalents per turn being appended to the running context.

This is the arithmetic bridge between §2~§5 (what gets persisted in JSONL) and §8 (what actually gets billed in the API request). It confirms the originally-hypothesized mechanism: Claude Code stores per-turn events as attachments in JSONL, then re-sends them on every subsequent API call via the prompt cache, and the cache size — and therefore per-turn billable tokens — grows linearly with the number of turns.
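
The bridge arithmetic can be written out explicitly. This sketch assumes about 4 bytes per token, which is the ratio implied by "1.2 KB ≈ 300 tokens" above. It is a rough rule of thumb, not a measured tokenizer figure, and the helper name is hypothetical.

```python
ATTACHMENT_BYTES = 1.72e6    # ~1.72 MB of attachment bytes in <PROJECT_C>
ATTACHMENT_ENTRIES = 1_413   # attachment entries in <PROJECT_C>
BYTES_PER_TOKEN = 4          # assumption: implied by "1.2 KB is about 300 tokens"

avg_bytes = ATTACHMENT_BYTES / ATTACHMENT_ENTRIES   # about 1.2 KB per attachment
avg_tokens = avg_bytes / BYTES_PER_TOKEN            # about 300 tokens per attachment


def attachment_equivalents(cache_creation_tokens, tokens_per_attachment=avg_tokens):
    """How many average-sized attachments a per-turn cache increment corresponds to."""
    return cache_creation_tokens / tokens_per_attachment


# Lowest and highest per-turn cache_creation_input_tokens from the last-turn table in §8
low, high = 721, 11_124
print(f'avg attachment: {avg_bytes:.0f} B, ~{avg_tokens:.0f} tokens')
print(f'per-turn increment: {attachment_equivalents(low):.1f} to '
      f'{attachment_equivalents(high):.1f} attachment-equivalents')
```

With these inputs the increment works out to between 2 and 37 average-sized attachments per turn.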

10. Idle Claude Code still accumulates JSONL — /clear with zero user input

Starting Claude Code v2.1.113 and immediately pressing /clear twice, with zero user text input, produces a micro-session whose JSONL already contains seven entries totalling ~3.9 KB:

entry 0  file-history-snapshot     (auto-generated by Claude Code, no timestamp)
entry 1  attachment/hook_success          hookName = SessionStart:clear
entry 2  attachment/hook_system_message   hookName = SessionStart:clear  content = "bkit v2.1.7"
entry 3  attachment/hook_success          hookName = SessionStart:clear
entry 4  user        <local-command-caveat>…</local-command-caveat>
entry 5  user        <command-name>/clear</command-name>…
entry 6  system/local_command             (empty stdout)

Observations:

  • /clear re-fires SessionStart hooks — matcher name observed is literally SessionStart:clear. Every SessionStart-bound hook (user and plugin) emits a fresh attachment entry, even though the user did not start a new session deliberately.
  • User-typed bytes: 0. Persisted JSONL bytes: 3,941. Every byte of the JSONL is eligible for re-inclusion in the next API request.
  • A user who never types anything, only presses /clear, still pays the cache-creation tax for these attachments.

This rules out "the user is accumulating context" as a possible cause. The accumulation happens purely from Claude Code's own lifecycle events.

11. v2.1.113 upgrade does not fix the issue

Same sessions analyzed before and after auto-upgrade, splitting entries by the version field they were written under:

session       version    entries   attach   tr:itemCount=0   last-p(unique)
--------      -------    -------   ------   --------------   --------------
<PROJECT_C>    2.1.110     1,937      945          16             —
<PROJECT_C>    2.1.112       522      189           0             —
<PROJECT_C>    2.1.113       533      279           0         168 (22 uniq, 7.6×)
<PROJECT_E>    2.1.112       288      136           5             —
<PROJECT_E>    2.1.113        23        9           0          20 ( 8 uniq, 2.5×)
<PROJECT_D>    2.1.112     1,546      793           4             —
<PROJECT_D>    2.1.113       369      199           7          81 (11 uniq, 7.4×)
<PROJECT_B>    2.1.112       714      249           1             —
<PROJECT_B>    2.1.113       395      187           1          63 ( 8 uniq, 7.9×)

Readings:

  • attachment / entries ratio on v2.1.113 is 40~54% — equal to or worse than older versions.
  • task_reminder itemCount=0 emission on v2.1.113: decreases in some sessions, increases in others — the patch is not a coalescing fix; it is situational.
  • last-prompt continues to persist ~7× copies of each unique prompt on v2.1.113 (no improvement).
  • CLI restart keeps the same sessionId and writes to the same JSONL — prior accumulation carries over and is re-included in future API payloads.

v2.1.113 does not address the root cause.
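The ratios quoted in the readings can be recomputed from the v2.1.113 rows of the table. A small sketch; the tuple layout and the `summarize` helper are illustrative only:

```python
# Rows copied from the v2.1.113 lines of the table above:
# (session, entries, attachments, last-prompt entries, unique last-prompts)
V2_1_113_ROWS = [
    ("<PROJECT_C>", 533, 279, 168, 22),
    ("<PROJECT_E>", 23, 9, 20, 8),
    ("<PROJECT_D>", 369, 199, 81, 11),
    ("<PROJECT_B>", 395, 187, 63, 8),
]


def summarize(rows):
    """Map session -> (attachment share of entries, last-prompt copies per unique prompt)."""
    return {
        session: (attach / entries, lp_total / lp_unique)
        for session, entries, attach, lp_total, lp_unique in rows
    }


summary = summarize(V2_1_113_ROWS)
for session, (share, dup) in summary.items():
    print(f"{session}: attachments = {share:.1%} of entries, "
          f"last-prompt = {dup:.1f}x per unique prompt")
```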

12. Tool-use amplification — one Bash call produces ~3 attachment entries

Inspecting the last 12 entries written during an active turn of this reporting session shows a repeating motif. One Bash invocation expands into one tool_use / tool_result pair plus three attachment/hook_success entries (one PreToolUse + another PreToolUse variant + one PostToolUse). A typical slice:

[user-input]
file-history-snapshot                      (auto-generated by Claude Code)
assistant: thinking
assistant: tool_use                         ← Bash call #1
attachment/hook_success                     ← PreToolUse:Bash hook
attachment/hook_success                     ← PreToolUse:Bash hook (second)
user: tool_result
attachment/hook_success                     ← PostToolUse:Bash hook
assistant: thinking
assistant: text
assistant: tool_use                         ← Bash call #2
...

Implication: every tool use, regardless of content, appends three attachment entries to the JSONL. Ten Bash invocations → 30 new attachments. Given §9's conversion of ~300 tokens per attachment, this is ~9K tokens per 10 tool calls purely from hook-success persistence, all of which re-enter subsequent turns' prompt caches.
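The estimate above is a straight multiplication. A sketch with the report's figures as defaults (three hook_success attachments per tool call, ~300 tokens per attachment); both are observations from this machine's hook configuration, not constants of Claude Code:

```python
def hook_overhead_tokens(tool_calls: int,
                         attachments_per_call: int = 3,
                         tokens_per_attachment: int = 300) -> int:
    """Tokens persisted purely by hook_success attachments for a number of tool calls."""
    return tool_calls * attachments_per_call * tokens_per_attachment


print(hook_overhead_tokens(10))  # 9000
```

With fewer hooks bound to PreToolUse / PostToolUse, `attachments_per_call` drops and the overhead scales down proportionally.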

13. Subagent JSONL silos — parallel accumulation under subagents/

Agent / Task tool invocations create a separate JSONL per subagent under <PROJECT_ROOT>/subagents/agent-*.jsonl. These exhibit the same attachment accumulation pattern as the parent session, compounding the problem.

total subagent JSONL files on machine : 1,091
largest single subagent JSONL         : 6,378,381 B (6.4 MB)
largest subagent internal composition : progress 2,615  assistant 163  user 97  system 9

The progress entry type appears only in subagent JSONLs — one per step of the subagent's execution — and each carries its own persisted state. A single 6.4 MB subagent JSONL is comparable in size to a large main session, and every parent-session turn that invoked a subagent pays both costs.
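The file count and largest-file figure can be gathered with a short scan. A sketch, assuming the subagents/agent-*.jsonl layout described above; `subagent_jsonl_stats` is a hypothetical helper name:

```python
import glob
import os


def subagent_jsonl_stats(projects_root: str) -> dict:
    """Count subagent JSONL files under projects_root and find the largest one."""
    pattern = os.path.join(projects_root, "**", "subagents", "agent-*.jsonl")
    sizes = {p: os.path.getsize(p) for p in glob.glob(pattern, recursive=True)}
    largest = max(sizes, key=sizes.get) if sizes else None
    return {
        "files": len(sizes),
        "total_bytes": sum(sizes.values()),
        "largest": largest,
        "largest_bytes": sizes.get(largest, 0),
    }


if __name__ == "__main__":
    print(subagent_jsonl_stats(os.path.expanduser("~/.claude/projects")))
```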

14. Global CLI history file never rotates

<USER_HOME>/.claude/history.jsonl stores every command-line input typed into Claude Code across all sessions and all projects on the machine, forever.

<USER_HOME>/.claude/history.jsonl   846,561 B
lines                               2,674
schema                              {display, pastedContents, project, sessionId, timestamp}
sample display values               "/exit", "/clear", <free-form user prompts>

Unlike session JSONLs, this file is not re-injected into API requests, so it does not contribute to per-turn token bloat. However:

  • It accumulates without bound; /clear does not touch it.
  • It preserves every prompt a user typed — which for security-sensitive users can include paths, API keys, database URLs that were pasted on the CLI.
  • There is no documented mechanism to rotate or purge it.

This is a separate issue in spirit (privacy / data retention rather than per-turn cost), but it shares the root cause pattern: Claude Code persists everything and provides no user-level trimming controls.
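Until a rotation mechanism exists, a user-side stopgap could look like the sketch below. It writes a trimmed copy and leaves the original untouched; whether Claude Code tolerates a replaced or shortened history.jsonl has not been verified here, and `write_trimmed_copy` is a hypothetical helper, not a Claude Code feature.

```python
from collections import deque


def write_trimmed_copy(history_path: str, out_path: str, keep_last: int = 500) -> int:
    """Copy only the last keep_last lines of a JSONL file to out_path.

    Returns the number of lines written. The source file is not modified.
    """
    with open(history_path, "r", encoding="utf-8") as src:
        tail = deque(src, maxlen=keep_last)  # keeps only the newest lines
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(tail)
    return len(tail)
```

Writing to a separate path keeps the operation reversible; review the copy for pasted secrets before deciding what to do with the original.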

15. 150-minute continuous live monitoring — linear growth, auto-compaction inefficacy, /clear is temporary

A persistent monitor snapshotted five active sessions every 30 seconds from 09:22 to 12:00 local time (316 log lines, 314 snapshots). Full per-session findings:

15.1 Longest continuously-used session: <PROJECT_C> (148 min, 230 snapshots, zero /clear)

elapsed    size          total_tokens   cache_r      cache_c      input
-------    ------------  ------------   ----------   ---------    -----
  0m        7,273,633        498,255       497,533         721        1
 10m        7,300,616        500,013        44,488     455,519        6   ← auto-compaction
 20m        7,524,612        524,962       522,689       2,272        1
 60m        8,092,572        582,403        44,488     537,909        6   ← auto-compaction
 70m        8,293,253        602,914        44,488     558,420        6   ← auto-compaction
120m        8,606,324        638,367       637,545         821        1
140m        8,850,869        661,706       661,594         111        1
148m        8,850,869        661,706       661,594         111        1  (final)

148 min Δ:  size +1.58 MB  /  tokens +163,451  (≈ 1,100 tokens/min, linear)
peak:       661,706 tokens = 66% of a 1 M-token context window

The user performed normal work in this session; no /clear or /compact commands were issued. Per-turn input_tokens stayed at 1~6 throughout, confirming that growth is driven by Claude-Code-persisted state, not by user text.
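The summary figures follow directly from the first and last rows of the table. A short check, using only the numbers reported above:

```python
start_tokens, end_tokens = 498_255, 661_706      # total_tokens at 0m and 148m
start_size, end_size = 7_273_633, 8_850_869      # JSONL size in bytes at 0m and 148m
minutes = 148
CONTEXT_WINDOW = 1_000_000                       # 1 M-token window, as in the report

delta_tokens = end_tokens - start_tokens
tokens_per_minute = delta_tokens / minutes
delta_mb = (end_size - start_size) / 1e6
window_share = end_tokens / CONTEXT_WINDOW

print(delta_tokens)                 # 163451
print(round(tokens_per_minute))     # 1104
print(round(delta_mb, 2))           # 1.58
print(f"{window_share:.0%}")        # 66%
```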

15.2 Auto-compaction is observed — and does not help

At the 10 min, 60 min, and 70 min marks, cache_read_input_tokens collapses from 500 K to 44,488 while cache_creation_input_tokens spikes to 455 K~558 K in the same turn. This is the observable signature of prompt-cache reconstruction / auto-compaction.

time    cache_r before   cache_r after   cache_c (that turn)   billable amplification (cache_c billed at 125% of input)
-----   --------------   -------------   -------------------   ---------------------------------------------------------
 10m         497,533          44,488            455,519        equivalent to ~569 K input tokens
 60m        ~538,000          44,488            537,909        equivalent to ~672 K input tokens
 70m        ~558,000          44,488            558,420        equivalent to ~698 K input tokens

Yet the very next snapshot after each compaction event shows cache_r back in the 500 K range. Auto-compaction does not reduce the context that subsequent turns have to carry — it rebuilds the cache from that same context and immediately resumes accumulation. In effect the user pays a cache-creation surcharge and the new cache starts filling back up at the same rate.
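Both the surcharge and the "signature" can be expressed in a few lines. This is a sketch: the 1.25 multiplier is the 125% figure used in this report, and the 50% thresholds in `looks_like_cache_rebuild` are an illustrative heuristic, not something Claude Code or the API defines.

```python
CACHE_WRITE_MULTIPLIER = 1.25  # report's figure: cache creation billed at 125% of input

# cache_creation_input_tokens on the three compaction turns in the table above
compaction_cache_c = {"10m": 455_519, "60m": 537_909, "70m": 558_420}

input_equivalent = {
    mark: round(tokens * CACHE_WRITE_MULTIPLIER)
    for mark, tokens in compaction_cache_c.items()
}
print(input_equivalent)  # {'10m': 569399, '60m': 672386, '70m': 698025}


def looks_like_cache_rebuild(prev_cache_r: int, cur_cache_r: int, cur_cache_c: int,
                             threshold: float = 0.5) -> bool:
    """Heuristic: cache_read collapses while cache_creation spikes in the same turn."""
    return (cur_cache_r < prev_cache_r * threshold
            and cur_cache_c > prev_cache_r * threshold)


# 0m -> 10m rows of the §15.1 table: a rebuild
print(looks_like_cache_rebuild(497_533, 44_488, 455_519))   # True
# 10m -> 20m rows: cache_read is back up, no rebuild
print(looks_like_cache_rebuild(44_488, 522_689, 2_272))     # False
```

Unlike a /clear reset, total tokens stay roughly constant across a rebuild; only the split between cache read and cache creation changes.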

15.3 /clear is a reset, not a fix — five-session reset tally

session                      snapshots   reset events   peak tokens   final tokens
--------------------------   ---------   ------------   -----------   ------------
<PROJECT_C>                        230              0       661,706        661,706   (monotonic)
<PROJECT_A>                        411            103       266,927        266,927   (final re-accumulation to peak)
<PROJECT_B>                        502            211       562,491              0   (ended in /clear state)
<PROJECT_D>                        213              1       558,418              0   (single late reset)
<PROJECT_E> (this session)         173              4       290,986         89,945   (repeated reset / accumulate)

A "reset event" is detected when last_total_tokens falls to less than 50 % of the previous snapshot's value with the previous value > 50 K.

Readings:

  • The only session that avoided resets is the one where the user never ran /clear. It is also the one that reached 66 % of the 1 M window.
  • <PROJECT_B> and <PROJECT_A> users /clear-ed many times (211 and 103 resets). After each reset, the session re-accumulated within minutes. <PROJECT_A> eventually reached 266 K tokens at the 110-min mark despite heavy /clear use.
  • Resetting is a perpetual cost — users are effectively paying for context they throw away because they cannot selectively prune persisted attachments.

15.4 Aggregate over 150 min

total snapshots across 5 sessions             : 1,529
combined JSONL size change in 150 min         : +0.65 MB net (with four sessions resetting)
longest uninterrupted accumulation            : 148 min → +163 K tokens linear
largest peak observed                         : 661,706 tokens (66 % of 1 M window)
auto-compaction events observed in one session: 3
total /clear-induced resets observed          : 319 across 4 sessions

Expected vs Actual

Expected

  • History reconstruction sends past turns without re-attaching transient metadata that is already summarized in the current turn.
  • Built-in nags (task_reminder with itemCount: 0) emit at most once per threshold condition, not per tool call.
  • last-prompt is a pointer, not a full-content copy on every write.
  • Restarting the CLI, auto-updating, or idling resets ephemeral attachment accumulation.
  • Auto-compaction meaningfully reduces the persisted context a subsequent turn must carry.
  • Subagents (Agent / Task) run in an isolated context that does not compound main-session cost.
  • One tool call produces roughly one persisted event.

Actual

  • Each past attachment is re-included on every API turn → payload grows linearly with turn count (§8, §15.1).
  • task_reminder itemCount:0 is appended on nearly every tool use where TodoWrite is not active → users pay for the same nag 16+ times per session (§4).
  • last-prompt duplicates the prompt string ~7-8× on average (§5).
  • Restart keeps the same sessionId and all accumulated attachments; idle sessions still accumulate when /clear is pressed (§7, §10).
  • Auto-compaction was observed three times in one 148-min session (§15.2). Each event spikes cache_creation by 455K~558K tokens (billed at 125% of input), and the immediately following turn's cache_read returns to the same pre-compaction level — compaction is billed but does not shrink ongoing per-turn payload.
  • Subagent Task calls create independent JSONLs under subagents/agent-*.jsonl that reach 6.4 MB each and repeat the same attachment accumulation pattern (§13).
  • One Bash tool call produces roughly three attachment/hook_success entries (§12).

Impact

  • Cost: API token consumption grows faster than the information content of the conversation. 5-minute prompt-cache TTL does not absorb this because session JSONL entries change on every turn, invalidating cache boundaries downstream of the first modification. In a 148-minute continuously-used session, the per-turn API payload grew linearly by ~1,100 tokens/min and reached 661,706 tokens — 66% of a 1M-token context window (§15.1).
  • Latency: Larger payloads → slower first-token time and slower auto-compaction triggers. Auto-compaction itself introduces 455K~558K-token spikes (§15.2).
  • Context window: Usable context shrinks faster than the visible conversation warrants, forcing premature /clear or /compact. Heavy /clear users were observed resetting 211 times in a single session (<PROJECT_B>, §15.3) without ever escaping re-accumulation; the user who never /clear-ed hit 66% of the 1M window.
  • User experience: Users perceive "the same context injected repeatedly" without a clear mitigation path because the cause is internal and survives restarts, upgrades, /clear, and /compact.
  • Privacy (secondary): <USER_HOME>/.claude/history.jsonl retains every CLI input across all projects forever, with no documented purge command (§14).

Requested

One or more of:

  1. Deduplicate attachment entries by content hash before including them in API payload history reconstruction.
  2. Coalesce task_reminder — emit at most once per N turns or once per distinct itemCount change, and suppress the itemCount: 0 variant beyond the first occurrence.
  3. Trim last-prompt storage — store a reference/hash once rather than a full copy on each submit.
  4. Collapse skill_listing / deferred_tools_delta / command_permissions — only persist deltas that actually change state.
  5. Reduce per-tool-use attachment fan-out — one hook_success per tool call is enough; the current ~3 (PreToolUse × 2 + PostToolUse) multiplies cost for every Bash/Edit/Write.
  6. Stop re-injecting subagent progress entries during main-session history reconstruction; subagent JSONLs already contain the full trace.
  7. Make auto-compaction actually lossy — after a cache_creation spike of the entire conversation, discard or summarize the next turn's persisted history rather than immediately letting cache_read return to the same pre-compaction level.
  8. Expose a user-level setting (e.g. settings.json → experimental.historyCompaction: true) so power users can opt into aggressive attachment compression.
  9. Offer a /clear-attachments command that strips non-essential attachment types from the current JSONL in place, without ending the session.
  10. Rotate or cap <USER_HOME>/.claude/history.jsonl — right now it stores every CLI input across all projects forever, including pasted secrets.
  11. Document the current injection semantics in the Hooks / Context Management section so users can audit their own sessions with confidence.
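Request 1 could be sketched as follows. This is not Claude Code's history-reconstruction code, which is not visible from the outside; it only assumes the entry shape used by the reference snippet in this report (a top-level "type" and an "attachment" object), and `dedupe_attachments` is a hypothetical name.

```python
import hashlib
import json


def attachment_key(entry: dict) -> str:
    """Stable content hash of an attachment payload (key order normalised)."""
    payload = json.dumps(entry.get("attachment", {}), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def dedupe_attachments(entries: list) -> list:
    """Keep every non-attachment entry; keep only the first copy of each attachment."""
    seen = set()
    kept = []
    for entry in entries:
        if entry.get("type") == "attachment":
            key = attachment_key(entry)
            if key in seen:
                continue  # identical payload already included earlier
            seen.add(key)
        kept.append(entry)
    return kept


history = [
    {"type": "attachment", "attachment": {"type": "hook_success", "hookName": "PreToolUse:Bash"}},
    {"type": "user", "message": "run the tests"},
    {"type": "attachment", "attachment": {"type": "hook_success", "hookName": "PreToolUse:Bash"}},
    {"type": "attachment", "attachment": {"type": "task_reminder", "itemCount": 0}},
    {"type": "attachment", "attachment": {"type": "task_reminder", "itemCount": 0}},
]
print(len(history), "->", len(dedupe_attachments(history)))  # 5 -> 3
```

Because identical task_reminder payloads hash to the same key, this also covers the itemCount: 0 part of request 2 as a side effect.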

Workarounds the user tried (limited effect)

  • Removing dormant UserPromptSubmit hooks (prd-detector, skill-router, disabled-plugin leftovers) slightly reduces attachment creation rate.
  • Using TodoWrite eagerly prevents the itemCount: 0 task_reminder from firing.
  • Running /clear resets JSONL entry count but loses context. In the 150-min monitoring window, 319 /clear resets were observed across 4 sessions, each followed by rapid re-accumulation — never a durable fix.
  • Running /compact or waiting for auto-compaction: observed three times in one 148-min session; each event spikes cache_creation by 455 K~558 K tokens (billed at 125% of input), and the very next turn's cache_read is back in the 500 K range — compaction is billed but does not meaningfully shrink the ongoing per-turn payload.
  • Uninstalling unused plugins reduces hook fan-out.
  • Avoiding Agent/Task tool use avoids creating subagent JSONL silos, but at the cost of losing parallelism.
  • Upgrading to v2.1.113 — no material change.

None address the underlying re-injection on every turn.


Reproduction artifacts

JSONL files referenced above (paths redacted):

<USER_HOME>/.claude/projects/<PROJECT_A>/<SESSION_ID_A>.jsonl  (2.2 MB)
<USER_HOME>/.claude/projects/<PROJECT_B>/<SESSION_ID_B>.jsonl  (2.5 MB)
<USER_HOME>/.claude/projects/<PROJECT_C>/<SESSION_ID_C>.jsonl  (7.2~8.9 MB, grew during monitoring)
<USER_HOME>/.claude/projects/<PROJECT_D>/<SESSION_ID_D>.jsonl  (3.4~4.6 MB)
<USER_HOME>/.claude/projects/<PROJECT_E>/<SESSION_ID_E>.jsonl  (0.1~1.2 MB, reset during monitoring)
<USER_HOME>/.claude/projects/<PROJECT_*>/subagents/agent-*.jsonl   (1,091 files machine-wide, largest 6.4 MB)
<USER_HOME>/.claude/history.jsonl                               (846,561 B, 2,674 entries, machine-wide)

Live-monitoring artifact:

<USER_HOME>/.claude/plugins/…/monitor.log    (JSON Lines, 314 snapshots at 30 s intervals,
                                              09:22–12:00 local time, 321 KB)

Schema of one monitor.log snapshot:

{
  "ts": "2026-04-20T11:48:32",
  "sessions": [
    {
      "session": "<PROJECT_C>",
      "size": 8850869,
      "mtime_ago_s": 4,
      "last_total_tokens": 661706,
      "last_input": 1,
      "last_cache_r": 661594,
      "last_cache_c": 111,
      "last_out": 1662,
      "last_version": "2.1.113"
    }
  ]
}

The numeric tables in §Evidence can be regenerated by (a) counting "type":"attachment", "task_reminder", "type":"last-prompt" occurrences in the JSONLs, (b) hashing attachment payloads for dedup counts, or (c) replaying the monitor.log to reconstruct the per-session time series.


Checklist before submission

  • Run one of the above JSONL files through the Python snippet in §Reference below and paste the output into the GitHub issue.
  • Attach (or link) a sanitized JSONL excerpt of a repeated-attachment block.
  • Confirm Claude Code version with claude --version and paste the exact string.
  • If possible, include /cost output from a session that hit the problem.
  • Optionally run the live-monitor script (§Monitor Script) for 30+ min in an active session and attach its monitor.log.
  • Redact any proprietary project names from the JSONL and from the paths before attaching.
  • Redact any personally-identifying strings from <USER_HOME>/.claude/history.jsonl if referenced.

Reference snippet for reproducing the numbers

# Usage: python this_script.py <USER_HOME>/.claude/projects/<PROJECT_ID>/<SESSION_ID>.jsonl
import json, hashlib, collections, sys

path = sys.argv[1]
counts = collections.Counter()
hashes = collections.Counter()
task_reminder_items = collections.Counter()
attachment_types = collections.Counter()
version_split = collections.defaultdict(lambda: {'entries':0, 'attach':0, 'tr0':0, 'lp':0})
last_prompt_hashes = set()
last_prompts = 0

with open(path, 'r', encoding='utf-8') as f:
    for line in f:
        try:
            d = json.loads(line)
        except Exception:
            continue
        t = d.get('type', '')
        v = d.get('version', 'unknown')
        counts[t] += 1
        version_split[v]['entries'] += 1
        if t == 'attachment':
            version_split[v]['attach'] += 1
            att = d.get('attachment', {})
            attachment_types[att.get('type', '?')] += 1
            h = hashlib.md5(json.dumps(att, sort_keys=True).encode()).hexdigest()[:12]
            hashes[h] += 1
            if att.get('type') == 'task_reminder':
                ic = att.get('itemCount', -1)
                task_reminder_items[ic] += 1
                if ic == 0:
                    version_split[v]['tr0'] += 1
        elif t == 'last-prompt':
            last_prompts += 1
            version_split[v]['lp'] += 1
            last_prompt_hashes.add(hashlib.md5(d.get('lastPrompt', '').encode()).hexdigest()[:12])

print('entry type counts           :', dict(counts))
print('attachment.type catalog     :', dict(attachment_types))
print('top repeated attachment     :', hashes.most_common(10))
print('task_reminder itemCount dist:', dict(task_reminder_items))
print(f'last-prompt: {last_prompts} entries, {len(last_prompt_hashes)} unique')
print('by version                  :', dict(version_split))

Monitor Script — capture live per-turn growth

Save as monitor.py and run in the background (python monitor.py &). Writes one JSON line per snapshot every 30 seconds.

"""
Snapshot every 30 s until the next 12:00 local time (today's noon, or tomorrow's
if it has already passed). Logs size + last assistant usage
(input / cache_read / cache_creation / output) for up to 8 of the most recently
modified session JSONLs touched within the last 10 minutes. One JSON line per
snapshot is appended to monitor.log.
"""
import json, os, glob, time, datetime

HERE = os.path.dirname(os.path.abspath(__file__))
LOG = os.path.join(HERE, 'monitor.log')
PROJECTS_ROOT = os.path.expanduser('~/.claude/projects')


def last_assistant_usage(path):
    try:
        last = None
        with open(path, 'r', encoding='utf-8') as f:
            for line in f:
                try:
                    d = json.loads(line)
                except Exception:
                    continue
                if d.get('type') == 'assistant' and d.get('message', {}).get('usage'):
                    last = d
        if not last:
            return None
        u = last['message']['usage']
        return {
            'v': last.get('version', ''),
            'in': u.get('input_tokens', 0),
            'cache_r': u.get('cache_read_input_tokens', 0),
            'cache_c': u.get('cache_creation_input_tokens', 0),
            'out': u.get('output_tokens', 0),
        }
    except Exception:
        return None


def snapshot():
    now = time.time()
    files = glob.glob(os.path.join(PROJECTS_ROOT, '**', '*.jsonl'), recursive=True)
    active = []
    for f in files:
        try:
            mt = os.path.getmtime(f)
            sz = os.path.getsize(f)
        except OSError:
            continue
        if now - mt < 600:
            active.append((f, mt, sz))
    active.sort(key=lambda x: -x[1])
    sessions = []
    for f, mt, sz in active[:8]:
        u = last_assistant_usage(f) or {}
        total = u.get('in', 0) + u.get('cache_r', 0) + u.get('cache_c', 0)
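        # Label each file by its project directory (the first path component
        # under PROJECTS_ROOT). Every JSONL under one project, including files
        # in subagents/, therefore shares the same label.
        rel = f[len(PROJECTS_ROOT):]
        for ch in ('/', os.sep):
            rel = rel.lstrip(ch)
        label = rel
        for ch in ('/', os.sep):
            if ch in label:
                label = label.split(ch, 1)[0]
                break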
        sessions.append({
            'session': label, 'size': sz, 'mtime_ago_s': int(now - mt),
            'last_total_tokens': total,
            'last_input': u.get('in', 0),
            'last_cache_r': u.get('cache_r', 0),
            'last_cache_c': u.get('cache_c', 0),
            'last_out': u.get('out', 0),
            'last_version': u.get('v', ''),
        })
    return {
        'ts': datetime.datetime.now().isoformat(timespec='seconds'),
        'sessions': sessions,
    }


def main():
    now = datetime.datetime.now()
    end = datetime.datetime.combine(now.date(), datetime.time(12, 0, 0))
    if end <= now:
        end = end + datetime.timedelta(days=1)
    with open(LOG, 'a', encoding='utf-8') as fp:
        fp.write(json.dumps({'ts': now.isoformat(timespec='seconds'), 'event': 'monitor_start',
                             'end_target': end.isoformat(timespec='seconds')}) + '\n')
        fp.flush()
        while datetime.datetime.now() < end:
            fp.write(json.dumps(snapshot(), ensure_ascii=False) + '\n')
            fp.flush()
            time.sleep(30)
        fp.write(json.dumps({'ts': datetime.datetime.now().isoformat(timespec='seconds'),
                             'event': 'monitor_end'}) + '\n')


if __name__ == '__main__':
    main()

Post-processing (time-series and reset detection):

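import json, time

p = 'monitor.log'
# Keep only snapshot lines; monitor_start / monitor_end events have no 'sessions' key.
with open(p, encoding='utf-8') as fp:
    snaps = [d for d in (json.loads(l) for l in fp if l.strip()) if 'sessions' in d]

def e(ts):
    """Parse a local ISO timestamp (seconds precision) into epoch seconds."""
    return time.mktime(time.strptime(ts, '%Y-%m-%dT%H:%M:%S'))

# session label -> [(epoch, size, total_tokens, cache_r, cache_c), ...]
series = {}
for s in snaps:
    for ses in s['sessions']:
        series.setdefault(ses['session'], []).append((e(s['ts']), ses['size'],
                                                      ses['last_total_tokens'],
                                                      ses['last_cache_r'],
                                                      ses['last_cache_c']))
for name, rec in series.items():
    # Reset event: total tokens fall below 50% of the previous snapshot's value
    # while the previous value was above 50 K (definition from §15.3).
    resets = sum(1 for i in range(1, len(rec))
                 if rec[i-1][2] > 50000 and rec[i][2] < rec[i-1][2] * 0.5)
    peak = max(rec, key=lambda r: r[2])
    print(f'{name}: snaps={len(rec)} resets={resets} peak_tokens={peak[2]}')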

extent analysis

TL;DR

The most likely fix for the issue of repeated context injection via JSONL attachments causing linear token bloat across long sessions is to deduplicate attachment entries by content hash before including them in API payload history reconstruction.

Guidance

  1. Deduplicate attachment entries: Before including them in API payload history reconstruction, deduplicate attachment entries by content hash to prevent re-injection of the same content on every turn.
  2. Coalesce task_reminder: Emit task_reminder at most once per N turns or once per distinct itemCount change, and suppress the itemCount: 0 variant beyond the first occurrence to reduce unnecessary reminders.
  3. Trim last-prompt storage: Store a reference or hash of the last prompt instead of a full copy on each submit to reduce storage and re-injection of duplicate prompts.
  4. Review and adjust auto-compaction: Make auto-compaction actually lossy by discarding or summarizing the next turn's persisted history after a compaction event to prevent immediate re-accumulation.

Example

The reference snippet in the issue body hashes each attachment payload and counts how often each hash repeats. It measures the duplication in a session JSONL; it does not remove duplicates or change what Claude Code sends.

Notes

  • The provided guidance focuses on the most critical steps to address the issue directly and may not cover all potential optimizations or fixes mentioned in the issue.
  • Implementing these changes may require adjustments to the Claude Code application and its handling of session data and API requests.

Recommendation

Deduplicating attachment entries and coalescing task_reminder are changes to Claude Code itself, not steps a user can apply. Until such a fix ships, the only available mitigations are the ones listed under "Workarounds the user tried" (removing dormant hooks, using TodoWrite eagerly, uninstalling unused plugins), and the issue reports that none of them stops the per-turn re-injection.
