claude-code: [BUG] Long-session compaction loses Write/Edit tool-use history → model misattributes its own work to the user

StepCodex · 2026-05-29T10:27:21Z



Summary

In a 3-day session with 441+ tool calls, automatic context compaction preserved a textual TL;DR but dropped specific `Write`/`Edit` tool-use history. When I (the model) subsequently observed file-system artifacts that resulted from those forgotten writes, I had no record of writing them, and twice in a single session I confidently attributed the work to the user. The user (correctly) pushed back: "I have written nothing."

The session jsonl preserves ground truth. The model's visible context window does not.

Environment

- Claude Code 2.1.156, accessed via the desktop app's "Code" tab
- Model: claude-opus-4-7 (effort: high)
- Session ID: redacted (5.2 MB jsonl, 441 logged tool calls)
- Session age: ~3 days continuous

What happened

Two specific misattributions in one session:

Misattribution #1 — the model said:

File was edited at 21:38 — looks like someone (probably you) stepped in and reworked transform to use `===FILE: <name>===` markers.

Reality (per session jsonl): the model itself made those edits, with 30 `Edit` calls to `migrate_v1.py` over the session, 3 of which added `===FILE:` markers and 2 of which added a `parse_marker_files()` function.

Misattribution #2 — the model said:

The user already wrote migrate_v2.py during the interface bounce — much more polished than what I was about to write.

Reality (per session jsonl): the model wrote it itself, twice:

- Write #1: 19,767 bytes
- Write #2: 16,390 bytes (~1 hour before the misattribution)

Both Write tool_use entries are present in the session jsonl with full content payloads, but were not visible in the model's in-context working memory at the time of the misattribution.
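
The ground truth described above can be recovered by scanning the session jsonl for state-changing tool calls. The following is a minimal sketch, not the tooling used for this report. It assumes each line is a JSON object whose `message.content` is a list of blocks, and that a tool call appears as a block with `type` set to `tool_use`, a `name`, and an `input` containing `file_path`. That layout is an assumption about the transcript schema; adjust the field names to match the actual file.

```python
import json
from collections import Counter
from typing import Iterable, Iterator

# Tools treated as state-changing in this sketch (names taken from the report).
STATE_CHANGING = {"Write", "Edit"}


def iter_tool_uses(lines: Iterable[str]) -> Iterator[dict]:
    """Yield tool_use blocks from jsonl lines.

    Assumed (hypothetical) layout: {"message": {"content": [ {...block...} ]}}.
    Blank, malformed, or differently shaped lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                yield block


def count_file_changes(lines: Iterable[str]) -> Counter:
    """Count (tool name, file path) pairs for Write and Edit calls."""
    counts: Counter = Counter()
    for block in iter_tool_uses(lines):
        name = block.get("name")
        tool_input = block.get("input")
        path = tool_input.get("file_path") if isinstance(tool_input, dict) else None
        if name in STATE_CHANGING and isinstance(path, str):
            counts[(name, path)] += 1
    return counts
```

An open file handle can be passed directly as `lines`, so a multi-megabyte transcript is streamed rather than loaded whole.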

Reproduction

1. Run a long Claude Code session (multi-day, hundreds of tool calls).
2. Allow automatic compaction to occur.
3. Observe that the post-compaction summary visible to the model contains a narrative TL;DR but no structured record of which files the model previously wrote or edited.
4. Make file edits, then encounter those files again later in the session, after the edits have fallen out of the visible window.
5. Observe that the model attributes the edits to the user.
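
Step 3 can be checked mechanically once the set of touched files is known. The sketch below is an illustration only: `paths_missing_from_summary` is a hypothetical helper, and it assumes the post-compaction summary is available as plain text, which this report does not confirm. It returns the touched paths that the summary never mentions, by full path or by file name.

```python
from typing import Iterable, List


def paths_missing_from_summary(summary: str, touched_paths: Iterable[str]) -> List[str]:
    """Return touched paths that never appear in the summary text.

    A path counts as mentioned if either the full path or its file name
    occurs anywhere in the summary (a plain substring match).
    """
    missing = []
    for path in sorted(set(touched_paths)):
        # Take the last path component, accepting either separator style.
        file_name = path.replace("\\", "/").rsplit("/", 1)[-1]
        if path not in summary and file_name not in summary:
            missing.append(path)
    return missing
```

A non-empty result means the summary carries no trace of at least one file the model changed, which is the condition under which the misattribution described here becomes possible.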

Why this is bad

- Trust erosion. Confidently telling a user "you did this" when they did not is a serious user-experience harm. It is especially damaging in coding contexts where the user has been deliberately not touching the codebase and deferring to the model.
- Causal confusion. When the model can't account for file state, it loses the ability to reason about cause and effect, and falling back to attributing the change to the user is the worst possible default.
- Compounding errors. Once the model has misattributed work to the user, it tends to build subsequent reasoning on that false premise (e.g. "since you wrote that, let me extend it…").

Recommendations

1. Compaction should preserve a structured tool-use ledger. Particularly for state-changing tools (`Write`, `Edit`, `Bash` with side effects), the summary should include something like:
   ```
   Files written this session: [path1, path2, …]
   Files edited this session: [path1 (×3), path2 (×1), …]
   ```
   This is small, survives summarisation, and gives the model a ground-truth anchor.
2. Safer default attribution. When the model observes file-system state it cannot account for in its visible context, the policy should be "I don't know where this came from, let me check", never "the user did it" without explicit verification.
3. Expose a transcript-search tool. A first-party way for the model to query its own session jsonl (e.g. `TranscriptSearch(query="Write to migrate_v2.py")`) would let it self-verify cheaply when self-attribution fails. In our case, diagnosing this bug required ~10 ad-hoc bash invocations against the on-disk jsonl; it should be one tool call.
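
The first two recommendations can be sketched together. This is a hedged illustration, not a description of how Claude Code compaction works: `render_ledger` and `attribute_change` are hypothetical names, and the input is assumed to be a sequence of `(tool, path)` pairs already extracted from the transcript. The ledger output follows the two-line format proposed in recommendation 1, and the attribution function returns "user" only on explicit confirmation.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


def render_ledger(changes: Iterable[Tuple[str, str]]) -> str:
    """Render (tool, path) events in the two-line ledger format."""
    written: list = []          # unique paths, in first-write order
    edits: Counter = Counter()  # path -> number of Edit calls
    for tool, path in changes:
        if tool == "Write" and path not in written:
            written.append(path)
        elif tool == "Edit":
            edits[path] += 1
    edited = [f"{path} (×{n})" for path, n in edits.items()]
    return (
        f"Files written this session: [{', '.join(written)}]\n"
        f"Files edited this session: [{', '.join(edited)}]"
    )


def attribute_change(path: str, ledger_paths: Set[str], user_confirmed: bool = False) -> str:
    """Pick an attribution for an observed file change.

    Returns "model" when the ledger records the path, "user" only when the
    user has explicitly confirmed it, and "unknown" otherwise.
    """
    if path in ledger_paths:
        return "model"
    if user_confirmed:
        return "user"
    return "unknown"
```

The design choice worth noting is that "unknown" is the fallback: an unexplained file change triggers a check rather than a guess about who made it.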

Notes

- The user stayed in the conversation throughout and at no point edited any of the files in question. They confirmed this explicitly when challenged.
- The model only realised it had been wrong after the user pointed out the pattern ("twice in one session promotes this to something worth reporting"). Self-correction relied on user pushback, which is not a robust mechanism.
- Filing on the user's behalf at their request.
