claude-code - 💡(How to fix) Fix [BUG] Opus 4.7: In-session corrections not retained — structurally unreliable for autonomous operation

claude-code · 2026-04-20 10:59:15

Running an 8-agent autonomous fleet on Claude Code CLI with Opus 4.7 for 72 consecutive 10-minute watchdog cycles (~12 hours), we observed that corrections made within the same session were not retained. This makes Opus 4.7 structurally unreliable for autonomous (unattended) operation compared to Opus 4.6.

Description

Reproduction

Run Claude Code CLI with --model claude-opus-4-7[1m] in an autonomous loop (watchdog sends prompts every 10 min via claude -p)
Correct a behavioral error (e.g., wrong agent dispatch, scope misread)
Write the correction to persistent memory (file on disk that is read each cycle)
Observe the identical error recurring within 3-6 cycles in the same session
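The loop in the steps above could be sketched roughly as follows. This is a minimal illustration, not the reporter's actual watchdog: the function names (build_command, run_cycles, cli_runner) are hypothetical, the argument order is an assumption, and only the -p, --continue, and --model options mentioned in the report are used.

```python
import subprocess
import time
from typing import Callable, List

MODEL = "claude-opus-4-7[1m]"  # model string as written in the report
CYCLE_SECONDS = 600            # 10-minute watchdog cycle, per the report


def build_command(prompt: str, model: str = MODEL) -> List[str]:
    """Build the argv for one non-interactive cycle that continues the prior session."""
    return ["claude", "-p", prompt, "--continue", "--model", model]


def run_cycles(
    prompt: str,
    cycles: int,
    runner: Callable[[List[str]], str],
    sleep: Callable[[float], None] = time.sleep,
    interval: float = CYCLE_SECONDS,
) -> List[str]:
    """Send the prompt once per cycle and collect each cycle's output."""
    outputs = []
    for cycle in range(cycles):
        outputs.append(runner(build_command(prompt)))
        if cycle < cycles - 1:
            sleep(interval)  # wait for the next cycle, but not after the last one
    return outputs


def cli_runner(argv: List[str]) -> str:
    """Run the CLI for real; assumes a `claude` executable is on PATH."""
    result = subprocess.run(argv, capture_output=True, text=True, check=False)
    return result.stdout
```

Passing the runner and sleep functions as parameters keeps the loop testable without invoking the CLI; a real run would call run_cycles(prompt, 72, runner=cli_runner).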

Observed behavior

14 corrections required in ~6 hours (one every ~25 minutes). With the watchdog cycling every 10 minutes, that is roughly one operator correction every 2 to 3 cycles.
Same errors recurred after correction within the same session. Example: role-assignment error corrected in cycle 1, written to persistent memory file, recurred in cycle 7. Same session, same context window.
Read:Edit ratio dropped from 6.6 on Opus 4.6 to 2.0 on Opus 4.7. The model acts before fully reading referenced files: it closes tasks after checking 1 of 8 files and repeatedly mis-scopes follow-ups despite the source material being on disk.
Net useful throughput (actions minus corrections minus self-inflicted cleanup) was lower than 4.6 despite higher raw action count.
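The figures above can be derived from simple counts. The sketch below shows the arithmetic only; the function names are hypothetical, and the report does not say how reads, edits, or cleanup actions were counted, so the inputs are assumptions.

```python
def minutes_per_correction(corrections: int, hours: float) -> float:
    """Average wall-clock minutes between operator corrections."""
    return hours * 60 / corrections


def cycles_per_correction(corrections: int, hours: float, cycle_minutes: float = 10.0) -> float:
    """Average number of watchdog cycles between operator corrections."""
    return minutes_per_correction(corrections, hours) / cycle_minutes


def read_edit_ratio(reads: int, edits: int) -> float:
    """File reads per edit; the counting method is an assumption."""
    return reads / edits


def net_useful_throughput(actions: int, corrections: int, cleanup_actions: int) -> int:
    """Actions minus corrections minus self-inflicted cleanup, as defined in the report."""
    return actions - corrections - cleanup_actions


# Using the report's numbers: 14 corrections in about 6 hours.
print(round(minutes_per_correction(14, 6), 1))  # about 25.7 minutes
print(round(cycles_per_correction(14, 6), 1))   # about 2.6 ten-minute cycles
```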

Expected behavior

Corrections made within a session and written to persistent files should be retained for the remainder of that session. A model receiving explicit correction + seeing the correction in a persistent file it reads every cycle should not repeat the corrected behavior.
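One way to make the persistent file harder to overlook is to have the watchdog inline its contents into every prompt instead of relying on the model to re-read it. This is a variant of the setup described in the report, not the reporter's method, and nothing here is claimed to resolve the issue; the file layout, function names, and prompt wording are all hypothetical.

```python
from pathlib import Path


def record_correction(memory_file: Path, cycle: int, text: str) -> None:
    """Append one operator correction to the persistent memory file."""
    with memory_file.open("a", encoding="utf-8") as fh:
        fh.write(f"[cycle {cycle}] {text}\n")


def build_prompt(memory_file: Path, task: str) -> str:
    """Prefix the task with every recorded correction so each cycle sees them first."""
    corrections = ""
    if memory_file.exists():
        corrections = memory_file.read_text(encoding="utf-8")
    return (
        "Standing corrections (apply before acting):\n"
        f"{corrections}\n"
        f"Task:\n{task}"
    )
```

The returned string would be the prompt passed to claude -p on each cycle, so a correction recorded in cycle 1 is restated verbatim in cycle 7.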

Environment

Claude Code CLI (latest as of 2026-04-19)
Model: claude-opus-4-7[1m] (1M context)
Platform: Windows 11, --chrome flag enabled
Session mode: claude -p "<prompt>" called every 10 min by a Python watchdog, --continue for persistent context
Fleet: 8 specialized agents coordinated via file-based mesh

Decision methodology

We ran a formal adversarial review: a separate clean-context Claude instance (Opus 4.6, max effort) reviewed the raw session transcript with zero knowledge of our preferences. Its independent conclusion:

"4.7 is a slightly smarter model that requires a babysitter. 4.6 is a more disciplined model that does what it's told. For an autonomous agent where the operator wants to walk away, discipline beats intelligence."

Comparison with Opus 4.6

Switched back to Opus 4.6 and ran 80+ cycles overnight with zero corrections needed. Same fleet, same watchdog, same prompts, same BACKLOG. The correction-retention issue appears specific to 4.7.

Related issues

#50999 (could not produce trustworthy plan without repeated correction)
#32963 (degrades after ~6 hours)
#50235 (hallucinations)

Impact

For teams running autonomous agents (unattended operation), this behavioral regression makes Opus 4.7 unsuitable regardless of its benchmark improvements. The operator's time is the scarcest resource — a model that requires 14 corrections in 6 hours is not autonomous.

extent analysis

TL;DR

Downgrade to Opus 4.6 to resolve the correction retention issue in autonomous agent operation.

Guidance

The issue seems to be specific to Opus 4.7, as switching back to Opus 4.6 resolved the problem with zero corrections needed.
To verify the issue, run the same test with Opus 4.6 and Opus 4.7, and compare the number of corrections required.
Consider implementing additional logging or monitoring to track corrections and errors in real-time, to better understand the issue and identify potential workarounds.
Review related issues (#50999, #32963, #50235) to see if they are also related to the correction retention problem.
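The logging suggested above could be as small as an event list with a recurrence check. The sketch below is an assumption about what such tracking might look like; the Event fields, the error_id labels, and the function names are hypothetical and do not come from the report.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List


@dataclass
class Event:
    cycle: int
    error_id: str  # operator-chosen label, e.g. "role-assignment"
    kind: str      # "error" or "correction"


def to_jsonl(events: Iterable[Event]) -> str:
    """Serialize events as JSON Lines for an append-only log."""
    return "\n".join(json.dumps(asdict(e)) for e in events)


def recurrences(events: Iterable[Event]) -> Dict[str, List[int]]:
    """Return, per error_id, the cycles where it reappeared after its first correction."""
    corrected_at: Dict[str, int] = {}
    found: Dict[str, List[int]] = {}
    # sorted() is stable, so events within one cycle keep their logged order
    for e in sorted(events, key=lambda ev: ev.cycle):
        if e.kind == "correction":
            corrected_at.setdefault(e.error_id, e.cycle)
        elif e.kind == "error" and e.error_id in corrected_at:
            if e.cycle > corrected_at[e.error_id]:
                found.setdefault(e.error_id, []).append(e.cycle)
    return found
```

Fed the example from the report (a role-assignment error corrected in cycle 1 and seen again in cycle 7), recurrences would flag cycle 7 for that label, which gives a per-model count to compare between Opus 4.6 and Opus 4.7 runs.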

Example

The original report includes no code snippet, because the issue concerns the behavior of the Opus model rather than a specific code implementation.

Notes

The issue appears to be specific to autonomous agent operation, and the problem may not be present in attended operation. The decision to downgrade to Opus 4.6 is based on the formal adversarial review and the comparison with Opus 4.6.

Recommendation

Downgrade to Opus 4.6, which proved more reliable in autonomous agent operation, with zero corrections needed in 80+ cycles. Despite Opus 4.7's benchmark improvements, the correction-retention issue makes it unsuitable for unattended operation.
