claude-code: [BUG] Opus 4.7: In-session corrections not retained, structurally unreliable for autonomous operation

Description

Running an 8-agent autonomous fleet on Claude Code CLI with Opus 4.7 for 72 consecutive 10-minute watchdog cycles (~12 hours), we observed that corrections made within the same session were not retained. This makes Opus 4.7 structurally unreliable for autonomous (unattended) operation compared to Opus 4.6.

Reproduction

  1. Run Claude Code CLI with --model claude-opus-4-7[1m] in an autonomous loop (watchdog sends prompts every 10 min via claude -p)
  2. Correct a behavioral error (e.g., wrong agent dispatch, scope misread)
  3. Write the correction to persistent memory (file on disk that is read each cycle)
  4. Observe the identical error recurring within 3-6 cycles in the same session
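The loop in step 1 can be sketched roughly as follows. This is a minimal illustration, not the reporter's actual watchdog: the function names (`build_command`, `run_cycle`, `cli_runner`, `watchdog`) are hypothetical, and the flag ordering is an assumption. Only `claude -p`, `--model claude-opus-4-7[1m]`, `--continue`, and the 10-minute interval come from the report.

```python
import subprocess
import time
from typing import Callable, Sequence

MODEL = "claude-opus-4-7[1m]"  # model string as given in the report
INTERVAL_SECONDS = 10 * 60     # 10-minute watchdog cycle from the report


def build_command(prompt: str, model: str = MODEL) -> list[str]:
    """Build the CLI call for one cycle (flag order is an assumption)."""
    return ["claude", "-p", prompt, "--model", model, "--continue"]


def run_cycle(prompt: str, runner: Callable[[Sequence[str]], str]) -> str:
    """Run one cycle through an injectable runner so the loop can be tested offline."""
    return runner(build_command(prompt))


def cli_runner(cmd: Sequence[str]) -> str:
    """Real runner: invoke the CLI and return whatever it printed to stdout."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def watchdog(prompt: str, cycles: int, runner=cli_runner, sleep=time.sleep) -> list[str]:
    """Send the same prompt once per cycle, sleeping between cycles."""
    outputs = []
    for i in range(cycles):
        outputs.append(run_cycle(prompt, runner))
        if i < cycles - 1:
            sleep(INTERVAL_SECONDS)
    return outputs
```

Passing a fake `runner` and a no-op `sleep` lets the loop be exercised without invoking the CLI or waiting 10 minutes per cycle.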

Observed behavior

  • 14 corrections required in ~6 hours (1 every ~25 minutes). With the watchdog cycling every 10 minutes, that is roughly one operator correction every 2 to 3 cycles.
  • The same errors recurred after correction within the same session. Example: a role-assignment error was corrected in cycle 1 and written to the persistent memory file, then recurred in cycle 7. Same session, same context window.
  • Read:Edit ratio dropped from 6.6 on Opus 4.6 to 2.0 on Opus 4.7. The model acts before fully reading referenced files: it closes tasks after checking 1 of 8 files and repeatedly mis-scopes follow-ups even though the source material is on disk.
  • Net useful throughput (actions minus corrections minus self-inflicted cleanup) was lower than on 4.6 despite a higher raw action count.
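The figures above can be reproduced with a few small helpers. This is a sketch under assumptions: the report does not say how Read:Edit ratio was counted, so it is taken here as the number of read tool calls divided by the number of edit tool calls, and all function names are hypothetical. Only the throughput formula (actions minus corrections minus cleanup) and the input numbers 14 corrections, ~6 hours, and 10-minute cycles come from the report.

```python
def minutes_per_correction(corrections: int, hours: float) -> float:
    """Average minutes between operator corrections."""
    return hours * 60 / corrections


def cycles_per_correction(corrections: int, hours: float,
                          cycle_minutes: float = 10.0) -> float:
    """Average number of watchdog cycles between corrections."""
    return minutes_per_correction(corrections, hours) / cycle_minutes


def read_edit_ratio(reads: int, edits: int) -> float:
    """Reads per edit (assumed definition: read calls / edit calls)."""
    if edits == 0:
        raise ValueError("no edits recorded; ratio is undefined")
    return reads / edits


def net_useful_throughput(actions: int, corrections: int, cleanup: int) -> int:
    """Net useful throughput as defined in the report."""
    return actions - corrections - cleanup


if __name__ == "__main__":
    # 14 corrections in ~6 hours, as reported.
    print(round(minutes_per_correction(14, 6), 1))  # about 25.7 minutes
    print(round(cycles_per_correction(14, 6), 1))   # about 2.6 cycles
```

With the reported inputs, this works out to about one correction every 2.6 cycles.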

Expected behavior

Corrections made within a session and written to persistent files should be retained for the remainder of that session. A model that receives an explicit correction and sees that correction in a persistent file it reads every cycle should not repeat the corrected behavior.
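One way to check this expectation mechanically is to scan a per-cycle event log for errors that reappear after they were corrected. The sketch below assumes a hypothetical log format (`Event` with a cycle number, a kind, and an error identifier). The report does not describe how its corrections were recorded, so treat the names and structure as illustrative only.

```python
from typing import Iterable, NamedTuple


class Event(NamedTuple):
    cycle: int      # watchdog cycle number
    kind: str       # "error" or "correction" (hypothetical labels)
    error_id: str   # stable name for the behavior, e.g. "role-assignment"


def find_recurrences(events: Iterable[Event]) -> list[tuple[str, int, int]]:
    """Return (error_id, corrected_cycle, recurred_cycle) for each error
    observed in a cycle later than its most recent correction."""
    corrected_at: dict[str, int] = {}
    recurrences = []
    # sorted() is stable, so events within one cycle keep their input order.
    for ev in sorted(events, key=lambda e: e.cycle):
        if ev.kind == "correction":
            corrected_at[ev.error_id] = ev.cycle
        elif ev.kind == "error":
            fixed = corrected_at.get(ev.error_id)
            if fixed is not None and ev.cycle > fixed:
                recurrences.append((ev.error_id, fixed, ev.cycle))
    return recurrences


if __name__ == "__main__":
    # The example from the report: corrected in cycle 1, recurred in cycle 7.
    log = [
        Event(1, "error", "role-assignment"),
        Event(1, "correction", "role-assignment"),
        Event(7, "error", "role-assignment"),
    ]
    print(find_recurrences(log))
```

An empty result for a session would mean that session met the expectation stated above.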

Environment

  • Claude Code CLI (latest as of 2026-04-19)
  • Model: claude-opus-4-7[1m] (1M context)
  • Platform: Windows 11, --chrome flag enabled
  • Session mode: claude -p "<prompt>" called every 10 min by a Python watchdog, --continue for persistent context
  • Fleet: 8 specialized agents coordinated via file-based mesh

Decision methodology

We ran a formal adversarial review: a separate clean-context Claude instance (Opus 4.6, max effort) reviewed the raw session transcript with zero knowledge of our preferences. Its independent conclusion:

"4.7 is a slightly smarter model that requires a babysitter. 4.6 is a more disciplined model that does what it's told. For an autonomous agent where the operator wants to walk away, discipline beats intelligence."

Comparison with Opus 4.6

Switched back to Opus 4.6 and ran 80+ cycles overnight with zero corrections needed. Same fleet, same watchdog, same prompts, same BACKLOG. The correction-retention issue appears specific to 4.7.

Related issues

  • #50999 (could not produce trustworthy plan without repeated correction)
  • #32963 (degrades after ~6 hours)
  • #50235 (hallucinations)

Impact

For teams running autonomous agents (unattended operation), this behavioral regression makes Opus 4.7 unsuitable regardless of its benchmark improvements. The operator's time is the scarcest resource, and a model that requires 14 corrections in 6 hours is not autonomous.

Extent analysis

TL;DR

Downgrade to Opus 4.6 to resolve the correction retention issue in autonomous agent operation.

Guidance

  • The issue seems to be specific to Opus 4.7, as switching back to Opus 4.6 resolved the problem with zero corrections needed.
  • To verify the issue, run the same test with Opus 4.6 and Opus 4.7, and compare the number of corrections required.
  • Consider implementing additional logging or monitoring to track corrections and errors in real-time, to better understand the issue and identify potential workarounds.
  • Review related issues (#50999, #32963, #50235) to see if they are also related to the correction retention problem.
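The logging and side-by-side comparison suggested above could look something like this. It is a minimal sketch, assuming corrections are appended to a JSON Lines file by the operator or watchdog. The file layout, field names, and function names are hypothetical and are not part of Claude Code.

```python
import json
from collections import Counter
from pathlib import Path


def log_correction(path: str, model: str, cycle: int, note: str) -> None:
    """Append one correction record as a JSON line."""
    record = {"model": model, "cycle": cycle, "note": note}
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def corrections_by_model(path: str) -> dict[str, int]:
    """Count logged corrections per model label."""
    counts: Counter[str] = Counter()
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                counts[json.loads(line)["model"]] += 1
    return dict(counts)


def corrections_per_cycle(corrections: int, cycles: int) -> float:
    """Normalize by cycle count so runs of different length are comparable."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return corrections / cycles
```

Normalizing by cycle count matters because the two runs in the report differ in length (about 6 hours of corrections on 4.7 versus 80+ cycles on 4.6).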

Example

No code snippet is provided as the issue is related to the behavior of the Opus model, not a specific code implementation.

Notes

The issue appears to be specific to autonomous agent operation, and the problem may not be present in attended operation. The decision to downgrade to Opus 4.6 is based on the formal adversarial review and the comparison with Opus 4.6.

Recommendation

Downgrade to Opus 4.6, which needed zero corrections in 80+ cycles of autonomous agent operation. Despite Opus 4.7's benchmark improvements, the correction retention issue makes it unsuitable for unattended operation.
