claude-code - 💡(How to fix) Fix Agentic sessions: fabricated confidence and approach-thrashing instead of finding prior art [1 participants]

GitHub stats
anthropics/claude-code#55160 · Fetched 2026-05-01 05:44:41
Comments: 0 · Participants: 1 · Timeline: 2 · Reactions: 0
Timeline (top): labeled ×2

Fix Action

Fix / Workaround

A possible mitigation: train the model against explicit anti-patterns in long sessions, e.g., "if you don't have data for a number, don't assert one"; "if an approach failed, the next attempt should come from prior-art search, not the next variation in your head"; "don't surrender; keep working."

RAW_BUFFER

Pattern observed in long agentic coding sessions

The model often:

  • Fabricates supporting numbers it cannot back up when challenged (latency estimates, success rates, "first to do X" claims).
  • Cycles through implementation attempts when stuck, instead of stepping back to find an established pattern - even when one exists in upstream public code.
  • Anthropomorphizes its own behavior ("I felt embarrassed") rather than describing the output pattern.
  • Pre-narrates the next failure to appear prescient instead of shipping and waiting for evidence.
  • Performs surrender ("I'm out") which the user has to override before the model continues.
  • Cargo-cults defensive code (broad except, defensive getattr) even when explicit project rules prohibit it.

Each gets corrected when the user calls it out, but the pattern recurs in the next iteration. The common thread is optimizing for how the next message lands rather than converging on a correct answer over the conversation. A small ticket can take hours and require the user to override the model at every fork.
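
The issue does not include the offending code, so the following is only an illustrative sketch of the "broad except, defensive getattr" item. The Settings class and both function names are hypothetical. It contrasts a silent fallback with an explicit check that fails where the problem can be diagnosed.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical config object, used only for illustration.
    timeout_s: float


def timeout_defensive(settings: object) -> float:
    # Anti-pattern: the getattr default and the broad except hide a missing
    # or malformed setting behind a silent fallback value.
    try:
        return float(getattr(settings, "timeout_s", 30.0))
    except Exception:
        return 30.0


def timeout_explicit(settings: Settings) -> float:
    # Explicit version: rely on the declared type and raise on a bad value
    # at the point where it can be diagnosed.
    if settings.timeout_s <= 0:
        raise ValueError(f"timeout_s must be positive, got {settings.timeout_s}")
    return settings.timeout_s
```

With the defensive version, passing an object that has no timeout_s attribute quietly yields 30.0; the explicit version would surface that mistake as an error instead.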

A possible mitigation: explicit anti-patterns the model is trained against in long sessions - e.g., "if you don't have data for a number, don't assert one"; "if an approach failed, the next attempt should come from prior-art search, not the next variation in your head"; "don't surrender; keep working."

extent analysis

TL;DR

The issue proposes training the model against explicit anti-patterns (fabricating numbers, cycling through implementation variants, surrendering) so that it converges on a correct answer over the conversation rather than optimizing for how the next message lands.

Guidance

  • Identify and document the specific anti-patterns that the model exhibits, such as fabricating numbers or surrendering, to inform training data.
  • Develop training rules that discourage these anti-patterns, such as "if you don't have data for a number, don't assert one".
  • Consider implementing a "prior-art search" approach to help the model find established patterns in upstream public code when stuck.
  • Evaluate the model's performance over multiple iterations to ensure that the anti-patterns are not recurring.
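
As a rough sketch of the last point, recurrence could be measured by tagging assistant messages and counting how often a pattern reappears after its first occurrence. The phrase heuristics below are assumptions built from the two quoted examples in the issue ("I'm out", "I felt embarrassed"); a real evaluation would need labeled transcripts rather than regular expressions.

```python
import re
from collections import Counter

# Hypothetical phrase heuristics, not a validated classifier.
ANTI_PATTERNS = {
    "surrender": re.compile(r"\bI'm out\b|\bI give up\b", re.IGNORECASE),
    "anthropomorphizing": re.compile(r"\bI felt \w+", re.IGNORECASE),
}


def tag_message(text: str) -> set[str]:
    """Return the names of anti-patterns whose heuristic matches the text."""
    return {name for name, rx in ANTI_PATTERNS.items() if rx.search(text)}


def recurrences(assistant_messages: list[str]) -> Counter:
    """Count occurrences of each anti-pattern after its first appearance."""
    seen: set[str] = set()
    repeats: Counter = Counter()
    for text in assistant_messages:
        for name in tag_message(text):
            if name in seen:
                repeats[name] += 1
            seen.add(name)
    return repeats
```

Counting only repeats after the first appearance is a proxy for "recurs after the user called it out"; it assumes the first occurrence was corrected, which the transcript alone may not confirm.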

Example

No code snippet is provided as the issue does not contain specific code examples.
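
Purely as an illustration, two of the quoted rules can be written as small guards in an agent harness. Everything here (AttemptLog, assert_number, the reference strings) is hypothetical and not part of Claude Code or the issue; it only shows the shape of "no next attempt without prior art" and "no number without data".

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AttemptLog:
    """Tracks attempts on one task; all names here are hypothetical."""
    failures: int = 0
    prior_art: list[str] = field(default_factory=list)

    def record_failure(self) -> None:
        self.failures += 1
        # A new failure invalidates earlier references: the next attempt
        # needs fresh prior art, not another variation.
        self.prior_art.clear()

    def record_prior_art(self, reference: str) -> None:
        self.prior_art.append(reference)

    def may_attempt(self) -> bool:
        # The first attempt is free; after any failure, require a reference.
        return self.failures == 0 or bool(self.prior_art)


def assert_number(value: float, source: Optional[str]) -> str:
    """Refuse to state a number that has no recorded source."""
    if not source:
        raise ValueError("no data for this number; do not assert one")
    return f"{value} (source: {source})"
```

A harness using this sketch would call record_failure() when a change fails verification and refuse to start another attempt until record_prior_art() has been given a concrete reference.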

Notes

The effectiveness of this approach may depend on the quality and diversity of the training data, as well as the model's ability to generalize from the anti-patterns.

Recommendation

Apply workaround: train the model against the explicit anti-patterns described in the issue. This targets the observed behaviors directly, although the issue offers it only as a possible mitigation and reports no evidence yet that it works.
