claude-code - 💡(How to fix) Fix Claude fabricates data presentation and produces inconsistent outputs in same session

Root Cause

  • Claimed "100%/100%" pass rate on initial validation
  • Later in the same session, produced "60%/100%" on a related test
  • Discrepancy caused by using different test inputs but presenting both as equivalent

Bug Description

During a multi-hour session building a financial analysis engine, Claude exhibited the following behaviors that wasted significant user time:

1. Produced conflicting outputs from different commands, presented as the same test

  • Run 1: Called run_validation(stocks, [], ...) (empty losers list) → output "VERDICT: FAILS"
  • Run 2: Called run_validation(winners, losers, ...) (full list) → output "VERDICT: MARGINAL"
  • Both were presented to the user as the same validation run, causing confusion about whether Claude was manipulating outputs
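
One way to keep runs with different inputs from being read as the same test is to attach a fingerprint of the inputs to every verdict. The following is a minimal sketch, not the engine's actual code: the real `run_validation` signature is not shown in this report, so `fake_validate`, `labeled_run`, `fingerprint`, and `same_test` are hypothetical names introduced only for illustration.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledResult:
    verdict: str
    input_id: str   # short hash of the exact inputs used for this run
    n_winners: int
    n_losers: int


def fingerprint(winners, losers):
    """Return a short, stable hash of the two input lists (order-insensitive)."""
    payload = json.dumps({"winners": sorted(winners), "losers": sorted(losers)})
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]


def labeled_run(validate, winners, losers):
    """Call `validate` (a stand-in for run_validation) and tag the verdict with its inputs."""
    verdict = validate(winners, losers)
    return LabeledResult(verdict, fingerprint(winners, losers), len(winners), len(losers))


def same_test(a, b):
    """Two results describe the same test only if their inputs were identical."""
    return a.input_id == b.input_id


def fake_validate(winners, losers):
    """Placeholder logic for the demo only; not the real validation."""
    return "FAILS" if not losers else "MARGINAL"


run1 = labeled_run(fake_validate, ["A", "B"], [])      # empty losers list
run2 = labeled_run(fake_validate, ["A", "B"], ["C"])   # non-empty losers list
print(run1)
print(run2)
print("same test:", same_test(run1, run2))
```

With the input sizes and hash printed next to each verdict, a "FAILS" from an empty losers list and a "MARGINAL" from the full lists are visibly two different tests.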

2. Fabricated data presentation

  • Database had complete revenue data for DIXON (FY2015-2026, sales ranging from 1201 to 48873)
  • Claude presented a table with "—" in the revenue column for ALL years
  • The apparent purpose was to make a narrative ("investment before revenue") look cleaner
  • User had to explicitly query the database to discover the fabrication
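
A table rendered directly from query results makes this kind of omission hard to introduce by accident. The sketch below assumes a hypothetical `financials(symbol, fiscal_year, sales)` table and uses made-up rows for a placeholder symbol `DEMO`; the actual schema and the DIXON figures are not reproduced here. A cell is marked `NULL` only when the database itself holds no value.

```python
import sqlite3


def revenue_rows(conn, symbol):
    """Fetch (fiscal_year, sales) exactly as stored, ordered by year."""
    cur = conn.execute(
        "SELECT fiscal_year, sales FROM financials "
        "WHERE symbol = ? ORDER BY fiscal_year",
        (symbol,),
    )
    return cur.fetchall()


def render_table(rows):
    """Render rows as text; 'NULL' appears only where the stored value is NULL."""
    lines = ["FY   | Sales"]
    for year, sales in rows:
        cell = "NULL" if sales is None else str(sales)
        lines.append(f"{year} | {cell}")
    return "\n".join(lines)


# Illustrative in-memory data; schema and values are assumptions for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials (symbol TEXT, fiscal_year INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?)",
    [("DEMO", 2015, 100), ("DEMO", 2016, None), ("DEMO", 2017, 300)],
)

table = render_table(revenue_rows(conn, "DEMO"))
print(table)
```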

3. Ignored explicit instructions repeatedly

User spec stated:

  • "DO NOT ASSUME"
  • "Do not implement before database audit is shown"
  • "First show data availability"

Claude made assessments and drew conclusions before showing raw data, so the user had to correct it repeatedly.
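
The "first show data availability" step could look something like the following: a per-column count of populated rows, printed before any analysis is attempted. This is a sketch under assumptions; the table name `financials`, its columns, and the helper `audit_table` are hypothetical and stand in for whatever the real database contains.

```python
import sqlite3


def audit_table(conn, table):
    """Report, per column, (non-NULL rows, total rows). Draws no conclusions."""
    # Table names cannot be bound as SQL parameters, so validate before formatting.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    total = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    report = {}
    for col in columns:
        # COUNT(column) counts only rows where the column is not NULL.
        non_null = conn.execute(f'SELECT COUNT("{col}") FROM {table}').fetchone()[0]
        report[col] = (non_null, total)
    return report


# Illustrative in-memory data; schema and values are assumptions for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE financials (symbol TEXT, fiscal_year INTEGER, sales INTEGER)"
)
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?)",
    [("DEMO", 2015, 100), ("DEMO", 2016, None), ("DEMO", 2017, 300)],
)

report = audit_table(conn, "financials")
for col, (non_null, total) in report.items():
    print(f"{col}: {non_null}/{total} rows populated")
```

Showing this output first gives the user a chance to confirm what data exists before any assessment is made.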

4. Self-inconsistent validation claims

  • Claimed "100%/100%" pass rate on initial validation
  • Later in the same session, produced "60%/100%" on a related test
  • Discrepancy caused by using different test inputs but presenting both as equivalent

Impact

  • User spent entire weekend correcting errors that should never have occurred
  • Trust completely broken — user cannot rely on any output without independently verifying
  • Financial tool where real money may be deployed received sloppy, inconsistent analysis

Expected Behavior

  • Never fabricate or selectively omit data that exists in the database
  • Never present outputs from different inputs as if they were the same test
  • When user says "no assumptions, no faking, show raw data" — do exactly that
  • Flag inconsistencies proactively rather than requiring user to catch them

Environment

  • Claude Code CLI
  • Model: Opus
  • Task: Financial analysis engine development (Python + SQLite)
