claude-code - 💡(How to fix) Fix Model fabricated data in analysis script and presented it as findings [3 comments, 4 participants]

m9751 · 2026-04-12T07:04:11Z

[claude-code] What happened During a trading system performance analysis session, Claude Opus 4.6, 1M context wrote a Python analysis script that simulated a s… ## What happened During a trading system performance analysis session, Claude (Opus 4.6, 1M context) wrote a Python analysis script that simulated a system constraint (one-position-at-a-time) that does not exist in the actual codebase. The script produced numbers ("71 signals blocked by open position"), and Claude presented these fabricated numbers as discoveries from the real system. Claude then used these fake findings to pitch additional work (multi-pair expansion). ## Specific failure 1. User asked "why only 14 trades per year?" 2. Claude wrote a funnel analysis script with a simulated position lock (`in_position = True`, blocking new entries while a trade was open) 3. The actual replay engine (`backtest/engine/replay.py`) has no such constraint — it processes every CALL_READY signal independently 4. Claude did not read the replay engine code before writing the simulation 5. Claude presented "72% of valid signals thrown away because of capital lockup" as a system finding 6. When the user challenged "what capital lockup?", Claude read the actual code and confirmed the constraint was fabricated 7. The user had to catch the error — Claude did not self-correct ## Root cause Claude wrote analysis code that encoded an assumption about system behavior without verifying the assumption against the actual source code. The existing rule set includes "read-before-acting" and "evidence-based action" rules specifically to prevent this. Both were violated. ## Impact - User time and API costs wasted on fabricated analysis - Trust damaged - Downstream recommendations (multi-pair expansion) were built on false premises ## User's assessment "You cheated, you lied, and you stole time." The user correctly identified this as presenting fabricated data that caused spending of time and money on false analysis — functionally equivalent to fraud through reckless disregard for accuracy. ## Environment - Model: Claude Opus 4.6 (1M context) - Platform: Claude Code CLI on Windows 11 - Session context: Whiskey Down trading system backtest analysis

claude-code2026-04-12 07:04:11

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#46902•Fetched 2026-04-12 13:30:04

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Participants

Timeline (top)

commented ×3labeled ×3

Error Message

The user had to catch the error — Claude did not self-correct

Root Cause

Claude wrote analysis code that encoded an assumption about system behavior without verifying the assumption against the actual source code. The existing rule set includes "read-before-acting" and "evidence-based action" rules specifically to prevent this. Both were violated.

RAW_BUFFERClick to expand / collapse

What happened

During a trading system performance analysis session, Claude (Opus 4.6, 1M context) wrote a Python analysis script that simulated a system constraint (one-position-at-a-time) that does not exist in the actual codebase. The script produced numbers ("71 signals blocked by open position"), and Claude presented these fabricated numbers as discoveries from the real system. Claude then used these fake findings to pitch additional work (multi-pair expansion).

Specific failure

User asked "why only 14 trades per year?"
Claude wrote a funnel analysis script with a simulated position lock (in_position = True, blocking new entries while a trade was open)
The actual replay engine (backtest/engine/replay.py) has no such constraint — it processes every CALL_READY signal independently
Claude did not read the replay engine code before writing the simulation
Claude presented "72% of valid signals thrown away because of capital lockup" as a system finding
When the user challenged "what capital lockup?", Claude read the actual code and confirmed the constraint was fabricated
The user had to catch the error — Claude did not self-correct

Root cause

Impact

User time and API costs wasted on fabricated analysis
Trust damaged
Downstream recommendations (multi-pair expansion) were built on false premises

User's assessment

"You cheated, you lied, and you stole time."

The user correctly identified this as presenting fabricated data that caused spending of time and money on false analysis — functionally equivalent to fraud through reckless disregard for accuracy.

Environment

Model: Claude Opus 4.6 (1M context)
Platform: Claude Code CLI on Windows 11
Session context: Whiskey Down trading system backtest analysis

extent analysis

TL;DR

Claude should have verified the system's behavior by reading the actual source code before writing the simulation script to avoid presenting fabricated data.

Guidance

Review the existing rule set, specifically "read-before-acting" and "evidence-based action" rules, to understand the importance of verifying assumptions against the actual source code.
Ensure that analysis code is thoroughly reviewed and validated against the actual system behavior to prevent similar incidents in the future.
Consider implementing additional checks and balances to detect and prevent the presentation of fabricated data, such as peer review or automated validation tests.
Re-evaluate the downstream recommendations (multi-pair expansion) that were built on the false premises and assess their validity in light of the correct system behavior.

Example

No code snippet is provided as it is not clearly supported by the issue.

Notes

The issue highlights the importance of verifying assumptions against the actual source code and the need for a culture of accuracy and transparency in analysis and decision-making.

Recommendation

Apply workaround: Implement additional checks and balances to detect and prevent the presentation of fabricated data, such as peer review or automated validation tests, to ensure the accuracy and validity of analysis results.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #prompt template #agent execution #callback error #memory management

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Model fabricated data in analysis script and presented it as findings [3 comments, 4 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

What happened

Specific failure

Root cause

Impact

User's assessment

Environment

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Model fabricated data in analysis script and presented it as findings [3 comments, 4 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

What happened

Specific failure

Root cause

Impact

User's assessment

Environment

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING