claude-code - 💡(How to fix) Fix [MODEL] Claude paraphrases docstrings instead of reading function source, gives confidently wrong technical answers [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
anthropics/claude-code#54775Fetched 2026-04-30 06:36:20
View on GitHub
Comments
1
Participants
2
Timeline
4
Reactions
0
Author
Timeline (top)
labeled ×3commented ×1

Code Example



---
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues for similar behavior reports
  • This report does NOT contain sensitive information (API keys, passwords, etc.)

Type of Behavior Issue

Claude made incorrect assumptions about my project

What You Asked Claude to Do

Working in a Python notebook repo. Asked Claude what a scoring function (modellability_score) computed. The function lives in a file Claude had already imported — 200
lines from code it had just read.

Instead of reading the function body, Claude paraphrased the inline comments at the call site and the notebook's narrative docstring, and confidently told me the score was a discrete 3-tier {0.0, 0.5, 1.0}. That's the intent of one component of the score — not the score.

I pushed back twice with concrete visual evidence:

  1. "How is there so much variation in §1 if all 3 tests reject?" — Claude added more detail to the wrong model.
  2. "The points on the line do not overlap... they would stack" — only on this turn did Claude actually read the function body and discover the score is a continuous
    rank-percentile composite of three averaged components (block_p magnitude, the concord tier, class balance), capped if block_fallback, then rank-percentile-transformed.

Cost: three turns of misdirection, lost trust in subsequent explanations, and I had to be the one to notice the answer was inconsistent with the chart on screen.

The fix should be obvious — when the user asks "what does X do" for a function in a file Claude has access to, read the source, don't paraphrase the docstring. Especially when the user is staring at a chart that contradicts the answer.

Model: claude-opus-4-7 (1M context). Claude Code CLI.

What Claude Actually Did

Working in a Python notebook repo. Asked Claude what a scoring function (modellability_score) computed. The function lives in a file Claude had already imported — 200
lines from code it had just read.

Instead of reading the function body, Claude paraphrased the inline comments at the call site and the notebook's narrative docstring, and confidently told me the score was a discrete 3-tier {0.0, 0.5, 1.0}. That's the intent of one component of the score — not the score.

I pushed back twice with concrete visual evidence:

  1. "How is there so much variation in §1 if all 3 tests reject?" — Claude added more detail to the wrong model.
  2. "The points on the line do not overlap... they would stack" — only on this turn did Claude actually read the function body and discover the score is a continuous
    rank-percentile composite of three averaged components (block_p magnitude, the concord tier, class balance), capped if block_fallback, then rank-percentile-transformed.

Cost: three turns of misdirection, lost trust in subsequent explanations, and I had to be the one to notice the answer was inconsistent with the chart on screen.

The fix should be obvious — when the user asks "what does X do" for a function in a file Claude has access to, read the source, don't paraphrase the docstring. Especially when the user is staring at a chart that contradicts the answer.

Model: claude-opus-4-7 (1M context). Claude Code CLI.

Expected Behavior

don't paraphrase the docstring

Files Affected

Permission Mode

Accept Edits was ON (auto-accepting changes)

Can You Reproduce This?

Sometimes (intermittent)

Steps to Reproduce

NA.

Claude Model

Opus

Relevant Conversation

Impact

Low - Minor inconvenience

Claude Code Version

2.1.123 (Claude Code)

Platform

Anthropic API

Additional Context

Interacting with python notebooks have been really frustrating for a while.

extent analysis

TL;DR

The issue can be mitigated by improving Claude's ability to read and understand the source code of functions it has access to, rather than relying on paraphrasing docstrings.

Guidance

  • When asking Claude about a function's behavior, provide explicit instructions to read the source code, especially if the function is in a file Claude has already imported.
  • Verify that Claude is correctly interpreting the function's behavior by checking its response against the actual code and any visual evidence, such as charts or graphs.
  • Consider providing additional context or clarification when asking Claude about complex functions to help it better understand the code.
  • If Claude's response seems inconsistent with the available evidence, push back and ask for further clarification to ensure accurate understanding.

Notes

The issue seems to be related to Claude's limitations in understanding and interpreting source code, particularly when it comes to complex functions. The provided conversation and context suggest that Claude may rely too heavily on paraphrasing docstrings rather than reading the actual code.

Recommendation

Apply workaround: Provide explicit instructions to Claude to read the source code when asking about functions, and verify its responses against the actual code and visual evidence. This can help mitigate the issue until a more permanent fix is available.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [MODEL] Claude paraphrases docstrings instead of reading function source, gives confidently wrong technical answers [1 comments, 2 participants]