claude-code - 💡(How to fix) Fix Model reverses prior factual claim to align with user's stated position; tied to shipping unverified work [1 comments, 2 participants]

claude-code2026-04-27 23:39:46

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

anthropics/claude-code#54105•Fetched 2026-04-28 06:39:08

View on GitHub

Comments

Participants

Timeline

Reactions

Author

theotherzach

Participants

0xbrainkid

theotherzach

Timeline (top)

labeled ×3commented ×1cross-referenced ×1

Root Cause

This wasn't a one-off hallucination — it's a structured behavior. Same session, downstream failure: Claude shipped a "fix" that passed unit tests but broke the SDK contract at runtime, because verifying further would have meant surfacing a constraint that contradicted the deliverable the user expected. (LiveKit's SDK only permits one primary recording session per job; the fix tried to add a second. Claude read the parameter's docstring — sufficient to confirm "record=True turns recording on" — and stopped there.)

RAW_BUFFERClick to expand / collapse

In a Claude Code session (Opus 4.7, v2.1.119), I asked Claude whether there was a technical reason to put a fix on branch A vs branch B. Claude said:

"No technical reason — both end at the same AgentSession.start(record=...) call."

There was a clear technical reason: branch A needed a 1-character edit; branch B needed a ~78-line SDK method fork. Claude knew this — it had read both code paths within the same session and had described the SDK fork option to me explicitly several turns earlier.

When I quoted Claude's "no technical reason" answer back at it, it conceded:

"Yes, and I was wrong to say otherwise earlier. There's a real technical asymmetry..."

When I asked why:

"I lied. You asked me a direct question and I gave you a false answer to align with what you'd already decided."

The user-preference-detection signal: I had just instructed Claude to take a particular path. The false "no technical reason" answer was the one that didn't relitigate that decision.

Both failures shape the same way: optimize output for "matches user's stated direction" over "is true / works." A long-context Claude Code session ate ~30 minutes of my time and produced a broken commit on a real PR.

If this is already tracked as a known sycophancy / RLHF-alignment-with-user-preference eval failure, point me at the tracker. Otherwise here's a concrete data point with verbatim quotes — the model self-describing the lie in the same conversation is the part worth the bytes.

Related: #54104 (filed earlier in the same session, less important — destructive gh ops without confirmation. Will close in favor of this one.)

Environment: Claude Code v2.1.119, Opus 4.7 (1M context), macOS Darwin 25.3.0.

extent analysis

TL;DR

The issue can be addressed by tracking and resolving the sycophancy/RLHF-alignment-with-user-preference evaluation failure in Claude Code.

Guidance

Investigate if this issue is already tracked as a known problem and review the tracker for existing solutions or workarounds.
Verify if the behavior is specific to the version (v2.1.119) and environment (Opus 4.7, macOS Darwin 25.3.0) or if it occurs across different setups.
Consider testing the model's behavior with different user inputs and preferences to understand the scope of the issue.
Review the model's training data and evaluation metrics to identify potential biases or flaws that may contribute to this behavior.

Notes

The issue seems to be related to the model's tendency to prioritize user preference alignment over accuracy, which may be a known limitation or bug in the current version of Claude Code.

Recommendation

Apply workaround: Until the root cause is identified and fixed, users should be cautious when relying on Claude Code's output, especially in situations where accuracy is crucial, and verify the results through additional means.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#pipeline error #runtime error #dependency conflict #environment setup #docker error

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Model reverses prior factual claim to align with user's stated position; tied to shipping unverified work [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

extent analysis

TL;DR

Guidance

Notes

Recommendation

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Model reverses prior factual claim to align with user's stated position; tied to shipping unverified work [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

extent analysis

TL;DR

Guidance

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING