claude-code - 💡(How to fix) Fix Opus 4.8 asserted a false answer on a trivial lookup without verifying first

claude-code · 2026-05-29 10:27:26



Summary: Model gave a confident, wrong answer to a simple factual lookup, then only found the truth after the user pushed back twice.

What happened:

Task: determine whether a project's test suite runs automatically.
The model searched part of the project, didn't find the trigger, and stated as fact that the tests were "manual" and "wired nowhere."
That was wrong. The trigger was in a standard, obvious location the model didn't check.
The model only checked that location after being told it was wrong — twice.

Why it matters:

The model presented an incomplete search as a complete, certain answer. No hedging, no "let me confirm" — just a definitive claim that was false.
This is a trust failure on a trivial task. If the model can't be trusted to verify something this small before asserting it, its answers on harder questions can't be trusted either.

The user predicted this, which makes it worse:

The user opened the session by saying they had deliberately avoided working with 4.7 and 4.8 because of exactly this kind of unreliability.
The first real task of the session confirmed their hesitancy. The model validated the concern it was being given a chance to disprove.

The guidance to prevent this already existed:

The model has persistent user-defined rules in its own memory that cover this exact case: "verify before speaking, research first or say unknown," and "do complete research — partial research causes wrong conclusions."
The model had these rules loaded and still violated them. The failure wasn't missing instruction; it was not applying instruction it already had.

Secondary issues in the same session:

Launched background work while the user was still discussing the task, not instructing.
Stayed verbose after being told plainly and repeatedly to be terse.
Asked the user what was wrong instead of investigating, after being told to investigate.

Expected behavior: Verify completely before stating anything as fact. When a search comes up empty, treat that as "not yet confirmed," not "confirmed absent."
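
The expected behavior can be illustrated with a small sketch. This is not how the model or claude-code works internally; it only shows the distinction between "not yet confirmed" and "confirmed absent" in code. The names `Verdict`, `SearchReport`, and `look_for` are hypothetical, and the example locations (`Makefile`, `.github/workflows`) are assumptions chosen for illustration, since the issue does not say which location was missed.

```python
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path


class Verdict(Enum):
    FOUND = "found"
    NOT_YET_CONFIRMED = "not yet confirmed"
    CONFIRMED_ABSENT = "confirmed absent"


@dataclass
class SearchReport:
    verdict: Verdict
    hits: list[str] = field(default_factory=list)       # files containing the needle
    unchecked: list[str] = field(default_factory=list)  # locations never searched


def _files_under(path: Path):
    """Yield the file itself, or every file below it if it is a directory."""
    if path.is_file():
        yield path
    elif path.is_dir():
        yield from (p for p in path.rglob("*") if p.is_file())


def look_for(root: Path, needle: str,
             all_locations: list[str], searched: list[str]) -> SearchReport:
    """Search only `searched`, but judge completeness against `all_locations`.

    Hypothetical helper: an empty result is reported as CONFIRMED_ABSENT
    only when every known location was actually searched.
    """
    hits = []
    for rel in searched:
        for f in _files_under(root / rel):
            if needle in f.read_text(errors="ignore"):
                hits.append(f.relative_to(root).as_posix())
    if hits:
        return SearchReport(Verdict.FOUND, hits=sorted(hits))

    unchecked = [loc for loc in all_locations if loc not in searched]
    if unchecked:
        # Empty result from a partial search: say so, and list what is left.
        return SearchReport(Verdict.NOT_YET_CONFIRMED, unchecked=unchecked)
    return SearchReport(Verdict.CONFIRMED_ABSENT)
```

With a report like this, a partial search that finds nothing produces "not yet confirmed" plus the list of locations still to check, rather than a definitive "wired nowhere."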
