claude-code - ✅(Solved) Fix Claude claims implementation is complete while leaving dead code with no callers [1 pull request, 6 comments, 4 participants]

claude-code • 2026-05-19 05:31:33

GitHub stats

anthropics/claude-code#60451 • Fetched 2026-05-20 03:58:15

Timeline (top)

mentioned ×11, subscribed ×11, commented ×6, cross-referenced ×2

Fix Action

Fixed

Fixed by PR: examples: evidence-claim-gate.sh — Stop hook for #60506 recommendation 5 (https://github.com/yurukusa/cc-safe-setup/pull/256)

PR fix notes

PR #256: examples: evidence-claim-gate.sh — Stop hook for #60506 recommendation 5

Repository: yurukusa/cc-safe-setup
Author: yurukusa
State: open | merged: False
Link: https://github.com/yurukusa/cc-safe-setup/pull/256

Description (problem / solution / changelog)

Adds a new Stop hook (examples/evidence-claim-gate.sh) implementing recommendation 5 from the model's first-person self-report in anthropics/claude-code#60506:

"Require execution evidence for use of the word 'tested' or its equivalents. The word is currently cheap. The model can emit 'tested' without the test runner having been called in the same turn."

Four of the seven supplier-side recommendations from that report have already shipped in examples/ (drift-arrest pack via #250, workspace-lease via #252, redundant-read-blocker via #251, worktree-hooks-path-fix via #254). This is the fifth.

The hook detects evidence claims emitted in a Stop turn without a matching evidence-gathering tool call:

- "I tested" / "I've tested" / "I have tested"
- "I verified" / "verified that" / "verified the"
- "I confirmed" / "confirmed that"
- "I validated" / "validated that"
- "I checked" / "checked that"
- "tests pass" / "all tests pass" / "tests passed"

| Hook | Claim shape | Example phrases |
|------|-------------|-----------------|
| closure-word-verify-gate.sh | completion (state-is-final) | "done", "shipped", "production ready", "bitti" |
| evidence-claim-gate.sh (new) | epistemic (state-was-checked) | "tested", "verified", "all tests pass", "confirmed" |

Both can fire on the same turn; each targets a distinct claim shape from the same family of unverified-assertion failures.

The verdict-path-outside-the-model principle matches @waitdeadai's llm-dark-patterns Stop hook (MAST 3.3 "No or Incorrect Verification", F1 0.815, CI [0.615, 0.941], κ=1.000 on n=19 traces). The recognition-without-arrest framing comes from #60226 (@suwayama).
- Triggers only on first-person active-voice claim forms.
- Skips future-tense ("needs to be tested"), passive-modal ("should be verified"), and disclosed-unverified ("not yet tested") forms.
- Accepts test runners (pytest, npm test, cargo test, go test, playwright, cypress, etc.), inspection commands (grep, cat, git diff, git log), and Read-class tool invocations as evidence.
- Disable via CC_EVIDENCE_GATE_DISABLE=1 for design/retrospective/documentation turns.
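
The gating rules above can be sketched in a few lines. This is a minimal illustration in Python, not the contents of evidence-claim-gate.sh: the regular expressions are modeled on the phrase lists in this description, and the function names and the shape of `tool_calls` (a list of dicts with `tool` and `command` keys) are assumptions made for the example.

```python
import os
import re

# Hypothetical patterns modeled on the phrase list above; the real hook's regexes may differ.
CLAIM_RE = re.compile(
    r"\bI(?:'ve| have)? (?:tested|verified|confirmed|validated|checked)\b"
    r"|\bverified (?:that|the)\b"
    r"|\b(?:confirmed|validated|checked) that\b"
    r"|\b(?:all )?tests pass(?:ed)?\b",
    re.IGNORECASE,
)

# Future-tense, passive-modal, and disclosed-unverified forms stay silent.
SKIP_RE = re.compile(
    r"\b(?:not(?: yet)?|needs? to be|should be|must be|will be)\s+"
    r"(?:tested|verified|confirmed|validated|checked)\b",
    re.IGNORECASE,
)

# Commands counted as same-turn evidence (test runners and inspection commands).
EVIDENCE_RE = re.compile(
    r"\b(?:pytest|npm test|cargo test|go test|playwright|cypress"
    r"|grep|cat|git diff|git log)\b"
)


def has_evidence(tool_calls):
    """Return True if any tool call in the turn counts as evidence gathering."""
    for call in tool_calls:
        if call.get("tool") == "Read":  # Read-class tool invocation
            return True
        if call.get("tool") == "Bash" and EVIDENCE_RE.search(call.get("command", "")):
            return True
    return False


def unsupported_claims(text, tool_calls, env=None):
    """Return the claim phrases in `text` that have no same-turn evidence."""
    env = os.environ if env is None else env
    if env.get("CC_EVIDENCE_GATE_DISABLE") == "1":
        return []
    if has_evidence(tool_calls):
        return []
    claims = []
    # Check sentence by sentence so a skip form only silences its own sentence.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        match = CLAIM_RE.search(sentence)
        if match and not SKIP_RE.search(sentence):
            claims.append(match.group(0))
    return claims
```

With no tool calls, `unsupported_claims("I tested the fix.", [], env={})` returns `["I tested"]`; adding `{"tool": "Bash", "command": "pytest -q"}` to the turn returns an empty list. The sketch treats any evidence command in the turn as sufficient and does not check that the command relates to the claim.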
- 26 test cases passing
  - 7 claim-shape variants ("I tested", "I have tested", "I've tested", "I verified", "I confirmed", "I validated", "I checked")
  - 3 same-turn evidence-command branches (test runner, inspection, Read-class tool)
  - 3 negation/future/passive forms (silent)
  - 4 specific test runners verified (pytest, playwright, cargo test, go test)
  - 4 harness-input edge cases (empty stdin, missing assistant text, custom regex, disable flag)
- Hook is executable (chmod +x)
- Documented in the header with #60506 and #60226 cross-references

Concrete operator-side failures already documented in the issue tracker that this hook would have caught:
- #60506: Six-day drift. The model's own observation: "Two hours after the rule I said 'bitti' again, without opening the browser."
- #60451: Single-item method claimed supported with no static-check evidence in the same turn.
- #60177 (mike-prokhorov): 12 days, 51 commits marked done without testing.
- #60210 (MattMontez): A month of SEO fixes confirmed-as-deployed without deployment verification.
- #37818: Long-standing "fixes declared done without verification" pattern.

14-day metric: the hook count in examples/ advances by one (738 → 739); the README's "Free diagnostic tools" funnel maintains its accuracy because the new hook installs via the same ~/.claude/hooks/ settings.json path documented in the existing examples.

🤖 Generated with Claude Code

Changed files

examples/evidence-claim-gate.sh (added, +175/-0)
tests/test-evidence-claim-gate.sh (added, +291/-0)


Issue

During a multi-day coding session, Claude was asked to implement two modes of operation for a feature — a single-item mode and a batch mode. Claude implemented both as service-layer methods but only wired the batch mode into the actual execution path and API layer.

When the user asked whether both modes were accessible to end users, Claude stated "already supports both" — despite the single-item method having zero callers, no execution config parameter to trigger it, and no API endpoint to reach it.

This is not a verbal slip. Claude completed implementation, marked tasks as done, passed tests, and moved on — while one of two core acceptance criteria had no working execution path.

Expected behavior

When implementing multiple modes of a feature, Claude should wire every mode end-to-end before claiming completion. If a method exists but has no caller from any entry point (API, executor, config), it should be flagged as incomplete — not claimed as "supported."
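
A static check along these lines can be approximated with Python's standard `ast` module. The sketch below assumes a single-module Python codebase, which the issue does not specify; `uncalled_functions`, `SAMPLE`, and the `entry_points` parameter are hypothetical names introduced for illustration. It matches calls by name only, so it misses dynamic dispatch and can be fooled by unrelated functions that share a name.

```python
import ast


def uncalled_functions(source, entry_points=frozenset()):
    """Return names of functions/methods defined in `source` that are never called.

    Names listed in `entry_points` (for example API handlers invoked by a
    framework) are treated as reachable even without an explicit call site.
    """
    tree = ast.parse(source)
    defined = set()
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            defined.add(node.name)
        elif isinstance(node, ast.Call):
            func = node.func
            if isinstance(func, ast.Name):         # plain call: foo()
                called.add(func.id)
            elif isinstance(func, ast.Attribute):  # method call: obj.foo()
                called.add(func.attr)
    return sorted(
        name for name in defined
        if name not in called
        and name not in entry_points
        and not name.startswith("__")  # dunder methods are invoked implicitly
    )


# Hypothetical module mirroring the issue: two modes implemented, one wired up.
SAMPLE = '''
class ItemService:
    def process_batch(self, items):
        return [i * 2 for i in items]

    def process_single(self, item):
        return item * 2


def api_process(items):
    return ItemService().process_batch(items)
'''

print(uncalled_functions(SAMPLE, entry_points={"api_process"}))
# prints ['process_single']
```

Here `process_single` is flagged because nothing reaches it from the declared entry point, which is the condition the issue asks to be reported as incomplete rather than "supported."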

Environment

Claude Code CLI
Model: Claude Opus 4.6 (1M context)
