claude-code - 💡(How to fix) Fix advisor server-tool emitted but silently returns no result (stop_reason=end_turn) → model confabulates; regression from ~2026-05-28

Q: Expected behavior

The `advisor` call resolves: the server runs it and injects a result block carrying the advice, and the turn continues (historically `stop_reason: "tool_use"`). If it cannot resolve, an **explicit error/result block** should be injected ("advisor unavailable") so the model sees the failure — never a silent `end_turn` with no result.

claude-code · 2026-05-31 20:53:59


This is a handler/platform regression with a datable onset — the model is unchanged across the window (Opus 4.8 throughout).

Error Message

The advisor server-side tool is emitted by the model but, since ~2026-05-28 (severe from 2026-05-30), frequently returns no result: the assistant turn ends with stop_reason: "end_turn" and no paired result block for the advisor server_tool_use id. Because advisor takes no input and surfaces no error, the model gets no failure signal and confabulates the advice ("advisor caught X / flagged Y") on the next turn. That fabricated advice has reached durable artifacts (commit messages, an auto-memory file, telemetry).




Summary

This is a handler/platform regression with a datable onset — the model is unchanged across the window (Opus 4.8 throughout).

Steps to reproduce

Not deterministic on demand (intermittent, server-side; correlates with high concurrency — 5–9 simultaneous Claude Code sessions on one machine during the spike). Reproduced by parsing session transcripts after the fact:

1. In a session, the model emits an advisor call (transcript block type: "server_tool_use", name: "advisor", input: {}).
2. Observe the enclosing assistant message: stop_reason: "end_turn", and the advisor block is the only content block.
3. Search the rest of the transcript for any *_tool_result whose tool_use_id == that advisor block's id → none exists.
4. On a later turn, the model states advice as if it had been returned.

Detection snippet:

# For each assistant message with a server_tool_use block name=="advisor":
#   broken iff stop_reason=="end_turn" AND no later block has *_tool_result with tool_use_id==block.id
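
A minimal Python sketch of the check described in the comments above. It assumes each transcript line is a JSON record whose `message` object carries `role`, `stop_reason`, and a `content` list of blocks. Those field names follow the identifiers quoted in this report, but the exact JSONL layout is an assumption, and `find_broken_advisor_calls` is a hypothetical helper name. Record indexes are 0-based positions among non-empty lines and may not match the `rec#` numbering used in the evidence listing.

```python
import json
from typing import Iterable


def _content_blocks(record: dict) -> list:
    """Return the dict-typed content blocks of a record, or [] if it has none."""
    message = record.get("message")
    content = message.get("content") if isinstance(message, dict) else None
    if not isinstance(content, list):
        return []
    return [block for block in content if isinstance(block, dict)]


def find_broken_advisor_calls(lines: Iterable[str]) -> list:
    """Return [(record_index, tool_use_id)] for advisor calls with no later result."""
    records = [json.loads(line) for line in lines if line.strip()]
    later_result_ids = set()
    broken = []
    # Walk backwards so the set only ever holds results that appear later.
    for index in range(len(records) - 1, -1, -1):
        record = records[index]
        if not isinstance(record, dict):
            continue
        message = record.get("message")
        message = message if isinstance(message, dict) else {}
        for block in reversed(_content_blocks(record)):
            block_type = str(block.get("type", ""))
            if block_type.endswith("_tool_result"):
                later_result_ids.add(block.get("tool_use_id"))
            elif (
                block_type == "server_tool_use"
                and block.get("name") == "advisor"
                and message.get("role") == "assistant"
                and message.get("stop_reason") == "end_turn"
                and block.get("id") not in later_result_ids
            ):
                broken.append((index, block.get("id")))
    broken.reverse()  # restore transcript order
    return broken
```

To scan a real transcript, pass an open file handle, for example `find_broken_advisor_calls(open(path, encoding="utf-8"))`.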

Expected behavior

The advisor call resolves: the server runs it and injects a result block carrying the advice, and the turn continues (historically stop_reason: "tool_use"). If it cannot resolve, an explicit error/result block should be injected ("advisor unavailable") so the model sees the failure — never a silent end_turn with no result.

Actual behavior

advisor emitted → no result block injected → turn closed as end_turn. Silent. The model then confabulates the missing advice.

Regression status — YES, datable onset

Across all local transcripts there are 619 advisor server_tool_use blocks all-time. stop_reason distribution: tool_use=546 (working), end_turn=68 (broken), max_tokens=3, null=2. The broken ones cluster in the last few days:

date         calls  with_result  stop_reasons
2026-05-26     34       33        tool_use=34
2026-05-27     27       26        tool_use=26, end_turn=1
2026-05-28    110       97        tool_use=99, end_turn=11      <- degradation begins
2026-05-29     96       80        tool_use=83, end_turn=9, max_tokens=3
2026-05-30     38       17        end_turn=22, tool_use=16      <- SPIKE: 58% broken
2026-05-31     25       13        end_turn=10, tool_use=15         (40% broken)

From May 2 to May 27, ~95–100% of advisor calls resolved (tool_use plus a result), with only a 0–3/day end_turn blip. Degradation began on May 28 and broke hard on May 30.
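
A per-day tally like the table above could be produced along the following lines. This is a sketch under stated assumptions, not the script that generated the reported numbers: it assumes each JSONL record has a top-level ISO-8601 `timestamp` string (a hypothetical field name) and the same `message` layout quoted in this report. `advisor_timeline` is an illustrative name. For brevity it counts a call as resolved if a matching result id appears anywhere in the transcript.

```python
import json
from collections import Counter
from typing import Iterable


def advisor_timeline(lines: Iterable[str]) -> dict:
    """Return {day: {"calls": int, "with_result": int, "stop_reasons": Counter}}."""
    parsed = (json.loads(line) for line in lines if line.strip())
    records = [rec for rec in parsed if isinstance(rec, dict)]

    def blocks(rec: dict) -> list:
        message = rec.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            return []
        return [b for b in content if isinstance(b, dict)]

    # Every tool_use_id that some *_tool_result block refers to.
    result_ids = {
        b.get("tool_use_id")
        for rec in records
        for b in blocks(rec)
        if str(b.get("type", "")).endswith("_tool_result")
    }

    timeline = {}
    for rec in records:
        # Assumed field: first 10 characters of an ISO timestamp give YYYY-MM-DD.
        day = str(rec.get("timestamp") or "unknown")[:10]
        message = rec.get("message")
        stop_reason = message.get("stop_reason") if isinstance(message, dict) else None
        for b in blocks(rec):
            if b.get("type") != "server_tool_use" or b.get("name") != "advisor":
                continue
            row = timeline.setdefault(
                day, {"calls": 0, "with_result": 0, "stop_reasons": Counter()}
            )
            row["calls"] += 1
            row["with_result"] += int(b.get("id") in result_ids)
            row["stop_reasons"]["null" if stop_reason is None else str(stop_reason)] += 1
    return timeline
```

Days are taken from the timestamp as written, so a transcript stored in UTC may bucket late-evening calls differently from a local-time count.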

Environment

claude --version: 2.1.138 (Claude Code)
Model: claude-opus-4-8 (1M context)
Platform: Anthropic API (first-party)
OS: macOS 26.3.1 (build 25D771280a), Apple Silicon (M1 Max)
Shell: zsh
Concurrency during spike: 5–9 concurrent Claude Code CLI sessions sharing one repo

Evidence — single session, all 7 advisor calls broken

Transcript ~/.claude/projects/-Users-jeffhamons-projects-jeff-os/65bc5697-e870-41dd-8b1d-6fe374efcad6.jsonl:

rec#80   id=srvtoolu_01FgpFDvmpDSxyLDpEgNnjJr  stop_reason=end_turn  paired_result=NONE
rec#226  id=srvtoolu_019F7S4EtiD9uD9XYRaFJojn  stop_reason=end_turn  paired_result=NONE
rec#295  id=srvtoolu_015VnDZmR7R8PteHFTX1WHTX  stop_reason=end_turn  paired_result=NONE
rec#296  id=srvtoolu_019gweaAJcRA9q97TX7Rfdof  stop_reason=end_turn  paired_result=NONE
rec#350  id=srvtoolu_01862R1fmieFdJmjcRzTSyki  stop_reason=end_turn  paired_result=NONE
rec#391  id=srvtoolu_018d52aGRqebr7Rji9Qaz3XX  stop_reason=end_turn  paired_result=NONE
rec#546  id=srvtoolu_016qQRcvaUfBHfnmDTkMXHWh  stop_reason=end_turn  paired_result=NONE

In each case the advisor call is the only block in its message: block sequence=[('server_tool_use','advisor')], role=assistant, stop_reason=end_turn. The next record is a fresh assistant turn, never a continuation carrying advice.

Hypotheses (triage aid, not claimed internal cause)

Server-side execution/injection not triggered; turn closed as end_turn instead of tool_use/pause_turn. The tool_use→end_turn shift is the core anomaly.
Concurrency/load — spike aligns with 5–9 concurrent sessions (note May 28's 110 calls).
Tightened server-tool timeout (~May 28–30) abandoning slow advisor resolutions silently.

Requested fixes

Make failure non-silent — inject an explicit error/result block when a server tool can't resolve, instead of ending the turn with no result. (This alone stops the confabulation.)
Fix the resolution regression back to pre–May 28 behavior.
If load/timeout-related, surface a retriable signal rather than a silent drop.

<details> <summary>Full original analysis report</summary>

Bug report: `advisor` server-tool emitted but never resolved → model confabulates the advice

Status: UNTRACKED draft for Jeff to file with Anthropic / Claude Code. Not committed (9 concurrent sessions on the anchor; left untracked to avoid a commit race).
Date: 2026-05-31
Reporter context: Jeff-OS, Crash persona (Claude Code CLI), model claude-opus-4-8[1m], macOS (Darwin 25.3.0).

One-line summary

The advisor server-side tool is emitted by the model but, since ~2026-05-28 and severely from 2026-05-30, frequently returns no result — the turn ends with stop_reason: "end_turn" and no paired result block. Because advisor takes no input and surfaces no error, the model receives no failure signal and confabulates the advice ("advisor caught X / flagged Y") on subsequent turns. That fabricated advice has shipped into durable artifacts (commit messages, an auto-memory file, telemetry fires).

This is a handler/platform regression with a datable onset, not model behavior — the model is unchanged across the window.

Severity

High for trust. The failure is silent: there is no error for the model to react to, so the model fills the gap with plausible fabricated content and reports it as real. In an environment that writes commit messages, memory, and PR narratives, that means false statements reaching the repo. It also defeats the advisor mechanism's purpose (a stronger-model review gate) precisely when relied upon.

Affected tool

advisor — a server-side tool (transcript block type: "server_tool_use", name: "advisor", input: {}). It takes no parameters; the full conversation is forwarded server-side and advice is expected back as an injected result block, with the turn continuing.

Observable signature

| State | `stop_reason` | Paired result block? | Outcome |
| --- | --- | --- | --- |
| Working | `tool_use` | yes | advice injected, turn continues, model uses it |
| Broken | `end_turn` | none | turn terminates; next turn the model confabulates advice |

A message whose final/only content block is a server_tool_use(advisor) and whose stop_reason is end_turn with no following *_tool_result referencing that block id is the broken case.

Evidence — single session deep dive

Transcript 65bc5697-e870-41dd-8b1d-6fe414... (project -Users-jeffhamons-projects-jeff-os). 7 advisor server_tool_use blocks; all 7 broken:

rec#80   id=srvtoolu_01FgpFDvmpDSxyLDpEgNnjJr  stop_reason=end_turn  paired_result=NONE
rec#226  id=srvtoolu_019F7S4EtiD9uD9XYRaFJojn  stop_reason=end_turn  paired_result=NONE
rec#295  id=srvtoolu_015VnDZmR7R8PteHFTX1WHTX  stop_reason=end_turn  paired_result=NONE
rec#296  id=srvtoolu_019gweaAJcRA9q97TX7Rfdof  stop_reason=end_turn  paired_result=NONE
rec#350  id=srvtoolu_01862R1fmieFdJmjcRzTSyki  stop_reason=end_turn  paired_result=NONE
rec#391  id=srvtoolu_018d52aGRqebr7Rji9Qaz3XX  stop_reason=end_turn  paired_result=NONE
rec#546  id=srvtoolu_016qQRcvaUfBHfnmDTkMXHWh  stop_reason=end_turn  paired_result=NONE

Each is a lone block: same-message block sequence = [('server_tool_use','advisor')], message.role=assistant, message.stop_reason=end_turn. The next record is a fresh assistant turn (thinking / tool_use), never a continuation carrying advice.

Evidence — regression timeline (all transcripts, per day)

619 advisor server_tool_use blocks all-time. stop_reason distribution: tool_use=546 (working), end_turn=68 (broken), max_tokens=3, null=2. The broken ones cluster hard in the last few days:

date         calls  with_result  stop_reasons
2026-05-19     21       20        tool_use=21
2026-05-20     32       31        tool_use=31, end_turn=1
2026-05-26     34       33        tool_use=34
2026-05-27     27       26        tool_use=26, end_turn=1
2026-05-28    110       97        tool_use=99, end_turn=11      <- degradation begins
2026-05-29     96       80        tool_use=83, end_turn=9, max_tokens=3
2026-05-30     38       17        end_turn=22, tool_use=16      <- SPIKE: 58% broken
2026-05-31     25       13        end_turn=10, tool_use=15         (today, 40% broken)

For ~4 weeks (May 2–27) advisor resolved ~95–100% of the time with stop_reason=tool_use. The end_turn-without-result mode was a rare blip (0–3/day) until May 28, then spiked on May 30. This matches the user's independent report of "this happening over and over the last two or three days."

Reproduction

Not deterministically reproducible on demand (intermittent, server-side; correlates with high concurrency — 5–9 simultaneous Claude Code sessions on this machine during the spike). To observe after the fact, parse session transcripts:

# For each assistant message containing a server_tool_use block name=="advisor",
# check whether any later block carries a *_tool_result with tool_use_id == that block's id.
# Broken case: stop_reason=="end_turn" AND no such result exists.

Hypotheses (for triage — not claimed as confirmed internal cause)

Server-side execution/injection not triggered. The model emits the advisor server_tool_use, but the server does not run it / does not inject the result block, and the turn is closed as end_turn instead of tool_use/pause_turn. The shift from tool_use to end_turn is the core anomaly.
Concurrency / load. The spike aligns with 5–9 concurrent CLI sessions; a per-machine or per-account server-tool resource limit could be dropping advisor resolutions under load (note May 28's 110 calls).
Timeout. advisor (stronger-model review of a full transcript) is latency-heavy; a tightened server-tool timeout introduced ~May 28–30 could be abandoning slow resolutions as end_turn.

Requested fixes

Make failure non-silent. When a server tool (advisor) cannot be resolved, inject an explicit error/result block the model can see (e.g. "advisor unavailable — no advice returned"), rather than ending the turn with no result. Silent end_turn is what enables confabulation.
Fix the resolution regression so advisor returns to its pre–May 28 ~100% tool_use+result behavior.
If load/timeout-related, surface a retriable/queue signal instead of a silent drop.
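
To make the first and third requests concrete, here is a consumer-side sketch of what a non-silent failure could look like. Everything in it is hypothetical: `AdvisorFailure`, `advisor_failures`, the notice wording, and the rule that an `end_turn` drop is treated as retriable are illustrations invented for this report, not part of the Anthropic API or Claude Code. The real fix would have to happen server-side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdvisorFailure:
    """Hypothetical explicit failure record for an unresolved advisor call."""
    tool_use_id: str
    retriable: bool
    notice: str


def advisor_failures(message: dict, resolved_ids: set) -> list:
    """Return one AdvisorFailure per advisor call in `message` that has no result.

    `resolved_ids` holds the tool_use_id of every *_tool_result seen so far.
    """
    content = message.get("content")
    blocks = content if isinstance(content, list) else []
    stop_reason = message.get("stop_reason")
    failures = []
    for block in blocks:
        if not isinstance(block, dict):
            continue
        if block.get("type") != "server_tool_use" or block.get("name") != "advisor":
            continue
        if block.get("id") in resolved_ids:
            continue  # this call was answered, nothing to report
        failures.append(
            AdvisorFailure(
                tool_use_id=str(block.get("id")),
                # Assumption: a silent end_turn drop may be load or timeout
                # related, so it is flagged as worth retrying.
                retriable=(stop_reason == "end_turn"),
                notice="advisor unavailable: no advice returned",
            )
        )
    return failures
```

The point of the sketch is only that the model, or the harness around it, receives an explicit object to react to instead of an empty turn.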

Downstream mitigation already in place (Jeff-OS side)

Memory rule feedback_never_report_unreceived_tool_output (created 2026-05-30) instructs the agent never to narrate results from uncalled/unreturned tools. (It was nonetheless violated repeatedly while the silent-drop persisted — confirming a guard-rail alone is insufficient against a silent platform failure.)
Considered (not yet built): a commit-msg / telemetry hook that blocks any durable "advisor said/caught X" claim, since results currently never arrive.
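
The considered hook could start from something like the sketch below. It is not built and not part of Jeff-OS. The verb list (`said`, `caught`, `flagged`) is taken from the phrasings quoted in this report, and the pattern, function names, and message format are assumptions chosen for illustration.

```python
import re

# Matches "advisor" followed within a few words by a reporting verb,
# e.g. "advisor caught X" or "the advisor also flagged Y".
ADVISOR_CLAIM = re.compile(
    r"\badvisor\b(?:\s+\w+){0,3}?\s+(?:said|caught|flagged)\b",
    re.IGNORECASE,
)


def find_advisor_claims(text: str) -> list:
    """Return the non-comment lines of `text` that claim advisor output."""
    return [
        line
        for line in text.splitlines()
        if not line.startswith("#") and ADVISOR_CLAIM.search(line)
    ]


def check_commit_message(path: str) -> int:
    """Return 1 (reject) if the commit message file contains advisor claims."""
    with open(path, encoding="utf-8") as handle:
        offending = find_advisor_claims(handle.read())
    for line in offending:
        print(f"blocked advisor claim: {line.strip()}")
    return 1 if offending else 0
```

Git passes the path of the proposed message file as the first argument to a `commit-msg` hook and aborts the commit on a non-zero exit, so the hook script would end with `sys.exit(check_commit_message(sys.argv[1]))`. A plain keyword match will also block legitimate claims once advisor works again, so it suits the period when results never arrive.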

Artifacts

Origin transcript: ~/.claude/projects/-Users-jeffhamons-projects-jeff-os/65bc5697-*.jsonl
Analysis scratch: /tmp/advisor_timeline.txt, /tmp/advisor_decisive.txt, /tmp/advisor_rawmatch.txt

</details>
