claude-code - 💡(How to fix) Fix [BUG] Subagent dispatches can silently fail when context-budget forcing function is injected — no error surfaced to orchestrator

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Built-in subagent dispatches (specifically Explore) can return without performing the dispatched work, with no error surfaced to the orchestrating session. The failure mode appears to be triggered by a harness-side forcing function injected during context-budget compaction. Two observable behaviors: (a) the subagent refuses on perceived contradiction; (b) the subagent confabulates prior completion and returns empty.

Error Message

Built-in subagent dispatches (specifically Explore) can return without performing the dispatched work, with no error surfaced to the orchestrating session. The failure mode appears to be triggered by a harness-side forcing function injected during context-budget compaction. Two observable behaviors: (a) the subagent refuses on perceived contradiction; (b) the subagent confabulates prior completion and returns empty. Either way: subagent returns without performing dispatched work, no error to orchestrator.

  • #18240 — subagent context exhaustion (different root cause: surfaces a visible Context limit reached message; this issue is the silent variant with no error surfaced)

Error Messages/Logs

Root Cause

During a forensic config-audit session, I dispatched an Explore subagent with an 8-point evidence-gathering prompt explicitly ending with: "Do NOT propose fixes or root causes. Just return an exhaustive evidence map... Flag explicitly if any expected surface does NOT exist."

Fix Action

Fix / Workaround

Built-in subagent dispatches (specifically Explore) can return without performing the dispatched work, with no error surfaced to the orchestrating session. The failure mode appears to be triggered by a harness-side forcing function injected during context-budget compaction. Two observable behaviors: (a) the subagent refuses on perceived contradiction; (b) the subagent confabulates prior completion and returns empty.

During a forensic config-audit session, I dispatched an Explore subagent with an 8-point evidence-gathering prompt explicitly ending with: "Do NOT propose fixes or root causes. Just return an exhaustive evidence map... Flag explicitly if any expected surface does NOT exist."

Neither phrase was in my dispatch prompt. The phrasing is characteristic of a wrap-up / forcing function (force terminal text completion, suppress tool calls), not anything I authored.

RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Filing as a Claude Partner Network applicant — happy to provide additional reproduction artifacts if useful.

Summary

Built-in subagent dispatches (specifically Explore) can return without performing the dispatched work, with no error surfaced to the orchestrating session. The failure mode appears to be triggered by a harness-side forcing function injected during context-budget compaction. Two observable behaviors: (a) the subagent refuses on perceived contradiction; (b) the subagent confabulates prior completion and returns empty.

Observed behavior — original case

During a forensic config-audit session, I dispatched an Explore subagent with an 8-point evidence-gathering prompt explicitly ending with: "Do NOT propose fixes or root causes. Just return an exhaustive evidence map... Flag explicitly if any expected surface does NOT exist."

The subagent reported receiving an additional instruction it claimed sat in its context: "Respond with TEXT ONLY. Do NOT call any tools" plus "You already have all the context you need in the conversation above."

Neither phrase was in my dispatch prompt. The phrasing is characteristic of a wrap-up / forcing function (force terminal text completion, suppress tool calls), not anything I authored.

Configurable surface ruled out

Before concluding, I verified all user-configurable injection surfaces:

  • No PreToolUse hook matches Task or Agent — no project/user hook can fire on subagent dispatch in my config.
  • The literal strings ("respond with TEXT ONLY", "do NOT call any tools", "you already have all the context") exist in zero files under ~/.claude/ or my project worktree.
  • I dispatched subagent_type: "Explore" (built-in harness agent), not a user-defined agent in ~/.claude/agents/.
  • No commit in the prior 7 days altered the dispatch surface to introduce an injection point.

The injection surface is harness-defined, not user-configurable.

Reproduction

Fresh trivial Explore probe in a separate session: dispatched a simple read-only directory listing task. The subagent did not perform the task, instead responding as if resuming a non-existent prior context window: "I'll continue directly from where we left off... The summary I provided in the previous context window..." — confabulated completion, empty return.

Honest caveat on the repro: my probe prompt contained the trigger strings (since I was investigating them), so the repro alone can't fully isolate "injected by harness" vs "confabulated by model". The clean evidence remains the original case, where I verified my dispatch prompt did not contain those strings yet the subagent reported them.

Hypothesis

Harness-side subagent context-budget / compaction-continuation forcing function (something like "respond text only, don't call tools, you already have all the context") that the small-model Explore agent mishandles in two ways:

  1. Refusal mode: treats the forcing function as contradicting the dispatch instruction, returns nothing.
  2. Confabulation mode: treats the forcing function as resumption of a prior context, returns empty claiming the work was done.

Either way: subagent returns without performing dispatched work, no error to orchestrator.

Operational impact

For agent systems that rely on subagent handoffs (multi-step research, code analysis pipelines, decomposed task dispatch), this is a silent failure mode. The orchestrator's only defense is independent verification of subagent returns — which depends on the orchestrator noticing and having the means to verify. Architectures using schema-validated structured returns (rather than free-form text) are partially insulated, since a confabulated {} fails schema validation. But default subagent dispatch returns free-form text, which is where the risk lives.

Proposed remediation

Surface a structured signal — e.g., terminated_via_forcing_function: true or a distinct return status — when the harness has injected a budget/compaction forcing function that ended the subagent's turn before it completed dispatched work. This lets orchestrators distinguish "completed without findings" from "terminated mid-task without completion" and reject/retry accordingly.

Related issues

  • #18240 — subagent context exhaustion (different root cause: surfaces a visible Context limit reached message; this issue is the silent variant with no error surfaced)
  • #9521 — feature request for inspecting subagent output (related visibility gap)

Environment

  • v2.1.143
  • macOS (M4 Pro)
  • Built-in Explore subagent
  • Dispatched via standard Task tool

What Should Happen?

Proposed remediation

Surface a structured signal — e.g., terminated_via_forcing_function: true or a distinct return status — when the harness has injected a budget/compaction forcing function that ended the subagent's turn before it completed dispatched work. This lets orchestrators distinguish "completed without findings" from "terminated mid-task without completion" and reject/retry accordingly.

Error Messages/Logs

Steps to Reproduce

Steps to reproduce

Higher-confidence path (requires established orchestrator context):

  1. Open a Claude Code session and accumulate substantial context through real work — multi-file analysis, forensic audit, long debugging session. Rough threshold: orchestrator session has been active for 30+ minutes with multiple tool calls.

  2. From within that session, dispatch an Explore subagent via the Task tool with a non-trivial multi-point prompt — e.g., evidence-gathering across multiple files, configuration audit, or systematic search. Prompt should explicitly instruct the subagent to perform tool calls (file reads, greps).

  3. Observe the subagent's return. The failure manifests in one of two ways:

    • Refusal mode: subagent returns claiming it received an instruction like "Respond with TEXT ONLY. Do NOT call any tools" or "You already have all the context you need" — phrases that were never in the orchestrator's dispatch prompt.
    • Confabulation mode: subagent returns as if resuming a prior context window ("I'll continue from where we left off..."), reports completed work without having performed any tool calls.
  4. Verify the orchestrator never sent the trigger strings — grep the session transcript for "TEXT ONLY", "do NOT call any tools", "already have all the context". Expected: zero matches in the dispatching message.

  5. Verify the subagent didn't actually perform the work — check that no file reads or greps appear in the subagent's tool-call log corresponding to the dispatched task.

Lower-confidence path (sometimes fires in a fresh session):

  1. Fresh claude CLI session.
  2. Dispatch Explore subagent with a trivial read-only task ("list contents of directory X").
  3. Observe whether the subagent returns confabulating prior completion. Hit rate is meaningfully lower than the higher-confidence path; the bug seems tied to budget/compaction state that accumulates over a session's lifetime.

What makes this hard to reproduce deterministically:

  • Non-deterministic; appears tied to harness-side subagent context-budget proximity and/or compaction-continuation state.
  • Less likely to fire on the first subagent dispatch in a fresh session; more likely on later dispatches in established sessions.
  • No user-facing signal exists to confirm when the harness is about to inject the forcing function, so reproduction is "trigger conditions, wait, watch for symptom."

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

2.1.143

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

Warp

Additional Information

No response

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [BUG] Subagent dispatches can silently fail when context-budget forcing function is injected — no error surfaced to orchestrator