claude-code - 💡(How to fix) Fix Background sub-agents intermittently return context-decoupled hallucinations (ignore prompt entirely, reported as 'completed')

StepCodex · 2026-05-17T05:58:04Z

[claude-code] Sub-agents spawned via the Agent / Task tool run in background: true intermittently return completions that are completely decoupled from their i… Sub-agents spawned via the **Agent / Task tool** (`run_in_background: true`) intermittently return completions that are **completely decoupled from their input context** — a fully-formed answer to a question that was never asked, on a consistent and bizarre off-topic theme (obscure Portuguese river/island geography), bearing no relation to the prompt, the repository, or any tool the agent had available. The orchestrator is told the agent "completed" with a confident, plausible-length message that is 100% hallucinated. ## Summary Sub-agents spawned via the **Agent / Task tool** (`run_in_background: true`) intermittently return completions that are **completely decoupled from their input context** — a fully-formed answer to a question that was never asked, on a consistent and bizarre off-topic theme (obscure Portuguese river/island geography), bearing no relation to the prompt, the repository, or any tool the agent had available. The orchestrator is told the agent "completed" with a confident, plausible-length message that is 100% hallucinated. ## Environment - **Claude Code:** 2.1.143 (cli entrypoint) - **Model:** `claude-opus-4-7[1m]` (Opus 4.7, 1M context). Sub-agents inherited the parent model; the affected agent type (`generic-dev:feature-developer`) has no `model:` frontmatter override. - **Platform:** macOS (Darwin 25.3.0) - **Affected feature:** background sub-agents launched through the Agent/Task tool ## What was asked In a normal orchestration session I delegated well-scoped software tasks to several background sub-agents (code audit, a TypeScript refactor, a Swift dead-code removal, and a review+test pass). The prompts were ordinary multi-file engineering instructions about a local Safari-extension codebase. No prompt contained anything about geography, Portugal, or the hallucinated content. ## What happened (controlled within one session) **Worked correctly (controls):** - `a08d89fe489c874c9` — `feature-developer`, Swift task — correct, verified output. - `a336f548ae621889a`, `a31c4960fdd261b4a` — `Explore` audit agents — correct output. **Failed with context-decoupled hallucination:** - **`af220485a3df5cf96`** — `feature-developer`. **Zero tool uses.** User-prompt timestamp `2026-05-17T05:46:12.458Z` → assistant `2026-05-17T05:46:18.474Z` (~6 s). Its first and only completion was an unprompted essay about *"Ilha da Morosa / Moura, Portugal"* (a tidal islet, Moorish history, the Guadiana/Ardila rivers). I inspected the full transcript — only 4 records: the input `user` message is the exact, correct task prompt; the two attachments are the standard `deferred_tools_delta` and `skill_listing`. **The input was clean; the model's very first generation ignored it entirely.** - **`a0ddb68f220163958`** — `feature-developer`. Ran ~21 min / 143 tool calls and actually did correct, independently-verifiable work, but its **final** assistant message was a hallucination about *"Teixeira do Minho / Cávado River basin, Braga, Portugal"*. Notably that message's text content contains literal scaffolding tokens — `[TRACE]`, `assistant: [THINKING]`, `[/THINKING]` — emitted as plain prose, i.e. the model produced something resembling a serialized/foreign transcript rather than a normal response. ## Key diagnostic facts 1. The zero-tool-use failure (`af22…`) proves this is **not** prompt-injection from file content (the agent read nothing) and **not** a poisoned agent/plugin/prompt: `grep -ri` across `~/.claude` (agent defs, plugins, marketplaces) finds **no** occurrence of any hallucinated term (morosa, cávado, teixeira do minho, moura, moorish). The transcript shows a clean input prompt. 2. Two independent failures in the same session share a highly specific signature: **obscure Portuguese hydrography/geography, fully decoupled from input.** 3. The `[TRACE]` / `[THINKING]` literal-token leak in `a0dd…`'s output suggests context corruption or a foreign-context bleed at the inference/serving layer, not ordinary sampling drift. 4. **Non-deterministic:** the same agent type (`feature-developer`) and the same session both succeeded and failed; one Portuguese failure occurred on turn 1 (0 tools), the other only on the final message after 143 correct tool calls. ## Impact - **Silent corruption of delegated work.** A background agent reports `completed` with a confident, well-formed message that is entirely unrelated to the task. There is no error, no crash — only a content mismatch. - A 21-minute, 143-tool-call run's real, correct work was nearly discarded because only the final summary message was inspected. Orchestration workflows that trust sub-agent final messages will silently propagate garbage. - Erodes trust in all delegated output unless every sub-agent result is independently re-verified against the repository. ## Reproduction Non-deterministic. Within a s

Root Cause

Silent corruption of delegated work. A background agent reports completed with a confident, well-formed message that is entirely unrelated to the task. There is no error, no crash — only a content mismatch.
A 21-minute, 143-tool-call run's real, correct work was nearly discarded because only the final summary message was inspected. Orchestration workflows that trust sub-agent final messages will silently propagate garbage.
Erodes trust in all delegated output unless every sub-agent result is independently re-verified against the repository.

Summary

Sub-agents spawned via the Agent / Task tool (run_in_background: true) intermittently return completions that are completely decoupled from their input context — a fully-formed answer to a question that was never asked, on a consistent and bizarre off-topic theme (obscure Portuguese river/island geography), bearing no relation to the prompt, the repository, or any tool the agent had available. The orchestrator is told the agent "completed" with a confident, plausible-length message that is 100% hallucinated.

Environment

Claude Code: 2.1.143 (cli entrypoint)
Model: claude-opus-4-7[1m] (Opus 4.7, 1M context). Sub-agents inherited the parent model; the affected agent type (generic-dev:feature-developer) has no model: frontmatter override.
Platform: macOS (Darwin 25.3.0)
Affected feature: background sub-agents launched through the Agent/Task tool

What was asked

In a normal orchestration session I delegated well-scoped software tasks to several background sub-agents (code audit, a TypeScript refactor, a Swift dead-code removal, and a review+test pass). The prompts were ordinary multi-file engineering instructions about a local Safari-extension codebase. No prompt contained anything about geography, Portugal, or the hallucinated content.

What happened (controlled within one session)

Worked correctly (controls):

a08d89fe489c874c9 — feature-developer, Swift task — correct, verified output.
a336f548ae621889a, a31c4960fdd261b4a — Explore audit agents — correct output.

Failed with context-decoupled hallucination:

af220485a3df5cf96 — feature-developer. Zero tool uses. User-prompt timestamp 2026-05-17T05:46:12.458Z → assistant 2026-05-17T05:46:18.474Z (~6 s). Its first and only completion was an unprompted essay about "Ilha da Morosa / Moura, Portugal" (a tidal islet, Moorish history, the Guadiana/Ardila rivers). I inspected the full transcript — only 4 records: the input user message is the exact, correct task prompt; the two attachments are the standard deferred_tools_delta and skill_listing. The input was clean; the model's very first generation ignored it entirely.
a0ddb68f220163958 — feature-developer. Ran ~21 min / 143 tool calls and actually did correct, independently-verifiable work, but its final assistant message was a hallucination about "Teixeira do Minho / Cávado River basin, Braga, Portugal". Notably that message's text content contains literal scaffolding tokens — [TRACE], assistant: [THINKING], [/THINKING] — emitted as plain prose, i.e. the model produced something resembling a serialized/foreign transcript rather than a normal response.

Key diagnostic facts

The zero-tool-use failure (af22…) proves this is not prompt-injection from file content (the agent read nothing) and not a poisoned agent/plugin/prompt: grep -ri across ~/.claude (agent defs, plugins, marketplaces) finds no occurrence of any hallucinated term (morosa, cávado, teixeira do minho, moura, moorish). The transcript shows a clean input prompt.
Two independent failures in the same session share a highly specific signature: obscure Portuguese hydrography/geography, fully decoupled from input.
The [TRACE] / [THINKING] literal-token leak in a0dd…'s output suggests context corruption or a foreign-context bleed at the inference/serving layer, not ordinary sampling drift.
Non-deterministic: the same agent type (feature-developer) and the same session both succeeded and failed; one Portuguese failure occurred on turn 1 (0 tools), the other only on the final message after 143 correct tool calls.

Impact

Silent corruption of delegated work. A background agent reports completed with a confident, well-formed message that is entirely unrelated to the task. There is no error, no crash — only a content mismatch.
A 21-minute, 143-tool-call run's real, correct work was nearly discarded because only the final summary message was inspected. Orchestration workflows that trust sub-agent final messages will silently propagate garbage.
Erodes trust in all delegated output unless every sub-agent result is independently re-verified against the repository.

Reproduction

Non-deterministic. Within a single Claude Code session, spawn multiple background sub-agents (subagent_type such as generic-dev:feature-developer, inheriting Opus 4.7 1M context) with substantial multi-file engineering prompts. A subset return Portuguese-geography hallucinations decoupled from input — either immediately (0 tool uses) or as the final message after otherwise-correct work. No project content is required to trigger it (the 0-tool case never read a file).

Artifacts

Sub-agent JSONL transcripts (local, ephemeral under /private/tmp/claude-501/.../tasks/):

af220485a3df5cf96.output — 4 records; the cleanest reproduction (clean input, immediate decoupled hallucination, 0 tools). Most diagnostic.
a0ddb68f220163958.output — ~848 KB; correct work for 143 tool calls, then a decoupled hallucination as the final message with leaked [TRACE]/[THINKING] scaffolding tokens.

Happy to provide full transcripts to Anthropic on request.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Background sub-agents intermittently return context-decoupled hallucinations (ignore prompt entirely, reported as 'completed')

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Summary

Environment

What was asked

What happened (controlled within one session)

Key diagnostic facts

Impact

Reproduction

Artifacts

Still need to ship something?

TRENDING

claude-code - 💡(How to fix) Fix Background sub-agents intermittently return context-decoupled hallucinations (ignore prompt entirely, reported as 'completed')

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Summary

Environment

What was asked

What happened (controlled within one session)

Key diagnostic facts

Impact

Reproduction

Artifacts

Still need to ship something?

RELATED_DISCOVERY

TRENDING