claude-code - 💡(How to fix) Fix Agent tool: add context isolation mode to prevent evaluator bias in multi-agent setups

Root Cause

Multi-agent Claude Code setups are increasingly using agents-as-evaluators — for quality gates, deliberation, adversarial review, and synthesis. Without context isolation, these evaluators are architecturally biased toward the generator's reasoning, which defeats their core purpose.

The problem scales with harness complexity: the more reasoning the orchestrator accumulates, the stronger the evaluator's inherited bias.

Code Example

Orchestrator generates Proposal A
  → Orchestrator spawns Agent("evaluate Proposal A")
  → Agent receives prompt + implicit parent context (including how Proposal A was reasoned)
  → Agent's "evaluation" skews toward validating Proposal A
  → Quality gate fails its purpose

---

Innovator (Agent) → Devil-Advocate (Agent) → Mediator

---

# Before: biased Mediator
[Orchestrator context = full session + Innovator + Devil]
Orchestrator performs synthesis ← same instance, sees all prior reasoning

# After: isolated Mediator
Agent(
  prompt="Synthesize the following without prior context",
  input=only(innovator_output, devil_output)  # explicit context boundary
)

---

# Illustrative API
Agent(
  prompt="...",
  context_isolation="strict",   # spawned agent sees ONLY what's in `prompt`
                                # no parent reasoning chain inherited
)

Problem

When Claude Code's Agent tool spawns a sub-agent for evaluation roles (critic, mediator, quality-gate), the spawned agent implicitly inherits the parent's reasoning trajectory. Even with a neutral system prompt, the evaluator "sees" the generator's context and biases its judgment toward agreement.

This is the Cost of Consensus problem — empirically measured: a 32.3 percentage point performance drop when the same model evaluates outputs generated within a shared context (arXiv 2605.00914).

Concrete failure mode:

Orchestrator generates Proposal A
  → Orchestrator spawns Agent("evaluate Proposal A")
  → Agent receives prompt + implicit parent context (including how Proposal A was reasoned)
  → Agent's "evaluation" skews toward validating Proposal A
  → Quality gate fails its purpose

This affects any multi-agent pattern that uses agents-as-evaluators: deliberation, adversarial review, automated critique loops, and multi-persona synthesis.

Real-World Evidence (forge-harness)

We run forge-harness (github.com/chrono-code/forge-harness), a Claude Code multi-agent harness with 23 skills. One skill, deliberation, implements a 3-layer pipeline:

Innovator (Agent) → Devil-Advocate (Agent) → Mediator

Before redesign: Mediator step was performed by the orchestrating skill itself (same Claude instance, same accumulated context). Observed behavior: Mediator consistently favored whichever persona had appeared last — not genuine synthesis.

After redesign (based on Cost of Consensus findings): Mediator is now spawned as a separate isolated Agent call receiving only the Innovator and Devil outputs — no parent reasoning chain:

# Before: biased Mediator
[Orchestrator context = full session + Innovator + Devil]
Orchestrator performs synthesis ← same instance, sees all prior reasoning

# After: isolated Mediator
Agent(
  prompt="Synthesize the following without prior context",
  input=only(innovator_output, devil_output)  # explicit context boundary
)

Synthesis quality improved measurably after isolation. The evaluator no longer "knew" which side the orchestrator had been leaning toward.

Feature Request

Option A — Documentation (minimal)

Publish an official pattern for context-isolated evaluation in multi-agent setups. Clarify which parts of parent context are inherited when using the Agent tool, and how practitioners can minimize implicit context leakage.

Option B — Explicit isolation parameter (ideal)

Add a context isolation option to the Agent tool invocation so developers can declare a hard context boundary:

# Illustrative API
Agent(
  prompt="...",
  context_isolation="strict",   # spawned agent sees ONLY what's in `prompt`
                                # no parent reasoning chain inherited
)

This would make evaluator sub-agents structurally independent — solving the self-evaluation bias at the platform level rather than requiring harness-level workarounds.

Option C — Worktree guidance

If strict isolation already exists via worktrees or other mechanisms, document the recommended pattern for evaluation sub-agents specifically (vs. task execution sub-agents).

Why This Matters

The problem scales with harness complexity: the more reasoning the orchestrator accumulates, the stronger the evaluator's inherited bias.

References

arXiv 2605.00914 — "Cost of Consensus: The Hidden Price of Multi-Agent Concordance" (32.3pp degradation measured)
forge-harness deliberation skill — real-world implementation and redesign: https://github.com/chrono-code/forge-harness
Anthropic internal experiment: $9 single-agent vs $200 multi-agent harness quality gap (cited in official materials)

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix Agent tool: add context isolation mode to prevent evaluator bias in multi-agent setups

Recommended Tools

GitHub issue graph ai analysis