claude-code - 💡(How to fix) Fix Feature request: context_profile parameter on Agent (Task) tool to reduce subagent dispatch cost

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Fix Action

Fix / Workaround

Feature request: context_profile parameter on Agent (Task) tool to reduce subagent dispatch cost

Every subagent dispatched via the Agent (Task) tool currently inherits the parent session's full context as a cache-creation cost on the subagent's first message. Measured baseline on claude-opus-4-7 (1M context): ~54,000 tokens per dispatch, very tight standard deviation across dispatches (54,271 max vs 53,630 min across an 8-dispatch session).

For workflows that dispatch verification subagents, lens-frame divergent-converge agents, or parallel implementation subagents (8-12 dispatches per session is not unusual), cumulative subagent-startup cost can exceed 400,000 tokens before any subagent has done useful work. That cost is constant whether the subagent is doing a one-shot grep or a multi-hour implementation.

RAW_BUFFERClick to expand / collapse

Title

Feature request: context_profile parameter on Agent (Task) tool to reduce subagent dispatch cost

Body

Problem

Every subagent dispatched via the Agent (Task) tool currently inherits the parent session's full context as a cache-creation cost on the subagent's first message. Measured baseline on claude-opus-4-7 (1M context): ~54,000 tokens per dispatch, very tight standard deviation across dispatches (54,271 max vs 53,630 min across an 8-dispatch session).

For workflows that dispatch verification subagents, lens-frame divergent-converge agents, or parallel implementation subagents (8-12 dispatches per session is not unusual), cumulative subagent-startup cost can exceed 400,000 tokens before any subagent has done useful work. That cost is constant whether the subagent is doing a one-shot grep or a multi-hour implementation.

What gets inherited

Per the current behavior, the subagent's first turn includes:

  • Claude Code system prompt
  • Global ~/.claude/CLAUDE.md (operator-personal posture)
  • Repo CLAUDE.md at the parent's cwd (project-specific posture)
  • Transitively @-referenced files from either CLAUDE.md
  • Memory MEMORY.md index plus on-demand memory entries
  • Operator standing-instructions block (from a UserPromptSubmit hook in our setup)
  • Standard tool listings + tool schemas
  • The parent's task prompt envelope

For the simplest possible verification subagent (one shell command, three-sentence reply), most of the above context is unused yet billed at dispatch time.

Proposed shape

Add an optional context_profile parameter to the Agent tool with at least two named values:

ValueBehavior
full (default)Current behavior; preserves backward compatibility.
leanSuppresses non-load-bearing context: skip repo CLAUDE.md, skip memory MEMORY.md index, skip @-resolved files. Keep: system prompt, global CLAUDE.md (or even skip; see open questions), tool schemas, agent task prompt.

A bare value could also be useful for purely mechanical lookups (skip everything except system prompt + agent prompt + tool schemas).

Parent caller chooses the profile at dispatch time. Default stays full so existing parent code is unaffected.

Open design questions

  1. Where does the profile decision live? Parent-prompt-time, or a per-subagent-type default in some registry?
  2. Should CLAUDE.md content be marker-gateable? For example, <subagent-context-min-section> markers so the file's author can declare which sections survive lean profile inheritance. This lets operators tune the profile per-file rather than per-call.
  3. How should lean mode interact with the divergent-converge pattern, where parallel subagents do need shared canonical-discipline content?
  4. Should lean mode include a "load on demand" path so the subagent can pull in skipped context via Read if the task warrants it, without re-paying for cache creation on the first turn?

Witnessed pattern that surfaced this

Operator-side workflow (Nickcom4) routinely dispatches:

  • 3 divergent-thinking subagents on substantive design or evaluation work (the 2-or-3-witness pattern)
  • 1-2 verification subagents per PR before merge
  • N parallel implementation subagents for independent tier-1 PRs in an umbrella issue

A typical session that ships 4-8 PRs incurs 10-20 subagent dispatches. At ~54k tokens each, the startup-only cost dominates the meaningful-work tokens for short subagent tasks. The asymmetry encourages skipping verification or batching work in ways that hurt PR quality.

What I've already tried locally

  • Trimming the project CLAUDE.md content (saves ~6k per dispatch on identical content; bounded by the operator-personal frame size).
  • Removing duplicated content (the same file was loading as both global and project; ~6k saved).
  • Measurement script that tracks per-dispatch cache_creation_input_tokens so the trim work is data-driven.

Combined, these recover roughly 24% of the per-dispatch cost. The remaining 76% is Anthropic-side (system prompt, tool listings, the dispatch envelope) and not reachable from the operator side.

Refs

  • (link to operator-side measurement issue / closed issue in their repo, if filed)
  • (link to umbrella cost-burn diagnostic, if filed)

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix Feature request: context_profile parameter on Agent (Task) tool to reduce subagent dispatch cost