claude-code - 💡(How to fix) Fix [FEATURE] Cache read tokens from parallel subagent dispatch accumulate without upper bound [1 comments, 2 participants]

claude-code, 2026-04-10 21:22:36


GitHub stats

anthropics/claude-code#46421•Fetched 2026-04-11 06:20:44


Author

Fearvox

Participants

Fearvox

github-actions[bot]

Timeline (top)

labeled ×3, commented ×1

Fix Action

Fix / Workaround

When dispatching parallel subagents (via the Agent tool with allowParallel: true), each subagent session re-reads the full conversation history and parent context from cache. This causes cache_read tokens to accumulate rapidly and without upper bound across parallel subagent dispatches.



Preflight Checklist

I have searched existing issues and this has not been reported yet
This is a single bug report
I am using the latest version of Claude Code

What's Wrong?

Reproduction

Spawn 3 parallel subagents simultaneously (each running a long task)
Observe that each subagent independently re-reads the same conversation history from cache
Cache read tokens accumulate multiplicatively: 3 agents × N context tokens each = 3N cache reads
A 90-minute parallel subagent session can burn ~15M cache_read tokens with no user-visible benefit

Observed Symptom

Subagent 1: cache_read = X tokens
Subagent 2: cache_read = X tokens  
Subagent 3: cache_read = X tokens
Total: 3X cache_read tokens billed to parent session

The parent session is billed for cache reads that each subagent performs independently, even though subagents are supposed to be isolated. Subagent reads should not inflate the parent session's cache read token count.
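The multiplicative pattern reported above can be written out as simple arithmetic. The sketch below is illustrative only: the function name, the context size `X`, and the per-subagent turn count are assumptions picked for the example, not measurements from the issue. It also assumes that every subagent turn re-reads the same shared context.

```python
def total_cache_read_tokens(num_subagents: int, context_tokens: int,
                            turns_per_subagent: int = 1) -> int:
    """Total cache_read tokens when every subagent re-reads the same
    context on each of its turns (the multiplicative pattern above)."""
    if min(num_subagents, context_tokens, turns_per_subagent) < 0:
        raise ValueError("all arguments must be non-negative")
    return num_subagents * context_tokens * turns_per_subagent


# Hypothetical shared context size (the "X" in the report).
X = 100_000

# One read per subagent: 3 agents x X tokens = 3X.
print(total_cache_read_tokens(3, X))  # 300000

# Assumed 50 turns per subagent: the total grows linearly with turn count.
print(total_cache_read_tokens(3, X, turns_per_subagent=50))  # 15000000
```

Because the total is a product of three factors, adding subagents, growing the shared context, or running more turns each raises the parent's reported cache_read count proportionally.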

Expected Behavior

Cache read tokens should do one of the following:

Not be billed to the parent session: subagent cache reads are subagent overhead, not parent overhead
Be deduplicated across subagents: if multiple subagents read the same context, count it once
Be subject to a clear cap: set a maximum cache read budget per subagent dispatch to prevent runaway accumulation
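To compare the three options, here is a minimal accounting sketch. Everything in it is hypothetical: the `CacheRead` record, the `prefix_hash` field, and the policy names are invented for illustration and do not describe how Claude Code tracks usage. The cap is modeled purely as an accounting limit, which is only one possible reading of the proposal.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CacheRead:
    agent_id: str     # "parent" or a subagent identifier (hypothetical)
    prefix_hash: str  # hypothetical id of the cached context that was read
    tokens: int


def parent_cache_read_total(reads: Iterable[CacheRead], policy: str,
                            cap_per_subagent: Optional[int] = None) -> int:
    """Cache_read tokens attributed to the parent session under a policy."""
    reads = list(reads)
    if policy == "current":            # reported behavior: everything is summed
        return sum(r.tokens for r in reads)
    if policy == "exclude_subagents":  # option 1: only the parent's own reads
        return sum(r.tokens for r in reads if r.agent_id == "parent")
    if policy == "dedupe":             # option 2: count each shared context once
        first_seen = {}
        for r in reads:
            first_seen.setdefault(r.prefix_hash, r.tokens)
        return sum(first_seen.values())
    if policy == "cap":                # option 3: limit each subagent's share
        if cap_per_subagent is None:
            raise ValueError("cap policy requires cap_per_subagent")
        per_agent = defaultdict(int)
        for r in reads:
            per_agent[r.agent_id] += r.tokens
        return sum(total if agent == "parent" else min(total, cap_per_subagent)
                   for agent, total in per_agent.items())
    raise ValueError(f"unknown policy: {policy}")


# Three subagents each read the same 1,000-token context.
reads = [CacheRead(f"sub{i}", "ctx-a", 1_000) for i in range(1, 4)]
print(parent_cache_read_total(reads, "current"))            # 3000
print(parent_cache_read_total(reads, "exclude_subagents"))  # 0
print(parent_cache_read_total(reads, "dedupe"))             # 1000
print(parent_cache_read_total(reads, "cap", 400))           # 1200
```

The three policies give very different parent totals for identical activity, which is why the choice between them is mostly a question of how usage metrics should be interpreted.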

Technical Notes

This is distinct from issue #45958 (subagent notification stall); this issue is specifically about cache_read token accumulation inflating the parent session's usage metrics
The same pattern affects CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 parallel dispatch workflows
Not limited to long-running sessions; even short parallel tasks trigger multiplicative cache reads

Suggested Labels

area:agents, area:cost, enhancement

extent analysis

TL;DR

Implement a mechanism to deduplicate cache reads across subagents or set a clear cap on cache read tokens per subagent dispatch to prevent accumulation.

Guidance

Investigate modifying the Agent tool to implement a cache read deduplication mechanism, ensuring that each unique cache read is only counted once across all subagents.
Consider introducing a cacheReadBudget parameter for subagent dispatches, allowing users to set a maximum cache read token limit per subagent to prevent runaway accumulation.
Review the billing logic for cache reads to determine if it's possible to exclude subagent cache reads from the parent session's usage metrics.
Explore the impact of CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 on cache read token accumulation and determine if a similar solution can be applied to this workflow.

Example

No code example is provided due to the lack of specific implementation details in the issue.
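That said, the budget idea from the Guidance section can still be sketched in isolation. The class below is a hypothetical illustration of what a per-subagent `cacheReadBudget` might enforce. It is not an existing Claude Code API, and the names `CacheReadBudget`, `CacheReadBudgetExceeded`, and `charge` are assumptions made for this example.

```python
class CacheReadBudgetExceeded(RuntimeError):
    """Raised when a subagent would read more cached tokens than allowed."""


class CacheReadBudget:
    """Hypothetical per-subagent cap on cache_read tokens."""

    def __init__(self, limit_tokens: int) -> None:
        if limit_tokens < 0:
            raise ValueError("limit_tokens must be non-negative")
        self.limit_tokens = limit_tokens
        self.used_tokens = 0

    def charge(self, tokens: int) -> int:
        """Record a cache read and return the remaining budget.

        Raises CacheReadBudgetExceeded, without recording the read,
        if it would push usage past the limit.
        """
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used_tokens + tokens > self.limit_tokens:
            raise CacheReadBudgetExceeded(
                f"read of {tokens} tokens exceeds remaining budget "
                f"{self.limit_tokens - self.used_tokens}"
            )
        self.used_tokens += tokens
        return self.limit_tokens - self.used_tokens


# Hypothetical limit: 250,000 cache_read tokens for one subagent dispatch.
budget = CacheReadBudget(250_000)
print(budget.charge(100_000))  # 150000
print(budget.charge(100_000))  # 50000
try:
    budget.charge(100_000)     # would exceed the limit
except CacheReadBudgetExceeded as exc:
    print("stopped:", exc)
```

How a real dispatcher should react when the budget is hit (abort the subagent, summarize its context, or fall back to a smaller prompt) is left open here, since the issue does not specify it.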

Notes

The solution may require significant changes to the underlying architecture of the Agent tool and the cache read billing logic. It's essential to consider the potential performance implications of implementing a deduplication mechanism or cache read budgeting system.

Recommendation

Introduce either a cache read deduplication mechanism or a clear cap on cache read tokens per subagent dispatch. Either change would prevent unbounded cache read token accumulation and give a more accurate picture of the parent session's usage metrics.


#api #conversation history #parallel task #orchestration issue #cache issue
