claude-code: [BUG] Subagent 'Warmup' drains 99%+ of subscription input tokens (regression since 2026-04-09; ref #17457 closed NOT_PLANNED)

GitHub stats
anthropics/claude-code#47922 · Fetched 2026-04-15 06:38:32
Comments: 0 · Participants: 1 · Timeline events: 8 · Reactions: 0
Timeline (top): labeled ×6, cross-referenced ×1, subscribed ×1


Preflight Checklist

  • I have searched existing open issues — none tracking this. Closely related: #17457 (closed NOT_PLANNED on 2026-02-28), #16752, #16961, #25138 — all closed. Reopening the topic because the behaviour re-surfaced / intensified after 2026-04-09.
  • Single bug report.
  • Using latest — claude --version: 2.1.107.

What's Wrong?

Subagent Warmup is draining subscription tokens on a massive scale. Every registered subagent is spawned at session start with the hard-coded user prompt "Warmup" and pays a full cold-cache input cost (no prompt-cache reuse between sessions). On a machine with plugins that register several agents, this alone consumes 99 %+ of the input tokens billed against the weekly subscription limit.

First "Warmup" message across 27,038 transcripts in ~/.claude/projects/ on my host: 2026-04-09 17:12 UTC. Before that date — zero Warmups, ever. Some behaviour change landed on/around 2026-04-09 that turned the feature back on or raised its intensity, post-dating the NOT_PLANNED closure of #17457.

What Should Happen?

Either of:

  1. The Warmup turn is prompt-cached across session starts (cache-creation only once per agent-manifest change), or
  2. Warmup is opt-in per-agent via a manifest key like warmup: false, or
  3. A granular env var like CLAUDE_CODE_DISABLE_AGENT_WARMUP=1 is provided that does not also disable Ctrl+B, run_in_background: true, and auto-backgrounding (as the current CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 does).

In any case: document Warmup in the changelog and in Manage costs effectively so users can correlate sudden token-usage spikes with the feature.

Evidence / Impact

Walked every JSONL under ~/.claude/projects/ and deduped on message.id + model + requestId, the same key ccusage uses. Dedup is needed because Claude Code replays the same assistant turn into every transcript that forks, resumes, or compacts from the originating session (up to 30× in my case), which inflates naive totals 2–3×. Totals were cross-validated against ccusage daily.

Script: mann1x/claude-hooks · scripts/weekly_token_usage.py (stdlib only, no deps).
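
The dedup step described above can be sketched as follows. This is a minimal illustration, not the code of weekly_token_usage.py: the helper names are hypothetical, and it assumes that requestId sits at the top level of each transcript entry while id and model sit under message.

```python
import json
from typing import Iterable, Iterator


def dedup_key(entry: dict) -> tuple:
    """Composite key message.id + model + requestId.

    Assumption: requestId is a top-level field of the entry.
    """
    msg = entry.get("message") or {}
    return (msg.get("id"), msg.get("model"), entry.get("requestId"))


def dedupe_assistant_entries(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each assistant entry once, skipping replayed copies."""
    seen = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines rather than abort the walk
        if not isinstance(entry, dict) or entry.get("type") != "assistant":
            continue
        key = dedup_key(entry)
        if key[0] is None:
            # No message.id: cannot dedupe safely, so pass the entry through.
            yield entry
            continue
        if key in seen:
            continue
        seen.add(key)
        yield entry
```

Feeding the lines of every transcript through one shared call keeps a single `seen` set, so a turn replayed into several forked transcripts is only counted once.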

Per-day split (Fri 2026-04-10 10:00 CEST → Tue 2026-04-14):

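| Day | Main input | Sidechain input | Sidechain share |
|-----|-----------:|----------------:|----------------:|
| Fri | 2 760 | 7 370 | 72.8 % |
| Sat | 3 795 | 433 308 | 99.1 % |
| Sun | 5 069 | 346 188 | 98.6 % |
| Mon | 4 405 | 600 005 | 99.3 % |
| Tue | 4 493 | 1 304 529 | 99.7 % |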
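
As a sanity check, the share column can be recomputed from the two input columns. The sketch below assumes the share is sidechain input divided by the sum of main and sidechain input; the figures are the ones reported in the per-day split above.

```python
# (main input, sidechain input) per day, as reported in the per-day split
rows = {
    "Fri": (2_760, 7_370),
    "Sat": (3_795, 433_308),
    "Sun": (5_069, 346_188),
    "Mon": (4_405, 600_005),
    "Tue": (4_493, 1_304_529),
}


def sidechain_share(main_input: int, sidechain_input: int) -> float:
    """Percentage of input tokens spent in sidechains (assumed definition)."""
    total = main_input + sidechain_input
    return 100.0 * sidechain_input / total if total else 0.0


shares = {day: round(sidechain_share(m, s), 1) for day, (m, s) in rows.items()}
for day, pct in shares.items():
    print(f"{day}: {pct} %")
```

Under that assumed definition the recomputed values match the reported column for every day.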

Tuesday breakdown (the worst day):

  • 1 143 sidechain user messages carried the literal text "Warmup".
  • Only 4 sidechain messages carried a real task prompt.
  • ≈ 99 % of the 1.3 M sidechain input tokens were the Warmup spawns, not user work.

Main conversation used only 2–5 k input tokens per day thanks to prompt caching. Warmups pay the full uncached input cost at every single session start because each agent's context is new.

Dedup validation (anticipating the obvious question)

The per-day totals grow steeply from Fri to Tue (with a dip on Sunday), which could in principle hide a dedup bug. It was checked four ways:

1. Composite key equals single-field key. Dedup key is message.id + model + requestId. Across the whole week's sidechain entries, using message.id alone produces the same deduped count (4 757). No silent collisions — message.id is already unique per API call.

2. Raw-to-deduped ratio is consistent at ~2×. Without dedup the week has 9 338 sidechain assistant entries; with dedup 4 757 (ratio 1.96×). Matches the expected replay pattern — each sidechain turn is written into both the originating transcript and the forked-session transcript it spawns, so most messages appear exactly twice.

3. Growth tracks session count, not a dedup artefact.

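| Day | Sidechain sessions | Unique agentIds | Deduped warmup msgs | Deduped input |
|-----|-------------------:|----------------:|--------------------:|--------------:|
| Fri | 1 | 3 | 95 | 7 370 |
| Sat | 141 | 396 | 567 | 433 308 |
| Sun | 153 | 456 | 696 | 346 188 |
| Mon | 224 | 661 | 1 189 | 600 005 |
| Tue | 390 | 1 148 | 2 210 | 1 304 529 |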

Fri has only 1 sidechain session because Warmup had just rolled out (first global Warmup: 2026-04-09 17:12 UTC) and the Fri-10:00-CEST weekly window excludes the handful of Thu-evening Warmups that preceded it. Sessions per day grew Fri → Tue from 1 → 390; warmups-per-session stayed constant at 2–3 (≈ one per registered agent). The 1.3 M Tue figure decomposes cleanly as 390 sessions × ~3 agents warmed × ~1–8 k fresh input tokens each.

4. ccusage cross-check. ccusage daily -z Europe/Berlin --since 20260410 reports the same per-day totals within the CET↔CEST window delta — e.g. this script reports 446 100 541 total Tuesday tokens; ccusage reports 446 712 100 (delta ≈ 0.14 %, attributable to ccusage grouping by full calendar day while this script uses the Fri-10:00-CEST reset-shifted day).
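
The arithmetic behind checks 2 to 4 can be reproduced in a few lines. This is only a worked recomputation of the figures quoted above, restricted to the four days with more than one sidechain session; it assumes the per-session figure is unique agentIds divided by sidechain sessions.

```python
# Check 2: raw vs deduped sidechain assistant entries for the week
raw_entries = 9_338
deduped_entries = 4_757
replay_ratio = raw_entries / deduped_entries

# Check 3: (sidechain sessions, unique agentIds) per day, Sat through Tue
days = {
    "Sat": (141, 396),
    "Sun": (153, 456),
    "Mon": (224, 661),
    "Tue": (390, 1_148),
}
agents_per_session = {d: ids / sessions for d, (sessions, ids) in days.items()}

# Check 4: this script's Tuesday total vs the ccusage total
script_total = 446_100_541
ccusage_total = 446_712_100
delta_pct = 100 * abs(ccusage_total - script_total) / script_total

print(f"replay ratio: {replay_ratio:.2f}x")
for d, v in agents_per_session.items():
    print(f"{d}: {v:.2f} agents warmed per session")
print(f"ccusage delta: {delta_pct:.2f} %")
```

The replay ratio comes out at about 1.96×, every day lands between 2 and 3 agents per session, and the ccusage delta rounds to 0.14 %.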

Error Messages/Logs

No error — tokens drain silently. The Warmup entries look like this in the transcript JSONL (~/.claude/projects/<project>/<session>.jsonl):

# user side (sidechain opener)
{"type":"user","isSidechain":true,"agentId":"a72e8af","message":{"role":"user","content":[{"type":"text","text":"Warmup"}]},"timestamp":"2026-04-14T12:44:04.879Z"}

# assistant side (the costly call)
{"type":"assistant","isSidechain":true,"message":{"model":"claude-opus-4-5-20251101","usage":{"input_tokens":8082,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":2}},"timestamp":"2026-04-14T12:44:05.112Z"}

Note the zero cache_read_input_tokens — no cache hits between sessions.
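
For readers who want to spot these entries in their own transcripts, here is a minimal sketch that classifies the two sample lines above. The helper names are hypothetical and the field layout is assumed to match the samples; real transcripts may carry additional fields.

```python
import json


def is_warmup_prompt(entry: dict) -> bool:
    """True for a sidechain user entry whose text is the literal 'Warmup'."""
    if entry.get("type") != "user" or not entry.get("isSidechain"):
        return False
    content = (entry.get("message") or {}).get("content")
    if isinstance(content, str):
        return content.strip() == "Warmup"
    if isinstance(content, list):
        return any(
            isinstance(block, dict)
            and block.get("type") == "text"
            and (block.get("text") or "").strip() == "Warmup"
            for block in content
        )
    return False


def sidechain_usage(entry: dict) -> dict:
    """Return the usage dict of a sidechain assistant entry, else an empty dict."""
    if entry.get("type") != "assistant" or not entry.get("isSidechain"):
        return {}
    return (entry.get("message") or {}).get("usage") or {}


user_line = '{"type":"user","isSidechain":true,"agentId":"a72e8af","message":{"role":"user","content":[{"type":"text","text":"Warmup"}]},"timestamp":"2026-04-14T12:44:04.879Z"}'
assistant_line = '{"type":"assistant","isSidechain":true,"message":{"model":"claude-opus-4-5-20251101","usage":{"input_tokens":8082,"cache_creation_input_tokens":0,"cache_read_input_tokens":0,"output_tokens":2}},"timestamp":"2026-04-14T12:44:05.112Z"}'

user_entry = json.loads(user_line)
assistant_entry = json.loads(assistant_line)
usage = sidechain_usage(assistant_entry)
print(is_warmup_prompt(user_entry), usage.get("input_tokens"), usage.get("cache_read_input_tokens"))
```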

Steps to Reproduce

  1. Install a plugin that registers several subagents (e.g. code-analysis@mag-claude-plugins registers 13 detective agents) — or simply rely on the ~6 built-in agents (general-purpose, Explore, Plan, claude-code-guide, statusline-setup, user-defined detective).

  2. Start a few fresh claude sessions in a row.

  3. Count Warmup prompts across all transcripts: grep -h '"Warmup"' ~/.claude/projects/*/*.jsonl | wc -l (plain grep -c prints one count per file rather than a total).

  4. Optional quantitative run:

    # repro script
    git clone https://github.com/mann1x/claude-hooks
    python3 claude-hooks/scripts/weekly_token_usage.py --show-sidechain
    # Expect: Sub%Inp column reads 95–99 % on any day with agents loaded.
  5. Apply the only documented mitigation — CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 in ~/.claude/settings.json under env — and re-run the script. Sub%Inp drops to single digits.

Current (unsatisfactory) workaround

Set in ~/.claude/settings.json:

"env": { "CLAUDE_CODE_DISABLE_BACKGROUND_TASKS": "1" }

Per the official Interactive mode docs, this "disables all background task functionality". The switch is all-or-nothing:

  • ❌ Ctrl+B shortcut (TUI backgrounding)
  • ❌ Bash tool's run_in_background: true parameter
  • ❌ Auto-backgrounding of long Bash tools
  • ❌ (per #17457) subagent Warmup

There is no per-feature toggle. Users who rely on backgrounded Bash have to choose between token savings and ergonomics.

CLAUDE_AGENT_SDK_DISABLE_BUILTIN_AGENTS=1 only applies in -p non-interactive mode — in the interactive TUI the 5-6 built-in agents still warm regardless.
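
The settings change above can be applied without hand-editing the file. The following is a minimal sketch with a hypothetical helper name; it assumes settings.json is a plain JSON object and merges the variable into the env block while leaving other keys untouched.

```python
import json
from pathlib import Path


def set_env_flag(settings_path: Path, name: str, value: str) -> dict:
    """Merge one variable into the 'env' object of a settings JSON file."""
    settings = {}
    if settings_path.exists():
        text = settings_path.read_text(encoding="utf-8")
        if text.strip():
            settings = json.loads(text)
    if not isinstance(settings, dict):
        raise ValueError(f"{settings_path} does not contain a JSON object")
    env = settings.get("env")
    if not isinstance(env, dict):
        env = {}
        settings["env"] = env
    env[name] = value
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings
```

A call such as `set_env_flag(Path.home() / ".claude" / "settings.json", "CLAUDE_CODE_DISABLE_BACKGROUND_TASKS", "1")` would apply the workaround; back up the file first, since the sketch rewrites it with fresh indentation.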

Scope of the ask

  1. Acknowledge Warmup in a changelog entry and in Manage costs effectively.
  2. Provide a granular disable (per-agent manifest key or dedicated env var) that does not also kill Ctrl+B / run_in_background.
  3. Cache Warmup across consecutive session starts so the cost is paid only on first use after an agent-manifest change — a pure performance fix that benefits every user.
  4. Make the "X % of weekly limit" reported in the UI queryable programmatically (CLI flag or local file), so users can audit the effect of any Warmup-related change.

Permission Mode

n/a (happens at session boot, before any permission prompt)

Can You Reproduce This?

Yes, every fresh claude session fires Warmups deterministically.

Claude Model

All — Warmup spawns whatever model the registered agent specifies (Opus 4.5, Sonnet, Haiku observed on my host).

Claude Code Version

2.1.107 (latest at time of filing). First Warmup observed on my host: 2026-04-09 17:12 UTC (version ≥ 2.0.77).

Platform

Claude Code CLI on Linux (Debian 11). Also reproduced on Windows 11 (same CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 workaround works there).

Impact

High. For a subscription user the weekly token budget is opaque and the UI reports only a percentage. Because 99 % of my daily input tokens are Warmups rather than real work, a single day of heavy session starts with plugins loaded can consume the weekly allowance before noon. See the Tuesday row of the evidence table: 1.3 M sidechain input tokens, while only 4 sidechain messages carried a real task prompt.

Additional Context

Comment cross-posted earlier on #42796 (off-topic there, belongs here). Earlier related closures:

  • #17457 — NOT_PLANNED 2026-02-28 — "Multiple duplicate warmup agents spawning causes idle token consumption". Almost the same bug; closed without fix; behaviour has since intensified.
  • #16752 — "Agent warmup mode causes infinite retry loop with high API traffic" — closed.
  • #16961 — "Tool results echo task name 'Warmup' instead of executing, agent loops 1700+ times" — closed.
  • #25138 — claude --agent --print hang on cold start — closed.

Filing separately to make the current state and quantified impact visible, in the hope that the NOT_PLANNED decision can be revisited given the new data.

extent analysis

TL;DR

Set CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 in ~/.claude/settings.json to temporarily mitigate the issue, although this will disable all background task functionality.

Guidance

  • The Warmup feature is causing a significant drain on subscription tokens due to the lack of prompt caching across session starts.
  • To verify the issue, run the provided weekly_token_usage.py script and check the Sub%Inp column for high values (95-99%).
  • A granular disable for Warmup, such as a per-agent manifest key or dedicated env var, is needed to prevent token drain without disabling other background tasks.
  • Caching Warmup across consecutive session starts could provide a pure performance fix, benefiting every user.

Example

No code snippet is provided as the issue is more related to configuration and feature implementation.

Notes

The current workaround has limitations, as it disables all background task functionality. A more targeted solution is required to address the issue without impacting other features.

Recommendation

Apply the workaround CLAUDE_CODE_DISABLE_BACKGROUND_TASKS=1 until a more granular solution is implemented, as it is the only currently available mitigation for the token drain issue.
