claude-code - 💡(How to fix) Fix [BUG] Cowork token burn rate forces subscription upgrades; root causes are file-read failures + connector instability + scheduler stalls [1 participants]

Root Cause

A single power user has had to upgrade their Claude subscription tier specifically because of token burn from rework caused by a constellation of bugs in Cowork. Filing as a separate issue to give engineering a cost-impact lens for prioritizing fixes — the user-financial impact is real and persistent.

Fix Action

Fix / Workaround

(a) Failed file reads, MCP calls, and missed scheduler firings should not silently consume tokens via retry loops.
(b) Cowork should have cost-aware retry logic with caps on retries per failure.
(c) Sub-task dispatch should reuse cached memory file reads where possible to avoid re-paying for the same context on every turn.
(d) Users should have visibility into where their token spend is going (chat / sub-tasks / file ops / MCP retries) so they can identify cost spikes.
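The visibility asked for in (d) could be approximated with a small per-category ledger. This is only a sketch, not how Cowork accounts for tokens: the `TokenLedger` class, the category identifiers, and the token counts are hypothetical and simply mirror the categories named in this report.

```python
from collections import Counter

# Hypothetical category names taken from the breakdown proposed in this report.
CATEGORIES = ("chat", "sub_task_dispatch", "file_ops", "mcp_retries")


class TokenLedger:
    """Accumulates token spend per category (illustrative only)."""

    def __init__(self):
        self.spend = Counter()

    def record(self, category, tokens):
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.spend[category] += tokens

    def breakdown(self):
        """Return each category's share of total spend as a fraction."""
        total = sum(self.spend.values())
        if total == 0:
            return {c: 0.0 for c in CATEGORIES}
        return {c: self.spend[c] / total for c in CATEGORIES}


ledger = TokenLedger()
ledger.record("chat", 1_000)         # made-up numbers for illustration
ledger.record("mcp_retries", 3_000)  # made-up numbers for illustration
print(ledger.breakdown())
```

A breakdown like this would let a user see at a glance whether retries, rather than useful work, dominate their consumption.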

Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

Usage profile: heavy power user with ~60+ scheduled tasks, multiple concurrent Cowork sessions, multiple concurrent Code sessions, deep file-system integration via Drive Desktop

What Should Happen?

Heavy users should be able to maintain workflows on a single subscription tier without unexpected token burn from rework caused by underlying bugs.

Power users should be the easiest to retain, not the hardest. Subscription upgrades should reflect actual increased value, not absorbed cost from infrastructure bugs.

Error Messages/Logs

No specific error messages — this is an aggregate / financial impact issue.

Observable signal:
- User has had to upgrade Claude subscription tier at least once because of token consumption rate
- User estimates rework from Issues 1+2+3 accounts for a substantial fraction of consumption
- Per-user telemetry (token spend by category: chat / sub-task dispatch / file ops / MCP retries) would be the right diagnostic surface here, but is not exposed to end users.

Engineering can validate by examining this user's token consumption pattern: ratio of failed-tool-call retries to successful operations.
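
The validation suggested above could be computed roughly as follows. This is a sketch under assumptions: the event records and their `tool` / `outcome` fields are a hypothetical log shape, not a format that Cowork is known to emit.

```python
def retry_ratio(events):
    """Ratio of retry attempts to successful operations.

    `events` is a hypothetical list of dicts such as
    {"tool": "file_read", "outcome": "success"} or {"tool": ..., "outcome": "retry"}.
    """
    retries = sum(1 for e in events if e["outcome"] == "retry")
    successes = sum(1 for e in events if e["outcome"] == "success")
    if successes == 0:
        # No successes at all: infinite ratio if anything was retried.
        return float("inf") if retries else 0.0
    return retries / successes


# Made-up sample: three retries for two successful operations.
sample = [
    {"tool": "file_read", "outcome": "retry"},
    {"tool": "file_read", "outcome": "retry"},
    {"tool": "file_read", "outcome": "success"},
    {"tool": "mcp_call", "outcome": "retry"},
    {"tool": "mcp_call", "outcome": "success"},
]
print(retry_ratio(sample))  # 1.5
```

A ratio well above zero for file reads or MCP calls would support the claim that rework, not the work itself, is driving consumption.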

Steps to Reproduce

Use Cowork normally over weeks of heavy use (60+ scheduled tasks, multiple concurrent sessions, deep file integration via Drive Desktop).
Encounter Issue #3 (file-read failures): each failed read triggers a retry, often a fresh sub-task with fresh context window — each costs tokens.
Encounter Issue #2 (MCP disconnects): failed email checks cause manual re-runs.
Encounter Issue #1 (scheduler stalls): missed reminders become manual re-runs.
Token consumption accumulates beyond expected rate for the work being done.
User must upgrade subscription tier to maintain workflow.
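
To illustrate why the retries in the steps above are expensive, here is a rough cost model. It assumes, hypothetically, that every attempt loads a full fresh context and that only the final successful attempt performs the work. The token figures are invented for illustration and are not measurements from this user's account.

```python
def tokens_with_rework(context_tokens, work_tokens, failed_attempts):
    """Total tokens when each failed attempt re-pays for a fresh context.

    Assumption: every attempt (failed or successful) loads the full context,
    and the work itself is only paid for once, on the successful attempt.
    """
    attempts = failed_attempts + 1
    return attempts * context_tokens + work_tokens


# Invented figures: a 20k-token context and 2k tokens of actual work.
baseline = tokens_with_rework(20_000, 2_000, failed_attempts=0)
with_retries = tokens_with_rework(20_000, 2_000, failed_attempts=3)
print(baseline, with_retries)
```

Under these assumptions, three failed reads nearly quadruple the cost of a single task, because the context dominates the bill and is paid again on every attempt.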

Direct quote: "I had to up my subscription because it kept burning through tokens so fast. Doing work over and over again. This has been happening since I started using it."

Claude Model

Not sure / Multiple models

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

Claude 1.5354.0 (9a9e3d) 2026-04-29T01:14:34.000Z

Platform

Anthropic API

Operating System

Windows

Terminal/Shell

Windows Terminal

Additional Information

Impact

User-financial: subscription upgrade
Trust: user begins to manually attach files to chat to bypass Drive integration entirely, defeating much of the Cowork value prop
Workflow disruption: each retry interrupts active work

Suggested investigation

Per-user telemetry on token spend by category: chat / sub-task dispatch / file ops / MCP retries — does data already exist?
Are there guard rails against repeated retry loops that exhaust quota? Cap on retries per file or per turn?
Cost-aware retry logic for file reads and MCP calls (exponential backoff with awareness of the cost)
For long-running orchestrator-style sessions, are there caching opportunities to avoid re-paying for the same memory file reads on every turn?
Should Cowork surface a "this is going to cost ~$X" warning before dispatching expensive operations?
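
As a sketch of the caching question above, repeated reads of an unchanged memory file could be served from a cache keyed on path, modification time, and size. `FileReadCache` and the `memory.md` file name are hypothetical. Whether this reduces token spend depends on how Cowork charges for re-sent context, which this report cannot confirm.

```python
import os
import tempfile


class FileReadCache:
    """Caches file contents until the file's mtime or size changes (sketch)."""

    def __init__(self):
        self._entries = {}  # path -> ((mtime_ns, size), text)
        self.hits = 0
        self.misses = 0

    def read(self, path):
        stat = os.stat(path)
        key = (stat.st_mtime_ns, stat.st_size)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == key:
            self.hits += 1
            return cached[1]
        # File is new or has changed: read it and refresh the cache entry.
        with open(path, encoding="utf-8") as f:
            text = f.read()
        self._entries[path] = (key, text)
        self.misses += 1
        return text


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "memory.md")  # hypothetical memory file
    with open(path, "w", encoding="utf-8") as f:
        f.write("notes")
    cache = FileReadCache()
    first = cache.read(path)   # miss: read from disk
    second = cache.read(path)  # hit: served from the cache
    print(cache.hits, cache.misses)
```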

Note on root-cause priority

Fixing Issues 1 + 3 likely cuts this user's token burn substantially. Issue 2 also contributes but at smaller magnitude.


### Labels
`bug`, `cowork`, `cost-impact`, `meta`

---

## Filing order

Recommend filing in this order:

1. Issue 3 (highest severity, most token impact)
2. Issue 1 (most reproducible)
3. Issue 2 (important context for #1 and #4)
4. Issue 4 (the meta-issue, links the others — file last so you can paste in their issue numbers)

extent analysis

TL;DR

Implement cost-aware retry logic with caps on retries per failure to prevent token burn from rework caused by underlying bugs in Cowork.

Guidance

Investigate per-user telemetry on token spend by category to understand the source of token consumption.
Implement exponential backoff with awareness of the cost for file reads and MCP calls to prevent repeated retry loops.
Explore caching opportunities to avoid re-paying for the same memory file reads on every turn in long-running orchestrator-style sessions.
Consider surfacing a "this is going to cost ~$X" warning before dispatching expensive operations to increase user visibility into token spend.
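
The pre-dispatch warning could be as simple as the sketch below. The function names, the threshold, and the per-million-token prices are placeholders chosen for illustration. They are not real pricing and not an existing Cowork API.

```python
def estimate_cost_usd(input_tokens, output_tokens, usd_per_mtok_in, usd_per_mtok_out):
    """Rough cost estimate; per-million-token prices are supplied by the caller."""
    total = input_tokens * usd_per_mtok_in + output_tokens * usd_per_mtok_out
    return total / 1_000_000


def cost_warning(estimate_usd, threshold_usd):
    """Return a warning string when the estimate meets the threshold, else None."""
    if estimate_usd >= threshold_usd:
        return f"This is going to cost ~${estimate_usd:.2f}"
    return None


# Placeholder prices and token counts, not real pricing.
estimate = estimate_cost_usd(200_000, 10_000, usd_per_mtok_in=1.0, usd_per_mtok_out=2.0)
print(cost_warning(estimate, threshold_usd=0.10))
```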

Example

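The issue includes no code, but a retry loop with a cap on retries per failure could look like the following minimal sketch (`perform_operation` is a placeholder for the file read or MCP call):

import time

def perform_operation():
    """Placeholder for the file read or MCP call being retried."""

max_retries = 5
retry_delay = 1  # initial delay in seconds

for attempt in range(max_retries):
    try:
        perform_operation()
        break  # success: stop retrying
    except Exception:
        if attempt == max_retries - 1:
            raise  # cap reached: surface the failure instead of retrying again
        time.sleep(retry_delay)  # wait before the next attempt
        retry_delay *= 2  # exponential backoff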
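
A cost-aware variant would also stop once the next attempt would exceed a token budget. The sketch below is one possible shape, assuming a per-attempt token estimate is available. `call_with_budget`, `RetryBudgetExceeded`, and all numeric values are hypothetical.

```python
import time


class RetryBudgetExceeded(Exception):
    """Raised when retries stop because of the attempt cap or the token budget."""


def call_with_budget(operation, estimated_tokens, token_budget,
                     max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Retry `operation` until it succeeds, the attempt cap is reached,
    or the next attempt would exceed `token_budget`."""
    spent = 0
    last_error = None
    for attempt in range(max_attempts):
        if spent + estimated_tokens > token_budget:
            break  # the next attempt would overspend: stop retrying
        if attempt > 0:
            sleep(base_delay * 2 ** (attempt - 1))  # exponential backoff
        spent += estimated_tokens
        try:
            return operation()
        except Exception as exc:
            last_error = exc
    raise RetryBudgetExceeded(f"stopped after spending ~{spent} tokens") from last_error


calls = []


def flaky_read():
    """Simulated file read that always fails."""
    calls.append(1)
    raise OSError("simulated read failure")


try:
    # Sleep is stubbed out so the example runs instantly.
    call_with_budget(flaky_read, estimated_tokens=5_000, token_budget=12_000,
                     sleep=lambda seconds: None)
except RetryBudgetExceeded as err:
    print(len(calls), err)
```

Here the budget, not the attempt cap, ends the loop: two attempts fit within 12,000 tokens, and the third is never dispatched.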

Notes

The provided information suggests that fixing Issues 1 and 3 may have the most significant impact on reducing token burn. However, without more detailed information on the underlying system and its implementation, it's challenging to provide a comprehensive solution.

Recommendation

As a workaround, implement cost-aware retry logic and cache repeated memory file reads to reduce token burn. This addresses the immediate issue and provides a foundation for further optimization.
