[Skill Submission] SaveTokens - working solution for 395 open issues on token cost, context management & agent orchestration

Root Cause

Users are silently burning through quota because nothing warns them their session has grown to 150k+ tokens with per-turn costs 10x higher than a fresh session — even with caching enabled.

Preflight Checklist

  • I have searched existing requests and this feature hasn't been requested yet
  • This is a single feature request (not multiple features)

Note: This is NOT a duplicate. This is a working implementation, not a feature request. Existing similar issues are listed intentionally at the bottom as references.

Problem Statement

There are 395 open issues in this repo describing the same root problem: Claude Code has no built-in token self-awareness or cost management.

Broken into 4 categories:

  • 150 issues: No visibility into real token/cache costs per session
  • 129 issues: No signal for when to compact or start a fresh session
  • 71 issues: No automatic way to split heavy tasks across agents
  • 45 issues: No community skill marketplace to share solutions

Most referenced: #44779 #55133 #58254 #41653 #12790 #16373 #18192

Users are silently burning through quota because nothing warns them their session has grown to 150k+ tokens with per-turn costs 10x higher than a fresh session — even with caching enabled.

Proposed Solution

I built SaveTokens - a working Claude Code skill that solves all 4 categories today with no UI changes and no API changes needed.

It has 3 modes:

MODE A - Real Token Health Check
Reads directly from ~/.claude/projects/*/session.jsonl and reports:

  • Current context in actual tokens (not a percentage guess)
  • Cache read, cache write, input, and output tokens broken out separately
  • Estimated session cost in USD
  • Specific action: /compact now, /compact after task, or /clear
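As a rough illustration of the Mode A report, the sketch below derives a context size from one turn's usage fields, maps it to an action, and estimates cost. It is not the code in scripts/token_usage.py. The 100k "heavy" cutoff, the function names, and the per-million-token rates are assumptions for illustration; only the 150k mark and the usage field names come from this issue. The condition for recommending /clear is not stated, so it is left out.

```python
# Fields that count toward the prompt (context) size of a turn.
CONTEXT_FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
)


def context_tokens(usage: dict) -> int:
    """Context size of one turn: uncached input plus cache reads and writes."""
    return sum(int(usage.get(field, 0)) for field in CONTEXT_FIELDS)


def recommend_action(context: int, heavy_at: int = 100_000,
                     critical_at: int = 150_000) -> str:
    """Map a context size to an action. heavy_at is an assumed cutoff."""
    if context >= critical_at:
        return "/compact now"
    if context >= heavy_at:
        return "/compact after task"
    return "no action needed"


def estimate_cost_usd(totals: dict, usd_per_million: dict) -> float:
    """Sum tokens * rate per field. Rates must be supplied by the caller;
    no real pricing is embedded here."""
    return sum(
        totals.get(field, 0) * rate / 1_000_000
        for field, rate in usd_per_million.items()
    )
```

With a turn whose three context fields add up to 113,000 tokens, `recommend_action` returns "/compact after task", which lines up with the 113k example reported in this issue.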

MODE B - Heavy Task Orchestration
Scores any task against 5 heaviness signals using deterministic regex. If the score is ≥ 2, it automatically decomposes the task into ≤5 subtasks and spawns focused claude-sonnet-4-6 agents in parallel. Sonnet is 5x cheaper than Opus, and each agent stays under 30k tokens instead of one session hitting 150k+.
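The scoring step could look roughly like the sketch below. The five signal names and their regex patterns are hypothetical, since this issue does not list the ones used in scripts/task_analyzer.py. Only the "score ≥ 2 means heavy" rule is taken from the description above.

```python
import re

# Hypothetical heaviness signals: the real patterns are not shown in this issue.
SIGNALS = {
    "stack": re.compile(r"\b(backend|frontend|database|schema|api)\b", re.IGNORECASE),
    "tests": re.compile(r"\b(tests?|testing)\b", re.IGNORECASE),
    "docs": re.compile(r"\b(readme|docs?|documentation)\b", re.IGNORECASE),
    "refactor": re.compile(r"\b(refactor|migrate|rewrite|redesign)\b", re.IGNORECASE),
    "breadth": re.compile(r"\b(entire|whole|full|all)\b", re.IGNORECASE),
}

HEAVY_SCORE = 2  # threshold stated in the submission


def score_task(task: str) -> list[str]:
    """Return the names of the signals that match the task description."""
    return [name for name, pattern in SIGNALS.items() if pattern.search(task)]


def classify(task: str) -> str:
    """HEAVY when at least HEAVY_SCORE signals match, otherwise LIGHT."""
    return "HEAVY" if len(score_task(task)) >= HEAVY_SCORE else "LIGHT"
```

Because the patterns are fixed regexes, the same task text always gets the same score, which is the property the description calls deterministic.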

MODE C - 25% Threshold Auto-Agent
If ≥25% of session turns hit 150k+ context, declares a cost emergency, activates automatic agent splitting, and recommends /compact immediately.
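The Mode C trigger reduces to a ratio check. A minimal sketch, assuming the per-turn context sizes have already been extracted into a list (the function names are illustrative, not taken from the skill):

```python
def share_of_heavy_turns(context_sizes: list[int], limit: int = 150_000) -> float:
    """Fraction of turns whose context reached the limit (0.0 for no turns)."""
    if not context_sizes:
        return 0.0
    heavy = sum(1 for size in context_sizes if size >= limit)
    return heavy / len(context_sizes)


def cost_emergency(context_sizes: list[int], limit: int = 150_000,
                   threshold: float = 0.25) -> bool:
    """True when at least `threshold` of the turns hit `limit` tokens."""
    return share_of_heavy_turns(context_sizes, limit) >= threshold
```

A session with 0 of 122 turns at 150k+ gives a share of 0.0 and does not trigger, while 1 heavy turn out of 4 sits exactly on the 25% line and does.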

What's included:

  • SKILL.md - skill instructions
  • scripts/token_usage.py - reads real session JSONL
  • scripts/task_analyzer.py - deterministic task scoring
  • scripts/bootstrap.py - portable one-time installer (~/.claude/SaveTokens/)
  • evals/evals.json - 5 evals, 27 assertions, all passing

Verified results:

  • ✅ "just fix typo in README" → LIGHT, handled inline
  • ✅ OAuth2 (Node + React + DB + tests + docs) → HEAVY, 5 agents
  • ✅ Full auth refactor → HEAVY, 4 agents, Research agent runs first
  • ✅ Session at 113k tokens → Heavy, $5.39, recommends /compact
  • ✅ 25% threshold trigger → Mode C, urgent warning activated

Alternative Solutions

Option B - Unblock the community skill marketplace.

Issue #41653 shows remoteMarketplaceClient rejects all third-party plugin sources with "External plugin sources are not yet supported."

Unblocking this would let the community distribute SaveTokens and similar skills without requiring Anthropic review for every submission. This would indirectly close the 45 skills/marketplace issues and accelerate community-driven solutions for the other 350.

Priority

High - Significant impact on productivity

Feature Category

Developer tools/SDK

Use Case Example

Scenario 1 - Usage check mid-session
User types: /SaveTokens
Output: "Session size: Heavy | Current context: ~113k tokens | Turns at 150k+: 0 of 122 (0%) | Est. cost: $5.39 | → Run /compact after this task."

Scenario 2 - Heavy task delegation
User types: "Add OAuth2 login: Node.js backend, React frontend, PostgreSQL schema, integration tests, update README"
Output: Detects 3 heaviness signals → splits into 5 Sonnet agents (Database, Backend, Frontend, Tests, Docs) → spawns in parallel → synthesizes results. Total context per agent: ~20k instead of ~100k.

Scenario 3 - 25% threshold
User says: "My usage warning shows 25% of sessions at 150k+ context"
Output: "Structural cost problem detected. Activating Auto-Agent Mode. Run /compact now. All heavy tasks will auto-split going forward."

Additional Context

The data source already exists — Claude Code writes full token usage (input_tokens, cache_read_input_tokens, cache_creation_input_tokens, output_tokens) to ~/.claude/projects/*/session.jsonl on every turn. SaveTokens simply reads it. No new APIs needed.
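Reading those four fields back out could be sketched as follows. The record layout is an assumption: this sketch looks for a `usage` object either under `message` or at the top level of each JSONL record, and the actual structure of the session files may differ. Lines that are not valid JSON or carry no usage data are skipped.

```python
import json
from typing import Iterable, Optional

USAGE_FIELDS = (
    "input_tokens",
    "cache_read_input_tokens",
    "cache_creation_input_tokens",
    "output_tokens",
)


def find_usage(record: object) -> Optional[dict]:
    """Locate the usage dict in a record. Both locations are assumptions."""
    if not isinstance(record, dict):
        return None
    message = record.get("message")
    if isinstance(message, dict) and isinstance(message.get("usage"), dict):
        return message["usage"]
    if isinstance(record.get("usage"), dict):
        return record["usage"]
    return None


def summarize_usage(lines: Iterable[str]) -> dict:
    """Sum the four token fields over all records that carry usage data."""
    totals = {field: 0 for field in USAGE_FIELDS}
    turns = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate partially written or corrupt lines
        usage = find_usage(record)
        if usage is None:
            continue
        turns += 1
        for field in USAGE_FIELDS:
            value = usage.get(field, 0)
            if isinstance(value, int):
                totals[field] += value
    totals["turns"] = turns
    return totals
```

Since `summarize_usage` takes any iterable of strings, an open file handle can be passed to it directly, for example `with open(path, encoding="utf-8") as fh: summarize_usage(fh)`.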

This was built and tested in a single session. The skill is portable across machines via bootstrap.py which installs scripts to ~/.claude/SaveTokens/ regardless of plugin installation path.

Related issues this addresses: #44779 #55133 #58254 #41653 #12790 #16373 #18192 #11535 #42607 #55755 #17772 #23620 #13579 #18550 #54673 #43510 #36751
