claude-code - 💡(How to fix) Fix [BUG] Session limits maxing out on their own, without any interaction from user. [1 pull requests]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

Error Messages/Logs

Fix Action

Fixed

RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

The Problem: Every Claude Code session loads the full instruction blocks for every connected MCP server, all plugin system prompts, CLAUDE.md, and memory files before the user types a single character. With 20+ MCP servers connected, this burns a significant portion of the Pro plan's session/hourly token limit on initialization overhead the user never requested and cannot control.

The Impact: The user logged in to find their 5-hour rolling session limit already depleted — not from their work, but from context loading on session start. Over the month of May, this contributed to $10.31 in usage credit overages against a $10 cap.

The Core Complaint: The deferred tool schema system exists (tools load on-demand via ToolSearch) — but MCP server instruction blocks, plugin hooks, and system context do NOT follow the same lazy-loading pattern. They inject unconditionally on every session start. On a paid plan with usage limits, this is a hidden recurring cost that users cannot opt out of, do not see coming, and cannot attribute to any action they took.

What the user is asking for:

MCP server instructions should load lazily (on first use), not upfront Users should be able to see per-session token cost breakdowns Session initialization overhead should not count against usage limits, or should at minimum be disclosed

What Should Happen?

MCP server instruction blocks should load lazily — injected into context only when a tool from that server is first called, not unconditionally on session start. This is already the pattern used for tool schemas (the deferred tool system), so the infrastructure concept exists; it just isn't applied to server instruction blocks.

Additionally:

Users should be able to see a token cost breakdown per session, including what was consumed by initialization vs. actual user interactions Session initialization overhead should either not count against hourly/daily usage limits, or be disclosed clearly at plan purchase so users can make informed decisions about how many MCP servers to connect The current behavior silently drains paid usage limits before the user has done any work, with no visibility and no opt-out.

Error Messages/Logs

Steps to Reproduce

No steps to reproduce; it is a native behavior that is quite literally robbing users of valuable PAID FOR session limits.

Claude Model

Sonnet (default)

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

NA

Platform

Other

Operating System

Windows

Terminal/Shell

Other

Additional Information

No response

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [BUG] Session limits maxing out on their own, without any interaction from user. [1 pull requests]