claude-code: Context window detection fails for third-party Anthropic-compatible providers [3 comments, 2 participants]

GitHub stats: anthropics/claude-code#46416, fetched 2026-04-11 06:20:52. Comments: 3 · Participants: 2 · Timeline events: 6 (commented ×3, labeled ×3) · Reactions: 0


When using a third-party provider that implements the Anthropic API (e.g., MiniMax via https://api.minimax.io/anthropic), Claude Code's context window detection falls back to the hardcoded default of 200,000 tokens, even when the underlying model may support a larger context window.

Root Cause

In src/utils/context.ts, getContextWindowForModel() calls getModelCapability() to retrieve max_input_tokens from a cached capability list. However, getModelCapability() is gated by isFirstPartyAnthropicBaseUrl() in src/utils/model/modelCapabilities.ts:46-51:

function isModelCapabilitiesEligible(): boolean {
  if (process.env.USER_TYPE !== 'ant') return false
  if (getAPIProvider() !== 'firstParty') return false
  if (!isFirstPartyAnthropicBaseUrl()) return false  // ← MiniMax fails here
  return true
}

Since MiniMax's base URL is https://api.minimax.io/anthropic (not api.anthropic.com), isFirstPartyAnthropicBaseUrl() returns false, and getModelCapability() returns undefined. This causes getContextWindowForModel() to fall through to MODEL_CONTEXT_WINDOW_DEFAULT = 200_000.
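The fall-through described above can be sketched in a few lines. This is a simplified, self-contained illustration and not the actual Claude Code source: only the function names, the three gate conditions, and the 200,000 default come from the issue. The `Env` type, the host check, and the contents of the capability cache are assumptions made for the example.

```typescript
// Simplified sketch of the gating described in the issue. Function bodies are assumed.
const MODEL_CONTEXT_WINDOW_DEFAULT = 200_000;

interface Env {
  userType?: string;   // stands in for process.env.USER_TYPE
  apiProvider: string; // stands in for getAPIProvider()
  baseUrl: string;     // stands in for ANTHROPIC_BASE_URL
}

// Hypothetical cached capability list keyed by model name.
const capabilityCache: Record<string, { max_input_tokens: number }> = {
  "example-model": { max_input_tokens: 1_000_000 },
};

function isFirstPartyAnthropicBaseUrl(env: Env): boolean {
  // Assumption: "first party" means the URL points at api.anthropic.com.
  return /^https:\/\/api\.anthropic\.com(\/|$)/.test(env.baseUrl);
}

function isModelCapabilitiesEligible(env: Env): boolean {
  if (env.userType !== "ant") return false;
  if (env.apiProvider !== "firstParty") return false;
  if (!isFirstPartyAnthropicBaseUrl(env)) return false; // third-party URLs stop here
  return true;
}

function getModelCapability(
  model: string,
  env: Env,
): { max_input_tokens: number } | undefined {
  if (!isModelCapabilitiesEligible(env)) return undefined;
  return capabilityCache[model];
}

function getContextWindowForModel(model: string, env: Env): number {
  // An undefined capability silently falls through to the default.
  return getModelCapability(model, env)?.max_input_tokens ?? MODEL_CONTEXT_WINDOW_DEFAULT;
}
```

With a base URL of https://api.minimax.io/anthropic the lookup never reaches the cache, so the result is the default even though the (hypothetical) cache entry says otherwise.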

Impact

  • AutoCompact triggers too aggressively: With an assumed 200K window, the AutoCompact threshold is 200,000 - 13,000 = 187,000 tokens (93.5% of 200K). If MiniMax actually supports 1M, that is only ~19% of the real window.
  • Context usage is overstated: The tool estimates the context is 93.5% full when it may actually be only 18.7% full for a 1M-capable model served through MiniMax.
  • Manual /compact becomes necessary: Users report needing to run /compact manually, when for the real window the auto-compact warning should have fired much later (or not at all).
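The percentages in the list above follow from simple arithmetic, reproduced here as a check. The constant name `AUTOCOMPACT_BUFFER_TOKENS` is hypothetical; the 13,000-token buffer is inferred from the "200,000 - 13,000" figure in the issue, and the 1M window is the issue's own "if" scenario rather than a verified limit.

```typescript
// Hypothetical name; the 13,000 value is implied by the issue's arithmetic.
const AUTOCOMPACT_BUFFER_TOKENS = 13_000;

function autoCompactThreshold(assumedWindow: number): number {
  return assumedWindow - AUTOCOMPACT_BUFFER_TOKENS;
}

// Percentage of `window` used by `tokens`, rounded to one decimal place.
function percentOf(tokens: number, window: number): number {
  return Math.round((tokens / window) * 1000) / 10;
}

const threshold = autoCompactThreshold(200_000);       // 187000
const ofAssumedWindow = percentOf(threshold, 200_000); // 93.5
const ofActualWindow = percentOf(threshold, 1_000_000); // 18.7
```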

Reproduction Steps

  1. Set ANTHROPIC_BASE_URL=https://api.minimax.io/anthropic
  2. Set ANTHROPIC_AUTH_TOKEN=<MiniMax token>
  3. Set ANTHROPIC_MODEL=MiniMax-M2.7 (or any model served through MiniMax)
  4. Observe that getContextWindowForModel("MiniMax-M2.7") returns 200000 regardless of the model's actual capabilities
  5. Note that AutoCompact warning fires at ~187K tokens (based on 200K window) rather than at a proportional threshold for the actual window

Expected Behavior

Claude Code should do at least one of the following:

  1. Detect the actual context window for third-party providers (if the provider exposes model capabilities via its own endpoints)
  2. Allow manual override via environment variable (CLAUDE_CODE_MAX_CONTEXT_TOKENS) or model configuration
  3. At minimum, not assume the smallest possible window for unknown third-party providers; use a conservative estimate or probe the actual limit

Proposed Fix

Option A: Extend capability detection to third-party providers (medium effort)

Add a getThirdPartyModelCapability() path that tries to fetch from the provider's model list endpoint, or maintain a local override map for known MiniMax/Gateway models.
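A minimal sketch of the local-override-map variant of Option A, assuming a static table consulted only when the first-party lookup returns nothing. The map name, the `resolveContextWindow` helper, and the table entry are hypothetical; the 1,000,000 value is copied from the issue's example and is not a verified MiniMax limit.

```typescript
const MODEL_CONTEXT_WINDOW_DEFAULT = 200_000;

type ModelCapability = { max_input_tokens: number };

// Hypothetical override table. Entries are placeholders, not verified provider limits.
const THIRD_PARTY_CONTEXT_WINDOWS: ReadonlyMap<string, number> = new Map([
  ["MiniMax-M2.7", 1_000_000], // taken from the issue's example only
]);

function getThirdPartyModelCapability(model: string): ModelCapability | undefined {
  const tokens = THIRD_PARTY_CONTEXT_WINDOWS.get(model);
  return tokens === undefined ? undefined : { max_input_tokens: tokens };
}

// Prefer the first-party capability when present, then the local table, then the default.
function resolveContextWindow(model: string, firstParty?: ModelCapability): number {
  const capability = firstParty ?? getThirdPartyModelCapability(model);
  return capability?.max_input_tokens ?? MODEL_CONTEXT_WINDOW_DEFAULT;
}
```

A static table is cheap but goes stale when providers change limits, which is why the issue also mentions fetching from the provider's model list endpoint.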

Option B: Environment variable override for specific models (simple, immediate)

Add support for per-model context window overrides in CLAUDE_CODE_MAX_CONTEXT_TOKENS:

CLAUDE_CODE_MAX_CONTEXT_TOKENS="MiniMax-M2.7:1000000,claude-opus-4-6:1000000"
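One way the proposed per-model syntax could be parsed is sketched below. This is an assumption about a feature that does not exist yet: the `ContextOverrides` type and both function names are hypothetical, and a bare number is treated as the existing global override so that the current workaround keeps working.

```typescript
interface ContextOverrides {
  global?: number;               // bare value such as "1000000"
  perModel: Map<string, number>; // "model:tokens" entries
}

function parseContextOverrides(raw: string | undefined): ContextOverrides {
  const result: ContextOverrides = { perModel: new Map() };
  if (!raw) return result;

  for (const part of raw.split(",")) {
    const entry = part.trim();
    if (entry === "") continue;

    // Split on the LAST colon so model names that contain colons still parse.
    const sep = entry.lastIndexOf(":");
    const name = sep === -1 ? "" : entry.slice(0, sep).trim();
    const value = sep === -1 ? entry : entry.slice(sep + 1).trim();

    if (!/^\d+$/.test(value)) continue; // skip malformed token counts
    const tokens = Number(value);
    if (tokens <= 0) continue;

    if (sep === -1) result.global = tokens;
    else if (name !== "") result.perModel.set(name, tokens);
  }
  return result;
}

// A per-model entry takes precedence over the global value.
function resolveOverride(model: string, overrides: ContextOverrides): number | undefined {
  return overrides.perModel.get(model) ?? overrides.global;
}
```

Malformed entries are skipped rather than rejected, so a typo in one entry does not disable the others; a stricter implementation might prefer to warn.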

Option C: Probe actual context limit on first use (most robust)

On the first API call with a new model, detect 413 Payload Too Large and learn the actual limit, persisting it locally. This is already done for team memory (src/services/teamMemorySync/index.ts:529) and could be generalized.
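A rough sketch of the learning step in Option C, under stated assumptions: the caller starts from an optimistic window, and each 413 response tightens a per-model upper bound. The `LimitStore` interface, the in-memory store (a stand-in for local persistence), and both function names are hypothetical and are not taken from the team memory code. Note that a 413 only proves the limit is below the rejected size; it does not reveal the exact limit.

```typescript
interface LimitStore {
  get(model: string): number | undefined;
  set(model: string, tokens: number): void;
}

// In-memory stand-in for whatever local persistence is used.
class InMemoryLimitStore implements LimitStore {
  private readonly rejected = new Map<string, number>();
  get(model: string): number | undefined {
    return this.rejected.get(model);
  }
  set(model: string, tokens: number): void {
    this.rejected.set(model, tokens);
  }
}

// Call after each request with the estimated input size and the HTTP status.
function recordOutcome(store: LimitStore, model: string, sentTokens: number, status: number): void {
  if (status !== 413) return;
  const smallestRejected = store.get(model);
  // Keep the smallest request size that was rejected: the tightest known upper bound.
  if (smallestRejected === undefined || sentTokens < smallestRejected) {
    store.set(model, sentTokens);
  }
}

// The usable window is the optimistic guess, capped just below the smallest rejected size.
function effectiveWindow(store: LimitStore, model: string, optimisticWindow: number): number {
  const smallestRejected = store.get(model);
  if (smallestRejected === undefined) return optimisticWindow;
  return Math.min(optimisticWindow, smallestRejected - 1);
}
```

The bound learned this way is conservative only in one direction, so a real implementation would probably combine it with a provider-documented value when one is available.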

Workaround

Set CLAUDE_CODE_MAX_CONTEXT_TOKENS=1000000 in environment to override the auto-detected window for all models.

Additional Context

  • AutoCompact already has a circuit breaker (3 consecutive failures) to prevent hammering the API when context is irrecoverably over the limit (src/services/compact/autoCompact.ts:70)
  • The 200K default is documented at src/utils/context.ts:9 as a comment but may not reflect actual provider capabilities
  • This issue affects any Anthropic-compatible third-party API (Azure, AWS Bedrock, Vertex, MiniMax, OpenRouter, etc.) where the provider URL doesn't match api.anthropic.com
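For readers unfamiliar with the circuit-breaker pattern mentioned in the first bullet, a generic sketch follows. It is not the code in src/services/compact/autoCompact.ts: the class, the `tryAutoCompact` helper, and the return values are hypothetical, and only the threshold of 3 consecutive failures comes from the issue.

```typescript
// Generic consecutive-failure circuit breaker; names are illustrative only.
class ConsecutiveFailureBreaker {
  private failures = 0;
  private readonly maxFailures: number;

  constructor(maxFailures: number = 3) {
    this.maxFailures = maxFailures;
  }

  get isOpen(): boolean {
    return this.failures >= this.maxFailures;
  }

  recordSuccess(): void {
    this.failures = 0; // any success resets the streak
  }

  recordFailure(): void {
    this.failures += 1;
  }
}

type CompactResult = "ok" | "failed" | "skipped";

// Runs `compact` unless the breaker is open; `compact` returns true on success.
function tryAutoCompact(breaker: ConsecutiveFailureBreaker, compact: () => boolean): CompactResult {
  if (breaker.isOpen) return "skipped";
  const succeeded = compact();
  if (succeeded) breaker.recordSuccess();
  else breaker.recordFailure();
  return succeeded ? "ok" : "failed";
}
```

After three failed attempts in a row, further calls are skipped instead of hitting the API again.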

extent analysis

TL;DR

  • Setting CLAUDE_CODE_MAX_CONTEXT_TOKENS to a higher value, such as 1000000, mitigates the fallback to the 200,000-token default.

Guidance

  • Identify the underlying model's actual context window capability, either through the provider's documentation or by probing the API.
  • Consider implementing getThirdPartyModelCapability() to fetch model capabilities from the provider's endpoint or maintain a local override map for known models.
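  • Use the CLAUDE_CODE_MAX_CONTEXT_TOKENS environment variable to override the auto-detected window. The per-model form (CLAUDE_CODE_MAX_CONTEXT_TOKENS="MiniMax-M2.7:1000000") is the syntax proposed in Option B and is not described as currently supported; the workaround uses a single value for all models.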

Example

// Workaround: a single value that applies to all models.
// The per-model form ("MiniMax-M2.7:1000000") is only proposed in Option B.
process.env.CLAUDE_CODE_MAX_CONTEXT_TOKENS = "1000000";

Notes

  • The proposed fixes (Option A, B, and C) have varying levels of effort and complexity, and the chosen solution should depend on the specific requirements and constraints of the project.
  • The workaround of setting CLAUDE_CODE_MAX_CONTEXT_TOKENS may not be suitable for all use cases, especially if the actual context window capability is unknown or varies between models.

Recommendation

  • Apply the workaround by setting CLAUDE_CODE_MAX_CONTEXT_TOKENS to a higher value. It is immediate and simple, though less robust and accurate than Options A to C in the long term.
