claude-code - 💡(How to fix) Fix [FEATURE] Per-model beta-feature opt-out for proxy/gateway-routed deployments — make context_management injection model-aware (or expose a settings.json allowlist/denylist)

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

  1. Cleaner error messages. Today: "Extra inputs are not permitted" with no context. Post-feature: Claude Code can log "skipping context_management for Haiku per settings.json allowedModels" at debug level, making the routing explicit.

Fix Action

Fix / Workaround

The current workaround — CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 — disables every beta including prompt-caching, fine-grained-tool-streaming, output-128k, and advisor-tool. For a power user the cost of losing prompt-caching alone can 3–5× iterative agent token spend.

Lets advanced users patch arbitrary fields. Risk: footgun for naive users.

A multi-account proxy deployment routing through Wei-Shaw/sub2api saw 26 sequential Haiku 4.5 turn failures over 30 minutes — all returning the identical context_management: Extra inputs are not permitted 400 from Anthropic upstream. Each failure cost the operator ~1 minute of retry/diagnostic time. The eventual workaround required a 4-iter custom wrapper (stripping body fields + header segments before forwarding to sub2api) totaling ~1000 lines of Go and dedicated tests.

Code Example

// ~/.claude/settings.json
{
  "experimentalBetas": {
    "contextManagement": {
      // Either:
      "allowedModels": ["claude-sonnet-*", "claude-opus-*"]
      // Or:
      "deniedModels": ["claude-haiku-*"]
    },
    "outputOneTwentyEight": {
      "allowedModels": ["claude-sonnet-*", "claude-opus-*"]
    }
    // ... per beta key
  }
}

---

export CLAUDE_CODE_DISABLE_BETA_CONTEXT_MANAGEMENT_FOR_MODEL_PATTERN="haiku"
export CLAUDE_CODE_DISABLE_BETA_OUTPUT_ONE_TWENTY_EIGHT_FOR_MODEL_PATTERN="haiku"

---

{
  "requestBodyOverrides": {
    "claude-haiku-4-5-*": {
      "stripTopLevelFields": ["context_management"],
    },
  },
}
RAW_BUFFERClick to expand / collapse

Body

Problem

Claude Code currently injects beta features (context_management, thinking, output-128k, advisor-tool, etc.) into outgoing /v1/messages request bodies unconditionally for every model when the corresponding anthropic-beta header is set. The only opt-out is the all-or-nothing CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 environment variable.

This is too coarse for two real deployment shapes that are increasingly common:

  1. Proxy-routed deployments (Wei-Shaw/sub2api, LiteLLM, OpenRouter, custom gateways) where the target endpoint may accept a beta for some models but not others. Example: Bedrock rejects context_management for all models, but only some Anthropic-direct endpoints accept it.

  2. Model-specific upstream behavior where Anthropic's own /v1/messages API rejects a beta field for some models even when the corresponding beta header is set. Currently confirmed: claude-haiku-4-5-20251001 rejects body-level context_management with HTTP 400 "Extra inputs are not permitted" even when anthropic-beta: context-management-2025-06-27 is in the header. See issues #51700, #21612, #39589, #14135, #49548 for the running history of this specific symptom.

The current workaround — CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS=1 — disables every beta including prompt-caching, fine-grained-tool-streaming, output-128k, and advisor-tool. For a power user the cost of losing prompt-caching alone can 3–5× iterative agent token spend.

Proposed change

Add a way to declare which betas are valid for which models at Claude Code startup, so the per-request injection logic can skip betas that the targeted model's endpoint will reject. Three concrete shapes ordered by my preference:

Shape A (preferred): settings.json allowlist/denylist

// ~/.claude/settings.json
{
  "experimentalBetas": {
    "contextManagement": {
      // Either:
      "allowedModels": ["claude-sonnet-*", "claude-opus-*"]
      // Or:
      "deniedModels": ["claude-haiku-*"]
    },
    "outputOneTwentyEight": {
      "allowedModels": ["claude-sonnet-*", "claude-opus-*"]
    }
    // ... per beta key
  }
}

Pattern: glob-style model matching, allowlist takes precedence over denylist when both are present, missing entry means "default behavior" (current unconditional injection). Implementation cost: a single config-reader + a model-vs-pattern check in the body-builder.

Shape B (minimal): Environment-variable per-beta denylist

export CLAUDE_CODE_DISABLE_BETA_CONTEXT_MANAGEMENT_FOR_MODEL_PATTERN="haiku"
export CLAUDE_CODE_DISABLE_BETA_OUTPUT_ONE_TWENTY_EIGHT_FOR_MODEL_PATTERN="haiku"

One env var per beta, denylist pattern is a substring match (case-insensitive). Less expressive than Shape A but trivially scriptable from existing shell-init flows.

Shape C (broadest): Just expose the request-body builder via config

{
  "requestBodyOverrides": {
    "claude-haiku-4-5-*": {
      "stripTopLevelFields": ["context_management"],
    },
  },
}

Lets advanced users patch arbitrary fields. Risk: footgun for naive users.

Use cases this unblocks

  1. Routing Haiku through sub2api / OAuth proxies. Today's symptom (HTTP 400 for every Haiku turn) goes away without disabling betas globally. Tracked in #51700 (still open, still failing for users 4 weeks after #21612 was closed as resolved).

  2. Mixed-vendor routing. Some teams send Opus to Anthropic-direct, Sonnet to Bedrock, Haiku to Vertex AI. Each upstream has a different beta acceptance matrix. A single Claude Code config could honor each upstream's rules per-model.

  3. A/B beta features. Operators experimenting with whether context_management helps their workflow can scope the test to specific models without forfeiting prompt-caching everywhere.

  4. Cleaner error messages. Today: "Extra inputs are not permitted" with no context. Post-feature: Claude Code can log "skipping context_management for Haiku per settings.json allowedModels" at debug level, making the routing explicit.

Real-world data motivating this

A multi-account proxy deployment routing through Wei-Shaw/sub2api saw 26 sequential Haiku 4.5 turn failures over 30 minutes — all returning the identical context_management: Extra inputs are not permitted 400 from Anthropic upstream. Each failure cost the operator ~1 minute of retry/diagnostic time. The eventual workaround required a 4-iter custom wrapper (stripping body fields + header segments before forwarding to sub2api) totaling ~1000 lines of Go and dedicated tests.

Most of that workaround could collapse into ~20 lines of Claude Code config if Shape A or B existed.

Forensic chain + before/after metric:

Wrapper iterationErrors recorded (30-min window)
Pre-fix (v1.170.0)26
With body-strip only24 (header asymmetry uncovered)
With body + header strip1 (sub2api re-injection uncovered)
With body + header + thinking strip0

Public commit chain: terrylica/ccmax-monitor iter-126 + iter-127.

Why not just file an issue against sub2api / fix the proxy?

We've done that (Wei-Shaw/sub2api PR #<placeholder>) — and stripping the field at the proxy works. But the root issue is that Claude Code sends a field the targeted model's API rejects. Fixing it at the proxy layer means every gateway implementation needs the same patch. Fixing it client-side means every gateway works.

Compat notes

  • No breaking change. Missing config entry = current behavior preserved
  • Forward-compatible. When Anthropic upstream starts accepting context_management for Haiku, operators just remove the denylist entry
  • Discoverable. A debug-log line like "skipping <beta> for model <id> per settings.json" makes the behavior obvious in /diagnose
  • Testable. Pure config-table lookup; trivial to unit-test

Related issues

  • #51700 (OPEN) "Keep getting context management extra inputs are not permitted" — the symptom this feature would eliminate for proxy-routed Haiku users
  • #21612 (CLOSED) — original, marked resolved but the symptom recurs
  • #44521 (OPEN) [FEATURE] Expose context_management / clear_tool_uses_20250919 config — complementary; that issue wants MORE control over context_management, this issue wants a way to OPT OUT per-model
  • #49548 (OPEN) — same symptom on Bedrock

Happy to elaborate on any of the proposed shapes or contribute a draft PR if there's interest in a specific direction. Thanks for Claude Code!

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [FEATURE] Per-model beta-feature opt-out for proxy/gateway-routed deployments — make context_management injection model-aware (or expose a settings.json allowlist/denylist)