litellm - ✅(Solved) Fix Anthropic 400 "credit balance too low" error does not trigger fallback routing [2 pull requests, 1 comments, 2 participants]

litellm, 2026-03-21 19:57:37


GitHub stats

BerriAI/litellm#24320 • Fetched 2026-04-08 01:13:19


Author

psarma89

Participants

AhsanSheraz

psarma89

Timeline (top)

cross-referenced ×2, commented ×1, labeled ×1, mentioned ×1

When Anthropic returns a 400 Bad Request with error type invalid_request_error and message "Your credit balance is too low to access the Anthropic API", LiteLLM's router does not trigger fallback routing. The error is classified as a non-retryable client error, so the request fails without attempting any configured fallback models.

This is problematic because "credit balance too low" is a billing/quota issue, not an invalid request. The request itself is well-formed — it would succeed on any other provider serving the same model. Fallback should absolutely be attempted here.

Error Message

No fallback model group found for original model_group=anthropic/claude-opus-4-6
Error doing the fallback: litellm.InternalServerError: ...

Root Cause

PR fix notes

PR #24616: fix(router): 429 routing — cooldown bypass, providers.json mapping, Anthropic credit balance fallback

Repository: BerriAI/litellm
Author: naarob
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/24616

Description (problem / solution / changelog)

Summary

Three related fixes for 429/rate-limit routing failures. All three share the same root cause: rate-limit errors were not surfaced as RateLimitError, so the router could not cool down deployments or trigger fallback routing.

Fix 1 — `cooldown_handlers.py`: APIConnectionError wrapping 429 bypassed cooldown (closes #24366)

_is_cooldown_required() returned False for any exception containing 'APIConnectionError'. But providers from providers.json have their HTTP 429 responses wrapped as APIConnectionError by the catch-all mapper — so those deployments were never cooled down and all retries kept hitting the same rate-limited endpoint.

Fix: only skip cooldown for APIConnectionError when neither '429' nor 'rate limit' appears in the exception string.

# Before — skips cooldown for ANY APIConnectionError, even wrapped 429s
if 'APIConnectionError' in exception_str:
    return False

# After — only skips when no rate-limit signal is present
if 'APIConnectionError' in exception_str:
    if not ('429' in exception_str or 'rate limit' in exception_str.lower()):
        return False
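
The before/after change can be exercised in isolation. The sketch below is a standalone predicate, not the actual `_is_cooldown_required()` from `cooldown_handlers.py`; the function name `skips_cooldown` and the sample exception strings are assumptions made for illustration.

```python
def skips_cooldown(exception_str: str) -> bool:
    """Return True when cooldown should be skipped for this exception text.

    Hypothetical stand-in for the APIConnectionError branch described above.
    """
    if "APIConnectionError" not in exception_str:
        return False  # this branch only concerns APIConnectionError
    has_rate_limit_signal = (
        "429" in exception_str or "rate limit" in exception_str.lower()
    )
    # Skip cooldown only for connection errors with no rate-limit signal
    return not has_rate_limit_signal


# Sample strings are invented for the example
pure = "litellm.APIConnectionError: connection reset by peer"
wrapped = "litellm.APIConnectionError: status 429 Too Many Requests"
print(skips_cooldown(pure), skips_cooldown(wrapped))
```

Matching on the substring '429' is deliberately loose; it would also match unrelated digits in the exception text (a request ID, for example), which errs on the side of cooling a deployment down.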

Fix 2 — `constants.py`: 9 providers.json providers missing from `openai_compatible_providers` (closes #24366)

veniceai, scaleway, gmi, sarvam, xiaomi_mimo, abliteration, llamagate, assemblyai, charity_engine were not in openai_compatible_providers. Without this, their exceptions fell through to the catch-all mapper which raises APIConnectionError for any status code — preventing RateLimitError from being raised for 429 responses.

Fix 3 — `exception_mapping_utils.py`: Anthropic 400 'credit balance too low' → `RateLimitError` (closes #24320)

Anthropic returns HTTP 400 for billing exhaustion with the message "Your credit balance is too low to access the Anthropic API". This was mapped to BadRequestError, preventing router fallback.

The request itself is valid — the failure is a billing/quota issue that should be treated like a rate limit so the router tries alternative deployments.
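
As a rough illustration of this mapping, the sketch below classifies an Anthropic error by status code and message. It is not the code from `exception_mapping_utils.py`: the two exception classes are local stand-ins (not the litellm classes of the same name), and `map_anthropic_error` and `CREDIT_BALANCE_PHRASE` are hypothetical names.

```python
class BadRequestError(Exception):
    """Stand-in for a non-retryable 400 error (not the litellm class)."""


class RateLimitError(Exception):
    """Stand-in for a fallback-eligible rate-limit error (not the litellm class)."""


# Lowercase fragment of the billing message quoted in the issue
CREDIT_BALANCE_PHRASE = "credit balance is too low"


def map_anthropic_error(status_code: int, message: str) -> Exception:
    """Map an Anthropic HTTP error to an exception a router could act on."""
    if status_code == 429:
        return RateLimitError(message)
    if status_code == 400:
        if CREDIT_BALANCE_PHRASE in message.lower():
            # Billing exhaustion: the request is valid, so treat it like a rate limit
            return RateLimitError(message)
        return BadRequestError(message)
    return Exception(f"unmapped status {status_code}: {message}")


billing = map_anthropic_error(
    400, "Your credit balance is too low to access the Anthropic API."
)
print(type(billing).__name__)
```

The three branches mirror the cases listed for TestAnthropicCreditBalanceMapping: credit balance, a normal 400, and a 429.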

Tests

21 new regression tests in tests/router_unit_tests/test_pr_429_routing_fixes.py:

TestCooldownAPIConnectionError429 (6 tests) — pure vs 429-wrapped APIConnectionError
TestProvidersJsonInOpenaiCompatible (10 tests) — all 9 new providers + existing ones unchanged
TestAnthropicCreditBalanceMapping (5 tests) — credit balance → RateLimitError, normal 400 → BadRequestError, 429 → RateLimitError

All 21 pass — 0 regressions.

Changed files

litellm/constants.py (modified, +9/-0)
litellm/litellm_core_utils/exception_mapping_utils.py (modified, +36/-0)
litellm/llms/anthropic/experimental_pass_through/messages/streaming_iterator.py (modified, +28/-4)
litellm/llms/xai/chat/transformation.py (modified, +5/-0)
litellm/model_prices_and_context_window_backup.json (modified, +37329/-37334)
litellm/router.py (modified, +11/-1)
litellm/router_utils/cooldown_handlers.py (modified, +12/-1)
model_prices_and_context_window.json (modified, +37329/-37334)
tests/router_unit_tests/test_pr_429_routing_fixes.py (added, +247/-0)
tests/router_unit_tests/test_pr_litellm_round2_fixes.py (added, +414/-0)

PR #107: refactor: extract business logic from routers to CQRS service layer

Repository: pilotspace/pilot-space
Author: TinDang97
State: closed | merged: True
Link: https://github.com/pilotspace/pilot-space/pull/107

Description (problem / solution / changelog)

Summary

Extract business logic from 22+ routers into 20 new CQRS service classes following Clean Architecture
Register all services in DI container (20 Factory providers, 20 Dep types)
Create shared utilities: RateLimitService, IssuePriorityMapper, StateNameNormalizer
Consolidate duplicated _require_admin patterns and _get_admin_workspace across routers
Register previously unwired services (FeatureToggleService, ScimService) in DI
Migrate raw SQL in 3 services to repository pattern (NoteTemplateService, AdminDashboardService, DependencyGraphService)
Fix regression: restore inline admin checks for slug-aware routers (skill_templates, workspace_plugins, workspace_role_skills)

New Services Created (20)

Service	Source Router
`RateLimitService`	ghost_text, issues_ai_context (x2)
`AdminDashboardService`	admin.py
`AIConfigurationService`	ai_configuration.py
`CreateExtractedIssuesService`	ai_extraction.py, workspace_notes_ai.py
`GovernanceRollbackService`	ai_governance.py
`BlockOwnershipService`	block_ownership.py
`DependencyGraphService`	dependency_graph.py
`NoteTemplateService`	note_templates.py
`RelatedIssuesSuggestionService`	related_issues.py
`McpServerService` + `McpOAuthService`	workspace_mcp_servers.py
`AttachmentManagementService`	ai_attachments.py
`PluginLifecycleService`	workspace_plugins.py
`OcrConfigurationService`	workspace_ocr_settings.py
`WorkspaceAISettingsService`	workspace_ai_settings.py
`SprintBoardService`	pm_sprint_board.py
`CapacityPlanService`	pm_capacity.py
`ActionButtonService`	workspace_action_buttons.py
`MCPToolExecutionService`	mcp_tools.py
`ProjectDetailService`	projects.py

Stats

58 files changed, +7,015 / -4,734 lines
Biggest router reductions: workspace_mcp_servers.py (960→350), ai_configuration.py (649→200), ai_governance.py (530→150)

Test plan

Verify skill-templates endpoint works with workspace slugs (was broken, now fixed)
Verify workspace_plugins endpoints work with workspace slugs
Verify workspace_role_skills endpoints work with workspace slugs
Run make quality-gates-backend (ruff, pyright, pytest)
Smoke test: workspace CRUD, issue CRUD, AI chat, plugin install/uninstall
Verify MCP server OAuth flow still works
Verify AI configuration provider key testing still works

Summary by CodeRabbit

Refactor
- Many API routes simplified to thin HTTP handlers delegating logic to centralized services for more consistent behavior.
New Features
- Admin dashboard, AI configuration & provider key testing, OCR settings, attachment signed URLs & ingestion, MCP server/tool/OAuth support, plugin lifecycle tools, sprint board & capacity planning, project detail summaries, dependency-graph view, related-issue suggestions, note templates, workspace action buttons, AI governance rollback, and Redis-backed rate limiting.
Tests
- Unit tests updated to validate service-level behavior and router delegation.

Changed files

backend/src/pilot_space/ai/infrastructure/anthropic_client_pool.py (modified, +45/-21)
backend/src/pilot_space/ai/services/ghost_text.py (modified, +117/-20)
backend/src/pilot_space/api/v1/dependencies.py (modified, +362/-0)
backend/src/pilot_space/api/v1/routers/_chat_schemas.py (modified, +12/-82)
backend/src/pilot_space/api/v1/routers/_mcp_server_schemas.py (modified, +33/-546)
backend/src/pilot_space/api/v1/routers/admin.py (modified, +14/-293)
backend/src/pilot_space/api/v1/routers/ai.py (modified, +0/-24)
backend/src/pilot_space/api/v1/routers/ai_annotations.py (modified, +1/-19)
backend/src/pilot_space/api/v1/routers/ai_approvals.py (modified, +6/-52)
backend/src/pilot_space/api/v1/routers/ai_attachments.py (modified, +45/-237)
backend/src/pilot_space/api/v1/routers/ai_chat.py (modified, +2/-2)
backend/src/pilot_space/api/v1/routers/ai_chat_model_routing.py (modified, +4/-43)
backend/src/pilot_space/api/v1/routers/ai_configuration.py (modified, +35/-473)
backend/src/pilot_space/api/v1/routers/ai_extraction.py (modified, +65/-279)
backend/src/pilot_space/api/v1/routers/ai_governance.py (modified, +34/-431)
backend/src/pilot_space/api/v1/routers/ai_pr_review.py (modified, +1/-39)
backend/src/pilot_space/api/v1/routers/ai_sessions.py (modified, +7/-102)
backend/src/pilot_space/api/v1/routers/ai_tasks.py (modified, +1/-42)
backend/src/pilot_space/api/v1/routers/block_ownership.py (modified, +28/-188)
backend/src/pilot_space/api/v1/routers/dependency_graph.py (modified, +35/-189)
backend/src/pilot_space/api/v1/routers/ghost_text.py (modified, +4/-42)
backend/src/pilot_space/api/v1/routers/knowledge_graph.py (modified, +1/-13)
backend/src/pilot_space/api/v1/routers/mcp_tools.py (modified, +35/-151)
backend/src/pilot_space/api/v1/routers/note_templates.py (modified, +44/-159)
backend/src/pilot_space/api/v1/routers/notes_ai.py (modified, +2/-23)
backend/src/pilot_space/api/v1/routers/notifications.py (modified, +5/-45)
backend/src/pilot_space/api/v1/routers/pm_blocks.py (modified, +21/-95)
backend/src/pilot_space/api/v1/routers/pm_capacity.py (modified, +9/-88)
backend/src/pilot_space/api/v1/routers/pm_dependency_graph.py (modified, +5/-26)
backend/src/pilot_space/api/v1/routers/pm_release_notes.py (modified, +1/-20)
backend/src/pilot_space/api/v1/routers/pm_sprint_board.py (modified, +20/-134)
backend/src/pilot_space/api/v1/routers/projects.py (modified, +39/-223)
backend/src/pilot_space/api/v1/routers/related_issues.py (modified, +35/-153)
backend/src/pilot_space/api/v1/routers/skill_approvals.py (modified, +6/-62)
backend/src/pilot_space/api/v1/routers/skill_templates.py (modified, +8/-23)
backend/src/pilot_space/api/v1/routers/skills.py (modified, +1/-17)
backend/src/pilot_space/api/v1/routers/webhooks.py (modified, +3/-29)
backend/src/pilot_space/api/v1/routers/workspace_action_buttons.py (modified, +30/-132)
backend/src/pilot_space/api/v1/routers/workspace_ai_settings.py (modified, +7/-262)
backend/src/pilot_space/api/v1/routers/workspace_encryption.py (modified, +10/-62)
backend/src/pilot_space/api/v1/routers/workspace_feature_toggles.py (modified, +3/-3)
backend/src/pilot_space/api/v1/routers/workspace_issues.py (modified, +7/-37)
backend/src/pilot_space/api/v1/routers/workspace_mcp_servers.py (modified, +91/-606)
backend/src/pilot_space/api/v1/routers/workspace_note_issue_links.py (modified, +4/-32)
backend/src/pilot_space/api/v1/routers/workspace_note_links.py (modified, +6/-46)
backend/src/pilot_space/api/v1/routers/workspace_notes_ai.py (modified, +42/-110)
backend/src/pilot_space/api/v1/routers/workspace_ocr_settings.py (modified, +53/-194)
backend/src/pilot_space/api/v1/routers/workspace_plugins.py (modified, +51/-306)
backend/src/pilot_space/api/v1/routers/workspace_quota.py (modified, +5/-44)
backend/src/pilot_space/api/v1/routers/workspace_role_skills.py (modified, +7/-23)
backend/src/pilot_space/api/v1/schemas/ai.py (added, +36/-0)
backend/src/pilot_space/api/v1/schemas/ai_annotations.py (added, +31/-0)
backend/src/pilot_space/api/v1/schemas/ai_chat.py (added, +102/-0)
backend/src/pilot_space/api/v1/schemas/ai_chat_model_routing.py (added, +60/-0)
backend/src/pilot_space/api/v1/schemas/ai_extraction.py (added, +118/-0)
backend/src/pilot_space/api/v1/schemas/ai_governance.py (added, +34/-0)
backend/src/pilot_space/api/v1/schemas/ai_sessions.py (added, +125/-0)
backend/src/pilot_space/api/v1/schemas/ai_tasks.py (added, +57/-0)
backend/src/pilot_space/api/v1/schemas/attachments.py (modified, +5/-37)
backend/src/pilot_space/api/v1/schemas/dependency_graph.py (added, +46/-0)
backend/src/pilot_space/api/v1/schemas/ghost_text.py (added, +80/-0)
backend/src/pilot_space/api/v1/schemas/issue.py (modified, +7/-0)
backend/src/pilot_space/api/v1/schemas/knowledge_graph.py (modified, +8/-0)
backend/src/pilot_space/api/v1/schemas/mcp_server.py (added, +571/-0)
backend/src/pilot_space/api/v1/schemas/mcp_tools.py (added, +52/-0)
backend/src/pilot_space/api/v1/schemas/notifications.py (added, +57/-0)
backend/src/pilot_space/api/v1/schemas/pm_blocks.py (added, +47/-0)
backend/src/pilot_space/api/v1/schemas/pm_capacity.py (added, +40/-0)
backend/src/pilot_space/api/v1/schemas/pm_dependency_graph.py (added, +45/-0)
backend/src/pilot_space/api/v1/schemas/pm_release_notes.py (added, +36/-0)
backend/src/pilot_space/api/v1/schemas/pm_sprint_board.py (added, +70/-0)
backend/src/pilot_space/api/v1/schemas/pr_review.py (modified, +34/-0)
backend/src/pilot_space/api/v1/schemas/related_issues.py (added, +44/-0)
backend/src/pilot_space/api/v1/schemas/skill_approvals.py (added, +84/-0)
backend/src/pilot_space/api/v1/schemas/skills.py (added, +30/-0)
backend/src/pilot_space/api/v1/schemas/workspace_encryption.py (added, +77/-0)
backend/src/pilot_space/api/v1/schemas/workspace_note_issue_links.py (added, +43/-0)
backend/src/pilot_space/api/v1/schemas/workspace_note_links.py (added, +58/-0)
backend/src/pilot_space/api/v1/schemas/workspace_notes_ai.py (added, +39/-0)
backend/src/pilot_space/api/v1/schemas/workspace_ocr_settings.py (added, +51/-0)
backend/src/pilot_space/api/v1/schemas/workspace_quota.py (added, +55/-0)
backend/src/pilot_space/application/services/action_button.py (added, +174/-0)
backend/src/pilot_space/application/services/admin_dashboard.py (added, +198/-0)
backend/src/pilot_space/application/services/ai_configuration.py (added, +446/-0)
backend/src/pilot_space/application/services/ai_extraction.py (added, +234/-0)
backend/src/pilot_space/application/services/ai_governance.py (added, +347/-0)
backend/src/pilot_space/application/services/attachment_management.py (added, +317/-0)
backend/src/pilot_space/application/services/block_ownership.py (added, +239/-0)
backend/src/pilot_space/application/services/capacity_plan.py (added, +118/-0)
backend/src/pilot_space/application/services/dependency_graph.py (added, +241/-0)
backend/src/pilot_space/application/services/mcp_oauth.py (added, +292/-0)
backend/src/pilot_space/application/services/mcp_server.py (added, +554/-0)
backend/src/pilot_space/application/services/mcp_tool_execution.py (added, +156/-0)
backend/src/pilot_space/application/services/note_template.py (added, +203/-0)
backend/src/pilot_space/application/services/ocr_configuration.py (added, +175/-0)
backend/src/pilot_space/application/services/plugin_lifecycle.py (added, +357/-0)
backend/src/pilot_space/application/services/pm_block_insight_service.py (modified, +112/-0)
backend/src/pilot_space/application/services/project_detail.py (added, +187/-0)
backend/src/pilot_space/application/services/rate_limit.py (added, +74/-0)
backend/src/pilot_space/application/services/related_issues.py (added, +158/-0)



Bug Report

Description

Expected Behavior

When Anthropic returns a billing-related error (credit balance too low, quota exceeded, etc.), LiteLLM should classify it as a fallback-eligible error and route to the configured fallback model group, just as it would for a 500 or 429.

Actual Behavior

The router raises the error directly to the caller with:

No fallback model group found for original model_group=anthropic/claude-opus-4-6
Error doing the fallback: litellm.InternalServerError: ...

The fallback mechanism is entered but fails because the error is classified as non-retryable. CooldownDeployments=[] confirms the failed deployment is never placed on cooldown.

Broader Issue

More generally, there are several 400-class errors from providers that should trigger fallback because they are provider-specific issues, not request issues:

Anthropic: 400 with "credit balance too low" — billing issue
Anthropic: 400 with "overloaded" — capacity issue
OpenAI: 400 with "billing_hard_limit_reached" — billing issue
Any provider: Account-level blocks, region restrictions, or temporary suspensions returned as 400

These are all cases where the same request would succeed on a different provider/deployment. LiteLLM should either:

Treat specific known billing/quota error messages as fallback-eligible regardless of HTTP status code
Provide a configuration option (e.g., fallback_on_status_codes: [400, 500, 429]) to let users control which status codes trigger fallback routing
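
The two options above could be combined into one eligibility check. The following is a minimal sketch under stated assumptions: `FallbackPolicy` and both of its fields are hypothetical (the issue only proposes `fallback_on_status_codes` as an example name, and neither field is claimed to be an existing LiteLLM setting), and the default values are chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class FallbackPolicy:
    """Hypothetical policy object combining both proposals from the issue."""

    # Status codes that always trigger fallback (assumed defaults)
    fallback_on_status_codes: set[int] = field(default_factory=lambda: {429, 500})
    # Lowercase message fragments that trigger fallback regardless of status code
    fallback_on_message_substrings: tuple[str, ...] = (
        "credit balance is too low",
        "billing_hard_limit_reached",
    )

    def is_fallback_eligible(self, status_code: int, message: str) -> bool:
        if status_code in self.fallback_on_status_codes:
            return True
        lowered = message.lower()
        return any(s in lowered for s in self.fallback_on_message_substrings)


policy = FallbackPolicy()
print(policy.is_fallback_eligible(
    400, "Your credit balance is too low to access the Anthropic API."
))
```

Keeping the status-code set and the message fragments separate lets an ordinary malformed-request 400 stay non-retryable while a billing 400 still falls back.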

Our Use Case

We run a multi-provider setup with fallbacks configured across Anthropic, Databricks (Foundation Model API), Vertex AI, and AWS Bedrock — all serving the same Claude models. When one provider has an issue (billing, outage, rate limits), we expect LiteLLM to seamlessly route to the next provider. This worked perfectly in our mock_testing_fallbacks tests but failed in production because the real Anthropic error came back as a 400.

Reproduction

Configure a model with fallbacks:

anthropic/claude-opus-4-6 -> databricks/databricks-claude-opus-4-6, vertex_ai/claude-opus-4-6

Ensure the Anthropic API key has zero credits
Send a request to anthropic/claude-opus-4-6
Observe: fallback is NOT triggered, error is returned directly
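
For reference, the fallback chain in step 1 might be expressed as the data structures below. This is a sketch, not the reporter's actual configuration: it builds plain Python data only, credentials and provider-specific parameters are omitted, and using the provider-prefixed name as `model_name` is an assumption. In a real setup these values would be passed to `litellm.Router(model_list=..., fallbacks=...)`.

```python
# Model names are taken from the reproduction steps above
PRIMARY = "anthropic/claude-opus-4-6"
FALLBACKS = [
    "databricks/databricks-claude-opus-4-6",
    "vertex_ai/claude-opus-4-6",
]

# One deployment per model group; API keys and regions are left out on purpose
model_list = [
    {"model_name": name, "litellm_params": {"model": name}}
    for name in [PRIMARY, *FALLBACKS]
]

# Maps the primary model group to its ordered fallback groups
fallbacks = [{PRIMARY: FALLBACKS}]

print(fallbacks)
```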

Anthropic Error Response

{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "Your credit balance is too low to access the Anthropic API. Please go to Plans & Billing to upgrade or purchase credits."
  }
}

HTTP status: 400 Bad Request

Environment

LiteLLM version: v1.81.0
Deployment: ECS Fargate
Python: 3.13

Note on Anthropic's Error Classification

Anthropic arguably should return 402 Payment Required or 429 Too Many Requests for billing issues rather than 400 Bad Request with invalid_request_error. However, LiteLLM shouldn't rely solely on HTTP status codes for fallback decisions — the error message content should also be considered.

extent analysis

Fix Plan

To address the issue, LiteLLM's error handling would need to treat specific known billing/quota error messages as fallback-eligible, regardless of the HTTP status code. One way to do this is a new configuration option that lists the error messages that should trigger fallback routing.

Step 1: Add a configuration option

Add a new configuration option fallback_error_messages to specify the error messages that should trigger fallback routing.

# config.py
fallback_error_messages = [
    "Your credit balance is too low to access the Anthropic API",
    "billing_hard_limit_reached",
    # Add other known billing/quota error messages here
]

Step 2: Modify the error handling logic

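Modify the error handling logic to check whether the error message contains any entry in the fallback_error_messages list. If it does, trigger the fallback routing. A substring check is needed because the real Anthropic message continues past the listed phrase ("... Please go to Plans & Billing to upgrade or purchase credits."), so an exact match would never succeed.

# error_handler.py
from config import fallback_error_messages
from fallback import trigger_fallback

def handle_error(error):
    # Exceptions do not always expose .message, so fall back to str(error)
    message = getattr(error, "message", None) or str(error)
    if any(known in message for known in fallback_error_messages):
        # Known billing/quota message: trigger fallback routing
        return trigger_fallback()
    # Anything else is raised directly to the caller
    raise error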

Step 3: Update the fallback mechanism

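Update the fallback mechanism so that errors matched by fallback_error_messages are routed to the fallback model group. The two helpers called below are left undefined in this plan.

# fallback.py
def trigger_fallback():
    # Look up the fallback model group (helper not defined in this plan)
    fallback_model_group = get_fallback_model_group()

    # Route the request to it and return the result (helper not defined in this plan)
    return route_to_fallback_model_group(fallback_model_group)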

Verification

To verify that the fix worked, you can test the following scenarios:

Send a request to anthropic/claude-opus-4-6 with an Anthropic API key that has zero credits. The fallback should be triggered, and the request should be routed to the next provider.
Send a request to anthropic/claude-opus-4-6 with an Anthropic API key that has sufficient credits. The request should succeed, and the fallback should not be triggered.
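
Both scenarios can be rehearsed without network access. The sketch below simulates them with fake providers; it does not use LiteLLM, and `BillingError`, `make_provider`, and `complete_with_fallback` are hypothetical names introduced for this example.

```python
class BillingError(Exception):
    """Stand-in for a fallback-eligible billing error."""


def make_provider(name: str, has_credits: bool):
    """Build a fake provider callable; no network calls are made."""
    def call(prompt: str) -> str:
        if not has_credits:
            raise BillingError(f"{name}: credit balance is too low")
        return f"{name} answered: {prompt}"
    return call


def complete_with_fallback(providers, prompt: str):
    """Try providers in order; return (response, indexes of providers attempted)."""
    attempted = []
    last_error = None
    for index, provider in enumerate(providers):
        attempted.append(index)
        try:
            return provider(prompt), attempted
        except BillingError as exc:
            last_error = exc  # fallback-eligible: move on to the next provider
    if last_error is None:
        raise ValueError("no providers configured")
    raise last_error


# Scenario 1: primary has zero credits; scenario 2: primary is funded
zero_credit = [make_provider("anthropic", False), make_provider("vertex_ai", True)]
funded = [make_provider("anthropic", True), make_provider("vertex_ai", True)]
print(complete_with_fallback(zero_credit, "hi"))
print(complete_with_fallback(funded, "hi"))
```

The list of attempted indexes makes it easy to assert that the fallback ran only in the zero-credit scenario.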

Extra Tips

Make sure to update the fallback_error_messages list with known billing/quota error messages from all providers.
Consider adding a configuration option to let users control which status codes trigger fallback routing, in addition to the fallback_error_messages list.
Test the fallback mechanism thoroughly to ensure it works as expected in different scenarios.
