litellm - ✅(Solved) Fix [Bug]: litellm.llms.openai.common_utils.OpenAIError: Bad Request when using Claude Code 2.1.69 [1 pull requests, 5 comments, 3 participants]

litellm • 2026-03-05 06:37:56


BerriAI/litellm#22878•Fetched 2026-04-08 00:39:30

Timeline (top)

commented ×5, labeled ×4, subscribed ×4, cross-referenced ×1

Error Message

Check the error and log in litellm Final returned optional params: {'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 32000, 'tools': [{'type': 'function', 'function': {'name': 'Read', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'file_path': {'description': 'The absolute path to the file to read', 'type': 'string'}, 'offset': {'description': 'The line number to start reading from. Only provide if the file is too large to read at once', 'type': 'number'}, 'limit': {'description': 'The number of lines to read. Only provide if the file is too large to read at once.', 'type': 'number'}, 'pages': {'description': 'Page range for PDF files (e.g., "1-5", "3", "10-20"). Only applicable to PDF files. Maximum 20 pages per request.', 'type': 'string'}}, 'required': ['file_path'], 'additionalProperties': False, 'defer_loading': True}, 'description': 'Reads a file from the local filesystem. You can access any file directly by using this tool.\nAssume this tool is able to read all files on the machine. If the User provides a path to a file assume that path is valid. It is okay to read a file that does not exist; an error will be returned.\n\nUsage:\n- The file_path parameter must be an absolute path, not a relative path\n- By default, it reads up to 2000 lines starting from the beginning of the file\n- You can optionally specify a line offset and limit (especially handy for long files), but it's recommended to read the whole file by not providing these parameters\n- Any lines longer than 2000 characters will be truncated\n- Results are returned using cat -n format, with line numbers starting at 1\n- This tool allows Claude Code to read images (eg PNG, JPG, etc). When reading an image file the contents are presented visually as Claude Code is a multimodal LLM.\n- This tool can read PDF files (.pdf). 
For large PDFs (more than 10 pages), you MUST provide the pages parameter to read specific page ranges (e.g., pages: "1-5"). Reading a large PDF without the pages parameter will fail. Maximum 20 pages per request.\n- This tool can read Jupyter notebooks (.ipynb files) and returns all cells with their outputs, combining code, text, and visualizations.\n- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.\n- You can call multiple tools in a single response. It is always better to speculatively read multiple potentially useful files in parallel.\n- You will regularly be asked to read screenshots. If the user provides a path to a screenshot, ALWAYS use this tool to view the file at the path. This tool will work with all temporary file paths.\n- If you read a file that exists but has empty contents you will receive a system reminder warning in place of file contents.'}}, {'type': 'function', 'function': {'name': 'ToolSearch', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'query': {'description': 'Query to find deferred tools. Use "select:<tool_name>" for direct selection, or keywords to search.', 'type': 'string'}, 'max_results': {'description': 'Maximum number of results to return (default: 5)', 'default': 5, 'type': 'number'}}, 'required': ['query', 'max_results'], 'additionalProperties': False}, 'description': 'Search for or select deferred tools to make them available for use.\n\nMANDATORY PREREQUISITE - THIS IS A HARD REQUIREMENT\n\nYou MUST use this tool to load deferred tools BEFORE calling them directly.\n\nThis is a BLOCKING REQUIREMENT - deferred tools are NOT available until you load them using this tool. Look for <available-deferred-tools> messages in the conversation for the list of tools you can discover. 
Both query modes (keyword search and direct selection) load the returned tools — once a tool appears in the results, it is immediately available to call.\n\nWhy this is non-negotiable:\n- Deferred tools are not loaded until discovered via this tool\n- Calling a deferred tool without first loading it will fail\n\nQuery modes:\n\n1. Keyword search - Use keywords when you're unsure which tool to use or need to discover multiple tools at once:\n - "list directory" - find tools for listing directories\n - "notebook jupyter" - find notebook editing tools\n - "slack message" - find slack messaging tools\n - Returns up to 5 matching tools ranked by relevance\n - All returned tools are immediately available to call — no further selection step needed\n\n2. Direct selection - Use select:<tool_name> when you know the exact tool name:\n - "select:mcp__slack__read_channel"\n - "select:NotebookEdit"\n - "select:Read,Edit,Grep" - load multiple tools at once with comma separation\n - Returns the named tool(s) if they exist\n\nIMPORTANT: Both modes load tools equally. Do NOT follow up a keyword search with select: calls for tools already returned — they are already loaded.\n\n3. 
Required keyword - Prefix with + to require a match:\n - "+linear create issue" - only tools from "linear", ranked by "create"/"issue"\n - "+slack send" - only "slack" tools, ranked by "send"\n - Useful when you know the service name but not the exact tool\n\nCORRECT Usage Patterns:\n\n<example>\nUser: I need to work with slack somehow\nAssistant: Let me search for slack tools.\n[Calls ToolSearch with query: "slack"]\nAssistant: Found several options including mcp__slack__read_channel.\n[Calls mcp__slack__read_channel directly — it was loaded by the keyword search]\n</example>\n\n<example>\nUser: Edit the Jupyter notebook\nAssistant: Let me load the notebook editing tool.\n[Calls ToolSearch with query: "select:NotebookEdit"]\n[Calls NotebookEdit]\n</example>\n\n<example>\nUser: List files in the src directory\nAssistant: I can see mcp__filesystem__list_directory in the available tools. Let me select it.\n[Calls ToolSearch with query: "select:mcp__filesystem__list_directory"]\n[Calls the tool]\n</example>\n\nINCORRECT Usage Patterns - NEVER DO THESE:\n\n<bad-example>\nUser: Read my slack messages\nAssistant: [Directly calls mcp__slack__read_channel without loading it first]\nWRONG - You must load the tool FIRST using this tool\n</bad-example>\n\n<bad-example>\nAssistant: [Calls ToolSearch with query: "slack", gets back mcp__slack__read_channel]\nAssistant: [Calls ToolSearch with query: "select:mcp__slack__read_channel"]\nWRONG - The keyword search already loaded the tool. 
The select call is redundant.\n</bad-example>'}}], 'max_retries': 0, 'extra_body': {}}
Custom Logger Error - Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
Custom Logger Error - Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
14:34:42 - LiteLLM Proxy:ERROR: endpoints.py:121 - litellm.proxy.proxy_server.anthropic_response(): Exception occured - litellm.BadRequestError: Github_copilotException - Bad Request. Received Model Group=claude-sonnet-4-5
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
error=e, raise error

Fix Action

Fixed

Fixed by PR: fix(anthropic): deduplicate tool_result messages by tool_call_id (https://github.com/BerriAI/litellm/pull/23104)

PR fix notes

PR #23104: fix(anthropic): deduplicate tool_result messages by tool_call_id

Repository: BerriAI/litellm
Author: netbrah
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/23104

Description (problem / solution / changelog)

Related Issues

Related to #16711: "each tool_use must have a single result. Found multiple tool_result blocks" with tool response handling
Fixes #22946: Anthropic rejects tool_use ordering when context compaction merges assistant turns
Related to #22878: Claude Code Bad Request errors in multi-turn tool calling

What this PR does

Adds "Case D" deduplication to sanitize_messages_for_tool_calling() — when multiple tool messages reference the same tool_call_id within a contiguous tool-result block, only the last occurrence is kept.

Anthropic requires exactly one tool_result per tool_use and rejects with:

each tool_use must have a single result. Found multiple tool_result blocks
with id: toolu_xxx

Root Cause

Session history replay (conversation resume, checkpointing) can produce duplicate tool_result messages for the same tool_call_id. The existing sanitize_messages_for_tool_calling() handles orphaned tool results (Cases A-C) but does NOT detect duplicates.
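To make the failure shape concrete, here is a minimal sketch of an OpenAI-format history in which one tool call has two results, plus a small detector. The function name `find_duplicate_tool_results` and the sample messages are hypothetical and are not part of litellm. For simplicity the detector counts across the whole list, whereas the PR scopes its check to each contiguous block of tool results.

```python
from collections import Counter
from typing import Dict, List


def find_duplicate_tool_results(messages: List[dict]) -> Dict[str, int]:
    """Return tool_call_ids that appear in more than one tool message."""
    counts = Counter(
        m["tool_call_id"]
        for m in messages
        if m.get("role") == "tool" and m.get("tool_call_id")
    )
    return {tool_call_id: n for tool_call_id, n in counts.items() if n > 1}


# Hypothetical replayed history: the result for call_1 was recorded twice.
replayed_history = [
    {"role": "user", "content": "Read config.yaml"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {"id": "call_1", "type": "function",
             "function": {"name": "Read", "arguments": "{}"}},
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": "partial output"},
    {"role": "tool", "tool_call_id": "call_1", "content": "final output"},
]

print(find_duplicate_tool_results(replayed_history))  # {'call_1': 2}
```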

Note: This dedup already exists for Bedrock (_deduplicate_bedrock_content_blocks) but was missing for the general OpenAI-format message path used by Anthropic and Vertex AI.

Fix

After existing sanitization (Cases A-C), a second pass scans sanitized_messages for duplicate tool_call_id values within each contiguous block of tool results:

Track seen_in_block: Dict[str, int] mapping tool_call_id → message index
When a duplicate is found, mark the earlier occurrence for removal (last-wins strategy)
Reset tracking on any non-tool message (user/assistant/system) to scope dedup per conversation turn
Tool/function messages without a tool_call_id (malformed) do NOT reset the block — only real turn boundaries do
Log each dedup via verbose_logger.warning() for observability

The last-wins strategy differs from Bedrock's first-wins (_deduplicate_bedrock_content_blocks) because the duplicate here arises from session history replay where the last entry represents the final state, not provider-side content block duplication.
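The second pass described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code merged in the PR: `dedupe_tool_results` is a hypothetical name, it uses the standard `logging` module in place of litellm's `verbose_logger`, and it skips the Cases A-C sanitization that runs first in the real function.

```python
import logging
from typing import Dict, List, Set

logger = logging.getLogger(__name__)


def dedupe_tool_results(messages: List[dict]) -> List[dict]:
    """Keep only the last tool result per tool_call_id in each contiguous block."""
    seen_in_block: Dict[str, int] = {}  # tool_call_id -> index of latest occurrence
    to_remove: Set[int] = set()

    for index, message in enumerate(messages):
        if message.get("role") in ("tool", "function"):
            tool_call_id = message.get("tool_call_id")
            if not tool_call_id:
                # Malformed tool message: keep it and do not reset the block.
                continue
            if tool_call_id in seen_in_block:
                # Last-wins: drop the earlier occurrence, keep the newer one.
                to_remove.add(seen_in_block[tool_call_id])
                logger.warning("Dropping duplicate tool result for %s", tool_call_id)
            seen_in_block[tool_call_id] = index
        else:
            # A user/assistant/system message is a turn boundary.
            seen_in_block = {}

    return [m for i, m in enumerate(messages) if i not in to_remove]


history = [
    {"role": "user", "content": "Read config.yaml"},
    {"role": "assistant", "content": None},
    {"role": "tool", "tool_call_id": "call_1", "content": "stale output"},
    {"role": "tool", "tool_call_id": "call_1", "content": "final output"},
]
print([m["content"] for m in dedupe_tool_results(history)])
# ['Read config.yaml', None, 'final output']
```

Because tracking resets at every non-tool message, the same `tool_call_id` appearing in two different turns is left untouched, which matches the cross-turn behavior the PR's tests describe.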

Tests

5 unit tests (no network calls):

test_sanitize_messages_deduplicates_tool_results — duplicate tool_call_id within one turn, keeps last
test_sanitize_messages_preserves_unique_tool_results — distinct IDs pass through unchanged with content verified
test_sanitize_messages_dedup_disabled_when_modify_params_false — no sanitization when flag is off
test_sanitize_messages_dedup_scoped_per_turn_preserves_cross_turn — same tool_call_id in two different turns, both preserved
test_sanitize_messages_combined_case_a_and_case_d — combined scenario: missing result (Case A dummy) + duplicate results (Case D dedup) in same assistant message, with ordering assertions

Files Changed

File	Change
`litellm/litellm_core_utils/prompt_templates/factory.py`	Case D dedup logic in `sanitize_messages_for_tool_calling()`
`tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py`	5 new test functions

Changed files

litellm/litellm_core_utils/prompt_templates/factory.py (modified, +48/-0)
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py (modified, +287/-1)


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

I proxy my GitHub Copilot models into Claude Code via litellm. This had been working well, but it broke today.

I tested the proxy directly with:

curl -X POST http://localhost:4444/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $(grep LITELLM_MASTER_KEY .env | cut -d'=' -f2 | tr -d '\"')" \
  -d '{"model": "claude-sonnet-4", "messages": [{"role": "user", "content": "Hello"}]}'

It returned a 200 response, and GitHub Copilot in VS Code also responds and works as normal, so I think something is wrong between litellm and Claude Code.
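The same smoke test can be sketched in Python with only the standard library. The URL, model name, and payload come from the curl command; everything else is an assumption for illustration: the helper names `build_smoke_request` and `send` are hypothetical, and the key is read from a `LITELLM_MASTER_KEY` environment variable rather than parsed out of `.env`.

```python
import json
import os
import urllib.request

PROXY_URL = "http://localhost:4444/chat/completions"  # port taken from the report


def build_smoke_request(api_key: str, model: str = "claude-sonnet-4") -> urllib.request.Request:
    """Build the same POST request as the curl command, without sending it."""
    payload = {"model": model, "messages": [{"role": "user", "content": "Hello"}]}
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code (needs a running proxy)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.status


# Assumption: the key is exported as an environment variable; the fallback is a placeholder.
request = build_smoke_request(os.environ.get("LITELLM_MASTER_KEY", "sk-placeholder"))
print(request.get_method(), request.full_url)
```

Calling `send(request)` against a running proxy should return 200 if the plain chat path is healthy. Note that this request carries no `tools`, so it does not exercise the tool-calling path that fails in the log.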

My configuration:

litellm_settings:
  drop_params: true  
  set_verbose: true 

model_list:
  - model_name: claude-3-5-sonnet-20241022
    litellm_params:
      model: github_copilot/claude-sonnet-4.6 
      extra_headers: &copilot_headers
        Editor-Version: "vscode/1.85.1"
        Copilot-Integration-Id: "vscode-chat"
  - model_name: claude-sonnet-4
    litellm_params:
      model: github_copilot/claude-sonnet-4 
      extra_headers: *copilot_headers

  - model_name: claude-opus-4
    litellm_params:
      model: github_copilot/claude-opus-4.6 
      extra_headers: *copilot_headers

    # Claude Opus 4.6 (Anthropic) - Max tokens: 64000, Context: 200000
  - model_name: claude-opus-4-6  
    litellm_params:
      model: github_copilot/claude-opus-4.6
      extra_headers: *copilot_headers

    # Claude Sonnet 4.6 (Anthropic) - Max tokens: 32000, Context: 200000
  - model_name: claude-sonnet-4-6  
    litellm_params:
      model: github_copilot/claude-sonnet-4.6
      extra_headers: *copilot_headers

    # Claude Sonnet 4.5 (Anthropic) - Max tokens: 32000, Context: 144000
  - model_name: claude-sonnet-4-5  
    litellm_params:
      model: github_copilot/claude-sonnet-4.5
      extra_headers: *copilot_headers

    # Claude Opus 4.5 (Anthropic) - Max tokens: 32000, Context: 160000
  - model_name: claude-opus-4-5  
    litellm_params:
      model: github_copilot/claude-opus-4.5
      extra_headers: *copilot_headers

    # Claude Haiku 4.5 (Anthropic) - Max tokens: 32000, Context: 144000
  - model_name: claude-haiku-4-5  
    litellm_params:
      model: github_copilot/claude-haiku-4.5
      extra_headers: *copilot_headers

  - model_name: gemini-3-1-pro-preview
    litellm_params:
      model: github_copilot/gemini-3.1-pro-preview
      extra_headers: *copilot_headers
    # Gemini 3.1 Pro (Google) - Max tokens: 64000, Context: 128000

  - model_name: gemini-3-pro-preview
    litellm_params:
      model: github_copilot/gemini-3-pro-preview
      extra_headers: *copilot_headers
    # Gemini 3 Pro Preview (Google) - Max tokens: 64000, Context: 128000

  - model_name: gemini-3-flash-preview
    litellm_params:
      model: github_copilot/gemini-3-flash-preview
      extra_headers: *copilot_headers
    # Gemini 3 Flash Preview (Google) - Max tokens: 64000, Context: 128000

  - model_name: gemini-2-5-pro
    litellm_params:
      model: github_copilot/gemini-2.5-pro
      extra_headers: *copilot_headers
    # Gemini 2.5 Pro (Google) - Max tokens: 64000, Context: 128000

  - model_name: gpt-5-2-codex
    litellm_params:
      model: github_copilot/gpt-5.2-codex
      extra_headers: *copilot_headers
    # GPT-5.2-Codex (OpenAI) - Max tokens: 128000, Context: 400000

  - model_name: gpt-5-3-codex
    litellm_params:
      model: github_copilot/gpt-5.3-codex
      extra_headers: *copilot_headers
    # GPT-5.3-Codex (OpenAI) - Max tokens: 128000, Context: 400000

  - model_name: gpt-5-1
    litellm_params:
      model: github_copilot/gpt-5.1
      extra_headers: *copilot_headers
    # GPT-5.1 (OpenAI) - Max tokens: 64000, Context: 264000

  - model_name: gpt-5-1-codex
    litellm_params:
      model: github_copilot/gpt-5.1-codex
      extra_headers: *copilot_headers
    # GPT-5.1-Codex (OpenAI) - Max tokens: 128000, Context: 400000

  - model_name: gpt-5-1-codex-mini
    litellm_params:
      model: github_copilot/gpt-5.1-codex-mini
      extra_headers: *copilot_headers
    # GPT-5.1-Codex-Mini (OpenAI) - Max tokens: 128000, Context: 400000

  - model_name: gpt-5-1-codex-max
    litellm_params:
      model: github_copilot/gpt-5.1-codex-max
      extra_headers: *copilot_headers
    # GPT-5.1-Codex-Max (OpenAI) - Max tokens: 128000, Context: 400000

  - model_name: gpt-4o
    litellm_params:
      model: github_copilot/gpt-4o
      extra_headers: *copilot_headers
    # GPT-4o (Azure OpenAI) - Max tokens: 4096, Context: 128000

  - model_name: gpt-4o-mini
    litellm_params:
      model: github_copilot/gpt-4o-mini
      extra_headers: *copilot_headers
    # GPT-4o mini (Azure OpenAI) - Max tokens: 4096, Context: 128000

  - model_name: gpt-4o-mini-2024-07-18
    litellm_params:
      model: github_copilot/gpt-4o-mini-2024-07-18
      extra_headers: *copilot_headers
    # GPT-4o mini (Azure OpenAI) - Max tokens: 4096, Context: 128000

  - model_name: gpt-4o-2024-11-20
    litellm_params:
      model: github_copilot/gpt-4o-2024-11-20
      extra_headers: *copilot_headers
    # GPT-4o (Azure OpenAI) - Max tokens: 16384, Context: 128000

  - model_name: gpt-4o-2024-08-06
    litellm_params:
      model: github_copilot/gpt-4o-2024-08-06
      extra_headers: *copilot_headers
    # GPT-4o (Azure OpenAI) - Max tokens: 16384, Context: 128000

  - model_name: gpt-4o-2024-05-13
    litellm_params:
      model: github_copilot/gpt-4o-2024-05-13
      extra_headers: *copilot_headers
    # GPT-4o (Azure OpenAI) - Max tokens: 4096, Context: 128000

  - model_name: gpt-4-o-preview
    litellm_params:
      model: github_copilot/gpt-4-o-preview
      extra_headers: *copilot_headers
    # GPT-4o Preview (Azure OpenAI) - Max tokens: 4096, Context: 128000

  - model_name: gpt-4
    litellm_params:
      model: github_copilot/gpt-4
      extra_headers: *copilot_headers
    # GPT-4 (Azure OpenAI) - Max tokens: 4096, Context: 128000
  - model_name: gpt-4-1
    litellm_params:
      model: github_copilot/gpt-4.1
      extra_headers: *copilot_headers
    # GPT-4.1 (Azure OpenAI) - Max tokens: 16384, Context: 128000

  - model_name: gpt-4-1-2025-04-14
    litellm_params:
      model: github_copilot/gpt-4.1-2025-04-14
      extra_headers: *copilot_headers
    # GPT-4.1 (Azure OpenAI) - Max tokens: 16384, Context: 128000

  - model_name: gpt-4-0613
    litellm_params:
      model: github_copilot/gpt-4-0613
      extra_headers: *copilot_headers
    # GPT 4 (Azure OpenAI) - Max tokens: 4096, Context: 32768

  - model_name: gpt-3-5-turbo
    litellm_params:
      model: github_copilot/gpt-3.5-turbo
      extra_headers: *copilot_headers
    # GPT 3.5 Turbo (Azure OpenAI) - Max tokens: 4096, Context: 16384

  - model_name: gpt-3-5-turbo-0613
    litellm_params:
      model: github_copilot/gpt-3.5-turbo-0613
      extra_headers: *copilot_headers
    # GPT 3.5 Turbo (Azure OpenAI) - Max tokens: 4096, Context: 16384

Claude setting:

{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "env": {
    "ANTHROPIC_AUTH_TOKEN": "<my_lite_llm_api>",
    "ANTHROPIC_BASE_URL": "http://localhost:4444",
    "ANTHROPIC_MODEL": "claude-sonnet-4-5",
    "ANTHROPIC_SMALL_FAST_MODEL": "gpt-4"
  },
  "model": "claude-sonnet-4-5"
}
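One quick sanity check on a setup like this is that every model name Claude Code is told to use actually exists as a `model_name` on the proxy. The sketch below is hypothetical and not part of either tool: `unrouted_models` is an illustrative helper, and `PROXY_MODELS` is an abbreviated copy of the names from the configuration above.

```python
import json
from typing import List, Set

CLAUDE_SETTINGS = """
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:4444",
    "ANTHROPIC_MODEL": "claude-sonnet-4-5",
    "ANTHROPIC_SMALL_FAST_MODEL": "gpt-4"
  },
  "model": "claude-sonnet-4-5"
}
"""

# Abbreviated: a few of the model_name values from the proxy config.
PROXY_MODELS: Set[str] = {"claude-sonnet-4", "claude-sonnet-4-5", "claude-haiku-4-5", "gpt-4"}


def unrouted_models(settings_text: str, proxy_models: Set[str]) -> List[str]:
    """Return model names referenced by the settings that the proxy does not define."""
    settings = json.loads(settings_text)
    env = settings.get("env", {})
    wanted = {
        env[key]
        for key in ("ANTHROPIC_MODEL", "ANTHROPIC_SMALL_FAST_MODEL")
        if key in env
    }
    if "model" in settings:
        wanted.add(settings["model"])
    return sorted(wanted - proxy_models)


print(unrouted_models(CLAUDE_SETTINGS, PROXY_MODELS))  # []
```

For the settings in this report the result is empty, which suggests the Bad Request is not caused by a missing model route.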

Starting litellm: litellm --config copilot-config.yaml --port 4444

Steps to Reproduce

1. Use Claude Code 2.1.69
2. Set up litellm in proxy mode
3. Configure Claude Code to use the local proxy
4. Launch Claude Code and ask something like "what does this code work for?"
5. Check the error and log in litellm

Relevant log output

Starting LiteLLM proxy...
INFO:     Started server process [86614]
INFO:     Waiting for application startup.

   ██╗     ██╗████████╗███████╗██╗     ██╗     ███╗   ███╗
   ██║     ██║╚══██╔══╝██╔════╝██║     ██║     ████╗ ████║
   ██║     ██║   ██║   █████╗  ██║     ██║     ██╔████╔██║
   ██║     ██║   ██║   ██╔══╝  ██║     ██║     ██║╚██╔╝██║
   ███████╗██║   ██║   ███████╗███████╗███████╗██║ ╚═╝ ██║
   ╚══════╝╚═╝   ╚═╝   ╚══════╝╚══════╝╚══════╝╚═╝     ╚═╝


#------------------------------------------------------------#
#                                                            #
#           'It would help me if you could add...'            #
#        https://github.com/BerriAI/litellm/issues/new        #
#                                                            #
#------------------------------------------------------------#

 Thank you for using LiteLLM! - Krrish & Ishaan



Give Feedback / Get Help: https://github.com/BerriAI/litellm/issues/new


LiteLLM: Proxy initialized with Config, Set models:
    claude-3-5-sonnet-20241022
    claude-sonnet-4
    claude-opus-4
    claude-opus-4-6
    claude-sonnet-4-6
    claude-sonnet-4-5
    claude-opus-4-5
    claude-haiku-4-5
    gemini-3-1-pro-preview
    gemini-3-pro-preview
    gemini-3-flash-preview
    gemini-2-5-pro
    gpt-5-2-codex
    gpt-5-3-codex
    gpt-5-1
    gpt-5-1-codex
    gpt-5-1-codex-mini
    gpt-5-1-codex-max
    gpt-4o
    gpt-4o-mini
    gpt-4o-mini-2024-07-18
    gpt-4o-2024-11-20
    gpt-4o-2024-08-06
    gpt-4o-2024-05-13
    gpt-4-o-preview
    gpt-4
    gpt-4-1
    gpt-4-1-2025-04-14
    gpt-4-0613
    gpt-3-5-turbo
    gpt-3-5-turbo-0613
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:4444 (Press CTRL+C to quit)
14:34:29 - LiteLLM:WARNING: utils.py:761 - `litellm.set_verbose` is deprecated. Please set `os.environ['LITELLM_LOG'] = 'DEBUG'` for debug logs.
Initialized litellm callbacks, Async Success Callbacks: [<bound method Router.deployment_callback_on_success of <litellm.router.Router object at 0x11dc91940>>, <litellm.proxy.hooks.litellm_skills.main.SkillsInjectionHook object at 0x117040ec0>, <litellm.proxy.hooks.model_max_budget_limiter._PROXY_VirtualKeyModelMaxBudgetLimiter object at 0x11c3c3380>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x11dcdc7d0>, <litellm.proxy.hooks.parallel_request_limiter_v3._PROXY_MaxParallelRequestsHandler_v3 object at 0x11dc927b0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x11dcdc910>, <litellm.proxy.hooks.responses_id_security.ResponsesIDSecurity object at 0x11dc92900>, <litellm._service_logger.ServiceLogging object at 0x11c484050>]
Final returned optional params: {'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 32000, 'tools': [{'type': 'function', 'function': {'name': 'ToolSearch', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'query': {'description': 'Query to find deferred tools. Use "select:<tool_name>" for direct selection, or keywords to search.', 'type': 'string'}, 'max_results': {'description': 'Maximum number of results to return (default: 5)', 'default': 5, 'type': 'number'}}, 'required': ['query', 'max_results'], 'additionalProperties': False}, 'description': 'Search for or select deferred tools to make them available for use.\n\n**MANDATORY PREREQUISITE - THIS IS A HARD REQUIREMENT**\n\nYou MUST use this tool to load deferred tools BEFORE calling them directly.\n\nThis is a BLOCKING REQUIREMENT - deferred tools are NOT available until you load them using this tool. Look for <available-deferred-tools> messages in the conversation for the list of tools you can discover. Both query modes (keyword search and direct selection) load the returned tools — once a tool appears in the results, it is immediately available to call.\n\n**Why this is non-negotiable:**\n- Deferred tools are not loaded until discovered via this tool\n- Calling a deferred tool without first loading it will fail\n\n**Query modes:**\n\n1. **Keyword search** - Use keywords when you\'re unsure which tool to use or need to discover multiple tools at once:\n   - "list directory" - find tools for listing directories\n   - "notebook jupyter" - find notebook editing tools\n   - "slack message" - find slack messaging tools\n   - Returns up to 5 matching tools ranked by relevance\n   - All returned tools are immediately available to call — no further selection step needed\n\n2. 
**Direct selection** - Use `select:<tool_name>` when you know the exact tool name:\n   - "select:mcp__slack__read_channel"\n   - "select:NotebookEdit"\n   - "select:Read,Edit,Grep" - load multiple tools at once with comma separation\n   - Returns the named tool(s) if they exist\n\n**IMPORTANT:** Both modes load tools equally. Do NOT follow up a keyword search with `select:` calls for tools already returned — they are already loaded.\n\n3. **Required keyword** - Prefix with `+` to require a match:\n   - "+linear create issue" - only tools from "linear", ranked by "create"/"issue"\n   - "+slack send" - only "slack" tools, ranked by "send"\n   - Useful when you know the service name but not the exact tool\n\n**CORRECT Usage Patterns:**\n\n<example>\nUser: I need to work with slack somehow\nAssistant: Let me search for slack tools.\n[Calls ToolSearch with query: "slack"]\nAssistant: Found several options including mcp__slack__read_channel.\n[Calls mcp__slack__read_channel directly — it was loaded by the keyword search]\n</example>\n\n<example>\nUser: Edit the Jupyter notebook\nAssistant: Let me load the notebook editing tool.\n[Calls ToolSearch with query: "select:NotebookEdit"]\n[Calls NotebookEdit]\n</example>\n\n<example>\nUser: List files in the src directory\nAssistant: I can see mcp__filesystem__list_directory in the available tools. Let me select it.\n[Calls ToolSearch with query: "select:mcp__filesystem__list_directory"]\n[Calls the tool]\n</example>\n\n**INCORRECT Usage Patterns - NEVER DO THESE:**\n\n<bad-example>\nUser: Read my slack messages\nAssistant: [Directly calls mcp__slack__read_channel without loading it first]\nWRONG - You must load the tool FIRST using this tool\n</bad-example>\n\n<bad-example>\nAssistant: [Calls ToolSearch with query: "slack", gets back mcp__slack__read_channel]\nAssistant: [Calls ToolSearch with query: "select:mcp__slack__read_channel"]\nWRONG - The keyword search already loaded the tool. 
The select call is redundant.\n</bad-example>'}}], 'max_retries': 0, 'extra_body': {}}
INFO:     127.0.0.1:61135 - "POST /v1/messages?beta=true HTTP/1.1" 200 OK
Logging Details LiteLLM-Async Success Call, cache_hit=False
Async success callbacks: Got a complete streaming response
Logging Details LiteLLM-Async Success Call, cache_hit=False
14:34:34 - LiteLLM:WARNING: utils.py:761 - `litellm.set_verbose` is deprecated. Please set `os.environ['LITELLM_LOG'] = 'DEBUG'` for debug logs.
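Note on the warning above: the replacement it names for `litellm.set_verbose` is an environment variable. A minimal sketch of what the message asks for. Setting the variable before `litellm` is imported is an assumption about ordering, not something this log confirms.

```python
import os

# Taken from the warning text: enable debug logs through the environment
# instead of the deprecated litellm.set_verbose flag.
os.environ["LITELLM_LOG"] = "DEBUG"

print(os.environ["LITELLM_LOG"])
```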
Initialized litellm callbacks, Async Success Callbacks: [<bound method Router.deployment_callback_on_success of <litellm.router.Router object at 0x11dc91940>>, <litellm.proxy.hooks.litellm_skills.main.SkillsInjectionHook object at 0x117040ec0>, <litellm.proxy.hooks.model_max_budget_limiter._PROXY_VirtualKeyModelMaxBudgetLimiter object at 0x11c3c3380>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x11dcdc7d0>, <litellm.proxy.hooks.parallel_request_limiter_v3._PROXY_MaxParallelRequestsHandler_v3 object at 0x11dc927b0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x11dcdc910>, <litellm.proxy.hooks.responses_id_security.ResponsesIDSecurity object at 0x11dc92900>, <litellm._service_logger.ServiceLogging object at 0x11c484050>]
Final returned optional params: {'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 32000, 'tools': [{'type': 'function', 'function': {'name': 'Read', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'file_path': {'description': 'The absolute path to the file to read', 'type': 'string'}, 'offset': {'description': 'The line number to start reading from. Only provide if the file is too large to read at once', 'type': 'number'}, 'limit': {'description': 'The number of lines to read. Only provide if the file is too large to read at once.', 'type': 'number'}, 'pages': {'description': 'Page range for PDF files (e.g., "1-5", "3", "10-20"). Only applicable to PDF files. Maximum 20 pages per request.', 'type': 'string'}}, 'required': ['file_path'], 'additionalProperties': False, 'defer_loading': True}, 'description': 'Reads a file from the local filesystem. You can access any file directly by using this tool.\nAssume this tool is able to read all files on the machine. If the User provides a path to a file assume that path is valid. It is okay to read a file that does not exist; an error will be returned.\n\nUsage:\n- The file_path parameter must be an absolute path, not a relative path\n- By default, it reads up to 2000 lines starting from the beginning of the file\n- You can optionally specify a line offset and limit (especially handy for long files), but it\'s recommended to read the whole file by not providing these parameters\n- Any lines longer than 2000 characters will be truncated\n- Results are returned using cat -n format, with line numbers starting at 1\n- This tool allows Claude Code to read images (eg PNG, JPG, etc). When reading an image file the contents are presented visually as Claude Code is a multimodal LLM.\n- This tool can read PDF files (.pdf). For large PDFs (more than 10 pages), you MUST provide the pages parameter to read specific page ranges (e.g., pages: "1-5"). 
Reading a large PDF without the pages parameter will fail. Maximum 20 pages per request.\n- This tool can read Jupyter notebooks (.ipynb files) and returns all cells with their outputs, combining code, text, and visualizations.\n- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.\n- You can call multiple tools in a single response. It is always better to speculatively read multiple potentially useful files in parallel.\n- You will regularly be asked to read screenshots. If the user provides a path to a screenshot, ALWAYS use this tool to view the file at the path. This tool will work with all temporary file paths.\n- If you read a file that exists but has empty contents you will receive a system reminder warning in place of file contents.'}}, {'type': 'function', 'function': {'name': 'ToolSearch', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'query': {'description': 'Query to find deferred tools. Use "select:<tool_name>" for direct selection, or keywords to search.', 'type': 'string'}, 'max_results': {'description': 'Maximum number of results to return (default: 5)', 'default': 5, 'type': 'number'}}, 'required': ['query', 'max_results'], 'additionalProperties': False}, 'description': 'Search for or select deferred tools to make them available for use.\n\n**MANDATORY PREREQUISITE - THIS IS A HARD REQUIREMENT**\n\nYou MUST use this tool to load deferred tools BEFORE calling them directly.\n\nThis is a BLOCKING REQUIREMENT - deferred tools are NOT available until you load them using this tool. Look for <available-deferred-tools> messages in the conversation for the list of tools you can discover. 
Both query modes (keyword search and direct selection) load the returned tools — once a tool appears in the results, it is immediately available to call.\n\n**Why this is non-negotiable:**\n- Deferred tools are not loaded until discovered via this tool\n- Calling a deferred tool without first loading it will fail\n\n**Query modes:**\n\n1. **Keyword search** - Use keywords when you\'re unsure which tool to use or need to discover multiple tools at once:\n   - "list directory" - find tools for listing directories\n   - "notebook jupyter" - find notebook editing tools\n   - "slack message" - find slack messaging tools\n   - Returns up to 5 matching tools ranked by relevance\n   - All returned tools are immediately available to call — no further selection step needed\n\n2. **Direct selection** - Use `select:<tool_name>` when you know the exact tool name:\n   - "select:mcp__slack__read_channel"\n   - "select:NotebookEdit"\n   - "select:Read,Edit,Grep" - load multiple tools at once with comma separation\n   - Returns the named tool(s) if they exist\n\n**IMPORTANT:** Both modes load tools equally. Do NOT follow up a keyword search with `select:` calls for tools already returned — they are already loaded.\n\n3. 
**Required keyword** - Prefix with `+` to require a match:\n   - "+linear create issue" - only tools from "linear", ranked by "create"/"issue"\n   - "+slack send" - only "slack" tools, ranked by "send"\n   - Useful when you know the service name but not the exact tool\n\n**CORRECT Usage Patterns:**\n\n<example>\nUser: I need to work with slack somehow\nAssistant: Let me search for slack tools.\n[Calls ToolSearch with query: "slack"]\nAssistant: Found several options including mcp__slack__read_channel.\n[Calls mcp__slack__read_channel directly — it was loaded by the keyword search]\n</example>\n\n<example>\nUser: Edit the Jupyter notebook\nAssistant: Let me load the notebook editing tool.\n[Calls ToolSearch with query: "select:NotebookEdit"]\n[Calls NotebookEdit]\n</example>\n\n<example>\nUser: List files in the src directory\nAssistant: I can see mcp__filesystem__list_directory in the available tools. Let me select it.\n[Calls ToolSearch with query: "select:mcp__filesystem__list_directory"]\n[Calls the tool]\n</example>\n\n**INCORRECT Usage Patterns - NEVER DO THESE:**\n\n<bad-example>\nUser: Read my slack messages\nAssistant: [Directly calls mcp__slack__read_channel without loading it first]\nWRONG - You must load the tool FIRST using this tool\n</bad-example>\n\n<bad-example>\nAssistant: [Calls ToolSearch with query: "slack", gets back mcp__slack__read_channel]\nAssistant: [Calls ToolSearch with query: "select:mcp__slack__read_channel"]\nWRONG - The keyword search already loaded the tool. The select call is redundant.\n</bad-example>'}}], 'max_retries': 0, 'extra_body': {}}
INFO:     127.0.0.1:61135 - "POST /v1/messages?beta=true HTTP/1.1" 200 OK
Logging Details LiteLLM-Async Success Call, cache_hit=False
Async success callbacks: Got a complete streaming response
Logging Details LiteLLM-Async Success Call, cache_hit=False
14:34:41 - LiteLLM:WARNING: utils.py:761 - `litellm.set_verbose` is deprecated. Please set `os.environ['LITELLM_LOG'] = 'DEBUG'` for debug logs.
Initialized litellm callbacks, Async Success Callbacks: [<bound method Router.deployment_callback_on_success of <litellm.router.Router object at 0x11dc91940>>, <litellm.proxy.hooks.litellm_skills.main.SkillsInjectionHook object at 0x117040ec0>, <litellm.proxy.hooks.model_max_budget_limiter._PROXY_VirtualKeyModelMaxBudgetLimiter object at 0x11c3c3380>, <litellm.proxy.hooks.max_budget_limiter._PROXY_MaxBudgetLimiter object at 0x11dcdc7d0>, <litellm.proxy.hooks.parallel_request_limiter_v3._PROXY_MaxParallelRequestsHandler_v3 object at 0x11dc927b0>, <litellm.proxy.hooks.cache_control_check._PROXY_CacheControlCheck object at 0x11dcdc910>, <litellm.proxy.hooks.responses_id_security.ResponsesIDSecurity object at 0x11dc92900>, <litellm._service_logger.ServiceLogging object at 0x11c484050>]
Final returned optional params: {'stream': True, 'stream_options': {'include_usage': True}, 'max_tokens': 32000, 'tools': [{'type': 'function', 'function': {'name': 'Read', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'file_path': {'description': 'The absolute path to the file to read', 'type': 'string'}, 'offset': {'description': 'The line number to start reading from. Only provide if the file is too large to read at once', 'type': 'number'}, 'limit': {'description': 'The number of lines to read. Only provide if the file is too large to read at once.', 'type': 'number'}, 'pages': {'description': 'Page range for PDF files (e.g., "1-5", "3", "10-20"). Only applicable to PDF files. Maximum 20 pages per request.', 'type': 'string'}}, 'required': ['file_path'], 'additionalProperties': False, 'defer_loading': True}, 'description': 'Reads a file from the local filesystem. You can access any file directly by using this tool.\nAssume this tool is able to read all files on the machine. If the User provides a path to a file assume that path is valid. It is okay to read a file that does not exist; an error will be returned.\n\nUsage:\n- The file_path parameter must be an absolute path, not a relative path\n- By default, it reads up to 2000 lines starting from the beginning of the file\n- You can optionally specify a line offset and limit (especially handy for long files), but it\'s recommended to read the whole file by not providing these parameters\n- Any lines longer than 2000 characters will be truncated\n- Results are returned using cat -n format, with line numbers starting at 1\n- This tool allows Claude Code to read images (eg PNG, JPG, etc). When reading an image file the contents are presented visually as Claude Code is a multimodal LLM.\n- This tool can read PDF files (.pdf). For large PDFs (more than 10 pages), you MUST provide the pages parameter to read specific page ranges (e.g., pages: "1-5"). 
Reading a large PDF without the pages parameter will fail. Maximum 20 pages per request.\n- This tool can read Jupyter notebooks (.ipynb files) and returns all cells with their outputs, combining code, text, and visualizations.\n- This tool can only read files, not directories. To read a directory, use an ls command via the Bash tool.\n- You can call multiple tools in a single response. It is always better to speculatively read multiple potentially useful files in parallel.\n- You will regularly be asked to read screenshots. If the user provides a path to a screenshot, ALWAYS use this tool to view the file at the path. This tool will work with all temporary file paths.\n- If you read a file that exists but has empty contents you will receive a system reminder warning in place of file contents.'}}, {'type': 'function', 'function': {'name': 'ToolSearch', 'parameters': {'$schema': 'https://json-schema.org/draft/2020-12/schema', 'type': 'object', 'properties': {'query': {'description': 'Query to find deferred tools. Use "select:<tool_name>" for direct selection, or keywords to search.', 'type': 'string'}, 'max_results': {'description': 'Maximum number of results to return (default: 5)', 'default': 5, 'type': 'number'}}, 'required': ['query', 'max_results'], 'additionalProperties': False}, 'description': 'Search for or select deferred tools to make them available for use.\n\n**MANDATORY PREREQUISITE - THIS IS A HARD REQUIREMENT**\n\nYou MUST use this tool to load deferred tools BEFORE calling them directly.\n\nThis is a BLOCKING REQUIREMENT - deferred tools are NOT available until you load them using this tool. Look for <available-deferred-tools> messages in the conversation for the list of tools you can discover. 
Both query modes (keyword search and direct selection) load the returned tools — once a tool appears in the results, it is immediately available to call.\n\n**Why this is non-negotiable:**\n- Deferred tools are not loaded until discovered via this tool\n- Calling a deferred tool without first loading it will fail\n\n**Query modes:**\n\n1. **Keyword search** - Use keywords when you\'re unsure which tool to use or need to discover multiple tools at once:\n   - "list directory" - find tools for listing directories\n   - "notebook jupyter" - find notebook editing tools\n   - "slack message" - find slack messaging tools\n   - Returns up to 5 matching tools ranked by relevance\n   - All returned tools are immediately available to call — no further selection step needed\n\n2. **Direct selection** - Use `select:<tool_name>` when you know the exact tool name:\n   - "select:mcp__slack__read_channel"\n   - "select:NotebookEdit"\n   - "select:Read,Edit,Grep" - load multiple tools at once with comma separation\n   - Returns the named tool(s) if they exist\n\n**IMPORTANT:** Both modes load tools equally. Do NOT follow up a keyword search with `select:` calls for tools already returned — they are already loaded.\n\n3. 
**Required keyword** - Prefix with `+` to require a match:\n   - "+linear create issue" - only tools from "linear", ranked by "create"/"issue"\n   - "+slack send" - only "slack" tools, ranked by "send"\n   - Useful when you know the service name but not the exact tool\n\n**CORRECT Usage Patterns:**\n\n<example>\nUser: I need to work with slack somehow\nAssistant: Let me search for slack tools.\n[Calls ToolSearch with query: "slack"]\nAssistant: Found several options including mcp__slack__read_channel.\n[Calls mcp__slack__read_channel directly — it was loaded by the keyword search]\n</example>\n\n<example>\nUser: Edit the Jupyter notebook\nAssistant: Let me load the notebook editing tool.\n[Calls ToolSearch with query: "select:NotebookEdit"]\n[Calls NotebookEdit]\n</example>\n\n<example>\nUser: List files in the src directory\nAssistant: I can see mcp__filesystem__list_directory in the available tools. Let me select it.\n[Calls ToolSearch with query: "select:mcp__filesystem__list_directory"]\n[Calls the tool]\n</example>\n\n**INCORRECT Usage Patterns - NEVER DO THESE:**\n\n<bad-example>\nUser: Read my slack messages\nAssistant: [Directly calls mcp__slack__read_channel without loading it first]\nWRONG - You must load the tool FIRST using this tool\n</bad-example>\n\n<bad-example>\nAssistant: [Calls ToolSearch with query: "slack", gets back mcp__slack__read_channel]\nAssistant: [Calls ToolSearch with query: "select:mcp__slack__read_channel"]\nWRONG - The keyword search already loaded the tool. The select call is redundant.\n</bad-example>'}}], 'max_retries': 0, 'extra_body': {}}
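Note on the payload above: the `Read` tool's `parameters` object carries a `defer_loading` key alongside the usual JSON Schema keywords, while `ToolSearch` does not. The log does not say which field the upstream endpoint rejected, so the following is only a diagnostic sketch for spotting such keys. `KNOWN_SCHEMA_KEYS` and `unexpected_parameter_keys` are hypothetical names introduced here, not part of litellm, and the allow-list is deliberately small rather than a complete list of JSON Schema keywords.

```python
# Hypothetical diagnostic helper; not part of litellm.
# A small allow-list of JSON Schema keywords expected at the top level of a
# tool's `parameters` object. Extend it as needed.
KNOWN_SCHEMA_KEYS = {
    "$schema",
    "type",
    "properties",
    "required",
    "additionalProperties",
    "description",
    "title",
}


def unexpected_parameter_keys(tools):
    """Map each tool name to its top-level `parameters` keys outside the allow-list."""
    report = {}
    for tool in tools:
        function = tool.get("function") or {}
        params = function.get("parameters") or {}
        extra = sorted(set(params) - KNOWN_SCHEMA_KEYS)
        if extra:
            report[function.get("name", "<unnamed>")] = extra
    return report


# Trimmed-down copies of the two tool definitions from the logged payload.
tools = [
    {
        "type": "function",
        "function": {
            "name": "Read",
            "parameters": {
                "type": "object",
                "properties": {"file_path": {"type": "string"}},
                "required": ["file_path"],
                "additionalProperties": False,
                "defer_loading": True,
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "ToolSearch",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query", "max_results"],
                "additionalProperties": False,
            },
        },
    },
]

report = unexpected_parameter_keys(tools)
print(report)  # {'Read': ['defer_loading']}
```

A report like this only narrows down candidates. Confirming the cause would still require the upstream error body, which this log does not include.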
Custom Logger Error - Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1106, in async_streaming
    headers, response = await self.make_openai_chat_completion_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 460, in make_openai_chat_completion_request
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 437, in make_openai_chat_completion_request
    await openai_aclient.chat.completions.with_raw_response.create(
        **data, timeout=timeout
    )
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_legacy_response.py", line 384, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/resources/chat/completions/completions.py", line 2700, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
    ...<49 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1884, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1669, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 613, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1156, in async_streaming
    raise OpenAIError(
    ...<4 lines>...
    )
litellm.llms.openai.common_utils.OpenAIError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 632, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2398, in exception_type
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 507, in exception_type
    raise BadRequestError(
    ...<5 lines>...
    )
litellm.exceptions.BadRequestError: litellm.BadRequestError: Github_copilotException - Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/integrations/custom_logger.py", line 512, in async_log_event
    await callback_func(
    ...<4 lines>...
    )
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5813, in async_deployment_callback_on_failure
    deployment_name = kwargs["litellm_params"]["metadata"].get(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
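Note on reading the traceback above: each "During handling of the above exception, another exception occurred" separator marks Python's implicit exception chaining, so the final `AttributeError` comes from the failure callback and the original failure is the `Bad Request` at the top. A small sketch of walking such a chain programmatically. `exception_chain` is a hypothetical helper, and the exception types in the demo are stand-ins for the ones in the log.

```python
def exception_chain(exc):
    """Return exc followed by the exceptions it was raised while handling, newest first."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # An explicit `raise ... from` cause wins; otherwise follow the implicit context.
        exc = exc.__cause__ or exc.__context__
    return chain


# Demo that mimics the shape of the log: a request error, a wrapper error,
# then an unrelated AttributeError raised while handling the wrapper.
try:
    try:
        try:
            raise ValueError("Bad Request")
        except ValueError:
            raise RuntimeError("wrapped: Bad Request")
    except RuntimeError:
        metadata = None
        metadata.get("example_key")  # raises AttributeError, as in the log
except AttributeError as err:
    names = [type(e).__name__ for e in exception_chain(err)]
    root_message = str(exception_chain(err)[-1])

print(names)         # ['AttributeError', 'RuntimeError', 'ValueError']
print(root_message)  # Bad Request
```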

Custom Logger Error - Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1106, in async_streaming
    headers, response = await self.make_openai_chat_completion_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 460, in make_openai_chat_completion_request
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 437, in make_openai_chat_completion_request
    await openai_aclient.chat.completions.with_raw_response.create(
        **data, timeout=timeout
    )
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_legacy_response.py", line 384, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/resources/chat/completions/completions.py", line 2700, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
    ...<49 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1884, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1669, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 613, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1156, in async_streaming
    raise OpenAIError(
    ...<4 lines>...
    )
litellm.llms.openai.common_utils.OpenAIError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/anthropic/experimental_pass_through/messages/handler.py", line 187, in anthropic_messages
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/anthropic/experimental_pass_through/adapters/handler.py", line 233, in async_anthropic_messages_handler
    completion_response = await litellm.acompletion(**completion_kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 2041, in wrapper_async
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 632, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2398, in exception_type
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 507, in exception_type
    raise BadRequestError(
    ...<5 lines>...
    )
litellm.exceptions.BadRequestError: litellm.BadRequestError: Github_copilotException - Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/integrations/custom_logger.py", line 512, in async_log_event
    await callback_func(
    ...<4 lines>...
    )
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5813, in async_deployment_callback_on_failure
    deployment_name = kwargs["litellm_params"]["metadata"].get(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'get'
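Note on the `AttributeError` above: the failing expression is `kwargs["litellm_params"]["metadata"].get(`, which raises when `metadata` is present but set to `None`. A minimal sketch of a None-tolerant lookup follows. It illustrates the pattern only and is not the change made in litellm. `get_metadata` and the key `example_key` are hypothetical, since the traceback truncates the real key.

```python
def get_metadata(kwargs):
    """Return the metadata dict from callback kwargs, or {} if any level is missing or None."""
    litellm_params = kwargs.get("litellm_params") or {}
    return litellm_params.get("metadata") or {}


# The situation implied by the traceback: the key exists but holds None.
failing_kwargs = {"litellm_params": {"metadata": None}}
value_when_none = get_metadata(failing_kwargs).get("example_key")

# The normal situation: metadata is a dict.
ok_kwargs = {"litellm_params": {"metadata": {"example_key": "value"}}}
value_when_present = get_metadata(ok_kwargs).get("example_key")

print(value_when_none)     # None
print(value_when_present)  # value
```

Using `or {}` rather than a `.get(..., {})` default matters here, because a default is only applied when the key is absent, not when it maps to `None`.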

14:34:42 - LiteLLM Proxy:ERROR: endpoints.py:121 - litellm.proxy.proxy_server.anthropic_response(): Exception occured - litellm.BadRequestError: Github_copilotException - Bad Request. Received Model Group=claude-sonnet-4-5
Available Model Group Fallbacks=None
Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1106, in async_streaming
    headers, response = await self.make_openai_chat_completion_request(
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<4 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/logging_utils.py", line 297, in async_wrapper
    result = await func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 460, in make_openai_chat_completion_request
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 437, in make_openai_chat_completion_request
    await openai_aclient.chat.completions.with_raw_response.create(
        **data, timeout=timeout
    )
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_legacy_response.py", line 384, in wrapped
    return cast(LegacyAPIResponse[R], await func(*args, **kwargs))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/resources/chat/completions/completions.py", line 2700, in create
    return await self._post(
           ^^^^^^^^^^^^^^^^^
    ...<49 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1884, in post
    return await self.request(cast_to, opts, stream=stream, stream_cls=stream_cls)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/openai/_base_client.py", line 1669, in request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 613, in acompletion
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/openai/openai.py", line 1156, in async_streaming
    raise OpenAIError(
    ...<4 lines>...
    )
litellm.llms.openai.common_utils.OpenAIError: Bad Request

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/proxy/anthropic_endpoints/endpoints.py", line 53, in anthropic_response
    result = await base_llm_response_processor.base_process_llm_request(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<16 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/proxy/common_request_processing.py", line 886, in base_process_llm_request
    responses = await llm_responses
                ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 4709, in async_wrapper
    return await self._ageneric_api_call_with_fallbacks(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<2 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 3458, in _ageneric_api_call_with_fallbacks
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 3445, in _ageneric_api_call_with_fallbacks
    response = await self.async_function_with_fallbacks(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5111, in async_function_with_fallbacks
    return await self.async_function_with_fallbacks_common_utils(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5068, in async_function_with_fallbacks_common_utils
    raise original_exception
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5102, in async_function_with_fallbacks
    response = await self.async_function_with_retries(*args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5257, in async_function_with_retries
    self.should_retry_this_error(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        error=e,
        ^^^^^^^^
    ...<4 lines>...
        content_policy_fallbacks=content_policy_fallbacks,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5439, in should_retry_this_error
    raise error
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5208, in async_function_with_retries
    response = await self.make_call(original_function, *args, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 5354, in make_call
    response = await response
               ^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 3570, in _ageneric_api_call_with_fallbacks_helper
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/router.py", line 3556, in _ageneric_api_call_with_fallbacks_helper
    response = await response  # type: ignore
               ^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 2041, in wrapper_async
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/anthropic/experimental_pass_through/messages/handler.py", line 187, in anthropic_messages
    response = await init_response
               ^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/llms/anthropic/experimental_pass_through/adapters/handler.py", line 233, in async_anthropic_messages_handler
    completion_response = await litellm.acompletion(**completion_kwargs)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 2041, in wrapper_async
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/utils.py", line 1862, in wrapper_async
    result = await original_function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/main.py", line 632, in acompletion
    raise exception_type(
          ~~~~~~~~~~~~~~^
        model=model,
        ^^^^^^^^^^^^
    ...<3 lines>...
        extra_kwargs=kwargs,
        ^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 2398, in exception_type
    raise e
  File "/Users/<username>/claude-code-over-github-copilot/venv/lib/python3.14/site-packages/litellm/litellm_core_utils/exception_mapping_utils.py", line 507, in exception_type
    raise BadRequestError(
    ...<5 lines>...
    )
litellm.exceptions.BadRequestError: litellm.BadRequestError: Github_copilotException - Bad Request. Received Model Group=claude-sonnet-4-5
Available Model Group Fallbacks=None
INFO:     127.0.0.1:61135 - "POST /v1/messages?beta=true HTTP/1.1" 400 Bad Request

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

1.82.0

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To resolve the BadRequestError issue with LiteLLM, follow these steps:

Check Model Configuration: Ensure that the model configuration in copilot-config.yaml is correct, especially the model parameter under litellm_params. Verify that the model name matches the one used in the Claude Code settings.
Update LiteLLM Version: Consider updating LiteLLM to the latest version, as the current version (1.82.0) might have known issues or bugs that are fixed in newer releases.
Validate API Requests: Inspect the request LiteLLM sends upstream using curl or a debugger. The traceback shows the call going through litellm.acompletion to the github_copilot provider, so verify that the payload, headers, and parameters (including every entry in tools) match what that provider's chat completions endpoint accepts.
Handle Errors and Exceptions: Review the error handling and exception handling mechanisms in the LiteLLM codebase. Ensure that errors are properly caught, logged, and propagated to the user, providing informative error messages to aid in debugging.
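
The request-validation step can be partly automated. The sketch below assumes the tool list has the shape shown in the logged optional params (tools[*].function.parameters) and flags top-level keys in each tool's parameters object that are not in a small allow-list of JSON Schema keywords. The allow-list and the helper name find_unexpected_schema_keys are illustrative assumptions, not part of LiteLLM. The logged payload carries defer_loading inside parameters, which is the kind of key this check surfaces; whether that key is what the upstream rejects is not confirmed here.

```python
from typing import Any

# Assumed allow-list of top-level JSON Schema keywords; extend as needed.
ALLOWED_SCHEMA_KEYS = {
    "$schema", "type", "properties", "required",
    "additionalProperties", "description", "items", "enum",
}


def find_unexpected_schema_keys(tools: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Map tool name -> sorted top-level `parameters` keys not in the allow-list."""
    findings: dict[str, list[str]] = {}
    for tool in tools:
        function = tool.get("function", {})
        params = function.get("parameters", {})
        extra = sorted(key for key in params if key not in ALLOWED_SCHEMA_KEYS)
        if extra:
            findings[function.get("name", "<unnamed>")] = extra
    return findings


# Trimmed copy of the `Read` tool from the logged request.
tools = [{
    "type": "function",
    "function": {
        "name": "Read",
        "parameters": {
            "$schema": "https://json-schema.org/draft/2020-12/schema",
            "type": "object",
            "properties": {"file_path": {"type": "string"}},
            "required": ["file_path"],
            "additionalProperties": False,
            "defer_loading": True,
        },
    },
}]

print(find_unexpected_schema_keys(tools))  # {'Read': ['defer_loading']}
```

Running the same check against the full tools array from the debug log gives a short list of fields to compare against the provider's accepted schema.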

Example code snippet to update the copilot-config.yaml file:

model_list:
  - model_name: claude-sonnet-4-5
    litellm_params:
      model: github_copilot/claude-sonnet-4.5
      extra_headers: *copilot_headers

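In this example, model_name is the name Claude Code requests (the error reports Model Group=claude-sonnet-4-5), and model under litellm_params is the provider-prefixed model the proxy calls. The *copilot_headers alias only resolves if a matching &copilot_headers anchor is defined earlier in the same file.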
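
To make the name matching concrete, here is a minimal sketch of the lookup on an already-parsed config. It works on a plain dict so it needs no YAML library; resolve_model is a hypothetical helper for illustration and is not how LiteLLM's router is implemented.

```python
from typing import Optional

# The model_list entry from the example above, as a parsed structure.
# extra_headers is omitted because the alias target is not shown in the issue.
config = {
    "model_list": [
        {
            "model_name": "claude-sonnet-4-5",
            "litellm_params": {"model": "github_copilot/claude-sonnet-4.5"},
        }
    ]
}


def resolve_model(config: dict, requested: str) -> Optional[str]:
    """Return the upstream model for a requested name, or None if unmatched."""
    for entry in config.get("model_list", []):
        if entry.get("model_name") == requested:
            return entry.get("litellm_params", {}).get("model")
    return None


print(resolve_model(config, "claude-sonnet-4-5"))  # github_copilot/claude-sonnet-4.5
print(resolve_model(config, "claude-sonnet-4.5"))  # None: dot vs. hyphen mismatch
```

The second call shows how a one-character difference between the client setting and model_name leaves the request without a matching entry.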

Verification

To verify that the fix worked:

Restart the LiteLLM proxy server.
Send a test request to the LiteLLM proxy using curl or a similar tool. The failing call in the log was POST /v1/messages.
Monitor the LiteLLM logs for any error messages or exceptions.
Verify that the proxy returns a successful response with the expected output instead of 400 Bad Request.
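
The verification steps can be sketched with the standard library alone. The endpoint path /v1/messages and the model name come from the log above; the base URL http://localhost:4000, the placeholder key, the Bearer authorization header, and the minimal payload are assumptions to adjust for your own proxy setup.

```python
import json
import urllib.error
import urllib.request


def build_test_request(base_url: str, api_key: str, model: str) -> urllib.request.Request:
    """Build a minimal POST to the proxy's /v1/messages endpoint."""
    payload = {
        "model": model,
        "max_tokens": 64,
        "messages": [{"role": "user", "content": "Reply with the word ok."}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> tuple[int, str]:
    """Send the request and return (status, body), including for 4xx/5xx replies."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read().decode("utf-8")
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the body usually holds the proxy's error text.
        return error.code, error.read().decode("utf-8")


request = build_test_request("http://localhost:4000", "sk-placeholder", "claude-sonnet-4-5")
# status, body = send(request)  # uncomment with the proxy running
```

A 200 status with a message body indicates the basic path works; a 400 here, without any tools in the payload, would point away from the tool schema and toward configuration.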

Extra Tips

Regularly check the LiteLLM documentation and release notes for updates, bug fixes, and known issues.
Use debugging tools and log analysis to identify and diagnose issues with the LiteLLM proxy server.
Consider implementing additional error handling and logging mechanisms to improve the robustness and reliability of the LiteLLM integration.
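
For the log-analysis tip, a small sketch that pulls failed requests out of the proxy's access log. The regular expression is written against the access-log line shown in the traceback above and is an assumption about the format, which may differ in other deployments; failed_requests is a hypothetical helper name.

```python
import re

# Matches lines like: INFO:     127.0.0.1:61135 - "POST /v1/messages?beta=true HTTP/1.1" 400 Bad Request
ACCESS_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3})'
)


def failed_requests(log_lines: list[str]) -> list[tuple[str, str, int]]:
    """Return (method, path, status) for every access-log line with status >= 400."""
    failures = []
    for line in log_lines:
        match = ACCESS_LINE.search(line)
        if match and int(match.group("status")) >= 400:
            failures.append(
                (match.group("method"), match.group("path"), int(match.group("status")))
            )
    return failures


sample = [
    'INFO:     127.0.0.1:61135 - "POST /v1/messages?beta=true HTTP/1.1" 400 Bad Request',
    "Available Model Group Fallbacks=None",
]
print(failed_requests(sample))  # [('POST', '/v1/messages?beta=true', 400)]
```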
