litellm - ✅(Solved) Fix [Bug]: KeyError: 'file' in Anthropic (and Anthropic-via-Vertex) when message carries a non-OpenAI file content block (e.g. LangChain v1) [1 pull requests, 1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#26227Fetched 2026-04-23 07:24:37
View on GitHub
Comments
0
Participants
1
Timeline
2
Reactions
0
Participants
Timeline (top)
cross-referenced ×1labeled ×1

Error Message

File ".../litellm/llms/anthropic/common_utils.py", line 508, in validate_environment file_id_used = self.is_file_id_used(messages=messages) File ".../litellm/llms/anthropic/common_utils.py", line 113, in is_file_id_used file_ids = get_file_ids_from_messages(messages) File ".../litellm/litellm_core_utils/prompt_templates/common_utils.py", line 1063, in get_file_ids_from_messages file_object_file_field = file_object["file"] KeyError: 'file'

litellm.llms.vertex_ai.vertex_ai_partner_models.main.VertexAIError: 'file' litellm.exceptions.InternalServerError: Vertex_aiException InternalServerError - 'file'

Root Cause

type: \"file\" is a public content-block discriminator, not a LiteLLM-private invariant. The two discovery helpers in litellm/litellm_core_utils/prompt_templates/common_utils.py:

  • get_file_ids_from_messages (line 1049)
  • update_messages_with_model_file_ids (line 431)

both short-circuit on c[\"type\"] == \"file\" and then assume c[\"file\"] exists. These functions are discovery passes. If a block has no file sub-dict, it simply has no file_id to extract or remap, so the correct behavior is to skip that block, not raise.

Related upstream layers (context, not asks)

  • langchain-core: _normalize_messages rewrites OpenAI file blocks to v1 shape. Reasonable in isolation; the problem is the asymmetry when the concrete chat-model integration does not translate back.
  • langchain-litellm: fixed in #61 (merged Jan 2026) via convert_to_openai_data_block. Users on langchain-litellm < 0.4 still hit the LiteLLM crash.

Fix Action

Fix / Workaround

PR #24503 (open, no reviews yet) patches the same spots but raises BadRequestError when file is missing. For these two discovery functions specifically that is still a hard failure for a block that is otherwise valid at the downstream provider. Skip semantics is strictly more permissive and still preserves #24503's fix for the stricter call sites (Gemini/Bedrock/Anthropic transformers, migrate_file_to_image_url), where a missing file sub-dict really is a malformed request.

PR fix notes

PR #26228: fix(anthropic): skip non-OpenAI file content blocks in file-id discovery helpers

Description (problem / solution / changelog)

Relevant issues

Closes #26227. Related to #24503 (different approach, see below).

Pre-Submission checklist

  • I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement
  • My PR passes all unit tests on make test-unit for the affected module
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

Bug Fix

Problem

AnthropicConfig.validate_environment (run on every Anthropic + Anthropic-via-Vertex request) calls:

  • is_file_id_used -> get_file_ids_from_messages (common_utils.py:1049)
  • downstream also update_messages_with_model_file_ids (common_utils.py:431)

Both helpers short-circuit on c[\"type\"] == \"file\" and then do a bare c[\"file\"] access. Any content block that uses \"file\" as its type discriminator but does not match the OpenAI Chat Completions sub-shape (nested file dict) raises KeyError: 'file'. The Vertex partner layer wraps this as VertexAIError(500) -> litellm.InternalServerError, before the LLM is contacted.

Real-world trigger: LangChain v1's _normalize_messages rewrites OpenAI file blocks into {\"type\":\"file\",\"id\":...,\"base64\":...,\"mime_type\":...,\"extras\":{}} on the way into every chat model. Hits every vertex_ai/claude-* and anthropic/claude-* request that carries an attachment.

See #26227 for full RCA, traceback, and minimal repro.

Fix

type: \"file\" is a public content-block discriminator. The two helpers above are discovery passes:

  • get_file_ids_from_messages returns a list of file IDs.
  • update_messages_with_model_file_ids rewrites provider-scoped file IDs in-place.

A block with no file sub-dict has no file_id to extract or remap, so the correct behavior is to skip the block, not raise. This patch switches both from c[\"file\"] to c.get(\"file\") + isinstance(..., dict) check, and continues past non-OpenAI blocks.

Why skip and not raise BadRequestError (as in #24503)

PR #24503 raises BadRequestError in these two spots. For the stricter sites it also touches (Gemini/Bedrock/Anthropic transformers, migrate_file_to_image_url), that is the right behavior: a block that has reached the provider transformer is expected to be fully-formed OpenAI shape, and a missing file sub-dict really is a malformed request.

These two discovery helpers are different. They run unconditionally inside validate_environment, ahead of any provider-specific transformer. Raising BadRequestError here fails the whole request for any legitimate non-OpenAI block (LangChain v1, provider-native, custom user shapes) even when the downstream transformer would handle it correctly. Skip is strictly more permissive: well-formed OpenAI blocks still yield their file_id, and non-OpenAI blocks stop crashing validate_environment.

Happy to coordinate with @krisxia0506 if this should land as a delta on top of #24503, or get rolled in there directly.

Changes

FileChange
litellm/litellm_core_utils/prompt_templates/common_utils.pyDefensive .get(\"file\") + isinstance(..., dict) check in get_file_ids_from_messages and update_messages_with_model_file_ids; skip the block if the sub-dict is missing or malformed.
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_common_utils.py5 new regression tests (LangChain v1 shape, OpenAI happy path, mixed shapes, file set to a non-dict value, remap path for non-OpenAI blocks).

Tests

All 25 tests in test_litellm_core_utils_prompt_templates_common_utils.py pass locally:

======================== 25 passed, 2 warnings in 0.40s ========================

New tests cover:

  • test_get_file_ids_from_messages_skips_langchain_v1_file_block
  • test_get_file_ids_from_messages_still_extracts_from_openai_shape
  • test_get_file_ids_from_messages_mixed_shapes
  • test_get_file_ids_from_messages_file_field_not_dict
  • test_update_messages_with_model_file_ids_skips_non_openai_file_blocks

Changed files

  • litellm/litellm_core_utils/prompt_templates/common_utils.py (modified, +14/-2)
  • tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_common_utils.py (modified, +113/-0)

Code Example

File \".../litellm/llms/anthropic/common_utils.py\", line 508, in validate_environment
    file_id_used = self.is_file_id_used(messages=messages)
File \".../litellm/llms/anthropic/common_utils.py\", line 113, in is_file_id_used
    file_ids = get_file_ids_from_messages(messages)
File \".../litellm/litellm_core_utils/prompt_templates/common_utils.py\", line 1063, in get_file_ids_from_messages
    file_object_file_field = file_object[\"file\"]
KeyError: 'file'

litellm.llms.vertex_ai.vertex_ai_partner_models.main.VertexAIError: 'file'
litellm.exceptions.InternalServerError: Vertex_aiException InternalServerError - 'file'

---

from litellm.litellm_core_utils.prompt_templates.common_utils import (
    get_file_ids_from_messages,
)

# LangChain v1 standardized file block shape:
# `type: \"file\"` plus `base64` / `mime_type` / `extras` siblings,
# but no nested `file` sub-dict.
msgs = [{
    \"role\": \"user\",
    \"content\": [
        {\"type\": \"text\", \"text\": \"summarise this PDF\"},
        {
            \"type\": \"file\",
            \"id\": \"lc_1\",
            \"base64\": \"JVBERi0xLjQK\",
            \"mime_type\": \"application/pdf\",
            \"extras\": {\"file_format\": \"application/pdf\"},
        },
    ],
}]

get_file_ids_from_messages(msgs)  # -> KeyError: 'file'
RAW_BUFFERClick to expand / collapse

What happened?

AnthropicConfig.validate_environment (called unconditionally on every Anthropic and Anthropic-via-Vertex request) runs is_file_id_used -> get_file_ids_from_messages, which then does a bare c["file"] on any content block whose type == "file". Any such block that is not in the OpenAI Chat Completions shape raises KeyError: 'file', which the Vertex partner layer wraps as VertexAIError(500) and then litellm.InternalServerError. The failure fires before any network call.

A real-world way to hit this: LangChain v1's _normalize_messages rewrites an OpenAI-shaped file block into the v1 standardized shape {"type":"file","id":...,"base64":...,"mime_type":...,"extras":{}} on the way into every chat model's _astream. langchain-litellm >= 0.4 now converts this back to OpenAI shape (#61 in that repo), but users on older 0.3.x still hit it, and any other caller that produces a type: \"file\" block without the OpenAI file sub-dict will hit it too.

Relevant log / traceback

File \".../litellm/llms/anthropic/common_utils.py\", line 508, in validate_environment
    file_id_used = self.is_file_id_used(messages=messages)
File \".../litellm/llms/anthropic/common_utils.py\", line 113, in is_file_id_used
    file_ids = get_file_ids_from_messages(messages)
File \".../litellm/litellm_core_utils/prompt_templates/common_utils.py\", line 1063, in get_file_ids_from_messages
    file_object_file_field = file_object[\"file\"]
KeyError: 'file'

litellm.llms.vertex_ai.vertex_ai_partner_models.main.VertexAIError: 'file'
litellm.exceptions.InternalServerError: Vertex_aiException InternalServerError - 'file'

Same class of crash is reachable at line 455 via update_messages_with_model_file_ids.

Minimal reproduction

No credentials needed, crashes before the network hop.

from litellm.litellm_core_utils.prompt_templates.common_utils import (
    get_file_ids_from_messages,
)

# LangChain v1 standardized file block shape:
# `type: \"file\"` plus `base64` / `mime_type` / `extras` siblings,
# but no nested `file` sub-dict.
msgs = [{
    \"role\": \"user\",
    \"content\": [
        {\"type\": \"text\", \"text\": \"summarise this PDF\"},
        {
            \"type\": \"file\",
            \"id\": \"lc_1\",
            \"base64\": \"JVBERi0xLjQK\",
            \"mime_type\": \"application/pdf\",
            \"extras\": {\"file_format\": \"application/pdf\"},
        },
    ],
}]

get_file_ids_from_messages(msgs)  # -> KeyError: 'file'

End-to-end repro via ChatLiteLLM with model=\"vertex_ai/claude-haiku-4-5\" or \"anthropic/claude-haiku-4-5\" fails with InternalServerError - 'file'.

Root cause

type: \"file\" is a public content-block discriminator, not a LiteLLM-private invariant. The two discovery helpers in litellm/litellm_core_utils/prompt_templates/common_utils.py:

  • get_file_ids_from_messages (line 1049)
  • update_messages_with_model_file_ids (line 431)

both short-circuit on c[\"type\"] == \"file\" and then assume c[\"file\"] exists. These functions are discovery passes. If a block has no file sub-dict, it simply has no file_id to extract or remap, so the correct behavior is to skip that block, not raise.

Related upstream layers (context, not asks)

  • langchain-core: _normalize_messages rewrites OpenAI file blocks to v1 shape. Reasonable in isolation; the problem is the asymmetry when the concrete chat-model integration does not translate back.
  • langchain-litellm: fixed in #61 (merged Jan 2026) via convert_to_openai_data_block. Users on langchain-litellm < 0.4 still hit the LiteLLM crash.

Relevant existing work

PR #24503 (open, no reviews yet) patches the same spots but raises BadRequestError when file is missing. For these two discovery functions specifically that is still a hard failure for a block that is otherwise valid at the downstream provider. Skip semantics is strictly more permissive and still preserves #24503's fix for the stricter call sites (Gemini/Bedrock/Anthropic transformers, migrate_file_to_image_url), where a missing file sub-dict really is a malformed request.

Proposed fix

PR coming: defensive .get(\"file\") + dict check in both discovery helpers, with 5 regression tests (LangChain v1 shape, OpenAI happy path, mixed shapes, file set to a non-dict, and the remap path). Verified against 25/25 tests in test_litellm_core_utils_prompt_templates_common_utils.py.

Are you a ML Ops Team?

No

What LiteLLM version are you on?

v1.83.8 (also reproduced on v1.83.10-nightly and v1.83.11; current main is still affected as of the filing date).

Twitter / LinkedIn details

https://www.linkedin.com/in/anmolg1997/

extent analysis

TL;DR

The most likely fix is to modify the get_file_ids_from_messages and update_messages_with_model_file_ids functions to handle cases where a type: "file" block does not contain a file sub-dict.

Guidance

  • Check the type of each content block and only attempt to access the file sub-dict if it exists, using a defensive .get("file") approach.
  • Verify that the fix works by testing with different types of content blocks, including those with and without a file sub-dict.
  • Consider adding regression tests to ensure that the fix does not introduce new issues.
  • Review the proposed fix in the upcoming PR and ensure that it aligns with the expected behavior.

Example

def get_file_ids_from_messages(messages):
    file_ids = []
    for message in messages:
        for content in message.get("content", []):
            if content.get("type") == "file":
                file_object = content.get("file")
                if file_object:
                    # process the file object
                    pass
                else:
                    # skip this block if it does not contain a file sub-dict
                    continue
    return file_ids

Notes

The fix should be applied to both get_file_ids_from_messages and update_messages_with_model_file_ids functions to ensure consistent behavior.

Recommendation

Apply the proposed fix in the upcoming PR, which includes defensive .get("file") and dict checks in both discovery helpers, along with regression tests to verify the fix. This approach ensures that the functions can handle different types of content blocks without raising errors.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING