litellm - ✅(Solved) Fix [Bug]: Thinking blocks corrupted on round-trip when assistant performs multiple web searches [3 pull requests, 4 comments, 2 participants]

Q: Expected behavior

Round-tripping `r.choices[0].message` should preserve the original `content` array ordering so Anthropic accepts it.

jph00 · 2026-03-07T19:48:32Z



GitHub stats

BerriAI/litellm#23047 • Fetched 2026-04-08 00:38:42

Author: jph00

Participants: giulio-leone, jph00

Timeline (top): commented ×4, cross-referenced ×4, labeled ×3, referenced ×3

Error Message

from litellm import completion

m = 'claude-sonnet-4-6'
msgs = [{'role': 'user', 'content': 'Search the web for the latest news about fast.ai and answer.ai'}]
r = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')

# Confirm thinking + multiple tool calls present
m1 = r.choices[0].message
assert m1.thinking_blocks, "No thinking blocks — retry until model thinks"
assert len(m1.tool_calls) >= 2, f"Need 2+ web searches, got {len(m1.tool_calls or [])}"

# Round-trip: pass message back unmodified
msgs.append(m1)
msgs.append({'role': 'user', 'content': 'Now search for news about solveit'})
r2 = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')
# ^^^ raises BadRequestError: thinking blocks cannot be modified

Root Cause

In litellm/litellm_core_utils/prompt_templates/factory.py, anthropic_messages_pt rebuilds the assistant content array by:

1. Prepending all thinking_blocks first
2. Then appending text blocks
3. Then appending server_tool_use + web_search_tool_result blocks

But the original Anthropic response interleaves thinking blocks between tool use/result blocks (e.g. [thinking_1, server_tool_use_1, web_search_result_1, thinking_2, text, server_tool_use_2, web_search_result_2]). The reconstructed order differs, which breaks Anthropic's thinking block signature verification.

With a single web search, thinking blocks happen to end up in the right relative position. With 2+, the reordering is detected.
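The reordering can be reproduced offline. The sketch below is illustrative only: the block dicts are stand-ins labelled with the names used above (the `label` key is hypothetical, real blocks carry other fields), and `rebuild_like_old_code` is a made-up helper that mimics the three steps listed above rather than litellm's actual code.

```python
# Block order as described for a response with two web searches.
ORIGINAL = [
    {"type": "thinking", "label": "thinking_1"},
    {"type": "server_tool_use", "label": "server_tool_use_1"},
    {"type": "web_search_tool_result", "label": "web_search_result_1"},
    {"type": "thinking", "label": "thinking_2"},
    {"type": "text", "label": "text"},
    {"type": "server_tool_use", "label": "server_tool_use_2"},
    {"type": "web_search_tool_result", "label": "web_search_result_2"},
]

TOOL_TYPES = ("server_tool_use", "web_search_tool_result")


def rebuild_like_old_code(blocks):
    """Hypothetical stand-in: thinking first, then text, then tool blocks."""
    thinking = [b for b in blocks if b["type"] == "thinking"]
    text = [b for b in blocks if b["type"] == "text"]
    tools = [b for b in blocks if b["type"] in TOOL_TYPES]
    return thinking + text + tools


def thinking_positions(blocks):
    """Indexes at which thinking blocks appear."""
    return [i for i, b in enumerate(blocks) if b["type"] == "thinking"]


rebuilt = rebuild_like_old_code(ORIGINAL)
print(thinking_positions(ORIGINAL))  # [0, 3]
print(thinking_positions(rebuilt))   # [0, 1]
```

Here thinking_2 moves from position 3 to position 1, which is consistent with the `content.1` index reported in the error message.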

Fix Action

Fixed

Fixed by PR (merged): fix: preserve thinking block order with multiple web searches (https://github.com/BerriAI/litellm/pull/23093)
Alternative fix proposed in PR (open, not merged): fix(anthropic): preserve interleaved thinking block order on round-trip (#23047) (https://github.com/BerriAI/litellm/pull/23137)
Fixed by PR (merged): Litellm oss staging 03 10 2026 (https://github.com/BerriAI/litellm/pull/23276); its factory.py and test_prompt_factory.py line counts match PR #23093

PR fix notes

PR #23093: fix: preserve thinking block order with multiple web searches

Repository: BerriAI/litellm
Author: MaxwellCalkin
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/23093

Description (problem / solution / changelog)

Note: This PR was authored by Claude (AI), operated by @maxwellcalkin.

Summary

Fixes #23047

When using Claude with extended thinking and web search, if the model performs 2+ web searches in a single turn, the next completion() call fails with:

thinking or redacted_thinking blocks in the latest assistant message cannot be modified

Root cause

In anthropic_messages_pt, assistant content was reconstructed by:

1. Prepending all thinking_blocks first
2. Then appending text blocks
3. Then appending server_tool_use + web_search_tool_result blocks

But Anthropic's original response interleaves thinking blocks between tool use/result blocks:

[thinking_1, server_tool_use_1, web_search_result_1, thinking_2, text, server_tool_use_2, web_search_result_2]

The reconstructed order differs, which breaks Anthropic's thinking block signature verification.

Fix

When both thinking_blocks and server tool calls (srvtoolu_*) are present on an assistant message, the code now interleaves them instead of separating them:

Each thinking block is paired with its corresponding server tool use group (server_tool_use + tool_result)
Extra thinking blocks (if more than tool groups) are emitted before the text block
Extra tool groups (if more than thinking blocks) are emitted without a preceding thinking block
Regular (non-server) tool calls are appended at the end
When no server tool calls are present, the existing sequential behavior is preserved
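The pairing rule in the list above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from PR #23093: `interleave` is a hypothetical helper, the sample block fields are placeholders, and placement of text blocks and regular tool calls is left out because the description does not pin down their exact positions.

```python
from itertools import zip_longest


def interleave(thinking_blocks, tool_groups):
    """Pair the i-th thinking block with the i-th server tool group.

    A tool group is a list such as [server_tool_use, web_search_tool_result].
    Leftover thinking blocks or leftover groups are emitted on their own.
    """
    ordered = []
    for thinking, group in zip_longest(thinking_blocks, tool_groups):
        if thinking is not None:
            ordered.append(thinking)
        if group is not None:
            ordered.extend(group)
    return ordered


thinking = [
    {"type": "thinking", "thinking": "plan search 1"},
    {"type": "thinking", "thinking": "plan search 2"},
]
groups = [
    [{"type": "server_tool_use", "id": "srvtoolu_1"},
     {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_1"}],
    [{"type": "server_tool_use", "id": "srvtoolu_2"},
     {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_2"}],
]

order = [block["type"] for block in interleave(thinking, groups)]
print(order)
```

With two thinking blocks and two groups, each thinking block lands directly in front of its own tool group instead of all thinking blocks being collected at the front.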

Changes

litellm/litellm_core_utils/prompt_templates/factory.py: Added interleaved mode for thinking blocks + server tool calls in anthropic_messages_pt
tests/llm_translation/test_prompt_factory.py: Added 3 tests covering the interleaving fix, backward compatibility, and edge cases

Test plan

test_anthropic_messages_pt_interleave_thinking_with_server_tool_calls — verifies correct interleaved order with 2 web searches
test_anthropic_messages_pt_thinking_blocks_no_server_tools_unchanged — verifies existing behavior preserved when only regular tool_use
test_anthropic_messages_pt_interleave_more_thinking_than_tool_groups — verifies handling of more thinking blocks than tool groups

Changed files

litellm/litellm_core_utils/prompt_templates/factory.py (modified, +241/-58)
tests/llm_translation/test_prompt_factory.py (modified, +343/-0)

PR #23137: fix(anthropic): preserve interleaved thinking block order on round-trip (#23047)

Repository: BerriAI/litellm
Author: ZenOfAutumn
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/23137

Description (problem / solution / changelog)

Summary

Fixes https://github.com/BerriAI/litellm/issues/23047

When Anthropic responses contain interleaved thinking blocks, tool use, and web search results, the anthropic_messages_pt function was corrupting the block order by unconditionally prepending thinking_blocks before processing the content list. This caused duplicate thinking blocks and incorrect ordering on multi-turn round-trips.

Problem

The anthropic_messages_pt function in factory.py always prepended thinking_blocks (from the top-level field) before iterating over the content list. When the content list already contained interleaved thinking, server_tool_use, and *_tool_result blocks (as returned by Anthropic for web search / tool use with extended thinking), this resulted in:

Duplicate thinking blocks — the same thinking block appeared twice (once from thinking_blocks field, once from content list)
Corrupted ordering — all thinking blocks were moved to the front, breaking the original interleaved sequence

Solution

Added a check: if the content list already contains type: "thinking" blocks, skip prepending thinking_blocks to avoid duplication and preserve the original interleaved order.
Added pass-through handling for server_tool_use and *_tool_result content blocks (e.g. web_search_tool_result, tool_search_tool_result, bash_code_execution_tool_result).
Added handling for redacted_thinking blocks.
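A rough sketch of how the pass-through rule for `server_tool_use` and `*_tool_result` blocks could be expressed. `is_passthrough_block` is a hypothetical name, and the suffix match is an assumption about how the `*_tool_result` pattern is implemented; the PR's real code may differ.

```python
def is_passthrough_block(block):
    """True for block types that should be forwarded unchanged, in place."""
    block_type = block.get("type", "")
    # "*_tool_result" covers web_search_tool_result, tool_search_tool_result,
    # bash_code_execution_tool_result, and similar server-side result types.
    return block_type == "server_tool_use" or block_type.endswith("_tool_result")


samples = [
    {"type": "server_tool_use"},
    {"type": "web_search_tool_result"},
    {"type": "bash_code_execution_tool_result"},
    {"type": "text"},
    {"type": "tool_result"},  # plain client-side result: no "_" prefix match
]
flags = [is_passthrough_block(block) for block in samples]
print(flags)
```

Matching on the `_tool_result` suffix deliberately excludes the plain `tool_result` type, which belongs to regular (non-server) tool calls.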

Changes

litellm/litellm_core_utils/prompt_templates/factory.py: Modified anthropic_messages_pt to detect and skip duplicate thinking block insertion; added pass-through for server_tool_use, *_tool_result, and redacted_thinking block types.
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py: Added 3 new unit tests covering interleaved thinking + tool use ordering, thinking_blocks-only fallback, and single web search with thinking.

Testing

All 3 new tests pass:

pytest tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py::test_anthropic_messages_pt_preserves_interleaved_thinking_and_tool_use_order
pytest tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py::test_anthropic_messages_pt_thinking_blocks_field_only_no_content_list
pytest tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py::test_anthropic_messages_pt_single_web_search_with_thinking

Changed files

litellm/litellm_core_utils/prompt_templates/factory.py (modified, +22/-1)
tests/test_litellm/litellm_core_utils/prompt_templates/test_litellm_core_utils_prompt_templates_factory.py (modified, +257/-0)

PR #23276: Litellm oss staging 03 10 2026

Repository: BerriAI/litellm
Author: RheagalFire
State: closed | merged: True
Link: https://github.com/BerriAI/litellm/pull/23276

Description (problem / solution / changelog)


Changed files

CLAUDE.md (modified, +7/-0)
docs/my-website/docs/apply_guardrail.md (modified, +1/-0)
docs/my-website/docs/completion/output.md (modified, +22/-0)
docs/my-website/docs/contributing/adding_openai_compatible_providers.md (modified, +40/-2)
docs/my-website/docs/mcp_guardrail.md (modified, +1/-0)
docs/my-website/docs/provider_registration/add_model_pricing.md (modified, +26/-1)
docs/my-website/docs/proxy/guardrails/panw_prisma_airs.md (modified, +129/-457)
litellm-proxy-extras/litellm_proxy_extras/migrations/20260309115809_add_missing_indexes/migration.sql (added, +13/-0)
litellm-proxy-extras/litellm_proxy_extras/schema.prisma (modified, +6/-0)
litellm/caching/dual_cache.py (modified, +0/-4)
litellm/completion_extras/litellm_responses_transformation/transformation.py (modified, +1/-21)
litellm/constants.py (modified, +1/-5)
litellm/google_genai/adapters/transformation.py (modified, +0/-2)
litellm/litellm_core_utils/core_helpers.py (modified, +50/-40)
litellm/litellm_core_utils/duration_parser.py (modified, +6/-4)
litellm/litellm_core_utils/get_model_cost_map.py (modified, +60/-5)
litellm/litellm_core_utils/prompt_templates/factory.py (modified, +241/-58)
litellm/litellm_core_utils/redact_messages.py (modified, +0/-70)
litellm/llms/azure/chat/gpt_5_transformation.py (modified, +5/-19)
litellm/llms/bedrock/chat/converse_transformation.py (modified, +2/-26)
litellm/llms/fireworks_ai/chat/transformation.py (modified, +1/-4)
litellm/llms/openai/chat/gpt_5_transformation.py (modified, +18/-61)
litellm/llms/openai/image_edit/transformation.py (modified, +1/-0)
litellm/llms/openai_like/README.md (modified, +35/-3)
litellm/llms/openai_like/dynamic_config.py (modified, +60/-0)
litellm/llms/openai_like/json_loader.py (modified, +9/-0)
litellm/llms/openai_like/responses/__init__.py (added, +5/-0)
litellm/llms/openai_like/responses/transformation.py (added, +51/-0)
litellm/llms/perplexity/responses/transformation.py (modified, +63/-427)
litellm/llms/sagemaker/completion/handler.py (modified, +36/-20)
litellm/llms/snowflake/chat/transformation.py (modified, +11/-10)
litellm/llms/vertex_ai/common_utils.py (modified, +23/-0)
litellm/llms/vertex_ai/gemini/transformation.py (modified, +0/-6)
litellm/llms/vertex_ai/gemini/vertex_and_google_ai_studio_gemini.py (modified, +36/-27)
litellm/proxy/_experimental/mcp_server/openapi_to_mcp_generator.py (modified, +1/-18)
litellm/proxy/_experimental/mcp_server/server.py (modified, +4/-7)
litellm/proxy/auth/model_checks.py (modified, +6/-16)
litellm/proxy/credential_endpoints/endpoints.py (modified, +45/-52)
litellm/proxy/guardrails/guardrail_hooks/panw_prisma_airs/__init__.py (modified, +1/-1)
litellm/proxy/guardrails/guardrail_hooks/panw_prisma_airs/panw_prisma_airs.py (modified, +943/-96)
litellm/proxy/management_endpoints/team_endpoints.py (modified, +18/-0)
litellm/proxy/pass_through_endpoints/pass_through_endpoints.py (modified, +1/-2)
litellm/proxy/schema.prisma (modified, +6/-0)
litellm/proxy/spend_tracking/spend_management_endpoints.py (modified, +5/-15)
litellm/responses/litellm_completion_transformation/transformation.py (modified, +21/-89)
litellm/responses/main.py (modified, +6/-6)
litellm/router.py (modified, +0/-22)
litellm/router_strategy/lowest_latency.py (modified, +3/-5)
litellm/types/images/main.py (modified, +1/-0)
litellm/types/llms/openai.py (modified, +9/-1)
litellm/types/proxy/guardrails/guardrail_hooks/panw_prisma_airs.py (modified, +7/-0)
litellm/types/utils.py (modified, +5/-1)
litellm/utils.py (modified, +61/-21)
provider_endpoints_support.json (modified, +0/-18)
schema.prisma (modified, +6/-0)
tests/llm_translation/test_prompt_factory.py (modified, +343/-0)
tests/llm_translation/test_skills_api.py (modified, +5/-11)
tests/local_testing/test_custom_callback_input.py (modified, +3/-5)
tests/logging_callback_tests/test_logging_redaction_e2e_test.py (modified, +5/-10)
tests/test_litellm/caching/test_dual_cache.py (modified, +0/-103)
tests/test_litellm/completion_extras/litellm_responses_transformation/test_completion_extras_litellm_responses_transformation_transformation.py (modified, +1/-216)
tests/test_litellm/litellm_core_utils/test_core_helpers.py (modified, +106/-1)
tests/test_litellm/llms/azure/chat/test_azure_gpt5_transformation.py (modified, +0/-17)
tests/test_litellm/llms/bedrock/chat/test_converse_transformation.py (modified, +0/-50)
tests/test_litellm/llms/fireworks_ai/chat/test_fireworks_ai_chat_transformation.py (modified, +0/-54)
tests/test_litellm/llms/openai/chat/test_openai_gpt_transformation.py (modified, +0/-193)
tests/test_litellm/llms/openai/test_gpt5_transformation.py (modified, +6/-71)
tests/test_litellm/llms/openai/test_openai_image_edit_transformation.py (modified, +48/-0)
tests/test_litellm/llms/openai_like/responses/__init__.py (added, +0/-0)
tests/test_litellm/llms/openai_like/responses/test_openai_like_responses.py (added, +341/-0)
tests/test_litellm/llms/perplexity/responses/test_perplexity_responses_transformation.py (modified, +257/-46)
tests/test_litellm/llms/sagemaker/test_sagemaker_embedding_role_assumption.py (removed, +0/-243)
tests/test_litellm/llms/snowflake/chat/test_snowflake_chat_transformation.py (modified, +6/-2)
tests/test_litellm/llms/vertex_ai/gemini/test_vertex_ai_gemini_transformation.py (modified, +0/-69)
tests/test_litellm/llms/vertex_ai/gemini/test_vertex_and_google_ai_studio_gemini.py (modified, +14/-16)
tests/test_litellm/llms/vertex_ai/test_vertex_ai_common_utils.py (modified, +91/-0)
tests/test_litellm/proxy/_experimental/mcp_server/test_mcp_server.py (modified, +0/-147)
tests/test_litellm/proxy/auth/test_model_checks.py (modified, +0/-134)
tests/test_litellm/proxy/guardrails/guardrail_hooks/test_panw_prisma_airs.py (modified, +4112/-294)
tests/test_litellm/proxy/pass_through_endpoints/test_pass_through_endpoints.py (modified, +0/-39)
tests/test_litellm/proxy/spend_tracking/test_spend_tracking_utils.py (modified, +2/-3)
tests/test_litellm/proxy/test_openapi_schema_validation.py (removed, +0/-142)
tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py (modified, +0/-125)
tests/test_litellm/test_model_cost_aliases.py (added, +238/-0)
tests/test_litellm/test_router_retry_non_retryable_errors.py (removed, +0/-251)
tests/test_litellm/types/test_types_utils.py (modified, +54/-0)
ui/litellm-dashboard/src/components/VirtualKeysPage/VirtualKeysTable.test.tsx (modified, +3/-79)
ui/litellm-dashboard/src/components/VirtualKeysPage/VirtualKeysTable.tsx (modified, +13/-35)

Code Example

messages.N.content.1: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

---

from litellm import completion

m = 'claude-sonnet-4-6'
msgs = [{'role': 'user', 'content': 'Search the web for the latest news about fast.ai and answer.ai'}]
r = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')

# Confirm thinking + multiple tool calls present
m1 = r.choices[0].message
assert m1.thinking_blocks, "No thinking blocks — retry until model thinks"
assert len(m1.tool_calls) >= 2, f"Need 2+ web searches, got {len(m1.tool_calls or [])}"

# Round-trip: pass message back unmodified
msgs.append(m1)
msgs.append({'role': 'user', 'content': 'Now search for news about solveit'})
r2 = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')
# ^^^ raises BadRequestError: thinking blocks cannot be modified

---

BadRequestError: litellm.BadRequestError: AnthropicException - {"type":"error","error":{"type":"invalid_request_error","message":"messages.1.content.1: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response."},"request_id":"req_011CYpHNUZA6pBuJhf3r4uPa"}
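When debugging, it can help to pull the message and content indexes out of the error text to see which block was rejected. The helper below is a sketch that assumes the error string keeps the shape shown in the log above (a JSON object following the exception prefix); `locate_offending_block` is a hypothetical name, not a litellm function.

```python
import json
import re

RAW_ERROR = (
    'litellm.BadRequestError: AnthropicException - {"type":"error","error":'
    '{"type":"invalid_request_error","message":"messages.1.content.1: '
    '`thinking` or `redacted_thinking` blocks in the latest assistant message '
    'cannot be modified. These blocks must remain as they were in the original '
    'response."},"request_id":"req_011CYpHNUZA6pBuJhf3r4uPa"}'
)


def locate_offending_block(error_text):
    """Return (message_index, content_index) from the error text, or None."""
    start = error_text.find("{")
    if start == -1:
        return None
    try:
        payload = json.loads(error_text[start:])
    except json.JSONDecodeError:
        return None
    message = payload.get("error", {}).get("message", "")
    match = re.match(r"messages\.(\d+)\.content\.(\d+):", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(locate_offending_block(RAW_ERROR))  # (1, 1)
```

For the logged error this points at the second block of the second message, that is, the assistant message that was passed back.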


Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using Claude with extended thinking (reasoning_effort) and web search (web_search_options), if the model performs 2+ web searches in a single turn, the next completion() call fails with:

messages.N.content.1: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.

This happens even when passing r.choices[0].message back unmodified — litellm's internal anthropic_messages_pt reconstructs the content array in the wrong order.

Root cause

In litellm/litellm_core_utils/prompt_templates/factory.py, anthropic_messages_pt rebuilds the assistant content array by:

1. Prepending all thinking_blocks first
2. Then appending text blocks
3. Then appending server_tool_use + web_search_tool_result blocks

With a single web search, thinking blocks happen to end up in the right relative position. With 2+, the reordering is detected.

Environment

litellm version: 1.82.0
Model: claude-sonnet-4-6 (also affects other Claude models with thinking + web search)
Python 3.12

Steps to Reproduce

from litellm import completion

m = 'claude-sonnet-4-6'
msgs = [{'role': 'user', 'content': 'Search the web for the latest news about fast.ai and answer.ai'}]
r = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')

# Confirm thinking + multiple tool calls present
m1 = r.choices[0].message
assert m1.thinking_blocks, "No thinking blocks — retry until model thinks"
assert len(m1.tool_calls) >= 2, f"Need 2+ web searches, got {len(m1.tool_calls or [])}"

# Round-trip: pass message back unmodified
msgs.append(m1)
msgs.append({'role': 'user', 'content': 'Now search for news about solveit'})
r2 = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')
# ^^^ raises BadRequestError: thinking blocks cannot be modified

Expected behavior

Round-tripping r.choices[0].message should preserve the original content array ordering so Anthropic accepts it.

Relevant log output

BadRequestError: litellm.BadRequestError: AnthropicException - {"type":"error","error":{"type":"invalid_request_error","message":"messages.1.content.1: `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response."},"request_id":"req_011CYpHNUZA6pBuJhf3r4uPa"}

What part of LiteLLM is this about?

SDK (litellm Python package)

What LiteLLM version are you on ?

v1.82.0


extent analysis

Fix Plan

To resolve the issue, we need to modify the anthropic_messages_pt function in litellm/litellm_core_utils/prompt_templates/factory.py to preserve the original order of thinking blocks and tool use/result blocks.

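One option, in the spirit of PR #23137, is to stop prepending thinking_blocks when the content list already carries thinking blocks, and to pass every block through in its original position. The helper below is a hypothetical sketch of that idea, not the actual anthropic_messages_pt code:

def rebuild_assistant_content(content, thinking_blocks=None):
    # Hypothetical helper: `content` is the assistant message's list of block
    # dicts, `thinking_blocks` is the separate top-level field.
    thinking_types = ("thinking", "redacted_thinking")
    has_inline_thinking = any(
        isinstance(block, dict) and block.get("type") in thinking_types
        for block in content
    )
    rebuilt = []
    if not has_inline_thinking:
        # No thinking blocks inside content: fall back to prepending them.
        rebuilt.extend(thinking_blocks or [])
    # Keep every block, unmodified, in the position Anthropic returned it.
    rebuilt.extend(content)
    return rebuilt

Alternatively, if each block recorded its position in the original response, the order could be restored by sorting on it. The "index" key here is an assumption; neither the issue nor the PRs say that litellm stores one:

def restore_original_order(blocks):
    # Hypothetical: each block dict carries its original position under "index".
    return sorted(blocks, key=lambda block: block["index"])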

Verification

To verify that the fix worked, you can run the same test code that reproduces the issue:

from litellm import completion

m = 'claude-sonnet-4-6'
msgs = [{'role': 'user', 'content': 'Search the web for the latest news about fast.ai and answer.ai'}]
r = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')

# Confirm thinking + multiple tool calls present
m1 = r.choices[0].message
assert m1.thinking_blocks, "No thinking blocks — retry until model thinks"
assert len(m1.tool_calls) >= 2, f"Need 2+ web searches, got {len(m1.tool_calls or [])}"

# Round-trip: pass message back unmodified
msgs.append(m1)
msgs.append({'role': 'user', 'content': 'Now search for news about solveit'})
r2 = completion(m, msgs, web_search_options={"search_context_size": "low"}, reasoning_effort='low')

If the fix is correct, the code should no longer raise a BadRequestError.
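The live check above depends on the model choosing to think and to search twice, so an offline comparison of block order can be a useful complement. The sketch below assumes the original response blocks and the rebuilt request blocks are both available as lists of dicts; the helper names are hypothetical, and the identifying fields (`signature`, `id`, `tool_use_id`) are assumed block fields rather than something stated in the issue.

```python
from itertools import zip_longest


def fingerprint(block):
    """Reduce a block to its type plus one identifying field, if present."""
    return (
        block.get("type"),
        block.get("signature") or block.get("id") or block.get("tool_use_id"),
    )


def first_mismatch(original, rebuilt):
    """Index of the first block that differs, or None if the orders match."""
    for index, (a, b) in enumerate(zip_longest(original, rebuilt)):
        if a is None or b is None or fingerprint(a) != fingerprint(b):
            return index
    return None


original = [
    {"type": "thinking", "signature": "sig_1"},
    {"type": "server_tool_use", "id": "srvtoolu_1"},
    {"type": "web_search_tool_result", "tool_use_id": "srvtoolu_1"},
    {"type": "thinking", "signature": "sig_2"},
    {"type": "text"},
]
# Simulate the bug: all thinking blocks moved to the front.
reordered = [original[0], original[3], original[4], original[1], original[2]]

print(first_mismatch(original, original))   # None
print(first_mismatch(original, reordered))  # 1
```

A return value of None for the real payload would indicate that the round-trip kept the block sequence intact.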

Extra Tips

Test the fix against the cases the PRs cover: two or more web searches with thinking, thinking with only regular tool calls, more thinking blocks than tool groups, and a single web search with thinking.
The fix is already merged upstream (PR #23093), so upgrading to a litellm release that includes it is preferable to carrying a local patch.
If the error persists after upgrading, report it on the original issue (BerriAI/litellm#23047).
