litellm - ✅(Solved) Fix [Bug]: Responses API → Vertex AI Claude breaks on multi-turn tool calling (tool_use/tool_result pairing corrupted) [1 pull requests, 1 participants]

GitHub stats
BerriAI/litellm#23105 · Fetched 2026-04-08 00:38:35
Comments: 0 · Participants: 1 · Timeline events: 5 · Reactions: 0
Timeline (top): cross-referenced ×2, closed ×1, labeled ×1, renamed ×1

Multi-turn tool calling through the Responses API (/v1/responses) fails when routing to Vertex AI Anthropic (Claude) models. The same requests succeed via Chat Completions (/v1/chat/completions) to the same models.

Single-turn tool calls work. The failure occurs on the 2nd+ tool call turn in a multi-turn conversation.

Error Message

litellm.BadRequestError: Vertex_aiException BadRequestError -
  "tool_use ids were found without tool_result blocks immediately after:
   toolu_vrtx_01MWvTJiSMaC4cndrp3sB8WX. Each tool_use block must have
   a corresponding tool_result block in the next message."

Fix Action

Fixed

PR fix notes

PR #23116: fix(responses): merge parallel function_call items into single assistant message

Description (problem / solution / changelog)

Relevant issues

Fixes #23105


Changes

Changed files

  • litellm/responses/litellm_completion_transformation/transformation.py (modified, +35/-0)
  • tests/test_litellm/responses/litellm_completion_transformation/test_litellm_completion_responses.py (modified, +125/-0)

## Bug Report

**LiteLLM version:** 1.81.16 (also reproduced conceptually on 1.80.10+)

### Description

Multi-turn tool calling through the **Responses API** (`/v1/responses`) fails
when routing to **Vertex AI Anthropic (Claude)** models. The same requests
succeed via Chat Completions (`/v1/chat/completions`) to the same models.

Single-turn tool calls work. The failure occurs on the **2nd+ tool call turn**
in a multi-turn conversation.

### Error

```
litellm.BadRequestError: Vertex_aiException BadRequestError -
  "tool_use ids were found without tool_result blocks immediately after:
   toolu_vrtx_01MWvTJiSMaC4cndrp3sB8WX. Each tool_use block must have
   a corresponding tool_result block in the next message."
```

### Root Cause (hypothesis)

The Responses API path performs a double translation:
1. Responses API input → internal Chat Completions format
2. Internal Chat Completions → Anthropic Messages API format

During this double hop, the `tool_use` / `tool_result` block pairing gets
corrupted on multi-turn conversations. Vertex AI (Anthropic) then rejects
the malformed message because tool_use blocks lack corresponding tool_result
blocks in the immediately following message.
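
The pairing rule quoted in the error can be checked locally before a request is sent. The following is a minimal sketch, assuming the messages are already in Anthropic Messages format (`content` as a list of blocks typed `tool_use` / `tool_result`). The helper name `find_unpaired_tool_use_ids` and the sample histories are hypothetical and not part of LiteLLM.

```python
from typing import Any, Dict, List


def find_unpaired_tool_use_ids(messages: List[Dict[str, Any]]) -> List[str]:
    """Return tool_use ids that have no tool_result in the very next message."""
    unpaired: List[str] = []
    for index, message in enumerate(messages):
        content = message.get("content")
        if message.get("role") != "assistant" or not isinstance(content, list):
            continue
        tool_use_ids = [b["id"] for b in content if b.get("type") == "tool_use"]
        if not tool_use_ids:
            continue
        # Only the immediately following message may answer these calls
        has_next = index + 1 < len(messages)
        next_content = messages[index + 1].get("content") if has_next else None
        answered = set()
        if isinstance(next_content, list):
            answered = {
                b.get("tool_use_id")
                for b in next_content
                if b.get("type") == "tool_result"
            }
        unpaired.extend(i for i in tool_use_ids if i not in answered)
    return unpaired


# Hypothetical malformed history: parallel calls split across two assistant messages
malformed = [
    {"role": "user", "content": "What's the weather in SF and NYC?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_a", "name": "get_weather", "input": {"city": "SF"}},
    ]},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "toolu_b", "name": "get_weather", "input": {"city": "NYC"}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "toolu_a", "content": "18C"},
        {"type": "tool_result", "tool_use_id": "toolu_b", "content": "7C"},
    ]},
]

# Same conversation with both tool_use blocks in one assistant message
well_formed = [
    malformed[0],
    {"role": "assistant", "content": malformed[1]["content"] + malformed[2]["content"]},
    malformed[3],
]

print(find_unpaired_tool_use_ids(malformed))
print(find_unpaired_tool_use_ids(well_formed))
```

Running a check like this on the payload produced by the second translation hop would show which `tool_use` id is left without a `tool_result`, which is the same information the Vertex AI error reports.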

### Reproduction

**Fails (Responses API → Claude):**
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-...")

# Turn 1: user asks something requiring tools
response = client.responses.create(
    model="claude-sonnet-4-5",
    input=[{"role": "user", "content": "What's the weather in SF and NYC?"}],
    # Responses API function tools use a flat shape (no nested "function" key)
    tools=[{
        "type": "function",
        "name": "get_weather",
        "description": "Get weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"]
        }
    }]
)
# Turn 1 succeeds — model makes tool_calls

# Turn 2: send tool results back and continue
# THIS is where it breaks — the tool_use/tool_result message ordering
# is corrupted during Responses → Chat Completions → Anthropic translation
```
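
The snippet above elides the second turn. As a sketch of what that turn's `input` typically looks like with the Responses API, the helper below echoes the model's `function_call` items back and appends one `function_call_output` per call, matched by `call_id`. The helper name `build_followup_input`, the call ids, and the weather values are made up for illustration; the exact request used in the report is not shown.

```python
import json
from typing import Any, Dict, List


def build_followup_input(
    history: List[Dict[str, Any]],
    function_calls: List[Dict[str, Any]],
    results: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Append function_call items and their outputs to the prior input items."""
    followup = list(history)
    # Echo the calls the model made in turn 1
    for call in function_calls:
        followup.append({
            "type": "function_call",
            "call_id": call["call_id"],
            "name": call["name"],
            "arguments": call["arguments"],
        })
    # One output per call, matched by call_id
    for call in function_calls:
        followup.append({
            "type": "function_call_output",
            "call_id": call["call_id"],
            "output": json.dumps(results[call["call_id"]]),
        })
    return followup


history = [{"role": "user", "content": "What's the weather in SF and NYC?"}]
calls = [
    {"call_id": "call_sf", "name": "get_weather", "arguments": '{"city": "SF"}'},
    {"call_id": "call_nyc", "name": "get_weather", "arguments": '{"city": "NYC"}'},
]
turn2_input = build_followup_input(
    history, calls, {"call_sf": {"temp_c": 18}, "call_nyc": {"temp_c": 7}}
)
# turn2_input would then be passed as `input=` to client.responses.create(...)
```

With two parallel calls the list contains two adjacent `function_call` items, which is the shape that the PR title ("merge parallel function_call items") refers to.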

**Works (Chat Completions → same Claude model):**
```python
response = client.chat.completions.create(
    model="claude-sonnet-4-5",
    messages=[...],  # same conversation
    tools=[...]      # same tools
)
# Multi-turn works perfectly via Chat Completions
```

**Works (Responses API → GPT models):**
```python
response = client.responses.create(
    model="gpt-4.1",  # Azure path — no Anthropic translation
    input=[...],
    tools=[...]
)
# Multi-turn works via Responses API to non-Anthropic backends
```

### Test Matrix

| Scenario | API | Backend | Result |
|----------|-----|---------|--------|
| Multi-turn tool calls | Chat Completions | Claude (Vertex AI) | ✅ PASS |
| Multi-turn tool calls | Responses | Claude (Vertex AI) | ❌ FAIL |
| Multi-turn tool calls | Responses | GPT (Azure) | ✅ PASS |
| Single tool call | Responses | Claude (Vertex AI) | ✅ PASS |

### Impact

This blocks any application that:
1. Uses the Responses API (e.g., Codex CLI, which now **requires** `wire_api = "responses"` and has removed Chat Completions support)
2. Routes to Claude/Anthropic models through LiteLLM
3. Uses multi-turn tool calling

This effectively makes **Claude models unusable through Codex CLI + LiteLLM**.

### Related Issues

- #6836 — Meta-issue for Anthropic tool calling problems
- #11804 — Anthropic duplicate tool_result handling
- #12020 — Tool input_schema type enforcement

### Environment

- LiteLLM: 1.81.16
- Provider: `vertex_ai` (Anthropic partner models)
- Models tested: claude-sonnet-4-5, claude-opus-4-5, claude-haiku-4-5
- Client: OpenAI Python SDK v1.x (Responses API)

### Suggested Investigation

The bug likely lives in the Responses API → internal message translation
layer, specifically how `tool_use` and `tool_result` blocks are ordered
when converting from Responses format to Chat Completions format before
the final Anthropic translation hop. The Anthropic Chat Completions path
(which skips the Responses → internal step) handles the same conversations
correctly.

extent analysis

Fix Plan

To address the issue with multi-turn tool calling through the Responses API to Vertex AI Anthropic models, the translation from Responses format to Chat Completions format must keep every tool_use block paired with its tool_result. PR #23116 describes its fix as merging parallel function_call items into a single assistant message. The steps:

  • Modify the Responses API → internal message translation layer so that consecutive function_call items become one assistant message with multiple tool calls, rather than one assistant message per call.
  • Ensure that each tool_use block has a corresponding tool_result block in the immediately following message.

Illustrative sketch of this approach (not the code from the PR; it assumes Responses-style function_call and function_call_output input items):

def translate_responses_to_chat_completions(responses_input):
    # Sketch only, not LiteLLM's actual implementation.
    # Assumes items of type "function_call" (call_id, name, arguments) and
    # "function_call_output" (call_id, output); other items pass through.
    messages = []
    for item in responses_input:
        item_type = item.get("type")
        if item_type == "function_call":
            tool_call = {
                "id": item["call_id"],
                "type": "function",
                "function": {"name": item["name"], "arguments": item["arguments"]},
            }
            last = messages[-1] if messages else None
            if last and last.get("role") == "assistant" and last.get("tool_calls"):
                # Parallel call: merge into the preceding assistant message
                last["tool_calls"].append(tool_call)
            else:
                messages.append({"role": "assistant", "content": None, "tool_calls": [tool_call]})
        elif item_type == "function_call_output":
            # Each output becomes a tool message keyed by the same call id
            messages.append({"role": "tool", "tool_call_id": item["call_id"], "content": item["output"]})
        else:
            messages.append(item)
    return messages

Verification

To verify the fix, test the Responses API with multi-turn tool calling to Vertex AI Anthropic models using the modified translation layer. The test cases should pass, and the tool_use and tool_result blocks should be correctly ordered in the translated messages.

Extra Tips

  • Review the related issues (#6836, #11804, #12020) to ensure that the fix does not introduce any regressions.
  • Consider adding additional logging or debugging statements to the translation layer to help identify any future issues with tool_use and tool_result block ordering.
