litellm - ✅(Solved) Fix [Bug]: /v1/messages streaming adapter drops tool_use input arguments for non-Anthropic models (regression in v1.82.x, works in v1.81.14) [1 pull requests, 1 participants]

litellm • 2026-04-08 05:06:53

GitHub stats

BerriAI/litellm#25321•Fetched 2026-04-09 07:52:52

Author

josh900

Participants

josh900

Timeline (top)

labeled ×4

Error Message

InputValidationError: Bash failed due to the following issue: The required parameter command is missing

Root Cause

In streaming_iterator.py, when _should_start_new_content_block() returns True (e.g. text → tool_use transition), the code queues content_block_stop + content_block_start and returns immediately without queuing processed_chunk:

# streaming_iterator.py — async __anext__, ~line 306-330
if should_start_new_block and not self.sent_content_block_finish:
    self.chunk_queue.append({"type": "content_block_stop", ...})
    self.chunk_queue.append({"type": "content_block_start", ..., "content_block": self.current_content_block_start})
    self.sent_content_block_finish = False
    return self.chunk_queue.popleft()  # ← processed_chunk is NEVER QUEUED

The comment says: "The trigger chunk itself is not emitted as a delta since the content_block_start already carries the relevant information."

This is correct for text blocks (content_block_start has the text). But it's wrong for tool_use blocks: _translate_streaming_openai_chunk_to_anthropic_content_block always creates tool_use with input={} (line 1217 in transformation.py). The actual arguments are ONLY in processed_chunk as an input_json_delta — and that chunk is silently discarded.

In v1.81.14, the _should_start_new_content_block mechanism didn't exist — all chunks were unconditionally translated and queued, so tool arguments were never lost.
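The early return described above can be reproduced in isolation. The following is a minimal sketch, not the adapter's actual code: `buggy_transition` and the literal chunk values are hypothetical stand-ins, and only the queue-then-return pattern is taken from the snippet in this issue.

```python
from collections import deque


def buggy_transition(chunk_queue, block_start, processed_chunk):
    """Mimic the early return: queue stop + start, then pop the first event.

    processed_chunk is accepted but never queued, which is the reported bug.
    """
    chunk_queue.append({"type": "content_block_stop", "index": 0})
    chunk_queue.append(
        {"type": "content_block_start", "index": 1, "content_block": block_start}
    )
    return chunk_queue.popleft()


# A tool_use block start always carries input={} (as described in the issue).
block_start = {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {}}
# The trigger chunk is the only place the arguments live.
processed_chunk = {
    "type": "content_block_delta",
    "index": 1,
    "delta": {"type": "input_json_delta", "partial_json": '{"command": "ls -R"}'},
}

queue = deque()
first = buggy_transition(queue, block_start, processed_chunk)
emitted = [first] + list(queue)
emitted_types = [event["type"] for event in emitted]
print(emitted_types)  # ['content_block_stop', 'content_block_start']
```

The delta that carries `partial_json` never appears in `emitted`, so the client only ever sees the empty `input`.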

Fix Action

Fix

Add 2 lines in both sync and async paths — after queuing content_block_start, also queue processed_chunk if it carries delta content:

if processed_chunk.get("type") == "content_block_delta":
    self.chunk_queue.append(processed_chunk)

This is safe for all block types: text deltas, tool_use input_json_delta, and thinking_delta all benefit from being queued alongside their content_block_start. The real Anthropic API sends both content_block_start and the first delta in sequence — this fix aligns the adapter with that behavior.
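To see the effect of the two-line guard on event order, here is a small sketch under stated assumptions: `TransitionQueue` and `start_new_block` are hypothetical names for illustration, and only the `chunk_queue` attribute and the guard itself come from the issue.

```python
from collections import deque


class TransitionQueue:
    """Hypothetical stand-in for the adapter's queue handling at a block transition."""

    def __init__(self):
        self.chunk_queue = deque()

    def start_new_block(self, index, block_start, processed_chunk):
        self.chunk_queue.append({"type": "content_block_stop", "index": index - 1})
        self.chunk_queue.append(
            {"type": "content_block_start", "index": index, "content_block": block_start}
        )
        # The two-line fix proposed in the issue: keep the trigger chunk
        # when it is a delta, so its payload follows content_block_start.
        if processed_chunk.get("type") == "content_block_delta":
            self.chunk_queue.append(processed_chunk)
        return self.chunk_queue.popleft()


wrapper = TransitionQueue()
delta = {
    "type": "content_block_delta",
    "index": 1,
    "delta": {"type": "input_json_delta", "partial_json": '{"command": "ls -R"}'},
}
first = wrapper.start_new_block(
    1, {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {}}, delta
)
order = [first["type"]] + [event["type"] for event in wrapper.chunk_queue]
print(order)  # ['content_block_stop', 'content_block_start', 'content_block_delta']
```

With the guard in place the delta is delivered right after `content_block_start`, which matches the start-then-delta sequence the issue attributes to the real Anthropic API.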

PR fix notes

PR #25411: fix(adapter): preserve tool_use arguments in /v1/messages streaming block transitions

Repository: BerriAI/litellm
Author: lawrence3699
State: closed | merged: False
Link: https://github.com/BerriAI/litellm/pull/25411

Description (problem / solution / changelog)

Relevant issues

Fixes #25321

Pre-Submission checklist

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

When _should_start_new_content_block() triggers a block transition (e.g. text → tool_use), the /v1/messages streaming adapter queues content_block_stop + content_block_start and then returns — discarding the processed_chunk. For providers that split tool calls across multiple chunks (OpenAI-style), this is fine because the first chunk only carries the function name with empty arguments. But providers like Gemini send the function name and arguments in a single chunk. The discarded processed_chunk contains the input_json_delta with the actual tool arguments, so every tool_use block arrives with input: {}.

Before: Claude Code using Gemini through the proxy gets InputValidationError: The required parameter 'command' is missing on every tool call.

After: The adapter queues the delta alongside content_block_start when it carries non-empty content (partial_json, text, or thinking). Empty deltas from OpenAI-style split tool calls are still correctly skipped.

The fix is applied to both the sync (__next__) and async (__anext__) code paths.
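The "non-empty content" condition described in the PR could look roughly like the sketch below. `delta_has_content` is a hypothetical helper name, and the exact check in the PR may differ; the field names (`partial_json`, `text`, `thinking`) are the ones listed in the PR description.

```python
def delta_has_content(chunk: dict) -> bool:
    """Return True if chunk is a content_block_delta with a non-empty payload.

    Hypothetical helper: an empty delta (for example the first chunk of an
    OpenAI-style split tool call) returns False and would still be skipped.
    """
    if chunk.get("type") != "content_block_delta":
        return False
    delta = chunk.get("delta") or {}
    return any(delta.get(key) for key in ("partial_json", "text", "thinking"))


# Gemini-style: name and arguments arrive together, so the delta is non-empty.
gemini_style = {
    "type": "content_block_delta",
    "delta": {"type": "input_json_delta", "partial_json": '{"command": "ls -R"}'},
}
# OpenAI-style: the first chunk only names the function, arguments are empty.
openai_style = {
    "type": "content_block_delta",
    "delta": {"type": "input_json_delta", "partial_json": ""},
}

print(delta_has_content(gemini_style))  # True
print(delta_has_content(openai_style))  # False
```

Checking the payload rather than only the event type is what lets the same code path serve both chunking styles.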

Tests added

test_anthropic_stream_wrapper_tool_args_in_first_chunk — verifies tool arguments are preserved when sent in the same chunk as the function name (sync)
test_async_anthropic_stream_wrapper_tool_args_in_first_chunk — async version of the same
Updated test_anthropic_stream_wrapper_interleaved_tool_calls_and_text — text content during block transitions is now correctly emitted as a content_block_delta instead of being silently dropped

Changed files

litellm/litellm_core_utils/streaming_handler.py (modified, +6/-0)
litellm/llms/anthropic/experimental_pass_through/adapters/streaming_iterator.py (modified, +16/-4)
tests/test_litellm/litellm_core_utils/test_streaming_handler.py (modified, +86/-0)
tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_parallel_tool_calls.py (modified, +151/-2)

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When using Claude Code with non-Anthropic models (Gemini) through LiteLLM's /v1/messages endpoint, all tool_use blocks arrive with empty input: {}. The model returns correct tool call arguments in OpenAI format, but the streaming adapter discards them during translation back to Anthropic format. This causes Claude Code to loop on InputValidationError: The required parameter 'command' is missing for every tool call (Bash, Write, TodoWrite, etc.).

Works on: v1.81.14-stable
Broken on: v1.82.0-stable, v1.82.3-stable.patch.2, v1.83.4-nightly

Root Cause

# streaming_iterator.py — async __anext__, ~line 306-330
if should_start_new_block and not self.sent_content_block_finish:
    self.chunk_queue.append({"type": "content_block_stop", ...})
    self.chunk_queue.append({"type": "content_block_start", ..., "content_block": self.current_content_block_start})
    self.sent_content_block_finish = False
    return self.chunk_queue.popleft()  # ← processed_chunk is NEVER QUEUED

The comment says: "The trigger chunk itself is not emitted as a delta since the content_block_start already carries the relevant information."

In v1.81.14, the _should_start_new_content_block mechanism didn't exist — all chunks were unconditionally translated and queued, so tool arguments were never lost.

Evidence

Gemini raw response (OpenAI format) — arguments present:

{
  "function": {
    "name": "Bash",
    "arguments": "{\"command\": \"ls -R\", \"description\": \"List files\"}"
  }
}

What Claude Code receives (Anthropic format) — arguments missing:

{
  "type": "tool_use",
  "id": "call_6f570bdb...",
  "name": "Bash",
  "input": {}
}

Claude Code error:

InputValidationError: Bash failed due to the following issue:
The required parameter `command` is missing
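The evidence above follows from how a streaming client rebuilds tool input: the `input` in `content_block_start` is a placeholder, and the arguments are assembled from `input_json_delta` fragments. The sketch below is a simplified illustration of that accumulation, not Claude Code's implementation; `assemble_tool_inputs` is a hypothetical name.

```python
import json


def assemble_tool_inputs(events):
    """Rebuild tool_use inputs from Anthropic-style stream events."""
    open_blocks = {}  # block index -> {"name": str, "fragments": [str]}
    results = []
    for event in events:
        kind = event.get("type")
        index = event.get("index")
        if kind == "content_block_start":
            block = event["content_block"]
            if block.get("type") == "tool_use":
                open_blocks[index] = {"name": block["name"], "fragments": []}
        elif kind == "content_block_delta" and index in open_blocks:
            delta = event["delta"]
            if delta.get("type") == "input_json_delta":
                open_blocks[index]["fragments"].append(delta.get("partial_json", ""))
        elif kind == "content_block_stop" and index in open_blocks:
            block = open_blocks.pop(index)
            raw = "".join(block["fragments"])
            # With no fragments, the input stays empty, as in the bug report.
            results.append({"name": block["name"], "input": json.loads(raw) if raw else {}})
    return results


start = {
    "type": "content_block_start",
    "index": 1,
    "content_block": {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {}},
}
delta = {
    "type": "content_block_delta",
    "index": 1,
    "delta": {"type": "input_json_delta", "partial_json": '{"command": "ls -R"}'},
}
stop = {"type": "content_block_stop", "index": 1}

print(assemble_tool_inputs([start, stop]))         # delta dropped: input is {}
print(assemble_tool_inputs([start, delta, stop]))  # delta kept: command present
```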

Fix

Add 2 lines in both sync and async paths — after queuing content_block_start, also queue processed_chunk if it carries delta content:

if processed_chunk.get("type") == "content_block_delta":
    self.chunk_queue.append(processed_chunk)

Related

PR #13638 fixed the same class of bug (input_json_delta vs text_delta) in an earlier version
Issue #20711 describes the same argument-drop pattern in the Responses API streaming path

Steps to Reproduce

Configure LiteLLM proxy with a Gemini model:

model_list:
  - model_name: gemini-3-flash-preview
    litellm_params:
      model: gemini/gemini-3-flash-preview
      api_key: os.environ/GEMINI_API_KEY
litellm_settings:
  drop_params: true
  modify_params: true

Point Claude Code at the proxy:

export ANTHROPIC_BASE_URL="http://localhost:4000"
export ANTHROPIC_AUTH_TOKEN="$LITELLM_MASTER_KEY"

Run any Claude Code task that requires tool use:

claude --model gemini-3-flash-preview -p "list the files in the current directory"

Observe: every tool call (Bash, Write, TodoWrite, etc.) fails with InputValidationError: The required parameter 'command' is missing because input is {}.
Downgrade to v1.81.14-stable — same test works correctly.

Relevant log output

# Claude Code logs showing empty tool input on v1.82.3:

{"type":"tool_use","id":"call_81b4866e7c6549ddb26cf39f2989__thought__Eq8JCqwJAb4...","name":"Bash","input":{}}

# Followed by validation error:
InputValidationError: Bash failed due to the following issue:
The required parameter `command` is missing

# LiteLLM raw response from Gemini shows arguments ARE present:
tool_calls: [{
  function: {
    name: "Bash",
    arguments: "{\"command\": \"ls -R\", \"description\": \"List files in the current directory and subdirectories\"}"
  }
}]

# But content_block_start sent to client has input={}:
{"type":"content_block_start","index":1,"content_block":{"type":"tool_use","id":"call_...","name":"Bash","input":{}}}
# No content_block_delta with input_json_delta follows — processed_chunk was discarded
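When checking a captured stream for this symptom, a small script can flag tool_use blocks that close without any argument delta. This is a sketch that assumes the capture is a list of SSE lines whose `data:` payloads are JSON events shaped like the log above; `find_tool_blocks_without_input` is a hypothetical name. A tool legitimately called with no arguments may also be flagged, so treat hits as candidates rather than proof.

```python
import json


def find_tool_blocks_without_input(sse_lines):
    """Return indexes of tool_use blocks that never got a non-empty input_json_delta."""
    saw_delta = {}  # block index -> True once a non-empty delta arrived
    missing = []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip "event:" lines, comments, and blanks
        try:
            event = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore non-JSON payloads
        kind, index = event.get("type"), event.get("index")
        if kind == "content_block_start":
            if event.get("content_block", {}).get("type") == "tool_use":
                saw_delta[index] = False
        elif kind == "content_block_delta" and index in saw_delta:
            delta = event.get("delta", {})
            if delta.get("type") == "input_json_delta" and delta.get("partial_json"):
                saw_delta[index] = True
        elif kind == "content_block_stop" and index in saw_delta:
            if not saw_delta.pop(index):
                missing.append(index)
    # Blocks that never closed and never received arguments are also suspect.
    missing.extend(i for i, seen in saw_delta.items() if not seen)
    return missing


def sse(event):
    """Format one event dict as an SSE data line."""
    return "data: " + json.dumps(event)


tool_start = {
    "type": "content_block_start",
    "index": 1,
    "content_block": {"type": "tool_use", "id": "call_1", "name": "Bash", "input": {}},
}
tool_delta = {
    "type": "content_block_delta",
    "index": 1,
    "delta": {"type": "input_json_delta", "partial_json": '{"command": "ls -R"}'},
}
tool_stop = {"type": "content_block_stop", "index": 1}

broken = [sse(tool_start), sse(tool_stop)]
healthy = [sse(tool_start), sse(tool_delta), sse(tool_stop)]
print(find_tool_blocks_without_input(broken))   # [1]
print(find_tool_blocks_without_input(healthy))  # []
```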

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.82.3-stable.patch.2 (also reproduces on v1.83.4-nightly; works on v1.81.14-stable)

Twitter / LinkedIn details

No response

extent analysis

TL;DR

The issue can be fixed by adding two lines of code to queue the processed_chunk after queuing content_block_start in both sync and async paths.

Guidance

The root cause of the issue is that the processed_chunk is not being queued when _should_start_new_content_block returns True, resulting in the loss of tool arguments.
To fix this, add the following lines of code after queuing content_block_start:

if processed_chunk.get("type") == "content_block_delta":
    self.chunk_queue.append(processed_chunk)

This fix is safe for all block types and aligns the adapter with the real Anthropic API behavior.
Verify the fix by running the same test case and checking that the tool arguments are no longer missing.

Example

The provided code snippet in the issue already shows the fix:

if processed_chunk.get("type") == "content_block_delta":
    self.chunk_queue.append(processed_chunk)

This code should be added to both sync and async paths after queuing content_block_start.

Notes

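The issue affects the /v1/messages streaming adapter when it translates responses from non-Anthropic models. It was reported with Gemini, which sends the function name and arguments in a single chunk. The fix belongs in streaming_iterator.py, in both the sync (__next__) and async (__anext__) paths.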

Recommendation

Apply the fix by queuing processed_chunk after content_block_start whenever it is a content_block_delta. This aligns the adapter with the real Anthropic API behavior. Note that PR #25411, which implements this change, is listed above as closed and not merged.
