litellm: responses(): timeout parameter silently dropped on completion transformation path (Anthropic, Bedrock, Vertex)



Bug Summary

litellm.responses() / litellm.aresponses() silently drops the timeout parameter when routing through the completion transformation path (used by Anthropic, Vertex AI Claude, Bedrock, and any provider without a native Responses API config). The timeout works correctly on the native Responses API path (OpenAI, Azure).

As a result, the timeout in Router(timeout=40) has no effect for Anthropic models called through the Responses API. The effective timeout falls back to the Anthropic SDK default (~600s).

Production Impact

  • Incident: A single aresponses() call to claude-sonnet-4-6 hung for 602.7 seconds despite Router(timeout=40).
  • Root cause: The 602.7s matches the Anthropic SDK default timeout (~600s), confirming the Router's timeout=40 was never enforced.

Bug Location

File: litellm/responses/main.py, in the responses() function

The problem (line ~1108 on current main)

# COMPLETION TRANSFORMATION PATH (Anthropic, Bedrock, etc.)
# timeout is a NAMED PARAMETER of responses(), so it is NOT in **kwargs
if responses_api_provider_config is None or use_chat_completions_api is True:
    return litellm_completion_transformation_handler.response_api_handler(
        model=model,
        input=input,
        responses_api_request=response_api_optional_params,
        custom_llm_provider=custom_llm_provider,
        _is_async=_is_async,
        stream=stream,
        extra_headers=extra_headers,
        extra_body=extra_body,
        **kwargs,            # <-- timeout is NOT here, it was consumed as a named param
    )

The working path for comparison (line ~1169 on current main)

# NATIVE RESPONSES API PATH (OpenAI, Azure)
response = base_llm_http_handler.response_api_handler(
    model=model,
    input=input,
    ...
    timeout=timeout or request_timeout,   # <-- timeout IS explicitly forwarded here
    ...
)

Root Cause

timeout is declared as a named parameter in the responses() function signature:

def responses(
    input: ...,
    model: ...,
    ...
    timeout: Optional[Union[float, httpx.Timeout]] = None,   # <-- consumed here
    ...
    **kwargs,   # <-- timeout is NOT in kwargs
):

Because timeout is a named parameter, Python binds the caller's value to that name and never places it in **kwargs. When the completion transformation path forwards only **kwargs to response_api_handler(), the timeout value is silently lost.

The native path explicitly forwards timeout=timeout or request_timeout, but the completion transformation path does not.
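The binding behaviour described above can be reproduced without litellm. The following is a minimal sketch: `downstream_handler`, `responses_like`, and `responses_like_fixed` are hypothetical stand-ins, not litellm functions, and only illustrate how a named parameter disappears when a wrapper forwards `**kwargs` alone.

```python
from typing import Any, Dict, Optional


def downstream_handler(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical handler: it only sees what the wrapper forwards."""
    return kwargs


def responses_like(
    model: str, timeout: Optional[float] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Hypothetical wrapper that forwards **kwargs but not `timeout`."""
    # Python bound the caller's value to the `timeout` parameter,
    # so the key can never appear in kwargs.
    assert "timeout" not in kwargs
    return downstream_handler(model=model, **kwargs)


def responses_like_fixed(
    model: str, timeout: Optional[float] = None, **kwargs: Any
) -> Dict[str, Any]:
    """Same wrapper, with the named parameter forwarded explicitly."""
    return downstream_handler(model=model, timeout=timeout, **kwargs)


dropped = responses_like("some-model", timeout=5, user="abc")
kept = responses_like_fixed("some-model", timeout=5, user="abc")
print("without explicit forwarding:", dropped)
print("with explicit forwarding:   ", kept)
```

In the first call the handler receives `model` and `user` only; in the second it also receives `timeout`, which is the shape of the fix proposed in the next section.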

Fix

Add timeout=timeout to the response_api_handler() call on the completion transformation path:

if responses_api_provider_config is None or use_chat_completions_api is True:
    return litellm_completion_transformation_handler.response_api_handler(
        model=model,
        input=input,
        responses_api_request=response_api_optional_params,
        custom_llm_provider=custom_llm_provider,
        _is_async=_is_async,
        stream=stream,
        extra_headers=extra_headers,
        extra_body=extra_body,
        timeout=timeout,           # <-- ADD THIS
        **kwargs,
    )

The downstream handler (LiteLLMCompletionTransformationHandler.response_api_handler) already accepts **kwargs, so timeout will flow through to acompletion() / completion() calls automatically.

Also check: extra_body was recently added to the forwarded params. Other named parameters that may also be silently dropped on this path (same class of bug):

| Parameter | Forwarded on completion path? | Forwarded on native path? |
| --- | --- | --- |
| timeout | NO | YES (timeout or request_timeout) |
| extra_query | NO | YES |
| max_output_tokens | NO (in response_api_optional_params) | YES |
| temperature | NO (in response_api_optional_params) | YES |
| top_p | NO (in response_api_optional_params) | YES |

timeout and extra_query are the most critical missing ones since they are NOT included in response_api_optional_params either.
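One way to look for other parameters in the same bug class is to compare a wrapper's named parameters with what the downstream handler actually receives. The sketch below is only an illustration under stated assumptions: `handler`, `wrapper`, and `unforwarded_params` are hypothetical names, and the check assumes the handler receives each parameter under its original name. Values packed into another argument (as with response_api_optional_params) would be reported as missing by this simple check.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional

# Records the keyword arguments seen by the hypothetical handler.
received: Dict[str, Any] = {}


def handler(**kwargs: Any) -> None:
    """Hypothetical downstream handler that records what it receives."""
    received.clear()
    received.update(kwargs)


def wrapper(
    model: str,
    timeout: Optional[float] = None,
    extra_query: Optional[dict] = None,
    extra_body: Optional[dict] = None,
    **kwargs: Any,
) -> None:
    """Hypothetical wrapper that forwards only some named parameters."""
    handler(model=model, extra_body=extra_body, **kwargs)


def unforwarded_params(
    func: Callable[..., Any], call_args: Dict[str, Any]
) -> List[str]:
    """Call func and list named parameters the handler never saw."""
    func(**call_args)
    named = [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind is not inspect.Parameter.VAR_KEYWORD
    ]
    return [n for n in named if n in call_args and n not in received]


missing = unforwarded_params(
    wrapper,
    {"model": "m", "timeout": 5, "extra_query": {"a": "1"}, "extra_body": {}},
)
print(missing)  # ['timeout', 'extra_query']
```

A check like this could run as a regression test so that a newly added named parameter does not go unforwarded unnoticed.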

Precedent

This is the exact same class of bug that was fixed in #22544 for the metadata parameter. That PR added metadata=metadata to the completion transformation path call. The same fix pattern applies to timeout.

Reproduction

import asyncio
import litellm
from litellm import Router

router = Router(
    model_list=[{
        "model_name": "claude-sonnet",
        "litellm_params": {
            "model": "anthropic/claude-sonnet-4-6",
            "api_key": "sk-ant-...",
        },
    }],
    timeout=5,  # 5 second timeout
)

async def test():
    # This will NOT time out after 5s for Anthropic models.
    # It will use the Anthropic SDK default timeout (~600s).
    response = await router.aresponses(
        model="claude-sonnet",
        input="Write a very long essay about the history of computing",
        stream=True,
    )

asyncio.run(test())

Test results

  • Simple model name + slow server: timeout fires correctly at ~5s
  • anthropic/claude-sonnet-4-6 non-streaming: 201.2s instead of ~5s — FAIL
  • anthropic/claude-sonnet-4-6 streaming (the incident path): 60.3s (hung until slow server responded) — FAIL, timeout completely ignored
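Timings like those above can be gathered with a small harness that records elapsed time and whether a timeout error was raised. This is a minimal sketch, not the harness used for the results reported here: `slow_call` is a hypothetical stand-in for a slow backend, and its `ignore_timeout` flag simulates a code path that drops the timeout. The delays are shortened so the example finishes quickly.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional, Tuple


async def slow_call(
    timeout: Optional[float] = None, ignore_timeout: bool = False
) -> str:
    """Hypothetical slow backend; honours `timeout` unless told to ignore it."""
    delay = 0.3
    if timeout is not None and not ignore_timeout:
        await asyncio.wait_for(asyncio.sleep(delay), timeout=timeout)
    else:
        await asyncio.sleep(delay)
    return "done"


async def measure(call: Callable[[], Awaitable[str]]) -> Tuple[float, bool]:
    """Return (elapsed seconds, whether a timeout error was raised)."""
    start = time.perf_counter()
    try:
        await call()
        timed_out = False
    except asyncio.TimeoutError:
        timed_out = True
    return time.perf_counter() - start, timed_out


async def main() -> Tuple[Tuple[float, bool], Tuple[float, bool]]:
    honoured = await measure(lambda: slow_call(timeout=0.05))
    ignored = await measure(
        lambda: slow_call(timeout=0.05, ignore_timeout=True)
    )
    return honoured, ignored


honoured, ignored = asyncio.run(main())
print(f"timeout honoured: {honoured[0]:.2f}s, raised={honoured[1]}")
print(f"timeout ignored:  {ignored[0]:.2f}s, raised={ignored[1]}")
```

When the timeout is honoured the call ends near the configured limit with an error; when it is ignored the call runs for the full backend delay, which is the pattern seen in the failing cases.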

Introspection test result

Monkey-patching litellm.aresponses confirmed that Router does pass timeout=5 to litellm.aresponses(). The bug is inside litellm.aresponses() / responses() itself — it receives the timeout but drops it before calling acompletion().

Affected Providers

Any provider that goes through the completion transformation path (i.e., does NOT have a native BaseResponsesAPIConfig):

  • Anthropic (anthropic/claude-*)
  • Vertex AI (vertex_ai/claude-*)
  • Bedrock (bedrock/anthropic.*)
  • Any other provider without native Responses API support

Related Issues/PRs

  • #22544 — Fixed metadata not forwarded (same bug class)
  • #25591 — Fixed timeout not fetched from litellm_settings (different but related)
  • #16320 — Responses API implementation
