litellm - ✅(Solved) Fix [Bug]: /v1/messages ignores client-supplied `timeout` (handler chain drops it before httpx) [1 pull request, 1 participant]

litellm • 2026-04-29 06:25:50

GitHub stats

BerriAI/litellm#26752•Fetched 2026-04-30 06:20:22

Author

kimsehwan96

Participants

kimsehwan96

Timeline (top)

labeled ×2, cross-referenced ×1

Fix Action

Fixed

Fixed by PR: [Fix] /v1/messages — plumb client-supplied timeout to httpx call (https://github.com/BerriAI/litellm/pull/26754)

PR fix notes

PR #26754: [Fix] /v1/messages — plumb client-supplied timeout to httpx call

Repository: BerriAI/litellm
Author: kimsehwan96
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26754

Description (problem / solution / changelog)

Relevant issues

Fixes #26752

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
My PR passes all unit tests on make test-unit
My PR's scope is as isolated as possible, it only solves 1 specific problem
I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Type

🐛 Bug Fix

Changes

The anthropic_messages handler chain dropped timeout before the httpx call, so per-request timeouts on /v1/messages were silently ignored, whereas /chat/completions honored them.

This PR forwards timeout through the chain so /v1/messages matches the /chat/completions behavior.
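
The failure mode described above can be illustrated with a minimal sketch. This is not litellm's actual code: the function names (`send`, `handler`, `entrypoint_buggy`, `entrypoint_fixed`) and the default value are hypothetical, and the httpx call is replaced by a stub that only reports which timeout it would use.

```python
from typing import Any, Dict, Optional

DEFAULT_TIMEOUT = 600.0  # hypothetical default, chosen only for illustration


def send(url: str, body: Dict[str, Any], timeout: Optional[float] = None) -> Dict[str, Any]:
    """Stand-in for the final HTTP call; reports the timeout it would use."""
    effective = DEFAULT_TIMEOUT if timeout is None else timeout
    return {"url": url, "timeout_used": effective}


def handler(body: Dict[str, Any], timeout: Optional[float] = None) -> Dict[str, Any]:
    """Middle layer: passes timeout along to the transport call."""
    return send("/v1/messages", body, timeout=timeout)


def entrypoint_buggy(body: Dict[str, Any], timeout: Optional[float] = None) -> Dict[str, Any]:
    # Bug pattern: timeout is accepted here but never forwarded,
    # so the lower layers fall back to the default without any error.
    return handler(body)


def entrypoint_fixed(body: Dict[str, Any], timeout: Optional[float] = None) -> Dict[str, Any]:
    # Fix pattern: forward timeout explicitly through every layer.
    return handler(body, timeout=timeout)


request = {"model": "claude-haiku-4-5", "max_tokens": 64}
print(entrypoint_buggy(request, timeout=0.3)["timeout_used"])  # 600.0
print(entrypoint_fixed(request, timeout=0.3)["timeout_used"])  # 0.3
```

Because every layer declares `timeout` as optional with a default, dropping it raises no exception, which is consistent with the "silent drop" reported in the issue.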

Changed files

litellm/llms/anthropic/experimental_pass_through/messages/handler.py (modified, +11/-0)
litellm/llms/custom_httpx/llm_http_handler.py (modified, +6/-0)
tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py (modified, +95/-0)
tests/test_litellm/llms/custom_httpx/test_llm_http_handler.py (modified, +137/-0)

Issue body

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Per-request timeout (and stream_timeout) on /v1/messages is silently ignored. /chat/completions honors it; /v1/messages does not.

Steps to Reproduce

Call the same model through both endpoints with a tight timeout. The /chat/completions request returns 408; the /v1/messages request returns a normal response.

# /chat/completions — times out as expected
curl -sS http://0.0.0.0:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": "hi"}],
    "timeout": 0.3
  }'
# -> 408 litellm.Timeout

# /v1/messages — should also time out, but returns 200
curl -sS http://0.0.0.0:4000/v1/messages \
  -H "x-api-key: sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "hi"}],
    "timeout": 0.3
  }'
# -> 200 (timeout was ignored)
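
The same comparison can be scripted. The sketch below mirrors the two curl commands using only the Python standard library; the base URL and key are the placeholders from the commands above, and the helper names (`build_request`, `status_of`, `compare_endpoints`) are made up for this example.

```python
import json
import urllib.error
import urllib.request
from typing import Dict

BASE_URL = "http://0.0.0.0:4000"  # proxy address taken from the curl commands
API_KEY = "sk-1234"               # placeholder key taken from the curl commands


def build_request(endpoint: str, timeout: float) -> urllib.request.Request:
    """Build the same POST the curl commands send for the given endpoint."""
    body = {
        "model": "claude-haiku-4-5",
        "messages": [{"role": "user", "content": "hi"}],
        "timeout": timeout,
    }
    headers = {"Content-Type": "application/json"}
    if endpoint == "/v1/messages":
        body["max_tokens"] = 64
        headers["x-api-key"] = API_KEY
    else:
        headers["Authorization"] = f"Bearer {API_KEY}"
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(BASE_URL + endpoint, data=data, headers=headers, method="POST")


def status_of(endpoint: str, timeout: float = 0.3) -> int:
    """Return the HTTP status code the proxy answers with."""
    try:
        with urllib.request.urlopen(build_request(endpoint, timeout)) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as the 408 reported above) arrive as HTTPError.
        return err.code


def compare_endpoints() -> Dict[str, int]:
    """Query both endpoints with the same tight timeout and collect the statuses."""
    return {ep: status_of(ep) for ep in ("/v1/chat/completions", "/v1/messages")}
```

Calling `compare_endpoints()` requires a running proxy at `BASE_URL`. On the affected version described in this report, the two status codes would be expected to differ.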

Relevant log output

(none — silent drop)

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.83.11

Twitter / LinkedIn details

https://www.linkedin.com/in/세환-김-a31543202/

extent analysis

TL;DR

The /v1/messages endpoint is not honoring the timeout parameter, causing it to return a normal response instead of timing out.

Guidance

Review the implementation of the /v1/messages endpoint to ensure it properly handles the timeout parameter.
Compare the handling of timeout in /v1/messages with the implementation in /chat/completions, which correctly returns a 408 error when the timeout is exceeded.
Verify that the timeout parameter is forwarded through every handler layer down to the underlying HTTP call that performs the request.
Check for any differences in configuration or dependencies between the two endpoints that could be causing the discrepancy.

Example

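The issue itself contains no implementation details. The linked PR #26754 modifies litellm/llms/anthropic/experimental_pass_through/messages/handler.py and litellm/llms/custom_httpx/llm_http_handler.py to forward timeout.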
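
Independent of litellm's internals, the verification step from the guidance can be sketched generically with a mock. The handler below is hypothetical and only stands in for whatever function sits between the endpoint and the HTTP client; the test checks that the `timeout` keyword actually reaches the client call.

```python
from typing import Any, Dict, Optional
from unittest.mock import MagicMock


def messages_handler(client: Any, body: Dict[str, Any], timeout: Optional[float] = None) -> Any:
    """Hypothetical handler under test: forwards timeout to client.post."""
    return client.post("/v1/messages", json=body, timeout=timeout)


def test_timeout_reaches_client() -> None:
    client = MagicMock()  # stands in for the HTTP client
    messages_handler(client, {"model": "claude-haiku-4-5"}, timeout=0.3)

    # Inspect the keyword arguments of the single recorded call.
    client.post.assert_called_once()
    assert client.post.call_args.kwargs["timeout"] == 0.3


test_timeout_reaches_client()
print("timeout forwarded")
```

A handler that drops the argument would fail the final assertion with a `KeyError`, which turns the otherwise silent behavior into a visible test failure.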

Notes

The issue is specific to the /v1/messages endpoint. According to the linked PR, the anthropic_messages handler chain dropped timeout before the httpx call, while /chat/completions honored it.

Recommendation

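Apply the fix from PR #26754 (https://github.com/BerriAI/litellm/pull/26754), which forwards timeout through the anthropic_messages handler chain so that /v1/messages matches the /chat/completions behavior. The PR is listed above as open and not yet merged at the time of fetching.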

#api #agent execution #callback error #memory management #API rate limit
