litellm - ✅(Solved) Fix [Bug]: /v1/messages ignores client-supplied `timeout` (handler chain drops it before httpx) [1 pull requests, 1 participants]

GitHub stats

BerriAI/litellm#26752 (fetched 2026-04-30 06:20:22)

  • Comments: 0
  • Participants: 1
  • Timeline: 3
  • Reactions: 0
  • Timeline (top): labeled ×2, cross-referenced ×1

Fix Action

Fixed

PR fix notes

PR #26754: [Fix] /v1/messages — plumb client-supplied timeout to httpx call

Description (problem / solution / changelog)

Relevant issues

Fixes #26752

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have added testing in the tests/test_litellm/ directory (adding at least 1 test is a hard requirement)
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem
  • I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Delays in PR merge?

If you're seeing a delay in your PR being merged, ping the LiteLLM Team on Slack (#pr-review).

Type

🐛 Bug Fix

Changes

The anthropic_messages handler chain dropped timeout before the httpx call, so per-request timeouts on /v1/messages were silently ignored, whereas /chat/completions honored them.

This PR forwards timeout through the chain so /v1/messages matches the /chat/completions behavior.
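
The kind of bug described above can be illustrated with a minimal, self-contained sketch. It is not LiteLLM's code: RecordingTransport, http_layer_buggy, http_layer_fixed, messages_handler, and the 600-second default are all hypothetical names and values chosen for illustration. The sketch only shows how a keyword argument can be accepted by one layer of a call chain and never passed to the next, and what forwarding it looks like.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

# Hypothetical fallback used when no per-request timeout reaches the transport.
DEFAULT_TIMEOUT = 600.0


@dataclass
class RecordingTransport:
    """Stand-in for an HTTP client; records the timeout each call received."""

    seen_timeouts: List[float] = field(default_factory=list)

    def post(
        self, url: str, json: Dict[str, Any], timeout: Optional[float] = None
    ) -> Dict[str, Any]:
        effective = DEFAULT_TIMEOUT if timeout is None else timeout
        self.seen_timeouts.append(effective)
        return {"status": 200}


def http_layer_buggy(transport: RecordingTransport, url: str, body: Dict[str, Any], **kwargs: Any):
    # The timeout arrives inside kwargs but is never passed on: a silent drop.
    return transport.post(url, json=body)


def http_layer_fixed(
    transport: RecordingTransport,
    url: str,
    body: Dict[str, Any],
    timeout: Optional[float] = None,
    **kwargs: Any,
):
    # The timeout is named explicitly and forwarded to the transport.
    return transport.post(url, json=body, timeout=timeout)


def messages_handler(
    transport: RecordingTransport,
    body: Dict[str, Any],
    http_layer: Callable[..., Dict[str, Any]],
    timeout: Optional[float] = None,
):
    # Outer (hypothetical) handler: hands the timeout to whichever layer it is given.
    return http_layer(transport, "/v1/messages", body, timeout=timeout)


transport = RecordingTransport()
messages_handler(transport, {"messages": []}, http_layer_buggy, timeout=0.3)
messages_handler(transport, {"messages": []}, http_layer_fixed, timeout=0.3)
print(transport.seen_timeouts)  # first call fell back to the default, second kept 0.3
```

Because the buggy layer raises no error, the only visible symptom is that the request runs under the default timeout, which matches the "silent drop" reported in the issue.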

Changed files

  • litellm/llms/anthropic/experimental_pass_through/messages/handler.py (modified, +11/-0)
  • litellm/llms/custom_httpx/llm_http_handler.py (modified, +6/-0)
  • tests/test_litellm/llms/anthropic/experimental_pass_through/messages/test_anthropic_experimental_pass_through_messages_handler.py (modified, +95/-0)
  • tests/test_litellm/llms/custom_httpx/test_llm_http_handler.py (modified, +137/-0)


Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

Per-request timeout (and stream_timeout) on /v1/messages is silently ignored. /chat/completions honors it; /v1/messages does not.

Steps to Reproduce

Call the same model through both endpoints with a tight timeout. The /chat/completions request returns 408; the /v1/messages request returns a normal response.

# /chat/completions — times out as expected
curl -sS http://0.0.0.0:4000/v1/chat/completions \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "messages": [{"role": "user", "content": "hi"}],
    "timeout": 0.3
  }'
# -> 408 litellm.Timeout

# /v1/messages — should also time out, but returns 200
curl -sS http://0.0.0.0:4000/v1/messages \
  -H "x-api-key: sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-haiku-4-5",
    "max_tokens": 64,
    "messages": [{"role": "user", "content": "hi"}],
    "timeout": 0.3
  }'
# -> 200 (timeout was ignored)
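
The same comparison can be scripted. The following is a sketch using only the Python standard library, assuming the proxy address, key, and model name from the curl commands above; build_request and status_of are hypothetical helper names, and the 30-second client-side timeout is an arbitrary choice so that the script itself does not hang.

```python
import json
import urllib.error
import urllib.request

BASE_URL = "http://0.0.0.0:4000"  # proxy address taken from the curl commands
API_KEY = "sk-1234"               # key taken from the curl commands


def build_request(path: str, timeout: float) -> urllib.request.Request:
    """Build the POST request for one endpoint with a per-request timeout in the body."""
    body = {
        "model": "claude-haiku-4-5",
        "messages": [{"role": "user", "content": "hi"}],
        "timeout": timeout,
    }
    if path == "/v1/messages":
        body["max_tokens"] = 64
        headers = {"x-api-key": API_KEY}
    else:
        headers = {"Authorization": f"Bearer {API_KEY}"}
    headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def status_of(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status, including error statuses."""
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


# Usage against a running proxy:
#   for path in ("/v1/chat/completions", "/v1/messages"):
#       print(path, status_of(build_request(path, 0.3)))
```

On an affected version, the report above implies the two endpoints return different statuses (408 versus 200); once the timeout is forwarded, both should behave the same way.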

Relevant log output

(none — silent drop)

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on?

v1.83.11

Twitter / LinkedIn details

https://www.linkedin.com/in/세환-김-a31543202/

extent analysis

TL;DR

The /v1/messages endpoint ignores the client-supplied timeout (and stream_timeout), so a request that should fail with 408 returns a normal 200 response. The same request through /chat/completions times out as expected.

Guidance

  • Trace timeout through the anthropic_messages handler chain and confirm it reaches the httpx call; PR #26754 changes litellm/llms/anthropic/experimental_pass_through/messages/handler.py and litellm/llms/custom_httpx/llm_http_handler.py.
  • Compare with /chat/completions, which returns 408 (litellm.Timeout) when the timeout is exceeded.
  • Verify that both timeout and stream_timeout are honored, since the issue reports both as ignored.
  • Expect no log output when the bug occurs: the parameter is dropped silently, so a missing 408 is the only symptom.

Example

The issue itself contains no implementation details; the reproduction under Steps to Reproduce is the only code it provides.
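
As a generic illustration of the verification step in the guidance, the sketch below uses unittest.mock to check that a handler forwards timeout to its HTTP client. The handlers and the assert_timeout_forwarded helper are hypothetical and are not taken from LiteLLM's test suite; the sketch assumes only that the client exposes a post method accepting a timeout keyword.

```python
from typing import Any, Callable, Dict, Optional
from unittest.mock import MagicMock


def forwarding_handler(client: Any, body: Dict[str, Any], timeout: Optional[float] = None):
    """Hypothetical handler that passes the per-request timeout to the client."""
    return client.post("/v1/messages", json=body, timeout=timeout)


def dropping_handler(client: Any, body: Dict[str, Any], timeout: Optional[float] = None):
    """Hypothetical handler that accepts a timeout but never forwards it."""
    return client.post("/v1/messages", json=body)


def assert_timeout_forwarded(handler: Callable[..., Any], timeout: float = 0.3) -> None:
    """Fail if the handler does not hand the timeout to client.post."""
    client = MagicMock()
    handler(client, {"messages": []}, timeout=timeout)
    _, kwargs = client.post.call_args
    assert kwargs.get("timeout") == timeout, "timeout was not forwarded to the client"


assert_timeout_forwarded(forwarding_handler)  # passes silently

try:
    assert_timeout_forwarded(dropping_handler)
except AssertionError as err:
    print("detected:", err)
```

A check of this shape catches the drop at the boundary where it happens, rather than relying on an end-to-end timing test.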

Notes

The issue is specific to the /v1/messages endpoint and was reported against the Proxy on v1.83.11. Because the timeout is dropped silently, there is no log output to confirm the problem; comparing the response status with /chat/completions is the most direct check.

Recommendation

Apply the fix from PR #26754, which forwards timeout through the anthropic_messages handler chain so that /v1/messages matches the behavior already implemented in /chat/completions.
