litellm - 💡(How to fix) Fix [Bug]: Proxy forwards internal _litellm_* reservation fields to OpenAI chat completions

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

LiteLLM proxy can attach internal rate-limiter / TPM reservation bookkeeping fields directly to the request payload dict, then forward the same dict to the upstream OpenAI Chat Completions API.

When that happens, OpenAI rejects the request with errors like:

  • Unknown parameter: '_litellm_rate_limit_descriptors'.
  • Unrecognized request arguments supplied: _litellm_rate_limit_descriptors, _litellm_tpm_reserved_model, _litellm_tpm_reserved_scopes, _litellm_tpm_reserved_tokens

In our environment this showed up from LiteLLM proxy as RateLimitError / 429, but the upstream OpenAI response was actually a 400 invalid_request_error.

Error Message

Proxy-side logs showed failures on POST /v1/chat/completions with messages equivalent to:

Root Cause

LiteLLM proxy can attach internal rate-limiter / TPM reservation bookkeeping fields directly to the request payload dict, then forward the same dict to the upstream OpenAI Chat Completions API.

When that happens, OpenAI rejects the request with errors like:

  • Unknown parameter: '_litellm_rate_limit_descriptors'.
  • Unrecognized request arguments supplied: _litellm_rate_limit_descriptors, _litellm_tpm_reserved_model, _litellm_tpm_reserved_scopes, _litellm_tpm_reserved_tokens

In our environment this showed up from LiteLLM proxy as RateLimitError / 429, but the upstream OpenAI response was actually a 400 invalid_request_error.

Fix Action

Fix / Workaround

  • This issue is related in spirit to other cases where LiteLLM internal params leak into upstream provider payloads, but this one is proxy / rate-limiter specific.
  • I have a minimal patch and unit test for this if useful.

Code Example

OpenAIException - Unrecognized request arguments supplied:
_litellm_rate_limit_descriptors,
_litellm_tpm_reserved_model,
_litellm_tpm_reserved_scopes,
_litellm_tpm_reserved_tokens

---

OpenAIException - Unknown parameter: '_litellm_rate_limit_descriptors'
RAW_BUFFERClick to expand / collapse

[Bug]: Proxy forwards internal _litellm_* reservation fields to OpenAI chat completions

Description

LiteLLM proxy can attach internal rate-limiter / TPM reservation bookkeeping fields directly to the request payload dict, then forward the same dict to the upstream OpenAI Chat Completions API.

When that happens, OpenAI rejects the request with errors like:

  • Unknown parameter: '_litellm_rate_limit_descriptors'.
  • Unrecognized request arguments supplied: _litellm_rate_limit_descriptors, _litellm_tpm_reserved_model, _litellm_tpm_reserved_scopes, _litellm_tpm_reserved_tokens

In our environment this showed up from LiteLLM proxy as RateLimitError / 429, but the upstream OpenAI response was actually a 400 invalid_request_error.

Observed behavior

Proxy-side logs showed failures on POST /v1/chat/completions with messages equivalent to:

OpenAIException - Unrecognized request arguments supplied:
_litellm_rate_limit_descriptors,
_litellm_tpm_reserved_model,
_litellm_tpm_reserved_scopes,
_litellm_tpm_reserved_tokens

and:

OpenAIException - Unknown parameter: '_litellm_rate_limit_descriptors'

The failures were reproducible on OpenAI-backed model groups while other requests still succeeded, so this does not look like a generic transport or auth issue.

Why this happens

parallel_request_limiter_v3.py stashes internal bookkeeping fields such as:

  • _litellm_rate_limit_descriptors
  • _litellm_tpm_reserved_tokens
  • _litellm_tpm_reserved_model
  • _litellm_tpm_reserved_scopes

on the request data dict so later callbacks can reconcile reservations.

litellm/llms/openai/openai.py then forwards **data directly into:

  • openai_aclient.chat.completions.with_raw_response.create(...)
  • openai_client.chat.completions.with_raw_response.create(...)

That leaks LiteLLM-internal fields into the upstream OpenAI payload.

Expected behavior

LiteLLM should preserve internal _litellm_* fields locally for reconciliation and callbacks, but strip them from the payload sent to upstream OpenAI APIs.

Suggested fix

Sanitize the request dict in OpenAIChatCompletion.make_openai_chat_completion_request() and make_sync_openai_chat_completion_request() before calling OpenAI, removing top-level keys prefixed with _litellm_.

Notes

  • This issue is related in spirit to other cases where LiteLLM internal params leak into upstream provider payloads, but this one is proxy / rate-limiter specific.
  • I have a minimal patch and unit test for this if useful.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

LiteLLM should preserve internal _litellm_* fields locally for reconciliation and callbacks, but strip them from the payload sent to upstream OpenAI APIs.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING