litellm - 💡(How to fix) Fix [Feature]: Support MCP Elicitation and Sampling in the MCP Gateway [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#23761Fetched 2026-04-08 00:49:18
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Author
Participants
Timeline (top)
labeled ×3subscribed ×1

Error Message

  • Elicitation is harder in this mode since there's no user-facing MCP client to present a form to. Could be deferred or handled as a clear error rather than a silent drop.
RAW_BUFFERClick to expand / collapse

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

The MCP spec (2025-06-18, extended in 2025-11-25 which LiteLLM targets since v1.80.18) defines two server→client capabilities that the MCP Gateway does not currently support:

Today, the gateway only proxies client→server messages. When an upstream MCP server sends elicitation/create or sampling/createMessage, the message has no path back to the client and is silently dropped.

Impact differs per MCP mode

Mode A — Transparent proxy (/mcp/, /{server}/mcp — Cursor, Claude Code, FastMCP clients):

Both capabilities need bidirectional JSON-RPC relay over the existing Streamable HTTP session. This requires propagating client-declared capabilities upstream during initialize, and routing server→client messages back through the SSE channel.

Mode B — Tool bridge (/v1/chat/completions, /v1/responses with "type": "mcp" tools):

LiteLLM is the MCP client — there's no downstream MCP client to relay to.

  • Sampling is a natural fit here: LiteLLM could handle sampling/createMessage internally using its own LLM routing. It already manages 100+ providers with cost tracking and rate limiting — this is arguably the ideal position to fulfill sampling requests.
  • Elicitation is harder in this mode since there's no user-facing MCP client to present a form to. Could be deferred or handled as a clear error rather than a silent drop.

Motivation, pitch

Elicitation (Mode A): We run MCP servers behind LiteLLM that use elicitation for interactive workflows (user confirmation, disambiguation, missing parameters). Deploying them behind the gateway for centralized auth and access control silently breaks elicitation. URL-mode elicitation is especially relevant for enterprise use — it lets MCP servers initiate third-party OAuth flows without credentials transiting through the client.

Sampling (Mode B): MCP servers with agentic behaviors (multi-step triage, analysis) use sampling for intermediate LLM reasoning. LiteLLM handling this internally via its own model routing would enable a powerful pattern: agentic MCP servers that work through LiteLLM without needing their own API keys.

A phased approach could make sense: sampling in Mode B first (most natural fit), then bidirectional relay in Mode A, then elicitation in Mode B if needed.

Related

  • #22855 (generic request for 2025-11-25 spec alignment — this issue covers the specific missing capabilities)

What part of LiteLLM is this about?

Proxy

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

No

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To address the missing MCP capabilities, we'll implement the following steps:

Mode A — Transparent Proxy

  1. Modify the initialize handler to propagate client-declared capabilities upstream.
  2. Implement bidirectional JSON-RPC relay over the existing Streamable HTTP session.
  3. Route server→client messages back through the SSE channel.

Example code for modifying the initialize handler:

def initialize_handler(request):
    # ...
    client_capabilities = request.json['capabilities']
    # Propagate client capabilities upstream
    upstream_request = {'capabilities': client_capabilities}
    # ...

Mode B — Tool Bridge

  1. Implement internal handling of sampling/createMessage using LiteLLM's own LLM routing.
  2. Handle elicitation as a clear error or defer it.

Example code for internal handling of sampling/createMessage:

def sampling_create_message_handler(request):
    # Get the LLM provider and routing information
    llm_provider = get_llm_provider(request.json['provider'])
    # Use LiteLLM's own LLM routing to fulfill the sampling request
    response = llm_provider.sample(request.json['prompt'])
    # ...

Verification

To verify the fix, test the following scenarios:

  • In Mode A, send an elicitation/create message from the upstream MCP server and verify that it is relayed back to the client.
  • In Mode B, send a sampling/createMessage message from the upstream MCP server and verify that it is handled internally by LiteLLM.

Extra Tips

  • Ensure that the implementation is compatible with the 2025-11-25 MCP spec.
  • Consider implementing a phased approach to roll out the new capabilities.
  • Monitor the performance and latency of the new implementation to ensure it meets the requirements.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

litellm - 💡(How to fix) Fix [Feature]: Support MCP Elicitation and Sampling in the MCP Gateway [1 participants]