litellm - 💡(How to fix) Fix [Feature]: Support MCP Elicitation and Sampling in the MCP Gateway [1 participants]

litellm2026-03-16 17:20:36

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

BerriAI/litellm#23761•Fetched 2026-04-08 00:49:18

View on GitHub

Comments

Participants

Timeline

Reactions

Author

Anko59

Participants

Anko59

Timeline (top)

labeled ×3subscribed ×1

Error Message

Elicitation is harder in this mode since there's no user-facing MCP client to present a form to. Could be deferred or handled as a clear error rather than a silent drop.

RAW_BUFFERClick to expand / collapse

Check for existing issues

I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

The MCP spec (2025-06-18, extended in 2025-11-25 which LiteLLM targets since v1.80.18) defines two server→client capabilities that the MCP Gateway does not currently support:

Elicitation ([spec](https://modelcontextprotocol.io/specification/2025-11-25/client/elicitation)): server requests structured user input mid-operation via elicitation/create. The 2025-11-25 spec adds URL-mode for secure out-of-band OAuth flows with third-party services.
Sampling ([spec](https://modelcontextprotocol.io/specification/2025-11-25/client/sampling)): server requests LLM completions from the client via sampling/createMessage, enabling agentic server-side behaviors without server-side API keys.

Today, the gateway only proxies client→server messages. When an upstream MCP server sends elicitation/create or sampling/createMessage, the message has no path back to the client and is silently dropped.

Impact differs per MCP mode

Mode A — Transparent proxy (/mcp/, /{server}/mcp — Cursor, Claude Code, FastMCP clients):

Both capabilities need bidirectional JSON-RPC relay over the existing Streamable HTTP session. This requires propagating client-declared capabilities upstream during initialize, and routing server→client messages back through the SSE channel.

Mode B — Tool bridge (/v1/chat/completions, /v1/responses with "type": "mcp" tools):

LiteLLM is the MCP client — there's no downstream MCP client to relay to.

Sampling is a natural fit here: LiteLLM could handle sampling/createMessage internally using its own LLM routing. It already manages 100+ providers with cost tracking and rate limiting — this is arguably the ideal position to fulfill sampling requests.
Elicitation is harder in this mode since there's no user-facing MCP client to present a form to. Could be deferred or handled as a clear error rather than a silent drop.

Motivation, pitch

Elicitation (Mode A): We run MCP servers behind LiteLLM that use elicitation for interactive workflows (user confirmation, disambiguation, missing parameters). Deploying them behind the gateway for centralized auth and access control silently breaks elicitation. URL-mode elicitation is especially relevant for enterprise use — it lets MCP servers initiate third-party OAuth flows without credentials transiting through the client.

Sampling (Mode B): MCP servers with agentic behaviors (multi-step triage, analysis) use sampling for intermediate LLM reasoning. LiteLLM handling this internally via its own model routing would enable a powerful pattern: agentic MCP servers that work through LiteLLM without needing their own API keys.

A phased approach could make sense: sampling in Mode B first (most natural fit), then bidirectional relay in Mode A, then elicitation in Mode B if needed.

#22855 (generic request for 2025-11-25 spec alignment — this issue covers the specific missing capabilities)

What part of LiteLLM is this about?

Proxy

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

Twitter / LinkedIn details

No response

extent analysis

Fix Plan

To address the missing MCP capabilities, we'll implement the following steps:

Mode A — Transparent Proxy

Modify the initialize handler to propagate client-declared capabilities upstream.
Implement bidirectional JSON-RPC relay over the existing Streamable HTTP session.
Route server→client messages back through the SSE channel.

Example code for modifying the initialize handler:

def initialize_handler(request):
    # ...
    client_capabilities = request.json['capabilities']
    # Propagate client capabilities upstream
    upstream_request = {'capabilities': client_capabilities}
    # ...

Mode B — Tool Bridge

Implement internal handling of sampling/createMessage using LiteLLM's own LLM routing.
Handle elicitation as a clear error or defer it.

Example code for internal handling of sampling/createMessage:

def sampling_create_message_handler(request):
    # Get the LLM provider and routing information
    llm_provider = get_llm_provider(request.json['provider'])
    # Use LiteLLM's own LLM routing to fulfill the sampling request
    response = llm_provider.sample(request.json['prompt'])
    # ...

Verification

To verify the fix, test the following scenarios:

In Mode A, send an elicitation/create message from the upstream MCP server and verify that it is relayed back to the client.
In Mode B, send a sampling/createMessage message from the upstream MCP server and verify that it is handled internally by LiteLLM.

Extra Tips

Ensure that the implementation is compatible with the 2025-11-25 MCP spec.
Consider implementing a phased approach to roll out the new capabilities.
Monitor the performance and latency of the new implementation to ensure it meets the requirements.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #installation #tensor shape #autograd error #API routing #API middleware #SSR setup #ISR setup

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

litellm - 💡(How to fix) Fix [Feature]: Support MCP Elicitation and Sampling in the MCP Gateway [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Check for existing issues

The Feature

Impact differs per MCP mode

Motivation, pitch

Related

What part of LiteLLM is this about?

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

Twitter / LinkedIn details

extent analysis

Fix Plan

Mode A — Transparent Proxy

Mode B — Tool Bridge

Verification

Extra Tips

Still need to ship something?

TRENDING

litellm - 💡(How to fix) Fix [Feature]: Support MCP Elicitation and Sampling in the MCP Gateway [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Check for existing issues

The Feature

Impact differs per MCP mode

Motivation, pitch

Related

What part of LiteLLM is this about?

LiteLLM is hiring a founding backend engineer, are you interested in joining us and shipping to all our users?

Twitter / LinkedIn details

extent analysis

Fix Plan

Mode A — Transparent Proxy

Mode B — Tool Bridge

Verification

Extra Tips

Still need to ship something?

RELATED_DISCOVERY

TRENDING