litellm - ✅(Solved) Fix [Bug]: `mcp_semantic_tool_filter` silently drops all non-MCP ("native") tools [1 pull request, 1 participant]

XyLearningProgramming · 2026-04-22T01:57:17Z



GitHub stats

BerriAI/litellm#26212•Fetched 2026-04-22 07:45:43

Author: XyLearningProgramming
Participants: XyLearningProgramming
Timeline (top): labeled ×3

PR fix notes

PR #26247: fix(mcp): preserve native tools in semantic filter hook

Repository: BerriAI/litellm
Author: ayushh0110
State: open | merged: False
Link: https://github.com/BerriAI/litellm/pull/26247

Description (problem / solution / changelog)

Relevant issues

Fixes #26212

Pre-Submission checklist

[x] I have Added testing in the tests/test_litellm/ directory, Adding at least 1 test is a hard requirement - see details
[x] My PR passes all unit tests on make test-unit
[x] My PR's scope is as isolated as possible, it only solves 1 specific problem
[x] I have requested a Greptile review by commenting @greptileai and received a Confidence Score of at least 4/5 before requesting a maintainer review

Screenshots / Proof of Fix

All 9 tests pass (7 existing + 2 new regression tests):

tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py
✅ test_semantic_filter_basic_filtering PASSED
✅ test_semantic_filter_top_k_limiting PASSED
✅ test_semantic_filter_disabled PASSED
✅ test_semantic_filter_empty_tools PASSED
✅ test_semantic_filter_extract_user_query PASSED
✅ test_semantic_filter_hook_triggers_on_completion PASSED
✅ test_semantic_filter_hook_skips_no_tools PASSED
✅ test_semantic_filter_hook_preserves_native_tools PASSED ← NEW
✅ test_semantic_filter_hook_all_native_tools PASSED ← NEW
9 passed in 2.96s

Type

🐛 Bug Fix

Changes

Problem: SemanticToolFilterHook.async_pre_call_hook passed ALL tools (MCP + native) to filter_tools(), which queries a SemanticRouter built exclusively from the MCP registry. Native tools have no entry in the router's _tool_map, so _get_tools_by_names() silently drops them. The LLM never receives native tools → responds in prose or the upstream provider returns 400.

Fix (in hook.py):

Added _is_mcp_tool() helper that checks if a tool's name exists in self.filter._tool_map
Modified async_pre_call_hook to partition tools into native (not in _tool_map) and MCP-registered before filtering
Semantic filter runs only on MCP tools; native tools are merged back unconditionally
Enhanced logging to report native/MCP tool counts separately
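
The partition-then-merge approach described in the fix can be sketched as follows. This is a minimal, self-contained illustration and not the code from the PR: `tool_name`, `partition_tools`, and `filter_preserving_native` are hypothetical helper names, `tool_map` stands in for `self.filter._tool_map`, `semantic_filter` stands in for the real semantic filter, and placing native tools before the MCP tools in the merged list is an assumption.

```python
from typing import Any, Callable, Dict, List, Tuple

Tool = Dict[str, Any]


def tool_name(tool: Tool) -> str:
    """Return the name of an OpenAI-format tool, or "" if it has none."""
    function = tool.get("function")
    if isinstance(function, dict):
        return str(function.get("name", ""))
    return str(tool.get("name", ""))


def partition_tools(
    tools: List[Tool], tool_map: Dict[str, Any]
) -> Tuple[List[Tool], List[Tool]]:
    """Split tools into (native, mcp) by whether their name is in tool_map."""
    native: List[Tool] = []
    mcp: List[Tool] = []
    for tool in tools:
        (mcp if tool_name(tool) in tool_map else native).append(tool)
    return native, mcp


def filter_preserving_native(
    tools: List[Tool],
    tool_map: Dict[str, Any],
    semantic_filter: Callable[[List[Tool]], List[Tool]],
) -> List[Tool]:
    """Run the filter on MCP tools only and merge native tools back in."""
    native, mcp = partition_tools(tools, tool_map)
    kept = semantic_filter(mcp) if mcp else []
    return native + kept  # merge order is an assumption of this sketch


# Hypothetical tool names; only weather_lookup comes from the issue.
tools = [
    {"type": "function", "function": {"name": "weather_lookup"}},
    {"type": "function", "function": {"name": "github-create_issue"}},
    {"type": "function", "function": {"name": "github-list_repos"}},
]
tool_map = {"github-create_issue": object(), "github-list_repos": object()}

# Stand-in filter that keeps only the first MCP tool.
result = filter_preserving_native(tools, tool_map, lambda mcp: mcp[:1])
print([tool_name(t) for t in result])  # ['weather_lookup', 'github-create_issue']
```

With an all-native request the MCP list is empty, the filter is never invoked, and every tool passes through unchanged, which is the scenario the second new regression test covers.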

Tests (in test_semantic_tool_filter.py):

test_semantic_filter_hook_preserves_native_tools — mixed MCP + native tools: verifies native tools survive filtering
test_semantic_filter_hook_all_native_tools — all-native request: verifies all tools pass through when none are MCP-registered

Changed files

litellm/proxy/hooks/mcp_semantic_filter/hook.py (modified, +60/-9)
tests/test_litellm/proxy/_experimental/mcp_server/test_semantic_tool_filter.py (modified, +242/-0)


Original issue report

Check for existing issues

[x] I have searched the existing issues and checked that my issue is not a duplicate.

What happened?

When litellm_settings.mcp_semantic_tool_filter.enabled: true is set, the proxy silently removes every non-MCP tool from the tools array of a /chat/completions (or /responses) request, even when the caller never referenced MCP at all. Concretely, if a client sends a request with a plain OpenAI-format tool definition (a "native" tool the client implements locally, e.g. weather_lookup), the semantic filter runs, decides that none of its (MCP-only) router canonicals match, and writes data["tools"] = [] before the request is forwarded to the upstream LLM. This has two visible failure modes depending on the caller:

1. Tool call never happens. The model receives no tools, so it answers in prose or apologizes that it cannot call the tool, even though the client did send it.
2. BadRequestError from the upstream provider. If the caller also set tool_choice (any non-"none" value), upstream vLLM / OpenAI-compatible backends reject the mutated request with Invalid value for 'tool_choice': 'tool_choice' is only allowed when 'tools' are specified. This is how I originally noticed the bug in OpenCode.
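
The second failure mode can be illustrated with a small stand-in for the kind of validation an OpenAI-compatible backend performs. This is a sketch only: `check_tool_choice` is a hypothetical function, not vLLM's or OpenAI's actual code, and it simply encodes the rule quoted in the error message.

```python
from typing import Any, Dict


def check_tool_choice(payload: Dict[str, Any]) -> None:
    """Raise ValueError if tool_choice is set (and not "none") without tools."""
    tool_choice = payload.get("tool_choice")
    if tool_choice not in (None, "none") and not payload.get("tools"):
        raise ValueError(
            "Invalid value for 'tool_choice': 'tool_choice' is only allowed "
            "when 'tools' are specified."
        )


# The request as the client sent it: one native tool plus tool_choice.
original = {
    "model": "my-model",
    "tools": [{"type": "function", "function": {"name": "weather_lookup"}}],
    "tool_choice": "auto",
}
check_tool_choice(original)  # passes: tools are present

# The request after the filter has emptied the tools list.
mutated = dict(original, tools=[])
try:
    check_tool_choice(mutated)
except ValueError as exc:
    print(f"400: {exc}")
```

The client's payload is valid; only the proxy's mutation makes it invalid, which is why the error is confusing from the caller's side.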

Expected: native (non-MCP) tools pass through the filter unchanged. The semantic filter should only operate on tools that originate from MCP servers the proxy manages.

Actual: native tools are dropped along with any non-matching MCP tools.

Root cause (tracing)

SemanticToolFilterHook.async_pre_call_hook (litellm/proxy/hooks/mcp_semantic_filter/hook.py) passes the entire data["tools"] list (MCP references expanded + any native tools mixed in) as available_tools to self.filter.filter_tools(...).
SemanticMCPToolFilter.filter_tools (litellm/proxy/_experimental/mcp_server/semantic_tool_filter.py) queries self.tool_router, which was built by build_router_from_mcp_registry() and seeded only from global_mcp_server_manager. Its output vocabulary is the set of MCP canonical tool names. Native tools have no route.
_get_tools_by_names(matched_canonicals, available_tools) then keeps only tools whose name resolves to one of those MCP canonicals (strict equality on main). Native tools cannot resolve → silently dropped. The hook then unconditionally overwrites:

data["tools"] = filtered_tools

so the caller has no recovery path.
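
The drop can be reproduced in isolation with a simplified stand-in for the name-matching step. The function below is not the library's `_get_tools_by_names`; it is a hypothetical reduction that keeps only the strict-equality behaviour described above, and `github-create_issue` is an invented MCP canonical name.

```python
from typing import Any, Dict, List

Tool = Dict[str, Any]


def get_tools_by_names(matched_names: List[str], available_tools: List[Tool]) -> List[Tool]:
    """Keep only tools whose name is exactly one of matched_names."""
    wanted = set(matched_names)
    return [t for t in available_tools if t["function"]["name"] in wanted]


# The router can only ever return MCP canonical names.
matched_canonicals = ["github-create_issue"]

# The request carries a single native tool.
data = {"tools": [{"type": "function", "function": {"name": "weather_lookup"}}]}

# The unconditional overwrite from the hook.
data["tools"] = get_tools_by_names(matched_canonicals, data["tools"])
print(data["tools"])  # []
```

Because the router's vocabulary never contains a native tool name, no query can make `weather_lookup` survive this step.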

Steps to Reproduce

Run a LiteLLM proxy with a config that enables the semantic filter:

# config.yaml
model_list:
  - model_name: my-model
    litellm_params:
      model: hosted_vllm/my-model
      api_base: https://my-vllm.example.com/v1
litellm_settings:
  mcp_semantic_tool_filter:
    enabled: true
    embedding_model: qwen3-embedding-4b   # any embedding model your router has
    top_k: 5

(MCP servers do not need to be registered for the bug to reproduce, but having at least one configured is the realistic deployment shape.)

Send a request containing ONLY a native (non-MCP) tool:

curl -s -X POST "$BASE_URL/chat/completions" \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -D - -o /tmp/body.json \
  -d '{
    "model": "my-model",
    "messages": [{"role":"user","content":"what is the weather in tokyo?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "weather_lookup",
        "description": "Look up the current weather for a city",
        "parameters": {"type":"object","properties":{"city":{"type":"string"}},"required":["city"]}
      }
    }],
    "tool_choice": "auto"
  }'

Observe:

Response header x-litellm-semantic-filter: 1->0 (one tool in, zero out).
Response body is a plain-text reply: the model never received weather_lookup.
If the upstream backend validates tool_choice (vLLM does), the request fails with 400 Invalid value for 'tool_choice': 'tool_choice' is only allowed when 'tools' are specified.

Remove "tools" from the same payload and re-run. The x-litellm-semantic-filter header is absent, confirming it was the filter, not any downstream component, that dropped the tool.
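
The header check in the observation step can be automated in a client or test script. The sketch below assumes the header value always has the form `<in>-><out>` as in the reproduction (`1->0`); `parse_filter_header` and `all_tools_dropped` are hypothetical helper names.

```python
import re
from typing import Optional, Tuple


def parse_filter_header(value: Optional[str]) -> Optional[Tuple[int, int]]:
    """Parse an 'x-litellm-semantic-filter' value such as '1->0' into (in, out)."""
    if value is None:
        return None  # header absent: the filter did not run
    match = re.fullmatch(r"\s*(\d+)\s*->\s*(\d+)\s*", value)
    if match is None:
        return None  # unexpected format: do not guess
    return int(match.group(1)), int(match.group(2))


def all_tools_dropped(value: Optional[str]) -> bool:
    """True when at least one tool went in and none came out."""
    counts = parse_filter_header(value)
    return counts is not None and counts[0] > 0 and counts[1] == 0


print(all_tools_dropped("1->0"))  # True
print(all_tools_dropped("3->2"))  # False
print(all_tools_dropped(None))    # False
```

A value of `1->0` on a request that contained only native tools is the signature of this bug.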

Relevant log output

# Response headers on reproduction
HTTP/2 200 
x-litellm-call-id: 32c29115-3cf3-4302-9bfd-6d2392168db3
x-litellm-model-id: f02277f7...
x-litellm-model-group: DeepSeek-V3
x-litellm-semantic-filter: 1->0    # <-- 1 tool in, 0 out

# Response body: model answers in prose, never emits a tool call
# (truncated)
"content":"To get the current weather in Tokyo, I can check a reliable weather
service for you. Would you like me to fetch the latest weather updates for Tokyo
now? Alternatively, you can check real-time weather on websites like: Weather.com,
AccuWeather, Weather Underground..."

What part of LiteLLM is this about?

Proxy

What LiteLLM version are you on ?

v1.82.0

Twitter / LinkedIn details

No response

extent analysis

TL;DR

The issue can be fixed by modifying the SemanticToolFilterHook.async_pre_call_hook to conditionally filter tools based on their origin, allowing native tools to pass through unchanged.

Guidance

Identify the origin of each tool in the data["tools"] list, that is, whether it is an MCP tool or a native tool. PR #26247 does this by checking whether the tool's name exists in self.filter._tool_map.
Run SemanticMCPToolFilter.filter_tools only on the MCP tools, so that it drops non-matching MCP tools and never sees native tools.
Update SemanticToolFilterHook.async_pre_call_hook to merge the native tools back into data["tools"] after filtering, so they are not overwritten by the filtered list.
Verify the fix by sending a request with a native tool and checking that the semantic filter does not remove it.

Example

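# Sketch of a modified SemanticToolFilterHook.async_pre_call_hook (signature simplified)
async def async_pre_call_hook(self, data):
    # ...
    available_tools = data["tools"]
    def is_mcp(tool):
        # OpenAI-format tools keep the name under tool["function"]["name"].
        # Assumption: MCP tools carry an "mcp:" name prefix (see Notes).
        return tool.get("function", {}).get("name", "").startswith("mcp:")
    native_tools = [tool for tool in available_tools if not is_mcp(tool)]
    mcp_tools = [tool for tool in available_tools if is_mcp(tool)]
    # Filter only the MCP subset (other arguments elided; await the call if
    # filter_tools is a coroutine), then merge the native tools back in.
    filtered_mcp_tools = self.filter.filter_tools(mcp_tools)
    data["tools"] = native_tools + filtered_mcp_tools
    # ...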

Notes

The example above assumes that MCP tools can be identified by a name starting with "mcp:". That prefix is an assumption of the example, not something the issue confirms. PR #26247 instead treats a tool as MCP-registered when its name exists in self.filter._tool_map, which does not depend on a naming convention.

Recommendation

Apply the workaround by modifying the SemanticToolFilterHook.async_pre_call_hook to conditionally filter tools based on their origin, as described in the guidance section. This will allow native tools to pass through the semantic filter unchanged, resolving the issue.
