litellm - 💡(How to fix) Fix [Feature]: Server-side WebFetch interception (mirror of WebSearch interception) for non-Anthropic backends [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
BerriAI/litellm#25711Fetched 2026-04-16 06:37:00
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Author
Participants
Timeline (top)
labeled ×3cross-referenced ×1

Error Message

  • Default: require a configured sandboxed provider in fetch_tools (same shape as search_tools already uses). No provider configured → feature disabled with a clear error, not a silent raw fetch.
RAW_BUFFERClick to expand / collapse

Check for existing issues

  • I have searched the existing issues and checked that my issue is not a duplicate.

The Feature

Add a webfetch_interception integration that mirrors the existing websearch_interception architecture (litellm/integrations/websearch_interception/), so that Anthropic-native server-side web_fetch_20250910 tool calls can be transparently executed by the LiteLLM proxy when the actual backend model does not support that server-side tool (Bedrock, Vertex/Gemini, OpenAI, etc.).

Concretely:

  1. A litellm_web_fetch standard tool definition (analogous to litellm_web_search) with a URL (+ optional max_content_tokens / allowed_domains) input schema.
  2. Detection + conversion in litellm/llms/anthropic/experimental_pass_through/messages/handler.py: any incoming tool with type matching web_fetch_* is rewritten to the standard tool before the request reaches the backend, so the backend emits a regular tool_use block instead of failing with invalid_tool_input.
  3. A WebFetchInterceptionLogger (CustomLogger) that implements async_should_run_agentic_loop / async_run_agentic_loop, calls a new litellm.afetch() helper, and feeds the result back into the agentic loop exactly like the search path does.
  4. A pluggable fetch-provider layer under llm_router.fetch_tools. Strongly prefer sandboxed/managed providers (Jina Reader, Firecrawl, Tavily Extract) over a direct httpx fetch — see Motivation for why.
  5. Streaming handled the same way as WebSearch — internally downgrade stream=True to non-streaming during the agentic loop, return the final response to the caller.

Motivation, pitch

Anthropic's web_search_20250305 and web_fetch_20250910 are server-side tools: the client declares the tool type and Anthropic's infrastructure executes it. Claude Code CLI sends these native tool types by default. The moment a user routes Claude Code (or any Anthropic-style client) through LiteLLM to a non-Anthropic backend — Bedrock, Vertex/Gemini, OpenAI — those server-side tools have no one to execute them, and the request fails or returns an empty result.

websearch_interception already solves this for web_search by converting the native tool to litellm_web_search before the backend call and running the fetch locally in an agentic loop. web_fetch is the exact same problem shape and deserves the exact same solution. Without it, users mixing Claude Code + Gemini/Bedrock silently lose the web_fetch capability, even though everything else works.

Related prior art in this repo:

  • litellm/integrations/websearch_interception/ARCHITECTURE.md
  • litellm/integrations/websearch_interception/handler.py, tools.py

Why sandboxed/managed fetch providers are the right default (not raw httpx)

web_fetch is a much bigger security surface than web_search: the proxy is being asked to open an arbitrary URL produced by an LLM. Rolling our own fetcher means LiteLLM takes on responsibility for every one of the following, in-process, per request:

  • SSRF protection (RFC1918 / link-local / loopback / cloud metadata endpoints like 169.254.169.254, metadata.google.internal, etc.)
  • Redirect chain limits + final-host re-validation after each hop (defeats DNS rebinding / redirect-based SSRF)
  • Response size cap and strict per-request timeout
  • Content-type allowlist and charset handling
  • HTML → text extraction with main-content detection, boilerplate stripping, and script/style removal
  • PDF / dynamic-JS page handling (basically requires a headless browser)
  • Robots / rate-limit etiquette so a shared proxy instance does not get the whole org IP-banned

Delegating to Jina Reader / Firecrawl / Tavily Extract collapses almost all of that into the provider's responsibility — they already run the fetch in a sandboxed, isolated environment, handle JS rendering and extraction, and return clean model-ready text. For a multi-tenant proxy like LiteLLM this is a materially safer default than running URL fetches in-process. I would suggest:

  • Default: require a configured sandboxed provider in fetch_tools (same shape as search_tools already uses). No provider configured → feature disabled with a clear error, not a silent raw fetch.
  • Optional: a raw-httpx provider gated behind an explicit opt-in flag (e.g. allow_direct_fetch: true) for self-hosted users who understand the SSRF risk and want to avoid a third-party dependency.

This keeps the guardrail surface inside LiteLLM small and makes the threat model for operators much easier to reason about.

What part of LiteLLM is this about?

Proxy (primary) + SDK (litellm.afetch helper)

extent analysis

TL;DR

Implement a webfetch_interception integration in LiteLLM to mirror the existing websearch_interception architecture, enabling transparent execution of Anthropic-native server-side web_fetch_20250910 tool calls.

Guidance

  • Create a litellm_web_fetch standard tool definition with a URL and optional parameters, analogous to litellm_web_search.
  • Develop a WebFetchInterceptionLogger that implements async_should_run_agentic_loop and async_run_agentic_loop, utilizing a new litellm.afetch() helper to feed results back into the agentic loop.
  • Design a pluggable fetch-provider layer under llm_router.fetch_tools, preferring sandboxed/managed providers like Jina Reader, Firecrawl, or Tavily Extract over direct httpx fetch.
  • Consider implementing a default sandboxed provider requirement in fetch_tools with an optional opt-in flag for raw-httpx provider.

Example

# Example litellm_web_fetch tool definition
tool_definition = {
    "type": "litellm_web_fetch",
    "input_schema": {
        "url": {"type": "string"},
        "max_content_tokens": {"type": "integer", "optional": True},
        "allowed_domains": {"type": "array", "items": {"type": "string"}, "optional": True}
    }
}

Notes

The implementation should prioritize security and sandboxing to mitigate SSRF risks, and consider the trade-offs between managed providers and direct httpx fetch.

Recommendation

Apply a workaround by implementing the webfetch_interception integration with a default sandboxed provider, as it provides a more secure and maintainable solution for executing Anthropic-native server-side web_fetch_20250910 tool calls.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING