hermes - 💡(How to fix) Fix [Feature] Support Anthropic Tool Search for MCP tools [1 participants]

GitHub stats
NousResearch/hermes-agent#18074 (fetched 2026-05-01 05:54:04)
Comments: 0 · Participants: 1 · Timeline events: 5 (labeled ×5) · Reactions: 0

Hermes Agent v0.11.0 sends the full schema of every registered MCP tool on every API call, regardless of relevance to the current turn. With multiple MCP servers connected, this becomes a significant fraction of every prompt.

Anthropic shipped Tool Search Tool (beta) precisely to solve this: tools marked defer_loading: true are not loaded into the model context until the model actively searches for them. The feature is supported across Anthropic API, Bedrock, Vertex AI, and Azure (i.e. all paths Hermes already uses).

This issue requests Hermes config support to opt MCP tools into Tool Search.

Why this matters (numbers)

  • Anthropic reports ~85% reduction in tool-definition tokens on real catalogs (source).
  • Accuracy improves rather than degrades: Anthropic's internal MCP evals went from 49% → 74% on Opus 4 and from 79.5% → 88.1% on Opus 4.5 with Tool Search enabled (a smaller visible catalog reduces wrong-tool selections by removing decision paralysis).
  • Independent confirmation: arXiv 2604.21816 "Tool Attention" measures the "MCP Tools Tax" at 15-60K tokens/turn for typical multi-server deployments.
  • Related: existing issue #4379 ("Token overhead analysis: 73% of each API call is fixed overhead") quantifies the problem from the Hermes side.

Concrete sample

A real Hermes deployment (5 MCP servers, 34 tools total, mostly Haiku 4.5 via Bedrock) shows average prompt size 45K tokens/turn, of which ~22K is tool schema overhead (≈50%). Cache miss generations (start of session) cost $0.07-$0.10; cache hits cost $0.007. Tool Search would shrink the schema portion of the cache miss without affecting the rest of the prompt.
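
As a rough back-of-the-envelope check, the sketch below applies Anthropic's reported ~85% reduction to the 22K-token schema portion of this sample. Treating that figure as transferable to this deployment is an assumption, the helper name is hypothetical, and the tokens added by the tool search tool itself are not counted.

```python
def estimate_prompt_after_tool_search(prompt_tokens, schema_tokens, reduction=0.85):
    """Estimate prompt size if `reduction` of the schema tokens are deferred.

    `reduction` defaults to Anthropic's reported ~85%; whether it carries over
    to a given deployment is an assumption.
    """
    remaining_schema = schema_tokens * (1.0 - reduction)
    new_prompt = prompt_tokens - schema_tokens + remaining_schema
    return round(new_prompt), round(remaining_schema)


# Sample deployment from this issue: 45K prompt tokens, ~22K of them tool schemas.
new_prompt, new_schema = estimate_prompt_after_tool_search(45_000, 22_000)
print(f"estimated prompt: {new_prompt} tokens, schema portion: {new_schema} tokens")
```

Under these assumptions the cache-miss prompt would drop from about 45K to about 26K tokens, with the non-schema part of the prompt unchanged.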

Proposed config surface

In ~/.hermes/config.yaml:

mcp_servers:
  vault-bridge:
    transport: streamable-http
    url: http://...:3142/mcp
    defer_loading: true   # NEW: marks all tools from this server as defer_loading=true

  cve-lookup:
    transport: stdio
    command: /usr/bin/node
    args: [...]
    defer_loading: true
    # OR per-tool override:
    tool_overrides:
      cve_lookup: { defer_loading: false }   # always-loaded for hot tools

agent:
  tool_search:
    enabled: true
    variant: bm25   # bm25 | regex
    # When enabled, Hermes adds the tool_search_tool definition to the tools array
    # and sends the `anthropic-beta: advanced-tool-use-2025-11-20` header on Anthropic-family providers
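
The precedence implied by this config (a per-tool override wins over the server-level flag, and everything defaults to false) could be resolved along the following lines. This is a minimal sketch, not Hermes code: the function name and the plain-dict representation of the parsed config are assumptions.

```python
def resolve_defer_loading(server_cfg: dict, tool_name: str) -> bool:
    """Return the effective defer_loading flag for one tool of one MCP server.

    A per-tool entry under tool_overrides takes precedence; otherwise the
    server-level defer_loading applies, defaulting to False (opt-in only).
    """
    overrides = server_cfg.get("tool_overrides") or {}
    tool_cfg = overrides.get(tool_name) or {}
    if "defer_loading" in tool_cfg:
        return bool(tool_cfg["defer_loading"])
    return bool(server_cfg.get("defer_loading", False))


# Parsed form of the cve-lookup server block above (other keys omitted).
cve_lookup_cfg = {
    "defer_loading": True,
    "tool_overrides": {"cve_lookup": {"defer_loading": False}},
}

print(resolve_defer_loading(cve_lookup_cfg, "cve_lookup"))   # hot tool, always loaded
print(resolve_defer_loading(cve_lookup_cfg, "other_tool"))   # inherits the server flag
```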

Implementation notes

  1. Header propagation: send anthropic-beta: advanced-tool-use-2025-11-20 on the Anthropic, Bedrock, Vertex, and Azure paths, after confirming the exact beta identifier each platform expects. Skip it on non-Anthropic providers (graceful degradation; Hermes already supports fallback model logic).
  2. tool_search_tool injection: when agent.tool_search.enabled: true, prepend the synthetic {"type": "tool_search_tool_bm25_20251119"} entry (or the regex variant, tool_search_tool_regex_20251119) to the tools array.
  3. Defer flag: when serializing each MCP-discovered tool to the API request, include defer_loading: <bool> as configured.
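  4. Beta MCP header: Anthropic's docs also list mcp-client-2025-11-20, which applies when tools are supplied through the API-side MCP connector; confirm whether it is needed when Hermes passes MCP-discovered tools directly in the tools array.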
  5. Backward compat: default to tool_search.enabled: false and defer_loading: false, so the feature is opt-in only and existing deployments are unchanged.
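
Steps 1 to 3 and 5 above can be sketched as a single request-building helper. This is an illustration under stated assumptions rather than the Hermes implementation: the function name, the dict shape of a tool, and the `name` given to the search tool are hypothetical (the issue specifies only its `type`), and the sketch omits defer_loading when it is false instead of sending an explicit false.

```python
TOOL_SEARCH_TYPES = {
    "bm25": "tool_search_tool_bm25_20251119",
    "regex": "tool_search_tool_regex_20251119",
}
BETA_HEADER = "advanced-tool-use-2025-11-20"


def build_request_parts(mcp_tools, tool_search_cfg, is_anthropic_family):
    """Return (tools, extra_headers) for one API call.

    mcp_tools: list of dicts with name, description, input_schema and an
    already-resolved boolean "defer_loading" key (hypothetical shape).
    """
    # Opt-in only, and a no-op on non-Anthropic providers.
    enabled = bool(tool_search_cfg.get("enabled", False)) and is_anthropic_family

    tools = []
    for tool in mcp_tools:
        entry = {k: v for k, v in tool.items() if k != "defer_loading"}
        if enabled and tool.get("defer_loading"):
            entry["defer_loading"] = True
        tools.append(entry)

    if not enabled:
        return tools, {}

    variant = tool_search_cfg.get("variant", "bm25")
    search_tool = {
        "type": TOOL_SEARCH_TYPES[variant],
        "name": f"tool_search_tool_{variant}",  # assumed name, not given in the issue
    }
    # The search tool itself is never deferred and goes first in the array.
    return [search_tool] + tools, {"anthropic-beta": BETA_HEADER}


example_tools = [
    {"name": "cve_lookup", "description": "Look up a CVE",
     "input_schema": {"type": "object"}, "defer_loading": False},
    {"name": "vault_read", "description": "Read a note",
     "input_schema": {"type": "object"}, "defer_loading": True},
]
tools, headers = build_request_parts(example_tools, {"enabled": True, "variant": "bm25"}, True)
print([t.get("name") for t in tools], headers)
```

Stripping the flag entirely when Tool Search is off keeps requests to non-Anthropic providers byte-for-byte identical to what Hermes sends today, which is one way to satisfy the backward-compatibility note.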

Related issues

  • #4379 — Token overhead analysis (73% fixed overhead). Tool Search directly attacks this.
  • #18051 — MCP utility stubs registered even when server doesn't advertise capabilities. Tool Search reduces the cost of those stubs being present.

Acceptance criteria

  • mcp_servers.<name>.defer_loading (bool) accepted and propagated to all tools from that server
  • mcp_servers.<name>.tool_overrides.<tool>.defer_loading (bool) optional per-tool override
  • agent.tool_search.enabled: true injects the tool_search_tool_bm25_20251119 tool (or the regex variant, per agent.tool_search.variant) and sends the advanced-tool-use-2025-11-20 header on Anthropic-family providers
  • No-op on non-Anthropic providers (graceful degradation)
  • Documentation updated in docs/user-guide/features/tools/
  • Telemetry: log when Tool Search is engaged + how many tools were deferred
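
The telemetry criterion could be met with a single log line per request. The sketch below uses Python's standard logging module; the logger name and function name are hypothetical, and it assumes each tool is a dict that carries defer_loading only when deferred.

```python
import logging

logger = logging.getLogger("hermes.tool_search")  # hypothetical logger name


def log_tool_search_engaged(tools):
    """Log how many tools in the outgoing request are deferred; return that count."""
    deferred = sum(1 for tool in tools if tool.get("defer_loading"))
    logger.info("Tool Search engaged: %d of %d tools deferred", deferred, len(tools))
    return deferred


logging.basicConfig(level=logging.INFO)
log_tool_search_engaged([
    {"name": "cve_lookup"},
    {"name": "vault_read", "defer_loading": True},
])
```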

extent analysis

TL;DR

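Once Hermes implements the proposed config, Tool Search would be enabled by setting agent.tool_search.enabled: true and marking MCP servers or tools with defer_loading: true to reduce schema overhead. Both keys are proposed in this issue and are not yet supported.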

Guidance

  • Once supported, add defer_loading: true to MCP servers in ~/.hermes/config.yaml (or set it per tool via tool_overrides) to opt them into Tool Search.
  • Set agent.tool_search.enabled: true so that Hermes injects the tool_search_tool definition.
  • Verify that the anthropic-beta: advanced-tool-use-2025-11-20 header is sent on Anthropic-family providers.
  • Test with a sample deployment and compare schema overhead and total prompt tokens before and after.

Example

mcp_servers:
  vault-bridge:
    transport: streamable-http
    url: http://...:3142/mcp
    defer_loading: true

agent:
  tool_search:
    enabled: true
    variant: bm25

Notes

The proposal requires setting defer_loading for each MCP server or tool as well as enabling Tool Search in the agent settings. Both the tool_search_tool injection and the header propagation should be verified.

Recommendation

Adopt the proposal: enabling Tool Search with defer_loading on MCP tools is expected to cut schema overhead substantially and, per Anthropic's internal evaluations, to improve tool-selection accuracy.
