hermes - ✅(Solved) Fix [Feature]: Expose limit parameter and document query operators in web_search tool [1 pull requests, 1 participants]

GitHub stats

NousResearch/hermes-agent#16696 (fetched 2026-04-28 06:51:32)

  • Comments: 0
  • Participants: 1
  • Timeline events: 4
  • Reactions: 0
  • Timeline (top): labeled ×3, cross-referenced ×1

Fix Action

Fix / Workaround

Motivation

The current web_search tool only accepts a query string, hardcoding limit=5 and offering no documented way to use search query operators. This means:

  • The LLM cannot request more than 5 results without workarounds, which is limiting for research-heavy tasks.
  • Query operators like site:, filetype:pdf, intitle:, and -exclude work with most backends but are undocumented, so the LLM never discovers them.
  • The tool description is minimal ("Search the web for information on any topic"), giving the LLM no guidance on advanced usage.

PR fix notes

PR #16808: feat(web): expose limit for web_search

Description (problem / solution / changelog)

What does this PR do?

Adds an optional limit parameter to the web_search tool schema and wires it through to the existing web_search_tool() handler.

The default stays at 5 results, so existing tool calls keep the same behavior. Callers can now request a larger or smaller result set when needed.
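
The wiring described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the stub body of `web_search_tool` and the `handle_web_search` name are assumptions made here, and only the `web_search_tool(query, limit)` call shape and the default of 5 come from this page.

```python
import json

DEFAULT_LIMIT = 5  # the default the PR description says it keeps


def web_search_tool(query: str, limit: int = DEFAULT_LIMIT) -> str:
    """Stand-in for the real function: echoes what a backend would be asked for."""
    return json.dumps({"query": query, "limit": limit})


def handle_web_search(args: dict, **kw) -> str:
    """Hypothetical registered handler: forwards the optional limit from the tool args."""
    return web_search_tool(
        args.get("query", ""),
        limit=args.get("limit", DEFAULT_LIMIT),
    )


# A call without limit behaves as before; a call with limit passes it through.
print(handle_web_search({"query": "hermes agent"}))
print(handle_web_search({"query": "hermes agent", "limit": 20}))
```

Because the fallback in the handler matches the function default, tool calls that omit limit should see no behavior change under these assumptions.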

Related Issue

Fixes #16696

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 🔒 Security fix
  • 📝 Documentation update
  • ✅ Tests (adding or improving test coverage)
  • ♻️ Refactor (no behavior change)
  • 🎯 New skill (bundled or hub)

Changes Made

  • Added optional limit to the web_search tool schema with min 1, max 100, and default 5.
  • Passed the requested limit from the registered handler into web_search_tool().
  • Clamped runtime limit values to 1..100 before calling the configured backend.
  • Updated the tool description to mention common query operators such as site:, filetype:, intitle:, -term, and exact phrases, while noting that support depends on the backend.
  • Added tests for schema shape, handler wiring, default limit behavior, and runtime clamping.
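
The runtime clamping listed above might look roughly like the sketch below. The `clamp_limit` name and the fallback to the default on non-numeric input are assumptions for illustration; the page only states that values are clamped to 1..100 before the backend is called.

```python
MIN_LIMIT, MAX_LIMIT, DEFAULT_LIMIT = 1, 100, 5  # bounds and default listed above


def clamp_limit(value, default: int = DEFAULT_LIMIT) -> int:
    """Coerce a caller-supplied limit to an int within MIN_LIMIT..MAX_LIMIT.

    Falling back to the default on non-numeric input is an assumption of this sketch.
    """
    try:
        number = int(value)
    except (TypeError, ValueError, OverflowError):
        return default
    return max(MIN_LIMIT, min(MAX_LIMIT, number))


# Out-of-range values are pulled back into range; junk falls back to the default.
for raw in (0, 5, 250, "12", None):
    print(repr(raw), "->", clamp_limit(raw))
```

Clamping at runtime in addition to declaring bounds in the schema guards against callers that do not validate arguments against the schema before invoking the tool.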

How to Test

  1. source /Users/blackishgreen03/workspace/hermes-agent/venv/bin/activate && python -m pytest tests/tools/test_web_tools_config.py -q
  2. Result: 49 passed
  3. Full suite was attempted locally on the same latest upstream base with python -m pytest tests/ -q, but this local environment is not clean for the whole repository: it fails with missing optional packages such as acp, numpy, and fastapi, plus many unrelated existing gateway/web/tool failures (16556 passed, 117 skipped, 191 failed, 66 errors).

Checklist

Code

  • I've read the Contributing Guide
  • My commit messages follow Conventional Commits (fix(scope):, feat(scope):, etc.)
  • I searched for existing PRs to make sure this isn't a duplicate
  • My PR contains only changes related to this fix/feature (no unrelated commits)
  • I've run pytest tests/ -q and all tests pass
  • I've added tests for my changes (required for bug fixes, strongly encouraged for features)
  • I've tested on my platform: macOS, Python 3.11 using the existing local repo venv

Documentation & Housekeeping

  • I've updated relevant documentation (README, docs/, docstrings) — or N/A
  • I've updated cli-config.yaml.example if I added/changed config keys — or N/A
  • I've updated CONTRIBUTING.md or AGENTS.md if I changed architecture or workflows — or N/A
  • I've considered cross-platform impact (Windows, macOS) per the compatibility guide — or N/A
  • I've updated tool descriptions/schemas if I changed tool behavior — or N/A

For New Skills

N/A

Screenshots / Logs

Targeted test:

49 passed in 2.25s

Changed files

  • tests/tools/test_web_tools_config.py (modified, +48/-0)
  • tools/web_tools.py (modified, +16/-3)


Problem or Use Case

I worked with my Hermes Agent to improve the web_search tool and wanted to share the changes we made. I think they make the current web_search more functional. For context, I am using a local Firecrawl instance with SearXNG as its backend.

Motivation

The current web_search tool only accepts a query string, hardcoding limit=5 and offering no documented way to use search query operators. This means:

  • The LLM cannot request more than 5 results without workarounds, which is limiting for research-heavy tasks.
  • Query operators like site:, filetype:pdf, intitle:, and -exclude work with most backends but are undocumented, so the LLM never discovers them.
  • The tool description is minimal ("Search the web for information on any topic"), giving the LLM no guidance on advanced usage.
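
To make the operators mentioned above concrete, here is a small sketch that assembles query strings from them. The `build_query` helper is hypothetical and not part of hermes-agent; the operators are simply concatenated into the query text, and whether a given backend honors them is not guaranteed.

```python
from typing import Iterable, Optional


def build_query(
    terms: str,
    site: Optional[str] = None,
    filetype: Optional[str] = None,
    intitle: Optional[str] = None,
    exclude: Iterable[str] = (),
    exact: Optional[str] = None,
) -> str:
    """Assemble a search string from plain terms plus common operators (hypothetical helper)."""
    parts = []
    if site:
        parts.append(f"site:{site}")
    if filetype:
        parts.append(f"filetype:{filetype}")
    if intitle:
        parts.append(f"intitle:{intitle}")
    if exact:
        parts.append(f'"{exact}"')
    parts.append(terms)
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(part for part in parts if part)


print(build_query("LLM fine-tuning", site="arxiv.org"))
print(build_query("machine learning survey", filetype="pdf", exclude=["slides"]))
```

The two calls reproduce the style of the example queries quoted in the proposed schema description, 'site:arxiv.org LLM fine-tuning' and 'filetype:pdf machine learning survey', with an extra exclusion term on the second.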

Proposed Solution

Verified changes (tested on self-hosted Firecrawl + SearXNG backend)

All changes are backend-agnostic — they only expose existing functionality that Firecrawl, Tavily, Exa, and Parallel already support.

File: tools/web_tools.py

1. Update WEB_SEARCH_SCHEMA

# Before
WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web for information on any topic. Returns up to 5 relevant results with titles, URLs, and descriptions.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query to look up on the web"
            }
        },
        "required": ["query"]
    }
}

# After
WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web for information. Returns results with titles, URLs, and descriptions. Use query operators for targeted filtering: site:domain (restrict to a domain), intitle:word (title must contain), allintitle:word (all words in title), filetype:ext (file type, e.g. filetype:pdf), inurl:word, allinurl:word, related:domain (find similar sites), -term (exclude), \"exact phrase\" (exact match). For large content extraction, use web_extract on returned URLs.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query. Supports operators: site:domain (restrict to domain), intitle:word, allintitle:word, filetype:ext (e.g. filetype:pdf), inurl:word, allinurl:word, related:domain (similar sites), -term (exclude results containing term), \"exact phrase\" (exact match). Example: 'site:arxiv.org LLM fine-tuning' or 'filetype:pdf machine learning survey'"
            },
            "limit": {
                "type": "integer",
                "description": "Maximum number of results to return (default: 10, max: 100). Use higher limits when you need more candidates for downstream extraction.",
                "default": 10
            }
        },
        "required": ["query"]
    }
}

2. Update function signature and docstring

# Before
def web_search_tool(query: str, limit: int = 5) -> str:

# After
def web_search_tool(query: str, limit: int = 10) -> str:

3. Update the handler lambda

# Before
handler=lambda args, **kw: web_search_tool(args.get("query", ""), limit=5),

# After
handler=lambda args, **kw: web_search_tool(args.get("query", ""), limit=args.get("limit", 10)),

A few notes:

  • Default limit 5 → 10: This doubles results for cloud-tier users (higher credit/token cost per search). If maintainers prefer cost conservatism, keeping limit=5 as default while still exposing the parameter is a reasonable alternative.
  • Query operators are backend-agnostic: Operators pass through the query string. Backends that support them (Firecrawl, Tavily) honor them; those that don't simply ignore them. No harm either way.
  • No backend-specific code changes: All four backends (Firecrawl, Tavily, Exa, Parallel) already support limit. This change only wires an existing parameter through to the tool schema and handler.
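
The pass-through idea in these notes can be illustrated with a toy dispatcher. Everything here is an assumption for illustration: the fake backends, the `BACKENDS` table, and `run_search` are not the real hermes-agent or vendor APIs, and they only show that the query string (operators included) and the limit are forwarded unchanged.

```python
from typing import Callable, Dict, List

SearchFn = Callable[[str, int], List[str]]


def fake_backend(name: str) -> SearchFn:
    """Build a hypothetical backend that returns `limit` placeholder result titles."""
    def _search(query: str, limit: int) -> List[str]:
        return [f"{name} result {i + 1} for {query!r}" for i in range(limit)]
    return _search


# Backend names are taken from the notes above; the callables are stand-ins.
BACKENDS: Dict[str, SearchFn] = {
    name: fake_backend(name) for name in ("firecrawl", "tavily", "exa", "parallel")
}


def run_search(backend: str, query: str, limit: int = 5) -> List[str]:
    """Forward the query and limit untouched to the selected backend."""
    return BACKENDS[backend](query, limit)


results = run_search("tavily", "site:arxiv.org LLM fine-tuning", limit=3)
print(len(results))
print(results[0])
```

Keeping the dispatcher free of operator parsing is what makes the change small: the tool layer stays unaware of search syntax and leaves interpretation to whichever backend is configured.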

Alternatives Considered

No response

Feature Type

Performance / reliability

Scope

Small (single file, < 50 lines)

Contribution

  • I'd like to implement this myself and submit a PR

Debug Report (optional)

extent analysis

TL;DR

Update the WEB_SEARCH_SCHEMA and web_search_tool function to expose the limit parameter and document query operators for improved search functionality.

Guidance

  • Update the WEB_SEARCH_SCHEMA to include the limit parameter and document query operators in the description.
  • Change the default of the existing limit parameter in the web_search_tool function signature from 5 to 10 (the function already accepts limit).
  • Update the handler lambda to pass the limit value from the args dictionary to the web_search_tool function.
  • Consider keeping the default limit as 5 to conserve costs, while still exposing the parameter for users who need more results.

Example

WEB_SEARCH_SCHEMA = {
    "name": "web_search",
    "description": "Search the web for information. Returns results with titles, URLs, and descriptions. Use query operators for targeted filtering: site:domain, intitle:word, filetype:ext, etc.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "The search query. Supports operators: site:domain, intitle:word, filetype:ext, etc."
            },
            "limit": {
                "type": "integer",
                "description": "Maximum number of results to return (default: 10, max: 100)",
                "default": 10
            }
        },
        "required": ["query"]
    }
}

Notes

The proposed changes are backend-agnostic and only expose existing functionality, so no harm is expected if some backends do not support the query operators.

Recommendation

Apply the proposed changes to update the WEB_SEARCH_SCHEMA and web_search_tool function, as they improve the search functionality and provide more flexibility for users.
