hermes - Feature: Content-bound tool call extraction layer (fallback for models emitting tool calls as text/XML)


Problem

When certain LLM providers (MiniMax M2.7, DeepSeek V4, open-weight models on OpenRouter/NIM) emit tool calls as text content instead of via the structured tool_calls response field, Hermes silently discards the tool invocations and displays raw XML/text to the user. The agent loop never executes the tools.

This affects:

  • MiniMax M2.7 via NVIDIA NIM: emits <minimax:tool_call><invoke name="..."><parameter ...> XML blocks in content
  • DeepSeek V4 via certain proxies: emits _execute\ntool_name: ...\ncommand: ... text blocks in content
  • Open-weight models (Hermes-4, Gemma variants) via OpenRouter: emit ◂{...}▸ or <function_call> blocks in content
  • Any model that fails to use structured function calling, regardless of cause

Current behavior

The existing strip_think_blocks() in agent/agent_runtime_helpers.py (L484-527) recognizes and strips 6 XML tool-call tag families from content:

  • <tool_call>...</tool_call>
  • <tool_calls>...</tool_calls>
  • <function_call>...</function_call>
  • etc.

But it only removes them — it never extracts and executes the tool calls. The content is cleaned for display, but the tool invocations are silently lost.

Additionally, the namespace-prefixed <minimax:tool_call> format and the _execute text format are not handled by any code path, so they pass through as raw text.
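
The strip-only behavior described above can be illustrated with a minimal sketch. The regex here is a stand-in written for this example, not the actual pattern used by strip_think_blocks(), and the tool name "terminal" is hypothetical.

```python
import re

# Hypothetical stand-in for one of the tag families that gets stripped.
_STRIP_RE = re.compile(r"<tool_call>.*?</tool_call>", re.DOTALL)


def strip_only(content: str) -> str:
    """Remove tool-call blocks from content without parsing them."""
    return _STRIP_RE.sub("", content).strip()


content = (
    "Checking the repo.\n"
    '<tool_call>{"name": "terminal", "arguments": {"command": "ls"}}</tool_call>'
)
cleaned = strip_only(content)
# The display text is clean, but the invocation itself is gone:
# nothing was parsed, so nothing can be executed.
```

After stripping, the only surviving text is the prose sentence; the JSON payload that named the tool and its arguments is never seen by the agent loop.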

Evidence: Code path analysis

| Format | File:Line | Current handling | Tool executed? |
|---|---|---|---|
| <tool_call>JSON</tool_call> | agent_runtime_helpers.py:486 | Stripped by re.sub | No |
| <function_call>...</function_call> | agent_runtime_helpers.py:486 | Stripped by re.sub | No |
| <function name="...">... | agent_runtime_helpers.py:497 | Stripped (boundary-gated) | No |
| <minimax:tool_call>... | No code exists | Passes through as text | No |
| _execute\ntool_name:... | No code exists | Passes through as text | No |
| ◂{...}▸ (Copilot ACP) | copilot_acp_client.py:30 | Parsed and executed | Yes |
| Structured tool_calls field | chat_completion_helpers.py:484 | Parsed and executed | Yes |

The only code path that extracts tool calls from content text is copilot_acp_client.py — and it's scoped exclusively to the Codex/ACP transport. The main chat completion path (streaming + non-streaming) has zero content→tool-call extraction.

Related issues

  • #27834 — MiniMax/DeepSeek XML tool calls rendered as text
  • #741 — Model outputs tool calls as text (closed as "Hermes model not supported", but the architectural gap remains)
  • #28238 — Strip reasoning_content for providers that reject it
  • #27930 — Strip reasoning_content for OpenAI-compatible providers

Proposed Solution

Content-Bound Tool Call Extraction Layer

Add a content→tool_calls extraction step in build_assistant_message() (chat_completion_helpers.py) that runs before strip_think_blocks(). This creates a fallback path: if the structured tool_calls field is empty but the content contains recognizable tool call patterns, extract them into the standard msg["tool_calls"] format.

Design

# In chat_completion_helpers.py, before strip_think_blocks() call:
import re

def _extract_content_tool_calls(content: str) -> tuple[str, list[dict]]:
    """Extract tool calls embedded in content text.

    Returns (cleaned_content, extracted_tool_calls).
    If tool calls are found, they are removed from content and
    returned in OpenAI tool_calls format.
    """
    tool_calls = []

    # Pattern registry: each entry is (regex, parser_func).
    # The patterns rely on `.*?` spanning newlines, so they need re.DOTALL
    # when compiled or applied.
    patterns = [
        # Generic JSON blocks: ◂{"name":"...","arguments":{...}}▸
        (r'◂\s*(\{.*?\})\s*▸', _parse_json_tool_call),
        # MiniMax XML: <minimax:tool_call><invoke name="..."><parameter ...>
        (r'<minimax:tool_call>\s*<invoke\s+name="([^"]+)">(.*?)</invoke>\s*</minimax:tool_call>', _parse_minimax_xml),
        # Standard XML: <tool_call>...</tool_call> containing JSON
        (r'<tool_call[^>]*>\s*(\{.*?\})\s*</tool_call>', _parse_json_tool_call),
        # Function-call XML: <function_call>{"name":"..."}</function_call>
        (r'<function_call[^>]*>\s*(\{.*?\})\s*</function_call>', _parse_json_tool_call),
        # Gemma style: <function name="...">JSON</function>
        (r'<function\s+name="([^"]+)">\s*(.*?)\s*</function>', _parse_gemma_function),
    ]
    # ... apply patterns, build tool_calls, clean content
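
One way the elided "apply patterns" step could look is sketched below, restricted to three of the tag families. This is an illustration under assumptions, not the proposed implementation: the `<parameter name="...">value</parameter>` syntax for MiniMax is assumed (the issue only shows `<parameter ...>`), the parser bodies are hypothetical, and the call id format is arbitrary.

```python
import json
import re
import uuid


def _to_openai_call(name: str, arguments: dict) -> dict:
    """Wrap a name/arguments pair in the OpenAI tool_calls entry shape."""
    return {
        "id": f"call_{uuid.uuid4().hex[:24]}",  # arbitrary id format
        "type": "function",
        "function": {"name": name, "arguments": json.dumps(arguments)},
    }


def _parse_json_tool_call(match: re.Match):
    try:
        payload = json.loads(match.group(1))
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or "name" not in payload:
        return None
    return _to_openai_call(payload["name"], payload.get("arguments", {}))


# Assumed parameter syntax; the issue elides the real attribute layout.
_PARAM_RE = re.compile(r'<parameter\s+name="([^"]+)">(.*?)</parameter>', re.DOTALL)


def _parse_minimax_xml(match: re.Match):
    args = {key: value.strip() for key, value in _PARAM_RE.findall(match.group(2))}
    return _to_openai_call(match.group(1), args)


_PATTERNS = [
    (re.compile(r'<minimax:tool_call>\s*<invoke\s+name="([^"]+)">(.*?)</invoke>'
                r'\s*</minimax:tool_call>', re.DOTALL), _parse_minimax_xml),
    (re.compile(r'<tool_call[^>]*>\s*(\{.*?\})\s*</tool_call>', re.DOTALL),
     _parse_json_tool_call),
    (re.compile(r'<function_call[^>]*>\s*(\{.*?\})\s*</function_call>', re.DOTALL),
     _parse_json_tool_call),
]


def extract_content_tool_calls(content: str) -> tuple[str, list[dict]]:
    """Return (cleaned_content, tool_calls); calls are ordered by pattern."""
    tool_calls: list[dict] = []
    for regex, parser in _PATTERNS:
        def _replace(match: re.Match, parser=parser) -> str:
            call = parser(match)
            if call is None:
                return match.group(0)  # leave unparseable blocks in place
            tool_calls.append(call)
            return ""
        content = regex.sub(_replace, content)
    return content.strip(), tool_calls
```

Blocks that match a pattern but fail to parse are left in the content, so the later strip_think_blocks() pass can still remove them from display. Note that results are grouped by pattern rather than by position in the text; an implementation that needs source order would have to sort by match offset.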

Key properties

  1. Fallback-only: Only activates when assistant_message.tool_calls is empty/None. Structured tool calls always take priority.
  2. Pattern registry: New XML/text formats can be added without touching core logic. Providers register their format.
  3. Zero breaking changes: Existing behavior is preserved — strip_think_blocks() still runs on the remaining content.
  4. Reuse existing code: copilot_acp_client.py already has _TOOL_CALL_BLOCK_RE and _TOOL_CALL_JSON_RE — these patterns can be consolidated.

Implementation scope

| Component | Lines | Description |
|---|---|---|
| _extract_content_tool_calls() | ~80 | Core extraction function with pattern registry |
| Integration in build_assistant_message() | ~5 | Call before strip_think_blocks() |
| Integration in streaming path | ~10 | Apply after content accumulation |
| Tests | ~120 | Per-pattern tests + integration |
| Total | ~215 | |

Implementation Notes

  • The existing _TOOL_CALL_BLOCK_RE and _TOOL_CALL_JSON_RE in copilot_acp_client.py already handle ◂{...}▸ JSON extraction. These should be consolidated into a shared utility.
  • strip_think_blocks() should be updated to also handle <minimax:tool_call> namespace-prefixed tags (currently missed entirely).
  • For streaming: tool-call XML tags can arrive split across multiple chunks. Content accumulation should buffer partial XML until a complete tag pair is detected, then extract. This is similar to how tool_calls_acc buffers structured tool call deltas.
  • Provider-specific patterns (like <minimax:tool_call>) could be registered via the plugin system if maintainers prefer extensibility over a fixed list.
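
The streaming note above can be sketched as a small buffer that holds back text once an opening tag may have started and releases everything else for display. The class name ToolCallStreamBuffer is hypothetical, and the sketch handles a single tag family (`<tool_call>`) for brevity; it is not how tool_calls_acc works internally.

```python
class ToolCallStreamBuffer:
    """Buffer streamed content until <tool_call> blocks are complete."""

    OPEN = "<tool_call>"
    CLOSE = "</tool_call>"

    def __init__(self) -> None:
        self._buf = ""
        self.blocks: list[str] = []  # raw bodies of completed blocks

    def _partial_open_len(self) -> int:
        """Length of a trailing fragment that could be the start of OPEN."""
        max_len = min(len(self._buf), len(self.OPEN) - 1)
        for n in range(max_len, 0, -1):
            if self._buf.endswith(self.OPEN[:n]):
                return n
        return 0

    def feed(self, chunk: str) -> str:
        """Add a chunk and return the text that is safe to display now."""
        self._buf += chunk
        out = []
        while True:
            start = self._buf.find(self.OPEN)
            if start == -1:
                # No full opening tag: emit all but a possible partial tag.
                cut = len(self._buf) - self._partial_open_len()
                out.append(self._buf[:cut])
                self._buf = self._buf[cut:]
                break
            out.append(self._buf[:start])
            end = self._buf.find(self.CLOSE, start + len(self.OPEN))
            if end == -1:
                # Block opened but not closed yet: keep buffering it.
                self._buf = self._buf[start:]
                break
            self.blocks.append(self._buf[start + len(self.OPEN):end])
            self._buf = self._buf[end + len(self.CLOSE):]
        return "".join(out)

    def flush(self) -> str:
        """Return whatever is still buffered when the stream ends."""
        rest, self._buf = self._buf, ""
        return rest


buf = ToolCallStreamBuffer()
chunks = ["Running it. <tool", '_call>{"name": "term', 'inal"}</tool_', "call> Done."]
shown = "".join(buf.feed(chunk) for chunk in chunks)
```

Completed block bodies accumulate in buf.blocks and can then be handed to the same parsers used on the non-streaming path. If the stream ends with an unclosed block, flush() returns it as plain text so nothing is silently dropped.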

Impact

  • Fixes #27834 without requiring model-side changes
  • Addresses the architectural gap that caused #741 (any model emitting tool calls as text)
  • Improves reliability for all OpenRouter/NIM/proxy configurations where structured tool calls may be lost
  • Negligible performance cost: extraction only runs when tool_calls is empty, which covers plain-text replies as well as the failure case, and it amounts to a few regex scans over the content
