hermes - 💡(How to fix) Fix BUG: tool_use parser fails intermittently when closing tag appears inside JSON strings

StepCodex · 2026-05-09T16:26:59Z

[hermes] When using provider=claude-subscription or any OpenAI-compatible endpoint with Claude models , the model occasionally emits tool calls as {...} ` XML text in the `content` field instead of structured `tool_calls`. The existing fallback parser in `_extract_tool_use_from_content()` fails intermittently when the JSON arguments contain ` ` inside a string value (e.g. `write_file` content with HTML tags). ## Summary When using `provider=claude-subscription` (or any OpenAI-compatible endpoint with Claude models), the model occasionally emits tool calls as ` {...} ` XML text in the `content` field instead of structured `tool_calls`. The existing fallback parser in `_extract_tool_use_from_content()` fails intermittently when the JSON arguments contain ` ` inside a string value (e.g. `write_file` content with HTML tags). ## Root Cause The current regex-based parser at `run_agent.py:1556`: ```python pattern = re.compile( r" \s*(.*?)\s* | \s*(.*)", re.DOTALL ) ``` The lazy quantifier `(.*?)` matches the **first** ` ` it encounters, even if that tag is inside a JSON string. This produces invalid JSON that fails `json.loads()`, causing the entire tool_use block to be treated as plain text and displayed to the user. ### Example failure case ```python text = ' {"name": "write_file", "arguments": {"content": " "}} ' # Regex captures: '{"name": "write_file", "arguments": {"content": "' # json.loads() fails → tool_use treated as TEXT ``` ## Why intermittent - Works when JSON does NOT contain ` ` in strings - Fails when JSON contains ` ` in strings (write_file with HTML, web content, etc.) - "Re-emit works" because the model generates different JSON on retry, or the provider correctly converts to structured `tool_calls` ## Files affected - `run_agent.py:1536-1585` — `_extract_tool_use_from_content()` method - `run_agent.py:5137-5150` — fallback invocation in `_build_assistant_message()` ## Proposed Fix Replace the regex with an iterative scanner that finds the first ` ` producing valid JSON with a `"name"` field: ```python def _extract_tool_use_from_content(self, text: str): ... while True: start_idx = remaining.find(" ") ... # Scan for first that produces valid JSON with "name" while True: candidate_end = remaining.find(" ", search_pos) candidate_json = remaining[content_start:candidate_end].strip() try: tc_data = json.loads(candidate_json) if "name" in tc_data: end_idx = candidate_end break except (json.JSONDecodeError, KeyError): pass search_pos = candidate_end + 1 ... ``` ## Test Plan 1. Run `python test_tool_use_reproducer.py` — all 9 edge cases should pass 2. Critical new edge cases: - ` ` inside JSON string value - Multiple tool_use blocks with mixed content - Explanatory text mentioning ` ` before actual tool call ## Environment - Provider: `claude-subscription` (OpenAI-compatible endpoint) - Profile: `victor` - Hermes version: latest main - Recurrent: yes - Non-deterministic: yes (depends on model output) ## Deliverables - [x] Root cause identified - [x] Fix proposed with diff - [x] Test reproducer updated with edge cases - [ ] PR opened (pending maintainer review) --- /cc @sergionsantos

hermes2026-05-09 16:26:59

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

When using provider=claude-subscription (or any OpenAI-compatible endpoint with Claude models), the model occasionally emits tool calls as <tool_use>{...}</tool_use> XML text in the content field instead of structured tool_calls. The existing fallback parser in _extract_tool_use_from_content() fails intermittently when the JSON arguments contain </tool_use> inside a string value (e.g. write_file content with HTML tags).

Root Cause

The current regex-based parser at run_agent.py:1556:

pattern = re.compile(
    r"<tool_use>\s*(.*?)\s*</tool_use>|<tool_use>\s*(.*)", re.DOTALL
)

The lazy quantifier (.*?) matches the first </tool_use> it encounters, even if that tag is inside a JSON string. This produces invalid JSON that fails json.loads(), causing the entire tool_use block to be treated as plain text and displayed to the user.

Code Example

pattern = re.compile(
    r"<tool_use>\s*(.*?)\s*</tool_use>|<tool_use>\s*(.*)", re.DOTALL
)

---

text = '<tool_use>{"name": "write_file", "arguments": {"content": "</tool_use>"}}</tool_use>'
# Regex captures: '{"name": "write_file", "arguments": {"content": "'
# json.loads() fails → tool_use treated as TEXT

---

def _extract_tool_use_from_content(self, text: str):
    ...
    while True:
        start_idx = remaining.find("<tool_use>")
        ...
        # Scan for first </tool_use> that produces valid JSON with "name"
        while True:
            candidate_end = remaining.find("</tool_use>", search_pos)
            candidate_json = remaining[content_start:candidate_end].strip()
            try:
                tc_data = json.loads(candidate_json)
                if "name" in tc_data:
                    end_idx = candidate_end
                    break
            except (json.JSONDecodeError, KeyError):
                pass
            search_pos = candidate_end + 1
    ...

RAW_BUFFERClick to expand / collapse

Summary

Root Cause

The current regex-based parser at run_agent.py:1556:

pattern = re.compile(
    r"<tool_use>\s*(.*?)\s*</tool_use>|<tool_use>\s*(.*)", re.DOTALL
)

Example failure case

text = '<tool_use>{"name": "write_file", "arguments": {"content": "</tool_use>"}}</tool_use>'
# Regex captures: '{"name": "write_file", "arguments": {"content": "'
# json.loads() fails → tool_use treated as TEXT

Why intermittent

Works when JSON does NOT contain </tool_use> in strings
Fails when JSON contains </tool_use> in strings (write_file with HTML, web content, etc.)
"Re-emit works" because the model generates different JSON on retry, or the provider correctly converts to structured tool_calls

Files affected

run_agent.py:1536-1585 — _extract_tool_use_from_content() method
run_agent.py:5137-5150 — fallback invocation in _build_assistant_message()

Proposed Fix

Replace the regex with an iterative scanner that finds the first </tool_use> producing valid JSON with a "name" field:

def _extract_tool_use_from_content(self, text: str):
    ...
    while True:
        start_idx = remaining.find("<tool_use>")
        ...
        # Scan for first </tool_use> that produces valid JSON with "name"
        while True:
            candidate_end = remaining.find("</tool_use>", search_pos)
            candidate_json = remaining[content_start:candidate_end].strip()
            try:
                tc_data = json.loads(candidate_json)
                if "name" in tc_data:
                    end_idx = candidate_end
                    break
            except (json.JSONDecodeError, KeyError):
                pass
            search_pos = candidate_end + 1
    ...

Test Plan

Run python test_tool_use_reproducer.py — all 9 edge cases should pass
Critical new edge cases:
- </tool_use> inside JSON string value
- Multiple tool_use blocks with mixed content
- Explanatory text mentioning <tool_use> before actual tool call

Environment

Provider: claude-subscription (OpenAI-compatible endpoint)
Profile: victor
Hermes version: latest main
Recurrent: yes
Non-deterministic: yes (depends on model output)

Deliverables

Root cause identified
Fix proposed with diff
Test reproducer updated with edge cases
PR opened (pending maintainer review)

/cc @sergionsantos

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#environment variable #network issue #logging issue #authentication issue #prompt issue

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

hermes - 💡(How to fix) Fix BUG: tool_use parser fails intermittently when closing tag appears inside JSON strings

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Summary

Root Cause

Example failure case

Why intermittent

Files affected

Proposed Fix

Test Plan

Environment

Deliverables

Still need to ship something?

TRENDING

hermes - 💡(How to fix) Fix BUG: tool_use parser fails intermittently when closing tag appears inside JSON strings

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Code Example

Summary

Root Cause

Example failure case

Why intermittent

Files affected

Proposed Fix

Test Plan

Environment

Deliverables

Still need to ship something?

RELATED_DISCOVERY

TRENDING