hermes - Bug: tool_use regex parser is not JSON-aware, so blocks pass through as raw text



Bug: tool_use blocks occasionally arrive as raw text instead of being parsed (provider: claude-subscription)

Summary

When using the claude-subscription provider, <tool_use>{...}</tool_use> blocks emitted by the model occasionally arrive at the user as plain text instead of being parsed and executed. When the user says "segue" (continue), the model re-emits the block and it works correctly.

Environment

  • Provider: claude-subscription
  • Affected profiles: victor (and likely others)
  • Occurrence: Recurrent, non-deterministic
  • Workaround: User says "segue" to force re-emit

Root Cause

File: agent/claude_cli_client.py
Lines: 40-42, 206-242

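The regex used to extract tool_use blocks is not JSON-aware:

_TOOL_USE_BLOCK_RE = re.compile(
    r"<tool_use>\s*(\{.*?\})\s*</tool_use>", re.DOTALL
)

The pattern is purely textual and has no notion of JSON structure. The lazy .*? keeps extending until the rest of the pattern, \}\s*</tool_use>, matches, so the capture ends at the first } that is followed by optional whitespace and </tool_use>, whether or not that brace closes the top-level object. Nesting by itself is not enough to break it: for terminal with {"command": "...", "timeout": 300} the first } is followed by another }, so the match continues to the final brace and the whole object is captured. The capture is cut short when }</tool_use> appears earlier in the text, for example inside a string value. In that case the regex captures an incomplete JSON fragment, json.loads() fails silently (caught by except Exception: continue), and the entire <tool_use> block passes through to the user as raw text.

Example of failure

Input text:

<tool_use>{"id": "t1", "name": "terminal", "input": {"command": "echo '{}</tool_use>'", "timeout": 300}}</tool_use>

Regex captures:

{"id": "t1", "name": "terminal", "input": {"command": "echo '{}

json.loads() fails → block skipped → raw text shown to user.

Why it's non-deterministic

Depends on the content the model emits, not on the tool being called. A payload whose strings happen to contain }</tool_use> fails, while a re-emitted or differently worded call parses. Whether nested input such as terminal with input: {command, timeout} triggers the failure on its own should be confirmed against a captured failing response, since the pattern above does match that shape.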
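What the pattern captures for a given payload can be checked directly. The sketch below assumes only Python's standard re and json modules. It runs the pattern quoted above against a nested terminal payload and against a hypothetical payload whose command string contains }</tool_use>. The helper names captured_json and parses are illustrative and are not part of claude_cli_client.py.

```python
import json
import re
from typing import Optional

# Pattern as quoted in the report.
TOOL_USE_BLOCK_RE = re.compile(
    r"<tool_use>\s*(\{.*?\})\s*</tool_use>", re.DOTALL
)


def captured_json(text: str) -> Optional[str]:
    """Return the capture group of the first match, or None if nothing matches."""
    match = TOOL_USE_BLOCK_RE.search(text)
    return match.group(1) if match else None


def parses(raw: Optional[str]) -> bool:
    """True when the captured fragment is valid JSON."""
    if raw is None:
        return False
    try:
        json.loads(raw)
    except json.JSONDecodeError:
        return False
    return True


# Nested input object, no tag-like text inside any string.
nested = (
    '<tool_use>{"id": "t1", "name": "terminal", '
    '"input": {"command": "ls -la", "timeout": 300}}</tool_use>'
)

# Hypothetical payload: the command string itself contains "}</tool_use>".
tag_in_string = (
    '<tool_use>{"id": "t2", "name": "terminal", '
    '"input": {"command": "echo \'{}</tool_use>\'"}}</tool_use>'
)

for label, sample in [("nested", nested), ("tag in string", tag_in_string)]:
    raw = captured_json(sample)
    print(f"{label}: parses={parses(raw)} captured={raw!r}")
```

Running a captured failing response from the claude CLI through the same two helpers is a quick way to confirm which shape of payload is responsible before changing the parser.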

Proposed Fix

Replace the regex with a brace-counting parser that tracks nesting depth and string state, so the extracted span always ends at the brace that closes the top-level object:

def _extract_tool_uses_from_text(text: str) -> tuple[list[SimpleNamespace], str]:
    """Pull <tool_use>{...}</tool_use> blocks out with proper nested brace handling."""
    if not isinstance(text, str) or "<tool_use>" not in text:
        return [], text or ""
    
    blocks: list[SimpleNamespace] = []
    spans: list[tuple[int, int]] = []
    
    pos = 0
    while True:
        start_tag = text.find("<tool_use>", pos)
        if start_tag == -1:
            break
        json_start = start_tag + len("<tool_use>")
        while json_start < len(text) and text[json_start].isspace():
            json_start += 1
        
        brace_count = 0
        json_end = json_start
        in_string = False
        escape_next = False
        
        for i in range(json_start, len(text)):
            ch = text[i]
            if escape_next:
                escape_next = False
                continue
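            if ch == '\\' and in_string:
                # A backslash escapes the next character only inside a JSON string.
                escape_next = True
                continue
            if ch == '"':
                # Unescaped quote: enter or leave a string literal.
                in_string = not in_string
                continue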
            if not in_string:
                if ch == '{':
                    brace_count += 1
                elif ch == '}':
                    brace_count -= 1
                    if brace_count == 0:
                        json_end = i + 1
                        break
        
        if brace_count != 0:
            pos = start_tag + 1
            continue
        
        end_tag_start = text.find("</tool_use>", json_end)
        if end_tag_start == -1:
            pos = start_tag + 1
            continue
        
        raw = text[json_start:json_end]
        try:
            obj = json.loads(raw)
        except Exception:
            pos = end_tag_start + len("</tool_use>")
            continue
        
        name = obj.get("name")
        if not isinstance(name, str) or not name.strip():
            pos = end_tag_start + len("</tool_use>")
            continue
        call_id = obj.get("id")
        if not isinstance(call_id, str) or not call_id.strip():
            call_id = f"toolu_cli_{uuid.uuid4().hex[:16]}"
        input_data = obj.get("input")
        if not isinstance(input_data, dict):
            input_data = {}
        blocks.append(_tool_use_block(call_id, name.strip(), input_data))
        spans.append((start_tag, end_tag_start + len("</tool_use>")))
        pos = end_tag_start + len("</tool_use>")
    
    if not spans:
        return blocks, text
    parts: list[str] = []
    cursor = 0
    for start, end in spans:
        if cursor < start:
            parts.append(text[cursor:start])
        cursor = max(cursor, end)
    if cursor < len(text):
        parts.append(text[cursor:])
    return blocks, "".join(parts).strip()
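As an alternative to counting braces by hand, the standard library's json.JSONDecoder.raw_decode can parse one JSON value starting at a given index and report where it ends, which leaves string and escape handling to the JSON parser itself. The following is a minimal sketch under stated assumptions, not the project's implementation. The names extract_tool_uses, OPEN_TAG and CLOSE_TAG are hypothetical, and the SimpleNamespace built here stands in for the repository's _tool_use_block helper, whose exact fields are assumed. It is also stricter than the proposed fix in one respect: the closing tag must follow the object with only whitespace in between.

```python
import json
import uuid
from types import SimpleNamespace

OPEN_TAG = "<tool_use>"
CLOSE_TAG = "</tool_use>"
_DECODER = json.JSONDecoder()


def extract_tool_uses(text: str) -> tuple[list[SimpleNamespace], str]:
    """Return (tool-use blocks, text with the accepted blocks removed)."""
    blocks: list[SimpleNamespace] = []
    spans: list[tuple[int, int]] = []
    pos = 0
    while True:
        start = text.find(OPEN_TAG, pos)
        if start == -1:
            break
        # Default: if this candidate is rejected, resume right after its tag.
        pos = start + len(OPEN_TAG)
        json_start = pos
        # raw_decode does not skip leading whitespace, so skip it here.
        while json_start < len(text) and text[json_start].isspace():
            json_start += 1
        try:
            obj, json_end = _DECODER.raw_decode(text, json_start)
        except json.JSONDecodeError:
            continue  # not valid JSON: leave the text untouched
        close_at = json_end
        while close_at < len(text) and text[close_at].isspace():
            close_at += 1
        if not text.startswith(CLOSE_TAG, close_at):
            continue  # closing tag must directly follow the object
        if not isinstance(obj, dict):
            continue
        name = obj.get("name")
        if not isinstance(name, str) or not name.strip():
            continue
        call_id = obj.get("id")
        if not isinstance(call_id, str) or not call_id.strip():
            call_id = f"toolu_cli_{uuid.uuid4().hex[:16]}"
        input_data = obj.get("input")
        if not isinstance(input_data, dict):
            input_data = {}
        end = close_at + len(CLOSE_TAG)
        # Assumed shape; the real code would call _tool_use_block(...) instead.
        blocks.append(
            SimpleNamespace(
                type="tool_use", id=call_id, name=name.strip(), input=input_data
            )
        )
        spans.append((start, end))
        pos = end

    if not spans:
        return blocks, text
    parts: list[str] = []
    cursor = 0
    for span_start, span_end in spans:
        parts.append(text[cursor:span_start])
        cursor = span_end
    parts.append(text[cursor:])
    return blocks, "".join(parts).strip()
```

Because the decoder either consumes a complete JSON value or raises, a malformed block is left in the text unchanged, which matches the skip-on-error behaviour of the proposed fix.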

Test Case

def test_nested_json_tool_use():
    text = '<tool_use>{"id":"t1","name":"terminal","input":{"command":"ls","timeout":300}}</tool_use>'
    blocks, leftover = _extract_tool_uses_from_text(text)
    assert len(blocks) == 1
    assert blocks[0].name == "terminal"
    assert blocks[0].input == {"command": "ls", "timeout": 300}
    assert leftover == ""
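The test above covers nesting only. String handling is the other thing the scanner has to get right, so a few extra cases may be worth adding. To keep the sketch self-contained, the scanning loop is factored into a hypothetical helper, find_object_end, which is not a function in the repository. The same inputs could be fed to _extract_tool_uses_from_text once the fix lands.

```python
def find_object_end(text: str, start: int) -> int:
    """Return the index just past the } closing the object at `start`, or -1."""
    if start >= len(text) or text[start] != "{":
        return -1
    depth = 0
    in_string = False
    escape_next = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escape_next:
                escape_next = False      # this character was escaped
            elif ch == "\\":
                escape_next = True       # next character is escaped
            elif ch == '"':
                in_string = False        # unescaped quote closes the string
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
    return -1  # ran out of text before the object closed


def test_braces_inside_string_are_ignored():
    payload = '{"name": "terminal", "input": {"command": "echo }{"}}'
    assert find_object_end(payload + "</tool_use>", 0) == len(payload)


def test_escaped_quote_does_not_end_string():
    payload = '{"input": {"command": "echo \\"}\\" done"}}'
    assert find_object_end(payload + "</tool_use>", 0) == len(payload)


def test_unbalanced_object_is_rejected():
    assert find_object_end('{"input": {"command": "ls"}', 0) == -1


def test_non_object_start_is_rejected():
    assert find_object_end('oops {"name": "x"}', 0) == -1
```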

Additional Context

  • There are no existing tests for _extract_tool_uses_from_text in the test suite.
  • AnthropicTransport.normalize_response (in agent/transports/anthropic.py) correctly handles structured tool_use blocks from the Anthropic SDK. The claude-subscription provider, however, uses ClaudeCLIClient, which parses the raw text output of the claude CLI, hence the need for text-based extraction.
