hermes - 💡(How to fix) Fix patch tool: new_string escape sequences (\t) get written literally

Bug
When using the patch tool (replace mode) to edit a tab-indented file, if the LLM sends \t (two literal characters: backslash + t) in new_string instead of a real tab byte (0x09), the tool writes those two characters to the file verbatim and the patched file has syntax errors.

Root Cause
The LLM's training data overwhelmingly represents tabs as the two-character sequence \t (e.g. in Python string literals, regex, diffs). When generating JSON tool call arguments, the model therefore tends to emit that two-character sequence rather than a real tab byte. The tool then faithfully writes those literal characters.
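
To make the distinction concrete, here is a minimal sketch of how the two encodings decode. The JSON payloads are hypothetical examples, not captured tool calls; only the standard-library json module is used.

```python
import json

# Two hypothetical JSON payloads for the same tool call argument.
# In the first, the JSON escape \t decodes to one real tab byte.
# In the second, \\ decodes to a backslash, which is then followed by a plain "t".
correct = r'{"new_string": "\tprint(\"after\")"}'
drifted = r'{"new_string": "\\tprint(\"after\")"}'

good = json.loads(correct)["new_string"]
bad = json.loads(drifted)["new_string"]

# The decoded strings differ by exactly one character at the start.
print(repr(good[0]), len(good))
print(repr(bad[:2]), len(bad))
```

Both payloads are valid JSON, so nothing in the decoding step can flag the second one as a mistake; the difference only shows up once the string lands in the file.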

Existing Mitigations That Don't Cover This Case
- _strategy_escape_normalized in fuzzy_match.py only handles matching (unescaping old_string before comparing against content). It does not touch new_string.
- _detect_escape_drift catches \' and \" drift in new_string, but not \t.

Reproduction

python
# 1. Create a file with a real tab
echo -e 'def hello():\n\tprint("before")' > /tmp/tab_test.py

# 2. Call patch with \t as two literal characters in new_string (simulating model output)
patch(path="/tmp/tab_test.py", old_string="\tprint(\"before\")", new_string="\tprint(\"after\")")

# 3. Result: file contains literal \t characters instead of real tabs
cat -A /tmp/tab_test.py
# Output:
#   def hello():$
#   \tprint("after")$
#   ^^ should be ^I
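
The same failure can be reproduced without the real tool. The sketch below is an approximation under stated assumptions: naive_replace is a hypothetical stand-in that writes new_string verbatim, as the report describes, and is not the actual patch implementation.

```python
import os
import tempfile


def naive_replace(path, old_string, new_string):
    # Hypothetical stand-in for the patch tool: writes new_string verbatim.
    with open(path, encoding="utf-8") as f:
        content = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(content.replace(old_string, new_string, 1))


def compiles(source):
    # True if the text is syntactically valid Python.
    try:
        compile(source, "<tab_test>", "exec")
        return True
    except SyntaxError:
        return False


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tab_test.py")
    # Step 1: a file indented with a real tab.
    with open(path, "w", encoding="utf-8") as f:
        f.write('def hello():\n\tprint("before")\n')
    # Step 2: new_string carries backslash + t (raw string), not a tab byte.
    naive_replace(path, '\tprint("before")', r'\tprint("after")')
    # Step 3: read the result back.
    with open(path, encoding="utf-8") as f:
        patched = f.read()

print(repr(patched))
print("compiles:", compiles(patched))
```

The patched text no longer compiles because a backslash at the start of a line is read as a line continuation, and it is followed by something other than a newline.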


Proposed Fix
In fuzzy_find_and_replace() in tools/fuzzy_match.py, before the matching loop (around line 72), unescape common escape sequences in new_string:

python
# Unescape common escape sequences that the LLM commonly produces
# as literal text instead of real bytes in JSON tool call arguments.
new_string = new_string.replace('\\t', '\t').replace('\\r', '\r')
# Note: \n is excluded because newlines serialize correctly in JSON.


This is the symmetrical counterpart to _strategy_escape_normalized which already does the same for old_string during matching.

\r is included for the same reason: when the model copies multi-line text with Windows-style line endings (\r\n) and emits them as the literal text \\r\\n, unescaping \r restores the carriage return.
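
As a rough illustration of the proposed behavior, the replacement can be isolated in a small function and checked directly. The name unescape_whitespace_escapes is hypothetical and does not exist in tools/fuzzy_match.py; the body is the one-liner from the proposal.

```python
def unescape_whitespace_escapes(new_string: str) -> str:
    # Hypothetical helper wrapping the proposed one-liner.
    # Converts backslash+t and backslash+r into real tab / carriage-return bytes.
    # Backslash+n is deliberately left alone, as in the proposal.
    return new_string.replace("\\t", "\t").replace("\\r", "\r")


# Drifted model output: backslash + t instead of a tab byte.
fixed = unescape_whitespace_escapes(r'\tprint("after")')
print(repr(fixed))

# Input that already holds a real tab passes through unchanged.
unchanged = unescape_whitespace_escapes('\tprint("after")')
print(fixed == unchanged)

# Caveat: an intentional backslash+t inside a string literal is rewritten too.
rewritten = unescape_whitespace_escapes(r'print("a\tb")')
print(repr(rewritten))
```

One design consideration follows from the last case: a blanket replace cannot tell indentation drift apart from an escape sequence the author meant to keep in the source, so a real fix may want to restrict when it applies.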
