hermes - 💡(How to fix) Fix patch tool: new_string escape sequences (\t) get written literally

hermes - 2026-05-28 07:46:32




Bug
When using the patch tool (replace mode) to edit a tab-indented file, if the LLM sends \t (two literal characters: backslash + t) in new_string instead of a real tab byte (0x09), the tool writes those two characters literally and the resulting file has syntax errors.

Root Cause
The LLM's training data overwhelmingly represents tabs as the two-character sequence \t (e.g. in Python string literals, regex, diffs). When generating JSON tool call arguments, the model tends to emit that two-character sequence rather than a real tab byte. The tool then faithfully writes those literal characters.
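
The difference shows up at the JSON layer. A minimal sketch using only Python's standard json module; the two payloads are made up for illustration and are not captured model output:

```python
import json

# Correctly escaped tab: the JSON text contains backslash + t,
# which the JSON parser decodes to a single tab byte (0x09).
good_args = json.loads(r'{"new_string": "\tprint(\"after\")"}')

# Double-escaped tab: the JSON text contains backslash + backslash + t,
# which decodes to two literal characters, backslash and t.
bad_args = json.loads(r'{"new_string": "\\tprint(\"after\")"}')

good_first = good_args["new_string"][0]   # a real tab
bad_prefix = bad_args["new_string"][:2]   # backslash followed by t

print(repr(good_first), repr(bad_prefix))
```

Both payloads are valid JSON, so nothing fails at parse time; the second one simply carries one extra character that ends up in the file.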

Existing Mitigations That Don't Cover This Case
- _strategy_escape_normalized in fuzzy_match.py only handles matching (unescaping old_string before comparing against content). It does not touch new_string.
- _detect_escape_drift catches \' and \" drift in new_string, but not \t.

Reproduction

python
# 1. Create a file with a real tab (shell)
echo -e 'def hello():\n\tprint("before")' > /tmp/tab_test.py

# 2. Call patch with \t as two literal characters in new_string (simulating model output)
patch(path="/tmp/tab_test.py", old_string="\tprint(\"before\")", new_string="\\tprint(\"after\")")

# 3. Result: file contains literal \t characters instead of real tabs (shell)
cat -A /tmp/tab_test.py
# Output:
#   def hello():$
#   \tprint("after")$
#   ^^ should be ^I
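
The same symptom can be reproduced without the patch tool. The sketch below is self-contained and assumes a plain substring replacement; `naive_patch` is a hypothetical stand-in, not the tool's actual implementation:

```python
import os
import tempfile

# Hypothetical stand-in for the patch tool's replace mode: a plain
# substring replacement with no escape handling on new_string.
def naive_patch(path, old_string, new_string):
    with open(path, "r", newline="") as f:
        content = f.read()
    with open(path, "w", newline="") as f:
        f.write(content.replace(old_string, new_string, 1))

# Create a temporary file indented with a real tab.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "w", newline="") as f:
    f.write('def hello():\n\tprint("before")\n')

# old_string holds a real tab; new_string holds backslash + t.
naive_patch(path, '\tprint("before")', '\\tprint("after")')

with open(path, "rb") as f:
    patched = f.read()
os.remove(path)

# The indentation is now the two bytes 0x5C 0x74 instead of 0x09.
print(patched)
```

Reading the file back in binary mode makes the difference visible without relying on cat -A.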


Proposed Fix
In fuzzy_find_and_replace() in tools/fuzzy_match.py, before the matching loop (around line 72), unescape common escape sequences in new_string:

python
# Unescape common escape sequences that the LLM commonly produces
# as literal text instead of real bytes in JSON tool call arguments.
new_string = new_string.replace('\\t', '\t').replace('\\r', '\r')
# Note: \n is excluded because newlines serialize correctly in JSON.


This is the symmetrical counterpart to _strategy_escape_normalized which already does the same for old_string during matching.

\r is included for the same reason — when the model copies multi-line text with \r\n (Windows-style) as \\r\\n, unescaping \r fixes it.
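
To see how the proposed unescaping behaves, here is a minimal sketch wrapped in a standalone function. The name `unescape_new_string` is hypothetical and does not exist in tools/fuzzy_match.py; the body mirrors the one-line proposal:

```python
def unescape_new_string(new_string: str) -> str:
    """Turn literal backslash-t / backslash-r into real tab / CR characters.

    Hypothetical helper mirroring the proposed one-liner; a literal
    backslash-n is deliberately left untouched.
    """
    return new_string.replace("\\t", "\t").replace("\\r", "\r")

# Model sent backslash + t: it becomes a real tab.
fixed = unescape_new_string('\\tprint("after")')

# Model sent a real tab: nothing changes.
unchanged = unescape_new_string('\tprint("after")')

# Caveat: an intentional escape inside a string literal is rewritten too.
side_effect = unescape_new_string('sep = "\\t"')

print(repr(fixed), repr(unchanged), repr(side_effect))
```

Because the replacement is unconditional, the last call also converts an escape sequence the author meant to keep as source text. Whether that trade-off is acceptable, or whether the unescaping should be limited to cases such as leading indentation, is a design decision the sketch leaves open.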
