langchain - ✅ (Solved) Fix `parse_partial_json` fails on raw `\r` and `\t` inside strings [2 pull requests, 1 comment, 2 participants]

GitHub stats
langchain-ai/langchain#36747 (fetched 2026-04-17 08:23:05)
Comments: 1 · Participants: 2 · Timeline events: 7 · Reactions: 0
Timeline (top): cross-referenced ×2, labeled ×2, referenced ×2, commented ×1

parse_partial_json already escapes raw \n characters inside JSON strings (since they're illegal unescaped per RFC 8259). But it doesn't handle \r or \t, which are also control characters that must be escaped in JSON strings. Some models emit these raw control characters in their output, causing the parser to fail.

The fix is to add the same escaping pattern for \r and \t that already exists for \n.

Error Message

json.decoder.JSONDecodeError: Invalid control character at: line 1 column 15 (char 14)
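
The message above is produced by Python's standard `json` decoder, which by default rejects raw control characters inside string values. The sketch below uses only the standard library (no LangChain code) to reproduce it; the helper name `try_load` is made up for illustration. It also shows the decoder's `strict=False` option, which tolerates such characters, purely as a standard-library fact and not as a statement about how `parse_partial_json` is implemented.

```python
import json


def try_load(text: str, *, strict: bool = True):
    """Return the parsed value, or the decoder's error message on failure."""
    try:
        return json.loads(text, strict=strict)
    except json.JSONDecodeError as exc:
        return str(exc)


raw_cr = '{"key": "line1\rline2"}'        # contains a real carriage return
escaped_cr = '{"key": "line1\\rline2"}'   # backslash + "r": a valid JSON escape

print(try_load(raw_cr))                 # Invalid control character at: line 1 column 15 (char 14)
print(try_load(escaped_cr))             # {'key': 'line1\rline2'}
print(try_load(raw_cr, strict=False))   # {'key': 'line1\rline2'}
```

The first call fails at character 14, which is the position of the raw carriage return in the input.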

Root Cause

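Only raw \n is escaped inside string values, so raw \r and \t reach the JSON decoder unescaped and are rejected. A raw newline works because it is already handled:

result = parse_partial_json('{"key": "line1\nline2"}')
print(result)  # {'key': 'line1\nline2'} -- OK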

Fix Action

Fix / Workaround

Escape raw \r and \t inside JSON string values in parse_partial_json, using the same pattern that already escapes \n. The PR notes below describe two implementations of this.

PR fix notes

PR #36748: fix(core): parse_partial_json escapes raw \r and \t inside strings (#36747)

Description (problem / solution / changelog)

Why

Closes #36747.

parse_partial_json already escapes raw \n inside JSON string values (required by RFC 8259 — unescaped control chars are illegal in JSON strings). But \r and \t were not handled, so model outputs containing those raw control chars would crash the parser with json.JSONDecodeError: Invalid control character.

Repro from the issue:

>>> parse_partial_json('{"key": "col1\tcol2"}')
# raises json.JSONDecodeError

What

Two more elif branches in the inner character loop, mirroring the existing \n handler:

elif char == "\r" and not escaped:
    new_char = "\\r"
elif char == "\t" and not escaped:
    new_char = "\\t"

The sibling helper _replace_new_line already handles all three in the non-streaming path (lines 25-27), so this just brings the streaming parser up to the same behavior.
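
To make the branch logic concrete, here is a minimal standalone sketch of a character loop that escapes raw \n, \r and \t only while inside a string value, then hands the result to the standard decoder. It is not the LangChain implementation: the function name `escape_raw_whitespace`, the `ESCAPES` table and the state variables are assumptions for illustration, and the sketch does none of the partial-input recovery that parse_partial_json exists for.

```python
import json

# Raw control character -> JSON escape sequence (two characters each).
ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def escape_raw_whitespace(text: str) -> str:
    """Escape raw newline, carriage return and tab inside JSON string values."""
    out = []
    in_string = False
    escaped = False  # True when the previous character was an unescaped backslash
    for char in text:
        new_char = char
        if in_string:
            if char == '"' and not escaped:
                in_string = False
            elif char in ESCAPES and not escaped:
                new_char = ESCAPES[char]
            escaped = char == "\\" and not escaped
        elif char == '"':
            in_string = True
        out.append(new_char)
    return "".join(out)


text = '{\t"key": "col1\tcol2",\r\n "other": "a\\"b\rc"}'
print(json.loads(escape_raw_whitespace(text)))
# {'key': 'col1\tcol2', 'other': 'a"b\rc'}
```

Tabs and line breaks between tokens are legal JSON whitespace, so the loop leaves them alone and only rewrites characters seen while `in_string` is true.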

Tests

4 new parametrized cases in TEST_CASES_PARTIAL:

('{"key": "line1\nline2"}', '{"key": "line1\\nline2"}'),   # pre-existing behavior, now explicit
('{"key": "line1\rline2"}', '{"key": "line1\\rline2"}'),   # #36747 regression guard
('{"key": "col1\tcol2"}', '{"key": "col1\\tcol2"}'),       # #36747 regression guard
('{"k": "a\nb\rc\td"}', '{"k": "a\\nb\\rc\\td"}'),         # mixed

$ pytest libs/core/tests/unit_tests/output_parsers/test_json.py -k parse_partial -v
13 passed
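
Each tuple pairs an input containing a raw control character with the properly escaped JSON text that should decode to the same value. As a rough illustration of that relationship, the check below uses only the standard library. It assumes the real test compares the parser's result with `json.loads` of the second element; the test body is not shown in the PR description, so treat that as a guess.

```python
import json

# (raw input, escaped JSON text) pairs copied from the PR description.
CASES = [
    ('{"key": "line1\nline2"}', '{"key": "line1\\nline2"}'),
    ('{"key": "line1\rline2"}', '{"key": "line1\\rline2"}'),
    ('{"key": "col1\tcol2"}', '{"key": "col1\\tcol2"}'),
    ('{"k": "a\nb\rc\td"}', '{"k": "a\\nb\\rc\\td"}'),
]

for raw, escaped in CASES:
    expected = json.loads(escaped)           # escaped form is always valid JSON
    lenient = json.loads(raw, strict=False)  # stdlib option that tolerates raw control chars
    assert lenient == expected, (raw, escaped)

print("all pairs decode to the same value")
```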

Scope

Minimal behavior change. No public API change. Doesn't affect any other JSON parsing path — only the character-by-character recovery loop inside parse_partial_json.

Related

  • Closes #36747
  • Aligns with _replace_new_line helper (same file) which already handles all three control chars

Changed files

  • libs/core/langchain_core/utils/json.py (modified, +5/-3)
  • libs/core/tests/unit_tests/output_parsers/test_json.py (modified, +8/-0)

PR #36763: fix(core): escape all JSON control characters in parse_partial_json

Description (problem / solution / changelog)

Description

Fixes #36747

parse_partial_json only escaped raw \n inside JSON strings but not other control characters that are illegal per RFC 8259 (U+0000 through U+001F). LLMs can emit raw \r, \t, \b, \f, and other control characters in their output, causing json.JSONDecodeError.

Changes

  • langchain_core/utils/json.py: Added escaping for \r, \t, \b, \f, and a catch-all for remaining control characters using \uXXXX encoding
  • tests/unit_tests/output_parsers/test_json.py: Added 6 test cases covering \r, \t, \b, \f, null char, and mixed control characters
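
One possible shape for such a catch-all, shown as a standalone sketch rather than the PR's actual code (the diff is not reproduced on this page): map the control characters that have short JSON escapes, and fall back to the \uXXXX form for anything else below U+0020. The names `SHORT_ESCAPES` and `escape_control_char` are hypothetical.

```python
import json

# Control characters that have a two-character JSON escape (RFC 8259, section 7).
SHORT_ESCAPES = {"\b": "\\b", "\t": "\\t", "\n": "\\n", "\f": "\\f", "\r": "\\r"}


def escape_control_char(char: str) -> str:
    """Return a JSON-safe spelling of one character found inside a string value."""
    if char in SHORT_ESCAPES:
        return SHORT_ESCAPES[char]
    if ord(char) < 0x20:
        # Remaining control characters (U+0000..U+001F) use the \uXXXX form.
        return f"\\u{ord(char):04x}"
    return char


original = "a\x00b\fc\x1bd"  # contains NUL, form feed and ESC
body = "".join(escape_control_char(c) for c in original)
print(body)                                # a\u0000b\fc\u001bd
print(json.loads(f'"{body}"') == original)  # True
```

The helper deliberately leaves quotes and backslashes alone; in a real parser those are handled by the surrounding string-state logic.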

Before

parse_partial_json('{"key": "line1\rline2"}')
# raises json.JSONDecodeError: Invalid control character

After

parse_partial_json('{"key": "line1\rline2"}')
# {'key': 'line1\rline2'}

Changed files

  • libs/core/langchain_core/utils/json.py (modified, +13/-3)
  • libs/core/tests/unit_tests/output_parsers/test_json.py (modified, +6/-0)

Code Example

from langchain_core.utils.json import parse_partial_json

# Raw carriage return inside a JSON string value
result = parse_partial_json('{"key": "line1\rline2"}')
print(result)  # raises json.JSONDecodeError

# Raw tab inside a JSON string value
result = parse_partial_json('{"key": "col1\tcol2"}')
print(result)  # raises json.JSONDecodeError

# Raw newline works fine because it's already handled
result = parse_partial_json('{"key": "line1\nline2"}')
print(result)  # {'key': 'line1\nline2'} -- OK

---

json.decoder.JSONDecodeError: Invalid control character at: line 1 column 15 (char 14)

---

System Information
------------------
> OS:  Windows
> OS Version:  10.0.26200
> Python Version:  3.13.7

Package Information
-------------------
> langchain_core: 1.3.0a1
> langsmith: 0.7.13

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain-core


extent analysis

TL;DR

The most likely fix is to update the parse_partial_json function to escape raw \r and \t characters inside JSON strings.

Guidance

  • The issue arises from the parse_partial_json function not handling raw \r and \t characters inside JSON strings, which are illegal unescaped per RFC 8259.
  • To verify the issue, run the provided example code and observe the json.JSONDecodeError exception.
  • The fix involves adding the same escaping pattern for \r and \t that already exists for \n in the parse_partial_json function.
  • The updated function should replace raw \r with \\r and raw \t with \\t only inside string values, before the text reaches the JSON decoder. The same characters between tokens are legal JSON whitespace and must be left alone.

Example

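def escape_raw_controls(json_str):
    # Sketch only, not the library code: escape raw \n, \r, \t inside string values
    table = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}
    out, in_string, escaped = [], False, False
    for char in json_str:
        if char == '"' and not escaped:
            in_string = not in_string
        out.append(table[char] if in_string and not escaped and char in table else char)
        escaped = in_string and char == "\\" and not escaped
    return "".join(out)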

Notes

The issue was reported against langchain_core 1.3.0a1; the report does not say which other versions are affected. The workaround becomes unnecessary once an installed release includes a fix.

Recommendation

Apply workaround: Update the parse_partial_json function to escape raw \r and \t characters, as the issue is not resolved by updating to the latest stable version of LangChain.
