claude-code - 💡 (How to fix) output_config.format truncates long-form text fields due to FSM renormalization bias [1 participant]

claude-code · 2026-05-02 09:09:28

anthropics/claude-code#55538•Fetched 2026-05-03 04:50:49

Author

herrhaase

Participants

herrhaase

Timeline (top)

cross-referenced ×1, labeled ×1

When using output_config.format for grammar-constrained JSON output, long-form text fields (500-2000 characters) are truncated approximately 40% of the time. The output is valid JSON that conforms to the schema, but string content is cut short mid-sentence.

Root Cause

Fix Action

Workaround

Use unconstrained text generation as the primary output mode and parse with a lenient JSON parser. Fall back to grammar-constrained output only when parsing fails. Optionally run a repair pass to detect and fix truncation from the fallback path.

Description

Reproduction

Model: Sonnet 4.6
Schema: JSON object with multiple string fields, some expected to contain 500-2000 characters of prose
System prompt: ~80K tokens
Observed: ~40% of outputs had at least one text field truncated
Control: Switching to unconstrained text generation (raw JSON) with the identical prompt produced 0 truncations across 24 outputs over 8 runs
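A truncation rate like the one reported above could be measured with a simple heuristic. The sketch below is an assumption, not the reporter's method: it flags a field as suspect when it does not end in sentence-final punctuation, and the names `looks_truncated`, `truncation_rate`, and the field names in the usage note are hypothetical.

```python
# Hypothetical heuristic; the issue does not say how truncation was detected.
SENTENCE_ENDINGS = (".", "!", "?", '."', '!"', '?"', ".)")


def looks_truncated(text: str, min_chars: int = 0) -> bool:
    """Flag text that is shorter than min_chars or ends mid-sentence.

    An empty or missing field is also flagged.
    """
    stripped = text.rstrip()
    if len(stripped) < min_chars:
        return True
    return not stripped.endswith(SENTENCE_ENDINGS)


def truncation_rate(outputs: list[dict], long_fields: list[str]) -> float:
    """Fraction of outputs with at least one suspect long-form field."""
    if not outputs:
        return 0.0
    flagged = sum(
        any(looks_truncated(str(out.get(field, ""))) for field in long_fields)
        for out in outputs
    )
    return flagged / len(outputs)
```

Running `truncation_rate(outputs, ["summary", "analysis"])` once on constrained outputs and once on unconstrained outputs for the same prompt gives two numbers that can be compared directly. The heuristic will misfire on text that legitimately ends without punctuation, so flagged samples are worth checking by hand.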

Hypothesis

The finite state machine enforcing the grammar renormalizes token probabilities away from continuing long strings, creating a bias toward closing the string early and moving on to the next field. The effect is more pronounced with longer target strings and larger overall prompts.
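To make the hypothesized mechanism concrete, here is a toy illustration of renormalization under a token mask. The probabilities are made up and nothing here is measured from any model or taken from the actual implementation; it only shows the arithmetic the hypothesis relies on.

```python
def renormalize(probs: dict[str, float], allowed: set[str]) -> dict[str, float]:
    """Drop disallowed tokens and rescale the remainder to sum to 1."""
    kept = {tok: p for tok, p in probs.items() if tok in allowed}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no allowed token has non-zero probability")
    return {tok: p / total for tok, p in kept.items()}


# Toy next-token distribution inside a JSON string (invented numbers).
unconstrained = {"word": 0.6, "\n": 0.3, '"': 0.1}

# A raw newline is not valid inside a JSON string, so a grammar would mask it.
constrained = renormalize(unconstrained, allowed={"word", '"'})

closing_before = unconstrained['"']  # 0.1
closing_after = constrained['"']     # 0.1 / 0.7, roughly 0.143
```

In this toy case the probability mass the model placed on a masked continuation is redistributed over the remaining tokens, so the per-step chance of emitting the closing quote rises even though the model's preference between "word" and the quote is unchanged. Whether this effect is large enough to explain the observed truncation is exactly what the hypothesis leaves open.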

Expected behavior

Grammar-constrained output should not systematically truncate content that the model produces correctly in unconstrained mode. If grammar enforcement introduces a length bias, this should be documented as a known limitation.

Workaround

extent analysis

TL;DR

Using unconstrained text generation as the primary output mode and parsing with a lenient JSON parser may help mitigate the truncation issue.

Guidance

Verify that the issue is indeed caused by the grammar-constrained output by comparing the output of constrained and unconstrained modes for the same input.
Test the proposed workaround of using unconstrained text generation and parsing with a lenient JSON parser to see if it reduces truncation.
Consider implementing a repair pass to detect and fix truncation when falling back to grammar-constrained output.
Evaluate the trade-offs between using grammar-constrained output for validation and the potential for truncation, versus using unconstrained output and relying on lenient parsing.
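One side of the trade-off in the last point is that unconstrained output no longer guarantees schema conformance, so some validation has to happen after parsing. A minimal sketch, assuming the schema only requires a set of non-empty string fields (the function name and field names are hypothetical):

```python
def missing_or_invalid(obj: object, required_strings: list[str]) -> list[str]:
    """Return the required fields that are absent, non-string, or blank."""
    if not isinstance(obj, dict):
        # A non-object payload fails every requirement.
        return list(required_strings)
    return [
        name
        for name in required_strings
        if not isinstance(obj.get(name), str) or not obj[name].strip()
    ]
```

An empty result means the parsed object satisfies this reduced check; a real schema with nested objects or enums would need a fuller validator.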

Example

The issue itself includes no code and names no API beyond output_config.format.
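Since the issue gives no client code, the following is only a sketch of the workaround's control flow: try unconstrained generation first, parse leniently, and fall back to grammar-constrained output when parsing fails. The two callables stand in for whatever client calls an application actually makes; `call_unconstrained`, `call_constrained`, and `extract_json_object` are hypothetical names, not part of any SDK.

```python
import json
from typing import Callable


def extract_json_object(text: str) -> dict:
    """Leniently parse the first JSON object found in free-form model text.

    Leading prose and trailing text after the object are ignored.
    Raises ValueError (json.JSONDecodeError is a subclass) on failure.
    """
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found")
    obj, _end = json.JSONDecoder().raw_decode(text[start:])
    return obj


def generate_structured(
    prompt: str,
    call_unconstrained: Callable[[str], str],
    call_constrained: Callable[[str], str],
) -> tuple[dict, str]:
    """Return (parsed_object, mode), where mode records which path was used."""
    try:
        return extract_json_object(call_unconstrained(prompt)), "unconstrained"
    except ValueError:
        # Fallback: constrained output is expected to be valid JSON already.
        return json.loads(call_constrained(prompt)), "constrained"
```

Returning the mode alongside the object lets the caller run a truncation check or repair pass only on results that came from the constrained fallback, which is where the issue reports the truncation.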

Notes

The workaround may introduce additional complexity and potential errors due to the use of lenient parsing, and may not be suitable for all use cases. The root cause of the issue, related to the finite state machine and token probabilities, may require further investigation to fully resolve.

Recommendation

Apply the workaround of using unconstrained text generation as the primary output mode and parsing with a lenient JSON parser, as it may help mitigate the truncation issue while still allowing for some level of validation through the fallback to grammar-constrained output.
