claude-code - 💡(How to fix) Fix [Bug] Write tool silently fails when tool_use truncated by max_tokens limit [1 comments, 2 participants]

GitHub stats
anthropics/claude-code#45436 · Fetched 2026-04-09 08:05:28
Comments: 1 · Participants: 2 · Timeline: 6 · Reactions: 0
Timeline (top): labeled ×4, commented ×1, subscribed ×1

Error Message

InputValidationError: The required parameter 'content' is missing

The harness reports this error when a Write tool_use block is cut off by max_tokens before the content field is emitted.

Root Cause

Root cause (verified from session jsonl): assistant responses have an 8000 output token ceiling. When the model generates a large content parameter inside a tool_use block, it can hit stop_reason=max_tokens before the content field is emitted. The API closes out the tool_use block with file_path only, and the harness rejects it as missing required param.

Bug Description

Title: Write tool silently fails with "missing content" when tool_use hits max_tokens limit (hard 8k output ceiling)

Body:

Reproducible bug in Claude Code where Write tool invocations with large content parameters fail with InputValidationError: The required parameter 'content' is missing, even though the model is correctly attempting to pass content.

Root cause (verified from session jsonl): assistant responses have an 8000 output token ceiling. When the model generates a large content parameter inside a tool_use block, it can hit stop_reason=max_tokens before the content field is emitted. The API closes out the tool_use block with file_path only, and the harness rejects it as missing required param.

Evidence from session 2fa0be29-8f59-413b-89a0-ae522a8b9bd9 (2026-04-08):

  • Successful EN Write, 18 077 chars: stop_reason=tool_use, output_tokens=7 900
  • 11 consecutive failures on Ukrainian translation (~18 000 chars, ~30 000 tokens due to Cyrillic BPE density): all stop_reason=max_tokens, output_tokens=8 000
  • A diagnostic Write(content="test") between failures succeeded → rules out file/encoding/tool bugs
  • Retry loop makes it worse (context grows, budget unchanged)
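
The evidence above was read out of the session jsonl by hand. A minimal sketch of automating that check follows. The record layout (a `message` object carrying `stop_reason`, `usage.output_tokens`, and a `content` list of blocks) is an assumption for illustration, not a documented format, and `find_truncated_tool_calls` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def find_truncated_tool_calls(lines: Iterable[str]) -> Iterator[dict]:
    """Yield a summary for each turn that stopped on max_tokens with a tool_use block.

    ASSUMED record shape (not confirmed by the issue):
    {"message": {"stop_reason": ..., "usage": {"output_tokens": ...},
                 "content": [{"type": "tool_use", "name": ..., "input": {...}}]}}
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or corrupt lines
        message = record.get("message") if isinstance(record, dict) else None
        if not isinstance(message, dict):
            continue
        if message.get("stop_reason") != "max_tokens":
            continue
        content = message.get("content")
        if not isinstance(content, list):
            continue
        usage = message.get("usage")
        output_tokens = usage.get("output_tokens") if isinstance(usage, dict) else None
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                tool_input = block.get("input")
                yield {
                    "tool": block.get("name"),
                    # Which parameters survived the truncation.
                    "input_keys": sorted(tool_input) if isinstance(tool_input, dict) else [],
                    "output_tokens": output_tokens,
                }
```

To run it over a session file, pass an open file handle: `list(find_truncated_tool_calls(open(path, encoding="utf-8")))`. A Write entry whose `input_keys` lacks `content` matches the failure pattern described in this report.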

Expected behavior (at least one of):

  1. Surface the real error to the model: "tool_use truncated by max_tokens — retry with smaller content or use Edit for partial updates" instead of generic "missing content"
  2. Auto-suggest chunking: harness could detect stop_reason=max_tokens inside tool_use and provide actionable guidance
  3. Allow Write to stream content across multiple turns / raise output ceiling for tool_use blocks
  4. Client-side heuristic: warn model when file target size exceeds safe threshold
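
Options 1 and 2 could be sketched roughly as below. The sketch assumes the harness sees a response dict with a top-level `stop_reason` and a `content` list of blocks, as in the Messages API response shape. `REQUIRED_PARAMS` and `truncation_hint` are hypothetical names, and how the real harness validates tool input is not known from this issue.

```python
from typing import Optional

# Hypothetical table of required parameters per tool. Only Write's
# file_path and content are taken from the issue.
REQUIRED_PARAMS = {"Write": ("file_path", "content")}


def truncation_hint(response: dict) -> Optional[str]:
    """Return an actionable message if a tool_use block was cut off by max_tokens.

    Returns None when the response did not stop on max_tokens or when every
    known required parameter is present.
    """
    if response.get("stop_reason") != "max_tokens":
        return None
    for block in response.get("content") or []:
        if not isinstance(block, dict) or block.get("type") != "tool_use":
            continue
        tool_input = block.get("input")
        if not isinstance(tool_input, dict):
            tool_input = {}
        required = REQUIRED_PARAMS.get(block.get("name"), ())
        missing = [param for param in required if param not in tool_input]
        if missing:
            return (
                f"tool_use truncated by max_tokens (missing: {', '.join(missing)}); "
                "retry with smaller content or use Edit for partial updates"
            )
    return None
```

The point of checking `stop_reason` first is that a missing parameter alone cannot distinguish a truncated call from a genuinely malformed one. Only the combination justifies telling the model to shrink its output.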

User impact: especially bad for non-English content (Cyrillic, CJK) where token density is 3× English — files that "look normal" in characters blow the budget. User ends up in frustrating retry loops until they manually instruct chunking.
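
To make the character-versus-token gap concrete, the reporter's own figures can be turned into a rough per-call character budget. This is a back-of-the-envelope sketch: the 80% safety margin and both function names are assumptions, and the English ratio is approximate because the 7 900 output tokens also cover tool_use overhead, not just file content.

```python
def safe_char_budget(observed_chars: int, observed_tokens: int,
                     token_ceiling: int = 8000, margin_percent: int = 80) -> int:
    """Estimate how many characters fit under the ceiling, from one observed sample.

    Uses integer arithmetic: ceiling * margin * (chars per token), rounded down.
    """
    if observed_chars <= 0 or observed_tokens <= 0:
        raise ValueError("observed_chars and observed_tokens must be positive")
    return token_ceiling * margin_percent * observed_chars // (100 * observed_tokens)


def chunks_needed(total_chars: int, char_budget: int) -> int:
    """Ceiling division: number of Write calls needed for total_chars."""
    if char_budget <= 0:
        raise ValueError("char_budget must be positive")
    return -(-total_chars // char_budget)


# Figures from the session evidence in this report.
english_budget = safe_char_budget(18_077, 7_900)      # successful EN write
ukrainian_budget = safe_char_budget(18_000, 30_000)   # reporter's estimate for UK text

print(english_budget, ukrainian_budget)
print(chunks_needed(18_000, ukrainian_budget))
```

Under these assumptions, an 18 000 character Ukrainian file needs several calls even though an English file of the same length went through in one.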

Environment Info

  • Platform: win32
  • Terminal: windows-terminal
  • Version: 2.1.96
  • Feedback ID: 1b2d7c7d-e262-414f-9fe0-4be98ff56e47

extent analysis

TL;DR

Implement content chunking to avoid hitting the 8,000 output token ceiling when using the Write tool with large content parameters.

Guidance

  • Identify cases where the content parameter is likely to exceed the token limit and manually chunk the content before passing it to the Write tool.
  • Consider implementing a client-side heuristic to warn the model when the file target size exceeds a safe threshold, to prevent retry loops.
  • Review the API documentation to see if there are any existing mechanisms for streaming content across multiple turns or raising the output ceiling for tool_use blocks.
  • Analyze the session JSONL data to better understand the token density of different languages and adjust the chunking strategy accordingly.

Example

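def chunk_content(content, max_chars=4000):
    """Split content into pieces of at most max_chars characters.

    The limit is in characters, not tokens. Token counts depend on the
    tokenizer and the script, so max_chars=4000 is only an illustrative
    default; Cyrillic or CJK text may need a smaller value.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # "or ['']" keeps the original behavior of returning one empty chunk.
    return [content[i:i + max_chars] for i in range(0, len(content), max_chars)] or [""]

def write_in_chunks(path, content, max_chars=4000):
    # Local stand-in for the Write tool: the first chunk creates or
    # overwrites the file, later chunks are appended so that earlier
    # chunks are not clobbered.
    for index, chunk in enumerate(chunk_content(content, max_chars)):
        mode = "w" if index == 0 else "a"
        with open(path, mode, encoding="utf-8") as handle:
            handle.write(chunk)

# Usage
content = "large_content_string" * 1000
write_in_chunks("output.md", content)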

Notes

Chunking is a workaround, not a fix: it depends on a conservative size estimate and does not change the output ceiling. A more robust solution, such as streaming content across multiple turns or raising the output ceiling for tool_use blocks, would have to come from the Claude Code maintainers.

Recommendation

Apply workaround: chunk large Write content so that each call stays under the 8,000 output token ceiling. This mitigates the issue until a permanent fix is implemented.
