claude-code - 💡(How to fix) Fix [BUG] Thinking stalls exceeding conversation cache TTL [1 participants]

claude-code • 2026-04-30 08:05:31


anthropics/claude-code#54934•Fetched 2026-05-01 05:50:36


Author

poof86


Error Message

I tried waiting it out but got hit with API Error: Claude's response exceeded the 32000 output token maximum. To configure this behaviour, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable


Preflight Checklist

I have searched existing issues and this hasn't been reported yet
This is a single bug report (please file separate reports for different bugs)
I am using the latest version of Claude Code

What's Wrong?

When Claude Code uses thinking (medium), the thinking process stalls repeatedly at round token counts (observed: 33.0k, 65.0k, 97.0k tokens). Each stall lasts long enough (more than 5 minutes) that the conversation cache expires. After a long wait I see tokens being spent again, but I notice a >20% jump in token usage for my 5-hour session! This makes Claude Code unusable on Pro for any non-trivial task where you wait for Claude Code to finish its thinking process without interrupting.

I tried waiting it out but got hit with API Error: Claude's response exceeded the 32000 output token maximum. To configure this behaviour, set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable

I've been experiencing the same issue in Claude.ai Code on desktop, where my usage for the 5-hour interval is used up without a single line of code output. (Not exaggerating here)

What Should Happen?

Thinking should progress continuously or resume within the cache TTL window.

Error Messages/Logs

Steps to Reproduce

Start Claude Code CLI in terminal
Give a complex implementation task
Observe thinking token counter pausing at round numbers
Wait until a stall lasts longer than the cache TTL, then check the jump in session token usage
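
To make the last step measurable, the counter readings can be noted by hand with a timestamp and checked for gaps longer than the cache TTL. The following is a minimal sketch, assuming a 5-minute TTL (taken from the "more than 5 minutes" in the report). The function name `find_stalls` and the sample readings are hypothetical and only illustrate the idea.

```python
from datetime import datetime, timedelta

# Assumption: a 5-minute cache TTL, based on the report's "more than 5 minutes".
CACHE_TTL = timedelta(minutes=5)


def find_stalls(samples, ttl=CACHE_TTL):
    """Return (start, end, tokens) for each gap between samples longer than ttl.

    samples: chronological list of (timestamp, token_count) pairs,
    noted by hand from the thinking token counter.
    """
    stalls = []
    for (t0, tokens0), (t1, _) in zip(samples, samples[1:]):
        if t1 - t0 > ttl:
            stalls.append((t0, t1, tokens0))
    return stalls


# Hypothetical readings; the timestamps are made up for illustration.
samples = [
    (datetime(2026, 4, 30, 8, 0, 0), 20_000),
    (datetime(2026, 4, 30, 8, 1, 0), 33_000),
    (datetime(2026, 4, 30, 8, 8, 0), 40_000),  # 7 minutes after the previous reading
    (datetime(2026, 4, 30, 8, 9, 0), 65_000),
]

for start, end, tokens in find_stalls(samples):
    print(f"stalled at {tokens} tokens for {end - start}")
```

A log like this, attached to the issue, would show whether each stall actually outlasts the TTL and at which counter value it begins.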

Claude Model

Sonnet (default)

Is this a regression?

Yes, this worked in a previous version

Last Working Version

No response

Claude Code Version

2.1.123

Platform

Anthropic API

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

No response

extent analysis

TL;DR

Set the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable to a higher value. This targets the 32000 output token error directly and may also mitigate the stalls at round token counts, although that link is unconfirmed.

Guidance

Investigate the relationship between the stalling token counts (33.0k, 65.0k, 97.0k) and the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable to determine if increasing this value can prevent stalling.
Verify that the conversation cache expiration is indeed causing the token usage jumps by monitoring the cache TTL and token usage patterns.
Test with a smaller task to see if the stalling issue still occurs, which could help determine if the problem is task-size related.
Consider contacting Anthropic support to ask about any known issues with the Sonnet model and token usage.
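
The first guidance item can be started with simple arithmetic on the numbers from the report. The sketch below compares the reported stall points with the 32000 limit named in the error message. The counter values are rounded to 0.1k in the UI, so the result is only suggestive, and the variable names are illustrative.

```python
# Stall points reported in the issue, in tokens (the UI rounds to 0.1k).
stall_points = [33_000, 65_000, 97_000]

# The output token limit named in the error message.
output_limit = 32_000

# Spacing between consecutive stall points.
spacing = [b - a for a, b in zip(stall_points, stall_points[1:])]

# Remainder of each stall point after dividing by the limit.
offsets = [p % output_limit for p in stall_points]

print("spacing:", spacing)  # spacing: [32000, 32000]
print("offsets:", offsets)  # offsets: [1000, 1000, 1000]
```

The stalls are spaced exactly one limit apart, each about 1k past a multiple of 32000. That is consistent with the limit being involved but does not prove it. If the hypothesis holds, raising the limit should shift where the stalls occur.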

Example

The issue is related to configuration and API interaction rather than to application code.

Notes

The exact cause of the stalling issue is unclear, and it's uncertain whether increasing CLAUDE_CODE_MAX_OUTPUT_TOKENS will fully resolve the problem. The fact that this worked in a previous version suggests a potential regression, but without the last working version specified, it's difficult to pinpoint the change that introduced this issue.

Recommendation

Apply workaround: increase the CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable. This directly addresses the error message received. Whether it also shortens the stalls enough to keep the conversation cache from expiring is untested.
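
One way to apply the workaround for a single session, without changing the shell profile, is to build a modified environment and launch the CLI with it. This is a minimal sketch: the helper `env_with_output_limit` is hypothetical, the value 64000 is only illustrative (the issue does not state which values are accepted), and the `claude` command name in the comment is an assumption about how the CLI is installed.

```python
import os


def env_with_output_limit(max_output_tokens, base_env=None):
    """Return a copy of the environment with CLAUDE_CODE_MAX_OUTPUT_TOKENS set."""
    if max_output_tokens <= 0:
        raise ValueError("max_output_tokens must be positive")
    env = dict(os.environ if base_env is None else base_env)
    # The variable name comes from the error message in the issue.
    env["CLAUDE_CODE_MAX_OUTPUT_TOKENS"] = str(max_output_tokens)
    return env


# Illustrative value only; the accepted range is not stated in the issue.
env = env_with_output_limit(64_000)
print(env["CLAUDE_CODE_MAX_OUTPUT_TOKENS"])

# To start a session with this environment (assumes the CLI is on PATH as `claude`):
#   import subprocess
#   subprocess.run(["claude"], env=env)
```

Setting the variable per process keeps the change easy to revert if it turns out not to affect the stalls.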
