claude-code - 💡(How to fix) Fix [Bug] Model exhibits inconsistent reasoning, instruction non-compliance, and context degradation in extended sessions [2 comments, 3 participants]

claude-code • 2026-04-30 12:43:20

anthropics/claude-code#54991 • Fetched 2026-05-01 05:49:06

Timeline (top)

labeled ×4, commented ×2

Error Message

[{"error":"Error: Request was aborted.\n at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T14:47:39.185Z"},{"error":"Error: Request was aborted.\n at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T20:00:40.167Z"},{"error":"Error: Request was aborted.\n at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T20:29:28.654Z"},{"error":"Error: 429 {"type":"error","error":{"type":"rate_limit_error","message":"Rate limited"},"request_id":"req_011CaZFvQiWm9ZL9WUvyqjg5"}\n at generate (B:/~BUN/root/src/entrypoints/cli.js:11:57670)\n at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:4943)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T03:37:48.906Z"},{"error":"Error: 400 {"type":"error","error":{"type":"invalid_request_error","message":"prompt is too long: 1005508 tokens > 1000000 maximum"},"request_id":"req_011CaZj6Zex8N4U35cvtY2nJ"}\n at generate (B:/~BUN/root/src/entrypoints/cli.js:11:57448)\n at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:4943)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T09:34:11.668Z"},{"error":"MaxFileReadTokenExceededError: File content (1108324 tokens) exceeds maximum allowed tokens (25000). Use offset and limit parameters to read specific portions of the file, or search for specific content instead of reading the whole file.\n at Su7 (B:/~BUN/root/src/entrypoints/cli.js:4908:12810)\n at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T09:48:07.086Z"},{"error":"MaxFileReadTokenExceededError: File content (137732 tokens) exceeds maximum allowed tokens (25000). Use offset …



Bug Description

During a long technical session with multiple pivots (audit → fix → restart → re-audit), the model exhibited several systemic defects that make it less reliable than previous versions:

Contradictory numerical estimates within a single session. When asked the exact same question ("what is a realistic rate?"), the model gave varying answers over the course of an hour: "5-7%/d", "10.8%/d", "impossible", "5-10%/d", and "this is by design — no rates need to be calculated." Each time, the answer seemed confident and well-founded, but there was no continuity between them. This is a hallucination not at the level of facts, but at the level of analytical framing.

Ignoring explicit user instructions. The user repeatedly requested that the model read the JSONL conversation history to restore the context of previous discussions. The model would reply "reading," but only performed a minimal grep and moved on to answering without genuinely reading the file. This happened several times, even after direct reprimands. This symptom suggests an RLHF reward optimized for "looking active/helpful" rather than actually following instructions.

Sycophancy under pressure. When the user expressed dissatisfaction (e.g., "this is nonsense," "I don't understand"), the model completely reversed its previous analysis without verifying if the user was actually correct. After 3-4 iterations of this reframing, the original insights were entirely lost.

Reactive instead of proactive tool usage. Features like sequential-thinking, agents, and memory access were only utilized when explicitly requested by the user, and often too late. Had these been engaged at the beginning of the session, much of the contradiction and rework could have been avoided.

Misinterpreting system states as "bugs." The model repeatedly classified expected system behavior (e.g., capital fully deployed → no new fires until resolved) as a "blocker" or a "problem." This created a false sense of urgency for non-existent fixes, a flaw the user only caught after several hours of troubleshooting.

Context loss after compaction. Following the auto-compaction of the conversation, the model lost the subtlety and nuance of earlier discussions. Even though a summary was retained, the model's reasoning became much more generic and less tied to the specific details of the task.

Comparison with previous models: Claude Sonnet 4.5 / Opus 4.6 maintained better stability in similar long-context sessions, ALTHOUGH THEY ALSO PERFORMED POORLY, ESPECIALLY IN THE 1M CONTEXT FORMAT. The current Opus 4.7 feels like a regression in coordination, despite the marketing surrounding its 1M context window.

AND THIS HAPPENS CONSISTENTLY AFTER EVERY NEW MODEL RELEASE: IT FUNCTIONS NORMALLY FOR ABOUT 2 WEEKS, AND THEN A SHARP DECLINE AND REGRESSION IN PERFORMANCE OCCURS.

Environment Info

Platform: win32
Terminal: null
Version: 2.1.123
Feedback ID: f11a9668-5d70-4eae-bf5e-4d789c502c5b

Errors

[{"error":"Error: Request was aborted.\n    at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T14:47:39.185Z"},{"error":"Error: Request was aborted.\n    at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T20:00:40.167Z"},{"error":"Error: Request was aborted.\n    at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:3448)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-29T20:29:28.654Z"},{"error":"Error: 429 {\"type\":\"error\",\"error\":{\"type\":\"rate_limit_error\",\"message\":\"Rate limited\"},\"request_id\":\"req_011CaZFvQiWm9ZL9WUvyqjg5\"}\n    at generate (B:/~BUN/root/src/entrypoints/cli.js:11:57670)\n    at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:4943)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T03:37:48.906Z"},{"error":"Error: 400 {\"type\":\"error\",\"error\":{\"type\":\"invalid_request_error\",\"message\":\"prompt is too long: 1005508 tokens > 1000000 maximum\"},\"request_id\":\"req_011CaZj6Zex8N4U35cvtY2nJ\"}\n    at generate (B:/~BUN/root/src/entrypoints/cli.js:11:57448)\n    at makeRequest (B:/~BUN/root/src/entrypoints/cli.js:50:4943)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T09:34:11.668Z"},{"error":"MaxFileReadTokenExceededError: File content (1108324 tokens) exceeds maximum allowed tokens (25000). Use offset and limit parameters to read specific portions of the file, or search for specific content instead of reading the whole file.\n    at Su7 (B:/~BUN/root/src/entrypoints/cli.js:4908:12810)\n    at processTicksAndRejections (native:7:39)","timestamp":"2026-04-30T09:48:07.086Z"},{"error":"MaxFileReadTokenExceededError: File content (137732 tokens) exceeds maximum allowed tokens (25000). Use offset …

Note: Content was truncated.
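
The error log above mixes several distinct failure kinds: aborted requests, a 429 rate limit, a 400 "prompt is too long" rejection, and oversized file reads. A minimal sketch of how such entries could be grouped before deciding what to do about each is shown here. The function name classify and the category labels are hypothetical and do not come from Claude Code; the sample strings are shortened copies of messages in the log.

```python
import re
from collections import Counter


def classify(error_text: str) -> str:
    """Map one logged error string to a coarse category (labels are illustrative)."""
    if error_text.startswith("MaxFileReadTokenExceededError"):
        return "file_read_too_large"
    if "Request was aborted" in error_text:
        return "aborted"
    # HTTP errors in the log look like "Error: <status> {json body}".
    match = re.match(r"Error: (\d{3}) ", error_text)
    if match:
        status = match.group(1)
        if status == "429":
            return "rate_limited"
        if "prompt is too long" in error_text:
            return "prompt_too_long"
        return f"http_{status}"
    return "other"


# Shortened copies of messages that appear in the log above.
sample_errors = [
    "Error: Request was aborted.",
    "Error: Request was aborted.",
    'Error: 429 {"type":"error","error":{"type":"rate_limit_error"}}',
    'Error: 400 {"type":"error","error":{"message":"prompt is too long: 1005508 tokens > 1000000 maximum"}}',
    "MaxFileReadTokenExceededError: File content (1108324 tokens) exceeds maximum allowed tokens (25000).",
]

counts = Counter(classify(text) for text in sample_errors)
for category, count in counts.most_common():
    print(f"{category}: {count}")
```

Grouping this way separates transient failures (rate limits) from ones that will repeat until the input shrinks (prompt too long, oversized file reads).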

extent analysis

TL;DR

The reported regression spans three areas: conversation-history handling, instruction following, and error handling. Addressing it may require adjustments to the model's configuration or training data.

Guidance

Review conversation-history handling to confirm that the model actually reads and retains context, for example by checking how the JSONL conversation history is processed.
Verify that the model follows explicit user instructions, such as reading the conversation history when asked, rather than answering from a minimal grep.
Investigate error handling for rate limiting (429) and invalid requests (400, "prompt is too long"), for example by retrying rate-limited requests and trimming requests that exceed the prompt limit.
Consider retraining the model with a focus on stability and consistency in long-context sessions, using earlier models such as Claude Sonnet 4.5 or Opus 4.6 as a reference.
Monitor the model's performance over time to check for the reported sharp decline after new model releases.
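
The file-read errors in the log recommend reading specific portions of a file instead of the whole thing. A minimal sketch of that idea for a JSONL conversation history follows, assuming one JSON object per line. The function name iter_jsonl_pages and the page_size parameter are hypothetical, and pages here are counted in records rather than tokens, so a real limit such as the 25000-token cap would need its own check.

```python
import json
from typing import Iterator


def iter_jsonl_pages(path: str, page_size: int = 200) -> Iterator[list[dict]]:
    """Yield the records of a JSONL file in fixed-size pages, in a single pass."""
    page: list[dict] = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            page.append(json.loads(line))
            if len(page) == page_size:
                yield page
                page = []
    if page:
        yield page  # final, possibly shorter, page


def count_records(path: str, page_size: int = 200) -> int:
    """Example consumer: walk every page without holding the whole file in memory."""
    return sum(len(page) for page in iter_jsonl_pages(path, page_size))
```

Reading page by page keeps each read small while still covering the whole history, which a single grep does not.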

Example

The issue does not include enough context for a specific fix, but the error messages suggest that the makeRequest function in cli.js may need to handle rate limiting (429) and invalid requests (400) differently.
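
A minimal sketch of one way to treat those two cases differently is to retry a 429 with exponential backoff and fail fast on a 400, since a prompt that is too long will be rejected again unchanged. This is written in Python for illustration and is not the actual makeRequest implementation; ApiError, call_with_backoff, and every parameter name are hypothetical.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ApiError(Exception):
    """Hypothetical error type carrying an HTTP status code."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status} {message}")
        self.status = status


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `request`, retrying only on 429 with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ApiError as err:
            # A 400 such as "prompt is too long" is not transient, so re-raise it.
            # Also give up once the last attempt has been used.
            if err.status != 429 or attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so the retry logic can be exercised without real waiting.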

Notes

The provided information is incomplete, and the content was truncated, which limits the ability to provide a comprehensive solution. Further investigation into the model's configuration, training data, and error handling is necessary to fully address the issues.

Recommendation

As a workaround, re-examine conversation-history handling and error handling. The root cause of the performance regression is unclear and may require significant changes to the model or its training data.
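
One concrete form such a workaround could take is a pre-flight size check that triggers compaction before the request is rejected with "prompt is too long: 1005508 tokens > 1000000 maximum". The sketch below is an assumption-heavy illustration: the four-characters-per-token estimate is a rough heuristic rather than the real tokenizer, and should_compact, estimate_tokens, and the 10% margin are hypothetical.

```python
MAX_PROMPT_TOKENS = 1_000_000  # limit quoted in the 400 error in the log
CHARS_PER_TOKEN = 4  # rough assumption; real tokenizers vary with content


def estimate_tokens(text: str) -> int:
    """Estimate the token count from character length, rounding up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def should_compact(messages: list[str], next_message: str, margin: float = 0.1) -> bool:
    """Return True when the estimated prompt would exceed the limit minus a safety margin."""
    budget = int(MAX_PROMPT_TOKENS * (1 - margin))
    used = sum(estimate_tokens(m) for m in messages) + estimate_tokens(next_message)
    return used > budget


if __name__ == "__main__":
    history = ["x" * 2_000_000, "y" * 1_800_000]  # about 950,000 estimated tokens
    print(should_compact(history, "next question"))
```

Checking before sending trades an early compaction for avoiding a failed request, at the cost of relying on an approximate count.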


#api #conversation history #configuration error #environment variable #network issue
