claude-code - 💡(How to fix) Fix [BUG] Opus 4.7 (1M context) severe latency regression starting evening of Apr 24, 2026 [4 comments, 2 participants]

GitHub stats
anthropics/claude-code#53234
Fetched 2026-04-26 05:20:57
Comments: 4
Participants: 2
Timeline: 12
Reactions: 0
Timeline (top): commented ×4, labeled ×3, mentioned ×2, subscribed ×2

Error Message

Error Messages/Logs

No error messages — the session does not error or time out, it simply takes 5–7+ minutes to return a response. UI shows extended "Baking…" / "Clauding…" / "Crystallizing…" states with elapsed counters in the multiple-minutes range.


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Claude Code with Opus 4.7 (1M context, xhigh effort) has become 10–15x slower than baseline starting the evening of April 24, 2026 (~19:00–20:00 UK time / ~18:00–19:00 UTC).

A simple /prime command on my project that has consistently taken ~30 seconds for the past 2+ weeks now takes 5–7+ minutes for the exact same operation, on the same project, with no changes on my side (same version, same config, same files, same model).

The output is correct and well-formed once it eventually completes — this is a pure latency issue, not a quality regression. Most of the wall-clock time is spent in the "Baking…" / "Clauding…" / "Crystallizing…" phase before any output appears.

Note: I'm on Claude Code v2.1.119, which is past the v2.1.116 fixes referenced in Anthropic's April 23 postmortem (https://www.anthropic.com/engineering/april-23-postmortem). This regression appeared after those fixes shipped, so this is likely a new/distinct issue.

What Should Happen?

/prime and similar lightweight commands (single small file read, ~150–200 input tokens) should complete in roughly 30 seconds, as they consistently did for the 2+ weeks prior to Apr 24, 2026.

Error Messages/Logs

No error messages — the session does not error or time out, it simply takes 5–7+ minutes to return a response. UI shows extended "Baking…" / "Clauding…" / "Crystallizing…" states with elapsed counters in the multiple-minutes range.

Steps to Reproduce

  1. Open Claude Code in a project on macOS (Mac mini).
  2. Confirm the model header reads: "Opus 4.7 (1M context) with xhigh effort · Claude Max".
  3. Run any lightweight slash command that reads a small file. In my case: /prime, which reads a single ROADMAP.md (~161 input tokens) and produces a short status summary.
  4. Observe elapsed time before completion.

Expected: ~30s. Observed (consecutive runs on Apr 25, 2026): 5m 13s, then 6m 58s.
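
As a quick sanity check on the figures above, the two observed runs can be compared against the ~30 s baseline. This is only illustrative arithmetic on the numbers quoted in the report; the names `BASELINE_S`, `to_seconds`, and `observed` are made up for the example.

```python
# Timings quoted in the report. The 30 s baseline is the reporter's
# approximate figure, not a measured value.
BASELINE_S = 30


def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'Xm Ys' elapsed counter to whole seconds."""
    return minutes * 60 + seconds


# Consecutive runs observed on Apr 25, 2026: 5m 13s, then 6m 58s.
observed = {"run 1": to_seconds(5, 13), "run 2": to_seconds(6, 58)}

# Slowdown factor relative to the baseline for each run.
slowdowns = {name: secs / BASELINE_S for name, secs in observed.items()}

for name, factor in slowdowns.items():
    print(f"{name}: {observed[name]} s, {factor:.1f}x baseline")
# run 1: 313 s, 10.4x baseline
# run 2: 418 s, 13.9x baseline
```

Both runs fall inside the 10–15x range stated in the report.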

Reproducible across:

  • Fresh sessions (not a stale-session issue)
  • Different terminals (VS Code integrated terminal and Apple Terminal)
  • Different commands (not specific to /prime — also seen with general queries)

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

2.1.119 — same version. The version did not change. Was working at ~30s latency on this exact version up to the evening of Apr 24, 2026, then regressed without any local change. This is a backend regression rather than a CLI version regression.

Claude Code Version

2.1.119 (Claude Code)

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

VS Code integrated terminal

Additional Information

Suspected cause: Given the timing (post the v2.1.116 fixes from the Apr 23 postmortem) and that the symptom is specific to the Opus 4.7 1M context path, candidate causes include:

  1. A backend serving issue specific to the 1M context variant — similar in shape to the 2025 misrouting incident where short-context requests were routed to 1M context servers (https://www.anthropic.com/engineering/a-postmortem-of-three-recent-issues).
  2. A new regression introduced after v2.1.116.
  3. Capacity / queueing on the Opus 4.7 serving path.

Related open issues (similar symptoms, not yet resolved):

  • #50623 — Opus 4.7 performance degradation and excessive token consumption (Apr 19, still open)
  • #49244 — Opus model quality regression starting ~Apr 15
  • #37422 — Opus extremely slow inference (closed as dup, identical symptom shape)

Plan: Claude Max

Project: small-to-medium codebase, well under 200k context, so there is no legitimate reason for the 1M variant to be needed for these requests.

Happy to provide session feedback IDs, additional traces, or run diagnostic commands if useful.

extent analysis

TL;DR

The slowdown looks server-side rather than local: the likely causes are a backend serving issue or capacity/queueing on the Opus 4.7 serving path, possibly specific to the 1M context variant. The issue identifies no confirmed fix or client-side workaround.

Guidance

  • Investigate if the issue is related to a backend serving problem, similar to the 2025 misrouting incident, where short-context requests were routed to 1M context servers.
  • Check for any new regressions introduced after v2.1.116 that might be causing the latency issue.
  • Look into capacity/queueing on the Opus 4.7 serving path as a potential cause for the slowdown.
  • Consider providing session feedback IDs, additional traces, or running diagnostic commands to help diagnose the issue.

Example

No code snippet is provided as the issue seems to be related to backend serving or capacity/queueing rather than a code-specific problem.
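
That said, a small client-side timing harness can help gather the "additional traces" the reporter offers to provide. The following is a minimal sketch, not part of the original issue: it times an arbitrary command over several runs and compares the median with the ~30 s baseline. The helper names `time_command` and `summarize` are hypothetical, and the command shown is a harmless placeholder; which command reproduces the slow path non-interactively is left as an assumption for the reader to fill in.

```python
import statistics
import subprocess
import sys
import time


def time_command(argv, runs=3, timeout_s=900):
    """Run `argv` `runs` times and return wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only elapsed time matters here.
        subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations, baseline_s=30.0):
    """Summarize durations against a baseline (30 s is the report's figure)."""
    median = statistics.median(durations)
    return {
        "runs": len(durations),
        "median_s": median,
        "max_s": max(durations),
        "slowdown_vs_baseline": median / baseline_s,
    }


if __name__ == "__main__":
    # Placeholder command (assumption): replace with whatever invocation
    # reproduces the slow request in your environment.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Recording the median and maximum over a few runs, together with timestamps, makes it easier to show whether the latency is consistent or intermittent when attaching evidence to the issue.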

Notes

The issue appears to be specific to the Opus 4.7 1M context path and started after the v2.1.116 fixes were implemented, suggesting a potential backend regression. The fact that the issue is reproducible across different terminals and commands suggests a systemic problem rather than a local configuration issue.

Recommendation

No confirmed workaround is available from the issue. Because the problem appears to lie in backend serving or capacity/queueing rather than in local code or configuration, a fix would have to come from investigating and, if needed, tuning the backend serving configuration for the Opus 4.7 1M context path.
