claude-code - 💡(How to fix) Fix [BUG] /deep-research — default workflow frequently hits "API Error: Server is temporarily limiting requests"

Root Cause

SECOND BUG — silent data destruction. A verifier that fails / abstains is scored as a REFUTATION. So a run whose search + fetch succeeded (24 sources, 64 claims extracted) returns: "All 25 claims refuted by adversarial verification. Research inconclusive." The tool reports the opposite of the truth — it found plenty, then threw it all away. This is arguably worse than the rate limit, because it looks like "no evidence found."

Code Example

API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited

---

API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited

---

agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)

---

Scope 1/1 ✔ · Search 5/5 ✔ · Fetch 24/24 ✔ · Verify 75/75 FAILED · Synthesize never ran

---

agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)

Preflight Checklist:

I have searched existing issues and this hasn't been reported yet
This is a single bug report
I am using the latest version of Claude Code

What's Wrong?

The /deep-research default workflow frequently fails with a server-side rate limit error. Only about 3 out of 10 runs complete successfully — the other ~7 terminate partway through. The default workflow spawns multiple concurrent subagents, and the burst of parallel requests appears to trip server-side rate limiting (not a personal usage-limit issue — the error explicitly states "not your usage limit").

This is actually two bugs: the rate limit itself, and a destructive-scoring bug it triggers that silently discards good research and reports the opposite of the truth.

Error returned:

API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited

What Should Happen?

/deep-research should complete its subagent fan-out without tripping rate limits — by throttling/queuing concurrent subagent requests internally, backing off and retrying transparently on a 429, or surfacing a clear retry state rather than failing the whole run. Critically, a throttled verification step should degrade gracefully (keep unverified findings, flagged) rather than discarding them. Given the current ~3/10 success rate, consider gating or deprecating the default workflow until it is reliable.

Error Messages/Logs:

API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited

Per-agent failure signature (every one of the 75 verify agents, two consecutive runs):

agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)

Steps to Reproduce:

Open Claude Code in a repository.
Run /deep-research with the default workflow on a query broad enough to spawn several concurrent subagents.
Observe the run partway through.
Error appears: API Error: Server is temporarily limiting requests (not your usage limit) · Rate limited, and the run terminates without completing (or completes but reports "research inconclusive" — see below).

Note: roughly 7 of 10 runs fail this way; only ~3 of 10 complete. Failure frequency appears tied to concurrent subagent fan-out.

Additional diagnostic detail (from the workflow run output):

This is TWO bugs — the rate limit, and a destructive-scoring bug it triggers.

Failure is isolated to the VERIFY phase, and to schema-bound subagents specifically. Phase breakdown of two consecutive failed runs (identical):
```
Scope 1/1 ✔ · Search 5/5 ✔ · Fetch 24/24 ✔ · Verify 75/75 FAILED · Synthesize never ran
```
Search and fetch are also concurrent yet survive. The verify phase fans out 3 votes × 25 claims = 75 concurrent agents, each FORCED to emit structured output. That is the widest schema-bound burst, and it is exactly where it dies.
Exact per-agent failure signature (every one of the 75):
```
agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)
```
i.e. the agent is rate-limited mid-turn, never reaches its forced StructuredOutput call, and is abandoned after 2 nudges.
SECOND BUG — silent data destruction. A verifier that fails / abstains is scored as a REFUTATION. So a run whose search + fetch succeeded (24 sources, 64 claims extracted) returns: "All 25 claims refuted by adversarial verification. Research inconclusive." The tool reports the opposite of the truth — it found plenty, then threw it all away. This is arguably worse than the rate limit, because it looks like "no evidence found."

Run stats per failure: 105 agents, ~1.05M subagent tokens, ~2m50s.

Suggested fixes (in priority order):

Make abstain / no-output ≠ refuted. A failed verifier should mark a claim unverified (kept + flagged), so a throttled verify phase degrades gracefully instead of nuking findings.
Backoff/retry on 429 inside the verify fan-out; cap verify concurrency.
Resume / checkpoint a stalled run — a run that reached "Synthesize pending" loses 100% of ~3 minutes and ~1M tokens of completed work.
Reduce the burst: 1–2 votes, or batched verification (1 agent per N claims).

Environment:

Claude Model: claude-opus-4-8
Is this a regression? No / Unknown
Last Working Version: N/A
Claude Code Version: 2.1.165
Platform: Anthropic API
Operating System: Linux
Terminal/Shell: Cursor
Plan: Claude Max subscription
Feedback ID: 382dad86-2a65-4e5e-b62f-0e6b81e137b0

Additional Information:

Errors array captured: [] (no stack trace surfaced client-side; the failure is a server-returned rate-limit message rather than a local exception).

Possibly related to concurrent subagent fan-out. See related: #63938 (Configurable concurrent subagent limit), #16157 and #38335 (usage-limit reports). A configurable concurrency cap on subagents would likely mitigate the rate limit, but fix #1 above (abstain ≠ refuted) is needed regardless, since it is what turns a transient throttle into total, silent loss of research output.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

claude-code - 💡(How to fix) Fix [BUG] /deep-research — default workflow frequently hits "API Error: Server is temporarily limiting requests"

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Code Example

Still need to ship something?

TRENDING