claude-code - 💡(How to fix) Fix [BUG] Workflow harness retries indefinitely on HTTP 429 rate_limit — 97 agents, 2M tokens burned in 34 seconds ()

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

Error Message

USER-FACING (what surfaced):

<status>failed</status>

<summary>Error: agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)</summary> <failures> parallel[0] failed: agent({schema}): subagent completed without calling StructuredOutput parallel[1] failed: ... [same line repeated ~15 times across retries] </failures> <usage> <agent_count>97</agent_count> <subagent_tokens>2077133</subagent_tokens> <tool_uses>282</tool_uses> <duration_ms>628578</duration_ms> </usage>

ACTUAL ROOT CAUSE (extracted from agent JSONL, not surfaced to user):

{ "timestamp": "2026-05-31T17:31:37.542Z", "apiErrorStatus": 429, "requestId": "req_011Cbb3afFu5gMMi4ZPGSnZt", "error": "rate_limit", "isApiErrorMessage": true, "text": "You've hit your session limit · resets 10:10pm (Europe/Kiev)", "stop_reason": "stop_sequence", "usage": { "input_tokens": 0, "output_tokens": 0 } }

RETRY LOOP METRICS:

  • 37 retries in 34.493 seconds (avg interval 0.96s, no backoff)
  • First 429: 2026-05-31T17:31:25.988Z
  • Last 429: 2026-05-31T17:32:00.481Z
  • All 37 retries got identical 429 with identical error text
  • Workflow continued retrying long past the point where the response was clearly not transient

Root Cause

The user-facing error hides the root cause: "subagent completed without calling StructuredOutput (after 2 in-conversation nudges)". The real cause was only visible by grepping agent JSONL transcripts: "error":"rate_limit", "apiErrorStatus":429, "text":"You've hit your session limit".

Code Example

USER-FACING (what surfaced):

<status>failed</status>
<summary>Error: agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)</summary>
<failures>
  parallel[0] failed: agent({schema}): subagent completed without calling StructuredOutput
  parallel[1] failed: ...
  [same line repeated ~15 times across retries]
</failures>
<usage>
  <agent_count>97</agent_count>
  <subagent_tokens>2077133</subagent_tokens>
  <tool_uses>282</tool_uses>
  <duration_ms>628578</duration_ms>
</usage>


ACTUAL ROOT CAUSE (extracted from agent JSONL, not surfaced to user):

{
  "timestamp": "2026-05-31T17:31:37.542Z",
  "apiErrorStatus": 429,
  "requestId": "req_011Cbb3afFu5gMMi4ZPGSnZt",
  "error": "rate_limit",
  "isApiErrorMessage": true,
  "text": "You've hit your session limit · resets 10:10pm (Europe/Kiev)",
  "stop_reason": "stop_sequence",
  "usage": { "input_tokens": 0, "output_tokens": 0 }
}


RETRY LOOP METRICS:
- 37 retries in 34.493 seconds (avg interval 0.96s, no backoff)
- First 429: 2026-05-31T17:31:25.988Z
- Last 429:  2026-05-31T17:32:00.481Z
- All 37 retries got identical 429 with identical error text
- Workflow continued retrying long past the point where the response was clearly not transient
RAW_BUFFERClick to expand / collapse

Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

I'm providing the information that Claude Code was able to collect about this issue. It's possible that its explanation of the cause is incorrect, but I hope the data is sufficient for your team to investigate the underlying problem.

The main issue is that a small and relatively simple request consumed about 80% of my 5-hour usage limit on the Max (5x) plan yesterday. The same thing happened again today and consumed roughly 50% of the limit.

Thank you in advance for looking into this.

The deep-research workflow primitive treats HTTP 429 rate_limit responses from the Anthropic API as "subagent completed without calling StructuredOutput" and enters a tight retry loop with no exponential backoff and no circuit breaker.

In my run, a subagent hit my session limit mid-workflow (HTTP 429). The harness then fired 37 retries in 34.5 seconds (~1 retry/sec). Each retry was itself a billable API call counting toward the same exhausted limit, so the loop actively prevented recovery. Final stats: 97 agent invocations, 2,077,133 subagent tokens, 10.5 min wall-clock — for a research query that should have cost <50k tokens.

The user-facing error hides the root cause: "subagent completed without calling StructuredOutput (after 2 in-conversation nudges)". The real cause was only visible by grepping agent JSONL transcripts: "error":"rate_limit", "apiErrorStatus":429, "text":"You've hit your session limit".

Three distinct defects stack here:

  1. agent({schema}) primitive doesn't distinguish HTTP errors from genuine missing-StructuredOutput
  2. No exponential backoff and no circuit breaker on retries
  3. User-facing error message masks the underlying API error

But yesterday there was an even worse issue where nearly 4 million tokens were consumed by a relatively small request. Unfortunately, I no longer have the logs or data from that incident.

What Should Happen?

When a subagent returns HTTP 429 / 5xx / quota error from the Anthropic API, the workflow harness should:

  1. Detect error === "rate_limit" or apiErrorStatus >= 429 and abort the entire workflow immediately — do NOT classify as "missing StructuredOutput call".
  2. Surface the real API error to the user: e.g. "Workflow aborted — API session limit reached, resets at [time]."
  3. For genuine transient errors (different from quota), use exponential backoff (1s → 2s → 4s, jittered) and a max-3-consecutive-identical-failures circuit breaker.

Error Messages/Logs

USER-FACING (what surfaced):

<status>failed</status>
<summary>Error: agent({schema}): subagent completed without calling StructuredOutput (after 2 in-conversation nudges)</summary>
<failures>
  parallel[0] failed: agent({schema}): subagent completed without calling StructuredOutput
  parallel[1] failed: ...
  [same line repeated ~15 times across retries]
</failures>
<usage>
  <agent_count>97</agent_count>
  <subagent_tokens>2077133</subagent_tokens>
  <tool_uses>282</tool_uses>
  <duration_ms>628578</duration_ms>
</usage>


ACTUAL ROOT CAUSE (extracted from agent JSONL, not surfaced to user):

{
  "timestamp": "2026-05-31T17:31:37.542Z",
  "apiErrorStatus": 429,
  "requestId": "req_011Cbb3afFu5gMMi4ZPGSnZt",
  "error": "rate_limit",
  "isApiErrorMessage": true,
  "text": "You've hit your session limit · resets 10:10pm (Europe/Kiev)",
  "stop_reason": "stop_sequence",
  "usage": { "input_tokens": 0, "output_tokens": 0 }
}


RETRY LOOP METRICS:
- 37 retries in 34.493 seconds (avg interval 0.96s, no backoff)
- First 429: 2026-05-31T17:31:25.988Z
- Last 429:  2026-05-31T17:32:00.481Z
- All 37 retries got identical 429 with identical error text
- Workflow continued retrying long past the point where the response was clearly not transient

Steps to Reproduce

  1. Run any user session up to ~40-50% of the Pro 5-hour token quota.

  2. Invoke the built-in deep-research skill with a query complex enough to spawn 60+ subagents (multi-aspect research with cross-source verification).

  3. Workflow proceeds through Scope → Search → Fetch phases normally (legitimate token use).

  4. Mid-Verify phase the session limit is crossed. Next subagent returns HTTP 429.

  5. Observe: workflow does NOT abort. It retries the failing subagent every ~1 second for 34 seconds, spawning ~37 retries — each one returning the same 429 and each one counting toward the now-exhausted limit.

  6. Eventually orchestrator gives up. Final: 97 agent_count, 2M subagent tokens consumed for zero usable output.

Expected: at step 5, on the FIRST 429, workflow should abort and surface the API error.

Claude Model

Opus

Is this a regression?

I don't know

Last Working Version

No response

Claude Code Version

2.1.156

Platform

Anthropic API

Operating System

macOS

Terminal/Shell

VS Code integrated terminal

Additional Information

Interface: Claude Code VSCode extension (transcripts show entrypoint: claude-vscode). The Terminal/Shell field doesn't have a VSCode option — consider adding one.

Platform detail: macOS Darwin 24.5.0.

Key identifiers for your log lookup:

  • Workflow Run ID: wf_6a68935f-35f
  • Task ID: whdk5dfwz
  • Session ID: 83411bf2-c4b1-4d00-9e6e-ba0260b9df3d
  • First failed Anthropic requestId: req_011Cbb3afFu5gMMi4ZPGSnZt
  • First 429: 2026-05-31T17:31:25.988Z
  • Last 429: 2026-05-31T17:32:00.481Z

Full transcripts (98 JSONL files, 37 contain 429 errors) are on disk at: ~/.claude/projects/<session>/subagents/workflows/wf_6a68935f-35f/

Happy to share privately if useful for repro.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING

claude-code - 💡(How to fix) Fix [BUG] Workflow harness retries indefinitely on HTTP 429 rate_limit — 97 agents, 2M tokens burned in 34 seconds ()