openclaw: [Bug]: Anthropic embedded runs timeout around ~60s on `anthropic:correct_max` regardless of configured timeoutSeconds [2 comments, 3 participants]

GitHub stats
openclaw/openclaw#60459, fetched 2026-04-08 02:50:55
Comments: 2, Participants: 3, Timeline: 3, Reactions: 0
Timeline (top): commented ×2, cross-referenced ×1



I am seeing a repeatable Anthropic-specific timeout pattern on a real OpenClaw instance over the last 24 to 48 hours. This does not behave like a normal long-running job hitting its configured timeout.

Across the last 24 hours I found 9 matching failures in local cron and task state. Every one was an Anthropic run. Seven were Claude Sonnet 4.6 and two were Claude Opus 4.6. I did not find the same repeated failure pattern for non-Anthropic runs in the same window.

The important part is that the jobs are not failing at their configured timeout. They are clustering around roughly 60 to 65 seconds, with one at 80 seconds. Here are representative cases from the last day:

2026-04-02 13:01:02 CDT | Daily Thinking Time | anthropic/claude-opus-4-6 | configured timeout 3600s | failed in 62.9s
2026-04-02 13:31:04 CDT | morning-briefing | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 69.8s
2026-04-02 13:31:04 CDT | Morning News Scan | anthropic/claude-sonnet-4-6 | configured timeout 300s | failed in 62.6s
2026-04-02 21:31:03 CDT | Nightly Episodic Cleanup | anthropic/claude-sonnet-4-6 | configured timeout 600s | failed in 63.4s
2026-04-02 22:31:03 CDT | Granola Enrichment | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 63.1s
2026-04-03 00:31:04 CDT | Francisco Health Check (Night) | anthropic/claude-sonnet-4-6 | configured timeout 120s | failed in 64.7s
2026-04-03 08:01:03 CDT | morning-briefing | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 63.5s
2026-04-03 09:01:03 CDT | Daily #lab Comic Strip | anthropic/claude-opus-4-6 | configured timeout 300s | failed in 63.4s

The surfaced error is always the same:

Request timed out before a response was generated. Please try again, or increase `agents.defaults.timeoutSeconds` in your config.

That advice appears to be wrong for this failure mode. One affected job had timeoutSeconds set to 3600 and still failed after 62.9 seconds. Another had 900 and failed after 63.5. Another had 300 and failed after 62.6. So whatever boundary is firing here does not appear to be the job timeout the user is being told to change.
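To make the mismatch concrete, here is a minimal sketch that parses the pipe-delimited rows above and compares each elapsed time with its configured timeout. The row layout is taken from the table as quoted in this report; `FailedRun` and `parse_row` are hypothetical helper names and not part of OpenClaw.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class FailedRun:
    job: str
    model: str
    configured_s: float
    elapsed_s: float


def parse_row(line: str) -> FailedRun:
    """Parse 'timestamp | job | model | configured timeout Ns | failed in Ns'."""
    _, job, model, configured, failed = [part.strip() for part in line.split("|")]
    configured_s = float(configured.removeprefix("configured timeout ").removesuffix("s"))
    elapsed_s = float(failed.removeprefix("failed in ").removesuffix("s"))
    return FailedRun(job, model, configured_s, elapsed_s)


# Rows copied from the report above.
ROWS = """\
2026-04-02 13:01:02 CDT | Daily Thinking Time | anthropic/claude-opus-4-6 | configured timeout 3600s | failed in 62.9s
2026-04-02 13:31:04 CDT | morning-briefing | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 69.8s
2026-04-02 13:31:04 CDT | Morning News Scan | anthropic/claude-sonnet-4-6 | configured timeout 300s | failed in 62.6s
2026-04-02 21:31:03 CDT | Nightly Episodic Cleanup | anthropic/claude-sonnet-4-6 | configured timeout 600s | failed in 63.4s
2026-04-02 22:31:03 CDT | Granola Enrichment | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 63.1s
2026-04-03 00:31:04 CDT | Francisco Health Check (Night) | anthropic/claude-sonnet-4-6 | configured timeout 120s | failed in 64.7s
2026-04-03 08:01:03 CDT | morning-briefing | anthropic/claude-sonnet-4-6 | configured timeout 900s | failed in 63.5s
2026-04-03 09:01:03 CDT | Daily #lab Comic Strip | anthropic/claude-opus-4-6 | configured timeout 300s | failed in 63.4s
""".splitlines()

runs = [parse_row(row) for row in ROWS]
for run in runs:
    share = run.elapsed_s / run.configured_s
    print(f"{run.job:32} {run.elapsed_s:5.1f}s of {run.configured_s:5.0f}s ({share:.0%})")
print("median elapsed:", median(run.elapsed_s for run in runs), "s")
```

Every listed run stops well short of its configured timeout, and the elapsed times barely move while the configured values range from 120 to 3600 seconds.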

The gateway logs point to the Anthropic embedded path, specifically the auth profile anthropic:correct_max. Representative lines:

2026-04-03T08:01:02.586-05:00 [agent/embedded] Profile anthropic:correct_max timed out. Trying next account...
2026-04-03T08:01:02.588-05:00 [agent/embedded] embedded run failover decision: runId=13eb66e0-ebc2-40de-94c2-b3d70aff2d42 stage=assistant decision=surface_error reason=timeout provider=anthropic/claude-sonnet-4-6 profile=sha256:15481447beae

2026-04-03T09:01:02.527-05:00 [agent/embedded] Profile anthropic:correct_max timed out. Trying next account...
2026-04-03T09:01:02.529-05:00 [agent/embedded] embedded run failover decision: runId=d5cd68f8-a1f3-46a3-92d7-8cbc927142b8 stage=assistant decision=surface_error reason=timeout provider=anthropic/claude-opus-4-6 profile=sha256:15481447beae
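For anyone collecting more of these, a small sketch for pulling the key=value fields out of a failover decision line. It assumes the fields are whitespace-separated `key=value` pairs, as in the lines quoted above; `parse_failover` is a hypothetical helper, not an OpenClaw function.

```python
import re

# One failover decision line, copied from the gateway log excerpt above.
LINE = (
    "2026-04-03T08:01:02.588-05:00 [agent/embedded] embedded run failover decision: "
    "runId=13eb66e0-ebc2-40de-94c2-b3d70aff2d42 stage=assistant decision=surface_error "
    "reason=timeout provider=anthropic/claude-sonnet-4-6 profile=sha256:15481447beae"
)

# Assumption: every field is a word, '=', then a run of non-space characters.
FIELD = re.compile(r"(\w+)=(\S+)")


def parse_failover(line: str) -> dict[str, str]:
    """Return the key=value fields of a failover decision log line."""
    return dict(FIELD.findall(line))


fields = parse_failover(LINE)
print(fields["decision"], fields["reason"], fields["provider"])
```

In both quoted pairs the first line says "Trying next account..." while the decision recorded two milliseconds later is `surface_error`, which may be worth checking when grouping lines by `runId`.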

I also checked local config. This instance does not set agents.defaults.timeoutSeconds, agents.defaults.llm.idleTimeoutSeconds, or agents.defaults.embeddedPi. I also checked current source. The generic agent timeout default is much higher than 60 seconds, and src/agents/pi-embedded-runner/run/llm-idle-timeout.ts appears to default to 300 seconds when unset. That makes the observed boundary of roughly 63 seconds especially suspicious.
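The config check can be repeated with a sketch like the following, which reports which of the three dotted keys are present in an already-loaded config. It assumes the config is available as nested dictionaries; the sample `config` value and its `model` key are made up for illustration, and `is_set` is a hypothetical helper.

```python
# The three keys named in the paragraph above.
KEYS = (
    "agents.defaults.timeoutSeconds",
    "agents.defaults.llm.idleTimeoutSeconds",
    "agents.defaults.embeddedPi",
)


def is_set(config: dict, dotted_key: str) -> bool:
    """Return True if every segment of dotted_key exists in the nested dict."""
    node = config
    for part in dotted_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


# Illustrative config only; a real instance would load its own file here.
config = {"agents": {"defaults": {"model": "anthropic/claude-sonnet-4-6"}}}

unset = [key for key in KEYS if not is_set(config, key)]
print("unset keys:", unset)
```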

I suspect two separate problems are getting conflated here. The first is an Anthropic-specific timeout on the embedded path, or on a specific Anthropic auth profile, around the one-minute mark. The second is that OpenClaw then surfaces a generic remediation message that points users at agents.defaults.timeoutSeconds, even when the failing boundary is clearly somewhere else.

Anthropic may still be part of the trigger here. They had recent timeout-related status incidents, so upstream slowness could be what exposes this. But even if upstream latency is the trigger, OpenClaw still appears to be handling it incorrectly on this path. The tight clustering around roughly one minute, regardless of much higher job timeouts, is the part that looks wrong.

This seems related to #34644, #51057, and #58711, but I do not think any one of those fully captures this exact failure mode. The distinctive behavior here is that Anthropic runs on the embedded path time out around one minute, the logs blame a specific profile, and the user gets told to change a timeout that does not control the observed failure.

What I expected was straightforward: either the run should be allowed to continue until its configured timeout, or OpenClaw should surface the actual lower-level timeout that fired, ideally with enough detail to distinguish a total timeout from a first-token timeout, an idle timeout, or a provider-specific timeout. What I am seeing instead is a generic timeout message that points at the wrong knob and makes the issue look like a local misconfiguration when it does not appear to be one.
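As a rough illustration of the kind of error detail I mean, here is a sketch of a timeout report that names which boundary fired and only suggests a config key when one actually controls it. It is written in Python for brevity and is not OpenClaw's implementation; `TimeoutKind`, `TimeoutReport`, and the message wording are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TimeoutKind(Enum):
    TOTAL = "total run timeout"
    FIRST_TOKEN = "first-token timeout"
    IDLE = "idle timeout"
    PROVIDER = "provider-specific timeout"


@dataclass
class TimeoutReport:
    kind: TimeoutKind
    limit_s: Optional[float]   # the limit that fired, if known
    elapsed_s: float           # how long the run lasted
    config_key: Optional[str]  # setting that controls this limit, if any

    def message(self) -> str:
        limit = f"{self.limit_s:g}s" if self.limit_s is not None else "unknown limit"
        text = f"Request hit the {self.kind.value} ({limit}) after {self.elapsed_s:.1f}s."
        if self.config_key:
            text += f" To allow more time, increase `{self.config_key}`."
        else:
            text += " No user-configurable setting is known to control this limit."
        return text


# A run like the ones in this report: the boundary is not the job timeout.
print(TimeoutReport(TimeoutKind.PROVIDER, None, 62.9, None).message())
# A run that really did exhaust its configured job timeout.
print(TimeoutReport(TimeoutKind.TOTAL, 300, 300.2, "agents.defaults.timeoutSeconds").message())
```

The point of the design is that the remediation hint is attached to the boundary that fired, so a message about agents.defaults.timeoutSeconds can only appear when that setting was the limit reached.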

If it is helpful, I can gather and attach a few more sanitized log windows, but I wanted to get the core pattern filed first because this has been littering real scheduled jobs for the last two days.

extent analysis

TL;DR

Raising agents.defaults.timeoutSeconds will not fix these failures: they fire at roughly 60 to 65 seconds regardless of the configured job timeout. Investigate which lower-level timeout on the Anthropic embedded path is firing.

Guidance

  1. Review Anthropic auth profile timeouts: Check the configuration and logs for the anthropic:correct_max profile to see if there's a specific timeout setting that's causing the failures around the 60-65 second mark.
  2. Distinguish between job and provider timeouts: Modify the error handling to surface the actual timeout that fired, whether it's a job timeout, idle timeout, or provider-specific timeout, to help identify the root cause.
  3. Verify OpenClaw's timeout handling: Investigate why OpenClaw is surfacing a generic remediation message pointing to agents.defaults.timeoutSeconds when the failing boundary appears to be elsewhere.
  4. Check for upstream Anthropic issues: Although the problem seems to be with OpenClaw's handling, verify if there are any ongoing or recent issues with Anthropic's services that could be contributing to the timeouts.
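A quick triage heuristic for step 2 can be sketched as follows: compare the elapsed time against each known limit and see whether any is close enough to be the one that fired. The 300-second idle value is the default the report says the source appears to use; the 5-second tolerance and the `matching_limits` helper are assumptions for illustration.

```python
def matching_limits(elapsed_s: float, limits: dict[str, float], tolerance_s: float = 5.0) -> list[str]:
    """Return the names of limits within tolerance_s of the elapsed time."""
    return [name for name, limit_s in limits.items() if abs(elapsed_s - limit_s) <= tolerance_s]


# Known limits for the first run in the report (configured timeout 3600s).
limits = {
    "job timeoutSeconds": 3600.0,
    "LLM idle timeout (apparent source default)": 300.0,
}

print(matching_limits(62.9, limits))   # the observed failure
print(matching_limits(299.0, limits))  # what an idle timeout would look like
```

An empty result for the observed 62.9 seconds means none of the known limits explains the failure, which is the case described in the report.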

Example

The issue does not include a code example or a code-level fix. The evidence is the run durations and gateway log lines quoted in the report.

Notes

The issue appears to be specific to the Anthropic embedded path and the anthropic:correct_max auth profile. The failures are not consistent with the configured job timeouts, suggesting a separate timeout setting is being triggered.

Recommendation

The issue does not confirm a workaround. Raising agents.defaults.timeoutSeconds will not help, since runs configured with 3600, 900, and 300 seconds all failed at roughly 63 seconds. If the Anthropic embedded path or the anthropic:correct_max profile turns out to have its own timeout setting, raising it may work around the failures. Separately, the error handling should be changed to surface the actual timeout that fired.
