openclaw - 💡(How to fix) Fix Ollama: gateway times out at 60s despite model responding in <10s via direct curl [1 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#62911Fetched 2026-04-09 08:00:51
View on GitHub
Comments
1
Participants
2
Timeline
2
Reactions
0
Timeline (top)
closed ×1commented ×1

OpenClaw's gateway times out on all Ollama requests after exactly 60 seconds, while the same model responds correctly in <10s via direct curl. The timeout fires even with a pre-warmed model already resident in VRAM. The Ollama stack is healthy — the break is in the gateway's HTTP streaming path to Ollama.

Error Message

[agent/embedded] embedded run failover decision: reason=timeout provider=ollama/qwen2.5:3b [diagnostic] lane task error: lane=main durationMs=60371 error="FailoverError: LLM request timed out." [diagnostic] lane task error: lane=session:agent:quinn:main durationMs=60373 [model-fallback/decision] candidate_failed reason=unknown next=ollama/qwen2.5:7b [diagnostic] lane task error: lane=main durationMs=60733 error="FailoverError: LLM request timed out."

Root Cause

OpenClaw's gateway times out on all Ollama requests after exactly 60 seconds, while the same model responds correctly in <10s via direct curl. The timeout fires even with a pre-warmed model already resident in VRAM. The Ollama stack is healthy — the break is in the gateway's HTTP streaming path to Ollama.

Fix Action

Workaround

Direct curl http://localhost:11434/api/chat works correctly. openclaw agent against Ollama-backed agents is non-functional on this hardware until resolved.

Code Example

[agent/embedded] embedded run failover decision: reason=timeout provider=ollama/qwen2.5:3b
[diagnostic] lane task error: lane=main durationMs=60371 error="FailoverError: LLM request timed out."
[diagnostic] lane task error: lane=session:agent:quinn:main durationMs=60373
[model-fallback/decision] candidate_failed reason=unknown next=ollama/qwen2.5:7b
[diagnostic] lane task error: lane=main durationMs=60733 error="FailoverError: LLM request timed out."
RAW_BUFFERClick to expand / collapse

Summary

OpenClaw's gateway times out on all Ollama requests after exactly 60 seconds, while the same model responds correctly in <10s via direct curl. The timeout fires even with a pre-warmed model already resident in VRAM. The Ollama stack is healthy — the break is in the gateway's HTTP streaming path to Ollama.

Diagnostic evidence

TestMethodResult
qwen2.5:3b 5-token responsecurl /api/generate stream:false✅ 9.7s
qwen2.5:3b 5-token responsecurl /api/chat stream:false✅ 25.6s
qwen2.5:3b first tokencurl /api/chat stream:true4.2s
openclaw agent --agent quinn (same prompt)gateway path❌ 60s timeout, every time

Both models in the fallback chain (qwen2.5:3b, qwen2.5:7b) time out at the same threshold. The gateway logs show the timeout fires in the lane task layer:

[agent/embedded] embedded run failover decision: reason=timeout provider=ollama/qwen2.5:3b
[diagnostic] lane task error: lane=main durationMs=60371 error="FailoverError: LLM request timed out."
[diagnostic] lane task error: lane=session:agent:quinn:main durationMs=60373
[model-fallback/decision] candidate_failed reason=unknown next=ollama/qwen2.5:7b
[diagnostic] lane task error: lane=main durationMs=60733 error="FailoverError: LLM request timed out."

Key observations

  1. Not a cold-start problem. Pre-warmed model (already resident, 2279MB VRAM), expires_at confirming it was loaded — same 60s timeout.
  2. Not a model performance problem. Direct curl /api/chat stream:true returns first token in 4.2 seconds.
  3. Hardcoded 60s timeout. Both fallback models hit exactly 60s. No config key found in configuration-reference.md or schema to adjust this. agents.defaults.timeoutSeconds: 1200 and openclaw agent --timeout do not affect it.
  4. Gateway-first, embedded-second, both fail. The diagnostic output shows Gateway agent failed; falling back to embedded — the embedded path uses the same timeout, so fallback does not help.
  5. Exit code unreliable. openclaw agent returned exit 0 on the first failure run and exit 1 on the second — cannot be used for programmatic pass/fail detection.

Environment

  • OS: Ubuntu 24.04.4 LTS (kernel 6.17.0-20-generic)
  • OpenClaw: v2026.4.2
  • Hardware: Lenovo V15 G4 ABP — AMD Ryzen 7 7000, 40GB RAM, AMD Radeon iGPU (Vulkan)
  • Ollama: localhost:11434, OLLAMA_VULKAN=1, OLLAMA_FLASH_ATTENTION=1
  • Model: qwen2.5:3b (primary), qwen2.5:7b (fallback)
  • Provider config: baseUrl: "http://127.0.0.1:11434", api: "ollama", apiKey: "ollama-local"

Expected behavior

  1. A configurable models.providers.ollama.requestTimeoutMs (or equivalent) key so local model latency can be accommodated.
  2. Alternatively: the gateway streaming path should begin the timeout clock from first-token receipt, not from connection open — since local models may have high prompt-eval latency before any tokens stream.

Workaround

Direct curl http://localhost:11434/api/chat works correctly. openclaw agent against Ollama-backed agents is non-functional on this hardware until resolved.

extent analysis

TL;DR

The most likely fix is to adjust the timeout configuration in the OpenClaw gateway to accommodate the local model latency.

Guidance

  • Investigate the OpenClaw configuration to find a potential timeout setting that can be adjusted to a value higher than 60 seconds, allowing the local model to respond without timing out.
  • Verify that the agents.defaults.timeoutSeconds and openclaw agent --timeout options do not affect the hardcoded 60s timeout, and look for alternative configuration options.
  • Consider implementing a workaround by using direct curl requests to the Ollama API, as this has been shown to work correctly.
  • Review the OpenClaw documentation and source code to determine if there is a configurable timeout setting, such as models.providers.ollama.requestTimeoutMs, that can be used to accommodate local model latency.

Example

No code snippet is provided as the issue does not imply a specific code change, but rather a configuration adjustment.

Notes

The exact configuration option to adjust the timeout is not specified in the provided information, so further investigation into the OpenClaw documentation and source code is necessary.

Recommendation

Apply a workaround by using direct curl requests to the Ollama API until a configurable timeout setting is found or implemented in the OpenClaw gateway.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

  1. A configurable models.providers.ollama.requestTimeoutMs (or equivalent) key so local model latency can be accommodated.
  2. Alternatively: the gateway streaming path should begin the timeout clock from first-token receipt, not from connection open — since local models may have high prompt-eval latency before any tokens stream.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING