openclaw - Fix Ollama `glm-5.1:cloud` stalls after tool results in agent runs while direct Ollama App chat works



Summary

When OpenClaw is configured to use Ollama with glm-5.1:cloud, normal chat in the Ollama App works, but agent runs in OpenClaw can stall after a successful tool execution. The UI shows tool output, then no assistant response arrives for a long time, and the run eventually ends with timeout/abort/network errors.

This appears to be specifically in the agent/tool loop path (toolResult -> next model response), not in simple direct chat.

Environment

  • OpenClaw: 2026.5.7
  • Model: ollama/glm-5.1:cloud
  • API type: ollama
  • Primary Ollama base URL: http://127.0.0.1:11434
  • Agent defaults include:
    • streaming: true
    • thinkingDefault: high
  • I also had multiple Ollama fallbacks configured (ollama2 .. ollama5) pointing to the same model/provider family before falling back to NVIDIA.

What works

  • glm-5.1:cloud in the Ollama App works normally for direct chat.
  • Short/simple OpenClaw prompts may also work.

What fails

  • Multi-step OpenClaw agent runs that use tools.
  • After a tool completes successfully and its result is visible, OpenClaw intermittently fails to receive a usable next assistant response from glm-5.1:cloud.
  • From the user side this looks like a freeze/hang.

Expected behavior

After a successful tool call and tool result, OpenClaw should receive the next assistant response or fail over quickly and visibly.
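
The "fail over quickly and visibly" behavior can be sketched as a per-candidate deadline around the model call that follows a tool result. This is a minimal illustration, not OpenClaw's implementation: `call_with_failover`, the `ModelCall` type, and the `log` callback are hypothetical names, and each model call is assumed to be a plain blocking function.

```python
import concurrent.futures
from typing import Callable, List, Sequence, Tuple

# Hypothetical shape: a model call takes the message list and returns assistant text.
ModelCall = Callable[[List[dict]], str]


def call_with_failover(
    messages: List[dict],
    candidates: Sequence[Tuple[str, ModelCall]],
    timeout_s: float,
    log: Callable[[str], None] = print,
) -> Tuple[str, str]:
    """Try each (name, call) in order; abandon a candidate after timeout_s seconds."""
    for name, call in candidates:
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(call, messages)
        try:
            return name, future.result(timeout=timeout_s)
        except concurrent.futures.TimeoutError:
            # Make the stall visible instead of leaving the UI silent.
            log(f"{name}: no response within {timeout_s}s after tool result, falling back")
        except Exception as exc:
            log(f"{name}: failed ({exc}), falling back")
        finally:
            # Does not kill a running worker thread; a real client would also
            # abort the underlying HTTP request here.
            pool.shutdown(wait=False, cancel_futures=True)
    raise RuntimeError("all candidates failed")
```

With candidates named `ollama/glm-5.1:cloud` and `nvidia/z-ai/glm-5.1`, a stalled primary would produce one log line and then a response from the fallback, rather than a long silent wait.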

Actual behavior

The run can stall between toolResult and the next assistant message, then eventually fail with abort/timeout/network errors.

Minimal repro pattern

  1. Configure OpenClaw primary model as ollama/glm-5.1:cloud using api: "ollama".
  2. Run an agent task that performs several tool calls.
  3. Observe that tool calls execute and tool output is shown.
  4. After one of the toolResult messages, the run may stop producing assistant output.
  5. Eventually it ends with timeout/abort/network errors, or only recovers after retries/compaction/fallback.
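
To check whether the stall also occurs outside OpenClaw, the same toolResult -> next response step can be approximated with a direct request to the local Ollama server. The sketch below is an assumption-laden probe: the message shapes follow Ollama's `/api/chat` tool-calling format as I understand it (verify against your Ollama version), and the prompt, command, and function names are made up for illustration.

```python
import json
import time
import urllib.request

# Base URL taken from the Environment section; /api/chat is Ollama's chat endpoint.
OLLAMA_URL = "http://127.0.0.1:11434/api/chat"


def build_payload(model: str, tool_output: str) -> dict:
    """Build a non-streaming chat request whose last message is a tool result."""
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "user", "content": "Run `uname -a` and summarize the result."},
            {
                "role": "assistant",
                "content": "",
                # Illustrative tool call; the agent's real calls will differ.
                "tool_calls": [
                    {"function": {"name": "exec", "arguments": {"command": "uname -a"}}}
                ],
            },
            {"role": "tool", "content": tool_output},
        ],
    }


def probe(model: str, tool_output: str, timeout_s: float = 60.0) -> float:
    """Send the request and return the seconds taken; raises if the socket times out."""
    body = json.dumps(build_payload(model, tool_output)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    start = time.monotonic()
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        json.load(response)  # read and parse the full reply
    return time.monotonic() - start
```

Calling `probe("glm-5.1:cloud", "<tool output>")` several times with short and long tool outputs would show whether the delay tracks the model path itself or only appears under OpenClaw's orchestration.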

Evidence from local logs

  1. A session where a tool finishes successfully and the next model step aborts immediately:

     {"type":"message","message":{"role":"toolResult","toolName":"exec","isError":false}}
     {"type":"custom","customType":"openclaw:prompt-error","data":{"provider":"ollama","model":"glm-5.1:cloud","api":"ollama","error":"aborted | cron: job execution timed out"}}
     {"type":"message","message":{"role":"assistant","stopReason":"error","errorMessage":"This operation was aborted"}}

  2. Timeout/failover path in gateway logs:

     [agent/embedded] embedded run failover decision: decision=fallback_model reason=timeout from=ollama/glm-5.1:cloud
     [diagnostic] lane task error: error="FailoverError: LLM request timed out."
     [model-fallback/decision] candidate=ollama/glm-5.1:cloud reason=timeout next=nvidia/z-ai/glm-5.1

  3. Network-level failures from the same model/provider path:

     error=LLM request failed: network connection was interrupted. rawError=fetch failed | read ECONNRESET
     error=LLM request failed: network connection error. rawError=fetch failed | Client network socket disconnected before secure TLS connection was established

  4. In longer tool loops, context pressure also shows up:

     Context overflow: estimated context size exceeds safe threshold during tool loop.
     context overflow detected (attempt 1/3); attempting auto-compaction for ollama/glm-5.1:cloud
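
The session pattern in the first excerpt (a successful toolResult followed directly by an error event) can be located mechanically. The sketch below assumes the session log is JSON Lines with the field names visible in the excerpt; `find_stalls` is a hypothetical helper, not part of OpenClaw.

```python
import json
from typing import Iterable, List, Tuple


def find_stalls(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, error) for each error event directly after a successful toolResult."""
    stalls = []
    after_tool_ok = False
    for number, raw in enumerate(lines, start=1):
        raw = raw.strip()
        if not raw:
            continue
        event = json.loads(raw)
        message = event.get("message", {})
        if event.get("type") == "message" and message.get("role") == "toolResult":
            # Remember whether the tool itself succeeded.
            after_tool_ok = not message.get("isError", False)
            continue
        if after_tool_ok:
            if event.get("customType") == "openclaw:prompt-error":
                stalls.append((number, event.get("data", {}).get("error", "")))
            elif message.get("stopReason") == "error":
                stalls.append((number, message.get("errorMessage", "")))
        # Only the event immediately after the toolResult is considered.
        after_tool_ok = False
    return stalls
```

Passing an open session file to `find_stalls` yields one entry per affected tool result, which makes it easier to count how often the handoff fails across a run.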

Why I think this is not just a generic Ollama/App problem

  • The same model works in the Ollama App for direct chat.
  • The breakage is most visible in OpenClaw agent orchestration after tool results.
  • The failure pattern is silent enough that from the UI it looks like the agent is frozen, even though the underlying issue seems to be timeout/abort/network handling in the model handoff after tools.
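
Because timeouts, connection resets, TLS disconnects, and context overflow all surface as the same apparent freeze, it may help to tally which class dominates in the gateway log. The following is a rough sketch: the patterns are derived only from the log lines quoted above, and the labels and function names are made up for illustration.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Ordered (label, pattern) pairs; the first matching pattern wins.
PATTERNS = [
    ("timeout", re.compile(r"reason=timeout|timed out", re.IGNORECASE)),
    ("connection_reset", re.compile(r"ECONNRESET")),
    ("tls_disconnect", re.compile(r"before secure TLS connection was established")),
    ("context_overflow", re.compile(r"context overflow", re.IGNORECASE)),
]


def classify(line: str) -> Optional[str]:
    """Return the failure label for a log line, or None if nothing matches."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return None


def summarize(lines: Iterable[str]) -> Counter:
    """Count failure labels across a sequence of log lines."""
    return Counter(label for label in map(classify, lines) if label)
```

A tally dominated by `timeout` would point at the model handoff or deadline settings, while mostly `connection_reset` or `tls_disconnect` entries would suggest looking at the network path to the cloud model first.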

Questions

  • Is there a known incompatibility or instability with glm-5.1:cloud in the OpenClaw tool loop path?
  • Should OpenClaw fail over earlier and more explicitly when the toolResult -> next prompt step stalls?
  • Is there any recommended config for cloud Ollama models in agent mode (reduced thinking, no streaming, lower context pressure, different timeout strategy)?

If useful, I can provide a redacted config excerpt and additional redacted session/gateway logs.
