openclaw: Ollama `glm-5.1:cloud` stalls after tool results in agent runs while direct Ollama App chat works

openclaw · 2026-05-08 10:26:06


When OpenClaw is configured to use Ollama with glm-5.1:cloud, normal chat in the Ollama App works, but agent runs in OpenClaw can stall after a successful tool execution. The UI shows tool output, then no assistant response arrives for a long time, and the run eventually ends with timeout/abort/network errors.

This appears to be specifically in the agent/tool loop path (toolResult -> next model response), not in simple direct chat.

Error Message

[agent/embedded] embedded run failover decision: decision=fallback_model reason=timeout from=ollama/glm-5.1:cloud
[diagnostic] lane task error: error="FailoverError: LLM request timed out."
[model-fallback/decision] candidate=ollama/glm-5.1:cloud reason=timeout next=nvidia/z-ai/glm-5.1



Environment

OpenClaw: 2026.5.7
Model: ollama/glm-5.1:cloud
API type: ollama
Primary Ollama base URL: http://127.0.0.1:11434
Agent defaults include:
- streaming: true
- thinkingDefault: high
I also had multiple Ollama fallbacks configured (ollama2 .. ollama5) pointing to the same model/provider family before falling back to NVIDIA.

What works

glm-5.1:cloud in the Ollama App works normally for direct chat.
Short/simple OpenClaw prompts may also work.

What fails

Multi-step OpenClaw agent runs that use tools.
After a tool completes successfully and the tool result is visible, OpenClaw sometimes never gets the next usable assistant response from glm-5.1:cloud.
From the user side this looks like a freeze/hang.

Expected behavior

After a successful tool call and tool result, OpenClaw should receive the next assistant response or fail over quickly and visibly.
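To make "fail over quickly and visibly" concrete, here is a minimal sketch of the behavior I would expect, written in Python rather than OpenClaw's own code. `call_with_failover`, the candidate names, and the log format are all hypothetical; the only point is that each candidate gets a hard deadline and every decision is logged as it happens.

```python
import concurrent.futures as cf
import time
from typing import Any, Callable, Sequence, Tuple

def call_with_failover(
    candidates: Sequence[Tuple[str, Callable[[], Any]]],
    timeout_s: float,
    log: Callable[[str], None] = print,
) -> Tuple[str, Any]:
    """Try each (name, fn) in order; abandon a candidate after timeout_s seconds.

    Hypothetical helper for illustration only, not an OpenClaw API.
    """
    for name, fn in candidates:
        pool = cf.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fn)
        try:
            reply = future.result(timeout=timeout_s)
            log(f"model={name} status=ok")
            return name, reply
        except cf.TimeoutError:
            # Surface the stall immediately instead of waiting silently.
            log(f"model={name} status=timeout after {timeout_s}s, falling back")
        except Exception as exc:
            log(f"model={name} status=error ({exc}), falling back")
        finally:
            # Do not block on the stalled call. The worker thread keeps running
            # until fn returns; a real client would also abort the HTTP request.
            pool.shutdown(wait=False)
    raise RuntimeError("all candidates failed")
```

With a chain like mine (several fallbacks in the same provider family), the worst-case wait before reaching a different provider is roughly the per-candidate timeout multiplied by the number of same-family candidates, which is why a short, visible deadline matters.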

Actual behavior

The run can stall between toolResult and the next assistant message, then eventually fail with abort/timeout/network errors.

Minimal repro pattern

1. Configure OpenClaw primary model as ollama/glm-5.1:cloud using api: "ollama".
2. Run an agent task that performs several tool calls.
3. Observe that tool calls execute and tool output is shown.
4. After one of the toolResult messages, the run may stop producing assistant output.
5. Eventually it ends with timeout/abort/network errors, or only recovers after retries/compaction/fallback.
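To check whether the stall also happens outside OpenClaw, the "tool result, then next model response" step can be sent to Ollama directly. The sketch below is an assumption-heavy approximation: it targets Ollama's `/api/chat` endpoint on the base URL from my environment, the `exec` tool schema and the prompt are made up, and the request OpenClaw actually sends may differ (for example, it streams and uses a higher thinking level).

```python
import json
import urllib.request

# Base URL from the Environment section; /api/chat is Ollama's chat endpoint.
OLLAMA_URL = "http://127.0.0.1:11434/api/chat"

def build_tool_followup(model: str, tool_output: str) -> dict:
    """Build a chat request whose last message is a tool result.

    The tool definition and conversation are illustrative placeholders.
    """
    return {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "user", "content": "Run `uname -a` and summarise the result."},
            {
                "role": "assistant",
                "content": "",
                "tool_calls": [
                    {"function": {"name": "exec", "arguments": {"command": "uname -a"}}}
                ],
            },
            {"role": "tool", "content": tool_output},
        ],
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "exec",
                    "description": "Run a shell command (hypothetical tool).",
                    "parameters": {
                        "type": "object",
                        "properties": {"command": {"type": "string"}},
                        "required": ["command"],
                    },
                },
            }
        ],
    }

def send(payload: dict, timeout_s: float = 60.0) -> dict:
    """POST the payload to Ollama; the timeout applies per socket operation."""
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.load(response)
```

Calling `send(build_tool_followup("glm-5.1:cloud", "<tool output>"))` and timing it would show whether the model answers a tool-result turn promptly when OpenClaw is not involved.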

Evidence from local logs

A session where a tool finishes successfully and the next model step aborts immediately:

{"type":"message","message":{"role":"toolResult","toolName":"exec","isError":false}}
{"type":"custom","customType":"openclaw:prompt-error","data":{"provider":"ollama","model":"glm-5.1:cloud","api":"ollama","error":"aborted | cron: job execution timed out"}}
{"type":"message","message":{"role":"assistant","stopReason":"error","errorMessage":"This operation was aborted"}}
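A pattern like the one above can be searched for mechanically. This is a small sketch that scans session lines for a `toolResult` followed by a prompt error or an errored assistant message before any successful assistant reply. It relies only on the fields visible in the excerpt; real session files may carry more fields or other event types, and `find_stalled_handoffs` is a name I made up.

```python
import json
from typing import Iterable, List, Optional, Tuple

def find_stalled_handoffs(lines: Iterable[str]) -> List[Tuple[int, int, Optional[str]]]:
    """Return (tool_result_index, error_index, error_text) for each failed handoff."""
    stalls = []
    last_tool_index = None  # index of the most recent unanswered toolResult
    for index, raw in enumerate(lines):
        raw = raw.strip()
        if not raw:
            continue
        event = json.loads(raw)
        message = event.get("message", {})
        role = message.get("role")
        if event.get("type") == "message" and role == "toolResult":
            last_tool_index = index
        elif event.get("type") == "custom" and event.get("customType") == "openclaw:prompt-error":
            if last_tool_index is not None:
                error_text = event.get("data", {}).get("error")
                stalls.append((last_tool_index, index, error_text))
                last_tool_index = None
        elif event.get("type") == "message" and role == "assistant":
            if last_tool_index is not None and message.get("stopReason") == "error":
                stalls.append((last_tool_index, index, message.get("errorMessage")))
            # Any assistant message, successful or not, closes the handoff.
            last_tool_index = None
    return stalls
```

Run against the three lines above, it reports one failed handoff, attributed to the `openclaw:prompt-error` event that directly follows the tool result.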

Timeout/failover path in gateway logs:

[agent/embedded] embedded run failover decision: decision=fallback_model reason=timeout from=ollama/glm-5.1:cloud
[diagnostic] lane task error: error="FailoverError: LLM request timed out."
[model-fallback/decision] candidate=ollama/glm-5.1:cloud reason=timeout next=nvidia/z-ai/glm-5.1

Network-level failures from the same model/provider path:

error=LLM request failed: network connection was interrupted. rawError=fetch failed | read ECONNRESET
error=LLM request failed: network connection error. rawError=fetch failed | Client network socket disconnected before secure TLS connection was established
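To see which failure mode dominates over a longer log, the gateway lines can be bucketed by signature. The sketch below matches only substrings that appear in the excerpts in this report; the category labels are my own and anything unrecognized is counted as `other`.

```python
import re
from collections import Counter
from typing import Iterable

# Signatures taken from the log excerpts in this report; labels are illustrative.
PATTERNS = [
    ("timeout", re.compile(r"reason=timeout|timed out", re.IGNORECASE)),
    ("connection_reset", re.compile(r"ECONNRESET")),
    ("tls_handshake", re.compile(r"before secure TLS connection was established")),
]

def classify(line: str) -> str:
    """Return the first matching failure label for a log line."""
    for label, pattern in PATTERNS:
        if pattern.search(line):
            return label
    return "other"

def summarize(lines: Iterable[str]) -> Counter:
    """Count failure labels across non-empty log lines."""
    return Counter(classify(line) for line in lines if line.strip())
```

One thing the TLS bucket hints at: a plain `http://127.0.0.1:11434` connection involves no TLS handshake, so those lines presumably come from a hop that uses HTTPS. Which hop that is cannot be told from the excerpt alone.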

In longer tool loops, context pressure also shows up:

Context overflow: estimated context size exceeds safe threshold during tool loop.
context overflow detected (attempt 1/3); attempting auto-compaction for ollama/glm-5.1:cloud
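One way to lower context pressure before the next model call is to shorten old tool output. The following is a rough sketch of that idea, not OpenClaw's auto-compaction: the message shape with a `content` field is assumed (the excerpt shows only `role`, `toolName`, and `isError`), and the four-characters-per-token estimate is a crude heuristic.

```python
from typing import Dict, List

def estimate_tokens(text: str) -> int:
    """Very rough token estimate: about four characters per token (assumption)."""
    return max(1, len(text) // 4)

def trim_tool_results(
    messages: List[Dict[str, str]],
    budget_tokens: int,
    keep_chars: int = 200,
) -> List[Dict[str, str]]:
    """Truncate the oldest toolResult contents until the estimate fits the budget.

    Returns shallow copies; the input list and its dicts are left unchanged.
    """
    trimmed = [dict(message) for message in messages]

    def total() -> int:
        return sum(estimate_tokens(m.get("content", "")) for m in trimmed)

    for message in trimmed:  # oldest first
        if total() <= budget_tokens:
            break
        content = message.get("content", "")
        if message.get("role") == "toolResult" and len(content) > keep_chars:
            message["content"] = content[:keep_chars] + " [truncated]"
    return trimmed
```

Trimming oldest-first keeps the most recent tool result intact, which is usually the one the next assistant step needs.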

Why I think this is not just a generic Ollama/App problem

The same model works in the Ollama App for direct chat.
The breakage is most visible in OpenClaw agent orchestration after tool results.
The failure pattern is silent enough that from the UI it looks like the agent is frozen, even though the underlying issue seems to be timeout/abort/network handling in the model handoff after tools.

Questions

Is there a known incompatibility or instability with glm-5.1:cloud in the OpenClaw tool loop path?
Should OpenClaw fail over earlier/more explicitly after toolResult -> next prompt stalls?
Is there any recommended config for cloud Ollama models in agent mode (reduced thinking, no streaming, lower context pressure, different timeout strategy)?

If useful, I can provide a redacted config excerpt and additional redacted session/gateway logs.
