openclaw - 💡(How to fix) Fix [Bug]: Local model provider calls thread block gateway event loop on Windows beta; trivial infer run takes ~4 minutes [1 pull requests]

Root Cause

On Windows with OpenClaw 2026.5.24-beta.1, local model calls appear to block or starve the Gateway event loop. Even a trivial fresh prompt such as "hi, how are you", or the following command, takes around 3 minutes:

openclaw infer model run --model llamacpp/qwen3.5-9b-instruct-Q5_K_M.gguf --prompt "hi" --json

The underlying llama.cpp backend can generate quickly in isolation, but when invoked through OpenClaw the Gateway shows repeated event-loop starvation warnings, slow WebSocket RPCs, Telegram fetch timer delays, and stalled sessions with activeWorkKind=model_call.

This reproduces with both llama.cpp and Ollama backends, so it does not look specific to one local server implementation.
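
The timer delays described above are the typical signature of synchronous work running on a single-threaded event loop. The following is a minimal TypeScript sketch of that effect, assuming the Gateway runs on a Node.js-style event loop (the npm install and the eventLoopUtilization metric suggest this, but it is not confirmed here). measureTimerDelay is a hypothetical helper for illustration, not OpenClaw code.

```typescript
// Hypothetical helper: reports how many milliseconds late a timer fires
// when synchronous work occupies the event loop in the meantime.
function measureTimerDelay(timerMs: number, busyMs: number): Promise<number> {
  return new Promise((resolve) => {
    const scheduledAt = Date.now();
    setTimeout(() => {
      // Lateness = actual elapsed time minus the requested timeout.
      resolve(Date.now() - scheduledAt - timerMs);
    }, timerMs);

    // Synchronous busy loop: a stand-in for CPU-heavy work on the main thread.
    // The timer callback above cannot run until this loop returns.
    const busyUntil = Date.now() + busyMs;
    while (Date.now() < busyUntil) {
      // spin
    }
  });
}

measureTimerDelay(10, 200).then((lateMs) => {
  console.log(`timer fired ${lateMs}ms late`);
});
```

The reported lateness is close to busyMs minus timerMs, which is the same pattern as a 9999ms fetch timeout that only fires after 18183ms.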

Code Example

Relevant log excerpts:

[diagnostic] liveness warning: reasons=event_loop_delay,event_loop_utilization,cpu interval=49s eventLoopDelayP99Ms=29813.1 eventLoopDelayMaxMs=29813.1 eventLoopUtilization=1 cpuCoreRatio=0.987 active=1 waiting=0 queued=0 work=[active=agent:main:main(processing/embedded_run,q=1,age=56s last=embedded_run:started)]

[fetch-timeout] fetch timeout after 9999ms (elapsed 18183ms) timer delayed 8184ms, likely event-loop starvation operation=fetchWithTimeout url=https://api.telegram.org/.../getMe

[agent/embedded] [trace:embedded-run] prep stages: runId=270498cf-d78a-4f58-ae81-f271e9ee4738 sessionId=d432c2dd-b18c-4ae8-947a-1dc7b409f875 phase=stream-ready totalMs=11071 stages=workspace-sandbox:2ms@2ms,skills:1ms@3ms,core-plugin-tools:2096ms@2099ms,bootstrap-context:18ms@2117ms,bundle-tools:338ms@2455ms,system-prompt:5976ms@8431ms,session-resource-loader:2604ms@11035ms,agent-session:5ms@11040ms,stream-setup:30ms@11070ms

[diagnostic] long-running session: sessionId=d432c2dd-b18c-4ae8-947a-1dc7b409f875 sessionKey=agent:main:main state=processing age=135s queueDepth=1 reason=queued_behind_active_work classification=long_running activeWorkKind=model_call lastProgress=model_call:started lastProgressAge=87s recovery=none

[diagnostic] stalled session: sessionId=d432c2dd-b18c-4ae8-947a-1dc7b409f875 sessionKey=agent:main:main state=processing age=140s queueDepth=0 reason=active_work_without_progress classification=stalled_agent_run activeWorkKind=model_call lastProgress=model_call:started

Slow RPC examples from the same window:

sessions.list 29482ms
chat.history 30201ms
sessions.list 20701ms
sessions.list 25379ms
models.list 29399ms
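
To see where the ~11s of pre-run prep goes, the stages field of the embedded-run trace can be ranked by duration. This is a small sketch that assumes each entry has the form name:duration@cumulative-offset. The numbers in the excerpt are consistent with that reading, but the format is not documented here. parsePrepStages and slowestStages are hypothetical helper names.

```typescript
interface PrepStage {
  stageName: string;
  durationMs: number;
  endOffsetMs: number; // assumed: cumulative time at which the stage finished
}

// Parses "name:123ms@456ms,other:7ms@463ms" into structured entries.
function parsePrepStages(stages: string): PrepStage[] {
  return stages.split(",").map((entry) => {
    const match = /^(.+):(\d+)ms@(\d+)ms$/.exec(entry.trim());
    if (match === null) {
      throw new Error(`unrecognized stage entry: ${entry}`);
    }
    const [, stageName = "", duration = "0", offset = "0"] = match;
    return { stageName, durationMs: Number(duration), endOffsetMs: Number(offset) };
  });
}

// Returns the `count` stages with the largest durations, slowest first.
function slowestStages(stages: readonly PrepStage[], count: number): PrepStage[] {
  return [...stages].sort((a, b) => b.durationMs - a.durationMs).slice(0, count);
}

// The stages value copied from the trace line above.
const traceStages =
  "workspace-sandbox:2ms@2ms,skills:1ms@3ms,core-plugin-tools:2096ms@2099ms," +
  "bootstrap-context:18ms@2117ms,bundle-tools:338ms@2455ms,system-prompt:5976ms@8431ms," +
  "session-resource-loader:2604ms@11035ms,agent-session:5ms@11040ms,stream-setup:30ms@11070ms";

for (const stage of slowestStages(parsePrepStages(traceStages), 3)) {
  console.log(`${stage.stageName}: ${stage.durationMs}ms`);
}
```

For this trace the three slowest stages are system-prompt (5976ms), session-resource-loader (2604ms), and core-plugin-tools (2096ms), which together account for most of the 11071ms total.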

Bug type

Behavior bug (incorrect output/state without crash)

Beta release blocker

Yes

Steps to reproduce

Fresh chat with a trivial prompt takes many minutes.
openclaw infer model run --prompt "hi" also takes ~3 minutes.
Gateway/control RPCs become very slow during the run.
Telegram health/fetch timers are delayed and report likely event-loop starvation.
Logs show model calls stuck with no progress.

Expected behavior

A trivial local model prompt should not starve the Gateway event loop. Even if the local backend/model is slow, Gateway timers, health checks, WebSocket RPCs, and channel polling should remain responsive or degrade gracefully.

Actual behavior

During local model calls, the Gateway event loop appears saturated:

eventLoopDelayP99Ms=20-29s
eventLoopUtilization=1
cpuCoreRatio≈0.98
activeWorkKind=model_call

This makes unrelated Gateway operations appear broken or delayed.

OpenClaw version

2026.5.24-beta.1

Operating system

Windows 11

Install method

npm

Model

Qwen 3.5 9B

Provider / routing chain

openclaw -> llama.cpp -> qwen

Additional provider/model setup details

openclaw-diagnostics-2026-05-25T18-08-09-809Z-6904.zip

Configs/backends tried

llama.cpp via OpenAI-compatible endpoint
Ollama backend
OpenAI Responses-style config
OpenAI Chat Completions-style config
Tool support enabled/disabled attempts
Fresh/simple prompts and fresh chats

The issue persists across local backend choices.

Diagnostics

I have an openclaw gateway diagnostics export zip generated while reproducing this. The export includes sanitized logs, gateway status, health, config shape, and stability data. I can attach it to this issue.

Impact and severity

Local model use is effectively unusable for even trivial prompts on this setup, despite the backend itself being capable of high token/sec throughput outside OpenClaw.

Additional information

The model provider invocation path for local providers on Windows may be doing CPU-heavy synchronous work or otherwise failing to isolate the local model request/stream processing from the Gateway event loop. The expensive pre-run prep is also visible (~11s), but the main failure appears after model_call:started, where the Gateway starts reporting starvation and stalled agent runs.
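
If the cause is indeed synchronous work in the stream-processing path, one common mitigation on a single-threaded event loop is to process chunks in short slices and yield between them. The sketch below shows that general technique only. It is not the fix applied in OpenClaw, and processInSlices, yieldToEventLoop, and the 10ms budget are hypothetical names and values chosen for illustration.

```typescript
// Resolves on a later macrotask so pending timers and I/O callbacks get a turn.
function yieldToEventLoop(): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, 0));
}

// Runs `handle` over every item, yielding to the event loop whenever the
// current slice has used up its time budget. Returns the number of yields.
async function processInSlices<T>(
  items: readonly T[],
  handle: (item: T) => void,
  budgetMs = 10,
): Promise<number> {
  let yields = 0;
  let sliceStart = Date.now();
  for (const item of items) {
    handle(item);
    if (Date.now() - sliceStart >= budgetMs) {
      await yieldToEventLoop();
      yields += 1;
      sliceStart = Date.now();
    }
  }
  return yields;
}
```

Yielding only bounds how long each slice holds the loop. Work that is CPU-bound as a whole would still need to move off the main thread, for example into a worker, to keep RPC latency low.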

Edit:

Possibly related: sessions.list stalls while local model call is active:

While the local model call is stalled, repeated Gateway WS RPCs also become very slow:

18:53:58 [ws] ⇄ res ✓ sessions.list 20736ms
18:54:19 [ws] ⇄ res ✓ sessions.list 20652ms
18:55:19 [ws] ⇄ res ✓ sessions.list 21084ms
18:56:06 [ws] ⇄ res ✓ sessions.list 25379ms
19:18:54 [ws] ⇄ res ✓ sessions.list 20005ms
19:19:14 [ws] ⇄ res ✓ sessions.list 20414ms

These occur near event-loop starvation / stalled model-call logs:

fetch timeout ... timer delayed ... likely event-loop starvation
stalled session ... activeWorkKind=model_call lastProgress=model_call:started

Expected: local status/session RPCs should remain responsive even if a model backend is slow.
Actual: simple local RPCs take ~20-25s while the model call is active.
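
The ~20-25s figure can be checked directly from the WS log lines. This is a small sketch that assumes each response line contains "res ✓ <method> <n>ms", as in the excerpt above. parseRpcTimings and summarizeTimings are hypothetical helper names.

```typescript
interface RpcTiming {
  method: string;
  ms: number;
}

// Extracts every "res ✓ <method> <n>ms" occurrence from a log string.
function parseRpcTimings(log: string): RpcTiming[] {
  const pattern = /res ✓ (\S+) (\d+)ms/g;
  const timings: RpcTiming[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(log)) !== null) {
    const [, method = "", ms = "0"] = match;
    timings.push({ method, ms: Number(ms) });
  }
  return timings;
}

// Computes count, min, max, and rounded mean of the latencies.
function summarizeTimings(timings: readonly RpcTiming[]) {
  if (timings.length === 0) {
    throw new Error("no timings to summarize");
  }
  const values = timings.map((t) => t.ms);
  const total = values.reduce((sum, v) => sum + v, 0);
  return {
    count: values.length,
    minMs: Math.min(...values),
    maxMs: Math.max(...values),
    meanMs: Math.round(total / values.length),
  };
}

// The six sessions.list responses quoted above.
const wsLogExcerpt = [
  "18:53:58 [ws] ⇄ res ✓ sessions.list 20736ms",
  "18:54:19 [ws] ⇄ res ✓ sessions.list 20652ms",
  "18:55:19 [ws] ⇄ res ✓ sessions.list 21084ms",
  "18:56:06 [ws] ⇄ res ✓ sessions.list 25379ms",
  "19:18:54 [ws] ⇄ res ✓ sessions.list 20005ms",
  "19:19:14 [ws] ⇄ res ✓ sessions.list 20414ms",
].join("\n");

console.log(summarizeTimings(parseRpcTimings(wsLogExcerpt)));
```

For these six lines the latencies range from 20005ms to 25379ms, with a mean of about 21378ms.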
