Regression: Severe queue delay + tool loop hang in Ollama v0.23.2 (MLX / macOS)

ollama · 2026-05-08 15:24:22

What is the issue?

Summary

Ollama v0.23.2 introduces two major regressions on macOS (Apple Silicon, MLX backend):

1. Severe queue delay: up to 2–3 minutes between tasks
2. Tool loop hang: the model enters a repeated search_web / fetch_url loop and never completes

These issues were not present in v0.23.1 and render v0.23.2 unsuitable for production use.

Environment

- OS: macOS (Apple Silicon)
- Hardware: Apple system (MLX backend in use)
- Ollama version: 0.23.2
- Previous working version: 0.23.1
- Interface: Open WebUI
- Model: qwen3.6:35b-a3b-mlx-bf16

Issue 1: Queue Delay Regression

Behavior

- After completing a request, the next request is delayed significantly
- Observed delay: 2–3 minutes between tasks
- Occurs even with:
  - no concurrent jobs
  - idle system
  - sufficient RAM (no memory pressure)

Expected Behavior

- Immediate or near-immediate task start (as in v0.23.1)

Actual Behavior

- Requests sit idle before execution begins
- Appears to be a queueing or scheduling regression

Issue 2: Tool Loop Hang (search_web / fetch_url)

Behavior

- Model enters repeated tool calls:
  - search_web
  - fetch_url
- No final response is produced
- Loop continues indefinitely

Observations

- Occurs without any other jobs pending
- Seen in Open WebUI tool activity panel:
  - multiple search calls
  - growing source list (e.g., 10+ sources)
- Requires manual interruption

Failure Rate

- Observed in 2 out of 5 prompts (~40%)

Expected Behavior

- Model completes tool use and returns a final answer

Actual Behavior

- Infinite or long-running tool loop with no completion

Reproduction

Queue Delay

1. Run a prompt using:

   ollama run qwen3.6:35b-a3b-mlx-bf16

2. After completion, immediately submit another prompt
3. Observe delay before execution begins (up to several minutes)

Tool Loop

1. Use Open WebUI with a tool-enabled model
2. Submit a prompt requiring web lookup
3. Observe repeated search_web / fetch_url calls
4. No final response returned

Impact

- Breaks multi-user and sequential workflows
- Makes system appear unresponsive
- Requires manual intervention to stop tool loops
- Not suitable for production environments

Additional Notes

- System shows no resource constraints (RAM healthy, minimal swap)
- No concurrency required to reproduce
- Appears related to:
  - scheduling / queue handling
  - tool execution loop control

Request

Please investigate:

- task scheduling / queue handling changes in 0.23.2
- tool execution termination conditions
- interaction between MLX backend and tool loop handling
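To put numbers on the queue delay, the sequential reproduction could be timed with a small script along these lines. This is only a sketch, not something from the original report: it assumes Ollama is listening on its default local port and uses the `/api/generate` endpoint with `stream` disabled. The helper names (`send_to_ollama`, `time_sequential`) are made up for illustration, and treating "wall time minus reported prompt/eval time" as queue or scheduling overhead is an assumption, since that gap can also include model load time.

```python
import json
import time
import urllib.request
from typing import Callable, Dict, List

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint
MODEL = "qwen3.6:35b-a3b-mlx-bf16"                  # model tag from the report


def send_to_ollama(prompt: str) -> Dict:
    """POST one non-streaming request and return the parsed JSON reply."""
    body = json.dumps({"model": MODEL, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.load(resp)


def time_sequential(
    prompts: List[str],
    send: Callable[[str], Dict] = send_to_ollama,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict[str, float]]:
    """Send prompts back to back and record where the wall time went."""
    rows = []
    for prompt in prompts:
        start = clock()
        reply = send(prompt)
        wall = clock() - start
        # Ollama reports these durations in nanoseconds.
        compute = (
            reply.get("prompt_eval_duration", 0) + reply.get("eval_duration", 0)
        ) / 1e9
        # Whatever is left over is time spent outside prompt eval and generation
        # (assumed here to be mostly load, queueing, or scheduling).
        rows.append({"wall_s": wall, "compute_s": compute, "overhead_s": wall - compute})
    return rows
```

Calling `time_sequential(["first prompt", "second prompt"])` on both 0.23.1 and 0.23.2 and comparing `overhead_s` for the second request would show whether the gap really sits before generation starts.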

Happy to provide additional logs or run targeted tests if needed.
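For reference, the kind of termination conditions meant above can be sketched on the client side. The following is a generic illustration, not how Ollama or Open WebUI actually implement their tool loops: `run_tool_loop`, `step`, `run_tool`, `max_rounds` and `max_repeats` are all hypothetical names, and the message shape is simplified. It stops when the model returns a final answer, when the same tool call repeats too often, or when a round limit is reached.

```python
import json
from typing import Callable, Dict, List, Union

ToolCall = Dict[str, object]       # e.g. {"name": "search_web", "arguments": {...}}
StepResult = Union[str, ToolCall]  # a plain string means "final answer"


class ToolLoopAborted(Exception):
    """Raised when the loop ends without a final answer."""


def run_tool_loop(
    step: Callable[[List[dict]], StepResult],
    run_tool: Callable[[ToolCall], str],
    messages: List[dict],
    max_rounds: int = 8,
    max_repeats: int = 2,
) -> str:
    """Drive a model/tool loop with two hard stops: round cap and repeat cap."""
    seen: Dict[str, int] = {}
    for _ in range(max_rounds):
        result = step(messages)
        if isinstance(result, str):
            return result  # the model produced a final answer
        # Canonical key so identical calls (same name and arguments) collide.
        key = json.dumps(result, sort_keys=True)
        seen[key] = seen.get(key, 0) + 1
        if seen[key] > max_repeats:
            raise ToolLoopAborted(f"repeated tool call: {result.get('name')}")
        # Feed the tool output back so the next step can use it.
        messages.append({"role": "tool", "content": run_tool(result)})
    raise ToolLoopAborted(f"no final answer after {max_rounds} tool rounds")
```

A guard like this would turn the indefinite search_web / fetch_url loop into a visible error after a bounded number of calls, which also makes the ~40% failure rate easier to count in an automated test.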

Relevant log output

OS

No response

GPU

No response

CPU

No response

Ollama version

No response
