openclaw - 💡(How to fix) Fix Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls [1 participants]

openclaw2026-04-04 00:21:04

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#60601•Fetched 2026-04-08 02:49:18

View on GitHub

Comments

Participants

Timeline

Reactions

Author

SophiaDuane

Participants

SophiaDuane

Root Cause

Qwen 2.5 Coder 32B (at 32B param size) does not reliably wrap tool calls in <tool_call> XML tags when tool_choice is unset or "auto". Instead it outputs:

Bare JSON: {"name": "read", "arguments": {"path": "..."}}
Wrong XML tags: <tools>{"name": "read", ...}</tools> (instead of <tool_call>)

Because llama-server's tool call parser looks for <tool_call> tags specifically, these variants are not converted into structured tool_calls in the API response. They end up in message.content as plain text, and OpenClaw's buildAssistantMessage only processes response.message.tool_calls.

With "tool_choice": "required" the model does produce proper <tool_call> tags and llama-server returns structured tool_calls correctly. But OpenClaw does not send tool_choice unless explicitly configured.

Fix Action

Workaround

I wrote a reverse proxy that sits between OpenClaw and llama-server. It buffers streaming responses, detects tool call patterns in the accumulated text content (bare JSON, <tools>, <tool_call> tags), and re-emits a corrected SSE stream with proper tool_calls structure. This works but shouldn't be necessary.

Code Example

Sure, let's try again to check the config file.

{"name": "read", "arguments": {"path": "cow_trader/config.py"}}

RAW_BUFFERClick to expand / collapse

Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls

Bug Description

When using Qwen 2.5 Coder 32B (Q4_K_M GGUF) via llama.cpp's OpenAI-compatible API (openai-completions), OpenClaw does not detect or execute tool calls. The model outputs tool calls as plain JSON text in the content field instead of OpenClaw receiving them as structured tool_calls objects.

Environment

OpenClaw: 2026.4.1
Model: qwen2.5-coder-32b-instruct-q4_k_m.gguf via llama.cpp (build 8638)
API: openai-completions
llama-server flags: --jinja enabled
OS: Linux (Ubuntu 24.04, Docker)

Steps to Reproduce

Configure a llamacpp provider in openclaw.json with "api": "openai-completions" pointing at llama-server
Start llama-server with --jinja and the Qwen 2.5 Coder 32B GGUF
Send a message that requires tool use (e.g., "read cow_trader/config.py")

Expected Behavior

OpenClaw should detect the tool call and execute the read tool.

Actual Behavior

The model's response appears as plain text in the chat:

Sure, let's try again to check the config file.

{"name": "read", "arguments": {"path": "cow_trader/config.py"}}

No tool is executed. The tool call JSON is rendered as text to the user.

Root Cause

Qwen 2.5 Coder 32B (at 32B param size) does not reliably wrap tool calls in <tool_call> XML tags when tool_choice is unset or "auto". Instead it outputs:

Bare JSON: {"name": "read", "arguments": {"path": "..."}}
Wrong XML tags: <tools>{"name": "read", ...}</tools> (instead of <tool_call>)

Workaround

Suggested Fix

One or more of these could address it:

Allow per-model toolChoice config -- Add a compat.defaultToolChoice field to the model definition schema so users can set "required" for models that need it.
Content-based tool call fallback -- When a model returns finish_reason: "stop" but the content contains JSON matching the {"name": "...", "arguments": {...}} tool call pattern (especially wrapped in <tool_call> or <tools> tags), parse and promote them to tool_calls. This would help with many small/local models.
Send tool_choice: "auto" by default -- Some models behave better when this is explicit rather than omitted, though this alone doesn't fix Qwen 2.5 Coder 32B.

Option 2 would be the most broadly useful, as many local models (especially smaller ones) have inconsistent tool call formatting.

extent analysis

TL;DR

Implement a content-based tool call fallback to parse and promote JSON tool calls in the response content to structured tool_calls.

Guidance

Investigate adding a compat.defaultToolChoice field to the model definition schema to allow per-model toolChoice configuration.
Consider implementing a fallback to parse JSON tool calls in the response content when the model returns finish_reason: "stop".
Evaluate sending tool_choice: "auto" by default to improve model behavior, although this may not fix the issue with Qwen 2.5 Coder 32B.

Example

No explicit code example is provided, but the suggested fix involves modifying the model definition schema or the tool call parsing logic.

Notes

The issue is specific to Qwen 2.5 Coder 32B and may not apply to other models. The suggested fixes aim to improve the robustness of tool call detection and parsing.

Recommendation

Apply a workaround, specifically implementing a content-based tool call fallback, as it would be the most broadly useful solution, helping with many small/local models.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #installation #tensor shape #autograd error #model save/load

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Workaround

Code Example

Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls

Bug Description

Environment

Steps to Reproduce

Expected Behavior

Actual Behavior

Root Cause

Workaround

Suggested Fix

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls [1 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Workaround

Code Example

Qwen 2.5 Coder 32B via llama.cpp: tool calls emitted as plain text, not structured tool_calls

Bug Description

Environment

Steps to Reproduce

Expected Behavior

Actual Behavior

Root Cause

Workaround

Suggested Fix

extent analysis

TL;DR

Guidance

Example

Notes

Recommendation

Still need to ship something?

RELATED_DISCOVERY

TRENDING