openclaw - 💡(How to fix) Fix Tool-call regression — agents return prose ('I will search…') instead of emitting tool calls [1 comments, 2 participants]

openclaw2026-05-10 07:19:14

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

openclaw/openclaw#80150•Fetched 2026-05-11 03:18:21

View on GitHub

Comments

Participants

Timeline

Reactions

Author

jpazvd

Participants

clawsweeper[bot]

jpazvd

Timeline (top)

closed ×1commented ×1cross-referenced ×1

Across all locally-served LLMs in our deployment, agents respond to prompts that should trigger tool invocation by describing what they would do rather than actually calling the tool. The pattern is consistent enough that it appears to be a gateway-level tool-advertisement bug, not a per-model behavior difference.

Empirically: prompts like "Search the web for X and tell me what you find" (action_now in our bench) routinely produce replies starting with "I'll search for X…", "Let me look up Y…", or "I would need to call the search tool to find Z…" — but no tool call is emitted, and the user gets only the announcement, not the result.

Root Cause

All persona work that relies on action-over-announce behavior (Tier 1 #1, #6, #7, #20, ad-hoc) is partly theatre because the bot can't take the actions the persona describes.
Bench prompts that test tool-using behavior (action_now, delegate_data_task, morning_briefing) cap out at ~33–67% pass rates regardless of model class.
Daily cron skills (chart-of-the-day, morning-briefing) that should call MCPs to fetch real data instead pull from training-data recency, producing stale or fabricated content.

Fix Action

Fix / Workaround

Workaround in place

Code Example

[bench] prompt action_now → reply
> I'll search the web for what's new in OpenClaw in May 2026 and report back
> with concrete findings. Let me check the official release notes…

[no tool call recorded; turn ends]

---

openclaw agent --agent main --session-id $(uuidgen) \
  --message 'Search the web for what is new in OpenClaw in May 2026 and tell me what you find. Reply with concrete findings — URLs, dates, version numbers.' \
  --json --timeout 300

RAW_BUFFERClick to expand / collapse

Upstream issue draft — OpenClaw

Repo: https://github.com/openclaw/openclaw/issues/new Title: Tool-call regression — agents return prose ("I will search…") instead of emitting tool calls

Summary

Symptom

[bench] prompt action_now → reply
> I'll search the web for what's new in OpenClaw in May 2026 and report back
> with concrete findings. Let me check the official release notes…

[no tool call recorded; turn ends]

This pattern is reproduced across:

qwen2.5:14b (default)
qwen3-fast:30b (MoE)
qwen3-think:30b (chain-of-thought)
qwen2.5:14b-instruct-q5_K_M / q8_0 (quantization variants)

The persona files (SOUL.md, AGENTS.md) all contain "action over explanation" rules with concrete examples of the desired shape. The rules have been validated to flip behavior in 1–2 iterations on qwen2.5:14b for some tools (Tier 1 #6 delegate_data_task shipped 2026-05-08 at 100% on the configured model). But for native-tool calls (web_search, exec, read), the regression persists.

Hypothesis

The gateway tool-advertisement layer may not be advertising the available tool set to the model in a way that triggers the model's tool-calling behavior. Things to verify:

Does the system prompt actually list the available tools for the agent?
Is the model instructed (in OpenClaw's framing) that tools are available to call, not just describable?
For models with native tool-calling format (Qwen 2.5 family supports OpenAI-style function calls), is the gateway sending the tool schemas in the right format?

Reproduction

openclaw agent --agent main --session-id $(uuidgen) \
  --message 'Search the web for what is new in OpenClaw in May 2026 and tell me what you find. Reply with concrete findings — URLs, dates, version numbers.' \
  --json --timeout 300

Expected: agent emits a web_search tool call, gateway resolves it, model summarizes results. Actual: agent's reply is prose announcing what it would do, no tool call recorded in the gateway turn log.

What this blocks

This is the single largest UX bug in the deployment. Until it ships:

All persona work that relies on action-over-announce behavior (Tier 1 #1, #6, #7, #20, ad-hoc) is partly theatre because the bot can't take the actions the persona describes.
Bench prompts that test tool-using behavior (action_now, delegate_data_task, morning_briefing) cap out at ~33–67% pass rates regardless of model class.
Daily cron skills (chart-of-the-day, morning-briefing) that should call MCPs to fetch real data instead pull from training-data recency, producing stale or fabricated content.

Environment

openclaw 2026.5.2 (8b2a6e5)
macOS 25.4.0 (Darwin 25.4.0) / arm64 / Mac Mini M4 Pro 64 GB
Models tested: qwen2.5:14b (Q4 default), qwen3-fast:30b, qwen3-think:30b, qwen2.5:14b-q5_K_M, qwen2.5:14b-q8_0

Pairs with openclaw-mcp-runtime-disposed.md (this repo, separate issue draft) — together these two issues account for the bulk of locally-tracked Tier 0 work in docs/18-chatbot-improvement-plan.md. Tracked there as Tier 0 #7.

What I'd expect

For prompts that should invoke a tool, the gateway turn log shows a tool_call event followed by a tool_result event followed by the model's final reply that incorporates the tool result.

Workaround in place

None local. We've tuned persona text exhaustively (concrete examples, verbatim failing-phrase callouts, "do not announce; just call the tool" rules) and the regression persists. The pattern survives across model classes, which is what convinces me this is a gateway-side advertisement issue rather than a model-personality issue.

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#dependency error #configuration error #environment variable #network issue #logging issue

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

openclaw - 💡(How to fix) Fix Tool-call regression — agents return prose ('I will search…') instead of emitting tool calls [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Workaround in place

Code Example

Upstream issue draft — OpenClaw

Summary

Symptom

Hypothesis

Reproduction

What this blocks

Environment

Related

What I'd expect

Workaround in place

Still need to ship something?

TRENDING

openclaw - 💡(How to fix) Fix Tool-call regression — agents return prose ('I will search…') instead of emitting tool calls [1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Root Cause

Fix Action

Fix / Workaround

Workaround in place

Code Example

Upstream issue draft — OpenClaw

Summary

Symptom

Hypothesis

Reproduction

What this blocks

Environment

Related

What I'd expect

Workaround in place

Still need to ship something?

RELATED_DISCOVERY

TRENDING