openclaw - 💡(How to fix) Fix [Bug]: vLLM + Qwen outputs raw <tool_call> XML instead of executing tools (Integration gap with openai-responses / openai API) [3 comments, 3 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
openclaw/openclaw#49508Fetched 2026-04-08 00:54:30
View on GitHub
Comments
3
Participants
3
Timeline
7
Reactions
0
Timeline (top)
commented ×3labeled ×2closed ×1locked ×1

When using a local vLLM backend, OpenClaw completely fails to execute tool calls across all tested models (including Qwen3). Depending on the api configuration, it either prints the raw tool invocation syntax (e.g., XML <tool_call>) directly to the chat UI as plain text, or it sends a misformatted payload that causes a fatal TypeError (Jinja template crash) on the vLLM server.

<img width="1611" height="369" alt="Image" src="https://github.com/user-attachments/assets/8b2aee42-070b-403f-bda6-5adfab58d1f4" />

Error Message

When switching "api": "openai-responses" to "api": "openai", the vLLM backend returns a 400 Bad Request with a Jinja template error: TypeError: can only concatenate str (not "list") to str. The OpenClaw gateway sends a request payload that is incompatible with the vLLM/Qwen3 chat template. This results in a 400 Bad Request error. The vLLM server logs show a Python TypeError: can only concatenate str (not "list") to str, indicating that OpenClaw is passing a list object where the template expects a string during the Jinja rendering process.

Root Cause

When using a local vLLM backend, OpenClaw completely fails to execute tool calls across all tested models (including Qwen3). Depending on the api configuration, it either prints the raw tool invocation syntax (e.g., XML <tool_call>) directly to the chat UI as plain text, or it sends a misformatted payload that causes a fatal TypeError (Jinja template crash) on the vLLM server.

<img width="1611" height="369" alt="Image" src="https://github.com/user-attachments/assets/8b2aee42-070b-403f-bda6-5adfab58d1f4" />
RAW_BUFFERClick to expand / collapse

Bug type

Behavior bug (incorrect output/state without crash)

Summary

When using a local vLLM backend, OpenClaw completely fails to execute tool calls across all tested models (including Qwen3). Depending on the api configuration, it either prints the raw tool invocation syntax (e.g., XML <tool_call>) directly to the chat UI as plain text, or it sends a misformatted payload that causes a fatal TypeError (Jinja template crash) on the vLLM server.

<img width="1611" height="369" alt="Image" src="https://github.com/user-attachments/assets/8b2aee42-070b-403f-bda6-5adfab58d1f4" />

Steps to reproduce

1.Start the vLLM backend using the standard OpenAI API server entrypoint with tool-calling parameters enabled: docker run -it --rm \ -p 8000:8000 \ --name qwen3-coder-30b-server \ --privileged \ --runtime=dlrt \ --ipc=host \ -e DENGLIN_DEVICES=all \ -e VLLM_UPSTREAM_INTERFACE=ollama \ -v /home/xfxq/zjin/models/Qwen3-Coder-30B-A3B-Instruct:/model \ harbor.xffuture.com/llm-optimize-algo/dl-vllm:v0.13.1.0 \ python3 -m vllm.entrypoints.openai.api_server \ --host 0.0.0.0 \ --port 8000 \ --model /model \ --served-model-name qwen3-coder \ --tensor-parallel-size 4 \ --gpu-memory-utilization 0.90 \ --max-model-len 65536 \ --trust-remote-code \ --dtype bfloat16 \ --enable-chunked-prefill \ --distributed-executor-backend mp \ --enable-auto-tool-choice \ --tool-call-parser qwen3_xml \ --chat-template-content-format string

2.Configure OpenClaw (openclaw.json) to connect to the local vLLM endpoint: { "meta": { "lastTouchedVersion": "2026.3.13", "lastTouchedAt": "2026-03-18T02:18:06.068Z" }, "auth": { "profiles": { "vllm:default": { "provider": "vllm", "mode": "api_key" } } }, "models": { "mode": "merge", "providers": { "vllm": { "baseUrl": "http://192.168.1.26:8000/v1", "apiKey": "sk-local-token", "api": "openai-responses", "models": [ { "id": "qwen3-coder", "name": "Qwen3-30B-Local", "contextWindow": 65536 } ] } } }, "agents": { "defaults": { "model": { "primary": "vllm/qwen3-coder" }, "workspace": "C:\\Users\\admin\\.openclaw\\workspace", "compaction": { "mode": "safeguard" } } }, "tools": { "profile": "full" }, "commands": { "native": true, "nativeSkills": true, "restart": true, "ownerDisplay": "raw" }, "gateway": { "mode": "local", "auth": { "mode": "token", "token": "17293ef0ba2e9b9c60f8f96831209ef40a0a24ad383d4195" } }, "skills": { "entries": { "baidu-search": { "apiKey": "sk-denglin" }, "web-search": { "apiKey": "c1e536cc65ef1c792f2025f5bff8b8a5398c9112b5ce67f2a07eb51fd0f39e4e" } } } } 3.Run the gateway by executing openclaw gateway in the terminal. 4.Trigger a tool call: In the chat interface, send a prompt like "Help me open Baidu" 5.Observe the Failure: Instead of executing the browser tool, OpenClaw outputs raw XML tags in the chat UI: <tool_call> <function=browser> <parameter=action>open</parameter> <parameter=targetUrl>https://www.baidu.com</parameter> </function> </tool_call> When switching "api": "openai-responses" to "api": "openai", the vLLM backend returns a 400 Bad Request with a Jinja template error: TypeError: can only concatenate str (not "list") to str.

Expected behavior

  1. Automatic Interception: When the model outputs a tool call (either in XML format like <tool_call> or via standard OpenAI JSON tool_calls), OpenClaw should automatically intercept the output instead of displaying it as raw text to the user.
  2. Successful Tool Execution: The OpenClaw gateway should parse the function name (e.g., browser) and parameters (e.g., targetUrl) and trigger the corresponding native skill or plugin (e.g., opening the browser to the specified URL).
  3. API Compatibility: OpenClaw should be able to communicate with vLLM's OpenAI-compatible endpoint without triggering Jinja template errors (TypeError: can only concatenate str (not "list") to str) in the backend, supporting standard message payloads.
  4. Seamless Integration: Users should see a "Tool Executing" status in the UI, followed by the tool's result, rather than seeing the internal "thought process" or raw code tags of the model.

Actual behavior

Currently, the tool-calling loop is broken in two distinct ways depending on the OpenClaw configuration:

  1. When api is set to openai-responses: OpenClaw ignores the tool calling syntax generated by the model. Instead of intercepting the call and launching the browser, it treats the XML <tool_call> output as a standard chat response and prints it directly to the user interface as plain text.

  2. When api is set to openai (Standard mode): The OpenClaw gateway sends a request payload that is incompatible with the vLLM/Qwen3 chat template. This results in a 400 Bad Request error. The vLLM server logs show a Python TypeError: can only concatenate str (not "list") to str, indicating that OpenClaw is passing a list object where the template expects a string during the Jinja rendering process.

In both cases, no native skills (like baidu-search or web-browser) are ever triggered.

OpenClaw version

2026.3.13

Operating system

windows11

Install method

No response

Model

QWEN3

Provider / routing chain

openclaw -> vLLM (local self-hosted)

Config file / key location

No response

Additional provider/model setup details

No response

Logs, screenshots, and evidence

Impact and severity

No response

Additional information

No response

extent analysis

Fix Plan

To resolve the issue, we need to modify the OpenClaw configuration and potentially the vLLM backend to handle tool calls correctly. Here are the steps:

  • Modify OpenClaw configuration: Update the openclaw.json file to use the correct API mode and handle tool calls.
  • Update vLLM backend: Modify the vLLM backend to correctly parse and handle tool calls in the standard OpenAI JSON format.
  • Implement tool call parsing: Add a parser to OpenClaw to intercept and execute tool calls in the XML format.

Code Changes

Here are some example code changes to illustrate the fix:

# In openclaw.json, update the API mode
"models": {
  "mode": "merge",
  "providers": {
    "vllm": {
      "baseUrl": "http://192.168.1.26:8000/v1",
      "apiKey": "sk-local-token",
      "api": "openai",  # Update API mode
      "models": [
        {
          "id": "qwen3-coder",
          "name": "Qwen3-30B-Local",
          "contextWindow": 65536
        }
      ]
    }
  }
}

# In vLLM backend, add a parser for tool calls in standard OpenAI JSON format
def parse_tool_call(tool_call):
    if "function" in tool_call:
        function = tool_call["function"]
        parameters = tool_call.get("parameters", {})
        # Execute the tool call
        if function == "browser":
            # Open the browser with the specified URL
            url = parameters.get("targetUrl")
            # Implement browser opening logic here
            pass
        # Add more tool call handlers as needed
    return None

# In OpenClaw, add a parser to intercept and execute tool calls in XML format
import xml.etree.ElementTree as ET

def parse_xml_tool_call(xml_string):
    root = ET.fromstring(xml_string)
    tool_call = {}
    for child in root:
        if child.tag == "function":
            tool_call["function"] = child.text
        elif child.tag == "parameter":
            tool_call.setdefault("parameters", {})[child.attrib["name"]] = child.text
    # Execute the tool call
    parse_tool_call(tool_call)
    return None

Verification

To verify that the fix worked, follow these steps:

  • Restart the OpenClaw gateway and vLLM backend.
  • Trigger a tool call in the chat interface (e.g., "Help me open Baidu").
  • Verify that the tool call is executed correctly (e.g., the browser opens with the specified URL).
  • Check the vLLM server logs for any errors or issues.

Extra Tips

  • Ensure that the vLLM backend is correctly configured to handle tool calls in the standard

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

FAQ

Expected behavior

  1. Automatic Interception: When the model outputs a tool call (either in XML format like <tool_call> or via standard OpenAI JSON tool_calls), OpenClaw should automatically intercept the output instead of displaying it as raw text to the user.
  2. Successful Tool Execution: The OpenClaw gateway should parse the function name (e.g., browser) and parameters (e.g., targetUrl) and trigger the corresponding native skill or plugin (e.g., opening the browser to the specified URL).
  3. API Compatibility: OpenClaw should be able to communicate with vLLM's OpenAI-compatible endpoint without triggering Jinja template errors (TypeError: can only concatenate str (not "list") to str) in the backend, supporting standard message payloads.
  4. Seamless Integration: Users should see a "Tool Executing" status in the UI, followed by the tool's result, rather than seeing the internal "thought process" or raw code tags of the model.

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING