ollama - ✅(Solved) Fix /api/generate returns HTTP 500 {"error":"EOF"} with qwen3.5:9b when prompt requests <tool_call> XML output [1 pull requests, 1 comments, 2 participants]

Q: Expected behavior

The server should not return HTTP 500 / `{"error":"EOF"}` because the prompt contains `<tool_call>`-style XML text. Even if Ollama or the model-side parser dislikes this format, the request should fail gracefully, for example by:

- returning raw model text, or
- returning a controlled parse error with a clear message

but not an internal server error.


meowDieJob · 2026-03-21T00:02:00Z



GitHub stats

ollama/ollama#14986•Fetched 2026-04-08 01:08:21


Author

meowDieJob

Participants

meowDieJob

r266-tech

Timeline (top)

commented ×1, cross-referenced ×1, labeled ×1, referenced ×1

/api/generate works for plain text prompts and simple XML-like prompts, but can return HTTP 500 with {"error":"EOF"} when the prompt asks the model to emit <tool_call>...</tool_call> style XML output.

This reproduces even without using native Ollama tools. The prompt only contains tool-like XML tags as plain text instructions.

Possibly related to other Qwen XML/tool-call parsing issues, but this reproduces on qwen3.5:9b with /api/generate and without native tools.

Error Message

HTTPError 500
{"error":"EOF"}

Root Cause

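The built-in Qwen3.5 parser scans model output for <tool_call>...</tool_call> delimiters even when no native Ollama tools are registered. When the text between the tags is not valid Qwen3-coder XML (for example a JSON payload), the parse error, typically EOF, propagates to the request handler and is returned as HTTP 500 with {"error":"EOF"}.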

Fix Action

Fix proposed (the PR below was open and unmerged when this page was fetched)

Fixed by PR: model/parsers: fall back to content when qwen tool call XML parse fails (https://github.com/ollama/ollama/pull/15011)

PR fix notes

PR #15011: model/parsers: fall back to content when qwen tool call XML parse fails

Repository: ollama/ollama
Author: r266-tech
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/15011

Description (problem / solution / changelog)

Summary

Fixes #14986

Root Cause

The Qwen3.5 model always has a builtin parser registered (qwen3.5). When a user prompt instructs the model to emit <tool_call>...</tool_call>-style XML as plain text — without registering native Ollama tools — the Qwen3CoderParser still scans for those delimiters in the model output, finds them, and calls parseToolCall().

If the content between the tags is not valid Qwen3-coder XML (e.g. a JSON payload such as {"name": "get_weather", ...}), xml.Unmarshal returns an error (typically EOF). That error propagated all the way up to GenerateHandler / ChatHandler, which sent {"error":"EOF"} and returned HTTP 500.
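To see why a JSON payload trips an XML parser, here is a small Python analogue. It is only an illustration: Ollama's parser is written in Go and, according to the PR, reports the failure as EOF, whereas Python's `xml.etree.ElementTree` raises `ParseError` with a different message. The `extract_tool_call` and `is_valid_xml` helpers and the sample output string are assumptions made for this sketch, and the well-formed sample is arbitrary XML, not the real Qwen3-coder format.

```python
import re
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical model output: JSON wrapped in <tool_call> tags, as in the PR description.
MODEL_OUTPUT = '<tool_call>{"name": "get_weather"}</tool_call>'

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_call(output: str) -> Optional[str]:
    """Return the text between the first pair of <tool_call> tags, if any."""
    match = TOOL_CALL_RE.search(output)
    return match.group(1) if match else None


def is_valid_xml(fragment: str) -> bool:
    """Report whether the fragment parses as a single XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


inner = extract_tool_call(MODEL_OUTPUT)
print(inner)                # the JSON payload found between the tags
print(is_valid_xml(inner))  # False: JSON is not XML, so parsing fails
print(is_valid_xml("<function><name>get_weather</name></function>"))  # True
```

The delimiters are found regardless of whether any tools were registered, which is why a plain-text prompt is enough to reach the failing parse.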

Fix

In Qwen3CoderParser.Add(), when parseToolCall fails:

Before: return the error immediately
After: log a warning and write the raw tool-call text (including the wrapping tags) into the content strings.Builder, then break

This means the caller always receives a usable HTTP 200 response with the raw model output instead of an internal server error.

Safety: when real tools are registered and the model produces well-formed Qwen3-coder XML, parseToolCall succeeds and the existing path is taken unchanged.
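The "after" behaviour can be sketched in Python as follows. This is not the Go code from the PR: the `FallbackParser` class, its `add` method, and the use of `xml.etree.ElementTree` as the tool-call parser are stand-ins chosen for illustration, and the sketch assumes each call receives complete <tool_call> blocks rather than streamed fragments.

```python
import logging
import re
import xml.etree.ElementTree as ET

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


class FallbackParser:
    """Hypothetical parser that keeps unparseable tool calls as plain content."""

    def __init__(self):
        self.content = []     # plays the role of the content strings.Builder
        self.tool_calls = []  # successfully parsed tool calls

    def add(self, text):
        pos = 0
        for match in TOOL_CALL_RE.finditer(text):
            # Text before the block is always ordinary content.
            self.content.append(text[pos:match.start()])
            try:
                self.tool_calls.append(ET.fromstring(match.group(1).strip()))
            except ET.ParseError as err:
                # Fallback: warn and keep the raw block, wrapping tags included.
                logging.warning("tool call parse failed, keeping raw text: %s", err)
                self.content.append(match.group(0))
            pos = match.end()
        self.content.append(text[pos:])

    def text(self):
        return "".join(self.content)


parser = FallbackParser()
parser.add('Sure. <tool_call>{"name": "get_weather"}</tool_call>')
print(parser.text())           # the raw output is preserved, no exception raised
print(len(parser.tool_calls))  # 0: nothing was parsed as a tool call
```

Well-formed blocks still land in `tool_calls` in this sketch, mirroring the PR's safety note that the successful path is unchanged.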

Changes

model/parsers/qwen3coder.go: on parse failure, fall back to content instead of returning an error
model/parsers/qwen35_test.go: add regression test that initialises the parser with no tools (matching the /api/generate code path) and feeds it a JSON-style <tool_call> block — expects no error and non-empty content

Testing

New test: TestQwen35ParserToolCallAsPlainTextFallback


Changed files

model/parsers/qwen35_test.go (modified, +32/-0)
model/parsers/qwen3coder.go (modified, +10/-2)


What is the issue?

Environment

Ollama version: v0.18.2
Model: qwen3.5:9b
Endpoint: /api/generate
Native Ollama tools: not used

Summary

This reproduces even without using native Ollama tools. The prompt only contains tool-like XML tags as plain text instructions.

Possibly related to other Qwen XML/tool-call parsing issues, but this reproduces on qwen3.5:9b with /api/generate and without native tools.

Minimal reproduction

I used the following script:

import json
import urllib.request
import urllib.error

HOST = "http://127.0.0.1:11434"
MODEL = "qwen3.5:9b"

tests = [
    ("plain_text", "hello"),
    ("simple_xml", "<user_request>查上海天气</user_request>"),
    ("xml_with_tool_hint", "<tools>[{\"name\":\"get_weather\"}]</tools>\n<user_request>查上海天气</user_request>\n请只输出一个 <tool_call>...</tool_call>"),
    ("minitest1","<tools>[{\"name\":\"get_weather\"}]</tools>\n<user_request>查上海天气</user_request>\n请只输出一个工具调用块"),
]

for name, prompt in tests:
    payload = {
        "model": MODEL,
        "stream": False,
        "prompt": prompt,
    }
    req = urllib.request.Request(
        f"{HOST}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    print(f"\n=== TEST: {name} ===")
    try:
        with urllib.request.urlopen(req, timeout=120) as resp:
            body = resp.read().decode("utf-8")
            print("HTTP", resp.status)
            print(body[:1000])
    except urllib.error.HTTPError as e:
        body = e.read().decode("utf-8", errors="replace")
        print("HTTPError", e.code)
        print(body)
    except Exception as e:
        print("Exception", repr(e))

Results

1) plain_text

Prompt:

hello

Result:

HTTP 200
{"model":"qwen3.5:9b","created_at":"2026-03-20T23:46:44.373588553Z","response":"Hello! 👋 How can I help you today?", ...}

2) simple_xml

Prompt:

<user_request>查上海天气</user_request>

Result:

HTTP 200
{"model":"qwen3.5:9b","created_at":"2026-03-20T23:47:08.663588206Z","response":"很抱歉，作为人工智能助手，我暂时无法直接获取实时的天气数据。...", ...}

3) xml_with_tool_hint

Prompt:

<tools>[{"name":"get_weather"}]</tools>
<user_request>查上海天气</user_request>
请只输出一个 <tool_call>...</tool_call>

Result:

HTTPError 500
{"error":"EOF"}

5) minitest1

Prompt:

<tools>[{"name":"get_weather"}]</tools>
<user_request>查上海天气</user_request>
请只输出一个工具调用块

Result:

HTTP 200
{"model":"qwen3.5:9b","created_at":"2026-03-20T23:47:56.226133671Z","response":"```json
{
  \"tool_name\": \"get_weather\",
  \"parameters\": {
    \"location\": \"上海\",
    \"city_name\": \"Shanghai\",
    \"country\": \"CN\"
  }
}
```", ...}

Expected behavior

The server should not return HTTP 500 / {"error":"EOF"} because the prompt contains <tool_call>-style XML text.

Even if Ollama or the model-side parser dislikes this format, the request should fail gracefully, for example by:

- returning raw model text, or
- returning a controlled parse error with a clear message

but not an internal server error.

Actual behavior

When the prompt explicitly asks for <tool_call>...</tool_call> or <tool_call></tool_call>, /api/generate can fail with:

{"error":"EOF"}

Notes

This seems to be triggered specifically by the <tool_call> XML pattern in the prompt.

Important detail: this reproduction does not use native Ollama tools. The issue appears to happen even when these tags are just plain prompt text.

Relevant log output

OS: Linux
GPU: Nvidia
CPU: Intel
Ollama version: 0.18.2

extent analysis

Fix Plan

The HTTP 500 comes from server-side parsing of the model output, not from parsing the prompt. Per the PR notes above, the Qwen parser finds <tool_call> delimiters in the output, fails to parse the text between them, and the error reaches the request handler. Ollama's parser code is Go (model/parsers/qwen3coder.go); the Python below is only an illustrative sketch of the control flow, and its function names are hypothetical. Here are the steps:

Harden the tool-call parsing: Treat text between <tool_call> tags that is not valid tool-call XML as an expected condition, not an internal failure.
Catch the parse error where it occurs: Do not let it propagate unhandled to the request handler.
Return raw text or a controlled error: PR #15011 falls back to returning the raw model text; a controlled parse error with a clear message is the other option the reporter suggested, and it is the one sketched here.

Example code:

def handle_generate(model, raw_output):
    # Hypothetical handler: a parse failure becomes a clear, controlled error
    # instead of an unhandled exception surfacing as HTTP 500.
    try:
        return {"model": model, "response": parse_model_output(raw_output)}
    except ToolCallParseError as err:
        return {"error": f"failed to parse tool call in model output: {err}"}

The parse_model_output function inspects the model output, not the prompt, and raises a dedicated error type that the handler can catch:

import re
import xml.etree.ElementTree as ET

TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)

class ToolCallParseError(ValueError):
    """Raised when a <tool_call> block does not contain valid XML."""

def parse_model_output(raw_output):
    # Validate every <tool_call> block found in the model output.
    for match in TOOL_CALL_RE.finditer(raw_output):
        try:
            ET.fromstring(match.group(1).strip())
        except ET.ParseError as err:
            # Convert the low-level parser error into a controlled one.
            raise ToolCallParseError(str(err)) from err
    # All blocks (if any) parsed cleanly; return the text unchanged.
    return raw_output

Verification

To verify that the fix worked, we can test the /api/generate endpoint with the same prompts that previously caused the HTTP 500 error. We should see a controlled error message or a successful response instead of an internal server error.
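One way to make that check mechanical is to classify each response produced by the reproduction script. The sketch below is an assumption-laden helper, not part of Ollama: `classify_response` and its three labels are invented here, and it only inspects the status code and body that the script already prints.

```python
import json


def classify_response(status: int, body: str) -> str:
    """Label a response as 'ok', 'eof_500' (the reported bug), or 'other_error'."""
    if status == 200:
        return "ok"
    try:
        error = json.loads(body).get("error")
    except (json.JSONDecodeError, AttributeError):
        # Body is not JSON, or is JSON but not an object.
        error = None
    if status == 500 and error == "EOF":
        return "eof_500"
    return "other_error"


# Status/body pairs in the shape reported in the issue.
print(classify_response(200, '{"model":"qwen3.5:9b","response":"Hello!"}'))  # ok
print(classify_response(500, '{"error":"EOF"}'))                             # eof_500
```

After a fix, none of the four test prompts should be classified as `eof_500`.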

Extra Tips

Make sure to test the updated code thoroughly to ensure that it handles all possible edge cases.
Consider adding additional logging to help diagnose any issues that may arise in the future.
If the issue persists, try to isolate the problem by testing individual components of the code to identify the root cause.
