ollama - 💡(How to fix) Fix /v1/chat/completions not working with qwen3.5 multimodal model [2 comments, 2 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
ollama/ollama#14728Fetched 2026-04-08 00:32:26
View on GitHub
Comments
2
Participants
2
Timeline
4
Reactions
0
Author
Participants
Timeline (top)
commented ×2closed ×1labeled ×1

Error Message

[2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:395] connecting to localhost:11434 at 127.0.0.1:11434 [2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:202] created stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 }) [2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:261] sending request POST http://localhost:11434/v1/chat/completions [2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:480] writing prelude: POST /v1/chat/completions HTTP/1.1 Host: localhost:11434 User-Agent: ureq/2.12.1 Accept: / Content-Type: application/json accept-encoding: gzip Content-Length: 3783642 [2026-03-09 16:03:45.965] [DEBUG] [ureq::response:396] Body entirely buffered (length: 206) [2026-03-09 16:03:45.965] [DEBUG] [ureq::pool:130] adding stream to pool: http|localhost|11434 -> Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 }) [2026-03-09 16:03:45.965] [DEBUG] [ureq::unit:314] response 500 to POST http://localhost:11434/v1/chat/completions [2026-03-09 16:03:45.965] [DEBUG] [ureq::stream:322] dropping stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 }) Error: Failed to OCR page 1

Caused by: 0: Failed to connect to VML server at http://localhost:11434/v1 1: http://localhost:11434/v1/chat/completions: status code 500

Root Cause

Caused by: 0: Failed to connect to VML server at http://localhost:11434/v1 1: http://localhost:11434/v1/chat/completions: status code 500

Code Example

[2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:395] connecting to localhost:11434 at 127.0.0.1:11434
[2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:202] created stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
[2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:261] sending request POST http://localhost:11434/v1/chat/completions
[2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:480] writing prelude: POST /v1/chat/completions HTTP/1.1
Host: localhost:11434
User-Agent: ureq/2.12.1
Accept: */*
Content-Type: application/json
accept-encoding: gzip
Content-Length: 3783642
[2026-03-09 16:03:45.965] [DEBUG] [ureq::response:396] Body entirely buffered (length: 206)
[2026-03-09 16:03:45.965] [DEBUG] [ureq::pool:130] adding stream to pool: http|localhost|11434 -> Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
[2026-03-09 16:03:45.965] [DEBUG] [ureq::unit:314] response 500 to POST http://localhost:11434/v1/chat/completions
[2026-03-09 16:03:45.965] [DEBUG] [ureq::stream:322] dropping stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
Error: Failed to OCR page 1

Caused by:
    0: Failed to connect to VML server at http://localhost:11434/v1
    1: http://localhost:11434/v1/chat/completions: status code 500

---
RAW_BUFFERClick to expand / collapse

What is the issue?

Example: https://docs.ollama.com/api/openai-compatibility#/v1/chat/completions-with-vision-example

Not working with qwen3.5:27b which is a multimodal LLM.

[2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:395] connecting to localhost:11434 at 127.0.0.1:11434
[2026-03-09 16:03:19.776] [DEBUG] [ureq::stream:202] created stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
[2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:261] sending request POST http://localhost:11434/v1/chat/completions
[2026-03-09 16:03:19.776] [DEBUG] [ureq::unit:480] writing prelude: POST /v1/chat/completions HTTP/1.1
Host: localhost:11434
User-Agent: ureq/2.12.1
Accept: */*
Content-Type: application/json
accept-encoding: gzip
Content-Length: 3783642
[2026-03-09 16:03:45.965] [DEBUG] [ureq::response:396] Body entirely buffered (length: 206)
[2026-03-09 16:03:45.965] [DEBUG] [ureq::pool:130] adding stream to pool: http|localhost|11434 -> Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
[2026-03-09 16:03:45.965] [DEBUG] [ureq::unit:314] response 500 to POST http://localhost:11434/v1/chat/completions
[2026-03-09 16:03:45.965] [DEBUG] [ureq::stream:322] dropping stream: Stream(TcpStream { addr: 127.0.0.1:57784, peer: 127.0.0.1:11434, fd: 3 })
Error: Failed to OCR page 1

Caused by:
    0: Failed to connect to VML server at http://localhost:11434/v1
    1: http://localhost:11434/v1/chat/completions: status code 500

Relevant log output

OS

Ubuntu 22.04.4 LTS

GPU

4 x A100

CPU

No response

Ollama version

ollama version is 0.17.4

extent analysis

Fix Plan

To resolve the issue with the multimodal LLM qwen3.5:27b, we need to adjust the configuration and potentially modify the code handling the POST /v1/chat/completions request. Here are the steps:

  • Increase the request timeout: The default timeout might be too low for large requests. Increase the timeout value in your ureq configuration.
  • Check VML server status: Ensure the VML server at http://localhost:11434/v1 is running and accessible.
  • Modify the request payload: The Content-Length header indicates a large payload. Consider compressing the request body or splitting it into smaller chunks.
  • Handle 500 status code: Implement error handling for the 500 status code response from the /v1/chat/completions endpoint.

Example code snippet in Rust to increase the request timeout and handle errors:

use ureq::Agent;

let agent = Agent::new();
agent.timeout(30000); // 30 seconds

let res = agent.post("http://localhost:11434/v1/chat/completions")
    .set("Content-Type", "application/json")
    .send_string(&request_body);

match res {
    Ok(response) => {
        if response.status() == 500 {
            // Handle 500 status code error
            println!("Error: {}", response.into_string().unwrap());
        } else {
            // Process successful response
        }
    }
    Err(err) => {
        // Handle request error
        println!("Error: {}", err);
    }
}

Verification

To verify the fix, send a test request to the /v1/chat/completions endpoint with a large payload and check the response status code. If the response is successful, the fix has worked.

Extra Tips

  • Monitor the VML server logs for any errors or issues.
  • Consider implementing retries for failed requests with a backoff strategy.
  • Review the ureq documentation for any configuration options that may help with large requests.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING