ollama: Cloud models: tool_call.function.arguments truncated or 502 Bad Gateway when outputting large content

Error Message

When Ollama Cloud models are called with tool definitions through the Cloud API (https://ollama.com/v1) or the local proxy (http://localhost:11434/v1), tool_call.function.arguments is either truncated or the request fails with 502 Bad Gateway once the model tries to output large content through a tool call. Both endpoints behave identically, which points to Ollama's cloud infrastructure rather than the local proxy.

{"error":"Post \"https://ollama.com:443/v1/chat/completions?ts=1778353283\": unexpected EOF"}

tool_call.function.arguments should contain complete, valid JSON regardless of content length, up to the model's context window limit (256K for kimi-k2.6:cloud). If a server-side limit must exist, it should be documented, and the API should return a clear 4xx error instead of a 502 or silent truncation.

Code Example

POST https://ollama.com/v1/chat/completions
(or equivalently http://localhost:11434/v1/chat/completions)

{
  "model": "kimi-k2.6:cloud",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant. Always use the write_file tool to write files with ALL content."},
    {"role": "user", "content": "Write a file called test.py with a simple Flask app, about 40 lines."}
  ],
  "stream": false,
  "tools": [{"type": "function", "function": {"name": "write_file", "description": "Write content to a file", "parameters": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}}}]
}

---

{"error":"Post \"https://ollama.com:443/v1/chat/completions?ts=1778353283\": unexpected EOF"}

Bug Description

Truncated tool_call.function.arguments and 502 responses make agent workflows that rely on tool calls to write files unreliable or completely broken.

Environment

Ollama version: 0.23.0
Model: kimi-k2.6:cloud (also affects glm-5.1:cloud, qwen3.5:397b)
Endpoints tested: Both https://ollama.com/v1/chat/completions and http://localhost:11434/v1/chat/completions — same results
Plan: Pro
Context length: 262,144 (cloud default, confirmed via ollama show)

Reproduction

Send a request to either endpoint with tool definitions. The model generates a write_file tool call. As the requested output size increases, the behavior degrades:

Test 1: Small file (~40 lines) — Works ✅

POST https://ollama.com/v1/chat/completions
(or equivalently http://localhost:11434/v1/chat/completions)

{
  "model": "kimi-k2.6:cloud",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant. Always use the write_file tool to write files with ALL content."},
    {"role": "user", "content": "Write a file called test.py with a simple Flask app, about 40 lines."}
  ],
  "stream": false,
  "tools": [{"type": "function", "function": {"name": "write_file", "description": "Write content to a file", "parameters": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}}}]
}

Result: tool_call.function.arguments length ~950-1100 chars. Content is complete and valid JSON.
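
The check behind "complete and valid JSON" can be sketched as below. This is a minimal illustration, assuming the response follows the OpenAI-compatible shape (choices[0].message.tool_calls[].function.arguments as a JSON-encoded string); the function name check_tool_call_arguments and the status labels are hypothetical, not part of any Ollama API.

```python
import json

# Keys required by the write_file tool schema used in the request above.
REQUIRED_KEYS = ("path", "content")


def check_tool_call_arguments(response: dict) -> dict:
    """Classify the first tool call in a /v1/chat/completions response body.

    Hypothetical helper: assumes the OpenAI-compatible response shape.
    """
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        return {"status": "no_tool_call", "args_len": 0}

    raw = tool_calls[0]["function"]["arguments"]
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        # Truncated arguments typically end mid-string and fail to parse.
        return {"status": "truncated_or_invalid", "args_len": len(raw)}
    if not isinstance(parsed, dict):
        return {"status": "truncated_or_invalid", "args_len": len(raw)}

    missing = [key for key in REQUIRED_KEYS if key not in parsed]
    if missing:
        return {"status": "missing_keys", "args_len": len(raw), "missing": missing}
    return {
        "status": "ok",
        "args_len": len(raw),
        "lines": parsed["content"].count("\n") + 1,
    }
```

Running this on each parsed response body gives the args_len and line count figures reported in the tests, and flags truncation without inspecting the content by eye.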

Test 2: Medium file (~60 lines) — Intermittent 502 ⚠️

Same request, but asking for ~60 lines. Results of 3 consecutive attempts:

Attempt	Result
1	✅ OK: args_len=2058, 68 lines
2	✅ OK: args_len=2054, 61 lines
3	❌ HTTP 502 Bad Gateway
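
A run like the three attempts above could be scripted roughly as follows. This is a sketch under assumptions: the helper names (build_payload, send, tally) are hypothetical, and the Bearer Authorization header is assumed for the Cloud API and not taken from this report. The request body mirrors the one shown earlier, with the requested line count as a parameter.

```python
import json
import urllib.error
import urllib.request
from collections import Counter

URL = "https://ollama.com/v1/chat/completions"  # endpoint from this report


def build_payload(lines: int) -> dict:
    """Build the request body from this report for a target file length."""
    return {
        "model": "kimi-k2.6:cloud",
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant. "
             "Always use the write_file tool to write files with ALL content."},
            {"role": "user", "content": "Write a file called test.py with a "
             f"simple Flask app, about {lines} lines."},
        ],
        "stream": False,
        "tools": [{"type": "function", "function": {
            "name": "write_file",
            "description": "Write content to a file",
            "parameters": {
                "type": "object",
                "properties": {"path": {"type": "string"},
                               "content": {"type": "string"}},
                "required": ["path", "content"],
            },
        }}],
    }


def send(payload: dict, api_key: str, timeout: float = 300.0) -> str:
    """POST once and return the HTTP status as a string, or 'transport_error'."""
    request = urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the Cloud API accepts a Bearer token.
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return str(response.status)
    except urllib.error.HTTPError as err:  # e.g. 502 Bad Gateway
        return str(err.code)
    except OSError:  # connection dropped before a status line arrived
        return "transport_error"


def tally(send_once, payload: dict, attempts: int = 3) -> Counter:
    """Send the same payload several times and count the outcomes."""
    return Counter(send_once(payload) for _ in range(attempts))
```

For example, tally(lambda p: send(p, api_key), build_payload(60)) would summarise three attempts at the ~60 line size; a larger attempts value gives a better estimate of how intermittent the 502 is.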

Test 3: Large file (100+ lines) — Consistent 502 ❌

Requesting 100+ lines of code reliably results in 502 Bad Gateway.

Test 4: Very large file (200+ lines) — Immediate EOF ❌

Requesting 200+ lines fails immediately with an unexpected EOF error:

{"error":"Post \"https://ollama.com:443/v1/chat/completions?ts=1778353283\": unexpected EOF"}

Both endpoints (ollama.com/v1 and localhost:11434/v1) produce identical failures, confirming the issue is on Ollama's cloud server side.

Key Observations

No API parameter controls tool_call size: The OpenAI compatibility endpoint (/v1/chat/completions) has no parameter to control tool_call.function.arguments maximum length. max_tokens only controls total generation tokens, not individual tool call argument size.
Cloud models have no Modelfile: ollama show kimi-k2.6:cloud --modelfile returns empty. Users cannot set num_ctx or other parameters via Modelfile for cloud models.
The truncation/502 is server-side: The direct Cloud API (https://ollama.com/v1) and the local proxy (http://localhost:11434/v1) produce identical failures, and the local proxy merely forwards to ollama.com:443, which rules out the local Ollama proxy as the cause. Per the docs, cloud models are set to their maximum context length by default, so context length is not the bottleneck.
Non-tool-call output works fine: Regular text responses (without tool_calls) with large content do not trigger 502 errors, suggesting the issue is specific to how the cloud buffers or streams structured tool_call.function.arguments.

Expected Behavior

tool_call.function.arguments should contain complete, valid JSON regardless of content length, up to the model's context window limit (256K for kimi-k2.6:cloud). If there is a server-side limit, it should be documented and the API should return a clear error instead of a 502 or silent truncation.

Impact

This makes Ollama Cloud models effectively unusable for agent workflows that write files via tool calls (e.g., coding assistants, automated code generation). Any file over ~40 lines cannot be reliably written in a single tool call.

Suggested Fixes

Increase or remove the tool_call arguments buffer limit on the Ollama Cloud proxy.
If a limit must exist, document it clearly and return a proper 4xx error instead of 502 when exceeded.
Consider chunked/streaming delivery of large tool_call arguments to avoid single-buffer limits.
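
To illustrate what chunked delivery would mean for a client, here is a minimal sketch of reassembling tool call arguments from streamed fragments. It assumes the OpenAI-style streaming chunk shape (choices[].delta.tool_calls[] entries carrying an index and partial function.arguments strings); the function name accumulate_tool_calls is hypothetical, and this report did not test whether setting "stream": true avoids the failure on Ollama Cloud.

```python
import json


def accumulate_tool_calls(chunks) -> list:
    """Merge streamed tool_call deltas into complete calls, ordered by index.

    Hypothetical helper: assumes OpenAI-style streaming chunks that have
    already been parsed from the event stream into dicts.
    """
    calls = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for fragment in delta.get("tool_calls") or []:
                entry = calls.setdefault(
                    fragment.get("index", 0), {"name": "", "arguments": ""}
                )
                function = fragment.get("function") or {}
                if function.get("name"):
                    entry["name"] = function["name"]
                # Argument text arrives as string pieces; concatenate in order.
                entry["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]
```

The concatenated arguments string is parsed with json.loads only after the stream ends, so no single response buffer has to hold the whole file content.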
