ollama - 💡(How to fix) Fix Cloud models: tool_call.function.arguments truncated or 502 Bad Gateway when outputting large content


Error Message

Requests to Ollama Cloud models that include tool definitions fail when the model emits large content through a tool call: tool_call.function.arguments is truncated, or the request ends in a 502 Bad Gateway or the following error:

{"error":"Post \"https://ollama.com:443/v1/chat/completions?ts=1778353283\": unexpected EOF"}


Bug Description

When using Ollama Cloud models via the Cloud API (https://ollama.com/v1) or local proxy (http://localhost:11434/v1) with tool definitions, tool_call.function.arguments is either truncated or results in a 502 Bad Gateway error when the model attempts to output large content through a tool call. Both endpoints exhibit identical behavior — the issue is server-side on Ollama's cloud infrastructure, not the client proxy.

This makes agent workflows that rely on tool calls to write files unreliable or completely broken.

Environment

  • Ollama version: 0.23.0
  • Model: kimi-k2.6:cloud (also affects glm-5.1:cloud, qwen3.5:397b)
  • Endpoints tested: Both https://ollama.com/v1/chat/completions and http://localhost:11434/v1/chat/completions — same results
  • Plan: Pro
  • Context length: 262,144 (cloud default, confirmed via ollama show)

Reproduction

Send a request to either endpoint with tool definitions. The model generates a write_file tool call. As the requested output size increases, the behavior degrades:

Test 1: Small file (~40 lines) — Works ✅

POST https://ollama.com/v1/chat/completions
(or equivalently http://localhost:11434/v1/chat/completions)

{
  "model": "kimi-k2.6:cloud",
  "messages": [
    {"role": "system", "content": "You are a helpful coding assistant. Always use the write_file tool to write files with ALL content."},
    {"role": "user", "content": "Write a file called test.py with a simple Flask app, about 40 lines."}
  ],
  "stream": false,
  "tools": [{"type": "function", "function": {"name": "write_file", "description": "Write content to a file", "parameters": {"type": "object", "properties": {"path": {"type": "string"}, "content": {"type": "string"}}, "required": ["path", "content"]}}}]
}

Result: tool_call.function.arguments length ~950-1100 chars. Content is complete and valid JSON.
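The "complete and valid JSON" check above can be scripted. The following is a minimal sketch, not the reporter's actual test code: it assumes the response follows the OpenAI chat completions shape, where tool_call.function.arguments arrives as a JSON-encoded string, and the helper name check_tool_call_arguments is hypothetical.

```python
import json

# Keys required by the write_file tool schema in the request above.
REQUIRED_KEYS = ("path", "content")


def check_tool_call_arguments(response: dict) -> dict:
    """Inspect the first tool call in a /v1/chat/completions response body.

    Assumption: `arguments` is a JSON-encoded string (OpenAI-style format).
    """
    message = response["choices"][0]["message"]
    tool_calls = message.get("tool_calls") or []
    if not tool_calls:
        return {"status": "no_tool_call", "args_len": 0}

    raw = tool_calls[0]["function"]["arguments"]
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        # A string cut off mid-value fails to parse; this is the truncation case.
        return {"status": "truncated_or_invalid", "args_len": len(raw)}

    if not isinstance(args, dict) or any(key not in args for key in REQUIRED_KEYS):
        return {"status": "missing_keys", "args_len": len(raw)}

    return {
        "status": "ok",
        "args_len": len(raw),
        "lines": len(str(args["content"]).splitlines()),
    }
```

Reporting args_len and the line count together makes runs comparable with the figures quoted in the tests in this report.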

Test 2: Medium file (~60 lines) — Intermittent 502 ⚠️

Same request, but asking for ~60 lines. Results of 3 consecutive attempts:

Attempt | Result
1 | ✅ OK: args_len=2058, 68 lines
2 | ✅ OK: args_len=2054, 61 lines
3 | ❌ HTTP 502 Bad Gateway
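Because the failure is intermittent at this size, a single request says little. Below is a rough sketch of a repeat harness using only the Python standard library. The names send_once and tally_attempts are hypothetical, the 300-second timeout is an arbitrary choice, and the bearer-token header is an assumption about how the cloud endpoint authenticates.

```python
import http.client
import json
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable, Optional


def send_once(url: str, payload: dict, api_key: Optional[str] = None,
              timeout: float = 300.0) -> str:
    """POST the payload once and return a coarse outcome label."""
    headers = {"Content-Type": "application/json"}
    if api_key:  # assumption: bearer-token auth for the cloud endpoint
        headers["Authorization"] = f"Bearer {api_key}"
    request = urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            json.loads(response.read())
    except urllib.error.HTTPError as exc:
        return f"http_{exc.code}"   # e.g. "http_502"
    except (OSError, http.client.HTTPException):
        return "transport_error"    # connection reset, timeout, short read
    except ValueError:
        return "invalid_body"       # 2xx response whose body is not valid JSON
    return "ok"


def tally_attempts(attempts: int, sender: Callable[[], str]) -> Counter:
    """Call `sender` the given number of times and count each outcome label."""
    return Counter(sender() for _ in range(attempts))
```

A run against the local proxy might look like tally_attempts(3, lambda: send_once("http://localhost:11434/v1/chat/completions", payload)), where payload is the request body from Test 1 with the line count changed.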

Test 3: Large file (100+ lines) — Consistent 502 ❌

Requesting 100+ lines of code reliably results in 502 Bad Gateway.

Test 4: Very large file (200+ lines) — Immediate EOF ❌

{"error":"Post \"https://ollama.com:443/v1/chat/completions?ts=1778353283\": unexpected EOF"}

Both endpoints (ollama.com/v1 and localhost:11434/v1) produce identical failures, confirming the issue is on Ollama's cloud server side.

Key Observations

  1. No API parameter controls tool_call size: The OpenAI compatibility endpoint (/v1/chat/completions) has no parameter to control tool_call.function.arguments maximum length. max_tokens only controls total generation tokens, not individual tool call argument size.

  2. Cloud models have no Modelfile: ollama show kimi-k2.6:cloud --modelfile returns empty. Users cannot set num_ctx or other parameters via Modelfile for cloud models.

  3. The truncation/502 is server-side: Both direct Cloud API (https://ollama.com/v1) and local proxy (http://localhost:11434/v1) produce identical failures. The local proxy merely forwards to ollama.com:443. Per docs, Cloud models are set to their maximum context length by default — context length is not the bottleneck.

  4. Non-tool-call output works fine: Regular text responses (without tool_calls) with large content do not trigger 502 errors, suggesting the issue is specific to how the cloud handles structured tool_call.function.arguments in buffering/streaming.

  5. Both endpoints affected identically: Whether calling https://ollama.com/v1/chat/completions directly or proxying through http://localhost:11434/v1/chat/completions, the same truncation/502 pattern occurs — eliminating any local Ollama proxy as the cause.
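Observation 4 can be checked with an A/B pair of requests that differ only in whether tools are attached. The report does not include the non-tool request it used, so the variant below is an assumption; build_payload is a hypothetical helper, and the prompt text is adapted from the Test 1 request.

```python
import json

# Tool definition copied from the reproduction request in this report.
WRITE_FILE_TOOL = {
    "type": "function",
    "function": {
        "name": "write_file",
        "description": "Write content to a file",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}, "content": {"type": "string"}},
            "required": ["path", "content"],
        },
    },
}

SYSTEM_PROMPT = (
    "You are a helpful coding assistant. "
    "Always use the write_file tool to write files with ALL content."
)


def build_payload(model: str, line_count: int, use_tools: bool) -> dict:
    """Build a chat completions body, with or without the write_file tool."""
    messages = [{
        "role": "user",
        "content": f"Write a file called test.py with a simple Flask app, "
                   f"about {line_count} lines.",
    }]
    payload = {"model": model, "messages": messages, "stream": False}
    if use_tools:
        # The system prompt only makes sense when the tool is actually offered.
        messages.insert(0, {"role": "system", "content": SYSTEM_PROMPT})
        payload["tools"] = [WRITE_FILE_TOOL]
    return payload


if __name__ == "__main__":
    print(json.dumps(build_payload("kimi-k2.6:cloud", 100, use_tools=False), indent=2))
```

If the plain-text variant succeeds at a size where the tool variant returns 502, that supports the claim that the failure is tied to tool-call handling rather than to output length alone.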

Expected Behavior

tool_call.function.arguments should contain complete, valid JSON regardless of content length, up to the model's context window limit (256K for kimi-k2.6:cloud). If there is a server-side limit, it should be documented and the API should return a clear error instead of a 502 or silent truncation.

Impact

This makes Ollama Cloud models effectively unusable for agent workflows that write files via tool calls (e.g., coding assistants, automated code generation). Any file over ~40 lines cannot be reliably written in a single tool call.

Suggested Fixes

  1. Increase or remove the tool_call arguments buffer limit on the Ollama Cloud proxy.
  2. If a limit must exist, document it clearly and return a proper 4xx error instead of 502 when exceeded.
  3. Consider chunked/streaming delivery of large tool_call arguments to avoid single-buffer limits.
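For suggestion 3, the client side of chunked delivery could look like the sketch below. It assumes the OpenAI-style streaming format, in which each chunk carries a fragment of the arguments string under choices[].delta.tool_calls[] keyed by index. The tests in this report used "stream": false, so whether Ollama Cloud streams tool-call arguments this way is not established here; assemble_tool_calls is a hypothetical helper name.

```python
import json
from typing import Dict, Iterable, List


def assemble_tool_calls(chunks: Iterable[dict]) -> List[dict]:
    """Concatenate streamed tool-call fragments into complete calls.

    Assumption: chunks follow the OpenAI chat.completion.chunk layout, where
    each delta may carry partial `function.arguments` text for a call `index`.
    """
    calls: Dict[int, dict] = {}
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            for fragment in delta.get("tool_calls") or []:
                slot = calls.setdefault(
                    fragment.get("index", 0), {"name": "", "arguments": ""}
                )
                function = fragment.get("function") or {}
                if function.get("name"):
                    slot["name"] = function["name"]
                # Argument text arrives in pieces; append in arrival order.
                slot["arguments"] += function.get("arguments") or ""
    return [calls[index] for index in sorted(calls)]


def parse_arguments(call: dict) -> dict:
    """Parse the assembled arguments; raises json.JSONDecodeError if truncated."""
    return json.loads(call["arguments"])
```

Parsing only after the stream ends means a connection that drops mid-stream surfaces as a JSON decode error on the client instead of silently producing a shortened file.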
