ollama: [cloud] Streaming times out after ~120-145s on long responses with large context (no ping/keepalive during tool composition or long generation)



What is the issue?

I'm intermittently hitting stream timeouts with Ollama Cloud models. Streaming requests sent through the ollama.com cloud proxy time out after roughly two minutes (about 124-145 s in the log output) on long generation tasks, even while the underlying model is still actively processing.

Setup: using Hermes Agent with Ollama Cloud models on Ubuntu 24.04.
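
To illustrate the failure mode described above, here is a minimal local simulation, not Ollama's or Hermes Agent's actual code. It assumes the client applies a per-read idle timeout (the log reports `ReadTimeout`), so a stream that goes silent for longer than that timeout is dropped even though the sender is still working. Periodic SSE comment lines (`: ping`) reset the idle timer. All names (`serve`, `read_stream`, `READ_TIMEOUT`, `SILENT_GAP`) are hypothetical, and the timings are scaled down from minutes to fractions of a second.

```python
import socket
import threading
import time

READ_TIMEOUT = 0.3  # stand-in for the client's read timeout (scaled down)
SILENT_GAP = 0.9    # stand-in for a long stretch with no generated output


def serve(conn, keepalive_interval):
    """Send one chunk, stay 'busy' for SILENT_GAP, then send the last chunk."""
    try:
        conn.sendall(b"data: first\n\n")
        deadline = time.monotonic() + SILENT_GAP
        while time.monotonic() < deadline:
            if keepalive_interval is None:
                time.sleep(0.05)  # busy, but nothing is written to the stream
            else:
                time.sleep(keepalive_interval)
                conn.sendall(b": ping\n\n")  # SSE comment line, ignored by parsers
        conn.sendall(b"data: last\n\n")
    except OSError:
        pass  # the reader gave up and closed its end
    finally:
        conn.close()


def read_stream(keepalive_interval):
    """Read until EOF or until no bytes arrive for READ_TIMEOUT seconds."""
    client, server = socket.socketpair()
    client.settimeout(READ_TIMEOUT)
    worker = threading.Thread(target=serve, args=(server, keepalive_interval))
    worker.start()
    received = b""
    outcome = "completed"
    try:
        while True:
            chunk = client.recv(4096)
            if not chunk:  # sender closed the stream normally
                break
            received += chunk
    except socket.timeout:
        outcome = "read timeout"
    finally:
        client.close()
        worker.join()
    return outcome, received
```

Calling `read_stream(None)` models the silent stream and ends with a read timeout before the last chunk arrives, while `read_stream(0.1)` models a stream with keepalives and runs to completion. Whether the drop in this issue originates in the client, the proxy, or an intermediary is not established by the log; the sketch only shows why a keepalive would matter under the stated assumption.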

Relevant log output

⚠️ ollama-cloud stream drop (ReadTimeout) after 144.6s — reconnecting, retry 2/3
⚠️ ollama-cloud stream drop (ReadTimeout) after 134.3s — reconnecting, retry 3/3
❌ Connection to provider failed after 3 attempts. The provider may be experiencing issues — try again in a moment.
⚠️  API call failed (attempt 1/3): ReadTimeout
   🔌 Provider: ollama-cloud  Model: minimax-m2.7
   🌐 Endpoint: https://ollama.com/v1
   📝 Error: The read operation timed out
⏳ Retrying in 2.4s (attempt 1/3)...
⚠️ ollama-cloud stream drop (ReadTimeout) after 124.0s — reconnecting, retry 2/3
⚠️ ollama-cloud stream drop (ReadTimeout) after 123.9s — reconnecting, retry 3/3
  ┊ 🐍 preparing execute_code…
  ┊ 🐍 exec      import subprocess, xml.etree.ElementTree as ET  3.8s
  ┊ 🐍 preparing execute_code…
  ┊ 🐍 preparing execute_code…
  ┊ 🐍 exec      import subprocess, xml.etree.ElementTree as ET  3.4s
  ┊ 🐍 exec      import subprocess, re  4.2s
  ┊ 🐍 preparing execute_code…
  ┊ 🐍 exec      import subprocess, xml.etree.ElementTree as ET  5.0s
⚠️ Model returned empty after tool calls — nudging to continue
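
As a quick check of the "~2 minutes" figure, the drop durations can be pulled out of the log lines above. This is a small sketch; the regular expression and the `drop_durations` helper are illustrative and assume the log format shown here.

```python
import re

# The four stream-drop lines copied from the log output above.
LOG = """\
⚠️ ollama-cloud stream drop (ReadTimeout) after 144.6s — reconnecting, retry 2/3
⚠️ ollama-cloud stream drop (ReadTimeout) after 134.3s — reconnecting, retry 3/3
⚠️ ollama-cloud stream drop (ReadTimeout) after 124.0s — reconnecting, retry 2/3
⚠️ ollama-cloud stream drop (ReadTimeout) after 123.9s — reconnecting, retry 3/3
"""

# Matches e.g. "stream drop (ReadTimeout) after 144.6s".
DROP = re.compile(r"stream drop \((\w+)\) after ([\d.]+)s")


def drop_durations(log_text):
    """Return the elapsed seconds reported on each stream-drop line."""
    return [float(m.group(2)) for m in DROP.finditer(log_text)]


durations = drop_durations(LOG)
shortest, longest = min(durations), max(durations)
```

With these four lines, `shortest` is 123.9 and `longest` is 144.6, which matches the 120-145 s range in the issue title.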

OS

Linux

GPU

No response

CPU

AMD

Ollama version

0.22.1
