ollama - ✅(Solved) Feature Request: anthropic/anthropic.go:Process() should send ping events during tool call composition to avoid a streaming timeout [1 pull request, 1 participant]

GitHub stats
ollama/ollama#14902 (fetched 2026-04-08 00:48:02)
Comments: 0
Participants: 1
Timeline: 6
Reactions: 0
Timeline (top): cross-referenced ×2, referenced ×2, labeled ×1, marked_as_duplicate ×1

Error Message

From the user's point of view, the command runs for a long time and eventually starts showing timeout messages. Running grep ERROR ~/.claude/debug/latest shows "[ERROR] Error streaming, falling back to non-streaming mode: The operation timed out." along with entries such as "[ERROR] API error (attempt 1/11): undefined Request timed out." No file is created.

Fix Action

Fixed

PR fix notes

PR #14932: Fix streaming timeout during tool call composition

Description (problem / solution / changelog)

(No description)

Changed files

  • anthropic/anthropic.go (modified, +11/-0)
  • server/routes.go (modified, +4/-1)


This is a simpler approach that would also resolve the behavior described in https://github.com/ollama/ollama/issues/14858. I'll describe the issue again here so you don't have to read an obsolete request.

Environment

  • Ollama 0.18
  • Claude Code 2.1.70
  • OS/hardware: Ubuntu 24.04, AMD Ryzen 5, GeForce RTX 3060

If it matters, I'm running everything locally, with the Ethernet cable unplugged.

$ ollama ps
NAME                    ID              SIZE     PROCESSOR          CONTEXT    UNTIL               
glm-4.7-flash:latest    d1a8a26252f1    21 GB    46%/54% CPU/GPU    65536      59 minutes from now

Summary

While a model is composing a tool call, no responses are sent to the client, even in streaming mode. During that time, the client may time out, causing the request to repeatedly fail and be retried for (in my testing) about an hour.

At least for the Anthropic API, this can be avoided by sending ping events - this works for me on a local build.
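For reference, here is a minimal sketch of what a ping looks like on the wire in an Anthropic-style server-sent event stream: an event line, a data line with a small JSON payload, and a blank line. The helper formatSSE and the pingEvent type are hypothetical names used for illustration only; they are not taken from Ollama's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pingEvent is an illustrative stand-in for the payload of a "ping" event.
type pingEvent struct {
	Type string `json:"type"`
}

// formatSSE (hypothetical helper) renders one server-sent event frame:
// an "event:" line, a "data:" line with the JSON payload, and a blank
// line that terminates the frame.
func formatSSE(event string, payload any) (string, error) {
	data, err := json.Marshal(payload)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data), nil
}

func main() {
	frame, err := formatSSE("ping", pingEvent{Type: "ping"})
	if err != nil {
		panic(err)
	}
	fmt.Print(frame)
}
```

A client that resets its idle timer whenever a frame arrives would treat each such ping as proof that the server is still working, even though the ping carries no content.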

Reproduction Steps

  1. Run Claude Code via Ollama: ollama launch claude --model glm-4.7-flash
  2. At the prompt, enter something like Recall a summary of each book of the Old Testament and write them to old_testament_summaries.md

Note: While this is sufficient to reproduce the behavior on my system, the underlying issue is a timeout; a more powerful computer might be able to successfully complete the task quickly enough that you don't see it. Setting CUDA_VISIBLE_DEVICES="" to force CPU inference or specifying a longer text should make the issue more obvious.

Desired behavior

old_testament_summaries.md should be created and contain text.

Current behavior

From the user's point of view, the command runs for a long time and eventually starts showing timeout messages. Running grep ERROR ~/.claude/debug/latest shows "[ERROR] Error streaming, falling back to non-streaming mode: The operation timed out." along with entries such as "[ERROR] API error (attempt 1/11): undefined Request timed out." No file is created.

Observations

https://ollama.com/blog/streaming-tool initially made me think this was expected to work, but reading it more closely I see that it only talks about streaming tool responses, not composing the arguments, so I'm filing this as a feature request instead of as a bug.

I believe this is due to tools/tools.go:Add() gathering up responses before parsing them into a ToolCall. Certainly anthropic/anthropic.go:Process() is not set up to stream tool call arguments - it expects tool calls to be given in one shot. A packet trace supports this theory - I see Ollama talking to the model and coming up with text, while the Claude-Ollama connection has no traffic and Claude eventually gives up.
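The buffering behavior suspected here can be illustrated with a toy parser. This is only a sketch of the general pattern, not the code in tools/tools.go: toyParser, its Add signature, and the <tool_call> tags are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// toyParser is a hypothetical stand-in for a tool-call parser. Once it
// sees the opening tag it buffers tokens and returns nothing until the
// closing tag arrives.
type toyParser struct {
	buf        strings.Builder
	inToolCall bool
}

// Add consumes one token. It returns plain content to forward, or the
// complete tool-call body (with done == true) once the call is closed.
func (p *toyParser) Add(token string) (content string, toolCall string, done bool) {
	switch {
	case token == "<tool_call>":
		p.inToolCall = true
		return "", "", false
	case token == "</tool_call>":
		p.inToolCall = false
		call := p.buf.String()
		p.buf.Reset()
		return "", call, true
	case p.inToolCall:
		p.buf.WriteString(token) // buffered, nothing emitted
		return "", "", false
	default:
		return token, "", false
	}
}

func main() {
	tokens := []string{"Sure.", "<tool_call>", `{"name":`, `"write"}`, "</tool_call>"}
	p := &toyParser{}
	silent := 0
	for _, tok := range tokens {
		content, call, done := p.Add(tok)
		switch {
		case done:
			fmt.Printf("tool call ready: %s\n", call)
		case content != "":
			fmt.Printf("content: %s\n", content)
		default:
			silent++ // nothing would reach the client for this token
		}
	}
	fmt.Printf("tokens with no client-visible output: %d\n", silent)
}
```

With a long tool-call argument, such as the full text of a file to write, the number of silent tokens grows with the argument length, which matches the quiet connection seen in the packet trace.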

Proposed Changes

Disclaimer: While this suggestion fixes my use case, I'm not confident this won't break other things - someone more knowledgeable needs to verify this is a good approach.

server/routes.go:ChatHandler

https://github.com/ollama/ollama/blob/bbbad97686205cfd897a9e4e931889a3598a0652/server/routes.go#L2443-L2448 only passes events along if the parser produces non-empty output. While tools/tools.go:Add() is gathering up responses, it generates empty output, so during that time nothing makes it to anthropic/anthropic.go:Process().

Instead of doing nothing when there's empty output, other layers should still be notified so they can react if desired.

if req.Stream != nil && *req.Stream {
    ch <- res
}
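To make the effect of this change concrete, here is a simplified sketch that contrasts dropping empty chunks with forwarding them in streaming mode. The chunk type, the forward function, and the forwardEmpty switch are hypothetical; the real handler works with Ollama's own response type and a channel, and its exact condition may differ.

```go
package main

import "fmt"

// chunk is a simplified, hypothetical stand-in for a chat response.
type chunk struct {
	Content string
	Done    bool
}

// forward returns the chunks that would reach the next layer. With
// forwardEmpty == false, empty chunks are dropped, as in the behavior
// described above. With forwardEmpty == true, empty chunks are also
// passed along, but only when streaming is enabled.
func forward(chunks []chunk, stream *bool, forwardEmpty bool) []chunk {
	var out []chunk
	for _, c := range chunks {
		if c.Content != "" || c.Done {
			out = append(out, c)
			continue
		}
		if forwardEmpty && stream != nil && *stream {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	stream := true
	chunks := []chunk{{Content: "Sure."}, {}, {}, {Done: true}}
	fmt.Println("dropping empties: ", len(forward(chunks, &stream, false)))
	fmt.Println("forwarding empties:", len(forward(chunks, &stream, true)))
}
```

Forwarding the empty chunks does not by itself put anything on the wire; it only gives downstream layers such as the Anthropic adapter the chance to react to them.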

anthropic/anthropic.go:Process()

anthropic.go defines a PingEvent, though it's never used anywhere. If Process sees an empty message, it could emit a PingEvent:

if r.Message.Thinking == "" && r.Message.Content == "" && !r.Done && len(r.Message.ToolCalls) == 0 {
    events = append(events, StreamEvent{
        Event: "ping",
        Data: PingEvent{
            Type: "ping",
        },
    })
}
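The check above can be tried in isolation with a self-contained sketch. The types below are simplified stand-ins that reuse the field names from the snippet; they are assumptions for illustration, not the definitions in Ollama's api or anthropic packages, and pingFor is a hypothetical helper.

```go
package main

import "fmt"

// Simplified stand-ins for the real types (assumptions for illustration).
type ToolCall struct{ Name string }

type Message struct {
	Thinking  string
	Content   string
	ToolCalls []ToolCall
}

type ChatResponse struct {
	Message Message
	Done    bool
}

type PingEvent struct {
	Type string `json:"type"`
}

type StreamEvent struct {
	Event string
	Data  any
}

// pingFor (hypothetical helper) returns a single ping event when the
// response carries no thinking, no content, no tool calls, and is not
// the final chunk. Otherwise it returns nil.
func pingFor(r ChatResponse) []StreamEvent {
	var events []StreamEvent
	if r.Message.Thinking == "" && r.Message.Content == "" && !r.Done && len(r.Message.ToolCalls) == 0 {
		events = append(events, StreamEvent{
			Event: "ping",
			Data:  PingEvent{Type: "ping"},
		})
	}
	return events
}

func main() {
	fmt.Println(len(pingFor(ChatResponse{})))                                   // empty chunk
	fmt.Println(len(pingFor(ChatResponse{Message: Message{Content: "hello"}}))) // content chunk
	fmt.Println(len(pingFor(ChatResponse{Done: true})))                         // final chunk
}
```

Only the fully empty, non-final chunk yields a ping, so chunks that already carry data would not gain an extra event under this rule.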

Considerations

https://github.com/ollama/ollama/blob/bbbad97686205cfd897a9e4e931889a3598a0652/server/routes.go#L610-L614 suggests that quashing empty events was a conscious choice to avoid triggering bad client behavior. What was the bad behavior and is there another way to avoid it? Does a similar change need to happen here?

extent analysis

Fix Plan

To resolve the issue of no responses being sent to the client while a model is composing a tool call, we need to modify the code to send ping events in streaming mode. Here are the steps:

  • In server/routes.go:ChatHandler, modify the code to pass along events even if the parser comes up with empty output:
if req.Stream != nil && *req.Stream {
    ch <- res
}
  • In anthropic/anthropic.go:Process(), add a check for empty messages and emit a PingEvent:
if r.Message.Thinking == "" && r.Message.Content == "" && !r.Done && len(r.Message.ToolCalls) == 0 {
    events = append(events, StreamEvent{
        Event: "ping",
        Data: PingEvent{
            Type: "ping",
        },
    })
}

Verification

To verify that the fix worked, run the reproduction steps again and check for the following:

  • The client should no longer time out while the model is composing a tool call.
  • The old_testament_summaries.md file should be created and contain text.
  • The debug logs should no longer show timeout errors.
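Beyond the checks above, one way to confirm that pings are actually being sent is to count event names in a captured stream. The sketch below assumes the stream has been saved as text or is available as an io.Reader; countEvents and countEventsString are hypothetical helpers, and the sample input is made up for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// countEvents (hypothetical helper) scans a server-sent event stream and
// counts the "event:" lines whose name equals the given name.
func countEvents(r io.Reader, name string) (int, error) {
	sc := bufio.NewScanner(r)
	// Allow long data lines, since tool-call arguments can be large.
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	n := 0
	for sc.Scan() {
		if rest, ok := strings.CutPrefix(sc.Text(), "event:"); ok && strings.TrimSpace(rest) == name {
			n++
		}
	}
	return n, sc.Err()
}

// countEventsString is a convenience wrapper for captured text.
func countEventsString(s, name string) (int, error) {
	return countEvents(strings.NewReader(s), name)
}

func main() {
	// Made-up sample capture, for illustration only.
	captured := "event: message_start\ndata: {}\n\n" +
		"event: ping\ndata: {\"type\":\"ping\"}\n\n" +
		"event: ping\ndata: {\"type\":\"ping\"}\n\n" +
		"event: content_block_delta\ndata: {}\n\n"
	n, err := countEventsString(captured, "ping")
	if err != nil {
		panic(err)
	}
	fmt.Printf("ping events seen: %d\n", n)
}
```

If the fix is working, a capture taken while the model composes a long tool call should show a steady run of ping events where the unpatched build showed no traffic.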

Extra Tips

  • Make sure to test the changes thoroughly to avoid introducing any regressions.
  • Consider adding additional logging or monitoring to detect any potential issues with the ping events.
  • Review the code changes to ensure they align with the project's coding standards and best practices.
