ollama - ✅(Solved) Fix bge-m3 only returns NaN on bitcoin whitepaper, other docs [1 pull request, 5 comments, 4 participants]

ollama · 2026-03-06 02:31:06


ollama/ollama#14657•Fetched 2026-04-08 00:33:16


Error Message

openai.InternalServerError: Error code: 500 - {'error': {'message': 'failed to encode response: json: unsupported value: NaN', 'type': 'api_error', 'param': None, 'code': None}}

Fix Action

Fixed

Addressed by PR: server: handle NaN values in embedding responses (https://github.com/ollama/ollama/pull/14739). The PR notes below list it as open and not merged at fetch time.

PR fix notes

PR #14739: server: handle NaN values in embedding responses

Repository: ollama/ollama
Author: mvanhorn
State: open | merged: False
Link: https://github.com/ollama/ollama/pull/14739

Description (problem / solution / changelog)

Fixes #14657

Summary

Added ValidateEmbedding function in the llm package to detect NaN/Inf values before JSON serialization
Applied validation in both runner-level embedding handlers (ollamarunner and llamarunner) where the crash originates
Also added NaN/Inf check in the deprecated EmbeddingsHandler endpoint which was missing the validation that EmbedHandler already had via normalize()
Returns a clear error message ("model produced invalid embedding values (NaN or Inf)") instead of crashing with json: unsupported value: NaN

Context

Go's encoding/json does not support NaN or Inf float values. When a model (e.g., bge-m3 with certain inputs) produces NaN values in its embeddings, the JSON encoder crashes with an unhelpful 500 error. The EmbedHandler path already catches this via the normalize() function, but the runner-level handlers and the deprecated EmbeddingsHandler did not have this protection.
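The same failure mode can be illustrated outside Go. The following is a minimal Python sketch, not the PR's code: `validate_embedding` and `encode_embedding` are hypothetical helpers named after the PR's ValidateEmbedding, and `json.dumps(..., allow_nan=False)` stands in for a strict encoder that rejects NaN and Inf.

```python
import json
import math

# Error text taken from the PR summary.
INVALID_MSG = "model produced invalid embedding values (NaN or Inf)"


def validate_embedding(values):
    """Raise ValueError if any value is NaN or +/-Inf (hypothetical helper)."""
    for v in values:
        if not math.isfinite(v):
            raise ValueError(INVALID_MSG)


def encode_embedding(values):
    """Validate first, then serialize with a strict JSON encoder."""
    validate_embedding(values)
    # allow_nan=False makes json.dumps raise on NaN/Inf instead of emitting them.
    return json.dumps({"embedding": list(values)}, allow_nan=False)


if __name__ == "__main__":
    print(encode_embedding([0.1, -0.2, 0.3]))
    try:
        encode_embedding([0.1, float("nan")])
    except ValueError as err:
        print("rejected:", err)
```

Validating before encoding lets the caller see a message that names the actual problem instead of a generic serialization failure.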

The workaround OLLAMA_FLASH_ATTENTION=false mentioned in the issue suggests the root cause may be in flash attention computation, which could be investigated separately as a deeper fix.
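A minimal sketch of applying that workaround when launching the server from Python. It assumes `ollama` is on PATH and that the variable is set for the server process rather than the client; the helper names are hypothetical. The source does not confirm that the workaround removes the NaN values on every setup.

```python
import os
import subprocess


def flash_attention_disabled_env(base_env=None):
    """Return a copy of the environment with OLLAMA_FLASH_ATTENTION=false."""
    env = dict(os.environ if base_env is None else base_env)
    env["OLLAMA_FLASH_ATTENTION"] = "false"
    return env


def start_ollama_server():
    """Start `ollama serve` with the workaround applied (needs ollama installed)."""
    return subprocess.Popen(["ollama", "serve"], env=flash_attention_disabled_env())


if __name__ == "__main__":
    # Only prints the environment override; it does not start the server.
    print(flash_attention_disabled_env({"PATH": "/usr/bin"}))
```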

Testing

Added TestValidateEmbedding with coverage for valid embeddings, NaN, positive/negative Inf, empty/nil slices, and edge cases
Existing TestNormalize continues to pass

This contribution was developed with AI assistance (Claude Code).

Changed files

llm/embedding_test.go (added, +79/-0)
llm/server.go (modified, +12/-0)
runner/llamarunner/runner.go (modified, +4/-0)
runner/ollamarunner/runner.go (modified, +7/-1)
server/routes.go (modified, +5/-0)



What is the issue?

The bge-m3 model returns NaN values through the OpenAI-compatible embeddings API (/v1/embeddings) when processing certain text content, particularly technical documents.

This causes a 500 error with the message: failed to encode response: json: unsupported value: NaN

Environment

Ollama Version: 0.17.6
OS: Windows 10 / MSYS_NT-10.0-26100
Model: bge-m3:latest (ID: 790764642607, Size: 1.2 GB)
GPU: NVIDIA GeForce RTX 2080 Ti

Steps to Reproduce

Pull the bge-m3 model

ollama pull bge-m3

Test with simple text

  curl -X POST http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
      "model": "bge-m3",
      "input": "This is a test sentence."
    }'

✅ Returns valid 1024-dimensional embedding

Test with technical document content

  curl -X POST http://localhost:11434/v1/embeddings \
    -H "Content-Type: application/json" \
    -d '{
      "model": "bge-m3",
      "input": "Bitcoin: A Peer-to-Peer Electronic Cash System. Abstract. A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. We propose a solution to the double-spending problem using a peer-to-peer network. The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work."
    }'

Expected Behavior

Should return a valid 1024-dimensional embedding array, similar to the simple text case.

Actual Behavior

  {
    "error": {
      "message": "failed to encode response: json: unsupported value: NaN",
      "type": "api_error",
      "param": null,
      "code": null
    }
  }
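The reproduction above can also be scripted. This is a minimal sketch using only the Python standard library. The helper names (`build_request`, `error_message`, `embed`) are hypothetical. Building the body with `json.dumps` avoids quoting problems with long multi-line text.

```python
import json
import urllib.error
import urllib.request

OLLAMA_URL = "http://localhost:11434/v1/embeddings"  # endpoint from the issue


def build_request(model, text, url=OLLAMA_URL):
    """Build the POST request; json.dumps escapes newlines and quotes in text."""
    body = json.dumps({"model": model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def error_message(body):
    """Return the error message from a response body, or None if there is none."""
    payload = json.loads(body)
    err = payload.get("error") if isinstance(payload, dict) else None
    if isinstance(err, dict):
        return err.get("message")
    return err  # a plain string, or None when no error is present


def embed(model, text):
    """Send the request and return (status, body). Needs a running Ollama server."""
    try:
        with urllib.request.urlopen(build_request(model, text)) as resp:
            return resp.status, resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8")
```

With a server running, `status, body = embed("bge-m3", text)` followed by `error_message(body)` should surface the NaN encoding error shown above for the failing input.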

Additional Context

The same technical text works perfectly with nomic-embed-text:latest, which returns valid 768-dimensional embeddings without any NaN values
This issue occurs consistently with content from technical PDFs (e.g., Bitcoin whitepaper, research papers)
The issue appears to be specific to bge-m3 - other embedding models handle the same content without issues

Relevant log output

openai.InternalServerError: Error code: 500 - {'error': {'message': 'failed to encode response: json: unsupported value: NaN', 'type': 'api_error', 'param': None, 'code': None}}

OS: Windows
GPU: Nvidia
CPU: Intel
Ollama version: 0.17.6

extent analysis

Fix Plan

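To address the NaN values returned by the bge-m3 model, the plan has three parts:

Clip numeric values: Where the pipeline exposes numeric tensors, keep them within a finite range so that overflow does not turn into NaN. This does not apply to the text passed to /v1/embeddings, and clipping does not remove NaN values that are already present.
Handle NaN values in the model output: Replace NaN entries with a substitute value (e.g., zero). This hides the underlying numerical problem; PR #14739 instead rejects such embeddings with an error.
Update the API to handle NaN values: Check for NaN and Inf before the response is serialized and return a clear error, which is the approach PR #14739 takes.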

Code Changes

The following Python snippet illustrates clipping and NaN replacement on plain arrays and tensors. It is a generic illustration, not Ollama code (the PR changes Go files):

import numpy as np
import torch

# Clip numeric values to a finite range.
# Note: np.clip leaves NaN entries unchanged, so NaN needs separate handling.
def clip_input_values(input_values, min_value=-1e6, max_value=1e6):
    return np.clip(input_values, min_value, max_value)

# Replace NaN entries in the model output with zeros.
def handle_nan_values(model_output):
    return torch.where(torch.isnan(model_output), torch.zeros_like(model_output), model_output)

# Example usage:
input_values = np.array([1.0, 2.0, 1e9, 4.0])  # 1e9 is outside the allowed range
clipped_input_values = clip_input_values(input_values)
model_output = torch.tensor([1.0, 2.0, float("nan"), 4.0])
handled_output = handle_nan_values(model_output)

print("Clipped Input Values:", clipped_input_values)
print("Handled Output:", handled_output)

API Updates

To handle NaN values in the API response, check for NaN and Inf values before serializing the response and return an error instead:

import json
import numpy as np

# Return an error payload if the embedding contains NaN or Inf;
# otherwise return it as a JSON-serializable list.
def handle_nan_in_response(response):
    if not np.isfinite(response).all():
        return {"error": "NaN or Inf values encountered in response"}
    return response.tolist()

# Example usage:
response = np.array([1.0, 2.0, np.nan, 4.0])
handled_response = handle_nan_in_response(response)

print("Handled Response:", json.dumps(handled_response))

Verification

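To verify the fix, resend the technical document request that previously failed. The API should return either a valid embedding with no NaN values or, with the PR's validation in place, a clear error message ("model produced invalid embedding values (NaN or Inf)") instead of json: unsupported value: NaN.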
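A minimal sketch of such a check on an already parsed response. It assumes the OpenAI-style response shape (a `data` list whose items carry `embedding` and `index`) and the 1024 dimensions reported in the issue for bge-m3; `check_embedding_response` is a hypothetical helper.

```python
import math


def check_embedding_response(payload, expected_dim=1024):
    """Return a list of problems found in a parsed /v1/embeddings response."""
    if "error" in payload:
        return [f"server returned error: {payload['error']}"]
    items = payload.get("data") or []
    if not items:
        return ["no embeddings in response"]
    problems = []
    for item in items:
        vec = item.get("embedding", [])
        idx = item.get("index")
        if len(vec) != expected_dim:
            problems.append(f"index {idx}: got {len(vec)} dims, expected {expected_dim}")
        if not all(math.isfinite(v) for v in vec):
            problems.append(f"index {idx}: contains NaN or Inf")
    return problems
```

An empty list means the response passed. For nomic-embed-text, pass `expected_dim=768` to match the dimension reported in the issue.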

Extra Tips

Ensure that the model is properly trained and validated to handle a wide range of input values.
Consider adding input validation and sanitization to prevent invalid or malicious input values.
Monitor the API for any issues related to NaN values and update the fix as needed.
