llamaIndex - ✅(Solved) Fix [Bug]: Ollama does not respect the client's initialization [1 pull requests, 1 comments, 2 participants]

llamaIndex2026-03-19 22:14:43

ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

GitHub issue URL

Helpful · Quick feedback

GitHub stats

run-llama/llama_index#21086•Fetched 2026-04-08 01:03:36

View on GitHub

Comments

Participants

Timeline

Reactions

Author

josx

Participants

dosubot[bot]

josx

Timeline (top)

labeled ×2closed ×1commented ×1cross-referenced ×1

Error Message

This error is 401, because it is not sending the headers

Traceback (most recent call last): File "test.py", line 12, in <module> resp = llm.complete("hi") ^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper result = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 447, in wrapped_llm_predict f_return_val = f(_self, *args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 659, in complete return chat_to_completion_decorator(self.chat)(prompt, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/core/base/llms/generic_utils.py", line 184, in wrapper chat_response = func(messages, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper result = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 181, in wrapped_llm_chat f_return_val = f(_self, messages, **kwargs) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 404, in chat options=self._model_kwargs, ^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 214, in _model_kwargs "num_ctx": self.get_context_window(), ^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 224, in get_context_window info = self.client.show(self.model).modelinfo ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 637, in show return self._request( ^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 190, in _request return cls(**self._request_raw(*args, **kwargs).json()) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 134, in _request_raw raise ResponseError(e.response.text, e.response.status_code) from None ollama._types.ResponseError: Unauthorized (status code: 401)

Root Cause

This error is 401, because it is not sending the headers

Fix Action

Fix / Workaround

PR fix notes

PR #21091: fix(ollama): pass custom headers to auto-created clients

Repository: run-llama/llama_index
Author: howardpen9
State: closed | merged: True
Link: https://github.com/run-llama/llama_index/pull/21091

Description (problem / solution / changelog)

Description

When users need authentication headers (e.g. Authorization: Bearer) for remote Ollama instances, they currently have to manually construct Client / AsyncClient objects and pass them to the Ollama constructor. This is unintuitive because the auto-created fallback clients silently discard any auth context.

This PR adds a headers parameter to the Ollama class. When set, auto-created sync and async clients inherit these headers. Explicitly passed client / async_client objects still take precedence (existing behavior preserved).

Changes

Add headers: Optional[Dict[str, str]] field to Ollama class
Pass headers to __init__ and through to super().__init__()
Update client and async_client properties to pass headers when creating fallback clients

Usage

from llama_index.llms.ollama import Ollama

# Before: had to manually construct Client with headers
# After: just pass headers directly
llm = Ollama(
    model="llama3",
    base_url="https://my-ollama-server.com",
    headers={"Authorization": "Bearer MY_API_KEY"},
)

resp = llm.complete("Hello")  # headers automatically included

Backward Compatibility

headers defaults to None — no behavior change for existing users
Explicitly passed client / async_client still take full precedence
No new dependencies

Fixes #21086

Changed files

llama-index-integrations/llms/llama-index-llms-ollama/llama_index/llms/ollama/base.py (modified, +14/-2)
llama-index-integrations/llms/llama-index-llms-ollama/pyproject.toml (modified, +1/-1)

Code Example

from llama_index.llms.ollama import Ollama
from ollama import AsyncClient, Client

host=MY_CVUSTOM_HOST
headers={"Authorization": "Bearer MY_API_KEY}
model="MY_MODEL"

c = AsyncClient(host=host, headers=headers)
llm = Ollama(async_client=c, base_url=host, model=model)

resp = llm.complete("hi")
print(resp)

---

from ollama import Client
host=MY_CVUSTOM_HOST
headers={"Authorization": "Bearer MY_API_KEY}
model="MY_MODEL"
client = Client(host=host, headers=headers)
messages = [{ 'role': 'user',  'content': 'Why is the sky blue?', },]
for part in client.chat(model, messages=messages, stream=True):
    print(part.message.content, end='', flush=True)

---

This error is 401, because it is not sending the headers


Traceback (most recent call last):
  File "test.py", line 12, in <module>
    resp = llm.complete("hi")
           ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 447, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 659, in complete
    return chat_to_completion_decorator(self.chat)(prompt, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/base/llms/generic_utils.py", line 184, in wrapper
    chat_response = func(messages, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 181, in wrapped_llm_chat
    f_return_val = f(_self, messages, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 404, in chat
    options=self._model_kwargs,
            ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 214, in _model_kwargs
    "num_ctx": self.get_context_window(),
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 224, in get_context_window
    info = self.client.show(self.model).modelinfo
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 637, in show
    return self._request(
           ^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 190, in _request
    return cls(**self._request_raw(*args, **kwargs).json())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 134, in _request_raw
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: Unauthorized (status code: 401)

RAW_BUFFERClick to expand / collapse

Bug Description

I was experimenting with llama_index, and decided to use Ollama, but my network Ollama instance has a api_key (Authorization headers with Bearer).

It seems that Ollama in Llama-Index not respect **kwargs passed or it is no using Client sent at all. Also I need to send twice the host.

Version

llama-index-llms-ollama==0.10.0

Steps to Reproduce

from llama_index.llms.ollama import Ollama
from ollama import AsyncClient, Client

host=MY_CVUSTOM_HOST
headers={"Authorization": "Bearer MY_API_KEY}
model="MY_MODEL"

c = AsyncClient(host=host, headers=headers)
llm = Ollama(async_client=c, base_url=host, model=model)

resp = llm.complete("hi")
print(resp)

If i use directly de Ollama python client it works ok:

from ollama import Client
host=MY_CVUSTOM_HOST
headers={"Authorization": "Bearer MY_API_KEY}
model="MY_MODEL"
client = Client(host=host, headers=headers)
messages = [{ 'role': 'user',  'content': 'Why is the sky blue?', },]
for part in client.chat(model, messages=messages, stream=True):
    print(part.message.content, end='', flush=True)

Relevant Logs/Tracbacks

This error is 401, because it is not sending the headers


Traceback (most recent call last):
  File "test.py", line 12, in <module>
    resp = llm.complete("hi")
           ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 447, in wrapped_llm_predict
    f_return_val = f(_self, *args, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 659, in complete
    return chat_to_completion_decorator(self.chat)(prompt, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/base/llms/generic_utils.py", line 184, in wrapper
    chat_response = func(messages, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index_instrumentation/dispatcher.py", line 413, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 181, in wrapped_llm_chat
    f_return_val = f(_self, messages, **kwargs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 404, in chat
    options=self._model_kwargs,
            ^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 214, in _model_kwargs
    "num_ctx": self.get_context_window(),
               ^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/llama_index/llms/ollama/base.py", line 224, in get_context_window
    info = self.client.show(self.model).modelinfo
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 637, in show
    return self._request(
           ^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 190, in _request
    return cls(**self._request_raw(*args, **kwargs).json())
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/ollama/_client.py", line 134, in _request_raw
    raise ResponseError(e.response.text, e.response.status_code) from None
ollama._types.ResponseError: Unauthorized (status code: 401)

extent analysis

Fix Plan

To fix the issue, you need to pass the headers to the Ollama instance. However, it seems like the Ollama class does not directly accept headers as a parameter.

Instead, you can pass the async_client with the headers set. Here's how you can modify your code:

from llama_index.llms.ollama import Ollama
from ollama import AsyncClient

host = 'MY_CUSTOM_HOST'
headers = {"Authorization": "Bearer MY_API_KEY"}
model = "MY_MODEL"

c = AsyncClient(host=host, headers=headers)
llm = Ollama(async_client=c, base_url=host, model=model)

resp = llm.complete("hi")
print(resp)

However, since the Ollama class is not using the async_client correctly, we need to modify the Ollama class itself to use the async_client.

Here's an example of how you can modify the Ollama class:

from llama_index.llms.ollama import Ollama
from ollama import AsyncClient

class CustomOllama(Ollama):
    def __init__(self, async_client, base_url, model):
        self.async_client = async_client
        self.base_url = base_url
        self.model = model

    def _request(self, method, path, **kwargs):
        return self.async_client._request_raw(method, path, **kwargs)

# Usage
host = 'MY_CUSTOM_HOST'
headers = {"Authorization": "Bearer MY_API_KEY"}
model = "MY_MODEL"

c = AsyncClient(host=host, headers=headers)
llm = CustomOllama(async_client=c, base_url=host, model=model)

resp = llm.complete("hi")
print(resp)

Verification

To verify that the fix worked, you can check the response status code. If the status code is 200, it means the request was successful.

resp = llm.complete("hi")
print(resp.status_code)  # Should print 200

Extra Tips

Make sure to handle any exceptions that may occur during the request. You can do this by wrapping the request in a try-except block.

try:
    resp = llm.complete("hi")
    print(resp)
except Exception as e:
    print(f"An error occurred: {e}")

Vote matrix · Quick signals

Works

Did the solution work? Tap to confirm.

Easy Fix

Was it a quick fix?

Time Saver

Did it save you time?

Blocking

Was it severely blocking?

Common Issue

Are others likely hitting this too?

Flaky / Intermittent

Is it intermittent?

Verified / Reproducible

Can you reproduce it reliably?

#api #ssr #installation #tensor shape #autograd error #ISR setup #authentication setup #request error #file not found

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Data

Security

Network

Code

UI/UX

Text

System

Multimedia

Protocol

API

Engineering

llamaIndex - ✅(Solved) Fix [Bug]: Ollama does not respect the client's initialization [1 pull requests, 1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

PR fix notes

PR #21091: fix(ollama): pass custom headers to auto-created clients

Description (problem / solution / changelog)

Description

Changes

Usage

Backward Compatibility

Changed files

Code Example

Bug Description

Version

Steps to Reproduce

Relevant Logs/Tracbacks

extent analysis

Fix Plan

Verification

Extra Tips

Still need to ship something?

TRENDING

llamaIndex - ✅(Solved) Fix [Bug]: Ollama does not respect the client's initialization [1 pull requests, 1 comments, 2 participants]

Recommended Tools

GitHub issue graph ai analysis

Error Message

Root Cause

Fix Action

Fix / Workaround

PR fix notes

PR #21091: fix(ollama): pass custom headers to auto-created clients

Description (problem / solution / changelog)

Description

Changes

Usage

Backward Compatibility

Changed files

Code Example

Bug Description

Version

Steps to Reproduce

Relevant Logs/Tracbacks

extent analysis

Fix Plan

Verification

Extra Tips

Still need to ship something?

RELATED_DISCOVERY

TRENDING