LlamaIndex - (How to fix) [Bug]: Retrievers and Indexes silently fall back to OpenAI when local LLM initialization fails [1 comment, 2 participants]

GitHub stats
run-llama/llama_index#20917 (fetched 2026-04-08 00:30:13)
Comments: 1 · Participants: 2 · Timeline events: 3 · Reactions: 0
Timeline (top): closed ×1, commented ×1, cross-referenced ×1


Problem

When a local or air-gapped deployment specifies a custom LLM, llama_index/core/retrievers/ and llama_index/core/indexes/ silently fall back to OpenAI if LLM initialization fails or credentials are missing. This breaks the contract of local-first deployments and creates a security gap: code that should fail visibly instead sends data to OpenAI without user awareness.

Why this matters

  • Air-gapped environments: Deployments designed to never contact external APIs suddenly do, breaking compliance and security policies
  • Cost leakage: Unintended OpenAI API calls incur charges and leave audit trails
  • Silent failure: No warnings, logs, or errors tell the developer that local execution failed
  • Related to #20912: This is the root cause of the fallback vulnerability

Current behavior

When llama_index/core/retrievers/base.py or index classes initialize an LLM and it fails (e.g., HuggingFaceEmbedding cannot load), the code does not raise an exception. Instead, it returns a default OpenAI() instance or skips validation, allowing subsequent operations to contact OpenAI.

Example: If a retriever's _get_llm() method is called and the configured LLM fails to load, the caller receives OpenAI() without any indication that the original request failed.
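
The pattern described above can be illustrated with a minimal, self-contained sketch. This is not the actual LlamaIndex source: `FakeOpenAI`, `load_local_llm`, and `get_llm_with_silent_fallback` are hypothetical names, used only to show how a swallowed exception hides the failure from the caller.

```python
class FakeOpenAI:
    """Hypothetical stand-in for a remote default LLM (not the real OpenAI class)."""


def load_local_llm(model_path: str):
    """Hypothetical loader that fails the way a missing model file would."""
    raise FileNotFoundError(f"model file not found: {model_path}")


def get_llm_with_silent_fallback(model_path: str):
    """Anti-pattern: swallow the load error and return a remote default."""
    try:
        return load_local_llm(model_path)
    except Exception:
        # The original error is discarded, so the caller cannot tell
        # that the requested local model was never loaded.
        return FakeOpenAI()


llm = get_llm_with_silent_fallback("/models/does-not-exist.gguf")
print(type(llm).__name__)  # prints "FakeOpenAI"
```

Nothing in the return value or the logs tells the caller that the local model was requested and failed, which is the core of the complaint.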

Expected behavior

  1. Fail fast: LLM initialization failures should raise an exception immediately with a clear message
  2. No defaults: Do not fall back to OpenAI; require explicit configuration
  3. Log visibility: Document which LLM is being used and warn if it differs from the requested one
  4. Validate at instantiation: Check LLM availability when a Retriever or Index is created, not lazily during the first query
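
A minimal sketch of how these four points could fit together, assuming a hypothetical `StrictRetriever` class and a hypothetical `llm_factory` argument (neither exists in LlamaIndex). It covers failing fast, refusing a default, logging the resolved LLM, and validating in the constructor; it does not attempt to compare the resolved LLM against the requested one.

```python
import logging

logger = logging.getLogger(__name__)


class LLMInitializationError(RuntimeError):
    """Hypothetical error raised when the configured LLM cannot be loaded."""


class StrictRetriever:
    """Hypothetical retriever that resolves its LLM eagerly, with no default."""

    def __init__(self, llm_factory=None):
        if llm_factory is None:
            # No defaults: refuse to guess a provider.
            raise LLMInitializationError(
                "No LLM configured; pass llm_factory explicitly."
            )
        try:
            # Validate at instantiation: load now, not on the first query.
            self.llm = llm_factory()
        except Exception as exc:
            # Fail fast and keep the root cause attached via exception chaining.
            raise LLMInitializationError(
                f"Failed to initialize LLM: {exc}"
            ) from exc
        # Log visibility: record which LLM is actually in use.
        logger.debug("Using LLM: %s", type(self.llm).__name__)


def broken_factory():
    """Hypothetical factory that simulates a missing local model file."""
    raise FileNotFoundError("model file not found")


try:
    StrictRetriever(llm_factory=broken_factory)
except LLMInitializationError as err:
    print(f"construction failed: {err}")
```

Because the error surfaces in the constructor, a misconfigured deployment fails at startup rather than on the first user query.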

Steps to reproduce

  1. Create an air-gapped environment with no OpenAI API key
  2. Instantiate a BM25Retriever with HuggingFaceEmbedding or a local model
  3. Configure the retriever with a non-existent model file or unreachable endpoint
  4. Run a query
  5. Observe: No error is raised; the query succeeds (but contacts OpenAI silently)

What good looks like

  • Raising LLMInitializationError or similar when the configured LLM cannot load
  • Logging the LLM choice at debug level: "Using LLM: CustomModel (local)"
  • Failing queries with a clear error if the LLM was not available at initialization time
  • Tests that verify air-gapped deployments cannot accidentally contact OpenAI
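
One way the last bullet might be approached is a test guard that makes any outbound connection attempt raise. The sketch below uses only the standard library; `NetworkAccessError`, `forbid_network`, and `assert_stays_local` are hypothetical names, and the guard only intercepts `socket.socket.connect`, so it is an assumption that the code under test connects through that method.

```python
import socket
from contextlib import contextmanager
from unittest import mock


class NetworkAccessError(RuntimeError):
    """Hypothetical error raised when code under test opens a connection."""


@contextmanager
def forbid_network():
    """Make every socket.socket.connect() call raise while the block is active."""

    def _blocked_connect(self, address, *args, **kwargs):
        raise NetworkAccessError(f"outbound connection attempted: {address!r}")

    # patch.object restores the original method when the block exits.
    with mock.patch.object(socket.socket, "connect", _blocked_connect):
        yield


def assert_stays_local(run_query):
    """Run a query callable under the guard; any connect attempt raises."""
    with forbid_network():
        return run_query()
```

In a test suite, `run_query` would be a callable that builds the retriever or index with a local model and issues a query; if a fallback path tries to reach a remote API through a regular socket, the test fails with `NetworkAccessError` instead of passing silently.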

Affected files

  • llama_index/core/retrievers/base.py — LLM initialization logic
  • llama_index/core/indexes/base.py — Index LLM initialization
  • Any _get_llm() or get_llm() methods that return a default

Contributed by Klement Gunndu

extent analysis

Fix Plan

Update LLM Initialization Logic

  1. Create a custom exception: Define LLMInitializationError in llama_index/core/retrievers/base.py:

     ```python
     class LLMInitializationError(Exception):
         """Raised when the configured LLM cannot be initialized."""
     ```

  2. Raise an exception on LLM failure: Update base.py to raise LLMInitializationError when the configured LLM cannot load:

     ```python
     try:
         # Attempt to load the configured LLM
         self.llm = self._get_llm()
     except Exception as e:
         # Chain the original error so the root cause stays visible
         raise LLMInitializationError(f"Failed to initialize LLM: {e}") from e
     ```

  3. Log LLM choice: Log the chosen LLM at debug level in base.py:

     ```python
     import logging

     logger = logging.getLogger(__name__)

     # Log the LLM choice
     logger.debug("Using LLM: %s", self.llm.__class__.__name__)
     ```

  4. Validate LLM at instantiation: Update base.py to check LLM availability when a Retriever or Index is created:

     ```python
     try:
         # Attempt to load the LLM in the constructor, not on the first query
         self.llm = self._get_llm()
     except LLMInitializationError:
         # Fail fast: propagate the error instead of substituting a default
         raise
     ```

  5. Update affected files: Apply the same changes to llama_index/core/indexes/base.py and any _get_llm() or get_llm() methods that return a default.


