llamaIndex - ✅(Solved) Fix [Bug]: CondensePlusContextChatEngine.astream_chat silently aborts generation and yields 'Empty Response' when Retriever returns 0 nodes [7 pull requests, 8 comments, 4 participants]

jefersonlop3s · 2026-03-06T01:51:03Z



run-llama/llama_index#20894•Fetched 2026-04-08 00:30:23


Timeline (top): commented ×8, cross-referenced ×5, referenced ×4, mentioned ×3

Fix Action


PR fix notes

PR #20919: fix: call LLM directly when retriever returns 0 nodes instead of Empty Response

Repository: run-llama/llama_index
Author: MaxwellCalkin
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20919

Description (problem / solution / changelog)

Description

Fixes #20894.

When CondensePlusContextChatEngine's retriever returns 0 nodes (e.g., empty vector store, strict metadata filters), BaseSynthesizer.synthesize() / asynthesize() short-circuits with a hardcoded "Empty Response" string without ever calling the LLM. This is problematic for chat engines where the LLM should still respond conversationally using its base knowledge and the system prompt.

This is especially painful in multi-tenant RAG systems where a new user with an empty vector space expects the AI to still interact conversationally.
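The short-circuit described above can be illustrated with a small toy model. This is a sketch under stated assumptions, not the library's code: `ToySynthesizer` and `CountingLLM` are hypothetical stand-ins that only mimic the reported behaviour (an early return on an empty node list, before any LLM call).

```python
from dataclasses import dataclass


@dataclass
class CountingLLM:
    """Hypothetical stand-in LLM that records how many times it was called."""

    calls: int = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return f"LLM answer to: {prompt}"


@dataclass
class ToySynthesizer:
    """Toy model of the reported behaviour, not the real BaseSynthesizer."""

    llm: CountingLLM

    def synthesize(self, query: str, nodes: list[str]) -> str:
        if len(nodes) == 0:
            # Early return: the LLM is never reached on this path.
            return "Empty Response"
        context = "\n".join(nodes)
        return self.llm.complete(f"Context:\n{context}\n\nQuestion: {query}")


llm = CountingLLM()
synth = ToySynthesizer(llm)

# Zero nodes: a static string comes back and the call counter stays at 0.
empty_result = synth.synthesize("Hello, can you help me?", nodes=[])
calls_after_empty = llm.calls

# At least one node: the LLM is invoked as usual.
hit_result = synth.synthesize("Hello, can you help me?", nodes=["some retrieved text"])
```

The point of the model is that nothing is raised on the empty path, so the caller cannot tell the static string apart from a real answer without comparing the text.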

Fix

When no context nodes are retrieved, CondensePlusContextChatEngine now bypasses the synthesizer and calls the LLM directly with the system prompt, chat history, and user message. This is handled consistently in all 4 chat methods:

chat() — calls self._llm.chat(messages)
stream_chat() — calls self._llm.stream_chat(messages)
achat() — calls self._llm.achat(messages)
astream_chat() — calls self._llm.astream_chat(messages)

A shared _build_llm_messages() helper constructs the message list (system prompt + chat history + user message) for all cases.
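A helper of this kind might look roughly like the following. This is a minimal sketch, assuming a plain role/content message type; `Message` and `build_llm_messages` are hypothetical names for illustration and are not the PR's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Message:
    """Hypothetical stand-in for a chat message (role plus content)."""

    role: str
    content: str


def build_llm_messages(
    system_prompt: Optional[str],
    chat_history: Sequence[Message],
    user_message: str,
) -> list[Message]:
    """Assemble system prompt + chat history + user message, in that order."""
    messages: list[Message] = []
    if system_prompt:
        # The system prompt goes first so it frames the whole conversation.
        messages.append(Message("system", system_prompt))
    messages.extend(chat_history)
    messages.append(Message("user", user_message))
    return messages


history = [Message("user", "Hi"), Message("assistant", "Hello!")]
messages = build_llm_messages("You are a helpful AI.", history, "Can you help me?")
```

Keeping the assembly in one function means the four chat methods cannot drift apart in how they order the messages.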

Why this approach

Targeted: Only changes the chat engine, not BaseSynthesizer (which may intentionally skip LLM calls for other use cases like pure QA)
Consistent with MultiModalCondensePlusContextChatEngine: That engine already calls the LLM directly and never goes through BaseSynthesizer, so it never had this bug
No new dependencies or parameters: No configuration needed — a chat engine should always produce a conversational response
Memory is still updated: The user and assistant messages are written to memory in all code paths

Before / After

| Scenario | Before | After |
|---|---|---|
| Retriever returns 0 nodes (sync) | Returns `"Empty Response"` instantly, no LLM call | LLM responds using base knowledge + system prompt |
| Retriever returns 0 nodes (streaming) | Yields `"Empty Response"` token instantly | Streams LLM response normally |
| Retriever returns nodes | Normal behavior (unchanged) | Normal behavior (unchanged) |

AI Disclosure: This PR was authored by Claude (AI), directed by @MaxwellCalkin.

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +104/-6)

PR #20942: fix: : CondensePlusContextChatEngine.astream_chat silently aborts

Repository: run-llama/llama_index
Author: JiwaniZakir
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20942

Description (problem / solution / changelog)

Closes #20894

What: Pass an empty-text placeholder node to the synthesizer when the retriever returns 0 nodes, instead of letting it short-circuit with a hardcoded "Empty Response".

Why: Previously, CompactAndRefine.synthesize/asynthesize received an empty node list and skipped the LLM call entirely. Now, a NodeWithScore(node=TextNode(text="")) placeholder ensures the synthesizer still invokes the LLM with the full prompt template.

How: In condense_plus_context.py, all four chat paths (chat, stream_chat, achat, astream_chat) now gate context_nodes with synth_nodes = context_nodes or [NodeWithScore(node=TextNode(text=""))] before passing to the synthesizer. This is a minimal, localized fix — the retriever, prompt construction, and memory logic are untouched.
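The `context_nodes or [...]` gate relies on an empty list being falsy in Python. The sketch below shows only that idiom; the `TextNode` and `NodeWithScore` dataclasses here are simplified stand-ins defined locally for illustration, not the library classes, and `nodes_for_synthesizer` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextNode:
    """Simplified stand-in for a text node."""

    text: str = ""


@dataclass
class NodeWithScore:
    """Simplified stand-in pairing a node with a retrieval score."""

    node: TextNode
    score: Optional[float] = None


def nodes_for_synthesizer(context_nodes: list[NodeWithScore]) -> list[NodeWithScore]:
    # An empty list is falsy, so `or` substitutes the placeholder list;
    # a non-empty list is returned unchanged (the very same object).
    return context_nodes or [NodeWithScore(node=TextNode(text=""))]


retrieved = [NodeWithScore(TextNode("doc chunk"), 0.9)]
with_hits = nodes_for_synthesizer(retrieved)
without_hits = nodes_for_synthesizer([])
```

One consequence worth noting: the synthesizer now always receives at least one node, so any length-based early return no longer triggers.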

Added an EmptyRetriever helper and an empty_chat_engine fixture in test_condense_plus_context.py, with four new tests (test_chat_empty_nodes, test_stream_chat_empty_nodes, test_achat_empty_nodes, test_astream_chat_empty_nodes) that assert the response is not "Empty Response" and that chat history is correctly updated.

Description

See above.

New Package?

No.

Version Bump?

No — bug fix only, no API change.

Type of Change

Bug fix (non-breaking).

How Has This Been Tested?

Four new unit tests covering sync, streaming, async, and async-streaming paths with an EmptyRetriever that returns []. All assert the LLM is actually called and produces real output.

Suggested Checklist:

I have performed a self-review of my own code
I have added tests that prove my fix is effective
New and existing unit tests pass locally with my changes
I have made corresponding changes to the documentation — N/A, no docs needed

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +9/-5)
llama-index-core/tests/chat_engine/test_condense_plus_context.py (modified, +60/-1)

PR #20967: fix: call LLM with empty context instead of returning 'Empty Response' when no nodes retrieved

Repository: run-llama/llama_index
Author: gambletan
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20967

Description (problem / solution / changelog)

Summary

Fixes CondensePlusContextChatEngine silently returning "Empty Response" when the retriever returns 0 nodes, instead of calling the LLM
When no nodes are retrieved (e.g., empty vector store, metadata filters yielding no results), a single node with empty text is passed to the response synthesizer so the LLM is still invoked and can answer using its system prompt and training knowledge
The original empty node list is preserved in source_nodes so callers can still detect that no documents were retrieved

Fixes #20894

Test plan

Create a CondensePlusContextChatEngine with an empty VectorStoreIndex and verify chat() returns a real LLM response instead of "Empty Response"
Verify achat(), stream_chat(), and astream_chat() also return real LLM responses with 0 retrieved nodes
Verify that source_nodes in the response is an empty list (not containing the placeholder node)
Verify normal behavior (nodes retrieved) is unchanged
Verify context_source.raw_output still contains the original empty node list
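The separation this test plan checks, a placeholder for the synthesizer but a truthful empty list for the caller, can be sketched as follows. All names (`Node`, `ChatResult`, `respond`, `toy_synthesize`) are hypothetical and only model the idea; they are not taken from the PR.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a retrieved node."""

    text: str
    score: float = 0.0


@dataclass
class ChatResult:
    """Hypothetical response object exposing the retrieved sources."""

    response: str
    source_nodes: list = field(default_factory=list)


def toy_synthesize(query: str, nodes: list) -> str:
    """Mimics the reported short-circuit on an empty node list."""
    if not nodes:
        return "Empty Response"
    return f"LLM reply to {query!r} using {len(nodes)} node(s)"


def respond(query: str, context_nodes: list, synthesize) -> ChatResult:
    # Only the synthesizer sees the placeholder node.
    synth_nodes = context_nodes if context_nodes else [Node(text="")]
    answer = synthesize(query, synth_nodes)
    # The caller still sees exactly what the retriever returned.
    return ChatResult(response=answer, source_nodes=list(context_nodes))


result = respond("Hello", [], toy_synthesize)
```

Callers that branch on `source_nodes` being empty keep working, because the placeholder never leaks into the response object.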

🤖 Generated with Claude Code

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +35/-13)

PR #20970: fix: CondensePlusContextChatEngine returns Empty Response when retriever yields 0 nodes

Repository: run-llama/llama_index
Author: gambletan
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20970

Description (problem / solution / changelog)

Description

When CondensePlusContextChatEngine is used with a retriever that returns 0 nodes (e.g. empty index, strict metadata filters in multi-tenant setups), BaseSynthesizer.synthesize/asynthesize short-circuits with a hardcoded "Empty Response" string without ever calling the LLM. This silently breaks conversational UX — the user gets an instant static string instead of a real LLM response.

Fixes #20894

What

In _run_c3 and _arun_c3, when context_nodes is empty, substitute a placeholder NodeWithScore(node=TextNode(text=""), score=0.0) so the synthesizer still invokes the LLM with the full prompt template, system prompt, and chat history. The context_source and externally-visible source_nodes remain truthful (empty list), while only the internal synth_nodes passed to the synthesizer gets the placeholder.

This is a minimal, localized fix — the retriever, prompt construction, memory logic, and BaseSynthesizer are all untouched.

New Package?

Version Bump?

No — bug fix only, no API change.

Type of Change

Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Added EmptyRetriever helper and empty_chat_engine fixture in test_condense_plus_context.py, with four new tests covering all chat paths:

test_chat_empty_nodes
test_stream_chat_empty_nodes
test_achat_empty_nodes
test_astream_chat_empty_nodes

All assert the response is not "Empty Response" and that chat history is correctly updated with 2 entries (user + assistant).
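A test of this shape might look like the sketch below. It is illustrative only: `ToyEngine` and `EchoLLM` are hypothetical stand-ins rather than the real chat engine or the PR's fixtures, and only the `EmptyRetriever` and `test_achat_empty_nodes` names are borrowed from the description above.

```python
import asyncio


class EmptyRetriever:
    """Stand-in retriever that always returns no nodes."""

    async def aretrieve(self, query: str) -> list:
        return []


class EchoLLM:
    """Hypothetical LLM stub that echoes the last user message."""

    async def achat(self, messages: list) -> str:
        return "echo: " + messages[-1][1]


class ToyEngine:
    """Toy chat engine that always calls the LLM, even with empty context."""

    def __init__(self, retriever, llm):
        self.retriever = retriever
        self.llm = llm
        self.history: list = []

    async def achat(self, message: str) -> str:
        nodes = await self.retriever.aretrieve(message)
        context = "\n".join(nodes)
        messages = [("system", f"Context:\n{context}"), *self.history, ("user", message)]
        reply = await self.llm.achat(messages)
        # Memory is updated with both sides of the exchange.
        self.history += [("user", message), ("assistant", reply)]
        return reply


def test_achat_empty_nodes():
    engine = ToyEngine(EmptyRetriever(), EchoLLM())
    reply = asyncio.run(engine.achat("Hello"))
    assert reply != "Empty Response"
    assert len(engine.history) == 2  # user + assistant


test_achat_empty_nodes()
```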

Suggested Checklist:

I have performed a self-review of my own code
I have added tests that prove my fix is effective
New and existing unit tests pass locally with my changes
I have made corresponding changes to the documentation — N/A

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +35/-21)
llama-index-core/tests/chat_engine/test_condense_plus_context.py (modified, +108/-1)

PR #21040: fix(chat_engine): call LLM directly when retriever returns 0 nodes in CondensePlusContextChatEngine

Repository: run-llama/llama_index
Author: ComeOnOliver
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/21040

Description (problem / solution / changelog)

Summary

Fixes #20894

Problem

When CondensePlusContextChatEngine's retriever returns 0 nodes (e.g., empty vector store, metadata filters that match nothing), the CompactAndRefine synthesizer immediately returns a hardcoded "Empty Response" without ever calling the LLM. This silently breaks production RAG systems — especially multi-tenant architectures — where the LLM should still respond using its baseline knowledge and the system prompt.

Root Cause

BaseSynthesizer.synthesize() and BaseSynthesizer.asynthesize() have an early return when len(nodes) == 0, returning "Empty Response" without dispatching any LLM API call. The chat engine passes context nodes directly to the synthesizer without checking for this edge case.

Fix

Added a fallback path in all four chat methods (chat, stream_chat, achat, astream_chat): when context_nodes is empty, the engine now calls the LLM directly with the system prompt, chat history, and user message via self._llm.chat() / self._llm.stream_chat() / self._llm.achat() / self._llm.astream_chat().

This preserves the existing behavior when context nodes are found, and only activates the fallback when the retriever returns nothing.

Changes

Added _build_fallback_messages() helper that constructs the message list from system prompt + chat history + user message
Modified chat() — calls self._llm.chat() directly when no context nodes
Modified stream_chat() — calls self._llm.stream_chat() directly when no context nodes
Modified achat() — calls self._llm.achat() directly when no context nodes
Modified astream_chat() — calls self._llm.astream_chat() directly when no context nodes
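For the async streaming path, the fallback listed above could be modelled roughly like this. It is a sketch with hypothetical stand-ins (`fake_llm_astream`, `fake_synth_astream`, and a free function `astream_chat`) and does not reproduce the PR's code; it only shows the branch on an empty node list.

```python
import asyncio


async def fake_llm_astream(messages):
    """Hypothetical LLM stream that yields a fixed reply token by token."""
    for token in ["Hello", "!", " How", " can", " I", " help", "?"]:
        yield token


async def fake_synth_astream(query, nodes):
    """Hypothetical synthesizer stream used when nodes were retrieved."""
    yield f"Answer from {len(nodes)} node(s)"


async def astream_chat(message, history, nodes, system_prompt="You are a helpful AI."):
    if not nodes:
        # Fallback path: build the messages and stream from the LLM directly.
        messages = [("system", system_prompt), *history, ("user", message)]
        async for token in fake_llm_astream(messages):
            yield token
    else:
        # Normal path: unchanged, goes through the synthesizer.
        async for token in fake_synth_astream(message, nodes):
            yield token


async def collect(agen) -> str:
    return "".join([token async for token in agen])


text = asyncio.run(collect(astream_chat("Hello, can you help me?", [], [])))
```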

Before / After

Before: Retriever returns 0 nodes → synthesizer returns "Empty Response" in <1 second, no LLM call
After: Retriever returns 0 nodes → LLM is called with system prompt + chat history → real response generated

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +126/-44)

PR #21047: fix: call LLM with empty context when retriever returns 0 nodes (#20894)

Repository: run-llama/llama_index
Author: aayushbaluni
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/21047

Description (problem / solution / changelog)

Summary

Fixes #20894.

Root cause: BaseSynthesizer.synthesize/asynthesize short-circuited when len(nodes) == 0, returning hardcoded Empty Response without invoking the LLM.

Fix: Add use_llm_when_empty parameter (default True) to BaseSynthesizer; when nodes is empty, pass text_chunks=[''] to get_response so the LLM is called with empty context.

Changes

base.py: Add use_llm_when_empty param; call get_response with [""] when nodes empty
test_condense_plus_context.py: Add regression tests for chat and astream_chat with empty index
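The synthesizer-level variant can be sketched as below. This is a toy model, assuming a synthesizer whose `get_response` takes a list of text chunks; `ToySynthesizer` and `fake_llm` are hypothetical, and only the `use_llm_when_empty` name and the `[""]` chunk list come from the PR description.

```python
class ToySynthesizer:
    """Toy model of a synthesizer with a use_llm_when_empty switch."""

    def __init__(self, llm_call, use_llm_when_empty: bool = True):
        self._llm_call = llm_call
        self.use_llm_when_empty = use_llm_when_empty

    def get_response(self, query: str, text_chunks: list) -> str:
        context = "\n".join(text_chunks)
        return self._llm_call(f"Context: {context!r}\nQuestion: {query}")

    def synthesize(self, query: str, nodes: list) -> str:
        if len(nodes) == 0:
            if not self.use_llm_when_empty:
                # Previous behaviour: static string, no LLM call.
                return "Empty Response"
            # New behaviour: call the LLM with a single empty chunk.
            return self.get_response(query, text_chunks=[""])
        return self.get_response(query, text_chunks=list(nodes))


def fake_llm(prompt: str) -> str:
    """Hypothetical LLM stub that reports the prompt it received."""
    return f"LLM saw -> {prompt}"


enabled = ToySynthesizer(fake_llm).synthesize("Hi", [])
disabled = ToySynthesizer(fake_llm, use_llm_when_empty=False).synthesize("Hi", [])
```

Because this changes the synthesizer rather than the chat engine, a default of `True` would also affect query engines that share the same base class, which is the trade-off the other PRs avoid.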

Testing

Added 2 tests covering chat and astream_chat with empty retriever
All existing tests pass

Made with Cursor

Changed files

llama-index-core/llama_index/core/response_synthesizers/base.py (modified, +25/-17)
llama-index-core/tests/chat_engine/test_condense_plus_context.py (modified, +31/-1)

PR #21206: fix: add opt-in fallback_to_llm param for empty retrieval in CondensePlusContextChatEngine

Repository: run-llama/llama_index
Author: IgnazioDS
State: open | merged: False
Link: https://github.com/run-llama/llama_index/pull/21206

Description (problem / solution / changelog)

Summary

Adds an opt-in fallback_to_llm parameter (default False) to CondensePlusContextChatEngine
When enabled and the retriever returns 0 nodes, the engine calls the LLM directly with the system prompt, chat history, and user message instead of short-circuiting with "Empty Response"
All four chat methods are covered: chat, stream_chat, achat, astream_chat
Default behavior is unchanged -- existing users are not affected

Problem

In multi-tenant RAG systems with metadata filters, queries often return 0 nodes (e.g., new users with empty vector spaces). The current behavior silently returns a hardcoded "Empty Response" instead of letting the LLM attempt an answer using its baseline knowledge and the system prompt. See #20894.

Why this approach

Per maintainer feedback, the fix must be opt-in, not a default behavior change. This PR:

Scopes the change to CondensePlusContextChatEngine only (does not modify BaseSynthesizer)
Defaults to False so query engines that rely on the empty-response behavior are unaffected
Provides a simple API: CondensePlusContextChatEngine.from_defaults(..., fallback_to_llm=True)

Usage

engine = CondensePlusContextChatEngine.from_defaults(
    retriever=retriever,
    llm=llm,
    system_prompt="You are a helpful AI.",
    fallback_to_llm=True,  # <-- opt in
)

Test plan

Query with 0 retrieval nodes and fallback_to_llm=True gets an LLM response
Query with 0 retrieval nodes and fallback_to_llm=False (default) still returns "Empty Response"
Query with matching nodes works normally regardless of fallback_to_llm setting
Streaming (astream_chat, stream_chat) works correctly with 0 nodes and fallback enabled
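The three non-streaming cases in this test plan reduce to a small routing decision. The sketch below models it with a hypothetical free function `chat` and stub callables; it is not the PR's code, and only the `fallback_to_llm` flag name and its default of `False` are taken from the description.

```python
SENTINEL = "Empty Response"


def chat(message, nodes, synthesize, call_llm, fallback_to_llm=False):
    """Route a message based on retrieval results and the opt-in flag."""
    if nodes:
        # Matching nodes: normal synthesizer path, regardless of the flag.
        return synthesize(message, nodes)
    if fallback_to_llm:
        # Opt-in: no nodes, so ask the LLM directly.
        return call_llm(message)
    # Default: existing behaviour is preserved.
    return SENTINEL


def stub_synthesize(message, nodes):
    return f"synthesized from {len(nodes)} node(s)"


def stub_llm(message):
    return f"direct LLM reply to {message!r}"


opted_in = chat("Hello", [], stub_synthesize, stub_llm, fallback_to_llm=True)
default = chat("Hello", [], stub_synthesize, stub_llm)
with_nodes = chat("Hello", ["n1"], stub_synthesize, stub_llm, fallback_to_llm=True)
```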

Fixes #20894

Changed files

llama-index-core/llama_index/core/chat_engine/condense_plus_context.py (modified, +125/-44)


Bug Description

When using CondensePlusContextChatEngine with an asynchronous stream (astream_chat), if the provided Retriever (e.g., QueryFusionRetriever or VectorIndexRetriever) returns 0 nodes (which can happen frequently with valid metadata filters like tenant IDs), the Chat Engine silently aborts the LLM generation process.

Instead of passing the system prompt and the user query to the LLM with an empty context, the synthesizer (called via _arun_c3 -> asynthesize) evaluates to an empty node list and completely skips the LLM API call. It instantly returns a hardcoded "Empty Response" string wrapped in an AsyncStreamingResponse.

This is highly problematic for production RAG systems (like multi-tenant architectures), where a user might ask a general question or have an empty vector space on their first day. Instead of the LLM answering naturally leveraging its baseline knowledge and the System Prompt, the application receives a silent "Empty Response" in less than 1 second, with no exceptions raised, masking the behavior.

llamaindex_bug_report.md

Version

Python 3.12.x
LlamaIndex v0.10.x
LLM Provider: Agnostic (Tested with Ollama, but reproducible via OpenAI due to synthesizer logic)

Steps to Reproduce

import asyncio

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.chat_engine import CondensePlusContextChatEngine
from llama_index.llms.ollama import Ollama  # or OpenAI


async def main():
    # 1. Create an empty index (or one where filters will yield 0 nodes)
    index = VectorStoreIndex.from_documents([])
    retriever = index.as_retriever(similarity_top_k=2)

    # 2. Setup LLM
    llm = Ollama(model="qwen2.5:0.5b")  # Ensure model is running locally

    # 3. Build CondensePlusContextChatEngine
    engine = CondensePlusContextChatEngine.from_defaults(
        retriever=retriever,
        llm=llm,
        system_prompt="You are a helpful AI.",
    )

    # 4. Attempt to trigger Async Stream
    print("Sending query...")
    chat_stream = await engine.astream_chat("Hello, can you help me?")

    full_text = ""
    async for token in chat_stream.async_response_gen():
        full_text += token

    print(f"Final Response: '{full_text}'")
    # Expected output: "Hello! Yes, I can help you..."
    # Actual output: "Empty Response" (Instantly, no LLM call dispatched)


if __name__ == "__main__":
    asyncio.run(main())

Relevant Logs/Tracebacks

INFO:src.agent:📂 Initializing local search tool (RAG)...
INFO:src.agent:   ✓ RAG tool initialized successfully
INFO:src.engine_builder:⚙️  Configuring Hybrid Search (Vector + BM25)...
INFO:src.engine_builder:   🔍 Tenant Filter applied: Jeferson
WARNING:src.engine_builder:   ⚠️  No documents found in ChromaDB for BM25.
INFO:src.engine_builder:   🧠 Engine initialized via OLLAMA with model qwen2.5:0.5b

[STARTING ASTREAM_CHAT TRACE]

INFO:llama_index.core.chat_engine.condense_plus_context:Condensed question: Hello Sovereign! This is a test being sent directly from N8N via Webhook in the Cybrid network.

# Execution HALTS in less than 1.2s without dispatching any POST request to /api/chat

==== DEBUG RESPONSE ====
TYPE: <class 'llama_index.core.chat_engine.types.StreamingAgentChatResponse'>
FULL TEXT FINAL: 'Empty Response'
chat_stream.response: 'Empty Response'

extent analysis

Fix Plan

1. Update `CondensePlusContextChatEngine` to handle empty node lists

Code Change:

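# llama_index/core/chat_engine/condense_plus_context.py
# Sketch only: `_dispatch_llm_api_call` is a hypothetical helper, not an existing method.
class CondensePlusContextChatEngine:
    # ...

    async def _arun_c3(self, *args, **kwargs):
        # ... (condense the question and retrieve `nodes` as before)

        # Check if node list is empty
        if not nodes:
            # Instead of returning "Empty Response", call the LLM with an empty context
            return await self._dispatch_llm_api_call(context="")
        # ... (otherwise continue to the synthesizer as before)

    async def _dispatch_llm_api_call(self, context):
        # Build the messages (system prompt, chat history, user message) and call the LLM
        raise NotImplementedError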

2. Update `astream_chat` to handle the "Empty Response" sentinel from the synthesizer

Code Change:

# llama_index/core/chat_engine/condense_plus_context.py
# Sketch only: `_dispatch_llm_api_call` and `_get_context` are hypothetical helpers.
class CondensePlusContextChatEngine:
    # ...

    async def astream_chat(self, query):
        # ... (obtain `chat_stream` as before)

        full_text = ""
        async for token in chat_stream.async_response_gen():
            full_text += token

        # Check if the synthesizer short-circuited
        if full_text == "Empty Response":
            # Call the LLM with the original context and use its reply instead;
            # the exhausted stream cannot be iterated a second time.
            full_text = await self._dispatch_llm_api_call(context=self._get_context(query))

3. Update `VectorStoreIndex` to handle empty filters

Code Change:

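# llama_index/core/indices/vector_store/base.py
# Sketch only: `_filter_nodes` is a hypothetical attribute, and `EmptyRetriever` is the
# test helper named in the PRs above, not a library class.
class VectorStoreIndex:
    # ...

    def as_retriever(self, similarity_top_k=2):
        # ...

        # Check if filter yields 0 nodes
        if not self._filter_nodes:
            # Return an empty retriever
            return EmptyRetriever()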

EmptyRetriever class:
