llamaIndex - ✅(Solved) Fix [Bug]: Refine ignores `query_satisfied=True` when `structured_answer_filtering=True` [1 pull requests, 1 comments, 2 participants]

llamaIndex · 2026-04-16 06:23:12


GitHub stats

run-llama/llama_index#21397 • Fetched 2026-04-17 08:26:54


Author

gautamvarmadatla

Participants

dosubot[bot]

gautamvarmadatla

Timeline (top)

labeled ×2, commented ×1, cross-referenced ×1


Bug Description

When structured_answer_filtering=True, the Refine synthesizer is expected to stop once query_satisfied=True. Instead, it continues calling the LLM on the remaining chunks, which wastes API calls and could degrade the final answer.

Version

0.14.19

Steps to Reproduce

import os
from llama_index.core import SummaryIndex, Document, Settings
from llama_index.core.callbacks import CallbackManager, TokenCountingHandler
from llama_index.llms.openai import OpenAI

token_counter = TokenCountingHandler()

Settings.llm = OpenAI(
    model="gpt-5",
    api_key=os.environ["OPENAI_API_KEY"],
    temperature=0,
)
Settings.callback_manager = CallbackManager([token_counter])

index = SummaryIndex.from_documents([
    Document(text="Paris is the capital of France."),        # answers the query
    Document(text="The Eiffel Tower is located in Paris."),  # should be skipped
    Document(text="France is a country in Western Europe."), # should be skipped
])

engine = index.as_query_engine(
    response_mode="refine",
    structured_answer_filtering=True,
)

token_counter.reset_counts()
response = engine.query("What is the capital of France?")

print(response)
print(f"LLM calls: {len(token_counter.llm_token_counts)}")
print("Expected: 1 if Refine short-circuits after query_satisfied=True")

Relevant Logs/Tracebacks

Paris
LLM calls: 3
Expected: 1 if Refine short-circuits after query_satisfied=True

Extent analysis

TL;DR

The issue can be fixed by modifying the Refine synthesizer to stop calling the LLM once query_satisfied=True when structured_answer_filtering=True.

Guidance

Review the Refine synthesizer's implementation to ensure it checks the query_satisfied flag after each chunk processing and stops further LLM calls when structured_answer_filtering=True.
Verify that the query_satisfied flag is correctly set to True when the query is answered.
Check if there are any other conditions or edge cases that might prevent the Refine synthesizer from short-circuiting.
Consider adding logging or debugging statements to track the query_satisfied flag and LLM calls to better understand the issue.
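
One way to follow the logging suggestion above is to wrap whatever callable performs a single refine step and record each call along with the flag it returns. The sketch below is illustrative only: `log_refine_calls`, `fake_step`, and the logger name are hypothetical, and the wrapper assumes only that the returned object may carry a `query_satisfied` attribute. It does not touch LlamaIndex internals.

```python
import functools
import logging
from types import SimpleNamespace
from typing import Any, Callable

logger = logging.getLogger("refine_debug")  # hypothetical logger name


def log_refine_calls(step: Callable[..., Any]) -> Callable[..., Any]:
    """Wrap a refine step so each call and its query_satisfied flag is logged."""

    @functools.wraps(step)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        wrapper.call_count += 1
        result = step(*args, **kwargs)
        # Read the flag defensively: not every result type has to carry it.
        satisfied = getattr(result, "query_satisfied", None)
        logger.info(
            "refine call #%d query_satisfied=%s", wrapper.call_count, satisfied
        )
        return result

    wrapper.call_count = 0
    return wrapper


@log_refine_calls
def fake_step(chunk: str) -> SimpleNamespace:
    """Hypothetical stand-in for one LLM-backed refine step."""
    return SimpleNamespace(answer=chunk, query_satisfied="capital" in chunk)


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    fake_step("Paris is the capital of France.")
    fake_step("The Eiffel Tower is located in Paris.")
    print(f"calls observed: {fake_step.call_count}")
```

If the log shows calls continuing after a line with `query_satisfied=True`, the short-circuit is not being applied.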

Example

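# Pseudo-code sketch of the short-circuit inside the Refine loop
# (remaining_chunks and refine_step are hypothetical names)
for chunk in remaining_chunks:
    answer, query_satisfied = refine_step(answer, chunk)
    if structured_answer_filtering and query_satisfied:
        break  # stop calling the LLM on remaining chunks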
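
To make the intended behavior concrete without an API key, here is a minimal toy model of a refine loop with and without the early exit. It is a sketch under stated assumptions, not the library's implementation: `StructuredAnswer`, `FakeLLM`, `refine`, and the `short_circuit` switch are hypothetical names introduced only for illustration, and the fake LLM decides `query_satisfied` with a simple substring check.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

CHUNKS = [
    "Paris is the capital of France.",
    "The Eiffel Tower is located in Paris.",
    "France is a country in Western Europe.",
]


@dataclass
class StructuredAnswer:
    """Hypothetical stand-in for the structured result of one refine step."""
    answer: str
    query_satisfied: bool


class FakeLLM:
    """Counts calls and marks a chunk as satisfying if it names the capital."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, chunk: str, current: Optional[str]) -> StructuredAnswer:
        self.calls += 1
        satisfied = "capital of France" in chunk
        answer = "Paris" if satisfied else (current or "")
        return StructuredAnswer(answer=answer, query_satisfied=satisfied)


def refine(
    chunks: List[str],
    ask_llm: Callable[[str, Optional[str]], StructuredAnswer],
    structured_answer_filtering: bool = True,
    short_circuit: bool = True,
) -> Optional[str]:
    """Toy refine loop: visit chunks in order, optionally stopping early."""
    current: Optional[str] = None
    for chunk in chunks:
        result = ask_llm(chunk, current)
        if structured_answer_filtering and not result.query_satisfied:
            continue  # filtered out: keep the previous answer
        current = result.answer
        if structured_answer_filtering and short_circuit and result.query_satisfied:
            break  # answer found: skip the remaining chunks
    return current


if __name__ == "__main__":
    for flag in (False, True):
        llm = FakeLLM()
        answer = refine(CHUNKS, llm, short_circuit=flag)
        print(f"short_circuit={flag}: answer={answer!r}, LLM calls={llm.calls}")
```

In this toy model, disabling the early exit reproduces the three calls from the report, while enabling it yields a single call with the same answer.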

Notes

The provided code snippet and logs suggest that the issue is specific to the Refine synthesizer's implementation and its interaction with the structured_answer_filtering flag. The fix may require modifying the synthesizer's logic to correctly handle this flag.

Recommendation

Apply the fix: modify the Refine synthesizer to stop calling the LLM once query_satisfied=True when structured_answer_filtering=True. Based on the provided information, this is the most direct way to address the issue.


#api #tokenizer error #prompt formatting #chain error #conversation history


Fix Action

Fixed

PR fix notes

PR #21398: fix(core): short-circuit Refine on query_satisfied when structured_answer_filtering is enabled

