llamaIndex - ✅(Solved) Fix [Bug]: `SubQuestionQueryEngine` partial-failure handling breaks on non-ValueError exceptions [3 pull requests, 2 comments, 2 participants]

gautamvarmadatla · 2026-03-06T15:31:29Z



GitHub stats

run-llama/llama_index#20904•Fetched 2026-04-08 00:30:18


Author

gautamvarmadatla

Participants

gautamvarmadatla

nightcityblade

Timeline (top)

cross-referenced ×3, commented ×2, labeled ×2, referenced ×2

Error Message

RuntimeError: API rate limit exceeded


PR fix notes

PR #20905: fix(core): partial-failure handling in SubQuestionQueryEngine

Repository: run-llama/llama_index
Author: gautamvarmadatla
State: closed | merged: True
Link: https://github.com/run-llama/llama_index/pull/20905

Description (problem / solution / changelog)

Description

Fixes #20904

Broadened to except Exception with exc_info=True so failed sub-questions are logged and skipped, allowing the remaining results to be synthesized. Also added tests covering all three execution paths (sync, sync with use_async=True, and fully async) to make sure that a failing sub-question is skipped and does not crash the overall query.
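A minimal sketch of the log-and-skip pattern this description refers to, using only the standard library. The helper `run_or_skip`, the logger name, and the `_Capture` handler are hypothetical illustrations, not part of llama_index. The point is that `exc_info=True` attaches the active exception and its traceback to the warning record, so a skipped failure remains diagnosable.

```python
import logging

logger = logging.getLogger("subq_demo")  # hypothetical logger name


def run_or_skip(label, fn):
    """Run fn(); on any Exception, log a warning with traceback and return None."""
    try:
        return fn()
    except Exception:
        # exc_info=True attaches the active exception and traceback to the record
        logger.warning("[%s] Failed to run sub-question", label, exc_info=True)
        return None


def failing():
    raise RuntimeError("API rate limit exceeded")


class _Capture(logging.Handler):
    """Collects log records so the attached exception can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


capture = _Capture()
logger.addHandler(capture)

ok = run_or_skip("france_docs", lambda: "Paris")
skipped = run_or_skip("germany_docs", failing)
```

After this runs, `ok` holds the successful answer, `skipped` is `None`, and the single captured record carries the `RuntimeError` in its `exc_info` tuple.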

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

- [ ] Yes
- [X] No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

- [ ] Yes
- [X] No

Type of Change

Please delete options that are not relevant.

- [X] Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

- [X] I added new unit tests to cover this change
- [ ] I believe this change is already covered by existing unit tests

Suggested Checklist:

- [X] I have performed a self-review of my own code
- [X] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have added Google Colab support for the newly added notebooks.
- [X] My changes generate no new warnings
- [X] I have added tests that prove my fix is effective or that my feature works
- [X] New and existing unit tests pass locally with my changes
- [X] I ran uv run make format; uv run make lint to appease the lint gods

Changed files

llama-index-core/llama_index/core/query_engine/sub_question_query_engine.py (modified, +10/-4)
llama-index-core/tests/query_engine/test_sub_question_query_engine.py (added, +106/-0)

PR #20913: fix: catch all exceptions in SubQuestionQueryEngine sub-question execution

Repository: run-llama/llama_index
Author: nightcityblade
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20913

Description (problem / solution / changelog)

Description

_query_subq and _aquery_subq in SubQuestionQueryEngine only caught ValueError, but sub-question execution can raise various exceptions (provider API errors, transport errors, timeouts, KeyError for invalid tool names, etc.). These uncaught exceptions caused the entire query to fail instead of gracefully skipping the failed sub-question — defeating the partial-failure tolerance that the class is designed for via filter(None, qa_pairs_all).

Widen the catch to Exception and include error details in the warning log message.

Fixes #20904

New Package?

- [ ] Yes
- [x] No

Version Bump?

- [ ] Yes
- [x] No

Changed files

llama-index-core/llama_index/core/query_engine/sub_question_query_engine.py (modified, +4/-4)

PR #20921: fix(query_engine): broaden exception handling in SubQuestionQueryEngine to catch all runtime errors

Repository: run-llama/llama_index
Author: s-zx
State: closed | merged: False
Link: https://github.com/run-llama/llama_index/pull/20921

Description (problem / solution / changelog)

Summary

SubQuestionQueryEngine is designed to tolerate partial sub-question failures — failed sub-questions return None and are filtered via filter(None, qa_pairs_all) before response synthesis. However, both _query_subq and _aquery_subq only caught ValueError, so any other exception caused the entire query to fail.

Root Cause

# Before (both methods)
except ValueError:
    logger.warning(f"[{sub_q.tool_name}] Failed to run {question}")
    return None

Real-world sub-engine failures are rarely ValueError:

Provider API rate limits → RuntimeError
Network failures → ConnectionError / TimeoutError
Invalid tool name → KeyError

All of these escaped the narrow except ValueError clause.
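The claim above can be checked directly against Python's built-in exception hierarchy. This is a small standalone sketch; the variable names are illustrative only.

```python
# Failure types named above, checked against the two except clauses.
failure_types = [RuntimeError, ConnectionError, TimeoutError, KeyError]

caught_by_value_error = [t.__name__ for t in failure_types if issubclass(t, ValueError)]
caught_by_exception = [t.__name__ for t in failure_types if issubclass(t, Exception)]

# Process-level signals derive from BaseException, so `except Exception`
# still lets them propagate.
still_propagate = [
    t.__name__ for t in (KeyboardInterrupt, SystemExit) if not issubclass(t, Exception)
]

print(caught_by_value_error)
print(caught_by_exception)
print(still_propagate)
```

None of the four types derives from `ValueError`, while all four derive from `Exception`. Catching `Exception` rather than using a bare `except:` keeps Ctrl-C and interpreter shutdown working as usual.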

Fix

# After (both methods)
except Exception:
    logger.warning(f"[{sub_q.tool_name}] Failed to run {question}")
    return None

The constructor fall-back that catches ValueError when trying to instantiate the OpenAI question generator (line 116) is deliberately left unchanged — that is intentional degradation, not an error.

Tests

Added llama-index-core/tests/query_engine/test_sub_question_query_engine.py:

test_query_subq_tolerates_non_value_error — parametrized over RuntimeError, KeyError, TimeoutError, ConnectionError
test_aquery_subq_tolerates_non_value_error — async variant
test_batch_query_skips_failed_sub_question — integration-style test showing the full query succeeds with one failing sub-engine
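The shape of such a parametrized test can be sketched with the standard library alone. Everything here is a hypothetical stand-in: `run_sub_question` imitates the skip-on-failure behavior and is not the PR's test code or the llama_index API, and `unittest.subTest` stands in for pytest parametrization.

```python
import unittest


def run_sub_question(engine, question):
    """Hypothetical stand-in for a sub-question runner: None on any failure."""
    try:
        return engine(question)
    except Exception:
        return None


def make_failing_engine(exc_type):
    """Build a callable that always raises the given exception type."""
    def engine(question):
        raise exc_type("simulated failure")
    return engine


class ToleratesNonValueError(unittest.TestCase):
    def test_failure_is_skipped(self):
        # One sub-test per exception type, mirroring a parametrized test
        for exc_type in (RuntimeError, KeyError, TimeoutError, ConnectionError):
            with self.subTest(exc_type=exc_type.__name__):
                engine = make_failing_engine(exc_type)
                self.assertIsNone(run_sub_question(engine, "any question"))

    def test_success_is_returned(self):
        self.assertEqual(run_sub_question(lambda q: "Paris", "capital?"), "Paris")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ToleratesNonValueError)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```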

Fixes #20904

Changed files

llama-index-core/llama_index/core/query_engine/sub_question_query_engine.py (modified, +2/-2)
llama-index-core/tests/query_engine/test_sub_question_query_engine.py (added, +151/-0)


Bug Description

Both _query_subq and _aquery_subq in SubQuestionQueryEngine only catch ValueError, even though the class is explicitly designed to tolerate partial sub-question failures via filter(None, qa_pairs_all). As a result, common runtime exceptions from sub-query execution, such as provider API errors, transport errors, timeouts, or a KeyError from an invalid tool name, escape uncaught and cause the entire query to fail instead of skipping the failed sub-question and continuing with the remaining results.
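The intended partial-failure flow can be illustrated with plain callables. This is a sketch under stated assumptions: `answer_all`, the `engines` dict, and the `unknown_docs` tool name are hypothetical and only imitate the pattern of returning `None` for a failed sub-question and filtering afterwards.

```python
def answer_all(sub_questions, engines):
    """Return (question, answer) pairs, with None in place of failed sub-questions."""
    results = []
    for tool_name, question in sub_questions:
        try:
            # A missing tool name raises KeyError; an engine may raise anything
            results.append((question, engines[tool_name](question)))
        except Exception:
            results.append(None)  # failed sub-question becomes None
    return results


def rate_limited(question):
    raise RuntimeError("API rate limit exceeded")


engines = {"france_docs": lambda q: "Paris", "germany_docs": rate_limited}
sub_questions = [
    ("france_docs", "What is the capital of France?"),
    ("germany_docs", "What is the capital of Germany?"),
    ("unknown_docs", "What is the capital of Spain?"),  # invalid tool name
]

qa_pairs_all = answer_all(sub_questions, engines)
# filter(None, ...) drops every falsy entry, which removes the None placeholders
qa_pairs = list(filter(None, qa_pairs_all))
```

Only the France pair survives the filter, so synthesis can proceed with the answers that are available.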

Version

0.14.15

Steps to Reproduce


  from unittest.mock import MagicMock
  from llama_index.core import VectorStoreIndex, Settings
  from llama_index.core.base.base_query_engine import BaseQueryEngine
  from llama_index.core.base.response.schema import RESPONSE_TYPE
  from llama_index.core.callbacks import CallbackManager
  from llama_index.core.question_gen.types import SubQuestion
  from llama_index.core.query_engine.sub_question_query_engine import SubQuestionQueryEngine
  from llama_index.core.response_synthesizers import get_response_synthesizer
  from llama_index.core.schema import Document, QueryBundle
  from llama_index.core.tools import QueryEngineTool, ToolMetadata
  from llama_index.llms.openai import OpenAI
  from llama_index.embeddings.openai import OpenAIEmbedding

  Settings.llm = OpenAI(model="gpt-5")
  Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")


  class RateLimitedQueryEngine(BaseQueryEngine):
      def __init__(self):
          super().__init__(callback_manager=CallbackManager([]))
      def _query(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
          raise RuntimeError("API rate limit exceeded")
      async def _aquery(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
          raise RuntimeError("API rate limit exceeded")
      def _get_prompt_modules(self):
          return {}

  index = VectorStoreIndex.from_documents([Document(text="Paris is the capital of France.")])

  tools = [
      QueryEngineTool(
          query_engine=index.as_query_engine(),
          metadata=ToolMetadata(name="france_docs", description="Facts about France"),
      ),
      QueryEngineTool(
          query_engine=RateLimitedQueryEngine(),
          metadata=ToolMetadata(name="germany_docs", description="Facts about Germany"),
      ),
  ]

  question_gen = MagicMock()
  question_gen.generate.return_value = [
      SubQuestion(sub_question="What is the capital of France?", tool_name="france_docs"),
      SubQuestion(sub_question="What is the capital of Germany?", tool_name="germany_docs"),
  ]

  engine = SubQuestionQueryEngine(
      question_gen=question_gen,
      response_synthesizer=get_response_synthesizer(),
      query_engine_tools=tools,
      use_async=False,
  )
  response = engine.query("What are the capitals of France and Germany?")
  print(response)

Relevant Logs/Tracebacks

Generated 2 sub questions.
[france_docs] Q: What is the capital of France?
[france_docs] A: Paris
[germany_docs] Q: What is the capital of Germany?
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
/tmp/ipykernel_1122/1314053154.py in <cell line: 0>()
     51     use_async=False,
     52 )
---> 53 response = engine.query("What are the capitals of France and Germany?")
     54 print(response)

/usr/local/lib/python3.12/dist-packages/llama_index_instrumentation/dispatcher.py in wrapper(func, instance, args, kwargs)
    411 
    412             try:
--> 413                 result = func(*args, **kwargs)
    414                 if isinstance(result, asyncio.Future):
    415                     # If the result is a Future, wrap it

/usr/local/lib/python3.12/dist-packages/llama_index/core/base/base_query_engine.py in query(self, str_or_query_bundle)
     42             if isinstance(str_or_query_bundle, str):
     43                 str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 44             query_result = self._query(str_or_query_bundle)
     45         dispatcher.event(
     46             QueryEndEvent(query=str_or_query_bundle, response=query_result)

/usr/local/lib/python3.12/dist-packages/llama_index_instrumentation/dispatcher.py in wrapper(func, instance, args, kwargs)
    411 
    412             try:
--> 413                 result = func(*args, **kwargs)
    414                 if isinstance(result, asyncio.Future):
    415                     # If the result is a Future, wrap it

/usr/local/lib/python3.12/dist-packages/llama_index/core/query_engine/sub_question_query_engine.py in _query(self, query_bundle)
    153             else:
    154                 qa_pairs_all = [
--> 155                     self._query_subq(sub_q, color=colors[str(ind)])
    156                     for ind, sub_q in enumerate(sub_questions)
    157                 ]

/usr/local/lib/python3.12/dist-packages/llama_index/core/query_engine/sub_question_query_engine.py in _query_subq(self, sub_q, color)
    261                     print_text(f"[{sub_q.tool_name}] Q: {question}\n", color=color)
    262 
--> 263                 response = query_engine.query(question)
    264                 response_text = str(response)
    265 

/usr/local/lib/python3.12/dist-packages/llama_index_instrumentation/dispatcher.py in wrapper(func, instance, args, kwargs)
    411 
    412             try:
--> 413                 result = func(*args, **kwargs)
    414                 if isinstance(result, asyncio.Future):
    415                     # If the result is a Future, wrap it

/usr/local/lib/python3.12/dist-packages/llama_index/core/base/base_query_engine.py in query(self, str_or_query_bundle)
     42             if isinstance(str_or_query_bundle, str):
     43                 str_or_query_bundle = QueryBundle(str_or_query_bundle)
---> 44             query_result = self._query(str_or_query_bundle)
     45         dispatcher.event(
     46             QueryEndEvent(query=str_or_query_bundle, response=query_result)

/usr/local/lib/python3.12/dist-packages/llama_index_instrumentation/dispatcher.py in wrapper(func, instance, args, kwargs)
    411 
    412             try:
--> 413                 result = func(*args, **kwargs)
    414                 if isinstance(result, asyncio.Future):
    415                     # If the result is a Future, wrap it

/tmp/ipykernel_1122/1314053154.py in _query(self, query_bundle)
     20         super().__init__(callback_manager=CallbackManager([]))
     21     def _query(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
---> 22         raise RuntimeError("API rate limit exceeded")
     23     async def _aquery(self, query_bundle: QueryBundle) -> RESPONSE_TYPE:
     24         raise RuntimeError("API rate limit exceeded")

RuntimeError: API rate limit exceeded

extent analysis

Fix Plan

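1. Catch Exception in Both Sub-Question Methods

Update _query_subq and _aquery_subq in SubQuestionQueryEngine to catch Exception instead of only ValueError, as the merged PR #20905 does. A fixed list such as (RuntimeError, TimeoutError, KeyError) would still let other failures, for example ConnectionError, escape. The outline below is a simplified sketch rather than the actual diff; "# ..." stands for the existing method body, which resolves question and query_engine from sub_q.

import logging

logger = logging.getLogger(__name__)

class SubQuestionQueryEngine:
    # ...

    def _query_subq(self, sub_q, color):
        try:
            # ...
            response = query_engine.query(question)
            # ...
        except Exception:
            # Log the failure with its traceback and skip this sub-question
            logger.warning(
                f"[{sub_q.tool_name}] Failed to run {sub_q.sub_question}",
                exc_info=True,
            )
            return None

    async def _aquery_subq(self, sub_q, color):
        try:
            # ...
            response = await query_engine.aquery(question)
            # ...
        except Exception:
            # Log the failure with its traceback and skip this sub-question
            logger.warning(
                f"[{sub_q.tool_name}] Failed to run {sub_q.sub_question}",
                exc_info=True,
            )
            return None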
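The async path follows the same idea and can be sketched with asyncio alone. The coroutine names and the `engines` mapping are hypothetical; this is not the library's implementation, only an illustration of catching per sub-question so that `asyncio.gather` still returns the successful answers.

```python
import asyncio


async def answer(tool_name, question, engines):
    """Await one sub-question; return None instead of raising on failure."""
    try:
        return question, await engines[tool_name](question)
    except Exception:
        return None


async def france(question):
    return "Paris"


async def germany(question):
    raise TimeoutError("upstream timed out")


async def main():
    engines = {"france_docs": france, "germany_docs": germany}
    sub_questions = [
        ("france_docs", "What is the capital of France?"),
        ("germany_docs", "What is the capital of Germany?"),
    ]
    # Each task handles its own failure, so gather never sees an exception
    results = await asyncio.gather(*(answer(t, q, engines) for t, q in sub_questions))
    return list(filter(None, results))


qa_pairs = asyncio.run(main())
```

On Python 3.8 and later, `asyncio.CancelledError` derives from `BaseException`, so `except Exception` does not swallow task cancellation.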

2. Filter Out Failed Sub-Questions

No change is needed for filtering: _query already drops None results via filter(None, qa_pairs_all) before response synthesis. Each sub-question should be executed exactly once, with the None placeholders removed afterwards.

class SubQuestionQueryEngine:
    # ...

    def _query(self, query_bundle):
        # ...
        qa_pairs_all = [
            self._query_subq(sub_q, color=colors[str(ind)])
            for ind, sub_q in enumerate(sub_questions)
        ]
        # Failed sub-questions returned None; drop them before synthesis
        qa_pairs = list(filter(None, qa_pairs_all))
        # ...

3. Update Test Code

Add tests confirming that a failing sub-question is skipped and the overall query still succeeds. PR #20905 covers all three execution paths: sync, sync with use_async=True, and fully async.

def test_sub_question_query_engine():
    # ...
    question_gen.generate.return_value = [
        SubQuestion(sub_question="What is the capital of France?", tool_name="france_docs"),
        SubQuestion(sub_question="What is the capital of Germany?", tool_name="germany_docs"),
    ]
    # ...


