LlamaIndex - ✅ (Solved) Fix [Bug]: `QueryFusionRetriever._aretrieve` blocks the event loop during query generation [2 pull requests, 1 comment, 2 participants]

GitHub stats
run-llama/llama_index#21159 · Fetched 2026-04-08 01:31:19
Comments: 1 · Participants: 2 · Timeline: 10 · Reactions: 0
Timeline (top): cross-referenced ×4, labeled ×2, referenced ×2, closed ×1

Fix Action

Fixed

PR fix notes

PR #21160: fix(core): use async query generation in QueryFusionRetriever._aretrieve

Description (problem / solution / changelog)

Description

Added _aget_queries() using await self._llm.acomplete() and updated _aretrieve to call it instead of the blocking _get_queries().

Fixes #21159

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-core/llama_index/core/retrievers/fusion_retriever.py (modified, +14/-1)
  • llama-index-core/tests/retrievers/test_fusion_retriever.py (added, +35/-0)

Bug Description

_aretrieve() calls the synchronous _get_queries(), which blocks on self._llm.complete() instead of awaiting an async equivalent. When num_queries > 1 (the default), this blocks the current event loop during query generation and prevents other coroutines on that same loop from making progress until query expansion finishes.
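
The effect described above can be reproduced without LlamaIndex or a network call. The following is a minimal sketch, not the library's code: `wakeup_delay` and `sleeper` are hypothetical names, `time.sleep(0.3)` stands in for the blocking `self._llm.complete()` call, and `asyncio.sleep(0.3)` stands in for an awaited async LLM call. It measures how long a task that asked for a 0.05 s sleep actually waits.

```python
import asyncio
import time


async def wakeup_delay(blocking: bool) -> float:
    """Request a 0.05 s sleep in a task and report how long it really took."""

    async def sleeper() -> float:
        start = time.monotonic()
        await asyncio.sleep(0.05)
        return time.monotonic() - start

    task = asyncio.create_task(sleeper())
    await asyncio.sleep(0)  # yield once so sleeper() starts its timer

    if blocking:
        # Stand-in for a synchronous LLM call made inside a coroutine:
        # the event loop cannot run anything else until this returns.
        time.sleep(0.3)
    else:
        # Stand-in for an awaited async LLM call: the loop stays free.
        await asyncio.sleep(0.3)

    return await task


if __name__ == "__main__":
    print(f"blocking call:  {asyncio.run(wakeup_delay(True)):.2f}s")
    print(f"awaited call:   {asyncio.run(wakeup_delay(False)):.2f}s")
```

With the blocking stand-in, the sleeper cannot wake until the blocking call returns, so its measured delay is roughly the duration of that call rather than 0.05 s.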

Version

0.14.19

Steps to Reproduce

import asyncio
from llama_index.core.base.base_retriever import BaseRetriever
from llama_index.core.retrievers import QueryFusionRetriever
from llama_index.core.schema import NodeWithScore, QueryBundle, TextNode
from llama_index.llms.openai import OpenAI

class MockRetriever(BaseRetriever):
    def _retrieve(self, query_bundle):
        return [NodeWithScore(node=TextNode(text="Hi I am Gautam!"), score=1.0)]

retriever = QueryFusionRetriever(
    retrievers=[MockRetriever()],
    llm=OpenAI(model="gpt-5"),
    num_queries=4,
)

async def other_work():
    await asyncio.sleep(0.1)
    print("other work ran")

async def main():
    # Schedule other_work(), then check whether it finished while aretrieve() ran.
    task = asyncio.ensure_future(other_work())
    await retriever.aretrieve("What is the capital of France?")
    print(f"other_work done: {task.done()}")

# In a notebook, where top-level await is allowed, use `await main()` instead.
asyncio.run(main())

Relevant Logs/Tracebacks

other_work done: False

In other words, the task had the entire duration of a real OpenAI HTTP call to finish its 0.1 s sleep, yet it still had not completed, because the event loop was blocked for that whole time.

extent analysis

Fix Plan

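To fix the issue, query generation on the async path must stop blocking the event loop. PR #21160 does this by adding an async counterpart to _get_queries() rather than converting _get_queries() itself, which keeps synchronous callers working.

Here are the steps:

  • Add an async method, _aget_queries(), that mirrors _get_queries().
  • Inside it, replace the synchronous self._llm.complete() call with await self._llm.acomplete(). If an LLM offers no async equivalent, run the synchronous call in a separate thread with await asyncio.to_thread().
  • Update _aretrieve() to await _aget_queries() instead of calling _get_queries().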

Example Code

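import asyncio

async def _aget_queries(self, original_query):
    # Async counterpart of _get_queries(); the method name follows PR #21160.
    prompt_str = ...  # build the query-generation prompt as _get_queries() does
    response = await self._llm.acomplete(prompt_str)
    # ... rest of the method ...

# or, if no async equivalent is available
async def _aget_queries(self, original_query):
    prompt_str = ...  # same prompt as above
    # Run the blocking call in a worker thread so the event loop stays free.
    response = await asyncio.to_thread(self._llm.complete, prompt_str)
    # ... rest of the method ...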
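
For readers who want something runnable, here is a self-contained sketch of the same pattern. It is an illustration under stated assumptions, not the code merged in the PR: `StubLLM`, `StubResponse`, `FusionSketch`, `_prompt`, `_parse`, and `_aget_queries_in_thread` are hypothetical names, and the stub only imitates the `complete()` / `acomplete()` pair with sleeps.

```python
import asyncio
import time
from dataclasses import dataclass


@dataclass
class StubResponse:
    text: str


class StubLLM:
    """Hypothetical stand-in with complete()/acomplete(); not a LlamaIndex class."""

    def complete(self, prompt: str) -> StubResponse:
        time.sleep(0.1)  # simulates a blocking HTTP call
        return StubResponse(text="query one\nquery two\nquery three")

    async def acomplete(self, prompt: str) -> StubResponse:
        await asyncio.sleep(0.1)  # simulates a non-blocking HTTP call
        return StubResponse(text="query one\nquery two\nquery three")


class FusionSketch:
    """Hypothetical class showing only the query-generation paths."""

    def __init__(self, llm: StubLLM, num_queries: int = 4) -> None:
        self._llm = llm
        self.num_queries = num_queries

    def _prompt(self, original_query: str) -> str:
        return f"Generate {self.num_queries - 1} search queries for: {original_query}"

    def _parse(self, text: str) -> list[str]:
        lines = [line.strip() for line in text.split("\n") if line.strip()]
        return lines[: self.num_queries - 1]

    def _get_queries(self, original_query: str) -> list[str]:
        # Sync path, left unchanged for synchronous callers.
        return self._parse(self._llm.complete(self._prompt(original_query)).text)

    async def _aget_queries(self, original_query: str) -> list[str]:
        # Async path: awaiting acomplete() yields control to the event loop.
        response = await self._llm.acomplete(self._prompt(original_query))
        return self._parse(response.text)

    async def _aget_queries_in_thread(self, original_query: str) -> list[str]:
        # Fallback when no async API exists: run the blocking call in a thread.
        response = await asyncio.to_thread(
            self._llm.complete, self._prompt(original_query)
        )
        return self._parse(response.text)
```

Keeping the sync and async methods side by side, with the parsing shared in one helper, avoids changing behavior for callers of the synchronous path.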

Verification

To verify the fix, run the provided test code again and check if the other_work task is completed before the aretrieve call finishes. The output should indicate that other_work is done:

other_work done: True
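
The same check can be turned into a small reusable helper that needs no API key. This is a sketch under assumptions: `probe_finished_during`, `blocking_call`, and `awaited_call` are hypothetical names, and the two stand-in coroutines would be replaced by a call such as `retriever.aretrieve(...)` in a real test.

```python
import asyncio
import time


async def probe_finished_during(make_awaitable, probe_delay: float = 0.05) -> bool:
    """Return True if a short sleeping task completed while the awaitable ran."""

    async def probe() -> None:
        await asyncio.sleep(probe_delay)

    task = asyncio.ensure_future(probe())
    await make_awaitable()
    done = task.done()  # same signal as `task.done()` in the reproduction
    if not done:
        task.cancel()
    return done


async def blocking_call() -> None:
    time.sleep(0.2)  # stand-in for the pre-fix behavior


async def awaited_call() -> None:
    await asyncio.sleep(0.2)  # stand-in for the post-fix behavior


if __name__ == "__main__":
    print("blocking:", asyncio.run(probe_finished_during(blocking_call)))
    print("awaited: ", asyncio.run(probe_finished_during(awaited_call)))
```

The helper takes a zero-argument callable so the coroutine is created inside the running loop, right after the probe task is scheduled.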

Extra Tips

When working with asynchronous code, make sure no synchronous, blocking call runs directly inside a coroutine, or it will starve the event loop. Prefer the library's async equivalent and await it. Where none exists, use asyncio.to_thread() or loop.run_in_executor() to run the synchronous code in a separate thread, allowing other coroutines to make progress.
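
One way to spot this class of bug early is asyncio's debug mode, which logs a warning on the `asyncio` logger when a single step of a task runs longer than `loop.slow_callback_duration` (0.1 s by default). The sketch below captures those warnings; `ListHandler`, `blocking_step`, `offloaded_step`, and `slow_callback_warnings` are hypothetical names, and `time.sleep(0.3)` again stands in for a blocking LLM call.

```python
import asyncio
import logging
import time


class ListHandler(logging.Handler):
    """Collects formatted log messages in a list."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


async def blocking_step() -> None:
    time.sleep(0.3)  # blocks the event loop


async def offloaded_step() -> None:
    await asyncio.to_thread(time.sleep, 0.3)  # runs in a worker thread


def slow_callback_warnings(coro_fn) -> list[str]:
    """Run coro_fn() in debug mode and return asyncio's slow-step warnings."""
    handler = ListHandler()
    logger = logging.getLogger("asyncio")
    logger.addHandler(handler)
    try:
        asyncio.run(coro_fn(), debug=True)
    finally:
        logger.removeHandler(handler)
    return [m for m in handler.messages if "took" in m]


if __name__ == "__main__":
    print("blocking: ", slow_callback_warnings(blocking_step))
    print("offloaded:", slow_callback_warnings(offloaded_step))
```

The blocking variant should produce a warning that names the task and how long it ran, while the offloaded variant typically produces none. Debug mode adds overhead, so it is better suited to development and tests than to production.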
