llamaIndex - ✅(Solved) Fix [Bug]: RouterQueryEngine._aquery blocks the event loop when multiple engines are selected [1 pull requests, 5 comments, 3 participants]

GitHub stats
run-llama/llama_index#20794 (fetched 2026-04-08 00:30:53)
Comments: 5
Participants: 3
Timeline events: 9 (commented ×5, labeled ×2, closed ×1, cross-referenced ×1)
Reactions: 0

Fix Action

Fixed

PR fix notes

PR #20795: fix(core): replace blocking run_async_tasks with asyncio.gather

Description (problem / solution / changelog)

Description

I replaced run_async_tasks(tasks) with await asyncio.gather(*tasks) in the async fan-out paths for RouterQueryEngine._aquery and ToolRetrieverRouterQueryEngine._aquery, so multi-engine routing no longer performs a synchronous blocking wait inside an async method. I also added a couple of regression tests for both classes to confirm the event loop isn’t blocked.
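A regression check of the kind described can be sketched with a heartbeat task that counts how often the loop gets to run while the call under test is awaited. This is a minimal, library-free sketch and not the test added in the PR: count_ticks_during, fan_out and blocked_fan_out are hypothetical names, and time.sleep stands in for the synchronous wait.

```python
import asyncio
import contextlib
import time


async def count_ticks_during(awaitable, interval=0.005):
    """Await `awaitable` while a heartbeat task counts how often the loop ran it."""
    ticks = 0

    async def heartbeat():
        nonlocal ticks
        while True:
            await asyncio.sleep(interval)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    try:
        result = await awaitable
    finally:
        # Stop the heartbeat and wait for the cancellation to settle.
        hb.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await hb
    return result, ticks


async def fan_out():
    # Non-blocking fan-out: both sleeps overlap and the loop stays free.
    return await asyncio.gather(asyncio.sleep(0.05, "a"), asyncio.sleep(0.05, "b"))


async def blocked_fan_out():
    # Hypothetical stand-in for a synchronous wait inside an async method.
    time.sleep(0.05)
    return ["a", "b"]
```

A test can then assert that the heartbeat ticked at least once around the gather-based path, and that it never ticked around the blocking stand-in.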

Fixes #20794

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-core/llama_index/core/query_engine/router_query_engine.py (modified, +3/-3)
  • llama-index-core/tests/query_engine/test_router_query_engine.py (added, +114/-0)


Bug Description

When the router picks more than one engine/tool, the async _aquery() path calls run_async_tasks() to run them. That helper goes through asyncio_run(), and if there’s already a running event loop it spins up a new thread to run the work and then waits on future.result() on the current thread. Since _aquery() is running on the event-loop thread, that wait blocks the loop until all sub-queries finish, so everything else on the loop (other requests, timers, background tasks) gets stuck.
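The mechanism can be reproduced without llama_index. The sketch below assumes only what the description says: a helper that runs the coroutines on a fresh loop in a worker thread and then waits on future.result() from the calling thread. blocking_run, sub_query and demo are hypothetical names, not the library's code.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def blocking_run(coros):
    """Hypothetical stand-in: run coroutines in a worker thread, then block."""

    async def _gather():
        return await asyncio.gather(*coros)

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(asyncio.run, _gather())
        return future.result()  # blocks the calling thread until all finish


async def sub_query(delay):
    await asyncio.sleep(delay)
    return "ok"


async def demo(use_gather):
    ran = False

    async def background():
        nonlocal ran
        await asyncio.sleep(0.01)
        ran = True

    task = asyncio.create_task(background())
    coros = [sub_query(0.1), sub_query(0.1)]
    if use_gather:
        results = await asyncio.gather(*coros)  # yields to the loop
    else:
        results = blocking_run(coros)  # never yields to the loop
    ran_during = ran  # did the background task get to run in the meantime?
    await task
    return list(results), ran_during
```

Both paths return the same results; only the gather path lets the background task finish while the sub-queries are in flight.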

Kind of similar to #17349, #14515, etc.

Version

0.14.15

Steps to Reproduce

import asyncio
from unittest.mock import MagicMock

from llama_index.core.base.base_selector import BaseSelector, SelectorResult, SingleSelection
from llama_index.core.base.response.schema import Response
from llama_index.core.query_engine.router_query_engine import RouterQueryEngine
from llama_index.core.tools.types import ToolMetadata

class MultiSelector(BaseSelector):
    def _get_prompts(self): return {}
    def _update_prompts(self, p): pass
    def _get_prompt_modules(self): return {}
    def _select(self, choices, query):
        return SelectorResult(selections=[
            SingleSelection(index=0, reason=""),
            SingleSelection(index=1, reason=""),
        ])
    async def _aselect(self, choices, query): return self._select(choices, query)

async def fake_query(_):
    await asyncio.sleep(0.05)         
    return Response(response="ok")

def tool(name):
    e, t = MagicMock(), MagicMock()
    e.aquery = fake_query
    t.query_engine = e
    t.metadata = ToolMetadata(name=name, description=name)
    return t

class Summarizer:
    async def aget_response(self, *_): return "combined"

async def repro():
    router = RouterQueryEngine(
        selector=MultiSelector(),
        query_engine_tools=[tool("a"), tool("b")],
        llm=MagicMock(),
        summarizer=Summarizer(),
    )

    ran = False
    async def bg():
        nonlocal ran
        await asyncio.sleep(0.01)
        ran = True

    asyncio.create_task(bg())
    await router.aquery("test")
    print("background ran during aquery:", ran)  # buggy: False, fixed: True

# In a plain script; inside a notebook with a running loop, use `await repro()` instead.
asyncio.run(repro())

Relevant Logs/Tracebacks

extent analysis

Fix: Make run_async_tasks fully async (no thread‑blocking)

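The event loop blocks because run_async_tasks() falls back to asyncio_run(), which starts a new thread and then waits on future.result() from the event-loop thread.
The merged PR (#20795) fixes this by awaiting asyncio.gather(*tasks) directly in the two _aquery methods. The alternative outlined below changes the helper itself into a pure-async function that uses asyncio.gather (or asyncio.wait) and never calls future.result(). Other synchronous callers may rely on the current helper, so treat the steps as a sketch rather than the upstream change.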

1. Update the helper (llama_index/core/async_utils.py in llama-index-core)

# current helper (sync; other parameters omitted): takes coroutine objects
def run_async_tasks(tasks: List[Coroutine]) -> List[Any]:
    ...

# sketched async version
import asyncio
from typing import Any, Awaitable, List

async def run_async_tasks(tasks: List[Awaitable[Any]]) -> List[Any]:
    """
    Await a list of coroutine objects concurrently without blocking the
    current event loop, e.g. tasks = [tool.query_engine.aquery(query), ...].
    """
    # gather preserves input order and propagates the first exception
    return list(await asyncio.gather(*tasks))

2. Call it with await from _aquery

# Illustrative only: _select_tools and the summarizer call are simplified
# stand-ins, not the actual RouterQueryEngine internals.
class RouterQueryEngine:
    # ...

    async def _aquery(self, query: str) -> Response:
        selected = self._select_tools(query)  # hypothetical: returns the selected tools
        tasks = [t.query_engine.aquery(query) for t in selected]

        # NEW: await the async helper instead of calling the blocking one
        sub_responses = await run_async_tasks(tasks)

        # Continue with summarisation, etc.
        combined = await self.summarizer.aget_response(sub_responses)
        return Response(response=combined)

3. Remove the old asyncio_run/thread shim (if still imported)

# delete or comment out
# from llama_index.core.async_utils import asyncio_run

4. (Optional) Back‑compat shim for callers that still expect the sync API

If some external code still imports the sync version, keep a thin wrapper:

def run_async_tasks_sync(tasks):
    """Legacy wrapper: runs the async helper on a fresh event loop."""
    # Call only from sync code; asyncio.run raises RuntimeError if a loop
    # is already running in this thread.
    return asyncio.run(run_async_tasks(tasks))

5. Verify the fix

async def repro():
    router = RouterQueryEngine(
        selector=MultiSelector(),
        query_engine_tools=[tool("a"), tool("b")],
        llm=MagicMock(),
        summarizer=Summarizer(),
    )

    ran = False
    async def bg():
        nonlocal ran
        await asyncio.sleep(0.01)
        ran = True

    asyncio.create_task(bg())
    await router.aquery("test")
    print("background ran during aquery:", ran)   # → True

Run the script; the background task should set ran = True, confirming the event loop stayed responsive.

Extra Tips

  • Never block the event‑loop thread with future.result() or run_until_complete.
  • Use asyncio.gather (or asyncio.wait) for concurrent async work.
  • Add a unit test that spawns a background task while router.aquery runs; assert the background task completes.
  • If you need a sync entry‑point, expose a separate run_router_query_sync that creates its own loop outside any existing one.
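The last tip can be sketched as a small guard: a sync entry point that creates its own loop only when none is running, and refuses otherwise instead of blocking. run_query_sync and fake_router_query are hypothetical names used for illustration, not part of llama_index.

```python
import asyncio


def run_query_sync(make_coro):
    """Run `make_coro()` to completion on a fresh loop; refuse to run inside one."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop in this thread: safe to create one and block until done.
        return asyncio.run(make_coro())
    raise RuntimeError("already inside an event loop; await the async API instead")


async def fake_router_query(text):
    # Hypothetical stand-in for router.aquery(text).
    await asyncio.sleep(0.01)
    return f"answer to {text}"


if __name__ == "__main__":
    print(run_query_sync(lambda: fake_router_query("test")))
```

Taking a zero-argument factory rather than a coroutine object means no coroutine is created (and left unawaited) when the call is refused.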

That’s all – swapping the blocking helper for the async gather version restores proper concurrency.
