llamaIndex - ✅(Solved) Fix [Bug]: RAG CLI Chat does not work with GoogleGenAI [1 pull requests, 3 comments, 2 participants]

GitHub stats
run-llama/llama_index#20782 (fetched 2026-04-08 00:31:02)
Comments: 3
Participants: 2
Timeline: 10
Reactions: 0
Timeline (top): cross-referenced ×3, commented ×2, labeled ×2, mentioned ×1

Fix Action

Fixed

PR fix notes

PR #20867: fix: replace asyncio.run() with sync helpers in google-genai LLM

Description (problem / solution / changelog)

Summary

Sync methods in GoogleGenAI (_chat, _stream_chat, structured_predict, structured_predict_without_function_calling, stream_structured_predict) used asyncio.run() to call async helpers (prepare_chat_params, chat_message_to_gemini). This crashes with RuntimeError: asyncio.run() cannot be called from a running event loop when called from inside an already-running event loop — which is the case in:

  • Jupyter notebooks
  • FastAPI / Starlette endpoints (sync LLM methods called from async route handlers)
  • Background workers with persistent event loops (e.g. asyncio.run_coroutine_threadsafe)
  • Any framework that is moving away from nest_asyncio
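
The failure mode described above can be reproduced with the standard library alone. The sketch below is not the GoogleGenAI code: `prepare_params`, `sync_chat`, and `async_caller` are hypothetical stand-ins that only mirror the shape of a sync method bridging to an async helper through `asyncio.run()`.

```python
import asyncio


async def prepare_params() -> str:
    # Hypothetical stand-in for an async helper such as prepare_chat_params.
    await asyncio.sleep(0)
    return "params"


def sync_chat() -> str:
    # Sync method that bridges to async code with asyncio.run().
    coro = prepare_params()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        # asyncio.run() refused to start; close the coroutine so Python
        # does not warn that it was never awaited.
        coro.close()
        raise


async def async_caller() -> str:
    # An event loop is already running in this thread, so the nested
    # asyncio.run() inside sync_chat() raises RuntimeError.
    try:
        return sync_chat()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


# No running loop: asyncio.run() inside sync_chat() succeeds.
outside_loop = sync_chat()

# Running loop: the same call fails.
inside_loop = asyncio.run(async_caller())

print(outside_loop)
print(inside_loop)
```

The first call succeeds because no loop is running in the thread. The second fails because `async_caller` already runs inside one, which is the situation the sync GoogleGenAI methods were in when reached from async code.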

Approach

Following the maintainer's direction from #20240 and #20818 (create sync duplicates, don't use asyncio_run() wrapper):

  • Added sync versions of the 3 async helpers in utils.py:

    • create_file_part_sync() — uses client.files.upload() + time.sleep() instead of client.aio.files.upload() + await asyncio.sleep()
    • chat_message_to_gemini_sync() — calls create_file_part_sync() instead of await create_file_part()
    • prepare_chat_params_sync() — sequential list comprehension instead of asyncio.gather()
  • This follows the existing pattern in the codebase: delete_uploaded_files() (sync) / adelete_uploaded_files() (async)

  • Replaced all 5 asyncio.run() call sites in base.py with the new sync helpers

  • Async methods are untouched — they still use await with the original async helpers

  • Updated existing tests + added regression test for sync-from-running-loop scenario
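
The sync-duplicate pattern in the list above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the PR: `FakeFiles` is a hypothetical in-memory client (it is not the google-genai API), and the helper names `create_part`, `create_part_sync`, `prepare`, and `prepare_sync` are illustrative only.

```python
import asyncio
import time
from typing import Dict, List


class FakeFiles:
    """Hypothetical in-memory upload client; NOT the google-genai API."""

    def __init__(self) -> None:
        self._polls: Dict[str, int] = {}

    def upload(self, name: str) -> str:
        handle = f"files/{name}"
        self._polls[handle] = 0
        return handle

    async def aupload(self, name: str) -> str:
        return self.upload(name)

    def is_ready(self, handle: str) -> bool:
        # Report "still processing" on the first poll, "ready" afterwards.
        self._polls[handle] += 1
        return self._polls[handle] > 1


async def create_part(files: FakeFiles, name: str) -> str:
    # Async helper: awaits the upload and yields to the loop while polling.
    handle = await files.aupload(name)
    while not files.is_ready(handle):
        await asyncio.sleep(0.01)
    return handle


def create_part_sync(files: FakeFiles, name: str) -> str:
    # Sync duplicate: blocking upload and time.sleep(), no event loop needed.
    handle = files.upload(name)
    while not files.is_ready(handle):
        time.sleep(0.01)
    return handle


async def prepare(files: FakeFiles, names: List[str]) -> List[str]:
    # Async helper: process all files concurrently.
    return list(await asyncio.gather(*(create_part(files, n) for n in names)))


def prepare_sync(files: FakeFiles, names: List[str]) -> List[str]:
    # Sync duplicate: sequential list comprehension instead of gather().
    return [create_part_sync(files, n) for n in names]


async def main() -> tuple:
    # The sync helper never touches asyncio, so calling it while a loop
    # is running does not raise (it does block the loop while it polls).
    from_loop = prepare_sync(FakeFiles(), ["a.pdf", "b.pdf"])
    via_async = await prepare(FakeFiles(), ["a.pdf", "b.pdf"])
    return from_loop, via_async


sync_result, async_result = asyncio.run(main())
print(sync_result, async_result)
```

The trade-off is that the sync path gives up concurrency and blocks its thread while polling. In exchange it works whether or not an event loop is already running.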

Changes

File | Change
utils.py | Added create_file_part_sync, chat_message_to_gemini_sync, prepare_chat_params_sync
base.py | Replaced asyncio.run(prepare_chat_params(...)) with prepare_chat_params_sync(...) (2 sites)
base.py | Replaced asyncio.run(chat_message_to_gemini(...)) with chat_message_to_gemini_sync(...) (3 sites)
test_base_cleanup.py | Updated sync test monkeypatches and added running-loop regression test

Test plan

  • All 5 existing + new tests pass (pytest tests/test_base_cleanup.py)
  • Regression test: _chat() called from inside @pytest.mark.asyncio (running loop) — no RuntimeError
  • Integration test with Google API key (manual)
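
A running-loop regression check of the kind listed above can be sketched without any test plugin. The PR's own test uses @pytest.mark.asyncio. The version below reproduces the same condition with plain `asyncio.run()`, and `chat_sync` is a hypothetical placeholder for the sync method under test.

```python
import asyncio


def chat_sync(message: str) -> str:
    # Hypothetical sync method under test; it must not call asyncio.run().
    return message.upper()


def test_sync_chat_inside_running_loop() -> str:
    async def scenario() -> str:
        # A loop is running here, as in an @pytest.mark.asyncio test body;
        # get_running_loop() would raise if it were not.
        asyncio.get_running_loop()
        return chat_sync("hello")

    # A nested asyncio.run() inside chat_sync() would raise RuntimeError,
    # propagate out of scenario(), and fail the test.
    result = asyncio.run(scenario())
    assert result == "HELLO"
    return result


regression_result = test_sync_chat_inside_running_loop()
print(regression_result)
```

If `chat_sync` were changed to wrap an async helper in `asyncio.run()`, this test would fail with the RuntimeError from the issue, which is what makes it a useful regression guard.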

Fixes #20782. Relates to #19812.

Changed files

  • llama-index-integrations/llms/llama-index-llms-google-genai/llama_index/llms/google_genai/base.py (modified, +9/-15)
  • llama-index-integrations/llms/llama-index-llms-google-genai/llama_index/llms/google_genai/utils.py (modified, +270/-0)
  • llama-index-integrations/llms/llama-index-llms-google-genai/tests/test_base_cleanup.py (modified, +35/-4)


Bug Description

RAG CLI Chat does not work with GoogleGenAI.

A sync method of GoogleGenAI calls asyncio.run():

# line 403
next_msg, chat_kwargs, file_api_names = asyncio.run(
    prepare_chat_params(
        self.model, messages, self.file_mode, self._client, **params
    )
)

RagCLI.start_chat_repl() is a coroutine:

async def start_chat_repl(self) -> None:
    """
    Start a REPL for chatting with the agent.
    """

  • GoogleGenAI._stream_chat() is on the call path of RagCLI.start_chat_repl(), so asyncio.run() is invoked while an event loop is already running and a RuntimeError is raised.

Version

v0.14.15

Steps to Reproduce

  • Step 1: Create a Python wrapper for RAG CLI in which Settings.llm is set to GoogleGenAI
  • Step 2: Ingest several documents
  • Step 3: Run python rag_cli_wrapper.py --chat

Relevant Logs/Tracebacks

extent analysis

The RAG CLI chat fails with GoogleGenAI because of how asyncio is used. RagCLI.start_chat_repl() is async, so an event loop is already running by the time the call reaches GoogleGenAI._stream_chat(). That sync method calls asyncio.run(prepare_chat_params(...)), and asyncio.run() refuses to start a new event loop in a thread where one is already running. That refusal is the RuntimeError.

The root cause is therefore a nested event loop: a sync method bridges to an async helper with asyncio.run() while being called from async code.

prepare_chat_params is an async helper. Inside an async method the right call is await prepare_chat_params(...), which reuses the running loop instead of creating a new one. _stream_chat() is a sync method, however, and cannot use await. There the asyncio.run() call has to be replaced by a synchronous equivalent, which is what PR #20867 does with prepare_chat_params_sync().
