llamaIndex - ✅(Solved) Fix [Bug]: RAG CLI Chat does not work with GoogleGenAI [1 pull requests, 3 comments, 2 participants]

lehoang318 · 2026-02-23T14:40:13Z



GitHub stats

Issue: run-llama/llama_index#20782 (fetched 2026-04-08 00:31:02)
Author: lehoang318
Participants: dosubot[bot], lehoang318
Timeline (top): cross-referenced ×3, commented ×2, labeled ×2, mentioned ×1

Fix Action

Fixed

Fixed by PR: fix: replace asyncio.run() with sync helpers in google-genai LLM (https://github.com/run-llama/llama_index/pull/20867)

PR fix notes

PR #20867: fix: replace asyncio.run() with sync helpers in google-genai LLM

Repository: run-llama/llama_index
Author: MarioRicoIbanez
State: open | merged: False
Link: https://github.com/run-llama/llama_index/pull/20867

Description (problem / solution / changelog)

Summary

Sync methods in GoogleGenAI (_chat, _stream_chat, structured_predict, structured_predict_without_function_calling, stream_structured_predict) used asyncio.run() to call async helpers (prepare_chat_params, chat_message_to_gemini). This crashes with RuntimeError: asyncio.run() cannot be called from a running event loop when called from inside an already-running event loop — which is the case in:

- Jupyter notebooks
- FastAPI / Starlette endpoints (sync route handlers)
- Background workers with persistent event loops (e.g. asyncio.run_coroutine_threadsafe)
- Any framework using the nest_asyncio deprecation path
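The failure described above can be reproduced without the library. The following is a minimal sketch, assuming only the standard library; `prepare_params`, `sync_chat`, and `handler` are hypothetical stand-ins, not names from llama_index.

```python
import asyncio


async def prepare_params() -> str:
    # Hypothetical stand-in for an async helper such as prepare_chat_params.
    await asyncio.sleep(0)
    return "params"


def sync_chat() -> str:
    # Sync method that bridges to async code by starting a new event loop.
    coro = prepare_params()
    try:
        return asyncio.run(coro)
    except RuntimeError:
        coro.close()  # avoid a "coroutine was never awaited" warning
        raise


async def handler() -> str:
    # Runs while a loop is active, e.g. an async REPL or a notebook cell.
    try:
        return sync_chat()
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


outside_loop = sync_chat()            # no loop running: works
inside_loop = asyncio.run(handler())  # loop running: asyncio.run refuses
print(outside_loop)
print(inside_loop)
```

The same `sync_chat()` call succeeds or fails depending only on whether the calling thread already has a running loop, which is why the bug shows up in some environments and not in others.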

Approach

Following the maintainer's direction from #20240 and #20818 (create sync duplicates, don't use asyncio_run() wrapper):

- Added sync versions of the 3 async helpers in utils.py:
  - create_file_part_sync(): uses client.files.upload() + time.sleep() instead of client.aio.files.upload() + await asyncio.sleep()
  - chat_message_to_gemini_sync(): calls create_file_part_sync() instead of await create_file_part()
  - prepare_chat_params_sync(): sequential list comprehension instead of asyncio.gather()
  - This follows the existing pattern in the codebase: delete_uploaded_files() (sync) / adelete_uploaded_files() (async)
- Replaced all 5 asyncio.run() call sites in base.py with the new sync helpers
- Async methods are untouched: they still use await with the original async helpers
- Updated existing tests + added regression test for sync-from-running-loop scenario
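The sync-duplicate pattern in the list above might look roughly like the sketch below. It is not the PR's code: `FakeFiles`, `acreate_part`, `create_part_sync`, `aprepare`, and `prepare_sync` are hypothetical names, and the polling loop only imitates waiting for an uploaded file to become ready.

```python
import asyncio
import time
from typing import List


class FakeFiles:
    """Hypothetical stand-in for an upload client; not the google-genai API."""

    def __init__(self, polls_until_ready: int = 2) -> None:
        self._remaining = polls_until_ready

    def is_ready(self) -> bool:
        self._remaining -= 1
        return self._remaining <= 0


async def acreate_part(name: str, poll_s: float = 0.01) -> str:
    # Async variant: yields to the event loop while waiting.
    files = FakeFiles()
    while not files.is_ready():
        await asyncio.sleep(poll_s)
    return f"part:{name}"


def create_part_sync(name: str, poll_s: float = 0.01) -> str:
    # Sync duplicate: same logic with a blocking sleep, no event loop needed.
    files = FakeFiles()
    while not files.is_ready():
        time.sleep(poll_s)
    return f"part:{name}"


async def aprepare(names: List[str]) -> List[str]:
    # Async variant processes all parts concurrently.
    return list(await asyncio.gather(*(acreate_part(n) for n in names)))


def prepare_sync(names: List[str]) -> List[str]:
    # Sync duplicate processes parts one after another.
    return [create_part_sync(n) for n in names]


async def caller_inside_loop() -> List[str]:
    # Safe while a loop is running: prepare_sync never touches asyncio.
    return prepare_sync(["a.pdf", "b.png"])


sync_result = prepare_sync(["a.pdf", "b.png"])
async_result = asyncio.run(aprepare(["a.pdf", "b.png"]))
nested_result = asyncio.run(caller_inside_loop())
```

One trade-off of this design: a blocking helper called from inside a running loop stalls that loop until it returns, so the sync path is correct but not concurrent.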

Changes

| File | Change |
|------|--------|
| `utils.py` | Added `create_file_part_sync`, `chat_message_to_gemini_sync`, `prepare_chat_params_sync` |
| `base.py` | Replaced `asyncio.run(prepare_chat_params(...))` → `prepare_chat_params_sync(...)` (2 sites) |
| `base.py` | Replaced `asyncio.run(chat_message_to_gemini(...))` → `chat_message_to_gemini_sync(...)` (3 sites) |
| `test_base_cleanup.py` | Updated sync test monkeypatches + added running-loop regression test |

Test plan

- [x] All 5 existing + new tests pass (pytest tests/test_base_cleanup.py)
- [x] Regression test: _chat() called from inside @pytest.mark.asyncio (running loop), no RuntimeError
- [ ] Integration test with Google API key (manual)
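A running-loop regression test of the kind listed in the test plan could be sketched as follows. This version uses the standard library's `unittest.IsolatedAsyncioTestCase` rather than `@pytest.mark.asyncio` so that it runs without plugins; `chat_sync` is a hypothetical stand-in for the sync method under test, not the real `_chat()`.

```python
import asyncio
import unittest


def chat_sync(prompt: str) -> str:
    # Hypothetical sync method under test; it must not start its own event loop.
    return prompt.upper()


class SyncFromRunningLoopTest(unittest.IsolatedAsyncioTestCase):
    async def test_sync_call_inside_running_loop(self) -> None:
        # The test body executes inside a live loop, like @pytest.mark.asyncio.
        self.assertIsNotNone(asyncio.get_running_loop())
        # If chat_sync() called asyncio.run(), this line would raise RuntimeError.
        self.assertEqual(chat_sync("hi"), "HI")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(SyncFromRunningLoopTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The point of running the assertion from an async test body is that the event loop is already active, which is the condition under which the original bug appeared.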

Fixes #20782. Relates to #19812.

Changed files

llama-index-integrations/llms/llama-index-llms-google-genai/llama_index/llms/google_genai/base.py (modified, +9/-15)
llama-index-integrations/llms/llama-index-llms-google-genai/llama_index/llms/google_genai/utils.py (modified, +270/-0)
llama-index-integrations/llms/llama-index-llms-google-genai/tests/test_base_cleanup.py (modified, +35/-4)


Bug Description

RAG CLI Chat does not work with GoogleGenAI.

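In detail:
- GoogleGenAI._stream_chat() (https://github.com/run-llama/llama_index/blob/4937fc017cbf91d08c6beaadb790ae44745a87a1/llama-index-integrations/llms/llama-index-llms-google-genai/llama_index/llms/google_genai/base.py#L403) calls asyncio.run: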

# line 403
next_msg, chat_kwargs, file_api_names = asyncio.run(
    prepare_chat_params(
        self.model, messages, self.file_mode, self._client, **params
    )
)

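- RagCLI.start_chat_repl() (https://github.com/run-llama/llama_index/blob/v0.14.15/llama-index-cli/llama_index/cli/rag/base.py#L269) is defined as async: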

async def start_chat_repl(self) -> None:
    """
    Start a REPL for chatting with the agent.
    """

- GoogleGenAI._stream_chat() is in the call flow from RagCLI.start_chat_repl(), so asyncio.run() is invoked while an event loop is already running, and a RuntimeError is raised.
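To check whether a given sync call site is reachable from a running loop, one option is to probe `asyncio.get_running_loop()`. The sketch below is illustrative only; `stream_chat_stub` and `start_chat_repl_stub` are hypothetical stand-ins that mirror the call flow described above, not the real methods.

```python
import asyncio


def loop_is_running() -> bool:
    # get_running_loop() raises RuntimeError when no loop runs in this thread.
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return False
    return True


def stream_chat_stub() -> bool:
    # Hypothetical stand-in for a sync method reached from the REPL.
    return loop_is_running()


async def start_chat_repl_stub() -> bool:
    # Hypothetical stand-in for an async REPL entry point.
    return stream_chat_stub()


called_directly = stream_chat_stub()
called_from_repl = asyncio.run(start_chat_repl_stub())
print(called_directly, called_from_repl)
```

When the probe reports a running loop at a call site, any `asyncio.run()` at that site will fail in the way this issue describes.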

Version

v0.14.15

Steps to Reproduce

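- Step 1: Create a Python wrapper for RAG CLI (https://developers.llamaindex.ai/python/framework/getting_started/starter_tools/rag_cli/#customization) in which Settings.llm is set to GoogleGenAI.
- Step 2: Ingest several documents.
- Step 3: Run `python rag_cli_wrapper.py --chat`.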

Relevant Logs/Tracebacks

extent analysis

Root cause: GoogleGenAI._stream_chat() is a synchronous method that calls asyncio.run(prepare_chat_params(...)). RagCLI.start_chat_repl() is async, so by the time _stream_chat() is reached an event loop is already running in that thread. asyncio.run() refuses to start a second loop in the same thread and raises RuntimeError: asyncio.run() cannot be called from a running event loop.

Replacing asyncio.run(...) with a plain await is not an option inside _stream_chat(): it is a sync method, and await is only valid inside an async def. The async helper prepare_chat_params therefore needs a synchronous counterpart for the sync code path.

That is the approach taken in PR #20867: the sync methods call new sync helpers (prepare_chat_params_sync, chat_message_to_gemini_sync, create_file_part_sync) instead of wrapping the async ones in asyncio.run(), while the async methods keep awaiting the original helpers.
