llamaIndex - ✅(Solved) Fix [Bug]: Google reranker does not work in production [2 pull requests, 7 comments, 4 participants]

mdciri · 2026-03-25T17:33:03Z

# PR #21184: fix: Google reranker async issues in production

- Repository: run-llama/llama_index
- Author: hkc5
- State: closed | merged: False
- Link: https://github.com/run-llama/llama_index/pull/21184

## Description (problem / solution / changelog)

Fixes #21150

## Problem

The Google reranker was failing in production environments (e.g., React agents) due to issues with the async client. The `RankServiceAsyncClient` was being initialized in `__init__` but used in a different event loop context during `_apostprocess_nodes`, causing technical issues.

## Solution

Remove the custom `_apostprocess_nodes` implementation and the `_async_client` attribute. The class now relies on the base class implementation which uses `asyncio.to_thread` to run the synchronous `_postprocess_nodes` method. This approach:

1. Works correctly in all event loop contexts (Jupyter, production, etc.)
2. Uses only the synchronous client, avoiding event loop conflicts
3. Is consistent with how other postprocessors handle async operations

## Changes

- Removed `_async_client` private attribute
- Removed custom `_apostprocess_nodes` method (now uses base class default)
- Updated tests to work with the new implementation

## Testing

The async test now verifies that `apostprocess_nodes` correctly delegates to the sync `_postprocess_nodes` method via `asyncio.to_thread`.

## Changed files

- `llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/llama_index/postprocessor/google_rerank/base.py` (modified, +0/-48)
- `llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/tests/test_postprocessor_google_rerank.py` (modified, +6/-11)

---

# PR #21210: fix: resolve multiple bugs in core and openai integrations (Issues #21109, #21159, #21150, #21124)

- Repository: run-llama/llama_index
- Author: Nanibunny
- State: closed | merged: False
- Link: https://github.com/run-llama/llama_index/pull/21210

## Description (problem / solution / changelog)

This PR aggregates fixes for multiple reported issues across the `llama-index-core`, `llama-index-llms-openai`, and `llama-index-postprocessor-google-rerank` packages:

* **Fixes #21109:** Modified `SimplePropertyGraphStore` to correctly persist utf-8 encoded characters on Windows environments by adding explicit `encoding="utf-8"` arguments to file handlers.
* **Fixes #21159:** Fixed an issue in `QueryFusionRetriever` where the asynchronous `_aretrieve` method blocked the event loop by synchronously calling `_get_queries()`. Introduced and awaited an `_aget_queries` equivalent instead.
* **Fixes #21150:** Resolved production gRPC failures in `GoogleRerank` caused by thread/event-loop mismatches by lazy-loading the `RankServiceAsyncClient` during async execution instead of in the `__init__` constructor.
* **Fixes #21124:** Addressed multiple edge cases in the OpenAI LLM integration regarding O1/O3 reasoning models and serialization:
    * Retained assistant text content when tool calls are present.
    * Parses, tracks, and serializes the `phase` property for commentary and reasoning workflows.
    * Used prefix matching instead of rigid dictionary checks for O1/O3 models to automatically support emerging variants (like `gpt-5`).
    * Properly distributes reasoning tokens universally across multiple `ThinkingBlock` elements.
    * Enforces valid JSON string serialization for `ToolCallBlock` tool arguments where dicts were previously failing.

## Type of Change

- [x] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] This change requires a documentation update

## How Has This Been Tested?

- [x] I added new unit tests to cover this change
- [x] I believe this change is already covered by existing unit tests

## Suggested Checklist:

- [x] I have performed a self-review of my own code
- [x] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] I have added Google Colab support for the newly added notebooks.
- [x] My changes generate no new warnings
- [x] I have added tests that prove my fix is effective or that my feature works
- [x] New and existing unit tests pass locally with my changes
- [x] I ran `uv run make format; uv run make lint` to appease the lint gods

## Changed files

- `llama-index-core/llama_index/core/graph_stores/simple_labelled.py` (modified, +2/-2)
- `llama-index-core/llama_index/core/retrievers/fusion_retriever.py` (modified, +18/-1)
- `llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/llama_index/postprocessor/google_rerank/base.py` (modified, +15/-4)
- `llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/tests

GitHub stats

run-llama/llama_index#21150•Fetched 2026-04-08 01:31:22

Timeline (top)

commented ×7, cross-referenced ×2, labeled ×2


Bug Description

I tried the Google reranker node postprocessor and it works fine in my tests and in my Jupyter notebook.

In production, however, it does not work. I have a ReAct agent with a query engine tool that uses the reranker, and the virtual assistant keeps replying: "I’m sorry, but due to technical issues with the tools I have available, I’m unable to answer your question. Please try asking something else."

The problem is with the asynchronous threads when calling the _apostprocess_nodes() function, which uses _async_client = discoveryengine.RankServiceAsyncClient(). In support of this, when I remove the _apostprocess_nodes() function and use the default:

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    return await asyncio.to_thread(self._postprocess_nodes, nodes, query_bundle)

all works just fine.
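The default described above can be illustrated with a minimal, self-contained sketch. `SketchReranker` and its toy sorting logic are hypothetical stand-ins, not the real `GoogleRerank` class; the only real API used is `asyncio.to_thread` (Python 3.9+). The sketch shows that a blocking sync method wrapped this way can be awaited from several independent event loops, since nothing in it is tied to the loop that existed at construction time.

```python
import asyncio
import threading
import time
from typing import List, Optional


class SketchReranker:
    """Hypothetical stand-in for a postprocessor that only has a sync implementation."""

    def _postprocess_nodes(
        self, nodes: List[str], query: Optional[str] = None
    ) -> List[str]:
        # Simulate a blocking call made with a synchronous client.
        time.sleep(0.05)
        # Record which thread did the work, for inspection afterwards.
        self.worker_thread = threading.current_thread()
        # Toy "reranking": shortest strings first.
        return sorted(nodes, key=len)

    async def _apostprocess_nodes(
        self, nodes: List[str], query: Optional[str] = None
    ) -> List[str]:
        # Run the blocking method in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self._postprocess_nodes, nodes, query)


reranker = SketchReranker()
# Each asyncio.run call creates and closes its own event loop.
first = asyncio.run(reranker._apostprocess_nodes(["ccc", "a", "bb"], "q"))
second = asyncio.run(reranker._apostprocess_nodes(["zz", "y"], "q"))
```

The same instance is reused across two separate event loops here, which is roughly the situation the report describes between notebook and production use.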

Version

0.14.15

Steps to Reproduce

Deploy a virtual assistant that initializes a ReAct agent and a query engine with Gemini, using the GoogleGenAI() class with VertexAIConfig().

Relevant Logs/Tracebacks

extent analysis

Fix Plan

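The issue seems to be related to how the asynchronous client is handled in the _apostprocess_nodes() function: according to the linked PRs, RankServiceAsyncClient is created in __init__ but then used from a different event loop context in production. To fix this, the asynchronous client should be created and awaited within the event loop that actually runs the coroutine.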
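To make the event-loop mismatch concrete, here is a small simulation using only the standard library. `LoopBoundClient`, `EagerReranker`, and `LazyReranker` are hypothetical classes invented for this sketch; the real gRPC client may fail differently, so treat this as an analogy for the failure mode rather than a reproduction of it.

```python
import asyncio
from typing import List, Optional


class LoopBoundClient:
    """Hypothetical async client that remembers the loop it was created on."""

    def __init__(self) -> None:
        self._loop = asyncio.get_running_loop()

    async def rank(self, items: List[int]) -> List[int]:
        # Refuse to work when called from a loop other than the creating one.
        if asyncio.get_running_loop() is not self._loop:
            raise RuntimeError("client used on a different event loop")
        return sorted(items)


class EagerReranker:
    """Holds a client created up front, mirroring creation in __init__."""

    def __init__(self, client: LoopBoundClient) -> None:
        self._client = client

    async def arank(self, items: List[int]) -> List[int]:
        return await self._client.rank(items)


class LazyReranker:
    """Creates (or recreates) the client inside the running coroutine."""

    def __init__(self) -> None:
        self._client: Optional[LoopBoundClient] = None

    async def arank(self, items: List[int]) -> List[int]:
        loop = asyncio.get_running_loop()
        if self._client is None or self._client._loop is not loop:
            self._client = LoopBoundClient()
        return await self._client.rank(items)


async def make_eager() -> EagerReranker:
    return EagerReranker(LoopBoundClient())


eager = asyncio.run(make_eager())  # client is bound to the first loop
try:
    asyncio.run(eager.arank([3, 1, 2]))  # second loop: mismatch
    eager_error = None
except RuntimeError as exc:
    eager_error = str(exc)

lazy = LazyReranker()
lazy_result_1 = asyncio.run(lazy.arank([3, 1, 2]))
lazy_result_2 = asyncio.run(lazy.arank([9, 8]))
```

The lazy variant corresponds to the approach described in PR #21210, while delegating to the synchronous client through `asyncio.to_thread` (PR #21184) avoids loop-bound state altogether.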

Step-by-Step Solution

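1. **Modify the _apostprocess_nodes() function** so that the asynchronous client is created inside the coroutine and its rank() method is awaited (RankServiceAsyncClient exposes rank(request=...), not rank_nodes()):

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    # Create the client here so it is bound to the running event loop.
    async_client = discoveryengine.RankServiceAsyncClient()
    # _build_request and _to_nodes are hypothetical helpers standing in for the
    # request construction and score mapping already done in _postprocess_nodes.
    request = self._build_request(nodes, query_bundle)
    response = await async_client.rank(request=request)
    return self._to_nodes(nodes, response)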

2. **Add error handling** to catch and log any exceptions that may occur during the asynchronous operation:

```python
import logging

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    try:
        # Create the client inside the coroutine so it uses the running event loop.
        async_client = discoveryengine.RankServiceAsyncClient()
        # _build_request and _to_nodes are hypothetical helpers, not library APIs.
        request = self._build_request(nodes, query_bundle)
        response = await async_client.rank(request=request)
        return self._to_nodes(nodes, response)
    except Exception:
        # Log the full traceback and fall back to an empty result.
        logging.exception("Error in _apostprocess_nodes")
        return []
```

Verification

To verify the fix, exercise the _apostprocess_nodes() function in your production environment and confirm that the assistant no longer replies with the error message.
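Before deploying, a local check along the lines of the test described in PR #21184 can confirm that the async path delegates to the sync method. The sketch below is an assumption about how such a check might look: `SketchPostprocessor` is a hypothetical stand-in for the real class, and the sync method is replaced with a `MagicMock` so no Google client is needed.

```python
import asyncio
from unittest.mock import MagicMock


class SketchPostprocessor:
    """Hypothetical postprocessor whose async path delegates to the sync one."""

    def _postprocess_nodes(self, nodes, query_bundle=None):
        raise NotImplementedError("replaced by a mock in the check below")

    async def _apostprocess_nodes(self, nodes, query_bundle=None):
        return await asyncio.to_thread(self._postprocess_nodes, nodes, query_bundle)


def check_async_delegates_to_sync():
    post = SketchPostprocessor()
    # Swap the sync method for a mock that returns a fixed "reranked" order.
    post._postprocess_nodes = MagicMock(return_value=["b", "a"])
    result = asyncio.run(post._apostprocess_nodes(["a", "b"], "query"))
    # The sync method should be called exactly once with the same arguments.
    post._postprocess_nodes.assert_called_once_with(["a", "b"], "query")
    return result


delegation_result = check_async_delegates_to_sync()
```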

Extra Tips

Ensure that the discoveryengine.RankServiceAsyncClient() is properly configured and authenticated.
Consider adding additional logging and monitoring to detect and diagnose any future issues with the asynchronous operation.
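One possible shape for the logging suggested above is a small decorator that records duration and failures of any coroutine. This is a generic sketch, not part of LlamaIndex: `log_async_call`, `fake_rerank`, and the logger name `rerank_sketch` are all hypothetical.

```python
import asyncio
import functools
import logging
import time

logger = logging.getLogger("rerank_sketch")


def log_async_call(func):
    """Log how long a coroutine took and whether it raised."""

    @functools.wraps(func)
    async def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = await func(*args, **kwargs)
        except Exception:
            elapsed = time.perf_counter() - start
            # logger.exception includes the traceback in the log record.
            logger.exception("%s failed after %.3fs", func.__name__, elapsed)
            raise
        elapsed = time.perf_counter() - start
        logger.info("%s succeeded in %.3fs", func.__name__, elapsed)
        return result

    return wrapper


@log_async_call
async def fake_rerank(items):
    # Stand-in for the real async reranking call.
    await asyncio.sleep(0)
    if not items:
        raise ValueError("no nodes to rerank")
    return sorted(items)


ok = asyncio.run(fake_rerank([2, 1]))
try:
    asyncio.run(fake_rerank([]))
    failure = None
except ValueError as exc:
    failure = str(exc)
```

Re-raising after logging keeps the failure visible to the caller, instead of turning it into a generic "technical issues" reply with no trace in the logs.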
