LlamaIndex - ✅(Solved) Fix [Bug]: Google reranker does not work in production [2 pull requests, 7 comments, 4 participants]

GitHub stats
run-llama/llama_index#21150 · Fetched 2026-04-08 01:31:22
Comments: 7
Participants: 4
Timeline: 11
Reactions: 0
Timeline (top): commented ×7, cross-referenced ×2, labeled ×2

PR fix notes

PR #21184: fix: Google reranker async issues in production

Description (problem / solution / changelog)

Fixes #21150

Problem

The Google reranker was failing in production environments (e.g., ReAct agents) because of how it used the async client. The RankServiceAsyncClient was initialized in __init__ but then used from a different event loop context during _apostprocess_nodes, so the rerank call failed and the agent fell back to its generic "technical issues" reply.
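The failure mode can be illustrated with the standard library alone. The sketch below is an analogy, not the Discovery Engine client: `LoopBoundClient` is a hypothetical stand-in that, like an object created for one event loop, is tied to that loop at construction and then awaited from another.

```python
import asyncio


class LoopBoundClient:
    """Hypothetical stand-in for an async client bound to one event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop) -> None:
        # The pending future belongs to `loop`, much as an async client's
        # internals belong to the loop it was set up for.
        self._pending = loop.create_future()

    async def rank(self) -> None:
        await self._pending


def call_from_other_loop() -> str:
    init_loop = asyncio.new_event_loop()
    client = LoopBoundClient(init_loop)  # built against one loop
    try:
        asyncio.run(client.rank())  # awaited on a different loop
    except RuntimeError as exc:
        return str(exc)
    finally:
        init_loop.close()
    return "no error"


error_message = call_from_other_loop()
print(error_message)
```

asyncio rejects the await because the future is attached to a different loop than the task that awaits it. The real client may fail in a different way, but the mismatch is of the same kind.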

Solution

Remove the custom _apostprocess_nodes implementation and the _async_client attribute. The class now relies on the base class implementation which uses asyncio.to_thread to run the synchronous _postprocess_nodes method. This approach:

  1. Works correctly in all event loop contexts (Jupyter, production, etc.)
  2. Uses only the synchronous client, avoiding event loop conflicts
  3. Is consistent with how other postprocessors handle async operations
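A minimal sketch of why this delegation is independent of the event loop, assuming a hypothetical `SyncOnlyReranker` whose sorting logic stands in for the blocking rank call. Only `asyncio.to_thread` and the method names come from the PR description.

```python
import asyncio
from typing import List, Optional


class SyncOnlyReranker:
    """Hypothetical reranker that owns only a synchronous code path."""

    def _postprocess_nodes(
        self, nodes: List[str], query: Optional[str] = None
    ) -> List[str]:
        # Stand-in for the blocking rank call made with the sync client.
        return sorted(nodes, key=len)

    async def _apostprocess_nodes(
        self, nodes: List[str], query: Optional[str] = None
    ) -> List[str]:
        # Same delegation as the base class default quoted in the issue.
        return await asyncio.to_thread(self._postprocess_nodes, nodes, query)


reranker = SyncOnlyReranker()  # created outside any event loop
nodes = ["ccc", "a", "bb"]

# Each asyncio.run call uses its own event loop; both succeed because
# nothing in the reranker is tied to a particular loop.
first = asyncio.run(reranker._apostprocess_nodes(nodes, "q"))
second = asyncio.run(reranker._apostprocess_nodes(nodes, "q"))
```

The trade-off is that each call occupies a worker thread while the request is in flight, which is usually acceptable for a single rerank request per query.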

Changes

  • Removed _async_client private attribute
  • Removed custom _apostprocess_nodes method (now uses base class default)
  • Updated tests to work with the new implementation

Testing

The async test now verifies that apostprocess_nodes correctly delegates to the sync _postprocess_nodes method via asyncio.to_thread.
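A test along these lines could look like the sketch below. It is not the PR's actual test: `Reranker` is a hypothetical stand-in class, and the thread check is an extra assertion added here to show that the sync method runs off the event loop's thread.

```python
import asyncio
import threading
from unittest.mock import MagicMock


class Reranker:
    """Hypothetical class with the same sync/async method pair."""

    def _postprocess_nodes(self, nodes, query_bundle=None):
        raise NotImplementedError  # replaced by a mock in the test

    async def _apostprocess_nodes(self, nodes, query_bundle=None):
        return await asyncio.to_thread(self._postprocess_nodes, nodes, query_bundle)


def test_async_delegates_to_sync() -> None:
    reranker = Reranker()
    worker_threads = []

    def fake_sync(nodes, query_bundle=None):
        # Record which thread executed the sync path.
        worker_threads.append(threading.get_ident())
        return list(reversed(nodes))

    reranker._postprocess_nodes = MagicMock(side_effect=fake_sync)

    result = asyncio.run(reranker._apostprocess_nodes(["a", "b"], "query"))

    assert result == ["b", "a"]
    reranker._postprocess_nodes.assert_called_once_with(["a", "b"], "query")
    # The sync method ran in a worker thread, not on the loop's thread.
    assert worker_threads[0] != threading.get_ident()


test_async_delegates_to_sync()
```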

Changed files

  • llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/llama_index/postprocessor/google_rerank/base.py (modified, +0/-48)
  • llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/tests/test_postprocessor_google_rerank.py (modified, +6/-11)

PR #21210: fix: resolve multiple bugs in core and openai integrations (Issues #21109, #21159, #21150, #21124)

Description (problem / solution / changelog)

This PR aggregates fixes for multiple reported issues across the llama-index-core, llama-index-llms-openai, and llama-index-postprocessor-google-rerank packages:

  • Fixes #21109: Modified SimplePropertyGraphStore to correctly persist utf-8 encoded characters on Windows environments by adding explicit encoding="utf-8" arguments to file handlers.
  • Fixes #21159: Fixed an issue in QueryFusionRetriever where the asynchronous _aretrieve method blocked the event loop by synchronously calling _get_queries(). Introduced and awaited an _aget_queries equivalent instead.
  • Fixes #21150: Resolved production gRPC failures in GoogleRerank caused by thread/event-loop mismatches by lazy-loading the RankServiceAsyncClient during async execution instead of in the __init__ constructor.
  • Fixes #21124: Addressed multiple edge cases in the OpenAI LLM integration regarding O1/O3 reasoning models and serialization:
    • Retains assistant text content when tool calls are present.
    • Parses, tracks, and serializes the phase property for commentary and reasoning workflows.
    • Uses prefix matching instead of rigid dictionary checks for O1/O3 models, so emerging variants (like gpt-5) are supported automatically.
    • Distributes reasoning tokens across multiple ThinkingBlock elements.
    • Enforces valid JSON string serialization for ToolCallBlock tool arguments, where dicts previously failed.
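The lazy-loading approach described for #21150 can be sketched as follows. This is an illustration under assumptions, not the PR's code: `FakeAsyncClient`, `LazyReranker`, and `_get_async_client` are hypothetical names, and rebuilding the client when the loop changes is an extra precaution added in this sketch.

```python
import asyncio
from typing import Optional


class FakeAsyncClient:
    """Hypothetical stand-in for an async client; records its creation loop."""

    def __init__(self) -> None:
        # Raises RuntimeError if no event loop is running.
        self.loop = asyncio.get_running_loop()


class LazyReranker:
    def __init__(self) -> None:
        # No async client here: __init__ may run with no loop at all, or on
        # a different loop than the one that later serves requests.
        self._async_client: Optional[FakeAsyncClient] = None

    def _get_async_client(self) -> FakeAsyncClient:
        loop = asyncio.get_running_loop()
        # Create on first use; rebuild if the running loop has changed.
        if self._async_client is None or self._async_client.loop is not loop:
            self._async_client = FakeAsyncClient()
        return self._async_client

    async def _apostprocess_nodes(self) -> bool:
        client = self._get_async_client()
        return client.loop is asyncio.get_running_loop()


reranker = LazyReranker()  # safe to construct outside an event loop
bound_to_current_loop = [
    asyncio.run(reranker._apostprocess_nodes()) for _ in range(2)
]
```

Compared with the `asyncio.to_thread` approach in PR #21184, this keeps a native async path at the cost of managing the client's lifetime per event loop.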

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-core/llama_index/core/graph_stores/simple_labelled.py (modified, +2/-2)
  • llama-index-core/llama_index/core/retrievers/fusion_retriever.py (modified, +18/-1)
  • llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/llama_index/postprocessor/google_rerank/base.py (modified, +15/-4)
  • llama-index-integrations/postprocessor/llama-index-postprocessor-google-rerank/tests/test_postprocessor_google_rerank.py (modified, +7/-3)

Code Example

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    return await asyncio.to_thread(self._postprocess_nodes, nodes, query_bundle)

---

Bug Description

I tried the Google reranker node postprocessor, and it works fine in my tests and in my Jupyter notebook.

It does not work in production, however. I have a ReAct agent with a query engine tool that uses the reranker, and the virtual assistant keeps replying: I’m sorry, but due to technical issues with the tools I have available, I’m unable to answer your question. Please try asking something else.

The problem lies in the asynchronous threads when calling the _apostprocess_nodes() function, which uses _async_client = discoveryengine.RankServiceAsyncClient(). In support of this, when I remove the _apostprocess_nodes() function and use the default:

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    return await asyncio.to_thread(self._postprocess_nodes, nodes, query_bundle)

all works just fine.

Version

0.14.15

Steps to Reproduce

Deploy a virtual assistant that initializes a ReAct agent and a query engine backed by Gemini, using the GoogleGenAI() class with VertexAIConfig().

Relevant Logs/Tracebacks

extent analysis

Fix Plan

The issue comes from how _apostprocess_nodes() uses the asynchronous client: the client is created in __init__ but awaited from a different event loop. One way to address this is to create the client inside the coroutine, so that it is bound to the loop that awaits it.

Step-by-Step Solution

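  1. Modify the _apostprocess_nodes() function so the asynchronous client is created inside the coroutine, on the running event loop:

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    # Create the client here so it binds to the running event loop.
    async_client = discoveryengine.RankServiceAsyncClient()
    # _build_rank_request and _to_nodes_with_scores are hypothetical helpers.
    request = self._build_rank_request(nodes, query_bundle)
    response = await async_client.rank(request=request)
    return self._to_nodes_with_scores(response, nodes)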

  2. **Add error handling** to catch and log any exceptions that may occur during the asynchronous operation:

```python
import logging

async def _apostprocess_nodes(
    self,
    nodes: List[NodeWithScore],
    query_bundle: Optional[QueryBundle] = None,
) -> List[NodeWithScore]:
    """Postprocess nodes (async)."""
    try:
        # Create the client here so it binds to the running event loop.
        async_client = discoveryengine.RankServiceAsyncClient()
        # _build_rank_request and _to_nodes_with_scores are hypothetical helpers.
        request = self._build_rank_request(nodes, query_bundle)
        response = await async_client.rank(request=request)
        return self._to_nodes_with_scores(response, nodes)
    except Exception:
        # Log the full traceback and return no nodes.
        logging.exception("Error in _apostprocess_nodes")
        return []
```

Verification

To verify the fix, exercise the _apostprocess_nodes() path in your production environment and check that the agent no longer replies with the "technical issues" message.

Extra Tips

  • Ensure that the discoveryengine.RankServiceAsyncClient() is properly configured and authenticated.
  • Consider adding additional logging and monitoring to detect and diagnose any future issues with the asynchronous operation.
