llamaIndex - ✅(Solved) Fix [Bug]: `refresh_ref_docs()` / `arefresh_ref_docs()` drop kwargs after the first document in a batch [2 pull requests, 2 comments, 3 participants]

llamaIndex · 2026-04-30 18:44:11

GitHub stats

run-llama/llama_index#21518 • Fetched 2026-05-01 05:33:16

Root Cause

The insert_kwargs and update_kwargs passed to refresh_ref_docs() are silently dropped after the first matching document because the method calls .pop() on the shared update_kwargs dict inside the document loop. The first pop removes the key, so every later iteration falls back to the empty default. As a result, the first inserted or updated document receives the expected kwargs, while subsequent documents in the same batch receive {} and no error is raised.
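The failure mode described above can be reproduced without the library. The following is a minimal sketch, assuming a loop shaped like the one the root cause describes; buggy_refresh is a hypothetical stand-in, not the actual refresh_ref_docs() implementation.

```python
from typing import Any, Dict, List


def buggy_refresh(docs: List[str], **update_kwargs: Any) -> List[Dict[str, Any]]:
    """Hypothetical stand-in for the document loop; not the library's code."""
    received: List[Dict[str, Any]] = []
    for _ in docs:
        # .pop() removes the key from the shared dict on the first iteration,
        # so every later iteration gets the empty default instead.
        insert_kwargs = update_kwargs.pop("insert_kwargs", {})
        received.append(insert_kwargs)
    return received


result = buggy_refresh(["doc 0", "doc 1", "doc 2"], insert_kwargs={"my_flag": True})
print(result)  # [{'my_flag': True}, {}, {}]
```

The printed list has the same shape as the reported output: the first document sees the kwargs and the rest see an empty dict.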

Bug Description

Version

0.14.21

Steps to Reproduce

from typing import Any, List

from llama_index.core import Document, VectorStoreIndex
from llama_index.core.schema import BaseNode, TransformComponent


class RecordKwargs(TransformComponent):
    def __call__(self, nodes: List[BaseNode], **kwargs: Any) -> List[BaseNode]:
        print(f"transform received: {kwargs}")
        return nodes


docs = [Document(text=f"doc {i}") for i in range(3)]
index = VectorStoreIndex([], transformations=[RecordKwargs()])

print("refresh_ref_docs with insert_kwargs={'my_flag': True}:")
index.refresh_ref_docs(docs, insert_kwargs={"my_flag": True})

Relevant Logs/Tracebacks

refresh_ref_docs with insert_kwargs={'my_flag': True}:
transform received: {'my_flag': True}
transform received: {}
transform received: {}
[True, True, True]

extent analysis

TL;DR

refresh_ref_docs() calls .pop() on the shared update_kwargs dict inside the document loop, which silently drops insert_kwargs and update_kwargs for every document after the first match. Read the values without mutating the dict, or read them once before the loop.

Guidance

Identify the line of code where .pop() is called on update_kwargs and refactor to avoid modifying the shared dictionary.
Consider creating a copy of update_kwargs for each document iteration to prevent unintended side effects.
Review the refresh_ref_docs() method to ensure it handles insert_kwargs and update_kwargs correctly for all documents in a batch.
Verify the fix by running the provided example code and checking the output of transform received to ensure it prints the expected kwargs for all documents.
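The verification step in the guidance above can be automated by recording what each transform call received and comparing it with what was passed in. This is a sketch only: find_dropped_kwargs is a hypothetical helper, and the reported list is transcribed from the "transform received" lines in the logs above.

```python
from typing import Any, Dict, List


def find_dropped_kwargs(
    received: List[Dict[str, Any]], expected: Dict[str, Any]
) -> List[int]:
    """Return the positions of calls whose kwargs differ from what was passed."""
    return [i for i, kwargs in enumerate(received) if kwargs != expected]


# Kwargs printed by the transform in the reported run, one entry per document.
reported = [{"my_flag": True}, {}, {}]
dropped = find_dropped_kwargs(reported, {"my_flag": True})
print(dropped)  # [1, 2]
```

An empty result means every document in the batch received the full kwargs, which is the behavior a fix should produce.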

Example

# Copy update_kwargs for each document so that .pop() never mutates the shared dict
for doc in documents:
    update_kwargs_copy = update_kwargs.copy()
    # Call .pop() on update_kwargs_copy, not on update_kwargs
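An alternative to copying on every iteration is to read the shared value once, before the loop. The sketch below assumes a loop of the same general shape as the one described in the root cause; refresh_all and handle_doc are hypothetical names, and this is not the change made in the linked pull requests.

```python
from typing import Any, Callable, Dict, List


def refresh_all(
    docs: List[str],
    handle_doc: Callable[..., None],
    **update_kwargs: Any,
) -> None:
    """Hypothetical loop: read the shared kwargs once, then reuse them."""
    # .get() leaves update_kwargs untouched, unlike .pop().
    insert_kwargs: Dict[str, Any] = update_kwargs.get("insert_kwargs", {})
    for doc in docs:
        # ** unpacking builds a fresh kwargs dict for each call,
        # so the shared dict is never mutated by the callee.
        handle_doc(doc, **insert_kwargs)


seen: List[Dict[str, Any]] = []
refresh_all(
    ["doc 0", "doc 1", "doc 2"],
    lambda doc, **kwargs: seen.append(kwargs),
    insert_kwargs={"my_flag": True},
)
print(seen)  # [{'my_flag': True}, {'my_flag': True}, {'my_flag': True}]
```

Reading before the loop keeps the per-document path free of mutation, so the position of a document in the batch no longer affects what it receives.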

Notes

The reproduction script and logs are consistent with .pop() being called on the shared update_kwargs dict. Without the complete implementation of refresh_ref_docs(), a more specific fix cannot be given here.

Recommendation

Refactor refresh_ref_docs() so that it no longer mutates the shared update_kwargs dict, for example by copying update_kwargs on each document iteration. Every document in the batch should then receive the same insert_kwargs and update_kwargs.

#index setup #retrieval issue #search optimization #API routing #API middleware

Fix Action

Fixed

PR fix notes

PR #21519: fix(core): preserve refresh_ref_docs kwargs across batch documents

PR #21522: fix(indices): preserve insert_kwargs and update_kwargs across all documents in refresh_ref_docs batch
