LlamaIndex - ✅ (Solved) [Bug]: `VectorStoreIndex` delete helpers use the wrong delete API and a sync call in the async path [2 pull requests, 1 participant]

GitHub stats

run-llama/llama_index#21049 (fetched 2026-04-08 00:53:04)

  • Comments: 0
  • Participants: 1
  • Timeline: 5
  • Reactions: 0
  • Timeline (top): cross-referenced ×2, labeled ×2, closed ×1

Root Cause

_delete_from_index_struct() and _adelete_from_index_struct() both call self._vector_store.delete(node_id), which passes a node ID to an API that expects a ref-doc ID. Node-wise deletion has its own APIs, delete_nodes() and adelete_nodes(). The call is also redundant: delete_ref_doc() already calls delete(ref_doc_id), and adelete_ref_doc() already schedules adelete(ref_doc_id). Finally, _adelete_from_index_struct() makes this extra call synchronously rather than through the async delete path.
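The difference between the two deletion APIs can be illustrated with a toy store. This is a minimal sketch: ToyVectorStore and its add() method are hypothetical stand-ins, not the llama_index classes, and only the method names delete() and delete_nodes() mirror the ones named in the report.

```python
from typing import Dict, List


class ToyVectorStore:
    """Hypothetical in-memory store; not the llama_index implementation."""

    def __init__(self) -> None:
        # Maps node_id -> ref_doc_id of the document the node came from.
        self.node_to_ref_doc: Dict[str, str] = {}

    def add(self, node_id: str, ref_doc_id: str) -> None:
        self.node_to_ref_doc[node_id] = ref_doc_id

    def delete(self, ref_doc_id: str) -> None:
        """Ref-doc deletion: drop every node that belongs to ref_doc_id."""
        self.node_to_ref_doc = {
            n: r for n, r in self.node_to_ref_doc.items() if r != ref_doc_id
        }

    def delete_nodes(self, node_ids: List[str]) -> None:
        """Node-wise deletion: drop exactly the listed nodes."""
        for node_id in node_ids:
            self.node_to_ref_doc.pop(node_id, None)


store = ToyVectorStore()
store.add("node-1", "doc-a")
store.add("node-2", "doc-a")

# Passing a node ID to the ref-doc API matches nothing in this toy store.
store.delete("node-1")
remaining_after_wrong_call = sorted(store.node_to_ref_doc)

# The node-wise API removes the node that was intended.
store.delete_nodes(["node-1"])
remaining_after_node_delete = sorted(store.node_to_ref_doc)

# The ref-doc API removes everything left for that document.
store.delete("doc-a")
remaining_after_ref_doc_delete = sorted(store.node_to_ref_doc)
```

In this sketch the misdirected call is a silent no-op. How a real vector store reacts to a node ID passed as a ref-doc ID depends on the backend, which is part of why the call is worth removing.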

Fix Action

Fixed

PR fix notes

PR #21050: fix(core): remove incorrect per-node delete calls in index helpers

Description (problem / solution / changelog)

Description

Removes the extra vector_store.delete(node_id) calls from the VectorStoreIndex delete helpers, since ref-doc cleanup is already handled by the top-level delete path. This also removes the synchronous delete call from the async path.

Fixes #21049

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

Changed files

  • llama-index-core/llama_index/core/indices/vector_store/base.py (modified, +0/-2)
  • llama-index-core/tests/indices/vector_store/test_simple.py (modified, +28/-0)
  • llama-index-core/tests/indices/vector_store/test_simple_async.py (modified, +31/-1)

PR #21055: fix: use async adelete in _adelete_from_index_struct

Description (problem / solution / changelog)

Fixes #21049

Summary

_adelete_from_index_struct() called the synchronous self._vector_store.delete(node_id) from an async context instead of awaiting self._vector_store.adelete(node_id). The sync call bypasses the async code path and, for a store whose delete() performs blocking I/O, can stall the event loop until it returns.
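Why a synchronous call inside a coroutine matters can be shown with plain asyncio. This is a sketch under assumptions: blocking_delete(), async_delete(), and heartbeat() are hypothetical stand-ins that simulate I/O with sleeps, and no llama_index code is involved.

```python
import asyncio
import time
from typing import List

events: List[str] = []


def blocking_delete() -> None:
    """Stand-in for a synchronous store call that performs I/O."""
    time.sleep(0.05)
    events.append("sync delete done")


async def async_delete() -> None:
    """Stand-in for an awaited async store call."""
    await asyncio.sleep(0.05)
    events.append("async delete done")


async def heartbeat() -> None:
    """Another task that wants to run while the delete is in flight."""
    await asyncio.sleep(0.01)
    events.append("heartbeat")


async def run_with_sync_call() -> None:
    task = asyncio.create_task(heartbeat())
    blocking_delete()  # the loop cannot switch to heartbeat() during this call
    await task


async def run_with_async_call() -> None:
    task = asyncio.create_task(heartbeat())
    await async_delete()  # yields to the loop, so heartbeat() finishes first
    await task


asyncio.run(run_with_sync_call())
sync_order = list(events)

events.clear()
asyncio.run(run_with_async_call())
async_order = list(events)
```

With the blocking call, the heartbeat task only gets to run after the delete has finished; with the awaited call, it completes while the delete is still pending.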

Changes

  • Changed sync self._vector_store.delete() call to await self._vector_store.adelete() in _adelete_from_index_struct

Changed files

  • llama-index-core/llama_index/core/indices/vector_store/base.py (modified, +1/-1)

Bug Description

_delete_from_index_struct() and _adelete_from_index_struct() both call self._vector_store.delete(node_id), passing a node ID to a ref-doc deletion API. This is semantically incorrect imo, since node-wise deletion has separate delete_nodes() / adelete_nodes() APIs. It is also unnecessary, because delete_ref_doc() already calls delete(ref_doc_id), and adelete_ref_doc() already schedules adelete(ref_doc_id). In addition, _adelete_from_index_struct() makes this extra delete call synchronously instead of using the async delete path.

Version

0.14.18

Steps to Reproduce


import asyncio
from typing import Any

from llama_index.core import VectorStoreIndex, MockEmbedding
from llama_index.core.schema import Document
from llama_index.core.storage.storage_context import StorageContext
from llama_index.core.vector_stores.simple import SimpleVectorStore

delete_log = []


class TrackingVectorStore(SimpleVectorStore):
    def delete(self, ref_doc_id: str, **kwargs: Any) -> None:
        delete_log.append(("SYNC", ref_doc_id))
        super().delete(ref_doc_id, **kwargs)

    async def adelete(self, ref_doc_id: str, **kwargs: Any) -> None:
        delete_log.append(("ASYNC", ref_doc_id))
        await super().adelete(ref_doc_id, **kwargs)

    @property
    def client(self) -> Any:
        return None


store = TrackingVectorStore()
index = VectorStoreIndex.from_documents(
    [Document(text="hello world", doc_id="my-doc-id")],
    storage_context=StorageContext.from_defaults(vector_store=store),
    embed_model=MockEmbedding(embed_dim=8),
)

index.delete_ref_doc("my-doc-id")
print("Sync delete calls:", delete_log)

delete_log.clear()

store2 = TrackingVectorStore()
index2 = VectorStoreIndex.from_documents(
    [Document(text="hello world", doc_id="my-doc-id")],
    storage_context=StorageContext.from_defaults(vector_store=store2),
    embed_model=MockEmbedding(embed_dim=8),
)

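# Top-level `await` only works in a notebook or async REPL; in a script, drive the coroutine with asyncio.run().
asyncio.run(index2.adelete_ref_doc("my-doc-id"))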
print("Async delete calls:", delete_log)

Relevant Logs/Tracebacks

Sync delete calls: [('SYNC', 'my-doc-id'), ('SYNC', 'afc5df02-1d56-40bc-bda2-fdd222f2fd83')]
Async delete calls: [('ASYNC', 'my-doc-id'), ('SYNC', 'my-doc-id'), ('SYNC', 'b1aa8b5e-981f-4494-a9c0-ee137533f749')]

Note: with SimpleVectorStore, the extra ('SYNC', 'my-doc-id') in the async case comes from the default adelete() fallback to delete(). The final ('SYNC', '<node-id>') is the helper bug.
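The fallback described in the note can be reproduced without the library. The sketch below assumes a base class whose adelete() simply calls delete(); ToyBaseStore and ToyTrackingStore are hypothetical names used only for illustration.

```python
import asyncio
from typing import List, Tuple

delete_log: List[Tuple[str, str]] = []


class ToyBaseStore:
    """Hypothetical base class whose async method falls back to the sync one."""

    def delete(self, ref_doc_id: str) -> None:
        pass

    async def adelete(self, ref_doc_id: str) -> None:
        # Default fallback: no native async implementation, so call delete().
        self.delete(ref_doc_id)


class ToyTrackingStore(ToyBaseStore):
    """Logs each call before delegating, like TrackingVectorStore in the report."""

    def delete(self, ref_doc_id: str) -> None:
        delete_log.append(("SYNC", ref_doc_id))
        super().delete(ref_doc_id)

    async def adelete(self, ref_doc_id: str) -> None:
        delete_log.append(("ASYNC", ref_doc_id))
        await super().adelete(ref_doc_id)


asyncio.run(ToyTrackingStore().adelete("my-doc-id"))
print(delete_log)  # [('ASYNC', 'my-doc-id'), ('SYNC', 'my-doc-id')]
```

Because the base adelete() dispatches to the overridden delete(), one async call produces both an ASYNC and a SYNC entry for the same ref-doc ID. That pair is expected; only an entry carrying a node ID points to the helper bug.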

extent analysis

Fix Plan

To fix the issue, we need to remove the unnecessary self._vector_store.delete(node_id) calls from _delete_from_index_struct() and _adelete_from_index_struct().

Here are the steps:

  • Remove the self._vector_store.delete(node_id) call from _delete_from_index_struct().
  • Remove the self._vector_store.delete(node_id) call from _adelete_from_index_struct() as well. No replacement is needed, because adelete_ref_doc() already schedules adelete(ref_doc_id).

Example code (a sketch only; signatures and the surrounding logic are abbreviated, not copied from base.py):

def _delete_from_index_struct(self, node_id: str) -> None:
    # Removed: self._vector_store.delete(node_id)
    ...  # rest of the method remains the same

async def _adelete_from_index_struct(self, node_id: str) -> None:
    # Removed: self._vector_store.delete(node_id)
    # PR #21055 instead replaces it with `await self._vector_store.adelete(node_id)`,
    # which avoids the sync call but still passes a node ID to a ref-doc API.
    ...  # rest of the method remains the same

Verification

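To verify the fix, rerun the script from Steps to Reproduce and check the delete_log output. It should no longer contain the trailing ('SYNC', '<node-id>') entry in either case. With SimpleVectorStore, the async case still logs ('SYNC', 'my-doc-id') after ('ASYNC', 'my-doc-id'), because the default adelete() falls back to delete().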
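The check can be made mechanical with a small helper. This is a sketch: unexpected_delete_calls() is a hypothetical function, the "before" logs are copied from the report, and the "after" logs are the expected shape assuming only the helper call is removed.

```python
from typing import List, Tuple

DeleteLog = List[Tuple[str, str]]


def unexpected_delete_calls(log: DeleteLog, ref_doc_id: str) -> DeleteLog:
    """Return log entries whose ID is not the ref-doc ID that was deleted."""
    return [entry for entry in log if entry[1] != ref_doc_id]


# Logs copied from the report (before the fix).
sync_before = [
    ("SYNC", "my-doc-id"),
    ("SYNC", "afc5df02-1d56-40bc-bda2-fdd222f2fd83"),
]
async_before = [
    ("ASYNC", "my-doc-id"),
    ("SYNC", "my-doc-id"),
    ("SYNC", "b1aa8b5e-981f-4494-a9c0-ee137533f749"),
]

# Expected shape after the fix, assuming only the helper call is removed.
sync_after = [("SYNC", "my-doc-id")]
async_after = [("ASYNC", "my-doc-id"), ("SYNC", "my-doc-id")]

for name, log in [("sync", sync_after), ("async", async_after)]:
    leftovers = unexpected_delete_calls(log, "my-doc-id")
    print(name, "unexpected calls:", leftovers)
```

In a real test, delete_log from the reproduction script would be passed in place of the hard-coded lists, and an empty result would indicate that no node ID reached the ref-doc deletion API.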

Extra Tips

  • Make sure to test the fix with both sync and async delete methods to ensure the issue is fully resolved.
  • When reviewing other call sites, check that delete() and adelete() receive a ref-doc ID, and that node-wise deletion goes through delete_nodes() / adelete_nodes().
