LlamaIndex - ✅ (Solved) Fix [Bug]: RedisVectorStore returns corrupted node IDs in add/async_add due to misuse of .strip() [1 pull request, 1 comment, 2 participants]

GitHub stats

run-llama/llama_index#21483 (fetched 2026-04-27 05:28:57)

  • Comments: 1
  • Participants: 2
  • Timeline events: 6
  • Reactions: 0
  • Timeline (top): labeled ×2, commented ×1, cross-referenced ×1, mentioned ×1

Fix Action

Fixed

PR fix notes

PR #21484: fix(redis): prevent node ID corruption in async_add and add by replac…

Description (problem / solution / changelog)

Description

This PR fixes a silent data corruption bug in the RedisVectorStore related to how returned node IDs are parsed after loading documents into Redis.

In add and async_add, the code previously used .strip(prefix + separator) to remove the prefix from the full Redis key. Python's .strip(chars) treats its argument as a set of characters and trims any of them from both ends of the string; it does not remove an exact substring. As a result, it silently drops leading (and trailing) hex characters of the returned UUID whenever those characters also appear in the prefix or separator.

This PR replaces .strip() with .removeprefix() to safely and accurately remove the exact prefix string.
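
The difference can be reproduced without Redis. The following is a minimal sketch using the prefix and node ID from the linked issue; the ":" key separator is an assumption made for illustration, not something taken from the PR diff.

```python
# Assumption: ":" is the key separator. Prefix and node ID are from the issue's repro.
prefix = "semantic_cache_doc"
separator = ":"
node_id = "e7b95ae7-6369-404d-8287-1f4504121563"
key = prefix + separator + node_id

# str.strip(chars) trims any character found in `chars` from both ends,
# so the leading 'e' of the UUID is removed along with the prefix.
stripped = key.strip(prefix + separator)

# str.removeprefix (Python 3.9+) removes only the exact leading substring.
exact = key.removeprefix(prefix + separator)

print(stripped)  # 7b95ae7-6369-404d-8287-1f4504121563
print(exact)     # e7b95ae7-6369-404d-8287-1f4504121563
```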

Fixes #21483

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

  • I added new unit tests to cover this change
  • I believe this change is already covered by existing unit tests (Added local testing with custom prefixes containing overlapping hex characters to verify exact string removal).

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-integrations/vector_stores/llama-index-vector-stores-redis/llama_index/vector_stores/redis/base.py (modified, +2/-2)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-redis/pyproject.toml (modified, +1/-1)


Bug Description

In the RedisVectorStore integration, the add and async_add methods return a list of node IDs after successfully loading them into Redis. To return the original IDs, the code attempts to remove the Redis prefix from the full key using: key.strip(self._async_index.prefix + self._async_index.key_separator)

However, Python's .strip(chars) treats its argument as a set of characters and removes any combination of them from both ends of the string, not the exact substring. Since node IDs are typically UUIDs containing hex characters (a-f, 0-9), and prefixes often contain some of the same characters (e.g., doc, cache), .strip() silently deletes leading or trailing characters of the UUID whenever they also appear in the prefix or separator.

The exact substring removal should be done using .removeprefix().
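
To gauge how often this can bite, here is a rough sketch that lists which hex digits a given prefix puts at risk and estimates the share of affected IDs. The helper name hex_chars_at_risk is hypothetical, the ":" separator is an assumption, and the estimate assumes lowercase uuid4-style IDs whose first and last hex digits are uniformly random and independent.

```python
HEX_DIGITS = set("0123456789abcdef")

def hex_chars_at_risk(prefix: str, separator: str = ":") -> list[str]:
    """Hex digits that str.strip(prefix + separator) would also trim from an ID."""
    return sorted(HEX_DIGITS & set(prefix + separator))

at_risk = hex_chars_at_risk("semantic_cache_doc")

# Chance that one end of the ID starts or ends with an at-risk digit.
p_one_end = len(at_risk) / 16
# Chance that at least one end is trimmed (assumes both ends are independent).
p_corrupt = 1 - (1 - p_one_end) ** 2

# Illustrative (made-up) ID showing that both ends can be trimmed at once.
both_ends = "doc:c0ffee00-1111-4222-8333-44445555dcdc".strip("doc" + ":")

print(at_risk)    # ['a', 'c', 'd', 'e']
print(p_corrupt)  # 0.4375
print(both_ends)  # 0ffee00-1111-4222-8333-44445555
```

Under these assumptions, a prefix like semantic_cache_doc would corrupt a little under half of randomly generated IDs, which is consistent with the bug being silent rather than rare.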

Version

0.8.0

Steps to Reproduce

import asyncio
from llama_index.core.schema import TextNode
from llama_index.vector_stores.redis import RedisVectorStore
from redisvl.schema import IndexSchema

async def main():
    
    cache_schema = IndexSchema.from_dict({
        "index": {
            "name": "test_index",
            "prefix": "semantic_cache_doc"
        },
        "fields":[
            {"type": "tag", "name": "id"},
            {"type": "tag", "name": "doc_id"},
            {"type": "text", "name": "text"},
            {
                "type": "vector",
                "name": "vector",
                "attrs": {
                    "dims": 1536,
                    "algorithm": "FLAT",
                    "distance_metric": "COSINE"
                }
            }
        ]
    })
    
    vector_store = RedisVectorStore(
        schema=cache_schema,
        redis_url="redis://localhost:6379"
    )

    # Create a node with a UUID that starts with a hex character present in the prefix
    # 'e' is present in "semantic_cache_doc"
    node = TextNode(text="test", id_="e7b95ae7-6369-404d-8287-1f4504121563")

    node.embedding = [0.0] * 1536

    returned_ids = await vector_store.async_add([node])
    
    print(f"Original ID: {node.node_id}")
    print(f"Returned ID: {returned_ids[0]}")
    # Output will be: 7b95ae7-6369-404d-8287-1f4504121563 
    # Notice the missing 'e' at the beginning.

asyncio.run(main())

Relevant Logs/Tracebacks

Original ID: e7b95ae7-6369-404d-8287-1f4504121563
Returned ID: 7b95ae7-6369-404d-8287-1f4504121563

extent analysis

TL;DR

Replace the strip() method with removeprefix() to correctly remove the Redis prefix from the full key.

Guidance

  • Identify the lines of code using strip() for prefix removal and replace them with removeprefix().
  • Verify the fix by running the provided example code and checking that the original ID is correctly returned.
  • Be aware that removeprefix() is available in Python 3.9 and later; for earlier versions, consider using str.startswith() and slicing to remove the prefix.
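
For interpreters older than Python 3.9, the startswith-and-slice fallback mentioned in the guidance could look like the sketch below. The function name remove_prefix is hypothetical and not part of llama-index; the ":" separator in the usage lines is an assumption.

```python
def remove_prefix(key: str, prefix: str) -> str:
    """Return `key` without a leading `prefix`; leave it unchanged if the prefix is absent."""
    if prefix and key.startswith(prefix):
        return key[len(prefix):]
    return key

full_key = "semantic_cache_doc:e7b95ae7-6369-404d-8287-1f4504121563"
print(remove_prefix(full_key, "semantic_cache_doc:"))
print(remove_prefix("other:abc", "doc:"))
```

This mirrors the behavior of str.removeprefix: the key is returned untouched when it does not start with the prefix.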

Example

key = key.removeprefix(self._async_index.prefix + self._async_index.key_separator)

Notes

This fix assumes that the prefix is always present at the start of the key. If the prefix can be missing or appear elsewhere in the key, additional error checking may be necessary.
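
If a missing prefix should be treated as an error rather than passed through silently, one option is a strict variant along these lines. This is only a sketch: node_id_from_key is a hypothetical helper, not an API of llama-index or redisvl, and the ":" default separator is an assumption.

```python
def node_id_from_key(key: str, prefix: str, separator: str = ":") -> str:
    """Extract the node ID from a full key, failing loudly if the prefix is missing."""
    full_prefix = prefix + separator
    if not key.startswith(full_prefix):
        raise ValueError(f"key {key!r} does not start with {full_prefix!r}")
    return key[len(full_prefix):]

print(node_id_from_key("doc:c0ffee00-1111-4222-8333-44445555dcdc", "doc"))

try:
    node_id_from_key("unrelated:123", "doc")
except ValueError as exc:
    print(f"rejected: {exc}")
```

Whether to raise or to return the key unchanged is a design choice; raising surfaces configuration mismatches early, while pass-through matches what str.removeprefix does.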

Recommendation

Apply the fix by replacing strip() with removeprefix(), which removes only the exact prefix substring from the start of the key and leaves the rest of the node ID intact.
