llamaIndex - ✅(Solved) Fix FaissMapVectorStore uses eval() to load persisted id map [2 pull requests, 1 comments, 2 participants]

GitHub stats
run-llama/llama_index#20677 (fetched 2026-04-08 00:31:34)
Comments: 1
Participants: 2
Timeline events: 9
Reactions: 0
Timeline (top): cross-referenced ×2, mentioned ×2, subscribed ×2, closed ×1

Fix Action

Fixed

PR fix notes

PR #20688: Fix/faiss map replace eval with json

Description (problem / solution / changelog)

Description

Replace unsafe eval() with json.loads in FaissMapVectorStore id map persistence.

FaissMapVectorStore.from_persist_path previously used eval(f.read()) to deserialize the id map file. The persist side wrote the map with f.write(str(id_map)), so eval() could read it back, but eval() would also execute arbitrary code if the file were modified, which is a security risk.

This PR:

  • Replaces str()/eval() serialization with json.dump/json.loads for the id map
  • Adds ast.literal_eval fallback for backward compatibility with files saved in the old str() format, so existing users are not broken
  • Converts integer faiss_id keys to/from strings for JSON compatibility
  • Fixes add_with_ids to pass a numpy array instead of a bare int for compatibility with faiss-cpu >= 1.12
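
The key conversion mentioned above is needed because JSON object keys are always strings. A minimal sketch of the effect, using a made-up two-entry map (the node id values are hypothetical and not taken from the PR):

```python
import json

# Hypothetical id map: integer faiss ids -> node ids (values are made up).
id_map = {0: "node-a", 1: "node-b"}

# json.dumps turns the integer keys into strings without complaint.
encoded = json.dumps(id_map)

# Loading returns string keys, so a lookup by integer id would miss.
decoded = json.loads(encoded)

# Converting the keys back to int restores the original mapping.
restored = {int(k): v for k, v in decoded.items()}
```

Without the final conversion step, decoded[0] would raise a KeyError even though the data round-tripped successfully.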

Fixes #20677

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • [-] No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • [-] No

Type of Change

Please delete options that are not relevant.

  • [-] Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.

  • I added new unit tests to cover this change
  • [-] I believe this change is already covered by existing unit tests

Suggested Checklist:

  • [-] I have performed a self-review of my own code
  • [-] I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • [-] My changes generate no new warnings
  • [-] I have added tests that prove my fix is effective or that my feature works
  • [-] New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/llama_index/vector_stores/faiss/map_store.py (modified, +30/-6)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-tencentvectordb/llama_index/vector_stores/tencentvectordb/__init__.py (modified, +2/-1)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-tencentvectordb/llama_index/vector_stores/tencentvectordb/base.py (modified, +5/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-tencentvectordb/pyproject.toml (modified, +1/-1)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-tencentvectordb/tests/test_vector_stores_tencentvectordb.py (modified, +36/-1)

PR #20675: Replace eval() with json.loads in FaissMapVectorStore persistence

Description (problem / solution / changelog)

Description

FaissMapVectorStore.from_persist_path currently deserializes the id map file using eval(f.read()), which will execute arbitrary Python code if the file contents have been tampered with. This is a straightforward code-execution-via-deserialization issue.

Resolves #20677

Changes

Persist side (persist()):

  • Replaced f.write(str(id_map)) with json.dump(id_map, f) for safe, structured serialization

Load side (from_persist_path()):

  • Replaced eval(f.read()) with json.loads(raw)
  • Added ast.literal_eval() as a fallback so that existing files saved with the old str() format can still be loaded without breakage
  • Added key type normalization after loading, since JSON converts all dict keys to strings (the faiss_id keys need to be ints for index lookups to work)
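
The load-side changes listed above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the PR: parse_id_map is a hypothetical helper name, and the values are assumed to be string node ids.

```python
import ast
import json
from typing import Dict


def parse_id_map(raw: str) -> Dict[int, str]:
    """Parse persisted id map text (hypothetical helper, not the library's API)."""
    try:
        # New format: proper JSON written by json.dump.
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Legacy format: str(dict). literal_eval accepts only Python literals,
        # so it cannot run arbitrary code the way eval() can.
        data = ast.literal_eval(raw)
    # JSON keys arrive as strings; legacy keys are already ints. Normalize both.
    return {int(k): v for k, v in data.items()}
```

Both parse_id_map('{"0": "a"}') and parse_id_map("{0: 'a'}") return the same int-keyed dict, which is what lets old and new files coexist.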

Backward Compatibility

Files persisted with the old str() format will still load correctly via the ast.literal_eval() fallback. New files will be written as proper JSON. No API changes.

Testing

The existing test_persist_and_load test in tests/test_vector_stores_faiss.py covers the persist/load round-trip and verifies that maps and queries remain correct after loading.

Changed files

  • llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/llama_index/vector_stores/faiss/map_store.py (modified, +16/-5)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/pyproject.toml (modified, +1/-1)

RAW_BUFFER

FaissMapVectorStore.from_persist_path deserializes the id map file using eval(f.read()):

https://github.com/run-llama/llama_index/blob/main/llama-index-integrations/vector_stores/llama-index-vector-stores-faiss/llama_index/vector_stores/faiss/map_store.py#L285

The persist side writes the map with f.write(str(id_map)), so the file contains a Python dict literal. Using eval() to read it back works, but it would also execute arbitrary code if the file were modified.
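
To make that risk concrete, here is a small sketch comparing eval() with ast.literal_eval() on tampered input. The tampered string and the calls list are illustrative assumptions; the side effect is deliberately harmless.

```python
import ast

# Stand-in for a modified id map file: an expression with a side effect
# that still evaluates to a plausible-looking dict.
calls = []
tampered = "calls.append('executed') or {0: 'node-a'}"

# eval() runs the whole expression, so the side effect happens.
result = eval(tampered)

# ast.literal_eval() rejects anything that is not a plain literal.
try:
    ast.literal_eval(tampered)
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
```

After this runs, calls contains one entry (proof that eval() executed the injected call) while literal_eval() refused the same input.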

Switching to json.dump/json.loads for the round-trip is a drop-in fix. An ast.literal_eval fallback handles files already saved with the old str() format, so existing users aren't broken.

I put up a PR for this: #20675

cc @bfoley13 @logan-markewich

extent analysis

Fix Plan

To address the security vulnerability, we will replace eval() with json.loads() for deserializing the id map file and use json.dump() for serialization. We will also add a fallback using ast.literal_eval() to support existing files.

Step-by-Step Solution

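  • Replace f.write(str(id_map)) with json.dump(id_map, f) for serialization.
  • Replace eval(f.read()) with json.loads(raw), where raw is the file content read once with f.read().
  • Wrap the json.loads(raw) call in try/except json.JSONDecodeError and fall back to ast.literal_eval(raw), so files saved in the old str() format still load. Parsing the already-read string avoids calling f.read() a second time on an exhausted file handle, which would return an empty string.
  • Convert the loaded keys back to int, since JSON stores all dict keys as strings.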

Example Code

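import ast
import json

# id_map is the dict to persist (integer faiss_id keys)

# Serialization: json.dump writes the integer keys as strings
with open('id_map_file', 'w') as f:
    json.dump(id_map, f)

# Deserialization: read the file once, try JSON first
with open('id_map_file', 'r') as f:
    raw = f.read()
try:
    id_map = json.loads(raw)
except json.JSONDecodeError:
    # Legacy str() format: literal_eval parses literals without executing code
    id_map = ast.literal_eval(raw)

# Restore integer keys (JSON turns every dict key into a string)
id_map = {int(k): v for k, v in id_map.items()}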

Verification

Test the new serialization and deserialization code with both new and existing files to ensure compatibility and security.
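
One way to run that check is sketched below with temporary files. The loader is a hypothetical stand-in for the fixed load path, and the sample data and file names are made up.

```python
import ast
import json
import os
import tempfile


def load_id_map(path):
    """Hypothetical loader: JSON first, legacy str() format second."""
    with open(path, "r") as f:
        raw = f.read()
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = ast.literal_eval(raw)
    return {int(k): v for k, v in data.items()}


expected = {0: "node-a", 7: "node-b"}  # made-up sample data

with tempfile.TemporaryDirectory() as tmp:
    new_path = os.path.join(tmp, "new_format.json")
    old_path = os.path.join(tmp, "old_format.txt")
    bad_path = os.path.join(tmp, "tampered.txt")

    with open(new_path, "w") as f:
        json.dump(expected, f)  # new JSON format
    with open(old_path, "w") as f:
        f.write(str(expected))  # legacy str() format
    with open(bad_path, "w") as f:
        f.write("__import__('os').getcwd()")  # not a literal

    from_new = load_id_map(new_path)
    from_old = load_id_map(old_path)

    # A tampered file should be rejected instead of executed.
    try:
        load_id_map(bad_path)
        tampered_rejected = False
    except (ValueError, SyntaxError):
        tampered_rejected = True
```

The three cases cover the compatibility claim (old and new files load to the same map) and the security claim (a non-literal payload raises instead of running).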

Extra Tips

Remember to update any dependent code or tests to reflect the changes in serialization and deserialization. Also consider validating the loaded value (for example, checking that it is a dict whose keys convert to int) and handling parse errors explicitly, to further improve security and robustness.
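
As a rough sketch of what such validation might look like: validate_id_map is a hypothetical helper, and the assumption that values are string node ids is not confirmed by the issue or the PRs.

```python
from typing import Any, Dict


def validate_id_map(data: Any) -> Dict[int, str]:
    """Hypothetical validator: check the shape of a loaded id map."""
    if not isinstance(data, dict):
        raise ValueError(f"expected a dict, got {type(data).__name__}")
    validated = {}
    for key, value in data.items():
        try:
            faiss_id = int(key)
        except (TypeError, ValueError) as exc:
            raise ValueError(f"non-integer id key: {key!r}") from exc
        # Assumption: values are node ids stored as strings.
        if not isinstance(value, str):
            raise ValueError(f"non-string value for id {faiss_id}: {value!r}")
        validated[faiss_id] = value
    return validated
```

Running the parsed result through a check like this turns a malformed file into a clear ValueError at load time instead of a confusing lookup failure later.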
