LlamaIndex - ✅ (Solved) [Feature Request]: Add sqlite-vec Vector Store Integration [1 pull request, 3 comments, 2 participants]

GitHub stats
run-llama/llama_index#21064 (fetched 2026-04-08 00:58:21)
Comments: 3
Participants: 2
Timeline: 12
Reactions: 0
Timeline (top): commented ×3, mentioned ×3, subscribed ×3, labeled ×2

Fix Action

Fixed

PR fix notes

PR #21073: [Feature]: Add sqlite-vec vector store integration

Description (problem / solution / changelog)

Description

Adds a new vector store integration using sqlite-vec, a lightweight SQLite extension for vector similarity search. sqlite-vec is a zero-dependency, pure-SQLite solution that makes vector search accessible without requiring a separate database server.

Uses a two-table design: a vec0 virtual table for KNN search + a regular SQLite table for metadata/text storage. Supports cosine and L2 distance metrics, file-based and in-memory persistence, and comprehensive metadata filtering (EQ, NE, IN, NIN, GT/GTE/LT/LTE, TEXT_MATCH, IS_EMPTY, AND/OR/NOT conditions).
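
As a rough illustration of how nested metadata filters might map onto the regular SQLite metadata table, the sketch below compiles a small filter tree into a parameterized WHERE clause. The compile_filter helper, the dict-based filter shape, and the meta table with plain columns are assumptions made for this example; they are not taken from the PR, which works with LlamaIndex's own filter types.

```python
import sqlite3

# Hypothetical operator table: filter operator name -> SQL comparison
_COMPARE = {"EQ": "=", "NE": "!=", "GT": ">", "GTE": ">=", "LT": "<", "LTE": "<="}


def compile_filter(node):
    """Compile a nested filter dict into (sql_fragment, params)."""
    if "condition" in node:  # AND / OR / NOT over child filters
        condition = node["condition"]
        if condition not in ("AND", "OR", "NOT"):
            raise ValueError(f"unknown condition: {condition!r}")
        parts, params = [], []
        for child in node["filters"]:
            sql, child_params = compile_filter(child)
            parts.append(f"({sql})")
            params.extend(child_params)
        if condition == "NOT":
            # NOT negates the conjunction of its children
            return "NOT (" + " AND ".join(parts) + ")", params
        return f" {condition} ".join(parts), params

    key, op, value = node["key"], node["op"], node["value"]
    if not key.isidentifier():  # column names cannot be bound as parameters
        raise ValueError(f"unsafe column name: {key!r}")
    if op in ("IN", "NIN"):
        marks = ", ".join("?" * len(value))
        keyword = "IN" if op == "IN" else "NOT IN"
        return f"{key} {keyword} ({marks})", list(value)
    return f"{key} {_COMPARE[op]} ?", [value]


# Usage: author == "ann" AND NOT (year < 2022)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE meta (id INTEGER PRIMARY KEY, author TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO meta (author, year) VALUES (?, ?)",
    [("ann", 2020), ("bob", 2023), ("ann", 2024)],
)
flt = {
    "condition": "AND",
    "filters": [
        {"key": "author", "op": "EQ", "value": "ann"},
        {"condition": "NOT", "filters": [{"key": "year", "op": "LT", "value": 2022}]},
    ],
}
where, params = compile_filter(flt)
matching_ids = [
    row[0] for row in conn.execute(f"SELECT id FROM meta WHERE {where} ORDER BY id", params)
]
```

Values always travel as bound parameters; only column names and operators are interpolated, which is why both are validated before they reach the SQL string.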

Dependency: sqlite-vec>=0.1.1

Fixes #21064

New Package?

Did I fill in the tool.llamahub section in the pyproject.toml and provide a detailed README.md for my new integration or package?

  • Yes
  • No

Version Bump?

Did I bump the version in the pyproject.toml file of the package I am updating? (Except for the llama-index-core package)

  • Yes
  • No

Type of Change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • I added new unit tests to cover this change

43 tests covering both in-memory and disk-persisted modes, all metadata filter operators including nested AND/OR/NOT conditions, async operations, and full node CRUD (add, get, delete by node_id, delete by ref_doc_id, clear).

Suggested Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added Google Colab support for the newly added notebooks.
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I ran uv run make format; uv run make lint to appease the lint gods

Changed files

  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/.gitignore (added, +147/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/Makefile (added, +20/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/README.md (added, +24/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/llama_index/py.typed (added, +0/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/llama_index/vector_stores/sqlite_vec/__init__.py (added, +4/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/llama_index/vector_stores/sqlite_vec/base.py (added, +549/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/pyproject.toml (added, +70/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/tests/__init__.py (added, +0/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/tests/test_sqlite_vec.py (added, +519/-0)
  • llama-index-integrations/vector_stores/llama-index-vector-stores-sqlite-vec/uv.lock (added, +4380/-0)

Feature Description

Add support for sqlite-vec as a vector store backend in LlamaIndex.

sqlite-vec is a lightweight, zero-dependency SQLite extension for vector similarity search written in pure C. It supports float, int8, and binary vector types, and runs anywhere SQLite runs — including Python, Node.js, and edge environments.
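
To make the vector types concrete, here is a small sketch of how float and int8 vectors can be packed into compact byte blobs, and how much storage each element type needs per vector. The helper names are hypothetical and use only the standard library; the sqlite-vec Python package ships its own serializers, which should be preferred in real code.

```python
import struct


def pack_float32(values):
    """Pack floats as raw 32-bit floats, 4 bytes per element."""
    return struct.pack(f"{len(values)}f", *values)


def pack_int8(values):
    """Pack integers in [-128, 127] as one signed byte each."""
    return struct.pack(f"{len(values)}b", *values)


def bytes_per_vector(dim):
    """Storage per vector by element type (bit vectors use 1 bit per element)."""
    return {"float32": 4 * dim, "int8": dim, "bit": dim // 8}


blob = pack_float32([0.1, 0.2, 0.3])   # 3 elements * 4 bytes
footprint = bytes_per_vector(768)      # per-vector size for a 768-dim embedding
```

The footprint comparison is the main reason to consider int8 or binary vectors: at the same dimension they take a quarter and a thirty-second of the float32 size, at the cost of precision.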

The proposed integration would implement a SqliteVecVectorStore class extending BasePydanticVectorStore, supporting:

  • Vector storage and cosine/L2 similarity search via SQLite virtual tables
  • Metadata filtering on stored documents
  • Persistent and in-memory modes (just like SQLite itself)
  • Standard LlamaIndex operations: add(), query(), delete(), get_nodes(), clear()
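
For readers unfamiliar with the two metrics, the following pure-Python sketch shows L2 distance, cosine distance (taken here as 1 minus cosine similarity), and a brute-force top-k search. It illustrates the math only; the function names are made up for this example, and the integration would delegate the search to SQLite virtual tables rather than Python loops.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """1 - cosine similarity; ignores vector magnitude."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, vectors, k, distance=l2_distance):
    """Return the k nearest (index, distance) pairs, closest first."""
    scored = sorted((distance(query, v), i) for i, v in enumerate(vectors))
    return [(i, d) for d, i in scored[:k]]


query = [1.0, 0.0]
vectors = [[5.0, 0.0], [0.9, 0.5], [0.0, 1.0]]
nearest_l2 = top_k(query, vectors, 2)
nearest_cosine = top_k(query, vectors, 2, distance=cosine_distance)
```

The two metrics can rank the same data differently: vector 0 points in exactly the query's direction, so it is the closest under cosine distance, but its larger magnitude makes it the farthest under L2.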

Reason

LlamaIndex doesn’t currently support a SQLite-based vector store integration. While there are embedded options like DuckDB and FAISS, SQLite is the most widely used database engine globally and is already a part of many Python projects.

For users looking for a simple, local-first RAG pipeline without relying on external services, the options are limited:

  1. They can run a separate vector database (like Chroma, Qdrant, or Milvus).
  2. They can use DuckDB, which might not fit into their existing setup.
  3. They can opt for FAISS, but it doesn’t offer built-in persistence or metadata filtering.

This is the gap sqlite-vec fills: it requires no server or extra infrastructure, and it uses a database format that developers are already familiar with.

Value of Feature

  • No setup needed: Just use a local .db file — perfect for testing, prototyping, CI/CD, and edge deployments.
  • Familiar to developers: SQLite is the most popular database engine worldwide, so you can easily inspect and debug your data with common SQLite tools.
  • Lightweight: It’s a pure C extension with no extra dependencies, making it great for low-resource environments like IoT, mobile apps, and serverless setups.
  • Bridges the gap: It provides a simple, persistent option between in-memory stores like FAISS and heavier SQL databases like DuckDB and Postgres.
  • Widely used in Python: The sqlite-vec library is already being used in LangChain and other LLM frameworks, so users of LlamaIndex will find it useful too.
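
The difference between the file-based and in-memory modes can be seen with plain SQLite, without any vector extension. This sketch uses only the standard library; the docs table and the helper names are invented for the example.

```python
import os
import sqlite3
import tempfile

_SCHEMA = "CREATE TABLE IF NOT EXISTS docs (id INTEGER PRIMARY KEY, text TEXT)"


def add_row(db_path, text):
    """Open db_path, insert one row, commit, and close."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(_SCHEMA)
        conn.execute("INSERT INTO docs (text) VALUES (?)", (text,))
        conn.commit()
    finally:
        conn.close()


def count_rows(db_path):
    """Open db_path on a fresh connection and return the row count."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(_SCHEMA)
        return conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "store.db")
    add_row(path, "hello")
    file_count = count_rows(path)  # the row survives a reconnect

add_row(":memory:", "hello")
memory_count = count_rows(":memory:")  # every :memory: connection starts empty
```

A file path gives persistence across connections and processes, while ":memory:" gives a throwaway database that lives only as long as its connection, which suits unit tests.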

I am willing to submit a PR for this integration.

extent analysis

Fix Plan

To integrate sqlite-vec as a vector store backend in LlamaIndex, we need to create a SqliteVecVectorStore class. Here are the steps:

  • Implement the SqliteVecVectorStore class extending BasePydanticVectorStore
  • Add methods for vector storage and cosine/L2 similarity search via SQLite virtual tables
  • Implement metadata filtering on stored documents
  • Support persistent and in-memory modes

Example Code

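import sqlite3

import sqlite_vec  # pip install sqlite-vec


class SqliteVecVectorStore:
    """Simplified standalone sketch of the two-table design.

    The actual integration extends BasePydanticVectorStore and operates on
    LlamaIndex nodes; this class only demonstrates the underlying SQL.
    """

    def __init__(self, db_path, dim):
        self.conn = sqlite3.connect(db_path)
        # Load the sqlite-vec extension into this connection
        # (requires a Python build that allows loading SQLite extensions)
        self.conn.enable_load_extension(True)
        sqlite_vec.load(self.conn)
        self.conn.enable_load_extension(False)
        # vec0 virtual table for KNN search
        self.conn.execute(
            f"CREATE VIRTUAL TABLE IF NOT EXISTS vec_index USING vec0(embedding float[{int(dim)}])"
        )
        # Regular table for metadata, sharing rowids with vec_index
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS vectors (id INTEGER PRIMARY KEY, metadata TEXT)"
        )

    def add(self, vectors, metadata):
        # Insert each metadata row, then its embedding under the same rowid
        for vector, meta in zip(vectors, metadata):
            cur = self.conn.execute("INSERT INTO vectors (metadata) VALUES (?)", (meta,))
            self.conn.execute(
                "INSERT INTO vec_index (rowid, embedding) VALUES (?, ?)",
                (cur.lastrowid, sqlite_vec.serialize_float32(vector)),
            )
        self.conn.commit()

    def query(self, query_vector, top_k):
        # KNN search: the top_k nearest embeddings (vec0 defaults to L2 distance)
        hits = self.conn.execute(
            "SELECT rowid, distance FROM vec_index WHERE embedding MATCH ? AND k = ? ORDER BY distance",
            (sqlite_vec.serialize_float32(query_vector), top_k),
        ).fetchall()
        # Attach the stored metadata to each hit
        results = []
        for rowid, distance in hits:
            meta = self.conn.execute("SELECT metadata FROM vectors WHERE id = ?", (rowid,)).fetchone()[0]
            results.append((rowid, distance, meta))
        return results

    def delete(self, ids):
        # Delete embeddings and metadata by id
        params = [(i,) for i in ids]
        self.conn.executemany("DELETE FROM vec_index WHERE rowid = ?", params)
        self.conn.executemany("DELETE FROM vectors WHERE id = ?", params)
        self.conn.commit()

    def get_nodes(self):
        # Get all stored ids and metadata
        return self.conn.execute("SELECT id, metadata FROM vectors").fetchall()

    def clear(self):
        # Remove all embeddings and metadata
        self.conn.execute("DELETE FROM vec_index")
        self.conn.execute("DELETE FROM vectors")
        self.conn.commit()

Verification

To verify the fix, you can test the SqliteVecVectorStore class with sample data and queries. Vectors are passed as lists of floats whose length matches the dim given to the constructor. For example:

store = SqliteVecVectorStore("test.db", dim=3)
vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
metadata = ["meta1", "meta2", "meta3"]
store.add(vectors, metadata)
results = store.query([1.0, 0.0, 0.0], 2)
print(results)

On a fresh test.db, this should print the two nearest vectors as (id, distance, metadata) tuples, closest first.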

Extra Tips

  • Make sure to handle errors and exceptions properly in the SqliteVecVectorStore class.
  • Consider adding support for other vector
