langchain - 💡(How to fix) Fix `Chroma.update_document()` fails when `Document` metadata is omitted

I'm trying to update a Chroma document using update_document() without explicitly providing metadata.

I expect update_document() to behave consistently with add_documents(), which accepts Document objects without explicitly supplied metadata.

Instead, update_document() raises:

ValueError: Expected metadata to be a non-empty dict, got 0 metadata attributes in update.

The issue appears to originate from the update path inside:

https://github.com/langchain-ai/langchain/blob/master/libs/partners/chroma/langchain_chroma/vectorstores.py

Specifically, inside:

def update_documents(self, ids: list[str], documents: list[Document]) -> None:

the implementation forwards metadata directly:

metadata = [document.metadata for document in documents]

without normalizing missing metadata before passing it to the Chroma update API.

This creates inconsistent behavior because:

  • add_documents() accepts Document(page_content="...")
  • update_document() rejects the same Document, because its metadata reaches Chroma as an empty dict and the update API expects a non-empty one

A possible fix may be normalizing missing metadata before forwarding to Chroma, e.g.:

metadata = [
    document.metadata or {}
    for document in documents
]
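
One caveat about the snippet above: `Document.metadata` defaults to an empty dict when it is omitted, and `{} or {}` still evaluates to `{}`, so this normalization on its own may leave the value unchanged. The sketch below illustrates the point with a stand-in `Doc` dataclass (hypothetical, used only so the example runs without LangChain) and shows one alternative, mapping empty metadata to `None`. Whether the Chroma update API accepts `None` entries is an assumption and should be checked against the installed `chromadb` version.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Doc:
    """Hypothetical stand-in for langchain_core.documents.Document."""

    page_content: str
    metadata: dict[str, Any] = field(default_factory=dict)


def normalize_with_empty_dict(documents: list[Doc]) -> list[dict[str, Any]]:
    # The normalization proposed in the issue: falsy metadata becomes {}.
    return [doc.metadata or {} for doc in documents]


def normalize_with_none(documents: list[Doc]) -> list[Optional[dict[str, Any]]]:
    # Alternative sketch: falsy metadata becomes None instead of {}.
    # Assumption: the Chroma update call tolerates None entries.
    return [doc.metadata or None for doc in documents]


docs = [Doc("Updated document"), Doc("Tagged document", {"source": "manual"})]

as_dicts = normalize_with_empty_dict(docs)   # first entry is still an empty dict
as_optional = normalize_with_none(docs)      # first entry is None
```

The first helper returns an empty dict for the document without metadata, which is the same value that triggered the error, while the second replaces it with `None`.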

Error Message

Exception has occurred: ValueError
Expected metadata to be a non-empty dict, got 0 metadata attributes in update.
  File "my_path_that_you_will_not_know_about\bug_test.py", line 30, in <module>
    document_id="doc_1", document=Document(page_content="Updated document")
)
ValueError: Expected metadata to be a non-empty dict, got 0 metadata attributes in update.

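Root Cause

update_documents() builds its metadata list directly from document.metadata and passes it to the Chroma update API without normalizing it. A Document created without metadata therefore reaches Chroma with zero metadata attributes, which the update call rejects, while add_documents() accepts the same Document.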

Fix / Workaround

The reporter suggests normalizing missing metadata inside update_documents() before it is forwarded to the Chroma update API.
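
Until the update path handles empty metadata, a caller-side workaround is to make sure the Document passed to update_document() carries at least one metadata key. The following is a minimal sketch, assuming the error is triggered only by empty metadata; the helper name `ensure_metadata` and the placeholder key `source` are hypothetical and not part of LangChain or Chroma.

```python
from typing import Any, Optional


def ensure_metadata(
    metadata: Optional[dict[str, Any]],
    placeholder: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Return a non-empty copy of `metadata`, falling back to a placeholder."""
    if metadata:
        return dict(metadata)  # copy so the caller's dict is not shared
    # Hypothetical placeholder; choose a key that is meaningful in your data.
    return dict(placeholder or {"source": "unknown"})


# Intended use with the reproduction script (not executed here):
# vector_store.update_document(
#     document_id="doc_1",
#     document=Document(
#         page_content="Updated document",
#         metadata=ensure_metadata({}),
#     ),
# )

filled = ensure_metadata({})
kept = ensure_metadata({"topic": "chroma"})
```

The placeholder is sent along with the update, so it is likely to be stored with the document and to appear in later metadata filters.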

Code Example

import os
import getpass
from dotenv import load_dotenv

from langchain_openai import OpenAIEmbeddings
from langchain_core.documents import Document
from langchain_chroma import Chroma

# Step 1: Load OpenAI API Key
load_dotenv(override=True)
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key: ")


# Step 2: Create Embedder
embedder = OpenAIEmbeddings(model="text-embedding-3-large")


# Step 3: Create Vector Store
vector_store = Chroma(collection_name="test_collection", embedding_function=embedder)


# Step 4: Add Document
vector_store.add_documents(
    documents=[Document(page_content="Original document")], ids=["doc_1"]
)


# Step 5: Update Document
vector_store.update_document(
    document_id="doc_1", document=Document(page_content="Updated document")
)

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

No response

System Info

System Information

OS: Windows
OS Version: 10.0.26200
Python Version: 3.12.12 (main, Feb 12 2026, 00:40:26) [MSC v.1944 64 bit (AMD64)]

Package Information

langchain_core: 1.4.0
langchain: 1.3.0
langchain_community: 0.4.1
langsmith: 0.8.4
langchain_chroma: 1.1.0
langchain_classic: 1.0.7
langchain_huggingface: 1.2.2
langchain_openai: 1.2.1
langchain_postgres: 0.0.17
langchain_protocol: 0.0.15
langchain_tavily: 0.2.18
langchain_text_splitters: 1.1.2
langgraph_sdk: 0.3.14

Optional packages not installed

deepagents
deepagents-cli

Other Dependencies

aiohttp: 3.13.5
asyncpg: 0.31.0
chromadb: 1.5.9
dataclasses-json: 0.6.7
httpx: 0.28.1
httpx-sse: 0.4.3
huggingface-hub: 1.15.0
jsonpatch: 1.33
langgraph: 1.2.0
numpy: 2.4.4
openai: 2.36.0
openai-agents: 0.17.2
opentelemetry-api: 1.41.1
opentelemetry-sdk: 1.41.1
orjson: 3.11.9
packaging: 26.2
pgvector: 0.3.6
psycopg: 3.3.4
psycopg-pool: 3.3.1
pydantic: 2.13.4
pydantic-settings: 2.14.1
PyYAML: 6.0.3
pyyaml: 6.0.3
requests: 2.34.1
requests-toolbelt: 1.0.0
rich: 15.0.0
sentence-transformers: 5.5.0
SQLAlchemy: 2.0.49
sqlalchemy: 2.0.49
tenacity: 9.1.4
tiktoken: 0.12.0
tokenizers: 0.22.2
transformers: 5.8.1
typing-extensions: 4.15.0
uuid-utils: 0.15.0
websockets: 16.0
xxhash: 3.7.0
zstandard: 0.25.0
