langchain - ✅(Solved) Fix WeaviateVectorStore fails silently on index [4 pull requests, 2 comments, 2 participants]

langchain-ai/langchain#35572 · Fetched 2026-04-08 00:25:30

The add_documents / add_texts methods in WeaviateVectorStore log errors during indexing but still return a valid list of chunk IDs, giving the impression that indexing was successful, especially in FastAPI apps where the LangChain logger is not attached to the runtime. IMHO there should at least be something like failed_objects, success_objects = store.add_documents(), so that the caller of add_documents can tell which chunks failed:

<img width="1776" height="866" alt="Image" src="https://github.com/user-attachments/assets/b114cd15-d7a3-483f-b7f5-982dfbd6fa29" />
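Until the method itself reports failures, one caller-side way to notice these log lines is to attach a handler to the integration's logger. This is only a sketch: it assumes the module logger is named langchain_weaviate.vectorstores (the usual logging.getLogger(__name__) convention, not confirmed in this issue), and ErrorCollector / attach_collector are hypothetical helper names.

```python
import logging


class ErrorCollector(logging.Handler):
    """Hypothetical helper: keeps the message of every ERROR-level record."""

    def __init__(self) -> None:
        super().__init__(level=logging.ERROR)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


# Assumed logger name, following the logging.getLogger(__name__) convention.
WEAVIATE_LOGGER = "langchain_weaviate.vectorstores"


def attach_collector(logger_name: str = WEAVIATE_LOGGER) -> ErrorCollector:
    """Attach a collector so indexing errors can be inspected after a call."""
    collector = ErrorCollector()
    logging.getLogger(logger_name).addHandler(collector)
    return collector
```

After an add_texts call, a non-empty collector.messages would indicate that at least one object was logged as failed. This only surfaces the log text; it does not say which returned IDs are valid.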

Error Message

logger.error(err_message)

Root Cause

add_texts reads self._client.batch.failed_objects after the batch completes and logs each failure with logger.error, but it never removes the failed UUIDs from ids and never raises. The caller therefore receives the full ID list whether or not every object was indexed.

PR fix notes

PR #260: fix(weaviate): raise on batch indexing failures in add_texts

Description (problem / solution / changelog)

Summary

add_texts() now raises a ValueError when Weaviate batch indexing fails, instead of silently logging errors and returning all IDs as if everything succeeded.

Fixes langchain-ai/langchain#35572

Problem

WeaviateVectorStore.add_texts() reads self._client.batch.failed_objects after batch indexing, logs any failures, but still returns the full list of IDs — giving callers no indication that some objects weren't actually indexed. This is especially problematic in FastAPI apps where the LangChain logger may not be attached.

Changes

libs/weaviate/langchain_weaviate/vectorstores.py

  • After logging failures, filter failed UUIDs out of the returned ids list
  • Raise ValueError with a descriptive message: count of failures, total attempted, and first error
  • Attach successful_ids attribute to the exception for partial recovery (getattr(e, 'successful_ids', []))
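The three changes listed above can be sketched in isolation as follows. This is not the PR's diff: FakeFailedObject is a stand-in for the entries of failed_objects (only the two attributes used in the original log message are modelled), and raise_on_failures is a hypothetical helper name.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class FakeFailedObject:
    """Stand-in for an entry of client.batch.failed_objects (illustrative)."""

    original_uuid: str
    message: str


def raise_on_failures(
    ids: Sequence[str], failed: Sequence[FakeFailedObject]
) -> List[str]:
    """Return all ids if nothing failed, otherwise raise ValueError."""
    if not failed:
        return list(ids)
    failed_ids = {str(obj.original_uuid) for obj in failed}
    successful = [i for i in ids if str(i) not in failed_ids]
    err = ValueError(
        f"Failed to index {len(failed)} of {len(ids)} objects. "
        f"First error: {failed[0].message}"
    )
    # Attach the surviving IDs so callers can recover partially.
    err.successful_ids = successful  # type: ignore[attr-defined]
    raise err


# Caller side: recover the IDs that did make it into the index.
try:
    raise_on_failures(["a", "b"], [FakeFailedObject("b", "boom")])
except ValueError as e:
    recovered = getattr(e, "successful_ids", [])
```

Reading the attribute with getattr and a default keeps the caller working even when the ValueError comes from somewhere that does not set successful_ids.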

libs/weaviate/tests/unit_tests/test_vectorstores_unit.py

  • Added test_add_texts_raises_value_error_on_batch_failures() — mocks one failed object out of two, asserts ValueError is raised with correct message and successful_ids contains only the non-failed ID

Backward Compatibility

This is a behavior change from silent failure to raising. However, silent data loss on batch indexing is a bug, not a feature — callers relying on the previous behavior were already losing data without knowing it.

Changed files

  • libs/weaviate/langchain_weaviate/vectorstores.py (modified, +11/-0)
  • libs/weaviate/tests/unit_tests/test_vectorstores_unit.py (modified, +40/-0)

PR #261: fix: raise ValueError when batch operations fail in add_texts

Description (problem / solution / changelog)

Description

Fixes langchain-ai/langchain#35572

Currently, WeaviateVectorStore.add_texts() only logs errors when batch operations fail, but still returns all IDs (including failed ones). This causes silent failures, especially in FastAPI apps where logs aren't visible.

Changes

  • Modified add_texts() to raise ValueError with detailed error messages when any objects fail to index
  • Added unit test to verify the new behavior

Impact

Breaking change: Code that previously ignored failures will now get exceptions

Benefit: Developers will immediately know when indexing fails

Example

Before: Silent failure

After: Raises ValueError with details
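For calling code, the move from silent failure to an exception mostly means adding a handler. A minimal sketch, assuming add_texts raises ValueError as this PR describes; index_documents and the 502 status code are illustrative choices, not part of the PR:

```python
from typing import Any, Dict, List, Tuple


def index_documents(store: Any, texts: List[str]) -> Tuple[int, Dict[str, Any]]:
    """Index texts and map the outcome to an HTTP-style (status, body) pair."""
    try:
        ids = store.add_texts(texts)
    except ValueError as exc:
        # Indexing failures now surface here instead of only in the logs.
        return 502, {"error": str(exc)}
    return 200, {"indexed": len(ids), "ids": ids}
```

In a web handler the returned pair would be turned into the framework's own response type; the point is only that the failure becomes visible to the client of the endpoint.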

Changed files

  • libs/weaviate/langchain_weaviate/vectorstores.py (modified, +12/-5)
  • libs/weaviate/tests/unit_tests/test_vectorstores_unit.py (modified, +25/-0)

PR #35632: fix: WeaviateVectorStore fails silently on index

Description (problem / solution / changelog)

Summary

This addresses #35572.

What changed

Implemented a fix based on the issue description. See the diff for specifics.

Testing

  • Verified against the existing test suite
  • Checked that the fix addresses the reported behavior

Changed files

  • libs/partners/weaviate/langchain_weaviate/__init__.py (added, +5/-0)
  • libs/partners/weaviate/langchain_weaviate/_math.py (added, +138/-0)
  • libs/partners/weaviate/langchain_weaviate/utils.py (added, +52/-0)
  • libs/partners/weaviate/langchain_weaviate/vectorstores.py (added, +563/-0)
  • libs/partners/weaviate/pyproject.toml (added, +107/-0)
  • libs/partners/weaviate/tests/__init__.py (added, +0/-0)
  • libs/partners/weaviate/tests/unit_tests/__init__.py (added, +0/-0)
  • libs/partners/weaviate/tests/unit_tests/test_vectorstores_unit.py (added, +307/-0)

PR #262: fix: raise on WeaviateVectorStore batch indexing failures

Description (problem / solution / changelog)

Summary

Raise ValueError when Weaviate batch indexing fails instead of silently returning IDs. Fixes langchain-ai/langchain#35572.

Root Cause

add_texts logged errors via logger.error but still returned all IDs, giving false success in FastAPI apps where logs are not visible.

Fix

Check failed_objs and raise ValueError with details before returning.

Fixes langchain-ai/langchain#35572

Changed files

  • libs/weaviate/langchain_weaviate/vectorstores.py (modified, +4/-6)

Code Example

def add_texts(
    self,
    texts: Iterable[str],
    metadatas: Optional[List[dict]] = None,
    tenant: Optional[str] = None,
    **kwargs: Any,
) -> List[str]:
    """Upload texts with metadata (properties) to Weaviate."""
    from weaviate.util import get_valid_uuid  # type: ignore

    # Create the tenant on first use.
    if tenant and not self._does_tenant_exist(tenant):
        logger.info(
            f"Tenant {tenant} does not exist in index {self._index_name}. "
            "Creating tenant."
        )
        tenant_objs = [weaviate.classes.tenants.Tenant(name=tenant)]
        self._collection.tenants.create(tenants=tenant_objs)

    ids = []
    embeddings: Optional[List[List[float]]] = None
    if self._embedding:
        embeddings = self._embedding.embed_documents(list(texts))

    with self._client.batch.dynamic() as batch:
        for i, text in enumerate(texts):
            data_properties = {self._text_key: text}
            if metadatas is not None:
                for key, val in metadatas[i].items():
                    data_properties[key] = _json_serializable(val)

            # Allow for ids (consistent w/ other methods)
            # Or uuids (backwards compatible w/ existing arg)
            # If the UUID of one of the objects already exists
            # then the existing object will be replaced by the new object.
            _id = get_valid_uuid(uuid4())
            if "uuids" in kwargs:
                _id = kwargs["uuids"][i]
            elif "ids" in kwargs:
                _id = kwargs["ids"][i]

            batch.add_object(
                collection=self._index_name,
                properties=data_properties,
                uuid=_id,
                vector=embeddings[i] if embeddings else None,
                tenant=tenant,
            )

            # Every ID is recorded here, before the outcome is known.
            ids.append(_id)

    # Reported problem: failures are only logged, and ids is returned
    # unchanged, so it still contains the UUIDs of the failed objects.
    failed_objs = self._client.batch.failed_objects
    for obj in failed_objs:
        err_message = (
            f"Failed to add object: {obj.original_uuid}\nReason: {obj.message}"
        )

        logger.error(err_message)

    return ids
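On a version that still behaves like the method above, a caller could inspect the same failed_objects attribute right after the call. The sketch below is a workaround under assumptions, not a supported API: it reaches into the private store._client attribute seen in the method, assumes failed_objects still reflects the batch that add_texts just ran, and add_texts_checked is a hypothetical helper name. The demo at the bottom uses SimpleNamespace fakes rather than a real store.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def add_texts_checked(
    store: Any, texts: Iterable[str], **kwargs: Any
) -> Tuple[List[str], List[Any]]:
    """Call store.add_texts, then split its result using failed_objects."""
    ids = store.add_texts(list(texts), **kwargs)
    # Private attribute access: may break between releases.
    failed = list(store._client.batch.failed_objects)
    failed_ids = {str(obj.original_uuid) for obj in failed}
    ok_ids = [i for i in ids if str(i) not in failed_ids]
    return ok_ids, failed


# Demo with fakes: one of two objects is reported as failed.
fake_failed = [SimpleNamespace(original_uuid="b", message="boom")]
fake_store = SimpleNamespace(
    add_texts=lambda texts, **kw: ["a", "b"],
    _client=SimpleNamespace(batch=SimpleNamespace(failed_objects=fake_failed)),
)
ok_ids, failed = add_texts_checked(fake_store, ["x", "y"])
```

If several requests share one client concurrently, the attribute may describe another request's batch, so this check is best treated as a stopgap.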

---

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

No response

System Info

python -m langchain_core.sys_info

System Information

OS: Linux
OS Version: #1 SMP Tue Nov 5 00:21:55 UTC 2024
Python Version: 3.12.7 | packaged by Anaconda, Inc. | (main, Oct 4 2024, 13:27:36) [GCC 11.2.0]

Package Information

langchain_core: 1.2.6
langchain: 1.2.0
langchain_community: 0.4.1
langsmith: 0.3.45
langchain_anthropic: 1.0.0
langchain_classic: 1.0.1
langchain_openai: 1.1.6
langchain_text_splitters: 1.1.0
langchain_weaviate: 0.0.6
langgraph_api: 0.6.24
langgraph_cli: 0.4.11
langgraph_runtime_inmem: 0.21.1
langgraph_sdk: 0.3.1

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.12.14
anthropic: 0.75.0
blockbuster: 1.5.25
click: 8.1.8
cloudpickle: 3.1.1
cryptography: 44.0.1
dataclasses-json: 0.6.7
grpcio: 1.76.0
grpcio-health-checking: 1.76.0
grpcio-tools: 1.75.1
httpx: 0.28.1
httpx-sse: 0.4.0
jsonpatch: 1.33
jsonschema-rs: 0.29.1
langgraph: 1.0.5
langgraph-checkpoint: 3.0.1
numpy: 2.2.6
openai: 2.14.0
opentelemetry-api: 1.39.0
opentelemetry-exporter-otlp-proto-http: 1.39.0
opentelemetry-sdk: 1.39.0
orjson: 3.11.5
packaging: 25.0
protobuf: 6.33.2
pydantic: 2.12.5
pydantic-settings: 2.12.0
pyjwt: 2.10.1
pytest: 8.3.4
python-dotenv: 1.1.0
pyyaml: 6.0.3
PyYAML: 6.0.3
requests: 2.32.5
requests-toolbelt: 1.0.0
rich: 13.9.4
simsimd: 6.5.12
sqlalchemy: 2.0.40
SQLAlchemy: 2.0.40
sse-starlette: 2.1.3
starlette: 0.46.2
structlog: 25.2.0
tenacity: 9.1.2
tiktoken: 0.8.0
truststore: 0.10.4
typing-extensions: 4.15.0
uuid-utils: 0.12.0
uvicorn: 0.34.0
watchfiles: 1.0.4
weaviate-client: 4.14.0
zstandard: 0.23.0

extent analysis

Problem Summary

The add_texts method in WeaviateVectorStore logs errors during indexing but still returns a valid list of chunk IDs, making it difficult to determine which chunks failed.

Root Cause Analysis

The root cause is that add_texts does not act on failed objects. It appends every UUID to ids inside the batch loop, inspects failed_objects only after the batch has finished, and then merely logs each failure, so the returned list still contains the IDs of objects that were never indexed.

Fix Plan

To fix this issue, we need to modify the add_texts method to properly handle failed objects. Here are the steps to fix this issue:

Step 1: Modify the add_texts method to return failed objects

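def add_texts(
    self,
    texts: Iterable[str],
    metadatas: Optional[List[dict]] = None,
    tenant: Optional[str] = None,
    **kwargs: Any,
) -> Tuple[List[str], List[Any]]:
    # NOTE: the return type changes from List[str], so existing callers must be updated.
    ...
    failed_objs = self._client.batch.failed_objects
    # Drop failed UUIDs so the first element holds only indexed objects.
    failed_ids = {str(obj.original_uuid) for obj in failed_objs}
    ids = [_id for _id in ids if str(_id) not in failed_ids]
    return ids, failed_objs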

Step 2: Update the caller to handle failed objects

# Unpack in the order Step 1 returns: IDs first, failed objects second.
success_ids, failed_objects = store.add_texts(texts)
for failed_obj in failed_objects:
    logger.error(f"Failed to add object: {failed_obj.original_uuid}\nReason: {failed_obj.message}")

Step 3: Update the logging to include failed objects

logger.info(f"Added {len(success_ids)} objects successfully")
if failed_objects:
    logger.error(f"Failed to add {len(failed_objects)} objects: {failed_objects}")

Verification

To verify that the fix worked, you can run the add_texts method with a sample dataset and check the logs for any failed objects. You can also add additional logging statements to verify that the failed objects are being properly handled.

Extra Tips

  • Make sure to update the logging configuration to include the failed objects.
  • Consider
