langchain - ✅(Solved) Fix importing langchain_text_splitters giving NVML warning [2 pull requests, 9 comments, 6 participants]

caravin · 2026-02-25T13:24:58Z



langchain-ai/langchain#35437•Fetched 2026-04-08 00:26:10


Author

caravin

Participants

caravin

cbornet

fairchildadrian9-create

gitbalaji

keenborder786

xXMrNidaXx

Timeline (top)

commented ×9, mentioned ×9, subscribed ×9, labeled ×3

Just importing the package triggers the NVML warning. Although it is harmless, several of our workflows raise a flag if the process emits any warning, so this one is causing problems for us.

I am not 100% sure that langchain_text_splitters alone is responsible rather than something in my setup, but for me simply importing this package is enough to trigger it.

Error Message

Error Message and Stack Trace (if applicable)

warnings.warn("Can't initialize NVML")

Root Cause

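Per the linked PRs, PR #32325 moved the optional nltk, spacy, sentence-transformers, and konlpy imports to module-level try/except blocks. Because __init__.py imports those splitter modules, import langchain_text_splitters loads them eagerly, along with torch transitively, and torch emits the NVML warning on non-GPU machines. The reporter was not sure whether their own setup also contributed.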

Fix Action

Fix / Workaround

Two pull requests, both open and unmerged as of the fetch date, propose restoring lazy imports. PR #35469 moves the imports back inside the splitter constructors and functions. PR #35499 puts the three heavy splitters behind a __getattr__ lazy loader in __init__.py. Details are in the PR fix notes.

PR fix notes

PR #35469: fix(text-splitters): restore lazy imports for heavy optional dependencies

Repository: langchain-ai/langchain
Author: gitbalaji
State: open | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35469

Description (problem / solution / changelog)

Summary

Moves nltk, spacy, sentence-transformers, and konlpy imports back inside class constructors/functions so they are only loaded when the respective splitter is actually instantiated
Adds a subprocess-based regression test to verify no heavy packages are imported at langchain_text_splitters load time

Why

PR #32325 moved these optional dependency imports to module-level try/except blocks (to satisfy ruff's PLC0415 rule). Since __init__.py imports all four splitter modules, this caused import langchain_text_splitters to eagerly load all optional heavy packages, resulting in:

A PyTorch NVML warning (UserWarning: Can't initialize NVML) on non-GPU machines
A ~650MB memory spike on import (74MB → 736MB), vs ~50MB in 0.3.x

The fix restores the lazy import pattern with # noqa: PLC0415 to suppress the linter rule, which is the correct trade-off when a dependency has high instantiation cost.
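The lazy import pattern described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the PR: `LazySplitter` and `_import_optional` are hypothetical names, and the stdlib module `fractions` stands in for a heavy optional dependency such as nltk so that the sketch runs anywhere.

```python
import importlib


def _import_optional(module_name: str, install_hint: str):
    """Import an optional dependency on demand, with a helpful error."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        msg = (
            f"{module_name} is not installed. "
            f"Install it with `pip install {install_hint}`."
        )
        raise ImportError(msg) from exc


class LazySplitter:
    """Hypothetical splitter that needs a heavy dependency only when used."""

    def __init__(self, dependency: str = "fractions") -> None:
        # The import runs here, at instantiation, not when this module is
        # loaded. Per the PR description, the real splitters use an import
        # statement inside the constructor marked with `# noqa: PLC0415`.
        self._dep = _import_optional(dependency, dependency)

    def split_text(self, text: str) -> list[str]:
        # Placeholder logic; a real splitter would call into self._dep.
        return text.split()
```

With this shape, importing the module that defines `LazySplitter` costs nothing extra; the dependency is loaded only when a `LazySplitter` is constructed.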

Review notes

The PLC0415 suppressions are intentional — these are optional heavy dependencies that should never be loaded unless the user explicitly instantiates the splitter class
The regression test uses a subprocess for proper isolation (the test file itself imports langchain_text_splitters at the top, so sys.modules checks within the same process would not reflect a clean import state)
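A subprocess-based check of this kind might look like the sketch below. It is not the test from the PR: `modules_loaded_by` is a hypothetical helper, and the list of watched module names passed to it is an assumption the caller supplies.

```python
import subprocess
import sys


def modules_loaded_by(import_target: str, watched: list[str]) -> list[str]:
    """Return the watched modules present in sys.modules after a clean import.

    The import runs in a fresh interpreter, so modules already loaded by the
    test process cannot hide an eager import.
    """
    code = (
        "import sys\n"
        f"import {import_target}\n"
        f"watched = {watched!r}\n"
        "print(','.join(m for m in watched if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    out = result.stdout.strip()
    return out.split(",") if out else []
```

A regression test could then assert that `modules_loaded_by("langchain_text_splitters", ["torch", "nltk", "spacy", "sentence_transformers", "konlpy"])` returns an empty list.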

Fixes #35437.

AI disclaimer: This PR was developed with assistance from Claude Code (Anthropic AI).

Changed files

libs/text-splitters/langchain_text_splitters/konlpy.py (modified, +7/-13)
libs/text-splitters/langchain_text_splitters/nltk.py (modified, +4/-9)
libs/text-splitters/langchain_text_splitters/sentence_transformers.py (modified, +7/-12)
libs/text-splitters/langchain_text_splitters/spacy.py (modified, +12/-15)
libs/text-splitters/tests/unit_tests/test_text_splitters.py (modified, +20/-0)

PR #35499: fix(text-splitters): lazy-import nltk/spacy/sentence-transformers to avoid ~700MB memory bloat

Repository: langchain-ai/langchain
Author: sxu75374
State: open | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35499

Description (problem / solution / changelog)

Problem

Importing langchain_text_splitters eagerly pulls in nltk, spacy, and sentence-transformers (plus torch transitively) at module level, adding ~700 MiB RSS even when only lightweight splitters like RecursiveCharacterTextSplitter are used.

This regression was introduced in 1.0.0, when PR #32325 moved the lazy imports to module level to satisfy ruff's PLC0415 rule (https://docs.astral.sh/ruff/rules/import-outside-top-level/).

Root Cause

__init__.py imports from .nltk, .spacy, and .sentence_transformers at module level. Each of these files does try: import heavy_lib at the top, so merely importing the package triggers the full import chain.

Fix

Move the three heavy-dep splitter imports (NLTKTextSplitter, SpacyTextSplitter, SentenceTransformersTokenTextSplitter) behind a __getattr__ lazy loader, following the exact same pattern used throughout langchain-core (e.g., langchain_core.embeddings, langchain_core.callbacks).

__all__ is unchanged — public API is identical
TYPE_CHECKING block preserves type-checker visibility
globals()[attr_name] = result caches the import so subsequent accesses are free
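The module-level `__getattr__` loader (PEP 562) can be sketched as below. This is an illustration, not the PR's `__init__.py`: the `_LAZY` mapping and `make_lazy_package` are hypothetical, stdlib classes stand in for the heavy splitters, and the package is built with `types.ModuleType` only so the sketch is self-contained. In a real `__init__.py`, `__getattr__` would be defined at module top level and would cache into `globals()`.

```python
import importlib
import types

# Hypothetical mapping: public name -> (defining module, attribute name).
# Stdlib modules stand in for the heavy splitter modules.
_LAZY = {
    "Fraction": ("fractions", "Fraction"),
    "Decimal": ("decimal", "Decimal"),
}


def make_lazy_package(name: str) -> types.ModuleType:
    """Build a module object whose listed attributes are imported on first use."""
    pkg = types.ModuleType(name)

    def __getattr__(attr_name: str):
        # Called only when normal attribute lookup on the module fails.
        if attr_name not in _LAZY:
            raise AttributeError(f"module {name!r} has no attribute {attr_name!r}")
        module_name, target = _LAZY[attr_name]
        result = getattr(importlib.import_module(module_name), target)
        # Cache in the module namespace so later lookups skip __getattr__.
        vars(pkg)[attr_name] = result
        return result

    pkg.__getattr__ = __getattr__
    return pkg
```

The first access to `pkg.Fraction` triggers the import and stores the class on the module; later accesses are ordinary attribute lookups.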

Tests Added

tests/unit_tests/test_lazy_imports.py:

Bare package import does not pull in heavy deps
Lazy __getattr__ correctly resolves each class
Unknown attributes raise AttributeError
Lightweight splitters remain directly importable
E2E: using RecursiveCharacterTextSplitter doesn't import heavy deps

Reproduction (from issue)

import os

import psutil  # third-party: pip install psutil


def mem():
    """Return this process's resident set size in MiB."""
    return psutil.Process(os.getpid()).memory_info().rss / 1024 / 1024

print(f"Start: {mem():.1f} MiB")
from langchain_text_splitters import RecursiveCharacterTextSplitter
print(f"After: {mem():.1f} MiB")
# Before: 74 → 736 MiB (+662 MiB)
# After:  74 → ~80 MiB (no heavy deps loaded)

Fixes #35437

Changed files

libs/text-splitters/langchain_text_splitters/__init__.py (modified, +33/-5)
libs/text-splitters/tests/unit_tests/test_lazy_imports.py (added, +99/-0)

Code Example

import langchain_text_splitters

---

/usr/local/python/python-3.11/std/lib64/python3.11/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")


Checked other resources

This is a bug, not a usage question.
I added a clear and descriptive title that summarizes this issue.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

Related Issues / PRs

No response

Reproduction Steps / Example Code (Python)

import langchain_text_splitters

Error Message and Stack Trace (if applicable)

/usr/local/python/python-3.11/std/lib64/python3.11/site-packages/torch/cuda/__init__.py:611: UserWarning: Can't initialize NVML
  warnings.warn("Can't initialize NVML")

Description

I am not 100% sure that langchain_text_splitters alone is responsible rather than something in my setup, but for me simply importing this package is enough to trigger it.

System Info

langchain==1.x stack

extent analysis

Problem Summary

Fixing a UserWarning: Can't initialize NVML warning when importing langchain_text_splitters.

Root Cause Analysis

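torch.cuda emits this warning when it cannot initialize NVML, which is expected on machines without an NVIDIA GPU or driver. Per the linked PRs, torch is loaded only because langchain_text_splitters eagerly imports its optional heavy dependencies at module level (torch comes in transitively), so the underlying problem is the eager import rather than the CUDA setup.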

Fix Plan

Step 1: Check CUDA Driver

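This step applies only if the machine has an NVIDIA GPU and you expect CUDA to work. In that case, install a current NVIDIA driver from the official NVIDIA website and verify that it is configured correctly.
On a machine without a GPU the warning is expected, and installing a driver will not help.

Step 2: Update Torch Library

Update torch with pip: pip install --upgrade torch
This is unlikely to remove the warning on a non-GPU machine, because the trigger is the eager import of torch rather than an outdated version.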

Step 3: Disable CUDA Initialization

If you cannot wait for a release with lazy imports, try hiding CUDA devices from torch. Do not edit files inside site-packages; set the variable in your own entry point, before langchain_text_splitters is first imported:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'  # must run before the import below
import langchain_text_splitters

This hides all CUDA devices from torch and may prevent the warning, depending on the torch version. It does not avoid the memory cost of the eager imports.
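A workflow that flags every warning can also filter this one message before the import, as a stopgap. The sketch below uses the standard `warnings.filterwarnings` call; `install_nvml_filter` is a hypothetical helper name, and the message text is taken from the warning quoted in the issue.

```python
import warnings

# Message text as quoted in the issue; matched as a regex against the
# start of the warning message.
NVML_MESSAGE = "Can't initialize NVML"


def install_nvml_filter() -> None:
    """Ignore only the NVML UserWarning; all other warnings are unaffected."""
    warnings.filterwarnings(
        "ignore",
        message=NVML_MESSAGE,
        category=UserWarning,
    )
```

Call `install_nvml_filter()` before `import langchain_text_splitters`. Because `filterwarnings` inserts its entry at the front of the filter list, it takes precedence over a broader "error" filter installed earlier.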

Step 4: Verify the Fix

Re-import the langchain_text_splitters package and verify that the warning is no longer raised.
Run your workflows and verify that the warning does not cause any issues.

Verification

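To verify the fix, run the following in a fresh Python process (an already-imported package will not emit its import-time warnings again):

import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning, even repeats
    import langchain_text_splitters  # noqa: F401

nvml = [w for w in caught if "NVML" in str(w.message)]
print("NVML warning raised" if nvml else "No NVML warning on import")

If the fix worked, the output is "No NVML warning on import". If the warning is still raised, the installed version still imports torch eagerly.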

Extra Tips

Make sure to update the torch library to the latest version to ensure that you have the latest bug fixes and features.
If you are using a virtual environment, make sure to activate
