langchain - ✅(Solved) Fix Performance: transformers are imported unconditionally on BaseChatModel import [1 pull requests, 1 comments, 2 participants]

zhemaituk · 2026-04-16T22:07:07Z

[langchain] Importing `BaseChatModel` from langchain-core adds 300-500 ms to import/app startup just to check whether `GPT2TokenizerFast` is available. The only place it appears to be imported is [langchain_core/language_models/base.py](https://github.com/langchain-ai/langchain/blob/6fb37dba71da807af60aa7b909f71f0625a666bf/libs/core/langchain_core/language_models/base.py#L42). Moving the import inside `get_tokenizer`, where it is used, should do the trick. The proposed change is PR #36836 (https://github.com/langchain-ai/langchain/pull/36836); the PR notes, import-time trace, and system information follow.

langchain-ai/langchain#36835 • Fetched 2026-04-17 08:26:46

Author: zhemaituk
Participants: voidborne-d, zhemaituk
Timeline (top): labeled ×3, commented ×1, cross-referenced ×1, issue_type_added ×1

Root Cause

When transformers is installed, importing BaseChatModel adds 300-500 ms to import/app startup just to check whether GPT2TokenizerFast is available.

The only place it appears to be imported is langchain_core/language_models/base.py.

Moving the import inside get_tokenizer, where it is used, should do the trick.

Fix Action

Fix / Workaround

Move the from transformers import GPT2TokenizerFast import out of module level and into get_tokenizer(), as proposed in PR #36836 (listed as closed, not merged; see the PR fix notes).

PR fix notes

PR #36836: perf(core): lazy-import transformers in get_tokenizer()

Repository: langchain-ai/langchain
Author: voidborne-d
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/36836

Description (problem / solution / changelog)

Summary

Move the `from transformers import GPT2TokenizerFast` import from module level into `get_tokenizer()` so that importing `BaseChatModel` no longer triggers a 300–500 ms `transformers` import on every application startup.

Problem

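When `transformers` is installed, importing anything from `langchain_core.language_models` unconditionally imports the entire `transformers` package at module level ([base.py#L42](https://github.com/langchain-ai/langchain/blob/6fb37dba71da807af60aa7b909f71f0625a666bf/libs/core/langchain_core/language_models/base.py#L42)). This adds **300–500 ms** to every cold startup, even when the GPT-2 tokenizer is never used.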

import time:    367927 |     851828 |   transformers

Fix

- Remove the top-level `try/except` import of `GPT2TokenizerFast` and the `_HAS_TRANSFORMERS` sentinel.
- Move the import inside `get_tokenizer()`, which is `@cache`-decorated and only called as a fallback for token counting.
- The import runs exactly once on first call (due to `@cache`) and never runs if `get_tokenizer()` is not invoked.
- Add `from None` to the `raise ImportError` to suppress the noisy chained exception.

Test

Added `test_transformers_not_imported_on_base_import` regression test that verifies `transformers` is not in `sys.modules` after importing the base module.
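The body of that test is not shown here, so the following is only a sketch of how such a check could be written. It assumes a fresh interpreter per check, because `sys.modules` in a long-running test process may already contain `transformers` from an earlier test. The helper name `is_loaded_after` is hypothetical, and stdlib modules stand in for the real packages so the sketch runs anywhere.

```python
import subprocess
import sys


def is_loaded_after(statement: str, module_name: str) -> bool:
    """Run `statement` in a fresh interpreter and report whether
    `module_name` ended up in sys.modules."""
    probe = f"{statement}\nimport sys\nprint({module_name!r} in sys.modules)"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # The probe prints True/False last, after any output from `statement`.
    return result.stdout.strip().splitlines()[-1] == "True"


# Stand-in check: importing colorsys loads colorsys itself,
# but should not pull in the unrelated wave module.
print(is_loaded_after("import colorsys", "colorsys"))
print(is_loaded_after("import colorsys", "wave"))
```

For this issue, the equivalent call would be `is_loaded_after("from langchain_core.language_models import BaseChatModel", "transformers")`, which should return `False` once the import is deferred.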

Impact

- **Before:** every `from langchain_core.language_models import BaseChatModel` pays 300–500 ms.
- **After:** zero cost unless `get_tokenizer()` is actually called.
- No behavioral change for users who do call `get_tokenizer()`.

Fixes #36835

Changed files

- `libs/core/langchain_core/language_models/base.py` (modified, +4/-9)
- `libs/core/tests/unit_tests/language_models/test_base_lazy_imports.py` (added, +37/-0)

Code Example

pip install transformers

python -X importtime -c "from langchain_core.language_models import BaseChatModel" 2>&1 | grep transformers

---

import time:       158 |        158 |       transformers.dependency_versions_table
import time:       293 |        293 |             transformers.utils.doc
import time:       414 |        414 |                 transformers._typing
import time:       310 |      41629 |               transformers.utils.logging
import time:    296533 |     298007 |               transformers.utils.import_utils
import time:      1277 |     392881 |             transformers.utils.generic
import time:      1984 |     400583 |           transformers.utils.auto_docstring
import time:       744 |      24734 |           transformers.utils.chat_template_utils
import time:       226 |        226 |           transformers.utils.constants
import time:     11211 |      52855 |           transformers.utils.hub
import time:       246 |        246 |           transformers.utils.kernel_config
import time:       223 |        223 |           transformers.utils.peft_utils
import time:       482 |     479346 |         transformers.utils
import time:       150 |     479496 |       transformers.utils.versions
import time:      3384 |     483037 |     transformers.dependency_versions_check
import time:       183 |        183 |     transformers.utils.dummy_sentencepiece_and_tokenizers_objects
import time:       177 |        177 |     transformers.utils.dummy_mistral_common_objects
import time:       505 |        505 |     transformers.utils.dummy_torchvision_objects
import time:    367927 |     851828 |   transformers
import time:       707 |        707 |     transformers.convert_slow_tokenizer
import time:       992 |        992 |       transformers.integrations
import time:       400 |        647 |       transformers.dynamic_module_utils
import time:       241 |       2470 |       transformers.utils.chat_parsing_utils
import time:      1698 |       5805 |     transformers.integrations.ggml
import time:       757 |     709222 |     transformers.modeling_gguf_pytorch_utils
import time:       554 |     716357 |   transformers.tokenization_utils_tokenizers
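The trace above is easier to read in milliseconds. `python -X importtime` reports self and cumulative times in microseconds, so a small parser can convert and rank the lines. This is a sketch; `parse_importtime` is a hypothetical helper, and the sample is three lines copied from the trace.

```python
import re

# Matches lines such as:
#   import time:    367927 |     851828 |   transformers
LINE = re.compile(r"import time:\s*(\d+)\s*\|\s*(\d+)\s*\|\s*(\S+)")


def parse_importtime(text: str) -> list[dict]:
    """Turn `python -X importtime` output into records with millisecond fields."""
    rows = []
    for line in text.splitlines():
        match = LINE.search(line)
        if match is None:
            continue  # skips the header line and unrelated stderr output
        self_us, cumulative_us, name = match.groups()
        rows.append(
            {
                "module": name,
                "self_ms": int(self_us) / 1000,
                "cumulative_ms": int(cumulative_us) / 1000,
            }
        )
    return rows


sample = """\
import time:    296533 |     298007 |               transformers.utils.import_utils
import time:       482 |     479346 |         transformers.utils
import time:    367927 |     851828 |   transformers
"""

# Rank by self time to see which modules dominate the cost.
for row in sorted(parse_importtime(sample), key=lambda r: r["self_ms"], reverse=True):
    print(f'{row["module"]:<36} self={row["self_ms"]:.1f} ms cumulative={row["cumulative_ms"]:.1f} ms')
```

Ranked by self time, the top-level `transformers` line and `transformers.utils.import_utils` account for most of the cost in this trace.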

Submission checklist

This is a bug, not a usage question.
I added a clear and descriptive title that summarizes this issue.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

langchain-core

Related Issues / PRs

No response

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 25.4.0: Thu Mar 19 19:32:59 PDT 2026; root:xnu-12377.101.15~1/RELEASE_ARM64_T8122
Python Version: 3.14.4 (main, Apr 7 2026, 13:13:20) [Clang 21.0.0 (clang-2100.0.123.102)]

Package Information

langchain_core: 1.2.31
langsmith: 0.7.32
langchain_aws: 1.4.4
langgraph_sdk: 0.3.13

Optional packages not installed

deepagents
deepagents-cli

Other Dependencies

beautifulsoup4: 4.14.3
boto3: 1.42.84
httpx: 0.28.1
jsonpatch: 1.33
numpy: 2.4.4
orjson: 3.11.8
packaging: 26.1
playwright: 1.58.0
pydantic: 2.13.1
pytest: 9.0.3
pyyaml: 6.0.3
requests: 2.33.1
requests-toolbelt: 1.0.0
rich: 15.0.0
tenacity: 9.1.4
typing-extensions: 4.15.0
uuid-utils: 0.14.1
xxhash: 3.6.0
zstandard: 0.25.0

extent analysis

TL;DR

Moving the GPT2TokenizerFast import out of module level and into get_tokenizer(), where it is used, defers the transformers import cost until the tokenizer is actually requested.

Guidance

Locate the module-level import of GPT2TokenizerFast in langchain_core/language_models/base.py.
Move that import from the top of the file into get_tokenizer(), where it is actually used.
Rerun the reproduction command and confirm that transformers no longer appears in the -X importtime output.
Check the installed langchain-core version first (the report is against 1.2.31); PR #36836 is listed as closed and not merged, so the change may not be present in a released version.
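To put a number on the verification step, one option is to compare wall-clock time for a fresh interpreter with and without the import. This is a rough sketch, not a benchmark: `startup_ms` and `import_cost_ms` are hypothetical helpers, results vary between runs, and `import json` stands in for the real import so the sketch runs without langchain-core installed.

```python
import subprocess
import sys
import time


def startup_ms(statement: str, repeats: int = 3) -> float:
    """Best-of-N wall time, in ms, for a fresh interpreter running `statement`."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", statement], check=True)
        best = min(best, (time.perf_counter() - start) * 1000)
    return best


def import_cost_ms(statement: str, repeats: int = 3) -> float:
    """Approximate cost of `statement` above bare interpreter startup."""
    return startup_ms(statement, repeats) - startup_ms("pass", repeats)


# Stand-in statement; swap in the import from the reproduction steps to measure the real case.
print(f"{import_cost_ms('import json'):.1f} ms")
```

Running it before and after the change with `from langchain_core.language_models import BaseChatModel` as the statement should show the difference reported in the issue, if `transformers` is installed.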

Example

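# Before: base.py imports transformers at module level (sketch based on the PR notes)
try:
    from transformers import GPT2TokenizerFast
    _HAS_TRANSFORMERS = True
except ImportError:
    _HAS_TRANSFORMERS = False

# After: get_tokenizer() is a @cache-decorated function, so the import runs once, on first call
from functools import cache

@cache
def get_tokenizer():
    try:
        from transformers import GPT2TokenizerFast
    except ImportError:
        # Message text is illustrative, not copied from langchain-core
        raise ImportError("transformers is required to use get_tokenizer()") from None
    # Assumed construction of the GPT-2 tokenizer
    return GPT2TokenizerFast.from_pretrained("gpt2")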

Notes

python -X importtime reports self and cumulative times in microseconds, so the transformers line in the trace (367927 | 851828) corresponds to roughly 368 ms of self time and 852 ms cumulative. The cost is only paid when transformers is installed and, once the import is deferred, only when get_tokenizer() is called.

Recommendation

Move the GPT2TokenizerFast import inside get_tokenizer(), and add a regression test that asserts transformers is absent from sys.modules after importing the base module.
