langchain - Avoid hard dependency on tiktoken (make it optional / lazy-loaded) [2 comments, 2 participants]

GitHub stats
langchain-ai/langchain#37220 · Fetched 2026-05-07 03:31:23
Comments: 2
Participants: 2
Timeline: 12
Reactions: 0
Timeline (top): labeled ×4, commented ×2, mentioned ×2, subscribed ×2

Submission checklist

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

tiktoken is valuable for OpenAI-compatible models, but it should be treated as a provider-specific implementation detail, not a core dependency.

In our case, it is unclear where and how tiktoken is being used across LangChain, which makes it difficult to safely remove or replace.

This creates friction for production adoption, especially in environments where dependencies must be tightly controlled. We would prefer not to include tiktoken, but the current implicit usage makes it hard to assess the impact and safely opt out.

It would be helpful if the dependency on tiktoken were more explicit, optional, and isolated to OpenAI-specific integrations.
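
To illustrate what "explicit and optional" could look like, here is a minimal sketch of a deferred import. It is not LangChain's actual code: `lazy_import` and `count_openai_tokens` are hypothetical helper names, and the `cl100k_base` default is only an assumption for the example. The point is that tiktoken is touched only when an OpenAI-specific code path runs, and a missing install produces an actionable error instead of failing at module import time.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str, install_hint: str) -> ModuleType:
    """Import a module on first use and explain how to install it if missing.

    Hypothetical helper for illustration; not part of LangChain.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature but is not installed. "
            f"{install_hint}"
        ) from exc


def count_openai_tokens(text: str, encoding_name: str = "cl100k_base") -> int:
    """Count tokens with tiktoken, importing it only when this function is called."""
    tiktoken = lazy_import("tiktoken", "Install it with `pip install tiktoken`.")
    encoding = tiktoken.get_encoding(encoding_name)
    return len(encoding.encode(text))
```

With this shape, importing the surrounding module never requires tiktoken; only callers of `count_openai_tokens` do.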

Use Case

1. Not compatible with multi-model usage

Different providers have different tokenization:

  • Anthropic
  • Gemini
  • Open-source models (Llama, Mistral, etc.)

tiktoken is OpenAI-specific, but is effectively treated as a default.


2. Issues in production environments

In local/on-prem deployments:

  • model providers may change dynamically
  • adding tokenizer dependencies increases review cost
  • packaging becomes harder (especially offline)
  • missing tokenizer can break runtime behavior
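
One way to surface a missing tokenizer before it breaks runtime behavior is a startup check that reports which tokenizer packages are importable, without actually importing them. The sketch below uses only the standard library; `available_backends` is a hypothetical name, and the package names in the usage note are examples rather than a list of what LangChain requires.

```python
from importlib.util import find_spec
from typing import Dict, Iterable


def available_backends(candidates: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level package name to whether it can be imported here.

    find_spec locates the package without executing it, so the check is
    cheap and has no side effects from the tokenizer libraries themselves.
    """
    return {name: find_spec(name) is not None for name in candidates}
```

For example, `available_backends(["tiktoken", "tokenizers", "sentencepiece"])` could be logged at startup in an offline deployment so that a missing tokenizer shows up during packaging review rather than in the middle of a request.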

Proposed Solution

  • Introduce a provider-agnostic tokenizer abstraction (e.g., TokenCounter)
  • Delegate token counting to provider-specific implementations when available
  • Gracefully fall back to estimation when no tokenizer is present
  • Do not import tiktoken in core code; it should remain an optional dependency in OpenAI-specific integrations
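
The four points above can be sketched roughly as follows. This is an illustration under stated assumptions, not an existing LangChain API: `TokenCounter` is the name suggested in this request, while `EstimatingCounter`, `TiktokenCounter`, and `get_token_counter` are hypothetical. The fallback assumes about 4 characters per token, which is only a coarse heuristic, and the `cl100k_base` default is likewise an assumption.

```python
from __future__ import annotations

import math
from typing import Protocol


class TokenCounter(Protocol):
    """Provider-agnostic interface: anything with count(text) -> int."""

    def count(self, text: str) -> int: ...


class EstimatingCounter:
    """Fallback with no tokenizer dependency (assumes ~4 characters per token)."""

    def __init__(self, chars_per_token: float = 4.0) -> None:
        self.chars_per_token = chars_per_token

    def count(self, text: str) -> int:
        return math.ceil(len(text) / self.chars_per_token)


class TiktokenCounter:
    """OpenAI-specific counter; tiktoken is imported only on construction."""

    def __init__(self, encoding_name: str = "cl100k_base") -> None:
        import tiktoken  # deferred so core code never imports it

        self._encoding = tiktoken.get_encoding(encoding_name)

    def count(self, text: str) -> int:
        # disallowed_special=() treats special-token strings as plain text
        return len(self._encoding.encode(text, disallowed_special=()))


def get_token_counter(provider: str) -> TokenCounter:
    """Pick a provider-specific counter, falling back to estimation."""
    if provider == "openai":
        try:
            return TiktokenCounter()
        except ImportError:
            pass  # tiktoken not installed: degrade to the estimate
    return EstimatingCounter()
```

Other providers would register their own counters in `get_token_counter` in the same way. Note that this sketch only handles a missing install: tiktoken may also need to fetch encoding data on first use, so an offline deployment would likely want to catch that failure as well or pre-populate the cache.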

Alternatives Considered

No response

Additional Context

No response
