langchain - ✅(Solved) Fix `_get_approximate_token_counter` doesn't recognize ChatAnthropicVertex, causing SummarizationMiddleware to never trigger [2 pull requests, 2 comments, 1 participant]

langchain · 2026-03-27 21:00:40


GitHub stats

langchain-ai/langchain#36318 • Fetched 2026-04-08 01:40:56


Author

Jordanh1996

Participants

Jordanh1996

Assignees

Jordanh1996

Timeline (top)

commented ×2, cross-referenced ×2, referenced ×2, assigned ×1

_get_approximate_token_counter in langchain.agents.middleware.summarization uses an exact match model._llm_type == "anthropic-chat" to detect Anthropic models and apply the correct chars_per_token=3.3. But ChatAnthropicVertex (Claude via Vertex AI) returns _llm_type = "anthropic-chat-vertexai", so it falls through to the default 4.0 chars/token.

This underestimates token count by ~16%. When used with SummarizationMiddleware, the trigger threshold (85% of 200K = 170K) is never reached according to the estimate, but the actual prompt is already past 200K. The API rejects it.
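The arithmetic behind the missed trigger can be sketched in plain Python. This is a back-of-the-envelope illustration, not the middleware's code: it assumes the 3.3 chars/token ratio holds for the rejected prompt, so the character count is back-derived from the 202,018 tokens in the error message, and `estimate_tokens` is a hypothetical helper that ignores any per-message overhead.

```python
import math

CONTEXT_LIMIT = 200_000   # maximum input tokens, from the error message
ACTUAL_TOKENS = 202_018   # prompt size reported by the API error

# Trigger threshold from the issue: 85% of the context limit (integer math).
threshold = CONTEXT_LIMIT * 85 // 100


def estimate_tokens(num_chars: int, chars_per_token: float) -> int:
    """Hypothetical helper: characters divided by the ratio, rounded up."""
    return math.ceil(num_chars / chars_per_token)


# Assumption: back-derive a character count from the reported token count
# at 3.3 chars/token. This is illustrative, not a measured value.
num_chars = round(ACTUAL_TOKENS * 3.3)

default_estimate = estimate_tokens(num_chars, 4.0)    # roughly 166.7K
anthropic_estimate = estimate_tokens(num_chars, 3.3)  # roughly 202K

print(f"threshold:           {threshold}")
print(f"estimate at 4.0:     {default_estimate}  triggers: {default_estimate >= threshold}")
print(f"estimate at 3.3:     {anthropic_estimate}  triggers: {anthropic_estimate >= threshold}")
```

Under these assumptions the 4.0 estimate stays below 170K even though the prompt is already over the 200K limit, while the 3.3 estimate would have crossed the threshold well before that point.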

Two additional safety mechanisms are also inactive for Vertex AI:

use_usage_metadata_scaling in count_tokens_approximately requires response_metadata["model_provider"] to be set on AI messages. ChatAnthropicVertex never sets this (unlike ChatAnthropic which sets "anthropic"). This is a langchain-google-vertexai issue.
_should_summarize_based_on_reported_tokens has the same dependency and is equally inactive.
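Why both mechanisms stay inactive can be illustrated with a simplified gate. The function below is a hypothetical stand-in, not LangChain's implementation: it works on plain dictionaries, and the only behavior it borrows from the issue is that reported usage is trusted only when response_metadata["model_provider"] is set. The 180,000 figure is made up for the example.

```python
from typing import Any, Optional


def reported_total_tokens(ai_message: dict[str, Any]) -> Optional[int]:
    """Hypothetical gate: return reported tokens only if a provider tag exists."""
    provider = ai_message.get("response_metadata", {}).get("model_provider")
    if provider is None:
        # No provider tag: the reported usage is ignored entirely.
        return None
    usage = ai_message.get("usage_metadata") or {}
    return usage.get("total_tokens")


# Shaped like a message from ChatAnthropic, which sets model_provider.
direct = {
    "response_metadata": {"model_provider": "anthropic"},
    "usage_metadata": {"total_tokens": 180_000},  # illustrative value
}

# Shaped like a message from ChatAnthropicVertex, which does not set it.
vertex = {
    "response_metadata": {},
    "usage_metadata": {"total_tokens": 180_000},  # illustrative value
}

print(reported_total_tokens(direct))
print(reported_total_tokens(vertex))
```

With the provider tag missing, the usage numbers are present on the message but never consulted, which leaves the character-based estimate as the only signal.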

Suggested fix: Change the check to model._llm_type.startswith("anthropic-chat") to cover both ChatAnthropic and ChatAnthropicVertex.

Error Message

BadRequestError: Error code: 400 - {'type': 'error', 'error': {'type': 'invalid_request_error', 'message': 'prompt is too long: 202018 tokens > 200000 maximum'}}


Fix Action

Fix / Workaround

PR #36320 (merged) changes the check in _get_approximate_token_counter from model._llm_type == "anthropic-chat" to model._llm_type.startswith("anthropic-chat"), so ChatAnthropicVertex also gets chars_per_token=3.3.

Checked other resources

I added a very descriptive title to this issue.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I provided a self-contained, minimal, reproducible example that a maintainer can copy and run AS IS, including all necessary imports and data.

PR fix notes

PR #36319: fix: recognize ChatAnthropicVertex in _get_approximate_token_counter

Repository: langchain-ai/langchain
Author: Jordanh1996
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/36319

Description (problem / solution / changelog)

(No description)

Changed files

libs/langchain_v1/langchain/agents/middleware/summarization.py (modified, +2/-2)

PR #36320: fix(langchain): recognize ChatAnthropicVertex in _get_approximate_token_counter

Repository: langchain-ai/langchain
Author: Jordanh1996
State: closed | merged: True
Link: https://github.com/langchain-ai/langchain/pull/36320

Description (problem / solution / changelog)

Summary

_get_approximate_token_counter uses model._llm_type == "anthropic-chat" to detect Anthropic models and apply chars_per_token=3.3. But ChatAnthropicVertex (Claude via Vertex AI) returns _llm_type = "anthropic-chat-vertexai", so the check fails and the default 4.0 chars/token is used.

This underestimates token count by ~16%. When used with SummarizationMiddleware, the trigger threshold (e.g., 85% of 200K = 170K) is never reached according to the estimate, but the actual prompt is already past 200K. The API rejects it with "prompt is too long".

Change

== "anthropic-chat" → .startswith("anthropic-chat")

This covers both ChatAnthropic ("anthropic-chat") and ChatAnthropicVertex ("anthropic-chat-vertexai").
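The difference between the two checks can be seen without installing LangChain. The sketch below uses two hypothetical helpers that return only the chars-per-token ratio; the "other-chat" type string is a made-up placeholder for any non-Anthropic model.

```python
DEFAULT_CHARS_PER_TOKEN = 4.0
ANTHROPIC_CHARS_PER_TOKEN = 3.3


def ratio_exact_match(llm_type: str) -> float:
    """Behavior before the change: only the exact string matches."""
    if llm_type == "anthropic-chat":
        return ANTHROPIC_CHARS_PER_TOKEN
    return DEFAULT_CHARS_PER_TOKEN


def ratio_prefix_match(llm_type: str) -> float:
    """Behavior after the change: any type starting with the prefix matches."""
    if llm_type.startswith("anthropic-chat"):
        return ANTHROPIC_CHARS_PER_TOKEN
    return DEFAULT_CHARS_PER_TOKEN


# "other-chat" is a hypothetical non-Anthropic type string.
for llm_type in ("anthropic-chat", "anthropic-chat-vertexai", "other-chat"):
    print(f"{llm_type:26} exact={ratio_exact_match(llm_type)} "
          f"prefix={ratio_prefix_match(llm_type)}")
```

One consequence of prefix matching is that any future type string beginning with "anthropic-chat" would also receive the 3.3 ratio, which seems to be the intent here.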

Reproduction

Full standalone script with API calls: https://gist.github.com/Jordanh1996/af56156da6cfab82917215dd340f76ac

Fixes #36318

This contribution was made with AI assistance (Claude).

Changed files

libs/langchain_v1/langchain/agents/middleware/summarization.py (modified, +1/-1)

Code Example

from functools import partial
from langchain.agents.middleware.summarization import _get_approximate_token_counter
from langchain_core.messages import HumanMessage
from langchain_core.messages.utils import count_tokens_approximately
from langchain_google_vertexai.model_garden import ChatAnthropicVertex

model = ChatAnthropicVertex(
    model_name="claude-haiku-4-5",
    project="your-project",
    location="us-east5",
    max_output_tokens=256,
    profile={"max_input_tokens": 200_000},
)

# The bug: _get_approximate_token_counter checks model._llm_type == "anthropic-chat"
# but ChatAnthropicVertex returns "anthropic-chat-vertexai"
print(f"model._llm_type: {model._llm_type!r}")  # 'anthropic-chat-vertexai'
print(f"Match: {model._llm_type == 'anthropic-chat'}")  # False

# Result: counter uses default 4.0 chars/token instead of 3.3
counter = _get_approximate_token_counter(model)
correct = partial(count_tokens_approximately, chars_per_token=3.3)

msg = [HumanMessage(content="x" * 600_000)]
print(f"Middleware estimate: {counter(msg)}")   # ~150K
print(f"Correct estimate:   {correct(msg)}")    # ~182K

System Info

langchain: 1.2.10
langchain_core: 1.2.16
langchain_google_vertexai: 3.2.2
langchain_anthropic: 1.3.4
Python: 3.11.8
OS: Darwin (macOS ARM64)


extent analysis

Fix Plan

To resolve the issue, we need to modify the _get_approximate_token_counter function in langchain.agents.middleware.summarization to correctly identify Anthropic models, including those from Vertex AI.

Update the condition in _get_approximate_token_counter to use startswith instead of exact match:

if model._llm_type.startswith("anthropic-chat"):
    return partial(count_tokens_approximately, chars_per_token=3.3)

This change will ensure that both ChatAnthropic and ChatAnthropicVertex models are correctly identified and the accurate chars_per_token value is used.

Verification

To verify the fix, run the provided example code and check that the estimated token count matches the correct estimate:

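from langchain.agents.middleware.summarization import _get_approximate_token_counter
from langchain_core.messages import HumanMessage
# `model` is the ChatAnthropicVertex instance built in the example code above
counter = _get_approximate_token_counter(model)
msg = [HumanMessage(content="x" * 600_000)]
print(f"Middleware estimate: {counter(msg)}")   # ~182K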

The estimated token count should now match the correct estimate of ~182K.
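Where Vertex AI credentials are not available, the same expectation can be checked offline with stand-ins. Everything in this sketch is hypothetical: `StubModel` exposes only the `_llm_type` attribute, `approx_tokens` is a simplified character counter with no per-message overhead, and `get_counter` mirrors the patched condition from the Fix Plan rather than calling the real function.

```python
import math
from functools import partial


class StubModel:
    """Hypothetical stand-in exposing only the attribute the check reads."""

    def __init__(self, llm_type: str) -> None:
        self._llm_type = llm_type


def approx_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Simplified counter: characters divided by the ratio, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def get_counter(model: StubModel):
    """Mirror of the patched condition, using the stand-ins above."""
    if model._llm_type.startswith("anthropic-chat"):
        return partial(approx_tokens, chars_per_token=3.3)
    return approx_tokens


text = "x" * 600_000
vertex_count = get_counter(StubModel("anthropic-chat-vertexai"))(text)
direct_count = get_counter(StubModel("anthropic-chat"))(text)
other_count = get_counter(StubModel("other-chat"))(text)  # made-up type string

# Both Anthropic variants should agree, and both should exceed the default.
assert vertex_count == direct_count
assert vertex_count > other_count
print(vertex_count, other_count)  # roughly 182K vs 150K
```

The real count_tokens_approximately may add small per-message amounts, so the library's numbers can differ slightly from this simplified counter.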

Extra Tips

Upgrade the langchain package to a release that includes the fix.
The fix has already been submitted upstream and merged as PR #36320, so no separate pull request is needed.
If you encounter similar issues with other models or providers, check the _llm_type attribute and adjust the condition accordingly.
