langchain - ✅(Solved) Fix count_tokens_approximately: missing handler for tool_use content blocks causes ~2.4x overcounting [4 pull requests, 2 comments, 3 participants]

langchain · 2026-03-04 18:39:55

GitHub stats

langchain-ai/langchain#35558 • Fetched 2026-04-08 00:25:36

Timeline (top)

cross-referenced ×5, referenced ×3, commented ×2, assigned ×1

Error Message

In production (weside.ai), we measured 4.6x overcounting for real Anthropic conversation threads containing tool calls. The tiktoken cl100k_base approximation already overcounts Claude tokens by ~1.9x, and the repr() fallback multiplies that by a further ~2.4x for tool_use blocks specifically (1.9 × 2.4 ≈ 4.6).
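As a quick arithmetic check of the figures above, the two reported factors multiply to roughly the measured total. The variable names are illustrative and do not come from the issue.

```python
# Reported overcounting factors from the issue (approximate values).
tiktoken_factor = 1.9  # cl100k_base approximation vs. actual Claude tokens
repr_factor = 2.4      # repr() vs. compact JSON for tool_use blocks

# Independent multiplicative errors compound by multiplication.
combined = tiktoken_factor * repr_factor
print(f"combined overcount: {combined:.2f}x")
```

This only confirms that the two reported factors are consistent with the 4.6x total; it does not validate either factor on its own.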

Root Cause

In langchain_core/messages/utils.py around line 2281, the content block handler has cases for text, image_url, etc. but no case for tool_use / tool_result:

# Current behavior (simplified):
for item in content:
    if isinstance(item, str):
        ...
    elif item.get("type") == "text":
        total_chars += len(item.get("text", ""))
    else:
        total_chars += len(repr(item))  # ← tool_use falls here!

repr({"type": "tool_use", "id": "...", "name": "...", "input": {...}}) produces a Python dict repr with single quotes, True/False booleans, etc. — typically ~2.4x longer than the equivalent compact JSON for nested tool inputs.
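To see the two serializations side by side, the following minimal sketch measures both on the small block from the reproducer. For this particular block the gap is modest (108 vs. 100 characters); the ~2.4x figure is what the issue reports for nested tool inputs in production, which this snippet does not attempt to reproduce.

```python
import json

block = {
    "type": "tool_use",
    "id": "toolu_01AbCdEf",
    "name": "search_memories",
    "input": {"query": "recent events"},
}

# Python repr: single quotes, a space after every ":" and ",".
as_repr = repr(block)
# Compact JSON: double quotes, no whitespace between tokens.
as_compact = json.dumps(block, separators=(",", ":"))

print(len(as_repr), as_repr)
print(len(as_compact), as_compact)

# Booleans and None are also spelled differently in the two forms.
flags = {"exact": True, "limit": None}
print(repr(flags))
print(json.dumps(flags, separators=(",", ":")))
```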

Fix Action

Workaround

We implemented a normalization function in our codebase (_normalize_for_counting()) that pre-processes messages to extract only text content and normalize tool blocks to compact JSON before passing to any token counter. This prevents the repr() inflation.

PR fix notes

PR #35566: fix(core): use compact json for tool_use/tool_result in count_tokens_approximately

Repository: langchain-ai/langchain
Author: keenborder786
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35566

Description (problem / solution / changelog)

Added explicit handling for tool_use and tool_result list-content blocks.
Switched these blocks from repr(block) length counting to compact JSON length counting (json.dumps(..., separators=(",", ":"))), with a safe fallback to repr.
Added regression tests for both tool_use and tool_result counting.
Updated the existing list-content/tool-call test to assert the correct invariant after this normalization.

Fixes: #35558
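The "safe fallback to repr" mentioned in this PR description can be sketched as follows. This is an illustration of the idea rather than the PR's actual diff; block_char_len is a hypothetical helper name.

```python
import json


def block_char_len(block: dict) -> int:
    """Character length of a content block: compact JSON, else repr()."""
    try:
        return len(json.dumps(block, separators=(",", ":")))
    except (TypeError, ValueError):
        # json.dumps raises TypeError for values such as bytes or sets,
        # and ValueError for circular references.
        return len(repr(block))


serializable = {"type": "tool_result", "tool_use_id": "toolu_01AbCdEf", "content": "ok"}
not_serializable = {"type": "tool_result", "tool_use_id": "toolu_01AbCdEf", "content": b"ok"}

print(block_char_len(serializable))
print(block_char_len(not_serializable))
```

The fallback keeps the counter from raising on unusual block contents, at the cost of returning to repr()-based lengths for those blocks.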

Changed files

libs/core/langchain_core/messages/utils.py (modified, +12/-0)
libs/core/tests/unit_tests/messages/test_utils.py (modified, +65/-1)

PR #35568: fix(core): handle tool_use/tool_result blocks in count_tokens_approximately

Repository: langchain-ai/langchain
Author: nightcityblade
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35568

Description (problem / solution / changelog)

Use json.dumps(separators=(",",":")) instead of repr() for tool_use and tool_result content blocks in count_tokens_approximately().

repr() produces Python-style output (single quotes, True/False) that inflates the character count by ~2.4x for nested tool inputs compared to the actual JSON representation used by providers.

Fixes #35558

Changes:

libs/core/langchain_core/messages/utils.py: Added explicit handlers for tool_use and tool_result block types using compact json.dumps
libs/core/tests/unit_tests/messages/test_utils.py: Added 2 new tests for tool_use/tool_result blocks, widened tolerance in existing test

Verification:

make test passes (156/156 tests in test_utils.py)
ruff check passes

Changed files

libs/core/langchain_core/messages/utils.py (modified, +12/-0)
libs/core/tests/unit_tests/messages/test_utils.py (modified, +49/-1)

PR #35650: fix: use compact JSON for tool_use blocks in count_tokens_approximately

Repository: langchain-ai/langchain
Author: alvinttang
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35650

Description (problem / solution / changelog)

Summary

Fixes #35558

count_tokens_approximately() uses repr() as a fallback for unknown content block types. For tool_use and tool_result blocks (Anthropic format), repr() produces Python dict representation with single quotes and verbose formatting, causing ~2.4x overcounting compared to actual token usage.

This change switches the fallback to json.dumps() with compact separators, which produces output much closer to what LLM APIs actually tokenize.

Impact

In production, the repr() overcounting compounded with tiktoken approximation to cause 4.6x overcounting for conversations with tool calls, leading to premature context summarization at only ~40% of actual token budget utilization.

Test plan

Verify count_tokens_approximately produces more accurate counts for messages with tool_use blocks
Verify existing tests pass

🤖 Generated with Claude Code

Changed files

libs/core/langchain_core/messages/utils.py (modified, +6/-2)

PR #35696: fix(core): add tool_use/tool_result handlers to count_tokens_approximately

Repository: langchain-ai/langchain
Author: giulio-leone
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35696

Description (problem / solution / changelog)

Problem

count_tokens_approximately() has no handler for tool_use or tool_result content blocks (Anthropic format). These fall through to the else branch which calls repr(block), producing Python dict representations longer than compact JSON equivalents.

In production, this causes premature summarization — the system believes the context is full when only ~40% of the actual token budget is consumed.

Closes #35558

Root Cause

The content block handler in count_tokens_approximately has cases for text, image_url, etc. but no case for tool_use / tool_result:

else:
    total_chars += len(repr(item))  # ← tool_use falls here!

repr() produces Python dict representations with single quotes, True/False booleans, etc. — typically longer than equivalent compact JSON for nested tool inputs.

Fix

Added explicit handlers for tool_use, tool_result, and thinking content blocks
Changed the fallback for unknown dict blocks from repr() to json.dumps(separators=(',',':')) for more accurate character estimation
Changed tool_calls serialization (for string-content AI messages) from repr() to json.dumps() for consistency
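One way explicit per-type handlers like those listed above could look is sketched here. The function name, the choice of which fields to count, and the assumed block shapes (for example a thinking block carrying its text under a "thinking" key) are assumptions for illustration, not the contents of the PR.

```python
import json
from typing import Any


def _compact(obj: Any) -> str:
    return json.dumps(obj, separators=(",", ":"))


def approx_block_chars(block: Any) -> int:
    """Hypothetical per-block character count with explicit handlers."""
    if isinstance(block, str):
        return len(block)
    block_type = block.get("type")
    if block_type == "text":
        return len(block.get("text", ""))
    if block_type == "thinking":
        # Assumed shape: {"type": "thinking", "thinking": "<text>", ...}
        return len(block.get("thinking", ""))
    if block_type == "tool_use":
        # Count the tool name plus its input as compact JSON.
        return len(block.get("name", "")) + len(_compact(block.get("input", {})))
    if block_type == "tool_result":
        inner = block.get("content", "")
        if isinstance(inner, str):
            return len(inner)
        return sum(approx_block_chars(b) for b in inner)
    # Unknown dict blocks: compact JSON instead of repr().
    return len(_compact(block))


blocks = [
    {"type": "thinking", "thinking": "Need to search."},
    {"type": "tool_use", "id": "toolu_01AbCdEf", "name": "search_memories",
     "input": {"query": "recent events"}},
    {"type": "tool_result", "tool_use_id": "toolu_01AbCdEf",
     "content": [{"type": "text", "text": "3 results"}]},
]
for b in blocks:
    print(b["type"], approx_block_chars(b))
```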

Tests

Added 8 new tests verifying accurate token counting for:

tool_use blocks (simple and nested inputs)
tool_result blocks (string content, nested text+image blocks)
Multiple tool_use blocks in a single message
thinking blocks
Compact JSON produces fewer chars than repr()

Changed files

libs/core/langchain_core/messages/utils.py (modified, +36/-4)
libs/core/tests/unit_tests/messages/test_utils.py (modified, +161/-9)

Raw issue text

Bug Description

count_tokens_approximately() in langchain_core/messages/utils.py has no handler for tool_use content blocks (Anthropic format). These blocks fall through to the else branch which calls repr(block) — producing a Python object representation like "{'type': 'tool_use', 'id': '...', 'name': '...', 'input': {...}}" that is much longer than the actual JSON content.

Affected Version

langchain-core>=0.1.x (confirmed on 1.2.17)

Reproducer

from langchain_core.messages import AIMessage
from langchain_core.messages.utils import count_tokens_approximately

# Message with tool_use content block (Anthropic format)
msg = AIMessage(content=[
    {
        "type": "tool_use",
        "id": "toolu_01AbCdEf",
        "name": "search_memories",
        "input": {"query": "recent events"},
    }
])

# Compact JSON for this block is 100 chars ≈ 25 tokens (at ~4 chars per token)
# repr() produces: "{'type': 'tool_use', 'id': 'toolu_01AbCdEf', 'name': 'search_memories', 'input': {'query': 'recent events'}}"
# That's 108 chars from repr vs 100 for compact JSON: already worse, and reported to scale badly with large inputs

approx = count_tokens_approximately(msg)
print(f"Approximated tokens: {approx}")
# Returns significantly more tokens than the actual content warrants

Root Cause

In langchain_core/messages/utils.py around line 2281, the content block handler has cases for text, image_url, etc. but no case for tool_use / tool_result:

# Current behavior (simplified):
for item in content:
    if isinstance(item, str):
        ...
    elif item.get("type") == "text":
        total_chars += len(item.get("text", ""))
    else:
        total_chars += len(repr(item))  # ← tool_use falls here!

Impact

This caused premature summarization — the system believed the context was full when only ~40% of the actual token budget was consumed.
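The relationship between an overcount factor and the point at which summarization fires can be worked out directly. The sketch below assumes summarization triggers exactly when the estimated count reaches the budget; a real system may use a lower threshold, and the function name is hypothetical.

```python
def actual_utilization_at_trigger(overcount_factor: float) -> float:
    """Fraction of the real budget used when the estimate reaches 100%."""
    return 1.0 / overcount_factor


# The two factors reported in the issue.
print(f"{actual_utilization_at_trigger(2.4):.0%}")
print(f"{actual_utilization_at_trigger(4.6):.0%}")
```

Under that assumption, a 2.4x overcount corresponds to roughly 42% real utilization at the trigger point, which is close to the ~40% reported here, while the combined 4.6x factor would correspond to roughly 22%.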

Expected Behavior

tool_use and tool_result blocks should be normalized to compact JSON (or at minimum use json.dumps(item)) before measuring character length:

import json

elif item.get("type") in ("tool_use", "tool_result"):
    # Normalize to compact JSON to avoid repr() inflation
    total_chars += len(json.dumps(item, separators=(",", ":")))
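The choice of separators matters when comparing the two options above. With its default separators, json.dumps inserts a space after every comma and colon, so for the reproducer block it yields the same length as repr(); only the compact form is shorter. The sketch also shows that the default ensure_ascii=True escapes non-ASCII characters, which can make JSON longer than repr() for such text.

```python
import json

block = {
    "type": "tool_use",
    "id": "toolu_01AbCdEf",
    "name": "search_memories",
    "input": {"query": "recent events"},
}

default_len = len(json.dumps(block))                         # ", " and ": "
compact_len = len(json.dumps(block, separators=(",", ":")))  # "," and ":"
repr_len = len(repr(block))
print(default_len, compact_len, repr_len)

# Non-ASCII text: the default escapes "é" as \u00e9 (six characters).
accented = {"q": "café"}
escaped_len = len(json.dumps(accented, separators=(",", ":")))
raw_len = len(json.dumps(accented, separators=(",", ":"), ensure_ascii=False))
print(escaped_len, raw_len)
```

Whether ensure_ascii=False is appropriate depends on how closely the character count should track what the provider actually tokenizes; the issue does not discuss this point.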

Additional Notes

This issue also affects tool_result content blocks, which can contain nested content arrays with text and image items.
The fix should be consistent with how text blocks are handled: count only meaningful content, not Python object repr overhead.
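For nested tool_result content, serializing the whole block and extracting only the text items can give very different character counts, especially when an image payload is present. The sketch below uses a placeholder base64 payload and a hypothetical helper, text_chars; how image items should be costed is left open, since images still consume tokens and would need separate accounting.

```python
import json

tool_result = {
    "type": "tool_result",
    "tool_use_id": "toolu_01AbCdEf",
    "content": [
        {"type": "text", "text": "Screenshot captured."},
        {"type": "image",
         "source": {"type": "base64", "media_type": "image/png",
                    "data": "A" * 4000}},  # placeholder payload, not a real image
    ],
}


def text_chars(result: dict) -> int:
    """Sum the lengths of nested text items only (hypothetical helper)."""
    inner = result.get("content", "")
    if isinstance(inner, str):
        return len(inner)
    return sum(
        len(item.get("text", ""))
        for item in inner
        if isinstance(item, dict) and item.get("type") == "text"
    )


whole = len(json.dumps(tool_result, separators=(",", ":")))
print(whole, text_chars(tool_result))
```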

extent analysis

Fix Plan

Step 1: Update `count_tokens_approximately()` in `langchain_core/messages/utils.py`

import json  # module-level import, if not already present

# ...

# New branch for Anthropic tool blocks:
elif item.get("type") in ("tool_use", "tool_result"):
    # Normalize to compact JSON to avoid repr() inflation
    total_chars += len(json.dumps(item, separators=(",", ":")))

Step 2: Add `tool_use` and `tool_result` content block handling

# Resulting branch order inside the content-block loop (simplified):
# ...

elif item.get("type") == "text":
    total_chars += len(item.get("text", ""))
elif item.get("type") in ("tool_use", "tool_result"):
    # Handle tool_use and tool_result content blocks
    total_chars += len(json.dumps(item, separators=(",", ":")))
else:
    # Other unknown block types still fall back to repr()
    total_chars += len(repr(item))

Verification

Run the reproducer code with the updated count_tokens_approximately() function.
Verify that the approximate token count for the reproducer decreases and tracks the compact JSON length of the block rather than its repr() length.
Test with various input messages containing tool_use and tool_result content blocks.

Extra Tips

To prevent regressions, ensure that the fix is consistent with how text blocks are handled.
Consider adding additional logging or debugging statements to verify the correctness of the fix.
If you're using a CI/CD pipeline, add a test case to verify the fix and prevent regressions.
