langchain - ✅(Solved) Fix Make ToolCallLimitMiddleware proactive via before_model hook [3 pull requests, 10 comments, 4 participants]

GitHub stats
langchain-ai/langchain#35766 · Fetched 2026-04-08 00:24:39
Comments: 10
Participants: 4
Timeline: 26
Reactions: 0
Timeline (top): commented ×10, subscribed ×4, cross-referenced ×3, labeled ×3

Fix Action

Fixed

PR fix notes

PR #35894: feat(langchain): add proactive warnings to ToolCallLimitMiddleware

Description (problem / solution / changelog)

Closes #35766

Add built-in proactive warnings to ToolCallLimitMiddleware so the LLM gets advance notice of its remaining tool call budget before limits are enforced reactively.

What changed

ToolCallLimitMiddleware was purely reactive - it only enforced limits in after_model after the LLM had already generated tool calls. The LLM had no upfront awareness, leading to wasted calls and abrupt enforcement.

This PR adds two new keyword-only params (proactive: bool = False, warning_threshold: int = 5) and implements wrap_model_call / awrap_model_call to inject an ephemeral HumanMessage warning via request.override(messages=...) when the remaining budget is at or below the threshold. The warning reaches the LLM but never enters conversation state.

Key design decision: uses wrap_model_call (not before_model) because before_model returns persistent state updates via add_messages reducer - repeated warnings would accumulate in history. wrap_model_call + request.override() keeps warnings ephemeral, following the same pattern as ContextEditingMiddleware.
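
To make the ephemeral-versus-persistent distinction concrete, here is a minimal stand-alone sketch. It does not import LangChain: FakeRequest, its override method, fake_model, and wrap_model_call_with_warning are hypothetical stand-ins that only mimic the shape of the hook described above, and messages are plain strings rather than message objects.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class FakeRequest:
    """Hypothetical stand-in for a model request; not a LangChain class."""

    messages: tuple[str, ...]

    def override(self, **changes) -> "FakeRequest":
        # Return a modified copy and leave the original request untouched.
        return replace(self, **changes)


def wrap_model_call_with_warning(
    request: FakeRequest,
    handler: Callable[[FakeRequest], str],
    remaining: int,
    warning_threshold: int = 5,
) -> str:
    """Append a warning for this one model call if the budget is low."""
    if remaining <= warning_threshold:
        # Assumed wording, for illustration only.
        warning = f"[Warning: {remaining} tool call(s) left.]"
        request = request.override(messages=request.messages + (warning,))
    return handler(request)


seen_by_model: list[tuple[str, ...]] = []


def fake_model(request: FakeRequest) -> str:
    # Record exactly what the "model" receives.
    seen_by_model.append(request.messages)
    return "ok"


state = FakeRequest(messages=("Get all current affairs across the globe",))
wrap_model_call_with_warning(state, fake_model, remaining=3)
```

After the call, seen_by_model[0] holds two messages while state.messages still holds one: the warning reached the model without being written back to the stored conversation.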

warning_builder for message localization

Added warning_builder: Callable[[int, str | None], str] | None = None keyword-only param to allow custom/localized warning messages. When None (default), the built-in English messages are used. Example:

def spanish_warnings(remaining: int, tool_name: str | None) -> str:
    tool_desc = f" para '{tool_name}'" if tool_name else ""
    if remaining <= 0:
        return f"[Aviso: Has agotado tu presupuesto de llamadas{tool_desc}.]"
    return f"[Aviso: Te quedan {remaining} llamada(s){tool_desc}.]"

limiter = ToolCallLimitMiddleware(
    run_limit=10,
    proactive=True,
    warning_builder=spanish_warnings,
)

Backward-compatible: proactive defaults to False and warning_builder defaults to None, so existing behavior is unchanged with zero overhead.
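
The fallback behaviour described here can be sketched in plain Python. This is an illustration under assumptions, not the PR's code: build_warning and default_warning are hypothetical names, and the default English wording is invented for the example since the built-in text is not quoted above.

```python
from typing import Callable, Optional

WarningBuilder = Callable[[int, Optional[str]], str]


def default_warning(remaining: int, tool_name: Optional[str]) -> str:
    # Hypothetical English wording; the library's built-in text may differ.
    tool_desc = f" for '{tool_name}'" if tool_name else ""
    if remaining <= 0:
        return f"[Warning: tool call budget{tool_desc} is exhausted.]"
    return f"[Warning: {remaining} tool call(s) left{tool_desc}.]"


def build_warning(
    remaining: int,
    tool_name: Optional[str] = None,
    warning_builder: Optional[WarningBuilder] = None,
) -> str:
    # Fall back to the default builder when no custom one is supplied.
    builder = warning_builder if warning_builder is not None else default_warning
    return builder(remaining, tool_name)


def spanish_warnings(remaining: int, tool_name: Optional[str]) -> str:
    # Same custom builder as in the PR description above.
    tool_desc = f" para '{tool_name}'" if tool_name else ""
    if remaining <= 0:
        return f"[Aviso: Has agotado tu presupuesto de llamadas{tool_desc}.]"
    return f"[Aviso: Te quedan {remaining} llamada(s){tool_desc}.]"
```

Passing warning_builder=spanish_warnings swaps the message text only; the decision about when to warn stays with the caller.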

How I verified

  • 15 new unit tests (11 proactive + 4 warning_builder) + 1 integration test via create_agent (all passing, 32 total in file)
  • Full middleware test suite: 571 passed, 0 failures
  • make lint and make format clean

Description 🤖 generated with Claude Code

Changed files

  • libs/langchain_v1/langchain/agents/middleware/tool_call_limit.py (modified, +155/-2)
  • libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_tool_call_limit.py (modified, +263/-0)

PR #10419: feat(langchain): add proactive warnings to toolCallLimitMiddleware

Description (problem / solution / changelog)

FEATURE PROPOSAL

Refers to langchain-ai/langchain#35766 JS port of langchain-ai/langchain#35894

What changed

toolCallLimitMiddleware was purely reactive - it only enforced limits in afterModel after the LLM had already generated tool calls. The LLM had no upfront awareness, leading to wasted calls and abrupt enforcement.

This PR adds three new config options (proactive, warningThreshold, warningBuilder) and implements a wrapModelCall hook that injects an ephemeral HumanMessage warning when the remaining tool call budget is at or below the threshold. The warning reaches the LLM but never persists in conversation state.

Key design decision: uses wrapModelCall (not beforeModel) because beforeModel returns persistent state updates via the messages reducer - repeated warnings would accumulate in history. wrapModelCall keeps warnings ephemeral by modifying request.messages before handler(), following the same pattern as contextEditingMiddleware.

New options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| proactive | boolean | false | Enable proactive warnings via wrapModelCall |
| warningThreshold | number | 5 | Remaining call count at which to start warning |
| warningBuilder | (remaining, toolName) => string | built-in | Custom warning message builder |

warningBuilder for custom messages

const limiter = toolCallLimitMiddleware({
  runLimit: 10,
  proactive: true,
  warningBuilder: (remaining, toolName) => {
    const desc = toolName ? ` for '${toolName}'` : "";
    if (remaining <= 0) return `[Budget exhausted${desc}. Stop calling tools.]`;
    return `[${remaining} call(s) left${desc}. Plan carefully.]`;
  },
});

Backward-compatible: proactive defaults to false, so existing behavior is unchanged with zero overhead (no wrapModelCall hook is set).

How I verified

  • 11 new tests (3 initialization + 7 unit + 1 integration via createAgent) - all passing (44 total in file)
  • Full middleware test suite: 260 passed, 0 failures across 15 test files
  • lint:fix and format

Test plan

  • proactive defaults to false - no wrapModelCall hook set
  • wrapModelCall hook exists when proactive: true
  • Negative warningThreshold throws
  • Warning injected when remaining <= threshold
  • No warning when remaining > threshold
  • Warning at exact threshold boundary
  • Exhausted message when remaining <= 0
  • Uses minimum of thread/run remaining
  • Tool-specific warnings include tool name
  • warningThreshold: 0 only warns at exhaustion
  • Custom warningBuilder called with correct args
  • Warning reaches model but does NOT persist in result.messages
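
Several items in this test plan (the threshold boundary, the minimum of thread and run budgets, and a threshold of 0) come down to a small amount of arithmetic. The sketch below shows one way that logic could look, written in Python for consistency with the other examples on this page. The function names and the thread_limit / thread_count / run_count parameters are assumptions for illustration, not the middleware's actual internals.

```python
from typing import Optional


def remaining_calls(
    thread_limit: Optional[int],
    run_limit: Optional[int],
    thread_count: int,
    run_count: int,
) -> Optional[int]:
    """Smallest remaining budget across the configured limits, or None."""
    candidates = []
    if thread_limit is not None:
        candidates.append(thread_limit - thread_count)
    if run_limit is not None:
        candidates.append(run_limit - run_count)
    return min(candidates) if candidates else None


def should_warn(remaining: Optional[int], warning_threshold: int) -> bool:
    """Warn when the remaining budget is at or below the threshold."""
    if warning_threshold < 0:
        raise ValueError("warning_threshold must be non-negative")
    return remaining is not None and remaining <= warning_threshold
```

With a threshold of 0, should_warn returns True only once the remaining budget is 0 or less, which matches the "only warns at exhaustion" case in the list.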

Changed files

  • .changeset/proactive-warnings-tool-call-limit.md (added, +7/-0)
  • libs/langchain/src/agents/middleware/tests/toolCallLimit.test.ts (modified, +394/-0)
  • libs/langchain/src/agents/middleware/toolCallLimit.ts (modified, +120/-3)

PR #35922: feat(langchain): Added proactive behaviour to ToolCallLimitMiddleware

Description (problem / solution / changelog)

Fixes #35766

Description of changes

Add proactive support to ToolCallLimitMiddleware: adds proactive and warning_threshold parameters to inject ephemeral limit-awareness messages before model calls, including initial context on the first call of each run and warnings as the budget runs low. Also extends warning_threshold to accept list[int] for discrete warning points.

Background

ToolCallLimitMiddleware was purely reactive: it only acted after the tool call limit had been reached. This has a few problems:

  1. The LLM may use tools less effectively because it is not aware of the tool call limit from the beginning. When the limit is reached, the LLM is abruptly sent a "tool call limit reached" message it had no warning of.

  2. Detecting the limit reactively leads to an additional tool call being generated that never gets used.

Changes Summary

  • Two new keyword-only params (proactive: bool = False, warning_threshold: int = 5) and implements wrap_model_call / awrap_model_call to inject an ephemeral HumanMessage warning via request.override(messages=...) when the remaining threshold condition is met.
  • Key design decision: uses wrap_model_call (not before_model) because before_model returns persistent state updates via add_messages reducer - repeated warnings would accumulate in history. wrap_model_call + request.override() keeps warnings ephemeral, following the same pattern as ContextEditingMiddleware.
  • At the beginning of each run, the LLM is given the details of the tool call limit so that it knows the limit from the start; later, based on warning_threshold, warning messages are injected to inform the LLM of the remaining budget.
  • Backward-compatible: proactive defaults to False and warning_builder defaults to None, so existing behavior is unchanged.
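
As a rough illustration of the int-or-list threshold and the first-call context described in this PR, the following plain-Python sketch shows one possible interpretation. The exact semantics of the list form are not spelled out above, so treating the list as "warn only when the remaining count equals one of the points" is an assumption, and threshold_met and message_for_call are hypothetical helper names with invented message wording.

```python
from typing import Optional, Union


def threshold_met(remaining: int, warning_threshold: Union[int, list[int]]) -> bool:
    if isinstance(warning_threshold, int):
        # Single threshold: warn at or below it.
        return remaining <= warning_threshold
    # Assumed list semantics: warn only at the listed discrete points.
    return remaining in warning_threshold


def message_for_call(
    is_first_call: bool,
    run_limit: int,
    remaining: int,
    warning_threshold: Union[int, list[int]],
) -> Optional[str]:
    """Pick the ephemeral message, if any, for the upcoming model call."""
    if is_first_call:
        # Initial context so the model knows the budget from the start.
        return f"[Info: this run may make at most {run_limit} tool call(s).]"
    if threshold_met(remaining, warning_threshold):
        return f"[Warning: {remaining} tool call(s) left.]"
    return None
```

With warning_threshold=[10, 3, 1], a run would see the initial context once and then a warning only when 10, 3, or 1 calls remain, instead of a warning on every call below a single cutoff.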

How I verified the changes:

  • Added 28 new unit tests covering initial context injection, subsequent warning messages for int- and list-based warning_threshold, backward compatibility checks, and the tool-name variant
  • Overall 41 tests in test_tool_call_limit.py -- all passed.
  • Ran make test, make format & make lint

This PR is built on top of @pawel-twardziak's initial PR. Thanks for the initial code setup.

Changed files

  • libs/langchain_v1/langchain/agents/middleware/tool_call_limit.py (modified, +266/-2)
  • libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_tool_call_limit.py (modified, +473/-0)

Code Example

[
    HumanMessage(content='Get all current affairs across the globe'),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before model call
    AIMessage(content=...),
    ToolMessage(content=...),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before next model call
    AIMessage(content=...),
    ...
]

---

class ToolCallLimitMiddleware(BaseCallbackHandler):
    def __init__(
        self,
        max_tool_calls: int = 50,
        proactive: bool = True,  # New flag
        warning_threshold: int = 5  # Warn when N calls left
    ):
        self.max_tool_calls = max_tool_calls
        self.proactive = proactive
        self.warning_threshold = warning_threshold

Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

The ToolCallLimitMiddleware is currently reactive, meaning the limit logic is invoked only after the limit has been reached. This has a few downsides:

  1. The LLM may use tools less effectively because it is not aware of the tool call limit from the beginning. When the limit is reached, the LLM is abruptly sent a "tool call limit reached" message it had no warning of.
  2. Detecting the limit reactively leads to an additional tool call being generated that never gets used.

Use Case

The LLM can plan ahead if it is aware of the tool call limit and MAYBE converge better. Compare this with how a human acts: in a game, if we know how many tries or retries are left, we plan accordingly.

Proposed Solution

This can be implemented using the @before_model hook, where an additional message is passed that tells the LLM how many tool calls are left.

[
    HumanMessage(content='Get all current affairs across the globe'),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before model call
    AIMessage(content=...),
    ToolMessage(content=...),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before next model call
    AIMessage(content=...),
    ...
]
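
The message list above implies that each warning becomes part of the stored history. A small plain-Python simulation makes the consequence visible. It uses strings instead of LangChain message objects, and run_loop_with_persistent_warnings is a hypothetical helper written only for this illustration.

```python
def run_loop_with_persistent_warnings(run_limit: int, iterations: int) -> list[str]:
    """Simulate an agent loop where every warning is saved into history."""
    history = ["human: Get all current affairs across the globe"]
    for used in range(iterations):
        tool_calls_left = run_limit - used
        # A before_model-style hook returns a state update, so the warning is
        # appended to the stored history rather than shown once and dropped.
        history.append(
            f"human: You are left with {tool_calls_left} tool calls, plan accordingly"
        )
        history.append("ai: <tool call>")
        history.append("tool: <tool result>")
    return history


history = run_loop_with_persistent_warnings(run_limit=5, iterations=3)
warnings = [m for m in history if "You are left with" in m]
```

After three iterations the history contains three separate warnings with three different counts, so the model sees stale budget messages alongside the current one. This accumulation is the reason the PRs listed on this page inject the warning through wrap_model_call instead.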

If this approach makes sense I'll be happy to raise the PR!

Alternatives Considered

The proposed solution requires no additional configuration through constructor parameters, and the remaining-tool-calls message is sent after each iteration. We can make a few enhancements to the proposed solution:

  1. Introduce new attributes on the ToolCallLimitMiddleware class that let developers configure the behaviour of the remaining-tool-calls message sent to the LLM (as suggested in one of the comments below):
class ToolCallLimitMiddleware(BaseCallbackHandler):
    def __init__(
        self,
        max_tool_calls: int = 50,
        proactive: bool = True,  # New flag
        warning_threshold: int = 5  # Warn when N calls left
    ):
        self.max_tool_calls = max_tool_calls
        self.proactive = proactive
        self.warning_threshold = warning_threshold
  2. The remaining-tool-calls message is sent as a HumanMessage; if this is not the right message type, we can consider ToolMessage.

extent analysis

Fix Plan

Implement @before_model Hook

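To make the ToolCallLimitMiddleware proactive, this plan uses the @before_model hook to pass the remaining tool call count to the LLM. Note that the PRs above chose wrap_model_call instead, because messages returned from before_model persist in conversation history and repeated warnings would accumulate.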

Step-by-Step Solution

  1. Create a new class that will handle the tool call limit logic:

```python
from langchain_core.messages import HumanMessage


class ToolCallLimitMiddleware:
    def __init__(self, tool_calls_left):
        self.tool_calls_left = tool_calls_left

    def __call__(self, messages):
        # Insert the remaining tool calls message before the model call
        messages.insert(1, HumanMessage(content=f'You are left with {self.tool_calls_left} tool calls, plan accordingly'))
        return messages
```

  2. Update the langchain configuration to use the middleware:

```python
from langchain.agents import create_agent
from langchain.agents import middleware as builtin_middleware

# create_agent only accepts AgentMiddleware subclasses, so the built-in
# limiter is registered here; the plain class from step 1 is a standalone
# illustration and is exercised directly in step 3.
agent = create_agent(
    model='your-model-name',  # placeholder model name
    tools=[],
    middleware=[builtin_middleware.ToolCallLimitMiddleware(run_limit=5)],  # adjust the limit as needed
)
```

  3. Pass the remaining tool calls to the ToolCallLimitMiddleware instance from step 1:

```python
from langchain_core.messages import AIMessage, HumanMessage, ToolMessage

tool_calls_left = 5  # adjust the value as needed
middleware = ToolCallLimitMiddleware(tool_calls_left)
messages = [
    HumanMessage(content='Get all current affairs across the globe'),
    AIMessage(content=...),
    ToolMessage(content=...),
    # ...
]

# Use the middleware to process the messages
processed_messages = middleware(messages)
```


Verification

To verify that the fix worked, you can:

  1. Monitor the LLM's behavior and observe whether it plans ahead when aware of the tool call limit.
  2. Check the logs for any errors or unexpected behavior.
  3. Test the feature with different scenarios and tool call limits.
