langchain - ✅(Solved) Fix Make ToolCallLimitMiddleware proactive via before_model hook [3 pull requests, 10 comments, 4 participants]

langchain • 2026-03-11 16:38:49

GitHub stats

langchain-ai/langchain#35766 • Fetched 2026-04-08 00:24:39

Timeline: commented ×10 · subscribed ×4 · cross-referenced ×3 · labeled ×3

Fix Action

Fixed

Fixed by PR: feat(langchain): add proactive warnings to ToolCallLimitMiddleware (https://github.com/langchain-ai/langchain/pull/35894)
Fixed by PR: feat(langchain): add proactive warnings to toolCallLimitMiddleware (https://github.com/langchain-ai/langchainjs/pull/10419)
Fixed by PR: feat(langchain): Added proactive behaviour to ToolCallLimitMiddleware (https://github.com/langchain-ai/langchain/pull/35922)

PR fix notes

PR #35894: feat(langchain): add proactive warnings to ToolCallLimitMiddleware

Repository: langchain-ai/langchain
Author: pawel-twardziak
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35894

Description (problem / solution / changelog)

Closes #35766

Add built-in proactive warnings to ToolCallLimitMiddleware so the LLM gets advance notice of its remaining tool call budget before limits are enforced reactively.

What changed

ToolCallLimitMiddleware was purely reactive - it only enforced limits in after_model after the LLM had already generated tool calls. The LLM had no upfront awareness, leading to wasted calls and abrupt enforcement.

This PR adds two new keyword-only params (proactive: bool = False, warning_threshold: int = 5) and implements wrap_model_call / awrap_model_call to inject an ephemeral HumanMessage warning via request.override(messages=...) when the remaining budget is at or below the threshold. The warning reaches the LLM but never enters conversation state.

Key design decision: uses wrap_model_call (not before_model) because before_model returns persistent state updates via add_messages reducer - repeated warnings would accumulate in history. wrap_model_call + request.override() keeps warnings ephemeral, following the same pattern as ContextEditingMiddleware.
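
The ephemeral-override idea can be sketched without LangChain itself. The block below is a toy model under stated assumptions: `FakeRequest`, `wrap_call_with_warning`, and `fake_model` are hypothetical stand-ins (not LangChain APIs), and messages are plain strings rather than message objects. It only illustrates the shape of the pattern described above: copy the request with an extra warning, hand the copy to the handler, and leave the stored messages untouched.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeRequest:
    """Hypothetical stand-in for a model request; it only carries messages."""

    messages: tuple

    def override(self, **changes):
        # Return a modified copy; the original request is left untouched.
        return replace(self, **changes)


def wrap_call_with_warning(request, handler, remaining, threshold=5):
    """Add a warning for this one call when the remaining budget is low."""
    if remaining <= threshold:
        warning = f"[Warning: {remaining} tool call(s) left.]"
        request = request.override(messages=request.messages + (warning,))
    return handler(request)


state = FakeRequest(messages=("Get all current affairs across the globe",))
seen_by_model = []


def fake_model(req):
    # Record what the "model" actually receives.
    seen_by_model.append(req.messages)
    return "ok"


result = wrap_call_with_warning(state, fake_model, remaining=3)
```

After the call, `seen_by_model[0]` holds two entries (the user message plus the warning) while `state.messages` still holds one, which is the property the PR relies on to keep warnings out of the conversation history.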

`warning_builder` for message localization

Added warning_builder: Callable[[int, str | None], str] | None = None keyword-only param to allow custom/localized warning messages. When None (default), the built-in English messages are used. Example:

def spanish_warnings(remaining: int, tool_name: str | None) -> str:
    tool_desc = f" para '{tool_name}'" if tool_name else ""
    if remaining <= 0:
        return f"[Aviso: Has agotado tu presupuesto de llamadas{tool_desc}.]"
    return f"[Aviso: Te quedan {remaining} llamada(s){tool_desc}.]"

limiter = ToolCallLimitMiddleware(
    run_limit=10,
    proactive=True,
    warning_builder=spanish_warnings,
)

Backward-compatible: proactive defaults to False and warning_builder defaults to None, so existing behavior is unchanged with zero overhead.
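
The fallback from a custom builder to the built-in messages might look roughly like the sketch below. This is an assumption-laden illustration, not the PR's code: `default_warning` and `build_warning` are hypothetical names, and the English wording is invented for the example because the PR's actual default text is not quoted here.

```python
from typing import Callable, Optional

# Same shape as the warning_builder parameter described above.
WarningBuilder = Callable[[int, Optional[str]], str]


def default_warning(remaining: int, tool_name: Optional[str]) -> str:
    # Hypothetical English wording; the real default text may differ.
    tool_desc = f" for '{tool_name}'" if tool_name else ""
    if remaining <= 0:
        return f"[Warning: tool call budget{tool_desc} is exhausted.]"
    return f"[Warning: {remaining} tool call(s){tool_desc} left.]"


def build_warning(
    remaining: int,
    tool_name: Optional[str] = None,
    warning_builder: Optional[WarningBuilder] = None,
) -> str:
    # Use the caller's builder when given, otherwise the built-in default.
    builder = warning_builder if warning_builder is not None else default_warning
    return builder(remaining, tool_name)
```

Passing a callable such as `spanish_warnings` as `warning_builder` would replace the default text entirely, while omitting it keeps the English messages.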

How I verified

15 new unit tests (11 proactive + 4 warning_builder) + 1 integration test via create_agent (all passing, 32 total in file)
Full middleware test suite: 571 passed, 0 failures
make lint and make format clean

Description 🤖 generated with Claude Code

Changed files

libs/langchain_v1/langchain/agents/middleware/tool_call_limit.py (modified, +155/-2)
libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_tool_call_limit.py (modified, +263/-0)

PR #10419: feat(langchain): add proactive warnings to toolCallLimitMiddleware

Repository: langchain-ai/langchainjs
Author: pawel-twardziak
State: open | merged: False
Link: https://github.com/langchain-ai/langchainjs/pull/10419

Description (problem / solution / changelog)

FEATURE PROPOSAL

Refers to langchain-ai/langchain#35766. JS port of langchain-ai/langchain#35894.

What changed

toolCallLimitMiddleware was purely reactive - it only enforced limits in afterModel after the LLM had already generated tool calls. The LLM had no upfront awareness, leading to wasted calls and abrupt enforcement.

This PR adds three new config options (proactive, warningThreshold, warningBuilder) and implements a wrapModelCall hook that injects an ephemeral HumanMessage warning when the remaining tool call budget is at or below the threshold. The warning reaches the LLM but never persists in conversation state.

Key design decision: uses wrapModelCall (not beforeModel) because beforeModel returns persistent state updates via the messages reducer - repeated warnings would accumulate in history. wrapModelCall keeps warnings ephemeral by modifying request.messages before handler(), following the same pattern as contextEditingMiddleware.

New options

Option	Type	Default	Description
proactive	boolean	false	Enable proactive warnings via wrapModelCall
warningThreshold	number	5	Remaining call count at which to start warning
warningBuilder	(remaining, toolName) => string	built-in	Custom warning message builder

warningBuilder for custom messages

const limiter = toolCallLimitMiddleware({
  runLimit: 10,
  proactive: true,
  warningBuilder: (remaining, toolName) => {
    const desc = toolName ? ` for '${toolName}'` : "";
    if (remaining <= 0) return `[Budget exhausted${desc}. Stop calling tools.]`;
    return `[${remaining} call(s) left${desc}. Plan carefully.]`;
  },
});

Backward-compatible: proactive defaults to false, so existing behavior is unchanged with zero overhead (no wrapModelCall hook is set).

How I verified

11 new tests (3 initialization + 7 unit + 1 integration via createAgent) - all passing (44 total in file)
Full middleware test suite: 260 passed, 0 failures across 15 test files
lint:fix and format

Test plan

proactive defaults to false - no wrapModelCall hook set
wrapModelCall hook exists when proactive: true
Negative warningThreshold throws
Warning injected when remaining <= threshold
No warning when remaining > threshold
Warning at exact threshold boundary
Exhausted message when remaining <= 0
Uses minimum of thread/run remaining
Tool-specific warnings include tool name
warningThreshold: 0 only warns at exhaustion
Custom warningBuilder called with correct args
Warning reaches model but does NOT persist in result.messages
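
The decision rules exercised by this test plan can be summarized in a short sketch. It is written in Python for consistency with the rest of this page rather than in TypeScript, and it is not the PR's implementation: `remaining_calls`, `should_warn`, and the `thread_limit` / `thread_count` / `run_count` parameter names are assumptions chosen for illustration.

```python
from typing import Optional


def remaining_calls(
    thread_limit: Optional[int],
    thread_count: int,
    run_limit: Optional[int],
    run_count: int,
) -> Optional[int]:
    """Return the tighter of the thread and run budgets, or None if unlimited."""
    budgets = []
    if thread_limit is not None:
        budgets.append(thread_limit - thread_count)
    if run_limit is not None:
        budgets.append(run_limit - run_count)
    return min(budgets) if budgets else None


def should_warn(remaining: Optional[int], warning_threshold: int = 5) -> bool:
    """Warn once the remaining budget is at or below the threshold."""
    if warning_threshold < 0:
        raise ValueError("warning_threshold must be non-negative")
    if remaining is None:
        return False
    return remaining <= warning_threshold
```

With a threshold of 0 the comparison only becomes true at `remaining <= 0`, which matches the "only warns at exhaustion" case in the list above.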

Changed files

.changeset/proactive-warnings-tool-call-limit.md (added, +7/-0)
libs/langchain/src/agents/middleware/tests/toolCallLimit.test.ts (modified, +394/-0)
libs/langchain/src/agents/middleware/toolCallLimit.ts (modified, +120/-3)

PR #35922: feat(langchain): Added proactive behaviour to ToolCallLimitMiddleware

Repository: langchain-ai/langchain
Author: 29swastik
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35922

Description (problem / solution / changelog)

Fixes #35766

Description of changes

Add proactive support to ToolCallLimitMiddleware: adds proactive and warning_threshold parameters to inject ephemeral limit-awareness messages before model calls, including initial context on the first call of each run and warnings as the budget runs low. Also extends warning_threshold to accept list[int] for discrete warning points.

Background

ToolCallLimitMiddleware was purely reactive: it acted only after the tool call limit had been reached. This has a few problems:

The LLM may use tools less effectively because it is not aware of the tool call limit from the beginning. When the limit is reached, we panic the LLM by suddenly sending a "tool call limit reached" message that it had no warning of.
Detecting the limit reactively leads to an additional tool call that never gets used.

Changes Summary

Adds two new keyword-only params (proactive: bool = False, warning_threshold: int = 5) and implements wrap_model_call / awrap_model_call to inject an ephemeral HumanMessage warning via request.override(messages=...) when the remaining-budget threshold condition is met.
Key design decision: uses wrap_model_call (not before_model) because before_model returns persistent state updates via add_messages reducer - repeated warnings would accumulate in history. wrap_model_call + request.override() keeps warnings ephemeral, following the same pattern as ContextEditingMiddleware.
At the beginning of each run, the LLM is given the details of the tool call limit so that it knows about the limit from the start; later, warning messages are injected according to warning_threshold to tell the LLM how much budget is left.
Backward-compatible: proactive defaults to False and warning_builder defaults to None, so existing behavior is unchanged.
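
One way to read the int-or-list threshold and the first-call context is sketched below. This is a guess at the semantics, not the PR's code: `threshold_hit` and `notice_for_call` are hypothetical names, the message wording is invented, and treating a list as "warn only when the remaining count equals one of the listed points" is an assumption about what "discrete warning points" means.

```python
from typing import List, Optional, Union


def threshold_hit(remaining: int, warning_threshold: Union[int, List[int]]) -> bool:
    if isinstance(warning_threshold, int):
        # Single threshold: warn at or below it.
        return remaining <= warning_threshold
    # Assumed list semantics: warn only at the listed remaining counts.
    return remaining in warning_threshold


def notice_for_call(
    is_first_call: bool,
    run_limit: int,
    remaining: int,
    warning_threshold: Union[int, List[int]],
) -> Optional[str]:
    """Return the text to inject before this model call, or None."""
    if is_first_call:
        # Initial context so the model knows the budget from the start.
        return f"[Info: this run may use up to {run_limit} tool call(s).]"
    if threshold_hit(remaining, warning_threshold):
        return f"[Warning: {remaining} tool call(s) left.]"
    return None
```

Under these assumptions, `warning_threshold=[5, 2]` produces a warning at exactly 5 and 2 calls left and stays silent in between, whereas `warning_threshold=5` warns on every call once 5 or fewer remain.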

How I verified the changes:

Added 28 new unit tests covering initial context injection, subsequent warning messages for int-based and list-based warning_threshold, backward compatibility checks, and the tool-name variant
Overall 41 tests in test_tool_call_limit.py -- all passed.
Ran make test, make format & make lint

This PR is built on top of @pawel-twardziak's initial PR. Thanks for the initial code setup.

Changed files

libs/langchain_v1/langchain/agents/middleware/tool_call_limit.py (modified, +266/-2)
libs/langchain_v1/tests/unit_tests/agents/middleware/implementations/test_tool_call_limit.py (modified, +473/-0)

Code Example

[
    HumanMessage(content='Get all current affairs across the globe'),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before model call
    AIMessage(content=...),
    ToolMessage(content=...),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before next model call
    AIMessage(content=...),
    ...
]

---

class ToolCallLimitMiddleware(BaseCallbackHandler):
    def __init__(
        self,
        max_tool_calls: int = 50,
        proactive: bool = True,  # New flag
        warning_threshold: int = 5  # Warn when N calls left
    ):
        self.max_tool_calls = max_tool_calls
        self.proactive = proactive
        self.warning_threshold = warning_threshold

RAW_BUFFER

Checked other resources

This is a feature request, not a bug report or usage question.
I added a clear and descriptive title that summarizes the feature request.
I used the GitHub search to find a similar feature request and didn't find it.
I checked the LangChain documentation and API reference to see if this feature already exists.
This is not related to the langchain-community package.

Package (Required)

Feature Description

The ToolCallLimitMiddleware is currently reactive: the limit logic is invoked only after the limit is reached. This has a few downsides:

The LLM may use tools less effectively because it is not aware of the tool call limit from the beginning. When the limit is reached, we panic the LLM by suddenly sending a "tool call limit reached" message that it had no warning of.
Detecting the limit reactively leads to an additional tool call that never gets used.
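
The wasted call can be made concrete with a toy simulation. The sketch below is purely illustrative: `simulate` is a hypothetical helper, not part of LangChain, and it optimistically assumes that a model which knows its budget stops requesting tools exactly at the limit.

```python
def simulate(desired_calls: int, limit: int, aware_of_limit: bool):
    """Return (executed, rejected) tool calls for one toy agent run."""
    executed = 0
    rejected = 0
    while executed + rejected < desired_calls:
        if aware_of_limit and executed >= limit:
            # A model that knows its budget stops asking for tools.
            break
        if executed >= limit:
            # Reactive enforcement: the call was generated, then blocked.
            rejected += 1
            break
        executed += 1
    return executed, rejected


reactive = simulate(desired_calls=5, limit=3, aware_of_limit=False)
proactive = simulate(desired_calls=5, limit=3, aware_of_limit=True)
```

In this toy model both runs execute three tool calls, but only the reactive run generates a fourth request that is then rejected.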

Use Case

An LLM can plan ahead if it is aware of the tool call limit, and MAYBE converge better. Compare this with how a human acts in real life: in a game, if we know how many tries or retries are left, we plan accordingly, right?

Proposed Solution

This can be implemented with the @before_model hook, which passes an additional message telling the LLM how many tool calls are left.

[
    HumanMessage(content='Get all current affairs across the globe'),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before model call
    AIMessage(content=...),
    ToolMessage(content=...),
    HumanMessage(content=f'You are left with {tool_calls_left} tool calls, plan accordingly'),  # inserted before next model call
    AIMessage(content=...),
    ...
]

If this approach makes sense, I'll be happy to raise the PR!

Alternatives Considered

The proposed solution requires no additional configuration through constructor parameters, and the remaining-tool-calls message is sent after each iteration. We can make a few enhancements to the proposed solution:

Introduce new attributes on the ToolCallLimitMiddleware class that let developers configure the behaviour of the remaining-tool-calls message sent to the LLM (as suggested in one of the comments below):

class ToolCallLimitMiddleware(BaseCallbackHandler):
    def __init__(
        self,
        max_tool_calls: int = 50,
        proactive: bool = True,  # New flag
        warning_threshold: int = 5  # Warn when N calls left
    ):
        self.max_tool_calls = max_tool_calls
        self.proactive = proactive
        self.warning_threshold = warning_threshold

The remaining-tool-calls message is sent as a HumanMessage; if this is not the right message type, we can consider ToolMessage.

extent analysis

Fix Plan

Implement `@before_model` Hook

To make the ToolCallLimitMiddleware proactive, we'll use the @before_model hook to pass the remaining tool calls to the LLM.

Step-by-Step Solution

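1. **Create a new middleware class** that handles the tool call reminder in its `before_model` hook:
   ```python
   from typing import Any

   from langchain.agents.middleware import AgentMiddleware
   from langchain_core.messages import HumanMessage


   class ToolCallLimitMiddleware(AgentMiddleware):
       def __init__(self, tool_calls_left: int) -> None:
           super().__init__()
           self.tool_calls_left = tool_calls_left

       def before_model(self, state: Any, runtime: Any) -> dict[str, Any]:
           # Updates returned here are merged into agent state by the messages
           # reducer, so the reminder is appended to (and stays in) the history.
           reminder = HumanMessage(
               content=f'You are left with {self.tool_calls_left} tool calls, plan accordingly'
           )
           return {'messages': [reminder]}
   ```

2. **Update the agent configuration** to use the new middleware. In LangChain v1, middleware is passed to `create_agent`:
   ```python
   from langchain.agents import create_agent

   # Create an agent that uses the ToolCallLimitMiddleware
   agent = create_agent(
       model='your-model-name',  # placeholder: replace with a real model identifier
       tools=[],
       middleware=[ToolCallLimitMiddleware(tool_calls_left=5)],  # adjust tool_calls_left as needed
   )
   ```

3. **Pass the remaining tool calls** to a ToolCallLimitMiddleware instance and call the hook directly to inspect the update it returns:
   ```python
   tool_calls_left = 5  # adjust the value as needed
   middleware = ToolCallLimitMiddleware(tool_calls_left)
   messages = [
       HumanMessage(content='Get all current affairs across the globe'),
       # AIMessage(content=...), ToolMessage(content=...), ...
   ]

   # Use the middleware to process the messages
   update = middleware.before_model({'messages': messages}, runtime=None)
   processed_messages = messages + update['messages']
   ```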


### Verification

To verify that the fix worked, you can:

1. **Monitor the LLM's behavior** and observe if it plans ahead when aware of the tool call limit.
2. **Check the logs** for any errors or unexpected behavior.
3. **Test the feature** with different scenarios and tool call limits.
