langchain: Add tool_call_id to on_tool_start event's data

### Submission checklist

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

### Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

### Feature Description

The on_tool_start stream event currently exposes name, tags, run_id, metadata, parent_ids, and data.input — but not tool_call_id.

LangChain already tracks tool_call_id for each tool run: BaseTool.arun passes it to the callback manager, _AstreamEventsCallbackHandler stores it in run_info during on_tool_start (since #33731), and it's surfaced on on_tool_error.data (also #33731). The data is right there; it's just not emitted on on_tool_start.

```python
import asyncio

from langchain.agents import create_agent
from langchain_core.tools import tool


@tool
def slow_lookup(query: str) -> str:
    """A tool that takes a bit of time so we can observe its start event."""
    return f"result for {query}"


# `model=...` is a placeholder: pass a chat model that supports tool calling.
agent = create_agent(model=..., tools=[slow_lookup])


async def demonstrate_issue():
    async for event in agent.astream_events(
        {"messages": "look up alpha and beta in parallel"},
        version="v2",
    ):
        if event["event"] == "on_tool_start":
            # event["data"] has "input" but no "tool_call_id"
            assert "tool_call_id" not in event["data"]


asyncio.run(demonstrate_issue())
```

### Use Case

Streaming UI consumers building per-tool-call panels need to attribute every on_tool_* event to the originating tool call as soon as the tool starts — not wait for on_tool_end / on_tool_error. This matters especially under parallel tool calls of the same tool (e.g., a supervisor agent dispatching several sub-agents of the same kind in one assistant turn), where name alone isn't sufficient to disambiguate which streamed events belong to which call, and falling back to run_id/parent traversal is fragile.

on_tool_error already exposes tool_call_id in its event data (added in #33731). Doing the same for on_tool_start is the natural parity.
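
To make the use case concrete, here is a minimal sketch of a consumer that groups tool events into per-call panels. It runs on hand-written sample events, not on real LangChain output: the `tool_call_id` key on `on_tool_start` is the proposed field, and the ids, run ids, and the `group_by_tool_call` helper are made up for illustration.

```python
from collections import defaultdict
from typing import Any

# Hand-written sample events in the astream_events v2 shape. The
# "tool_call_id" key on on_tool_start is the PROPOSED field, and the
# ids and run_ids below are invented for this example.
SAMPLE_EVENTS: list[dict[str, Any]] = [
    {"event": "on_tool_start", "name": "slow_lookup", "run_id": "run-1",
     "data": {"input": {"query": "alpha"}, "tool_call_id": "call_alpha"}},
    {"event": "on_tool_start", "name": "slow_lookup", "run_id": "run-2",
     "data": {"input": {"query": "beta"}, "tool_call_id": "call_beta"}},
    {"event": "on_chat_model_stream", "name": "model", "run_id": "run-9",
     "data": {}},
    {"event": "on_tool_end", "name": "slow_lookup", "run_id": "run-2",
     "data": {"output": "result for beta"}},
    {"event": "on_tool_end", "name": "slow_lookup", "run_id": "run-1",
     "data": {"output": "result for alpha"}},
]


def group_by_tool_call(events: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group on_tool_* event names into panels keyed by tool_call_id."""
    run_to_call: dict[str, str] = {}  # run_id -> tool_call_id, learned at start
    panels: dict[str, list[str]] = defaultdict(list)
    for event in events:
        if not event["event"].startswith("on_tool_"):
            continue  # only tool events belong in a tool-call panel
        call_id = event["data"].get("tool_call_id")
        if call_id is None:
            # Later events of the same run reuse the id seen at start.
            call_id = run_to_call.get(event["run_id"], "unattributed")
        else:
            run_to_call[event["run_id"]] = call_id
        panels[call_id].append(event["event"])
    return dict(panels)


panels = group_by_tool_call(SAMPLE_EVENTS)
```

Both calls share the name `slow_lookup`, so the name cannot separate them. With the id present at start, each panel can be opened immediately and later events of the same run are routed to it.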

### Proposed Solution

Mirror the on_tool_error change from #33731 on the on_tool_start send:

```python
# libs/core/langchain_core/tracers/event_stream.py — in on_tool_start
self._send(
    {
        "event": "on_tool_start",
        "data": {
            "input": inputs or {},
            "tool_call_id": kwargs.get("tool_call_id"),   # ← new
        },
        "name": name_,
        "tags": tags or [],
        "run_id": str(run_id),
        "metadata": metadata or {},
        "parent_ids": self._get_parent_ids(run_id),
    },
    "tool",
)
```

The kwarg is already passed in (it's used at line 311 to populate run_info); this just surfaces it on the emitted event too.
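
The intended behaviour can be sketched with a toy handler. `MiniToolEventHandler` is a hypothetical stand-in written for this example, not the real `_AstreamEventsCallbackHandler`; it only mirrors the two steps described above (store the id in `run_info`, then emit it) under the assumption that `tool_call_id` arrives as a keyword argument.

```python
from typing import Any, Optional
from uuid import UUID, uuid4


class MiniToolEventHandler:
    """Toy stand-in for the event-stream handler; NOT the real LangChain class."""

    def __init__(self) -> None:
        self.run_info: dict[UUID, dict[str, Any]] = {}
        self.sent: list[dict[str, Any]] = []

    def on_tool_start(self, name: str, inputs: Optional[dict[str, Any]],
                      run_id: UUID, **kwargs: Any) -> None:
        # None when the caller supplied no tool_call_id keyword argument.
        tool_call_id = kwargs.get("tool_call_id")
        # Per the issue, the id is already remembered for later events.
        self.run_info[run_id] = {"name": name, "tool_call_id": tool_call_id}
        self.sent.append({
            "event": "on_tool_start",
            # Proposed change: surface the id on the start event as well.
            "data": {"input": inputs or {}, "tool_call_id": tool_call_id},
            "name": name,
            "run_id": str(run_id),
        })

    def on_tool_error(self, error: BaseException, run_id: UUID) -> None:
        info = self.run_info.pop(run_id)
        self.sent.append({
            "event": "on_tool_error",
            "data": {"error": error, "tool_call_id": info["tool_call_id"]},
            "name": info["name"],
            "run_id": str(run_id),
        })


handler = MiniToolEventHandler()
run_id = uuid4()
handler.on_tool_start("slow_lookup", {"query": "alpha"}, run_id,
                      tool_call_id="call_alpha")
handler.on_tool_error(ValueError("boom"), run_id)
start_event, error_event = handler.sent
```

In this sketch the start and error events carry the same id, which is the parity the request asks for. A consumer should still tolerate `None`, since the value comes from an optional keyword argument.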

### Alternatives Considered

  • Reading the streamed metadata field. This is possible if the backend manually splices tool_call_id into the tool's runtime config metadata via a BaseTool subclass that overrides ainvoke. It works, but every downstream agent author has to reinvent the same workaround. The data should be first-class on the event.
  • Walking parent_ids / run_id to recover the originating tool call. This is fragile: LangGraph collapses parent relationships, and run_id is separate from tool_call_id.
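
For comparison, the consumer side of the metadata workaround might look like the sketch below. The metadata key name and the `resolve_tool_call_id` helper are assumptions made for this example; nothing in LangChain defines them, and each backend applying the workaround would pick its own key.

```python
from typing import Any, Optional

# Hypothetical metadata key a backend might choose for the workaround;
# nothing in LangChain defines this name.
WORKAROUND_METADATA_KEY = "tool_call_id"


def resolve_tool_call_id(event: dict[str, Any]) -> Optional[str]:
    """Return the tool call id for a stream event, or None if unrecoverable."""
    # 1. First-class field (on on_tool_error today, proposed for on_tool_start).
    call_id = event.get("data", {}).get("tool_call_id")
    if call_id is not None:
        return call_id
    # 2. Workaround: id spliced into the run's metadata by the backend.
    return event.get("metadata", {}).get(WORKAROUND_METADATA_KEY)


first_class = {"event": "on_tool_start", "metadata": {},
               "data": {"input": {"query": "alpha"},
                        "tool_call_id": "call_alpha"}}
via_metadata = {"event": "on_tool_start",
                "metadata": {"tool_call_id": "call_beta"},
                "data": {"input": {"query": "beta"}}}
unknown = {"event": "on_tool_start", "metadata": {}, "data": {"input": {}}}
```

The fallback only helps when the backend and the consumer agree on the key, which is the coordination cost the first alternative describes.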

### Additional Context

Parallel to #33597 / #33731. Same justification (stateless agents, streaming-UI tool-call mapping), same one-line surface fix.
