langchain: `create_agent` model node uses `trace=False`, silently preventing `BaseCallbackHandler` from receiving `on_llm_start`/`on_llm_end` callbacks

GitHub stats
langchain-ai/langchain#36477 · Fetched 2026-04-08 02:33:23
Comments: 0 · Participants: 1 · Timeline: 5 · Reactions: 0
Timeline (top): labeled ×3, closed ×1, issue_type_added ×1


Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Related Issues / PRs

No response

Reproduction Steps / Example Code (Python)

from langchain.agents import create_agent
from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI
 
 
class DebugHandler(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"[chain_start] {serialized.get('name', 'unknown')}")
 
    def on_chain_end(self, outputs, **kwargs):
        print("[chain_end]")
 
    def on_llm_start(self, serialized, prompts, **kwargs):
        print(f"[llm_start] {serialized.get('name', 'unknown')}")  # ← NEVER CALLED
 
    def on_llm_end(self, response, **kwargs):
        usage = getattr(response.generations[0][0].message, "usage_metadata", None)
        print(f"[llm_end] tokens={usage}")  # ← NEVER CALLED
 
 
agent = create_agent(
    model=ChatOpenAI(model="gpt-4o"),
    tools=[],
    system_prompt="Say hello.",
)
 
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Hi"}]},
    config={"callbacks": [DebugHandler()]},
)

Error Message and Stack Trace (if applicable)

No error is thrown. The callbacks are **silently dropped** — `on_llm_start` and `on_llm_end` are never called, while `on_chain_start` and `on_chain_end` work correctly.
 

# Expected output:
[chain_start] LangGraph
[llm_start] ChatOpenAI          ← expected but missing
[llm_end] tokens={...}          ← expected but missing
[chain_end]
 
# Actual output:
[chain_start] LangGraph          ← graph-level trace works
[chain_end]

Description

create_agent registers the model node with trace=False in factory.py:

# langchain/agents/factory.py
graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=False))

When trace=False, RunnableCallable.ainvoke skips callback context propagation entirely:

# langgraph/_internal/_runnable.py
async def ainvoke(self, input, config=None, **kwargs):
    if self.trace:
        # trace=True: sets up callback_manager, patch_config, set_config_context
        # → inner model.ainvoke() picks up callbacks from context variable
        ...
    else:
        # trace=False: directly calls the function, NO callback context setup
        ret = await self.afunc(*args, **kwargs)

Then inside _execute_model_async, the model is invoked without passing config:

# langchain/agents/factory.py
async def _execute_model_async(request: ModelRequest) -> ModelResponse:
    model_, effective_response_format = _get_bound_model(request)
    output = await model_.ainvoke(messages)  # ← no config → no callbacks

Since trace=False skipped set_config_context, and ainvoke has no explicit config, var_child_runnable_config is empty. All BaseCallbackHandler callbacks for LLM calls are silently dropped.

Call chain:

graph.astream(input, config={"callbacks": [handler]})
  → RunnableCallable(model_node, trace=False).ainvoke(state, config)
    → trace=False: directly calls amodel_node(state), skips set_config_context
      → _execute_model_async(request)
        → model_.ainvoke(messages)  # no config → no callbacks
          → on_llm_start / on_llm_end NEVER fire
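
The call chain above can be reproduced in miniature with only the standard library. This is a toy model of the mechanism described in the issue, not LangGraph's actual code: `current_config`, `fake_model_ainvoke`, `run_node`, and `collect_events` are hypothetical names, and the context variable merely stands in for `var_child_runnable_config`.

```python
import asyncio
import contextvars

# Hypothetical stand-in for langchain_core's var_child_runnable_config.
current_config = contextvars.ContextVar("current_config", default=None)


async def fake_model_ainvoke(messages, config=None):
    """Toy model call: uses the explicit config, else the ambient one."""
    effective = config if config is not None else current_config.get()
    callbacks = (effective or {}).get("callbacks", [])
    for cb in callbacks:
        cb("llm_start")
    for cb in callbacks:
        cb("llm_end")
    return "hello"


async def model_node(state):
    # Mirrors the issue: the inner call passes no config.
    return await fake_model_ainvoke(state["messages"])


async def run_node(func, state, config, *, trace):
    """Toy wrapper: only the traced path publishes config to the context."""
    if trace:
        token = current_config.set(config)
        try:
            return await func(state)
        finally:
            current_config.reset(token)
    # Untraced path: the function runs with whatever context already exists.
    return await func(state)


def collect_events(trace):
    """Run the toy node once and return the LLM events the callback saw."""
    events = []
    config = {"callbacks": [events.append]}
    asyncio.run(run_node(model_node, {"messages": ["Hi"]}, config, trace=trace))
    return events


print(collect_events(trace=True))   # ['llm_start', 'llm_end']
print(collect_events(trace=False))  # []
```

With `trace=False` the config never reaches the context variable, so the inner call finds no callbacks and nothing fires, which matches the silent drop reported here.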

Impact:

  • ❌ on_llm_start / on_llm_end never called → no token usage tracking via callbacks
  • ❌ Third-party observability tools (Langfuse, LangWatch, etc.) cannot see individual LLM calls within the model node
  • ❌ Any BaseCallbackHandler user loses LLM-level visibility
  • ✅ on_chain_start / on_chain_end still work (graph-level node execution)

Comparison with JS version:

The JS createAgent (langchainjs/libs/langchain/src/agents/index.ts, ReactAgent) does not have this issue. The JS implementation handles model invocation differently, and callback propagation works as expected. The trace=False behavior is specific to the Python factory.py.

Suggested fixes:

Option A — Add a parameter to create_agent:

def create_agent(model, tools, *, trace_model_node: bool = False, ...):
    graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=trace_model_node))

Option B — Keep trace=False for the wrapper but pass config to the inner model call:

async def _execute_model_async(request: ModelRequest) -> ModelResponse:
    output = await model_.ainvoke(messages, config=request.config)  # ← pass config

Option C — Always use trace=True for the model node.
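
How the options differ can be sketched with a standard-library toy. None of this is LangChain code: `ambient_config`, `toy_llm`, `make_node`, and `events_for` are hypothetical names. The `trace_model_node` flag imitates Options A and C (publish the config to the context), and `pass_config` imitates Option B (hand the config to the inner call explicitly).

```python
import asyncio
import contextvars

# Hypothetical stand-in for the ambient runnable config.
ambient_config = contextvars.ContextVar("ambient_config", default=None)


async def toy_llm(messages, config=None):
    """Toy model call: explicit config wins, else fall back to the context."""
    effective = config if config is not None else ambient_config.get()
    for cb in (effective or {}).get("callbacks", []):
        cb("llm_start")
        cb("llm_end")
    return "hello"


def make_node(*, trace_model_node=False, pass_config=False):
    """Build a toy model node with either fix switched on or off."""

    async def node(state, config):
        async def call():
            # Option B analogue: forward the config to the inner call.
            explicit = config if pass_config else None
            return await toy_llm(state["messages"], config=explicit)

        if trace_model_node:
            # Option A/C analogue: publish the config before calling.
            token = ambient_config.set(config)
            try:
                return await call()
            finally:
                ambient_config.reset(token)
        return await call()

    return node


def events_for(**options):
    """Run one toy node invocation and return the events observed."""
    events = []
    node = make_node(**options)
    asyncio.run(node({"messages": ["Hi"]}, {"callbacks": [events.append]}))
    return events


print(events_for())                       # []
print(events_for(trace_model_node=True))  # ['llm_start', 'llm_end']
print(events_for(pass_config=True))       # ['llm_start', 'llm_end']
```

In this toy both routes restore the events. Whether `request.config` is actually available inside `_execute_model_async`, as Option B assumes, is not confirmed by anything in this issue.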

Current workaround:

Use AgentMiddleware.after_model hook to extract state["messages"][-1].usage_metadata. This works because middleware runs as an independent graph node unaffected by trace=False. However, it cannot provide on_llm_start (no latency tracking), on_tool_start/end, or other events that BaseCallbackHandler is designed to capture.
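
A minimal sketch of the bookkeeping such a hook could do, kept free of LangChain imports so it runs on its own. `UsageTotals` is a hypothetical helper, and the `input_tokens` / `output_tokens` / `total_tokens` keys assume the usual `usage_metadata` shape on AI messages.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTotals:
    """Hypothetical accumulator for token usage across model calls."""

    input_tokens: int = 0
    output_tokens: int = 0
    total_tokens: int = 0
    calls: int = 0

    def record(self, state):
        """Add the usage of the latest message in state, if it reports any."""
        messages = state.get("messages") or []
        if not messages:
            return
        usage = getattr(messages[-1], "usage_metadata", None)
        if not usage:
            return
        self.input_tokens += usage.get("input_tokens", 0)
        self.output_tokens += usage.get("output_tokens", 0)
        self.total_tokens += usage.get("total_tokens", 0)
        self.calls += 1


# Stand-in for an AI message; a real one would come from the agent state.
fake_message = SimpleNamespace(
    usage_metadata={"input_tokens": 12, "output_tokens": 5, "total_tokens": 17}
)

totals = UsageTotals()
totals.record({"messages": [fake_message]})
totals.record({"messages": [SimpleNamespace(usage_metadata=None)]})  # ignored
print(totals)
```

Inside an AgentMiddleware subclass, the after_model hook would call `totals.record(state)` and return None, assuming the hook receives the agent state as its first argument. As noted above, this recovers token counts only, not start events or latency.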

Related:

  • before_agent_node also uses trace=False, potentially causing similar issues.
  • This does NOT affect users who build their own StateGraph — only create_agent is affected.

System Info

langchain: 1.2.0
langchain_core: 1.2.5
langgraph: 0.2.60
Python: 3.12
OS: Linux

extent analysis

TL;DR

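LLM-level callbacks are silently dropped because create_agent wraps the model node with trace=False and the inner model call receives no config. The proposed fixes change langchain itself: either expose a trace_model_node parameter on create_agent (Option A) or pass the config to the inner model call (Option B).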

Guidance

  1. Modify create_agent function: Add a parameter trace_model_node to create_agent to control the tracing behavior of the model node, as suggested in Option A.
  2. Pass config to inner model call: Alternatively, modify _execute_model_async to pass the config to the inner model call, as shown in Option B.
  3. Verify callback behavior: After applying the fix, rerun the reproduction and confirm that [llm_start] and [llm_end] lines appear between [chain_start] and [chain_end].
  4. Test with different scenarios: Repeat the check on both the sync path (agent.invoke) and the async path (ainvoke / astream), and with different models and tools.
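
For step 3, a small recorder makes the check mechanical rather than a matter of reading printed output. This is a duck-typed sketch that runs without LangChain installed; `EventRecorder` and its `missing` method are hypothetical, and in real use the class would subclass BaseCallbackHandler and be passed via config={"callbacks": [...]}.

```python
class EventRecorder:
    """Records callback names in the order they fire (duck-typed handler)."""

    def __init__(self):
        self.events = []

    def on_chain_start(self, serialized, inputs, **kwargs):
        self.events.append("chain_start")

    def on_chain_end(self, outputs, **kwargs):
        self.events.append("chain_end")

    def on_llm_start(self, serialized, prompts, **kwargs):
        self.events.append("llm_start")

    def on_llm_end(self, response, **kwargs):
        self.events.append("llm_end")

    def missing(self, required=("llm_start", "llm_end")):
        """Return the required event names that never fired."""
        return [name for name in required if name not in self.events]


# Simulate the buggy run: only chain-level events arrive.
buggy = EventRecorder()
buggy.on_chain_start({}, {})
buggy.on_chain_end({})
print(buggy.missing())  # ['llm_start', 'llm_end']

# Simulate a fixed run: LLM events arrive between the chain events.
fixed = EventRecorder()
fixed.on_chain_start({}, {})
fixed.on_llm_start({}, ["Hi"])
fixed.on_llm_end(None)
fixed.on_chain_end({})
print(fixed.missing())  # []
```

A regression test for the fix could then assert that `missing()` is empty after invoking the agent.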

Example

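# Sketch of Option A. Note the default here is True, whereas the issue's
# Option A proposes False; True would trace the model node for all callers.
def create_agent(model, tools, *, trace_model_node: bool = True, ...):
    graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=trace_model_node))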

Notes

The issue is specific to the Python implementation of create_agent and does not affect the JavaScript version. The fix should be applied to the langchain package.

Recommendation

Apply Option A by adding a trace_model_node parameter to create_agent, so users can control tracing of the model node. If the default stays False, as in the issue's proposal, existing behavior is unchanged unless the caller opts in; the trade-off is that LLM callbacks remain dropped by default.
