langchain: `create_agent` model node uses `trace=False`, silently preventing `BaseCallbackHandler` from receiving `on_llm_start`/`on_llm_end` callbacks

langchain · 2026-04-03 02:15:00

GitHub stats

langchain-ai/langchain#36477 • Fetched 2026-04-08 02:33:23

Author

rayhpeng

Timeline (top)

labeled ×3, closed ×1, issue_type_added ×1

create_agent registers the model node with trace=False in factory.py:

# langchain/agents/factory.py
graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=False))

When trace=False, RunnableCallable.ainvoke skips callback context propagation entirely:

# langgraph/_internal/_runnable.py
async def ainvoke(self, input, config=None, **kwargs):
    if self.trace:
        # trace=True: sets up callback_manager, patch_config, set_config_context
        # → inner model.ainvoke() picks up callbacks from context variable
        ...
    else:
        # trace=False: directly calls the function, NO callback context setup
        ret = await self.afunc(*args, **kwargs)

Then inside _execute_model_async, the model is invoked without passing config:

# langchain/agents/factory.py
async def _execute_model_async(request: ModelRequest) -> ModelResponse:
    model_, effective_response_format = _get_bound_model(request)
    output = await model_.ainvoke(messages)  # ← no config → no callbacks

Since trace=False skipped set_config_context, and ainvoke has no explicit config, var_child_runnable_config is empty. All BaseCallbackHandler callbacks for LLM calls are silently dropped.
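
The mechanism described above can be modeled with the standard library alone. The sketch below is a toy, not LangGraph code: `current_config`, `ToyNode`, and `toy_model_call` are hypothetical names standing in for `var_child_runnable_config`, `RunnableCallable`, and the chat model call. It only illustrates how an inner call that relies on a context variable sees no callbacks when the wrapper never sets it.

```python
import asyncio
import contextvars

# Hypothetical stand-in for the context variable that carries the active config.
current_config = contextvars.ContextVar("current_config", default=None)


async def toy_model_call(prompt, config=None):
    """Fire 'callbacks' from the explicit config, else from the context variable."""
    effective = config if config is not None else current_config.get()
    callbacks = (effective or {}).get("callbacks", [])
    for cb in callbacks:
        cb("llm_start")
    for cb in callbacks:
        cb("llm_end")
    return f"echo: {prompt}"


class ToyNode:
    """Hypothetical wrapper that mimics the trace / no-trace branches."""

    def __init__(self, afunc, trace):
        self.afunc = afunc
        self.trace = trace

    async def ainvoke(self, value, config=None):
        if self.trace:
            # Traced branch: publish the config so nested calls can find it.
            token = current_config.set(config)
            try:
                return await self.afunc(value)
            finally:
                current_config.reset(token)
        # Untraced branch: call the function directly, context variable untouched.
        return await self.afunc(value)


def run(trace):
    events = []
    node = ToyNode(toy_model_call, trace=trace)
    asyncio.run(node.ainvoke("Hi", config={"callbacks": [events.append]}))
    return events


traced_events = run(trace=True)
untraced_events = run(trace=False)
print(traced_events)    # ['llm_start', 'llm_end']
print(untraced_events)  # []
```

In this toy model the untraced branch produces no events because nothing ever publishes the config and the inner call receives no explicit `config` argument, which is the combination the report describes.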

Error Message and Stack Trace (if applicable)

No error is thrown. The callbacks are silently dropped — on_llm_start and on_llm_end are never called, while on_chain_start and on_chain_end work correctly.

Current workaround:

Use the AgentMiddleware.after_model hook to read state["messages"][-1].usage_metadata. This works because middleware runs as an independent graph node that is unaffected by trace=False. However, it cannot provide on_llm_start (so no latency tracking), on_tool_start/on_tool_end, or the other events that BaseCallbackHandler is designed to capture.
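
As a rough illustration of what such a hook might do with the last message, the helper below accumulates token counts from `usage_metadata`. It is a standard-library sketch: `accumulate_usage` and the `SimpleNamespace` messages are hypothetical stand-ins, and the `input_tokens` / `output_tokens` / `total_tokens` keys are assumed to follow the usual usage-metadata shape. In a real agent the same logic would be called from the `after_model` hook with the agent state.

```python
from types import SimpleNamespace


def accumulate_usage(state, totals):
    """Add the last message's token usage (if any) to a running totals dict."""
    messages = state.get("messages", [])
    if not messages:
        return totals
    # Messages without usage information contribute nothing.
    usage = getattr(messages[-1], "usage_metadata", None) or {}
    for key in ("input_tokens", "output_tokens", "total_tokens"):
        totals[key] = totals.get(key, 0) + usage.get(key, 0)
    return totals


# Hypothetical state, shaped like the agent state the hook would receive.
state = {
    "messages": [
        SimpleNamespace(content="Hi", usage_metadata=None),
        SimpleNamespace(
            content="Hello!",
            usage_metadata={"input_tokens": 12, "output_tokens": 3, "total_tokens": 15},
        ),
    ]
}

totals = accumulate_usage(state, {})
print(totals)  # {'input_tokens': 12, 'output_tokens': 3, 'total_tokens': 15}
```

This only recovers token counts after the fact; it gives no start timestamp, which is why the workaround cannot replace on_llm_start for latency tracking.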

Code Example

from langchain.agents import create_agent
from langchain_core.callbacks import BaseCallbackHandler
from langchain_openai import ChatOpenAI
 
 
class DebugHandler(BaseCallbackHandler):
    def on_chain_start(self, serialized, inputs, **kwargs):
        print(f"[chain_start] {serialized.get('name', 'unknown')}")
 
    def on_chain_end(self, outputs, **kwargs):
        print("[chain_end]")
 
    def on_llm_start(self, serialized, prompts, **kwargs):
        print(f"[llm_start] {serialized.get('name', 'unknown')}")  # ← NEVER CALLED
 
    def on_llm_end(self, response, **kwargs):
        usage = getattr(response.generations[0][0].message, "usage_metadata", None)
        print(f"[llm_end] tokens={usage}")  # ← NEVER CALLED
 
 
agent = create_agent(
    model=ChatOpenAI(model="gpt-4o"),
    tools=[],
    system_prompt="Say hello.",
)
 
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Hi"}]},
    config={"callbacks": [DebugHandler()]},
)

# Expected output:
[chain_start] LangGraph
[llm_start] ChatOpenAI          ← expected but missing
[llm_end] tokens={...}          ← expected but missing
[chain_end]
 
# Actual output:
[chain_start] LangGraph          ← graph-level trace works
[chain_end]

Checked other resources

This is a bug, not a usage question.
I added a clear and descriptive title that summarizes this issue.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

Related Issues / PRs

No response

Description

Call chain:

graph.astream(input, config={"callbacks": [handler]})
  → RunnableCallable(model_node, trace=False).ainvoke(state, config)
    → trace=False: directly calls amodel_node(state), skips set_config_context
      → _execute_model_async(request)
        → model_.ainvoke(messages)  # no config → no callbacks
          → on_llm_start / on_llm_end NEVER fire

Impact:

❌ on_llm_start / on_llm_end never called → no token usage tracking via callbacks
❌ Third-party observability tools (Langfuse, LangWatch, etc.) cannot see individual LLM calls within the model node
❌ Any BaseCallbackHandler user loses LLM-level visibility
✅ on_chain_start / on_chain_end still work (graph-level node execution)

Comparison with JS version:

The JS createAgent (langchainjs/libs/langchain/src/agents/index.ts → ReactAgent) does not have this issue. The JS implementation handles model invocation differently, and callback propagation works as expected. The trace=False behavior is specific to the Python factory.py.

Suggested fixes:

Option A — Add a parameter to create_agent:

def create_agent(model, tools, *, trace_model_node: bool = False, ...):
    graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=trace_model_node))

Option B — Keep trace=False for the wrapper but pass config to the inner model call:

async def _execute_model_async(request: ModelRequest) -> ModelResponse:
    output = await model_.ainvoke(messages, config=request.config)  # ← pass config

Option C — Always use trace=True for the model node.
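
To make the idea behind Option B concrete, here is a toy comparison using only the standard library. Nothing here is LangChain code: `call_model`, `node_without_config`, and `node_with_config` are hypothetical names. It sketches a node that stays untraced but hands the config to the inner call explicitly.

```python
import asyncio


async def call_model(messages, config=None):
    """Hypothetical model call: notifies callbacks only if a config is given."""
    callbacks = (config or {}).get("callbacks", [])
    for cb in callbacks:
        cb("llm_start")
    reply = f"reply to {len(messages)} message(s)"
    for cb in callbacks:
        cb("llm_end")
    return reply


async def node_without_config(messages, config):
    # Mirrors the reported behavior: config is available but not forwarded.
    return await call_model(messages)


async def node_with_config(messages, config):
    # Option B idea: forward the config so callbacks reach the model call.
    return await call_model(messages, config=config)


def collect(node):
    events = []
    asyncio.run(node(["Hi"], {"callbacks": [events.append]}))
    return events


before = collect(node_without_config)
after = collect(node_with_config)
print(before)  # []
print(after)   # ['llm_start', 'llm_end']
```

The appeal of this variant is that it does not change how the node itself is traced; whether the real request object exposes a usable config is something the issue assumes and this sketch does not verify.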

Related:

before_agent_node also uses trace=False, potentially causing similar issues.
This does NOT affect users who build their own StateGraph; only create_agent is affected.

System Info

langchain: 1.2.0
langchain_core: 1.2.5
langgraph: 0.2.60
Python: 3.12
OS: Linux

extent analysis

TL;DR

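LLM-level callbacks are silently dropped inside the create_agent model node. The issue proposes two main remedies: add a trace_model_node parameter to create_agent so the model node can be traced, or pass the config to the inner model call. Both are proposed changes to the library, not options available in the reported versions.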

Guidance

Add a create_agent parameter (Option A): expose a trace_model_node flag that controls whether the model node is traced. This parameter is a proposal from the issue, not an existing API.
Pass config to the inner model call (Option B): alternatively, change _execute_model_async so that the model is invoked with the config.
Verify callback behavior: after applying either change, confirm that on_llm_start and on_llm_end fire by checking the handler output for the [llm_start] and [llm_end] lines.
Test other scenarios: repeat the check with different models, tools, and configurations to confirm the callbacks fire in each case.
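
The verification step can be automated with a small check over recorded event names. The helper below is a sketch: `missing_llm_events` is a hypothetical name, and the event lists are assumed to come from a handler like the DebugHandler in the reproduction, appending one label per callback instead of printing.

```python
REQUIRED_LLM_EVENTS = ("llm_start", "llm_end")


def missing_llm_events(events):
    """Return the LLM-level events that never appeared in the recorded list."""
    seen = set(events)
    return [name for name in REQUIRED_LLM_EVENTS if name not in seen]


# Event sequences corresponding to the actual and expected outputs in the report.
actual = ["chain_start", "chain_end"]
expected = ["chain_start", "llm_start", "llm_end", "chain_end"]

print(missing_llm_events(actual))    # ['llm_start', 'llm_end']
print(missing_llm_events(expected))  # []
```

An empty result means both LLM-level callbacks were observed; a non-empty one names exactly which events are still being dropped.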

Example

# Proposed signature from Option A; trace_model_node is not an existing create_agent parameter.
def create_agent(model, tools, *, trace_model_node: bool = False, ...):
    graph.add_node("model", RunnableCallable(model_node, amodel_node, trace=trace_model_node))

Notes

The issue is specific to the Python implementation of create_agent and does not affect the JavaScript version. The fix should be applied to the langchain package.

Recommendation

Apply Option A by adding a trace_model_node parameter to create_agent, allowing users to control the tracing behavior of the model node. With the default left at False, existing behavior stays unchanged and users opt in to LLM-level callbacks explicitly. Whether tracing the model node has side effects elsewhere has not been verified here.
