langchain - ✅(Solved) Fix `disable_streaming="tool_calling"` crashes in streaming mode [7 pull requests, 6 comments, 6 participants]

GitHub stats
langchain-ai/langchain#35436 (fetched 2026-04-08 00:26:12)
Comments: 6
Participants: 6
Timeline: 19
Reactions: 0
Timeline (top): cross-referenced ×7, commented ×6, labeled ×3, issue_type_added ×1

There are two separate bugs that make disable_streaming="tool_calling" unusable with ChatOpenAI when streaming=True:

Bug 1: _agenerate receives an AsyncStream instead of ChatCompletion

Root cause: BaseChatOpenAI._default_params hardcodes "stream": self.streaming.

@property
def _default_params(self) -> dict[str, Any]:
    return {
        "model": self.model_name,
        "stream": self.streaming,  # ← always True when streaming=True
        **{k: v for k, v in exclude_if_none.items() if v is not None},
        **self.model_kwargs,
    }

When disable_streaming="tool_calling" triggers, BaseChatModel._agenerate_with_cache calls _agenerate() instead of _astream(). But _agenerate() calls _get_request_payload(), which includes _default_params → stream=True is in the payload → the OpenAI async client returns an AsyncStream → _create_chat_result calls response.model_dump() → crash.
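The chain above can be reproduced without any network access. The following is a minimal sketch, not the library's code: `ToyChatModel`, `FakeAsyncClient`, `FakeAsyncStream`, `FakeChatCompletion` and `run_toy` are hypothetical stand-ins that only mimic the shape of the problem (a payload built from defaults that carry the instance-level `stream` flag).

```python
import asyncio
from typing import Any


class FakeChatCompletion:
    """Stand-in for a non-streaming response object."""

    def model_dump(self) -> dict[str, Any]:
        return {"choices": [{"message": {"content": "hi"}}]}


class FakeAsyncStream:
    """Stand-in for a streaming response; deliberately has no model_dump()."""


class FakeAsyncClient:
    async def create(self, **payload: Any) -> Any:
        # The return type depends on the `stream` flag in the payload.
        return FakeAsyncStream() if payload.get("stream") else FakeChatCompletion()


class ToyChatModel:
    def __init__(self, streaming: bool) -> None:
        self.streaming = streaming
        self.client = FakeAsyncClient()

    @property
    def _default_params(self) -> dict[str, Any]:
        # Mirrors the quoted snippet: the flag is copied from the instance.
        return {"model": "toy", "stream": self.streaming}

    def _get_request_payload(self, messages: list[str], **kwargs: Any) -> dict[str, Any]:
        return {"messages": messages, **self._default_params, **kwargs}

    async def _agenerate(self, messages: list[str], **kwargs: Any) -> dict[str, Any]:
        payload = self._get_request_payload(messages, **kwargs)
        response = await self.client.create(**payload)
        return response.model_dump()  # fails when the response is a stream


def run_toy(streaming: bool) -> str:
    """Return "ok" or the error text produced by the non-streaming path."""
    try:
        asyncio.run(ToyChatModel(streaming)._agenerate(["hello"]))
        return "ok"
    except AttributeError as exc:
        return f"AttributeError: {exc}"
```

Calling `run_toy(True)` takes the non-streaming path with a streaming payload and ends in an `AttributeError` about `model_dump`, while `run_toy(False)` completes normally.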

Bug 2: _should_stream disables streaming for ALL turns, not just tool-call turns

Root cause: In BaseChatModel._should_stream:

if self.disable_streaming == "tool_calling" and kwargs.get("tools"):
    return False

This checks whether tools are bound (kwargs.get("tools")), not whether the current response contains tool calls. When using create_react_agent (or any agent with bind_tools), tools are always in kwargs for every LLM invocation — including the final text-response turn. This means _should_stream returns False for all turns, completely disabling token-by-token streaming.

This is an inherent limitation: you can't know whether a response will contain tool calls before making the API call. However, the current behavior is misleading — disable_streaming="tool_calling" suggests it only affects tool-calling turns, but it actually disables streaming entirely when tools are bound.
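To make the "all turns" effect concrete, here is a small sketch built around the quoted condition. `should_stream_toy` is a simplified, hypothetical stand-in and not the real `_should_stream`; it only assumes that the decision depends on `disable_streaming` and on whether `tools` is present in the call kwargs.

```python
from typing import Any, Literal, Union

DisableStreaming = Union[bool, Literal["tool_calling"]]


def should_stream_toy(disable_streaming: DisableStreaming, **kwargs: Any) -> bool:
    """Simplified stand-in for the check quoted above (not the real method)."""
    if disable_streaming is True:
        return False
    if disable_streaming == "tool_calling" and kwargs.get("tools"):
        return False
    return True


# An agent binds its tools once, so every LLM call carries the same `tools` kwarg,
# regardless of whether the model will answer with a tool call or with plain text.
bound_tools = [{"type": "function", "function": {"name": "get_weather"}}]
turns = ["tool-call turn", "final text turn"]
decisions = {
    turn: should_stream_toy("tool_calling", tools=bound_tools) for turn in turns
}
```

Both entries in `decisions` come out `False`: the check only sees the request, so it cannot tell the two kinds of turn apart.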

Context: Why this matters

OpenAI-compatible providers (e.g., Dashscope/Qwen, vLLM) sometimes produce malformed tool call chunks during streaming, splitting a single tool call into a valid call with empty args and an invalid fragment containing the actual args. disable_streaming="tool_calling" would be the ideal fix (non-streaming for tool calls = no chunk splitting), but the two bugs above make it unusable.
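The issue does not say how the split arises. As an illustration only, the sketch below shows one hypothetical way it could happen: streamed fragments are merged by an `index` field, and a provider emits the argument fragment under a different index. The flat delta shape and the `accumulate_tool_calls` helper are assumptions made for this example, not the provider's or the library's actual format.

```python
from typing import Any


def accumulate_tool_calls(deltas: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge streamed tool-call fragments, keyed by their `index` field."""
    merged: dict[Any, dict[str, Any]] = {}
    for delta in deltas:
        slot = merged.setdefault(
            delta.get("index"), {"id": None, "name": None, "arguments": ""}
        )
        slot["id"] = slot["id"] or delta.get("id")
        slot["name"] = slot["name"] or delta.get("name")
        slot["arguments"] += delta.get("arguments") or ""
    return list(merged.values())


# Well-formed stream: every fragment of the call shares index 0.
well_formed = [
    {"index": 0, "id": "call_1", "name": "get_weather", "arguments": ""},
    {"index": 0, "arguments": '{"city": '},
    {"index": 0, "arguments": '"Tokyo"}'},
]

# Hypothetical malformed stream: the arguments arrive under another index.
malformed = [
    {"index": 0, "id": "call_1", "name": "get_weather", "arguments": ""},
    {"index": 1, "arguments": '{"city": "Tokyo"}'},
]

good = accumulate_tool_calls(well_formed)
bad = accumulate_tool_calls(malformed)
```

Here `good` holds one complete call, while `bad` holds two entries: a named call with empty arguments and a nameless fragment that carries the real arguments. A non-streamed response avoids the merge step entirely.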

Error Message

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

  File "langchain_core/language_models/chat_models.py", line 1361, in _agenerate_with_cache
    result = await self._agenerate(...)
  File "langchain_openai/chat_models/base.py", line 1744, in _agenerate
    return await run_in_executor(None, self._create_chat_result, response, generation_info)
  File "langchain_openai/chat_models/base.py", line 1540, in _create_chat_result
    response if isinstance(response, dict) else response.model_dump()
    # response is an AsyncStream, not a ChatCompletion

The same bugs exist in the latest master source code (verified Feb 25, 2026) — `_default_params` still includes `"stream": self.streaming` and `_should_stream` still checks `kwargs.get("tools")`.

PR fix notes

PR #35440: fix(openai): fix disable_streaming=tool_calling crash in streaming mode

Description (problem / solution / changelog)

Summary

  • Fix crash when using disable_streaming="tool_calling" with streaming enabled

Problem

When disable_streaming="tool_calling" is set and tools are passed, _default_params still includes stream=True (taken from self.streaming), causing the client to return an AsyncStream instead of a ChatCompletion. This leads to a crash when _create_chat_result tries to call model_dump() on the AsyncStream.

Solution

Override the stream param to False in _get_invocation_params when disable_streaming="tool_calling" and tools are present.

Fixes #35436

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +21/-12)

PR #35457: fix(openai): force stream=False in _generate/_agenerate when disable_streaming is active

Description (problem / solution / changelog)

Summary

  • Bug: When streaming=True and disable_streaming="tool_calling" with tools bound, _should_stream() correctly routes to _generate/_agenerate. But _default_params still sets stream=True from self.streaming, so the OpenAI API returns a Stream object. _create_chat_result then crashes calling .model_dump() on it.
  • Fix: Explicitly set payload["stream"] = False in both BaseChatOpenAI._generate and BaseChatOpenAI._agenerate after building the payload, since these are non-streaming code paths.
  • Docs: Clarify in the disable_streaming field docstring and _should_stream comments that "tool_calling" disables streaming for all turns when tools are bound (not just tool-call turns), since the response content is unknown before the API call.

Closes #35436

Areas requiring careful review

  • The payload["stream"] = False override in _generate/_agenerate is unconditional. This is intentional -- these methods are the non-streaming code paths and should never send stream=True. The existing response_format branch already pops stream entirely, so this is consistent.

Test plan

  • New sync test: test_disable_streaming_tool_calling_forces_stream_false -- verifies stream=False in the API payload when streaming=True + disable_streaming="tool_calling" + tools bound
  • New async test: test_disable_streaming_tool_calling_forces_stream_false_async -- same for async path
  • All 183 existing tests in tests/unit_tests/chat_models/test_base.py pass
  • All 12 core disable_streaming tests pass
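The test idea described above (inspect the payload that actually reaches the client) can be sketched with the standard library's `unittest.mock`. This is an illustrative pattern under stated assumptions, not the PR's test code: `ToyModel` and `captured_stream_flag` are hypothetical, and the toy `_generate` simply applies the unconditional override the PR describes.

```python
from typing import Any
from unittest.mock import MagicMock


class ToyModel:
    """Hypothetical model whose non-streaming path forces stream=False."""

    def __init__(self, client: Any, streaming: bool) -> None:
        self.client = client
        self.streaming = streaming

    def _generate(self, messages: list[str], **kwargs: Any) -> Any:
        payload = {"messages": messages, "stream": self.streaming, **kwargs}
        payload["stream"] = False  # the override under test
        return self.client.create(**payload)


def captured_stream_flag(streaming: bool) -> Any:
    """Run the toy path against a mock client and return the sent flag."""
    client = MagicMock()
    ToyModel(client, streaming)._generate(["hi"], tools=[{"name": "get_weather"}])
    # call_args.kwargs holds the keyword arguments of the most recent call.
    return client.create.call_args.kwargs["stream"]
```

Asserting on the captured keyword arguments, rather than on the return value, checks what the API would have received even though no request is made.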

[!NOTE] This contribution was developed with assistance from an AI agent (Claude).

🤖 Generated with Claude Code

Changed files

  • libs/core/langchain_core/language_models/chat_models.py (modified, +8/-1)
  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +74/-0)

PR #35480: fix(openai): strip stream param in _generate/_agenerate to fix disable_streaming crash

Description (problem / solution / changelog)

Problem

When disable_streaming="tool_calling" triggers, BaseChatModel._agenerate_with_cache calls _agenerate() instead of _astream(). But _get_request_payload() includes stream=True from _default_params, so the OpenAI client returns an AsyncStream object instead of a ChatCompletion. The subsequent response.model_dump() then crashes:

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

This makes disable_streaming="tool_calling" completely unusable with ChatOpenAI when streaming=True.

Root Cause

BaseChatOpenAI._default_params hardcodes "stream": self.streaming. When self.streaming=True, this ends up in the payload for _generate()/_agenerate() — which are the non-streaming code paths. The response_format branch already handled this by popping stream, but the regular completion path did not.

Fix

Pop stream from the payload at the start of both _generate() and _agenerate() since these are non-streaming code paths that should never send stream=True to the API.

Tests

All 181 existing unit tests pass (2 consecutive clean runs).

Fixes #35436

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +4/-2)

PR #35486: fix(openai): ensure stream=False in payload when disable_streaming is active

Description (problem / solution / changelog)

Summary

Fixes #35436

When disable_streaming='tool_calling' (or True) routes execution through _generate/_agenerate instead of _stream/_astream, the request payload still contained stream=True because _default_params unconditionally includes "stream": self.streaming.

This caused the OpenAI client to return an AsyncStream object instead of a ChatCompletion, leading to a crash in _create_chat_result when calling response.model_dump().

Root Cause

_default_params (line 1113) hardcodes "stream": self.streaming. When self.streaming=True and disable_streaming routes through _generate/_agenerate (the non-streaming code path), the payload still has stream: True, causing the API to return a stream iterator.

Fix

Explicitly set payload['stream'] = False at the top of both _generate and _agenerate. These methods are the non-streaming code path by definition — they should never send stream: True to the API.

Tests

Added test_disable_streaming_payload_stream_false and its async variant. Both verify that when streaming=True + disable_streaming='tool_calling' + tools are bound, the actual API payload has stream=False.

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +43/-0)

PR #35505: fix(openai): prevent crash when disable_streaming="tool_calling" in streaming mode

Description (problem / solution / changelog)

Summary

Fixes #35436 — disable_streaming="tool_calling" crashes during streaming with AttributeError: 'AsyncStream' object has no attribute 'model_dump'.

Root Cause

When streaming=True and disable_streaming="tool_calling" are both set on ChatOpenAI, and tools are bound:

  1. _should_stream() correctly returns False (streaming should be disabled for tool-calling turns)
  2. stream()/astream() falls back to invoke()/ainvoke(), which calls _generate()/_agenerate()
  3. Bug: _generate()/_agenerate() call _get_request_payload() which merges _default_params — and _default_params unconditionally includes "stream": self.streaming (i.e. True)
  4. The OpenAI client receives stream=True and returns an AsyncStream/Stream object
  5. _create_chat_result() calls response.model_dump() on the stream object → crash
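Steps 1 and 2 of the list above, the fallback from the streaming entry point to the non-streaming one, can be sketched as follows. `toy_invoke` and `toy_stream` are hypothetical helpers; the routing decision is passed in as a plain boolean so that the example shows only the fallback, not the decision itself.

```python
from typing import Iterator


def toy_invoke(prompt: str) -> str:
    """Hypothetical non-streaming call that returns the full answer at once."""
    return "Sunny in Tokyo"


def toy_stream(prompt: str, should_stream: bool) -> Iterator[str]:
    """Yield tokens when streaming is allowed, else one chunk from toy_invoke."""
    if not should_stream:
        # Fallback: a single chunk that holds the whole non-streamed result.
        yield toy_invoke(prompt)
        return
    for token in ["Sunny", " in", " Tokyo"]:
        yield token


fallback_chunks = list(toy_stream("weather?", should_stream=False))
streamed_chunks = list(toy_stream("weather?", should_stream=True))
```

The caller still iterates over a stream in both cases; with the fallback it simply receives one chunk, which is why the non-streaming path must produce a complete response object rather than a stream.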

Fix

Strip the stream key from the payload in both _generate() and _agenerate() immediately after building it. These are the non-streaming code paths and should never send stream=True to the API. The streaming methods (_stream/_astream) already explicitly set kwargs["stream"] = True before building the payload, so they are unaffected.

The existing payload.pop("stream") inside the response_format branches becomes redundant and is removed (it would raise KeyError since stream is already popped).
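The KeyError remark follows from how `dict.pop` behaves in Python, shown here on a plain dictionary. The payload contents are made up for illustration, and whether the PR passes a default to `pop` is not stated in the description.

```python
payload = {"model": "toy", "stream": True, "response_format": {"type": "json_object"}}

# Early strip: passing a default makes the call safe even if the key is absent.
payload.pop("stream", None)
stripped = "stream" not in payload

# A later pop without a default fails, because the key is already gone.
try:
    payload.pop("stream")
    second_pop = "no error"
except KeyError:
    second_pop = "KeyError"
```

After the early strip, `second_pop` ends up as `"KeyError"`, which is the failure the removed `response_format` branch code would have hit.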

Tests

Added 8 unit tests covering:

  • _generate payload has no stream key when disable_streaming="tool_calling"
  • _agenerate payload has no stream key when disable_streaming="tool_calling"
  • invoke() with disable_streaming="tool_calling" + streaming=True + tools does not crash
  • stream() with disable_streaming="tool_calling" falls back to invoke correctly
  • disable_streaming=False still includes stream=True in payload (streaming preserved)
  • _generate without streaming flag also has no stream in payload
  • _should_stream returns False when disable_streaming="tool_calling" and tools are present
  • _should_stream returns True when disable_streaming="tool_calling" but no tools

All 189 existing unit tests in test_base.py continue to pass.

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +11/-3)
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +207/-0)

PR #35573: fix(openai): ensure _generate/_agenerate send stream=False to the API

Description (problem / solution / changelog)

Summary

  • When streaming=True and disable_streaming="tool_calling", _should_stream correctly routes to _generate/_agenerate (non-streaming path)
  • But _default_params hardcodes "stream": self.streaming, so the payload still contained stream=True
  • The OpenAI client returned a Stream/AsyncStream instead of a ChatCompletion, crashing _create_chat_result with AttributeError: 'AsyncStream' object has no attribute 'model_dump'
  • Fix: explicitly set kwargs["stream"] = False at the top of _generate and _agenerate, mirroring _astream which already sets kwargs["stream"] = True

Files changed

  • libs/partners/openai/langchain_openai/chat_models/base.py — 2 lines: set kwargs["stream"] = False in _generate and _agenerate
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py — 14 test cases covering all combinations of streaming × disable_streaming, plus explicit tool-calling scenarios

Test plan

  • test_generate_always_sets_stream_false — 6 parametrized cases (streaming=True/False × disable_streaming=None/tool_calling/True)
  • test_agenerate_always_sets_stream_false — 6 parametrized cases (async counterpart)
  • test_generate_stream_false_with_tools — exact issue #35436 scenario: streaming=True + disable_streaming="tool_calling" + tools in kwargs
  • test_agenerate_stream_false_with_tools — async counterpart
  • Full existing unit test suite: 196 passed, 0 failed
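The six parametrized cases are the cross product of the two settings. A rough sketch of that matrix, using a toy payload builder in place of the real methods, is given below; `toy_generate_payload` is hypothetical and assumes that call kwargs override the defaults when the payload is built.

```python
from itertools import product
from typing import Any


def toy_generate_payload(streaming: bool, **kwargs: Any) -> dict[str, Any]:
    """Toy non-streaming path: force the flag in kwargs, then build the payload."""
    kwargs["stream"] = False
    defaults = {"model": "toy", "stream": streaming}
    return {**defaults, **kwargs}  # assumption: kwargs override the defaults


STREAMING = [True, False]
DISABLE_STREAMING = [None, "tool_calling", True]

# 2 x 3 = 6 combinations, matching the parametrized case count in the test plan.
cases = list(product(STREAMING, DISABLE_STREAMING))

# disable_streaming only affects routing upstream, so the toy path ignores it;
# the point is that the flag sent is False in every combination.
flags = [toy_generate_payload(streaming)["stream"] for streaming, _ in cases]
```

Setting the flag in kwargs before the payload is built, rather than patching the payload afterwards, relies on the merge order noted in the comment.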

Closes #35436

🤖 Generated with Claude Code

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +2/-0)
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +146/-0)

PR #35685: fix(openai): force stream=False in _generate/_agenerate when disable_streaming is active

Description (problem / solution / changelog)

Bug

When disable_streaming="tool_calling" is set and streaming=True, calling _generate/_agenerate (the non-streaming path) still sends stream=True to the OpenAI API because _default_params hardcodes "stream": self.streaming. The API returns an AsyncStream/Stream instead of a ChatCompletion, crashing with:

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

Fix

Explicitly set payload["stream"] = False at the top of both _generate and _agenerate. These methods are always the non-streaming code path — when _should_stream returns False (e.g. due to disable_streaming), execution is routed here, so the API payload must never request streaming.

Tests

Added sync and async unit tests that verify the payload sent to the OpenAI client has stream=False when disable_streaming="tool_calling" is active with streaming=True.

Closes #35436

Changed files

  • libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
  • libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +139/-0)

Code Example

import asyncio
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Sunny in {city}"

llm = ChatOpenAI(
    model="qwen-max",  # or any OpenAI-compatible model
    openai_api_key="your-key",
    openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    temperature=0.3,
    streaming=True,
    disable_streaming="tool_calling",
)

agent = create_react_agent(model=llm, tools=[get_weather])

async def main():
    async for event in agent.astream_events(
        {"messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]},
        version="v2",
    ):
        print(event["event"])

asyncio.run(main())

---

System Information
------------------
> OS:  Windows
> OS Version:  10.0.26200
> Python Version:  3.13.12 (tags/v3.13.12:1cbe481, Feb  3 2026, 18:22:25) [MSC v.1944 64 bit (AMD64)]

Package Information
-------------------
> langchain_core: 1.2.15
> langchain: 1.2.10
> langchain_community: 0.4.1
> langsmith: 0.7.6
> langchain_anthropic: 1.3.3
> langchain_classic: 1.0.1
> langchain_openai: 1.1.10
> langchain_text_splitters: 1.1.1
> langgraph_sdk: 0.3.8

Optional packages not installed
-------------------------------
> deepagents
> deepagents-cli

Other Dependencies
------------------
> aiohttp: 3.13.3
> anthropic: 0.83.0
> dataclasses-json: 0.6.7
> httpx: 0.28.1
> httpx-sse: 0.4.3
> jsonpatch: 1.33
> langgraph: 1.0.9
> numpy: 2.4.2
> openai: 2.22.0
> orjson: 3.11.7
> packaging: 24.2
> pydantic: 2.12.5
> pydantic-settings: 2.13.1
> PyYAML: 6.0.3
> pyyaml: 6.0.3
> requests: 2.32.5
> requests-toolbelt: 1.0.0
> rich: 14.3.3
> SQLAlchemy: 2.0.46
> sqlalchemy: 2.0.46
> tenacity: 9.1.4
> tiktoken: 0.12.0
> typing-extensions: 4.15.0
> uuid-utils: 0.14.1
> wrapt: 2.1.1
> xxhash: 3.6.0
> zstandard: 0.25.0
Related Issues / PRs

#34654

extent analysis

Fix Plan

Bug 1: _agenerate receives an AsyncStream instead of ChatCompletion

  1. Update langchain_openai so that the non-streaming code paths never send stream=True. The linked PRs apply this in BaseChatOpenAI._generate and BaseChatOpenAI._agenerate rather than by editing _default_params: set payload["stream"] = False (PRs #35457, #35486, #35685), remove the stream key from the payload (PRs #35480, #35505), or set kwargs["stream"] = False (PR #35573). PR #35440 instead overrides stream in _get_invocation_params.
  2. Leave _stream/_astream unchanged. According to PR #35505, they already set kwargs["stream"] = True before building the payload, so streaming calls are unaffected.

Bug 2: _should_stream disables streaming for ALL turns, not just tool-call turns

  1. No change to the condition in _should_stream can fix this. The check runs before the API call, when it is not yet known whether the response will contain tool calls, so it can only test whether tools are bound (kwargs.get("tools")).
  2. Document the behavior instead. PR #35457 clarifies in the disable_streaming field docstring and in the _should_stream comments that "tool_calling" disables streaming for all turns when tools are bound.

Verification

  1. Run the reproduction code with the updated langchain_openai package.
  2. Verify that AttributeError: 'AsyncStream' object has no attribute 'model_dump' is no longer raised, and that the payload sent to the client has stream=False or no stream key.
  3. Check that the
