langchain - ✅(Solved) Fix `disable_streaming="tool_calling"` crashes in streaming mode [7 pull requests, 6 comments, 6 participants]

edwardmeng · 2026-02-25T10:50:16Z



langchain-ai/langchain#35436•Fetched 2026-04-08 00:26:12

Timeline (top): cross-referenced ×7, commented ×6, labeled ×3, issue_type_added ×1

There are two separate bugs that make disable_streaming="tool_calling" unusable with ChatOpenAI when streaming=True:

Bug 1: _agenerate receives an AsyncStream instead of ChatCompletion

Root cause: BaseChatOpenAI._default_params hardcodes "stream": self.streaming.

@property
def _default_params(self) -> dict[str, Any]:
    return {
        "model": self.model_name,
        "stream": self.streaming,  # ← always True when streaming=True
        **{k: v for k, v in exclude_if_none.items() if v is not None},
        **self.model_kwargs,
    }

When disable_streaming="tool_calling" triggers, BaseChatModel._agenerate_with_cache calls _agenerate() instead of _astream(). But _agenerate() calls _get_request_payload() which includes _default_params → stream=True is in the payload → the OpenAI async client returns an AsyncStream → _create_chat_result calls response.model_dump() → crash.
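The chain of events above can be modeled without LangChain or the OpenAI SDK. The sketch below is a toy stand-in, not the library's code: `ToyChatModel`, `FakeClient`, `FakeCompletion`, and `FakeStream` are hypothetical names, and only the `"stream": self.streaming` default and the `model_dump()` call mirror the snippets quoted in this issue.

```python
from typing import Any


class FakeCompletion:
    """Stands in for a ChatCompletion: it can be dumped to a dict."""

    def model_dump(self) -> dict[str, Any]:
        return {"choices": [{"message": {"content": "hi"}}]}


class FakeStream:
    """Stands in for Stream/AsyncStream: iterable, but has no model_dump()."""

    def __iter__(self):
        return iter(())


class FakeClient:
    """Returns a stream object whenever the payload asks for one."""

    def create(self, **payload: Any) -> Any:
        return FakeStream() if payload.get("stream") else FakeCompletion()


class ToyChatModel:
    def __init__(self, streaming: bool) -> None:
        self.streaming = streaming
        self.client = FakeClient()

    @property
    def _default_params(self) -> dict[str, Any]:
        # Mirrors the quoted snippet: the flag is copied into every payload.
        return {"model": "toy", "stream": self.streaming}

    def generate(self) -> dict[str, Any]:
        # Non-streaming path, but the payload still carries the streaming flag.
        response = self.client.create(**self._default_params)
        return response if isinstance(response, dict) else response.model_dump()


def crashes(model: ToyChatModel) -> bool:
    """Report whether the non-streaming path fails with AttributeError."""
    try:
        model.generate()
    except AttributeError:
        return True
    return False
```

In this model, `crashes(ToyChatModel(streaming=True))` is true and `crashes(ToyChatModel(streaming=False))` is false, which matches the report that the crash only appears when `streaming=True`.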

Bug 2: _should_stream disables streaming for ALL turns, not just tool-call turns

Root cause: In BaseChatModel._should_stream:

if self.disable_streaming == "tool_calling" and kwargs.get("tools"):
    return False

This checks whether tools are bound (kwargs.get("tools")), not whether the current response contains tool calls. When using create_react_agent (or any agent with bind_tools), tools are always in kwargs for every LLM invocation — including the final text-response turn. This means _should_stream returns False for all turns, completely disabling token-by-token streaming.

This is an inherent limitation: you can't know whether a response will contain tool calls before making the API call. However, the current behavior is misleading — disable_streaming="tool_calling" suggests it only affects tool-calling turns, but it actually disables streaming entirely when tools are bound.
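To make the limitation concrete, here is a simplified model of the quoted check. It is only a sketch: `should_stream` is a hypothetical free function that covers just the branch shown above (the real method presumably has other branches), and the tool dict and turn labels are made up for illustration.

```python
from typing import Any, Literal, Union

DisableStreaming = Union[bool, Literal["tool_calling"]]


def should_stream(disable_streaming: DisableStreaming, **kwargs: Any) -> bool:
    """Toy version of the quoted branch; everything else defaults to True."""
    if disable_streaming == "tool_calling" and kwargs.get("tools"):
        return False
    return True


# An agent binds its tools once, so every model call sees the same kwargs.
bound_kwargs = {"tools": [{"name": "get_weather"}]}

# The decision is made before the request, so it cannot depend on the turn.
turns = ["model decides to call get_weather", "model writes the final answer"]
decisions = {turn: should_stream("tool_calling", **bound_kwargs) for turn in turns}
```

Both entries in `decisions` come out `False`: the check sees only the request kwargs, so the final text-only turn is treated exactly like the tool-calling turn.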

Context: Why this matters

OpenAI-compatible providers (e.g., Dashscope/Qwen, vLLM) sometimes produce malformed tool call chunks during streaming, splitting a single tool call into a valid call with empty args and an invalid fragment containing the actual args. disable_streaming="tool_calling" would be the ideal fix (non-streaming for tool calls = no chunk splitting), but the two bugs above make it unusable.

Error Message

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

  File "langchain_core/language_models/chat_models.py", line 1361, in _agenerate_with_cache
    result = await self._agenerate(...)
  File "langchain_openai/chat_models/base.py", line 1744, in _agenerate
    return await run_in_executor(None, self._create_chat_result, response, generation_info)
  File "langchain_openai/chat_models/base.py", line 1540, in _create_chat_result
    response if isinstance(response, dict) else response.model_dump()
    # response is an AsyncStream, not a ChatCompletion

The same bugs exist in the latest master source code (verified Feb 25, 2026) — `_default_params` still includes `"stream": self.streaming` and `_should_stream` still checks `kwargs.get("tools")`.

PR fix notes

PR #35440: fix(openai): fix disable_streaming=tool_calling crash in streaming mode

Repository: langchain-ai/langchain
Author: dotuananh0712
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35440

Description (problem / solution / changelog)

Summary

Fix crash when using disable_streaming="tool_calling" with streaming enabled

Problem

When disable_streaming="tool_calling" is set and tools are passed, the _default_params always includes stream=True, causing the API to return AsyncStream instead of ChatCompletion. This leads to a crash when _create_chat_result tries to call model_dump() on the AsyncStream.

Solution

Override the stream param to False in _get_invocation_params when disable_streaming="tool_calling" and tools are present.

Fixes #35436

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +21/-12)

PR #35457: fix(openai): force stream=False in _generate/_agenerate when disable_streaming is active

Repository: langchain-ai/langchain
Author: Anandesh-Sharma
State: open | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35457

Description (problem / solution / changelog)

Summary

Bug: When streaming=True and disable_streaming="tool_calling" with tools bound, _should_stream() correctly routes to _generate/_agenerate. But _default_params still sets stream=True from self.streaming, so the OpenAI API returns a Stream object. _create_chat_result then crashes calling .model_dump() on it.
Fix: Explicitly set payload["stream"] = False in both BaseChatOpenAI._generate and BaseChatOpenAI._agenerate after building the payload, since these are non-streaming code paths.
Docs: Clarify in the disable_streaming field docstring and _should_stream comments that "tool_calling" disables streaming for all turns when tools are bound (not just tool-call turns), since the response content is unknown before the API call.

Closes #35436

Areas requiring careful review

The payload["stream"] = False override in _generate/_agenerate is unconditional. This is intentional -- these methods are the non-streaming code paths and should never send stream=True. The existing response_format branch already pops stream entirely, so this is consistent.

Test plan

New sync test: test_disable_streaming_tool_calling_forces_stream_false -- verifies stream=False in the API payload when streaming=True + disable_streaming="tool_calling" + tools bound
New async test: test_disable_streaming_tool_calling_forces_stream_false_async -- same for async path
All 183 existing tests in tests/unit_tests/chat_models/test_base.py pass
All 12 core disable_streaming tests pass

[!NOTE] This contribution was developed with assistance from an AI agent (Claude).

🤖 Generated with Claude Code

Changed files

libs/core/langchain_core/language_models/chat_models.py (modified, +8/-1)
libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +74/-0)

PR #35480: fix(openai): strip stream param in _generate/_agenerate to fix disable_streaming crash

Repository: langchain-ai/langchain
Author: giulio-leone
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35480

Description (problem / solution / changelog)

Problem

When disable_streaming="tool_calling" triggers, BaseChatModel._agenerate_with_cache calls _agenerate() instead of _astream(). But _get_request_payload() includes stream=True from _default_params, so the OpenAI client returns an AsyncStream object instead of a ChatCompletion. The subsequent response.model_dump() then crashes:

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

This makes disable_streaming="tool_calling" completely unusable with ChatOpenAI when streaming=True.

Root Cause

BaseChatOpenAI._default_params hardcodes "stream": self.streaming. When self.streaming=True, this ends up in the payload for _generate()/_agenerate() — which are the non-streaming code paths. The response_format branch already handled this by popping stream, but the regular completion path did not.

Fix

Pop stream from the payload at the start of both _generate() and _agenerate() since these are non-streaming code paths that should never send stream=True to the API.

Tests

All 181 existing unit tests pass (2 consecutive clean runs).

Fixes #35436

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +4/-2)

PR #35486: fix(openai): ensure stream=False in payload when disable_streaming is active

Repository: langchain-ai/langchain
Author: giulio-leone
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35486

Description (problem / solution / changelog)

Summary

Fixes #35436

When disable_streaming='tool_calling' (or True) routes execution through _generate/_agenerate instead of _stream/_astream, the request payload still contained stream=True because _default_params unconditionally includes "stream": self.streaming.

This caused the OpenAI client to return an AsyncStream object instead of a ChatCompletion, leading to a crash in _create_chat_result when calling response.model_dump().

Root Cause

_default_params (line 1113) hardcodes "stream": self.streaming. When self.streaming=True and disable_streaming routes through _generate/_agenerate (the non-streaming code path), the payload still has stream: True, causing the API to return a stream iterator.

Fix

Explicitly set payload['stream'] = False at the top of both _generate and _agenerate. These methods are the non-streaming code path by definition — they should never send stream: True to the API.

Tests

Added test_disable_streaming_payload_stream_false and its async variant. Both verify that when streaming=True + disable_streaming='tool_calling' + tools are bound, the actual API payload has stream=False.
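A payload check of this kind can be sketched with the standard library's `unittest.mock`. This is not the PR's test: `generate` is a hypothetical toy version of the non-streaming path, and the mock client stands in for the OpenAI client. The point is only the technique of capturing the keyword arguments that were actually sent.

```python
from typing import Any
from unittest.mock import MagicMock


def generate(client: Any, streaming: bool, **kwargs: Any) -> Any:
    """Toy non-streaming path that forces stream=False before calling the client."""
    payload = {"model": "toy", "stream": streaming, **kwargs}
    payload["stream"] = False
    return client.create(**payload)


client = MagicMock()
generate(client, streaming=True, tools=[{"name": "get_weather"}])

# call_args.kwargs holds the keyword arguments of the most recent call.
sent = client.create.call_args.kwargs
```

Asserting on `sent["stream"]` checks what reaches the client rather than what the model object was configured with, which is the distinction this bug turns on.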

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +43/-0)

PR #35505: fix(openai): prevent crash when disable_streaming="tool_calling" in streaming mode

Repository: langchain-ai/langchain
Author: giulio-leone
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35505

Description (problem / solution / changelog)

Summary

Fixes #35436 — disable_streaming="tool_calling" crashes during streaming with AttributeError: 'AsyncStream' object has no attribute 'model_dump'.

Root Cause

When streaming=True and disable_streaming="tool_calling" are both set on ChatOpenAI, and tools are bound:

1. _should_stream() correctly returns False (streaming should be disabled for tool-calling turns).
2. stream()/astream() falls back to invoke()/ainvoke(), which calls _generate()/_agenerate().
3. Bug: _generate()/_agenerate() call _get_request_payload(), which merges _default_params, and _default_params unconditionally includes "stream": self.streaming (i.e. True).
4. The OpenAI client receives stream=True and returns an AsyncStream/Stream object.
5. _create_chat_result() calls response.model_dump() on the stream object → crash.

Fix

Strip the stream key from the payload in both _generate() and _agenerate() immediately after building it. These are the non-streaming code paths and should never send stream=True to the API. The streaming methods (_stream/_astream) already explicitly set kwargs["stream"] = True before building the payload, so they are unaffected.

The existing payload.pop("stream") inside the response_format branches becomes redundant and is removed (it would raise KeyError since stream is already popped).
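The `KeyError` concern comes from plain `dict.pop` behavior, which can be seen in isolation. The payload below is a made-up example, not one taken from the library.

```python
payload = {"model": "toy", "stream": True, "response_format": {"type": "json_object"}}

# The first removal succeeds and returns the stored value.
removed = payload.pop("stream")

# A second bare pop on the same key raises KeyError...
try:
    payload.pop("stream")
    second_pop_raised = False
except KeyError:
    second_pop_raised = True

# ...while supplying a default makes the removal safe to repeat.
fallback = payload.pop("stream", None)
```

Either deleting the later `pop`, as this PR describes, or giving it a default would avoid the error. Which one is preferable depends on whether the later branch should still document that `stream` must be absent.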

Tests

Added 8 unit tests covering:

_generate payload has no stream key when disable_streaming="tool_calling"
_agenerate payload has no stream key when disable_streaming="tool_calling"
invoke() with disable_streaming="tool_calling" + streaming=True + tools does not crash
stream() with disable_streaming="tool_calling" falls back to invoke correctly
disable_streaming=False still includes stream=True in payload (streaming preserved)
_generate without streaming flag also has no stream in payload
_should_stream returns False when disable_streaming="tool_calling" and tools are present
_should_stream returns True when disable_streaming="tool_calling" but no tools

All 189 existing unit tests in test_base.py continue to pass.

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +11/-3)
libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +207/-0)

PR #35573: fix(openai): ensure _generate/_agenerate send stream=False to the API

Repository: langchain-ai/langchain
Author: AlonNaor22
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35573

Description (problem / solution / changelog)

Summary

When streaming=True and disable_streaming="tool_calling", _should_stream correctly routes to _generate/_agenerate (non-streaming path)
But _default_params hardcodes "stream": self.streaming, so the payload still contained stream=True
The OpenAI client returned a Stream/AsyncStream instead of a ChatCompletion, crashing _create_chat_result with AttributeError: 'AsyncStream' object has no attribute 'model_dump'
Fix: explicitly set kwargs["stream"] = False at the top of _generate and _agenerate, mirroring _astream which already sets kwargs["stream"] = True

Files changed

libs/partners/openai/langchain_openai/chat_models/base.py — 2 lines: set kwargs["stream"] = False in _generate and _agenerate
libs/partners/openai/tests/unit_tests/chat_models/test_base.py — 14 test cases covering all combinations of streaming × disable_streaming, plus explicit tool-calling scenarios

Test plan

test_generate_always_sets_stream_false — 6 parametrized cases (streaming=True/False × disable_streaming=None/tool_calling/True)
test_agenerate_always_sets_stream_false — 6 parametrized cases (async counterpart)
test_generate_stream_false_with_tools — exact issue #35436 scenario: streaming=True + disable_streaming="tool_calling" + tools in kwargs
test_agenerate_stream_false_with_tools — async counterpart
Full existing unit test suite: 196 passed, 0 failed
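The six-case matrix in this test plan can be sketched with `itertools.product`. `ToyModel` is a hypothetical stand-in, and the assumption that keyword arguments override the default parameters when the payload is merged is mine, not something the PR text states.

```python
from dataclasses import dataclass
from itertools import product
from typing import Any, Optional, Union


@dataclass
class ToyModel:
    streaming: bool
    disable_streaming: Optional[Union[bool, str]] = None

    def generate_payload(self, **kwargs: Any) -> dict[str, Any]:
        kwargs["stream"] = False  # forced at the top of the non-streaming path
        defaults = {"model": "toy", "stream": self.streaming}
        return {**defaults, **kwargs}  # assumption: kwargs win over defaults


# streaming=True/False x disable_streaming=None/"tool_calling"/True
cases = list(product([True, False], [None, "tool_calling", True]))
results = [
    ToyModel(streaming, disable).generate_payload(tools=[{"name": "get_weather"}])
    for streaming, disable in cases
]
```

Every payload in `results` carries `stream=False`, regardless of how the two flags are combined, because the override does not consult either flag.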

Closes #35436

🤖 Generated with Claude Code

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +2/-0)
libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +146/-0)

PR #35685: fix(openai): force stream=False in _generate/_agenerate when disable_streaming is active

Repository: langchain-ai/langchain
Author: giulio-leone
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/35685

Description (problem / solution / changelog)

Bug

When disable_streaming="tool_calling" is set and streaming=True, calling _generate/_agenerate (the non-streaming path) still sends stream=True to the OpenAI API because _default_params hardcodes "stream": self.streaming. The API returns an AsyncStream/Stream instead of a ChatCompletion, crashing with:

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

Fix

Explicitly set payload["stream"] = False at the top of both _generate and _agenerate. These methods are always the non-streaming code path — when _should_stream returns False (e.g. due to disable_streaming), execution is routed here, so the API payload must never request streaming.

Tests

Added sync and async unit tests that verify the payload sent to the OpenAI client has stream=False when disable_streaming="tool_calling" is active with streaming=True.

Closes #35436

Changed files

libs/partners/openai/langchain_openai/chat_models/base.py (modified, +8/-0)
libs/partners/openai/tests/unit_tests/chat_models/test_base.py (modified, +139/-0)

Code Example

import asyncio
from langchain_openai import ChatOpenAI
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Sunny in {city}"

llm = ChatOpenAI(
    model="qwen-max",  # or any OpenAI-compatible model
    openai_api_key="your-key",
    openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    temperature=0.3,
    streaming=True,
    disable_streaming="tool_calling",
)

agent = create_react_agent(model=llm, tools=[get_weather])

async def main():
    async for event in agent.astream_events(
        {"messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]},
        version="v2",
    ):
        print(event["event"])

asyncio.run(main())

---

AttributeError: 'AsyncStream' object has no attribute 'model_dump'

  File "langchain_core/language_models/chat_models.py", line 1361, in _agenerate_with_cache
    result = await self._agenerate(...)
  File "langchain_openai/chat_models/base.py", line 1744, in _agenerate
    return await run_in_executor(None, self._create_chat_result, response, generation_info)
  File "langchain_openai/chat_models/base.py", line 1540, in _create_chat_result
    response if isinstance(response, dict) else response.model_dump()
    # response is an AsyncStream, not a ChatCompletion

---

System Information
------------------
> OS:  Windows
> OS Version:  10.0.26200
> Python Version:  3.13.12 (tags/v3.13.12:1cbe481, Feb  3 2026, 18:22:25) [MSC v.1944 64 bit (AMD64)]

Package Information
-------------------
> langchain_core: 1.2.15
> langchain: 1.2.10
> langchain_community: 0.4.1
> langsmith: 0.7.6
> langchain_anthropic: 1.3.3
> langchain_classic: 1.0.1
> langchain_openai: 1.1.10
> langchain_text_splitters: 1.1.1
> langgraph_sdk: 0.3.8

Optional packages not installed
-------------------------------
> deepagents
> deepagents-cli

Other Dependencies
------------------
> aiohttp: 3.13.3
> anthropic: 0.83.0
> dataclasses-json: 0.6.7
> httpx: 0.28.1
> httpx-sse: 0.4.3
> jsonpatch: 1.33
> langgraph: 1.0.9
> numpy: 2.4.2
> openai: 2.22.0
> orjson: 3.11.7
> packaging: 24.2
> pydantic: 2.12.5
> pydantic-settings: 2.13.1
> PyYAML: 6.0.3
> pyyaml: 6.0.3
> requests: 2.32.5
> requests-toolbelt: 1.0.0
> rich: 14.3.3
> SQLAlchemy: 2.0.46
> sqlalchemy: 2.0.46
> tenacity: 9.1.4
> tiktoken: 0.12.0
> typing-extensions: 4.15.0
> uuid-utils: 0.14.1
> wrapt: 2.1.1
> xxhash: 3.6.0
> zstandard: 0.25.0

Checked other resources

This is a bug, not a usage question.
I added a clear and descriptive title that summarizes this issue.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

Related Issues / PRs

#34654

extent analysis

Fix Plan

Bug 1: `_agenerate` receives an `AsyncStream` instead of `ChatCompletion`

The linked pull requests mostly converge on one approach: the non-streaming code paths must never send `stream=True`. In `BaseChatOpenAI._generate` and `BaseChatOpenAI._agenerate`, they either set `payload["stream"] = False` (PR #35457, #35486, #35685), set `kwargs["stream"] = False` at the top of the method (PR #35573), or remove `stream` from the payload (PR #35480, #35505). PR #35440 instead overrides `stream` to `False` in `_get_invocation_params` when `disable_streaming="tool_calling"` and tools are present. `_stream`/`_astream` already set `kwargs["stream"] = True` explicitly, so the streaming paths are unaffected.

Removing `"stream": self.streaming` from `_default_params` would also keep `stream=True` out of the non-streaming payload, but none of the linked pull request descriptions propose that.

Bug 2: `_should_stream` disables streaming for ALL turns, not just tool-call turns

`_should_stream` is defined on `BaseChatModel` in `langchain_core`, not in `langchain_openai`. The check `self.disable_streaming == "tool_calling" and kwargs.get("tools")` can only test whether tools are bound; as the issue notes, whether a response will contain tool calls is unknown before the API call. None of the linked pull request descriptions mention changing this logic. PR #35457 instead clarifies the `disable_streaming` docstring and the `_should_stream` comments to say that `"tool_calling"` disables streaming for all turns when tools are bound.

### Verification

1. Run the reproduction code with the patched `langchain_openai` package.
2. Verify that it no longer raises `AttributeError: 'AsyncStream' object has no attribute 'model_dump'`, i.e. that `_create_chat_result` receives a `ChatCompletion` rather than an `AsyncStream`.
3. Check that the
