langchain: `ChatOpenAI` Responses API streaming drops reasoning summary when provider sends `response.output_item.done` instead of `response.reasoning_summary_text.delta`

When using ChatOpenAI (or AzureChatOpenAI) with use_responses_api=True and a reasoning model, the reasoning summary is silently dropped for providers — notably Azure OpenAI — that deliver the full reasoning text in response.output_item.done rather than streaming it incrementally via response.reasoning_summary_text.delta.

Root Cause

_convert_responses_chunk_to_generation_chunk handles response.output_item.done for other item types (web_search_call, file_search_call, code_interpreter_call, etc.) but has no branch for chunk.item.type == "reasoning". A reasoning item therefore falls through to the final else: return None, and the event is discarded entirely.

Non-Azure OpenAI endpoints send response.reasoning_summary_text.delta events (which LangChain does handle, at around line 4557), so they are unaffected. Azure sends only:

  1. response.output_item.added (type=reasoning, no text) → LangChain emits empty block ✓
  2. response.output_item.done (type=reasoning, full summary) → LangChain returns None

The full reasoning text — which is present in chunk.item.summary[*].text — is never emitted as a ChatGenerationChunk and is absent from the final AIMessage.
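
To make the location of the dropped text concrete, here is a minimal sketch that pulls the summary text out of a dumped reasoning item. It assumes the item has already been converted to a plain dict with the shape used in the reproduction (`summary` as a list of `{"type": "summary_text", "text": ...}` parts). `extract_reasoning_summary` is a hypothetical helper name for illustration, not part of LangChain.

```python
from typing import Any


def extract_reasoning_summary(item: dict[str, Any]) -> str:
    """Join the text of every summary_text part in a dumped reasoning item.

    Hypothetical helper for illustration; not a LangChain API.
    """
    parts = item.get("summary") or []
    return "".join(
        part.get("text", "")
        for part in parts
        if isinstance(part, dict) and part.get("type") == "summary_text"
    )


# Same payload shape as the mocked response.output_item.done item.
item = {
    "type": "reasoning",
    "id": "rs_abc123",
    "summary": [{"type": "summary_text", "text": "Let me think step by step..."}],
}
print(extract_reasoning_summary(item))
```

An item with no `summary` key, or with `summary` set to `None`, yields an empty string rather than raising.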

Code Example

## NO CREDS REPRO - MOCKING CHUNK CONTENT
from unittest.mock import MagicMock
from langchain_openai.chat_models.base import _convert_responses_chunk_to_generation_chunk

# Simulate the event Azure sends: response.output_item.done, item.type == "reasoning"
item = MagicMock()
item.type = "reasoning"
item.id = "rs_abc123"
item.model_dump.return_value = {
    "type": "reasoning",
    "id": "rs_abc123",
    "summary": [{"type": "summary_text", "text": "Let me think step by step..."}],
}

chunk = MagicMock()
chunk.type = "response.output_item.done"
chunk.item = item
chunk.output_index = 0

_, _, _, result = _convert_responses_chunk_to_generation_chunk(chunk, 0, -1, -1)

# Bug: result is None — reasoning text silently dropped
assert result is None, "Already fixed!"
print("Bug confirmed: result is None, reasoning text is lost")

## E2E REPRO

from langchain_openai import AzureChatOpenAI

llm = AzureChatOpenAI(
    azure_deployment="o4-mini",
    azure_endpoint="https://<your-resource>.openai.azure.com/",
    api_key="<your-key>",
    api_version="2025-04-01-preview",
    use_responses_api=True,
    model_kwargs={"reasoning": {"effort": "medium", "summary": "auto"}},
)

for chunk in llm.stream("Explain step by step why 0.1 + 0.2 != 0.3"):
    # llm.stream() yields AIMessageChunk objects; content is a str or a list of blocks
    reasoning = [
        b for b in chunk.content
        if isinstance(b, dict) and b.get("type") == "reasoning"
    ]
    if reasoning:
        print(reasoning)
        # Prints [{'id': 'rs_...', 'type': 'reasoning'}]; the summary field is absent
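
When checking whether a stream is affected, it can help to gather whatever summary text actually arrived. The sketch below assumes reasoning blocks are surfaced in each chunk's content as dicts of the shape printed in the reproduction (`{"type": "reasoning", "summary": [...]}`). `collect_reasoning_text` is a hypothetical helper, and the sample inputs are hand-written stand-ins rather than real provider output.

```python
from typing import Any, Iterable, Union

Content = Union[str, list[Any]]


def collect_reasoning_text(contents: Iterable[Content]) -> str:
    """Concatenate summary text from reasoning blocks across streamed chunks."""
    pieces: list[str] = []
    for content in contents:
        if isinstance(content, str):
            continue  # plain-text chunks carry no reasoning blocks
        for block in content:
            if not (isinstance(block, dict) and block.get("type") == "reasoning"):
                continue
            for part in block.get("summary") or []:
                if isinstance(part, dict):
                    pieces.append(part.get("text", ""))
    return "".join(pieces)


# Hand-written stand-ins for the content of successive chunks.
affected = [[{"type": "reasoning", "id": "rs_abc123"}], "The answer is..."]
healthy = [
    [{"type": "reasoning", "id": "rs_abc123",
      "summary": [{"type": "summary_text", "text": "Let me think"}]}],
    [{"type": "reasoning",
      "summary": [{"type": "summary_text", "text": " step by step..."}]}],
]
print(repr(collect_reasoning_text(affected)))
print(repr(collect_reasoning_text(healthy)))
```

Against a real stream, the input could be `(chunk.content for chunk in llm.stream(...))`. An empty result from a reasoning model with summaries enabled would be consistent with the behaviour described in this issue, though it is not proof of it.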

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Related Issues / PRs

No response

Expected behaviour

The reasoning summary text should be present on the emitted ChatGenerationChunk, matching the behaviour already produced by response.reasoning_summary_text.delta on non-Azure providers.

Proposed fix

Add a handler for response.output_item.done events whose chunk.item.type == "reasoning", guarded so that it fires only when no delta events were received for the same output index (to avoid double emission on providers that send both):

# track which output indices have been streamed via delta events
elif chunk.type == "response.reasoning_summary_text.delta":
    if streamed_reasoning_output_indices is not None:
        streamed_reasoning_output_indices.add(chunk.output_index)
    ...  # existing code unchanged

# new handler
elif (
    chunk.type == "response.output_item.done"
    and chunk.item.type == "reasoning"
    and chunk.output_index not in (streamed_reasoning_output_indices or set())
):
    _advance(chunk.output_index)
    item = chunk.item.model_dump(exclude_none=True, mode="json")
    text = "".join(
        part.get("text", "")
        for part in (item.get("summary") or [])
        if part.get("type") == "summary_text"
    )
    if text:
        content.append({
            "type": "reasoning",
            "id": item.get("id", ""),
            "summary": [{"type": "summary_text", "text": text}],
            "index": current_index,
        })

streamed_reasoning_output_indices: set[int] | None = None would be added as an optional parameter to _convert_responses_chunk_to_generation_chunk, initialised in both _stream_responses and _astream_responses, and passed on every call.
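
The guard can be exercised without LangChain. The following is a simplified simulation, not the actual patch: events are plain dicts carrying only the fields used above (`type`, `output_index`, `delta`, `item`), and `reasoning_blocks` is a hypothetical stand-in for the converter loop. It illustrates that a done-only stream yields the summary once, and that a stream carrying both deltas and a final done event does not emit it twice.

```python
from typing import Any


def reasoning_blocks(events: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Simplified stand-in for the converter: return emitted reasoning blocks."""
    streamed_indices: set[int] = set()  # output indices already covered by deltas
    emitted: list[dict[str, Any]] = []
    for event in events:
        if event["type"] == "response.reasoning_summary_text.delta":
            streamed_indices.add(event["output_index"])
            emitted.append({"type": "reasoning", "text": event["delta"]})
        elif (
            event["type"] == "response.output_item.done"
            and event["item"].get("type") == "reasoning"
            and event["output_index"] not in streamed_indices
        ):
            # Fallback path: no deltas were seen for this output index.
            text = "".join(
                part.get("text", "")
                for part in (event["item"].get("summary") or [])
                if part.get("type") == "summary_text"
            )
            if text:
                emitted.append({"type": "reasoning", "text": text})
    return emitted


done = {
    "type": "response.output_item.done",
    "output_index": 0,
    "item": {"type": "reasoning", "id": "rs_abc123",
             "summary": [{"type": "summary_text", "text": "Let me think"}]},
}
delta = {"type": "response.reasoning_summary_text.delta",
         "output_index": 0, "delta": "Let me think"}

done_only = reasoning_blocks([done])
both = reasoning_blocks([delta, done])
print(done_only)
print(both)
```

Keeping the set per stream, rather than module-level, mirrors the proposal to initialise it in each streaming method, so state from one request cannot suppress the fallback in another.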

System Info

System Information
------------------
> OS:  Darwin
> OS Version:  Darwin Kernel Version 25.3.0: Wed Jan 28 20:56:35 PST 2026; root:xnu-12377.91.3~2/RELEASE_ARM64_T6030
> Python Version:  3.12.0 (main, Mar 17 2026, 16:14:47) [Clang 17.0.0 (clang-1700.6.3.2)]

Package Information
-------------------
> langchain_core: 1.4.0
> langchain_community: 0.4.1
> langsmith: 0.8.3
> langchain_classic: 1.0.7
> langchain_google_genai: 4.1.3
> langchain_openai: 1.1.6
> langchain_protocol: 0.0.15
> langchain_text_splitters: 1.1.2

Optional packages not installed
-------------------------------
> deepagents
> deepagents-cli

Other Dependencies
------------------
> aiohttp: 3.13.5
> async-timeout: 5.0.1
> dataclasses-json: 0.6.7
> filetype: 1.2.0
> google-genai: 1.75.0
> httpx: 0.28.1
> httpx-sse: 0.4.3
> jsonpatch: 1.33
> numpy: 2.4.4
> openai: 1.109.1
> opentelemetry-api: 1.36.0
> opentelemetry-exporter-otlp-proto-http: 1.36.0
> opentelemetry-sdk: 1.36.0
> orjson: 3.11.9
> packaging: 24.2
> pydantic: 2.13.4
> pydantic-settings: 2.14.1
> pytest: 8.4.2
> pyyaml: 6.0.3
> PyYAML: 6.0.3
> requests: 2.34.0
> requests-toolbelt: 1.0.0
> rich: 15.0.0
> SQLAlchemy: 2.0.49
> sqlalchemy: 2.0.49
> tenacity: 9.1.4
> tiktoken: 0.12.0
> typing-extensions: 4.15.0
> uuid-utils: 0.15.0
> websockets: 16.0
> wrapt: 1.17.3
> xxhash: 3.7.0
> zstandard: 0.25.0
