langchain - 💡(How to fix) Fix stream_events(version="v3"): reasoning content dropped from AIMessage when same message contains a tool_call

When using `stream_events(version="v3")` / `astream_events(version="v3")` with a model that emits a reasoning block followed by a tool_call in the same message (e.g. Gemini 2.5 / 3.x with thinking enabled), the reasoning text is dropped from the assembled `AIMessage.content`. The streamed `.reasoning` projection still exposes the text during the run, but the persisted message (the one returned from the node, fed into the checkpointer, and shown on conversation read-back) no longer contains the reasoning.

Expected: `msg.content` contains both the reasoning block and the tool_call block, mirroring what's available via the `.reasoning` and `.tool_calls` projections during streaming. Persisting a tool-calling assistant turn should not lose its reasoning.

Actual: `msg.content` contains only the tool_call. The reasoning is unrecoverable from the saved message — it is not in `additional_kwargs`, not in `response_metadata`, and `.content_blocks` reconstructs the same single-element list.

Root cause (`langchain_core/language_models/chat_model_stream.py`, `_finish`): the method backfills any `tool_call_chunks` that didn't receive an explicit `content-block-finish` before `message-finish`:

```python
_sweep_chunk_store(self._tool_call_chunks, ...)
_sweep_chunk_store(self._server_tool_call_chunks, ...)
```

There is no equivalent sweep for `_reasoning_per_block` (or `_text_per_block`). When a provider emits `content-block-delta` events for reasoning but no terminating `content-block-finish` before the next block or message-finish, the accumulated reasoning never lands in `self._blocks`, so `_assemble_message` builds `content` without it.
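The mechanism described above can be illustrated with a toy accumulator. This is a simplified stand-in, not the actual `ChatModelStream` code: the class `ToyStream` and its attribute names are hypothetical and only mimic the described behavior (deltas accumulate per block, blocks are committed on `content-block-finish`, and only tool-call chunks are swept on `message-finish`).

```python
from typing import Any


class ToyStream:
    """Hypothetical, simplified model of the accumulation described above."""

    def __init__(self) -> None:
        self.blocks: dict[int, dict[str, Any]] = {}  # finalized blocks by index
        self.reasoning_per_block: dict[int, str] = {}  # accumulated reasoning deltas
        self.tool_call_chunks: dict[int, dict[str, Any]] = {}  # unfinished tool calls

    def dispatch(self, event: dict[str, Any]) -> None:
        kind = event["event"]
        if kind == "content-block-delta":
            idx, delta = event["index"], event["delta"]
            if delta["type"] == "reasoning-delta":
                prev = self.reasoning_per_block.get(idx, "")
                self.reasoning_per_block[idx] = prev + delta["reasoning"]
            else:
                # For this sketch, any other delta is treated as a tool-call chunk.
                self.tool_call_chunks[idx] = dict(delta["fields"])
        elif kind == "content-block-finish":
            # Only an explicit finish event commits a block.
            self.blocks[event["index"]] = event["content"]
            self.tool_call_chunks.pop(event["index"], None)
        elif kind == "message-finish":
            # Unfinished tool-call chunks are swept into blocks; reasoning is not.
            for idx, chunk in self.tool_call_chunks.items():
                self.blocks.setdefault(idx, {**chunk, "type": "tool_call"})

    def content(self) -> list[dict[str, Any]]:
        return [self.blocks[i] for i in sorted(self.blocks)]


stream = ToyStream()
stream.dispatch({"event": "content-block-delta", "index": 0,
                 "delta": {"type": "reasoning-delta", "reasoning": "think"}})
stream.dispatch({"event": "content-block-finish", "index": 1,
                 "content": {"type": "tool_call", "name": "get_schema", "args": {}}})
stream.dispatch({"event": "message-finish"})

print(stream.content())            # only the tool_call block
print(stream.reasoning_per_block)  # the reasoning text is still held here
```

In this model the reasoning text stays in the per-block store after `message-finish`, which matches the report that the projection can still show it while the assembled content cannot.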

Suggested fix: after the existing tool-call sweeps in `_finish`, synthesize a finalized `ReasoningContentBlock` (and `TextContentBlock`) into `self._blocks` for any index present in `_reasoning_per_block` / `_text_per_block` that isn't already in `self._blocks`. This mirrors the tool-call sweep and keeps `content` complete regardless of whether the provider emits a per-block `content-block-finish`.
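The suggested sweep might look roughly like the sketch below. It works on plain dictionaries rather than on the real `ChatModelStream` internals, so the function name `sweep_unfinished`, its parameters, and the rule of skipping empty strings are all assumptions made for illustration. The block shapes (`{"type": "reasoning", "reasoning": ...}` and `{"type": "text", "text": ...}`) are assumed to match the content block types named above.

```python
from typing import Any


def sweep_unfinished(
    blocks: dict[int, dict[str, Any]],
    reasoning_per_block: dict[int, str],
    text_per_block: dict[int, str],
) -> None:
    """Commit accumulated reasoning/text for indices that never got a finish event.

    Hypothetical helper: indices already present in `blocks` are left untouched,
    and empty accumulations are skipped (an assumption of this sketch).
    """
    for idx, reasoning in reasoning_per_block.items():
        if idx not in blocks and reasoning:
            blocks[idx] = {"type": "reasoning", "reasoning": reasoning}
    for idx, text in text_per_block.items():
        if idx not in blocks and text:
            blocks[idx] = {"type": "text", "text": text}


# State at message-finish for the scenario in this report:
# the tool call was finalized at index 1, the reasoning at index 0 was not.
blocks = {1: {"type": "tool_call", "id": "tc-1", "name": "get_schema", "args": {"q": "x"}}}
sweep_unfinished(blocks, {0: "I should call the schema tool."}, {})

content = [blocks[i] for i in sorted(blocks)]
print(content)
```

Assembling `content` by sorted index keeps the reasoning ahead of the tool call, which is the order in which the provider emitted the blocks.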

Practical impact (Gemini agents): any tool-calling turn's thinking is lost from conversation history. Live SSE rendering still works (`.reasoning` projection); only the persisted/replayed transcript is affected.
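Until a fix lands, one possible application-side workaround is to capture the reasoning text from the streaming projection and put it back into the content before persisting. The helper below is a sketch on plain lists of dictionaries; `restore_reasoning` is a hypothetical name, and prepending the block assumes the reasoning preceded the tool call, as in the scenario described here.

```python
from typing import Any


def restore_reasoning(
    content: list[dict[str, Any]], reasoning_text: str
) -> list[dict[str, Any]]:
    """Return a copy of `content` with a reasoning block prepended if none exists."""
    if not reasoning_text:
        return list(content)
    if any(block.get("type") == "reasoning" for block in content):
        # The reasoning survived assembly; nothing to restore.
        return list(content)
    return [{"type": "reasoning", "reasoning": reasoning_text}, *content]


assembled = [{"type": "tool_call", "id": "tc-1", "name": "get_schema", "args": {"q": "x"}}]
patched = restore_reasoning(assembled, "I should call the schema tool.")
print([block["type"] for block in patched])
```

How the patched list is written back (for example, by building a new message with it before the checkpointer sees the turn) depends on the application and is not covered by this sketch.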

Error Message and Stack Trace (if applicable)

No exception — silent data loss.

Code Example

"""Reasoning text streamed via the v3 protocol is lost from the assembled
AIMessage when the same message also contains a tool_call.

_finish() backfills unfinished tool_call_chunks into _blocks via
_sweep_chunk_store, but does no equivalent sweep for unfinished reasoning
blocks. Any reasoning that didn't receive an explicit content-block-finish
before message-finish is silently dropped from message.content — even
though it remains visible during the run via the .reasoning projection.
"""

from langchain_core.language_models.chat_model_stream import ChatModelStream

s = ChatModelStream()
s.dispatch({"event": "message-start", "role": "ai", "id": "msg-1"})

# Reasoning block: opens, streams a delta, but NEVER gets content-block-finish.
# (This is the shape Gemini emits when a tool call immediately follows reasoning
# in the same response.)
s.dispatch({
    "event": "content-block-start",
    "index": 0,
    "content": {"type": "reasoning", "id": "r-1"},
})
s.dispatch({
    "event": "content-block-delta",
    "index": 0,
    "delta": {"type": "reasoning-delta",
              "reasoning": "I should call the schema tool."},
})

# Tool-call block: opens, streams args, finishes normally.
s.dispatch({
    "event": "content-block-start",
    "index": 1,
    "content": {"type": "tool_call_chunk", "id": "tc-1", "name": "get_schema"},
})
s.dispatch({
    "event": "content-block-delta",
    "index": 1,
    "delta": {"type": "block-delta",
              "fields": {"type": "tool_call_chunk", "args": '{"q": "x"}'}},
})
s.dispatch({
    "event": "content-block-finish",
    "index": 1,
    "content": {"type": "tool_call", "id": "tc-1",
                "name": "get_schema", "args": {"q": "x"}},
})

s.dispatch({"event": "message-finish"})

msg = s.output_message
print("content:             ", msg.content)
print("reasoning projection:", str(s.reasoning))
# content:              [{'type': 'tool_call', ...}]            <- reasoning missing
# reasoning projection: I should call the schema tool.          <- but it's in here
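
The repro above could be turned into a regression check by comparing the block types in the assembled content against the expected order. The helper `block_types` is hypothetical and works on plain Python values, so it can be tried without LangChain installed; the `observed` list mirrors the output reported above.

```python
from typing import Any


def block_types(content: Any) -> list[str]:
    """Return the type of each content block, in order.

    Plain strings (or string items in a list) are counted as "text".
    """
    if isinstance(content, str):
        return ["text"] if content else []
    return [b.get("type", "") if isinstance(b, dict) else "text" for b in content]


# Content as reported in this issue: only the tool call survives.
observed = [{"type": "tool_call", "id": "tc-1", "name": "get_schema", "args": {"q": "x"}}]
expected_types = ["reasoning", "tool_call"]

print(block_types(observed) == expected_types)  # False while the bug is present
```

In a real test, `observed` would be replaced by `msg.content` from the repro.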
Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version (langchain-core 1.4.0, langchain 1.3.0).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example.

Package

langchain-core

System Info

```
OS: Linux, Python 3.12.9
langchain_core: 1.4.0
langchain: 1.3.0
langchain_google_genai: 4.2.2
langgraph: 1.2.0
langchain_protocol: 0.0.15
```
