langchain - 💡(How to fix) Fix Bug: merge_dicts concatenates ls_provider across streaming chunks, inflating response_metadata [2 comments, 2 participants]

GitHub stats
langchain-ai/langchain#36993 · Fetched 2026-04-25 06:03:12
Comments: 2 · Participants: 2 · Timeline events: 6 · Reactions: 0
Timeline (top): labeled ×3, commented ×2, issue_type_added ×1

langchain-aws v1.4.5 (PR langchain-ai/langchain-aws#981) adds response_metadata["ls_provider"] = "amazon_bedrock" to each streaming chunk in ChatBedrockConverse._stream(). This is needed for SummarizationMiddleware to identify the provider from message metadata.

However, merge_dicts() concatenates string values when merging chunks. The skip-concatenation guard on line 59 of _merge.py protects model_provider (also set on every chunk) but does not protect ls_provider:

right_k in {"id", "output_version", "model_provider"}  # ls_provider missing

After merging ~20 chunks, ls_provider becomes "amazon_bedrock" repeated 20 times (280 chars). Every AIMessage in conversation history carries this inflated metadata.

Proposed fix: add "ls_provider" to the existing guard, matching the pattern already established for "model_provider". I have a fix ready with tests if assigned.
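
The guard described above can be sketched in isolation. What follows is a simplified stand-in, not the actual langchain_core implementation: merge_flat and SKIP_CONCAT_KEYS are hypothetical names, and only flat string-valued dicts are handled. It illustrates the intended effect of adding "ls_provider" to the set of keys whose identical values are kept rather than concatenated.

```python
# Simplified stand-in for the string-merging branch of merge_dicts.
# merge_flat and SKIP_CONCAT_KEYS are hypothetical names, for illustration only.
SKIP_CONCAT_KEYS = {"id", "output_version", "model_provider", "ls_provider"}


def merge_flat(left: dict[str, str], right: dict[str, str]) -> dict[str, str]:
    """Merge two flat string dicts, concatenating values except on guarded keys."""
    merged = dict(left)
    for key, value in right.items():
        if key not in merged:
            merged[key] = value
        elif key in SKIP_CONCAT_KEYS and merged[key] == value:
            # Identical value on a guarded key: keep the existing one.
            continue
        else:
            # Default streaming behaviour: string fragments are concatenated.
            merged[key] += value
    return merged


chunk = {"ls_provider": "amazon_bedrock", "content": "ab"}
merged = chunk
for _ in range(19):
    merged = merge_flat(merged, chunk)

print(len(merged["ls_provider"]))  # 14
print(len(merged["content"]))      # 40
```

Keys that carry genuine fragments (here "content") still concatenate; only keys that repeat the same complete value on every chunk belong in the guard.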

Error Message

No explicit error is raised. Instead, ls_provider silently grows to N * len("amazon_bedrock") characters after N streaming chunks are merged.


Fix / Workaround

Add "ls_provider" to the skip-concatenation guard in _merge.py (langchain-core), matching the handling already in place for "model_provider". With the key guarded, identical values arriving on successive chunks are kept once instead of being concatenated.


Code Example

from langchain_core.utils._merge import merge_dicts

# Simulate two streaming chunks from ChatBedrockConverse (langchain-aws>=1.4.5).
# Each chunk carries ls_provider="amazon_bedrock" in response_metadata.
left = {"ls_provider": "amazon_bedrock", "model_provider": "bedrock_converse"}
right = {"ls_provider": "amazon_bedrock", "model_provider": "bedrock_converse"}
result = merge_dicts(left, right)

print("model_provider:", result["model_provider"])  # "bedrock_converse" (correct)
print("ls_provider:", result["ls_provider"])         # "amazon_bedrockamazon_bedrock" (BUG)

# After 20 streaming chunks (typical response):
metadata = {"ls_provider": "amazon_bedrock"}
for _ in range(19):
    metadata = merge_dicts(metadata, {"ls_provider": "amazon_bedrock"})
print(f"After 20 chunks: {len(metadata['ls_provider'])} chars")  # 280 instead of 14

---

No explicit error is raised. The reproduction above shows that ls_provider grows
to N * len("amazon_bedrock") after N streaming chunks. In our production
environment (LangGraph agent, 58+ graph steps, langchain-aws==1.4.5), we observed
massively inflated ls_provider fields across all AIMessages in conversation history,
contributing to request payload bloat.
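
Until the guard is updated, values that are already inflated could be normalized on the caller's side. The helper below is a hypothetical sketch (collapse_repeats is not part of LangChain): it collapses a string made up of whole repetitions of a shorter unit back to that unit and leaves any other string unchanged.

```python
def collapse_repeats(value: str) -> str:
    """Return the shortest unit whose whole repetitions form value."""
    n = len(value)
    for size in range(1, n // 2 + 1):
        # A candidate unit must divide the length and rebuild the string exactly.
        if n % size == 0 and value[:size] * (n // size) == value:
            return value[:size]
    return value


inflated = "amazon_bedrock" * 20
print(len(inflated))                # 280
print(collapse_repeats(inflated))   # amazon_bedrock
print(collapse_repeats("bedrock"))  # bedrock
```

This is only reasonable for fields that are expected to hold a single identifier; a value that legitimately repeats itself (for example "aa") would be collapsed as well.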

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.


Related Issues / PRs

  • langchain-ai/langchain-aws#946 — ls_provider vs model_provider mismatch
  • langchain-ai/langchain-aws#981 — PR that adds ls_provider to streaming chunks
  • #34807 — similar merge_dicts concatenation issue with tool call fields


System Info

System Information

  • OS: Darwin
  • OS Version: Darwin Kernel Version 25.4.0: Thu Mar 19 19:33:09 PDT 2026; root:xnu-12377.101.15~1/RELEASE_ARM64_T8112
  • Python Version: 3.14.3 (main, Feb 3 2026, 15:32:20) [Clang 17.0.0 (clang-1700.6.3.2)]

Package Information

  • langchain_core: 1.3.2
  • langsmith: 0.7.36
  • langchain_protocol: 0.0.11

Optional packages not installed

  • deepagents
  • deepagents-cli

Other Dependencies

  • httpx: 0.28.1
  • jsonpatch: 1.33
  • orjson: 3.11.8
  • packaging: 26.1
  • pydantic: 2.13.3
  • pytest: 9.0.3
  • pyyaml: 6.0.3
  • requests: 2.33.1
  • requests-toolbelt: 1.0.0
  • tenacity: 9.1.4
  • typing-extensions: 4.15.0
  • uuid-utils: 0.14.1
  • xxhash: 3.6.0
  • zstandard: 0.25.0

extent analysis

TL;DR

The issue can be fixed by modifying the merge_dicts function to handle the "ls_provider" key correctly, preventing its value from being concatenated.

Guidance

  • Root cause: merge_dicts concatenates string values when merging chunks, which is wrong for "ls_provider" because every chunk carries the same complete value rather than a fragment.
  • Fix: add "ls_provider" to the skip-concatenation guard in _merge.py, alongside the existing entry for "model_provider".
  • Verification: run the reproduction steps and confirm that "ls_provider" is still "amazon_bedrock" (14 characters) after merging multiple chunks.
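
The verification step can be expressed as a small check. This sketch is independent of LangChain: naive_merge and guarded_merge are hypothetical stand-ins for the behaviour before and after the fix, and survives_merging is an illustrative helper. Against a real installation, merge_dicts from langchain_core.utils._merge could be passed in instead of the stand-ins.

```python
from functools import reduce
from typing import Callable

Merge = Callable[[dict, dict], dict]


def naive_merge(left: dict, right: dict) -> dict:
    """Stand-in for the unguarded behaviour: shared string values concatenate."""
    return {**left, **{k: left[k] + v if k in left else v for k, v in right.items()}}


def guarded_merge(left: dict, right: dict) -> dict:
    """Stand-in for the fixed behaviour: an identical ls_provider is kept once."""
    merged = naive_merge(left, right)
    key = "ls_provider"
    if key in left and key in right and left[key] == right[key]:
        merged[key] = left[key]
    return merged


def survives_merging(merge: Merge, key: str, value: str, n_chunks: int = 20) -> bool:
    """True if merging n_chunks identical chunks leaves chunk[key] unchanged."""
    chunks = [{key: value} for _ in range(n_chunks)]
    return reduce(merge, chunks)[key] == value


print(survives_merging(naive_merge, "ls_provider", "amazon_bedrock"))    # False
print(survives_merging(guarded_merge, "ls_provider", "amazon_bedrock"))  # True
```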

Example

right_k in {"id", "output_version", "model_provider", "ls_provider"}  # updated guard

Notes

  • The issue was reported with langchain-aws 1.4.5, the release that started adding ls_provider to each streaming chunk, together with langchain-core 1.3.2.
  • The fix should be applied to the _merge.py file in the langchain-core package.

Recommendation

Apply the proposed fix: add "ls_provider" to the skip-concatenation guard so that its value is no longer concatenated when chunks are merged.
