langchain - 💡(How to fix) Fix Token-efficient serialization for agent message passing to reduce context window overhead at scale [1 participants]

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…
GitHub stats
langchain-ai/langchain#36764Fetched 2026-04-17 08:22:56
View on GitHub
Comments
0
Participants
1
Timeline
4
Reactions
0
Author
Participants
Timeline (top)
labeled ×3issue_type_added ×1

Root Cause

The problem compounds specifically in LangChain because:

Code Example

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    serializer="ulmen"  # opt-in
)

---

chain = (
    prompt 
    | llm 
    | output_parser
).with_config(serializer="ulmen")

---

from langchain_core.globals import set_serializer
set_serializer("ulmen")
RAW_BUFFERClick to expand / collapse

Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

I would like LangChain to support a pluggable serialization interface for agent message passing, with ULMEN as a drop-in alternative to JSON.

LangChain currently serializes agent messages, tool calls, and chain outputs as JSON. At scale this creates a measurable and expensive problem.

~44% of tokens in typical LangChain agent payloads are pure JSON syntax overhead before any reasoning begins.

ULMEN benchmarks on NVIDIA Tesla T4:

<img width="1861" height="1281" alt="Image" src="https://github.com/user-attachments/assets/acf1dbf5-2079-4dc3-bdf0-1d7bb5993006" />

This feature would allow users to:

  1. Reduce inference costs 44% with zero logic changes
  2. Reclaim context window capacity for actual reasoning
  3. Validate agent state transitions before they reach the LLM via a Semantic Firewall
  4. Scale agent pipelines without serialization becoming a cost bottleneck

Use Case

I'm trying to build multi-agent pipelines that run at scale without inference costs compounding faster than product value.

Currently I have to work around this by:

  • Manual prompt compression (lossy, fragile)
  • Custom serialization per chain (not systematic)
  • Accepting the JSON tax as a fixed cost

The problem compounds specifically in LangChain because:

  1. AgentExecutor passes full message history as JSON on every turn
  2. Tool results serialize back as JSON with repeated key names across every step
  3. LCEL chains serialize intermediate outputs as JSON between steps
  4. Multi-agent patterns built on RunnableSequence compound the overhead at every node

At 10M agent loops on GPT-4o this overhead costs approximately $59K. At 100M loops: $590K.

This feature would help users to systematically eliminate format overhead without changing any business logic.

Proposed Solution

A SerializerProtocol in langchain-core that allows pluggable serialization across the agent execution pipeline.

The API could look like:

Option 1: Agent level

agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    serializer="ulmen"  # opt-in
)

Option 2: Chain level

chain = (
    prompt 
    | llm 
    | output_parser
).with_config(serializer="ulmen")

Option 3: Global config

from langchain_core.globals import set_serializer
set_serializer("ulmen")

ULMEN implementation:

  • Drop-in Python/Rust library
  • No schema compilation required
  • Pure Python fallback if Rust unavailable
  • Byte-identical output between Python/Rust
  • BSL license, free under $10M revenue

Includes a Semantic Firewall that validates agent state transitions:

  • Rejects orphaned tool calls
  • Catches backwards step transitions
  • Validates enum states
  • Raises structured errors vs silent failures

Reproducible benchmarks: github.com/makroumi/ulmen

Alternatives Considered

  1. orjson Faster serialization but identical token count. Context window overhead unchanged.Doesn't address the core problem.

  2. MessagePack
    Smaller wire format but token count unchanged.No semantic validation layer.Not LLM-native.

  3. Manual context trimmingLossy. Requires custom logic per chain.Not systematic. Breaks with schema changes.

  4. Prompt compression (LLMLingua etc.) Works on natural language, not structured agent state. Different problem space.

None of these address the root cause: JSON was designed for web APIs, not LLM context windows. ULMEN was designed specifically for this constraint from the ground up.

Additional Context

This problem becomes more acute as:

  • Agent pipelines grow more complex
  • Context windows fill faster with history
  • Scale increases inference costs
  • Multi-agent patterns multiply overhead

Related community discussion:

  • r/LangChain: "Agentic workflows and the JSON trap" (posted today)

Similar approach validated independently:

  • LEAN format (r/LLMDevs): 47% token reduction
  • TOON format: token-oriented notation
  • Multiple teams converging on same insight

ULMEN goes further by adding the Semantic Firewall validation layer specifically for agent state integrity.

Full benchmark notebook (reproducible):github.com/makroumi/ulmen

Happy to submit a PR implementing the SerializerProtocol interface once maintainers confirm preferred integration approach.

extent analysis

TL;DR

Implement a pluggable serialization interface in LangChain to support ULMEN as an alternative to JSON, reducing inference costs and improving scalability.

Guidance

  • Introduce a SerializerProtocol in langchain-core to enable pluggable serialization across the agent execution pipeline.
  • Consider implementing the proposed AgentExecutor, Chain, and global configuration options to allow users to opt-in to ULMEN serialization.
  • Evaluate the ULMEN library's performance and features, including its Semantic Firewall validation layer, to ensure it meets LangChain's requirements.
  • Review the provided benchmarks and reproducible notebook to understand the potential benefits of ULMEN serialization.

Example

# Option 1: Agent level
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    serializer="ulmen"  # opt-in
)

Notes

The proposed solution requires careful evaluation of the ULMEN library and its integration with LangChain. The maintainers should confirm the preferred integration approach before a PR is submitted.

Recommendation

Apply the proposed SerializerProtocol interface to enable pluggable serialization, allowing users to opt-in to ULMEN serialization and potentially reduce inference costs by 44%.

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING