langchain - 💡(How to fix) Fix PIIMiddleware state hooks miss tool-call args and corrupt structured content

Official PRs (…)
ON THIS PAGE

Recommended Tools

×6

Utilities matched from this issue’s tags and category — try them while you read without losing context.

GitHub issue graph ai analysis

Paste a GitHub issue URL. We fetch that issue, discover linked issues from bodies/comments/timeline, collect linked pull requests, and produce a structured English report.

The report is written in English Markdown for sharing and archival.

Helpful · Quick feedback

Loading…

PIIMiddleware has two redaction paths now:

  1. state-level hooks (before_model / after_model)
  2. the recursive message/value walker added in #37616 for streaming surfaces and values snapshots

Those two paths disagree in two cases:

  1. after_model returns early for tool-call-only AIMessage(content="", tool_calls=[...]), so tool_calls[*].args are not redacted for non-streaming invoke()/graph-state consumers.
  2. the state hooks call str(message.content) for HumanMessage, ToolMessage, and AIMessage, which stringifies list-typed structured content blocks instead of preserving their shape.

This looks unintended because #37616 added _redact_value() / _redact_base_message(), added unit coverage for empty-content tool-call messages and structured content, and documented the state-level hooks as the non-streaming backstop.

Expected behavior:

  • state hooks should redact the same message surfaces the shared recursive helper already handles
  • structured content should stay structured
  • after_model should not skip tool-call-only assistant messages just because content == ""

Suggested fix: Refactor before_model and after_model to reuse the shared recursive message redactor instead of manually calling str(message.content) and reconstructing partial message objects.

Error Message

N/A. This is a deterministic behavior bug rather than an exception path.

Actual behavior:

  • after_model returns None for AIMessage(content="", tool_calls=[...])
  • before_model/after_model call str(message.content), which turns structured content blocks into a Python repr string

Root Cause

This looks unintended because #37616 added _redact_value() / _redact_base_message(), added unit coverage for empty-content tool-call messages and structured content, and documented the state-level hooks as the non-streaming backstop.

Fix Action

Fix / Workaround

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Other Dependencies

httpx: 0.28.1 jsonpatch: 1.33 langgraph: 1.2.1 orjson: 3.11.6 packaging: 24.2 pydantic: 2.12.5 pyyaml: 6.0.3 requests: 2.33.0 requests-toolbelt: 1.0.0 tenacity: 9.1.2 typing-extensions: 4.15.0 uuid-utils: 0.12.0 xxhash: 3.6.0 zstandard: 0.25.0

Code Example

from typing import Any

from langchain.agents import AgentState
from langchain.agents.middleware.pii import PIIMiddleware
from langchain_core.messages import AIMessage, HumanMessage, ToolCall
from langgraph.runtime import Runtime

# 1) after_model skips tool-call-only AI messages
middleware = PIIMiddleware("email", apply_to_output=True)
msg = AIMessage(
    content="",
    tool_calls=[
        ToolCall(
            name="send_email",
            args={"to": "[email protected]"},
            id="c1",
        )
    ],
)
state = AgentState[Any](messages=[HumanMessage("hi"), msg])

result = middleware.after_model(state, Runtime())

# Expected: tool call args are redacted
assert result is not None
assert result["messages"][-1].tool_calls[0]["args"] == {"to": "[REDACTED_EMAIL]"}


# 2) before_model stringifies structured content blocks
middleware = PIIMiddleware("email", apply_to_input=True)
msg = HumanMessage(
    content=[{"type": "text", "text": "Reach me at [email protected]"}]
)
state = AgentState[Any](messages=[msg])

result = middleware.before_model(state, Runtime())

# Expected: content remains a list of blocks with text redacted in place
assert result is not None
assert isinstance(result["messages"][0].content, list)
assert result["messages"][0].content[0]["text"] == "Reach me at [REDACTED_EMAIL]"

---

N/A. This is a deterministic behavior bug rather than an exception path.

Actual behavior:
- after_model returns None for AIMessage(content="", tool_calls=[...])
- before_model/after_model call str(message.content), which turns structured content blocks into a Python repr string
RAW_BUFFERClick to expand / collapse

Submission checklist

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

  • Related background: #35011
  • Original PIIMiddleware implementation: #33271
  • Recent streaming/value-snapshot redaction work: #37616

Reproduction Steps / Example Code (Python)

from typing import Any

from langchain.agents import AgentState
from langchain.agents.middleware.pii import PIIMiddleware
from langchain_core.messages import AIMessage, HumanMessage, ToolCall
from langgraph.runtime import Runtime

# 1) after_model skips tool-call-only AI messages
middleware = PIIMiddleware("email", apply_to_output=True)
msg = AIMessage(
    content="",
    tool_calls=[
        ToolCall(
            name="send_email",
            args={"to": "[email protected]"},
            id="c1",
        )
    ],
)
state = AgentState[Any](messages=[HumanMessage("hi"), msg])

result = middleware.after_model(state, Runtime())

# Expected: tool call args are redacted
assert result is not None
assert result["messages"][-1].tool_calls[0]["args"] == {"to": "[REDACTED_EMAIL]"}


# 2) before_model stringifies structured content blocks
middleware = PIIMiddleware("email", apply_to_input=True)
msg = HumanMessage(
    content=[{"type": "text", "text": "Reach me at [email protected]"}]
)
state = AgentState[Any](messages=[msg])

result = middleware.before_model(state, Runtime())

# Expected: content remains a list of blocks with text redacted in place
assert result is not None
assert isinstance(result["messages"][0].content, list)
assert result["messages"][0].content[0]["text"] == "Reach me at [REDACTED_EMAIL]"

Error Message and Stack Trace (if applicable)

N/A. This is a deterministic behavior bug rather than an exception path.

Actual behavior:
- after_model returns None for AIMessage(content="", tool_calls=[...])
- before_model/after_model call str(message.content), which turns structured content blocks into a Python repr string

Description

PIIMiddleware has two redaction paths now:

  1. state-level hooks (before_model / after_model)
  2. the recursive message/value walker added in #37616 for streaming surfaces and values snapshots

Those two paths disagree in two cases:

  1. after_model returns early for tool-call-only AIMessage(content="", tool_calls=[...]), so tool_calls[*].args are not redacted for non-streaming invoke()/graph-state consumers.
  2. the state hooks call str(message.content) for HumanMessage, ToolMessage, and AIMessage, which stringifies list-typed structured content blocks instead of preserving their shape.

This looks unintended because #37616 added _redact_value() / _redact_base_message(), added unit coverage for empty-content tool-call messages and structured content, and documented the state-level hooks as the non-streaming backstop.

Expected behavior:

  • state hooks should redact the same message surfaces the shared recursive helper already handles
  • structured content should stay structured
  • after_model should not skip tool-call-only assistant messages just because content == ""

Suggested fix: Refactor before_model and after_model to reuse the shared recursive message redactor instead of manually calling str(message.content) and reconstructing partial message objects.

System Info

System Information

OS: Darwin OS Version: Darwin Kernel Version 25.5.0: Mon Apr 27 20:38:56 PDT 2026; root:xnu-12377.121.6~2/RELEASE_ARM64_T6000 Python Version: 3.13.2 (main, Feb 4 2025, 14:51:09) [Clang 16.0.0 (clang-1600.0.26.6)]

Package Information

langchain_core: 1.4.0 langchain: 1.3.1 langsmith: 0.8.0 langchain_protocol: 0.0.15 langgraph_sdk: 0.3.3

Optional packages not installed

deepagents deepagents-cli

Other Dependencies

httpx: 0.28.1 jsonpatch: 1.33 langgraph: 1.2.1 orjson: 3.11.6 packaging: 24.2 pydantic: 2.12.5 pyyaml: 6.0.3 requests: 2.33.0 requests-toolbelt: 1.0.0 tenacity: 9.1.2 typing-extensions: 4.15.0 uuid-utils: 0.12.0 xxhash: 3.6.0 zstandard: 0.25.0

Vote matrix · Quick signals

Works
Did the solution work? Tap to confirm.
Easy Fix
Was it a quick fix?
Time Saver
Did it save you time?
Blocking
Was it severely blocking?
Common Issue
Are others likely hitting this too?
Flaky / Intermittent
Is it intermittent?
Verified / Reproducible
Can you reproduce it reliably?
Loading…

Still need to ship something?

×6

Another batch ranked right after the header list — different links, same matching logic.

Back to top recommendations

TRENDING