langchain - 💡(How to fix) Fix ChatMistralAI: citation metadata from Mistral API response is silently dropped [2 comments, 2 participants]

GitHub stats
langchain-ai/langchain#36427 · Fetched 2026-04-08 02:22:26
Comments: 2 · Participants: 2 · Timeline: 11 · Reactions: 0
Timeline (top): labeled ×3, commented ×2, mentioned ×2, subscribed ×2


Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

ChatMistralAI._convert_mistral_chat_message_to_message treats response content as a plain string. When Mistral's API is called with citations=True, content is instead a list of typed chunks (text and reference), and the reference metadata (reference_ids, source mapping) is silently dropped.

The citation data should be extracted and stored in response_metadata["citations"] so users doing RAG with Mistral can map answer fragments back to source documents.

Relevant code: _convert_mistral_chat_message_to_message in langchain_mistralai/chat_models.py, specifically:

content = _message.get("content", "") or ""

Use Case

RAG pipelines using Mistral models with native citation support. Mistral returns which parts of the answer come from which source documents, but there's currently no way to access that through ChatMistralAI. Users who need inline citations have to bypass langchain and call the Mistral SDK directly.

Proposed Solution

When content is a list, concatenate the text for backward compatibility and extract reference chunks into response_metadata:

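citations = []
if isinstance(content, list):
    parts = []
    for chunk in content:
        # Every chunk type carries a "text" field; keep it for the flat string.
        parts.append(chunk.get("text", ""))
        # Reference chunks also carry reference_ids; keep the whole chunk.
        if chunk.get("type") == "reference":
            citations.append(chunk)
    # Collapse the chunks back into a plain string for backward compatibility.
    content = "".join(parts)

response_metadata = {"model_provider": "mistralai"}
# Only add the key when the response actually contained citations.
if citations:
    response_metadata["citations"] = citations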

content stays a string, and citations are available via response_metadata["citations"], so this is not a breaking change.
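
To make the proposal easier to test in isolation, here is a minimal standalone sketch of the same logic. The helper names split_content_and_citations and build_response_metadata are hypothetical and are not part of langchain_mistralai; the handling of None and of bare string items inside the list is an added assumption, not something the Mistral docs are quoted as guaranteeing.

```python
from typing import Any


def split_content_and_citations(content: Any) -> tuple[str, list[dict]]:
    """Return (text, citations) for a string or list-of-chunks content value.

    Hypothetical helper mirroring the proposed change; not a langchain API.
    """
    if not isinstance(content, list):
        # Plain-string (or missing) content passes through unchanged.
        return content or "", []

    parts: list[str] = []
    citations: list[dict] = []
    for chunk in content:
        if isinstance(chunk, str):
            # Assumption: tolerate bare strings mixed into the list.
            parts.append(chunk)
            continue
        parts.append(chunk.get("text", ""))
        if chunk.get("type") == "reference":
            citations.append(chunk)
    return "".join(parts), citations


def build_response_metadata(citations: list[dict]) -> dict:
    """Attach citations only when the response contained any."""
    metadata: dict = {"model_provider": "mistralai"}
    if citations:
        metadata["citations"] = citations
    return metadata
```

Keeping the string branch first means existing callers that never request citations should see identical behavior.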

Alternatives Considered

  • Calling the mistralai SDK directly instead of going through langchain — works but loses all the langchain integration (chains, callbacks, tracing)
  • Wrapping ChatMistralAI with a post-processing step that re-parses the raw API response — fragile, duplicates work

Additional Context

Mistral citation response format:

content = [
    {"type": "text", "text": "According to the document, "},
    {"type": "reference", "reference_ids": [0], "text": "the temperature is 20°C"},
    {"type": "text", "text": " on average."}
]
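
For inline citations, it may also help to know where each reference chunk lands in the flattened string. The sketch below computes character offsets from the chunk list shown above; citation_spans and the start/end keys are hypothetical names chosen for illustration and go beyond what the issue proposes.

```python
def citation_spans(content: list[dict]) -> list[dict]:
    """Locate each reference chunk inside the concatenated answer text."""
    spans = []
    offset = 0  # running position in the concatenated string
    for chunk in content:
        text = chunk.get("text", "")
        if chunk.get("type") == "reference":
            spans.append(
                {
                    "start": offset,
                    "end": offset + len(text),
                    "reference_ids": chunk.get("reference_ids", []),
                }
            )
        offset += len(text)
    return spans


content = [
    {"type": "text", "text": "According to the document, "},
    {"type": "reference", "reference_ids": [0], "text": "the temperature is 20°C"},
    {"type": "text", "text": " on average."},
]
answer = "".join(chunk.get("text", "") for chunk in content)
spans = citation_spans(content)
# Slicing the answer with a span recovers the cited fragment.
cited_fragment = answer[spans[0]["start"]:spans[0]["end"]]
```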

Docs: https://docs.mistral.ai/capabilities/citations/

extent analysis

TL;DR

Modify the _convert_mistral_chat_message_to_message function in langchain_mistralai/chat_models.py to handle cases where the content is a list, extracting citation data and storing it in response_metadata["citations"].

Guidance

  • Identify the content type in the _convert_mistral_chat_message_to_message function and apply different handling based on whether it's a string or a list.
  • When content is a list, iterate through its elements to concatenate text parts and extract reference chunks for citation data.
  • Store the extracted citation data in response_metadata["citations"] for access in RAG pipelines.
  • Ensure backward compatibility by keeping content as a string after processing.
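
On the consuming side, the steps above could be used roughly as follows. This is a sketch under assumptions: response_metadata["citations"] is the key proposed in this issue (it does not exist in langchain-mistralai today), resolve_citations is a hypothetical helper, and the sources mapping assumes that each reference_id is a key into whatever documents the caller supplied to the model.

```python
def resolve_citations(
    response_metadata: dict, sources: dict[int, str]
) -> list[tuple[str, list[str]]]:
    """Pair each cited answer fragment with the source documents it points to."""
    resolved = []
    for citation in response_metadata.get("citations", []):
        ids = citation.get("reference_ids", [])
        # Skip ids the caller has no document for rather than raising.
        docs = [sources[i] for i in ids if i in sources]
        resolved.append((citation.get("text", ""), docs))
    return resolved


# Shape of the metadata as it would look after the proposed change.
response_metadata = {
    "model_provider": "mistralai",
    "citations": [
        {"type": "reference", "reference_ids": [0], "text": "the temperature is 20°C"}
    ],
}
sources = {0: "weather_report.txt"}  # hypothetical source mapping
resolved = resolve_citations(response_metadata, sources)
```

Responses without citations yield an empty list, so callers would not need a separate code path for models or requests that return plain strings.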

Example

The proposed solution code snippet already provides a clear example of how to achieve this:

citations = []
if isinstance(content, list):
    parts = []
    for chunk in content:
        parts.append(chunk.get("text", ""))
        if chunk.get("type") == "reference":
            citations.append(chunk)
    content = "".join(parts)

response_metadata = {"model_provider": "mistralai"}
if citations:
    response_metadata["citations"] = citations

Notes

This solution assumes that the Mistral API's response format for citations is consistent and can be reliably parsed as shown in the example. Any changes to the API's response format could require adjustments to this solution.

Recommendation

Apply the proposed change by modifying the _convert_mistral_chat_message_to_message function as described, so that citation data is extracted and stored without breaking existing functionality. This makes Mistral's native citation support usable within RAG pipelines through langchain.
