langchain - ✅(Solved) Fix Mistral integration does not properly handle retries and concurrency limits [3 pull requests, 2 comments, 2 participants]

GitHub stats
langchain-ai/langchain#36753 · Fetched 2026-04-17 08:23:02
Comments: 2
Participants: 2
Timeline events: 9
Reactions: 0
Timeline (top): cross-referenced ×3, labeled ×3, commented ×2, issue_type_added ×1

The current Mistral integration does not properly enforce retry behavior and concurrency limits.

Expected behavior:

  • Retries should only occur for transient errors (e.g., 429, 5xx, network failures)
  • Concurrency should be limited by max_concurrent_requests

Actual behavior:

  • Retry logic applies broadly without filtering retryable errors
  • Async calls using asyncio.gather do not enforce concurrency limits, leading to uncontrolled parallel requests

This can result in unnecessary retries for permanent errors and excessive concurrent requests that may hit rate limits or degrade performance.
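
The expected retry behavior can be sketched without any Mistral-specific code. The snippet below is a minimal illustration, not the integration's actual implementation: `ApiError`, `is_retryable`, and `call_with_retry` are hypothetical names introduced here, and the status-code rule simply mirrors the "429, 5xx, network failures" list above.

```python
import asyncio
import random


class ApiError(Exception):
    """Hypothetical stand-in for an HTTP error that carries a status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


def is_retryable(exc: BaseException) -> bool:
    """Return True only for errors that are likely to be transient."""
    if isinstance(exc, ApiError):
        # 429 (rate limited) and 5xx (server side) may succeed on a later attempt.
        return exc.status_code == 429 or 500 <= exc.status_code < 600
    # Treat connection problems and timeouts as transient network failures.
    return isinstance(exc, (ConnectionError, TimeoutError, asyncio.TimeoutError))


async def call_with_retry(func, *, max_attempts: int = 3, base_delay: float = 0.01):
    """Await func() and retry with exponential backoff on retryable errors only."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await func()
        except Exception as exc:
            if not is_retryable(exc) or attempt == max_attempts:
                raise  # permanent error, or attempts exhausted
            # Exponential backoff with a little jitter before the next attempt.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1) * (1 + random.random()))


async def demo():
    attempts = {"flaky": 0, "bad_request": 0}

    async def flaky():
        # Fails twice with a 503, then succeeds.
        attempts["flaky"] += 1
        if attempts["flaky"] < 3:
            raise ApiError(503)
        return "ok"

    async def bad_request():
        # A 400 is a permanent error and should not be retried.
        attempts["bad_request"] += 1
        raise ApiError(400)

    result = await call_with_retry(flaky)
    try:
        await call_with_retry(bad_request)
    except ApiError:
        pass
    return result, attempts


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this sketch the 503 case is attempted three times while the 400 case is attempted once, which is the distinction the issue asks for.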

PR fix notes

PR #36752: fix(mistralai): add retry handling and enforce concurrency limits for API calls

Description (problem / solution / changelog)

Fixes #36753

Adds retry handling for transient errors and enforces concurrency limits in the Mistral integration to improve reliability of API calls.

Changed files

  • libs/partners/mistralai/langchain_mistralai/chat_models.py (modified, +33/-3)
  • libs/partners/mistralai/langchain_mistralai/embeddings.py (modified, +24/-9)
  • libs/partners/mistralai/tests/unit_tests/test_chat_models.py (modified, +59/-0)
  • libs/partners/mistralai/tests/unit_tests/test_embeddings.py (modified, +50/-0)

PR #36754: fix(mistralai): add retry handling and enforce concurrency limits for API calls

Description (problem / solution / changelog)

Fixes #36753

Adds retry handling for transient errors and enforces concurrency limits in the Mistral integration to improve reliability of API calls.

Social handles (optional)

LinkedIn: https://linkedin.com/in/rugvedchandekar

Changed files

  • libs/partners/mistralai/langchain_mistralai/chat_models.py (modified, +33/-3)
  • libs/partners/mistralai/langchain_mistralai/embeddings.py (modified, +24/-9)
  • libs/partners/mistralai/tests/unit_tests/test_chat_models.py (modified, +59/-0)
  • libs/partners/mistralai/tests/unit_tests/test_embeddings.py (modified, +50/-0)

PR #36756: fix(mistralai): add retry handling and enforce concurrency limits for API calls

Description (problem / solution / changelog)

Improves retry handling and enforces concurrency limits in the Mistral integration to improve reliability of API calls.

Related to #36753

Changed files

  • libs/partners/mistralai/langchain_mistralai/chat_models.py (modified, +33/-3)
  • libs/partners/mistralai/langchain_mistralai/embeddings.py (modified, +24/-9)
  • libs/partners/mistralai/tests/unit_tests/test_chat_models.py (modified, +59/-0)
  • libs/partners/mistralai/tests/unit_tests/test_embeddings.py (modified, +50/-0)

Code Example

import asyncio
from langchain_mistralai import ChatMistralAI

async def test_concurrency():
    llm = ChatMistralAI(max_concurrent_requests=2)

    async def call():
        return await llm.ainvoke("Hello")

    # Fire multiple requests
    await asyncio.gather(*(call() for _ in range(5)))

asyncio.run(test_concurrency())
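
The reproduction above needs a Mistral API key and does not itself report how many calls were in flight. As a rough way to observe the difference, the sketch below replaces the real client with a hypothetical `FakeModel` that records its peak number of concurrent `ainvoke` calls; `peak_concurrency` is likewise an illustrative helper, not part of LangChain.

```python
import asyncio


class FakeModel:
    """Hypothetical stand-in for a chat model; records peak in-flight calls."""

    def __init__(self) -> None:
        self.in_flight = 0
        self.peak = 0

    async def ainvoke(self, prompt: str) -> str:
        self.in_flight += 1
        self.peak = max(self.peak, self.in_flight)
        try:
            await asyncio.sleep(0.01)  # simulate network latency
            return f"echo: {prompt}"
        finally:
            self.in_flight -= 1


async def peak_concurrency(n_calls: int, limit=None) -> int:
    """Run n_calls through a FakeModel and return the peak number in flight."""
    model = FakeModel()
    # Created inside the running event loop; None means "no limit".
    semaphore = asyncio.Semaphore(limit) if limit else None

    async def call() -> str:
        if semaphore is None:
            return await model.ainvoke("Hello")
        async with semaphore:
            return await model.ainvoke("Hello")

    await asyncio.gather(*(call() for _ in range(n_calls)))
    return model.peak


if __name__ == "__main__":
    print("unbounded:", asyncio.run(peak_concurrency(5)))
    print("limit=2:  ", asyncio.run(peak_concurrency(5, limit=2)))
```

With a bare `asyncio.gather`, all five calls start before any of them finishes; with the semaphore, at most two run at once.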

---
RAW_BUFFER

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

No response

System Info

System Information

OS: Windows
OS Version: 10.0.19045
Python Version: 3.11.14

Package Information

langchain_core: 1.3.0a2
langsmith: 0.6.3
langchain_mistralai: 1.1.2
langchain_tests: 1.1.6

Other Dependencies

httpx: 0.28.1
httpx-sse: 0.4.1
pydantic: 2.12.1
tenacity: 9.1.2
tokenizers: 0.22.1

extent analysis

TL;DR

The issue can be addressed by modifying the retry logic in the Mistral integration to filter retryable errors and enforcing concurrency limits using a semaphore.

Guidance

  • Review the ChatMistralAI class to ensure that retry logic only applies to transient errors (e.g., 429, 5xx, network failures) and not permanent errors.
  • Implement a semaphore to limit concurrent requests based on the max_concurrent_requests parameter, preventing excessive parallel requests.
  • Consider using tenacity, which already appears in the dependency list under System Info, to express the retry condition, attempt limit, and backoff in one place.
  • Verify that the modified integration behaves as expected by testing it with various error scenarios and concurrency levels.

Example

import asyncio
from langchain_mistralai import ChatMistralAI

async def test_concurrency():
    # Requires a Mistral API key to be configured.
    llm = ChatMistralAI(max_concurrent_requests=2)
    # asyncio.Semaphore is part of the standard library; no extra package is needed.
    semaphore = asyncio.Semaphore(llm.max_concurrent_requests)

    async def call():
        # At most max_concurrent_requests calls hold the semaphore at once.
        async with semaphore:
            return await llm.ainvoke("Hello")

    # Fire multiple requests
    await asyncio.gather(*(call() for _ in range(5)))

asyncio.run(test_concurrency())

Notes

The provided example code snippet is a minimal illustration of how to enforce concurrency limits using a semaphore. The actual implementation may require additional modifications to the ChatMistralAI class and its retry logic.
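
One design question the note leaves open is how the semaphore and the retry loop should nest. A possible arrangement, sketched here under assumptions, holds the semaphore per attempt rather than around the whole retry loop, so a call that is sleeping between attempts does not occupy a slot. `TransientError`, `limited_call`, and `run_batch` are hypothetical names for illustration and do not come from the PRs.

```python
import asyncio


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. 429, 5xx, network)."""


async def limited_call(func, semaphore, *, max_attempts: int = 3, base_delay: float = 0.01):
    """Run func() under the semaphore, retrying only TransientError.

    The slot is held per attempt, so a call that is backing off
    does not block other requests from using it.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            async with semaphore:
                return await func()
        except TransientError:
            if attempt == max_attempts:
                raise  # attempts exhausted
        # Back off outside the semaphore before the next attempt.
        await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def run_batch(funcs, max_concurrent_requests: int):
    """Run all callables with bounded concurrency; results keep input order."""
    semaphore = asyncio.Semaphore(max_concurrent_requests)
    return await asyncio.gather(*(limited_call(f, semaphore) for f in funcs))


async def demo():
    state = {"in_flight": 0, "peak": 0, "failed_once": False}

    def make(i):
        async def call():
            state["in_flight"] += 1
            state["peak"] = max(state["peak"], state["in_flight"])
            try:
                await asyncio.sleep(0.01)  # simulate network latency
                if i == 0 and not state["failed_once"]:
                    state["failed_once"] = True
                    raise TransientError("simulated 503")
                return i
            finally:
                state["in_flight"] -= 1
        return call

    results = await run_batch([make(i) for i in range(5)], max_concurrent_requests=2)
    return results, state["peak"]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Any exception other than `TransientError` propagates on the first attempt, which keeps permanent errors from being retried.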

Recommendation

Until the fix is available in a release, apply a workaround on the caller side: guard calls with a semaphore and retry only transient errors. The reporter states that updating to the latest stable version of LangChain does not resolve the issue.
