langchain - ✅(Solved) Fix `RunnableRetry` does not fire `on_retry` callback [3 pull requests, 5 comments, 6 participants]

GitHub stats
langchain-ai/langchain#36382 · Fetched 2026-04-08 01:52:46
Comments: 5
Participants: 6
Timeline: 13
Reactions: 0
Timeline (top): commented ×5, cross-referenced ×3, labeled ×3, issue_type_added ×1

I am using RunnableLambda's with_retry method to deal with some transient errors, but I also want to get each error's frequency. I defined a callback handler that would log on_retry, but with_retry doesn't fire on_retry (see code snippet). I have a working fix for this on my branch which wires up RunnableLambda to the hooks in BaseCallbackHandler, but wanted to check before making a PR.

Error Message

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.runnables import RunnableLambda
from tenacity import RetryCallState

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

count = 0
def flaky(x):
    global count
    count += 1
    if count < 3:
        raise ValueError("transient")
    return x

runnable = RunnableLambda(flaky).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=5,
)

# on_retry never prints
runnable.invoke(1, config={"callbacks": [RetryLogger()]})
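
The stated goal is to track how often each error occurs. As a rough sketch of what such a handler might look like once on_retry actually fires, the snippet below tallies retries by exception class. RetryFrequencyLogger and failed_state are hypothetical names; the handler is shown as a plain class (in LangChain it would presumably subclass BaseCallbackHandler), and the retry state is simulated with a standard-library Future, assuming the real tenacity retry state exposes attempt_number and an outcome with an exception() method.

```python
from collections import Counter
from concurrent.futures import Future
from types import SimpleNamespace


class RetryFrequencyLogger:
    """Hypothetical handler that counts retries per exception class."""

    def __init__(self):
        self.error_counts = Counter()

    def on_retry(self, retry_state, **kwargs):
        # outcome is a Future-like object; exception() returns the stored error
        outcome = retry_state.outcome
        error = outcome.exception() if outcome is not None else None
        if error is not None:
            self.error_counts[type(error).__name__] += 1


def failed_state(attempt_number, error):
    """Build a stand-in for a retry state whose attempt failed (illustration only)."""
    outcome = Future()
    outcome.set_exception(error)
    return SimpleNamespace(attempt_number=attempt_number, outcome=outcome)


logger = RetryFrequencyLogger()
logger.on_retry(failed_state(1, ValueError("transient")))
logger.on_retry(failed_state(2, ValueError("transient")))
logger.on_retry(failed_state(3, TimeoutError("slow")))
print(dict(logger.error_counts))
```

Keying the counter on the exception class name keeps the tally readable in logs; keying on the class itself would work equally well.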


PR fix notes

PR #36383: fix(core): fire on_retry callback in RunnableRetry

Description (problem / solution / changelog)

Fixes #36382

This updates RunnableRetry to fire the on_retry callback for retry attempts in sync, async, and batch execution paths. It also adds unit tests covering those callback flows so retry handlers receive the expected attempt metadata.

AI assistance disclaimer: AI tooling was used to help draft and validate this change, and the final code and PR content were reviewed by the author.

  • Added unit tests covering sync invoke, async ainvoke, and batch retry callback behavior.
  • Attempted to run targeted pytest coverage for the new retry callback tests locally, but the current environment is missing the blockbuster test dependency, so I could not complete the test run in this checkout.

Important Notes

  • This is not a breaking change; it restores expected callback behavior for retries.
  • The change is scoped to core, which matches the modified files.
  • There is a local modification to libs/core/uv.lock in the working tree that is not part of the branch diff against origin/master; do not include it in the PR on purpose.

Verification Notes

If you want the verification section to claim passing checks instead of the blocked local run, first install the core test dependencies and rerun the targeted tests from libs/core.

Changed files

  • libs/core/langchain_core/runnables/retry.py (modified, +18/-8)
  • libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +88/-0)

PR #36424: fix(core): fire on_retry callback in RunnableRetry for all execution paths

Description (problem / solution / changelog)

Why

RunnableRetry wraps tenacity's retry loop but never calls run_manager.on_retry() after a failed attempt. This means any BaseCallbackHandler that implements on_retry receives no signal, making it impossible to observe or log individual retry attempts via the standard callback interface.

Fixes #36382

What changed

libs/core/langchain_core/runnables/retry.py: after each failed tenacity attempt (detected via attempt.retry_state.outcome.failed), the appropriate on_retry method is called on the run manager in all four execution paths:

Method | Change
_invoke | calls run_manager.on_retry(retry_state)
_ainvoke | calls await run_manager.on_retry(retry_state)
_batch | calls rm.on_retry(retry_state) for each run manager
_abatch | calls await rm.on_retry(retry_state) for each run manager

No public API changes. No new parameters. Fully backward compatible.

Tests

Four new unit tests added to test_runnable.py:

  • test_retry_fires_on_retry_callback: sync invoke, 2 failures then success, verifies 2 on_retry calls with correct attempt numbers
  • test_async_retry_fires_on_retry_callback: async invoke, same scenario
  • test_retry_batch_fires_on_retry_callback: sync batch with one failing element, verifies on_retry fires
  • test_async_retry_batch_fires_on_retry_callback: async batch, same scenario

All 8 relevant tests (4 new + 4 existing retry tests) pass.

Areas requiring careful review

  • The outcome.failed check runs outside the with attempt: block. After a successful attempt, tenacity sets outcome to a non-failed result before we check — so the elif branch will not fire on success. This is correct, but worth confirming against tenacity's internal state machine.

AI disclosure

This PR was developed with AI assistance. I have reviewed every line of the fix and the tests, understand the root cause, and verified all tests pass locally.

Changed files

  • libs/core/langchain_core/runnables/retry.py (modified, +10/-0)
  • libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +138/-0)

PR #36556: fix(core): fire on_retry callback in RunnableRetry

Description (problem / solution / changelog)

Description

RunnableRetry does not fire on_retry callbacks, making it impossible to observe retry attempts through LangChain's callback system.

This PR wires tenacity's before_sleep hook to run_manager.on_retry() across all four execution paths: _invoke, _ainvoke, _batch, and _abatch.

Fixes #36382

Changes

Production (langchain_core/runnables/retry.py)

  • _invoke / _ainvoke: Added a _before_sleep closure that calls run_manager.on_retry(retry_state) before each retry sleep. Passed to self._sync_retrying() / self._async_retrying() via the before_sleep kwarg.
  • _batch / _abatch: Same pattern, but the closure iterates only over retry_run_managers — the subset of run managers whose inputs failed in the current attempt. Successful inputs never receive on_retry.

Tests (tests/unit_tests/runnables/test_runnable.py)

Added 4 regression tests with a RetryCallbackHandler that records attempt_number from each on_retry call:

Test | Path | Validates
test_retry_invoke_fires_on_retry_callback | sync invoke | on_retry fires on attempts 1, 2
test_retry_ainvoke_fires_on_retry_callback | async invoke | same, async path
test_retry_batch_fires_on_retry_callback_for_failed_inputs_only | sync batch | only failed input's handler fires
test_retry_abatch_fires_on_retry_callback_for_failed_inputs_only | async batch | same, async path

Notes

  • No new dependencies introduced (RetryCallState was already imported).
  • The batch path shares a single retry_state across all failed inputs in a given attempt (inherent to tenacity's single-exception-raise model). This is consistent with the existing batch retry architecture and can be refined in a future PR if per-input retry state is needed.

Changed files

  • libs/core/langchain_core/runnables/retry.py (modified, +49/-5)
  • libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +168/-0)


---
RAW_BUFFER

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Related Issues / PRs

No response

Reproduction Steps / Example Code (Python)

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.runnables import RunnableLambda
from tenacity import RetryCallState

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

count = 0
def flaky(x):
    global count
    count += 1
    if count < 3:
        raise ValueError("transient")
    return x

runnable = RunnableLambda(flaky).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=5,
)

# on_retry never prints
runnable.invoke(1, config={"callbacks": [RetryLogger()]})

Error Message and Stack Trace (if applicable)

Description

I am using RunnableLambda's with_retry method to deal with some transient errors, but I also want to get each error's frequency. I defined a callback handler that would log on_retry, but with_retry doesn't fire on_retry (see code snippet). I have a working fix for this on my branch which wires up RunnableLambda to the hooks in BaseCallbackHandler, but wanted to check before making a PR.

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 25.3.0: Wed Jan 28 20:54:05 PST 2026; root:xnu-12377.81.4~5/RELEASE_ARM64_T8142
Python Version: 3.12.12 (main, Jan 13 2026, 17:48:58) [Clang 21.1.4 ]

Package Information

langchain_core: 1.2.7
langchain: 1.2.4
langchain_community: 0.4.1
langsmith: 0.6.4
langchain_anthropic: 1.3.1
langchain_classic: 1.0.1
langchain_google_genai: 4.2.0
langchain_openai: 1.1.7
langchain_text_splitters: 1.1.0
langgraph_sdk: 0.3.3

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.13.3
anthropic: 0.76.0
dataclasses-json: 0.6.7
filetype: 1.2.0
google-genai: 1.59.0
httpx: 0.28.1
httpx-sse: 0.4.3
jsonpatch: 1.33
langgraph: 1.0.6
numpy: 2.4.1
openai: 2.15.0
opentelemetry-api: 1.39.1
opentelemetry-exporter-otlp-proto-http: 1.39.1
opentelemetry-sdk: 1.39.1
orjson: 3.11.5
packaging: 25.0
pydantic: 2.12.5
pydantic-settings: 2.12.0
pyyaml: 6.0.3
PyYAML: 6.0.3
requests: 2.32.5
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.45
sqlalchemy: 2.0.45
tenacity: 9.1.2
tiktoken: 0.12.0
typing-extensions: 4.15.0
uuid-utils: 0.13.0
zstandard: 0.25.0

extent analysis

Fix Plan

The bug lives in RunnableRetry, the runnable that with_retry returns, not in RunnableLambda itself: the retry loop never calls on_retry on the run manager, so handlers passed through config are not notified.

Here are the steps to address this:

  • In libs/core/langchain_core/runnables/retry.py, call on_retry on the run manager after each failed attempt, which is the approach the PRs above take.
  • Until a release includes the fix, one possible workaround is to wrap the function with tenacity directly and pass the handler's on_retry as the before_sleep hook, as sketched below.

Code Changes

from langchain_core.callbacks import BaseCallbackHandler
from tenacity import RetryCallState, retry, retry_if_exception_type, stop_after_attempt

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

def flaky(x):
    # Fails on the first two calls, then succeeds
    flaky.count += 1
    if flaky.count < 3:
        raise ValueError("transient")
    return x
flaky.count = 0

# Workaround sketch: build a tenacity decorator whose before_sleep hook
# forwards the real RetryCallState to the handler's on_retry method.
# This bypasses LangChain's callback manager, so run_id is not supplied.
def retry_with_callback(exception_types, max_attempts, callback_handler):
    return retry(
        retry=retry_if_exception_type(exception_types),
        stop=stop_after_attempt(max_attempts),
        before_sleep=callback_handler.on_retry,
        reraise=True,
    )

# Apply the custom retry decorator to the flaky function
retry_decorator = retry_with_callback((ValueError,), 5, RetryLogger())
flaky_retry = retry_decorator(flaky)

# Test the flaky function with retry
flaky_retry(1)

Verification

To verify that the fix worked, run the test code and check that the on_retry method is called when a retry occurs. The output should include the "on_retry fired" message for each retry attempt.

Extra Tips

  • Keep on_retry lightweight and make sure it does not raise; an exception thrown inside the hook is likely to interrupt the retry loop rather than be retried.
  • If you need more than the attempt number, retry_state.outcome holds the failed attempt's exception, which can be logged or counted per error type.
