langchain - ✅(Solved) Fix `RunnableRetry` does not fire `on_retry` callback [3 pull requests, 5 comments, 6 participants]

langchain2026-03-30 20:46:14


GitHub stats

langchain-ai/langchain#36382•Fetched 2026-04-08 01:52:46


Author

thoman1

Participants

bennytaccardi

fairchildadrian9-create

Timeline (top)

commented ×5, cross-referenced ×3, labeled ×3, issue_type_added ×1

I am using RunnableLambda's with_retry method to deal with some transient errors, but I also want to get each error's frequency. I defined a callback handler that would log on_retry, but with_retry doesn't fire on_retry (see code snippet). I have a working fix for this on my branch which wires up RunnableLambda to the hooks in BaseCallbackHandler, but wanted to check before making a PR.
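The goal stated above, counting how often each error occurs, could be served by a handler along these lines once `on_retry` actually fires. This is a minimal sketch, not code from the issue: `RetryFrequencyCounter` and `fake_state` are hypothetical names, and the class is duck-typed against the two `RetryCallState` members it reads (`attempt_number` and `outcome.exception()`) so that it runs without LangChain or tenacity installed. In real use it would presumably subclass `BaseCallbackHandler`.

```python
from collections import Counter
from types import SimpleNamespace


class RetryFrequencyCounter:
    """Tallies exception types seen in on_retry calls (hypothetical helper)."""

    def __init__(self):
        self.by_exception = Counter()

    def on_retry(self, retry_state, **kwargs):
        # tenacity keeps the attempt's result in retry_state.outcome (a Future);
        # .exception() returns the raised exception, or None on success.
        outcome = getattr(retry_state, "outcome", None)
        exc = outcome.exception() if outcome is not None else None
        if exc is not None:
            self.by_exception[type(exc).__name__] += 1


def fake_state(attempt_number, exc):
    # Stand-in for tenacity's RetryCallState, used only to exercise the counter.
    return SimpleNamespace(
        attempt_number=attempt_number,
        outcome=SimpleNamespace(exception=lambda: exc),
    )


counter = RetryFrequencyCounter()
counter.on_retry(fake_state(1, ValueError("transient")))
counter.on_retry(fake_state(2, TimeoutError("slow")))
counter.on_retry(fake_state(3, ValueError("transient")))
print(dict(counter.by_exception))  # {'ValueError': 2, 'TimeoutError': 1}
```

Keying the tally on the exception class name keeps the counter independent of error messages, which often differ between otherwise identical transient failures.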

Error Message

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.runnables import RunnableLambda
from tenacity import RetryCallState

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

count = 0
def flaky(x):
    global count
    count += 1
    if count < 3:
        raise ValueError("transient")
    return x

runnable = RunnableLambda(flaky).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=5,
)

# on_retry never prints
runnable.invoke(1, config={"callbacks": [RetryLogger()]})

Checked other resources

This is a bug, not a usage question.
I added a clear and descriptive title that summarizes this issue.
I used the GitHub search to find a similar question and didn't find it.
I am sure that this is a bug in LangChain rather than my code.
The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).
This is not related to the langchain-community package.
I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Other Dependencies

aiohttp: 3.13.3
anthropic: 0.76.0
dataclasses-json: 0.6.7
filetype: 1.2.0
google-genai: 1.59.0
httpx: 0.28.1
httpx-sse: 0.4.3
jsonpatch: 1.33
langgraph: 1.0.6
numpy: 2.4.1
openai: 2.15.0
opentelemetry-api: 1.39.1
opentelemetry-exporter-otlp-proto-http: 1.39.1
opentelemetry-sdk: 1.39.1
orjson: 3.11.5
packaging: 25.0
pydantic: 2.12.5
pydantic-settings: 2.12.0
pyyaml: 6.0.3
PyYAML: 6.0.3
requests: 2.32.5
requests-toolbelt: 1.0.0
SQLAlchemy: 2.0.45
sqlalchemy: 2.0.45
tenacity: 9.1.2
tiktoken: 0.12.0
typing-extensions: 4.15.0
uuid-utils: 0.13.0
zstandard: 0.25.0

PR fix notes

PR #36383: fix(core): fire on_retry callback in RunnableRetry

Repository: langchain-ai/langchain
Author: bennytaccardi
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/36383

Description (problem / solution / changelog)

Fixes #36382

This updates RunnableRetry to fire the on_retry callback for retry attempts in sync, async, and batch execution paths. It also adds unit tests covering those callback flows so retry handlers receive the expected attempt metadata.

AI assistance disclaimer: AI tooling was used to help draft and validate this change, and the final code and PR content were reviewed by the author.

Added unit tests covering sync invoke, async ainvoke, and batch retry callback behavior.
Attempted to run targeted pytest coverage for the new retry callback tests locally, but the current environment is missing the blockbuster test dependency, so I could not complete the test run in this checkout.

Important Notes

This is not a breaking change; it restores expected callback behavior for retries.
The change is scoped to core, which matches the modified files.
There is a local modification to libs/core/uv.lock in the working tree that is not part of the branch diff against origin/master; do not include it in the PR on purpose.

Verification Notes

If you want the verification section to claim passing checks instead of the blocked local run, first install the core test dependencies and rerun the targeted tests from libs/core.

Changed files

libs/core/langchain_core/runnables/retry.py (modified, +18/-8)
libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +88/-0)

PR #36424: fix(core): fire on_retry callback in RunnableRetry for all execution paths

Repository: langchain-ai/langchain
Author: Saad-Azi
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/36424

Description (problem / solution / changelog)

Why

RunnableRetry wraps tenacity's retry loop but never calls run_manager.on_retry() after a failed attempt. This means any BaseCallbackHandler that implements on_retry receives no signal, making it impossible to observe or log individual retry attempts via the standard callback interface.

Fixes #36382

What changed

libs/core/langchain_core/runnables/retry.py: after each failed tenacity attempt (detected via attempt.retry_state.outcome.failed), the appropriate on_retry method is called on the run manager in all four execution paths:

Method	Change
_invoke	calls run_manager.on_retry(retry_state)
_ainvoke	calls await run_manager.on_retry(retry_state)
_batch	calls rm.on_retry(retry_state) for each run manager
_abatch	calls await rm.on_retry(retry_state) for each run manager
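The difference between the sync and async rows can be illustrated with a dependency-free model of the async path. This is a sketch under stated assumptions, not the PR's code: `retry_async`, `flaky`, and `record` are hypothetical names, and the loop only mimics the behaviour that the callback is awaited after a failed attempt when another attempt will follow.

```python
import asyncio


async def retry_async(fn, on_retry, max_attempts=5):
    """Toy async retry loop; a model of the idea, not LangChain's implementation."""
    for attempt_number in range(1, max_attempts + 1):
        try:
            return await fn()
        except ValueError:
            if attempt_number == max_attempts:
                raise  # attempts exhausted: no retry follows, so no callback
            # The async path awaits the callback instead of calling it directly.
            await on_retry(attempt_number)


state = {"calls": 0}
seen = []


async def flaky():
    # Fails on the first two calls, then succeeds.
    state["calls"] += 1
    if state["calls"] < 3:
        raise ValueError("transient")
    return "ok"


async def record(attempt_number):
    seen.append(attempt_number)


result = asyncio.run(retry_async(flaky, record))
print(result, seen)  # ok [1, 2]
```

Two failures followed by a success produce two callback invocations, which is the same scenario the PR's tests describe.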

No public API changes. No new parameters. Fully backward compatible.

Tests

Four new unit tests added to test_runnable.py:

test_retry_fires_on_retry_callback: sync invoke, 2 failures then success, verifies 2 on_retry calls with correct attempt numbers
test_async_retry_fires_on_retry_callback: async invoke, same scenario
test_retry_batch_fires_on_retry_callback: sync batch with one failing element, verifies on_retry fires
test_async_retry_batch_fires_on_retry_callback: async batch, same scenario

All 8 relevant tests (4 new + 4 existing retry tests) pass.

Areas requiring careful review

The outcome.failed check runs outside the with attempt: block. After a successful attempt, tenacity sets outcome to a non-failed result before we check — so the elif branch will not fire on success. This is correct, but worth confirming against tenacity's internal state machine.

AI disclosure

This PR was developed with AI assistance. I have reviewed every line of the fix and the tests, understand the root cause, and verified all tests pass locally.

Changed files

libs/core/langchain_core/runnables/retry.py (modified, +10/-0)
libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +138/-0)

PR #36556: fix(core): fire `on_retry` callback in `RunnableRetry`

Repository: langchain-ai/langchain
Author: YizukiAme
State: closed | merged: False
Link: https://github.com/langchain-ai/langchain/pull/36556

Description (problem / solution / changelog)

Description

RunnableRetry does not fire on_retry callbacks, making it impossible to observe retry attempts through LangChain's callback system.

This PR wires tenacity's before_sleep hook to run_manager.on_retry() across all four execution paths: _invoke, _ainvoke, _batch, and _abatch.
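The before_sleep approach can be pictured with a small standalone model. The sketch below does not use tenacity or LangChain: `FakeRunManager`, `run_with_retry`, and `_before_sleep` are hypothetical stand-ins, and the loop is assumed to behave like a retry loop whose hook runs after a failed attempt only when another attempt is still allowed.

```python
class FakeRunManager:
    """Hypothetical stand-in for a run manager; records on_retry calls."""

    def __init__(self):
        self.attempts = []

    def on_retry(self, attempt_number):
        self.attempts.append(attempt_number)


def run_with_retry(fn, before_sleep, max_attempts=5):
    """Toy retry loop: before_sleep fires only when another attempt will follow."""
    for attempt_number in range(1, max_attempts + 1):
        try:
            return fn()
        except ValueError:
            if attempt_number == max_attempts:
                raise  # attempts exhausted: nothing to sleep before
            before_sleep(attempt_number)
            # A real loop would wait here before the next attempt.


state = {"calls": 0}


def flaky():
    # Fails on the first two calls, then succeeds.
    state["calls"] += 1
    if state["calls"] < 3:
        raise ValueError("transient")
    return "ok"


run_manager = FakeRunManager()


def _before_sleep(attempt_number):
    # The closure captures run_manager, mirroring the shape the PR describes.
    run_manager.on_retry(attempt_number)


result = run_with_retry(flaky, _before_sleep)
print(result, run_manager.attempts)  # ok [1, 2]
```

Hooking the callback to the point just before the wait means a final, exhausted attempt does not report a retry that never happens.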

Fixes #36382

Changes

Production (`langchain_core/runnables/retry.py`)

_invoke / _ainvoke: Added a _before_sleep closure that calls run_manager.on_retry(retry_state) before each retry sleep. Passed to self._sync_retrying() / self._async_retrying() via the before_sleep kwarg.
_batch / _abatch: Same pattern, but the closure iterates only over retry_run_managers — the subset of run managers whose inputs failed in the current attempt. Successful inputs never receive on_retry.
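The "failed inputs only" rule for batches can be sketched without either library. Everything here is illustrative: `batch_with_retry`, `double_flaky`, and the per-input hook list are hypothetical names, and the loop is a simplified model of a batch retry that re-runs only the inputs that failed in the previous attempt.

```python
def batch_with_retry(fn, inputs, on_retry_hooks, max_attempts=3):
    """Toy batch retry: only inputs that failed in an attempt get a callback."""
    results = {}
    pending = list(range(len(inputs)))
    for attempt_number in range(1, max_attempts + 1):
        failed = []
        for i in pending:
            try:
                results[i] = fn(inputs[i])
            except ValueError:
                failed.append(i)
        if not failed:
            break
        if attempt_number == max_attempts:
            raise RuntimeError(f"inputs at positions {failed} kept failing")
        for i in failed:
            on_retry_hooks[i](attempt_number)  # successful inputs are skipped
        pending = failed  # retry only what failed
    return [results[i] for i in range(len(inputs))]


failures_left = {2: 1}  # the input value 2 fails once, then succeeds


def double_flaky(x):
    if failures_left.get(x, 0) > 0:
        failures_left[x] -= 1
        raise ValueError("transient")
    return x * 2


fired = {i: [] for i in range(3)}
hooks = [fired[i].append for i in range(3)]

outputs = batch_with_retry(double_flaky, [1, 2, 3], hooks)
print(outputs, fired)  # [2, 4, 6] {0: [], 1: [1], 2: []}
```

Only the hook for the middle input records an attempt, which matches the behaviour the batch tests in this PR are described as checking.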

Tests (`tests/unit_tests/runnables/test_runnable.py`)

Added 4 regression tests with a RetryCallbackHandler that records attempt_number from each on_retry call:

Test	Path	Validates
`test_retry_invoke_fires_on_retry_callback`	sync invoke	`on_retry` fires on attempts 1, 2
`test_retry_ainvoke_fires_on_retry_callback`	async invoke	same, async path
`test_retry_batch_fires_on_retry_callback_for_failed_inputs_only`	sync batch	only failed input's handler fires
`test_retry_abatch_fires_on_retry_callback_for_failed_inputs_only`	async batch	same, async path

Notes

No new dependencies introduced (RetryCallState was already imported).
The batch path shares a single retry_state across all failed inputs in a given attempt (inherent to tenacity's single-exception-raise model). This is consistent with the existing batch retry architecture and can be refined in a future PR if per-input retry state is needed.

Changed files

libs/core/langchain_core/runnables/retry.py (modified, +49/-5)
libs/core/tests/unit_tests/runnables/test_runnable.py (modified, +168/-0)


Package (Required)

Related Issues / PRs

No response

Reproduction Steps / Example Code (Python)

from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.runnables import RunnableLambda
from tenacity import RetryCallState

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

count = 0
def flaky(x):
    global count
    count += 1
    if count < 3:
        raise ValueError("transient")
    return x

runnable = RunnableLambda(flaky).with_retry(
    retry_if_exception_type=(ValueError,),
    stop_after_attempt=5,
)

# on_retry never prints
runnable.invoke(1, config={"callbacks": [RetryLogger()]})

Error Message and Stack Trace (if applicable)

Description

System Info

System Information

OS: Darwin
OS Version: Darwin Kernel Version 25.3.0: Wed Jan 28 20:54:05 PST 2026; root:xnu-12377.81.4~5/RELEASE_ARM64_T8142
Python Version: 3.12.12 (main, Jan 13 2026, 17:48:58) [Clang 21.1.4 ]

Package Information

langchain_core: 1.2.7
langchain: 1.2.4
langchain_community: 0.4.1
langsmith: 0.6.4
langchain_anthropic: 1.3.1
langchain_classic: 1.0.1
langchain_google_genai: 4.2.0
langchain_openai: 1.1.7
langchain_text_splitters: 1.1.0
langgraph_sdk: 0.3.3

Optional packages not installed

langserve


extent analysis

Fix Plan

The fix belongs in RunnableRetry (the runnable that with_retry returns), not in RunnableLambda: its tenacity retry loop has to call on_retry on the run manager when an attempt fails and another attempt will follow. None of the three linked PRs is marked as merged, so the code below is a workaround that drives the callback from outside LangChain rather than a patch to the library.

Here are the steps to achieve this:

Wrap the flaky function in a tenacity retry decorator instead of calling with_retry.
Pass the handler's on_retry method as tenacity's before_sleep hook, so it receives the real RetryCallState before each retry.

Code Changes

from langchain_core.callbacks import BaseCallbackHandler
from tenacity import RetryCallState, retry, retry_if_exception_type, stop_after_attempt

class RetryLogger(BaseCallbackHandler):
    def on_retry(self, retry_state: RetryCallState, **kwargs):
        print(f"on_retry fired: attempt {retry_state.attempt_number}")

def flaky(x):
    # Fail on the first two calls, then succeed
    flaky.count += 1
    if flaky.count < 3:
        raise ValueError("transient")
    return x
flaky.count = 0

# Build a tenacity decorator whose before_sleep hook forwards tenacity's own
# RetryCallState to the handler, instead of constructing one by hand
def retry_with_callback(exception_types, max_attempts, callback_handler):
    return retry(
        retry=retry_if_exception_type(exception_types),
        stop=stop_after_attempt(max_attempts),
        before_sleep=callback_handler.on_retry,
        reraise=True,
    )

# Apply the custom retry decorator to the flaky function
retry_decorator = retry_with_callback((ValueError,), 5, RetryLogger())
flaky_retry = retry_decorator(flaky)

# Test the flaky function with retry
flaky_retry(1)

Verification

To verify that the fix worked, run the test code and check that the on_retry method is called when a retry occurs. The output should include the "on_retry fired" message for each retry attempt.

Extra Tips

Keep on_retry lightweight and make sure it does not raise, since it runs inside the retry loop between attempts.
If you need more than the attempt number, read the failed attempt's exception from the retry state's outcome and log or count it there.
