pytorch: CTCLoss backward is inconsistent with its forward derivative with respect to the documented log_probs input [1 participant]

pytorch · 2026-04-27 01:00:47

GitHub stats

pytorch/pytorch#181534•Fetched 2026-04-27 05:28:46

Author

beanduan22

Code Example

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(123)

T, N, C = 3, 1, 3  # time, batch, classes incl. blank
log_probs = F.log_softmax(torch.randn(T, N, C), dim=-1).double()
log_probs.requires_grad_(True)

targets = torch.tensor([[1]])
input_lens = torch.tensor([T])
target_lens = torch.tensor([1])

loss_fn = nn.CTCLoss(blank=0)

def f(x):
    return loss_fn(x, targets, input_lens, target_lens)

print(torch.autograd.gradcheck(f, (log_probs,), raise_exception=True))

🐛 Describe the bug

Environment:

PyTorch: 2.11.0+cu128
Python: 3.13.5

The minimal example above reproduces the issue: gradcheck fails with a Jacobian mismatch for the input log_probs.

The numerical Jacobian contains exact zeros for class 2 at every time step:

numerical: tensor([[-0.4727], [-0.5273], [ 0.0000], [-0.5624], [-0.4376], [ 0.0000], [-0.4784], [-0.5216], [ 0.0000]], dtype=torch.float64)

However, the analytical gradient returned by CTCLoss backward gives non-zero values at the same positions:

analytical: tensor([[-0.1430], [-0.1116], [ 0.2547], [-0.2236], [-0.3075], [ 0.5311], [-0.3088], [-0.3109], [ 0.6198]], dtype=torch.float64)

Since class 2 is neither the blank class nor the target label in this example, the true derivative with respect to log_probs[..., 2] should be zero. The numerical Jacobian confirms this, but the analytical backward returns non-zero gradients.
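One way to read the two reported Jacobians is to subtract them per time step. The sketch below does this in plain Python with the values copied from the report; the helper name `row_differences` is introduced here for illustration only. For these numbers, every difference is positive and each row of differences sums to approximately 1, which is what a softmax distribution over the three classes would look like. That is consistent with, though it does not prove, a backward that already accounts for a preceding log_softmax.

```python
# Values copied from the gradcheck report (T=3 time steps, C=3 classes).
numerical = [
    [-0.4727, -0.5273, 0.0000],
    [-0.5624, -0.4376, 0.0000],
    [-0.4784, -0.5216, 0.0000],
]
analytical = [
    [-0.1430, -0.1116, 0.2547],
    [-0.2236, -0.3075, 0.5311],
    [-0.3088, -0.3109, 0.6198],
]


def row_differences(analytical_rows, numerical_rows):
    """Return analytical - numerical, element by element, for each time step."""
    return [
        [a - n for a, n in zip(a_row, n_row)]
        for a_row, n_row in zip(analytical_rows, numerical_rows)
    ]


diff = row_differences(analytical, numerical)
for t, row in enumerate(diff):
    # Print the per-class gap and its sum over classes for time step t.
    print(t, [round(v, 4) for v in row], "sum =", round(sum(row), 4))
```

The numerical rows themselves each sum to -1 and the analytical rows each sum to roughly 0, so the two results differ by a per-step term that adds up to 1 across classes.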

This suggests that CTCLoss.backward() is inconsistent with the derivative of CTCLoss.forward() with respect to the documented log_probs input.
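If the backward is written on the assumption that log_probs always comes out of log_softmax, then the composed function, taken from unnormalized scores, is the more telling thing to check. The following is a minimal sketch under that assumption; `logits` and `loss_from_logits` are names introduced here, and the outcome has not been confirmed on the reporter's PyTorch version.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(123)

T, N, C = 3, 1, 3  # time, batch, classes incl. blank
# Unnormalized scores: log_softmax is applied inside the checked function,
# so gradcheck perturbs the logits rather than the log-probabilities.
logits = torch.randn(T, N, C, dtype=torch.double, requires_grad=True)

targets = torch.tensor([[1]])
input_lens = torch.tensor([T])
target_lens = torch.tensor([1])

loss_fn = nn.CTCLoss(blank=0)


def loss_from_logits(z):
    # Hypothetical wrapper: log_softmax followed by the built-in CTC loss.
    return loss_fn(F.log_softmax(z, dim=-1), targets, input_lens, target_lens)


passed = torch.autograd.gradcheck(loss_from_logits, (logits,), raise_exception=False)
print("gradcheck on log_softmax + CTCLoss:", passed)
```

Comparing this result with the failing check on log_probs alone helps separate "the gradient is wrong for training pipelines that use log_softmax" from "the gradient does not match the documented input".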

Versions

PyTorch: 2.11.0+cu128
Python: 3.13.5

cc @ezyang @albanD @gqchen @nikitaved @soulitzer @Varal7 @bobrenjc93 @mruberry @jbschlosser @walterddr @mikaylagawarecki

extent analysis

TL;DR

No fix is confirmed yet. The first step is to check the CTCLoss backward against the derivative of CTCLoss.forward() with respect to the log_probs input, since gradcheck reports that the two disagree.

Guidance

Review the CTCLoss implementation, focusing on the backward method, to identify potential inconsistencies with the forward method's derivative.
Verify that the log_probs input is correctly handled in the backward method, particularly for classes that are neither the blank class nor the target label.
Check the PyTorch documentation and issue tracker for any known issues or updates related to CTCLoss and its gradient calculation.
Consider testing the CTCLoss implementation with different inputs and scenarios to further isolate the issue.

Notes

The example shows a mismatch between the analytical and numerical Jacobians of CTCLoss, but the report does not trace the cause in the implementation, so no definitive fix can be given here. The behavior may be specific to this PyTorch version or to this particular use case.

Recommendation

Apply workaround: Until the CTCLoss implementation is verified or updated, consider using a custom or alternative loss function that correctly handles the log_probs input and its gradient calculation.
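As one possible shape for such a workaround, the sketch below computes the CTC negative log-likelihood for a single sequence with ordinary tensor operations, so autograd derives the gradient with respect to log_probs directly. It is an illustration rather than a drop-in replacement: `ctc_nll_reference` is a hypothetical helper, it handles one unbatched sequence with a non-empty target, and it works in probability space, which is only reasonable for short sequences where underflow is not a concern.

```python
import torch
import torch.nn.functional as F


def ctc_nll_reference(log_probs, target, blank=0):
    """CTC negative log-likelihood for one sequence (hypothetical helper).

    log_probs: (T, C) tensor of per-step log-probabilities.
    target: non-empty list of non-blank label indices.
    """
    # Extended label sequence: blank, l1, blank, l2, ..., blank.
    ext = [blank]
    for label in target:
        ext += [label, blank]
    S = len(ext)

    probs = log_probs.exp()
    # alpha[s]: probability of all partial alignments that end in ext[s].
    alpha = [probs.new_zeros(())] * S
    alpha[0] = probs[0, ext[0]]
    alpha[1] = probs[0, ext[1]]

    for t in range(1, probs.shape[0]):
        new_alpha = []
        for s in range(S):
            total = alpha[s]
            if s >= 1:
                total = total + alpha[s - 1]
            # A blank may be skipped only between two different labels.
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                total = total + alpha[s - 2]
            new_alpha.append(total * probs[t, ext[s]])
        alpha = new_alpha

    # Valid alignments end in the last label or the trailing blank.
    return -torch.log(alpha[S - 1] + alpha[S - 2])


torch.manual_seed(123)
T, C = 3, 3
log_probs = torch.log_softmax(torch.randn(T, C, dtype=torch.double), dim=-1)
log_probs.requires_grad_(True)

loss = ctc_nll_reference(log_probs, target=[1])
(grad,) = torch.autograd.grad(loss, log_probs)
print("gradient w.r.t. log_probs:\n", grad)

# Forward value of the built-in loss on the same input, for comparison.
builtin = F.ctc_loss(
    log_probs.detach().unsqueeze(1),
    torch.tensor([[1]]),
    torch.tensor([T]),
    torch.tensor([1]),
    blank=0,
)
print("reference loss:", loss.item(), "built-in loss:", builtin.item())

ok = torch.autograd.gradcheck(
    lambda x: ctc_nll_reference(x, [1]), (log_probs,), raise_exception=False
)
print("gradcheck on reference:", ok)
```

Because class 2 never enters the computation for this target, its column of the autograd gradient is zero, matching the numerical Jacobian in the report. A production version would need batching, variable lengths, and a log-space recursion.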
