pytorch - ✅(Solved) Fix StepLR accepts step_size=0 and crashes with ZeroDivisionError on first step() [1 pull requests, 1 participants]

GitHub stats
pytorch/pytorch#177833 (fetched 2026-04-08 01:01:31)
Comments: 0 · Participants: 1 · Timeline events: 26 · Reactions: 0
Timeline (top): referenced ×7, labeled ×6, mentioned ×6, subscribed ×6

Error Message

import torch
opt = torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=0)
opt.step()
sched.step()  # ZeroDivisionError: integer modulo by zero

Root Cause

StepLR(optimizer, step_size=0) is accepted at construction time but crashes with ZeroDivisionError when step() is called, because get_lr() does modulo by step_size.
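
Why construction succeeds while the first step() fails can be illustrated with a small pure-Python stand-in. This is a sketch, not the library source: the name step_lr_update is hypothetical, and it assumes the update rule checks last_epoch == 0 before evaluating last_epoch % step_size, which would explain why the modulo is first reached on the first step() call.

```python
def step_lr_update(lr, last_epoch, step_size, gamma=0.1):
    """Hypothetical stand-in for a StepLR-style update (not PyTorch code)."""
    # Assumption: when last_epoch is 0 the `or` short-circuits, so the
    # modulo is never evaluated, even if step_size is 0.
    if last_epoch == 0 or last_epoch % step_size != 0:
        return lr
    return lr * gamma


# Construction-time call (last_epoch == 0): succeeds even with step_size=0.
lr_at_init = step_lr_update(0.1, last_epoch=0, step_size=0)

# First step(): last_epoch becomes 1, so 1 % 0 is evaluated and raises.
try:
    step_lr_update(lr_at_init, last_epoch=1, step_size=0)
    first_step_error = None
except ZeroDivisionError as exc:
    first_step_error = exc
```

Under that assumption, the error surfaces only once last_epoch is non-zero, which matches the reported behavior of failing on the first step() rather than in the constructor.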

Fix Action

Fixed

PR fix notes

PR #178295: Fix division-by-zero in lr_scheduler parameter validation (#177833)

Description (problem / solution / changelog)

Summary: Several lr_scheduler classes silently accept zero values for parameters, then raise a ZeroDivisionError at runtime. This PR adds validation so users get a clear error message:

  • StepLR: reject step_size == 0
  • CosineAnnealingLR: reject T_max == 0
  • CyclicLR: reject step_size_up == 0
  • ConstantLR: tighten existing factor validation from < 0 to <= 0, since factor=0 also causes division by zero in get_lr()
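
The checks listed above might look roughly like the following. This is a minimal sketch, not the PR's code: check_scheduler_args and rejects are hypothetical helpers, the error messages are invented, and the upper bound of 1 on factor is an assumption about the existing ConstantLR check.

```python
def check_scheduler_args(step_size=None, T_max=None, step_size_up=None, factor=None):
    """Hypothetical helper mirroring the listed checks (not the PR's code)."""
    # Each of these values is used as a divisor or modulus, so zero is rejected.
    for name, value in (("step_size", step_size),
                        ("T_max", T_max),
                        ("step_size_up", step_size_up)):
        if value is not None and value == 0:
            raise ValueError(f"{name} must be non-zero, got {value}")
    # factor: lower bound tightened from `< 0` to `<= 0`.
    # Assumption: the existing check also rejects factor > 1.
    if factor is not None and (factor <= 0 or factor > 1.0):
        raise ValueError(f"factor must be in (0, 1], got {factor}")


def rejects(**kwargs):
    """Return True if the given arguments fail validation."""
    try:
        check_scheduler_args(**kwargs)
    except ValueError:
        return True
    return False
```

Raising ValueError at construction time gives the caller a message that names the offending parameter, instead of a ZeroDivisionError from inside get_lr().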

Known issues not addressed

  • The learning rate itself can still go negative, which doesn't produce meaningful behavior. Restricting this can be done as a follow-up.
  • Two additional ZeroDivisionError cases were identified that occur only in the deprecated usage: in LinearLR and PolynomialLR, total_iters=0 causes a ZeroDivisionError when the closed form is used via .step(epoch=..)
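
The total_iters=0 case can be reproduced on paper with the documented closed-form formulas. The functions below are hypothetical stand-ins that assume both closed forms divide by total_iters; they are not the library implementation.

```python
def linear_closed_form(base_lr, last_epoch, total_iters,
                       start_factor=1.0 / 3, end_factor=1.0):
    """Stand-in for a LinearLR-style closed form (assumed shape)."""
    progress = min(total_iters, last_epoch) / total_iters  # divides by total_iters
    return base_lr * (start_factor + (end_factor - start_factor) * progress)


def polynomial_closed_form(base_lr, last_epoch, total_iters, power=1.0):
    """Stand-in for a PolynomialLR-style closed form (assumed shape)."""
    progress = min(total_iters, last_epoch) / total_iters  # divides by total_iters
    return base_lr * (1.0 - progress) ** power


def raises_zero_division(fn, *args, **kwargs):
    """Return True if calling fn raises ZeroDivisionError."""
    try:
        fn(*args, **kwargs)
    except ZeroDivisionError:
        return True
    return False
```

With total_iters=0, both functions evaluate 0 / 0 and raise, which is the failure mode the PR leaves for a follow-up.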

Test plan

  • Added test_step_lr_step_size_zero, test_constantlr_factor_zero, test_cos_anneal_lr_T_max_zero, and test_cycle_lr_step_size_up_zero to verify each scheduler raises ValueError on zero input
  • Existing lr_scheduler tests pass: test/optim/test_lrscheduler.py

Addresses: #177833 and #174025

Changed files

  • test/optim/test_lrscheduler.py (modified, +40/-0)
  • torch/optim/lr_scheduler.py (modified, +21/-1)

Code Example

import torch
opt = torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=0)
opt.step()
sched.step()  # ZeroDivisionError: integer modulo by zero

Negative step_size values are also silently accepted.
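
A negative step_size does not crash, but Python's modulo and floor division treat it in ways that can disagree. The sketch below uses hypothetical stand-ins (chainable_lr, closed_form_lr) for a modulo-based update and for the closed form base_lr * gamma ** (last_epoch // step_size). The source does not confirm that StepLR behaves exactly this way with negative values; the example only shows what the two operators do.

```python
def chainable_lr(base_lr, epochs, step_size, gamma):
    """Apply a modulo-based decay once per epoch (hypothetical stand-in)."""
    lr = base_lr
    for last_epoch in range(1, epochs + 1):
        if last_epoch % step_size == 0:
            lr *= gamma
    return lr


def closed_form_lr(base_lr, last_epoch, step_size, gamma):
    """Closed form: base_lr * gamma ** (last_epoch // step_size)."""
    return base_lr * gamma ** (last_epoch // step_size)


# Python's % takes the sign of the divisor, so 3 % -3 == 0 and the
# modulo-based rule still decays at epoch 3.
chainable = chainable_lr(1.0, epochs=3, step_size=-3, gamma=0.5)  # 0.5

# Floor division gives 3 // -3 == -1, so the closed form multiplies by
# gamma ** -1 and the learning rate grows instead.
closed = closed_form_lr(1.0, last_epoch=3, step_size=-3, gamma=0.5)  # 2.0
```

The two forms give different learning rates for the same negative step_size, which is one reason to reject negative values outright rather than only zero.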

Related: #168223 #169107 #174025 (all stale)

cc @vincentqb @jbschlosser @albanD @janeyx99 @crcrpar @malfet

extent analysis

Fix Plan

To fix the issue, we need to add input validation to ensure that step_size is a positive integer.

Step-by-Step Solution

  • Modify the StepLR constructor to validate the step_size argument:
from torch.optim.lr_scheduler import LRScheduler

class StepLR(LRScheduler):
    def __init__(self, optimizer, step_size, gamma=0.1, last_epoch=-1):
        # Reject zero, negative, and non-integer values before any state is set.
        if not isinstance(step_size, int) or step_size <= 0:
            raise ValueError("step_size must be a positive integer")
        self.step_size = step_size
        # ... rest of the constructor remains the same
  • Alternatively, modify the step() method to reject step_size == 0, although this defers the error to the first step() call instead of failing at construction:
def step(self, epoch=None):
    if self.step_size == 0:
        raise ValueError("step_size cannot be 0")
    # ... rest of the step() method remains the same

Verification

To confirm that the fix does not break valid input, run the StepLR scheduler with a valid step_size value:

import torch

opt = torch.optim.SGD([torch.randn(1, requires_grad=True)], lr=0.1)
sched = torch.optim.lr_scheduler.StepLR(opt, step_size=1)
opt.step()
sched.step()  # should not raise an error

Extra Tips

  • Always validate user input to prevent unexpected errors.
  • Consider adding a check for negative step_size values and raise a ValueError if encountered.
