pytorch: `insert_overlap_deps=True` in `schedule_overlap_bucketing` breaks inductor when `graphsafe_run_with_rng_state` is present

pytorch/pytorch#178875 (fetched 2026-04-08 01:57:21)

🐛 Describe the bug

Description:

When schedule_overlap_bucketing is called with insert_overlap_deps=True on a graph that contains graphsafe_run_with_rng_state nodes (from activation checkpointing with nondeterministic ops like SDPA), inductor fails with an AssertionError in GraphLowering.placeholder.

The issue: insert_overlap_deps wraps nodes with control_deps (a HigherOrderOperator), including graphsafe_run_with_rng_state nodes. Inductor's placeholder handler for torch.Generator inputs asserts that the Generator placeholder has exactly one user whose target is either graphsafe_run_with_rng_state or invoke_subgraph:

  # torch/_inductor/graph.py, in GraphLowering.placeholder
  elif isinstance(example, torch.Generator):
      assert len(V.graph.current_node.users) == 1 and next(
          iter(V.graph.current_node.users)
      ).target in (
          torch._prims.rng_prims.graphsafe_run_with_rng_state,
          torch.ops.higher_order.invoke_subgraph,
      )

After insert_overlap_deps, the Generator placeholder's user is a control_deps node (whose target is a ControlDeps object), not graphsafe_run_with_rng_state directly. This fails the assertion.
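
The failure mode can be reproduced without PyTorch in a toy model. The sketch below is only a stand-in: `Node`, the string targets, and `generator_placeholder_ok` are hypothetical names that mirror the shape of the inductor check quoted above, and real FX nodes and the real `control_deps` operator are more involved.

```python
from dataclasses import dataclass, field

# String stand-ins for the real operator objects (assumption for illustration).
GRAPHSAFE_RNG = "graphsafe_run_with_rng_state"
INVOKE_SUBGRAPH = "invoke_subgraph"
CONTROL_DEPS = "control_deps"


@dataclass(eq=False)
class Node:
    """Toy FX-like node: a target, its inputs, and the nodes that consume it."""
    target: str
    args: tuple = ()
    users: list = field(default_factory=list)

    def __post_init__(self):
        # Register this node as a user of each of its inputs.
        for arg in self.args:
            arg.users.append(self)


def generator_placeholder_ok(placeholder: Node) -> bool:
    """Same shape as the inductor assertion: one user with an allowed target."""
    return len(placeholder.users) == 1 and placeholder.users[0].target in (
        GRAPHSAFE_RNG,
        INVOKE_SUBGRAPH,
    )


def build_graph(wrapped: bool) -> Node:
    """Return a Generator placeholder consumed directly or through a wrapper."""
    generator = Node("placeholder")
    consumer_target = CONTROL_DEPS if wrapped else GRAPHSAFE_RNG
    Node(consumer_target, args=(generator,))
    return generator


ok_before_pass = generator_placeholder_ok(build_graph(wrapped=False))
ok_after_pass = generator_placeholder_ok(build_graph(wrapped=True))
print(ok_before_pass, ok_after_pass)
```

In this model the check passes while the Generator feeds the RNG op directly and fails once the only user is the wrapper, which matches the AssertionError described in the report.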

Repro:

Any model that combines:

  1. checkpoint_wrapper (or torch.utils.checkpoint.checkpoint) around a module containing a nondeterministic op (e.g., scaled_dot_product_attention)
  2. torch.compile
  3. schedule_overlap_bucketing(..., insert_overlap_deps=True) set as a post_grad_custom_post_pass

The nondeterministic op inside the checkpoint triggers functionalize_rng_ops in the partitioner, which introduces graphsafe_run_with_rng_state with Generator placeholders. Then insert_overlap_deps wraps these with control_deps, breaking the inductor assertion.

Possible fixes:

  • schedule_overlap_bucketing should skip graphsafe_run_with_rng_state nodes when inserting overlap deps (they are RNG synchronization points, not compute/comm nodes that need reordering protection)
  • Or inductor's assertion should recognize ControlDeps as a transparent wrapper and look through it to find the actual target
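
The second option can be sketched as a look-through helper. This is a toy model, not inductor code: `ToyNode`, its `wrapped` field, and `effective_target` are hypothetical, and how a real `control_deps` node exposes the operation it wraps is an assumption that would need checking against the PyTorch source.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

ALLOWED_TARGETS = frozenset({"graphsafe_run_with_rng_state", "invoke_subgraph"})


@dataclass(frozen=True)
class ToyNode:
    """Toy node; `wrapped` is a hypothetical link to the op a wrapper encloses."""
    target: str
    wrapped: Optional["ToyNode"] = None


def effective_target(node: ToyNode) -> str:
    """Follow control_deps wrappers until a non-wrapper target is reached."""
    while node.target == "control_deps" and node.wrapped is not None:
        node = node.wrapped
    return node.target


def generator_user_ok(users: Sequence[ToyNode]) -> bool:
    """Relaxed check: one user whose effective target is allowed."""
    return len(users) == 1 and effective_target(users[0]) in ALLOWED_TARGETS


rng_node = ToyNode("graphsafe_run_with_rng_state")
wrapped_rng = ToyNode("control_deps", wrapped=rng_node)
wrapped_matmul = ToyNode("control_deps", wrapped=ToyNode("mm"))

print(generator_user_ok([wrapped_rng]), generator_user_ok([wrapped_matmul]))
```

The relaxed check still rejects a wrapper around an unrelated op, so it keeps the original guarantee while tolerating the wrapper.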

Versions

Torch 2.12.0.dev20260323+cu128

cc @soulitzer @pbelevich @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo @ydwu4 @bdhirsh @bobrenjc93 @aorenste

extent analysis

TL;DR

Modify schedule_overlap_bucketing to skip wrapping graphsafe_run_with_rng_state nodes with control_deps when insert_overlap_deps=True.

Guidance

  • Identify the graphsafe_run_with_rng_state nodes in the graph that are causing the assertion failure.
  • Modify the schedule_overlap_bucketing function to exclude these nodes when inserting overlap dependencies.
  • Alternatively, update the inductor's assertion to recognize ControlDeps as a transparent wrapper and look through it to find the actual target.
  • Verify the fix by running the model with the modified schedule_overlap_bucketing function and checking that the assertion failure is resolved.
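
The verification step can be rehearsed on a toy model before touching a real graph. The sketch below assumes a pass that replaces each wrapped op's target with a `control_deps` wrapper; `wrap_with_control_deps` and the string targets are hypothetical stand-ins, not the behavior of the real `schedule_overlap_bucketing`.

```python
RNG_TARGET = "graphsafe_run_with_rng_state"


def wrap_with_control_deps(targets, skip=frozenset()):
    """Toy pass: every target becomes 'control_deps' unless it is in `skip`."""
    return [t if t in skip else "control_deps" for t in targets]


graph_targets = ["all_gather", RNG_TARGET, "mm"]

# Unpatched behavior: the RNG op is hidden behind a wrapper.
unpatched = wrap_with_control_deps(graph_targets)

# Patched behavior: the RNG op is left visible to the Generator placeholder.
patched = wrap_with_control_deps(graph_targets, skip={RNG_TARGET})

print(RNG_TARGET in unpatched, RNG_TARGET in patched)
```

On a real graph, the equivalent check is that every Generator placeholder still has the RNG op as its direct user after the pass runs.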

Example

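# Sketch of the proposed skip (illustrative; not the upstream implementation).
# fx.Node has no `kind` attribute, so the check compares `node.target` instead.
from torch.fx import Graph
from torch._prims.rng_prims import graphsafe_run_with_rng_state

def nodes_to_wrap(graph: Graph):
    """Hypothetical helper: yield nodes that may be wrapped with control_deps
    when schedule_overlap_bucketing runs with insert_overlap_deps=True."""
    for node in graph.nodes:
        if node.op == "call_function" and node.target is graphsafe_run_with_rng_state:
            continue  # RNG synchronization point: leave it unwrapped
        yield node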

Notes

The fix assumes that graphsafe_run_with_rng_state nodes are RNG synchronization points rather than compute or communication nodes that need reordering protection. If that assumption does not hold, the alternative is to update inductor's assertion so that it treats ControlDeps as a transparent wrapper.

Recommendation

Apply workaround: modify schedule_overlap_bucketing to skip wrapping graphsafe_run_with_rng_state nodes with control_deps when insert_overlap_deps=True. This is more targeted and less invasive than changing inductor's placeholder assertion.
