pytorch - 💡(How to fix) Fix [inductor] Support pre/post kernel execution hooks in compiled graphs [1 participants]

GitHub stats
pytorch/pytorch#180974 · Fetched 2026-04-22 07:43:13
Comments: 0 · Participants: 1 · Timeline: 150 · Reactions: 0
Timeline (top): subscribed ×73, mentioned ×72, labeled ×5

🚀 The feature, motivation and pitch

I'm building debugging and profiling tools for Intel GPU workloads and need a way to run user callbacks before and after each node execution inside Inductor-compiled graphs, with access to the node name and its input/output tensors.

The existing generate_intermediate_hooks captures tensor values after each node for debugging, which is a different use case. There is currently no mechanism to instrument kernel calls for profiling/tracing purposes (e.g., recording kernel name, input/output tensors, and timing).

Alternatives

Extend generate_intermediate_hooks. This captures buffer values after each node for debugging, not around kernel calls. Repurposing it would conflate two different use cases and break existing users.

Additional context

I have a working implementation with tests ready to go as a PR once this is marked actionable. Available on branch: https://github.com/yohaigevim/pytorch/tree/node-hooks

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @kadeng @muchulee8 @amjames @aakhundov @coconutruben @jataylo

extent analysis

TL;DR

Implement a new mechanism to instrument kernel calls for profiling/tracing purposes, allowing user callbacks before and after each node execution with access to node name and input/output tensors.

Guidance

  • Review the existing generate_intermediate_hooks function to understand its limitations and why it cannot be repurposed for profiling/tracing use cases.
  • Consider implementing a new hook mechanism specifically designed for profiling/tracing, allowing for user callbacks around kernel calls.
  • Evaluate the working implementation on the node-hooks branch, ensuring it meets the requirements for instrumenting kernel calls and providing access to node name and input/output tensors.
  • Discuss with the maintainers and contributors to determine the best approach for integrating the new hook mechanism into the existing codebase.

Notes

The implementation details of the new hook mechanism are not specified, and the working implementation on the node-hooks branch should be reviewed to ensure it meets the requirements.

Recommendation

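Implement a new, dedicated hook mechanism rather than repurposing generate_intermediate_hooks: the existing function captures buffer values after each node for debugging, so extending it to wrap kernel calls would conflate two different use cases and could break existing users.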
