pytorch - ✅(Solved) Fix [RFC] Add Minimal torch.compile Backend to Validate PrivateUse1 Integration [1 pull request, 4 comments, 3 participants]

GitHub stats
pytorch/pytorch#181093 (fetched 2026-04-23 07:22:44)
Comments: 4
Participants: 3
Timeline events: 143
Reactions: 2
Timeline (top): mentioned ×65, subscribed ×65, labeled ×8, commented ×4

Fix Action

Fixed

PR fix notes

PR #181092: [OpenReg] Add Minimal torch.compile Backend to Validate PrivateUse1 Integration

Description (problem / solution / changelog)

Description:

Adds a minimal torch.compile backend to OpenReg that validates the PyTorch-side integration contract for PrivateUse1 devices.

The backend is a passthrough: it receives the FX graph from Dynamo, performs lightweight validation using stable FX APIs (Graph.lint, eliminate_dead_code, recompile, python_code), checks FakeTensor device metadata on placeholder nodes, and returns gm.forward for eager execution through the existing CPU fallback. No lowering or device-specific compilation is performed.

Changes:

  • torch_openreg/compiler.py: backend implementation, registered via register_backend
  • torch_openreg/__init__.py: imports the compiler module to trigger registration on package import
  • tests/test_compile.py: tests covering registration, graph capture, FakeTensor device propagation, guard caching, recompilation, autograd, nn.Module, dynamic shapes, graph breaks, autocast, and default device interaction

Fixes: #181093

Changed files

  • test/cpp_extensions/open_registration_extension/torch_openreg/tests/test_compile.py (added, +206/-0)
  • test/cpp_extensions/open_registration_extension/torch_openreg/torch_openreg/__init__.py (modified, +1/-0)
  • test/cpp_extensions/open_registration_extension/torch_openreg/torch_openreg/compiler.py (added, +30/-0)

Motivation

OpenReg is PyTorch's in-tree reference implementation for PrivateUse1 backend integration. It currently covers operator registration, autoload, AMP, and device APIs, but has no torch.compile integration. As a result, there is no reference for how a PrivateUse1 device integrates with PyTorch's compiler stack.

Design

The concepts at the torch.compile interface are shared across all backends: registration, FX graph capture, FakeTensor metadata, guards, and graph breaks. The actual compilation logic, however, differs completely from one backend to the next (TPU lowers to StableHLO/XLA, CUDA targets Triton, etc.).

Following OpenReg's minimality principle, the proposed backend is a passthrough: it receives the FX graph from Dynamo, performs lightweight validation using FX APIs, and returns it for eager execution through the existing CPU fallback. No lowering, optimization, or device-specific code generation is performed.

The backend validates the PyTorch-side integration contract, not backend-specific compilation logic. Each real vendor validates its own lowering in its own out-of-tree CI.

What the backend does

A function implementing the standard torch.compile backend contract:

  • Validates graph well-formedness via Graph.lint()
  • Checks FakeTensor device metadata on placeholder nodes to verify PrivateUse1 device propagation
  • Demonstrates the transform-and-recompile pattern via eliminate_dead_code() and recompile()
  • Logs generated source via python_code() for debug visibility
  • Returns gm.forward (passthrough execution)
  • Registered via in-code register_backend call, triggered on package import.
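The steps listed above can be sketched as a single backend function. This is a minimal illustration under stated assumptions, not the code from PR #181092: the factory make_passthrough_backend and its expected_device_type parameter are hypothetical names, the device type string is supplied by the caller, and reading the FakeTensor from node.meta["example_value"] assumes the metadata key Dynamo currently attaches to placeholder nodes.

```python
import logging
from typing import Callable, List

import torch
import torch.fx as fx

log = logging.getLogger(__name__)


def make_passthrough_backend(expected_device_type: str) -> Callable:
    """Build a passthrough backend (hypothetical helper for this sketch).

    expected_device_type is the device type that placeholder metadata is
    checked against, e.g. the name a PrivateUse1 backend was registered under.
    """

    def backend(gm: fx.GraphModule, example_inputs: List[torch.Tensor]) -> Callable:
        # 1. Validate graph well-formedness.
        gm.graph.lint()

        # 2. Check device metadata on placeholder nodes.
        #    Assumption: tensor metadata, if present, lives under "example_value".
        for node in gm.graph.nodes:
            if node.op != "placeholder":
                continue
            value = node.meta.get("example_value")
            if isinstance(value, torch.Tensor) and value.device.type != expected_device_type:
                raise RuntimeError(
                    f"placeholder {node.name!r} is on {value.device.type!r}, "
                    f"expected {expected_device_type!r}"
                )

        # 3. Transform-and-recompile pattern: drop unused nodes, regenerate forward.
        gm.graph.eliminate_dead_code()
        gm.recompile()

        # 4. Log the generated Python source for debug visibility.
        log.debug("generated code:\n%s", gm.graph.python_code(root_module="self").src)

        # 5. Passthrough: no lowering, just hand back the eager callable.
        return gm.forward

    return backend
```

A function of this shape can be passed directly as torch.compile(fn, backend=...), or registered under a name with register_backend from torch._dynamo, which is the mechanism the RFC describes. The sketch takes the device type as a parameter only so that it can be exercised on CPU without OpenReg installed.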

What the tests cover

Tests validating the PyTorch-side behavior for PrivateUse1:

  • Backend registration and discoverability
  • End-to-end compile and execute on PrivateUse1 tensors
  • FakeTensor device propagation through Dynamo tracing
  • Guard caching and shape-triggered recompilation
  • Autograd through compilation
  • nn.Module with PrivateUse1 parameters
  • Dynamic shapes
  • Graph break detection on device transfers
  • Autocast with PrivateUse1 device type
  • Default device interaction with compilation
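As an illustration of the "guard caching and shape-triggered recompilation" item, a test of this kind can count how often Dynamo invokes the backend. The sketch below is not taken from tests/test_compile.py: the helper name compile_counts_for_shapes and its device parameter are hypothetical, and it defaults to CPU because an installed OpenReg build is not assumed here. It also passes dynamic=False so that a new input shape forces a fresh compilation rather than a dynamic-shape graph.

```python
from typing import Tuple

import torch
import torch._dynamo


def compile_counts_for_shapes(device: str = "cpu") -> Tuple[int, int]:
    """Return (backend calls after two same-shape runs, calls after a new shape)."""
    torch._dynamo.reset()  # start from an empty compile cache
    calls = {"n": 0}

    def counting_backend(gm, example_inputs):
        # Passthrough backend that only records how often Dynamo compiles.
        calls["n"] += 1
        return gm.forward

    @torch.compile(backend=counting_backend, dynamic=False)
    def f(x):
        return x * 2 + 1

    f(torch.ones(4, device=device))
    f(torch.ones(4, device=device))  # same shape: guards pass, cached code is reused
    after_same_shape = calls["n"]

    f(torch.ones(8, device=device))  # new shape: a shape guard fails, so Dynamo recompiles
    return after_same_shape, calls["n"]
```

On a PrivateUse1 build, the same check would be run with the device string that the backend registers, which is what makes it a test of the PyTorch-side contract rather than of any lowering.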

Pull Request: 181092

cc @bdhirsh @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @kadeng @amjames @Lucaskabela @jataylo @azahed98 @NmomoN @mengpenghui @fwenguang @cdzhan @1274085042 @PHLens @albanD

extent analysis

TL;DR

The proposed change gives OpenReg, PyTorch's in-tree PrivateUse1 reference backend, a torch.compile integration: a passthrough backend that validates the FX graph received from Dynamo and returns it for eager execution.

Guidance

  • Review the proposed backend implementation to ensure it correctly validates the FX graph and handles PrivateUse1 device propagation.
  • Verify that the tests cover all necessary scenarios, including end-to-end compilation and execution, FakeTensor device propagation, and autograd through compilation.
  • Check the registration of the backend via the in-code register_backend call to ensure it is triggered on package import.
  • Test the backend with various PyTorch features, such as dynamic shapes, graph break detection, and autocast with PrivateUse1 device type.

Example

No code snippet is provided as the issue does not contain specific code that needs to be modified or implemented.

Notes

The proposed backend is a minimal implementation that does not perform any lowering, optimization, or device-specific code generation. Each vendor is responsible for validating their own lowering in their own out-of-tree CI.

Recommendation

Adopt the proposed change: implement the passthrough backend and verify it with the accompanying tests. This is a new reference integration rather than a workaround. It connects PrivateUse1 to PyTorch's compiler stack while leaving the actual compilation logic to each vendor.
