pytorch - ✅(Solved) Fix torch.backends.fp32_precision setter doesn't propagate to cudnn.conv/rnn, and mixed new/legacy API access causes RuntimeError [1 pull request, 1 comment, 2 participants]

pytorch2026-04-06 09:16:26

pytorch/pytorch#179445•Fetched 2026-04-08 02:51:39

Author: ydshieh
Participants: khushali9, ydshieh


Two related issues with the TF32 precision flag APIs:

Error Message

>>> import torch
>>> torch.backends.fp32_precision = "ieee"
>>> torch.backends.cudnn.conv.fp32_precision = "ieee"
>>> torch.backends.cudnn.rnn.fp32_precision = "ieee"
>>> # Both conv and rnn are now 'ieee' -- they agree
>>> torch.backends.cudnn.allow_tf32
Traceback (most recent call last):
  ...
RuntimeError: PyTorch is checking whether allow_tf32 is enabled for cuDNN
without a specific operator name, but the current flag(s) indicate that
cuDNN conv and cuDNN RNN have different TF32 flags.
This combination indicates that you have used a mix of the legacy and new
APIs to set the TF32 flags.

Root Cause

User sets torch.backends.fp32_precision = "ieee" (intending to disable TF32 everywhere).
Because of Issue 1, cudnn.conv.fp32_precision and cudnn.rnn.fp32_precision remain 'tf32'.
User notices this and manually sets both to "ieee".
Accessing allow_tf32 now raises RuntimeError even though the state is fully consistent.

PR fix notes

PR #179750: torch.backends.fp32_precision setter propagate to cudnn.conv/rnn

Repository: pytorch/pytorch
Author: khushali9
State: open | merged: False
Link: https://github.com/pytorch/pytorch/pull/179750

Description (problem / solution / changelog)

Fixes #179445.

Issue 1 (the torch.backends.fp32_precision setter doesn't propagate to cudnn.conv/rnn): I have tried to invoke the default to handle this. Issue 2 is not resolved, as the current behavior seems legitimate to me: we need to error out if someone tries to use the legacy and new APIs at the same time. I have suggested a workaround to @ydshieh.

Let me know what you think.

Changed files

aten/src/ATen/Context.cpp (modified, +25/-0)
aten/src/ATen/Context.h (modified, +6/-3)
test/test_cuda.py (modified, +15/-0)

Summary

Two related issues with the TF32 precision flag APIs:

Issue 1: `torch.backends.fp32_precision = "ieee"` does not propagate to `cudnn.conv` and `cudnn.rnn`

Setting torch.backends.fp32_precision propagates to torch.backends.cudnn.fp32_precision and torch.backends.cuda.matmul.fp32_precision, but not to torch.backends.cudnn.conv.fp32_precision or torch.backends.cudnn.rnn.fp32_precision.

Reproduction:

>>> import torch
>>> torch.backends.fp32_precision
'none'
>>> torch.backends.cudnn.conv.fp32_precision
'tf32'
>>> torch.backends.cudnn.rnn.fp32_precision
'tf32'

>>> torch.backends.fp32_precision = "ieee"

>>> torch.backends.cudnn.fp32_precision       # updated
'ieee'
>>> torch.backends.cuda.matmul.fp32_precision  # updated
'ieee'
>>> torch.backends.cudnn.conv.fp32_precision   # NOT updated (bug)
'tf32'
>>> torch.backends.cudnn.rnn.fp32_precision    # NOT updated (bug)
'tf32'

Expected behavior: Setting the top-level torch.backends.fp32_precision should propagate consistently to all sub-backends, including cudnn.conv and cudnn.rnn. If skipping cudnn.conv and cudnn.rnn is intentional, that should be clearly documented in https://docs.pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices, which it currently is not.
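Until the setter propagates on its own, the assignment can be applied to each path by hand. The following is a minimal sketch rather than PyTorch code: the helper name `set_fp32_precision_everywhere` and the `FP32_PRECISION_PATHS` table are hypothetical, and the table lists only the attribute paths mentioned in this issue. A `SimpleNamespace` stand-in replaces `torch.backends` so the snippet runs without a GPU build.

```python
from types import SimpleNamespace

# Attribute paths named in this issue, relative to torch.backends.
# (Hypothetical table; other backends may expose more flags.)
FP32_PRECISION_PATHS = (
    "fp32_precision",
    "cudnn.fp32_precision",
    "cudnn.conv.fp32_precision",
    "cudnn.rnn.fp32_precision",
    "cuda.matmul.fp32_precision",
)


def set_fp32_precision_everywhere(backends, value):
    """Assign `value` at every known path; return the paths that do not exist."""
    skipped = []
    for path in FP32_PRECISION_PATHS:
        *parents, leaf = path.split(".")
        target = backends
        try:
            for name in parents:
                target = getattr(target, name)
            if not hasattr(target, leaf):
                raise AttributeError(leaf)
            setattr(target, leaf, value)
        except AttributeError:
            skipped.append(path)
    return skipped


# Stand-in for torch.backends in the state the issue reports right after
# `torch.backends.fp32_precision = "ieee"`: conv and rnn are still 'tf32'.
fake = SimpleNamespace(
    fp32_precision="ieee",
    cudnn=SimpleNamespace(
        fp32_precision="ieee",
        conv=SimpleNamespace(fp32_precision="tf32"),
        rnn=SimpleNamespace(fp32_precision="tf32"),
    ),
    cuda=SimpleNamespace(matmul=SimpleNamespace(fp32_precision="ieee")),
)

skipped = set_fp32_precision_everywhere(fake, "ieee")
print(skipped, fake.cudnn.conv.fp32_precision, fake.cudnn.rnn.fp32_precision)
```

With a real installation the call would presumably be `set_fp32_precision_everywhere(torch.backends, "ieee")`; this has not been verified against any particular PyTorch release. Note that, per Issue 2, writing to cudnn.conv and cudnn.rnn directly is exactly what leaves the legacy `allow_tf32` getter raising.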

Issue 2: Accessing `torch.backends.cudnn.allow_tf32` raises `RuntimeError` even when all `fp32_precision` flags are set to `"ieee"`

See the reproduction below.

Reproduction:

>>> import torch
>>> torch.backends.fp32_precision = "ieee"
>>> torch.backends.cudnn.conv.fp32_precision = "ieee"
>>> torch.backends.cudnn.rnn.fp32_precision  = "ieee"
>>> # Both conv and rnn are now 'ieee' -- they agree

>>> torch.backends.cudnn.allow_tf32
Traceback (most recent call last):
  ...
RuntimeError: PyTorch is checking whether allow_tf32 is enabled for cuDNN
without a specific operator name, but the current flag(s) indicate that
cuDNN conv and cuDNN RNN have different TF32 flags.
This combination indicates that you have used a mix of the legacy and new
APIs to set the TF32 flags.

The error message claims "cuDNN conv and cuDNN RNN have different TF32 flags", but in this example both are "ieee" — so the message is incorrect. The real trigger appears to be that the new API was used at all, causing some internal state to diverge from what the legacy getter expects, even when the resulting values are identical.

Continuing from that state, assigning torch.backends.cudnn.allow_tf32 = False through the legacy setter makes the getter work again:

>>> torch.backends.cudnn.allow_tf32 = False
>>> torch.backends.cudnn.allow_tf32
False  # No error

Expected behavior: Accessing torch.backends.cudnn.allow_tf32 should not raise a RuntimeError.

Connection between the two issues

The state that triggers Issue 2 is easy to reach unintentionally via Issue 1:

User sets torch.backends.fp32_precision = "ieee" (intending to disable TF32 everywhere).
Because of Issue 1, cudnn.conv.fp32_precision and cudnn.rnn.fp32_precision remain 'tf32'.
User notices this and manually sets both to "ieee".
Accessing allow_tf32 now raises RuntimeError even though the state is fully consistent.

Fixing Issue 1 would make it harder to inadvertently reach this state, but Issue 2 is an independent bug in the legacy getter that should be fixed regardless.

Versions

Collecting environment information...
PyTorch version: 2.11.0+cu126
Is debug build: False
CUDA used to build PyTorch: 12.6
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: Could not collect
Libc version: glibc-2.35

Python version: 3.10.12 (main, Mar  3 2026, 11:56:32) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-6.12.73-95.123.amzn2023.x86_64-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: 
GPU models and configuration: GPU 0: NVIDIA A10G
Nvidia driver version: 580.126.09
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_engines_precompiled.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_engines_runtime_compiled.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_graph.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_heuristic.so.9.3.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops.so.9.3.0
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
Caching allocator config: N/A

CPU:
Architecture:                            x86_64
CPU op-mode(s):                          32-bit, 64-bit
Address sizes:                           48 bits physical, 48 bits virtual
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               AuthenticAMD
Model name:                              AMD EPYC 7R32
CPU family:                              23
Model:                                   49
Thread(s) per core:                      2
Core(s) per socket:                      8
Socket(s):                               1
Stepping:                                0
BogoMIPS:                                5599.99
Flags:                                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy cr8_legacy abm sse4a misalignsse 3dnowprefetch topoext ssbd ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 clzero xsaveerptr rdpru wbnoinvd arat npt nrip_save rdpid
Hypervisor vendor:                       KVM
Virtualization type:                     full
L1d cache:                               256 KiB (8 instances)
L1i cache:                               256 KiB (8 instances)
L2 cache:                                4 MiB (8 instances)
L3 cache:                                32 MiB (2 instances)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Mitigation; untrained return thunk; SMT enabled with STIBP protection
Vulnerability Spec rstack overflow:      Vulnerable: Safe RET, no microcode
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; Retpolines; IBPB conditional; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Versions of relevant libraries:
[pip3] mypy_extensions==1.1.0
[pip3] numpy==1.26.4
[pip3] nvidia-cublas==13.1.0.3
[pip3] nvidia-cublas-cu12==12.6.4.1
[pip3] nvidia-cuda-cupti==13.0.85
[pip3] nvidia-cuda-cupti-cu12==12.6.80
[pip3] nvidia-cuda-nvrtc==13.0.88
[pip3] nvidia-cuda-nvrtc-cu12==12.6.85
[pip3] nvidia-cuda-runtime==13.0.96
[pip3] nvidia-cuda-runtime-cu12==12.6.77
[pip3] nvidia-cudnn-cu12==9.10.2.21
[pip3] nvidia-cudnn-cu13==9.19.0.56
[pip3] nvidia-cufft==12.0.0.61
[pip3] nvidia-cufft-cu12==11.3.0.4
[pip3] nvidia-curand==10.4.0.35
[pip3] nvidia-curand-cu12==10.3.7.77
[pip3] nvidia-cusolver==12.0.4.66
[pip3] nvidia-cusolver-cu12==11.7.1.2
[pip3] nvidia-cusparse==12.6.3.3
[pip3] nvidia-cusparse-cu12==12.5.4.2
[pip3] nvidia-cusparselt-cu12==0.7.1
[pip3] nvidia-cusparselt-cu13==0.8.0
[pip3] nvidia-nccl-cu12==2.28.9
[pip3] nvidia-nccl-cu13==2.28.9
[pip3] nvidia-nvjitlink==13.0.88
[pip3] nvidia-nvjitlink-cu12==12.6.85
[pip3] nvidia-nvtx==13.0.85
[pip3] nvidia-nvtx-cu12==12.6.77
[pip3] torch==2.11.0+cu126
[pip3] torchaudio==2.11.0+cu126
[pip3] torchcodec==0.11.0+cpu
[pip3] torchvision==0.26.0+cu126
[pip3] triton==3.6.0
[conda] Could not collect

extent analysis

TL;DR

Setting torch.backends.fp32_precision does not propagate to torch.backends.cudnn.conv.fp32_precision and torch.backends.cudnn.rnn.fp32_precision, and accessing torch.backends.cudnn.allow_tf32 raises a RuntimeError even when all fp32_precision flags are set to "ieee".

Guidance

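After assigning torch.backends.fp32_precision, manually set torch.backends.cudnn.conv.fp32_precision and torch.backends.cudnn.rnn.fp32_precision to the same value (for example "ieee"), because the top-level setter does not reach them.
If reading torch.backends.cudnn.allow_tf32 then raises the RuntimeError, assign torch.backends.cudnn.allow_tf32 = False first; in the reporter's session the getter works after that assignment.
Read back every fp32_precision flag (top level, cudnn, cudnn.conv, cudnn.rnn, cuda.matmul) and confirm they all hold the value you intended.
The reporter attributes the RuntimeError to a bug in the legacy getter, while the author of PR #179750 considers the error intended when the legacy and new APIs are mixed. Either way, treat the allow_tf32 = False assignment as a temporary workaround.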
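The read-back check in the guidance above can be sketched as a small snapshot helper. This is an illustrative sketch under stated assumptions: `snapshot_fp32_precision`, `disagreeing_paths`, and the `PATHS` table are hypothetical names, the table covers only the flags named in this issue, and a `SimpleNamespace` stand-in replaces `torch.backends` so the snippet runs anywhere.

```python
from types import SimpleNamespace

# Flags named in this issue, relative to torch.backends (hypothetical table).
PATHS = (
    "fp32_precision",
    "cudnn.fp32_precision",
    "cudnn.conv.fp32_precision",
    "cudnn.rnn.fp32_precision",
    "cuda.matmul.fp32_precision",
)


def snapshot_fp32_precision(backends):
    """Map each known path to its current value; missing paths are omitted."""
    state = {}
    for path in PATHS:
        target = backends
        try:
            for name in path.split("."):
                target = getattr(target, name)
        except AttributeError:
            continue
        state[path] = target
    return state


def disagreeing_paths(state, expected):
    """Return the sorted paths whose value differs from `expected`."""
    return sorted(path for path, value in state.items() if value != expected)


# Stand-in mirroring the issue's state right after the top-level setter ran.
fake = SimpleNamespace(
    fp32_precision="ieee",
    cudnn=SimpleNamespace(
        fp32_precision="ieee",
        conv=SimpleNamespace(fp32_precision="tf32"),
        rnn=SimpleNamespace(fp32_precision="tf32"),
    ),
    cuda=SimpleNamespace(matmul=SimpleNamespace(fp32_precision="ieee")),
)

stale = disagreeing_paths(snapshot_fp32_precision(fake), "ieee")
print(stale)
```

Only reads are performed, so the helper does not touch the legacy `allow_tf32` attribute and cannot trigger the RuntimeError from Issue 2. With PyTorch installed, `snapshot_fp32_precision(torch.backends)` would be the corresponding call.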

Example

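import torch

# New-style API: the top-level setter reaches cudnn and cuda.matmul ...
torch.backends.fp32_precision = "ieee"
# ... but not cudnn.conv / cudnn.rnn (Issue 1), so set them explicitly.
torch.backends.cudnn.conv.fp32_precision = "ieee"
torch.backends.cudnn.rnn.fp32_precision = "ieee"
# Legacy setter: per the issue, later reads of allow_tf32 then succeed (Issue 2).
torch.backends.cudnn.allow_tf32 = False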

Notes

The issue may be specific to the version of PyTorch (2.11.0+cu126) and CUDA (12.6) being used. The behavior of the torch.backends module may change in future versions, and the workaround provided may not be necessary.

Recommendation

Apply the workaround by manually setting the fp32_precision flags and setting torch.backends.cudnn.allow_tf32 to False, as this resolves the RuntimeError and ensures consistency in the fp32_precision flags.
