pytorch - 💡(How to fix) Fix CPU FPE crash in `torch.nn.quantized.functional.conv2d` (qnnpack)

RAT-tjy · 2026-05-06T14:33:38Z

Calling `torch.nn.quantized.functional.conv2d` on CPU with the **qnnpack** engine, using a valid spatial output size but a mismatched `groups`/`C_in`/weight shape combination, can trigger an AddressSanitizer **FPE (floating-point exception)** crash in `fxdiv_init_uint64_t`. The bug bypasses the existing frontend output dimension check, leading to a denial-of-service.

pytorch/pytorch#182653 • Fetched 2026-05-07 03:30:51


Error Message

==2546745==ERROR: AddressSanitizer: FPE on unknown address 0x7fffe80f4f49 (pc 0x7fffe80f4f49 bp 0x7fffffffc560 sp 0x7fffffffc440 T0)
    #0 0x7fffe80f4f49 in fxdiv_init_uint64_t /root/pytorch/third_party/FXdiv/include/fxdiv.h:291:23
    #1 0x7fffe80f4f49 in fxdiv_init_size_t /root/pytorch/third_party/FXdiv/include/fxdiv.h:329:52
    #2 0x7fffe80f4f49 in pthreadpool_compute_4d_tiled /root/pytorch/third_party/pthreadpool/src/legacy-api.c:209:21
    #3 0x7fffe6e99f5c in qnnpack::qnnpackConv(pytorch_qnnp_operator*, void*, unsigned long, unsigned long, unsigned long, unsigned long, unsigned char, unsigned char const*, unsigned char const*, float const*, unsigned char, unsigned char, unsigned char, unsigned char*, pthreadpool*) /root/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/src/conv-run.cc
    #4 0x7fffdae142ef in at::Tensor PackedConvWeightsQnnp<2>::apply_impl<false>(at::Tensor const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:1011:18
    #5 0x7fffdae127b8 in PackedConvWeightsQnnp<2>::apply(at::Tensor const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:1074:10
    #6 0x7fffdae20259 in at::native::(anonymous namespace)::QConvInt8<2, false>::run(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:2117:29
    (frames #7 to #44, covering the dispatcher, JIT/pybind11 bindings, and the CPython interpreter, are listed in the full trace below)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: FPE /root/pytorch/third_party/FXdiv/include/fxdiv.h:291:23 in fxdiv_init_uint64_t
==2546745==ABORTING

Root Cause

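Suspected root cause: a mismatched groups/C_in/weight shape combination passes the frontend output dimension check and reaches qnnpack, where pthreadpool_compute_4d_tiled calls fxdiv_init_size_t and the FPE is raised. On x86-64 this signal typically indicates an integer division fault, such as a zero divisor, rather than a floating-point error.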

Fix Action

Fix / Workaround

==2546745==ERROR: AddressSanitizer: FPE on unknown address 0x7fffe80f4f49 (pc 0x7fffe80f4f49 bp 0x7fffffffc560 sp 0x7fffffffc440 T0)
    #0 0x7fffe80f4f49 in fxdiv_init_uint64_t /root/pytorch/third_party/FXdiv/include/fxdiv.h:291:23
    #1 0x7fffe80f4f49 in fxdiv_init_size_t /root/pytorch/third_party/FXdiv/include/fxdiv.h:329:52
    #2 0x7fffe80f4f49 in pthreadpool_compute_4d_tiled /root/pytorch/third_party/pthreadpool/src/legacy-api.c:209:21
    #3 0x7fffe6e99f5c in qnnpack::qnnpackConv(pytorch_qnnp_operator*, void*, unsigned long, unsigned long, unsigned long, unsigned long, unsigned char, unsigned char const*, unsigned char const*, float const*, unsigned char, unsigned char, unsigned char, unsigned char*, pthreadpool*) /root/pytorch/aten/src/ATen/native/quantized/cpu/qnnpack/src/conv-run.cc
    #4 0x7fffdae142ef in at::Tensor PackedConvWeightsQnnp<2>::apply_impl<false>(at::Tensor const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:1011:18
    #5 0x7fffdae127b8 in PackedConvWeightsQnnp<2>::apply(at::Tensor const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:1074:10
    #6 0x7fffdae20259 in at::native::(anonymous namespace)::QConvInt8<2, false>::run(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long) /root/pytorch/aten/src/ATen/native/quantized/cpu/qconv.cpp:2117:29
    #7 0x7fffdae217db in c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> >::operator()(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long) /root/pytorch/aten/src/ATen/core/boxing/impl/WrapFunctionIntoRuntimeFunctor.h:22:12
    #8 0x7fffdae217db in c10::impl::wrap_kernel_functor_unboxed_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> >, at::Tensor (at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long)>::call(c10::OperatorKernel*, c10::DispatchKeySet, at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long) /root/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:578:12
    #9 0x7fffdae21d08 in std::decay<c10::guts::infer_function_traits<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> > >::type::return_type>::type c10::impl::call_functor_with_args_from_stack_<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> >, false, 0ul, 1ul, 2ul, 3ul, at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long>(c10::OperatorKernel*, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<std::vector> >*, std::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul>, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long>*) /root/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:642:10
    #10 0x7fffdae21ac6 in std::decay<c10::guts::infer_function_traits<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> > >::type::return_type>::type c10::impl::call_functor_with_args_from_stack<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> >, false>(c10::OperatorKernel*, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<std::vector> >*) /root/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:666:10
    #11 0x7fffdae21ac6 in c10::impl::make_boxed_from_unboxed_functor<c10::impl::detail::WrapFunctionIntoRuntimeFunctor_<at::Tensor (*)(at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long), at::Tensor, c10::guts::typelist::typelist<at::Tensor, c10::intrusive_ptr<ConvPackedParamsBase<2>, c10::detail::intrusive_target_default_null_type<ConvPackedParamsBase<2> > > const&, double, long> >, false>::call(c10::OperatorKernel*, c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /root/pytorch/aten/src/ATen/core/boxing/impl/make_boxed_from_unboxed_functor.h:765:28
    #12 0x7fffe2a0e9b0 in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:48:3
    #13 0x7fffe2a0e9b0 in c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:92:22
    #14 0x7fffe2a0e9b0 in c10::Dispatcher::redispatchBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:935:10
    #15 0x7fffe2a0e9b0 in c10::OperatorHandle::redispatchBoxed(c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:550:34
    #16 0x7fffe2a0e9b0 in torch::autograd::basicAutogradNotImplementedFallbackImpl(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) /root/pytorch/torch/csrc/autograd/autograd_not_implemented_fallback.cpp:149:6
    #17 0x7fffd9632d53 in c10::BoxedKernel::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/boxing/BoxedKernel_impl.h:48:3
    #18 0x7fffd9632d53 in c10::KernelFunction::callBoxed(c10::OperatorHandle const&, c10::DispatchKeySet, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/boxing/KernelFunction_impl.h:92:22
    #19 0x7fffd9632d53 in c10::Dispatcher::callBoxed(c10::OperatorHandle const&, std::vector<c10::IValue, std::allocator<c10::IValue> >*) const /root/pytorch/aten/src/ATen/core/dispatch/Dispatcher.h:893:10
    #20 0x7ffff2ba975e in std::function<void (std::vector<c10::IValue, std::allocator<c10::IValue> >&)>::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) const /usr/bin/../lib/gcc/x86_64-linux-gnu/11/../../../../include/c++/11/bits/std_function.h:590:9
    #21 0x7ffff2ba975e in torch::jit::Operation::operator()(std::vector<c10::IValue, std::allocator<c10::IValue> >&) /root/pytorch/aten/src/ATen/core/stack.h:41:5
    #22 0x7ffff2ba975e in torch::jit::invokeOperatorFromPython(c10::ArrayRef<std::shared_ptr<torch::jit::Operator> >, pybind11::args const&, pybind11::kwargs const&, std::optional<c10::DispatchKey>) /root/pytorch/torch/csrc/jit/python/pybind_utils.cpp:860:7
    #23 0x7ffff2baac15 in torch::jit::_get_operation_for_overload_or_packet(c10::ArrayRef<std::shared_ptr<torch::jit::Operator> >, c10::Symbol, pybind11::args const&, pybind11::kwargs const&, bool, std::optional<c10::DispatchKey>) /root/pytorch/torch/csrc/jit/python/pybind_utils.cpp:968:9
    #24 0x7ffff2baa7f0 in torch::jit::_get_operation_for_packet(std::vector<std::shared_ptr<torch::jit::Operator>> const&, c10::Symbol, pybind11::args const&, pybind11::kwargs const&, bool, std::optional<c10::DispatchKey>) /root/pytorch/torch/csrc/jit/python/pybind_utils.cpp:949:10
    #25 0x7ffff2a24f9f in torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)::operator()(pybind11::args const&, pybind11::kwargs const&) const /root/pytorch/torch/csrc/jit/python/init.cpp:1815:24
    #26 0x7ffff2a24f9f in pybind11::object pybind11::detail::argument_loader<pybind11::args const&, pybind11::kwargs const&>::call_impl<pybind11::object, torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&, 0ul, 1ul, pybind11::detail::void_type>(torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&, std::integer_sequence<unsigned long, 0ul, 1ul>, pybind11::detail::void_type&&) && /root/pytorch/torch/include/pybind11/cast.h:2137:16
    #27 0x7ffff2a24f9f in std::enable_if<!(std::is_void<pybind11::object>::value), pybind11::object>::type pybind11::detail::argument_loader<pybind11::args const&, pybind11::kwargs const&>::call<pybind11::object, pybind11::detail::void_type, torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&>(torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&) && /root/pytorch/torch/include/pybind11/cast.h:2105:42
    #28 0x7ffff2a24f9f in void pybind11::cpp_function::initialize<torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&), pybind11::object, pybind11::args const&, pybind11::kwargs const&, pybind11::name, pybind11::doc>(torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&&, pybind11::object (*)(pybind11::args const&, pybind11::kwargs const&), pybind11::name const&, pybind11::doc const&)::'lambda'(pybind11::detail::function_call&)::operator()(pybind11::detail::function_call&) const /root/pytorch/torch/include/pybind11/pybind11.h:430:56
    #29 0x7ffff2a24f9f in void pybind11::cpp_function::initialize<torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&), pybind11::object, pybind11::args const&, pybind11::kwargs const&, pybind11::name, pybind11::doc>(torch::jit::initJITBindings(_object*)::$_229::operator()(std::__cxx11::string const&) const::'lambda'(pybind11::args const&, pybind11::kwargs const&)&&, pybind11::object (*)(pybind11::args const&, pybind11::kwargs const&), pybind11::name const&, pybind11::doc const&)::'lambda'(pybind11::detail::function_call&)::__invoke(pybind11::detail::function_call&) /root/pytorch/torch/include/pybind11/pybind11.h:400:21
    #30 0x7ffff1aac8b6 in pybind11::cpp_function::dispatcher(_object*, _object*, _object*) /root/pytorch/torch/include/pybind11/pybind11.h:1063:30
    #31 0x5629f28d8025 in PyObject_Call (/usr/bin/python3.10+0x1a2025)
    #32 0x5629f28b29b3 in _PyEval_EvalFrameDefault (/usr/bin/python3.10+0x17c9b3)
    #33 0x5629f28c11bb in _PyFunction_Vectorcall (/usr/bin/python3.10+0x18b1bb)
    #34 0x5629f28ab81f in _PyEval_EvalFrameDefault (/usr/bin/python3.10+0x17581f)
    #35 0x5629f28c11bb in _PyFunction_Vectorcall (/usr/bin/python3.10+0x18b1bb)
    #36 0x5629f28ab81f in _PyEval_EvalFrameDefault (/usr/bin/python3.10+0x17581f)
    #37 0x5629f2990565 in PyEval_EvalCode (/usr/bin/python3.10+0x25a565)
    #38 0x5629f29b6ed7 in PyRun_SimpleFileExFlags (/usr/bin/python3.10+0x280ed7)
    #39 0x5629f29b16de in PyRun_FileExFlags (/usr/bin/python3.10+0x27b6de)
    #40 0x5629f29aa3ad in Py_RunMain (/usr/bin/python3.10+0x2743ad)
    #41 0x5629f298447c in Py_BytesMain (/usr/bin/python3.10+0x24e47c)
    #42 0x7ffff7106d8f in __libc_start_call_main (/lib/x86_64-linux-gnu/libc.so.6+0x29d8f)
    #43 0x7ffff7106e3f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x29e3f)
    #44 0x5629f2984374 in _start (/usr/bin/python3.10+0x24e374)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: FPE /root/pytorch/third_party/FXdiv/include/fxdiv.h:291:23 in fxdiv_init_uint64_t
==2546745==ABORTING

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 128
On-line CPU(s) list: 0-127
Vendor ID: AuthenticAMD
Model name: AMD Ryzen Threadripper PRO 5995WX 64-Cores
CPU family: 25
Model: 8
Thread(s) per core: 2
Core(s) per socket: 64
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU max MHz: 2700.0000
CPU min MHz: 1800.0000
BogoMIPS: 5389.77
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 invpcid_single hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca
Virtualization: AMD-V
L1d cache: 2 MiB (64 instances)
L1i cache: 2 MiB (64 instances)
L2 cache: 32 MiB (64 instances)
L3 cache: 256 MiB (8 instances)
NUMA node(s): 1
NUMA node0 CPU(s): 0-127
Vulnerability Gather data sampling: Not affected
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Retbleed: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines; IBPB conditional; IBRS_FW; STIBP always-on; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected

Code Example

# poc.nn.quantized.functional.conv2d.py
import torch

print("torch version:", torch.__version__)
torch.backends.quantized.engine = 'qnnpack'

# Minimal Reproducer
# Input Shape: (N=2, C_in=8, H=16, W=7)
x = torch.quantize_per_tensor(
    torch.zeros(2, 8, 16, 7), 
    scale=0.01, zero_point=0, dtype=torch.quint8
)

# Weight Shape: (C_out=6, C_in/groups=1, K_h=7, K_w=1)
w = torch.quantize_per_tensor(
    torch.zeros(6, 1, 7, 1), 
    scale=0.01, zero_point=-128, dtype=torch.qint8
)

# Trigger FPE crash
output = torch.nn.quantized.functional.conv2d(
    x, w, None,
    stride=(1, 1),
    padding=(0, 0),
    dilation=(1, 1),
    groups=8,
    scale=0.01,
    zero_point=0
)
print(output.shape)
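
Until the operator rejects such arguments itself, one possible user-side workaround is to check the channel and group arguments before calling the quantized convolution. The following is a minimal sketch: `validate_grouped_conv2d_args` is a hypothetical helper, not a PyTorch API, and the conditions are the usual grouped-convolution constraints rather than checks taken from the PyTorch source.

```python
def validate_grouped_conv2d_args(input_shape, weight_shape, groups):
    """Raise ValueError if the shapes cannot form a valid grouped 2-D convolution.

    input_shape:  (N, C_in, H, W)
    weight_shape: (C_out, C_in // groups, K_h, K_w)
    Hypothetical helper for illustration; not part of PyTorch.
    """
    if len(input_shape) != 4 or len(weight_shape) != 4:
        raise ValueError("expected 4-D input and weight shapes")
    if groups <= 0:
        raise ValueError(f"groups must be positive, got {groups}")

    c_in = input_shape[1]
    c_out, c_in_per_group = weight_shape[0], weight_shape[1]

    if c_in % groups != 0:
        raise ValueError(f"C_in={c_in} is not divisible by groups={groups}")
    if c_out % groups != 0:
        raise ValueError(f"C_out={c_out} is not divisible by groups={groups}")
    if c_in_per_group * groups != c_in:
        raise ValueError(
            f"weight.shape[1]={c_in_per_group} times groups={groups} "
            f"does not equal C_in={c_in}"
        )


# Shapes and groups taken from the reproducer above.
try:
    validate_grouped_conv2d_args((2, 8, 16, 7), (6, 1, 7, 1), groups=8)
except ValueError as err:
    print("rejected:", err)
```

In the reproducer, calling `validate_grouped_conv2d_args(tuple(x.shape), tuple(w.shape), 8)` before `torch.nn.quantized.functional.conv2d` would surface the problem as a Python exception instead of a process abort.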

🐛 Describe the bug

Summary

PoC

Run the following command, where the script is the reproducer listed under Code Example above:

python3 poc.nn.quantized.functional.conv2d.py

Suspected root cause

The bug is caused by incomplete frontend validation and unsafe underlying computation:

The existing check only verifies that the output spatial dimensions (height and width) are greater than 0, which holds for this PoC (output H=10, W=7).
The combination `groups=8`, `C_in=8`, `weight.shape[0]=6` (output channels) is invalid for grouped convolution because the output channel count is not divisible by `groups`, but the frontend does not check this.
The QNNPACK backend passes this invalid combination to `pthreadpool_compute_4d_tiled`, which computes a tile size of 0 and passes it to `fxdiv_init_uint64_t` as the divisor, presumably because integer division gives 6 / 8 = 0 output channels per group.
The third-party FXdiv library then performs a division by zero, triggering a hardware-level FPE that aborts the process (denial of service).
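
The missing validation described above can be sketched as a caller-side pre-check. This is a minimal sketch, not the upstream fix: `check_grouped_conv_shapes` is a hypothetical helper, not part of the PyTorch API, and it assumes the usual conv2d layouts of `(N, C_in, H, W)` for the input and `(C_out, C_in // groups, kH, kW)` for the weight. Only `groups=8`, `C_in=8` and `C_out=6` come from the report; the spatial and kernel sizes in the usage lines are illustrative.

```python
def check_grouped_conv_shapes(input_shape, weight_shape, groups):
    """Hypothetical pre-check for grouped conv2d shapes (not a PyTorch API).

    input_shape:  (N, C_in, H, W)
    weight_shape: (C_out, C_in // groups, kH, kW)
    Returns the number of output channels per group.
    """
    if groups <= 0:
        raise ValueError(f"groups must be positive, got {groups}")

    c_in = input_shape[1]
    c_out, c_in_per_group = weight_shape[0], weight_shape[1]

    if c_in % groups != 0:
        raise ValueError(f"C_in={c_in} is not divisible by groups={groups}")
    if c_out % groups != 0:
        # This is the case from the report: 6 // 8 == 0 output channels
        # per group, which is the kind of zero that can end up as a divisor.
        raise ValueError(f"C_out={c_out} is not divisible by groups={groups}")
    if c_in_per_group != c_in // groups:
        raise ValueError(
            f"weight.shape[1]={c_in_per_group} does not match "
            f"C_in // groups={c_in // groups}"
        )
    return c_out // groups


# Shapes from the report (spatial and kernel sizes are illustrative).
try:
    check_grouped_conv_shapes((1, 8, 12, 9), (6, 1, 3, 3), groups=8)
except ValueError as err:
    print(f"rejected before reaching the backend: {err}")
```

Running a check like this before calling `torch.nn.quantized.functional.conv2d` turns the crash into an ordinary Python exception. It is a workaround on the caller's side and does not replace a proper check in the qnnpack code path.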

Versions

PyTorch version: 2.10.0a0+gitf2bb22f
Is debug build: False
CUDA used to build PyTorch: Could not collect
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.5 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04.2) 11.4.0
Clang version: 14.0.0-1ubuntu1.1
CMake version: version 4.1.2
Libc version: glibc-2.35

Python version: 3.10.12 (main, Aug 15 2025, 14:32:43) [GCC 11.4.0] (64-bit runtime)
Python platform: Linux-5.4.0-200-generic-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: Could not collect
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration:
Nvidia driver version: Could not collect
cuDNN version: Could not collect
Is XPU available: False
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: False
Caching allocator config: N/A

Versions of relevant libraries:
[pip3] numpy==2.2.6
[pip3] optree==0.17.0
[pip3] torch==2.10.0a0+gitf2bb22f
[conda] Could not collect

cc @jerryzh168 @jianyuh @raghuramank100 @jamesr66a @vkuzo @jgong5 @Xia-Weiwen @leslie-fang-intel
