Description
Thanks a lot for providing and sharing this amazing tool. I recently hit notable performance penalties when fitting a small TSMixerx model on a large number of different datasets in parallel on an AMD CPU. Some ops, such as aten::bernoulli_, aten::mul, aten::copy_, and aten::mm, need disproportionately more CPU time as I increase the number of parallel processes (I'm using the Dask framework for distributed training). I confirmed that CPU utilization stays below 50%, so CPU oversubscription should not be the issue, and I checked memory, IOWAIT, etc., which all look fine. I suspect this is due to the MKL library being biased towards Intel CPUs, so I desperately hope this plugin can save me here.
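For reference, here is a minimal sketch of how the per-worker thread budget could be pinned to rule out oversubscription more directly than by reading overall CPU utilization. The helper name `per_worker_thread_env` and the worker count of 8 are hypothetical and not part of my actual setup. The only assumption about the libraries is that OpenMP and MKL read `OMP_NUM_THREADS` and `MKL_NUM_THREADS` at startup.

```python
import os


def per_worker_thread_env(total_cores: int, n_workers: int) -> dict[str, str]:
    """Split the available cores evenly across workers.

    Hypothetical helper: returns the environment variables that cap the
    intra-op thread pools (OpenMP and MKL) for one worker process.
    """
    if total_cores < 1 or n_workers < 1:
        raise ValueError("total_cores and n_workers must both be >= 1")
    # Integer division, but never fewer than one thread per worker.
    threads = max(1, total_cores // n_workers)
    return {
        "OMP_NUM_THREADS": str(threads),
        "MKL_NUM_THREADS": str(threads),
    }


if __name__ == "__main__":
    # Assumed worker count; in practice this would match the Dask setup.
    env = per_worker_thread_env(os.cpu_count() or 1, n_workers=8)
    # These must be set before torch is imported in the worker process,
    # because the thread pools are sized when the libraries initialize.
    os.environ.update(env)
    print(env)
```

If the per-op CPU time still grows disproportionately with these caps in place, thread oversubscription is less likely to be the cause.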
Since the binaries only support up to Python 3.11, and other dependencies in my project setup require Python 3.12, I'm trying to build from source on Python 3.12. However, I'm stuck on some build errors, as follows:
python setup.py bdist_wheel
running bdist_wheel
running build
running build_py
copying src/cpu/python/zentorch/_WOQLinear.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_graph_preprocess_matcher.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/__init__.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_info.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_graph_preprocess_patterns.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_logging.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_woq_model_reload.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_compile_backend.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_build_info.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_eltwise_fusions.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_op_replacement.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_mkldnn.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_fusion_matcher.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_utils.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_quantization_utils.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_optimize.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_meta_registrations.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_custom_op_replacement.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/_fusion_patterns.py -> build/lib.linux-x86_64-cpython-312/zentorch
copying src/cpu/python/zentorch/llm/_model_conversion_functions.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
copying src/cpu/python/zentorch/llm/__init__.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
copying src/cpu/python/zentorch/llm/_custom_models_reference_linear_fusion.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
copying src/cpu/python/zentorch/llm/_custom_model_forward.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
copying src/cpu/python/zentorch/llm/_checks.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
copying src/cpu/python/zentorch/llm/_optimize.py -> build/lib.linux-x86_64-cpython-312/zentorch/llm
running build_ext
cmake -S /home/kemove/git/ZenDNN-pytorch-plugin -B build/temp.linux-x86_64-cpython-312 -DCMAKE_BUILD_TYPE=Release -DCMAKE_PREFIX_PATH=/home/kemove/git/pytorch/torch/share/cmake
CMake Deprecation Warning at CMakeLists.txt:6 (cmake_minimum_required):
  Compatibility with CMake < 3.10 will be removed from a future version of
  CMake.
  Update the VERSION argument <min> value.  Or, use the <min>...<max> syntax
  to tell CMake that the project requires at least <min> but has been updated
  to work with policies introduced by <max> or earlier.
CMake Warning (dev) at /home/kemove/.pyenv/versions/3.12.2/lib/python3.12/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(blis) is deprecated, call
  FetchContent_MakeAvailable(blis) instead.  Policy CMP0169 can be set to OLD
  to allow FetchContent_Populate(blis) to be called directly for now, but the
  ability to call it with declared details will be removed completely in a
  future version.
Call Stack (most recent call first):
  cmake/modules/FindZENDNN.cmake:34 (FetchContent_Populate)
  CMakeLists.txt:13 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
CMake Warning (dev) at /home/kemove/.pyenv/versions/3.12.2/lib/python3.12/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(FBGEMM) is deprecated, call
  FetchContent_MakeAvailable(FBGEMM) instead.  Policy CMP0169 can be set to
  OLD to allow FetchContent_Populate(FBGEMM) to be called directly for now,
  but the ability to call it with declared details will be removed completely
  in a future version.
Call Stack (most recent call first):
  cmake/modules/FindZENDNN.cmake:73 (FetchContent_Populate)
  CMakeLists.txt:13 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
-- Submodule update
CMake Warning (dev) at /home/kemove/.pyenv/versions/3.12.2/lib/python3.12/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(libxsmm) is deprecated, call
  FetchContent_MakeAvailable(libxsmm) instead.  Policy CMP0169 can be set to
  OLD to allow FetchContent_Populate(libxsmm) to be called directly for now,
  but the ability to call it with declared details will be removed completely
  in a future version.
Call Stack (most recent call first):
  cmake/modules/FindZENDNN.cmake:131 (FetchContent_Populate)
  CMakeLists.txt:13 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
CMake Warning (dev) at /home/kemove/.pyenv/versions/3.12.2/lib/python3.12/site-packages/cmake/data/share/cmake-3.31/Modules/FetchContent.cmake:1953 (message):
  Calling FetchContent_Populate(ZenDNN) is deprecated, call
  FetchContent_MakeAvailable(ZenDNN) instead.  Policy CMP0169 can be set to
  OLD to allow FetchContent_Populate(ZenDNN) to be called directly for now,
  but the ability to call it with declared details will be removed completely
  in a future version.
Call Stack (most recent call first):
  cmake/modules/FindZENDNN.cmake:175 (FetchContent_Populate)
  CMakeLists.txt:13 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
-- Found AOCL BLIS libraries: /home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312/lib/libblis-mt.a
-- Found AOCL BLIS include  : /home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312/blis_gcc_build/include
-- Found ZENDNN libraries   : /home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312/lib/libamdZenDNN.a
-- Found ZENDNN include     : /home/kemove/git/ZenDNN-pytorch-plugin/third_party/ZenDNN/inc
-- Caffe2: CUDA detected: 12.5
-- Caffe2: CUDA nvcc is: /usr/local/cuda-12.5/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda-12.5
-- Caffe2: Header version is: 12.5
CMake Warning at /home/kemove/git/pytorch/torch/share/cmake/Caffe2/public/cuda.cmake:140 (message):
  Failed to compute shorthash for libnvrtc.so
Call Stack (most recent call first):
  /home/kemove/git/pytorch/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/kemove/git/pytorch/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  cmake/modules/FindCPUkernels.cmake:8 (find_package)
  CMakeLists.txt:16 (find_package)
CMake Warning (dev) at /home/kemove/.pyenv/versions/3.12.2/lib/python3.12/site-packages/cmake/data/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:441 (message):
  The package name passed to `find_package_handle_standard_args` (nvtx3) does
  not match the name of the calling package (Caffe2).  This can lead to
  problems in calling code that expects `find_package` result variables
  (e.g., `_FOUND`) to follow a certain pattern.
Call Stack (most recent call first):
  /home/kemove/git/pytorch/torch/share/cmake/Caffe2/public/cuda.cmake:174 (find_package_handle_standard_args)
  /home/kemove/git/pytorch/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/kemove/git/pytorch/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  cmake/modules/FindCPUkernels.cmake:8 (find_package)
  CMakeLists.txt:16 (find_package)
This warning is for project developers.  Use -Wno-dev to suppress it.
-- Could NOT find nvtx3 (missing: nvtx3_dir)
CMake Warning at /home/kemove/git/pytorch/torch/share/cmake/Caffe2/public/cuda.cmake:180 (message):
  Cannot find NVTX3, find old NVTX instead
Call Stack (most recent call first):
  /home/kemove/git/pytorch/torch/share/cmake/Caffe2/Caffe2Config.cmake:86 (include)
  /home/kemove/git/pytorch/torch/share/cmake/Torch/TorchConfig.cmake:68 (find_package)
  cmake/modules/FindCPUkernels.cmake:8 (find_package)
  CMakeLists.txt:16 (find_package)
-- USE_CUDNN is set to 0. Compiling without cuDNN support
-- USE_CUSPARSELT is set to 0. Compiling without cuSPARSELt support
-- USE_CUDSS is set to 0. Compiling without cuDSS support
-- USE_CUFILE is set to 0. Compiling without cuFile support
-- Autodetected CUDA architecture(s):  8.9
-- Added CUDA NVCC flags for: -gencode;arch=compute_89,code=sm_89
-- Configuring done (1.2s)
-- Generating done (0.0s)
-- Build files have been written to: /home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312
make -j -C build/temp.linux-x86_64-cpython-312
make: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[1]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
[ 14%] Built target libxsmm
[ 28%] Built target libamdblis
[ 42%] Built target libfbgemm
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[2]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
[ 57%] Generating lib/libamdZenDNN.a
make[3]: Entering directory '/home/kemove/git/ZenDNN-pytorch-plugin/third_party/ZenDNN'
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
[100%] Built target CPUkernels
In file included from src/cpu/avx512_embedding_bag.cpp:27:
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:118:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp:118:28: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:126:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float)’:
src/cpu/avx512_embedding_bag_utils.hpp:138:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:148:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:193:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:201:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float)’:
src/cpu/avx512_embedding_bag_utils.hpp:213:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In member function ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*)’:
src/cpu/avx512_embedding_bag_utils.hpp:223:28: error: there are no arguments to ‘_mm512_cvtpbh_ps’ that depend on a template parameter, so a declaration of ‘_mm512_cvtpbh_ps’ must be available [-fpermissive]
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ^~~~~~~~~~~~~~~~
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:183:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:230:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:277:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:324:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:371:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:418:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:201:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  201 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:552:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:599:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:646:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:693:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:740:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:787:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:213:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  213 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:1312:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:1319:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:1370:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:1377:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:1428:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:1435:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:1486:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:1493:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:1544:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:1551:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:1602:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:193:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  193 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, zendnn::impl::bfloat16_t, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:1609:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_bf16; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1711:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:223:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  223 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:183:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:230:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:277:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:324:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:371:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_add_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:418:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:126:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  126 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:552:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:599:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:646:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:693:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:740:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_fmadd_ps(const zendnn::impl::bfloat16_t*, float) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:787:43:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_sum_wt(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:138:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  138 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:1312:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 32]’:
src/cpu/avx512_embedding_bag.cpp:1319:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:1370:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 16]’:
src/cpu/avx512_embedding_bag.cpp:1377:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:1428:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 8]’:
src/cpu/avx512_embedding_bag.cpp:1435:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:1486:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 4]’:
src/cpu/avx512_embedding_bag.cpp:1493:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:1544:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 2]’:
src/cpu/avx512_embedding_bag.cpp:1551:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::load_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:1602:36:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:118:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  118 |             v[i]         = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
src/cpu/avx512_embedding_bag_utils.hpp: In instantiation of ‘void zenmmAVX512_ext_ps<zendnn::impl::bfloat16_t, float, DIM>::fetch_max_ps(const zendnn::impl::bfloat16_t*) [with unsigned int DIM = 1]’:
src/cpu/avx512_embedding_bag.cpp:1609:41:   required from ‘zendnn::impl::status_t zendnn::impl::cpu::avx512_embedding_bag_t<in_data_type, out_data_type>::avx512_max(const zendnn::impl::cpu::emb_params_t&) const [with zendnn_data_type_t in_data_type = zendnn_bf16; zendnn_data_type_t out_data_type = zendnn_f32; zendnn::impl::status_t = zendnn_status_t]’
src/cpu/avx512_embedding_bag.cpp:1712:17:   required from here
src/cpu/avx512_embedding_bag_utils.hpp:148:44: error: ‘_mm512_cvtpbh_ps’ was not declared in this scope; did you mean ‘_mm512_cvtph_ps’?
  148 |             __m512   tps = _mm512_cvtpbh_ps(tbh);
      |                            ~~~~~~~~~~~~~~~~^~~~~
      |                            _mm512_cvtph_ps
make[3]: *** [Makefile:278: _out/obj/src/cpu/avx512_embedding_bag.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/third_party/ZenDNN'
make[2]: *** [CMakeFiles/libamdZenDNN.dir/build.make:485: lib/libamdZenDNN.a] Error 2
make[2]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make[1]: *** [CMakeFiles/Makefile2:193: CMakeFiles/libamdZenDNN.dir/all] Error 2
make[1]: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
make: *** [Makefile:91: all] Error 2
make: Leaving directory '/home/kemove/git/ZenDNN-pytorch-plugin/build/temp.linux-x86_64-cpython-312'
error: command '/usr/bin/make' failed with exit code 2
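
Every compiler error in the log above is the same diagnostic: `_mm512_cvtpbh_ps` is not declared, repeated once per template instantiation in `avx512_embedding_bag_utils.hpp`. This suggests, though I have not confirmed it, that the intrinsic headers of the GCC being used do not provide this AVX512-BF16 intrinsic. To make a log like this easier to read, here is a minimal sketch that collapses GCC output into distinct error sites with a count. The function name `summarize_errors` and the regular expression are my own illustration, not part of ZenDNN or its build system, and the pattern assumes the usual `path:line:col: error: message` layout.

```python
import re
import sys
from collections import Counter

# Matches GCC diagnostics of the form "path:line:col: error: message".
# "required from" notes, source excerpts, and make output do not match.
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def summarize_errors(log_text):
    """Count distinct (file, line, message) error sites in GCC output."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = ERROR_RE.match(raw.strip())
        if match:
            key = (match["file"], int(match["line"]), match["msg"])
            counts[key] += 1
    return counts


if __name__ == "__main__":
    # Usage: python setup.py bdist_wheel 2>&1 | python summarize_errors.py
    for (path, line, msg), n in sorted(summarize_errors(sys.stdin.read()).items()):
        print(f"{n:3d}x {path}:{line}: {msg}")
```

Piping the build output through the script should reduce the log to a handful of source lines, all of which call `_mm512_cvtpbh_ps`.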