Expected Behavior

I have a CUDA-supporting card and a CPU that doesn't support AVX2, and I want to build llama-cpp-python for CUDA. I can compile the latest llama.cpp in my (x64!!) Visual Studio environment with cmake, and it works, detecting no AVX2 and CUDA out of the box without any arguments, giving me a binary that prints the expected system info:

n_threads = 4 / 8 | AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0

and runs perfectly fine. So theoretically it should be possible.

With llama-cpp-python I run these commands:

set FORCE_CMAKE=1
set CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"
pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose

And I expect to get the same system info as I do for llama.cpp, AVX2=0 and BLAS=1. I also expect to be able to load models and run them!

Current Behavior

Instead I get this system info:

AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0

AVX2 = 1, and obviously when I try to run any model it just errors out with a Windows Error 0xc000001d, because I don't actually have any AVX2 for it to use.

It also does the same thing if I transpose the arguments and use

set CMAKE_ARGS="-DLLAMA_CUBLAS=ON -DLLAMA_AVX2=OFF"
Environment and Context
i7-3770, Windows 10 Enterprise 64 bit 10.0.19044, Visual Studio 2022, cl.exe 19.35.32217.1 for x64, cmake version 3.25.1-msvc1, Python 3.10.4, pip 23.1.2
Failure Logs
Here is a verbose compile log:
(venv) D:\llamastuff\test>set FORCE_CMAKE=1
(venv) D:\llamastuff\test>set CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"
(venv) D:\llamastuff\test>pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir --verbose
Using pip 23.1.2 from D:\llamastuff\test\venv\lib\site-packages\pip (python 3.10)
Collecting llama-cpp-python
Downloading llama_cpp_python-0.1.55.tar.gz (1.4 MB)
---------------------------------------- 1.4/1.4 MB 397.4 kB/s eta 0:00:00
Running command pip subprocess to install build dependencies
Collecting setuptools>=42
Using cached setuptools-67.8.0-py3-none-any.whl (1.1 MB)
Collecting scikit-build>=0.13
Using cached scikit_build-0.17.5-py3-none-any.whl (82 kB)
Collecting cmake>=3.18
Using cached cmake-3.26.3-py2.py3-none-win_amd64.whl (33.0 MB)
Collecting ninja
Using cached ninja-1.11.1-py2.py3-none-win_amd64.whl (313 kB)
Collecting distro (from scikit-build>=0.13)
Using cached distro-1.8.0-py3-none-any.whl (20 kB)
Collecting packaging (from scikit-build>=0.13)
Using cached packaging-23.1-py3-none-any.whl (48 kB)
Collecting tomli (from scikit-build>=0.13)
Using cached tomli-2.0.1-py3-none-any.whl (12 kB)
Collecting wheel>=0.32.0 (from scikit-build>=0.13)
Using cached wheel-0.40.0-py3-none-any.whl (64 kB)
Installing collected packages: ninja, cmake, wheel, tomli, setuptools, packaging, distro, scikit-build
Successfully installed cmake-3.26.3 distro-1.8.0 ninja-1.11.1 packaging-23.1 scikit-build-0.17.5 setuptools-67.8.0 tomli-2.0.1 wheel-0.40.0
Installing build dependencies ... done
Running command Getting requirements to build wheel
running egg_info
writing llama_cpp_python.egg-info\PKG-INFO
writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
writing requirements to llama_cpp_python.egg-info\requires.txt
writing top-level names to llama_cpp_python.egg-info\top_level.txt
reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
adding license file 'LICENSE.md'
writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
Getting requirements to build wheel ... done
Running command Preparing metadata (pyproject.toml)
running dist_info
creating C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info
writing C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\PKG-INFO
writing dependency_links to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\dependency_links.txt
writing requirements to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\requires.txt
writing top-level names to C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\top_level.txt
writing manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
reading manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
adding license file 'LICENSE.md'
writing manifest file 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python.egg-info\SOURCES.txt'
creating 'C:\Users\Vardogger\AppData\Local\Temp\pip-modern-metadata-hz1064nm\llama_cpp_python-0.1.55.dist-info'
Preparing metadata (pyproject.toml) ... done
Collecting typing-extensions>=4.5.0 (from llama-cpp-python)
Downloading typing_extensions-4.6.2-py3-none-any.whl (31 kB)
Building wheels for collected packages: llama-cpp-python
Running command Building wheel for llama-cpp-python (pyproject.toml)
--------------------------------------------------------------------------------
-- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator
--------------------------------
---------------------------
----------------------
-----------------
------------
-------
--
Not searching for unused variables given on the command line.
-- The C compiler identification is MSVC 19.35.32217.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- The CXX compiler identification is MSVC 19.35.32217.1
CMake Warning (dev) at C:/Users/Vardogger/AppData/Local/Temp/pip-build-env-zzt_1t3c/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:168 (if):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.
Quoted variables like "MSVC" will no longer be dereferenced when the policy
is set to NEW. Since the policy is not set the OLD behavior will be used.
Call Stack (most recent call first):
CMakeLists.txt:4 (ENABLE_LANGUAGE)
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Warning (dev) at C:/Users/Vardogger/AppData/Local/Temp/pip-build-env-zzt_1t3c/overlay/Lib/site-packages/cmake/data/share/cmake-3.26/Modules/CMakeDetermineCXXCompiler.cmake:189 (elseif):
Policy CMP0054 is not set: Only interpret if() arguments as variables or
keywords when unquoted. Run "cmake --help-policy CMP0054" for policy
details. Use the cmake_policy command to set the policy and suppress this
warning.
Quoted variables like "MSVC" will no longer be dereferenced when the policy
is set to NEW. Since the policy is not set the OLD behavior will be used.
Call Stack (most recent call first):
CMakeLists.txt:4 (ENABLE_LANGUAGE)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Configuring done (2.7s)
-- Generating done (0.0s)
-- Build files have been written to: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_cmake_test_compile/build
--
-------
------------
-----------------
----------------------
---------------------------
--------------------------------
-- Trying 'Ninja (Visual Studio 17 2022 x64 v143)' generator - success
--------------------------------------------------------------------------------
Configuring Project
Working directory:
C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253\_skbuild\win-amd64-3.10\cmake-build
Command:
'C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\cmake\data\bin/cmake.exe' 'C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253' -G Ninja '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\ninja\data\bin\ninja' -D_SKBUILD_FORCE_MSVC=1930 --no-warn-unused-cli '-DCMAKE_INSTALL_PREFIX:PATH=C:\Users\Vardogger\AppData\Local\Temp\pip-install-nrzfuccz\llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253\_skbuild\win-amd64-3.10\cmake-install' -DPYTHON_VERSION_STRING:STRING=3.10.4 -DSKBUILD:INTERNAL=TRUE '-DCMAKE_MODULE_PATH:PATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\skbuild\resources\cmake' '-DPYTHON_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPYTHON_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPYTHON_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DPython_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPython_ROOT_DIR:PATH=D:\llamastuff\test\venv' -DPython_FIND_REGISTRY:STRING=NEVER '-DPython_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPython_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DPython3_EXECUTABLE:PATH=D:\llamastuff\test\venv\Scripts\python.exe' '-DPython3_ROOT_DIR:PATH=D:\llamastuff\test\venv' -DPython3_FIND_REGISTRY:STRING=NEVER '-DPython3_INCLUDE_DIR:PATH=D:\python\python310\Include' '-DPython3_LIBRARY:PATH=D:\python\python310\libs\python310.lib' '-DCMAKE_MAKE_PROGRAM:FILEPATH=C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\ninja\data\bin\ninja' '"-DLLAMA_AVX2=OFF' '-DLLAMA_CUBLAS=ON"' -DCMAKE_BUILD_TYPE:STRING=Release '-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON'
Not searching for unused variables given on the command line.
CMake Warning:
Ignoring extra path from command line:
""-DLLAMA_AVX2=OFF"
-- The C compiler identification is MSVC 19.35.32217.1
-- The CXX compiler identification is MSVC 19.35.32217.1
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: D:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.35.32215/bin/Hostx64/x64/cl.exe - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Found Git: D:/Program Files/Git/cmd/git.exe (found version "2.40.0.windows.1")
fatal: not a git repository (or any of the parent directories): .git
fatal: not a git repository (or any of the parent directories): .git
CMake Warning at vendor/llama.cpp/CMakeLists.txt:109 (message):
Git repository not found; to enable automatic generation of build info,
make sure Git is installed and the project is a Git repository.
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Failed
-- Looking for pthread_create in pthreads
-- Looking for pthread_create in pthreads - not found
-- Looking for pthread_create in pthread
-- Looking for pthread_create in pthread - not found
-- Found Threads: TRUE
-- Found CUDAToolkit: D:/Program Files/CUDA Toolkit/include (found version "11.7.64")
-- cuBLAS found
-- The CUDA compiler identification is NVIDIA 11.7.64
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: D:/Program Files/CUDA Toolkit/bin/nvcc.exe - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- CMAKE_SYSTEM_PROCESSOR: AMD64
-- x86 detected
-- GGML CUDA sources found, configuring CUDA architecture
-- Configuring done (19.2s)
-- Generating done (0.0s)
-- Build files have been written to: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_skbuild/win-amd64-3.10/cmake-build
-- Install configuration: "Release"
-- Installing: C:/Users/Vardogger/AppData/Local/Temp/pip-install-nrzfuccz/llama-cpp-python_bb1daf7ce27d485d8e2db5b40d01b253/_skbuild/win-amd64-3.10/cmake-install/llama_cpp/llama.dll
[1/5] Building C object vendor\llama.cpp\CMakeFiles\ggml.dir\ggml.c.obj
[2/5] Building CXX object vendor\llama.cpp\CMakeFiles\llama.dir\llama.cpp.obj
[3/5] Building CUDA object vendor\llama.cpp\CMakeFiles\ggml.dir\ggml-cuda.cu.obj
ggml-cuda.cu
[4/5] Linking CXX shared library bin\llama.dll
[4/5] Install the project...
copying llama_cpp\llama.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py
copying llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py
copying llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py
copying llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py
creating directory _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server
copying llama_cpp/server\app.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\app.py
copying llama_cpp/server\__init__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__init__.py
copying llama_cpp/server\__main__.py -> _skbuild\win-amd64-3.10\cmake-install\llama_cpp/server\__main__.py
running bdist_wheel
running build
running build_py
creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310
creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
creating _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server
copying _skbuild\win-amd64-3.10\cmake-install\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp
copied 7 files
running build_ext
installing to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
running install
running install_lib
creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64
creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.dll -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_cpp.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\llama_types.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp\server
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\app.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\server\__main__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp\server
copying _skbuild\win-amd64-3.10\setuptools\lib.win-amd64-cpython-310\llama_cpp\__init__.py -> _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp
copied 8 files
running install_egg_info
running egg_info
writing llama_cpp_python.egg-info\PKG-INFO
writing dependency_links to llama_cpp_python.egg-info\dependency_links.txt
writing requirements to llama_cpp_python.egg-info\requires.txt
writing top-level names to llama_cpp_python.egg-info\top_level.txt
reading manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
adding license file 'LICENSE.md'
writing manifest file 'llama_cpp_python.egg-info\SOURCES.txt'
Copying llama_cpp_python.egg-info to _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\.\llama_cpp_python-0.1.55-py3.10.egg-info
running install_scripts
copied 0 files
C:\Users\Vardogger\AppData\Local\Temp\pip-build-env-zzt_1t3c\overlay\Lib\site-packages\wheel\bdist_wheel.py:100: RuntimeWarning: Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
if get_flag("Py_DEBUG", hasattr(sys, "gettotalrefcount"), warn=(impl == "cp")):
creating _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel\llama_cpp_python-0.1.55.dist-info\WHEEL
creating 'C:\Users\Vardogger\AppData\Local\Temp\pip-wheel-oqcck55p\.tmp-mdlva4hn\llama_cpp_python-0.1.55-cp310-cp310-win_amd64.whl' and adding '_skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel' to it
adding 'llama_cpp/__init__.py'
adding 'llama_cpp/llama.dll'
adding 'llama_cpp/llama.py'
adding 'llama_cpp/llama_cpp.py'
adding 'llama_cpp/llama_types.py'
adding 'llama_cpp/server/__init__.py'
adding 'llama_cpp/server/__main__.py'
adding 'llama_cpp/server/app.py'
adding 'llama_cpp_python-0.1.55.dist-info/LICENSE.md'
adding 'llama_cpp_python-0.1.55.dist-info/METADATA'
adding 'llama_cpp_python-0.1.55.dist-info/WHEEL'
adding 'llama_cpp_python-0.1.55.dist-info/top_level.txt'
adding 'llama_cpp_python-0.1.55.dist-info/RECORD'
removing _skbuild\win-amd64-3.10\setuptools\bdist.win-amd64\wheel
Building wheel for llama-cpp-python (pyproject.toml) ... done
Created wheel for llama-cpp-python: filename=llama_cpp_python-0.1.55-cp310-cp310-win_amd64.whl size=289667 sha256=af2b25b6a7ee2f16e2e2bd7e6027f0292ec8d14429f27dc9dbc60b8f0cc79d43
Stored in directory: C:\Users\Vardogger\AppData\Local\Temp\pip-ephem-wheel-cache-bw7hp7lu\wheels\e3\c8\b2\3b99086798b666cdff1000d0995fd164d3eb9db7b7fe4aca09
Successfully built llama-cpp-python
Installing collected packages: typing-extensions, llama-cpp-python
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.6.2
Uninstalling typing_extensions-4.6.2:
Removing file or directory d:\llamastuff\test\venv\lib\site-packages\__pycache__\typing_extensions.cpython-310.pyc
Removing file or directory d:\llamastuff\test\venv\lib\site-packages\typing_extensions-4.6.2.dist-info\
Removing file or directory d:\llamastuff\test\venv\lib\site-packages\typing_extensions.py
Successfully uninstalled typing_extensions-4.6.2
Successfully installed llama-cpp-python-0.1.55 typing-extensions-4.6.2
(venv) D:\llamastuff\test>python -c "from llama_cpp import *; print(llama_print_system_info())"
b'AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | '
I wonder if this issue is what is happening on Linux as well in #272. I keep getting ILLEGAL INSTRUCTION on Linux every time I build a new library with cuBLAS support.
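If it is the same root cause, a quick sanity check on the Linux side is whether the CPU reports avx2 at all while the freshly built library claims AVX2 = 1:

grep -c avx2 /proc/cpuinfo

A count of 0 alongside an AVX2 = 1 build would line up with the illegal instruction.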
Can confirm @chen369's suggestion in #272 lets me compile successfully. Cloning the repo, editing the vendor/llama.cpp CMakeLists.txt to set AVX2 OFF on line 56 and CUBLAS ON on line 70, and doing the pip install + setup from there with FORCE_CMAKE=ON and no other args gives me a working module with:

AVX = 1 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | VSX = 0
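For reference, the two options being flipped in vendor/llama.cpp/CMakeLists.txt look roughly like this after the edit (the description strings here are approximate, from memory; only the option names and ON/OFF values matter):

option(LLAMA_AVX2   "llama: enable AVX2"  OFF)
option(LLAMA_CUBLAS "llama: use cuBLAS"   ON)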
If it turns out to be a llama.cpp bug, though, I'll be quite confused. For me, llama.cpp's cmake automatically detects CUDA and no AVX2 without me telling it anything; it's only when llama-cpp-python is building with -DLLAMA_CUBLAS=ON that it chooses AVX2 and ignores the arg saying not to. Seems more like a problem in the way llama-cpp-python is talking to cmake. But I don't really understand that whole business, so maybe it is!
Okay, I really don't know how I managed it earlier; I don't think the environment variables would have been affecting it even if I accidentally had them activated. But yes, after checking again, it turns out that cmake building llama.cpp with no arguments does not intelligently detect CUDA and no AVX2. So that's an upstream issue.
But cmake building llama.cpp with -DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON works fine, as expected, and CMAKE_ARGS should be getting that across but isn't.
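Looking closer at the configure command in the log above, the quotes from set CMAKE_ARGS="..." actually survive into the arguments handed to cmake (the log shows '"-DLLAMA_AVX2=OFF' and '-DLLAMA_CUBLAS=ON"' as separate tokens), which is presumably what triggers the "Ignoring extra path" warning. cmd keeps quotes written after the = as part of the variable's value, so a minimal sketch of the difference (expected output in the REM comments):

set CMAKE_ARGS="-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"
echo %CMAKE_ARGS%
REM prints: "-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON"  (the quotes are part of the value)

set CMAKE_ARGS=-DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON
echo %CMAKE_ARGS%
REM prints: -DLLAMA_AVX2=OFF -DLLAMA_CUBLAS=ON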