Description
Build scripts support many types of hardware acceleration: https://github.com/abetlen/llama-cpp-python/blob/v0.2.11/README.md#installation-with-hardware-acceleration
```bash
# cuBLAS
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

# Metal
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
```
If one is missing OS-level dependencies, the errors thrown can be somewhat arcane to those not in the know. The request is to nice-ify these messages by providing context and/or solutions.
For example, when installing on an AWS EC2 `g4dn.4xlarge` without `nvcc` (installed via `nvidia-cuda-toolkit`):

```
CMake Error at /tmp/pip-build-env-xoh3dnhj/normal/lib/python3.11/site-packages/cmake/data/share/cmake-3.27/Modules/CMakeDetermineCUDACompiler.cmake:603 (message):
  Failed to detect a default CUDA architecture.
```
This can be nice-ified to be:
```
Failed to detect a default CUDA architecture.
cuBLAS backend failed: do you have nvcc installed?
```
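One possible way to get there (a hypothetical sketch on my part, untested; the guard and the `NVCC_EXECUTABLE` variable name are mine): probe for `nvcc` with `find_program` before CUDA compiler detection runs, and fail early with the hint attached:

```cmake
# Hypothetical guard (untested): probe for nvcc up front so the cuBLAS
# build can fail with a useful hint instead of an arcane CMake error.
if (LLAMA_CUBLAS)
    find_program(NVCC_EXECUTABLE nvcc)
    if (NOT NVCC_EXECUTABLE)
        message(FATAL_ERROR
            "cuBLAS backend failed: do you have nvcc installed?\n"
            "On Ubuntu/Debian it is provided by the nvidia-cuda-toolkit package.")
    endif()
endif()
```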
Or when accidentally pasting in the wrong `CMAKE_ARGS` (here I mixed up Metal and cuBLAS):

```
CMake Error at vendor/llama.cpp/CMakeLists.txt:174 (find_library):
  Could not find FOUNDATION_LIBRARY using the following names: Foundation
```
This can be nice-ified to be:
```
Could not find FOUNDATION_LIBRARY using the following names: Foundation
Metal backend failed: are you on a Mac?
```
In Python, `Exception`s can be re-raised using `from`: `raise Exception("Nice message.") from arcane_exc`. I am not sure if this is something supported by `CMakeLists.txt`, but I think it would be really helpful to provide better failure messages for the hardware acceleration specifics.
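CMake does have the building blocks for something similar: the Metal error above looks like the standard failure from a `find_library(... REQUIRED)` call, and dropping `REQUIRED` to check the result manually would let the build append its own hint. A rough, untested sketch:

```cmake
# Sketch (untested): reproduce find_library's REQUIRED failure message,
# but with a friendlier second line appended.
if (LLAMA_METAL)
    find_library(FOUNDATION_LIBRARY Foundation)
    if (NOT FOUNDATION_LIBRARY)
        message(FATAL_ERROR
            "Could not find FOUNDATION_LIBRARY using the following names: Foundation\n"
            "Metal backend failed: are you on a Mac?")
    endif()
endif()
```

That keeps the original diagnostic intact while adding context, much like `raise ... from ...` in Python.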
For context, these errors were encountered in ggml-org/llama.cpp#3459.