How to install with GPU support via cuBLAS and CUDA #250

@DavidBurela

Submitting and closing this to help anyone else searching for how to solve it. I am including my error message because that is where I got stuck, with no results found on the web.
I have also captured the exact step-by-step instructions in this ReadMe: https://github.com/DavidBurela/edgellm#edgellm

Install CUDA toolkit

You need to have the CUDA toolkit installed, because nvcc and related tools must be on your PATH for the build to compile correctly when you install via
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
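Before running the install command above, it can help to confirm that nvcc is actually visible on your PATH. The following is a minimal sketch using only the Python standard library; the function name find_nvcc is hypothetical and not part of llama-cpp-python.

```python
import shutil
import subprocess


def find_nvcc():
    """Return the full path to nvcc if it is on PATH, otherwise None."""
    return shutil.which("nvcc")


if __name__ == "__main__":
    nvcc_path = find_nvcc()
    if nvcc_path is None:
        print("nvcc not found on PATH; install the CUDA toolkit or fix PATH first.")
    else:
        print(f"nvcc found at {nvcc_path}")
        # Show the toolkit version that the build would pick up.
        result = subprocess.run(
            [nvcc_path, "--version"], capture_output=True, text=True
        )
        print(result.stdout)
```

If this reports that nvcc is missing, the pip install is unlikely to build with cuBLAS support.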

Ensure you install the correct version of CUDA toolkit

When I installed with cuBLAS support and tried to run, I got this error:
the provided PTX was compiled with an unsupported toolchain.

I was able to pin the root cause down to the installed CUDA Toolkit version being newer than what my GPU drivers supported.
Run nvidia-smi and note which CUDA version is shown as supported in the top right.
Here my GPU drivers support 12.0, so I can install CUDA toolkit 12.0.1.
[Screenshot: nvidia-smi output showing the supported CUDA version in the top right]
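The comparison described above can be sketched in Python. This is only an illustration of the rule of thumb from this post (the toolkit should not be newer than the CUDA version the driver reports). The function names and the two sample strings are assumptions made up for the example; to use real values, paste in the output of nvidia-smi and nvcc --version from your own machine.

```python
import re

# Illustrative samples only; replace with the real output from your machine.
SAMPLE_NVIDIA_SMI = "| NVIDIA-SMI 525.85.12  Driver Version: 525.85.12  CUDA Version: 12.0 |"
SAMPLE_NVCC = "Cuda compilation tools, release 12.1, V12.1.105"


def driver_cuda_version(nvidia_smi_output):
    """Extract (major, minor) from the 'CUDA Version: X.Y' field of nvidia-smi."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", nvidia_smi_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_cuda_version(nvcc_output):
    """Extract (major, minor) from the 'release X.Y' field of nvcc --version."""
    match = re.search(r"release\s+(\d+)\.(\d+)", nvcc_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def toolkit_is_supported(driver, toolkit):
    """True when the toolkit version is not newer than the driver's CUDA version."""
    return toolkit <= driver


if __name__ == "__main__":
    driver = driver_cuda_version(SAMPLE_NVIDIA_SMI)
    toolkit = toolkit_cuda_version(SAMPLE_NVCC)
    if toolkit_is_supported(driver, toolkit):
        print(f"Toolkit {toolkit} is within what the driver supports {driver}.")
    else:
        print(f"Toolkit {toolkit} is newer than the driver supports {driver}.")
```

With the sample strings above, the toolkit (12.1) is newer than what the driver reports (12.0), which is the situation that produced the PTX error for me.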

Download & install the correct version

Direct download and install

https://developer.nvidia.com/cuda-toolkit-archive

Conda

If you are using Conda you can also download it directly into your environment

conda create -n condaexample python=3.11  # use a later Python version if needed
conda activate condaexample
# Full list at https://anaconda.org/nvidia/cuda-toolkit
# Pick the label that matches the CUDA version your driver supports (see nvidia-smi)
conda install -c "nvidia/label/cuda-12.1.1" cuda-toolkit

Enable in code

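# Assumption: LlamaCpp is the LangChain wrapper; the import path varies by LangChain version
from langchain.llms import LlamaCpp

# CPU only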
model = LlamaCpp(model_path="./models/model.bin", verbose=True, n_threads=8)

# GPU. Must specify number of layers to load into VRAM
model = LlamaCpp(model_path="./models/model.bin", verbose=True, n_threads=8, n_gpu_layers=20)
