Description
Submitting and closing this issue to help anyone else searching for how to solve it. I'm including my error message, as that is where I was stuck with no results found on the web.
I have also captured exact step-by-step instructions in this ReadMe: https://github.com/DavidBurela/edgellm#edgellm
Install CUDA toolkit
You need to ensure you have the CUDA toolkit installed, as you need nvcc etc. on your PATH to compile correctly when you install via:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python
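After installing the toolkit, it is worth confirming that nvcc is actually reachable before building (a quick sanity check, not part of the original steps; the command ships with any CUDA toolkit install):
nvcc --version
The reported release should match the toolkit version you choose in the next step.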
Ensure you install the correct version of CUDA toolkit
When I installed with cuBLAS support and tried to run, I would get this error:
the provided PTX was compiled with an unsupported toolchain.
I was able to pin down the root cause: the installed CUDA Toolkit version was newer than what my GPU drivers supported.
Run nvidia-smi and note what version of CUDA is supported in the top right.
Here my GPU drivers support 12.0, so I can install CUDA toolkit 12.0.1
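For reference, the value appears in the header line of the nvidia-smi output; the line below is only illustrative and your driver/CUDA numbers will differ:
NVIDIA-SMI 525.105.17    Driver Version: 525.105.17    CUDA Version: 12.0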
Download & install the correct version
Direct download and install
https://developer.nvidia.com/cuda-toolkit-archive
Conda
If you are using Conda, you can also install the toolkit directly into your environment:
conda create -n condaexample python=3.11  # use a later Python version if needed
conda activate condaexample
# Full list at https://anaconda.org/nvidia/cuda-toolkit
conda install -c "nvidia/label/cuda-12.1.1" cuda-toolkit
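If llama-cpp-python was already built against a mismatched toolkit, it may also need a clean rebuild once the correct toolkit is on your PATH. This is a suggested extra step rather than part of the original write-up; the flags are standard pip options:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install --force-reinstall --no-cache-dir llama-cpp-python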
Enable in code
from langchain.llms import LlamaCpp  # LlamaCpp wrapper around llama-cpp-python

# CPU only
model = LlamaCpp(model_path="./models/model.bin", verbose=True, n_threads=8)

# GPU: must specify the number of layers to offload into VRAM
model = LlamaCpp(model_path="./models/model.bin", verbose=True, n_threads=8, n_gpu_layers=20)
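As a quick sanity check (an illustrative call, assuming the LangChain LlamaCpp wrapper imported above), prompting the GPU-enabled model should show layers being offloaded in the verbose llama.cpp output:
print(model("Q: Name the planets in the solar system. A:"))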