Name and Version
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 1 CUDA devices:
Device 0: NVIDIA vGPU-32GB, compute capability 8.9, VMM: yes
version: 4954 (3cd3a39)
built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
Operating systems
Linux
GGML backends
CUDA
Hardware
4080S 32G
Models
No response
Problem description & steps to reproduce
If you compile llama.cpp with CUDA enabled, llama-llava-clip-quantize-cli crashes when quantizing the vision part of the CLIP model. After some investigation, the failing code is shown in the figure below.
This is most likely caused by the quantization code trying to access tensor data that resides in GPU backend memory. The tool only runs after being rebuilt as a CPU-only backend version. Has anyone else encountered this problem?
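The CPU-only rebuild workaround described above can be sketched roughly as follows. The cmake invocation uses the standard llama.cpp build flags; the argument order for llama-llava-clip-quantize-cli (input mmproj, output path, quantization type id) and the file names are my assumptions for illustration, not taken from this report:

```shell
# Rebuild llama.cpp without the CUDA backend, so CLIP tensors stay in
# host memory where the quantization code can read them directly.
cmake -B build-cpu -DGGML_CUDA=OFF
cmake --build build-cpu --config Release -j

# Quantize the CLIP/vision projector with the CPU-only binary.
# Arguments assumed: <input mmproj> <output mmproj> <ggml type id>
# (2 corresponds to Q4_0 in the ggml_type enum).
./build-cpu/bin/llama-llava-clip-quantize-cli \
    mmproj-model-f16.gguf mmproj-model-q4_0.gguf 2
```

With the CUDA build, the same invocation crashes; rebuilding with `-DGGML_CUDA=OFF` sidesteps the GPU-memory access rather than fixing the underlying bug.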
First Bad Commit
No response
Relevant log output