
AI Labs Service doesn't use GPU #3431

@gastoner

Description


Bug description

Hi.

I started to use Podman as a replacement for Docker. I really love it.
I also discovered that there is AI Lab for hosting models, so I started using it. Everything works except the GPU support. When I create a service from a model, it pulls the image and starts a container. When I then use the container to chat with a model, it only uses the CPU.


I have followed the steps on https://podman-desktop.io/docs/podman/gpu and got this output:


But when I create a service from a model, it doesn't use the GPU.
The same happens when I start the Llama Stack and use it: it doesn't use the GPU either.
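For reference, GPU passthrough can be sanity-checked from a terminal before involving AI Lab. This is only a sketch based on the linked docs, assuming the NVIDIA Container Toolkit is available inside the Podman machine; the CUDA image tag and output path are illustrative:

```shell
# Inside the Podman machine: generate a CDI spec for the NVIDIA GPU
# (conventional output path; requires the NVIDIA Container Toolkit)
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

# List the CDI device names Podman can reference
nvidia-ctk cdi list

# Check whether a container actually sees the GPU
podman run --rm --device nvidia.com/gpu=all \
  docker.io/nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi
```

If `nvidia-smi` in the last step fails with "no CUDA-capable device is detected" (as in the log below), the problem is in the GPU passthrough setup rather than in AI Lab itself.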

Operating system

Windows 11

Installation Method

from Podman-Desktop extension page

Version

1.2.x

Steps to reproduce

Create a service from a model in AI Lab and try to use it.

Relevant log output

ggml_cuda_init: failed to initialize CUDA: no CUDA-capable device is detected
build: 5985 (3f4fc97f) with cc (GCC) 12.2.1 20221121 (Red Hat 12.2.1-7) for x86_64-redhat-linux
system info: n_threads = 8, n_threads_batch = 8, total_threads = 16

system_info: n_threads = 8 (n_threads_batch = 8) / 16 | CUDA : ARCHS = 500,610,700,750,800,860,890 | USE_GRAPHS = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 | 

main: binding port with default address family
main: HTTP server is listening, hostname: 0.0.0.0, port: 8000, http threads: 15
main: loading m

Additional context

No response
