Vulkan Apple Silicon compatibility #5322

Closed
wants to merge 2 commits into from

@rbourgeat rbourgeat commented Feb 4, 2024

I just applied the suggestion I made to @0cc4m in #2059 (comment).

Tested with the mistral-7b-instruct-v0.1.Q4_K_M.gguf model on a Mac M1 Max (32 GB RAM):

$ LLAMA_VULKAN=1 make
[...]
$ ./main -m ./models/mistral-7b-instruct-v0.1.Q4_K_M.gguf -n 128 -ngl 1 --repeat_penalty 1.1 --color -i
[...]
llama_print_timings:        load time =     291,56 ms
llama_print_timings:      sample time =       2,61 ms /    22 runs   (    0,12 ms per token,  8422,66 tokens per second)
llama_print_timings: prompt eval time =       0,00 ms /     1 tokens (    0,00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =    1670,33 ms /    23 runs   (   72,62 ms per token,    13,77 tokens per second)
llama_print_timings:       total time =    1934,57 ms /    24 tokens

And without Vulkan:

llama_print_timings:        load time =     301.09 ms
llama_print_timings:      sample time =       3.30 ms /    27 runs   (    0.12 ms per token,  8184.30 tokens per second)
llama_print_timings: prompt eval time =       0.00 ms /     1 tokens (    0.00 ms per token,      inf tokens per second)
llama_print_timings:        eval time =    1862.61 ms /    28 runs   (   66.52 ms per token,    15.03 tokens per second)
llama_print_timings:       total time =    2857.50 ms /    29 tokens
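To compare the two runs on equal terms, the eval throughput can be recomputed from the raw `eval time` lines. The following is a minimal sketch, not part of this PR: the helper name `parse_eval_line` is hypothetical, and it assumes the log format shown above, accepting either `,` or `.` as the decimal separator since the two runs were printed under different locales.

```python
import re

# Hypothetical helper (not part of llama.cpp): parses one "eval time" line
# from the llama_print_timings output quoted above.
EVAL_RE = re.compile(r"eval time\s*=\s*([\d.,]+)\s*ms\s*/\s*(\d+)\s*runs")


def parse_eval_line(line: str) -> dict:
    """Return total time, run count, and derived rates for an eval-time line."""
    match = EVAL_RE.search(line)
    if match is None:
        raise ValueError(f"not an eval-time line: {line!r}")
    # Normalize a decimal comma (first log) to a decimal point (second log).
    total_ms = float(match.group(1).replace(",", "."))
    runs = int(match.group(2))
    return {
        "total_ms": total_ms,
        "runs": runs,
        "ms_per_token": total_ms / runs,
        "tokens_per_second": runs * 1000.0 / total_ms,
    }


# The two eval-time lines copied from the logs above.
vulkan = parse_eval_line(
    "llama_print_timings:        eval time =    1670,33 ms /    23 runs"
)
baseline = parse_eval_line(
    "llama_print_timings:        eval time =    1862.61 ms /    28 runs"
)

print(f"Vulkan:   {vulkan['tokens_per_second']:.2f} tokens/s")
print(f"Baseline: {baseline['tokens_per_second']:.2f} tokens/s")
print(f"Ratio:    {vulkan['tokens_per_second'] / baseline['tokens_per_second']:.3f}")
```

On these two samples the Vulkan build generates roughly 8% fewer tokens per second than the non-Vulkan build. Each figure comes from a single short run (23 and 28 tokens), so the gap may be within run-to-run noise.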

@0cc4m
Collaborator

0cc4m commented Feb 4, 2024

You're a little late, #5311 is working on this feature as well. Should be the same changes needed for both of your use cases, shouldn't it?

@rbourgeat
Author

> You're a little late, #5311 is working on this feature as well. Should be the same changes needed for both of your use cases, shouldn't it?

True! I didn’t see it 😂 You can close this MR 👌🏻

@dokterbob dokterbob mentioned this pull request Feb 5, 2024
@cebtenzzre cebtenzzre closed this Feb 5, 2024
@rbourgeat rbourgeat deleted the vulkan-apple-silicon branch February 5, 2024 03:18
3 participants