Added all CPU to Docker GPU images for 'token_embd.weight' compatibility #12749
slaren
left a comment
Looks good, but if any of these images are intended to be used on Arm (maybe Vulkan?), it would need the same logic as the CPU image to disable GGML_BACKEND_DL on Arm builds.
As far as I can tell, only the CPU images are built for ARM; all GPU images are built for AMD64.
0c48a07 to 835d9e4
@aubinkure how can you trace it back to this PR if it is more recent than the issues you referred to?
You're right, it seems this PR isn't related to the issues I mentioned. That said, I double-checked, and this PR is definitely breaking quantization for me inside a Docker container. I'll open a new issue with steps to reproduce.
Issue #12500: CUDA Docker images are crashing with old GPUs, and maybe more recent ones. token_embd.weight is processed by the CPU, and since BMI2 was added, this causes the program to crash due to compatibility issues.
It was recommended to add all CPU variants to all GPU images, since this gives the GPU images broader CPU compatibility.
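To illustrate the architecture check raised in review, here is a minimal sketch of how a build script might pick CMake flags per target architecture. The function name `cmake_flags` and the exact flag set are assumptions for illustration; only `GGML_BACKEND_DL` is named in this thread, and `GGML_CPU_ALL_VARIANTS` and `GGML_NATIVE` are assumed to be the options behind "all CPU variants". This is not the PR's actual Dockerfile logic.

```python
from __future__ import annotations


def cmake_flags(target_arch: str) -> list[str]:
    """Return illustrative CMake flags for an image build (hypothetical helper)."""
    # Assumption: native optimizations are off so the image runs on other hosts.
    flags = ["-DGGML_NATIVE=OFF"]
    if target_arch == "amd64":
        # Assumption: dynamic backend loading plus all CPU variants lets the
        # runtime pick a CPU backend that matches the host (e.g. without BMI2).
        flags += ["-DGGML_BACKEND_DL=ON", "-DGGML_CPU_ALL_VARIANTS=ON"]
    elif target_arch == "arm64":
        # Per the review comment, GGML_BACKEND_DL is left disabled on Arm builds.
        pass
    else:
        raise ValueError(f"unsupported architecture: {target_arch}")
    return flags


if __name__ == "__main__":
    for arch in ("amd64", "arm64"):
        print(arch, " ".join(cmake_flags(arch)))
```

Keeping the decision in one function makes it easy to apply the same rule to every image, should a GPU image ever be built for Arm.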