llama.cpp compiled and run with MPI outputs garbage. #3099
Comments
It looks like it isn't really using MPI at all, and simply ran 3 separate instances at once. Just to make sure, it would help if you posted the exact command you used to build it.
I can't exactly reproduce that problem, but I found a related one; possibly it's the same problem with different symptoms for me. I'll update when I figure it out. @ggerganov With MPI, the
Thanks @staviq. One other problem I noticed was that the MPI run was very slow, even when localhost was the only participant.
MPI was originally added to help with RAM requirements: if you don't have enough RAM, you can offload part of the workload to another machine, so the model fits in the combined RAM. Compared to running out of memory and swapping to disk on a single machine, MPI is faster, but you exchange one bottleneck for another, in this case network sockets.

Currently, fine-grained arithmetic is parallel, but the model itself is still somewhat sequential in nature, so MPI here was never about proper distributed parallelization: model "chunks" still run one after the other, because each one takes its input from the output of the previous one (generally speaking).

There was some talk here recently about further parallelization, but I don't think anything like it has been implemented yet. The main limitation still seems to be the construction of the models themselves.
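To illustrate the point above, here is a toy sketch in plain Python (not llama.cpp code; the function names and timings are hypothetical) of why splitting a model into sequential chunks across MPI ranks saves RAM per machine but does not reduce per-token latency: each chunk must wait for the previous chunk's output, so the latencies add up, plus the network hops in between.

```python
# Toy illustration of pipeline-style model splitting. Everything here is
# hypothetical; it only models the timing argument, not real inference.

def run_chunk(chunk_id: int, hidden: int) -> int:
    # Stand-in for running one chunk of transformer layers on one rank.
    return hidden + 1  # pretend computation

def generate_token(num_chunks: int, comm_cost: float, chunk_cost: float):
    """Per-token latency when chunks run one after the other."""
    latency = 0.0
    hidden = 0
    for chunk in range(num_chunks):
        hidden = run_chunk(chunk, hidden)   # must wait for previous chunk
        latency += chunk_cost               # compute time for this chunk
        if chunk < num_chunks - 1:
            latency += comm_cost            # send activations to next rank
    return hidden, latency

# One machine holding the whole model: no communication cost.
_, single = generate_token(num_chunks=1, comm_cost=0.0, chunk_cost=3.0)

# Three ranks, each holding a third of the model: same total compute,
# plus two network hops per token.
_, pipelined = generate_token(num_chunks=3, comm_cost=0.5, chunk_cost=1.0)

print(single)     # 3.0
print(pipelined)  # 4.0 -> slower per token, but each rank needs 1/3 the RAM
```

The takeaway matches the comment above: the chunks form a pipeline, not a parallel fan-out, so adding ranks trades memory pressure for communication latency.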
This issue was closed because it has been inactive for 14 days since being marked as stale. |
Cmd
mpirun --prefix /usr/lib64/openmpi --hostfile /root/hostfile -n 3 ./main -m /var/cache/hf/llama-2-7b.Q4_K_M.gguf -p "How to build a website?" -n 10 -e
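For reference, building with MPI support at the time this issue was filed went through the Makefile build with the `LLAMA_MPI` flag (a sketch assuming OpenMPI's compiler wrappers are on `PATH`; paths and MPI distribution may differ on your system):

```shell
# Rebuild llama.cpp with MPI enabled. Without LLAMA_MPI=1, mpirun simply
# launches N independent copies of ./main, which looks like garbage output.
make clean
make CC=mpicc CXX=mpicxx LLAMA_MPI=1
```

If the binary was built without `LLAMA_MPI=1`, the `mpirun -n 3 ./main ...` invocation above would start three unrelated instances, which is consistent with the first comment in this thread.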
Output