GGUF Sharded model metadata display might have a memory leak #603

Closed

madgetr opened this issue Apr 3, 2024 · 6 comments

Contributor

madgetr commented Apr 3, 2024

Steps to reproduce

Open a sharded GGUF model in the model inspector: https://huggingface.co/ggml-org/models/tree/main/grok-1?show_tensors=grok-1%2Fgrok-1-q4_0-00001-of-00009.gguf
Wait about 1 minute.
Observe an out-of-memory (OOM) error in the browser.

Specs:
Google Chrome Version 123.0.6312.86 (Official Build) (64-bit)
Windows 11
RAM: 32 GB

Contributor Author

madgetr commented Apr 3, 2024

Confirmed it happens for others

Member

julien-c commented Apr 3, 2024

I think in some cases it might download the full file or something (this was reported on a ggml repo).

Member

ngxson commented Apr 3, 2024

Related to: ggml-org/llama.cpp#6343 (comment)

The file continues to download, which causes the browser to run out of memory.
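To illustrate the failure mode described here, below is a minimal TypeScript sketch of one way to bound such a download: ask for a byte range, copy at most a fixed number of bytes, then cancel the stream so the remainder is never buffered. This is not the fix that was merged for this issue. The function names `readAtMost` and `fetchFirstBytes` are hypothetical, and the sketch assumes the server honors HTTP `Range` requests.

```typescript
// Hypothetical helper: read at most `maxBytes` from a byte stream, then
// cancel the stream so the rest of the body is not pulled into memory.
async function readAtMost(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number
): Promise<Uint8Array> {
  const reader = stream.getReader();
  const out = new Uint8Array(maxBytes);
  let offset = 0;
  try {
    while (offset < maxBytes) {
      const { done, value } = await reader.read();
      if (done || !value) break; // stream ended before maxBytes
      const take = Math.min(value.byteLength, maxBytes - offset);
      out.set(value.subarray(0, take), offset);
      offset += take;
    }
  } finally {
    // Stop the producer; ignore errors from an already-failed stream.
    await reader.cancel().catch(() => undefined);
  }
  return out.subarray(0, offset);
}

// Hypothetical helper: fetch only the first `maxBytes` of a remote file.
// The Range header asks the server for a partial response; the byte cap in
// readAtMost still protects the caller if the server ignores it and
// replies with the whole file.
async function fetchFirstBytes(url: string, maxBytes: number): Promise<Uint8Array> {
  const controller = new AbortController();
  try {
    const res = await fetch(url, {
      headers: { Range: `bytes=0-${maxBytes - 1}` },
      signal: controller.signal,
    });
    if (!res.ok || !res.body) {
      throw new Error(`unexpected response: ${res.status}`);
    }
    return await readAtMost(res.body, maxBytes);
  } finally {
    // Abort any transfer still in flight once we have what we need.
    controller.abort();
  }
}
```

The point of the design is that the cap is enforced on the client side as well as requested from the server, so a response that keeps streaming cannot grow memory use beyond `maxBytes`.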

Contributor Author

madgetr commented Apr 3, 2024

Thanks @ngxson I did not realize that you had already found this 👍

Collaborator

mishig25 commented Apr 24, 2024

Fixed by #634.

mishig25 closed this as completed
