Can't make it work with Phi-3-mini and OpenCL #970

Closed
@0wwafa

https://huggingface.co/ZeroWw/Phi-3-mini-128k-instruct-GGUF/blob/main/Phi-3-mini-128k-instruct.q5_k.gguf

llm_load_tensors: ggml ctx size =    0.24 MiB
llm_load_tensors: offloading 4 repeating layers to GPU
llm_load_tensors: offloaded 4/33 layers to GPU
llm_load_tensors:        CPU buffer size =  2918.26 MiB
llm_load_tensors:     OpenCL buffer size =   324.19 MiB
......................................................................................
Automatic RoPE Scaling: Using (scale:1.000, base:10000.0).
llama_new_context_with_model: n_ctx      = 8288
llama_new_context_with_model: n_batch    = 512
llama_new_context_with_model: n_ubatch   = 512
llama_new_context_with_model: flash_attn = 0
llama_new_context_with_model: freq_base  = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_kv_cache_init:        CPU KV buffer size =  3108.00 MiB
llama_new_context_with_model: KV self size  = 3108.00 MiB, K (f16): 1554.00 MiB, V (f16): 1554.00 MiB
llama_new_context_with_model:        CPU  output buffer size =     0.12 MiB
llama_new_context_with_model:        CPU compute buffer size =   570.19 MiB
llama_new_context_with_model: graph nodes  = 1286
llama_new_context_with_model: graph splits = 1
Traceback (most recent call last):
  File "koboldcpp.py", line 3783, in <module>
  File "koboldcpp.py", line 3445, in main
  File "koboldcpp.py", line 444, in load_model
OSError: exception: access violation reading 0x000000000510D000
[14532] Failed to execute script 'koboldcpp' due to unhandled exception!

The model fails to load with the access violation above when OpenCL offloading is enabled. If I don't use OpenCL, it loads and works.
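As a rough sanity check on the log, the reported KV cache size can be reproduced from `n_ctx = 8288` and the f16 cache type. This is only a sketch: the layer count (32) and per-layer K/V width (3072) are assumptions taken from the published Phi-3-mini architecture, not values printed in the log, and `kv_cache_mib` is a hypothetical helper, not part of koboldcpp.

```python
def kv_cache_mib(n_ctx, n_layer, n_embd_kv, bytes_per_elem=2):
    """Return (K, V, total) sizes in MiB for an unquantized KV cache.

    Assumes K and V each store n_embd_kv values per token per layer,
    and that f16 uses 2 bytes per element.
    """
    per_tensor = n_ctx * n_layer * n_embd_kv * bytes_per_elem / (1024 ** 2)
    return per_tensor, per_tensor, 2 * per_tensor


# n_ctx comes from the log; n_layer and n_embd_kv are assumed (Phi-3-mini).
k, v, total = kv_cache_mib(n_ctx=8288, n_layer=32, n_embd_kv=3072)
print(f"K: {k:.2f} MiB, V: {v:.2f} MiB, total: {total:.2f} MiB")
# prints "K: 1554.00 MiB, V: 1554.00 MiB, total: 3108.00 MiB"
```

Under those assumptions the numbers match the `KV self size` line, so the cache allocation itself looks consistent; with only 4 of 33 layers offloaded, the KV cache stays in CPU memory.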
