[User] Running perplexity for LLaMA2 with CLBlast segfaults #2736

Closed

KerfuffleV2 opened this issue Aug 23, 2023 · 9 comments

Comments

@KerfuffleV2
Collaborator

Current Behavior

Running perplexity segfaults. Seems like this occurs right at the end of calculating the first block.

Environment and Context

Tested with b8ad1b6, but this issue has been around for a while. Notably, it predates the GGUF merge, so it's not a problem with GGUF or subsequent changes.

  • Physical (or virtual) hardware you are using, e.g. for Linux:

GPU is an AMD Radeon RX 6600.

ggml_opencl: selecting platform: 'AMD Accelerated Parallel Processing'                                                                                               
ggml_opencl: selecting device: 'gfx1030'
ggml_opencl: device FP16 support: true
  • Operating System, e.g. for Linux:

Linux 6.4.11-arch2-1 #1 SMP PREEMPT_DYNAMIC Sat, 19 Aug 2023 15:38:34 +0000 x86_64 GNU/Linux

  • SDK version, e.g. for Linux:

Not sure if it matters, but the CLBlast version is 1.6.1.

Failure Information (for bugs)

Steps to Reproduce

Seems like this happens with LLaMA2 models specifically. I can confirm it definitely happens with openorca-platypus2-13b.ggmlv3.q5_K_M.

Can anyone else with CLBlast + an AMD GPU replicate the issue on a 13B LLaMA2 model?
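For reference, a rough sketch of the kind of invocation involved (the exact model filename, the wikitext path, and the -ngl layer count below are placeholders, not exact values from this report):

# Build with CLBlast support
make clean && make LLAMA_CLBLAST=1

# Run perplexity on the 13B LLaMA2 model with layers offloaded to the GPU;
# the segfault hits right after the first block is computed
./perplexity -m openorca-platypus2-13b.ggmlv3.q5_K_M.bin \
    -f wikitext-2-raw/wiki.test.raw -ngl 43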

Failure Logs

I tried compiling with LLAMA_DEBUG=1 and running in GDB, but the results weren't too helpful: it crashes deep in the AMD libraries doing a memcpy in a separate thread.
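For anyone who wants to dig further, the debug run was roughly along these lines (model and text file paths are placeholders):

# Debug build with CLBlast enabled
make clean && LLAMA_DEBUG=1 make LLAMA_CLBLAST=1

# Run under GDB and dump backtraces for all threads after the segfault,
# since the crash happens in a separate thread inside the AMD libraries
gdb -ex run -ex 'thread apply all bt' --args \
    ./perplexity -m model.bin -f wiki.test.raw -ngl 43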

@ghost

ghost commented Aug 23, 2023

Hi, I don't have the hardware to test your case, but Android with CLBlast shows wildly wrong perplexity values.

hchenphd mentions perplexity so high that his system can't represent the value, which further suggests that ggml/gguf is unrelated.

@KerfuffleV2
Collaborator Author

Hmm, not sure if this is the same thing since my problem seems specific to LLaMA2. I didn't have an issue running it on a LLaMA1 7B model.

@klosax
Contributor

klosax commented Aug 23, 2023

Any difference using other ctx sizes or batch sizes?

@KerfuffleV2
Collaborator Author

Any difference using other ctx sizes or batch sizes?

Good question. Got some weird results:

args               result
-b 512  -c 128     CRASH (with "corrupted double-linked list")
-b 512  -c 512     CRASH
-b 520  -c 520     CRASH
-b 528  -c 528     CRASH
-b 544  -c 544     CRASH
-b 570  -c 570     CRASH (but after time for first block)
-b 576  -c 576     OK
-b 512  -c 1024    CRASH
-b 512  -c 2048    CRASH
-b 640  -c 640     OK
-b 768  -c 768     OK
-b 1024 -c 1024    OK
-b 1024 -c 1026    OK

CRASH by itself means it crashes immediately after (apparently) computing the first block, with no other output. I'm pretty sure this is a memory corruption issue, and memory corruption can break things either immediately or later on, so the fact that some sizes work might not mean those cases are actually okay.
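For what it's worth, the matching-size part of that sweep is easy to redo with a small loop like this (model and text file paths are placeholders):

# Sweep matching batch/context sizes and note which ones survive
for n in 512 520 528 544 570 576 640 768 1024; do
    echo "=== -b $n -c $n ==="
    ./perplexity -m model.bin -f wiki.test.raw -b "$n" -c "$n" -ngl 43 \
        || echo "CRASHED with -b $n -c $n"
done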

@netrunnereve
Collaborator

With Clover and CLBlast I've never had any issues with LLaMA 2 perplexity on my AMD card. I tried running a 13B with some of the ctx/batch sizes that crashed for you and was able to process a couple of blocks with no issues.

ggml_opencl: selecting platform: 'Clover'
ggml_opencl: selecting device: 'AMD Radeon FirePro W8100 (hawaii, LLVM 15.0.7, DRM 3.42, 5.15.0-79-generic)'
ggml_opencl: device FP16 support: false

Does your RX 6600 even support Clover, or is it ROCm OpenCL only? It might be worth a try to see if that fixes things, though keep in mind Clover is known to be slow.

@KerfuffleV2
Collaborator Author

It might be worth a try to see if that fixes things, though keep in mind Clover is known to be slow.

But, but, but I don't want to use the slow thing! :) Also, using CLBlast is actually a lot slower than the ROCm patch (which was finally merged 🎉).

Seems like there might be Clover stuff in Mesa, but it also seems like Clover is on the way out: https://www.phoronix.com/news/Mesa-Delete-Clover-Discussion - it would probably be better to just use the normal OpenCL stuff in Mesa, I would think.

@netrunnereve
Collaborator

Yep, I'm excited for Rusticl - but by then we should have Vulkan support in llama.cpp, and that's the better choice going forward.

@shibe2
Contributor

shibe2 commented Oct 12, 2023

I recently fixed a buffer overflow in the CLBlast backend. Maybe it was the cause here. If someone confirms that my change fixes the crash, I will link this issue when I merge the fix. The overflow may be triggered by certain models and batch sizes, so the known problematic combinations can be used for testing.
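For example, one way to confirm would be to rebuild with the fix included and retry a combination from the table above that previously crashed (model and text file paths, and the -ngl value, are placeholders):

# Rebuild with CLBlast and retry a previously crashing batch/context combination
make clean && make LLAMA_CLBLAST=1
./perplexity -m openorca-platypus2-13b.ggmlv3.q5_K_M.bin \
    -f wiki.test.raw -b 512 -c 512 -ngl 43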

@KerfuffleV2
Collaborator Author

@shibe2 Thanks, I think you already fixed it in one of your previous changes. I can't reproduce the issue anymore even without #3603.
