Skip to content

Vulkan: fix NaN in tanh.comp with AMD proprietary driver on Windows #10723

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Dec 8, 2024

Conversation

stduhpf
Copy link
Contributor

@stduhpf stduhpf commented Dec 8, 2024

I noticed when running test-backend-ops that the TANH op can sometimes output NaNs with Vulkan. I didn't experience any issues because of it, but it was a simple fix, that should hopefuly not cause any noticable slowdown.

Master:

> .\build\bin\Release\test-backend-ops.exe  | Select-String -Pattern "TANH"
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
ggml_vulkan: Compiling shaders..........................Done!

  TANH(type=f32,ne_a=[128,2,2,2],v=0): [TANH] NaN at index 0 (Vulkan0=-nan(ind) CPU=-1.000000) FAIL
  TANH(type=f32,ne_a=[5,7,11,13],v=0): [TANH] NaN at index 2 (Vulkan0=-nan(ind) CPU=1.000000) FAIL
  TANH(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
  TANH(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]

PR:

>.\build\bin\Release\test-backend-ops.exe  | Select-String -Pattern "TANH"
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon RX 5700 XT (AMD proprietary driver) | uma: 0 | fp16: 1 | warp size: 64 | matrix cores: none
ggml_vulkan: Compiling shaders..........................Done!

  TANH(type=f32,ne_a=[128,2,2,2],v=0): OK
  TANH(type=f32,ne_a=[5,7,11,13],v=0): OK
  TANH(type=f32,ne_a=[128,2,2,2],v=1): not supported [Vulkan0]
  TANH(type=f32,ne_a=[5,7,11,13],v=1): not supported [Vulkan0]

@github-actions github-actions bot added Vulkan Issues specific to the Vulkan backend ggml changes relating to the ggml tensor library for machine learning labels Dec 8, 2024
@0cc4m
Copy link
Collaborator

0cc4m commented Dec 8, 2024

We've had this issue before (#5260), last time I fixed it by replacing tanh(x) with 1 - 2 / (e^(2x) + 1). Not sure which version is more performant.

@stduhpf
Copy link
Contributor Author

stduhpf commented Dec 8, 2024

It looks like doing 1 - 2 / (e^(2x) + 1) is faster, at least on my end: https://www.shadertoy.com/view/MfGBzR

Edit: Actually, it looks like it's even faster than built-in tanh? 🤔 Maybe webGLSL isn't such a good way to test performance....

@0cc4m
Copy link
Collaborator

0cc4m commented Dec 8, 2024

It looks like doing 1 - 2 / (e^(2x) + 1) is faster, at least on my end: https://www.shadertoy.com/view/MfGBzR

Edit: Actually, it looks like it's even faster than built-in tanh? 🤔 Maybe webGLSL isn't such a good way to test performance....

I think it's unlikely that any of these versions is slow enough to cause an issue, looks good, thank you for the fix.

This just leaves the q2_k and q3_k MMV issue on Windows AMD, all the other unit test failures are barely above the threshold.

@0cc4m 0cc4m merged commit 06d7014 into ggml-org:master Dec 8, 2024
2 checks passed
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Dec 20, 2024
tinglou pushed a commit to tinglou/llama.cpp that referenced this pull request Feb 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning Vulkan Issues specific to the Vulkan backend
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants