I've tried MultiFloats.jl on the GPU, but I'm seeing a loss of precision compared to the CPU:
using CUDA, MultiFloats
A = rand(Float64x8, 100, 100)
B = rand(Float64x8, 100, 100)
A * B - Array(CuArray(A) * CuArray(B))
This gives me:
100×100 Matrix{MultiFloat{Float64, 8}}:
-1.23827e-98 9.35263e-99 -8.83181e-99 … -4.70324e-99 -1.3348e-98
-1.98421e-99 8.20389e-99 1.67043e-98 1.45499e-98 2.32225e-98
-2.77264e-99 -3.30951e-99 1.32426e-98 -1.09181e-98 7.84157e-100
1.92544e-98 6.35776e-99 -8.85547e-99 1.29435e-98 -4.89252e-99
-5.52038e-99 5.35901e-99 -3.705e-98 1.53947e-99 7.38954e-99
-2.16904e-98 1.64505e-98 -1.16536e-98 … -3.19036e-98 7.5397e-99
6.72487e-98 6.07349e-99 -2.87359e-98 ...
but eps(Float64x8) is 5.9091063153828709e-126.
What explains this? Is it the order of iteration?
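To narrow down which side is losing precision, one option might be to compare both results against a BigFloat reference instead of against each other. The snippet below is only a sketch. It assumes that Float64x8 values can be converted with BigFloat(x), so that broadcasting BigFloat.(A) works, and relerr is a helper name made up for this example.

```julia
using CUDA, MultiFloats

# Assumption: 1024 bits is comfortably more than Float64x8 carries
# (eps(Float64x8) is about 2^-416).
setprecision(BigFloat, 1024)

A = rand(Float64x8, 100, 100)
B = rand(Float64x8, 100, 100)

# High-precision reference product, computed on the CPU.
ref = BigFloat.(A) * BigFloat.(B)

cpu = A * B
gpu = Array(CuArray(A) * CuArray(B))

# Hypothetical helper: largest elementwise relative error against the reference.
relerr(X) = maximum(abs.(BigFloat.(X) .- ref) ./ abs.(ref))

@show relerr(cpu)
@show relerr(gpu)
@show eps(Float64x8)
```

If relerr(cpu) stays within a small multiple of eps(Float64x8) while relerr(gpu) is many orders of magnitude larger, that would suggest the loss comes from the arithmetic on the GPU and not just from a different summation order.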