Open
Description
In testing with --enable-avx512, we (@boegel and I) found that for one simple example:
curl -OL http://micro.stanford.edu/mediawiki/images/a/a9/Simple_example.tar
tar xfv Simple_example.tar
cd simple_example
sed -i'' 's/\(N[01] =\) [0-9]*/\1 16384/g' simple_example.c
gcc -O2 -march=native simple_example.c -lfftw3 -lm -o simple_example
the AVX-512 build of FFTW 3.3.7 is consistently slower than the AVX2 build of FFTW 3.3.7, by about a factor of 1.75 (tested on multiple Skylake varieties).
See here for more details:
easybuilders/easybuild-easyblocks#1416 (comment)
It seems that the AVX-512 operations used by FFTW are simply more expensive on their own, and that the slowdown is not caused by CPU frequency issues.
Is this just because FFTW's AVX-512 support was written before these chips were available, so it could not be benchmarked at the time?
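For anyone who wants to reproduce the comparison, here is a minimal sketch of a best-of-N timing harness that could be wrapped around the transform in each build. It is not taken from simple_example.c: the names `workload_fn`, `best_of`, `slowdown`, and `dummy_workload` are hypothetical, and the dummy loop only stands in for a wrapper that would call `fftw_execute()` on a prepared plan.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: one run of the workload being measured,
   e.g. a wrapper that calls fftw_execute() on a prepared plan. */
typedef void (*workload_fn)(void *ctx);

/* Monotonic wall-clock time in seconds. */
static double now_seconds(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Run the workload `runs` times and return the fastest run in seconds.
   Returns -1.0 if runs < 1. */
static double best_of(int runs, workload_fn fn, void *ctx)
{
    double best = -1.0;
    for (int i = 0; i < runs; i++) {
        double t0 = now_seconds();
        fn(ctx);
        double dt = now_seconds() - t0;
        if (best < 0.0 || dt < best)
            best = dt;
    }
    return best;
}

/* Slowdown of build A relative to build B (a value above 1 means A is slower). */
static double slowdown(double t_a, double t_b)
{
    return t_a / t_b;
}

/* Stand-in workload (assumption): adds 0..99999 into *ctx.
   Replace with the real transform when benchmarking. */
static void dummy_workload(void *ctx)
{
    volatile double *acc = ctx;
    for (int i = 0; i < 100000; i++)
        *acc += (double)i;
}
```

Taking the fastest of several runs reduces noise from other processes. Building the same harness once against each FFTW build and passing the two best times to `slowdown()` gives the ratio reported above.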