Closed
Description
Usually I use the POWER8 platform but today I found a bad error in the x86-64 code for the DTRMV routine.
I want to compute x <- A*x, where
- A is nonunit upper triangular,
- N = 1...64,
- LDA = 64,
- and INCX=1
and get wrong results on a Haswell (16 cores, Xeon CPU E5-2640 v3) based system using gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) if OpenBLAS is compiled via
make USE_OPENMP=1
I compared the results with the Netlib implementation and obtained:
Correct RESULT: DTRMV(U,N,N, 1, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 2, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 3, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 4, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 5, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 6, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 7, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 8, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 9, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 10, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 11, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 12, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 13, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 14, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 15, A, 64, X, 1) MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N, 16, A, 64, X, 1) MAXERR = .000000000000000D+00
Wrong RESULT : DTRMV(U,N,N, 17, A, 64, X, 1) MAXERR = .568434188608080D-13
Wrong RESULT : DTRMV(U,N,N, 18, A, 64, X, 1) MAXERR = .113686837721616D-12
Wrong RESULT : DTRMV(U,N,N, 19, A, 64, X, 1) MAXERR = .113686837721616D-12
Wrong RESULT : DTRMV(U,N,N, 20, A, 64, X, 1) MAXERR = .227373675443232D-12
Wrong RESULT : DTRMV(U,N,N, 21, A, 64, X, 1) MAXERR = .227373675443232D-12
Wrong RESULT : DTRMV(U,N,N, 22, A, 64, X, 1) MAXERR = .454747350886464D-12
Wrong RESULT : DTRMV(U,N,N, 23, A, 64, X, 1) MAXERR = .909494701772928D-12
Wrong RESULT : DTRMV(U,N,N, 24, A, 64, X, 1) MAXERR = .909494701772928D-12
Wrong RESULT : DTRMV(U,N,N, 25, A, 64, X, 1) MAXERR = .181898940354586D-11
Wrong RESULT : DTRMV(U,N,N, 26, A, 64, X, 1) MAXERR = .363797880709171D-11
Wrong RESULT : DTRMV(U,N,N, 27, A, 64, X, 1) MAXERR = .727595761418343D-11
Wrong RESULT : DTRMV(U,N,N, 28, A, 64, X, 1) MAXERR = .363797880709171D-11
Wrong RESULT : DTRMV(U,N,N, 29, A, 64, X, 1) MAXERR = .291038304567337D-10
Wrong RESULT : DTRMV(U,N,N, 30, A, 64, X, 1) MAXERR = .582076609134674D-10
Wrong RESULT : DTRMV(U,N,N, 31, A, 64, X, 1) MAXERR = .145519152283669D-10
Wrong RESULT : DTRMV(U,N,N, 32, A, 64, X, 1) MAXERR = .582076609134674D-10
Wrong RESULT : DTRMV(U,N,N, 33, A, 64, X, 1) MAXERR = .596598620663450D+06
Wrong RESULT : DTRMV(U,N,N, 34, A, 64, X, 1) MAXERR = .605368209769108D+06
Wrong RESULT : DTRMV(U,N,N, 35, A, 64, X, 1) MAXERR = .114251427897636D+07
Wrong RESULT : DTRMV(U,N,N, 36, A, 64, X, 1) MAXERR = .176011411015678D+07
Wrong RESULT : DTRMV(U,N,N, 37, A, 64, X, 1) MAXERR = .273520848158376D+07
Wrong RESULT : DTRMV(U,N,N, 38, A, 64, X, 1) MAXERR = .400535392194670D+07
Wrong RESULT : DTRMV(U,N,N, 39, A, 64, X, 1) MAXERR = .583437738276019D+07
Wrong RESULT : DTRMV(U,N,N, 40, A, 64, X, 1) MAXERR = .850833665131698D+07
Wrong RESULT : DTRMV(U,N,N, 41, A, 64, X, 1) MAXERR = .123493561359394D+08
Wrong RESULT : DTRMV(U,N,N, 42, A, 64, X, 1) MAXERR = .173391397036582D+08
Wrong RESULT : DTRMV(U,N,N, 43, A, 64, X, 1) MAXERR = .260484477387416D+08
Wrong RESULT : DTRMV(U,N,N, 44, A, 64, X, 1) MAXERR = .376086864422978D+08
Wrong RESULT : DTRMV(U,N,N, 45, A, 64, X, 1) MAXERR = .562860046618746D+08
Wrong RESULT : DTRMV(U,N,N, 46, A, 64, X, 1) MAXERR = .722429815166366D+08
Wrong RESULT : DTRMV(U,N,N, 47, A, 64, X, 1) MAXERR = .129301313570969D+09
Wrong RESULT : DTRMV(U,N,N, 48, A, 64, X, 1) MAXERR = .193137989170661D+09
Wrong RESULT : DTRMV(U,N,N, 49, A, 64, X, 1) MAXERR = .279222204411076D+09
Wrong RESULT : DTRMV(U,N,N, 50, A, 64, X, 1) MAXERR = .678662007793433D+09
Wrong RESULT : DTRMV(U,N,N, 51, A, 64, X, 1) MAXERR = .121269930403095D+10
Wrong RESULT : DTRMV(U,N,N, 52, A, 64, X, 1) MAXERR = .106155723183043D+10
Wrong RESULT : DTRMV(U,N,N, 53, A, 64, X, 1) MAXERR = .306408230139373D+10
Wrong RESULT : DTRMV(U,N,N, 54, A, 64, X, 1) MAXERR = .470622143023516D+10
Wrong RESULT : DTRMV(U,N,N, 55, A, 64, X, 1) MAXERR = .716298126556327D+10
Wrong RESULT : DTRMV(U,N,N, 56, A, 64, X, 1) MAXERR = .103315607770989D+11
Wrong RESULT : DTRMV(U,N,N, 57, A, 64, X, 1) MAXERR = .837779499609675D+10
Wrong RESULT : DTRMV(U,N,N, 58, A, 64, X, 1) MAXERR = .255092826195343D+11
Wrong RESULT : DTRMV(U,N,N, 59, A, 64, X, 1) MAXERR = .388799116737581D+11
Wrong RESULT : DTRMV(U,N,N, 60, A, 64, X, 1) MAXERR = .612787305842372D+11
Wrong RESULT : DTRMV(U,N,N, 61, A, 64, X, 1) MAXERR = .421813369205623D+11
Wrong RESULT : DTRMV(U,N,N, 62, A, 64, X, 1) MAXERR = .531064994597937D+11
Wrong RESULT : DTRMV(U,N,N, 63, A, 64, X, 1) MAXERR = .218143907859646D+12
Wrong RESULT : DTRMV(U,N,N, 64, A, 64, X, 1) MAXERR = .160600130355114D+12
The demonstration code is here: https://gist.github.com/grisuthedragon/32d8ad11e1d722414f921509b97d5507
Metadata
Metadata
Assignees
Labels
No labels