Skip to content

Wrong results for small DTRMV on x86-64 #1332

Closed
@grisuthedragon

Description

@grisuthedragon

Usually I use the POWER8 platform but today I found a bad error in the x86-64 code for the DTRMV routine.

I want to compute x <- A*x, where

  • A is nonunit upper triangular,
  • N = 1...64,
  • LDA = 64,
  • and INCX=1

and get wrong results on a Haswell (16 cores, Xeon CPU E5-2640 v3) based system using gcc (Ubuntu 5.4.0-6ubuntu1~16.04.5) if OpenBLAS is compiled via

make USE_OPENMP=1 

I compared the results with the Netlib implementation and obtained:

Correct RESULT: DTRMV(U,N,N,    1, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    2, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    3, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    4, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    5, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    6, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    7, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    8, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,    9, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   10, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   11, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   12, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   13, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   14, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   15, A,   64, X,     1)  MAXERR = .000000000000000D+00
Correct RESULT: DTRMV(U,N,N,   16, A,   64, X,     1)  MAXERR = .000000000000000D+00
Wrong RESULT  : DTRMV(U,N,N,   17, A,   64, X,     1)  MAXERR = .568434188608080D-13
Wrong RESULT  : DTRMV(U,N,N,   18, A,   64, X,     1)  MAXERR = .113686837721616D-12
Wrong RESULT  : DTRMV(U,N,N,   19, A,   64, X,     1)  MAXERR = .113686837721616D-12
Wrong RESULT  : DTRMV(U,N,N,   20, A,   64, X,     1)  MAXERR = .227373675443232D-12
Wrong RESULT  : DTRMV(U,N,N,   21, A,   64, X,     1)  MAXERR = .227373675443232D-12
Wrong RESULT  : DTRMV(U,N,N,   22, A,   64, X,     1)  MAXERR = .454747350886464D-12
Wrong RESULT  : DTRMV(U,N,N,   23, A,   64, X,     1)  MAXERR = .909494701772928D-12
Wrong RESULT  : DTRMV(U,N,N,   24, A,   64, X,     1)  MAXERR = .909494701772928D-12
Wrong RESULT  : DTRMV(U,N,N,   25, A,   64, X,     1)  MAXERR = .181898940354586D-11
Wrong RESULT  : DTRMV(U,N,N,   26, A,   64, X,     1)  MAXERR = .363797880709171D-11
Wrong RESULT  : DTRMV(U,N,N,   27, A,   64, X,     1)  MAXERR = .727595761418343D-11
Wrong RESULT  : DTRMV(U,N,N,   28, A,   64, X,     1)  MAXERR = .363797880709171D-11
Wrong RESULT  : DTRMV(U,N,N,   29, A,   64, X,     1)  MAXERR = .291038304567337D-10
Wrong RESULT  : DTRMV(U,N,N,   30, A,   64, X,     1)  MAXERR = .582076609134674D-10
Wrong RESULT  : DTRMV(U,N,N,   31, A,   64, X,     1)  MAXERR = .145519152283669D-10
Wrong RESULT  : DTRMV(U,N,N,   32, A,   64, X,     1)  MAXERR = .582076609134674D-10
Wrong RESULT  : DTRMV(U,N,N,   33, A,   64, X,     1)  MAXERR = .596598620663450D+06
Wrong RESULT  : DTRMV(U,N,N,   34, A,   64, X,     1)  MAXERR = .605368209769108D+06
Wrong RESULT  : DTRMV(U,N,N,   35, A,   64, X,     1)  MAXERR = .114251427897636D+07
Wrong RESULT  : DTRMV(U,N,N,   36, A,   64, X,     1)  MAXERR = .176011411015678D+07
Wrong RESULT  : DTRMV(U,N,N,   37, A,   64, X,     1)  MAXERR = .273520848158376D+07
Wrong RESULT  : DTRMV(U,N,N,   38, A,   64, X,     1)  MAXERR = .400535392194670D+07
Wrong RESULT  : DTRMV(U,N,N,   39, A,   64, X,     1)  MAXERR = .583437738276019D+07
Wrong RESULT  : DTRMV(U,N,N,   40, A,   64, X,     1)  MAXERR = .850833665131698D+07
Wrong RESULT  : DTRMV(U,N,N,   41, A,   64, X,     1)  MAXERR = .123493561359394D+08
Wrong RESULT  : DTRMV(U,N,N,   42, A,   64, X,     1)  MAXERR = .173391397036582D+08
Wrong RESULT  : DTRMV(U,N,N,   43, A,   64, X,     1)  MAXERR = .260484477387416D+08
Wrong RESULT  : DTRMV(U,N,N,   44, A,   64, X,     1)  MAXERR = .376086864422978D+08
Wrong RESULT  : DTRMV(U,N,N,   45, A,   64, X,     1)  MAXERR = .562860046618746D+08
Wrong RESULT  : DTRMV(U,N,N,   46, A,   64, X,     1)  MAXERR = .722429815166366D+08
Wrong RESULT  : DTRMV(U,N,N,   47, A,   64, X,     1)  MAXERR = .129301313570969D+09
Wrong RESULT  : DTRMV(U,N,N,   48, A,   64, X,     1)  MAXERR = .193137989170661D+09
Wrong RESULT  : DTRMV(U,N,N,   49, A,   64, X,     1)  MAXERR = .279222204411076D+09
Wrong RESULT  : DTRMV(U,N,N,   50, A,   64, X,     1)  MAXERR = .678662007793433D+09
Wrong RESULT  : DTRMV(U,N,N,   51, A,   64, X,     1)  MAXERR = .121269930403095D+10
Wrong RESULT  : DTRMV(U,N,N,   52, A,   64, X,     1)  MAXERR = .106155723183043D+10
Wrong RESULT  : DTRMV(U,N,N,   53, A,   64, X,     1)  MAXERR = .306408230139373D+10
Wrong RESULT  : DTRMV(U,N,N,   54, A,   64, X,     1)  MAXERR = .470622143023516D+10
Wrong RESULT  : DTRMV(U,N,N,   55, A,   64, X,     1)  MAXERR = .716298126556327D+10
Wrong RESULT  : DTRMV(U,N,N,   56, A,   64, X,     1)  MAXERR = .103315607770989D+11
Wrong RESULT  : DTRMV(U,N,N,   57, A,   64, X,     1)  MAXERR = .837779499609675D+10
Wrong RESULT  : DTRMV(U,N,N,   58, A,   64, X,     1)  MAXERR = .255092826195343D+11
Wrong RESULT  : DTRMV(U,N,N,   59, A,   64, X,     1)  MAXERR = .388799116737581D+11
Wrong RESULT  : DTRMV(U,N,N,   60, A,   64, X,     1)  MAXERR = .612787305842372D+11
Wrong RESULT  : DTRMV(U,N,N,   61, A,   64, X,     1)  MAXERR = .421813369205623D+11
Wrong RESULT  : DTRMV(U,N,N,   62, A,   64, X,     1)  MAXERR = .531064994597937D+11
Wrong RESULT  : DTRMV(U,N,N,   63, A,   64, X,     1)  MAXERR = .218143907859646D+12
Wrong RESULT  : DTRMV(U,N,N,   64, A,   64, X,     1)  MAXERR = .160600130355114D+12

The demonstration code is here: https://gist.github.com/grisuthedragon/32d8ad11e1d722414f921509b97d5507

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions