We have been discussing this on the Google group, but I am reporting the enhancement here, too.
As we all know, BLAS contains many composite operations, such as GEMM, which fuses a matrix multiplication with an accumulation:
C += A*B
But, as the library is currently designed, we are unable to take advantage of that, since each operation is performed eagerly: the product A*B is computed into a temporary before the addition is ever seen.
A better approach would be to store the operations as an AST (abstract syntax tree) and leave it unevaluated until analysis of the tree can determine the most specific BLAS function that applies.
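To make the idea concrete, here is a minimal sketch in plain Scala. It does not use Breeze, and every name in it (`LazyGemmSketch`, `Expr`, `Leaf`, `Mul`, `Add`, `gemm`, `eval`, `log`) is hypothetical. The `gemm` function is a naive stand-in for the BLAS routine, and it returns a new matrix rather than updating C in place. The point is only that the evaluator can recognise the shape `C + A*B` in the tree and dispatch to one fused kernel.

```scala
// Hypothetical sketch: none of these names come from Breeze or BLAS bindings.
object LazyGemmSketch {
  type Matrix = Array[Array[Double]]

  // Deferred expression tree: building a node performs no arithmetic.
  sealed trait Expr {
    def *(that: Expr): Expr = Mul(this, that)
    def +(that: Expr): Expr = Add(this, that)
  }
  final case class Leaf(m: Matrix) extends Expr
  final case class Mul(a: Expr, b: Expr) extends Expr
  final case class Add(a: Expr, b: Expr) extends Expr

  // Records which kernels ran, so the effect of fusion is observable.
  val log = scala.collection.mutable.ListBuffer.empty[String]

  // Naive stand-in for BLAS GEMM: computes a*b + c in a single pass.
  def gemm(a: Matrix, b: Matrix, c: Matrix): Matrix = {
    log += "gemm"
    Array.tabulate(a.length, b(0).length) { (i, j) =>
      var s = c(i)(j)
      var k = 0
      while (k < b.length) { s += a(i)(k) * b(k)(j); k += 1 }
      s
    }
  }

  // Element-wise addition, used only when no fused form applies.
  def add(a: Matrix, b: Matrix): Matrix = {
    log += "add"
    Array.tabulate(a.length, a(0).length)((i, j) => a(i)(j) + b(i)(j))
  }

  def zeros(rows: Int, cols: Int): Matrix = Array.fill(rows, cols)(0.0)

  // Evaluation happens here: the most specific pattern is tried first.
  def eval(e: Expr): Matrix = e match {
    case Leaf(m)           => m
    case Add(c, Mul(a, b)) => gemm(eval(a), eval(b), eval(c)) // fused C + A*B
    case Add(Mul(a, b), c) => gemm(eval(a), eval(b), eval(c)) // fused A*B + C
    case Mul(a, b) =>
      val (x, y) = (eval(a), eval(b))
      gemm(x, y, zeros(x.length, y(0).length)) // plain product: C is zero
    case Add(a, b)         => add(eval(a), eval(b))
  }
}
```

With `a`, `b` and `c` wrapped in `Leaf`, calling `eval(c + a * b)` in this sketch records a single `gemm` entry in `log` and no separate `add`, whereas an eager design would run a multiplication and then an addition. A real implementation would also need to handle scalar factors, transposes, and in-place updates, which this sketch ignores.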
Since the overhead of the Breeze library is already insignificant, performance could be improved considerably this way.