Skip to content

Add warning in matmul #459

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from
Closed

Add warning in matmul #459

wants to merge 2 commits into from

Conversation

sayandip18
Copy link
Contributor

Fix for #340

@hameerabbasi
Copy link
Collaborator

Can you run the benchmarks related to matmul before/after this PR and post the results?

@sayandip18
Copy link
Contributor Author

The time for matmul has increased

All benchmarks:

       before           after         ratio
     [75125cd0]       [a8e1abab]
     <master>         <warning> 
       4.75±0.3ms       4.68±0.2ms     0.98  benchmark_coo.ElemwiseBroadcastingSuite.time_add
         762±50μs         736±10μs     0.97  benchmark_coo.ElemwiseBroadcastingSuite.time_mul
       6.63±0.7ms       6.38±0.1ms     0.96  benchmark_coo.ElemwiseSuite.time_add
         1.93±0ms         1.94±0ms     1.00  benchmark_coo.ElemwiseSuite.time_mul
         887±60μs          889±5μs     1.00  benchmark_coo.IndexingSuite.time_index_fancy
       68.9±0.2μs       68.8±0.4μs     1.00  benchmark_coo.IndexingSuite.time_index_scalar
         254±10μs        244±0.4μs     0.96  benchmark_coo.IndexingSuite.time_index_slice
          488±1μs        488±0.9μs     1.00  benchmark_coo.IndexingSuite.time_index_slice2
         501±40μs          505±9μs     1.01  benchmark_coo.IndexingSuite.time_index_slice3
        699±300μs         1.51±0ms    ~2.17  benchmark_coo.MatrixMultiplySuite.time_matmul
       8.90±0.8ms      8.67±0.08ms     0.97  benchmark_gcxs.ElemwiseBroadcastingSuite.time_add
      2.33±0.02ms      2.32±0.02ms     1.00  benchmark_gcxs.ElemwiseBroadcastingSuite.time_mul
      12.4±0.03ms      12.3±0.03ms     1.00  benchmark_gcxs.ElemwiseSuite.time_add
      5.93±0.03ms      5.94±0.02ms     1.00  benchmark_gcxs.ElemwiseSuite.time_mul
      2.30±0.04ms         2.31±0ms     1.01  benchmark_gcxs.IndexingSuite.time_index_fancy
          424±2μs        425±0.8μs     1.00  benchmark_gcxs.IndexingSuite.time_index_scalar
       2.26±0.2ms         2.25±0ms     0.99  benchmark_gcxs.IndexingSuite.time_index_slice
-      2.02±0.1ms         1.61±0ms     0.80  benchmark_gcxs.IndexingSuite.time_index_slice2
         1.32±0ms         1.33±0ms     1.01  benchmark_gcxs.IndexingSuite.time_index_slice3
+        741±50μs         2.84±0ms     3.83  benchmark_gcxs.MatrixMultiplySuite.time_matmul
         292±10ms          292±1ms     1.00  benchmark_tensordot.TensordotSuiteDenseSparse.time_dense
         290±10ms        290±0.8ms     1.00  benchmark_tensordot.TensordotSuiteDenseSparse.time_sparse
          252±4ms          256±9ms     1.02  benchmark_tensordot.TensordotSuiteSparseDense.time_dense
          246±1ms          252±4ms     1.02  benchmark_tensordot.TensordotSuiteSparseDense.time_sparse
          157±2ms          158±2ms     1.01  benchmark_tensordot.TensordotSuiteSparseSparse.time_dense
          157±2ms          158±2ms     1.00  benchmark_tensordot.TensordotSuiteSparseSparse.time_sparse

@hameerabbasi
Copy link
Collaborator

Can you possibly do this check inside a Numba function, and warn outside? Maybe that will improve the performance.

If that doesn't, we can close this PR as being way too detrimental to performance.

@sayandip18
Copy link
Contributor Author

I did something like this

@njit
def check_nan(a, b):
    if np.isnan(np.min(a)) or np.isnan(np.min(b)):
        return 1
    else:
        return 0

def matmul(a, b):

    if check_nan(a, b):
        warnings.warn("Warning: nan is not propagated in matrix multiplication")
    check_zero_fill_value(a, b)

But the benchmarks for matmul aren't running.

@hameerabbasi
Copy link
Collaborator

You will need to pass in the data and fill value. That's because some formats can't be jitted.

@sayandip18
Copy link
Contributor Author

You will need to pass in the data and fill value.

How do I do that? Can you elaborate?

@sayandip18
Copy link
Contributor Author

New benchmarks

[  8.65%] ··· benchmark_coo.ElemwiseBroadcastingSuite.time_add                                                                             4.81±0.3ms
[  8.97%] ··· benchmark_coo.ElemwiseBroadcastingSuite.time_mul                                                                               829±70μs
[  9.29%] ··· benchmark_coo.ElemwiseSuite.time_add                                                                                        6.33±0.03ms
[  9.62%] ··· benchmark_coo.ElemwiseSuite.time_mul                                                                                        1.92±0.02ms
[  9.94%] ··· benchmark_coo.IndexingSuite.time_index_fancy                                                                                   891±10μs
[ 10.26%] ··· benchmark_coo.IndexingSuite.time_index_scalar                                                                                69.2±0.1μs
[ 10.58%] ··· benchmark_coo.IndexingSuite.time_index_slice                                                                                  244±0.7μs
[ 10.90%] ··· benchmark_coo.IndexingSuite.time_index_slice2                                                                                  495±10μs
[ 11.22%] ··· benchmark_coo.IndexingSuite.time_index_slice3                                                                                   507±6μs
[ 11.54%] ··· benchmark_coo.MatrixMultiplySuite.time_matmul                                                                                   689±6μs
[ 11.86%] ··· benchmark_gcxs.ElemwiseBroadcastingSuite.time_add                                                                           8.53±0.03ms
[ 12.18%] ··· benchmark_gcxs.ElemwiseBroadcastingSuite.time_mul                                                                           2.35±0.02ms
[ 12.50%] ··· benchmark_gcxs.ElemwiseSuite.time_add                                                                                        12.6±0.3ms
[ 12.82%] ··· benchmark_gcxs.ElemwiseSuite.time_mul                                                                                       6.00±0.05ms
[ 13.14%] ··· benchmark_gcxs.IndexingSuite.time_index_fancy                                                                                2.38±0.3ms
[ 13.46%] ··· benchmark_gcxs.IndexingSuite.time_index_scalar                                                                                  435±2μs
[ 13.78%] ··· benchmark_gcxs.IndexingSuite.time_index_slice                                                                               2.28±0.01ms
[ 14.10%] ··· benchmark_gcxs.IndexingSuite.time_index_slice2                                                                              1.64±0.01ms
[ 14.42%] ··· benchmark_gcxs.IndexingSuite.time_index_slice3                                                                              1.38±0.04ms
[ 14.74%] ··· benchmark_gcxs.MatrixMultiplySuite.time_matmul                                                                                 724±10μs
[ 15.06%] ··· benchmark_tensordot.TensordotSuiteDenseSparse.time_dense                                                                      287±0.5ms
[ 15.38%] ··· benchmark_tensordot.TensordotSuiteDenseSparse.time_sparse                                                                       295±8ms
[ 15.71%] ··· benchmark_tensordot.TensordotSuiteSparseDense.time_dense                                                                        248±5ms
[ 16.03%] ··· benchmark_tensordot.TensordotSuiteSparseDense.time_sparse                                                                       246±1ms
[ 16.35%] ··· benchmark_tensordot.TensordotSuiteSparseSparse.time_dense                                                                       160±6ms
[ 16.67%] ··· benchmark_tensordot.TensordotSuiteSparseSparse.time_sparse                                                                      158±2ms

@hameerabbasi
Copy link
Collaborator

Can you run a comparison against master?

@sayandip18
Copy link
Contributor Author

       before           after         ratio
     [75125cd0]       [878bd9bd]
     <master>         <warning> 
      4.47±0.02ms       4.81±0.4ms     1.08  benchmark_coo.ElemwiseBroadcastingSuite.time_add
         739±10μs         744±10μs     1.01  benchmark_coo.ElemwiseBroadcastingSuite.time_mul
      6.26±0.06ms      6.36±0.07ms     1.02  benchmark_coo.ElemwiseSuite.time_add
         1.90±0ms      1.93±0.02ms     1.01  benchmark_coo.ElemwiseSuite.time_mul
          891±9μs          894±3μs     1.00  benchmark_coo.IndexingSuite.time_index_fancy
       68.7±0.2μs       68.0±0.2μs     0.99  benchmark_coo.IndexingSuite.time_index_scalar
        244±0.4μs        245±0.8μs     1.00  benchmark_coo.IndexingSuite.time_index_slice
          489±1μs          487±1μs     1.00  benchmark_coo.IndexingSuite.time_index_slice2
        502±0.2μs          502±1μs     1.00  benchmark_coo.IndexingSuite.time_index_slice3
          667±1μs          681±3μs     1.02  benchmark_coo.MatrixMultiplySuite.time_matmul
       8.52±0.3ms      8.50±0.03ms     1.00  benchmark_gcxs.ElemwiseBroadcastingSuite.time_add
      2.33±0.02ms      2.32±0.01ms     1.00  benchmark_gcxs.ElemwiseBroadcastingSuite.time_mul
      12.4±0.02ms      12.4±0.03ms     1.00  benchmark_gcxs.ElemwiseSuite.time_add
      5.93±0.03ms      5.91±0.03ms     1.00  benchmark_gcxs.ElemwiseSuite.time_mul
         2.30±0ms      2.30±0.01ms     1.00  benchmark_gcxs.IndexingSuite.time_index_fancy
        431±0.3μs        428±0.3μs     0.99  benchmark_gcxs.IndexingSuite.time_index_scalar
         2.24±0ms         2.24±0ms     1.00  benchmark_gcxs.IndexingSuite.time_index_slice
         1.62±0ms         1.61±0ms     0.99  benchmark_gcxs.IndexingSuite.time_index_slice2
         1.33±0ms         1.32±0ms     0.99  benchmark_gcxs.IndexingSuite.time_index_slice3
        701±0.9μs          711±2μs     1.01  benchmark_gcxs.MatrixMultiplySuite.time_matmul
        286±0.4ms        286±0.1ms     1.00  benchmark_tensordot.TensordotSuiteDenseSparse.time_dense
        289±0.5ms          289±1ms     1.00  benchmark_tensordot.TensordotSuiteDenseSparse.time_sparse
          250±4ms          250±3ms     1.00  benchmark_tensordot.TensordotSuiteSparseDense.time_dense
          249±4ms          249±3ms     1.00  benchmark_tensordot.TensordotSuiteSparseDense.time_sparse
          158±2ms          157±2ms     1.00  benchmark_tensordot.TensordotSuiteSparseSparse.time_dense
          158±2ms          158±2ms     1.00  benchmark_tensordot.TensordotSuiteSparseSparse.time_sparse

@hameerabbasi
Copy link
Collaborator

Superceded by #469.

@sayandip18 sayandip18 deleted the warning branch April 25, 2021 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants