Test regression in scheduling_eval.spmvGPU, scheduling_eval.ttvGPU #462

Closed
Infinoid opened this issue May 19, 2021 · 2 comments

Infinoid commented May 19, 2021

The recent fix for precompute with non-matching indexes, #459, has caused two new CUDA test failures.

You will need to build TACO with -DCUDA=ON to run these tests:

$ ctest --output-on-failure -R 'scheduling_eval.(spmv|ttv)GPU'
Test project /home/infinoid/workspace/taco/build
    Start 87: scheduling_eval.spmvGPU
1/2 Test #87: scheduling_eval.spmvGPU ..........***Failed    7.90 sec
Note: Google Test filter = scheduling_eval.spmvGPU
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from scheduling_eval
[ RUN      ] scheduling_eval.spmvGPU
Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
 Condition failed: isLowerable(stmt, &reason)
 Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(forall(block, forall(warp, forall(thread, where(forall(thread_nz, y(i) += precomputed(thread_nz)), forall(thread_nz_pre, precomputed(thread_nz_pre) = A(i,j) * x(j))), GPUThread, Atomics), GPUWarp, IgnoreRaces), GPUBlock, IgnoreRaces), fuse(i, j, f) and pos(f, fpos, A(i,j)) and split(fpos, block, fpos1, 2048) and split(fpos1, warp, fpos2, 256) and split(fpos2, thread, thread_nz, 8))
[  FAILED  ] scheduling_eval.spmvGPU (3799 ms)
[----------] 1 test from scheduling_eval (3799 ms total)

[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3799 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] scheduling_eval.spmvGPU

 1 FAILED TEST

    Start 92: scheduling_eval.ttvGPU
2/2 Test #92: scheduling_eval.ttvGPU ...........***Failed    8.11 sec
Note: Google Test filter = scheduling_eval.ttvGPU
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from scheduling_eval
[ RUN      ] scheduling_eval.ttvGPU
Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
 Condition failed: isLowerable(stmt, &reason)
 Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(forall(block, forall(warp, forall(thread, where(forall(thread_nz, A(i,j) += precomputed(thread_nz)), forall(thread_nz_pre, precomputed(thread_nz_pre) = B(i,j,k) * c(k))), GPUThread, Atomics), GPUWarp, IgnoreRaces), GPUBlock, IgnoreRaces), fuse(j, k, jk) and fuse(i, jk, f) and pos(f, fpos, B(i,j,k)) and split(fpos, block, fpos1, 2048) and split(fpos1, warp, fpos2, 256) and split(fpos2, thread, thread_nz, 8))
[  FAILED  ] scheduling_eval.ttvGPU (3949 ms)

When I revert to ffd7155, both tests pass again.

$ ctest --output-on-failure -R 'scheduling_eval.(spmv|ttv)GPU'
Test project /home/infinoid/workspace/taco/build
    Start 87: scheduling_eval.spmvGPU
1/2 Test #87: scheduling_eval.spmvGPU ..........   Passed   13.80 sec
    Start 92: scheduling_eval.ttvGPU
2/2 Test #92: scheduling_eval.ttvGPU ...........   Passed   14.61 sec

100% tests passed, 0 tests failed out of 2

Total Test time (real) =  28.41 sec

This simple command-line reproducer also exposes the issue:

$ bin/taco "y(i) = A(i,j) * x(j)" -f=A:ds -s="fuse(i,j,f),pos(f,fpos,A),precompute(A(i,j)*x(j),fpos,fpre)"
terminate called after throwing an instance of 'taco::TacoException'
  what():  Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
 Condition failed: isLowerable(stmt, &reason)
 Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(where(forall(fpos, y(i) += workspace(fpos)), forall(fpre, workspace(fpre) = A(i,j) * x(j))), fuse(i, j, f) and pos(f, fpos, A(i,j)))

Changing fpre to fpos in the precompute allows it to succeed. The above command also works after reverting to ffd7155.
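
To illustrate why the fpre variant might trip the "must be bound by a forall" check while the fpos variant does not, here is a toy model in Python. It is only a sketch of one plausible reading of the error message, not TACO's actual isLowerable logic: the helper names recoverable and unbound_vars and the relations dictionary are hypothetical, and the model assumes a variable counts as bound when it is either the forall variable itself or can be recovered from it through the fuse/pos relations in the suchthat clause.

```python
# Toy model only (an assumption, not TACO's implementation): a variable used
# in a tensor access is treated as "bound" if the enclosing forall binds it
# directly, or if it can be recovered from the forall variable by walking
# the scheduling relations back to their parents.

def recoverable(bound, relations):
    """Return every variable reachable from `bound` through `relations`.

    `relations` maps a derived variable to the variables it was derived
    from, e.g. {"f": ["i", "j"], "fpos": ["f"]} for
    fuse(i, j, f) and pos(f, fpos, A).
    """
    known = set(bound)
    changed = True
    while changed:
        changed = False
        for child, parents in relations.items():
            if child in known and not set(parents) <= known:
                known |= set(parents)
                changed = True
    return known


def unbound_vars(forall_var, access_vars, relations):
    """List the access variables that the forall over `forall_var` cannot reach."""
    return sorted(set(access_vars) - recoverable({forall_var}, relations))


# Relations taken from the reproducer's schedule: fuse(i,j,f), pos(f,fpos,A).
relations = {"f": ["i", "j"], "fpos": ["f"]}

# Producer loop from the failing statement:
#   forall(fpre, workspace(fpre) = A(i,j) * x(j))
failing = unbound_vars("fpre", ["fpre", "i", "j"], relations)

# Producer loop when precompute reuses fpos instead of fpre.
working = unbound_vars("fpos", ["fpos", "i", "j"], relations)

print(failing)  # ['i', 'j']
print(working)  # []
```

In this model, fpre has no relation tying it back to fpos, so i and j in the producer loop have nothing to bind them, which would be consistent with the reproducer succeeding once fpre is replaced by fpos.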

Cc: @nirvikbaruah. Apologies for the late feedback on this failure; hopefully #457 will close the feedback loop.

stephenchouca added the bug label on May 20, 2021

weiya711 commented

I'm working with @nirvikbaruah on fixing this; a sample fix is in the https://github.com/tensor-compiler/taco/test-pr470 branch. With it, the scheduling_eval.spmvGPU and scheduling_eval.ttvGPU tests now fail at a different point:

Error at /home/infinoid/taco-cuda-ci-runner/_work/taco/taco/src/lower/lowerer_impl.cpp:926 in lowerForallCloned:
 Unable to vectorize or unroll loop over unbound variable thread_nz_pre

weiya711 commented

This should be fixed by #474.
