The recent fix for precompute with non-matching indexes, #459, has caused two new CUDA test failures.
You will need to build TACO with -DCUDA=ON to run these tests:
$ ctest --output-on-failure -R 'scheduling_eval.(spmv|ttv)GPU'
Test project /home/infinoid/workspace/taco/build
Start 87: scheduling_eval.spmvGPU
1/2 Test #87: scheduling_eval.spmvGPU ..........***Failed 7.90 sec
Note: Google Test filter = scheduling_eval.spmvGPU
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from scheduling_eval
[ RUN ] scheduling_eval.spmvGPU
Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
Condition failed: isLowerable(stmt, &reason)
Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(forall(block, forall(warp, forall(thread, where(forall(thread_nz, y(i) += precomputed(thread_nz)), forall(thread_nz_pre, precomputed(thread_nz_pre) = A(i,j) * x(j))), GPUThread, Atomics), GPUWarp, IgnoreRaces), GPUBlock, IgnoreRaces), fuse(i, j, f) and pos(f, fpos, A(i,j)) and split(fpos, block, fpos1, 2048) and split(fpos1, warp, fpos2, 256) and split(fpos2, thread, thread_nz, 8))
[ FAILED ] scheduling_eval.spmvGPU (3799 ms)
[----------] 1 test from scheduling_eval (3799 ms total)
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (3799 ms total)
[ PASSED ] 0 tests.
[ FAILED ] 1 test, listed below:
[ FAILED ] scheduling_eval.spmvGPU
1 FAILED TEST
Start 92: scheduling_eval.ttvGPU
2/2 Test #92: scheduling_eval.ttvGPU ...........***Failed 8.11 sec
Note: Google Test filter = scheduling_eval.ttvGPU
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from scheduling_eval
[ RUN ] scheduling_eval.ttvGPU
Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
Condition failed: isLowerable(stmt, &reason)
Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(forall(block, forall(warp, forall(thread, where(forall(thread_nz, A(i,j) += precomputed(thread_nz)), forall(thread_nz_pre, precomputed(thread_nz_pre) = B(i,j,k) * c(k))), GPUThread, Atomics), GPUWarp, IgnoreRaces), GPUBlock, IgnoreRaces), fuse(j, k, jk) and fuse(i, jk, f) and pos(f, fpos, B(i,j,k)) and split(fpos, block, fpos1, 2048) and split(fpos1, warp, fpos2, 256) and split(fpos2, thread, thread_nz, 8))
[ FAILED ] scheduling_eval.ttvGPU (3949 ms)
When I revert to ffd7155, these tests pass again.
$ ctest --output-on-failure -R 'scheduling_eval.(spmv|ttv)GPU'
Test project /home/infinoid/workspace/taco/build
Start 87: scheduling_eval.spmvGPU
1/2 Test #87: scheduling_eval.spmvGPU .......... Passed 13.80 sec
Start 92: scheduling_eval.ttvGPU
2/2 Test #92: scheduling_eval.ttvGPU ........... Passed 14.61 sec
100% tests passed, 0 tests failed out of 2
Total Test time (real) = 28.41 sec
This simple command-line reproducer also exposes the issue:
$ bin/taco "y(i) = A(i,j) * x(j)" -f=A:ds -s="fuse(i,j,f),pos(f,fpos,A),precompute(A(i,j)*x(j),fpos,fpre)"
terminate called after throwing an instance of 'taco::TacoException'
what(): Compiler bug at /home/infinoid/workspace/taco/git/src/lower/lower.cpp:50 in lower
Please report it to developers
Condition failed: isLowerable(stmt, &reason)
Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(where(forall(fpos, y(i) += workspace(fpos)), forall(fpre, workspace(fpre) = A(i,j) * x(j))), fuse(i, j, f) and pos(f, fpos, A(i,j)))
Changing fpre to fpos in the precompute allows it to succeed. The above command also works after reverting to ffd7155.
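As a rough sketch of that workaround, the snippet below derives the working schedule from the failing one by replacing the precompute workspace variable fpre with fpos. The variable names `schedule` and `workaround` are illustrative only, and the `bin/taco` path is assumed to be relative to the build directory as in the reproducer; the call is skipped when the binary is not present.

```shell
# Failing schedule, copied from the reproducer above.
schedule='fuse(i,j,f),pos(f,fpos,A),precompute(A(i,j)*x(j),fpos,fpre)'

# Workaround: use the same index variable (fpos) for both precompute arguments.
workaround=$(printf '%s' "$schedule" | sed 's/,fpre)/,fpos)/')
echo "$workaround"

# Assumption: bin/taco exists relative to the current directory.
if [ -x bin/taco ]; then
  bin/taco "y(i) = A(i,j) * x(j)" -f=A:ds -s="$workaround"
fi
```

This only sidesteps the failure; the non-matching case (fpos with fpre) is the one that regressed.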
Cc: @nirvikbaruah. Apologies for the late feedback on this failure; hopefully #457 will close the feedback loop.
Error at /home/infinoid/taco-cuda-ci-runner/_work/taco/taco/src/lower/lowerer_impl.cpp:926 in lowerForallCloned:
Unable to vectorize or unroll loop over unbound variable thread_nz_pre