Exception throwed in "precompute" schedule #347

roastduck · 2020-12-12T09:17:31Z

I was trying to run the following schedule:

taco "C(i0,i1,j)=A(i0,i1,p0,p1)*B(p0,p1,j)" -f="C:ddd:0,1,2" -f="A:dsdd:0,2,1,3" -f="B:ddd:0,1,2" -s="precompute(A(i0,i1,p0,p1)*B(p0,p1,j),p0,p0_cache)"

It ended up with an error:

terminate called after throwing an instance of 'taco::TacoException'
  what():  Compiler bug at /home/rd/src/taco/src/lower/lower.cpp:49 in lower
Please report it to developers
 Condition failed: isLowerable(stmt, &reason)
 Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: suchthat(forall(i0, where(forall(p0, forall(i1, forall(p1, forall(j, C(i0,i1,j) += workspace(p0))))), forall(p0_cache, workspace(p0_cache) = A(i0,i1,p0_cache,p1) * B(p0_cache,p1,j)))), precompute(p0, p0_cache))

I was running the latest commit (dd62216). However, the online tool at http://tensor-compiler.org/codegen.html works fine with the same command. Maybe one of the latest commit introduced the bug.

The text was updated successfully, but these errors were encountered:

roastduck · 2020-12-14T10:29:18Z

PS. I think it would be better to display the actual deployed commit ID on the online tool site, so I can fall back to that version if the master branch goes wrong.

stephenchouca · 2020-12-14T17:17:58Z

It seems like there's a bug with precompute. Actually, the code generated by the online tool looks to be incorrect as well:

  #pragma omp parallel for schedule(runtime)
  for (int32_t i0 = 0; i0 < A1_dimension; i0++) {
    double* restrict workspace = 0;
    workspace = (double*)malloc(sizeof(double) * 42);
    for (int32_t pworkspace = 0; pworkspace < 42; pworkspace++) {
      workspace[pworkspace] = 0.0;
    }
    int32_t p0_cacheA = A2_pos[i0];
    int32_t pA2_end = A2_pos[(i0 + 1)];
    int32_t p0_cacheA0 = A2_crd[p0_cacheA];
    int32_t p0 = A2_crd[p0_cacheA];
    int32_t p0_cache = p0;
    int32_t p0_cache_end = C4_dimension;

    while (p0_cacheA < pA2_end && p0 < p0_cache_end) {
      p0_cacheA0 = A2_crd[p0_cacheA];
      p0 = A2_crd[p0_cacheA];
      workspace[p0_cache] = A_vals[p1A] * B_vals[jB];

      p0_cacheA += (int32_t)(p0_cacheA0 == p0);
      p0_cacheA0 = A2_crd[p0_cacheA];
      p0 = A2_crd[p0_cacheA];
    }
    for (int32_t p0 = 0; p0 < C4_dimension; p0++) {
      for (int32_t i1 = 0; i1 < A2_dimension; i1++) {
        int32_t i1C = i0 * C2_dimension + i1;
        for (int32_t p1 = 0; p1 < B2_dimension; p1++) {
          for (int32_t j = 0; j < B3_dimension; j++) {
            int32_t jC = i1C * C3_dimension + j;
            C_vals[jC] = C_vals[jC] + workspace[p0workspace];
          }
        }
      }
    }
    free(workspace);
  }

For instance, the line C_vals[jC] = C_vals[jC] + workspace[p0workspace]; for instances references a p0workspace that doesn't exist. The online tool simply wasn't catching this bug because it was built as a release build (which disables some internal assertions) rather than a debug build.

Infinoid · 2020-12-15T21:09:51Z

I've been seeing missing local variables with precompute too. I ran into it with MTTKRP, but managed to boil it down to this:

taco 'A(i,j) = B(i,j) * 5' -f=A:dd -f=B:dd -s='precompute(B(i,j)*5,j,k)'

Which generates this loop:

  for (int32_t i = 0; i < B1_dimension; i++) {
    double* restrict workspace = 0;
    workspace = (double*)malloc(sizeof(double) * 42);
    for (int32_t pworkspace = 0; pworkspace < 42; pworkspace++) {
      workspace[pworkspace] = 0.0;
    }
    for (int32_t k = 0; k < A2_dimension; k++) {
      int32_t j = k;
      if (j >= A2_dimension)
        continue;

      int32_t kB = i * B2_dimension + k;
      workspace[k] = B_vals[kB] * 5;
    }
    for (int32_t j = 0; j < A2_dimension; j++) {
      A_vals[jA] = workspace[jworkspace];
    }
    free(workspace);
  }

And references the variable jA without ever declaring it.

Infinoid · 2021-05-19T13:02:26Z

The missing variable thing is fixed. The original "Not lowerable" bug described by @roastduck is still present.

I tried to minimize that reproducer, here's the result:

$ bin/taco "C(i0)=A(i0,p0)*B(p0)" -s="precompute(A(i0,p0)*B(p0),i0,i0_cache)" 
terminate called after throwing an instance of 'taco::TacoException'
  what():  Compiler bug at .../src/lower/lower.cpp:50 in lower
Please report it to developers
 Condition failed: isLowerable(stmt, &reason)
 Not lowerable, because the index statement is not in concrete index notation, because all variables in concrete notation must be bound by a forall statement: where(forall(i0, where(C(i0) = tp0C, forall(p0, tp0C += workspace(i0)))), forall(i0_cache, workspace(i0_cache) = A(i0_cache,p0) * B(p0)))

Precomputing at i0 breaks, precomputing at p0 seems to work fine.

weiya711 · 2021-06-17T21:46:50Z

The branch taco/multidim-workspace (392fc66) seems to fix the following two commands from @Infinoid:

./bin/taco "C(i0)=A(i0,p0)*B(p0)" -s="precompute(A(i0,p0)*B(p0),p0,p0_cache)" 
./bin/taco "C(i0)=A(i0,p0)*B(p0)" -s="precompute(A(i0,p0)*B(p0),i0,i0_cache)"

But it doesn't fix the command from @roastduck

taco "C(i0,i1,j)=A(i0,i1,p0,p1)*B(p0,p1,j)" -f="C:ddd:0,1,2" -f="A:dsdd:0,2,1,3" -f="B:ddd:0,1,2" -s="precompute(A(i0,i1,p0,p1)*B(p0,p1,j),p0,p0_cache)"

I'm a little confused about this larger command because currently when you precompute across p0 based on the precompute algorithm you must first reorder the index variables to {i0, i1, j, p0, p1} because the shared index variables {i0, i1, j} across the RHS and LHS of the assignment must be in the outer-loops of the precompute where statement.

In order to fix this the command should now have an added reorder(i0, i1, j, p0, p1) scheduling command shown below:

./bin/taco "C(i0,i1,j)=A(i0,i1,p0,p1)*B(p0,p1,j)" -f="C:ddd:0,1,2" -f="A:dsdd:0,2,1,3" -f="B:ddd:0,1,2" -s="reorder(i0, i1, j, p0, p1), precompute(A(i0,i1,p0,p1)*B(p0,p1,j),p0,p0_cache)"

This command also seems to generate correct code on the taco/multidim-workspace branch.

stephenchouca added the bug Indicates an unexpected problem or unintended behavior label Dec 14, 2020

Infinoid mentioned this issue Mar 26, 2021

precompute generates invalid code when I give it differing index variable names #431

Closed

weiya711 self-assigned this May 14, 2021

nirvikbaruah mentioned this issue May 14, 2021

Missing index variable declarations for precompute #459

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Exception throwed in "precompute" schedule #347

Exception throwed in "precompute" schedule #347

roastduck commented Dec 12, 2020

roastduck commented Dec 14, 2020

stephenchouca commented Dec 14, 2020

Infinoid commented Dec 15, 2020

Infinoid commented May 19, 2021

weiya711 commented Jun 17, 2021

Exception throwed in "precompute" schedule #347

Exception throwed in "precompute" schedule #347

Comments

roastduck commented Dec 12, 2020

roastduck commented Dec 14, 2020

stephenchouca commented Dec 14, 2020

Infinoid commented Dec 15, 2020

Infinoid commented May 19, 2021

weiya711 commented Jun 17, 2021