Exception throwed in "precompute" schedule #347
Comments
PS. I think it would be better to display the actual deployed commit ID on the online tool site, so I can fall back to that version if the master branch goes wrong.
It seems like there's a bug with
#pragma omp parallel for schedule(runtime)
for (int32_t i0 = 0; i0 < A1_dimension; i0++) {
  double* restrict workspace = 0;
  workspace = (double*)malloc(sizeof(double) * 42);
  for (int32_t pworkspace = 0; pworkspace < 42; pworkspace++) {
    workspace[pworkspace] = 0.0;
  }
  int32_t p0_cacheA = A2_pos[i0];
  int32_t pA2_end = A2_pos[(i0 + 1)];
  int32_t p0_cacheA0 = A2_crd[p0_cacheA];
  int32_t p0 = A2_crd[p0_cacheA];
  int32_t p0_cache = p0;
  int32_t p0_cache_end = C4_dimension;
  while (p0_cacheA < pA2_end && p0 < p0_cache_end) {
    p0_cacheA0 = A2_crd[p0_cacheA];
    p0 = A2_crd[p0_cacheA];
    workspace[p0_cache] = A_vals[p1A] * B_vals[jB];
    p0_cacheA += (int32_t)(p0_cacheA0 == p0);
    p0_cacheA0 = A2_crd[p0_cacheA];
    p0 = A2_crd[p0_cacheA];
  }
  for (int32_t p0 = 0; p0 < C4_dimension; p0++) {
    for (int32_t i1 = 0; i1 < A2_dimension; i1++) {
      int32_t i1C = i0 * C2_dimension + i1;
      for (int32_t p1 = 0; p1 < B2_dimension; p1++) {
        for (int32_t j = 0; j < B3_dimension; j++) {
          int32_t jC = i1C * C3_dimension + j;
          C_vals[jC] = C_vals[jC] + workspace[p0workspace];
        }
      }
    }
  }
  free(workspace);
}
For instance, the line
I've been seeing missing local variables with
taco 'A(i,j) = B(i,j) * 5' -f=A:dd -f=B:dd -s='precompute(B(i,j)*5,j,k)'
Which generates this loop:
for (int32_t i = 0; i < B1_dimension; i++) {
  double* restrict workspace = 0;
  workspace = (double*)malloc(sizeof(double) * 42);
  for (int32_t pworkspace = 0; pworkspace < 42; pworkspace++) {
    workspace[pworkspace] = 0.0;
  }
  for (int32_t k = 0; k < A2_dimension; k++) {
    int32_t j = k;
    if (j >= A2_dimension)
      continue;
    int32_t kB = i * B2_dimension + k;
    workspace[k] = B_vals[kB] * 5;
  }
  for (int32_t j = 0; j < A2_dimension; j++) {
    A_vals[jA] = workspace[jworkspace];
  }
  free(workspace);
}
And references the variable
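For comparison, here is a minimal hand-written sketch of what that loop might look like once the two index variables are declared. It is not taco output and not the actual fix: the function name `scale_rows_by_five` is hypothetical, the definitions of `jA` and `jworkspace` are guesses modeled on the dense pattern used for `kB`, A and B are assumed to have the same dense shape, and the workspace is sized to the row length rather than the hard-coded 42.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical hand-written version of the A(i,j) = B(i,j) * 5 kernel.
 * Returns 0 on success, -1 if the workspace cannot be allocated. */
int scale_rows_by_five(double *A_vals, const double *B_vals,
                       int32_t B1_dimension, int32_t A2_dimension) {
  assert(A_vals != NULL && B_vals != NULL);
  if (B1_dimension <= 0 || A2_dimension <= 0) {
    return 0; /* nothing to compute */
  }
  /* Assumption: A and B share the same dense shape. */
  const int32_t B2_dimension = A2_dimension;

  for (int32_t i = 0; i < B1_dimension; i++) {
    /* Assumption: one zero-initialized workspace entry per column. */
    double *workspace = calloc((size_t)A2_dimension, sizeof(double));
    if (workspace == NULL) {
      return -1;
    }
    /* Producer loop: fill the workspace for row i. */
    for (int32_t k = 0; k < A2_dimension; k++) {
      int32_t kB = i * B2_dimension + k;
      workspace[k] = B_vals[kB] * 5;
    }
    /* Consumer loop: copy the workspace into row i of A. */
    for (int32_t j = 0; j < A2_dimension; j++) {
      int32_t jA = i * A2_dimension + j; /* assumed missing declaration */
      int32_t jworkspace = j;            /* assumed missing declaration */
      A_vals[jA] = workspace[jworkspace];
    }
    free(workspace);
  }
  return 0;
}
```

Calling it on a 2x3 row-major array, e.g. `scale_rows_by_five(A, B, 2, 3)`, should leave each `A[n]` equal to `B[n] * 5`, which is the result the original command is expected to produce.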
The missing variable issue is fixed. The original "Not lowerable" bug described by @roastduck is still present. I tried to minimize that reproducer; here's the result:
Precomputing at
The branch taco/multidim-workspace (392fc66) seems to fix the following two commands from @Infinoid:
But it doesn't fix the command from @roastduck.
I'm a little confused about this larger command because currently when you precompute across In order to fix this the command should now have an added
This command also seems to generate correct code on the taco/multidim-workspace branch.
I was trying to run the following schedule:
It ended up with an error:
I was running the latest commit (dd62216). However, the online tool at http://tensor-compiler.org/codegen.html works fine with the same command. Maybe one of the latest commits introduced the bug.