Add numba overload for solve_triangular #423
Conversation
Remove test_SolveTriangular from numba\test_nlinalg.py
Codecov Report
Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##             main     #423      +/-   ##
==========================================
- Coverage   80.75%   80.64%   -0.12%
==========================================
  Files         159      160       +1
  Lines       45849    46016     +167
  Branches    11234    11263      +29
==========================================
+ Hits        37026    37108     +82
- Misses       6595     6671     +76
- Partials    2228     2237      +9
```
Add informative message to error raised by check_finite=True
Is this ready for review, or is something important still missing?
I'm still not 100% sold on how it's all implemented. I wanted someone to take a closer look at
I think there are maybe a few ways to make this a bit faster, but it looks good to me as it is. I'm not really sure why it would feel hackish? The only downside I can think of compared to compiling a separate extension module is that numba can't cache this due to the dynamic pointer.
```python
# Need to expand B here; I tried everywhere else and it doesn't work
if B_is_1d:
    B_copy = _copy_to_fortran_order(np.expand_dims(B, -1))
```
If the original B was 1d, I don't think we need the copy?
In my testing, trtrs expects everything to be at least 2d. The docs say LDB >= 0, but when I was giving it 1d arrays I was getting back numerically incorrect results.
After testing, you're right. I wasn't able to avoid the copy in the 2d case, though. If I don't copy 2d B, numba flags this line:

```python
B_NDIM = 1 if B_is_1d else int(B.shape[1])
```

saying that it's considering a case where B is 3d. Not sure why it thinks that's possible. Does numba evaluate all if/else branches on all possible inputs?
```python
if A.shape[0] != B.shape[0]:
    raise linalg.LinAlgError("Dimensions of A and B do not conform")

A_copy = _copy_to_fortran_order(A)
```
Can we avoid the copy if A is C-order by flipping the trans value? I think we could also have a special overload for when trans, lower, and unit_diag are literals and we statically know that A and B are C- or Fortran-contiguous.
I think that would really only be an optimization of the current code, though; this here should be fine as well.
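To make the suggestion concrete, here is a minimal SciPy sketch of the trans-flipping idea (my illustration, not the PR's code; it assumes a real-valued, C-contiguous, upper-triangular A and uses scipy.linalg.lapack.dtrtrs):

```python
import numpy as np
from scipy.linalg import lapack

# Well-conditioned upper-triangular A in C order, plus a right-hand side.
A = np.triu(np.random.rand(4, 4)) + 4.0 * np.eye(4)
b = np.random.rand(4, 1)

# Baseline: copy A into Fortran order and solve A @ x = b directly.
x_copy, info = lapack.dtrtrs(np.asfortranarray(A), b, lower=0, trans=0)

# Copy-free: A.T is an F-contiguous *view* of the C-contiguous A, and
# op(A.T) with trans=1 is (A.T)^T = A, so flip both `trans` and `lower`.
x_view, info = lapack.dtrtrs(A.T, b, lower=1, trans=1)

assert np.allclose(x_copy, x_view)
```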
Does setting an array to Fortran-contiguous actually transpose the matrix, or does it just re-order the pointers into the internal flat representation?
After testing, we can avoid copying A in all cases.
Re: the other point, do you mean checking the values of `trans`, `lower`, and `unit_diag` inside the wrapper function, then returning a specialized `impl` function based on their values? Similar to how I'm dispatching to real/complex versions here?

It feels hackish because 1) we can't cache the functions (relevant for compile times, which you've pointed out are extremely long with numba), 2) we can't support complex inputs due to a weird technical reason rather than some principled/fundamental one, and 3) it's nowhere close to working within the "official" numba API, so I have no idea how future-proof it is. Complex inputs definitely worked last year, so something changed in the numba codebase to break that.
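For reference, here is a minimal sketch of the overload-time dispatch pattern being discussed (the stub name and trivial bodies are illustrative placeholders, not the PR's code): the branch on the input's numba type runs at compile time, so each dtype gets its own specialized impl.

```python
import numpy as np
from numba import njit
from numba.core import types
from numba.extending import overload


def solve_stub(A):
    # Pure-Python placeholder; under njit, the overload below is used.
    raise NotImplementedError


@overload(solve_stub)
def _solve_stub_overload(A):
    # `A` here is a numba *type*, so this if/else resolves at compile
    # time and only the selected impl is ever typed and compiled.
    if isinstance(A.dtype, types.Complex):
        def impl(A):  # stand-in for the complex (ctrtrs/ztrtrs) path
            return np.abs(A).sum()
    else:
        def impl(A):  # stand-in for the real (strtrs/dtrtrs) path
            return A.sum()
    return impl


@njit
def f(A):
    return solve_stub(A)


f(np.ones(3))        # compiles the real path
f(np.ones(3) + 0j)   # compiles the complex path
```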
Rename addr to lapack_ptr
Don't copy B matrix when B is array in overload func
Some conflicts have cropped up.
What do you mean?
Still says no conflicts for me. I'll update my fork and double-check everything.
Swap the "Squash and merge" button to "Rebase and merge", and you should see it.
Marked as a draft, given the suggestion to upper-pin numba. Feel free to convert back to ready-for-review if you change your mind about it, or just do it!
@maresb is there a way to set an upper version limit on an optional dependency (that will also be respected by conda)?
Sure, just add it under
Following a conversation with @ricardoV94, I'm merging this. We'll cross the bridge of breaking code when/if we get there.
Motivation for these changes
The `pytensor.tensor.slinalg` module is not currently compatible with `mode="NUMBA"`. This PR is a first step in an effort to fix that. It's marked as a draft because it's 1) not done, and 2) in need of discussion/work.

Functions in `slinalg` don't have overloads in `numba.np.linalg`, so to implement them there needs to be an overload that calls the relevant C LAPACK functions. This involves some acrobatics with C pointers and typing, which I am absolutely not an expert at. Currently, I use dynamic pointers from `ctypes`, essentially just following numba/numba#5301. This works, but it means the resulting functions can't be cached, which will be a huge slowdown on complex graphs (I think).
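For concreteness, here is a rough sketch of that dynamic-pointer recipe (my paraphrase of the numba/numba#5301 pattern, not necessarily the PR's exact code):

```python
from ctypes import CFUNCTYPE, POINTER, c_char, c_double, c_int

from numba.extending import get_cython_function_address

# Resolve scipy's Cython-exported dtrtrs symbol. Because this address is
# only known at runtime, numba can't cache functions that call through it.
addr = get_cython_function_address("scipy.linalg.cython_lapack", "dtrtrs")

# void dtrtrs(UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, INFO)
functype = CFUNCTYPE(
    None,
    POINTER(c_char),    # UPLO
    POINTER(c_char),    # TRANS
    POINTER(c_char),    # DIAG
    POINTER(c_int),     # N
    POINTER(c_int),     # NRHS
    POINTER(c_double),  # A
    POINTER(c_int),     # LDA
    POINTER(c_double),  # B
    POINTER(c_int),     # LDB
    POINTER(c_int),     # INFO
)
dtrtrs = functype(addr)  # callable through the dynamic pointer
```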
A more complete approach would try to directly extend numba/numba/_lapack.c with some new pointers to the relevant scipy code. I'm not sure if it would be possible to have our own e.g. `_lapack_extensions.c` that could have `#include "_lapack.c"` on top? The pattern in that file looks straightforward enough to copy, but it's been a long time since I did anything in C, and I'm not sure how importing across modules would work.
Also, to answer "why `solve_triangular`?": because it's a function we don't have now that depends on only a single LAPACK call. Once the pattern is ironed out, I'll do the same for all the functions we currently have in `slinalg`, most importantly `solve` (yes, we have the `np.linalg.solve` overload, but it doesn't allow access to the specialized solvers for e.g. symmetric positive definite matrices, which matters a lot for PyMC).
Implementation details
I followed the implementation of LAPACK overloads established in numba/np/linalg.py: there's a class called `_LAPACK` that holds signatures for all the LAPACK functions to be implemented, plus an overload function for each user-facing routine.
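A condensed, hypothetical sketch of that structure (the method name and the exact signature-building style are my guesses, not a copy of the PR):

```python
from numba.core import types


class _LAPACK:
    """One classmethod per wrapped LAPACK routine, returning its signature."""

    @classmethod
    def numba_xtrtrs(cls, dtype):
        # void xtrtrs(UPLO, TRANS, DIAG, N, NRHS, A, LDA, B, LDB, INFO)
        return types.void(
            types.CPointer(types.int8),   # UPLO
            types.CPointer(types.int8),   # TRANS
            types.CPointer(types.int8),   # DIAG
            types.CPointer(types.int32),  # N
            types.CPointer(types.int32),  # NRHS
            types.CPointer(dtype),        # A
            types.CPointer(types.int32),  # LDA
            types.CPointer(dtype),        # B
            types.CPointer(types.int32),  # LDB
            types.CPointer(types.int32),  # INFO
        )
```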
that holds signatures for all the LAPACK functions that will be implemented, then an overload function.Checklist
Major / Breaking Changes
None
New features
Solve triangular matrices with numba!
Bugfixes
None
Documentation
Not yet
Maintenance
None