
Conversation

@tomicapretto (Contributor) commented on Jan 29, 2026:

Description

The main contribution of this PR is the implementation of StructuredDotGradCSR and StructuredDotGradCSC in the numba backend.

While working on it, I noticed that the SpSum and SparseFromDense Ops were running in object mode, so I implemented those as well.

Checklist

Type of change

  • New feature / enhancement
  • Bug fix
  • Documentation
  • Maintenance
  • Other (please specify):

@tomicapretto force-pushed the sparse_gradients_numba branch 2 times, most recently from 190c587 to 4690cde on January 29, 2026 at 13:34
@tomicapretto force-pushed the sparse_gradients_numba branch 2 times, most recently from c96ae8c to 2025883 on January 29, 2026 at 14:56
@tomicapretto force-pushed the sparse_gradients_numba branch 2 times, most recently from 512fb59 to 32099f1 on January 30, 2026 at 03:42
@tomicapretto force-pushed the sparse_gradients_numba branch 2 times, most recently from e7b1261 to a054b5d on January 31, 2026 at 17:51
@tomicapretto force-pushed the sparse_gradients_numba branch from a054b5d to 6af5a1a on January 31, 2026 at 17:54
@tomicapretto marked this pull request as ready for review on January 31, 2026 at 17:56
@tomicapretto (Contributor, Author) commented:

The test that fails is:

FAILED tests/tensor/test_slinalg.py::TestSchur::test_schur_empty - ValueError: negative dimensions not allowed

which is unrelated to this PR.

@jessegrabowski


# Pre-allocate internal containers
data = np.empty(nnz, dtype=matrix.dtype)
indices = np.empty(nnz, dtype=np.uint32)

@jessegrabowski (Member):
Should be int32 no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used uint32 in other places. I thought that since they're never negative, we could use uint32.
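
For reference (not part of the PR discussion): scipy.sparse itself stores CSR/CSC index arrays with a signed integer dtype, which is part of why int32 comes up. A purely illustrative check:

```python
import scipy.sparse as sp

# Illustrative only: scipy's own CSR matrices use a signed index dtype
# (int32 for small matrices, int64 for very large ones), never uint32.
m = sp.random(5, 5, density=0.4, format="csr", random_state=0)
print(m.indices.dtype, m.indptr.dtype)  # typically: int32 int32
```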


for col_idx in range(size):
    for value_idx in range(x_ptr[col_idx], x_ptr[col_idx + 1]):
        output[value_idx] = np.dot(

Member:
We have to be careful with np.dot. IIRC numba's overload doesn't support integer / mixed dtypes well.

@tomicapretto (Contributor, Author):
Argh, I'm using it because np.sum(x * y) was slower. There are a bunch of tests that pass different data types, and they have all passed, so that's probably OK?

Member:
It's probably fine as long as we're upcasting the inputs to a common dtype in the make_node of Dot?

In the medium term we should consider re-implementing the BLAS calls ourselves.
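
As a hedged sketch of what a hand-rolled replacement could look like for the 1-D slices used above (the helper name `_dot_1d` is hypothetical, not from the PR), a plain loop avoids numba's BLAS-backed np.dot overload and its dtype restrictions:

```python
import numpy as np
from numba import njit

@njit
def _dot_1d(x, y):
    # Naive inner product over 1-D arrays. Unlike numba's np.dot overload,
    # which dispatches to BLAS and expects float/complex inputs, this works
    # for any dtype numba can multiply (the float accumulator upcasts ints).
    acc = 0.0
    for i in range(x.shape[0]):
        acc += x[i] * y[i]
    return acc

# Example with mixed integer/float inputs
print(_dot_1d(np.arange(4), np.ones(4)))  # 6.0
```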

@ricardoV94 (Member) left a comment:

This looks great, I just left some minor comments

axis = op.axis

@numba_basic.numba_njit
def perform(x):

Member:
Does mypy freak out if you type-hint this as SparseArray -> SparseArray? It would make the function clearer. Not required if it causes a headache (type-hinting these overloads often does).

@tomicapretto (Contributor, Author):
I have not tried it, but the SpSum op returns a dense array (see this).

What happens here is that this calls the function I implemented in overload_sum in variable.py. Maybe we could add a note somewhere (per op, or at the top of the file) saying that many (if not all) Ops use overloads written in a separate Python file?

# General spmspm algorithm in CSR format
@numba_basic.numba_njit
def _spmspm(n_row, n_col, x_ptr, x_ind, x_data, y_ptr, y_ind, y_data):
def _spmspm_csr(x, y, n_row, n_col):

Member:
I think it's worth considering a bit of reorganization here for future extensibility. We can make a new sparse/math sub-module and have a sum.py file with each of these inner njit functions defined independently. numba_funcify_SparseDenseMultiply can still live here, but it would be just an input checker and routing to the correct function. I'm thinking about what it will look like in the future to add support for a new sparse type.

The pattern I'm thinking about is what we are doing with linalg, for example QZ: each case is defined separately here, then the actual dispatch is defined here.

@tomicapretto (Contributor, Author):
It sounds good to me. I thought about it a bit before starting to work on this, but I saw the other ops in this module were implemented this way, so I assumed it was for a reason. Maybe I just overthought it and it was simply convenience.
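
A minimal sketch of the routing pattern being proposed (function and module names are illustrative, not taken from the PR or from the linalg code): per-format kernels are defined on their own, and the dispatch layer only inspects the formats and picks a kernel.

```python
# Hypothetical sketch; the real kernels would be njit-compiled and live in
# a dedicated sparse/math sub-module, as suggested above.
def _spmspm_csr_csr(x, y):
    """Sparse @ sparse product, both operands CSR."""
    raise NotImplementedError

def _spmspm_csc_csc(x, y):
    """Sparse @ sparse product, both operands CSC."""
    raise NotImplementedError

_SPMSPM_KERNELS = {
    ("csr", "csr"): _spmspm_csr_csr,
    ("csc", "csc"): _spmspm_csc_csc,
}

def pick_spmspm_kernel(x_format, y_format):
    # The dispatcher stays small: it only validates formats and routes.
    try:
        return _SPMSPM_KERNELS[(x_format, y_format)]
    except KeyError:
        raise NotImplementedError(f"Unsupported format pair: {(x_format, y_format)}")
```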

if formats == ("csc", "csc"):
    # In all cases, the output is dense when the op is Dot.
    @numba_basic.numba_njit
    def spmspm(x, y):

Member:
To my point above, it would be great if each of these functions were defined with a name that clarified the case it handles. It would make this format routing much clearer.

@register_funcify_and_cache_key(StructuredDotGradCSR)
@register_funcify_and_cache_key(StructuredDotGradCSC)
def numba_funcify_StructuredDotGrad(op, node, **kwargs):
    # Let:

Member:
Make this a docstring

Comment on lines +209 to +214:

# Pass 1: Count non-zeros to pre-allocate
nnz = 0
for i in range(n_rows):
    for j in range(n_cols):
        if arg1[i, j] != 0:
            nnz += 1

Member:
There are some helpers in common between the CSC and CSR cases that you could consider extracting (though I recognize we haven't hit the rule of three yet).
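
As a sketch of the kind of shared helper being suggested (the name `_count_dense_nnz` is hypothetical, not from the PR), the counting pass could be factored out so both the CSC and CSR branches reuse it:

```python
import numpy as np
from numba import njit

@njit
def _count_dense_nnz(arg1):
    # Pass 1 from the snippet above, extracted: count the non-zeros of a
    # dense 2-D array so the caller can pre-allocate data/indices buffers.
    n_rows, n_cols = arg1.shape
    nnz = 0
    for i in range(n_rows):
        for j in range(n_cols):
            if arg1[i, j] != 0:
                nnz += 1
    return nnz

print(_count_dense_nnz(np.array([[0.0, 1.0], [2.0, 0.0]])))  # 2
```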

@overload_method(CSMatrixType, "sum")
def overload_sum(matrix, axis):
    # 'axis' can be either None, 0, or 1.
    if axis is types.none:

Member:
if axis is None doesn't work here?

@tomicapretto (Contributor, Author) commented on Feb 1, 2026:

Nope, this was actually something I discovered by trial and error. We're dealing with numba types as inputs here, so None doesn't work. The same idea applies to isinstance(matrix, np.ndarray): one has to write isinstance(matrix, types.Array) instead.

I could have read numba's docs on extending it more thoroughly, of course, hehe.
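
A small illustration of the point above (the function is a made-up example, but the numba API names are real): inside an @overload / @overload_method body, the arguments are numba type objects, so checks are written against types.none and types.Array rather than None and np.ndarray.

```python
from numba.core import types

def _select_sum_impl(matrix, axis):
    # 'matrix' and 'axis' here are numba *types*, not runtime values.
    if axis is types.none:                 # not: axis is None
        return "sum over all elements"
    if isinstance(matrix, types.Array):    # not: isinstance(matrix, np.ndarray)
        return "dense implementation"
    return "sparse implementation"

print(_select_sum_impl(types.float64[:, :], types.none))  # sum over all elements
```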

out[0] = np.asarray(variable, str(variable.dtype))

def grad(self, inputs, gout):
    # FIXME: It's not always true that b and g_out are dense.

Member:
Did you already open an issue for this?

@tomicapretto (Contributor, Author):
Just created it: #1871. Thanks for the nudge.



@pytest.mark.parametrize("x_format", ["csr", "csc"])
@pytest.mark.parametrize("y_format", ["csr", "csc", None])

Member:
Suggested change:
- @pytest.mark.parametrize("y_format", ["csr", "csc", None])
+ @pytest.mark.parametrize("y_format", ["csr", "csc", "dense"])

@pytest.mark.parametrize("y_format", ["csr", "csc", None])
@pytest.mark.parametrize("x_shape, y_shape", DOT_SHAPES)
def test_structured_dot_grad(x_format, y_format, x_shape, y_shape):
    rng = np.random.default_rng(sum(map(ord, x_format)) + sum(x_shape) + sum(y_shape))

Member:
I personally don't love seeded tests; I'd rather learn over time whether an implementation is flaky. I'm not sure everyone agrees on this, though.

@ricardoV94 (Member) commented on Feb 1, 2026:

I don't think randomness should be a default. Floating-point precision is a weird thing, and from experience, flakiness from floating-point issues is orders of magnitude more common than flakiness from actual bugs (god knows how much electricity/time the whole float32 CI cost us in the long run).

Also from experience, real bugs are unlikely to be masked by a single seed.

I would restrict randomness to stochastic/statistical applications, where it's a meaningful construct, or where there's no risk of floating-point shenanigans.

In this specific case, I don't believe a seed could matter for bug catching.

@tomicapretto (Contributor, Author):
I just followed the pattern I saw. I don't like thinking about how I construct a seed, but at the same time I hate my tests failing because of weird float stuff.

Are we OK with me removing the seeds in these tests, then?
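
For reference, and purely as an illustration of the two styles being debated (not a suggestion of which one the tests should end up with):

```python
import numpy as np

# Deterministic: the same data on every CI run, so no floating-point flakiness,
# but a data-dependent bug could in principle hide behind one fixed draw.
rng_seeded = np.random.default_rng(1234)

# Unseeded: different data every run; a flaky implementation surfaces over time,
# at the cost of occasional precision-related failures.
rng_unseeded = np.random.default_rng()
```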
