-
-
Notifications
You must be signed in to change notification settings - Fork 77
Fix Enzyme sparse matrix sparsity pattern corruption (issue #835) #860
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix Enzyme sparse matrix sparsity pattern corruption (issue #835) #860
Conversation
e2cf5a5 to
98949fb
Compare
Updated based on feedbackAddressed two concerns: 1. GPU SafetyChanged the implementation from nested loops with scalar indexing to vectorized operations: # Build column indices vector from colptr
col_indices = _expand_colptr_to_col_indices(colptr, n_cols, nnz_count)
# Vectorized update instead of nested loop
vals .-= z[rows] .* y[col_indices]The 2. Non-zeros concernThe dense fallback |
98949fb to
223cc46
Compare
Updated implementationBased on feedback, the implementation has been revised to use a non-allocating loop for sparse matrix operations: function _sparse_outer_sub!(dA::SparseMatrixCSC, z::AbstractVector, y::AbstractVector)
rows = rowvals(dA)
vals = nonzeros(dA)
colptr = getcolptr(dA)
# Non-allocating loop over CSC structure
# This is efficient and cache-friendly (column-major order)
@inbounds for col in 1:size(dA, 2)
y_col = y[col]
for idx in colptr[col]:(colptr[col + 1] - 1)
vals[idx] -= z[rows[idx]] * y_col
end
end
return dA
endKey points:
The Note: The dense matrix Enzyme tests are failing on main as a pre-existing issue (unrelated to this PR). |
The issue: Enzyme.make_zero shares structural arrays (rowval, colptr) between primal and shadow sparse matrices. Broadcast operations like `dA .-= z * y'` can change the sparsity pattern, corrupting both shadow AND primal matrices. The fix: Add sparse-safe helper functions that operate directly on the nonzeros array to preserve the sparsity pattern: - _safe_add!: Add arrays preserving sparsity pattern - _safe_zero!: Zero arrays preserving sparsity pattern - _sparse_outer_sub!: Compute outer product subtraction preserving sparsity pattern Supports both CPU and GPU sparse matrices: - SparseMatrixCSC (CPU): Uses non-allocating loop for efficiency - AbstractSparseMatrixCSC (GPU): Uses vectorized operations with O(nnz) allocation for GPU-compatible broadcasting Also added SparseArrays as a dependency for the LinearSolveEnzymeExt extension. Note: The dense matrix Enzyme tests are failing on main (pre-existing issue). 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <[email protected]>
223cc46 to
eb1d270
Compare
Updated: Added GPU sparse matrix supportNow supports both CPU and GPU sparse matrices: CPU sparse (
|
Summary
nzvalarrays instead of using broadcast operations that can change sparsityRoot Cause
Enzyme.make_zerocreates shadow sparse matrices that share the structural arrays (rowval,colptr) with the primal matrix. When the reverse rule executes broadcast operations likedA .-= z * transpose(y), these can change the sparsity pattern of the shadow matrix, and because the structural arrays are shared, this inadvertently corrupts the primal matrix's structure.Evidence from investigation:
Solution
Add sparse-safe helper functions that dispatch on
AbstractSparseMatrix:_safe_add!(dst, src): For sparse matrices, adds values viadst.nzval .+= src.nzval_safe_zero!(A): For sparse matrices, zeros viafill!(A.nzval, 0)_sparse_outer_sub!(dA, z, y): For sparse matrices, only accumulates gradients into existing non-zero positions using a direct loop over the CSC structureThese preserve the sparsity pattern by operating only on the
nzvalarray.Test plan
AssertionError: _goodbuffers(S)crashFixes #835
🤖 Generated with Claude Code