-
Hi! Thank you for releasing the Python API for cuDSS. I am trying to solve a linear system using cuDSS, but I'm encountering an issue where the solve phase fails unless I insert an extra refactorization step. Could you please let me know what I might be doing wrong? Please find below a Minimal Working Example that demonstrates the issue:

```python
import cupy as cp
import nvmath
from nvmath.bindings import cudss

n = 3
nnz = 9
nrhs = 1

# Initialize data for the sparse matrix A in CSR format
csr_offsets_h = cp.array([0, 3, 6, 9], dtype=cp.int32)
csr_columns_h = cp.array([0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=cp.int32)
csr_values_h = cp.array([3., 1., 2.,
                         1., 4., 1.,
                         2., 1., 3.], dtype=cp.float64, order='F')

# Initialize data for the right-hand side vector b and the solution x
b_values_h = cp.array([[11.], [12.], [13.]], dtype=cp.float64, order='F')
x_values_h = cp.zeros_like(b_values_h, order='F')

handle = cudss.create()
config = cudss.config_create()
dta = cudss.data_create(handle)

bb = cudss.matrix_create_dn(n, nrhs, n, b_values_h.data.ptr,
                            nvmath.CudaDataType.CUDA_R_64F, cudss.Layout.COL_MAJOR)
xx = cudss.matrix_create_dn(n, nrhs, n, x_values_h.data.ptr,
                            nvmath.CudaDataType.CUDA_R_64F, cudss.Layout.COL_MAJOR)
aa = cudss.matrix_create_csr(n, n, nnz, csr_offsets_h.data.ptr, 0,
                             csr_columns_h.data.ptr, csr_values_h.data.ptr,
                             nvmath.CudaDataType.CUDA_R_32I, nvmath.CudaDataType.CUDA_R_64F,
                             cudss.MatrixType.GENERAL, cudss.MatrixViewType.FULL,
                             cudss.IndexBase.ZERO)

cudss.execute(handle, cudss.Phase.ANALYSIS, config, dta, aa, xx, bb)
cudss.execute(handle, cudss.Phase.FACTORIZATION, config, dta, aa, xx, bb)
# execute refactorization phase (without this the solve phase fails)
cudss.execute(handle, cudss.Phase.REFACTORIZATION, config, dta, aa, xx, bb)
cudss.execute(handle, cudss.Phase.SOLVE, config, dta, aa, xx, bb)
cp.cuda.Stream.null.synchronize()

print(x_values_h)

cudss.matrix_destroy(aa)
cudss.matrix_destroy(bb)
cudss.matrix_destroy(xx)
cudss.data_destroy(handle, dta)
cudss.config_destroy(config)
cudss.destroy(handle)
```

Thank you for your time!
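For reference, the expected answer for this small system can be checked on the host, independently of cuDSS. This is a sanity-check sketch that reuses the same CSR arrays with SciPy's sparse direct solver (SciPy is assumed to be available; it is not part of the original example):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import spsolve

# Same CSR data as in the cuDSS example, but as host arrays
offsets = np.array([0, 3, 6, 9], dtype=np.int32)
columns = np.array([0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=np.int32)
values = np.array([3., 1., 2., 1., 4., 1., 2., 1., 3.])
b = np.array([11., 12., 13.])

# Build A from (values, column indices, row offsets) and solve A x = b
A = csr_matrix((values, columns, offsets), shape=(3, 3))
x = spsolve(A, b)
print(x)  # expected solution: [1. 2. 3.]
```

If the cuDSS run is working correctly, `x_values_h` should match this result.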
-
Hi @aaliq01

Thank you for taking the time to describe the problem you're experiencing.

I ran your code and it seems to produce the expected results (even without the extra REFACTORIZATION) as long as I use cuDSS 0.5.0. With `nvidia-cudss-cu12-0.5.0.16` the results are as expected, while with `nvidia-cudss-cu12-0.6.0.5` I see the behavior you are describing. You can query the version used with pip.

The Python bindings were created to support the cuDSS 0.5.0 version; currently, this is nvmath's strict dependency for the sparse solver. With pip, `pip install nvmath-python[cu12]==0.5.0` should make sure the compatible cuDSS version is installed.

Please let us know if downgrading cuDSS resolves your problem. Is using 0.5.0 OK for you, or do you need some particular features that are present only in the most recent cuDSS release? |
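To make the version check and downgrade concrete, here is a sketch for a cu12 environment; the package name and the `0.5.0.16` build are the ones mentioned in this thread, so adjust them if your setup differs:

```shell
# Inspect which cuDSS build is currently installed
pip show nvidia-cudss-cu12

# Pin cuDSS to the 0.5.0 build that the Python bindings target
pip install "nvidia-cudss-cu12==0.5.0.16"
```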
-
Hi @stiepan

Thank you so much for your reply. Downgrading cuDSS from 0.6.0 to 0.5.0 does solve the issue. I checked the release notes, and I think version 0.5.0 satisfies most of my current requirements. Thanks again! |