3.5. Refactorization and direct solvers
These are functions that implement direct solvers for sparse systems with different coefficient matrices that share the same sparsity pattern. The refactorization functions are divided into the following categories:
Initialization and meta data. Basic functions to initialize and destroy meta data.
Triangular refactorization. Refactorization of new matrices given a known sparsity pattern.
Direct sparse solvers. Based on triangular refactorization.
Note
Throughout the APIs’ descriptions, we use the following notations:
x[i] stands for the i-th element of vector x, while A[i,j] represents the element in the i-th row and j-th column of matrix A. Indices are 1-based, i.e. x[1] is the first element of x.
If X is a real vector or matrix, \(X^T\) indicates its transpose; if X is complex, then \(X^H\) represents its conjugate transpose. When X could be real or complex, we use X’ to indicate X transposed or X conjugate transposed, accordingly.
x_i \(=x_i\); we sometimes use both notations, \(x_i\) when displaying mathematical equations, and x_i in the text describing the function parameters.
3.5.1. Initialization and meta data
3.5.1.1. rocsolver_create_rfinfo()

rocblas_status rocsolver_create_rfinfo(rocsolver_rfinfo *rfinfo, rocblas_handle handle)
CREATE_RFINFO initializes the structure rfinfo, required by the refactorization functions CSRRF_REFACTLU and CSRRF_SOLVE, that contains the meta data and descriptors of the involved matrices.
 Parameters:
rfinfo – [out]
pointer to rocsolver_rfinfo.
The pointer to the rfinfo struct to be initialized.
handle – [in] rocblas_handle.
3.5.1.2. rocsolver_destroy_rfinfo()

rocblas_status rocsolver_destroy_rfinfo(rocsolver_rfinfo rfinfo)
DESTROY_RFINFO destroys the structure rfinfo used by the refactorization functions CSRRF_REFACTLU and CSRRF_SOLVE.
 Parameters:
rfinfo – [in]
rocsolver_rfinfo.
The rfinfo struct to be destroyed.
3.5.1.3. rocsolver_<type>csrrf_analysis()

rocblas_status rocsolver_dcsrrf_analysis(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, const rocblas_int nnzM, rocblas_int *ptrM, rocblas_int *indM, double *valM, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, double *valT, rocblas_int *pivP, rocblas_int *pivQ, double *B, const rocblas_int ldb, rocsolver_rfinfo rfinfo)

rocblas_status rocsolver_scsrrf_analysis(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, const rocblas_int nnzM, rocblas_int *ptrM, rocblas_int *indM, float *valM, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, float *valT, rocblas_int *pivP, rocblas_int *pivQ, float *B, const rocblas_int ldb, rocsolver_rfinfo rfinfo)
CSRRF_ANALYSIS performs the analysis phase required by the refactorization functions CSRRF_REFACTLU and CSRRF_SOLVE.
Consider a sparse matrix \(M\) previously factorized as
\[ PMQ = L_M U_M \]
where \(L_M\) is lower triangular with unit diagonal, \(U_M\) is upper triangular, and \(P\) and \(Q\) are permutation matrices associated with pivoting and reordering (to minimize fill-in), respectively. The meta data generated by this routine is collected in the output parameter rfinfo. This information allows the fast LU refactorization of another sparse matrix \(A\) as
\[ PAQ = L_A U_A \]
and, eventually, the computation of the solution \(X\) of any linear system of the form
\[ AX = B \]
as long as \(A\) has the same sparsity pattern as the previous matrix \(M\).
This function supposes that the LU factors \(L_M\) and \(U_M\) are passed in a bundle matrix \(T=(L_M-I)+U_M\) as returned by CSRRF_SUMLU, and that rfinfo has been initialized by CREATE_RFINFO.
Note
If only a refactorization will be executed (i.e. no solver phase), then nrhs can be set to zero and B can be null.
 Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows (and columns) of matrix M.
nrhs – [in]
rocblas_int. nrhs >= 0.
The number of right-hand sides (columns of matrix B). Set nrhs to zero when only the refactorization is needed.
nnzM – [in]
rocblas_int. nnzM >= 0.
The number of nonzero elements in M.
ptrM – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indM and valM. The last element of ptrM is equal to nnzM.
indM – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzM.
It contains the column indices of the nonzero elements of M. Indices are sorted by row and by column within each row.
valM – [in]
pointer to type. Array on the GPU of dimension nnzM.
The values of the nonzero elements of M.
nnzT – [in]
rocblas_int. nnzT >= 0.
The number of nonzero elements in T.
ptrT – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indT and valT. The last element of ptrT is equal to nnzT.
indT – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzT.
It contains the column indices of the nonzero elements of T. Indices are sorted by row and by column within each row.
valT – [in]
pointer to type. Array on the GPU of dimension nnzT.
The values of the nonzero elements of T.
pivP – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix P, i.e. the order in which the rows of matrix M were rearranged.
pivQ – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix Q, i.e. the order in which the columns of matrix M were rearranged.
B – [in]
pointer to type. Array on the GPU of dimension ldb*nrhs.
The right hand side matrix B. It can be null if only the refactorization is needed.
ldb – [in]
rocblas_int. ldb >= n.
The leading dimension of B.
rfinfo – [out]
rocsolver_rfinfo.
Structure that holds the meta data generated in the analysis phase.
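The ptr/ind/val triplets described above follow the usual CSR layout. As a minimal host-side sketch, the helper below checks the stated invariants (the last element of ptr equals nnz, and column indices are sorted within each row). The name csr_is_valid is hypothetical and not part of rocSOLVER, and the sketch assumes 0-based positions and column indices as stored in C arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical host-side helper (not a rocSOLVER function).
   Checks the CSR invariants described in the parameter list,
   assuming 0-based row positions and column indices. */
bool csr_is_valid(int n, int nnz, const int *ptr, const int *ind)
{
    assert(ptr != NULL);
    assert(ind != NULL || nnz == 0);

    if (n < 0 || nnz < 0 || ptr[0] != 0 || ptr[n] != nnz)
        return false;

    for (int i = 0; i < n; ++i) {
        /* row starts must be non-decreasing and stay within the arrays */
        if (ptr[i] > ptr[i + 1] || ptr[i + 1] > nnz)
            return false;
        for (int k = ptr[i]; k < ptr[i + 1]; ++k) {
            if (ind[k] < 0 || ind[k] >= n)
                return false; /* column index out of range */
            if (k > ptr[i] && ind[k - 1] >= ind[k])
                return false; /* columns must increase within a row */
        }
    }
    return true;
}
```

For example, the 3-by-3 matrix with rows (4, 0, 1), (0, 3, 0) and (2, 0, 5) would be stored under these assumptions as ptr = {0, 2, 3, 5}, ind = {0, 2, 1, 0, 2} and val = {4, 1, 3, 2, 5}, with nnz = 5.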
3.5.2. Triangular refactorization
3.5.2.1. rocsolver_<type>csrrf_sumlu()

rocblas_status rocsolver_dcsrrf_sumlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzL, rocblas_int *ptrL, rocblas_int *indL, double *valL, const rocblas_int nnzU, rocblas_int *ptrU, rocblas_int *indU, double *valU, rocblas_int *ptrT, rocblas_int *indT, double *valT)

rocblas_status rocsolver_scsrrf_sumlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzL, rocblas_int *ptrL, rocblas_int *indL, float *valL, const rocblas_int nnzU, rocblas_int *ptrU, rocblas_int *indU, float *valU, rocblas_int *ptrT, rocblas_int *indT, float *valT)
CSRRF_SUMLU bundles the factors \(L\) and \(U\), associated with the LU factorization of a sparse matrix \(A\), into a single sparse matrix \(T=(L-I)+U\).
Factor \(L\) is a sparse lower triangular matrix with unit diagonal elements, and \(U\) is a sparse upper triangular matrix. The resulting sparse matrix \(T\) combines both sparse factors without storing the unit diagonal; in other words, the number of nonzero elements of T, nnzT, is given by nnzT = nnzL - n + nnzU.
 Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows (and columns) of matrix A.
nnzL – [in]
rocblas_int. nnzL >= n.
The number of nonzero elements in L.
ptrL – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indL and valL. The last element of ptrL is equal to nnzL.
indL – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzL.
It contains the column indices of the nonzero elements of L. Indices are sorted by row and by column within each row.
valL – [in]
pointer to type. Array on the GPU of dimension nnzL.
The values of the nonzero elements of L.
nnzU – [in]
rocblas_int. nnzU >= 0.
The number of nonzero elements in U.
ptrU – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indU and valU. The last element of ptrU is equal to nnzU.
indU – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzU.
It contains the column indices of the nonzero elements of U. Indices are sorted by row and by column within each row.
valU – [in]
pointer to type. Array on the GPU of dimension nnzU.
The values of the nonzero elements of U.
ptrT – [out]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indT and valT. The last element of ptrT is equal to nnzT.
indT – [out]
pointer to rocblas_int. Array on the GPU of dimension nnzT.
It contains the column indices of the nonzero elements of T. Indices are sorted by row and by column within each row.
valT – [out]
pointer to type. Array on the GPU of dimension nnzT.
The values of the nonzero elements of T.
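To make the bundling concrete, the following is a CPU reference sketch of the operation \(T=(L-I)+U\) on CSR data. It is not the rocSOLVER implementation: sumlu_host is a hypothetical name, and the sketch assumes 0-based indices, double precision, and that L stores its unit diagonal explicitly (consistent with nnzL >= n).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference (not the rocSOLVER implementation).
   Merges CSR factors L (unit diagonal stored) and U into T = (L - I) + U.
   Assumes 0-based indices and columns sorted within each row.
   indT and valT must hold nnzL - n + nnzU entries. Returns nnzT. */
int sumlu_host(int n,
               const int *ptrL, const int *indL, const double *valL,
               const int *ptrU, const int *indU, const double *valU,
               int *ptrT, int *indT, double *valT)
{
    assert(n >= 0);
    assert(ptrL != NULL && ptrU != NULL && ptrT != NULL);

    int nnzT = 0;
    ptrT[0] = 0;
    for (int i = 0; i < n; ++i) {
        /* strictly lower part of row i comes from L; the unit diagonal is dropped */
        for (int k = ptrL[i]; k < ptrL[i + 1]; ++k) {
            if (indL[k] < i) {
                indT[nnzT] = indL[k];
                valT[nnzT] = valL[k];
                ++nnzT;
            }
        }
        /* diagonal and upper part of row i come from U */
        for (int k = ptrU[i]; k < ptrU[i + 1]; ++k) {
            indT[nnzT] = indU[k];
            valT[nnzT] = valU[k];
            ++nnzT;
        }
        ptrT[i + 1] = nnzT;
    }
    return nnzT;
}
```

With L = [1 0; 0.5 1] (nnzL = 3) and U = [4 2; 0 3] (nnzU = 3) for n = 2, the sketch produces T = [4 2; 0.5 3] with nnzT = 3 - 2 + 3 = 4, matching the formula above.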
3.5.2.2. rocsolver_<type>csrrf_splitlu()

rocblas_status rocsolver_dcsrrf_splitlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, double *valT, rocblas_int *ptrL, rocblas_int *indL, double *valL, rocblas_int *ptrU, rocblas_int *indU, double *valU)

rocblas_status rocsolver_scsrrf_splitlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, float *valT, rocblas_int *ptrL, rocblas_int *indL, float *valL, rocblas_int *ptrU, rocblas_int *indU, float *valU)
CSRRF_SPLITLU splits the factors \(L\) and \(U\), associated with the LU factorization of a sparse matrix \(A\), from a bundled matrix \(T=(L-I)+U\).
Factor \(L\) is a sparse lower triangular matrix with unit diagonal elements, and \(U\) is a sparse upper triangular matrix. Conceptually, on input, \(U\) is stored on the diagonal and upper part of \(T\), while the off-diagonal elements of \(L\) are stored on the strictly lower part of \(T\).
 Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows (and columns) of matrix A.
nnzT – [in]
rocblas_int. nnzT >= 0.
The number of nonzero elements in T.
ptrT – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indT and valT. The last element of ptrT is equal to nnzT.
indT – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzT.
It contains the column indices of the nonzero elements of T. Indices are sorted by row and by column within each row.
valT – [in]
pointer to type. Array on the GPU of dimension nnzT.
The values of the nonzero elements of T.
ptrL – [out]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indL and valL. The last element of ptrL is equal to nnzL.
indL – [out]
pointer to rocblas_int. Array on the GPU of dimension nnzL.
It contains the column indices of the nonzero elements of L. Indices are sorted by row and by column within each row. (If nnzL is not known in advance, the size of this array could be set to nnzT + n as an upper bound).
valL – [out]
pointer to type. Array on the GPU of dimension nnzL.
The values of the nonzero elements of L. (If nnzL is not known in advance, the size of this array could be set to nnzT + n as an upper bound).
ptrU – [out]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indU and valU. The last element of ptrU is equal to nnzU.
indU – [out]
pointer to rocblas_int. Array on the GPU of dimension nnzU.
It contains the column indices of the nonzero elements of U. Indices are sorted by row and by column within each row. (If nnzU is not known in advance, the size of this array could be set to nnzT as an upper bound).
valU – [out]
pointer to type. Array on the GPU of dimension nnzU.
The values of the nonzero elements of U. (If nnzU is not known in advance, the size of this array could be set to nnzT as an upper bound).
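The inverse operation can be sketched on the CPU in the same way. The function below is a hypothetical reference (splitlu_host is not a rocSOLVER name) that assumes 0-based indices, double precision, and output buffers sized with the upper bounds given above: nnzT + n entries for L and nnzT entries for U.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference (not the rocSOLVER implementation).
   Splits T = (L - I) + U into L (with its unit diagonal restored) and U.
   Assumes 0-based indices and columns sorted within each row.
   indL/valL need up to nnzT + n entries; indU/valU need up to nnzT entries. */
void splitlu_host(int n,
                  const int *ptrT, const int *indT, const double *valT,
                  int *ptrL, int *indL, double *valL,
                  int *ptrU, int *indU, double *valU)
{
    assert(n >= 0);
    assert(ptrT != NULL && ptrL != NULL && ptrU != NULL);

    int nl = 0; /* running count of entries written to L */
    int nu = 0; /* running count of entries written to U */
    ptrL[0] = 0;
    ptrU[0] = 0;
    for (int i = 0; i < n; ++i) {
        for (int k = ptrT[i]; k < ptrT[i + 1]; ++k) {
            if (indT[k] < i) { /* strictly lower part belongs to L */
                indL[nl] = indT[k];
                valL[nl] = valT[k];
                ++nl;
            } else {           /* diagonal and upper part belong to U */
                indU[nu] = indT[k];
                valU[nu] = valT[k];
                ++nu;
            }
        }
        /* restore the unit diagonal of L as the last entry of row i */
        indL[nl] = i;
        valL[nl] = 1.0;
        ++nl;
        ptrL[i + 1] = nl;
        ptrU[i + 1] = nu;
    }
}
```

After the call, ptrL[n] and ptrU[n] give nnzL and nnzU, so a caller that allocated with the upper bounds can learn the exact sizes from the row pointers.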
3.5.2.3. rocsolver_<type>csrrf_refactlu()

rocblas_status rocsolver_dcsrrf_refactlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzA, rocblas_int *ptrA, rocblas_int *indA, double *valA, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, double *valT, rocblas_int *pivP, rocblas_int *pivQ, rocsolver_rfinfo rfinfo)

rocblas_status rocsolver_scsrrf_refactlu(rocblas_handle handle, const rocblas_int n, const rocblas_int nnzA, rocblas_int *ptrA, rocblas_int *indA, float *valA, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, float *valT, rocblas_int *pivP, rocblas_int *pivQ, rocsolver_rfinfo rfinfo)
CSRRF_REFACTLU performs a fast LU factorization of a sparse matrix \(A\) based on the information from the factorization of a previous matrix \(M\) with the same sparsity pattern (refactorization).
Consider a sparse matrix \(M\) previously factorized as
\[ PMQ = L_M U_M \]
where \(L_M\) is lower triangular with unit diagonal, \(U_M\) is upper triangular, and \(P\) and \(Q\) are permutation matrices associated with pivoting and reordering (to minimize fill-in), respectively. If \(A\) has the same sparsity pattern as \(M\), then the refactorization
\[ PAQ = L_A U_A \]
can be computed numerically without any symbolic or analysis phases.
This function supposes that rfinfo has been updated, by function CSRRF_ANALYSIS, after the analysis phase of the previous matrix M and its initial factorization.
 Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows (and columns) of matrix A.
nnzA – [in]
rocblas_int. nnzA >= 0.
The number of nonzero elements in A.
ptrA – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indA and valA. The last element of ptrA is equal to nnzA.
indA – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzA.
It contains the column indices of the nonzero elements of A. Indices are sorted by row and by column within each row.
valA – [in]
pointer to type. Array on the GPU of dimension nnzA.
The values of the nonzero elements of A.
nnzT – [in]
rocblas_int. nnzT >= 0.
The number of nonzero elements in T.
ptrT – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indT and valT. The last element of ptrT is equal to nnzT.
indT – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzT.
It contains the column indices of the nonzero elements of T. Indices are sorted by row and by column within each row.
valT – [out]
pointer to type. Array on the GPU of dimension nnzT.
The values of the nonzero elements of the new bundle matrix (L_A - I) + U_A.
pivP – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix P, i.e. the order in which the rows of matrix M were rearranged.
pivQ – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix Q, i.e. the order in which the columns of matrix M were rearranged.
rfinfo – [in]
rocsolver_rfinfo.
Structure that holds the meta data generated in the analysis phase.
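As a rough illustration of what the refactorization computes numerically, the dense CPU sketch below forms \(PAQ\) from fixed permutations and factorizes it without any new pivoting, leaving the result in bundled form \((L_A-I)+U_A\). It is only a dense analogue: the real routine works on CSR data and reuses the sparsity pattern held in rfinfo. The name refact_dense is hypothetical, and the sketch assumes 0-based pivot arrays in which row i of \(PAQ\) is row pivP[i] of \(A\) and column j is column pivQ[j] of \(A\); that convention is an assumption, not something stated in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dense CPU illustration (not the rocSOLVER algorithm).
   A and W are n-by-n, row-major. On exit W holds (L_A - I) + U_A,
   where P*A*Q = L_A*U_A and no pivoting is performed here.
   Assumed convention: (PAQ)[i][j] = A[pivP[i]][pivQ[j]], 0-based. */
void refact_dense(int n, const double *A,
                  const int *pivP, const int *pivQ, double *W)
{
    assert(n >= 0);
    assert(n == 0 || (A != NULL && pivP != NULL && pivQ != NULL && W != NULL));

    /* apply the stored row and column permutations */
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j)
            W[i * n + j] = A[pivP[i] * n + pivQ[j]];

    /* in-place LU elimination that reuses the given ordering */
    for (int k = 0; k < n; ++k) {
        for (int i = k + 1; i < n; ++i) {
            W[i * n + k] /= W[k * n + k]; /* multiplier, an entry of L_A */
            for (int j = k + 1; j < n; ++j)
                W[i * n + j] -= W[i * n + k] * W[k * n + j];
        }
    }
}
```

Because the pivot order is inherited from the factorization of M, a zero or tiny diagonal entry in the new matrix is not repaired by this sketch; it simply divides by whatever W[k][k] turns out to be.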
3.5.3. Direct sparse solvers
3.5.3.1. rocsolver_<type>csrrf_solve()

rocblas_status rocsolver_dcsrrf_solve(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, double *valT, rocblas_int *pivP, rocblas_int *pivQ, double *B, const rocblas_int ldb, rocsolver_rfinfo rfinfo)

rocblas_status rocsolver_scsrrf_solve(rocblas_handle handle, const rocblas_int n, const rocblas_int nrhs, const rocblas_int nnzT, rocblas_int *ptrT, rocblas_int *indT, float *valT, rocblas_int *pivP, rocblas_int *pivQ, float *B, const rocblas_int ldb, rocsolver_rfinfo rfinfo)
CSRRF_SOLVE solves a linear system with sparse coefficient matrix \(A\) in its factorized form.
The linear system is of the form
\[ AX = B \]
where the sparse matrix \(A\) is factorized as
\[ PAQ = L_A U_A \]
and \(B\) is a dense matrix of right hand sides.
This function supposes that the LU factors \(L_A\) and \(U_A\) are passed in a bundle matrix \(T=(L_A-I)+U_A\) as returned by CSRRF_REFACTLU or CSRRF_SUMLU, and that rfinfo has been updated, by function CSRRF_ANALYSIS, after the analysis phase.
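One way to see how the factorized form leads to the solution is the following short derivation. It only uses the fact that permutation matrices are orthogonal; the auxiliary matrices \(Y\) and \(Z\) are introduced here for illustration, and the order in which the library actually performs these steps is not specified in this section. Since \(P^{-1}=P^T\) and \(Q^{-1}=Q^T\),
\[ A = P^T L_A U_A Q^T \]
so that \(AX = B\) is equivalent to
\[ L_A U_A (Q^T X) = PB \]
The solution can then be obtained in three steps: a forward substitution with the unit lower triangular factor, a backward substitution with the upper triangular factor, and a final permutation,
\[ L_A Y = PB, \qquad U_A Z = Y, \qquad X = QZ \]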
 Parameters:
handle – [in] rocblas_handle.
n – [in]
rocblas_int. n >= 0.
The number of rows (and columns) of matrix A.
nrhs – [in]
rocblas_int. nrhs >= 0.
The number of right hand sides, i.e. the number of columns of matrix B.
nnzT – [in]
rocblas_int. nnzT >= 0.
The number of nonzero elements in T.
ptrT – [in]
pointer to rocblas_int. Array on the GPU of dimension n+1.
It contains the positions of the beginning of each row in indT and valT. The last element of ptrT is equal to nnzT.
indT – [in]
pointer to rocblas_int. Array on the GPU of dimension nnzT.
It contains the column indices of the nonzero elements of T. Indices are sorted by row and by column within each row.
valT – [in]
pointer to type. Array on the GPU of dimension nnzT.
The values of the nonzero elements of T.
pivP – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix P, i.e. the order in which the rows of matrix A were rearranged.
pivQ – [in]
pointer to rocblas_int. Array on the GPU of dimension n.
Contains the pivot indices representing the permutation matrix Q, i.e. the order in which the columns of matrix A were rearranged.
B – [inout]
pointer to type. Array on the GPU of dimension ldb*nrhs.
On entry, the right-hand side matrix B. On exit, the solution matrix X.
ldb – [in]
rocblas_int. ldb >= n.
The leading dimension of B.
rfinfo – [in]
rocsolver_rfinfo.
Structure that holds the meta data generated in the analysis phase.
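The solve phase can be sketched on the CPU for a single right-hand side as a permutation, a forward substitution with the unit lower triangular part of T, a backward substitution with the upper triangular part, and a final permutation. This is a reference sketch under stated assumptions, not the rocSOLVER implementation: solve_host is a hypothetical name, indices are 0-based, and row i of \(PAQ\) is taken to be row pivP[i] of \(A\) while column j is column pivQ[j] of \(A\).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical CPU reference for one right-hand side (not rocSOLVER code).
   T = (L - I) + U in CSR, 0-based, columns sorted within each row, with
   P*A*Q = L*U. Assumed convention: (PAQ)[i][j] = A[pivP[i]][pivQ[j]].
   On entry b holds the right-hand side; on exit it holds the solution.
   Returns 0 on success, -1 if the work buffer cannot be allocated. */
int solve_host(int n, const int *ptrT, const int *indT, const double *valT,
               const int *pivP, const int *pivQ, double *b)
{
    assert(n >= 0);
    assert(n == 0 || (ptrT != NULL && pivP != NULL && pivQ != NULL && b != NULL));

    double *y = malloc((size_t)n * sizeof *y);
    if (n > 0 && y == NULL)
        return -1;

    /* y = P b */
    for (int i = 0; i < n; ++i)
        y[i] = b[pivP[i]];

    /* forward substitution with L (unit diagonal, strictly lower part of T) */
    for (int i = 0; i < n; ++i)
        for (int k = ptrT[i]; k < ptrT[i + 1] && indT[k] < i; ++k)
            y[i] -= valT[k] * y[indT[k]];

    /* backward substitution with U (diagonal and upper part of T) */
    for (int i = n - 1; i >= 0; --i) {
        double diag = 0.0;
        for (int k = ptrT[i]; k < ptrT[i + 1]; ++k) {
            if (indT[k] == i)
                diag = valT[k];
            else if (indT[k] > i)
                y[i] -= valT[k] * y[indT[k]];
        }
        y[i] /= diag;
    }

    /* x = Q z, written back into b */
    for (int j = 0; j < n; ++j)
        b[pivQ[j]] = y[j];

    free(y);
    return 0;
}
```

For several right-hand sides, the same steps would be repeated for each column of B; assuming column-major storage, column j starts at offset j*ldb.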