Preconditioner functions#
This module contains all sparse preconditioners.
The sparse preconditioners describe manipulations on a matrix in sparse format to obtain a sparse preconditioner matrix.
hipsparseXbsrilu02_zeroPivot()#
-
hipsparseStatus_t hipsparseXbsrilu02_zeroPivot(hipsparseHandle_t handle, bsrilu02Info_t info, int *position)#
hipsparseXbsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT if either a structural or numerical zero has been found during hipsparseXbsrilu02_analysis() or hipsparseXbsrilu02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the BSR matrix. position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
If a zero pivot is found, position \(=j\) means that either the diagonal block \(A_{j,j}\) is missing (structural zero) or the diagonal block \(A_{j,j}\) is not invertible (numerical zero).
Note
hipsparseXbsrilu02_zeroPivot is a blocking function. It might influence performance negatively.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.
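The meaning of position can be illustrated on the host. The following is a minimal sketch, not hipSPARSE code: firstZeroPivot is a hypothetical helper that assumes a block dimension of 1 (each block is a single scalar) and scans stored values, for example a factor computed on the host, for the first diagonal entry that is missing (structural zero) or equal to zero (numerical zero). Like the library function, it reports the index in the matrix's own index base, or -1 when every pivot is usable.

```cpp
#include <vector>

// Hypothetical host-side helper (not part of hipSPARSE).
// Assumes blockDim == 1, so every BSR block is a single scalar.
// Returns the index (in the given index base) of the first row whose
// diagonal entry is missing or zero, or -1 if no such row exists.
int firstZeroPivot(const std::vector<int>&   rowPtr,
                   const std::vector<int>&   colInd,
                   const std::vector<float>& val,
                   int                       indexBase)
{
    const int mb = static_cast<int>(rowPtr.size()) - 1;
    for(int i = 0; i < mb; ++i)
    {
        bool usable = false;
        for(int k = rowPtr[i] - indexBase; k < rowPtr[i + 1] - indexBase; ++k)
        {
            if(colInd[k] - indexBase == i)
            {
                // Diagonal entry is stored; it is usable only if it is non-zero.
                usable = (val[k] != 0.0f);
                break;
            }
        }
        if(!usable)
        {
            return i + indexBase;
        }
    }
    return -1;
}
```

For a one-based matrix the returned index is one-based as well, which mirrors the statement that position uses the same index base as the matrix.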
hipsparseXbsrilu02_numericBoost()#
-
hipsparseStatus_t hipsparseSbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, float *boost_val)#
-
hipsparseStatus_t hipsparseDbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, double *boost_val)#
-
hipsparseStatus_t hipsparseCbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, hipComplex *boost_val)#
-
hipsparseStatus_t hipsparseZbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, hipDoubleComplex *boost_val)#
hipsparseXbsrilu02_numericBoost enables the user to replace a numerical value in an incomplete LU factorization. tol is used to determine whether a numerical value is replaced by boost_val, such that \(A_{j,j} = \text{boost_val}\) if \(\text{tol} \ge \left|A_{j,j}\right|\).
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
The boost value is enabled by setting enable_boost to 1 or disabled by setting enable_boost to 0.
Note
tol and boost_val can be in host or device memory.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
enable_boost – [in] enable/disable numeric boost.
tol – [in] tolerance to determine whether a numerical value is replaced or not.
boost_val – [in] boost value to replace a numerical value.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info, tol or boost_val is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
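The replacement rule \(A_{j,j} = \text{boost_val}\) if \(\text{tol} \ge \left|A_{j,j}\right|\) can be written out for a single scalar pivot. This is a host-side sketch of the rule only, assuming a real-valued pivot; boostPivot is a hypothetical name and does not reflect how the library applies the boost internally.

```cpp
#include <cmath>

// Hypothetical illustration of the numeric boost rule (not a hipSPARSE function).
// If boosting is enabled and |pivot| <= tol, the pivot is replaced by boostVal;
// otherwise the pivot is returned unchanged.
float boostPivot(float pivot, bool enableBoost, double tol, float boostVal)
{
    if(enableBoost && tol >= std::fabs(static_cast<double>(pivot)))
    {
        return boostVal;
    }
    return pivot;
}
```

Note that the comparison is inclusive, so a pivot whose magnitude equals tol is also replaced.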
hipsparseXbsrilu02_bufferSize()#
-
hipsparseStatus_t hipsparseSbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#
hipsparseXbsrilu02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXbsrilu02_analysis() and hipsparseXbsrilu02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSbsrilu02_analysis(), hipsparseDbsrilu02_analysis(), hipsparseCbsrilu02_analysis(), hipsparseZbsrilu02_analysis(), hipsparseSbsrilu02(), hipsparseDbsrilu02(), hipsparseCbsrilu02() and hipsparseZbsrilu02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA, bsrSortedRowPtrA, bsrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
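The array lengths stated for bsrSortedValA (nnzb*blockDim*blockDim), bsrSortedRowPtrA (mb+1) and bsrSortedColIndA (nnzb) can be checked with a small host-side conversion. The sketch below is illustrative only: Bsr and denseToBsr are hypothetical names, the matrix is assumed to be square with a size divisible by blockDim, indices are zero-based, and the values inside each block are assumed to be stored row by row.

```cpp
#include <vector>

// Hypothetical container for BSR arrays (not a hipSPARSE type).
struct Bsr
{
    int                mb       = 0; // number of block rows
    int                nnzb     = 0; // number of non-zero blocks
    int                blockDim = 0; // block dimension
    std::vector<int>   rowPtr;       // mb + 1 entries
    std::vector<int>   colInd;       // nnzb entries
    std::vector<float> val;          // nnzb * blockDim * blockDim entries
};

// Hypothetical helper: convert a dense, row-major m x m matrix to BSR.
// Assumes m is divisible by blockDim; a block is kept if any entry is non-zero.
Bsr denseToBsr(const std::vector<float>& dense, int m, int blockDim)
{
    Bsr out;
    out.mb       = m / blockDim;
    out.blockDim = blockDim;
    out.rowPtr.push_back(0);
    for(int bi = 0; bi < out.mb; ++bi)
    {
        for(int bj = 0; bj < out.mb; ++bj)
        {
            bool nonzero = false;
            for(int r = 0; r < blockDim; ++r)
                for(int c = 0; c < blockDim; ++c)
                    if(dense[(bi * blockDim + r) * m + bj * blockDim + c] != 0.0f)
                        nonzero = true;
            if(!nonzero)
                continue;
            out.colInd.push_back(bj);
            // Store the block values row by row.
            for(int r = 0; r < blockDim; ++r)
                for(int c = 0; c < blockDim; ++c)
                    out.val.push_back(dense[(bi * blockDim + r) * m + bj * blockDim + c]);
            ++out.nnzb;
        }
        out.rowPtr.push_back(out.nnzb);
    }
    return out;
}
```

With blockDim equal to 1 the three arrays coincide with a CSR representation of the same matrix.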
hipsparseXbsrilu02_analysis()#
-
hipsparseStatus_t hipsparseSbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseXbsrilu02_analysis performs the analysis step for hipsparseXbsrilu02(). It is expected that this function will be executed only once for a given matrix.
Note
If the matrix sparsity pattern changes, the gathered information will become invalid.
Note
This function is non-blocking and is executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA, bsrSortedRowPtrA, bsrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
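This section does not describe what the analysis step computes internally. As an illustration of why the result depends only on the sparsity pattern, the following sketch shows level scheduling, a common technique that a policy such as HIPSPARSE_SOLVE_POLICY_USE_LEVEL suggests. It is an assumption made for illustration, not the library's documented algorithm, and computeLevels is a hypothetical helper working on zero-based indices.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of hipSPARSE): assign every block row a level
// so that a row depends only on rows in strictly lower levels. Row i depends
// on row j whenever the pattern contains a block (i, j) with j < i.
std::vector<int> computeLevels(const std::vector<int>& rowPtr,
                               const std::vector<int>& colInd)
{
    const int        mb = static_cast<int>(rowPtr.size()) - 1;
    std::vector<int> level(mb, 0);
    for(int i = 0; i < mb; ++i)
    {
        for(int k = rowPtr[i]; k < rowPtr[i + 1]; ++k)
        {
            const int j = colInd[k];
            if(j < i)
            {
                level[i] = std::max(level[i], level[j] + 1);
            }
        }
    }
    return level;
}
```

Rows that share a level have no dependencies on each other and could be processed in parallel. Only rowPtr and colInd are read, which is consistent with the note that the gathered information stays valid until the sparsity pattern changes.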
hipsparseXbsrilu02()#
-
hipsparseStatus_t hipsparseSbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
Incomplete LU factorization with 0 fill-ins and no pivoting using BSR storage format.
hipsparseXbsrilu02 computes the incomplete LU factorization with 0 fill-ins and no pivoting of a sparse \(mb \times mb\) BSR matrix \(A\), such that
\[ A \approx LU \]
Computing the above incomplete LU factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXbsrilu02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXbsrilu02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseXbsrilu02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXbsrilu02() are complete, the temporary buffer can be deallocated.
hipsparseXbsrilu02 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling hipsparseXbsrilu02_zeroPivot().
Note
This function is non-blocking and is executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA_valM – [inout] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA_valM, bsrSortedRowPtrA or bsrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
#include <hip/hip_runtime.h>
#include <hipsparse/hipsparse.h>

#include <cstdio>
#include <vector>

// HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined macros that abort
// when a HIP or hipSPARSE call does not return success.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in BSR format for ILU(0) factorization.
    // The 'S' in Sbsrilu02 indicates single precision float.
    // We'll use a block size of 1 for simplicity, making it behave like CSR ILU.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m    = 4; // Number of rows
    int n    = 4; // Number of columns
    int bs   = 1; // Block size
    int mb   = m / bs; // Number of block rows
    int nb   = n / bs; // Number of block columns
    int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    std::vector<int> hbsrRowPtr = {0, 2, 5, 8, 10};

    // BSR column indices
    std::vector<int> hbsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float)
    std::vector<float> hbsrVal = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For ILU(0), the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRILU02 info
    bsrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsrilu02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(
        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(
        hipMemcpy(dbsrVal, hbsrVal.data(), sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_bufferSize(handle,
                                      HIPSPARSE_DIRECTION_COLUMN, // Block storage direction
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_analysis(handle,
                                    HIPSPARSE_DIRECTION_COLUMN,
                                    mb,
                                    nnzb,
                                    descr,
                                    dbsrVal,
                                    dbsrRowPtr,
                                    dbsrColInd,
                                    bs,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dbsrVal.
    HIPSPARSE_CHECK(hipsparseSbsrilu02(handle,
                                       HIPSPARSE_DIRECTION_COLUMN,
                                       mb,
                                       nnzb,
                                       descr,
                                       dbsrVal,
                                       dbsrRowPtr,
                                       dbsrColInd,
                                       bs,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    // hipsparseXbsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT in that case, so the
    // status is inspected directly instead of being passed straight to HIPSPARSE_CHECK.
    int zeroPivot = -1; // -1 if no zero pivot, otherwise the block row index of the first zero pivot
    hipsparseStatus_t pivotStatus = hipsparseXbsrilu02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at block row index %d\n",
               zeroPivot);
        // Handle the error (e.g., return, use a different preconditioner, etc.)
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("BSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    std::vector<float> hbsrVal_result(nnzb * bs * bs);
    HIP_CHECK(hipMemcpy(
        hbsrVal_result.data(), dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized BSR values (L and U combined):\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyBsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
#include <hip/hip_runtime.h>
#include <hipsparse/hipsparse.h>

#include <stdio.h>
#include <stdlib.h>

/* HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined macros that abort
   when a HIP or hipSPARSE call does not return success. */

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in BSR format for ILU(0) factorization.
    // The 'S' in Sbsrilu02 indicates single precision float.
    // We'll use a block size of 1 for simplicity, making it behave like CSR ILU.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m    = 4; // Number of rows
    int n    = 4; // Number of columns
    int bs   = 1; // Block size
    int mb   = m / bs; // Number of block rows
    int nb   = n / bs; // Number of block columns
    int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    int hbsrRowPtr[] = {0, 2, 5, 8, 10};

    // BSR column indices
    int hbsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float)
    float hbsrVal[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For ILU(0), the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRILU02 info
    bsrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsrilu02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(hipMemcpy(dbsrRowPtr, hbsrRowPtr, sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd, sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrVal, hbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_bufferSize(handle,
                                      HIPSPARSE_DIRECTION_COLUMN, // Block storage direction
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_analysis(handle,
                                    HIPSPARSE_DIRECTION_COLUMN,
                                    mb,
                                    nnzb,
                                    descr,
                                    dbsrVal,
                                    dbsrRowPtr,
                                    dbsrColInd,
                                    bs,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dbsrVal.
    HIPSPARSE_CHECK(hipsparseSbsrilu02(handle,
                                       HIPSPARSE_DIRECTION_COLUMN,
                                       mb,
                                       nnzb,
                                       descr,
                                       dbsrVal,
                                       dbsrRowPtr,
                                       dbsrColInd,
                                       bs,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    // hipsparseXbsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT in that case, so the
    // status is inspected directly instead of being passed straight to HIPSPARSE_CHECK.
    int zeroPivot = -1; // -1 if no zero pivot, otherwise the block row index of the first zero pivot
    hipsparseStatus_t pivotStatus = hipsparseXbsrilu02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at block row index %d\n",
               zeroPivot);
        // Handle the error (e.g., return, use a different preconditioner, etc.)
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("BSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    float* hbsrVal_result = (float*)malloc(nnzb * bs * bs * sizeof(float));
    HIP_CHECK(
        hipMemcpy(hbsrVal_result, dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized BSR values (L and U combined):\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    free(hbsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyBsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
program example_hipsparse_bsrilu02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseCreateBsrilu02Info(info) &
                bind(c, name = 'hipsparseCreateBsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateBsrilu02Info
            type(c_ptr) :: info
        end function hipsparseCreateBsrilu02Info

        function hipsparseDestroyBsrilu02Info(info) &
                bind(c, name = 'hipsparseDestroyBsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyBsrilu02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyBsrilu02Info

        function hipsparseSbsrilu02_bufferSize(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseSbsrilu02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseSbsrilu02_bufferSize

        function hipsparseSbsrilu02_analysis(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsrilu02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsrilu02_analysis

        function hipsparseSbsrilu02(handle, dirA, mb, nnzb, descrA, bsrSortedValA, bsrSortedRowPtrA, &
                bsrSortedColIndA, blockDim, info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsrilu02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsrilu02

        function hipsparseXbsrilu02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXbsrilu02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXbsrilu02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXbsrilu02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_DIRECTION_COLUMN = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize

    ! Block sparse matrix A (4x4 with block size 1)
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: bs = 1
    integer, parameter :: mb = m / bs
    integer, parameter :: nb = n / bs
    integer, parameter :: nnzb = 10

    integer, dimension(mb+1), target :: hbsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnzb), target :: hbsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal = (/1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0/)

    ! Result array
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal_result

    ! Device pointers
    type(c_ptr) :: dbsrRowPtr
    type(c_ptr) :: dbsrColInd
    type(c_ptr) :: dbsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) stop

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) stop

    ! Create bsrilu02 info
    stat = hipsparseCreateBsrilu02Info(info)
    if (stat /= 0) stop

    ! Allocate device memory
    stat = hipMalloc(dbsrRowPtr, int((mb + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrColInd, int(nnzb * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrVal, int(nnzb * bs * bs * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dbsrRowPtr, c_loc(hbsrRowPtr), int((mb + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrColInd, c_loc(hbsrColInd), int(nnzb * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrVal, c_loc(hbsrVal), int(nnzb * bs * bs * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseSbsrilu02_bufferSize(handle, &
                                         HIPSPARSE_DIRECTION_COLUMN, &
                                         mb, &
                                         nnzb, &
                                         descr, &
                                         dbsrVal, &
                                         dbsrRowPtr, &
                                         dbsrColInd, &
                                         bs, &
                                         info, &
                                         c_loc(bufferSize))
    if (stat /= 0) stop

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseSbsrilu02_analysis(handle, &
                                       HIPSPARSE_DIRECTION_COLUMN, &
                                       mb, &
                                       nnzb, &
                                       descr, &
                                       dbsrVal, &
                                       dbsrRowPtr, &
                                       dbsrColInd, &
                                       bs, &
                                       info, &
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                       dbuffer)
    if (stat /= 0) stop

    ! Perform factorization
    stat = hipsparseSbsrilu02(handle, &
                              HIPSPARSE_DIRECTION_COLUMN, &
                              mb, &
                              nnzb, &
                              descr, &
                              dbsrVal, &
                              dbsrRowPtr, &
                              dbsrColInd, &
                              bs, &
                              info, &
                              HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                              dbuffer)
    if (stat /= 0) stop

    ! Check for zero pivots
    stat = hipsparseXbsrilu02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'BSRILU02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hbsrVal_result), dbsrVal, int(nnzb * bs * bs * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized BSR values (L and U combined):'
    do i = 1, nnzb * bs * bs
        write(*,*) 'val[', i-1, '] =', hbsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dbsrRowPtr)
    stat = hipFree(dbsrColInd)
    stat = hipFree(dbsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyBsrilu02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_bsrilu02
hipsparseXcsrilu02_zeroPivot()#
-
hipsparseStatus_t hipsparseXcsrilu02_zeroPivot(hipsparseHandle_t handle, csrilu02Info_t info, int *position)#
hipsparseXcsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT if either a structural or numerical zero has been found during the hipsparseXcsrilu02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the CSR matrix. position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
hipsparseXcsrilu02_zeroPivot is a blocking function. It might influence performance negatively.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.
hipsparseXcsrilu02_numericBoost()#
-
hipsparseStatus_t hipsparseScsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, float *boost_val)#
-
hipsparseStatus_t hipsparseDcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, double *boost_val)#
-
hipsparseStatus_t hipsparseCcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, hipComplex *boost_val)#
-
hipsparseStatus_t hipsparseZcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, hipDoubleComplex *boost_val)#
hipsparseXcsrilu02_numericBoost enables the user to replace a numerical value in an incomplete LU factorization. tol is used to determine whether a numerical value is replaced by boost_val, such that \(A_{j,j} = \text{boost_val}\) if \(\text{tol} \ge \left|A_{j,j}\right|\).
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
The boost value is enabled by setting enable_boost to 1 or disabled by setting enable_boost to 0.
Note
tol and boost_val can be in host or device memory.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
enable_boost – [in] enable/disable numeric boost.
tol – [in] tolerance to determine whether a numerical value is replaced or not.
boost_val – [in] boost value to replace a numerical value.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info, tol or boost_val is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXcsrilu02_bufferSize()#
-
hipsparseStatus_t hipsparseScsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#
hipsparseXcsrilu02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXcsrilu02_bufferSizeExt()#
-
hipsparseStatus_t hipsparseScsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#
hipsparseXcsrilu02_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXcsrilu02_analysis()#
-
hipsparseStatus_t hipsparseScsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseXcsrilu02_analysis performs the analysis step for hipsparseXcsrilu02(). It is expected that this function will be executed only once for a given matrix and particular operation type.
Note
If the matrix sparsity pattern changes, the gathered information will become invalid.
Note
This function is non-blocking and is executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXcsrilu02()#
-
hipsparseStatus_t hipsparseScsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
Incomplete LU factorization with 0 fill-ins and no pivoting using CSR storage format.
hipsparseXcsrilu02 computes the incomplete LU factorization with 0 fill-ins and no pivoting of a sparse \(m \times m\) CSR matrix \(A\), such that
\[ A \approx LU \]
where the lower triangular matrix \(L\) and the upper triangular matrix \(U\) are computed using:
\[\begin{split} \begin{array}{ll} L_{ij} = \frac{1}{U_{jj}}(A_{ij} - \sum_{k=0}^{j-1}L_{ik} \times U_{kj}), & \text{if } i > j \\ U_{ij} = (A_{ij} - \sum_{k=0}^{i-1}L_{ik} \times U_{kj}), & \text{if } i \le j \end{array} \end{split}\]
for each entry found in the CSR matrix \(A\).
Computing the above incomplete \(LU\) factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXcsrilu02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXcsrilu02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls
hipsparseScsrilu02, hipsparseDcsrilu02, hipsparseCcsrilu02, or hipsparseZcsrilu02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern, while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXcsrilu02() are complete, the temporary buffer can be deallocated.
When computing the \(LU\) factorization, it is possible that \(U_{jj} = 0\), which would result in a division by zero. This can occur either because \(A_{jj}\) does not exist in the sparse CSR matrix (referred to as a structural zero) or because \(A_{jj} - \sum_{k=0}^{j-1}L_{jk} \times U_{kj} = 0\) (referred to as a numerical zero). For example, running the \(LU\) factorization on the following matrix:
\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]
results in a successful \(LU\) factorization, whereas running it on the matrix:
\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1/2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]
results in a numerical zero because:
\[\begin{split} \begin{array}{ll} U_{00} &= 2 \\ U_{01} &= 1 \\ L_{10} &= \frac{1}{2} \\ U_{11} &= \frac{1}{2} - \frac{1}{2} = 0 \end{array} \end{split}\]
The user can detect the presence of a structural zero by calling hipsparseXcsrilu02_zeroPivot() after hipsparseXcsrilu02_analysis() and/or the presence of a structural or numerical zero by calling hipsparseXcsrilu02_zeroPivot() after hipsparseXcsrilu02(). In both cases, hipsparseXcsrilu02_zeroPivot() will report the first zero pivot (either numerical or structural) found. See example below. The user can also set the diagonal type to unit (HIPSPARSE_DIAG_TYPE_UNIT) using hipsparseSetMatDiagType(), which will interpret the matrix \(A\) as having ones on its diagonal (even if no nonzero exists in the sparsity pattern).
hipsparseXcsrilu02 computes the \(LU\) factorization in place, meaning that the values array csrSortedValA_valM of the \(A\) matrix is overwritten with the \(L\) matrix stored in the strictly lower triangular part of \(A\) and the \(U\) matrix stored in the upper part of \(A\):
\[\begin{split} \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \rightarrow \begin{bmatrix} u_{00} & u_{01} & u_{02} \\ l_{10} & u_{11} & u_{12} \\ l_{20} & l_{21} & u_{22} \end{bmatrix} \end{split}\]
The row pointer array csrSortedRowPtrA and the column indices array csrSortedColIndA remain the same for \(A\) and \(LU\), as the incomplete factorization does not generate new nonzeros in \(LU\) which do not already exist in \(A\).
The performance of computing the \(LU\) factorization with hipSPARSE greatly depends on the sparsity pattern of the matrix \(A\), as this is what determines the amount of parallelism available.
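As a worked check of the formulas, the first \(3 \times 3\) matrix in the description above (the one that factorizes successfully) can be carried through by hand. The steps below are a derivation added for illustration. They assume \(L\) has an implicit unit diagonal and that entries outside the sparsity pattern, such as \(L_{20}\), are treated as zero.

```latex
\[
\begin{array}{l}
U_{00} = 2, \quad U_{01} = 1 \\
L_{10} = \frac{1}{U_{00}} A_{10} = \frac{1}{2} \\
U_{11} = A_{11} - L_{10} \times U_{01} = 2 - \frac{1}{2} = \frac{3}{2}, \quad U_{12} = 1 \\
L_{21} = \frac{1}{U_{11}} A_{21} = \frac{2}{3} \\
U_{22} = A_{22} - L_{21} \times U_{12} = 2 - \frac{2}{3} = \frac{4}{3}
\end{array}
\]
\[
\begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix}
\rightarrow
\begin{bmatrix} 2 & 1 & 0 \\ \frac{1}{2} & \frac{3}{2} & 1 \\ 0 & \frac{2}{3} & \frac{4}{3} \end{bmatrix}
\]
```

Because this matrix is tridiagonal, an exact \(LU\) factorization produces no fill-in, so here the incomplete factorization coincides with the exact one.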
Note
The sparse CSR matrix has to be sorted. This can be achieved by calling hipsparseXcsrsort().
Note
This function is non-blocking and is executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA_valM – [inout] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA_valM, csrSortedRowPtrA or csrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
// C++ example: ILU(0) factorization of a CSR matrix with hipsparseScsrilu02.
#include <cstdio>
#include <cstdlib>
#include <vector>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

// Minimal error-checking helpers assumed by this example.
#define HIP_CHECK(call) \
    do \
    { \
        hipError_t hipStatus_ = (call); \
        if(hipStatus_ != hipSuccess) \
        { \
            fprintf(stderr, "HIP error %d at line %d\n", static_cast<int>(hipStatus_), __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

#define HIPSPARSE_CHECK(call) \
    do \
    { \
        hipsparseStatus_t sparseStatus_ = (call); \
        if(sparseStatus_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", static_cast<int>(sparseStatus_), __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

int main()
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in CSR format for ILU(0) factorization.
    // The 'S' in Scsrilu02 indicates single precision float.
    // Matrix A:
    // ( 1 2 0  0 )
    // ( 3 4 5  0 )
    // ( 0 6 7  8 )
    // ( 0 0 9 10 )

    int m   = 4;  // Number of rows (A is square, so also the number of columns)
    int nnz = 10; // Number of non-zero elements

    // CSR row pointers
    std::vector<int> hcsrRowPtr = {0, 2, 5, 8, 10};

    // CSR column indices
    std::vector<int> hcsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values
    std::vector<float> hcsrVal = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete LU, the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRILU02 info (for incomplete LU factorization)
    csrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsrilu02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    // Same size as the input: the values are overwritten in place
    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(float) * nnz));

    HIP_CHECK(
        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsrilu02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(hipsparseScsrilu02_analysis(handle,
                                                m,
                                                nnz,
                                                descr,
                                                dcsrVal,
                                                dcsrRowPtr,
                                                dcsrColInd,
                                                info,
                                                HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                                dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsrilu02(handle,
                                       m,
                                       nnz,
                                       descr,
                                       dcsrVal,
                                       dcsrRowPtr,
                                       dcsrColInd,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot is reported through the return status HIPSPARSE_STATUS_ZERO_PIVOT,
    // so the status is inspected directly instead of being passed to HIPSPARSE_CHECK.
    int zeroPivot = -1; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    hipsparseStatus_t pivotStatus = hipsparseXcsrilu02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("CSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    std::vector<float> hcsrVal_result(nnz);
    HIP_CHECK(
        hipMemcpy(hcsrVal_result.data(), dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized CSR values (L and U combined):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyCsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
/* C example: ILU(0) factorization of a CSR matrix with hipsparseScsrilu02. */
#include <stdio.h>
#include <stdlib.h>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

/* Minimal error-checking helpers assumed by this example. */
#define HIP_CHECK(call) \
    do \
    { \
        hipError_t hipStatus_ = (call); \
        if(hipStatus_ != hipSuccess) \
        { \
            fprintf(stderr, "HIP error %d at line %d\n", (int)hipStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

#define HIPSPARSE_CHECK(call) \
    do \
    { \
        hipsparseStatus_t sparseStatus_ = (call); \
        if(sparseStatus_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)sparseStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

int main(void)
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in CSR format for ILU(0) factorization.
    // The 'S' in Scsrilu02 indicates single precision float.
    // Matrix A:
    // ( 1 2 0  0 )
    // ( 3 4 5  0 )
    // ( 0 6 7  8 )
    // ( 0 0 9 10 )

    int m   = 4;  // Number of rows (A is square, so also the number of columns)
    int nnz = 10; // Number of non-zero elements

    // CSR row pointers
    int hcsrRowPtr[] = {0, 2, 5, 8, 10};

    // CSR column indices
    int hcsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values
    float hcsrVal[] = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete LU, the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRILU02 info (for incomplete LU factorization)
    csrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsrilu02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    // Same size as the input: the values are overwritten in place
    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(float) * nnz));

    HIP_CHECK(hipMemcpy(dcsrRowPtr, hcsrRowPtr, sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd, sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal, sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsrilu02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(hipsparseScsrilu02_analysis(handle,
                                                m,
                                                nnz,
                                                descr,
                                                dcsrVal,
                                                dcsrRowPtr,
                                                dcsrColInd,
                                                info,
                                                HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                                dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsrilu02(handle,
                                       m,
                                       nnz,
                                       descr,
                                       dcsrVal,
                                       dcsrRowPtr,
                                       dcsrColInd,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot is reported through the return status HIPSPARSE_STATUS_ZERO_PIVOT,
    // so the status is inspected directly instead of being passed to HIPSPARSE_CHECK.
    int zeroPivot = -1; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    hipsparseStatus_t pivotStatus = hipsparseXcsrilu02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("CSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    float* hcsrVal_result = (float*)malloc(nnz * sizeof(float));
    if(hcsrVal_result == NULL)
    {
        fprintf(stderr, "Host allocation failed\n");
        return EXIT_FAILURE;
    }
    HIP_CHECK(hipMemcpy(hcsrVal_result, dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized CSR values (L and U combined):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    free(hcsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyCsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
1program example_hipsparse_csrilu02
2 use iso_c_binding
3 implicit none
4
5 ! HIP
6 interface
7 function hipMalloc(ptr, size) &
8 bind(c, name = 'hipMalloc')
9 use iso_c_binding
10 implicit none
11 integer(c_int) :: hipMalloc
12 type(c_ptr) :: ptr
13 integer(c_size_t), value :: size
14 end function hipMalloc
15
16 function hipFree(ptr) &
17 bind(c, name = 'hipFree')
18 use iso_c_binding
19 implicit none
20 integer(c_int) :: hipFree
21 type(c_ptr), value :: ptr
22 end function hipFree
23
24 function hipMemcpy(dst, src, size, kind) &
25 bind(c, name = 'hipMemcpy')
26 use iso_c_binding
27 implicit none
28 integer(c_int) :: hipMemcpy
29 type(c_ptr), value :: dst
30 type(c_ptr), intent(in), value :: src
31 integer(c_size_t), value :: size
32 integer(c_int), value :: kind
33 end function hipMemcpy
34 end interface
35
36 integer, parameter :: hipMemcpyHostToDevice = 1
37 integer, parameter :: hipMemcpyDeviceToHost = 2
38
39 ! hipSPARSE
40 interface
41 function hipsparseCreate(handle) &
42 bind(c, name = 'hipsparseCreate')
43 use iso_c_binding
44 implicit none
45 integer(c_int) :: hipsparseCreate
46 type(c_ptr) :: handle
47 end function hipsparseCreate
48
49 function hipsparseDestroy(handle) &
50 bind(c, name = 'hipsparseDestroy')
51 use iso_c_binding
52 implicit none
53 integer(c_int) :: hipsparseDestroy
54 type(c_ptr), value :: handle
55 end function hipsparseDestroy
56
57 function hipsparseCreateMatDescr(descr) &
58 bind(c, name = 'hipsparseCreateMatDescr')
59 use iso_c_binding
60 implicit none
61 integer(c_int) :: hipsparseCreateMatDescr
62 type(c_ptr) :: descr
63 end function hipsparseCreateMatDescr
64
65 function hipsparseDestroyMatDescr(descr) &
66 bind(c, name = 'hipsparseDestroyMatDescr')
67 use iso_c_binding
68 implicit none
69 integer(c_int) :: hipsparseDestroyMatDescr
70 type(c_ptr), value :: descr
71 end function hipsparseDestroyMatDescr
72
73 function hipsparseCreateCsrilu02Info(info) &
74 bind(c, name = 'hipsparseCreateCsrilu02Info')
75 use iso_c_binding
76 implicit none
77 integer(c_int) :: hipsparseCreateCsrilu02Info
78 type(c_ptr) :: info
79 end function hipsparseCreateCsrilu02Info
80
81 function hipsparseDestroyCsrilu02Info(info) &
82 bind(c, name = 'hipsparseDestroyCsrilu02Info')
83 use iso_c_binding
84 implicit none
85 integer(c_int) :: hipsparseDestroyCsrilu02Info
86 type(c_ptr), value :: info
87 end function hipsparseDestroyCsrilu02Info
88
89 function hipsparseScsrilu02_bufferSize(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
90 csrSortedColIndA, info, pBufferSizeInBytes) &
91 bind(c, name = 'hipsparseScsrilu02_bufferSize')
92 use iso_c_binding
93 implicit none
94 integer(c_int) :: hipsparseScsrilu02_bufferSize
95 type(c_ptr), value :: handle
96 integer(c_int), value :: m
97 integer(c_int), value :: nnz
98 type(c_ptr), value :: descr
99 type(c_ptr), intent(in), value :: csrSortedValA
100 type(c_ptr), intent(in), value :: csrSortedRowPtrA
101 type(c_ptr), intent(in), value :: csrSortedColIndA
102 type(c_ptr), value :: info
103 type(c_ptr), value :: pBufferSizeInBytes
104 end function hipsparseScsrilu02_bufferSize
105
106 function hipsparseScsrilu02_analysis(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
107 csrSortedColIndA, info, policy, pBuffer) &
108 bind(c, name = 'hipsparseScsrilu02_analysis')
109 use iso_c_binding
110 implicit none
111 integer(c_int) :: hipsparseScsrilu02_analysis
112 type(c_ptr), value :: handle
113 integer(c_int), value :: m
114 integer(c_int), value :: nnz
115 type(c_ptr), value :: descr
116 type(c_ptr), intent(in), value :: csrSortedValA
117 type(c_ptr), intent(in), value :: csrSortedRowPtrA
118 type(c_ptr), intent(in), value :: csrSortedColIndA
119 type(c_ptr), value :: info
120 integer(c_int), value :: policy
121 type(c_ptr), value :: pBuffer
122 end function hipsparseScsrilu02_analysis
123
124 function hipsparseScsrilu02(handle, m, nnz, descr, csrSortedValA_valM, csrSortedRowPtrA, &
125 csrSortedColIndA, info, policy, pBuffer) &
126 bind(c, name = 'hipsparseScsrilu02')
127 use iso_c_binding
128 implicit none
129 integer(c_int) :: hipsparseScsrilu02
130 type(c_ptr), value :: handle
131 integer(c_int), value :: m
132 integer(c_int), value :: nnz
133 type(c_ptr), value :: descr
134 type(c_ptr), value :: csrSortedValA_valM
135 type(c_ptr), intent(in), value :: csrSortedRowPtrA
136 type(c_ptr), intent(in), value :: csrSortedColIndA
137 type(c_ptr), value :: info
138 integer(c_int), value :: policy
139 type(c_ptr), value :: pBuffer
140 end function hipsparseScsrilu02
141
142 function hipsparseXcsrilu02_zeroPivot(handle, info, position) &
143 bind(c, name = 'hipsparseXcsrilu02_zeroPivot')
144 use iso_c_binding
145 implicit none
146 integer(c_int) :: hipsparseXcsrilu02_zeroPivot
147 type(c_ptr), value :: handle
148 type(c_ptr), value :: info
149 type(c_ptr), value :: position
150 end function hipsparseXcsrilu02_zeroPivot
151 end interface
152
153 integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1
154
155 ! Variables
156 type(c_ptr) :: handle
157 type(c_ptr) :: descr
158 type(c_ptr) :: info
159 integer :: i, stat
160 integer, target :: zeroPivot
161 integer(c_int), target :: bufferSize
162
163 ! Matrix A (4x4) in CSR format
164 integer, parameter :: m = 4
165 integer, parameter :: n = 4
166 integer, parameter :: nnz = 10
167
168 integer, dimension(m+1), target :: hcsrRowPtr = (/0, 2, 5, 8, 10/)
169 integer, dimension(nnz), target :: hcsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
170 real(c_float), dimension(nnz), target :: hcsrVal = (/1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0/)
171
172 ! Result array
173 real(c_float), dimension(nnz), target :: hcsrVal_result
174
175 ! Device pointers
176 type(c_ptr) :: dcsrRowPtr
177 type(c_ptr) :: dcsrColInd
178 type(c_ptr) :: dcsrVal
179 type(c_ptr) :: dbuffer
180
181 ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreate failed'
        stop
    end if

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateMatDescr failed'
        stop
    end if

    ! Create csrilu02 info
    stat = hipsparseCreateCsrilu02Info(info)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateCsrilu02Info failed'
        stop
    end if

    ! Allocate device memory (sizes in bytes; 4 bytes per int or float)
    stat = hipMalloc(dcsrRowPtr, int((m + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrColInd, int(nnz * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrVal, int(nnz * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dcsrRowPtr, c_loc(hcsrRowPtr), int((m + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrColInd, c_loc(hcsrColInd), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrVal, c_loc(hcsrVal), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseScsrilu02_bufferSize(handle, &
                                         m, &
                                         nnz, &
                                         descr, &
                                         dcsrVal, &
                                         dcsrRowPtr, &
                                         dcsrColInd, &
                                         info, &
                                         c_loc(bufferSize))
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02_bufferSize failed'
        stop
    end if

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseScsrilu02_analysis(handle, &
                                       m, &
                                       nnz, &
                                       descr, &
                                       dcsrVal, &
                                       dcsrRowPtr, &
                                       dcsrColInd, &
                                       info, &
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                       dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02_analysis failed'
        stop
    end if

    ! Perform factorization
    stat = hipsparseScsrilu02(handle, &
                              m, &
                              nnz, &
                              descr, &
                              dcsrVal, &
                              dcsrRowPtr, &
                              dcsrColInd, &
                              info, &
                              HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                              dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02 failed'
        stop
    end if

    ! Check for zero pivots
    stat = hipsparseXcsrilu02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'CSRILU02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hcsrVal_result), dcsrVal, int(nnz * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized CSR values (L and U combined):'
    do i = 1, nnz
        write(*,*) 'val[', i-1, '] =', hcsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dcsrRowPtr)
    stat = hipFree(dcsrColInd)
    stat = hipFree(dcsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyCsrilu02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_csrilu02
hipsparseXbsric02_zeroPivot()#
-
hipsparseStatus_t hipsparseXbsric02_zeroPivot(hipsparseHandle_t handle, bsric02Info_t info, int *position)#
hipsparseXbsric02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT if either a structural or numerical zero has been found during hipsparseXbsric02_analysis() or hipsparseXbsric02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the BSR matrix. position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
If a zero pivot is found, position \(= j\) means that either the diagonal block \(A_{j,j}\) is missing (structural zero) or the diagonal block \(A_{j,j}\) is not positive definite (numerical zero).
Note
hipsparseXbsric02_zeroPivot is a blocking function. It might influence performance negatively.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.
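To make the structural-zero case concrete, the following is a minimal host-side sketch, not part of hipSPARSE. The helper name first_structural_zero_pivot and its indexBase argument are hypothetical. It scans a BSR pattern for the first block row with no diagonal block and reports it in the matrix's own index base, mirroring the convention described above (-1 when no such row exists). Numerical zeros depend on the values and are not detected here.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side helper (not a hipSPARSE function): returns the first
// block row whose diagonal block is absent from the BSR pattern, expressed in
// the same index base as the matrix, or -1 if every diagonal block is present.
int first_structural_zero_pivot(int                     mb,
                                const std::vector<int>& bsrRowPtr,
                                const std::vector<int>& bsrColInd,
                                int                     indexBase)
{
    assert(mb >= 0);
    assert(static_cast<int>(bsrRowPtr.size()) == mb + 1);

    for(int row = 0; row < mb; ++row)
    {
        bool hasDiagonal = false;
        // Row pointers and column indices are stored in the matrix's index base,
        // so shift them to zero-based offsets before using them.
        for(int k = bsrRowPtr[row] - indexBase; k < bsrRowPtr[row + 1] - indexBase; ++k)
        {
            if(bsrColInd[k] - indexBase == row)
            {
                hasDiagonal = true;
                break;
            }
        }
        if(!hasDiagonal)
        {
            return row + indexBase;
        }
    }
    return -1;
}
```

A check like this can be run on the host before uploading the matrix, which avoids the blocking call when only the pattern is in question.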
hipsparseXbsric02_bufferSize()#
-
hipsparseStatus_t hipsparseSbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#
hipsparseXbsric02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXbsric02_analysis() and hipsparseXbsric02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix. Must be non-negative.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix. Must be non-negative.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Must be positive, between 1 and m, where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSbsric02_analysis(), hipsparseDbsric02_analysis(), hipsparseCbsric02_analysis(), hipsparseZbsric02_analysis(), hipsparseSbsric02(), hipsparseDbsric02(), hipsparseCbsric02() and hipsparseZbsric02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, descrA, bsrValA, bsrRowPtrA, bsrColIndA, info or pBufferSizeInBytes is nullptr, mb or nnzb is negative, or blockDim is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
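The array lengths listed in the parameters can be computed up front when allocating host or device storage. The sketch below assumes nothing about hipSPARSE itself: BsrArrayLengths and bsr_array_lengths are hypothetical names, and the assertion only restates the documented constraints (mb and nnzb non-negative, blockDim positive).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper struct: element counts of the three BSR arrays.
struct BsrArrayLengths
{
    std::size_t values; // nnzb * blockDim * blockDim entries in bsrValA
    std::size_t rowPtr; // mb + 1 entries in bsrRowPtrA
    std::size_t colInd; // nnzb entries in bsrColIndA
};

// Computes the element counts described in the parameter list above.
BsrArrayLengths bsr_array_lengths(int mb, int nnzb, int blockDim)
{
    // Documented constraints: mb and nnzb non-negative, blockDim positive.
    assert(mb >= 0 && nnzb >= 0 && blockDim > 0);

    BsrArrayLengths len;
    // Widen before multiplying so large matrices do not overflow int.
    len.values = static_cast<std::size_t>(nnzb) * blockDim * blockDim;
    len.rowPtr = static_cast<std::size_t>(mb) + 1;
    len.colInd = static_cast<std::size_t>(nnzb);
    return len;
}
```

Multiplying each count by the element size (for example sizeof(float) for the S variant) gives the byte count to pass to an allocation call.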
hipsparseXbsric02_analysis()#
-
hipsparseStatus_t hipsparseSbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseXbsric02_analysis performs the analysis step for hipsparseXbsric02(). It is expected that this function will be executed only once for a given matrix and particular operation type.
Note
If the matrix sparsity pattern changes, the gathered information will become invalid.
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m, where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrValA, bsrRowPtrA, bsrColIndA, info or pBuffer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
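Because the analysis data is tied to the sparsity pattern, a caller that factorizes several matrices may want to confirm that a new matrix shares the pattern of the one that was analyzed. The following is a small host-side sketch under stated assumptions: same_bsr_pattern is a hypothetical helper, both matrices are assumed to use the same index base, and column indices are assumed to be stored in the same order within each block row.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a hipSPARSE function): returns true when two BSR
// matrices have the same block dimension and identical row pointer and column
// index arrays, i.e. the same sparsity pattern.
bool same_bsr_pattern(const std::vector<int>& rowPtrA,
                      const std::vector<int>& colIndA,
                      int                     blockDimA,
                      const std::vector<int>& rowPtrB,
                      const std::vector<int>& colIndB,
                      int                     blockDimB)
{
    // A BSR row pointer array always holds mb + 1 >= 1 entries.
    assert(!rowPtrA.empty() && !rowPtrB.empty());

    return blockDimA == blockDimB && rowPtrA == rowPtrB && colIndA == colIndB;
}
```

When this returns false, the conservative choice is to rerun the buffer size query and the analysis step for the new matrix.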
hipsparseXbsric02()#
-
hipsparseStatus_t hipsparseSbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.
hipsparseXbsric02 computes the incomplete Cholesky factorization with 0 fill-ins and no pivoting of a sparse \(mb \times mb\) BSR matrix \(A\), such that
\[ A \approx LL^T \]
Computing the above incomplete Cholesky factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXbsric02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXbsric02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseXbsric02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern, while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXbsric02 are complete, the temporary buffer can be deallocated.
hipsparseXbsric02 requires a user-allocated temporary buffer. Its size is returned by hipsparseXbsric02_bufferSize(). Furthermore, analysis meta data is required. It can be obtained by hipsparseXbsric02_analysis().
hipsparseXbsric02 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling hipsparseXbsric02_zeroPivot().
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [inout] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m, where m=mb*blockDim.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrValA, bsrRowPtrA, or bsrColIndA is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
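As a point of reference for what \(A \approx LL^T\) with 0 fill-ins means, here is a small host-side sketch of IC(0) on a dense, row-major copy of a symmetric matrix. It is a textbook formulation written for illustration, not the algorithm hipSPARSE runs on the device, and ic0_reference is a hypothetical name. Entries that are zero in \(A\) are treated as outside the sparsity pattern and stay zero in \(L\). A non-positive pivot is reported by index as a stand-in for the numerical zero pivot described above, which is an assumption of this sketch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Host-side reference sketch (not the hipSPARSE implementation): incomplete
// Cholesky with zero fill-in for a symmetric n x n matrix stored densely in
// row-major order. Returns -1 on success and leaves the lower-triangular
// factor in L; otherwise returns the index of the first non-positive pivot.
int ic0_reference(int n, const std::vector<double>& A, std::vector<double>& L)
{
    assert(n >= 0 && static_cast<int>(A.size()) == n * n);
    L.assign(A.size(), 0.0);

    for(int k = 0; k < n; ++k)
    {
        // Diagonal entry: subtract the squares of the row computed so far.
        double diag = A[k * n + k];
        for(int j = 0; j < k; ++j)
        {
            diag -= L[k * n + j] * L[k * n + j];
        }
        if(diag <= 0.0)
        {
            return k; // pivot is not positive, factorization cannot continue
        }
        L[k * n + k] = std::sqrt(diag);

        // Column k below the diagonal, restricted to the pattern of A.
        for(int i = k + 1; i < n; ++i)
        {
            if(A[i * n + k] == 0.0)
            {
                continue; // outside the sparsity pattern: no fill-in
            }
            double sum = A[i * n + k];
            for(int j = 0; j < k; ++j)
            {
                sum -= L[i * n + j] * L[k * n + j];
            }
            L[i * n + k] = sum / L[k * n + k];
        }
    }
    return -1;
}
```

For a tridiagonal matrix such as the 4x4 sample used in the examples in this section, exact Cholesky creates no fill-in, so the incomplete factor coincides with the exact one and \(LL^T\) reproduces \(A\) up to rounding.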
#include <cstdio>
#include <cstdlib>
#include <vector>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

// Minimal error-checking macros. The example does not show their definitions,
// so these are assumed stand-ins; use your project's own if available.
#ifndef HIP_CHECK
#define HIP_CHECK(expr) \
    do \
    { \
        hipError_t hipStatus_ = (expr); \
        if(hipStatus_ != hipSuccess) \
        { \
            fprintf(stderr, "HIP error %d at line %d\n", (int)hipStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)
#endif

#ifndef HIPSPARSE_CHECK
#define HIPSPARSE_CHECK(expr) \
    do \
    { \
        hipsparseStatus_t sparseStatus_ = (expr); \
        if(sparseStatus_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)sparseStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)
#endif

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4)
    // with a block size of 1. This example effectively uses BSR format
    // for a CSR-like matrix.
    // Matrix A:
    // ( 4 1 0 0 )
    // ( 1 5 2 0 )
    // ( 0 2 3 1 )
    // ( 0 0 1 2 )

    const int m    = 4; // Number of rows
    const int n    = 4; // Number of columns
    const int bs   = 1; // Block size
    const int mb   = m / bs; // Number of block rows
    const int nb   = n / bs; // Number of block columns
    const int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    std::vector<int> hbsrRowPtr = {0, 2, 5, 8, 10};

    // BSR column indices
    std::vector<int> hbsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float for 'S'bsric02)
    // Values are stored column-major within each block, but with bs=1, this is simple.
    // All 10 nonzeros of A are stored, block row by block row.
    std::vector<float> hbsrVal = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode to lower and diagonal type to unit (required for IC02)
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRIC02 info
    bsric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsric02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal;

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(
        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(
        hipMemcpy(dbsrVal, hbsrVal.data(), sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSbsric02_bufferSize(handle,
                                                 HIPSPARSE_DIRECTION_COLUMN,
                                                 mb,
                                                 nnzb,
                                                 descr,
                                                 dbsrVal,
                                                 dbsrRowPtr,
                                                 dbsrColInd,
                                                 bs,
                                                 info,
                                                 &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    HIPSPARSE_CHECK(hipsparseSbsric02_analysis(handle,
                                               HIPSPARSE_DIRECTION_COLUMN,
                                               mb,
                                               nnzb,
                                               descr,
                                               dbsrVal,
                                               dbsrRowPtr,
                                               dbsrColInd,
                                               bs,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization (numerical computation)
    HIPSPARSE_CHECK(hipsparseSbsric02(handle,
                                      HIPSPARSE_DIRECTION_COLUMN,
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots. The status is inspected directly because
    // HIPSPARSE_STATUS_ZERO_PIVOT is a documented outcome of this call and
    // would otherwise abort inside HIPSPARSE_CHECK before the message is printed.
    int               zeroPivot   = -1;
    hipsparseStatus_t pivotStatus = hipsparseXbsric02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected at index %d\n", zeroPivot);
        // Handle error, e.g., by returning an error code
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("Successfully computed incomplete Cholesky factorization.\n");
    }

    // Copy the factorized values back to host
    std::vector<float> hbsrVal_result(nnzb * bs * bs);
    HIP_CHECK(hipMemcpy(
        hbsrVal_result.data(), dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized matrix)
    printf("Factorized BSR values:\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyBsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
#include <stdio.h>
#include <stdlib.h>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

/* Minimal error-checking macros. The example does not show their definitions,
   so these are assumed stand-ins; use your project's own if available. */
#ifndef HIP_CHECK
#define HIP_CHECK(expr) \
    do \
    { \
        hipError_t hipStatus_ = (expr); \
        if(hipStatus_ != hipSuccess) \
        { \
            fprintf(stderr, "HIP error %d at line %d\n", (int)hipStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)
#endif

#ifndef HIPSPARSE_CHECK
#define HIPSPARSE_CHECK(expr) \
    do \
    { \
        hipsparseStatus_t sparseStatus_ = (expr); \
        if(sparseStatus_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)sparseStatus_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)
#endif

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4)
    // with a block size of 1. This example effectively uses BSR format
    // for a CSR-like matrix.
    // Matrix A:
    // ( 4 1 0 0 )
    // ( 1 5 2 0 )
    // ( 0 2 3 1 )
    // ( 0 0 1 2 )

    const int m    = 4; // Number of rows
    const int n    = 4; // Number of columns
    const int bs   = 1; // Block size
    const int mb   = m / bs; // Number of block rows
    const int nb   = n / bs; // Number of block columns
    const int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    int hbsrRowPtr[] = {0, 2, 5, 8, 10};

    // BSR column indices
    int hbsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float for 'S'bsric02)
    // Values are stored column-major within each block, but with bs=1, this is simple.
    // All 10 nonzeros of A are stored, block row by block row.
    float hbsrVal[] = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode to lower and diagonal type to unit (required for IC02)
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRIC02 info
    bsric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsric02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal;

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(hipMemcpy(dbsrRowPtr, hbsrRowPtr, sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd, sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrVal, hbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSbsric02_bufferSize(handle,
                                                 HIPSPARSE_DIRECTION_COLUMN,
                                                 mb,
                                                 nnzb,
                                                 descr,
                                                 dbsrVal,
                                                 dbsrRowPtr,
                                                 dbsrColInd,
                                                 bs,
                                                 info,
                                                 &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    HIPSPARSE_CHECK(hipsparseSbsric02_analysis(handle,
                                               HIPSPARSE_DIRECTION_COLUMN,
                                               mb,
                                               nnzb,
                                               descr,
                                               dbsrVal,
                                               dbsrRowPtr,
                                               dbsrColInd,
                                               bs,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization (numerical computation)
    HIPSPARSE_CHECK(hipsparseSbsric02(handle,
                                      HIPSPARSE_DIRECTION_COLUMN,
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots. The status is inspected directly because
    // HIPSPARSE_STATUS_ZERO_PIVOT is a documented outcome of this call and
    // would otherwise abort inside HIPSPARSE_CHECK before the message is printed.
    int               zeroPivot   = -1;
    hipsparseStatus_t pivotStatus = hipsparseXbsric02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected at index %d\n", zeroPivot);
        // Handle error, e.g., by returning an error code
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("Successfully computed incomplete Cholesky factorization.\n");
    }

    // Copy the factorized values back to host
    float* hbsrVal_result = (float*)malloc(nnzb * bs * bs * sizeof(float));
    if(hbsrVal_result == NULL)
    {
        fprintf(stderr, "Host allocation failed\n");
        return EXIT_FAILURE;
    }
    HIP_CHECK(
        hipMemcpy(hbsrVal_result, dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized matrix)
    printf("Factorized BSR values:\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    free(hbsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyBsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
program example_hipsparse_bsric02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseSetMatIndexBase(descr, base) &
                bind(c, name = 'hipsparseSetMatIndexBase')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatIndexBase
            type(c_ptr), value :: descr
            integer(c_int), value :: base
        end function hipsparseSetMatIndexBase

        function hipsparseSetMatFillMode(descr, fillMode) &
                bind(c, name = 'hipsparseSetMatFillMode')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatFillMode
            type(c_ptr), value :: descr
            integer(c_int), value :: fillMode
        end function hipsparseSetMatFillMode

        function hipsparseSetMatDiagType(descr, diagType) &
                bind(c, name = 'hipsparseSetMatDiagType')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatDiagType
            type(c_ptr), value :: descr
            integer(c_int), value :: diagType
        end function hipsparseSetMatDiagType

        function hipsparseCreateBsric02Info(info) &
                bind(c, name = 'hipsparseCreateBsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateBsric02Info
            type(c_ptr) :: info
        end function hipsparseCreateBsric02Info

        function hipsparseDestroyBsric02Info(info) &
                bind(c, name = 'hipsparseDestroyBsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyBsric02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyBsric02Info

        function hipsparseSbsric02_bufferSize(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseSbsric02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseSbsric02_bufferSize

        function hipsparseSbsric02_analysis(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsric02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsric02_analysis

        function hipsparseSbsric02(handle, dirA, mb, nnzb, descrA, bsrSortedValA, bsrSortedRowPtrA, &
                bsrSortedColIndA, blockDim, info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsric02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsric02

        function hipsparseXbsric02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXbsric02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXbsric02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXbsric02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_INDEX_BASE_ZERO = 0
    integer, parameter :: HIPSPARSE_FILL_MODE_LOWER = 0
    integer, parameter :: HIPSPARSE_DIAG_TYPE_UNIT = 1
    integer, parameter :: HIPSPARSE_DIRECTION_COLUMN = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    ! Written through a C int pointer, so declared with the interoperable kind
    integer(c_int), target :: zeroPivot
    integer(c_int), target :: bufferSize

    ! Block sparse matrix A (4x4 with block size 1)
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: bs = 1
    integer, parameter :: mb = m / bs
    integer, parameter :: nb = n / bs
    integer, parameter :: nnzb = 10

    ! Index arrays are read as C int arrays on the device, so use the interoperable kind
    integer(c_int), dimension(mb+1), target :: hbsrRowPtr = (/0, 2, 5, 8, 10/)
    integer(c_int), dimension(nnzb), target :: hbsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal = (/4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0/)

    ! Result array
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal_result

    ! Device pointers
    type(c_ptr) :: dbsrRowPtr
    type(c_ptr) :: dbsrColInd
    type(c_ptr) :: dbsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) stop

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) stop

    ! Set matrix properties
    stat = hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO)
    if (stat /= 0) stop
    stat = hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER)
    if (stat /= 0) stop
    stat = hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT)
    if (stat /= 0) stop

    ! Create bsric02 info
    stat = hipsparseCreateBsric02Info(info)
    if (stat /= 0) stop

    ! Allocate device memory (sizes in bytes; 4 bytes per int or float)
    stat = hipMalloc(dbsrRowPtr, int((mb + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrColInd, int(nnzb * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrVal, int(nnzb * bs * bs * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dbsrRowPtr, c_loc(hbsrRowPtr), int((mb + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrColInd, c_loc(hbsrColInd), int(nnzb * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrVal, c_loc(hbsrVal), int(nnzb * bs * bs * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseSbsric02_bufferSize(handle, &
                                        HIPSPARSE_DIRECTION_COLUMN, &
                                        mb, &
                                        nnzb, &
                                        descr, &
                                        dbsrVal, &
                                        dbsrRowPtr, &
                                        dbsrColInd, &
                                        bs, &
                                        info, &
                                        c_loc(bufferSize))
    if (stat /= 0) stop

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseSbsric02_analysis(handle, &
                                      HIPSPARSE_DIRECTION_COLUMN, &
                                      mb, &
                                      nnzb, &
                                      descr, &
                                      dbsrVal, &
                                      dbsrRowPtr, &
                                      dbsrColInd, &
                                      bs, &
                                      info, &
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                      dbuffer)
    if (stat /= 0) stop

    ! Perform factorization
    stat = hipsparseSbsric02(handle, &
                             HIPSPARSE_DIRECTION_COLUMN, &
                             mb, &
                             nnzb, &
                             descr, &
                             dbsrVal, &
                             dbsrRowPtr, &
                             dbsrColInd, &
                             bs, &
                             info, &
                             HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                             dbuffer)
    if (stat /= 0) stop

    ! Check for zero pivots
    stat = hipsparseXbsric02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'BSRIC02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hbsrVal_result), dbsrVal, int(nnzb * bs * bs * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized BSR values (L factor):'
    do i = 1, nnzb * bs * bs
        write(*,*) 'val[', i-1, '] =', hbsrVal_result(i)
    end do
324
325 ! Clean up
326 stat = hipFree(dbsrRowPtr)
327 stat = hipFree(dbsrColInd)
328 stat = hipFree(dbsrVal)
329 stat = hipFree(dbuffer)
330
331 stat = hipsparseDestroyBsric02Info(info)
332 stat = hipsparseDestroyMatDescr(descr)
333 stat = hipsparseDestroy(handle)
334
335end program example_hipsparse_bsric02
hipsparseXcsric02_zeroPivot()#
-
hipsparseStatus_t hipsparseXcsric02_zeroPivot(hipsparseHandle_t handle, csric02Info_t info, int *position)#
hipsparseXcsric02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT if either a structural or numerical zero has been found during the hipsparseXcsric02_analysis() or hipsparseXcsric02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the CSR matrix. position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.
- Deprecated:
This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.
Note
hipsparseXcsric02_zeroPivot is a blocking function. It might influence performance negatively.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.
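To make the notion of a structural zero and the index-base convention for position concrete, the following host-side sketch scans a CSR sparsity pattern for the first row without a stored diagonal entry. The helper name first_structural_zero is hypothetical and not part of hipSPARSE; it only illustrates the reported value, not how the library detects pivots internally.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not a hipSPARSE function): returns the index, in the
// matrix's own index base, of the first row whose diagonal entry is missing
// from the CSR pattern, or -1 if every row stores its diagonal.
int first_structural_zero(const std::vector<int>& rowPtr,
                          const std::vector<int>& colInd,
                          int                     indexBase)
{
    assert(!rowPtr.empty());
    const int m = static_cast<int>(rowPtr.size()) - 1;

    for(int i = 0; i < m; ++i)
    {
        bool hasDiag = false;
        // Row i occupies [rowPtr[i], rowPtr[i + 1]) after removing the base
        for(int p = rowPtr[i] - indexBase; p < rowPtr[i + 1] - indexBase; ++p)
        {
            if(colInd[p] - indexBase == i)
            {
                hasDiag = true;
                break;
            }
        }
        if(!hasDiag)
        {
            return i + indexBase; // report in the same base as the matrix
        }
    }
    return -1;
}
```

With a one-based matrix the same missing diagonal is reported one higher than with a zero-based matrix, while -1 remains the "no zero pivot" value in both cases.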
hipsparseXcsric02_bufferSize()#
-
hipsparseStatus_t hipsparseScsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#
hipsparseXcsric02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsric02_analysis() and hipsparseXcsric02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsric02_analysis() and hipsparseXcsric02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
hipsparseXcsric02_bufferSizeExt()#
-
hipsparseStatus_t hipsparseScsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#
hipsparseXcsric02_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsric02_analysis() and hipsparseXcsric02(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsric02_analysis() and hipsparseXcsric02().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
hipsparseXcsric02_analysis()#
-
hipsparseStatus_t hipsparseScsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
hipsparseXcsric02_analysis performs the analysis step for hipsparseXcsric02().
Note
If the matrix sparsity pattern changes, the gathered information will become invalid.
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
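The analysis step works on the sparsity pattern only. As an illustration of the kind of information a pattern carries, the sketch below computes dependency levels for the rows of a zero-based CSR matrix: a row can be factorized only after every row it references in its strictly lower part. This is the generic level-scheduling idea, shown under the assumption that it is a useful mental model; the source does not describe what hipSPARSE computes internally, and dependency_levels is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: level[i] is 0 when row i has no strictly lower
// entries, otherwise 1 + the largest level among the rows j < i that appear
// as column indices in row i. Rows sharing a level are mutually independent.
std::vector<int> dependency_levels(const std::vector<int>& rowPtr,
                                   const std::vector<int>& colInd)
{
    assert(!rowPtr.empty());
    const int        m = static_cast<int>(rowPtr.size()) - 1;
    std::vector<int> level(m, 0);

    for(int i = 0; i < m; ++i)
    {
        for(int p = rowPtr[i]; p < rowPtr[i + 1]; ++p)
        {
            const int j = colInd[p];
            if(j < i)
            {
                // Row i needs the finished row j
                level[i] = std::max(level[i], level[j] + 1);
            }
        }
    }
    return level;
}
```

For a tridiagonal pattern every row depends on the previous one, so each level holds a single row; a pattern whose rows only reference row 0 places all remaining rows in one level. The number of rows per level is a rough indicator of the parallelism the pattern allows.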
hipsparseXcsric02()#
-
hipsparseStatus_t hipsparseScsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseDcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseCcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
-
hipsparseStatus_t hipsparseZcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#
Incomplete Cholesky factorization with 0 fill-ins and no pivoting using CSR storage format.
hipsparseXcsric02 computes the incomplete Cholesky factorization with 0 fill-ins and no pivoting of a sparse \(m \times m\) CSR matrix \(A\), such that
\[ A \approx LL^T \]
where the lower triangular matrix \(L\) is computed using:
\[\begin{split} L_{ij} = \left\{ \begin{array}{ll} \sqrt{A_{jj} - \sum_{k=0}^{j-1}(L_{jk})^{2}}, & \text{if i == j} \\ \frac{1}{L_{jj}}(A_{ij} - \sum_{k=0}^{j-1}L_{ik} \times L_{jk}), & \text{if i > j} \end{array} \right. \end{split}\]
for each entry found in the CSR matrix \(A\).
Computing the above incomplete Cholesky factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXcsric02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXcsric02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseScsric02, hipsparseDcsric02, hipsparseCcsric02, or hipsparseZcsric02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern, while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXcsric02() are complete, the temporary buffer can be deallocated.
When computing the Cholesky factorization, it is possible that \(L_{jj} == 0\), which would result in a division by zero. This could occur from either \(A_{jj}\) not existing in the sparse CSR matrix (referred to as a structural zero) or because \(A_{jj} - \sum_{k=0}^{j-1}(L_{jk})^{2} == 0\) (referred to as a numerical zero). For example, running the Cholesky factorization on the following matrix:
\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]
results in a successful Cholesky factorization, however running with the matrix:
\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1/2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]
results in a numerical zero because:
\[\begin{split} \begin{array}{ll} L_{00} &= \sqrt{2} \\ L_{10} &= \frac{1}{\sqrt{2}} \\ L_{11} &= \sqrt{\frac{1}{2} - (\frac{1}{\sqrt{2}})^2} = 0 \end{array} \end{split}\]
The user can detect the presence of a structural zero by calling hipsparseXcsric02_zeroPivot() after hipsparseXcsric02_analysis() and/or the presence of a structural or numerical zero by calling hipsparseXcsric02_zeroPivot() after hipsparseXcsric02():
    hipsparseDcsric02(handle, m, nnz, descrM, csrVal, csrRowPtr, csrColInd, info, HIPSPARSE_SOLVE_POLICY_USE_LEVEL, buffer);
    // Check for zero pivot
    if(HIPSPARSE_STATUS_ZERO_PIVOT == hipsparseXcsric02_zeroPivot(handle, info, &position))
    {
        printf("L has structural and/or numerical zero at L(%d,%d)\n", position, position);
    }
In both cases, hipsparseXcsric02_zeroPivot() will report the first zero pivot (either numerical or structural) found. See full example below. The user can also set the diagonal type to be \(1\) using hipsparseSetMatDiagType(), which will interpret the matrix \(A\) as having ones on its diagonal (even if no nonzero exists in the sparsity pattern).
hipsparseXcsric02 computes the Cholesky factorization in place, meaning that the values array csrSortedValA_valM of the \(A\) matrix is overwritten with the \(L\) matrix stored in the lower triangular part of \(A\):
\[\begin{split} \begin{align} \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \rightarrow \begin{bmatrix} l_{00} & a_{01} & a_{02} \\ l_{10} & l_{11} & a_{12} \\ l_{20} & l_{21} & l_{22} \end{bmatrix} \end{align} \end{split}\]
The row pointer array csrSortedRowPtrA and the column indices array csrSortedColIndA remain the same for \(A\) and the output, as the incomplete factorization does not generate new nonzeros in the output which do not already exist in \(A\).
The performance of computing the Cholesky factorization with hipSPARSE greatly depends on the sparsity pattern of the matrix \(A\), as this is what determines the amount of parallelism available.
Note
The sparse CSR matrix has to be sorted. This can be achieved by calling hipsparseXcsrsort().
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA_valM – [inout] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA_valM, csrSortedRowPtrA or csrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.
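The factorization formula and the two kinds of zero pivot described for hipsparseXcsric02() can be traced on the host with a small reference routine. The sketch below is an assumption-laden illustration, not the library's algorithm: ic0_reference is a hypothetical name, the matrix is assumed to be zero-based CSR with sorted column indices, double precision is used, and any non-positive value under the square root is treated as a zero pivot.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical host reference for IC(0). Overwrites the lower triangular
// entries of val with L and leaves upper entries untouched. Returns -1 on
// success or the row index of the first structural or numerical zero pivot.
int ic0_reference(const std::vector<int>& rowPtr,
                  const std::vector<int>& colInd,
                  std::vector<double>&    val)
{
    assert(!rowPtr.empty());
    const int        m = static_cast<int>(rowPtr.size()) - 1;
    std::vector<int> diag(m, -1); // position of L_jj in val for finished rows

    for(int i = 0; i < m; ++i)
    {
        for(int p = rowPtr[i]; p < rowPtr[i + 1] && colInd[p] <= i; ++p)
        {
            const int j = colInd[p];
            double    s = val[p]; // starts as A_ij

            // Subtract sum_{k<j} L_ik * L_jk over columns stored in both rows
            int a = rowPtr[i];
            int b = rowPtr[j];
            while(a < rowPtr[i + 1] && b < rowPtr[j + 1] && colInd[a] < j && colInd[b] < j)
            {
                if(colInd[a] == colInd[b])
                {
                    s -= val[a] * val[b];
                    ++a;
                    ++b;
                }
                else if(colInd[a] < colInd[b])
                {
                    ++a;
                }
                else
                {
                    ++b;
                }
            }

            if(j < i)
            {
                val[p] = s / val[diag[j]]; // off-diagonal case, i > j
            }
            else
            {
                if(s <= 0.0)
                {
                    return i; // numerical zero (or breakdown) at row i
                }
                val[p]  = std::sqrt(s); // diagonal case, i == j
                diag[i] = p;
            }
        }
        if(diag[i] < 0)
        {
            return i; // structural zero: row i stores no diagonal entry
        }
    }
    return -1;
}
```

Running it on the first 3x3 matrix above returns -1. A matrix such as [[4, 2, 0], [2, 1, 1], [0, 1, 2]] yields L_00 = 2, L_10 = 1 and an exactly zero pivot at row 1. In floating-point arithmetic the 1/2 example may not produce an exactly zero pivot, because 1/sqrt(2) is rounded before it is squared.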
#include <hip/hip_runtime.h>
#include <hipsparse/hipsparse.h>

#include <cstdio>
#include <cstdlib>
#include <vector>

// Minimal error-checking macros used by this example: abort on any failure
#define HIP_CHECK(call)                                                               \
    do                                                                                \
    {                                                                                 \
        hipError_t hipStatus_ = (call);                                               \
        if(hipStatus_ != hipSuccess)                                                  \
        {                                                                             \
            fprintf(stderr, "HIP error %d at line %d\n", (int)hipStatus_, __LINE__);  \
            exit(EXIT_FAILURE);                                                       \
        }                                                                             \
    } while(0)

#define HIPSPARSE_CHECK(call)                                                         \
    do                                                                                \
    {                                                                                 \
        hipsparseStatus_t spStatus_ = (call);                                         \
        if(spStatus_ != HIPSPARSE_STATUS_SUCCESS)                                     \
        {                                                                             \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)spStatus_, __LINE__); \
            exit(EXIT_FAILURE);                                                       \
        }                                                                             \
    } while(0)

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4) in CSR format.
    // The 'S' in Scsric02 indicates single precision float.
    // Matrix A:
    // ( 4 1 0 0 )
    // ( 1 5 2 0 )
    // ( 0 2 3 1 )
    // ( 0 0 1 2 )
    // Both triangles of the symmetric matrix are stored. The descriptor's
    // matrix type is not changed here; per the return values listed above,
    // matrix types other than HIPSPARSE_MATRIX_TYPE_GENERAL are not supported.
    // The factorization overwrites the lower triangular part with L.

    int m   = 4; // Number of rows (A is square for Cholesky)
    int nnz = 10; // Number of stored non-zero elements (both triangles of A)

    // CSR row pointers
    std::vector<int> hcsrRowPtr = {0, 2, 5, 8, 10};

    // CSR column indices, sorted within each row
    std::vector<int> hcsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values (single precision float for 'S'csric02).
    // The input values are the entries of A on the non-zero pattern.
    std::vector<float> hcsrVal = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode and diagonal type on the descriptor.
    // Note: as described above, a unit diagonal type makes A be interpreted as
    // having ones on its diagonal; use HIPSPARSE_DIAG_TYPE_NON_UNIT to
    // factorize the stored diagonal values.
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRIC02 info
    csric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsric02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // Overwritten in place with the factorized L values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(float) * nnz));

    HIP_CHECK(
        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsric02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis on the sparsity pattern of A
    HIPSPARSE_CHECK(hipsparseScsric02_analysis(handle,
                                               m,
                                               nnz,
                                               descr,
                                               dcsrVal,
                                               dcsrRowPtr,
                                               dcsrColInd,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization: computes the numerical values of L in dcsrVal
    HIPSPARSE_CHECK(hipsparseScsric02(handle,
                                      m,
                                      nnz,
                                      descr,
                                      dcsrVal,
                                      dcsrRowPtr,
                                      dcsrColInd,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots.
    // HIPSPARSE_STATUS_ZERO_PIVOT is an expected outcome here, so the status
    // is inspected directly instead of being passed to HIPSPARSE_CHECK.
    int               zeroPivot   = -1; // stays -1 if no zero pivot is found
    hipsparseStatus_t pivotStatus = hipsparseXcsric02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during IC02 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("CSRIC02 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L) back to host
    std::vector<float> hcsrVal_result(nnz);
    HIP_CHECK(
        hipMemcpy(hcsrVal_result.data(), dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (lower triangular entries hold L, upper entries are unchanged)
    printf("\nFactorized CSR values (L factor):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyCsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
#include <hip/hip_runtime.h>
#include <hipsparse/hipsparse.h>

#include <stdio.h>
#include <stdlib.h>

/* Minimal error-checking macros used by this example: abort on any failure */
#define HIP_CHECK(call)                                                               \
    do                                                                                \
    {                                                                                 \
        hipError_t hipStatus_ = (call);                                               \
        if(hipStatus_ != hipSuccess)                                                  \
        {                                                                             \
            fprintf(stderr, "HIP error %d at line %d\n", (int)hipStatus_, __LINE__);  \
            exit(EXIT_FAILURE);                                                       \
        }                                                                             \
    } while(0)

#define HIPSPARSE_CHECK(call)                                                         \
    do                                                                                \
    {                                                                                 \
        hipsparseStatus_t spStatus_ = (call);                                         \
        if(spStatus_ != HIPSPARSE_STATUS_SUCCESS)                                     \
        {                                                                             \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)spStatus_, __LINE__); \
            exit(EXIT_FAILURE);                                                       \
        }                                                                             \
    } while(0)

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4) in CSR format.
    // The 'S' in Scsric02 indicates single precision float.
    // Matrix A:
    // ( 4 1 0 0 )
    // ( 1 5 2 0 )
    // ( 0 2 3 1 )
    // ( 0 0 1 2 )
    // Both triangles of the symmetric matrix are stored. The descriptor's
    // matrix type is not changed here; per the return values listed above,
    // matrix types other than HIPSPARSE_MATRIX_TYPE_GENERAL are not supported.
    // The factorization overwrites the lower triangular part with L.

    int m   = 4; // Number of rows (A is square for Cholesky)
    int nnz = 10; // Number of stored non-zero elements (both triangles of A)

    // CSR row pointers
    int hcsrRowPtr[] = {0, 2, 5, 8, 10};

    // CSR column indices, sorted within each row
    int hcsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values (single precision float for 'S'csric02).
    // The input values are the entries of A on the non-zero pattern.
    float hcsrVal[] = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode and diagonal type on the descriptor.
    // Note: as described above, a unit diagonal type makes A be interpreted as
    // having ones on its diagonal; use HIPSPARSE_DIAG_TYPE_NON_UNIT to
    // factorize the stored diagonal values.
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRIC02 info
    csric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsric02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // Overwritten in place with the factorized L values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(hipMalloc((void**)&dcsrVal, sizeof(float) * nnz));

    HIP_CHECK(hipMemcpy(dcsrRowPtr, hcsrRowPtr, sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd, sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal, sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsric02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis on the sparsity pattern of A
    HIPSPARSE_CHECK(hipsparseScsric02_analysis(handle,
                                               m,
                                               nnz,
                                               descr,
                                               dcsrVal,
                                               dcsrRowPtr,
                                               dcsrColInd,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization: computes the numerical values of L in dcsrVal
    HIPSPARSE_CHECK(hipsparseScsric02(handle,
                                      m,
                                      nnz,
                                      descr,
                                      dcsrVal,
                                      dcsrRowPtr,
                                      dcsrColInd,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots.
    // HIPSPARSE_STATUS_ZERO_PIVOT is an expected outcome here, so the status
    // is inspected directly instead of being passed to HIPSPARSE_CHECK.
    int               zeroPivot   = -1; // stays -1 if no zero pivot is found
    hipsparseStatus_t pivotStatus = hipsparseXcsric02_zeroPivot(handle, info, &zeroPivot);
    if(pivotStatus == HIPSPARSE_STATUS_ZERO_PIVOT)
    {
        printf("Error: Zero pivot detected during IC02 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        HIPSPARSE_CHECK(pivotStatus);
        printf("CSRIC02 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L) back to host
    float* hcsrVal_result = (float*)malloc(nnz * sizeof(float));
    if(hcsrVal_result == NULL)
    {
        fprintf(stderr, "Host allocation failed\n");
        return EXIT_FAILURE;
    }
    HIP_CHECK(hipMemcpy(hcsrVal_result, dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (lower triangular entries hold L, upper entries are unchanged)
    printf("\nFactorized CSR values (L factor):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    free(hcsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyCsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}
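Once the factorized values have been copied back to the host, one possible sanity check is to rebuild the product of L with its transpose on the stored lower triangular pattern and compare it with the original entries of A. The sketch below assumes zero-based CSR with sorted column indices and a result array laid out as described for csrSortedValA_valM (L in the lower triangle, upper entries untouched). The name ic0_pattern_residual is hypothetical, and this is a host-side illustration rather than anything hipSPARSE provides.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: largest |A_ij - sum_k L_ik * L_jk| over the stored
// entries with j <= i. aVal holds the original values of A, lVal the values
// after factorization (only its lower triangular entries are read).
double ic0_pattern_residual(const std::vector<int>&    rowPtr,
                            const std::vector<int>&    colInd,
                            const std::vector<double>& aVal,
                            const std::vector<double>& lVal)
{
    assert(!rowPtr.empty());
    assert(aVal.size() == lVal.size());
    const int m     = static_cast<int>(rowPtr.size()) - 1;
    double    worst = 0.0;

    for(int i = 0; i < m; ++i)
    {
        for(int p = rowPtr[i]; p < rowPtr[i + 1] && colInd[p] <= i; ++p)
        {
            const int j = colInd[p];
            double    s = 0.0;

            // Dot product of rows i and j of L over columns k <= j
            int a = rowPtr[i];
            int b = rowPtr[j];
            while(a < rowPtr[i + 1] && b < rowPtr[j + 1] && colInd[a] <= j && colInd[b] <= j)
            {
                if(colInd[a] == colInd[b])
                {
                    s += lVal[a] * lVal[b];
                    ++a;
                    ++b;
                }
                else if(colInd[a] < colInd[b])
                {
                    ++a;
                }
                else
                {
                    ++b;
                }
            }
            worst = std::max(worst, std::fabs(aVal[p] - s));
        }
    }
    return worst;
}
```

For a correct factor the residual should be at the level of rounding error for the precision used. A large value suggests that the wrong array was copied back or that a zero pivot interrupted the factorization.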
1program example_hipsparse_csric02
2 use iso_c_binding
3 implicit none
4
5 ! HIP
6 interface
7 function hipMalloc(ptr, size) &
8 bind(c, name = 'hipMalloc')
9 use iso_c_binding
10 implicit none
11 integer(c_int) :: hipMalloc
12 type(c_ptr) :: ptr
13 integer(c_size_t), value :: size
14 end function hipMalloc
15
16 function hipFree(ptr) &
17 bind(c, name = 'hipFree')
18 use iso_c_binding
19 implicit none
20 integer(c_int) :: hipFree
21 type(c_ptr), value :: ptr
22 end function hipFree
23
24 function hipMemcpy(dst, src, size, kind) &
25 bind(c, name = 'hipMemcpy')
26 use iso_c_binding
27 implicit none
28 integer(c_int) :: hipMemcpy
29 type(c_ptr), value :: dst
30 type(c_ptr), intent(in), value :: src
31 integer(c_size_t), value :: size
32 integer(c_int), value :: kind
33 end function hipMemcpy
34 end interface
35
36 integer, parameter :: hipMemcpyHostToDevice = 1
37 integer, parameter :: hipMemcpyDeviceToHost = 2
38
39 ! hipSPARSE
40 interface
41 function hipsparseCreate(handle) &
42 bind(c, name = 'hipsparseCreate')
43 use iso_c_binding
44 implicit none
45 integer(c_int) :: hipsparseCreate
46 type(c_ptr) :: handle
47 end function hipsparseCreate
48
49 function hipsparseDestroy(handle) &
50 bind(c, name = 'hipsparseDestroy')
51 use iso_c_binding
52 implicit none
53 integer(c_int) :: hipsparseDestroy
54 type(c_ptr), value :: handle
55 end function hipsparseDestroy
56
57 function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseSetMatFillMode(descr, fillMode) &
                bind(c, name = 'hipsparseSetMatFillMode')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatFillMode
            type(c_ptr), value :: descr
            integer(c_int), value :: fillMode
        end function hipsparseSetMatFillMode

        function hipsparseSetMatDiagType(descr, diagType) &
                bind(c, name = 'hipsparseSetMatDiagType')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatDiagType
            type(c_ptr), value :: descr
            integer(c_int), value :: diagType
        end function hipsparseSetMatDiagType

        function hipsparseCreateCsric02Info(info) &
                bind(c, name = 'hipsparseCreateCsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateCsric02Info
            type(c_ptr) :: info
        end function hipsparseCreateCsric02Info

        function hipsparseDestroyCsric02Info(info) &
                bind(c, name = 'hipsparseDestroyCsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyCsric02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyCsric02Info

        function hipsparseScsric02_bufferSize(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                csrSortedColIndA, info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseScsric02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseScsric02_bufferSize

        function hipsparseScsric02_analysis(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsric02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsric02_analysis

        function hipsparseScsric02(handle, m, nnz, descr, csrSortedValA_valM, csrSortedRowPtrA, &
                csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsric02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), value :: csrSortedValA_valM
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsric02

        function hipsparseXcsric02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXcsric02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXcsric02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXcsric02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_FILL_MODE_LOWER = 0
    integer, parameter :: HIPSPARSE_DIAG_TYPE_UNIT = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize

    ! Symmetric positive definite matrix A (4x4) in CSR format
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: nnz = 10

    integer, dimension(m+1), target :: hcsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnz), target :: hcsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnz), target :: hcsrVal = (/4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0/)

    ! Result array
    real(c_float), dimension(nnz), target :: hcsrVal_result

    ! Device pointers
    type(c_ptr) :: dcsrRowPtr
    type(c_ptr) :: dcsrColInd
    type(c_ptr) :: dcsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreate failed'
        stop
    end if

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateMatDescr failed'
        stop
    end if

    ! Set matrix fill mode (lower triangular)
    stat = hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseSetMatFillMode failed'
        stop
    end if

    ! Set matrix diagonal type
    stat = hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseSetMatDiagType failed'
        stop
    end if

    ! Create csric02 info
    stat = hipsparseCreateCsric02Info(info)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateCsric02Info failed'
        stop
    end if

    ! Allocate device memory
    stat = hipMalloc(dcsrRowPtr, int((m + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrColInd, int(nnz * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrVal, int(nnz * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dcsrRowPtr, c_loc(hcsrRowPtr), int((m + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrColInd, c_loc(hcsrColInd), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrVal, c_loc(hcsrVal), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseScsric02_bufferSize(handle, &
                                        m, &
                                        nnz, &
                                        descr, &
                                        dcsrVal, &
                                        dcsrRowPtr, &
                                        dcsrColInd, &
                                        info, &
                                        c_loc(bufferSize))
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02_bufferSize failed'
        stop
    end if

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseScsric02_analysis(handle, &
                                      m, &
                                      nnz, &
                                      descr, &
                                      dcsrVal, &
                                      dcsrRowPtr, &
                                      dcsrColInd, &
                                      info, &
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                      dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02_analysis failed'
        stop
    end if

    ! Perform factorization
    stat = hipsparseScsric02(handle, &
                             m, &
                             nnz, &
                             descr, &
                             dcsrVal, &
                             dcsrRowPtr, &
                             dcsrColInd, &
                             info, &
                             HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                             dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02 failed'
        stop
    end if

    ! Check for zero pivots
    stat = hipsparseXcsric02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'CSRIC02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hcsrVal_result), dcsrVal, int(nnz * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized CSR values (L factor):'
    do i = 1, nnz
        write(*,*) 'val[', i-1, '] =', hcsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dcsrRowPtr)
    stat = hipFree(dcsrColInd)
    stat = hipFree(dcsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyCsric02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_csric02
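The values printed by the example above can be sanity-checked against a host-side computation. The following C++ function is a minimal sketch of a zero fill-in incomplete Cholesky factorization on a 0-based CSR matrix with sorted column indices. The name hostCsric0 is hypothetical and not part of hipSPARSE, and the sketch makes no claim about how the library computes the factor internally. It only rewrites entries on or below the diagonal, so a comparison with device results should be restricted to those entries.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical host-side reference (not a hipSPARSE API): zero fill-in
// incomplete Cholesky on a 0-based CSR matrix with sorted column indices.
// Entries on or below the diagonal are overwritten with L; entries above
// the diagonal are left untouched. Returns the first row whose pivot is
// missing or not positive, or -1 if the factorization completed.
int hostCsric0(int                     m,
               const std::vector<int>& rowPtr,
               const std::vector<int>& colInd,
               std::vector<float>&     val)
{
    std::vector<int> diag(m, -1); // position of the diagonal entry of each row

    for(int i = 0; i < m; ++i)
    {
        for(int p = rowPtr[i]; p < rowPtr[i + 1]; ++p)
        {
            const int j = colInd[p];
            if(j > i)
                break; // sorted columns: the rest of the row is strictly upper

            // sum = A(i,j) - sum over k < j of L(i,k) * L(j,k),
            // restricted to columns present in both rows
            float sum = val[p];
            int   a   = rowPtr[i];
            int   b   = rowPtr[j];
            while(a < p && b < rowPtr[j + 1] && colInd[b] < j)
            {
                if(colInd[a] == colInd[b])
                {
                    sum -= val[a] * val[b];
                    ++a;
                    ++b;
                }
                else if(colInd[a] < colInd[b])
                    ++a;
                else
                    ++b;
            }

            if(j < i)
            {
                // Row j was completed earlier, so its diagonal is known
                val[p] = sum / val[diag[j]];
            }
            else
            {
                if(sum <= 0.0f)
                    return i; // numerical breakdown
                val[p]  = std::sqrt(sum);
                diag[i] = p;
            }
        }
        if(diag[i] < 0)
            return i; // diagonal entry missing (structural zero)
    }
    return -1;
}
```

Calling this sketch with the hcsrRowPtr, hcsrColInd and hcsrVal arrays from the example above returns -1, and the first two lower-triangular entries become 2 (position 0) and 0.5 (position 2).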
hipsparseXgtsv2_bufferSizeExt()#
-
hipsparseStatus_t hipsparseSgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, const float *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, const double *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *B, int ldb, size_t *pBufferSizeInBytes)#
hipsparseXgtsv2_bufferSizeExt returns the size, in bytes, of the temporary storage buffer that is required by hipsparseXgtsv2(). The temporary storage buffer must be allocated by the user.
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
Note
This routine supports execution in a hipGraph context.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system. Must be at least 2.
n – [in] number of columns in the dense matrix B. Must be non-negative.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [in] dense matrix of size (ldb, n).
ldb – [in] leading dimension of B. Must satisfy ldb >= max(1, m).
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXgtsv2().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, dl, d, du, B or pBufferSizeInBytes is nullptr, m is less than 2 or n is negative, or ldb is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
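The size constraints listed above can be checked on the host before any device call is made. The following C++ sketch mirrors the documented conditions (m >= 2, n >= 0, ldb >= max(1, m)) and shows how the column-ordered matrix B is sized and indexed. All three function names are hypothetical helpers, not part of hipSPARSE.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a hipSPARSE API): mirrors the documented size
// constraints so that invalid sizes are caught before any device call.
bool gtsv2SizesAreValid(int m, int n, int ldb)
{
    return m >= 2 && n >= 0 && ldb >= std::max(1, m);
}

// Number of elements to allocate for the column-ordered matrix B.
std::size_t gtsv2DenseElementCount(int n, int ldb)
{
    return static_cast<std::size_t>(ldb) * static_cast<std::size_t>(n);
}

// Linear index of row i in right-hand side column j of B.
std::size_t gtsv2DenseIndex(int i, int j, int ldb)
{
    return static_cast<std::size_t>(j) * static_cast<std::size_t>(ldb)
           + static_cast<std::size_t>(i);
}
```

With m = 5, n = 3 and ldb = 5, B holds 15 elements and the third row of the second right-hand side sits at linear index 7.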
hipsparseXgtsv2()#
-
hipsparseStatus_t hipsparseSgtsv2(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, float *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseDgtsv2(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, double *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseCgtsv2(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseZgtsv2(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *B, int ldb, void *pBuffer)#
Tridiagonal solver with pivoting.
hipsparseXgtsv2 solves a tridiagonal system for multiple right-hand sides using pivoting
\[ T*B = B \]
where \(T\) is a sparse tridiagonal matrix and \(B\) is a dense \(ldb \times n\) matrix storing the right-hand side vectors in column order. The tridiagonal matrix \(T\) is defined by three vectors: dl for the lower diagonal, d for the main diagonal and du for the upper diagonal.
Solving the tridiagonal system involves two steps. First, the user calls hipsparseXgtsv2_bufferSizeExt() to determine the size of the required temporary storage buffer. The user then allocates this buffer and passes it to hipsparseXgtsv2() to perform the actual solve. The dense matrix \(B\), which initially stores the n right-hand side vectors, is overwritten with the n solution vectors after the call to hipsparseXgtsv2().
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
Note
This routine supports execution in a hipGraph context.
- Parameters:
handle – [in] handle to the hipSPARSE library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
n – [in] number of columns in the dense matrix B.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [inout] Dense matrix of size (ldb, n).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, n, ldb, dl, d, du, B or pBuffer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
#include <cstdio>
#include <vector>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

// HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined error-checking macros.

int main(int argc, char* argv[])
{
    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 3 0 0 0
    //  2 4 2 0 0
    //  0 1 1 1 0
    //  0 0 1 3 1
    //  0 0 0 1 4
    std::vector<float> hdl = {0.0f, 2.0f, 1.0f, 1.0f, 1.0f};
    std::vector<float> hd  = {2.0f, 4.0f, 1.0f, 3.0f, 4.0f};
    std::vector<float> hdu = {3.0f, 2.0f, 1.0f, 1.0f, 0.0f};

    // Host right-hand side column vectors
    std::vector<float> hB(ldb * n, 2.0f);

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dB  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // 1. Get buffer size (n is the number of right-hand side columns, not m)
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform tridiagonal solve with pivoting
    // The solution overwrites the right-hand sides stored in dB.
    HIPSPARSE_CHECK(hipsparseSgtsv2(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy all n solution vectors back to host from dB
    HIP_CHECK(hipMemcpy(hB.data(), dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution for the first right-hand side
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
#include <stdio.h>
#include <stdlib.h>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

/* HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined error-checking macros. */

int main(int argc, char* argv[])
{
    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 3 0 0 0
    //  2 4 2 0 0
    //  0 1 1 1 0
    //  0 0 1 3 1
    //  0 0 0 1 4
    float hdl[] = {0.0f, 2.0f, 1.0f, 1.0f, 1.0f};
    float hd[]  = {2.0f, 4.0f, 1.0f, 3.0f, 4.0f};
    float hdu[] = {3.0f, 2.0f, 1.0f, 1.0f, 0.0f};

    // Host right-hand side column vectors, every entry set to 2
    float* hB = (float*)malloc(sizeof(float) * ldb * n);
    for(int i = 0; i < ldb * n; ++i)
    {
        hB[i] = 2.0f;
    }

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dB  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB, sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // 1. Get buffer size (n is the number of right-hand side columns, not m)
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform tridiagonal solve with pivoting
    // The solution overwrites the right-hand sides stored in dB.
    HIPSPARSE_CHECK(hipsparseSgtsv2(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy all n solution vectors back to host from dB
    HIP_CHECK(hipMemcpy(hB, dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution for the first right-hand side
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));
    free(hB);

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
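Because the solve overwrites the right-hand side, a quick way to validate a result on the host is to multiply the tridiagonal matrix by the returned solution and compare against a saved copy of the original right-hand side. The following C++ sketch does this for one column, using the same dl, d, du layout as the examples above. The function name tridiagonalResidual is hypothetical and not part of hipSPARSE.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical host-side check (not a hipSPARSE API): returns the largest
// absolute entry of T * x - b, where T is given by dl, d and du in the
// layout used above (dl[0] and du[m - 1] are padding and are never read).
float tridiagonalResidual(const std::vector<float>& dl,
                          const std::vector<float>& d,
                          const std::vector<float>& du,
                          const std::vector<float>& x,
                          const std::vector<float>& b)
{
    const int m      = static_cast<int>(d.size());
    float     maxAbs = 0.0f;
    for(int i = 0; i < m; ++i)
    {
        float row = d[i] * x[i];
        if(i > 0)
            row += dl[i] * x[i - 1]; // coupling to the previous unknown
        if(i + 1 < m)
            row += du[i] * x[i + 1]; // coupling to the next unknown
        maxAbs = std::max(maxAbs, std::fabs(row - b[i]));
    }
    return maxAbs;
}
```

In the examples above, x would be the first m entries of hB after the copy back to the host, and b a vector of m entries equal to 2. A residual close to the single-precision rounding level indicates a consistent solution.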
hipsparseXgtsv2_nopivot_bufferSizeExt()#
-
hipsparseStatus_t hipsparseSgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, const float *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, const double *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *B, int ldb, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *B, int ldb, size_t *pBufferSizeInBytes)#
hipsparseXgtsv2_nopivot_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsv2_nopivot(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system. Must be >= 2.
n – [in] number of columns in the dense matrix B. Must be non-negative.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [in] Dense matrix of size (ldb, n).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXgtsv2_nopivot().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, dl, d, du, B or pBufferSizeInBytes is nullptr, m is less than 2 or n is negative, or ldb is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXgtsv2_nopivot()#
-
hipsparseStatus_t hipsparseSgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, float *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseDgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, double *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseCgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *B, int ldb, void *pBuffer)#
-
hipsparseStatus_t hipsparseZgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *B, int ldb, void *pBuffer)#
Tridiagonal solver (no pivoting).
hipsparseXgtsv2_nopivot solves a tridiagonal linear system for multiple right-hand sides without pivoting
\[ T*B = B \]
where \(T\) is a sparse tridiagonal matrix and \(B\) is a dense \(ldb \times n\) matrix storing the right-hand side vectors in column order. The tridiagonal matrix \(T\) is defined by three vectors: dl for the lower diagonal, d for the main diagonal and du for the upper diagonal.
Solving the tridiagonal system with multiple right-hand sides without pivoting involves two steps. First, the user calls hipsparseXgtsv2_nopivot_bufferSizeExt() to determine the size of the required temporary storage buffer. The user then allocates this buffer and passes it to hipsparseXgtsv2_nopivot() to perform the actual solve. The dense matrix \(B\), which initially stores the n right-hand side vectors, is overwritten with the n solution vectors after the call to hipsparseXgtsv2_nopivot().
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
n – [in] number of columns in the dense matrix B.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [inout] Dense matrix of size (ldb, n).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, n, ldb, dl, d, du, B or pBuffer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
#include <cstdio>
#include <vector>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

// HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined error-checking macros.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 -1  0  0  0
    // -1  2 -1  0  0
    //  0 -1  2 -1  0
    //  0  0 -1  2 -1
    //  0  0  0 -1  2
    std::vector<float> hdl = {0.0f, -1.0f, -1.0f, -1.0f, -1.0f};
    std::vector<float> hd  = {2.0f, 2.0f, 2.0f, 2.0f, 2.0f};
    std::vector<float> hdu = {-1.0f, -1.0f, -1.0f, -1.0f, 0.0f};

    // Host right-hand side column vectors
    std::vector<float> hB(ldb * n, 1.0f);

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dB  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // Obtain required buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_nopivot_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc(&dbuffer, bufferSize));

    // Solve without pivoting; the solution overwrites the right-hand sides in dB
    HIPSPARSE_CHECK(hipsparseSgtsv2_nopivot(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy the solution vectors back to host
    HIP_CHECK(hipMemcpy(hB.data(), dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution for the first right-hand side
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
#include <stdio.h>
#include <stdlib.h>

#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

/* HIP_CHECK and HIPSPARSE_CHECK are assumed to be user-defined error-checking macros. */

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 -1  0  0  0
    // -1  2 -1  0  0
    //  0 -1  2 -1  0
    //  0  0 -1  2 -1
    //  0  0  0 -1  2
    float hdl[] = {0.0f, -1.0f, -1.0f, -1.0f, -1.0f};
    float hd[]  = {2.0f, 2.0f, 2.0f, 2.0f, 2.0f};
    float hdu[] = {-1.0f, -1.0f, -1.0f, -1.0f, 0.0f};

    // Host right-hand side column vectors, every entry set to 1
    float* hB = (float*)malloc(sizeof(float) * ldb * n);
    for(int i = 0; i < ldb * n; ++i)
    {
        hB[i] = 1.0f;
    }

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dB  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB, sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // Obtain required buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_nopivot_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc(&dbuffer, bufferSize));

    // Solve without pivoting; the solution overwrites the right-hand sides in dB
    HIPSPARSE_CHECK(hipsparseSgtsv2_nopivot(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy the solution vectors back to host
    HIP_CHECK(hipMemcpy(hB, dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution for the first right-hand side
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));
    free(hB);

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
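To illustrate what solving without pivoting means, the following C++ sketch implements the classic Thomas algorithm on the host for a single right-hand side, using the same dl, d, du layout as the examples above. It is a reference for checking results only: the function name thomasSolve is hypothetical, and hipsparseXgtsv2_nopivot may well use a different, parallel algorithm internally.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical host-side reference (not a hipSPARSE API): Thomas algorithm,
// i.e. Gaussian elimination without pivoting on a tridiagonal system.
// dl[0] and du[m - 1] are padding entries and are never read.
// b holds the right-hand side on entry and the solution on return.
// Returns false if a zero pivot is met; no row exchange is attempted.
bool thomasSolve(const std::vector<float>& dl,
                 const std::vector<float>& d,
                 const std::vector<float>& du,
                 std::vector<float>&       b)
{
    const int          m = static_cast<int>(d.size());
    std::vector<float> c(m, 0.0f); // scaled upper diagonal

    // Forward elimination
    for(int i = 0; i < m; ++i)
    {
        float pivot = d[i];
        float rhs   = b[i];
        if(i > 0)
        {
            pivot -= dl[i] * c[i - 1];
            rhs -= dl[i] * b[i - 1];
        }
        if(pivot == 0.0f)
            return false;
        if(i + 1 < m)
            c[i] = du[i] / pivot;
        b[i] = rhs / pivot;
    }

    // Back substitution
    for(int i = m - 2; i >= 0; --i)
        b[i] -= c[i] * b[i + 1];

    return true;
}
```

For the 5x5 matrix used in the examples above and an all-ones right-hand side, this sketch produces approximately 2.5, 4, 4.5, 4, 2.5, which can be compared with the values printed for the first column.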
hipsparseXgtsv2StridedBatch_bufferSizeExt()#
-
hipsparseStatus_t hipsparseSgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const float *dl, const float *d, const float *du, const float *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const double *dl, const double *d, const double *du, const double *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#
hipsparseXgtsv2StridedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsv2StridedBatch(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system.
dl – [in] lower diagonal of tri-diagonal system, where the ith system lower diagonal starts at dl + batchStride * i.
d – [in] main diagonal of tri-diagonal system, where the ith system diagonal starts at d + batchStride * i.
du – [in] upper diagonal of tri-diagonal system, where the ith system upper diagonal starts at du + batchStride * i.
x – [in] Dense array of right-hand sides, where the ith right-hand side starts at x + batchStride * i.
batchCount – [in] The number of systems to solve.
batchStride – [in] The number of elements that separate each system. Must satisfy batchStride >= m.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXgtsv2StridedBatch().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, batchStride, dl, d, du, x or pBufferSizeInBytes is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXgtsv2StridedBatch()#
-
hipsparseStatus_t hipsparseSgtsv2StridedBatch(hipsparseHandle_t handle, int m, const float *dl, const float *d, const float *du, float *x, int batchCount, int batchStride, void *pBuffer)#
-
hipsparseStatus_t hipsparseDgtsv2StridedBatch(hipsparseHandle_t handle, int m, const double *dl, const double *d, const double *du, double *x, int batchCount, int batchStride, void *pBuffer)#
-
hipsparseStatus_t hipsparseCgtsv2StridedBatch(hipsparseHandle_t handle, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *x, int batchCount, int batchStride, void *pBuffer)#
-
hipsparseStatus_t hipsparseZgtsv2StridedBatch(hipsparseHandle_t handle, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *x, int batchCount, int batchStride, void *pBuffer)#
Strided batch tridiagonal solver (no pivoting).
hipsparseXgtsv2StridedBatch solves a batched tridiagonal linear system
\[ T^{i}*x^{i} = x^{i} \]
where for each batch \(i=0\ldots\text{batchCount}-1\), \(T^{i}\) is a sparse tridiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the tridiagonal matrices, \(T^{i}\), are packed one after the other into three vectors: dl for the lower diagonals, d for the main diagonals and du for the upper diagonals. See below for a description of what this strided memory pattern looks like.
Solving the batched tridiagonal system involves two steps. First, the user calls hipsparseXgtsv2StridedBatch_bufferSizeExt() to determine the size of the required temporary storage buffer. The user then allocates this buffer and passes it to hipsparseXgtsv2StridedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially store the right-hand side values, are overwritten with the solution after the call to hipsparseXgtsv2StridedBatch().
The strided batch routines write each batch matrix one after the other in memory. For example, consider the following batch matrices:
\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & 0 \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ 0 & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & 0 \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ 0 & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & 0 \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ 0 & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]
In strided format, the upper, lower, and diagonal arrays would look like:
\[\begin{split} \begin{align} \text{lower} &= \begin{bmatrix} 0 & t^{0}_{10} & t^{0}_{21} & 0 & t^{1}_{10} & t^{1}_{21} & 0 & t^{2}_{10} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{0}_{11} & t^{0}_{22} & t^{1}_{00} & t^{1}_{11} & t^{1}_{22} & t^{2}_{00} & t^{2}_{11} & t^{2}_{22} \end{bmatrix} \\ \text{upper} &= \begin{bmatrix} t^{0}_{01} & t^{0}_{12} & 0 & t^{1}_{01} & t^{1}_{12} & 0 & t^{2}_{01} & t^{2}_{12} & 0 \end{bmatrix} \\ \end{align} \end{split}\]
For the lower array, for each batch i, the entry at index i*batchStride is zero, and for the upper array, the entry at index i*batchStride+batchStride-1 is zero (in this example, batchStride = m = 3).
Note
This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
x – [inout] Dense array of right-hand sides, where the ith right-hand side starts at x + batchStride * i.
batchCount – [in] The number of systems to solve.
batchStride – [in] The number of elements that separate each system. Must satisfy batchStride >= m.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, batchStride, dl, d, du, x or pBuffer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
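The strided layout described above can be assembled on the host before copying the arrays to the device. The following C++ sketch packs one array per system (for example, each system's lower diagonal) into a single strided array, so that system i starts at offset i * batchStride. The function name packStrided is hypothetical and not part of hipSPARSE, and zero-filling the gap between systems when batchStride > m is a choice made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side helper (not a hipSPARSE API): copies per-system
// arrays of length m into one strided array in which system i starts at
// offset i * batchStride. Any gap between systems (batchStride > m) is
// zero-filled.
std::vector<float> packStrided(const std::vector<std::vector<float>>& systems,
                               int                                    m,
                               int                                    batchStride)
{
    assert(batchStride >= m);
    const std::size_t  stride = static_cast<std::size_t>(batchStride);
    std::vector<float> packed(systems.size() * stride, 0.0f);
    for(std::size_t i = 0; i < systems.size(); ++i)
    {
        for(int j = 0; j < m; ++j)
        {
            packed[i * stride + static_cast<std::size_t>(j)] = systems[i][j];
        }
    }
    return packed;
}
```

The same helper can be applied to dl, d, du and x in turn; the resulting arrays are then copied to the device and passed to hipsparseXgtsv2StridedBatch_bufferSizeExt() and hipsparseXgtsv2StridedBatch() with the matching batchCount and batchStride.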
hipsparseXgtsvInterleavedBatch_bufferSizeExt()#
-
hipsparseStatus_t hipsparseSgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const float *dl, const float *d, const float *du, const float *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const double *dl, const double *d, const double *du, const double *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *x, int batchCount, size_t *pBufferSizeInBytes)#
hipsparseXgtsvInterleavedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsvInterleavedBatch(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
algo – [in] Algorithm to use when solving tridiagonal systems. Options are Thomas (algo=0), LU (algo=1), or QR (algo=2). The Thomas algorithm is the fastest but is not stable, while LU and QR are slower but stable.
m – [in] size of the tri-diagonal linear system.
dl – [in] lower diagonal of tri-diagonal system. The first element of the lower diagonal must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. The last element of the upper diagonal must be zero.
x – [in] Dense array of right-hand sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXgtsvInterleavedBatch().
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, dl, d, du, x or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXgtsvInterleavedBatch()#
-
hipsparseStatus_t hipsparseSgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, float *dl, float *d, float *du, float *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseDgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, double *dl, double *d, double *du, double *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseCgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipComplex *dl, hipComplex *d, hipComplex *du, hipComplex *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseZgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipDoubleComplex *dl, hipDoubleComplex *d, hipDoubleComplex *du, hipDoubleComplex *x, int batchCount, void *pBuffer)#
Interleaved Batch tridiagonal solver.
hipsparseXgtsvInterleavedBatch solves a batched tridiagonal linear system
\[ T^{i}*x^{i} = x^{i} \]
where for each batch \(i=0\ldots\)batchCount-1, \(T^{i}\) is a sparse tridiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the tridiagonal matrices, \(T^{i}\), are packed in an interleaved fashion into three vectors: dl for the lower diagonals, d for the main diagonals and du for the upper diagonals. See below for a description of what this interleaved memory pattern looks like.
Solving the batched tridiagonal system involves two steps. First, the user calls hipsparseXgtsvInterleavedBatch_bufferSizeExt() to determine the size of the required temporary storage buffer. The user then allocates this buffer and passes it to hipsparseXgtsvInterleavedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially store the right-hand side values, are overwritten with the solution after the call to hipsparseXgtsvInterleavedBatch().
The user can specify different algorithms for hipsparseXgtsvInterleavedBatch to use. Options are Thomas (algo=0), LU (algo=1), or QR (algo=2).
Unlike the strided batch routines, which write each batch matrix one after the other in memory, the interleaved routines write the batch matrices such that each element from each matrix is written consecutively one after the other. For example, consider the following batch matrices:
\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & 0 \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ 0 & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & 0 \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ 0 & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & 0 \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ 0 & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]
In interleaved format, the lower, diagonal, and upper arrays would look like:
\[\begin{split} \begin{align} \text{lower} &= \begin{bmatrix} 0 & 0 & 0 & t^{0}_{10} & t^{1}_{10} & t^{2}_{10} & t^{0}_{21} & t^{1}_{21} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{1}_{00} & t^{2}_{00} & t^{0}_{11} & t^{1}_{11} & t^{2}_{11} & t^{0}_{22} & t^{1}_{22} & t^{2}_{22} \end{bmatrix} \\ \text{upper} &= \begin{bmatrix} t^{0}_{01} & t^{1}_{01} & t^{2}_{01} & t^{0}_{12} & t^{1}_{12} & t^{2}_{12} & 0 & 0 & 0 \end{bmatrix} \\ \end{align} \end{split}\]
For the lower array, the first batchCount entries are zero and for the upper array the last batchCount entries are zero.
Note
This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
algo – [in] Algorithm to use when solving tridiagonal systems. Options are Thomas (algo=0), LU (algo=1), or QR (algo=2). The Thomas algorithm is the fastest but is not stable, while LU and QR are slower but stable.
m – [in] size of the tri-diagonal linear system.
dl – [inout] lower diagonal of tri-diagonal system. The first element of the lower diagonal must be zero.
d – [inout] main diagonal of tri-diagonal system.
du – [inout] upper diagonal of tri-diagonal system. The last element of the upper diagonal must be zero.
x – [inout] Dense array of right-hand sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBuffer – [in] temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, dl, d, du, x or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
// C++ example
#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

#include <cstdio>
#include <cstdlib>
#include <vector>

// Minimal error-checking helpers. They are not part of hipSPARSE; replace
// them with your own project's macros if you already have some.
#define HIP_CHECK(expr) \
    do \
    { \
        hipError_t err_ = (expr); \
        if(err_ != hipSuccess) \
        { \
            std::fprintf(stderr, "HIP error %d at line %d\n", (int)err_, __LINE__); \
            std::exit(EXIT_FAILURE); \
        } \
    } while(0)

#define HIPSPARSE_CHECK(expr) \
    do \
    { \
        hipsparseStatus_t status_ = (expr); \
        if(status_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            std::fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)status_, __LINE__); \
            std::exit(EXIT_FAILURE); \
        } \
    } while(0)

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of each square tridiagonal matrix
    int m = 6;

    // Number of batches
    int batchCount = 4;

    // Can be Thomas algorithm (0), LU (1), or QR (2)
    int algo = 1;

    // Host tridiagonal matrix
    std::vector<float> hdl(m * batchCount);
    std::vector<float> hd(m * batchCount);
    std::vector<float> hdu(m * batchCount);

    // Solve multiple tridiagonal matrix systems by interleaving matrices for better memory access:
    //
    //      4 2 0 0 0 0        5 3 0 0 0 0        6 4 0 0 0 0        7 5 0 0 0 0
    //      2 4 2 0 0 0        3 5 3 0 0 0        4 6 4 0 0 0        5 7 5 0 0 0
    // A1 = 0 2 4 2 0 0   A2 = 0 3 5 3 0 0   A3 = 0 4 6 4 0 0   A4 = 0 5 7 5 0 0
    //      0 0 2 4 2 0        0 0 3 5 3 0        0 0 4 6 4 0        0 0 5 7 5 0
    //      0 0 0 2 4 2        0 0 0 3 5 3        0 0 0 4 6 4        0 0 0 5 7 5
    //      0 0 0 0 2 4        0 0 0 0 3 5        0 0 0 0 4 6        0 0 0 0 5 7
    //
    // hdl = 0 0 0 0 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5
    // hd  = 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7
    // hdu = 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 0 0 0 0
    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hdl[batchCount * i + b] = 2 + b;
            hd[batchCount * i + b]  = 4 + b;
            hdu[batchCount * i + b] = 2 + b;
        }

        // First lower and last upper entry of every system must be zero
        hdl[batchCount * 0 + b]       = 0.0f;
        hdu[batchCount * (m - 1) + b] = 0.0f;
    }

    // Host dense rhs
    std::vector<float> hx(m * batchCount);

    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hx[batchCount * i + b] = static_cast<float>(b + 1);
        }
    }

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dx  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dx, sizeof(float) * m * batchCount));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dx, hx.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSgtsvInterleavedBatch_bufferSizeExt(
        handle, algo, m, ddl, dd, ddu, dx, batchCount, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform batched tridiagonal solve
    HIPSPARSE_CHECK(
        hipsparseSgtsvInterleavedBatch(handle, algo, m, ddl, dd, ddu, dx, batchCount, dbuffer));

    // Copy solution back to host
    HIP_CHECK(hipMemcpy(hx.data(), dx, sizeof(float) * m * batchCount, hipMemcpyDeviceToHost));

    // Print the solutions
    std::printf("Solutions for batched tridiagonal systems:\n");
    for(int b = 0; b < batchCount; ++b)
    {
        std::printf("  Batch %d:\n", b);
        for(int i = 0; i < m; ++i)
        {
            std::printf("    x[%d] = %f\n", i, hx[i * batchCount + b]);
        }
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dx));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
/* C example */
#include <hip/hip_runtime_api.h>
#include <hipsparse/hipsparse.h>

#include <stdio.h>
#include <stdlib.h>

/* Minimal error-checking helpers. They are not part of hipSPARSE; replace
   them with your own project's macros if you already have some. */
#define HIP_CHECK(expr) \
    do \
    { \
        hipError_t err_ = (expr); \
        if(err_ != hipSuccess) \
        { \
            fprintf(stderr, "HIP error %d at line %d\n", (int)err_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

#define HIPSPARSE_CHECK(expr) \
    do \
    { \
        hipsparseStatus_t status_ = (expr); \
        if(status_ != HIPSPARSE_STATUS_SUCCESS) \
        { \
            fprintf(stderr, "hipSPARSE error %d at line %d\n", (int)status_, __LINE__); \
            exit(EXIT_FAILURE); \
        } \
    } while(0)

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of each square tridiagonal matrix
    int m = 6;

    // Number of batches
    int batchCount = 4;

    // Can be Thomas algorithm (0), LU (1), or QR (2)
    int algo = 1;

    // Host tridiagonal matrix
    float* hdl = (float*)malloc((m * batchCount) * sizeof(float));
    float* hd  = (float*)malloc((m * batchCount) * sizeof(float));
    float* hdu = (float*)malloc((m * batchCount) * sizeof(float));

    // Solve multiple tridiagonal matrix systems by interleaving matrices for better memory access:
    //
    //      4 2 0 0 0 0        5 3 0 0 0 0        6 4 0 0 0 0        7 5 0 0 0 0
    //      2 4 2 0 0 0        3 5 3 0 0 0        4 6 4 0 0 0        5 7 5 0 0 0
    // A1 = 0 2 4 2 0 0   A2 = 0 3 5 3 0 0   A3 = 0 4 6 4 0 0   A4 = 0 5 7 5 0 0
    //      0 0 2 4 2 0        0 0 3 5 3 0        0 0 4 6 4 0        0 0 5 7 5 0
    //      0 0 0 2 4 2        0 0 0 3 5 3        0 0 0 4 6 4        0 0 0 5 7 5
    //      0 0 0 0 2 4        0 0 0 0 3 5        0 0 0 0 4 6        0 0 0 0 5 7
    //
    // hdl = 0 0 0 0 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5
    // hd  = 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7
    // hdu = 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 0 0 0 0
    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hdl[batchCount * i + b] = 2 + b;
            hd[batchCount * i + b]  = 4 + b;
            hdu[batchCount * i + b] = 2 + b;
        }

        // First lower and last upper entry of every system must be zero
        hdl[batchCount * 0 + b]       = 0.0f;
        hdu[batchCount * (m - 1) + b] = 0.0f;
    }

    // Host dense rhs
    float* hx = (float*)malloc((m * batchCount) * sizeof(float));

    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hx[batchCount * i + b] = (float)(b + 1);
        }
    }

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dx  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dx, sizeof(float) * m * batchCount));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dx, hx, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSgtsvInterleavedBatch_bufferSizeExt(
        handle, algo, m, ddl, dd, ddu, dx, batchCount, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform batched tridiagonal solve
    HIPSPARSE_CHECK(
        hipsparseSgtsvInterleavedBatch(handle, algo, m, ddl, dd, ddu, dx, batchCount, dbuffer));

    // Copy solution back to host
    HIP_CHECK(hipMemcpy(hx, dx, sizeof(float) * m * batchCount, hipMemcpyDeviceToHost));

    // Print the solutions
    printf("Solutions for batched tridiagonal systems:\n");
    for(int b = 0; b < batchCount; ++b)
    {
        printf("  Batch %d:\n", b);
        for(int i = 0; i < m; ++i)
        {
            printf("    x[%d] = %f\n", i, hx[i * batchCount + b]);
        }
    }

    // Clean up device memory
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dx));
    HIP_CHECK(hipFree(dbuffer));

    // Clean up host memory
    free(hdl);
    free(hd);
    free(hdu);
    free(hx);

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}
hipsparseXgpsvInterleavedBatch_bufferSizeExt()#
-
hipsparseStatus_t hipsparseSgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const float *ds, const float *dl, const float *d, const float *du, const float *dw, const float *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseDgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const double *ds, const double *dl, const double *d, const double *du, const double *dw, const double *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseCgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipComplex *ds, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *dw, const hipComplex *x, int batchCount, size_t *pBufferSizeInBytes)#
-
hipsparseStatus_t hipsparseZgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipDoubleComplex *ds, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *dw, const hipDoubleComplex *x, int batchCount, size_t *pBufferSizeInBytes)#
hipsparseXgpsvInterleavedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgpsvInterleavedBatch(). The temporary storage buffer must be allocated by the user.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
algo – [in] algorithm to solve the linear system.
m – [in] size of the pentadiagonal linear system.
ds – [in] lower diagonal (distance 2) of pentadiagonal system. First two entries must be zero.
dl – [in] lower diagonal of pentadiagonal system. First entry must be zero.
d – [in] main diagonal of pentadiagonal system.
du – [in] upper diagonal of pentadiagonal system. Last entry must be zero.
dw – [in] upper diagonal (distance 2) of pentadiagonal system. Last two entries must be zero.
x – [in] Dense array of right-hand sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBufferSizeInBytes – [out] Number of bytes of the temporary storage buffer required.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, algo, batchCount, ds, dl, d, du, dw, x or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
hipsparseXgpsvInterleavedBatch()#
-
hipsparseStatus_t hipsparseSgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, float *ds, float *dl, float *d, float *du, float *dw, float *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseDgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, double *ds, double *dl, double *d, double *du, double *dw, double *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseCgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipComplex *ds, hipComplex *dl, hipComplex *d, hipComplex *du, hipComplex *dw, hipComplex *x, int batchCount, void *pBuffer)#
-
hipsparseStatus_t hipsparseZgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipDoubleComplex *ds, hipDoubleComplex *dl, hipDoubleComplex *d, hipDoubleComplex *du, hipDoubleComplex *dw, hipDoubleComplex *x, int batchCount, void *pBuffer)#
Interleaved Batch pentadiagonal solver.
hipsparseXgpsvInterleavedBatch solves a batch of pentadiagonal linear systems
\[ P^{i}*x^{i} = x^{i} \]
where for each batch \(i=0\ldots\)batchCount-1, \(P^{i}\) is a sparse pentadiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the pentadiagonal matrices, \(P^{i}\), are packed in an interleaved fashion into five vectors: ds for the lowest diagonals, dl for the lower diagonals, d for the main diagonals, du for the upper diagonals, and dw for the highest diagonals. See below for a description of what this interleaved memory pattern looks like.
Solving the batched pentadiagonal system involves two steps. First, the user calls hipsparseXgpsvInterleavedBatch_bufferSizeExt() to determine the size of the required temporary storage buffer. The user then allocates this buffer and passes it to hipsparseXgpsvInterleavedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially store the right-hand side values, are overwritten with the solution after the call to hipsparseXgpsvInterleavedBatch().
Unlike the strided batch routines which write each batch matrix one after the other in memory, the interleaved routines write the batch matrices such that each element from each matrix is written consecutively one after the other. For example, consider the following batch matrices:
\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & t^{0}_{02} \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ t^{0}_{20} & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & t^{1}_{02} \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ t^{1}_{20} & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & t^{2}_{02} \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ t^{2}_{20} & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]
In interleaved format, the lowest, lower, diagonal, higher, and highest arrays would look like:
\[\begin{split} \begin{align} \text{lowest} &= \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & t^{0}_{20} & t^{1}_{20} & t^{2}_{20} \end{bmatrix} \\ \text{lower} &= \begin{bmatrix} 0 & 0 & 0 & t^{0}_{10} & t^{1}_{10} & t^{2}_{10} & t^{0}_{21} & t^{1}_{21} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{1}_{00} & t^{2}_{00} & t^{0}_{11} & t^{1}_{11} & t^{2}_{11} & t^{0}_{22} & t^{1}_{22} & t^{2}_{22} \end{bmatrix} \\ \text{higher} &= \begin{bmatrix} t^{0}_{01} & t^{1}_{01} & t^{2}_{01} & t^{0}_{12} & t^{1}_{12} & t^{2}_{12} & 0 & 0 & 0 \end{bmatrix} \\ \text{highest} &= \begin{bmatrix} t^{0}_{02} & t^{1}_{02} & t^{2}_{02} & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \\ \end{align} \end{split}\]
For the lowest array, the first 2*batchCount entries are zero; for the lower array, the first batchCount entries are zero; for the higher array, the last batchCount entries are zero; and for the highest array, the last 2*batchCount entries are zero.
Note
This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.
- Parameters:
handle – [in] handle to the hipsparse library context queue.
algo – [in] algorithm to solve the linear system.
m – [in] size of the pentadiagonal linear system.
ds – [inout] lower diagonal (distance 2) of pentadiagonal system. First two entries must be zero.
dl – [inout] lower diagonal of pentadiagonal system. First entry must be zero.
d – [inout] main diagonal of pentadiagonal system.
du – [inout] upper diagonal of pentadiagonal system. Last entry must be zero.
dw – [inout] upper diagonal (distance 2) of pentadiagonal system. Last two entries must be zero.
x – [inout] Dense array of right-hand sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBuffer – [in] Temporary storage buffer allocated by the user.
- Return values:
HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, algo, batchCount, ds, dl, d, du, dw, x or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.