Preconditioner functions

Preconditioner functions#

This module contains all sparse preconditioners.

The sparse preconditioners describe manipulations on a matrix in sparse format to obtain a sparse preconditioner matrix.

hipsparseXbsrilu02_zeroPivot()#

hipsparseStatus_t hipsparseXbsrilu02_zeroPivot(hipsparseHandle_t handle, bsrilu02Info_t info, int *position)#

hipsparseXbsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXbsrilu02_analysis() or hipsparseXbsrilu02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

If a zero pivot is found, position \(=j\) means that either the diagonal block \(A_{j,j}\) is missing (structural zero) or the diagonal block \(A_{j,j}\) is not invertible (numerical zero).

Note

hipsparseXbsrilu02_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXbsrilu02_numericBoost()#

hipsparseStatus_t hipsparseSbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, float *boost_val)#

hipsparseStatus_t hipsparseDbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, double *boost_val)#

hipsparseStatus_t hipsparseCbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, hipComplex *boost_val)#

hipsparseStatus_t hipsparseZbsrilu02_numericBoost(hipsparseHandle_t handle, bsrilu02Info_t info, int enable_boost, double *tol, hipDoubleComplex *boost_val)#

hipsparseXbsrilu02_numericBoost enables the user to replace a numerical value in an incomplete LU factorization. tol is used to determine whether a numerical value is replaced by boost_val, such that \(A_{j,j} = \text{boost_val}\) if \(\text{tol} \ge \left|A_{j,j}\right|\).

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

The boost value is enabled by setting enable_boost to 1 or disabled by setting enable_boost to 0.

Note

tol and boost_val can be in host or device memory.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
enable_boost – [in] enable/disable numeric boost.
tol – [in] tolerance to determine whether a numerical value is replaced or not.
boost_val – [in] boost value to replace a numerical value.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info, tol or boost_val is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXbsrilu02_bufferSize()#

hipsparseStatus_t hipsparseSbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZbsrilu02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseXbsrilu02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXbsrilu02_analysis() and hipsparseXbsrilu02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSbsrilu02_analysis(), hipsparseDbsrilu02_analysis(), hipsparseCbsrilu02_analysis(), hipsparseZbsrilu02_analysis(), hipsparseSbsrilu02(), hipsparseDbsrilu02(), hipsparseCbsrilu02() and hipsparseZbsrilu02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA, bsrSortedRowPtrA, bsrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXbsrilu02_analysis()#

hipsparseStatus_t hipsparseSbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZbsrilu02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseXbsrilu02_analysis performs the analysis step for hipsparseXbsrilu02(). It is expected that this function will be executed only once for a given matrix.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specified whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA, bsrSortedRowPtrA, bsrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXbsrilu02()#

hipsparseStatus_t hipsparseSbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZbsrilu02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrSortedValA_valM, const int *bsrSortedRowPtrA, const int *bsrSortedColIndA, int blockDim, bsrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Incomplete LU factorization with 0 fill-ins and no pivoting using BSR storage format.

hipsparseXbsrilu02 computes the incomplete LU factorization with 0 fill-ins and no pivoting of a sparse \(mb \times mb\) BSR matrix \(A\), such that

\[ A \approx LU \]

Computing the above incomplete LU factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXbsrilu02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXbsrilu02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseXbsrilu02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXbsrilu02() are complete, the temporary buffer can be deallocated.

hipsparseXbsrilu02 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling hipsparseXbsrilu02_zeroPivot().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specified whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrSortedValA_valM – [inout] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrSortedRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrSortedColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrSortedValA_valM, bsrSortedRowPtrA or bsrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in BSR format for ILU(0) factorization.
    // The 'S' in Sbsrilu02 indicates single precision float.
    // We'll use a block size of 1 for simplicity, making it behave like CSR ILU.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m    = 4; // Number of rows
    int n    = 4; // Number of columns
    int bs   = 1; // Block size
    int mb   = m / bs; // Number of block rows
    int nb   = n / bs; // Number of block columns
    int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    std::vector<int> hbsrRowPtr = {0, 2, 5, 8, 10};

    // BSR column indices
    std::vector<int> hbsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float)
    std::vector<float> hbsrVal = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For ILU(0), the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRILU02 info
    bsrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsrilu02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(
        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(
        hipMemcpy(dbsrVal, hbsrVal.data(), sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_bufferSize(handle,
                                      HIPSPARSE_DIRECTION_COLUMN, // Block storage direction
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_analysis(handle,
                                    HIPSPARSE_DIRECTION_COLUMN,
                                    mb,
                                    nnzb,
                                    descr,
                                    dbsrVal,
                                    dbsrRowPtr,
                                    dbsrColInd,
                                    bs,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dbsrVal.
    HIPSPARSE_CHECK(hipsparseSbsrilu02(handle,
                                       HIPSPARSE_DIRECTION_COLUMN,
                                       mb,
                                       nnzb,
                                       descr,
                                       dbsrVal,
                                       dbsrRowPtr,
                                       dbsrColInd,
                                       bs,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the block row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXbsrilu02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at block row index %d\n",
               zeroPivot);
        // Handle the error (e.g., return, use a different preconditioner, etc.)
    }
    else
    {
        printf("BSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    std::vector<float> hbsrVal_result(nnzb * bs * bs);
    HIP_CHECK(hipMemcpy(
        hbsrVal_result.data(), dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized BSR values (L and U combined):\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyBsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in BSR format for ILU(0) factorization.
    // The 'S' in Sbsrilu02 indicates single precision float.
    // We'll use a block size of 1 for simplicity, making it behave like CSR ILU.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m    = 4; // Number of rows
    int n    = 4; // Number of columns
    int bs   = 1; // Block size
    int mb   = m / bs; // Number of block rows
    int nb   = n / bs; // Number of block columns
    int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    int hbsrRowPtr[] = {0, 2, 5, 8, 10};

    // BSR column indices
    int hbsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float)
    float hbsrVal[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For ILU(0), the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRILU02 info
    bsrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsrilu02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(hipMemcpy(dbsrRowPtr, hbsrRowPtr, sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd, sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrVal, hbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_bufferSize(handle,
                                      HIPSPARSE_DIRECTION_COLUMN, // Block storage direction
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseSbsrilu02_analysis(handle,
                                    HIPSPARSE_DIRECTION_COLUMN,
                                    mb,
                                    nnzb,
                                    descr,
                                    dbsrVal,
                                    dbsrRowPtr,
                                    dbsrColInd,
                                    bs,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dbsrVal.
    HIPSPARSE_CHECK(hipsparseSbsrilu02(handle,
                                       HIPSPARSE_DIRECTION_COLUMN,
                                       mb,
                                       nnzb,
                                       descr,
                                       dbsrVal,
                                       dbsrRowPtr,
                                       dbsrColInd,
                                       bs,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the block row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXbsrilu02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at block row index %d\n",
               zeroPivot);
        // Handle the error (e.g., return, use a different preconditioner, etc.)
    }
    else
    {
        printf("BSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    float* hbsrVal_result = (float*)malloc(nnzb * bs * bs * sizeof(float));
    HIP_CHECK(
        hipMemcpy(hbsrVal_result, dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized BSR values (L and U combined):\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    free(hbsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyBsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

program example_hipsparse_bsrilu02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseCreateBsrilu02Info(info) &
                bind(c, name = 'hipsparseCreateBsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateBsrilu02Info
            type(c_ptr) :: info
        end function hipsparseCreateBsrilu02Info

        function hipsparseDestroyBsrilu02Info(info) &
                bind(c, name = 'hipsparseDestroyBsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyBsrilu02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyBsrilu02Info

        function hipsparseSbsrilu02_bufferSize(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                                               bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                                               info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseSbsrilu02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseSbsrilu02_bufferSize

        function hipsparseSbsrilu02_analysis(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                                             bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                                             info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsrilu02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsrilu02_analysis

        function hipsparseSbsrilu02(handle, dirA, mb, nnzb, descrA, bsrSortedValA, bsrSortedRowPtrA, &
                                    bsrSortedColIndA, blockDim, info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsrilu02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsrilu02
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsrilu02

        function hipsparseXbsrilu02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXbsrilu02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXbsrilu02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXbsrilu02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_DIRECTION_COLUMN = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize
    
    ! Block sparse matrix A (4x4 with block size 1)
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: bs = 1
    integer, parameter :: mb = m / bs
    integer, parameter :: nb = n / bs
    integer, parameter :: nnzb = 10
    
    integer, dimension(mb+1), target :: hbsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnzb), target :: hbsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal = (/1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0/)
    
    ! Result array
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal_result
    
    ! Device pointers
    type(c_ptr) :: dbsrRowPtr
    type(c_ptr) :: dbsrColInd
    type(c_ptr) :: dbsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) stop

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) stop

    ! Create bsrilu02 info
    stat = hipsparseCreateBsrilu02Info(info)
    if (stat /= 0) stop

    ! Allocate device memory
    stat = hipMalloc(dbsrRowPtr, int((mb + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrColInd, int(nnzb * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrVal, int(nnzb * bs * bs * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dbsrRowPtr, c_loc(hbsrRowPtr), int((mb + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrColInd, c_loc(hbsrColInd), int(nnzb * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrVal, c_loc(hbsrVal), int(nnzb * bs * bs * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseSbsrilu02_bufferSize(handle, &
                                         HIPSPARSE_DIRECTION_COLUMN, &
                                         mb, &
                                         nnzb, &
                                         descr, &
                                         dbsrVal, &
                                         dbsrRowPtr, &
                                         dbsrColInd, &
                                         bs, &
                                         info, &
                                         c_loc(bufferSize))
    if (stat /= 0) stop

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseSbsrilu02_analysis(handle, &
                                       HIPSPARSE_DIRECTION_COLUMN, &
                                       mb, &
                                       nnzb, &
                                       descr, &
                                       dbsrVal, &
                                       dbsrRowPtr, &
                                       dbsrColInd, &
                                       bs, &
                                       info, &
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                       dbuffer)
    if (stat /= 0) stop

    ! Perform factorization
    stat = hipsparseSbsrilu02(handle, &
                              HIPSPARSE_DIRECTION_COLUMN, &
                              mb, &
                              nnzb, &
                              descr, &
                              dbsrVal, &
                              dbsrRowPtr, &
                              dbsrColInd, &
                              bs, &
                              info, &
                              HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                              dbuffer)
    if (stat /= 0) stop

    ! Check for zero pivots
    stat = hipsparseXbsrilu02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'BSRILU02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hbsrVal_result), dbsrVal, int(nnzb * bs * bs * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized BSR values (L and U combined):'
    do i = 1, nnzb * bs * bs
        write(*,*) 'val[', i-1, '] =', hbsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dbsrRowPtr)
    stat = hipFree(dbsrColInd)
    stat = hipFree(dbsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyBsrilu02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_bsrilu02

hipsparseXcsrilu02_zeroPivot()#

hipsparseStatus_t hipsparseXcsrilu02_zeroPivot(hipsparseHandle_t handle, csrilu02Info_t info, int *position)#

hipsparseXcsrilu02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXcsrilu02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

hipsparseXcsrilu02_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXcsrilu02_numericBoost()#

hipsparseStatus_t hipsparseScsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, float *boost_val)#

hipsparseStatus_t hipsparseDcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, double *boost_val)#

hipsparseStatus_t hipsparseCcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, hipComplex *boost_val)#

hipsparseStatus_t hipsparseZcsrilu02_numericBoost(hipsparseHandle_t handle, csrilu02Info_t info, int enable_boost, double *tol, hipDoubleComplex *boost_val)#

hipsparseXcsrilu02_numericBoost enables the user to replace a numerical value in an incomplete LU factorization. tol is used to determine whether a numerical value is replaced by boost_val, such that \(A_{j,j} = \text{boost_val}\) if \(\text{tol} \ge \left|A_{j,j}\right|\).

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

The boost value is enabled by setting enable_boost to 1 or disabled by setting enable_boost to 0.

Note

tol and boost_val can be in host or device memory.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
enable_boost – [in] enable/disable numeric boost.
tol – [in] tolerance to determine whether a numerical value is replaced or not.
boost_val – [in] boost value to replace a numerical value.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info, tol or boost_val is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXcsrilu02_bufferSize()#

hipsparseStatus_t hipsparseScsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZcsrilu02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, int *pBufferSizeInBytes)#

hipsparseXcsrilu02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXcsrilu02_bufferSizeExt()#

hipsparseStatus_t hipsparseScsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZcsrilu02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseXcsrilu02_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsrilu02_analysis() and hipsparseXcsrilu02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXcsrilu02_analysis()#

hipsparseStatus_t hipsparseScsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZcsrilu02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseXcsrilu02_analysis performs the analysis step for hipsparseXcsrilu02(). It is expected that this function will be executed only once for a given matrix and particular operation type.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXcsrilu02()#

hipsparseStatus_t hipsparseScsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZcsrilu02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csrilu02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Incomplete LU factorization with 0 fill-ins and no pivoting using CSR storage format.

hipsparseXcsrilu02 computes the incomplete LU factorization with 0 fill-ins and no pivoting of a sparse \(m \times m\) CSR matrix \(A\), such that

\[ A \approx LU \]

where the lower triangular matrix \(L\) and the upper triangular matrix \(U\) are computed using:

\[\begin{split} \begin{array}{ll} L_{ij} = \frac{1}{U_{jj}}(A_{ij} - \sum_{k=0}^{j-1}L_{ik} \times U_{kj}), & \text{if i > j} \\ U_{ij} = (A_{ij} - \sum_{k=0}^{j-1}L_{ik} \times U_{kj}), & \text{if i <= j} \end{array} \end{split}\]

for each entry found in the CSR matrix \(A\).

Computing the above incomplete \(LU\) factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXcsrilu02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXcsrilu02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseScsrilu02, hipsparseDcsrilu02, hipsparseCcsrilu02, or hipsparseZcsrilu02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXcsrilu02() are complete, the temporary buffer can be deallocated.

When computing the \(LU\) factorization, it is possible that \(U_{jj} == 0\) which would result in a division by zero. This could occur from either \(A_{jj}\) not existing in the sparse CSR matrix (referred to as a structural zero) or because \(A_{ij} - \sum_{k=0}^{j-1}L_{ik} \times U_{kj} == 0\) (referred to as a numerical zero). For example, running the \(LU\) factorization on the following matrix:

\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]

results in a successful \(LU\) factorization, however running with the matrix:

\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1/2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]

results in a numerical zero because:

\[\begin{split} \begin{array}{ll} U_{00} &= 2 \\ U_{01} &= 1 \\ L_{10} &= \frac{1}{2} \\ U_{11} &= \frac{1}{2} - \frac{1}{2} &= 0 \end{array} \end{split}\]

The user can detect the presence of a structural zero by calling hipsparseXcsrilu02_zeroPivot() after hipsparseXcsrilu02_analysis() and/or the presence of a structural or numerical zero by calling hipsparseXcsrilu02_zeroPivot() after hipsparseXcsrilu02(). In both cases, hipsparseXcsrilu02_zeroPivot() will report the first zero pivot (either numerical or structural) found. See example below. The user can also set the diagonal type to be \(1\) using hipsparseSetMatDiagType() which will interpret the matrix \(A\) as having ones on its diagonal (even if no nonzero exists in the sparsity pattern).

hipsparseXcsrilu02 computes the \(LU\) factorization inplace meaning that the values array csrSortedValA_valM of the \(A\) matrix is overwritten with the \(L\) matrix stored in the strictly lower triangular part of \(A\) and the \(U\) matrix stored in the upper part of \(A\):

\[\begin{split} \begin{align} \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \rightarrow \begin{bmatrix} u_{00} & u_{01} & u_{02} \\ l_{10} & u_{11} & u_{12} \\ l_{20} & l_{21} & u_{22} \end{bmatrix} \end{align} \end{split}\]

The row pointer array csrSortedRowPtrA and the column indices array csrSortedColIndA remain the same for \(A\) and \(LU\) as the incomplete factorization does not generate new nonzeros in \(LU\) which do not already exist in \(A\).

The performance of computing \(LU\) factorization with hipSPARSE greatly depends on the sparisty pattern the the matrix \(A\) as this is what determines the amount of parallelism available.

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling hipsparseXcsrsort().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA_valM – [inout] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA_valM, csrSortedRowPtrA or csrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in CSR format for ILU(0) factorization.
    // The 'S' in Scsrilu02 indicates single precision float.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m   = 4; // Number of rows
    int n   = 4; // Number of columns (equal to m for ILU)
    int nnz = 10; // Number of non-zero elements

    // CSR row pointers
    std::vector<int> hcsrRowPtr = {0, 2, 5, 8, 10};

    // CSR column indices
    std::vector<int> hcsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values
    std::vector<float> hcsrVal = {1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f, 9.0f, 10.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete LU, the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRILU02 info (for incomplete LU factorization)
    csrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsrilu02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(
        hipMalloc((void**)&dcsrVal,
                  sizeof(float) * nnz)); // Note: Same size as input, values will be overwritten

    HIP_CHECK(
        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsrilu02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseScsrilu02_analysis(handle,
                                    m,
                                    nnz,
                                    descr,
                                    dcsrVal,
                                    dcsrRowPtr,
                                    dcsrColInd,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsrilu02(handle,
                                       m,
                                       nnz,
                                       descr,
                                       dcsrVal,
                                       dcsrRowPtr,
                                       dcsrColInd,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXcsrilu02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        printf("CSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    std::vector<float> hcsrVal_result(nnz);
    HIP_CHECK(
        hipMemcpy(hcsrVal_result.data(), dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized CSR values (L and U combined):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyCsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample square matrix A (4x4) in CSR format for ILU(0) factorization.
    // The 'S' in Scsrilu02 indicates single precision float.
    // Matrix A:
    // ( 1  2  0  0 )
    // ( 3  4  5  0 )
    // ( 0  6  7  8 )
    // ( 0  0  9 10 )

    int m   = 4; // Number of rows
    int n   = 4; // Number of columns (equal to m for ILU)
    int nnz = 10; // Number of non-zero elements

    // CSR row pointers
    int hcsrRowPtr[] = {0, 2, 5, 8, 10};

    // CSR column indices
    int hcsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values
    float hcsrVal[] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete LU, the L factor often has a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // CSRILU02 info (for incomplete LU factorization)
    csrilu02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsrilu02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L and U values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(
        hipMalloc((void**)&dcsrVal,
                  sizeof(float) * nnz)); // Note: Same size as input, values will be overwritten

    HIP_CHECK(hipMemcpy(dcsrRowPtr, hcsrRowPtr, sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd, sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal, sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsrilu02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L and U.
    HIPSPARSE_CHECK(
        hipsparseScsrilu02_analysis(handle,
                                    m,
                                    nnz,
                                    descr,
                                    dcsrVal,
                                    dcsrRowPtr,
                                    dcsrColInd,
                                    info,
                                    HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                    dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L and U, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsrilu02(handle,
                                       m,
                                       nnz,
                                       descr,
                                       dcsrVal,
                                       dcsrRowPtr,
                                       dcsrColInd,
                                       info,
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                       dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXcsrilu02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during ILU0 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        printf("CSRILU0 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L and U combined) back to host
    float* hcsrVal_result = (float*)malloc(nnz * sizeof(float));
    HIP_CHECK(hipMemcpy(hcsrVal_result, dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L and U combined)
    printf("\nFactorized CSR values (L and U combined):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    free(hcsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyCsrilu02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

program example_hipsparse_csrilu02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseCreateCsrilu02Info(info) &
                bind(c, name = 'hipsparseCreateCsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateCsrilu02Info
            type(c_ptr) :: info
        end function hipsparseCreateCsrilu02Info

        function hipsparseDestroyCsrilu02Info(info) &
                bind(c, name = 'hipsparseDestroyCsrilu02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyCsrilu02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyCsrilu02Info

        function hipsparseScsrilu02_bufferSize(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                                               csrSortedColIndA, info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseScsrilu02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsrilu02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseScsrilu02_bufferSize

        function hipsparseScsrilu02_analysis(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                                             csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsrilu02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsrilu02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsrilu02_analysis

        function hipsparseScsrilu02(handle, m, nnz, descr, csrSortedValA_valM, csrSortedRowPtrA, &
                                    csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsrilu02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsrilu02
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), value :: csrSortedValA_valM
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsrilu02

        function hipsparseXcsrilu02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXcsrilu02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXcsrilu02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXcsrilu02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize
    
    ! Matrix A (4x4) in CSR format
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: nnz = 10
    
    integer, dimension(m+1), target :: hcsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnz), target :: hcsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnz), target :: hcsrVal = (/1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0/)
    
    ! Result array
    real(c_float), dimension(nnz), target :: hcsrVal_result
    
    ! Device pointers
    type(c_ptr) :: dcsrRowPtr
    type(c_ptr) :: dcsrColInd
    type(c_ptr) :: dcsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreate failed'
        stop
    end if

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateMatDescr failed'
        stop
    end if

    ! Create csrilu02 info
    stat = hipsparseCreateCsrilu02Info(info)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateCsrilu02Info failed'
        stop
    end if

    ! Allocate device memory
    stat = hipMalloc(dcsrRowPtr, int((m + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrColInd, int(nnz * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrVal, int(nnz * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dcsrRowPtr, c_loc(hcsrRowPtr), int((m + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrColInd, c_loc(hcsrColInd), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrVal, c_loc(hcsrVal), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseScsrilu02_bufferSize(handle, &
                                         m, &
                                         nnz, &
                                         descr, &
                                         dcsrVal, &
                                         dcsrRowPtr, &
                                         dcsrColInd, &
                                         info, &
                                         c_loc(bufferSize))
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02_bufferSize failed'
        stop
    end if

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseScsrilu02_analysis(handle, &
                                       m, &
                                       nnz, &
                                       descr, &
                                       dcsrVal, &
                                       dcsrRowPtr, &
                                       dcsrColInd, &
                                       info, &
                                       HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                       dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02_analysis failed'
        stop
    end if

    ! Perform factorization
    stat = hipsparseScsrilu02(handle, &
                              m, &
                              nnz, &
                              descr, &
                              dcsrVal, &
                              dcsrRowPtr, &
                              dcsrColInd, &
                              info, &
                              HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                              dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsrilu02 failed'
        stop
    end if

    ! Check for zero pivots
    stat = hipsparseXcsrilu02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'CSRILU02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hcsrVal_result), dcsrVal, int(nnz * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized CSR values (L and U combined):'
    do i = 1, nnz
        write(*,*) 'val[', i-1, '] =', hcsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dcsrRowPtr)
    stat = hipFree(dcsrColInd)
    stat = hipFree(dcsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyCsrilu02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_csrilu02

hipsparseXbsric02_zeroPivot()#

hipsparseStatus_t hipsparseXbsric02_zeroPivot(hipsparseHandle_t handle, bsric02Info_t info, int *position)#

hipsparseXbsric02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXbsric02_analysis() or hipsparseXbsric02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

If a zero pivot is found, position=j means that either the diagonal block A(j,j) is missing (structural zero) or the diagonal block A(j,j) is not positive definite (numerical zero).

Note

hipsparseXbsric02_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXbsric02_bufferSize()#

hipsparseStatus_t hipsparseSbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZbsric02_bufferSize(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, int *pBufferSizeInBytes)#

hipsparseXbsric02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXbsric02_analysis() and hipsparseXbsric02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specifies whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix. Must be non-negative.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix. Must be non-negative.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Must be positive. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSbsric02_analysis(), hipsparseDbsric02_analysis(), hipsparseCbsric02_analysis(), hipsparseZbsric02_analysis(), hipsparseSbsric02(), hipsparseDbsric02(), hipsparseCbsric02() and hipsparseZbsric02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, descrA, bsrValA, bsrRowPtrA, bsrColIndA, info or pBufferSizeInBytes is nullptr, mb or nnzb is negative, or blockDim is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXbsric02_analysis()#

hipsparseStatus_t hipsparseSbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZbsric02_analysis(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, const hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseXbsric02_analysis performs the analysis step for hipsparseXbsric02(). It is expected that this function will be executed only once for a given matrix and particular operation type.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specified whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [in] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrValA, bsrRowPtrA, bsrColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXbsric02()#

hipsparseStatus_t hipsparseSbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, float *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, double *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZbsric02(hipsparseHandle_t handle, hipsparseDirection_t dirA, int mb, int nnzb, const hipsparseMatDescr_t descrA, hipDoubleComplex *bsrValA, const int *bsrRowPtrA, const int *bsrColIndA, int blockDim, bsric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

hipsparseXbsric02 computes the incomplete Cholesky factorization with 0 fill-ins and no pivoting of a sparse \(mb \times mb\) BSR matrix \(A\), such that

\[ A \approx LL^T \]

Computing the above incomplete Cholesky factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXbsric02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXbsric02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseXbsric02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXbsric02 are complete, the temporary buffer can be deallocated.

hipsparseXbsric02 requires a user allocated temporary buffer. Its size is returned by hipsparseXbsric02_bufferSize(). Furthermore, analysis meta data is required. It can be obtained by hipsparseXbsric02_analysis(). hipsparseXbsric02 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling hipsparseXbsric02_zeroPivot().

hipsparseXbsric02 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling hipsparseXbsric02_zeroPivot().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
dirA – [in] direction that specified whether to count nonzero elements by HIPSPARSE_DIRECTION_ROW or by HIPSPARSE_DIRECTION_COLUMN.
mb – [in] number of block rows in the sparse BSR matrix.
nnzb – [in] number of non-zero block entries of the sparse BSR matrix.
descrA – [in] descriptor of the sparse BSR matrix.
bsrValA – [inout] array of length nnzb*blockDim*blockDim containing the values of the sparse BSR matrix.
bsrRowPtrA – [in] array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.
bsrColIndA – [in] array of nnzb elements containing the block column indices of the sparse BSR matrix.
blockDim – [in] the block dimension of the BSR matrix. Between 1 and m where m=mb*blockDim.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, mb, nnzb, blockDim, descrA, bsrValA, bsrRowPtrA, or bsrColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4)
    // with a block size of 1. This example effectively uses BSR format
    // for a CSR-like matrix.
    // Matrix A:
    // ( 4  1  0  0 )
    // ( 1  5  2  0 )
    // ( 0  2  3  1 )
    // ( 0  0  1  2 )

    const int m    = 4; // Number of rows
    const int n    = 4; // Number of columns
    const int bs   = 1; // Block size
    const int mb   = m / bs; // Number of block rows
    const int nb   = n / bs; // Number of block columns
    const int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    std::vector<int> hbsrRowPtr = {0, 2, 5, 8, 10};

    // BSR column indices
    std::vector<int> hbsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float for 'S'bsric02)
    // Values are stored column-major within each block, but with bs=1, this is simple.
    // The values correspond to the upper triangular part of the matrix.
    std::vector<float> hbsrVal = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode to lower and diagonal type to unit (required for IC02)
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRIC02 info
    bsric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsric02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal;

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(
        hipMemcpy(dbsrRowPtr, hbsrRowPtr.data(), sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd.data(), sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(
        hipMemcpy(dbsrVal, hbsrVal.data(), sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSbsric02_bufferSize(handle,
                                                 HIPSPARSE_DIRECTION_COLUMN,
                                                 mb,
                                                 nnzb,
                                                 descr,
                                                 dbsrVal,
                                                 dbsrRowPtr,
                                                 dbsrColInd,
                                                 bs,
                                                 info,
                                                 &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    HIPSPARSE_CHECK(hipsparseSbsric02_analysis(handle,
                                               HIPSPARSE_DIRECTION_COLUMN,
                                               mb,
                                               nnzb,
                                               descr,
                                               dbsrVal,
                                               dbsrRowPtr,
                                               dbsrColInd,
                                               bs,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization (numerical computation)
    HIPSPARSE_CHECK(hipsparseSbsric02(handle,
                                      HIPSPARSE_DIRECTION_COLUMN,
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots
    int zeroPivot = 0;
    HIPSPARSE_CHECK(hipsparseXbsric02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected at index %d\n", zeroPivot);
        // Handle error, e.g., by returning an error code
    }

    // Copy the factorized values back to host
    std::vector<float> hbsrVal_result(nnzb * bs * bs);
    HIP_CHECK(hipMemcpy(
        hbsrVal_result.data(), dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized matrix)
    printf("Successfully computed incomplete Cholesky factorization.\n");
    printf("Factorized BSR values:\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyBsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4)
    // with a block size of 1. This example effectively uses BSR format
    // for a CSR-like matrix.
    // Matrix A:
    // ( 4  1  0  0 )
    // ( 1  5  2  0 )
    // ( 0  2  3  1 )
    // ( 0  0  1  2 )

    const int m    = 4; // Number of rows
    const int n    = 4; // Number of columns
    const int bs   = 1; // Block size
    const int mb   = m / bs; // Number of block rows
    const int nb   = n / bs; // Number of block columns
    const int nnzb = 10; // Number of non-zero blocks

    // BSR row pointers
    int hbsrRowPtr[] = {0, 2, 5, 8, 10};

    // BSR column indices
    int hbsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // BSR values (single precision float for 'S'bsric02)
    // Values are stored column-major within each block, but with bs=1, this is simple.
    // The values correspond to the upper triangular part of the matrix.
    float hbsrVal[] = {4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // Set fill mode to lower and diagonal type to unit (required for IC02)
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));

    // BSRIC02 info
    bsric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateBsric02Info(&info));

    // Offload data to device
    int*   dbsrRowPtr;
    int*   dbsrColInd;
    float* dbsrVal;

    HIP_CHECK(hipMalloc((void**)&dbsrRowPtr, sizeof(int) * (mb + 1)));
    HIP_CHECK(hipMalloc((void**)&dbsrColInd, sizeof(int) * nnzb));
    HIP_CHECK(hipMalloc((void**)&dbsrVal, sizeof(float) * nnzb * bs * bs));

    HIP_CHECK(hipMemcpy(dbsrRowPtr, hbsrRowPtr, sizeof(int) * (mb + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrColInd, hbsrColInd, sizeof(int) * nnzb, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dbsrVal, hbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSbsric02_bufferSize(handle,
                                                 HIPSPARSE_DIRECTION_COLUMN,
                                                 mb,
                                                 nnzb,
                                                 descr,
                                                 dbsrVal,
                                                 dbsrRowPtr,
                                                 dbsrColInd,
                                                 bs,
                                                 info,
                                                 &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    HIPSPARSE_CHECK(hipsparseSbsric02_analysis(handle,
                                               HIPSPARSE_DIRECTION_COLUMN,
                                               mb,
                                               nnzb,
                                               descr,
                                               dbsrVal,
                                               dbsrRowPtr,
                                               dbsrColInd,
                                               bs,
                                               info,
                                               HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                               dbuffer));

    // 3. Perform factorization (numerical computation)
    HIPSPARSE_CHECK(hipsparseSbsric02(handle,
                                      HIPSPARSE_DIRECTION_COLUMN,
                                      mb,
                                      nnzb,
                                      descr,
                                      dbsrVal,
                                      dbsrRowPtr,
                                      dbsrColInd,
                                      bs,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                                      dbuffer));

    // 4. Check for zero pivots
    int zeroPivot = 0;
    HIPSPARSE_CHECK(hipsparseXbsric02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected at index %d\n", zeroPivot);
        // Handle error, e.g., by returning an error code
    }

    // Copy the factorized values back to host
    float* hbsrVal_result = (float*)malloc(nnzb * bs * bs * sizeof(float));
    HIP_CHECK(
        hipMemcpy(hbsrVal_result, dbsrVal, sizeof(float) * nnzb * bs * bs, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized matrix)
    printf("Successfully computed incomplete Cholesky factorization.\n");
    printf("Factorized BSR values:\n");
    for(int i = 0; i < nnzb * bs * bs; ++i)
    {
        printf("val[%d] = %f\n", i, hbsrVal_result[i]);
    }

    // Clean up
    free(hbsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyBsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dbsrRowPtr));
    HIP_CHECK(hipFree(dbsrColInd));
    HIP_CHECK(hipFree(dbsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

program example_hipsparse_bsric02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseSetMatIndexBase(descr, base) &
                bind(c, name = 'hipsparseSetMatIndexBase')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatIndexBase
            type(c_ptr), value :: descr
            integer(c_int), value :: base
        end function hipsparseSetMatIndexBase

        function hipsparseSetMatFillMode(descr, fillMode) &
                bind(c, name = 'hipsparseSetMatFillMode')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatFillMode
            type(c_ptr), value :: descr
            integer(c_int), value :: fillMode
        end function hipsparseSetMatFillMode

        function hipsparseSetMatDiagType(descr, diagType) &
                bind(c, name = 'hipsparseSetMatDiagType')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatDiagType
            type(c_ptr), value :: descr
            integer(c_int), value :: diagType
        end function hipsparseSetMatDiagType

        function hipsparseCreateBsric02Info(info) &
                bind(c, name = 'hipsparseCreateBsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateBsric02Info
            type(c_ptr) :: info
        end function hipsparseCreateBsric02Info

        function hipsparseDestroyBsric02Info(info) &
                bind(c, name = 'hipsparseDestroyBsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyBsric02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyBsric02Info

        function hipsparseSbsric02_bufferSize(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                                              bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                                              info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseSbsric02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseSbsric02_bufferSize

        function hipsparseSbsric02_analysis(handle, dirA, mb, nnzb, descrA, bsrSortedValA, &
                                            bsrSortedRowPtrA, bsrSortedColIndA, blockDim, &
                                            info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsric02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), intent(in), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsric02_analysis

        function hipsparseSbsric02(handle, dirA, mb, nnzb, descrA, bsrSortedValA, bsrSortedRowPtrA, &
                                   bsrSortedColIndA, blockDim, info, policy, pBuffer) &
                bind(c, name = 'hipsparseSbsric02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSbsric02
            type(c_ptr), value :: handle
            integer(c_int), value :: dirA
            integer(c_int), value :: mb
            integer(c_int), value :: nnzb
            type(c_ptr), value :: descrA
            type(c_ptr), value :: bsrSortedValA
            type(c_ptr), intent(in), value :: bsrSortedRowPtrA
            type(c_ptr), intent(in), value :: bsrSortedColIndA
            integer(c_int), value :: blockDim
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseSbsric02

        function hipsparseXbsric02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXbsric02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXbsric02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXbsric02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_INDEX_BASE_ZERO = 0
    integer, parameter :: HIPSPARSE_FILL_MODE_LOWER = 0
    integer, parameter :: HIPSPARSE_DIAG_TYPE_UNIT = 1
    integer, parameter :: HIPSPARSE_DIRECTION_COLUMN = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize
    
    ! Block sparse matrix A (4x4 with block size 1)
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: bs = 1
    integer, parameter :: mb = m / bs
    integer, parameter :: nb = n / bs
    integer, parameter :: nnzb = 10
    
    integer, dimension(mb+1), target :: hbsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnzb), target :: hbsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal = (/4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0/)
    
    ! Result array
    real(c_float), dimension(nnzb*bs*bs), target :: hbsrVal_result
    
    ! Device pointers
    type(c_ptr) :: dbsrRowPtr
    type(c_ptr) :: dbsrColInd
    type(c_ptr) :: dbsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) stop

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) stop

    ! Set matrix properties
    stat = hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO)
    if (stat /= 0) stop
    stat = hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER)
    if (stat /= 0) stop
    stat = hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT)
    if (stat /= 0) stop

    ! Create bsric02 info
    stat = hipsparseCreateBsric02Info(info)
    if (stat /= 0) stop

    ! Allocate device memory
    stat = hipMalloc(dbsrRowPtr, int((mb + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrColInd, int(nnzb * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dbsrVal, int(nnzb * bs * bs * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dbsrRowPtr, c_loc(hbsrRowPtr), int((mb + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrColInd, c_loc(hbsrColInd), int(nnzb * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dbsrVal, c_loc(hbsrVal), int(nnzb * bs * bs * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseSbsric02_bufferSize(handle, &
                                        HIPSPARSE_DIRECTION_COLUMN, &
                                        mb, &
                                        nnzb, &
                                        descr, &
                                        dbsrVal, &
                                        dbsrRowPtr, &
                                        dbsrColInd, &
                                        bs, &
                                        info, &
                                        c_loc(bufferSize))
    if (stat /= 0) stop

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseSbsric02_analysis(handle, &
                                      HIPSPARSE_DIRECTION_COLUMN, &
                                      mb, &
                                      nnzb, &
                                      descr, &
                                      dbsrVal, &
                                      dbsrRowPtr, &
                                      dbsrColInd, &
                                      bs, &
                                      info, &
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                      dbuffer)
    if (stat /= 0) stop

    ! Perform factorization
    stat = hipsparseSbsric02(handle, &
                             HIPSPARSE_DIRECTION_COLUMN, &
                             mb, &
                             nnzb, &
                             descr, &
                             dbsrVal, &
                             dbsrRowPtr, &
                             dbsrColInd, &
                             bs, &
                             info, &
                             HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                             dbuffer)
    if (stat /= 0) stop

    ! Check for zero pivots
    stat = hipsparseXbsric02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'BSRIC02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hbsrVal_result), dbsrVal, int(nnzb * bs * bs * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized BSR values (L factor):'
    do i = 1, nnzb * bs * bs
        write(*,*) 'val[', i-1, '] =', hbsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dbsrRowPtr)
    stat = hipFree(dbsrColInd)
    stat = hipFree(dbsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyBsric02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_bsric02

hipsparseXcsric02_zeroPivot()#

hipsparseStatus_t hipsparseXcsric02_zeroPivot(hipsparseHandle_t handle, csric02Info_t info, int *position)#

hipsparseXcsric02_zeroPivot returns HIPSPARSE_STATUS_ZERO_PIVOT, if either a structural or numerical zero has been found during hipsparseXcsric02_analysis() or hipsparseXcsric02() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and HIPSPARSE_STATUS_SUCCESS is returned instead.

Deprecated:: This function is deprecated when using the CUDA backend (CUDA 12.0+) and will be removed in CUDA 13.0. This deprecation does not apply to the ROCm backend.

Note

hipsparseXcsric02_zeroPivot is a blocking function. It might influence performance negatively.

Parameters:

handle – [in] handle to the hipsparse library context queue.
info – [in] structure that holds the information collected during the analysis step.
position – [inout] pointer to zero pivot \(j\), can be in host or device memory.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, info or position is nullptr.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_ZERO_PIVOT – zero pivot has been found.

hipsparseXcsric02_bufferSize()#

hipsparseStatus_t hipsparseScsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZcsric02_bufferSize(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, int *pBufferSizeInBytes)#

hipsparseXcsric02_bufferSize returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsric02_analysis() and hipsparseXcsric02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsric02_analysis() and hipsparseXcsric02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXcsric02_bufferSizeExt()#

hipsparseStatus_t hipsparseScsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZcsric02_bufferSizeExt(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, size_t *pBufferSizeInBytes)#

hipsparseXcsric02_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXcsric02_analysis() and hipsparseXcsric02(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXcsric02_analysis() and hipsparseXcsric02().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXcsric02_analysis()#

hipsparseStatus_t hipsparseScsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const float *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const double *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZcsric02_analysis(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, const hipDoubleComplex *csrSortedValA, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseXcsric02_analysis performs the analysis step for hipsparseXcsric02().

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA – [in] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [out] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA, csrSortedRowPtrA, csrSortedColIndA, info or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

hipsparseXcsric02()#

hipsparseStatus_t hipsparseScsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, float *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseDcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, double *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseCcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

hipsparseStatus_t hipsparseZcsric02(hipsparseHandle_t handle, int m, int nnz, const hipsparseMatDescr_t descrA, hipDoubleComplex *csrSortedValA_valM, const int *csrSortedRowPtrA, const int *csrSortedColIndA, csric02Info_t info, hipsparseSolvePolicy_t policy, void *pBuffer)#

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using CSR storage format.

hipsparseXcsric02 computes the incomplete Cholesky factorization with 0 fill-ins and no pivoting of a sparse \(m \times m\) CSR matrix \(A\), such that

\[ A \approx LL^T \]

where the lower triangular matrix \(L\) is computed using:

\[\begin{split} L_{ij} = \left\{ \begin{array}{ll} \sqrt{A_{jj} - \sum_{k=0}^{j-1}(L_{jk})^{2}}, & \text{if i == j} \\ \frac{1}{L_{jj}}(A_{jj} - \sum_{k=0}^{j-1}L_{ik} \times L_{jk}), & \text{if i > j} \end{array} \right. \end{split}\]

for each entry found in the CSR matrix \(A\).

Computing the above incomplete Cholesky factorization requires three steps to complete. First, the user determines the size of the required temporary storage buffer by calling hipsparseXcsric02_bufferSize(). Once this buffer size has been determined, the user allocates the buffer and passes it to hipsparseXcsric02_analysis(). This will perform analysis on the sparsity pattern of the matrix. Finally, the user calls hipsparseScsric02, hipsparseDcsric02, hipsparseCcsric02, or hipsparseZcsric02 to perform the actual factorization. The calculation of the buffer size and the analysis of the sparse matrix only need to be performed once for a given sparsity pattern while the factorization can be repeatedly applied to multiple matrices having the same sparsity pattern. Once all calls to hipsparseXcsric02() are complete, the temporary buffer can be deallocated.

When computing the Cholesky factorization, it is possible that \(L_{jj} == 0\) which would result in a division by zero. This could occur from either \(A_{jj}\) not existing in the sparse CSR matrix (referred to as a structural zero) or because \(A_{jj} - \sum_{k=0}^{j-1}(L_{jk})^{2} == 0\) (referred to as a numerical zero). For example, running the Cholesky factorization on the following matrix:

\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]

results in a successful Cholesky factorization, however running with the matrix:

\[\begin{split} \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1/2 & 1 \\ 0 & 1 & 2 \end{bmatrix} \end{split}\]

results in a numerical zero because:

\[\begin{split} \begin{array}{ll} L_{00} &= \sqrt{2} \\ L_{10} &= \frac{1}{\sqrt{2}} \\ L_{11} &= \sqrt{\frac{1}{2} - (\frac{1}{\sqrt{2}})^2} &= 0 \end{array} \end{split}\]

The user can detect the presence of a structural zero by calling hipsparseXcsric02_zeroPivot() after hipsparseXcsric02_analysis() and/or the presence of a structural or numerical zero by calling hipsparseXcsric02_zeroPivot() after hipsparseXcsric02():

hipsparseDcsric02(handle,
                  m,
                  nnz,
                  descrM,
                  csrVal,
                  csrRowPtr,
                  csrColInd,
                  info,
                  HIPSPARSE_SOLVE_POLICY_USE_LEVEL,
                  buffer);

// Check for zero pivot
if(CUSPARSE_STATUS_ZERO_PIVOT == hipsparseXcsric02_zeroPivot(handle, info, &position))
{
    printf("L has structural and/or numerical zero at L(%d,%d)\n", position, position);
}

In both cases, hipsparseXcsric02_zeroPivot() will report the first zero pivot (either numerical or structural) found. See full example below. The user can also set the diagonal type to be \(1\) using hipsparseSetMatDiagType() which will interpret the matrix \(A\) as having ones on its diagonal (even if no nonzero exists in the sparsity pattern).

hipsparseXcsric02 computes the Cholesky factorization inplace meaning that the values array csrSortedValA_valM of the \(A\) matrix is overwritten with the \(L\) matrix stored in the lower triangular part of \(A\):

\[\begin{split} \begin{align} \begin{bmatrix} a_{00} & a_{01} & a_{02} \\ a_{10} & a_{11} & a_{12} \\ a_{20} & a_{21} & a_{22} \end{bmatrix} \rightarrow \begin{bmatrix} l_{00} & a_{01} & a_{02} \\ l_{10} & l_{11} & a_{12} \\ l_{20} & l_{21} & l_{22} \end{bmatrix} \end{align} \end{split}\]

The row pointer array csrSortedRowPtrA and the column indices array csrSortedColIndA remain the same for \(A\) and the output as the incomplete factorization does not generate new nonzeros in the output which do not already exist in \(A\).

The performance of computing Cholesky factorization with hipSPARSE greatly depends on the sparisty pattern the the matrix \(A\) as this is what determines the amount of parallelism available.

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling hipsparseXcsrsort().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] number of rows of the sparse CSR matrix.
nnz – [in] number of non-zero entries of the sparse CSR matrix.
descrA – [in] descriptor of the sparse CSR matrix.
csrSortedValA_valM – [inout] array of nnz elements of the sparse CSR matrix.
csrSortedRowPtrA – [in] array of m+1 elements that point to the start of every row of the sparse CSR matrix.
csrSortedColIndA – [in] array of nnz elements containing the column indices of the sparse CSR matrix.
info – [in] structure that holds the information collected during the analysis step.
policy – [in] HIPSPARSE_SOLVE_POLICY_NO_LEVEL or HIPSPARSE_SOLVE_POLICY_USE_LEVEL.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, nnz, descrA, csrSortedValA_valM, csrSortedRowPtrA or csrSortedColIndA pointer is invalid.
HIPSPARSE_STATUS_ARCH_MISMATCH – the device is not supported.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.
HIPSPARSE_STATUS_NOT_SUPPORTED – hipsparseMatrixType_t != HIPSPARSE_MATRIX_TYPE_GENERAL.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4) in CSR format.
    // The 'S' in Scsric02 indicates single precision float.
    // Matrix A:
    // ( 4  1  0  0 )
    // ( 1  5  2  0 )
    // ( 0  2  3  1 )
    // ( 0  0  1  2 )
    // This matrix is symmetric. For IC02, we typically provide the full matrix
    // or just the lower/upper part if using `HIPSPARSE_MATRIX_TYPE_SYMMETRIC`
    // with the descriptor. Here, we provide elements for both lower and upper parts
    // for simplicity, but the factorization will operate on the implicitly
    // symmetric matrix and produce the lower triangular factor L.

    int m   = 4; // Number of rows
    int n   = 4; // Number of columns (equal to m for Cholesky)
    int nnz = 10; // Number of non-zero elements (counting only one side for symmetric)

    // CSR row pointers
    std::vector<int> hcsrRowPtr = {0, 2, 5, 8, 10};

    // CSR column indices
    // These indices correspond to the non-zero values used below.
    // For a symmetric matrix A, we implicitly work with A_lower.
    // The output will be L.
    std::vector<int> hcsrColInd = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values (single precision float for 'S'csric02)
    // The factorization computes the lower triangular L factor.
    // The input values represent the entries of A that correspond to the non-zero pattern.
    std::vector<float> hcsrVal = {4.0f, 1.0f, 1.0f, 5.0f, 2.0f, 2.0f, 3.0f, 1.0f, 1.0f, 2.0f};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete Cholesky, the L factor is computed.
    // L is lower triangular with a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));
    // Optionally set matrix type to symmetric if only storing one triangle of A
    // HIPSPARSE_CHECK(hipsparseSetMatType(descr, HIPSPARSE_MATRIX_TYPE_SYMMETRIC));

    // CSRIC02 info
    csric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsric02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(
        hipMalloc((void**)&dcsrVal,
                  sizeof(float) * nnz)); // Note: Same size as input, values will be overwritten

    HIP_CHECK(
        hipMemcpy(dcsrRowPtr, hcsrRowPtr.data(), sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd.data(), sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal.data(), sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsric02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L.
    HIPSPARSE_CHECK(
        hipsparseScsric02_analysis(handle,
                                   m,
                                   nnz,
                                   descr,
                                   dcsrVal,
                                   dcsrRowPtr,
                                   dcsrColInd,
                                   info,
                                   HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                   dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsric02(handle,
                                      m,
                                      nnz,
                                      descr,
                                      dcsrVal,
                                      dcsrRowPtr,
                                      dcsrColInd,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                      dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXcsric02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during IC02 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        printf("CSRIC02 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L) back to host
    std::vector<float> hcsrVal_result(nnz);
    HIP_CHECK(
        hipMemcpy(hcsrVal_result.data(), dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L)
    printf("\nFactorized CSR values (L factor):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    HIPSPARSE_CHECK(hipsparseDestroyCsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // A sample symmetric positive definite matrix A (4x4) in CSR format.
    // The 'S' in Scsric02 indicates single precision float.
    // Matrix A:
    // ( 4  1  0  0 )
    // ( 1  5  2  0 )
    // ( 0  2  3  1 )
    // ( 0  0  1  2 )
    // This matrix is symmetric. For IC02, we typically provide the full matrix
    // or just the lower/upper part if using `HIPSPARSE_MATRIX_TYPE_SYMMETRIC`
    // with the descriptor. Here, we provide elements for both lower and upper parts
    // for simplicity, but the factorization will operate on the implicitly
    // symmetric matrix and produce the lower triangular factor L.

    int m   = 4; // Number of rows
    int n   = 4; // Number of columns (equal to m for Cholesky)
    int nnz = 10; // Number of non-zero elements (counting only one side for symmetric)

    // CSR row pointers
    int hcsrRowPtr[] = {0, 2, 5, 8, 10};

    // CSR column indices
    // These indices correspond to the non-zero values used below.
    // For a symmetric matrix A, we implicitly work with A_lower.
    // The output will be L.
    int hcsrColInd[] = {0, 1, 0, 1, 2, 1, 2, 3, 2, 3};

    // CSR values (single precision float for 'S'csric02)
    // The factorization computes the lower triangular L factor.
    // The input values represent the entries of A that correspond to the non-zero pattern.
    float hcsrVal[] = {4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0};

    // Matrix descriptor
    hipsparseMatDescr_t descr;
    HIPSPARSE_CHECK(hipsparseCreateMatDescr(&descr));

    // Set index base on descriptor
    HIPSPARSE_CHECK(hipsparseSetMatIndexBase(descr, HIPSPARSE_INDEX_BASE_ZERO));

    // For incomplete Cholesky, the L factor is computed.
    // L is lower triangular with a unit diagonal.
    HIPSPARSE_CHECK(hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER));
    HIPSPARSE_CHECK(hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT));
    // Optionally set matrix type to symmetric if only storing one triangle of A
    // HIPSPARSE_CHECK(hipsparseSetMatType(descr, HIPSPARSE_MATRIX_TYPE_SYMMETRIC));

    // CSRIC02 info
    csric02Info_t info;
    HIPSPARSE_CHECK(hipsparseCreateCsric02Info(&info));

    // Offload data to device
    int*   dcsrRowPtr;
    int*   dcsrColInd;
    float* dcsrVal; // This will store the factorized L values

    HIP_CHECK(hipMalloc((void**)&dcsrRowPtr, sizeof(int) * (m + 1)));
    HIP_CHECK(hipMalloc((void**)&dcsrColInd, sizeof(int) * nnz));
    HIP_CHECK(
        hipMalloc((void**)&dcsrVal,
                  sizeof(float) * nnz)); // Note: Same size as input, values will be overwritten

    HIP_CHECK(hipMemcpy(dcsrRowPtr, hcsrRowPtr, sizeof(int) * (m + 1), hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrColInd, hcsrColInd, sizeof(int) * nnz, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dcsrVal, hcsrVal, sizeof(float) * nnz, hipMemcpyHostToDevice));

    // 1. Get buffer size
    int bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseScsric02_bufferSize(
        handle, m, nnz, descr, dcsrVal, dcsrRowPtr, dcsrColInd, info, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform analysis (symbolic factorization)
    // This step analyzes the sparsity pattern of A to determine the structure of L.
    HIPSPARSE_CHECK(
        hipsparseScsric02_analysis(handle,
                                   m,
                                   nnz,
                                   descr,
                                   dcsrVal,
                                   dcsrRowPtr,
                                   dcsrColInd,
                                   info,
                                   HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for analysis
                                   dbuffer));

    // 3. Perform factorization (numerical computation)
    // This step computes the actual numerical values of L, stored in dcsrVal.
    HIPSPARSE_CHECK(hipsparseScsric02(handle,
                                      m,
                                      nnz,
                                      descr,
                                      dcsrVal,
                                      dcsrRowPtr,
                                      dcsrColInd,
                                      info,
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, // Policy for factorization
                                      dbuffer));

    // 4. Check for zero pivots
    // A zero pivot can occur during factorization, indicating a numerical breakdown.
    int zeroPivot = 0; // -1 if no zero pivot, otherwise the row index of the first zero pivot
    HIPSPARSE_CHECK(hipsparseXcsric02_zeroPivot(handle, info, &zeroPivot));
    if(zeroPivot != -1)
    {
        printf("Error: Zero pivot detected during IC02 factorization at row index %d\n", zeroPivot);
        // Depending on your application, you might want to handle this error
        // or switch to a different preconditioner.
    }
    else
    {
        printf("CSRIC02 factorization completed successfully (no zero pivots detected).\n");
    }

    // Copy the factorized values (L) back to host
    float* hcsrVal_result = (float*)malloc(nnz * sizeof(float));
    HIP_CHECK(hipMemcpy(hcsrVal_result, dcsrVal, sizeof(float) * nnz, hipMemcpyDeviceToHost));

    // Print the result (the values of the factorized L)
    printf("\nFactorized CSR values (L factor):\n");
    for(int i = 0; i < nnz; ++i)
    {
        printf("val[%d] = %f\n", i, hcsrVal_result[i]);
    }

    // Clean up
    free(hcsrVal_result);

    HIPSPARSE_CHECK(hipsparseDestroyCsric02Info(info));
    HIPSPARSE_CHECK(hipsparseDestroyMatDescr(descr));
    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    HIP_CHECK(hipFree(dcsrRowPtr));
    HIP_CHECK(hipFree(dcsrColInd));
    HIP_CHECK(hipFree(dcsrVal));
    HIP_CHECK(hipFree(dbuffer));

    return 0;
}

program example_hipsparse_csric02
    use iso_c_binding
    implicit none

    ! HIP
    interface
        function hipMalloc(ptr, size) &
                bind(c, name = 'hipMalloc')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMalloc
            type(c_ptr) :: ptr
            integer(c_size_t), value :: size
        end function hipMalloc

        function hipFree(ptr) &
                bind(c, name = 'hipFree')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipFree
            type(c_ptr), value :: ptr
        end function hipFree

        function hipMemcpy(dst, src, size, kind) &
                bind(c, name = 'hipMemcpy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipMemcpy
            type(c_ptr), value :: dst
            type(c_ptr), intent(in), value :: src
            integer(c_size_t), value :: size
            integer(c_int), value :: kind
        end function hipMemcpy
    end interface

    integer, parameter :: hipMemcpyHostToDevice = 1
    integer, parameter :: hipMemcpyDeviceToHost = 2

    ! hipSPARSE
    interface
        function hipsparseCreate(handle) &
                bind(c, name = 'hipsparseCreate')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreate
            type(c_ptr) :: handle
        end function hipsparseCreate

        function hipsparseDestroy(handle) &
                bind(c, name = 'hipsparseDestroy')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroy
            type(c_ptr), value :: handle
        end function hipsparseDestroy

        function hipsparseCreateMatDescr(descr) &
                bind(c, name = 'hipsparseCreateMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateMatDescr
            type(c_ptr) :: descr
        end function hipsparseCreateMatDescr

        function hipsparseDestroyMatDescr(descr) &
                bind(c, name = 'hipsparseDestroyMatDescr')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyMatDescr
            type(c_ptr), value :: descr
        end function hipsparseDestroyMatDescr

        function hipsparseSetMatFillMode(descr, fillMode) &
                bind(c, name = 'hipsparseSetMatFillMode')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatFillMode
            type(c_ptr), value :: descr
            integer(c_int), value :: fillMode
        end function hipsparseSetMatFillMode

        function hipsparseSetMatDiagType(descr, diagType) &
                bind(c, name = 'hipsparseSetMatDiagType')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseSetMatDiagType
            type(c_ptr), value :: descr
            integer(c_int), value :: diagType
        end function hipsparseSetMatDiagType

        function hipsparseCreateCsric02Info(info) &
                bind(c, name = 'hipsparseCreateCsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseCreateCsric02Info
            type(c_ptr) :: info
        end function hipsparseCreateCsric02Info

        function hipsparseDestroyCsric02Info(info) &
                bind(c, name = 'hipsparseDestroyCsric02Info')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseDestroyCsric02Info
            type(c_ptr), value :: info
        end function hipsparseDestroyCsric02Info

        function hipsparseScsric02_bufferSize(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                                              csrSortedColIndA, info, pBufferSizeInBytes) &
                bind(c, name = 'hipsparseScsric02_bufferSize')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02_bufferSize
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            type(c_ptr), value :: pBufferSizeInBytes
        end function hipsparseScsric02_bufferSize

        function hipsparseScsric02_analysis(handle, m, nnz, descr, csrSortedValA, csrSortedRowPtrA, &
                                            csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsric02_analysis')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02_analysis
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), intent(in), value :: csrSortedValA
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsric02_analysis

        function hipsparseScsric02(handle, m, nnz, descr, csrSortedValA_valM, csrSortedRowPtrA, &
                                   csrSortedColIndA, info, policy, pBuffer) &
                bind(c, name = 'hipsparseScsric02')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseScsric02
            type(c_ptr), value :: handle
            integer(c_int), value :: m
            integer(c_int), value :: nnz
            type(c_ptr), value :: descr
            type(c_ptr), value :: csrSortedValA_valM
            type(c_ptr), intent(in), value :: csrSortedRowPtrA
            type(c_ptr), intent(in), value :: csrSortedColIndA
            type(c_ptr), value :: info
            integer(c_int), value :: policy
            type(c_ptr), value :: pBuffer
        end function hipsparseScsric02

        function hipsparseXcsric02_zeroPivot(handle, info, position) &
                bind(c, name = 'hipsparseXcsric02_zeroPivot')
            use iso_c_binding
            implicit none
            integer(c_int) :: hipsparseXcsric02_zeroPivot
            type(c_ptr), value :: handle
            type(c_ptr), value :: info
            type(c_ptr), value :: position
        end function hipsparseXcsric02_zeroPivot
    end interface

    integer, parameter :: HIPSPARSE_FILL_MODE_LOWER = 0
    integer, parameter :: HIPSPARSE_DIAG_TYPE_UNIT = 1
    integer, parameter :: HIPSPARSE_SOLVE_POLICY_USE_LEVEL = 1

    ! Variables
    type(c_ptr) :: handle
    type(c_ptr) :: descr
    type(c_ptr) :: info
    integer :: i, stat
    integer, target :: zeroPivot
    integer(c_int), target :: bufferSize
    
    ! Symmetric positive definite matrix A (4x4) in CSR format
    integer, parameter :: m = 4
    integer, parameter :: n = 4
    integer, parameter :: nnz = 10
    
    integer, dimension(m+1), target :: hcsrRowPtr = (/0, 2, 5, 8, 10/)
    integer, dimension(nnz), target :: hcsrColInd = (/0, 1, 0, 1, 2, 1, 2, 3, 2, 3/)
    real(c_float), dimension(nnz), target :: hcsrVal = (/4.0, 1.0, 1.0, 5.0, 2.0, 2.0, 3.0, 1.0, 1.0, 2.0/)
    
    ! Result array
    real(c_float), dimension(nnz), target :: hcsrVal_result
    
    ! Device pointers
    type(c_ptr) :: dcsrRowPtr
    type(c_ptr) :: dcsrColInd
    type(c_ptr) :: dcsrVal
    type(c_ptr) :: dbuffer

    ! Create hipSPARSE handle
    stat = hipsparseCreate(handle)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreate failed'
        stop
    end if

    ! Create matrix descriptor
    stat = hipsparseCreateMatDescr(descr)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateMatDescr failed'
        stop
    end if

    ! Set matrix fill mode (lower triangular)
    stat = hipsparseSetMatFillMode(descr, HIPSPARSE_FILL_MODE_LOWER)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseSetMatFillMode failed'
        stop
    end if

    ! Set matrix diagonal type
    stat = hipsparseSetMatDiagType(descr, HIPSPARSE_DIAG_TYPE_UNIT)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseSetMatDiagType failed'
        stop
    end if

    ! Create csric02 info
    stat = hipsparseCreateCsric02Info(info)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseCreateCsric02Info failed'
        stop
    end if

    ! Allocate device memory
    stat = hipMalloc(dcsrRowPtr, int((m + 1) * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrColInd, int(nnz * 4, c_size_t))
    if (stat /= 0) stop
    stat = hipMalloc(dcsrVal, int(nnz * 4, c_size_t))
    if (stat /= 0) stop

    ! Copy data to device
    stat = hipMemcpy(dcsrRowPtr, c_loc(hcsrRowPtr), int((m + 1) * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrColInd, c_loc(hcsrColInd), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop
    stat = hipMemcpy(dcsrVal, c_loc(hcsrVal), int(nnz * 4, c_size_t), hipMemcpyHostToDevice)
    if (stat /= 0) stop

    ! Get buffer size
    stat = hipsparseScsric02_bufferSize(handle, &
                                        m, &
                                        nnz, &
                                        descr, &
                                        dcsrVal, &
                                        dcsrRowPtr, &
                                        dcsrColInd, &
                                        info, &
                                        c_loc(bufferSize))
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02_bufferSize failed'
        stop
    end if

    ! Allocate temporary buffer
    stat = hipMalloc(dbuffer, int(bufferSize, c_size_t))
    if (stat /= 0) stop

    ! Perform analysis step
    stat = hipsparseScsric02_analysis(handle, &
                                      m, &
                                      nnz, &
                                      descr, &
                                      dcsrVal, &
                                      dcsrRowPtr, &
                                      dcsrColInd, &
                                      info, &
                                      HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                                      dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02_analysis failed'
        stop
    end if

    ! Perform factorization
    stat = hipsparseScsric02(handle, &
                             m, &
                             nnz, &
                             descr, &
                             dcsrVal, &
                             dcsrRowPtr, &
                             dcsrColInd, &
                             info, &
                             HIPSPARSE_SOLVE_POLICY_USE_LEVEL, &
                             dbuffer)
    if (stat /= 0) then
        write(*,*) 'Error: hipsparseScsric02 failed'
        stop
    end if

    ! Check for zero pivots
    stat = hipsparseXcsric02_zeroPivot(handle, info, c_loc(zeroPivot))
    if (zeroPivot /= -1) then
        write(*,*) 'Error: Zero pivot detected at row index', zeroPivot
    else
        write(*,*) 'CSRIC02 factorization completed successfully'
    end if

    ! Copy result back to host
    stat = hipMemcpy(c_loc(hcsrVal_result), dcsrVal, int(nnz * 4, c_size_t), hipMemcpyDeviceToHost)
    if (stat /= 0) stop

    ! Print result
    write(*,*) 'Factorized CSR values (L factor):'
    do i = 1, nnz
        write(*,*) 'val[', i-1, '] =', hcsrVal_result(i)
    end do

    ! Clean up
    stat = hipFree(dcsrRowPtr)
    stat = hipFree(dcsrColInd)
    stat = hipFree(dcsrVal)
    stat = hipFree(dbuffer)

    stat = hipsparseDestroyCsric02Info(info)
    stat = hipsparseDestroyMatDescr(descr)
    stat = hipsparseDestroy(handle)

end program example_hipsparse_csric02

hipsparseXgtsv2_bufferSizeExt()#

hipsparseStatus_t hipsparseSgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, const float *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, const double *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZgtsv2_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseSgtsv2_bufferSizeExt returns the size of the temporary storage buffer that is required by hipsparseXgtsv2(). The temporary storage buffer must be allocated by the user.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system. Must be at least 2.
n – [in] number of columns in the dense matrix B. Must be non-negative.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [in] dense matrix of size ( ldb, n ).
ldb – [in] leading dimension of B. Must satisfy ldb >= max(1, m).
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseXgtsv2().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, dl, d, du, B or pBufferSizeInBytes is nullptr, m is less than 2 or n is negative, or ldb is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgtsv2()#

hipsparseStatus_t hipsparseSgtsv2(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, float *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseDgtsv2(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, double *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseCgtsv2(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseZgtsv2(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *B, int ldb, void *pBuffer)#

Tridiagonal solver with pivoting.

hipsparseXgtsv2 solves a tridiagonal system for multiple right hand sides using pivoting

\[ T*B = B \]

where \(T\) is a sparse tridiagonal matrix and \(B\) is a dense \(ldb \times n\) matrix storing the right-hand side vectors in column order. The tridiagonal matrix \(T\) is defined by three vectors: dl for the lower diagonal, d for the main diagonal and du for the upper diagonal.

Solving the tridiagonal system involves two steps. First, the user calls hipsparseXgtsv2_bufferSizeExt() in order to determine the size of the required temporary storage buffer. Once determined, the user allocates this buffer and passes it to hipsparseXgtsv2() to perform the actual solve. The \(B\) dense matrix, which initially stores the n right-hand side vectors, is overwritten with the n solution vectors after the call to hipsparseXgtsv2().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

This routine supports execution in a hipGraph context.

Parameters:

handle – [in] handle to the hipSPARSE library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
n – [in] number of columns in the dense matrix B.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [inout] Dense matrix of size ( ldb, n ).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, n, ldb, dl, d, du, B or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

int main(int argc, char* argv[])
{
    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    // 2 3 0 0 0
    // 2 4 2 0 0
    // 0 1 1 1 0
    // 0 0 1 3 1
    // 0 0 0 1 4
    std::vector<float> hdl = {0.0f, 2.0f, 1.0f, 1.0f, 1.0f};
    std::vector<float> hd  = {2.0f, 4.0f, 1.0f, 3.0f, 4.0f};
    std::vector<float> hdu = {3.0f, 2.0f, 1.0f, 1.0f, 0.0f};

    // Host right-hand side column vectors
    std::vector<float> hB(ldb * n, 2.0f);

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dB  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_bufferSizeExt(handle, m, m, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform tridiagonal solve with pivoting
    // The solution is computed and stored in the dB vector.
    HIPSPARSE_CHECK(hipsparseSgtsv2(handle, m, m, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy solution back to host from dB
    HIP_CHECK(hipMemcpy(hB.data(), dB, sizeof(float) * m, hipMemcpyDeviceToHost));

    // Print the solution
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

int main(int argc, char* argv[])
{
    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    // 2 3 0 0 0
    // 2 4 2 0 0
    // 0 1 1 1 0
    // 0 0 1 3 1
    // 0 0 0 1 4
    float hdl[] = {0.0, 2.0, 1.0, 1.0, 1.0};
    float hd[]  = {2.0, 4.0, 1.0, 3.0, 4.0};
    float hdu[] = {3.0, 2.0, 1.0, 1.0, 0.0};

    // Host right-hand side column vectors
    float* hB = (float*)malloc((ldb * n, 2.0) * sizeof(float));

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dB  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB, sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_bufferSizeExt(handle, m, m, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform tridiagonal solve with pivoting
    // The solution is computed and stored in the dB vector.
    HIPSPARSE_CHECK(hipsparseSgtsv2(handle, m, m, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy solution back to host from dB
    HIP_CHECK(hipMemcpy(hB, dB, sizeof(float) * m, hipMemcpyDeviceToHost));

    // Print the solution
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

hipsparseXgtsv2_nopivot_bufferSizeExt()#

hipsparseStatus_t hipsparseSgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, const float *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, const double *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZgtsv2_nopivot_bufferSizeExt(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *B, int ldb, size_t *pBufferSizeInBytes)#

hipsparseXgtsv2_nopivot_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsv2_nopivot(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system. Must be >= 2.
n – [in] number of columns in the dense matrix B. Must be non-negative.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [in] Dense matrix of size ( ldb, n ).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSgtsv2_nopivot “hipsparseXgtsv2_nopivot()”.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_NOT_INITIALIZED – handle is not initialized.
HIPSPARSE_STATUS_INVALID_VALUE – handle, dl, d, du, B or pBufferSizeInBytes is nullptr, m is less than 2 or n is negative, or ldb is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgtsv2_nopivot()#

hipsparseStatus_t hipsparseSgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const float *dl, const float *d, const float *du, float *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseDgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const double *dl, const double *d, const double *du, double *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseCgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *B, int ldb, void *pBuffer)#

hipsparseStatus_t hipsparseZgtsv2_nopivot(hipsparseHandle_t handle, int m, int n, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *B, int ldb, void *pBuffer)#

Tridiagonal solver (no pivoting)

hipsparseXgtsv2_nopivot solves a tridiagonal linear system for multiple right-hand sides without pivoting

\[ T*B = B \]

where \(T\) is a sparse tridiagonal matrix and \(B\) is a dense \(ldb \times n\) matrix storing the right-hand side vectors in column order. The tridiagonal matrix \(T\) is defined by three vectors: dl for the lower diagonal, d for the main diagonal and du for the upper diagonal.

Solving the tridiagonal system with multiple right-hand sides without pivoting involves two steps. First, the user calls hipsparseXgtsv2_nopivot_bufferSizeExt() in order to determine the size of the required temporary storage buffer. Once determined, the user allocates this buffer and passes it to hipsparseXgtsv2_nopivot() to perform the actual solve. The \(B\) dense matrix, which initially stores the n right-hand side vectors, is overwritten with the n solution vectors after the call to hipsparseXgtsv2_nopivot().

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
n – [in] number of columns in the dense matrix B.
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
B – [inout] Dense matrix of size ( ldb, n ).
ldb – [in] Leading dimension of B. Must satisfy ldb >= max(1, m).
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, n, ldb, dl, d, du, B or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 -1  0  0  0
    // -1  2 -1  0  0
    //  0 -1  2 -1  0
    //  0  0 -1  2 -1
    //  0  0  0 -1  2
    std::vector<float> hdl = {0.0f, -1.0f, -1.0f, -1.0f, -1.0f};
    std::vector<float> hd  = {2.0f, 2.0f, 2.0f, 2.0f, 2.0f};
    std::vector<float> hdu = {-1.0f, -1.0f, -1.0f, -1.0f, 0.0f};

    // Host right-hand side column vectors
    std::vector<float> hB(ldb * n, 1.0f);

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dB  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB.data(), sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // Obtain required buffer size
    size_t bufferSize;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_nopivot_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer;
    HIP_CHECK(hipMalloc(&dbuffer, bufferSize));

    HIPSPARSE_CHECK(hipsparseSgtsv2_nopivot(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy right-hand side to host
    HIP_CHECK(hipMemcpy(hB.data(), dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of square tridiagonal matrix
    int m = 5;

    // Number of columns in right-hand side (column ordered) matrix
    int n = 3;

    // Leading dimension of right-hand side (column ordered) matrix
    int ldb = m;

    // Host tri-diagonal matrix
    //  2 -1  0  0  0
    // -1  2 -1  0  0
    //  0 -1  2 -1  0
    //  0  0 -1  2 -1
    //  0  0  0 -1  2
    float hdl[] = {0.0, -1.0, -1.0, -1.0, -1.0};
    float hd[]  = {2.0, 2.0, 2.0, 2.0, 2.0};
    float hdu[] = {-1.0, -1.0, -1.0, -1.0, 0.0};

    // Host right-hand side column vectors
    float* hB = (float*)malloc((ldb * n, 1.0) * sizeof(float));

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dB  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m));
    HIP_CHECK(hipMalloc((void**)&dB, sizeof(float) * ldb * n));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dB, hB, sizeof(float) * ldb * n, hipMemcpyHostToDevice));

    // Obtain required buffer size
    size_t bufferSize;
    HIPSPARSE_CHECK(
        hipsparseSgtsv2_nopivot_bufferSizeExt(handle, m, n, ddl, dd, ddu, dB, ldb, &bufferSize));

    void* dbuffer;
    HIP_CHECK(hipMalloc(&dbuffer, bufferSize));

    HIPSPARSE_CHECK(hipsparseSgtsv2_nopivot(handle, m, n, ddl, dd, ddu, dB, ldb, dbuffer));

    // Copy right-hand side to host
    HIP_CHECK(hipMemcpy(hB, dB, sizeof(float) * ldb * n, hipMemcpyDeviceToHost));

    // Print the solution
    printf("Solution for the tridiagonal system:\n");
    for(int i = 0; i < m; ++i)
    {
        printf("  x[%d] = %f\n", i, hB[i]);
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dB));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

hipsparseXgtsv2StridedBatch_bufferSizeExt()#

hipsparseStatus_t hipsparseSgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const float *dl, const float *d, const float *du, const float *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const double *dl, const double *d, const double *du, const double *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZgtsv2StridedBatch_bufferSizeExt(hipsparseHandle_t handle, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *x, int batchCount, int batchStride, size_t *pBufferSizeInBytes)#

hipsparseXgtsv2StridedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsv2StridedBatch(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system.
dl – [in] lower diagonal of tri-diagonal system where the ith system lower diagonal starts at dl+batchStride*i.
d – [in] main diagonal of tri-diagonal system where the ith system diagonal starts at d+batchStride*i.
du – [in] upper diagonal of tri-diagonal system where the ith system upper diagonal starts at du+batchStride*i.
x – [inout] Dense array of righthand-sides where the ith righthand-side starts at x+batchStride*i.
batchCount – [in] The number of systems to solve.
batchStride – [in] The number of elements that separate each system. Must satisfy batchStride >= m.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSgtsv2StridedBatch “hipsparseXgtsv2StridedBatch()”.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, batchStride, dl, d, du, x or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgtsv2StridedBatch()#

hipsparseStatus_t hipsparseSgtsv2StridedBatch(hipsparseHandle_t handle, int m, const float *dl, const float *d, const float *du, float *x, int batchCount, int batchStride, void *pBuffer)#

hipsparseStatus_t hipsparseDgtsv2StridedBatch(hipsparseHandle_t handle, int m, const double *dl, const double *d, const double *du, double *x, int batchCount, int batchStride, void *pBuffer)#

hipsparseStatus_t hipsparseCgtsv2StridedBatch(hipsparseHandle_t handle, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, hipComplex *x, int batchCount, int batchStride, void *pBuffer)#

hipsparseStatus_t hipsparseZgtsv2StridedBatch(hipsparseHandle_t handle, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, hipDoubleComplex *x, int batchCount, int batchStride, void *pBuffer)#

Strided Batch tridiagonal solver (no pivoting)

hipsparseXgtsv2StridedBatch solves a batched tridiagonal linear system

\[ T^{i}*x^{i} = x^{i} \]

where for each batch \(i=0\ldots\) batchCount, \(T^{i}\) is a sparse tridiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the tridiagonal matrices, \(T^{i}\), are packed one after the other into three vectors: dl for the lower diagonals, d for the main diagonals and du for the upper diagonals. See below for a description of what this strided memory pattern looks like.

Solving the batched tridiagonal system involves two steps. First, the user calls hipsparseXgtsv2StridedBatch_bufferSizeExt() in order to determine the size of the required temporary storage buffer. Once determined, the user allocates this buffer and passes it to hipsparseXgtsv2StridedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially stores the right-hand side values, are overwritten with the solution after the call to hipsparseXgtsv2StridedBatch().

The strided batch routines write each batch matrix one after the other in memory. For example, consider the following batch matrices:

\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & 0 \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ 0 & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & 0 \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ 0 & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & 0 \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ 0 & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]

In strided format, the upper, lower, and diagonal arrays would look like:

\[\begin{split} \begin{align} \text{lower} &= \begin{bmatrix} 0 & t^{0}_{10} & t^{0}_{21} & 0 & t^{1}_{10} & t^{1}_{21} & 0 & t^{2}_{10} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{0}_{11} & t^{0}_{22} & t^{1}_{00} & t^{1}_{11} & t^{1}_{22} & t^{2}_{00} & t^{2}_{11} & t^{2}_{22} \end{bmatrix} \\ \text{upper} &= \begin{bmatrix} t^{0}_{01} & t^{0}_{12} & 0 & t^{1}_{01} & t^{1}_{12} & 0 & t^{2}_{01} & t^{2}_{12} & 0 \end{bmatrix} \\ \end{align} \end{split}\]

For the lower array, for each batch i, the i*batchStride entries are zero and for the upper array the i*batchStride+batchStride-1 entries are zero.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
m – [in] size of the tri-diagonal linear system (must be >= 2).
dl – [in] lower diagonal of tri-diagonal system. First entry must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. Last entry must be zero.
x – [inout] Dense array of righthand-sides where the ith righthand-side starts at x+batchStride*i.
batchCount – [in] The number of systems to solve.
batchStride – [in] The number of elements that separate each system. Must satisfy batchStride >= m.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, batchStride, dl, d, du, x or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgtsvInterleavedBatch_bufferSizeExt()#

hipsparseStatus_t hipsparseSgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const float *dl, const float *d, const float *du, const float *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const double *dl, const double *d, const double *du, const double *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZgtsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseXgtsvInterleavedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgtsvInterleavedBatch(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
algo – [in] Algorithm to use when solving tridiagonal systems. Options are thomas ( algo=0 ), LU ( algo=1 ), or QR ( algo=2 ). Thomas algorithm is the fastest but is not stable while LU and QR are slower but are stable.
m – [in] size of the tri-diagonal linear system.
dl – [in] lower diagonal of tri-diagonal system. The first element of the lower diagonal must be zero.
d – [in] main diagonal of tri-diagonal system.
du – [in] upper diagonal of tri-diagonal system. The last element of the upper diagonal must be zero.
x – [inout] Dense array of righthand-sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBufferSizeInBytes – [out] number of bytes of the temporary storage buffer required by hipsparseSgtsvInterleavedBatch().

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, dl, d, du, x or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgtsvInterleavedBatch()#

hipsparseStatus_t hipsparseSgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, float *dl, float *d, float *du, float *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseDgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, double *dl, double *d, double *du, double *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseCgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipComplex *dl, hipComplex *d, hipComplex *du, hipComplex *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseZgtsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipDoubleComplex *dl, hipDoubleComplex *d, hipDoubleComplex *du, hipDoubleComplex *x, int batchCount, void *pBuffer)#

Interleaved Batch tridiagonal solver.

hipsparseXgtsvInterleavedBatch solves a batched tridiagonal linear system

\[ T^{i}*x^{i} = x^{i} \]

where for each batch \(i=0\ldots\) batchCount, \(T^{i}\) is a sparse tridiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the tridiagonal matrices, \(T^{i}\), are packed in an interleaved fashion into three vectors: dl for the lower diagonals, d for the main diagonals and du for the upper diagonals. See below for a description of what this interleaved memory pattern looks like.

Solving the batched tridiagonal system involves two steps. First, the user calls hipsparseXgtsvInterleavedBatch_bufferSizeExt() in order to determine the size of the required temporary storage buffer. Once determined, the user allocates this buffer and passes it to hipsparseXgtsvInterleavedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially stores the right-hand side values, are overwritten with the solution after the call to hipsparseXgtsvInterleavedBatch().

The user can specify different algorithms for hipsparseXgtsvInterleavedBatch to use. Options are thomas ( algo=0 ), LU ( algo=1 ), or QR ( algo=2 ).

Unlike the strided batch routines which write each batch matrix one after the other in memory, the interleaved routines write the batch matrices such that each element from each matrix is written consecutively one after the other. For example, consider the following batch matrices:

\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & 0 \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ 0 & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & 0 \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ 0 & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & 0 \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ 0 & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]

In interleaved format, the upper, lower, and diagonal arrays would look like:

\[\begin{split} \begin{align} \text{lower} &= \begin{bmatrix} 0 & 0 & 0 & t^{0}_{10} & t^{1}_{10} & t^{1}_{10} & t^{0}_{21} & t^{1}_{21} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{1}_{00} & t^{2}_{00} & t^{0}_{11} & t^{1}_{11} & t^{2}_{11} & t^{0}_{22} & t^{1}_{22} & t^{2}_{22} \end{bmatrix} \\ \text{upper} &= \begin{bmatrix} t^{0}_{01} & t^{1}_{01} & t^{2}_{01} & t^{0}_{12} & t^{1}_{12} & t^{2}_{12} & 0 & 0 & 0 \end{bmatrix} \\ \end{align} \end{split}\]

For the lower array, the first batchCount entries are zero and for the upper array the last batchCount entries are zero.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
algo – [in] Algorithm to use when solving tridiagonal systems. Options are thomas ( algo=0 ), LU ( algo=1 ), or QR ( algo=2 ). Thomas algorithm is the fastest but is not stable while LU and QR are slower but are stable.
m – [in] size of the tri-diagonal linear system.
dl – [inout] lower diagonal of tri-diagonal system. The first element of the lower diagonal must be zero.
d – [inout] main diagonal of tri-diagonal system.
du – [inout] upper diagonal of tri-diagonal system. The last element of the upper diagonal must be zero.
x – [inout] Dense array of righthand-sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBuffer – [in] temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, batchCount, dl, d, du, x or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of each square tridiagonal matrix
    int m = 6;

    // Number of batches
    int batchCount = 4;

    // Can be Thomas algorithm (0), LU (1), or QR (2)
    int algo = 1;

    // Host tridiagonal matrix
    std::vector<float> hdl(m * batchCount);
    std::vector<float> hd(m * batchCount);
    std::vector<float> hdu(m * batchCount);

    // Solve multiple tridiagonal matrix systems by interleaving matrices for better memory access:
    //
    //      4 2 0 0 0 0        5 3 0 0 0 0        6 4 0 0 0 0        7 5 0 0 0 0
    //      2 4 2 0 0 0        3 5 3 0 0 0        4 6 4 0 0 0        5 7 5 0 0 0
    // A1 = 0 2 4 2 0 0   A2 = 0 3 5 3 0 0   A3 = 0 4 6 4 0 0   A4 = 0 5 7 5 0 0
    //      0 0 2 4 2 0        0 0 3 5 3 0        0 0 4 6 4 0        0 0 5 7 5 0
    //      0 0 0 2 4 2        0 0 0 3 5 3        0 0 0 4 6 4        0 0 0 5 7 5
    //      0 0 0 0 2 4        0 0 0 0 3 5        0 0 0 0 4 6        0 0 0 0 5 7
    //
    // hdl = 0 0 0 0 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5
    // hd  = 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7
    // hdu = 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 0 0 0 0
    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hdl[batchCount * i + b] = 2 + b;
            hd[batchCount * i + b]  = 4 + b;
            hdu[batchCount * i + b] = 2 + b;
        }

        hdl[batchCount * 0 + b]       = 0.0f;
        hdu[batchCount * (m - 1) + b] = 0.0f;
    }

    // Host dense rhs
    std::vector<float> hx(m * batchCount);

    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hx[batchCount * i + b] = static_cast<float>(b + 1);
        }
    }

    float* ddl = nullptr;
    float* dd  = nullptr;
    float* ddu = nullptr;
    float* dx  = nullptr;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dx, sizeof(float) * m * batchCount));

    HIP_CHECK(hipMemcpy(ddl, hdl.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dx, hx.data(), sizeof(float) * m * batchCount, hipMemcpyHostToDevice));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSgtsvInterleavedBatch_bufferSizeExt(
        handle, algo, m, ddl, dd, ddu, dx, batchCount, &bufferSize));

    void* dbuffer = nullptr;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform batched tridiagonal solve
    HIPSPARSE_CHECK(
        hipsparseSgtsvInterleavedBatch(handle, algo, m, ddl, dd, ddu, dx, batchCount, dbuffer));

    // Copy solution back to host
    HIP_CHECK(hipMemcpy(hx.data(), dx, sizeof(float) * m * batchCount, hipMemcpyDeviceToHost));

    // Print the solutions
    printf("Solutions for batched tridiagonal systems:\n");
    for(int b = 0; b < batchCount; ++b)
    {
        printf("  Batch %d:\n", b);
        for(int i = 0; i < m; ++i)
        {
            printf("    x[%d] = %f\n", i, hx[i * batchCount + b]);
        }
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dx));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

int main(int argc, char* argv[])
{
    // hipSPARSE handle
    hipsparseHandle_t handle;
    HIPSPARSE_CHECK(hipsparseCreate(&handle));

    // Size of each square tridiagonal matrix
    int m = 6;

    // Number of batches
    int batchCount = 4;

    // Can be Thomas algorithm (0), LU (1), or QR (2)
    int algo = 1;

    // Host tridiagonal matrix
    float* hdl = (float*)malloc((m * batchCount) * sizeof(float));
    float* hd  = (float*)malloc((m * batchCount) * sizeof(float));
    float* hdu = (float*)malloc((m * batchCount) * sizeof(float));

    // Solve multiple tridiagonal matrix systems by interleaving matrices for better memory access:
    //
    //      4 2 0 0 0 0        5 3 0 0 0 0        6 4 0 0 0 0        7 5 0 0 0 0
    //      2 4 2 0 0 0        3 5 3 0 0 0        4 6 4 0 0 0        5 7 5 0 0 0
    // A1 = 0 2 4 2 0 0   A2 = 0 3 5 3 0 0   A3 = 0 4 6 4 0 0   A4 = 0 5 7 5 0 0
    //      0 0 2 4 2 0        0 0 3 5 3 0        0 0 4 6 4 0        0 0 5 7 5 0
    //      0 0 0 2 4 2        0 0 0 3 5 3        0 0 0 4 6 4        0 0 0 5 7 5
    //      0 0 0 0 2 4        0 0 0 0 3 5        0 0 0 0 4 6        0 0 0 0 5 7
    //
    // hdl = 0 0 0 0 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5
    // hd  = 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7 4 5 6 7
    // hdu = 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 2 3 4 5 0 0 0 0
    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hdl[batchCount * i + b] = 2 + b;
            hd[batchCount * i + b]  = 4 + b;
            hdu[batchCount * i + b] = 2 + b;
        }

        hdl[batchCount * 0 + b]       = 0.0;
        hdu[batchCount * (m - 1) + b] = 0.0;
    }

    // Host dense rhs
    float* hx = (float*)malloc((m * batchCount) * sizeof(float));

    for(int b = 0; b < batchCount; ++b)
    {
        for(int i = 0; i < m; ++i)
        {
            hx[batchCount * i + b] = (float)(b + 1);
        }
    }

    float* ddl = NULL;
    float* dd  = NULL;
    float* ddu = NULL;
    float* dx  = NULL;
    HIP_CHECK(hipMalloc((void**)&ddl, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dd, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&ddu, sizeof(float) * m * batchCount));
    HIP_CHECK(hipMalloc((void**)&dx, sizeof(float) * m * batchCount));

    HIP_CHECK(hipMemcpy(ddl, hdl, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dd, hd, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(ddu, hdu, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));
    HIP_CHECK(hipMemcpy(dx, hx, sizeof(float) * m * batchCount, hipMemcpyHostToDevice));

    // 1. Get buffer size
    size_t bufferSize = 0;
    HIPSPARSE_CHECK(hipsparseSgtsvInterleavedBatch_bufferSizeExt(
        handle, algo, m, ddl, dd, ddu, dx, batchCount, &bufferSize));

    void* dbuffer = NULL;
    HIP_CHECK(hipMalloc((void**)&dbuffer, bufferSize));

    // 2. Perform batched tridiagonal solve
    HIPSPARSE_CHECK(
        hipsparseSgtsvInterleavedBatch(handle, algo, m, ddl, dd, ddu, dx, batchCount, dbuffer));

    // Copy solution back to host
    HIP_CHECK(hipMemcpy(hx, dx, sizeof(float) * m * batchCount, hipMemcpyDeviceToHost));

    // Print the solutions
    printf("Solutions for batched tridiagonal systems:\n");
    for(int b = 0; b < batchCount; ++b)
    {
        printf("  Batch %d:\n", b);
        for(int i = 0; i < m; ++i)
        {
            printf("    x[%d] = %f\n", i, hx[i * batchCount + b]);
        }
    }

    // Clean up
    HIP_CHECK(hipFree(ddl));
    HIP_CHECK(hipFree(dd));
    HIP_CHECK(hipFree(ddu));
    HIP_CHECK(hipFree(dx));
    HIP_CHECK(hipFree(dbuffer));

    HIPSPARSE_CHECK(hipsparseDestroy(handle));

    return 0;
}

hipsparseXgpsvInterleavedBatch_bufferSizeExt()#

hipsparseStatus_t hipsparseSgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const float *ds, const float *dl, const float *d, const float *du, const float *dw, const float *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseDgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const double *ds, const double *dl, const double *d, const double *du, const double *dw, const double *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseCgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipComplex *ds, const hipComplex *dl, const hipComplex *d, const hipComplex *du, const hipComplex *dw, const hipComplex *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseStatus_t hipsparseZgpsvInterleavedBatch_bufferSizeExt(hipsparseHandle_t handle, int algo, int m, const hipDoubleComplex *ds, const hipDoubleComplex *dl, const hipDoubleComplex *d, const hipDoubleComplex *du, const hipDoubleComplex *dw, const hipDoubleComplex *x, int batchCount, size_t *pBufferSizeInBytes)#

hipsparseXgpsvInterleavedBatch_bufferSizeExt returns the size of the temporary storage buffer in bytes that is required by hipsparseXgpsvInterleavedBatch(). The temporary storage buffer must be allocated by the user.

Parameters:

handle – [in] handle to the hipsparse library context queue.
algo – [in] algorithm to solve the linear system.
m – [in] size of the pentadiagonal linear system.
ds – [in] lower diagonal (distance 2) of pentadiagonal system. First two entries must be zero.
dl – [in] lower diagonal of pentadiagonal system. First entry must be zero.
d – [in] main diagonal of pentadiagonal system.
du – [in] upper diagonal of pentadiagonal system. Last entry must be zero.
dw – [in] upper diagonal (distance 2) of pentadiagonal system. Last two entries must be zero.
x – [in] Dense array of right-hand-sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBufferSizeInBytes – [out] Number of bytes of the temporary storage buffer required.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, alg, batchCount, ds, dl, d, du, dw, x or pBufferSizeInBytes pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

hipsparseXgpsvInterleavedBatch()#

hipsparseStatus_t hipsparseSgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, float *ds, float *dl, float *d, float *du, float *dw, float *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseDgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, double *ds, double *dl, double *d, double *du, double *dw, double *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseCgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipComplex *ds, hipComplex *dl, hipComplex *d, hipComplex *du, hipComplex *dw, hipComplex *x, int batchCount, void *pBuffer)#

hipsparseStatus_t hipsparseZgpsvInterleavedBatch(hipsparseHandle_t handle, int algo, int m, hipDoubleComplex *ds, hipDoubleComplex *dl, hipDoubleComplex *d, hipDoubleComplex *du, hipDoubleComplex *dw, hipDoubleComplex *x, int batchCount, void *pBuffer)#

Interleaved Batch pentadiagonal solver.

hipsparseXgpsvInterleavedBatch solves a batch of pentadiagonal linear systems

\[ P^{i}*x^{i} = x^{i} \]

where for each batch \(i=0\ldots\) batchCount, \(P^{i}\) is a sparse pentadiagonal matrix and \(x^{i}\) is a dense right-hand side vector. All of the pentadiagonal matrices, \(P^{i}\), are packed in an interleaved fashion into five vectors: ds for the lowest diagonals, dl for the lower diagonals, d for the main diagonals, du for the upper diagonals, and dw for the highest digaonals. See below for a description of what this interleaved memory pattern looks like.

Solving the batched pentadiagonal system involves two steps. First, the user calls hipsparseSgpsvInterleavedBatch_bufferSizeExt() in order to determine the size of the required temporary storage buffer. Once determined, the user allocates this buffer and passes it to hipsparseXgpsvInterleavedBatch() to perform the actual solve. The \(x^{i}\) vectors, which initially stores the right-hand side values, are overwritten with the solution after the call to hipsparseXgpsvInterleavedBatch().

Unlike the strided batch routines which write each batch matrix one after the other in memory, the interleaved routines write the batch matrices such that each element from each matrix is written consecutively one after the other. For example, consider the following batch matrices:

\[\begin{split} \begin{bmatrix} t^{0}_{00} & t^{0}_{01} & t^{0}_{02} \\ t^{0}_{10} & t^{0}_{11} & t^{0}_{12} \\ t^{0}_{20} & t^{0}_{21} & t^{0}_{22} \end{bmatrix} \begin{bmatrix} t^{1}_{00} & t^{1}_{01} & t^{1}_{02} \\ t^{1}_{10} & t^{1}_{11} & t^{1}_{12} \\ t^{1}_{20} & t^{1}_{21} & t^{1}_{22} \end{bmatrix} \begin{bmatrix} t^{2}_{00} & t^{2}_{01} & t^{2}_{02} \\ t^{2}_{10} & t^{2}_{11} & t^{2}_{12} \\ t^{2}_{20} & t^{2}_{21} & t^{2}_{22} \end{bmatrix} \end{split}\]

In interleaved format, the highest, higher, lowest, lower, and diagonal arrays would look like:

\[\begin{split} \begin{align} \text{lowest} &= \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 & t^{0}_{20} & t^{1}_{20} & t^{2}_{20} \end{bmatrix} \\ \text{lower} &= \begin{bmatrix} 0 & 0 & 0 & t^{0}_{10} & t^{1}_{10} & t^{1}_{10} & t^{0}_{21} & t^{1}_{21} & t^{2}_{21} \end{bmatrix} \\ \text{diagonal} &= \begin{bmatrix} t^{0}_{00} & t^{1}_{00} & t^{2}_{00} & t^{0}_{11} & t^{1}_{11} & t^{2}_{11} & t^{0}_{22} & t^{1}_{22} & t^{2}_{22} \end{bmatrix} \\ \text{higher} &= \begin{bmatrix} t^{0}_{01} & t^{1}_{01} & t^{2}_{01} & t^{0}_{12} & t^{1}_{12} & t^{2}_{12} & 0 & 0 & 0 \end{bmatrix} \\ \text{highest} &= \begin{bmatrix} t^{0}_{02} & t^{1}_{02} & t^{2}_{02} & 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} \\ \end{align} \end{split}\]

For the lowest array, the first 2*batchCount entries are zero, for the lower array, the first batchCount entries are zero, for the upper array the last batchCount entries are zero, and for the highest array, the last 2*batchCount entries are zero.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters:

handle – [in] handle to the hipsparse library context queue.
algo – [in] algorithm to solve the linear system.
m – [in] size of the pentadiagonal linear system.
ds – [inout] lower diagonal (distance 2) of pentadiagonal system. First two entries must be zero.
dl – [inout] lower diagonal of pentadiagonal system. First entry must be zero.
d – [inout] main diagonal of pentadiagonal system.
du – [inout] upper diagonal of pentadiagonal system. Last entry must be zero.
dw – [inout] upper diagonal (distance 2) of pentadiagonal system. Last two entries must be zero.
x – [inout] Dense array of right-hand-sides with dimension batchCount by m.
batchCount – [in] The number of systems to solve.
pBuffer – [in] Temporary storage buffer allocated by the user.

Return values:

HIPSPARSE_STATUS_SUCCESS – the operation completed successfully.
HIPSPARSE_STATUS_INVALID_VALUE – handle, m, alg, batchCount, ds, dl, d, du, dw, x or pBuffer pointer is invalid.
HIPSPARSE_STATUS_INTERNAL_ERROR – an internal error occurred.

Preconditioner functions

Contents

Preconditioner functions#

hipsparseXbsrilu02_zeroPivot()#

hipsparseXbsrilu02_numericBoost()#

hipsparseXbsrilu02_bufferSize()#

hipsparseXbsrilu02_analysis()#

hipsparseXbsrilu02()#

hipsparseXcsrilu02_zeroPivot()#

hipsparseXcsrilu02_numericBoost()#

hipsparseXcsrilu02_bufferSize()#

hipsparseXcsrilu02_bufferSizeExt()#

hipsparseXcsrilu02_analysis()#

hipsparseXcsrilu02()#

hipsparseXbsric02_zeroPivot()#

hipsparseXbsric02_bufferSize()#

hipsparseXbsric02_analysis()#

hipsparseXbsric02()#

hipsparseXcsric02_zeroPivot()#

hipsparseXcsric02_bufferSize()#

hipsparseXcsric02_bufferSizeExt()#

hipsparseXcsric02_analysis()#

hipsparseXcsric02()#

hipsparseXgtsv2_bufferSizeExt()#

hipsparseXgtsv2()#

hipsparseXgtsv2_nopivot_bufferSizeExt()#

hipsparseXgtsv2_nopivot()#

hipsparseXgtsv2StridedBatch_bufferSizeExt()#

hipsparseXgtsv2StridedBatch()#

hipsparseXgtsvInterleavedBatch_bufferSizeExt()#

hipsparseXgtsvInterleavedBatch()#

hipsparseXgpsvInterleavedBatch_bufferSizeExt()#

hipsparseXgpsvInterleavedBatch()#